Catenate files with blank lines between them [duplicate] - bash

This question already has answers here:
Concatenating Files And Insert New Line In Between Files
(8 answers)
Closed 7 years ago.
How can I copy the contents of all the files in a given directory into a single file so that there are two empty lines between the contents of each file?
Needless to say, I am new to bash scripting, and I know this is not particularly complicated code.
Any help will be greatly appreciated.
Related links:
* How do I compare the contents of all the files in a directory against another directory?
* Append contents of one file into another
* BASH: Copy all files and directories into another directory in the same parent directory
After reading comments, my initial attempt is this:
cat * > newfile.txt
But this does not put two empty lines between the contents of each file.

Try this.
awk 'FNR==1 && NR>1 { printf("\n\n") }1' * >newfile.txt
The variable FNR is the line number within the current file and NR is the line number overall.
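To see the FNR/NR condition at work, here is a minimal sketch that runs the same awk program on three throwaway files. The scratch directory, the file names and the `demo`/`out` variables are made up for the illustration.

```shell
#!/usr/bin/env bash
# Scratch directory with three hypothetical input files.
demo=$(mktemp -d)
printf 'a1\na2\n' > "$demo/a.txt"
printf 'b1\n'     > "$demo/b.txt"
printf 'c1\n'     > "$demo/c.txt"

# Write the result outside the input directory so the glob cannot pick it up.
out=$(mktemp)

# FNR==1 is true on the first line of every file; NR>1 excludes the very first
# line overall, so the two newlines are printed only between files.
# The trailing 1 is an always-true pattern that prints every input line.
awk 'FNR==1 && NR>1 { printf("\n\n") } 1' "$demo"/*.txt > "$out"

cat "$out"
```

Writing the output somewhere else matters when the command is run more than once: if newfile.txt already exists in the directory from an earlier run, a bare * matches it as well.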

One way:
(
files=(*)
cat "${files[0]}"
for (( i = 1; i < ${#files[@]} ; ++i )) ; do
echo
echo
cat "${files[i]}"
done
) > newfile.txt

Example of file organization:
I have a directory ~/Pictures/Temp
If I wanted to move PNGs from that directory to another directory, I would first set a variable for my file names:
# This could be other file types as well
file=$(find ~/Pictures/Temp/*.png)
Of course there are many ways to do this; check out:
$ man find
$ man ls
Then I would set a directory variable (especially if this directory is going to be named after something like a date):
dir=$(some defining command here perhaps an awk of an ls -lt)
# Then we want to check for that directory's existence and make it if
# it doesn't exist
[[ -e $dir ]] || mkdir "$dir"
# [[ -d $dir ]] will work here as well
You could write a for loop:
# t counts iterations; with the sleep below each one takes about a second
# super useful with an every minute crontab
for ((t=1; t<59; t++));
do
# see above
file=$(blah blah)
# do nothing if there is no file to move; $file already holds the full path,
# so only the target directory is given
[[ -z $file ]] || mv "$file" "$dir/"
sleep 1
done
Please Google this if any of it seems unclear. Here are some useful links:
https://www.gnu.org/software/gawk/manual/html_node/For-Statement.html
http://www.gnu.org/software/gawk/manual/gawk.html
Best link on this page is below:
http://mywiki.wooledge.org/BashFAQ/031
Edit:
Anyhow, where I was going with that whole answer is that you could easily write a script that organizes certain files on your system for 60 seconds, and add a crontab entry so the organizing happens automatically:
crontab -e
Here is an example
$ crontab -l
* * * * * ~/Applications/Startup/Desktop-Cleanup.sh
# where ~/Applications/Startup/Desktop-Cleanup.sh is a custom application that I wrote

Related

How to take multiple files in terminal in BASH shell scripting

I am new to sh and I am trying to scan output files and copy the rows that start with "enthalpy new" to a new file. I created a file named step.txt in the directory in advance. The number of files is not fixed, so I tried to do it like this:
for (( i=1; i<=$# ; i++))
do
grep "enthalpy new" $i >> step.txt
done
And I ran this command in the terminal:
$ bash hw1.sh sample2.out sample1.out
Then I got these errors:
grep: 1: No such a file or directory
grep: 2: No such a file or directory
I expect step.txt to have 28 lines, 12 of them coming from sample1.out and 16 from sample2.out. The contents of step.txt should look like:
enthalpy new = 80 Ry
enthalpy new = 76 Ry
....
....
enthalpy new = 90 Ry
Can anyone point out my error and help me fix the code?
At present $i holds the loop counter, so it takes the values 1, 2, 3, ... grep cannot find files with those names, hence the errors.
There are two approaches to overcome this. "$@" expands to the parameters passed to the script ($# is only their count), so you could try the following (with several files grep prefixes each match with its file name; add -h to suppress that):
grep "enthalpy new" "$@" >> step.txt
Alternatively, if you want to loop through each parameter/file try:
for fil in "$@"
do
grep "enthalpy new" "$fil" >> step.txt
done
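The difference between the count and the list can be sketched as follows. `collect_matches` and the sample files are hypothetical names invented for this illustration, not part of the original script.

```shell
#!/usr/bin/env bash
# collect_matches is a hypothetical helper: the first argument is the output
# file, every remaining argument is an input file.
collect_matches() {
    local out=$1 f
    shift
    echo "number of input files: $#"   # $# is only a count
    for f in "$@"; do                   # "$@" is the list itself, one word per file
        grep "enthalpy new" "$f" >> "$out"
    done
}

# Throwaway sample data; one name contains a space on purpose.
work=$(mktemp -d)
printf 'enthalpy new = 80 Ry\nother line\n' > "$work/sample 1.out"
printf 'noise\nenthalpy new = 76 Ry\n'      > "$work/sample2.out"

collect_matches "$work/step.txt" "$work/sample 1.out" "$work/sample2.out"
cat "$work/step.txt"
```

Quoting "$@" and "$f" keeps a name such as "sample 1.out" in one piece; unquoted, it would be split into two words.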
To pass file names or any other values to the script you have to use the positional parameters $1, $2, $3 and so on. It would be more flexible to drop the files in a specific directory (let's say ./output) and call the script without arguments from the parent directory. Then it does not matter how many files you drop in there, and you avoid incrementing through the parameters and accepting input that could be used for code injection. The code would look like this:
for i in $(find ./output -name '*.out')
do
grep "enthalpy new" "$i" >> step.txt
done
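A loop over $(find ...) splits on whitespace, so a file name containing a space would be broken in two. One possible variant, sketched here with made-up file names, lets find hand the names to grep directly.

```shell
#!/usr/bin/env bash
# Throwaway directory layout; all names are hypothetical.
work=$(mktemp -d)
mkdir "$work/output"
printf 'enthalpy new = 80 Ry\n'          > "$work/output/run one.out"
printf 'enthalpy new = 76 Ry\nskip me\n' > "$work/output/run2.out"
printf 'enthalpy new = 1 Ry\n'           > "$work/output/notes.txt"  # wrong extension, ignored

# -exec ... {} + passes the file names straight to grep, so names with
# spaces survive. -h stops grep from prefixing matches with the file name.
find "$work/output" -name '*.out' -exec grep -h "enthalpy new" {} + > "$work/step.txt"

cat "$work/step.txt"
```

The order in which find visits the files is not guaranteed, so the lines in step.txt may come out in either order.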

Add time stamp to file which has at least one row in UNIX

I have a list of files at a location ${POWERCENTER_FILE_DIR} .
The files consist of row header and values.
MART_Apple.csv
MART_SAMSUNG.csv
MART_SONY.csv
MART_BlackBerry.csv
Requirements:
Select only those files which have at least 1 row.
Add a time stamp to the names of the files which have at least 1 row.
For example:
If all the files except MART_BlackBerry.csv have at least one row, then my output file names should be
MART_Apple_20170811112807.csv
MART_SAMSUNG_20170811112807.csv
MART_SONY_20170811112807.csv
Code tried so far
#!/bin/ksh
infilename=${POWERCENTER_FILE_DIR}MART*.csv
echo File name is ${infilename}
if [ wc -l "$infilename"="0" ];
then
RV=-1
echo "input file name cannot be blank or *"
exit $RV
fi
current_timestamp=`date +%Y%m%d%H%M%S`
filename=`echo $infilename | cut -d"." -f1 `
sftpfilename=`echo $filename`_${current_timestamp}.csv
cp -p ${POWERCENTER_FILE_DIR}$infilename ${POWERCENTER_FILE_DIR}$sftpfilename
RV=$?
if [[ $RV -ne 0 ]];then
echo Adding timestamp to ${POWERCENTER_FILE_DIR}$infilename failed ... Quitting
echo Return Code $RV
exit $RV
fi
Encountering errors like:
line 3: [: -l: binary operator expected
cp: target `MART_Apple_20170811121023.csv' is not a directory
failed ... Quitting
Return Code 1
To be frank, I am not able to understand the errors, nor am I sure I am doing it right. I am a beginner in Unix scripting. Can any experts point me to the correct way?
Here's an example using just find, sh, mv, basename, and date:
find ${POWERCENTER_FILE_DIR}MART*.csv ! -empty -execdir sh -c "mv {} \$(basename -s .csv {})_\$(date +%Y%m%d%H%M%S).csv" \;
I recommend reading Unix Power Tools for more ideas.
When it comes to shell scripting there is rarely a single/one/correct way to accomplish the desired task.
Often times you may need to trade off between readability vs maintainability vs performance vs adhering-to-some-local-coding-standard vs shell-environment-availability (and I'm sure there are a few more trade offs). So, fwiw ...
From your requirement that you're only interested in files with at least 1 row, I read this to also mean that you're only interested in files with size > 0.
One simple ksh script:
#!/bin/ksh
# define list of files
filelist=${POWERCENTER_FILE_DIR}/MART*.csv
# grab current datetime stamp
dtstamp=`date +%Y%m%d%H%M%S`
# for each file in our list ...
for file in ${filelist}
do
# each file should have ${POWERCENTER_FILE_DIR} as a prefix;
# uncomment 'echo' line for debugging purposes to display
# the contents of the ${file} variable:
#echo "file=${file}"
# '-s <file>' => file exists and size is greater than 0
# '! -s <file>' => file doesn't exist or size is equal to 0, eg, file is empty in our case
#
# if the file is empty, skip/continue to next file in loop
if [ ! -s "${file}" ]
then
continue
fi
# otherwise strip off the '.csv'
filebase=${file%.csv}
# copy our current file to a new file containing the datetime stamp;
# keep in mind that each ${file} already contains the contents of the
# ${POWERCENTER_FILE_DIR} variable as a prefix; uncomment 'echo' line
# for debug purposes to see what the cp command looks like:
#echo "cp command=cp ${file} ${filebase}.${dtstamp}.csv"
cp "${file}" "${filebase}.${dtstamp}.csv"
done
A few good resources for learning ksh:
O'Reilly: Learning the Korn Shell
O'Reilly: Learning the Korn Shell, 2nd Edition (includes the newer ksh93)
at your UNIX/Linux command line: man ksh
A simplified script would be something like
#!/bin/bash
# Note I'm using bash above, can't guarantee (but I hope) it would work in ksh too.
for file in ${POWERCENTER_FILE_DIR}/*.csv # Check Ref [1]
do
if [ "$( wc -l "$file" | grep -Eo '^[[:digit:]]+' )" -ne 0 ] # checking at least one row? Check Ref [2]
then
mv "$file" "${file%.csv}_$(date +'%Y%m%d%H%M%S').csv" # Check Ref [3]
fi
done
References
File Globbing [1]
Command Substitution [2]
Parameter Substitution [3]
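Putting the pieces from the answers together (the -s test for a non-empty file, ${file%.csv} to strip the extension, and the underscore the question asks for), a minimal sketch could look like this. `stamp_nonempty` and the scratch directory are hypothetical; the real directory would be ${POWERCENTER_FILE_DIR}.

```shell
#!/usr/bin/env bash
# stamp_nonempty is a hypothetical helper: rename every non-empty MART*.csv
# in directory $1 by appending _<stamp> ($2) before the extension.
stamp_nonempty() {
    local dir=$1 stamp=$2 file
    for file in "$dir"/MART*.csv; do
        [ -s "$file" ] || continue              # skip empty files and an unmatched glob
        mv -- "$file" "${file%.csv}_${stamp}.csv"
    done
}

# Throwaway sample data.
work=$(mktemp -d)
printf 'id,name\n1,x\n' > "$work/MART_Apple.csv"
: > "$work/MART_BlackBerry.csv"                 # empty file, must be left alone

# One stamp for the whole run, so all renamed files share it.
stamp=$(date +%Y%m%d%H%M%S)
stamp_nonempty "$work" "$stamp"
ls "$work"
```

Note that -s only checks the size: a file that contains nothing but the header row still counts as non-empty. If "at least one row" means a data row below the header, a line count would be needed instead.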

Iterate through several files in bash [duplicate]

This question already has answers here:
How to zero pad a sequence of integers in bash so that all have the same width?
(15 answers)
Closed 6 years ago.
I have a folder with several files that are named like this:
file.001.txt.gz, file.002.txt.gz, ... , file.150.txt.gz
What I want to do is use a loop to run a program on each file. I was thinking of something like this (just a sketch):
for i in {1:150}
gunzip file.$i.txt.gz
./my_program file.$i.txt output.$1.txt
gzip file.$1.txt
First of all, I don't know if something like this is going to work, and second, I can't figure out how to keep the three-digit numbering the files have ('001' instead of just '1').
Thanks a lot
The syntax for ranges in bash is
{1..150}
not {1:150}.
Moreover, if your bash is recent enough, you can add the leading zeroes:
{001..150}
The correct syntax of the for loop needs do and done.
for i in {001..150} ; do
# ...
done
It's unclear what $1 contains in your script.
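If your bash is not recent enough for {001..150}, printf can do the padding instead. A minimal sketch, where `pad3` is a hypothetical helper name:

```shell
#!/usr/bin/env bash
# pad3 is a hypothetical helper: print its integer argument zero-padded
# to three digits.
pad3() {
    printf '%03d' "$1"
}

# Build the file names from plain (unpadded) integers.
for n in 1 42 150; do
    echo "file.$(pad3 "$n").txt.gz"
done
```

Pass plain integers to it: printf treats a number with a leading zero as octal, so an already padded value such as 008 is rejected.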
To iterate over files I believe the simpler way is:
(assuming there are no files named 'file.*.txt' already in the directory and that your output file can have a different name)
for i in file.*.txt.gz; do
gunzip "$i"
./my_program "${i%.gz}" "$i-output.txt"
gzip "${i%.gz}"
done
Using find command:
# Path to the source directory
dir="./"
while read file
do
output="$(basename "$file")"
output="$(dirname "$file")/"${output/#file/output}
echo "$file ==> $output"
done < <(find "$dir" \
-regextype 'posix-egrep' \
-regex '.*file\.[0-9]{3}\.txt\.gz$')
The same via pipe:
find "$dir" \
-regextype 'posix-egrep' \
-regex '.*file\.[0-9]{3}\.txt\.gz$' | \
while read file
do
output="$(basename "$file")"
output="$(dirname "$file")/"${output/#file/output}
echo "$file ==> $output"
done
Sample output
/home/ruslan/tmp/file.001.txt.gz ==> /home/ruslan/tmp/output.001.txt.gz
/home/ruslan/tmp/file.002.txt.gz ==> /home/ruslan/tmp/output.002.txt.gz
(for $dir=/home/ruslan/tmp/).
Description
The scripts iterate the files in $dir directory. The $file variable is filled with the next line read from the find command.
The find command returns a list of paths corresponding to the regular expression '.*file\.[0-9]{3}\.txt\.gz$'.
The $output variable is built from two parts: basename (path without directories) and dirname (path to file's directory).
${output/#file/output} expression replaces file with output at the front end of $output variable (see Manipulating Strings)
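The path rewriting described above can be tried on its own. This sketch uses the POSIX prefix removal ${base#file} in place of the bash-specific ${output/#file/output}; `to_output_path` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# to_output_path is a hypothetical helper: map .../file.NNN.txt.gz to
# .../output.NNN.txt.gz and leave other names unchanged.
to_output_path() {
    local path=$1 dir base
    dir=$(dirname "$path")      # directory part of the path
    base=$(basename "$path")    # file name without directories
    case $base in
        file*) base="output${base#file}" ;;   # swap the leading "file" for "output"
    esac
    printf '%s\n' "$dir/$base"
}

to_output_path /home/ruslan/tmp/file.001.txt.gz
to_output_path ./file.150.txt.gz
```

Splitting the path first matters because the replacement should apply to the file name only, not to a directory that happens to start with "file".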
Try-
for i in $(seq -w 1 150) #-w adds the leading zeroes
do
gunzip file."$i".txt.gz
./my_program file."$i".txt output."$i".txt
gzip file."$i".txt
done
The syntax for ranges is as choroba said, but when iterating over files you usually want to use a glob. If you know all the files have three digits in their names you can match on digits:
shopt -s nullglob
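for i in file.0[0-9][0-9].txt.gz file.1[0-4][0-9].txt.gz file.15[0].txt.gz; do
n=${i#file.}; n=${n%.txt.gz} # $i is the whole file name; n is just the three digits
gunzip "file.$n.txt.gz"
./my_program "file.$n.txt" "output.$n.txt"
gzip "file.$n.txt"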
done
This will only iterate through files that exist. If you use the range expression, you have to take extra care not to try to operate on files that don't exist.
for i in file.{000..150}.txt.gz; do
[[ -e "$i" ]] || continue
...otherstuff
done

Bash script, reading from set of files in a directory

I have a set of files placed in a directory, and I want to read the second line of each file, extract the first substring that appears between parentheses "( )", and rename the file to that substring.
I'm not looking for complete bash code, I just need some hints and the commands to use for each step.
Example:
a file has these lines:
/* USER: 202166 (just_yousef) */
/* PROBLEM: 2954 (11854 - Egypt) */
/* SUBMISSION: 11071978 */
/* SUBMISSION TIME: 2012-12-25 15:49:25 */
/* LANGUAGE: 3 */
I need to take substring "11854 - Egypt" and rename the file with it, and proceed to the next file.
from each file
for f in /the/directory/*; do
# ...
done
read the second line [and] extract the first substring which is placed between [parentheses]
v=$(sed -n '2 { s/[^(]*(\([^)]*\)).*/\1/; p; q }' < "${f}")
rename that file
Use the mv command.
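The three steps above can be sketched together as follows. The scratch directory and the file name submission.c are made up, the sample content is the one from the question, and the sed expression is a slight variant that prints only when the substitution succeeds.

```shell
#!/usr/bin/env bash
# Throwaway directory with one sample file taken from the question.
work=$(mktemp -d)
cat > "$work/submission.c" <<'EOF'
/* USER: 202166 (just_yousef) */
/* PROBLEM: 2954 (11854 - Egypt) */
/* SUBMISSION: 11071978 */
EOF

for f in "$work"/*; do
    # On line 2 only: keep the text inside the first (...) pair, print it,
    # then quit so the rest of the file is never read.
    name=$(sed -n '2{s/[^(]*(\([^)]*\)).*/\1/p;q;}' "$f")
    # Rename only when something was extracted and the target name is free.
    if [ -n "$name" ] && [ ! -e "$work/$name" ]; then
        mv -- "$f" "$work/$name"
    fi
done

ls "$work"
```

The two checks before mv are a precaution: a file without parentheses on its second line is left alone, and an existing file is never overwritten.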
Loop over the files in the directory (or pipe the list through xargs), filter the files you want to read, and pass each file name with its path to a script that reads the second line.
In the script, open the file and extract the string as per the logic above.
Then close the file and use the rename command on the file name that was passed to the script as input.
rename $0 .txt
If the directory structure is recursive, then extract the name before the last / using the string's index-of method.
Looks like sed may well be the tool for the job:
for i in *
do
j=`sed -e '1d;2{;s/.*(\(.*\)).*/\1/;q;}' "$i"`
test -z "$j" || test -e "$j" || mv -v "$i" "$j"
done
I put the test for $j empty or already existing as safeguards; you might think of others. I also gave the -v flag to mv so you can see what it is doing.
You may prefer to use sed -n and just act on lines matching PROBLEM: if that's more reliable than always using the second line.

Bash script. create .info.file.prt if file.prt exists.

I have three files in a folder called /folder/files/$SET_DATE/ but may have much more depending on the date.
Ben.prt
.info.Ben.prt
Jim.prt
John.prt
I would like to create a .info.*.prt file for each .prt file in the folder, but if one already exists, I don't want to create two.
An ll -lart would then leave me with the following.
Server ben 10:30 <~> ll-lart
.info.Ben.prt
.info.Jim.prt
.info.John.prt
Ben.prt
Jim.prt
John.prt
The values in the .info.* files would be the count of chars in the .prt files.
so I have the following.
SET_DATE= cat /tmp/date.txt
FILES="/folder/folder/folder/$DP_DATE/.info.*"
FILESF="/folder/folder/folder/$DP_DATE/"
FILESP="*.prt"
if [ ! -e $FILESF".info."$FILESP ]; then
echo 0 >> $FILESF.info.$FILESP
fi
I am finding it hard to get my head around this now, though.
Any kick in the right direction would be much appreciated.
for file in *.prt
do
[ -f ".info.$file" ] || wc -c < "$file" > ".info.$file"
done
For each .prt file whose name does not start with a dot, if the corresponding .info file does not exist, create it with the number of characters found in the file.
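The same loop can be pointed at a directory instead of the current one. A minimal sketch, where `make_info_files` and the scratch directory are hypothetical stand-ins for the dated folder in the question:

```shell
#!/usr/bin/env bash
# make_info_files is a hypothetical helper: for each *.prt in directory $1,
# create .info.<name> holding the byte count unless it already exists.
make_info_files() {
    local dir=$1 file info
    for file in "$dir"/*.prt; do
        [ -f "$file" ] || continue              # the glob matched nothing
        info="$dir/.info.${file##*/}"           # ${file##*/} strips the directory part
        [ -f "$info" ] || wc -c < "$file" > "$info"
    done
}

# Throwaway sample data.
work=$(mktemp -d)
printf 'hello' > "$work/Ben.prt"
printf 'hi'    > "$work/Jim.prt"
printf 'keep'  > "$work/.info.Ben.prt"          # existing info file must survive

make_info_files "$work"
ls -A "$work"
```

The *.prt glob does not match names that start with a dot, so the existing .info.*.prt files are never processed themselves. wc -c counts bytes; for text with multi-byte characters, wc -m counts characters instead.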
