Output a find with xargs to a log file - bash

I have some code that works, but I want to output it to a log file so that I know what is being copied from one location to another.
echo "find ${varSrcDirectory} -maxdepth 1 -type f -printf "%p\t%t\n" | sort -t $'\t' -k2 -nr | grep ${varFullYear} | grep ${month} | cut -f 1 | xargs -i cp '{}' -p -t ${varDstDirectory}/${varFullYear}/${monthNum} " >> $LOG
find ${varSrcDirectory} -maxdepth 1 -type f -printf "%p\t%t\n" | sort -t $'\t' -k2 -nr | grep ${varFullYear} | grep ${month} | cut -f 1 | xargs -i cp '{}' -p -t ${varDstDirectory}/${varFullYear}/${monthNum} >> $LOG
Here is the result in my log file
find /ftp/bondloans/transfers/out/ -maxdepth 1 -type f -printf %pt%tn | sort -t $'\t' -k2 -nr | grep 2008 | grep Jan | cut -f 1 | xargs -i cp '{}' -p -t /ftp/bondloans/transfers/out/testa/2008/01
But what I want to see is the actual file being copied from one location to another.

Add the -v option to cp, so it will print what it's copying.
find ${varSrcDirectory} -maxdepth 1 -type f -printf "%p\t%t\n" | sort -t $'\t' -k2 -nr | grep ${varFullYear} | grep ${month} | cut -f 1 | xargs -i cp -v '{}' -p -t ${varDstDirectory}/${varFullYear}/${monthNum} >> $LOG
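If the log should capture the verbose output while you still see it on the terminal, tee -a can do both. A minimal, self-contained sketch (the /tmp paths and the file name are invented for the demo):

```shell
#!/bin/bash
# Hypothetical source/destination, set up only for this demo.
src=/tmp/copy_demo_src; dst=/tmp/copy_demo_dst; LOG=/tmp/copy_demo.log
mkdir -p "$src" "$dst"
touch "$src/report_2008_Jan.txt"

# cp -v prints one "'src' -> 'dst'" line per file; tee -a both shows
# that line and appends it to the log file.
find "$src" -maxdepth 1 -type f |
  xargs -I '{}' cp -v -p -t "$dst" '{}' | tee -a "$LOG"
```

This way the log records the files actually copied rather than the command text.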

Related

How to return an MD5 and SHA1 value for multiple files in a directory using BASH

I am creating a BASH script that takes a directory as an argument and writes to stdout a list of all files in that directory together with the MD5 and SHA1 value of each file. The only files I'm interested in are those between 100 and 500K. So far I've gotten this far (a section of the script):
cd $1 &&
find . -type f -size +100k -size -500k -printf '%f \t %s \t' -exec md5sum {} \; |
awk '{printf "NAME:" " " $1 "\t" "MD5:" " " $3 "\t" "BYTES:" "\t" $2 "\n"}'
I'm getting a little confused when adding the SHA1 and am obviously leaving something out.
Can anybody suggest a way to achieve this?
Ideally I'd like the script to format in the following way
Name Md5 SHA1
(With the relevant fields underneath)
Your awk printf bit is overly complicated. Try this:
find . -type f -printf "%f\t%s\t" -exec md5sum {} \; | awk '{ printf "NAME: %s MD5: %s BYTES: %s\n", $1, $3, $2 }'
Just read, line by line, the list of files that find outputs:
find . -type f |
while IFS= read -r l; do
echo "$(basename "$l") $(md5sum <"$l" | cut -d" " -f1) $(sha1sum <"$l" | cut -d" " -f1)"
done
It's better to use a NUL-separated stream:
find . -type f -print0 |
while IFS= read -r -d '' l; do
echo "$(basename "$l") $(md5sum <"$l" | cut -d" " -f1) $(sha1sum <"$l" | cut -d" " -f1)"
done
You could speed things up by running multiple processes in parallel with the -P option to xargs:
find . -type f -print0 |
xargs -0 -n1 sh -c 'echo "$(basename "$1") $(md5sum <"$1" | cut -d" " -f1) $(sha1sum <"$1" | cut -d" " -f1)"' --
Consider adding -maxdepth 1 to find if you are not interested in files in subdirectories recursively.
It's easy to go from xargs to -exec:
find . -type f -exec sh -c 'echo "$1 $(md5sum <"$1" | cut -d" " -f1) $(sha1sum <"$1" | cut -d" " -f1)"' -- {} \;
Tested on repl.
Add those -size +100k -size -500k args to find to limit the sizes.
The | cut -d" " -f1 strips the - that both md5sum and sha1sum print when reading from stdin. If there are no spaces in the filenames, you could run a single cut process for the whole stream, which should be slightly faster:
find . -type f -print0 |
xargs -0 -n1 sh -c 'echo "$(basename "$1") $(md5sum <"$1") $(sha1sum <"$1")"' -- |
cut -d" " -f1,2,5
I also think that running a single md5sum and a single sha1sum process might be faster than spawning separate processes for each file, but such a method needs to store all the filenames somewhere. Below, a bash array is used:
IFS=$'\n' files=($(find . -type f))
paste -d' ' <(
printf "%s\n" "${files[#]}") <(
md5sum "${files[#]}" | cut -d' ' -f1) <(
sha1sum "${files[#]}" | cut -d' ' -f1)
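If bash 4.4+ is available, the same batching idea can be made whitespace-safe by reading a NUL-delimited find stream with mapfile. A sketch (the temporary directory and file name are invented for the demo):

```shell
#!/bin/bash
cd "$(mktemp -d)" || exit 1
printf 'hello\n' > "a file.txt"   # a name with a space, on purpose

# Read NUL-delimited paths into an array, then hash the whole batch
# with one md5sum call and one sha1sum call.
mapfile -d '' files < <(find . -type f -print0)
paste -d' ' <(printf '%s\n' "${files[@]}") \
            <(md5sum "${files[@]}" | cut -d' ' -f1) \
            <(sha1sum "${files[@]}" | cut -d' ' -f1)
```

Unlike splitting on IFS=$'\n', this also survives names containing spaces (newlines in names would still confuse the line-oriented paste).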
Your find is fine; you want to join the results of two of those, one for each hash. The command for that is join, which expects sorted inputs.
doit() { find -type f -size +100k -size -500k -exec $1 {} + |sort -k2; }
join -j2 <(doit md5sum) <(doit sha1sum)
and that gets you the raw data in sane environments. If you want pretty data, you can use the column utility:
join -j2 <(doit md5sum) <(doit sha1sum) | column -t
and add nice headers:
(echo Name Md5 SHA1; join -j2 <(doit md5sum) <(doit sha1sum)) | column -t
and if you're in an unclean environment where people put spaces in file names, protect against that by subbing in tabs for the field markers:
doit() { find -type f -size +100k -size -500k -exec $1 {} + \
| sed 's, ,\t,'| sort -k2 -t$'\t' ; }
join -j2 -t$'\t' <(doit md5sum) <(doit sha1sum) | column -ts$'\t'
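To see the join approach end to end, here is a self-contained demo (the temporary directory, file names, and contents are invented for the example; the size filters are dropped so the tiny test files qualify):

```shell
#!/bin/bash
cd "$(mktemp -d)" || exit 1
printf 'hello\n' > a.txt
printf 'world\n' > b.txt

# Hash every file with the given tool and sort on the name field,
# which is what join requires.
doit() { find . -type f -exec "$1" {} + | sort -k2; }

# Field 2 (the file name) is the join key; column -t lines things up.
join -j2 <(doit md5sum) <(doit sha1sum) | column -t
```

Each output row is the file name followed by its MD5 and SHA1.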

I want my script to echo "$1" into a file literally

This is part of my script
#!/bin/bash
echo "ls /SomeFolder | grep $1 | xargs cat | grep something | grep .txt | awk '{print $2}' | sed 's/;$//';" >> script2.sh
This echoes everything nicely into my script except $1 and $2. Instead, it outputs the values of those variables, but I want it to literally read "$1" and "$2". Help?
Escape it:
echo "ls /SomeFolder | grep \$1 | xargs cat | grep something | grep .txt | awk '{print \$2}' | sed 's/;\$//';" >> script2.sh
Quote it:
echo "ls /SomeFolder | grep "'$'"1 | xargs cat | grep something | grep .txt | awk '{print "'$'"2}' | sed 's/;"'$'"//';" >> script2.sh
or like this:
echo 'ls /SomeFolder | grep $1 | xargs cat | grep something | grep .txt | awk '\''{print $2}'\'' | sed '\''s/;$//'\'';' >> script2.sh
Use quoted here document:
cat << 'EOF' >> script2.sh
ls /SomeFolder | grep $1 | xargs cat | grep something | grep .txt | awk '{print $2}' | sed 's/;$//';
EOF
Basically you want to prevent expansion, i.e. take the string literally. You may want to read the BashFAQ entry on quotes.
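The effect of quoting the here-document delimiter is easy to see side by side (the variable name here is arbitrary):

```shell
#!/bin/bash
name=world

# Unquoted delimiter: $name is expanded in the body.
cat <<EOF
expanded: $name
EOF

# Quoted delimiter: the body is taken literally.
cat <<'EOF'
literal: $name
EOF
```

The first heredoc prints "expanded: world", the second prints "literal: $name" unchanged.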
First, you'd never write this (see https://mywiki.wooledge.org/ParsingLs, http://porkmail.org/era/unix/award.html and you don't need greps+seds+pipes when you're using awk):
ls /SomeFolder | grep $1 | xargs cat | grep something | grep .txt | awk '{print $2}' | sed 's/;$//'
you'd write this instead:
find /SomeFolder -mindepth 1 -maxdepth 1 -type f -name "*$1*" -exec \
awk '/something/ && /.txt/{sub(/;$/,"",$2); print $2}' {} +
or if you prefer using print | xargs instead of -exec:
find /SomeFolder -mindepth 1 -maxdepth 1 -type f -name "*$1*" -print0 |
xargs -0 awk '/something/ && /.txt/{sub(/;$/,"",$2); print $2}'
and now to append that script to a file would be:
cat <<'EOF' >> script2.sh
find /SomeFolder -mindepth 1 -maxdepth 1 -type f -name "*$1*" -print0 |
xargs -0 awk '/something/ && /.txt/{sub(/;$/,"",$2); print $2}'
EOF
Btw, if you want the . in .txt to be treated literally instead of as a regexp metachar meaning "any character" then you should be using \.txt instead of .txt.
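A quick illustration of why the unescaped dot matters (sample file names invented):

```shell
#!/bin/bash
# ".txt" means "any character, then txt", so notes_txt matches too;
# "\.txt" requires a literal dot.
printf '%s\n' report.txt notes_txt | grep -c '.txt'   # prints 2
printf '%s\n' report.txt notes_txt | grep '\.txt'     # prints report.txt only
```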

Why does "... >> out | sort -n -o out" not actually run sort?

As an exercise, I should find all .c files starting from my home directory, count the lines of each file, and store the sorted output in sorted_statistics.txt, using find, wc, cut and sort.
I found this command to work
find /home/user/ -type f -name "*.c" 2> /dev/null -exec wc -l {} \; | cut -f 1 -d " " | sort -n -o sorted_statistics.txt
but I can't understand why
find /home/user/ -type f -name "*.c" 2> /dev/null -exec wc -l {} \; | cut -f 1 -d " " >> sorted_statistics.txt | sort -n sorted_statistics.txt
stops just before the sort command.
Just out of curiosity, why is that?
You were appending everything to sorted_statistics.txt (consuming all the output) and then trying to pipe that non-existent output to sort. Here is a corrected version that works:
find /home/user/ -type f -name "*.c" 2> /dev/null -exec wc -l {} \; | cut -f 1 -d " " >> tmp.txt && sort -n tmp.txt > sorted_statistics.txt
This part of the command makes no sense:
cut -f 1 -d " " >> sorted_statistics.txt | sort ...
because the output of cut is appended to the file sorted_statistics.txt and no output at all goes to the sort command. You will probably want to use tee:
cut -f 1 -d " " | tee -a sorted_statistics.txt | sort ...
The tee command sends its input to a file and also to the standard output. It is like a Tee junction in a pipeline.
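A minimal demonstration (the file name is invented): the unsorted numbers are appended to the file by tee -a and simultaneously forwarded to sort.

```shell
#!/bin/bash
log=/tmp/tee_demo.txt
: > "$log"   # start with an empty file

# tee -a writes the unsorted stream to the file AND passes it on,
# so sort still receives input.
printf '3\n1\n2\n' | tee -a "$log" | sort -n
```

Afterwards the terminal shows 1 2 3 while the file holds the original 3 1 2 order.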

moving files with xargs

I want to pipe the output of ls into head and pipe it into mv.
I used the following command in the terminal, but it isn't working properly.
ls -t Downloads/ | head -7 | xargs -i mv {} ~/cso/
Please do rectify the error. Thanks in advance!
It is well documented that parsing ls output is not recommended. You can use this safe approach using find + sort + cut + head + xargs pipeline:
find . -maxdepth 1 -type f -printf '%T@\t%p\0' |
sort -z -rnk1 |
cut -z -f2 |
head -z -n 7 |
xargs -0 -I {} mv {} ~/cso/
Use -I, and give ls a path so the names include the Downloads/ prefix:
ls -t Downloads/* | head -7 | xargs -I '{}' mv '{}' ~/cso/

Bash scripting: Deleting the oldest directory

I want to look for the oldest directory (inside a directory), and delete it. I am using the following:
rm -R $(ls -1t | tail -1)
ls -1t | tail -1 does indeed give me the oldest entry, but the problem is that it is not deleting the directory, and it also lists files.
How could I please fix that?
rm -R "$(find . -maxdepth 1 -type d -printf '%T#\t%p\n' | sort -r | tail -n 1 | sed 's/[0-9]*\.[0-9]*\t//')"
This also works with directories whose names contain spaces or tabs, or start with a "-".
This is not pretty but it works:
rm -R $(ls -lt | grep '^d' | tail -1 | tr " " "\n" | tail -1)
rm -R $(ls -tl | grep '^d' | tail -1 | cut -d' ' -f8)
find directory_name -type d -printf "%TY%Tm%Td%TH%TM%TS %p\n" | sort -nr | tail -1 | cut -d" " -f2 | xargs -n1 echo rm -Rf
You should remove the echo before the rm once it produces the right results.
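A NUL-safe variant of the same idea, assuming GNU find and reasonably recent GNU coreutils (head -z needs coreutils 8.25+); the demo directories are invented, and the echo is left in as a dry run:

```shell
#!/bin/bash
cd "$(mktemp -d)" || exit 1
mkdir old_dir new_dir
touch -d '2001-01-01' old_dir   # make one directory clearly older

# Epoch mtime + TAB + path, NUL-terminated; numeric sort puts the
# oldest record first; cut keeps just the path.
oldest=$(find . -mindepth 1 -maxdepth 1 -type d -printf '%T@\t%p\0' |
         sort -z -n | head -z -n 1 | cut -z -f2- | tr -d '\0')
echo rm -R "$oldest"   # drop the echo once the output looks right
```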
