iterate over lines in file then find in directory - bash

I am having trouble looping and searching. It seems that the loop is not waiting for the find to finish. What am I doing wrong?
I made a loop that reads a file line by line. I then want to use each line as a "name" and search a directory for a folder with that name. If the folder exists, copy it to a drive.
#!/bin/bash
DIRFIND="$2"
DIRCOPY="$3"
if [ -d $DIRFIND ]; then
while IFS='' read -r line || [[ -n "$line" ]]; do
echo "$line"
FILE=`find "$DIRFIND" -type d -name "$line"`
if [ -n "$FILE" ]; then
echo "Found $FILE"
cp -a "$FILE" "$DIRCOPY"
else
echo "$line not found."
fi
done < "$1"
else
echo "No such file or directory"
fi

Have you tried xargs...
Proposed Solution
cat filenamelist | xargs -I {} find . -type d -name {} -print | xargs -I {} mv {} .
What the above does is pipe a list of names into find (one at a time). When a match is found, find prints its path and passes it to xargs, which moves the directory. The -n1 option is not needed, because -I already runs the command once per input line.
Expansion
For a line containing yogo, the pipeline effectively runs:
find . -type d -name yogo -print | xargs -I {} mv {} .
and the final command becomes mv ./<path>/yogo .
I hope the above helps. Note that xargs has the advantage that you do not exceed the command-line length limit.
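For comparison, the loop from the question can be kept and hardened. The sketch below is only one possible arrangement: the function name copy_named_dirs is made up for illustration, and stripping a trailing carriage return assumes the list may have been saved with Windows line endings, which the post does not confirm. It relies on find's -prune, -print and -exec actions to copy every matching directory in a single pass.

```shell
# copy_named_dirs LIST SEARCH_DIR DEST_DIR
# (hypothetical helper; the arguments mirror $1, $2 and $3 of the question's script)
copy_named_dirs() {
    local list=$1 search_dir=$2 dest_dir=$3 name found
    while IFS= read -r name || [[ -n $name ]]; do
        name=${name%$'\r'}            # assumption: the list may have CRLF line endings
        [[ -z $name ]] && continue    # skip blank lines
        # -prune keeps find from descending into a match, -print records it,
        # and -exec copies it. Every match is copied, not only the first.
        found=$(find "$search_dir" -type d -name "$name" -prune -print \
                     -exec cp -a {} "$dest_dir" \;)
        if [[ -n $found ]]; then
            echo "Found $found"
        else
            echo "$name not found."
        fi
    done < "$list"
}
```

It would be called as copy_named_dirs names.txt /some/search/dir /some/dest/dir, where all three paths are placeholders. The destination should exist and lie outside the search directory.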

Related

bash iterate over a directory sorted by file size

As a webmaster, I generate a lot of junk code files. Periodically I have to purge the unneeded files, filtered by extension. Example: "cleaner txt". Easy enough. But I want to sort the files by size before the "for" loop processes them. How can I do that?
cleaner:
#!/bin/bash
if [ -z "$1" ]; then
echo "Please supply the filename suffixes to delete.";
exit;
fi;
filter=$1;
for FILE in *.$filter; do clear;
cat $FILE; printf '\n\n'; rm -i $FILE; done
You can use a mix of find (to print file sizes and names), sort (to sort the output of find) and cut (to remove the sizes). In case you have very unusual file names containing any possible character including newlines, it is safer to separate the files by a character that cannot be part of a name: NUL.
#!/bin/bash
if [ -z "$1" ]; then
echo "Please supply the filename suffixes to delete.";
exit;
fi;
filter=$1;
while IFS= read -r -d '' -u 3 FILE; do
clear
cat "$FILE"
printf '\n\n'
rm -i "$FILE"
done 3< <(find . -mindepth 1 -maxdepth 1 -type f -name "*.$filter" \
-printf '%s\t%p\0' | sort -zn | cut -zf 2-)
Note that we must use a different file descriptor than stdin (3 in this example) to pass the file names to the loop. Else, if we use stdin, it will also be used to provide the answers to rm -i.
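If the sorted names are needed more than once, the same NUL-delimited pipeline can fill a bash array instead of feeding a loop directly. This is a sketch under assumptions: the function name sorted_by_size and the array name SORTED are invented here, mapfile -d '' needs bash 4.4 or later, and -printf, sort -z and cut -z are GNU extensions.

```shell
# sorted_by_size DIR SUFFIX
# (hypothetical helper) fills the global array SORTED with the regular files
# directly inside DIR whose names end in .SUFFIX, smallest first.
sorted_by_size() {
    local dir=$1 suffix=$2
    # Each record is "<size>TAB<path>" terminated by NUL, so names containing
    # spaces or newlines survive the sort and the cut.
    mapfile -d '' -t SORTED < <(
        find "$dir" -mindepth 1 -maxdepth 1 -type f -name "*.$suffix" \
             -printf '%s\t%p\0' | sort -zn | cut -zf 2-
    )
}
```

After sorted_by_size . txt, a plain for FILE in "${SORTED[@]}" loop leaves stdin free, so rm -i can still read its answers from the terminal.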
Inspired by this answer, you could use the find command as follows:
find ./ -type f -name "*.yaml" -printf "%s %p\n" | sort -n
The find command prints the size and the path of each file, so the sort command orders the results from the smallest file to the largest.
If you want to iterate over, say, the 5 largest files, add the tail command to the pipeline (this breaks on file names containing whitespace, because the for loop splits the output into words):
for f in $(find ./ -type f -name "*.yaml" -printf "%s %p\n" |
sort -n |
tail -n 5 |
cut -d ' ' -f 2-)
do
echo "### $f"
done
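A variant of the same pipeline that tolerates spaces in file names reads the result line by line instead of letting a for loop split it into words. The function name largest_files is invented for this sketch, which still assumes that no file name contains a newline and that find supports the GNU -printf action.

```shell
# largest_files DIR PATTERN COUNT
# (hypothetical helper) prints the paths of the COUNT largest files under DIR
# that match PATTERN, one per line, smallest of them first.
largest_files() {
    local dir=$1 pattern=$2 count=$3
    find "$dir" -type f -name "$pattern" -printf '%s %p\n' |
        sort -n |
        tail -n "$count" |
        cut -d ' ' -f 2-      # drop the size, keep the whole path
}

# Usage (not run here):
#   while IFS= read -r f; do echo "### $f"; done < <(largest_files . '*.yaml' 5)
```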
If the file names don't contain newlines or spaces, you can use du:
while read filesize filename; do
printf "%-25s has size %10d\n" "$filename" "$filesize"
done < <(du -bs *."$filter"|sort -n)
while read filename; do
echo "$filename"
done < <(du -bs *."$filter"|sort -n|awk '{$0=$2}1')

Bash script to String concat two variables and do File compare

What I am trying to achieve is to delete files that exist under the same name (filename + modified timestamp) in both Src_Dir1 and Src_Dir2.
First I wrote all the filenames to temp_a (Src_Dir1) and temp_b (Src_Dir2) respectively.
The source directory looks like this: some files are inside Archive and a few are outside it.
I want to deal with the files inside Archive (Src_Dir1) first and with those outside Archive (Src_Dir2) later. I am trying to use a while loop that reads each filename, concatenates it with the file's modified timestamp (mtime), and writes the result to temp_c. For example, AirTimeActs_2018-12-03.csv + 2019-01-24 14:41:53.000000000 -0500 should become AirTimeActs_2018-12-03.csv_2019-01-24 14:41:53.000000000 -0500, and temp_c should contain one such line for every file inside Archive (Src_Dir1). I am stuck on the string concatenation step. Please help me with the code; I hope this is comprehensible.
IMPORTANT
I would also really appreciate help with the part of the code that I have not written yet: do for temp_b what I am doing for temp_a, name the result temp_d, and then compare temp_c with temp_d. If an entry appears in both, delete the corresponding file in Src_Dir2; if there is no match, do nothing.
#!/bin/bash
Src_Dir1=path/Airtime_Activation/Archive
Src_Dir2=path/Airtime_Activation/
find "$Src_Dir1" -maxdepth 1 -name "*.xlsx" -o -name "*.csv" | sed "s/.*\///" > -print>path/Airtime_Activation/temp_a
find "$Src_Dir2" -maxdepth 1 -name "*.xlsx" -o -name "*.csv" | sed "s/.*\///" > -print>path/Airtime_Activation/temp_b
echo 'phase1'
cat path/Airtime_Activation/temp_a | while read file;
do
echo 'phase1.5'
echo "$file"
echo 'phase2'
mtime=$(stat -c '%y' $file)
Full_name=${file}_${mtime}
echo "$Full_name" >> path/Airtime_Activation/temp_c
echo 'phase3'
done
#!/bin/bash
Src_Dir1=path/Airtime_Activation/Archive
Src_Dir2=path/Airtime_Activation
# List the base names of the .xlsx and .csv files in each directory.
find "$Src_Dir1" -maxdepth 1 \( -name "*.xlsx" -o -name "*.csv" \) | sed "s/.*\///" > path/Airtime_Activation/temp_a
find "$Src_Dir2" -maxdepth 1 \( -name "*.xlsx" -o -name "*.csv" \) | sed "s/.*\///" > path/Airtime_Activation/temp_b
# temp_c: one "name_mtime" line per file inside the archive.
# stat needs the directory as well, because temp_a holds base names only.
while IFS= read -r file
do
echo "$file"
mtime=$(stat -c '%y' "$Src_Dir1/$file")
Full_name=${file}_${mtime}
echo "$Full_name" >> path/Airtime_Activation/temp_c
done < path/Airtime_Activation/temp_a
# temp_d: the same for the files outside the archive.
while IFS= read -r file
do
mtime=$(stat -c '%y' "$Src_Dir2/$file")
Full_name=${file}_${mtime}
echo "$Full_name" >> path/Airtime_Activation/temp_d
done < path/Airtime_Activation/temp_b
# Compare the two lists and delete the matching files from outside the archive.
grep -Fxf path/Airtime_Activation/temp_d path/Airtime_Activation/temp_c > path/Airtime_Activation/temp_e
while IFS= read -r file
do
echo "${file%_*}"
rm "$Src_Dir2/${file%_*}"
done < path/Airtime_Activation/temp_e
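The temporary files can be avoided by generating the "name_mtime" keys on the fly and intersecting the two lists with comm. The following is a sketch, not the poster's method: both function names are invented, the key uses GNU find's %T+ timestamp format instead of the stat %y format shown in the question, and file names are assumed to contain no newlines.

```shell
# name_mtime_keys DIR
# (hypothetical helper) prints "<name>_<mtime>" for each .csv or .xlsx file
# directly inside DIR, sorted so that comm can consume the output.
name_mtime_keys() {
    find "$1" -maxdepth 1 -type f \( -name '*.csv' -o -name '*.xlsx' \) \
         -printf '%f_%T+\n' | sort
}

# delete_archived_duplicates ARCHIVE_DIR OUTER_DIR
# (hypothetical helper) removes from OUTER_DIR every file that also exists in
# ARCHIVE_DIR with the same name and the same modification time.
delete_archived_duplicates() {
    local archive=$1 outer=$2 key
    while IFS= read -r key; do
        # The timestamp contains no underscore, so stripping the last
        # "_..." suffix recovers the file name.
        rm -- "$outer/${key%_*}"
    done < <(comm -12 <(name_mtime_keys "$archive") <(name_mtime_keys "$outer"))
}
```

comm -12 prints only the lines common to both inputs, which is the set of keys present in both directories.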

How to list files and match first line in bash script?

I would like to check only Python files and find those that do not have #!/usr/bin/env python as the first line. So I wrote a bash script that does the following:
#!/bin/bash
#list all of python files
for file in `find . -name "*.py"`
do
if [ `head -1 $file` != "#!/usr/bin/env python"] then;
echo "no match in file $file"
else then;
echo "match!"
fi
done
However, for some reason I cannot get the if statement correct! I've looked at many questions, but I cannot find one that succinctly describes the issue. Here is the error message:
./run_test.sh: line 9: syntax error near unexpected token `else'
./run_test.sh: line 9: ` else then;'
Where am I going awry? Thank you.
You can do something like
find . -type f -name '*.py' -exec \
awk 'NR==1 && /#!\/usr\/bin\/env python/ \
{ print "Match in file " FILENAME; exit } \
{ print "No match in file " FILENAME; exit }' \
{} \;
If you are going to loop over the results, don't use a for loop:
#!/bin/bash
find . -type f -name '*.py' -print0 | while IFS= read -r -d $'\0' file; do
if [[ $(head -n1 "$file") == "#!/usr/bin/env python" ]]; then
echo "Match in file [$file]"
else
echo "No match in file [$file]"
fi
done
Things to notice:
The [ ] test after if needs correct spacing: there must be a space before the closing ].
The ';' goes after the closing ], not after then (and it is not needed at all if then is on a new line).
You added an extra then after the else.
#!/bin/bash
#list all of python files
for file in `find . -name "*.py"`
do
if [ "$(head -n 1 "$file")" != "#!/usr/bin/env python" ];
then
echo "no match in file $file"
else
echo "match!"
fi
done
Can you use the -exec option by any chance? I find it easier.
find . -name "*.py" -exec head -1 {} | grep -H '#!/usr/bin/env python' \;
You can control the output using grep options.
Edit
Thanks to @chepner: to keep the shell from consuming the pipe before find sees it, wrap the pipeline in sh -c:
-exec sh -c "head -1 {} | grep -H '#!/usr/bin/env python'" \;
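The first line can also be read with the shell's own read builtin, which avoids starting head for every file. The two function names below are invented for this sketch; it assumes bash and a find that supports -print0.

```shell
# has_python_shebang FILE
# (hypothetical helper) succeeds if the first line of FILE is exactly
# "#!/usr/bin/env python".
has_python_shebang() {
    local first
    IFS= read -r first < "$1"    # reads only the first line
    [[ $first == '#!/usr/bin/env python' ]]
}

# report_shebangs [DIR]
# (hypothetical helper) checks every *.py file under DIR (default: .).
report_shebangs() {
    local file
    while IFS= read -r -d '' file; do
        if has_python_shebang "$file"; then
            echo "match: $file"
        else
            echo "no match: $file"
        fi
    done < <(find "${1:-.}" -type f -name '*.py' -print0)
}
```

The comparison is exact, so a first line such as #!/usr/bin/env python3 is reported as no match.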

Is there a way to pipe from a variable?

I'm trying to find all files in a file structure above a certain file size, list them, then delete them. What I currently have looks like this:
filesToDelete=$(find $find $1 -type f -size +$2k -ls)
if [ -n "$filesToDelete" ];then
echo "Deleting files..."
echo $filesToDelete
$filesToDelete | xargs rm
else
echo "no files to delete"
fi
Everything works, except the $filesToDelete | xargs rm, obviously. Is there a way to use pipe on a variable? Or is there another way I could do this? My google-fu didn't really find anything, so any help would be appreciated.
Edit: Thanks for the information everyone. I will post the working code here now for anyone else stumbling upon this question later:
if [ $(find $1 -type f -size +$2k | wc -l) -ge 1 ]; then
find $1 -type f -size +$2k -exec sh -c 'f={}; echo "deleting file $f"; rm $f' {} \;
else
echo "no files above" $2 "kb found"
fi
As already pointed out, you don't need to pipe a variable in this case. But in case you need it in some other situation, you can use
xargs rm <<< $filesToDelete
or, more portably
echo $filesToDelete | xargs rm
Beware of spaces in file names.
To also output the value together with piping it, use tee with process substitution:
echo "$x" | tee >( xargs rm )
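When the list has to be both shown and passed on, a bash array copes with spaces in file names better than a plain string variable. This sketch assumes bash 4.4 or later for mapfile -d '' and an xargs that accepts -0; the function name delete_larger_than is made up for illustration.

```shell
# delete_larger_than DIR SIZE_KB
# (hypothetical helper) lists and then deletes the files under DIR that are
# larger than SIZE_KB kibibytes.
delete_larger_than() {
    local dir=$1 size_kb=$2
    local -a files
    # Collect the NUL-delimited matches into an array, one element per file.
    mapfile -d '' -t files < <(find "$dir" -type f -size +"${size_kb}"k -print0)
    if (( ${#files[@]} == 0 )); then
        echo "no files to delete"
        return 0
    fi
    echo "Deleting files..."
    printf '%s\n' "${files[@]}"
    # "Pipe from the variable": re-emit the array NUL-delimited for xargs.
    printf '%s\0' "${files[@]}" | xargs -0 rm --
}
```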
You can directly use -exec to perform an action on the files that were found in find:
find $1 -type f -size +$2k -exec rm {} \;
The -exec action makes find execute the given command once for each match. {} is replaced by the path of the match, and \; marks the end of the command.
If you want to perform more than one action, -exec sh -c '...' does it. For example, here you can both print the name of each file that is about to be removed and remove it. Note that the file name is passed to the script as an argument and stored with f=$1, so that it can be used later in echo and rm. This is safer than embedding {} in the script text, which breaks on names containing spaces:
find $1 -type f -size +$2k -exec sh -c 'f=$1; echo "removing $f"; rm "$f"' sh {} \;
In case you want to print a message if no matches were found, you can use wc -l to count the number of matches (if any) and do an if / else condition with it:
if [ $(find $1 -type f -size +$2k | wc -l) -ge 1 ]; then
find $1 -type f -size +$2k -exec rm {} \;
else
echo "no matches found"
fi
wc is a command that does word count (see man wc for more info). Doing wc -l counts the number of lines. So command | wc -l counts the number of lines returned by command.
Then we use the if [ $(command | wc -l) -ge 1 ] check, which does an integer comparison: if the value is greater or equal to 1, then do what follows; otherwise, do what is in else.
But the previous approach runs find twice, which is a bit inefficient. As -exec sh -c starts a separate shell process, we cannot rely on a variable to keep track of the number of files deleted. Why? Because a child process cannot assign values to variables in its parent shell.
Instead, let's store the files that were deleted into a file, and then count it:
find . -name "*.txt" -exec sh -c 'f=$1; echo "$f" >> /tmp/findtest; rm "$f"' sh {} \;
if [ -s /tmp/findtest ]; then # check that the file exists and is not empty
echo "file has $(wc -l < /tmp/findtest) lines"
# you can also `cat /tmp/findtest` here to show the deleted files
else
echo "no matches"
fi
Note that you can cat /tmp/findtest to see the deleted files, or also use echo "$f" alone (without redirection) to indicate while removing. rm /tmp/findtest is also an option, to do once the process is finished.
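A single find run can also report and delete in one pass without a temporary file, by capturing what -print emits before -delete removes each match. This is a sketch under assumptions: -delete is available in GNU and BSD find but is not in POSIX, file names are assumed to contain no newlines, and the function name delete_and_count is invented here.

```shell
# delete_and_count DIR SIZE_KB
# (hypothetical helper) deletes the files under DIR larger than SIZE_KB
# kibibytes and reports how many were removed.
delete_and_count() {
    local dir=$1 size_kb=$2
    local -a deleted
    # find prints each path and then deletes it; the output lines are the
    # record of what was removed.
    mapfile -t deleted < <(find "$dir" -type f -size +"${size_kb}"k -print -delete)
    if (( ${#deleted[@]} > 0 )); then
        printf 'deleted %s\n' "${deleted[@]}"
        echo "${#deleted[@]} file(s) deleted"
    else
        echo "no matches found"
    fi
}
```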
You don't need to do all this. You can directly use find command to get the files over a particular size limit and delete it using xargs.
This should work:
#!/bin/bash
if [ $(find $1 -type f -size +$2k | wc -l) -eq 0 ]; then
echo "No Files to delete"
else
echo "Deleting the following files"
find "$1" -type f -size +"$2"k -exec ls {} \+
find "$1" -type f -size +"$2"k -print0 | xargs -0 rm -f
echo "Done"
fi

Bash script to list files not found

I have been looking for a way to list the files that do not exist, given a list of files that are required to exist. The files can exist in more than one location. What I have now:
#!/bin/bash
fileslist="$1"
while read fn
do
if [ ! -f `find . -type f -name $fn ` ];
then
echo $fn
fi
done < $fileslist
If a file does not exist the find command will not print anything and the test does not work. Removing the not and creating an if then else condition does not resolve the problem.
How can I print the filenames that are not found from a list of file names?
New script:
#!/bin/bash
fileslist="$1"
foundfiles="$HOME/tmp/tmp$(date +%Y%m%d%H%M%S).txt"
touch "$foundfiles"
while read -r fn
do
find . -type f -name "$fn" | sed 's:.*/::' >> "$foundfiles"
done < $fileslist
cat $fileslist $foundfiles | sort | uniq -u
rm $foundfiles
#!/bin/bash
fileslist="$1"
while read fn
do
FPATH=`find . -type f -name $fn`
if [ "$FPATH." = "." ]
then
echo $fn
fi
done < $fileslist
You were close!
Here is test.bash:
#!/bin/bash
fn=test.bash
exists=`find . -type f -name $fn`
if [ -n "$exists" ]
then
echo Found it
fi
It sets $exists to the output of find; the -n test in the if then checks that the result is not empty.
Try replacing the loop body with [[ -z "$(find . -type f -name $fn)" ]] && echo $fn (note that this code is bound to have problems with filenames containing spaces).
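The per-name test can be made a little cheaper and safer by quoting the name and letting find stop at the first hit. The function names exists_under and list_missing are invented for this sketch, and -quit is an extension found in GNU and BSD find rather than a POSIX feature.

```shell
# exists_under DIR NAME
# (hypothetical helper) succeeds if a regular file called NAME exists
# anywhere under DIR. -quit makes find stop after the first match.
exists_under() {
    local dir=$1 name=$2
    [[ -n $(find "$dir" -type f -name "$name" -print -quit) ]]
}

# list_missing LIST [DIR]
# (hypothetical helper) prints each name from LIST that is not found
# under DIR (default: .).
list_missing() {
    local list=$1 dir=${2:-.} fn
    while IFS= read -r fn; do
        exists_under "$dir" "$fn" || echo "$fn"
    done < "$list"
}
```

Because -name treats its argument as a pattern, a listed name containing *, ? or [ would be matched as a glob.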
More efficient bashism:
diff <(sort $fileslist|uniq) <(find . -type f -printf %f\\n|sort|uniq)
I think you can handle diff output.
Give this a try:
find -type f -print0 | grep -Fzxvf - requiredfiles.txt
The -print0 and -z protect against filenames which contain newlines. If your utilities don't have these options and your filenames don't contain newlines, you should be OK.
The repeated find to filter one file at a time is very expensive. If your file list is directly compatible with the output from find, run a single find and remove any matches from your list:
find . -type f |
fgrep -vxf - "$1"
If not, maybe you can massage the output from find in the pipeline before the fgrep so that it matches the format in your file; or, conversely, massage the data in your file into find-compatible.
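As one way to do that massaging, find can print bare file names so that they line up with a list of names. The sketch assumes GNU find for -printf '%f\n', a list with one name per line, and names without newlines; the function name missing_files is made up for illustration.

```shell
# missing_files LIST [DIR]
# (hypothetical helper) prints the names in LIST that do not occur as the
# base name of any regular file under DIR (default: .).
missing_files() {
    local list=$1 dir=${2:-.}
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file.
    # A single find run supplies every existing base name as a pattern.
    grep -Fvxf <(find "$dir" -type f -printf '%f\n') -- "$list"
}
```

grep exits with status 1 when nothing is missing, which a caller can use as a quick "all present" check.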
I use this script and it works for me
#!/bin/bash
fileslist="$1"
found="Found:"
notfound="Not found:"
len=`cat $1 | wc -l`
n=0;
while read fn
do
# don't worry about this, i use it to display the file list progress
n=$((n + 1))
echo -en "\rLooking $(echo "scale=0; $n * 100 / $len" | bc)% "
if [ $(find / -name $fn | wc -l) -gt 0 ]
then
found=$(printf "$found\n\t$fn")
else
notfound=$(printf "$notfound\n\t$fn")
fi
done < $fileslist
printf "\n$found\n$notfound\n"
The test below counts the lines of find output; if the count is greater than 0, the file was found. find / searches the entire filesystem. You could replace / with . to search only the current directory.
$(find / -name $fn | wc -l) -gt 0
Then I simply run it with a file list that has one file name per line:
./search.sh files.list

Resources