Is there a way to pipe from a variable? - bash

I'm trying to find all files in a file structure above a certain file size, list them, then delete them. What I currently have looks like this:
filesToDelete=$(find $find $1 -type f -size +$2k -ls)
if [ -n "$filesToDelete" ];then
echo "Deleting files..."
echo $filesToDelete
$filesToDelete | xargs rm
else
echo "no files to delete"
fi
Everything works, except the $filesToDelete | xargs rm, obviously. Is there a way to use pipe on a variable? Or is there another way I could do this? My google-fu didn't really find anything, so any help would be appreciated.
Edit: Thanks for the information everyone. I will post the working code here now for anyone else stumbling upon this question later:
if [ $(find $1 -type f -size +$2k | wc -l) -ge 1 ]; then
find $1 -type f -size +$2k -exec sh -c 'f={}; echo "deleting file $f"; rm $f' {} \;
else
echo "no files above" $2 "kb found"
fi

As already pointed out, you don't need piping a var in this case. But just in case you needed it in some other situation, you can use
xargs rm <<< $filesToDelete
or, more portably
echo $filesToDelete | xargs rm
Beware of spaces in file names.
To also output the value together with piping it, use tee with process substitution:
echo "$x" | tee >( xargs rm )

You can directly use -exec to perform an action on the files that were found in find:
find $1 -type f -size +$2k -exec rm {} \;
The -exec trick makes find execute the command given for each one of the matches found. To refer the match itself we have to use {} \;.
If you want to perform more than one action, -exec sh -c "..." makes it. For example, here you can both print the name of the files are about to be removed... and remove them. Note the f={} thingy to store the name of the file, so that it can be used later on in echo and rm:
find $1 -type f -size +$2k -exec sh -c 'f={}; echo "removing $f"; rm $f' {} \;
In case you want to print a message if no matches were found, you can use wc -l to count the number of matches (if any) and do an if / else condition with it:
if [ $(find $1 -type f -size +$2k | wc -l) -ge 1 ]; then
find $1 -type f -size +$2k -exec rm {} \;
else
echo "no matches found"
fi
wc is a command that does word count (see man wc for more info). Doing wc -l counts the number of lines. So command | wc -l counts the number of lines returned by command.
Then we use the if [ $(command | wc -l) -ge 1 ] check, which does an integer comparison: if the value is greater or equal to 1, then do what follows; otherwise, do what is in else.
Buuuut the previous approach was using find twice, which is a bit inefficient. As -exec sh -c is opening a sub-shell, we cannot rely on a variable to keep track of the number of files opened. Why? Because a sub-shell cannot assign values to its parent shell.
Instead, let's store the files that were deleted into a file, and then count it:
find . -name "*.txt" -exec sh -c 'f={}; echo "$f" >> /tmp/findtest; rm $f' {} \;
if [ -s /tmp/findtest ]; then #check if the file is empty
echo "file has $(wc -l < /tmp/findtest) lines"
# you can also `cat /tmp/findtest` here to show the deleted files
else
echo "no matches"
fi
Note that you can cat /tmp/findtest to see the deleted files, or also use echo "$f" alone (without redirection) to indicate while removing. rm /tmp/findtest is also an option, to do once the process is finished.

You don't need to do all this. You can directly use find command to get the files over a particular size limit and delete it using xargs.
This should work:
#!/bin/bash
if [ $(find $1 -type f -size +$2k | wc -l) -eq 0 ]; then
echo "No Files to delete"
else
echo "Deleting the following files"
find $1 -size +$2 -exec ls {} \+
find $1 -size +$2 -exec ls {} \+ | xargs rm -f
echo "Done"
fi

Related

iterate over lines in file then find in directory

I am having trouble looping and searching. It seems that the loop is not waiting for the find to finish. What am I doing wrong?
I made a loop the reads a file line by line. I then want to use that "name" to search a directory looking to see if a folder has that name. If it exists copy it to a drive.
#!/bin/bash
DIRFIND="$2"
DIRCOPY="$3"
if [ -d $DIRFIND ]; then
while IFS='' read -r line || [[ -n "$line" ]]; do
echo "$line"
FILE=`find "$DIRFIND" -type d -name "$line"`
if [ -n "$FILE" ]; then
echo "Found $FILE"
cp -a "$FILE" "$DIRCOPY"
else
echo "$line not found."
fi
done < "$1"
else
echo "No such file or directory"
fi
Have you tried xargs...
Proposed Solution
cat filenamelist | xargs -n1 -I {} find . -type d -name {} -print | xargs -n1 -I {} mv {} .
what the above does is pipe a list of filenames into find (one at a time), when found find prints the name and passes to xarg which moves the file...
Expansion
file = yogo
yogo -> | xargs -n1 -I yogo find . -type d -name yogo -print | xargs -n1 -I {} mv ./<path>/yogo .
I hope the above helps, note that xargs has the advantage that you do not run out of command line buffer.

Count and remove old files using Unix find

I want to delete files in $DIR_TO_CLEAN older than $DAYS_TO_SAVE days. Easy:
find "$DIR_TO_CLEAN" -mtime +$DAYS_TO_SAVE -exec rm {} \;
I suppose we could add a -type f or a -f flag for rm, but I really would like to count the number of files getting deleted.
We could do this naively:
DELETE_COUNT=`find "$DIR_TO_CLEAN" -mtime +$DAYS_TO_SAVE | wc -l`
find "$DIR_TO_CLEAN" -mtime +$DAYS_TO_SAVE -exec rm {} \;
But this solution leaves a lot to be desired. Besides the command duplication, this snippet overestimates the count if rm failed to delete a file.
I'm decently comfortable with redirection, pipes (including named ones), subshells, xargs, tee, etc, but I am eager to learn new tricks. I would like a solution that works on both bash and ksh.
How would you count the number of files deleted by find?
I would avoid -exec and go for a piped solution:
find "$DIR_TO_CLEAN" -type f -mtime +$DAYS_TO_SAVE -print0 \
| awk -v RS='\0' -v ORS='\0' '{ print } END { print NR }' \
| xargs -0 rm
Using awk to count matches and pass them on to rm.
Update:
kojiro made me aware that the above solution does not count the success/fail rate of rm. As awk has issues with badly named files I think the following bash solution might be better:
find "${DIR_TO_CLEAN?}" -type f -mtime +${DAYS_TO_SAVE?} -print0 |
(
success=0 fail=0
while read -rd $'\0' file; do
if rm "$file" 2> /dev/null; then
(( success++ ))
else
(( fail++ ))
fi
done
echo $success $fail
)
You could just use bash within find:
find "$DIR_TO_CLEAN" -mtime +$DAYS_TO_SAVE -exec bash -c 'printf "Total: %d\n" $#; rm "$#"' _ {} +
Of course this can call bash -c … more than once if the number of files found is larger than MAX_ARGS, and it also can overestimate the count if rm fails. But solving those problems gets messy:
find "$DIR_TO_CLEAN" -mtime +$DAYS_TO_SAVE -exec bash -c 'printf "count=0; for f; do rm "$f" && (( count++ )); done; printf "Total: %d\n" $count' _ {} +
This solution to avoid MAX_ARGS limits avoids find altogether. If you need it to be recursive, you'll have to use recursive globbing, which is only available in newer shells. (globstar is a bash 4 feature.)
shopt -s globstar
# Assume DAYS_TO_SAVE reformatted to how touch -m expects it. (Exercise for the reader.)
touch -m "$DAYS_TO_SAVE" referencefile
count=0
for file in "$DIR_TO_CLEAN/"**/*; do
if [[ referencefile -nt "$file" ]]; then
rm "$file" && (( count++ ))
fi
done
printf 'Total: %d\n' "$count"
Here's an approach using find with printf (strictly compliant find doesn't have printf, but you can use printf as a standalone utility in that case).
find "$DIR_TO_CLEAN" -type -f -mtime "+$DAYS_TO_SAVE" -exec rm {} \; -printf '.' | wc -c
find "$DIR_TO_CLEAN" -type -f -mtime "+$DAYS_TO_SAVE" -exec rm {} \; -exec printf '.' \; | wc -c

Bash script to list files not found

I have been looking for a way to list file that do not exist from a list of files that are required to exist. The files can exist in more than one location. What I have now:
#!/bin/bash
fileslist="$1"
while read fn
do
if [ ! -f `find . -type f -name $fn ` ];
then
echo $fn
fi
done < $fileslist
If a file does not exist the find command will not print anything and the test does not work. Removing the not and creating an if then else condition does not resolve the problem.
How can i print the filenames that are not found from a list of file names?
New script:
#!/bin/bash
fileslist="$1"
foundfiles="~/tmp/tmp`date +%Y%m%d%H%M%S`.txt"
touch $foundfiles
while read fn
do
`find . -type f -name $fn | sed 's:./.*/::' >> $foundfiles`
done < $fileslist
cat $fileslist $foundfiles | sort | uniq -u
rm $foundfiles
#!/bin/bash
fileslist="$1"
while read fn
do
FPATH=`find . -type f -name $fn`
if [ "$FPATH." = "." ]
then
echo $fn
fi
done < $fileslist
You were close!
Here is test.bash:
#!/bin/bash
fn=test.bash
exists=`find . -type f -name $fn`
if [ -n "$exists" ]
then
echo Found it
fi
It sets $exists = to the result of the find. the if -n checks if the result is not null.
Try replacing body with [[ -z "$(find . -type f -name $fn)" ]] && echo $fn. (note that this code is bound to have problems with filenames containing spaces).
More efficient bashism:
diff <(sort $fileslist|uniq) <(find . -type f -printf %f\\n|sort|uniq)
I think you can handle diff output.
Give this a try:
find -type f -print0 | grep -Fzxvf - requiredfiles.txt
The -print0 and -z protect against filenames which contain newlines. If your utilities don't have these options and your filenames don't contain newlines, you should be OK.
The repeated find to filter one file at a time is very expensive. If your file list is directly compatible with the output from find, run a single find and remove any matches from your list:
find . -type f |
fgrep -vxf - "$1"
If not, maybe you can massage the output from find in the pipeline before the fgrep so that it matches the format in your file; or, conversely, massage the data in your file into find-compatible.
I use this script and it works for me
#!/bin/bash
fileslist="$1"
found="Found:"
notfound="Not found:"
len=`cat $1 | wc -l`
n=0;
while read fn
do
# don't worry about this, i use it to display the file list progress
n=$((n + 1))
echo -en "\rLooking $(echo "scale=0; $n * 100 / $len" | bc)% "
if [ $(find / -name $fn | wc -l) -gt 0 ]
then
found=$(printf "$found\n\t$fn")
else
notfound=$(printf "$notfound\n\t$fn")
fi
done < $fileslist
printf "\n$found\n$notfound\n"
The line counts the number of lines and if its greater than 0 the find was a success. This searches everything on the hdd. You could replace / with . for just the current directory.
$(find / -name $fn | wc -l) -gt 0
Then i simply run it with the files in the files list being separated by newline
./search.sh files.list

find -exec with multiple commands

I am trying to use find -exec with multiple commands without any success. Does anybody know if commands such as the following are possible?
find *.txt -exec echo "$(tail -1 '{}'),$(ls '{}')" \;
Basically, I am trying to print the last line of each txt file in the current directory and print at the end of the line, a comma followed by the filename.
find accepts multiple -exec portions to the command. For example:
find . -name "*.txt" -exec echo {} \; -exec grep banana {} \;
Note that in this case the second command will only run if the first one returns successfully, as mentioned by #Caleb. If you want both commands to run regardless of their success or failure, you could use this construct:
find . -name "*.txt" \( -exec echo {} \; -o -exec true \; \) -exec grep banana {} \;
find . -type d -exec sh -c "echo -n {}; echo -n ' x '; echo {}" \;
One of the following:
find *.txt -exec awk 'END {print $0 "," FILENAME}' {} \;
find *.txt -exec sh -c 'echo "$(tail -n 1 "$1"),$1"' _ {} \;
find *.txt -exec sh -c 'echo "$(sed -n "\$p" "$1"),$1"' _ {} \;
Another way is like this:
multiple_cmd() {
tail -n1 $1;
ls $1
};
export -f multiple_cmd;
find *.txt -exec bash -c 'multiple_cmd "$0"' {} \;
in one line
multiple_cmd() { tail -1 $1; ls $1 }; export -f multiple_cmd; find *.txt -exec bash -c 'multiple_cmd "$0"' {} \;
"multiple_cmd()" - is a function
"export -f multiple_cmd" - will export it so any other subshell can see it
"find *.txt -exec bash -c 'multiple_cmd "$0"' {} \;" - find that will execute the function on your example
In this way multiple_cmd can be as long and as complex, as you need.
Hope this helps.
There's an easier way:
find ... | while read -r file; do
echo "look at my $file, my $file is amazing";
done
Alternatively:
while read -r file; do
echo "look at my $file, my $file is amazing";
done <<< "$(find ...)"
Extending #Tinker's answer,
In my case, I needed to make a command | command | command inside the -exec to print both the filename and the found text in files containing a certain text.
I was able to do it with:
find . -name config -type f \( -exec grep "bitbucket" {} \; -a -exec echo {} \; \)
the result is:
url = git#bitbucket.org:a/a.git
./a/.git/config
url = git#bitbucket.org:b/b.git
./b/.git/config
url = git#bitbucket.org:c/c.git
./c/.git/config
I don't know if you can do this with find, but an alternate solution would be to create a shell script and to run this with find.
lastline.sh:
echo $(tail -1 $1),$1
Make the script executable
chmod +x lastline.sh
Use find:
find . -name "*.txt" -exec ./lastline.sh {} \;
Thanks to Camilo Martin, I was able to answer a related question:
What I wanted to do was
find ... -exec zcat {} | wc -l \;
which didn't work. However,
find ... | while read -r file; do echo "$file: `zcat $file | wc -l`"; done
does work, so thank you!
1st answer of Denis is the answer to resolve the trouble. But in fact it is no more a find with several commands in only one exec like the title suggest. To answer the one exec with several commands thing we will have to look for something else to resolv. Here is a example:
Keep last 10000 lines of .log files which has been modified in the last 7 days using 1 exec command using severals {} references
1) see what the command will do on which files:
find / -name "*.log" -a -type f -a -mtime -7 -exec sh -c "echo tail -10000 {} \> fictmp; echo cat fictmp \> {} " \;
2) Do it: (note no more "\>" but only ">" this is wanted)
find / -name "*.log" -a -type f -a -mtime -7 -exec sh -c "tail -10000 {} > fictmp; cat fictmp > {} ; rm fictmp" \;
I usually embed the find in a small for loop one liner, where the find is executed in a subcommand with $().
Your command would look like this then:
for f in $(find *.txt); do echo "$(tail -1 $f), $(ls $f)"; done
The good thing is that instead of {} you just use $f and instead of the -exec … you write all your commands between do and ; done.
Not sure what you actually want to do, but maybe something like this?
for f in $(find *.txt); do echo $f; tail -1 $f; ls -l $f; echo; done
should use xargs :)
find *.txt -type f -exec tail -1 {} \; | xargs -ICONSTANT echo $(pwd),CONSTANT
another one (working on osx)
find *.txt -type f -exec echo ,$(PWD) {} + -exec tail -1 {} + | tr ' ' '/'
A find+xargs answer.
The example below finds all .html files and creates a copy with the .BAK extension appended (e.g. 1.html > 1.html.BAK).
Single command with multiple placeholders
find . -iname "*.html" -print0 | xargs -0 -I {} cp -- "{}" "{}.BAK"
Multiple commands with multiple placeholders
find . -iname "*.html" -print0 | xargs -0 -I {} echo "cp -- {} {}.BAK ; echo {} >> /tmp/log.txt" | sh
# if you need to do anything bash-specific then pipe to bash instead of sh
This command will also work with files that start with a hyphen or contain spaces such as -my file.html thanks to parameter quoting and the -- after cp which signals to cp the end of parameters and the beginning of the actual file names.
-print0 pipes the results with null-byte terminators.
for xargs the -I {} parameter defines {} as the placeholder; you can use whichever placeholder you like; -0 indicates that input items are null-separated.
I found this solution (maybe it is already said in a comment, but I could not find any answer with this)
you can execute MULTIPLE COMMANDS in a row using "bash -c"
find . <SOMETHING> -exec bash -c "EXECUTE 1 && EXECUTE 2 ; EXECUTE 3" \;
in your case
find . -name "*.txt" -exec bash -c "tail -1 '{}' && ls '{}'" \;
i tested it with a test file:
[gek#tuffoserver tmp]$ ls *.txt
casualfile.txt
[gek#tuffoserver tmp]$ find . -name "*.txt" -exec bash -c "tail -1 '{}' && ls '{}'" \;
testonline1=some TEXT
./casualfile.txt
Here is my bash script that you can use to find multiple files and then process them all using a command.
Example of usage. This command applies a file linux command to each found file:
./finder.sh file fb2 txt
Finder script:
# Find files and process them using an external command.
# Usage:
# ./finder.sh ./processing_script.sh txt fb2 fb2.zip doc docx
counter=0
find_results=()
for ext in "${#:2}"
do
# #see https://stackoverflow.com/a/54561526/10452175
readarray -d '' ext_results < <(find . -type f -name "*.${ext}" -print0)
for file in "${ext_results[#]}"
do
counter=$((counter+1))
find_results+=("${file}")
echo ${counter}") ${file}"
done
done
countOfResults=$((counter))
echo -e "Found ${countOfResults} files.\n"
echo "Processing..."
counter=0
for file in "${find_results[#]}"
do
counter=$((counter+1))
echo -n ${counter}"/${countOfResults}) "
eval "$1 '${file}'"
done
echo "All files have been processed."

How to loop through a directory recursively to delete files with certain extensions

I need to loop through a directory recursively and remove all files with extension .pdf and .doc. I'm managing to loop through a directory recursively but not managing to filter the files with the above mentioned file extensions.
My code so far
#/bin/sh
SEARCH_FOLDER="/tmp/*"
for f in $SEARCH_FOLDER
do
if [ -d "$f" ]
then
for ff in $f/*
do
echo "Processing $ff"
done
else
echo "Processing file $f"
fi
done
I need help to complete the code, since I'm not getting anywhere.
As a followup to mouviciel's answer, you could also do this as a for loop, instead of using xargs. I often find xargs cumbersome, especially if I need to do something more complicated in each iteration.
for f in $(find /tmp -name '*.pdf' -or -name '*.doc'); do rm $f; done
As a number of people have commented, this will fail if there are spaces in filenames. You can work around this by temporarily setting the IFS (internal field seperator) to the newline character. This also fails if there are wildcard characters \[?* in the file names. You can work around that by temporarily disabling wildcard expansion (globbing).
IFS=$'\n'; set -f
for f in $(find /tmp -name '*.pdf' -or -name '*.doc'); do rm "$f"; done
unset IFS; set +f
If you have newlines in your filenames, then that won't work either. You're better off with an xargs based solution:
find /tmp \( -name '*.pdf' -or -name '*.doc' \) -print0 | xargs -0 rm
(The escaped brackets are required here to have the -print0 apply to both or clauses.)
GNU and *BSD find also has a -delete action, which would look like this:
find /tmp \( -name '*.pdf' -or -name '*.doc' \) -delete
find is just made for that.
find /tmp -name '*.pdf' -or -name '*.doc' | xargs rm
Without find:
for f in /tmp/* tmp/**/* ; do
...
done;
/tmp/* are files in dir and /tmp/**/* are files in subfolders. It is possible that you have to enable globstar option (shopt -s globstar).
So for the question the code should look like this:
shopt -s globstar
for f in /tmp/*.pdf /tmp/*.doc tmp/**/*.pdf tmp/**/*.doc ; do
rm "$f"
done
Note that this requires bash ≥4.0 (or zsh without shopt -s globstar, or ksh with set -o globstar instead of shopt -s globstar). Furthermore, in bash <4.3, this traverses symbolic links to directories as well as directories, which is usually not desirable.
If you want to do something recursively, I suggest you use recursion (yes, you can do it using stacks and so on, but hey).
recursiverm() {
for d in *; do
if [ -d "$d" ]; then
(cd -- "$d" && recursiverm)
fi
rm -f *.pdf
rm -f *.doc
done
}
(cd /tmp; recursiverm)
That said, find is probably a better choice as has already been suggested.
Here is an example using shell (bash):
#!/bin/bash
# loop & print a folder recusively,
print_folder_recurse() {
for i in "$1"/*;do
if [ -d "$i" ];then
echo "dir: $i"
print_folder_recurse "$i"
elif [ -f "$i" ]; then
echo "file: $i"
fi
done
}
# try get path from param
path=""
if [ -d "$1" ]; then
path=$1;
else
path="/tmp"
fi
echo "base path: $path"
print_folder_recurse $path
This doesn't answer your question directly, but you can solve your problem with a one-liner:
find /tmp \( -name "*.pdf" -o -name "*.doc" \) -type f -exec rm {} +
Some versions of find (GNU, BSD) have a -delete action which you can use instead of calling rm:
find /tmp \( -name "*.pdf" -o -name "*.doc" \) -type f -delete
For bash (since version 4.0):
shopt -s globstar nullglob dotglob
echo **/*".ext"
That's all.
The trailing extension ".ext" there to select files (or dirs) with that extension.
Option globstar activates the ** (search recursivelly).
Option nullglob removes an * when it matches no file/dir.
Option dotglob includes files that start wit a dot (hidden files).
Beware that before bash 4.3, **/ also traverses symbolic links to directories which is not desirable.
This method handles spaces well.
files="$(find -L "$dir" -type f)"
echo "Count: $(echo -n "$files" | wc -l)"
echo "$files" | while read file; do
echo "$file"
done
Edit, fixes off-by-one
function count() {
files="$(find -L "$1" -type f)";
if [[ "$files" == "" ]]; then
echo "No files";
return 0;
fi
file_count=$(echo "$files" | wc -l)
echo "Count: $file_count"
echo "$files" | while read file; do
echo "$file"
done
}
This is the simplest way I know to do this:
rm **/#(*.doc|*.pdf)
** makes this work recursively
#(*.doc|*.pdf) looks for a file ending in pdf OR doc
Easy to safely test by replacing rm with ls
The following function would recursively iterate through all the directories in the \home\ubuntu directory( whole directory structure under ubuntu ) and apply the necessary checks in else block.
function check {
for file in $1/*
do
if [ -d "$file" ]
then
check $file
else
##check for the file
if [ $(head -c 4 "$file") = "%PDF" ]; then
rm -r $file
fi
fi
done
}
domain=/home/ubuntu
check $domain
There is no reason to pipe the output of find into another utility. find has a -delete flag built into it.
find /tmp -name '*.pdf' -or -name '*.doc' -delete
The other answers provided will not include files or directories that start with a . the following worked for me:
#/bin/sh
getAll()
{
local fl1="$1"/*;
local fl2="$1"/.[!.]*;
local fl3="$1"/..?*;
for inpath in "$1"/* "$1"/.[!.]* "$1"/..?*; do
if [ "$inpath" != "$fl1" -a "$inpath" != "$fl2" -a "$inpath" != "$fl3" ]; then
stat --printf="%F\0%n\0\n" -- "$inpath";
if [ -d "$inpath" ]; then
getAll "$inpath"
#elif [ -f $inpath ]; then
fi;
fi;
done;
}
I think the most straightforward solution is to use recursion, in the following example, I have printed all the file names in the directory and its subdirectories.
You can modify it according to your needs.
#!/bin/bash
printAll() {
for i in "$1"/*;do # for all in the root
if [ -f "$i" ]; then # if a file exists
echo "$i" # print the file name
elif [ -d "$i" ];then # if a directroy exists
printAll "$i" # call printAll inside it (recursion)
fi
done
}
printAll $1 # e.g.: ./printAll.sh .
OUTPUT:
> ./printAll.sh .
./demoDir/4
./demoDir/mo st/1
./demoDir/m2/1557/5
./demoDir/Me/nna/7
./TEST
It works fine with spaces as well!
Note:
You can use echo $(basename "$i") # print the file name to print the file name without its path.
OR: Use echo ${i%/##*/}; # print the file name which runs extremely faster, without having to call the external basename.
Just do
find . -name '*.pdf'|xargs rm
If you can change the shell used to run the command, you can use ZSH to do the job.
#!/usr/bin/zsh
for file in /tmp/**/*
do
echo $file
done
This will recursively loop through all files/folders.
The following will loop through the given directory recursively and list all the contents :
for d in /home/ubuntu/*;
do
echo "listing contents of dir: $d";
ls -l $d/;
done

Resources