Find and delete old files excluding some subdirectories - shell

I have been searching for a while, but can't seem to find a succinct solution. I am trying to delete old files while excluding some subdirectories (passed via parm) and their child subdirectories.
The issue I am having is that when subdirectory_name is itself older than the given duration (also passed via parm), the find command includes subdirectory_name in its results. In practice rm won't be able to delete these subdirectories, because it is not called with -r, but they still show up in the list.
Here is the find command generated by the script:
find /directory/ \( -type f -name '*' -o -type d \
-name subdirectory1 -prune -o -type d -name directory3 \
-prune -o -type d -name subdirectory2 -prune -o \
-type d -name subdirectory3 -prune \) -mtime +60 \
-exec rm {} \; -print
Here is the list of files (and subdirectories brought by the find command)
/directory/subdirectory1 ==> this is a subdirectory name and I'd like it not to be included
/directory/subdirectory2 ==> this is a subdirectory name and I'd like it not to be included
/directory/subdirectory3 ==> this is a subdirectory name and I'd like it not to be included
/directory/subdirectory51/file51
/directory/file1 with spaces
Apart from this, the script works fine and excludes the files under these 3 subdirectories:
subdirectory1, subdirectory2 and subdirectory3.
Thank you.

The following command deletes only files older than 60 days.
You can exclude directories as shown in the example below; the directories test1 and test2 will be excluded.
find /path/ -type d \( -path /path/test1 -o -path /path/test2 \) -prune -o -type f -mtime +60 -print0 | xargs -0 rm -f
It is advisable, though, to first check what is going to be deleted by using -print:
find /path/ -type d \( -path /path/test1 -o -path /path/test2 \) -prune -o -type f -mtime +60 -print

find /directory/ -type d \( \
-name subdirectory1 -o \
-name subdirectory2 -o \
-name subdirectory3 \) -prune -o \
-type f -mtime +60 -print -exec rm -f {} +
Note that the AND operator (-a, implicit between two predicates if not specified) has precedence over the OR one (-o). So the above is like:
find /directory/ \( -type d -a \( \
-name subdirectory1 -o \
-name subdirectory2 -o \
-name subdirectory3 \) -a -prune \) -o \
\( -type f -a -mtime +60 -a -print -a -exec rm -f {} + \)
Note that every file name matches the * pattern, so -name '*' is like -true and is of no use.
Using + instead of ; runs fewer rm commands (as few as possible, and each is passed several files to remove).
Do not use the code above on directories writable by others: it is vulnerable to a race in which an attacker replaces a directory with a symlink to another one between the time find traverses it and the time rm runs, tricking you into deleting any file on the filesystem. This can be mitigated by replacing the -exec part with -delete or -execdir rm -f {} \; if your find supports them.
See also the -path predicate if you want to exclude a specific subdirectory1 instead of any directory whose name is subdirectory1.
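To check the behaviour before deleting anything, the prune-then-test structure above can be tried in a throwaway sandbox. This is only a sketch: the temporary directory, the file names and the 2020 timestamp are made up for illustration, and it prints what would be removed instead of calling rm.

```shell
#!/bin/sh
# Hypothetical sandbox; none of these paths come from the original script.
tmp=$(mktemp -d)
mkdir -p "$tmp/subdirectory1" "$tmp/subdirectory51"
touch "$tmp/subdirectory1/keep_me" "$tmp/subdirectory51/file51" \
      "$tmp/file1 with spaces" "$tmp/recent_file"

# Backdate everything except recent_file so that -mtime +60 matches.
# subdirectory1 itself is backdated too, which is the case from the question.
touch -t 202001010000 "$tmp/subdirectory1/keep_me" "$tmp/subdirectory1" \
      "$tmp/subdirectory51/file51" "$tmp/file1 with spaces"

# Dry run: prune the excluded directory first, then test only regular files.
would_delete=$(find "$tmp" -type d -name subdirectory1 -prune -o \
    -type f -mtime +60 -print | sed "s|^$tmp/||" | LC_ALL=C sort)

printf '%s\n' "$would_delete"
rm -rf "$tmp"
```

Because -mtime +60 sits in the file branch only, the old subdirectory1 is neither listed nor descended into, and recent_file is left out because it is not old enough.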

Related

Need to find files with multiples arguments in bash

This is what I have so far. I need to find, in the main directory and its subdirectories, everything starting with the letter 'a', every file ending with 'z', and every file starting with 'z' and ending with 'a!'.
find . -name "a*" | find . "*z" -type f | find . "z*a!" -type f
I tried to be as clear as possible, sorry if it wasn't clear enough.
find . -type f \( -name 'a*' -or -name '*z' -or -name 'z*a!' \)
Use -o instead of -or for POSIX compliance.
If you really want to also find links, directories, pipes etc. starting with a but only files matching the remaining conditions, you can do
find . -name 'a*' -or -type f \( -name '*z' -or -name 'z*a!' \)
TL;DR
find . -name 'a*' -o -type f \( -name '*z' -o -name 'z*a!' \)
Explanations:
The find logical operators are -a (AND) and -o (OR). You use them to combine elementary tests. Note that because of operator precedence you sometimes need parentheses, and that they must be escaped (with \) to prevent their interpretation by the shell. Your test is:
everything starting with the letter 'a': -name 'a*'.
every file ending with 'z': -type f -a -name '*z'.
every file starting with 'z' and ending with 'a!': -type f -a -name 'z*a!'.
So the complete test could be:
-name 'a*' -o \( -type f -a -name '*z' \) -o \( -type f -a -name 'z*a!' \)
As -a is the default we can omit it, and as -type f (file) is common to the two last terms of the disjunction we can factor it:
-name 'a*' -o -type f \( -name '*z' -o -name 'z*a!' \)
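The factored expression can be checked against a small made-up tree. The names below (adir, apple, jazz, dirz and so on) are invented for this sketch; dirz is a directory ending in 'z' and is there to show that the -type f restriction only applies to the last two patterns.

```shell
#!/bin/sh
# Hypothetical test tree, created in a temporary directory.
tmp=$(mktemp -d)
mkdir "$tmp/adir" "$tmp/sub" "$tmp/dirz"
touch "$tmp/apple" "$tmp/jazz" "$tmp/sub/other" "$tmp/sub/"'za!'

# 'a*' matches anything (files or directories); the other two patterns
# only match regular files because -type f is factored in front of them.
matches=$(cd "$tmp" && find . -name 'a*' -o -type f \( -name '*z' -o -name 'z*a!' \) | LC_ALL=C sort)

printf '%s\n' "$matches"
rm -rf "$tmp"
```

The directory adir is reported because it starts with 'a', while dirz and sub/other are not.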

Why doesn't find let me match multiple patterns?

I'm writing some bash/zsh scripts that process some files. I want to execute a command for each file of a certain type, and some of these commands overlap. When I try to find -name 'pattern1' -or -name 'pattern2', only the last pattern is used (files matching pattern1 aren't returned; only files matching pattern2). What I want is for files matching either pattern1 or pattern2 to be matched.
For example, when I try the following this is what I get (notice only ./foo.xml is found and printed):
$ ls -a
. .. bar.html foo.xml
$ tree .
.
├── bar.html
└── foo.xml
0 directories, 2 files
$ find . -name '*.html' -or -name '*.xml' -exec echo {} \;
./foo.xml
$ type find
find is an alias for noglob find
find is /usr/bin/find
Using -o instead of -or gives the same results. If I switch the order of the -name parameters, then only bar.html is returned and not foo.xml.
Why aren't bar.html and foo.xml found and returned? How can I match multiple patterns?
You need to group your conditions with parentheses in the find command; otherwise -exec applies only to the second -name test, because the implicit AND binds more tightly than -or.
find . \( -name '*.html' -or -name '*.xml' \) -exec echo {} \;
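The difference can be reproduced in a scratch directory holding the same two file names as in the question. This is a minimal sketch; the temporary directory is an assumption of the example.

```shell
#!/bin/sh
# Scratch directory with the two files from the question.
tmp=$(mktemp -d)
touch "$tmp/bar.html" "$tmp/foo.xml"

# Without parentheses: parsed as  -name '*.html'  -o  ( -name '*.xml' -a -exec ... )
ungrouped=$(cd "$tmp" && find . -name '*.html' -o -name '*.xml' -exec echo {} \; | LC_ALL=C sort)

# With parentheses: -exec applies to files matching either pattern.
grouped=$(cd "$tmp" && find . \( -name '*.html' -o -name '*.xml' \) -exec echo {} \; | LC_ALL=C sort)

printf 'ungrouped:\n%s\ngrouped:\n%s\n' "$ungrouped" "$grouped"
rm -rf "$tmp"
```

The ungrouped form prints only ./foo.xml, as in the question; the grouped form prints both files.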
find utility
-print == default
If you just want to print file paths and names, you can drop -exec echo, because -print is the default action:
find . -name '*.html' -or -name '*.xml'
Order dependency
Otherwise, find evaluates its expression from left to right, so argument order matters!
So if you want different actions for different tests, respect the precedence of AND and OR:
find . -name '*.html' -exec echo ">"{} \; -o -name '*.xml' -exec echo "+"{} \;
or
find . -maxdepth 4 \( -name '*.html' -o -name '*.xml' \) -exec echo {} \;
Expression -print0 and xargs command.
But, for most cases, you could consider -print0 with xargs command, like:
find . \( -name '*.html' -o -name '*.xml' \) -print0 |
xargs -0 printf -- "-- %s -\n"
The advantage of doing this is:
Only one (or a few) forks for thousands of entries found. (Using -exec echo {} \; runs one subprocess for each entry found, while xargs builds command lines with as many arguments as one command line can hold.)
To handle filenames containing special characters or whitespace, -print0 and xargs -0 use the NUL character as the filename delimiter.
find ... -exec ... {} ... +
For some years now, find has accepted an additional syntax for the -exec action.
Instead of \;, -exec can end with a plus sign +.
find . \( -name '*.html' -o -name '*.xml' \) -exec printf -- "-- %s -\n" {} +
With this syntax, find works like xargs, building long command lines to reduce the number of forks.
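One way to see the batching is to count how many times the command is started. In this sketch the three file names are invented, and sh -c 'echo run' is used only as a stand-in command that prints one line per invocation regardless of how many arguments it receives.

```shell
#!/bin/sh
# Three made-up files in a temporary directory.
tmp=$(mktemp -d)
touch "$tmp/a.html" "$tmp/b.html" "$tmp/c.xml"

# \; starts the command once per file found.
per_file=$(find "$tmp" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# + passes as many files as fit to a single invocation.
batched=$(find "$tmp" -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

echo "invocations with \\;: $per_file, with +: $batched"
rm -rf "$tmp"
```

With only three files, the + form needs a single invocation, while the \; form needs three.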

find works, but find -exec doesn’t

I am trying to find all images in subfolders of a given folder, and move them somewhere else. I have tried the following in zsh (my default) and sh (what most tutorials seem to be using) on a Mac running OS X 10.9.3.
This prints out all the images in the subfolders of $someDir:
find "$someDir" -iname \*.jpg -o -name \*.png -o -name \*.gif
However, when I want to pass those images to another command, I can’t get it to work. As an exercise, I tried it with echo:
find "$someDir" -iname \*.jpg -o -name \*.png -o -name \*.gif -exec sh -c "echo hello {}" \;
It just returns silently, and the value of $? is 0.
I eventually want to do something along these lines:
find "$someDir" -iname \*.jpg -o -name \*.png -o -name \*.gif -exec sh -c "mv {} $destination" \;
But I can’t even get the echo example to work. What am I doing wrong?
You need to put parentheses around all the name tests:
find "$someDir" \( -iname \*.jpg -o -name \*.png -o -name \*.gif \) -exec sh -c "echo hello {}" \;
Otherwise, the -exec is only done for files that match *.gif.
When you leave out the action, there's a default -print in each branch of the -o. But if there's any action in the command, there are no default actions anywhere.
These should work (make sure $destination is defined):
find "$someDir" \( -iname \*.jpg -o -name \*.png -o -name \*.gif \) -exec echo hello {} \;
find "$someDir" \( -iname \*.jpg -o -name \*.png -o -name \*.gif \) -exec mv {} "$destination" \;
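If a shell really is needed inside -exec, as in the question's sh -c attempts, one common approach is to pass the file name and the destination as positional arguments instead of embedding {} in the script text, so names with spaces survive. The following is a sketch under that assumption; both directories and all file names are temporary stand-ins, not paths from the question.

```shell
#!/bin/sh
# Stand-ins for the question's $someDir and $destination.
someDir=$(mktemp -d)
destination=$(mktemp -d)
mkdir "$someDir/sub"
touch "$someDir/sub/photo one.JPG" "$someDir/b.png" "$someDir/sub/c.gif" "$someDir/notes.txt"

# Group the name tests, then hand each match to sh as "$1" and the target as "$2".
find "$someDir" \( -iname '*.jpg' -o -name '*.png' -o -name '*.gif' \) \
    -exec sh -c 'mv -- "$1" "$2"' sh {} "$destination" \;

moved=$(ls "$destination" | LC_ALL=C sort)
left=$(find "$someDir" -type f | sed "s|^$someDir/||")

printf 'moved:\n%s\nleft behind:\n%s\n' "$moved" "$left"
rm -rf "$someDir" "$destination"
```

Only -iname is case-insensitive, so photo one.JPG is picked up by the first test, and notes.txt stays where it was.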

Why is this command exploding

I got this as an answer to a previous question about recursively searching a directory and deleting the matching files and directories:
find \( -name ".git" -o -name ".gitignore" -o -name "Documentation" \) -exec rm -rf "{}" \;
There are two problems with this:
One:
find: `./adam.balan/AisisAjax/.git': No such file or directory
because of this error the rest of the script doesn't execute. Now I don't want to have to check for any of the files or folders. I don't care if they exist or not, I want to suppress the error on this.
The second is that I am also getting the error on a directory that needs to be excluded from this search: vendor/
find: `./vendor/adam.balan/AisisAjax/.git': No such file or directory
I do not want it searching vendor. I want it to leave vendor alone.
How do I solve these two problems? Suppression and ignoring.
The problem is that you're deleting a directory that find then tries to descend into. You can use -prune to prevent that:
find \( -name ".git" -o -name ".gitignore" -o -name "Documentation" \) -prune -exec rm -rf "{}" \;
To ignore all errors, you can use 2> /dev/null to squash the error messages, and || true to avoid set -e making your script exit:
find \( -name ".git" -o -name ".gitignore" -o -name "Documentation" \) -prune -exec rm -rf "{}" \; 2> /dev/null || true
Finally, to avoid descending any directory named 'vendor', you can use -prune again:
find -name vendor -prune -o \( -name ".git" -o -name ".gitignore" -o -name "Documentation" \) -prune -exec rm -rf "{}" \; 2> /dev/null || true
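The combined command can be tried safely on a disposable tree first. In this sketch the layout (proj, vendor/lib and so on) is invented, the start directory is given explicitly as . for portability, and stderr is captured rather than discarded so that any leftover "No such file or directory" message would be visible.

```shell
#!/bin/sh
# Hypothetical project layout in a temporary directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/proj/.git/objects" "$tmp/proj/Documentation" "$tmp/proj/src" \
         "$tmp/vendor/lib/.git"
touch "$tmp/proj/.gitignore" "$tmp/proj/src/main.c"

# Skip vendor entirely; elsewhere, prune each match before removing it so
# find never tries to descend into a directory that rm has just deleted.
errors=$( (cd "$tmp" && find . -name vendor -prune -o \
    \( -name .git -o -name .gitignore -o -name Documentation \) \
    -prune -exec rm -rf {} \;) 2>&1 )

remaining=$( (cd "$tmp" && find . | LC_ALL=C sort) )

printf 'errors: [%s]\nremaining:\n%s\n' "$errors" "$remaining"
rm -rf "$tmp"
```

The .git directory under vendor survives, while the matches under proj are removed without any error output.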

find command: list every directory and subdir, except .git or .hg dir and subdirs

I tried to use this command in my work dir
find . -type d \( ! -name .git -prune \) -o \( ! -name .hg -prune \)
Does not seem to work?
Removing -prune only excludes the .git or .hg directories themselves, not their subdirectories.
find . \( -name .git -o -name .hg \) -prune -o -type d -print
My answer is essentially the same as @Barmar's, which I have upvoted:
find . \( -name .git -o -name .hg \) -prune -o \( -type d -print \)
A little explanation about this command might be helpful to you:
Here -o means OR. When find encounters a file or directory named .git or .hg, -prune stops it from descending any further and evaluates to true, so the other -o branch (which prints directories) is skipped. That's why the .git and .hg directories, and everything beneath them, are left out of the output.
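A quick check of this on a made-up tree (the directory names below are assumptions of the sketch, not taken from the question):

```shell
#!/bin/sh
# Hypothetical work dir containing a .git and a nested .hg directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/.git/refs" "$tmp/src/.hg/store" "$tmp/src/lib"

# Prune .git/.hg wherever they occur; print every other directory.
dirs=$(cd "$tmp" && find . \( -name .git -o -name .hg \) -prune -o -type d -print | LC_ALL=C sort)

printf '%s\n' "$dirs"
rm -rf "$tmp"
```

Neither .git, src/.hg nor any of their subdirectories appear in the output, while src and src/lib do.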
You may also refer to this question on SO: How to use '-prune' option of 'find' in sh?

Resources