Bash: find with -depth and -prune to feed cpio - bash

I'm building a backup script where some directories should not be included in the backup archive.
cd /;
find . -maxdepth 2 \
\( -path './sys' -o -path './dev' -o -path './proc' -o -path './media' -o -path './mnt' \) -prune \
-o -print
This finds only the files and directories I want.
The problem is that, when feeding cpio, find should be run with the -depth option in order to avoid problems with permissions when restoring files:
find ... -depth ....
But if I add the -depth option, the returned files and directories include those I want to avoid.
I really don't understand these sentences from the find manual:
-prune True; if the file is a directory, do not descend into it. If
-depth is given, false; no effect. Because -delete implies
-depth, you cannot usefully use -prune and -delete together.

I am quoting a passage from this tutorial, which might offer a better understanding of the -prune option of find.
It is important to understand how to prevent find from going too far.
The important option in this case is -prune. This option confuses people because it is always true. It has a side-effect that is important. If the file being looked at is a directory, it will not travel down the directory. Here is an example that lists all files in a directory but does not look at any files in subdirectories under the top level:
find * -type f -print -o -type d -prune
This will print all plain files and prune the search at all directories. To print files except for those in Source Code Control directories, use:
find . -name SCCS -prune -o -print
The order matters: -print is always true, so if it comes first, the -o branch is never evaluated and nothing is pruned. If -o is omitted, only the SCCS directories themselves are printed.
Source
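The quoted passage explains -prune but not what to do once -depth is required. One possible workaround, sketched below, is to drop -prune and exclude by path instead. This assumes a find that supports -path (GNU and BSD find do) and uses a scratch tree with made-up names rather than /. find still walks the excluded directories, so it is slower than pruning, but the output keeps the depth-first order.

```shell
#!/bin/sh
# Scratch tree standing in for "/" (hypothetical names, for illustration only).
root=$(mktemp -d)
mkdir -p "$root/etc" "$root/proc/1" "$root/sys/kernel"
touch "$root/etc/hosts" "$root/proc/1/status" "$root/sys/kernel/notes"

cd "$root" || exit 1

# -depth lists a directory's contents before the directory itself.
# Each excluded directory needs two patterns: one for the directory
# and one for everything below it.
listing=$(find . -depth \
    ! \( -path './sys' -o -path './sys/*' \
      -o -path './proc' -o -path './proc/*' \) \
    -print)

printf '%s\n' "$listing"
# In a real backup the same find command would be piped into cpio
# instead of being captured in a variable.
```

The two-pattern form ('./sys' plus './sys/*') avoids accidentally excluding a sibling such as ./sysroot, which a single './sys*' pattern would also match.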

Related

Exclude directories with find command and executing a script on other directories

I have a directory structure with 100 or so directories. I need to go through each of them and run a script on it individually, while skipping a handful of other directories.
This is what I have been using in the past:
find ./OTHER/ -maxdepth 2 -wholename '*_*/*.txt' -execdir /files/bin/other_process {} +
I would like to exclude certain directories from this check and have not found a sufficient answer to this problem.
This has been my best attempt (or two) at the problem:
find ./OTHER/ \( -path ./OTHER/X???_EXCLUDE_THIS -prune -o -path ./OTHER/X???_IGNORE_THIS -prune -o \) -type d \(-name *_*/*.txt \) -execdir /files/bin/other_process {} +
I get:
find: paths must precede expression ./OTHER/A101_EXCLUDE_THIS/
This is the return that I get on nearly every variation that I have used.
Errors in this attempt:
\(-name: There must be a space after \(.
-name *_*/*.txt: -name matches only the base name of a file, which never contains a slash; use -path here.
*_*/*.txt: You should quote such patterns to prevent expansion by the shell.
-o \): -o does not belong at the end of an expression; you mean \) -o. But you don't need parentheses here.
-type d: Since you want to find regular files *.txt, you must not look for a directory.
With those errors corrected, it works:
find ./OTHER/ -path './OTHER/X???_EXCLUDE_THIS' -prune -o -path './OTHER/X???_IGNORE_THIS' -prune -o -path '*_*/*.txt' -execdir echo {} +
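To check that kind of command without running anything, it can be tried on a scratch tree first. The sketch below is only an illustration: the directory names are hypothetical, the pattern '????' (any four characters) is an assumption replacing 'X???', and -print stands in for -execdir.

```shell
#!/bin/sh
# Scratch tree; the directory names are hypothetical stand-ins.
top=$(mktemp -d)
cd "$top" || exit 1
mkdir -p OTHER/A101_EXCLUDE_THIS OTHER/B202_IGNORE_THIS OTHER/C303_KEEP
touch OTHER/A101_EXCLUDE_THIS/a.txt OTHER/B202_IGNORE_THIS/b.txt OTHER/C303_KEEP/c.txt

# Quoted patterns reach find untouched by the shell.
# -print replaces -execdir so nothing is executed during the check.
matches=$(find ./OTHER \
    -path './OTHER/????_EXCLUDE_THIS' -prune \
    -o -path './OTHER/????_IGNORE_THIS' -prune \
    -o -path '*_*/*.txt' -print)

printf '%s\n' "$matches"
```

Only the file under C303_KEEP should be listed; the two pruned directories are never entered.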

In Bash, how do you delete all files with same name, except the one located in a specific folder?

I have a specific file which is found in several directories. Usually I delete all of them by using the syntax:
find . -name "<Filename>" -delete
However, I want to retain one file from a specific folder, say FOLDER1.
How do I do this using find? (I want to use find because I use -print before -delete to check what files I am deleting. I am apprehensive about using rm since there is a danger of deleting files I want to keep.)
Thanks in advance.
You can do it with
find . -name "filename" -and -not -path "./path/to/filename" -delete
You will want either to make sure that the path expression is a relative one, including the initial ./, so that it's matched by the expression, or else use wildcards. So if you know that it's in a folder named myfolder, but you don't know the full path to it, you can use
find . -name "filename" -and -not -path "*/myfolder/filename" -delete
If you don't want to delete anything under any directory named FOLDER1, you can tell find not to recurse down any directory so named at all, using -prune. Because -delete implies -depth, and -prune has no effect under -depth, run rm through -exec rather than using -delete:
find . -name FOLDER1 -prune -o -name filename -exec rm -- {} +
This is more efficient than recursing down that directory and then filtering out results that include it later.
Side note: When testing this, be sure you use the explicit -print:
find . -name FOLDER1 -prune -o -name filename -print
...whereas an implicit one won't behave as you expect:
# not what you want: equivalent to the below, not the above:
find . -name FOLDER1 -prune -o -name filename
...will behave as:
find . '(' -name FOLDER1 -prune -o -name filename ')' -print
...which applies the action to matches on either side of the -o operator, so the pruned FOLDER1 directories are printed too.
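The difference between the explicit and the implicit -print can be seen on a small scratch tree. This is a sketch with made-up directory names; the sort is only there to make the listing order predictable.

```shell
#!/bin/sh
# Scratch tree with the same file name inside and outside FOLDER1.
top=$(mktemp -d)
cd "$top" || exit 1
mkdir -p FOLDER1 other
touch FOLDER1/filename other/filename

# Explicit -print: only matches of the right-hand side of -o are printed.
explicit=$(find . -name FOLDER1 -prune -o -name filename -print)

# No action at all: find wraps the whole expression in ( ... ) -print,
# so the pruned directory itself is listed as well.
implicit=$(find . -name FOLDER1 -prune -o -name filename | LC_ALL=C sort)

printf 'explicit:\n%s\n' "$explicit"
printf 'implicit:\n%s\n' "$implicit"
```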

-prune in find working without OR(-o) option - Unix

$ pwd
/tmp
$ touch 1.tst 2.tst
$ mkdir inner_dir
$ touch inner_dir/3.tst
$ find . ! -name . -prune -name '*.tst'
./1.tst
./2.tst
I want to restrict 'find' to search only to the current directory for the files with 'tst' extension (I know this can be done with 'ls' command, but want to add other 'find' filters like mtime later on).
My question is: how does the above find command work?
Why doesn't the following work (with the OR operator)?
find . ! -name . -prune -o -name '*.tst'
Thanks.
-prune
Always evaluates to the value True. Stops the descent of the current path name if it is a directory. If the -depth flag is specified, the -prune flag is ignored.
I think if you play with it, you can figure out what it is doing.
e.g.
find . ! -name . -prune
gives
./1.tst
./2.tst
./inner_dir
We don't go down into ./inner_dir because of the prune -- "Stops the descent ...". What is left is then filtered by -name '*.tst', leaving just the files in the top directory.
HTH
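As for why the -o variant behaves differently, a small experiment helps. The sketch below recreates the setup in a scratch directory (notes.log is an extra file added here purely for illustration) and captures the output of both forms.

```shell
#!/bin/sh
top=$(mktemp -d)
cd "$top" || exit 1
touch 1.tst 2.tst notes.log
mkdir inner_dir
touch inner_dir/3.tst

# AND form: every entry except "." is pruned and must also match *.tst.
and_form=$(find . ! -name . -prune -name '*.tst' | LC_ALL=C sort)

# OR form parses as ( ! -name . -prune ) -o ( -name '*.tst' ) with an
# implicit -print around the whole expression: every top-level entry makes
# the left side true and is printed, whether or not it ends in .tst.
or_form=$(find . ! -name . -prune -o -name '*.tst' | LC_ALL=C sort)

printf 'AND form:\n%s\n' "$and_form"
printf 'OR form:\n%s\n' "$or_form"
```

With -o, the *.tst test is only reached for ".", where the left side is false, so it never filters anything.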

Exclude a sub-directory using find

I have directory structure like this
data
|___abc
|   |___incoming
|___def
|   |___incoming
|   |___processed
|___123
|   |___incoming
|___456
    |___incoming
    |___processed
There is an incoming sub-folder in each of the folders inside the data directory. I want to get all files from all the folders and sub-folders except the def/incoming and 456/incoming directories.
I tried the following command:
find /home/feeds/data -type d \( -name 'def/incoming' -o -name '456/incoming' -o -name arkona \) -prune -o -name '*.*' -print
but it is not working as expected.
Ravi
This works:
find /home/feeds/data -type f -not -path "*def/incoming*" -not -path "*456/incoming*"
Explanation:
find /home/feeds/data: start finding recursively from specified path
-type f: find files only
-not -path "*def/incoming*": don't include anything with def/incoming as part of its path
-not -path "*456/incoming*": don't include anything with 456/incoming as part of its path
Just for the record: you might have to dig deeper (as I had to), since there are many combinations of searching and skipping. It might turn out that -prune is your friend while -not -path won't do what you expect.
This page gives 15 find examples that exclude directories:
http://www.theunixschool.com/2012/07/find-command-15-examples-to-exclude.html
To link to the initial question, excluding finally worked for me like this:
find . -regextype posix-extended -regex ".*def/incoming.*|.*456/incoming.*" -prune -o -print
Then, if you wish to find one file and still exclude paths, just add | grep myFile.txt.
It may depend also on your find version. I see:
$ find -version
GNU find version 4.2.27
Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION SELINUX
-name only matches the filename, not the whole path. You want to use -path instead, for the parts in which you are pruning the directories like def/incoming.
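Applied to the command from the question, that advice might look like the sketch below. It rebuilds the layout in a scratch directory (file.dat is a made-up file name) and prunes the two unwanted directories with -path patterns.

```shell
#!/bin/sh
# Scratch copy of the layout from the question; file.dat is a placeholder name.
base=$(mktemp -d)
cd "$base" || exit 1
for d in abc/incoming def/incoming def/processed \
         123/incoming 456/incoming 456/processed; do
    mkdir -p "data/$d"
    touch "data/$d/file.dat"
done

# -path matches the whole path, so the pattern can contain slashes.
# The pruned directories are never entered; everything else is printed.
kept=$(find data \( -path '*/def/incoming' -o -path '*/456/incoming' \) -prune \
        -o -type f -print | LC_ALL=C sort)

printf '%s\n' "$kept"
```

Unlike the -not -path variant, this does not descend into def/incoming or 456/incoming at all.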
find "${INP_PATH}" -type f -ls | grep -v "${INP_PATH}/.*/"
Following the answer to "How to exclude a directory in find . command":
find . \( -name ".git" -o -name "node_modules" \) -prune -o -print
This is what I did to exclude all the .git directories and pass the remaining files to -exec to grep for something:
find . -not -path '*/\.*' -type f -exec grep "pattern" {} \;
-not -path '*/\.*' excludes every hidden file and everything under a hidden directory
-type f lists only regular files, which you can then pass to -exec or whatever else you want to do

In Ksh Shell use FIND to search through symbolic links but ignore certain directories

I am trying to search for *.csv files in the file system. There are symbolic links in some of the directories I am looking through, but I want to ignore certain directories, since they result in long, time-consuming cycles.
find -L "location" -name "*.csv" >> find_result.txt
How can I tell find to ignore certain directories while still following symbolic links in the others?
Use -prune to tell find not to descend into a given directory. For instance:
find -L location -name 'dontLookHere' -prune \
-o -name 'orThereEither' -prune \
-o -name '*.csv' -print
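find dir -wholename 'dir/dirtoskip' -prune -o [rest of your find command]
or
find dir \( -wholename 'dir/dirtoskip' -o -wholename 'dir/another' \) -prune -o [rest of your find command]
Note that -wholename (a synonym for -path) matches against the whole path, including the starting directory.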

Resources