-prune in find working without OR(-o) option - Unix - shell

$ pwd
/tmp
$ touch 1.tst 2.tst
$ mkdir inner_dir
$ touch inner_dir/3.tst
$ find . ! -name . -prune -name '*.tst'
./1.tst
./2.tst
I want to restrict find to the current directory and list only the files with a .tst extension. (I know ls can do this, but I want to add other find filters, such as -mtime, later on.)
My question is: how does the find command above work?
Why doesn't the following version, with the OR operator, work?
find . ! -name . -prune -o -name '*.tst'
Thanks.

-prune
Always evaluates to the value True. Stops the descent of the current path name if it is a directory. If the -depth flag is specified, the -prune flag is ignored.
I think if you play with it, you can figure out what it is doing.
e.g.
find . ! -name . -prune
gives
./1.tst
./2.tst
./inner_dir
We don't go down into ./inner_dir because of the prune ("Stops the descent ..."). What is left is then filtered by -name '*.tst', which leaves just the .tst files in the top-level directory.
HTH
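As for why the -o version behaves differently, here is a small sketch you can run in a scratch directory (the temp directory, other.txt, and the variable names are only for illustration). The implicit AND binds tighter than -o, and since no action is given, find applies -print to the whole expression. So the command is read as ( ! -name . -prune ) -o ( -name '*.tst' ), and the left side is already true for every entry other than "." itself.

```shell
#!/bin/sh
# Scratch setup mirroring the question; the temp dir and other.txt are illustrative.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch 1.tst 2.tst other.txt
mkdir inner_dir
touch inner_dir/3.tst

# AND form: every entry except "." is pruned AND must match *.tst to be printed.
and_out=$(find . ! -name . -prune -name '*.tst' | LC_ALL=C sort)

# OR form: the left side is true for every entry except ".", so the
# -name test on the right is only ever consulted for "." and filters nothing.
or_out=$(find . ! -name . -prune -o -name '*.tst' | LC_ALL=C sort)

printf 'AND form:\n%s\n' "$and_out"
printf 'OR form:\n%s\n' "$or_out"
```

In this tree the AND form lists only ./1.tst and ./2.tst, while the OR form also lists ./inner_dir and ./other.txt. The directory is still not descended into, but the name filter no longer restricts anything.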

Related

Getting the contents of a directory excluding everything inside .git in bash

I need to get the number of the contents of a directory that is a git repository.
I have to get the number of:
1) Other directories inside the directory I am currently iterating (and the other sub-directories inside them if they exist)
2) .txt files inside the directory and its sub-directories
3) All the non-txt files inside the directory and its sub-directories
In all the above cases I must ignore the .git directory, along with all the files and directories that are inside of it.
Also, I must use a bash script exclusively. I can't use another programming language.
Right now I am using the following commands to achieve this:
To get all the .txt files I use: find . -type f \( -name "*.txt" \). There are no .txt files inside .git, so this works.
To get all the non-txt files I use: find . -type f \( ! -name "*.txt" \). The problem is that I also get all the files from .git, and I don't know how to ignore them.
To get all the directories and sub-directories I use: find . -type d. I don't know how to ignore the .git directory and its sub-directories.
The easy way is to just add these extra tests:
find . ! -path './.git/*' ! -path ./.git -type f -name '*.txt'
The problem with this is that ./.git is still traversed unnecessarily, which takes time.
Instead, -prune can be used. -prune is not a test (like -path or -type). It's an action: "don't descend into the current path, if it's a directory". It has to sit on its own branch of -o, separate from the -print action.
# .txt files
find . -path './.git' -prune -o -type f -name '*.txt' -print
# non-.txt files
find . -path './.git' -prune -o -type f ! -name '*.txt' -print
# directories
find . -path './.git' -prune -o -type d -print
If -print isn't given explicitly, the default -print applies to the whole expression, so ./.git itself is printed as well.
I used -path ./.git, because you said "the .git directory". If for some reason there are other .git directories in the tree, they will be traversed and printed. To ignore all directories in the tree named .git, replace -path ./.git with -name .git.
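To see the effect of the explicit -print, here is a sketch on a throwaway tree (all directory and file names below are made up for illustration). It also shows one way to get a count, assuming no file names contain newlines.

```shell
#!/bin/sh
# Throwaway tree; all names here are illustrative only.
repo=$(mktemp -d)
cd "$repo" || exit 1
mkdir -p .git/hooks src
touch .git/config .git/hooks/pre-commit notes.txt src/a.txt src/main.c

# Explicit -print: only the right-hand branch of -o prints anything.
with_print=$(find . -path ./.git -prune -o -type f ! -name '*.txt' -print | LC_ALL=C sort)

# No action given: -print applies to the whole expression,
# so the pruned ./.git directory itself shows up as well.
without_print=$(find . -path ./.git -prune -o -type f ! -name '*.txt' | LC_ALL=C sort)

# Counting directories outside .git (includes "." itself).
dir_count=$(find . -path ./.git -prune -o -type d -print | wc -l)

printf 'with -print:\n%s\n' "$with_print"
printf 'without -print:\n%s\n' "$without_print"
printf 'directories: %s\n' "$dir_count"
```

Here the first command lists only ./src/main.c, the second additionally lists ./.git, and the directory count covers "." and ./src.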
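Sometimes writing a bash loop is clearer than a one-liner. Reading NUL-delimited names keeps paths with spaces intact, and the pattern also skips everything below ./.git:
while IFS= read -r -d '' f; do
if [[ $f == ./.git || $f == ./.git/* ]]; then
echo "skipping $f"
else
echo "do something with $f"
fi
done < <(find . -print0)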

In Bash, how do you delete all files with same name, except the one located in a specific folder?

I have a specific file which is found in several directories. Usually I delete all of them by using the syntax:
find . -name "<Filename>" -delete
However, I want to retain one file from a specific folder, say FOLDER1.
How do I do this using find? (I want to use find because I put -print in place of -delete first to check which files I would be deleting. I am wary of using rm, since there is a danger of deleting files I want to keep.)
Thanks in advance.
You can do it with
find . -name "filename" -and -not -path "./path/to/filename" -delete
You will want either to make sure that the path expression is a relative one, including the initial ./, so that it's matched by the expression, or else use wildcards. So if you know that it's in a folder named myfolder, but you don't know the full path to it, you can use
find . -name "filename" -and -not -path "*/myfolder/filename" -delete
If you don't want to delete anything under any directory named FOLDER1, you can tell find not to recurse into any directory so named, using -prune. Note that -delete implies -depth, and -prune has no effect under -depth, so pair -prune with -exec rm rather than with -delete:
find . -name FOLDER1 -prune -o -name filename -exec rm -- {} +
This is more efficient than recursing into that directory and then filtering out results that include it later.
Side note: When testing this, be sure you use the explicit -print:
find . -name FOLDER1 -prune -o -name filename -print
...whereas an implicit one won't behave as you expect:
# not what you want
find . -name FOLDER1 -prune -o -name filename
...because it behaves as:
find . '(' -name FOLDER1 -prune -o -name filename ')' -print
...so the action applies to matches from either side of the -o operator, and FOLDER1 itself is printed too.
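One caveat worth checking before combining -prune with -delete: the find manual states that -delete implies -depth, and that -prune has no effect under -depth. Below is a non-destructive sketch of that interaction on a throwaway tree (the a/ and a/b/ directories are made up; FOLDER1 and filename follow the question). It uses -depth explicitly to stand in for what -delete would switch on, then removes files with -exec rm instead.

```shell
#!/bin/sh
# Throwaway tree; FOLDER1 and "filename" follow the question, the rest is illustrative.
top=$(mktemp -d)
cd "$top" || exit 1
mkdir -p FOLDER1 a/b
touch FOLDER1/filename a/filename a/b/filename

# Under -depth (which -delete turns on), -prune cannot stop the descent,
# so the copy under FOLDER1 is still matched by the right-hand branch.
depth_out=$(find . -depth -name FOLDER1 -prune -o -name filename -print | LC_ALL=C sort)

# Without -depth the prune works, so removing via -exec keeps FOLDER1/filename.
find . -name FOLDER1 -prune -o -name filename -exec rm -- {} +
left=$(find . -name filename | LC_ALL=C sort)

printf 'matched under -depth:\n%s\n' "$depth_out"
printf 'left after -exec rm:\n%s\n' "$left"
```

As before, swapping the -exec part for -print first is a cheap way to preview what would be removed.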

Why is find finding .git directories?

I have the following find command and I'm surprised to see .git directories being found. Why?
$ find . ! -name '*git*' | grep git
./.git/hooks
./.git/hooks/commit-msg
./.git/hooks/applypatch-msg.sample
./.git/hooks/prepare-commit-msg.sample
./.git/hooks/pre-applypatch.sample
./.git/hooks/commit-msg.sample
./.git/hooks/post-update.sample
Because -name is tested against each file's own name only, and none of the files found have git in their own name (see the man page). To skip the offending directory altogether, use the -prune action:
find . -path ./.git -prune -o -not -name '*git*' -print |grep git
See Exclude directory from find . command
[edit] An alternative without -prune (and much more natural imho):
find . -not -path "*git*" -not -name '*git*' |grep git
You're just seeing expected behaviour of find. The -name test is only applied to the filename itself, not the whole path. If you want to search everything but the .git directory, you can use bash(1)'s extglob option:
$ shopt -s extglob
$ find !(.git)
It doesn't really find those git-files. Instead it finds files under ./.git/ that match the pattern ! -name '*git*' which includes all files that don't include git in their filename (not path name).
find's -name test is about the file's own name, not its path.
Try -iwholename instead of -name:
find . ! -iwholename '*git*'
This is what I needed:
find . ! -path '*git*'
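The difference between the approaches in this thread can be sketched on a throwaway tree (all names below are made up for illustration). Note that -path '*git*' is a blunt filter: it would also drop any path that merely contains the string git, such as a directory named legit.

```shell
#!/bin/sh
# Throwaway tree; all names here are illustrative only.
top=$(mktemp -d)
cd "$top" || exit 1
mkdir -p .git/hooks src
touch .git/HEAD .git/hooks/commit-msg src/main.c gitnotes.txt

# -name looks at the last path component only, so files below ./.git
# whose own names lack "git" still pass the negated test.
by_name=$(find . ! -name '*git*' | LC_ALL=C sort)

# -path matches against the whole path, so everything below ./.git is dropped,
# although the directory is still traversed.
by_path=$(find . ! -path '*git*' | LC_ALL=C sort)

# -prune skips the traversal of .git entirely.
pruned=$(find . -name .git -prune -o ! -name '*git*' -print | LC_ALL=C sort)

printf 'by name:\n%s\n' "$by_name"
printf 'by path:\n%s\n' "$by_path"
printf 'pruned:\n%s\n' "$pruned"
```

In this tree the -path and -prune variants print the same list; the -prune variant simply never reads the contents of .git.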

Bash: find with -depth and -prune to feed cpio

I'm building a backup script where some directories should not be included in the backup archive.
cd /;
find . -maxdepth 2 \
\( -path './sys' -o -path './dev' -o -path './proc' -o -path './media' -o -path './mnt' \) -prune \
-o -print
This finds only the files and directories I want.
The problem is that, when feeding cpio, find should be run with the following option in order to avoid problems with permissions when restoring files.
find ... -depth ....
But if I add the -depth option, the files and directories returned include the ones I want to avoid.
I really don't understand these sentences from the find manual:
-prune True; if the file is a directory, do not descend into it. If
-depth is given, false; no effect. Because -delete implies
-depth, you cannot usefully use -prune and -delete together.
I am quoting a passage from this tutorial which might offer better understanding of -prune option of find.
It is important to understand how to prevent find from going too far.
The important option in this case is -prune. This option confuses people because it is always true. It has a side-effect that is important. If the file being looked at is a directory, it will not travel down the directory. Here is an example that lists all files in a directory but does not look at any files in subdirectories under the top level:
find * -type f -print -o -type d -prune
This will print all plain files and prune the search at all directories. To print files except for those in Source Code Control (SCCS) directories, the prune test has to come first, because -print is always true and nothing to the right of -o would be evaluated after it:
find . -name SCCS -prune -o -print
If the -o is left out, only the SCCS directories themselves are printed.
Source
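For the backup question itself, one possible workaround, sketched under the assumption that walking into the excluded directories is acceptable: since -prune has no effect under -depth, the exclusion can be done with ! -path tests instead. The tree below is a made-up stand-in for /, only two of the excluded directories are shown, and the cpio step is left out.

```shell
#!/bin/sh
# Made-up stand-in for "/"; names are illustrative only.
root=$(mktemp -d)
cd "$root" || exit 1
mkdir -p etc proc/1 sys/kernel
touch etc/fstab proc/1/status sys/kernel/notes

# -depth lists a directory's contents before the directory itself.
# Exclusion is done with path tests rather than -prune, so the excluded
# directories are still visited, just never printed.
list=$(find . -depth -maxdepth 2 \
    ! -path './proc' ! -path './proc/*' \
    ! -path './sys'  ! -path './sys/*' \
    -print)

printf '%s\n' "$list"
```

The trade-off is that find still reads the excluded directories, which matters more on a deep tree than with -maxdepth 2.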

find prune not working

I am trying to find all Ruby files in the project. However, I want to ignore all the files residing under the vendor directory.
find . -name .vendor -prune -o -name '*.rb' -print
The command above is not working. Does anyone know the fix?
Try:
find . -name '*.rb' ! -wholename "./vendor/*" -print
Depending on your shell, you may have to escape the ! character (i.e. write \!).
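A likely reason the original command fails, assuming the directory really is called vendor as the question's text says: -name .vendor only matches an entry literally named .vendor, so nothing gets pruned. A sketch on a throwaway tree (file names are made up for illustration):

```shell
#!/bin/sh
# Throwaway project; file names are illustrative only.
proj=$(mktemp -d)
cd "$proj" || exit 1
mkdir -p vendor/gems lib
touch vendor/gems/dep.rb lib/app.rb main.rb

# As posted: no entry is named ".vendor", so the prune never fires.
wrong=$(find . -name .vendor -prune -o -name '*.rb' -print | LC_ALL=C sort)

# With the directory's actual name, the vendor subtree is skipped.
fixed=$(find . -name vendor -prune -o -name '*.rb' -print | LC_ALL=C sort)

printf 'as posted:\n%s\n' "$wrong"
printf 'with -name vendor:\n%s\n' "$fixed"
```

-name vendor prunes an entry with that name at any depth; to skip only the top-level one, -path ./vendor can be used in its place.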
