The command is:
find $HOME -path $HOME/$dir_name -prune -o -name "*$file_suffix" -exec cp {} $HOME/$dir_name/ \;
The variables dir_name and file_suffix are assigned a path to a directory and an arbitrary word earlier in the script.
I do not understand what the purpose of -path $HOME/$dir_name is, or how it affects how the rest of the command is interpreted.
-path $HOME/$dir_name -prune
excludes $HOME/$dir_name from the search, which makes sense because otherwise
-name "*$file_suffix" -exec cp {} $HOME/$dir_name/ \;
would select files copied to $HOME/$dir_name before and attempt to copy them again.
The -path predicate allows you to specify a condition on what patterns to match. It's vaguely similar to the -name predicate, but applies to the full path, not just the file's name (basename).
Your specific command applies -prune to a specific subpath, so it will avoid scanning that particular subdirectory. If this predicate fails, it will proceed with the predicates after -o (as in "or").
You should still quote your variables.
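To make the quoting advice concrete, here is a small sandboxed sketch. The temporary directory stands in for $HOME, and the directory and file names are made up for illustration; the find command itself is the one from the question with every expansion quoted.

```shell
# Sandbox: a throwaway directory stands in for $HOME (hypothetical layout).
home=$(mktemp -d)
dir_name="collected files"      # the space shows why quoting matters
file_suffix=.log
mkdir -p "$home/$dir_name" "$home/projects"
touch "$home/projects/a.log" "$home/projects/b.txt" "$home/$dir_name/old.log"

# Same command as in the question, with every expansion quoted.
# The destination directory is pruned, so old.log is never handed to cp.
find "$home" -path "$home/$dir_name" -prune -o \
    -name "*$file_suffix" -exec cp {} "$home/$dir_name/" \;

copied=$(ls "$home/$dir_name" | LC_ALL=C sort | paste -sd ' ' -)
echo "$copied"
rm -rf "$home"
```

Without the quotes, the space in dir_name would split the -path argument and the cp destination into separate words.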
I have this find command:
find . -type f -not -path '**/.git/**' -not -path '**/node_modules/**' | xargs sed -i '' s/typescript-library-skeleton/xxx/g;
for some reason it's giving me these warnings/errors:
find: ./.git/objects/3c: No such file or directory
find: ./.git/objects/3f: No such file or directory
find: ./.git/objects/41: No such file or directory
I even tried using:
-not -path '**/.git/objects/**'
and got the same thing. Anybody know why the find is searching in the .git directory? Seems weird.
why is the find searching in the .git directory?
GNU find is clever and supports several optimizations over a naive implementation:
It can flip the order of -size +512b -name '*.txt' and check the name first, because querying the size will require a second syscall.
It can count the hard links of a directory to determine the number of subdirectories, and once it has seen them all it no longer needs to check the remaining entries for -type d or for recursing.
It can even rewrite (-B -or -C) -and -A so that if the checks are equally costly and free of side effects, the -A will be evaluated first, hoping to reject the file after 1 test instead of 2.
However, it is not yet clever enough to realize that -not -path '*/.git/*' means that if you find a directory .git then you don't even need to recurse into it because all files inside will fail to match.
Instead, it dutifully recurses, finds each file and matches it against the pattern as if it was a black box.
To explicitly tell it to skip a directory entirely, you can instead use -prune. See How to exclude a directory in find . command
Both more efficient and more correct would be to avoid the default -print action, change -not -path ... to -prune, and ensure that xargs is only used with NUL-delimited input:
find . -name .git -prune -o \
-name node_modules -prune -o \
-type f -print0 | xargs -0 sed -i '' s/typescript-library-skeleton/xxx/g
Note the following points:
We use -prune to tell find to not even recurse down the undesired directories, rather than -not -path ... to tell it to discard names in those directories after they were found.
We put the -prunes before the -type f, so we're able to match directories for pruning.
We have an explicit action, not depending on the default -print. This is important because the default -print effectively comes with a set of parentheses: if no explicit action is given, find ... behaves like find '(' ... ')' -print, not like find ... -print.
We use xargs only with the -0 argument enabling NUL-delimited input, and the -print0 action on the find side to generate a NUL-delimited list of names. NUL is the only character which cannot be present in an arbitrary file path (yes, newlines can be present) -- and thus the only character which is safe to use to separate paths. (If the -0 extension to xargs and the -print0 extension to find are not guaranteed to be available, use -exec sed -i '' ... {} + instead).
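As a rough illustration of the points above, the sketch below builds a throwaway tree (all file names are made up) and runs the pruned search on it. It uses grep -l in place of sed -i so that nothing is modified, and the -exec ... {} + form mentioned above rather than xargs, so it does not depend on -print0 being available.

```shell
# Sandbox tree (hypothetical file names).
root=$(mktemp -d)
mkdir -p "$root/.git/objects" "$root/node_modules/pkg" "$root/src"
echo "typescript-library-skeleton" > "$root/.git/objects/3c"
echo "typescript-library-skeleton" > "$root/node_modules/pkg/index.js"
echo "typescript-library-skeleton" > "$root/src/a file.ts"

# Prune both directories, then hand the remaining regular files to grep in batches.
matches=$(cd "$root" && find . -name .git -prune -o \
    -name node_modules -prune -o \
    -type f -exec grep -l typescript-library-skeleton {} +)
echo "$matches"
rm -rf "$root"
```

Only the file under src is reported, and the space in its name survives because find passes each path to grep as a single argument.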
I'm learning bash scripting and needed some simple help.
Here is what I have thus far:
find . -type d -empty -not -path "./.git/*" -exec touch {}/.gitkeep \;
So what this does is starts from a root path, finds all directories inside this root path that are empty and do not have a .git folder, and then when that operation is successful it runs -exec touch {}/.gitkeep to create a file .gitkeep inside that empty directory to ensure proper git commits.
What I want now is to echo out the current file path for the gitkeep file just created.
My first question is:
Should I be piping | as so:
find . -type d -empty -not -path "./.git/*" -exec touch {}/.gitkeep | outputFilenameDisplayFunction \;
Or maybe repeat what -exec does as so:
find . -type d -empty -not -path "./.git/*" -exec touch {}/.gitkeep - exec outputFilenameDisplayFunction \;
Or maybe use >
find . -type d -empty -not -path "./.git/*" -exec touch {}/.gitkeep > outputFilenameDisplayFunction \;
None of these commands has been tested yet. I am really looking for explanations so I can be more knowledgeable in the future.
As mentioned here, find accepts multiple -exec portions to the command.
In your case, the second one can call a script, as in here:
find . -type d -empty -not -path "./.git/*" -exec touch {}/.gitkeep \; -exec myscript {} \;
Note the \;.
The script would be:
#!/bin/sh
echo "$1" >> "afile"
Charles Duffy actually proposes in the comments, for the second -exec:
-exec sh -c 'echo "$1" >>aFile' _ {} \;
This avoids the need for an external file storing your script.
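Here is a sandboxed sketch of the two -exec approach, assuming a find that supports -empty, -not, and {} embedded in an argument (GNU find does). The directory layout and the log file location are hypothetical, and the log path is passed to sh -c as an extra argument instead of being hard-coded.

```shell
root=$(mktemp -d)
log="$root.log"                       # hypothetical log file outside the tree
mkdir -p "$root/.git/refs" "$root/docs" "$root/src/empty"
touch "$root/src/main.c"

# First -exec creates the placeholder, second -exec records the directory.
(cd "$root" && find . -type d -empty -not -path "./.git/*" \
    -exec touch {}/.gitkeep \; \
    -exec sh -c 'printf "%s\n" "$1" >>"$2"' _ {} "$log" \;)

logged=$(LC_ALL=C sort "$log" | paste -sd ' ' -)
kept=$(cd "$root" && find . -name .gitkeep | LC_ALL=C sort | paste -sd ' ' -)
echo "logged: $logged"
echo "kept:   $kept"
rm -rf "$root" "$log"
```

The second -exec only runs when the first one succeeds, since the two are joined by the implicit -a.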
Let's start from your stated requirements:
So what this does is starts from a root path, finds all directories inside this root path that are empty and do not have a .git folder, and then when that operation is successful it runs -exec touch {}/.gitkeep to create a file .gitkeep inside that empty directory to ensure proper git commits.
If a directory is empty, it "can't have a .git folder" in the sense of having a child named .git, by definition: if it had any subdirectory, it wouldn't be empty. So we can either ignore that part of your prose description, or interpret it as referring to what the code actually appears to be intended to do: pruning any directory which is under .git.
Should that be your intent, -path is the wrong tool for that job altogether, as it still searches the .git tree (and then excludes all the things that it found); instead, use -prune to stop find from recursing down that path at all:
while IFS= read -r -d '' dirname; do
touch -- "${dirname}/.gitkeep"
printf '%q\n' "$dirname" # this goes to the logfile, since we open it for the whole loop
done < <(find . -name .git -prune -o -type d -empty -print0) >logFile
Why prefer this approach?
Instead of starting a shell per directory found (as would happen if you used -exec to start a shell script or a shell), it keeps your initial/primary shell running, and iterates through the loop once per item found.
Because it's running code in that shell, you can use shell functions, modify shell variables (for example, (( ++directoriesFound )) to keep a counter), or perform redirections scoped to the loop (i.e. >logFile) to open an output file just once and use it repeatedly within.
On GNU/anything, find has -printf, which makes doing what you want straightforward:
find -name .git -prune \
-o -type d -empty -printf %p/.gitkeep\\n -execdir touch {}/.gitkeep \;
(Note: this fixes an omitted {}/. GNU find's -execdir doesn't change the behavior here, but it is safer than -exec on systems that may find themselves under attack: the exec'd command is run directly in the location find got to, rather than causing the executed command to re-walk the path.)
I have a specific file which is found in several directories. Usually I delete all of them by using the syntax:
find . -name "<Filename>" -delete
However, I want to retain one file from a specific folder, say FOLDER1.
How do I do this using find? (I want to use find because I use -print before -delete to check what files I am deleting. I am apprehensive about using rm since there is a danger of deleting files I want to keep.)
Thanks in advance.
You can do it with
find . -name "filename" -and -not -path "./path/to/filename" -delete
You will want either to make sure that the path expression is a relative one, including the initial ./, so that it's matched by the expression, or else use wildcards. So if you know that it's in a folder named myfolder, but you don't know the full path to it, you can use
find . -name "filename" -and -not -path "*/myfolder/filename" -delete
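A quick way to convince yourself is to try it in a scratch directory first. The sketch below uses a made-up layout and does a -print dry run before the real -delete; it assumes a find with -delete and -not (GNU or BSD).

```shell
root=$(mktemp -d)
mkdir -p "$root/FOLDER1" "$root/a/b" "$root/c"
touch "$root/FOLDER1/filename" "$root/a/b/filename" "$root/c/filename" "$root/c/other"

# Dry run first: -print shows what -delete would remove.
would_delete=$(cd "$root" && find . -name filename -and -not -path '*/FOLDER1/filename' -print | LC_ALL=C sort | paste -sd ' ' -)
echo "would delete: $would_delete"

# Real run, then list the regular files that are left.
(cd "$root" && find . -name filename -and -not -path '*/FOLDER1/filename' -delete)
remaining=$(cd "$root" && find . -type f | LC_ALL=C sort | paste -sd ' ' -)
echo "remaining:    $remaining"
rm -rf "$root"
```

The copy under FOLDER1 and the unrelated file survive; the other two copies are gone.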
If you don't want to delete anything under any directory named FOLDER1, you can tell find not to recurse down any directory so named at all, using -prune:
find . -name FOLDER1 -prune -o -name filename -delete
This is more efficient than recursing down that directory and then filtering out results that include it later.
Side note: When testing this, be sure you use the explicit -print:
find . -name FOLDER1 -prune -o -name filename -print
...whereas an implicit one won't behave as you expect:
# not what you want: equivalent to the below, not the above:
find . -name FOLDER1 -prune -o -name filename
...will behave as:
find . '(' -name FOLDER1 -prune -o -name filename ')' -print
...which thus includes contents on either side of the -o operator for the action.
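The difference is easy to see in a scratch directory. This is only a sketch with a made-up layout: the same expression is run once with an explicit -print and once relying on the implicit one.

```shell
root=$(mktemp -d)
mkdir -p "$root/FOLDER1" "$root/other"
touch "$root/FOLDER1/filename" "$root/other/filename"

# Explicit -print: only the right-hand side of -o produces output.
explicit=$(cd "$root" && find . -name FOLDER1 -prune -o -name filename -print | LC_ALL=C sort | paste -sd ' ' -)
# Implicit -print: wraps the whole expression, so the pruned directory is printed too.
implicit=$(cd "$root" && find . -name FOLDER1 -prune -o -name filename | LC_ALL=C sort | paste -sd ' ' -)

echo "explicit: $explicit"
echo "implicit: $implicit"
rm -rf "$root"
```

With -delete in place of -print the same logic applies, which is why the dry run should use the explicit form.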
I'm using the find command to get a list of folders where certain files are located. But because of a permission denied error for certain subdirectories, I want to exclude a certain subdirectory name.
I already tried these solutions I found here:
find /path/to/folders -path "*/noDuplicates" -prune -type f -name "fileName.txt"
find /path/to/folders ! -path "*/noDuplicates" -type f -name "fileName.txt"
And some variations for these commands (variations on the path name for example).
In the first case it won't find a folder at all, in the second case I get the error again, so I guess it still tries to access this directory. Does anyone know what I'm doing wrong or does anyone have a different solution for this?
To complement olivm's helpful answer and address the OP's puzzlement at the need for -o:
-prune, like every find primary (action or test, in GNU speak), returns a Boolean, and that Boolean is always true in the case of -prune.
Without explicit operators, primaries are implicitly connected with -a (-and), which, like its brethren -o (-or) performs short-circuiting Boolean logic.
-a has higher precedence than -o.
For a summary of all find concepts, see https://stackoverflow.com/a/29592349/45375
Thus, the accepted answer,
find . -path ./ignored_directory -prune -o -name fileName.txt -print
is equivalent to (parentheses are used to make the evaluation precedence explicit):
find . \( -path ./ignored_directory -a -prune \) \
-o \
\( -name fileName.txt -a -print \)
Since short-circuiting applies, this is evaluated as follows:
an input path matching ./ignored_directory causes -prune to be evaluated; since -prune always returns true, short-circuiting prevents the right side of the -o operator from being evaluated; in effect, nothing happens (the input path is ignored)
an input path NOT matching ./ignored_directory, instantly - again due to short-circuiting - continues evaluation on the right side of -o:
only if the filename part of the input path matches fileName.txt is the -print primary evaluated; in effect, only input paths whose filename matches fileName.txt are printed.
Edit: In spite of what I originally claimed here, -print IS needed on the right-hand side of -o here; without it, the implied -print would apply to the entire expression and thus also print for left-hand side matches; see below for background information.
By contrast, let's consider what mistakenly NOT using -o does:
find . -path ./ignored_directory -prune -name fileName.txt -print
This is equivalent to:
find . -path ./ignored_directory -a -prune -a -name fileName.txt -a -print
This will only print pruned paths (that also match the -name filter), because the -name and -print primaries are (implicitly) connected with logical ANDs;
in this specific case, since ./ignored_directory cannot also match fileName.txt, nothing is printed, but if -path's argument is a glob, it is possible to get output.
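The evaluation rules above can be checked in a scratch directory. The sketch below uses a made-up layout and compares the accepted answer, its explicitly grouped equivalent, the variant without -o, and a variant without -o whose -path argument is a glob.

```shell
root=$(mktemp -d)
mkdir -p "$root/ignored_directory" "$root/kept"
touch "$root/ignored_directory/fileName.txt" "$root/kept/fileName.txt"

# Accepted answer, and the same expression with the grouping spelled out.
with_or=$(cd "$root" && find . -path ./ignored_directory -prune -o -name fileName.txt -print)
grouped=$(cd "$root" && find . \( -path ./ignored_directory -a -prune \) -o \( -name fileName.txt -a -print \))

# Missing -o: everything is ANDed together, so nothing can match here.
without_or=$(cd "$root" && find . -path ./ignored_directory -prune -name fileName.txt -print)

# Missing -o with a glob for -path: paths can satisfy both -path and -name.
glob_case=$(cd "$root" && find . -path '*fileName.txt' -prune -name fileName.txt -print | LC_ALL=C sort | paste -sd ' ' -)

echo "with -o:    $with_or"
echo "grouped:    $grouped"
echo "without -o: $without_or"
echo "glob case:  $glob_case"
rm -rf "$root"
```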
A word on find's implicit use of -print:
POSIX mandates that if a find command's expression as a WHOLE does NOT contain either
output-producing primaries, such as -print itself
primaries that execute something, such as -exec and -ok
(the example primaries given are exhaustive for the POSIX spec. of find, but real-world implementations such as GNU find and BSD find add others, such as the output-producing -print0 primary, and the executing -execdir primary)
that -print be applied implicitly, as if the expression had been specified as:
\( expression \) -print
This is convenient, because it allows you to write commands such as find ., without needing to append -print.
However, in certain situations an explicit -print is needed, as is the case here:
Let's say we didn't specify -print at the end of the accepted answer:
find . -path ./ignored_directory -prune -o -name fileName.txt
Since there's now no output-producing or executing primary in the expression, it is evaluated as:
find . \( -path ./ignored_directory -prune -o -name fileName.txt \) -print
This will NOT work as intended, as it will print paths if the entire parenthesized expression evaluates to true, which in this case mistakenly includes the pruned directory.
By contrast, by explicitly appending -print to the -o branch, paths are only printed if the right-hand side of the -o expression evaluates to true; using parentheses to make the logic clearer:
find . -path ./ignored_directory -prune -o \( -name fileName.txt -print \)
If, by contrast, the left-hand side is true, only -prune is executed, which produces no output (and since the overall expression contains a -print, -print is NOT implicitly applied).
Following my previous comment, this works on my Debian system:
find . -path ./ignored_directory -prune -o -name fileName.txt -print
or
find /path/to/folder -path "*/ignored_directory" -prune -o -name fileName.txt -print
or
find /path/to/folder -name fileName.txt -not -path "*/ignored_directory/*"
The differences are nicely debated here
Edit (added behavior specification details)
Pruning all permission denied directories in find
Using GNU find.
Specification behavior details - in this solution we want to:
exclude unreadable directories contents (prune them),
avoid "permission denied" errors coming from unreadable directories,
keep the other errors and return states, but
process all files (even unreadable files, if we can read their names)
The basic design pattern is:
find ... \( -readable -o -prune \) ...
Example
find /var/log/ \( -readable -o -prune \) -name "*.1"
Thanks to mklement0.
The problem is in the way find evaluates the expression you are passing to the -path option.
Instead, you should try something like:
find /path/to/folders ! -path "*noDuplicates*" -type f -name "fileName.txt"
I'm building a backup script where some directories should not be included in the backup archive.
cd /;
find . -maxdepth 2 \
\( -path './sys' -o -path './dev' -o -path './proc' -o -path './media' -o -path './mnt' \) -prune \
-o -print
This finds only the files and directories I want.
Problem is that cpio should be fed with the following option in order to avoid problems with permissions when restoring files.
find ... -depth ....
And if I add the -depth option, returned files and directories include those I want to avoid.
I really don't understand these sentences from the find manual:
-prune True; if the file is a directory, do not descend into it. If
-depth is given, false; no effect. Because -delete implies
-depth, you cannot usefully use -prune and -delete together.
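To see what that passage means in practice, here is a sketch on a made-up tree. With -depth, find visits a directory's contents before the directory itself, so by the time -prune is evaluated for ./sys its contents have already been processed. The last command is an assumption on my part rather than something from the manual: it filters by path instead of pruning, which still walks ./sys but keeps it out of the output. Whether ./sys itself is printed in the -depth case may depend on the find version, so only its contents are checked.

```shell
root=$(mktemp -d)
mkdir -p "$root/sys/kernel" "$root/home/user"

# Without -depth, -prune keeps find out of ./sys.
pruned=$(cd "$root" && find . -path ./sys -prune -o -print | LC_ALL=C sort | paste -sd ' ' -)

# With -depth, ./sys/kernel is visited (and printed) before ./sys is tested.
depth=$(cd "$root" && find . -depth -path ./sys -prune -o -print | LC_ALL=C sort | paste -sd ' ' -)

# Path-based filtering still works with -depth (it does not avoid the traversal).
filtered=$(cd "$root" && find . -depth ! -path ./sys ! -path './sys/*' -print | LC_ALL=C sort | paste -sd ' ' -)

echo "pruned:   $pruned"
echo "depth:    $depth"
echo "filtered: $filtered"
rm -rf "$root"
```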
I am quoting a passage from this tutorial which might offer a better understanding of the -prune option of find.
It is important to understand how to prevent find from going too far.
The important option in this case is -prune. This option confuses people because it is always true. It has a side-effect that is important. If the file being looked at is a directory, it will not travel down the directory. Here is an example that lists all files in a directory but does not look at any files in subdirectories under the top level:
find * -type f -print -o -type d -prune
This will print all plain files and prune the search at all directories. To print files except for those in a Source Code Control Directories, use:
find . -print -o -name SCCS -prune
If the -o option is excluded, the SCCS directory will be printed along with the other files.
Source