Ignore/prune hidden directories with GNU find command - bash

When using the find command, why is it that the following will successfully ignore hidden directories (those starting with a period) while matching everything else:
find . -not \( -type d -name ".?*" -prune \)
but this will not match anything at all:
find . -not \( -type d -name ".*" -prune \)
The only difference is the question mark. Shouldn't the latter command likewise detect and exclude directories beginning with a period?

The latter command prunes everything because it prunes . - try these to see the difference:
$ ls -lad .*
.
..
.dotdir
$ ls -lad .?*
..
.dotdir
You see that in the second one, . isn't included because it is only one character long. The glob ".?*" includes only filenames that are at least two characters long (dot, plus any single character, non-optionally, plus any sequence of zero or more characters).
By the way, find is not a Bash command.

The latter command prunes . itself -- the directory you're running find against -- which is why it generates no results.
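The difference can be reproduced in a scratch directory. The following is a minimal sketch, assuming GNU find and a POSIX shell; the names .dotdir, .dotfile and visible are made up for illustration.

```shell
# Throwaway tree: one hidden directory, one hidden file, one visible directory.
tmp=$(mktemp -d)
mkdir -p "$tmp/.dotdir" "$tmp/visible"
touch "$tmp/.dotdir/hidden.txt" "$tmp/.dotfile" "$tmp/visible/file.txt"

# ".?*" needs at least two characters, so the starting point "." is not pruned.
with_qmark=$(cd "$tmp" && find . -not \( -type d -name ".?*" -prune \) | LC_ALL=C sort)

# ".*" also matches ".", so the starting point itself is pruned.
with_star=$(cd "$tmp" && find . -not \( -type d -name ".*" -prune \) | LC_ALL=C sort)

printf 'with .?*:\n%s\n' "$with_qmark"
printf 'with .*: [%s]\n' "$with_star"

rm -rf "$tmp"
```

The first listing should contain the visible entries and the hidden file but nothing from .dotdir, while the second should be empty.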

Related

executable files in linux using (perm)?

I'm trying to list the names of everything under the /etc directory that is executable by all other users and whose name starts or ends with a number.
find /etc "(" -name "[0-9]*" -o -name "*[0-9]" ")" -perm -o=x -print
But every time I get the wrong answer. Can you help?
If you're using the zsh shell, you can get that list of files with its advanced filename generation globbing; no external programs needed. In particular, using a recursive glob, alternation, and a glob qualifier that matches world-executable files:
zsh$ printf "%s\n" /etc/**/([0-9]*|*[0-9])(X)
/etc/alternatives/animate-im6
/etc/alternatives/c89
/etc/alternatives/c99
/etc/alternatives/compare-im6
/etc/alternatives/composite-im6
...
/etc/X11
/etc/X11/fonts/Type1
/etc/xdg/xdg-xubuntu/xfce4
/etc/xdg/xfce4
/etc/xfce4
Do a setopt glob_dots first to match filenames starting with . like find does. Otherwise they get skipped.
If you're using find, you need the -mode form of the argument to -perm (with a leading dash) to select files that have at least the given permission bits set. That is actually what you have in your question, and it works for me:
find /etc \( -name "[0-9]*" -o -name "*[0-9]" \) -perm -o=x
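To see what -perm -o=x selects, here is a small sketch on a scratch directory rather than /etc, assuming GNU find. The file names are hypothetical, and -type f is an addition that is not in the original command: directories usually carry the execute bit for others too, so they would otherwise show up in the list.

```shell
tmp=$(mktemp -d)
touch "$tmp/1abc" "$tmp/abc2" "$tmp/abc" "$tmp/x9"
chmod 755 "$tmp/1abc"   # starts with a digit, executable by others
chmod 644 "$tmp/abc2"   # ends with a digit, not executable
chmod 755 "$tmp/abc"    # executable by others, but no leading or trailing digit
chmod 751 "$tmp/x9"     # ends with a digit, execute bit set for others

# -perm -o=x: true when at least the "other" execute bit is set.
matches=$(cd "$tmp" && find . -type f \( -name "[0-9]*" -o -name "*[0-9]" \) -perm -o=x | LC_ALL=C sort)

printf '%s\n' "$matches"

rm -rf "$tmp"
```

Only 1abc and x9 satisfy both the name test and the permission test.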

Leaving out '-print' from 'find' command when '-prune' is used

I have never been able to fully understand the -prune action of the find command. But in actuality at least some of my misunderstanding stems from the effect of omitting the '-print' expression.
From the 'find' man page..
"If the expression contains no actions other than -prune, -print is performed on all files for which the expression is true."
.. which I have always (for many years) taken to mean I can leave out '-print'.
However, as the following example illustrates, there is a difference between using '-print' and omitting '-print', at least when a '-prune' expression appears.
First of all, I have the following 8 directories under my working directory..
aqua/
aqua/blue/
blue/
blue/orange/
blue/red/
cyan/blue/
green/
green/yellow/
There are a total of 10 files in those 8 directories..
aqua/blue/config.txt
aqua/config.txt
blue/config.txt
blue/orange/config.txt
blue/red/config.txt
cyan/blue/config.txt
green/config.txt
green/test.log
green/yellow/config.txt
green/yellow/test.log
My goal is to use 'find' to display all regular files not having 'blue' as part of the file's path. There are five files matching this requirement.
This works as expected..
% find . -path '*blue*' -prune -o -type f -print
./green/test.log
./green/yellow/config.txt
./green/yellow/test.log
./green/config.txt
./aqua/config.txt
But when I leave out '-print' it returns not only the five desired files, but also any directory whose path name contains 'blue'..
% find . -path '*blue*' -prune -o -type f
./green/test.log
./green/yellow/config.txt
./green/yellow/test.log
./green/config.txt
./cyan/blue
./blue
./aqua/blue
./aqua/config.txt
So why are the three 'blue' directories displayed?
This can be significant because often I'm trying to prune out a directory structure that contains more than 50,000 files. When that path is processed, my find command, especially if I'm running an '-exec grep' on each file, can take a huge amount of time processing files in which I have absolutely no interest. I need to have confidence that find is not going into the pruned structure.
The implicit -print applies to the entire expression, not just the last part of it.
% find . \( -path '*blue*' -prune -o -type f \) -print
./green/test.log
./green/yellow/config.txt
./green/yellow/test.log
./green/config.txt
./cyan/blue
./blue
./aqua/blue
./aqua/config.txt
It's not descending into the pruned directories, but it is printing the pruned directories themselves.
A slight modification:
$ find . ! \( -path '*blue*' -prune \) -type f
./green/test.log
./green/yellow/config.txt
./green/yellow/test.log
./green/config.txt
./aqua/config.txt
(with implicit -a) would lead to having the same behavior with and without -print.
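The three variants can be compared side by side. This is a sketch that rebuilds the directory layout from the question in a temporary directory, assuming GNU find.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp"/aqua/blue "$tmp"/blue/orange "$tmp"/blue/red "$tmp"/cyan/blue "$tmp"/green/yellow
for d in aqua aqua/blue blue blue/orange blue/red cyan/blue green green/yellow; do
    touch "$tmp/$d/config.txt"
done
touch "$tmp/green/test.log" "$tmp/green/yellow/test.log"

# Explicit -print: only the right-hand side of -o prints.
explicit=$(cd "$tmp" && find . -path '*blue*' -prune -o -type f -print | LC_ALL=C sort)
# No action: the implicit -print wraps the whole expression.
implicit=$(cd "$tmp" && find . -path '*blue*' -prune -o -type f | LC_ALL=C sort)
# Negated prune: false for pruned directories, so nothing extra is printed.
negated=$(cd "$tmp" && find . ! \( -path '*blue*' -prune \) -type f | LC_ALL=C sort)

printf 'explicit -print: %s paths\n' "$(printf '%s\n' "$explicit" | wc -l)"
printf 'implicit -print: %s paths\n' "$(printf '%s\n' "$implicit" | wc -l)"
printf 'extra paths with implicit -print:\n'
printf '%s\n' "$implicit" | grep blue

rm -rf "$tmp"
```

The implicit form reports the three pruned 'blue' directories in addition to the five files, while the negated form gives the same list as the explicit one.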

Bash Find with ignore

I need to find files while ignoring those whose names match "^02" (a regex). If a directory matches "^02", I also need to ignore every file inside it. I don't know how to do this. I tried something like:
find . -type f -not -regex "^9" -o -prune
But it doesn't work.
Note that the regex doesn't need ^ and $, as it always has to match the whole path. Moreover, the path starts with ./ if the first argument to find is ., so you need to include that, too.
find -type f -not -regex '\./02.*'
If you want to exclude even subdirectories, use .*/02.* for the regex.
If you want to only exclude the directories matching the pattern, but you want to keep the files, you need to use prune only for directories matching the regex, and -false to remove the directories from the list:
find . -type d -regex '\./02.*' -prune -false -or -type f
Also, you can use patterns instead of regexes for simple cases. That way, you can use -name to include subdirectories:
find . -name '02*' -prune -false -or -type f
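The three commands behave differently on nested matches. Here is a sketch on a scratch tree, assuming GNU find (for -regex and -false); the file and directory names are invented for the example.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/02dir" "$tmp/keep/02sub"
touch "$tmp/02dir/inner.txt" "$tmp/02file.txt" "$tmp/keep/a.txt" "$tmp/keep/02sub/x.txt"

# Drop every regular file whose whole path starts with ./02
by_regex=$(cd "$tmp" && find . -type f -not -regex '\./02.*' | LC_ALL=C sort)

# Prune only top-level directories matching ./02..., keep all remaining files
prune_top=$(cd "$tmp" && find . -type d -regex '\./02.*' -prune -false -or -type f | LC_ALL=C sort)

# Prune anything named 02* at any depth; regular files named 02* are still listed
prune_any=$(cd "$tmp" && find . -name '02*' -prune -false -or -type f | LC_ALL=C sort)

printf 'by_regex:\n%s\n\nprune_top:\n%s\n\nprune_any:\n%s\n' "$by_regex" "$prune_top" "$prune_any"

rm -rf "$tmp"
```

Note that keep/02sub/x.txt survives the first two commands, because the regex is anchored to the start of the path, and only the -name variant prunes keep/02sub.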

Bash - Excluding subdirectories using the find command [duplicate]

I'm using the find command to get a list of folders where certain files are located. But because of a permission denied error for certain subdirectories, I want to exclude a certain subdirectory name.
I already tried these solutions I found here:
find /path/to/folders -path "*/noDuplicates" -prune -type f -name "fileName.txt"
find /path/to/folders ! -path "*/noDuplicates" -type f -name "fileName.txt"
And some variations for these commands (variations on the path name for example).
In the first case it won't find a folder at all, in the second case I get the error again, so I guess it still tries to access this directory. Does anyone know what I'm doing wrong or does anyone have a different solution for this?
To complement olivm's helpful answer and address the OP's puzzlement at the need for -o:
-prune, as every find primary (action or test, in GNU speak), returns a Boolean, and that Boolean is always true in the case of -prune.
Without explicit operators, primaries are implicitly connected with -a (-and), which, like its brethren -o (-or) performs short-circuiting Boolean logic.
-a has higher precedence than -o.
For a summary of all find concepts, see https://stackoverflow.com/a/29592349/45375
Thus, the accepted answer,
find . -path ./ignored_directory -prune -o -name fileName.txt -print
is equivalent to (parentheses are used to make the evaluation precedence explicit):
find . \( -path ./ignored_directory -a -prune \) \
-o \
\( -name fileName.txt -a -print \)
Since short-circuiting applies, this is evaluated as follows:
an input path matching ./ignored_directory causes -prune to be evaluated; since -prune always returns true, short-circuiting prevents the right side of the -o operator from being evaluated; in effect, nothing happens (the input path is ignored)
an input path NOT matching ./ignored_directory, instantly - again due to short-circuiting - continues evaluation on the right side of -o:
only if the filename part of the input path matches fileName.txt is the -print primary evaluated; in effect, only input paths whose filename matches fileName.txt are printed.
Edit: In spite of what I originally claimed here, -print IS needed on the right-hand side of -o here; without it, the implied -print would apply to the entire expression and thus also print for left-hand side matches; see below for background information.
By contrast, let's consider what mistakenly NOT using -o does:
find . -path ./ignored_directory -prune -name fileName.txt -print
This is equivalent to:
find . -path ./ignored_directory -a -prune -a -name fileName.txt -a -print
This will only print pruned paths (that also match the -name filter), because the -name and -print primaries are (implicitly) connected with logical ANDs;
in this specific case, since ./ignored_directory cannot also match fileName.txt, nothing is printed, but if -path's argument is a glob, it is possible to get output.
A word on find's implicit use of -print:
POSIX mandates that if a find command's expression as a WHOLE does NOT contain either
output-producing primaries, such as -print itself
primaries that execute something, such as -exec and -ok
(the example primaries given are exhaustive for the POSIX spec. of find, but real-world implementations such as GNU find and BSD find add others, such as the output-producing -print0 primary, and the executing -execdir primary)
that -print be applied implicitly, as if the expression had been specified as:
\( expression \) -print
This is convenient, because it allows you to write commands such as find ., without needing to append -print.
However, in certain situations an explicit -print is needed, as is the case here:
Let's say we didn't specify -print at the end of the accepted answer:
find . -path ./ignored_directory -prune -o -name fileName.txt
Since there's now no output-producing or executing primary in the expression, it is evaluated as:
find . \( -path ./ignored_directory -prune -o -name fileName.txt \) -print
This will NOT work as intended, as it will print paths if the entire parenthesized expression evaluates to true, which in this case mistakenly includes the pruned directory.
By contrast, by explicitly appending -print to the -o branch, paths are only printed if the right-hand side of the -o expression evaluates to true; using parentheses to make the logic clearer:
find . -path ./ignored_directory -prune -o \( -name fileName.txt -print \)
If, by contrast, the left-hand side is true, only -prune is executed, which produces no output (and since the overall expression contains a -print, -print is NOT implicitly applied).
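The three cases discussed above (with -o and -print, without -o, without -print) can be checked on a two-directory tree. This is a sketch assuming GNU find; the directory name other is made up.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/ignored_directory" "$tmp/other"
touch "$tmp/ignored_directory/fileName.txt" "$tmp/other/fileName.txt"

# Correct form: prune on the left of -o, explicit -print on the right.
with_or=$(cd "$tmp" && find . -path ./ignored_directory -prune -o -name fileName.txt -print | LC_ALL=C sort)

# Missing -o: everything is ANDed, so only a pruned path could ever be printed.
without_or=$(cd "$tmp" && find . -path ./ignored_directory -prune -name fileName.txt -print | LC_ALL=C sort)

# Missing -print: the implicit -print covers the whole expression.
without_print=$(cd "$tmp" && find . -path ./ignored_directory -prune -o -name fileName.txt | LC_ALL=C sort)

printf 'with -o and -print:\n%s\n' "$with_or"
printf 'without -o: [%s]\n' "$without_or"
printf 'without -print:\n%s\n' "$without_print"

rm -rf "$tmp"
```

Only the first form lists other/fileName.txt alone; dropping -o yields nothing, and dropping -print also lists ./ignored_directory itself.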
Following my previous comment, this works on my Debian :
find . -path ./ignored_directory -prune -o -name fileName.txt -print
or
find /path/to/folder -path "*/ignored_directory" -prune -o -name fileName.txt -print
or
find /path/to/folder -name fileName.txt -not -path "*/ignored_directory/*"
The differences are nicely debated here
Edit (added behavior specification details)
Pruning all permission denied directories in find
Using GNU find.
Specification behavior details - in this solution we want to:
exclude the contents of unreadable directories (prune them),
avoid "permission denied" errors coming from unreadable directories,
keep the other errors and return states, but
process all files (even unreadable files, if we can read their names)
The basic design pattern is:
find ... \( -readable -o -prune \) ...
Example
find /var/log/ \( -readable -o -prune \) -name "*.1"
Thanks to mklement0.
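A sketch of the pattern on a scratch tree follows, assuming GNU find (for -readable). The directory and file names are hypothetical. When run as root the locked directory is still readable, so its contents would be listed as well; the absence of error messages holds either way.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/open" "$tmp/locked"
touch "$tmp/open/app.log.1" "$tmp/locked/secret.log.1"
chmod 000 "$tmp/locked"   # unreadable for a non-root user

# Unreadable directories fail -readable, so -prune runs and find never
# tries to enter them; stderr is captured to check that no error is reported.
found=$(find "$tmp" \( -readable -o -prune \) -name "*.1" 2>"$tmp/errors.txt")
errors=$(cat "$tmp/errors.txt")

printf 'found:\n%s\n' "$found"
printf 'errors: [%s]\n' "$errors"

# Restore permissions so the scratch tree can be removed.
chmod 755 "$tmp/locked" && rm -rf "$tmp"
```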
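The problem is in how find applies the pattern you pass to -path: it has to match the entire path, so "*/noDuplicates" matches only the directory itself, not the files beneath it.
Instead, you should try something like:
find /path/to/folders ! -path "*noDuplicates*" -type f -name "fileName.txt"
Note that this only filters what is printed; find still descends into the directory, so the -prune forms above are the way to avoid the permission errors themselves.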

find h files using iregex Bash

I'm using this line to find .h files using bash, but it also finds shell scripts because of the .sh ending. I'm not sure how I can limit find to files ending in .h, rather than a dot followed by anything that has h as its last character.
find . -iregex '.*\(h\)'
What about the much simpler
find -iname '*.h'
This is better because it only finds files that end in .h, and it may be faster than using a full regex.
For a regex, the right approach is
find -iregex '.*\.h'
The \. escapes the '.' so that it matches a literal '.'. The leading .* is needed because -iregex has to match the whole path, not just part of it; for the same reason, a trailing $ is not required.
Added because of question in comment:
Normally
find \( -iname '*.h' -or -iname '*.c' \)
works fine for me. The \( \) escapes the parentheses from the shell.
You can also use a regex with the $ anchor so that only a trailing .h matches; the leading .* is needed because the pattern has to match the whole path:
find . -iregex '.*\.h$'
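The candidate patterns can be compared on a few sample names. This is a sketch assuming GNU find with its default regex syntax; the file names are invented for the example.

```shell
tmp=$(mktemp -d)
touch "$tmp/a.h" "$tmp/B.H" "$tmp/run.sh" "$tmp/ch" "$tmp/notes.h.bak"

# Original pattern: anything ending in h, so run.sh and ch match too.
loose=$(cd "$tmp" && find . -iregex '.*\(h\)' | LC_ALL=C sort)
# Glob on the basename: only names ending in .h (case-insensitive).
by_name=$(cd "$tmp" && find . -iname '*.h' | LC_ALL=C sort)
# Regex on the whole path with an escaped dot.
by_regex=$(cd "$tmp" && find . -iregex '.*\.h' | LC_ALL=C sort)
# Without a leading .* the pattern cannot match a whole path.
unanchored=$(cd "$tmp" && find . -iregex '\.h$' | LC_ALL=C sort)

printf 'loose:\n%s\n\nby_name:\n%s\n\nby_regex:\n%s\n\nunanchored: [%s]\n' \
    "$loose" "$by_name" "$by_regex" "$unanchored"

rm -rf "$tmp"
```

The -iname glob and the '.*\.h' regex select the same two files, while a pattern lacking the leading .* selects nothing.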
