How to search filenames by regex with "find"

I am trying to find all date-stamped files that are 3 or more days old:
find /home/test -name 'test.log.\d{4}-d{2}-d{2}.zip' -mtime 3
It does not list anything. What is wrong with it?

find /home/test -regextype posix-extended -regex '^.*test\.log\.[0-9]{4}-[0-9]{2}-[0-9]{2}\.zip' -mtime +3
-name uses glob patterns, aka wildcards. What you want is -regex.
To use intervals such as {4} as you intend, you need to tell find to use Extended Regular Expressions via the -regextype posix-extended flag (a GNU find option).
You need to escape the periods, because in a regex a period has the special meaning of any single character. What you want is a literal period, denoted by \.
To match only those files that are more than 3 days old, you need to prefix your number with a +, as in -mtime +3.
Proof of Concept
$ find . -regextype posix-extended -regex '^.*test\.log\.[0-9]{4}-[0-9]{2}-[0-9]{2}\.zip'
./test.log.1234-12-12.zip
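To see the two primaries side by side, here is a minimal sketch that builds a throwaway directory and runs both patterns against it. It assumes GNU find (for -regextype); the sample file names and the variable names are invented for the demo.

```shell
#!/usr/bin/env bash
# Sketch: -name (glob) versus -regex (regular expression).
# Assumes GNU find; the sample file names are invented for this demo.
tmp=$(mktemp -d)
touch "$tmp/test.log.2024-01-15.zip" \
      "$tmp/test.log.notes.zip" \
      "$tmp/testXlog.2024-01-15.zip"

# -name is a glob: \d and {4} are not regex syntax here, so nothing matches.
by_name=$(cd "$tmp" && find . -name 'test.log.\d{4}-\d{2}-\d{2}.zip' | wc -l)

# -regex with POSIX extended syntax: {4} is an interval, \. a literal dot.
by_regex=$(cd "$tmp" && find . -regextype posix-extended \
    -regex '.*/test\.log\.[0-9]{4}-[0-9]{2}-[0-9]{2}\.zip' | LC_ALL=C sort)

echo "glob matches:  $by_name"
echo "regex matches: $by_regex"
rm -rf "$tmp"
```

Only the correctly dated file should come back from the regex; testXlog is rejected because the escaped period no longer matches an arbitrary character.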

Use -regex not -name, and be aware that the regex matches against what find would print, e.g. "/home/test/test.log" not "test.log"

Start with:
find . -name '*.log.*.zip' -a -mtime +1
You may not need a regex, try:
find . -name '*.log.*-*-*.zip' -a -mtime +1
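You will want the + prefix: -mtime +1 matches files last modified more than one day ago, while a bare -mtime 1 matches only files exactly one day old (find ignores fractional days).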
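The effect of the sign on -mtime is easy to check on files with known ages. The sketch below assumes GNU touch (for the -d 'N hours ago' syntax) and uses made-up file names; find counts age in whole 24-hour periods and drops the fraction.

```shell
#!/usr/bin/env bash
# Sketch: how -mtime N, -mtime +N and -mtime -N select files.
# Assumes GNU touch for relative dates; file names are hypothetical.
tmp=$(mktemp -d)
touch -d '25 hours ago'  "$tmp/age1.zip"   # 1 whole day old
touch -d '84 hours ago'  "$tmp/age3.zip"   # 3 whole days old
touch -d '125 hours ago' "$tmp/age5.zip"   # 5 whole days old

exact=$(cd "$tmp" && find . -type f -mtime 3  | LC_ALL=C sort)  # exactly 3
older=$(cd "$tmp" && find . -type f -mtime +3 | LC_ALL=C sort)  # more than 3
newer=$(cd "$tmp" && find . -type f -mtime -3 | LC_ALL=C sort)  # fewer than 3

echo "-mtime 3:  $exact"
echo "-mtime +3: $older"
echo "-mtime -3: $newer"
rm -rf "$tmp"
```

Because the fraction is dropped, a file has to be at least four full days old before -mtime +3 selects it.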

Use -regex:
From the man page:
-regex pattern
File name matches regular expression pattern. This is a match on the whole path, not a search. For example, to match a file named './fubar3', you can use the
regular expression '.*bar.' or '.*b.*3', but not 'b.*r3'.
Also, I don't believe find supports regex extensions such as \d. You need to use [0-9].
find . -regex '.*test\.log\.[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\.zip'
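The man page example can be reproduced directly. This is a small sketch using the fubar3 name from the quoted text; the temporary directory and variable names are assumptions for illustration.

```shell
#!/usr/bin/env bash
# Sketch: -regex must match the whole path that find would print.
tmp=$(mktemp -d)
touch "$tmp/fubar3"

# Matches: the pattern covers the entire string "./fubar3".
full=$(cd "$tmp" && find . -regex '.*bar.')

# No match: "b.*r3" would have to start at the beginning of "./fubar3".
partial=$(cd "$tmp" && find . -regex 'b.*r3')

# With a different starting point the printed path, and so the string
# being matched, changes as well.
abs=$(find "$tmp" -regex '.*/fubar3')

echo "full:    $full"
echo "partial: ${partial:-<nothing>}"
echo "abs:     $abs"
rm -rf "$tmp"
```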

A little elaboration on searching for a directory or a file by name.
Find a directory with a name like book:
find . -name "*book*" -type d
Find a file with a name like book:
find . -name "*book*" -type f
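A quick way to check what -type adds is to run both commands on a tiny tree. The directory and file names below are invented for the sketch.

```shell
#!/usr/bin/env bash
# Sketch: the same glob restricted to directories or to regular files.
tmp=$(mktemp -d)
mkdir "$tmp/my_books"
touch "$tmp/my_books/notes.txt" "$tmp/bookmark.txt"

dirs=$(cd "$tmp" && find . -name '*book*' -type d)
files=$(cd "$tmp" && find . -name '*book*' -type f)

echo "directories: $dirs"
echo "files:       $files"
rm -rf "$tmp"
```

notes.txt is not listed even though it lives under my_books, because -name only looks at the last component of the path.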

Related

Bash Find with ignore

I need to find files while ignoring those that match "^02" (a regex). If "^02" matches a directory, I need to ignore every file inside that directory. I don't know how to do it. I tried something like:
find . -type f -not -regex "^9" -o -prune
But it doesn't work.

Note that the regex doesn't need ^ and $, as it always has to match the whole string. Moreover, the path starts with ./ if the first argument to find is ., so you need to include that, too.
find . -type f -not -regex '\./02.*'
If you want to exclude even subdirectories, use .*/02.* for the regex.
If you want to exclude only the directories matching the pattern but keep the files, use -prune only for directories matching the regex, and -false to remove the pruned directories from the output:
find . -type d -regex '\./02.*' -prune -false -or -type f
Also, you can use patterns instead of regexes for simple cases. That way, you can use -name to include subdirectories:
find . -name '02*' -prune -false -or -type f
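The three variants behave differently on files named 02*, which is easiest to see on a small tree. The sketch below assumes a find that supports -false (GNU find does); the directory layout is made up for the demo.

```shell
#!/usr/bin/env bash
# Sketch: excluding by regex versus pruning directories.
# Assumes a find with -false (e.g. GNU find); the tree is hypothetical.
tmp=$(mktemp -d)
mkdir -p "$tmp/02dir" "$tmp/keep/02nested"
touch "$tmp/02dir/inside.txt" "$tmp/02file.txt" \
      "$tmp/keep/a.txt" "$tmp/keep/02nested/b.txt"

# 1. Drop every path that starts with ./02 (files and directory contents).
no_regex=$(cd "$tmp" && find . -type f -not -regex '\./02.*' | LC_ALL=C sort)

# 2. Prune only top-level directories matching ./02*, keep matching files.
prune_top=$(cd "$tmp" && \
    find . -type d -regex '\./02.*' -prune -false -or -type f | LC_ALL=C sort)

# 3. Prune directories named 02* at any depth, keep matching files.
prune_any=$(cd "$tmp" && \
    find . -name '02*' -prune -false -or -type f | LC_ALL=C sort)

printf '%s\n' "--- not regex" "$no_regex" \
              "--- prune top level" "$prune_top" \
              "--- prune any depth" "$prune_any"
rm -rf "$tmp"
```

In the pruning variants 02file.txt is still printed: -false discards the left-hand branch, and the file is then picked up by -type f on the right of -or.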

Match .h, .m, .mm files but NOT .html

I want to do some processing on just the source files of type .h, .m, .mm. I do NOT want to include the .html files.
The following misses the .mm files as I'm only matching .h or .m and not trying to catch longer extensions.
find ./ -type f -name "*.[hm]"
This only catches the .mm files, as the ? operator always matches a character. I'd like it to be an optional zero or one match like in regex.
find ./ -type f -name "*.[hm]?"
But if I use * instead, it matches 0 or multiple characters. This returns everything, but also has the .html files that I don't want.
find ./ -type f -name "*.[hm]*"
Any ideas on how to do this?

You can combine conditions:
find . -type f -and \( -name '*.h' -or -name '*.m' -or -name '*.mm' \)
Or:
find ./ -type f -and -name '*.[hm]*' -and -not -name '*.html'
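The two commands are not quite equivalent, which a small test tree makes visible. This is a sketch with invented file names; e.hpp is included only to show what the second form also lets through.

```shell
#!/usr/bin/env bash
# Sketch: explicit list of extensions versus "glob minus .html".
tmp=$(mktemp -d)
touch "$tmp/a.h" "$tmp/b.m" "$tmp/c.mm" "$tmp/d.html" "$tmp/e.hpp"

# Explicit list: exactly .h, .m and .mm.
listed=$(cd "$tmp" && \
    find . -type f \( -name '*.h' -o -name '*.m' -o -name '*.mm' \) | LC_ALL=C sort)

# Broad glob with one exclusion: .html is gone, but .hpp slips through.
excluded=$(cd "$tmp" && \
    find . -type f -name '*.[hm]*' -not -name '*.html' | LC_ALL=C sort)

printf '%s\n' "--- explicit list" "$listed" "--- glob minus .html" "$excluded"
rm -rf "$tmp"
```

If the tree may contain other extensions that start with h or m, the explicit list is the safer choice.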

If you are running OS X, there is an -E switch that tells find to treat the regular expression as an extended one. From the man page:
Interpret regular expressions followed by -regex and -iregex primaries as extended (modern) regular expressions rather than basic regular expressions (BRE's). The re_format(7) manual page fully describes both formats.
Your command should now work:
find -E . -type f -regex ".*\.([hm]|mm)"
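-E is specific to the BSD find shipped with OS X. With GNU find the closest equivalent is -regextype posix-extended. The sketch below picks one or the other by probing find --version, which is an assumption about how to tell the two apart (GNU find prints a banner containing "GNU"; BSD find rejects the option). File names are invented.

```shell
#!/usr/bin/env bash
# Sketch: extended-regex alternation on both BSD and GNU find.
# The --version probe is a heuristic, not an official detection method.
tmp=$(mktemp -d)
touch "$tmp/a.h" "$tmp/b.m" "$tmp/c.mm" "$tmp/d.html"

if find --version 2>/dev/null | grep -q 'GNU'; then
    # GNU find: select the syntax with -regextype.
    matches=$(cd "$tmp" && find . -regextype posix-extended \
        -type f -regex '.*\.([hm]|mm)' | LC_ALL=C sort)
else
    # BSD / OS X find: -E goes before the path.
    matches=$(cd "$tmp" && find -E . -type f -regex '.*\.([hm]|mm)' | LC_ALL=C sort)
fi

echo "$matches"
rm -rf "$tmp"
```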

What are "primaries" in find?

I was reading the manual for the find command. As I was going down the list of options I was reading the following..
PRIMARIES
All primaries which take a numeric argument allow the number to be preceded
by a plus sign (``+'') or a minus sign (``-''). A preceding plus
sign means ``more than n'', a preceding minus sign means ``less than n''
and neither means ``exactly n''.
I was having a hard time understanding what that means. I was also trying to find out what are "Primaries" in Google and couldn't get a good answer.
Can anyone help me understand what this means?

From the man page, this is the list of primaries in OS X find:
-Bmin
-Bnewer
-Btime
-amin
-anewer
-atime
-cmin
-cnewer
-ctime
-d
-delete
-depth
-empty
-exec
-execdir
-flags
-fstype
-gid
-group
-ignore
-ilname
-iname
-inum
-ipath
-iregex
-iwholename
-links
-lname
-ls
-maxdepth
-mindepth
-mmin
-mnewer
-mount
-mtime
-name
-newer
-newerXY
-nogroup
-noignore_readdir_race
-noleaf
-nouser
-ok
-okdir
-path
-perm
-print
-print0
-prune
-regex
-samefile
-size
-type
-uid
-user
-wholename

From the beginning of the same man page (emphasis mine):
DESCRIPTION
The find utility recursively descends the directory tree for each path listed, evaluating an expression (composed
of the ``primaries'' and ``operands'' listed below) in terms of each file in the tree.
"Primary" is the term used by the find documentation for one of the building blocks of an expression used by find to filter its output.

The find command accepts two kinds of parameters, which the authors of find named 'primaries' and 'operators'. Primaries are parameters that filter which files you want find to find, while operators are the parameters that combine the primaries.
In mathematics, a primary is the basic component in an arithmetic or logic expression.
There is also a third class of parameters, which have no name and modify how find traverses the directory hierarchy, and a fourth class that defines what action to take on the files found (print, delete, etc.).
The GNU man page uses the word 'Test' instead of 'Primary'
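The plus and minus rule for numeric arguments quoted in the question can be tried with any numeric primary. The sketch below uses -size with the c (bytes) suffix because byte counts involve no rounding; the file names are made up.

```shell
#!/usr/bin/env bash
# Sketch: n, +n and -n on a numeric primary, here -size in bytes.
tmp=$(mktemp -d)
printf '%s' '12345'                > "$tmp/small"   # 5 bytes
printf '%s' '1234567890'           > "$tmp/exact"   # 10 bytes
printf '%s' '12345678901234567890' > "$tmp/big"     # 20 bytes

exactly=$(cd "$tmp" && find . -type f -size 10c)    # exactly 10 bytes
more=$(cd "$tmp" && find . -type f -size +10c)      # more than 10 bytes
less=$(cd "$tmp" && find . -type f -size -10c)      # less than 10 bytes

echo "-size 10c:  $exactly"
echo "-size +10c: $more"
echo "-size -10c: $less"
rm -rf "$tmp"
```

Here -type and -size are the primaries; placing them next to each other combines them with an implicit AND.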

find h files using iregex Bash

I'm using this line to find .h files with bash, but it also finds shell scripts because of their .sh ending. I'm not sure how to limit find to files ending in .h, rather than any file whose last character is h.
find . -iregex '.*\(h\)'

What about the much simpler
find -iname '*.h'
This is better because it only finds files that end in .h, and it may be faster than using a full regex.
For a regex, the right approach is
find . -iregex '.*\.h$'
The leading .* is needed because -iregex has to match the whole path. The \. escapes the '.' so that it matches a literal '.', and the $ says it should be the last part of the match.
Added because of question in comment:
Normally
find \( -iname '*.h' -or -iname '*.c' \)
works fine for me. The \( \) is to escape the parenthesis from the shell.

You can use enhanced regex with anchor $ to only match .h:
find . -iregex '.*\.h$'
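The difference between the original pattern and the escaped one shows up as soon as a .sh file is present. A minimal sketch, with invented file names:

```shell
#!/usr/bin/env bash
# Sketch: "ends in h" versus "ends in .h" (case-insensitive).
tmp=$(mktemp -d)
touch "$tmp/foo.h" "$tmp/BAR.H" "$tmp/script.sh" "$tmp/notes.txt"

# Original pattern: any path whose last character is h or H.
loose=$(cd "$tmp" && find . -iregex '.*\(h\)' | LC_ALL=C sort)

# Escaped dot: the path must end in a literal ".h".
strict=$(cd "$tmp" && find . -iregex '.*\.h' | LC_ALL=C sort)

# Glob equivalent, matched against the file name only.
by_name=$(cd "$tmp" && find . -iname '*.h' | LC_ALL=C sort)

printf '%s\n' "--- loose" "$loose" "--- strict" "$strict" "--- iname" "$by_name"
rm -rf "$tmp"
```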

bash, find files which contain numbers on filename

In bash, I would like to use the command find to find files which contain the numbers from 40 to 70 in a certain position like c43_data.txt. How is it possible to implement this filter in find?
I tried find . -name "c**_data.txt" | grep 4, but this is not very nice.
Thanks

Perhaps something like:
find . -regextype posix-egrep -regex "\./c(([4-6][0-9])|70)_data\.txt"
This matches 40 - 69, and 70.
You may also use the iregex option for case-insensitive matching.
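The boundaries of the range are where such a pattern usually goes wrong, so it is worth testing 39, 40, 70 and 71 explicitly. The sketch assumes GNU find (for -regextype); the sample files are invented.

```shell
#!/usr/bin/env bash
# Sketch: matching the numeric range 40..70 in a file name.
# Assumes GNU find; sample files are hypothetical.
tmp=$(mktemp -d)
for n in 39 40 55 70 71; do
    touch "$tmp/c${n}_data.txt"
done

# [4-6][0-9] covers 40..69; 70 is added as a separate alternative.
in_range=$(cd "$tmp" && find . -regextype posix-egrep \
    -regex '\./c([4-6][0-9]|70)_data\.txt' | LC_ALL=C sort)

echo "$in_range"
rm -rf "$tmp"
```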

$ ls
c40_data.txt c42_data.txt c44_data.txt c70_data.txt c72_data.txt c74_data.txt
c41_data.txt c43_data.txt c45_data.txt c71_data.txt c73_data.txt c75_data.txt
$ find . -type f \( -name "c[4-6][0-9]_*txt" -o -name "c70_*txt" -o -name "c[1-2][3-4]_*.txt" \) -print
./c43_data.txt
./c41_data.txt
./c45_data.txt
./c70_data.txt
./c40_data.txt
./c44_data.txt
./c42_data.txt

Try something like:
find . -regextype posix-egrep -regex '.*c([4-6][0-9]|70).*'
with the appropriate refinements to limit this to the files you want

ls -R | grep -E 'c([4-6][0-9]|70)_data\.txt'
find can be used in place of ls, obviously.
