Why doesn't find let me match multiple patterns? - macos

I'm writing some bash/zsh scripts that process some files. I want to execute a command for each file of a certain type, and some of these commands overlap. When I try to find -name 'pattern1' -or -name 'pattern2', only the last pattern is used (files matching pattern1 aren't returned; only files matching pattern2). What I want is for files matching either pattern1 or pattern2 to be matched.
For example, when I try the following this is what I get (notice only ./foo.xml is found and printed):
$ ls -a
. .. bar.html foo.xml
$ tree .
.
├── bar.html
└── foo.xml
0 directories, 2 files
$ find . -name '*.html' -or -name '*.xml' -exec echo {} \;
./foo.xml
$ type find
find is an alias for noglob find
find is /usr/bin/find
Using -o instead of -or gives the same results. If I switch the order of the -name parameters, then only bar.html is returned and not foo.xml.
Why aren't bar.html and foo.xml found and returned? How can I match multiple patterns?

You need to use parentheses in your find command to group your conditions; otherwise the -exec action applies only to the second -name test.
find . \( -name '*.html' -or -name '*.xml' \) -exec echo {} \;
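To see why the grouping is needed: without parentheses, the implicit -and between tests binds more tightly than -or, so the original command is parsed as the following (an equivalent form, written out here just for illustration):
find . -name '*.html' -or \( -name '*.xml' -exec echo {} \; \)
The *.html branch has no action attached to it, and because an explicit action (-exec) appears elsewhere in the expression, find does not apply its default -print to that branch, so those matches vanish.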

find utility
-print == default
If you just want to print file paths and names, you can drop the -exec echo entirely, because -print is the default action:
find . -name '*.html' -or -name '*.xml'
Order dependency
Otherwise, find evaluates its expression from left to right, so argument order matters!
If you want to combine actions per pattern, respect the precedence of -and over -or:
find . -name '*.html' -exec echo ">"{} \; -o -name '*.xml' -exec echo "+"{} \;
or
find . -maxdepth 4 \( -name '*.html' -o -name '*.xml' \) -exec echo {} \;
The -print0 expression and the xargs command
But for most cases, you could consider -print0 combined with the xargs command, like:
find . \( -name '*.html' -o -name '*.xml' \) -print0 |
xargs -0 printf -- "-- %s -\n"
The advantages of doing this are:
Only one fork (or a few) for thousands of entries found. (Using -exec echo {} \; runs one subprocess for each entry found, while xargs builds a command line with as many arguments as one invocation can hold...)
To work with filenames containing special characters or whitespace, -print0 and xargs -0 use the NUL character as the filename delimiter.
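For example (a small illustration, not from the original answer), a filename containing a space survives the null-delimited pipeline intact:
$ touch 'my song.ogg'
$ find . -name '*.ogg' -print0 | xargs -0 printf -- "-- %s -\n"
-- ./my song.ogg -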
find ... -exec ... {} ... +
For some years now, the find command has accepted an alternative syntax for the -exec switch.
Instead of \;, the -exec clause may end with a plus sign +.
find . \( -name '*.html' -o -name '*.xml' \) -exec printf -- "-- %s -\n" {} +
With this syntax, find works like the xargs command, building long command lines to reduce the number of forks.
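The difference is easy to observe with echo, because each invocation produces one line of output (using the sample files from the question):
$ find . \( -name '*.html' -o -name '*.xml' \) -exec echo {} \;
./bar.html
./foo.xml
$ find . \( -name '*.html' -o -name '*.xml' \) -exec echo {} +
./bar.html ./foo.xml
With \;, echo runs once per file; with +, it runs once with both names as arguments.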

Related

find exec and strip extension from filenames

Any idea why this command is not working? BTW, I'm trying to strip the extensions from all csv files in the current directory.
find -type f -iname "*.csv" -exec mv {} $(basename {} ".csv") \;
I tried many variants, including parameter expansions and xargs ... even then, all went futile.
This should do:
find ./ -type f -iname "*.csv" -exec sh -c 'mv {} $(basename {} .csv)' \;
find substitutes {} with each path it finds; the single quotes prevent your interactive shell from running the command substitution before find does, so $(basename {} .csv) is evaluated inside the per-file sh invocation, after {} has been replaced.
The reason yours is not working is that $(basename {} ".csv") is executed in a subshell ($( )) and evaluated by your shell before find ever runs. If we look at the command execution step by step, you will see what happens:
find -type f -iname "*.csv" -exec mv {} $(basename {} ".csv") \; - your command
find -type f -iname "*.csv" -exec mv {} {} \; - the subshell gets evaluated ($(basename {} ".csv") returns {}, since it treats {} as a literal string)
find -type f -iname "*.csv" -exec mv {} {} \; - as you can see now: mv moves each file onto itself and does nothing
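As a side note (an addition, not part of the answer above): embedding {} inside the sh -c string is fragile if a filename contains quotes or other shell metacharacters. A more robust sketch passes the filename as a positional parameter instead, and uses the shell's suffix removal rather than basename:
find ./ -type f -iname '*.csv' -exec sh -c 'mv "$1" "${1%.csv}"' sh {} \;
Here $1 is the found path, and ${1%.csv} strips the suffix in place without spawning basename.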
First, take care that you have no subdirectories; find, without extra arguments, will automatically recurse into any directory below.
Simple approach: if you have a small enough number of files, just use the glob (*) operator, and take advantage of rename:
$ rename 's/\.csv$//' *.csv
If you have too many files, use find, and perhaps xargs:
$ find . -maxdepth 1 -type f -name "*.csv" | xargs rename 's/\.csv$//'
If you want to be really safe, tell find and xargs to delimit with null bytes, so that weird filenames (e.g., with spaces or newlines) don't mess up the process:
$ find . -maxdepth 1 -type f -name "*.csv" -print0 | xargs -0 rename 's/\.csv$//'

"find: bad option -name "*.user"" while passing variables to find

In a bash script this fails:
fileloc='/var/adm/logs/morelogs'
filename=' -name "*.user"'
fileList="$(find "$fileloc"/* -type f -prune "$filename" -print)"
find: bad option -name "*.user"
find: [-H | -L] path-list predicate-list
but this works:
find /var/adm/logs/morelogs/* -type f -prune -name "*.user" -print
in the same manner:
this fails:
fileloc='/var/adm/logs/morelogs'
filename='\( -name "admin.*" -o -name "*.user" -o -name "*.user.gz" \)'
fileList="$(find "$fileloc"/* -type f -prune "$filename" -print)"
find: bad option \( -name "admin.*" -o -name "*.user" -o -name "*.user.gz" \)
find: [-H | -L] path-list predicate-list
but this works:
find /var/adm/logs/morelogs/* -type f -prune \( -name "admin.*" -o -name "*.user" -o -name "*.user.gz" \) -print
GNU bash, version 3.00.16(1)-release-(sparc-sun-solaris2.10)
This is a use case where you should use Bash arrays or a Bash function.
Using BASH arrays:
#!/bin/bash
# initialize your constants
fileloc='/var/adm/logs/morelogs'
filename='*.user'
# create an array with full find command
cmd=( find "$fileloc" -type f -prune -name "$filename" -print )
# execute find command line using BASH array
"${cmd[#]}"
It sounds like you're trying to build the list of names to search for dynamically -- if this is the case, a variant of @anubhava's answer using the array for just the name patterns is the best approach:
namepatterns=() # Start with no filenames to search for
while something; do
newsuffix="whatever"
namepatterns+=(-o -name "*.$newsuffix")
done
# Note that "${namepatterns[@]}" is not quite what we want to pass to find, since
# it always starts with "-o" (unless it's empty, in which case this'll have other
# problems). But "${namepatterns[@]:1}" leaves off the first element, and gets us
# what we need.
fileList="$(find "$fileloc"/* -type f -prune "(" "${namepatterns[@]:1}" ")" -print)"
Other notes: I second @BroSlow's recommendation to read BashFAQ #50: I'm trying to put a command in a variable, but the complex cases always fail!, and also you're going to have trouble using that fileList variable if any of the filenames contain funny characters (esp. whitespace and wildcards) -- see BashFAQ #20: How can I find and safely handle file names containing newlines, spaces or both? (short answer: arrays are better for this as well!)
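A minimal sketch of that array-based collection (assuming a find that supports -print0, which GNU find does but the stock Solaris find may not; fileArr is a hypothetical name):
fileArr=()
while IFS= read -r -d '' f; do
    fileArr+=("$f")    # one array element per file, whatever characters the name contains
done < <(find "$fileloc"/* -type f -prune "(" "${namepatterns[@]:1}" ")" -print0)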
Let's see what you are doing, with set -x:
$ fileloc='/var/adm/logs/morelogs'
+ fileloc=/var/adm/logs/morelogs
$ filename=' -name "*.user"'
+ filename=' -name "*.user"'
Everything seems fine so far. Now, the next line:
$ fileList="$(find "$fileloc"/* -type f -prune "$filename" -print)"
++ find '/var/adm/logs/morelogs/*' -type f -prune ' -name "*.user"' -print
find: paths must precede expression: -name "*.user"
Usage: find [-H] [-L] [-P] [-Olevel] [-D help|tree|search|stat|rates|opt|exec] [path...] [expression]
+ fileList=
I think you see the problem now: if you execute find '/var/adm/logs/morelogs/*' -type f -prune ' -name "*.user"' -print yourself, it will throw an error:
$ find '/var/adm/logs/morelogs/*' -type f -prune ' -name "*.user"' -print
find: paths must precede expression: -name "*.user"
Usage: find [-H] [-L] [-P] [-Olevel] [-D help|tree|search|stat|rates|opt|exec] [path...] [expression]
What's happening? Well, there are several single quotes in the trace, but the ones that cause problems are the last pair, wrapping ' -name "*.user"': they make find receive that whole string as a single argument instead of an option followed by a pattern; the others can be ignored. So, how to fix this? Don't use double quotes around the $filename variable:
$ find "$fileloc" -type f -prune $filename -print
+ find /var/adm/logs/morelogs -type f -prune -name '*.user' -print
That should solve it.
Not an answer to the problem, but a poor man's solution: after getting frustrated, I just hard-coded the search with the full options list.
So it looks like this now, and it works. I had to build some cases and repeat myself - not good programming practice, but I was tired of fighting the shell....
So, for example, one option looks like:
fileList="$(find "$fileloc"/* -type f -prune \( -name "admin.*" -o -name "*.user" -o -name "*.user.gz" \) -print)"

find option available to omit leading './' in result

I think this is probably a pretty n00ber question but I just gotsta ask it.
When I run:
$ find . -maxdepth 1 -type f \( -name "*.mp3" -o -name "*.ogg" \)
and get:
./01.Adagio - Allegro Vivace.mp3
./03.Allegro Vivace.mp3
./02.Adagio.mp3
./04.Allegro Ma Non Troppo.mp3
why does find prepend a ./ to the file name? I am using this in a script:
fList=()
while read -r -d $'\0'; do
fList+=("$REPLY")
done < <(find . -type f \( -name "*.mp3" -o -name "*.ogg" \) -print0)
fConv "$fList" "$dBaseN"
and I have to use a bit of a hacky-sed-fix at the beginning of a for loop in function 'fConv', accessing the array elements, to remove the leading ./. Is there a find option that would simply omit the leading ./ in the first place?
The ./ at the beginning of the file name is the path; the "." means the current directory.
You can use "sed" to remove it.
find . -maxdepth 1 -type f \( -name "*.mp3" -o -name "*.ogg" \) | sed 's|^\./||'
I do not recommend doing this though, since find can search through multiple directories, how would you know if the file found is located in the current directory?
If you ask it to search under /tmp, the results will be on the form /tmp/file:
$ find /tmp
/tmp
/tmp/.X0-lock
/tmp/.com.google.Chrome.cUkZfY
If you ask it to search under . (like you do), the results will be on the form ./file:
$ find .
.
./Documents
./.xmodmap
If you ask it to search through foo.mp3 and bar.ogg, the result will be on the form foo.mp3 and bar.ogg:
$ find *.mp3 *.ogg
click.ogg
slide.ogg
splat.ogg
However, this is just the default. With GNU and other modern finds, you can modify how to print the result. To always print just the last element:
find /foo -printf '%f\0'
If the result is /foo/bar/baz.mp3, this will result in baz.mp3.
To print the path relative to the argument under which it's found, you can use:
find /foo -printf '%P\0'
For /foo/bar/baz.mp3, this will show bar/baz.mp3.
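Applied to the script in the question (GNU find assumed, since -printf is not POSIX), the read loop can strip the leading ./ at the source:
fList=()
while read -r -d $'\0'; do
    fList+=("$REPLY")
done < <(find . -type f \( -name "*.mp3" -o -name "*.ogg" \) -printf '%P\0')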
However, you shouldn't be using find at all. This is a job for plain globs, as suggested by R Sahu.
shopt -s nullglob
files=(*.mp3 *.ogg)
echo "Converting ${files[*]}:"
fConv "${files[#]}"
You can strip the leading path component for each result with basename:
find . -maxdepth 1 -type f \( -name "*.mp3" -o -name "*.ogg" \) -exec basename "{}" \;
Having said that, I think you can use a simpler approach:
for file in *.mp3 *.ogg
do
if [[ -f $file ]]; then
# Use the file
fi
done
If your -maxdepth is 1, you can simply use ls:
$ ls *.mp3 *.ogg
Of course, that will pick up any directory with a *.mp3 or *.ogg suffix, but you probably don't have such a directory anyway.
Another option is to munge your results:
$ find . -maxdepth 1 -type f \( -name "*.mp3" -o -name "*.ogg" \) | sed 's#^\./##'
This will remove all ./ prefixes, but not touch other file names. Note the ^ anchor in the substitution command.

I am getting an error "arg list too long" in unix

I am using the following command and getting an error "arg list too long". Help needed.
find ./* \
-prune \
-name "*.dat" \
-type f \
-cmin +60 \
-exec basename {} \;
Here is the fix:
find . -maxdepth 1 -name "*.dat" -type f -cmin +60 | xargs -I {} basename {}
To only find files in the current directory, use -maxdepth 1.
find . -maxdepth 1 -name '*.dat' -type f -cmin +60 -exec basename {} \;
On all *nix systems there is a maximum combined length for the arguments that can be passed to a command. This limit applies after the shell has expanded filenames passed as arguments on the command line.
The syntax of find is find location_to_find_from arguments..., so when you run this command the shell expands your ./* to a list of all files in the current directory, turning your command line into find file1 file2 file3 etc. This is probably not what you want, as find is recursive anyway. I expect you are running this command in a large directory and blowing your command-length limit.
Try running the command as follows
find . -name "*.dat" -type f -cmin +60 -exec basename {} \;
This will prevent the filename expansion that is probably causing your issue.
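If you are curious what the limit actually is on your system, POSIX getconf reports it (the value varies by OS and kernel configuration):
$ getconf ARG_MAX    # upper bound, in bytes, on the combined argument list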
Without find, and only checking the current directory
now=$(date +%s)
for file in *.dat; do
if (( $now - $(stat -c %Y "$file") > 3600 )); then
echo "$file"
fi
done
This works on my GNU system. You may need to alter the date and stat formats for other OSes.
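For instance (an assumption about BSD/macOS stat, which takes -f instead of -c), the equivalent call there would be:
stat -f %m "$file"    # BSD/macOS: modification time in seconds since the epoch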
If you want to show only the .dat filenames anywhere under the ./ tree, execute it without the -prune option and pass just the path:
find ./ -name "*.dat" -type f -cmin +60 -exec basename {} \;
To find all the .dat files which are older than 60 minutes in the present directory only, do as follows:
find . -iregex "./[^/]+\.dat" -type f -cmin +60 -exec basename {} \;
And if you have a stripped-down version of the find tool (for example, on AIX), do as follows:
find . -name "*.dat" -type f -cmin +60 | grep "^\./[^/]\+\.dat" | sed "s/^\.\///"

replace string with string2 ONLY on line starting with '//FIXME'

I have multiple files with lines where 'date' should be 'data', but the change should only be made where 'date' is on the same line as "FIXME".
find . -maxdepth 1 -type f \( -name "*.cpp" -o -name "*.h" \) -exec grep FIXME {} \; | sed 's/date/data/g'
will output the changes, but if I add -i to sed I get errors,
so I can't get the changes written to disk this way.
I think it's because sed only gets the buffer contents that grep pulls up and doesn't know anything about the file they came from. I'm guessing.
-Thank you!
Remove -maxdepth 1, otherwise it won't traverse into subdirectories. The sed command also needs to be corrected. Try this:
find . -type f \( -name "*.cpp" -o -name "*.h" \) -exec sed -i.bak '/FIXME/s/date/data/g' '{}' \;
