Bash parameter expansion in brackets not working as expected

I am writing a script that wraps the find command to search for specific source file types under a given directory. A sample invocation would be:
./find_them.sh --java --flex --xml dir1
The above command would search for .java, .as and .xml files under dir1.
To do this manually I came up with the following find command:
find dir1 -type f -a \( -name "*.java" -o -name "*.as" -o -name "*.xml" \)
As I am doing this in a script where I want to be able to specify different file sets to search for, I ended up with the following structure:
find_cmd_file_sets=$(decode_file_sets) # Assume this creates a string with the file sets e.g. -name "*.java" -o -name "*.as" etc
dirs=$(get_search_dirs) # assume this gives you the list of dirs to search, defaulting to the current directory
for dir in $dirs
do
find $dir -type f -a \( $find_cmd_file_sets \)
done
The above script doesn't behave as expected: you execute the script and the find command churns for a while before returning no results.
I'm certain the equivalents of decode_file_sets and get_search_dirs I've created are generating the correct results.
A simpler example is to execute the following directly in a bash shell:
file_sets=' -name "*.java" -o -name "*.as" '
find dir -type f -a \( $file_sets \) # Returns no result
# Executing the output of the command below directly in the shell returns the correct result
echo find dir -type f -a \\\( $file_sets \\\)
I don't understand why variable expansion in brackets of the find command would change the result. If it makes any difference I am using git-bash under Windows.
This is really frustrating. Any help would be much appreciated. Most importantly I would like to understand why the variable expansion of $file_sets is behaving as it is.

Hope this will work; it's tested in bash.
file_sets=' -name "*.java" -o -name "*.as" '
command=`echo "find $dir -type f -a \( $file_sets \)"`
eval $command

TL;DR: Don't use quotes in the find_cmd_file_sets variable, and disable pathname expansion (set -f) before calling find.
When a variable's content includes a "special" character and you then expand that variable without quotes, the character is passed along literally; in bash's set -x trace, each word containing such a character is shown surrounded by single quotes, e.g.:
#!/usr/bin/env bash
set -x
VAR='abc "def"'
echo $VAR
The output is:
+ VAR='abc "def"'
+ echo abc '"def"'
abc "def"
As you can see, the trace shows "def" surrounded by single quotes: the double quotes have become part of the argument itself. In your case, the call to the find command becomes:
find ... -name '"*.java"' ...
So it tries to find files whose names start with " and end with .java", which of course matches nothing.
To prevent that behavior, the only thing you can do (that I'm aware of) is to use double quotes when expanding the variable, e.g.:
#!/usr/bin/env bash
set -x
VAR='abc "def"'
echo "$VAR"
The output is:
+ VAR='abc "def"'
+ echo 'abc "def"'
abc "def"
The only problem, as you have probably noticed already, is that now the whole variable is in quotes and is treated as a single argument. So this won't work in your find command.
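You can make the word splitting visible with a throwaway helper; a minimal sketch (count_args is my name for it, and it assumes nothing in the current directory happens to match those quote-wrapped patterns):
count_args() { echo "got $# argument(s)"; }
file_sets=' -name "*.java" -o -name "*.as" '
count_args $file_sets   # got 5 argument(s), but with literal quote characters inside them
count_args "$file_sets" # got 1 argument(s), useless as a find expression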
The only option left is to not use quotes at all, neither in the variable's content nor when expanding it. But then, of course, you have a problem with pathname expansion:
#!/usr/bin/env bash
set -x
VAR='abc *.java'
echo $VAR
The output is:
+ VAR='abc *.java'
+ echo abc file1.java file2.java
abc file1.java file2.java
Fortunately you can disable pathname expansion using set -f:
#!/usr/bin/env bash
set -x
VAR='abc *.java'
set -f
echo $VAR
The output is:
+ VAR='abc *.java'
+ set -f
+ echo abc '*.java'
abc *.java
To sum up, the following should work:
#!/usr/bin/env bash
pattern='-name *.java'
dir="my_project"
set -f
find "$dir" -type f -a \( $pattern \)

Bash arrays were introduced to allow exactly this kind of nested quoting:
file_sets=( -name "*.java" -o -name "*.as" )
find dir -type f -a \( "${file_sets[@]}" \)
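For completeness, a sketch of the whole wrapper built on an array; the flag names come from the question, everything else (including the assumption that at least one file-type flag is given) is mine:
#!/usr/bin/env bash
file_sets=() dirs=()
for arg in "$@"; do
    case $arg in
        --java) file_sets+=(-o -name '*.java') ;;
        --flex) file_sets+=(-o -name '*.as') ;;
        --xml)  file_sets+=(-o -name '*.xml') ;;
        *)      dirs+=("$arg") ;;
    esac
done
file_sets=("${file_sets[@]:1}")        # drop the leading -o
[ "${#dirs[@]}" -eq 0 ] && dirs=(.)    # default to the current directory
for dir in "${dirs[@]}"; do
    find "$dir" -type f -a \( "${file_sets[@]}" \)
done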

Related

Bash -c argument passing to find

This command works
find . -name \*.txt -print
and outputs two filenames
This command works
bash -c 'echo . $0 $1 -print' "-name" "\*.txt"
and outputs this result:
. -name *.txt -print
But this command
bash -c 'find . $0 $1 -print' "-name" "\*.txt"
does not give an error but does not output anything either.
Can anyone tell me what is happening here?
It looks like you're trying to use "\*.txt" to forestall glob-expansion so that the find command sees *.txt instead of e.g. foo.txt.
However, what it ends up seeing is \*.txt. No files match that pattern, so you see no output.
To make find see *.txt as its 3rd argument, you could do this:
bash -c 'find . $0 "$1" -print' "-name" "*.txt"
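If you want to see the exact argument boundaries instead of trusting echo, printf with a bracketed format makes them visible; each [%s] brackets one argument as the inner command receives it:
bash -c 'printf "[%s]\n" . $0 "$1" -print' "-name" "*.txt"
# [.]
# [-name]
# [*.txt]
# [-print]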
Edit: Are you really getting . -name *.txt -print as the output of the first command where you replaced find with echo? When I run that command, I get . -name \*.txt -print.
Well, the suggestions from francesco work, but I am still confused by the behaviour here.
We know that putting unquoted wild cards in a find command will usually result in an error. To wit:
find . -name *.txt -print
find: paths must precede expression: `HowTo-Word-Split.txt'
find: possible unquoted pattern after predicate `-name'?
However, putting the wildcard in single quotes (or escaping it, if it is only one character) will work, like so:
find . -name \*.txt -print
which gives this output (on two separate lines)
./HowTo-Word-Split.txt
./bash-parms.txt
So in the bash -c version what I was thinking was this:
bash -c 'find . $0 $1 -print' "-name" "*.txt"
would result in the *.txt being expanded even before being passed into the command string,
and using single quotes around it would result in trying to execute (after the arg substitution and the -c taking effect)
find . -name *.txt -print
which as I just demonstrated does not work.
However there seems to be some sort of magic associated with the -c switch as demonstrated by setting -x at the bash prompt, like so:
$ set -x
$ bash -c ' find . $0 "$1" -print' "-name" "*.txt"
+ bash -c ' find . $0 "$1" -print' -name '*.txt'
./HowTo-Word-Split.txt
./bash-parms.txt
Note that even though I used double quotes in the -c line, the trace shows the argument wrapped in single quotes. That is just how set -x displays words containing special characters; the inner find really does receive the literal *.txt, which is what makes it work.
Problem solved. :)

Edit a find -exec echo command to include a grep for a string

So I have the following command which looks for a series of files and appends three lines to the end of everything found. Works as expected.
find /directory/ -name "file.php" -type f -exec sh -c "echo -e 'string1\nstring2\nstring3\n' >> {}" \;
What I need to do is also look for any instance of string1, string2, or string3 in each file.php that find turns up, prior to echoing/appending the lines, so I don't append to a file unnecessarily. (This is being run in a crontab.)
Using | grep -v "string" after the find breaks the -exec command.
How would I go about accomplishing my goal?
Thanks in advance!
That -exec command isn't safe for file names with spaces.
You want something like this instead (assuming finding any of the strings is reason not to add any of the strings).
find /directory/ -name "file.php" -type f -exec sh -c "grep -qE 'string1|string2|string3' \"\$1\" || echo -e 'string1\nstring2\nstring3\n' >> \"\$1\"" - {} \;
To explain the safety issue.
find places {} in the command it runs as a single argument but when you splat that into a double-quoted string you lose that benefit.
So instead of doing that you pass the file as an argument to the shell and then use the positional arguments in the shell command with quotes.
The command above simply chains the echo to a failure from grep to accomplish the goal.
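If it helps, the same logic can be written without the nested escaped quotes, using printf instead of echo -e (whose behaviour varies between sh implementations) and + so that one shell handles many files. An untested sketch:
find /directory/ -name "file.php" -type f -exec sh -c '
    for f; do
        grep -qE "string1|string2|string3" "$f" ||
            printf "string1\nstring2\nstring3\n" >> "$f"
    done' sh {} +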

What is the meaning of ${arg##/*/} and {} \; in shell scripting

Please find the code below for displaying the directories in the current folder.
What is the meaning of ${arg##/*/} in shell scripting? (Both ${arg#*} and ${arg##/*/} give the same output.)
What is the meaning of {} \; in the for loop statement?
for arg in `find . -type d -exec ls -d {} \;`
do
echo "Output 1" ${arg##/*/}
echo "Output 2" ${arg#*}
done
Adding to @JoSo's helpful answer:
${arg#*} is a fundamentally pointless expansion, as its result is always identical to $arg itself: it strips the shortest prefix matching * (any sequence of characters, including none), and that shortest prefix is the empty string.
${arg##/*/} - stripping the longest prefix matching pattern /*/ - is useless in this context, because the output paths will be ./-prefixed due to use of find ., so there will be no prefix starting with /. By contrast, ${arg##*/} will work and strip the parent path (leaving the folder-name component only).
Aside from it being ill-advised to parse command output in a for loop, as @JoSo points out, the find command in the OP is overly complicated and inefficient. (As an aside, just to clarify: the find command lists all folders in the current folder's subtree, not just immediate subfolders.)
find . -type d -exec ls -d {} \;
can be simplified to:
find . -type d
The two commands do the same: -exec ls -d {} \; simply does what find does by default anyway (an implied -print).
If we put it all together, we get:
find . -mindepth 1 -type d | while read -r arg
do
echo "Folder name: ${arg##*/}"
echo "Parent path: ${arg%/*}"
done
Note that I've used ${arg%/*} as the second output item, which strips the shortest suffix matching /* and thus returns the parent path; furthermore, I've added -mindepth 1 so that find doesn't also match . itself.
@JoSo, in a comment, demonstrates a solution that's both simpler and more efficient; it uses -exec to process a shell command in-line and + to pass as many paths as possible at once:
find . -mindepth 1 -type d -exec /bin/sh -c \
'for arg; do echo "Folder name: ${arg##*/}"; echo "Parent: ${arg%/*}"; done' \
-- {} +
Finally, if you have GNU find, things get even easier, as you can take advantage of the -printf primary, which supports placeholders for things like filenames and parent paths:
find . -type d -printf 'Folder name: %f\nParent path: %h\n'
Here's a bash-only solution based on globbing (pathname expansion), courtesy of @Adrian Frühwirth:
Caveat: This requires bash 4+, with the shell option globstar turned ON (shopt -s globstar) - it is OFF by default.
shopt -s globstar # bash 4+ only: turn on support for **
for arg in **/ # process all directories in the entire subtree
do
echo "Folder name: $(basename "$arg")"
echo "Parent path: $(dirname "$arg")"
done
Note that I'm using basename and dirname here for parsing, as they conveniently ignore the terminating / that the glob **/ invariably adds to its matches.
Afterthought re processing find's output in a while loop: on the off chance that your filenames contain embedded \n characters, you can parse as follows, using a null character to separate items (see the comments for why -d $'\0' rather than -d '' is used):
find . -type d -print0 | while read -d $'\0' -r arg; ...
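Spelled out as a complete loop (the same logic as the while example above, just null-delimited):
find . -mindepth 1 -type d -print0 |
while IFS= read -r -d $'\0' arg; do
    echo "Folder name: ${arg##*/}"
    echo "Parent path: ${arg%/*}"
done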
${arg##/*/} is an application of "parameter expansion". (Search for this term in your shell's manual, e.g. type man bash in a Linux shell.) It expands to the value of arg with the longest prefix matching /*/ as a glob pattern removed. E.g. if arg is /foo/bar/doo, it expands to doo.
That's bad shell code (similar to item #1 on Bash Pitfalls). The {} \; has not so much to do with shell, but more with the arguments that the find command expects to an -exec subcommand. The {} is replaced with the current filename, e.g. this results in find executing the command ls -d FILENAME with FILENAME replaced by each file it found. The \; serves as a terminator of the -exec argument. See the manual page of find, e.g. type man find on a linux shell, and look for the string -exec there to find the description.
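A quick way to see the four prefix/suffix-stripping operators side by side (the path is purely illustrative):
arg=/foo/bar/doo.tar.gz
echo "${arg#*/}"    # foo/bar/doo.tar.gz (shortest prefix matching */)
echo "${arg##*/}"   # doo.tar.gz         (longest prefix matching */)
echo "${arg%.*}"    # /foo/bar/doo.tar   (shortest suffix matching .*)
echo "${arg%%.*}"   # /foo/bar/doo       (longest suffix matching .*)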

Apply a script to subdirectories

I have read many times that if I want to execute something over all subdirectories I should run something like one of these:
find . -name '*' -exec command arguments {} \;
find . -type f -print0 | xargs -0 command arguments
find . -type f | xargs -I {} command arguments {} arguments
The problem is that it works well with core utilities, but not as expected when the command is a user-defined function or a script. How can I fix that?
So what I am looking for is a line of code or a script in which I can replace command with myfunction or myscript.sh, and which goes to every single subdirectory of the current directory and executes that function or script there, with whatever arguments I supply.
To explain it another way, I want something that works over all subdirectories as nicely as for file in *; do command_myfunction_or_script.sh arguments "$file"; done works over the current directory.
Instead of -exec, try -execdir.
It may be that in some cases you need to use bash:
foo () { echo "$1"; }
export -f foo
find . -type f -name '*.txt' -exec bash -c 'foo arg arg' \;
The last line could be:
find . -type f -name '*.txt' -exec bash -c 'foo "$@"' _ arg arg \;
depending on what args might need expanding and when. The underscore represents $0.
You could use -execdir where I have -exec if that's needed.
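Putting those pieces together, one plausible wiring (mine, not part of the original answer) that feeds every found file to the function:
foo () { echo "processing: $1"; }
export -f foo
find . -type f -name '*.txt' -exec bash -c 'for f; do foo "$f"; done' _ {} +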
The examples that you give, such as:
find . -name '*' -exec command arguments {} \;
don't go to every single subdirectory of the current directory and execute command there, but rather execute command from the current directory with the path to each file listed by find as an argument.
If what you want is to actually change directory and execute a script, you could try something like this:
STDIR=$PWD; IFS=$'\n'; for dir in $(find . -type d); do cd "$dir"; /path/to/command; cd "$STDIR"; done; unset IFS
Here the current directory is saved to STDIR and the bash Internal Field Separator is set to a newline so names won't split on spaces. Then for each directory (-type d) that find returns, we cd to that directory, execute the command (using the full path here as changing directories will break a relative path) and then cd back to the starting directory.
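An alternative sketch of the same cd-and-run idea that copes with any directory name, at the cost of spawning a shell per batch of directories:
find . -type d -exec sh -c '
    for d; do
        (cd "$d" && /path/to/command)   # subshell, so no need to cd back
    done' sh {} +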
There may be some way to use find with a function, but it won't be terribly elegant. If you have bash 4, what you probably want to do is use globstar:
shopt -s globstar
for file in **/*; do
myfunction "$file"
done
If you're looking for compatibility with POSIX or older versions of bash, you will be forced to source the file defining your function when you invoke bash. So something like this:
find <args> -exec bash -c '. funcfile;
for file; do
myfunction "$file"
done' _ {} +
But that's just ugly. When I get to this point, I usually just put my function in a script on my PATH and live with it.
If you want to use a bash function, this is one way.
work ()
{
local file="$1"
local dir=$(dirname "$file")
pushd "$dir"
echo "in directory $(pwd) working with file $(basename $file)"
popd
}
find . -name '*' | while IFS= read -r line
do
work "$line"
done
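If the file names may contain leading whitespace, backslashes or even newlines, the reading side can be tightened along the lines of the null-delimited loop shown earlier (restricted to regular files here):
find . -type f -print0 | while IFS= read -r -d '' line; do
    work "$line"
done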

Processing list in Makefile

In bash, when I want to iterate in a recursive list of pdf files, without the extension, I could do the following:
for file in `find mypath -type f -name '*.pdf' -printf "%f\n"`
do
echo "${file%.*}"
done
This works perfectly, and I get a list of the pdf files without the extension.
But if I try to do the same in a Makefile, I get empty output:
my_test:
	@for file in `find mypath -type f -name '*.pdf' -printf "%f\n"`; \
	do \
		echo "${file%.*}"; \
	done
Do you have an idea why this is happening?
Thanks in advance!
Just put in an extra $:
echo "$${file%.*}"; \
In your command, Make expands the first $ and treats ${file%.*} as a reference to a Make variable literally named file%.* (which is empty), and things unravel fast. With $$, the first $ escapes the second and the ${...} gets passed intact to the shell.
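For reference, the fixed recipe in full (GNU make; recipe lines must be tab-indented):
my_test:
	@for file in `find mypath -type f -name '*.pdf' -printf "%f\n"`; \
	do \
		echo "$${file%.*}"; \
	done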
