What is this strange syntax inside `find -exec`? - bash

I recently came across a strange bash script that calls a custom bash function from inside find -exec. I've written the following simple script to demonstrate the functionality I need explained.
In the following example, function foo will be called for each find result.
foo()
{
echo "$@"
}
export -f foo
find . -exec bash -c 'foo "$@"' bash {} \;
Can someone explain how the part after -exec is interpreted?
UPDATE:
To simplify this further: after exporting foo as above, the following gets executed for each find result (assume there is a file named my_file).
bash -c 'foo "$@"' bash my_file
And this produces the output my_file. I don't understand how this works. What does the second bash do there? Any detailed explanation is appreciated.
(Please note that this question is not about find command. Also please ignore the functionality of function foo, I just wanted to export some function)

To understand you need to know 4 things:
The find action -exec allows you to apply a command on the found files and directories.
The -c bash option is documented as follows:
BASH(1)
...
OPTIONS
...
-c If the -c option is present, then commands are read from
the first non-option argument command_string.
If there are arguments after the command_string, they
are assigned to the positional parameters, starting with $0.
...
If bash is started with the -c option, then $0 is set to the first
argument after the string to be executed, if one is present.
Otherwise, it is set to the filename used to invoke bash, as given
by argument zero.
In bash, $@ expands as all positional parameters ($1, $2...) starting at parameter $1.
In a bash function, the positional parameters are the arguments passed to the function when it is called.
So, in your case, the command executed for each found file or directory is:
bash -c 'foo "$@"' bash <the-file>
The positional parameters are thus set to:
$0 = bash
$1 = <the-file>
and bash is asked to execute 'foo "$@"' in this context. "$@" is first expanded as "<the-file>". So, function foo is called with one single argument: "<the-file>". In the context of function foo the positional parameters are thus:
$1 = "<the-file>"
and echo "$@" expands as echo "<the-file>".
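The assignment of $0 and $1 described above can be checked directly, without find. This is a minimal sketch; the variable names with_placeholder and without_placeholder are made up for illustration, and my_file does not need to exist because it is only passed as a string.

```shell
# With a placeholder after the command string: $0=bash, $1=my_file.
with_placeholder=$(bash -c 'printf "0=%s 1=%s count=%s" "$0" "$1" "$#"' bash my_file)
echo "$with_placeholder"

# Without it, the file name is consumed as $0 and "$@" is empty.
without_placeholder=$(bash -c 'printf "0=%s count=%s" "$0" "$#"' my_file)
echo "$without_placeholder"
```

The second call shows why the extra bash is there: it only fills $0, so that the file name ends up in $1 and is part of "$@".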
All this just prints the names of all found files or directories. It is almost as if you had any of:
find . -exec echo {} \;
find . -print
find .
find
(for find versions that accept the last one).
Only almost, because if file or directory names contain spaces you will get different results depending on how you use find and quotes. So, if you intend to have a more complex foo function, you should pay attention to the quotes. Examples:
$ touch "filename with spaces" plain
$ ls -1
filename with spaces
plain # 2 files
$ foo() { echo "$@"; } # print arguments
$ find . -type f
./filename with spaces
./plain
$ find . -type f -exec bash -c 'foo "$@"' bash {} \;
./filename with spaces
./plain
$ find . -type f -exec bash -c 'foo $@' bash {} \;
./filename with spaces
./plain
The 3 find commands apparently do the same but:
$ bar() { echo $#; } # print number of arguments
$ wc -w < <(find . -type f)
4 # 4 words
$ find . -type f -exec bash -c 'bar "$@"' bash {} \;
1 # 1 argument
1 # per file
$ find . -type f -exec bash -c 'bar $@' bash {} \;
3 # 3 arguments
1 # 1 argument
With find . -type f -exec bash -c 'bar "$@"' bash {} \;, the first file name is passed to function bar as one single argument, while in all other cases it is considered as 3 separate arguments.

Related

Saving arguments passed from xargs to bash as a variable for processing

I am trying to run many commands in parallel, and need to do some string manipulation on the input first. How can I make the below example work?
find . -mindepth 1 -type d | xargs -n 1 -P 20 -i sh -c "v={}; echo $v"
When I use this, $v is null. Why is it not being saved as the value of {}?
The parent shell is expanding $v before the string gets passed to xargs.
Suppose your find command finds a subdirectory named ./stuff.
First, the parent bash shell (the one you typed the find command into) will expand $v, because the string is in double quotes. You currently have no value set for variable v, so it expands to an empty string.
Next, the arguments get passed to xargs, which will see this: v={}; echo
Then, xargs will read ./stuff from the pipe, and replace {} with ./stuff
Finally, the sh command is executed by xargs, and sh will see this: v=./stuff; echo
To fix this, you need to either escape the $ so that the parent shell doesn't expand it, or use single quotes to avoid variable expansion. You should also probably quote the strings so that any directory names with spaces in them don't cause problems with the final sh command:
find . -mindepth 1 -type d | xargs -n 1 -P 20 -i sh -c "v=\"{}\"; echo \"\$v\""
OR
find . -mindepth 1 -type d | xargs -n 1 -P 20 -i sh -c 'v="{}"; echo "$v"'
With either command, the final sh process will see: v="./stuff"; echo "$v"
By the way, one way to see for yourself that this is indeed what is happening would be to set a value for v in the parent shell, then run your original command. The shell will expand $v to whatever value you set, and you will see that value repeated for every directory found by find.
$ v=foobar
$ find . -mindepth 1 -type d | xargs -n 1 -P 20 -i sh -c "v={}; echo $v"
foobar
foobar
foobar
foobar
foobar
foobar
foobar
foobar
foobar
foobar
foobar
...
With GNU Parallel you would do:
find . -mindepth 1 -type d | parallel -P 20 'v={}; echo "$v"'
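Another way to avoid both the quoting problem and the early expansion is to pass each name to sh as a positional parameter instead of substituting {} into the command string. The sketch below assumes a find and xargs that support -print0, -0 and -P (GNU and BSD versions do); the temporary directory and its contents are made up for the demonstration.

```shell
# Hypothetical demo tree; the directory names are made up for illustration.
tmp=$(mktemp -d)
mkdir "$tmp/stuff" "$tmp/more stuff"

# -print0 / -0 keep names with spaces intact. The trailing "sh" fills $0,
# so each directory name arrives in the child shell as $1.
out=$(find "$tmp" -mindepth 1 -type d -print0 |
  xargs -0 -n 1 -P 4 sh -c 'v=$1; echo "${v##*/}"' sh | sort)
printf '%s\n' "$out"

rm -rf "$tmp"
```

Because the command string is in single quotes and never contains the file name itself, the parent shell expands nothing in it and unusual characters in names cannot change the command that sh runs.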

why doesn't export of function work within bash script

I'm trying to use a bash function from find. I know that I need to export the function. When I do this from the command line it works. When I do it from within a bash script it doesn't. The code is listed below. If you execute it, it doesn't work. If you source it, then it does. What is interesting is that once you source it, the script will then work, because the function is already defined at the time of the script invocation.
#!/bin/bash
function doit {
echo args "<$@>"
}
doit a b c
export -f doit
find . -maxdepth 1 -exec sh -c 'doit "$@" ' {} \+
here's the error I see:
> ./x.sh
args <a b c>
.: doit: command not found
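You are using bash's export -f but running the command with plain sh, which may not understand exported functions (sh is a symlink to bash on some platforms, but not all).
Use bash to exec. Change:
find . -maxdepth 1 -exec sh -c 'doit "$@" ' {} \+
to
find . -maxdepth 1 -exec bash -c 'doit "$@" ' bash {} \+
The extra bash after the command string fills $0; without it the first result found is consumed as $0 and never reaches doit (the . in front of the error message above is exactly that first result).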
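Whether a child bash really sees the exported function can be checked without find. A minimal sketch, assuming the script itself runs under bash; the variable names seen and out are made up for illustration.

```shell
#!/usr/bin/env bash
doit() { echo "args <$*>"; }
export -f doit

# A child bash imports the exported function from its environment.
seen=$(bash -c 'declare -F doit')
# The word "bash" after the command string becomes $0; a b c become $1..$3.
out=$(bash -c 'doit "$@"' bash a b c)
echo "$seen"
echo "$out"
```

Replacing bash with sh in the two child invocations is a quick way to find out whether the sh on a given platform imports bash functions.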

Bash parameter expansion in brackets not working as expected

I am writing a script that wraps the find command to search for specific source file types under a given directory. A sample invocation would be :
./find_them.sh --java --flex --xml dir1
The above command would search for .java, .as and .xml files under dir1.
To do this manually I came up with the following find command :
find dir1 -type f -a \( -name "*.java" -o -name "*.as" -o -name "*.xml" \)
As I am doing this in a script where I want to be able to specify different file sets to search for, I end up with the following structure:
find_cmd_file_sets=$(decode_file_sets) # Assume this creates a string with the file sets e.g. -name "*.java" -o -name "*.as" etc
dirs=$(get_search_dirs) # assume this gives you the list of dirs to search, defaulting to the current directory
for dir in $dirs
do
find $dir -type f -a \( $find_cmd_file_sets \)
done
The above script doesn't behave as expected: when I execute the script, the find command churns for a while before returning no results.
I'm certain the equivalents of decode_file_sets and get_search_dirs I've created are generating the correct results.
A simpler example is to execute the following directly in a bash shell
file_sets=' -name "*.java" -o -name "*.as" '
find dir -type f -a \( $file_sets \) # Returns no result
# Executing result of below command directly in the shell returns correct result
echo find dir -type f -a \\\( $file_sets \\\)
I don't understand why variable expansion in brackets of the find command would change the result. If it makes any difference I am using git-bash under Windows.
This is really frustrating. Any help would be much appreciated. Most importantly I would like to understand why the variable expansion of $file_sets is behaving as it is.
Hope this will work; it's tested on bash.
file_sets=' -name "*.java" -o -name "*.as" '
command=`echo "find $dir -type f -a \( $file_sets \)"`
eval $command
TLDR: Don't use quotes in find_cmd_file_sets variable and disable pathname expansion (set -f) before calling find.
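When a variable's value contains quote characters and you expand that variable without quotes, bash does not treat those characters as quoting syntax. Quote removal only applies to quotes typed literally on the command line, so the quotes stay in the resulting words as ordinary characters. You can see this with set -x, which displays such words wrapped in single quotes, e.g.:
#!/usr/bin/env bash
set -x
VAR='abc "def"'
echo $VAR
The output is:
+ VAR='abc "def"'
+ echo abc '"def"'
abc "def"
As you can see, "def" is passed to echo with its double quotes intact (the single quotes are only how the trace displays it). In your case, the call to the find command becomes: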
find ... -name '"*.java"' ...
So it tries to find files which start with " and end with .java"
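The same effect can be observed without set -x by looking at the words that an unquoted expansion produces. This is only a sketch; it reuses the positional parameters of the current shell as a scratch list, and the variables count and second are made up for illustration.

```shell
file_sets='-name "*.java"'

set -f              # keep the pattern from matching files in the current directory
set -- $file_sets   # unquoted expansion: word splitting, but no quote parsing
set +f

count=$#
second=$2
printf '<%s>\n' "$@"
```

The second word still contains the two double-quote characters, which is what find receives as its pattern.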
To prevent that behavior, the only thing you can do (which I'm aware of) is to use double quotes when expanding the variable, e.g.:
#!/usr/bin/env bash
set -x
VAR='abc "def"'
echo "$VAR"
The output is:
+ VAR='abc "def"'
+ echo 'abc "def"'
abc "def"
The only problem, as you probably noticed already, is that now the whole variable is in quotes and is treated as single argument. So this won't work in your find command.
The only option left is to not use quotes, neither in variable content nor when expanding the variable. But then, of course, you have a problem with pathname expansion:
#!/usr/bin/env bash
set -x
VAR='abc *.java'
echo $VAR
The output is:
+ VAR='abc *.java'
+ echo abc file1.java file2.java
abc file1.java file2.java
Fortunately you can disable pathname expansion using set -f:
#!/usr/bin/env bash
set -x
VAR='abc *.java'
set -f
echo $VAR
The output is:
+ VAR='abc *.java'
+ set -f
+ echo abc '*.java'
abc *.java
To sum up, the following should work:
#!/usr/bin/env bash
pattern='-name *.java'
dir="my_project"
set -f
find "$dir" -type f -a \( $pattern \)
bash arrays were introduced to allow this kind of nested quoting:
file_sets=( -name "*.java" -o -name "*.as" )
find dir -type f -a \( "${file_sets[@]}" \)
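For a wrapper script, the array can be built up one test at a time. The following is a sketch under the assumption that the requested extensions are already known; the loop over java and as is a hypothetical stand-in for decode_file_sets, and the temporary directory exists only for the demonstration.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)
touch "$tmp/a.java" "$tmp/b.as" "$tmp/c.txt"

file_sets=()
for ext in java as; do
  # Put -o between tests, but not before the first one.
  if (( ${#file_sets[@]} > 0 )); then file_sets+=( -o ); fi
  file_sets+=( -name "*.$ext" )
done

# Each array element reaches find as exactly one argument, unglobbed.
out=$(cd "$tmp" && find . -type f \( "${file_sets[@]}" \) | sort)
printf '%s\n' "$out"

rm -rf "$tmp"
```

No set -f or eval is needed here, because the patterns are quoted when they are stored and the array is expanded inside double quotes.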

Rename files and directories using substitution and variables

I have found several similar questions that have solutions, except they don't involve variables.
I have a particular pattern in a tree of files and directories - the pattern is the word TEMPLATE. I want a script file to rename all of the files and directories by replacing the word TEMPLATE with some other name that is contained in the variable ${newName}
If I knew that the value of ${newName} was say "Fred lives here", then the command
find . -name '*TEMPLATE*' -exec bash -c 'mv "$0" "${0/TEMPLATE/Fred lives here}"' {} \;
will do the job
However, if my script is:
newName="Fred lives here"
find . -name '*TEMPLATE*' -exec bash -c 'mv "$0" "${0/TEMPLATE/${newName}}"' {} \;
then the word TEMPLATE is replaced by null rather than "Fred lives here"
I need the "" around $0 because there are spaces in the path name, so I can't do something like:
find . -name '*TEMPLATE*' -exec bash -c 'mv "$0" "${0/TEMPLATE/"${newName}"}"' {} \;
Can anyone help me get this script to work so that all files and directories that contain the word TEMPLATE have TEMPLATE replaced by whatever the value of ${newName} is
e.g., if newName="A different name" and I had a directory of
/foo/bar/some TEMPLATE directory/with files then the directory would be renamed to
/foo/bar/some A different name directory/with files
and a file called some TEMPLATE file would be renamed to
some A different name file
You have two options.
1) The easiest solution is to export newName. If you don't export the variable, then it's not available in child processes, and bash -c starts a child process. That's why you're getting TEMPLATE replaced by nothing.
2) Alternatively, you can try to construct a correctly quoted command line containing the replacement of $newName. If you knew that $newName were reasonably well-behaved (no double quotes or dollar signs, for example), then it's easy:
find . -name '*TEMPLATE*' \
-exec bash -c 'mv "$0" "${0/TEMPLATE/'"${newName}"'}"' {} \;
(Note: bash quoting is full of subtleties. The following has been edited several times, but I think it is now correct.)
But since you can't count on that, probably, you need to construct the command line by substituting both the filename and the substitution as command line parameters. But before we do that, let's fix the $0. You shouldn't be using $0 as a parameter. The correct syntax is:
bash -c '...$1...$1...' bash "argument"
Note the extra bash (many people prefer to use _); it's there to provide a sensible name for the subprocess.
So with that in mind:
find . -name '*TEMPLATE*' \
-exec bash -c 'mv "$1" "${1/TEMPLATE/$2}"' bash {} "$newName" \;
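Here is a sketch of the parameter-passing approach run against a throwaway tree. Two things in it are additions that the answer above does not make: -depth, so that entries inside a directory are renamed before the directory itself, and splitting the path so that only the last component is rewritten. The directory and file names are made up for the demonstration.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/a TEMPLATE dir"
touch "$tmp/a TEMPLATE dir/some TEMPLATE file"
newName="Fred lives here"

# $1 is the path found, $2 is the replacement text; neither is parsed as code.
find "$tmp" -depth -name '*TEMPLATE*' -exec bash -c '
  dir=${1%/*} base=${1##*/}
  mv -- "$1" "$dir/${base/TEMPLATE/$2}"
' bash {} "$newName" \;

out=$(cd "$tmp" && find . -type f)
echo "$out"

rm -rf "$tmp"
```

Rewriting only the last path component matters when both a directory and a file inside it contain TEMPLATE; replacing the first match in the whole path would point mv at a directory that does not exist yet.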
You can get around having to use quotes with IFS=$'\n', and since bash -c runs as a child process, any variable it uses has to be exported. This works:
#!/bin/bash
IFS=$'\n'
export newName="Fred lives here"
find . -name '*TEMPLATE*' -exec bash -c 'mv "$0" "${0/TEMPLATE/${newName}}"' {} \;
If you do not mind two more lines and would like a script that is easier to read (no export required):
#!/bin/bash
IFS=$'\n'
newName="Fred lives here"
for file in $(find . -name '*TEMPLATE*'); do
mv ${file} ${file/TEMPLATE/${newName}}
done

'find -exec' a shell function in Linux

Is there a way to get find to execute a function I define in the shell?
For example:
dosomething () {
echo "Doing something with $1"
}
find . -exec dosomething {} \;
The result of that is:
find: dosomething: No such file or directory
Is there a way to get find's -exec to see dosomething?
Since only the shell knows how to run shell functions, you have to run a shell to run a function. You also need to mark your function for export with export -f, otherwise the child shell won't inherit it:
export -f dosomething
find . -exec bash -c 'dosomething "$0"' {} \;
find . | while read file; do dosomething "$file"; done
Jac's answer is great, but it has a couple of pitfalls that are easily overcome:
find . -print0 | while IFS= read -r -d '' file; do dosomething "$file"; done
This uses null as a delimiter instead of a linefeed, so filenames with line feeds will work. It also uses the -r flag which disables backslash escaping, and without it backslashes in filenames won't work. It also clears IFS so that potential trailing white spaces in names are not discarded.
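A small sketch of that loop against file names that would trip up a plain read. It assumes bash and a find with -print0; the file names and the counter are made up for illustration. The loop reads from a process substitution rather than a pipe, so that variables set inside it are still visible afterwards.

```shell
#!/usr/bin/env bash
dosomething() { printf 'Doing something with <%s>\n' "${1##*/}"; }

tmp=$(mktemp -d)
# One name with leading and trailing spaces, one with a backslash.
touch "$tmp/ padded " "$tmp/back\\slash"

count=0
while IFS= read -r -d '' file; do
  dosomething "$file"
  count=$((count + 1))
done < <(find "$tmp" -type f -print0)

echo "processed $count files"
rm -rf "$tmp"
```

Since the function runs in the current shell, no export -f is needed with this form.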
Add quotes in {} as shown below:
export -f dosomething
find . -exec bash -c 'dosomething "{}"' \;
This corrects any error due to special characters returned by find,
for example files with parentheses in their name.
Processing results in bulk
For increased efficiency, many people use xargs to process results in bulk, but it is very dangerous. Because of that there was an alternate method introduced into find that executes results in bulk.
Note though that this method might come with some caveats like for example a requirement in POSIX-find to have {} at the end of the command.
export -f dosomething
find . -exec bash -c 'for f; do dosomething "$f"; done' _ {} +
find will pass many results as arguments to a single call of bash and the for-loop iterates through those arguments, executing the function dosomething on each one of those.
The above solution starts arguments at $1, which is why there is a _ (which represents $0).
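To see the batching, the child shell can report how many arguments it received before looping over them. A minimal sketch, assuming the outer script runs under bash (needed for export -f); the three file names are made up, and with so few files find is expected to use a single batch.

```shell
#!/usr/bin/env bash
dosomething() { echo "Doing something with ${1##*/}"; }
export -f dosomething

tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b" "$tmp/c"

# _ fills $0; the files found become $1, $2, ... and "for f" walks over them.
out=$(find "$tmp" -type f -exec bash -c '
  echo "batch of $#"
  for f; do dosomething "$f"; done' _ {} +)
printf '%s\n' "$out"

rm -rf "$tmp"
```

With \; instead of + the same command would start one bash per file and report a batch of 1 each time.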
Processing results one by one
In the same way, I think that the accepted top answer should be corrected to be
export -f dosomething
find . -exec bash -c 'dosomething "$1"' _ {} \;
This is not only more sane, because arguments should always start at $1, but also using $0 could lead to unexpected behavior if the filename returned by find has special meaning to the shell.
Have the script call itself, passing each item found as an argument:
#!/bin/bash
if [ -n "$1" ] ; then
echo "doing something with $1"
exit 0
fi
find . -exec "$0" {} \;
exit 0
When you run the script by itself, it finds what you are looking for and calls itself passing each find result as the argument. When the script is run with an argument, it executes the commands on the argument and then exits.
Just a warning regarding the accepted answer, which uses a shell: although it answers the question well, it might not be the most efficient way to execute some code on find results.
Here is a benchmark under bash of several kinds of solutions, including a simple for loop case:
(1465 directories, on a standard hard drive, armv7l GNU/Linux synology_armada38x_ds218j)
dosomething() { echo $1; }
export -f dosomething
time find . -type d -exec bash -c 'dosomething "$0"' {} \;
real 0m16.102s
time while read -d '' filename; do dosomething "${filename}" </dev/null; done < <(find . -type d -print0)
real 0m0.364s
time find . -type d | while read file; do dosomething "$file"; done
real 0m0.340s
time for dir in $(find . -type d); do dosomething $dir; done
real 0m0.337s
"find | while" and "for loop" seem best and are similar in speed.
For those of you looking for a Bash function that will execute a given command on all files in current directory, I have compiled one from the above answers:
toall(){
find . -type f | while read file; do "$1" "$file"; done
}
Note that it breaks with file names containing spaces (see below).
As an example, take this function:
world(){
sed -i 's_hello_world_g' "$1"
}
Say I wanted to change all instances of "hello" to "world" in all files in the current directory. I would do:
toall world
To be safe with any symbols in filenames, use:
toall(){
find . -type f -print0 | while IFS= read -r -d '' file; do "$1" "$file"; done
}
(but you need a find that handles -print0 e.g., GNU find).
It is not possible to execute a function that way.
To overcome this you can place your function in a shell script and call that from find
# dosomething.sh
dosomething () {
echo "doing something with $1"
}
dosomething "$1"
Now use it in find as:
find . -exec dosomething.sh {} \;
To provide additions and clarifications to some of the other answers, if you are using the bulk option for exec or execdir (-exec command {} +), and want to retrieve all the positional arguments, you need to consider the handling of $0 with bash -c.
More concretely, consider the command below, which uses bash -c as suggested above, and simply echoes out file paths ending with '.wav' from each directory it finds:
find "$1" -name '*.wav' -execdir bash -c 'echo "$@"' _ {} +
The Bash manual says:
If the -c option is present, then commands are read from the first non-option argument command_string. If there are arguments after the command_string, they are assigned to positional parameters, starting with $0.
Here, 'echo "$@"' is the command string, and _ {} are the arguments after the command string. Note that $@ is a special parameter in Bash that expands to all the positional parameters starting from 1. Also note that with the -c option, the first argument is assigned to positional parameter $0.
This means that if you try to access all of the positional parameters with $@, you will only get parameters starting from $1 and up. That is the reason why Dominik's answer has the _, which is a dummy argument to fill parameter $0, so all of the arguments we want are available later if we use $@ parameter expansion for instance, or the for loop as in that answer.
Of course, similar to the accepted answer, bash -c 'shell_function "$0" "$@"' would also work by explicitly passing $0, but again, you would have to keep in mind that $@ won't work as expected.
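The effect of the dummy argument in bulk mode can be counted directly. This sketch uses -exec rather than -execdir to keep the demonstration independent of PATH restrictions some find versions apply to -execdir; the .wav file names are made up and the files are empty.

```shell
tmp=$(mktemp -d)
touch "$tmp/a.wav" "$tmp/b.wav" "$tmp/c.wav"

# No placeholder: the first file becomes $0, so "$@" holds only two names.
without=$(find "$tmp" -name '*.wav' -exec bash -c 'echo "$#"' {} +)
# With _ as $0, all three names are in "$@".
with=$(find "$tmp" -name '*.wav' -exec bash -c 'echo "$#"' _ {} +)

echo "without placeholder: $without, with placeholder: $with"
rm -rf "$tmp"
```

The file that goes missing in the first form is silently skipped, which is easy to overlook when many files are processed.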
Put the function in a separate file and get find to execute that.
Shell functions are internal to the shell they're defined in; find will never be able to see them.
I find the easiest way is as follows, repeating two commands in a single do:
func_one () {
echo "The first thing with $1"
}
func_two () {
echo "The second thing with $1"
}
find . -type f | while read file; do func_one "$file"; func_two "$file"; done
Not directly, no. Find is executing in a separate process, not in your shell.
Create a shell script that does the same job as your function and find can -exec that.
I would avoid using -exec altogether. Use xargs:
find . -name <script/command you're searching for> | xargs bash -c

Resources