'find -exec' a shell function in Linux - bash

Is there a way to get find to execute a function I define in the shell?
For example:
dosomething () {
echo "Doing something with $1"
}
find . -exec dosomething {} \;
The result of that is:
find: dosomething: No such file or directory
Is there a way to get find's -exec to see dosomething?

Since only the shell knows how to run shell functions, you have to run a shell to run a function. You also need to mark your function for export with export -f, otherwise the child shell won't inherit it:
export -f dosomething
find . -exec bash -c 'dosomething "$0"' {} \;

find . | while read file; do dosomething "$file"; done

Jac's answer is great, but it has a couple of pitfalls that are easily overcome:
find . -print0 | while IFS= read -r -d '' file; do dosomething "$file"; done
This uses the null character as the delimiter instead of a linefeed, so filenames containing linefeeds will work. It also uses the -r flag, which disables backslash escaping; without it, backslashes in filenames won't work. It also clears IFS so that leading or trailing whitespace in names is not discarded.
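To see why each of those three safeguards matters, here is a small sketch that can be run in a scratch directory. The directory and the awkward file names are made up for illustration, and it assumes bash and a find that supports -print0 (e.g. GNU find). Process substitution is used instead of a pipe only so that the counters survive the loops:

```shell
# Hypothetical scratch directory with deliberately awkward names.
demo_dir=$(mktemp -d)
touch "$demo_dir/a b" "$demo_dir/ padded " "$demo_dir/back\\slash" "$demo_dir/"$'line\nbreak'

# Robust loop: NUL-delimited, no backslash processing, no whitespace trimming.
intact=0
while IFS= read -r -d '' file; do
    if [ -e "$file" ]; then intact=$((intact + 1)); fi
done < <(find "$demo_dir" -type f -print0)

# Naive loop: newline-delimited, backslashes interpreted, whitespace trimmed.
mangled=0
while read file; do
    if [ ! -e "$file" ]; then mangled=$((mangled + 1)); fi
done < <(find "$demo_dir" -type f)

echo "paths that survived the robust loop: $intact"
echo "non-existent paths produced by the naive loop: $mangled"
rm -rf "$demo_dir"
```

The robust loop should see every file exactly as it exists on disk, while the naive loop hands at least some paths to the body that no longer name a real file.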

Add quotes in {} as shown below:
export -f dosomething
find . -exec bash -c 'dosomething "{}"' \;
This corrects any error due to special characters returned by find,
for example files with parentheses in their name.

Processing results in bulk
For increased efficiency, many people use xargs to process results in bulk, but it is dangerous: by default it splits its input on whitespace and interprets quotes, so unusual filenames get mangled. Because of that, an alternate method was introduced into find that executes results in bulk.
Note, though, that this method comes with some caveats, for example the POSIX find requirement that {} appear at the end of the command.
export -f dosomething
find . -exec bash -c 'for f; do dosomething "$f"; done' _ {} +
find will pass many results as arguments to a single call of bash and the for-loop iterates through those arguments, executing the function dosomething on each one of those.
The above solution starts arguments at $1, which is why there is a _ (which represents $0).
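A rough way to check the batching behaviour is to have each bash invocation report how many arguments it received. This is only a sketch: the scratch directory, the file names, and the output wording are invented for the demonstration, and it assumes bash is the shell running it:

```shell
# Hypothetical function and scratch files, for illustration only.
dosomething() { printf 'item: %s\n' "$1"; }
export -f dosomething

demo_dir=$(mktemp -d)
touch "$demo_dir/one" "$demo_dir/two" "$demo_dir/three"

# Each bash invocation first reports its argument count, then loops over them.
# "_" occupies $0, so the found paths start at $1.
bulk_out=$(find "$demo_dir" -type f -exec bash -c '
    echo "batch of $#"
    for f; do dosomething "$f"; done
' _ {} +)

printf '%s\n' "$bulk_out"
rm -rf "$demo_dir"
```

With only a handful of short paths, find should fit them all into a single bash call, so a single "batch of" line followed by one "item:" line per file is what to look for.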
Processing results one by one
In the same way, I think that the accepted top answer should be corrected to be
export -f dosomething
find . -exec bash -c 'dosomething "$1"' _ {} \;
This is not only more sane, because arguments should always start at $1, but also using $0 could lead to unexpected behavior if the filename returned by find has special meaning to the shell.
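As a quick sanity check of the one-by-one form, the following sketch runs it against a single file whose name contains a space. The scratch directory and file name are assumptions made up for the demo, and it assumes bash:

```shell
# Hypothetical function and scratch file, for illustration only.
dosomething() { printf 'Doing something with %s\n' "$1"; }
export -f dosomething

demo_dir=$(mktemp -d)
touch "$demo_dir/with space"

# "_" is a throwaway value for $0; the found path arrives intact as $1.
result=$(find "$demo_dir" -type f -exec bash -c 'dosomething "$1"' _ {} \;)
echo "$result"
rm -rf "$demo_dir"
```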

Have the script call itself, passing each item found as an argument:
#!/bin/bash
if [ -n "$1" ]; then
echo "doing something with $1"
exit 0
fi
find . -exec "$0" {} \;
exit 0
When you run the script by itself, it finds what you are looking for and calls itself passing each find result as the argument. When the script is run with an argument, it executes the commands on the argument and then exits.

Just a warning regarding the accepted answer, which uses a shell:
although it answers the question well, it might not be the most efficient way to run code on find results.
Here is a benchmark under bash of several kinds of solutions,
including a simple for loop case:
(1465 directories, on a standard hard drive, armv7l GNU/Linux synology_armada38x_ds218j)
dosomething() { echo $1; }
export -f dosomething
time find . -type d -exec bash -c 'dosomething "$0"' {} \;
real 0m16.102s
time while read -d '' filename; do dosomething "${filename}" </dev/null; done < <(find . -type d -print0)
real 0m0.364s
time find . -type d | while read file; do dosomething "$file"; done
real 0m0.340s
time for dir in $(find . -type d); do dosomething $dir; done
real 0m0.337s
"find | while" and "for loop" seem best and similar in speed.

For those of you looking for a Bash function that will execute a given command on all files in current directory, I have compiled one from the above answers:
toall(){
find . -type f | while read file; do "$1" "$file"; done
}
Note that it breaks with file names containing leading or trailing whitespace, backslashes, or newlines (see below).
As an example, take this function:
world(){
sed -i 's_hello_world_g' "$1"
}
Say I wanted to change all instances of "hello" to "world" in all files in the current directory. I would do:
toall world
To be safe with any symbols in filenames, use:
toall(){
find . -type f -print0 | while IFS= read -r -d '' file; do "$1" "$file"; done
}
(but you need a find that handles -print0 e.g., GNU find).

It is not possible to execute a function that way.
To overcome this you can place your function in a shell script and call that from find
# dosomething.sh
dosomething () {
echo "doing something with $1"
}
dosomething "$1"
Now use it in find as:
find . -exec dosomething.sh {} \;

To provide additions and clarifications to some of the other answers, if you are using the bulk option for exec or execdir (-exec command {} +), and want to retrieve all the positional arguments, you need to consider the handling of $0 with bash -c.
More concretely, consider the command below, which uses bash -c as suggested above, and simply echoes out file paths ending with '.wav' from each directory it finds:
find "$1" -name '*.wav' -execdir bash -c 'echo "$@"' _ {} +
The Bash manual says:
If the -c option is present, then commands are read from the first non-option argument command_string. If there are arguments after the command_string, they are assigned to positional parameters, starting with $0.
Here, 'echo "$@"' is the command string, and _ {} are the arguments after the command string. Note that $@ is a special parameter in Bash that expands to all the positional parameters starting from 1. Also note that with the -c option, the first argument is assigned to positional parameter $0.
This means that if you try to access all of the positional parameters with $@, you will only get parameters starting from $1 and up. That is the reason why Dominik's answer has the _, which is a dummy argument to fill parameter $0, so all of the arguments we want are available later if we use $@ parameter expansion for instance, or the for loop as in that answer.
Of course, similar to the accepted answer, bash -c 'shell_function "$0" "$@"' would also work by explicitly passing $0, but again, you would have to keep in mind that $@ won't work as expected.
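The effect of the dummy argument can be seen without find at all. In this minimal sketch the words first and second simply stand in for file names:

```shell
# With a placeholder: "_" lands in $0, both words are visible through "$@".
with_placeholder=$(bash -c 'echo "$@"' _ first second)

# Without a placeholder: "first" is swallowed by $0 and is missing from "$@".
without_placeholder=$(bash -c 'echo "$@"' first second)
swallowed=$(bash -c 'echo "$0"' first second)

echo "with placeholder:    $with_placeholder"
echo "without placeholder: $without_placeholder (and \$0 was: $swallowed)"
```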

Put the function in a separate file and get find to execute that.
Shell functions are internal to the shell they're defined in; find will never be able to see them.

I find the easiest way is as follows, repeating two commands in a single do:
func_one () {
echo "The first thing with $1"
}
func_two () {
echo "The second thing with $1"
}
find . -type f | while read file; do func_one "$file"; func_two "$file"; done

Not directly, no. Find is executing in a separate process, not in your shell.
Create a shell script that does the same job as your function and find can -exec that.

I would avoid using -exec altogether. Use xargs:
find . -name <script/command you're searching for> | xargs bash -c

Related

How to find files with specific extensions recursively using the for/in syntax? [duplicate]

x=$(find . -name "*.txt")
echo $x
if I run the above piece of code in Bash shell, what I get is a string containing several file names separated by blank, not a list.
Of course, I can further separate them by blank to get a list, but I'm sure there is a better way to do it.
So what is the best way to loop through the results of a find command?
TL;DR: If you're just here for the most correct answer, you probably want my personal preference (see the bottom of this post):
# execute `process` once for each file
find . -name '*.txt' -exec process {} \;
If you have time, read through the rest to see several different ways and the problems with most of them.
The full answer:
The best way depends on what you want to do, but here are a few options. As long as no file or folder in the subtree has whitespace in its name, you can just loop over the files:
for i in $x; do # Not recommended, will break on whitespace
process "$i"
done
Marginally better, cut out the temporary variable x:
for i in $(find -name \*.txt); do # Not recommended, will break on whitespace
process "$i"
done
It is much better to glob when you can. White-space safe, for files in the current directory:
for i in *.txt; do # Whitespace-safe but not recursive.
process "$i"
done
By enabling the globstar option, you can glob all matching files in this directory and all subdirectories:
# Make sure globstar is enabled
shopt -s globstar
for i in **/*.txt; do # Whitespace-safe and recursive
process "$i"
done
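For a self-contained check of the recursive glob, the sketch below builds a small throwaway tree first. The directory layout is invented for the demo, globstar needs Bash 4 or newer, and nullglob is switched on here as an extra assumption so that the loop body is skipped when nothing matches:

```shell
shopt -s globstar nullglob

# Hypothetical scratch tree, including a directory name with a space.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/sub dir/deeper"
touch "$demo_dir/top.txt" "$demo_dir/sub dir/mid.txt" \
      "$demo_dir/sub dir/deeper/low.txt" "$demo_dir/skip.md"

matches=()
for i in "$demo_dir"/**/*.txt; do   # ** also matches zero directories, so top.txt is included
    matches+=("$i")
done

printf '%s\n' "${matches[@]}"
rm -rf "$demo_dir"
```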
In some cases, e.g. if the file names are already in a file, you may need to use read:
# IFS= makes sure it doesn't trim leading and trailing whitespace
# -r prevents interpretation of \ escapes.
while IFS= read -r line; do # Whitespace-safe EXCEPT newlines
process "$line"
done < filename
read can be used safely in combination with find by setting the delimiter appropriately:
find . -name '*.txt' -print0 |
while IFS= read -r -d '' line; do
process "$line"
done
For more complex searches, you will probably want to use find, either with its -exec option or with -print0 | xargs -0:
# execute `process` once for each file
find . -name \*.txt -exec process {} \;
# execute `process` once with all the files as arguments*:
find . -name \*.txt -exec process {} +
# using xargs*
find . -name \*.txt -print0 | xargs -0 process
# using xargs with arguments after each filename (implies one run per filename)
find . -name \*.txt -print0 | xargs -0 -I{} process {} argument
find can also cd into each file's directory before running a command by using -execdir instead of -exec, and can be made interactive (prompt before running the command for each file) using -ok instead of -exec (or -okdir instead of -execdir).
*: Technically, both find and xargs (by default) will run the command with as many arguments as they can fit on the command line, as many times as it takes to get through all the files. In practice, unless you have a very large number of files it won't matter, and if you exceed the length but need them all on the same command line, you're out of luck and will have to find a different way.
Whatever you do, don't use a for loop:
# Don't do this
for file in $(find . -name "*.txt")
do
…code using "$file"
done
Three reasons:
For the for loop to even start, the find must run to completion.
If a file name has any whitespace (including space, tab or newline) in it, it will be treated as two separate names.
Although now unlikely, you can overrun your command line buffer. Imagine if your command line buffer holds 32KB, and your for loop returns 40KB of text. That last 8KB will be dropped right off your for loop and you'll never know it.
Always use a while read construct:
find . -name "*.txt" -print0 | while IFS= read -r -d $'\0' file
do
…code using "$file"
done
The loop will execute while the find command is executing. Plus, this command will work even if a file name is returned with whitespace in it. And, you won't overflow your command line buffer.
The -print0 will use the NULL as a file separator instead of a newline and the -d $'\0' will use NULL as the separator while reading.
find . -name "*.txt"|while read fname; do
echo "$fname"
done
Note: this method and the (second) method shown by bmargulies are safe to use with white space in the file/folder names.
In order to also have the - somewhat exotic - case of newlines in the file/folder names covered, you will have to resort to the -exec predicate of find like this:
find . -name '*.txt' -exec echo "{}" \;
The {} is the placeholder for the found item and the \; is used to terminate the -exec predicate.
And for the sake of completeness let me add another variant - you gotta love the *nix ways for their versatility:
find . -name '*.txt' -print0|xargs -0 -n 1 echo
This would separate the printed items with a \0 character that isn't allowed in any of the file systems in file or folder names, to my knowledge, and therefore should cover all bases. xargs picks them up one by one then ...
Filenames can include spaces and even control characters. Spaces are (default) delimiters for shell expansion in bash and as a result of that x=$(find . -name "*.txt") from the question is not recommended at all. If find gets a filename with spaces e.g. "the file.txt" you will get 2 separated strings for processing, if you process x in a loop. You can improve this by changing delimiter (bash IFS Variable) e.g. to \r\n, but filenames can include control characters - so this is not a (completely) safe method.
From my point of view, there are 2 recommended (and safe) patterns for processing files:
1. Use for loop & filename expansion:
for file in ./*.txt; do
[[ ! -e $file ]] && continue # continue, if file does not exist
# single filename is in $file
echo "$file"
# your code here
done
2. Use find-read-while & process substitution
while IFS= read -r -d '' file; do
# single filename is in $file
echo "$file"
# your code here
done < <(find . -name "*.txt" -print0)
Remarks
on Pattern 1:
bash returns the search pattern ("*.txt") if no matching file is found - so the extra line "continue, if file does not exist" is needed. see Bash Manual, Filename Expansion
shell option nullglob can be used to avoid this extra line.
"If the failglob shell option is set, and no matches are found, an error message is printed and the command is not executed." (from Bash Manual above)
shell option globstar: "If set, the pattern ‘**’ used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a ‘/’, only directories and subdirectories match." see Bash Manual, Shopt Builtin
other options for filename expansion: extglob, nocaseglob, dotglob & shell variable GLOBIGNORE
on Pattern 2:
filenames can contain blanks, tabs, spaces, newlines, ... to process filenames in a safe way, find with -print0 is used: filename is printed with all control characters & terminated with NUL. see also Gnu Findutils Manpage, Unsafe File Name Handling, safe File Name Handling, unusual characters in filenames. See David A. Wheeler below for detailed discussion of this topic.
There are some possible patterns to process find results in a while loop. Others (kevin, David W.) have shown how to do this using pipes:
files_found=1
find . -name "*.txt" -print0 |
while IFS= read -r -d '' file; do
# single filename in $file
echo "$file"
files_found=0 # not working example
# your code here
done
[[ $files_found -eq 0 ]] && echo "files found" || echo "no files found"
When you try this piece of code, you will see that it does not work: files_found always keeps its initial value of 1 and the code will always echo "no files found". The reason is that each command of a pipeline is executed in a separate subshell, so changing the variable inside the loop (separate subshell) does not change the variable in the main shell script. This is why I recommend using process substitution as the "better", more useful, more general pattern. See I set variables in a loop that's in a pipeline. Why do they disappear... (from Greg's Bash FAQ) for a detailed discussion on this topic.
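The difference between the two patterns can be reproduced side by side. This is a sketch under the assumption that it runs in a plain non-interactive bash with default options (other shells, or bash with lastpipe enabled, may keep the piped variable); the scratch directory and variable names are made up:

```shell
# Hypothetical scratch directory containing one matching file.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt"

# Pipeline: the loop runs in a subshell, so the assignment is lost afterwards.
piped=no
find "$demo_dir" -name '*.txt' -print0 |
    while IFS= read -r -d '' file; do piped=yes; done

# Process substitution: the loop runs in the current shell, so it sticks.
substituted=no
while IFS= read -r -d '' file; do
    substituted=yes
done < <(find "$demo_dir" -name '*.txt' -print0)

echo "after pipeline:             piped=$piped"
echo "after process substitution: substituted=$substituted"
rm -rf "$demo_dir"
```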
Additional References & Sources:
Gnu Bash Manual, Pattern Matching
Filenames and Pathnames in Shell: How to do it Correctly, David A. Wheeler
Why you don't read lines with "for", Greg's Wiki
Why you shouldn't parse the output of ls(1), Greg's Wiki
Gnu Bash Manual, Process Substitution
(Updated to include @Socowi's excellent speed improvement)
With any $SHELL that supports it (dash/zsh/bash...):
find . -name "*.txt" -exec $SHELL -c '
for i in "$@" ; do
echo "$i"
done
' _ {} +
Done.
Original answer (shorter, but slower):
find . -name "*.txt" -exec $SHELL -c '
echo "$0"
' {} \;
If you can assume the file names don't contain newlines, you can read the output of find into a Bash array using the following command:
readarray -t x < <(find . -name '*.txt')
Note:
-t causes readarray to strip newlines.
It won't work if readarray is in a pipe, hence the process substitution.
readarray is available since Bash 4.
Bash 4.4 and up also supports the -d parameter for specifying the delimiter. Using the null character, instead of newline, to delimit the file names works also in the rare case that the file names contain newlines:
readarray -d '' x < <(find . -name '*.txt' -print0)
readarray can also be invoked as mapfile with the same options.
Reference: https://mywiki.wooledge.org/BashFAQ/005#Loading_lines_from_a_file_or_stream
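To compare the two readarray forms, the following sketch loads the same find results both ways. It assumes Bash 4.4 or newer for -d, and the scratch directory and file names (one of which contains a newline) are invented for the demo:

```shell
# Hypothetical scratch files; the third name contains a newline.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b c.txt" "$demo_dir/"$'d\ne.txt'

# Newline-delimited: the name with an embedded newline is split in two.
readarray -t by_line < <(find "$demo_dir" -name '*.txt')

# NUL-delimited: one array element per file, whatever the name contains.
readarray -d '' by_nul < <(find "$demo_dir" -name '*.txt' -print0)

echo "elements when splitting on newlines: ${#by_line[@]}"
echo "elements when splitting on NUL:      ${#by_nul[@]}"
rm -rf "$demo_dir"
```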
# Doesn't handle whitespace
for x in `find . -name "*.txt" -print`; do
process_one $x
done
or
# Handles whitespace and newlines
find . -name "*.txt" -print0 | xargs -0 -n 1 process_one
I like to use find which is first assigned to variable and IFS switched to new line as follow:
FilesFound=$(find . -name "*.txt")
IFSbkp="$IFS"
IFS=$'\n'
counter=1;
for file in $FilesFound; do
echo "${counter}: ${file}"
let counter++;
done
IFS="$IFSbkp"
As commented by @Konrad Rudolph this will not work with "new lines" in file name. I still think it is handy as it covers most of the cases when you need to loop over command output.
As already posted on the top answer by Kevin, the best solution is to use a for loop with bash glob, but as bash glob is not recursive by default, this can be fixed by a bash recursive function:
#!/bin/bash
set -x
set -eu -o pipefail
all_files=();
function get_all_the_files()
{
directory="$1";
for item in "$directory"/* "$directory"/.[^.]*;
do
if [[ -d "$item" ]];
then
get_all_the_files "$item";
else
all_files+=("$item");
fi;
done;
}
get_all_the_files "/tmp";
for file_path in "${all_files[@]}"
do
printf 'My file is "%s"\n' "$file_path";
done;
Related questions:
Bash loop through directory including hidden file
Recursively list files from a given directory in Bash
ls command: how can I get a recursive full-path listing, one line per file?
List files recursively in Linux CLI with path relative to the current directory
Recursively List all directories and files
bash script, create array of all files in a directory
How can I creates array that contains the names of all the files in a folder?
How to get the list of files in a directory in a shell script?
Based on other answers and the comment of @phk, using fd #3:
(which still allows to use stdin inside the loop)
while IFS= read -r f <&3; do
echo "$f"
done 3< <(find . -iname "*filename*")
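The point of the extra file descriptor is that the loop body can still read from standard input. A minimal sketch of that, where the scratch files and the here-document standing in for user input are both assumptions made for the demo:

```shell
# Hypothetical scratch files; the here-document plays the role of stdin.
demo_dir=$(mktemp -d)
touch "$demo_dir/x" "$demo_dir/y"

answers=()
while IFS= read -r f <&3; do        # file names arrive on fd 3
    IFS= read -r reply              # stdin is still free for other input
    answers+=("$f -> $reply")
done 3< <(find "$demo_dir" -type f | sort) <<'EOF'
first
second
EOF

printf '%s\n' "${answers[@]}"
rm -rf "$demo_dir"
```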
You can put the filenames returned by find into an array like this:
array=()
while IFS= read -r -d ''; do
array+=("$REPLY")
done < <(find . -name '*.txt' -print0)
Now you can just loop through the array to access individual items and do whatever you want with them.
Note: It's white space safe.
You can store your find output in array if you wish to use the output later as:
array=($(find . -name "*.txt"))
Now to print the each element in new line, you can either use for loop iterating to all the elements of array, or you can use printf statement.
for i in "${array[@]}"; do echo "$i"; done
or
printf '%s\n' "${array[@]}"
You can also use:
for file in "`find . -name "*.txt"`"; do echo "$file"; done
This will print each filename in newline
To only print the find output in list form, you can use either of the following:
find . -name "*.txt" -print 2>/dev/null
or
find . -name "*.txt" -print | grep -v 'Permission denied'
This will remove error messages and only give the filename as output in new line.
If you wish to do something with the filenames, storing it in array is good, else there is no need to consume that space and you can directly print the output from find.
I think using this piece of code (feeding the command's output to the loop after done):
while read fname; do
echo "$fname"
done <<< "$(find . -name "*.txt")"
is better than piping find into the loop, because in a pipeline the while loop is executed in a subshell, so variable changes made inside the loop cannot be seen after it finishes.
function loop_through(){
length_="$(find . -name '*.txt' | wc -l)"
length_="${length_#"${length_%%[![:space:]]*}"}"
length_="${length_%"${length_##*[![:space:]]}"}"
for ((i = 1; i <= length_; i++)) # brace expansion {1..$length_} does not expand variables
do
x=$(find . -name '*.txt' | sort | head -n "$i" | tail -n 1)
echo "$x"
done
}
To grab the length of the list of files for loop, I used the first command "wc -l".
That command is set to a variable.
Then, I need to remove the trailing white spaces from the variable so the for loop can read it.
find <path> -xdev -type f -name '*.txt' -exec ls -l {} \;
This will list the files and give details about attributes.
Another alternative is to not use bash, but call Python to do the heavy lifting. I resorted to this because bash solutions such as my other answer were too slow.
With this solution, we build a bash array of files from inline Python script:
#!/bin/bash
set -eu -o pipefail
dsep=":" # directory_separator
base_directory=/tmp
all_files=()
all_files_string="$(python3 -c '#!/usr/bin/env python3
import os
import sys
dsep="'"$dsep"'"
base_directory="'"$base_directory"'"
def log(*args, **kwargs):
    print(*args, file=sys.stderr, **kwargs)
def check_invalid_characther(file_path):
    for thing in ("\\", "\n"):
        if thing in file_path:
            raise RuntimeError(f"It is not allowed {thing} on \"{file_path}\"!")
def absolute_path_to_relative(base_directory, file_path):
    relative_path = os.path.commonprefix( [ base_directory, file_path ] )
    relative_path = os.path.normpath( file_path.replace( relative_path, "" ) )
    # if you use Windows Python, it accepts / instead of \\
    # if you have \ on your files names, rename them or comment this
    relative_path = relative_path.replace("\\", "/")
    if relative_path.startswith( "/" ):
        relative_path = relative_path[1:]
    return relative_path
for directory, directories, files in os.walk(base_directory):
    for file in files:
        local_file_path = os.path.join(directory, file)
        local_file_name = absolute_path_to_relative(base_directory, local_file_path)
        log(f"local_file_name {local_file_name}.")
        check_invalid_characther(local_file_name)
        print(f"{base_directory}{dsep}{local_file_name}")
' | dos2unix)";
if [[ -n "$all_files_string" ]];
then
readarray -t temp <<< "$all_files_string";
all_files+=("${temp[@]}");
fi;
for item in "${all_files[@]}";
do
OLD_IFS="$IFS"; IFS="$dsep";
read -r base_directory local_file_name <<< "$item"; IFS="$OLD_IFS";
printf 'item "%s", base_directory "%s", local_file_name "%s".\n' \
"$item" \
"$base_directory" \
"$local_file_name";
done;
Related:
os.walk without hidden folders
How to do a recursive sub-folder search and return files in a list?
How to split a string into an array in Bash?
How about if you use grep instead of find?
ls | grep .txt$ > out.txt
Now you can read this file and the filenames are in the form of a list.

What is this strange syntax inside `find -exec`?

Recently I came across a strange bash script that calls a custom bash function from inside find -exec. I've written the following simple script to demonstrate the functionality I'd like explained.
In the following example, function foo will be called for each find result.
foo()
{
echo "$@"
}
export -f foo
find . -exec bash -c 'foo "$@"' bash {} \;
Can someone explain how the part after -exec is interpreted?
UPDATE:
To further simplify this, after exporting foo as above, following gets executed for each find result (assume there is a file named my_file).
bash -c 'foo "$@"' bash my_file
And this produces the output my_file. I don't understand how this works. What does the second bash do there? Any detailed explanation is appreciated.
(Please note that this question is not about find command. Also please ignore the functionality of function foo, I just wanted to export some function)
To understand you need to know 4 things:
The find action -exec allows you to apply a command on the found files and directories.
The -c bash option is documented as follows:
BASH(1)
...
OPTIONS
...
-c If the -c option is present, then commands are read from
the first non-option argument command_string.
If there are arguments after the command_string, they
are assigned to the positional parameters, starting with $0.
...
If bash is started with the -c option, then $0 is set to the first
argument after the string to be executed, if one is present.
Otherwise, it is set to the filename used to invoke bash, as given
by argument zero.
In bash, $@ expands as all positional parameters ($1, $2...) starting at parameter $1.
In a bash function, the positional parameters are the arguments passed to the function when it is called.
So, in your case, the command executed for each found file or directory is:
bash -c 'foo "$@"' bash <the-file>
The positional parameters are thus set to:
$0 = bash
$1 = <the-file>
and bash is asked to execute 'foo "$@"' in this context. "$@" is first expanded as "<the-file>". So, function foo is called with one single argument: "<the-file>". In the context of function foo the positional parameters are thus:
$1 = "<the-file>"
and echo "$@" expands as echo "<the-file>".
All this just prints the names of all found files or directories. It is almost as if you had any of:
find . -exec echo {} \;
find . -print
find .
find
(for find versions that accept the last one).
Almost as if, only, because if file or directory names contain spaces, depending on your use of find and of quotes, you will get different results. So, if you intend to have a more complex foo function, you should pay attention to the quotes. Examples:
$ touch "filename with spaces" plain
$ ls -1
filename with spaces
plain # 2 files
$ foo() { echo "$@"; } # print arguments
$ find . -type f
./filename with spaces
./plain
$ find . -type f -exec bash -c 'foo "$@"' bash {} \;
./filename with spaces
./plain
$ find . -type f -exec bash -c 'foo $@' bash {} \;
./filename with spaces
./plain
The 3 find commands apparently do the same but:
$ bar() { echo $#; } # print number of arguments
$ wc -w < <(find . -type f)
4 # 4 words
$ find . -type f -exec bash -c 'bar "$@"' bash {} \;
1 # 1 argument
1 # per file
$ find . -type f -exec bash -c 'bar $@' bash {} \;
3 # 3 arguments
1 # 1 argument
With find . -type f -exec bash -c 'bar "$@"' bash {} \;, the first file name is passed to function bar as one single argument, while in all other cases it is considered as 3 separate arguments.
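The same counting experiment can be repeated without find by passing the awkward name directly. This is a minimal sketch, assuming bash; the literal string just stands in for a found file name:

```shell
# bar prints how many arguments it received.
bar() { echo "$#"; }
export -f bar

# Quoted "$@" keeps the name as one argument; unquoted $@ is word-split.
quoted=$(bash -c 'bar "$@"' bash "filename with spaces")
unquoted=$(bash -c 'bar $@' bash "filename with spaces")

echo "quoted:   $quoted argument(s)"
echo "unquoted: $unquoted argument(s)"
```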

How to execute bash code for each iteration of the find command?

Using this will perform a grep for each file found:
find . -name "$FILE" 2>/dev/null | xargs grep "search_string" >> "$grep_out"
But what if I want to execute custom code for each file found, rather than executing a grep? I would like to parse each file my own way, which is motivation for doing this. Could I write the code in the pipe? Should I execute a separate script using the pipe? Can I expand the pipe's scope to execute the next lines in the code before finding the next file?
Several ways to go about it, each with pros and cons. In addition to anubhava's inline method, you could use the -exec flag and a custom script. Example:
find . -name "$FILE" -exec /path/to/script.sh {} +
Then write /path/to/script.sh so that it accepts an arbitrary number of file arguments. Example:
#!/bin/bash
for file in "$@"; do
echo "$file"
done
This approach affords reuse over the inline method, but is less efficient.
The {} + business on find passes multiple files to a single invocation of the script, rather than firing up the script multiple times -- saves a bit on process overhead. If you want the script to execute fresh for each single file, use {} \; instead (and just use "$1" in your script, no looping needed).
The "$@" bit keeps the file names quoted, important for the cases where your file names have white space in them.
find . -name "$FILE" -execdir /path/to/script.sh {} \; 2>/dev/null
This way, no more need to make a for loop somewhere.
You can use while loop like this in BASH:
while read f; do
# process files here
echo "$f"
done < <(find . -name "$FILE")
For using it with sh (which doesn't support process substitution):
find . -name "$FILE" | while read f; do
# process files here
echo "$f"
done
Read more about process substitution
You could use -exec option (instead of xargs) :
find . -name "$FILE" -exec ./test.sh {} \;
With a script test.sh which contain whatever you want. For example :
$ cat test.sh
#!/bin/bash
echo "name=$1"
grep "string" "$1"
$ cat test
string
string2
test
$ sudo find . -name "test" -exec ./test.sh {} \;
name=./test
string
string2

Is there such a thing as inline bash scripts?

I want to do something on the lines of:
find -name *.mk | xargs "for i in $@ do mv i i.aside end"
I realize that there might be more than one error in this, but I'd like to specifically know about this sort of inline command definition that I can pass to xargs.
This particular command isn't a great example, but you can use an "inline shell script" by giving sh -c 'here is the script' as a command. And you can give it arguments which will be $@ inside the script but there's a catch: the first argument after 'here is the script' goes to $0 inside the script, so you have to put an extra word there or you'll lose the first argument.
find . -name '*.mk' -exec sh -c 'for i; do mv "$i" "$i.aside"; done' fnord '{}' +
Another fun feature I took advantage of there is the fact that for loops iterate over the command line arguments by default: for i; do ... is equivalent to for i in "$@"; do ...
I reiterate, the above command is convoluted and slow compared to the many other methods of doing the bulk mv. I'm posting it only to show some cool syntax.
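The equivalence of the two loop spellings is easy to confirm in isolation. In this small sketch the arguments "a b" and c are placeholders for file names, and fnord again fills $0:

```shell
# Both loops iterate over the positional parameters, one word per argument.
implicit=$(sh -c 'for i; do printf "<%s>" "$i"; done' fnord "a b" c)
explicit=$(sh -c 'for i in "$@"; do printf "<%s>" "$i"; done' fnord "a b" c)

echo "implicit: $implicit"
echo "explicit: $explicit"
```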
There's no need for xargs here
find -name '*.mk' -exec mv {} {}.aside \;
I'm not sure what the semantics of your for loop should be, but blindly coding it would give something like this:
find -name *.mk | while read file
do
for i in $file; do mv $i $i.aside; done
done
If the body is used in multiple places, you can also use bash functions.
In some version of find an argument is needed : . for the current directory
Star * must be escaped
You can try with echo command to be sure what command will do
find . -name '*.mk' -print0 | xargs -0i sh -c "echo mv '{}' '{}.aside'"
man xargs
/-i
man sh
/-c
I'm certain you could do this in a nice manner, but since you requested xargs:
find -name "*.tk" | xargs -I% mv % %.aside
Looping over filenames makes no sense, since you can only rename one at a time. Using inline ugliness is not necessary, but I could not make it work with the pipe and either eval or bash -c.

How do I apply a shell command to many files in nested (and poorly escaped) subdirectories?

I'm trying to do something like the following:
for file in `find . *.foo`
do
somecommand $file
done
But the command isn't working because $file is very odd. Because my directory tree has crappy file names (including spaces), I need to escape the find command. But none of the obvious escapes seem to work:
-ls gives me the space-delimited filename fragments
-fprint doesn't do any better.
I also tried: for file in "find . *.foo -ls"; do echo $file; done
- but that gives all of the responses from find in one long line.
Any hints? I'm happy for any workaround, but am frustrated that I can't figure this out.
Thanks,
Alex
(Hi Matt!)
You have plenty of answers that explain well how to do it; but for the sake of completeness I'll repeat and add to it:
xargs is only ever useful for interactive use (when you know all your filenames are plain - no spaces or quotes) or when used with the -0 option. Otherwise, it'll break everything.
find is a very useful tool; but using it to pipe filenames into xargs (even with -0) is rather convoluted as find can do it all itself with either -exec command {} \; or -exec command {} + depending on what you want:
find /path -name 'pattern' -exec somecommand {} \;
find /path -name 'pattern' -exec somecommand {} +
The former runs somecommand with one argument for each file recursively in /path that matches pattern.
The latter runs somecommand with as many arguments as fit on the command line at once for files recursively in /path that match pattern.
Which one to use depends on somecommand. If it can take multiple filename arguments (like rm, grep, etc.) then the latter option is faster (since you run somecommand far less often). If somecommand takes only one argument then you need the former solution. So look at somecommand's man page.
More on find: http://mywiki.wooledge.org/UsingFind
In bash, for is a statement that iterates over arguments. If you do something like this:
for foo in "$bar"
you're giving for one argument to iterate over (note the quotes!). If you do something like this:
for foo in $bar
you're asking bash to take the contents of bar and tear it apart wherever there are spaces, tabs or newlines (technically, whatever characters are in IFS) and use the pieces of that operation as arguments to for. Those pieces are NOT filenames. Assuming that tearing apart a long string that contains filenames wherever there is whitespace yields a pile of filenames is just wrong. As you have just noticed.
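A quick way to convince yourself, as a sketch with a made-up value for bar and the default IFS assumed:

```bash
#!/usr/bin/env bash
# A made-up "list" of two file names, separated by a newline.
bar=$'my file.foo\nother.foo'

unquoted=0
for foo in $bar; do          # split on spaces, tabs and newlines
    unquoted=$((unquoted + 1))
done

quoted=0
for foo in "$bar"; do        # no splitting: the whole string is one argument
    quoted=$((quoted + 1))
done

echo "unquoted: $unquoted pieces, quoted: $quoted piece"
```

The unquoted loop sees three pieces (my, file.foo, other.foo) and the quoted loop sees one; neither matches the two filenames you actually have.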
The answer is: Don't use for, it's obviously the wrong tool. The above find commands all assume that somecommand is an executable in PATH. If it's a bash statement, you'll need this construct instead (iterates over find's output, like you tried, but safely):
while read -r -d ''; do
somebashstatement "$REPLY"
done < <(find /path -name 'pattern' -print0)
This uses a while-read loop that reads parts of the string find outputs until it reaches a NUL byte (which is what -print0 uses to separate the filenames). Since NUL bytes can't be part of filenames (unlike spaces, tabs and newlines) this is a safe operation.
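As a rough illustration of that loop (the scratch directory, the file names and the count variable are invented for the example), note that the loop body runs in the current shell, so a counter set inside it is still there afterwards:

```bash
#!/usr/bin/env bash
# Made-up test files: a plain name, one with a space, one with a newline.
tmp=$(mktemp -d)
touch "$tmp/plain.foo" "$tmp/with space.foo" "$tmp/"$'new\nline.foo'

count=0
while read -r -d ''; do
    count=$((count + 1))
    # %q prints the name in shell-quoted form so odd characters are visible.
    printf 'found: %q\n' "$REPLY"
done < <(find "$tmp" -name '*.foo' -print0)

# The loop is not in a subshell, so the counter survives it.
echo "count=$count"
rm -rf "$tmp"
```

This needs bash (for read -d and process substitution) and a find that supports -print0.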
If you don't need somebashstatement to be part of your script (eg. it doesn't change the script environment by keeping a counter or setting a variable or some such) then you can still use find's -exec to run your bash statement:
find /path -name 'pattern' -exec bash -c 'somebashstatement "$1"' -- {} \;
find /path -name 'pattern' -exec bash -c 'for file; do somebashstatement "$file"; done' -- {} +
Here, -exec runs bash -c with three or more arguments:
The bash statement to execute.
A --. bash will put this in $0; you can put anything you like here, really.
Your filename or filenames (depending on whether you used {} \; or {} + respectively). The filename(s) end(s) up in $1 (and $2, $3, ... if there's more than one, of course).
The bash statement in the first find command here runs somebashstatement with the filename as argument.
The bash statement in the second find command here runs a for(!) loop that iterates over each positional parameter (that's what the reduced for syntax - for foo; do - does) and runs somebashstatement with each filename as argument. The difference from the very first find statement I showed with -exec {} + is that we still run only one bash process for lots of filenames, but one somebashstatement for each of those filenames.
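You can check where the arguments end up without involving find at all. A minimal sketch, with two made-up file names passed by hand in place of {}:

```bash
#!/usr/bin/env bash
# Same argument layout as the find commands: statement, then --, then the names.
# The file names here are invented for illustration.
out=$(bash -c 'echo "\$0=$0"; for file; do echo "file=$file"; done' -- 'a b.foo' 'c.foo')
echo "$out"
```

The first output line shows what bash put in $0; the remaining lines show one positional parameter each, with the space in the first name intact.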
All this is also well explained in the UsingFind page linked above.
Instead of relying on the shell to do that work, rely on find to do it:
find . -name "*.foo" -exec somecommand "{}" \;
find then passes each file name to somecommand as a single argument, so the shell never gets a chance to split or interpret it.
find . -name '*.foo' -print0 | xargs -0 -n 1 somecommand
It does get messy if you need to run a number of shell commands on each item, though.
xargs is your friend. You will also want to investigate the -0 (zero) option with it. find (with -print0) will help to produce the list. The Wikipedia page has some good examples.
Another good reason to use xargs is that if you have many files, xargs will split them up into separate calls to whatever command it is asked to run (in the first Wikipedia example, rm), so that no single command line gets too long.
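A small sketch of that batching, assuming a find and xargs that support -print0 and -0 (GNU and BSD versions do). The file names are made up, and -n 2 is only there to force small batches so the splitting is visible:

```bash
#!/usr/bin/env bash
tmp=$(mktemp -d)
touch "$tmp/one.foo" "$tmp/two words.foo" "$tmp/three.foo"

# -n 2 allows at most two names per call; without -n, xargs packs in as many
# names as fit on one command line. The trailing "sh" fills $0 of the inner shell.
calls=$(find "$tmp" -name '*.foo' -print0 |
    xargs -0 -n 2 sh -c 'echo "call with $# name(s)"' sh)
echo "$calls"
rm -rf "$tmp"
```

With three files and -n 2, the inner command runs twice: once with two names and once with one.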
find . -name '*.foo' -print0 | xargs -0 sh -c 'for F in "${@}"; do ...; done' "${0}"
I had to do something similar some time ago, renaming files to allow them to live in Win32 environments:
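#!/bin/bash
# Split the output of ls on newlines only, so names with spaces survive.
IFS=$'\n'
function RecurseDirs
{
    for f in "$@"
    do
        # Replace characters that cause trouble in Win32 file names with "_".
        newf=$(echo "${f}" | sed -e 's/[\\/:\*\?#"\|<>]/_/g')
        if [ "${newf}" != "${f}" ]; then
            echo "${f}" "${newf}"
            mv "${f}" "${newf}"
            f="${newf}"
        fi
        # Descend into directories (after any rename) and process their entries.
        if [[ -d "${f}" ]]; then
            cd "${f}"
            RecurseDirs $(ls -1 ".")
        fi
    done
    # Step back up to the parent once this directory's entries are done.
    cd ..
}
RecurseDirs .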
This is probably a little simplistic, doesn't avoid name collisions, and I'm sure it could be done better -- but this does remove the need to use basename on the find results (in my case) before performing my sed replacement.
I might ask, what are you doing to the found files, exactly?

Resources