Is there such a thing as inline bash scripts?

I want to do something on the lines of:
find -name *.mk | xargs "for i in $# do mv i i.aside end"
I realize that there might be more than one error in this, but I'd like to know specifically about this sort of inline command definition that I can pass to xargs.

This particular command isn't a great example, but you can use an "inline shell script" by giving sh -c 'here is the script' as a command. You can give it arguments, which will be "$@" inside the script, but there's a catch: the first argument after the script text goes to $0 inside the script, so you have to put an extra word there or you'll lose the first argument.
find . -name '*.mk' -exec sh -c 'for i; do mv "$i" "$i.aside"; done' fnord '{}' +
Another fun feature I took advantage of there is the fact that for loops iterate over the command-line arguments by default: for i; do ... is equivalent to for i in "$@"; do ...
I reiterate, the above command is convoluted and slow compared to the many other methods of doing the bulk mv. I'm posting it only to show some cool syntax.
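To see both tricks in isolation at the prompt (argv0 is an arbitrary placeholder word whose only job is to land in $0):
sh -c 'for i; do echo "got: $i"; done' argv0 one two
This prints "got: one" and then "got: two": the bare for i; iterated over $1 and $2, and argv0 was silently consumed as $0.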

There's no need for xargs here:
find . -name '*.mk' -exec mv {} {}.aside \;

I'm not sure what the semantics of your for loop should be, but blindly coding it would give something like this:
find . -name '*.mk' | while read file
do
for i in $file; do mv "$i" "$i.aside"; done
done
If the body is used in multiple places, you can also use bash functions.

In some versions of find, the path argument is required: use . for the current directory.
The star (*) must be escaped or quoted so the shell doesn't expand it.
You can include an echo to preview what the command will do:
find . -name '*.mk' -print0 | xargs -0 -I{} sh -c "echo mv '{}' '{}.aside'"
man xargs
/-i
man sh
/-c

I'm certain you could do this in a nice manner, but since you requested xargs:
find -name "*.tk" | xargs -I% mv % %.aside
Looping over filenames makes no sense, since you can only rename one at a time. Using inline ugliness is not necessary, but I could not make it work with the pipe and either eval or bash -c.
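For the record, a pipe-based variant that does work, assuming GNU find and xargs so null delimiters are available (the underscore fills $0, as the first answer explains):
find . -name '*.mk' -print0 | xargs -0 bash -c 'for f; do mv "$f" "$f.aside"; done' _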

Related

Bash: recursively rename part of a file [duplicate]


How to execute bash code for each iteration of the find command?

Using this will perform a grep for each file found:
find . -name "$FILE" 2>null | xargs grep "search_string" >> $grep_out
But what if I want to execute custom code for each file found, rather than executing a grep? I would like to parse each file my own way, which is motivation for doing this. Could I write the code in the pipe? Should I execute a separate script using the pipe? Can I expand the pipe's scope to execute the next lines in the code before finding the next file?
Several ways to go about it, each with pros and cons. In addition to anubhava's inline method, you could use the -exec flag and a custom script. Example:
find . -name "$FILE" -exec /path/to/script.sh {} +
Then write /path/to/script.sh so that it accepts an arbitrary number of file arguments. Example:
#!/bin/bash
for file in "$#"; do
echo "$file"
done
This approach affords reuse over the inline method, but is less efficient.
The {} + business on find passes multiple files to a single invocation of the script, rather than firing up the script multiple times -- saves a bit on process overhead. If you want the script to execute fresh for each single file, use {} \; instead (and just use "$1" in your script, no looping needed).
The "$#" bit keeps the file names quoted, important for the cases where your file names have white space in them.
find . -name "$FILE" 2>null -execdir /path/to/script.sh {} \;
This way, no more need to make a for loop somewhere.
You can use a while loop like this in bash:
while read f; do
# process files here
echo "$f"
done < <(find . -name "$FILE")
For using it with sh (which doesn't support process substitution):
find . -name "$FILE" | while read f; do
# process files here
echo "$f"
done
Read more about process substitution
You could use -exec option (instead of xargs) :
find . -name "$FILE" -exec ./test.sh {} \;
With a script test.sh that contains whatever you want. For example:
$ cat test.sh
#!/bin/bash
echo "name=$1"
grep "string" "$1"
$ cat test
string
string2
test
$ sudo find . -name "test" -exec ./test.sh {} \;
name=./test
string
string2

Recursively rename files using find and sed

I want to go through a bunch of directories and rename all files that end in _test.rb to end in _spec.rb instead. It's something I've never quite figured out how to do with bash so this time I thought I'd put some effort in to get it nailed. I've so far come up short though, my best effort is:
find spec -name "*_test.rb" -exec echo mv {} `echo {} | sed s/test/spec/` \;
NB: there's an extra echo after exec so that the command is printed instead of run while I'm testing it.
When I run it the output for each matched filename is:
mv original original
i.e. the substitution by sed has been lost. What's the trick?
To solve it in a way most close to the original problem would be probably using xargs "args per command line" option:
find . -name "*_test.rb" | sed -e "p;s/test/spec/" | xargs -n2 mv
It finds the files in the current working directory recursively, echoes the original file name (p) and then the modified name (s/test/spec/), and feeds the pairs to mv (xargs -n2). Beware that in this case the path itself shouldn't contain the string test.
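To preview the pairs before committing, the usual dry-run trick works here too: put echo in front of mv:
find . -name "*_test.rb" | sed -e "p;s/test/spec/" | xargs -n2 echo mv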
This happens because sed receives the string {} as input, as can be verified with:
find . -exec echo `echo "{}" | sed 's/./foo/g'` \;
which prints foofoo for each file in the directory, recursively. The reason for this behavior is that the pipeline is executed once, by the shell, when it expands the entire command.
There is no way of quoting the sed pipeline in such a way that find will execute it for every file, since find doesn't execute commands via the shell and has no notion of pipelines or backquotes. The GNU findutils manual explains how to perform a similar task by putting the pipeline in a separate shell script:
#!/bin/sh
echo "$1" | sed 's/_test.rb$/_spec.rb/'
(There may be some perverse way of using sh -c and a ton of quotes to do all this in one command, but I'm not going to try.)
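For the curious, here is a sketch of that one-command version; with $(...) instead of backquotes, and double quotes around the sed script, it needs fewer quoting contortions than feared (the underscore fills $0):
find . -name '*_test.rb' -exec sh -c 'mv "$1" "$(echo "$1" | sed "s/_test\.rb$/_spec.rb/")"' _ {} \;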
You might want to consider another way, like:
for file in $(find . -name "*_test.rb")
do
echo mv $file `echo $file | sed s/_test.rb$/_spec.rb/`
done
I find this one shorter
find . -name '*_test.rb' -exec bash -c 'echo mv "$0" "${0/test.rb/spec.rb}"' {} \;
You can do it without sed, if you want:
for i in `find -name '*_test.rb'` ; do mv $i ${i%%_test.rb}_spec.rb ; done
${var%%suffix} strips suffix from the value of var.
or, to do it using sed:
for i in `find -name '*_test.rb'` ; do mv $i `echo $i | sed 's/test/spec/'` ; done
You mention that you are using bash as your shell, in which case you don't actually need find and sed to achieve the batch renaming you're after...
Assuming you are using bash as your shell:
$ echo $SHELL
/bin/bash
$ _
... and assuming you have enabled the so-called globstar shell option:
$ shopt -p globstar
shopt -s globstar
$ _
... and finally assuming you have installed the rename utility (found in the util-linux-ng package)
$ which rename
/usr/bin/rename
$ _
... then you can achieve the batch renaming in a bash one-liner as follows:
$ rename _test _spec **/*_test.rb
(the globstar shell option will ensure that bash finds all matching *_test.rb files, no matter how deeply they are nested in the directory hierarchy... use help shopt to find out how to set the option)
The easiest way:
find . -name "*_test.rb" | xargs rename s/_test/_spec/
The fastest way (assuming you have 4 processors):
find . -name "*_test.rb" | xargs -P 4 rename s/_test/_spec/
If you have a large number of files to process, it is possible that the list of filenames piped to xargs would cause the resulting command line to exceed the maximum length allowed.
You can check your system's limit using getconf ARG_MAX
On most Linux systems you can use free -b or cat /proc/meminfo to find out how much RAM you have to work with; otherwise, use top or your system's activity monitor app.
A safer way (assuming you have 1000000 bytes of ram to work with):
find . -name "*_test.rb" | xargs -s 1000000 rename s/_test/_spec/
Here is what worked for me when the file names had spaces in them. The example below recursively renames all .dar files to .zip files:
find . -name "*.dar" -exec bash -c 'mv "$0" "`echo \"$0\" | sed s/.dar/.zip/`"' {} \;
For this you don't need sed. You can perfectly get along with a while loop fed from the result of find through a process substitution.
So if you have a find expression that selects the needed files, then use the syntax:
while IFS= read -r file; do
echo "mv $file ${file%_test.rb}_spec.rb" # remove "echo" when OK!
done < <(find -name "*_test.rb")
This will find the files and rename all of them, stripping the string _test.rb from the end and appending _spec.rb.
For this step we use Shell Parameter Expansion where ${var%string} removes the shortest matching pattern "string" from $var.
$ file="HELLOa_test.rbBYE_test.rb"
$ echo "${file%_test.rb}" # remove _test.rb from the end
HELLOa_test.rbBYE
$ echo "${file%_test.rb}_spec.rb" # remove _test.rb and append _spec.rb
HELLOa_test.rbBYE_spec.rb
See an example:
$ tree
.
├── ab_testArb
├── a_test.rb
├── a_test.rb_test.rb
├── b_test.rb
├── c_test.hello
├── c_test.rb
└── mydir
└── d_test.rb
$ while IFS= read -r file; do echo "mv $file ${file%_test.rb}_spec.rb"; done < <(find -name "*_test.rb")
mv ./b_test.rb ./b_spec.rb
mv ./mydir/d_test.rb ./mydir/d_spec.rb
mv ./a_test.rb ./a_spec.rb
mv ./c_test.rb ./c_spec.rb
If you have Ruby (1.9+):
ruby -e 'Dir["**/*_test.rb"].each{|x| test(?f, x) and File.rename(x, x.gsub(/_test/, "_spec")) }'
In ramtam's answer which I like, the find portion works OK but the remainder does not if the path has spaces. I am not too familiar with sed, but I was able to modify that answer to:
find . -name "*_test.rb" | perl -pe 's/^((.*_)test.rb)$/"\1" "\2spec.rb"/' | xargs -n2 mv
I really needed a change like this because in my use case the final command looks more like
find . -name "olddir" | perl -pe 's/^((.*)olddir)$/"\1" "\2new directory"/' | xargs -n2 mv
I haven't the heart to do it all over again, but I wrote this in answer to Commandline Find Sed Exec. There the asker wanted to know how to move an entire tree, possibly excluding a directory or two, and rename all files and directories containing the string "OLD" to instead contain "NEW".
Besides describing the how with painstaking verbosity below, this method may also be unique in that it incorporates built-in debugging. It basically doesn't do anything at all as written except compile and save to a variable all commands it believes it should do in order to perform the work requested.
It also explicitly avoids loops as much as possible. Besides the sed recursive search for more than one match of the pattern there is no other recursion as far as I know.
And last, this is entirely null delimited - it doesn't trip on any character in any filename except the null, which can't appear in a filename anyway.
By the way, this is REALLY fast. Look:
% _mvnfind() { mv -n "${1}" "${2}" && cd "${2}"
> read -r SED <<SED
> :;s|${3}\(.*/[^/]*${5}\)|${4}\1|;t;:;s|\(${5}.*\)${3}|\1${4}|;t;s|^[0-9]*[\t]\(mv.*\)${5}|\1|p
> SED
> find . -name "*${3}*" -printf "%d\tmv %P ${5} %P\000" |
> sort -zg | sed -nz ${SED} | read -r ${6}
> echo <<EOF
> Prepared commands saved in variable: ${6}
> To view do: printf ${6} | tr "\000" "\n"
> To run do: sh <<EORUN
> $(printf ${6} | tr "\000" "\n")
> EORUN
> EOF
> }
% rm -rf "${UNNECESSARY:=/any/dirs/you/dont/want/moved}"
% time ( _mvnfind ${SRC=./test_tree} ${TGT=./mv_tree} \
> ${OLD=google} ${NEW=replacement_word} ${sed_sep=SsEeDd} \
> ${sh_io:=sh_io} ; printf %b\\000 "${sh_io}" | tr "\000" "\n" \
> | wc - ; echo ${sh_io} | tr "\000" "\n" | tail -n 2 )
<actual process time used:>
0.06s user 0.03s system 106% cpu 0.090 total
<output from wc:>
Lines Words Bytes
115 362 20691 -
<output from tail:>
mv .config/replacement_word-chrome-beta/Default/.../googlestars \
.config/replacement_word-chrome-beta/Default/.../replacement_wordstars
NOTE: The above function will likely require GNU versions of sed and find to properly handle the find printf and sed -z -e and :;recursive regex test;t calls. If these are not available to you the functionality can likely be duplicated with a few minor adjustments.
This should do everything you wanted from start to finish with very little fuss. I did fork with sed, but I was also practicing some sed recursive branching techniques so that's why I'm here. It's kind of like getting a discount haircut at a barber school, I guess. Here's the workflow:
rm -rf ${UNNECESSARY}
I intentionally left out any functional call that might delete or destroy data of any kind. You mention that ./app might be unwanted. Delete it or move it elsewhere beforehand, or, alternatively, you could build in a \( -path PATTERN -exec rm -rf \{\} \) routine to find to do it programmatically, but that one's all yours.
_mvnfind "${@}"
Declare its arguments and call the worker function. ${sh_io} is especially important in that it saves the return from the function. ${sed_sep} comes in a close second; this is an arbitrary string used to reference sed's recursion in the function. If ${sed_sep} is set to a value that could potentially be found in any of your path- or file-names acted upon... well, just don't let it be.
mv -n $1 $2
The whole tree is moved from the beginning. It will save a lot of headache; believe me. The rest of what you want to do - the renaming - is simply a matter of filesystem metadata. If you were, for instance, moving this from one drive to another, or across filesystem boundaries of any kind, you're better off doing so at once with one command. It's also safer. Note the -n (no-clobber) option set for mv; as written, this function will not put ${SRC_DIR} where a ${TGT_DIR} already exists.
read -r SED <<HEREDOC
I located all of sed's commands here to save on escaping hassles and read them into a variable to feed to sed below. Explanation below.
find . -name ${OLD} -printf
We begin the find process. With find we search only for anything that needs renaming because we already did all of the place-to-place mv operations with the function's first command. Rather than take any direct action with find, like an exec call, for instance, we instead use it to build out the command-line dynamically with -printf.
%dir-depth :tab: 'mv '%path-to-${SRC}' '${sed_sep}'%path-again :null delimiter:'
After find locates the files we need it directly builds and prints out (most) of the command we'll need to process your renaming. The %dir-depth tacked onto the beginning of each line will help to ensure we're not trying to rename a file or directory in the tree with a parent object that has yet to be renamed. find uses all sorts of optimization techniques to walk your filesystem tree and it is not a sure thing that it will return the data we need in a safe-for-operations order. This is why we next...
sort -general-numerical -zero-delimited
We sort all of find's output based on %directory-depth so that the paths nearest in relationship to ${SRC} are worked first. This avoids possible errors involving mving files into non-existent locations, and it minimizes the need for recursive looping. (In fact, you might be hard-pressed to find a loop at all.)
sed -ex :rcrs;srch|(save${sep}*til)${OLD}|\saved${SUBSTNEW}|;til ${OLD=0}
I think this is the only loop in the whole script, and it only loops over the second %Path printed for each string in case it contains more than one ${OLD} value that might need replacing. All other solutions I imagined involved a second sed process, and while a short loop may not be desirable, certainly it beats spawning and forking an entire process.
So basically what sed does here is search for ${sed_sep}, then, having found it, saves it and all characters it encounters until it finds ${OLD}, which it then replaces with ${NEW}. It then heads back to ${sed_sep} and looks again for ${OLD}, in case it occurs more than once in the string. If it is not found, it prints the modified string to stdout (which it then catches again next) and ends the loop.
This avoids having to parse the entire string, and ensures that the first half of the mv command string, which needs to include ${OLD} of course, does include it, and the second half is altered as many times as is necessary to wipe the ${OLD} name from mv's destination path.
sed -ex...-ex search|%dir_depth(save*)${sed_sep}|(only_saved)|out
The two -exec calls here happen without a second fork. In the first, as we've seen, we modify the mv command as supplied by find's -printf function command as necessary to properly alter all references of ${OLD} to ${NEW}, but in order to do so we had to use some arbitrary reference points which should not be included in the final output. So once sed finishes all it needs to do, we instruct it to wipe out its reference points from the hold-buffer before passing it along.
AND NOW WE'RE BACK AROUND
read will receive a command that looks like this:
% mv /path2/$SRC/$OLD_DIR/$OLD_FILE /same/path_w/$NEW_DIR/$NEW_FILE \000
It will read it into ${msg} as ${sh_io} which can be examined at will outside of the function.
Cool.
-Mike
I was able to handle filenames with spaces by following the examples suggested by onitake.
This doesn't break if the path contains spaces or the string test:
find . -name "*_test.rb" -print0 | while read -d $'\0' file
do
echo mv "$file" "$(echo $file | sed s/test/spec/)"
done
This is an example that should work in all cases.
It works recursively, needs just the shell, and supports file names with spaces.
find spec -name "*_test.rb" -print0 | while read -d $'\0' file; do mv "$file" "$(echo "$file" | sed s/test/spec/)"; done
$ find spec -name "*_test.rb"
spec/dir2/a_test.rb
spec/dir1/a_test.rb
$ find spec -name "*_test.rb" | xargs -n 1 /usr/bin/perl -e '($new=$ARGV[0]) =~ s/test/spec/; system(qq(mv),qq(-v), $ARGV[0], $new);'
`spec/dir2/a_test.rb' -> `spec/dir2/a_spec.rb'
`spec/dir1/a_test.rb' -> `spec/dir1/a_spec.rb'
$ find spec -name "*_spec.rb"
spec/dir2/b_spec.rb
spec/dir2/a_spec.rb
spec/dir1/a_spec.rb
spec/dir1/c_spec.rb
Your question seems to be about sed, but to accomplish your goal of recursive rename, I'd suggest the following, shamelessly ripped from another answer I gave here: recursive rename in bash
#!/bin/bash
IFS=$'\n'
function RecurseDirs
{
for f in "$#"
do
newf=$(echo "${f}" | sed -e 's/^\(.*_\)test\.rb$/\1spec.rb/')
if [ "${newf}" != "${f}" ]; then
echo "${f}" "${newf}"
mv "${f}" "${newf}"
f="${newf}"
fi
if [[ -d "${f}" ]]; then
cd "${f}"
RecurseDirs $(ls -1 ".")
fi
done
cd ..
}
RecurseDirs .
A more robust way of doing the rename, with find utils and a sed extended regular expression:
mkdir ~/practice
cd ~/practice
touch classic.txt.txt
touch folk.txt.txt
Remove the ".txt.txt" extension as follows -
cd ~/practice
find . -name "*txt" -execdir sh -c 'mv "$0" `echo "$0" | sed -r 's/\.[[:alnum:]]+\.[[:alnum:]]+$//'`' {} \;
If you use + in place of \; in order to work in batch mode, the above command will rename only the first matching file, not the entire list of files matched by find:
find . -name "*txt" -execdir sh -c 'mv "$0" `echo "$0" | sed -r 's/\.[[:alnum:]]+\.[[:alnum:]]+$//'`' {} +
Here's a nice one-liner that does the trick.
sed can't handle this right, especially if multiple variables are passed by xargs with -n 2.
A bash substitution handles it easily:
find ./spec -type f -name "*_test.rb" -print0 | xargs -0 -I{} bash -c 'file=$1; mv "$file" "${file/_test.rb/_spec.rb}"' _ {}
Adding -type f limits the move operations to files only, and -print0 handles spaces in paths; the filename is passed as a positional argument rather than spliced into the script text, and bash -c is used because ${file/.../...} is a bashism.
I'm sharing this as it is somewhat related to the question. Sorry for not providing more details. Hope it helps someone else.
http://www.peteryu.ca/tutorials/shellscripting/batch_rename
This is my working solution:
for FILE in {{FILE_PATTERN}}; do mv "${FILE}" "$(echo "${FILE}" | sed 's/{{SOURCE_PATTERN}}/{{TARGET_PATTERN}}/g')"; done

'find -exec' a shell function in Linux

Is there a way to get find to execute a function I define in the shell?
For example:
dosomething () {
echo "Doing something with $1"
}
find . -exec dosomething {} \;
The result of that is:
find: dosomething: No such file or directory
Is there a way to get find's -exec to see dosomething?
Since only the shell knows how to run shell functions, you have to run a shell to run a function. You also need to mark your function for export with export -f; otherwise the subshell won't inherit it:
export -f dosomething
find . -exec bash -c 'dosomething "$0"' {} \;
find . | while read file; do dosomething "$file"; done
Jac's answer is great, but it has a couple of pitfalls that are easily overcome:
find . -print0 | while IFS= read -r -d '' file; do dosomething "$file"; done
This uses null as a delimiter instead of a linefeed, so filenames with line feeds will work. It also uses the -r flag which disables backslash escaping, and without it backslashes in filenames won't work. It also clears IFS so that potential trailing white spaces in names are not discarded.
Add quotes in {} as shown below:
export -f dosomething
find . -exec bash -c 'dosomething "{}"' \;
This corrects errors due to special characters returned by find, for example files with parentheses in their name.
Processing results in bulk
For increased efficiency, many people use xargs to process results in bulk, but it is very dangerous without null-delimited input. Because of that, an alternate method was introduced into find that executes results in bulk.
Note though that this method comes with some caveats, for example a requirement in POSIX find to have {} at the end of the command.
export -f dosomething
find . -exec bash -c 'for f; do dosomething "$f"; done' _ {} +
find will pass many results as arguments to a single call of bash and the for-loop iterates through those arguments, executing the function dosomething on each one of those.
The above solution starts arguments at $1, which is why there is a _ (which represents $0).
Processing results one by one
In the same way, I think that the accepted top answer should be corrected to be
export -f dosomething
find . -exec bash -c 'dosomething "$1"' _ {} \;
This is not only more sane, because arguments should always start at $1, but also using $0 could lead to unexpected behavior if the filename returned by find has special meaning to the shell.
Have the script call itself, passing each item found as an argument:
#!/bin/bash
if [ -n "$1" ] ; then
echo "doing something with $1"
exit 0
fi
find . -exec "$0" {} \;
exit 0
When you run the script by itself, it finds what you are looking for and calls itself passing each find result as the argument. When the script is run with an argument, it executes the commands on the argument and then exits.
Just a warning regarding the accepted answer's use of a shell: although it answers the question well, it might not be the most efficient way to exec some code on find results.
Here is a benchmark under bash of all kind of solutions,
including a simple for loop case:
(1465 directories, on a standard hard drive, armv7l GNU/Linux synology_armada38x_ds218j)
dosomething() { echo $1; }
export -f dosomething
time find . -type d -exec bash -c 'dosomething "$0"' {} \;
real 0m16.102s
time while read -d '' filename; do dosomething "${filename}" </dev/null; done < <(find . -type d -print0)
real 0m0.364s
time find . -type d | while read file; do dosomething "$file"; done
real 0m0.340s
time for dir in $(find . -type d); do dosomething $dir; done
real 0m0.337s
"find | while" and "for loop" seems best and similar in speed.
For those of you looking for a Bash function that will execute a given command on all files in current directory, I have compiled one from the above answers:
toall(){
find . -type f | while read file; do "$1" "$file"; done
}
Note that it breaks with file names containing spaces (see below).
As an example, take this function:
world(){
sed -i 's_hello_world_g' "$1"
}
Say I wanted to change all instances of "hello" to "world" in all files in the current directory. I would do:
toall world
To be safe with any symbols in filenames, use:
toall(){
find . -type f -print0 | while IFS= read -r -d '' file; do "$1" "$file"; done
}
(but you need a find that handles -print0 e.g., GNU find).
It is not possible to execute a function that way.
To overcome this, you can place your function in a shell script and call that from find:
#!/bin/sh
# dosomething.sh
dosomething () {
echo "doing something with $1"
}
dosomething "$1"
Make it executable with chmod +x dosomething.sh, then use it in find as:
find . -exec ./dosomething.sh {} \;
To provide additions and clarifications to some of the other answers, if you are using the bulk option for exec or execdir (-exec command {} +), and want to retrieve all the positional arguments, you need to consider the handling of $0 with bash -c.
More concretely, consider the command below, which uses bash -c as suggested above, and simply echoes out file paths ending with '.wav' from each directory it finds:
find "$1" -name '*.wav' -execdir bash -c 'echo "$#"' _ {} +
The Bash manual says:
If the -c option is present, then commands are read from the first non-option argument command_string. If there are arguments after the command_string, they are assigned to positional parameters, starting with $0.
Here, 'echo "$#"' is the command string, and _ {} are the arguments after the command string. Note that $# is a special positional parameter in Bash that expands to all the positional parameters starting from 1. Also note that with the -c option, the first argument is assigned to positional parameter $0.
This means that if you try to access all of the positional parameters with $@, you will only get parameters starting from $1 and up. That is the reason why Dominik's answer has the _, which is a dummy argument to fill parameter $0, so all of the arguments we want are available later if we use the "$@" parameter expansion, for instance, or the for loop as in that answer.
Of course, similar to the accepted answer, bash -c 'shell_function "$0" "$@"' would also work by explicitly passing $0, but again, you have to keep in mind that "$@" won't include that first argument.
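A quick way to watch the first argument being swallowed at the prompt (a.wav, b.wav and c.wav stand in for find's results):
bash -c 'echo "count=$# args: $@"' a.wav b.wav c.wav
This prints count=2 args: b.wav c.wav; a.wav was consumed as $0. Note that $#, the argument count, likewise does not count $0.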
Put the function in a separate file and get find to execute that.
Shell functions are internal to the shell they're defined in; find will never be able to see them.
I find the easiest way is as follows, running two commands in a single do:
func_one () {
echo "The first thing with $1"
}
func_two () {
echo "The second thing with $1"
}
find . -type f | while read file; do func_one "$file"; func_two "$file"; done
Not directly, no. Find is executing in a separate process, not in your shell.
Create a shell script that does the same job as your function and find can -exec that.
I would avoid using -exec altogether. Use xargs instead (the function still has to be exported with export -f so the child bash can see it):
find . -name "<script/command you're searching for>" -print0 | xargs -0 bash -c 'for f; do dosomething "$f"; done' _

How do I apply a shell command to many files in nested (and poorly escaped) subdirectories?

I'm trying to do something like the following:
for file in `find . *.foo`
do
somecommand $file
done
But the command isn't working because $file is very odd. Because my directory tree has crappy file names (including spaces), I need to escape the find command. But none of the obvious escapes seem to work:
-ls gives me the space-delimited filename fragments
-fprint doesn't do any better.
I also tried: for file in "find . *.foo -ls"; do echo $file; done
- but that gives all of the responses from find in one long line.
Any hints? I'm happy for any workaround, but am frustrated that I can't figure this out.
Thanks,
Alex
(Hi Matt!)
You have plenty of answers that explain well how to do it; but for the sake of completeness I'll repeat and add to it:
xargs is only ever useful for interactive use (when you know all your filenames are plain - no spaces or quotes) or when used with the -0 option. Otherwise, it'll break everything.
find is a very useful tool; but using it to pipe filenames into xargs (even with -0) is rather convoluted, as find can do it all itself with either -exec command {} \; or -exec command {} + depending on what you want:
find /path -name 'pattern' -exec somecommand {} \;
find /path -name 'pattern' -exec somecommand {} +
The former runs somecommand with one argument for each file recursively in /path that matches pattern.
The latter runs somecommand with as many arguments as fit on the command line at once for files recursively in /path that match pattern.
Which one to use depends on somecommand. If it can take multiple filename arguments (like rm, grep, etc.) then the latter option is faster (since you run somecommand far less often). If somecommand takes only one argument then you need the former solution. So look at somecommand's man page.
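A quick way to see the difference in invocation counts is to make the command report how many arguments each call received ($# is the argument count):
find /path -name 'pattern' -exec sh -c 'echo "one call with $# args"' sh {} +
find /path -name 'pattern' -exec sh -c 'echo "one call with $# args"' sh {} \;
The first form typically prints a single line with a large count; the second prints one line per file, each with 1 arg.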
More on find: http://mywiki.wooledge.org/UsingFind
In bash, for is a statement that iterates over arguments. If you do something like this:
for foo in "$bar"
you're giving for one argument to iterate over (note the quotes!). If you do something like this:
for foo in $bar
you're asking bash to take the contents of bar and tear it apart wherever there are spaces, tabs or newlines (technically, whatever characters are in IFS) and use the pieces of that operation as arguments to for. Those are NOT filenames. Assuming that tearing a long string containing filenames apart wherever there is whitespace yields a pile of filenames is just wrong, as you have just noticed.
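You can watch the damage happen with two filenames that contain spaces:
bar='first file.foo second file.foo'
for foo in $bar; do echo "[$foo]"; done
This prints four fragments - [first], [file.foo], [second], [file.foo] - instead of two filenames.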
The answer is: Don't use for, it's obviously the wrong tool. The above find commands all assume that somecommand is an executable in PATH. If it's a bash statement, you'll need this construct instead (iterates over find's output, like you tried, but safely):
while read -r -d ''; do
somebashstatement "$REPLY"
done < <(find /path -name 'pattern' -print0)
This uses a while-read loop that reads parts of the string find outputs until it reaches a NULL byte (which is what -print0 uses to separate the filenames). Since NULL bytes can't be part of filenames (unlike spaces, tabs and newlines) this is a safe operation.
If you don't need somebashstatement to be part of your script (eg. it doesn't change the script environment by keeping a counter or setting a variable or some such) then you can still use find's -exec to run your bash statement:
find /path -name 'pattern' -exec bash -c 'somebashstatement "$1"' -- {} \;
find /path -name 'pattern' -exec bash -c 'for file; do somebashstatement "$file"; done' -- {} +
Here, the -exec executes a bash command with three or more arguments.
The bash statement to execute.
A --. bash will put this in $0, you can put anything you like here, really.
Your filename or filenames (depending on whether you used {} \; or {} + respectively). The filename(s) end(s) up in $1 (and $2, $3, ... if there's more than one, of course).
The bash statement in the first find command here runs somebashstatement with the filename as argument.
The bash statement in the second find command here runs a for(!) loop that iterates over each positional parameter (that's what the reduced for syntax - for foo; do - does) and runs a somebashstatement with the filename as argument. The difference here between the very first find statement I showed with -exec {} + is that we run only one bash process for lots of filenames but still one somebashstatement for each of those filenames.
All this is also well explained in the UsingFind page linked above.
Instead of relying on the shell to do that work, rely on find to do it:
find . -name "*.foo" -exec somecommand "{}" \;
Then the file name will be properly escaped, and never interpreted by the shell.
find . -name '*.foo' -print0 | xargs -0 -n 1 somecommand
It does get messy if you need to run a number of shell commands on each item, though.
xargs is your friend. You will also want to investigate the -0 (zero) option with it. find (with -print0) will help to produce the list. The Wikipedia page has some good examples.
Another useful reason to use xargs, is that if you have many files (dozens or more), xargs will split them up into individual calls to whatever xargs is then called upon to run (in the first wikipedia example, rm)
find . -name '*.foo' -print0 | xargs -0 sh -c 'for F in "$@"; do ...; done' "${0}"
I had to do something similar some time ago, renaming files to allow them to live in Win32 environments:
#!/bin/bash
IFS=$'\n'
function RecurseDirs
{
for f in "$#"
do
newf=$(echo "${f}" | sed -e 's/[\\/:\*\?#"\|<>]/_/g')
if [ "${newf}" != "${f}" ]; then
echo "${f}" "${newf}"
mv "${f}" "${newf}"
f="${newf}"
fi
if [[ -d "${f}" ]]; then
cd "${f}"
RecurseDirs $(ls -1 ".")
fi
done
cd ..
}
RecurseDirs .
This is probably a little simplistic, doesn't avoid name collisions, and I'm sure it could be done better -- but this does remove the need to use basename on the find results (in my case) before performing my sed replacement.
I might ask, what are you doing to the found files, exactly?
