Bash: recursively rename part of a file [duplicate] - bash

I want to go through a bunch of directories and rename all files that end in _test.rb to end in _spec.rb instead. It's something I've never quite figured out how to do with bash so this time I thought I'd put some effort in to get it nailed. I've so far come up short though, my best effort is:
find spec -name "*_test.rb" -exec echo mv {} `echo {} | sed s/test/spec/` \;
NB: there's an extra echo after exec so that the command is printed instead of run while I'm testing it.
When I run it the output for each matched filename is:
mv original original
i.e. the substitution by sed has been lost. What's the trick?

To solve it in a way most close to the original problem would be probably using xargs "args per command line" option:
find . -name "*_test.rb" | sed -e "p;s/test/spec/" | xargs -n2 mv
It finds the files in the current working directory recursively, echoes the original file name (p) and then a modified name (s/test/spec/) and feeds it all to mv in pairs (xargs -n2). Beware that in this case the path itself shouldn't contain a string test.

This happens because sed receives the string {} as input, as can be verified with:
find . -exec echo `echo "{}" | sed 's/./foo/g'` \;
which prints foofoo for each file in the directory, recursively. The reason for this behavior is that the pipeline is executed once, by the shell, when it expands the entire command.
There is no way of quoting the sed pipeline in such a way that find will execute it for every file, since find doesn't execute commands via the shell and has no notion of pipelines or backquotes. The GNU findutils manual explains how to perform a similar task by putting the pipeline in a separate shell script:
#!/bin/sh
echo "$1" | sed 's/_test.rb$/_spec.rb/'
(There may be some perverse way of using sh -c and a ton of quotes to do all this in one command, but I'm not going to try.)

you might want to consider other way like
for file in $(find . -name "*_test.rb")
do
echo mv $file `echo $file | sed s/_test.rb$/_spec.rb/`
done

I find this one shorter
find . -name '*_test.rb' -exec bash -c 'echo mv $0 ${0/test.rb/spec.rb}' {} \;

You can do it without sed, if you want:
for i in `find -name '*_test.rb'` ; do mv $i ${i%%_test.rb}_spec.rb ; done
${var%%suffix} strips suffix from the value of var.
or, to do it using sed:
for i in `find -name '*_test.rb'` ; do mv $i `echo $i | sed 's/test/spec/'` ; done

You mention that you are using bash as your shell, in which case you don't actually need find and sed to achieve the batch renaming you're after...
Assuming you are using bash as your shell:
$ echo $SHELL
/bin/bash
$ _
... and assuming you have enabled the so-called globstar shell option:
$ shopt -p globstar
shopt -s globstar
$ _
... and finally assuming you have installed the rename utility (found in the util-linux-ng package)
$ which rename
/usr/bin/rename
$ _
... then you can achieve the batch renaming in a bash one-liner as follows:
$ rename _test _spec **/*_test.rb
(the globstar shell option will ensure that bash finds all matching *_test.rb files, no matter how deeply they are nested in the directory hierarchy... use help shopt to find out how to set the option)

The easiest way:
find . -name "*_test.rb" | xargs rename s/_test/_spec/
The fastest way (assuming you have 4 processors):
find . -name "*_test.rb" | xargs -P 4 rename s/_test/_spec/
If you have a large number of files to process, it is possible that the list of filenames piped to xargs would cause the resulting command line to exceed the maximum length allowed.
You can check your system's limit using getconf ARG_MAX
On most linux systems you can use free -b or cat /proc/meminfo to find how much RAM you have to work with; Otherwise, use top or your systems activity monitor app.
A safer way (assuming you have 1000000 bytes of ram to work with):
find . -name "*_test.rb" | xargs -s 1000000 rename s/_test/_spec/

Here is what worked for me when the file names had spaces in them. The example below recursively renames all .dar files to .zip files:
find . -name "*.dar" -exec bash -c 'mv "$0" "`echo \"$0\" | sed s/.dar/.zip/`"' {} \;

For this you don't need sed. You can perfectly get alone with a while loop fed with the result of find through a process substitution.
So if you have a find expression that selects the needed files, then use the syntax:
while IFS= read -r file; do
echo "mv $file ${file%_test.rb}_spec.rb" # remove "echo" when OK!
done < <(find -name "*_test.rb")
This will find files and rename all of them striping the string _test.rb from the end and appending _spec.rb.
For this step we use Shell Parameter Expansion where ${var%string} removes the shortest matching pattern "string" from $var.
$ file="HELLOa_test.rbBYE_test.rb"
$ echo "${file%_test.rb}" # remove _test.rb from the end
HELLOa_test.rbBYE
$ echo "${file%_test.rb}_spec.rb" # remove _test.rb and append _spec.rb
HELLOa_test.rbBYE_spec.rb
See an example:
$ tree
.
├── ab_testArb
├── a_test.rb
├── a_test.rb_test.rb
├── b_test.rb
├── c_test.hello
├── c_test.rb
└── mydir
└── d_test.rb
$ while IFS= read -r file; do echo "mv $file ${file/_test.rb/_spec.rb}"; done < <(find -name "*_test.rb")
mv ./b_test.rb ./b_spec.rb
mv ./mydir/d_test.rb ./mydir/d_spec.rb
mv ./a_test.rb ./a_spec.rb
mv ./c_test.rb ./c_spec.rb

if you have Ruby (1.9+)
ruby -e 'Dir["**/*._test.rb"].each{|x|test(?f,x) and File.rename(x,x.gsub(/_test/,"_spec") ) }'

In ramtam's answer which I like, the find portion works OK but the remainder does not if the path has spaces. I am not too familiar with sed, but I was able to modify that answer to:
find . -name "*_test.rb" | perl -pe 's/^((.*_)test.rb)$/"\1" "\2spec.rb"/' | xargs -n2 mv
I really needed a change like this because in my use case the final command looks more like
find . -name "olddir" | perl -pe 's/^((.*)olddir)$/"\1" "\2new directory"/' | xargs -n2 mv

I haven't the heart to do it all over again, but I wrote this in answer to Commandline Find Sed Exec. There the asker wanted to know how to move an entire tree, possibly excluding a directory or two, and rename all files and directories containing the string "OLD" to instead contain "NEW".
Besides describing the how with painstaking verbosity below, this method may also be unique in that it incorporates built-in debugging. It basically doesn't do anything at all as written except compile and save to a variable all commands it believes it should do in order to perform the work requested.
It also explicitly avoids loops as much as possible. Besides the sed recursive search for more than one match of the pattern there is no other recursion as far as I know.
And last, this is entirely null delimited - it doesn't trip on any character in any filename except the null. I don't think you should have that.
By the way, this is REALLY fast. Look:
% _mvnfind() { mv -n "${1}" "${2}" && cd "${2}"
> read -r SED <<SED
> :;s|${3}\(.*/[^/]*${5}\)|${4}\1|;t;:;s|\(${5}.*\)${3}|\1${4}|;t;s|^[0-9]*[\t]\(mv.*\)${5}|\1|p
> SED
> find . -name "*${3}*" -printf "%d\tmv %P ${5} %P\000" |
> sort -zg | sed -nz ${SED} | read -r ${6}
> echo <<EOF
> Prepared commands saved in variable: ${6}
> To view do: printf ${6} | tr "\000" "\n"
> To run do: sh <<EORUN
> $(printf ${6} | tr "\000" "\n")
> EORUN
> EOF
> }
% rm -rf "${UNNECESSARY:=/any/dirs/you/dont/want/moved}"
% time ( _mvnfind ${SRC=./test_tree} ${TGT=./mv_tree} \
> ${OLD=google} ${NEW=replacement_word} ${sed_sep=SsEeDd} \
> ${sh_io:=sh_io} ; printf %b\\000 "${sh_io}" | tr "\000" "\n" \
> | wc - ; echo ${sh_io} | tr "\000" "\n" | tail -n 2 )
<actual process time used:>
0.06s user 0.03s system 106% cpu 0.090 total
<output from wc:>
Lines Words Bytes
115 362 20691 -
<output from tail:>
mv .config/replacement_word-chrome-beta/Default/.../googlestars \
.config/replacement_word-chrome-beta/Default/.../replacement_wordstars
NOTE: The above function will likely require GNU versions of sed and find to properly handle the find printf and sed -z -e and :;recursive regex test;t calls. If these are not available to you the functionality can likely be duplicated with a few minor adjustments.
This should do everything you wanted from start to finish with very little fuss. I did fork with sed, but I was also practicing some sed recursive branching techniques so that's why I'm here. It's kind of like getting a discount haircut at a barber school, I guess. Here's the workflow:
rm -rf ${UNNECESSARY}
I intentionally left out any functional call that might delete or destroy data of any kind. You mention that ./app might be unwanted. Delete it or move it elsewhere beforehand, or, alternatively, you could build in a \( -path PATTERN -exec rm -rf \{\} \) routine to find to do it programmatically, but that one's all yours.
_mvnfind "${#}"
Declare its arguments and call the worker function. ${sh_io} is especially important in that it saves the return from the function. ${sed_sep} comes in a close second; this is an arbitrary string used to reference sed's recursion in the function. If ${sed_sep} is set to a value that could potentially be found in any of your path- or file-names acted upon... well, just don't let it be.
mv -n $1 $2
The whole tree is moved from the beginning. It will save a lot of headache; believe me. The rest of what you want to do - the renaming - is simply a matter of filesystem metadata. If you were, for instance, moving this from one drive to another, or across filesystem boundaries of any kind, you're better off doing so at once with one command. It's also safer. Note the -noclobber option set for mv; as written, this function will not put ${SRC_DIR} where a ${TGT_DIR} already exists.
read -R SED <<HEREDOC
I located all of sed's commands here to save on escaping hassles and read them into a variable to feed to sed below. Explanation below.
find . -name ${OLD} -printf
We begin the find process. With find we search only for anything that needs renaming because we already did all of the place-to-place mv operations with the function's first command. Rather than take any direct action with find, like an exec call, for instance, we instead use it to build out the command-line dynamically with -printf.
%dir-depth :tab: 'mv '%path-to-${SRC}' '${sed_sep}'%path-again :null delimiter:'
After find locates the files we need it directly builds and prints out (most) of the command we'll need to process your renaming. The %dir-depth tacked onto the beginning of each line will help to ensure we're not trying to rename a file or directory in the tree with a parent object that has yet to be renamed. find uses all sorts of optimization techniques to walk your filesystem tree and it is not a sure thing that it will return the data we need in a safe-for-operations order. This is why we next...
sort -general-numerical -zero-delimited
We sort all of find's output based on %directory-depth so that the paths nearest in relationship to ${SRC} are worked first. This avoids possible errors involving mving files into non-existent locations, and it minimizes need to for recursive looping. (in fact, you might be hard-pressed to find a loop at all)
sed -ex :rcrs;srch|(save${sep}*til)${OLD}|\saved${SUBSTNEW}|;til ${OLD=0}
I think this is the only loop in the whole script, and it only loops over the second %Path printed for each string in case it contains more than one ${OLD} value that might need replacing. All other solutions I imagined involved a second sed process, and while a short loop may not be desirable, certainly it beats spawning and forking an entire process.
So basically what sed does here is search for ${sed_sep}, then, having found it, saves it and all characters it encounters until it finds ${OLD}, which it then replaces with ${NEW}. It then heads back to ${sed_sep} and looks again for ${OLD}, in case it occurs more than once in the string. If it is not found, it prints the modified string to stdout (which it then catches again next) and ends the loop.
This avoids having to parse the entire string, and ensures that the first half of the mv command string, which needs to include ${OLD} of course, does include it, and the second half is altered as many times as is necessary to wipe the ${OLD} name from mv's destination path.
sed -ex...-ex search|%dir_depth(save*)${sed_sep}|(only_saved)|out
The two -exec calls here happen without a second fork. In the first, as we've seen, we modify the mv command as supplied by find's -printf function command as necessary to properly alter all references of ${OLD} to ${NEW}, but in order to do so we had to use some arbitrary reference points which should not be included in the final output. So once sed finishes all it needs to do, we instruct it to wipe out its reference points from the hold-buffer before passing it along.
AND NOW WE'RE BACK AROUND
read will receive a command that looks like this:
% mv /path2/$SRC/$OLD_DIR/$OLD_FILE /same/path_w/$NEW_DIR/$NEW_FILE \000
It will read it into ${msg} as ${sh_io} which can be examined at will outside of the function.
Cool.
-Mike

I was able handle filenames with spaces by following the examples suggested by onitake.
This doesn't break if the path contains spaces or the string test:
find . -name "*_test.rb" -print0 | while read -d $'\0' file
do
echo mv "$file" "$(echo $file | sed s/test/spec/)"
done

This is an example that should work in all cases.
Works recursiveley, Need just shell, and support files names with spaces.
find spec -name "*_test.rb" -print0 | while read -d $'\0' file; do mv "$file" "`echo $file | sed s/test/spec/`"; done

$ find spec -name "*_test.rb"
spec/dir2/a_test.rb
spec/dir1/a_test.rb
$ find spec -name "*_test.rb" | xargs -n 1 /usr/bin/perl -e '($new=$ARGV[0]) =~ s/test/spec/; system(qq(mv),qq(-v), $ARGV[0], $new);'
`spec/dir2/a_test.rb' -> `spec/dir2/a_spec.rb'
`spec/dir1/a_test.rb' -> `spec/dir1/a_spec.rb'
$ find spec -name "*_spec.rb"
spec/dir2/b_spec.rb
spec/dir2/a_spec.rb
spec/dir1/a_spec.rb
spec/dir1/c_spec.rb

Your question seems to be about sed, but to accomplish your goal of recursive rename, I'd suggest the following, shamelessly ripped from another answer I gave here:recursive rename in bash
#!/bin/bash
IFS=$'\n'
function RecurseDirs
{
for f in "$#"
do
newf=echo "${f}" | sed -e 's/^(.*_)test.rb$/\1spec.rb/g'
echo "${f}" "${newf}"
mv "${f}" "${newf}"
f="${newf}"
if [[ -d "${f}" ]]; then
cd "${f}"
RecurseDirs $(ls -1 ".")
fi
done
cd ..
}
RecurseDirs .

More secure way of doing rename with find utils and sed regular expression type:
mkdir ~/practice
cd ~/practice
touch classic.txt.txt
touch folk.txt.txt
Remove the ".txt.txt" extension as follows -
cd ~/practice
find . -name "*txt" -execdir sh -c 'mv "$0" `echo "$0" | sed -r 's/\.[[:alnum:]]+\.[[:alnum:]]+$//'`' {} \;
If you use the + in place of ; in order to work on batch mode, the above command will rename only the first matching file, but not the entire list of file matches by 'find'.
find . -name "*txt" -execdir sh -c 'mv "$0" `echo "$0" | sed -r 's/\.[[:alnum:]]+\.[[:alnum:]]+$//'`' {} +

Here's a nice oneliner that does the trick.
Sed can't handle this right, especially if multiple variables are passed by xargs with -n 2.
A bash substition would handle this easily like:
find ./spec -type f -name "*_test.rb" -print0 | xargs -0 -I {} sh -c 'export file={}; mv $file ${file/_test.rb/_spec.rb}'
Adding -type -f will limit the move operations to files only, -print 0 will handle empty spaces in paths.

I share this post as it is a bit related to question. Sorry for not providing more details. Hope it helps someone else.
http://www.peteryu.ca/tutorials/shellscripting/batch_rename

This is my working solution:
for FILE in {{FILE_PATTERN}}; do echo ${FILE} | mv ${FILE} $(sed 's/{{SOURCE_PATTERN}}/{{TARGET_PATTERN}}/g'); done

Related

Renaming bunch of files with xargs

I've been trying to rename a bunch of files in a proper order using xargs but to no avail. While digging around on piles of similar question, I found answers with the use of sed alongside xargs. Novice me wants to avoid the use of sed. I presume there must be some easier way around.
To be more specific, I've got some files as follows:
Abc.jpg
Def.jpg
Ghi.jpg
Jkl.jpg
and I want these to be renamed in an ordered way, like:
Something1.jpg
Something2.jpg
Something3.jpg
Something4.jpg
Could xargs command along with seq achieve this? If so, how do I implement it?
I don't know why anyone would try to engage sed for this. Probably not xargs or seq, either. Here's a pure-Bash one-liner:
(x=1; for f in *.jpg; do mv "$f" "Something$((x++)).jpg"; done)
At its core, that's a for loop over the files you want to rename, performing a mv command on each one. The files to operate on are expressed via a single glob expression, but you could also name them individually, use multiple globs, or use one of a variety of other techniques. Variable x is used as a simple counter, initialized to 1 before entering the loop. $((x++)) expands to the current value of x, with the side effect of incrementing x by 1. The whole thing is wrapped in parentheses to run it in a subshell, so that nothing in it affects the host shell environment. (In this case, that means it does not create or modify any variable x in the invoking shell.)
If you were putting that in a script instead of typing it on the command line then it would be more readable to split it over several lines:
(
x=1
for f in *.jpg; do
mv "$f" "Something$((x++)).jpg"
done
)
You can type it that way, too, if you wish.
This is an example of how to find, number and rename jpgs.
Regardless of how you use the find (what options you need. recursive, mindepth, maxdepth, regex, ...).
You can add numbers to find ouput with nl and use number and file as 2 arguments for xargs $1, $2
$ find . -type f -name "*.jpg" |nl| xargs -n 2 bash -c 'echo mv "$2" Something"$1".jpg' argv0
the echo echo mv ... will show this
mv ./Jkl.jpg Something1.jpg
mv ./Abc.jpg Something2.jpg
mv ./Def.jpg Something3.jpg
Using sort and testing the number of arguments
$ find . -type f -name "*.jpg" |sort|nl| xargs -n 2 bash -c '[ "$#" -eq 2 ] && echo mv "$2" Something"$1".jpg' argv0
mv ./Abc.jpg Something1.jpg
mv ./Def.jpg Something2.jpg
mv ./Jkl.jpg Something3.jpg

How to find files with specific extensions recursively using the for/in syntax? [duplicate]

x=$(find . -name "*.txt")
echo $x
if I run the above piece of code in Bash shell, what I get is a string containing several file names separated by blank, not a list.
Of course, I can further separate them by blank to get a list, but I'm sure there is a better way to do it.
So what is the best way to loop through the results of a find command?
TL;DR: If you're just here for the most correct answer, you probably want my personal preference (see the bottom of this post):
# execute `process` once for each file
find . -name '*.txt' -exec process {} \;
If you have time, read through the rest to see several different ways and the problems with most of them.
The full answer:
The best way depends on what you want to do, but here are a few options. As long as no file or folder in the subtree has whitespace in its name, you can just loop over the files:
for i in $x; do # Not recommended, will break on whitespace
process "$i"
done
Marginally better, cut out the temporary variable x:
for i in $(find -name \*.txt); do # Not recommended, will break on whitespace
process "$i"
done
It is much better to glob when you can. White-space safe, for files in the current directory:
for i in *.txt; do # Whitespace-safe but not recursive.
process "$i"
done
By enabling the globstar option, you can glob all matching files in this directory and all subdirectories:
# Make sure globstar is enabled
shopt -s globstar
for i in **/*.txt; do # Whitespace-safe and recursive
process "$i"
done
In some cases, e.g. if the file names are already in a file, you may need to use read:
# IFS= makes sure it doesn't trim leading and trailing whitespace
# -r prevents interpretation of \ escapes.
while IFS= read -r line; do # Whitespace-safe EXCEPT newlines
process "$line"
done < filename
read can be used safely in combination with find by setting the delimiter appropriately:
find . -name '*.txt' -print0 |
while IFS= read -r -d '' line; do
process "$line"
done
For more complex searches, you will probably want to use find, either with its -exec option or with -print0 | xargs -0:
# execute `process` once for each file
find . -name \*.txt -exec process {} \;
# execute `process` once with all the files as arguments*:
find . -name \*.txt -exec process {} +
# using xargs*
find . -name \*.txt -print0 | xargs -0 process
# using xargs with arguments after each filename (implies one run per filename)
find . -name \*.txt -print0 | xargs -0 -I{} process {} argument
find can also cd into each file's directory before running a command by using -execdir instead of -exec, and can be made interactive (prompt before running the command for each file) using -ok instead of -exec (or -okdir instead of -execdir).
*: Technically, both find and xargs (by default) will run the command with as many arguments as they can fit on the command line, as many times as it takes to get through all the files. In practice, unless you have a very large number of files it won't matter, and if you exceed the length but need them all on the same command line, you're SOL find a different way.
What ever you do, don't use a for loop:
# Don't do this
for file in $(find . -name "*.txt")
do
…code using "$file"
done
Three reasons:
For the for loop to even start, the find must run to completion.
If a file name has any whitespace (including space, tab or newline) in it, it will be treated as two separate names.
Although now unlikely, you can overrun your command line buffer. Imagine if your command line buffer holds 32KB, and your for loop returns 40KB of text. That last 8KB will be dropped right off your for loop and you'll never know it.
Always use a while read construct:
find . -name "*.txt" -print0 | while read -d $'\0' file
do
…code using "$file"
done
The loop will execute while the find command is executing. Plus, this command will work even if a file name is returned with whitespace in it. And, you won't overflow your command line buffer.
The -print0 will use the NULL as a file separator instead of a newline and the -d $'\0' will use NULL as the separator while reading.
find . -name "*.txt"|while read fname; do
echo "$fname"
done
Note: this method and the (second) method shown by bmargulies are safe to use with white space in the file/folder names.
In order to also have the - somewhat exotic - case of newlines in the file/folder names covered, you will have to resort to the -exec predicate of find like this:
find . -name '*.txt' -exec echo "{}" \;
The {} is the placeholder for the found item and the \; is used to terminate the -exec predicate.
And for the sake of completeness let me add another variant - you gotta love the *nix ways for their versatility:
find . -name '*.txt' -print0|xargs -0 -n 1 echo
This would separate the printed items with a \0 character that isn't allowed in any of the file systems in file or folder names, to my knowledge, and therefore should cover all bases. xargs picks them up one by one then ...
Filenames can include spaces and even control characters. Spaces are (default) delimiters for shell expansion in bash and as a result of that x=$(find . -name "*.txt") from the question is not recommended at all. If find gets a filename with spaces e.g. "the file.txt" you will get 2 separated strings for processing, if you process x in a loop. You can improve this by changing delimiter (bash IFS Variable) e.g. to \r\n, but filenames can include control characters - so this is not a (completely) safe method.
From my point of view, there are 2 recommended (and safe) patterns for processing files:
1. Use for loop & filename expansion:
for file in ./*.txt; do
[[ ! -e $file ]] && continue # continue, if file does not exist
# single filename is in $file
echo "$file"
# your code here
done
2. Use find-read-while & process substitution
while IFS= read -r -d '' file; do
# single filename is in $file
echo "$file"
# your code here
done < <(find . -name "*.txt" -print0)
Remarks
on Pattern 1:
bash returns the search pattern ("*.txt") if no matching file is found - so the extra line "continue, if file does not exist" is needed. see Bash Manual, Filename Expansion
shell option nullglob can be used to avoid this extra line.
"If the failglob shell option is set, and no matches are found, an error message is printed and the command is not executed." (from Bash Manual above)
shell option globstar: "If set, the pattern ‘**’ used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a ‘/’, only directories and subdirectories match." see Bash Manual, Shopt Builtin
other options for filename expansion: extglob, nocaseglob, dotglob & shell variable GLOBIGNORE
on Pattern 2:
filenames can contain blanks, tabs, spaces, newlines, ... to process filenames in a safe way, find with -print0 is used: filename is printed with all control characters & terminated with NUL. see also Gnu Findutils Manpage, Unsafe File Name Handling, safe File Name Handling, unusual characters in filenames. See David A. Wheeler below for detailed discussion of this topic.
There are some possible patterns to process find results in a while loop. Others (kevin, David W.) have shown how to do this using pipes:
files_found=1
find . -name "*.txt" -print0 |
while IFS= read -r -d '' file; do
# single filename in $file
echo "$file"
files_found=0 # not working example
# your code here
done
[[ $files_found -eq 0 ]] && echo "files found" || echo "no files found"
When you try this piece of code, you will see, that it does not work: files_found is always "true" & the code will always echo "no files found". Reason is: each command of a pipeline is executed in a separate subshell, so the changed variable inside the loop (separate subshell) does not change the variable in the main shell script. This is why I recommend using process substitution as the "better", more useful, more general pattern.See I set variables in a loop that's in a pipeline. Why do they disappear... (from Greg's Bash FAQ) for a detailed discussion on this topic.
Additional References & Sources:
Gnu Bash Manual, Pattern Matching
Filenames and Pathnames in Shell: How to do it Correctly, David A. Wheeler
Why you don't read lines with "for", Greg's Wiki
Why you shouldn't parse the output of ls(1), Greg's Wiki
Gnu Bash Manual, Process Substitution
(Updated to include #Socowi's execellent speed improvement)
With any $SHELL that supports it (dash/zsh/bash...):
find . -name "*.txt" -exec $SHELL -c '
for i in "$#" ; do
echo "$i"
done
' {} +
Done.
Original answer (shorter, but slower):
find . -name "*.txt" -exec $SHELL -c '
echo "$0"
' {} \;
If you can assume the file names don't contain newlines, you can read the output of find into a Bash array using the following command:
readarray -t x < <(find . -name '*.txt')
Note:
-t causes readarray to strip newlines.
It won't work if readarray is in a pipe, hence the process substitution.
readarray is available since Bash 4.
Bash 4.4 and up also supports the -d parameter for specifying the delimiter. Using the null character, instead of newline, to delimit the file names works also in the rare case that the file names contain newlines:
readarray -d '' x < <(find . -name '*.txt' -print0)
readarray can also be invoked as mapfile with the same options.
Reference: https://mywiki.wooledge.org/BashFAQ/005#Loading_lines_from_a_file_or_stream
# Doesn't handle whitespace
for x in `find . -name "*.txt" -print`; do
process_one $x
done
or
# Handles whitespace and newlines
find . -name "*.txt" -print0 | xargs -0 -n 1 process_one
I like to use find which is first assigned to variable and IFS switched to new line as follow:
FilesFound=$(find . -name "*.txt")
IFSbkp="$IFS"
IFS=$'\n'
counter=1;
for file in $FilesFound; do
echo "${counter}: ${file}"
let counter++;
done
IFS="$IFSbkp"
As commented by #Konrad Rudolph this will not work with "new lines" in file name. I still think it is handy as it covers most of the cases when you need to loop over command output.
As already posted on the top answer by Kevin, the best solution is to use a for loop with bash glob, but as bash glob is not recursive by default, this can be fixed by a bash recursive function:
#!/bin/bash
set -x
set -eu -o pipefail
all_files=();
function get_all_the_files()
{
directory="$1";
for item in "$directory"/* "$directory"/.[^.]*;
do
if [[ -d "$item" ]];
then
get_all_the_files "$item";
else
all_files+=("$item");
fi;
done;
}
get_all_the_files "/tmp";
for file_path in "${all_files[#]}"
do
printf 'My file is "%s"\n' "$file_path";
done;
Related questions:
Bash loop through directory including hidden file
Recursively list files from a given directory in Bash
ls command: how can I get a recursive full-path listing, one line per file?
List files recursively in Linux CLI with path relative to the current directory
Recursively List all directories and files
bash script, create array of all files in a directory
How can I creates array that contains the names of all the files in a folder?
How can I creates array that contains the names of all the files in a folder?
How to get the list of files in a directory in a shell script?
based on other answers and comment of #phk, using fd #3:
(which still allows to use stdin inside the loop)
while IFS= read -r f <&3; do
echo "$f"
done 3< <(find . -iname "*filename*")
You can put the filenames returned by find into an array like this:
array=()
while IFS= read -r -d ''; do
array+=("$REPLY")
done < <(find . -name '*.txt' -print0)
Now you can just loop through the array to access individual items and do whatever you want with them.
Note: It's white space safe.
You can store your find output in array if you wish to use the output later as:
array=($(find . -name "*.txt"))
Now to print the each element in new line, you can either use for loop iterating to all the elements of array, or you can use printf statement.
for i in ${array[#]};do echo $i; done
or
printf '%s\n' "${array[#]}"
You can also use:
for file in "`find . -name "*.txt"`"; do echo "$file"; done
This will print each filename in newline
To only print the find output in list form, you can use either of the following:
find . -name "*.txt" -print 2>/dev/null
or
find . -name "*.txt" -print | grep -v 'Permission denied'
This will remove error messages and only give the filename as output in new line.
If you wish to do something with the filenames, storing it in array is good, else there is no need to consume that space and you can directly print the output from find.
I think using this piece of code (piping the command after while done):
while read fname; do
echo "$fname"
done <<< "$(find . -name "*.txt")"
is better than this answer because while loop is executed in a subshell according to here, if you use this answer and variable changes cannot be seen after while loop if you want to modify variables inside the loop.
function loop_through(){
length_="$(find . -name '*.txt' | wc -l)"
length_="${length_#"${length_%%[![:space:]]*}"}"
length_="${length_%"${length_##*[![:space:]]}"}"
for i in {1..$length_}
do
x=$(find . -name '*.txt' | sort | head -$i | tail -1)
echo $x
done
}
To grab the length of the list of files for loop, I used the first command "wc -l".
That command is set to a variable.
Then, I need to remove the trailing white spaces from the variable so the for loop can read it.
find <path> -xdev -type f -name *.txt -exec ls -l {} \;
This will list the files and give details about attributes.
Another alternative is to not use bash, but call Python to do the heavy lifting. I recurred to this because bash solutions as my other answer were too slow.
With this solution, we build a bash array of files from inline Python script:
#!/bin/bash
set -eu -o pipefail
dsep=":" # directory_separator
base_directory=/tmp
all_files=()
all_files_string="$(python3 -c '#!/usr/bin/env python3
import os
import sys
dsep="'"$dsep"'"
base_directory="'"$base_directory"'"
def log(*args, **kwargs):
print(*args, file=sys.stderr, **kwargs)
def check_invalid_characther(file_path):
for thing in ("\\", "\n"):
if thing in file_path:
raise RuntimeError(f"It is not allowed {thing} on \"{file_path}\"!")
def absolute_path_to_relative(base_directory, file_path):
relative_path = os.path.commonprefix( [ base_directory, file_path ] )
relative_path = os.path.normpath( file_path.replace( relative_path, "" ) )
# if you use Windows Python, it accepts / instead of \\
# if you have \ on your files names, rename them or comment this
relative_path = relative_path.replace("\\", "/")
if relative_path.startswith( "/" ):
relative_path = relative_path[1:]
return relative_path
for directory, directories, files in os.walk(base_directory):
for file in files:
local_file_path = os.path.join(directory, file)
local_file_name = absolute_path_to_relative(base_directory, local_file_path)
log(f"local_file_name {local_file_name}.")
check_invalid_characther(local_file_name)
print(f"{base_directory}{dsep}{local_file_name}")
' | dos2unix)";
if [[ -n "$all_files_string" ]];
then
readarray -t temp <<< "$all_files_string";
all_files+=("${temp[#]}");
fi;
for item in "${all_files[#]}";
do
OLD_IFS="$IFS"; IFS="$dsep";
read -r base_directory local_file_name <<< "$item"; IFS="$OLD_IFS";
printf 'item "%s", base_directory "%s", local_file_name "%s".\n' \
"$item" \
"$base_directory" \
"$local_file_name";
done;
Related:
os.walk without hidden folders
How to do a recursive sub-folder search and return files in a list?
How to split a string into an array in Bash?
How about if you use grep instead of find?
ls | grep .txt$ > out.txt
Now you can read this file and the filenames are in the form of a list.

Check if any of the files within a folder contain a pattern then return filename

I'm writing a script that aim to automate the fulfill of some variables and I'm looking for help to achieve this:
I have a nginx sites-enabled folder which contain some reverses proxied sites.
I need to:
check if a pattern $var1 is found in any of the files in "/volume1/nginx/sites-enabled/"
return the name of the file containing $var1 as $var2
Many thanks for your attention and help!
I have found some lines but none try any files in a folder
if grep -q $var1 "/volume1/nginx/sites-enabled/testfile"; then
echo "found"
fi
find and grep can be used to produce a list of matching files:
find /volume1/nginx/sites-enabled/ -type f -exec grep -le "${var1}" {} +
The ‘trick’ is using find’s -exec and grep’s -l.
If you only want the filenames you could use:
find /volume1/nginx/sites-enabled/ -type f -exec grep -qe "${var1}" {} \; -exec basename {} \;
If you want to assign the result to a variable use command substitution ($()):
var2="$(find …)"
Don’t forget to quote your variables!
This command is the most traditional and efficient one which works on any Unix
without the requirement to have GNU versions of grep with special features.
The efficiency is, that xargs feeds the grep command as many filenames as arguments as it is possible according to the limits of the system (how long a shell command may be) and it excecutes the grep command by this only as least as possible.
With the -l option of grep it shows you only the filename once on a successful pattern search.
find /path -type f -print | xargs grep -l pattern
Assuming you have GNU Grep, this will store all files containing the contents of $var1 in an array $var2.
for file in /volume1/nginx/sites-enabled/*
do
if grep --fixed-strings --quiet "$var1" "$file"
then
var2+=("$file")
fi
done
This will loop through NUL-separated paths:
while IFS= read -d'' -r -u9 path
do
…something with "$path"
done 9< <(grep --fixed-strings --files-without-match --recursive "$var1" /volume1/nginx/sites-enabled)

Recursively rename files using find and sed

I want to go through a bunch of directories and rename all files that end in _test.rb to end in _spec.rb instead. It's something I've never quite figured out how to do with bash so this time I thought I'd put some effort in to get it nailed. I've so far come up short though, my best effort is:
find spec -name "*_test.rb" -exec echo mv {} `echo {} | sed s/test/spec/` \;
NB: there's an extra echo after exec so that the command is printed instead of run while I'm testing it.
When I run it the output for each matched filename is:
mv original original
i.e. the substitution by sed has been lost. What's the trick?
To solve it in a way most close to the original problem would be probably using xargs "args per command line" option:
find . -name "*_test.rb" | sed -e "p;s/test/spec/" | xargs -n2 mv
It finds the files in the current working directory recursively, echoes the original file name (p) and then a modified name (s/test/spec/) and feeds it all to mv in pairs (xargs -n2). Beware that in this case the path itself shouldn't contain a string test.
This happens because sed receives the string {} as input, as can be verified with:
find . -exec echo `echo "{}" | sed 's/./foo/g'` \;
which prints foofoo for each file in the directory, recursively. The reason for this behavior is that the pipeline is executed once, by the shell, when it expands the entire command.
There is no way of quoting the sed pipeline in such a way that find will execute it for every file, since find doesn't execute commands via the shell and has no notion of pipelines or backquotes. The GNU findutils manual explains how to perform a similar task by putting the pipeline in a separate shell script:
#!/bin/sh
echo "$1" | sed 's/_test.rb$/_spec.rb/'
(There may be some perverse way of using sh -c and a ton of quotes to do all this in one command, but I'm not going to try.)
you might want to consider other way like
for file in $(find . -name "*_test.rb")
do
echo mv $file `echo $file | sed s/_test.rb$/_spec.rb/`
done
I find this one shorter
find . -name '*_test.rb' -exec bash -c 'echo mv $0 ${0/test.rb/spec.rb}' {} \;
You can do it without sed, if you want:
for i in `find -name '*_test.rb'` ; do mv $i ${i%%_test.rb}_spec.rb ; done
${var%%suffix} strips suffix from the value of var.
or, to do it using sed:
for i in `find -name '*_test.rb'` ; do mv $i `echo $i | sed 's/test/spec/'` ; done
You mention that you are using bash as your shell, in which case you don't actually need find and sed to achieve the batch renaming you're after...
Assuming you are using bash as your shell:
$ echo $SHELL
/bin/bash
$ _
... and assuming you have enabled the so-called globstar shell option:
$ shopt -p globstar
shopt -s globstar
$ _
... and finally assuming you have installed the rename utility (found in the util-linux-ng package)
$ which rename
/usr/bin/rename
$ _
... then you can achieve the batch renaming in a bash one-liner as follows:
$ rename _test _spec **/*_test.rb
(the globstar shell option will ensure that bash finds all matching *_test.rb files, no matter how deeply they are nested in the directory hierarchy... use help shopt to find out how to set the option)
The easiest way:
find . -name "*_test.rb" | xargs rename s/_test/_spec/
The fastest way (assuming you have 4 processors):
find . -name "*_test.rb" | xargs -P 4 rename s/_test/_spec/
If you have a large number of files to process, it is possible that the list of filenames piped to xargs would cause the resulting command line to exceed the maximum length allowed.
You can check your system's limit using getconf ARG_MAX
On most linux systems you can use free -b or cat /proc/meminfo to find how much RAM you have to work with; Otherwise, use top or your systems activity monitor app.
A safer way (assuming you have 1000000 bytes of ram to work with):
find . -name "*_test.rb" | xargs -s 1000000 rename s/_test/_spec/
Here is what worked for me when the file names had spaces in them. The example below recursively renames all .dar files to .zip files:
find . -name "*.dar" -exec bash -c 'mv "$0" "`echo \"$0\" | sed s/.dar/.zip/`"' {} \;
For this you don't need sed. You can perfectly get alone with a while loop fed with the result of find through a process substitution.
So if you have a find expression that selects the needed files, then use the syntax:
while IFS= read -r file; do
echo "mv $file ${file%_test.rb}_spec.rb" # remove "echo" when OK!
done < <(find -name "*_test.rb")
This will find files and rename all of them striping the string _test.rb from the end and appending _spec.rb.
For this step we use Shell Parameter Expansion where ${var%string} removes the shortest matching pattern "string" from $var.
$ file="HELLOa_test.rbBYE_test.rb"
$ echo "${file%_test.rb}" # remove _test.rb from the end
HELLOa_test.rbBYE
$ echo "${file%_test.rb}_spec.rb" # remove _test.rb and append _spec.rb
HELLOa_test.rbBYE_spec.rb
See an example:
$ tree
.
├── ab_testArb
├── a_test.rb
├── a_test.rb_test.rb
├── b_test.rb
├── c_test.hello
├── c_test.rb
└── mydir
└── d_test.rb
$ while IFS= read -r file; do echo "mv $file ${file/_test.rb/_spec.rb}"; done < <(find -name "*_test.rb")
mv ./b_test.rb ./b_spec.rb
mv ./mydir/d_test.rb ./mydir/d_spec.rb
mv ./a_test.rb ./a_spec.rb
mv ./c_test.rb ./c_spec.rb
if you have Ruby (1.9+)
ruby -e 'Dir["**/*._test.rb"].each{|x|test(?f,x) and File.rename(x,x.gsub(/_test/,"_spec") ) }'
In ramtam's answer which I like, the find portion works OK but the remainder does not if the path has spaces. I am not too familiar with sed, but I was able to modify that answer to:
find . -name "*_test.rb" | perl -pe 's/^((.*_)test.rb)$/"\1" "\2spec.rb"/' | xargs -n2 mv
I really needed a change like this because in my use case the final command looks more like
find . -name "olddir" | perl -pe 's/^((.*)olddir)$/"\1" "\2new directory"/' | xargs -n2 mv
I haven't the heart to do it all over again, but I wrote this in answer to Commandline Find Sed Exec. There the asker wanted to know how to move an entire tree, possibly excluding a directory or two, and rename all files and directories containing the string "OLD" to instead contain "NEW".
Besides describing the how with painstaking verbosity below, this method may also be unique in that it incorporates built-in debugging. It basically doesn't do anything at all as written except compile and save to a variable all commands it believes it should do in order to perform the work requested.
It also explicitly avoids loops as much as possible. Besides the sed recursive search for more than one match of the pattern there is no other recursion as far as I know.
And last, this is entirely null delimited - it doesn't trip on any character in any filename except the null. I don't think you should have that.
By the way, this is REALLY fast. Look:
% _mvnfind() { mv -n "${1}" "${2}" && cd "${2}"
> read -r SED <<SED
> :;s|${3}\(.*/[^/]*${5}\)|${4}\1|;t;:;s|\(${5}.*\)${3}|\1${4}|;t;s|^[0-9]*[\t]\(mv.*\)${5}|\1|p
> SED
> find . -name "*${3}*" -printf "%d\tmv %P ${5} %P\000" |
> sort -zg | sed -nz ${SED} | read -r ${6}
> echo <<EOF
> Prepared commands saved in variable: ${6}
> To view do: printf ${6} | tr "\000" "\n"
> To run do: sh <<EORUN
> $(printf ${6} | tr "\000" "\n")
> EORUN
> EOF
> }
% rm -rf "${UNNECESSARY:=/any/dirs/you/dont/want/moved}"
% time ( _mvnfind ${SRC=./test_tree} ${TGT=./mv_tree} \
> ${OLD=google} ${NEW=replacement_word} ${sed_sep=SsEeDd} \
> ${sh_io:=sh_io} ; printf %b\\000 "${sh_io}" | tr "\000" "\n" \
> | wc - ; echo ${sh_io} | tr "\000" "\n" | tail -n 2 )
<actual process time used:>
0.06s user 0.03s system 106% cpu 0.090 total
<output from wc:>
Lines Words Bytes
115 362 20691 -
<output from tail:>
mv .config/replacement_word-chrome-beta/Default/.../googlestars \
.config/replacement_word-chrome-beta/Default/.../replacement_wordstars
NOTE: The above function will likely require GNU versions of sed and find to properly handle the find printf and sed -z -e and :;recursive regex test;t calls. If these are not available to you the functionality can likely be duplicated with a few minor adjustments.
This should do everything you wanted from start to finish with very little fuss. I did fork with sed, but I was also practicing some sed recursive branching techniques so that's why I'm here. It's kind of like getting a discount haircut at a barber school, I guess. Here's the workflow:
rm -rf ${UNNECESSARY}
I intentionally left out any functional call that might delete or destroy data of any kind. You mention that ./app might be unwanted. Delete it or move it elsewhere beforehand, or, alternatively, you could build in a \( -path PATTERN -exec rm -rf \{\} \) routine to find to do it programmatically, but that one's all yours.
_mvnfind "${#}"
Declare its arguments and call the worker function. ${sh_io} is especially important in that it saves the return from the function. ${sed_sep} comes in a close second; this is an arbitrary string used to reference sed's recursion in the function. If ${sed_sep} is set to a value that could potentially be found in any of your path- or file-names acted upon... well, just don't let it be.
mv -n $1 $2
The whole tree is moved from the beginning. It will save a lot of headache; believe me. The rest of what you want to do - the renaming - is simply a matter of filesystem metadata. If you were, for instance, moving this from one drive to another, or across filesystem boundaries of any kind, you're better off doing so at once with one command. It's also safer. Note the -noclobber option set for mv; as written, this function will not put ${SRC_DIR} where a ${TGT_DIR} already exists.
read -R SED <<HEREDOC
I located all of sed's commands here to save on escaping hassles and read them into a variable to feed to sed below. Explanation below.
find . -name ${OLD} -printf
We begin the find process. With find we search only for anything that needs renaming because we already did all of the place-to-place mv operations with the function's first command. Rather than take any direct action with find, like an exec call, for instance, we instead use it to build out the command-line dynamically with -printf.
%dir-depth :tab: 'mv '%path-to-${SRC}' '${sed_sep}'%path-again :null delimiter:'
After find locates the files we need it directly builds and prints out (most) of the command we'll need to process your renaming. The %dir-depth tacked onto the beginning of each line will help to ensure we're not trying to rename a file or directory in the tree with a parent object that has yet to be renamed. find uses all sorts of optimization techniques to walk your filesystem tree and it is not a sure thing that it will return the data we need in a safe-for-operations order. This is why we next...
sort -general-numerical -zero-delimited
We sort all of find's output based on %directory-depth so that the paths nearest in relationship to ${SRC} are worked first. This avoids possible errors involving mving files into non-existent locations, and it minimizes need to for recursive looping. (in fact, you might be hard-pressed to find a loop at all)
sed -ex :rcrs;srch|(save${sep}*til)${OLD}|\saved${SUBSTNEW}|;til ${OLD=0}
I think this is the only loop in the whole script, and it only loops over the second %Path printed for each string in case it contains more than one ${OLD} value that might need replacing. All other solutions I imagined involved a second sed process, and while a short loop may not be desirable, certainly it beats spawning and forking an entire process.
So basically what sed does here is search for ${sed_sep}, then, having found it, saves it and all characters it encounters until it finds ${OLD}, which it then replaces with ${NEW}. It then heads back to ${sed_sep} and looks again for ${OLD}, in case it occurs more than once in the string. If it is not found, it prints the modified string to stdout (which it then catches again next) and ends the loop.
This avoids having to parse the entire string, and ensures that the first half of the mv command string, which needs to include ${OLD} of course, does include it, and the second half is altered as many times as is necessary to wipe the ${OLD} name from mv's destination path.
sed -ex...-ex search|%dir_depth(save*)${sed_sep}|(only_saved)|out
The two -exec calls here happen without a second fork. In the first, as we've seen, we modify the mv command as supplied by find's -printf function command as necessary to properly alter all references of ${OLD} to ${NEW}, but in order to do so we had to use some arbitrary reference points which should not be included in the final output. So once sed finishes all it needs to do, we instruct it to wipe out its reference points from the hold-buffer before passing it along.
AND NOW WE'RE BACK AROUND
read will receive a command that looks like this:
% mv /path2/$SRC/$OLD_DIR/$OLD_FILE /same/path_w/$NEW_DIR/$NEW_FILE \000
It will read it into ${msg} as ${sh_io} which can be examined at will outside of the function.
Cool.
-Mike
I was able handle filenames with spaces by following the examples suggested by onitake.
This doesn't break if the path contains spaces or the string test:
find . -name "*_test.rb" -print0 | while read -d $'\0' file
do
echo mv "$file" "$(echo $file | sed s/test/spec/)"
done
This is an example that should work in all cases.
Works recursiveley, Need just shell, and support files names with spaces.
find spec -name "*_test.rb" -print0 | while read -d $'\0' file; do mv "$file" "`echo $file | sed s/test/spec/`"; done
$ find spec -name "*_test.rb"
spec/dir2/a_test.rb
spec/dir1/a_test.rb
$ find spec -name "*_test.rb" | xargs -n 1 /usr/bin/perl -e '($new=$ARGV[0]) =~ s/test/spec/; system(qq(mv),qq(-v), $ARGV[0], $new);'
`spec/dir2/a_test.rb' -> `spec/dir2/a_spec.rb'
`spec/dir1/a_test.rb' -> `spec/dir1/a_spec.rb'
$ find spec -name "*_spec.rb"
spec/dir2/b_spec.rb
spec/dir2/a_spec.rb
spec/dir1/a_spec.rb
spec/dir1/c_spec.rb
Your question seems to be about sed, but to accomplish your goal of recursive rename, I'd suggest the following, shamelessly ripped from another answer I gave here:recursive rename in bash
#!/bin/bash
IFS=$'\n'
function RecurseDirs
{
for f in "$#"
do
newf=echo "${f}" | sed -e 's/^(.*_)test.rb$/\1spec.rb/g'
echo "${f}" "${newf}"
mv "${f}" "${newf}"
f="${newf}"
if [[ -d "${f}" ]]; then
cd "${f}"
RecurseDirs $(ls -1 ".")
fi
done
cd ..
}
RecurseDirs .
More secure way of doing rename with find utils and sed regular expression type:
mkdir ~/practice
cd ~/practice
touch classic.txt.txt
touch folk.txt.txt
Remove the ".txt.txt" extension as follows -
cd ~/practice
find . -name "*txt" -execdir sh -c 'mv "$0" `echo "$0" | sed -r 's/\.[[:alnum:]]+\.[[:alnum:]]+$//'`' {} \;
If you use the + in place of ; in order to work on batch mode, the above command will rename only the first matching file, but not the entire list of file matches by 'find'.
find . -name "*txt" -execdir sh -c 'mv "$0" `echo "$0" | sed -r 's/\.[[:alnum:]]+\.[[:alnum:]]+$//'`' {} +
Here's a nice oneliner that does the trick.
Sed can't handle this right, especially if multiple variables are passed by xargs with -n 2.
A bash substition would handle this easily like:
find ./spec -type f -name "*_test.rb" -print0 | xargs -0 -I {} sh -c 'export file={}; mv $file ${file/_test.rb/_spec.rb}'
Adding -type -f will limit the move operations to files only, -print 0 will handle empty spaces in paths.
I share this post as it is a bit related to question. Sorry for not providing more details. Hope it helps someone else.
http://www.peteryu.ca/tutorials/shellscripting/batch_rename
This is my working solution:
for FILE in {{FILE_PATTERN}}; do echo ${FILE} | mv ${FILE} $(sed 's/{{SOURCE_PATTERN}}/{{TARGET_PATTERN}}/g'); done

shell script to link specific files up one level

I've got a folder that's got a whole lot of folders inside, which each contain a movie file. Now I want to be able to see all these movies at one time (but I have to preserve the folder structure) so I wanted to symlink them all to the top level folder (flat view in Directory Opus would be the go here, but I'm on a mac). Here's my attempt
for file in */*/*.mov;
do
newname='echo $file | sed "s|[^/]*/[^/]*/||g"';
ln -s $file echo $newname;
done
So what I'm trying to do on line 3 is turn the path foo/bar/spud.mov into spud.mov and then on line 4 do ln -s foo/bar/spud.mov spud.mov
But it doesn't work, all I get is
ln: "s|[^/]/[^/]/||g": No such file or directory
I've tried various combinations of quotes and escapes, but with no luck. The sed expression works, but I've obviously mucked up something with my variables.
Thanks
What's with all the echo $file | sed or awk stuff, anyways? basename's made for this purpose.
for file in */*/*.mov; do
newname=`basename $file1
ln -s $file $newname;
done
But, really, ln does the same thing when given a directory, so:
for file in */*/*.mov; do
ln -s "$file" .
done
Or a really general purpose version that's smart enough to only call ln as many times as needed:
find . -name \*.mov -depth +1 -print0 | xargs -0 -J % ln -s % .
That is, in the local directory, find everything with a name ending in .mov that's not in the current directory (depth is more than 1), separate the output with NULLs, and pipe into xargs, which is looking for NULL-separate input, and will pass . as the argument to ln after all the arguments it read on stdin.
This will do the same for all folder levels:
for i in `find . -name "*.mov"`
do
file=`echo $i | /bin/awk -F'/' '{print $NF}'`
/bin/ln -s $i $file
done
Basename is what you want, and the problem with your solution is poor quoting:
bash-3.2$ echo $file
foo/bar/spud.mov
bash-3.2$ newname=$(echo $file | sed 's|[^/]*/[^/]*/||g'); echo $newname
spud.mov
Well thanks everyone. Seems to be a few worms in this can. It seems that what I was doing, while not the best way, would have worked if I'd used the right syntax.
But I've learned about basename and improved my knowledge of ln, so I'm happy.
I know a bit more about how to use find, so now I can do it in one line:
find . -name "*.mov" -exec ln -s {} \;
the -exec flag in the find command lets you run another process every time it finds a file. Put in a pair of curly braces as above to give that process the file as a parameter. Took me a while to work out that there needs to be a space before the escaped semicolon.

Resources