Use GNU find to show only the leaf directories - bash

I'm trying to use GNU find to find only the directories that contain no other directories, but may or may not contain regular files.
My best guess so far has been:
find dir -type d \( -not -exec ls -dA ';' \)
but this just gets me a long list of "."
Thanks!

You can use -links if your filesystem is POSIX compliant (i.e. a directory has a link for each subdirectory in it, a link from its parent and a link to itself, thus a count of 2 links if it has no subdirectories).
The following command should do what you want:
find dir -type d -links 2
However, it does not seem to work on Mac OS X (as @Piotr mentioned). Here is another version that is slower, but does work on Mac OS X. It is based on his version, with a correction to handle whitespace in directory names:
find . -type d -exec sh -c '(ls -p "{}"|grep />/dev/null)||echo "{}"' \;

I just found another solution to this that works on both Linux & macOS (without find -exec)!
It involves sort (twice) and awk:
find dir -type d | sort -r | awk 'a!~"^"$0{a=$0;print}' | sort
Explanation:
sort the find output in reverse order
now you have subdirectories appear first, then their parents
use awk to omit lines if the current line is a prefix of the previous line
(this command is from the answer here)
now you have eliminated all parent directories (you're left with the leaf directories)
sort them (so it looks like the normal find output)
Voila! Fast and portable.

@Sylvian's solution didn't work for me on Mac OS X for some obscure reason, so I've come up with a slightly more direct solution. Hope this will help someone:
find . -type d -print0 | xargs -0 -IXXX sh -c '(ls -p "XXX" | grep / >/dev/null) || echo "XXX"'
Explanation:
ls -p ends directories with '/'
so (ls -p XXX | grep / >/dev/null) succeeds (returns 0) if there is at least one subdirectory and fails if there are none, in which case the echo runs
-print0 and -0 make xargs handle spaces in directory names

I have some oddly named files in my directory trees that confuse awk as in @AhmetAlpBalkan's answer, so I took a slightly different approach:
p=;
while IFS= read -r c;
do
l=${#c};
f=${p:0:$l};
if [ "$f" != "$c" ]; then
echo "$c";
fi;
p=$c;
done < <(find . -type d | sort -r)
As in the awk solution, I reverse sort. That way if the directory path is a subpath of the previous hit, you can easily discern this.
Here p is my previous match, c is the current match, l is the length of the current match, f is the first l matching characters of the previous match. I only echo those hits that don't match the beginning of the previous match.
The problem with the awk solution offered is that the matching of the beginning of the string seems to be confused if the path name contains things such as + in the name of some of the subdirectories. This caused awk to return a number of false positives for me.

There is an alternative to find called rawhide (rh) that is much easier to use.
For filesystems other than btrfs:
rh 'd && nlink == 2'
For btrfs:
rh 'd && "[ `rh -red %S | wc -l` = 0 ]".sh'
A shorter/faster version for btrfs is:
rh 'd && "[ -z \"`rh -red %S`\" ]".sh'
The above commands search for directories and then list their sub-directories and only match when there are none (the first by counting the number of lines of output, and the second by checking if there is any output at all per directory).
For a version that works on all filesystems as efficiently as possible:
rh 'd && (nlink == 2 || nlink == 1 && "[ -z \"`rh -red %S`\" ]".sh)'
On normal (non-btrfs) filesystems, this will work without the need for any additional processes for each directory, but on btrfs, it will need them. This is probably best if you have a mix of different filesystems including btrfs.
Rawhide (rh) is available from https://raf.org/rawhide or https://github.com/raforg/rawhide. It works at least on Linux, FreeBSD, OpenBSD, NetBSD, Solaris, macOS, and Cygwin.
Disclaimer: I am the current author of rawhide.

What about this one? It's portable and it doesn't depend on finicky link counts. Note however that it's important to put root/folder without the trailing /.
find root/folder -type d | awk '{ if (length($0)<length(prev) || substr($0,1,length(prev))!=prev) print prev; prev=($0 "/") } END { print prev }'
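The idea behind that one-liner can be tried on a fixed list of paths. The following is only a sketch: leaf_from_preorder is a made-up helper name, and it assumes the input arrives in find's usual depth-first order (each directory immediately followed by its descendants) with no newlines inside names.

```shell
# leaf_from_preorder: read directory paths in depth-first order on stdin
# and print the ones that have no subdirectory.
leaf_from_preorder() {
    awk '
        # the current line is not below the previous one, so the previous one is a leaf
        NR > 1 && index($0, prev "/") != 1 { print prev }
        { prev = $0 }
        # the last directory listed can have no children after it
        END { if (NR > 0) print prev }
    '
}

# Fixed input standing in for: find root -type d
printf '%s\n' root root/a root/a/b root/a+b root/c | leaf_from_preorder
# prints root/a/b, root/a+b and root/c
```

Because the comparison uses index() on literal strings, a name such as a+b is not misread as a regular expression.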

Here is a solution which works on Linux and OS X:
find . -type d -execdir bash -c '[ "$(find {} -mindepth 1 -type d)" ] || echo $PWD/{}' \;
or:
find . -type d -execdir sh -c 'test -z "$(find "{}" -mindepth 1 -type d)" && echo $PWD/{}' \;

This awk/sort pipe works a bit better than the one originally proposed in this answer, but is heavily based on it :) It will work more reliably regardless of whether the path contains regex special characters or not:
find . -type d | sort -r | awk 'index(a,$0)!=1{a=$0;print}' | sort
Remember that awk strings are 1-indexed instead of 0-indexed, which might be strange if you're used to working with C-based languages.
If the index of the current line in the previous line is 1 (i.e. it starts with it) then we skip it, which works just like the match of "^"$0.
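To see the difference between the two tests in isolation, here is a small sketch. The helper names prefix_regex and prefix_index are invented for the illustration; each one asks awk whether one string starts with another.

```shell
# prefix_regex S P: succeed if S starts with P, treating P as a regular expression
prefix_regex() { awk -v s="$1" -v p="$2" 'BEGIN { exit !(s ~ ("^" p)) }'; }
# prefix_index S P: succeed if S starts with P, compared as a literal string
prefix_index() { awk -v s="$1" -v p="$2" 'BEGIN { exit !(index(s, p) == 1) }'; }

for check in prefix_regex prefix_index; do
    if "$check" './a+b/c' './a+b'; then
        echo "$check: ./a+b is a prefix of ./a+b/c"
    else
        echo "$check: no match"
    fi
done
# prefix_regex reports no match, because "+" is read as a quantifier;
# prefix_index reports the prefix correctly.
```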

My 2 cents on this problem:
#!/bin/bash
(
while IFS= read -r -d $'\0' directory
do
files=$(ls -A "$directory" | wc -l)
if test $files -gt 0
then
echo "$directory"
fi
done < <(find . -type d -print0)
) | sort | uniq
It uses a subshell to capture output from the run, and lists directories which have files.


How to copy files in Bash that have more than 1 line

I am trying to copy files from one directory (defined as $inDir below) to another (defined as $outDir below) if they 1) exist and 2) have more than 1 line in the file (this is to avoid copying files that are empty text files). I am able to do the first part using the below code but am struggling to know how to do the latter part. I'm guessing maybe using awk and NR somehow but I'm not very good with coding in Bash so any help would be appreciated. I'd like this to be incorporated into the below if possible, so that it can be done in one step.
for i in $inDir/NVDI_500m_mean_distance_*_40PCs; do
batch_name_dir=$i;
batch_name=$(basename $i);
if [ ! -f $outDir/${batch_name}.plink.gz ]; then
echo 'Copying' $batch_name;
find $batch_name_dir -name ${batch_name}.plink.gz -exec cp {} $outDir/${batch_name}.plink.gz \;
else
echo $batch_name 'already exists'
fi
done
You can use wc -l to check how many lines are in a file and awk to strip only the number from the result.
lines=$(wc -l "$YOUR_FILE_NAME" | awk '{print $1}')
if [ "$lines" -gt 0 ]; then
# copy the file
cp "$YOUR_FILE_NAME" "$outDir"/
fi
Edit: I have corrected LINES to lines according to the comments below.
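As a rough sketch of how the line-count check could be wrapped up for reuse: copy_if_multiline is a hypothetical helper, not part of the question's script, and it applies the asker's stricter "more than 1 line" rule.

```shell
# copy_if_multiline FILE DESTDIR: copy FILE into DESTDIR only if it exists
# and contains more than one line; return non-zero otherwise.
copy_if_multiline() {
    [ -f "$1" ] || return 1
    # reading from stdin makes wc print just the number;
    # the arithmetic expansion strips any padding some wc versions add
    lines=$(( $(wc -l < "$1") ))
    if [ "$lines" -gt 1 ]; then
        cp -- "$1" "$2"/
    else
        return 1
    fi
}
```

Inside the loop from the question it could be called as copy_if_multiline "$file" "$outDir", where $file is whichever path you want to test.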
I propose this:
# the unquoted $(...) is split on whitespace, so this assumes file names contain none
for f in $(find "$indir" -type f -name 'NVDI_500m_mean_distance_*_40PC' -not -empty);
do
cp "$f" /some/targetdir;
done
find is faster than wc to check for zero size.
Subjectively, I consider it more readable than the other solution.
However, the for-loop is not necessary, since:
find "$indir" -type f -name 'NVDI_500m_mean_distance_*_40PC' -not -empty |\
xargs -I % cp % /some/targetdir/
Always "quote" path strings, since most shell utils break when there are unescaped shell chars or white spaces in the string. There are rarely good reasons to use unquoted strings.

Bash: recursively rename part of a file [duplicate]

I want to go through a bunch of directories and rename all files that end in _test.rb to end in _spec.rb instead. It's something I've never quite figured out how to do with bash so this time I thought I'd put some effort in to get it nailed. I've so far come up short though, my best effort is:
find spec -name "*_test.rb" -exec echo mv {} `echo {} | sed s/test/spec/` \;
NB: there's an extra echo after exec so that the command is printed instead of run while I'm testing it.
When I run it the output for each matched filename is:
mv original original
i.e. the substitution by sed has been lost. What's the trick?
The way to solve it that stays closest to the original problem is probably to use the xargs "args per command line" option:
find . -name "*_test.rb" | sed -e "p;s/test/spec/" | xargs -n2 mv
It finds the files in the current working directory recursively, echoes the original file name (p) and then a modified name (s/test/spec/) and feeds it all to mv in pairs (xargs -n2). Beware that in this case the path itself shouldn't contain a string test.
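The caveat about test appearing elsewhere in the path can be sidestepped by anchoring the substitution to the end of the name. A minimal sketch on fixed input (rename_pairs is just an illustrative name):

```shell
# rename_pairs: read paths on stdin; print each path followed by its new name.
rename_pairs() {
    sed -e 'p' -e 's/_test\.rb$/_spec.rb/'
}

# The "test" directory component is left alone; only the suffix changes.
printf '%s\n' ./test/unit/a_test.rb | rename_pairs
# prints ./test/unit/a_test.rb and then ./test/unit/a_spec.rb
```

The output can be fed to xargs -n2 mv as before; paths containing whitespace still need one of the null-delimited approaches.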
This happens because sed receives the string {} as input, as can be verified with:
find . -exec echo `echo "{}" | sed 's/./foo/g'` \;
which prints foofoo for each file in the directory, recursively. The reason for this behavior is that the pipeline is executed once, by the shell, when it expands the entire command.
There is no way of quoting the sed pipeline in such a way that find will execute it for every file, since find doesn't execute commands via the shell and has no notion of pipelines or backquotes. The GNU findutils manual explains how to perform a similar task by putting the pipeline in a separate shell script:
#!/bin/sh
echo "$1" | sed 's/_test.rb$/_spec.rb/'
(There may be some perverse way of using sh -c and a ton of quotes to do all this in one command, but I'm not going to try.)
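For what it's worth, the sh -c route does not need many quotes if the file names are passed as positional parameters instead of being pasted into the script text. This is a sketch under that assumption; rename_tests is a made-up wrapper name.

```shell
# rename_tests DIR: rename every *_test.rb file under DIR to *_spec.rb.
rename_tests() {
    find "$1" -type f -name '*_test.rb' -exec sh -c '
        # find appends the matched paths as arguments; loop over them
        for f do
            # strip the old suffix and append the new one
            mv -- "$f" "${f%_test.rb}_spec.rb"
        done
    ' sh {} +
}
```

Because the inner script runs once per batch of files and sees each path as "$f", the expansion happens per file rather than once when the outer command line is parsed.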
You might want to consider another way, such as:
for file in $(find . -name "*_test.rb")
do
echo mv $file `echo $file | sed s/_test.rb$/_spec.rb/`
done
I find this one shorter
find . -name '*_test.rb' -exec bash -c 'echo mv $0 ${0/test.rb/spec.rb}' {} \;
You can do it without sed, if you want:
for i in `find -name '*_test.rb'` ; do mv $i ${i%%_test.rb}_spec.rb ; done
${var%%suffix} strips suffix from the value of var.
or, to do it using sed:
for i in `find -name '*_test.rb'` ; do mv $i `echo $i | sed 's/test/spec/'` ; done
You mention that you are using bash as your shell, in which case you don't actually need find and sed to achieve the batch renaming you're after...
Assuming you are using bash as your shell:
$ echo $SHELL
/bin/bash
$ _
... and assuming you have enabled the so-called globstar shell option:
$ shopt -p globstar
shopt -s globstar
$ _
... and finally assuming you have installed the rename utility (found in the util-linux-ng package)
$ which rename
/usr/bin/rename
$ _
... then you can achieve the batch renaming in a bash one-liner as follows:
$ rename _test _spec **/*_test.rb
(the globstar shell option will ensure that bash finds all matching *_test.rb files, no matter how deeply they are nested in the directory hierarchy... use help shopt to find out how to set the option)
The easiest way:
find . -name "*_test.rb" | xargs rename s/_test/_spec/
The fastest way (assuming you have 4 processors):
find . -name "*_test.rb" | xargs -P 4 rename s/_test/_spec/
If you have a large number of files to process, the list of filenames piped to xargs may be too long for a single command line. xargs deals with this by splitting the list across several invocations of rename, each kept under the system limit.
You can check your system's limit using getconf ARG_MAX
Note that this limit concerns the length of the command line (arguments plus environment), not the amount of free RAM, so the figures from free -b or /proc/meminfo are not what matters here.
To cap the command-line length explicitly (here at 1000000 bytes, which must not exceed the system limit):
find . -name "*_test.rb" | xargs -s 1000000 rename s/_test/_spec/
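The way xargs spreads its input over several command invocations is easier to see with a tiny batch size. A sketch using -n instead of -s, with echo standing in for rename (show_batches is an invented helper name):

```shell
# show_batches N: read words on stdin and print one line per xargs invocation,
# with at most N arguments each.
show_batches() {
    xargs -n "$1" echo
}

printf '%s\n' a b c d e | show_batches 2
# prints three lines: "a b", "c d" and "e"
```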
Here is what worked for me when the file names had spaces in them. The example below recursively renames all .dar files to .zip files:
find . -name "*.dar" -exec bash -c 'mv "$0" "`echo \"$0\" | sed s/.dar/.zip/`"' {} \;
For this you don't need sed. You can get along perfectly well with a while loop fed with the result of find through a process substitution.
So if you have a find expression that selects the needed files, then use the syntax:
while IFS= read -r file; do
echo "mv $file ${file%_test.rb}_spec.rb" # remove "echo" when OK!
done < <(find -name "*_test.rb")
This will find files and rename all of them, stripping the string _test.rb from the end and appending _spec.rb.
For this step we use Shell Parameter Expansion, where ${var%string} removes the shortest match of the pattern "string" from the end of $var.
$ file="HELLOa_test.rbBYE_test.rb"
$ echo "${file%_test.rb}" # remove _test.rb from the end
HELLOa_test.rbBYE
$ echo "${file%_test.rb}_spec.rb" # remove _test.rb and append _spec.rb
HELLOa_test.rbBYE_spec.rb
See an example:
$ tree
.
├── ab_testArb
├── a_test.rb
├── a_test.rb_test.rb
├── b_test.rb
├── c_test.hello
├── c_test.rb
└── mydir
    └── d_test.rb
$ while IFS= read -r file; do echo "mv $file ${file/_test.rb/_spec.rb}"; done < <(find -name "*_test.rb")
mv ./b_test.rb ./b_spec.rb
mv ./mydir/d_test.rb ./mydir/d_spec.rb
mv ./a_test.rb ./a_spec.rb
mv ./c_test.rb ./c_spec.rb
If you have Ruby (1.9+):
ruby -e 'Dir["**/*_test.rb"].each{|x|test(?f,x) and File.rename(x,x.sub(/_test\.rb\z/,"_spec.rb") ) }'
In ramtam's answer which I like, the find portion works OK but the remainder does not if the path has spaces. I am not too familiar with sed, but I was able to modify that answer to:
find . -name "*_test.rb" | perl -pe 's/^((.*_)test.rb)$/"\1" "\2spec.rb"/' | xargs -n2 mv
I really needed a change like this because in my use case the final command looks more like
find . -name "olddir" | perl -pe 's/^((.*)olddir)$/"\1" "\2new directory"/' | xargs -n2 mv
I haven't the heart to do it all over again, but I wrote this in answer to Commandline Find Sed Exec. There the asker wanted to know how to move an entire tree, possibly excluding a directory or two, and rename all files and directories containing the string "OLD" to instead contain "NEW".
Besides describing the how with painstaking verbosity below, this method may also be unique in that it incorporates built-in debugging. It basically doesn't do anything at all as written except compile and save to a variable all commands it believes it should do in order to perform the work requested.
It also explicitly avoids loops as much as possible. Besides the sed recursive search for more than one match of the pattern there is no other recursion as far as I know.
And last, this is entirely null delimited - it doesn't trip on any character in any filename except the null. I don't think you should have that.
By the way, this is REALLY fast. Look:
% _mvnfind() { mv -n "${1}" "${2}" && cd "${2}"
> read -r SED <<SED
> :;s|${3}\(.*/[^/]*${5}\)|${4}\1|;t;:;s|\(${5}.*\)${3}|\1${4}|;t;s|^[0-9]*[\t]\(mv.*\)${5}|\1|p
> SED
> find . -name "*${3}*" -printf "%d\tmv %P ${5} %P\000" |
> sort -zg | sed -nz ${SED} | read -r ${6}
> echo <<EOF
> Prepared commands saved in variable: ${6}
> To view do: printf ${6} | tr "\000" "\n"
> To run do: sh <<EORUN
> $(printf ${6} | tr "\000" "\n")
> EORUN
> EOF
> }
% rm -rf "${UNNECESSARY:=/any/dirs/you/dont/want/moved}"
% time ( _mvnfind ${SRC=./test_tree} ${TGT=./mv_tree} \
> ${OLD=google} ${NEW=replacement_word} ${sed_sep=SsEeDd} \
> ${sh_io:=sh_io} ; printf %b\\000 "${sh_io}" | tr "\000" "\n" \
> | wc - ; echo ${sh_io} | tr "\000" "\n" | tail -n 2 )
<actual process time used:>
0.06s user 0.03s system 106% cpu 0.090 total
<output from wc:>
Lines Words Bytes
115 362 20691 -
<output from tail:>
mv .config/replacement_word-chrome-beta/Default/.../googlestars \
.config/replacement_word-chrome-beta/Default/.../replacement_wordstars
NOTE: The above function will likely require GNU versions of sed and find to properly handle the find printf and sed -z -e and :;recursive regex test;t calls. If these are not available to you the functionality can likely be duplicated with a few minor adjustments.
This should do everything you wanted from start to finish with very little fuss. I did fork with sed, but I was also practicing some sed recursive branching techniques so that's why I'm here. It's kind of like getting a discount haircut at a barber school, I guess. Here's the workflow:
rm -rf ${UNNECESSARY}
I intentionally left out any functional call that might delete or destroy data of any kind. You mention that ./app might be unwanted. Delete it or move it elsewhere beforehand, or, alternatively, you could build in a \( -path PATTERN -exec rm -rf \{\} \) routine to find to do it programmatically, but that one's all yours.
_mvnfind "${@}"
Declare its arguments and call the worker function. ${sh_io} is especially important in that it saves the return from the function. ${sed_sep} comes in a close second; this is an arbitrary string used to reference sed's recursion in the function. If ${sed_sep} is set to a value that could potentially be found in any of your path- or file-names acted upon... well, just don't let it be.
mv -n $1 $2
The whole tree is moved from the beginning. It will save a lot of headache; believe me. The rest of what you want to do - the renaming - is simply a matter of filesystem metadata. If you were, for instance, moving this from one drive to another, or across filesystem boundaries of any kind, you're better off doing so at once with one command. It's also safer. Note the -noclobber option set for mv; as written, this function will not put ${SRC_DIR} where a ${TGT_DIR} already exists.
read -r SED <<HEREDOC
I located all of sed's commands here to save on escaping hassles and read them into a variable to feed to sed below. Explanation below.
find . -name ${OLD} -printf
We begin the find process. With find we search only for anything that needs renaming because we already did all of the place-to-place mv operations with the function's first command. Rather than take any direct action with find, like an exec call, for instance, we instead use it to build out the command-line dynamically with -printf.
%dir-depth :tab: 'mv '%path-to-${SRC}' '${sed_sep}'%path-again :null delimiter:'
After find locates the files we need it directly builds and prints out (most) of the command we'll need to process your renaming. The %dir-depth tacked onto the beginning of each line will help to ensure we're not trying to rename a file or directory in the tree with a parent object that has yet to be renamed. find uses all sorts of optimization techniques to walk your filesystem tree and it is not a sure thing that it will return the data we need in a safe-for-operations order. This is why we next...
sort -general-numerical -zero-delimited
We sort all of find's output based on %directory-depth so that the paths nearest in relationship to ${SRC} are worked first. This avoids possible errors involving mving files into non-existent locations, and it minimizes the need for recursive looping (in fact, you might be hard-pressed to find a loop at all).
sed -ex :rcrs;srch|(save${sep}*til)${OLD}|\saved${SUBSTNEW}|;til ${OLD=0}
I think this is the only loop in the whole script, and it only loops over the second %Path printed for each string in case it contains more than one ${OLD} value that might need replacing. All other solutions I imagined involved a second sed process, and while a short loop may not be desirable, certainly it beats spawning and forking an entire process.
So basically what sed does here is search for ${sed_sep}, then, having found it, saves it and all characters it encounters until it finds ${OLD}, which it then replaces with ${NEW}. It then heads back to ${sed_sep} and looks again for ${OLD}, in case it occurs more than once in the string. If it is not found, it prints the modified string to stdout (which it then catches again next) and ends the loop.
This avoids having to parse the entire string, and ensures that the first half of the mv command string, which needs to include ${OLD} of course, does include it, and the second half is altered as many times as is necessary to wipe the ${OLD} name from mv's destination path.
sed -ex...-ex search|%dir_depth(save*)${sed_sep}|(only_saved)|out
The two -e expressions here run without a second fork. In the first, as we've seen, we modify the mv command as supplied by find's -printf function command as necessary to properly alter all references of ${OLD} to ${NEW}, but in order to do so we had to use some arbitrary reference points which should not be included in the final output. So once sed finishes all it needs to do, we instruct it to wipe out its reference points from the hold-buffer before passing it along.
AND NOW WE'RE BACK AROUND
read will receive a command that looks like this:
% mv /path2/$SRC/$OLD_DIR/$OLD_FILE /same/path_w/$NEW_DIR/$NEW_FILE \000
It will read it into ${msg} as ${sh_io} which can be examined at will outside of the function.
Cool.
-Mike
I was able to handle filenames with spaces by following the examples suggested by onitake.
This doesn't break if the path contains spaces or the string test:
find . -name "*_test.rb" -print0 | while IFS= read -r -d '' file
do
echo mv "$file" "$(echo "$file" | sed 's/_test\.rb$/_spec.rb/')"
done
This is an example that should work in all cases.
It works recursively, needs just the shell, and supports file names with spaces.
find spec -name "*_test.rb" -print0 | while IFS= read -r -d '' file; do mv "$file" "$(echo "$file" | sed 's/_test\.rb$/_spec.rb/')"; done
$ find spec -name "*_test.rb"
spec/dir2/a_test.rb
spec/dir1/a_test.rb
$ find spec -name "*_test.rb" | xargs -n 1 /usr/bin/perl -e '($new=$ARGV[0]) =~ s/test/spec/; system(qq(mv),qq(-v), $ARGV[0], $new);'
`spec/dir2/a_test.rb' -> `spec/dir2/a_spec.rb'
`spec/dir1/a_test.rb' -> `spec/dir1/a_spec.rb'
$ find spec -name "*_spec.rb"
spec/dir2/b_spec.rb
spec/dir2/a_spec.rb
spec/dir1/a_spec.rb
spec/dir1/c_spec.rb
Your question seems to be about sed, but to accomplish your goal of recursive rename, I'd suggest the following, shamelessly ripped from another answer I gave here: recursive rename in bash
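#!/bin/bash
IFS=$'\n'
function RecurseDirs
{
for f in "$@"
do
# compute the new name; names not ending in _test.rb come back unchanged
newf=$(echo "${f}" | sed -E 's/^(.*_)test\.rb$/\1spec.rb/')
if [[ "${f}" != "${newf}" ]]; then
echo "${f}" "${newf}"
mv "${f}" "${newf}"
f="${newf}"
fi
# descend into directories and process their entries
if [[ -d "${f}" ]]; then
cd "${f}"
RecurseDirs $(ls -1 ".")
fi
done
cd ..
}
RecurseDirs .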
More secure way of doing rename with find utils and sed regular expression type:
mkdir ~/practice
cd ~/practice
touch classic.txt.txt
touch folk.txt.txt
Remove the ".txt.txt" extension as follows -
cd ~/practice
find . -name "*txt" -execdir sh -c 'mv "$0" "$(echo "$0" | sed -r "s/\.[[:alnum:]]+\.[[:alnum:]]+\$//")"' {} \;
If you use + in place of ; in order to work in batch mode, the above command will rename only the first matching file of each batch, not the entire list of files matched by find, because the script only looks at $0:
find . -name "*txt" -execdir sh -c 'mv "$0" "$(echo "$0" | sed -r "s/\.[[:alnum:]]+\.[[:alnum:]]+\$//")"' {} +
Here's a nice oneliner that does the trick.
Sed can't handle this right, especially if multiple variables are passed by xargs with -n 2.
A bash substitution would handle this easily like:
find ./spec -type f -name "*_test.rb" -print0 | xargs -0 -I {} bash -c 'file=$1; mv "$file" "${file/_test.rb/_spec.rb}"' _ {}
Adding -type f will limit the move operations to files only, and -print0 will handle spaces in paths.
I share this post as it is a bit related to question. Sorry for not providing more details. Hope it helps someone else.
http://www.peteryu.ca/tutorials/shellscripting/batch_rename
This is my working solution:
for FILE in {{FILE_PATTERN}}; do echo ${FILE} | mv ${FILE} $(sed 's/{{SOURCE_PATTERN}}/{{TARGET_PATTERN}}/g'); done

Terminal multiple file count exclusion based on character in file names

I'm a fairly novice terminal user, and I would like to know how to make a script select files based on a specific character in their names so that it will exclude them from a check of how many files there are in a single folder, which must be displayed as a single number. The character in question is º.
This works no matter what the filenames contain:
count="$(find . -mindepth 1 -not -name '*º*' -exec printf x \; | wc -c)"
Test:
$ cd -- "$(mktemp -d)"
$ touch aº
$ touch b
$ find . -mindepth 1 -not -name '*º*' -exec printf x \; | wc -c
1
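If you need this more than once, the same trick can be wrapped in a function. This is only a sketch: count_without is a made-up name, and the added -maxdepth 1 is my assumption that "a single folder" means subfolders should not be descended into.

```shell
# count_without DIR STR: count the entries directly inside DIR whose names
# do not contain STR (STR is used inside a glob, so avoid * ? [ in it).
count_without() {
    # print one "x" per matching entry, then count the characters,
    # so odd file names (spaces, newlines) cannot skew the result
    n=$(find "$1" -mindepth 1 -maxdepth 1 ! -name "*$2*" -exec printf x \; | wc -c)
    # arithmetic expansion strips the padding some wc versions add
    echo "$((n))"
}
```

For the question's case that would be count_without . 'º'.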
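Just use filename expansion, also known as globbing. Note that the plain pattern *[!w]* is not enough: it matches every name that contains at least one character other than w, so a name such as "wa" still matches. With bash's extglob option you can negate a whole pattern instead:
shopt -s extglob
echo !(*w*)
will display a list of all the filenames in the current directory that do not include a w.
The * means "zero or more of any characters"
The !( ) contains a pattern; names matching it are excluded
To get a count:
shopt -s extglob nullglob
count=0
for fname in !(*w*)
do
(( count++ ))
done
echo "$count files without a 'w'"
I chose 'w' because it is a little easier to see and test. There are many other ways this could be done, including set, using an array, and the wc program.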

Get the newest directory to a variable in Bash

I would like to find the newest sub directory in a directory and save the result to variable in bash.
Something like this:
ls -t /backups | head -1 > $BACKUPDIR
Can anyone help?
BACKUPDIR=$(ls -td /backups/*/ | head -1)
$(...) evaluates the statement in a subshell and returns the output.
There is a simple solution to this using only ls:
BACKUPDIR=$(ls -td /backups/*/ | head -1)
-t orders by time (latest first)
-d lists the directories themselves rather than their contents
*/ only lists directories
head -1 returns the first item
I didn't know about */ until I found Listing only directories using ls in bash: An examination.
This is a pure Bash solution:
topdir=/backups
BACKUPDIR=
# Handle subdirectories beginning with '.', and empty $topdir
shopt -s dotglob nullglob
for file in "$topdir"/* ; do
[[ -L $file || ! -d $file ]] && continue
[[ -z $BACKUPDIR || $file -nt $BACKUPDIR ]] && BACKUPDIR=$file
done
printf 'BACKUPDIR=%q\n' "$BACKUPDIR"
It skips symlinks, including symlinks to directories, which may or may not be the right thing to do. It skips other non-directories. It handles directories whose names contain any characters, including newlines and leading dots.
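Well, I think this solution is the most efficient, provided the directory names sort chronologically (for example date-stamped names), because it takes the last entry in glob order rather than comparing timestamps: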
path="/my/dir/structure/*"
backupdir=$(find $path -type d -prune | tail -n 1)
Explanation why this is a little better:
We do not need sub-shells (aside from the one for getting the result into the bash variable).
We do not need a useless -exec ls -d at the end of the find command, it already prints the directory listing.
We can easily alter this, e.g. to exclude certain patterns. For example, if you want the second newest directory, because backup files are first written to a tmp dir in the same path:
backupdir=$(find $path -type d -prune -not -name "*temp_dir" | tail -n 1)
The above solution doesn't take into account things like files being written and removed from the directory resulting in the upper directory being returned instead of the newest subdirectory.
The other issue is that this solution assumes that the directory only contains other directories and not files being written.
Let's say I create a file called "test.txt" and then run this command again:
echo "test" > test.txt
ls -t /backups | head -1
test.txt
The result is test.txt showing up instead of the last modified directory.
The proposed solution "works" but only in the best case scenario.
Assuming you have a maximum of 1 directory depth, a better solution is to use:
find /backups/* -type d -prune -exec ls -d {} \; |tail -1
Just swap the "/backups/" portion for your actual path.
If you want to avoid showing an absolute path in a bash script, you could always use something like this:
LOCALPATH=/backups
DIRECTORY=$(cd $LOCALPATH; find * -type d -prune -exec ls -d {} \; |tail -1)
With GNU find you can get list of directories with modification timestamps, sort that list and output the newest:
find . -mindepth 1 -maxdepth 1 -type d -printf "%T@\t%p\0" | sort -z -n | cut -z -f2- | tail -z -n1
or newline separated
find . -mindepth 1 -maxdepth 1 -type d -printf "%T@\t%p\n" | sort -n | cut -f2- | tail -n1
With POSIX find (that does not have -printf) you may, if you have it, run stat to get file modification timestamp:
find . -mindepth 1 -maxdepth 1 -type d -exec stat -c '%Y %n' {} \; | sort -n | cut -d' ' -f2- | tail -n1
Without stat a pure shell solution may be used by replacing [[ bash extension with [ as in this answer.
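A rough sketch of what such a [ based variant might look like follows. newest_subdir is a hypothetical name; the -nt test is supported by bash, dash and ksh, although older POSIX editions do not require it, and unlike the Bash version this one ignores directories whose names start with a dot.

```shell
# newest_subdir DIR: print the most recently modified subdirectory of DIR
# (prints an empty line if there is none).
newest_subdir() {
    newest=
    for d in "$1"/*/; do
        [ -d "$d" ] || continue      # an unmatched glob stays literal; skip it
        d=${d%/}                     # drop the trailing slash added by the glob
        [ -L "$d" ] && continue      # skip symlinks to directories
        if [ -z "$newest" ] || [ "$d" -nt "$newest" ]; then
            newest=$d
        fi
    done
    printf '%s\n' "$newest"
}
```

Usage would be along the lines of BACKUPDIR=$(newest_subdir /backups).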
Your "something like this" was almost a hit:
BACKUPDIR=$(ls -t ./backups | head -1)
Combining what you wrote with what I have learned solved my problem too. Thank you for raising this question.
Note: I run the line above from GitBash within Windows environment in file called ./something.bash.

Recursively rename files using find and sed

I want to go through a bunch of directories and rename all files that end in _test.rb to end in _spec.rb instead. It's something I've never quite figured out how to do with bash so this time I thought I'd put some effort in to get it nailed. I've so far come up short though, my best effort is:
find spec -name "*_test.rb" -exec echo mv {} `echo {} | sed s/test/spec/` \;
NB: there's an extra echo after exec so that the command is printed instead of run while I'm testing it.
When I run it the output for each matched filename is:
mv original original
i.e. the substitution by sed has been lost. What's the trick?
To solve it in a way most close to the original problem would be probably using xargs "args per command line" option:
find . -name "*_test.rb" | sed -e "p;s/test/spec/" | xargs -n2 mv
It finds the files in the current working directory recursively, echoes the original file name (p) and then a modified name (s/test/spec/) and feeds it all to mv in pairs (xargs -n2). Beware that in this case the path itself shouldn't contain a string test.
This happens because sed receives the string {} as input, as can be verified with:
find . -exec echo `echo "{}" | sed 's/./foo/g'` \;
which prints foofoo for each file in the directory, recursively. The reason for this behavior is that the pipeline is executed once, by the shell, when it expands the entire command.
There is no way of quoting the sed pipeline in such a way that find will execute it for every file, since find doesn't execute commands via the shell and has no notion of pipelines or backquotes. The GNU findutils manual explains how to perform a similar task by putting the pipeline in a separate shell script:
#!/bin/sh
echo "$1" | sed 's/_test.rb$/_spec.rb/'
(There may be some perverse way of using sh -c and a ton of quotes to do all this in one command, but I'm not going to try.)
you might want to consider other way like
for file in $(find . -name "*_test.rb")
do
echo mv $file `echo $file | sed s/_test.rb$/_spec.rb/`
done
I find this one shorter
find . -name '*_test.rb' -exec bash -c 'echo mv $0 ${0/test.rb/spec.rb}' {} \;
You can do it without sed, if you want:
for i in `find -name '*_test.rb'` ; do mv $i ${i%%_test.rb}_spec.rb ; done
${var%%suffix} strips suffix from the value of var.
or, to do it using sed:
for i in `find -name '*_test.rb'` ; do mv $i `echo $i | sed 's/test/spec/'` ; done
You mention that you are using bash as your shell, in which case you don't actually need find and sed to achieve the batch renaming you're after...
Assuming you are using bash as your shell:
$ echo $SHELL
/bin/bash
$ _
... and assuming you have enabled the so-called globstar shell option:
$ shopt -p globstar
shopt -s globstar
$ _
... and finally assuming you have installed the rename utility (found in the util-linux-ng package)
$ which rename
/usr/bin/rename
$ _
... then you can achieve the batch renaming in a bash one-liner as follows:
$ rename _test _spec **/*_test.rb
(the globstar shell option will ensure that bash finds all matching *_test.rb files, no matter how deeply they are nested in the directory hierarchy... use help shopt to find out how to set the option)
The easiest way:
find . -name "*_test.rb" | xargs rename s/_test/_spec/
The fastest way (assuming you have 4 processors):
find . -name "*_test.rb" | xargs -P 4 rename s/_test/_spec/
If you have a large number of files to process, it is possible that the list of filenames piped to xargs would cause the resulting command line to exceed the maximum length allowed.
You can check your system's limit using getconf ARG_MAX
On most linux systems you can use free -b or cat /proc/meminfo to find how much RAM you have to work with; Otherwise, use top or your systems activity monitor app.
A safer way (assuming you have 1000000 bytes of ram to work with):
find . -name "*_test.rb" | xargs -s 1000000 rename s/_test/_spec/
Here is what worked for me when the file names had spaces in them. The example below recursively renames all .dar files to .zip files:
find . -name "*.dar" -exec bash -c 'mv "$0" "`echo \"$0\" | sed s/.dar/.zip/`"' {} \;
For this you don't need sed. You can perfectly get alone with a while loop fed with the result of find through a process substitution.
So if you have a find expression that selects the needed files, then use the syntax:
while IFS= read -r file; do
echo "mv $file ${file%_test.rb}_spec.rb" # remove "echo" when OK!
done < <(find -name "*_test.rb")
This will find the files and rename all of them, stripping the string _test.rb from the end and appending _spec.rb.
For this step we use Shell Parameter Expansion, where ${var%string} removes the shortest match of the pattern "string" from the end of $var.
$ file="HELLOa_test.rbBYE_test.rb"
$ echo "${file%_test.rb}" # remove _test.rb from the end
HELLOa_test.rbBYE
$ echo "${file%_test.rb}_spec.rb" # remove _test.rb and append _spec.rb
HELLOa_test.rbBYE_spec.rb
See an example:
$ tree
.
├── ab_testArb
├── a_test.rb
├── a_test.rb_test.rb
├── b_test.rb
├── c_test.hello
├── c_test.rb
└── mydir
    └── d_test.rb
$ while IFS= read -r file; do echo "mv $file ${file%_test.rb}_spec.rb"; done < <(find -name "*_test.rb")
mv ./b_test.rb ./b_spec.rb
mv ./mydir/d_test.rb ./mydir/d_spec.rb
mv ./a_test.rb ./a_spec.rb
mv ./c_test.rb ./c_spec.rb
If you have Ruby (1.9+):
ruby -e 'Dir["**/*_test.rb"].each{|x|test(?f,x) and File.rename(x,x.sub(/_test\.rb\z/,"_spec.rb") ) }'
In ramtam's answer which I like, the find portion works OK but the remainder does not if the path has spaces. I am not too familiar with sed, but I was able to modify that answer to:
find . -name "*_test.rb" | perl -pe 's/^((.*_)test.rb)$/"\1" "\2spec.rb"/' | xargs -n2 mv
I really needed a change like this because in my use case the final command looks more like
find . -name "olddir" | perl -pe 's/^((.*)olddir)$/"\1" "\2new directory"/' | xargs -n2 mv
I haven't the heart to do it all over again, but I wrote this in answer to Commandline Find Sed Exec. There the asker wanted to know how to move an entire tree, possibly excluding a directory or two, and rename all files and directories containing the string "OLD" to instead contain "NEW".
Besides describing the how with painstaking verbosity below, this method may also be unique in that it incorporates built-in debugging. It basically doesn't do anything at all as written except compile and save to a variable all commands it believes it should do in order to perform the work requested.
It also explicitly avoids loops as much as possible. Besides the sed recursive search for more than one match of the pattern there is no other recursion as far as I know.
And last, this is entirely null delimited - it doesn't trip on any character in any filename except the null. I don't think you should have that.
By the way, this is REALLY fast. Look:
% _mvnfind() { mv -n "${1}" "${2}" && cd "${2}"
> read -r SED <<SED
> :;s|${3}\(.*/[^/]*${5}\)|${4}\1|;t;:;s|\(${5}.*\)${3}|\1${4}|;t;s|^[0-9]*[\t]\(mv.*\)${5}|\1|p
> SED
> find . -name "*${3}*" -printf "%d\tmv %P ${5} %P\000" |
> sort -zg | sed -nz ${SED} | read -r ${6}
> cat <<EOF
> Prepared commands saved in variable: ${6}
> To view do: printf ${6} | tr "\000" "\n"
> To run do: sh <<EORUN
> $(printf ${6} | tr "\000" "\n")
> EORUN
> EOF
> }
% rm -rf "${UNNECESSARY:=/any/dirs/you/dont/want/moved}"
% time ( _mvnfind ${SRC=./test_tree} ${TGT=./mv_tree} \
> ${OLD=google} ${NEW=replacement_word} ${sed_sep=SsEeDd} \
> ${sh_io:=sh_io} ; printf %b\\000 "${sh_io}" | tr "\000" "\n" \
> | wc - ; echo ${sh_io} | tr "\000" "\n" | tail -n 2 )
<actual process time used:>
0.06s user 0.03s system 106% cpu 0.090 total
<output from wc:>
Lines Words Bytes
115 362 20691 -
<output from tail:>
mv .config/replacement_word-chrome-beta/Default/.../googlestars \
.config/replacement_word-chrome-beta/Default/.../replacement_wordstars
NOTE: The above function will likely require GNU versions of sed and find to properly handle the find printf and sed -z -e and :;recursive regex test;t calls. If these are not available to you the functionality can likely be duplicated with a few minor adjustments.
This should do everything you wanted from start to finish with very little fuss. I did fork with sed, but I was also practicing some sed recursive branching techniques so that's why I'm here. It's kind of like getting a discount haircut at a barber school, I guess. Here's the workflow:
rm -rf ${UNNECESSARY}
I intentionally left out any functional call that might delete or destroy data of any kind. You mention that ./app might be unwanted. Delete it or move it elsewhere beforehand, or, alternatively, you could build in a \( -path PATTERN -exec rm -rf \{\} \) routine to find to do it programmatically, but that one's all yours.
_mvnfind "${@}"
Declare its arguments and call the worker function. ${sh_io} is especially important in that it saves the return from the function. ${sed_sep} comes in a close second; this is an arbitrary string used to reference sed's recursion in the function. If ${sed_sep} is set to a value that could potentially be found in any of your path- or file-names acted upon... well, just don't let it be.
mv -n $1 $2
The whole tree is moved from the beginning. It will save a lot of headache; believe me. The rest of what you want to do - the renaming - is simply a matter of filesystem metadata. If you were, for instance, moving this from one drive to another, or across filesystem boundaries of any kind, you're better off doing so at once with one command. It's also safer. Note the -n (no-clobber) option set for mv; as written, this function will not put ${SRC_DIR} where a ${TGT_DIR} already exists.
read -r SED <<HEREDOC
I located all of sed's commands here to save on escaping hassles and read them into a variable to feed to sed below. Explanation below.
find . -name ${OLD} -printf
We begin the find process. With find we search only for anything that needs renaming because we already did all of the place-to-place mv operations with the function's first command. Rather than take any direct action with find, like an exec call, for instance, we instead use it to build out the command-line dynamically with -printf.
%dir-depth :tab: 'mv '%path-to-${SRC}' '${sed_sep}'%path-again :null delimiter:'
After find locates the files we need it directly builds and prints out (most) of the command we'll need to process your renaming. The %dir-depth tacked onto the beginning of each line will help to ensure we're not trying to rename a file or directory in the tree with a parent object that has yet to be renamed. find uses all sorts of optimization techniques to walk your filesystem tree and it is not a sure thing that it will return the data we need in a safe-for-operations order. This is why we next...
sort -general-numerical -zero-delimited
We sort all of find's output based on %directory-depth so that the paths nearest in relationship to ${SRC} are worked first. This avoids possible errors involving mving files into non-existent locations, and it minimizes the need for recursive looping. (In fact, you might be hard-pressed to find a loop at all.)
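For comparison, here is a sketch of a different way to get a safe ordering. It is not the sort-by-depth pipeline described above: it relies on find -depth, which lists a directory's contents before the directory itself, so each mv only has to change the last path component. rename_tree is a hypothetical helper name, it assumes bash plus a find with -print0 and -mindepth, and OLD is used as a glob fragment in -name, so glob characters in it are not taken literally there.

```bash
# Sketch (a different technique from the sort-by-depth pipeline):
# find -depth lists children before their parent directory, so renaming
# only the last path component never invalidates a path still to come.
rename_tree() {   # usage: rename_tree DIR OLD NEW  (hypothetical helper)
    local dir=$1 old=$2 new=$3 path base newbase
    # -mindepth 1 skips DIR itself, so every path contains a slash.
    find "$dir" -mindepth 1 -depth -name "*${old}*" -print0 |
    while IFS= read -r -d '' path; do
        base=${path##*/}                 # last path component
        newbase=${base//"$old"/$new}     # replace every OLD in it
        mv -n -- "$path" "${path%/*}/$newbase"
    done
}
```

For example, rename_tree ./mv_tree google replacement_word would rename files and directories bottom-up inside ./mv_tree.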
sed -ex :rcrs;srch|(save${sep}*til)${OLD}|\saved${SUBSTNEW}|;til ${OLD=0}
I think this is the only loop in the whole script, and it only loops over the second %Path printed for each string in case it contains more than one ${OLD} value that might need replacing. All other solutions I imagined involved a second sed process, and while a short loop may not be desirable, certainly it beats spawning and forking an entire process.
So basically what sed does here is search for ${sed_sep}, then, having found it, saves it and all characters it encounters until it finds ${OLD}, which it then replaces with ${NEW}. It then heads back to ${sed_sep} and looks again for ${OLD}, in case it occurs more than once in the string. If it is not found, it prints the modified string to stdout (which it then catches again next) and ends the loop.
This avoids having to parse the entire string, and ensures that the first half of the mv command string, which needs to include ${OLD} of course, does include it, and the second half is altered as many times as is necessary to wipe the ${OLD} name from mv's destination path.
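The looping idea can be tried in isolation. This is a simplified sketch, not the exact expression from the function: SEP, OLD and NEW are placeholder strings, and rewrite_after_sep is a made-up name. The t command jumps back to the label for as long as the preceding s command keeps making a replacement.

```bash
# Sketch of a sed "t" loop: replace every OLD that follows a marker.
# SEP, OLD and NEW are placeholder strings chosen for this example.
rewrite_after_sep() {
    sed -e ':again' \
        -e 's/\(SEP.*\)OLD/\1NEW/' \
        -e 't again' \
        -e 's/ SEP / /'
}
echo 'mv a/OLD/OLD.txt SEP a/OLD/OLD.txt' | rewrite_after_sep
# prints: mv a/OLD/OLD.txt a/NEW/NEW.txt
```

Everything before the marker is left untouched, and the last expression removes the marker once the loop is done.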
sed -ex...-ex search|%dir_depth(save*)${sed_sep}|(only_saved)|out
The two -exec calls here happen without a second fork. In the first, as we've seen, we modify the mv command as supplied by find's -printf function command as necessary to properly alter all references of ${OLD} to ${NEW}, but in order to do so we had to use some arbitrary reference points which should not be included in the final output. So once sed finishes all it needs to do, we instruct it to wipe out its reference points from the hold-buffer before passing it along.
AND NOW WE'RE BACK AROUND
read will receive a command that looks like this:
% mv /path2/$SRC/$OLD_DIR/$OLD_FILE /same/path_w/$NEW_DIR/$NEW_FILE \000
It will read it into ${msg} as ${sh_io} which can be examined at will outside of the function.
Cool.
-Mike
I was able to handle filenames with spaces by following the examples suggested by onitake.
This doesn't break if the path contains spaces or the string test:
find . -name "*_test.rb" -print0 | while IFS= read -r -d '' file
do
echo mv "$file" "$(echo "$file" | sed 's/_test\.rb$/_spec.rb/')"
done
This is an example that should work in all cases.
Works recursively, needs just the shell, and supports file names with spaces.
find spec -name "*_test.rb" -print0 | while IFS= read -r -d '' file; do mv "$file" "$(echo "$file" | sed 's/_test\.rb$/_spec.rb/')"; done
$ find spec -name "*_test.rb"
spec/dir2/a_test.rb
spec/dir1/a_test.rb
$ find spec -name "*_test.rb" | xargs -n 1 /usr/bin/perl -e '($new=$ARGV[0]) =~ s/test/spec/; system(qq(mv),qq(-v), $ARGV[0], $new);'
`spec/dir2/a_test.rb' -> `spec/dir2/a_spec.rb'
`spec/dir1/a_test.rb' -> `spec/dir1/a_spec.rb'
$ find spec -name "*_spec.rb"
spec/dir2/b_spec.rb
spec/dir2/a_spec.rb
spec/dir1/a_spec.rb
spec/dir1/c_spec.rb
Your question seems to be about sed, but to accomplish your goal of recursive rename, I'd suggest the following, shamelessly ripped from another answer I gave here: recursive rename in bash
#!/bin/bash
IFS=$'\n'
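function RecurseDirs
{
    for f in "$@"
    do
        # Compute the new name; -E enables the unescaped ( ) group syntax.
        newf=$(echo "${f}" | sed -E -e 's/^(.*_)test\.rb$/\1spec.rb/')
        # Rename only when the name actually changes.
        if [[ "${f}" != "${newf}" ]]; then
            echo "${f}" "${newf}"
            mv "${f}" "${newf}"
        fi
        f="${newf}"
        # Descend into directories and process their contents.
        if [[ -d "${f}" ]]; then
            cd "${f}"
            RecurseDirs $(ls -1 ".")
        fi
    done
    cd ..
}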
RecurseDirs .
A more secure way to rename, using findutils and a sed extended regular expression:
mkdir ~/practice
cd ~/practice
touch classic.txt.txt
touch folk.txt.txt
Remove the ".txt.txt" extension as follows -
cd ~/practice
find . -name "*txt" -execdir sh -c 'mv "$0" "$(echo "$0" | sed -r "s/\.[[:alnum:]]+\.[[:alnum:]]+\$//")"' {} \;
If you use + in place of ; to run in batch mode, the command above renames only the first matching file rather than the whole list found by find, because the inline script reads only "$0":
find . -name "*txt" -execdir sh -c 'mv "$0" "$(echo "$0" | sed -r "s/\.[[:alnum:]]+\.[[:alnum:]]+\$//")"' {} +
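Batch mode can be made to work if the inline script loops over all of its arguments. A minimal sketch, with two liberties taken: it uses -exec rather than -execdir, and it strips the literal .txt.txt suffix with parameter expansion instead of sed. strip_double_ext is a made-up helper name.

```bash
# Sketch: batch mode with "+" passes many file names to one sh process,
# so the inline script must loop over "$@" instead of reading only "$0".
# strip_double_ext is a hypothetical helper name.
strip_double_ext() {
    find "${1:-.}" -type f -name '*.txt.txt' -exec sh -c '
        for f in "$@"; do
            mv -- "$f" "${f%.txt.txt}"
        done
    ' sh {} +
}
```

The extra sh after the script fills $0, so the file names start at $1 and all of them end up in "$@". Running strip_double_ext ~/practice would turn classic.txt.txt into classic and folk.txt.txt into folk.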
Here's a nice oneliner that does the trick.
Sed can't handle this right, especially if multiple variables are passed by xargs with -n 2.
A bash substitution would handle this easily, like:
find ./spec -type f -name "*_test.rb" -print0 | xargs -0 -I {} bash -c 'file=$1; mv "$file" "${file/_test.rb/_spec.rb}"' _ {}
Adding -type f will limit the move operations to files only; -print0 will handle spaces in paths.
I'm sharing this post as it is somewhat related to the question. Sorry for not providing more details. Hope it helps someone else.
http://www.peteryu.ca/tutorials/shellscripting/batch_rename
This is my working solution:
for FILE in {{FILE_PATTERN}}; do echo "${FILE}" | mv "${FILE}" "$(sed 's/{{SOURCE_PATTERN}}/{{TARGET_PATTERN}}/g')"; done

Resources