Bash find: exec in reverse order - bash

I am iterating over files like so:
find $directory -type f -exec codesign {} \;
Now the problem here is that files higher up in the hierarchy are signed first.
Is there a way to iterate over a directory tree and handle the deepest files first?
So that
/My/path/to/app/bin
is handled before
/My/path/mainbin

Yes, just use -depth:
-depth
The primary shall always evaluate as true; it shall cause descent of the directory hierarchy to be done so that all entries in a directory are acted on before the directory itself. If a -depth primary is not specified, all entries in a directory shall be acted on after the directory itself. If any -depth primary is specified, it shall apply to the entire expression even if the -depth primary would not normally be evaluated.
For example:
$ mkdir -p top/a/b/c/d/e/f/g/h
$ find top -print
top
top/a
top/a/b
top/a/b/c
top/a/b/c/d
top/a/b/c/d/e
top/a/b/c/d/e/f
top/a/b/c/d/e/f/g
top/a/b/c/d/e/f/g/h
$ find top -depth -print
top/a/b/c/d/e/f/g/h
top/a/b/c/d/e/f/g
top/a/b/c/d/e/f
top/a/b/c/d/e
top/a/b/c/d
top/a/b/c
top/a/b
top/a
top
Note that at a particular level, ordering is still arbitrary.
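Applied to the command in the question, that would be something along the lines of (quoting $directory guards against spaces in the path):
find "$directory" -depth -type f -exec codesign {} \;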

Using GNU utilities and the decorate-sort-undecorate pattern (aka Schwartzian transform):
find . -type f -printf '%d %p\0' |
sort -znr |
sed -z 's/[0-9]* //' |
xargs -0 -I@ echo codesign @
Drop the echo if the output looks ok.

Using find's -depth option (as in my other answer), or a naive sort (as in some other answers), only ensures that the sub-directories of a directory are processed before the directory itself, not that the deepest level overall is processed first.
For example:
$ mkdir -p top/a/b/d/f/h top/a/c/e/g
$ find top -depth -print
top/a/c/e/g
top/a/c/e
top/a/c
top/a/b/d/f/h
top/a/b/d/f
top/a/b/d
top/a/b
top/a
top
For overall deepest level to be processed first, the ordering should be something like:
top/a/b/d/f/h
top/a/c/e/g
top/a/b/d/f
top/a/c/e
top/a/b/d
top/a/c
top/a/b
top/a
top
To determine this ordering, the entire list must be known first, and then the number of levels (i.e. the number of / separators) in each path counted to enable ranking.
A simple-ish Perl script (assigned to a shell function for this example) to do this ordering is:
$ dsort(){
    perl -ne '
        BEGIN { $/ = "\0" }    # null-delimited i/o
        $fname[$.] = $_;
        $depth[$.] = tr|/||;   # count the "/" characters, i.e. the depth
        END {
            print
                map  { $fname[$_] }
                sort { $depth[$b] <=> $depth[$a] }
                keys @fname
        }
    '
}
Then:
$ find top -print0 | dsort | xargs -0 -I@ echo @
top/a/b/d/f/h
top/a/c/e/g
top/a/b/d/f
top/a/c/e
top/a/b/d
top/a/c
top/a/b
top/a
top

How about sorting the output of find in descending order:
while IFS= read -d "" -r f; do
    codesign "$f"
done < <(find "$directory" -type f -print0 | sort -zr)
<(command ...) is a process substitution, which feeds the output of the command to the read command in the while loop via the redirection.
The -print0, sort -z and read -d "" combination uses a null character as the filename delimiter. This protects filenames that contain special characters such as whitespace.

I don't know of a native way in find, but you can pipe its output into a loop and process it line by line like this:
find . | while IFS= read -r file; do echo filename: "$file"; done
In your case, if you are happy just reversing the output of find, you could go with something like:
find "$directory" -type f | tac | while IFS= read -r file; do codesign "$file"; done

Related

delete all but the last match

I want to delete all but the last match of a set of files matching file* that are present in each folder within a directory.
For example:
Folder 1
file
file_1-1
file_1-2
file_2-1
stuff.txt
stuff
Folder 2
file_1-1
file_1-2
file_1-3
file_2-1
file_2-2
stuff.txt
Folder 3
...
and so on. Within every subfolder I want to keep only the last of the matched files, so for Folder 1 this would be file_2-1, in Folder 2 it would be file_2-2. The number of files is generally different within each subfolder.
Since I have a deeply nested folder structure, I thought about using the find command, somehow like this
find . -type f -name "file*" -delete_all_but_last_match
I know how to delete all matches but not how to exclude the last match.
I also found the following piece of code:
https://askubuntu.com/questions/1139051/how-to-delete-all-but-x-last-items-from-find
but when I apply a modified version to a test folder
find . -type f -name "file*" -print0 | head -zn-1 | xargs -0 rm -rf
it deletes all the matches in most cases, only in some the last file is spared. So it does not work for me, presumably because of the different number of files in each folder.
Edit:
The folders contain no further subfolders, but they are generally at the end of several subfolder levels. It would therefore be a benefit if the script can be executed some levels above as well.
#!/bin/bash
shopt -s globstar
for dir in **/; do
    files=("$dir"file*)
    unset 'files[-1]'    # drop the last match from the list so that it is kept
    rm "${files[@]}"
done
Try the following solution utilising awk (GNU awk 4+, which supports arrays of arrays) and xargs:
find . -type f -name "file*" | awk -F/ '{ map1[$(NF-1)]++;map[$(NF-1)][map1[$(NF-1)]]=$0 }END { for ( i in map ) { for (j=1;j<=(map1[i]-1);j++) { print "\""map[i][j]"\"" } } }' | xargs rm
Explanation:
find . -type f -name "file*" | awk -F/ '{   # Set the field delimiter to "/" in awk
  map1[$(NF-1)]++;                          # Array map1, indexed by the sub-directory, holds an incrementing counter (the number of files seen in that sub-directory)
  map[$(NF-1)][map1[$(NF-1)]]=$0            # Two-dimensional array map, indexed by sub-directory and file count, with the whole line as the value
}
END {
  for ( i in map ) {
    for (j=1;j<=(map1[i]-1);j++) {
      print "\""map[i][j]"\""               # Loop through map, using map1 to stop before the last file in each sub-directory, and print the results
    }
  }
}' | xargs rm                               # Run the result through xargs rm
Remove the pipe to xargs to verify that the files are listed as expected before adding it back in to actually remove the files.

list subdirectories recursively

I want to write a shell script so as to recursively list all different subdirectories which contain NEW subdirectory (NEW is a fixed name).
Dir1/Subdir1/Subsub1/NEW/1.jpg
Dir1/Subdir1/Subsub1/NEW/2.jpg
Dir1/Subdir1/Subsub2/NEW/3.jpg
Dir1/Subdir1/Subsub2/NEW/4.jpg
Dir1/Subdir2/Subsub3/NEW/5.jpg
Dir1/Subdir2/Subsub4/NEW/6.jpg
I want to get
Dir1/Subdir1/Subsub1
Dir1/Subdir1/Subsub2
Dir1/Subdir2/Subsub3
Dir1/Subdir2/Subsub4
How can I do that?
find . -type d -name NEW | sed 's|/NEW$||'
--- EDIT ---
Regarding your comment: sed does not have a -print0 option. There are various ways of doing this (most of which are wrong). One possible solution would be:
find . -type d -name NEW -print0 |
    while IFS= read -rd '' subdirNew; do
        subdir=$(sed 's|/NEW$||' <<< "$subdirNew")
        echo "$subdir"
    done
which should be tolerant of spaces and newlines in the filename
ls -R will list things recursively, or find . | grep "/NEW/" should give you the type of list you are looking for.
You could try this:
find . -type d -name "NEW" -exec dirname {} \;

Recursively check length of directory name

I need to determine if there are any directory names > 31 characters in a given directory (i.e. look underneath that root).
I know I can use something like find /path/to/root/dir -type d >> dirnames.txt
This will give me a text file of complete paths.
What I need is to get the actual number of characters in each directory name. Not sure if parsing the above results with sed or awk makes sense. Looking for ideas/thoughts/suggestions/tips on how to accomplish this. Thanks!
This short script does it all in one go, i.e. finds all directory names and then outputs any which are greater than 31 characters in length (along with their length in characters):
for d in `find /path/to/root/dir -type d -exec basename {} \;` ; do
    len=${#d}    # length in characters (avoids the trailing newline that echo | wc -c would also count)
    if [ "$len" -gt 31 ] ; then
        echo "$d = $len characters"
    fi
done
Using your dirnames.txt file created by your find cmd, you can then sort the data by length of pathname, i.e.
awk '{print length($0) "\t" $0}' dirnames.txt | sort -k1,1nr > dirNamesWithSize.txt
This will present the longest path names (based on the value of length) at the top of the file.
I hope this helps.
Try this
find . -type d -exec bash -c '[ $(wc -c <<<"${1##*/}") -gt 32 ] && echo "${1}"' -- {} \; 2>/dev/null
The one bug, which I consider minor, is that it will over-count directory name length by 1 every time.
If what you wanted was the whole path rather than the last path component, then use this:
find . -type d | sed -e '/.\{32,\}/!d'
This version also has a bug, but only when file names have embedded newlines.
The output of both commands is a list of file names which match the criteria. Counting the length of each one is trivial from there.
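For example, to print each qualifying path together with its length (a quick sketch built on the second command; it measures the whole path, not just the last component):
find . -type d | sed -e '/.\{32,\}/!d' | awk '{ print length($0), $0 }'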

Using bash: how do you find a way to search through all your files in a directory (recursively?)

I need a command that will help me accomplish what I am trying to do. At the moment, I am looking for all the ".html" files in a given directory, and seeing which ones contain the string "jacketprice" in any of them.
Is there a way to do this? And also, for the second (but separate) command, I will need a way to replace every instance of "jacketprice" with "coatprice", all in one command or script. If this is feasible feel free to let me know. Thanks
find . -name "*.html" -exec grep -l jacketprice {} \;
for i in `find . -name "*.html"`
do
    sed -i "s/jacketprice/coatprice/g" "$i"
done
As for the second question,
find . -name "*.html" -exec sed -i "s/jacketprice/coatprice/g" {} \;
Use recursive grep to search through your files:
grep -r --include="*.html" jacketprice /my/dir
Alternatively turn on bash's globstar feature (if you haven't already), which allows you to use **/ to match directories and sub-directories.
$ shopt -s globstar
$ cd /my/dir
$ grep jacketprice **/*.html
$ sed -i 's/jacketprice/coatprice/g' **/*.html
Depending on whether you want this recursively or not, perl is a good option:
Find, non-recursive:
perl -nwe 'print "Found $_ in file $ARGV\n" if /jacketprice/' *.html
Will print the line where the match is found, followed by the file name. Can get a bit verbose.
Replace, non-recursive:
perl -pi.bak -we 's/jacketprice/coatprice/g' *.html
Will store original with .bak extension tacked on.
Find, recursive:
perl -MFile::Find -nwE '
BEGIN { find(sub { /\.html$/i && push @ARGV, $File::Find::name }, '/dir'); };
say $ARGV if /jacketprice/'
It will print the file name for each match. Somewhat less verbose might be:
perl -MFile::Find -nwE '
BEGIN { find(sub { /\.html$/i && push @ARGV, $File::Find::name }, '/dir'); };
$found{$ARGV}++ if /jacketprice/; END { say for keys %found }'
Replace, recursive:
perl -MFile::Find -pi.bak -we '
BEGIN { find(sub { /\.html$/i && push @ARGV, $File::Find::name }, '/dir'); };
s/jacketprice/coatprice/g'
Note: In all recursive versions, /dir is the directory at the top of the tree you wish to search. Also, if your perl version is less than 5.10, say can be replaced with print followed by a newline, e.g. print "$_\n" for keys %found.

How can I copy all my disorganized files into a single directory? (on linux)

I have thousands of mp3s inside a complex folder structure which resides within a single folder. I would like to move all the mp3s into a single directory with no subfolders. I can think of a variety of ways of doing this using the find command but one problem will be duplicate file names. I don't want to replace files since I often have multiple versions of a same song. Auto-rename would be best. I don't really care how the files are renamed.
Does anyone know a simple and safe way of doing this?
You could change a path like a/b/c.mp3 into a - b - c.mp3 when copying. Here's a solution in Bash:
find srcdir -name '*.mp3' -printf '%P\n' |
while IFS= read -r i; do
    j="${i//\// - }"
    cp -v "srcdir/$i" "dstdir/$j"
done
And in a shell without ${//} substitution:
find srcdir -name '*.mp3' -printf '%P\n' |
sed -e 'p;s:/: - :g' |
while IFS= read -r i; do
    read -r j
    cp -v "srcdir/$i" "dstdir/$j"
done
For a different scheme, GNU's cp and mv can make numbered backups instead of overwriting -- see -b/--backup[=CONTROL] in the man pages.
find srcdir -name '*.mp3' -exec cp -v --backup=numbered {} dstdir/ \;
bash like pseudocode:
find . -name "*.mp3" | while IFS= read -r i; do
    new_name=$(basename "$i")
    x=0
    while [ -e "move_to_dir/$new_name" ]; do
        x=$((x + 1))
        new_name="$(basename "$i").$x"
    done
    mv "$i" "move_to_dir/$new_name"
done
#!/bin/bash
NEW_DIR=/tmp/new/
IFS="
"
for a in `find . -type f`
do
    echo "$a"
    new_name="`basename "$a"`"
    while test -e "$NEW_DIR/$new_name"
    do
        new_name="${new_name}_"    # append "_" until the name is unique
    done
    cp "$a" "$NEW_DIR/$new_name"
done
I'd tend to do this in a simple script rather than try to fit it into a single command line.
For instance, in python, it would be relatively trivial to do a walk() through the directory, copying each mp3 file found to a different directory with an automatically incremented number.
If you want to get fancier, you could have a dictionary of existing file names, and simply append a number to the duplicates. (the index of the dictionary being the file name, and the value being the number of files found so far, which would become your suffix)
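For illustration, here is a rough Bash sketch of that dictionary idea (a sketch only, not a drop-in script; srcdir and dstdir are placeholders, and it relies on bash 4+ associative arrays):
#!/bin/bash
# Keep a counter per base name and suffix duplicates with _1, _2, ...
declare -A count
find srcdir -type f -name '*.mp3' -print0 |
while IFS= read -r -d '' f; do
    base=$(basename "$f")
    if [[ -n "${count[$base]}" ]]; then
        count[$base]=$(( ${count[$base]} + 1 ))
        cp -v "$f" "dstdir/${base}_${count[$base]}"
    else
        count[$base]=0
        cp -v "$f" "dstdir/$base"
    fi
done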
find /path/to/mp3s -name '*.mp3' -exec mv {} /path/to/target/dir \;
At the risk of many downvotes, a perl script could be written in short time to accomplish this.
Pseudocode:
while (-e $filename) {
    $filename .= "1";    # keep appending "1" until the name no longer exists
}
In Python: to actually move the files, change debug to False.
import os, re

from_dir = "/from/dir"
to_dir = "/target/dir"
re_ext = "\.mp3"
debug = True

for d, subdirs, names in os.walk(from_dir):
    # keep only the files whose extension matches re_ext (case-insensitive)
    names = filter(lambda fn: re.match(".*(%s)$" % re_ext, fn, re.I), names)
    for fn in names:
        from_fn = os.path.join(d, fn)
        target_fn = os.path.join(to_dir, fn)
        file_exists = os.path.exists(target_fn)
        if not debug:
            if not file_exists:
                os.rename(from_fn, target_fn)
            else:
                print "DO NOT MOVE - FILE EXISTS ", from_fn
        else:
            print "MOVE ", from_fn, " TO ", target_fn
Since you don't care how the duplicate files are named, utilize the 'backup' option on move:
find /path/to/mp3s -name '*.mp3' -exec mv --backup=numbered {} /path/to/target/dir \;
Will get you:
song.mp3
song.mp3.~1~
song.mp3.~2~
