More elegant use of find for passing files grouped by directory? - bash

This script has taken me too long (!!) to compile, but I finally have a reasonably nice script which does what I want:
find "$@" -type d -print0 | while IFS= read -r -d $'\0' dir; do
find "$dir" -maxdepth 1 -iname '*.flac' ! -exec bash -c '
metaflac --list --block-type=VORBIS_COMMENT "$0" 2>/dev/null | grep -i "REPLAYGAIN_ALBUM_PEAK" &>/dev/null
exit $?
' {} ';' -exec bash -c '
echo Adding ReplayGain tags to "$0"/\*.flac...
metaflac --add-replay-gain "${@:1}"
' "$dir" {} '+'
done
The purpose is to search the file tree for directories containing FLAC files, test whether any of them are missing the REPLAYGAIN_ALBUM_PEAK tag, and, if so, scan all the files in that directory for ReplayGain.
The big stumbling block is that all the FLAC files for a given album must be passed to metaflac as one command, otherwise metaflac doesn't know they're all one album. As you can see, I've achieved this using find ... -exec ... +.
What I'm wondering is if there's a more elegant way to do this. In particular, how can I skip the while loop? Surely this should be unnecessary, because find is already iterating over the directories?

You can probably use xargs to achieve it.
For example, if you are looking for text foo in all your files you'll have something like
find . -type f | xargs grep foo
xargs passes each result from the left-hand command (find) as an argument to the command invoked on the right.
If no existing command does what you want, you can always write a function and pass it to xargs (xargs runs external programs, so the function has to be exported and called through bash -c).
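A minimal sketch of the "function passed to xargs" idea, assuming bash. show_size and sizes_under are hypothetical names made up for this illustration; the point is only the export -f plus bash -c pattern.

```shell
#!/bin/bash
# show_size is a hypothetical helper, not part of the original answer.
show_size() {
    # $(( ... )) strips any padding that wc may put around the number
    printf '%s: %d bytes\n' "$1" "$(( $(wc -c < "$1") ))"
}
export -f show_size   # make the function visible to child bash processes

# Call show_size once for every regular file under the directory "$1".
sizes_under() {
    find "$1" -type f -print0 |
        xargs -0 -n 1 bash -c 'show_size "$1"' _
}
```

The _ after the script fills $0, so each file name arrives as "$1" instead of being pasted into the script text.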

I can't comment on the flac commands themselves, but as for the rest:
find . -name '*.flac' \
! -exec bash -c 'metaflac --list --block-type=VORBIS_COMMENT "$1" | grep -qi "REPLAYGAIN_ALBUM_PEAK"' -- {} \; \
-execdir bash -c 'metaflac --add-replay-gain *.flac' \;
You just find the relevant files, and then process the directory each one is in.
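Another way to look at the grouping problem, as a rough sketch only: list each directory that holds FLAC files once, then hand all of that directory's files to a single command. This assumes GNU find (-printf) and GNU sort (-z); flac_dirs and per_album are hypothetical names, and the tag check from the question is left out to keep the example short.

```shell
#!/bin/bash
# Print every directory that contains at least one .flac file, once each,
# NUL-delimited. Assumes GNU find and GNU sort.
flac_dirs() {
    find "$@" -type f -iname '*.flac' -printf '%h\0' | sort -zu
}

# Run a command once per directory, with all of that directory's .flac
# files appended as arguments.
# $1 is the root to search; the remaining arguments are the command.
per_album() {
    local root=$1 dir
    shift
    while IFS= read -r -d '' dir; do
        find "$dir" -maxdepth 1 -type f -iname '*.flac' -exec "$@" {} +
    done < <(flac_dirs "$root")
}
```

Something like per_album ~/Music metaflac --add-replay-gain would then process one album per call. If a directory holds more files than fit on one command line, find splits them across several invocations, which would break the one-album-per-call assumption.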

Related

Solution for find -exec if single and double quotes already in use

I would like to recursively go through all subdirectories and remove the oldest two PDFs in each subfolder named "bak":
Works:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && pwd" \;
Does not work, as the double quotes are already in use:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && rm "$(ls -t *.pdf | tail -2)"" \;
Any solution to the double quote conundrum?
In a double quoted string you can use backslashes to escape other double quotes, e.g.
find ... "rm \"\$(...)\""
If that is too convoluted use variables:
cmd='$(...)'
find ... "rm $cmd"
However, I think your find -exec has more problems than that.
Using {} inside the command string "cd '{}' ..." is risky. If there is a ' inside the file name, things will break and might execute unexpected commands.
$() will be expanded by bash before find even runs. So ls -t *.pdf | tail -2 will only be executed once in the top directory . instead of once for each found directory. rm will (try to) delete the same file for each found directory.
rm "$(ls -t *.pdf | tail -2)" will not work if ls lists more than one file. Because of the quotes both files would be listed in one argument. Therefore, rm would try to delete one file with the name first.pdf\nsecond.pdf.
I'd suggest
cmd='cd "$1" && ls -t *.pdf | tail -n2 | sed "s/./\\\\&/g" | xargs rm'
find . -type d -name bak -exec bash -c "$cmd" -- {} \;
You have a more fundamental problem; because you are using the weaker double quotes around the entire script, the $(...) command substitution will be interpreted by the shell which parses the find command, not by the bash shell you are starting, which will only receive a static string containing the result from the command substitution.
If you switch to single quotes around the script, you get most of it right; but that would still fail if the file name you find contains a double quote (just like your attempt would fail for file names with single quotes). The proper fix is to pass the matching files as command-line arguments to the bash subprocess.
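But a better fix still is to use -execdir, which runs the command from the directory containing the match (here, the parent of bak), so the full path never has to be passed around; the script only needs to cd into the match itself:
find . -type d -name "bak" \
-execdir bash -c 'cd "$1" && ls -t *.pdf | tail -2 | xargs -r rm' _ {} \;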
This could still fail in funny ways because you are parsing ls, which is inherently buggy.
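To make the "pass the name as an argument" point concrete, here is a small sketch. list_bak_pdfs is a hypothetical name, and it only lists files rather than deleting anything. The script is single-quoted, so nothing is expanded before find runs, and the directory arrives as "$1".

```shell
#!/bin/bash
# List the PDFs inside every directory named "bak" below "$1".
list_bak_pdfs() {
    find "$1" -type d -name bak -exec bash -c '
        # "$1" is the directory find matched; it is data, never parsed as code
        cd "$1" || exit
        for f in *.pdf; do
            [ -e "$f" ] && printf "%s\n" "$1/$f"
        done
    ' _ {} \;
}
```

Because the directory name is never spliced into the script text, a path containing a single or double quote is handled like any other.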
You are explicitly asking for find -exec. Usually I would just chain find -exec find -delete, but in your case only two files should be deleted, so the only option is to run a subshell. Socowi already gave a nice solution; however, if your file names do not contain tabs or newlines, another workaround is a find | while read loop.
This sorts the files by mtime:
find . -type d -iname 'bak' | \
while read -r dir;
do
find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" | \
sort | head -n2 | \
cut -f2- | \
while read -r file;
do
rm "$file";
done;
done;
The same find | while read loop as a "one-liner":
find . -type d -iname 'bak' | while read -r dir; do find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" | sort | head -n2 | cut -f2- | while read -r file; do rm "$file"; done; done;
A find | while read loop can also handle NUL-terminated file names. However, head cannot handle them here, so I improved on the other answers and made it work with nontrivial file names (GNU + bash only).
realpath is a harmless placeholder here; replace it with rm to actually delete the files.
#!/bin/bash
rm_old () {
find "$1" -maxdepth 1 -type f -iname \*.$2 -printf "%T+\t%p\0" | sort -z | sed -zn 's,\S*\t\(.*\),\1,p' | grep -zim$3 \.$2$ | xargs -0r realpath
}
export -f rm_old
find -type d -iname bak -execdir bash -c 'rm_old "{}" pdf 2' \;
However, bash -c might still be exploitable; to make it more secure, let stat %N do the quoting:
#!/bin/bash
rm_old () {
local dir="$1"
# we don't like eval
# eval "dir=$dir"
# this works like eval
dir="${dir#?}"
dir="${dir%?}"
dir="${dir//"'$'\t''"/$'\011'}"
dir="${dir//"'$'\n''"/$'\012'}"
dir="${dir//$'\047'\\$'\047'$'\047'/$'\047'}"
find "$dir" -maxdepth 1 -type f -iname \*.$2 -printf '%T+\t%p\0' | sort -z | sed -zn 's,\S*\t\(.*\),\1,p' | grep -zim$3 \.$2$ | xargs -0r realpath
}
find -type d -iname bak -exec stat -c'%N' {} + | while read -r dir; do rm_old "$dir" pdf 2; done
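For comparison, a shorter sketch of the same idea. It assumes a reasonably recent GNU coreutils, where head and cut accept -z for NUL-delimited records (older versions do not), plus GNU find for -printf. oldest_files is a hypothetical name.

```shell
#!/bin/bash
# Print the N oldest regular files matching *.EXT directly inside DIR,
# NUL-delimited. Assumes GNU find, sort, head and cut.
# $1: DIR, $2: EXT (without the dot), $3: N
oldest_files() {
    local dir=$1 ext=$2 n=$3
    find "$dir" -maxdepth 1 -type f -iname "*.$ext" -printf '%T@ %p\0' |
        sort -z -n |          # oldest modification time first
        head -z -n "$n" |     # keep the first N records
        cut -z -d ' ' -f 2-   # drop the timestamp, keep the whole path
}
```

A cautious way to use it is to inspect the output first, for example oldest_files ./bak pdf 2 | tr '\0' '\n', and only then pipe it to xargs -0 -r rm --.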

Rename all files in directory and (deeply nested) sub-directories

What is the shell command for renaming all files in a directory and sub-directory (recursively)?
I would like to add an underscore to all the files ending with *scss from filename.scss to _filename.scss in all the directories and sub-directories.
I have found answers relating to this but most if not all require you to know the filename itself, and I do not want this because the filenames differ and are a lot to know by heart or even type them manually and some of them are deeply nested in directories.
Edit: I was under the impression that the bash -c bit was somehow necessary for multiple expansion of the found element; anubhava's answer proved me wrong. I am leaving that bit in the answer for now as it worked for the OP.
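find . -type f -name '*scss' -exec bash -c 'mv "$1" "${1%/*}/_${1##*/}"' -- {} \;
find . -- find in current directory (recursively)
-type f -- files
-name '*scss' -- matching the pattern *scss (quoted so the shell does not expand it first)
-exec -- execute for each element found
bash -c '...' -- execute command in a subshell
-- -- placeholder that becomes $0 of the inline script, so the found name lands in $1
{} -- expands to the name of the element found (which becomes the positional parameter for the bash -c command)
"${1%/*}/_${1##*/}" -- the directory part of $1, then an underscore, then the file name
\; -- end the -exec command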
You can use -execdir option here:
find ./src/components -iname "*.scss" -execdir mv {} _{} \;
You are close to a solution:
find ./src/components -iname "*.scss" -print0 | xargs -0 -n 1 -I{} mv {} _{}
In this approach, the "loop" is executed by xargs. I prefer this solution over using -exec in find, because the syntax is clearer to me.
Also, if you want to repeat the command and avoid double-adding the underscore to the already processed files, use a regexp to get only the files not yet processed:
find ./src/components -iregex ".*/[^_][^/]*\.scss" -print0 | xargs -0 -n 1 -I{} mv {} _{}
Adding the -print0/-0 options also avoids problems with whitespace in file names.
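One thing to watch with the commands above: with GNU find, {} expands to a path such as ./src/components/a/x.scss (or ./x.scss under -execdir), so _{} puts the underscore in front of the whole path rather than the file name. A sketch that splits the path first, assuming "$1" is a directory; prefix_scss is a hypothetical name:

```shell
#!/bin/bash
# Add a leading underscore to every *.scss file below the directory "$1",
# skipping files that already start with one.
prefix_scss() {
    find "$1" -type f -name '*.scss' ! -name '_*' -exec bash -c '
        for path; do
            # ${path%/*} is the directory part, ${path##*/} the file name
            mv -- "$path" "${path%/*}/_${path##*/}"
        done
    ' _ {} +
}
```

Because already-prefixed files are excluded by ! -name '_*', running it a second time does not add another underscore.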
#!/bin/sh
EXTENSION='.scss'
cd YOURDIR
find . -type f | while read -r LINE; do
FILE="$( basename "$LINE" )"
case "$LINE" in
*"$EXTENSION")
DIRNAME="$( dirname "$LINE" )"
mv -v "$DIRNAME/$FILE" "$DIRNAME/_$FILE"
;;
esac
done

How to cd into grep output?

I have a shell script which basically searches all folders inside a location and I use grep to find the exact folder I want to target.
for dir in /root/*; do
grep "Apples" "${dir}"/*.* || continue
done
While grep successfully finds my target directory, I'm stuck on how I can move the folders I want to move in my target directory. An idea I had was to cd into grep output but that's where I got stuck. Tried some Google results, none helped with my case.
Example grep output: Binary file /root/ant/containers/secret/Documents/2FD412E0/file.extension matches
I want to cd into 2FD412E0 and move two folders inside that directory.
dirname is the key to that:
cd $(dirname $(grep "...." ...))
will let you enter the directory.
As people mentioned, dirname is the right tool to strip off the file name from the path.
I would use find for such kind of task:
while IFS= read -r file
do
target_dir=$(dirname "$file")
# do something with "$target_dir"
done < <(find /root/ -type f \
-exec grep "Apples" --files-with-matches {} \;)
Consider using find's -maxdepth option. See the man page for find.
Well, there is actually a simpler solution :) I just like to write bash scripts. You can simply use a single find command like this:
find /root/ -type f -exec grep Apples {} ';' -exec ls -l {} ';'
Note the second -exec. It is executed only if the previous -exec command exited with status 0 (success). From the man page:
-exec command ;
Execute command; true if 0 status is returned. All following arguments to find are taken to be arguments to the command until an argument consisting of ; is encountered. The string {} is replaced by the current file name being processed everywhere it occurs in the arguments to the command, not just in arguments where it is alone, as in some versions of find.
Replace the ls -l command with your stuff.
And if you want to execute dirname within the -exec command, you may do the following trick:
find /root/ -type f -exec grep -q Apples {} ';' \
-exec sh -c 'cd "$(dirname "$0")" && pwd' {} ';'
Replace pwd with your stuff.
When find is not available
In the comments you write that find is not available on your system. The following solution works without find:
grep -R --files-with-matches Apples "${dir}" | while IFS= read -r file
do
target_dir=$(dirname "$file")
# do something with "$target_dir"
echo "$target_dir"
done
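The grep-only variant can also be made safe for file names with spaces. This is a sketch assuming GNU grep, whose -Z option terminates each listed file name with a NUL byte; dirs_with_match is a hypothetical name.

```shell
#!/bin/bash
# Print, once each, the directory of every file under "$2" that contains
# the fixed string "$1". Assumes GNU grep (-Z).
dirs_with_match() {
    local pattern=$1 root=$2 file
    while IFS= read -r -d '' file; do
        dirname -- "$file"
    done < <(grep -rlZF -- "$pattern" "$root") | sort -u
}
```

The output is one directory per line, so it still assumes directory names without newlines. A loop such as dirs_with_match Apples /root | while IFS= read -r d; do (cd "$d" && pwd); done would visit each one in a subshell.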

find piped to xargs with complex command

I am trying to process DVD files that are in many different locations on a disk. The thing they have in common is that they (each set of input files) are in a directory named VIDEO_TS. The output in each case will be a single file named for the parent of this directory.
I know I can get a fully qualified path to each directory with:
find /Volumes/VolumeName -type d -name "VIDEO_TS" -print0
and I can get the parent directory by piping to xargs:
find /Volumes/VolumeName -type d -name "VIDEO_TS" -print0 | xargs -0 -I{} dirname {}
and I also know that I can get the parent directory name on its own by appending:
| xargs -I{} basename {}
What I can't figure out is how do I then pass these parameters to, e.g. HandBrakeCLI:
./HandBrakeCLI -i /path/to/filename/VIDEO_TS -o /path/to/convertedfiles/filename.m4v
I have read here about expansion capability of the shell and suspect that's going to help here (not using dirname or basename for a start), but the more I read the more confused I am getting!
You don't actually need xargs for this at all: You can read a NUL-delimited stream into a shell loop, and run the commands you want directly from there.
#!/bin/bash
source_dir=/Volumes/VolumeName
dest_dir=/Volumes/OtherName
while IFS= read -r -d '' dir; do
name=${dir%/VIDEO_TS} # trim /VIDEO_TS off the end of dir, assign to name
name=${name##*/} # remove everything before last remaining / from name
./HandBrakeCLI -i "$dir" -o "$dest_dir/$name.m4v"
done < <(find "$source_dir" -type d -name "VIDEO_TS" -print0)
See the article Using Find on Greg's wiki, or BashFAQ #001 for general information on processing input streams in bash, or BashFAQ #24 to understand the value of using process substitution (the <(...) construct here) rather than piping from find into the loop.
Also, find contains an -exec action which can be used as follows:
source_dir=/Volumes/VolumeName
dest_dir=/Volumes/OtherName
export dest_dir # export allows use by subprocesses!
find "$source_dir" -type d -name "VIDEO_TS" -exec bash -c '
for dir; do
name=${dir%/VIDEO_TS}
name=${name##*/}
./HandBrakeCLI -i "$dir" -o "$dest_dir/$name.m4v"
done
' _ {} +
This passes the found directory names directly on the argument list to the shell invoked with bash -c. Since a for loop without an in clause iterates over "$@", the argument list, this implicitly iterates over the directories found by find.
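The two parameter expansions used in both versions can be tried in isolation. A small sketch; output_for is a hypothetical helper, and the paths are just the placeholders from the question:

```shell
#!/bin/bash
# Derive the output file for one VIDEO_TS directory.
# $1: path ending in /VIDEO_TS, $2: destination directory
output_for() {
    local name=${1%/VIDEO_TS}   # drop the trailing /VIDEO_TS
    name=${name##*/}            # keep only the last path component
    printf '%s/%s.m4v\n' "$2" "$name"
}

output_for "/Volumes/VolumeName/Films/My Film/VIDEO_TS" /Volumes/OtherName
# prints: /Volumes/OtherName/My Film.m4v
```

No dirname or basename process is started; both steps are done by the shell itself.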
If I understand what you are trying to do, the simplest solution would be to create a little wrapper which takes a path and invokes your CLI:
File: CLIWrapper
#!/bin/bash
for dir in "$@"; do
./HandBrakeCLI -i "${dir%/*}" -o "/path/to/convertedfiles/${dir##*/}.m4v"
done
Edit: I think I misunderstood the question. It's possible that the above script should read:
./HandBrakeCLI -i "$dir" -o "/path/to/convertedfiles/${dir##*/}.m4v"
or perhaps something slightly different. But the theory is valid. :)
Then you can invoke that script using the -exec option to find. The script loops over its arguments, making it possible for find to send multiple arguments to a single invocation using the + terminator:
find /Volumes/VolumeName -type d -name "VIDEO_TS" -exec ./CLIWrapper {} +

Execute loop on bash thru keywords

I have a script that searches my mail files and, if a keyword is found, moves the matching files to another location.
How can I make it work for multiple keywords? For example, I have 11 keys and I don't want to copy and paste the find command over and over.
DIRF='move/from'
DIRT='move/to'
KEY='discount'
find $DIRF -type f -exec grep -ilR "$KEY" {} \; | xargs -I % mv % $DIRT
Why are you using find here at all?
You are already telling grep to operate recursively (-R), so just point it at $DIRF and be done. -R is also pointless if you only ever give it files (from -type f).
Also grep takes a pattern that can do alternation. Just use that.
grep -RilE 'KEY1|KEY2|KEY3|Key4' "$DIRF"
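If the keywords live in a list, one -e option per keyword does the same job as the alternation and avoids escaping regex characters. A sketch assuming GNU grep (-Z) and GNU mv (-t); move_matching is a hypothetical name, and name collisions in the target directory are not handled.

```shell
#!/bin/bash
# Move every file under "$1" that contains any of the given keywords
# (case-insensitive, fixed strings) into the directory "$2".
# Usage: move_matching FROM TO KEY...
move_matching() {
    local from=$1 to=$2 key
    shift 2
    [ "$#" -gt 0 ] || return 1          # refuse to run without keywords
    local args=()
    for key; do
        args+=(-e "$key")               # one -e per keyword
    done
    grep -rilZF "${args[@]}" -- "$from" | xargs -0 -r mv -t "$to" --
}
```

For the question's variables this would be called as move_matching "$DIRF" "$DIRT" discount other_value other_value2.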
for KEY in "discount" "other_value" "other_value2"
do
find "$DIRF" -type f -exec grep -il "$KEY" {} \; | xargs -I % mv % "$DIRT"
done
