Unexpected behavior of find -exec - bash

I came across behavior of the "find -exec" bash command that was unexpected to me, and I would appreciate some interpretation. The same job can be done with a "for file_name in $(find ...); do ..." loop, so the question is why it doesn't work with the -exec option of find.
There are two folders (SRC/ and src/) with the same set of files. I want to compare the files in these folders:
find src/ -type f -exec sh -c "diff {} `echo {} | sed 's/src/SRC/'`" \;
this, however, doesn't compare the files... For some reason the sed command doesn't perform the substitution. If there is only one file, e.g. "a", in each of these folders, then the command
find src/ -type f -exec sh -c "echo {} `echo {} | sed 's/src/SRC/'`" \;
outputs
src/a src/a
if one does a similar thing in bash, all the following commands give the same result (SRC/a):
echo src/a | sed 's/src/SRC/'
echo `echo src/a | sed 's/src/SRC/'`
sh -c "echo src/a | sed 's/src/SRC/'"
sh -c "echo `echo src/a | sed 's/src/SRC/'`"
but if these commands are supplied to "find -exec ...", the outputs differ:
find src/ -type f -exec bash -c "echo {} | sed 's/src/SRC/'" \;
gives "SRC/a"
and
find src/ -type f -exec bash -c "echo `echo {} | sed 's/src/SRC/'`" \;
gives "src/a"
Is that the expected behavior?

Use single quotes for the sh -c script, because a double-quoted script is interpreted by your outer shell first. And pass the filename as an argument to sh instead of using {} inside the quotes:
find src/ -type f -exec sh -c 'diff "$1" "$(printf "%s\n" "$1" | sed "s/src/SRC/")"' _ {} \;
Or with bash:
find src/ -type f -exec bash -c 'diff "$1" "${1/src/SRC}"' _ {} \;
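Both fixes can also be batched with `-exec ... {} +`, which starts one shell per batch of file names instead of one per file. A minimal runnable sketch (the sandbox paths and contents below are invented for the demo):

```shell
#!/bin/bash
# Demo sandbox: two trees with one differing file pair (made-up names).
tmp=$(mktemp -d)
mkdir -p "$tmp/src" "$tmp/SRC"
printf 'one\n' > "$tmp/src/a"
printf 'two\n' > "$tmp/SRC/a"
cd "$tmp" || exit 1

# One bash per batch; the found names arrive in "$@", never pasted
# into the script text, so quotes in file names cannot break anything.
find src/ -type f -exec bash -c '
  for f do
    diff "$f" "${f/src/SRC}"
  done
' _ {} +

cd / && rm -rf "$tmp"
```

For the sandbox above this prints the usual diff hunk showing `one` vs `two`.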

Related

Solution for find -exec if single and double quotes already in use

I would like to recursively go through all subdirectories and remove the oldest two PDFs in each subfolder named "bak":
Works:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && pwd" \;
Does not work, as the double quotes are already in use:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && rm "$(ls -t *.pdf | tail -2)"" \;
Any solution to the double quote conundrum?
In a double quoted string you can use backslashes to escape other double quotes, e.g.
find ... "rm \"\$(...)\""
If that is too convoluted use variables:
cmd='$(...)'
find ... "rm $cmd"
However, I think your find -exec has more problems than that.
Using {} inside the command string "cd '{}' ..." is risky. If there is a ' inside the file name, things will break and might execute unexpected commands.
$() will be expanded by bash before find even runs. So ls -t *.pdf | tail -2 will only be executed once in the top directory . instead of once for each found directory. rm will (try to) delete the same file for each found directory.
rm "$(ls -t *.pdf | tail -2)" will not work if ls lists more than one file. Because of the quotes both files would be listed in one argument. Therefore, rm would try to delete one file with the name first.pdf\nsecond.pdf.
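The one-argument problem is easy to demonstrate in isolation (a sketch, independent of find):

```shell
#!/bin/sh
# A quoted command substitution is always a single word, even when the
# substituted command printed several lines.
set -- "$(printf 'first.pdf\nsecond.pdf\n')"
echo "$#"   # number of arguments rm would receive: 1, not 2
```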
I'd suggest
cmd='cd "$1" && ls -t *.pdf | tail -n2 | sed "s/./\\\\&/g" | xargs rm'
find . -type d -name bak -exec bash -c "$cmd" -- {} \;
You have a more fundamental problem; because you are using the weaker double quotes around the entire script, the $(...) command substitution will be interpreted by the shell which parses the find command, not by the bash shell you are starting, which will only receive a static string containing the result from the command substitution.
If you switch to single quotes around the script, you get most of it right; but that would still fail if the file name you find contains a double quote (just like your attempt would fail for file names with single quotes). The proper fix is to pass the matching files as command-line arguments to the bash subprocess.
But a better fix still is to use -execdir so that you don't have to pass the directory name to the subshell at all:
find . -type d -name "bak" \
-execdir bash -c 'ls -t *.pdf | tail -2 | xargs -r rm' \;
This could still fail in funny ways because you are parsing the output of ls, which is inherently fragile.
You are explicitly asking for find -exec. Usually I would just combine find -exec with find -delete, but in your case only two files should be deleted, so the only method is running a subshell. Socowi already gave a nice solution; however, if your file names do not contain tabs or newlines, another workaround is a find | while read loop.
This will sort files by mtime
find . -type d -iname 'bak' | \
while read -r dir;
do
find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" | \
sort | head -n2 | \
cut -f2- | \
while read -r file;
do
rm "$file";
done;
done;
The above find | while read loop as a "one-liner":
find . -type d -iname 'bak' | while read -r dir; do find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" | sort | head -n2 | cut -f2- | while read -r file; do rm "$file"; done; done;
A find | while read loop can also handle NUL-terminated file names. However, head cannot handle them, so I improved on the other answers and made it work with nontrivial file names (GNU tools + bash only).
Replace realpath with rm once you have checked the output:
#!/bin/bash
rm_old () {
find "$1" -maxdepth 1 -type f -iname \*.$2 -printf "%T+\t%p\0" | sort -z | sed -zn 's,\S*\t\(.*\),\1,p' | grep -zim$3 \.$2$ | xargs -0r realpath
}
export -f rm_old
find -type d -iname bak -execdir bash -c 'rm_old "{}" pdf 2' \;
However, bash -c might still be exploitable; to make it more secure, let stat %N do the quoting:
#!/bin/bash
rm_old () {
local dir="$1"
# we don't like eval
# eval "dir=$dir"
# this works like eval
dir="${dir#?}"
dir="${dir%?}"
dir="${dir//"'$'\t''"/$'\011'}"
dir="${dir//"'$'\n''"/$'\012'}"
dir="${dir//$'\047'\\$'\047'$'\047'/$'\047'}"
find "$dir" -maxdepth 1 -type f -iname \*.$2 -printf '%T+\t%p\0' | sort -z | sed -zn 's,\S*\t\(.*\),\1,p' | grep -zim$3 \.$2$ | xargs -0r realpath
}
find -type d -iname bak -exec stat -c'%N' {} + | while read -r dir; do rm_old "$dir" pdf 2; done

Running multiple commands with xargs - for loop

Based on the top answer in Running multiple commands with xargs, I'm trying to use find/xargs to process several files. Why is the first file, 1.txt, missing in the for loop?
$ ls
1.txt 2.txt 3.txt
$ find . -name "*.txt" -print0 | xargs -0
./1.txt ./2.txt ./3.txt
$ find . -name "*.txt" -print0 | xargs -0 sh -c 'for arg do echo "$arg"; done'
./2.txt
./3.txt
Why do you insist on using xargs? You can do the following as well.
while read -r file; do
echo "$file"
done <<< "$(find . -name "*.txt")"
Because this is executed in the same shell, changing variables is possible in the loop. Otherwise you'll get a sub-shell in which that doesn't work.
When you use your for-loop in a script example.sh, the call example.sh var1 var2 var3 puts var1 into $1 and example.sh into $0. With sh -c 'script' var1 var2 var3, however, var1 fills the $0 slot, so a loop over "$@" never sees it.
When you want to process one file for each command, use the xargs option -L:
find . -name "*.txt" -print0 | xargs -0 -L1 sh -c 'echo "$0"'
# or for a simple case
find . -name "*.txt" -print0 | xargs -0 -L1 echo
I ran across this while having the same issue. You need the extra _ at the end as a placeholder that fills $0, so the first file name from xargs lands in $1:
$ find . -name "*.txt" -print0 | xargs -0 sh -c 'for arg do echo "$arg"; done' _
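The reason the placeholder works: sh -c takes the word after the script as $0, not as the first argument, and `for arg do` iterates over "$@" only. A quick check:

```shell
#!/bin/sh
# The first operand fills $0 (the "script name" slot); "$@" starts at $1.
sh -c 'echo "zeroth: $0"; echo "rest: $@"' one two three
```

This prints `zeroth: one` and then `rest: two three`, which is exactly why `./1.txt` vanished without the `_`.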

What is the correct Linux command of find, grep and sort?

I am writing a command using find, grep and sort to display a sorted list of all files that contain 'some-text'.
I was unable to figure out the command.
Here is my attempt:
$find . -type f |grep -l "some-text" | sort
but it didn't work.
You need to use something like xargs so that the content of each file passed through the pipe | is made available to grep.
xargs: converts input from standard input into arguments to a command.
In my case, I have files 1, 2 and 3, and they contain the word test. This will do it.
za:tmp za$ find . -type f | xargs grep -l "test" | sort
./file1.txt
./file2.txt
./file3.txt
or
za:tmp za$ find . -type f | xargs grep -i "test" | sort
./file1.txt:some test string
./file2.txt:some test string
./file3.txt:some test string
You can use this in any Unix:
find . -type f -exec sh -c 'grep "some text" {} /dev/null > /dev/null 2>&1' \; -a -print 2> /dev/null|sort
A more optimized solution that works only with GNU grep:
find . -type f -exec grep -Hq "some-text" {} \; -a -print 2> /dev/null|sort
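If you don't need find at all, a recursive grep does the same job in one step (assuming a grep that supports -r, such as GNU or BSD grep). The sandbox below just fabricates two demo files:

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'has some-text here\n' > "$tmp/file2.txt"
printf 'nothing relevant\n'   > "$tmp/file1.txt"

# -r: recurse into directories, -l: print only names of matching files
( cd "$tmp" && grep -rl "some-text" . | sort )

rm -rf "$tmp"
```

For the sandbox above this prints `./file2.txt`.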

Escaping basename in bourne shell when using find

I want to merge the output of three logwatch runs and pipe the result through sendmail.
Example:
#!/bin/sh
LOG_DIR="/var/log/remote-hosts"
MAIL_TO="me#email.com"
sh -c "logwatch && find ${LOG_DIR} -type d -name \"ip*\" -print0 | xargs -0 -I{} sh -c 'logwatch --logdir {} --hostname $(basename {})'" |
sed '1!b;s/^/To: '${MAIL_TO}'\nSubject: Logwatch report\n\n/' | sendmail -t
first logwatch is executed on /var/log folder
and then I would like to traverse /var/log/remote-hosts subfolders (ip-10-0-0-38 and ip-10-0-0-39 ) with find and also do logwatch on them.
The merged output will be sent through sendmail. However, I would like to replace the hostname with the basename of the /var/log/remote-hosts subfolder, so instead of /var/log/remote-hosts/ip-10-0-0-38 I will have ip-10-0-0-38 only.
But unfortunately I don't know how to do the basename part correctly. Any help? Thanks in advance.
Don't use sh -c for grouping statements, use (...):
(logwatch && find ${LOG_DIR} -type d -name "ip*" -print0 | xargs -0 -I{} sh -c 'logwatch --logdir {} --hostname $(basename {})') |
sed '1!b;s/^/To: '${MAIL_TO}'\nSubject: Logwatch report\n\n/' | sendmail -t
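The ( ... ) grouping runs the commands in a subshell of the current script, and their combined stdout goes into the pipe as one stream, which is all the sh -c wrapper was achieving. A trivial check:

```shell
#!/bin/sh
# Both commands' output travels down the same pipe.
( echo one && echo two ) | sed 's/^/line: /'
```

This prints `line: one` and `line: two`.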

Bash syntax string insertion

I have a command which I want to have as a function in my .bashrc.
From the command line,
find . -name '*.pdf' -exec sh -c 'pdftotext {} - | grep --with-filename --label={} --color "string of words" ' \;
Will find "string of words" in any pdf in the current directory.
Despite the best part of an hour, I seriously can't get "string of words" to work as a string variable - i.e.
eg="string of words"
find . -name '*.pdf' -exec sh -c 'pdftotext {} - | grep --with-filename --label={} --color $eg ' \;
Which obviously won't work, but I have tried all kinds of combinations of "/'/\ with echo hacks and array expansions, with no luck. I'm sure it's possible, and I'm sure it's easy, but I cannot get it to work.
Variable expansion only works inside double quotes, not single quotes. Have you tried using double quotes around that string?
Like so:
find . -name '*.pdf' -exec sh -c "pdftotext {} - | grep --with-filename --label={} --color $eg " \;
The problem is probably the single quotes ' around the pdftotext command. The single quotes will prevent any variable expansion in the string which they occur. You may have more luck with double quotes ".
eg="string of words"
find . -name '*.pdf' -exec sh -c "pdftotext {} - | grep --with-filename --label={} --color $eg " \;
Probably simplest to do:
find . -name '*.pdf' -exec \
sh -c 'pdftotext $0 - | grep --with-filename --label=$0 --color "$1"' {} "$eg" \;
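The same pattern extends to more data: keep the script in single quotes and pass both the search string and the found files as positional parameters. A runnable sketch that substitutes plain grep for pdftotext so it works without any PDFs (the demo file names are made up):

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'string of words\n' > "$tmp/a.txt"
printf 'something else\n'  > "$tmp/b.txt"
cd "$tmp" || exit 1

eg="string of words"
# $1 carries the pattern; shift leaves only the file names in "$@".
find . -name '*.txt' -exec sh -c '
  pat=$1; shift
  for f do grep -l "$pat" "$f"; done
' _ "$eg" {} +

cd / && rm -rf "$tmp"
```

For the sandbox above this prints `./a.txt`, the only file containing the pattern.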
Write a small shell script mypdfgrep and call that from find:
#!/bin/bash
pdftotext "$1" - | grep --with-filename --label "$1" --color "$2"
Then run
$ chmod +x mypdfgrep
$ find . -name '*.pdf' -execdir /full/path/to/mypdfgrep '{}' "string of words" \;
You need to decorate the logic just a bit differently than what you've done:
eg="string of words"
find . -name '*.pdf' -exec sh -c "pdftotext {} - | \
grep -H --label={} --color '$eg'" \;
i.e., by making the outer quote delimiter " (which the invoking shell processes), the shell variable expansion works, and delimiting the search string with ' preserves it as a single word.
