I have a command which I want to have as a function in my .bashrc.
From the command line,
find . -name '*.pdf' -exec sh -c 'pdftotext {} - | grep --with-filename --label={} --color "string of words" ' \;
Will find "string of words" in any pdf in the current directory.
Despite spending the best part of an hour on it, I can't get "string of words" to work as a string variable, i.e.
eg="string of words"
find . -name '*.pdf' -exec sh -c 'pdftotext {} - | grep --with-filename --label={} --color $eg ' \;
That obviously won't work, but I have tried all kinds of combinations of "/'/\ with echo hacks and array expansions, with no luck. I'm sure it's possible, and I'm sure it's easy, but I cannot get it to work.
Variable expansion only works inside double quotes, not single quotes. Have you tried using double quotes on that string?
Like so:
find . -name '*.pdf' -exec sh -c "pdftotext {} - | grep --with-filename --label={} --color '$eg' " \;
The problem is probably the single quotes ' around the pdftotext command. Single quotes prevent any variable expansion in the string in which they occur. You may have more luck with double quotes ".
eg="string of words"
find . -name '*.pdf' -exec sh -c "pdftotext {} - | grep --with-filename --label={} --color '$eg' " \;
Probably simplest to do:
find . -name '*.pdf' -exec \
sh -c 'pdftotext "$0" - | grep --with-filename --label="$0" --color "$1"' {} "$eg" \;
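A minimal sketch of the positional-argument approach that runs without any PDFs. Here cat is only a stand-in for pdftotext, and the temporary directory and file names are made up for the demonstration. It also passes an extra sh word to fill $0, so the path arrives as $1 and the pattern as $2.

```shell
# Hypothetical demo data; 'cat' stands in for pdftotext so this runs anywhere.
demo_dir=$(mktemp -d)
printf 'a string of words here\n' > "$demo_dir/one file.txt"
printf 'nothing to see\n' > "$demo_dir/two.txt"

eg="string of words"

# The script is single-quoted, so the calling shell expands nothing in it.
# find supplies the path as $1 and the pattern as $2; "sh" fills $0.
matches=$(find "$demo_dir" -name '*.txt' -exec \
  sh -c 'cat "$1" | grep -H --label="$1" "$2"' sh {} "$eg" \;)

printf '%s\n' "$matches"
rm -rf "$demo_dir"
```

Because the file name and the pattern never become part of the script text, spaces and quote characters in either of them need no special handling.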
Write a small shell script mypdfgrep and call that from find:
#!/bin/bash
pdftotext "$1" - | grep --with-filename --label "$1" --color "$2"
Then run
$ chmod +x mypdfgrep
$ find . -name '*.pdf' -execdir /full/path/to/mypdfgrep '{}' "string of words" \;
You need to arrange the quoting slightly differently from what you've done:
eg="string of words"
find . -name '*.pdf' -exec sh -c "pdftotext {} - | \
grep -H --label={} --color '$eg'" \;
That is, making " the outer delimiter of the script lets the calling shell expand the variable, and wrapping the search string in ' keeps it a single argument for grep.
I would like to recursively go through all subdirectories and remove the oldest two PDFs in each subfolder named "bak":
Works:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && pwd" \;
Does not work, as the double quotes are already in use:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && rm "$(ls -t *.pdf | tail -2)"" \;
Any solution to the double quote conundrum?
In a double quoted string you can use backslashes to escape other double quotes, e.g.
find ... "rm \"\$(...)\""
If that is too convoluted use variables:
cmd='$(...)'
find ... "rm $cmd"
However, I think your find -exec has more problems than that.
Using {} inside the command string "cd '{}' ..." is risky. If there is a ' inside the file name, things will break and unexpected commands might be executed.
$() will be expanded by bash before find even runs. So ls -t *.pdf | tail -2 will only be executed once in the top directory . instead of once for each found directory. rm will (try to) delete the same file for each found directory.
rm "$(ls -t *.pdf | tail -2)" will not work if ls lists more than one file. Because of the quotes both files would be listed in one argument. Therefore, rm would try to delete one file with the name first.pdf\nsecond.pdf.
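The last point can be checked without deleting anything. In this sketch, count_args is a hypothetical helper that only reports how many arguments it received, and the two file names are made up.

```shell
# count_args is a hypothetical helper that reports how many arguments it got.
count_args() { echo "$#"; }

# Two file names separated by a newline, as ls would print them.
# (Command substitution drops only the trailing newline.)
list=$(printf 'first.pdf\nsecond.pdf\n')

quoted=$(count_args "$list")   # one argument containing a newline
unquoted=$(count_args $list)   # split on whitespace into two arguments
echo "quoted: $quoted, unquoted: $unquoted"
```

Dropping the quotes is not a real fix either, since a file name containing a space would then be split as well.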
I'd suggest
cmd='cd "$1" && ls -t *.pdf | tail -n2 | sed "s/./\\\\&/g" | xargs rm'
find . -type d -name bak -exec bash -c "$cmd" -- {} \;
You have a more fundamental problem; because you are using the weaker double quotes around the entire script, the $(...) command substitution will be interpreted by the shell which parses the find command, not by the bash shell you are starting, which will only receive a static string containing the result from the command substitution.
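A small sketch of who performs the expansion, using a plain variable in place of the command substitution. The variable name where and its two values are assumptions chosen for the demonstration.

```shell
where=outer

# Double quotes: the calling shell replaces $where before sh starts,
# so sh only ever sees the static text "echo outer".
dq=$(where=inner sh -c "echo $where")

# Single quotes: the text reaches sh untouched and sh expands $where itself.
sq=$(where=inner sh -c 'echo $where')

echo "double-quoted: $dq, single-quoted: $sq"
```

The same rule applies to $(...): inside double quotes it is evaluated once by the shell that parses the find command.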
If you switch to single quotes around the script, you get most of it right; but that would still fail if the file name you find contains a double quote (just like your attempt would fail for file names with single quotes). The proper fix is to pass the matching files as command-line arguments to the bash subprocess.
But a better fix still is to use -execdir so that you don't have to pass the directory name to the subshell at all:
find . -type d -name "bak" \
-execdir bash -c 'ls -t *.pdf | tail -2 | xargs -r rm' \;
This could still fail in funny ways because you are parsing ls, which is inherently buggy.
You are explicitly asking for find -exec. Usually I would just concatenate find -exec find -delete, but in your case only two files should be deleted, so the only method is running a subshell. Socowi already gave a nice solution; however, if your file names do not contain tabs or newlines, another workaround is a find | while read loop.
This will sort files by mtime
find . -type d -iname 'bak' | \
while read -r dir;
do
find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" | \
sort | head -n2 | \
cut -f2- | \
while read -r file;
do
rm "$file";
done;
done;
The same find | while read loop as a one-liner:
find . -type d -iname 'bak' | while read -r dir; do find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" | sort | head -n2 | cut -f2- | while read -r file; do rm "$file"; done; done;
A find | while read loop can also handle NUL-terminated file names. However, head cannot handle this, so I adapted the other answers to work with nontrivial file names (GNU tools and bash only).
Replace realpath with rm to actually delete the files.
#!/bin/bash
rm_old () {
find "$1" -maxdepth 1 -type f -iname \*.$2 -printf "%T+\t%p\0" | sort -z | sed -zn 's,\S*\t\(.*\),\1,p' | grep -zim$3 \.$2$ | xargs -0r realpath
}
export -f rm_old
find -type d -iname bak -execdir bash -c 'rm_old "{}" pdf 2' \;
However, bash -c might still be exploitable. To make it more secure, let stat %N do the quoting:
#!/bin/bash
rm_old () {
local dir="$1"
# we don't like eval
# eval "dir=$dir"
# this works like eval
dir="${dir#?}"
dir="${dir%?}"
dir="${dir//"'$'\t''"/$'\011'}"
dir="${dir//"'$'\n''"/$'\012'}"
dir="${dir//$'\047'\\$'\047'$'\047'/$'\047'}"
find "$dir" -maxdepth 1 -type f -iname \*.$2 -printf '%T+\t%p\0' | sort -z | sed -zn 's,\S*\t\(.*\),\1,p' | grep -zim$3 \.$2$ | xargs -0r realpath
}
find -type d -iname bak -exec stat -c'%N' {} + | while read -r dir; do rm_old "$dir" pdf 2; done
I'm attempting to recursively remove a leading underscore from some scss files. Here's what I have.
find . -name '*.scss' -print0 | xargs -0 -n1 bash -c 'mv "$0" `echo $0 | sed -e 's:^_*::'`'
When I'm in a specific directory this works perfectly:
for FILE in *.scss; do mv $FILE `echo $FILE | sed -e 's:^_*::'`; done
What am I doing wrong in the find?
As the starting point is ., all paths that find prints start with a dot. Thus ^_* doesn't match anything and sed returns its input unchanged.
I wouldn't bother with sed or xargs though.
The script below works with any find and sh that isn't terribly broken, and properly handles filenames with underscores in the middle as well.
find . -name '_*.scss' -exec sh -c '
for fp; do # pathname
fn=${fp##*/} # filename
fn=${fn#"${fn%%[!_]*}"} # filename w/o leading underscore(s)
echo mv "$fp" "${fp%/*}/$fn"
done' sh {} +
A less portable but shorter and much cleaner alternative in bash looks like:
shopt -s globstar extglob nullglob
for fp in ./**/_*.scss; do
echo mv "$fp" "${fp%/*}/${fp##*/+(_)}"
done
Drop echo if the output looks good.
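The parameter expansions in the portable script can be traced on a single path. The sample path below is hypothetical and nothing is renamed.

```shell
# Hypothetical sample path; this only computes the target name.
fp='./styles/__my_partial.scss'

fn=${fp##*/}          # file name:            __my_partial.scss
lead=${fn%%[!_]*}     # leading underscores:  __
fn=${fn#"$lead"}      # without them:         my_partial.scss
target="${fp%/*}/$fn" # directory part plus the new file name

printf '%s\n' "$target"
```

Underscores in the middle of the name survive because only the run before the first non-underscore character is removed.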
Look at the syntax highlighting of the sed command. Single quotes can't be nested. Easiest fix: switch to double quotes.
find . -name '*.scss' -print0 | xargs -0 -n1 bash -c 'mv "$0" `echo $0 | sed -e "s:^_*::"`'
I would recommend leaving $0 as the program name and using $1 for the first argument. That way if bash prints an error message it'll prefix it with bash:.
find . -name '*.scss' -print0 | xargs -0 -n1 bash -c 'mv "$1" `echo "$1" | sed -e "s:^_*::"`' bash
You can also simplify this with find -exec.
find . -name '*.scss' -exec bash -c 'mv "$1" `echo "$1" | sed -e "s:^_*::"`' bash {} ';'
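You could also let bash do the substitution with its ${var#prefix} prefix removal syntax. Because the paths find passes start with ./, apply it to the file name rather than to the whole path:
find . -name '_*.scss' -exec bash -c 'f=${1##*/}; mv "$1" "${1%/*}/${f#_}"' bash {} ';'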
You don't need find at all (unless you are trying to do this with an ancient version of bash, such as what macOS ships):
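shopt -s globstar extglob
for f in ./**/_*.scss; do
  mv -- "$f" "${f%/*}/${f##*/*(_)}"
done
${f##...} expands f, minus the longest prefix matching .... The extended pattern *(_) matches 0 or more _, analogous to the regular expression _*. Here the prefix */*(_) removes the directory part together with the leading underscores of the file name, and ${f%/*} puts the directory back, so files in subdirectories are renamed in place.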
find . -iname "*.txt" -exec program '{}' \; | sed 's/Value= //'
"program" returns a different value for each file, and its output is prefixed with "Value= ".
In this case the output is "Value= 128", and after sed it is just 128.
How can I take just the value "128" and have the input file renamed to 128.txt,
while also having this find run through multiple files?
Sorry for the poor description.
I will try to clarify if needed.
First write a shell script capable of renaming an argument:
mv "$1" "$(program "$1" | sed "s/Value= //").txt"
Then embed that script in your find command:
find . -iname "*.txt" \
-exec sh -c 'mv "$1" "$(program "$1" | sed "s/Value= //").txt"' _ {} \;
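A sketch of the same idea that can be tried safely in a scratch directory. The program function below is a hypothetical stand-in that always prints "Value= 128", and, unlike the command above, the sketch keeps the renamed file in its original directory rather than moving it to the current one.

```shell
work=$(mktemp -d)
: > "$work/input.txt"

find "$work" -iname '*.txt' -exec sh -c '
  # Hypothetical stand-in for the real "program"; it always reports 128.
  program() { echo "Value= 128"; }
  value=$(program "$1" | sed "s/Value= //")
  # Keep the renamed file next to the original.
  mv -- "$1" "${1%/*}/$value.txt"
' _ {} \;

renamed=$(ls "$work")
echo "$renamed"
rm -rf "$work"
```

If two files produce the same value, the second mv overwrites the first result, so real data may need a check before renaming.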
In my .bash_profile, I have a function that returns all php files containing the parameter string passed in:
summon() {
"find . -name '*.php' -exec grep -ril '$1' '{}' \;"
}
When I am on my command line (mac) and I run summon foo, I get the error:
-bash: find . -name '*.php' -exec grep -ril 'foo' '{}' \;: command not found
But if I just copy/paste the find . -name '*.php' -exec grep -ril 'foo' '{}' \; into the command line, then it works properly, returning all of the php files that contain the string 'foo'.
Does anyone have any idea why the function is not being evaluated?
Just remove the quotes from your summon function. By quoting the whole line, you are telling bash to look for a command called find . -name '*.php' -exec grep -ril '$1' '{}' \; rather than a command called find with the arguments . -name '*.php' -exec grep -ril '$1' '{}' \;. There is a good reason for this: consider an application whose name contains a space (let's call it foo bar). Without this quoting syntax, the program would be harder to execute from bash, because typing foo bar would run the command foo with the argument bar instead of running foo bar. (As a side note, you could also run it by escaping the space: foo\ bar.) Of course, naming an executable with a space is considered bad form precisely because it makes the command more complicated to run.
Your function should look like this:
summon() {
find . -name '*.php' -exec grep -ril "$1" '{}' \;
}
Also see @gniourf_gniourf's comment on this answer for a few more suggestions, including using -type f on the find command to limit the search to files and removing the unnecessary -r flag from grep, because everything passed to grep will already be a regular file.
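A sketch that combines those suggestions and can be tried in a scratch directory. The temporary directory and the two sample files are assumptions for the demonstration; the -- before the pattern and the + terminator are additions beyond the suggestions mentioned.

```shell
summon() {
  # "$1" stays a single argument even if it contains spaces.
  find . -type f -name '*.php' -exec grep -il -- "$1" {} +
}

# Hypothetical sample files in a temporary directory.
demo=$(mktemp -d)
printf '<?php echo "foo bar";\n' > "$demo/hit.php"
printf '<?php echo "nothing";\n' > "$demo/miss.php"

found=$(cd "$demo" && summon 'foo bar')
printf '%s\n' "$found"
rm -rf "$demo"
```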
Lose the double quotes around the find within the function.
summon() {
find . -name '*.php' -exec grep -il "$1" '{}' +
}
Within double quotes, the shell still performs expansion, but it then treats the whole quoted string as a single word, which here becomes the command name (see Shell Expansions in the Bash manual).
The argument inside single quotes is the problem. Try it like this:
summon() {
find . -name '*.php' -exec grep -ril "$1" {} \;
}
I found behavior of the "find -exec" command that was unexpected to me, and I would appreciate an explanation. The same job can be done with a "for file_name in find ....; do...." loop, so the question is why it doesn't work with the -exec option of find.
There are two folders (SRC/ and src/) with the same set of files. I want to compare the files in these folders:
find src/ -type f -exec sh -c "diff {} `echo {} | sed 's/src/SRC/'`" \;
This, however, doesn't compare the files. For some reason the sed command doesn't make the substitution. If there is only one file, e.g. "a", in each of these folders, then the command
find src/ -type f -exec sh -c "echo {} `echo {} | sed 's/src/SRC/'`" \;
outputs
src/a src/a
If one does a similar thing in bash, all of the following commands give the same result (SRC/a):
echo src/a | sed 's/src/SRC/'
echo `echo src/a | sed 's/src/SRC/'`
sh -c "echo src/a | sed 's/src/SRC/'"
sh -c "echo `echo src/a | sed 's/src/SRC/'`"
but if these commands are supplied to "find -exec ...", the outputs differ:
find src/ -type f -exec bash -c "echo {} | sed 's/src/SRC/'" \;
gives "SRC/a"
and
find src/ -type f -exec bash -c "echo `echo {} | sed 's/src/SRC/'`" \;
gives "src/a"
Is that the expected behavior?
Use single quotes for the sh -c script; with double quotes, the script is interpreted by your own shell first. Also pass the filename as an argument to sh instead of using {} inside the quotes:
find src/ -type f -exec sh -c 'diff "$1" "$(printf "%s\n" "$1" | sed "s/src/SRC/")"' _ {} \;
Or with bash:
find src/ -type f -exec bash -c 'diff "$1" "${1/src/SRC}"' _ {} \;
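A sketch of the portable variant on throwaway data. The directory contents are made up, and cmp -s with a custom message is swapped in for diff so the output is one predictable line per differing file.

```shell
# Hypothetical sample trees: a is identical in both, b differs.
work=$(mktemp -d)
mkdir "$work/src" "$work/SRC"
echo same > "$work/src/a"; echo same > "$work/SRC/a"
echo old  > "$work/src/b"; echo new  > "$work/SRC/b"

report=$(cd "$work" && find src/ -type f -exec sh -c '
  # Runs inside sh, once per file, so the substitution sees the real path.
  other=$(printf "%s\n" "$1" | sed "s/src/SRC/")
  cmp -s "$1" "$other" || echo "$1 differs from $other"
' _ {} \;)

echo "$report"
rm -rf "$work"
```

Note that s/src/SRC/ replaces only the first occurrence of src in the path, which is the top-level directory here.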