bash get relative path from absolute - bash

This line gets the absolute path; I used the output to pass it to rsync, but rsync wants a relative path:
find /www-data/ -type f -exec sh -c 'if ! lsof `readlink -f {}` > /dev/null; then echo `realpath {}`; fi' \; | tr '\n' '\0'
I have no idea how to feed realpath --relative-to from the above output.
Full code:
cd /www-data
find ./ -type f -exec sh -c 'if ! lsof `readlink -f {}` > /dev/null; then echo `realpath {}`; fi' \; | tr '\n' '\0' | rsync -avz --from0 --files-from=- ./ /data/map/uploads/ --dry-run

Using tr '\n' '\000' is fundamentally broken. The reason you want to push in null-terminated strings is to disambiguate between newlines which are part of a file name, and those which aren't; but if you are replacing all newlines, you are not disambiguating anything. Perhaps see also https://mywiki.wooledge.org/BashFAQ/020
Somewhat similarly, echo `command` is just a useless use of echo, unless you specifically want the shell to squish whitespace and expand wildcards in the output from command.
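For illustration only (demo.txt is a made-up file, not part of the question):
printf 'two  spaces\n' > demo.txt
echo `cat demo.txt`   # prints "two spaces" - the shell squished the whitespace
cat demo.txt          # prints "two  spaces" - preserved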
If I'm allowed to guess slightly at what you are actually trying to ask here, try
find ./ -type f -exec sh -c 'for f; do
    lsof "$(readlink -f "$f")" > /dev/null ||
        printf "%s\0" "$(realpath --relative-to /www-data "$f")"
  done' _ {} + |
rsync -avz --from0 --files-from=- ./ /data/map/uploads/ --dry-run
The crucial change is really to have find pass in the file names as arguments to sh -c '...' rather than try to replace {} smack dab in the middle of a string which may or may not require quoting.
Using -exec ... {} + with a + at the end should improve efficiency somewhat, at the very minor cost of adding a for loop to the embedded sh script.
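A quick sketch of the batching, if you want to see it (illustrative only):
find . -type f -exec sh -c 'echo "batch of $# file(s)"' _ {} +
With \; in place of + the same command prints "batch of 1 file(s)" once per file; with + it prints one line per large batch.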

Related

Solution for find -exec if single and double quotes already in use

I would like to recursively go through all subdirectories and remove the oldest two PDFs in each subfolder named "bak":
Works:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && pwd" \;
Does not work, as the double quotes are already in use:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && rm "$(ls -t *.pdf | tail -2)"" \;
Any solution to the double quote conundrum?
In a double quoted string you can use backslashes to escape other double quotes, e.g.
find ... "rm \"\$(...)\""
If that is too convoluted use variables:
cmd='$(...)'
find ... "rm $cmd"
However, I think your find -exec has more problems than that.
Using {} inside the command string "cd '{}' ..." is risky. If there is a ' inside the file name, things will break and might execute unexpected commands.
$() will be expanded by bash before find even runs. So ls -t *.pdf | tail -2 will only be executed once in the top directory . instead of once for each found directory. rm will (try to) delete the same file for each found directory.
rm "$(ls -t *.pdf | tail -2)" will not work if ls lists more than one file. Because of the quotes both files would be listed in one argument. Therefore, rm would try to delete one file with the name first.pdf\nsecond.pdf.
I'd suggest
cmd='cd "$1" && ls -t *.pdf | tail -n2 | sed "s/./\\\\&/g" | xargs rm'
find . -type d -name bak -exec bash -c "$cmd" -- {} \;
You have a more fundamental problem; because you are using the weaker double quotes around the entire script, the $(...) command substitution will be interpreted by the shell which parses the find command, not by the bash shell you are starting, which will only receive a static string containing the result from the command substitution.
If you switch to single quotes around the script, you get most of it right; but that would still fail if the file name you find contains a double quote (just like your attempt would fail for file names with single quotes). The proper fix is to pass the matching files as command-line arguments to the bash subprocess.
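A self-contained demo of the difference (illustrative only):
cd /tmp
bash -c "cd /; echo $(pwd)"   # prints /tmp - $(pwd) was expanded by the OUTER shell
bash -c 'cd /; echo $(pwd)'   # prints / - $(pwd) runs in the inner shell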
But a better fix still is to use -execdir so that you don't have to pass the directory name to the subshell at all:
find . -type d -name "bak" \
-execdir bash -c 'ls -t *.pdf | tail -2 | xargs -r rm' \;
This could still fail in funny ways because you are parsing the output of ls, which is inherently fragile.
You are explicitly asking for find -exec. Usually I would just combine find -exec with find -delete, but in your case only two files should be deleted, so the only method is running a subshell. Socowi already gave a nice solution; however, if your file names do not contain tabs or newlines, another workaround is a find | while read loop.
This sorts the files by mtime:
find . -type d -iname 'bak' |
while read -r dir; do
    find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" |
        sort | head -n2 | cut -f2- |
        while read -r file; do
            rm "$file"
        done
done
The above find | while read loop as a one-liner:
find . -type d -iname 'bak' | while read -r dir; do find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" | sort | head -n2 | cut -f2- | while read -r file; do rm "$file"; done; done;
A find | while read loop can also handle NUL-terminated file names. However, head cannot handle those, so I improved the other answers to make this work with nontrivial file names (GNU + bash only).
Replace realpath with rm once the output looks right:
#!/bin/bash
rm_old () {
    find "$1" -maxdepth 1 -type f -iname \*.$2 -printf "%T+\t%p\0" |
        sort -z | sed -zn 's,\S*\t\(.*\),\1,p' |
        grep -zim$3 \.$2$ | xargs -0r realpath
}
export -f rm_old
find -type d -iname bak -execdir bash -c 'rm_old "{}" pdf 2' \;
However, bash -c might still be exploitable; to make it more secure, let stat %N do the quoting:
#!/bin/bash
rm_old () {
    local dir="$1"
    # we don't like eval
    # eval "dir=$dir"
    # this works like eval
    dir="${dir#?}"
    dir="${dir%?}"
    dir="${dir//"'$'\t''"/$'\011'}"
    dir="${dir//"'$'\n''"/$'\012'}"
    dir="${dir//$'\047'\\$'\047'$'\047'/$'\047'}"
    find "$dir" -maxdepth 1 -type f -iname \*.$2 -printf '%T+\t%p\0' |
        sort -z | sed -zn 's,\S*\t\(.*\),\1,p' |
        grep -zim$3 \.$2$ | xargs -0r realpath
}
find -type d -iname bak -exec stat -c'%N' {} + | while read -r dir; do rm_old "$dir" pdf 2; done

Bash Recursively Remove Leading Underscore

I'm attempting to recursively remove a leading underscore from some scss files. Here's what I have.
find . -name '*.scss' -print0 | xargs -0 -n1 bash -c 'mv "$0" `echo $0 | sed -e 's:^_*::'`'
When I'm in a specific directory this works perfectly:
for FILE in *.scss; do mv $FILE `echo $FILE | sed -e 's:^_*::'`; done
What am I doing wrong in the find?
As the starting point is ., all paths that find prints start with a dot. Thus ^_* doesn't match anything and sed returns its input unchanged.
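You can see the failure in isolation (illustrative):
printf '%s\n' ./_foo.scss | sed -e 's:^_*::'
# output: ./_foo.scss - the leading ./ prevents ^_* from matching anything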
I wouldn't bother with sed or xargs though.
The script below works with any find and sh that isn't terribly broken, and properly handles filenames with underscores in the middle as well.
find . -name '_*.scss' -exec sh -c '
    for fp; do                      # pathname
        fn=${fp##*/}                # filename
        fn=${fn#"${fn%%[!_]*}"}     # filename w/o leading underscore(s)
        echo mv "$fp" "${fp%/*}/$fn"
    done' sh {} +
A less portable but shorter and much cleaner alternative in bash looks like:
shopt -s globstar extglob nullglob
for fp in ./**/_*.scss; do
    echo mv "$fp" "${fp%/*}/${fp##*/+(_)}"
done
Drop echo if the output looks good.
Look closely at the quoting of the sed command: single quotes can't be nested. Easiest fix: switch to double quotes.
find . -name '*.scss' -print0 | xargs -0 -n1 bash -c 'mv "$0" `echo $0 | sed -e "s:^_*::"`'
I would recommend leaving $0 as the program name and using $1 for the first argument. That way if bash prints an error message it'll prefix it with bash:.
find . -name '*.scss' -print0 | xargs -0 -n1 bash -c 'mv "$1" `echo "$1" | sed -e "s:^_*::"`' bash
You can also simplify this with find -exec.
find . -name '*.scss' -exec bash -c 'mv "$1" `echo "$1" | sed -e "s:^_*::"`' bash {} ';'
You could also let bash do the substitution with its ${var#prefix} prefix removal syntax:
find . -name '*.scss' -exec bash -c 'mv "$1" "${1#_}"' bash {} ';'
You don't need find at all (unless you are trying to do this with an ancient version of bash, such as what macOS ships):
shopt -s globstar extglob
for f in **/*.scss; do
    mv -- "$f" "${f##*(_)}"
done
${f##...} expands f, minus the longest prefix matching .... The extended pattern *(_) matches 0 or more _, analogous to the regular expression _*.
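A quick check in an interactive bash (illustrative):
shopt -s extglob
f=__partial.scss
echo "${f##*(_)}"   # -> partial.scss
echo "${f#_}"       # -> _partial.scss (plain ${f#_} strips only ONE underscore)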

How can I use sed to change my target dir in this shell command line?

I use this command line to find all the SVGs (thousands) in a directory and convert them to PNGs using Inkscape. Works great. Here is my issue. It outputs the PNGs in the same directory. I would like to change the target directory.
for i in `find /home/wyatt/test/svgsDIR -name "*.svg"`; do inkscape $i --export-background-opacity=0 --export-png=`echo $i | sed -e 's/svg$/png/'` -w 700 ; done
It appears $i is the file_path + file_name, and sed does a search/replace on the file extension. How do I search/replace my file_path? Or is there a better way to define a different target path within this command line?
Any help is much appreciated.
Would you please try:
destdir="DIR" # replace with your desired directory name
mkdir -p "$destdir"
find /home/wyatt/test/svgsDIR -name "*.svg" -print0 | while IFS= read -r -d "" i; do
    destfile="$destdir/$(basename -s .svg "$i").png"
    inkscape "$i" --export-background-opacity=0 --export-png="$destfile" -w 700
done
or
destdir="DIR"
mkdir -p "$destdir"
for i in /home/wyatt/test/svgsDIR/*.svg; do
    destfile="$destdir/$(basename -s .svg "$i").png"
    inkscape "$i" --export-background-opacity=0 --export-png="$destfile" -w 700
done
This may be off-topic, but it is not recommended to use a for loop that relies on word-splitting, especially when dealing with filenames. Keep in mind that filenames and pathnames may contain whitespace, newlines, tabs, or other special characters.
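To illustrate with a hypothetical name containing a space:
touch 'my icon.svg'
for i in `find . -name "*.svg"`; do echo "[$i]"; done
# [./my]
# [icon.svg]   - one file, two loop iterations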
Or as a one-liner (split for readability):
find /home/wyatt/test/svgsDIR -name "*.svg" |
xargs -I{} sh -c 'inkscape "{}" --export-background-opacity=0 --export-png='$destdir'/$(basename {} .svg).png -w 700'
It might also work with find's built-in -exec:
find /home/wyatt/test/svgsDIR -name "*.svg" -exec sh -c 'inkscape "{}" --export-background-opacity=0 --export-png='$destdir'/$(basename {} .svg).png -w 700' \;
Or by passing the target dir as an argument, to simplify quoting:
find /home/wyatt/test/svgsDIR -name "*.svg" -exec sh -c 'inkscape "$1" --export-background-opacity=0 --export-png="$2/$(basename "$1" .svg).png" -w 700' sh '{}' "$destdir" \;

Printing the shell find and remove command to screen and log file

I have a script that finds log files older than x days within a specified directory and removes them.
find $LOG_ARCHIVE/* -mtime +$DAYS_TO_KEEP_LOGS -exec rm -f {} \;
This is working as expected but I would like to have the option to print the processing to the screen and log file so I know what files (if any) have been deleted. I've tried appending tee at the end but have had no success.
find $LOG_ARCHIVE/* -mtime +$DAYS_TO_KEEP_LOGS -exec rm -fv {} \; | tee -a $LOG
There are multiple ways the task can be done.
One possibility is to simply run find twice:
find "$LOG_ARCHIVE" -mtime +"$DAYS_TO_KEEP_LOGS" -print > "$LOG"
find "$LOG_ARCHIVE" -mtime +"$DAYS_TO_KEEP_LOGS" -exec rm -f {} +
Another possibility is to use tee along with (GNU extensions) -print0 to find and -0 to xargs:
find "$LOG_ARCHIVE" -mtime +"$DAYS_TO_KEEP_LOGS" -print0 |
tee "$LOG" |
xargs -0 rm -f
With this version, the log file will have null bytes at the end of each file name. You can arrange to replace those with newlines if you don't mind the possible ambiguity:
find "$LOG_ARCHIVE" -mtime +"$DAYS_TO_KEEP_LOGS" -print0 |
tee >(tr '\0' '\n' >"$LOG") |
xargs -0 rm -f
This uses Bash (and Korn shell) process substitution to pass the log file through tr to map null bytes '\0' to newlines '\n'.
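Process substitution can be tried in isolation (names.log is just a scratch file for this demo):
printf 'a\0b\0' | tee >(tr '\0' '\n' > names.log) | xargs -0 printf '<%s>\n'
# stdout: <a> and <b> on separate lines; names.log receives the names newline-terminated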
Another way of doing it is to write a tiny custom script (call it remove-log.sh):
printf '%s\n' "$@" >> "$LOG"
rm -f "$@"
and then use:
find "$LOG_ARCHIVE" -mtime +"$DAYS_TO_KEEP_LOGS" -exec bash remove-log.sh {} +
Note that the script needs to see the value of $LOG, so that must be exported as an environment variable. You could avoid that by passing the log name explicitly:
logfile="$1"
shift
printf '%s\n' "$@" >> "$logfile"
rm -f "$@"
plus:
find "$LOG_ARCHIVE" -mtime +"$DAYS_TO_KEEP_LOGS" -exec bash remove-log.sh "$LOG" {} +
Note that both of these use >> to append because the script might be invoked more than once (though it probably won't be). The onus is on you to ensure that the log file is empty before you run the find command.
Note that I dropped the /* from the path argument for find; it wasn't really needed. You might want to add -type f to ensure that only files are removed. The + is a feature from the POSIX 2008 specification of find which makes find act rather like xargs without needing to explicitly use xargs.
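For instance, the two-pass variant with -type f added might look like this (untested sketch):
find "$LOG_ARCHIVE" -type f -mtime +"$DAYS_TO_KEEP_LOGS" -print > "$LOG"
find "$LOG_ARCHIVE" -type f -mtime +"$DAYS_TO_KEEP_LOGS" -exec rm -f {} +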
find $LOG_ARCHIVE/* -mtime +$DAYS_TO_KEEP_LOGS -exec sh -c 'echo {} | tee -a "$LOG"; rm -f {}' \;
Try and see if it works.

How to execute multiple commands after xargs -0?

find . -name "filename including space" -print0 | xargs -0 ls -aldF > log.txt
find . -name "filename including space" -print0 | xargs -0 rm -rdf
Is it possible to combine these two commands into one so that only 1 find will be done instead of 2?
I know for xargs -I there may be ways to do it, which may lead to errors when processing filenames that include spaces. Any guidance is much appreciated.
find . -name "filename including space" -print0 |
xargs -0 -I '{}' sh -c 'ls -aldF {} >> log.txt; rm -rdf {}'
Ran across this just now, and we can invoke the shell less often:
find . -name "filename including space" -print0 |
    xargs -0 sh -c '
        for file; do
            ls -aldF "$file" >> log.txt
            rm -rdf "$file"
        done
    ' sh
The trailing "sh" becomes $0 in the shell. xargs provides the files (returned from find) as command-line parameters to the shell; we iterate over them with the for loop.
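The $0 convention is easy to observe on its own (illustrative):
sh -c 'echo "script name: $0"; printf "arg: %s\n" "$@"' sh one two
# script name: sh
# arg: one
# arg: two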
If you just want to avoid doing the find multiple times, you could add a tee right after the find, saving the find output to a file, and then run the two commands from that file:
find . -name "filename including space" -print0 | tee my_teed_file | xargs -0 ls -aldF > log.txt
cat my_teed_file | xargs -0 rm -rdf
Another way to accomplish this same thing (if indeed it's what you're wanting to accomplish), is to store the output of the find in a variable (supposing it's not TB of data):
founddata=`find . -name "filename including space" -print0`
echo "$founddata" | xargs -0 ls -aldF > log.txt
echo "$founddata" | xargs -0 rm -rdf
I believe all the answers by now have given the right ways to solve this problem. I tried the two solutions from Jonathan and the one from Glenn, all of which worked great on my Mac OS X. mouviciel's method did not work on my OS, maybe due to some configuration; I think it's similar to Jonathan's second method (I may be wrong).
As mentioned in the comments to Glenn's method, a little tweak is needed. So here is the command I tried which worked perfectly FYI:
find . -name "filename including space" -print0 |
xargs -0 -I '{}' sh -c 'ls -aldF {} | tee -a log.txt ; rm -rdf {}'
Or better as suggested by Glenn:
find . -name "filename including space" -print0 |
xargs -0 -I '{}' sh -c 'ls -aldF {} >> log.txt ; rm -rdf {}'
As long as you do not have newlines in your filenames, you do not need -print0 for GNU Parallel:
find . -name "My brother's 12\" records" | parallel ls {}\; rm -rdf {} >log.txt
Watch the intro video to learn more: http://www.youtube.com/watch?v=OpaiGYxkSuQ
Just a variation of the xargs approach without that horrible -print0 and xargs -0, this is how I would do it:
ls -1 *.txt | xargs --delimiter "\n" --max-args 1 --replace={} sh -c 'cat {}; echo "\n"'
Footnotes:
Yes, I know newlines can appear in filenames, but who in their right mind would do that?
There are short options for xargs but for the reader's understanding I've used the long ones.
I would use ls -1 when I want non-recursive behavior rather than find -maxdepth 1 -iname "*.txt" which is a bit more verbose.
You can execute multiple commands after find using for instead of xargs:
IFS=$'\n'
for F in `find . -name "filename including space"`
do
    ls -aldF "$F" >> log.txt
    rm -rdf "$F"
done
The IFS defines the Internal Field Separator, which defaults to <space><tab><newline>. If your filenames may contain spaces, it is better to redefine it as above.
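A small sketch of the difference (hypothetical names; bash syntax):
names=$'a b.txt\nc.txt'
for f in $names; do echo "[$f]"; done   # default IFS: [a] [b.txt] [c.txt]
IFS=$'\n'
for f in $names; do echo "[$f]"; done   # newline-only IFS: [a b.txt] [c.txt]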
I'm late to the party, but there is one more solution that wasn't covered here: user-defined functions. Putting multiple instructions on one line is unwieldy, and can be hard to read/maintain. The for loop above avoids that, but there is the possibility of exceeding the command line length.
Here's another way (untested).
function processFiles {
    ls -aldF "$@"
    rm -rdf "$@"
}
export -f processFiles
find . -name "filename including space" -print0 \
    | xargs -0 bash -c 'processFiles "$@"' dummyArg > log.txt
This is pretty straightforward except for the "dummyArg" which gave me plenty of grief. When running bash in this way, the arguments are read into
"$0" "$1" "$2" ....
instead of the expected
"$1" "$2" "$3" ....
Since processFiles() expects the first argument to be "$1", we have to insert a dummy value into "$0".
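A one-line demonstration (illustrative):
bash -c 'echo "0=$0 1=$1"' first second   # prints: 0=first 1=second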
Footnotes:
I am using some elements of bash syntax (e.g. "export -f"), but I believe this will adapt to other shells.
The first time I tried this, I didn't add a dummy argument. Instead I added "$0" to the argument lines inside my function (e.g. ls -aldF "$0" "$@"). Bad idea.
Aside from stylistic issues, it breaks when the find command returns nothing. In that case, $0 is set to "bash". Using the dummy argument instead avoids all of this.
Another solution:
find . -name "filename including space" -print0 \
| xargs -0 -I FOUND echo "$(ls -aldF FOUND > log.txt ; rm -rdf FOUND)"
