Bash Recursively Remove Leading Underscore

I'm attempting to recursively remove a leading underscore from some scss files. Here's what I have.
find . -name '*.scss' -print0 | xargs -0 -n1 bash -c 'mv "$0" `echo $0 | sed -e 's:^_*::'`'
When I'm in a specific directory this works perfectly:
for FILE in *.scss; do mv $FILE `echo $FILE | sed -e 's:^_*::'`; done
What am I doing wrong in the find?

As the starting point is ., every path that find prints starts with ./ (for example ./dir/_file.scss). Thus ^_* matches only the empty string at the start of the path, and sed returns its input unchanged.
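A quick way to see this for yourself. The path below is made up for illustration, and the second sed expression is only one possible way to anchor on the filename instead of the start of the string:

```shell
# Hypothetical path, as find would print it when started from "."
path='./styles/_buttons.scss'

# Anchored at the start of the whole string: the first character is ".",
# so ^_* matches only the empty string and nothing is removed.
unchanged=$(printf '%s\n' "$path" | sed -e 's:^_*::')

# Anchored on the last slash instead: the underscores that start the
# filename are dropped, the directory part is kept.
fixed=$(printf '%s\n' "$path" | sed -e 's:/_*\([^/]*\)$:/\1:')

printf '%s\n%s\n' "$unchanged" "$fixed"
```

The first line of output is the untouched path, the second has the underscore removed from the filename only.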
I wouldn't bother with sed or xargs though.
The script below works with any find and sh that isn't terribly broken, and properly handles filenames with underscores in the middle as well.
find . -name '_*.scss' -exec sh -c '
for fp; do # pathname
fn=${fp##*/} # filename
fn=${fn#"${fn%%[!_]*}"} # filename w/o leading underscore(s)
echo mv "$fp" "${fp%/*}/$fn"
done' sh {} +
A less portable but shorter and much cleaner alternative in bash looks like:
shopt -s globstar extglob nullglob
for fp in ./**/_*.scss; do
echo mv "$fp" "${fp%/*}/${fp##*/+(_)}"
done
Drop echo if the output looks good.
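To check what the parameter expansions in the portable version do without touching any files, they can be wrapped in a small function. The function name and the sample paths are made up for this sketch, and it assumes the pathname contains at least one slash (which is always true for paths printed by find .):

```shell
# Hypothetical helper: print the new pathname for a given pathname.
strip_leading_underscores() {
    fp=$1
    dir=${fp%/*}                  # everything before the last slash
    fn=${fp##*/}                  # filename
    fn=${fn#"${fn%%[!_]*}"}       # drop the run of leading underscores
    printf '%s/%s\n' "$dir" "$fn"
}

# Underscores in directory names and in the middle of the filename stay.
strip_leading_underscores './a_b/__mixins_v2.scss'
strip_leading_underscores './_reset.scss'
```

The inner expansion ${fn%%[!_]*} cuts the name at its first non-underscore character, leaving just the leading underscores, which the outer ${fn#...} then removes.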

Look at the syntax highlighting of the sed command. Single quotes can't be nested. Easiest fix: switch to double quotes.
find . -name '*.scss' -print0 | xargs -0 -n1 bash -c 'mv "$0" `echo $0 | sed -e "s:^_*::"`'
I would recommend leaving $0 as the program name and using $1 for the first argument. That way if bash prints an error message it'll prefix it with bash:.
find . -name '*.scss' -print0 | xargs -0 -n1 bash -c 'mv "$1" `echo "$1" | sed -e "s:^_*::"`' bash
You can also simplify this with find -exec.
find . -name '*.scss' -exec bash -c 'mv "$1" `echo "$1" | sed -e "s:^_*::"`' bash {} ';'
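You could also let bash do the substitution with its ${var#prefix} prefix removal syntax. Keep in mind that the argument is a path such as ./dir/_file.scss, so the underscore has to be stripped from the filename part, not from the start of the string:
find . -name '_*.scss' -exec bash -c 'mv "$1" "${1%/*}/${1##*/_}"' bash {} ';'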

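You don't need find at all, provided your bash has globstar (bash 4.0 or later; macOS ships 3.2):
shopt -s globstar extglob
for f in **/_*.scss; do
name=${f##*/} # filename without the directory part
mv -- "$f" "${f%"$name"}${name##*(_)}"
done
${name##...} expands name, minus the longest prefix matching .... The extended pattern *(_) matches 0 or more _, analogous to the regular expression _*. The prefix is stripped from the filename rather than from the whole path, because a path such as dir/_file.scss does not start with an underscore.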
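A small experiment with the *(_) pattern, assuming bash with extglob available. The helper name and the sample names are invented for the demonstration:

```shell
#!/usr/bin/env bash
shopt -s extglob   # enables the *(...) pattern operator

# Hypothetical helper, only for the demonstration.
strip_prefix_underscores() {
    printf '%s\n' "${1##*(_)}"
}

strip_prefix_underscores '__partial.scss'    # both underscores removed
strip_prefix_underscores 'plain.scss'        # zero underscores also matches
strip_prefix_underscores 'sub/_nested.scss'  # starts with "s", nothing removed
```

The last call shows that the pattern only looks at the very start of the string, so it is worth applying it to the filename part of a path rather than to the path as a whole.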

Related

bash get relative path from absolute

This line prints absolute paths. I pass its output to rsync, but rsync wants relative paths:
find /www-data/ -type f -exec sh -c 'if ! lsof `readlink -f {}` > /dev/null; then echo `realpath {}`; fi' \; | tr '\n' '\0'
I have no idea how to apply realpath --relative-to to the output above.
Full code:
cd /www-data
find ./ -type f -exec sh -c 'if ! lsof `readlink -f {}` > /dev/null; then echo `realpath {}`; fi' \; | tr '\n' '\0' | rsync -avz --from0 --files-from=- ./ /data/map/uploads/ --dry-run
Using tr '\n' '\000' is fundamentally broken. The reason you want to push in null-terminated strings is to disambiguate between newlines which are part of a file name, and those which aren't; but if you are replacing all newlines, you are not disambiguating anything. Perhaps see also https://mywiki.wooledge.org/BashFAQ/020
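The difference is easy to observe in a scratch directory. This sketch assumes a filesystem that allows newlines in file names and a find that supports -print0 (GNU and BSD find both do); it only counts how many NUL-terminated records each approach produces for a single file:

```shell
# Work in a throwaway directory so nothing real is touched.
tmp=$(mktemp -d)

# One file whose name contains a newline.
touch "$tmp/first
second.txt"

# tr -cd '\0' keeps only the NUL bytes, wc -c counts them.
with_tr=$(find "$tmp" -type f | tr '\n' '\0' | tr -cd '\0' | wc -c | tr -d ' ')
with_print0=$(find "$tmp" -type f -print0 | tr -cd '\0' | wc -c | tr -d ' ')

echo "find | tr:    $with_tr records"
echo "find -print0: $with_print0 record"
rm -rf "$tmp"
```

With tr, the newline inside the name is converted as well, so the one file arrives as two bogus names; with -print0 it stays a single record.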
Somewhat similarly, echo `command` is just a useless use of echo, unless you specifically want the shell to squish whitespace and expand wildcards in the output from command.
If I'm allowed to guess slightly at what you are actually trying to ask here, try
find ./ -type f -exec sh -c 'for f; do
lsof "$(readlink -f "$f")" > /dev/null ||
printf "%s\0" "$(realpath --relative-to /www-data "$f")"
done' _ {} + |
rsync -avz --from0 --files-from=- ./ /data/map/uploads/ --dry-run
The crucial change is really to have find pass in the file names as arguments to sh -c '...' rather than try to replace {} smack dab in the middle of a string which may or may not require quoting.
Using -exec ... {} + with a + at the end should improve efficiency somewhat, at the very minor cost of adding a for loop to the embedded sh script.
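As a minimal sketch of that argument-passing pattern on its own (the file names are invented, and the loop body just prints each base name instead of calling lsof or realpath):

```shell
tmp=$(mktemp -d)
touch "$tmp/it's here.txt" "$tmp/plain.txt"

# The names arrive as "$@" of the inner sh, so nothing in the script
# text depends on which characters the names contain.
listed=$(find "$tmp" -type f -exec sh -c '
    for f; do
        printf "<%s>\n" "${f##*/}"
    done' _ {} + | sort)

printf '%s\n' "$listed"
rm -rf "$tmp"
```

The name containing a single quote and a space comes through intact, which would not be the case if {} were pasted into the middle of the quoted script.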

Solution for find -exec if single and double quotes already in use

I would like to recursively go through all subdirectories and remove the oldest two PDFs in each subfolder named "bak":
Works:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && pwd" \;
Does not work, as the double quotes are already in use:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && rm "$(ls -t *.pdf | tail -2)"" \;
Any solution to the double quote conundrum?
In a double quoted string you can use backslashes to escape other double quotes, e.g.
find ... "rm \"\$(...)\""
If that is too convoluted use variables:
cmd='$(...)'
find ... "rm $cmd"
However, I think your find -exec has more problems than that.
Using {} inside the command string "cd '{}' ..." is risky. If there is a ' inside the file name things will break and might execute unexpected commands.
$() will be expanded by bash before find even runs. So ls -t *.pdf | tail -2 will only be executed once in the top directory . instead of once for each found directory. rm will (try to) delete the same file for each found directory.
rm "$(ls -t *.pdf | tail -2)" will not work if ls lists more than one file. Because of the quotes both files would be listed in one argument. Therefore, rm would try to delete one file with the name first.pdf\nsecond.pdf.
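The timing of the expansion can be demonstrated without find. This sketch uses a scratch directory and pwd as a stand-in for the ls pipeline:

```shell
tmp=$(mktemp -d)
mkdir "$tmp/sub"
cd "$tmp" || exit 1

# Double quotes: the outer shell runs $(pwd) first, then starts sh
# with a script that already contains the result as fixed text.
early=$(sh -c "cd sub && echo $(pwd)")

# Single quotes: the text $(pwd) reaches the inner sh untouched and
# is evaluated there, after the cd.
late=$(sh -c 'cd sub && echo $(pwd)')

echo "double quotes: $early"
echo "single quotes: $late"

cd / && rm -rf "$tmp"
```

Only the single-quoted version reports the sub directory; the double-quoted one reports wherever the outer shell happened to be.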
I'd suggest
cmd='cd "$1" && ls -t *.pdf | tail -n2 | sed "s/./\\\\&/g" | xargs rm'
find . -type d -name bak -exec bash -c "$cmd" -- {} \;
You have a more fundamental problem; because you are using the weaker double quotes around the entire script, the $(...) command substitution will be interpreted by the shell which parses the find command, not by the bash shell you are starting, which will only receive a static string containing the result from the command substitution.
If you switch to single quotes around the script, you get most of it right; but that would still fail if the file name you find contains a double quote (just like your attempt would fail for file names with single quotes). The proper fix is to pass the matching files as command-line arguments to the bash subprocess.
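But a better fix still is to use -execdir. It runs the command from the directory that contains each match, so the argument is just the directory's own name (./bak with GNU find) rather than a long path; it still has to be passed to the subshell as an argument:
find . -type d -name "bak" \
-execdir bash -c 'cd "$1" && ls -t *.pdf | tail -n 2 | xargs -r rm' _ {} \;
This could still fail in funny ways because you are parsing ls, which is inherently fragile.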
You are explicitly asking for find -exec. Usually I would just chain find -exec find -delete, but in your case only two files should be deleted, so the only option is to run a subshell. Socowi already gave a nice solution; however, if your file names do not contain tabs or newlines, another workaround is a find | while read loop.
This sorts the files by mtime:
find . -type d -iname 'bak' | \
while read -r dir;
do
find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" | \
sort | head -n2 | \
cut -f2- | \
while read -r file;
do
rm "$file";
done;
done;
The same find | while read loop as a "one-liner":
find . -type d -iname 'bak' | while read -r dir; do find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" | sort | head -n2 | cut -f2- | while read -r file; do rm "$file"; done; done;
A find | while read loop can also handle NUL-terminated file names. However, head cannot, so I improved on the other answers and made it work with nontrivial file names (GNU tools and bash only).
Replace realpath with rm to actually delete the files.
#!/bin/bash
rm_old () {
find "$1" -maxdepth 1 -type f -iname \*.$2 -printf "%T+\t%p\0" | sort -z | sed -zn 's,\S*\t\(.*\),\1,p' | grep -zim$3 \.$2$ | xargs -0r realpath
}
export -f rm_old
find -type d -iname bak -execdir bash -c 'rm_old "{}" pdf 2' \;
However, bash -c might still be exploitable; to make it more secure, let stat %N do the quoting:
#!/bin/bash
rm_old () {
local dir="$1"
# we don't like eval
# eval "dir=$dir"
# this works like eval
dir="${dir#?}"
dir="${dir%?}"
dir="${dir//"'$'\t''"/$'\011'}"
dir="${dir//"'$'\n''"/$'\012'}"
dir="${dir//$'\047'\\$'\047'$'\047'/$'\047'}"
find "$dir" -maxdepth 1 -type f -iname \*.$2 -printf '%T+\t%p\0' | sort -z | sed -zn 's,\S*\t\(.*\),\1,p' | grep -zim$3 \.$2$ | xargs -0r realpath
}
find -type d -iname bak -exec stat -c'%N' {} + | while read -r dir; do rm_old "$dir" pdf 2; done

Bash: bad result of command substitution

I want to replace spaces in filenames. My test directory contains files with spaces:
$ ls
'1 2 3.txt' '4 5.txt' '6 7 8 9.txt'
For example this code works fine:
$ printf "$(printf 'spaces in file name.txt' | sed 's/ /_/g')"
spaces_in_file_name.txt
I replace the spaces with underscores, and the command substitution returns the result as text inside the double quotes. This construction matters for the next case. Commands such as find and xargs have a substitution placeholder, {} (curly braces), so I expected the next command to replace the spaces in the file names:
$ find ./ -name "*.txt" -print0 | xargs --null -I '{}' mv '{}' "$( printf '{}' | sed 's/ /_/g' )"
mv: './6 7 8 9.txt' and './6 7 8 9.txt' are the same file
mv: './4 5.txt' and './4 5.txt' are the same file
mv: './1 2 3.txt' and './1 2 3.txt' are the same file
But I get an error. To see the problem more clearly, I use echo (or printf) instead of mv:
$ find ./ -name "*.txt" -print0 | xargs --null -I '{}' echo "$( printf '{}' | sed 's/ /_/g' )"
./6 7 8 9.txt
./4 5.txt
./1 2 3.txt
As we can see, the spaces were not replaced with underscores. But without command substitution, the replacement is correct:
$ find ./ -name "*.txt" -print0 | xargs --null -I '{}' printf '{}\n' | sed 's/ /_/g'
./6_7_8_9.txt
./4_5.txt
./1_2_3.txt
So combining command substitution with the curly braces corrupts the result (even though the very first command gave the correct result), while without command substitution the result is correct. But why?
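Your command substitution is run by the calling shell before find and xargs even start, so sed only ever sees the literal string {} (which contains no spaces), and what xargs ends up executing is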
mv '{}' "{}"
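This can be confirmed in two steps. The sketch below uses a made-up input line in place of find's output and echo in place of mv:

```shell
# What the calling shell hands to xargs as the last argument:
arg=$(printf '{}' | sed 's/ /_/g')
echo "argument seen by xargs: $arg"

# The replacement was therefore applied to the literal "{}"; xargs then
# substitutes the unmodified input line into it.
result=$(printf 'a b c.txt\n' | xargs -I '{}' echo "$(printf '{}' | sed 's/ /_/g')")
echo "$result"
```

sed has nothing to replace in {}, so the placeholder reaches xargs unchanged and the spaces in the file name survive.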
You could change the find command to match .txt files with at least one space character and use -exec and a small bash script to rename the files:
find . -type f -name "* *.txt" -exec bash -c '
for file; do
fname=${file##*/}
mv -i "$file" "${file%/*}/${fname// /_}"
done
' bash {} +
${file##*/} removes the parent directories (longest prefix pattern */) and leaves the filename (like the basename command)
${file%/*} removes the filename (shortest suffix pattern /*) and leaves the parent directories (like the dirname command)
${fname// /_} replaces all spaces with underscores
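The three expansions can be tried on a sample value. The path here is made up, and ${var// /_} needs bash rather than plain sh:

```shell
#!/usr/bin/env bash
file='./my docs/annual report 2020.txt'   # made-up example path

fname=${file##*/}           # filename only
dir=${file%/*}              # parent directories only
new="$dir/${fname// /_}"    # spaces replaced in the filename only
whole=${file// /_}          # for comparison: spaces replaced everywhere

printf '%s\n' "$fname" "$dir" "$new" "$whole"
```

The last value shows why the replacement is applied to the filename alone: done on the whole path, it would also rewrite the directory name, which does not exist under the new spelling.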
A plain loop is quite fast and simple; just replace absolute_path with your path:
for f in absolute_path/*.txt; do mv "$f" "${f// /_}";done
The ${f// /_} part uses bash's parameter expansion to replace every occurrence of a pattern in a parameter with the supplied string. It is applied to the whole path here, so this assumes absolute_path itself contains no spaces.

Strip off characters from xargs {} in bash

Assume I have files named "*.data.done". Now I want to rename the ones which contain "pattern" back to "*.data" (recursively).
So here we go:
grep -l -R -F "pattern" --include '*.data.done' * | xargs -I{} mv {} ${{}::-5}
Well, this stripping of '.done' is not working (bash 4.3.11):
bash: ${{}::-5}: bad substitution
What is the easiest way to do this?
The placeholder {} cannot be used inside bash's ${...} string manipulations: the calling shell tries to expand ${{}::-5} itself, before xargs ever runs.
You can use:
grep -lRF "pattern" --include '*.data.done' . |
xargs -I{} bash -c 'mv "$1" "${1%.done}"' _ {}
However, if you want to avoid spawning a bash process for each file, use a while loop:
while IFS= read -d '' -r f; do
mv "$f" "${f%.done}"
done < <(grep -lRF "pattern" --include '*.data.done' --null .)
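When choosing an expansion to strip .done, note that ${f/.done} and ${f%.done} behave differently if the string occurs more than once. The path below is invented to show the difference:

```shell
#!/usr/bin/env bash
f='./job.done.d/report.data.done'   # made-up path with ".done" twice

first_match=${f/.done}    # removes the first ".done" found anywhere
suffix_only=${f%.done}    # removes ".done" only at the very end

printf '%s\n' "$first_match" "$suffix_only"
```

For names where .done appears only as the final suffix the two give the same result; the suffix form is the safer default for a rename.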

How can I get xargs to do something with the input, then do another thing?

I'm in zsh.
I'd like to do something like:
find . -iname *.md | xargs cat && echo "---" > all_slides_with_separators_in_between.md
Of course this cats all the slides, then appends a single "---" at the end instead of after each slide.
Is there an xargs way of doing this? Can I replace cat && echo "---" with some inline function or do block?
Very strangely, when I create a file cat---.sh with the contents
cat $1
echo ---
and run
find . -iname *.md | xargs ./cat---.sh
it only executes for the first result of find.
Replace cat---.sh with cat and it runs on both files.
There's no need to use xargs at all here. Following is a properly paranoid approach (robust against files with spaces, files with newlines, files with literal backslashes in their names, etc):
while IFS= read -r -d '' filename; do
printf '---\n'
cat -- "$filename"
done < <(find . -iname '*.md' -print0) >all_slides_with_separators.md
However -- you don't even need that either: find can do all the work itself, both printing the separator and calling cat!
find . -iname '*.md' -printf '---\n' -exec cat -- '{}' ';' >all_slides_with_separators.md
A common usage pattern is xargs sh -c 'command; another' _ where the entire shell script in the quotes will have access to the command-line arguments. The underscore is because the first argument to sh -c will be assigned to $0 (where you'd often see e.g. -sh in a ps listing).
find . -iname '*.md' |
xargs sh -c 'for x; do
cat "$x" && echo "---"
done' _ > all_slides_with_separators_in_between.md
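To make the role of the underscore concrete, here is a small sketch with made-up arguments instead of real files:

```shell
# "_" fills $0; the remaining words become $1, $2, ... of the inline script.
out=$(sh -c 'echo "0=$0 1=$1 2=$2 count=$#"' _ first second)
echo "$out"

# xargs simply appends its input items after the "_" placeholder.
joined=$(printf '%s\n' a.md b.md | xargs sh -c 'for x; do printf "[%s]" "$x"; done' _)
echo "$joined"
```

Without the placeholder, the first item from xargs would be swallowed as $0 and silently skipped by the for loop.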
As noted in the comments, you should probably investigate find -print0 and the corresponding xargs -0 option in GNU find (and maybe install it if you don't have it).
You can do something like this, but it can be insecure in some cases (see comments):
find . -iname '*.md' | xargs -I % sh -c '{ cat %; echo "----"; }' > output.txt
You'll rarely need find in zsh; its globbing facilities cover nearly every use case of find.
for f in (#i)**/*.md; do
cat $f
print -- "---"
done > all_slides.md
This looks in the current directory hierarchy for every file that matches *.md in a case-insensitive manner. (The (#i) flag requires setopt extendedglob.)
For even more efficiency, replace cat $f with < $f; zsh itself will read the file and write its contents to standard output.
Using GNU Parallel it looks like this:
parallel cat {}\; print -- --- ::: **/*.md
