How to use xargs to replace 2 arguments - bash

I would like to write a script to convert the SVG files in a directory to PNG using the svgexport CLI:
svgexport input.svg input.jpg
How can I use find and xargs -I {} to find and print out the SVG files using the following?
find . -iname "*.svg" | xargs -I {} svgexport {} ???
How can I fill in the second argument by using the first argument and replacing .svg with .jpg?

You can use bash -c in xargs and use Bash's string replacement:
find . -name "*.svg" -print0 |
xargs -0 -I {} bash -c 'svgexport "$1" "${1%.svg}.jpg"' - {}
The trailing - {} pair becomes $0 and $1 inside the bash -c script, so the quoted "$1" safely carries filenames with spaces.

I think it's best to do this with a while loop:
find . -iname "*.svg" -print0 |
while IFS= read -r -d '' file; do
svgexport "$file" "${file%.svg}.jpg"
done
IFS= and -r stop read from trimming whitespace and interpreting backslashes, and -d '' makes it consume the NUL-delimited names produced by -print0.

Do them all, simply and faster, in parallel with GNU Parallel:
parallel --dry-run svgexport {} {.}.jpg ::: *.svg
Remove the --dry-run if you like what it shows you, and run it again to actually process the files.
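For illustration, with two made-up files a.svg and b.svg in the directory, the dry run prints the commands it would execute ({.} is the input name minus its extension):
svgexport a.svg a.jpg
svgexport b.svg b.jpg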

You can avoid xargs when you use find. It already provides the same feature in a simpler way (-exec with the + terminator):
find . -iname "*.svg" -exec \
bash -c 'for i do svgexport "$i" "${i::-3}jpg";done' bash {} +
Should you not want to recurse into subdirectories:
find . -maxdepth 1 -iname "*.svg" -exec \
bash -c 'for i do svgexport "$i" "${i::-3}jpg";done' bash {} +
but in that case, find is not necessary either:
for i in *.[sS][vV][gG]; do svgexport "$i" "${i::-3}jpg"; done
If the suffix is always in lowercase, this can be slightly simplified:
for i in *.svg; do svgexport "$i" "${i%svg}jpg"; done
If your bash version doesn't support ${i::-3}, you can use the portable ${i%???} instead.
Should you want to avoid the bashisms and the find GNUisms, here is a POSIX way to achieve the recursive processing:
find . -name "*.[Ss][Vv][Gg]" -exec \
sh -c 'for i do svgexport "$i" "${i%???}jpg";done' sh {} +
and another for the non-recursive one:
for i in *.svg; do svgexport "$i" "${i%???}jpg"; done

Easy:
find . -iname "*.svg" -print0 | xargs -0 -I {} \
bash -c 'export file="{}"; svgexport "$file" "${file%.*}.jpg"'
-print0 on find and -0 on xargs deal with special filenames.
${file%.*} removes all characters after the last dot, so it strips ".svg" and lets you append the new file extension.
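For reference, a quick illustration of the suffix-stripping expansions used across these answers (the file name is made up):
file=./icons/logo.svg
echo "${file%.svg}.jpg"   # ./icons/logo.jpg  (strip a known suffix)
echo "${file%.*}.jpg"     # ./icons/logo.jpg  (strip everything after the last dot)
echo "${file%???}jpg"     # ./icons/logo.jpg  (strip the last three characters, POSIX)
echo "${file::-3}jpg"     # ./icons/logo.jpg  (negative-length slice, bash 4.2+)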

Related

Solution for find -exec if single and double quotes already in use

I would like to recursively go through all subdirectories and remove the oldest two PDFs in each subfolder named "bak":
Works:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && pwd" \;
Does not work, as the double quotes are already in use:
find . -type d -name "bak" \
-exec bash -c "cd '{}' && rm "$(ls -t *.pdf | tail -2)"" \;
Any solution to the double quote conundrum?
In a double-quoted string you can use backslashes to escape other double quotes, e.g.
find ... "rm \"\$(...)\""
If that is too convoluted use variables:
cmd='$(...)'
find ... "rm $cmd"
However, I think your find -exec has more problems than that.
Using {} inside the command string "cd '{}' ..." is risky. If there is a ' inside the file name, things will break and unexpected commands might be executed.
$() will be expanded by bash before find even runs. So ls -t *.pdf | tail -2 will only be executed once in the top directory . instead of once for each found directory. rm will (try to) delete the same file for each found directory.
rm "$(ls -t *.pdf | tail -2)" will not work if ls lists more than one file. Because of the quotes both files would be listed in one argument. Therefore, rm would try to delete one file with the name first.pdf\nsecond.pdf.
I'd suggest
cmd='cd "$1" && ls -t *.pdf | tail -n2 | sed "s/./\\\\&/g" | xargs rm'
find . -type d -name bak -exec bash -c "$cmd" -- {} \;
You have a more fundamental problem; because you are using the weaker double quotes around the entire script, the $(...) command substitution will be interpreted by the shell which parses the find command, not by the bash shell you are starting, which will only receive a static string containing the result from the command substitution.
If you switch to single quotes around the script, you get most of it right; but that would still fail if the file name you find contains a double quote (just like your attempt would fail for file names with single quotes). The proper fix is to pass the matching files as command-line arguments to the bash subprocess.
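A sketch of that fix: single-quote the script and hand the directory over as a positional argument (keeping the asker's ls-based pipeline, with its known caveats):
find . -type d -name "bak" \
-exec bash -c 'cd "$1" && ls -t *.pdf | tail -n 2 | xargs -r rm' bash {} \;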
But a better fix still is to use -execdir so that you don't have to pass the directory name to the subshell at all:
find . -type d -name "bak" \
-execdir bash -c 'ls -t *.pdf | tail -2 | xargs -r rm' \;
This could still fail in funny ways because you are parsing the output of ls, which is inherently unreliable.
You are explicitly asking for find -exec. Usually I would just chain find -exec with find -delete, but in your case only the oldest two files per directory should be deleted, so the only method is running a subshell. Socowi already gave a nice solution; however, if your file names do not contain tabs or newlines, another workaround is a find | while read loop.
This sorts the files by mtime:
find . -type d -iname 'bak' | \
while read -r dir;
do
find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" | \
sort | head -n2 | \
cut -f2- | \
while read -r file;
do
rm "$file";
done;
done;
The above find | while read loop as a "one-liner":
find . -type d -iname 'bak' | while read -r dir; do find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf "%T+\t%p\n" | sort | head -n2 | cut -f2- | while read -r file; do rm "$file"; done; done;
A find | while read loop can also handle NUL-terminated file names. However, head cannot handle those, so I improved on the other answers and made it work with nontrivial file names (GNU + bash only).
Replace realpath with rm to actually delete the files:
#!/bin/bash
rm_old () {
find "$1" -maxdepth 1 -type f -iname \*.$2 -printf "%T+\t%p\0" | sort -z | sed -zn 's,\S*\t\(.*\),\1,p' | grep -zim$3 \.$2$ | xargs -0r realpath
}
export -f rm_old
find -type d -iname bak -execdir bash -c 'rm_old "{}" pdf 2' \;
However, bash -c might still be exploitable; to make it more secure, let stat %N do the quoting:
#!/bin/bash
rm_old () {
local dir="$1"
# we don't like eval
# eval "dir=$dir"
# this works like eval
dir="${dir#?}"
dir="${dir%?}"
dir="${dir//"'$'\t''"/$'\011'}"
dir="${dir//"'$'\n''"/$'\012'}"
dir="${dir//$'\047'\\$'\047'$'\047'/$'\047'}"
find "$dir" -maxdepth 1 -type f -iname \*.$2 -printf '%T+\t%p\0' | sort -z | sed -zn 's,\S*\t\(.*\),\1,p' | grep -zim$3 \.$2$ | xargs -0r realpath
}
find -type d -iname bak -exec stat -c'%N' {} + | while read -r dir; do rm_old "$dir" pdf 2; done
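Side note: sufficiently new GNU coreutils (8.25 or later) also give head and cut a -z option, so on a current system the oldest-two selection can stay NUL-safe without the grep -m workaround:
find "$dir" -maxdepth 1 -type f -iname '*.pdf' -printf '%T+\t%p\0' | sort -z | head -zn2 | cut -zf2- | xargs -0r rm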

How to run a command (1000 times) that requires two different types of input files

I have calculated directed modularity by means of DirectedLouvain (https://github.com/nicolasdugue/DirectedLouvain). I am now trying to test the significance of the values obtained by means of a null model. To do so, I need to run one of the DirectedLouvain commands 1000 times over 1000 different input files.
Following @KamilCuk's recommendations, I have used this code, which takes the 1000 *.txt input files and generates 1000 *.bin files and 1000 *.weights files. It worked perfectly:
find -type f -name '*.txt' |
while IFS= read -r file; do
file_no_extension=${file##*/};
file_no_extension=${file_no_extension%%.*}
./convert -i "$file" -o "$file_no_extension".bin -w "$file_no_extension".weights
done
Now I am trying to use another command that takes these two types of files (*.bin and *.weights) and generates *.tree files. I have tried this with no success:
find ./ -type f \( -iname \*.bin -o -iname \*.weights \) |
while IFS= read -r file; do
file_no_extension=${file##*/};
file_no_extension=${file_no_extension%%.*}
./community "$file.bin" -l -1 -w "$file.weights" > "$file_no_extension".tree
done
Any suggestion?
Find all files with that extension.
For each file:
Extract the filename without extension.
Run the command.
So:
find -type f -name '*.ext' |
while IFS= read -r file; do
file_no_extension=${file##*/};
file_no_extension=${file_no_extension%%.*}
./convert -i "$file" -o "$file_no_extension".bin -w "$file_no_extension".weights
done
# with find:
find -type f -name '*.ext' -exec sh -c 'f=$(basename "$1" .ext); ./convert -i "$1" -o "$f".bin -w "$f".weights' _ {} \;
# with xargs:
find -type f -name '*.ext' |
xargs -d '\n' -n1 sh -c 'f=$(basename "$1" .ext); ./convert -i "$1" -o "$f".bin -w "$f".weights' _
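The same pattern applies to the second command in the question: match only the .bin files and derive both input names from the stem (a sketch, assuming the ./community flags shown by the asker):
find . -type f -name '*.bin' |
while IFS= read -r file; do
base=${file%.bin}
name=${base##*/}
./community "$base.bin" -l -1 -w "$base.weights" > "$name.tree"
done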
You could use GNU Parallel to run your jobs in parallel across all your CPU cores like this:
parallel convert -i {} -o {.}.bin -w {.}.weights ::: input*.txt
Initially, you may like to do a "dry run" that shows what it would do without actually doing anything:
parallel --dry-run convert -i {} -o {.}.bin -w {.}.weights ::: input*.txt
If you get errors about the argument list being too long because you have too many files, you can feed their names in on stdin like this instead:
find . -name "input*txt" -print0 | parallel -0 convert -i {} -o {.}.bin -w {.}.weights
You can use find to list your files and execute a command on all of them:
find -name '*.ext' -exec ./runThisExecutable '{}' \;
If you have a.ext and b.ext in a directory, this will run ./runThisExecutable a.ext and ./runThisExecutable b.ext.
To test whether it identifies the right files, you can run it without -exec so it only prints the filenames:
find -name '*.ext'
./a.ext
./b.ext

How can I use sed to change my target dir in this shell command line?

I use this command line to find all the SVGs (thousands) in a directory and convert them to PNGs using Inkscape. It works great. Here is my issue: it outputs the PNGs in the same directory, and I would like to change the target directory.
for i in `find /home/wyatt/test/svgsDIR -name "*.svg"`; do inkscape $i --export-background-opacity=0 --export-png=`echo $i | sed -e 's/svg$/png/'` -w 700 ; done
It appears $i is the file_path + file_name, and sed does a search/replace on the file extension. How do I search/replace my file_path? Or is there a better way to define a different target path within this command line?
Any help is much appreciated.
Would you please try:
destdir="DIR" # replace with your desired directory name
mkdir -p "$destdir"
find /home/wyatt/test/svgsDIR -name "*.svg" -print0 | while IFS= read -r -d "" i; do
destfile="$destdir/$(basename -s .svg "$i").png"
inkscape "$i" --export-background-opacity=0 --export-png="$destfile" -w 700
done
or
destdir="DIR"
mkdir -p "$destdir"
for i in /home/wyatt/test/svgsDIR/*.svg; do
destfile="$destdir/$(basename -s .svg "$i").png"
inkscape "$i" --export-background-opacity=0 --export-png="$destfile" -w 700
done
This may be off-topic, but it is not recommended to use a for loop that relies on word-splitting, especially when dealing with filenames. Filenames and pathnames may contain whitespace, newlines, tabs or other special characters.
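A quick demonstration of the word-splitting problem (the file name is made up):
touch "my icon.svg"
for i in `find . -name "*.svg"`; do echo "[$i]"; done
# prints [./my] and [icon.svg] instead of [./my icon.svg]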
Or with a one-liner (split for readability):
find /home/wyatt/test/svgsDIR -name "*.svg" |
xargs -I{} sh -c 'inkscape "{}" --export-background-opacity=0 --export-png='$destdir'/$(basename {} .svg).png -w 700'
This might work with find's built-in -exec:
find /home/wyatt/test/svgsDIR -name "*.svg" -exec sh -c 'inkscape "{}" --export-background-opacity=0 --export-png='$destdir'/$(basename {} .svg).png -w 700' \;
Or by passing the target dir as an argument, to simplify quoting:
find /home/wyatt/test/svgsDIR -name "*.svg" -exec sh -c 'inkscape "$1" --export-background-opacity=0 --export-png="$2/$(basename $1 .svg).png" -w 700' '{}' "$targetdir" \;

renaming series of files using xargs

I would like to rename several files picked by find in some directory, then use xargs and mv to rename the files, with parameter expansion. However, it did not work...
example:
mkdir test
touch abc.txt
touch def.txt
find . -type f -print0 | \
xargs -I {} -n 1 -0 mv {} "${{}/.txt/.tx}"
Result:
bad substitution
[1] 134 broken pipe find . -type f -print0
Working Solution:
for i in ./*.txt ; do mv "$i" "${i/.txt/.tx}" ; done
Although I finally got a way to fix the problem, I still want to know why the first find + xargs way doesn't work, since I don't think the second way is very general for similar tasks.
Thanks!
Remember that shell variable substitution happens before your command runs. So when you run:
find . -type f -print0 | \
xargs -I {} -n 1 -0 mv {} "${{}/.txt/.tx}"
The shell tries to expand that ${...} construct before xargs even runs, and since the contents of that expression aren't a valid shell variable reference, you get the "bad substitution" error. A better solution would be to use the rename command:
find . -type f -print0 | \
xargs -I {} -0 rename .txt .tx {}
And since rename can operate on multiple files, you can simplify
that to:
find . -type f -print0 | \
xargs -0 rename .txt .tx
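If rename isn't available, the deferred-expansion trick from the earlier SVG answers works here as well; a sketch:
find . -type f -name '*.txt' -print0 |
xargs -0 -I {} bash -c 'mv "$1" "${1%.txt}.tx"' - {}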

Bash script to change file names recursively

I have a script for changing the file names of mht files, but it does not traverse directories and subdirectories. I asked a question on a local forum and I got an answer saying this is a solution:
find . -type f -name "*.mhtml" -o -type f -name "*.mht" | xargs -I item sh -c '{ echo item; echo item | sed "s/[:?|]//g"; }' | xargs -n2 mv
But it generates an error. Some experimenting shows that sh -c breaks file names containing spaces, and that this generates the error. How can I fix this?
#!/bin/bash
# renames.sh
# basic file renamer
for i in . *.mht
do
j=`echo $i | sed 's/|/ /g' | sed 's/:/ /g' | sed 's/?//g' | sed 's/"//g'`
mv "$i" "$j"
done
#! /bin/bash
find . -type f \( -name "*.mhtml" -o -name "*.mht" \) -print0 |
while IFS= read -r -d '' source; do
target="${source//[:?|]/}"
[ "X$source" != "X$target" ] &&
mv -nv "$source" "$target"
done
Update: the rename is now done according to the original question, and support for .mht was added.
Use rename. With rename you can specify a renaming pattern:
find . -type f \( -name "*.mhtml" -o -name "*.mht" \) -print0 | xargs -0 -I'{}' rename 's/[:?|]//g' "{}"
This way you can properly handle names with spaces. xargs replaces {} with each file name provided by the find command. Also note the use of -print0 and -0: they use \0 as the separator, which avoids problems with filenames containing \n (newline).
The -o was not working the way it was intended to; you must use parentheses to group the conditions.
You may also consider using -iname instead of -name if you deal with files ending in ".mHtml".
Note that this is the Perl rename, which takes an s/// expression; it is not interchangeable with the util-linux rename .txt .tx form used in an earlier answer.
