Adding extra argument to xargs - shell

I'm trying to kick off multiple processes to work through some test suites. In my bash script I have the following
printf "%s\0" "${SUITE_ARRAY[#]}" | xargs -P 2 -0 bash -c 'run_test_suite "$#" ${EXTRA_ARG}'
Below is the script, cut down to its basics.
SUITE_ARRAY will be a list of one or more suites, {Suite 1, Suite 2, ..., Suite n}
EXTRA_ARG will be something like a specific name used to store values in another script
#!/bin/bash
run_test_suite(){
    suite=$1
    someArg=$2
    someSaveDir=someArg"/"suite
    # some preprocess work happens here, but isn't relevant to running
    runSomeScript.sh suite someSaveDir
}
export -f run_test_suite
SUITES=$1
EXTRA_ARG=$2
IFS=','
SUITECOUNT=0
for csuite in ${SUITES}; do
    SUITE_ARRAY[$SUITECOUNT]=$csuite
    SUITECOUNT=$(($SUITECOUNT+1))
done
unset IFS
printf "%s\0" "${SUITE_ARRAY[#]}" | xargs -P 2 -0 bash -c 'run_test_suite "$#" ${EXTRA_ARG}'
The issue I'm having is how to get the ${EXTRA_ARG} passed into xargs. From how I've come to understand it, xargs will take whatever is piped into it, so the way I have it doesn't seem correct.
Any suggestions on how to correctly pass the values? Thanks in advance

If you want EXTRA_ARG to be available to the subshell, you need to export it. You can do that either explicitly, with the export keyword, or by putting the var=value assignment in the same simple command as xargs itself:
#!/bin/bash
run_test_suite(){
    suite=$1
    someArg=$2
    someSaveDir=${someArg}/${suite}
    # some preprocess work happens here, but isn't relevant to running
    runSomeScript.sh "$suite" "$someSaveDir"
}
export -f run_test_suite
# assuming that the "array" in $1 is comma-separated:
IFS=, read -r -a suite_array <<<"$1"
# see the EXTRA_ARG="$2" just before xargs on the same line; this exports the variable
printf "%s\0" "${suite_array[#]}" | \
EXTRA_ARG="$2" xargs -P 2 -0 bash -c 'run_test_suite "$#" "${EXTRA_ARG}"' _
The _ prevents the first argument passed from xargs to bash from being consumed as $0, and thus left out of "$@".
Note also that I changed "${suite_array[@]}" to be assigned by splitting $1 on commas. This or something like it (you could use IFS=$'\n' to split on newlines instead, for example) is necessary, as $1 cannot contain a literal array; every shell command-line argument is only a single string.
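For completeness, here is a minimal sketch of the explicit-export variant mentioned above; the variable names follow the script, and only the export line differs:
#!/bin/bash
# run_test_suite defined and export -f'd exactly as above
export EXTRA_ARG="$2"   # explicit export: visible to the bash spawned by xargs
IFS=, read -r -a suite_array <<<"$1"
printf "%s\0" "${suite_array[@]}" |
    xargs -P 2 -0 bash -c 'run_test_suite "$@" "${EXTRA_ARG}"' _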

This is something of a guess:
#!/bin/bash
run_test_suite(){
    suite="$1"
    someArg="$2"
    someSaveDir="${someArg}/${suite}"
    # some preprocess work happens here, but isn't relevant to running
    runSomeScript.sh "${suite}" "${someSaveDir}"
}
export -f run_test_suite
SUITE_ARRAY="$1"
EXTRA_ARG="$2"
printf "%s\0" "${SUITE_ARRAY[#]}" |
xargs -n 1 -I '{}' -P 2 -0 bash -c 'run_test_suite {} '"${EXTRA_ARG}"

Using GNU Parallel it looks like this:
#!/bin/bash
run_test_suite(){
    suite="$1"
    someArg="$2"
    someSaveDir="$someArg"/"$suite"
    # some preprocess work happens here, but isn't relevant to running
    echo runSomeScript.sh "$suite" "$someSaveDir"
}
export -f run_test_suite
EXTRA_ARG="$2"
parallel -d, -q run_test_suite {} "$EXTRA_ARG" ::: "$1"
Called as:
mytester 'suite 1,suite 2,suite "three"' 'extra "quoted" args here'
If you have the suites in an array:
parallel -q run_test_suite {} "$EXTRA_ARG" ::: "${SUITE_ARRAY[#]}"
Added bonus: Any output from the jobs will not be mixed, so you will not have to deal with http://mywiki.wooledge.org/BashPitfalls#Using_output_from_xargs_-P

Related

How to write a Bash script to edit many text files using the same commands? [duplicate]

This question already has answers here: Run script on multiple files (3 answers). Closed 3 years ago.
I'm very new to bash. I have ten text files that I want to edit with the same line of code.
#!/bin/bash
sed -i -e 's/.\{6\}/&\n/g' -e 's/edit/edit2/g' | tr -d "\n" | sed 's/edit2/edit/g'| grep -o "here.*there" | sed -r '/^.{,100}$/d'
< files 1-10
I know I could use sed -f sed.sh <file1 >file1 but that only works with sed commands and it only works one file at a time?
Do I have to run a loop?
There are some great existing answers on Unix Stack Exchange that help with your problem. Specifically, this post uses a loop to recurse through all the files in a particular directory, as follows:
( shopt -s globstar dotglob;
  for file in **; do
      if [[ -f $file ]] && [[ -w $file ]]; then
          sed -i -- 's/foo/bar/g' "$file"
      fi
  done
)
Note the line shopt -s globstar dotglob;, which lets us use the ** globbing pattern in the for loop. We also enclose the code in parentheses, which runs it in a subshell and keeps the shopt -s globstar dotglob setting from leaking into the rest of the session.
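As a quick illustration of why the subshell matters (a sketch, not from the linked answer):
( shopt -s globstar; echo **/*.txt )   # ** matches recursively inside the subshell
echo **/*.txt                          # outside it, ** behaves like an ordinary *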
If you would like to apply this example to your files, you can just place them in the current directory, and the code would look something like this:
( shopt -s globstar dotglob;
  for file in **; do
      if [[ -f $file ]] && [[ -w $file ]]; then
          sed -e 's/.\{6\}/&\n/g' -e 's/edit/edit2/g' "$file" | tr -d "\n" | sed 's/edit2/edit/g' | grep -o "here.*there" | sed -r '/^.{,100}$/d' > "$file.tmp" && mv "$file.tmp" "$file"
      fi
  done
)
Note that we have placed a "$file" variable beside each of the seds that you used in your code, this replaces the name of the file for each command.
The linked post also gives an example that lets you pick which files to run on, rather than all the files in a directory, which you can likewise repurpose for your code:
( shopt -s globstar dotglob
  sed -i -- 's/foo/bar/g' **baz*
  sed -i -- 's/foo/bar/g' **.baz
)
To answer your question about looping over each line, you would need to put a while read loop inside the for loop, like so:
while read -r line; do
    sed -e 's/.\{6\}/&\n/g' -e 's/edit/edit2/g' <<< "$line" | tr -d "\n" | sed 's/edit2/edit/g' | grep -o "here.*there" | sed -r '/^.{,100}$/d'
done < "$file"
Although the for loop can be useful for dealing with files in recursive directories, I would recommend against also using another loop to grab lines, since it muddies your code, and it’s possible there is a better way to do it without parsing line by line.
The linked question is a fairly complete guide to many of the cases you may come across, and is also worth a read if you want to learn more.
Hope that helps!
You could use a for loop.
You could use the tool parallel.
Example
Create a set of test files using a for-loop
mkdir -p /tmp/so58333536
cd /tmp/so58333536
for i in 1.txt 2.txt 3.txt 4.txt 5.txt; do echo "The answer is 41" > "$i"; done
cat /tmp/so58333536/*
Now correct the mistake in all files using parallel [1].
mkdir /tmp/so58333536.new
ls /tmp/so58333536/* |parallel "sed 's/41/42/' {} > /tmp/so58333536.new/{/}"
cat /tmp/so58333536.new/*
{} refers to the current file
{/} refers to the name of the current file (path removed)
This reads: list all files in so58333536, apply the sed command to each one, and write the output to so58333536.new.
[1] Another option is to use sed -i for in-place editing.
Be very careful with this! Mistakes can cause serious damage.
# !! Do not use -i option regularly !!
ls /tmp/so58333536/* |parallel "sed -i 's/41/42/'"

Pipe output from command to another command

A bash function, prepend_line, takes two parameters: a string and a fully-qualified path to a file. It's used for logging, inserting the current date/time and the string at the top of the log file.
Standalone use works fine: prepend_line "test string" "$log_file"
How can I get the output from a command, e.g. mv -fv "$fileOne" "$fileTwo" to be used as the first parameter for prepend_line?
I've tried various combinations of piping to xargs, but I don't understand how it works and I'm not convinced it's the best way in any case.
If you really have to:
export -f prepend_line
mv -fv "$fileOne" "$fileTwo" |
xargs -0 bash -c 'prepend_line "$1" "$log_file"' --
The -0 makes xargs treat the input as zero-delimited. As there should be no NUL bytes in mv -v output (filenames can't contain a NUL byte), you will get only a single element. This element/line will be passed as the first argument to the bash subshell.
Tested with:
prepend_line() {
    printf "%s\n" "$@" | xxd -p
}
fileOne=$'1\x01\x02\x031234566\n\t\e'
fileTwo=$'2\x01\x02\x031234566\n\t\e \n\n\n'
export -f prepend_line
printf "%s\n" "$fileOne -> $fileTwo" |
xargs -0 bash -c 'prepend_line "$1" "$log_file"' --
The script will output (output from the xxd -p inside prepend_line):
31010203313233343536360a091b202d3e2032010203313233343536360a
091b200a0a0a0a0a0a
Same hex output with some extra newlines and comments:
# first filename $'1\x01\x02\x031234566\n\t\e'
31010203313233343536360a091b
# the string: space + '->' + space
202d3e20
# second filename $'2\x01\x02\x031234566\n\t\e \n\n\n'
32010203313233343536360a091b200a0a0a0a0a0a
If you really have to parse some strange inputs, you can convert your string to hex with xxd -p. Then, later, convert it back to its machine representation with xxd -r -p and stream it right into the output:
prepend_line() {
    # some work
    # append the output of the "$1" command to the log_file
    <<<"$1" xxd -p -r >> "$2"
    # some other work
}
prepend_line "$(mv -fv "$fileOne" "$fileTwo" | xxd -p)" "$log_file"
But I doubt you will ever need to handle such cases. Who names files with $'\x01' in them, or suffixes them with empty newlines, like 'great_script.sh'$'\n\n'?
Anyway, I would rather see the interface use a stream:
mv -fv "$fileOne" "$fileTwo" | prepend_line "$log_file"
It needs set -o pipefail to propagate errors correctly. Inside prepend_line I would just redirect the output to the log file or to some temporary file, sparing the need for parsing and avoiding the corner cases.
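A rough sketch of what that stream-based interface could look like; the mktemp and timestamp details here are my assumptions, not part of the original function:
set -o pipefail

prepend_line() {
    # read the message from stdin and write "<date/time> <message>"
    # above the existing contents of the log file given as $1
    local log_file=$1 tmp
    tmp=$(mktemp) || return
    { printf '%s %s\n' "$(date '+%F %T')" "$(cat)"; cat -- "$log_file" 2>/dev/null; } > "$tmp" &&
        mv -- "$tmp" "$log_file"
}

mv -fv "$fileOne" "$fileTwo" | prepend_line "$log_file"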

Modify a path stored in a bash script variable

I have a variable f in a bash script
f=/path/to/a/file.jpg
I'm using the variable as an input argument to a program that requires an input and an output path.
For example the program's usage would look like this
./myprogram -i inputFilePath -o outputFilePath
Using my variable, I'm trying to keep the same basename, change the extension, and put the output file into a subdirectory. For example:
./myprogram -i /path/to/a/file.jpg -o /path/to/a/new/file.tiff
I'm trying to do that by doing this
./myprogram -i "$f" -o "${f%.jpg}.tiff"
Of course this keeps the basename and changes the extension, but it doesn't put the file into the new subdirectory.
How can I modify f to change /path/to/a/file.jpg into /path/to/a/new/file.tiff?
Actually you can do this in several ways:
Using sed, as pointed out by @anubhava
Using dirname and basename:
./myprogram -i "$f" -o "$(dirname -- "$f")/new/$(basename -- "$f" .jpg).tiff"
Using only Bash:
./myprogram -i "$f" -o "${f%/*}/new/$(b=${f##*/}; echo -n ${b%.jpg}.tiff)"
Note that unlike the second solution (using dirname/basename), which is more robust, the third solution (in pure Bash) won't work if "$f" does not contain any slash:
$ dirname "file.jpg"
.
$ f="file.jpg"; echo "${f%/*}"
file.jpg
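A small defensive tweak for the pure-Bash version (the fallback to . when there is no slash is my addition, not from the answer above):
dir=${f%/*}; [[ $dir == "$f" ]] && dir=.   # no slash in $f: fall back to the current directory
base=${f##*/}
./myprogram -i "$f" -o "${dir}/new/${base%.jpg}.tiff"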
You may use this sed:
s='/path/to/a/file.jpg'
sed -E 's~(.*/)([^.]+)\.jpg$~\1new/\2.tiff~' <<< "$s"
/path/to/a/new/file.tiff
If you're on a system that supports the basename and dirname commands, you could use a simple wrapper function, e.g.:
$ type newSubDir
newSubDir is a function
newSubDir ()
{
oldPath=$(dirname "${1}");
fileName=$(basename "${1}");
newPath="${oldPath}/${2}/${fileName}";
echo "${newPath}"
}
$ newSubDir /path/to/a/file.jpg new
/path/to/a/new/file.jpg
If your system doesn't have those, you can accomplish the same thing using string manipulation:
$ file="/path/to/a/file.jpg"
$ echo "${file%/*}"
/path/to/a
$ echo "${file##*/}"
file.jpg
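The same wrapper can be written with only parameter expansion; a sketch, assuming the path contains at least one slash:
newSubDir() {
    printf '%s\n' "${1%/*}/${2}/${1##*/}"
}
newSubDir /path/to/a/file.jpg new   # prints /path/to/a/new/file.jpg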

An issue with quotes in a bash call of a function

I wrote a small bash script that looks roughly like this:
VAR1="test"
VAR2="test2"
letsDoSomeStuff() {
    echo $1
    echo $2
    echo $3
}
fswatch -0 . | xargs -0 -n 1 -I {} bash -c 'letsDoSomeStuff {} "$VAR1 $VAR2"'
What I want to do is look for changes in a folder (via fswatch) and then do some stuff with the changed files in a function in the bash script. Unfortunately, I need to pass on $VAR1 and $VAR2 as parameters as they don't "survive" the xargs call.
The bash script works fine. However, $VAR1 and $VAR2 are not properly passed on to the function. When I start the script, it outputs for every changed file:
filename
empty line
empty line
Can anybody here help me out with this call? I guess I'm messing up the single and double quotes but can't find the right way.
Thanks in advance
Norbert
You need to pass in parameters to your function call:
fswatch -0 . |
    xargs -0 -n 1 -I {} bash -c 'letsDoSomeStuff "$1" "$2" "$3"' - {} "$VAR1" "$VAR2"
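Put together with the script from the question, a sketch of the whole thing; note the export -f line, which I've added because the bash started by xargs otherwise can't see the function:
#!/bin/bash
VAR1="test"
VAR2="test2"

letsDoSomeStuff() {
    echo "$1"
    echo "$2"
    echo "$3"
}
export -f letsDoSomeStuff   # make the function visible to the subshell spawned by xargs

fswatch -0 . |
    xargs -0 -n 1 -I {} bash -c 'letsDoSomeStuff "$1" "$2" "$3"' - {} "$VAR1" "$VAR2"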

Inline comments for Bash?

I'd like to be able to comment out a single flag in a one-line command. Bash only seems to have from # till end-of-line comments. I'm looking at tricks like:
ls -l $([ ] && -F is turned off) -a /etc
It's ugly, but better than nothing. Is there a better way?
The following seems to work, but I'm not sure whether it is portable:
ls -l `# -F is turned off` -a /etc
My preferred approach is the one described in:
Commenting in a Bash script
This will have some overhead, but technically it does answer your question
echo abc `#put your comment here` \
def `#another chance for a comment` \
xyz etc
And for pipelines specifically, there is a cleaner solution with no overhead
echo abc | # normal comment OK here
tr a-z A-Z | # another normal comment OK here
sort | # the pipelines are automatically continued
uniq # final comment
How to put a line comment for a multi-line command
I find it easiest (and most readable) to just copy the line and comment out the original version:
#Old version of ls:
#ls -l $([ ] && -F is turned off) -a /etc
ls -l -a /etc
$(: ...) is a little less ugly, but still not good.
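For example, on the original command line that would look something like this (just a sketch):
ls -l $(: '-F is turned off') -a /etc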
Here's my solution for inline comments in between multiple piped commands.
Example uncommented code:
#!/bin/sh
cat input.txt \
| grep something \
| sort -r
Solution for a pipe comment (using a helper function):
#!/bin/sh
pipe_comment() {
    cat -
}
cat input.txt \
| pipe_comment "filter down to lines that contain the word: something" \
| grep something \
| pipe_comment "reverse sort what is left" \
| sort -r
Or if you prefer, here's the same solution without the helper function, but it's a little messier:
#!/bin/sh
cat input.txt \
| cat - `: filter down to lines that contain the word: something` \
| grep something \
| cat - `: reverse sort what is left` \
| sort -r
Most commands allow args to come in any order. Just move the commented flags to the end of the line:
ls -l -a /etc # -F is turned off
Then to turn it back on, just uncomment and remove the text:
ls -l -a /etc -F
How about storing it in a variable?
#extraargs=-F
ls -l $extraargs -a /etc
If you know a variable is empty, you could use it as a comment. Of course if it is not empty it will mess up your command.
ls -l ${1# -F is turned off} -a /etc
§ 10.2. Parameter Substitution
For disabling part of a command like a && b, I simply created an empty script x which is on the PATH, so I can do things like:
mvn install && runProject
when I need to build, and
x mvn install && runProject
when not (using Ctrl + A and Ctrl + E to move to the beginning and end).
As noted in the comments, another way to do this is to use the Bash built-in : instead of x:
$ : Hello world, how are you? && echo "Fine."
Fine.
It seems that the $(...) trick doesn't survive in ps -ef output.
My scenario is that I want a dummy parameter that can be used to identify the very process. Mostly I use this method, but it is not workable everywhere. For example, python program.py would become
mkdir -p MyProgramTag;python MyProgramTag/../program.py
MyProgramTag would then be the tag for identifying the started process.
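Finding the tagged process later might then look like this; the bracket trick just keeps grep from matching itself:
ps -ef | grep '[M]yProgramTag'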
