append variables from a while loop into a command line option - bash

I have a while loop in which A takes the values 1 to 3:
mysql -e "select A from xxx;" | while read A;
do
whatever
done
The MySQL command returns only numbers, one per line, so the while loop here sees A=1, A=2, A=3.
I would like to append each integer from the loop (here A=1 to 3) to a command line that runs outside the while loop. Is there a bash way to do this? For example:
parallel --joblog test.log --jobs 2 -k sh ::: 1.sh 2.sh 3.sh

You probably want something like this:
mysql -e "select A from xxx;" | while read A; do
whatever > standard_out 2>standard_error
echo "$A.sh"
done | xargs parallel --joblog test.log --jobs 2 -k sh :::
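To see what command line this construct assembles without needing mysql or parallel, here is a minimal sketch. The fake_query function is a hypothetical stand-in for the mysql call, and echo is put in front of the real command so that it is printed rather than executed.

```shell
# Hypothetical stand-in for the mysql query: prints one number per line.
fake_query() { printf '%s\n' 1 2 3; }

# echo is placed in front of the real command so that the assembled
# command line is printed instead of executed.
cmdline=$(fake_query | while read -r A; do
  printf '%s.sh\n' "$A"
done | xargs echo parallel --joblog test.log --jobs 2 -k sh :::)

printf '%s\n' "$cmdline"
```

The names emitted inside the loop end up appended after the ::: separator, which is the shape asked for in the question.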

Thanks for enlightening me. xargs works perfectly here:
Assuming we have A.csv (mimicking the output of the mysql command):
1
2
3
4
We can simply do:
cat A.csv | while read A; do
echo "echo $A" > $A.sh
echo "$A.sh"
done | xargs -I {} parallel --joblog test.log --jobs 2 -k sh ::: {}
The above will print the following output as expected
1
2
3
4
Here -I {} declares {} as the placeholder, and xargs replaces each {} in the command with a line of input:
https://www.cyberciti.biz/faq/linux-unix-bsd-xargs-construct-argument-lists-utility/
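One caveat that may be worth checking: with -I, xargs runs the command once per input line instead of collecting all lines into one invocation. The sketch below uses echo run as a hypothetical stand-in for the parallel command to make the difference visible.

```shell
# Without -I: all input lines are appended to a single invocation.
batched=$(printf '%s\n' 1.sh 2.sh 3.sh 4.sh | xargs echo run)

# With -I {}: one invocation per input line.
per_line=$(printf '%s\n' 1.sh 2.sh 3.sh 4.sh | xargs -I {} echo run {})

printf 'without -I:\n%s\n' "$batched"
printf 'with -I {}:\n%s\n' "$per_line"
```

If that reading is right, the -I {} form starts parallel four times with one script each, so --jobs 2 has nothing to run side by side. The form without -I hands all four scripts to a single parallel call.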

Related

Is there a way to only require one echo in this scenario?

I have the following line of code:
for h in "${Hosts[@]}" ; do echo "$MyLog" | grep -m 1 -B 3 -A 1 $h >> /LogOutput ; done
My hosts variable is a large array of hosts
Is there a better way to do this that doesn't require me to echo on each loop? Like grep on a variable instead?
No echo, no loop
#!/bin/bash
hosts=(host1 host2 host3)
MyLog="
asf host
sdflkj
sadkjf
sdlkjds
lkasf
sfal
asf host2
sdflkj
sadkjf
"
re="${hosts[@]}"
grep -E -m 1 -B 3 -A 1 "${re// /|}" <<< "$MyLog"
Variant with one echo
echo "$MyLog" | grep -E -m 1 -B 3 -A 1 "${re// /|}"
Usage
$ ./test
sdlkjds
lkasf
sfal
asf host2
sdflkj
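One behavioural difference may matter here. With the hosts joined into a single alternation, -m 1 stops after the first line that matches any host, while the original loop reports the first match for each host. A small sketch with hypothetical log data and host names:

```shell
# Hypothetical log contents and host names, for illustration only.
MyLog='line a
asf host1
line b
asf host2
line c'

# Single alternation: -m 1 stops after the first line matching any host.
first_overall=$(printf '%s\n' "$MyLog" | grep -E -m 1 'host1|host2')

# One grep per host: each host contributes its own first match.
per_host=$(for h in host1 host2; do
  printf '%s\n' "$MyLog" | grep -m 1 -- "$h"
done)

printf 'alternation:\n%s\n' "$first_overall"
printf 'per host:\n%s\n' "$per_host"
```

So the alternation form is only a drop-in replacement when a single match across all hosts is what is wanted.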
One echo, no loops, and all grepping done in parallel, with GNU Parallel:
echo "$MyLog" | parallel -k --tee --pipe 'grep -m 1 -B 3 -A 1 {}' ::: "${hosts[@]}"
The -k keeps the output in order.
The --tee and the --pipe ensure that the stdin is duplicated to all processes.
The processes that are run in parallel are enclosed in single quotes.
printf your string into multiple lines that you can then grep? Something like:
printf '%s\n' "${Hosts[@]}" | grep -m 1 -B 3 -A 1 "$h" >> /LogOutput
This assumes you're on a GNU system; otherwise see info grep.
From grep --help
grep --help | head -n1
Output
Usage: grep [OPTION]... PATTERN [FILE]...
So according to that, if $MyLog holds the path to a log file (rather than the log text itself), you can pass it to grep directly:
for h in "${Hosts[@]}" ; do grep -m 1 -B 3 -A 1 "$h" "$MyLog" >> /LogOutput ; done

how to group all arguments as position argument for `xargs`

I have a script which takes in only one positional parameter which is a list of values, and I'm trying to get the parameter from stdin with xargs.
However, by default xargs passes each value to my script as a separate positional parameter. For example,
echo 1 2 3 | xargs myScript effectively runs myScript 1 2 3, whereas what I'm looking for is myScript "1 2 3". What is the best way to achieve this?
Change the delimiter.
$ echo 1 2 3 | xargs -d '\n' printf '%s\n'
1 2 3
Not all xargs implementations have -d though.
I'm not sure there is an actual use case for this, but you can also resort to spawning another shell instance if you have to, like:
$ echo -e '1 2\n3' | xargs sh -c 'printf '\''%s\n'\'' "$*"' sh
1 2 3
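Applied to a script rather than printf, the wrapper-shell idea could look like the sketch below. The script written to a temporary directory is a hypothetical stand-in for myScript, and MY_SCRIPT is an illustrative variable name, not something from the question.

```shell
# Hypothetical stand-in for myScript: reports how many arguments it received.
tmpdir=$(mktemp -d)
MY_SCRIPT="$tmpdir/myScript"
export MY_SCRIPT
cat > "$MY_SCRIPT" <<'EOF'
echo "argc=$# first=$1"
EOF

# Default xargs behaviour: three separate arguments.
separate=$(echo 1 2 3 | xargs sh "$MY_SCRIPT")

# Wrapper shell: "$*" joins the arguments into one string.
grouped=$(echo 1 2 3 | xargs sh -c 'sh "$MY_SCRIPT" "$*"' sh)

rm -rf "$tmpdir"
printf '%s\n%s\n' "$separate" "$grouped"
```

The trailing sh fills $0 of the wrapper shell, so the values read by xargs start at $1 and "$*" collects all of them.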
If the input can be altered, you can do this. But not sure if this is what you wanted.
echo \"1 2 3\"|xargs ./myScript
Here is the example.
$ cat myScript
#!/bin/bash
echo $1; shift
echo $1; shift
echo $1;
$ echo \"1 2 3\"|xargs ./myScript
1 2 3
$ echo 1 2 3|xargs ./myScript
1
2
3

How to suspend the main command when piping the output to another delay command

I have two custom scripts, each with its own task: one outputs some URLs (represented by the cat command below) and the other receives a URL and processes it via network requests (represented by the sleep command below).
Here is the prototype:
Case 1:
cat urls.txt | xargs -I{} sleep 1 && echo "END: {}"
The output is END: {} and the sleep works.
Case 2:
cat urls.txt | xargs -I{} echo "BEGIN: {}" && sleep 1 && echo "END: {}"
The output is
BEGIN: https://www.example.com/1
BEGIN: https://www.example.com/2
BEGIN: https://www.example.com/3
END: {}
but it seems to sleep for only 1 second.
Q1: I'm a little confused: why do I get these outputs?
Q2: Is there a way to run the full delayed command chain under xargs for every line that cat outputs?
You can put the commands into a separate script:
worker.sh
#!/bin/bash
echo "BEGIN: $*" && sleep 1 && echo "END: $*"
set execute permission:
chmod +x worker.sh
and call it with xargs:
cat urls.txt | xargs -I{} ./worker.sh {}
output
BEGIN: https://www.example.com/1
END: https://www.example.com/1
BEGIN: https://www.example.com/2
END: https://www.example.com/2
BEGIN: https://www.example.com/3
END: https://www.example.com/3
Between BEGIN and END the script sleeps for one second.
Thanks to shellter's and UtLox's reminders, I found that xargs is the key.
Here is my finding: the shell interpreter (sh/zsh) splits off sleep 1 and echo "END: {}" as separate commands that run after the pipeline finishes, so xargs never receives the two && chained commands as part of its utility command and never replaces the {} in the END expression. This can be verified with xargs -t:
cat urls.txt | xargs -I{} -t echo "BEGIN: {}" && sleep 1 && echo "END: {}"
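The parsing can also be checked without -t by writing the grouping out explicitly. In this sketch, url1 and url2 are hypothetical input lines standing in for urls.txt, and the sleep is left out to keep it short.

```shell
# The command as written in Case 2 (minus the sleep).
as_written=$(printf '%s\n' url1 url2 | xargs -I{} echo "BEGIN: {}" && echo "END: {}")

# The same command with the shell's grouping made explicit:
# the whole pipeline runs first, then a single echo.
grouped=$( { printf '%s\n' url1 url2 | xargs -I{} echo "BEGIN: {}"; } && echo "END: {}")

printf '%s\n' "$as_written"
```

Both forms print the BEGIN lines followed by one literal END: {}, which matches the output reported in the question.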
Inspired by UtLox's answer, I found I could get what I expected by using sh -c in xargs.
cat urls.txt | xargs -I{} -P 5 sh -c 'echo "BEGIN: {}" && sleep 1 && echo "END: {}"'
The -P 5 option makes xargs run the utility command in up to 5 subprocesses in parallel, to make the most of the available bandwidth.
Done!
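A possible refinement, offered as a sketch rather than a required change: when {} is substituted inside the sh -c string, each input line becomes part of the shell code, so a line containing quotes or $(...) would be interpreted by the shell. Passing the line as a positional parameter avoids that. The URLs are the ones from the question, and -P is left out here so the output order stays deterministic.

```shell
# The input line reaches the inner shell as $1 instead of being pasted
# into the command string; the trailing "sh" fills $0.
out=$(printf '%s\n' 'https://www.example.com/1' 'https://www.example.com/2' |
  xargs -I{} sh -c 'echo "BEGIN: $1" && sleep 1 && echo "END: $1"' sh {})

printf '%s\n' "$out"
```

Adding -P 5 back should work the same way, except that BEGIN and END lines from different URLs may then interleave.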

GNU parallel re-run when it fails with a while loop

Assuming we have a csv file
1
2
3
4
Here is the code:
cat A.csv | while read A; do
echo "echo $A" > $A.sh
echo "$A.sh"
done | xargs -I {} parallel --joblog test.log --jobs 2 -k sh ::: {}
The above is a simplified case, but it captures the bulk of it. The intent is for parallel to run like this:
parallel --joblog test.log --jobs 2 -k sh ::: 1.sh 2.sh 3.sh 4.sh
Now assume 3.sh fails for some reason. Is there an easy way to rerun the failed 3.sh in the current shell script setting, within the same parallel command line? I have tried the following, but it doesn't work and is quite lengthy.
cat A.csv | while read A; do
echo "echo $A" > $A.sh
echo "$A.sh"
done | xargs -I {} parallel --joblog test.log --jobs 2 -k sh ::: {}
# The above will do this:
# parallel --joblog test.log --jobs 2 -k sh ::: 1.sh 2.sh 3.sh 4.sh
cat A.csv | while read A; do
echo "echo $A" > $A.sh
echo "$A.sh"
done | xargs -I {} parallel --resume-failed --joblog test.log --jobs 2 -k sh ::: {}
# The above will do this:
# parallel --resume-failed --joblog test.log --jobs 2 -k sh ::: 1.sh 2.sh 3.sh 4.sh
######## 2017-09-25
Thanks Ole. I have tried the following
doit() {
myarg="$1"
if [ $myarg -eq 3 ]
then
exit 1
else
echo do crazy stuff with "$myarg"
fi
}
export -f doit
parallel -k --retries 3 --joblog ole.log doit :::: A.csv
It returns the log file like this:
Seq Host Starttime JobRuntime Send Receive Exitval Signal Command
1 : 1506362303.003 0.016 0 22 0 0 doit 1
2 : 1506362303.006 0.013 0 22 0 0 doit 2
3 : 1506362303.026 0.002 0 0 1 0 doit 3
4 : 1506362303.014 0.006 0 22 0 0 doit 4
However, I don't see doit 3 being retried 3 times as expected. Could you enlighten me? Thanks.
First: Generating .sh files seems like a bad idea. You can most likely just make a function instead:
doit() {
myarg="$1"
echo do crazy stuff with "$myarg"
}
export -f doit
To retry a failing command use --retries:
parallel --retries 3 doit :::: file.csv
If your CSV-file has multiple columns, use --colsep:
parallel --retries 3 --colsep '\t' doit :::: file.csv
Using this:
doit() {
myarg="$1"
if [ $myarg -eq 3 ] ; then
echo do not do crazy stuff with "$myarg"
exit 1
else
echo do crazy stuff with "$myarg"
fi
}
export -f doit
This will retry '3' job 3 times:
parallel -k --retries 3 --joblog ole.log doit ::: 1 2 3 4
It will only log the last attempt. To be convinced that it actually runs three times, run with unbuffered output:
parallel -u --retries 3 --joblog ole.log doit ::: 1 2 3 4
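To check afterwards which jobs ended with a non-zero exit value, the joblog can be filtered with awk. This is a minimal sketch that uses the log lines from the question as data. It assumes whitespace-separated columns in the order shown there, and a Host column without spaces.

```shell
# Joblog contents copied from the question above.
joblog='Seq Host Starttime JobRuntime Send Receive Exitval Signal Command
1 : 1506362303.003 0.016 0 22 0 0 doit 1
2 : 1506362303.006 0.013 0 22 0 0 doit 2
3 : 1506362303.026 0.002 0 0 1 0 doit 3
4 : 1506362303.014 0.006 0 22 0 0 doit 4'

# Skip the header; field 7 is Exitval and the command starts at field 9.
failed=$(printf '%s\n' "$joblog" | awk 'NR > 1 && $7 != 0 {
  cmd = $9
  for (i = 10; i <= NF; i++) cmd = cmd " " $i
  print cmd
}')

printf '%s\n' "$failed"
```

With a real log file, the printf would be replaced by reading ole.log directly.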

Use argument twice from standard output pipelining

I have a command line tool which receives two arguments:
TOOL arg1 -o arg2
I would like to invoke it with the same value for both arg1 and arg2, and to make that easy I thought I would do:
each <arg1_value> | TOOL $1 -o $1
but that doesn't work: $1 is not replaced, and the value is instead added once to the end of the command line.
An explicit example, performing:
cp fileA fileA
returns an error fileA and fileA are identical (not copied)
While performing:
echo fileA | cp $1 $1
returns the following error:
usage: cp [-R [-H | -L | -P]] [-fi | -n] [-apvX] source_file target_file
cp [-R [-H | -L | -P]] [-fi | -n] [-apvX] source_file ... target_directory
any ideas?
If you want to use xargs, the [-I] option may help:
-I replace-str
Replace occurrences of replace-str in the initial-arguments with names read from standard input. Also, unquoted blanks do not terminate input items; instead the separator is the newline character. Implies -x and -L 1.
Here is a simple example:
mkdir test && cd test && touch tmp
ls | xargs -I '{}' cp '{}' '{}'
This returns the error cp: tmp and tmp are the same file
The xargs utility will duplicate its input stream to replace all placeholders in its argument if you use the -I flag:
$ echo hello | xargs -I XXX echo XXX XXX XXX
hello hello hello
The placeholder XXX (may be any string) is replaced with the entire line of input from the input stream to xargs, so if we give it two lines:
$ printf "hello\nworld\n" | xargs -I XXX echo XXX XXX XXX
hello hello hello
world world world
You may use this with your tool:
$ generate_args | xargs -I XXX TOOL XXX -o XXX
Where generate_args is a script, command or shell function that generates arguments for your tool.
The reason
each <arg1_value> | TOOL $1 -o $1
did not work, apart from each not being a command that I recognise, is that $1 expands to the first positional parameter of the current shell or function.
The following would have worked:
set -- "arg1_value"
TOOL "$1" -o "$1"
because that sets the value of $1 before calling your tool.
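A quick way to try this out without the real tool is a stand-in function. In this sketch, TOOL is a hypothetical shell function that only prints the arguments it receives.

```shell
# Hypothetical stand-in for TOOL: prints the arguments it was called with.
TOOL() { printf '%s\n' "$*"; }

set -- "arg1_value"          # $1 is now arg1_value
result=$(TOOL "$1" -o "$1")

printf '%s\n' "$result"
```

The same value appears on both sides of -o because the shell expands "$1" twice before TOOL is called.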
You can start a new shell to perform the variable expansion, with sh -c. The -c option takes an argument, which is the command to run in that shell with expansion performed. The arguments after it are assigned to $0, $1, and so on, for use inside the -c command. For example:
sh -c 'echo $1, i repeat: $1' foo bar baz will execute echo $1, i repeat: $1 with $1 set to bar ($0 is set to foo and $2 to baz), finally printing bar, i repeat: bar
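Combined with xargs, sh -c can take the value from the pipe, which is closer to what the question asks for. A minimal sketch: the echo in front of the hypothetical TOOL prints the command line instead of running it, and the trailing sh fills $0 so that the piped value lands in $1.

```shell
# xargs appends fileA after "sh", so inside the -c command $1 is fileA.
cmd=$(echo fileA | xargs sh -c 'echo TOOL "$1" -o "$1"' sh)

printf '%s\n' "$cmd"
```

Dropping the inner echo would run TOOL itself with the value used twice.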
$1, $2 ... $N are positional parameters: they hold the arguments passed to a script or function and are not filled from a pipe, so they won't work the way you want them to. Piping connects one command's stdout to the next command's stdin, which is not what you are looking for either.
If you just want a one-liner, use something like
ARG1=hello && tool $ARG1 $ARG1
Using GNU parallel to use each line of STDIN four times, printing a table of squares:
seq 5 | parallel 'echo {} \* {} = $(( {} * {} ))'
Output:
1 * 1 = 1
2 * 2 = 4
3 * 3 = 9
4 * 4 = 16
5 * 5 = 25
One could encapsulate the tool using awk:
$ echo arg1 arg2 | awk '{ system("echo TOOL " $1 " -o " $2) }'
TOOL arg1 -o arg2
Remove the echo within the system() call and TOOL should be executed in accordance with requirements:
echo arg1 arg2 | awk '{ system("TOOL " $1 " -o " $2) }'
Double up the data from a pipe, and feed it to a command two at a time, using sed and xargs:
seq 5 | sed p | xargs -L 2 echo
Output:
1 1
2 2
3 3
4 4
5 5

Resources