How to suspend the main command when piping the output to another delay command - macOS

I have two custom scripts, each with its own task: one outputs some URLs (the cat command below stands in for it) and the other receives a URL and parses it via network requests (the sleep command below stands in for it).
Here is the prototype:
Case 1:
cat urls.txt | xargs -I{} sleep 1 && echo "END: {}"
The output is a single END: {} (the echo runs once, after xargs and its sleeps finish), and the sleep does run for each line.
Case 2:
cat urls.txt | xargs -I{} echo "BEGIN: {}" && sleep 1 && echo "END: {}"
The output is
BEGIN: https://www.example.com/1
BEGIN: https://www.example.com/2
BEGIN: https://www.example.com/3
END: {}
but there seems to be only a single one-second sleep.
Q1: I'm a little confused: why does it produce these outputs?
Q2: Is there a way to execute the full &&-chained delay command under xargs for every line of the cat output?

You can put the commands into a separate script:
worker.sh
#!/bin/bash
echo "BEGIN: $*" && sleep 1 && echo "END: $*"
set execute permission:
chmod +x worker.sh
and call it with xargs:
cat urls.txt | xargs -I{} ./worker.sh {}
output
BEGIN: https://www.example.com/1
END: https://www.example.com/1
BEGIN: https://www.example.com/2
END: https://www.example.com/2
BEGIN: https://www.example.com/3
END: https://www.example.com/3
Between BEGIN and END, the script sleeps for one second.
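If you also want the per-URL work to run in parallel, the same script approach combines with the -P option of xargs (a sketch; -P is supported by both BSD/macOS and GNU xargs, and the job count here is arbitrary):
cat urls.txt | xargs -I{} -P 4 ./worker.sh {}
Each worker still prints its own BEGIN/END pair, but up to four run at once, so pairs from different URLs may interleave.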

Thanks to shellter's and UtLox's reminders, I found that xargs is the key.
Here is my finding: the shell/zsh interpreter splits the sleep 1 and echo "END: {}" off as separate commands, so xargs never receives my two &&-chained commands as a single utility command, and never substitutes the {} in the END expression. This can be verified with xargs -t: the trace it prints to stderr shows only the echo "BEGIN: {}" invocations.
cat urls.txt | xargs -I{} -t echo "BEGIN: {}" && sleep 1 && echo "END: {}"
Inspired by UtLox's answer, I found I could get what I wanted with sh -c in xargs.
cat urls.txt | xargs -I{} -P 5 sh -c 'echo "BEGIN: {}" && sleep 1 && echo "END: {}"'
As for -P 5, it makes xargs run the utility command in up to five subprocesses in parallel, to make better use of the available bandwidth.
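One caveat, covered at length in the related questions below: embedding {} inside the sh -c string pastes the URL text into shell source, which is a command-injection risk if the input is untrusted. A safer sketch passes each line as a positional parameter instead:
cat urls.txt | xargs -I{} -P 5 sh -c 'echo "BEGIN: $1" && sleep 1 && echo "END: $1"' _ {}
The _ fills $0 of the child shell and the URL arrives as $1, so it is never parsed as shell code.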
Done!

Related

Cannot get bash function call to work inside another bash call with xargs in it [duplicate]

I am trying to use xargs to call a more complex function in parallel.
#!/bin/bash
echo_var(){
echo $1
return 0
}
seq -f "n%04g" 1 100 |xargs -n 1 -P 10 -i echo_var {}
exit 0
This returns the error
xargs: echo_var: No such file or directory
Any ideas on how I can use xargs to accomplish this, or any other solution(s) would be welcome.
Exporting the function should do it (untested):
export -f echo_var
seq -f "n%04g" 1 100 | xargs -n 1 -P 10 -I {} bash -c 'echo_var "$@"' _ {}
You can use the builtin printf instead of the external seq:
printf "n%04g\n" {1..100} | xargs -n 1 -P 10 -I {} bash -c 'echo_var "$@"' _ {}
Note that using return 0 and exit 0 like that masks any error value that might be produced by the preceding command; and when there is no error, 0 is the default anyway, so it's somewhat redundant.
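A quick illustration of the masking (a sketch; $? holds the exit status of the last command):
bash -c 'false'         ; echo $?   # prints 1: the failure propagates
bash -c 'false; exit 0' ; echo $?   # prints 0: exit 0 masks the failure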
@phobic mentions that the bash command could be simplified to
bash -c 'echo_var "{}"'
moving the {} directly inside it. But as @Sasha points out, that is vulnerable to command injection.
Here is an example of why you should not use the embedded format:
$ echo '$(date)' | xargs -I {} bash -c 'echo_var "{}"'
Sun Aug 18 11:56:45 CDT 2019
Another example of why not:
echo '\"; date\"' | xargs -I {} bash -c 'echo_var "{}"'
This is what is output using the safe format:
$ echo '$(date)' | xargs -I {} bash -c 'echo_var "$@"' _ {}
$(date)
This is comparable to using parameterized SQL queries to avoid injection.
I'm using date in a command substitution or in escaped quotes here, instead of the rm command used in @Sasha's comment, since it's non-destructive.
Using GNU Parallel, it looks like this:
#!/bin/bash
echo_var(){
echo $1
return 0
}
export -f echo_var
seq -f "n%04g" 1 100 | parallel -P 10 echo_var {}
exit 0
If you use version 20170822, you do not even have to export -f, as long as you have run this first:
. `which env_parallel.bash`
seq -f "n%04g" 1 100 | env_parallel -P 10 echo_var {}
Something like this should also work:
function testing() { sleep $1 ; }
echo {1..10} | xargs -n 1 | xargs -I# -P4 bash -c "$(declare -f testing) ; testing # ; echo # "
Here $(declare -f testing) expands to the source text of the function, so the definition is recreated inside each child bash before being called; # serves as the replacement string instead of {}.
Maybe this is bad practice, but if you are defining functions in a .bashrc or other script, you can wrap the file, or at least the function definitions, with a setting of allexport:
set -o allexport
function funcy_town {
echo 'this is a function'
}
function func_rock {
echo 'this is a function, but different'
}
function cyber_func {
echo 'this function does important things'
}
function the_man_from_funcle {
echo 'not gonna lie'
}
function funcle_wiggly {
echo "at this point I'm doing it for the funny names"
}
function extreme_function {
echo 'goodbye'
}
set +o allexport
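With allexport in effect while they are defined, the functions are marked for export just as with export -f, so the child shells that xargs starts can call them (a sketch, assuming the definitions above have been sourced into the current bash session):
bash -c 'funcy_town'                    # prints: this is a function
seq 3 | xargs -I{} bash -c 'func_rock'  # each child bash sees the exported function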
Seems I can't make comments :-(
I was wondering about the focus on
bash -c 'echo_var "$@"' _ {}
vs
bash -c 'echo_var "{}"'
The first substitutes the {} as an argument to bash, while the second pastes it into the function call. The fact that the first example doesn't expand the $(date) is simply a side effect.
If you don't want the function's args expanded, wrap the {} in single quotes rather than double ones; to avoid messy quote nesting, use double quotes for the outer bash -c string (the side on which args do get expanded). printit below stands in for a function like echo_var above:
$ echo '$(date)' | xargs -0 -L1 -I {} bash -c 'printit "{}"'
Fri 11 Sep 17:02:24 BST 2020
$ echo '$(date)' | xargs -0 -L1 -I {} bash -c "printit '{}'"
$(date)

append variables from a while loop into a command line option

I have a while loop, where A goes from 1 to 3:
mysql -e "select A from xxx;" | while read A;
do
whatever
done
The MySQL command returns only numbers, one per line, so the while loop here will see A=1, A=2, A=3.
I would like to append each integer from the loop (here A = 1 to 3) to a command line that runs outside the while loop. Is there a bash way to do this?
parallel --joblog test.log --jobs 2 -k sh ::: 1.sh 2.sh 3.sh
You probably want something like this:
mysql -e "select A from xxx;" | while read A; do
whatever > standard_out 2>standard_error
echo "$A.sh"
done | xargs parallel --joblog test.log --jobs 2 -k sh :::
Thanks for enlightening me. xargs works perfectly here:
Assuming we have A.csv (mimicking the mysql command):
1
2
3
4
We can simply do:
cat A.csv | while read A; do
echo "echo $A" > $A.sh
echo "$A.sh"
done | xargs -I {} parallel --joblog test.log --jobs 2 -k sh ::: {}
The above will print the following output as expected
1
2
3
4
Here -I {} and {} are the argument-list markers:
https://www.cyberciti.biz/faq/linux-unix-bsd-xargs-construct-argument-lists-utility/
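If the per-number command is simple, the intermediate .sh files can be skipped by letting parallel read the numbers from stdin directly (a sketch, assuming GNU parallel; echo stands in for the real job):
cat A.csv | parallel --joblog test.log --jobs 2 -k echo {}
This prints 1 2 3 4 in order, one per line, with at most two jobs running at a time.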

restricting xargs from reading stdin to buffer

It looks like xargs reads input lines from stdin even when it is already running the maximum number of processes it is allowed to run.
Here is an example:
#!/bin/bash
function xTrigger()
{
for ii in `seq 1 100`; do echo $ii; sleep 2; done
}
function xRunner()
{
sleep 10;
echo $1;
}
export -f xTrigger
export -f xRunner
bash -c "xTrigger" | xargs -n 1 -P 1 -i bash -c "xRunner {}"
Twenty seconds after starting the above, I killall xTrigger, but xargs has already buffered everything xTrigger printed, so xRunner goes on to print 1..10. What I want is for it to print only 1, 2.
Is there any way to change this behaviour and get xargs to read from stdin only when it wants to start a new command, so that xTrigger would wait at its echo statement until xargs reads from it? My stdin has very dynamic content, so this would be very useful.
I am trying to stick to xargs because it is stable and elegant; I want to write extra code only if there is no easy way to do this with xargs.
Thanks for all your help!
Don't you have to kill the bash PID running xTrigger()?
bash -c "echo $$; xTrigger" | xargs -n 1 -P 1 bash -c 'xRunner "$@"' _
kill -HUP <PID>
On my system, xargs will by default halt if one of the jobs it is running exits with a non-zero return code. Therefore, you should be sending the signal to the bash PID that is running xRunner.
Got xTrigger to generate the next trigger only when there are no 'bash -c xRunner' jobs running. Works great now:
#!/bin/bash
function xTrigger()
{
for ii in `seq 1 100`; do
echo $ii;
# wait until no xRunner job is left (psgrep is assumed to be a ps|grep helper)
while [[ $(psgrep xRunner | grep -v xargs | wc -l) -ge 1 ]]; do
sleep 2;
done
done
}
function xRunner()
{
sleep 10;
echo $1;
}
export -f xTrigger
export -f xRunner
bash -c "xTrigger" | xargs -n 1 -P 1 -i bash -c "xRunner {}"
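If strict line-at-a-time consumption matters more than keeping xargs itself, a plain while read loop only fetches the next line after the previous job finishes, which removes xargs's read-ahead (a sketch; the kernel's pipe buffer still lets the producer run somewhat ahead):
bash -c "xTrigger" | while IFS= read -r line; do
    xRunner "$line"
done
Since the loop runs in the same script where xRunner is defined, the function can be called directly.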
