I am trying to use xargs to call a more complex function in parallel.
#!/bin/bash
echo_var(){
echo $1
return 0
}
seq -f "n%04g" 1 100 |xargs -n 1 -P 10 -i echo_var {}
exit 0
This returns the error
xargs: echo_var: No such file or directory
Any ideas on how I can use xargs to accomplish this, or any other solution(s) would be welcome.
Exporting the function should do it (untested):
export -f echo_var
seq -f "n%04g" 1 100 | xargs -n 1 -P 10 -I {} bash -c 'echo_var "$#"' _ {}
You can use the builtin printf instead of the external seq:
printf "n%04g\n" {1..100} | xargs -n 1 -P 10 -I {} bash -c 'echo_var "$#"' _ {}
Also, using return 0 and exit 0 like that masks any error value that might be produced by the command preceding it. Also, if there's no error, it's the default and thus somewhat redundant.
#phobic mentions that the Bash command could be simplified to
bash -c 'echo_var "{}"'
moving the {} directly inside it. But it's vulnerable to command injection as pointed out by #Sasha.
Here is an example why you should not use the embedded format:
$ echo '$(date)' | xargs -I {} bash -c 'echo_var "{}"'
Sun Aug 18 11:56:45 CDT 2019
Another example of why not:
echo '\"; date\"' | xargs -I {} bash -c 'echo_var "{}"'
This is what is output using the safe format:
$ echo '$(date)' | xargs -I {} bash -c 'echo_var "$#"' _ {}
$(date)
This is comparable to using parameterized SQL queries to avoid injection.
I'm using date in a command substitution or in escaped quotes here instead of the rm command used in Sasha's comment since it's non-destructive.
Using GNU Parallel is looks like this:
#!/bin/bash
echo_var(){
echo $1
return 0
}
export -f echo_var
seq -f "n%04g" 1 100 | parallel -P 10 echo_var {}
exit 0
If you use version 20170822 you do not even have to export -f as long as you have run this:
. `which env_parallel.bash`
seq -f "n%04g" 1 100 | env_parallel -P 10 echo_var {}
Something like this should work also:
function testing() { sleep $1 ; }
echo {1..10} | xargs -n 1 | xargs -I# -P4 bash -c "$(declare -f testing) ; testing # ; echo # "
Maybe this is bad practice, but you if you are defining functions in a .bashrc or other script, you can wrap the file or at least the function definitions with a setting of allexport:
set -o allexport
function funcy_town {
echo 'this is a function'
}
function func_rock {
echo 'this is a function, but different'
}
function cyber_func {
echo 'this function does important things'
}
function the_man_from_funcle {
echo 'not gonna lie'
}
function funcle_wiggly {
echo 'at this point I\'m doing it for the funny names'
}
function extreme_function {
echo 'goodbye'
}
set +o allexport
Seems I can't make comments :-(
I was wondering about the focus on
bash -c 'echo_var "$#"' _ {}
vs
bash -c 'echo_var "{}"'
The 1st substitutes the {} as an arg to bash while the 2nd as an arg to the function. The fact that example 1 doesn't expand the $(date) is simply a a side effect.
If you don't want the functions args expanded , use single single quotes rather than double. To avoid messy nesting , use double quote (expand args on the other one)
$ echo '$(date)' | xargs -0 -L1 -I {} bash -c 'printit "{}"'
Fri 11 Sep 17:02:24 BST 2020
$ echo '$(date)' | xargs -0 -L1 -I {} bash -c "printit '{}'"
$(date)
Related
I am trying to use xargs to call a more complex function in parallel.
#!/bin/bash
echo_var(){
echo $1
return 0
}
seq -f "n%04g" 1 100 |xargs -n 1 -P 10 -i echo_var {}
exit 0
This returns the error
xargs: echo_var: No such file or directory
Any ideas on how I can use xargs to accomplish this, or any other solution(s) would be welcome.
Exporting the function should do it (untested):
export -f echo_var
seq -f "n%04g" 1 100 | xargs -n 1 -P 10 -I {} bash -c 'echo_var "$#"' _ {}
You can use the builtin printf instead of the external seq:
printf "n%04g\n" {1..100} | xargs -n 1 -P 10 -I {} bash -c 'echo_var "$#"' _ {}
Also, using return 0 and exit 0 like that masks any error value that might be produced by the command preceding it. Also, if there's no error, it's the default and thus somewhat redundant.
#phobic mentions that the Bash command could be simplified to
bash -c 'echo_var "{}"'
moving the {} directly inside it. But it's vulnerable to command injection as pointed out by #Sasha.
Here is an example why you should not use the embedded format:
$ echo '$(date)' | xargs -I {} bash -c 'echo_var "{}"'
Sun Aug 18 11:56:45 CDT 2019
Another example of why not:
echo '\"; date\"' | xargs -I {} bash -c 'echo_var "{}"'
This is what is output using the safe format:
$ echo '$(date)' | xargs -I {} bash -c 'echo_var "$#"' _ {}
$(date)
This is comparable to using parameterized SQL queries to avoid injection.
I'm using date in a command substitution or in escaped quotes here instead of the rm command used in Sasha's comment since it's non-destructive.
Using GNU Parallel is looks like this:
#!/bin/bash
echo_var(){
echo $1
return 0
}
export -f echo_var
seq -f "n%04g" 1 100 | parallel -P 10 echo_var {}
exit 0
If you use version 20170822 you do not even have to export -f as long as you have run this:
. `which env_parallel.bash`
seq -f "n%04g" 1 100 | env_parallel -P 10 echo_var {}
Something like this should work also:
function testing() { sleep $1 ; }
echo {1..10} | xargs -n 1 | xargs -I# -P4 bash -c "$(declare -f testing) ; testing # ; echo # "
Maybe this is bad practice, but you if you are defining functions in a .bashrc or other script, you can wrap the file or at least the function definitions with a setting of allexport:
set -o allexport
function funcy_town {
echo 'this is a function'
}
function func_rock {
echo 'this is a function, but different'
}
function cyber_func {
echo 'this function does important things'
}
function the_man_from_funcle {
echo 'not gonna lie'
}
function funcle_wiggly {
echo 'at this point I\'m doing it for the funny names'
}
function extreme_function {
echo 'goodbye'
}
set +o allexport
Seems I can't make comments :-(
I was wondering about the focus on
bash -c 'echo_var "$#"' _ {}
vs
bash -c 'echo_var "{}"'
The 1st substitutes the {} as an arg to bash while the 2nd as an arg to the function. The fact that example 1 doesn't expand the $(date) is simply a a side effect.
If you don't want the functions args expanded , use single single quotes rather than double. To avoid messy nesting , use double quote (expand args on the other one)
$ echo '$(date)' | xargs -0 -L1 -I {} bash -c 'printit "{}"'
Fri 11 Sep 17:02:24 BST 2020
$ echo '$(date)' | xargs -0 -L1 -I {} bash -c "printit '{}'"
$(date)
At fisrt,I have write the code,and it run well.
# version1
all_num=10
thread_num=5
a=$(date +%H%M%S)
seq 1 ${all_num} | xargs -n 1 -I {} -P ${thread_num} sh -c 'echo abc{}'
b=$(date +%H%M%S)
echo -e "startTime:\t$a"
echo -e "endTime:\t$b"
Now I want to extract code into a funciton,but it was wrong,how to fix it?
get_file(i){
echo "abc"+i
}
all_num=10
thread_num=5
a=$(date +%H%M%S)
seq 1 ${all_num} | xargs -n 1 -I {} -P ${thread_num} sh -c "$(get_file {})"
b=$(date +%H%M%S)
echo -e "startTime:\t$a"
echo -e "endTime:\t$b"
Because /bin/sh isn't guaranteed to have support for either printing text that when evaluates defines your function, or exporting functions through the environment, we need to do this the hard way, just duplicating the text of the function inside the copy of sh started by xargs.
Other questions already exist in this site describing how to accomplish this with bash, which is quite considerably easier. See f/e How can I use xargs to run a function in a command substitution for each match?
#!/bin/sh
all_num=10
thread_num=5
batch_size=1 # but with a larger all_num, turn this up to start fewer copies of sh
a=$(date +%H%M%S) # warning: this is really inefficient
seq 1 ${all_num} | xargs -n "${batch_size}" -P "${thread_num}" sh -c '
get_file() { i=$1; echo "abc ${i}"; }
for arg do
get_file "$arg"
done
' _
b=$(date +%H%M%S)
printf 'startTime:\t%s\n' "$a"
printf 'endTime:\t%s\n' "$b"
Note:
echo -e is not guaranteed to work with /bin/sh. Moreover, for a shell to be truly compliant, echo -e is required to write -e to its output. See Why is printf better than echo? on UNIX & Linux Stack Exchange, and the APPLICATION USAGE section of the POSIX echo specification.
Putting {} in a sh -c '...{}...' position is a Really Bad Idea. Consider the case where you're passed in a filename that contains $(rm -rf ~)'$(rm -rf ~)' -- it can't be safely inserted in an unquoted context, or a double-quoted context, or a single-quoted context, or a heredoc.
Note that seq is also nonstandard and not guaranteed to be present on all POSIX-compliant systems. i=0; while [ "$i" -lt "$all_num" ]; do echo "$i"; i=$((i + 1)); done is an alternative that will work on all POSIX systems.
I have a function that generates a random file name
#generate random file names
get_rand_filename() {
if [ "$ASCIIONLY" == "1" ]; then
for ((i=0; i<$((MINFILENAMELEN+RANDOM%MAXFILENAMELEN)); i++)) {
printf \\$(printf '%03o' ${AARR[RANDOM%aarrcount]});
}
else
# no need to escape double quotes for filename
cat /dev/urandom | tr -dc '[ -~]' | tr -d '[$></~:`\\]' | head -c$((MINFILENAMELEN+RANDOM%MAXFILENAMELEN)) #| sed 's/\(["]\)/\\\1/g'
fi
printf "%s" $FILEEXT
}
export -f get_rand_filename
When I call it from within another function
cf(){
fD=$1
echo "the target dir recieved is " $fD
CFILE="$(get_rand_filename)"
echo "the file name is "$CFILE
}
export -f cf
when I call
echo "$targetdir" | xargs -0 sh -c 'cf $1' sh
I only get the FILEXT (no random file name)
when I call
cf "$targetdir"
I get a valid result
I need to be able to handle spaces in the $targetdir and file name string.
echo "$targetdir" | xargs -0 sh -c 'cf $1' sh
You should invoke bash rather than sh. Function exporting is a bash feature.
$ foo() { echo bar; }
$ export -f foo
$ sh -c 'foo'
sh: 1: foo: not found
$ bash -c 'foo'
bar
Also, get rid of the -0 option since the input isn't NUL-separated. Use -d'\n' instead. And quote "$1" for robustness.
echo "$targetdir" | xargs -d'\n' bash -c 'cf "$1"' bash
Actually, you could use -0 if you change the input format.
printf '%s\0' "$targetdir" | xargs -0 bash -c 'cf "$1"' bash
For what it's worth, mktemp creates random temporary files, and does it safely. It makes sure the file doesn't already exist and then creates it to prevent anybody else from snatching up the name in the split second between the name being generated and it being returned to the caller.
I am trying to use xargs to call a more complex function in parallel.
#!/bin/bash
echo_var(){
echo $1
return 0
}
seq -f "n%04g" 1 100 |xargs -n 1 -P 10 -i echo_var {}
exit 0
This returns the error
xargs: echo_var: No such file or directory
Any ideas on how I can use xargs to accomplish this, or any other solution(s) would be welcome.
Exporting the function should do it (untested):
export -f echo_var
seq -f "n%04g" 1 100 | xargs -n 1 -P 10 -I {} bash -c 'echo_var "$#"' _ {}
You can use the builtin printf instead of the external seq:
printf "n%04g\n" {1..100} | xargs -n 1 -P 10 -I {} bash -c 'echo_var "$#"' _ {}
Also, using return 0 and exit 0 like that masks any error value that might be produced by the command preceding it. Also, if there's no error, it's the default and thus somewhat redundant.
#phobic mentions that the Bash command could be simplified to
bash -c 'echo_var "{}"'
moving the {} directly inside it. But it's vulnerable to command injection as pointed out by #Sasha.
Here is an example why you should not use the embedded format:
$ echo '$(date)' | xargs -I {} bash -c 'echo_var "{}"'
Sun Aug 18 11:56:45 CDT 2019
Another example of why not:
echo '\"; date\"' | xargs -I {} bash -c 'echo_var "{}"'
This is what is output using the safe format:
$ echo '$(date)' | xargs -I {} bash -c 'echo_var "$#"' _ {}
$(date)
This is comparable to using parameterized SQL queries to avoid injection.
I'm using date in a command substitution or in escaped quotes here instead of the rm command used in Sasha's comment since it's non-destructive.
Using GNU Parallel is looks like this:
#!/bin/bash
echo_var(){
echo $1
return 0
}
export -f echo_var
seq -f "n%04g" 1 100 | parallel -P 10 echo_var {}
exit 0
If you use version 20170822 you do not even have to export -f as long as you have run this:
. `which env_parallel.bash`
seq -f "n%04g" 1 100 | env_parallel -P 10 echo_var {}
Something like this should work also:
function testing() { sleep $1 ; }
echo {1..10} | xargs -n 1 | xargs -I# -P4 bash -c "$(declare -f testing) ; testing # ; echo # "
Maybe this is bad practice, but you if you are defining functions in a .bashrc or other script, you can wrap the file or at least the function definitions with a setting of allexport:
set -o allexport
function funcy_town {
echo 'this is a function'
}
function func_rock {
echo 'this is a function, but different'
}
function cyber_func {
echo 'this function does important things'
}
function the_man_from_funcle {
echo 'not gonna lie'
}
function funcle_wiggly {
echo 'at this point I\'m doing it for the funny names'
}
function extreme_function {
echo 'goodbye'
}
set +o allexport
Seems I can't make comments :-(
I was wondering about the focus on
bash -c 'echo_var "$#"' _ {}
vs
bash -c 'echo_var "{}"'
The 1st substitutes the {} as an arg to bash while the 2nd as an arg to the function. The fact that example 1 doesn't expand the $(date) is simply a a side effect.
If you don't want the functions args expanded , use single single quotes rather than double. To avoid messy nesting , use double quote (expand args on the other one)
$ echo '$(date)' | xargs -0 -L1 -I {} bash -c 'printit "{}"'
Fri 11 Sep 17:02:24 BST 2020
$ echo '$(date)' | xargs -0 -L1 -I {} bash -c "printit '{}'"
$(date)
For some reason I cannot pass the 2nd parameter to a function which is on a another file, exactly here:
$lsValidLocal | xargs -n 1 -I {} bash -c 'Push "{}" "**$inFolder**"
The Push function on functions.sh does not read the 2nd parameter $inFolder.
I tried several different ways, the only working way till now is exporting the variable to make it globally accessible (not a good solution though)
script.sh
#!/bin/bash
#other machine
export otherachine="IP_address_otherachine"
#folders
inFolder="$HOME/folderIn"
outFolder="$HOME/folderOut"
#loading functions.sh
. /home/ec2-user/functions.sh
export lsValidLocal="lsValid $inFolder"
echo $inFolder
#execution
$lsValidLocal | xargs -n 1 -I {} bash -c 'Push "{}" "$inFolder"'
functions.sh
function Push() {
local FILE=$1
local DEST=$2
scp $FILE $otherachine:$DEST &&
rm $FILE ${FILE}_0 &&
ssh $otherachine "touch ${FILE}_0"
}
function lsValid() { #from directory
local DIR=$1
ls $DIR/*_0 | sed 's/.\{2\}$//'
}
export -f Push
export -f Pull
export -f lsValid
The problem with the code you have written is that $inFolder is inside single quotes (') which will prevent it being expanded.
$lsValidLocal | xargs -n 1 -I {} bash -c 'Push "{}" "**$inFolder**"'
This will be executed as three separate layers of processes
bash <your scrpit>
|
\xargs ...
|
\bash -c Push ...
Your code is not transferring the value across from the outer shell to inner shell... But you are expanding the variable inFolder using the inner shell. As you correctly point out it can be done with an exported environment variable.
The alternative is to have the outer shell expand it before passing to xargs.
$lsValidLocal | xargs -n 1 -I {} bash -c "Push '{}' '**$inFolder**'"
Notice I have reversed ' and " to allow $inFolder to be expanded before xargs is called.