How to extract code into a function when using xargs -P? - shell

At first, I wrote this code, and it runs well.
# version1
all_num=10
thread_num=5
a=$(date +%H%M%S)
seq 1 ${all_num} | xargs -n 1 -I {} -P ${thread_num} sh -c 'echo abc{}'
b=$(date +%H%M%S)
echo -e "startTime:\t$a"
echo -e "endTime:\t$b"
Now I want to extract the code into a function, but my attempt does not work. How can I fix it?
get_file(i){
echo "abc"+i
}
all_num=10
thread_num=5
a=$(date +%H%M%S)
seq 1 ${all_num} | xargs -n 1 -I {} -P ${thread_num} sh -c "$(get_file {})"
b=$(date +%H%M%S)
echo -e "startTime:\t$a"
echo -e "endTime:\t$b"

Because /bin/sh isn't guaranteed to support either printing text that, when evaluated, defines your function, or exporting functions through the environment, we need to do this the hard way: duplicating the text of the function inside the copy of sh started by xargs.
Other questions on this site already describe how to accomplish this with bash, which is considerably easier. See, for example, How can I use xargs to run a function in a command substitution for each match?
#!/bin/sh
all_num=10
thread_num=5
batch_size=1 # but with a larger all_num, turn this up to start fewer copies of sh
a=$(date +%H%M%S) # warning: this is really inefficient
seq 1 ${all_num} | xargs -n "${batch_size}" -P "${thread_num}" sh -c '
get_file() { i=$1; echo "abc ${i}"; }
for arg do
get_file "$arg"
done
' _
b=$(date +%H%M%S)
printf 'startTime:\t%s\n' "$a"
printf 'endTime:\t%s\n' "$b"
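If bash is available, the easier route mentioned above can be sketched roughly as follows. This is only an illustration under that assumption: get_file is the function from the question, and exported functions are a bash feature, so the child shell must be bash rather than sh.

```shell
#!/usr/bin/env bash
# Sketch only: relies on bash, because export -f is not available in plain sh.
get_file() { printf 'abc %s\n' "$1"; }
export -f get_file

all_num=10
thread_num=5

# Each number arrives in the child bash as "$1"; "_" fills in $0.
# With -P the output order is not guaranteed.
seq 1 "$all_num" | xargs -n 1 -P "$thread_num" bash -c 'get_file "$1"' _
```

The value is passed as a positional parameter instead of being pasted into the command string, which keeps the quoting simple.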
Note:
echo -e is not guaranteed to work with /bin/sh. Moreover, for a shell to be truly compliant, echo -e is required to write -e to its output. See Why is printf better than echo? on UNIX & Linux Stack Exchange, and the APPLICATION USAGE section of the POSIX echo specification.
Putting {} in a sh -c '...{}...' position is a Really Bad Idea. Consider the case where you're passed in a filename that contains $(rm -rf ~)'$(rm -rf ~)' -- it can't be safely inserted in an unquoted context, or a double-quoted context, or a single-quoted context, or a heredoc.
Note that seq is also nonstandard and not guaranteed to be present on all POSIX-compliant systems. i=1; while [ "$i" -le "$all_num" ]; do echo "$i"; i=$((i + 1)); done is an alternative that will work on all POSIX systems and, like seq 1 ${all_num}, counts from 1.
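To make the point about {} concrete, here is a small sketch. The "hostile" value is a harmless stand-in chosen for illustration (it only echoes a word); a real filename could carry any command.

```shell
#!/bin/sh
# Harmless stand-in for a hostile filename: the substitution only echoes a word.
hostile='$(echo INJECTED)'

# Unsafe: {} is pasted into the script text, so the inner sh runs the substitution.
unsafe=$(printf '%s\n' "$hostile" | xargs -I {} sh -c 'printf "got: %s\n" "{}"')

# Safer: {} is passed as a separate argument and read as "$1", so it is never parsed as code.
safe=$(printf '%s\n' "$hostile" | xargs -I {} sh -c 'printf "got: %s\n" "$1"' _ {})

printf 'unsafe: %s\n' "$unsafe"
printf 'safe:   %s\n' "$safe"
```

In the first call the inner shell executes the embedded command; in the second the text comes through unchanged.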

Related

make the bash script to be faster

I have a fairly large list of websites in "file.txt" and wanted to check if the words "Hello World!" in the site in the list using looping and curl.
i.e in "file.txt" :
blabla.com
blabla2.com
blabla3.com
then my code :
#!/bin/bash
put() {
printf "list : "
read list
run=$(cat $list)
}
put
scan_list() {
for run in $(cat $list);do
if [[ $(curl -skL ${run}) =~ "Hello World!" ]];then
printf "${run} Hello World! \n"
else
printf "${run} No Hello:( \n"
fi
done
}
scan_list
this takes a lot of time, is there a way to make the checking process faster?
Use xargs:
% tr '\12' '\0' < file.txt | \
xargs -0 -r -n 1 -t -P 3 sh -c '
if curl -skL "$1" | grep -q "Hello World!"; then
echo "$1 Hello World!"
exit
fi
echo "$1 No Hello:("
' _
Use tr to convert the newlines in file.txt to nulls (\0).
Pass the result through xargs with the -0 option to split on nulls.
The -r option prevents the command from being run if the input is empty. It is a GNU extension, so on macOS or *BSD you may need to check that file.txt is not empty before running.
The -n 1 option permits only one entry per execution.
The -t option is for debugging; it prints each command before it is run.
The -P 3 option allows 3 commands to run in parallel.
Using sh -c with a single-quoted multi-line command, we refer to each entry from the file as $1.
The _ fills in the $0 argument, so our entries are $1.
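The roles of _ and $1, and a portable substitute for -r, can be seen in a sketch that needs no network access. The temporary list file here is hypothetical and stands in for file.txt; the printf inside sh -c replaces the curl test.

```shell
#!/bin/sh
# Hypothetical list file, created only so the sketch is self-contained.
list=$(mktemp)
printf '%s\n' blabla.com blabla2.com > "$list"

result=""
# Portable stand-in for -r: skip xargs entirely when the list is empty.
if [ -s "$list" ]; then
    # "_" becomes $0 inside sh -c, so each list entry arrives as $1.
    result=$(tr '\n' '\0' < "$list" |
        xargs -0 -n 1 sh -c 'printf "%s -> %s\n" "$0" "$1"' _)
fi

printf '%s\n' "$result"
rm -f "$list"
```

Each output line shows the placeholder on the left and one entry from the list on the right.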

Is there a way to perform echo | tee during find -exec?

I have a problem with a code similar to the following:
function echotee() { echo $1 | tee -a ${FILE}; }
export -f echotee
find . -delete -exec sh -c 'echotee "Deleting: {}"' \;
The function echotee usually works as expected. However, during the -exec it does not. Indeed, it just prints on the terminal, omitting tee.
Hoping the question is not too trivial, thanks in advance.
Why don't you just use this:
find . -delete -exec sh -c 'echo "Deleting: $1" | tee -a "$2"' _ {} "${FILE}" \;
No need to define and call a function.
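A self-contained way to try this safely is sketched below. The scratch directory and log file are hypothetical and exist only for the demonstration; the message is printed before -delete runs so that each file is logged and then removed.

```shell
#!/bin/sh
# Scratch directory and log file are hypothetical, created only for this sketch.
work=$(mktemp -d)
log=$(mktemp)
touch "$work/plain" "$work/with \"quote\""

# The filename is passed as $1 and the log path as $2; neither is pasted into the script text.
find "$work" -type f \
    -exec sh -c 'printf "Deleting: %s\n" "$1" | tee -a "$2"' _ {} "$log" \; \
    -delete

logged=$(cat "$log")
rm -rf "$work" "$log"
```

Because the name travels as a positional parameter, the file containing a double quote is logged intact.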
You mentioned in a comment that you want to use echotee as a central point to print and log information. Have you considered a setup like this instead:
#!/usr/bin/env bash
# Send all script output to console and logfile
LOGFILE="..."
exec > >(tee -ia "${LOGFILE}") 2>&1
find . -delete -printf "Deleting: %f\n"
or this:
#!/usr/bin/env bash
# Set up fd 3 to send output to console and logfile on demand
LOGFILE="..."
exec 3> >(tee -ia "${LOGFILE}")
find . -delete -printf "Deleting: %f\n" 1>&3 2>&1
Use name() instead of function name().
You neither set nor exported the FILE variable.
sh does not support exporting functions. It's a feature of bash, so you have to call bash.
sh -c ' .... "{}"' will break on filenames containing a " character. Pass the name as a positional argument and use $1.
$1 and $FILE expansions are not quoted and are subject to word splitting and filename expansion.
echo $1 will break on filenames like -e. Prefer printf.
Check your scripts with shellcheck - it will catch many such mistakes.
I think you meant to:
FILE=/tmp/log.txt
echotee() { printf "%s\n" "$1" | tee -a "$FILE"; }
export -f echotee
export FILE
find . -exec bash -c 'echotee "Deleting: $1"' -- {} \;
But the version from Shawn with -printf "Deleting: %p\n" | tee "$FILE" looks just nicer.
Spawning tee and a pipe for every message is probably slower, so something like this could be a bit faster:
echotee() { printf "%s\n" "$1" >> "$FILE"; printf "%s\n" "$1"; }
or like:
exec 10>>"$FILE"
echotee() { printf "%s\n" "$1" >&10; printf "%s\n" "$1"; }
You could remove the pipe either way, just:
echotee() { tee -a "$FILE" <<<"$1"; }

calling shell function using parallel with list of quoted filenames as input

Using Bash.
I have an exported shell function which I want to apply to many files.
Normally I would use xargs, but the syntax like this (see here) is too ugly for use.
...... | xargs -n 1 -P 10 -I {} bash -c 'echo_var "$@"' _ {}
In that discussion, parallel had an easier syntax:
..... | parallel -P 10 echo_var {}
Now I have run into the following problem: the list of files to which I want to apply my function is a list of files on one line, each quoted and separated by spaces thus:
"file 1" "file 2" "file 3".
How can I feed this space-separated, quoted list into parallel?
I can replicate the list using echo for testing.
e.g.
echo '"file 1" "file 2" "file 3"'|parallel -d " " my_function {}
but I can't get this to work.
How can I fix it?
You have to choose a unique separator.
echo 'file 1|file 2|file 3' | xargs -d "|" -n1 bash -c 'my_function "$@"' --
echo 'file 1^file 2^file 3' | parallel -d "^" my_function
The safest is to use zero byte as the separator:
echo -e 'file 1\x00file 2\x00file 3' | xargs -0 -n1 bash -c 'my_function "$@"' --
printf "%s\0" 'file 1' 'file 2' 'file 3' | parallel -0 my_function
The best is to store your elements inside a bash array and use a zero separated stream to process them:
files=("file 1" "file 2" "file 3")
printf "%s\0" "${files[@]}" | xargs -0 -n1 bash -c 'my_function "$@"' --
printf "%s\0" "${files[@]}" | parallel -0 my_function
Note that empty arrays will run the function without any arguments. It is sometimes preferable to use the -r (--no-run-if-empty) option so that the function is not run when the input is empty. --no-run-if-empty is supported by parallel and is a GNU extension in xargs (xargs on BSD and on OSX does not have --no-run-if-empty).
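The array approach can be sketched end to end as follows. my_function here is a hypothetical placeholder for the exported function from the question, and the length check is one possible way to handle the empty case without depending on -r.

```shell
#!/usr/bin/env bash
# my_function is a hypothetical placeholder for the exported function in the question.
my_function() { printf 'got <%s>\n' "$1"; }
export -f my_function

files=("file 1" "file 2" "file 3")

# Skip xargs entirely for an empty array instead of relying on the GNU-only -r option.
if ((${#files[@]} > 0)); then
    # "--" fills in $0 of the child bash; each NUL-terminated name arrives as $1.
    printf '%s\0' "${files[@]}" | xargs -0 -n 1 bash -c 'my_function "$@"' --
fi
```

Each element, spaces included, reaches the function as a single argument.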
Note: xargs by default parses ', " and \. This is why the following is possible and will work:
echo '"file 1" "file 2" "file 3"' | xargs -n1 bash -c 'my_function "$@"' --
echo "'file 1' 'file 2' 'file 3'" | xargs -n1 bash -c 'my_function "$@"' --
echo 'file\ 1 file\ 2 file\ 3' | xargs -n1 bash -c 'my_function "$@"' --
This can lead to some surprising results, so remember to almost always pass the -d option to xargs:
$ # note \x replaced by single x
$ echo '\\a\b\c' | xargs
\abc
$ # quotes are parsed and need to match
$ echo 'abc"def' | xargs
xargs: unmatched double quote; by default quotes are special to xargs unless you use the -0 option
$ echo "abc'def" | xargs
xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option
xargs is a portable tool available almost everywhere, while parallel is a GNU program that has to be installed separately.
The problem boils down to this: the values can contain spaces, and space is also the value separator. So we need something that can parse the input into separate values that contain spaces. Since they are bash-quoted, the obvious choice is to use bash to unquote the values.
You have several options:
(echo "file 1";
echo "file 2";
echo "file \"name\" \$(3)") | parallel my_function
printf "%s\n" "file 1" "file 2" "file \"name\" \$(3)" |
parallel my_function
If the input is in a variable:
var='"file 1" "file 2" "file \"name\" \$(3)"'
eval 'printf "%s\n" '"$var" |
parallel my_function
Or you can convert the variable to an array:
var='"file 1" "file 2" "file \"name\" \$(3)"'
eval "arr=($var)"
And if the input is in an array:
parallel my_function ::: "${arr[@]}"
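The conversion from a quoted string to an array can be checked without parallel installed. This is a sketch that assumes the input is trusted, because eval executes whatever the string contains.

```shell
#!/usr/bin/env bash
# Only use eval on input you trust: it runs any code embedded in the string.
var='"file 1" "file 2" "file \"name\" \$(3)"'

# The whole assignment is handed to eval as one string, so bash re-parses the quoting in $var.
eval "arr=($var)"

printf 'count: %s\n' "${#arr[@]}"
printf '<%s>\n' "${arr[@]}"
```

The array ends up with three elements, and the escaped quotes and \$ come through as literal characters in the third one.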

Why am I not getting a value when I call a function within another in a bash script

I have a function that generates a random file name
#generate random file names
get_rand_filename() {
if [ "$ASCIIONLY" == "1" ]; then
for ((i=0; i<$((MINFILENAMELEN+RANDOM%MAXFILENAMELEN)); i++)) {
printf \\$(printf '%03o' ${AARR[RANDOM%aarrcount]});
}
else
# no need to escape double quotes for filename
cat /dev/urandom | tr -dc '[ -~]' | tr -d '[$></~:`\\]' | head -c$((MINFILENAMELEN+RANDOM%MAXFILENAMELEN)) #| sed 's/\(["]\)/\\\1/g'
fi
printf "%s" $FILEEXT
}
export -f get_rand_filename
When I call it from within another function
cf(){
fD=$1
echo "the target dir recieved is " $fD
CFILE="$(get_rand_filename)"
echo "the file name is "$CFILE
}
export -f cf
when I call
echo "$targetdir" | xargs -0 sh -c 'cf $1' sh
I only get the FILEEXT (no random file name)
when I call
cf "$targetdir"
I get a valid result
I need to be able to handle spaces in the $targetdir and file name string.
echo "$targetdir" | xargs -0 sh -c 'cf $1' sh
You should invoke bash rather than sh. Function exporting is a bash feature.
$ foo() { echo bar; }
$ export -f foo
$ sh -c 'foo'
sh: 1: foo: not found
$ bash -c 'foo'
bar
Also, get rid of the -0 option since the input isn't NUL-separated. Use -d'\n' instead. And quote "$1" for robustness.
echo "$targetdir" | xargs -d'\n' bash -c 'cf "$1"' bash
Actually, you could use -0 if you change the input format.
printf '%s\0' "$targetdir" | xargs -0 bash -c 'cf "$1"' bash
For what it's worth, mktemp creates random temporary files, and does it safely. It makes sure the file doesn't already exist and then creates it to prevent anybody else from snatching up the name in the split second between the name being generated and it being returned to the caller.
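A minimal sketch of the mktemp route, assuming a mktemp that accepts a template path (the common GNU and BSD versions do). The directory name with spaces and the file. prefix are made up for the example.

```shell
#!/bin/sh
# Hypothetical target directory containing spaces, created only for this sketch.
base=$(mktemp -d)
trap 'rm -rf "$base"' EXIT
targetdir="$base/my target dir"
mkdir -p "$targetdir"

# mktemp replaces the trailing X characters with a unique suffix and creates the file itself.
CFILE=$(mktemp "$targetdir/file.XXXXXXXX")
printf 'the file name is %s\n' "$CFILE"
```

Quoting "$targetdir" and "$CFILE" everywhere is what keeps the spaces intact.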

How to run commands off of a pipe

I would like to run commands such as "history" or "!23" off of a pipe.
1. How might I achieve this?
2. Why does the following command not work?
echo "history" | xargs eval $1
To answer (2) first:
history and eval are both bash builtins. So xargs cannot run either of them.
xargs does not use $1 arguments. man xargs for the correct syntax.
For (1), it doesn't really make much sense to do what you are attempting because shell history is not likely to be synchronised between invocations, but you could try something like:
{ echo 'history'; echo '!23'; } | bash -i
or:
{ echo 'history'; echo '!23'; } | while read -r cmd; do eval "$cmd"; done
Note that pipelines run inside subshells. Environment changes are not retained:
x=1; echo "x=2" | while read -r cmd; do eval "$cmd"; done; echo "$x"
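One way around the subshell, sketched here for bash and dash, is to feed the loop from a here-document instead of a pipe, so the loop runs in the current shell. Some other shells run the last stage of a pipeline in the current shell, in which case the first value would differ.

```shell
#!/bin/sh
x=1

# Piped loop: in bash and dash it runs in a subshell, so the assignment is lost.
echo "x=2" | while read -r cmd; do eval "$cmd"; done
after_pipe=$x

# Here-document: the loop runs in the current shell, so the assignment sticks.
while read -r cmd; do eval "$cmd"; done <<EOF
x=3
EOF
after_heredoc=$x

printf 'after pipe: %s, after here-document: %s\n' "$after_pipe" "$after_heredoc"
```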
You can try like this
First redirect the history commands to a file (cut out the line numbers)
history | cut -c 8- > cmd.txt
Now create this script, hcmd.sh (based on Read a file line by line assigning the value to a variable):
#!/bin/bash
while IFS='' read -r line || [[ -n "$line" ]]; do
echo "Text read from file: $line"
$line
done < "cmd.txt"
Run it like this
./hcmd.sh
