command fails when fed argument via xargs, but not when fed the argument directly - bash

I have a bash function
agg_generror () {
echo $1
find ${folder} -name "${prefix}_*_${1}_${suffix}.count" | xargs -I % sh -c 'cat %; echo "";' | awk 'BEGIN{e=0;t=0} {e+=$1;t+=$2} END{print e/t}' > generror_${1}
}
which if I call directly
agg_generror 17.5
works and doesn't complain.
But if I do
echo 17.5 | xargs -I % sh -c 'agg_generror %'
It fails with
17.5
awk: fatal: division by zero attempted
Why might the behaviour differ in the two cases?

The likely difference is that the child shell started by xargs does not inherit the unexported variables folder, prefix and suffix (nor the function itself, unless it was exported with export -f), so find matches nothing, awk receives no input, t stays 0, and the END block divides by zero. The simplest fix is to skip the subshell and loop in the current shell, where everything is already defined:
while read; do agg_generror $REPLY; done < input.txt
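If you do need the xargs route, a minimal sketch, assuming bash is available as the child shell (export -f is a bashism that plain sh may not honour):
export folder prefix suffix
export -f agg_generror
echo 17.5 | xargs -I % bash -c 'agg_generror "$@"' _ %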

Passing Arguments to GNU parallel

I'm trying to use awk and GNU parallel to filter files based on the values in column 1 and column 2, and dump the result into a single .csv.gz file. Thanks to the answer here, I managed to write myscript.sh to do the job in parallel.
#!/bin/bash
doit() {
pigz -dc $1 | awk -F, '$1>0.5 && $2<1.5'
}
export -f doit
find $1 -name '*.csv.gz' | parallel doit | pigz > output.csv.gz
and then run the script in the terminal.
./myscript.sh /path/to/files
I'm wondering how I can pass 0.5 and 1.5 as arguments to myscript.sh?
./myscript.sh /path/to/files 0.5 1.5
This may be an easier, or more explicit, way of passing variables and parameters around:
#!/bin/bash
dir="$1"
# Pick up second and third parameters, defaulting to 0.5 and 1.5 if unspecified
a=${2:-0.5}
b=${3:-1.5}
doit() {
file=$1
a=$2
b=$3
echo "File: $file, a=$a, b=$b"
cat "$1" | awk -F, -v a="$a" -v b="$b" '$1>a && $2<b'
}
export -f doit
find "$dir" -name '*.tst' | parallel doit {} "$a" "$b"
#!/bin/bash
doit() {
# $1 $2 $3 are arguments to doit
# '$1' and '$2' are variables in awk
pigz -dc $1 | awk -F, '$1>'$2' && $2<'$3
}
export -f doit
find $1 -name '*.csv.gz' | parallel doit {} $2 $3 | pigz > output.csv.gz
Generate some test data, call the script, and inspect the result:
paste <(seq 10 | shuf) <(seq 10 | shuf) | gzip > h.csv.gz
./myscript.sh . 5 6
zcat output.csv.gz

Cannot get bash function call to work inside another bash call with xargs in it [duplicate]

I am trying to use xargs to call a more complex function in parallel.
#!/bin/bash
echo_var(){
echo $1
return 0
}
seq -f "n%04g" 1 100 |xargs -n 1 -P 10 -i echo_var {}
exit 0
This returns the error
xargs: echo_var: No such file or directory
Any ideas on how I can use xargs to accomplish this, or any other solution(s) would be welcome.
Exporting the function should do it (untested):
export -f echo_var
seq -f "n%04g" 1 100 | xargs -n 1 -P 10 -I {} bash -c 'echo_var "$@"' _ {}
You can use the builtin printf instead of the external seq:
printf "n%04g\n" {1..100} | xargs -n 1 -P 10 -I {} bash -c 'echo_var "$#"' _ {}
Also, using return 0 and exit 0 like that masks any error value that might be produced by the command preceding it; and when there is no error, 0 is the default anyway, so they are somewhat redundant.
@phobic mentions that the Bash command could be simplified to
bash -c 'echo_var "{}"'
moving the {} directly inside it. But it's vulnerable to command injection, as pointed out by @Sasha.
Here is an example why you should not use the embedded format:
$ echo '$(date)' | xargs -I {} bash -c 'echo_var "{}"'
Sun Aug 18 11:56:45 CDT 2019
Another example of why not:
echo '\"; date\"' | xargs -I {} bash -c 'echo_var "{}"'
This is what is output using the safe format:
$ echo '$(date)' | xargs -I {} bash -c 'echo_var "$@"' _ {}
$(date)
This is comparable to using parameterized SQL queries to avoid injection.
I'm using date in a command substitution or in escaped quotes here instead of the rm command used in Sasha's comment since it's non-destructive.
Using GNU Parallel it looks like this:
#!/bin/bash
echo_var(){
echo $1
return 0
}
export -f echo_var
seq -f "n%04g" 1 100 | parallel -P 10 echo_var {}
exit 0
If you use version 20170822 or later you do not even have to export -f, as long as you have run this first:
. `which env_parallel.bash`
seq -f "n%04g" 1 100 | env_parallel -P 10 echo_var {}
Something like this should work also:
function testing() { sleep $1 ; }
echo {1..10} | xargs -n 1 | xargs -I# -P4 bash -c "$(declare -f testing) ; testing # ; echo # "
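The same trick applied to echo_var from the question, as a sketch (declare -f is expanded by the outer shell, so the child bash receives the full function definition in its command string and no export -f is needed; since # is substituted straight into the double-quoted string, the injection caveat discussed above applies here too):
seq -f "n%04g" 1 100 | xargs -n 1 -I# -P10 bash -c "$(declare -f echo_var) ; echo_var #"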
Maybe this is bad practice, but if you are defining functions in a .bashrc or other script, you can wrap the file, or at least the function definitions, with a setting of allexport:
set -o allexport
function funcy_town {
echo 'this is a function'
}
function func_rock {
echo 'this is a function, but different'
}
function cyber_func {
echo 'this function does important things'
}
function the_man_from_funcle {
echo 'not gonna lie'
}
function funcle_wiggly {
echo "at this point I'm doing it for the funny names"
}
function extreme_function {
echo 'goodbye'
}
set +o allexport
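In bash, allexport also gives the export attribute to functions defined while it is set, just as export -f would, so they become visible to child bash processes started by xargs. A quick check, assuming the definitions above are saved in a hypothetical funcs.sh:
. funcs.sh
echo x | xargs -I {} bash -c 'funcy_town'
which should print this is a function.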
Seems I can't make comments :-(
I was wondering about the focus on
bash -c 'echo_var "$@"' _ {}
vs
bash -c 'echo_var "{}"'
The 1st substitutes the {} as an argument to bash, while the 2nd substitutes it as an argument to the function. The fact that example 1 doesn't expand the $(date) is simply a side effect.
If you don't want the function's args expanded, use single quotes rather than double around the placeholder. To avoid messy nesting, use double quotes on the other level (args are expanded on the double-quoted one). In the examples below, printit stands in for an exported function like echo_var above:
$ echo '$(date)' | xargs -0 -L1 -I {} bash -c 'printit "{}"'
Fri 11 Sep 17:02:24 BST 2020
$ echo '$(date)' | xargs -0 -L1 -I {} bash -c "printit '{}'"
$(date)

Use argument twice from standard output pipelining

I have a command line tool which receives two arguments:
TOOL arg1 -o arg2
I would like to invoke it with the same argument provided for both arg1 and arg2, and to make that easy for myself, I thought I would do:
each <arg1_value> | TOOL $1 -o $1
but that doesn't work; $1 is not replaced, the value is just added once to the end of the command line.
An explicit example, performing:
cp fileA fileA
returns the error fileA and fileA are identical (not copied).
While performing:
echo fileA | cp $1 $1
returns the following error:
usage: cp [-R [-H | -L | -P]] [-fi | -n] [-apvX] source_file target_file
cp [-R [-H | -L | -P]] [-fi | -n] [-apvX] source_file ... target_directory
any ideas?
If you want to use xargs, the [-I] option may help:
-I replace-str
Replace occurrences of replace-str in the initial-arguments with names read from standard input. Also, unquoted blanks do not terminate input items; instead the separator is the newline character. Implies -x and -L 1.
Here is a simple example:
mkdir test && cd test && touch tmp
ls | xargs -I '{}' cp '{}' '{}'
returns the error cp: tmp and tmp are the same file
The xargs utility will duplicate its input stream to replace all placeholders in its argument if you use the -I flag:
$ echo hello | xargs -I XXX echo XXX XXX XXX
hello hello hello
The placeholder XXX (may be any string) is replaced with the entire line of input from the input stream to xargs, so if we give it two lines:
$ printf "hello\nworld\n" | xargs -I XXX echo XXX XXX XXX
hello hello hello
world world world
You may use this with your tool:
$ generate_args | xargs -I XXX TOOL XXX -o XXX
Where generate_args is a script, command or shell function that generates arguments for your tool.
The reason
each <arg1_value> | TOOL $1 -o $1
did not work, apart from each not being a command that I recognise, is that $1 expands to the first positional parameter of the current shell or function.
The following would have worked:
set -- "arg1_value"
TOOL "$1" -o "$1"
because that sets the value of $1 before calling your tool.
You can re-run a shell to perform variable expansion, with sh -c. The -c option takes an argument which is the command to run in a shell, with expansions performed. Subsequent arguments to sh are interpreted as $0, $1, and so on, for use in the -c command string. For example:
sh -c 'echo $1, i repeat: $1' foo bar baz will execute echo $1, i repeat: $1 with $1 set to bar ($0 set to foo and $2 to baz), finally printing bar, i repeat: bar.
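Combining this with xargs gives a safe way to use one value twice (a sketch; TOOL stands for your command):
echo arg1_value | xargs -I % sh -c 'TOOL "$1" -o "$1"' _ %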
The $1, $2 ... $N parameters are only meaningful inside a bash script, where they name that script's own arguments, so they won't work the way you want them to here. Piping redirects stdout to stdin and is not what you are looking for either.
If you just want a one-liner, use something like
ARG1=hello && tool $ARG1 $ARG1
Using GNU parallel to use STDIN four times, to print a multiplication table:
seq 5 | parallel 'echo {} \* {} = $(( {} * {} ))'
Output:
1 * 1 = 1
2 * 2 = 4
3 * 3 = 9
4 * 4 = 16
5 * 5 = 25
One could encapsulate the tool using awk:
$ echo arg1 arg2 | awk '{ system("echo TOOL " $1 " -o " $2) }'
TOOL arg1 -o arg2
Remove the echo within the system() call and TOOL should be executed in accordance with requirements:
echo arg1 arg2 | awk '{ system("TOOL " $1 " -o " $2) }'
Double up the data from a pipe, and feed it to a command two at a time, using sed and xargs:
seq 5 | sed p | xargs -L 2 echo
Output:
1 1
2 2
3 3
4 4
5 5

Do a tail -F until matching a pattern (with no error)

I have a line working from this thread that tails a file until a matching pattern is found. It works well, but I can't find a way to suppress the output that occurs afterwards.
The line is:
sh -c 'tail -n +0 -f $logfile | { sed "/EOF/ q" && kill $$ ;}'
Piping to /dev/null doesn't work, as I don't get any output at all from the tail command that way. Also, I'm on OSX, and various other sed and awk suggestions don't work due to syntax differences.
It always finishes with the below, instead of nothing:
sh: line 10: 14285 Terminated: 15 sh -c 'tail -n +0 -f $logfile | { sed "/EOF/ q" && kill $$ ;}'
I'd also like not to display the matched text (EOF in the above example).
Any suggestions welcomed.
For a file (a log, for example):
sed -u "/pattern/ q" YourFile
For a pipe:
ls -l | sed -u "/pattern/ q"
The -u tells sed to run unbuffered, so it works as a stream filter and quits as soon as the pattern arrives.
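A quick way to see it quit on the first match, using a contrived stream in place of a growing log:
printf 'start\nEOF\nafter\n' | sed -u "/EOF/ q"
This prints start and EOF and then exits; the after line is never consumed.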
It's the shell's job monitoring that pops up the message.
nomonitor() {
set +m
"$#"
set -m
}
nomonitor sh -c 'tail -n +0 -f $logfile | { sed "/EOF/ q" && kill $$; }'
You can actually discard the stderr like this:
sh -c 'tail -n +0 -f $logfile | { sed "/EOF/q" && p=$$ && kill $((p+1)) ; }'
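If you also want to suppress the matched line itself, a sketch that should work with the BSD sed shipped on OSX (which lacks GNU sed's Q command):
tail -n +0 -f "$logfile" | sed -n "/EOF/ q; p"
Here -n suppresses automatic printing, p echoes every non-matching line, and q quits on the match before it is printed; tail then exits on its next write, so keep the kill trick from the question if it must stop immediately.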

[: : bad number on the bash script

This is my bash script:
#!/usr/local/bin/bash -x
touch /usr/local/p
touch /usr/local/rec
DATA_FULL=`date +%Y.%m.%d.%H`
CHECK=`netstat -an | grep ESTAB | egrep '(13001|13002|13003|13004|13061|13099|16001|16002|16003|16004|16061|16099|18001|18002|18003|18004|18061|18099|20001|20002|20003|20004|20061|20099|13000|16000|18000|20000)' | awk '{ print $5 }' | sort -u | wc -l`
netstat -an | grep ESTAB | egrep '(13001|13002|13003|13004|13061|13099|16001|16002|16003|16004|16061|16099|18001|18002|18003|18004|18061|18099|20001|20002|20003|20004|20061|20099|13000|16000|18000|20000)' | awk '{ print $5 }' | sort -u | wc -l > /usr/local/www/p
STAT=`cat /usr/local/www/rec`
if [ "$CHECK" -gt "$STAT" ]; then
echo $CHECK"\n"$DATA_FULL > /usr/local/p
fi
Of course I've run chmod +x script.sh and then sh script.sh, after which I receive the following message: [: : bad number.
Why does this happen?
Run your script using
sh -x script.sh
It'll print every line it executes and the variable output.
Run the netstat pipeline and the STAT assignment outside the script and check what they produce.
If these are definitely integers, use this syntax:
if [ "0$(echo $CHECK|tr -d ' ')" -gt "0$(echo $STAT|tr -d ' ')" ];
A simple hack; it only works if $STAT is always either empty or a positive number.
Are you sure that both STAT and CHECK are numbers that can be compared with -gt?
Probably your /usr/local/www/rec is empty. Try
STAT=`cat /usr/local/www/rec 2>/dev/null || echo 0`
maybe.
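A slightly more defensive version of the tail end of the script, as a sketch, assuming an empty or missing counter file should count as 0 (printf also avoids relying on echo interpreting \n, which bash's echo does not do by default):
STAT=$(cat /usr/local/www/rec 2>/dev/null)
STAT=${STAT:-0}
if [ "$CHECK" -gt "$STAT" ]; then
    printf '%s\n%s\n' "$CHECK" "$DATA_FULL" > /usr/local/p
fi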
