Bash Expression Evaluation Order on Command Line

Background:
I'm working on calling bash command-line expressions in parallel through SGE's job submission program, qsub. While doing so, I was attempting to submit an expression (as an argument) to be run inside another script, like so:
./runArguments.sh grep foo bar.txt > output.txt
runArguments.sh looks like this:
#!/bin/bash
${1} ${2} ${3} etc....to 12
The idea is that I want "grep foo bar.txt > output.txt" to be evaluated in the script...NOT ON THE COMMAND LINE. In the example above, "grep foo bar.txt" is evaluated during runArguments.sh's execution, but the output redirection is evaluated on the command line before the script ever runs. I eventually found a working solution using "eval", shown below, but I do not understand why it works.
Question(s)
1) Why does
./runArguments.sh eval "grep foo bar.txt > output.txt"
allow the eval and the expression to be taken as arguments, but
./runArguments.sh $(grep foo bar.txt > output.txt)
evaluates on the command line before the script is called? (the output of $(grep...) is taken as the arguments instead)
2) Is there a better way of doing this?
Thanks in advance!

Your first question is a bit hard to answer, because you've already answered it yourself. As you've seen, command substitution (the $(...) or `...` notation) substitutes the output of the command, and then processes the result. For example, this:
cat $(echo tmp.sh)
gets converted to this:
cat tmp.sh
So in your case, this:
./runArguments.sh $(grep foo bar.txt > output.txt)
runs grep foo bar.txt > output.txt, grabs its output — which will be nothing, since you've redirected any output to output.txt — and substitutes it, yielding:
./runArguments.sh
(so your script is run with no arguments).
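You can watch the substitution happen with bash's trace mode (a quick sketch; the exact trace formatting varies by version):
$ set -x
$ ./runArguments.sh $(grep foo bar.txt > output.txt)
++ grep foo bar.txt
+ ./runArguments.sh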
By contrast, this:
./runArguments.sh eval "grep foo bar.txt > output.txt"
does not perform any command substitution, so your script is run with two arguments: eval, and grep foo bar.txt > output.txt. This command inside your script:
${1} ${2} ${3} ${4} ${5} ${6} ${7} ${8} ${9} ${10} ${11} ${12}
is therefore equivalent to this:
eval grep foo bar.txt '>' output.txt
which invokes the eval built-in with five arguments: grep, foo, bar.txt, >, and output.txt. The eval built-in assembles its arguments into a command, and runs them, and even translates the > and output.txt arguments into an output-redirection, so the above is equivalent to this:
grep foo bar.txt > output.txt
. . . and you already know what that does. :-)
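You can reproduce that "translation" in isolation (a minimal sketch):
$ args='>'
$ echo hi $args out.txt          # a > that comes from an expansion is just a word
hi > out.txt
$ eval echo hi $args out.txt     # eval re-parses the line, so > becomes a redirection
$ cat out.txt
hi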
As for your second question — no, there's not really a better way to do this. You need to pass the > in as an argument, and that means that you need to use eval ... or bash -c "..." or the like in order to "translate" it back into meaning output-redirection. If you're O.K. with modifying the script, then you might want to change this line:
${1} ${2} ${3} ${4} ${5} ${6} ${7} ${8} ${9} ${10} ${11} ${12}
to this:
eval ${1} ${2} ${3} ${4} ${5} ${6} ${7} ${8} ${9} ${10} ${11} ${12}
so that the caller doesn't need to pass eval as a parameter. Or, actually, you might as well change it to this:
eval ${@}
which will let you pass in more than twelve parameters; or, better yet, this:
eval "${#}"
which will give you slightly more control over word-splitting and fileglobbing and whatnot.
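Putting that together, a minimal sketch of the whole script:
#!/bin/bash
# runArguments.sh -- re-parse and run all of the arguments as one command line
eval "${@}"
which can then be called either way:
./runArguments.sh 'grep foo bar.txt > output.txt'
./runArguments.sh grep foo bar.txt '>' output.txt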

Related

Using xargs to run bash scripts on multiple input lists with arguments

I am trying to run a script on multiple lists of files while also passing arguments in parallel. I have file_list1.dat, file_list2.dat, file_list3.dat. I would like to run script.sh which accepts 3 arguments: arg1, arg2, arg3.
For one run, I would do:
sh script.sh file_list1.dat $arg1 $arg2 $arg3
I would like to run this command in parallel for all the file lists.
My attempt:
Ncores=4
ls file_list*.dat | xargs -P "Ncores" -n 1 [sh script.sh [$arg1 $arg2 $arg3]]
This results in the error: invalid number for -P option. I think the order of this command is wrong.
My 2nd attempt:
echo $arg1 $arg2 $arg3 | xargs ls file_list*.dat | xargs -P "$Ncores" -n 1 sh script.sh
But this results in the error: xargs: ls: terminated by signal 13
Any ideas on what the proper syntax is for passing arguments to a bash script with xargs?
I'm not sure I understand exactly what you want to do. Is it to execute something like these commands, but in parallel?
sh script.sh $arg1 $arg2 $arg3 file_list1.dat
sh script.sh $arg1 $arg2 $arg3 file_list2.dat
sh script.sh $arg1 $arg2 $arg3 file_list3.dat
...etc
If that's right, this should work:
Ncores=4
printf '%s\0' file_list*.dat | xargs -0 -P "$Ncores" -n 1 sh script.sh "$arg1" "$arg2" "$arg3"
The two major problems in your version were that you were passing "Ncores" as a literal string (rather than using "$Ncores" to get the value of the variable), and that you had [ ] around the command and arguments (which isn't meaningful shell syntax here). I also added double-quotes around all variable references (a generally good practice), and used printf '%s\0' (and xargs -0) instead of ls.
Why did I use printf instead of ls? Because ls isn't doing anything useful here that printf or echo or whatever couldn't do as well. You may think of ls as the tool for getting lists of filenames, but in this case the wildcard expression file_list*.dat gets expanded to a list of files before the command is run; all ls would do with them is look at each one, say "yep, that's a file" to itself, then print it. echo could do the same thing with less overhead. But with either ls or echo the output can be ambiguous if any filenames contain spaces, quotes, or other funny characters. Some versions of ls attempt to "fix" this by adding quotes or something around filenames with funny characters, but that might or might not match how xargs parses its input (if it happens at all).
But printf '%s\0' is unambiguous and predictable -- it prints each string (filename in this case) followed by a NUL character, and that's exactly what xargs -0 takes as input, so there's no opportunity for confusion or misparsing.
Well, ok, there is one edge case: if there aren't any matching files, the wildcard pattern will just get passed through literally, and it'll wind up trying to run the script with the unexpanded string "file_list*.dat" as an argument. If you want to avoid this, use shopt -s nullglob before this command (and shopt -u nullglob afterward, to get back to normal mode).
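Putting the glob guard and the xargs invocation together, a sketch (the explicit array check avoids running the script once with an empty argument when nothing matches):
shopt -s nullglob
files=(file_list*.dat)
shopt -u nullglob
if [ "${#files[@]}" -gt 0 ]; then
    printf '%s\0' "${files[@]}" | xargs -0 -P "$Ncores" -n 1 sh script.sh "$arg1" "$arg2" "$arg3"
fi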
Oh, and one more thing: sh script.sh isn't the best way to run scripts. Give the script a proper shebang line at the beginning (#!/bin/sh if it uses only basic shell features, #!/bin/bash or #!/usr/bin/env bash if it uses any bashisms), and run it with ./script.sh.

Why does cat exit a shell script, but only when it's fed by a pipe?

Why does cat exit a shell script, but only when it's fed by a pipe?
Case in point, take this shell script called "foobar.sh":
#! /bin/sh
echo $#
echo $@
cat $1
sed -e 's|foo|bar|g' $1
And a text file called "foo.txt" which contains only one line:
foo
Now if I type ./foobar.sh foo.txt on the command line, then I'll get this expected output:
1
foo.txt
foo
bar
However if I type cat foo.txt | ./foobar.sh then surprisingly I only get this output:
0
foo
I don't understand. If the number of arguments reported by $# is zero, then how can cat $1 still return foo? And, that being the case, why doesn't sed -e 's|foo|bar|g' $1 return anything since clearly $1 is foo?
This seems an awful lot like a bug, but I'm assuming it's magic instead. Please explain!
UPDATE
Based on the given answer, the following script gives the expected output, assuming a one-line foo.txt:
#! /bin/sh
if [ $# -gt 0 ]
then
yay=$(cat $1)
else
read yay
fi
echo $yay | cat
echo $yay | sed -e 's|foo|bar|g'
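For example (assuming the one-line foo.txt above, both invocations now print the same thing):
$ ./foobar.sh foo.txt
foo
bar
$ cat foo.txt | ./foobar.sh
foo
bar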
No, $1 is not "foo". $1 is empty, i.e., undefined/nothing.
Unlike a programming language, variables in the shell are quite dumbly and literally replaced, and the resulting commands textually executed (well, sorta kinda). In this case, "cat $1" becomes just "cat ", which will take input from stdin. That's terribly convenient for your execution since you've kindly provided "foo" on stdin via your pipe!
See what's happening?
sed likewise will read from stdin, but stdin is already at end of stream, so it exits.
When you don't give an argument to cat, it reads from stdin. When $1 isn't set, cat $1 is the same as a plain cat, which reads the text you piped in (cat foo.txt).
Then the sed command runs, and same as cat, it reads from stdin because it has no filename argument. cat has already consumed all of stdin. There's nothing left to read, so sed quits without printing anything.
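You can see this behavior in miniature, without the script (a quick sketch):
$ printf 'foo\n' | { cat; sed 's/foo/bar/'; }
foo
cat prints the line and consumes all of stdin; sed then finds stdin already at end-of-file and prints nothing.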

Difference between eval and backticks (reverse apostrophe)

Can anyone tell me what the big difference here is and why the latter doesn't work?
test="ls -l"
Both now work fine:
eval $test
echo `$test`
But in this case:
test="ls -l >> test.log"
eval $test
echo `$test`
The latter will not work. Why is that? I know that eval just executes a script, while the backticks execute it and return the result as a string. What makes it impossible to use >> or similar stuff inside the command to execute? Is there maybe a way to make it work with backticks, and I'm just doing something wrong?
When you're using backticks to execute your command, the command being sent to the shell is:
ls -l '>>' test.log
which makes both >> and test.log arguments to ls (note the quotes around >>).
While using eval, the command being executed is:
ls -l >> test.log
(Execute your script by saying bash -vx scriptname to see what's happening.)
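For example, tracing an interactive session shows the quoting (a sketch; the exact trace and error text vary by bash and ls version):
$ test="ls -l >> test.log"
$ set -x
$ echo `$test`
++ ls -l '>>' test.log
ls: cannot access '>>': No such file or directory
ls is being asked to list a file literally named >>, which is why nothing ever gets appended to test.log.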
eval evaluates its arguments as a shell command, i.e.
test="ls -l >> test.log"
eval $test
is executed in the terminal the same way as
ls -l >> test.log
whereas echo is for display purposes only.

All arguments into files with correct quoting using "$@"

I need my bash script to cat all of its parameters into a file. I tried to use cat for this because I need to add a lot of lines:
#!/bin/sh
cat > /tmp/output << EOF
I was called with the following parameters:
"$#"
or
$@
EOF
cat /tmp/output
Which leads to the following output
$ ./test.sh "dsggdssgd" "dsggdssgd dgdsdsg"
I was called with the following parameters:
"dsggdssgd dsggdssgd dgdsdsg"
or
dsggdssgd dsggdssgd dgdsdsg
I want neither of these two things: I need the exact quoting which was used on the command line. How can I achieve this? I always thought $@ does everything right in regards to quoting.
Well, you are right that "$@" has the args including the whitespace in each arg. However, since the shell performs quote removal before executing a command, you can never know how exactly the args were quoted (e.g. whether with single or double quotes, or backslashes, or any combination thereof--but you shouldn't need to know, since all you should care about are the argument values).
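A quick way to convince yourself of that (a sketch, using a throwaway script called show.sh here):
$ cat show.sh
#!/bin/sh
printf '<%s>\n' "$@"
$ ./show.sh 'a b' "a b" a\ b
<a b>
<a b>
<a b>
Three different quoting styles, three identical arguments -- the quotes are gone before the script ever sees them.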
Placing "$#" in a here-document is pointless because you lose the information about where each arg starts and ends (they're joined with a space inbetween). Here's a way to see just this:
$ cat test.sh
#!/bin/sh
printf 'I was called with the following parameters:\n'
printf '"%s"\n' "$#"
$ ./test.sh "dsggdssgd" "dsggdssgd dgdsdsg"
I was called with the following parameters:
"dsggdssgd"
"dsggdssgd dgdsdsg"
Try:
#!/bin/bash
for x in "$@"; do echo -ne "\"$x\" "; done; echo
To see what's interpreted by Bash, use:
bash -x ./script.sh
or add this to the beginning of your script:
set -x
You might want to add this to the parent script.

printf, ignoring excess arguments?

I noticed today Bash printf has a -v option
-v var assign the output to shell variable VAR rather than
display it on the standard output
If I invoke it like this, it works:
$ printf -v var "Hello world"
$ printf "$var"
Hello world
Coming from a pipe, it does not work:
$ grep "Hello world" test.txt | xargs printf -v var
-vprintf: warning: ignoring excess arguments, starting with `var'
$ grep "Hello world" test.txt | xargs printf -v var "%s"
-vprintf: warning: ignoring excess arguments, starting with `var'
xargs will invoke /usr/bin/printf (or wherever that binary is installed on your system). It will not invoke bash's builtin function. And only a builtin (or sourcing a script or similar) can modify the shell's environment.
Even if it could call bash's builtin, the xargs in your example runs in a subshell. The subshell cannot modify its parent's environment anyway. So what you're trying cannot work.
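A minimal illustration of the subshell problem (assuming bash's default behavior, i.e. without shopt -s lastpipe):
$ echo hello | { read var; }
$ echo "${var-unset}"
unset
read runs in a subshell because it is on the receiving end of the pipe, so var never reaches the parent shell.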
A few options I see if I understand your sample correctly; sample data:
$ cat input
abc other stuff
def ignored
cba more stuff
Simple variable (a bit tricky depending on what exactly you want):
$ var=$(grep a input)
$ echo $var
abc other stuff cba more stuff
$ echo "$var"
abc other stuff
cba more stuff
With an array if you want individual words in the arrays:
$ var=($(grep a input))
$ echo "${var[0]}"-"${var[1]}"
abc-other
Or if you want the whole lines in each array element:
$ IFS=$'\n' var=($(grep a input)) ; unset IFS
$ echo "${var[0]}"-"${var[1]}"
abc other stuff-cba more stuff
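On bash 4 or newer, mapfile (also spelled readarray) gets the same whole-line-per-element result without the IFS juggling -- a different tool than the above, but worth mentioning:
$ mapfile -t var < <(grep a input)
$ echo "${var[0]}"-"${var[1]}"
abc other stuff-cba more stuff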
There are two printf's - one is a shell builtin, which is invoked if you just run printf, and the other is a regular binary, usually /usr/bin/printf. The latter doesn't take a -v argument, hence the error message. Since printf is an argument to xargs here, the binary is run, not the shell builtin. Additionally, since it's at the receiving end of a pipeline, it is run as a subprocess. Variables can only be inherited from parent to child process, not the other way around, so even if the printf binary could modify the environment, the change wouldn't be visible to the parent process. So there are two reasons why your command cannot work. But you can always do var=$(something | bash -c 'some operation using builtin printf').
Mat gives an excellent explanation of what's going on and why.
If you want to iterate over the output of a command and set a variable to successive values using Bash's sprintf-style printf feature (-v), you can do it like this:
grep "Hello world" test.txt | xargs bash -c 'printf -v var "%-25s" "$#"; do_something_with_formatted "$var"' _ {} \;
