Bash escaping issue with $@ - bash

I've written a script to simplify running a long launch command:
# in ~/.bash_profile
function runProgram() { sbt "run-main com.longpackagename.mainclass $@ arg3"; };
export -f runProgram;
However, it fails when I try to pass multiple arguments:
$ runProgram arg1 arg2
...
[info] Running com.longpackagename.mainclass arg1
What happened to arg2 and arg3? Were they eaten by bash or by sbt?
The script works as expected if I run it like this:
$ runProgram "arg1 arg2"
--
Additionally: this type of issue happens all the time for me. I would also appreciate a reference on how to escape properly in bash. The first & second resources that I tried didn't address this situation.

The best reference for bash, including how quoting works, is the bash manual itself, which is almost certainly installed on your machine where you can read it without an internet connection by typing man bash. It's a lot to read, but there's no real substitute.
Nonetheless, I will try to explain this particular issue. There are two important things to know: first, how (and when) bash splits a command line into separate "words" (or command line arguments); second, what $@ and $* mean. These are not entirely unrelated.
Word-splitting is partially controlled by the special parameter IFS, but I just mention that; I'm assuming it hasn't been altered. For more details, see man bash.
Below, I call quoting a string with double-quotes ("...") weak quoting, and quoting with apostrophes ('...') strong quoting. The backslash (\) is also a form of strong quoting.
Word-splitting happens:
after parameters (shell variables) have been substituted with their values,
wherever there is a sequence of whitespace characters,
except if the whitespace is quoted in any way, (" ", ' ', \ are three ways),
before quotes are removed.
Once a command has been split into words, the first word is used to find the program or function to invoke, and the remaining words become the program's arguments. (I'm ignoring lots of stuff like shell metacharacters, redirections, pipes, etc., etc. For more details, see man bash.)
Parameters are substituted with their values (step 1) if their name is preceded by a $, unless the $name is strongly quoted (that is, '$name' or, for example, \$name). There are many more forms of parameter substitution. For more details, see man bash.
Now, $@ and $* both mean "all of the positional parameters to the current command/function", and if they are used without quotes, they do precisely the same thing. They are replaced by all of the positional parameters, with a single space between each parameter. Since this is a type of parameter substitution (as above), word-splitting happens after the substitution except if the substitution is in quotes, as in the above list.
If the substitution is in quotes, then according to the above rules, the whitespace which was inserted between the parameters is not subject to word-splitting. And that's precisely how $* works. $* is replaced by the space-separated command-line parameters and the result is word-split; "$*" is replaced by the space-separated command-line parameters as a single word.
"$@" is an exception. And, in fact, this is why $@ exists at all. If the $@ is inside weak quotes ("$@"), then the quotes are removed, and each positional parameter is individually quoted. These quoted positional parameters are then space-separated and substituted for the $@. Since the $@ is no longer quoted itself, the inserted spaces do cause word-splitting. The final result is that the individual parameters are retained as individual words.
In case that was not totally clear, here's an example. printf has the virtue of repeating the provided format until it runs out of parameters, which makes it easy to see what's going on.
showargs() {
echo -n '$*: '; printf "<%s> " $*; echo
echo -n '"$*": '; printf "<%s> " "$*"; echo
echo -n '"$@": '; printf "<%s> " "$@"; echo
}
showargs one two three
showargs "one two" three
(Try to figure out what that prints before you execute it.)
It's often said that you almost always want "$@" and almost never unquoted $@ or $*. That's generally true, but it's also the case that you almost never want "something with $@ inside of it". To understand that, you need to know what "something with $@ inside of it" does. It's a bit weird, but it shouldn't be unexpected. We'll take the invocation of sbt from the OP as an example:
sbt "run-main com.longpackagename.mainclass $@ arg3"
with two positional parameters supplied to the function, so that $1 is arg1 and $2 is arg2.
First, bash removes the quotes around $@. However, it can't just remove them altogether, since there is also quoted text there. So it has to close off the quoted text and reopen the quotes afterwards, producing:
sbt "run-main com.longpackagename.mainclass "$@" arg3"
Now, it can substitute in the quoted, space-separated arguments:
sbt "run-main com.longpackagename.mainclass ""arg1" "arg2"" arg3"
This is now word-split:
sbt
"run-main com.longpackagename.mainclass ""arg1"
"arg2"" arg3"
and the quotes are removed:
sbt
run-main com.longpackagename.mainclass arg1
arg2 arg3
sbt is expecting only one positional parameter. You gave it two, and it ignored the second one.
Now, suppose the function were called with a single argument, "arg1 arg2". In that case, the substitution of $@ results in:
sbt "run-main com.longpackagename.mainclass ""arg1 arg2"" arg3"
and word-splitting produces
sbt
"run-main com.longpackagename.mainclass ""arg1 arg2"" arg3"
without quotes:
sbt
run-main com.longpackagename.mainclass arg1 arg2 arg3
and there is only one positional parameter for sbt.
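Following that walkthrough, one possible fix is to use $* rather than $@ inside the quoted string, so that everything stays in the single word sbt expects. The sketch below is only an illustration: show_words is a hypothetical stand-in for sbt that prints each word it receives, and with_at and with_star are made-up names for the two variants of the function.

```bash
# show_words is a hypothetical stand-in for sbt: it prints every
# argument it receives on its own line, wrapped in angle brackets.
show_words() { printf '<%s>\n' "$@"; }

# "$@" inside a longer quoted string: one word per positional parameter.
with_at()   { show_words "run-main com.longpackagename.mainclass $@ arg3"; }

# "$*" inside a longer quoted string: always exactly one word.
with_star() { show_words "run-main com.longpackagename.mainclass $* arg3"; }

with_at arg1 arg2
with_star arg1 arg2
```

With two arguments, with_at hands show_words two words (the second being "arg2 arg3"), while with_star hands it one word containing the whole command line.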

Related

how to pass args to bash functions [duplicate]

Let's say I have a function abc() that will handle the logic related to analyzing the arguments passed to my script.
How can I pass all arguments my Bash script has received to abc()? The number of arguments is variable, so I can't just hard-code the arguments passed like this:
abc $1 $2 $3 $4
Better yet, is there any way for my function to have access to the script arguments' variables?
The $@ variable expands to all command-line parameters separated by spaces. Here is an example.
abc "$@"
When using $@, you should (almost) always put it in double-quotes to avoid misparsing of arguments containing spaces or wildcards (see below). This works for multiple arguments. It is also portable to all POSIX-compliant shells.
It is also worth noting that $0 (generally the script's name or path) is not in $@.
The Bash Reference Manual Special Parameters Section says that $@ expands to the positional parameters starting from one. When the expansion occurs within double quotes, each parameter expands to a separate word. That is, "$@" is equivalent to "$1" "$2" "$3"....
Passing some arguments:
If you want to pass all but the first arguments, you can first use shift to "consume" the first argument and then pass "$@" to pass the remaining arguments to another command. In Bash (and zsh and ksh, but not in plain POSIX shells like dash), you can do this without messing with the argument list using a variant of array slicing: "${@:3}" will get you the arguments starting with "$3". "${@:3:4}" will get you up to four arguments starting at "$3" (i.e. "$3" "$4" "$5" "$6"), if that many arguments were passed.
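As a rough side-by-side of the two approaches just described, here is a small sketch. The helper show and both function names are hypothetical, and the slicing form assumes Bash (or ksh/zsh).

```bash
# show is a hypothetical helper: prints each argument in angle brackets.
show() { printf '<%s> ' "$@"; echo; }

# POSIX way: drop the first argument with shift, then pass the rest.
rest_with_shift() { shift; show "$@"; }

# Bash/ksh/zsh way: slice the positional parameters, leaving them intact.
rest_with_slice() { show "${@:2}"; }

rest_with_shift first "second arg" third
rest_with_slice first "second arg" third
```

Both calls pass "second arg" and third along as two separate arguments; the difference is only whether the function's own argument list is modified.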
Things you probably don't want to do:
"$*" gives all of the arguments stuck together into a single string (separated by spaces, or whatever the first character of $IFS is). This loses the distinction between spaces within arguments and the spaces between arguments, so is generally a bad idea. Although it might be ok for printing the arguments, e.g. echo "$*", provided you don't care about preserving the space within/between distinction.
Assigning the arguments to a regular variable (as in args="$@") mashes all the arguments together like "$*" does. If you want to store the arguments in a variable, use an array with args=("$@") (the parentheses make it an array), and then reference them as e.g. "${args[0]}" etc. Note that in Bash and ksh, array indexes start at 0, so $1 will be in args[0], etc. zsh, on the other hand, starts array indexes at 1, so $1 will be in args[1]. And more basic shells like dash don't have arrays at all.
Leaving off the double-quotes, with either $@ or $*, will try to split each argument up into separate words (based on whitespace or whatever's in $IFS), and also try to expand anything that looks like a filename wildcard into a list of matching filenames. This can have really weird effects, and should almost always be avoided. (Except in zsh, where this expansion doesn't take place by default.)
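To make the last point concrete, here is a small sketch of what unquoted $@ does to an argument containing a space and to one that looks like a wildcard. The helper show, the two function names, and the scratch directory with a.txt and b.txt are all invented for the illustration; it assumes Bash with the default IFS.

```bash
# show is a hypothetical helper: prints each argument in angle brackets.
show() { printf '<%s> ' "$@"; echo; }

pass_unquoted() { show $@; }     # subject to word splitting and globbing
pass_quoted()   { show "$@"; }   # arguments arrive untouched

# Work in a scratch directory so the wildcard has something to match.
scratch=$(mktemp -d)
touch "$scratch/a.txt" "$scratch/b.txt"
(
    cd "$scratch" || exit 1
    pass_unquoted 'my file' '*.txt'
    pass_quoted   'my file' '*.txt'
)
rm -r "$scratch"
```

The unquoted version ends up passing four arguments (my, file, a.txt, b.txt), while the quoted version passes the original two.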
I needed a variation on this, which I expect will be useful to others:
function diffs() {
diff "${@:3}" <(sort "$1") <(sort "$2")
}
The "${@:3}" part means all the members of the array starting at 3. So this function implements a sorted diff by passing the first two arguments to diff through sort and then passing all other arguments to diff, so you can call it similarly to diff:
diffs file1 file2 [other diff args, e.g. -y]
Use the $@ variable, which expands to all command-line parameters separated by spaces.
abc "$@"
Here's a simple script:
#!/bin/bash
args=("$@")
echo Number of arguments: $#
echo 1st argument: ${args[0]}
echo 2nd argument: ${args[1]}
$# is the number of arguments received by the script. I find it easier to access them using an array: the args=("$@") line puts all the arguments in the args array. To access them, use ${args[index]}.
It's worth mentioning that you can specify argument ranges with this syntax.
function example() {
echo "line1 ${@:1:1}"; #First argument
echo "line2 ${@:2:1}"; #Second argument
echo "line3 ${@:3}"; #Third argument onwards
}
I hadn't seen it mentioned.
abc "$@" is generally the correct answer.
But I was trying to pass a parameter through to an su command, and no amount of quoting could stop the error su: unrecognized option '--myoption'. What actually worked for me was passing all the arguments as a single string :
abc "$*"
My exact case (I'm sure someone else needs this) was in my .bashrc
# run all aws commands as Jenkins user
aws ()
{
sudo su jenkins -c "aws $*"
}
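One caveat with the "$*" approach is that the inner shell parses the string again, so an argument containing spaces would be split there. A possible refinement, sketched below under the assumption that the inner shell is also bash, is to quote each argument with printf's %q before building the string. Here run_in_subshell is a hypothetical stand-in for sudo su jenkins -c, and the inner printf stands in for the real command.

```bash
# run_in_subshell is a hypothetical stand-in for `sudo su jenkins -c`:
# like su -c, it hands a single command string to a new shell.
run_in_subshell() { bash -c "$1"; }

quoted_passthrough() {
    local cmd
    # %q quotes each argument so that bash reads it back as one word.
    printf -v cmd '%q ' "$@"
    # The inner printf stands in for the real command (aws in the case above).
    run_in_subshell "printf '<%s>\n' $cmd"
}

quoted_passthrough --myoption "two words"
```

In this sketch the inner shell sees --myoption and "two words" as two arguments, which plain "$*" would not preserve.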
abc "$@"
$@ represents all the parameters given to your bash script.

Getting quoted-dollar-at ( "$@" ) behaviour for other variable expansion?

The shell has a great feature, where it'll preserve argument quoting across variable expansion when you use "$@", such that the script:
for f in "$@"; do echo "$f"; done
when invoked with arguments:
"with spaces" '$and $(metachars)'
will print, literally:
with spaces
$and $(metachars)
This isn't the normal behaviour of expansion of a quoted string; it seems to be a special case for "$@".
Is there any way to get this behaviour for other variables? In the specific case I'm interested in, I want to safely expand $SSH_ORIGINAL_COMMAND in a command= specifier in a restricted public key entry, without having to worry about spaces in arguments, metacharacters, etc.
"$SSH_ORIGINAL_COMMAND" expands like "$*" would, i.e. a naïve expansion that doesn't add any quoting around separate arguments.
Is the information required for "$@" style expansion simply not available to the shell in this case, by the time it gets the env var SSH_ORIGINAL_COMMAND? So I'd instead need to convince sshd to quote the arguments?
The answer to this question is making me wonder if it's possible at all.
You can get similar "quoted dollar-at" behavior for arbitrary arrays using "${YOUR_ARRAY_HERE[@]}" syntax for bash arrays. Of course, that's no complete answer, because you still have to break the string into multiple array elements according to the quotes.
One thought was to use bash -x, which renders expanded output, but only if you actually run the command; it doesn't work with -n, which prevents you from actually executing the commands in question. Likewise you could use eval or bash -c along with set -- to manage the quote removal, performing expansion on the outer shell and quote removal on the inner shell, but that would be extremely hard to bulletproof against executing arbitrary code.
As an end run, use xargs instead. xargs handles single and double quotes. This is a very imperfect solution, because xargs treats backslash-escaped characters very differently than bash does and fails entirely to handle semicolons and so forth, but if your input is relatively predictable it gets you most of the way there without forcing you to write a full shell parser.
SSH_ORIGINAL_COMMAND='foo "bar baz" $quux'
# Build out the parsed array.
# Bash 4 users may be able to do this with readarray or mapfile instead.
# You may also choose to null-terminate if newlines matter.
COMMAND_ARRAY=()
while read line; do
COMMAND_ARRAY+=("$line")
done < <(xargs -n 1 <<< "$SSH_ORIGINAL_COMMAND")
# Demonstrate working with the array.
N=0
for arg in "${COMMAND_ARRAY[@]}"; do
echo "COMMAND_ARRAY[$N]: $arg"
((N++))
done
Output:
COMMAND_ARRAY[0]: foo
COMMAND_ARRAY[1]: bar baz
COMMAND_ARRAY[2]: $quux

What Does $* Do

I am trying to modify a shell script that isn't very well documented. I know the basics, but this snippet is confusing.
I am not sure what this line does:
time java -showversion -jar ${here_dir}/AESampleTool.jar -f $FILES -d ${output_dir} $*
I don't know what $* is. Google doesn't have much. Does the above line set $* equal to what is before it? On the next line of the script, $* is passed as a parameter to a function called launch:
launch $* 1>$log_file 2>&1
Below is the function. The weird part is that it seems to be a circular reference: the line inside the function appears to be what sets $*, but then $* is passed as a parameter to the function itself.
function launch {
hset -x
USER=$AEX_USER
l_output_dir=$output_dir
l_here_dir=$here_dir
l_LOGFILE=$LOGFILE
l_FILES=$FILES
l_EXE_JAR=$EXE_SH
l_AEX_LOGDIR=$AEX_LOGDIR
l_AEX_LOGNAME=$AEX_LOGNAME
time java -showversion -jar ${here_dir}/AESampleTool.jar -f $FILES -d ${output_dir} $*
rc=$?
}
$* means all arguments, but it's the wrong way to pass them on because it will either split them on whitespace if unquoted (like here) or combine them into a single argument if quoted. It's better to use "$@", which will pass them along in a whitespace-safe manner.
$* and $@ both refer to all the arguments to the shell, but they do different things.
When unquoted, $* and $@ do the same thing. They treat each word (sequence of non-whitespace) as a separate argument.
When quoted they are quite different. "$*" treats the argument list as a single space-separated string, whereas "$@" treats the arguments almost exactly as they were when specified on the command line.
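A quick way to see these four cases is to count how many arguments each form passes on. This is only a sketch; count and demo are hypothetical names, and it assumes the default IFS.

```bash
# count is a hypothetical helper: prints how many arguments it received.
count() { echo "$#"; }

demo() {
    count $*      # unquoted: split on whitespace
    count $@      # unquoted: same splitting
    count "$*"    # quoted: one joined string
    count "$@"    # quoted: original arguments preserved
}

demo "one two" three
```

Called with the two arguments "one two" and three, the four forms pass along 3, 3, 1 and 2 arguments respectively.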

Bash arguments literal interpretation

I have a simple bash script to run a remote command on a given set of servers.
#!/bin/bash
echo "Command to be run:"
echo "$*"
read nothing
servers="server1 server2 server3"
for server in `echo $servers`
do
echo $server
ssh $server "$*"
echo ""
done
The problem is that the command could contain any number of arguments, hence the use of $* and could also have many different characters including quotes and regular expressions. The basic need here is for the shell to take the arguments, whatever they are, literally so they are passed to the remote server intact without removing quotes or interpreting parenthesis etc.
There are a number of variations I have seen but most deal with a specific character problem or overcomplicate the script or arguments required, and I'm looking to keep at least the arguments free of escape characters etc.
An example using "$@":
./cmd tw_query --no-headings "search Host where name matches '(?i)peter' show summary, nodecount(traverse :::Detail where name matches 'bob')"
Gives:
Command to be run:
tw_query --no-headings search Host where name matches '(?i)peter' show summary, nodecount(traverse :::Detail where name matches 'bob')
You seem to be looking for $@. Say:
ssh $server "$@"
instead. From the manual:
*
Expands to the positional parameters, starting from one. When the expansion occurs within double quotes, it expands to a single word
with the value of each parameter separated by the first character of
the IFS special variable. That is, "$*" is equivalent to "$1c$2c…",
where c is the first character of the value of the IFS variable. If
IFS is unset, the parameters are separated by spaces. If IFS is null,
the parameters are joined without intervening separators.
@
Expands to the positional parameters, starting from one. When the expansion occurs within double quotes, each parameter expands to a
separate word. That is, "$@" is equivalent to "$1" "$2" …. If the
double-quoted expansion occurs within a word, the expansion of the
first parameter is joined with the beginning part of the original
word, and the expansion of the last parameter is joined with the last
part of the original word. When there are no positional parameters,
"$@" and $@ expand to nothing (i.e., they are removed).
You actually don't want the arguments passed to the remote server intact, you want them passed to the remote command intact. But that means they need to be wrapped in an extra layer of quotes/escapes/etc. so that they will come out intact after the remote shell has parsed them.
bash actually has a feature in its printf builtin to add quoting/escaping to a string, but it quotes suitably for interpretation by bash itself -- if the remote shell were something else, it might not understand the quoting mode that it chooses. So in this case I'd recommend a simple-and-dumb quoting style: just add single-quotes around each argument, and replace each single-quote within the argument with '\'' (that'll end the current quoted string, add an escaped (literal) quote, then start another quoted string). It'll look a bit weird, but should decode properly under any POSIX-compliant shell.
Converting to this format is a bit tricky, since bash does inconsistent things with quotes in its search-and-replace patterns. Here's what I came up with:
#!/bin/bash
quotedquote="'\''"
printf -v quotedcommand "'%s' " "${@//\'/$quotedquote}"
echo "Command to be run:"
echo "$quotedcommand"
read nothing
servers="server1 server2 server3"
for server in $servers
do
echo $server
ssh $server "$quotedcommand"
echo ""
done
And here's how it quotes your example command:
'tw_query' '--no-headings' 'search Host where name matches '\''(?i)peter'\'' show summary, nodecount(traverse :::Detail where name matches '\''bob'\'')'
It looks strange to have the command itself quoted, but as long as you aren't trying to use an alias this doesn't cause any actual trouble. There is one significant limitation, though: there's no way to pass shell metacharacters (like > for output redirection) to the remote shell:
./cmd somecommand >outfile # redirect is done on local computer
./cmd somecommand '>outfile' # ">outfile" is passed to somecommand as an argument
If you need to do things like remote redirects, things get a good deal more complicated.
Besides the issue with $* versus $@, if this is for use in a production environment, you might want to consider a tool such as pdsh.
Otherwise, you can try feeding the commands to your script through stdin rather than putting them in argument so you avoid one level of parsing.
#!/bin/bash
read cmd
echo "Command to be run:"
echo $cmd
servers="server1 server2 server3"
for server in $servers
do
echo $server
ssh $server "$cmd"
echo ""
done
and use it like this
$ ./cmd <<'EOT'
> tw_query --no-headings "search Host where name matches '(?i)peter' show summary, nodecount(traverse :::Detail where name matches 'bob')"
> EOT
Command to be run:
tw_query --no-headings "search Host where name matches '(?i)peter' show summary, nodecount(traverse :::Detail where name matches 'bob')"
Maybe a little far-fetched, but it could work.

Failure of Bash $@ variable

I've done this several times, and it never seems to work properly. Can anyone explain why?
function Foobar
{
cmd -opt1 -opt2 $@
}
What this is supposed to do is make it so that calling Foobar does the same thing as calling cmd, but with a few extra parameters (-opt1 and -opt2, in this example).
Unfortunately, this doesn't work properly. It works OK if all your arguments lack spaces. But if you want an argument with spaces, you write it in quotes, and Bash helpfully strips away the quotes, breaking the command. How do I prevent this incorrect behavior?
You need to double-quote the $@ to keep bash from performing the unwanted parsing steps (word splitting etc.) after substituting the argument values:
function Foobar
{
cmd -opt1 -opt2 "$@"
}
EDIT from the Special Parameters section of the bash manpage:
@ Expands to the positional parameters, starting from one. When
the expansion occurs within double quotes, each parameter
expands to a separate word. That is, "$@" is equivalent to "$1"
"$2" ... If the double-quoted expansion occurs within a word,
the expansion of the first parameter is joined with the beginning
part of the original word, and the expansion of the last
parameter is joined with the last part of the original word.
When there are no positional parameters, "$@" and $@ expand to
nothing (i.e., they are removed).
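The "expansion within a word" rule in that excerpt can be seen with a short sketch, shown next to the IFS-based joining of "$*" for contrast. Both function names are hypothetical, and local assumes Bash.

```bash
# Both function names are hypothetical, chosen for this illustration.
join_with_comma() {
    local IFS=,          # "$*" joins with the first character of IFS
    echo "$*"
}

wrap_words() {
    # "$@" inside a word: the prefix sticks to the first parameter,
    # the suffix to the last, and the rest stay separate words.
    printf '<%s>\n' "pre-$@-post"
}

join_with_comma a "b c" d
wrap_words a b c
```

The first call prints a single comma-joined string; the second prints three words, with pre- attached to the first and -post attached to the last.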

Resources