I'm trying to understand why Bash removes double quotes (but not single quotes) when doing variable expansion with ${parameter:+word} (Use Alternate Value), in a here-document, for example:
% var=1
% cat <<EOF
> ${var:+"Hi there"}
> ${var:+'Bye'}
> EOF
Hi there
'Bye'
According to the manual, the "word" after :+ is processed with tilde expansion, parameter expansion, command substitution, and arithmetic expansion. None of these should do anything.
What am I missing? How can I get double quotes in the expansion?
tl;dr
$ var=1; cat <<EOF
"${var:+Hi there}"
${var:+$(printf %s '"Hi there"')}
EOF
"Hi there"
"Hi there"
The above demonstrates two pragmatic workarounds to include double quotes in the alternative value.
The embedded $(...) approach is more cumbersome, but more flexible: it allows inclusion of embedded double quotes and also gives you control over whether the value should be expanded or not.
Jens' helpful answer and Patryk Obara's helpful answer both shed light on and further demonstrate the problem.
Note that the problematic behavior equally applies to:
(as noted in the other answers) regular double-quoted strings (e.g., echo "${var:+"Hi there"}"; for the 1st workaround, you'd have to \-quote surrounding " instances; e.g., echo "\"${var:+Hi there}\""; curiously, as Gunstick points out in a comment on the question, using \" in the alternative value to produce " in the output does work in double-quoted strings - e.g., echo "${var:+\"Hi th\"ere\"}" - unlike in unquoted here-docs.)
related expansions ${var+...}, ${var-...} / ${var:-...}, and ${var=...} / ${var:=...}
Also, there's a related oddity with respect to \-handling inside double-quoted alternative values inside a double-quoted string / unquoted here-doc: bash and ksh unexpectedly remove embedded \ instances; e.g.,
echo "${nosuch:-"a\b"}" unexpectedly yields ab, even though echo "a\b" in isolation yields a\b - see this question.
I have no explanation for the behavior[1]
, but I can offer pragmatic solutions that work with all major POSIX-compatible shells (dash, bash, ksh, zsh):
Note that " instances are never needed for syntactic reasons inside the alternative value: The alternative value is implicitly treated like a double-quoted string: no tilde expansion, no word-splitting, and no globbing take place, but parameter expansions, arithmetic expansions and command substitutions are performed.
Note that in parameter expansions involving substitution or prefix/suffix-removal, quotes do have syntactic meaning; e.g.: echo "${BASH#*"bin"}" or echo "${BASH#*'bin'}" - curiously, dash doesn't support single quotes, though.
If you want to surround the entire alternative value with quotes, and it has no embedded quotes and you want it expanded,
quote the entire expansion, which bypasses the problem of " removal from the alternative value:
# Double quotes
$ var=1; cat <<EOF
"${var:+The closest * is far from $HOME}"
EOF
"The closest * is far from /Users/jdoe"
# Single quotes - but note that the alternative value is STILL EXPANDED,
# because of the overall context of the unquoted here-doc.
var=1; cat <<EOF
'${var:+The closest * is far from $HOME}'
EOF
'The closest * is far from /Users/jdoe'
For embedded quotes, or to prevent expansion of the alternative value,
use an embedded command substitution (unquoted, although it'll behave as if it were quoted):
# Expanded value with embedded quotes.
var=1; cat <<EOF
${var:+$(printf %s "We got 3\" of snow at $HOME")}
EOF
We got 3" of snow at /Users/jdoe
# Literal value with embedded quotes.
var=1; cat <<EOF
${var:+$(printf %s 'We got 3" of snow at $HOME')}
EOF
We got 3" of snow at $HOME
These two approaches can be combined as needed.
[1]
In effect, the alternative value:
behaves like an implicitly double-quoted string,
' instances, as in regular double-quoted strings, are treated as literals.
Given the above,
it would make sense to treat embedded " instances as literals too, and simply pass them through, just like the ' instances.
Instead, sadly, they are removed, and if you try to escape them as \", the \ is retained too (inside unquoted here-documents, but curiously not inside double-quoted strings), except in ksh - the laudable exception -, where the \ instances are removed. In zsh, curiously, trying to use \" breaks the expansion altogether (as do unbalanced unescaped ones in all shells).
More specifically, the double quotes have no syntactic function in the alternative value, but they are parsed as if they did: quote removal is applied, and you can't use (unbalanced) " instances in the interior without \"-escaping them (which, as stated, is useless, because the \ instances are retained).
Given the implicit double-quoted-string semantics, literal $ instances must either be \$-escaped, or a command substitution must be used to embed a single-quoted string ($(printf %s '...')).
The behavior looks deliberate--it is consistent across all Bourne shells I tried (e.g. ksh93 and zsh behave the same way).
The behavior is equivalent to treating the here-doc as double-quoted for these special expansions only. In other words, you get the same result for
$ echo "${var:+"hi there"}"
hi there
$ echo "${var:+'Bye'}"
'Bye'
There is only a very faint hint in the POSIX spec I found that something special happens for double quoted words in parameter expansions. This is from the informative "Examples" section of Parameter Expansion:
The double-quoting of patterns is different depending on where the double-quotes are placed.
"${x#*}"
The <asterisk> is a pattern character.
${x#"*"}
The literal <asterisk> is quoted and not special.
I would read the last line as suggesting that quote removal for double quotes applies to the word. This example would not make sense for single quotes, and by omission, there's no quote removal for single quotes.
Update
I tried the FreeBSD /bin/sh, which is derived from an Almquist Shell. This shell outputs single and double quotes. So the behavior is no longer consistent across all shells, only across most shells I tried.
As for getting double quotes in the expansion of the word after :+, my take is
$ var=1
$ q='"'
$ cat <<EOF
${var:+${q}hi there$q}
EOF
"hi there"
$ cat <<EOF
${var:+bare alt value is string already}
${var:+'and these are quotes within string'}
${var:+"these are double quotes within string"}
${var:+"which are removed during substitution"}
"${var:+but you can simply not substitute them away ;)}"
EOF
bare alt value is string already
'and these are quotes within string'
these are double quotes within string
which are removed during substitution
"but you can simply not substitute them away ;)"
Note, that here-document is not needed to reproduce this:
$ echo "${var:+'foo'}"
'foo'
Related
I have read a ton of pages including the bash manual, but still find the "non-obvious" use of backslashes confusing.
If I do:
echo \*
it prints a single asterisks, this is normal as I am escaping the asterisks making it literal.
If I do:
echo \\*
it prints \*
This also seems normal, the first backslash escapes the second.
If I do
echo `echo \\*`
It prints the contents of the directory. But in my mind it should print the same as echo \\* because when that is substituted and passed to echo. I understand this is the non-obvious use of backslashes everyone talks about, but I am struggling to understand WHY it happens.
Also the bash manual says
When the old-style backquote form of substitution is used, backslash retains its literal meaning except when followed by ‘$’, ‘`’, or ‘\’.
But it doesn't define what the "literal meaning on backslash" is. Is it as an escape character, a continuation character, or just literally a backslash character?
Also, it says it retain it's literal meaning, except when followed by ... So when it's followed by one of those three characters what does it do? Does it only escape those three characters?
This is mostly for historical interest since `...` command substitution has been superseded by the cleaner $(...) form. No new script should ever use backticks.
Here's how you evaluate a $(command) substitution
Run the command
Here's how you evaluate a `string` command substitution:
Determine the span of the string, from the opening backtick to the closing unescaped backtick (behavior is undefined if this backtick is inside a string literal: the shell will typically either treat it as literal backtick or as a closing backtick depending on its parser implementation)
Unescape the string by removing backslashes that come before one of the three characters dollar, backtick or backslash. This following character is then inserted literally into the command. A backslash followed by any other character will be left alone.
E.g. Hello\\ World will become Hello\ World, because the \\ is replaced with \
Hello\ World will also become Hello\ World, because the backslash is followed by a character other than one of those three, and therefore retains its literal meaning of just being a backslash
\\\* will become \\* since the \\ will become just \ (since backslash is one of the three), and the \* will remain \* (since asterisk is not)
Evaluate the result as a shell command (this includes following all regular shell escaping rules on the result of the now-unescaped command string)
So to evaluate echo `echo \\*`:
Determine the span of the string, here echo \\*
Unescape it according to the backtick quoting rules: echo \*
Evaluate it as a command, which runs echo to output a literal *
Since the result of the substitution is unquoted, the output will undergo:
Word splitting: * becomes * (since it's just one word)
Pathname expansion on each of the words, so * becomes bin Desktop Downloads Photos public_html according to files in the current directory
Note in particular that this was not the same as replacing the the backtick command with the output and rerunning the result. For example, we did not consider escapes, quotes and expansions in the output, which a simple text based macro expansion would have.
Pass each of these as arguments to the next command (also echo): echo bin Desktop Downloads Photos public_html
The result is a list of files in the current directory.
I have this strange issue with my bash script. I compile boost as part of it. The call from the script looks like this:
./b2 --reconfigure ${PARALLEL} link=static cxxflags=-fPIC install boost.locale.iconv=off boost.locale.posix=off -sICU_PATH="${ICU_PREFIX}" -sICU_LINK="${BOOST_ICU_LIBS}" >> "${BOOST_LOG}" 2>&1
That command works perfectly well. The log file shows that it finds ICU without a problem. However, if I change it to run from a variable, it no longer finds ICU (but it still compiles everything else):
bcmd="./b2 --reconfigure ${PARALLEL} link=static cxxflags=-fPIC install boost.locale.iconv=off boost.locale.posix=off -sICU_PATH=\"${ICU_PREFIX}\" -sICU_LINK=\"${BOOST_ICU_LIBS}\""
$bcmd >> "${BOOST_LOG}" 2>&1
What's the difference? I would like to be able to use the second approach so that I can pass the command into another function before running it.
Don't use a variable to store complex commands involving quotes that are nested. The problem is when you call the variable with just $cmd, the quotes are stripped incorrectly. Putting commands (or parts of commands) into variables and then getting them back out intact is complicated.
Quote removal is part of the one of the word expansions done by the shell. From the excerpt seen in POSIX specification of shell
2.6.7 Quote Removal
The quote characters ( backslash, single-quote, and double-quote) that were present in the original word shall be removed unless they have themselves been quoted.
Your example can be simply reproduced by a simple example. Assuming you have a few command flags (not actual ones)
cmdFlags='--archive --exclude="foo bar.txt"'
If you carefully look through the above, it contains 2 args, one --archive and another for --exclude="foo bar.txt", notice the double-quotes which needs to be preserved when you are passing it.
Notice how the quotes are incorrectly split when I don't quote cmdFlags, in the printf() call below
printf "'%s' " $cmdFlags; printf '\n'
'--archive' '--exclude="foo' 'bar.txt"'
and compare the result with one with proper quoting done below.
printf "'%s' " "$cmdFlags"; printf '\n'
'--archive --exclude="foo bar.txt"'
So along with the suggestion of properly quoting the variable, the general suggestion would be to use an array to store the flags and pass the quoted array expansion
cmdArray=()
cmdArray=(./b2 --reconfigure ${PARALLEL} link=static cxxflags=-fPIC install boost.locale.iconv=off boost.locale.posix=off -sICU_PATH="${ICU_PREFIX}" -sICU_LINK="${BOOST_ICU_LIBS}")
and pass the array as
"${cmdArrray[#]}" >> "${BOOST_LOG}" 2>&1
Try to use eval when you want to execute a string as a command. This way, you won't have issues regarding strings that have spaces etc. The expanded cmd string is not re-evaluated by bash hence, things like "hi there" are expanded as two separate tokens.
eval "$bcmd" >> "${BOOST_LOG}" 2>&1
To demonstrate this behavior, consider this code:
cmd='echo "hi there"'
$cmd
eval "$cmd"
Which outputs to:
"hi there"
hi there
The token "hi there" is not re-evaluated as a quoted string.
Use single quota instead of two.
bcmd='./b2 --reconfigure ${PARALLEL} link=static cxxflags=-fPIC install boost.locale.iconv=off boost.locale.posix=off -sICU_PATH=\"${ICU_PREFIX}\" -sICU_LINK=\"${BOOST_ICU_LIBS}\"'
Bash documentation states that single quota does not interpolate:
3.1.2.2 Single Quotes
Enclosing characters in single quotes (') preserves the literal value of each character within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash.
That way you omit double-quota stripping and removing problem, and your string will be passed correctly. If you want to stay with double quotes, you have to vary that they DO NOT preserve literal value of certain characters: $, ' and \ unless preceded with \, as manual states:
3.1.2.3 Double Quotes
Enclosing characters in double quotes (") preserves the literal value of all characters within the quotes, with the exception of $, `, \, and, when history expansion is enabled, !. The characters $ and ` retain their special meaning within double quotes (see Shell Expansions). The backslash retains its special meaning only when followed by one of the following characters: $, `, ", \, or newline. Within double quotes, backslashes that are followed by one of these characters are removed. Backslashes preceding characters without a special meaning are left unmodified. A double quote may be quoted within double quotes by preceding it with a backslash.
In your example you forgot to mark $ with backslash as well. Difference between these two is explained perfectly by Adam here: Differences between single and double quota
What do those two assignations (i and C omitting the first one to void) do? Is it some kind of regex for the variable? I tried with bash, but so far there were no changes in the output of my strings after instantiating them with "${i//\\/\\\\}" or "\"${i//\"/\\\"}\""
C=''
for i in "$#"; do
i="${i//\\/\\\\}"
C="$C \"${i//\"/\\\"}\""
done
${i//\\/\\\\} is a slightly complicated-looking parameter expansion:
It expands the variable $i in the following way:
${i//find/replace} means replace all instances of "find" with "replace". In this case, the thing to find is \, which itself needs escaping with another \.
The replacement is two \, which each need escaping.
For example:
$ i='a\b\c'
$ echo "${i//\\/\\\\}"
a\\b\\c
The next line performs another parameter expansion:
find " (which needs to be escaped, since it is inside a double-quoted string)
replace with \" (both the double quote and the backslash need to be escaped).
It looks like the intention of the loop is to build a string C, attempting to safely quote/escape the arguments passed to the script. This type of approach is generally error-prone, and it would probably be better to work with the input array directly. For example, the arguments passed to the script can be safely passed to another command like:
cmd "$#" # does "the right thing" (quotes each argument correctly)
if you really need to escape the backslashes, you can do that too:
cmd "${#//\\/\\\\}" # replaces all \ with \\ in each argument
It's bash parameter expansions
it replace all backslashes by double backslashes :"${i//\\/\\\\}
it replace all \" by \\" : ${i//\"/\\\"}
Check http://wiki.bash-hackers.org/syntax/pe
This question already has answers here:
Is it necessary to quote command substitutions during variable assignment in bash?
(3 answers)
Closed 6 years ago.
I always thought I must put quotes around $() in bash, like in:
FOO="$(echo "bar baz")"
but apparently this is unnecessary, at least during variable assignment:
$ FOO=$(echo "foo bar")
$ echo "$FOO"
foo bar
On the other hand, if I just try to assign multiple words to a variable, I get an error, because it's interpreted as "set variable for duration of subsequent command":
$ FOO=bar fooooo
fooooo: command not found
Also, if I just use $() without quotes in non-assignment context, they're again treated as separate words:
$ echo $(echo "baa beee")
baa beee
So, what are the rules regarding $() and "" interaction, and how safe is the non-quote variant? I'd be especially grateful for manpage quotes, or some other authoritative references. Also, is there some "good practice/style" here?
In brief, quotes are necessary to suppress otherwise normal behavior. In most contexts, you would need to quote the command substitution to suppress word-splitting and pathname expansion.
From the man page, under "Command substitution":
If the [command] substitution appears within double quotes, word
splitting and pathname expansion are not performed on the results.
However, the right-hand side of an assignment is not one of those contexts. From the man page, under "PARAMETERS":
A variable may be assigned to by a statement of the form
name=[value]
If value is not given, the variable is assigned the null string. All values
undergo tilde expansion, parameter and variable expansion, command substitution,
arithmetic expansion, and quote removal (see EXPANSION below).
Note that neither word splitting nor pathname expansion are mentioned.
As a rule, when in doubt, quote any expansion. The times when you want word-splitting and pathname expansion of an expansion are rare and usually obvious.
Sample bash script
QRY="select * from mysql"
CMD="mysql -e \"$QRY\""
`$CMD`
I get errors because the * is getting evaluated as a glob (enumerating) files in my CWD.
I Have seen other posts that talk about quoting the "$CMD" reference for purposes of echo output, but in this case
"$CMD"
complains the whole literal string as a command.
If I
echo "$CMD"
And then copy/paste it to the command line, things seems to work.
You can just use:
qry='select * from db'
mysql -e "$qry"
This will not subject to * expansion by shell.
If you want to store mysql command line also then use BASH arrays:
cmd=(mysql -e "$qry")
"${cmd[#]}"
Note: anubhava's answer has the right solution.
This answer provides background information.
As for why your approach didn't work:
"$CMD" doesn't work, because bash sees the entire value as a single token that it interprets as a command name, which obviously fails.
`$CMD`
i.e., enclosing $CMD in backticks, is pointless in this case (and will have unintended side effects if the command produces stdout output[1]); using just:
$CMD
yields the same - broken - result (only more efficiently - by enclosing in backticks, you needlessly create a subshell; use backticks - or, better, $(...) only when embedding one command in another - see command substitution).
$CMD doesn't work,
because unquoted use of * subjects it to pathname expansion (globbing) - among other shell expansions.
\-escaping glob chars. in the string causes the \ to be preserved when the string is executed.
While it may seem that you've enclosed the * in double quotes by placing it (indirectly) between escaped double quotes (\"$QRY\") inside a double-quoted string, the shell does not see what's between these escaped double quotes as a single, double-quoted string.
Instead, these double quotes become literal parts of the tokens they abut, and the shell still performs word splitting (parsing into separate arguments by whitespace) on the string, and expansions such as globbing on the resulting tokens.
If we assume for a moment that globbing is turned off (via set -f), here is the breakdown of the arguments passed to mysql when the shell evaluates (unquoted) $CMD:
-e # $1 - all remaining arguments are the unintentionally split SQL command.
"select # $2 - note that " has become a literal part of the argument
* # $3
from # $4
mysql" # $5 - note that " has become a literal part of the argument
The only way to get your solution to work with the existing, single string variable is to use eval as follows:
eval "$CMD"
That way, the embedded escaped double-quoted string is properly parsed as a single, double-quoted string (to which no globbing is applied), which (after quote removal) is passed as a single argument to mysql.
However, eval is generally to be avoided due to its security implications (if you don't (fully) control the string's content, arbitrary commands could be executed).
Again, refer to anubhava's answer for the proper solution.
[1] A note re using `$CMD` as a command by itself:
It causes bash to execute stdout output from $CMD as another command, which is rarely the intent, and will typically result in a broken command or, worse, a command with unintended effects.
Try running `echo ha` (with the backticks - same as: $(echo ha)); you'll get -bash: ha: command not found, because bash tries to execute the command's output - ha - as a command, which fails.