I have read a ton of pages including the bash manual, but still find the "non-obvious" use of backslashes confusing.
If I do:
echo \*
it prints a single asterisk. This is normal, as I am escaping the asterisk, making it literal.
If I do:
echo \\*
it prints \*
This also seems normal, the first backslash escapes the second.
If I do
echo `echo \\*`
It prints the contents of the directory. But in my mind it should print the same as echo \\*, because that is what gets substituted and passed to echo. I understand this is the non-obvious use of backslashes everyone talks about, but I am struggling to understand WHY it happens.
Also the bash manual says
When the old-style backquote form of substitution is used, backslash retains its literal meaning except when followed by ‘$’, ‘`’, or ‘\’.
But it doesn't define what the "literal meaning of backslash" is. Is it as an escape character, a continuation character, or just literally a backslash character?
Also, it says it retains its literal meaning, except when followed by ... So when it's followed by one of those three characters, what does it do? Does it only escape those three characters?
This is mostly for historical interest since `...` command substitution has been superseded by the cleaner $(...) form. No new script should ever use backticks.
Here's how you evaluate a $(command) substitution
Run the command
Here's how you evaluate a `string` command substitution:
Determine the span of the string, from the opening backtick to the closing unescaped backtick (behavior is undefined if this backtick is inside a string literal: the shell will typically either treat it as literal backtick or as a closing backtick depending on its parser implementation)
Unescape the string by removing backslashes that come before one of the three characters dollar, backtick or backslash. This following character is then inserted literally into the command. A backslash followed by any other character will be left alone.
E.g. Hello\\ World will become Hello\ World, because the \\ is replaced with \
Hello\ World will also become Hello\ World, because the backslash is followed by a character other than one of those three, and therefore retains its literal meaning of just being a backslash
\\\* will become \\* since the \\ will become just \ (since backslash is one of the three), and the \* will remain \* (since asterisk is not)
Evaluate the result as a shell command (this includes following all regular shell escaping rules on the result of the now-unescaped command string)
So to evaluate echo `echo \\*`:
Determine the span of the string, here echo \\*
Unescape it according to the backtick quoting rules: echo \*
Evaluate it as a command, which runs echo to output a literal *
Since the result of the substitution is unquoted, the output will undergo:
Word splitting: * becomes * (since it's just one word)
Pathname expansion on each of the words, so * becomes bin Desktop Downloads Photos public_html according to files in the current directory
Note in particular that this was not the same as replacing the backtick command with its output and rerunning the result. For example, we did not consider escapes, quotes and expansions in the output, which a simple text-based macro expansion would have.
Pass each of these as arguments to the next command (also echo): echo bin Desktop Downloads Photos public_html
The result is a list of files in the current directory.
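To see the extra unescaping pass in action, here is a small sketch that runs both forms in a scratch directory. The directory and the file names alpha and beta are made up for the demonstration, and it assumes no file is literally named "*".

```shell
#!/usr/bin/env bash
# Scratch directory with two known files so the glob result is predictable.
# (The directory and file names are illustrative assumptions.)
tmp=$(mktemp -d)
touch "$tmp/alpha" "$tmp/beta"
cd "$tmp" || exit 1

# Backticks: the text is first unescaped to `echo \*`, which prints a bare *.
# That unquoted result is then glob-expanded by the outer command line.
backtick_result=$(echo `echo \\*`)

# $(...): no extra unescaping pass, so the inner command really is `echo \\*`,
# which prints \* ; with no file literally named "*", the word is left alone.
dollar_result=$(echo $(echo \\*))

printf 'backticks:    %s\n' "$backtick_result"
printf 'dollar-paren: %s\n' "$dollar_result"
```

With these two files the first line should show alpha beta and the second \*, which matches what a plain echo \\* prints.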
Related
Why is the escaping syntax in bash so confusing when nesting commands deeply? I know there is an alternate approach with $() to nest commands; I am just curious why it behaves this way when nesting commands using backticks.
For example:
echo `echo \`echo \\\`echo inside\\\`\``
Gives output: inside
But
echo `echo \`echo \\`echo inside\\`\``
Fails with,
bash: command substitution: line 1: unexpected EOF while looking for matching ``'
bash: command substitution: line 2: syntax error: unexpected end of file
bash: command substitution: line 1: unexpected EOF while looking for matching ``'
bash: command substitution: line 2: syntax error: unexpected end of file
echo inside\
My question is why the number of backslashes required for second-level nesting is 3 and not 2. In the example above, one backslash is used for one level deep and three are used for second-level nesting to preserve the literal meaning of the backtick.
The basic problem is that there's no distinction between an open-backtick and a close-backtick. So if the shell sees something like this:
somecommand ` something1 ` something2 ` something3 `
...there's no intrinsic way to tell if that's two separate backticked commands (something1 and something3), with a literal string ("something2") in between; or a nested backtick expression, with something2 being run first and its output passed to something1 as an argument (along with the literal string "something3"). In order to avoid ambiguity, the shell syntax picks the first interpretation, and requires that if you want the second interpretation you need to escape the inner level of backticks:
somecommand ` something1 ` something2 ` something3 ` # Two separate expansions
somecommand ` something1 \` something2 \` something3 ` # Nested expansions
And that means adding another level of parsing-and-removing escapes, which means you need to escape any escapes you didn't want parsed at that point, and the whole thing gets quickly out of hand.
The $( ) syntax, on the other hand, is not ambiguous, because the opening and closing markers are not the same. Compare the two possibilities:
somecommand $( something1 ) something2 $( something3 ) # Two separate expansions
somecommand $( something1 $( something2 ) something3 ) # Nested expansions
There's no ambiguity there, so no need for escapes or other syntactic weirdness.
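As a rough illustration of the two $( ) shapes, the sketch below uses printf to bracket each argument it receives, so the difference between "separate" and "nested" is visible in the output. The words a, b and c are placeholders.

```shell
#!/usr/bin/env bash
# Two separate substitutions with a literal word between them:
# printf receives three arguments and brackets each one.
separate=$(printf '[%s]' "$(echo a)" b "$(echo c)")

# One substitution nested inside another: the inner output becomes part of
# the outer echo's command line, so printf receives a single argument.
nested=$(printf '[%s]' "$(echo a "$(echo b)" c)")

printf '%s\n' "$separate"   # [a][b][c]
printf '%s\n' "$nested"     # [a b c]
```

No backslashes are needed in either case, even with double quotes nested inside the substitutions.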
The reason the number of escapes grows so fast with the number of levels is again to avoid ambiguity. And it's not something specific to command expansions with backticks; this escape inflation shows up anytime you have a string going through multiple levels of parsing, each of which applies (and removes) escapes.
Suppose the shell runs across two escapes and a backtick (\\`) as it parses a line. Should it parse that as a doubly-escaped backtick, or a singly-escaped escape (backslash) character followed by a not-escaped-at-all backtick? If it runs across three escapes and a backtick (\\\`), is that a triply-escaped backtick, a doubly-escaped escape followed by a not-escaped-at-all backtick, or a singly-escaped escape followed by a singly-escaped backtick?
The shell (like most things that deal with escapes) avoids the ambiguity by not treating stacked escapes as a special thing. When it runs into an escape character, that applies only to the thing immediately after it; if the thing immediately after it is another escape, then it escapes that one character and has no effect on whatever's after it. Thus \\` is an escaped escape, followed by a not-escaped-at-all backtick. That means you can't just add another escape to the front, you have to add an escape in front of each and every escape-worthy character in the string (including escapes from lower levels).
So, let's start with a simple backtick, and work through escaping it to various levels:
First level is easy, just escape it: \`.
For the second level, we have to escape that escape (\\) and then separately escape the backtick itself (\`), giving a total of three backslashes: \\\`.
For the third level, we have to individually escape each of those three escapes (so 3x\\) and once again escape the backtick itself (\`), giving a total of seven backslashes: \\\\\\\`.
It continues like that, more than doubling the number of escapes for each level. From 7 it goes to 15, then 31, then 63, then... There's a good reason people try to avoid situations with deeply nested escapes.
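The growth can be reproduced mechanically. The following is only a sketch: escape_once and escape_to_level are hypothetical helper names, and the approach (using sed to prefix every backslash and backtick with one more backslash) is just one way to model a single level of escaping.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add one level of escaping by prefixing every
# backslash and backtick with a backslash.
escape_once() {
  printf '%s' "$1" | sed 's/[\\`]/\\&/g'
}

# Hypothetical helper: apply escape_once the requested number of times.
escape_to_level() {
  text=$1
  level=$2
  while [ "$level" -gt 0 ]; do
    text=$(escape_once "$text")
    level=$((level - 1))
  done
  printf '%s\n' "$text"
}

for n in 1 2 3 4; do
  escaped=$(escape_to_level '`' "$n")
  # Length minus the backtick itself is the number of backslashes.
  printf 'level %d: %d backslashes: %s\n' "$n" "$(( ${#escaped} - 1 ))" "$escaped"
done
```

The counts printed are 1, 3, 7 and 15, that is, each level doubles the previous count and adds one.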
Oh, and as I mentioned, the shell isn't the only thing that does this, and that can complicate matters because different levels can have different escaping syntaxes, and some things may not need escaping at some of the levels. For example, suppose the thing being escaped is the regular expression \s. To add a level to that, you'd only need one additional escape (\\s) because the "s" doesn't need to be escaped by itself. Additional levels of escaping on that would give 4, 8, 16, 32 etc escapes.
TLDR; Yo, dawg, I heard you like escapes...
P.S. You can use the shell's -v option to make it print commands before executing them. With nested commands like this, it'll print each of the commands as it un-nests them, so you can watch the stack of escaped escapes collapse as the layers get stripped off:
$ set -v
$ echo "this is `echo "a literal \`echo "backtick: \\\\\\\`" \`" `"
echo "this is `echo "a literal \`echo "backtick: \\\\\\\`" \`" `"
echo "a literal `echo "backtick: \\\`" `"
echo "backtick: \`"
this is a literal backtick: `
(For even more fun, try this after set -vx -- the -x option will print the commands after parsing, so after you see it drill into the nested commands, you'll then see what happens as it unwinds back out to the final top-level command.)
There is nothing confusing per se in the syntax that you have shown. You just need to break down each of the levels one by one.
The GNU bash man page says
When the old-style backquote form of substitution is used, backslash retains its literal meaning except when followed by $, `, or \.
Command substitutions may be nested. To nest when using the backquoted form, escape the inner backquotes with backslashes.
So with that in context, the nested substitution needs one \ to escape the backquote and one more to escape that escape character (per the quote above, \ is only special here when followed by $, `, or \). That is why the second level of nesting needs two additional backslashes in front of the already-escaped backquote:
echo `echo \`echo \\\`echo inside\\\`\``
# ^^^^ ^^^^
becomes
echo `echo \`echo inside\``
# ^^ ^^
which in turn becomes
echo `echo inside`
# ^ ^
which eventually becomes
echo inside
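For comparison, here is a short sketch that captures the result of the three-level backtick form from the question next to a $( ) version nested to the same depth. The variable names are only for illustration.

```shell
#!/usr/bin/env bash
# Three levels of backticks: 1 backslash for the second level, 3 for the third.
with_backticks=`echo \`echo \\\`echo inside\\\`\``

# The same nesting depth with $( ): no escaping at all.
with_dollar=$(echo $(echo $(echo inside)))

printf '%s\n' "$with_backticks"   # inside
printf '%s\n' "$with_dollar"      # inside
```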
I have this strange issue with my bash script. I compile boost as part of it. The call from the script looks like this:
./b2 --reconfigure ${PARALLEL} link=static cxxflags=-fPIC install boost.locale.iconv=off boost.locale.posix=off -sICU_PATH="${ICU_PREFIX}" -sICU_LINK="${BOOST_ICU_LIBS}" >> "${BOOST_LOG}" 2>&1
That command works perfectly well. The log file shows that it finds ICU without a problem. However, if I change it to run from a variable, it no longer finds ICU (but it still compiles everything else):
bcmd="./b2 --reconfigure ${PARALLEL} link=static cxxflags=-fPIC install boost.locale.iconv=off boost.locale.posix=off -sICU_PATH=\"${ICU_PREFIX}\" -sICU_LINK=\"${BOOST_ICU_LIBS}\""
$bcmd >> "${BOOST_LOG}" 2>&1
What's the difference? I would like to be able to use the second approach so that I can pass the command into another function before running it.
Don't use a variable to store complex commands involving quotes that are nested. The problem is when you call the variable with just $cmd, the quotes are stripped incorrectly. Putting commands (or parts of commands) into variables and then getting them back out intact is complicated.
Quote removal is one of the word expansions done by the shell. From the POSIX specification of the shell:
2.6.7 Quote Removal
The quote characters ( backslash, single-quote, and double-quote) that were present in the original word shall be removed unless they have themselves been quoted.
Your problem can be reproduced with a simple example. Assume you have a few command flags (not actual ones):
cmdFlags='--archive --exclude="foo bar.txt"'
If you look carefully at the above, it is meant to hold 2 args, one --archive and another --exclude="foo bar.txt". Notice the double quotes, which need to be interpreted as quoting when the flags are passed on.
Notice how the words are split incorrectly, with the quotes kept as literal characters, when I don't quote cmdFlags in the printf call below
printf "'%s' " $cmdFlags; printf '\n'
'--archive' '--exclude="foo' 'bar.txt"'
and compare the result with one with proper quoting done below.
printf "'%s' " "$cmdFlags"; printf '\n'
'--archive --exclude="foo bar.txt"'
So along with the suggestion of properly quoting the variable, the general suggestion would be to use an array to store the flags and pass the quoted array expansion
cmdArray=()
cmdArray=(./b2 --reconfigure ${PARALLEL} link=static cxxflags=-fPIC install boost.locale.iconv=off boost.locale.posix=off -sICU_PATH="${ICU_PREFIX}" -sICU_LINK="${BOOST_ICU_LIBS}")
and pass the array as
"${cmdArray[@]}" >> "${BOOST_LOG}" 2>&1
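Here is a minimal sketch of why the array form survives being handed to another function, which is what the question wants to do. It needs bash (arrays are not POSIX sh); count_args and run_logged are hypothetical helpers, and printf stands in for the real ./b2 command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report how many arguments it received.
count_args() { echo "$#"; }

cmdFlags='--archive --exclude="foo bar.txt"'     # string: quotes are just characters
flagArray=(--archive --exclude="foo bar.txt")    # array: quotes are parsed when assigned

from_string=$(count_args $cmdFlags)              # word-split on spaces: 3 arguments
from_array=$(count_args "${flagArray[@]}")       # one word per element: 2 arguments
printf 'string: %s args, array: %s args\n' "$from_string" "$from_array"

# Hypothetical helper: run whatever command it is given, appending its
# output to a log file. The first argument is the log path.
run_logged() {
  local log=$1
  shift
  "$@" >> "$log" 2>&1
}

log=$(mktemp)
run_logged "$log" printf '<%s>\n' "${flagArray[@]}"
cat "$log"
```

The log should contain <--archive> and <--exclude=foo bar.txt> on separate lines, showing that the space inside the second argument was preserved.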
Try using eval when you want to execute a string as a command. This way, you won't have issues with strings that contain spaces, etc. The expanded cmd string is not re-evaluated by bash; hence, something like "hi there" is split into two separate tokens.
eval "$bcmd" >> "${BOOST_LOG}" 2>&1
To demonstrate this behavior, consider this code:
cmd='echo "hi there"'
$cmd
eval "$cmd"
Which outputs:
"hi there"
hi there
The token "hi there" is not re-evaluated as a quoted string.
Use single quotes instead of double quotes.
bcmd='./b2 --reconfigure ${PARALLEL} link=static cxxflags=-fPIC install boost.locale.iconv=off boost.locale.posix=off -sICU_PATH=\"${ICU_PREFIX}\" -sICU_LINK=\"${BOOST_ICU_LIBS}\"'
The Bash documentation states that single quotes do not interpolate:
3.1.2.2 Single Quotes
Enclosing characters in single quotes (') preserves the literal value of each character within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash.
That way you avoid the problem of double quotes being stripped and removed, and your string will be passed correctly. If you want to stay with double quotes, you have to be aware that they DO NOT preserve the literal value of certain characters: $, ` and \ unless preceded by \, as the manual states:
3.1.2.3 Double Quotes
Enclosing characters in double quotes (") preserves the literal value of all characters within the quotes, with the exception of $, `, \, and, when history expansion is enabled, !. The characters $ and ` retain their special meaning within double quotes (see Shell Expansions). The backslash retains its special meaning only when followed by one of the following characters: $, `, ", \, or newline. Within double quotes, backslashes that are followed by one of these characters are removed. Backslashes preceding characters without a special meaning are left unmodified. A double quote may be quoted within double quotes by preceding it with a backslash.
In your example you forgot to mark $ with a backslash as well. The difference between these two is explained perfectly by Adam here: Differences between single and double quotes
I need to have the text "\$CONDITIONS" in a string. I tried using:
> echo "\$CONDITIONS"
$CONDITIONS
> echo "\\$CONDITIONS"
\
Could you help me? What should I enter in echo command to get
\$CONDITIONS
as a result?
echo doesn't do anything except print exactly the string you pass in. The trick is to know enough about the shell to be able to pass in the string you want.
If you don't need the shell to perform substitutions on the value, simply use single quotes instead.
echo '\$CONDITIONS'
If you absolutely need to use double quotes, you can still single-quote individual parts of the string. Single quotes adjacent to double quotes will get pasted together into a single string before the shell passes it on.
echo '\$'"CONDITIONS"
Good old echo is slightly tired; you might also want to consider printf which is somewhat more versatile.
printf "\x5c\x24CONDITIONS\n"
(I'd normally use single quotes here as well; the double quotes are just to demonstrate that this works even with double quotes. But be careful with the backslashes; these happen to work even with single backslashes, but often they will need to be doubled if you want literal backslashes inside double quotes.)
To review what happened in your failed attempts,
echo "\$CONDITIONS" # produces $CONDITIONS
the backslash properly escapes the dollar sign from the shell, and is removed as part of the process. So you are saying, a literal dollar sign, and the text CONDITIONS.
echo "\\$CONDITIONS" # produces \
Here, the backslash similarly escapes the backslash, and the shell expands the variable $CONDITIONS which is unset or empty.
echo "\\\$CONDITIONS"
Well, this works, but it's ugly. There is a backslash-escaped backslash, and a backslash-escaped dollar sign, and the text CONDITIONS.
Backslashes and dollar signs (and backticks `) don't get processed inside single quotes, so that's what you should usually use if your string contains any of these (and more generally, if you don't specifically require the shell to handle these constructs).
Backslashes are kind of tricky inside double quotes. The shell will remove the ones it processes (so \$ gets turned into just $) but retain the ones it doesn't actually do anything with (so \x is preserved as \x inside double quotes). Without quotes, the behavior is different again. (Not even going into that rabbithole. Just use quotes.)
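The variants discussed above can be checked side by side. This sketch uses printf '%s\n' rather than echo so that no echo-specific backslash handling gets in the way, and it assumes CONDITIONS is unset, as in the question.

```shell
#!/usr/bin/env bash
unset CONDITIONS   # assumption: the variable is unset, as in the question

single=$(printf '%s\n' '\$CONDITIONS')      # nothing is processed in single quotes
mixed=$(printf '%s\n' '\$'"CONDITIONS")     # adjacent quoted pieces are pasted together
double=$(printf '%s\n' "\\\$CONDITIONS")    # \\ gives \ and \$ gives $
failed=$(printf '%s\n' "\\$CONDITIONS")     # \\ gives \, then $CONDITIONS expands to nothing

printf 'single: %s\n' "$single"   # \$CONDITIONS
printf 'mixed:  %s\n' "$mixed"    # \$CONDITIONS
printf 'double: %s\n' "$double"   # \$CONDITIONS
printf 'failed: %s\n' "$failed"   # \
```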
Use \\\ to achieve this.
#!/bin/bash
echo "\\\$CONDITION" # prints \$CONDITION
How do I escape characters in linux using the sed command?
I want to print something like this
echo hey$ya
But I'm just receiving a
hey
how can escape the $ character?
The reason you are only seeing "hey" echoed is that, because of the $, the shell tries to expand a variable called ya. Since no such variable exists, it expands to an empty string (basically it disappears).
You can use single quotes, they prevent variable expansion :
echo 'hey$ya'
You can also escape the character :
echo hey\$ya
Strings can also be enclosed in double quotes (e.g. echo "hey$ya"), but these do not prevent expansion, all they do is keep the whole expression as a single string instead of allowing word splitting to separate words in separate arguments for the command being executed. Using double quotes would not work in your case.
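A quick sketch comparing the four spellings, assuming the variable ya is unset as in the question:

```shell
#!/usr/bin/env bash
unset ya   # assumption: no variable named ya exists

unquoted=$(printf '%s\n' hey$ya)     # $ya expands to nothing
double=$(printf '%s\n' "hey$ya")     # double quotes still expand $ya
single=$(printf '%s\n' 'hey$ya')     # single quotes keep the $ literal
escaped=$(printf '%s\n' hey\$ya)     # a backslash keeps the $ literal

printf '%s\n' "$unquoted" "$double" "$single" "$escaped"
```

This prints hey twice, then hey$ya twice.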
\ is the escape character. So your example would be:
~ » echo hey\$ya
hey$ya
~ »
I'm trying to understand why Bash removes double quotes (but not single quotes) when doing variable expansion with ${parameter:+word} (Use Alternate Value), in a here-document, for example:
% var=1
% cat <<EOF
> ${var:+"Hi there"}
> ${var:+'Bye'}
> EOF
Hi there
'Bye'
According to the manual, the "word" after :+ is processed with tilde expansion, parameter expansion, command substitution, and arithmetic expansion. None of these should do anything.
What am I missing? How can I get double quotes in the expansion?
tl;dr
$ var=1; cat <<EOF
"${var:+Hi there}"
${var:+$(printf %s '"Hi there"')}
EOF
"Hi there"
"Hi there"
The above demonstrates two pragmatic workarounds to include double quotes in the alternative value.
The embedded $(...) approach is more cumbersome, but more flexible: it allows inclusion of embedded double quotes and also gives you control over whether the value should be expanded or not.
Jens' helpful answer and Patryk Obara's helpful answer both shed light on and further demonstrate the problem.
Note that the problematic behavior equally applies to:
(as noted in the other answers) regular double-quoted strings (e.g., echo "${var:+"Hi there"}"; for the 1st workaround, you'd have to \-quote surrounding " instances; e.g., echo "\"${var:+Hi there}\""; curiously, as Gunstick points out in a comment on the question, using \" in the alternative value to produce " in the output does work in double-quoted strings - e.g., echo "${var:+\"Hi th\"ere\"}" - unlike in unquoted here-docs.)
related expansions ${var+...}, ${var-...} / ${var:-...}, and ${var=...} / ${var:=...}
Also, there's a related oddity with respect to \-handling inside double-quoted alternative values inside a double-quoted string / unquoted here-doc: bash and ksh unexpectedly remove embedded \ instances; e.g.,
echo "${nosuch:-"a\b"}" unexpectedly yields ab, even though echo "a\b" in isolation yields a\b - see this question.
I have no explanation for the behavior[1], but I can offer pragmatic solutions that work with all major POSIX-compatible shells (dash, bash, ksh, zsh):
Note that " instances are never needed for syntactic reasons inside the alternative value: The alternative value is implicitly treated like a double-quoted string: no tilde expansion, no word-splitting, and no globbing take place, but parameter expansions, arithmetic expansions and command substitutions are performed.
Note that in parameter expansions involving substitution or prefix/suffix-removal, quotes do have syntactic meaning; e.g.: echo "${BASH#*"bin"}" or echo "${BASH#*'bin'}" - curiously, dash doesn't support single quotes, though.
If you want to surround the entire alternative value with quotes, and it has no embedded quotes and you want it expanded,
quote the entire expansion, which bypasses the problem of " removal from the alternative value:
# Double quotes
$ var=1; cat <<EOF
"${var:+The closest * is far from $HOME}"
EOF
"The closest * is far from /Users/jdoe"
# Single quotes - but note that the alternative value is STILL EXPANDED,
# because of the overall context of the unquoted here-doc.
var=1; cat <<EOF
'${var:+The closest * is far from $HOME}'
EOF
'The closest * is far from /Users/jdoe'
For embedded quotes, or to prevent expansion of the alternative value,
use an embedded command substitution (unquoted, although it'll behave as if it were quoted):
# Expanded value with embedded quotes.
var=1; cat <<EOF
${var:+$(printf %s "We got 3\" of snow at $HOME")}
EOF
We got 3" of snow at /Users/jdoe
# Literal value with embedded quotes.
var=1; cat <<EOF
${var:+$(printf %s 'We got 3" of snow at $HOME')}
EOF
We got 3" of snow at $HOME
These two approaches can be combined as needed.
[1]
In effect, the alternative value:
behaves like an implicitly double-quoted string,
' instances, as in regular double-quoted strings, are treated as literals.
Given the above,
it would make sense to treat embedded " instances as literals too, and simply pass them through, just like the ' instances.
Instead, sadly, they are removed, and if you try to escape them as \", the \ is retained too (inside unquoted here-documents, but curiously not inside double-quoted strings), except in ksh - the laudable exception -, where the \ instances are removed. In zsh, curiously, trying to use \" breaks the expansion altogether (as do unbalanced unescaped ones in all shells).
More specifically, the double quotes have no syntactic function in the alternative value, but they are parsed as if they did: quote removal is applied, and you can't use (unbalanced) " instances in the interior without \"-escaping them (which, as stated, is useless, because the \ instances are retained).
Given the implicit double-quoted-string semantics, literal $ instances must either be \$-escaped, or a command substitution must be used to embed a single-quoted string ($(printf %s '...')).
The behavior looks deliberate--it is consistent across all Bourne shells I tried (e.g. ksh93 and zsh behave the same way).
The behavior is equivalent to treating the here-doc as double-quoted for these special expansions only. In other words, you get the same result for
$ echo "${var:+"hi there"}"
hi there
$ echo "${var:+'Bye'}"
'Bye'
There is only a very faint hint in the POSIX spec I found that something special happens for double quoted words in parameter expansions. This is from the informative "Examples" section of Parameter Expansion:
The double-quoting of patterns is different depending on where the double-quotes are placed.
"${x#*}"
The <asterisk> is a pattern character.
${x#"*"}
The literal <asterisk> is quoted and not special.
I would read the last line as suggesting that quote removal for double quotes applies to the word. This example would not make sense for single quotes, and by omission, there's no quote removal for single quotes.
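The effect of quote placement in that POSIX example can be observed with a value that happens to start with an asterisk. The value '*abc' and the variable names are assumptions chosen for the demonstration.

```shell
#!/usr/bin/env bash
x='*abc'   # illustrative value that starts with a literal asterisk

# Quotes around the whole expansion: * is still a pattern. With #, the
# shortest match of * is the empty string, so nothing is removed.
as_pattern="${x#*}"

# Quotes around the word itself: * is literal, so the leading asterisk
# is what gets removed.
as_literal=${x#"*"}

printf 'pattern: %s\n' "$as_pattern"   # *abc
printf 'literal: %s\n' "$as_literal"   # abc
```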
Update
I tried the FreeBSD /bin/sh, which is derived from an Almquist Shell. This shell outputs single and double quotes. So the behavior is no longer consistent across all shells, only across most shells I tried.
As for getting double quotes in the expansion of the word after :+, my take is
$ var=1
$ q='"'
$ cat <<EOF
${var:+${q}hi there$q}
EOF
"hi there"
$ cat <<EOF
${var:+bare alt value is string already}
${var:+'and these are quotes within string'}
${var:+"these are double quotes within string"}
${var:+"which are removed during substitution"}
"${var:+but you can simply not substitute them away ;)}"
EOF
bare alt value is string already
'and these are quotes within string'
these are double quotes within string
which are removed during substitution
"but you can simply not substitute them away ;)"
Note, that here-document is not needed to reproduce this:
$ echo "${var:+'foo'}"
'foo'