Behavior of echo with many backslashes - shell

I have a project whose goal is to implement the same behavior as the echo command. My problem is with backslashes. As I understand it, when a backslash appears, the next character must be treated as an ordinary character, but that does not seem to be what happens here.
Here is an example:
echo \\\\
OUTPUT : \
The problem is that I expected the output to be 2 backslashes, not just one.
To get 2 backslashes I need to write 6 backslashes:
echo \\\\\\
Can anyone help me to understand this behavior?

There are multiple layers where backslashes are interpreted. The backslash is an escape character in the shell (among other places). In many contexts a backslash followed by a character is an escape code for another character (for instance, some echo implementations interpret \n as a line break).
When you execute echo \\\\\\, the shell first parses the escape sequences and ends up passing \\\ to the command (in this case echo).
Single-quoting the string prevents interpretation by the shell (i.e. echo '\\' passes two backslashes to the echo command). Double quotes are not enough here: inside them the shell still treats \\ as an escaped backslash, so echo "\\" passes only one. You also either have an additional layer of interpretation or your program is incorrectly handling the backslash sequence. Ultimately, you'll need to escape the backslash once for each layer.
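A minimal sketch of the layers, assuming a POSIX-style shell. It uses printf '%s\n' instead of echo so that the printing command adds no interpretation of its own; the variable names are only for illustration. The last assignment uses printf's %b, which interprets backslash escapes in its argument, as a stand-in for an echo that adds a second layer (that is an assumption about the asker's echo, not something the question confirms).

```shell
# Layer 1 only: the shell's own quoting rules, printed verbatim by %s
unquoted=$(printf '%s\n' \\\\\\)      # six unquoted backslashes reach printf as three
double=$(printf '%s\n' "\\\\")        # four inside double quotes reach printf as two
single=$(printf '%s\n' '\\')          # two inside single quotes reach printf as two

# Layer 1 plus layer 2: %b interprets backslash escapes once more
two_layers=$(printf '%b\n' \\\\)      # shell passes two, %b turns them into one

printf '%s\n' "$unquoted" "$double" "$single" "$two_layers"
```

The four lines printed contain three, two, two and one backslash respectively.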

Related

Trouble understanding the non-obvious use of backslash inside of backticks

I have read a ton of pages including the bash manual, but still find the "non-obvious" use of backslashes confusing.
If I do:
echo \*
it prints a single asterisk, which is normal, as I am escaping the asterisk and making it literal.
If I do:
echo \\*
it prints \*
This also seems normal, the first backslash escapes the second.
If I do
echo `echo \\*`
It prints the contents of the directory. But in my mind it should print the same as echo \\*, because I expect the inner command's output to be substituted and passed to the outer echo. I understand this is the non-obvious use of backslashes everyone talks about, but I am struggling to understand WHY it happens.
Also the bash manual says
When the old-style backquote form of substitution is used, backslash retains its literal meaning except when followed by ‘$’, ‘`’, or ‘\’.
But it doesn't define what the "literal meaning" of backslash is. Is it an escape character, a continuation character, or just literally a backslash character?
Also, it says it retains its literal meaning except when followed by ... So when it's followed by one of those three characters, what does it do? Does it only escape those three characters?
This is mostly for historical interest since `...` command substitution has been superseded by the cleaner $(...) form. No new script should ever use backticks.
Here's how you evaluate a $(command) substitution
Run the command
Here's how you evaluate a `string` command substitution:
Determine the span of the string, from the opening backtick to the closing unescaped backtick (behavior is undefined if this backtick is inside a string literal: the shell will typically either treat it as literal backtick or as a closing backtick depending on its parser implementation)
Unescape the string by removing backslashes that come before one of the three characters dollar, backtick or backslash. This following character is then inserted literally into the command. A backslash followed by any other character will be left alone.
E.g. Hello\\ World will become Hello\ World, because the \\ is replaced with \
Hello\ World will also become Hello\ World, because the backslash is followed by a character other than one of those three, and therefore retains its literal meaning of just being a backslash
\\\* will become \\* since the \\ will become just \ (since backslash is one of the three), and the \* will remain \* (since asterisk is not)
Evaluate the result as a shell command (this includes following all regular shell escaping rules on the result of the now-unescaped command string)
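The unescaping step alone can be sketched with sed. This is only an illustration of the rule described above, not how any shell actually implements it, and backtick_unescape is a made-up helper name.

```shell
# Hypothetical helper: drop a backslash only when it precedes $, ` or \
# (mimics just the unescaping step, not the shell's real parser)
backtick_unescape() {
    printf '%s\n' "$1" | sed 's/\\\([$`\\]\)/\1/g'
}

backtick_unescape 'Hello\\ World'   # prints Hello\ World
backtick_unescape 'Hello\ World'    # prints Hello\ World (unchanged)
backtick_unescape '\\\*'            # prints \\*
backtick_unescape 'echo \\*'        # prints echo \*
```

The arguments are single-quoted so that the helper sees exactly the characters that would sit between the backticks.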
So to evaluate echo `echo \\*`:
Determine the span of the string, here echo \\*
Unescape it according to the backtick quoting rules: echo \*
Evaluate it as a command, which runs echo to output a literal *
Since the result of the substitution is unquoted, the output will undergo:
Word splitting: * becomes * (since it's just one word)
Pathname expansion on each of the words, so * becomes bin Desktop Downloads Photos public_html according to files in the current directory
Note in particular that this was not the same as replacing the backtick command with its output and rerunning the result. For example, we did not consider escapes, quotes and expansions in the output, which a simple text-based macro expansion would have.
Pass each of these as arguments to the next command (also echo): echo bin Desktop Downloads Photos public_html
The result is a list of files in the current directory.
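To try the walkthrough without depending on whatever is in your current directory, here is a small sketch that runs in a scratch directory. The file names a.txt and b.txt are arbitrary, and set -- is used only to capture the resulting words.

```shell
# Scratch directory with known contents (names are arbitrary)
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a.txt b.txt

# Plain word: the escaped backslash stays, and no file name starts with a
# backslash, so the pattern is left as the two characters \*
set -- \\*
plain="$*"

# Backtick form: unescaped to echo \*, which outputs *, which is then globbed
set -- `echo \\*`
nested="$*"

printf '%s\n' "$plain" "$nested"   # prints \* and then a.txt b.txt

cd / && rm -rf "$dir"
```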

Deeply nesting commands

Why is the escaping syntax in bash so confusing when commands are deeply nested? I know there is an alternative approach, $(), for nesting commands; I am just curious why nesting with backticks works the way it does.
For example:
echo `echo \`echo \\\`echo inside\\\`\``
Gives output: inside
But
echo `echo \`echo \\`echo inside\\`\``
Fails with,
bash: command substitution: line 1: unexpected EOF while looking for matching ``'
bash: command substitution: line 2: syntax error: unexpected end of file
bash: command substitution: line 1: unexpected EOF while looking for matching ``'
bash: command substitution: line 2: syntax error: unexpected end of file
echo inside\
My question is why the number of backslashes required for second-level nesting is 3 and not 2. In the example above, one backslash is used for one level deep and three are used for the second level of nesting to preserve the literal meaning of the backtick.
The basic problem is that there's no distinction between an open-backtick and a close-backtick. So if the shell sees something like this:
somecommand ` something1 ` something2 ` something3 `
...there's no intrinsic way to tell if that's two separate backticked commands (something1 and something3), with a literal string ("something2") in between; or a nested backtick expression, with something2 being run first and its output passed to something1 as an argument (along with the literal string "something3"). In order to avoid ambiguity, the shell syntax picks the first interpretation, and requires that if you want the second interpretation you need to escape the inner level of backticks:
somecommand ` something1 ` something2 ` something3 ` # Two separate expansions
somecommand ` something1 \` something2 \` something3 ` # Nested expansions
And that means adding another level of parsing-and-removing escapes, which means you need to escape any escapes you didn't want parsed at that point, and the whole thing gets quickly out of hand.
The $( ) syntax, on the other hand, is not ambiguous, because the opening and closing markers are not the same. Compare the two possibilities:
somecommand $( something1 ) something2 $( something3 ) # Two separate expansions
somecommand $( something1 $( something2 ) something3 ) # Nested expansions
There's no ambiguity there, so no need for escapes or other syntactic weirdness.
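As a quick side-by-side, here is the three-level nesting from the question written both ways. The variable names are only for illustration; both assignments should end up holding the same word.

```shell
# Three levels with backticks: 1 escape at the second level, 3 at the third
with_backticks=`echo \`echo \\\`echo inside\\\`\``

# The same three levels with $( ): no escapes needed at any depth
with_dollar=$(echo $(echo $(echo inside)))

printf '%s\n' "$with_backticks" "$with_dollar"   # prints inside twice
```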
The reason the number of escapes grows so fast with the number of levels is again to avoid ambiguity. And it's not something specific to command expansions with backticks; this escape inflation shows up anytime you have a string going through multiple levels of parsing, each of which applies (and removes) escapes.
Suppose the shell runs across two escapes and a backtick (\\`) as it parses a line. Should it parse that as a doubly-escaped backtick, or a singly-escaped escape (backslash) character followed by a not-escaped-at-all backtick? If it runs across three escapes and a backtick (\\\`), is that a triply-escaped backtick, a doubly-escaped escape followed by a not-escaped-at-all backtick, or a singly-escaped escape followed by a singly-escaped backtick?
The shell (like most things that deal with escapes) avoids the ambiguity by not treating stacked escapes as a special thing. When it runs into an escape character, that applies only to the thing immediately after it; if the thing immediately after it is another escape, then it escapes that one character and has no effect on whatever's after it. Thus \\` is an escaped escape, followed by a not-escaped-at-all backtick. That means you can't just add another escape to the front, you have to add an escape in front of each and every escape-worthy character in the string (including escapes from lower levels).
So, let's start with a simple backtick, and work through escaping it to various levels:
First level is easy, just escape it: \`.
For the second level, we have to escape that escape (\\) and then separately escape the backtick itself (\`), giving a total of three escapes: \\\`.
For the third level, we have to individually escape each of those three escapes (so 3x\\) and once again escape the backtick itself (\`), giving a total of seven escapes: \\\\\\\`.
It continues like that, more than doubling the number of escapes for each level. From 7 it goes to 15, then 31, then 63, then... There's a good reason people try to avoid situations with deeply nested escapes.
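The growth is easy to reproduce mechanically. The sketch below uses sed to add one level at a time by putting a backslash in front of every backslash and every backtick; add_level is a made-up helper name, and this only models the counting argument above, not the shell's parser.

```shell
# Hypothetical helper: add one level of escaping to a string
add_level() {
    printf '%s\n' "$1" | sed 's/[\\`]/\\&/g'
}

s='`'
for level in 1 2 3 4; do
    s=$(add_level "$s")
    # the string is all backslashes plus one backtick, hence length - 1
    printf 'level %s: %s backslashes\n' "$level" "$(( ${#s} - 1 ))"
done
```

This prints 1, 3, 7 and 15 for the four levels, i.e. 2^n - 1 escapes at level n.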
Oh, and as I mentioned, the shell isn't the only thing that does this, and that can complicate matters because different levels can have different escaping syntaxes, and some things may not need escaping at some of the levels. For example, suppose the thing being escaped is the regular expression \s. To add a level to that, you'd only need one additional escape (\\s) because the "s" doesn't need to be escaped by itself. Additional levels of escaping on that would give 4, 8, 16, 32 etc escapes.
TLDR; Yo, dawg, I heard you like escapes...
P.S. You can use the shell's -v option to make it print commands before executing them. With nested commands like this, it'll print each of the commands as it un-nests them, so you can watch the stack of escaped escapes collapse as the layers get stripped off:
$ set -v
$ echo "this is `echo "a literal \`echo "backtick: \\\\\\\`" \`" `"
echo "this is `echo "a literal \`echo "backtick: \\\\\\\`" \`" `"
echo "a literal `echo "backtick: \\\`" `"
echo "backtick: \`"
this is a literal backtick: `
(For even more fun, try this after set -vx -- the -x option will print the commands after parsing, so after you see it drill into the nested commands, you'll then see what happens as it unwinds back out to the final top-level command.)
There is nothing confusing per se in the syntax that you have shown. You just need to break down each of the levels one by one.
The GNU bash man page says
When the old-style backquote form of substitution is used, backslash retains its literal meaning except when followed by $, `, or \.
Command substitutions may be nested. To nest when using the backquoted form, escape the inner backquotes with backslashes.
So with that in context, the nested substitution has one \ to escape the backquote and one more to escape the escape character (per the quote above, \ acts as an escape only when followed by $, `, or \). That is the reason the second level of escaping needs two additional backslashes to escape the original character:
echo `echo \`echo \\\`echo inside\\\`\``
#                 ^^^^           ^^^^
becomes
echo `echo \`echo inside\``
#          ^^           ^^
which in turn becomes
echo `echo inside`
#    ^           ^
which eventually becomes
echo inside

Characters \$ in echo

I need to have the text "\$CONDITIONS" in a string. I tried using:
> echo "\$CONDITIONS"
$CONDITIONS
> echo "\\$CONDITIONS"
\
Could you help me? What should I enter in echo command to get
\$CONDITIONS
as a result?
echo doesn't do anything except print exactly the string you pass in. The trick is to know enough about the shell to be able to pass in the string you want.
If you don't need the shell to perform substitutions on the value, simply use single quotes instead.
echo '\$CONDITIONS'
If you absolutely need to use double quotes, you can still single-quote individual parts of the string. Single quotes adjacent to double quotes will get pasted together into a single string before the shell passes it on.
echo '\$'"CONDITIONS"
Good old echo is slightly tired; you might also want to consider printf which is somewhat more versatile.
printf "\x5c\x24CONDITIONS\n"
(I'd normally use single quotes here as well; the double quotes are just to demonstrate that this works even with double quotes. But be careful with the backslashes; these happen to work even with single backslashes, but often they will need to be doubled if you want literal backslashes inside double quotes.)
To review what happened in your failed attempts,
echo "\$CONDITIONS" # produces $CONDITIONS
the backslash properly escapes the dollar sign from the shell, and is removed as part of the process. So you are saying, a literal dollar sign, and the text CONDITIONS.
echo "\\$CONDITIONS" # produces \
Here, the backslash similarly escapes the backslash, and the shell expands the variable $CONDITIONS which is unset or empty.
echo "\\\$CONDITIONS"
Well, this works, but it's ugly. There is a backslash-escaped backslash, and a backslash-escaped dollar sign, and the text CONDITIONS.
Backslashes and dollar signs (and backticks `) don't get processed inside single quotes, so that's what you should usually use if your string contains any of these (and more generally, if you don't specifically require the shell to handle these constructs).
Backslashes are kind of tricky inside double quotes. The shell will remove the ones it processes (so \$ gets turned into just $) but retain the ones it doesn't actually do anything with (so \x is preserved as \x inside double quotes). Without quotes, the behavior is different again. (Not even going into that rabbithole. Just use quotes.)
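Here are the variants from this answer side by side, as a sketch you can paste into a shell. It uses printf '%s\n' so the printing command itself adds no interpretation, and it assumes CONDITIONS is unset, as in the question; the variable names on the left are only for illustration.

```shell
unset CONDITIONS    # assumed unset or empty, as in the question

single=$(printf '%s\n' '\$CONDITIONS')       # \$CONDITIONS
mixed=$(printf '%s\n' '\$'"CONDITIONS")      # \$CONDITIONS
escaped=$(printf '%s\n' "\\\$CONDITIONS")    # \$CONDITIONS
expanded=$(printf '%s\n' "\\$CONDITIONS")    # \ (the variable expands to nothing)
kept=$(printf '%s\n' "\x")                   # \x (backslash retained in double quotes)

printf '%s\n' "$single" "$mixed" "$escaped" "$expanded" "$kept"
```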
Use \\\ to achieve this.
#!/bin/bash
echo "\\\$CONDITIONS" # prints \$CONDITIONS

How to escape characters from a single command?

How do I escape characters in linux using the sed command?
I want to print something like this
echo hey$ya
But I'm just receiving a
hey
how can escape the $ character?
The reason you are only seeing "hey" echoed is that, because of the $, the shell tries to expand a variable called ya. Since no such variable exists, it expands to an empty string (basically it disappears).
You can use single quotes, they prevent variable expansion :
echo 'hey$ya'
You can also escape the character :
echo hey\$ya
Strings can also be enclosed in double quotes (e.g. echo "hey$ya"), but these do not prevent expansion, all they do is keep the whole expression as a single string instead of allowing word splitting to separate words in separate arguments for the command being executed. Using double quotes would not work in your case.
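A short sketch comparing the four forms, assuming the variable ya is unset as in the question. The variable names on the left are only for illustration.

```shell
unset ya    # assumed unset, as in the question

bare=$(printf '%s\n' hey$ya)        # hey    (ya expands to nothing)
double=$(printf '%s\n' "hey$ya")    # hey    (double quotes still expand)
single=$(printf '%s\n' 'hey$ya')    # hey$ya (single quotes prevent expansion)
escaped=$(printf '%s\n' hey\$ya)    # hey$ya (backslash escapes the $)

printf '%s\n' "$bare" "$double" "$single" "$escaped"
```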
\ is the escape character. So your example would be:
~ » echo hey\$ya
hey$ya
~ »

Confusing Shell Quotation

I'm not a newbie to the shell, but I still got confused by some not-so-complex quotation problems. I guess I must be misunderstanding something.
a: echo 'Don\'t quote me // Don quote me
b: echo Don'\t' quote me // Don quote me
c: echo Don\t quote me // Dont quote me
d: echo Don"\t" qoute me // Don quote me
Three of the results above go quite against my intuition. Don't single quotes '...' literally return what is quoted? What I thought is:
For a: in single quoted 'Don\', \ is nothing but a common character. So a) should be Don\t quote me.
For b: like a), '\t' suppressed the special meaning of \t, so I thought b) should be Don\t quote me too.
For c: I understand why c works, but don't understand the diff between a&b and c.
For d: no difference between ' and "?
Probably I misunderstand how the shell parses and executes the command line.
Problem solved by using /bin/echo instead of the (built-in) echo on Mac. The latter interprets backslashes.
As per bash:
the first one should return Don\t quote me
the second should return the same as the first one
the third should return Dont quote me
the last one should return Don\t qoute me
Why:
in the first one, you escaped the Don\t by putting it inside single quotes
in the second, you escaped only the \t
in the third, nothing stays escaped, because an unquoted \t means "print the character after \ as is"
in the last one, double quotes leave the backslash in \t untouched
Your understanding of shell quoting is correct, but it appears that echo on OSX is a shell builtin which interprets backslash escapes. This behavior can be turned off by executing shopt -u xpg_echo.
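If you want output that does not depend on which echo you happen to get, printf lets you choose explicitly whether backslash escapes are interpreted. A minimal sketch, with illustrative variable names: %s prints the argument as is, while %b interprets escapes such as \t.

```shell
literal=$(printf '%s\n' 'Don\t quote me')    # backslash and t printed as is
expanded=$(printf '%b\n' 'Don\t quote me')   # \t turned into a tab character

printf '%s\n' "$literal" "$expanded"
```

The first line of output is Don\t quote me; the second has a real tab after Don.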
See here for more information:
How can I escape shell arguments in AppleScript?

Resources