How are backslashes interpreted outside quotation? - bash

I'd expect echo -E \n and echo -E "\n" to be equivalent. However, echo -E \n prints n (the backslash is not printed), whereas echo -E "\n" prints \n (the backslash is printed). Apparently backslashes are interpreted differently not only between single and double quotes, but also between double quotes and no quotes at all. How are backslashes interpreted outside of quotes?

Backslashes are always removed, unless they're themselves escaped (with another backslash) or inside single- or double-quotes. From the POSIX standard for shell syntax, section 2.2.1 "Escape Character (Backslash)":
A <backslash> that is not quoted shall preserve the literal value of the following character, with the exception of a <newline>. If a <newline> follows the <backslash>, the shell shall interpret this as line continuation. The <backslash> and <newline> shall be removed before splitting the input into tokens. Since the escaped <newline> is removed entirely from the input and is not replaced by any white space, it cannot serve as a token separator.
...so outside of quotes, the shell interprets \n as a literal n.
Inside double-quotes, on the other hand (section 2.2.3 "Double-Quotes"):
\
The <backslash> shall retain its special meaning as an escape character (see Escape Character (Backslash)) only when followed by one of the following characters when considered special:
$ ` " \ <newline>
...since in "\n", the backslash is not followed by one of those characters, it doesn't retain its special meaning and is just passed through as a literal backslash.
BTW, just to add confusion, some versions of echo will do their own backslash-interpretation (on any that make it past the shell's parsing process), using yet another set of rules. In some versions, -E disables this... but some just print "-E" as part of their output. If you want predictability, use printf instead:
printf '%s\n' \n # Prints just 'n'
printf '%s\n' "\n" # Prints '\n'

Outside of quotes, unescaped backslashes are always deleted. They are only used to disable the special meaning of other symbols.
Inside double quotes, backslashes are kept, except when escaping one of $, `, ", \, or marking a line continuation.
From POSIX Shell Command Language, emphasis mine:
2.2.1 Escape Character (Backslash)
A <backslash> that is not quoted shall preserve the literal value of the following character, with the exception of a <newline>. If a <newline> follows the <backslash>, the shell shall interpret this as line continuation. The <backslash> and <newline> shall be removed before splitting the input into tokens. [....]
2.2.3 Double-Quotes
Enclosing characters in double-quotes ( "" ) shall preserve the literal value of all characters within the double-quotes, with the exception of the characters backquote, <dollar-sign>, and <backslash>, as follows:
[...]
\
The <backslash> shall retain its special meaning as an escape character (see Escape Character (Backslash)) only when followed by one of the following characters when considered special:
$ ` " \ <newline>
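As a rough illustration of the two rules quoted above, here is a small sketch that stores the same kind of input under each form of quoting and prints it with printf, so that no echo-specific processing gets in the way. The variable names are made up for the example.

```bash
# Unquoted: the backslash is removed and the next character is kept literally.
unquoted=\n
# Double-quoted: the backslash stays, because n is not one of $ ` " \ <newline>.
double_other="\n"
# Double-quoted: the backslash is removed in front of $ ` " and \.
double_special="\$ \` \" \\"

printf '%s\n' "$unquoted"        # n
printf '%s\n' "$double_other"    # \n
printf '%s\n' "$double_special"  # $ ` " \
```

The last line is the only case where double quotes remove backslashes: each one sits in front of a character from the list in 2.2.3.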

Related

How does escape character work in bash quote?

Reading through Escape Character and Quotes sections
For the escape character "\", it mentions that
It preserves the literal value of the next character that follows
For single quotes
' preserves the literal value of each character within the quotes
echo 'a\\nb'
> a\nb
echo 'a\\\nb'
> a\
> b
Questions are:
Why is \ not interpreted with its literal meaning here? Instead, two backslashes are interpreted as one \.
2.1 Are all the sequences in the ANSI C standard treated as a single character?
2.2 But the following example seems to contradict the above assumption.
echo '\"'
> \" # instead of printing ", it prints \"
The ANSI-C escape sequences are interpreted by the shell only if you wrap them in $'...', and not otherwise.
In the case of 'a\\nb', without ANSI-C quoting, the string is treated literally as the characters a, two backslashes, n and b. Only within the ANSI-C quoting syntax are backslash sequences special: in $'a\\nb', the \\ is itself an escape sequence for a single backslash, so the n that follows is no longer part of a \n sequence and the result is the literal text a\nb.
echo 'a\\nb'
a\\nb
echo $'a\\nb'
a\nb
The case of 'a\\\nb' is the same: without ANSI-C quoting the content is kept literally, but inside $'...' the sequence a\\\nb is interpreted by the shell as one escaped backslash (\\) followed by one \n, so it expands to
echo $'a\\\nb'
a\
b
Yes, each ANSI-C escape sequence is treated as a single character when the shell expands it.
And there is no escape sequence expansion involved in '\"', with the presence of single quotes the content within is preserved literally
echo '\"'
\"
echo '"'
"
Note that with ANSI quoting for the above case the $'..' is expanded, expanding any potential escape sequences, so \" expands to just "
echo $'\"'
"
Reference - ANSI-C Quoting
Between the ' ' quotes there is no interpretation of escape sequences. Write
"\""
and you get the expected behavior.
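To make the difference between '...' and $'...' visible without relying on how a particular echo behaves, a minimal sketch like the following counts the characters in each string. The variable names are only for illustration, and printf '%s' is used so that nothing after the shell touches the backslashes.

```bash
# Plain single quotes: every character is literal, so this is a \ \ n b (5 characters).
plain='a\\nb'
# ANSI-C quoting: \\ collapses to one backslash, leaving a \ n b (4 characters).
ansi=$'a\\nb'
# ANSI-C quoting: \\ becomes a backslash and \n becomes a newline (4 characters).
ansi_newline=$'a\\\nb'

printf '%s has %d characters\n' "$plain" "${#plain}"
printf '%s has %d characters\n' "$ansi" "${#ansi}"
printf '%d characters, one of them a newline\n' "${#ansi_newline}"
```

If your echo prints a\nb for the single-quoted version, that extra processing is probably done by echo itself and not by the shell's quoting rules.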

Unexpected strings escape in process argv

Got kinda surprised with:
$ node -p 'process.argv' $SHELL '$SHELL' \t '\t' '\\t'
[ 'node', '/bin/bash', '$SHELL', 't', '\\t', '\\\\t' ]
$ python -c 'import sys; print sys.argv' $SHELL '$SHELL' \t '\t' '\\t'
['-c', '/bin/bash', '$SHELL', 't', '\\t', '\\\\t']
Expected the same behavior as with:
$ echo $SHELL '$SHELL' \t '\t' '\\t'
/bin/bash $SHELL t \t \\t
Which is how I need the stuff to be passed in.
Why the extra escape with '\t', '\\t' in process argv? Why handled differently than '$SHELL'? Where's this actually coming from? Why different from the echo behavior?
First I thought this to be some extras on the minimist part, but then got the same with both bare Node.js and Python. Might be missing something obvious here.
Use $'...' form to pass escape sequences like \t, \n, \r, \0 etc in BASH:
python -c 'import sys; print sys.argv' $SHELL '$SHELL' \t $'\t' $'\\t'
['-c', '/bin/bash', '$SHELL', 't', '\t', '\\t']
As per man bash:
Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard. Backslash escape sequences, if present, are decoded as follows:
\a alert (bell)
\b backspace
\e
\E an escape character
\f form feed
\n new line
\r carriage return
\t horizontal tab
\v vertical tab
\\ backslash
\' single quote
\" double quote
\nnn the eight-bit character whose value is the octal value nnn (one to three digits)
\xHH the eight-bit character whose value is the hexadecimal value HH (one or two hex digits)
\uHHHH the Unicode (ISO/IEC 10646) character whose value is the hexadecimal value HHHH (one to four hex digits)
\UHHHHHHHH the Unicode (ISO/IEC 10646) character whose value is the hexadecimal value HHHHHHHH (one to eight hex digits)
\cx a control-x character
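A quick way to convince yourself that several entries in this table produce the same byte is to compare them directly. This is only a sketch covering a few of the listed escapes; it assumes bash, since $'...' is not available in every sh.

```bash
# Hex 41 and octal 101 both spell the letter A.
[ $'\x41\101' = 'AA' ] && printf '%s\n' 'hex 41 and octal 101 are both A'
# The named escape \t is the same byte as hex 09.
[ $'\t' = $'\x09' ] && printf '%s\n' '\t is byte 0x09'
# Control-J is the newline character.
[ $'\cJ' = $'\n' ] && printf '%s\n' 'control-J is a newline'
# \e is the ESC character, octal 033.
[ $'\e' = $'\033' ] && printf '%s\n' '\e is ESC (octal 033)'
```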
In both python and node.js, there is a difference between the way print works with scalar strings and the way it works with collections.
Strings are printed simply as a sequence of characters. The resulting output is generally what the user expects to see, but it cannot be used as the representation of the string in the language. But when a list/array is printed out, what you get is a valid list/array literal, which can be used in a program.
For example, in python:
>>> print("x")
x
>>> print(["x"])
['x']
When printing the string, you just see the characters. But when printing the list containing the string, python adds quote characters, so that the output is a valid list literal. Similarly, it would add backslashes, if necessary:
>>> print("\\")
\
>>> print(["\\"])
['\\']
node.js works in exactly the same way:
$ node -p '"\\"'
\
$ node -p '["\\"]'
[ '\\' ]
When you print the string containing a single backslash, you just get a single backslash. But when you print a list/array containing a string consisting of a single backslash, you get a quoted string in which the backslash is escaped with a backslash, allowing it to be used as a literal in a program.
As with the printing of strings in node and python, the standard echo shell utility just prints the actual characters in the string. In a standard shell, there is no mechanism similar to node and python printing of arrays. Bash, however, does provide a mechanism for printing out the value of a variable in a format which could be used as part of a bash program:
$ quote=\"
# $quote is a single character:
$ echo "${#quote}"
1
# $quote prints out as a single double-quote character, as you would expect
$ echo "$quote"
"
# If you needed a representation, use the 'declare' builtin:
$ declare -p quote
declare -- quote="\""
# You can also use the "%q" printf format (a bash extension)
$ printf "%q\n" "$quote"
\"
(References: bash manual on declare and printf. Or type help declare and help printf in a bash session.)
That's not the full story, though. It is also important to understand how the shell interprets what you type. In other words, when you write
some_utility \" "\"" '\"'
What does some_utility actually see in the argv array?
In most contexts in a standard shell (including bash), C-style escape sequences like \t are not interpreted as such. (The standard shell utility printf does interpret these sequences when they appear in a format string, and some other standard utilities also interpret the sequences, but the shell itself does not.) The handling of backslash by a standard shell depends on the context:
Unquoted strings: the backslash quotes the following character, whatever it is (unless it is a newline, in which case both the backslash and the newline are removed from the input).
Double-quoted strings: backslash can be used to escape the characters $, \, ", `; also, a backslash followed by a newline is removed from the input, as in an unquoted string. In bash, if history expansion is enabled (as it is by default in interactive shells), backslash can also be used to avoid history expansion of !, but the backslash is retained in the final string.
Single-quoted strings: backslash is treated as a normal character. (As a result, there is no way to include a single quote in a single-quoted string.)
Bash adds two more quoting mechanisms:
C-style quoting, $'...'. If a single-quoted string is preceded by a dollar sign, then C-style escape sequences inside the string are interpreted in roughly the same way a C compiler would. This includes the standard whitespace characters such as newline (\n), octal, hexadecimal and Unicode escapes (\010, \x0a, \u000A, \U0000000A), plus a few non-C sequences including "control" characters (\cJ) and the ESC character \e or \E (the same as \x1b). Backslashes can also be used to escape \, ' and ". (Note that this is a different list from the list of backslashable characters in double-quoted strings; here, a backslash before a dollar sign or a backtick is not special, while a backslash before a single quote is special; moreover, the backslash-newline sequence is not interpreted.)
Locale-specific Translation: $"...". If a double-quoted string is preceded by a dollar sign, backslashes (and variable expansions and command substitutions) are interpreted as with a normal double-quoted string, and then the string is looked up in a message catalog determined by the current locale.
(References: Posix standard, Bash manual.)
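To see what a program actually receives in argv under each of these contexts, without any language-level repr getting in the way, a small helper along these lines can be used. show_args is a hypothetical name, not a standard utility; it just prints every argument in brackets.

```bash
# Hypothetical helper: print each argument it receives on its own line, in brackets.
show_args() {
  printf '[%s]\n' "$@"
}

# Unquoted \t       -> [t]       (backslash quotes the t and is removed)
# "\t" and '\t'     -> [\t]      (backslash kept, two characters)
# $'\t'             -> a real tab character between the brackets
# "\$HOME"          -> [$HOME]   (backslash removed, no expansion)
# '\$HOME'          -> [\$HOME]  (everything literal)
show_args \t "\t" '\t' $'\t' "\$HOME" '\$HOME'
```

Only the $'\t' form hands the program a one-character tab; the single- and double-quoted forms both pass the two characters backslash and t.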

using xautomation in shell file

I am trying to input something on terminal after reading from file. I am using xautomation for this, but I am not sure how to enter a variable in xautomation. PFB my code -
FILENAME="sample.txt"
#set -vx
QUOTES=\'
cat $FILENAME | while read LINE
do
set -vx
# echo $QUOTES$STR$LINE$QUOTES
xte 'sleep 1' \'$LINE\' 'key Return'
#read the output and put in the output text file
done
EDIT: but on terminal it gives output as -
+ read LINE
+ set -vx
+ xte 'sleep 1' ''\''str' 'pwd'\''' 'key Return'
Unknown command ''str'
Unknown command 'pwd''
EDIT2
1 command - xte 'sleep 1' 'str pwd' 'key Return' This will give output of pwd command. so while running any code I need to put quotes around it.
Please let me know if I am doing it wrongly, I am new to shell programming.
thanks
xte 'sleep 1' "${LINE}" 'key Return'
You have to use double quotes to preserve the variable and use the proper format for calling variables in strings.
Quoting the GNU Bash reference:
Enclosing characters in double quotes (‘"’) preserves the literal value of all characters within the quotes, with the exception of ‘$’, ‘`’, ‘\’, and, when history expansion is enabled, ‘!’. When the shell is in POSIX mode (see Bash POSIX Mode), the ‘!’ has no special meaning within double quotes, even when history expansion is enabled. The characters ‘$’ and ‘`’ retain their special meaning within double quotes (see Shell Expansions). The backslash retains its special meaning only when followed by one of the following characters: ‘$’, ‘`’, ‘"’, ‘\’, or newline. Within double quotes, backslashes that are followed by one of these characters are removed. Backslashes preceding characters without a special meaning are left unmodified. A double quote may be quoted within double quotes by preceding it with a backslash. If enabled, history expansion will be performed unless an ‘!’ appearing in double quotes is escaped using a backslash. The backslash preceding the ‘!’ is not removed.
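Here is a rough sketch of the whole loop with the double-quoted variable. It assumes the input file holds one xte command per line (for example str pwd), which is what the trace in the question suggests. fake_xte is a stand-in that only prints its arguments, so the sketch runs without xautomation installed; swap in the real xte once the arguments look right.

```bash
# Stand-in for xte so the sketch runs anywhere; it only prints the arguments it gets.
fake_xte() {
  printf '<%s>' "$@"
  printf '\n'
}

# Hypothetical input file with one xte command per line.
commands_file=$(mktemp)
printf '%s\n' 'str pwd' 'str ls -l' > "$commands_file"

# IFS= and -r keep leading spaces and backslashes in each line intact.
while IFS= read -r line; do
  # "$line" is passed as ONE argument; \'$line\' would be split into several words.
  fake_xte 'sleep 1' "$line" 'key Return'
done < "$commands_file"

rm -f "$commands_file"
```

Each iteration prints three bracketed arguments, which is the same shape as the hand-typed xte 'sleep 1' 'str pwd' 'key Return' that worked in the question.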

bash: assign punctuation to variable

when I assign like this:
rmall="\,\.\?\!\:\;\(\)\[\]\{\}\"\'"
then echo $rmall, I got this:
\,\.\?\!\:\;\(\)\[\]\{\}\"\'
But what I want is only the following. How can I do that?
,.?!:;()[]{}"'
as later I need to remove those.
Thank you
You are quoting twice by using both quotes and backslashes. Use one or the other.
Note: You will always need a backslash to escape your quote character, but otherwise it is not needed.
Inside double quotes, only a few escape sequences are treated specially (the full list also includes \` and backslash-newline). The three you will need most often are:
\" is replaced by a literal "
\$ is replaced by a literal $
\\ is replaced by a literal \
These three are required to allow the literal character in contexts where they would normally produce special behavior. \", obviously, lets you include a double-quote inside a double-quoted string. \$ lets you output a literal dollar sign where it would otherwise trigger parameter substitution:
bash $ foo=5; echo "\$foo = $foo"
$foo = 5
\\ lets you output a literal backslash that precedes a parameter substitution or at the end of a string.
bash $ foo=5; echo "\\$foo"
\5
bash $ echo "Use a \\"
Use a \
A backslash followed by any other character is treated literally:
bash $ echo "\x"
\x
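Putting this together for the punctuation list in the question, one possible sketch is below. It uses single quotes for everything except the single quote itself, then strips those characters from a sample string with bash parameter expansion. The sample text and the loop are my own illustration, not something from the question.

```bash
# Single quotes keep everything literal; only the single quote itself needs \' outside them.
rmall=',.?!:;()[]{}"'\'
printf '%s\n' "$rmall"   # ,.?!:;()[]{}"'

# Hypothetical sample text to clean up.
text='Hello, "world"! (really?) [yes]; {ok}: it'\''s fine.'

cleaned=$text
for ((i = 0; i < ${#rmall}; i++)); do
  c=${rmall:i:1}
  # Quoting "$c" makes the pattern literal, so ? and [ are not treated as globs.
  cleaned=${cleaned//"$c"/}
done
printf '%s\n' "$cleaned"   # Hello world really yes ok its fine
```

Removing one character at a time avoids having to build a bracket expression that contains ] and [ itself.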

print double quotes in shell programming

I want to print double quotes using echo statement in shell programming.
Example:
echo "$1,$2,$3,$4";
prints xyz,123,abc,pqrs
How to print "xyz","123","abc","pqrs";
I tried placing double quotes in the echo statement, but they are not printed.
You just have to quote them:
echo "\"$1\",\"$2\",\"$3\",\"$4\""
As noted here:
Enclosing characters in double quotes (‘"’) preserves the literal
value of all characters within the quotes, with the exception of ‘$’,
‘`’, ‘\’, and, when history expansion is enabled, ‘!’. The characters
‘$’ and ‘`’ retain their special meaning within double quotes (see
Shell Expansions). The backslash retains its special meaning only when
followed by one of the following characters: ‘$’, ‘`’, ‘"’, ‘\’, or
newline. Within double quotes, backslashes that are followed by one of
these characters are removed. Backslashes preceding characters without
a special meaning are left unmodified. A double quote may be quoted
within double quotes by preceding it with a backslash. If enabled,
history expansion will be performed unless an ‘!’ appearing in double
quotes is escaped using a backslash. The backslash preceding the ‘!’
is not removed.
The special parameters ‘*’ and ‘@’ have special meaning when in double
quotes (see Shell Parameter Expansion).
Use printf, no escaping is required:
printf '"%s","%s","%s","%s";\n' "$1" "$2" "$3" "$4"
and the trailing ; gets printed too!
You should escape the " to make it visible in the output, you can do this :
echo \""$1"\",\""$2"\",\""$3"\",\""$4"\"
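If the number of arguments is not fixed at four, the same idea can be wrapped in a small function. quote_csv is a hypothetical name; this is a sketch that assumes the values themselves contain no double quotes, since it does no escaping of embedded quotes.

```bash
# Hypothetical helper: wrap every argument in double quotes and join with commas.
quote_csv() {
  local out='' arg
  for arg in "$@"; do
    # \" inside double quotes yields a literal double quote.
    out="$out\"$arg\","
  done
  # Strip the trailing comma before printing.
  printf '%s\n' "${out%,}"
}

quote_csv xyz 123 abc pqrs   # "xyz","123","abc","pqrs"
```

Inside a script you would call it as quote_csv "$@" or quote_csv "$1" "$2" "$3" "$4".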

Resources