escape character usage in bash, what do we actually escape?

escape character usage in bash, what do we actually escape? - bash

I am reading the bash manual, found the escape character definition pretty surprising, instead of modifying the meaning of every character follows it (as in Java / C):
It preserves the literal value of the next character that follows
Does it mean in bash, we only use it to escape special meaning character like ', ", \, $
And other cases, like \t\e\s\t actually is exactly as test ? I verified that
echo test
echo \t\e\s\t
outputs same result.

Does it mean in bash, we only use it to escape special meaning character like ', ", \, $
Yes. Also, e.g. newline:
echo foo
bar
# foo
# -bash: bar: command not found
echo foo \
bar
# foo
# bar
And other cases, like "\t\e\s\t" actually is exactly as "test"
If unquoted, yes. Quoted, the backslash is preserved. Some UNIX utilities do use backslash for "special meanings", but it is the utility, not bash, that gives those sequences meanings. Examples are printf, and GNU echo when given -e option:
/bin/echo \t\e\s\t
# test
/bin/echo "\t\e\s\t"
# \t\e\s\t
/bin/echo -e "\t\e\s\t" # GNU version (will not do the same thing on Mac)
# s
# (tab)(escape)s(tab)
printf "\t\e\s\t"
# s
# (tab)(escape)s(tab)
As #rici reminds me, bash can interpret C-style escape sequences itself, if you use the special quotes of the form $'...':
/bin/echo $'\t\e\s\t'
# s
Here it really is bash that does it, not echo.

Related

Prevent "echo" from interpreting backslash escapes

I'd like to echo something to a file that contains new line escape sequences, however I would like them to remain escaped. I'm looking for basically the opposite to this question.
echo "part1\npart2" >> file
I would like to look like this in the file
$ cat file
old
part1\npart2
but it looks like
$ cat file
old
part1
part2

This is a good example of why POSIX recommends using printf instead of echo (see here, under "application usage"): you don't know what you get with echo1.
You could get:
A shell builtin echo that does not interpret backslash escapes by default
Example: the Bash builtin echo has an -e option to enable backslash escape interpretation and checks the xpg_echo shell option
A shell builtin echo that interprets backslash escapes by default
Examples: zsh, dash
A standalone executable /bin/echo: probably depends on which one – GNU Coreutils echo understands the -e option, like the Bash builtin
The POSIX spec says this (emphasis mine):
The following operands shall be supported:
string
A string to be written to standard output. If the first operand is -n, or if any of the operands contain a <backslash> character, the results are implementation-defined.
So, for a portable solution, we can use printf:
printf '%s\n' 'part1\npart2' >> file
where the \n in the format string will always be interpreted, and the \n in the argument will never be interpreted, resulting in
part1\npart2
being appended to file.
1 For an exhaustive overview of various behaviours for echo and printf, see echo(1) and printf(1) on in-ulm.de.

removing backslash with tr

So Im removing special characters from filenames and replacing with spaces. I have all working apart from files with single backslashes contained therein.
Note these files are created in the Finder on OS X
old_name="testing\this\folder"
new_name=$(echo $old_name | tr '<>:\\#%|?*' ' ');
This results in new_name being "testing hisolder"
How can I just removed the backslashes and not the preceding character?

This results in new_name being "testing hisolder"
This string looks like the result of echo -e "testing\this\folder", because \t and \f are actually replaced with the tabulation and form feed control characters.
Maybe you have an alias like alias echo='echo -e', or maybe the implementation of echo in your version of the shell interprets backslash escapes:
POSIX does not require support for any options, and says that the
behavior of ‘echo’ is implementation-defined if any STRING contains a
backslash or if the first argument is ‘-n’. Portable programs can use
the ‘printf’ command if they need to omit trailing newlines or output
control characters or backslashes.
(from the info page)
So you should use printf instead of echo in new software. In particular, echo $old_name should be replaced with printf %s "$old_name".
There is a good explanation in this discussion, for instance.
No need for printf
As #mklement0 suggested, you can avoid the pipe by means of the Bash here string:
tr '<>:\\#%|?*' ' ' <<<"$old_name"

Ruslan's excellent answer explains why your command may not be working for you and offers a robust, portable solution.
tl;dr:
You probably ran your code with sh rather than bash (even though on macOS sh is Bash in disguise), or you had shell option xpg_echo explicitly turned on.
Use printf instead of echo for portability.
In Bash, with the default options and using the echo builtin, your command should work as-is (except that you should double-quote $old_name for robustness), because echo by default does not expand escape sequences such as \t in its operands.
However, Bash's echo can be made to expand control-character escape sequences:
explicitly, by executing shopt -s xpg_echo
implicitly, if you run Bash as sh or with the --posix option (which, among other options and behavior changes, activates xpg_echo)
Thus, your symptom may have been caused by running your code from a script with shebang line #!/bin/sh, for instance.
However, if you're targeting sh, i.e., if you're writing a portable script, then echo should be avoided altogether for the very reason that its behavior differs across shells and platforms - see Ruslan's printf solution.
As an aside: perhaps a more robust approach to your tr command is a whitelisting approach: stating only the characters that are explicitly allowed in your result, and excluding other with the -C option:
old_name='testing\this\folder'
new_name=$(printf '%s' "$old_name" | tr -C '[:alnum:]_-' ' ')
That way, any characters that aren't either letters, numbers, _, or - are replaced with a space.

With Bash, you can use parameter expansion:
$ old_name="testing\this\folder"
$ new_name=${old_name//[<>:\\#%|?*]/ }
$ echo $new_name
testing this folder
For more, please refer to the Bash manual on shell parameter expansion.

I think your test case is missing proper escaping for \, so you're not really testing the case of a backslash contained in a string.
This worked for me:
old_name='testing\\this\\folder'
new_name=$(echo $old_name | tr '<>:\\#%|?*' ' ');
echo $new_name
# testing this folder

How to trim leading and trailing whitespaces from a string value in a variable?

I know there is a duplicate for this question already at: How to trim whitespace from a Bash variable?.
I read all the answers there but I have a question about another solution in my mind and I want to know if this works.
This is the solution I think works.
a=$(printf "%s" $a)
Here is a demonstration.
$ a=" foo "
$ a=$(printf "%s" $a)
$ echo "$a"
foo
Is there any scenario in which this solution may fail?
If there is such a scenario in which this solution may fail, can we modify this solution to handle that scenario without compromising the simplicity of the solution too much?

If the variable a is set with something like "-e", "-n" in the begining, depending on how you process later your result, a user might crash your script:
-e option allows echo to interpret things backslashed.
Even in the case you only want to display the variable a, -n would screw your layout.
You could think about using regex to check if your variable starts with '-' and is followed by one of the available echo options (-n, -e, -E, --help, --version).

It fails when the input contains spaces between non-whitespace characters.
$ a=" foo bar "
$ a=$(printf "%s" $a)
$ echo "$a"
foobar
The expected output was the following instead.
foo bar

You could use Bash's builtin pattern substitution.
Note: Bash pattern substitution uses 'Pathname Expansion' (glob) pattern matching, not regular expressions. My solution requires enabling the optional shell behaviour extglob (shopt -s extglob).
$shopt -s extglob
$ a=" foo bar "
$ echo "Remove trailing spaces: '${a/%*([[:space:]])}'"
Remove trailing spaces: ' foo bar'
$ echo "Remove leading spaces: '${a/#*([[:space:]])}'"
Remove leading spaces: 'foo bar '
$ echo "Remove all spaces anywhere: '${a//[[:space:]]}'"
Remove all spaces anywhere: 'foobar'
For reference, refer to the 'Parameter Expansion' (Pattern substitution) and 'Pathname Expansion' subsections of the EXPANSION section of the Bash man page.

Why does \$ reduce to $ inside backquotes [though not inside $(...)]?

Going over the POSIX standard, I came across another rather technical/pointless question. It states:
Within the backquoted style of command substitution, <backslash> shall retain its literal meaning, except when followed by: '$' , '`' , or <backslash>.
It's easy to see why '`' and '\' lose their literal meanings: nested command substitution demands a "different" backquote inside the command substitution, which in turn forces '\' to lose its literal meaning. So, for instance, the following different behavior seems reasonable:
$ echo $(echo \\\\)
\\
$ echo `echo \\\\`
\
But what about '$'? I.e., what's the point or, more concretely, a possible benefit of the following difference?
$ echo $(echo \$\$)
$$
$ echo `echo \$\$`
4735
As '$' by itself is not ruled out inside backquotes, it looks like you would use either '$' or '\\\$' all the time, but never the middle case '\$'.
To recap,
$ echo `echo $$` # PID, OK
4735
$ echo `echo \\\$\\\$` # literal "$$", OK
$$
$ echo `echo \$\$` # What's the point?
4735
PS: I know this question is rather technical... I myself go for the more modern $(...) substitution all the time, but I'm still curious.

By adding a \, you make the inner subshell expand it instead of the outer shell. A good example would be to actually force the starting of a new shell, like this:
$ echo $$
4988
$ echo `sh -c 'echo $$'`
4988
$ echo `sh -c 'echo \$\$'`
4990
$ echo `sh -c 'echo \\\$\\\$'`
$$

Basic Answer
Consider the following command, which finds the base directory where gcc was installed:
gcc_base=$(dirname $(dirname $(which gcc)))
With the $(...) notation, there is no problem with the parsing; it is trivial and is one of the primary reason why the notation is recommended. The equivalent command using back-ticks is:
gcc_base=`dirname \`dirname \\\`which gcc\\\`\``
When the shell first parses this command, it encounters the first backtick, and has to find the matching close backtick. That's when the quoted section comes into effect:
Within the backquoted style of command substitution, shall retain its literal meaning, except when followed by: '$' , '`' , or .
gcc_base=`dirname \`dirname \\\`which gcc\\\`\``
^ ^ ^ ^ ^ ^
1 2 3 4 5 6
backslash-backtick - special rule
backslash-backslash - special rule
backslash-backtick - special rule
backslash-backslash - special rule
backslash-backtick - special rule
backslash-backtick - special rule
So, the unescaped backtick at the end marks the end of the outermost backtick command. The sub-shell that processes that command sees:
dirname `dirname \`which gcc\``
The backslash-back escapes are given the special treatment again, and the sub-sub-shell sees:
dirname `which gcc`
The sub-sub-sub-shell gets to see which gcc and evaluates it (e.g. /usr/gcc/v4.6.1/bin/gcc).
The sub-sub-shell evaluates dirname /usr/gcc/v4.6.1/bin/gcc and produces /usr/gcc/v4.6.1/bin.
The sub-shell evaluates dirname /usr/gcc/v4.6.1/bin and produces /usr/gcc/v4.6.1.
The shell assigns /usr/gcc/v4.6.1 to gcc_base.
In this example, the backslashes were only followed by the special characters - backslash, backtick, dollar. A more complex example would have, for example, \" sequences in the command, and then the special rule would not apply; the \" would simply be copied through unchanged and passed to the relevant sub-shell(s).
Extraordinarily Complex Stuff
For example, suppose you had a command with a blank in its name (heaven forbid; and this shows why!) such as totally amazing (two blanks; it is a more stringent test than a single blank). Then you could write:
$ cmd="totally amazing"
$ echo "$cmd"
totally amazing
$ which "$cmd"
/Users/jleffler/bin/totally amazing
$ dirname $(which "$cmd")
usage: dirname path
$ # Oops!
$ dirname "$(which \"\$cmd\")"
"$cmd": not found
.
$ # Oops!
$ dirname "$(which \"$cmd\")"
"totally: not found
amazing": not found
.
$ dirname "$(eval which \"$cmd\")"
totally amazing: not found
.
$ dirname "$(eval which \"\$cmd\")"
/Users/jleffler/bin
$ # Ouch, but at least that worked!
$ # But how to extend that to the next level?
$ dirname "$(eval dirname \"\$\(eval which \\\"\\\$cmd\\\"\)\")"
/Users/jleffler
$
OK - well, that's the "easy" one! Do you need a better reason to avoid spaces in command names or path names? I've also demonstrated to my own satisfaction that it works correctly with pathnames that contain spaces.
So, can we compress the learning cycle for backticks? Yes...
$ cat x3.sh
cmd="totally amazing"
which "$cmd"
dirname "`which \"$cmd\"`"
dirname "`dirname \"\`which \\"\$cmd\\\"\`\"`"
$ sh -x x3.sh
+ cmd='totally amazing'
+ which 'totally amazing'
/Users/jleffler/bin/totally amazing
++ which 'totally amazing'
+ dirname '/Users/jleffler/bin/totally amazing'
/Users/jleffler/bin
+++ which 'totally amazing'
++ dirname '/Users/jleffler/bin/totally amazing'
+ dirname /Users/jleffler/bin
/Users/jleffler
$
That is still a ghastly, daunting, non-intuitive set of escape sequences. It's actually shorter than the version for $(...) notation, and doesn't use any eval commands (which always complicate things).

This probably has to do with the strange way the Bourne shell parses substitutions (the real Korn shell is slightly similar but most other shells do not exhibit the strange behaviour at all).
Basically, the Bourne shell's parser does not interpret substitutions ($ and `) inside double-quotes, or parameter substitution ($) anywhere. This is only done at expansion time. Also, in many cases unmatched quotes (single-quotes, double-quotes or backquotes) are not an error; the closing quote is assumed at the end.
One consequence is that if a parameter substitution with a word containing spaces like ${v+a b} occurs outside double-quotes, it is not parsed correctly and will cause an expansion error when executed. The space needs to be quoted. Other shells do not have this problem.
Another consequence is that double-quotes inside backquotes inside double-quotes do not work reliably. For example,
v=0; echo "`v=1; echo " $v "`echo b"
will print
1 echo b
in most shells (one command substitution), but
0 b
in the Bourne shell and the real Korn shell (ksh93) (two command substitutions).
(Ways to avoid the above issue are to assign the substitution to a variable first, so double-quotes are not necessary, or to use new-style command substitution.)
The real Korn shell (ksh93) attempts to preserve much of the strange Bourne shell behaviour but does parse substitutions at parse time. Thus, ${v+a b} is accepted but the above example has "strange" behaviour. A further strange thing is that something like
echo "`${v+pwd"
is accepted (the result is like with the missing closing brace). And where does the opening brace in the error message from
echo "`${v+pwd`"
come from?
The below session shows an obscure case where $ and \$ differ in a non-obvious way:
$ echo ${.sh.version}
Version JM 93u 2011-02-08
$ v=0; echo "`v=1; echo "${v+p q}"`echo b"
p qecho b
$ v=0; echo "`v=1; echo "\${v+p q}"`echo b"
p{ q}b

Basically, a backslash is an escape character. You put it before another character to represent something special. An 'n','t','$' and '\'are these special characters.
"\n" --> newline
"\t" --> tab (indent)
"\$" --> $ (because a $ before a word in shell denotes a variable)
"\\" --> \
The backslash before characters is only interpreted the above way when it is inside quotes.
If you want to find more info or other escape chars go here

echo outputs -e parameter in bash scripts. How can I prevent this?

I've read the man pages on echo, and it tells me that the -e parameter will allow an escaped character, such as an escaped n for newline, to have its special meaning. When I type the command
$ echo -e 'foo\nbar'
into an interactive bash shell, I get the expected output:
foo
bar
But when I use this same command (i've tried this command character for character as a test case) I get the following output:
-e foo
bar
It's as if echo is interpretting the -e as a parameter (because the newline still shows up) yet also it interprets the -e as a string to echo. What's going on here? How can I prevent the -e showing up?

You need to use #!/bin/bash as the first line in your script. If you don't, or if you use #!/bin/sh, the script will be run by the Bourne shell and its echo doesn't recognize the -e option. In general, it is recommended that all new scripts use printf instead of echo if portability is important.
In Ubuntu, sh is provided by a symlink to /bin/dash.

Different implementations of echo behave in annoyingly different ways. Some don't take options (i.e. will simply echo -e as you describe) and automatically interpret escape sequences in their parameters. Some take flags, and don't interpret escapes unless given the -e flag. Some take flags, and interpret different escape sequences depending on whether the -e flag was passed. Some will cause you to tear your hair out if you try to get them to behave in a predictable manner... oh, wait, that's all of them.
What you're probably seeing here is a difference between the version of echo built into bash vs /bin/echo or maybe vs. some other shell's builtin. This bit me when Mac OS X v10.5 shipped with a bash builtin echo that echoed flags, unlike what all my scripts expected...
In any case, there's a solution: use printf instead. It always interprets escape sequences in its first argument (the format string). The problems are that it doesn't automatically add a newline (so you have to remember do that explicitly), and it also interprets % sequences in its first argument (it is, after all, a format string). Generally, you want to put all the formatting stuff in the format string, then put variable strings in the rest of the arguments so you can control how they're interpreted by which % format you use to interpolate them into the output. Some examples:
printf "foo\nbar\n" # this does what you're trying to do in the example
printf "%s\n" "$var" # behaves like 'echo "$var"', except escapes will never be interpreted
printf "%b\n" "$var" # behaves like 'echo "$var"', except escapes will always be interpreted
printf "%b\n" "foo\nbar" # also does your example

Use
alias echo /usr/bin/echo
to force 'echo' invoking coreutils' echo which interpret '-e' parameter.

Try this:
import subprocess
def bash_command(cmd):
subprocess.Popen(['/bin/bash', '-c', cmd])
code="abcde"
// you can use echo options such as -e
bash_command('echo -e "'+code+'"')
Source: http://www.saltycrane.com/blog/2011/04/how-use-bash-shell-python-subprocess-instead-binsh/

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio