BASH scripting: when to include the backslash symbol

I am writing a BASH script and I am using the bash command. Which one of the following is correct (or are both incorrect)?
bash $pbs_dir/${module_name}.${target_ID}.${instance_ID}.pbs
or
bash \$pbs_dir/\${module_name}.\${target_ID}.\${instance_ID}.pbs

\$ will be expanded to literal $, so there is a big difference:
$ a="hello"
$ echo $a
hello
$ echo \$a
$a
Also note that you almost always want to double quote your parameter expansions to avoid word splitting and pathname expansion:
echo "$a"
So you probably want to use the following:
bash "$pbs_dir/${module_name}.${target_ID}.${instance_ID}.pbs"
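The difference between the three forms can be checked with a small sketch. The directory name (which contains a space) and the helper count_args are hypothetical, chosen only to make word splitting visible:

```shell
# Hypothetical directory name with a space; not from the original question.
pbs_dir="/tmp/my jobs"

# count_args prints the number of arguments it receives.
count_args() { echo "$#"; }

unquoted=$(count_args $pbs_dir/job.pbs)    # split on the space: 2 arguments
quoted=$(count_args "$pbs_dir/job.pbs")    # kept together: 1 argument
escaped=$(printf '%s' \$pbs_dir/job.pbs)   # backslash: the literal text $pbs_dir/job.pbs

printf '%s\n' "unquoted=$unquoted" "quoted=$quoted" "escaped=$escaped"
```

Only the double-quoted form passes the intended single path to the command.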

Related

How to understand quotes in command-substitution which surrounded by double-quotes in Bash?

$ a=33
$ echo "$(echo '$a')" # (1)
$a
$ echo "$(echo "$a")" # (2)
33
I can't understand the shell command-line parsing result above.
For command line (1), according to man bash, single quote inside the double-quotes will be parsed as literal, and $a inside the double quotes will be interpolated, so I think the result of command line (1) should be '33'.
You don't have single-quotes in double-quotes. $(....) is a command substitution. Essentially what occurs within (...) takes place within its own subshell. You must employ quoting rules within that subshell. The subshell represents its own environment.
For example:
$ a=33
$ echo $(echo "'$a'") # single-quotes within double-quotes
The quoting rules within the command substitution are applied correctly resulting in the output:
'33'
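As a minimal sketch of the same point, the three quoting combinations can be captured in variables (the variable names are only illustrative) and compared side by side:

```shell
a=33

single=$(echo '$a')       # single quotes inside $(...): no expansion
double=$(echo "$a")       # double quotes inside $(...): $a is expanded
nested=$(echo "'$a'")     # single quotes inside double quotes are ordinary characters

printf '%s\n' "$single" "$double" "$nested"
```

Wrapping the whole $(...) in outer double quotes does not change any of these results, because quoting inside the substitution is parsed independently of the quoting around it.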
Let me know if you have further questions.

Bash - difference between <<EOF and <<'EOF'

GNU Bash - 3.6.6 Here Documents
[n]<<[-]word
here-document
delimiter
If any part of word is quoted, the delimiter is the result of quote removal on word, and the lines in the here-document are not expanded. If word is unquoted, all lines of the here-document are subjected to parameter expansion, command substitution, and arithmetic expansion, the character sequence \newline is ignored, and ‘\’ must be used to quote the characters ‘\’, ‘$’, and ‘`’.
If I single-quote EOF, it works. I think this is because the invoked /bin/bash process receives the unexpanded lines and then interprets them itself.
$ /bin/bash<<'EOF'
#!/bin/bash
echo $BASH_VERSION
EOF
3.2.57(1)-release
However, the version below causes an error. I expected BASH_VERSION to be expanded so that the version of the current bash process is passed to the invoked /bin/bash process, but it does not work.
$ /bin/bash<<EOF
#!/bin/bash
echo $BASH_VERSION
EOF
/bin/bash: line 2: syntax error near unexpected token `('
/bin/bash: line 2: `echo 5.0.17(1)-release'
/bin/bash<<EOF
#!/bin/bash
echo $BASH_VERSION
EOF
As you can infer from the error message, the heredoc is being expanded to:
/bin/bash<<EOF
#!/bin/bash
echo 5.0.17(1)-release
EOF
It sounds like that's what you expect: it's being expanded to the outer shell's version. The problem isn't with the heredoc or the expansion; it's that unquoted parentheses are a syntax error. Try running just the echo command by hand and you'll get the same error:
$ echo 5.0.17(1)-release
bash: syntax error near unexpected token `('
To fix this, you could add extra quotes:
/bin/bash<<EOF
echo '$BASH_VERSION'
EOF
This will work and print the outer shell's version. I used single quotes to demonstrate that these quotes will not inhibit variable expansion. The outer shell doesn't see these quotes. Only the inner shell does.
(I also got rid of the #!/bin/bash shebang line. There's no need for it since you're explicitly invoking bash.)
However, quoting is not 100% robust. If $BASH_VERSION happened to contain single quotes you'd have a problem. The quotes make parentheses ( ) safe but they aren't foolproof. As a general technique, if you want this to be completely safe no matter what special characters are in play you'll have to jump through some ugly hoops.
Use printf '%q' to escape all special characters.
/bin/bash <<EOF
echo $(printf '%q' "$BASH_VERSION")
EOF
This will expand to echo 5.0.17\(1\)-release.
Pass it in as an environment variable and use <<'EOF' to disable interpolation inside the script.
OUTER_VERSION="$BASH_VERSION" /bin/bash <<'EOF'
echo "$OUTER_VERSION"
EOF
This would be my choice. I prefer to use the <<'EOF' form whenever possible. Having the parent shell interpolate the script being passed to a child shell can be confusing and difficult to reason about. Also, the explicit $OUTER_VERSION variable makes it crystal clear what's happening.
Use bash -c 'script' instead of a heredoc and then pass the version in as a command-line argument.
bash -c 'echo "$1"' bash "$BASH_VERSION"
I might go with this for a single-line script.
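To see why the environment-variable and argument approaches are robust, here is a sketch that passes a deliberately awkward value to a child bash. The value of tricky and the name OUTER_VALUE are hypothetical, and a bash binary on the PATH is assumed:

```shell
# Hypothetical value containing a single quote, parentheses and a dollar sign.
tricky="it's (1) \$HOME"

# Environment variable plus quoted heredoc: the child expands "$OUTER_VALUE" itself.
via_env=$(OUTER_VALUE="$tricky" bash <<'EOF'
printf '%s\n' "$OUTER_VALUE"
EOF
)

# bash -c with the value as a positional argument ($1).
via_arg=$(bash -c 'printf "%s\n" "$1"' bash "$tricky")

printf '%s\n' "$via_env" "$via_arg"
```

In both cases the value never becomes part of the script text, so no character in it can be misparsed as shell syntax.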
If you don't quote EOF, variables in the heredoc are expanded by the original shell before passing it as input to the invoked shell. So it's equivalent to executing
echo 3.2.57(1)-release
in the invoked shell. That's not valid bash syntax, so you get an error.
Quoting the word prevents variable expansion, so the invoked shell receives $BASH_VERSION literally, and expands it itself.
In the first case, the quotes prevent any changes in the here document, so the sub-shell sees echo $BASH_VERSION and it expands the string and echoes it.
In the second case, the absence of quotes means that the first shell expands the information and it sees echo 3.2.57(1)-release, and if you type that at the command line, you get the syntax error.
If you used echo "$BASH_VERSION" in both, then both would work, but different shells would expand $BASH_VERSION.
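Which shell performs the expansion can be made visible with $$, since each process has its own PID. This is only a sketch; it uses sh as the child shell and captures the output in illustrative variable names:

```shell
# Unquoted delimiter: $$ is expanded by the current shell before sh starts.
outer=$(sh <<EOF
echo "$$"
EOF
)

# Quoted delimiter: sh receives the text $$ and expands it to its own PID.
inner=$(sh <<'EOF'
echo "$$"
EOF
)

printf 'current shell: %s\nunquoted heredoc: %s\nquoted heredoc: %s\n' "$$" "$outer" "$inner"
```

The unquoted heredoc reports the PID of the current shell, while the quoted one reports the PID of the child.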

What's the difference between <<EOF and <<\EOF heredocs in shell

I have noticed several distinctions between them:
Inside a <<EOF heredoc, new values can not be assigned to variables:
bash <<EOF
s=fds
echo $s
EOF
will print an empty line, whereas
bash <<\EOF
s=fds
echo $s
EOF
will print the value of the variable s.
Global variables can be accessed within <<EOF but not within <<\EOF (with export it is possible to access variables inside <<\EOF):
s=fds
bash <<EOF
echo $s
EOF
will print the value fds, whereas
s=fds
bash <<\EOF
echo $s
EOF
will print an empty line.
So what are the differences between them and what is the legitimate documented behavior?
From the POSIX spec:
If any character in word is quoted, the delimiter shall be formed by performing quote removal on word, and the here-document lines shall not be expanded. Otherwise, the delimiter shall be the word itself.
So the <<EOF version has the shell expand all variables before running the here doc contents and the <<\EOF (or <<'EOF' or <<EO'F' etc.) versions don't expand the contents (which lets bash in this case do that work).
Try it with cat instead of bash for a clearer view on what is happening.
Also with printf '[%s]\n' "$s" and/or possibly bash -x instead of bash:
$ bash -x <<EOF
s=fds
printf '[%s]\n' "$s"
EOF
+ s=fds
+ printf '[%s]\n' ''
[]
$ bash -x <<\EOF
s=fds
printf '[%s]\n' "$s"
EOF
+ s=fds
+ printf '[%s]\n' fds
[fds]
Documentation: http://www.gnu.org/software/bash/manual/bash.html#Here-Documents
In your first example the delimiter is unquoted, so variable expansion occurs and it's like you're running the code
echo "s=fds
echo $s" | bash
which expands $s in the current shell, where it's empty. So the new shell sees
s=fds
echo
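Feeding the same heredoc to cat instead of bash shows exactly what text the new shell would receive. A minimal sketch, assuming s is unset in the current shell as in the question:

```shell
unset s   # make sure s is empty in the current shell

# Unquoted delimiter: the current shell expands $s (to nothing) before cat runs.
seen_by_child=$(cat <<EOF
s=fds
echo $s
EOF
)

printf '%s\n' "$seen_by_child"
```

The second line arrives as a bare echo followed by a space, which is why the child shell prints an empty line.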
Read the Advanced Bash Scripting Guide & bash reference manual in particular about redirections:
The format of here-documents is:
<<[-]word
here-document
delimiter
No parameter and variable expansion, command substitution, arithmetic
expansion, or filename expansion is performed on word. If any
characters in word are quoted, the delimiter is the result of quote
removal on word, and the lines in the here-document are not expanded.
If word is unquoted, all lines of the here-document are subjected to
parameter expansion, command substitution, and arithmetic expansion,
the character sequence \newline is ignored, and ‘\’ must be used to
quote the characters ‘\’, ‘$’, and ‘`’.
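The remark in the question about export can be illustrated as follows. With a quoted delimiter the child shell does the expansion, so it only sees the variable if it is in its environment. This sketch uses sh as the child and illustrative variable names:

```shell
unset s   # start clean, dropping any export attribute
s=fds

# Not exported: the child shell has no variable s, so it prints [].
not_exported=$(sh <<\EOF
echo "[$s]"
EOF
)

export s

# Exported: the child inherits s through its environment.
exported=$(sh <<\EOF
echo "[$s]"
EOF
)

printf '%s\n' "$not_exported" "$exported"
```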

escape curly braces in unix shell script

I have a string:
{2013/05/01},{2013/05/02},{2013/05/03}
I want to append a { at the beginning and a } at the end. The output should be:
{{2013/05/01},{2013/05/02},{2013/05/03}}
However, in my shell script when I concatenate the curly braces to the beginning and end of the string, the output is as follows:
{2013/05/01} {2013/05/02} {2013/05/03}
Why does this happen? How can I achieve my result? I am sure there is a simple solution to this, but I am a Unix newbie, so I would appreciate some help.
Test script:
#!/usr/bin/ksh
valid_data_range="{2013/05/01},{2013/05/02},{2013/05/03}"
finalDates="{"$valid_data_range"}"
print $finalDates
The problem is that when you have a list in braces outside quotes, the shell performs Brace Expansion (bash manual, but ksh will be similar). Since the 'outside quotes' bit is important, it also tells you how to avoid the problem — enclose the string in quotes when printing:
#!/usr/bin/ksh
valid_data_range="{2013/05/01},{2013/05/02},{2013/05/03}"
finalDates="{$valid_data_range}"
print "$finalDates"
(The print command is specific to ksh and is not present in bash. The change in the assignment line is more cosmetic than functional.)
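Also, the brace expansion would not occur in bash: bash performs brace expansion before parameter expansion, so it only occurs when the braces are written directly in the command, not when they come from a variable's value. This bilingual script (ksh and bash):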
valid_data_range="{2013/05/01},{2013/05/02},{2013/05/03}"
finalDates="{$valid_data_range}"
printf "%s\n" "$finalDates"
printf "%s\n" $finalDates
produces:
ksh
{{2013/05/01},{2013/05/02},{2013/05/03}}
{2013/05/01}
{2013/05/02}
{2013/05/03}
bash (also zsh)
{{2013/05/01},{2013/05/02},{2013/05/03}}
{{2013/05/01},{2013/05/02},{2013/05/03}}
Thus, when you need to use the variable $finalDates, ensure it is inside double quotes:
other_command "$finalDates"
if [ "$finalDates" = "$otherString" ]
then : whatever
else : something
fi
Etc — using your preferred layout for whatever you don't like about mine.
You can say:
finalDates=$'{'"$valid_data_range"$'}'
The problem is that the shell is performing brace expansion. This allows you to generate a series of similar strings:
$ echo {a,b,c}
a b c
That's not very impressive, but consider
$ echo a{b,c,d}e
abe ace ade
In order to suppress brace expansion, you can use the set command to turn it off temporarily
$ set +B
$ echo a{b,c,d}e
a{b,c,d}e
$ set -B
$ echo a{b,c,d}e
abe ace ade
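Besides set +B, quoting or escaping the braces also suppresses the expansion. The sketch below runs the commands in an explicitly invoked bash (assumed to be on the PATH), because brace expansion is not part of POSIX sh:

```shell
result=$(bash <<'EOF'
printf '%s\n' a{b,c}e       # unquoted braces: expands to two words
printf '%s\n' "a{b,c}e"     # double-quoted: left alone
printf '%s\n' a\{b,c\}e     # backslash-escaped braces: left alone
EOF
)

printf '%s\n' "$result"
```

The first command prints abe and ace on separate lines; the other two each print the literal string a{b,c}e.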

Why does \$ reduce to $ inside backquotes [though not inside $(...)]?

Going over the POSIX standard, I came across another rather technical/pointless question. It states:
Within the backquoted style of command substitution, <backslash> shall retain its literal meaning, except when followed by: '$' , '`' , or <backslash>.
It's easy to see why '`' and '\' lose their literal meanings: nested command substitution demands a "different" backquote inside the command substitution, which in turn forces '\' to lose its literal meaning. So, for instance, the following different behavior seems reasonable:
$ echo $(echo \\\\)
\\
$ echo `echo \\\\`
\
But what about '$'? I.e., what's the point or, more concretely, a possible benefit of the following difference?
$ echo $(echo \$\$)
$$
$ echo `echo \$\$`
4735
As '$' by itself is not ruled out inside backquotes, it looks like you would use either '$' or '\\\$' all the time, but never the middle case '\$'.
To recap,
$ echo `echo $$` # PID, OK
4735
$ echo `echo \\\$\\\$` # literal "$$", OK
$$
$ echo `echo \$\$` # What's the point?
4735
PS: I know this question is rather technical... I myself go for the more modern $(...) substitution all the time, but I'm still curious.
By adding a \, you make the inner subshell expand it instead of the outer shell. A good example would be to actually force the starting of a new shell, like this:
$ echo $$
4988
$ echo `sh -c 'echo $$'`
4988
$ echo `sh -c 'echo \$\$'`
4990
$ echo `sh -c 'echo \\\$\\\$'`
$$
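Because PIDs change from run to run, the backslash-counting rules are easier to verify with an ordinary variable. This is a sketch with illustrative names, not part of the original answer:

```shell
v=outer

# Backquotes: \$ is reduced to $ first, so the inner command is: echo $v
bq=`echo \$v`

# $(...): the backslash survives, so the inner command is: echo \$v
dp=$(echo \$v)

# Backquotes with three backslashes: \\ becomes \ and \$ becomes $,
# so the inner command is again: echo \$v
bq3=`echo \\\$v`

printf '%s\n' "$bq" "$dp" "$bq3"
```

The first form prints the value of v, while the other two print the literal text $v.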
Basic Answer
Consider the following command, which finds the base directory where gcc was installed:
gcc_base=$(dirname $(dirname $(which gcc)))
With the $(...) notation, there is no problem with the parsing; it is trivial and is one of the primary reasons why the notation is recommended. The equivalent command using back-ticks is:
gcc_base=`dirname \`dirname \\\`which gcc\\\`\``
When the shell first parses this command, it encounters the first backtick, and has to find the matching close backtick. That's when the quoted section comes into effect:
Within the backquoted style of command substitution, <backslash> shall retain its literal meaning, except when followed by: '$' , '`' , or <backslash>.
gcc_base=`dirname \`dirname \\\`which gcc\\\`\``
                  ^         ^ ^          ^ ^ ^
                  1         2 3          4 5 6
1. backslash-backtick - special rule
2. backslash-backslash - special rule
3. backslash-backtick - special rule
4. backslash-backslash - special rule
5. backslash-backtick - special rule
6. backslash-backtick - special rule
So, the unescaped backtick at the end marks the end of the outermost backtick command. The sub-shell that processes that command sees:
dirname `dirname \`which gcc\``
The backslash-backtick escapes are given the special treatment again, and the sub-sub-shell sees:
dirname `which gcc`
The sub-sub-sub-shell gets to see which gcc and evaluates it (e.g. /usr/gcc/v4.6.1/bin/gcc).
The sub-sub-shell evaluates dirname /usr/gcc/v4.6.1/bin/gcc and produces /usr/gcc/v4.6.1/bin.
The sub-shell evaluates dirname /usr/gcc/v4.6.1/bin and produces /usr/gcc/v4.6.1.
The shell assigns /usr/gcc/v4.6.1 to gcc_base.
In this example, the backslashes were only followed by the special characters - backslash, backtick, dollar. A more complex example would have, for example, \" sequences in the command, and then the special rule would not apply; the \" would simply be copied through unchanged and passed to the relevant sub-shell(s).
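The two notations can be compared directly. In this sketch a fixed, hypothetical path stands in for the output of which gcc so that the result is reproducible:

```shell
# Hypothetical path used instead of the output of `which gcc`.
path=/usr/gcc/v4.6.1/bin/gcc

# $(...) nests without any escaping.
new_style=$(dirname $(dirname $(echo "$path")))

# Backquotes need one extra layer of backslashes per nesting level.
old_style=`dirname \`dirname \\\`echo "$path"\\\`\``

printf '%s\n' "$new_style" "$old_style"
```

Both assignments produce /usr/gcc/v4.6.1.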
Extraordinarily Complex Stuff
For example, suppose you had a command with a blank in its name (heaven forbid; and this shows why!) such as totally amazing (two blanks; it is a more stringent test than a single blank). Then you could write:
$ cmd="totally amazing"
$ echo "$cmd"
totally amazing
$ which "$cmd"
/Users/jleffler/bin/totally amazing
$ dirname $(which "$cmd")
usage: dirname path
$ # Oops!
$ dirname "$(which \"\$cmd\")"
"$cmd": not found
.
$ # Oops!
$ dirname "$(which \"$cmd\")"
"totally: not found
amazing": not found
.
$ dirname "$(eval which \"$cmd\")"
totally amazing: not found
.
$ dirname "$(eval which \"\$cmd\")"
/Users/jleffler/bin
$ # Ouch, but at least that worked!
$ # But how to extend that to the next level?
$ dirname "$(eval dirname \"\$\(eval which \\\"\\\$cmd\\\"\)\")"
/Users/jleffler
$
OK - well, that's the "easy" one! Do you need a better reason to avoid spaces in command names or path names? I've also demonstrated to my own satisfaction that it works correctly with pathnames that contain spaces.
So, can we compress the learning cycle for backticks? Yes...
$ cat x3.sh
cmd="totally amazing"
which "$cmd"
dirname "`which \"$cmd\"`"
dirname "`dirname \"\`which \\"\$cmd\\\"\`\"`"
$ sh -x x3.sh
+ cmd='totally amazing'
+ which 'totally amazing'
/Users/jleffler/bin/totally amazing
++ which 'totally amazing'
+ dirname '/Users/jleffler/bin/totally amazing'
/Users/jleffler/bin
+++ which 'totally amazing'
++ dirname '/Users/jleffler/bin/totally amazing'
+ dirname /Users/jleffler/bin
/Users/jleffler
$
That is still a ghastly, daunting, non-intuitive set of escape sequences. It's actually shorter than the version for $(...) notation, and doesn't use any eval commands (which always complicate things).
This probably has to do with the strange way the Bourne shell parses substitutions (the real Korn shell is slightly similar but most other shells do not exhibit the strange behaviour at all).
Basically, the Bourne shell's parser does not interpret substitutions ($ and `) inside double-quotes, or parameter substitution ($) anywhere. This is only done at expansion time. Also, in many cases unmatched quotes (single-quotes, double-quotes or backquotes) are not an error; the closing quote is assumed at the end.
One consequence is that if a parameter substitution with a word containing spaces like ${v+a b} occurs outside double-quotes, it is not parsed correctly and will cause an expansion error when executed. The space needs to be quoted. Other shells do not have this problem.
Another consequence is that double-quotes inside backquotes inside double-quotes do not work reliably. For example,
v=0; echo "`v=1; echo " $v "`echo b"
will print
1 echo b
in most shells (one command substitution), but
0 b
in the Bourne shell and the real Korn shell (ksh93) (two command substitutions).
(Ways to avoid the above issue are to assign the substitution to a variable first, so double-quotes are not necessary, or to use new-style command substitution.)
The real Korn shell (ksh93) attempts to preserve much of the strange Bourne shell behaviour but does parse substitutions at parse time. Thus, ${v+a b} is accepted but the above example has "strange" behaviour. A further strange thing is that something like
echo "`${v+pwd"
is accepted (the result is like with the missing closing brace). And where does the opening brace in the error message from
echo "`${v+pwd`"
come from?
The below session shows an obscure case where $ and \$ differ in a non-obvious way:
$ echo ${.sh.version}
Version JM 93u 2011-02-08
$ v=0; echo "`v=1; echo "${v+p q}"`echo b"
p qecho b
$ v=0; echo "`v=1; echo "\${v+p q}"`echo b"
p{ q}b
Basically, a backslash is an escape character: you put it before another character to change how that character is interpreted. Common sequences are:
"\n" --> newline
"\t" --> tab (indent)
"\$" --> $ (because a $ before a word in shell denotes a variable)
"\\" --> \
In bash, \$ and \\ are reduced this way inside double quotes (and when unquoted). \n and \t are not translated by double quotes themselves; they become a newline and a tab only inside $'...' quoting or when a command such as printf or echo -e interprets them.
If you want to find more info or other escape chars go here
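Which layer interprets a given escape can be checked by counting characters. This is a minimal sketch with illustrative variable names:

```shell
dq="a\tb"              # double quotes keep backslash-t as two characters
pf=$(printf 'a\tb')    # printf turns \t in its format string into a tab
dollar="\$HOME"        # \$ in double quotes yields a literal dollar sign

printf '%s\n' "${#dq}" "${#pf}" "$dollar"
```

The double-quoted string is four characters long and the printf result is three, which shows that the tab is produced by printf rather than by the quoting.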
