Exactly how do backslashes work within backticks? - bash

From the Bash FAQ:
Backslashes (\) inside backticks are handled in a non-obvious manner:
$ echo "`echo \\a`" "$(echo \\a)"
a \a
$ echo "`echo \\\\a`" "$(echo \\\\a)"
\a \\a
But the FAQ does not break down the parsing rules that lead to this difference. The only relevant quote from man bash I found was:
When the old-style backquote form of substitution is used, backslash retains its literal meaning except when followed by $, `, or \.
The "$(echo \\a)" and "$(echo \\\\a)" cases are easy enough: backslash, the escape character, escapes itself into a literal backslash, so every instance of \\ becomes \ in the output. But I'm struggling to understand the analogous logic for the backtick cases. What is the underlying rule, and how does the observed output follow from it?
Finally, a related question... If you don't quote the backticks, you get a "no match" error:
$ echo `echo \\\\a`
-bash: no match: \a
What's happening in this case?
Re: my main question, I have a theory for a set of rules that explains all the behavior, but I still don't see how it follows from any of the documented rules in bash. Here are my proposed rules:
Inside backticks, a backslash in front of a character simply returns that character. I.e., a single backslash has no effect. This is true for all characters except backslash itself and backticks. In the case of backslash itself, \\ becomes an escaping backslash: it will escape the character that follows it.
Let's see how this plays out in an example:
echo "`echo $a`" # prints the value of $a
echo "`echo \$a`" # single backslash has no effect: equivalent to above
echo "`echo \\$a`" # escaping backslash makes $ literal
Let's analyze the original examples from this perspective:
echo "`echo \\a`"
Here the \\ produces an escaping backslash, but when we "escape" a we just get back a, so it prints a.
echo "`echo \\\\a`"
Here the first pair \\ produces an escaping backslash which is applied to \, producing a literal backslash. That is, the first 3 \\\ become a single literal \ in the output. The remaining \a just produces a. Final result is \a.

The logic is fairly simple. Let's look at the bash source code (4.4) itself:
case '`': /* Backquoted command substitution. */
  {
    t_index = sindex++;
    temp = string_extract(string, &sindex, "`", SX_REQMATCH);
    /* The test of sindex against t_index is to allow bare instances of
       ` to pass through, for backwards compatibility. */
    if (temp == &extract_string_error || temp == &extract_string_fatal)
      {
        if (sindex - 1 == t_index)
          {
            sindex = t_index;
            goto add_character;
          }
        last_command_exit_value = EXECUTION_FAILURE;
        report_error(_("bad substitution: no closing \"`\" in %s"), string + t_index);
        return ((temp == &extract_string_error) ? &expand_word_error
                                                : &expand_word_fatal);
      }
    if (expanded_something)
      *expanded_something = 1;
    if (word->flags & W_NOCOMSUB)
      /* sindex + 1 because string[sindex] == '`' */
      temp1 = substring(string, t_index, sindex + 1);
    else
      {
        de_backslash(temp); /* strip the backslashes quoting `, \ and $ */
        tword = command_substitute(temp, quoted);
        temp1 = tword ? tword->word : (char *)NULL;
        if (tword)
          dispose_word_desc(tword);
      }
    FREE(temp);
    temp = temp1;
    goto dollar_add_string;
  }
As you can see, it calls the function de_backslash(temp) on the string, which modifies the string in place before the command substitution runs. The code of that function is below:
/* Remove backslashes which are quoting backquotes from STRING.  Modifies
   STRING, and returns a pointer to it. */
char *
de_backslash(string)
     char *string;
{
  register size_t slen;
  register int i, j, prev_i;

  slen = strlen(string);
  i = j = 0;

  /* Loop copying string[i] to string[j], i >= j. */
  while (i < slen)
    {
      if (string[i] == '\\' && (string[i + 1] == '`' || string[i + 1] == '\\' ||
                                string[i + 1] == '$'))
        i++; /* skip the quoting backslash */
      prev_i = i;
      ADVANCE_CHAR(string, slen, i);
      if (j < prev_i)
        do string[j++] = string[prev_i++];
        while (prev_i < i);
      else
        j = i;
    }
  string[j] = '\0';
  return (string);
}
The above does one simple thing: if the current character is \ and the next character is \, a backtick, or $, it skips this \ and copies the next character.
Converted to Python for simplicity:
text = r"\\\\$a"
slen = len(text)
i = 0
data = ""
while i < slen:
    # A backslash followed by `, \ or $ is dropped; the next character is kept.
    if (text[i] == '\\' and i + 1 < slen and
            text[i + 1] in ('`', '\\', '$')):
        i += 1
    data += text[i]
    i += 1
print(data)
The output is \\$a. Now let's test the same in bash:
$ a=xxx
$ echo "$(echo \\$a)"
\xxx
$ echo "`echo \\\\$a`"
\xxx

I did some more research to find the documented rule behind this behavior. The GNU Bash Reference Manual states:
When the old-style backquote form of substitution is used, backslash
retains its literal meaning except when followed by ‘$’, ‘`’, or ‘\’.
The first backquote not preceded by a backslash terminates the command
substitution. When using the $(command) form, all characters between
the parentheses make up the command; none are treated specially.
In other words, \\, \$, and \` inside `` are processed by the CLI parser before the command substitution runs. Everything else is passed to the command substitution unchanged.
Let's step through each example from the question. After the #, I show the command as it looks once the CLI parser has processed it, just before `` or $() executes it.
Your first example explained.
$ echo "`echo \\a`" # echo \a
$ echo "$(echo \\a)" # echo \\a
Your second example explained:
$ echo "`echo \\\\a`" # echo \\a
$ echo "$(echo \\\\a)" # echo \\\\a
Your third example:
$ echo "`echo $a`" # echo $a
$ echo "`echo \$a`" # echo $a
$ echo "`echo \\$a`" # echo \$a
Your third example using $()
$ echo "$(echo $a)" # echo $a
$ echo "$(echo \$a)" # echo \$a
$ echo "$(echo \\$a)" # echo \\$a
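To check the rule without echo's own backslash handling getting in the way, here is a small sketch that uses printf '%s' and stores each result in a variable. The variable names (old_two, new_two, old_dollar, new_dollar) are just names picked for this illustration:

```shell
#!/usr/bin/env bash
# Compare how many backslashes survive in each form of command substitution.
# printf '%s' is used instead of echo so no further escape processing can
# blur the result.

# Backticks: \\ collapses to \ before the inner command is parsed.
old_two=`printf '%s' \\\\a`        # inner command becomes: printf '%s' \\a
new_two=$(printf '%s' \\\\a)       # inner command stays:   printf '%s' \\\\a

# \$ inside backticks loses its backslash, so the variable is expanded.
a=xxx
old_dollar=`printf '%s' \$a`       # inner command becomes: printf '%s' $a
new_dollar=$(printf '%s' \$a)      # inner command stays:   printf '%s' \$a

printf 'backticks: %s   $(): %s\n' "$old_two" "$new_two"
printf 'backticks: %s   $(): %s\n' "$old_dollar" "$new_dollar"
```

The first line of output should show \a for the backtick form and \\a for $(), matching the second FAQ example; the second line should show xxx versus a literal $a.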


How to split a string on a multi-character delimiter in bash?

Why doesn't the following bash code work?
for i in $( echo "emmbbmmaaddsb" | split -t "mm" )
do
    echo "$i"
done
expected output:
e
bb
aaddsb
The recommended tool for character substitution is sed's command s/regexp/replacement/ for the first occurrence of regexp on a line, or s/regexp/replacement/g for all of them; you do not even need a loop or variables.
Pipe your echo output to sed and replace each mm with the newline character \n (GNU sed understands \n in the replacement):
echo "emmbbmmaaddsb" | sed 's/mm/\n/g'
The output is:
e
bb
aaddsb
Since you're expecting newlines, you can simply replace all instances of mm in your string with a newline. In pure native bash:
in='emmbbmmaaddsb'
sep='mm'
printf '%s\n' "${in//$sep/$'\n'}"
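If you want the pieces in an array rather than printed, the same replacement can feed mapfile. This is only a sketch: the array name parts and the helper variable nl are made up for the example, mapfile needs bash 4 or later, and the approach assumes the input contains no newlines of its own.

```shell
#!/usr/bin/env bash
in='emmbbmmaaddsb'
sep='mm'
nl=$'\n'

# Turn every separator into a newline, then read one array element per line.
# Quoting "$sep" inside the expansion makes it match literally, not as a glob.
mapfile -t parts < <(printf '%s\n' "${in//"$sep"/$nl}")

declare -p parts
```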
If you wanted to do such a replacement on a longer input stream, you might be better off using awk, as bash's built-in string manipulation doesn't scale well to more than a few kilobytes of content. The gsub_literal shell function (backending into awk) given in BashFAQ #21 is applicable:
# Taken from http://mywiki.wooledge.org/BashFAQ/021
# usage: gsub_literal STR REP
# replaces all instances of STR with REP. reads from stdin and writes to stdout.
gsub_literal() {
  # STR cannot be empty
  [[ $1 ]] || return
  # string manip needed to escape '\'s, so awk doesn't expand '\n' and such
  awk -v str="${1//\\/\\\\}" -v rep="${2//\\/\\\\}" '
    # get the length of the search string
    BEGIN {
      len = length(str);
    }
    {
      # empty the output string
      out = "";
      # continue looping while the search string is in the line
      while (i = index($0, str)) {
        # append everything up to the search string, and the replacement string
        out = out substr($0, 1, i-1) rep;
        # remove everything up to and including the first instance of the
        # search string from the line
        $0 = substr($0, i + len);
      }
      # append whatever is left
      out = out $0;
      print out;
    }
  '
}
...used, in this context, as:
gsub_literal "mm" $'\n' <your-input-file.txt >your-output-file.txt
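A more general example, which does not first replace the multi-character delimiter with a single-character one, is given below.
Using parameter expansions (from a comment by @gniourf_gniourf):
delimiter="mm"; array=(); s="emmbbmmaaddsb$delimiter"   # trailing delimiter marks the end of the last piece
while [[ $s ]]; do
    array+=( "${s%%"$delimiter"*}" )   # everything before the first delimiter
    s=${s#*"$delimiter"}               # drop that piece and the delimiter
done
declare -p array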
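The same parameter-expansion loop can be wrapped in a function so it is reusable. The sketch below is my own packaging, not part of the original comment: split_on and the global array parts are hypothetical names. It also shows that two delimiters in a row produce an empty field rather than being merged.

```shell
#!/usr/bin/env bash
# split_on STRING DELIMITER
# Fills the global array "parts" (a name chosen for this sketch).
split_on() {
    local s=$1 delimiter=$2
    parts=()
    # An empty delimiter would loop forever, so return the string whole.
    [[ $delimiter ]] || { parts=("$s"); return; }
    s+=$delimiter                       # sentinel so the last field is emitted too
    while [[ $s ]]; do
        parts+=("${s%%"$delimiter"*}")  # text before the first delimiter
        s=${s#*"$delimiter"}            # drop that text and the delimiter
    done
}

split_on 'a--b----c' '--'
declare -p parts                        # four fields: a, b, an empty one, c
```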
A cruder way:
# main string
str="emmbbmmaaddsb"
# delimiter string
delimiter="mm"
# length of main string
strLen=${#str}
# length of delimiter string
dLen=${#delimiter}
# iterator for length of string
i=0
# length tracker for ongoing substring
wordLen=0
# starting position for ongoing substring
strP=0
array=()
while [ $i -lt $strLen ]; do
    if [ "$delimiter" = "${str:$i:$dLen}" ]; then
        array+=( "${str:$strP:$wordLen}" )
        strP=$(( i + dLen ))
        wordLen=0
        i=$(( i + dLen ))
    else
        i=$(( i + 1 ))
        wordLen=$(( wordLen + 1 ))
    fi
done
array+=( "${str:$strP:$wordLen}" )
declare -p array
Reference - Bash Tutorial - Bash Split String
With awk you can use gsub to replace all regex matches.
As in your question, to replace all substrings of two or more 'm' chars with a new line, run:
echo "emmbbmmaaddsb" | awk '{ gsub(/mm+/, "\n" ); print; }'
The ‘g’ in gsub() stands for “global,” which means replace everywhere.
You can also print just the Nth piece. For example, to print the second one, replace each match with a space and let awk re-split the line into fields:
echo "emmbbmmaaddsb" | awk '{ gsub(/mm+/, " " ); print $2; }'
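Another option, sketched here as an alternative to gsub, is to hand the delimiter to awk as the field separator and print each field on its own line. The function name split_mm is made up for the example. Note that a multi-character FS is treated as a regular expression, and that a plain mm behaves differently from /mm+/ when three or more m characters appear in a row.

```shell
#!/usr/bin/env bash
# Let awk do the splitting: FS is set to the two-character delimiter and
# every field of every input line is printed on its own line.
split_mm() {
    awk -F 'mm' '{ for (i = 1; i <= NF; i++) print $i }'
}

printf '%s\n' 'emmbbmmaaddsb' | split_mm
```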

Is there any csh alternative for printf %q of bash?

Bash's built-in command printf supports the %q format specifier, which escapes the content of a variable for reuse as shell input.
I have tried some options: :q only escaped spaces, and GNU printf does not support %q.
Currently, I use below code:
set valq = `echo $val:q | bash -c 'read q;printf %q "$q"'`
/path/to/executable $valq
I would rather not have a csh script depend on bash. Is there a csh-native solution for this?
Here is some test code illustrating the problem. wrapper.csh:
#!/bin/csh -f
set i = 1
set tst1 = ""
set tst2 = ""
while ( $i <= $#argv )
    set arg = "$argv[$i]"
    set tst1 = ($tst1:q $arg:q)
    set arg2 = `echo $arg:q | bash -c 'read q;printf %q "$q"'`
    set tst2 = "$tst2:q $arg2:q"
    @ i = $i + 1
end
echo "====case 1===="
./test.csh $tst1:q
./test.csh $tst1
./test.csh $tst2
echo "====case 2===="
csh -cf "./test.csh $tst1"
csh -cf "./test.csh $tst1:q"
csh -cf "./test.csh $tst2"
test.csh:
#!/bin/csh -f
echo -n "TEST ARG:"
set i = 1
while ($i <= $#argv)
    echo -n "${i}:$argv[$i] "
    @ i = $i + 1
end
echo ""
Test Results 1:
>./wrapper.csh "a ()" b c
====case 1====
TEST ARG:1:a () 2:b 3:c
TEST ARG:1:a 2:() 3:b 4:c
TEST ARG:1:a\ 2:\(\) 3:b 4:c
====case 2====
Badly placed ()'s.
Badly placed ()'s.
TEST ARG:1:a () 2:b 3:c
Test Results 2:
bash>./wrapper.csh "'\"a ()" b c
csh>./wrapper.csh "'"'"'"a ( ) " b c
====case 1====
TEST ARG:1:'"a () 2:b 3:c
TEST ARG:1:'"a 2:() 3:b 4:c
TEST ARG:1:\'\"a\ 2:\(\) 3:b 4:c
====case 2====
Unmatched '.
Unmatched '.
TEST ARG:1:'"a () 2:b 3:c
Summary of the test:
If a command is called directly inside csh, then $val:q is the proper usage.
If a command is passed as an argument string (as in csh -cf "..."), then printf %q is the proper usage.
Just use /path/to/executable "$val".
If variables are expanded within " (as in csh -cf "test.csh $tst1") and special characters and multiple words are to be preserved, the words must indeed be quoted. But bash's special printf isn't indispensable for this; we could do it, e.g., with:
set tst1q=`printf " '%s'" $tst1:q`
csh -cf "test.csh $tst1q"
(the normal printf without %q).
To allow both " and ', first define
set s='s/[] "$&-*;<>?`|~[]/\\&/g'
and then replace bash -c 'read q;printf %q "$q"' with sed "$s" in wrapper.csh.
The regular expression
[] "$&-*;<>?`|~[]
is a bracket expression, a list of characters enclosed in []. It matches a single character which is to be prepended with a backslash by the replacement \\& (the special character & refers to the matched character). I didn't include the characters , and ^ (they are escaped by printf %q, but that's not needed in csh), while I included ~ (which isn't escaped by printf %q, but needs to be in csh - try wrapper.csh "~").
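To see how close the sed expression comes to printf %q, the two can be compared side by side. The sketch below runs in bash only because it needs printf %q for the comparison; the variable names with_sed and with_q are made up, and LC_ALL=C is my addition to keep the &-* range confined to ASCII.

```shell
#!/usr/bin/env bash
# The bracket expression from the answer, applied with sed.
# LC_ALL=C keeps the &-* range limited to the ASCII characters & ' ( ) *.
s='s/[] "$&-*;<>?`|~[]/\\&/g'

# The two test values are the first arguments used in the test runs above.
for val in 'a ()' "'\"a ()"; do
    with_sed=$(printf '%s\n' "$val" | LC_ALL=C sed "$s")
    with_q=$(printf '%q' "$val")
    printf 'sed: %s   printf %%q: %s\n' "$with_sed" "$with_q"
done
```

For these two inputs both methods should produce the same escaped string; they are expected to differ for the characters mentioned above (, and ^ on one side, ~ on the other).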

Replacing quotation marks with "``" and "''"

I have a document containing many " marks, but I want to convert it for use in TeX.
TeX uses 2 ` marks for the opening quote mark, and 2 ' marks for the closing quote mark.
I only want to make these changes when " appears an even number of times on a single line (e.g. there are 2, 4, or 6 "'s on the line). For example:
"This line has 2 quotation marks."
--> ``This line has 2 quotation marks.''
"This line," said the spider, "Has 4 quotation marks."
--> ``This line,'' said the spider, ``Has 4 quotation marks.''
"This line," said the spider, must have a problem, because there are 3 quotation marks."
--> (unchanged)
My sentences never break across lines, so there is no need to check on multiple lines.
There are few quotes with single quotes, so I can manually change those.
How can I convert these?
This is my one-liner, which works for me:
awk -F\" '{if((NF-1)%2==0){res=$0;for(i=1;i<NF;i++){to="``";if(i%2==0){to="'\'\''"}res=gensub("\"", to, 1, res)};print res}else{print}}' input.txt >output.txt
And here is the long version of this one-liner, with comments:
BEGIN {
    FS = "\""                              # set field separator to double quote
}
{
    if ((NF-1) % 2 == 0) {                 # if the count of double quotes in the line is even
        res = $0                           # save original line to res variable
        for (i = 1; i < NF; i++) {         # for each double quote
            to = "``"                      # replace current occurrence of double quote by ``
            if (i % 2 == 0) {              # if it is a closing quote, replace by ''
                to = "''"
            }
            # replace " by to in res and save result to res (gensub is a gawk extension)
            res = gensub("\"", to, 1, res)
        }
        print res                          # print the resulting line
    } else {
        print                              # print original line when nothing to change
    }
}
You may run this script by:
awk -f replace-quotes.awk input.txt >output.txt
Here's my one-liner using repeated sed's:
cat file.txt | sed -e 's/"\([^"]*\)"/`\1`/g' | sed '/"/s/`/\"/g' | sed -e 's/`\([^`]*\)`/``\1'\'''\''/g'
(note: it won't work correctly if there are already back-ticks (`) in the file but otherwise should do the trick)
Removed back-tick bug by simplifying, now works for all cases:
cat file.txt | sed -e 's/"\([^"]*\)"/``\1'\'\''/g' | sed '/"/s/``/"/g' | sed '/"/s/'\'\''/"/g'
With comments:
cat file.txt |                               # read file
sed -e 's/"\([^"]*\)"/``\1'\'\''/g' |        # initial replace
sed '/"/s/``/"/g' |                          # revert `` to " on lines with extra "
sed '/"/s/'\'\''/"/g'                        # revert '' to " on lines with extra "
Using awk
awk '{n=gsub("\"","&")}!(n%2){while(n--){n%2?Q="`":Q=q;sub("\"",Q Q)}}1' q=\' in
With comments:
awk '
{ n = gsub("\"", "&") }    # set n to the number of quotes in the current line
!(n%2) {                   # if there is an even number of quotes
    while (n--) {          # as long as we have double quotes left
        n%2 ? Q="`" : Q=q  # alternate Q between a backtick and a single quote
        sub("\"", Q Q)     # replace the next double quote with two of whatever Q is
    }
}
1                          # print every line, changed or not
' q=\' in                  # set the q variable to a single quote and pass the file "in" as input
Using sed
sed '/^\([^"]*"[^"]*"[^"]*\)*$/s/"\([^"]*\)"/``\1'\'\''/g' in
This might work for you:
sed 'h;s/"\([^"]*\)"/``\1'\'\''/g;/"/g' file
Make a copy of the original line: h
Replace pairs of "'s: s/"\([^"]*\)"/``\1'\'\''/g
Check for a leftover (odd) " and, if found, revert to the original line: /"/g
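For comparison, the same rule (convert only lines with an even number of " marks) can be written in plain bash without sed or awk. This is just a sketch under the question's assumption that quotes never span lines; the function name tex_quotes is made up, and it will be much slower than sed or awk on large files.

```shell
#!/usr/bin/env bash
# tex_quotes: read lines on stdin; on lines with an even number of " marks,
# replace them alternately with `` and ''. Other lines pass through unchanged.
tex_quotes() {
    local dq='"' line only out ch i opening
    while IFS= read -r line || [[ $line ]]; do
        only=${line//[^$dq]/}              # keep only the " characters
        if (( ${#only} % 2 )); then        # odd count: leave the line alone
            printf '%s\n' "$line"
            continue
        fi
        out='' opening=1
        for (( i = 0; i < ${#line}; i++ )); do
            ch=${line:i:1}
            if [[ $ch == "$dq" ]]; then
                if (( opening )); then out+='``'; else out+="''"; fi
                opening=$(( 1 - opening )) # flip between opening and closing
            else
                out+=$ch
            fi
        done
        printf '%s\n' "$out"
    done
}
```

It would be used as tex_quotes <input.txt >output.txt.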
