I am trying to printf a string that contains variables, so I need double quotes. My output needs to contain the literal text \x and \f.
printf "\x"
or
printf "\\x"
produces the error:
./pegrep.in: line 94: printf: missing hex digit for \x
while
printf "\f"
or
printf "\\f"
produces nothing visible at all (in a text file I believe it creates ^L).
Single quotes, however, work for x (but not f). I tried nesting single quotes inside the double quotes:
printf "...'\\x'..."
and got the same error as with plain double quotes. I have also tried /// to no avail.
First, let's understand how quoting works.
The simplest form of quoting is the backslash: it prevents the following character from being interpreted in any special way, allowing it to be used as a literal character. For example:
# Two arguments to printf
$ printf '%s\n' a b
a
b
# One three-character argument to printf
$ printf '%s\n' a\ b
a b
Double quotes are equivalent to escaping every character contained therein: "a b c" is equivalent to \a\ \b\ \c, which is the same as a\ b\ c because a, b, and c have no special meaning to begin with. You can think of every character inside double quotes as being treated literally, with the following exceptions:
$ can start a parameter expansion or a command substitution. Escape it with a backslash to treat it literally.
$ foo=3
$ echo "$foo"
3
$ echo "\$foo"
$foo
A backquote starts a command substitution. Escape it with a backslash to treat it literally.
$ echo "`pwd`"
/home/me
$ echo "\`pwd\`"
`pwd`
A double quote ends a quoted string. Escape it with a backslash to treat it literally.
$ echo "\""
"
Because a backslash might be part of one of the three preceding sequences, escape it with a backslash to treat one literally.
$ echo "\\"
\
Inside single quotes, everything is treated literally, including any use of a backslash. (One consequence of this is that it is impossible to put a single quote in a single-quoted string, because it will always terminate such a string.)
$ echo '\'
\
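One way to check these rules for yourself is to count how many arguments a command actually receives. The following is a minimal sketch; count_args is a hypothetical helper introduced only for illustration and is not part of the original answer.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { printf '%s\n' "$#"; }

count_args a b      # unquoted space separates words: 2 arguments
count_args a\ b     # escaped space is literal: 1 argument
count_args "a b"    # double quotes keep the space: 1 argument
count_args 'a b'    # single quotes keep the space: 1 argument
count_args 'a\ b'   # still 1 argument; the backslash is now part of it
```

The last call differs from the second only in the content of the argument (four characters instead of three), not in the argument count.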
Once you understand quoting, you next need to realize that printf itself can process whatever backslashes it sees in its first argument.
$ printf 'a\x20b\n' # a, space (ASCII 0x20), b, newline
a b
In order to ensure a string is printed literally no matter what characters are present, use the %s format specifier and pass your string as a second argument:
$ printf '\x41\n'
A
$ printf '%s\n' '\x41'
\x41
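Applied to the original question (literal \x and \f next to a variable inside double quotes), this could look roughly like the sketch below. The variable name and the sample text are assumptions made up for illustration.

```shell
name="world"   # hypothetical variable, only for illustration

# Option 1: keep the format in single quotes and pass the text as a %s
# argument. Inside double quotes, \\ collapses to one backslash, and printf
# does not interpret backslashes in %s arguments.
line1=$(printf '%s' "\\x and \\f for $name")

# Option 2: keep the literal text in the format itself. printf turns \\ in
# its format into one backslash, so the single-quoted format needs two.
line2=$(printf '\\x and \\f for %s' "$name")

printf '%s\n' "$line1" "$line2"
```

Both lines should come out as \x and \f for world. Option 1 is usually easier to reason about, because only the shell's quoting rules apply to the text, not printf's.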
I think you may be confusing the \ and the % characters. The % is used to output formatted variable contents. I think that when \f outputs ^L (aka ASCII FF), the escape sequence is working as documented.
The printf argument can be quoted with single quotes. Then you can use double quotes within it and they will be part of the output:
printf '"%x"' 4011 # output: "fab"
Similarly for floating point:
printf '"%f"' 2.2 # output: "2.200000"
The \ character is for escape sequences of otherwise unprintable characters.
See man printf for a full list of the % format characters and their meanings.
I've noticed something weird:
Y=""
echo ${Y:-"\n"}
echo "${Y:-"\n"}"
prints
\n
n
Why is the second line n, not \n? Is this a bug?
It looks as if Bash parsed this as a concatenation of two quoted strings with an unquoted string in between ("${Y:-" and \n and "}") but this doesn't seem to be the case since the commands
echo $(echo "\n")
echo "$(echo "\n")"
echo "${Y:-"'\n'"}"
output
\n
\n
'n'
I'm using GNU bash, version 4.3.11.
I suspect there is a bug in the handling of the word following :- (in fact, I seem to recall reading something about this, but I can't recall where).
If the value is not quoted, I get results I would expect...
$ echo ${Y:-\n}
n
$ echo "${Y:-\n}"
\n
This is also the result you get in dash (ignoring the fact that dash actually produces a literal newline since POSIX mandates that echo should process escaped characters, something bash only does if you use the non-standard -e option.)
In this example, quoting the default value preserves the backslash. As the result of the parameter expansion produces the backslash, quote removal does not remove it.
$ echo ${Y:-"\n"} # Equivalent to echo "\n", so the output makes sense
\n
There doesn't seem to be any reason for bash to behave differently in this final example just because the entire parameter expansion is being quoted. It is almost as if quote removal is being applied twice, once to remove the outer double quotes and again to incorrectly remove the backslash.
# Quote removal discards the backslash: OK
$ echo \n
n
# Quote removal discards the double quotes: OK
$ echo "n"
n
# Quote removal discards the first backslash after `\\` is recognized
# as a quoted backslash: OK
$ echo \\n
\n
# Quote removal discards the double quotes, but leaves
# backslash: OK
$ echo "\n"
\n
# Is quote removal discarding both the double quotes *and* the backslash? Not OK
$ echo "${Y:-"\n"}"
n
Relatedly, zsh (with the bsd_echo option set) outputs \n, not n.
% Y=""
% echo "${Y:-"\n"}"
\n
To complement chepner's helpful answer:
Here's an overview of how the major POSIX-like shells handle the following command:
Y=""
printf '%s\n' ${Y:-"\n"} ${Y:-'\n'} "${Y:-"\n"}" "${Y:-'\n'}"
Note that I've added variations with single quotes.
dash [v0.5.8]
\n
\n
\n
'\n'
zsh [v5.0.8]
\n
\n
\n
'\n'
bash [v4.3.42]
\n
\n
n
'\n'
ksh [93u+]
\n
\n
n
'\n'
Curiously, in all shells, '\n' inside "..." preserves the single quotes, while removing them in the unquoted case.
With respect to "\n", both bash and ksh exhibit the oddity uncovered by the OP, while dash and zsh do not.
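Given these differences, one way to sidestep the question is to avoid quotes inside the expansion altogether and keep the default in a variable. This is a minimal sketch, not something proposed in the answers above; the variable name default is an assumption chosen for illustration.

```shell
Y=""
default='\n'    # hypothetical helper variable holding the fallback text

# The result of expanding $default is not re-scanned for quotes or
# backslashes, so the backslash survives.
result="${Y:-$default}"
printf '%s\n' "$result"
```

Because the text produced by an expansion is not subject to quote removal, this should print \n in each of the shells listed above.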
Maybe I'm looking at this wrong, but I don't see any inconsistency in the assignment with the default value Y, quoted or unquoted. The echo expression in each case boils down to:
$ echo "\n"
\n
$ echo ""\n""
n
In the first case you have the quoted string "\n", in the second, you have a bare \n (which is simply n)
printf '%s' 'abc' | sed 's/./\\&/g' #1, \a\b\c
printf '%s' "`printf '%s' 'abc' | sed 's/./\\&/g'`" #2, &&&
The expression inside the second backticks returns \a\b\c, and we have printf '%s' "\a\b\c", so it should print \a\b\c.
My question is: why does the second script print &&& ?
note:
I can get the second script work (prints \a\b\c) by prepending each backslash with another backslash, but I don't know why it's needed.
One related question:
why does this single quoted string get interpreted when it's inside of a command substitution
This is a good example to show the difference between the back-tick and $(cmd) forms of command substitution.
When the old-style backquoted form of substitution is used, backslash
retains its literal meaning except when followed by "$", "`", or "\".
The first backticks not preceded by a backslash terminates the command
substitution. When using the "$(COMMAND)" form, all characters between
the parentheses make up the command; none are treated specially.
http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_03_04.html
So take a look at your example; I used echo instead of printf:
kent$ echo 'abc' | sed 's/./\\&/g'
\a\b\c
kent$ echo -E "`echo 'abc' | sed 's/./\\&/g'`"
&&&
kent$ echo -E "$(echo 'abc' | sed 's/./\\&/g')"
\a\b\c
As you can see, the back-tick command substitution turned your \\ into a single \, which together with the following & became \& (a literal & in sed's replacement text).
Note that I used echo -E to disable the interpretation of backslash escapes, so that \a\b\c could be printed as is.
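The same comparison can be made with printf and %s, which avoids any dependence on echo options. This is a small sketch; the variable names old and new are assumptions for illustration.

```shell
# Backquotes: the shell reduces \\ to \ before sed runs, so sed sees
# s/./\&/g and replaces every character with a literal &.
old=`printf '%s' 'abc' | sed 's/./\\&/g'`

# $(...): the text is passed through untouched, so sed sees s/./\\&/g
# and puts a backslash in front of every matched character.
new=$(printf '%s' 'abc' | sed 's/./\\&/g')

printf 'backquotes:   %s\n' "$old"
printf 'dollar-paren: %s\n' "$new"
```

The first line should show &&& and the second \a\b\c, which is one reason the $(...) form is generally easier to work with.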
Because on the second line you are saying:
printf '%s' 'abc' -> abc
Then replace:
abc | sed 's/./\\&/g' -> &&&
s means substitute
. means any one character
\\& means replace it with a literal & character (inside the backquotes, \\ reaches sed as a single \)
g means every occurrence on the line
So you are saying:
Replace each character of abc with a literal &, for every occurrence on the same line
Explanation of \\& and \\\& inside backquotes:
Two backslashes become a single backslash in the shell, which then in sed escapes the & that follows it.
\\& -> \& (which makes the & a regular character instead of the "matched text" placeholder)
Three of them: the first two become one in the shell, which then escapes the third one in sed, leaving the & special.
\\\& -> \\&
Finally, don't forget that your command is inside backquotes:
The reason you have to escape it "twice" is because you're entering this command in an environment (such as a shell script) that interprets the double-quoted string once. It then gets interpreted again by the subshell.
From:
Why does sed require 3 backslashes for a regular backslash?
I'm almost certain the code I have here worked before. Here's a simplified version and what it produces:
a="atext"
b="btext"
var=$'${a}\n${b}\n'
printf "var=$var"
Which produces output:
var=${a}
${b}
The real code outputs var to file, but the variable expansions aren't happening for some reason.
If this can't work, can you suggest a nice alternative way, and why one uses $' '? Thanks.
GNU bash, version 4.3.42
$'' is a quoting type used to allow backslash escape sequences to describe literal strings with nonprintable characters and other such oddities. Thus, $'\n' evaluates to a single character -- a newline -- whereas '\n' and "\n" both evaluate to two characters, the first being a backslash and the second being an n.
If you want to have the exact behavior of your original code -- putting a literal newline between the results of two different expansions -- you can switch quote types partway through a string:
a="atext"
b="btext"
var="$a"$'\n'"$b"
printf '%s' "var=$var"
That is, right next to each other, with no spaces between:
"$a"
$'\n'
"$b"
This gives you $a and $b expanded, with a literal newline between them.
Why does this matter? Try the following:
$ a=atext
$ b=btext
$ var1="$a\n$b" # Assign with literal "\" and "n" characters
$ printf "$var1" # Here, printf changes the "\n" into the newline
atext
btext
$ printf '%s' "$var1" # ...but this form shows that the "\n" are really there
atext\nbtext
$ var2="$a"$'\n'"$b" # now, we put a single newline in the string
$ printf '%s' "$var2" # and now even accurate use of printf shows that newline
atext
btext
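If $'...' is not available or not wanted, another option is to let printf assemble the string. This is a sketch that assumes only a POSIX printf; note that command substitution strips trailing newlines, so it cannot reproduce the trailing \n of the original assignment.

```shell
a="atext"
b="btext"

# printf expands \n in its format; $a and $b are inserted verbatim via %s.
var=$(printf '%s\n%s' "$a" "$b")

printf '%s\n' "var=$var"
```

As with the $'\n' version, the newline is really stored in the variable, so printing it with %s shows two lines.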
Just replace the single quotes with double quotes.
$ cat test
a="atext"
b="btext"
var="${a}\n${b}\n"
printf "var=$var"
$ sh test
var=atext
btext
For variable expansion you either need to use double quotes or no quotes. Single quotes negate expansion.
When I assign like this:
rmall="\,\.\?\!\:\;\(\)\[\]\{\}\"\'"
then echo $rmall, I get this:
\,\.\?\!\:\;\(\)\[\]\{\}\"\'
But what I want is only the following. How can I do that?
,.?!:;()[]{}"'
as later I need to remove those.
Thank you
You are quoting twice by using both double quotes and backslashes. Use one or the other.
Note: inside double quotes you still need a backslash to escape the double quote character itself, but the other characters in your list do not need one.
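Concretely, either of the following assignments should give the string asked for. This is a sketch that assumes it runs in a script (in an interactive bash session with history expansion enabled, the ! inside double quotes may need extra care); the names rmall_dq and rmall_sq are made up for illustration.

```shell
# Double quotes: only the embedded double quote needs a backslash.
rmall_dq=",.?!:;()[]{}\"'"

# Single quotes: nothing is escaped; the single quote itself is appended
# from a separate double-quoted piece.
rmall_sq=',.?!:;()[]{}"'"'"

printf '%s\n' "$rmall_dq" "$rmall_sq"
```

Both variables hold the same 14 characters. Quoting the expansion ("$rmall_dq") when using it keeps the shell from treating ?, [ and ] as glob characters.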
Inside double quotes, a backslash is treated specially only when it precedes one of a few characters:
\" is replaced by a literal "
\$ is replaced by a literal $
\` is replaced by a literal `
\\ is replaced by a literal \
(A backslash immediately followed by a newline is also special: the pair is removed, continuing the line.)
These are required to allow the literal character in contexts where they would normally produce special behavior. \", obviously, lets you include a double quote inside a double-quoted string. \$ lets you output a literal dollar sign where it would otherwise trigger parameter substitution:
bash $ foo=5; echo "\$foo = $foo"
$foo = 5
\\ lets you output a literal backslash that precedes a parameter substitution or at the end of a string.
bash $ foo=5; echo "\\$foo"
\5
bash $ echo "Use a \\"
Use a \
A backslash followed by any other character is treated literally:
bash $ echo "\x"
\x
Should I double-quote special characters like ', or escape them with \?
$ echo "'"
'
$ echo \'
'
Here it apparently doesn't matter, but are there situations where there is a difference, apart from $ and `, where I know there is a difference?
Thanks,
Eric J.
You can use either backslashes, single quotes, or (on occasion) double quotes.
Single quotes suppress the replacement of environment variables, and all special character expansions. However, a single quote character cannot be inside single quotes -- even when preceded by a backslash. You can include double quotes:
$ echo -e 'The variable is called "$FOO".'
The variable is called "$FOO".
Double quotes hide the glob expansion characters (* and ?) from the shell, but the shell will still interpolate variables inside them. If you use echo -e or enable shopt -s xpg_echo, echo will interpret backslash-escaped character sequences such as \a and \t. To print those sequences literally, you have to backslash-escape the backslash:
$ echo -e "The \\t character sequence represents a tab character."
The \t character sequence represents a tab character.
The backslash character will prevent the expansion of special characters including double quotes and the $ sign:
$ echo -e "The variable is called \"\$FOO\"."
The variable is called "$FOO".
So, which one to choose? Whichever one looks the best. For example, in the preceding echo command, I would have been better off using single quotes; that way I wouldn't have the confusing array of backslashes one right after another.
On the other hand:
$ echo -e "The value of \$FOO is '$FOO'."
The value of $FOO is 'bar'.
is probably better than trying something like this:
$ echo -e 'The value of $FOO is '"'$FOO'."
Readability should be the key.
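When the mix of quote types gets awkward, one more option is to move the variable parts out of the string and into printf arguments. This is only a sketch; the value bar for FOO is an assumption taken from the sample output.

```shell
FOO=bar   # hypothetical value, matching the 'bar' in the sample output

# The format holds the fixed text and the single quotes; the literal
# "$FOO" label and the real value are passed as separate %s arguments.
msg=$(printf "The value of %s is '%s'." '$FOO' "$FOO")

printf '%s\n' "$msg"
```

Each argument then needs only one kind of quoting, which tends to keep the line readable.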