I am trying to escape backslash in AWK. This is a sample of what I am trying to do.
Say, I have a variable
$ echo $a
hi
The following works
$ echo $a | awk '{printf("\\\"%s\"",$1)}'
\"hi"
But, when I am trying to save the output of the same command to a variable using command substitution, I get the following error:
$ q=`echo $a | awk '{printf("\\\"%s\"",$1)}'`
awk: {printf("\\"%s\"",$1)}
awk: ^ backslash not last character on line
I am unable to understand why command substitution is breaking the AWK. Thanks a lot for your help.
Try this:
q=$(echo $a | awk '{printf("\\\"%s\"",$1)}')
Test:
$ a=hi
$ echo $a
hi
$ q=$(echo $a | awk '{printf("\\\"%s\"",$1)}')
$ echo $q
\"hi"
Update:
It will work with backticks too; it just gets a little messier.
q=`echo $a | awk '{printf("\\\\\"%s\"",$1)}'`
Test:
$ b=hello
$ echo $b
hello
$ t=`echo $b | awk '{printf("\\\\\"%s\"",$1)}'`
$ echo $t
\"hello"
Reference
Quoting inside backquoted commands is somewhat complicated, mainly
because the same token is used to start and to end a backquoted
command. As a consequence, to nest backquoted commands, the backquotes
of the inner one have to be escaped using backslashes. Furthermore,
backslashes can be used to quote other backslashes and dollar signs
(the latter are in fact redundant). If the backquoted command is
contained within double quotes, a backslash can also be used to quote a
double quote. All these backslashes are removed when the shell reads
the backquoted command. All other backslashes are left intact.
The new $(...) avoids these troubles.
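To see the two forms side by side, here is a minimal sketch (the variable names q1 and q2 are only for illustration, and it assumes a POSIX-style shell with any standard awk). The $(...) form takes the awk program as written, while the backtick form needs the leading pair of backslashes doubled because the shell removes one level first:

```shell
a=hi

# $(...): the single-quoted awk program reaches awk exactly as typed.
q1=$(echo "$a" | awk '{printf("\\\"%s\"", $1)}')

# Backticks: the shell turns each \\ into \ before running the command,
# so the escaped backslash has to be written as four backslashes.
q2=`echo "$a" | awk '{printf("\\\\\"%s\"", $1)}'`

# printf avoids any backslash handling that some echo builtins perform.
printf '%s\n' "$q1" "$q2"
```

Both lines of output should read \"hi".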
Don't get into bad habits with backticks, quoting, and passing shell variables to awk. The correct way to do this is:
$ shell_var="hi"
$ awk -v awk_var="$shell_var" -v c='\' 'BEGIN{printf "%s%s\n",c,awk_var}'
\hi
$ res=$(awk -v awk_var="$shell_var" -v c='\' 'BEGIN{printf "%s%s\n",c,awk_var}')
$ echo "$res"
\hi
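One caveat, as far as I understand it: awk processes escape sequences in the text of a -v assignment, so backslashes inside the shell variable's value may not arrive intact. The sketch below contrasts -v with reading the value from ENVIRON, which takes it verbatim. The names s, via_v, and via_env are made up for this example.

```shell
shell_var='a\tb'

# -v: awk treats the assigned text like a string literal, so \t becomes a tab.
via_v=$(awk -v s="$shell_var" 'BEGIN{printf "%s", s}')

# ENVIRON: the value is read from the environment without escape processing.
via_env=$(s="$shell_var" awk 'BEGIN{printf "%s", ENVIRON["s"]}')

printf '%s\n' "$via_v" "$via_env"
```

If the data can contain backslashes, the ENVIRON form may be the safer choice.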
Related
When I type grep \\\$ shell escapes both \ and $ and transforms it to \$ and sends to grep, which then finds all the lines with dollar sign $. That's fine!
When I type grep \\$ the result is the same and I don't really know why. The first backslash should escape the second one, but then $ is not escaped and the shell should replace it with an empty string? grep should receive \ and report an error, but instead everything works as in the first example for some reason.
In UNIX shells, $x is replaced by the value of the shell variable x but when there is nothing following the $, no substitution is performed. You can test this with echo:
> echo $
$
> echo $x
>
Your two grep arguments are passed into grep as exactly the same regular expression.
> echo \\\$
\$
> echo \\$
\$
>
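The same check can be done without echo by letting the shell parse both spellings and then inspecting the resulting arguments. This is only a sketch: the variable names are invented, and note that set -- overwrites the positional parameters of the current shell.

```shell
# Let the shell parse both spellings into positional parameters.
set -- \\\$ \\$
first=$1
second=$2
printf '%s\n' "$first" "$second"

# Either argument works as a pattern for a literal dollar sign.
count=$(printf '%s\n' 'no dollar here' 'price: $5' | grep -c "$first")
printf '%s\n' "$count"
```

Both arguments come out as \$, and grep counts one matching line.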
I'm trying to extract a string from a group of strings stored in a variable, say foo, in bash:
foo="I foobar you"
using awk, but I got different results when I wrapped awk with single and double quotes:
$ echo $foo | awk '{print $2}'
foobar
$ echo $foo | awk "{print $2}"
I foobar you
Could anyone tell me why ' and " cause different results when wrapping awk?
This happens because, inside double quotes, the shell tries to expand $2 as a positional parameter before passing the program to awk. In your case it finds no value for $2, so the expansion results in nothing; awk receives {print }, and a bare print outputs the whole line.
We single-quote awk programs because we want a plain string that the shell does not modify in any way before awk processes it; awk itself then interprets $1 as the first field (with the default whitespace delimiter).
According to shell quoting rules:
Single quotes (' ') protect everything between the opening and closing quotes. The shell does no interpretation of the quoted text, passing it on verbatim to the command.
So in your code, {print $2} is passed to awk unchanged as the action:
$ echo $foo | awk '{print $2}'
foobar
Double quotes (" ") protect most things between the opening and closing quotes. The shell does at least variable and command substitution on the quoted text. Different shells may do additional kinds of processing on double-quoted text. Because certain characters within double-quoted text are processed by the shell, they must be escaped within the text. Of note are the characters '$', '`', '\', and '"', all of which must be preceded by a backslash within double-quoted text if they are to be passed on literally to the program.
So you have to escape the $ to get the value of the second column:
$ echo $foo | awk "{print \$2}"
foobar
Otherwise awk performs its default action of printing the whole line, as in your case.
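A small sketch of both points, with invented variable names. The first part shows the text awk actually receives from the double-quoted version when the shell has no $2; the second shows one way to pick a field by number without double-quoting the program at all, by passing the number in with -v.

```shell
foo="I foobar you"

# What the double-quoted program looks like after the shell expands $2.
# The subshell clears the positional parameters so $2 is certainly empty.
seen=$(set --; printf '%s' "{print $2}")
printf '%s\n' "$seen"

# Keep the program single-quoted and hand the field number to awk instead.
n=2
second=$(echo "$foo" | awk -v n="$n" '{print $n}')
printf '%s\n' "$second"
```

The first line printed is {print } and the second is foobar.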
Fully agreed with @Ed Morton regarding the best practice of quoting shell variables. Double quotes are the default, but if you accidentally ….
$ echo "$foo"    # with double quotes
I foobar you
$ echo '$foo'    # with single quotes
$foo
Remember, quoting rules are specific to the shell. The above rules only apply to POSIX-compliant, Bourne-style shells (such as Bash, the GNU Bourne-Again Shell). They have nothing to do with awk, sed, grep, etc.
I'm trying to escape ('\') a semicolon (';') in a string on unix shell (bash) with sed. It works when I do it directly without assigning the value to a variable. That is,
$ echo "hello;" | sed 's/\([^\\]\);/\1\\;/g'
hello\;
$
However, it doesn't appear to work when the above command is assigned to a variable:
$ result=`echo "hello;" | sed 's/\([^\\]\);/\1\\;/g'`
$
$ echo $result
hello;
$
Any idea why?
I tried by using the value enclosed with and without quotes but that didn't help. Any clue greatly appreciated.
btw, I first thought the semicolon at the end of the string was somehow acting as a terminator and hence the shell didn't continue executing the sed (if that made any sense). However, that doesn't appear to be an issue. I tried by using the semicolon not at the end of the string (somewhere in between). I still see the same result as before. That is,
$ echo "hel;lo" | sed 's/\([^\\]\);/\1\\;/g'
hel\;lo
$
$ result=`echo "hel;lo" | sed 's/\([^\\]\);/\1\\;/g'`
$
$ echo $result
hel;lo
$
You don't need sed (or any other regex engine) for this at all:
s='hello;'
echo "${s//;/\;}"
This is a parameter expansion which replaces ; with \;.
That said -- why are you trying to do this? In most cases, you don't want escape characters (which are syntax) to be inside of scalar variables (which are data); they only matter if you're parsing your data as syntax (such as using eval), which is a bad idea for other reasons, and best avoided (or done programmatically, as via printf %q).
I find it interesting that the use of back-ticks gives one result (your result) and the use of $(...) gives another result (the wanted result):
$ echo "hello;" | sed 's/\([^\\]\);/\1\\;/g'
hello\;
$ z1=$(echo "hello;" | sed 's/\([^\\]\);/\1\\;/g')
$ z2=`echo "hello;" | sed 's/\([^\\]\);/\1\\;/g'`
$ printf "%s\n" "$z1" "$z2"
hello\;
hello;
$
If ever you needed an argument for using the modern x=$(...) notation in preference to the older x=`...` notation, this is probably it. The shell does an extra round of backslash interpretation with the back-ticks. I can demonstrate this with a little program I use when debugging shell scripts called al (for 'argument list'); you can simulate it with printf "%s\n":
$ z2=`echo "hello;" | al sed 's/\([^\\]\);/\1\\;/g'`
$ echo "$z2"
sed
s/\([^\]\);/\1\;/g
$ z1=$(echo "hello;" | al sed 's/\([^\\]\);/\1\\;/g')
$ echo "$z1"
sed
s/\([^\\]\);/\1\\;/g
$ z1=$(echo "hello;" | printf "%s\n" sed 's/\([^\\]\);/\1\\;/g')
$ echo "$z1"
sed
s/\([^\\]\);/\1\\;/g
$
As you can see, the script executed by sed differs depending on whether you use x=$(...) notation or x=`...` notation.
s/\([^\]\);/\1\;/g # ``
s/\([^\\]\);/\1\\;/g # $()
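To confirm that the difference in output comes only from the script text, you can feed each of the two scripts to sed directly under $(...). This is a sketch with made-up variable names; the first script is the one sed ends up with under backticks, the second is the one it gets under $(...).

```shell
input='hello;'

# Script as sed sees it after backtick processing: \; in the replacement
# is just a semicolon, so the text comes back unchanged.
after_backticks=$(printf '%s\n' "$input" | sed 's/\([^\]\);/\1\;/g')

# Script as written: \\ in the replacement yields a literal backslash.
after_dollar_paren=$(printf '%s\n' "$input" | sed 's/\([^\\]\);/\1\\;/g')

printf '%s\n' "$after_backticks" "$after_dollar_paren"
```

The output should be hello; followed by hello\; on the next line, matching z2 and z1 above.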
Summary
Use $(...); it is easier to understand.
You need four backslashes in the replacement (three also work). The command is interpreted twice: first by the shell, which collapses each \\ to \ while reading the backquoted command, and then by sed:
result=`echo "hello;" | sed 's/\([^\\]\);/\1\\\\;/g'`
And
echo "$result"
yields:
hello\;
I'm trying to use a variable in a grep regex. I'll just post an example of the failure and maybe someone can suggest how to make the variable be evaluated while running the grep command. I've tried ${var} as well.
$ string="test this"
$ var="test"
$ echo $string | grep '^$var'
$
Since my regex should match lines which start with "test", it should print the line echoed thru it.
$ echo $string
test this
$
You need to use double quotes. Single quotes prevent the shell variable from being interpolated by the shell. You use single quotes to prevent the shell from doing interpolation which you may have to do if your regular expression used $ as part of the pattern. You can also use a backslash to quote a $ if you're using double quotes.
Also, you may need to put your variable in curly braces ${var} in order to help separate it from the rest of the pattern.
Example:
$ string="test this"
$ var="test"
$ echo $string | grep "^${var}"
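Here is a short sketch that runs both quoting styles and captures the results, so the difference is visible. The names single and double are invented for the example, and || true is only there so the non-matching grep does not abort a script running under set -e.

```shell
string="test this"
var="test"

# Single quotes: grep gets the literal pattern ^$var, which matches nothing here.
single=$(echo "$string" | grep -c '^$var' || true)

# Double quotes: the shell expands ${var}, so grep gets the pattern ^test.
double=$(echo "$string" | grep "^${var}")

printf '%s\n' "$single" "$double"
```

The count from the single-quoted pattern is 0, while the double-quoted one prints the line.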
I am trying to keep the return of a sed substitution in a variable:
D=domain.com
echo $D | sed 's/\./\\./g'
Correctly returns: domain\.com
D1=`echo $D | sed 's/\./\\./g'`
echo $D1
Returns: domain.com
What am I doing wrong?
D2=`echo $D | sed 's/\./\\\\./g'`
echo $D2
Think of the shell as rescanning the backquoted command before executing it. The escapes inside the backticks are applied as the line is parsed, before sed sees them, so sed receives s/\./\./g and the value stored in D1 has no backslash. The solution is yet more escapes.
Getting the escapes correct on nested shell statements can make you live in interesting times.
The backtick operator replaces the escaped backslash by a backslash. You need to escape twice:
D1=`echo $D | sed 's/\./\\\\./g'`
You may also escape the first backslash if you like.
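For comparison, here is a sketch of the same substitution written with $(...), where no extra escaping is needed, next to the backtick version with doubled backslashes. The variable names follow the question; printf is used instead of echo only to rule out any backslash handling by echo itself.

```shell
D=domain.com

# $(...): the sed script is passed through exactly as written.
D1=$(echo "$D" | sed 's/\./\\./g')

# Backticks: each \\ is reduced to \ first, so the replacement needs four.
D2=`echo "$D" | sed 's/\./\\\\./g'`

printf '%s\n' "$D1" "$D2"
```

Both variables should hold domain\.com.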