Use bash variable in AWK expression - bash

I tried the following snippet in a shell script but awk didn't find $REF
REF=SEARCH_TEXT
echo "some text" | awk '/$REF/{print $2}'

Instead of quoting games in the shell, use the -v option to pass the shell variable as an awk variable:
awk -v ref="$REF" 'match($0, ref) {print $2}'
If $REF is just text and not a regular expression, use the index() function instead of match().
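To make the match() versus index() distinction concrete, here is a minimal sketch. The search term `a.c` and the two input lines are made up for illustration; the dot is a regex metacharacter, so the two functions disagree on the second line.

```shell
# Hypothetical search term containing a regex metacharacter (the dot).
REF='a.c'
input=$(printf '%s\n' 'a.c one' 'abc two')

# match() treats ref as a regular expression, so "." matches any character.
regex_hits=$(printf '%s\n' "$input" | awk -v ref="$REF" 'match($0, ref) {print $2}')

# index() treats ref as a plain substring and returns 0 (false) when it is absent.
literal_hits=$(printf '%s\n' "$input" | awk -v ref="$REF" 'index($0, ref) {print $2}')

printf 'match() hits:\n%s\n' "$regex_hits"
printf 'index() hits:\n%s\n' "$literal_hits"
```

With match() both lines are selected; with index() only the line that literally contains `a.c` is.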

Your question is worded really poorly...
Anyway, I think you want this:
REF=SEARCH_TEXT
echo "some text" | awk "/$REF/{print \$2}"
Note the escaping of $2 and the double quotes.
or this:
REF=SEARCH_TEXT
echo "some text" | awk "/$REF/"'{print $2}'
Note the judicious use of double and single quotes and no escaping on $2.
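If it helps, you can check that both quoting styles hand awk exactly the same program text by storing each in a variable first. This is only a sketch; the variable names `prog_escaped` and `prog_mixed` and the sample input are invented for the demonstration.

```shell
REF=SEARCH_TEXT

# Style 1: everything in double quotes, awk's $2 escaped.
prog_escaped="/$REF/{print \$2}"

# Style 2: double-quoted pattern glued to a single-quoted action.
prog_mixed="/$REF/"'{print $2}'

# Both lines print: /SEARCH_TEXT/{print $2}
printf '%s\n' "$prog_escaped" "$prog_mixed"

out=$(echo "SEARCH_TEXT found here" | awk "$prog_mixed")
echo "$out"
```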
Shell expansion is the simplest route here. The alternative is to export the shell variable and read it from the environment inside awk, which is overkill in this situation:
export REF=SEARCH_TEXT
echo "some text" | awk '{if (match($0, ENVIRON["REF"])) print $2}'
I think awk does not support variables in /.../ guards. Please correct me if I'm wrong.
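On the last point: inside /.../ the text is taken literally as a regex, so a variable name there is just letters. The usual workaround is the `~` operator with the variable as a dynamic regex. A small sketch, with an awk variable named `ref` and sample input chosen only for illustration:

```shell
REF=SEARCH_TEXT

# /ref/ matches the three letters "ref", not the value of the awk variable ref.
literal=$(echo "ref is only a word here" | awk -v ref="$REF" '/ref/ {print $2}')

# $0 ~ ref uses the value of ref (SEARCH_TEXT) as the regular expression.
dynamic=$(echo "SEARCH_TEXT second third" | awk -v ref="$REF" '$0 ~ ref {print $2}')

echo "$literal"
echo "$dynamic"
```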

In awk you have the ENVIRON array (it is specified by POSIX, so it is not limited to gawk), e.g. awk 'END{print ENVIRON["REF"]}' /dev/null will print your variable if you've exported it from the shell to sub-processes.
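A minimal sketch of the ENVIRON route, using a BEGIN block so no input file is needed. `OTHER_REF` is a made-up name that shows the per-command form, where the variable is set for that one awk process only.

```shell
REF=SEARCH_TEXT
export REF    # without export, awk's ENVIRON would not see REF

from_export=$(awk 'BEGIN { print ENVIRON["REF"] }')

# A variable assignment prefixed to the command is placed in that command's
# environment only; OTHER_REF is a hypothetical name.
from_prefix=$(OTHER_REF=one_off awk 'BEGIN { print ENVIRON["OTHER_REF"] }')

echo "$from_export"
echo "$from_prefix"
```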

Related

Why Does Running Awk With Double Quotes Break But Works With Single Quotes?

I noticed when running a command that this statement doesn't recognize the delimiter
awk -F',' "{print $4}" wtd.csv
However, this one does.
awk -F',' '{print $4}' wtd.csv
Any reason why? I'm sure this is part of some general bash rule I'm forgetting.
If you're using double quotes, $4 will get replaced by Bash (probably with the empty string). You'd need to escape the $ to use it in double quotes.
Example where this also is happening:
[thom#lethe ~]$ echo '$4'
$4
[thom#lethe ~]$ echo "$4"
[thom#lethe ~]$ echo "\$4"
$4
You are forgetting that double-quotes allow bash variable interpolation. In this case it tries to replace $4 with the fourth argument to the shell which is usually empty.
The single quotes prevent bash interpolation and pass the literal $4 to awk.
You'll have identical results with:
awk -F',' '{print $4}' wtd.csv
awk -F',' "{print \$4}" wtd.csv
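One way to see this for yourself is to print the program text before awk gets it. The sketch below clears the positional parameters with `set --` so that $4 is empty, which is the usual situation at an interactive prompt; the variable names and the sample line `w,x,y,z` are assumptions for the demo.

```shell
set --    # no positional parameters, so the shell's $4 expands to nothing

prog_double="{print $4}"      # the shell substitutes $4 before awk runs
prog_escaped="{print \$4}"    # backslash keeps the dollar sign for awk
prog_single='{print $4}'      # single quotes keep everything literal

printf 'double:  %s\nescaped: %s\nsingle:  %s\n' \
    "$prog_double" "$prog_escaped" "$prog_single"

# "{print }" prints the whole record; "{print $4}" prints the fourth field.
whole=$(echo 'w,x,y,z' | awk -F',' "$prog_double")
field=$(echo 'w,x,y,z' | awk -F',' "$prog_single")
echo "$whole"
echo "$field"
```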

Awk print is not working inside bash shell script

When I use AWK print command outside shell it is working perfectly. Below is content of the file (sample.txt) which is comma separated.
IROG,1245,OUTO,OTUG,USUK
After executing the command below outside the shell script, I get IROG as output.
cat sample.txt | awk -F, '{print $1}' > data.txt
Below is inside the shell script
my $HOME ='home/tmp/stephen';
my $DATA ="$HOME/data.txt";
my $SAMPLE ="$HOME/sample.txt";
`cat $SAMPLE | awk -F, '{print $1}' > $DATA`;
But here I get the same content as in the original file instead of the first column.
output is IROG,1245,OUTO,OTUG,USUK
but I expect only IROG. Can someone advise where I am wrong here?
The $1 inside your backticks expression is being expanded by perl before being executed by the shell. Presumably it has no value, so your awk command is simply {print }, which prints the whole record. You should escape the $ to prevent this from happening:
`awk -F, '{print \$1}' "$SAMPLE" > "$DATA"`;
Note that I have quoted your variables and also removed your useless use of cat.
If you mean to use a shell script, as opposed to a perl one (which is what you've currently got), you can do this:
home=home/tmp/stephen
data="$home/data.txt"
sample="$home/sample.txt"
awk -F, '{print $1}' "$sample" > "$data"
In the shell, there must be no spaces in variable assignments. Also, it is considered bad practice to use UPPERCASE variable names, as you risk overwriting the ones used internally by the shell. Furthermore, it is considered good practice to use double quotes around variable expansions to prevent problems related to word splitting and glob expansion.
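Here is a self-contained sketch of that shell version which you can run anywhere. It is only an illustration: it works in a temporary directory from mktemp, and the directory name "my data" deliberately contains a space to show why the double quotes around the expansions matter.

```shell
# Work in a throwaway directory; the space in "my data" is deliberate.
tmpdir=$(mktemp -d)
home="$tmpdir/my data"
mkdir -p "$home"

data="$home/data.txt"
sample="$home/sample.txt"
printf 'IROG,1245,OUTO,OTUG,USUK\n' > "$sample"

# Without the double quotes, "$sample" would be split at the space
# and awk would be handed two nonexistent file names.
awk -F, '{print $1}' "$sample" > "$data"

first_field=$(cat "$data")
echo "$first_field"

rm -rf "$tmpdir"
```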
There are a few ways that you could trim the leading whitespace from your first field. One would be to use sub to remove it:
awk -F, '{sub(/^ */, ""); print $1}'
This removes any space characters from the start of the line. Again, remember to escape the $ if doing this within backticks in perl.

escape dollar sign in bashscript (which uses awk)

I want to use awk in my bashscript, and this line clearly doesn't work:
line="foo bar"
echo $line | awk '{print $1}'
How do I escape $1, so it doesn't get replaced with the first argument of the script?
Your script (with single quotes around the awk script) will work as expected:
$ cat script-single
#!/bin/bash
line="foo bar"
echo $line | awk '{print $1}'
$ ./script-single test
foo
The following, however, will break (the script will output an empty line):
$ cat script-double
#!/bin/bash
line="foo bar"
echo $line | awk "{print $1}"
$ ./script-double test
​
Notice the double quotes around the awk program.
Because the double quotes expand the $1 variable, the awk command will get the script {print test}, which prints the contents of the awk variable test (which is empty). Here's a script that shows that:
$ cat script-var
#!/bin/bash
line="foo bar"
echo $line | awk -v test=baz "{print $1}"
$ ./script-var test
baz
Related reading: Bash Reference Manual - Quoting and Shell Expansions
As currently written, the $1 will not be replaced (since it's within single-quoted string, bash will not parse it)
If you write awk "{print $1}", bash will expand the $1 within the double-quoted string
Note that the variable expansion rules depend on the outermost level of quoting, so the $1 in "awk '{print $1}'" will be expanded
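The outermost-quoting rule is easy to check by building the string and printing it. In this sketch `set -- test` stands in for running the script with the argument test, and `outer_double` and `outer_single` are names made up for the demo.

```shell
set -- test    # pretend the script was called with one argument: test

# The outer quotes are double quotes, so the inner single quotes are
# ordinary characters and $1 is still expanded by the shell.
outer_double="awk '{print $1}'"

# The outer quotes are single quotes, so $1 stays literal.
outer_single='awk "{print $1}"'

echo "$outer_double"
echo "$outer_single"
```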

How to pass a bash variable as value of awk parameter?

I would like to replace a variable inside the awk command with a bash variable.
For example:
var="one two three"
echo $var | awk "{print $2}"
I want to replace the $2 with the var variable. I have tried awk -v as well as something like awk "{ print ${$wordnum} } to no avail.
Slightly different approach:
$ echo $var
one two three
$ field=3
$ echo $var | awk -v f="$field" '{print $f}'
three
$ field=2
$ echo $var | awk -v f="$field" '{print $f}'
two
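The same idea can be wrapped in a small function if you need it in a script. This is just a sketch; `nth_word` is a hypothetical helper name, not something bash or awk provides.

```shell
# nth_word N TEXT: print the Nth whitespace-separated word of TEXT.
# The field number travels through -v, so the awk program itself stays
# in single quotes and never changes.
nth_word() {
    printf '%s\n' "$2" | awk -v f="$1" '{print $f}'
}

var="one two three"
nth_word 3 "$var"
nth_word 2 "$var"
```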
You've almost got it...
$ myfield='$3'
$ echo $var | awk "{print $myfield}"
three
The hard quotes on the first line prevent interpretation of $3 by the shell. The soft quotes on the second line allow variable replacement.
You can concatenate parts of awk statements with variables. Maybe this is what you want in your script file:
echo $1|awk '{print($'$2');}'
Here the parts {print($ and the value of local variable $2 and );} are concatenated and given to awk.
EDIT: After some advice, I'd rather not recommend this, except maybe as a one-time solution. It's better to get accustomed to doing it right from the start - see link in first comment.
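To illustrate why splicing is discouraged: whatever ends up in the variable becomes part of the awk program text. In this sketch the value of `field` is a made-up example of input that carries more than a number; with -v the same value is only ever treated as data.

```shell
# Hypothetical input that contains awk code in addition to a field number.
field='1 "-injected"'

# Spliced into the program text, awk sees: {print($1 "-injected");}
spliced=$(echo 'one two' | awk '{print($'"$field"');}')

# Passed with -v, the value is assigned to f and never parsed as code.
passed=$(echo 'one two' | awk -v f="$field" '{print $f}')

echo "spliced: $spliced"
echo "passed:  $passed"
```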

how to pre-construct awk statement to pass to awk on command line?

I have a shell script that constructs an awk program as a string, then passes that string to awk. This is because I want to use values of shell variables in the awk program.
My code looks like this:
awk_prog="'{if (\$4~/$shell_var/) print \$1,\$2}'"
echo $awk_prog
awk $awk_prog $FILENAME
However, when I pass the string to awk, I always get the error:
'{if ($4~/regex/) print $1,$2}'
awk: '{if
awk: ^ invalid char ''' in expression
What does that error message mean? I tried the -F: switch but it does not help. How can I settle this issue?
Thank you.
This is caused by shell quoting. The following will work:
awk_prog="{ if (\$4 ~ /$shell_var/) print \$1, \$2 }"
echo "$awk_prog"
awk "$awk_prog" $FILENAME
When you run awk '{ print }' foo from the command line, the shell interprets and removes the quotes around the program so awk receives two arguments - the first is the program text and the second is the filename foo. Your example was sending awk the program text '{if ...}' which is invalid syntax as far as awk is concerned. The outer quotes should not be present.
In the snippet that I gave above, the shell uses the quotes in the awk_prog= line to group the contents of the string into a single value and then assigns it to the variable awk_prog. When it executes the awk "$awk_prog"... line, you have to quote the expansion of $awk_prog so awk receives the program text as a single argument.
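You can see what awk actually receives in each case by loading the expansion into the positional parameters and counting them. This is a sketch for illustration; `bad_prog` and `good_prog` are names invented here, and `set -f` only switches off filename globbing while the unquoted expansion is split.

```shell
shell_var=regex

# The question's version: literal single quotes stored inside the string.
bad_prog="'{if (\$4~/$shell_var/) print \$1,\$2}'"

set -f                # no globbing during the unquoted expansion
set -- $bad_prog      # unquoted: split on whitespace, quotes stay literal
set +f
bad_count=$#
bad_first=$1

# The corrected version: no inner quotes, and the expansion is quoted.
good_prog="{ if (\$4 ~ /$shell_var/) print \$1, \$2 }"
set -- "$good_prog"
good_count=$#

echo "unquoted: $bad_count words, first is: $bad_first"
echo "quoted:   $good_count word"
```

The first word of the unquoted expansion is `'{if`, which is exactly the fragment quoted in the error message.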
There's another way to get your shell variable into awk -- use awk's -v option:
awk -v pattern="$shell_var" '$4 ~ pattern {print $1, $2}' "$FILENAME"
Use -v multiple times if you have several variables to pass to awk.
If you truly want to hold your awk program in a shell variable, build it up using printf:
awk_script="$( printf '$4 ~ /%s/ {print $1, $2}' "$shell_var" )"
awk "$awk_script" "$FILENAME"
Note the use of quotes in the printf command: single quotes around the template to protect the dollar signs you want awk to interpret, double quotes for shell variables.
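One caveat with building the program text is worth a sketch: if the shell variable contains a slash, it ends the /.../ regex early, whereas the -v form is unaffected. The value `usr/bin` and the sample lines are invented for illustration, and the broken program is only printed, not run.

```shell
shell_var='usr/bin'    # hypothetical pattern containing a slash

# With printf the slash lands inside the regex literal and terminates it early.
awk_script="$( printf '$4 ~ /%s/ {print $1, $2}' "$shell_var" )"
printf '%s\n' "$awk_script"

# With -v the pattern is a string used as a dynamic regex, so the slash is harmless.
out=$(printf '%s\n' 'a b c /usr/bin/awk' 'd e f /opt/x' |
    awk -v pattern="$shell_var" '$4 ~ pattern {print $1, $2}')
echo "$out"
```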
Another (IMO simpler) solution which (I think) addresses what you are intuitively trying to do is simply to use eval. You want the shell to behave as if you had literally typed:
awk '{if ($4~/foo/) print $1,$2}' path
(where foo and path are the literal contents of $shell_var and $FILENAME). To make that happen, just slap an eval on the front of your last line (and perhaps quotes for good measure, but they aren't necessary in this case) so that your last line is:
eval "awk $awk_prog $FILENAME"
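For completeness, a runnable sketch of the eval variant, using a temporary file from mktemp as the input. Keep in mind that eval re-parses the whole string as shell code, so this assumes `$shell_var` and `$FILENAME` contain no quotes, spaces or other shell metacharacters.

```shell
shell_var=regex
tmpfile=$(mktemp)
printf '%s\n' 'a b c regex1' 'd e f other' > "$tmpfile"
FILENAME=$tmpfile

# The single quotes are part of the string; eval makes the shell parse them again.
awk_prog="'{if (\$4~/$shell_var/) print \$1,\$2}'"

out=$(eval "awk $awk_prog $FILENAME")
echo "$out"

rm -f "$tmpfile"
```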

Resources