Pass a variable to awk contained between special characters [duplicate] - bash

I want to extract a substring from the lines of a colon-separated file that contain a certain pattern, so I used the command below:
awk -F ":" '/REWARD REQ. SERVER HEADERS/{print $1, $2, $3, $4}' sample_profile.txt
Here, 'REWARD REQ. SERVER HEADERS' is the pattern to search for in the file; for each matching line, the first four colon-separated fields are printed.
Now I want to pass a bash variable as the pattern, so I used the command below, but it's not working.
awk -v pat="$pattern" -F ":" '/pat/{print $1, $2 , $3, $4 }' sample_profile.txt
How can I use -v and -F in a single awk command?

If you want to provide the pattern through a variable, you need to use ~ to match against it:
awk -v pat="$pattern" '$0 ~ pat'
In your case, the problem does not have to do with -F.
The problem is the usage of /pat/ when you want pat to be a variable. If you say /pat/, awk understands it as a literal "pat", so it will try to match those lines containing the string "pat".
All together, your code should be:
awk -v pat="$pattern" -F ":" '$0~pat{print $1, $2, $3, $4 }' file
#                             ^^^^^^
See an example:
Given this file:
$ cat file
hello
this is a var
hello bye
Let's look for lines containing "hello":
$ awk '/hello/' file
hello
hello bye
Let's now try looking for "pat", contained in a variable, the way you were doing it:
$ awk -v pat="hello" '/pat/' file
$ # NO MATCHES!
Let's now use the $0 ~ pat expression:
$ awk -v pat="hello" '$0~pat' file
hello # WE MATCH!
hello bye
Of course, you can use such expressions to match just one field and say awk -v pat="$pattern" '$2 ~ pat' file and so on.
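As a minimal sketch of the difference between matching the whole record and matching one field, assume a made-up colon-separated input (the data and the variable names whole and field are illustrative, not from the question):

```shell
# Hypothetical colon-separated records: id:name:note
data='1:alice:admin
2:bob:alice-backup
3:carol:user'

pattern='alice'

# $0 ~ pat matches anywhere in the record; $2 ~ pat only looks at field 2
whole=$(printf '%s\n' "$data" | awk -F ':' -v pat="$pattern" '$0 ~ pat { print $1 }')
field=$(printf '%s\n' "$data" | awk -F ':' -v pat="$pattern" '$2 ~ pat { print $1 }')

printf 'whole record: %s\n' "$whole"
printf 'field 2 only: %s\n' "$field"
```

Record 2 matches only in the whole-record test, because "alice" appears in its third field rather than its second.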
From GNU Awk User's Guide → 3.1 How to Use Regular Expressions:
When a regexp is enclosed in slashes, such as /foo/, we call it a regexp constant, much like 5.27 is a numeric constant and "foo" is a string constant.
And GNU Awk User's Guide → 3.6 Using Dynamic Regexps:
The righthand side of a ‘~’ or ‘!~’ operator need not be a regexp
constant (i.e., a string of characters between slashes). It may be any
expression. The expression is evaluated and converted to a string if
necessary; the contents of the string are then used as the regexp. A
regexp computed in this way is called a dynamic regexp or a computed
regexp:
BEGIN { digits_regexp = "[[:digit:]]+" }
$0 ~ digits_regexp { print }
This sets digits_regexp to a regexp that describes one or more digits,
and tests whether the input record matches this regexp.
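One consequence of a dynamic regexp is that the variable's contents are interpreted as a regular expression, so a character such as . matches any character. If a literal substring match is what you want, awk's index() function is one option. The sketch below uses made-up input lines to show the difference:

```shell
# Made-up input: only the first line contains the literal text "REQ."
lines='REWARD REQ. SERVER HEADERS:a:b:c
REWARD REQX SERVER HEADERS:d:e:f'

pattern='REQ.'

# As a dynamic regexp, "." matches any character, so both lines match
as_regex=$(printf '%s\n' "$lines" |
  awk -v pat="$pattern" '$0 ~ pat { n++ } END { print n + 0 }')

# index() returns the position of a literal substring, or 0 if it is absent
as_literal=$(printf '%s\n' "$lines" |
  awk -v pat="$pattern" 'index($0, pat) { n++ } END { print n + 0 }')

echo "regex matches:   $as_regex"
echo "literal matches: $as_literal"
```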

awk -v pat="$pattern" -F":" '$0 ~ pat { print $1, $2, $3, $4 }' sample_profile.txt
You can't use the variable inside the regex // notation (there's no way to distinguish it from searching for pat); you have to specify that the variable is a regex with the ~ (matching) operator.
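To try the command end to end, here is a small sketch that builds a hypothetical stand-in for sample_profile.txt (the file contents are invented for illustration) and runs the corrected command against it:

```shell
# Hypothetical stand-in for sample_profile.txt
sample=$(mktemp)
cat > "$sample" <<'EOF'
REWARD REQ. SERVER HEADERS:host:port:path:extra
OTHER LINE:a:b:c:d
EOF

pattern='REWARD REQ'

# -v and -F are independent options and can be combined freely
result=$(awk -v pat="$pattern" -F ':' '$0 ~ pat { print $1, $2, $3, $4 }' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Only the first line matches, and its first four fields are printed separated by spaces (awk's default output field separator).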

This is kind of a hack but it makes things a little simpler for me.
cmd="awk '/$pattern/'"
eval "$cmd"
Making it a string first lets you manipulate it past the boundaries of awk.
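If the goal is only to get a shell value into the awk program, one alternative that avoids eval is to pass the value through the environment and read it from awk's ENVIRON array. Unlike -v, this does not process backslash escape sequences in the value, so a regexp such as ^a\.b$ arrives unchanged. A sketch, where awk_pat is an arbitrary variable name and the input is made up:

```shell
# Made-up input: only the first line is literally "a.b"
input='a.b
axb'

pattern='^a\.b$'

# Set awk_pat in the environment of this one awk invocation only
matched=$(printf '%s\n' "$input" | awk_pat="$pattern" awk '$0 ~ ENVIRON["awk_pat"]')
printf '%s\n' "$matched"
```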

Related

Bash, get substring by keeping the match with awk

How can I split a string with awk but printing the match too?
Full random string:
aaa sasawf wewfTotemeswdwqewqwqtotemwewedew
I need to get "wewftotemeswdwqewqwqtotemwewedew". The substring is random; the only constants are the space before it and the word totem inside it. As you can see, the random string might contain totem more than once, so I need awk to return the substring starting from the first match. To be clear, I need "wewftotemeswdwqewqwqtotemwewedew", not "totemwewedew". The match also needs to be case-insensitive.
I can use awk -F ' .*totem' '{print$2}' to print eswdwqewqwqtotemwewedew but how can I print the match too?
With GNU awk for the third arg to match():
$ echo 'aaa sasawf wewftotemeswdwqewqwq' |
awk 'match($0,/[^ ]*totem[^ ]*/,a) { print a[0] }'
wewftotemeswdwqewqwq
and with any awk:
$ echo 'aaa sasawf wewftotemeswdwqewqwq' |
awk 'match($0,/[^ ]*totem[^ ]*/) { print substr($0,RSTART,RLENGTH) }'
wewftotemeswdwqewqwq
For case-insensitive matching with GNU awk:
awk -v IGNORECASE=1 'match($0,/[^ ]*totem[^ ]*/...
and with any awk:
awk 'match(tolower($0),/[^ ]*totem[^ ]*/...
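Putting the portable pieces together, a complete case-insensitive command might look like the sketch below. It is one possible completion run against the string from the question, not necessarily what the abbreviated commands above stood for:

```shell
line='aaa sasawf wewfTotemeswdwqewqwqtotemwewedew'

# match() runs on a lowercased copy; RSTART and RLENGTH are then used
# to cut the same span out of the original record
found=$(printf '%s\n' "$line" |
  awk 'match(tolower($0), /[^ ]*totem[^ ]*/) { print substr($0, RSTART, RLENGTH) }')

printf '%s\n' "$found"
```

Because substr() is applied to $0, the original letter case is preserved; wrapping the substr() call in tolower() would print the result in lowercase instead.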

How can I replace a / with a _ in a shell pipeline?

Running
ip -o -f inet addr show | grep $INTERNAL |awk '/scope global/ {print $4}'
I want to replace the / in my output with _ so that, rather than reading
10.168.122.59/16
it reads as
10.168.122.59_16
Appending
|sed s///_/
didn't help.
Any suggestions?
All the postprocessing requested can be done internal to awk. Expanding a one-liner provided in a comment by @123 for better readability, this can look like the following:
ip -o -f inet addr show | \
awk -v i="$INTERNAL" '
$0 ~ i && /scope global/ {
sub(/\//, "_", $4);
print $4;
}'
Breaking down how this works:
awk -v i="$INTERNAL" defines an awk variable based on a shell variable. (As an aside, all-caps shell variable names are bad form; per POSIX convention, lowercase names are reserved for application use, whereas all-caps names can have meaning to the OS and surrounding tools).
$0 ~ i filters for the entire line ($0) matching the awk variable i.
/scope global/ by default is applied as a regex against $0 as well (it's equivalent to $0 ~ /scope global/).
sub(/\//, "_", $4) replaces the first / in the fourth field with _ (use gsub() to replace every occurrence).
print $4 prints that field.
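To try this without a live network interface, the sketch below feeds awk a made-up line shaped like ip -o -f inet addr show output; the sample line and the value of internal are assumptions for illustration:

```shell
# Made-up line resembling one record of `ip -o -f inet addr show`
sample='2: eth0    inet 10.168.122.59/16 brd 10.168.255.255 scope global eth0'
internal='10.168'

# Same filter and substitution as above, written on one line
addr=$(printf '%s\n' "$sample" |
  awk -v i="$internal" '$0 ~ i && /scope global/ { sub(/\//, "_", $4); print $4 }')

printf '%s\n' "$addr"
```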
You need to escape the / or use a different separator, as below:
echo 10.168.122.59/16 | sed s:/:_:
echo 10.168.122.59/16| awk '{sub(/\//,"_")}1'
10.168.122.59_16

passing for loop index into awk

I am trying to pass a for loop index i into awk but keep getting unexpected token awk errors.
First I tried using the -v option within awk:
for i in "${myarray}"
awk -v var=$i '/var/{print}' myfile.dat
done
I also tried calling the variable directly using single quotes:
for i in "${myarray}"
awk '/'"$i"'/{print}' myfile.dat
done
My end goal is to learn how to pass a for loop index variable through awk as the search pattern. I'd like the above code to search through myfile.dat and print lines which contain the strings in myarray.
There are 2 problems:
Array traversing should be like this for i in "${myarray[@]}"
awk treats text between /.../ as regex literal, to use a variable use $0 ~ var.
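Your code should be (the do keyword was also missing after the for line, which is what produces the unexpected token error):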
for i in "${myarray[@]}"; do
awk -v var="$i" '$0 ~ var' myfile.dat
done
{print} is the default action in awk, so you can omit it as shown above.
You can also do the same without a loop, e.g.,
echo "${myarray[@]}" | tr ' ' '|' | awk 'NR==FNR{pat=$0; next} $0 ~ pat' - file
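The echo and tr pipeline turns every space into |, so an array element that itself contains a space would be split into two alternatives. A variant that joins the elements inside bash is sketched below; the array contents and the temporary file are made up, and it assumes bash, since arrays are not part of POSIX sh:

```shell
# Requires bash. Hypothetical data standing in for myarray and myfile.dat
myarray=("foo bar" "baz")
datafile=$(mktemp)
printf '%s\n' 'has foo bar here' 'only foo' 'baz at start' 'nothing' > "$datafile"

# "${myarray[*]}" joins the elements with the first character of IFS;
# the command substitution keeps the IFS change local
pat=$(IFS='|'; printf '%s' "${myarray[*]}")

matches=$(awk -v pat="$pat" '$0 ~ pat' "$datafile")
printf '%s\n' "$matches"

rm -f "$datafile"
```

Here pat becomes foo bar|baz, so the line "only foo" is not printed.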

Use array variable in awk?

A=(aaa bbb ccc)
cat abc.txt | awk '{ print $1, ${A[$1]} }'
I want to index an array element based on the $1, but the code above is not correct in awk syntax. Could someone help?
You can't index a bash array using a value generated inside awk, even if you weren't using single quotes (thereby preventing bash from doing any substitution). You could pass the array in, though.
A=(aaa bbb ccc)
awk -v a="${A[*]}" 'BEGIN {split(a, A, / /)}
{print $1, A[$1] }' <abc.txt
Because of the split function inside awk, the elements of A may not contain spaces or newlines. If you need to do anything more interesting, set the array inside of awk.
awk 'BEGIN {a[1] = "foo bar" # sadly, there is no way to set an array all
a[2] = "baz" } # at once without abusing split() as above
{print $1, a[$1] }' <abc.txt
(Clarification: bash substitutes variables before invoking the program whose argument you're substituting, so by the time you have $1 in awk it's far too late to ask bash to use it to substitute a particular element of A.)
If you are going to be hard-coding the A array, you can just initialize it in awk
awk 'BEGIN{A[0]="aaa";A[1]="bbb"}{ print $1, A[$1] }' abc.txt
Your awk program within single quotes cannot see the shell environment variable A. In general, you can get a little shell substitution to work if you use double quotes instead of single quotes, but that is done by the shell, before awk is invoked. Overall, it is heavy sledding to try to combine shell and awk this way. If possible, I would take kurumi's approach of using an awk array.
Single quotes: an impenetrable veil.
Double quotes: generally too much travail.
So pick your poison: shell or awk.
Otherwise: your code may balk.
You can also print each element of the array on a separate line with printf and pipe it to awk. This code simply prints the bash array (bash_arr) from awk; note that for (key in awk_arr) visits keys in an unspecified order:
bash_arr=( 1 2 3 4 5 )
printf '%s\n' "${bash_arr[@]}" |
awk ' { awk_arr[NR] = $0 }
END {
for (key in awk_arr) {
print awk_arr[key]
}
}'
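The printf approach can also be combined with the lookup from the original question. The sketch below loads the bash array as awk's first input and uses 0-based keys to mirror bash indexing. It assumes bash (arrays and process substitution), elements without embedded newlines, and a temporary file standing in for abc.txt:

```shell
# Requires bash. Hypothetical array contents and input file
A=(aaa bbb "c c c")
input=$(mktemp)
printf '%s\n' 0 2 1 > "$input"

# First input: one array element per line, stored under keys 0, 1, 2, ...
# Second input: look up each line's first field in the awk array
out=$(awk 'NR == FNR { arr[NR - 1] = $0; next }
           { print $1, arr[$1] }' <(printf '%s\n' "${A[@]}") "$input")
printf '%s\n' "$out"

rm -f "$input"
```

Since each element travels on its own line, elements containing spaces survive intact, which the split()-based version cannot guarantee.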
