Assign bash variables from one awk command? - bash

Hoping someone can help me make my awk commands more efficient please!
Let's say my text file has around 30 lines of this type of thing:
ENTIRE:11.3.28.4.0
OSVER:Solaris11
VARFREE:3G
I'm assigning these to variables in a bash script like this:
ENTIRE=$(awk -F\: '$1 ~ /ENTIRE/ {print $2}' $HOSTFILE)
RELEASE=$(awk -F\: '$1 ~ /RELEASE/ {print $2}' $HOSTFILE)
OSVER=$(awk -F\: '$1 ~ /OSVER/ {print $2}' $HOSTFILE)
Because I have around 30 of these, it means awk is run 30 times, which is slow, and clearly not the best way.
Can anyone suggest how I can build these into one awk command please?
Thanks in advance!

You don't need awk at all. If modifying the original file isn't an option, use a while loop and the declare command to define each variable.
while IFS=: read -r name value; do
    declare "$name=$value"
done < "$HOSTFILE"
An example:
$ IFS=: read name value <<< "foo:bar"
$ declare "$name=$value"
$ echo "$foo"
bar
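A minimal sketch of the loop above run against a whole file, assuming bash. The sample lines come from the question; the temporary file and the identifier check are additions for illustration, not part of the original answer.

```shell
#!/usr/bin/env bash
# Hypothetical sample data, copied from the question's example lines.
HOSTFILE=$(mktemp)
printf '%s\n' 'ENTIRE:11.3.28.4.0' 'OSVER:Solaris11' 'VARFREE:3G' > "$HOSTFILE"

# Read each NAME:VALUE pair; -r keeps backslashes literal.
while IFS=: read -r name value; do
    # Added safety check (an assumption, not in the original answer):
    # skip lines whose name is not a valid shell identifier.
    [[ $name =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || continue
    declare "$name=$value"
done < "$HOSTFILE"

echo "$ENTIRE $OSVER $VARFREE"   # prints: 11.3.28.4.0 Solaris11 3G
rm -f "$HOSTFILE"
```

Because the loop reads from a redirection rather than a pipe, it runs in the current shell, so the variables are still set after the loop ends.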

Related

bash variables from nested for loops in awk

I want to simply use the two for loop variables in my awk code but I can't. Please help or guide me in the right direction.
for i in {30,60,100};
do
for j in {7,8};
do
awk -v x=$i -v y=$j '{if ($NF <=x) print $0}' S_$i.txt > S_$i_$j.txt;
done;
done
This was the error I received.
awk: fatal: cannot open file S_.txt for reading (No such file or directory)
S_$i_$j.txt is trying to access a variable named $i_. Use S_${i}_${j}.txt instead, but also always quote your shell variables, so it should really be:
awk -v x="$i" -v y="$j" '{if ($NF <= x) print $0}' "S_${i}.txt" > "S_${i}_${j}.txt"
or more awkishly:
awk -v x="$i" -v y="$j" '$NF <= x' "S_${i}.txt" > "S_${i}_${j}.txt"
and note that you never use y inside your awk script so it could just be:
awk -v x="$i" '$NF <= x' "S_${i}.txt" > "S_${i}_${j}.txt"
but then it's not clear why you'd want to create 2 copies of your output with each inner loop.
Whatever you're doing, though, could almost certainly be done much faster with a single call to awk than calling it multiple times within shell loops!
By the way, the problem you asked about has nothing to do with "for loop variables in my awk code"; it's all shell fundamentals.
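A small sketch of the naming problem described in this answer, using only the shell (no awk). The variables wrong and right are hypothetical names for illustration.

```shell
i=30
j=7
unset i_   # make sure no variable named i_ happens to exist

# Without braces the shell looks up a variable called "i_", which is empty.
wrong="S_$i_$j.txt"
# With braces the variable name ends where the brace closes.
right="S_${i}_${j}.txt"

echo "$wrong"   # prints: S_7.txt
echo "$right"   # prints: S_30_7.txt
```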
Thanks for your quick response.
However, I tried the following and it worked:
for i in {30,60,100};
do
for j in {7,8};
do
awk -v x=$i -v y=$j '{if ($NF <=x) print $0}' "S_"$j".txt" > "S_"$j"_"$i".txt";
done;
done;
Additionally, I realized that S_30.txt didn't exist. So when I changed it to "S_"$j".txt" it worked fine. My bad on that one.

splitting file with awk command

I was trying to split a file into a training data set and a test data set. I have this error
awk: can't open file -v source line number 1.
The command line was as follows:
awk -v lines=$(wc -l < data/yelp/yelp_review.v8.csv) -v fact=0.80 'NR <= lines * fact {print > "train.txt"; next} {print > "val.txt"}' data/yelp/yelp_review.v8.csv
Can anybody enlighten me as to why this was a problem on a MacBook?
Well .. miken32 has already identified what went wrong with your first attempt. I can't improve on his explanation of the problem.
My suggestion would be that rather than having wc provide your line count, you just do that job with awk itself. Something like this:
awk -v fact=0.8 'NR==FNR{lines++;next} FNR<=lines*fact{print>"train.txt";next} {print>"val.txt"}' "$file" "$file"
Though I'd probably write it more like this:
awk -v fact=0.8 'NR==FNR{lines++;next} {out="val.txt"} FNR<=lines*fact{out="train.txt"} {print > out}' "$file" "$file"
You can decide whether greater elegance is gained by brevity or avoidance of a next. :-)
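A hedged sketch of the two-pass approach on a tiny input, to make the split visible. The 10-line file, the temporary directory, and the train and val awk variables are assumptions for illustration; the original writes to fixed file names in the current directory.

```shell
workdir=$(mktemp -d)
file="$workdir/data.csv"
seq 1 10 > "$file"   # hypothetical 10-line input

awk -v fact=0.8 -v train="$workdir/train.txt" -v val="$workdir/val.txt" '
    NR == FNR { lines++; next }      # first pass: only count the lines
    {                                # second pass: route each line
        out = (FNR <= lines * fact) ? train : val
        print > out
    }
' "$file" "$file"

# tr strips the padding that some wc implementations add.
train_count=$(wc -l < "$workdir/train.txt" | tr -d ' ')
val_count=$(wc -l < "$workdir/val.txt" | tr -d ' ')
echo "train=$train_count val=$val_count"   # prints: train=8 val=2
rm -rf "$workdir"
```

Reading the file twice costs one extra pass, but it avoids the separate wc call and its quoting problem.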
What does the output from wc -l < data/yelp/yelp_review.v8.csv look like? On macOS, wc pads the count with leading spaces, so perhaps something like this?
      74
So what's going to happen when you drop that into your command?
awk -v lines= 74 -v fact=0.80 ...
As you can see, this isn't going to parse well. Always quote any variable data you use:
awk -v lines="$(wc -l < data/yelp/yelp_review.v8.csv)" -v fact=0.80 ...
Awk is smart enough to trim the spaces from the number before using it.
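The word splitting described here can be observed without running wc at all. This is a sketch that simulates the padded output with a hypothetical count variable and counts the resulting arguments using the shell's positional parameters.

```shell
# Simulated BSD-style wc output: the count padded with leading spaces.
count='      74'

# Unquoted: the shell splits "lines=      74" into two words.
set -- -v lines=$count
unquoted_args=$#

# Quoted: the value stays attached to lines= as a single word.
set -- -v lines="$count"
quoted_args=$#

# awk still treats the padded string as the number 74.
as_number=$(awk -v lines="$count" 'BEGIN { print lines + 0 }')

echo "unquoted=$unquoted_args quoted=$quoted_args number=$as_number"
# prints: unquoted=3 quoted=2 number=74
```

In the unquoted case awk receives lines= (an empty value) and then 74 as a separate argument, which is why the rest of the command line is misread.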

I need to use a variable instead of a literal date in a shell script awk command

I need to use a variable instead of a hard-coded date.
cat file | awk -F, '{ if ($1>"2012-08-20 11:30" && $1<"2012-08-22 16:00") print }'
thanks in advance
Based on the code you have shown, could you please try the following and let me know if it helps? (Lacking sample data, I haven't tested it.)
awk -v date1="2012-08-20 11:30" -v date2="2012-08-22 16:00" -F, '($1>date1 && $1<date2)' Input_file
If your values are coming from shell variables, the following could help; you can adjust the comparisons to suit your needs:
date1="2012-08-20 11:30"
date2="2012-08-22 16:00"
awk -v date_1="$date1" -v date_2="$date2" -F, '($1>date_1 && $1<date_2)' Input_file
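Since the answer was untested, here is a small sketch of the same command against made-up input. The three sample rows are hypothetical; only the middle one falls between the two dates. The comparison is a string comparison, which works because the timestamps are in a fixed, zero-padded year-first format.

```shell
date1="2012-08-20 11:30"
date2="2012-08-22 16:00"

# Hypothetical sample rows: timestamp in the first comma-separated field.
result=$(printf '%s\n' \
    '2012-08-19 09:00,a' \
    '2012-08-21 12:00,b' \
    '2012-08-23 08:00,c' |
  awk -v date_1="$date1" -v date_2="$date2" -F, '($1 > date_1 && $1 < date_2)')

echo "$result"   # prints: 2012-08-21 12:00,b
```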

How do I pass a stored value as the column number parameter to edit in awk?

I have a .dat file with | separator and I want to change the value of the column which is defined by a number passed as argument and stored in a var. My code is
awk -v var="$value" -F'|' '{ FS = OFS = "|" } $1=="$id" {$"\{$var}"=8}1' myfile.dat > tmp && mv tmp myfiletemp.dat
This changes the whole line to 8, obviously doesn't work. I was wondering what is the right way to write this part
{$"\{$var}"=8}1
For example, if I want to change the fourth column to 8 and I have value=4, how do I get {$4=8}?
The other answer is mostly correct, but just wanted to add a couple of notes, in case it wasn't totally clear.
Referring to a variable with a $ in front of it turns it into a reference to the column. So i=3; print $i; print i will print the third column and then the number 3.
Putting all your variables in the command line will avoid any problems with trying to include bash variables inside your single-quoted awk code, which won't work.
You can let awk do the output to the specific file instead of relying on bash to redirect output and move files.
The -F option on the command line specifies FS for you, so no need to redeclare it in your code.
Here's how I would do this:
#!/bin/bash
column=4
value=8
id=1
awk -v col="$column" -v val="$value" -v id="$id" -F"|" '
BEGIN {OFS="|"}
$1==id {$col=val}                 # change the chosen column on matching rows
{print > "myfiletemp.dat"}        # write every row to the output file
' myfile.dat
You can refer to the awk variable directly by its name. A slight rewrite of your script with the correct reference to the column number var:
awk -F'|' -v var="$value" 'BEGIN{OFS=FS} $1=="$id"{$var=8}1'
should work as long as $value is a number. If id is another bash variable, pass it in the same way as an awk variable:
awk -F'|' -v var="$value" -v id="$id" 'BEGIN{OFS=FS} $1==id{$var=8}1'
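A quick sketch of that last command on two made-up rows, to show that only the row whose first field matches id is changed. The sample data and the result variable are hypothetical.

```shell
value=4   # column number to change
id=1      # row key to match in the first field

result=$(printf '%s\n' '1|a|b|c|d' '2|a|b|c|d' |
  awk -F'|' -v var="$value" -v id="$id" 'BEGIN{OFS=FS} $1==id{$var=8}1')

echo "$result"
# prints:
# 1|a|b|8|d
# 2|a|b|c|d
```

Setting OFS in BEGIN matters here: assigning to a field makes awk rebuild the line, and without OFS="|" the rebuilt line would be joined with spaces.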
Not only can you use a number in a variable by putting a $ in front of it, you can also put a $ in front of an expression!
$ date | tee /dev/stderr | awk '{print $(2+2)}'
Mon Aug 3 12:47:39 CDT 2020
12:47:39

How to pass a bash variable as value of awk parameter?

I would like to replace a variable inside the awk command with a bash variable.
For example:
var="one two three"
echo $var | awk "{print $2}"
I want to replace the $2 with a bash variable. I have tried awk -v as well as something like awk "{ print ${$wordnum} }" to no avail.
A slightly different approach:
$ echo $var
one two three
$ field=3
$ echo $var | awk -v f="$field" '{print $f}'
three
$ field=2
$ echo $var | awk -v f="$field" '{print $f}'
two
You've almost got it...
$ myfield='$3'
$ echo $var | awk "{print $myfield}"
three
The hard quotes on the first line prevent interpretation of $3 by the shell. The soft quotes on the second line allow variable replacement.
You can concatenate parts of awk statements with variables. Maybe this is what you want in your script file:
echo $1|awk '{print($'$2');}'
Here the parts {print($ and the value of local variable $2 and );} are concatenated and given to awk.
EDIT: After some advice, I would rather not recommend this, except perhaps as a one-time solution. It's better to get accustomed to doing it right from the start; see the link in the first comment.
