In a text file, I want to classify my data according to its range.
For example, 8.1,9.1,9.9 are all in [8,10). I used variables called left and right to replace 8 and 10 respectively in awk-if. But it doesn't work properly.
My data looks like this:
9.1 aa
9.2 bb
10.1 cc
11.9 dd
My script looks like this:
left=8;right=10 #left=10;right=12
echo "["$left","$right"]:"
cat data | awk '{if(($1>="'$left'")&&($1<"'$right'")) print $2}' | xargs
The result is empty.
[8,10]:
But if I use 8 and 10 directly (without variables), it's OK. It also works properly when I use left=10, right=12.
I also found that it doesn't work when left=98, right=100. So why does it only fail sometimes? Thanks a lot!
With awk's option -v:
left=8;right=10
awk -v l="$left" -v r="$right" '{if($1>=l&&$1<r) print $2}' data
or with environment variables:
export left=8 right=10
awk '{if($1>=ENVIRON["left"]&&$1<ENVIRON["right"]) print $2}' data
Output:
aa
bb
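As a rough sketch of how the -v approach extends to several ranges, the loop below classifies the sample data from the question one interval at a time. The function name classify and the extra 98/100 range are illustrative assumptions, and the +0 on each operand is only a belt-and-braces way to force numeric comparison.

```shell
# Sample data from the question, kept in a variable so the sketch is self-contained.
data=$(printf '9.1 aa\n9.2 bb\n10.1 cc\n11.9 dd\n')

# classify LEFT RIGHT: print the labels whose value lies in [LEFT, RIGHT).
# (classify is a hypothetical helper name, not part of the original answer.)
classify() {
    printf '%s\n' "$data" |
        awk -v l="$1" -v r="$2" '
            $1 + 0 >= l + 0 && $1 + 0 < r + 0 { printf "%s%s", sep, $2; sep = " " }
            END { print "" }'
}

# One line of output per range.
while read -r left right; do
    echo "[$left,$right): $(classify "$left" "$right")"
done <<EOF
8 10
10 12
98 100
EOF
```

With the sample data, the first two ranges list aa bb and cc dd, and the last one is empty.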
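You are performing a string comparison but want a numeric comparison. With your quoting, the double quotes end up inside the awk program, so awk sees the string constants "8" and "10" and compares character by character, which makes "9.1" < "10" false. Just swap the quotes so the shell expands the variables into bare numbers: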
left=8;right=10 #left=10;right=12
echo "["$left","$right"]:"
cat data | awk '{if(($1>='"$left"')&&($1<'"$right"')) print $2}' | xargs
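The difference between the two comparisons can be checked in isolation. This is a minimal sketch using one line of the sample data; the variable names as_string and as_number are made up for the illustration.

```shell
# Comparing a field with a string constant forces a string comparison:
# "9.1" vs "10" is decided by the first characters, and "9" sorts after "1".
as_string=$(echo '9.1 aa' | awk '{ print (($1 < "10") ? "lt" : "ge") }')

# Comparing the same field with a numeric constant gives a numeric comparison.
as_number=$(echo '9.1 aa' | awk '{ print (($1 < 10) ? "lt" : "ge") }')

echo "string: $as_string, number: $as_number"
```

This prints "string: ge, number: lt". It also explains why left=10, right=12 appeared to work: for values such as 10.1 and 11.9 the character-by-character order happens to agree with the numeric order.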
I have two columns in a file bbb:
2459507.3260843 12.60766
2459507.3266052 12.64228
2459507.3271260 12.66145
A simple awk column printing gives the expected results (the file content as above):
awk '{print $1, $2}' bbb
However, trying a math operation on the second column:
awk '{print $1, $2-0.3}' bbb prints this:
2459507.3260843 11,7
2459507.3266052 11,7
2459507.3271260 11,7
It treats the column as integers (12) and prints a comma instead of a dot as the decimal separator in the output.
awk '{print $1, $2-1}' bbb prints this:
2459507.3260843 11
2459507.3266052 11
2459507.3271260 11
Is there a global environment variable responsible for this awk behaviour? I have installed Ubuntu 20.04 on a new machine (Intel 9, 12th gen. processor). On another computer with Ubuntu 20.04, awk behaves properly. I'm not an expert in Ubuntu (just a user of the system). I've tried forcing the precision using awk '{printf"%.7f %.5f\n", $1,$2-1}' bbb but then I got:
2459507,0000000 11,00000
2459507,0000000 11,00000
2459507,0000000 11,00000
What happened to awk on my machine?
Thanks
Thank you, guys!
karakfa and rowboat pointed me to the locale. I found that the locale was set to Polish: LC_NUMERIC=pl_PL.UTF-8
The fix for awk's strange behaviour is shown on the site (above) posted by rowboat (thank you, rowboat).
From that site:
"The fix is to set your LC_NUMERIC or LC_ALL to C or anything else that use . as the decimal separator:"
So when I did this, I got:
$ LC_NUMERIC="C" awk '{print $1, $2-1}' bbb
2459507.3260843 11.6077
2459507.3266052 11.6423
2459507.3271260 11.6615
as expected.
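For reference, the same fix can be applied to a single command without touching the session's settings. The sketch below uses LC_ALL=C, which overrides LC_NUMERIC, and feeds in one line of the sample file through printf, so the file bbb itself is not needed. The variable names are illustrative.

```shell
# One line of the sample data from the question.
line='2459507.3260843 12.60766'

# LC_ALL=C applies only to this awk invocation, so "." is the decimal separator.
result=$(printf '%s\n' "$line" | LC_ALL=C awk '{ print $1, $2 - 1 }')

echo "$result"
```

This prints 2459507.3260843 11.6077. Running the locale command with no arguments shows which LC_* values are currently in effect.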
I have HTML output that is all on one line. I have tried to extract serial numbers using awk, but for some odd reason I am only getting one result. The output from curl comes out in XML format.
curl -sSku user:somepass https://somewebsite.com/computergroups/id/4 -X GET | awk 'BEGIN{IGNORECASE=1;FS="<serial_number>|</serial_number>";RS=EOF} {print $2}'
The above command only prints the first occurrence and ends there. It should print several hundred.
If you have gawk:
$ ... | awk -v RS='</?serial_number>' '!(NR%2)'
This assumes every opening tag comes before its closing tag.
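If a regex record separator is not available, a similar extraction can be sketched in POSIX awk with match() and substr(). The sample XML string and the function name extract_serials below are made up for illustration, and the sketch assumes lower-case tags with no attributes.

```shell
# Hypothetical one-line sample standing in for the curl output.
xml='<computers><computer><serial_number>101</serial_number></computer><computer><serial_number>102</serial_number></computer></computers>'

# extract_serials: print the text of every <serial_number> element on stdin.
extract_serials() {
    awk '{
        s = $0
        # Repeatedly find the next <serial_number>...</serial_number> pair.
        while (match(s, /<serial_number>[^<]*<\/serial_number>/)) {
            # Skip the 15-character opening tag and drop both tags (15 + 16 = 31).
            print substr(s, RSTART + 15, RLENGTH - 31)
            # Continue searching after the element just printed.
            s = substr(s, RSTART + RLENGTH)
        }
    }'
}

serials=$(printf '%s\n' "$xml" | extract_serials)
echo "$serials"
```

For the sample string this prints 101 and 102 on separate lines.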
Awk will be a fragile solution (i.e. would likely fail in the future if the output XML changes).
If you want to do it anyway just this once, use rs to knock each tag onto a line of its own and pick up the pieces after in awk with a regex.
$ echo '<serialnumber>098456</serialnumber><serialnumber>095444></serialnumber>' | rs -c\> 0 1
<serialnumber
098456</serialnumber
<serialnumber
095444
Don't let anyone dismiss the power of awk, Khorem.
I generated some test data like this.
for n in {101..107}; do echo -n "a b c <serial_number>$n</serial_number>"; done > data
Then this,
cat data | awk -- 'BEGIN{IGNORECASE=1;FS=">";RS="</serial_number"};/<serial/{print $NF}'
produces this.
101
102
103
104
105
106
107
I have a series of data files that I need to get a few specific values from. I know the line in the file that the data is on. My data files look like this
x y z
1 0.2 0.3
2 0.1 0.2
3 0.5 0.6
etc.
I am using a shell script to access the files, collect the desired data from each file, and output the collected data in one file.
For example, I need the y value in line 3, 0.1. I have tried the following
let dataLine=3
let yVal=`head -n $dataLine dataFile | tail -n 1 | awk '{print $2}'`
but I get the following error
let: yVal=0.1: syntax error: invalid arithmetic operator (error token is ".1")
I have tried adding | bc after awk '{print $2}' but then it did not even register the correct value for what should be assigned to yVal. When I do it as shown above, it does show that it is recognizing the value in the correct line and column.
Thanks for the help,
$ dataLine=3
$ yVal=$(awk -v dataLine="$dataLine" 'NR==dataLine{print $2}' data)
$ echo $yVal
0.1
To get the second field of the third line, either of the following should work.
Solution 1: pass the shell variable's value to an awk variable with -v:
line_number=3
awk -v line="$line_number" 'FNR==line{print $2}' Input_file
Solution 2: hard-code the line number to print the third line's second field directly:
awk 'FNR==3{print $2}' Input_file
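The error in the question comes from let, which only does integer arithmetic; a plain assignment or a redirect is enough to capture 0.1. As a sketch of the multi-file case described in the question, the example below collects the y value from the same line of several files into one output file. The file names run1.dat, run2.dat and collected.txt are hypothetical.

```shell
# Build two small sample files in a temporary directory.
tmpdir=$(mktemp -d)
printf 'x y z\n1 0.2 0.3\n2 0.1 0.2\n3 0.5 0.6\n' > "$tmpdir/run1.dat"
printf 'x y z\n1 0.4 0.3\n2 0.7 0.2\n3 0.9 0.6\n' > "$tmpdir/run2.dat"

# Plain assignment; no let, because the values involved are not integers.
dataLine=3

# FNR restarts at 1 for every input file, so one awk call handles all files.
awk -v line="$dataLine" 'FNR == line { print $2 }' "$tmpdir"/run*.dat > "$tmpdir/collected.txt"

cat "$tmpdir/collected.txt"
```

For these sample files, collected.txt contains 0.1 and 0.7, one per line.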
cat TEXT | awk -v var=$i -v varB=$j '$1~var , $1~varB {print $1}' > PROBLEM HERE
I am passing two variables from an array to parse a very large text file by range. And it works, kind of.
If I use ">" the output to the file will ONLY be the last three lines as verified by cat and a text editor.
If I use ">>" the output to the file will include one complete read of TEXT and then it will divide the second read into the ranges I want.
If I let the output go through to the shell I get the same problem as above.
Question:
It appears awk is reading every line and printing it. Then it goes back and selects the ranges from the TEXT file. It does not do this if I use constants in the range pattern search.
I understand awk must read all lines to find the ranges I request.
Why is it printing the entire document?
How can I get it to ONLY print the ranges selected?
This is the last hurdle in a big project and I am beating my head against the table.
Thanks!
Give this a try; you didn't assign varB the right way:
yours: awk -v var="$i" -varB="$j" ...
mine : awk -v var="$i" -v varB="$j" ...
                       ^^
Aside from the typo, note that you can't use variables inside a /.../ regex literal; use the ~ match operator with the variable instead. Also quote your shell variables (not strictly needed here, but it is a good habit). For example:
seq 1 10 | awk -v b="3" -v e="5" '$0 ~ b, $0 ~ e'
This should print 3 through 5, as expected.
It sounds like this is what you want:
awk -v var="foo" -v varB="bar" '$1~var{f=1} f{print $1} $1~varB{f=0}' file
e.g.
$ cat file
1
2
foo
3
4
bar
5
foo
6
bar
7
$ awk -v var="foo" -v varB="bar" '$1~var{f=1} f{print $1} $1~varB{f=0}' file
foo
3
4
bar
foo
6
bar
but without sample input and expected output it's just a guess and this would not address the SHELL behavior you are seeing wrt use of > vs >>.
Here's what happened. I used an array to input into my variables. I set the counter for what I thought was the total length of the array. When the final iteration of the array was reached, there was a null value returned to awk for the variable. This caused it to print EVERYTHING. Once I correctly had a counter with the correct number of array elements the printing oddity ended.
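The effect of an empty variable can be reproduced in isolation: an empty string used as a dynamic regex matches every line, so the range starts and ends on each line and everything is printed. The sketch below shows this and one possible guard; the helper name print_range and the sample file contents are assumptions for the illustration.

```shell
tmpfile=$(mktemp)
printf '1\nfoo\n2\nbar\n3\n' > "$tmpfile"

# With empty patterns every line matches, so the whole file is printed.
all=$(awk -v var="" -v varB="" '$1 ~ var, $1 ~ varB { print $1 }' "$tmpfile")

# print_range START END FILE: refuse to run when either pattern is empty.
print_range() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "print_range: empty pattern, skipping" >&2
        return 1
    fi
    awk -v var="$1" -v varB="$2" '$1 ~ var, $1 ~ varB { print $1 }' "$3"
}

some=$(print_range foo bar "$tmpfile")
none=$(print_range "" "" "$tmpfile" 2>/dev/null) || true

echo "$some"
```

Here all holds every line of the file, some holds only foo, 2 and bar, and none stays empty because the guard rejects the call.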
As far as the > vs >> goes, I don't know. It did stop, but I wasn't as careful in documenting it. I think what happened is that I used $1 in the print command to save time, and with each line it printed at the end it erased the whole file and left the last three identical matches. Something to ponder. Thanks Ed for the honest work. And no thank you to Robo responses.
I am trying to do a simple division computation between two integers that will result in a float. I do not want to use bc. This approach works for me for a different purpose with slightly different syntax but I am not quite sure where I am messing up. I am positive that the variables are getting assigned correctly, but I have an error once I try to do the division, and nothing actually gets assigned to the variable. Can anyone help?
Thanks in advance!
rate=`awk '{ shared = "'"${tempRatioArray[0]}"'"; total = "'"${tempRatioArray[1]}"'";\
printf "%3.0f\t", shared/total }' | awk '{print}'`
You can use bc:
bc -l <<<"scale=3; 5/2"
2.500
Adapting to your code:
bc -l <<< "scale=3; ${tempRatioArray[0]} / ${tempRatioArray[1]}"
That is not the correct way of using shell variables in awk, and you don't need two awk commands.
Use it like this:
rate=$(awk -v shared="${tempRatioArray[0]}" -v total="${tempRatioArray[1]}" 'BEGIN {
printf "%.3f", (shared/total) }')
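Two details of the original attempt are worth spelling out: an awk program without BEGIN waits for input that never arrives, and "%3.0f" rounds the result to a whole number. Below is a self-contained sketch of the BEGIN version with plain variables standing in for the array elements; the values 5 and 2 and the division-by-zero guard are assumptions added for the example.

```shell
# Stand-ins for ${tempRatioArray[0]} and ${tempRatioArray[1]}.
shared=5
total=2

# BEGIN runs before any input is read, so awk does not wait on stdin.
rate=$(awk -v shared="$shared" -v total="$total" 'BEGIN {
    # Guard against division by zero (an addition for this sketch).
    if (total == 0) { print "NaN"; exit 1 }
    printf "%.3f", shared / total
}')

echo "$rate"
```

With these values the script prints 2.500.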