Need to use awk to get a specific word or value after another specific word? - bash

I need to use awk to get a specific word or value after another specific word, I tried some awk commands already but after many other filters like grep and sed. The file that I need to get the word from is having the same line more than one time like the below line:
Configuration: number=6 model=MSA SNT=4 IC=8 SIZE=16384MB NRF=24 meas=2.00
If need 24 I used
grep IC file | awk 'NF>1{print $NF}'
If need 16384MB I used
grep IC file | awk -F'SIZE=' '{ print $2 }'|awk '{ print $1 }'
We need to get any word from that line using awk? what I used can get what is needed but we still need a minimized awk command.
I am sure we can use one single awk to get the needed info from one line minimized command?

sed -r 's/.*SIZE=([^ ]+).*/\1/' input
16384MB
sed -r 's/.*NRF=([^ ]+).*/\1/' input
24
grep way :
grep -oP 'SIZE=\K[^ ]+' imput
16384MB
awk way :
awk '{for(i=1;i<=NF;i++) if($i ~ /SIZE=/) split($i,a,"=");print a[2]}' input

You could use an Awk with multi-character de-limiter as below to get this done. Loop through the fields, match the pattern you need and print the next field which contains the field value.
awk -F'[:= ]' -v option="${match}" '{for(i=1;i<=NF;i++) if ($i ~ option) {print $(i+1)}}' file
Examples,
match="number"
awk -F'[:= ]' -v option="${match}" '{for(i=1;i<=NF;i++) if ($i ~ option) {print $(i+1)}}' file
6
match="model"
awk -F'[:= ]' -v option="${match}" '{for(i=1;i<=NF;i++) if ($i ~ option) {print $(i+1)}}' file
MSA
match="meas"
awk -F'[:= ]' -v option="${match}" '{for(i=1;i<=NF;i++) if ($i ~ option) {print $(i+1)}}' file
2.00

here is a more general approach
... | awk -v k=NRF '{for(i=2;i<=NF;i++) {split($i,a,"="); m[a[1]]=a[2]} print m[k]}'
code will stay the same just change the key k.

If you have GNU awk you could use the third parameter of match:
$ awk 'match($0,/( IC=)([^ ]*)/,a)&& $0=a[2]' file
8
Or get the meas:
$ awk 'match($0,/( meas=)([^ ]*)/,a)&& $0=a[2]' file
2.00
Should you use some other awk, you could use this combination of split, substr and match:
$ awk 'split(substr($0,match($0,/ IC=[^ ]*/),RLENGTH),a,"=") && $0=a[2]' file
8

Related

Append 2 column variables in unix

I have a file as follows.
file1.csv
H,2 A:B,pq
D,34 C:B,wq
D,64 F:B,rq
D,6 R:B,tq
I want to format 2nd a column as follows
H,02 0A:0B,pq
D,34 0C:0B,wq
D,64 0F:0B,rq
D,06 0R:0B,tq
I am able to separate the column and format it but cannot merge it
I use following command
formated_nums =`awk -F"," '{print $2}' file1.csv | awk '{print $1}' | awk '{if(length($1)!=2){$1="0"$1}}1'`
formated_letters = `awk -F"," '{print $2}' file1.csv | awk '{print $2}' | awk -F":" '{if(length($1)!=2){$1="0"$1}; if(length($2)!=2){$2="0"$2}}1'| awk '{print $1":"$2}'`
Now I want to merge formated_nums and formated_letters with a space in between
I tried echo "${formated_nums} ${formated_letters}" but it takes variables as rows and appends the whole thing as a row
The simplest I found in awk is to use another separation including space and ':' and reformat the final layout. The only real tricky part is the number that need sometimes to add a 0 in front but it's trivial in formating because number are never bigger than 2 digit (here)
awk -F '[[:blank:],:]' '{printf("%s,%02d 0%s:0%s,%s", $1, $2, $3, $4, $5)}' YourFile
Assuming your data are in the same format (no bigger latest field with space or other "separator" inside)
An alternative awk solution based on gnu awk :
awk -F"[, :]" '{sub($2,sprintf("%02d",$2));sub($3,"0" $3);sub($4,"0" $4)}1' file1
H,02 0A:0B,pq
D,34 0C:0B,wq
D,64 0F:0B,rq
D,06 0R:0B,tq
It sounds like this is what you're really looking for:
$ awk '
BEGIN { FS=OFS=","; p=2 }
{ split($2,t,/[ :]/); for (i in t) {n=length(t[i]); t[i] = (n<p ? sprintf("%0*s",p-n,0) : "") t[i]; $2=t[1]" "t[2]":"t[3]} }
1
' file
H,02 0A:0B,pq
D,34 0C:0B,wq
D,64 0F:0B,rq
D,06 0R:0B,tq

grep/awk specific lines based on specific fields; using ksh variable with awk

I have this input file: file_in.txt (delimited by pipe)
3345:tyg|rty|27|0|0|ty6|{89|io|}62|0
3346:tyg|rtyuio|63|0|1|ty6|{89|gh|}45|0
3347:tyu|ray|24|0|0|ty6|{89|uh|}27|0
3348:tyg|rtoy|93|0|1|ty6|{89|yh|}1|0
3349:tyo|rtert|28|0|0|ty6|{89|gh|}27|0
I want to get only those lines which have 9th field value as }27 using '|' as delimiter so that my output should be:
3347:tyu|ray|24|0|0|ty6|{89|uh|}27|0
3349:tyo|rtert|28|0|0|ty6|{89|gh|}27|0
Below command works fine:
awk -F"|" '{ if ($9 == "}27") print $0 }' file_in.txt
But I want to use a shell variable instead of "}27" for which I tried this:
taskid="}27"
awk -v tid="$taskid" -F"|" '{ if ($9 == "}tid") print $0 }' file_in.txt
Please help me figure out where I am going wrong with this command.
Any other command suggestions to achieve the same are appreciated.
This should work:
taskid="}27"
awk -F'|' -v tid="$taskid" '$9 == tid' file
Output:
3347:tyu|ray|24|0|0|ty6|{89|uh|}27|0
3349:tyo|rtert|28|0|0|ty6|{89|gh|}27|0
Assuming your shell variable $tasked has the value 27, you want to use one of these forms:
build the string with the open brace in the shell
awk -v tid="}$taskid" -F"|" '$9 == tid' file
or do it in awk --- awk's string concatenation is just placing strings side-by-side with optional whitespace in between
awk -v tid="$taskid" -F"|" '$9 == "}" tid' file
Your own command should have worked with this change:
$ ksh
$ taskid=}27
$ awk -v tid=$taskid -F"|" '{ if ($9 == tid) print $0}' file_in.txt
Output:
3347:tyu|ray|24|0|0|ty6|{89|uh|}27|0
3349:tyo|rtert|28|0|0|ty6|{89|gh|}27|0

Multiple if statements in awk

I have a file that looks like
01/11/2015;998978000000;4890********3290;5735;ITUNES.COM/BILL;LU;Cross_border_rub;4065;17;915;INSUFF FUNDS;51;0;
There are 13 semicolon separated columns.
I'm trying to calculate 9 columns for all lines:
awk -F ';' -vOFS=';' '{ gsub(",", ".", $9); print }' file |
awk -F ';' '$0 = NR-1";"$0' |
awk -F ';' -vOFS=';' '{bar[$1]=$1;a[$1]=$2;b[$1]=$3;c[$1]=$4;d[$1]=$5;e[$1]=$6;f[$1]=$7;g[$1]=$8;h[$1]=$9;k[$1]=$10;l[$1]=$11;l[$1]=$12;m[$1]=$13;p[$1]=$14;};
if($7="International") {income=0.0162*h[i]+0.0425*h[i]};
else if($7="Domestic") {income=0.0188*h[i]};
else if($7="Cross_border_rub") {income=0.0162*h[i]+0.025*h[i]}
END{for(i in bar) print income";"a[i],b[i],c[i],d[i],e[i],f[i],g[i],h[i],k[i],l[i],m[i],p[i]}'
How exactly do multiple if statements correctly work in awk?
awk to the rescue!
You don't need the multiple awk invocations. Can consolidate into one
$ awk -F';' -v OFS=';' '{gsub(",", ".", $9)}
$7=="International" {income=(0.0162+0.0425)*$9}
$7=="Domestic" {income=0.0188*$9}
$7=="Cross_border_rub" {income=(0.0162+0.025)*$9}
# what happens for other values since previous income will be copied over
{print income, NR-1, $0}' file
test with your file since you didn't provide a enough sample to test.
Perhaps better if you just assign the rate
$ awk -F';' -v OFS=';' '{gsub(",", ".", $9); rate=0}
$7=="International" {rate=0.0162+0.0425}
$7=="Domestic" {rate=0.0188}
$7=="Cross_border_rub" {rate=0.0162+0.025}
{print rate*$9, NR-1, $0}' file

How to print a range of columns in a CSV in AWK? [duplicate]

This question already has answers here:
Extract specific columns from delimited file using Awk
(8 answers)
Closed 4 years ago.
With awk, I can print any column within a CSV, e.g., this will print the 10th column in file.csv.
awk -F, '{ print $10 }' file.csv
If I need to print columns 5-10, including the comma, I only know this way:
awk -F, '{ print $5","$6","$7","$8","$9","$10 }' file.csv
This method is not so good if I want to print many columns. Is there a simpler syntax for printing a range of columns in a CSV in awk?
The standard way to do this in awk is using a for loop:
awk -v s=5 -v e=10 'BEGIN{FS=OFS=","}{for (i=s; i<=e; ++i) printf "%s%s", $i, (i<e?OFS:ORS)}' file
However, if your delimiter is simple (as in your example), you may prefer to use cut:
cut -d, -f5-10 file
Perl deserves a mention (using -a to enable autosplit mode):
perl -F, -lane '$"=","; print "#F[4..9]"' file
You can use a loop in awk to print columns from 5 to 10:
awk -F, '{ for (i=5; i<=10; i++) print $i }' file.csv
Keep in mind that using print it will print each columns on a new line. If you want to print them on same line using OFS then use:
awk -F, -v OFS=, '{ for (i=5; i<=10; i++) printf("%s%s", $i, OFS) }' file.csv
With GNU awk for gensub():
$ cat file
a,b,c,d,e,f,g,h,i,j,k,l,m
$
$ awk -v s=5 -v n=6 '{ print gensub("(([^,]+,){"s-1"})(([^,]+,){"n-1"}[^,]+).*","\\3","") }' file
e,f,g,h,i,j
s is the start position and n is the number of fields to print from that point on. Or if you prefer to specify start and end:
$ awk -v s=5 -v e=10 '{ print gensub("(([^,]+,){"s-1"})(([^,]+,){"e-s"}[^,]+).*","\\3","") }' file
e,f,g,h,i,j
Note that this will only work with single-character field separators since it relies on being able to negate the FS in a character class.

awk - split only by first occurrence

I have a line like:
one:two:three:four:five:six seven:eight
and I want to use awk to get $1 to be one and $2 to be two:three:four:five:six seven:eight
I know I can get it by doing sed before. That is to change the first occurrence of : with sed then awk it using the new delimiter.
However replacing the delimiter with a new one would not help me since I can not guarantee that the new delimiter will not already be somewhere in the text.
I want to know if there is an option to get awk to behave this way
So something like:
awk -F: '{print $1,$2}'
will print:
one two:three:four:five:six seven:eight
I will also want to do some manipulations on $1 and $2 so I don't want just to substitute the first occurrence of :.
Without any substitutions
echo "one:two:three:four:five" | awk -F: '{ st = index($0,":");print $1 " " substr($0,st+1)}'
The index command finds the first occurance of the ":" in the whole string, so in this case the variable st would be set to 4. I then use substr function to grab all the rest of the string from starting from position st+1, if no end number supplied it'll go to the end of the string. The output being
one two:three:four:five
If you want to do further processing you could always set the string to a variable for further processing.
rem = substr($0,st+1)
Note this was tested on Solaris AWK but I can't see any reason why this shouldn't work on other flavours.
Some like this?
echo "one:two:three:four:five:six" | awk '{sub(/:/," ")}1'
one two:three:four:five:six
This replaces the first : to space.
You can then later get it into $1, $2
echo "one:two:three:four:five:six" | awk '{sub(/:/," ")}1' | awk '{print $1,$2}'
one two:three:four:five:six
Or in same awk, so even with substitution, you get $1 and $2 the way you like
echo "one:two:three:four:five:six" | awk '{sub(/:/," ");$1=$1;print $1,$2}'
one two:three:four:five:six
EDIT:
Using a different separator you can get first one as filed $1 and rest in $2 like this:
echo "one:two:three:four:five:six seven:eight" | awk -F\| '{sub(/:/,"|");$1=$1;print "$1="$1 "\n$2="$2}'
$1=one
$2=two:three:four:five:six seven:eight
Unique separator
echo "one:two:three:four:five:six seven:eight" | awk -F"#;#." '{sub(/:/,"#;#.");$1=$1;print "$1="$1 "\n$2="$2}'
$1=one
$2=two:three:four:five:six seven:eight
The closest you can get with is with GNU awk's FPAT:
$ awk '{print $1}' FPAT='(^[^:]+)|(:.*)' file
one
$ awk '{print $2}' FPAT='(^[^:]+)|(:.*)' file
:two:three:four:five:six seven:eight
But $2 will include the leading delimiter but you could use substr to fix that:
$ awk '{print substr($2,2)}' FPAT='(^[^:]+)|(:.*)' file
two:three:four:five:six seven:eight
So putting it all together:
$ awk '{print $1, substr($2,2)}' FPAT='(^[^:]+)|(:.*)' file
one two:three:four:five:six seven:eight
Storing the results of the substr back in $2 will allow further processing on $2 without the leading delimiter:
$ awk '{$2=substr($2,2); print $1,$2}' FPAT='(^[^:]+)|(:.*)' file
one two:three:four:five:six seven:eight
A solution that should work with mawk 1.3.3:
awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $1}' FS='\0'
one
awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $2}' FS='\0'
two:three:four five:six:seven
awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $1,$2}' FS='\0'
one two:three:four five:six:seven
Just throwing this on here as a solution I came up with where I wanted to split the first two columns on : but keep the rest of the line intact.
Comments inline.
echo "a:b:c:d::e" | \
awk '{
split($0,f,":"); # split $0 into array of fields `f`
sub(/^([^:]+:){2}/,"",$0); # remove first two "fields" from `$0`
print f[1],f[2],$0 # print first two elements of `f` and edited `$0`
}'
Returns:
a b c:d::e
In my input I didn't have to worry about the first two fields containing escaped :, if that was a requirement, this solution wouldn't work as expected.
Amended to match the original requirements:
echo "a:b:c:d::e" | \
awk '{
split($0,f,":");
sub(/^([^:]+:)/,"",$0);
print f[1],$0
}'
Returns:
a b:c:d::e

Resources