Suppose I have a CSV input file called input.csv:
1,2,3,valueIWantToPrint
3,5,2,valueIWantToPrint
I currently need to print the last element of each row of that input with awk, which is easily accomplished with the field separator and NF variables:
awk -F"," '{print $NF}' input.csv
But now let's say that I want to make the field that I want to print a variable, because later perhaps the input format will change and it will be a different field.
Input file:
1,2,valueIWantToPrint,3
3,5,valueIWantToPrint,2
Script:
FIELD_TO_PRINT=3
awk -F"," -v fieldToPrint="$FIELD_TO_PRINT" '{print $fieldToPrint}' input.csv
Ok, that was easy. But now, to make it as flexible as possible, I would like to have the ability to set FIELD_TO_PRINT to the equivalent of NF so that I can print the last value regardless of the number of fields. What I'm after is something like this:
Input:
1,2,3,7,2,5,23,1,3,6,valueIWantToPrint
3,5,2,6,3,valueIWantToPrint
This script doesn't work, but illustrates what I am trying to accomplish:
FIELD_TO_PRINT=NF
awk -F"," -v fieldToPrint="$FIELD_TO_PRINT" '{print $fieldToPrint}' input.csv
Is there a convenient way to set a variable to mean "the last field in record?" This example is pretty trivial, but ultimately the FIELD_TO_PRINT variable will be put in a separate configuration file, and the awk script will be much larger and more complex. So having a good way to accomplish this will be very useful.
You can do it in a roundabout way, where a negative n means NF:
n=-1
awk -F, -v n="$n" '{print (n<0 ? $NF : $n)}' f.csv
valueIWantToPrint
valueIWantToPrint
When n > 0, the numbered field is printed:
n=3
awk -F, -v n="$n" '{print (n<0 ? $NF : $n)}' f.csv
3
2
You can use this trick:
$ awk -F, -v n=-1 '{print (n<0 ? $(NF+n+1) : $n)}' file
valueIWantToPrint
valueIWantToPrint
This assumes that negative indices count backwards from NF, so -1 means the last field, -2 the penultimate field, and so on.
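The two answers above can be folded into one small wrapper. This is only a sketch: the function name print_field and the convention that the configured value may be the literal string NF, a negative offset, or a positive index are assumptions made for illustration, not something taken from the original scripts.

```shell
# print_field SPEC [FILE...]: hypothetical helper, not from the original post.
# SPEC may be "NF" (last field), a negative offset (-1 = last, -2 = penultimate),
# or a positive 1-based field number.
print_field() {
    spec=$1
    shift
    awk -F, -v spec="$spec" '
        {
            if (spec == "NF")  n = NF             # literal NF means the last field
            else if (spec < 0) n = NF + spec + 1  # count backwards from the end
            else               n = spec           # plain 1-based index
            print $n
        }' "$@"
}

sample='1,2,3,7,valueIWantToPrint
3,5,valueIWantToPrint'

last=$(printf '%s\n' "$sample" | print_field NF)
also_last=$(printf '%s\n' "$sample" | print_field -1)
first=$(printf '%s\n' "$sample" | print_field 1)
```

The value passed as SPEC could just as well be read from a configuration file. A negative offset larger than the number of fields in a record is not guarded against here.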
I have a CSV file which has 4 columns. I want to first:
print the first 10 items of each column
only print the items in the third column
My method is to pipe the first awk command into another, but I didn't get exactly what I wanted:
awk 'NR < 10' my_file.csv | awk '{ print $3 }'
The only missing thing was the -F.
awk -F "," 'NR < 10' my_file.csv | awk -F "," '{ print $3 }'
You don't need to run awk twice.
awk -F, 'NR<=10{print $3}'
This prints the third field for every line whose record number (line) is less than or equal to 10.
Note that < is different from <=. The former matches records one through nine, the latter matches records one through ten. If you need ten records, use the latter.
Note that this will walk through your entire file, so if you want to optimize your performance:
awk -F, '{print $3} NR>=10{exit}'
This prints the third column, then exits as soon as the record number reaches 10, so it does not step through the rest of your file.
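A quick way to check how many records a given variant really prints is to count its output lines on generated input. The sketch below uses made-up data (25 records named a1,b1,c1 and so on) and places the exit test before the print action, which is one way to stop after exactly ten records.

```shell
# Generate 25 three-column records (illustrative data only).
data=$(awk 'BEGIN { for (i = 1; i <= 25; i++) print "a" i ",b" i ",c" i }')

# NR < 10 prints records 1..9, NR <= 10 prints records 1..10.
count_lt=$(printf '%s\n' "$data" | awk -F, 'NR < 10 { print $3 }' | wc -l | tr -d ' ')
count_le=$(printf '%s\n' "$data" | awk -F, 'NR <= 10 { print $3 }' | wc -l | tr -d ' ')

# Checking the exit condition before printing stops after ten records
# without reading the rest of the input.
count_exit=$(printf '%s\n' "$data" | awk -F, 'NR > 10 { exit } { print $3 }' | wc -l | tr -d ' ')
```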
Note also that awk's "CSV" matching is very simple; awk does not understand quoted fields, so the record:
red,"orange,yellow",green
has four fields, two of which have double quotes in them. YMMV depending on your input.
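The field count claimed above is easy to confirm. This short check feeds that one record to awk and captures NF and the second field:

```shell
line='red,"orange,yellow",green'

# awk splits on every comma, including the one inside the quotes.
nfields=$(printf '%s\n' "$line" | awk -F, '{ print NF }')
second=$(printf '%s\n' "$line" | awk -F, '{ print $2 }')
```

Here nfields is 4 and second is `"orange`, with the opening quote still attached.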
I have a line like:
one:two:three:four:five:six seven:eight
and I want to use awk to get $1 to be one and $2 to be two:three:four:five:six seven:eight
I know I can get it by doing sed before. That is to change the first occurrence of : with sed then awk it using the new delimiter.
However replacing the delimiter with a new one would not help me since I can not guarantee that the new delimiter will not already be somewhere in the text.
I want to know if there is an option to get awk to behave this way
So something like:
awk -F: '{print $1,$2}'
will print:
one two:three:four:five:six seven:eight
I will also want to do some manipulations on $1 and $2 so I don't want just to substitute the first occurrence of :.
Without any substitutions
echo "one:two:three:four:five" | awk -F: '{ st = index($0,":");print $1 " " substr($0,st+1)}'
The index function finds the first occurrence of ":" in the whole string, so in this case the variable st is set to 4. The substr function then grabs the rest of the string starting from position st+1; if no length is supplied, it runs to the end of the string. The output is:
one two:three:four:five
If you want to do further processing, you can store the remainder in a variable:
rem = substr($0,st+1)
Note this was tested on Solaris AWK but I can't see any reason why this shouldn't work on other flavours.
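As a sketch of the "further processing" idea, both parts can be kept in variables and manipulated separately. The function name split_first, the toupper call, and the "|" in the output are arbitrary choices for illustration, not part of the original answer.

```shell
# split_first: hypothetical helper that splits each input line at the first ":".
split_first() {
    awk '{
        st = index($0, ":")              # position of the first ":" (0 if none)
        if (st == 0) { print $0; next }  # no delimiter: pass the line through
        head = substr($0, 1, st - 1)     # everything before the first ":"
        rem  = substr($0, st + 1)        # everything after it, left untouched
        print toupper(head) "|" rem      # example manipulation of each part
    }'
}

out=$(echo "one:two:three:four:five:six seven:eight" | split_first)
```

No replacement delimiter is ever written into the data, so it does not matter which characters the rest of the line contains.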
Something like this?
echo "one:two:three:four:five:six" | awk '{sub(/:/," ")}1'
one two:three:four:five:six
This replaces the first : with a space.
You can then later get it into $1, $2
echo "one:two:three:four:five:six" | awk '{sub(/:/," ")}1' | awk '{print $1,$2}'
one two:three:four:five:six
Or in the same awk, so even with substitution, you get $1 and $2 the way you like:
echo "one:two:three:four:five:six" | awk '{sub(/:/," ");$1=$1;print $1,$2}'
one two:three:four:five:six
EDIT:
Using a different separator, you can get the first part as field $1 and the rest in $2 like this:
echo "one:two:three:four:five:six seven:eight" | awk -F\| '{sub(/:/,"|");$1=$1;print "$1="$1 "\n$2="$2}'
$1=one
$2=two:three:four:five:six seven:eight
Unique separator
echo "one:two:three:four:five:six seven:eight" | awk -F"#;#." '{sub(/:/,"#;#.");$1=$1;print "$1="$1 "\n$2="$2}'
$1=one
$2=two:three:four:five:six seven:eight
The closest you can get is with GNU awk's FPAT:
$ awk '{print $1}' FPAT='(^[^:]+)|(:.*)' file
one
$ awk '{print $2}' FPAT='(^[^:]+)|(:.*)' file
:two:three:four:five:six seven:eight
Here $2 includes the leading delimiter, but you can use substr to fix that:
$ awk '{print substr($2,2)}' FPAT='(^[^:]+)|(:.*)' file
two:three:four:five:six seven:eight
So putting it all together:
$ awk '{print $1, substr($2,2)}' FPAT='(^[^:]+)|(:.*)' file
one two:three:four:five:six seven:eight
Storing the results of the substr back in $2 will allow further processing on $2 without the leading delimiter:
$ awk '{$2=substr($2,2); print $1,$2}' FPAT='(^[^:]+)|(:.*)' file
one two:three:four:five:six seven:eight
A solution that should work with mawk 1.3.3:
awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $1}' FS='\0'
one
awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $2}' FS='\0'
two:three:four five:six:seven
awk '{n=index($0,":");s=$0;$1=substr(s,1,n-1);$2=substr(s,n+1);print $1,$2}' FS='\0'
one two:three:four five:six:seven
Just throwing this on here as a solution I came up with where I wanted to split the first two columns on : but keep the rest of the line intact.
Comments inline.
echo "a:b:c:d::e" | \
awk '{
split($0,f,":"); # split $0 into array of fields `f`
sub(/^([^:]+:){2}/,"",$0); # remove first two "fields" from `$0`
print f[1],f[2],$0 # print first two elements of `f` and edited `$0`
}'
Returns:
a b c:d::e
In my input I didn't have to worry about the first two fields containing an escaped :. If that were a requirement, this solution wouldn't work as expected.
Amended to match the original requirements:
echo "a:b:c:d::e" | \
awk '{
split($0,f,":");
sub(/^([^:]+:)/,"",$0);
print f[1],$0
}'
Returns:
a b:c:d::e
I want to record the RSSI at a certain point with the distance that point is from a router. The distance will be user input and so will the output file name so the user will type something like:
sh rssi.sh output.csv 20
where output.csv is the csv I want to append the results to and 20 is the distance
at the moment rssi.sh looks like:
#!/bin/bash
RSSI_CSV=$1
DISTANCE=$2
RSSI=$(iwconfig wlan0 | awk -F'[ =]+' '/Signal level/ {print $7}')
awk '{print $DISTANCE, $RSSI}' > $RSSI_CSV
This creates RSSI_CSV as per user input but doesn't print the required values in it and I'm not sure why.
I imagine it's
awk '{print $DISTANCE, $RSSI}' > $RSSI_CSV
that isn't working, as echo $RSSI and echo $DISTANCE both output the values to the screen. I'm using awk because I want columns so that I can output a CSV file, though perhaps there is a better way?
There are a couple of issues with your awk command: you need to pass the shell variables in with the -v option, and use a BEGIN block because no input is given. Also note that a single > will not append to the file but overwrite it. For appending you need >>:
awk -v D="$DISTANCE" -v R="$RSSI" 'BEGIN{print D,R}' >> "$RSSI_CSV"
Demo:
$ DISTANCE=20
$ RSSI=$(iwconfig wlan0 | awk -F'[ =]+' '/Signal level/ {print $7}')
$ awk -vD=$DISTANCE -vR=$RSSI 'BEGIN{print D,R}'
20 -47
Note: I believe you want comma separated values so:
$ awk -vD=$DISTANCE -vR=$RSSI 'BEGIN{print D","R}'
20,-47
However, awk is overkill for printing variables; just use good old echo:
$ echo "$DISTANCE,$RSSI"
20,-47
You don't need awk to print two shell variables.
printf "%s,%s\n" "$DISTANCE" "$RSSI" >> "$RSSI_CSV"
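The appending behaviour can be tried without a wireless interface. In this sketch the RSSI readings are hard-coded stand-ins (the second pair, 30 and -52, is made up), and a temporary file takes the place of the user-supplied output name.

```shell
# Sketch only: values are hard-coded instead of being read from iwconfig.
RSSI_CSV=$(mktemp)    # stand-in for the file name given as $1
DISTANCE=20           # stand-in for $2
RSSI=-47              # stand-in for the iwconfig | awk pipeline

# First run: >> creates the file if needed and appends one row.
printf '%s,%s\n' "$DISTANCE" "$RSSI" >> "$RSSI_CSV"

# Second run with different (hypothetical) values: the first row is kept.
DISTANCE=30
RSSI=-52
printf '%s,%s\n' "$DISTANCE" "$RSSI" >> "$RSSI_CSV"

contents=$(cat "$RSSI_CSV")
rm -f "$RSSI_CSV"
```

With a single > in place of >>, only the last row would survive.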
I'd like to filter the content of /etc/passwd, only showing the lines for which the value in the third column is greater than 999.
Is there an easy way to do this with a one liner? I'd like to do it without writing a boring for-loop.
This is a simple way to do it:
awk -F: '$3 > 999' /etc/passwd
This uses awk with a field separator of : and instructs it to print the line if the third field is greater than 999. If you want to only print the first field (username) or construct some new lines based on the fields, this is a starting point:
awk -F: '{if ($3 > 999) print "user", $1, "uid", $3}' /etc/passwd
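To try the filter without depending on the local /etc/passwd, the same pattern can be run against a few sample lines. The accounts below are invented for illustration, and this variant prints only the user name.

```shell
# Made-up sample in /etc/passwd format.
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/bash
bob:x:1001:1001:Bob:/home/bob:/bin/sh'

# Keep lines whose third field (the UID) is greater than 999; print the first field.
users=$(printf '%s\n' "$sample" | awk -F: '$3 > 999 { print $1 }')
```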
How should I go about inserting a character at a certain point in a csv line? For instance, if I had the following:
1,2,3,4,5,6,7
How could I insert ,,,,, at the spot where the 5 (fifth field) is, so it would look like
1,2,3,4,,,,,,5,6,7
I found a link for how to do this for java, but unfortunately I did not have much luck finding out how to do it with bash. Any help would be much appreciated, thanks!
You can use awk to change a specific field:
awk -F"," '{OFS=","; a=$5; $5=",,,,," a; print $0}' file
The idea is to update the field 5 with the desired values and then print the whole line.
echo "1,2,3,4,5,6,7" | awk -F"," '{a=$5; $5=",,,,,"a; OFS=","; print}'
would print:
1,2,3,4,,,,,,5,6,7
awk -F, 'BEGIN{OFS=","}{$5=",,,,,"$5;print}' your_file
tested below:
> echo "1,2,3,4,5,6" | awk -F, 'BEGIN{OFS=","}{$5=",,,,,"$5;print}'
1,2,3,4,,,,,,5,6
>
Or you can do it using perl:
> echo "1,2,3,4,5,6" | perl -F, -lane '$F[4]=~s/^/,,,,,/g;print join(",",@F)'
1,2,3,4,,,,,,5,6
>
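If the field number or the inserted text may change, both can be passed in with -v rather than hard-coded. This is a sketch; the function name insert_before and its argument order are assumptions for illustration.

```shell
# insert_before N PAD: hypothetical helper that prefixes field N with PAD.
insert_before() {
    # Assigning to $n makes awk rebuild the record using OFS.
    awk -F, -v OFS=, -v n="$1" -v pad="$2" '{ $n = pad $n; print }'
}

out=$(echo "1,2,3,4,5,6,7" | insert_before 5 ",,,,,")
```

For the sample line this produces the same result as the hard-coded versions above, with the five commas placed in front of the fifth field.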