I'm new to awk and have found it very useful for extracting data from columns. For example, in my file I had
Data: 1234 23434 31324
If I wanted the second value (the third field, since Data: itself is $1), I used:
awk '/Data:/ {print $3}' file.txt
But next, I had some variables inside the file, let's say:
variable_1=1
variable_2=4
How can I extract only the value? And how can I extract the name of a variable if I know its value?
awk lets you specify the field delimiter:
awk -F'=' '$1 == "variable_1" {print $2}' file
Prints:
1
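For the reverse lookup (finding a variable's name from its value), the same delimiter trick works in the other direction; a minimal sketch, assuming 4 is the value you are looking for:
awk -F'=' '$2 == "4" {print $1}' file
Prints:
variable_2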
You can do a lot of things with your file; what exactly do you want?
Get values:
source file.txt
echo "variable_1=${variable_1}"
echo "variable_2=${variable_2}"
Get the keys corresponding to the value 2:
sed -n '/=2$/ s/=.*//p' file.txt
I have a |-delimited file and am trying to implement the logic below.
cat list.txt
101
102
103
LIST=`cat list.txt`
Input file
1|Anand|101|1001
2|Raj|103|1002
3|Unix|110|101
Expected result
1|Anand|101|1001
2|Raj|103|1002
3|Unix|UNKNOWN|101
I tried two methods:
Using fgrep with list.txt as the patterns file to split the input into two files, one matching the list and one not, and then rewriting the non-matching file with awk and gsub to replace the 3rd column with UNKNOWN. The issue is that in the 3rd row the 4th column contains a value from list.txt, so I could not get the expected result.
Using a one-liner awk, passing the list in via -v VAR. Here the output does not change at all:
awk -F"|" -v VAR="$LIST" '{if($3 !~ $VAR) {{{gsub(/.*/,"UNKNOWN", $3)1} else { print 0}' input_file
Can you please suggest how to attain the expected results?
There is no need to use cat to read the complete file into a variable.
You may just use this awk, which first reads list.txt into an array (FNR==NR is true only while the first file is being read) and then prints every record of input_file, replacing the 3rd field whenever it is not in the array (the trailing 1 is an always-true pattern whose default action is to print the record):
awk 'BEGIN {FS=OFS="|"}
FNR==NR {a[$1]; next}
!($3 in a) {$3 = "UNKNOWN"} 1' list.txt input_file
1|Anand|101|1001
2|Raj|103|1002
3|Unix|UNKNOWN|101
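If you would rather pass the list through a shell variable, as in your second attempt, split it inside awk rather than matching it as a regex; a sketch, assuming LIST holds the whitespace-separated contents of list.txt as in the question:
awk -v VAR="$LIST" 'BEGIN {FS=OFS="|"; n = split(VAR, t, "[[:space:]]+"); for (i = 1; i <= n; i++) a[t[i]]}
!($3 in a) {$3 = "UNKNOWN"} 1' input_file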
I have a CSV file which contains something like this:
"1","32","1","2"
"2","2","22","2"
"3","72","5","2"
"4","36","22","2"
I want to display only the first field if the third field contains the value 22.
In my example, I want to have:
2
4
I was thinking something like this:
awk -F , -v OFS=, '{if ($3=="22")} {print $1}' myfile.csv
How can I do that?
If it's fine to keep the quotation marks:
awk -F, '$3=="\"22\""{print $1}' test.csv
The output in this case:
"2"
"4"
To get rid of the quotation marks, you could do this:
awk -F\" '$6==22{print $2}' test.csv
Output:
2
4
In this case, quotation marks are treated as delimiters. Therefore, we have to adjust the numbering of columns.
Of course, you can also replace the quotation marks:
awk -F, '$3=="\"22\""{str=$1; gsub("\"","",str); print str}' test.csv
Here is a simpler awk command to get your job done. The field separator is a regex matching either "," (between two values) or a lone " (at the start and end of the line), so $1 is empty and the bare values are $2 through $5:
awk -F '","|"' '$4 == 22{print $2}' file
2
4
IMHO the simplest thing is to get rid of the double quotes and then do whatever you want:
$ awk -F, '{gsub(/"/,"")} $3==22{print $1}' file
2
4
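All of these assume there are no commas inside the quoted fields. For fully general CSV, GNU awk's FPAT lets you describe what a field looks like instead of what separates fields; a sketch, assuming gawk is available:
gawk -v FPAT='"[^"]*"|[^,]+' '$3 == "\"22\"" {gsub(/"/, "", $1); print $1}' test.csv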
I have this DB dump in a comma-separated CSV file. The first line is the heading/table name, the rest is data, and some rows are duplicates:
HOST_#_INFORMATION,HOST#,Primary Hostname,DNS Domain,IP_#_INFORMATION,Primary IP,DNS
,11,abc,example.com,,10.10.10.10,10.10.10.1
,12,bcd,example.com,,10.10.10.11,10.10.10.1
,13,cde,example.com,,10.10.10.12,10.10.10.1
,11,abc,example.com,,10.10.10.10,10.10.10.1
,13,cde,example.com,,10.10.10.12,10.10.10.1
I need to print only the unique columns between HOST_#_INFORMATION and IP_#_INFORMATION. The output I am looking for is:
HOST#,Primary Hostname,DNS Domain
11,abc,example.com
12,bcd,example.com
13,cde,example.com
I tried with awk's gsub option, but it only prints the first line. How can I parse this CSV file? I am open to a Perl option as well. Thanks.
The array a remembers which $2,$3,$4 combinations have already been seen, so only the first occurrence of each is printed:
[root@test /tmp]$ awk -F, -vOFS=, '{if(++a[$2,$3,$4]==1)print $2,$3,$4}' a
HOST#,Primary Hostname,DNS Domain
11,abc,example.com
12,bcd,example.com
13,cde,example.com
No need for awk or sed, use cut'n'sort instead (tail skips the header line, which would otherwise end up in the sorted output):
tail -n +2 infile | cut -d, -f2-4 | sort -u
Output:
11,abc,example.com
12,bcd,example.com
13,cde,example.com
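If you need the header row as well, print it separately before deduplicating the data; a sketch under the same assumptions:
head -n 1 infile | cut -d, -f2-4
tail -n +2 infile | cut -d, -f2-4 | sort -u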
Assuming your input format (the OP asks for the columns between two fields, but only one layout is shown):
awk -F ',' 'NR == 1{print "HOST#,Primary Hostname,DNS Domain"} NR > 1{print $2 "," $3 "," $4}' YourFile
Assuming you will parse header separately from data, this is how to parse data and remove duplicates:
awk -F',' '{print $2","$3","$4}' | sort -u
In Perl you could use the Text::CSV module, which has a rich set of functions for dealing with CSV files.
I want to sum up the values in a field of a CSV file, where the rows to include are determined by reading another file. Say we have CSV_file:
adam,18
denis,19
julie,17
adam,15
max,20
julie,19
and a simple txt file containing:
adam
julie
All I need is to sum up 18, 15, 17, and 19. How could I easily do that with awk?
This reads the CSV first (NR==FNR), accumulating a per-name total in s, then adds up the totals of the names listed in names.txt; with the data above it prints 69:
awk 'NR==FNR{ s[$1]+= $2; next} {t+=s[$1]} END{ print t}' FS=, csv-file names.txt
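An equivalent sketch that reads the names file first instead, so that only matching rows are summed:
awk -F, 'NR==FNR {want[$1]; next} $1 in want {t += $2} END {print t}' names.txt CSV_file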
Assuming names.txt is:
adam
julie
And values.txt is:
adam,18
denis,19
julie,17
adam,15
max,20
julie,19
Then you can make use of grep's -f flag, which reads patterns from a file, one per line, and returns every line of values.txt that matches any of them. After that, awk just parses out the numbers and sums them:
grep -f names.txt values.txt | \
awk 'BEGIN{FS=",";total=0}{total+=$2}END{print total}'
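One caveat: grep treats each line of names.txt as a regular expression and matches it anywhere in the line, so a name that happens to appear inside another field would also match. Fixed-string, whole-word matching is safer; a sketch, assuming GNU grep:
grep -wFf names.txt values.txt | awk -F, '{total += $2} END {print total}'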
I put together this shell script to do two things:
Change the delimiters in a data file ('::' to ',' in this case)
Select the columns I want and append them to a new file
It works, but I want a better way to do this. I specifically want to find an alternative method for exploding each line into an array; using the positional parameters doesn't seem like the way to go. ANY COMMENTS ARE WELCOME.
# Takes a ::-separated file as the 1st parameter
SOURCE=$1
# create csv target file
TARGET=${SOURCE/dat/csv}
touch $TARGET
echo "#userId,itemId" > "$TARGET"
IFS=","
while read LINE
do
# Replaces all matches of :: with a ,
CSV_LINE=${LINE//::/,}
set -- $CSV_LINE
echo "$1,$2" >> $TARGET
done < $SOURCE
Instead of set, you can use an array:
arr=($CSV_LINE)
echo "${arr[0]},${arr[1]}"
The following would print columns 1 and 2 from infile.dat. Replace $1, $2 with a comma-separated list of the columns you do want. (Note that awk's input separator is FS, not IFS, and its value must be double-quoted so the shell's single quotes around the program stay intact.)
awk 'BEGIN { FS="::"; OFS="," } { print $1, $2 }' infile.dat > infile.csv
Perl probably has a one-liner to do it.
Awk can probably do it easily too.
My first reaction is a combination of sed and awk:
Sed to convert the delimiters
Awk to process specific columns
cat inputfile | sed -e 's/::/,/g' | awk -F, -v OFS=, '{print $1, $2}'
# Or, to avoid a UUOC award (and prolong the life of your keyboard by 3 characters):
sed -e 's/::/,/g' inputfile | awk -F, -v OFS=, '{print $1, $2}'
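If you only need the first two columns, sed alone can do both steps; a sketch, assuming no commas occur in the data itself:
sed -E 's/::/,/g; s/^([^,]*,[^,]*).*/\1/' inputfile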
awk is indeed the right tool for the job here; it's a simple one-liner.
$ cat test.in
a::b::c
d::e::f
g::h::i
$ awk -F:: -v OFS=, '{$1=$1;print;print $2,$3 >> "altfile"}' test.in
a,b,c
d,e,f
g,h,i
$ cat altfile
b,c
e,f
h,i
$