I have two files (file1 and file2)
file1
ABC=14.2.0.7.SAMPLE=git.xyz/plugins/gitiles/+/refs/heads/clientpatch/abc/patch142007
DEF=14.3.0.5.SAMPLE=git.xyz/plugins/gitiles/+/refs/heads/clientpatch/def/patch143005
DEF=14.3.0.5.SAMPLE2=git.calypso/plugins/gitiles/+/refs/heads/clientpatch/def/patch14300-calib
HIJ=12.0.0.0.Sp3.SAMPLE3=git.xyz/plugins/gitiles/+/refs/heads/clientpatch/hij/patch120000sp3
MNO=16.1.0.28.SAMPLE=git.xyz/plugins/gitiles/+/refs/heads/clientpatch/mno/patch161028
.......(150 lines)
file2
IJK = open
ABC = closed
PQR = closed
DEF = open
HIJ = open
LMN = closed
MNO = closed
PQR = open
......(> 150 lines)
output file
ABC=14.2.0.7.SAMPLE=git.xyz/plugins/gitiles/+/refs/heads/client/abc/patch142007=closed
DEF=14.3.0.5.SAMPLE=git.xyz/plugins/gitiles/+/refs/heads/client/def/patch143005=open
DEF=14.3.0.5.SAMPLE2=git.xyz/plugins/gitiles/+/refs/heads/client/def/patch14300-calib=open
HIJ=12.0.0.0.Sp3.SAMPLE3=git.xyz/plugins/gitiles/+/refs/heads/client/hij/patch120000sp3=open
MNO=16.1.0.28.SAMPLE=git.xyz/plugins/gitiles/+/refs/heads/client/mno/patch161028=closed
I have tried the following script, but it produces no output at all: nothing is printed, and there are no errors.
while IFS= read -r line
do
key1=`echo $line | awk -F "=" '{print $1}'` < file1
key2=`echo $line | awk -F "=" '{print $2}'` < file1
key3=`echo $line | awk -F "=" '{print $3}'` < file1
key4=`echo $line | awk -F "=" '{print $1}'` < file2
value3=`echo $line | awk -F "=" '{print $2}'` < file2
if [ "$key1" == "$key4" ]; then
echo "$key1=$key2=$key3=$value3"
fi
done
A brief description of how the code should work:
The code should compare the first columns of the two files (file1 and file2). Where a name matches, it should write a line to the output file as listed above; otherwise it should move on to the next line. It should work whether or not the two files are sorted.
Any help will be appreciated. Thank you.
Or another approach with awk that stores the file2 values in an array and then appends the correct state to the appropriate line in file1:
awk -F' = ' 'NR==FNR {a[$1]=$2; next} {print $0"="a[$1]}' file2 FS="=" file1
Example Use/Output
$ awk -F' = ' 'NR==FNR {a[$1]=$2; next} {print $0"="a[$1]}' file2 FS="=" file1
ABC=14.2.0.7.SAMPLE=git.xyz/plugins/gitiles/+/refs/heads/clientpatch/abc/patch142007=closed
DEF=14.3.0.5.SAMPLE=git.xyz/plugins/gitiles/+/refs/heads/clientpatch/def/patch143005=open
DEF=14.3.0.5.SAMPLE2=git.calypso/plugins/gitiles/+/refs/heads/clientpatch/def/patch14300-calib=open
HIJ=12.0.0.0.Sp3.SAMPLE3=git.xyz/plugins/gitiles/+/refs/heads/clientpatch/hij/patch120000sp3=open
MNO=16.1.0.28.SAMPLE=git.xyz/plugins/gitiles/+/refs/heads/clientpatch/mno/patch161028=closed
Could you please try the following?
awk '
BEGIN{
OFS="="
}
FNR==NR{
a[$1]=$NF
next
}
($1 in a){
print $0,a[$1]
}
' file2 FS="=" file1
Explanation: a detailed, commented version of the code above.
awk ' ##Starting awk program from here.
BEGIN{ ##Starting BEGIN section from here.
OFS="=" ##Setting OFS as = here for all lines.
}
FNR==NR{ ##Checking condition if FNR==NR which will be TRUE when file2 is being read.
a[$1]=$NF ##Creating an array a with index $1 and value is last field.
next ##next will skip all further statements from here.
}
($1 in a){ ##Checking condition if $1 of current line is present in array a then do following.
print $0,a[$1] ##Printing current line and value of array a with index $1.
}
' file2 FS="=" file1 ##Mentioning Input_file file2 and file1 and setting FS="=" for file1 here.
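To try the approach without touching real data, here is a small self-contained run. It is only a sketch: the scratch directory, the variable name `merged`, and the `PQR` and `ZZZ` lines in file1 are assumptions made for illustration (the `ABC` line and the file2 lines come from the question). Note that file2 in the question lists `PQR` twice; with the array approach the last value read for a key wins, and keys missing from file2 produce no output line.

```shell
#!/bin/sh
# Scratch directory with trimmed sample data (PQR and ZZZ lines in file1 are hypothetical).
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
ABC=14.2.0.7.SAMPLE=git.xyz/plugins/gitiles/+/refs/heads/clientpatch/abc/patch142007
PQR=1.0.0.0.SAMPLE=example/pqr
ZZZ=0.0.0.0.SAMPLE=example/none
EOF
cat > "$tmp/file2" <<'EOF'
ABC = closed
PQR = closed
PQR = open
EOF

# Pass 1 (file2): remember the state per key; a later duplicate overwrites an earlier one.
# Pass 2 (file1): print only the lines whose key was seen in file2.
# The separator regex accepts "=" with or without surrounding spaces.
merged=$(awk -F' *= *' 'NR == FNR { state[$1] = $2; next }
                        $1 in state { print $0 "=" state[$1] }' "$tmp/file2" "$tmp/file1")
printf '%s\n' "$merged"
rm -rf "$tmp"
```

Because `$0` is never modified, the file1 line is printed exactly as it was read, with the state appended.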
I am trying to split a file (testfile.csv) that contains the following:
1,2,4,5,6,7,8,9
a,b,c,d,e,f,g,h
q,w,e,r,t,y,u,i
a,s,d,f,g,h,j,k
z,x,c,v,b,n,m,z
into a file
1,2
a,b
q,w
a,s
z,x
and another file
4,5
c,d
e,r
d,f
c,v
but I cannot seem to do that in awk using an iterative solution.
awk -F, '{print $1, $2}'
awk -F, '{print $3, $4}'
does it for me but I would like a looping solution.
I tried
awk -F, '{ for (i=1;i< NF;i+=2) print $i, $(i+1) }' testfile.csv
but it gives me a single column. It appears that I am iterating over the first row and then moving onto the second row skipping every other element of that specific row.
You can use cut:
$ cut -d, -f1,2 file > file_1
$ cut -d, -f3,4 file > file_2
If you are going to use awk be sure to set the OFS so that the columns remain a CSV file:
$ awk 'BEGIN{FS=OFS=","}
{print $1,$2 >"f1"; print $3,$4 > "f2"}' file
$ cat f1
1,2
a,b
q,w
a,s
z,x
$ cat f2
4,5
c,d
e,r
d,f
c,v
Is there a quick and dirty way of naming the resulting files after the first row of their first column (so the first file would be 1.csv and the second file would be 4.csv)?
awk 'BEGIN{FS=OFS=","}
FNR==1 {n1=$1 ".csv"; n2=$3 ".csv"}
{print $1,$2 >n1; print $3,$4 > n2}' file
awk -F, '{ for (i=1; i < NF; i+=2) print $i, $(i+1) > (i ".csv")}' tes.csv
works for me. I was trying to get the output in bash which was all jumbled up.
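For a looping version that keeps the output comma-separated, a sketch along these lines may help. Each `print` emits its own output line, so the loop only gives side-by-side pairs when every pair goes to a separate file. The file names `pair1.csv`, `pair2.csv`, ... and the scratch directory are assumptions for illustration, and the output name is built in a variable first to avoid the ambiguity of concatenating directly after `>`.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
cd "$tmp" || exit 1

# First three rows of the sample from the question.
cat > testfile.csv <<'EOF'
1,2,4,5,6,7,8,9
a,b,c,d,e,f,g,h
q,w,e,r,t,y,u,i
EOF

awk 'BEGIN { FS = OFS = "," }
{
    for (i = 1; i < NF; i += 2) {
        n = (i + 1) / 2              # 1 for columns 1-2, 2 for columns 3-4, ...
        out = "pair" n ".csv"        # hypothetical naming scheme
        print $i, $(i + 1) > out
    }
}' testfile.csv

cat pair1.csv pair2.csv
```

With eight columns this writes four files; awk keeps each output file open for the whole run, so rows are appended in input order.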
It's do-able in bash, but it will be much slower than awk:
f=testfile.csv
IFS=, read -ra first < <(head -1 "$f")
for ((i = 0; i < (${#first[@]} + 1) / 2; i++)); do
slice_file="${f%.csv}$((i+1)).csv"
cut -d, -f"$((2 * i + 1))-$((2 * (i + 1)))" "$f" > "$slice_file"
done
with sed:
sed -r '
h
s/(.,.),.*/\1/w file1.txt
g
s/.,.,(.,.),.*/\1/w file2.txt' file.txt
I have a file that looks like
01/11/2015;998978000000;4890********3290;5735;ITUNES.COM/BILL;LU;Cross_border_rub;4065;17;915;INSUFF FUNDS;51;0;
There are 13 semicolon separated columns.
I'm trying to calculate a value from column 9 for every line:
awk -F ';' -vOFS=';' '{ gsub(",", ".", $9); print }' file |
awk -F ';' '$0 = NR-1";"$0' |
awk -F ';' -vOFS=';' '{bar[$1]=$1;a[$1]=$2;b[$1]=$3;c[$1]=$4;d[$1]=$5;e[$1]=$6;f[$1]=$7;g[$1]=$8;h[$1]=$9;k[$1]=$10;l[$1]=$11;l[$1]=$12;m[$1]=$13;p[$1]=$14;};
if($7="International") {income=0.0162*h[i]+0.0425*h[i]};
else if($7="Domestic") {income=0.0188*h[i]};
else if($7="Cross_border_rub") {income=0.0162*h[i]+0.025*h[i]}
END{for(i in bar) print income";"a[i],b[i],c[i],d[i],e[i],f[i],g[i],h[i],k[i],l[i],m[i],p[i]}'
How exactly do multiple if statements correctly work in awk?
awk to the rescue!
You don't need multiple awk invocations; they can be consolidated into one:
$ awk -F';' -v OFS=';' '{gsub(",", ".", $9)}
$7=="International" {income=(0.0162+0.0425)*$9}
$7=="Domestic" {income=0.0188*$9}
$7=="Cross_border_rub" {income=(0.0162+0.025)*$9}
# note: for any other value of $7, income keeps the value from the previous line
{print income, NR-1, $0}' file
Test with your file, since you didn't provide enough sample data to test.
Perhaps it is better to just assign the rate:
$ awk -F';' -v OFS=';' '{gsub(",", ".", $9); rate=0}
$7=="International" {rate=0.0162+0.0425}
$7=="Domestic" {rate=0.0188}
$7=="Cross_border_rub" {rate=0.0162+0.025}
{print rate*$9, NR-1, $0}' file
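On the literal question of chained if statements: in awk they have to sit inside an action block `{ ... }`, and comparison uses `==`. A single `=` assigns, and the assigned non-empty string then counts as true, so the first branch would always be taken. A minimal sketch using the rates from the question follows; the two-field `type;amount` input and the variable name `result` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical input: type;amount
result=$(printf '%s\n' 'International;100' 'Domestic;100' 'Cross_border_rub;100' 'Other;100' |
awk -F';' -v OFS=';' '{
    if ($1 == "International")         rate = 0.0162 + 0.0425
    else if ($1 == "Domestic")         rate = 0.0188
    else if ($1 == "Cross_border_rub") rate = 0.0162 + 0.025
    else                               rate = 0      # unknown type: no income
    print rate * $2, $0
}')
printf '%s\n' "$result"
```

The final `else` resets the rate on every line, so a value from an earlier line cannot leak into a line with an unknown type.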
I am working on a script in which a file [ file.txt ] has many columns, like
abc|pqr|lmn|123
pqr|xzy|321|azy
lee|cha| |325
xyz| |abc|123
I would like to get a column's values in a bash script using awk: if the column is empty it should print blank, otherwise it should print the column value.
I have tried the possibilities below, but they are not working:
cat file.txt | awk -F "|" {'print $2'} | sed -e 's/^$/blank/' // Using awk and sed
cat file.txt | awk -F "|" '!$2 {print "blank"} '
cat file.txt | awk -F "|" '{if ($2 =="" ) print "blank" } '
Please let me know how we can do that using awk or any other bash tools.
Thanks
I think what you're looking for is
awk -F '|' '{print match($2, /[^ ]/) ? $2 : "blank"}' file.txt
match(str, regex) returns the position in str of the first match of regex, or 0 if there is no match. So in this case, it will return a non-zero value if there is some non-blank character in field 2. Note that in awk, the index of the first character in a string is 1, not 0.
Here, I'm assuming that you're interested only in a single column.
If you wanted to be able to specify the replacement string from a bash variable, the best solution would be to pass the bash variable into the awk program using the -v switch:
awk -F '|' -v blank="$replacement" \
'{print match($2, /[^ ]/) ? $2 : blank}' file.txt
This mechanism avoids problems with escaping metacharacters.
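As a quick check of the behaviour described above, this sketch runs the same test on the question's sample data through a here-document. The `replacement` value and the variable name `col2` are assumptions for illustration.

```shell
#!/bin/sh
replacement="blank"   # hypothetical replacement text
col2=$(awk -F '|' -v blank="$replacement" \
    '{ print (match($2, /[^ ]/) ? $2 : blank) }' <<'EOF'
abc|pqr|lmn|123
pqr|xzy|321|azy
lee|cha| |325
xyz| |abc|123
EOF
)
printf '%s\n' "$col2"
```

Only the last line has a second field consisting solely of a space, so it is the only one replaced.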
You can do it using this sed script:
sed -r 's/\| +\|/\|blank\|/g' File
abc|pqr|lmn|123
pqr|xzy|321|azy
lee|cha|blank|325
xyz|blank|abc|123
If you don't want the |:
sed -r 's/\| +\|/\|blank\|/g; s/\|/ /g' File
abc pqr lmn 123
pqr xzy 321 azy
lee cha blank 325
xyz blank abc 123
Else with awk:
awk '{gsub(/\| +\|/,"|blank|")}1' File
abc|pqr|lmn|123
pqr|xzy|321|azy
lee|cha|blank|325
xyz|blank|abc|123
You can use awk like this:
awk 'BEGIN{FS=OFS="|"} {for (i=1; i<=NF; i++) if ($i ~ /^ *$/) $i="blank"} 1' file
abc|pqr|lmn|123
pqr|xzy|321|azy
lee|cha|blank|325
xyz|blank|abc|123
Here is my code that I want to use to separate 3 columns from hist.txt into 2 separate files, hist1.dat with first and second column and hist2.dat with first and third column. The columns in hist.txt may be separated with more than one space. I want to save in histogram1.dat and histogram2.dat the first n lines until the last nonzero value.
The script creates histogram1.dat correctly, but histogram2.dat contains all the lines from hist2.dat.
hist.txt is like :
http://pastebin.com/JqgSKZrP
#!/bin/bash
sed 's/\t/ /g' hist.txt | awk '{print $1 " " $2;}' > hist1.dat
sed 's/\t/ /g' hist.txt | awk '{print $1 " " $3;}' > hist2.dat
head -n $( awk 'BEGIN {last=1}; {if($2!=0) last=NR};END {print last}' hist1.dat) hist1.dat > histogram1.dat
head -n $( awk 'BEGIN {last=1}; {if($2!=0) last=NR};END {print last}' hist2.dat) hist2.dat > histogram2.dat
What is the cause of this problem? Might it be due to some special restriction with head?
Thanks.
For your first histogram, try
awk '$2 ~ /000000/{exit}{print $1, $2}' hist.txt
and for your second:
awk '$3 ~ /000000/{exit}{print $1, $3}' hist.txt
Hope I understood you correctly...
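If the goal is to keep everything up to the last nonzero value (rather than stopping at the first zero), a two-pass awk sketch is one option. The file is read twice: the first pass records the line number of the last nonzero entry in the chosen column, and the second pass prints up to that line. Since the pastebin data is not reproduced here, the sample input below is invented, and stripping a trailing carriage return is only a guess at why the last column might never compare equal to zero. The helper name `trim` is also hypothetical.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Invented sample: bin, count1, count2, written with CRLF line endings.
printf '0  3 0\r\n1  5 2\r\n2  0 4\r\n3  0 0\r\n4  0 0\r\n' > "$tmp/hist.txt"

trim() {   # usage: trim COLUMN FILE
    awk -v col="$1" '
        { sub(/\r$/, "") }                                  # drop a trailing CR, if any
        NR == FNR { if ($col + 0 != 0) last = FNR; next }   # pass 1: last nonzero line
        FNR <= last { print $1, $col }                      # pass 2: print up to that line
    ' "$2" "$2"
}

histogram1=$(trim 2 "$tmp/hist.txt")
histogram2=$(trim 3 "$tmp/hist.txt")
printf '%s\n---\n%s\n' "$histogram1" "$histogram2"
```

Adding `+ 0` forces a numeric comparison, so a value such as `0.000000` is treated as zero regardless of how it is written.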