Print the entire row that has a difference in value when comparing the columns - bash

I want to print the entire row whose values don't match.
E.g.:
Symbol Qty Symbol Qty Symbol qty
a 10 a 10 a 11
b 11 b 11 b 11
c 12 c 12 f 13
f 12 f 12 g 13
OUTPUT :
a 10 a 10 a 11
c 12 c 12 (empty Space)
f 12 f 12 f 13
empty space {ES} g 13
awk 'FNR==NR{a[$0];next}!($0 in a ) ' output1.csv output2.csv >> finn1.csv
awk 'FNR==NR{a[$0];next}!($0 in a ) ' finn1.csv output4.csv >> finn.csv
but this prints only the mismatched rows from one file, like a 11; I require the whole combined line.

Assuming that you only want to test for mismatched Qty fields, try this:
#!/bin/bash
declare input_file="/path/to/input_file"
declare -i header_flag=0 a b c

while read line; do
    [ ${header_flag} -eq 0 ] && header_flag=1 && continue  # Ignore first line.
    [ ${#line} -eq 0 ] && continue                         # Ignore blank lines.
    read x a x b x c x <<< ${line}  # Reuse ${x} because it is not used.
    [ ${a} -ne ${b} -o ${a} -ne ${c} -o ${b} -ne ${c} ] && echo ${line}
done < ${input_file}
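For a quick sanity check, the same comparison can be run inline against the sample data (using set -- instead of the here-string so it also runs under plain sh; only the Qty fields are compared, as above):

```shell
# Sample data: header plus three rows (Qtys differ in rows 1 and 3).
printf '%s\n' 'Symbol Qty Symbol Qty Symbol qty' \
    'a 10 a 10 a 11' 'b 11 b 11 b 11' 'c 12 c 12 f 13' > data.txt

header_flag=0
while read line; do
    [ ${header_flag} -eq 0 ] && header_flag=1 && continue   # skip header
    [ ${#line} -eq 0 ] && continue                          # skip blank lines
    set -- ${line}                                          # split into $1..$6
    [ $2 -ne $4 -o $2 -ne $6 -o $4 -ne $6 ] && echo "${line}"
done < data.txt > mismatches.txt
cat mismatches.txt
```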

The awk one-liner
awk '!($1 == $3 && $2 == $4 && $3 == $5 && $4 == $6)' file
will output
Symbol Qty Symbol Qty Symbol qty
a 10 a 10 a 11
c 12 c 12 f 13
f 12 f 12 g 13
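If the header row should not appear in the output, an NR > 1 guard drops it; a minimal sketch against the sample data:

```shell
# Recreate the sample input, then print only mismatched data rows.
printf '%s\n' 'Symbol Qty Symbol Qty Symbol qty' \
    'a 10 a 10 a 11' 'b 11 b 11 b 11' \
    'c 12 c 12 f 13' 'f 12 f 12 g 13' > file
awk 'NR > 1 && !($1 == $3 && $2 == $4 && $3 == $5 && $4 == $6)' file > diffs.txt
cat diffs.txt
```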
You're going about this the wrong way: you can't mash all the files up into one and then try to find which ones have different/missing values. You need to process the individual files:
$ cat file1
Symbol Qty
a 10
b 11
c 12
f 12
$ cat file2
Symbol Qty
a 10
b 11
c 12
f 12
$ cat file3
Symbol qty
a 11
b 11
f 13
g 13
Then, assuming you have GNU awk (its arrays of arrays are needed here):
gawk '
FNR > 1 { qty[$1][FILENAME] = $1 " " $2 }
END {
    OFS = "\t"
    for (sym in qty) {
        missing = !((ARGV[1] in qty[sym]) && (ARGV[2] in qty[sym]) && (ARGV[3] in qty[sym]))
        unequal = !(qty[sym][ARGV[1]] == qty[sym][ARGV[2]] && qty[sym][ARGV[1]] == qty[sym][ARGV[3]])
        if (missing || unequal) {
            print qty[sym][ARGV[1]], qty[sym][ARGV[2]], qty[sym][ARGV[3]]
        }
    }
}
' file{1,2,3}
outputs
a 10 a 10 a 11
c 12 c 12
f 12 f 12 f 13
g 13
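gawk's arrays of arrays require gawk 4.0 or later; for other awks, the same bookkeeping works with a composite sym SUBSEP file key. A portable sketch with the sample files (output order from for (sym in seen) is unspecified, hence the sort):

```shell
# Recreate the three sample files.
printf 'Symbol Qty\na 10\nb 11\nc 12\nf 12\n' > file1
printf 'Symbol Qty\na 10\nb 11\nc 12\nf 12\n' > file2
printf 'Symbol qty\na 11\nb 11\nf 13\ng 13\n' > file3

# Composite-key variant: qty[sym, file] instead of qty[sym][file].
awk '
FNR > 1 { qty[$1, FILENAME] = $1 " " $2; seen[$1] = 1 }
END {
    OFS = "\t"
    for (sym in seen) {
        v1 = qty[sym, ARGV[1]]; v2 = qty[sym, ARGV[2]]; v3 = qty[sym, ARGV[3]]
        if (v1 == "" || v2 == "" || v3 == "" || v1 != v2 || v1 != v3)
            print v1, v2, v3
    }
}' file1 file2 file3 | LC_ALL=C sort > result.txt
cat result.txt
```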

Related

Print variable inside awk while calculating variable name

I have a script that looks like the example below. I have a letter offset, and I need to print the letter at that offset; I am not sure how to read that letter using ksh.
The expected answer would be LETTER_OFFSET(1)=a, LETTER_OFFSET(2)=v, LETTER_OFFSET(3)=c, etc. The offset is calculated inside a loop.
#!/bin/ksh
# 1 2 3 4 5 6 7 8 9 10 11 12
LETTERS=" a v c d g r g s s a g f"
LETTER_OFFSET="3";
Letter=$(echo $LETTERS | awk '{print $((1 * $$LETTER_OFFSET )) }')
Pass your offset into your awk script as an awk variable using the -v flag:
LETTER=$(echo $LETTERS | awk -v offset=$LETTER_OFFSET '{print $offset}')
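For example, with the sample letters, offset 3 picks out the third field:

```shell
# -v makes the shell value available inside awk as the variable "offset".
LETTERS=" a v c d g r g s s a g f"
LETTER_OFFSET=3
LETTER=$(echo $LETTERS | awk -v offset="$LETTER_OFFSET" '{print $offset}')
echo "$LETTER"
```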
You don't need to invoke awk in every iteration. You can populate an array with your letters and then access its values by index:
#!/bin/ksh
# 1 2 3 4 5 6 7 8 9 10 11 12
letters=" a v c d g r g s s a g f"
# populate an array
arr=($letters)
offset=1
while [ "$offset" -le 12 ]; do
    echo "${arr[$offset-1]}"
    let offset++
done
Output:
a
v
c
d
g
r
g
s
s
a
g
f

Awk flag to remove unwanted data

Another awk question. I have a large text file that is separated by numerical values:
43 47
abc
efg
hig
21 122
hijk
lmnop
39 41
somemore
texthere
What I would like to do is print the text only if a condition is satisfied.
Here's what I have tried, with no luck:
awk '{a=$1; b=$2; if (a < 43 && a > 37 && b < 52 && b > 41) {f=1} elif (a > 43 && a < 37 && b > 52 && b < 41) {print; f=0} } f' file
I'd like to print all of the text if the statement is satisfied, and skip the text if it isn't.
Desired output from above:
desired output from above
43 47
abc
efg
hig
39 41
somemore
texthere
awk '
# on a line with 2 numbers:
NF == 2 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ {
    # set a flag if the numbers fall in the given ranges
    f = (37 <= $1 && $1 <= 43 && 41 <= $2 && $2 <= 52)
}
f
' file
Self-explaining solution:
awk '
function inrange(x, a, b) { return a <= x && x <= b }
/^[0-9]+[\t ]+[0-9]/ {
    f = inrange($1, 37, 43) && inrange($2, 41, 52)
}
f
'
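Both versions behave the same on the sample input; a quick check, assuming the data is in a file named file:

```shell
# Recreate the sample blocks, one line per argument.
printf '%s\n' '43 47' abc efg hig '21 122' hijk lmnop '39 41' somemore texthere > file
# Keep each block only when its header numbers fall in 37-43 and 41-52.
awk '
function inrange(x, a, b) { return a <= x && x <= b }
/^[0-9]+[\t ]+[0-9]/ {
    f = inrange($1, 37, 43) && inrange($2, 41, 52)
}
f
' file > kept.txt
cat kept.txt
```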

how to use awk to merge files with common fields and print in another file

I have read all the related questions, but I'm still quite confused...
I have two files tab separated.
file1 (breaks added for readability):
a 15 bac
g 10 bac
h 11 bac
r 33 arq
t 12 euk
file2 (breaks added for readability):
0 15 h 3 5 2 gf a a g e g s s g g
p 33 g 4 5 2 hg 3 1 3 f 5 h 5 h 6
g 4 r 8 j 9 jk 9 j 9 9 h t 9 k 0
Output desired (breaks added for readability):
bac 15 h 3 5 2 gf a a g e g s s g g
arq 33 g 4 5 2 hg 3 1 3 f 5 h 5 h 6
ND g 4 r 8 j 9 jk 9 j 9 9 h t 9 k 0
Just that. I need to print the complete file2, but replacing the first column with the third column of file1 whenever $2 of file2 matches $2 of file1.
file1 is larger than file2, but it could still happen that $2 from file2 is not present in file1; in that case, print ND in the first column.
I'm sure it must be simple, but I'm having problems with awk managing two files. Please, if someone could help me...
Using this awk command:
awk 'FNR==NR{a[$2]=$3;next} {$1=(a[$2])?a[$2]:"ND"} 1' file1 file2
bac 15 h 3 5 2 gf a a g e g s s g g
arq 33 g 4 5 2 hg 3 1 3 f 5 h 5 h 6
ND 4 r 8 j 9 jk 9 j 9 9 h t 9 k 0
Explanation:
FNR==NR - Execute this block for first file in input i.e. file1
a[$2]=$3 - Populate an associative array a with key as $2 and value as $3 from file1
next - Read next line until EOF on first file
Now operating in file2
$1=(a[$2])?a[$2]:"ND" - Overwrite $1 with a[$2] if $2 is found in array a, otherwise with the literal string "ND"
1 - print the output
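One caveat: the ternary test (a[$2]) is false when the stored value is empty or "0", and referencing a[$2] also creates an empty element for unseen keys; the in operator is the safer membership test. A sketch with the sample data (space-separated here for brevity; add -F'\t' -v OFS='\t' for strict tab handling):

```shell
# Recreate the two sample files.
printf '%s\n' 'a 15 bac' 'g 10 bac' 'h 11 bac' 'r 33 arq' 't 12 euk' > file1
printf '%s\n' '0 15 h 3 5 2 gf a a g e g s s g g' \
    'p 33 g 4 5 2 hg 3 1 3 f 5 h 5 h 6' \
    'g 4 r 8 j 9 jk 9 j 9 9 h t 9 k 0' > file2
# Same lookup as above, but with the `in` membership test.
awk 'FNR==NR{a[$2]=$3; next} {$1=($2 in a)?a[$2]:"ND"} 1' file1 file2 > merged.txt
cat merged.txt
```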
You could try a join + awk command as below:
join -t ' ' -a2 -1 2 -2 2 test1.txt test2.txt |
awk 'BEGIN { start = 5; end = 18 }
{
    if (NF == 16) {
        temp = $1; $1 = "ND " $2; $2 = temp; print
    } else {
        printf("%s %s ", $3, $1)
        for (i = start; i <= end; i++) printf("%s ", $i)
        printf("\n")
    }
}'

Shell script to find common values and write in particular pattern with subtraction math to range pattern

Shell script to get the common values in two files and write them in a pattern to a new file, with the first value of each range reduced by 1.
$ cat file1
2
3
4
6
7
8
10
12
13
16
20
21
22
23
27
30
$ cat file2
2
3
4
8
10
12
13
16
20
21
22
23
27
Script that works:
awk 'NR==FNR{x[$1]=1} NR!=FNR && x[$1]' file1 file2 |
sort -n |
awk 'NR==1 {s=l=$1; next}
     $1!=l+1 {if (l == s) print l; else print s ":" l; s=$1}
     {l=$1}
     END {if (l == s) print l; else print s ":" l}'
Script out:
2:4
8
10
12:13
16
20:23
27
Desired output:
1:4
8
10
11:13
16
19:23
27
Similar to sputnick's, except using comm to find the intersection of the file contents.
comm -12 <(sort file1) <(sort file2) |
sort -n |
awk '
function print_range() {
    if (start != prev)
        printf "%d:", start-1
    print prev
}
FNR==1 {start=prev=$1; next}
$1 > prev+1 {print_range(); start=$1}
{prev=$1}
END {print_range()}
'
1:4
8
10
11:13
16
19:23
27
Try doing this (note the sort -n; a plain lexical sort would misorder the two-digit values):
awk 'NR==FNR{x[$1]=1} NR!=FNR && x[$1]' file1 file2 |
sort -n |
awk 'NR==1 {s=l=$1; next}
     $1!=l+1 {if (l == s) print l; else print s-1 ":" l; s=$1}
     {l=$1}
     END {if (l == s) print l; else print s-1 ":" l}'
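Run end to end against the sample files, the pipeline (intersection, numeric sort, then range collapsing with the start reduced by 1) yields the desired output:

```shell
# Recreate the sample files.
printf '%s\n' 2 3 4 6 7 8 10 12 13 16 20 21 22 23 27 30 > file1
printf '%s\n' 2 3 4 8 10 12 13 16 20 21 22 23 27 > file2
# Intersect, sort numerically, and collapse runs into start-1:end ranges.
awk 'NR==FNR{x[$1]=1} NR!=FNR && x[$1]' file1 file2 |
sort -n |
awk 'NR==1 {s=l=$1; next}
     $1!=l+1 {if (l == s) print l; else print s-1 ":" l; s=$1}
     {l=$1}
     END {if (l == s) print l; else print s-1 ":" l}' > ranges.txt
cat ranges.txt
```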

AWK -- How to do selective multiple column sorting?

In awk, how can I do this:
Input:
1 a f 1 12 v
2 b g 2 10 w
3 c h 3 19 x
4 d i 4 15 y
5 e j 5 11 z
Desired output, by sorting numerical value at $5:
1 a f 2 10 w
2 b g 5 11 z
3 c h 1 12 v
4 d i 4 15 y
5 e j 3 19 x
Note that the sorting should only affect $4, $5, and $6 (based on the value of $5); the earlier part of the table remains intact.
This could be done in multiple steps with the help of paste:
$ gawk '{print $1, $2, $3}' in.txt > a.txt
$ gawk '{print $4, $5, $6}' in.txt | sort -k 2 -n > b.txt
$ paste -d' ' a.txt b.txt
1 a f 2 10 w
2 b g 5 11 z
3 c h 1 12 v
4 d i 4 15 y
5 e j 3 19 x
Personally, I find using awk to safely sort arrays of columns rather tricky because often you will need to hold and sort on duplicate keys. If you need to selectively sort a group of columns, I would call paste for some assistance:
paste -d ' ' <(awk '{ print $1, $2, $3 }' file.txt) <(awk '{ print $4, $5, $6 | "sort -k 2" }' file.txt)
Results:
1 a f 2 10 w
2 b g 5 11 z
3 c h 1 12 v
4 d i 4 15 y
5 e j 3 19 x
This can be done in pure awk, but as @steve said, it's not ideal. gawk has limited sort functions, and awk has no built-in sort at all. That said, here's a (rather hackish) solution using a compare function in gawk:
[ghoti@pc ~/tmp3]$ cat text
1 a f 1 12 v
2 b g 2 10 w
3 c h 3 19 x
4 d i 4 15 y
5 e j 5 11 z
[ghoti@pc ~/tmp3]$ cat doit.gawk
### Function to be called by asort().
function cmp(i1,v1,i2,v2) {
    split(v1,a1); split(v2,a2);
    if (a1[2]>a2[2]) { return 1; }
    else if (a1[2]<a2[2]) { return -1; }
    else { return 0; }
}
### Left-hand side and right-hand side are sorted differently.
{
    lhs[NR]=sprintf("%s %s %s",$1,$2,$3);
    rhs[NR]=sprintf("%s %s %s",$4,$5,$6);
}
END {
    asort(rhs,sorted,"cmp");   ### This calls the function we defined, above.
    for (i=1;i<=NR;i++) {      ### Step through the arrays and reassemble.
        printf("%s %s\n",lhs[i],sorted[i]);
    }
}
[ghoti@pc ~/tmp3]$ gawk -f doit.gawk text
1 a f 2 10 w
2 b g 5 11 z
3 c h 1 12 v
4 d i 4 15 y
5 e j 3 19 x
[ghoti@pc ~/tmp3]$
This keeps your entire input file in arrays, so that lines can be reassembled after the sort. If your input is millions of lines, this may be problematic.
Note that you might want to play with the printf and sprintf functions to set appropriate output field separators.
You can find documentation on using asort() with functions in the gawk man page; look for PROCINFO["sorted_in"].
