UNIX group by two values - bash

I have a file with the following lines (values are separated by ";"):
dev_name;dev_type;soft
name1;ASR1;11.1
name2;ASR1;12.2
name3;ASR1;11.1
name4;ASR3;15.1
I know how to group them by one value, e.g. counting all ASRx entries, but how can I group by two values, for example:
ASR1
*11.1 - 2
*12.2 - 1
ASR3
*15.1 - 1

another awk
$ awk -F';' 'NR>1 {a[$2]; b[$3]; c[$2,$3]++}
             END {for (k in a) {
                      print k
                      for (p in b)
                          if (c[k,p]) print "\t*" p, "-", c[k,p]
                  }}' file
ASR1
*11.1 - 2
*12.2 - 1
ASR3
*15.1 - 1

$ cat tst.awk
BEGIN { FS=";"; OFS=" - " }
NR==1 { next }
$2 != prev { prt(); prev=$2 }
{ cnt[$3]++ }
END { prt() }
function prt(   soft) {
    if ( prev != "" ) {
        print prev
        for (soft in cnt) {
            print " *" soft, cnt[soft]
        }
        delete cnt
    }
}
$ awk -f tst.awk file
ASR1
*11.1 - 2
*12.2 - 1
ASR3
*15.1 - 1
Or if you like pipes....
$ tail -n +2 file | cut -d';' -f2- | sort | uniq -c |
  awk -F'[ ;]+' '{print ($3!=prev ? $3 ORS : "") " *" $4 " - " $2; prev=$3}'
ASR1
*11.1 - 2
*12.2 - 1
ASR3
*15.1 - 1
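
For reference, the intermediate stream feeding that final awk (after tail, cut, sort and uniq -c) should look roughly like this; the [ ;]+ field separator is what picks it apart:

$ tail -n +2 file | cut -d';' -f2- | sort | uniq -c
      2 ASR1;11.1
      1 ASR1;12.2
      1 ASR3;15.1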

try something like
awk -F ';' '
    NR==1 { next }
    { aRaw[$2 "-" $3]++ }
    END {
        n = asorti( aRaw, aVal )
        for ( Val = 1; Val <= n; Val++ ) {
            split( aVal[Val], aTmp, /-/ )
            if ( aTmp[1] != Last ) { Last = aTmp[1]; print Last }
            print "   " aTmp[2] " " aRaw[ aVal[Val] ]
        }
    }
' YourFile
The key here is to use two fields as the array key. The END block, which has to present the grouped values, is more involved than the counting itself.
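
For illustration, a minimal sketch of the composite-key idiom the awk answers rely on: you can join the two fields with your own separator (aRaw[$2"-"$3] above) or use awk's built-in SUBSEP by writing c[$2,$3], splitting the key back apart when printing:

$ awk 'BEGIN { c["ASR1", "11.1"] = 2     # stored under "ASR1" SUBSEP "11.1"
               for (k in c) { split(k, p, SUBSEP); print p[1], p[2], c[k] } }'
ASR1 11.1 2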

Using Perl
$ cat bykub.txt
dev_name;dev_type;soft
name1;ASR1;11.1
name2;ASR1;12.2
name3;ASR1;11.1
name4;ASR3;15.1
$ perl -F";" -lane ' $kv{$F[1]}{$F[2]}++ if $.>1;END { while(($x,$y) = each(%kv)) { print $x;while(($p,$q) = each(%$y)){ print "\t\*$p - $q" }}}' bykub.txt
ASR1
*11.1 - 2
*12.2 - 1
ASR3
*15.1 - 1
$

Yet Another Solution, this one using the always useful GNU datamash to count the groups:
$ datamash -t ';' --header-in -sg 2,3 count 3 < input.txt |
awk -F';' '$1 != curr { curr = $1; print $1 } { print "\t*" $2 " - " $3 }'
ASR1
*11.1 - 2
*12.2 - 1
ASR3
*15.1 - 1
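
If it helps to see the intermediate step, the datamash stage on its own should emit one group;group;count record per pair, roughly:

$ datamash -t ';' --header-in -sg 2,3 count 3 < input.txt
ASR1;11.1;2
ASR1;12.2;1
ASR3;15.1;1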

I don't want to encourage lazy questions, but I wrote a solution, and I'm sure someone can point out improvements. I love posting answers on this site because I learn so much. :)
One binary subcall to sort, otherwise all built-in processing. That means using read, which is slow. If your file is large, I'd recommend rewriting the loop in awk or perl (a sketch follows the script below), but this will get the job done.
sed 1d groups |                        # strip the header
sort -t';' -k2,3 > group.srt           # pre-sort to collect groupings

declare -i ctr=0                       # initialize integer record counter
IFS=';' read x lastA lastB < group.srt # priming read for comparators
printf "$lastA\n\t*$lastB - "          # priming print (assumes at least one record)

while IFS=';' read x a b               # loop through the file
do  if [[ "$lastA" < "$a" ]]           # on every MAJOR change
    then printf "$ctr\n$a\n\t*$b - "   # print total, new MAJOR header and MINOR header
         lastA="$a"                    # update the MAJOR comparator
         lastB="$b"                    # update the MINOR comparator
         ctr=1                         # reset the counter
    elif [[ "$lastB" < "$b" ]]         # on every MINOR change
    then printf "$ctr\n\t*$b - "       # print total and new MINOR header
         lastB="$b"                    # update the MINOR comparator
         ctr=1                         # reset the counter
    else (( ctr++ ))                   # otherwise increment
    fi
done < group.srt                       # feed read from sorted file
printf "$ctr\n"                        # print final group total at EOF

Related

How to calculate the mean of row from csv file from nth column?

This may look like a duplicate, but I could not solve the issue I'm having.
I'm trying to find the average of each row of a CSV/TSV file, starting from the nth column; the data looks like below:
input.tsv
ID source random text val1 val2 val3 val4 val330
1 atttt eeeee test 0.9 0.5 0.2 0.54 0.89
2 afdg adfgrg tf 0.6 0.23 0.5 0.4 0.29
output.tsv
ID source random text Avg
1 atttt eeeee test 0.606
2 afdg adfgrg tf 0.404
or at least
ID Avg
1 0.606
2 0.404
I tried a suggestion from here
awk 'NR==1{next}
{printf("%s\t", $1
printf("%.2f\n", ($5 + $6 + $7)/3}' input.tsv
which threw an error.
and
awk '{ s = 4; for (i = 5; i <= NF; i++) s += $i; print $1, (NF > 1) ? s / (NF - 1) : 0; }' input.tsv
the below code also threw a syntax error
for i in `cat input.tsv` do; VALUES=`echo $i | tr '\t' '\t'`;COUNT=0;SUM=0;typeset -i j;IFS=' ';for j in $VALUES; do;SUM=`expr $SUM + $j`;COUNT=`expr $COUNT + 1`;done;AVG=`expr $SUM / $COUNT`;echo $AVG;done
Help me resolve the issue and calculate the average of each row.
From your code reference:
awk 'NR==1{next}
   {
   # missing the closing ")"; this printed only the 1st column
   #printf("%s\t", $1
   printf("%s\t", $1 )

   # missing the closing ")", and averaging only 3 columns
   #printf("%.2f\n", ($5 + $6 + $7)/3
   printf("%.2f\n", ($5 + $6 + $7 + $8 + $9) / 5 )
   }' input.tsv
Your second snippet is not easy to work with: lots of subshells (backticks) and a shell loop, and above all it assumes integer values and works across the full line of values (not just fields 5 to 9). Forget it unless you want to avoid awk in this case.
for fun
awk 'NR==1{
   # Header
   print $0 OFS "Avg"
   Count = NF - 4
   next
   }
   {
   # print each element of the line and sum the fields after column 4
   Avg = 0
   for( i=1; i<=NF; i++ ) {
      if( i >= 5 ) Avg += $i
      printf( "%s ", $i)
      }
   # print average
   printf( "%.2f\n", Avg/Count )
   }
   ' input.tsv
This assumes every data line carries the full set of value columns; if some lines are shorter, compute the divisor per line instead (not counting empty fields), as in the sketch below.
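
A hedged sketch of that per-line variant, dividing by the number of non-empty value fields on each line rather than fixing the count from the header:

awk 'NR==1 { print $0 OFS "Avg"; next }
     {
        n = Avg = 0
        for (i = 5; i <= NF; i++)
           if ($i != "") { Avg += $i; n++ }    # skip empty fields
        printf "%s %.2f\n", $0, (n ? Avg/n : 0)
     }' input.tsv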
You could use this awk script:
awk 'NR>1{
for(i=5;i<=NF;i++)
sum+=$i
}
{
print $1,$2,$3,$4,(NF>4&&sum!=""?sum/(NF-4):(NR==1?"Avg":""))
sum=0
}' file | column -t
The first block sums the value fields, starting from the 5th field.
The second block prints the first four fields followed by either the Avg header or the computed average.
column -t aligns the result in columns.
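Against the sample input.tsv, the expected (aligned) output is roughly:

ID  source  random  text  Avg
1   atttt   eeeee   test  0.606
2   afdg    adfgrg  tf    0.404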
This should work as expected:
awk 'BEGIN{OFS="\t"}
(NR==1){ print $1,$2,$3,$4,"Avg:"; next }
{ s=0; for(i=5;i<=NF;++i) s+=$i }
{ print $1,$2,$3,$4, (NF>4 ? s/(NF-4) : s) }' input.tsv
or, just for the fun of it, if you want to make the for-loop obfuscated (s=!(i=5) initializes i to 5 and s to 0 in a single expression):
awk 'BEGIN{OFS="\t"}
(NR==1){ print $1,$2,$3,$4,"Avg:"; next }
{ for(s=!(i=5);i<=NF;s+=$(i++)) {} }
{ print $1,$2,$3,$4, (NF>4 ? s/(NF-4) : s) }' input.tsv
$ cat tst.awk
NR == 1 { avg = "Avg" }
NR > 1 {
    sum = cnt = 0
    for (i=5; i<=NF; i++) {
        sum += $i
        cnt++
    }
    avg = (cnt ? sum / cnt : 0)
}
{ print $1, $2, $3, $4, avg }
$ awk -f tst.awk file
ID source random text Avg
1 atttt eeeee test 0.606
2 afdg adfgrg tf 0.404
Using Perl one-liner
> perl -lane '{ $s=0;foreach(@F[4..8]){$s+=$_} $F[4]=$s==0?"Avg":$s/5;print "$F[0]\t$F[1]\t$F[2]\t$F[3]\t$F[4]" } ' input.tsv
ID source random text Avg
1 atttt eeeee test 0.606
2 afdg adfgrg tf 0.404
>

Add x^2 to every "nonzero" coefficient with sed/awk

I have to write, as easy as possible, a script or command which has to use awk or/and sed.
Input file:
23 12 0 33
3 4 19
1st line n=3
2nd line n=2
Each line of the file is a string of numbers, the coefficients of a polynomial. To each coefficient we have to append x^n, where n is its power; the highest power equals the number of spaces between the numbers on the line (there is no space after the last number), and each following coefficient gets the next power down. If a coefficient is 0 we have to skip it.
So for that input we will have output like:
23x^3+12x^2+33
3x^2+4x+19
Please help me to write a short script solving that problem. Thank you so much for your time and all the help :)
My idea:
linescount=$(cat numbers|wc -l)
linecounter=1
While[linecounter<=linescount];
do
i=0
for i in spaces=$(cat numbers|sed 1p | sed " " )
do
sed -i 's/ /x^spaces/g'
i=($(i=i-1))
done
linecounter=($(linecounter=linecounter-1))
done
The following awk may help you with the same too.
awk '{for(i=1;i<=NF;i++){if($i!="" && $i){val=(val?val "+" $i:$i)(NF-i==0?"":(NF-i==1?"x":"x^"NF-i))} else {pointer++}};if(val){print val};val=""} pointer==NF{print;} {pointer=""}' Input_file
Adding a non-one-liner form of the solution here too.
awk '
{
for(i=1;i<=NF;i++){
if($i!="" && $i){
val=(val?val "+" $i:$i)(NF-i==0?"":(NF-i==1?"x":"x^"NF-i))}
else {
pointer++}};
if(val) {
print val};
val=""
}
pointer==NF {
print}
{
pointer=""
}
' Input_file
EDIT: Also adding an explanation here, for the OP's and everyone's better understanding.
awk '
{
for(i=1;i<=NF;i++){ ##Starting a for loop from variable 1 to till the value of NF here.
if($i!="" && $i){ ##checking if variable i value is NOT NULL then do following.
val=(val?val "+" $i:$i)(NF-i==0?"":(NF-i==1?"x":"x^"NF-i))} ##creating variable val here and putting conditions here if val is NULL then
##simply take value of that field else concatenate the value of val with its
##last value. Second condition is to check if last field of line is there then
##keep it like that else it is second last then print "x" along with it else keep
##that "x^" field_number-1 with it.
else { ##If a field is NULL in current line then come here.
pointer++}}; ##Increment the value of variable named pointer here with 1 each time it comes here.
if(val) { ##checking if variable named val is NOT NULL here then do following.
print val}; ##Print the value of variable val here.
val="" ##Nullifying the variable val here.
}
pointer==NF { ##checking condition if pointer value is same as NF then do following.
print} ##Print the current line then, seems whole line is having zeros in it.
{
pointer="" ##Nullifying the value of pointer here.
}
' Input_file ##Mentioning Input_file name here.
Offering a Perl solution since it has some higher-level constructs than bash that make the code a little simpler:
use strict;
use warnings;
use feature qw(say);
while (my $line = readline(*DATA)) {
    chomp($line);
    my $degree = () = $line =~ / /g;
    my @coefficients = split / /, $line;
    my @terms;
    while ($degree >= 0) {
        my $coefficient = shift @coefficients;
        next if $coefficient == 0;
        push @terms, $degree > 1
            ? "${coefficient}x^$degree"
            : $degree > 0
                ? "${coefficient}x"
                : $coefficient;
    }
    continue {
        $degree--;
    }
    say join '+', @terms;
}
__DATA__
23 12 0 33
3 4 19
Example output:
hunter@eros  ~  perl test.pl
23x^3+12x^2+33
3x^2+4x+19
Any information you want on any of the builtin functions used above (readline, chomp, push, shift, split, say, and join) can be found in perldoc via perldoc -f <function-name>.
$ cat a.awk
function print_term(i) {
    # Don't print zero terms:
    if (!$i) return;
    # Print a "+" unless this is the first term:
    if (!first) { printf " + " }
    # If it's the last term, just print the number:
    if (i == NF) printf "%d", $i
    # Leave the coefficient blank if it's 1:
    coef = ($i == 1 ? "" : $i)
    # If it's the penultimate term, just print an 'x' (not x^1):
    if (i == NF-1) printf "%sx", coef
    # Print a higher-order term:
    if (i < NF-1) printf "%sx^%s", coef, NF - i
    first = 0
}
{
    first = 1
    # print all the terms:
    for (i=1; i<=NF; ++i) print_term(i)
    # If we never printed any terms, print a "0";
    # either way this supplies the trailing newline:
    print first ? 0 : ""
}
Example input and output:
$ cat file
23 12 0 33
3 4 19
0 0 0
0 1 0 1
17
$ awk -f a.awk file
23x^3 + 12x^2 + 33
3x^2 + 4x + 19
0
x^2 + 1
17
$ cat ip.txt
23 12 0 33
3 4 19
5 3 0
34 01 02
$ # mapping each element except last to add x^n
$ # -a option will auto-split input on whitespace, content in @F array
$ # $#F will give index of last element (indexing starts at 0)
$ # $i>0 condition check to prevent x^0 for last element
$ perl -lane '$i=$#F; print join "+", map {$i>0 ? $_."x^".$i-- : $_} @F' ip.txt
23x^3+12x^2+0x^1+33
3x^2+4x^1+19
5x^2+3x^1+0
34x^2+01x^1+02
$ # with post processing
$ perl -lape '$i=$#F; $_ = join "+", map {$i>0 ? $_."x^".$i-- : $_} @F;
              s/\+0(x\^\d+)?\b|x\K\^1\b//g' ip.txt
23x^3+12x^2+33
3x^2+4x+19
5x^2+3x
34x^2+01x+02
One possibility is:
#!/usr/bin/env bash

line=1
linemax=$(grep -oEc '(( |^)[0-9]+)+' inputFile)
while [ $line -le $linemax ]; do
    degree=$(($(grep -oE ' +' - <<<$(grep -oE '(( |^)[0-9]+)+' inputFile | head -$line | tail -1) | cut -d : -f 1 | uniq -c)+1))
    coeffs=($(grep -oE '(( |^)[0-9]+)+' inputFile | head -$line | tail -1))
    i=0
    while [ $i -lt $degree ]; do
        if [ ${coeffs[$i]} -ne 0 ]; then
            if [ $(($degree-$i-1)) -gt 1 ]; then
                echo -n "${coeffs[$i]}x^$(($degree-$i-1))+"
            elif [ $(($degree-$i-1)) -eq 1 ]; then
                echo -n "${coeffs[$i]}x+"
            else
                echo -n "${coeffs[$i]}"
            fi
        fi
        ((i++))
    done
    echo
    ((line++))
done
The most important lines are:
# Gets degree of the equation
degree=$(($(grep -oE ' +' - <<<$(grep -oE '(( |^)[0-9]+)+' inputFile | head -$line | tail -1) | cut -d : -f 1 | uniq -c)+1))
# Saves coefficients in an array
coeffs=($(grep -oE '(( |^)[0-9]+)+' inputFile | head -$line | tail -1))
Here, grep -oE '(( |^)[0-9]+)+' finds lines containing only numbers (see edit). grep -oE ' +' - ........... |cut -d : -f 1 |uniq -c counts the number of coefficients per line, as explained in this question.
Edit: An improved regex for capturing lines with only numbers is
grep -E '(( |^)[0-9]+)+' inputfile | grep -v '[a-zA-Z]'
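As an aside, if a line is known to be whitespace-separated numbers only, the degree (the space count) can be read off much more simply; a sketch with awk, under that assumption:

$ awk '{ print NF - 1 }' inputFile
3
2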
sed -r "s/(.*) (.*) (.*) (.*)/\1x^3+\2x^2+\3x+\4/; \
        s/(.*) (.*) (.*)/\1x^2+\2x+\3/; \
        s/\+0x(\^.)?\+/+/g; \
        s/^0x\^.[+]//g; \
        s/\+0$//g;" koeffs.txt
Line 1: Handle 4 elements
Line 2: Handle 3 elements
Line 3: Handle 0 in the middle
Line 4: Handle 0 at the start
Line 5: Handle 0 at the end
Here is a more bashy, less sedy answer, which I think is more readable than the sed one:
#!/bin/bash
#
# term <pos> <len> <factor>, e.g.:
# 0 4 12 => 12x^3
# 2 4 12 => 12x
# 3 4 12 => 12
term () {
    p=$1
    leng=$2
    fac=$3
    pot=$((leng - 1 - p))
    case $pot in
        0) echo -n '+'${fac} ;;
        1) echo -n '+'${fac}x ;;
        *) echo -n '+'${fac}x^$pot ;;
    esac
}

handleArray () {
    # mapfile's -C callback is passed the array index as the first
    # argument, starting with 0 for the 1st line; get rid of it!
    shift
    coeffs=($*)
    # echo ${coeffs[@]}
    cnt=0
    len=${#coeffs[@]}
    while (( cnt < len ))
    do
        if [[ ${coeffs[$cnt]} != 0 ]]
        then
            term $cnt $len ${coeffs[$cnt]}
        fi
        ((cnt++))
    done
    echo # -e '\n' # extra line for dbg, together w. line 5 of the function.
}

mapfile -n 0 -c 1 -C handleArray < ./koeffs.txt coeffs | sed -r "s/^\++//;s/\++$//;"
The mapfile reads data and produces an array. See help mapfile for a brief syntax introduction.
We need some counting to know which power to raise x to. Along the way we drop the 0-terms.
In the end I use sed to remove leading and trailing plusses.
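
To make the callback mechanics visible, here is a tiny standalone demo of bash's mapfile -C (illustrative only, not part of the solution); note the leading index argument that handleArray shifts away:

$ mapfile -t -c 1 -C 'echo cb:' arr <<< $'10 2 1\n3 4'
cb: 0 10 2 1
cb: 1 3 4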
sh solution
while read line ; do
    set -- $line
    while test $1 ; do
        i=$(($#-1))
        case $1 in
            0) ;;
            *) case $i in
                   0) j="" ;;
                   1) j="x" ;;
                   *) j="x^$i" ;;
               esac
               result="$result$1$j+" ;;
        esac
        shift
    done
    echo "${result%+}"
    result=""
done < infile
$ cat tst.awk
{
    out = sep = ""
    for (i=1; i<=NF; i++) {
        if ($i != 0) {
            pwr = NF - i
            if ( pwr == 0 )      { sfx = "" }
            else if ( pwr == 1 ) { sfx = "x" }
            else                 { sfx = "x^" pwr }
            out = out sep $i sfx
            sep = "+"
        }
    }
    print out
}
$ awk -f tst.awk file
23x^3+12x^2+33
3x^2+4x+19
First, my test set:
$ cat file
23 12 0 33
3 4 19
0 1 2
2 1 0
Then the awk script:
$ awk 'BEGIN{OFS="+"}{for(i=1;i<=NF;i++)$i=$i (NF-i?"x^" NF-i:"");gsub(/(^|\+)0(x\^[0-9]+)?/,"");sub(/^\+/,"")}1' file
23x^3+12x^2+33
3x^2+4x^1+19
1x^1+2
2x^2+1x^1
And an explanation:
$ awk '
BEGIN {
    OFS="+"                         # separate with a + (negative values
}                                   # would be dealt with in gsub)
{
    for(i=1;i<=NF;i++)              # process all components
        $i=$i (NF-i?"x^" NF-i:"")   # add x and exponent
    gsub(/(^|\+)0(x\^[0-9]+)?/,"")  # clean 0s and leftover +s
    sub(/^\+/,"")                   # remove leading + if first component was 0
}1' file                            # output
This might work for you (GNU sed);)
sed -r ':a;/^\S+$/!bb;s/0x\^[^+]+\+//g;s/\^1\+/+/;s/\+0$//;b;:b;h;s/\S+$//;s/\S+\s+/a/g;s/^/cba/;:c;s/(.)(.)\2\2\2\2\2\2\2\2\2\2/\1\1\2/;tc;s/([a-z])\1\1\1\1\1\1\1\1\1/9/;s/([a-z])\1\1\1\1\1\1\1\1/8/;s/([a-z])\1\1\1\1\1\1\1/7/;s/([a-z])\1\1\1\1\1\1/6/;s/([a-z])\1\1\1\1\1/5/;s/([a-z])\1\1\1\1/4/;s/([a-z])\1\1\1/3/;s/([a-z])\1\1/2/;s/([a-z])\1/1/;s/[a-z]/0/g;s/^0+//;G;s/(.*)\n(\S+)\s+/\2x^\1+/;ba' file
This is not a serious solution!
It shows how sed can count; kudos goes to Greg Ubben, who wrote wc in sed back in 1989!

Calculation with in Bash Shell

How can I calculate the following data?
Input:
2 Printers
2 x 2 Cartridges
2 Router
1 Cartridge
Output:
Total Number of Printers: 2
Total Number of Cartridges: 5
Total Number of Router: 2
Please note that the Cartridges have been multiplied: (2 x 2) + 1 = 5. I tried the following, but I'm not sure how to get the number when I have a (2 x 2) type of scenario:
awk -F " " '{print $1}' Cartridges.txt >> Cartridges_count.txt
CartridgesCount=`( echo 0 ; sed 's/$/ +/' Cartridges_count.txt; echo p ) | dc`
echo "Total Number of Cartridges: $CartridgesCount"
Please advise.
This assumes that there are only multiplication operators in the data.
awk '{$NF = $NF "s"; sub("ss$", "s", $NF); qty = $1; for (i = 2; i < NF; i++) {if ($i ~ /^[[:digit:]]+$/) {qty *= $i}}; items[$NF] += qty} END {for (item in items) {print "Total number of", item ":", items[item]}}'
Broken out on multiple lines:
awk '{
    $NF = $NF "s";
    sub("ss$", "s", $NF);
    qty = $1;
    for (i = 2; i < NF; i++) {
        if ($i ~ /^[[:digit:]]+$/) {
            qty *= $i
        }
    };
    items[$NF] += qty
}
END {
    for (item in items) {
        print "Total number of", item ":", items[item]
    }
}'
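Fed the sample input, this should print something like the following (for-in hash order is not guaranteed; note the normalized plurals):

Total number of Printers: 2
Total number of Cartridges: 5
Total number of Routers: 2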
Try something like this (assuming well-formatted input) ...
sed -e 's| x | * |' -e 's|^\([ 0-9+*/-]*\)|echo $((\1)) |' YourFileName | sh | awk '{a[$2]+=$1;} END {for (var in a) print a[var] " "var;}'
P.S. Cartridges and Cartridge are different. If you want to take care of that too, it would be even more difficult but you can modify the last awk in the pipeline.
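To make the trick visible: after the sed stage, and before sh evaluates it, each line should have become a small shell command, roughly:

echo $((2 )) Printers
echo $((2 * 2 )) Cartridges
echo $((2 )) Router
echo $((1 )) Cartridge

sh then collapses each arithmetic expression to its value, and the final awk sums by item name.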

summing second columns of all files in bash

I have 1-N files in this format:
file 1:
1 1
2 5
3 0
4 0
5 0
file 2:
1 5
2 1
3 0
4 0
5 1
As an output, I want to sum all second columns of all files, so the output looks like this:
output:
1 6
2 6
3 0
4 0
5 1
Thanks a lot.
(Alternatively, it would be best for me to do this operation automatically on all files that share a name but start with a different number, e.g. 1A.txt, 2A.txt, 3A.txt as one output and 1AD.txt, 2AD.txt, 3AD.txt as the next output.)
Something like this should work:
cat *A.txt | awk '{sums[$1] += $2;} END { for (i in sums) print i " " sums[i]; }'
cat *AD.txt | awk '{sums[$1] += $2;} END { for (i in sums) print i " " sums[i]; }'
A quick summing solution can be done in awk:
{ sum[$1] += $2; }
END { for (i in sum) print i " " sum[i]; }
Grouping your input files is done easiest by building a list of suffixes and then globbing for them:
ls *.txt | sed -e 's/^[0-9]*//' | sort -u | while read suffix; do
    awk '{ sum[$1] += $2; } END { for (i in sum) print i " " sum[i]; }' *$suffix > ${suffix}.sum
done
#!/bin/bash

suffixes=$(find . -name '*.txt' | sed 's/.*[0-9][0-9]*\(.*\)\.txt/\1/' | sort -u)

for suffix in ${suffixes}; do
    paste *${suffix}.txt | awk '{sum = 0; for (i = 2; i <= NF; i += 2) sum += $i;
                                 print $1" "sum}' > ${suffix}.sums.txt
done

exit 0
Pure Bash:
declare -a sum
for file in *A.txt; do
    while read a b; do
        ((sum[a]+=b))
    done < "$file"
done

for idx in ${!sum[*]}; do   # iterate over existing indices
    echo "$idx ${sum[$idx]}"
done
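
With the two sample files saved as 1A.txt and 2A.txt (names assumed for illustration), this should print:

1 6
2 6
3 0
4 0
5 1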

How can I remove selected lines with an awk script?

I'm piping a program's output through some awk commands, and I'm almost where I need to be. The command thus far is:
myprogram | awk '/chk/ { if ( $12 > $13) printf("%s %d\n", $1, $12 - $13); else printf("%s %d\n", $1, $13 - $12) } ' | awk '!x[$0]++'
The last bit is a poor man's uniq, which isn't available on my target. Given suitable input, the command above produces output such as this:
GR_CB20-chk_2, 0
GR_CB20-chk_2, 3
GR_CB200-chk_2, 0
GR_CB200-chk_2, 1
GR_HB20-chk_2, 0
GR_HB20-chk_2, 6
GR_HB20-chk_2, 0
GR_HB200-chk_2, 0
GR_MID20-chk_2, 0
GR_MID20-chk_2, 3
GR_MID200-chk_2, 0
GR_MID200-chk_2, 2
What I'd like to have is this:
GR_CB20-chk_2, 3
GR_CB200-chk_2, 1
GR_HB20-chk_2, 6
GR_HB200-chk_2, 0
GR_MID20-chk_2, 3
GR_MID200-chk_2, 2
That is, I'd like to print only the line that has the maximum value for a given tag (the first 'field'). The above example is representative of the actual data in that the output will be sorted (as though it had been piped through a sort command).
Based on my answer to a similar need, this script keeps things in order and doesn't accumulate a big array. It prints the line with the highest value from each group.
#!/usr/bin/awk -f
{
    s = substr($0, 0, match($0, /,[^,]*$/))
    if (s != prevs) {
        if ( FNR > 1 ) print prevline
        prevval = $2
        prevline = $0
    }
    else if ( $2 > prevval ) {
        prevval = $2
        prevline = $0
    }
    prevs = s
}
END {
    print prevline
}
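Saved as, say, maxgroup.awk (name assumed for illustration) and dropped in as the replacement for the poor man's uniq stage (the first awk command from the question, abbreviated here), it should yield exactly the desired output on the sample data:

myprogram | awk '/chk/ { ... }' | awk -f maxgroup.awk
GR_CB20-chk_2, 3
GR_CB200-chk_2, 1
GR_HB20-chk_2, 6
GR_HB200-chk_2, 0
GR_MID20-chk_2, 3
GR_MID200-chk_2, 2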
If you don't need the items to be in the same order they were output from myprogram, the following works:
... | awk '{ if ($2 > x[$1]) x[$1] = $2 } END { for (k in x) printf "%s %s\n", k, x[k] }'
