Extract Maximum and minimum value using awk

How can I find the maximum and minimum values in the table below using awk?
20 90 60 30
55 75 80 85
10 15 99 95
55 95 70 20
9 35 85 75
I want output like max value=99 and min=9.

With GNU awk:
awk '{for(x=1;x<=NF;x++)a[++y]=$x}END{c=asort(a);print "min:",a[1];print "max:",a[c]}'
output:
min: 9
max: 99
Without awk:
xargs -n1 | sort -n | head -1    (use tail -1 instead of head -1 for the maximum)
e.g.
min:
kent$ echo "20 90 60 30
55 75 80 85
10 15 99 95
55 95 70 20
9 35 85 75"|xargs -n1|sort -n|head -1
9
max:
kent$ echo "20 90 60 30
55 75 80 85
10 15 99 95
55 95 70 20
9 35 85 75"|xargs -n1|sort -n|tail -1
99
You can of course run xargs -n1 | sort -n once and pipe the result to awk, which picks the first and last lines and prints both in one shot.
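A minimal sketch of that one-shot variant, assuming whitespace-separated numeric input on stdin. The function name minmax is only illustrative and does not come from the answer above:

```shell
# Hypothetical helper: print the smallest and largest number read from stdin.
minmax() {
    # xargs -n1 puts one number per line, sort -n orders them numerically,
    # and awk keeps the first non-empty line as min and the last one as max.
    xargs -n1 | sort -n |
        awk 'NF { if (!n++) min = $1; max = $1 }
             END { if (n) print "min=" min, "max=" max }'
}

printf '20 90 60 30\n55 75 80 85\n10 15 99 95\n55 95 70 20\n9 35 85 75\n' | minmax
```

Sorting once and reading both ends avoids running the pipeline twice; for empty input the sketch prints nothing.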

If you have GNU awk:
# using array
awk '{x[NR]=$1}END{asort(x);print "max="x[NR],"min="x[1]}' RS=' +|\n' file
max=99 min=9
# No array
awk 'NR==1{m=n=$1}{$1>m?m=$1:m;$1<n?n=$1:n}END{print "max="m,"min="n}' RS=' +|\n' file
max=99 min=9

awk '
NR == 1 { min = max = $1 }
{
    for (i = 1; i <= NF; i++) {
        min = (min < $i ? min : $i)
        max = (max > $i ? max : $i)
    }
}
END {
    printf "min value = %s\n", (min == "" ? "NaN" : min)
    printf "max value = %s\n", (max == "" ? "NaN" : max)
}
' file
The test resulting in "NaN" is to accommodate empty input files.

Related

Delete repeated rows keeping one closer to another file using awk

I have two files
$cat file1.txt
0105 20 20 95 50
0106 20 20 95 50
0110 20 20 88 60
0110 20 20 88 65
0115 20 20 82 70
0115 20 20 82 70
0115 20 20 82 75
As you can see, column 1 of file1.txt contains repeated values: 0110 and 0115.
For each repeated value I would like to keep only one row, chosen by its column-5 value: the one closest (equal or nearest) to the corresponding value in a reference file (file2.txt). I don't want to change any value in file1.txt, just select one row.
$cat file2.txt
0105 20 20 95 50
0106 20 20 95 50
0107 20 20 95 52
0110 20 20 88 65 34
0112 20 20 82 80 23
0113 20 20 82 85 32
0114 20 20 82 70 23
0115 20 20 82 72
0118 20 20 87 79
0120 20 20 83 79
So, comparing the two files, we must keep 0110 20 20 88 65, because its column-5 entry (65) in file1.txt is closest to the one in the reference file (65 in file2.txt), and delete the other repeated row. Similarly, we must keep 0115 20 20 82 70, because 70 is closest to 72, and delete the other two rows starting with 0115.
Desired output:
0105 20 20 95 50
0106 20 20 95 50
0110 20 20 88 65
0115 20 20 82 70
I am trying the following script, but it does not give the desired result.
awk 'FNR==NR { a[$5]; next } $5 in a ' file1.txt file2.txt > test.txt
awk '{a[NR]=$1""$2} a[NR]!=a[NR-1]{print}' test.txt
The algorithm of my Fortran program is:
# check each entries in column-1 in file1.txt with next rows if they are same or not
i.e. for i=1,i++ do # Here i is ith row
for j=1,j++ do
if a[i,j] != a[i+1,j]; then print the whole row as it is,
else
# find the row b[i,j] in file2.txt starting with a[i,j]
# and compare the 5th column i.e. b[i,j+5] with all a[i,j+5] starting with a[i,j] in file1.txt
# and take the differences to find closest one
e.g. if we have 3 rows starting with same entry, then
we select the a[i,j] in which diff(b[i,j+5],a[i,j+5]) is minimum, i=1,2,3

awk 'BEGIN {
    while ((getline line < "file2.txt") > 0) {
        split(line, f);
        file2[f[1]] = line;
    }
}
{
    if (!($1 in result)) result[$1] = $0;
    split(result[$1], a);
    split(file2[$1], f);
    if (abs(f[5]-$5) < abs(f[5]-a[5])) result[$1] = $0;
}
END {
    for (i in result) print result[i];
}
function abs(n) {
    return (n < 0 ? -n : n);
}' file1.txt | sort
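The same selection can also be sketched with awk's two-file idiom (FNR == NR), which keeps the rows in their original order and so needs no sort. This is only a sketch under assumptions: the helper name closest_rows is hypothetical, every key in the data file is assumed to appear in the reference file, the reference file is assumed to be non-empty, and ties keep the first row seen.

```shell
# Hypothetical helper (not from the answer above).
# usage: closest_rows REFERENCE_FILE DATA_FILE
closest_rows() {
    awk '
        function abs(n) { return n < 0 ? -n : n }
        # First file: remember the reference column-5 value for each key.
        FNR == NR { ref[$1] = $5; next }
        # Second file: note each key the first time it appears ...
        !($1 in best) { order[++count] = $1 }
        # ... and keep the row whose column 5 is strictly nearer to the reference.
        !($1 in best) || abs(ref[$1] - $5) < abs(ref[$1] - val[$1]) {
            best[$1] = $0
            val[$1] = $5
        }
        END { for (i = 1; i <= count; i++) print best[order[i]] }
    ' "$1" "$2"
}

# Small demo with a subset of the sample data, written to a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' '0110 20 20 88 65 34' '0115 20 20 82 72' > "$tmp/file2.txt"
printf '%s\n' '0110 20 20 88 60' '0110 20 20 88 65' \
              '0115 20 20 82 70' '0115 20 20 82 75' > "$tmp/file1.txt"
closest_rows "$tmp/file2.txt" "$tmp/file1.txt"
rm -r "$tmp"
```

For this subset the demo keeps 0110 20 20 88 65 and 0115 20 20 82 70, matching the rows the question asks for.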

convert comma separated list in text file into columns in bash

I've managed to extract data (from an html page) that goes into a table, and I've isolated the columns of said table into a text file that contains the lines below:
[30,30,32,35,34,43,52,68,88,97,105,107,107,105,101,93,88,80,69,55],
[28,6,6,50,58,56,64,87,99,110,116,119,120,117,114,113,103,82,6,47],
[-7,,,43,71,30,23,28,13,13,10,11,12,11,13,22,17,3,,-15,-20,,38,71],
[0,,,3,5,1.5,1,1.5,0.5,0.5,0,0.5,0.5,0.5,0.5,1,0.5,0,-0.5,-0.5,2.5]
Each bracketed list of numbers represents a column. What I'd like to do is turn these lists into actual columns that I can work with in different data formats. I'd also like to be sure to keep the blank parts of these lists (i.e., "[,,,]").
This is basically what I'm trying to accomplish:
30 28 -7 0
30 6
32 6
35 50 43 3
34 58 71 5
43 56 30 1.5
52 64 23 1
. . . .
. . . .
. . . .
I'm parsing data from a web page, and ultimately planning to make the process as automated as possible so I can easily work with the data after I output it to a nice format.
Anyone know how to do this, have any suggestions, or thoughts on scripting this?

Since you have your lists in python, just do it in python:
l=[["30", "30", "32"], ["28","6","6"], ["-7", "", ""], ["0", "", ""]]
for i in zip(*l):
    print("\t".join(i))
produces
30 28 -7 0
30 6
32 6

awk based solution:
awk -F, '{gsub(/\[|\]/, ""); for (i=1; i<=NF; i++) a[i]=a[i] ? a[i] OFS $i: $i}
END {for (i=1; i<=NF; i++) print a[i]}' file
30 28 -7 0
30 6
32 6
35 50 43 3
34 58 71 5
43 56 30 1.5
52 64 23 1
..........
..........

Another solution, but it works only for file with 4 lines:
$ paste \
<(sed -n '1{s,\[,,g;s,\],,g;s|,|\n|g;p}' t) \
<(sed -n '2{s,\[,,g;s,\],,g;s|,|\n|g;p}' t) \
<(sed -n '3{s,\[,,g;s,\],,g;s|,|\n|g;p}' t) \
<(sed -n '4{s,\[,,g;s,\],,g;s|,|\n|g;p}' t)
30 28 -7 0
30 6
32 6
35 50 43 3
34 58 71 5
43 56 30 1.5
52 64 23 1
68 87 28 1.5
88 99 13 0.5
97 110 13 0.5
105 116 10 0
107 119 11 0.5
107 120 12 0.5
105 117 11 0.5
101 114 13 0.5
93 113 22 1
88 103 17 0.5
80 82 3 0
69 6 -0.5
55 47 -15 -0.5
-20 2.5
38
71
Updated: or another version with preprocessing:
$ sed 's|\[||;s|\][,]\?||' t >t2
$ paste \
<(sed -n '1{s|,|\n|g;p}' t2) \
<(sed -n '2{s|,|\n|g;p}' t2) \
<(sed -n '3{s|,|\n|g;p}' t2) \
<(sed -n '4{s|,|\n|g;p}' t2)

If a file named data contains the data given in the problem (exactly as defined above), then the following bash command line will produce the output requested:
$ sed -e 's/\[//' -e 's/\]//' -e 's/,/ /g' <data | rs -T
Example:
$ cat data
[30,30,32,35,34,43,52,68,88,97,105,107,107,105,101,93,88,80,69,55],
[28,6,6,50,58,56,64,87,99,110,116,119,120,117,114,113,103,82,6,47],
[-7,,,43,71,30,23,28,13,13,10,11,12,11,13,22,17,3,,-15,-20,,38,71],
[0,,,3,5,1.5,1,1.5,0.5,0.5,0,0.5,0.5,0.5,0.5,1,0.5,0,-0.5,-0.5,2.5]
$ sed -e 's/\[//' -e 's/\]//' -e 's/,/ /g' <data | rs -T
30 28 -7 0
30 6 43 3
32 6 71 5
35 50 30 1.5
34 58 23 1
43 56 28 1.5
52 64 13 0.5
68 87 13 0.5
88 99 10 0
97 110 11 0.5
105 116 12 0.5
107 119 11 0.5
107 120 13 0.5
105 117 22 1
101 114 17 0.5
93 113 3 0
88 103 -15 -0.5
80 82 -20 -0.5
69 6 38 2.5
55 47 71

Pretty-print with awk?

I have a script that is intended to write numbers stored in a file (in a single column) to another text file. The part of the script that does this is:
awk -F"\n" 'NR==1{a=$1" ";next}{a=a$1" "}END{print a}' col_trim.txt >> row.txt
The output is something like this:
1.31 2.3 3.35 2.59 1.63
2.03 2.21 1.99 1.5 1.12
1 0.6 -0.71 -2.1 0.01
But I want it to be like this:
1.31 2.30 3.35 2.59 1.63
2.03 2.21 1.99 1.50 1.12
1.00 0.60 -0.71 -2.10 0.01
As you can see, all numbers in the second sample have two digits after the decimal point, and a negative number's minus sign sits directly before the number so it doesn't disturb the alignment of the columns.
Any ideas?
P.S.:
The input file is a text file with a column of numbers (for each row):
1.31
2.3
3.35
2.59
1.63
The whole code is like this:
#!/bin/sh
rm *.txt
for time in 00 03 06 09 12 15 18 21 24 27 30 33 36 39 42 45 48 51 54 57 60 63 66 69 72 75 78 81 84 87 90 93 96; do
filename=gfs.t00z.master.grbf$time.10m.uv.grib2
wgrib2 $filename -spread $time.txt
sed 's:lon,lat,[A-Z]* 10 m above ground d=\([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]\).*:\1 '$time'0000:' $time.txt > temp.txt
for (( j = 1; j <= 2; j++ )); do
if [ j == 1 ]; then
sed -n '/lon,lat,UGRD/,/lon,lat,VGRD/p' $time.txt > vel_sep.txt
else
sed -n '/lon,lat,VGRD/,/359.500000,90.000000/p' $time.txt > vel_sep.txt
fi
line=174305
sed -n 1p temp.txt >> row.txt
for (( i = 1; i <= 48; i++ )); do
sed -n "$line","$(($line+93))"p vel_sep.txt > col.txt
sed 's:[0-9]*.[0-9]*,[0-9]*.[0-9]*,::' col.txt > col_trim.txt
awk -F"\n" 'NR==1{a=$1" ";next}{a=a$1" "}END{print a}' col_trim.txt >> row.txt
line=$(($line-720))
done
done
done
exit 0

Replace your awk by this:
awk -F"\n" 'NR==1{a=sprintf("%10.2f", $1); next}
{a=sprintf("%s%10.2f", a,$1);}END{print a}' col_trim.txt >> row.txt
EDIT: For left alignment:
awk -F"\n" 'NR==1{a=sprintf("%-8.2f", $1); next}
{a=sprintf("%s%-8.2f", a,$1);}END{print a}' col_trim.txt >> row.txt

You can use the column command:
awk -F"\n" 'NR==1{a=$1" ";next}{a=a$1" "}END{print a}' col_trim.txt | \
column -t >> row.txt
This gives:
1.31 2.3 3.35 2.59 1.63
2.03 2.21 1.99 1.5 1.12
1 0.6 -0.71 -2.1 0.01

This can be solved using printf with awk
Example:
echo -e "1 -2.5 10\n-3.4 2 12" | awk '{printf "%8.2f %8.2f %8.2f\n",$1,$2,$3}'
    1.00    -2.50    10.00
   -3.40     2.00    12.00
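When the number of columns is not known in advance, the same printf idea can be applied in a loop over NF. A minimal sketch, assuming numeric fields; the helper name fmt_cols and the width of 8 are illustrative choices:

```shell
# Hypothetical helper: reformat every field of every input line as %8.2f.
fmt_cols() {
    awk '{
        line = ""
        # Append each field, right-aligned in 8 characters with 2 decimals.
        for (i = 1; i <= NF; i++) line = line sprintf("%8.2f", $i)
        print line
    }'
}

printf '1.31 2.3 3.35 2.59 1.63\n1 0.6 -0.71 -2.1 0.01\n' | fmt_cols
```

The fixed width leaves room for the minus sign, so negative numbers stay aligned with the positive ones.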

Additionally, there is plenty of room to improve this script.
Here is the first improvement.
Change from:
for time in 00 03 06 09 12 15 18 21 24 27 30 33 36 39 42 45 48 51 54 57 60 63 66 69 72 75 78 81 84 87 90 93 96; do
to
for time in $(seq 0 3 96); do
time=$(printf "%02d" $time)
If you can show us sample output of wgrib2 $filename -spread $time.txt, we can give more suggestions.

Moving the second column to first column using awk

I would like to do the following using awk:
Example of my input data with four columns and a number of rows:
10 20 30 40
50 30 60 80
90 12 40 20
Desired output:
10 20
30 40
>
50 30
60 80
>
90 12
40 20

Try something like:
awk '{print $1 " " $2 "\n" $3 " " $4 "\n>"}'
output is:
10 20
30 40
>
50 30
60 80
>
90 12
40 20
>
Sorry about the trailing >
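One way to avoid the trailing > in awk is to print the separator before every record except the first, rather than after each record. A minimal sketch, assuming four whitespace-separated columns per line; the function name split_rows is hypothetical:

```shell
# Hypothetical helper: print columns 1-2 and 3-4 on separate lines,
# with a ">" line between records but not after the last one.
split_rows() {
    awk 'NR > 1 { print ">" } { print $1, $2; print $3, $4 }'
}

printf '10 20 30 40\n50 30 60 80\n90 12 40 20\n' | split_rows
```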

Try awk '{ print $1" "$2"\n" $3" "$4"\n>" }'

GNU sed
sed -r 's/(\S+\s+){2}/&\n/;$!a >' file
10 20
30 40
>
50 30
60 80
>
90 12
40 20
Notice the last line, no unwanted trailing >.

Here is a pure bash solution (without calling any external utilities):
Script:
while read a b c d; do echo -e "$a $b\n$c $d\n>"; done <infile
Or without the explicit loop:
printf "%s %s\n%s %s\n>\n" $(<infile)
Input:
cat >infile <<XXX
10 20 30 40
50 30 60 80
90 12 40 20
XXX
Output:
10 20
30 40
>
50 30
60 80
>
90 12
40 20
>

Using bash to read elements on a diagonal on a matrix and redirecting it to another file

Currently I have the code shown below. It works and does what it is supposed to do when I echo the variables:
a=`awk 'NR==2 {print $1}' $coor`
b=`awk 'NR==3 {print $2}' $coor`
c=`awk 'NR==4 {print $3}' $coor`
...but I have to do this for many more lines and I want a more general expression, so I attempted the loop shown below. I don't think anything is wrong with the syntax, but it does not output anything to the file "Cmain".
I was wondering if anyone could help me; I'm fairly new to scripting.
If it helps, I can also post what I am trying to read.
for (( i=1; i <= 4 ; i++ )); do
for (( j=0; j <= 3 ; j++ )); do
B="`grep -n "cell" "$coor" | awk 'NR=="$i" {print $j}'`"
done
done
echo "$B" >> Cmain

You can replace your lines of awk with this one:
awk '{ for (i=1; i<=NF; i++) if (NR >= 2 && NR == i) print $(i - 1) }' file.txt
Tested input:
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
Output:
11
22
33
44
55
66
77
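As for why the loop in the question prints nothing: inside single quotes the shell does not expand $i or $j, so awk compares NR with the literal string "$i", which never matches. If a shell loop is still wanted, the values can be handed to awk with -v. A minimal sketch; the helper name cell_at and the sample matrix are made up for illustration:

```shell
# Hypothetical helper: print field COL of line ROW from stdin.
# usage: cell_at ROW COL < matrix
cell_at() {
    # -v copies the shell values into the awk variables row and col.
    awk -v row="$1" -v col="$2" 'NR == row { print $col }'
}

matrix='1 2 3
11 12 13
21 22 23'

# Same pattern as the question: line 2 column 1, line 3 column 2.
for i in 2 3; do
    printf '%s\n' "$matrix" | cell_at "$i" "$((i - 1))"
done
```

This starts one awk process per cell, so the single awk command above remains the better choice for large inputs.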

awk 'BEGIN {f=1} {print $f; f=f+1}' infile > outfile

An alternative using sed and coreutils, assuming space separated input is in infile:
n=$(wc -l infile | cut -d' ' -f1)
for i in $(seq 1 $n); do
sed -n "${i} {p; q}" infile | cut -d' ' -f$i
done
