Pretty-print with awk? - bash

I have code that is intended to output numbers stored in a file (one number per column row) to another TXT file. The part of the code that does this is:
awk -F"\n" 'NR==1{a=$1" ";next}{a=a$1" "}END{print a}' col_trim.txt >> row.txt
The output is something like this:
1.31 2.3 3.35 2.59 1.63
2.03 2.21 1.99 1.5 1.12
1 0.6 -0.71 -2.1 0.01
But I want it to be like this:
1.31 2.30 3.35 2.59 1.63
2.03 2.21 1.99 1.50 1.12
1.00 0.60 -0.71 -2.10 0.01
As you can see, all the numbers in the second sample have two digits after the decimal point, and negative numbers keep the minus sign directly in front of the number, so the arrangement of the columns is not disturbed.
Any idea?
P.S.:
The input file is a text file with one number per row:
1.31
2.3
3.35
2.59
1.63
The whole code is like this:
#!/bin/bash
rm *.txt
for time in 00 03 06 09 12 15 18 21 24 27 30 33 36 39 42 45 48 51 54 57 60 63 66 69 72 75 78 81 84 87 90 93 96; do
filename=gfs.t00z.master.grbf$time.10m.uv.grib2
wgrib2 $filename -spread $time.txt
sed 's:lon,lat,[A-Z]* 10 m above ground d=\([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]\).*:\1 '$time'0000:' $time.txt > temp.txt
for (( j = 1; j <= 2; j++ )); do
if [ "$j" -eq 1 ]; then
sed -n '/lon,lat,UGRD/,/lon,lat,VGRD/p' $time.txt > vel_sep.txt
else
sed -n '/lon,lat,VGRD/,/359.500000,90.000000/p' $time.txt > vel_sep.txt
fi
line=174305
sed -n 1p temp.txt >> row.txt
for (( i = 1; i <= 48; i++ )); do
sed -n "$line","$(($line+93))"p vel_sep.txt > col.txt
sed 's:[0-9]*.[0-9]*,[0-9]*.[0-9]*,::' col.txt > col_trim.txt
awk -F"\n" 'NR==1{a=$1" ";next}{a=a$1" "}END{print a}' col_trim.txt >> row.txt
line=$(($line-720))
done
done
done
exit 0

Replace your awk with this:
awk -F"\n" 'NR==1{a=sprintf("%10.2f", $1); next}
{a=sprintf("%s%10.2f", a,$1);}END{print a}' col_trim.txt >> row.txt
EDIT: For left alignment:
awk -F"\n" 'NR==1{a=sprintf("%-8.2f", $1); next}
{a=sprintf("%s%-8.2f", a,$1);}END{print a}' col_trim.txt >> row.txt

You can use the column command:
awk -F"\n" 'NR==1{a=$1" ";next}{a=a$1" "}END{print a}' col_trim.txt | \
column -t >> row.txt
This gives:
1.31  2.3   3.35   2.59  1.63
2.03  2.21  1.99   1.5   1.12
1     0.6   -0.71  -2.1  0.01
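Note that column -t only aligns the fields it is given; it does not pad 2.3 to 2.30, and aligning each row in a separate run will not line the rows up with each other. If you also want two decimal places, one option (a sketch; row_aligned.txt is just a made-up name) is to build row.txt with two-decimal formatting first (for example the sprintf approach from the other answer) and then align the finished file in a single pass:
column -t row.txt > row_aligned.txt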

This can be solved using printf in awk.
Example:
echo -e "1 -2.5 10\n-3.4 2 12" | awk '{printf "%8.2f %8.2f %8.2f\n",$1,$2,$3}'
    1.00    -2.50    10.00
   -3.40     2.00    12.00
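If the number of columns varies, the three hard-coded conversions can be replaced with a loop over NF (a small sketch):
echo -e "1 -2.5 10\n-3.4 2 12" | awk '{for(i=1;i<=NF;i++) printf "%8.2f%s", $i, (i<NF ? " " : "\n")}'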

Additionally, this script has several places that can be improved.
Here is the first one:
Change from:
for time in 00 03 06 09 12 15 18 21 24 27 30 33 36 39 42 45 48 51 54 57 60 63 66 69 72 75 78 81 84 87 90 93 96; do
to
for time in $(seq 0 3 96); do
time=$(printf "%02d" $time)
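A second possible improvement (a sketch, assuming each line of col.txt has the form lon,lat,value, which is what the sed in the inner loop strips away): the sed plus awk pair that builds col_trim.txt could be collapsed into a single awk that picks out the third comma-separated field and formats it with two decimals in one pass:
sed -n "$line","$(($line+93))"p vel_sep.txt |
awk -F, '{printf "%8.2f", $3} END{print ""}' >> row.txt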
If you can show us the sample output of wgrib2 $filename -spread $time.txt, we can give more suggestions.

Related

Reshape table and complete voids with NA (or -999) using bash

I'm trying to create a table based on the ASCII file below. What I need is to arrange the numbers from the 2nd column in a matrix. The first and third columns of the ASCII file give the columns and rows of the new matrix. The new matrix needs to be fully populated, so it is necessary to fill the missing positions in the new table with NA (or -999).
This is what I have
$ cat infile.txt
1 68 2
1 182 3
1 797 4
2 4 1
2 70 2
2 339 3
2 1396 4
3 12 1
3 355 3
3 1854 4
4 7 1
4 85 2
4 333 3
5 9 1
5 68 2
5 182 3
5 922 4
6 10 1
6 70 2
and what I would like to have:
NA 4 12 7 9 10
68 70 NA 85 68 70
182 339 355 333 182 NA
797 1396 1854 NA 922 NA
I can only use standard UNIX commands (e.g. awk, sed, grep, etc).
So, what I have so far...
I can mimic a 2D array in bash:
irows=(`awk '{print $1 }' infile.txt`) # rows positions
jcols=(`awk '{print $3 }' infile.txt`) # columns positions
values=(`awk '{print $2 }' infile.txt`) # values
declare -A matrix # the new matrix
nrows=(`sort -k3 -n infile.txt | tail -1 | awk '{print $3}'`) # number of rows
ncols=(`sort -k1 -n infile.txt | tail -1 | awk '{print $1}'`) # number of columns
nelem=(`echo "${#values[@]}"`) # number of elements I want to pass to the new matrix
# Creating a matrix (i,j) with -999
for ((i=0;i<=$((nrows-1));i++)) do
for ((j=0;j<=$((ncols-1));j++)) do
matrix[$i,$j]=-999
done
done
and even print it on the screen:
for ((i=0;i<=$((nrows-1));i++)) do
for ((j=0;j<=$((ncols-1));j++)) do
printf " %i" ${matrix[$i,$j]}
done
echo
done
But when I try to assign the elements, something goes wrong:
for ((i=0;i<=$((nelem-1));i++)) do
matrix[${irows[$i]},${jcols[$i]}]=${values[$i]}
done
Thanks in advance for any help with this, really.
A solution in plain bash, simulating a 2D array with an associative array, could look like the script below. (Note that the row and column counts are not hard-coded, and the code works with any permutation of the input lines, provided each line has the format specified in the question.)
$ cat printmat
#!/bin/bash
declare -A mat
nrow=0
ncol=0
while read -r col elem row; do
mat[$row,$col]=$elem
if ((row > nrow)); then nrow=$row; fi
if ((col > ncol)); then ncol=$col; fi
done
for ((row = 1; row <= nrow; ++row)); do
for ((col = 1; col <= ncol; ++col)); do
elem=${mat[$row,$col]}
if [[ -z $elem ]]; then elem=NA; fi
if ((col == ncol)); then elem+=$'\n'; else elem+=$'\t'; fi
printf "%s" "$elem"
done
done
Then ./printmat < infile.txt prints out:
NA 4 12 7 9 10
68 70 NA 85 68 70
182 339 355 333 182 NA
797 1396 1854 NA 922 NA
Any time you find yourself writing a loop in shell just to manipulate text, you have the wrong approach. See why-is-using-a-shell-loop-to-process-text-considered-bad-practice for many of the reasons why.
Using any awk in any shell on every UNIX box:
$ cat tst.awk
{
vals[$3,$1] = $2
numRows = ($3 > numRows ? $3 : numRows)
numCols = $1
}
END {
OFS = "\t"
for (rowNr=1; rowNr<=numRows; rowNr++) {
for (colNr=1; colNr<=numCols; colNr++) {
val = ((rowNr,colNr) in vals ? vals[rowNr,colNr] : "NA")
printf "%s%s", val, (colNr < numCols ? OFS : ORS)
}
}
}
$ awk -f tst.awk infile.txt
NA 4 12 7 9 10
68 70 NA 85 68 70
182 339 355 333 182 NA
797 1396 1854 NA 922 NA
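One small caveat (a suggested tweak, not part of the original script): numCols = $1 keeps whatever column index the last input line happens to carry, which works here because the sample input is ordered by its first field. For input in arbitrary order, track the maximum the same way numRows is tracked:
numCols = ($1 > numCols ? $1 : numCols)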
Here is one way to get you started. Note that this is not intended to be "the" answer, but to encourage you to try to learn the toolkit.
$ join -a1 -e NA -o2.2 <(printf "%s\n" {1..4}"_"{1..6}) \
<(awk '{print $3"_"$1,$2}' file | sort -n) |
pr -6at
NA 4 12 7 9 10
68 70 NA 85 68 70
182 339 355 333 182 NA
797 1396 1854 NA 922 NA
It works; however, the row and column counts are hard-coded, which is not the proper way to do it.
The preferred solution would be to fill up an awk 2D array with the data and print it in matrix form at the end.
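(The awk 2D-array approach is essentially what the tst.awk answer above does.) As for the hard-coding, here is a rough sketch of how the {1..4}"_"{1..6} key list and the pr -6at column count could be derived from the data instead, assuming the same col value row layout as infile.txt:
nrow=$(awk '$3>m{m=$3} END{print m}' infile.txt)
ncol=$(awk '$1>m{m=$1} END{print m}' infile.txt)
join -a1 -e NA -o2.2 \
  <(awk -v r="$nrow" -v c="$ncol" 'BEGIN{for(i=1;i<=r;i++)for(j=1;j<=c;j++)print i"_"j}') \
  <(awk '{print $3"_"$1,$2}' infile.txt | sort -n) |
pr -"$ncol"at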

convert comma separated list in text file into columns in bash

I've managed to extract data (from an html page) that goes into a table, and I've isolated the columns of said table into a text file that contains the lines below:
[30,30,32,35,34,43,52,68,88,97,105,107,107,105,101,93,88,80,69,55],
[28,6,6,50,58,56,64,87,99,110,116,119,120,117,114,113,103,82,6,47],
[-7,,,43,71,30,23,28,13,13,10,11,12,11,13,22,17,3,,-15,-20,,38,71],
[0,,,3,5,1.5,1,1.5,0.5,0.5,0,0.5,0.5,0.5,0.5,1,0.5,0,-0.5,-0.5,2.5]
Each bracketed list of numbers represents a column. What I'd like to do is turn these lists into actual columns that I can work with in different data formats. I'd also like to be sure to include the blank parts of these lists (i.e., "[,,,]").
This is basically what I'm trying to accomplish:
30 28 -7 0
30 6
32 6
35 50 43 3
34 58 71 5
43 56 30 1.5
52 64 23 1
. . . .
. . . .
. . . .
I'm parsing data from a web page, and ultimately planning to make the process as automated as possible so I can easily work with the data after I output it to a nice format.
Anyone know how to do this, have any suggestions, or thoughts on scripting this?
Since you have your lists in python, just do it in python:
l=[["30", "30", "32"], ["28","6","6"], ["-7", "", ""], ["0", "", ""]]
for i in zip(*l):
print "\t".join(i)
produces
30 28 -7 0
30 6
32 6
An awk-based solution:
awk -F, '{gsub(/\[|\]/, ""); for (i=1; i<=NF; i++) a[i]=a[i] ? a[i] OFS $i: $i}
END {for (i=1; i<=NF; i++) print a[i]}' file
30 28 -7 0
30 6
32 6
35 50 43 3
34 58 71 5
43 56 30 1.5
52 64 23 1
..........
..........
Another solution, but it works only for a file with 4 lines:
$ paste \
<(sed -n '1{s,\[,,g;s,\],,g;s|,|\n|g;p}' t) \
<(sed -n '2{s,\[,,g;s,\],,g;s|,|\n|g;p}' t) \
<(sed -n '3{s,\[,,g;s,\],,g;s|,|\n|g;p}' t) \
<(sed -n '4{s,\[,,g;s,\],,g;s|,|\n|g;p}' t)
30 28 -7 0
30 6
32 6
35 50 43 3
34 58 71 5
43 56 30 1.5
52 64 23 1
68 87 28 1.5
88 99 13 0.5
97 110 13 0.5
105 116 10 0
107 119 11 0.5
107 120 12 0.5
105 117 11 0.5
101 114 13 0.5
93 113 22 1
88 103 17 0.5
80 82 3 0
69 6 -0.5
55 47 -15 -0.5
-20 2.5
38
71
Update: here is another version, with preprocessing:
$ sed 's|\[||;s|\][,]\?||' t >t2
$ paste \
<(sed -n '1{s|,|\n|g;p}' t2) \
<(sed -n '2{s|,|\n|g;p}' t2) \
<(sed -n '3{s|,|\n|g;p}' t2) \
<(sed -n '4{s|,|\n|g;p}' t2)
If a file named data contains the data given in the problem (exactly as defined above), then the following bash command line will produce the output requested:
$ sed -e 's/\[//' -e 's/\]//' -e 's/,/ /g' <data | rs -T
Example:
cat data
[30,30,32,35,34,43,52,68,88,97,105,107,107,105,101,93,88,80,69,55],
[28,6,6,50,58,56,64,87,99,110,116,119,120,117,114,113,103,82,6,47],
[-7,,,43,71,30,23,28,13,13,10,11,12,11,13,22,17,3,,-15,-20,,38,71],
[0,,,3,5,1.5,1,1.5,0.5,0.5,0,0.5,0.5,0.5,0.5,1,0.5,0,-0.5,-0.5,2.5]
$ sed -e 's/\[//' -e 's/\]//' -e 's/,/ /g' <data | rs -T
30 28 -7 0
30 6 43 3
32 6 71 5
35 50 30 1.5
34 58 23 1
43 56 28 1.5
52 64 13 0.5
68 87 13 0.5
88 99 10 0
97 110 11 0.5
105 116 12 0.5
107 119 11 0.5
107 120 13 0.5
105 117 22 1
101 114 17 0.5
93 113 3 0
88 103 -15 -0.5
80 82 -20 -0.5
69 6 38 2.5
55 47 71

printing selected rows from a file using awk

I have a text file with data in the following format.
1 0 0
2 512 6
3 992 12
4 1536 18
5 2016 24
6 2560 29
7 3040 35
8 3552 41
9 4064 47
10 4576 53
11 5088 59
12 5600 65
13 6080 71
14 6592 77
15 7104 83
I want to print all the lines where $1 > 1000.
awk 'BEGIN {$1 > 1000} {print " " $1 " "$2 " "$3}' graph_data_tmp.txt
This doesn't seem to give the output that I am expecting. What am I doing wrong?
You can do this:
awk '$1>1000 {print $0}' graph_data_tmp.txt
print $0 prints the whole content of the line. (The reason your version printed every line is that the BEGIN block runs before any input is read, so a condition placed there never filters anything; the condition has to appear as a pattern in front of the action, as above.)
If you want to print the lines after the 1000th line/row instead, you can do the same thing by replacing $1 with NR. NR holds the current record (row) number.
awk 'NR>1000 {print $0}' graph_data_tmp.txt
All you need is:
awk '$1>1000' file
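Note that in the sample data the first field only goes up to 15, so $1>1000 never matches anything; if the intent was to filter on the second column, the same pattern works with $2:
awk '$2>1000' graph_data_tmp.txt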

Extract Maximum and minimum value using awk

How can I find the maximum and minimum values in the table below using the awk command?
20 90 60 30
55 75 80 85
10 15 99 95
55 95 70 20
9 35 85 75
I want the output to be something like: max value=99 and min=9
with gnu awk:
awk '{for(x=1;x<=NF;x++)a[++y]=$x}END{c=asort(a);print "min:",a[1];print "max:",a[c]}'
output:
min: 9
max: 99
without awk:
xargs -n1|sort -n|head or tail -1
e.g.
min:
kent$ echo "20 90 60 30
55 75 80 85
10 15 99 95
55 95 70 20
9 35 85 75"|xargs -n1|sort -n|head -1
9
max:
kent$ echo "20 90 60 30
55 75 80 85
10 15 99 95
55 95 70 20
9 35 85 75"|xargs -n1|sort -n|tail -1
99
You can of course do xargs -n1 | sort -n and then pipe to awk to pick the first and last values and print them in one shot, as sketched below.
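Something like this (a sketch; after sort -n the first line is the minimum and the last line is the maximum):
xargs -n1 < file | sort -n | awk 'NR==1{min=$1} {max=$1} END{print "min:", min; print "max:", max}'
which prints min: 9 and max: 99 for the table above.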
If you have GNU awk:
# using array
awk '{x[NR]=$1}END{asort(x);print "max="x[NR],"min="x[1]}' RS=' +|\n' file
max=99 min=9
# No array
awk 'NR==1{m=n=$1}{$1>m?m=$1:m;$1<n?n=$1:n}END{print "max="m,"min="n}' RS=' +|\n' file
max=99 min=9
awk '
NR == 1 { min=max=$1 }
{
for (i=1;i<=NF;i++) {
min = (min < $i ? min : $i)
max = (max > $i ? max : $i)
}
}
END {
printf "min value = %s\n", (min == "" ? "NaN" : min)
printf "max value = %s\n", (max == "" ? "NaN" : max)
}
' file
The test resulting in "NaN" is to accommodate empty input files.
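With the sample table above, this prints:
min value = 9
max value = 99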

Using bash to read elements on a diagonal on a matrix and redirecting it to another file

So, currently I have created code to do this, as shown below. This code works and does what it is supposed to do after I echo the variables:
a=`awk 'NR==2 {print $1}' $coor`
b=`awk 'NR==3 {print $2}' $coor`
c=`awk 'NR==4 {print $3}' $coor`
...but I have to do this for many more lines and I want a more general expression, so I have attempted to create the loop shown below. Syntax-wise I don't think anything is wrong with the code, but it is not outputting anything to the file "Cmain".
I was wondering if anyone could help me; I'm kind of new at scripting.
If it helps, I can also post what I am trying to read.
for (( i=1; i <= 4 ; i++ )); do
for (( j=0; j <= 3 ; j++ )); do
B="`grep -n "cell" "$coor" | awk 'NR=="$i" {print $j}'`"
done
done
echo "$B" >> Cmain
You can replace your lines of awk with this one:
awk '{ for (i=1; i<=NF; i++) if (NR >= 2 && NR == i) print $(i - 1) }' file.txt
Tested input:
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
Output:
11
22
33
44
55
66
77
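As an aside, the likely reason the loop in the question writes nothing useful is that $i and $j sit inside single quotes, so awk never sees their values, and the echo is outside both loops, so at most the last value would ever reach Cmain. Passing the shell counter in with awk -v avoids the quoting problem; a rough sketch, reading $coor directly the way the a/b/c lines above do (the upper bound 5 is just illustrative):
for (( i = 2; i <= 5; i++ )); do
    # row i, column i-1: the same pattern as a, b and c above
    awk -v row="$i" 'NR == row {print $(row - 1)}' "$coor" >> Cmain
done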
awk 'BEGIN {f=1} {print $f; f=f+1}' infile > outfile
An alternative using sed and coreutils, assuming space-separated input is in infile:
n=$(wc -l infile | cut -d' ' -f1)
for i in $(seq 1 $n); do
sed -n "${i} {p; q}" infile | cut -d' ' -f$i
done
