Using BASH, selecting row and column [CUT command?] - bash

1 A 18 -180
2 B 19 -180
3 C 20 -150
50 D 21 -100
128 E 22 -130
10 F 23 -0
10 G 23 -0
In the above file, I can easily print out the column using cat command.
cat /file_directory | cut -d' ' -f3
In that case, the output would be the third column.
But, What I want to do is something different. For example, I wanna pick the element depending on the row element.
So if I pick B from the second row, the printout would be [row associated "B" in the second column ][column =3] = [2][3]. which is only 19, not anything else. HOW TO DO IT?

Use awk:
$ awk '$2 == "B" {print $3}' file.txt
19
awk splits each row into fields (by default using arbitrary whitespace a the field delimiters). Each statement has two parts: a pattern to select a line, and an action to take on a selected line. In the above, we check if the 2nd column ($2) has the value "B"; for each line for which that is true, we print the value in the 3rd column.

#!/bin/sh
cat sofile+.txt | tr -s ' ' > sofile1+.txt
mv sofile1+.txt sofile+.txt
cat > edcommands+.sh << EOF
/B/
EOF
line=$(ed -s sofile+.txt < edcommands+.txt)
echo ${line} | cut -d' ' -f2,3
rm ./edcommands+.txt
Sofile+.txt is what contains your data.
You might also need to install ed for this, since it isn't in most distributions by default any more, sadly.

Related

Can I catch the value by using bourne shell script?

When I type some command in openwrt, the result is like this.
Security Signal(%) Mode
WPA2 86 on
WPA2 42 on
In this result, I want to catch the signal value(86) in first column.
How can i catch the value by using bourne shell script?
Plus, luci.sys.call function is only used in cbi file for making Luci, isn't it?
Try also
HereYourCommand | awk 'NR==2 { print $2 }'
The awk program prints the second field (aka column) of the second record (aka line).
The following should do:
HereYourCommand | head -2 | tail -1 | tr -s ' ' | cut -d' ' -f2
Replace HereYourCommand with your call to openwrt.
The explanation:
head -2: pick up just the first two lines.
tail -1: from this two lines, pick up the last line.
tr -s ' ': replace multiple spaces with a single one.
cut -d' ' -f2: pick up the 2nd field from the remaining line.
cat test|tail +2|tr -s '\s\t' ' '|cut -d' ' -f2
tail +2 skips first line, then I am replacing spaces or tabs with single space and cut get second field.
output:
4
5
8
input:
x y z
1 4 7
2 5 7
4 8 0

how to find maximum and minimum values of a particular column using AWK [duplicate]

I'm using awk to deal with a simple .dat file, which contains several lines of data and each line has 4 columns separated by a single space.
I want to find the minimum and maximum of the first column.
The data file looks like this:
9 30 8.58939 167.759
9 38 1.3709 164.318
10 30 6.69505 169.529
10 31 7.05698 169.425
11 30 6.03872 169.095
11 31 5.5398 167.902
12 30 3.66257 168.689
12 31 9.6747 167.049
4 30 10.7602 169.611
4 31 8.25869 169.637
5 30 7.08504 170.212
5 31 11.5508 168.409
6 31 5.57599 168.903
6 32 6.37579 168.283
7 30 11.8416 168.538
7 31 -2.70843 167.116
8 30 47.1137 126.085
8 31 4.73017 169.496
The commands I used are as follows.
min=`awk 'BEGIN{a=1000}{if ($1<a) a=$1 fi} END{print a}' mydata.dat`
max=`awk 'BEGIN{a= 0}{if ($1>a) a=$1 fi} END{print a}' mydata.dat`
However, the output is min=10 and max=9.
(The similar commands can return me the right minimum and maximum of the second column.)
Could someone tell me where I was wrong? Thank you!
Awk guesses the type.
String "10" is less than string "4" because character "1" comes before "4".
Force a type conversion, using addition of zero:
min=`awk 'BEGIN{a=1000}{if ($1<0+a) a=$1} END{print a}' mydata.dat`
max=`awk 'BEGIN{a= 0}{if ($1>0+a) a=$1} END{print a}' mydata.dat`
a non-awk answer:
cut -d" " -f1 file |
sort -n |
tee >(echo "min=$(head -1)") \
> >(echo "max=$(tail -1)")
That tee command is perhaps a bit much too clever. tee duplicates its stdin stream to the files names as arguments, plus it streams the same data to stdout. I'm using process substitutions to filter the streams.
The same effect can be used (with less flourish) to extract the first and last lines of a stream of data:
cut -d" " -f1 file | sort -n | sed -n '1s/^/min=/p; $s/^/max=/p'
or
cut -d" " -f1 file | sort -n | {
read line
echo "min=$line"
while read line; do max=$line; done
echo "max=$max"
}
Your problem was simply that in your script you had:
if ($1<a) a=$1 fi
and that final fi is not part of awk syntax so it is treated as a variable so a=$1 fi is string concatenation and so you are TELLING awk that a contains a string, not a number and hence the string comparison instead of numeric in the $1<a.
More importantly in general, never start with some guessed value for max/min, just use the first value read as the seed. Here's the correct way to write the script:
$ cat tst.awk
BEGIN { min = max = "NaN" }
{
min = (NR==1 || $1<min ? $1 : min)
max = (NR==1 || $1>max ? $1 : max)
}
END { print min, max }
$ awk -f tst.awk file
4 12
$ awk -f tst.awk /dev/null
NaN NaN
$ a=( $( awk -f tst.awk file ) )
$ echo "${a[0]}"
4
$ echo "${a[1]}"
12
If you don't like NaN pick whatever you'd prefer to print when the input file is empty.
late but a shorter command and with more precision without initial assumption:
awk '(NR==1){Min=$1;Max=$1};(NR>=2){if(Min>$1) Min=$1;if(Max<$1) Max=$1} END {printf "The Min is %d ,Max is %d",Min,Max}' FileName.dat
A very straightforward solution (if it's not compulsory to use awk):
Find Min --> sort -n -r numbers.txt | tail -n1
Find Max --> sort -n -r numbers.txt | head -n1
You can use a combination of sort, head, tail to get the desired output as shown above.
(PS: In case if you want to extract the first column/any desired column you can use the cut command i.e. to extract the first column cut -d " " -f 1 sample.dat)
#minimum
cat your_data_file.dat | sort -nk3,3 | head -1
#this fill find minumum of column 3
#maximun
cat your_data_file.dat | sort -nk3,3 | tail -1
#this will find maximum of column 3
#to find in column 2 , use -nk2,2
#assing to a variable and use
min_col=`cat your_data_file.dat | sort -nk3,3 | head -1 | awk '{print $3}'`

Bash retrieve column number from column name

Is there a better way (such as a one liner in AWK) where I can get the column number in a table with headings from a column name? I want to be able to process a column independent of what the column number actually is (such as when another column is added the script will not need to change).
For example, given the following table in "table.tsv":
ID Value Target Not Used
1 5 9 11
2 4 8 12
3 6 7 10
I can do a sort on the "Target" column using:
#!/bin/bash
(IFS=$'\t'; read -r; printf "%s\n" "$REPLY"; i=0; for col in $REPLY; do
((++i))
[ "$col" == "Target" ] && break
done; sort -t$'\t' "-k$i,${i}n") < table.tsv
Is there a way to do it without the for loop (or at least clean it up a little)?
The expected output of the given script is:
ID Value Target Not Used
3 6 7 10
2 4 8 12
1 5 9 11
However, I was trying to give an example of what I was trying to do. I want to pass/filter my table through several programs so the headings and all columns should be preserved: just have processing occur at each step.
In pseudo code, what I would like to do is:
print headings from stdin
i=$(magic to determine column position given "Target")
sort -t$'\t' "-k$i,${i}n" # or whatever processing is required on that column
another alternative with a lot of pipes
$ head -1 table | tr -s ' ' '\n' | nl -nln | grep "Target" | cut -f1
extract first row, transpose, number lines, find column name, extract number
Or, awk to the rescue!
$ awk -v RS='\t' '/Target/{print NR; exit}' file.tsv
3
Here is an awk alternative:
awk -F '\t' -v col='Target' 'NR==1{for (i=1; i<=NF; i++) if ($i == col){c=i; break}}
{print $c}' file
EDIT: To print column number only:
awk -F '\t' -v col='Target' 'NR==1{for (i=1; i<=NF; i++) if ($i==col) {print i;exit}}' file
3
$ awk -v name='Target' '{for (i=1;i<=NF;i++) if ($i==name) print i; exit}' file
3

awk: find minimum and maximum in column

I'm using awk to deal with a simple .dat file, which contains several lines of data and each line has 4 columns separated by a single space.
I want to find the minimum and maximum of the first column.
The data file looks like this:
9 30 8.58939 167.759
9 38 1.3709 164.318
10 30 6.69505 169.529
10 31 7.05698 169.425
11 30 6.03872 169.095
11 31 5.5398 167.902
12 30 3.66257 168.689
12 31 9.6747 167.049
4 30 10.7602 169.611
4 31 8.25869 169.637
5 30 7.08504 170.212
5 31 11.5508 168.409
6 31 5.57599 168.903
6 32 6.37579 168.283
7 30 11.8416 168.538
7 31 -2.70843 167.116
8 30 47.1137 126.085
8 31 4.73017 169.496
The commands I used are as follows.
min=`awk 'BEGIN{a=1000}{if ($1<a) a=$1 fi} END{print a}' mydata.dat`
max=`awk 'BEGIN{a= 0}{if ($1>a) a=$1 fi} END{print a}' mydata.dat`
However, the output is min=10 and max=9.
(The similar commands can return me the right minimum and maximum of the second column.)
Could someone tell me where I was wrong? Thank you!
Awk guesses the type.
String "10" is less than string "4" because character "1" comes before "4".
Force a type conversion, using addition of zero:
min=`awk 'BEGIN{a=1000}{if ($1<0+a) a=$1} END{print a}' mydata.dat`
max=`awk 'BEGIN{a= 0}{if ($1>0+a) a=$1} END{print a}' mydata.dat`
a non-awk answer:
cut -d" " -f1 file |
sort -n |
tee >(echo "min=$(head -1)") \
> >(echo "max=$(tail -1)")
That tee command is perhaps a bit much too clever. tee duplicates its stdin stream to the files names as arguments, plus it streams the same data to stdout. I'm using process substitutions to filter the streams.
The same effect can be used (with less flourish) to extract the first and last lines of a stream of data:
cut -d" " -f1 file | sort -n | sed -n '1s/^/min=/p; $s/^/max=/p'
or
cut -d" " -f1 file | sort -n | {
read line
echo "min=$line"
while read line; do max=$line; done
echo "max=$max"
}
Your problem was simply that in your script you had:
if ($1<a) a=$1 fi
and that final fi is not part of awk syntax so it is treated as a variable so a=$1 fi is string concatenation and so you are TELLING awk that a contains a string, not a number and hence the string comparison instead of numeric in the $1<a.
More importantly in general, never start with some guessed value for max/min, just use the first value read as the seed. Here's the correct way to write the script:
$ cat tst.awk
BEGIN { min = max = "NaN" }
{
min = (NR==1 || $1<min ? $1 : min)
max = (NR==1 || $1>max ? $1 : max)
}
END { print min, max }
$ awk -f tst.awk file
4 12
$ awk -f tst.awk /dev/null
NaN NaN
$ a=( $( awk -f tst.awk file ) )
$ echo "${a[0]}"
4
$ echo "${a[1]}"
12
If you don't like NaN pick whatever you'd prefer to print when the input file is empty.
late but a shorter command and with more precision without initial assumption:
awk '(NR==1){Min=$1;Max=$1};(NR>=2){if(Min>$1) Min=$1;if(Max<$1) Max=$1} END {printf "The Min is %d ,Max is %d",Min,Max}' FileName.dat
A very straightforward solution (if it's not compulsory to use awk):
Find Min --> sort -n -r numbers.txt | tail -n1
Find Max --> sort -n -r numbers.txt | head -n1
You can use a combination of sort, head, tail to get the desired output as shown above.
(PS: In case if you want to extract the first column/any desired column you can use the cut command i.e. to extract the first column cut -d " " -f 1 sample.dat)
#minimum
cat your_data_file.dat | sort -nk3,3 | head -1
#this fill find minumum of column 3
#maximun
cat your_data_file.dat | sort -nk3,3 | tail -1
#this will find maximum of column 3
#to find in column 2 , use -nk2,2
#assing to a variable and use
min_col=`cat your_data_file.dat | sort -nk3,3 | head -1 | awk '{print $3}'`

How to get the second column from command output?

My command's output is something like:
1540 "A B"
6 "C"
119 "D"
The first column is always a number, followed by a space, then a double-quoted string.
My purpose is to get the second column only, like:
"A B"
"C"
"D"
I intended to use <some_command> | awk '{print $2}' to accomplish this. But the question is, some values in the second column contain space(s), which happens to be the default delimiter for awk to separate the fields. Therefore, the output is messed up:
"A
"C"
"D"
How do I get the second column's value (with paired quotes) cleanly?
Use -F [field separator] to split the lines on "s:
awk -F '"' '{print $2}' your_input_file
or for input from pipe
<some_command> | awk -F '"' '{print $2}'
output:
A B
C
D
If you could use something other than 'awk' , then try this instead
echo '1540 "A B"' | cut -d' ' -f2-
-d is a delimiter, -f is the field to cut and with -f2- we intend to cut the 2nd field until end.
This should work to get a specific column out of the command output "docker images":
REPOSITORY TAG IMAGE ID CREATED SIZE
ubuntu 16.04 12543ced0f6f 10 months ago 122 MB
ubuntu latest 12543ced0f6f 10 months ago 122 MB
selenium/standalone-firefox-debug 2.53.0 9f3bab6e046f 12 months ago 613 MB
selenium/node-firefox-debug 2.53.0 d82f2ab74db7 12 months ago 613 MB
docker images | awk '{print $3}'
IMAGE
12543ced0f6f
12543ced0f6f
9f3bab6e046f
d82f2ab74db7
This is going to print the third column
Or use sed & regex.
<some_command> | sed 's/^.* \(".*"$\)/\1/'
You don't need awk for that. Using read in Bash shell should be enough, e.g.
some_command | while read c1 c2; do echo $c2; done
or:
while read c1 c2; do echo $c2; done < in.txt
If you have GNU awk this is the solution you want:
$ awk '{print $1}' FPAT='"[^"]+"' file
"A B"
"C"
"D"
awk -F"|" '{gsub(/\"/,"|");print "\""$2"\""}' your_file
#!/usr/bin/python
import sys
col = int(sys.argv[1]) - 1
for line in sys.stdin:
columns = line.split()
try:
print(columns[col])
except IndexError:
# ignore
pass
Then, supposing you name the script as co, say, do something like this to get the sizes of files (the example assumes you're using Linux, but the script itself is OS-independent) :-
ls -lh | co 5

Resources