How to obtain the value for the 3rd one from the bottom in bash?

I have a line like this
3672975 3672978 3672979
awk '{print $1}' will return the first number 3672975
If I still want the first number, but want to address it as the 3rd one from the bottom, how should I adjust it, something like awk '{print $-3}'?
The reason is, I have hundreds of numbers, and I always want to obtain the 3rd one from the bottom.
Can I use awk to obtain the total number of items first, then do the subtraction?

$NF is the last field, $(NF-1) is the one before the last etc., so:
$ awk '{print $(NF-2)}'
for example:
$ echo 3672975 3672978 3672979 | awk '{print $(NF-2)}'
3672975
Edit:
$ echo 1 10 100 | awk '{print $(NF-2)}'
1
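As a sanity check, NF (the total number of fields) can be printed alongside the extracted field to make the subtraction visible:
$ echo 3672975 3672978 3672979 | awk '{print NF, $(NF-2)}'
3 3672975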

Or with cut and rev:
echo 1 2 3 4 | rev | cut -d' ' -f 3 | rev
2
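The same pipeline works on the original sample too, because the second rev restores the characters of the extracted field, so multi-digit numbers come back intact:
echo 3672975 3672978 3672979 | rev | cut -d' ' -f3 | rev
3672975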

Related

Problem replacing numbers with words from a file

I have two files:
In the first one (champions.csv) I have the number and the name of some LoL champions
1,Annie
2,Olaf
3,Galio
4,Twisted Fate
5,Xin Zhao
6,Urgot
7,LeBlanc
8,Vladimir
9,Fiddlesticks
10,Kayle
11,Master Yi
In the second one (top.csv) I have couples of champions (first and second column) and the number of won matches by them (third column)
2,1,3
3,1,5
4,1,6
5,1,1
6,1,10
7,1,9
8,1,11
10,4,12
7,5,2
3,3,6
I need to substitute the numbers of the second file with the respective names of the first file.
I tried using awk and storing the names in an array but it didn't work
lengthChampions=`cat champions.csv | wc -l`
for i in `seq 1 $length`; do
name=`cat champions.csv | head -$i | tail -1 | awk -F',' '{print $2}'`
champions[$i]=$name
done
for i in `seq 1 10`; do
champion1=${champions[`cat top.csv | head -$i | tail -1 | awk -F',' '{print $1}'`]}
champion2=${champions[`cat top.csv | head -$i | tail -1 | awk -F',' '{print $2}'`]}
awk -F',' 'NR=='$i' {$1='$champion1'} {$2='$champion2'} {print $1","$2","$3}' top.csv > tmptop.csv && mv tmptop.csv top.csv
done
I would like a solution for this problem maybe with less code than this. The result should be something like that (not the actual result for my files):
Ahri,Ashe,1502
Camille,Ezreal,892
Ekko,Dr. Mundo,777
Fizz,Caitlyn,650
Gnar,Ezreal,578
Fiora,Irelia,452
Janna,Graves,321
Jax,Jinx,245
Ashe,Corki,151
Katarina,Lee Sin,102
This can be accomplished in a single awk call: associate the numbers with the champion names in an array while reading the first file, then use it to replace the numbers in the second file. (NR==FNR is only true while the first file is being read, so the array a is complete before any line of top.csv is processed.)
awk 'BEGIN{FS=OFS=","} NR==FNR{a[$1]=$2;next} {$1=a[$1];$2=a[$2]} 1' champions.csv top.csv
Olaf,Annie,3
Galio,Annie,5
Twisted Fate,Annie,6
Xin Zhao,Annie,1
Urgot,Annie,10
LeBlanc,Annie,9
Vladimir,Annie,11
Kayle,Twisted Fate,12
LeBlanc,Xin Zhao,2
Galio,Galio,6
In case there are some numbers in top.csv that don't exist in champions.csv, use the following instead to prevent those numbers from being replaced with empty strings:
awk 'BEGIN{FS=OFS=","} NR==FNR{a[$1]=$2;next} ($1 in a){$1=a[$1]} ($2 in a){$2=a[$2]} 1' champions.csv top.csv
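As a quick check of the difference (assuming your awk accepts - as a file operand meaning stdin, as gawk and mawk do): a line whose first number has no entry in champions.csv, say a hypothetical 99,1,7, would come out of the first script as ,Annie,7, while the guarded version keeps the unknown number:
echo '99,1,7' | awk 'BEGIN{FS=OFS=","} NR==FNR{a[$1]=$2;next} ($1 in a){$1=a[$1]} ($2 in a){$2=a[$2]} 1' champions.csv -
99,Annie,7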
Assuming that the 2nd column of champions.csv isn't too huge (i.e. not larger than what a bash array can comfortably hold), then using bash and cut:
readarray -t -O 1 c < <(cut -d, -f2 champions.csv)
while IFS=, read x y z; do
printf '%s,%s,%s\n' "${c[$x]}" "${c[$y]}" "$z"
done < top.csv
Output:
Olaf,Annie,3
Galio,Annie,5
Twisted Fate,Annie,6
Xin Zhao,Annie,1
Urgot,Annie,10
LeBlanc,Annie,9
Vladimir,Annie,11
Kayle,Twisted Fate,12
LeBlanc,Xin Zhao,2
Galio,Galio,6
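The -O 1 is what makes the lookup work: readarray normally fills the array starting at index 0, and starting at 1 instead lines the array indexes up with the champion numbers (this relies on champions.csv being numbered consecutively from 1, as in the sample). For example:
readarray -t -O 1 c < <(cut -d, -f2 champions.csv)
echo "${c[4]}"
Twisted Fate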

Print column using while loop in awk

I need to extract the values of the 2nd column of a file, from the row where $1 = 2 until the row where $1 = 3. As an example, from the file
1 | 2.158e+06
| 2.31e+06
| 5.008e+06
2 | 693000
| 718000
| 725000
3 | 2.739e+06
| 2.852e+06
| 2.865e+06
| 2.874e+06
4 | 4.033e+06
| 4.052e+06
| 4.059e+06
I would like to extract values of the 2nd column from $1=2 until $1=3
693000
718000
725000
I tried using awk, but I have only figured out how to extract the values from $1=1 until $1=2
awk -F "|" '{if ($1>1) exit; else print $2}' foo.txt
Output
2.158e+06
2.31e+06
5.008e+06
I also tried this
awk -F "|" '{i=2; do {print $2; i++} while ($4); if ($1>i) exit}' foo.txt
But it gives me the whole 2nd column
2.158e+06
2.31e+06
5.008e+06
693000
718000
725000
2.739e+06
2.852e+06
2.865e+06
2.874e+06
4.033e+06
Does anyone know how to do this using awk or other command?
Thanks
A range pattern could work nicely here. The pattern $1==2,$1==3 will start executing the action when the first column is 2 and stop when it is 3. (Since the range is inclusive we need to check that the first column is not 3 before printing the second column in this case.)
$ awk -F\| '$1==2,$1==3 { if ($1 != 3) print $2 }' foo.txt
693000
718000
725000
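If the boundaries vary, the same range pattern can be parameterized (a sketch; start and stop are made-up variable names passed in with -v):
$ awk -F\| -v start=2 -v stop=3 '$1==start,$1==stop { if ($1+0 != stop) print $2 }' foo.txt
693000
718000
725000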
hzhang@dell-work ~ $ cat sample.csv
1 | 2.158e+06
| 2.31e+06
| 5.008e+06
2 | 693000
| 718000
| 725000
3 | 2.739e+06
| 2.852e+06
| 2.865e+06
| 2.874e+06
4 | 4.033e+06
| 4.052e+06
| 4.059e+06
hzhang@dell-work ~ $ awk -F"|" 'BEGIN{c=0}{if($1>=3){c=0} if(c==1 ||($1>=2 && $1<3)){c = 1;print $2}}' sample.csv
693000
718000
725000
I set a flag c. If $1 is not between 2 and 3, the flag is set to 0; otherwise it is 1, which means we can print $2 out.
This is what I came up with:
awk -F "|" '{if ($1==3) exit} /^2/,EOF {print $2}' file
1) /^2/,EOF {print $2} prints the second column of every line, starting with a row that begins with a 2 and continuing to the end of the file (awk has no built-in EOF; here it is just an uninitialized variable that is always false, so the range never ends on its own)
2) {if ($1==3) exit} stops the printing once the first column is a 3
Output
693000
718000
725000
Using the getline statement in awk tactically:
awk -v FS=" [|] " '$1=="2"{print $2;getline;while(($1==" "||$1==2)){print $2;$0="";getline>0}}' my_file
Here is another awk
awk -F\| '/^2 / {f=1} /^3 / {f=0} f {print $2+0}' file
693000
718000
725000
-F\| set field separator to |
/^2 / if the line starts with 2, set flag f to true.
/^3 / if the line starts with 3, set flag f to false.
f {print $2+0} if flag f is true, print field 2.
$2+0 forces a numeric conversion, which strips the space in front of the number. Remove the +0 if the field contains letters.
Just so you don't have to read the entire file, exit when you see a '3':
$ awk -F\| '/^2\s+/ {f=1} /^3\s+/ {exit} f {print $2+0}' file
693000
718000
725000

AWK Print Second Column of Last Line

I'm trying to use AWK to print the second column of the last line of this command's output (the total disk space):
df --total
The command I'm using is:
df --total | awk 'FNR == NF {print $2}'
But it does not get it right.
Is there another way to do it?
You're using the awk variable NF which is Number of Fields. You might have meant NR, Number of Rows, but it's easier to just use END:
df --total | awk 'END {print $2}'
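This works because the fields of the last record are still visible in the END block: POSIX guarantees NF keeps its value there, and gawk, mawk and BWK awk all preserve $0 and the fields as well. A quick demonstration on made-up input:
$ printf 'a 1\nb 2\nc 3\n' | awk 'END {print $2}'
3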
You can use tail first then use awk:
df --total | tail -1 | awk '{print $2}'
One way to do it is with a tail/awk combination, the former to get just the last line, the latter to print the second column:
df --total | tail -1 | awk '{print $2}'
A pure-awk solution is to simply store the second column of every line and print it out at the end:
df --total | awk '{store = $2} END {print store}'
Or, since the fields of the last line are still available in the END block, simply:
df --total | awk 'END {print $2}'
awk has no pattern that directly addresses "this is the last line" while reading input. sed does though:
df --total | sed -n '$s/[^[:space:]]\+[[:space:]]\+\([[:digit:]]\+\).*/\1/p'
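The same command on made-up numbers (GNU sed, since the \+ escapes are a GNU extension): on the last line only, the leading non-space run (total) plus the following spaces are consumed, and the captured run of digits replaces the whole match:
$ printf 'Filesystem 1K-blocks Used\ntotal 524288 100000\n' | sed -n '$s/[^[:space:]]\+[[:space:]]\+\([[:digit:]]\+\).*/\1/p'
524288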

How can awk take the result of a unix command as a parameter?

Say there is an input file with tab-delimited fields, where the first field is an integer:
1 abc
1 def
1 ghi
1 lalala
1 heyhey
2 ahb
2 bbh
3 chch
3 chchch
3 oiohho
3 nonon
3 halal
3 whatever
First, I need to compute the counts of the unique values in the first field; that will be:
5 for 1, 2 for 2, and 6 for 3
Then I need to find the max of these counts, in this case, it's 6.
Now I need to pass "6" to another awk script as a parameter.
I know I can use the command below to get a list of counts:
cut -f1 input.txt | sort | uniq -c | awk -F ' ' '{print $1}' | sort
But how do I get that single count number and pass it to the next awk command as a parameter, not as an input file?
This is nothing specific to awk.
Either a program can read from stdin, in which case you pass the input with a pipe:
prg1 | prg2
or the program expects its input as a parameter, in which case you use command substitution:
prg2 $(prg1)
Note that in both cases prg1 runs before prg2.
Some programs allow both possibilities, though huge amounts of data are rarely passed as arguments.
This AWK script replaces your whole pipeline:
awk -v parameter="$(awk '{a[$1]++} END {for (i in a) {if (a[i] > max) {max = a[i]}}; print max}' inputfile)" '{print parameter}' otherfile
where '{print parameter}' is a stand-in for your other AWK script and "otherfile" is the input for that script.
Note: It is extremely likely that the two AWK scripts could be combined into one which would be less of a hack than doing it in a way such as that outlined in your question (awk feeding awk).
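A sketch of what that combination might look like, counting in the first pass and using the maximum in the second (the {print max} body is a stand-in for whatever the real second script does with the value):
awk 'NR==FNR {a[$1]++; if (a[$1] > max) max = a[$1]; next} {print max}' input.txt otherfile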
You can use the shell's $() command substitution:
awk -f script -v num=$(cut -f1 input.txt | sort | uniq -c | awk -F ' ' '{print $1}' | sort -n | tail -1) < input_file
(I added the tail -1 to ensure that at most one line is used, and made the final sort numeric with -n so that the largest count really sorts last.)

Remove all occurrences of a duplicate line

If I want to remove lines where certain fields are duplicated then I use sort -u -k n,n.
But this keeps one occurrence. If I want to remove all occurrences of the duplicate is there any quick bash or awk way to do this?
Eg I have:
1 apple 30
2 banana 21
3 apple 9
4 mango 2
I want:
2 banana 21
4 mango 2
I would presort and then use a hash in Perl, but for very large files this is going to be slow.
This will keep your output in the same order as your input:
awk '{seen[$2]++; a[++count]=$0; key[count]=$2} END {for (i=1;i<=count;i++) if (seen[key[i]] == 1) print a[i]}' inputfile
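With the sample input above this prints the expected lines, in their original order:
2 banana 21
4 mango 2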
Try sort -k <your fields> | awk '{print $3, $1, $2}' | uniq -f2 -u | awk '{print $2, $3, $1}' to remove all lines that are duplicated (without keeping any copies). If you don't need the last field, change that first awk command to just cut -f 1-5 -d ' ', change the -f2 in uniq to -f1, and remove the second awk command.
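Spelled out against the sample above (sorting on the second field so uniq sees duplicates adjacently; inputfile is a stand-in name):
sort -k2,2 inputfile | awk '{print $3, $1, $2}' | uniq -f2 -u | awk '{print $2, $3, $1}'
2 banana 21
4 mango 2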
