Storing grep output in a loop - shell

This grep command prints the numbers (the count of groups merged):
grep "merged" sombe_conversion_PSTN.sh.sql.log | awk '{print $1}' | sed 's/ //g'
The output is as follows:
1000000
41474
41543
83410
83153
83085
82861
82904
82715
41498
41319
I need to add up the values from the second row through the last row of the output and store the sum in one variable, and store the first value in a different variable.
For example:
var_num=1000000
sum_others=663962
How do I loop and add the values?

Do it twice. If your list of numbers is in the file output, do:
$ var_num=$(cat output | head -1)
$ sum_others=$(cat output | sed '1d' | awk '{s += $1} END {print s}')
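If reading the file twice is undesirable, both values can be captured in a single awk pass. A sketch, assuming the numbers are in the file output, one per line:
read -r var_num sum_others < <(
    awk 'NR==1 {first=$1; next}   # remember the first value
         {s += $1}                # sum everything else
         END {print first, s}' output
)
echo "$var_num"      # 1000000
echo "$sum_others"   # 663962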

Related

How do I remove the header in the df command?

I'm trying to write a bash command that will sort all volumes by the amount of data they have used. I tried:
df | awk '{print $1 | "sort -r -k3 -n"}'
Output:
map
devfs
Filesystem
/dev/disk1s5
/dev/disk1s2
/dev/disk1s1
But this also shows the header called Filesystem.
How do I remove that?
For your specific case, i.e. using awk, codeforester's answer (using awk's NR (Number of Records) variable) is the best.
In the more general case, to remove the first line of any output, you can use tail's -n +N option, which outputs starting with line N:
df | tail -n +2 | other_command
This will remove the first line in df output.
Skip the first line, like this:
df | awk 'NR>1 {print $1 | "sort -r -k3 -n"}'
I normally use one of these options, if I have no reason to use awk:
df | sed 1d
The 1d command tells sed to delete the first line and print everything else.
df | tail -n+2
The -n +2 option tells tail to start at line 2 and print everything until end of input.
I suspect sed is faster than awk or tail, but I can't prove it.
EDIT
If you want to use awk, this will print every line except the first:
df | awk '{if (FNR>1) print}'
FNR is the File Record Number. It is the line number of the input. If it is greater than 1, print the input line.
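A side note not in the original answer: since awk's default action is to print, the condition alone suffices, and with a single input stream FNR and NR are interchangeable, so this can be shortened to:
df | awk 'FNR>1'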
Count the lines in the output of df with wc, then subtract one line to output a headerless df with tail ...
LINES=$(df|wc -l)
LINES=$((${LINES}-1))
df | tail -n ${LINES}
OK, I see a one-liner is wanted. Here is mine ...
DF_HEADERLESS=$(LINES=$(df|wc -l); LINES=$((${LINES}-1));df | tail -n ${LINES})
And for formatted output, let's loop over it with printf ...
printf "%s\t%s\t%s\t%s\t%s\t%s\n" ${DF_HEADERLESS} | awk '{print $1 | "sort -r -k3 -n"}'
This might help with GNU df and GNU sort:
df -P | awk 'NR>1{$1=$1; print}' | sort -r -k3 -n | awk '{print $1}'
With GNU df and GNU awk:
df -P | awk 'NR>1{array[$3]=$1} END{PROCINFO["sorted_in"]="#ind_num_desc"; for(i in array){print array[i]}}'
Documentation: 8.1.6 Using Predefined Array Scanning Orders with gawk
Removing something from a command output can be done very simply, using grep -v, so in your case:
df | grep -v "Filesystem" | ...
(You can do your awk at the ...)
When you're not sure about upper or lower case, you can add -i:
df | grep -i -v "FiLeSyStEm" | ...
(The mixed upper/lower case is meant as a joke, to make the point clear. :-) )
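One caveat with this approach (my note, not part of the original answer): grep -v drops any line containing the pattern, so a filesystem whose name happened to contain "Filesystem" would disappear too. Anchoring the pattern to the start of the line is safer:
df | grep -v "^Filesystem" | ...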

Problem replacing numbers with words from a file

I have two files:
In the first one (champions.csv) I have the numbers and names of some LoL champions:
1,Annie
2,Olaf
3,Galio
4,Twisted Fate
5,Xin Zhao
6,Urgot
7,LeBlanc
8,Vladimir
9,Fiddlesticks
10,Kayle
11,Master Yi
In the second one (top.csv) I have pairs of champions (first and second columns) and the number of matches won by them (third column):
2,1,3
3,1,5
4,1,6
5,1,1
6,1,10
7,1,9
8,1,11
10,4,12
7,5,2
3,3,6
I need to substitute the numbers of the second file with the respective names of the first file.
I tried storing the names in an array and using awk, but it didn't work:
lengthChampions=`cat champions.csv | wc -l`
for i in `seq 1 $lengthChampions`; do
    name=`cat champions.csv | head -$i | tail -1 | awk -F',' '{print $2}'`
    champions[$i]=$name
done
for i in `seq 1 10`; do
    champion1=${champions[`cat top.csv | head -$i | tail -1 | awk -F',' '{print $1}'`]}
    champion2=${champions[`cat top.csv | head -$i | tail -1 | awk -F',' '{print $2}'`]}
    awk -F',' 'NR=='$i' {$1='$champion1'} {$2='$champion2'} {print $1","$2","$3}' top.csv > tmptop.csv && mv tmptop.csv top.csv
done
I would like a solution to this problem, ideally with less code than this. The result should be something like this (not the actual result for my files):
Ahri,Ashe,1502
Camille,Ezreal,892
Ekko,Dr. Mundo,777
Fizz,Caitlyn,650
Gnar,Ezreal,578
Fiora,Irelia,452
Janna,Graves,321
Jax,Jinx,245
Ashe,Corki,151
Katarina,Lee Sin,102
This can be accomplished in a single awk call: associate numbers with champions in an array while reading the first file, then use it to replace the numbers in the second file.
awk 'BEGIN{FS=OFS=","} NR==FNR{a[$1]=$2;next} {$1=a[$1];$2=a[$2]} 1' champions.csv top.csv
Olaf,Annie,3
Galio,Annie,5
Twisted Fate,Annie,6
Xin Zhao,Annie,1
Urgot,Annie,10
LeBlanc,Annie,9
Vladimir,Annie,11
Kayle,Twisted Fate,12
LeBlanc,Xin Zhao,2
Galio,Galio,6
In case there are numbers in top.csv that don't exist in champions.csv, use the following instead to prevent those numbers from being deleted:
awk 'BEGIN{FS=OFS=","} NR==FNR{a[$1]=$2;next} ($1 in a){$1=a[$1]} ($2 in a){$2=a[$2]} 1' champions.csv top.csv
Assuming that the 2nd column of champions.csv isn't too huge (i.e. larger than the maximum size of the bash array ${c[@]}), then using bash and cut:
readarray -t -O 1 c < <(cut -d, -f2 champions.csv)
while IFS=, read x y z; do
    printf '%s,%s,%s\n' "${c[$x]}" "${c[$y]}" "$z"
done < top.csv
Output:
Olaf,Annie,3
Galio,Annie,5
Twisted Fate,Annie,6
Xin Zhao,Annie,1
Urgot,Annie,10
LeBlanc,Annie,9
Vladimir,Annie,11
Kayle,Twisted Fate,12
LeBlanc,Xin Zhao,2
Galio,Galio,6
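If top.csv may reference numbers missing from champions.csv, the same fallback as in the awk variant can be had with bash's ${var:-default} expansion, which keeps the raw number when no name was read. A sketch under the same assumptions as above:
readarray -t -O 1 c < <(cut -d, -f2 champions.csv)
while IFS=, read -r x y z; do
    # fall back to the raw number when the array holds no name for it
    printf '%s,%s,%s\n' "${c[$x]:-$x}" "${c[$y]:-$y}" "$z"
done < top.csv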

Split numbers and store them in different files using a Unix shell script

I have a file called "list.txt" which contains the following rows of numbers.
31056780
31909020
31092320
61093190
61094592
45090280
45902902
I now need to take all the rows starting with "31" and store them in a file called file31.txt, all the rows starting with "61" in file61.txt, and all the rows starting with "45" in file45.txt.
file31.txt will contain:
31056780
31909020
31092320
file61.txt will contain:
61093190
61094592
file45.txt will contain:
45090280
45902902
I tried these commands for all three, but they do not do what I want them to do.
awk -F\" '/31*/ {print $0}' list.txt > file31
awk -F\" '/61*/ {print $0}' list.txt > file61
awk -F\" '/45*/ {print $0}' list.txt > file45
You can use output redirection inside a single awk script. It can construct each filename by concatenating "file" with the first two characters of the line:
awk '{ fn = "file" substr($0, 1, 2) ".txt"; print > fn }' list.txt
You could use grep or sed to filter the lines with a matching pattern, for example:
sed '/^31/!d' list.txt > list31.txt
Or in a for loop for every number you want:
for n in "31" "45" "61"; do sed '/^'"$n"'/!d' list.txt > list$n.txt; done
Hope it helps.
You can use:
awk '/^31/{print > "file31"} /^45/{print > "file45"} /^61/{print > "file61"}' file
for i in `cat list.txt | cut -c1-2 | uniq`; do cat list.txt | grep -P ^${i} > file${i}.txt; done
This command works fine and is generic enough to work for all cases.
Now let's understand how it works.
cat list.txt | cut -c1-2 | uniq
31
45
61
Next we loop over these unique identifiers to create the new files using
cat list.txt | grep -P ^${i}
grep matches the pattern anywhere in the line; the ^ anchor restricts the match to the beginning of the line. (The -P flag enables Perl-compatible regular expressions; it is not strictly needed here, since ^ works in plain grep as well.)
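One detail worth knowing (my note): uniq only collapses adjacent duplicates, so if equal prefixes were not grouped together in list.txt, the loop would process the same prefix more than once (harmless, since each pass regenerates the same file, but wasteful). sort -u makes the prefix list unique regardless of input order:
for i in $(cut -c1-2 list.txt | sort -u); do grep "^${i}" list.txt > "file${i}.txt"; done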

How do I pipe commands inside a for loop in bash?

I am writing a bash script to iterate through the lines of a file that have a given value.
The command I am using to list the possible values is:
cat file.csv | cut -d';' -f2 | sort | uniq | head
When I use it in a for loop like this, it stops working:
for i in $( cat file.csv | cut -d';' -f2 | sort | uniq | head )
do
    # do something else with these lines
done
How can I use piped commands in for loop?
You can use this awk command to get the sum of the 3rd column for each unique value of the 2nd column:
awk -F ';' '{sums[$2]+=$3} END{for (i in sums) print i ":", sums[i]}' file.csv
Input data:
asd;foo;0
asd;foo;2
asd;bar;1
asd;foo;4
Output:
foo: 6
bar: 1
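If the goal really is to iterate line by line rather than aggregate in awk, a while read loop fed by process substitution avoids the word-splitting pitfalls of for $(...). A sketch; the body is a placeholder for whatever per-value work is needed:
while IFS= read -r i; do
    # do something else with "$i"
    echo "value: $i"
done < <(cut -d';' -f2 file.csv | sort | uniq | head)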

How can awk take the result of a Unix command as a parameter?

Say there is an input file with tab-delimited fields, where the first field is an integer:
1 abc
1 def
1 ghi
1 lalala
1 heyhey
2 ahb
2 bbh
3 chch
3 chchch
3 oiohho
3 nonon
3 halal
3 whatever
First, I need to compute the counts of the unique values in the first field; that will be
5 for 1, 2 for 2, and 6 for 3.
Then I need to find the max of these counts; in this case, it's 6.
Now I need to pass "6" to another awk script as a parameter.
I know I can use the command below to get a list of counts:
cut -f1 input.txt | sort | uniq -c | awk -F ' ' '{print $1}' | sort
but how do I get the first count number and pass it to the next awk command as a parameter, not as an input file?
This is nothing specific to awk.
Either a program reads from stdin, in which case you can pass the input with a pipe:
prg1 | prg2
or the program expects its input as a parameter, in which case you use
prg2 $(prg1)
Note that in both cases prg1's output is produced before prg2 consumes it; with $(), prg1 must finish completely before prg2 even starts.
Some programs allow both possibilities, though large amounts of data are rarely passed as arguments.
This AWK script replaces your whole pipeline:
awk -v parameter="$(awk '{a[$1]++} END {for (i in a) {if (a[i] > max) {max = a[i]}}; print max}' inputfile)" '{print parameter}' otherfile
where '{print parameter}' is a stand-in for your other AWK script and "otherfile" is the input for that script.
Note: It is extremely likely that the two AWK scripts could be combined into one, which would be less of a hack than doing it the way outlined in your question (awk feeding awk).
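As a sketch of that combination: read the counting input first with the NR==FNR idiom, tracking the running maximum, so that max is already final when the second file is processed ('{print max}' again stands in for the real script):
awk 'NR==FNR { if (++cnt[$1] > max) max = cnt[$1]; next }  # first file: count field 1, track max
     { print max }                                          # second file: max is available here
' input.txt otherfile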
You can use the shell's $() command substitution:
awk -f script -v num=$(cut -f1 input.txt | sort | uniq -c | awk -F ' ' '{print $1}' | sort -n | tail -1) < input_file
(I added sort -n so the counts are compared numerically, and tail -1 to ensure that at most one value, the largest, is used.)
