Datetime to epoch conversion - bash

I have a bash question (when using awk). I'm extracting every instance of the first and fifth columns in a text file and piping them to a new file with the following code:
cut -f4 test170201.rawtxt | awk '/stream_0/ { print $1, $5 }' > testLogFile.txt
This is part of the file (test170201.rawtxt) I'm extracting the data from; the relevant columns are Timestamp and Loss:
Timestamp Stream Status Seq Loss Bytes Delay
17/02/01.10:58:25.212577 stream_0 OK 80281 0 1000 38473
17/02/01.10:58:25.213401 stream_0 OK 80282 0 1000 38472
17/02/01.10:58:25.215560 stream_0 OK 80283 0 1000 38473
17/02/01.10:58:25.216645 stream_0 OK 80284 0 1000 38472
This is the result I'm getting in testLogFile.txt
17/02/01.10:58:25.212577 0
17/02/01.10:58:25.213401 0
17/02/01.10:58:25.215560 0
17/02/01.10:58:25.216645 0
However, I want the Timestamp to be written as epoch time in the file above. Is there an easy way of modifying the code I already have to do this?

Given:
$ cat file
Timestamp Stream Status Seq Loss Bytes Delay
17/02/01.10:58:25.212577 stream_0 OK 80281 0 1000 38473
17/02/01.10:58:25.213401 stream_0 OK 80282 0 1000 38472
17/02/01.10:58:25.215560 stream_0 OK 80283 0 1000 38473
17/02/01.10:58:25.216645 stream_0 OK 80284 0 1000 38472
You can write a Bash script to do what you are looking for (this example uses BSD/macOS date; a GNU date variant follows):
while IFS= read -r line || [[ -n "$line" ]]; do
    if [[ "$line" =~ ^[[:digit:]]{2}/[[:digit:]]{2}/[[:digit:]]{2} ]]
    then
        arr=($line)
        ts=${arr[0]}
        dec=${ts##*.}    # fractional seconds
        # GNU date may need different flags:
        epoch=$(date -j -f "%y/%m/%d.%H:%M:%S" "${ts%.*}" "+%s")
        printf "%s.%s\t%s\n" "$epoch" "$dec" "${arr[4]}"
    fi
done <file >out_file
$ cat out_file
1485975505.212577 0
1485975505.213401 0
1485975505.215560 0
1485975505.216645 0
For GNU date, try:
while IFS= read -r line || [[ -n "$line" ]]; do
    if [[ "$line" =~ ^[[:digit:]]{2}/[[:digit:]]{2}/[[:digit:]]{2} ]]
    then
        arr=($line)
        ts="20${arr[0]}"
        d="${ts%%.*}"
        tmp="${ts%.*}"
        tm="${tmp#*.}"
        dec="${ts##*.}"    # fractional seconds
        epoch=$(date +"%s" --date="$d $tm")
        printf "%s.%s\t%s\n" "$epoch" "$dec" "${arr[4]}"
    fi
done <file >out_file
For a GNU awk solution, you can do:
awk 'function epoch(s){
    split(s, dt, /[/:. ]/)
    s = "20" dt[1] " " dt[2] " " dt[3] " " dt[4] " " dt[5] " " dt[6]
    return mktime(s) "." dt[7]
}
/^[0-9][0-9]/ { print epoch($1), $5 }' file >out_file
If you don't want the fractional seconds included in the epoch, they are easily removed.
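As a sanity check, the same conversion can be sketched with GNU date and pure parameter expansion (TZ is pinned to UTC here only so the result is reproducible; the timestamp is the first one from the question):

```shell
# Convert one 'YY/MM/DD.HH:MM:SS.ffffff' timestamp to epoch with GNU date.
ts='17/02/01.10:58:25.212577'
frac=${ts##*.}               # fractional seconds: 212577
dt=${ts%.*}                  # 17/02/01.10:58:25
d=${dt%%.*} t=${dt#*.}       # date part and time part
epoch=$(TZ=UTC date -d "20${d//\//-} $t" +%s)
echo "$epoch.$frac"          # 1485946705.212577 in UTC
```

With the local timezone instead of UTC, the integer part will differ, which is why the sample outputs in the answers above disagree with each other.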

awk -F '[.[:blank:]]+' '
    # use dot and blanks as separators (to avoid trailing time info)
    {
        # for lines other than the header
        if (NR > 1) {
            # mktime expects the format "YYYY MM DD HH MM SS [DST]"
            # prepend the century
            T = "20"$1 " " $2
            # use the correct separator
            gsub( /[\/:]/, " ", T)
            # convert to epoch
            E = mktime( T)
            # print the result, re-attaching the fractional part
            # (%s keeps any leading zeros in the fraction)
            printf("%d.%s %s\n", E, $3, $7)
        }
        else {
            # print the header (line 1): Timestamp and Loss
            print $1 " " $5
        }
    }
' test170201.rawtxt \
    > Redirected.file
The code is self-commented; it is longer than necessary for the sake of clarity.
It uses GNU awk for the mktime function, which is not available in POSIX awk or in older versions.
A slightly optimized one-liner:
awk -F '[.[:blank:]]+' '{if(NR>1){T="20"$1" "$2;gsub(/[\/:]/," ",T);print mktime(T)" "$7}else print $1" "$5}' test170201.rawtxt

Using GNU awk
Input
$ cat f
Timestamp Stream Status Seq Loss Bytes Delay
17/02/01.10:58:25.212577 stream_0 OK 80281 0 1000 38473
17/02/01.10:58:25.213401 stream_0 OK 80282 0 1000 38472
17/02/01.10:58:25.215560 stream_0 OK 80283 0 1000 38473
17/02/01.10:58:25.216645 stream_0 OK 80284 0 1000 38472
Output
$ awk '
BEGIN{cyear = strftime("%y",systime())}
function epoch(v, datetime){
sub(/\./," ",v);
split(v,datetime,/[/: ]/);
datetime[1] = datetime[1] <= cyear ? 2000+datetime[1] : 1900+datetime[1];
return mktime(datetime[1] " " datetime[2] " " datetime[3] " " datetime[4]" " datetime[5]" " datetime[6])
}
/stream_0/{
print epoch($1),$5
}' f
1485926905 0
1485926905 0
1485926905 0
1485926905 0
To write to a new file, just redirect like below:
cut -f4 test170201.rawtxt | awk '
BEGIN{cyear = strftime("%y",systime());}
function epoch(v, datetime){
sub(/\./," ",v);
split(v,datetime,/[/: ]/);
datetime[1] = datetime[1] <= cyear ? 2000+datetime[1] : 1900+datetime[1];
return mktime(datetime[1] " " datetime[2] " " datetime[3] " " datetime[4]" " datetime[5]" " datetime[6])
}
/stream_0/{
print epoch($1),$5
}' > testLogFile.txt
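The two-digit-year pivot used above can be checked in isolation; cyear is hard-coded to 24 here so the expected output is stable (the real script derives it from systime()):

```shell
# Years up to the pivot map to 20xx, later ones to 19xx
echo '17 99' | awk -v cyear=24 '{
    for (i = 1; i <= NF; i++)
        printf "%s%s", ($i <= cyear ? 2000 + $i : 1900 + $i), (i < NF ? " " : "\n")
}'
# -> 2017 1999
```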


convert some of the response values in curl command

I use the below script to get the average response time for a specific website. It works just fine. I just need to convert some of the values in the response, like time_total, which I want to view in milliseconds, and size_download, which I want in KB. At the end of the command, where I print the average response time, I also want it in milliseconds. Any help is really appreciated.
for ((i=1;i<=50;i++)); do curl -w 'Return Code: %{http_code}; Bytes Received: %{size_download}; Response Time: %{time_total}\n' "https://www.google.com" -m 2 -o /dev/null -s; done |tee /dev/tty|awk '{ sum += $NF; n++ } END { If (n > 0); print "Average Response Time =",sum /n;}'
Three answers here...
As this question is tagged shell (not bash), here are three different ways of doing this: awk + bash, shell + bc, or bash alone.
1. Using awk to process output and compute averages
Since doing forks like curl ... | awk ... repetitively is a resource killer, I prefer to run awk only once, doing the whole output-formatting job inside awk:
iter=10
for ((i=iter;i--;));do
curl -w '%{http_code} %{size_download} %{time_total}\n' \
"https://www.google.com" -m 2 -o /dev/null -s
sleep .02
done | awk -v iter=$iter '
BEGIN {
ttim=0;tsiz=0;
printf " %-3s %8s %11s %10s\n","Res","Size","Time","Rate"
};
{
printf " %-3s %7.2fK %9.2fms %7.2fK/s\n", \
$1, $2/1024,$3*1000,$2/$3/1024;
tsiz+=$2;ttim+=$3;
};
END {
printf "Tot %7.2fK %9.2fms\nAvg %7.2fK %9.2fms %7.2fK/s\n", \
tsiz/1024,ttim*1000, tsiz/iter/1024,ttim/iter*1000,tsiz/ttim/1024;
}'
May produce:
Res Size Time Rate
200 14.61K 128.48ms 113.71K/s
200 14.75K 131.06ms 112.52K/s
200 14.73K 131.71ms 111.85K/s
200 14.72K 130.24ms 113.05K/s
200 14.66K 134.68ms 108.86K/s
200 14.69K 131.39ms 111.79K/s
200 14.63K 131.15ms 111.53K/s
200 14.70K 126.26ms 116.42K/s
200 14.71K 129.08ms 113.98K/s
200 14.68K 131.23ms 111.86K/s
Tot 146.88K 1305.28ms
Avg 14.69K 130.53ms 112.53K/s
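The aggregation in the END block can be exercised with canned (code size time) samples instead of live curl output; the numbers below are fabricated for reproducibility:

```shell
# Two fabricated samples: 15000 bytes in 0.1 s each
printf '%s\n' '200 15000 0.1' '200 15000 0.1' |
awk '{ tsiz += $2; ttim += $3 }
     END { printf "Tot %.2fK %.2fms Avg %.2fK/s\n",
                  tsiz/1024, ttim*1000, tsiz/ttim/1024 }'
# -> Tot 29.30K 200.00ms Avg 146.48K/s
```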
2. Working on averages using shell
As the shell doesn't work with floating-point numbers, you have to use bc or another subprocess to do these operations.
As doing forks like var=$(echo '3*4'|bc) is a resource killer, I prefer to run bc only once, as a background process. One advantage of this is that bc can store overall variables (total download size and total time here).
First part: Init some variables and run bc
Creating two fifos for the backgrounded bc and one fifo for curl, making parsing of curl's output easier.
Declaring some variables, and a numfmt function in bc for further use...
Note about numfmt: this function computes a human-readable representation of an integer value, outputting two values:
the octal value of one of the characters b, K, M, G, T or P, and
a floating-point number: the submitted value divided by the corresponding power of 1024.
The character can be printed from its octal value with printf '%b' \\$value under shell.
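That octal-to-character step can be sketched on its own (0113 octal is 75 decimal, the ASCII code for K):

```shell
# printf %b interprets backslash escapes, so \113 (octal) prints 'K'
v=113
printf '%b\n' "\\$v"
# -> K
```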
#!/bin/sh
target=${1:-https://www.google.com}
iter=${2:-10}
delay=${3:-.02}
tempdir=$(mktemp -d)
bcin="$tempdir/bcin" bcout="$tempdir/bcout" curlout="$tempdir/curlout"
mkfifo "$bcin" "$bcout" "$curlout"
exec 3<>"$bcin"; exec 4<>"$bcout"
cleanUp() { [ -e "$bcin" ] && rm "$bcin" "$bcout" "$curlout" && rmdir "$tempdir"
exit;}
trap cleanUp 0 1 2 3 6 15
bc -l <&3 >&4 &
mBc() { echo >&3 "$2"; read -r $1 <&4 ;}
cat >&3 <<EOInitBc
unit[0]=142;unit[1]=113;unit[2]=115;unit[3]=107;unit[4]=124;unit[5]=120
define void numfmt (s) {
if (s==0) { print "0,0\n"; return;};
p=l(s)/l(1024);
scale=0;
p=p/1;
scale=20;
print unit[p]," ",s/1024^p,"\n";
};
tsiz=0;ttim=0;
EOInitBc
# read variables from curl
checkHttpRes() {
curl -sm2 -w'%{http_code} %{size_download} %{time_total}' -o/dev/null \
"$1" >"$curlout" &
read -r cod siz tim <"$curlout"
mBc rate "tsiz+=$siz;ttim+=$tim;$siz/$tim;"
mBc mtim "$tim*1000"
mBc 'hunit hsz' "numfmt($siz)"
mBc 'hurt hrat' "numfmt($rate)"
printf ' %-3s %7.2f%b %9.2fms %7.2f%b/s\n' "$cod" "$hsz" "\\$hunit" \
"$mtim" "$hrat" "\\$hurt"
}
# Last part: main routine
printf ' %-3s %8s %11s %10s\n' Res Size Time Rate
i="$iter"
while [ "$i" -gt 0 ];do
checkHttpRes "$target"
sleep "$delay"
i=$((i-1))
done
mBc 'hutsz htsz' "numfmt(tsiz)"
mBc 'huasz hasz' "numfmt(tsiz/$iter)"
mBc ttim "1000*ttim"
mBc atim "1000*ttim/$iter"
mBc 'huart hart' "numfmt(tsiz/ttim)"
printf 'Tot %7.2f%b %9.2fms\nAvg %7.2f%b %9.2fms %7.2f%b/s\n' \
"$htsz" "\\$hutsz" "$ttim" "$hasz" "\\$huasz" "$atim" "$hart" "\\$huart"
Run sample:
$ ./curlStat.sh http://www.google.com 10 .1
Res Size Time Rate
200 14.76K 141.84ms 104.09K/s
200 14.65K 136.21ms 107.53K/s
200 14.61K 136.74ms 106.86K/s
200 14.67K 138.08ms 106.26K/s
200 14.70K 130.56ms 112.56K/s
200 14.65K 135.72ms 107.97K/s
200 14.68K 135.28ms 108.53K/s
200 14.64K 134.20ms 109.07K/s
200 14.70K 136.32ms 107.82K/s
200 14.71K 136.19ms 108.00K/s
Tot 146.77K 1361.14ms
Avg 14.68K 136.11ms 107.83K/s
3. Last: bash
This question is tagged sh, but for ((i=... syntax is a bashism. So here is a compact pure bash version of this:
#!/bin/bash
target=${1:-https://www.google.com} iter=${2:-10} delay=${3:-.02}
tsz(){ local i=$(($1>=1<<50?5:$1>=1<<40?4:$1>=1<<30?3:$1>=1<<20?2:$1>1023?1:0
)) a=(b K M G T P);((i>4?i+=-2:0))&&a=(${a[@]:2})&&set -- $(($1>>20)) $2;local\
r=00$((1000*$1/(1024**i)));printf -v $2 %.2f%s ${r::-3}.${r: -3} ${a[i]};}
declare -i ttim=0 tsiz=0
checkHttpRes() { local code size time ustim hsz hrat
read -r code size time < <(
curl -sm2 -w'%{http_code} %{size_download} %{time_total}' -o/dev/null "$1"
)
printf -v ustim '%.6f' "$time"
ustim=$((10#${ustim/.}))
tsz "$size" hsz
tsz $(( size*10**7/ustim/10 )) hrat
ttim+=ustim tsiz+=size ustim=00$ustim
printf ' %-3s %8s %9.2fms %8s/s\n' "$code" "$hsz" \
"${ustim::-3}.${ustim: -3}" "$hrat"
}
printf ' %-3s %8s %11s %10s\n' Res Size Time Rate
for ((i=iter;i--;)) ;do
checkHttpRes "$target"
sleep "$delay"
done
tsz $tsiz htsz
ustim=00$ttim uatim=00$((ttim/iter))
tsz $((tsiz/iter)) hasz
tsz $(( tsiz*10**7/ttim/10 )) hart
printf 'Tot %8s %9.2fms\nAvg %8s %9.2fms %8s/s\n' "$htsz" \
"${ustim::-3}.${ustim: -3}" "$hasz" "${uatim::-3}.${uatim: -3}" "$hart"
Without any forks to bc or awk, all operations are done in pseudo floating point using shifted integers.
This will produce the same result:
Res Size Time Rate
200 14.68K 132.79ms 110.55K/s
200 14.68K 135.59ms 108.24K/s
200 14.68K 132.31ms 110.99K/s
200 14.75K 141.66ms 104.15K/s
200 14.66K 139.90ms 104.79K/s
200 14.71K 140.07ms 105.00K/s
200 14.68K 142.74ms 102.86K/s
200 14.64K 133.42ms 109.71K/s
200 14.72K 135.62ms 108.56K/s
200 14.71K 139.16ms 105.72K/s
Tot 146.92K 1373.25ms
Avg 14.69K 137.32ms 106.98K/s
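The shifted-integer "pseudo float" trick above can be sketched in isolation: scale by 1000 in integer arithmetic, then splice the decimal point back in for display (bash 4.2+ is assumed for the negative-offset ${r::-3} expansion; 15034 is a made-up byte count):

```shell
# 15034 bytes -> KiB with 3 implied decimals, all in integer math
size=15034
r=00$((1000 * size / 1024))      # "0014681" = 14.681 KiB, times 1000
printf '%.2fK\n' "${r::-3}.${r: -3}"
# -> 14.68K
```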
You can just pipe the curl output through awk and format it as you want like this:
curl -w 'Return Code: %{http_code}; Bytes Received: %{size_download}; Response Time: %{time_total}\n' "https://www.google.com" -m 2 -o /dev/null -s | awk '{printf "Return Code: %d; KiB Received: %f; Response Time(ms): %f\n", $3, $6/1024, $9*1000}'
So the oneliner is the following:
for ((i=1;i<=50;i++)); do curl -w 'Return Code: %{http_code}; Bytes Received: %{size_download}; Response Time: %{time_total}\n' "https://www.google.com" -m 2 -o /dev/null -s | awk '{printf "Return Code: %d; KiB Received: %f; Response Time(ms): %f\n", $3, $6/1024, $9*1000}'; done | tee /dev/tty | awk '{ sum += $NF; n++ } END { if (n > 0) print "Average Response Time =", sum/n }'
You can also format the numbers as you want by putting, for example, %.2f for 2-decimal precision or %d for integer...
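The reformatting awk can be exercised with a canned curl-style line instead of a live request (the values below are fabricated):

```shell
# A fabricated curl -w output line stands in for the live request
echo 'Return Code: 200; Bytes Received: 2048; Response Time: 0.125' |
awk '{printf "Return Code: %d; KiB Received: %f; Response Time(ms): %f\n", $3, $6/1024, $9*1000}'
# -> Return Code: 200; KiB Received: 2.000000; Response Time(ms): 125.000000
```

Note that awk's numeric conversion silently drops the trailing semicolons on $3 and $6, which is why no extra cleanup is needed.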

awk modify printf output

I need to parse the output of ldapsearch and only keep the attributes with numeric values.
Also, I need to transform the output to make it usable in Prometheus monitoring.
This is the output of a raw ldapsearch:
# 389, snmp, monitor
dn: cn=389,cn=snmp,cn=monitor
cn: 389
objectClass: top
objectClass: extensibleObject
anonymousbinds: 9
unauthbinds: 9
simpleauthbinds: 122256
strongauthbinds: 0
bindsecurityerrors: 27869
inops: 24501385
readops: 17933653
compareops: 24852
addentryops: 14205
removeentryops: 0
modifyentryops: 378287
modifyrdnops: 0
listops: 0
searchops: 19194674
onelevelsearchops: 117
wholesubtreesearchops: 1260904
referrals: 0
chainings: 0
securityerrors: 2343
errors: 4694375
connections: 1075
connectionseq: 4720927
bytesrecv: 1608469180
bytessent: -424079608
entriesreturned: 19299393
referralsreturned: 0
I execute this query in order to remove the fields that are not numerical, and also the dn/cn fields even if they contain numbers, e.g. cn: 389.
${LDAPSEARCH} -LLL -H ${LDAP_URI} -x -D "${BINDDN}" -w ${LDAP_PASSWD} -b "${cn}" -s base | sed '/^cn\|^dn/d' | awk -F: '{ if ( $1 != "connection" && $2 ~ /[[:digit:]$]/) printf "dsee_%s\n", $1 $2}'
But I need to modify the printf so that it prints the fields like this:
dsee_modifyrdnops{node="vm1",cn="389"} 0
dsee_listops{node="vm1",cn="1389"} 0
dsee_strongauthbinds{node="vm1",cn="389"} 0
dsee_readops{node="vm1",cn="389"} 37194588
I have difficulties adding the curly brackets and quotes to the printf command.
What would be the best way to improve the awk/sed command and modify the printf output?
In plain bash:
#!/bin/bash
node=vm1
while IFS=: read -r key val; do
[[ $key = cn ]] && { cn=${val# }; continue; }
if [[ $val =~ ^\ -?[0-9]+(\.[0-9]*)?$ ]]; then
printf 'dsee_%s{node="%s",cn="%s"}%s\n' "$key" "$node" "$cn" "$val"
fi
done < <( your_raw_ldapsearch_command )
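To try the loop without a live ldapsearch, the sample lines can be fed through a here-document (trimmed to three lines here; the integer-only regex is a simplification of the one above):

```shell
# Same parsing logic, fed with canned input instead of ldapsearch
node=vm1
while IFS=: read -r key val; do
    [[ $key = cn ]] && { cn=${val# }; continue; }
    if [[ $val =~ ^\ -?[0-9]+$ ]]; then
        printf 'dsee_%s{node="%s",cn="%s"}%s\n' "$key" "$node" "$cn" "$val"
    fi
done <<'EOF'
cn: 389
objectClass: top
anonymousbinds: 9
EOF
# -> dsee_anonymousbinds{node="vm1",cn="389"} 9
```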
something along these lines:
$ cat tst.awk
BEGIN {
FS=":[[:blank:]]*"
qq="\""
node="vm1"
}
$1=="cn" {cn=$2}
$1!~/^((cn|dn)$|connection)/ && $2~/^[[:digit:]]+$/ {
printf("dsee_%s{node=%s%s%s,cn=%s%s%s} %d\n", $1, qq, node, qq, qq, cn, qq, $2)
}
$ awk -f tst.awk myFile
dsee_anonymousbinds{node="vm1",cn="389"} 9
dsee_unauthbinds{node="vm1",cn="389"} 9
dsee_simpleauthbinds{node="vm1",cn="389"} 122256
dsee_strongauthbinds{node="vm1",cn="389"} 0
dsee_bindsecurityerrors{node="vm1",cn="389"} 27869
dsee_inops{node="vm1",cn="389"} 24501385
dsee_readops{node="vm1",cn="389"} 17933653
dsee_compareops{node="vm1",cn="389"} 24852
dsee_addentryops{node="vm1",cn="389"} 14205
dsee_removeentryops{node="vm1",cn="389"} 0
dsee_modifyentryops{node="vm1",cn="389"} 378287
dsee_modifyrdnops{node="vm1",cn="389"} 0
dsee_listops{node="vm1",cn="389"} 0
dsee_searchops{node="vm1",cn="389"} 19194674
dsee_onelevelsearchops{node="vm1",cn="389"} 117
dsee_wholesubtreesearchops{node="vm1",cn="389"} 1260904
dsee_referrals{node="vm1",cn="389"} 0
dsee_chainings{node="vm1",cn="389"} 0
dsee_securityerrors{node="vm1",cn="389"} 2343
dsee_errors{node="vm1",cn="389"} 4694375
dsee_bytesrecv{node="vm1",cn="389"} 1608469180
dsee_entriesreturned{node="vm1",cn="389"} 19299393
dsee_referralsreturned{node="vm1",cn="389"} 0

Fill the missing dates using awk

I have some missing dates in a file. e.g.
$cat ifile.txt
20060805
20060807
20060808
20060809
20060810
20060813
20060815
20060829
20060901
20060903
20060904
20060905
20070712
20070713
20070716
20070717
The dates are in the format YYYYMMDD. My intention is to fill in the missing dates between consecutive dates when the gap is at most 5 days, e.g.:
20060805
20060806 ---- This was missed
20060807
20060808
20060809
20060810
20060811 ----- This was missed
20060812 ----- This was missed
20060813
20060814 ----- This was missed
20060815
20060829
20060830 ------ This was missed
20060831 ------ This was missed
20060901
20060902 ------ This was missed
20060903
20060904
20060905
20070712
20070713
20070714 ----- This was missed
20070715 ----- This was missed
20070716
20070717
Other dates are not needed where there is a gap of more than 5 days. For example, I don't need to fill the dates between 20060815 and 20060829, because the gap between them is more than 5 days.
I am doing it in the following way, but I don't get anything.
#!/bin/sh
awk BEGIN'{
a[NR]=$1
} {
for(i=1; i<NR; i++)
if ((a[NR+1]-a[NR]) <= 5)
for (j=1; j<(a[NR+1]-a[NR]); j++)
print a[j]
}' ifile.txt
Desired output:
20060805
20060806
20060807
20060808
20060809
20060810
20060811
20060812
20060813
20060814
20060815
20060829
20060830
20060831
20060901
20060902
20060903
20060904
20060905
20070712
20070713
20070714
20070715
20070716
20070717
Could you please try the following, written and tested with the shown samples in GNU awk.
awk '
FNR==1{
print
prev=mktime(substr($0,1,4)" "substr($0,5,2)" "substr($0,7,2) " 00 00 00")
next
}
{
found=i=diff=""
curr_time=mktime(substr($0,1,4)" "substr($0,5,2)" "substr($0,7,2) " 00 00 00")
diff=(curr_time-prev)/86400
if(diff>1 && diff<=5){
while(++i<=diff){ print strftime("%Y%m%d", prev+86400*i) }
found=1
}
prev=mktime(substr($0,1,4)" "substr($0,5,2)" "substr($0,7,2) " 00 00 00")
}
!found
' Input_file
The following seems to work:
stringtodate() {
echo "${1:0:4}-${1:4:2}-${1:6:2} 12:00:00"
}
datetoseconds() {
LC_ALL=C date -d "$(stringtodate "$1")" +%s
}
secondstodate() {
LC_ALL=C date -d "#$1" +%Y%m%d
}
outputdatesbetween() {
local start=$1
local stop=$2
for ((i = start; i < stop; i += 3600 * 24)); do
secondstodate "$i"
done
}
prev=
while IFS= read -r line; do
now=$(datetoseconds "$line")
if [[ -n "$prev" ]] &&
((
now - prev > 3600 * 24 &&
now - prev < 3600 * 24 * 5
))
then
outputdatesbetween "$((prev + 3600 * 24))" "$now"
fi
echo "$line"
prev="$now"
done < ifile.txt
Tested on repl
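The date stepping can also be sketched with GNU date's relative-date arithmetic; fill_between is a hypothetical helper that prints the dates strictly between its two YYYYMMDD arguments:

```shell
# Enumerate the missing dates between two stamps with GNU date
fill_between() {
    local d=$1
    d=$(date -d "${d:0:4}-${d:4:2}-${d:6:2} + 1 day" +%Y%m%d)
    while [ "$d" != "$2" ]; do
        echo "$d"
        d=$(date -d "${d:0:4}-${d:4:2}-${d:6:2} + 1 day" +%Y%m%d)
    done
}
fill_between 20060813 20060815    # prints only 20060814
```

The gap-size check (only fill when the gap is at most 5 days) would still have to be done by the caller, as in the loop above.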
Here is a quick GNU awk script. We use GNU awk to make use of the time-functions mktime and strftime:
awk -v n=5 'BEGIN{FIELDWIDTHS="4 2 2"}
{t=mktime($1 " " $2 " " $3 " 0 0 0",1) }
(t-p < n*86400) { for(i=p+86400;i<t;i+=86400) print strftime("%Y%m%d",i,1) }
{print; p=t}' file
Using mktime we convert the time into the total seconds since 1970. The function strftime converts it back to the desired format. Be aware that we enable the UTC-flag in both functions to ensure that we do not end up with surprises around Daylight-Saving-Time. Furthermore, since we already make use of GNU awk, we can further use the FIELDWIDTHS to determine the field lengths.
note: If your awk does not support the UTC-flag in mktime and strftime, you can run the following:
TZ=UTC awk -v n=5 'BEGIN{FIELDWIDTHS="4 2 2"}
{t=mktime($1 " " $2 " " $3 " 0 0 0") }
(t-p < n*86400) { for(i=p+86400;i<t;i+=86400) print strftime("%Y%m%d",i) }
{print; p=t}' file

Bash script, command - output to array, then print to file

I need advice on how to achieve this output:
myoutputfile.txt
Tom Hagen 1892
State: Canada
Hank Moody 1555
State: Cuba
J.Lo 156
State: France
output of mycommand:
/usr/bin/mycommand
Tom Hagen
1892
Canada
Hank Moody
1555
Cuba
J.Lo
156
France
I'm trying to achieve this with the following shell script:
IFS=$'\r\n' GLOBIGNORE='*' :; names=( $(/usr/bin/mycommand) )
for name in ${names[@]}
do
#echo $name
echo ${name[0]}
#echo ${name:0}
done
Thanks
Assuming you can always rely on the command to output groups of 3 lines, one option might be
/usr/bin/mycommand |
while read name;
      read year;
      read state; do
    echo "$name $year"
    echo "State: $state"
done
An array isn't really necessary here.
One improvement could be to exit the loop if you don't get all three required lines:
while read name && read year && read state; do
# Guaranteed that name, year, and state are all set
...
done
An easy one-liner (not tuned for performance):
/usr/bin/mycommand | xargs -d '\n' -L3 printf "%s %s\nState: %s\n"
It reads 3 lines at a time from the pipe and then passes them to a new instance of printf which is used to format the output.
If you have whitespace at the beginning (it looks like that in your example output), you may need to use something like this:
/usr/bin/mycommand | sed -e 's/^\s*//g' | xargs -d '\n' -L3 printf "%s %s\nState: %s\n"
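With canned input, the grouping is easy to see (GNU xargs is assumed for -d):

```shell
# Three lines per record -> one printf call per record
printf '%s\n' 'Tom Hagen' 1892 Canada 'Hank Moody' 1555 Cuba |
xargs -d '\n' -L3 printf '%s %s\nState: %s\n'
```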
#!/bin/bash
COUNTER=0
/usr/bin/mycommand | while read LINE
do
    if [ $COUNTER = 0 ]; then
        NAME="$LINE"
        COUNTER=$((COUNTER + 1))
    elif [ $COUNTER = 1 ]; then
        YEAR="$LINE"
        COUNTER=$((COUNTER + 1))
    elif [ $COUNTER = 2 ]; then
        STATE="$LINE"
        COUNTER=0
        echo "$NAME $YEAR"
        echo "State: $STATE"
    fi
done
chepner's pure bash solution is simple and elegant, but slow with large input files (loops in bash are slow).
Michael Jaros' solution is even simpler, if you have GNU xargs (verify with xargs --version), but also does not perform well with large input files (external utility printf is called once for every 3 input lines).
If performance matters, try the following awk solution:
/usr/bin/mycommand | awk '
{ ORS = (NR % 3 == 1 ? " " : "\n")
gsub("^[[:blank:]]+|[[:blank:]]*\r?$", "") }
{ print (NR % 3 == 0 ? "State: " : "") $0 }
' > myoutputfile.txt
NR % 3 identifies the position of each input line within its respective group of 3 consecutive lines: it returns 1 for the 1st line, 2 for the 2nd, and 0(!) for the 3rd.
{ ORS = (NR % 3 == 1 ? " " : "\n") determines ORS, the output-record separator, based on that index: a space for line 1, and a newline for lines 2 and 3; the space ensures that line 2 is appended to line 1 with a space when using print.
gsub("^[[:blank:]]+|[[:blank:]]*\r?$", "") strips leading and trailing whitespace from the line - including, if present, a trailing \r, which your input seems to have.
{ print (NR % 3 == 0 ? "State: " : "") $0 } prints the trimmed input line, prefixed by "State: " only for every 3rd input line, and implicitly followed by ORS (due to use of print).
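The ORS trick can be tried with inline sample data (the whitespace trimming is omitted here for brevity):

```shell
# ORS switches between space and newline depending on position in the group
printf '%s\n' 'Tom Hagen' 1892 Canada 'Hank Moody' 1555 Cuba |
awk '{ ORS = (NR % 3 == 1 ? " " : "\n")
       print (NR % 3 == 0 ? "State: " : "") $0 }'
```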

How do I iterate through each line of a command's output in bash?

I have a script that reads from /proc/stat and calculates CPU usage. There are three relevant lines in /proc/stat:
cpu 1312092 24 395204 12582958 77712 456 3890 0 0 0
cpu0 617029 12 204802 8341965 62291 443 2718 0 0 0
cpu1 695063 12 190402 4240992 15420 12 1172 0 0 0
Currently, my script only reads the first line and calculates usage from that:
cpu=($( cat /proc/stat | grep '^cpu[^0-9] ' ))
unset cpu[0]
idle=${cpu[4]}
total=0
for value in "${cpu[@]}"; do
let total=$(( total+value ))
done
let usage=$(( (1000*(total-idle)/total+5)/10 ))
echo "$usage%"
This works as expected, because the script only parses this line:
cpu 1312092 24 395204 12582958 77712 456 3890 0 0 0
It's easy enough to get only the lines starting with cpu0 and cpu1
cpu=$( cat /proc/stat | grep '^cpu[0-9] ' )
but I don't know how to iterate over each line and apply this same process. I've tried resetting the internal field separator inside a subshell, like this:
cpus=$( cat /proc/stat | grep '^cpu[0-9] ' )
(
IFS=$'\n'
for cpu in $cpus; do
cpu=($cpu)
unset cpu[0]
idle=${cpu[4]}
total=0
for value in "${cpu[@]}"; do
let total=$(( total+value ))
done
let usage=$(( (1000*(total-idle)/total+5)/10 ))
echo -n "$usage%"
done
)
but this gets me a syntax error
line 18: (1000*(total-idle)/total+5)/10 : division by 0 (error token is "+5)/10 ")
If I echo the cpu variable in the loop, it looks like it's separating the lines properly. I looked at this thread and I think I'm assigning the cpu variable to an array properly, but is there another error I'm not seeing?
I put my script into "whats wrong with my script" and it doesn't show me any errors apart from a warning about using cat within $(), so I'm stumped.
Change this line in the middle of your loop:
IFS=' ' cpu=($cpu)
You need this because outside of your loop you're setting IFS=$'\n', but with that setting, cpu=($cpu) won't do what you expect.
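The effect is easy to demonstrate by counting the array elements produced under each IFS:

```shell
# Word-splitting of an unquoted $line follows the current IFS
line='cpu0 1 2 3'
IFS=$'\n'; arr=($line); n1=${#arr[@]}    # 1 element: no newline inside $line
IFS=' ';   arr=($line); n2=${#arr[@]}    # 4 elements: split on spaces
echo "$n1 $n2"                           # -> 1 4
```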
Btw, I would write your script like this:
#!/bin/bash -e
grep ^cpu /proc/stat | while IFS=$'\n' read cpu; do
cpu=($cpu)
name=${cpu[0]}
unset cpu[0]
idle=${cpu[4]}
total=0
for value in "${cpu[#]}"; do
((total+=value))
done
((usage=(1000 * (total - idle) / total + 5) / 10))
echo "$name $usage%"
done
The equivalent using awk:
awk '/^cpu/ { total=0; idle=$5; for (i=2; i<=NF; ++i) { total += $i }; print $1, int((1000 * (total - idle) / total + 5) / 10) }' < /proc/stat
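Applied to the aggregate line from the question, that formula gives 12% (the arithmetic just restates the question's own calculation):

```shell
# The usage formula on one canned /proc/stat line
echo 'cpu 1312092 24 395204 12582958 77712 456 3890 0 0 0' |
awk '{ total = 0; idle = $5
       for (i = 2; i <= NF; ++i) total += $i
       print $1, int((1000 * (total - idle) / total + 5) / 10) "%" }'
# -> cpu 12%
```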
Because the OP asked, an awk program.
awk '
/cpu[0-9] .*/ {
total = 0
idle = $5
for(i = 2; i <= NF; i++) { total += $i; }
printf("%s: %f%%\n", $1, 100*(total-idle)/total);
}
' /proc/stat
The /cpu[0-9] .*/ means "execute for every line matching this expression".
The variables like $1 do what you'd expect, but the 1st field has index 1, not 0: $0 means the whole line in awk.
