My result is always in bytes (e.g. 4521564 b), and I want to print it in a human-readable unit:
if the result is > 1 YB - print result in YB
if the result is > 1 ZB and < 1 YB - print result in ZB
if the result is > 1 EB and < 1 ZB - print result in EB
if the result is > 1 PB and < 1 EB - print result in PB
if the result is > 1 TB and < 1 PB - print result in TB
if the result is > 1 GB and < 1 TB - print result in GB
if the result is > 1 MB and < 1 GB - print result in MB
if the result is > 1 KB and < 1 MB - print result in KB
I don't know how to calculate this in bash. Is there a way?
Using awk:
f.awk contents:
$ cat f.awk
function calc(num, i)
{
    # i counts how many times num has been divided by 1024
    if (num >= 1024)
        calc(num / 1024, i + 1);
    else
        printf "%.2f %s\n", num, a[i + 1];
}
BEGIN {
    # unit names, indexed from 1 ("b" is a[1])
    split("b KB MB GB TB PB EB ZB YB", a);
    calc(val, 0)
}
Run the above awk program like this:
$ awk -v val=4521564 -f f.awk
4.31 MB
The logic is to keep dividing the number by 1024 until it becomes less than 1024, incrementing a counter on every division. The counter is then mapped to the unit array to pick the appropriate unit. The function calc is called recursively.
E.g., input 1000 bytes: since the number is less than 1024, no division is done and the counter stays 0, which maps to the bytes unit.
Input 2050: the number is divided by 1024 once and the counter is incremented to 1. After the division the number is less than 1024, so it is printed with the unit the counter points to, in this case KB.
The shell doesn't do floating point without a helper, so this shell function rounds down (integer division):
byteme(){
    v=$1
    i=0
    s=" KMGTPEZY"                  # index 0 is a space: plain bytes
    while [ "$v" -ge 1024 ]; do    # -ge so that exactly 1024 becomes 1Kb
        i=$((i+1))
        v=$((v/1024))
    done
    echo "$v${s:$i:1}b"
}
byteme 1234567890
1Gb
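For reference, if GNU coreutils is available, its numfmt tool does the same conversion directly; note that it rounds where the code above truncates, so it should print 4.4M rather than 4.31 MB:
$ numfmt --to=iec 4521564
4.4M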
I have about 54,000 packets to analyze and I am trying to determine the average number of packets per second (as well as the min and max number of packets during a given second).
My input file is a single column of the packet times (see sample below):
0.004
0.015
0.030
0.050
..
..
1999.99
I've used awk to determine the timing deltas but can't figure out a way to parse out the chunks of time to get an output of:
0-1s = 10 packets
1-2s = 15 packets
etc
Here is an example of how you can use awk to get the desired output.
Suppose your original input file is sample.txt. First, reverse-sort it (sort -nr); then feed awk the sorted stream along with the time variable via awk's -v argument. Perform the tests inside awk, using next to skip lines and exit to quit the awk script when needed.
#!/bin/bash
#
for i in 0 1 2 3
do
    sort -nr sample.txt | awk -v time=$i 'BEGIN{number=0}
    {
        if ($1 >= (time+1)) { next }               # above this bucket: skip
        else if ($1 >= time && $1 < (time+1)) {    # inside [time, time+1): count
            number += 1
        } else {                                   # first value below the bucket: report and stop
            printf "[ %d - %d [ : %d records\n", time, time+1, number
            exit
        }
    }'
done
Here's the sample file:
0.1
0.2
0.8
.
.
0.94
.
.
1.5
1.9
.
3.0
3.6
Here's the program's output:
[ 1 - 2 [ : 5 records
[ 2 - 3 [ : 8 records
[ 3 - 4 [ : 2 records
Hope this helps!
Would you please try the following:
With bash:
max=0
while read -r line; do
    i=${line%.*}                 # extract the integer part
    a[$i]=$(( ${a[$i]} + 1 ))    # increment the array element
    (( i > max )) && max=$i      # update the maximum index
done < sample.txt

# report the summary
for (( i=0; i<=max; i++ )); do
    printf "%d-%ds = %d packets\n" "$i" $(( i+1 )) "${a[$i]}"
done
With AWK:
awk '
{
    i = int($0)
    a[i]++
    if (i > max) max = i
}
END {
    for (i=0; i<=max; i++)
        printf("%d-%ds = %d packets\n", i, i+1, a[i])
}' sample.txt
sample.txt:
0.185
0.274
0.802
1.204
1.375
1.636
1.700
1.774
1.963
2.044
2.112
2.236
2.273
2.642
2.882
3.000
3.141
5.023
5.082
Output:
0-1s = 3 packets
1-2s = 6 packets
2-3s = 6 packets
3-4s = 2 packets
4-5s = 0 packets
5-6s = 2 packets
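The question also asks for the average, min, and max packets per second; here is a sketch extending the END block of the AWK version above (it assumes seconds with no packets count as 0):
awk '
{
    i = int($0)
    a[i]++
    if (i > max) max = i
}
END {
    min = -1
    for (i=0; i<=max; i++) {
        c = a[i] + 0                 # seconds with no packets count as 0
        total += c
        if (c > maxc) maxc = c
        if (min < 0 || c < min) min = c
    }
    printf("avg = %.2f, min = %d, max = %d packets/s\n", total/(max+1), min, maxc)
}' sample.txt
With the sample above this prints: avg = 3.17, min = 0, max = 6 packets/s.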
Hope this helps.
I have a file that I want to import into a database table, one piece per row. For the import, I need to indicate for each row the offset (first byte) and the length (number of bytes).
I have the following files:
*line_numbers.txt* -> Each row contains the line number of the last row of a record in *plans.txt*.
*plans.txt* -> All the information required for all the rows.
I have the following code:
# Starting line number of the record
sLine=0
# Starting byte value of the record
offSet=0
while read line
do
    endByte=`awk -v fline=${sLine} -v lline=${line} \
        '{if (NR > fline && NR < lline) \
            sum += length($0); } \
        END {print sum}' plans.txt`
    echo "\"plans.txt.${offSet}.${endByte}/\"" >> lobs.in
    sLine=$((line+1))
    offSet=$((endByte+offSet))
done < line_numbers.txt
This code will write in the file lobs.in something similar to:
"plans.txt.0.504/"
"plans.txt.505.480/"
"plans.txt.984.480/"
"plans.txt.1464.1159/"
"plans.txt.2623.515/"
This means, for example, that the first record starts at byte 0 and continues for the next 504 bytes. The next starts at byte 505 and continues for the next 480 bytes.
I still have to run more tests, but it seems to be working.
My problem is that it is very slow for the volume I need to process.
Do you have any performance tips?
I looked into moving the loop into awk, but I need two input files and I don't know how to process them without the while loop.
Thank you!
Doing this all in awk would be much faster.
Suppose you have:
$ cat lines.txt
100
200
300
360
10000
50000
And:
$ awk -v maxl=50000 'BEGIN{for (i=1;i<=maxl;i++) printf "Line %d\n", i}' >data.txt
(So you have Line 1\nLine 2\n...Line maxl in the file data.txt)
You would do something like:
awk 'FNR==NR { lines[FNR]=$1; next }    # first file: last line number of each record
     { data[FNR]=length($0) }           # second file: length of each line
     END { sl=1                         # starting line of the current record
         for (i=1; i in lines; i++) {
             bc=0
             for (j=sl; j<=lines[i]; j++) {
                 bc+=data[j]
             }
             printf "line %d to %d is %d bytes\n", sl, j-1, bc
             sl=lines[i]+1
         }
     }' lines.txt data.txt
line 1 to 100 is 1392 bytes
line 101 to 200 is 1500 bytes
line 201 to 300 is 1500 bytes
line 301 to 360 is 900 bytes
line 361 to 10000 is 153602 bytes
line 10001 to 50000 is 680000 bytes
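Applied to your actual files, a sketch along the same lines that writes the lobs.in lines directly in one pass (it assumes each record's byte length should include the newline of every line in it; drop the + 1 if it should not):
awk 'FNR==NR { last[++n]=$1; next }               # line_numbers.txt: last line of each record
     { cum[FNR] = cum[FNR-1] + length($0) + 1 }   # plans.txt: cumulative bytes, + 1 per newline
     END {
         start = 0
         for (i=1; i<=n; i++) {
             len = cum[last[i]] - start
             printf "\"plans.txt.%d.%d/\"\n", start, len
             start += len
         }
     }' line_numbers.txt plans.txt > lobs.in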
A simple improvement: never redirect with >> inside a loop when you can redirect once with >> outside the loop. Worse:
while read line
do
    # .... stuff omitted ...
    echo "\"plans.txt.${offSet}.${endByte}/\"" >> lobs.in
    # ....
done < line_numbers.txt
Note how the only line in the loop that outputs anything is echo. Better:
while read line
do
    # .... stuff omitted ...
    echo "\"plans.txt.${offSet}.${endByte}/\""
    # ....
done < line_numbers.txt >> lobs.in
I'm trying to format a number in BASH. I'd like to replicate the byte/packet number output from iptables.
here are some examples:
258
591K
55273
37G
22244
2212
6127K
12M
114K
As you can see:
there is no thousands separator,
the field is a max of 5 characters wide,
each suffix is either: none, K, M, G, etc...
I've searched the documentation on printf but have been unable to find anything that can format a number this way. Does anyone know how to do this?
Thanks.
You could build custom formatting with awk, something like this:
awk 'BEGIN { u[0]=""; u[1]="K"; u[2]="M"; u[3]="G" }      # unit suffixes
     {
         n = $1; i = 0
         while (n > 1000) { i += 1; n = int(n/1000) }     # scale down by 1000 per suffix
         print n u[i]
     }'
Input sample :
258
591000
55273
37000000000
22244
2212
6127000
12000000
114000
Output :
258
591K
55K
37G
22K
2K
6M
12M
114K
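To also right-align the result in the 5-character field iptables uses, you could add a printf width to the same idea. A sketch (the helper name human and the %5s width are my choices, not taken from iptables itself):
human() {
    awk -v n="$1" 'BEGIN {
        split(",K,M,G,T", u, ",")                            # u[1]="", u[2]="K", ...
        i = 1
        while (n >= 1000 && i < 5) { n = int(n/1000); i++ }
        printf "%5s\n", n u[i]                               # right-align to 5 characters
    }'
}
human 37000000000    # prints "  37G"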
It has to be done programmatically, but it's not hard:
#!/bin/sh
humanFormat() {    # reads and rewrites the variable x
    if   [ "$x" -gt 1000000000 ]; then x=`expr "$x" / 1000000000`G
    elif [ "$x" -gt 1000000 ];    then x=`expr "$x" / 1000000`M
    elif [ "$x" -gt 1000 ];       then x=`expr "$x" / 1000`K
    fi
}
(edited to fix execution order)
I have a script that reads from /proc/stat and calculates CPU usage. There are three relevant lines in /proc/stat:
cpu 1312092 24 395204 12582958 77712 456 3890 0 0 0
cpu0 617029 12 204802 8341965 62291 443 2718 0 0 0
cpu1 695063 12 190402 4240992 15420 12 1172 0 0 0
Currently, my script only reads the first line and calculates usage from that:
cpu=($( cat /proc/stat | grep '^cpu[^0-9] ' ))
unset cpu[0]
idle=${cpu[4]}
total=0
for value in "${cpu[@]}"; do
    let total=$(( total+value ))
done
let usage=$(( (1000*(total-idle)/total+5)/10 ))
echo "$usage%"
This works as expected, because the script only parses this line:
cpu 1312092 24 395204 12582958 77712 456 3890 0 0 0
It's easy enough to get only the lines starting with cpu0 and cpu1
cpu=$( cat /proc/stat | grep '^cpu[0-9] ' )
but I don't know how to iterate over each line and apply this same process. I've tried resetting the internal field separator inside a subshell, like this:
cpus=$( cat /proc/stat | grep '^cpu[0-9] ' )
(
    IFS=$'\n'
    for cpu in $cpus; do
        cpu=($cpu)
        unset cpu[0]
        idle=${cpu[4]}
        total=0
        for value in "${cpu[@]}"; do
            let total=$(( total+value ))
        done
        let usage=$(( (1000*(total-idle)/total+5)/10 ))
        echo -n "$usage%"
    done
)
but this gets me a division-by-zero error:
line 18: (1000*(total-idle)/total+5)/10 : division by 0 (error token is "+5)/10 ")
If I echo the cpu variable in the loop it looks like it's separating the lines properly. I looked at this thread and I think I'm assigning the cpu variable to an array properly, but is there another error I'm not seeing?
I put my script into "what's wrong with my script" and it doesn't show me any errors apart from a warning about using cat within $(), so I'm stumped.
Change this line in the middle of your loop:
IFS=' ' cpu=($cpu)
You need this because outside of the loop you're setting IFS=$'\n', and with that setting cpu=($cpu) won't do what you expect.
Btw, I would write your script like this:
#!/bin/bash -e
grep ^cpu /proc/stat | while IFS=$'\n' read cpu; do
    cpu=($cpu)
    name=${cpu[0]}
    unset cpu[0]
    idle=${cpu[4]}
    total=0
    for value in "${cpu[@]}"; do
        ((total+=value))
    done
    ((usage=(1000 * (total - idle) / total + 5) / 10))
    echo "$name $usage%"
done
The equivalent using awk:
awk '/^cpu/ { total=0; idle=$5; for (i=2; i<=NF; ++i) { total += $i }; print $1, int((1000 * (total - idle) / total + 5) / 10) }' < /proc/stat
Because the OP asked for it, here is an awk program.
awk '
/cpu[0-9] .*/ {
    total = 0
    idle = $5
    for (i = 2; i <= NF; i++) { total += $i; }    # sum the numeric fields; $1 is the name
    printf("%s: %f%%\n", $1, 100*(total-idle)/total);
}
' /proc/stat
The /cpu[0-9] .*/ means "execute for every line matching this expression".
The variables like $1 do what you'd expect, but the 1st field has index 1, not 0: $0 means the whole line in awk.
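One caveat: the counters in /proc/stat are cumulative since boot, so a single read gives the average usage since boot. For usage over a recent interval you need two samples and their differences. A minimal sketch for the aggregate cpu line (it sums only the first four fields for brevity, so the percentage is approximate):
read -r _ u1 n1 s1 i1 _ < <(grep '^cpu ' /proc/stat)
sleep 1
read -r _ u2 n2 s2 i2 _ < <(grep '^cpu ' /proc/stat)
total=$(( (u2 + n2 + s2 + i2) - (u1 + n1 + s1 + i1) ))
idle=$(( i2 - i1 ))
echo "$(( (1000 * (total - idle) / total + 5) / 10 ))%"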
I have collected vmstat data in a file. It gives details about free, buffer and cache memory.
Since I'm interested in finding the memory usage, I need to do the following computation for each line of vmstat output: USED = TOTAL - (FREE + BUFFER + CACHE), where TOTAL is the total RAM and USED is the instantaneous memory usage.
TOTAL memory = 4042928 (4 GB)
Here is my code:
grep -v procs $1 | grep -v free | awk '{USED=4042928-$4-$5-$6;print $USED}' > test.dat
awk: program limit exceeded: maximum number of fields size=32767
FILENAME="-" FNR=1 NR=1
You should not be printing $USED for a start; the variable in awk is USED:
pax> vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 1402804 258392 2159316 0 0 54 79 197 479 1 2 93 3
pax> vmstat | egrep -v 'procs|free' | awk '{USED=4042928-$4-$5-$6;print USED}'
222780
What is most likely to be happening in your case is that you're using an awk with that limitation of about 32000 fields per record.
Because your fields 4, 5 and 6 are respectively 25172, 664 and 8520 (from one of your comments), your USED value becomes 4042928-25172-664-8520 or 4008572.
If you tried to print USED, that's what you'd get but, because you're trying to print $USED, it thinks you want $4008572 (field number 4008572) which is just a little bit beyond the 32000 range.
Interestingly, if you have a lot more free memory, you wouldn't get the error but you'd still get an erroneous value :-)
By the way, gawk doesn't have this limitation, it simply prints an empty field (see, for example, section 11.9 here).
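You can see the difference between USED and $USED with a trivial example: in awk, $n means the n-th field, so a variable after $ is used as a field number.
$ echo "a b c" | awk '{ n = 2; print n; print $n }'
2
b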
You can just do it with one awk command, skipping the two header lines with NR>2:
awk 'NR>2 {print 4042928-$4-$5-$6}' "$1" > test.dat
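If you'd rather not hard-code the total, a sketch that reads it from /proc/meminfo instead (MemTotal is reported in KB, the same unit vmstat's memory columns use by default):
total=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
awk -v total="$total" 'NR>2 {print total - $4 - $5 - $6}' "$1" > test.dat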