awk search and calculate standard deviation different results - bash

I am working to take the output of sar and calculate the standard deviation of a column. I can perform this successfully with a single column in a file. However when I calculate this same column in a file where I am stripping out the 'bad' lines like the title lines and avg lines, it is giving me a different value.
Here are the files I am performing this on:
/tmp/saru.tmp
# cat /tmp/saru.tmp
Linux 2.6.32-279.el6.x86_64 (progserver) 09/06/2012 _x86_64_ (4 CPU)
11:09:01 PM CPU %user %nice %system %iowait %steal %idle
11:10:01 PM all 0.01 0.00 0.05 0.01 0.00 99.93
11:11:01 PM all 0.01 0.00 0.06 0.00 0.00 99.92
11:12:01 PM all 0.01 0.00 0.05 0.01 0.00 99.93
11:13:01 PM all 0.01 0.00 0.05 0.00 0.00 99.93
11:14:01 PM all 0.01 0.00 0.04 0.00 0.00 99.95
11:15:01 PM all 0.01 0.00 0.06 0.00 0.00 99.92
11:16:01 PM all 0.01 0.00 2.64 0.01 0.01 97.33
11:17:01 PM all 0.02 0.00 21.96 0.00 0.08 77.94
11:18:01 PM all 0.02 0.00 21.99 0.00 0.08 77.91
11:19:01 PM all 0.02 0.00 22.10 0.00 0.09 77.78
11:20:01 PM all 0.02 0.00 22.06 0.00 0.09 77.83
11:21:01 PM all 0.02 0.00 22.10 0.03 0.11 77.75
11:22:01 PM all 0.01 0.00 21.94 0.00 0.09 77.95
11:23:01 PM all 0.02 0.00 22.15 0.00 0.10 77.73
11:24:01 PM all 0.02 0.00 22.02 0.00 0.09 77.87
11:25:01 PM all 0.02 0.00 22.03 0.00 0.13 77.82
11:26:01 PM all 0.02 0.00 21.96 0.01 0.14 77.86
11:27:01 PM all 0.02 0.00 22.00 0.00 0.09 77.89
11:28:01 PM all 0.02 0.00 21.91 0.00 0.09 77.98
11:29:01 PM all 0.03 0.00 22.02 0.02 0.08 77.85
11:30:01 PM all 0.14 0.00 22.23 0.01 0.13 77.48
11:31:01 PM all 0.02 0.00 22.26 0.00 0.16 77.56
11:32:01 PM all 0.03 0.00 22.04 0.01 0.10 77.83
Average: all 0.02 0.00 15.29 0.01 0.07 84.61
/tmp/sarustriped.tmp
# cat /tmp/sarustriped.tmp
0.05
0.06
0.05
0.05
0.04
0.06
2.64
21.96
21.99
22.10
22.06
22.10
21.94
22.15
22.02
22.03
21.96
22.00
21.91
22.02
22.23
22.26
22.04
The Calculation based on /tmp/saru.tmp:
# awk '$1~/^[01]/ && $6~/^[0-9]/ {sum+=$6; array[NR]=$6} END {for(x=1;x<=NR;x++){sumsq+=((array[x]-(sum/NR))**2);}print sqrt(sumsq/NR)}' /tmp/saru.tmp
10.7126
The Calculation based on /tmp/sarustriped.tmp ( the correct one )
# awk '{sum+=$1; array[NR]=$1} END {for(x=1;x<=NR;x++){sumsq+=((array[x]-(sum/NR))**2);}print sqrt(sumsq/NR)}' /tmp/sarustriped.tmp
9.96397
Could someone assist and tell me why these results are different and is there a way to get the corrected results with a single awk command. I am trying to do this for performance so not using a separate command like grep or another awk command is preferable.
Thanks!
UPDATE
so I tried this ...
awk '
$1~/^[01]/ && $6~/^[0-9]/ {
numrec += 1
sum += $6
array[numrec] = $6
}
END {
for(x=1; x<=numrec; x++)
sumsq += ((array[x]-(sum/numrec))^2)
print sqrt(sumsq/numrec)
}
' saru.tmp
and it works correctly for the sar -u output I was working with. I do not see why it would not work with other 'lists'. To make it short, trying to work with sar -r column 5. it is giving a wrong answer again... Output is giving 1.68891 but actual deviation is .107374... this is the same command that worked with sar -u..... if you need files I can provide. Just not sure how to make a new 'full' comment... so i just edited the old one...thanks!

I think the bug is that your first awk line (the one that operates on saru.tmp) does not ignore the invalid lines, so when you do math using NR your result depends on the number of skipped lines. When you remove all of the invalid/skipped lines the result is the same from both programs. So in the first command, you should use the number of valid lines rather than NR in your math.
How about this?
awk '
$1 ~ /^[01]/ && $6~/^[0-9]/ {
numrec += 1
sum += $6
array[numrec] = $6
}
END {
for(x=1; x<=numrec; x++)
sumsq += (array[x]-(sum/numrec))^2
print sqrt(sumsq/numrec)
}
' saru.tmp

For debugging problems like this, the simplest technique is to print out some basic data. You might print the number of items, and the sum of the values, and the sum of the squares of the values (or sum of the squares of the deviations from the mean). This will likely tell you what's different between the two runs. Sometimes, it might help to print out the values you're accumulating as you're accumulating the data. If I had to guess, I'd suspect you are counting inappropriate lines (blanks, or the decoration lines), so the counts are different (and maybe the sums too).
I have a couple of (non-standard) programs to do the calculations. Given the 23 relevant lines from the multi-column output in a file data, I ran:
$ colnum -c 6 data | pstats
# Count = 23
# Sum(x1) = 3.557200e+02
# Sum(x2) = 7.785051e+03
# Mean = 1.546609e+01
# Std Dev = 1.018790e+01
# Variance = 1.037934e+02
# Min = 4.000000e-02
# Max = 2.226000e+01
$
The standard deviation here is the sample standard deviation rather than the population standard deviation; the difference is dividing by (N-1) for the sample and N for the population.

Related

Why does if condition print zero value for non-zero condition?

I am unable to get the required output from bash code.
I have a text file:
1 0.00 0.00
2 0.00 0.00
3 0.00 0.08
4 0.00 0.00
5 0.04 0.00
6 0.00 0.00
7 -3.00 0.00
8 0.00 0.00
The required output should be only non-zero values:
0.08
0.04
-3.0
This is my code:
z=0.00
while read line
do
line_o="$line"
openx_de=`echo $line_o|awk -F' ' '{print $2,$3}'`
IFS=' ' read -ra od <<< "$openx_de"
for i in "${od[#]}";do
if [ $i != $z ]
then
echo "openx_default_value is $i"
fi
done
done < /openx.txt
but it also gets the zero values.
To get only the nonzero values from columns 2 and 3, try:
$ awk '$2+0!=0{print $2} $3+0!=0{print $3}' openx.txt
0.08
0.04
-3.00
How it works:
$2+0 != 0 {print $2} tests to see if the second column is nonzero. If it is nonzero, then the print statement is executed.
We want to do a numeric comparison between the second column, $2, and zero. To tell awk to treat $2 as a number, we first add zero to it and then we do the comparison.
The same is done for column 3.
Using column names
Consider this input file:
$ cat openx2.txt
n first second
1 0.00 0.00
2 0.00 0.00
3 0.00 0.08
4 0.00 0.00
5 0.04 0.00
6 0.00 0.00
7 -3.00 0.00 8 0.00 0.00
To print the column name with each value found, try:
$ awk 'NR==1{two=$2; three=$3; next} $2+0!=0{print two,$2} $3+0!=0{print three,$3}' openx2.txt
second 0.08
first 0.04
first -3.00
awk '{$1=""}{gsub(/0.00/,"")}NF{$1=$1;print}' file
0.08
0.04
-3.00

Sort by highest value in any field

I want to sort a file based on values in columns 2-8?
Essentially I want ascending order based on the highest value that appears on the line in any of those fields but ignoring columns 1, 9 and 10. i.e. the line with the highest value should be the last line of the file, 2nd largest value should be 2nd last line etc... If the next number in the ascending order appears on multiple lines (like A/B) I don't care of the order it gets printed.
I've looked at using sort but can't figure out an easy way to do what I want...
I'm a bit stumped, any ideas?
Input:
#1 2 3 4 5 6 7 8 9 10
A 0.00 0.00 0.01 0.23 0.19 0.07 0.26 0.52 0.78
B 0.00 0.00 0.02 0.26 0.19 0.09 0.20 0.56 0.76
C 0.00 0.00 0.02 0.16 0.20 0.22 2.84 0.60 3.44
D 0.00 0.00 0.02 0.29 0.22 0.09 0.28 0.62 0.90
E 0.00 0.00 0.90 0.09 0.18 0.05 0.24 1.21 1.46
F 0.00 0.00 1.06 0.03 0.04 0.01 0.00 1.13 1.14
G 0.00 0.00 1.11 0.10 0.31 0.08 0.64 1.60 2.25
H 0.00 0.00 1.39 0.03 0.04 0.01 0.01 1.47 1.48
I 0.00 0.00 1.68 0.16 0.55 0.24 5.00 2.63 7.63
J 0.00 0.00 6.86 0.52 1.87 0.59 12.79 9.83 22.62
K 0.00 0.00 7.26 0.57 2.00 0.64 11.12 10.47 21.59
Expected output:
#1 2 3 4 5 6 7 8 9 10
A 0.00 0.00 0.01 0.23 0.19 0.07 (0.26) 0.52 0.78
B 0.00 0.00 0.02 (0.26) 0.19 0.09 0.20 0.56 0.76
D 0.00 0.00 0.02 (0.29) 0.22 0.09 0.28 0.62 0.90
E 0.00 0.00 (0.90) 0.09 0.18 0.05 0.24 1.21 1.46
F 0.00 0.00 (1.06) 0.03 0.04 0.01 0.00 1.13 1.14
G 0.00 0.00 (1.11) 0.10 0.31 0.08 0.64 1.60 2.25
H 0.00 0.00 (1.39) 0.03 0.04 0.01 0.01 1.47 1.48
C 0.00 0.00 0.02 0.16 0.20 0.22 (2.84) 0.60 3.44
I 0.00 0.00 1.68 0.16 0.55 0.24 (5.00) 2.63 7.63
K 0.00 0.00 7.26 0.57 2.00 0.64 (11.12) 10.47 21.59
J 0.00 0.00 6.86 0.52 1.87 0.59 (12.79) 9.83 22.62
Preprocess the data: print the max of columns 2 through 8 at the start of each line, then sort, then remove the added column:
awk '
NR==1{print "x ", $0}
NR>1{
max = $2;
for( i = 3; i <= 8; i++ )
if( $i > max )
max = $i;
print max, $0
}' OFS=\\t input-file | sort -n | cut -f 2-
Another pure awk variant:
$ awk 'NR==1; # print header
NR>1{ #For other lines,
a=$2;
ai=2;
for(i=3;i<=8;i++){
if($i>a){
a=$i;
ai=i;
}
} # Find the max number in the line
$ai= "(" $ai ")"; # decoration - mark highest with ()
g[$0]=a;
}
function cmp_num_val(i1, v1, i2, v2) {return (v1 - v2);} # sorting function
END{
PROCINFO["sorted_in"]="cmp_num_val"; # assign sorting function
for (a in g) print a; # print
}' sortme.txt | column -t # column -t for formatting.
#1 2 3 4 5 6 7 8 9 10
A 0.00 0.00 0.01 0.23 0.19 0.07 (0.26) 0.52 0.78
B 0.00 0.00 0.02 (0.26) 0.19 0.09 0.20 0.56 0.76
D 0.00 0.00 0.02 (0.29) 0.22 0.09 0.28 0.62 0.90
E 0.00 0.00 (0.90) 0.09 0.18 0.05 0.24 1.21 1.46
F 0.00 0.00 (1.06) 0.03 0.04 0.01 0.00 1.13 1.14
G 0.00 0.00 (1.11) 0.10 0.31 0.08 0.64 1.60 2.25
H 0.00 0.00 (1.39) 0.03 0.04 0.01 0.01 1.47 1.48
C 0.00 0.00 0.02 0.16 0.20 0.22 (2.84) 0.60 3.44
I 0.00 0.00 1.68 0.16 0.55 0.24 (5.00) 2.63 7.63
K 0.00 0.00 7.26 0.57 2.00 0.64 (11.12) 10.47 21.59
J 0.00 0.00 6.86 0.52 1.87 0.59 (12.79) 9.83 22.62

Unix Shell: Summing up values, one per line but skipping every nth row

I am trying to design a Unix shell script (preferably generic sh) that will take a file whose contents are numbers, one per line. These numbers are the CPU idle time from mpstat obtained by:
cat ${PARSE_FILE} | awk '{print $13}' | grep "^[!0-9]" > temp.txt
So the file is a list if numbers, like:
46.19
93.41
73.60
99.40
95.80
96.00
77.10
99.20
52.76
81.18
69.38
89.80
97.00
97.40
76.18
97.10
What these values really are is that line 1 is for Core 1, line 2 for Core 2, etc... for X number of cores (in my case 8) - so every 9th line is again for Core 1, etc...
The original file looks something like this:
10/28/2013 Linux 2.6.32-358.el6.x86_64 (host) 10/28/2013 _x86_64_
(32 CPU)
10/28/2013
10/28/2013 02:25:05 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
10/28/2013 02:25:15 PM 0 51.20 0.00 2.61 0.00 0.00 0.00 0.00 0.00 46.19
10/28/2013 02:25:15 PM 1 6.09 0.00 0.50 0.00 0.00 0.00 0.00 0.00 93.41
10/28/2013 02:25:15 PM 2 25.20 0.00 1.20 0.00 0.00 0.00 0.00 0.00 73.60
10/28/2013 02:25:15 PM 3 0.40 0.00 0.20 0.00 0.00 0.00 0.00 0.00 99.40
10/28/2013 02:25:15 PM 4 3.80 0.00 0.40 0.00 0.00 0.00 0.00 0.00 95.80
10/28/2013 02:25:15 PM 5 3.70 0.00 0.30 0.00 0.00 0.00 0.00 0.00 96.00
10/28/2013 02:25:15 PM 6 21.70 0.00 1.20 0.00 0.00 0.00 0.00 0.00 77.10
10/28/2013 02:25:15 PM 7 0.70 0.00 0.10 0.00 0.00 0.00 0.00 0.00 99.20
10/28/2013 02:25:25 PM 0 45.03 0.00 1.61 0.00 0.00 0.60 0.00 0.00 52.76
10/28/2013 02:25:25 PM 1 17.82 0.00 1.00 0.00 0.00 0.00 0.00 0.00 81.18
10/28/2013 02:25:25 PM 2 29.62 0.00 1.00 0.00 0.00 0.00 0.00 0.00 69.38
10/28/2013 02:25:25 PM 3 9.70 0.00 0.40 0.00 0.00 0.10 0.00 0.00 89.80
10/28/2013 02:25:25 PM 4 2.40 0.00 0.60 0.00 0.00 0.00 0.00 0.00 97.00
10/28/2013 02:25:25 PM 5 2.00 0.00 0.60 0.00 0.00 0.00 0.00 0.00 97.40
10/28/2013 02:25:25 PM 6 22.92 0.00 0.90 0.00 0.00 0.00 0.00 0.00 76.18
10/28/2013 02:25:25 PM 7 2.40 0.00 0.50 0.00 0.00 0.00 0.00 0.00 97.10
I'm trying to design a script that will take the number of cores and this file as a variable and get me the average for each core and I'm not sure how to do this. Here is what I have:
cat ${PARSE_FILE} | awk '{print $13}' | grep "^[!0-9]" > temp.txt
NUMBER_OF_CORES=8
NUMBER_OF_LINES=`awk ' END { print NR } ' temp.txt`
NUMBER_OF_VALUES=`echo "scale=0;${NUMBER_OF_LINES}/${NUMBER_OF_CORES}" | bc`
for i in `seq 1 ${NUMBER_OF_CORES}`
do
awk 'NR % $i == 0' temp.txt
echo Core: ${i} Average: xx
done
So I have the number of values (lines over cores) that each core has, so that is every nth line I need to skip but I'm not sure how to cleanly do this. I basically need to loop every "NUMBER_OF_CORES" times through the file, skipping every "NUMBER_OF_CORES" line and summing them up to divide by "NUMBER_OF_VALUES".
Will this do ?
awk '/CPU/&&/idle/{f=1;next}f{a[$4]+=$13;b[$4]++}END{for(i in a){print i,a[i]/b[i]}}' your_file
Actually the number of cores is not needed here. It will calculate average idle time for all the cores available in the file
Tested:
> cat temp
10/28/2013 Linux 2.6.32-358.el6.x86_64 (host) 10/28/2013 _x86_64_
(32 CPU)
10/28/2013
10/28/2013 02:25:05 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
10/28/2013 02:25:15 PM 0 51.20 0.00 2.61 0.00 0.00 0.00 0.00 0.00 46.19
10/28/2013 02:25:15 PM 1 6.09 0.00 0.50 0.00 0.00 0.00 0.00 0.00 93.41
10/28/2013 02:25:15 PM 2 25.20 0.00 1.20 0.00 0.00 0.00 0.00 0.00 73.60
10/28/2013 02:25:15 PM 3 0.40 0.00 0.20 0.00 0.00 0.00 0.00 0.00 99.40
10/28/2013 02:25:15 PM 4 3.80 0.00 0.40 0.00 0.00 0.00 0.00 0.00 95.80
10/28/2013 02:25:15 PM 5 3.70 0.00 0.30 0.00 0.00 0.00 0.00 0.00 96.00
10/28/2013 02:25:15 PM 6 21.70 0.00 1.20 0.00 0.00 0.00 0.00 0.00 77.10
10/28/2013 02:25:15 PM 7 0.70 0.00 0.10 0.00 0.00 0.00 0.00 0.00 99.20
10/28/2013 02:25:25 PM 0 45.03 0.00 1.61 0.00 0.00 0.60 0.00 0.00 52.76
10/28/2013 02:25:25 PM 1 17.82 0.00 1.00 0.00 0.00 0.00 0.00 0.00 81.18
10/28/2013 02:25:25 PM 2 29.62 0.00 1.00 0.00 0.00 0.00 0.00 0.00 69.38
10/28/2013 02:25:25 PM 3 9.70 0.00 0.40 0.00 0.00 0.10 0.00 0.00 89.80
10/28/2013 02:25:25 PM 4 2.40 0.00 0.60 0.00 0.00 0.00 0.00 0.00 97.00
10/28/2013 02:25:25 PM 5 2.00 0.00 0.60 0.00 0.00 0.00 0.00 0.00 97.40
10/28/2013 02:25:25 PM 6 22.92 0.00 0.90 0.00 0.00 0.00 0.00 0.00 76.18
10/28/2013 02:25:25 PM 7 2.40 0.00 0.50 0.00 0.00 0.00 0.00 0.00 97.10
> nawk '/CPU/&&/idle/{f=1;next}f{a[$4]+=$13;b[$4]++}END{for(i in a){print i,a[i]/b[i]}}' temp
2 71.49
3 94.6
4 96.4
5 96.7
6 76.64
7 98.15
0 49.475
1 87.295
>
The script below countCores.sh is based on the data you gave in temp.txt
This may not be what you want but will give you some ideas. I was'nt sure
what overall total average you wanted so I just chose to show average of the values
in column one for all 8 cores. I also used cat -n to represent the core number.
Hope This helps. VonBell
#!/bin/bash
#Execute As: countCores.sh temp.txt 8
AllCoreTotals=0
DataFile="$1"
NumCores="$2"
AllCoreTotals=0
NumLines="`cat -n $DataFile|cut -f1|tail -1|tr -d " "`"
PrtCols="`echo $NumLines / $NumCores|bc`"
clear;echo;echo
echo "============================================================="
pr -t${PrtCols} $DataFile|tr -d "\t"|tr -s " " "+"|bc |\
while read CoreTotal
do
CoreAverage=`echo $CoreTotal / $PrtCols|bc`
echo "$CoreTotal Core Average $CoreAverage"
AllCoreTotals="`echo $CoreTotal + $AllCoreTotals|bc`"
echo "$AllCoreTotals" > AllCoreTot.tmp
done|cat -n
AllCoreAverage=`cat AllCoreTot.tmp`
AllCoreAverage="`echo $AllCoreAverage / $NumCores|bc`"
echo "============================================================="
echo "(Col One) Total Core Average: $AllCoreAverage "
rm $DataFile
rm AllCoreTot.tmp
Why not do it for all cores at the same time:
awk -f prog.awk ${PARSE_FILE}
Then in prog.awk put
{ if ((NF == 13) && ($4 != "CPU"))
{ SUM[$4] += $13;
CNT[$4]++;
}
}
END { for(loop in SUM)
{ printf("CPU: %d Total: %d Count: %d Average: %d\n",
loop, SUM[loop], CNT[loop], SUM[loop]/CNT[loop]);
}
}
If you want to do it on one line:
awk '{if ((NF == 13) && ($4 != "CPU")){SUM[$4] += $13;CNT[$4]++;}} END {for(loop in SUM){printf("CPU: %d Total: %d Count: %d Average: %d\n", loop, SUM[loop], CNT[loop], SUM[loop]/CNT[loop]);}}' ${PARSE_FILE}
After some more study, this snippet seems to do the trick:
#Parse logs to get CPU averages for cores
PARSE_FILE=`ls ~/logs/*mpstat*`
echo "Parsing ${PARSE_FILE}..."
cat ${PARSE_FILE} | awk '{print $13}' | grep "^[!0-9]" > temp.txt
NUMBER_OF_CORES=8
NUMBER_OF_LINES=`awk ' END { print NR } ' temp.txt`
NUMBER_OF_VALUES=`echo "scale=0;${NUMBER_OF_LINES}/${NUMBER_OF_CORES}" | bc`
TOTAL=0
for i in `seq 1 ${NUMBER_OF_CORES}`
do
sed -n $i'~'$NUMBER_OF_CORES'p' temp.txt > temp2.txt
SUM=`awk '{s+=$0} END {print s}' temp2.txt`
AVERAGE=`echo "scale=0;${SUM}/${NUMBER_OF_VALUES}" | bc`
echo Core: ${i} Average: `expr 100 - ${AVERAGE}`
TOTAL=$((TOTAL+${AVERAGE}))
done
TOTAL_AVERAGE=`echo "scale=0;${TOTAL}/${NUMBER_OF_CORES}" | bc`
echo "Total Average: `expr 100 - ${TOTAL_AVERAGE}`"
rm temp*.txt

iostat & steal time

I am trying to catch some data from iostat output:
# iostat -m
avg-cpu: %user %nice %system %iowait %steal %idle
9.92 0.00 14.17 0.01 0.00 75.90
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 6.08 0.00 0.04 2533 261072
dm-0 1.12 0.00 0.00 1290 30622
dm-1 0.00 0.00 0.00 1 0
dm-2 1.22 0.00 0.00 0 33735
dm-3 7.22 0.00 0.03 1213 196713
How can I match the "0.00" value?
Numbers aren't separated by a tab or a constant number of spaces.
Also the value can be 3 digits 0.00 or 4 digits 45.00, etc.
Any idea how to match it using bash?
Try this, using awk:
iostat | awk 'NR==3 { print $5 }'
NR==3 will operate on the third line, and $5 prints column 5. Verify that the proper column is being selected by playing around with the number, i.e. using your output and print $4 should yield 0.01.

Understanding ruby-prof output

I ran ruby-profiler on one of my programs. I'm trying to figure out what each fields mean. I'm guessing everything is CPU time (and not wall clock time), which is fantastic. I want to understand what the "---" stands for. Is there some sort of stack information in there. What does calls a/b mean?
Thread ID: 81980260
Total Time: 0.28
%total %self total self wait child calls Name
--------------------------------------------------------------------------------
0.28 0.00 0.00 0.28 5/6 FrameParser#receive_data
100.00% 0.00% 0.28 0.00 0.00 0.28 6 FrameParser#read_frames
0.28 0.00 0.00 0.28 4/4 ChatServerClient#receive_frame
0.00 0.00 0.00 0.00 5/47 Fixnum#+
0.00 0.00 0.00 0.00 1/2 DebugServer#receive_frame
0.00 0.00 0.00 0.00 10/29 String#[]
0.00 0.00 0.00 0.00 10/21 <Class::Range>#allocate
0.00 0.00 0.00 0.00 10/71 String#index
--------------------------------------------------------------------------------
100.00% 0.00% 0.28 0.00 0.00 0.28 5 FrameParser#receive_data
0.28 0.00 0.00 0.28 5/6 FrameParser#read_frames
0.00 0.00 0.00 0.00 5/16 ActiveSupport::CoreExtensions::String::OutputSafety#add_with_safety
--------------------------------------------------------------------------------
0.28 0.00 0.00 0.28 4/4 FrameParser#read_frames
100.00% 0.00% 0.28 0.00 0.00 0.28 4 ChatServerClient#receive_frame
0.28 0.00 0.00 0.28 4/6 <Class::Lal>#safe_call
--------------------------------------------------------------------------------
0.00 0.00 0.00 0.00 1/6 <Class::Lal>#safe_call
0.00 0.00 0.00 0.00 1/6 DebugServer#receive_frame
0.28 0.00 0.00 0.28 4/6 ChatServerClient#receive_frame
100.00% 0.00% 0.28 0.00 0.00 0.28 6 <Class::Lal>#safe_call
0.21 0.00 0.00 0.21 2/4 ChatUserFunction#register
0.06 0.00 0.00 0.06 2/2 ChatUserFunction#packet
0.01 0.00 0.00 0.01 4/130 Class#new
0.00 0.00 0.00 0.00 1/1 DebugServer#profile_stop
0.00 0.00 0.00 0.00 1/33 String#==
0.00 0.00 0.00 0.00 1/6 <Class::Lal>#safe_call
0.00 0.00 0.00 0.00 5/5 JSON#parse
0.00 0.00 0.00 0.00 5/8 <Class::Log>#log
0.00 0.00 0.00 0.00 5/5 String#strip!
--------------------------------------------------------------------------------
Each section of the ruby-prof output is broken up into the examination of a particular function. for instance, look at the first section of your output. The read_frames method on FrameParser is the focus and it is basically saying the following:
100% of the execution time that was profiled was spent inside of FrameParser#read_frames
FrameParser#read_frames was called 6 times.
5 out of the 6 calls to read_frames came from FrameParser#receive_data and this accounted 100% of the execution time (this is the line above the read_frames line).
The lines below the read_frames (but within that first section) method are all of the methods that FrameParser#read_frames calls (you should be aware of that since this seems like it's your code), how many of that methods total calls read_frames is responsible for (the a/b calls column), and how much time those calls took. They are ordered by which of them took up the most execution time. In your case, that is receive_frame method on the ChatServer class.
You can then look down at the section focusing on receive_frames (2 down and centered with the '100%' line on receive_frame) and see how it's performance is broken down. each section is set up the same way and usually the subsequent function call which took the most time is the focus of the next section down. ruby-prof will continue doing this through the full call stack. You can go as deep as you want until you find the bottleneck you'd like to resolve.

Resources