Calculate CPU per process - bash

I'm trying to write a script that reports the CPU usage (in %) of a specific process. I need to use /proc/PID/stat because ps aux is not available on the embedded system.
I tried this:
#!/usr/bin/env bash
PID=$1
PREV_TIME=0
PREV_TOTAL=0
while true;do
TOTAL=$(grep '^cpu ' /proc/stat |awk '{sum=$2+$3+$4+$5+$6+$7+$8+$9+$10; print sum}')
sfile=`cat /proc/$PID/stat`
PROC_U_TIME=$(echo $sfile|awk '{print $14}')
PROC_S_TIME=$(echo $sfile|awk '{print $15}')
PROC_CU_TIME=$(echo $sfile|awk '{print $16}')
PROC_CS_TIME=$(echo $sfile|awk '{print $17}')
let "PROC_TIME=$PROC_U_TIME+$PROC_CU_TIME+$PROC_S_TIME+$PROC_CS_TIME"
CALC="scale=2 ;(($PROC_TIME-$PREV_TIME)/($TOTAL-$PREV_TOTAL)) *100"
USER=`bc <<< $CALC`
PREV_TIME="$PROC_TIME"
PREV_TOTAL="$TOTAL"
echo $USER
sleep 1
done
But it doesn't give the correct value when I compare it to top. Does anyone know where I am making a mistake?
Thanks

Under a normal invocation of top (no arguments), the %CPU column is the proportion of ticks used by the process against the total ticks provided by one CPU, over a period of time.
From the top.c source, the %CPU field is calculated as:
float u = (float)p->pcpu * Frame_tscale;
where pcpu for a process is the elapsed user time + system time since the last display:
hist_new[Frame_maxtask].tics = tics = (this->utime + this->stime);
...
if(ptr) tics -= ptr->tics;
...
// we're just saving elapsed tics, to be converted into %cpu if
// this task wins it's displayable screen row lottery... */
this->pcpu = tics;
and:
et = (timev.tv_sec - oldtimev.tv_sec)
+ (float)(timev.tv_usec - oldtimev.tv_usec) / 1000000.0;
Frame_tscale = 100.0f / ((float)Hertz * (float)et * (Rc.mode_irixps ? 1 : Cpu_tot));
Hertz is 100 ticks/second on most systems (grep 'define HZ' /usr/include/asm*/param.h), et is the elapsed time in seconds since the last displayed frame, and Cpu_tot is the number of CPUs (but the divisor of 1 is what's used by default).
So, the equation on a system using 100 ticks per second for a process over T seconds is:
(curr_utime + curr_stime - (last_utime + last_stime)) / (100 * T) * 100
The script becomes:
#!/bin/bash
PID=$1
SLEEP_TIME=3 # seconds
HZ=100 # ticks/second
prev_ticks=0
while true; do
sfile=$(cat /proc/$PID/stat)
utime=$(awk '{print $14}' <<< "$sfile")
stime=$(awk '{print $15}' <<< "$sfile")
ticks=$(($utime + $stime))
pcpu=$(bc <<< "scale=4 ; ($ticks - $prev_ticks) / ($HZ * $SLEEP_TIME) * 100")
prev_ticks="$ticks"
echo $pcpu
sleep $SLEEP_TIME
done
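The loop above hard-codes HZ and assumes the process name in /proc/PID/stat contains no spaces. As a minimal sketch, the two helpers below relax both assumptions. The names proc_ticks and pcpu are hypothetical, and getconf CLK_TCK may be missing on a stripped-down embedded system, in which case the sketch falls back to 100.

```bash
#!/usr/bin/env bash
# Ticks per second; falls back to 100 if getconf is unavailable (assumption).
HZ=$(getconf CLK_TCK 2>/dev/null || echo 100)

# Print utime + stime from one /proc/<pid>/stat line.
# Everything up to the last ") " is dropped first, so a process name
# containing spaces or parentheses cannot shift the field positions.
proc_ticks() {
    local line=$1
    local rest=${line##*) }
    local -a f
    read -r -a f <<< "$rest"
    # f[0] is field 3 (state), so utime (field 14) is f[11] and stime is f[12].
    echo $(( f[11] + f[12] ))
}

# Percentage of one CPU: delta_ticks / (hz * seconds) * 100.
pcpu() {
    awk -v d="$1" -v hz="$2" -v t="$3" \
        'BEGIN { printf "%.2f\n", d / (hz * t) * 100 }'
}
```

Inside the loop, ticks=$(proc_ticks "$(cat /proc/$PID/stat)") followed by pcpu $((ticks - prev_ticks)) "$HZ" "$SLEEP_TIME" would replace the awk and bc calls.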
The key difference between this approach and your original script is that top computes its CPU percentages against one CPU, whereas you were computing them against the aggregate total for all CPUs. You can also compute the exact aggregate ticks over a period of time as Hertz * time * n_cpus, which matters because the numbers in /proc/stat will not necessarily sum to exactly that value:
$ grep 'define HZ' /usr/include/asm*/param.h
/usr/include/asm-generic/param.h:#define HZ 100
$ grep ^processor /proc/cpuinfo | wc -l
16
$ t1=$(awk '/^cpu /{sum=$2+$3+$4+$5+$6+$7+$8+$9+$10; print sum}' /proc/stat) ; sleep 1 ; t2=$(awk '/^cpu /{sum=$2+$3+$4+$5+$6+$7+$8+$9+$10; print sum}' /proc/stat) ; echo $(($t2 - $t1))
1602
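To make the two denominators concrete, here is a small sketch that computes the percentage either against one CPU (top's default, Irix mode) or against all CPUs (Solaris mode), following the Frame_tscale expression quoted above. The function name pcpu_mode and its argument order are made up for illustration.

```bash
#!/usr/bin/env bash
# usage: pcpu_mode delta_ticks hz seconds ncpus irix|solaris
pcpu_mode() {
    awk -v d="$1" -v hz="$2" -v t="$3" -v n="$4" -v mode="$5" 'BEGIN {
        # Irix mode divides by one CPU, Solaris mode by the CPU count.
        div = (mode == "irix") ? 1 : n
        printf "%.2f\n", d * 100 / (hz * t * div)
    }'
}

pcpu_mode 300 100 3 16 irix      # one fully busy thread, per-CPU view
pcpu_mode 300 100 3 16 solaris   # same thread, whole-machine view
```

On the 16-CPU machine above, a process that used 300 ticks in 3 seconds reads as 100.00 in the first mode and 6.25 in the second.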

Related

How to parallelize a csh while loop with GNU Parallel

I have the following script that creates multiple objects.
I tried just simply running it in my terminal, but it seems to take so long. How can I run this with GNU-parallel?
The script below creates an object. It goes through niy = 1 through niy = 800, and for every increment in niy, it loops through njx = 1 to 675.
#!/bin/csh
set njx = 675 ### Number of grids in X
set niy = 800 ### Number of grids in Y
set ll_x = -337500
set ll_y = -400000 ### (63 / 2) * 1000 ### This is the coordinate at lower right corner
set del_x = 1000
set del_y = 1000
rm -f out.shp
rm -f out.shx
rm -f out.dbf
rm -f out.prj
shpcreate out polygon
dbfcreate out -n ID1 10 0
@ n = 0 ### initialization of counter (n) to count grid cells in loop
@ iy = 1 ### initialization of counter (iy) to count grid cells along north-south direction
echo ### empty line on screen
while ($iy <= $niy) ### start the loop for north-south direction
echo ' south-north' $iy '/' $niy ### print a notification on screen
@ jx = 1
while ($jx <= $njx) ### start the loop for east-west direction
@ n++
set x = `echo $ll_x $jx $del_x | awk '{print $1 + ($2 - 1) * $3}'`
set y = `echo $ll_y $iy $del_y | awk '{print $1 + ($2 - 1) * $3}'`
set txt = `echo $x $y $del_x $del_y | awk '{print $1, $2, $1, $2 + $4, $1 + $3, $2 + $4, $1 + $3, $2, $1, $2}'`
shpadd out `echo $txt`
dbfadd out $n
@ jx++
end ### close the second loop
@ iy++
end ### close the first loop
echo
### lines below create a projection file for the created shapefile using
cat > out.prj << eof
PROJCS["Asia_Lambert_Conformal_Conic",GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Lambert_Conformal_Conic"],PARAMETER["False_Easting",0.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",120.98],PARAMETER["Standard_Parallel_1",5.0],PARAMETER["Standard_Parallel_2",20.0],PARAMETER["Latitude_Of_Origin",14.59998],UNIT["Meter",1.0]]
eof
###
###
###
The inner part gets executed 540,000 times and on each iteration you invoke 3 awk processes to do 3 simple bits of maths... that's 1.6 million awks.
Rather than that, I have written a single awk to generate all the loops and do all the maths and this can then be fed into bash or csh to actually execute it.
I wrote this and ran it completely in the time the original version got to 16% through. I have not checked it extremely thoroughly, but you should be able to readily correct any minor errors:
#!/bin/bash
awk -v njx=675 -v niy=800 -v ll_x=-337500 -v ll_y=-400000 -v del_x=1000 -v del_y=1000 '
BEGIN{
# emit the same commands the csh loops would have run, one per line
print "shpcreate out polygon"
print "dbfcreate out -n ID1 10 0"
n=0
for(iy=1;iy<=niy;iy++){
for(jx=1;jx<=njx;jx++){
n++
# lower-left corner of the current grid cell
x = ll_x + (jx-1)*del_x
y = ll_y + (iy-1)*del_y
txt = sprintf("%d %d %d %d %d %d %d %d %d %d",x,y,x, y+del_y, x+del_x, y+del_y, x+del_x,y,x,y)
print "shpadd out",txt
print "dbfadd out",n
}
}
}' /dev/null
If the output looks good, you can then run it through bash or csh like this:
./MyAwk | csh
Note that I don't know anything about these Shapefile (?) tools, shpadd and dbfadd. They may or may not be safe to run in parallel; if they are anything like sqlite, running them in parallel will not help you much. I am guessing the changes above are enough to make a massive improvement to your runtime. If not, here are some other things you could think about.
You could append an ampersand (&) to each line that starts dbfadd or shpadd so that several start in parallel, and then print a wait after every 8 lines so that you run 8 things in parallel in chunks.
You could feed the output of the script directly into GNU Parallel, but I have no idea if the ordering of the lines is critical.
I presume this is creating some sort of database. It may be faster if you run it on a RAM-backed filesystem, such as /dev/shm (or /tmp, where that is mounted as tmpfs).
I notice there is a Python module for manipulating Shapefiles. I can't help thinking that would be many, many times faster still.
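The ampersand-and-wait idea mentioned above can be sketched as a filter that sits between the awk generator and the shell. This is only an illustration: the function name batch_background is made up, and whether shpadd and dbfadd tolerate concurrent writes to the same files, or care about ordering, is an open question that would need testing before relying on it.

```bash
#!/usr/bin/env bash
# usage: generator | batch_background [batch_size] | csh
# Appends " &" to every shpadd/dbfadd line and emits "wait" after each
# batch, so at most batch_size commands run at once.
batch_background() {
    local batch=${1:-8}
    awk -v batch="$batch" '
        /^(shpadd|dbfadd) / {
            print $0 " &"
            if (++count % batch == 0) print "wait"
            next
        }
        { print }                                     # other lines pass through
        END { if (count % batch != 0) print "wait" }  # flush the last batch
    '
}
```

With the generator above this would be run as ./MyAwk | batch_background 8 | csh.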

Calculate average execution time of a program using Bash

To get the execution time of any executable, say a.out, I can simply write time ./a.out. This will output a real time, user time and system time.
Is it possible to write a bash script that runs the program numerous times and calculates and outputs the average real execution time?
You could write a loop and collect the output of time command and pipe it to awk to compute the average:
avg_time() {
#
# usage: avg_time n command ...
#
n=$1; shift
(($# > 0)) || return # bail if no command given
for ((i = 0; i < n; i++)); do
{ time -p "$@" &>/dev/null; } 2>&1 # ignore the output of the command
# but collect time's output in stdout
done | awk '
/real/ { real = real + $2; nr++ }
/user/ { user = user + $2; nu++ }
/sys/ { sys = sys + $2; ns++}
END {
if (nr>0) printf("real %f\n", real/nr);
if (nu>0) printf("user %f\n", user/nu);
if (ns>0) printf("sys %f\n", sys/ns)
}'
}
Example:
avg_time 5 sleep 1
would give you
real 1.000000
user 0.000000
sys 0.000000
This can be easily enhanced to:
sleep for a given amount of time between executions
sleep for a random time (within a certain range) between executions
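As a rough sketch of the first enhancement, a fixed pause can be added inside the loop. The name avg_time_paused and the extra pause argument are hypothetical, and only the real time is averaged to keep the example short.

```bash
#!/usr/bin/env bash
# usage: avg_time_paused n pause_seconds command ...
avg_time_paused() {
    local n=$1 pause=$2 i
    shift 2
    (( $# > 0 )) || return 1          # bail if no command given
    for (( i = 0; i < n; i++ )); do
        { time -p "$@" &>/dev/null; } 2>&1   # only time's report reaches the pipe
        sleep "$pause"                        # the pause itself is not timed
    done | awk '
        /^real/ { sum += $2; count++ }
        END     { if (count) printf "real %f\n", sum / count }'
}
```

For a random pause, the sleep argument could be derived from $RANDOM instead, for example sleep $(( RANDOM % 5 )).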
Meaning of time -p from man time:
-p
When in the POSIX locale, use the precise traditional format
"real %f\nuser %f\nsys %f\n"
(with numbers in seconds) where the number of decimals in the
output for %f is unspecified but is sufficient to express the
clock tick accuracy, and at least one.
You may want to check out this command-line benchmarking tool as well:
sharkdp/hyperfine
Total execution time vs sum of single execution time
Careful: summing N rounded execution times and dividing the sum is imprecise.
Instead, we can time all N iterations together and divide the total execution time by N:
avg_time_alt() {
local -i n=$1
local foo real sys user
shift
(($# > 0)) || return;
{ read foo real; read foo user; read foo sys ;} < <(
{ time -p for((;n--;)){ "$@" &>/dev/null ;} ;} 2>&1
)
printf "real: %.5f\nuser: %.5f\nsys : %.5f\n" $(
bc -l <<<"$real/$n;$user/$n;$sys/$n;" )
}
Note: This uses bc instead of awk to compute the average. For the demo below, we create a temporary bc file:
printf >/tmp/test-pi.bc "scale=%d;\npi=4*a(1);\nquit\n" 60
This computes π with 60 decimals, then exits quietly. (You can adapt the number of decimals for your host.)
Demo:
avg_time_alt 1000 sleep .001
real: 0.00195
user: 0.00008
sys : 0.00016
avg_time_alt 1000 bc -ql /tmp/test-pi.bc
real: 0.00172
user: 0.00120
sys : 0.00058
Whereas codeforester's function will answer:
avg_time 1000 sleep .001
real 0.000000
user 0.000000
sys 0.000000
avg_time 1000 bc -ql /tmp/test-pi.bc
real 0.000000
user 0.000000
sys 0.000000
Alternative, inspired by choroba's answer, using Linux's /proc
Ok, you could consider:
avgByProc() {
local foo start end n=$1 e=$1 values times
shift;
export n;
{
read foo;
read foo;
read foo foo start foo
} < /proc/timer_list;
mapfile values < <(
for((;n--;)){ "$@" &>/dev/null;}
read -a endstat < /proc/self/stat
{
read foo
read foo
read foo foo end foo
} </proc/timer_list
printf -v times "%s/100/$e;" ${endstat[@]:13:4}
bc -l <<<"$[end-start]/10^9/$e;$times"
)
printf -v fmt "%-7s: %%.5f\\n" real utime stime cutime cstime
printf "$fmt" ${values[@]}
}
This is based on /proc:
man 5 proc | grep [su]time\\\|timer.list | sed 's/^/> /'
(14) utime %lu
(15) stime %lu
(16) cutime %ld
(17) cstime %ld
/proc/timer_list (since Linux 2.6.21)
Then now:
avgByProc 1000 sleep .001
real : 0.00242
utime : 0.00015
stime : 0.00021
cutime : 0.00082
cstime : 0.00020
Here utime and stime represent the user time and system time of bash itself, while cutime and cstime represent the child user time and child system time, which are the most interesting.
Note: In this case the command (sleep) won't use a lot of resources.
avgByProc 1000 bc -ql /tmp/test-pi.bc
real : 0.00175
utime : 0.00015
stime : 0.00025
cutime : 0.00108
cstime : 0.00032
This becomes clearer...
Of course, because timer_list and self/stat are read successively and not atomically, differences may appear between real (based on nanoseconds) and c?[su]time (based on ticks, i.e. 1/100th of a second)!
From bashoneliners
adapted to transform (,) to (.) for i18n support
hardcoded to 10, adapt as needed
returns only the "real" value, the one you most likely want
Oneliner
for i in {1..10}; do time "$@"; done 2>&1 | grep ^real | sed s/,/./ | sed -e s/.*m// | awk '{sum += $1} END {print sum / NR}'
I made a "fuller" version
outputs the results of every execution so you know the right thing is executed
shows every run time, so you can glance for outliers
But really, if you need advanced stuff just use hyperfine.
GREEN='\033[0;32m'
PURPLE='\033[0;35m'
RESET='\033[0m'
# example: perfFull sleep 0.001
# https://serverfault.com/questions/175376/redirect-output-of-time-command-in-unix-into-a-variable-in-bash
perfFull() {
TIMEFORMAT=%R # `time` outputs only a number, not 3 lines
export LC_NUMERIC="en_US.UTF-8" # `time` outputs `0.100` instead of local format, like `0,100`
times=10
echo -e -n "\nWARMING UP ${PURPLE}$*${RESET}"
"$@" # execute passed parameters
echo -e -n "RUNNING ${PURPLE}$times times${RESET}"
exec 3>&1 4>&2 # redirects subshell streams
durations=()
for _ in `seq $times`; {
durations+=(`{ time "$@" 1>&3 2>&4; } 2>&1`) # passes stdout through so only `time` is captured
}
exec 3>&- 4>&- # reset subshell streams
printf '%s\n' "${durations[@]}"
total=0
for duration in "${durations[@]}"; {
total=$(bc <<< "scale=3;$total + $duration")
}
average=($(bc <<< "scale=3;$total/$times"))
echo -e "${GREEN}$average average${RESET}"
}
It's probably easier to record the start and end time of the execution and divide the difference by the number of executions.
#!/bin/bash
times=10
start=$(date +%s)
for ((i=0; i < times; i++)) ; do
run_your_executable_here
done
end=$(date +%s)
bc -l <<< "($end - $start) / $times"
I used bc to calculate the average, as bash doesn't support floating point arithmetic.
To get more precision, you can switch to nanoseconds:
start=$(date +%s.%N)
and similarly for $end.
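Putting the start/end approach into a reusable function might look like the sketch below. The name time_avg is made up, %N is a GNU date extension that may not exist on other systems, and because awk works in double precision the result is only good to roughly a microsecond.

```bash
#!/usr/bin/env bash
# usage: time_avg n command ...
# Prints the average wall-clock seconds per run.
time_avg() {
    local n=$1 i start end
    shift
    (( n > 0 && $# > 0 )) || return 1
    start=$(date +%s.%N)              # seconds.nanoseconds (GNU date)
    for (( i = 0; i < n; i++ )); do
        "$@" &>/dev/null              # discard the command's own output
    done
    end=$(date +%s.%N)
    awk -v s="$start" -v e="$end" -v n="$n" \
        'BEGIN { printf "%.6f\n", (e - s) / n }'
}
```

For example, time_avg 10 ./a.out prints a single number. The figure includes the loop overhead, which matters only for very short commands.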

How to calculate percentage in shell script

I used the below line of script in my shell code
Percent=$(echo "scale=2; $DP*100/$SDC" | bc)
It returns .16 as output, but I need it as 0.16.
Posix-compliant solution using bc:
#!/bin/sh
Percent="$(echo "
scale=2;
a = $DP * 100 / $SDC;
if (a > -1 && a < 0) { print "'"-0"'"; a*=-1; }
else if (a < 1 && a > 0) print 0;
a" | bc)"
That's kind of ugly with all of those special checks for cases when the answer is between -1 and 1 (exclusive) but not zero.
Let's use this Posix-compliant solution using awk instead:
#!/bin/sh
Percent="$(echo "$DP" "$SDC" |awk '{printf "%.2f", $1 * 100 / $2}')"
Z shell can do this natively:
#!/bin/zsh
Percent="$(printf %.2f $(( DP * 100. / SDC )) )"
(The dot is necessary to instruct zsh to use floating point math.)
Native Posix solution using string manipulation (assumes integer inputs):
#!/bin/sh
#                                    e.g. round down   e.g. round up
#                                    DP=1 SDC=3        DP=2 SDC=3
Percent=$(( DP * 100000 / SDC + 5 )) # Percent=33338   Percent=66671
Whole=${Percent%???}                 # Whole=33        Whole=66
Percent=${Percent#$Whole}            # Percent=338     Percent=671
Percent=$Whole.${Percent%?}          # Percent=33.33   Percent=66.67
This calculates 1,000 times the desired answer so that we have all the data we need for the final resolution to the hundredths. It adds five so we can properly truncate the thousandths digit. We then define temporary variable $Whole to be just the truncated integer value, we temporarily strip that from $Percent, then we append a dot and the decimals, excluding the thousandths (which we made wrong so we could get the hundredths rounded properly).
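The same scaled-integer trick can also be finished with printf instead of string trimming. As far as I can tell, the trimming version leaves the whole part empty when the result is below 1 (giving .16 again), whereas splitting with / and % keeps the leading zero. This is a sketch under the assumption of non-negative integer inputs; the function name percent is hypothetical.

```sh
#!/bin/sh
# usage: percent DP SDC
# Prints DP*100/SDC rounded to two decimals, using integer arithmetic only.
percent() {
    # 1,000 times the percentage, plus 5 to round the thousandths digit
    _scaled=$(( $1 * 100000 / $2 + 5 ))
    # whole part, then the tenths and hundredths padded to two digits
    printf '%d.%02d\n' "$(( _scaled / 1000 ))" "$(( _scaled % 1000 / 10 ))"
}

percent 1 3        # prints 33.33
percent 2 3        # prints 66.67
percent 16 10000   # prints 0.16
```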

Get CPU usage in bash, without top or sysstat

My test script looks like this:
prev_total_cpu=0
prev_idle_cpu=0
while true
do
tempenv=$(grep "cpu " /proc/stat)
#get cpu times
read cpu user nice system idle <<< "$tempenv"
#calculate total cpu time
let total=$[$user + $nice + $system + $idle]
#calculate delta total and delta idle
prev_total="prev_total_$cpu"
prev_idle="prev_idle_$cpu"
let delta_total=$[$total-${!prev_total}]
let delta_idle=$[$idle-${!prev_idle}]
#calculate cpu usage
printf "$cpu: $[100*($delta_total-$delta_idle)/$delta_total]%% "
#remember total and idle times
let prev_total_$cpu="$total"
let prev_idle_$cpu="$idle"
echo ""
sleep 1
done
Now, the same thing with multiple CPU support:
prev_total_cpu=0
prev_idle_cpu=0
prev_total_cpu0=0
prev_idle_cpu0=0
prev_total_cpu1=0
prev_idle_cpu1=0
while true
do
#loop through cpus
grep "cpu" /proc/stat | while IFS='\n' read tempenv
do
#get cpu times
read cpu user nice system idle <<< "$tempenv"
#calculate total cpu time
let total=$[$user + $nice + $system + $idle]
#calculate delta total and delta idle
prev_total="prev_total_$cpu"
prev_idle="prev_idle_$cpu"
let delta_total=$[$total-${!prev_total}]
let delta_idle=$[$idle-${!prev_idle}]
#calculate cpu usage
printf "$cpu: $[100*($delta_total-$delta_idle)/$delta_total]%% "
#remember total and idle times
let prev_total_$cpu="$total"
let prev_idle_$cpu="$idle"
done
echo ""
sleep 1
done
Doesn't work - it shows the same numbers over and over again. What am I doing wrong?
Sub-question:
I would like to initialize prev_total_$cpu and prev_idle_$cpu in a loop, but I don't know how to set the condition. I need the opposite of sed -ne 's/^cpu\(.*\) .*/\1/p' /proc/stat so I can loop through its output and do the initialization.
When you do
grep "cpu" /proc/stat | while IFS='\n' read tempenv
do
...
done
Your while loop runs in a subshell, so the variables it sets (the prev_total_* and prev_idle_* values) are lost as soon as the loop ends.
One solution would be to use process substitution instead:
while IFS='\n' read tempenv
do
...
done < <(grep "cpu" /proc/stat)
Another way is to enable the lastpipe option, if your Bash supports it (it only takes effect when job control is off, as it is in a script):
shopt -s lastpipe
... loop goes next ...
Adding some suggestions to your code: when doing arithmetic operations in Bash, it is better to use the (( )) format, e.g.:
(( total = user + nice + system + idle ))
It might also be better to use echo -n rather than printf (note the single %, since echo does not interpret format strings):
echo -n "$cpu: $(( 100 * ($delta_total - $delta_idle) / $delta_total ))% "
Lastly, I do hope you really mean that you run your script with Bash and not other shells like Ksh. They are very different. Bash is a shell but Shell != Bash.
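Pulling these suggestions together, a minimal sketch of the per-CPU report could look like this. It feeds /proc/stat to the loop through a redirection rather than a pipe, so the loop stays in the current shell, and it uses associative arrays (Bash 4 or later) so that no per-CPU variables need to be initialized in advance. The function name report_cpu_usage is made up, and like the original it only counts the user, nice, system and idle columns.

```bash
#!/usr/bin/env bash
declare -A prev_total prev_idle   # keyed by "cpu", "cpu0", "cpu1", ...

# Reads /proc/stat-style lines on stdin and prints the usage of each CPU
# since the previous call (since boot on the first call).
report_cpu_usage() {
    local cpu user nice system idle rest total dt di
    while read -r cpu user nice system idle rest; do
        [[ $cpu == cpu* ]] || continue
        (( total = user + nice + system + idle ))
        (( dt = total - ${prev_total[$cpu]:-0} ))   # missing keys count as 0
        (( di = idle - ${prev_idle[$cpu]:-0} ))
        (( dt > 0 )) && printf '%s: %d%% ' "$cpu" $(( 100 * (dt - di) / dt ))
        prev_total[$cpu]=$total
        prev_idle[$cpu]=$idle
    done
    echo
}
```

The main loop then becomes while true; do report_cpu_usage < /proc/stat; sleep 1; done.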

Comparing Two Timestamps Within a Second

I am writing a shell script that parses a CSV file and performs some calculations.
The timestamps are in the form HH:MM:SS.sss and are stored in the variables $t2 and $t1.
I would like to know the difference between the two stamps (it will always be less than one second) and report this as $t3 in seconds (e.g. 0.020)
t3=$t2-$t1
But the above code just prints the two variables with a minus sign between them. How do I compare the two timestamps?
Here's a funky way to do it! Strip off all the whole seconds to get the milliseconds. Do the subtraction. If result has gone negative it's because the seconds overflowed, so add back in 1000ms. Slot a decimal point on the front to make seconds from milliseconds.
#!/bin/bash -xv
t1="00:00:02.001"
t2="00:00:03.081"
ms1=${t1/*\./}
ms2=${t2/*\./}
t3=$((10#$ms2-10#$ms1))
(( t3 < 0 )) && t3=$((t3+1000))
t3=$(echo "scale=3; $t3/1000"|bc)
echo $t3
You can use awk maths to compute the difference between the 2 timestamps after converting both to their millisecond values:
t1=04:13:32.234
t2=04:13:32.258
awk -F '[ :.]+' '{
t1=($1*60*60 + $2*60 + $3)*1000 + $4
t2=($5*60*60 + $6*60 + $7)*1000 + $8
print (t2-t1)/1000}' <<< "$t1 $t2"
0.024
Formula used for conversion:
timestamp value (ms) = (hour * 60 * 60 + minute * 60 + second) * 1000 + millisecond
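The same conversion can be done without awk or bc. The sketch below assumes the HH:MM:SS.sss form with exactly three fractional digits and that the second timestamp is not earlier than the first; the function names to_ms and diff_seconds are hypothetical. The 10# prefix forces base 10 so that fields such as 08 are not read as invalid octal.

```bash
#!/usr/bin/env bash
# Convert HH:MM:SS.sss to milliseconds.
to_ms() {
    local h m s ms
    IFS=':.' read -r h m s ms <<< "$1"
    echo $(( (10#$h * 3600 + 10#$m * 60 + 10#$s) * 1000 + 10#$ms ))
}

# usage: diff_seconds t1 t2   (prints t2 - t1 in seconds)
diff_seconds() {
    local d=$(( $(to_ms "$2") - $(to_ms "$1") ))
    printf '%d.%03d\n' $(( d / 1000 )) $(( d % 1000 ))
}

diff_seconds 04:13:32.234 04:13:32.258   # prints 0.024
```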
