Script to extract highest latency from traceroute - bash

I'm looking for a script that can extract the line with the highest-latency hop from a traceroute. Ideally it would look at the max or avg of the 3 values on each line. How can I do that?
This is what I tried so far:
traceroute www.google.com | awk '{printf "%s\t%s\n", $2, $3+$4+$5; }' | sort -rgk2 | head -n1
traceroute -w10 www.google.com | awk '{printf "%s\t%s\n", $2, ($3+$4+$5)/3; }' | sort -rgk2 | head -n1
It seemed like a step in the right direction, except that some of the values coming back from a traceroute are *, so both the sum and the average end up wrong.
Update
Got one step further:
traceroute www.cnn.com | awk '{count = 0;sum = 0;for (i=3; i<6; i++){ if ($i != "*") {sum += $i;count++;}}; printf "%s\t%s\t%s\t%s\n", $2, count, sum, sum/count }' | sort -rgk2
Now I need to handle the case where I don't have columns 4 and 5. Sometimes traceroute only provides 3 stars, like this:
17 207.88.13.153 235.649ms 234.864ms 239.316ms
18 * * *
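For example, one way to guard against that case (a sketch, untested against every traceroute variant) is to skip a hop entirely when all three probes are *, i.e. when count stays 0, and to sort on the average column:
traceroute www.cnn.com | awk '{
    count = 0; sum = 0
    for (i = 3; i < 6; i++) if ($i != "*") { sum += $i; count++ }
    if (count > 0) printf "%s\t%s\t%s\t%s\n", $2, count, sum, sum / count
}' | sort -rgk4 | head -n1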

You will have to:
Kick off a traceroute.
Collect each line of output (a pipe would likely work well here).
Use a tool like awk to:
Analyze the line and extract the information you want.
Compare the values you just got with previous values and store the current line if appropriate.
At the end of the input, print the stored value.

Try:
$ traceroute 8.8.8.8 | awk 'BEGIN { FPAT = "[0-9]+\\.[0-9]{3} ms|\\*" }   # fields are "n.nnn ms" values or "*" (needs gawk)
/\* +\* +\*/ { next }                                                     # hop did not answer at all: ignore it
NR > 1 {
    for (i = 1; i < 4; i++) { gsub(/\*/, "5000.000 ms", $i) }             # count a lost probe as 5 seconds
    av = (gensub(" ms", "", 1, $1) + gensub(" ms", "", 1, $2) + gensub(" ms", "", 1, $3)) / 3
    if (av > worst) {
        ln = $0
        worst = av
    }
}
END { print "Highest:", ln, " Average:", worst, "ms" }'
which gives:
Highest: 6 72.14.242.166 (72.14.242.166) 7.383 ms 72.14.232.134 (72.14.232.134) 7.865 ms 7.768 ms Average: 7.672 ms
If there are three asterisks (* * *), the script assumes that the hop isn't responding with an ICMP reply and ignores it completely. If there are one or two * in a line, it gives them the value of 5.0 seconds.

Stephan, you could try pchar, a derivative of pathchar. It should be in the Ubuntu repository.
It takes a while to run, though, so you need some patience. It will show you throughput, which is much better than latency for determining the bottleneck.
http://www.caida.org/tools/taxonomy/perftaxonomy.xml
Here is an example:
rayd@raydHPEliteBook8440p ~ $ sudo pchar anddroiddevs.com
pchar to anddroiddevs.com (31.221.38.104) using UDP/IPv4
Using raw socket input
Packet size increments from 32 to 1500 by 32
46 test(s) per repetition
32 repetition(s) per hop
0: 192.168.0.20 (raydHPEliteBook8440p.local)
Partial loss: 0 / 1472 (0%)
Partial char: rtt = 6.553065 ms, (b = 0.000913 ms/B), r2 = 0.241811
stddev rtt = 0.196989, stddev b = 0.000244
Partial queueing: avg = 0.012648 ms (13848 bytes)
Hop char: rtt = 6.553065 ms, bw = 8759.575088 Kbps
Hop queueing: avg = 0.012648 ms (13848 bytes)
1: 80.5.69.1 (cpc2-glfd6-2-0-gw.6-2.cable.virginm.net)

Use mtr --raw -c 1 google.com. It's way faster and easier to parse.
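For instance, the worst hop can be pulled out of the raw output with something like this sketch. It assumes mtr's raw format of "h <hop> <address>" and "p <hop> <latency-in-microseconds>" lines, which recent versions emit; check mtr(8) on your system before relying on it:
mtr --raw -c 1 google.com | awk '
    $1 == "h" { addr[$2] = $3 }          # remember each hop address
    $1 == "p" { ms[$2] = $3 / 1000 }     # latency is reported in microseconds
    END {
        for (h in ms) if (ms[h] > worst) { worst = ms[h]; hop = h }
        printf "Hop %s (%s): %.3f ms\n", hop, addr[hop], worst
    }'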

Related

Calculate the Average of Numbers from command output

I need help with an nvidia-smi command. When I run the following command:
nvidia-smi --format=csv --query-gpu=utilization.gpu
It returns:
utilization.gpu [%]
89 %
45 %
22 %
68 %
I want to have a script that uses these returned values and calculates an average.
So it should do this: (89 + 45 + 22 + 68) / 4 = 224 / 4 = 56.
Is there a way of doing this?
nvidia-smi --format=csv --query-gpu=utilization.gpu | awk '/[[:digit:]]+[[:space:]]%/ { tot+=$1;cnt++ } END { print tot/cnt }'
Pipe the output of nvidia-smi ... to awk. Process lines that have one or more digits, a space and then a "%". Create a running total (tot) and also count the number of occurrences (cnt). At the end, divide tot by cnt and print the result.
nvidia-smi --format=csv --query-gpu=utilization.gpu | tail -n +2 | awk '{ sum+=$1 }END { print sum/NR }'
Use tail to get every line but the first, then compute the mean of column 1.
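If you'd rather not use awk at all, the same average can be computed with plain bash arithmetic (a minimal sketch, integer division only, assuming the output shown in the question):
vals=$(nvidia-smi --format=csv --query-gpu=utilization.gpu | tail -n +2 | tr -d ' %')
sum=0; n=0
for v in $vals; do sum=$((sum + v)); n=$((n + 1)); done
echo "$((sum / n)) %"   # (89 + 45 + 22 + 68) / 4 = 56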

Performance: While Loop with AWK

I have a file that I want to import into a database table, with a piece of the file in each row. In the import, I need to indicate for each row the offset (first byte) and length (number of bytes).
I have the following files:
line_numbers.txt -> Each row contains the number of the last row of a record in plans.txt.
plans.txt -> All the information required for all the rows.
I have the following code:
#Starting line number of the record
sLine=0
#Starting byte value of the record
offSet=0
while read line
do
endByte=`awk -v fline=${sLine} -v lline=${line} \
'{if (NR > fline && NR < lline) \
sum += length($0); } \
END {print sum}' plans.txt`
echo "\"plans.txt.${offSet}.${endByte}/\"" >> lobs.in
sLine=$((line+1))
offSet=$((endByte+offSet))
done < line_numbers.txt
This code will write in the file lobs.in something similar to:
"plans.txt.0.504/"
"plans.txt.505.480/"
"plans.txt.984.480/"
"plans.txt.1464.1159/"
"plans.txt.2623.515/"
This means, for example, that the first record starts at byte 0 and continues for the next 504 bytes. The next starts at byte 505 and continues for the next 480 bytes.
I still have to run more tests, but it seems to be working.
My problem is that it is very slow for the volume I need to process.
Do you have any performance tips?
I looked for a way to move the loop into awk, but I need 2 input files and I don't know how to process them without the while.
Thank you!
Doing this all in awk would be much faster.
Suppose you have:
$ cat lines.txt
100
200
300
360
10000
50000
And:
$ awk -v maxl=50000 'BEGIN{for (i=1;i<=maxl;i++) printf "Line %d\n", i}' >data.txt
(So you have Line 1\nLine 2\n...Line maxl in the file data.txt)
You would do something like:
awk 'FNR==NR{lines[FNR]=$1; next}
{data[FNR]=length($0); next}
END{ sl=1
for (i=1; i in lines; i++) {
bc=0
for (j=sl; j<=lines[i]; j++){
bc+=data[j]
}
printf "line %d to %d is %d bytes\n", sl, j-1, bc
sl=lines[i]+1
}
}' lines.txt data.txt
line 1 to 100 is 1392 bytes
line 101 to 200 is 1500 bytes
line 201 to 300 is 1500 bytes
line 301 to 360 is 900 bytes
line 361 to 10000 is 153602 bytes
line 10001 to 50000 is 680000 bytes
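If you want the output in the lobs.in format from the question rather than the summary lines above, a variant of the same single-pass idea would be the sketch below. It assumes, like the original while loop, that offsets should not count newline characters:
awk 'FNR==NR { last[FNR] = $1; next }        # first file: last line number of each record
     { len[FNR] = length($0) }               # second file: byte length of each line
     END {
         start = 1; offset = 0
         for (i = 1; i in last; i++) {
             bytes = 0
             for (j = start; j <= last[i]; j++) bytes += len[j]
             printf "\"plans.txt.%d.%d/\"\n", offset, bytes
             offset += bytes
             start = last[i] + 1
         }
     }' line_numbers.txt plans.txt > lobs.in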
Simple improvement: never redirect with >> inside a loop when the redirection can be done once outside the loop. Worse:
while read line
do
# .... stuff omitted ...
echo "\"plans.txt.${offSet}.${endByte}/\"" >> lobs.in
# ....
done < line_numbers.txt
Note how the only line in the loop that outputs anything is echo. Better:
while read line
do
# .... stuff omitted ...
echo "\"plans.txt.${offSet}.${endByte}/\""
# ....
done < line_numbers.txt >> lobs.in

how to compute CPU usage with bash? [duplicate]

I am wondering how you can get the system CPU usage and present it in percent using bash, for example.
Sample output:
57%
In case there is more than one core, it would be nice if an average percentage could be calculated.
Take a look at cat /proc/stat
grep 'cpu ' /proc/stat | awk '{usage=($2+$4)*100/($2+$4+$5)} END {print usage "%"}'
EDIT: please read the comments before copy-pasting this or using it for any serious work. It was not tested or used; it's an idea for people who do not want to install a utility, or who need something that works in any distribution. Some people think you can "apt-get install" anything.
NOTE: this is not the current CPU usage, but the overall CPU usage across all the cores since system boot. This could be very different from the current CPU usage. To get the current value, top (or a similar tool) must be used.
Current CPU usage can be potentially calculated with:
awk '{u=$2+$4; t=$2+$4+$5; if (NR==1){u1=u; t1=t;} else print ($2+$4-u1) * 100 / (t-t1) "%"; }' \
<(grep 'cpu ' /proc/stat) <(sleep 1;grep 'cpu ' /proc/stat)
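For a continuously updating figure, the same delta idea can be wrapped in a small loop. This is a sketch that uses the same counters as the one-liner above (user, system and idle only, so iowait and friends are ignored) and plain integer arithmetic:
prev_used=0; prev_total=0
while true; do
    read -r _ user nice system idle _ < /proc/stat        # first line: aggregate "cpu" counters
    used=$((user + system))
    total=$((user + system + idle))
    if (( prev_total > 0 )); then
        echo "$(( (used - prev_used) * 100 / (total - prev_total) ))%"
    fi
    prev_used=$used; prev_total=$total
    sleep 1
done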
You can try:
top -bn1 | grep "Cpu(s)" | \
sed "s/.*, *\([0-9.]*\)%* id.*/\1/" | \
awk '{print 100 - $1"%"}'
Try mpstat from the sysstat package
> sudo apt-get install sysstat
> mpstat
Linux 3.0.0-13-generic (ws025) 02/10/2012 _x86_64_ (2 CPU)
03:33:26 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:33:26 PM all 2.39 0.04 0.19 0.34 0.00 0.01 0.00 0.00 97.03
Then some cut or grep to parse the info you need:
mpstat | grep -A 5 "%idle" | tail -n 1 | awk -F " " '{print 100 - $12}'
Might as well throw up an actual response with my solution, which was inspired by Peter Liljenberg's:
$ mpstat | awk '$12 ~ /[0-9.]+/ { print 100 - $12"%" }'
0.75%
This will use awk to print out 100 minus the 12th field (idle), with a percentage sign after it. awk will only do this for a line where the 12th field has numbers and dots only ($12 ~ /[0-9.]+/).
You can also average five samples, one second apart:
$ mpstat 1 5 | awk 'END{print 100-$NF"%"}'
Test it like this:
$ mpstat 1 5 | tee /dev/tty | awk 'END{print 100-$NF"%"}'
EDITED: I noticed that in another user's reply %idle was field 12 instead of field 11. The awk has been updated to account for the %idle field being variable.
This should get you the desired output:
mpstat | awk '$3 ~ /CPU/ { for(i=1;i<=NF;i++) { if ($i ~ /%idle/) field=i } } $3 ~ /all/ { print 100 - $field }'
If you want a simple integer rounding, you can use printf:
mpstat | awk '$3 ~ /CPU/ { for(i=1;i<=NF;i++) { if ($i ~ /%idle/) field=i } } $3 ~ /all/ { printf("%d%%",100 - $field) }'
Do this to see the overall CPU usage. This calls python3 and uses the cross-platform psutil module.
printf "%b" "import psutil\nprint('{}%'.format(psutil.cpu_percent(interval=2)))" | python3
The interval=2 part says to measure the total CPU load over a blocking period of 2 seconds.
Sample output:
9.4%
The python program it contains is this:
import psutil
print('{}%'.format(psutil.cpu_percent(interval=2)))
Placing time in front of the call proves it takes the specified interval time of about 2 seconds in this case. Here is the call and output:
$ time printf "%b" "import psutil\nprint('{}%'.format(psutil.cpu_percent(interval=2)))" | python3
9.5%
real 0m2.127s
user 0m0.119s
sys 0m0.008s
To view the output for individual cores as well, let's use this python program below. First, I obtain a python list (array) of "per CPU" information, then I average everything in that list to get a "total % CPU" type value. Then I print the total and the individual core percents.
Python program:
import psutil
cpu_percent_cores = psutil.cpu_percent(interval=2, percpu=True)
avg = sum(cpu_percent_cores)/len(cpu_percent_cores)
cpu_percent_total_str = ('%.2f' % avg) + '%'
cpu_percent_cores_str = [('%.2f' % x) + '%' for x in cpu_percent_cores]
print('Total: {}'.format(cpu_percent_total_str))
print('Individual CPUs: {}'.format(' '.join(cpu_percent_cores_str)))
This can be wrapped up into an incredibly ugly 1-line bash script like this if you like. I had to be sure to use only single quotes (''), NOT double quotes ("") in the Python program in order to make this wrapping into a bash 1-liner work:
printf "%b" \
"\
import psutil\n\
cpu_percent_cores = psutil.cpu_percent(interval=2, percpu=True)\n\
avg = sum(cpu_percent_cores)/len(cpu_percent_cores)\n\
cpu_percent_total_str = ('%.2f' % avg) + '%'\n\
cpu_percent_cores_str = [('%.2f' % x) + '%' for x in cpu_percent_cores]\n\
print('Total: {}'.format(cpu_percent_total_str))\n\
print('Individual CPUs: {}'.format(' '.join(cpu_percent_cores_str)))\n\
" | python3
Sample output: notice that I have 8 cores, so there are 8 numbers after "Individual CPUs:":
Total: 10.15%
Individual CPUs: 11.00% 8.50% 11.90% 8.50% 9.90% 7.60% 11.50% 12.30%
For more information on how the psutil.cpu_percent(interval=2) python call works, see the official psutil.cpu_percent(interval=None, percpu=False) documentation here:
psutil.cpu_percent(interval=None, percpu=False)
Return a float representing the current system-wide CPU utilization as a percentage. When interval is > 0.0 compares system CPU times elapsed before and after the interval (blocking). When interval is 0.0 or None compares system CPU times elapsed since last call or module import, returning immediately. That means the first time this is called it will return a meaningless 0.0 value which you are supposed to ignore. In this case it is recommended for accuracy that this function be called with at least 0.1 seconds between calls. When percpu is True returns a list of floats representing the utilization as a percentage for each CPU. First element of the list refers to first CPU, second element to second CPU and so on. The order of the list is consistent across calls.
Warning: the first time this function is called with interval = 0.0 or None it will return a meaningless 0.0 value which you are supposed to ignore.
Going further:
I use the above code in my cpu_logger.py script in my eRCaGuy_dotfiles repo.
References:
Stack Overflow: How to get current CPU and RAM usage in Python?
Stack Overflow: Executing multi-line statements in the one-line command-line?
How to display a float with two decimal places?
Finding the average of a list
Related
https://unix.stackexchange.com/questions/295599/how-to-show-processes-that-use-more-than-30-cpu/295608#295608
https://askubuntu.com/questions/22021/how-to-log-cpu-load

Calculate CPU per process

I'm trying to write a script that gives back the CPU usage (in %) for a specific process. I need to use /proc/PID/stat because ps aux is not present on the embedded system.
I tried this:
#!/usr/bin/env bash
PID=$1
PREV_TIME=0
PREV_TOTAL=0
while true;do
TOTAL=$(grep '^cpu ' /proc/stat |awk '{sum=$2+$3+$4+$5+$6+$7+$8+$9+$10; print sum}')
sfile=`cat /proc/$PID/stat`
PROC_U_TIME=$(echo $sfile|awk '{print $14}')
PROC_S_TIME=$(echo $sfile|awk '{print $15}')
PROC_CU_TIME=$(echo $sfile|awk '{print $16}')
PROC_CS_TIME=$(echo $sfile|awk '{print $17}')
let "PROC_TIME=$PROC_U_TIME+$PROC_CU_TIME+$PROC_S_TIME+$PROC_CS_TIME"
CALC="scale=2 ;(($PROC_TIME-$PREV_TIME)/($TOTAL-$PREV_TOTAL)) *100"
USER=`bc <<< $CALC`
PREV_TIME="$PROC_TIME"
PREV_TOTAL="$TOTAL"
echo $USER
sleep 1
done
But it doesn't give the correct value when I compare it to top. Does anyone know where I made a mistake?
Thanks
Under a normal invocation of top (no arguments), the %CPU column is the proportion of ticks used by the process against the total ticks provided by one CPU, over a period of time.
From the top.c source, the %CPU field is calculated as:
float u = (float)p->pcpu * Frame_tscale;
where pcpu for a process is the elapsed user time + system time since the last display:
hist_new[Frame_maxtask].tics = tics = (this->utime + this->stime);
...
if(ptr) tics -= ptr->tics;
...
// we're just saving elapsed tics, to be converted into %cpu if
// this task wins it's displayable screen row lottery... */
this->pcpu = tics;
and:
et = (timev.tv_sec - oldtimev.tv_sec)
+ (float)(timev.tv_usec - oldtimev.tv_usec) / 1000000.0;
Frame_tscale = 100.0f / ((float)Hertz * (float)et * (Rc.mode_irixps ? 1 : Cpu_tot));
Hertz is 100 ticks/second on most systems (grep 'define HZ' /usr/include/asm*/param.h), et is the elapsed time in seconds since the last displayed frame, and Cpu_tot is the number of CPUs (but 1 is what's used by default).
So, the equation on a system using 100 ticks per second for a process over T seconds is:
(curr_utime + curr_stime - (last_utime + last_stime)) / (100 * T) * 100
The script becomes:
#!/bin/bash
PID=$1
SLEEP_TIME=3 # seconds
HZ=100 # ticks/second
prev_ticks=0
while true; do
sfile=$(cat /proc/$PID/stat)
utime=$(awk '{print $14}' <<< "$sfile")
stime=$(awk '{print $15}' <<< "$sfile")
ticks=$(($utime + $stime))
pcpu=$(bc <<< "scale=4 ; ($ticks - $prev_ticks) / ($HZ * $SLEEP_TIME) * 100")
prev_ticks="$ticks"
echo $pcpu
sleep $SLEEP_TIME
done
The key difference between this approach and that of your original script is that top computes its CPU time percentages against 1 CPU, whereas you were attempting to do so against the aggregate total for all CPUs. It's also true that you can compute the exact aggregate ticks over a period of time by doing Hertz * time * n_cpus, although it may not necessarily be the case that the numbers in /proc/stat will sum exactly:
$ grep 'define HZ' /usr/include/asm*/param.h
/usr/include/asm-generic/param.h:#define HZ 100
$ grep ^processor /proc/cpuinfo | wc -l
16
$ t1=$(awk '/^cpu /{sum=$2+$3+$4+$5+$6+$7+$8+$9+$10; print sum}' /proc/stat) ; sleep 1 ; t2=$(awk '/^cpu /{sum=$2+$3+$4+$5+$6+$7+$8+$9+$10; print sum}' /proc/stat) ; echo $(($t2 - $t1))
1602
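If you instead want the share of all CPUs combined (what top shows with Irix mode off, i.e. the Cpu_tot branch of the Rc.mode_irixps ternary quoted above), a one-shot sketch along the same lines simply divides by the core count as well:
#!/bin/bash
# Sketch only: sample a PID's CPU share against ALL CPUs over a short window.
# Usage: ./pcpu_all.sh <PID> [seconds]
PID=$1
T=${2:-3}                                              # sampling window in seconds
HZ=100                                                 # ticks/second, per the param.h check above
NCPUS=$(grep -c '^processor' /proc/cpuinfo)

ticks() { awk '{print $14 + $15}' "/proc/$1/stat"; }   # utime + stime, as in the script above

t0=$(ticks "$PID")
sleep "$T"
t1=$(ticks "$PID")
bc <<< "scale=2 ; ($t1 - $t0) * 100 / ($HZ * $T * $NCPUS)"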

BASH - extract integer from logfile with sed

I've got the following logfile and I'd like to extract the number of dropped packets (in the following example the number is 0):
ITGDec version 2.8.1 (r1023)
Compile-time options: bursty multiport
----------------------------------------------------------
Flow number: 1
From 192.168.1.2:0
To 192.168.1.2:8999
----------------------------------------------------------
Total time = 2.990811 s
Total packets = 590
Minimum delay = 0.000033 s
Maximum delay = 0.000169 s
Average delay = 0.000083 s
Average jitter = 0.000010 s
Delay standard deviation = 0.000016 s
Bytes received = 241900
Average bitrate = 647.048576 Kbit/s
Average packet rate = 197.270907 pkt/s
Packets dropped = 0 (0.00 %)
Average loss-burst size = 0.000000 pkt
----------------------------------------------------------
__________________________________________________________
**************** TOTAL RESULTS ******************
__________________________________________________________
Number of flows = 1
Total time = 2.990811 s
Total packets = 590
Minimum delay = 0.000033 s
Maximum delay = 0.000169 s
Average delay = 0.000083 s
Average jitter = 0.000010 s
Delay standard deviation = 0.000016 s
Bytes received = 241900
Average bitrate = 647.048576 Kbit/s
Average packet rate = 197.270907 pkt/s
Packets dropped = 0 (0.00 %)
Average loss-burst size = 0 pkt
Error lines = 0
----------------------------------------------------------
I'm trying with the following command:
cat logfile | grep -m 1 dropped | sed -n 's/.*=\([0-9]*\) (.*/\1/p'
but nothing gets printed.
Thank you
EDIT: I just wanted to tell you that the "Dropped packets" line gets printed in the following way in the code of the program:
printf("Packets dropped = %13lu (%3.2lf %%)\n", (long unsigned int) 0, (double) 0);
It will be easier to use awk here:
awk '/Packets dropped/{print $4}' logfile
Aside from the problem in your sed expression (that it doesn't allow space after =), you don't really need a pipeline here.
grep would suffice:
grep -m 1 -oP 'dropped\s*=\s*\K\d+' logfile
You could have fixed your sed expression by permitting space after the =:
sed -n 's/.*= *\([0-9]*\) (.*/\1/p'
Avoiding your use of cat and grep, in plain sed:
sed -n 's/^Packets dropped[=[:space:]]\+\([0-9]\+\).*/\1/p' logfile
Matches
any line starting with "Packets dropped"
one or more whitespace or "=" characters
one or more digits (which are captured)
The rest .* is discarded.
With the -r option as well, you can lose a few backslashes:
sed -nr 's/^Packets dropped[=[:space:]]+([0-9]+).*/\1/p' logfile
sed -n '/Packets dropped/ s/.*[[:space:]]\([0-9]\{1,\}\)[[:space:]].*/\1/p' YourFile
but this prints both lines (detail + summary) where the info is written.
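If only the first (per-flow) occurrence is wanted, the simplest guard is to stop at the first match, for example:
awk '/Packets dropped/ { print $4; exit }' logfile     # exit after the first match, skipping the TOTAL RESULTS block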
