I am using the following command to get the read speed in MB/s:
hdparm -t /dev/sda | awk '/seconds/{print $11}'
From what I was reading, it is a good idea to run the test three times, add those values up, and then divide by 3 for your average.
Sometimes I will have 3 to 16 drives, so I would like the script to ask how many drives I have installed, and then perform the hdparm on each drive. Is there a simple way to step from sda up through sdb, sdc, sdd, etc. without typing that command so many times?
Thank you
Bash makes it easy to enumerate all the drives:
$ echo /dev/sd{a..h}
/dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh
Then you said you wanted to average the timing output, so let's define a function to do that:
perform_timing() {
    for i in {1..3}; do hdparm -t "$1"; done |
        awk '/seconds/ { total += $11; count++ } END { print (total / count) }'
}
Then you can run it on all the drives:
for drive in /dev/sd{a..h}; do
    printf '%s: %s\n' "$drive" "$(perform_timing "$drive")"
done
Breaking It Down
The perform_timing function does two things: 1) runs hdparm three times, then 2) averages the output. You can see how the first part works by running it manually:
# for i in {1..3}; do hdparm -t "/dev/sdc"; done
/dev/sdc:
Timing buffered disk reads: 1536 MB in 3.00 seconds = 511.55 MB/sec
/dev/sdc:
Timing buffered disk reads: 1536 MB in 3.00 seconds = 511.97 MB/sec
/dev/sdc:
Timing buffered disk reads: 1538 MB in 3.00 seconds = 512.24 MB/sec
The second part combines your awk code with logic to average all the lines, instead of printing them individually. You can see how the averaging works with a simple awk example:
$ printf '1\n4\n5\n'
1
4
5
$ printf '1\n4\n5\n' | awk '{ total += $1; count++ } END { print (total / count) }'
3.33333
We wrap all that logic in a function called perform_timing as a good programming practice. That lets us call it as if it were any other command:
# perform_timing /dev/sdc
512.303
Finally, instead of writing:
perform_timing /dev/sda
perform_timing /dev/sdb
...
We wrap it all in a loop. This simplified version should help explain how that works:
# for drive in /dev/sd{a..c}; do printf '%s\n' "$drive"; done
/dev/sda
/dev/sdb
/dev/sdc
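Finally, if you also want the script to ask how many drives are installed, here is a minimal sketch (assuming the drives are lettered consecutively from /dev/sda, up to 26 of them, and that you have sufficient privileges to run hdparm):
read -rp 'How many drives are installed? ' count
letters=( {a..z} )                      # sda, sdb, sdc, ... in order
for ((i = 0; i < count; i++)); do
    drive="/dev/sd${letters[i]}"
    printf '%s: %s\n' "$drive" "$(perform_timing "$drive")"
done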
Alternatively, hdparm accepts multiple devices at once, so you can skip the loop entirely:
# hdparm -i /dev/sd{a..d}
I have a bash script that takes a heap dump, but I need to trigger it automatically when my machine's memory usage reaches 80%.
Can anyone help me with the script? My environment runs on AWS.
Here is my attempt so far:
#!/bin/bash
threshold=40
threshold2=45
freemem=$(($(free -m |awk 'NR==2 {print $3}') * 100))
usage=$(($freemem / 512))
if [ "$usage" -gt "$threshold" ]
The shell examines the exit code of each program to decide what to do next, so you want to refactor your code to return 0 (success) when memory is below 80% and some other number otherwise.
Doing arithmetic in the shell is brittle at best, and impossible if you want floating point rather than integers. You are already using awk; refactor all the logic into awk for simplicity and efficiency.
#!/bin/bash
# declare a function
freebelowthres () {
    free -m |
        awk -v thres="$1" 'NR==2 {
            # $2 is total MB, $3 is used MB (the original hardcoded 512 as the total)
            if ($3 * 100 / $2 > thres) exit(1)
            exit(0) }'
}
Usage: if freebelowthres 80; then...
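From there, a minimal sketch of the trigger itself (take_heapdump.sh is a placeholder for your own heap dump script; schedule this via cron or a loop):
#!/bin/bash
# ... include the freebelowthres function from above ...
# freebelowthres returns non-zero once usage exceeds the threshold
if ! freebelowthres 80; then
    /path/to/take_heapdump.sh    # placeholder: your existing heap dump script
fi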
As regards integer adding one-liners, several shell scripting solutions have been proposed;
however, on closer inspection of each of the chosen solutions, there are inherent limitations:
the awk ones would choke on arbitrary precision and integer size (they behave C-like, after all);
the bc ones would rather be unhappy with arbitrarily long inputs: (sed 's/$/+\\/g';echo 0)|bc
On top of that, there may be portability issues across platforms (see [1] [2]), which is undesirable.
Is there a generic solution which is a winner on both practicality and brevity?
Hint: SunOS and Mac OS X are examples where portability would be an issue.
Finally, could the dc command handle arbitrarily large (2^n) inputs, integer or otherwise?
[1] awk: https://stackoverflow.com/a/450821/1574494 or https://stackoverflow.com/a/25245025/1574494 or Printing long integers in awk
[2] bc: Bash command to sum a column of numbers
An optimal solution for dc(1) sums the inputs as they are read:
$ jot 1000000 | sed '2,$s/$/+/;$s/$/p/' | dc
500000500000
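You can see what the sed stage feeds to dc by running it on a small input: every line after the first gets + appended, and the last line also gets p so dc prints the accumulated sum:
$ seq 5 | sed '2,$s/$/+/;$s/$/p/'
1
2+
3+
4+
5+p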
The one I usually use is paste -sd+|bc:
$ time seq 1 20000000 | paste -sd+|bc
200000010000000
real 0m10.092s
user 0m10.854s
sys 0m0.481s
(For strict Posix compliance, paste needs to be provided with an explicit argument: paste -sd+ -|bc. Apparently that is necessary with the BSD paste implementation installed by default on OS X.)
However, that will fail for larger inputs, because bc buffers an entire expression in memory before evaluating it. On my system, bc ran out of memory trying to add 100 million numbers, although it was able to do 70 million. But other systems may have smaller capacities.
Since bc has variables, you could avoid long lines by repetitively adding to a variable instead of constructing a single long expression. This is (as far as I know) 100% Posix compliant, but there is a 3x time penalty:
$ time seq 1 20000000|sed -e's/^/s+=/;$a\' -es|bc
200000010000000
real 0m29.224s
user 0m44.119s
sys 0m0.820s
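A small input shows what sed generates for bc here: one compound assignment per line, plus a final s so bc prints the variable:
$ seq 3 | sed -e's/^/s+=/;$a\' -es
s+=1
s+=2
s+=3
s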
Another way to handle the case where the input size exceeds bc's buffering capacity would be to use the standard xargs tool to add the numbers in groups:
$ time seq 1 100000000 |
> IFS=+ xargs sh -c 'echo "$*"' _ | bc | paste -sd+ | bc
5000000050000000
real 1m0.289s
user 1m31.297s
sys 0m19.233s
The number of input lines used by each xargs evaluation will vary from system to system, but it will normally be in the hundreds and it might be much more. Obviously, the xargs | bc invocations could be chained arbitrarily to increase capacity.
It might be necessary to limit the size of the xargs expansion using the -s switch, on systems where ARG_MAX exceeds the capacity of the bc command. Aside from performing an experiment to establish the bc buffer limit, there is no portable way to establish what that limit might be, but it certainly should be no less than LINE_MAX, which is guaranteed to be at least 2048. Even with 100-digit addends, that will allow a reduction by a factor of 20, so a chain of 10 xargs|bc pipes would handle over 10^13 addends, assuming you were prepared to wait a couple of months for that to complete.
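On a small input (with -n forcing tiny batches for illustration) you can see the shape of what each xargs invocation hands to bc:
$ seq 1 6 | IFS=+ xargs -n 3 sh -c 'echo "$*"' _
1+2+3
4+5+6
Piping that through bc gives the partial sums 6 and 15, which the final paste -sd+ | bc stage reduces to 21.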
As an alternative to constructing a large fixed-length pipeline, you could use a function to recursively pipe the output from xargs|bc until only one value is produced:
radd () {
    if read a && read b; then
        { printf '%s\n%s\n' "$a" "$b"; cat; } |
            IFS=+ xargs -s $MAXLINE sh -c 'echo "$*"' _ |
            bc | radd
    else
        echo "$a"
    fi
}
If you use a very conservative value for MAXLINE, the above is quite slow, but with plausible larger values it is not much slower than the simple paste|bc solution:
$ time seq 1 20000000 | MAXLINE=2048 radd
200000010000000
real 1m38.850s
user 0m46.465s
sys 1m34.503s
$ time seq 1 20000000 | MAXLINE=60000 radd
200000010000000
real 0m12.097s
user 0m17.452s
sys 0m5.090s
$ time seq 1 100000000 | MAXLINE=60000 radd
5000000050000000
real 1m3.972s
user 1m31.394s
sys 0m27.946s
As well as the bc solutions, I timed some other possibilities. As shown above, with an input of 20 million numbers, paste|bc took 10 seconds. That's almost identical to the time used by adding 20 million numbers with
gawk -M '{s+=$0} END{print s}'
Programming languages such as python and perl proved to be faster:
# 9.2 seconds to sum 20,000,000 integers
python -c $'import sys\nprint(sum(int(x) for x in sys.stdin))'
# 5.1 seconds
perl -Mbignum -lne '$s+=$_; END{print $s}'
I was unable to test dc -f - -e '[+z1<r]srz1<rp' on large inputs, since its performance appears to be quadratic (or worse); it summed 25 thousand numbers in 3 seconds, but it took 19 seconds to sum 50 thousand and 90 seconds to do 100 thousand.
Although bc is not the fastest and memory limitations require awkward workarounds, it has the advantage of working out of the box on Posix-compliant systems without the necessity to install enhanced versions of any standard utility (awk) or programming languages not required by Posix (perl and python).
You can use gawk with the -M flag:
$ seq 1 20000000 | gawk -M '{s+=$0} END{print s}'
200000010000000
Or Perl with bignum enabled:
$ seq 1 20000000 | perl -Mbignum -lne '$s+=$_; END{print $s}'
200000010000000
$ seq 1000|(sum=0;while read num; do sum=`echo $sum+$num|bc -l`;done;echo $sum)
500500
Also, this one will not win a top-speed prize, however it IS:
a one-liner, yes
portable
able to add lists of any length
able to add numbers of any precision (each number's length limited only by MAXLINE)
free of external tools such as python/perl/awk/R etc.
and, with a stretch, you may call it elegant too ;-)
Come on guys, show the better way to do this!
It seems that the following does the trick:
$ seq 1000|dc -f - -e '[+z1<r]srz1<rp'
500500
But is it the optimal solution?
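For reference, here is a rough reading of that dc program (a sketch; the # comments assume GNU dc, which accepts them):
$ seq 1000 | dc -f - -e '
  [+z1<r]   # macro: add the top two stack values, push the stack
            # depth (z), and re-run r while that depth exceeds 1
  sr        # store the macro in register r
  z1<r      # if more than one number was read, start the reduction
  p'        # print the lone value left on the stack: the sum
500500
The -f - executes the input as a dc program, which simply pushes each number onto the stack; the -e program then folds the whole stack down with repeated additions.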
jot really slows you down:
( time ( jot 100000000 | pvZ -i 0.2 -l -cN in0 |
mawk2 '{ __+=$_ } END { print __ }' FS='\n' ) )
in0: 100M 0:00:17 [5.64M/s] [ <=> ]
( jot 100000000 | nice pv -pteba -i 1 -i 0.2 -l -cN in0 |
mawk2 FS='\n'; )
26.43s user 0.78s system 153% cpu 17.730 total
5000000050000000
Using another awk instance to generate the sequence shaves off 39.7%:
( time (
mawk2 -v __='100000000' '
BEGIN { for(_-=_=__=+__;_<__;) {
print ++_ } }' |
pvZ -i 0.2 -l -cN in0 |
mawk2 '{ __+=$_ } END{ print __ }' FS='\n' ))
in0: 100M 0:00:10 [9.37M/s] [ <=> ]
( mawk2 -v __='100000000' 'BEGIN {…}' | )
19.44s user 0.68s system 188% cpu 10.687 total
5000000050000000
For the bc option, GNU paste is quite a bit faster than BSD paste in this regard, but both pale in comparison to awk, while perl is only slightly behind:
time jot 15000000 | pvE9 | mawk2 '{ _+=$__ } END { print _ }'
out9: 118MiB 0:00:02 [45.0MiB/s] [45.0MiB/s] [ <=> ]
112500007500000
jot 15000000 2.60s user 0.03s system 99% cpu 2.640 total
pvE 0.1 out9 0.01s user 0.05s system 2% cpu 2.640 total
mawk2 '{...}'
1.09s user 0.03s system 42% cpu 2.639 total
perl -Mbignum -lne '$s+=$_; END{print $s}' # perl 5.36
1.36s user 0.03s system 52% cpu 2.662 total
time jot 15000000 | pvE9 | gpaste -sd+ -|bc
out9: 118MiB 0:00:02 [45.3MiB/s] [45.3MiB/s] [ <=> ]
112500007500000
jot 15000000 2.59s user 0.03s system 99% cpu 2.627 total
pvE 0.1 out9 0.01s user 0.05s system 2% cpu 2.626 total
gpaste -sd+ - 0.27s user 0.03s system 11% cpu 2.625 total # gnu-paste
bc 4.55s user 0.46s system 66% cpu 7.544 total
time jot 15000000 | pvE9 | paste -sd+ -|bc
out9: 118MiB 0:00:05 [22.7MiB/s] [22.7MiB/s] [ <=> ]
112500007500000
jot 15000000 2.63s user 0.03s system 51% cpu 5.207 total
pvE 0.1 out9 0.01s user 0.06s system 1% cpu 5.209 total
paste -sd+ - 5.14s user 0.05s system 99% cpu 5.211 total # bsd-paste
bc 4.53s user 0.40s system 49% cpu 10.029 total
I am wondering how you can get the system CPU usage and present it as a percentage using bash, for example.
Sample output:
57%
In case there is more than one core, it would be nice if an average percentage could be calculated.
Take a look at cat /proc/stat
grep 'cpu ' /proc/stat | awk '{usage=($2+$4)*100/($2+$4+$5)} END {print usage "%"}'
EDIT: please read the comments before copy-pasting this or using it for any serious work. It was not tested or used in production; it's an idea for people who do not want to install a utility, or who need something that works on any distribution. Some people think you can "apt-get install" anything.
NOTE: this is not the current CPU usage, but the overall CPU usage in all the cores since the system bootup. This could be very different from the current CPU usage. To get the current value top (or similar tool) must be used.
Current CPU usage can be potentially calculated with:
awk '{u=$2+$4; t=$2+$4+$5; if (NR==1){u1=u; t1=t;} else print ($2+$4-u1) * 100 / (t-t1) "%"; }' \
<(grep 'cpu ' /proc/stat) <(sleep 1;grep 'cpu ' /proc/stat)
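For reference, the fields come from the aggregate cpu line of /proc/stat, whose columns are cumulative jiffies (cpu user nice system idle iowait irq softirq steal ...). Here is the same two-sample calculation as a commented sketch, counting user+system as busy time like the one-liner above:
# read the first line of /proc/stat twice, one second apart
read -r cpu user1 nice1 sys1 idle1 rest < /proc/stat
sleep 1
read -r cpu user2 nice2 sys2 idle2 rest < /proc/stat
busy=$(( (user2 + sys2) - (user1 + sys1) ))    # like $2+$4 above
total=$(( busy + (idle2 - idle1) ))
echo "$(( busy * 100 / total ))%"              # integer math, so truncated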
You can try:
top -bn1 | grep "Cpu(s)" | \
sed "s/.*, *\([0-9.]*\)%* id.*/\1/" | \
awk '{print 100 - $1"%"}'
Try mpstat from the sysstat package
> sudo apt-get install sysstat
Linux 3.0.0-13-generic (ws025) 02/10/2012 _x86_64_ (2 CPU)
03:33:26 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %idle
03:33:26 PM all 2.39 0.04 0.19 0.34 0.00 0.01 0.00 0.00 97.03
Then use some cut or grep to parse the info you need:
mpstat | grep -A 5 "%idle" | tail -n 1 | awk -F " " '{print 100 - $12}'
Might as well throw up an actual response with my solution, which was inspired by Peter Liljenberg's:
$ mpstat | awk '$12 ~ /[0-9.]+/ { print 100 - $12"%" }'
0.75%
This will use awk to print 100 minus the 12th field (idle), with a percentage sign after it. awk will only do this for lines where the 12th field contains numbers and dots only ($12 ~ /[0-9.]+/).
You can also average five samples, one second apart:
$ mpstat 1 5 | awk 'END{print 100-$NF"%"}'
Test it like this:
$ mpstat 1 5 | tee /dev/tty | awk 'END{print 100-$NF"%"}'
EDITED: I noticed that in another user's reply %idle was field 12 instead of field 11. The awk has been updated to account for the %idle field being variable.
This should get you the desired output:
mpstat | awk '$3 ~ /CPU/ { for(i=1;i<=NF;i++) { if ($i ~ /%idle/) field=i } } $3 ~ /all/ { print 100 - $field }'
If you want a simple integer rounding, you can use printf:
mpstat | awk '$3 ~ /CPU/ { for(i=1;i<=NF;i++) { if ($i ~ /%idle/) field=i } } $3 ~ /all/ { printf("%d%%",100 - $field) }'
Do this to see the overall CPU usage. This calls python3 and uses the cross-platform psutil module.
printf "%b" "import psutil\nprint('{}%'.format(psutil.cpu_percent(interval=2)))" | python3
The interval=2 part says to measure the total CPU load over a blocking period of 2 seconds.
Sample output:
9.4%
The python program it contains is this:
import psutil
print('{}%'.format(psutil.cpu_percent(interval=2)))
Placing time in front of the call proves it takes the specified interval time of about 2 seconds in this case. Here is the call and output:
$ time printf "%b" "import psutil\nprint('{}%'.format(psutil.cpu_percent(interval=2)))" | python3
9.5%
real 0m2.127s
user 0m0.119s
sys 0m0.008s
To view the output for individual cores as well, let's use this python program below. First, I obtain a python list (array) of "per CPU" information, then I average everything in that list to get a "total % CPU" type value. Then I print the total and the individual core percents.
Python program:
import psutil
cpu_percent_cores = psutil.cpu_percent(interval=2, percpu=True)
avg = sum(cpu_percent_cores)/len(cpu_percent_cores)
cpu_percent_total_str = ('%.2f' % avg) + '%'
cpu_percent_cores_str = [('%.2f' % x) + '%' for x in cpu_percent_cores]
print('Total: {}'.format(cpu_percent_total_str))
print('Individual CPUs: {}'.format(' '.join(cpu_percent_cores_str)))
This can be wrapped up into an incredibly ugly 1-line bash script like this if you like. I had to be sure to use only single quotes (''), NOT double quotes ("") in the Python program in order to make this wrapping into a bash 1-liner work:
printf "%b" \
"\
import psutil\n\
cpu_percent_cores = psutil.cpu_percent(interval=2, percpu=True)\n\
avg = sum(cpu_percent_cores)/len(cpu_percent_cores)\n\
cpu_percent_total_str = ('%.2f' % avg) + '%'\n\
cpu_percent_cores_str = [('%.2f' % x) + '%' for x in cpu_percent_cores]\n\
print('Total: {}'.format(cpu_percent_total_str))\n\
print('Individual CPUs: {}'.format(' '.join(cpu_percent_cores_str)))\n\
" | python3
Sample output: notice that I have 8 cores, so there are 8 numbers after "Individual CPUs:":
Total: 10.15%
Individual CPUs: 11.00% 8.50% 11.90% 8.50% 9.90% 7.60% 11.50% 12.30%
For more information on how the psutil.cpu_percent(interval=2) python call works, see the official psutil.cpu_percent(interval=None, percpu=False) documentation here:
psutil.cpu_percent(interval=None, percpu=False)
Return a float representing the current system-wide CPU utilization as a percentage. When interval is > 0.0 compares system CPU times elapsed before and after the interval (blocking). When interval is 0.0 or None compares system CPU times elapsed since last call or module import, returning immediately. That means the first time this is called it will return a meaningless 0.0 value which you are supposed to ignore. In this case it is recommended for accuracy that this function be called with at least 0.1 seconds between calls. When percpu is True returns a list of floats representing the utilization as a percentage for each CPU. First element of the list refers to first CPU, second element to second CPU and so on. The order of the list is consistent across calls.
Warning: the first time this function is called with interval = 0.0 or None it will return a meaningless 0.0 value which you are supposed to ignore.
Going further:
I use the above code in my cpu_logger.py script in my eRCaGuy_dotfiles repo.
References:
Stack Overflow: How to get current CPU and RAM usage in Python?
Stack Overflow: Executing multi-line statements in the one-line command-line?
How to display a float with two decimal places?
Finding the average of a list
Related
https://unix.stackexchange.com/questions/295599/how-to-show-processes-that-use-more-than-30-cpu/295608#295608
https://askubuntu.com/questions/22021/how-to-log-cpu-load
If I run a program, it shows the following on the terminal screen:
Total Events Processed 799992
Events Aborted (part of RBs) 0
Events Rolled Back 0
Efficiency 100.00 %
Total Remote (shared mem) Events Processed 0
Percent Remote Events 0.00 %
Total Remote (network) Events Processed 0
Percent Remote Events 0.00 %
Total Roll Backs 0
Primary Roll Backs 0
Secondary Roll Backs 0
Fossil Collect Attempts 0
Total GVT Computations 0
Net Events Processed 799992
Event Rate (events/sec) 3987042.0
If I want to take the first and fifth rows from the output, how do I do that?
You can use the grep utility if it is available:
$ ./program | grep 'Total Events Processed\|Total Remote (shared mem) Events Processed'
I think sed would do this, e.g.:
sed -n -e 1p -e 5p input.txt
You could do this with awk, too:
awk 'NR==1 || NR==5 {print;} NR==6 {nextfile;}'
Or in prettier sed:
sed -ne '1p;5p;6q'
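Applied to the output above, that prints exactly the two requested lines:
$ ./program | sed -ne '1p;5p;6q'
Total Events Processed 799992
Total Remote (shared mem) Events Processed 0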
Or even by piping through a pure Bourne shell script, if that's the way you roll (per your tag):
#!/bin/sh
n=0
while read line; do
    n=$((n+1))
    if [ $n = 1 -o $n = 5 ]; then
        echo "$line"
    elif [ $n = 6 ]; then
        break
    fi
done
Note that in all cases we're quitting at record 6, because there's no need to continue walking through the output.
I have collected vmstat data in a file. It gives details about free, buffer, and cache.
Since I'm interested in finding the memory usage, I should do the following computation for each line of vmstat output: USED = TOTAL - (FREE + BUFFER + CACHE), where TOTAL is the total RAM and USED is the instantaneous memory usage.
TOTAL memory = 4042928 (4 GB)
My code is here
grep -v procs $1 | grep -v free | awk '{USED=4042928-$4-$5-$6;print $USED}' > test.dat
awk: program limit exceeded: maximum number of fields size=32767
FILENAME="-" FNR=1 NR=1
For a start, you should not be printing $USED; the variable in awk is USED:
pax> vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 1402804 258392 2159316 0 0 54 79 197 479 1 2 93 3
pax> vmstat | egrep -v 'procs|free' | awk '{USED=4042928-$4-$5-$6;print USED}'
222780
What is most likely to be happening in your case is that you're using an awk with that limitation of about 32000 fields per record.
Because your fields 4, 5 and 6 are respectively 25172, 664 and 8520 (from one of your comments), your USED value becomes 4042928-25172-664-8520 or 4008572.
If you tried to print USED, that's what you'd get but, because you're trying to print $USED, it thinks you want $4008572 (field number 4008572) which is just a little bit beyond the 32000 range.
Interestingly, if you have a lot more free memory, you wouldn't get the error but you'd still get an erroneous value :-)
By the way, gawk doesn't have this limitation, it simply prints an empty field (see, for example, section 11.9 here).
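Back to the indirection itself, a harmless demo makes it obvious: printing USED gives you the number, while printing $USED makes awk treat that number as a field index:
$ echo one two three | awk '{ USED = 2; print USED; print $USED }'
2
two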
You can do it with a single awk command; NR>2 skips the two header lines, replacing both greps:
vmstat | awk 'NR>2 {print 4042928-$4-$5-$6}'
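To drop that into the original script, which reads the collected vmstat data from the file passed as $1 and writes to test.dat, something like:
awk 'NR>2 {print 4042928-$4-$5-$6}' "$1" > test.dat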