Enhanced docker stats command with total amount of RAM and CPU - bash

I just want to share a small script that I made to enhance the docker stats command.
I am not sure about the accuracy of this method.
Can I assume that the total amount of memory consumed by the complete Docker deployment is the sum of the memory consumed by each container?
Please share your modifications and/or corrections. This command is documented here: https://docs.docker.com/engine/reference/commandline/stats/
When running docker stats, the output looks like this:
$ docker stats --all --format "table {{.MemPerc}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.Name}}"
MEM %     CPU %     MEM USAGE / LIMIT       NAME
0.50%     1.00%     77.85MiB / 15.57GiB     ecstatic_noether
1.50%     3.50%     233.55MiB / 15.57GiB    stoic_goodall
0.25%     0.50%     38.92MiB / 15.57GiB     drunk_visvesvaraya
My script will add the following line at the end:
2.25%     5.00%     350.32MiB / 15.57GiB    TOTAL
docker_stats.sh
#!/bin/bash
# This script is used to complete the output of the docker stats command.
# The docker stats command does not compute the total amount of resources (RAM or CPU).

# Get the total amount of RAM in GiB (MemTotal in /proc/meminfo is reported in KiB).
HOST_MEM_TOTAL=$(grep MemTotal /proc/meminfo | awk '{print $2/1024/1024}')

# Get the output of the docker stats command; it will be displayed at the end.
# Without clearing the special variable IFS, the output of the docker stats command
# loses its newlines when expanded, which breaks the per-line processing done with awk.
IFS=
DOCKER_STATS_CMD=$(docker stats --no-stream --format "table {{.MemPerc}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.Name}}")

# Sum the memory and CPU percentages over all containers (skip the header line).
SUM_RAM=$(echo $DOCKER_STATS_CMD | tail -n +2 | sed "s/%//g" | awk '{s+=$1} END {print s}')
SUM_CPU=$(echo $DOCKER_STATS_CMD | tail -n +2 | sed "s/%//g" | awk '{s+=$2} END {print s}')
# Convert the summed memory percentage into GiB.
SUM_RAM_QUANTITY=$(LC_NUMERIC=C printf %.2f $(echo "$SUM_RAM*$HOST_MEM_TOTAL*0.01" | bc))

# Output the result.
echo $DOCKER_STATS_CMD
echo -e "${SUM_RAM}%\t\t\t${SUM_CPU}%\t\t${SUM_RAM_QUANTITY}GiB / ${HOST_MEM_TOTAL}GiB\tTOTAL"
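For reference, one way to run it (the filename and the 5-second refresh interval are just examples):
chmod +x docker_stats.sh
./docker_stats.sh
# or refresh it periodically:
watch -n 5 ./docker_stats.sh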

From the documentation that you have linked above,
The docker stats command returns a live data stream for running containers.
To limit data to one or more specific containers, specify a list of container names or ids separated by a space.
You can specify a stopped container but stopped containers do not return any data.
and then furthermore,
Note: On Linux, the Docker CLI reports memory usage by subtracting page cache usage from the total memory usage.
The API does not perform such a calculation but rather provides the total memory usage and the amount from the page cache so that clients can use the data as needed.
According to your question, it looks like you can assume so, but keep in mind that docker stats --all also factors in containers that exist but are not running (stopped containers simply report no data).
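If you only want the total for containers that are actually running, a minimal sketch (untested, reusing the same format string) is to drop --all and sum in one pipeline:
docker stats --no-stream --format "table {{.MemPerc}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.Name}}" \
  | tail -n +2 | sed "s/%//g" \
  | awk '{mem+=$1; cpu+=$2} END {printf "TOTAL (running only): %.2f%% MEM, %.2f%% CPU\n", mem, cpu}'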

Your docker_stats.sh does the job for me, thanks!
I had to add unset LC_ALL somewhere before LC_NUMERIC is used, though, as the former overrides the latter and otherwise I get this error:
"Zeile 19: printf: 1.7989: Ungültige Zahl." (i.e. "line 19: printf: 1.7989: invalid number"). This is probably due to my using a German locale.
There is also a discussion about adding this feature to the "docker stats" command itself.
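For anyone hitting the same locale issue, a minimal sketch of the fix is to neutralise the locale near the top of the script, before the printf runs:
unset LC_ALL        # otherwise LC_ALL overrides LC_NUMERIC
# or force a C locale for the whole script instead:
# export LC_ALL=C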

Thanks for sharing the script! I've updated it so that it depends on DOCKER_MEM_TOTAL instead of HOST_MEM_TOTAL, since Docker has its own memory limit, which can differ from the host's total memory.
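For example, one possible way to derive such a limit is from docker info (DOCKER_MEM_TOTAL is the commenter's variable name; whether .MemTotal is the right source depends on your setup, so treat this as a sketch):
# Total memory visible to the Docker daemon, converted from bytes to GiB
DOCKER_MEM_TOTAL=$(docker info --format '{{.MemTotal}}' | awk '{print $1/1024/1024/1024}')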

Related

Keep looping until RAM is full, then pause and continue the loop once it gets free again

I have to run a command on many files, which I do by looping in bash with an ampersand at the end of each command so the files run in parallel. But I don't want to consume all the RAM on the server when I have 100 or so files. Is there a way to keep the loop going until RAM usage exceeds a certain threshold, pause it at that point, and continue looping to the next files once RAM is free again? Thanks
Yes, you can. Use the free command to determine the RAM usage, then compare the current usage to your threshold. Wrap the condition of the loop in a function to make it a bit more readable:
ramAboveThreshold() {
    local threshold=6000000000                          # threshold in bytes (6 GB)
    local used="$(free -b | awk 'NR == 2 {print $3}')"  # column 3 of the Mem: line is "used"
    (( used > threshold ))
}
Inside your old loop, place another loop that waits for the RAM usage to drop under your threshold:
for i in myFiles/*; do
    while ramAboveThreshold; do
        sleep 5
    done
    myCommand "$i" &
done
free prints not only the used RAM but also the free and total RAM, so the script could be altered to use a threshold like »at least n bytes free« or even »less than 60% used«.
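For example, an untested variant of the function that triggers on »more than 60% used« could look like this:
ramAboveThreshold() {
    local maxPercent=60
    # on the "Mem:" line of free, column 2 is total and column 3 is used
    free -b | awk -v max="$maxPercent" 'NR == 2 {exit ($3 / $2 * 100 > max) ? 0 : 1}'
}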

How to suppress the general information for top command

I wish to suppress the general information for the top command
using a top parameter.
By general information I mean the stuff below:
top - 09:35:05 up 3:26, 2 users, load average: 0.29, 0.22, 0.21
Tasks: 1 total, 0 running, 1 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.3%us, 0.7%sy, 0.0%ni, 96.3%id, 0.8%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3840932k total, 2687880k used, 1153052k free, 88380k buffers
Swap: 3998716k total, 0k used, 3998716k free, 987076k cached
What I do not wish to do is:
top -u user | grep process_name
or
top -bp $(pgrep process_name) | do_something
How can I achieve this?
Note: I am on Ubuntu 12.04 and top version is 3.2.8.
Came across this question today. I have a potential solution: create a top configuration file from inside top's interactive mode while the summary area is disabled. Since this file is also read when top starts in batch mode, it will cause the summary area to be disabled in batch mode too.
Follow these steps to set it up:
Launch top in interactive mode.
Once inside interactive mode, disable the summary area by successively pressing 'l', 'm' and 't'.
Press 'W' (upper case) to write your top configuration file (normally, ~/.toprc)
Exit interactive mode.
Now when you run top in batch mode the summary area will not appear (!)
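For example (just a sanity check; the head count is arbitrary):
top -bn1 | head -n 5    # the summary area is gone; output starts with the PID/USER column header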
Taking it one step further...
If you only want this for certain situations and still want the summary area most of the time, you could use an alternate top configuration file. However, AFAIK, the way to get top to use an alternate config file is a bit funky. There are a couple of ways to do this. The approach I use is as follows:
Create a soft-link to the top executable. This does not have to be done as root, as long as you have write access to the link's location...
ln -s /usr/bin/top /home/myusername/bin/omgwtf
Launch top by typing the name of the link ('omgwtf') rather than 'top'. You will be in normal top interactive mode, but when you save the configuration file it will write to ~/.omgwtfrc, leaving ~/.toprc alone.
Disable the summary area and write the configuration file same as before (press 'l', 'm', 't' and 'W')
In the future, when you're ready to run top without summary info in batch mode, you'll have to invoke top via the link name you created. For example,
% omgwtf -usyslog -bn1
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
576 syslog 20 0 264496 8144 1352 S 0.0 0.1 0:03.66 rsyslogd
%
If you're running top in batch mode (-b -n1), just delete the header lines with sed:
top -b -n1 | sed 1,7d
That will remove the first 7 header lines that top outputs and return only the processes.
It's known as the "Summary Area", and I don't think there is a way to disable it at top startup.
But while top is running, you can disable it by pressing l, t and m.
From man top:
Summary-Area-defaults
'l' - Load Avg/Uptime On (thus program name)
't' - Task/Cpu states On (1+1 lines, see '1')
'm' - Mem/Swap usage On (2 lines worth)
'1' - Single Cpu On (thus 1 line if smp)
This will dump the output, and it can be redirected to any file if needed:
top -n1 | grep -Ev "Tasks:|Cpu\(s\):|Swap:|Mem:"
To monitor a particular process, the following command works for me:
top -sbn1 -p $(pidof <process_name>) | grep $(pidof <process_name>)
And to get information about all processes, you can use the following:
top -sbn1|sed -n '/PID/,/^$/p'
egrep may be good enough in this case, but I would add that perl -lane could do this kind of thing with lightning speed:
top -b -n 1 | perl -lane '/PID/ and $x=1; $x and print' | head -n10
This way you may forget the precise arguments for grep, sed, awk, etc. for good because perl is typically much faster than those tools.
On a Mac you cannot use -b, which is used in many of the other answers.
In that case the command would be top -n1 -l1 | sed 1,10d
That grabs only the first process line and its header (-n1), logs a single sample instead of running interactively (-l1), and then suppresses top's general information, which makes up the first 10 lines.

Script to send alert mail if disk usage exceeds a percentage

I am new to shell scripting and want to implement a script on my server that will automatically send e-mail alerts if:
Disk usage exceeds 90%
Disk usage exceeds 95% (in addition to the previous e-mail)
My filesystem is abc:/xyz/abc and my mount point is /pqr. How can I set this up via scripts?
You can use the df command to check file system usage. As a starting point, you can use the command below (the mount point is matched against the last column, $NF):
df -h | awk -v val=90 '$NF=="/pqr"{x=int($5)>val?1:0;print x}'
The above command prints 1 if the usage is above the threshold, and 0 otherwise. The threshold is set in val.
Note: Please ensure the 5th column of your df output is the use percentage; otherwise use the appropriate column.
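Building on that, here is a rough, untested sketch of a script you could run from cron to send the two alert mails; the mount point, recipient address, and use of the mail command are assumptions to adapt to your setup:
#!/bin/bash
# Sketch: mail an alert when usage of /pqr crosses 90% or 95%.
MOUNT="/pqr"
RECIPIENT="admin@example.com"                              # hypothetical address
# -P keeps each filesystem on one line; column 5 is assumed to be Use%
USAGE=$(df -hP "$MOUNT" | awk 'NR == 2 {print int($5)}')

if [ "$USAGE" -ge 95 ]; then
    echo "CRITICAL: $MOUNT is at ${USAGE}% usage" | mail -s "Disk usage critical on $(hostname)" "$RECIPIENT"
elif [ "$USAGE" -ge 90 ]; then
    echo "WARNING: $MOUNT is at ${USAGE}% usage" | mail -s "Disk usage warning on $(hostname)" "$RECIPIENT"
fi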

Errno::ENOMEM: Cannot allocate memory - cat

I have a job running in production which processes XML files.
There are around 4k XML files, 8 to 9 GB altogether.
After processing we get CSV files as output. I have a cat command which merges all the CSV files into a single file, and I'm getting:
Errno::ENOMEM: Cannot allocate memory
on the cat (backtick) command.
Below are a few details:
System Memory - 4 GB
Swap - 2 GB
Ruby : 1.9.3p286
Files are processed using nokogiri and saxbuilder-0.0.8.
There is a block of code which processes the 4,000 XML files and saves the output as CSV (1 per XML file); sorry, I'm not supposed to share it because of company policy.
Below is the code which merges the output files into a single file:
Dir["#{processing_directory}/*.csv"].sort_by {|file| [file.count("/"), file]}.each {|file|
  `cat #{file} >> #{final_output_file}`
}
I've taken memory consumption snapshots during processing. It consumes almost all of the memory, but it doesn't fail.
It always fails on the cat command.
I guess that on the backtick call it tries to fork a new process which doesn't get enough memory, so it fails.
Please let me know your opinion and any alternative to this.
So it seems that your system is running pretty low on memory, and spawning a shell plus calling cat is too much for the little memory left.
If you don't mind losing some speed, you can merge the files in Ruby with small buffers.
This avoids spawning a shell, and you can control the buffer size.
This is untested, but you get the idea:
buffer_size = 4096
output_file = File.open(final_output_file, 'w')

Dir["#{processing_directory}/*.csv"].sort_by {|file| [file.count("/"), file]}.each do |file|
  f = File.open(file)
  while buffer = f.read(buffer_size)
    output_file.write(buffer)
  end
  f.close
end

output_file.close
You are probably out of physical memory, so double-check that and verify your swap (free -m). In case you don't have any swap space, create some.
Otherwise, if your memory is fine, the error is most likely caused by shell resource limits. You can check them with ulimit -a and change them with ulimit (see: help ulimit), e.g.
ulimit -Sn unlimited && ulimit -Sl unlimited
To make these limits persistent, you can configure them by creating a ulimit settings file with the following shell command:
cat | sudo tee /etc/security/limits.d/01-${USER}.conf <<EOF
${USER} soft core unlimited
${USER} soft fsize unlimited
${USER} soft nofile 4096
${USER} soft nproc 30654
EOF
Or use /etc/sysctl.conf to change the limit globally (man sysctl.conf), e.g.
kern.maxprocperuid=1000
kern.maxproc=2000
kern.maxfilesperproc=20000
kern.maxfiles=50000
I had the same problem, but instead of cat it was sendmail (the mail gem).
I found the problem and a solution: install the posix-spawn gem, e.g.
gem install posix-spawn
and here is an example:
a = (1..500_000_000).to_a
require 'posix/spawn'
POSIX::Spawn::spawn('ls')
This time, creating the child process should succeed.
See also: Minimizing Memory Usage for Creating Application Subprocesses at Oracle.

How to extract specific information from textfile table UNIX/Shell Scripting

I'm really new to UNIX/shell scripting. I'm trying to extract disk usage from numerous servers, so I'm making a shell script that runs
df -g > diskusage.txt to obtain the following table, and I want to extract the values marked with ** below:
Filesystem        GB blocks      Free    %Used     Iused  %Iused  Mounted on
/dev/ibm_lv           84.00     56.81      33%    637452      5%  /usr/IBM
/dev/apps_lv          10.00      9.95    **1%**        5      1%  /usr/apps
/dev/upi_lv          110.00     85.85   **22%**    90654      1%  /usr/app/usr
user08:/backup      2000.00   1611.22      20%    177387      1%  /backup
Depending on the server there are more file systems, but I only want the /usr/apps/usr and /usr/apps disk usage, regardless of the number of filesystems. (/usr/apps/usr and /usr/apps will always be located in the last three rows.)
I'm pretty sure there are simpler ways than reading the last 3 lines, disregarding the last line, and then searching for % on each line.
If there is a better way to extract this data, please let me know.
df -g | awk '/\/usr\/app/ {print $4}'
That gets you the usage percentages (the values you marked), but it doesn't tell you which one goes with which mount. You can always include the mount point in the output, but then you still have to do some parsing to get the numbers out, something like this:
while read used mount; do
    echo "$mount is at $used usage"
done < <(df -g | awk '/\/usr\/app/ {print $4, $NF}')
