Bash script that monitors a disk partition's usage - bash

df shows
-bash-4.1# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda3 1918217320 1783986384 36791092 98% /
tmpfs 16417312 0 16417312 0% /dev/shm
/dev/sda1 482214 148531 308784 33% /boot
/dev/sdb1 1922858352 1373513440 451669312 76% /disk2
I need to write a bash function that returns 1 if any partition becomes 100% full.
How can this be done? What commands can I use to parse the output of df?

This should do it:
disks_space() {
! df -P | awk '{print $5}' | grep -Fqx '100%'
}
In other words, check if any of the lines in the fifth column of the POSIX df output contains the exact string "100%".
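If a threshold other than exactly 100% is wanted, the same idea can be generalised. The following is a minimal sketch, not part of the original answer: any_partition_at_or_above is a hypothetical helper name, and it assumes POSIX df -P output arrives on standard input, which also makes it easy to try against canned data.

```shell
#!/bin/bash
# Hypothetical helper (the name is an assumption for illustration).
# Reads `df -P` style output on stdin; returns 1 if any filesystem's
# Use% is at or above the threshold given as $1 (default 100), else 0.
any_partition_at_or_above() {
    local threshold=${1:-100}
    awk -v t="$threshold" '
        BEGIN { found = 0 }
        NR > 1 {                 # skip the header line
            pct = $5
            sub(/%$/, "", pct)   # strip the trailing percent sign
            if (pct + 0 >= t) found = 1
        }
        END { exit found }
    '
}
```

Typical use would be df -P | any_partition_at_or_above 95 || echo "a partition is nearly full". The comparison is numeric, so 98% and 100% are both caught by a threshold of 95.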

The problem with a percentage threshold is that on a terabyte disk, 95% full may still leave many gigabytes free. Refer to the bottom script for a check on actual free disk space: the 100 at the end of its example invocation makes it alert when less than 100 MB is left on a partition.
diskspace.sh
#!/bin/bash
# set -x
# Shell script to monitor or watch the disk space
# It will send an email to $ADMIN if the used percentage of a partition is >= 90%.
# -------------------------------------------------------------------------
# Set admin email so that you can get email.
ADMIN="root"
# set alert level 90% is default
ALERT=90
# Exclude list of unwanted monitoring; if there are several partitions, use "|" to separate them.
# An example: EXCLUDE_LIST="/dev/hdd1|/dev/hdc5"
EXCLUDE_LIST="/auto/ripper"
#
#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
#
function main_prog() {
while read output;
do
echo $output
usep=$(echo $output | awk '{ print $1}' | cut -d'%' -f1)
partition=$(echo $output | awk '{print $2}')
if [ $usep -ge $ALERT ] ; then
if [ "$partition" == "/var" ]; then
# echo "Running out of space \"$partition ($usep%)\" on server $(hostname), $(date)"
echo "Running out of space \"$partition ($usep%)\" on server $(hostname), $(date)" | mail -s "Alert: Almost out of disk space $usep%" $ADMIN
# Extra bits you may wish to do -
#for FILE in `find $partition -size +1G -print`
#do
# echo $FILE
# DATE=`date +%Y-%m-%d_%H%M`
# filename=`echo ${FILE##*/}`
# mkdir /mnt/san/$hostname
# echo cp $FILE /mnt/san/$(hostname)/$filename-$DATE
# #echo > $FILE
#done
fi
fi
done
}
if [ "$EXCLUDE_LIST" != "" ] ; then
df -hP | grep -vE "^[^/]|tmpfs|cdrom|${EXCLUDE_LIST}" | awk '{print $5 " " $6}' | main_prog
else
df -hP | grep -vE "^[^/]|tmpfs|cdrom"| awk '{print $5 " " $6}' | main_prog
fi
Or you could use this style of check I put in place for nagios (using snmp to connect to a remote host)
snmp_remote_disk_auto
#!/bin/bash
# This script takes:
# <host> <community> <megs>
snmpwalk="/usr/bin/snmpwalk"
snmpget="/usr/bin/snmpget"
function usage() {
echo "$0 localhost public 100"
echo "where localhost is server"
echo "public is snmp pass"
echo "100 is when it reaches below a 100Mb"
echo "-----------------------------------"
echo "define threshold below limit specific for partitions i.e. boot can be 50mb where as /var I guess we want to catch it at around 1 gig so"
echo "$0 localhost public 1024"
}
server=$1;
pass=$2
limit=$3;
errors_found="";
partitions_found="";
lower_limit=10;
graphtext="|"
if [ $# -lt 3 ]; then
usage;
exit 1;
fi
# takes <size> <used> <allocation>
calc_free() {
echo "$1 $2 - $3 * 1024 / 1024 / p" | dc
}
for partitions in $($snmpwalk -v2c -c $pass -Oq $server hrStorageDescr|grep /|egrep -v "(/mnt|/home|/proc|/sys)"|awk '{print $NF}'); do
if [[ $partitions =~ /boot ]]; then
limit=$lower_limit;
fi
if result=$($snmpwalk -v2c -c $pass -Oq $server hrStorageDescr | grep "$partitions$"); then
index=$(echo $result | sed 's/.*hrStorageDescr//' | sed 's/ .*//')
args=$($snmpget -v2c -c $pass -Oqv $server hrStorageSize$index hrStorageUsed$index hrStorageAllocationUnits$index | while read oid j ; do printf " $oid" ; done)
free=$(calc_free$args)
back_count=$(echo $partitions|grep -o "/"|wc -l)
if [[ $back_count -ge 2 ]]; then
gpartition=$(echo "/"${partitions##*/})
else
gpartition=$partitions;
fi
if [ "$free" -gt "$limit" ]
then
graphtext=$graphtext$gpartition"="$free"MB;;;0 "
#graphtext=$graphtext$partitions"="$free"MB;;;0 "
partitions_found=$partitions_found" $partitions ($free MB)"
else
graphtext=$graphtext$gpartition"="$free"MB;;;0 "
#graphtext=$graphtext$partitions"="$free"MB;;;0 "
errors_found=$errors_found" $partitions ($free MB)"
fi
else
graphtext=$graphtext$gpartition"="0"MB;;;0 "
#graphtext=$graphtext$partitions"="0"MB;;;0 "
errors_found=$errors_found" $partitions does_not_exist_or_snmp_is_not_responding"
fi
done
if [ "$errors_found" == "" ]; then
echo "OK: $partitions_found$graphtext"
exit 0
else
echo "CRITICAL: $errors_found$graphtext";
exit 2;
fi
./snmp_remote_disk_auto localhost public 100
OK: / (1879 MB) /var (2281 MB) /tmp (947 MB) /boot (175 MB)|/=1879MB;;;0 /var=2281MB;;;0 /tmp=947MB;;;0 /boot=175MB;;;0
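For a local machine, the same "alert on free megabytes rather than percentage" idea does not need SNMP. This is a minimal sketch under stated assumptions: low_space_mounts is a hypothetical name, and the function expects df -Pk output (1K blocks) on standard input.

```shell
#!/bin/bash
# Hypothetical helper: reads `df -Pk` output on stdin and prints
# "<mount point> <free MB>" for each filesystem with less than $1 MB available.
low_space_mounts() {
    local limit_mb=$1
    awk -v limit="$limit_mb" '
        NR > 1 {
            free_mb = int($4 / 1024)    # column 4 is "Available" in 1K blocks
            if (free_mb < limit) print $6, free_mb
        }
    '
}
```

Typical use would be df -Pk | low_space_mounts 100. Fed the df output from the question with a limit of 1024, it reports only /boot, because / at 98% still has roughly 35 GB available.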

Not a huge fan of excessive greps and awks, as they can introduce errors over time.
I would just get the information for the folders that matter. Below is a sample using stat, which gives the available bytes on a folder's filesystem and converts them to MB (10**6). I roughly tested this on my RHEL 6.x system.
folder_x_mb=$(($(stat -f --format="%a*%S" /folder_x)/10**6))
folder_y_mb=$(($(stat -f --format="%a*%S" /folder_y)/10**6))
folder_z_mb=$(($(stat -f --format="%a*%S" /folder_z)/10**6))
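The repeated lines can be folded into one function. This is a sketch, assuming GNU stat (the -f and --format options behave differently on BSD and macOS); free_mb is a hypothetical name. It uses %S, the fundamental block size that GNU stat documents as the unit for block counts.

```shell
#!/bin/bash
# Hypothetical helper: prints the space available to unprivileged users on the
# filesystem holding path $1, in MB (10**6 bytes). Assumes GNU stat.
free_mb() {
    local blocks_times_size
    # Produces a string such as "123456*4096" (available blocks * block size).
    blocks_times_size=$(stat -f --format='%a*%S' "$1") || return 1
    echo $(( $blocks_times_size / 10**6 ))
}
```

A check then reads as if [ "$(free_mb /var)" -lt 100 ]; then echo "less than 100 MB left"; fi.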

How about something like:
df | perl -wne 'if(/(\d+)%\s+(.*)/){print "$2 at $1%\n" if $1>90}'
You can change the threshold and instead of printing you can just exit:
df | perl -wne 'if(/(\d+)%\s+(.*)/){exit 1 if $1>99}'

Here is a simple script that checks whether any disk has reached its maximum capacity; if so, it outputs 1.
#!/bin/sh
CHECK=$(df -Ph | grep '100%' | xargs echo | cut -d' ' -f5)
if [ "$CHECK" = "100%" ]
then
echo 1
else
echo 0
fi

Try this: df -Ph | grep -v "Use%" | sed 's/%//g' | awk '$5 > LIMIT {print $1,$2,$3,$4,$5"%";}' | column -t
It will return all df -Ph entries whose usage exceeds LIMIT (replace LIMIT with a number).
For example, on my workstation, df -Ph returns:
Filesystem Size Used Avail Use% Mounted on
/dev/cciss/c0d0p1 92G 32G 56G 37% /
shmfs 98G 304K 98G 1% /dev/shm
192.168.1.1:/apache_cache 2.7T 851G 1.9T 32% /media/backup
/dev/dm-4 50G 49G 1.1G 98% /lun1
/dev/dm-7 247G 30G 218G 12% /lun2
Let's say I want to list the mount points that exceed 20% of capacity.
I use df -Ph | grep -v "Use%" | sed 's/%//g' | awk '$5 > 20 {print $1,$2,$3,$4,$5"%";}' | column -t, and it returns the following:
/dev/cciss/c0d0p1 92G 32G 56G 37% /
192.168.1.1:/apache_cache 2.7T 851G 1.9T 32% /media/backup
/dev/dm-4 50G 49G 1.1G 98% /lun1
The column -t part is here purely for the output to be readable.
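The grep and sed stages can also be folded into awk, with the limit passed in as a variable instead of edited into the command. A minimal sketch, assuming df -Ph output on standard input; over_limit is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: reads `df -Ph` output on stdin and prints the rows whose
# Use% exceeds the limit passed as $1.
over_limit() {
    # NR > 1 skips the header; "$5 + 0" turns "37%" into the number 37.
    awk -v limit="$1" 'NR > 1 && $5 + 0 > limit { print }'
}
```

Typical use would be df -Ph | over_limit 20 | column -t.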

Related

df -h if freespace equals then | bash

The error I am getting comes from the df command on macOS. I would like to specify a value in gigabytes, and perhaps allow the user to choose a different drive if / is full.
destination="$HOME/Desktop/sandbox"
if [ $(df -h --output=avail /|tail -n1) -lt 300000 ]; then
echo "There is less than 300GB available..." ;
exit
else
for files in *.tar ; do echo copying "$files" ; cp "$files" "$destination" ; read -n 1 -p "Press any key..." ; done
fi
Not sure if df -h / | tail -1 | awk '{print $4}' | sed 's/..$//' is a good option
destination="$HOME/Desktop/sandbox"
freespace="$(df -h / | tail -1 | awk '{print $4}' | sed 's/..$//')"
if [ "$freespace" -lt 300 ]; then
echo "There is less than 300GB available..." ;
exit
else
for files in *.sh ; do echo copying "$files" ; cp "$files" "$destination" ; read -n 1 -p "Press any key..." ; done
fi
The problem is that you are using df -h, which means "human readable" and always appends a unit after the number (M, G, k...). There are many ways to cut it out with grep, awk, cut or sed, but perhaps the best way is to use a different option, e.g. df -m, to get the output in megabytes, and scale the required space to match.
I would also pass $destination to df as an argument so that it reports only the filesystem holding that directory, instead of relying on it always being the last line of the output.
Try something like this:
destination="$HOME/Desktop/sandbox"
free_space=$(df -m "$destination" | awk 'NR==2 {print $4}')
if [ "$free_space" -lt $((300 * 1024)) ]; then
echo "There is less than 300GB available..." ;
exit
else
for files in *.tar; do
echo copying "$files"
cp "$files" "$destination"
read -n 1 -p "Press any key..."
done
fi
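The check can also be wrapped in a function that takes the path and the required gigabytes, which makes it simple to try another drive when the first is too full. This is a sketch, not the answerer's code: has_free_gb is a hypothetical name, and it assumes df -Pk reports 1K blocks in column 4 on both Linux and macOS.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the filesystem holding path $1 has at
# least $2 GB available. Assumes `df -Pk` reports 1K blocks in column 4.
has_free_gb() {
    local path=$1 need_gb=$2 avail_kb
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    # Fail if df printed nothing usable, otherwise compare in KB.
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge $(( need_gb * 1024 * 1024 )) ]
}
```

It would be used as if has_free_gb "$destination" 300; then ...; else echo "There is less than 300GB available..."; fi.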

Bash - disk utilization notification

This script should output a warning notification when utilization of the main disk is over 50%, but it produces no output. My disk is currently at 60%, so in theory it should work.
I have added an else statement to identify whether the loop is working, but the else statement isn't triggered.
I get no error, so it's hard to identify where exactly I have gone wrong.
#!/bin/bash
df -H | grep /dev/sda2 | awk '{ printf "%d", $5}' > diskOutput.txt
input="diskOutput.txt"
while IFS= read -r line
do
if [ $line -gt 50 ]
then
up="`uptime | cut -b 1-9`"
output="WARNING UTILISATION $line - $up"
echo "$output"
else
echo "no-in"
fi
done < $input
#rm diskOutput.txt
echo "finished"
Try this.
#!/bin/bash
df -H | grep /dev/sda2 | awk '{ printf "%d", $5}' > diskOutput.txt
echo "" >>diskOutput.txt
input="diskOutput.txt"
while IFS= read -r line
do
if [ $line -gt 50 ]
then
up="`uptime | cut -b 1-9`"
output="WARNING UTILISATION $line - $up"
echo "$output"
else
echo "no-in"
fi
done < $input
#rm diskOutput.txt
echo "finished"
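IFS= sets the internal field separator to an empty string, so read does not trim leading or trailing whitespace; that part is fine:
while IFS= read -r line
The problem is how the file is created: printf "%d" writes only the digits, with no trailing newline. read returns a non-zero status when it reaches end of file before a newline, so the loop body never runs, even though $line has been set. Appending a newline with echo "" >> diskOutput.txt fixes this.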
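A common alternative to patching the file is to make the loop tolerate a final line that lacks a newline. read returns non-zero in that case but still fills the variable, so the loop can test the variable as well. A minimal sketch; count_lines is a hypothetical name used only to make the behaviour visible.

```shell
#!/bin/bash
# Hypothetical helper: counts the lines on stdin, including a final line
# that is not terminated by a newline.
count_lines() {
    local n=0 line
    # "|| [ -n "$line" ]" keeps the last, unterminated line from being dropped.
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
    done
    echo "$n"
}
```

With this idiom, printf '60' | count_lines prints 1, whereas a plain while IFS= read -r line loop would run zero times on the same input.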

Bash variable not saving new data given?

I wrote a Bash function:
CAPACITY=0
USED=0
FREE=0
df | grep /$ | while read LINE ; do
CAPACITY=$(echo "${CAPACITY}+$(echo ${LINE} | awk '{print $2}')" | bc )
USED="$[${USED}+$(echo ${LINE} | awk '{print $3}')]"
FREE="$[${FREE}+$(echo ${LINE} | awk '{print $4}')]"
done
echo -e "${CAPACITY}\t${USED}\t${FREE}"
for i in /home /etc /var /usr; do
df | grep ${i}[^' ']*$ | while read LINE ; do
CAPACITY=$[${CAPACITY}+$(echo ${LINE} | awk '{print $2}')]
USED=$[${USED}+$(echo ${LINE} | awk '{print $3}')]
FREE=$[${FREE}+$(echo ${LINE} | awk '{print $4}')]
done
done
if [ "${1}" = "explode?" ] ; then
if [ $[${USED}*100/${CAPACITY}] -ge 95 ] ; then
return 0
else
return 1
fi
elif [ "${1}" = "check" ] ; then
echo -e "Capacity = $(echo "scale=2; ${CAPACITY}/1024/1024" | bc)GB\nUsed = $(echo "scale=2; ${USED}/1024/1024" | bc)GB\nAvaliable = $(echo "scale=2; ${FREE}/1024/1024" | bc)GB\nUsage = $(echo "scale=2; ${USED}*100/${CAPACITY}" | bc)%"
fi
}
Note the 2 different methods to store the data in the CAPACITY/USED/FREE vars in the first 'while' loop and the echo right after it to debug the code.
Seems as though while running the script the data inputted into the variables in the loop isn't saved.
Here's the output while running the script with 'set -x':
+ CAPACITY=0
+ USED=0
+ FREE=0
+ df
+ grep '/$'
+ read LINE
++ bc
+++ echo /dev/vda1 52417516 8487408 43930108 17% /
+++ awk '{print $2}'
++ echo 0+52417516
+ CAPACITY=52417516
++ echo /dev/vda1 52417516 8487408 43930108 17% /
++ awk '{print $3}'
+ USED=8487408
++ echo /dev/vda1 52417516 8487408 43930108 17% /
++ awk '{print $4}'
+ FREE=43930108
+ read LINE
+ echo -e '0\t0\t0'
0 0 0
Why the heck don't the variables store the new numbers even though it clearly shows a new number was stored?
Why ... don't the variables store the new numbers even though it clearly shows a new number was stored?
Because the right part of | is run in a subshell, so the changes are not propagated to the parent shell.
$ a=1
$ echo a=$a
a=1
$ true | { a=2; echo a=$a; }
a=2
$ echo a=$a
a=1
For more info, read the BashFAQ entry "I set variables in a loop that's in a pipeline. Why do they disappear after the loop terminates?". The common solution is to use a process substitution:
while IFS= read -r line; do
blabla
done < <( blabla )
The $[...] syntax is deprecated; use $((...)) instead (see the Bash Hackers Wiki page on obsolete and deprecated syntax).
In bash, just use the arithmetic command (( )) for number comparisons: if (( used * 100 / capacity >= 95 )); then.
By convention, upper case variable names are used for exported variables. Use lower case variable names for script-local variables.
There is no need to grep the output of df. Just run df /home /etc /var /usr. Or really just read -r capacity used free < <(df /home /etc /var /usr | awk 'NR > 1 { capacity += $2; used += $3; free += $4 } END{print capacity, used, free}').
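The difference between the two loop styles is easy to see side by side. The sketch below uses canned rows instead of live df output so the result is reproducible; sample_rows and its values are made up for illustration, and it assumes bash with its default settings (lastpipe disabled).

```shell
#!/bin/bash
# Canned sample rows (device, 1K-blocks, used); values are illustrative.
sample_rows() {
    printf '%s\n' '/dev/vda1 52417516 8487408' '/dev/vdb1 1000 10'
}

# Pipeline: the while loop runs in a subshell, so the sum is lost.
piped_total=0
sample_rows | while read -r _ blocks _; do
    piped_total=$((piped_total + blocks))
done

# Process substitution: the loop runs in the current shell, so the sum survives.
total=0
while read -r _ blocks _; do
    total=$((total + blocks))
done < <(sample_rows)

echo "pipeline: $piped_total, process substitution: $total"
```

With default bash settings this prints pipeline: 0, process substitution: 52418516.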

CPU monitoring script not triggering properly

I was wondering if anyone could help with the reasons that this is not triggering properly
HOSTNAME=`hostname -s`
LOAD=25.00
CAT=/bin/cat
MAILFILE=/home/jboss/monitor.mail
MAILER=/bin/mail
mailto="bob#bob.bob"
CPU_LOAD=`sar -P ALL 1 10 |grep 'Average.*all' |awk -F" " '{print 100.0 -$NF}'`
if [[ $CPU_LOAD > $LOAD ]];
then
PROC=`ps -eo pcpu,pid -o comm= | sort -k1 -n -r | head -1`
echo -e "Please check processes on ${HOSTNAME} the value of cpu load is $CPU_LOAD%.
Highest process is: $PROC" > $MAILFILE
$CAT $MAILFILE | $MAILER -s "CPU Load is on ${HOSTNAME} is $CPU_LOAD %" $mailto
fi
This seems to work properly for the sar and ps parts; however, I'm still getting alerts emailed for values like "CPU Load is 3.18%". Unless I'm missing something, it shouldn't trigger unless the load is greater than 25%.
It seems to behave as if the threshold were 2.5% instead. Any suggestions?
Thank you
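Inside [[ ]], the > operator compares strings lexicographically, so "3.18" sorts after "25.00" because "3" comes after "2":
if [[ $CPU_LOAD > $LOAD ]];then
The numeric operator is -gt:
if [[ $CPU_LOAD -gt $LOAD ]]; then
However, bash arithmetic only handles integers, so -gt fails on values such as 3.18. To keep some precision, scale the value to an integer first, for example: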
cpu_limit=25
# read the 5min load-average straight from the special file on /proc
read -r _ load_avg _ </proc/loadavg
# multiply by 100 for precision
load_avg=$(bc <<<"scale=0; $load_avg * 100 / 1")
# compare numbers with (( )) instead, scaling the limit by the same factor
if (( load_avg > cpu_limit * 100 )); then
...
fi
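Another option is to hand the floating-point comparison to awk and keep the values unscaled. A minimal sketch; float_gt is a hypothetical helper name, and the two if statements only demonstrate the difference between string and numeric comparison.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if $1 is numerically greater than $2.
# awk does the floating-point comparison that bash itself cannot.
float_gt() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !(a + 0 > b + 0) }'
}

# String comparison: true, because "3" sorts after "2".
if [[ 3.18 > 25.00 ]]; then echo "string: 3.18 sorts after 25.00"; fi

# Numeric comparison: false, as intended.
if float_gt 3.18 25.00; then echo "numeric: greater"; else echo "numeric: not greater"; fi
```

In the original script the test would then read if float_gt "$CPU_LOAD" "$LOAD"; then.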
Try this code - (Tested - working fine)
$ cat f.sh
HOSTNAME=$(hostname -s)
LOAD=25.00
MAILFILE=$HOME/a.txt
MAILER=/bin/mailx
mailto="vipinkumarr89#gmail.com"
CPU_LOAD=$(sar -P ALL 1 10 |grep 'Average.*all' |awk -F" " '{print 100.0 -$NF}')
if (( $(echo "$CPU_LOAD > $LOAD" | bc -l) )); then
{
PROC=$(ps -eo pcpu,pid -o comm= | sort -k1 -n -r | head -1)
echo -e "Please check processes on ${HOSTNAME} the value of cpu load is $CPU_LOAD%.
Highest process is: $PROC" > $MAILFILE
cat $MAILFILE | $MAILER -s "CPU Load is on ${HOSTNAME} is $CPU_LOAD %" $mailto
}
fi

Bash - Memory usage

I have a problem that I can't solve, so I've come to you.
I need to write a program that reads all processes, groups them by user, and displays how much memory each user is using.
For example:
user1: 120MB
user2: 300MB
user3: 50MB
total: 470MB
I was thinking of doing this with the ps aux command, then extracting the PID and user with awk. Then with pmap I just need to get the total memory usage of each process.
It's just a little update: users are now selected automatically.
#!/bin/bash
function mem_per_user {
# take username as only parameter
local user=$1
# get all pid's of a specific user
# you may elaborate the if statement in awk to follow your own rules
pids=`ps aux | awk -v username=$user '{if ($1 == username) {print $2}}'`
local totalmem=0
for pid in $pids
do
mem=`pmap $pid | tail -1 | \
awk '{pos = match($2, /([0-9]*)K/, mem); if (pos > 0) print mem[1]}'`
# when variable properly set
if [ ! -z $mem ]
then
totalmem=$(( totalmem + $mem))
fi
done
echo $totalmem
}
total_mem=0
for username in `ps aux | awk '{ print $1 }' | tail -n +2 | sort | uniq`
do
per_user_memory=0
per_user_memory=$(mem_per_user $username)
if [ "$per_user_memory" -gt 0 ]
then
total_mem=$(( $total_mem + $per_user_memory))
echo "$username: $per_user_memory KB"
fi
done
echo "Total: $total_mem KB"
Try this script, which may solve your problem:
#!/bin/bash
function mem_per_user {
# take username as only parameter
local user=$1
# get all pid's of a specific user
# you may elaborate the if statement in awk to follow your own rules
pids=`ps aux | awk -v username=$user '{if ($1 == username) {print $2}}'`
local totalmem=0
for pid in $pids
do
mem=`pmap $pid | tail -1 | \
awk '{pos = match($2, /([0-9]*)K/, mem); if (pos > 0) print mem[1]}'`
# when variable properly set
if [ ! -z $mem ]
then
totalmem=$(( totalmem + $mem))
fi
done
echo $totalmem
}
total_mem=0
for i in `seq 1 $#`
do
per_user_memory=0
eval username=\$$i
per_user_memory=$(mem_per_user $username)
total_mem=$(( $total_mem + $per_user_memory))
echo "$username: $per_user_memory KB"
done
echo "Total: $total_mem KB"
Best regards!
You can access the shell commands in python using the subprocess module. It allows you to spawn subprocesses and connect to the out/in/error. You can execute the ps -aux command and parse the output in python.
check out the docs here
Here is my version. I think Tim's version does not work correctly: the values in KB are too large. I think the RSS column from the pmap -x command should be used to give a more accurate value. Note, however, that you can't always get correct values, because processes can share memory. See: A way to determine a process's "real" memory usage, i.e. private dirty RSS?
#!/bin/bash
if [ "$(id -u)" != "0" ]; then
echo "WARNING: you have to run as root if you want to see all users"
fi
echo "Printing only users whose current memory usage is > 0 kilobytes"
all=0
for username in `ps aux | awk '{ print $1 }' | tail -n +2 | sort | uniq`
do
pids=`ps aux | awk -v u="$username" '$1 == u {print $2}'`
total_memory=0
for pid in $pids
do
process_mem=`pmap -x $pid | tail -1 | awk -F" " '{print $4}'`
if [ ! -z $process_mem ]
then #don't try to add if string has no length
total_memory=$((total_memory+$process_mem))
fi
done
#print only those that use any memory
if [ $total_memory -gt 0 ]
then
total_memory=$((total_memory/(1024)))
echo "$username : $total_memory MB"
all=$((all+$total_memory))
fi
done
echo "----------------------------------------"
echo "Total: $all MB"
echo "WARNING: Use at your own risk"
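If RSS is an acceptable measure, the per-process pmap calls can be avoided by summing the rss column that ps already reports. This is a sketch under stated assumptions: sum_rss_by_user is a hypothetical name, the input is "user rss-in-KB" pairs on standard input, and the caveat about shared memory applies just as much, since pages shared between processes are counted once per process.

```shell
#!/bin/bash
# Hypothetical helper: reads "<user> <rss in KB>" lines on stdin and prints a
# per-user total (sorted by user name) followed by a grand total.
sum_rss_by_user() {
    awk '
        { rss[$1] += $2; total += $2 }
        END {
            for (user in rss) printf "%s: %d KB\n", user, rss[user] | "sort"
            close("sort")    # flush the sorted user lines before the total
            printf "total: %d KB\n", total
        }
    '
}
```

Typical use would be ps -e -o user= -o rss= | sum_rss_by_user, run as root to see every user's processes.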

Resources