Use 'df -h' to check % remaining disk space of a specific folder - bash

I am using 'df -h' command to get disk space details in my directory and it gives me response as below :
Now I want to be able to do this check automatically through some batch or script - so I am wondering, if I will be able to check disk space only for specific folders which I care about, as shown in image - I am only supposed to check for /nas/home that it does not go above 75%.
How can I achieve this ? Any help ?
My work till now:
I am using
df -h > DiskData.txt
... this outputs to a text file
grep "/nas/home" "DiskData.txt"
... which gives me the output:
*500G 254G 247G 51% /nas/home*
Now I want to be able to search for the number previous or right nearby '%' sign (51 in this case) to achieve what I want.

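This command gives you the used percentage for /nas/home. It reads the fourth column because, in your output, the data line starts with the size (the filesystem name is not on that line); when df prints everything on one line (for example with df -P), Use% is the fifth column, so use $5 instead.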
df /nas/home | awk '{ print $4 }' | tail -n 1| cut -d'%' -f1
So you can store the value in a variable and then apply an if/else condition.
var=$(df /nas/home | awk '{ print $4 }' | tail -n 1 | cut -d'%' -f1)
if [ "$var" -gt 75 ]; then
    : # send email here (":" is a no-op placeholder)
fi
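A minimal sketch of the whole check, assuming `df -P` single-line output (so Use% is the fifth column). The helper names `parse_use_pct` and `check_usage` are made up for illustration and are not part of any tool:

```shell
#!/bin/bash
# Sketch: report when a mount point's usage exceeds a limit.
# parse_use_pct and check_usage are hypothetical helper names.

# Read `df -P` output on stdin and print the Use% of the data line as a bare integer.
parse_use_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# check_usage DIR LIMIT: succeed (exit status 0) if usage of DIR is above LIMIT percent.
check_usage() {
    local pct
    pct=$(df -P "$1" | parse_use_pct)
    [ "$pct" -gt "$2" ]
}

# Example call, left commented out so the file can be sourced without side effects:
# if check_usage /nas/home 75; then echo "/nas/home is above 75%"; fi
```

Keeping the parsing in its own function means it can be tried on saved df output before wiring it to a mail command or a cron job.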

Another variant:
df --output=pcent /nas/home | tail -n 1 | tr -d '[:space:]|%'
--output=pcent shows only the percentage column (requires coreutils >= 8.21).

A more concise way without extensive piping could be:
df -h /nas/home | perl -ane 'print substr $F[3],0,-1 if $.==2'
Returns: 51 for your example.

Related

How do I remove the header in the df command?

I'm trying to write a bash command that will sort all volumes by the amount of data they have used and tried using
df | awk '{print $1 | "sort -r -k3 -n"}'
Output:
map
devfs
Filesystem
/dev/disk1s5
/dev/disk1s2
/dev/disk1s1
But this also shows the header called Filesystem.
How do I remove that?
For your specific case, i.e. using awk, @codeforester's answer (using the awk NR (Number of Records) variable) is the best.
In a more general case, in order to remove the first line of any output, you can use the tail -n +N option in order to output starting with line N:
df | tail -n +2 | other_command
This will remove the first line in df output.
Skip the first line, like this:
df | awk 'NR>1 {print $1 | "sort -r -k3 -n"}'
I normally use one of these options, if I have no reason to use awk:
df | sed 1d
The 1d option to sed says delete the first line, then print everything else.
df | tail -n+2
The -n+2 option to tail says start at line 2 and print everything until end of input.
I suspect sed is faster than awk or tail, but I can't prove it.
EDIT
If you want to use awk, this will print every line except the first:
df | awk '{if (FNR>1) print}'
FNR is the File Record Number. It is the line number of the input. If it is greater than 1, print the input line.
Count the lines in the output of df with wc and then subtract one to output a headerless df with tail ...
LINES=$(df|wc -l)
LINES=$((${LINES}-1))
df | tail -n ${LINES}
OK - I see oneliner - Here is mine ...
DF_HEADERLESS=$(LINES=$(df|wc -l); LINES=$((${LINES}-1));df | tail -n ${LINES})
And for formatted output, let's have printf loop over it...
printf "%s\t%s\t%s\t%s\t%s\t%s\n" ${DF_HEADERLESS} | awk '{print $1 | "sort -r -k3 -n"}'
This might help with GNU df and GNU sort:
df -P | awk 'NR>1{$1=$1; print}' | sort -r -k3 -n | awk '{print $1}'
With GNU df and GNU awk:
df -P | awk 'NR>1{array[$3]=$1} END{PROCINFO["sorted_in"]="@ind_num_desc"; for(i in array){print array[i]}}'
Documentation: 8.1.6 Using Predefined Array Scanning Orders with gawk
Removing something from a command output can be done very simply, using grep -v, so in your case:
df | grep -v "Filesystem" | ...
(You can do your awk at the ...)
When you're not sure about caps, small caps, you might add -i:
df | grep -i -v "FiLeSyStEm" | ...
(The switching caps/small caps are meant as a clarification joke :-) )
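Putting the pieces above together, here is a small sketch that drops the header with NR > 1 and sorts outside awk. It assumes `df -P` output and filesystem names without spaces; `sort_by_used` is a hypothetical helper name:

```shell
# sort_by_used (hypothetical name): reads `df -P` output on stdin and prints
# the filesystem names ordered by the Used column, largest first.
sort_by_used() {
    awk 'NR > 1 { print $3, $1 }' |   # skip the header, emit "used name"
        sort -rn |                    # numeric sort on the leading Used value, descending
        awk '{ print $2 }'            # keep only the filesystem name
}

# Usage: df -P | sort_by_used
```

Putting the sort key first avoids having to tell sort which column to use.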

Integer expected error in script

I am trying to write a simple script to monitor disk usage. I keep getting integer expression expected errors at line 5. (THRESHOLD value is intentionally set low for testing.)
Here is my script
#!/bin/bash
CURRENT=$(df -hP | grep / | awk '{ print $5}' | sed 's/%//g')
THRESHOLD=10
if [ "$CURRENT" -gt "$THRESHOLD" ] ; then
mail -s 'Disk Space Alert' john.kenny@ngc.com << EOF
Your root partition remaining free space is critically low. Used: $CURRENT%
EOF
fi
My screen output looks like this
./monitor_disk_space.sh: line 5: [: 7
0
22
1
1
1
1
1
1: integer expression expected
I'm new to bash scripts and especially awk. Any suggestions would be appreciated.
As you can see you're getting a string of newline-separated values from your pipeline. This string is not in itself an integer, so it can't be compared to $THRESHOLD.
Assuming you'd like to send the message if any filesystem is above $THRESHOLD percent full, you may use
df -hP | awk '/\// { sub("%", "", $5); print $5 }' |
while read number; do
if [ "$number" -gt "$THRESHOLD" ]; then
mail ...
break
fi
done
This would pass the values, one by one, into a loop that would compare them against $THRESHOLD. If any value is larger, the mail is sent and the loop exits (via the break).
I also took the liberty of shortening your pipeline to just df+awk, as awk is more than capable of doing the work of both grep and sed.
If you only want to check the root partition, then use df -hP / in the pipeline above.
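If it helps to know which filesystems are over the limit, the same loop can be sketched in pure shell. This assumes `df -P` output; `over_threshold` is a hypothetical helper name:

```shell
# over_threshold LIMIT (hypothetical name): reads `df -P` output on stdin and
# prints "mountpoint percent" for every filesystem above LIMIT percent.
over_threshold() {
    local limit=$1 fs blocks used avail pct mount
    while read -r fs blocks used avail pct mount; do
        pct=${pct%\%}                     # strip the trailing % sign
        case $pct in
            ''|*[!0-9]*) continue ;;      # skip the header and any non-numeric rows
        esac
        if [ "$pct" -gt "$limit" ]; then
            printf '%s %s\n' "$mount" "$pct"
        fi
    done
}

# Usage: df -P | over_threshold 10
```

The case guard is what prevents the "integer expression expected" error: only values made entirely of digits ever reach the -gt test.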
CURRENT=$(df -hP | grep / | awk '{ print $5}' | sed 's/%//g')
df -hP shows a summary of disk usage.
grep / filters out the header line.
awk '{print $5}' prints the 5th column, which is the percentage usage for each file system.
sed 's/%//g' deletes the % character. (There's only one, so the g is unnecessary. I might have used tr -d %, but it doesn't really matter.)
$(...) captures the output of the above -- which is going to be multiple lines of output, each of which should contain an integer.
The -gt operator requires a single integer for each of its arguments.
I think the problem is the grep /, which prints every line containing a / character (that's probably going to be everything except the header line). Your message indicates that you're interested in the root filesystem.
Changing grep / to grep /$ is one simple solution.
But passing / as an argument to the df command, so it displays usage only for the root file system, is even simpler.
Here's how I might do it:
CURRENT=$(df / | awk 'NR == 2 { print $5 }' | tr -d %)
You could incorporate the deletion of the % character into the awk command, but that would be a little more complicated.
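One way to fold the % removal into awk, sketched here as an alternative rather than as the approach above, is numeric coercion: adding 0 to a string such as "52%" makes awk use only its leading digits. The helper name `root_use_pct` is hypothetical:

```shell
# root_use_pct (hypothetical name): reads `df -P /` output on stdin and prints
# the Use% value of the data line as a plain number.
root_use_pct() {
    awk 'NR == 2 { print $5 + 0 }'   # "52%" + 0 evaluates to 52
}

# Usage: CURRENT=$(df -P / | root_use_pct)
```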
Why not do it all in awk?
$ df -hP |
awk -v th=10 '/\// {if($5+0>th)
system("echo Your ... " $5 " | mail -s \"Disk Space Alert\" xxx@example.com")}'

hdd script in bash - strange output

Trying to check the available hdd space via a script:
df -h :
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 18G 9.1G 8.7G 52% /
The commands are :
com=`df -h | awk '{print $5}' | grep % | grep -v Use | sort -n | tail -4 | cut -d % -f1`
echo $com
52 74 100 100
I want to isolate "52" for my checks ,so :
for i in ${com[@]};do
> echo ${com[0]:0:2}
> done
52
52
52
52
OK, I managed to retrieve the correct number for my later checks, but why does the command return the number "52" four times?
Thanks a lot
You don't need a bash loop for this trivial use case. Also, you are storing the output of a chain of awk and grep commands in a plain variable, not in an array.
You just need to use a simple Awk command,
df -h | awk 'NR==1{for(i=1;i<=NF;i++) if ($i == "Use%"){ ind=i; break}} NR==2 {n=split($0, val); used=val[ind]; sub(/%/,"",used); print used}'
The above command first looks up the column which has the Use% stored in the header line and then looks up the actual value in the same column in the next row.
To use the output in a variable store the output of command substitution as below
used_storage=$(df -h | awk 'NR==1{for(i=1;i<=NF;i++) if ($i == "Use%"){ ind=i; break}} NR==2 {n=split($0, val); used=val[ind]; sub(/%/,"",used); print used}')
echo "$used_storage"
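As for why 52 appears four times: com is a plain string, so the unquoted ${com[@]} expands to that one string, which the shell splits into four words. The loop therefore runs four times, and every pass prints ${com[0]:0:2}, the first two characters of the whole string. A small sketch that reproduces this and then uses a real array; the variable names are illustrative and the values are copied from the question:

```shell
#!/bin/bash
# Values as printed in the question; in a real script they would come from df.
com_string='52 74 100 100'

# Reproduce the question's loop: four words, so four iterations, each printing
# the first two characters of the whole string.
repeated=$(for i in ${com_string[@]}; do echo "${com_string[0]:0:2}"; done)

# Store the values in a real array instead, one element per value.
read -r -a com_array <<< "$com_string"
first=${com_array[0]}

echo "$first"
```

With the array, ${com_array[0]} is the first value itself, so no substring slicing is needed.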

Match List of Numbers in For Loop in Bash

I have a script that loops over a curl command, which pulls in data from an API.
LIST_OF_ID=$(curl -s -X POST -d "username=$USER&password=$PASS&action=action" http://link.to/api.php)
for PHONE_NUMBER in $(echo $LIST_OF_ID | tr '_' ' ' | awk '{print $2}');
do
VOIP_ID=$(echo $LIST_OF_ID | tr '_' ' ' | awk '{print $1}')
done
I also have a variable of 16 numbers in the range of "447856321455"
NUMBERS=$(cat << EOF
441111111111
441111111112
441111111113
... etc
EOF
)
The output on the API call is:
652364_441111111112
As you may notice I have taken the output and cut it into 2 parts and put it in a variable.
What I need is to match the 6 digit code from the output where the number in the output, matches with the number in the variable.
I've attempted it using if statements but I can't work my head around the correct way of doing it.
Any help would be appreciated.
Thank you.
I would do it using join rather than a loop in bash. Like this:
curl -s -X POST -d "$PARAMS" "$URL" | sort -t _ -k 2,2 \
| join -t _ -2 2 -o 2.1 <(sort numbers.txt) -
This takes the output from curl, sorted on its second field (join needs each input sorted on the field being joined), and joins it with the sorted contents of numbers.txt (you could use $NUMBERS too). It uses _ as the separator and column 2 of file 2, where file 2 is -, meaning stdin (from curl). It then outputs field 2.1, which is the six-digit ID.
Read why-is-using-a-shell-loop-to-process-text-considered-bad-practice and then do something like this:
curl ... |
awk -v numbers="$NUMBERS" -F'_' '
BEGIN { split(numbers,tmp,/[[:space:]]+/); for (i in tmp) nums[tmp[i]] }
$2 in nums
'
but to be honest I can't really tell what you are trying to do, as the numbers in your sample input don't seem to match each other (what does in the range of "447856321455" mean, how does it relate to $NUMBERS containing 441111111111 through 441111111113, and how does any of that relate to matching the 6 digit code?) and the expected output is missing.
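For illustration only, here is a runnable sketch under one reading of the question: print the six-digit ID of every API line whose number appears in $NUMBERS. That reading is an assumption, and `match_ids` is a hypothetical helper name:

```shell
# match_ids NUMBERS (hypothetical name): reads ID_NUMBER lines on stdin and
# prints the ID of each line whose number is listed in NUMBERS.
match_ids() {
    local numbers
    # Flatten newlines to spaces so the value is safe to pass with awk -v.
    numbers=$(printf '%s' "$1" | tr '\n' ' ')
    awk -v numbers="$numbers" -F'_' '
        BEGIN { n = split(numbers, tmp, " "); for (i = 1; i <= n; i++) nums[tmp[i]] }
        $2 in nums { print $1 }
    '
}

# Usage: curl ... | match_ids "$NUMBERS"
```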

print line number of output in shell script

I have a script that prints out the average time when pinging a server, shown below:
ping -c3 "${I}" | tail -1 | awk '{print $4}' | cut -d '/' -f 2 | sed 's/$/\tms/'
How can I add the line number to output of the script above when pinging a list of servers ??
my actual output when pinging list of 3 host is:
6.924 ms
100.099 ms
7.756 ms
I want the output to be like this:
1,6.924 ms
2,100.099 ms
3,7.756 ms
so that this can be read by excel :)
Thanks in advance!
Pipe your output through perl:
echo -e 'aa\nbb' | perl -ne 'print $., ",", $_'
Output:
1,aa
2,bb
Is that what you want?
C=1
for I in 'host1' 'host2' 'host3'
do
ping -c3 "${I}" | tail -1 | awk '{print $4}' | cut -d '/' -f 2 | echo "$C,$(sed 's/$/\tms/')"
C=$((C+1))
done
The standard tool for line numbering is nl. Pipe your output to nl with the option -s, (a comma as the separator). That is:
for I; do
ping -c3 "${I}" | awk -F/ 'END{print $5, "\tms"}'
done | nl -s,
Since you haven't specified how the list is generated, I'm just showing the case where the list of hosts to be pinged is given on the command line. Note that this introduces leading whitespace before the line number, so you might want to filter that through sed to remove.
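A minimal sketch of that sed filter, wrapped in a hypothetical helper named `number_lines`:

```shell
# number_lines (hypothetical name): prefixes each non-empty stdin line with
# "N," and removes the padding that nl puts before the number.
number_lines() {
    nl -s, | sed 's/^ *//'
}

# Usage: for I; do ping ...; done | number_lines
```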
Of course, this script is spending most of its time waiting for the ping, and you probably want to speed it up by running the pings in parallel. In that case, it is better to add the line number at the beginning so you can get a stable sort in the output:
line=1
{ for I; do ping -c3 $I | awk -F/ 'END{
printf( "%d,%s\tms\n", line,$5 )}' line=$line &
: $((line +=1 ))
done; wait; } | sort -n
In this case, the wait is not necessary since sort will block until all of the pings have closed their output, but the wait becomes necessary if you add any processes in the pipeline before the sort that do not necessarily wait for all of their input before doing any processing, so it is a good practice to leave the wait in place.
