hdd script in bash - strange output - bash

Trying to check the available hdd space via a script:
df -h :
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 18G 9.1G 8.7G 52% /
The commands are :
com=`df -h | awk '{print $5}' | grep % | grep -v Use | sort -n | tail -4 | cut -d % -f1`
echo $com
52 74 100 100
I want to isolate "52" for my checks, so:
for i in ${com[@]};do
> echo ${com[0]:0:2}
> done
52
52
52
52
OK, I managed to retrieve the correct number for my later checks, but why does the command return the number "52" four times?
Thanks a lot

You don't need a bash loop for such a trivial use case. Note also that your chain of awk, grep and other commands stores its output in a plain string variable, not in an array.
A single Awk command is enough:
df -h | awk 'NR==1{for(i=1;i<=NF;i++) if ($i == "Use%"){ ind=i; break}} NR==2 {n=split($0, val); used=val[ind]; sub(/%/,"",used); print used}'
The above command first finds the column labelled Use% in the header line and then reads the value in that same column from the next row.
To use the output in a variable, capture it with command substitution as below:
used_storage=$(df -h | awk 'NR==1{for(i=1;i<=NF;i++) if ($i == "Use%"){ ind=i; break}} NR==2 {n=split($0, val); used=val[ind]; sub(/%/,"",used); print used}')
echo "$used_storage"
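As for why "52" is printed four times, here is a minimal bash sketch using the sample values from the question. It assumes com holds the string "52 74 100 100"; the array name values is only an illustration and does not come from the original post.

```shell
# Sample value copied from the question; com is a plain string, not an array.
com="52 74 100 100"

# Unquoted, ${com[@]} expands to the whole string and is split into four words,
# so the loop body runs four times. ${com[0]:0:2} is always the first two
# characters of that same string, hence "52" on every pass.
for i in ${com[@]}; do
  echo "i=$i first_two=${com[0]:0:2}"
done

# Assumed alternative: read the words into a real array and index it.
read -r -a values <<< "$com"
echo "${values[0]}"
```

With a real array, ${values[0]} is the first element and no substring trick is needed.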

Related

How to extract specific data from grep command in bash?

I'm trying to write my own script to tell me if I've used more than 500 MiB of my data.
I'm using vnstat -d for the information about data usage.
vnstat -d Output here
Output should be:
Only from the "Total column"
Only have values greater than 500.
I want only the values from the "total" column, but my output lists data from all the columns.
The following should make this clearer:
#!/bin/bash
for i in `vnstat -d | grep -a [0-9] `; # get numerical values in i (-a flag as vnstat outputs in binary)
do
NUMBER=$(echo $i | grep -o '[5-9][0-9][0-9]'); # store values >500 in a var called NUMBER
echo $NUMBER;
done;
I'm a self-learning newb here so please try not to bash (pun) me.
Current output which I'm receiving from above script:
600
654
925
884
923
871
967
868
My desired output should be:
654
923
967
Simplified:
#!/bin/bash
if [[ $(( $(vnstat -d --oneline|cut -d';' -f6|cut -d. -f1|paste -sd '+') )) -ge 500 ]];then
echo 500 Mb reached
fi
(The script takes the specified field from the one-line, CSV-like output for each interface, keeps only the integer part of each number, and sums them. It then checks whether the sum is greater than or equal to 500 and, if so, prints a message.)
Note:
-f6 will parse the "total for today" traffic
you can replace it with:
-f4 = rx for today
-f5 = tx for today
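The summing step can be tried without vnstat. The sketch below feeds three made-up per-interface totals (hypothetical values, not real vnstat output) through the same cut, paste and arithmetic-expansion chain; the variable names expr and sum are illustrative only.

```shell
# Hypothetical per-interface totals, one per line (not real vnstat output).
# cut keeps the integer part, paste joins the lines with '+'.
expr=$(printf '%s\n' 120.5 300.2 95.9 | cut -d. -f1 | paste -sd+ -)
echo "$expr"

# Arithmetic expansion evaluates the joined expression.
sum=$(( $expr ))
echo "$sum"

if [ "$sum" -ge 500 ]; then
  echo "500 Mb reached"
fi
```

The explicit "-" operand tells paste to read standard input, which some non-GNU implementations require.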
You want to parse a pipe-delimited table and check only one specific column. There are tools more appropriate than grep for this job: for example, you could write a small bash script that uses cut to extract and process the data, or you could use awk.
Here is a solution with awk that prints the values of the total column that are at least 500. Pipe your command's output to
awk -F "|" '($3+0>=500){print $3}'
-F sets the field delimiter to |
$3+0 is used to convert a string starting with a number to that number, so that
we can handle it as a number and do the comparison.
Now, if you really want to extract all values having column total > 500 MiB,
then the expected output should include all values expressed in GiB, as they are
> 1000 MiB, for example the minimum value in your evil screenshot is 0.98 GiB which is 1003 MiB. So we can add this to the first condition.
awk -F "|" '($3 ~ /GiB/ || $3+0>500){print $3}'
Now if you want the output to be only integers in MiB, we can modify it to:
awk -F "|" '($3 ~ /GiB/){$3=1024*$3+0} ($3+0>500){printf "%.0f\n",$3}'
Here we convert all GiB values to MiB, and we do the comparison after that.
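To see the conversion on concrete input, here is a small sketch run against three made-up rows. The row layout is an assumption modelled on a pipe-delimited vnstat -d table with the total in the third field; the figures are invented for illustration.

```shell
# Made-up rows in an assumed "date rx | tx | total | avg. rate" layout.
sample='01/01/18 300.00 MiB | 354.32 MiB | 654.32 MiB | 62.00 kbit/s
01/02/18 100.00 MiB | 50.00 MiB | 150.00 MiB | 14.00 kbit/s
01/03/18 0.60 GiB | 0.38 GiB | 0.98 GiB | 95.00 kbit/s'

# GiB totals are converted to MiB first; totals above 500 MiB are then
# printed as rounded integers.
result=$(printf '%s\n' "$sample" |
  awk -F "|" '($3 ~ /GiB/){$3=1024*$3+0} ($3+0>500){printf "%.0f\n",$3}')
echo "$result"
```

The 150 MiB row is dropped, while the 0.98 GiB row passes the test because it is compared as 1003.52 MiB.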
I'd use awk. Something like (untested)
vnstat -d | awk '$1 == "estimated" { exit }
($9 == "GiB" && $8 > 0.5) ||
($9 == "MiB" && $8 > 500) { print $8 " " $9 }'
#!/bin/bash
IFS=$'\n'
for i in `vnstat -d`; do # get each lines
VALUE=$(echo $i | cut -d\| -f3) # get total value with unit, in case you want to check for GiB values
NUMBER=$(echo $VALUE | grep -o '[0-9.]*' | cut -d. -f1); # extract the numeric part of VALUE and store its integer part in NUMBER
if [[ $NUMBER -ge 500 && "$VALUE" == *"MiB"* || "$VALUE" == *"GiB"* ]]; then # if the number is greater than or equals to 500 OR it's in GiB
echo $VALUE; # echo the value
fi
done
Of course you can strip out the GiB checking if you wanted to.
Edit: Added IFS=$'\n' at the beginning. This makes the for loop split its input on newlines only.
vnstat has several options to format the output.
You can use vnstat --dumpdb, vnstat --json or vnstat --xml to have well-formatted data that you can then parse more easily (for example with jq if you choose the JSON format).
For example :
vnstat --json | jq '.interfaces[] | select(.id == "eth0") | .traffic | .days[1] | .rx'
will extract the number of kiB received on the interface eth0 yesterday (the day 0 is today, 1 is yesterday, etc)
To have the total rx+tx, you can use
vnstat --json | jq '.interfaces[] | select(.id == "eth0") | .traffic | .total | .rx+.tx'
You can also sum several days, for example today and yesterday :
vnstat --json | jq '.interfaces[] | select(.id == "eth0") | .traffic | [.days[0,1] | .rx+.tx] | add'
Instead of days, you can also reference "months" or "hours" (for hours, be careful: the id does not have the same meaning, it is the hour itself).

Integer expected error in script

I am trying to write a simple script to monitor disk usage. I keep getting integer expression expected errors at line 5. (THRESHOLD value is intentionally set low for testing.)
Here is my script
#!/bin/bash
CURRENT=$(df -hP | grep / | awk '{ print $5}' | sed 's/%//g')
THRESHOLD=10
if [ "$CURRENT" -gt "$THRESHOLD" ] ; then
mail -s 'Disk Space Alert' john.kenny@ngc.com << EOF
Your root partition remaining free space is critically low. Used: $CURRENT%
EOF
fi
My screen output looks like this
./monitor_disk_space.sh: line 5: [: 7
0
22
1
1
1
1
1
1: integer expression expected
I'm new to bash scripts and especially awk. Any suggestions would be appreciated.
As you can see you're getting a string of newline-separated values from your pipeline. This string is not in itself an integer, so it can't be compared to $THRESHOLD.
Assuming you'd like to send the message if any filesystem is above $THRESHOLD percent full, you may use
df -hP | awk '/\// { sub("%", "", $5); print $5 }' |
while read number; do
if [ "$number" -gt "$THRESHOLD" ]; then
mail ...
break
fi
done
This would pass the values, one by one, into a loop that would compare them against $THRESHOLD. If any value is larger, the mail is sent and the loop exits (via the break).
I also took the liberty of shortening your pipeline to just df+awk, as awk is more than capable of doing the work of both grep and sed.
If you only want to check the root partition, then use df -hP / in the pipeline above.
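The loop logic can be exercised without df or mail. In this sketch the function name check_usage is hypothetical, the percentages are fed in by hand, and an echo stands in for the mail command.

```shell
THRESHOLD=10

# Hypothetical helper: reads one percentage per line on standard input and
# reports the first value above THRESHOLD, then stops reading.
check_usage() {
  while read -r number; do
    if [ "$number" -gt "$THRESHOLD" ]; then
      echo "alert: $number% used"   # stand-in for the mail command
      return 0
    fi
  done
  return 1
}

# Values similar to the ones in the error message from the question.
printf '%s\n' 7 0 22 1 | check_usage
```

Because each value is compared on its own, the test operator always receives a single integer.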
CURRENT=$(df -hP | grep / | awk '{ print $5}' | sed 's/%//g')
df -hP shows a summary of disk usage.
grep / filters out the header line.
awk '{print $5}' prints the 5th column, which is the percentage usage for each file system.
sed 's/%//g' deletes the % character. (There's only one, so the g is unnecessary. I might have used tr -d %, but it doesn't really matter.)
$(...) captures the output of the above -- which is going to be multiple lines of output, each of which should contain an integer.
The -gt operator requires a single integer for each of its arguments.
I think the problem is the grep /, which prints every line containing a / character (that's probably going to be everything except the header line). Your message indicates that you're interested in the root filesystem.
Changing grep / to grep /$ is one simple solution.
But passing / as an argument to the df command, so it displays usage only for the root file system, is even simpler.
Here's how I might do it:
CURRENT=$(df / | awk 'NR == 2 { print $5 }' | tr -d %)
You could incorporate the deletion of the % character into the awk command, but that would be a little more complicated.
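For comparison, here is a sketch of what folding the % removal into awk might look like. It runs on the two sample df lines from the first question rather than on live df output; the variable name sample_df is illustrative only.

```shell
# Sample df -h output taken from the first question in this thread.
sample_df='Filesystem Size Used Avail Use% Mounted on
/dev/sda3 18G 9.1G 8.7G 52% /'

# sub() with a third argument strips the % from field 5 only.
CURRENT=$(printf '%s\n' "$sample_df" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')
echo "$CURRENT"
```

In a real script the printf would be replaced by df -P / or similar.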
Why not do it all in awk?
$ df -hP |
awk -v th=10 '/\// {if($5+0>th)
system("echo Your ... " $5 " | mail -s \"Disk Space Alert\" xxx@example.com")}'

Adding custom column in the output with awk

I am reading file utilization on the server with below command.
How can I add the hostname in my output as a first column?
Thanks in advance
df -h | grep % | awk '{OFS="\t";print $6,$5}'
Output:
/apps/inf9b2b 43%
/apps/dbclients 13%
/apps/inf9 77%
This is a simple application of How do I use shell variables in an awk script?
df -h | awk -v hostname="$(hostname)" '/%/ {OFS="\t"; print hostname, $6, $5}'
Note that there's no need for an external grep -- just make your pattern match a condition of the awk statement.
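A quick way to check the column layout is to run the same awk program on fixed input. In the sketch below the host name server01 and the device names and sizes are made up; only the mount points and percentages come from the question.

```shell
# Hypothetical host name; in practice use "$(hostname)".
host="server01"

# Made-up df-style rows whose mount points and percentages match the question.
sample='/dev/sdb1 100G 43G 57G 43% /apps/inf9b2b
/dev/sdc1 50G 6.5G 44G 13% /apps/dbclients'

# -v passes the shell value into awk; OFS is set once in BEGIN.
out=$(printf '%s\n' "$sample" |
  awk -v hostname="$host" 'BEGIN { OFS = "\t" } /%/ { print hostname, $6, $5 }')
echo "$out"
```

Setting OFS in a BEGIN block is a small variation on the answer above; it avoids reassigning it for every matching line.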
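You can do df -h | grep % | awk '{OFS="\t";print "hostname\t" $6,$5}'
Note that this prints the literal text hostname in the first column; replace it with the actual host name.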

Use 'df -h' to check % remaining disk space of a specific folder

I am using the 'df -h' command to get disk space details for my directory, and it gives me the response shown below:
Now I want to run this check automatically from a batch job or script, so I am wondering whether I can check disk space only for the specific folders I care about. As shown in the image, I only need to check that /nas/home does not go above 75%.
How can I achieve this? Any help?
My work till now:
I am using
df -h > DiskData.txt
... this outputs to a text file
grep "/nas/home" "DiskData.txt"
... which gives me the output:
*500G 254G 247G 51% /nas/home*
Now I want to be able to extract the number immediately before the '%' sign (51 in this case) to achieve what I want.
This command will give you the usage percentage of the /nas/home directory:
df /nas/home | awk '{ print $4 }' | tail -n 1| cut -d'%' -f1
So basically you can store that value in a variable and then apply an if/else condition.
var=$(df /nas/home | awk '{ print $4 }' | tail -n 1 | cut -d'%' -f1)
if [ "$var" -gt 75 ]; then
    : # send email here
fi
Another variant:
df --output=pcent /nas/home | tail -n 1 | tr -d '[:space:]|%'
--output=pcent shows only the percent value (requires coreutils >= 8.21)
A more concise way without extensive piping could be:
df -h /nas/home | perl -ane 'print substr $F[3],0,-1 if $.==2'
Returns: 51 for your example.

Why is my awk sub command failing?

When I run
df -hl | grep '/dev/disk1' | awk '{sub(/%/, \"\");print $5}'
I'm getting the following error:
awk: syntax error at source line 1
context is
{sub(/%/, >>> \ <<< "\");}
awk: illegal statement at source line 1
I can't seem to find any documentation on awk sub.
df -hl | grep '/dev/disk1'
returns
/dev/disk1 112Gi 94Gi 18Gi 85% 24672655 4649071 84% /
As I understand, it should return the percentage of disk space used.
It should return 85 from the input
/dev/disk1 112Gi 94Gi 18Gi 85% 24699942 4621784 84% /
This will fix the command as you supplied it.
df -hl | grep '/dev/disk1' | awk '{sub( /%/, ""); print $5 }'
No need to escape the double quotes.
Of course you don't need to use grep here either.
df -hl | awk '/disk1/ { sub( /%/, "", $5); print $5}'
Notice that you can supply the target for the substitution as a third argument to sub.
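The effect of the third argument can be seen on the sample line from the question. This is only a sketch on fixed input; the variable names line, capacity and eighth are illustrative.

```shell
# Sample line copied from the question.
line='/dev/disk1 112Gi 94Gi 18Gi 85% 24672655 4649071 84% /'

# With a target, sub() edits only that field.
capacity=$(printf '%s\n' "$line" | awk '{ sub(/%/, "", $5); print $5 }')
eighth=$(printf '%s\n' "$line" | awk '{ sub(/%/, "", $8); print $8 }')

echo "$capacity"
echo "$eighth"
```

Without a target, sub() works on the whole record and replaces only the first match, which here happens to be the % in field 5.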
The sub command is described in the gawk manual on this page.
Perhaps you can reduce it down to just df and awk with:
df --output=pcent /dev/disk1 | awk '/ /{printf("%d\n", $1)}'
