icinga2 disk space check with three arguments

I am trying to configure icinga2 to monitor my Linux server's disk space using check_nrpe. My configuration is given below.
nrpe.cfg:
command[check_root]=/usr/lib/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
Icinga configuration:
object CheckCommand "nrpe-check-2arg" {
import "plugin-check-command"
command = [PluginDir + "/check_nrpe" ]
arguments = {
"-H" = "$host_name$"
"-c" = "$check$"
"-a" = "$loads$"
}
}
object Service "testing-haproxy-master: / disk space" {
  import "generic-service"
  host_name = "tmahaprx01.verizon.com"
  check_command = "nrpe-check-2arg"
  vars.address = "192.168.1.104"
  vars.check = "check_root"
  vars.loads = "80%!90%!/"
}
The output I am getting is:
root@icinga:/etc/icinga2/hosts# /usr/lib/nagios/plugins/check_nrpe -H 192.168.1.104 -c check_root -a '80%C!90%!/'
DISK OK - free space: /sys/fs/cgroup 0 MB (100% inode=99%); /dev 1457 MB (99%
inode=99%); /run 293 MB (99% inode=99%); /run/lock 5 MB (100% inode=99%);
/run/shm 1468 MB (100% inode=99%); /run/user 100 MB (100% inode=99%);|
/sys/fs/cgroup=0MB;0;0;0;0 /dev=0MB;291;145;0;1457 /run=0MB;58;29;0;293
/run/lock=0MB;0;0;0;5 /run/shm=0MB;293;146;0;1468 /run/user=0MB;19;9;0;100
The expected output, when I execute the check on the remote Linux machine, is:
root@tmahaprx01:~# /usr/lib/nagios/plugins/check_disk -w 80% -c 90% -p /
DISK OK - free space: / 43144 MB (96% inode=97%);| /=1743MB;9462;4731;0;47314
Could you please guide me on how I can pass the third argument (/)?

The problem with NRPE is that you're writing a command that executes another command. Assuming that the nrpe.cfg includes something like this:
command[check_disk]=/usr/lib/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
you know that the path must be the 3rd argument:
object CheckCommand "nrpe-disk" {
import "nrpe"
vars.nrpe_arguments = [ "$disk_wfree$%", "$disk_cfree$%", "$disk_partition$" ]
vars.nrpe_command = "check_disk"
//variables should be propagated from host/group definition
vars.disk_wfree = 20
vars.disk_cfree = 10
vars.disk_partition = "/"
}
Variable names might depend on your Icinga version; check the original nrpe command definition on your system. It might be located in:
/usr/share/icinga2/include/command-plugins.conf
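Before wiring this into Icinga, it can help to verify the NRPE side by hand from the Icinga host. A minimal sketch, reusing the IP and thresholds from the question (note that check_nrpe takes the three arguments space-separated after -a; the '!' separator belongs to Nagios/Icinga command definitions, not to check_nrpe itself):
# ask the remote NRPE daemon to run check_root with $ARG1$=80%, $ARG2$=90%, $ARG3$=/
/usr/lib/nagios/plugins/check_nrpe -H 192.168.1.104 -c check_root -a '80%' '90%' '/'
If this prints a single-partition DISK OK line for /, the remote side is fine and the remaining work is in the Icinga CheckCommand definition.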


Is there any specific way to extract SSD name without "/"?

In my current system, I have 3 SSDs: sda, sdb and sdc. The OS is installed on sdc.
I am trying to extract the SSDs that do not have the OS installed on them. This command
echo $(eval $(lsblk -oMOUNTPOINT,PKNAME -P | grep 'MOUNTPOINT="/"'); echo $PKNAME | sed 's/[0-9]*$//')
returns sdc.
But if I want the drives without the OS, how should I modify the above command?
grep 'MOUNTPOINT!="/"' doesn't return anything.
The bash script provided by Renaud works as expected on systems with raw drives. However, if the system has LVM partitions, it returns only dm-1.
How can this case be handled, so that the correct SSD name is returned on both raw and LVM systems?
On the LVM system, lsblk returns the following; the expected output is sdb:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 1.8T 0 disk
├─sda1 8:1 0 512M 0 part /boot/efi
├─sda2 8:2 0 1G 0 part /boot
└─sda3 8:3 0 1.8T 0 part
├─ubuntu--vg-ubuntu--lv-real
│ 253:1 0 880G 0 lvm
│ ├─ubuntu--vg-ubuntu--lv 253:2 0 880G 0 lvm /
│ └─ubuntu--vg-clean 253:4 0 880G 0 lvm
└─ubuntu--vg-clean-cow 253:3 0 400G 0 lvm
└─ubuntu--vg-clean 253:4 0 880G 0 lvm
sdb 8:16 0 1.8T 0 disk
On the raw drive, lsblk returns:
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 223.6G 0 disk
sdb 8:16 0 465.8G 0 disk
├─sdb1 8:17 0 2G 0 part
└─sdb2 8:18 0 463.8G 0 part
sdc 8:32 0 232.9G 0 disk
├─sdc1 8:33 0 156M 0 part /boot/efi
├─sdc2 8:34 0 26.7G 0 part /boot
├─sdc3 8:35 0 182.7G 0 part /
└─sdc4 8:36 0 23.4G 0 part [SWAP]
Here the expected output is sda or sdb (the OS disk is sdc).
grep has a -v option to print only the non-matching lines. So:
lsblk -oMOUNTPOINT,PKNAME -P | grep -v 'MOUNTPOINT="/"'
should exclude this drive. Note, however, that if you have more than one other drive, the rest of your script will not work as you would like: only the last drive will be considered, because each iteration overwrites the previous value. A loop would probably be closer to your needs:
# Evaluate each MOUNTPOINT=... PKNAME=... line that is not the root mount
# and print the parent device, stripping any trailing partition number.
while IFS= read -r line; do
    eval "$line"
    sed 's/[0-9]*$//' <<< "$PKNAME"
done < <(lsblk -oMOUNTPOINT,PKNAME -P | grep -v 'MOUNTPOINT="/"')
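For the LVM follow-up, one approach (a sketch, assuming the findmnt and lsblk tools from util-linux) is to resolve which disk backs / by walking up the whole device chain, then list every disk except that one:
#!/bin/bash
# Device backing / (/dev/mapper/ubuntu--vg-ubuntu--lv on the LVM system,
# /dev/sdc3 on the raw one)
root_src=$(findmnt -no SOURCE /)
# lsblk -s prints a device followed by its ancestors; the last line is the
# underlying disk (sda for the LVM chain, sdc for the raw partition)
root_disk=$(lsblk -sno NAME "$root_src" | tail -1)
# Print every disk except the one hosting the OS
lsblk -dno NAME,TYPE | awk -v r="$root_disk" '$2 == "disk" && $1 != r {print $1}'
On the LVM layout above this prints sdb; on the raw layout it prints sda and sdb.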

cgroups blkio subsystem is not counting the block write byte count properly for container applications

I am working on Linux kernel 3.14 and have enabled cgroups and the blkio subsystem to check the write byte count on the block device for container and host applications.
However, I have problems getting the written bytes from the cgroup blkio throttling counters for a container application.
It works at the top of the hierarchy (e.g. /sys/fs/cgroup/blkio/blkio.throttle.io_service_bytes), but not at deeper levels (e.g. /sys/fs/cgroup/blkio/lxc/web, where the container name is web).
I created a small test script (checkWrite), which simply enters the cgroup it is started in (pwd) and writes 1 MB:
#!/bin/bash
SIZE=1M
DST="/home/root"
# check that we are started in a /sys/fs/cgroup/blkio directory
if [ ! -e ./tasks ]; then
    echo "Error, this script must be started in a cgroup blkio directory"
    echo "Start in or below /sys/fs/cgroup/blkio !"
    exit 1
fi
echo "Using the cgroup: ${PWD##*/cgroup}"
# add myself to the cgroup
echo $$ > tasks
mygroup=$(grep blkio /proc/$$/cgroup)
echo "we're now in blkio cgroup: ${mygroup}"
# call sync to let the kernel store pending data
sync
sleep 1
# fetch the current written-bytes count for the eMMC (device 179:24)
before=$(grep "179:24 Write" blkio.throttle.io_service_bytes)
echo "before writing: ${before}"
echo "writing ${SIZE} random data to ${DST}/DELME ..."
dd if=/dev/urandom of=${DST}/DELME bs=${SIZE} count=1
sync
sleep 2
# fetch the written-bytes count again
after=$(grep "179:24 Write" blkio.throttle.io_service_bytes)
echo "after writing: ${after}"
written=$(( (${after##* } - ${before##* }) / 1024 ))
echo "written = ${after##* }B - ${before##* }B = ${written}kB"
rm -f ${DST}/DELME
The output is:
/sys/fs/cgroup/blkio# ~/checkWrite
Using the cgroup: /blkio
we're now in blkio cgroup: 3:blkio:/          <- this task is now in this blkio cgroup
before writing: 179:24 Write 200701952        <- from blkio.throttle.io_service_bytes
writing 1M random data to /var/opt/bosch/dynweb/DELME ...
1+0 records in
1+0 records out
after writing: 179:24 Write 201906176
written = 201906176B - 200701952B = 1176kB    <- fairly OK

/sys/fs/cgroup/blkio/lxc/web# ~/checkWrite
Using the cgroup: /blkio/system.slice
we're now in blkio cgroup: 3:blkio:/system.slice
before writing: 179:24 Write 26064896
writing 1M random data to /var/opt/bosch/dynweb/DELME ...
1+0 records in
1+0 records out
after writing: 179:24 Write 26130432
written = 26130432B - 26064896B = 64kB        <- far too little
Do I misunderstand the accounting? If it cannot work this way, how else can I monitor/watch/read the block device writes issued by container applications?
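One variable worth eliminating is the page cache: buffered writes are completed later by kernel writeback threads, which may be accounted to a different blkio cgroup than the task that issued them. A quick test (a sketch, reusing the device numbers and paths from the script above) is to repeat the measurement with direct I/O, so that dd itself submits the block writes:
# write 1 MB with O_DIRECT, bypassing the page cache
dd if=/dev/urandom of=/home/root/DELME bs=1M count=1 oflag=direct
# the Write counter of the cgroup dd runs in should now grow by ~1024 kB
grep "179:24 Write" blkio.throttle.io_service_bytes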

Copy a large directory to a SAMBA shared directory

description
When I try to create more than 100,000 files in a SAMBA shared directory, smbd's CPU usage grows with the number of files and performance degrades. Looking at the smbd log, before each file is created the server receives an SMB2_FIND request from the client with the parameter in_file_name = "*", which fetches information about the entire directory. In other words, every file creation triggers a full directory enumeration, so performance drops sharply as the number of files increases. Does the SMB protocol anticipate directories containing very large numbers of files? Is there a way to optimize smbd, or a configuration change that would improve performance?
test environment
client: windows8
server: centos7.1
smbd version:
[root@localhost samba]# ps aux | grep smbd
root 3378 0.0 0.3 386040 5800 ? Ss 20:43 0:00 /usr/sbin/smbd
root 3380 0.0 0.1 386040 3108 ? S 20:43 0:00 /usr/sbin/smbd
root 3385 0.0 0.3 390600 7004 ? S 20:44 0:00 /usr/sbin/smbd
root 3504 0.0 0.0 112648 976 pts/0 R+ 21:09 0:00 grep --color=auto smbd
[root@localhost samba]# /usr/sbin/smbd --version
Version 4.2.10
test procedure
modify smb.conf
[root@localhost samba]# testparm -s -c
Load smb config files from /etc/samba/smb.conf
rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384)
Processing section "[1111]"
Loaded services file OK.
Server role: ROLE_STANDALONE
[global]
        workgroup = MYGROUP
        server string = Samba Server Version %v
        security = USER
        log file = /var/log/samba/log.%m
        max log size = 900000000
        server max protocol = SMB2
        idmap config * : backend = tdb

[1111]
        comment = share
        path = /tmp
        read only = No
restart smbd service
[root@localhost samba]# service smb restart
Redirecting to /bin/systemctl restart smb.service
map a directory to a network disk
write the test program
// Fragment of the test program: create `files` entries concurrently,
// bounded by the ctx.sem semaphore channel.
var writewg sync.WaitGroup
var i uint64
for i = 0; i < files; i++ {
	writewg.Add(1)
	ctx.sem <- true
	go func(index uint64) {
		defer writewg.Done()
		if isdir {
			subdir := path + "/dir_" + strconv.FormatUint(index, 10)
			os.MkdirAll(subdir, 0777)
		} else {
			file := path + "/file_" + strconv.FormatUint(index, 10)
			f, err := os.OpenFile(file, os.O_CREATE|os.O_APPEND|os.O_RDWR, os.ModePerm|os.ModeTemporary)
			if err != nil {
				fmt.Println("OpenFile ", file, " failed ", err)
			} else {
				f.Write(ctx.data[0:])
				f.Close()
			}
		}
		atomic.AddUint64(&ctx.task.Ops, 1)
		<-ctx.sem
	}(i)
}
writewg.Wait()
run my test program
create 500 files
view the log
[root@localhost samba]# grep "smbd_dirptr_get_entry mask=.*file_0" log.xjl | wc -l
500
[root@localhost samba]# grep "SMB2_OP_FIND" log.xjl | wc -l
1020
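To separate protocol cost from filesystem cost, it may also be worth timing file creation directly on the server, bypassing SMB (a sketch; /tmp/findtest is a hypothetical subdirectory of the exported path). If local creation stays fast as the directory grows, the slowdown comes from the per-file SMB2_FIND round-trips rather than from the filesystem:
#!/bin/bash
# Create n files directly in the exported directory and time each batch
dir=/tmp/findtest
for n in 1000 10000 100000; do
    rm -rf "$dir" && mkdir -p "$dir"
    start=$(date +%s)
    for ((i = 0; i < n; i++)); do : > "$dir/file_$i"; done
    echo "$n files: $(( $(date +%s) - start )) s"
done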

I want to use parallel-ssh to run a bash script on multiple servers, but it simply prints the echo statements

I have a bash script called sr_run_batch.sh which does super-resolution of images. Now I want to run tests on several servers in parallel: first on 1 virtual machine at a time, then on 2, then on 3, and then on 4.
I tried writing these commands into it:
for host in $(cat hosts.txt); do ssh "$host" "$command" >"output.$host"; done
ssh-keygen && for host in $(cat hosts.txt); do ssh-copy-id $host; done
where the file hosts.txt contains the list of servers in username@ip format, but when I run this, it gives me a substitution error.
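As an aside, here is a version of that loop that reads the host list more defensively (a sketch; $command must be set beforehand, and sr_run_batch.sh is assumed to be deployed at that path on every host):
command=./sr_run_batch.sh    # assumption: the script exists on each host
while IFS= read -r host; do
    ssh "$host" "$command" > "output.$host"
done < hosts.txt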
Hence, I tried pssh (parallel-ssh):
pssh -h hosts-file -l username -P $command
with the command being ./sr_run_batch.sh,
but it didn't run, so I modified this to
pssh -h hosts-file -l ben -P -I < ./sr_run_batch.sh
But, for some unknown reason, it just prints the echo statements in the code.
Here is the code:
#!/bin/bash
NList=(5)
VList=(1)
FList=("input/flower1.jpg" "input/flower2.jpg" "input/flower3.jpg" "input/flower4.jpg")
IList=("320X240" "640X480" "1280X960" "1920X1200")
SList=(2 3)
for VM in "${VList[@]}"; do
    for ((index = 0; index < ${#FList[@]}; index++)); do
        file=${FList[$index]}
        image_size=${IList[$index]}
        width=$(echo "$image_size" | cut -d "X" -f1)
        height=$(echo "$image_size" | cut -d "X" -f2)
        for scale_factor in "${SList[@]}"; do
            for users in "${NList[@]}"; do
                echo "V: $VM, " "F: $file, " "S: $scale_factor, " "I: $width $height , " "N: $users"
                for i in $(seq 1 "$users"); do
                    ./sr_run_once.sh "$file" "$width" "$height" "$scale_factor" &
                done
                wait
            done # for users
        done # for scale_factor
    done # for index
done # for VM
exit 0
Have you also tried to use pssh with a simple bash script, to see if the communication is set up OK?
$ pssh -h hosts.txt -A -l ben -P -I < ./uptime.sh
Warning: do not enter your password if anyone else has superuser
privileges or access to your account.
Password:
10.0.0.67: 11:06:50 up 28 min, 2 users, load average: 0.00, 0.00, 0.00
[1] 11:06:50 [SUCCESS] 10.0.0.67
10.0.0.218: 11:06:50 up 24 min, 2 users, load average: 0.00, 0.05, 0.20
[2] 11:06:50 [SUCCESS] 10.0.0.218
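For reference, uptime.sh here can be as small as the following (an assumed stand-in; any script that prints something per host will do):
#!/bin/bash
# print each host's uptime so the pssh round-trip is easy to verify
uptime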

Mac OS X: GNU parallel can't find the number of cores on a remote server

I used Homebrew to install GNU parallel on my Mac so I can run some tests remotely on my University's servers. I was quickly running through the tutorials, but when I ran
parallel -S <username>@$SERVER1 echo running on ::: <username>@$SERVER1
I got the message
parallel: Warning: Could not figure out number of cpus on <username@server> (). Using 1.
Possibly related: I never added parallel to my PATH, and got a warning that "parallel" wasn't a recognized command, but parallel ran anyway and still echoed correctly. This particular server has 16 cores; how can I get parallel to recognize them?
GNU Parallel is less tested on OS X as I do not have access to an OS X installation, so you have likely found a bug.
GNU Parallel has since 20120322 used these to find the number of CPUs:
sysctl -n hw.physicalcpu
sysctl -a hw 2>/dev/null | grep "[^a-z]physicalcpu[^a-z]" | awk '{ print $2 }'
And the number of cores:
sysctl -n hw.logicalcpu
sysctl -a hw 2>/dev/null | grep "[^a-z]logicalcpu[^a-z]" | awk '{ print $2 }'
Can you test what output you get from those?
Which version of GNU Parallel are you using?
As a workaround you can force GNU Parallel to detect 16 cores:
parallel -S 16/<username>@$SERVER1 echo running on ::: <username>@$SERVER1
Since version 20140422 you have been able to export your PATH to the remote server:
parallel --env PATH -S 16/<username>@$SERVER1 echo running on ::: <username>@$SERVER1
That way you just need to add the dir where parallel lives on the remote server to your PATH on the local machine. E.g. if parallel on the remote server is in /home/u/user/bin/parallel:
PATH=$PATH:/home/u/user/bin parallel --env PATH -S <username>@$SERVER1 echo running on ::: <username>@$SERVER1
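To see what GNU Parallel itself detects, it also has dedicated query options; running them locally and over ssh can narrow down where detection breaks (this assumes parallel is in the remote PATH):
# what parallel detects on the local machine
parallel --number-of-cpus
parallel --number-of-cores
# and what it detects when run on the remote server
ssh <username>@$SERVER1 parallel --number-of-cores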
Information for Ole
My iMac (OS X Mavericks on an Intel Core i7) gives the following, which all looks correct:
sysctl -n hw.physicalcpu
4
sysctl -a hw
hw.ncpu: 8
hw.byteorder: 1234
hw.memsize: 17179869184
hw.activecpu: 8
hw.physicalcpu: 4
hw.physicalcpu_max: 4
hw.logicalcpu: 8
hw.logicalcpu_max: 8
hw.cputype: 7
hw.cpusubtype: 4
hw.cpu64bit_capable: 1
hw.cpufamily: 1418770316
hw.cacheconfig: 8 2 2 8 0 0 0 0 0 0
hw.cachesize: 17179869184 32768 262144 8388608 0 0 0 0 0 0
hw.pagesize: 4096
hw.busfrequency: 100000000
hw.busfrequency_min: 100000000
hw.busfrequency_max: 100000000
hw.cpufrequency: 3400000000
hw.cpufrequency_min: 3400000000
hw.cpufrequency_max: 3400000000
hw.cachelinesize: 64
hw.l1icachesize: 32768
hw.l1dcachesize: 32768
hw.l2cachesize: 262144
hw.l3cachesize: 8388608
hw.tbfrequency: 1000000000
hw.packages: 1
hw.optional.floatingpoint: 1
hw.optional.mmx: 1
hw.optional.sse: 1
hw.optional.sse2: 1
hw.optional.sse3: 1
hw.optional.supplementalsse3: 1
hw.optional.sse4_1: 1
hw.optional.sse4_2: 1
hw.optional.x86_64: 1
hw.optional.aes: 1
hw.optional.avx1_0: 1
hw.optional.rdrand: 0
hw.optional.f16c: 0
hw.optional.enfstrg: 0
hw.optional.fma: 0
hw.optional.avx2_0: 0
hw.optional.bmi1: 0
hw.optional.bmi2: 0
hw.optional.rtm: 0
hw.optional.hle: 0
hw.cputhreadtype: 1
hw.machine = x86_64
hw.model = iMac12,2
hw.ncpu = 8
hw.byteorder = 1234
hw.physmem = 2147483648
hw.usermem = 521064448
hw.pagesize = 4096
hw.epoch = 0
hw.vectorunit = 1
hw.busfrequency = 100000000
hw.cpufrequency = 3400000000
hw.cachelinesize = 64
hw.l1icachesize = 32768
hw.l1dcachesize = 32768
hw.l2settings = 1
hw.l2cachesize = 262144
hw.l3settings = 1
hw.l3cachesize = 8388608
hw.tbfrequency = 1000000000
hw.memsize = 17179869184
hw.availcpu = 8
sysctl -n hw.logicalcpu
8
