Unknown storage: Swap space: ERROR in nagios-snmp

We have a Nagios server running on Linux, and one of the monitored hosts is also running Linux.
When I manually run the command to get swap space information via SNMP, I get the expected output, but it is not reflected on the dashboard.
Can anybody help me?
For your reference, here is the output from manually running the command:
check_snmp_swap.pl -H <IP address> -C public -m -w 80 -c 90
Swap Space: 0%used(26MB/95998MB) /data: 0%used(188MB/129704MB) Real
Memory: 16%used(10263MB/64444MB) /: 62%used(30070MB/48432MB) Memory
Buffers: 0%used(239MB/64444MB) (<80%) : OK
But on the dashboard I can see the status of CPU and RAM; it is only the Swap space status that is missing.

Check your service definition for check_snmp_swap. Make sure that the service is registered, meaning you set register 1 in the service definition.
For example:
define service{
    host_name check_snmp_swap
    service_description check-swap
    check_command check_snmp_swap!public!80!90
    max_check_attempts 5
    check_interval 5
    retry_interval 3
    check_period 24x7
    notification_interval 30
    notification_period 24x7
    notification_options w,c,r
    contact_groups linux-admins
    register 1
}
Also check the command definition for check_snmp_swap. Make sure that the correct community string gets passed into the command -- in this case, public.
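For reference, a command definition for it might look something like this (a sketch only; the plugin path and argument order are assumptions, so match them to your actual definition):
define command{
    command_name check_snmp_swap
    command_line $USER1$/check_snmp_swap.pl -H $HOSTADDRESS$ -C $ARG1$ -m -w $ARG2$ -c $ARG3$
}
With a definition like this, check_snmp_swap!public!80!90 passes the community string, warning threshold, and critical threshold in that order.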
EDIT:
From the configuration information you posted in the comments, I think you have a bit of confusion regarding service definitions and service template definitions.
It looks like you posted a template, which, as a template, really should have its register value set to 0 to indicate that it is a template.
Now a real service definition may inherit some settings from a service template. The purpose of this is to save you from having to re-enter the same information over and over again when you create service definitions.
You can override the settings inherited from the service template by explicitly defining those settings in the service definition.
You should create a service definition that looks something like this:
define service{
    host_name check_snmp_swap
    use generic-service
    service_description check-swap
    check_command check_snmp_swap
    max_check_attempts 5
    check_interval 10
    retry_interval 2
    check_period 24x7
    notification_interval 30
    notification_period 24x7
    notification_options w,u,c,r
    contact_groups admins
    register 1
}
Then restart your nagios service:
service nagios restart
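If the service still does not appear after the restart, run a configuration check to catch definition errors (the path to nagios.cfg varies by install; /etc/nagios/nagios.cfg here is an assumption):
nagios -v /etc/nagios/nagios.cfg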

Related

Retrieve the instance ID after deploying at Vultr with the vultr-cli?

I'm scripting a process with the vultr-cli. I need to deploy a new VPS at Vultr, perform some intermediate steps, then destroy the VPS in a bash script. How do I retrieve instance values in the script after the deployment? Is there a way to capture the information as JSON or set environment variables directly?
So far, my script looks like:
#!/bin/bash
## Create an instance. How do I retrieve the instance ID
## for use later in the script?
vultr-cli instance create --plan vc2-1c-1gb --os 387 --region ewr
## With the instance ID, retrieve the main IPv4 address.
## Note: I only want the main IP, but there may be multiple.
vultr-cli instance ipv4 list $INSTANCE_ID
## Perform some tasks here with the IPv4. Assuming I created
## the instance with my SSH key, for example:
scp root@$INSTANCE_IPv4:/var/log/logfile.log ./logfile.log
## Destroy the instance.
vultr-cli instance delete $INSTANCE_ID
The vultr-cli will output the response it gets back from the API like this:
☁ ~ vultr-cli instance create --plan vc2-1c-1gb --os 387 --region ewr
INSTANCE INFO
ID 87e98eb0-a189-4519-8b4e-fc46bb0a5331
Os Ubuntu 20.04 x64
RAM 1024
DISK 0
MAIN IP 0.0.0.0
VCPU COUNT 1
REGION ewr
DATE CREATED 2021-01-23T17:39:45+00:00
STATUS pending
ALLOWED BANDWIDTH 1000
NETMASK V4
GATEWAY V4 0.0.0.0
POWER STATUS running
SERVER STATE none
PLAN vc2-1c-1gb
LABEL
INTERNAL IP
KVM URL
TAG
OsID 387
AppID 0
FIREWALL GROUP ID
V6 MAIN IP
V6 NETWORK
V6 NETWORK SIZE 0
FEATURES []
So you would want to capture the ID and its value from the response. This is a crude example, but it does work:
vultr-cli instance create --plan vc2-1c-1gb --os 387 --region ewr | grep -m1 -w "ID" | sed 's/ID//g' | tr -d " \t\n\r"
We are grepping for the first line that contains ID (which will always be the first line), then removing the word ID, and finally stripping any whitespace and newlines.
You will want to do something similar with the ipv4 list call you have.
Again, there may be a better way to write the grep/sed/tr portion, but this will work for your needs. Hopefully this helps!
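Putting it together, the top of your script could look something like this. This is only a sketch: the vultr-cli instance get subcommand and the fixed sleep are assumptions, and in a real script you would poll until the instance leaves the pending state:
#!/bin/bash
# Create the instance and capture its ID from the key/value response.
INSTANCE_ID=$(vultr-cli instance create --plan vc2-1c-1gb --os 387 --region ewr \
    | grep -m1 -w "ID" | sed 's/ID//g' | tr -d " \t\n\r")

# MAIN IP is 0.0.0.0 while the instance is still pending, so wait before reading it.
sleep 60

# Assumption: `instance get` prints the same key/value layout as `create`,
# so the same grep/sed/tr trick extracts the main IPv4 address.
INSTANCE_IPV4=$(vultr-cli instance get "$INSTANCE_ID" \
    | grep -m1 -w "MAIN IP" | sed 's/MAIN IP//g' | tr -d " \t\n\r")

scp root@"$INSTANCE_IPV4":/var/log/logfile.log ./logfile.log
vultr-cli instance delete "$INSTANCE_ID"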

Setup Nagios dependencies to monitor a service on a Windows host

I'm having difficulty setting up Nagios dependencies so that I only receive notifications when the host is up (pingable).
My host cfg file is as follows:
# Configuration file /etc/nagios/adagios/hosts/dp-front.cfg
# Edited by PyNag on Wed Nov 11 16:38:15 2015
define host {
    alias Ditmas Park Front Desk
    use windows-server
    host_name dp-front
    address 192.168.200.47
    max_check_attempts 2
    check_command check-host-alive
    check_period workhours
    notification_period workhours
}
define service{
    use generic-service
    host_name dp-front
    service_description Medical Records
    check_command check_nt!PROCSTATE!-d SHOWALL -l Robocopy.exe
    notification_interval 0
    max_check_attempts 1
}
define servicedependency{
    host_name localhost
    service_description PING
    dependent_host_name localhost
    dependent_service_description PING
    execution_failure_criteria c
    notification_failure_criteria w,u,c
}
Nagios is monitoring the service correctly, and I receive notifications whenever the service goes down. I'm just not sure how to set up the servicedependency section of it.
I'd really appreciate your help as always.
You probably need to set the parameter below in your host configuration file:
notification_options d,u,r
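For example, in the dp-front host definition from your question:
define host {
    alias Ditmas Park Front Desk
    use windows-server
    host_name dp-front
    address 192.168.200.47
    max_check_attempts 2
    check_command check-host-alive
    check_period workhours
    notification_period workhours
    notification_options d,u,r
}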
Add a notification_options directive to your host template 'windows-server':
https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/3/en/objectdefinitions.html
notification_options: This directive is used to determine when
notifications for the host should be sent out. Valid options are a
combination of one or more of the following: d = send notifications on
a DOWN state, u = send notifications on an UNREACHABLE state, r = send
notifications on recoveries (OK state), f = send notifications when
the host starts and stops flapping, and s = send notifications when
scheduled downtime starts and ends. If you specify n (none) as an
option, no host notifications will be sent out. If you do not specify
any notification options, Nagios will assume that you want
notifications to be sent out for all possible states. Example: If you
specify d,r in this field, notifications will only be sent out when
the host goes DOWN and when it recovers from a DOWN state.
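If you still want an explicit servicedependency, the master and dependent entries must point at the right host and services. A sketch, assuming a PING service is defined on dp-front (that service name is an assumption):
define servicedependency{
    host_name dp-front
    service_description PING
    dependent_host_name dp-front
    dependent_service_description Medical Records
    execution_failure_criteria c
    notification_failure_criteria w,u,c
}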

How can I monitor a router with an internal SSL certificate in Nagios?

This is my current setup:
Host config:
define host{
    use generic-host ; Inherit default values from a template
    host_name A+A ; The name we're giving to this host
    alias A+A Objektausstattung Router ; A longer name associated with the host
    address https://87.139.203.190:444 ; IP address of the host
    hostgroups Router ; Host groups this host is associated with
}
Service config:
define service{
    use generic-service ; Inherit default values from a template
    host_name A+A
    service_description HTTP
    check_command check_http
}
I get this error from Nagios:
check_icmp: Failed to resolve https://87.139.203.190:444
What am I doing wrong here?
Nagios tries to resolve the whole URL, including the scheme and port, as a hostname. Use the IP address only:
address 87.139.203.190 ; IP address of the host
Your host definition should only specify an IP address for the 'address' directive. The URL is not an attribute of the host, but of the HTTP check you want to perform.
The Service definition specifies the check_command, which is in turn defined in the checkcommands.cfg file. This will specify exactly what command is to be run, possibly using additional parameters passed.
You will probably want to pass the port number as a parameter, and specify that HTTPS is to be used. How to do this will depend on your settings. For example, you could use this in your checkcommands.cfg:
define command{
    command_name check_https
    command_line $USER1$/check_http -t 12 -H $HOSTADDRESS$ -f ok --ssl=1 -u "$ARG1$" -p "$ARG2$" -w $ARG3$ -c $ARG4$
}
Then you could configure your service with a checkcommand thus:
check_command check_https!/!444!1!5
This would check the URL https://87.139.203.190:444/, giving a warning if it takes over 1s and a critical if it takes over 5s to complete. TLSv1 would be used (else you might get a false positive on web servers with POODLE protection).
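Wired into your service definition (a sketch based on the service block from your question), that looks like:
define service{
    use generic-service
    host_name A+A
    service_description HTTPS
    check_command check_https!/!444!1!5
}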

Remove EC2's entry from resolv.conf

I have private DNS servers and I want to write them to resolv.conf with resolvconf on Debian on AWS/EC2.
There is a problem with the order of the nameserver entries.
In my resolv.conf, EC2's default nameserver is always written on the first line, like so:
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 172.16.0.23
nameserver 10.0.1.185
nameserver 10.100.0.130
search ap-northeast-1.compute.internal
172.16.0.23 is EC2's default nameserver; the others are mine.
How can I remove the EC2 entry? Or, failing that, how can I move it to third place?
Here I have an interface file:
% ls -l /etc/resolvconf/run/interface/
-rw-r--r-- 1 root root 62 Jun 7 23:35 eth0
It seems that the file eth0 is automatically generated by DHCP, so I can't remove it permanently.
% cat /etc/resolvconf/run/interface/eth0
search ap-northeast-1.compute.internal
nameserver 172.16.0.23
My private DNS entry is here:
% cat /etc/resolvconf/resolv.conf.d/base
nameserver 10.0.1.185
nameserver 10.100.0.130
Please help.
I think I just solved a very similar problem. I was bothered by Amazon EC2's crappy internal DNS servers so I wanted to run a local caching dnsmasq daemon and use that in /etc/resolv.conf. At first I just did echo nameserver 127.0.0.1 > /etc/resolv.conf but then I realized that my change would eventually be overwritten by the DHCP client after a reboot or DHCP lease refresh.
What I've now done instead is to edit /etc/dhcp3/dhclient.conf and uncomment the line prepend domain-name-servers 127.0.0.1;. You should be able to use the prepend directive in a very similar way.
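With the nameservers from the question, that would look like this (note that prepend puts your servers ahead of EC2's entry rather than removing it, which also answers the "move it to third" variant):
# /etc/dhcp/dhclient.conf (or /etc/dhcp3/dhclient.conf on older systems)
prepend domain-name-servers 10.0.1.185, 10.100.0.130;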
Update: These instructions are based on Ubuntu Linux, but I imagine the general concept applies to other systems as well; even other DHCP clients must have similar configuration options.
I'm approaching this problem from the other direction (wanting the internal nameservers), so much of what I've learned may be of interest.
There are several options to control name resolution in the VPC management console.
VPC -> DHCP option sets -> Create dhcp option set
You can specify your own name servers there.
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_DHCP_Options.html
Be sure to attach this dhcp option set to your VPC to get it to take effect.
Alternatively (I found this out by mistake) local dns servers are not set if the following settings are disabled in VPC settings:
DnsHostnames
and
DnsSupport
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-dns.html
Settings can also be overridden locally in /etc/dhcp/dhclient.conf (which you'll notice if you move instances between VPCs).
The following line might be of interest:
prepend domain-name-servers
Changes, of course, take effect on dhclient start.
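For example, to pick up the change without a reboot (the interface name is an assumption):
sudo dhclient -r eth0   # release the current lease
sudo dhclient eth0      # renew; dhclient.conf is re-read here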
How do I assign a static DNS server to a private Amazon EC2 instance running Ubuntu, RHEL, or Amazon Linux?
Short Description
Default behavior for an EC2 instance associated with a virtual private cloud (VPC) is to request a DNS server address at startup using the Dynamic Host Configuration Protocol (DHCP). The VPC responds to DHCP requests with the address of an internal DNS server. The DNS server addresses returned in the DHCP response are written to the local /etc/resolv.conf file and are used for DNS name resolution requests. Any manual modifications to the resolv.conf file are overwritten when the instance is restarted.
Resolution
To configure an EC2 instance running Linux to use static DNS server entries, use a text editor such as vim to edit the file /etc/dhcp/dhclient.conf and add the following line to the end of the file:
supersede domain-name-servers xxx.xxx.xxx.xxx, xxx.xxx.xxx.xxx;
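With the nameservers from the question, that line would read as follows (supersede replaces the server-provided list entirely, so the EC2 entry disappears):
supersede domain-name-servers 10.0.1.185, 10.100.0.130;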
Ubuntu - dhclient.conf - DHCP client configuration file 
The supersede statement
supersede [ option declaration ] ;
If for some option the client should always use a locally-configured value or values
rather than whatever is supplied by the server, these values can be defined in the
supersede statement.
The prepend statement
prepend [ option declaration ] ;
If for some set of options the client should use a value you supply, and then use the
values supplied by the server, if any, these values can be defined in the prepend
statement. The prepend statement can only be used for options which allow more than one
value to be given. This restriction is not enforced - if you ignore it, the behaviour
will be unpredictable.
The append statement
append [ option declaration ] ;
If for some set of options the client should first use the values supplied by the server,
if any, and then use values you supply, these values can be defined in the append
statement. The append statement can only be used for options which allow more than one
value to be given. This restriction is not enforced - if you ignore it, the behaviour
will be unpredictable.
Here someone came up with a solution that basically replaces the file on boot using rc.local:
https://forums.aws.amazon.com/thread.jspa?threadID=74497
Edit /etc/sysconfig/network-scripts/ifcfg-eth0 to say PEERDNS=no
Create a file called /etc/resolv.backup with what you want
Add the following 2 lines to /etc/rc.local:
rm -f /etc/resolv.conf
cp /etc/resolv.backup /etc/resolv.conf
This is what we are doing for our servers in the environment.
interface "eth0"
{
prepend domain-name-servers 10.x.x.x;
supersede host-name "{Hostname}";
append domain-search "domain";
supersede domain-name "DOMAIN";
}
Hope this helps.
The following worked on Debian stretch on AWS EC2.
Just create /etc/dhcp/dhclient-enter-hooks.d/nodnsupdate:
#!/bin/sh
# Override dhclient's make_resolv_conf() with a no-op so the DHCP client
# never rewrites /etc/resolv.conf.
make_resolv_conf(){
    :
}
Then you can modify /etc/resolv.conf and your changes will persist across restarts.
Set it up in crontab as:
@reboot cp -r /home/.../resolv.conf /etc/resolv.conf

What gems do you recommend to use for this kind of automation?

I have to create a script to manage a maintenance-page server for my hosting company.
I will need a CLI interface that acts like this (example scenario):
(Here, suppose that mcli is the name of the script and 1.1.1.1 is the original server address that hosts the website, www.exemple.com.)
Here I just create the loopback interface on the maintenance server with the original IP address and create the nginx site-specific config file in sites-enabled:
$ mcli register www.exemple.com 1.1.1.1
[DEBUG] Adding IP 1.1.1.1 to new loopback interface lo:001001001001
[WARNING] No root directory specified, setting default maintenance page.
[DEBUG] Registering www.exemple.com maintenance page and reloading Nginx: OK
Then, when I want to enable the maintenance page and completely shut down the website:
$ mcli maintenance www.exemple.com
[DEBUG] Connecting to router with SSH: OK
[DEBUG] Setting new route to 1.1.1.1 to maintenance server: OK
[DEBUG] Writing configuration: Ok
Then removing the maintenance page:
$ mcli nomaintenance www.exemple.com
[DEBUG] Connecting to router with SSH: OK
[DEBUG] Removing route to 1.1.1.1: Ok
[DEBUG] Writing configuration: Ok
And I would need a function to see the current state of the websites:
$ mcli list
+-----------------+------------+------------------+
| Site Name       | Server I.P | Maintenance mode |
+-----------------+------------+------------------+
| www.example.com | 1.1.1.1    | Enabled          |
| www.example.org | 1.1.1.2    | Disabled         |
+-----------------+------------+------------------+
$ mcli show www.example.org
Site Name: www.example.org
Server I.P: 1.1.1.2
Maintenance Mode: Disabled
Root Directory: /var/www/maintenance/default/
But I have never done this kind of scripting with Ruby. What gems do you recommend for this kind of thing? For command-line parsing? Column/colorized output? SSH connections (needed to connect to Cisco routers)?
Do you recommend using a local database (SQLite) to store metadata (state changes, current states), or computing it on the fly by analyzing the nginx/interface configuration files and using syslog to monitor changes made with this script?
This script will be used initially for a massive physical datacenter migration, and afterwards for routine scheduled downtimes.
Thank you
First of all, I'd recommend you get a copy of Build Awesome Command-Line Applications in Ruby.
That said, you might want to check:
GLI - command line parsing like git
OptionParser - command line parsing
Personally, I'd go for the SQLite approach for storing data, but I'm biased (having a strong SQL background).
Thor is a good gem for handling CLI options. It allows this type of organization in your script:
class Maintenance < Thor
  desc "maintenance", "put up maintenance page"
  method_option :switch, :aliases => '-s', :type => :string

  # The method name is the name of the task that would be run => mcli maintenance
  def maintenance
    # do stuff
  end

  no_tasks do
    # methods that you don't want cli tasks for go here
  end
end

Maintenance.start
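Assuming the class above is saved as an executable file named mcli (the filename is just for illustration), Thor maps methods to subcommands:
$ ./mcli maintenance --switch=foo
$ ./mcli help maintenance   # Thor generates help for each task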
I don't really have any good suggestions for column/colorized output though.
I definitely recommend using some kind of database to store state, though. Maybe not SQLite; I would probably opt for a Redis database that stores key/value pairs with the information you are looking for.
We have a similar task. I use the following architecture:
A small application (written in C) that generates the config file.
A new update_clusters switch added to the nginx init.d script, which restarts nginx only if the config file has changed:
update_clusters() {
    # Regenerate the cluster config into a temporary file.
    ${CONF_GEN} --outfile=/tmp/nginx_clusters.conf
    RETVAL=$?
    if [[ "$RETVAL" != "0" ]]; then
        return 5
    fi
    # Reload nginx only when the generated config differs from the active one.
    if ! diff ${CLUSTER_CONF_FILE} /tmp/nginx_clusters.conf > /dev/null; then
        echo "Cluster configuration changed. Reload service"
        mv -f /tmp/nginx_clusters.conf ${CLUSTER_CONF_FILE}
        reload
    fi
}
A set of bash scripts to add records to the database.
A web console to add/modify/delete records in the database (ExtJS + nginx module).
