The problem
SNMPD is correctly delegating SNMP polling requests to another program but the response from that program is not valid. A manual run of the program with the same arguments is responding correctly.
The detail
I've installed the correct LSI raid drivers on a server and want to configure SNMP. As per the instructions, I've added the following to /etc/snmp/snmpd.conf to redirect SNMP polling requests with a given OID prefix to a program:
pass .1.3.6.1.4.1.3582 /usr/sbin/lsi_mrdsnmpmain
It doesn't work correctly for SNMP polling requests:
snmpget -v1 -c public localhost .1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.1
I get the following response:
Error in packet
Reason: (noSuchName) There is no such variable name in this MIB.
Failed object: SNMPv2-SMI::enterprises.3582.5.1.4.2.1.2.1.32.1
What I've tried
SNMPD passes two arguments, -g and <oid> and expects a three line response <oid>, <data-type> and <data-value>.
If I manually run the following:
/usr/sbin/lsi_mrdsnmpmain -g .1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.0
I correctly get a correct three line response:
.1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.0
integer
30
This means that the pass command is working correctly and the /usr/sbin/lsi_mrdsnmpmain program is working correctly in this example
I tried replacing /usr/sbin/lsi_mrdsnmpmain with a bash script. The bash script delegates the call and logs the supplied arguments and output from the delegated call:
#!/bin/bash
echo "In: '$#" > /var/log/snmp-pass-test
RETURN=$(/usr/sbin/lsi_mrdsnmpmain $#)
echo "$RETURN"
echo "Out: '$RETURN'" >> /var/log/snmp-pass-test
And modified the pass command to redirect to the bash script. If I run the bash script manually /usr/sbin/snmp-pass-test -g .1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.0 I get the correct three line response as I did when I ran /usr/sbin/lsi_mrdsnmpmain manually and I get the following logged:
In: '-g .1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.0
Out: '.1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.0
integer
30'
When I rerun the snmpget test, I get the same Error in packet... error and the bash script's logging shows that the captured delegated call output is empty:
In: '-g .1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.0
Out: ''
If I modify the bash script to only echo an empty line I also get the same Error in packet... message.
I've also tried ensuring that the environment variables that are present when I manually call /usr/sbin/lsi_mrdsnmpmain are the same for the bash script but I get the same empty output.
Finally, my questions
Why would the bash script behave differently in these two scenarios?
Is it likely that the problem that exists with the bash scripts is the same as originally noticed (manually running program has different output to SNMPD run program)?
Updates
eewanco's suggestions
What user is running the program in each scenario?
I added echo "$(whoami)" > /var/log/snmp-pass-test to the bash script and root was added to the logs
Maybe try executing it in cron
Adding the following to root's crontab and the correct three line response was logged:
* * * * * /usr/sbin/lsi_mrdsnmpmain -g .1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.1 >> /var/log/snmp-test-cron 2>&1
Grisha Levit's suggestion
Try logging the stderr
There aren't any errors logged
Checking /var/log/messages
When I run it via SNMPD, I get MegaRAID SNMP AGENT: Error in getting Shared Memory(lsi_mrdsnmpmain) logged. When I run it directly, I don't. I've done a bit of googling and I may need lm_sensors installed; I'll try this.
I installed lm_sensors & compat-libstdc++-33.i686 (the latter because it said it was a pre-requisite from the instructions and I was missing it), uninstalled and reinstalled the LSI drivers and am experiencing the same issue.
SELinux
I accidently stumbled upon a page about extending snmpd with scripts and it says to check the script has the right SELinux context. I ran grep AVC /var/log/audit/audit.log | grep snmp before and after running a snmpget and the following entry is added as a direct result from running snmpget:
type=AVC msg=audit(1485967641.075:271): avc: denied { unix_read unix_write } for pid=5552 comm="lsi_mrdsnmpmain" key=558265 scontext=system_u:system_r:snmpd_t:s0 tcontext=system_u:system_r:initrc_t:s0 tclass=shm
I'm now assuming that SELinux is causing the call to fail; I'll dig further...see answer for solution.
strace (eewanco's suggestion)
Try using strace with and without snmp and see if you can catch a system call failure or some additional hints
For completeness, I wanted to see if strace would have hinted that SELinux was denying. I had to remove the policy packages using semodule -r <policy-package-name> to reintroduce the problem then ran the following:
strace snmpget -v1 -c public localhost .1.3.6.1.4.1.3582.5.1.4.2.1.2.1.32.1 >> strace.log 2>&1
The end of strace.log is as follows and unless I'm missing something, it doesn't seem to provide any hints:
...
sendmsg(3, {msg_name(16)={sa_family=AF_INET, sin_port=htons(161), sin_addr=inet_addr("127.0.0.1")}, msg_iov(1)= [{"0;\2\1\0\4\20public\240$\2\4I\264-m\2"..., 61}], msg_controllen=32, {cmsg_len=28, cmsg_level=SOL_IP, cmsg_type=, ...}, msg_flags=0}, MSG_DONTWAIT|MSG_NOSIGNAL) = 61
select(4, [3], NULL, NULL, {0, 999997}) = 1 (in [3], left {0, 998475})
brk(0xab9000) = 0xab9000
recvmsg(3, {msg_name(16)={sa_family=AF_INET, sin_port=htons(161), sin_addr=inet_addr("127.0.0.1")}, msg_iov(1)= [{"0;\2\1\0\4\20public\242$\2\4I\264-m\2"..., 65536}], msg_controllen=0, msg_flags=0}, MSG_DONTWAIT) = 61
write(2, "Error in packet\nReason: (noSuchN"..., 81Error in packet
Reason: (noSuchName) There is no such variable name in this MIB.
) = 81
write(2, "Failed object: ", 15Failed object: ) = 15
write(2, "SNMPv2-SMI::enterprises.3582.5.1"..., 48SNMPv2- SMI::enterprises.3582.5.1.4.2.1.2.1.32.1
) = 48
write(2, "\n", 1
) = 1
brk(0xaa9000) = 0xaa9000
close(3) = 0
exit_group(2) = ?
+++ exited with 2 +++
It was SELinux that was denying snmpd a delegated call to /usr/sbin/lsi_mrdsnmpmain (and probably beyond).
To identify it, I ran grep AVC /var/log/audit/audit.log and for each entry, I ran the following:
echo "<grepped-output>" | audit2allow -a -M <filename>
This creates a SELinux policy package that should allow the delegated call through. The package is then loaded using the following:
semodule -i <filename>.pp
I had to do this 5 times as there were different causes of denial (unix_read unix_write, associate, read write). I'll look to combine the modules into one.
Now when I run snmpget I get the correct delegated output:
SNMPv2-SMI::enterprises.3582.5.1.4.2.1.2.1.32.1 = INTEGER: 34
Related
I saw another question that seemed similar mpirun: token slots not supported but their solution did not work for me.
I get the error
token slots not supported at this time
when running the command mpirun -hostfile temp.txt hostname
where temp.txt is
hostname1 slots=2
hostname2 slots=2
I have the mpirun version 2021.5
Release Date: 20211102 (id: 9279b7d62).
It did not work to instead write
hostname1:2
hostname2:2
in that case the command runs but it instead does the number of physical processors that are available, which is default.
EDIT: I am adding the full output
[host RAMSES]$ mpirun -hostfile temp.txt hostname
[mpiexec#host] HYD_hostfile_process_tokens (../../../../../src/pm/i_hydra/libhydra/hostfile/hydra_hostfile.c:47): token slots not supported at this time
[mpiexec#host] HYD_hostfile_unique_parse (../../../../../src/pm/i_hydra/libhydra/hostfile/hydra_hostfile.c:232): unable to process token
[mpiexec#host] match_arg (../../../../../src/pm/i_hydra/libhydra/arg/hydra_arg.c:83): match handler returned error
[mpiexec#host] HYD_arg_parse_array (../../../../../src/pm/i_hydra/libhydra/arg/hydra_arg.c:128): argument matching returned error
[mpiexec#host] mpiexec_get_parameters (../../../../../src/pm/i_hydra/mpiexec/mpiexec_params.c:1359): error parsing input array
[mpiexec#host] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:1784): error parsing parameters
So I found that on my version of mpi I had to specify processor placement not in the hostfile, as most of the examples I found do, but rather in the machinefile.
So the new command and file look like:
mpirun -machinefile machine.txt hostname
machine.txt:
host1:2
host2:2
Given the following sample/simple snmpd.conf (Net-SNMP 5.7.2 on RHEL 7.4)
rwcommunity private 192.168.56.101
trapsess -Ci --clientaddr=192.168.56.128 -v 2c -c private 192.168.56.101:162
when starting a SNMP Daemon
snmpd -f -Lo -D -C -c data/snmpd_test.conf udp:192.168.56.128:161
We obtain ''Start Up'' InformRequest with IP source 192.56.168.1 instead of ...128 (WireShark snapshot below)
It is not surprising as the -D option allows us to output the debug information saying that
trace: netsnmp_config_process_memory_list(): read_config.c, 696:
read_config:mem: processing memory: clientaddr 192.168.56.128
trace: run_config_handler(): read_config.c, 562:
9:read_config:parser: clientaddr handler not registered for this time
Web sources however say:
snmp.conf
...This value is also used by snmpd when generating notifications.
snmpd.conf
trapsess [SNMPCMD_ARGS] HOST
provides a more generic mechanism for defining notification destinations.
SNMPCMD_ARGS should be the command-line options required for an equivalent
snmptrap (or snmpinform) command to send the desired notification
I read also some old threads like this one
However this option is working well with snmptrap
snmptrap -D -Lo -Ci --clientaddr=192.168.56.128 -M+path_to_my_mibs -v 2c -c private 192.168.56.101:162 "" .1.3.6.1.4.1.a.b.c.d.e.f.0 i 0
This option is also working when placed in snmp.conf ( mind there is no 'd' here ) and then it applies to snmpset and snmpget (and maybe other)
So my question is: Is it a documentation error, a bug, a misuse of the Net-SNMP stack ?
After a long struggle I may have an answer and I write a short note as I just found a trick
It seems that clientaddr is not parsed correctly wherever in the snmpd.conf
(I tried not also inside the trapsess line)
But it seems to be a valid option in the command line of snmpd
like it was a valid option in the snmptrap command line. So I assumed it could be the same parsing mechanism for both.
a condition also is that the IP addres must be valid one
which means that
snmpd -f -Lo -D -C -c data/snmpd_test.conf --clientaddr=192.168.56.128 udp:192.168.56.128:161
seems to fully solve my problem.
I will perform more tests and if accurate format this answer a little bit better but it seems a good hint.
Okay, hopefully I can explain this correctly as I have no idea what's causing this or how to resolve this.
For some reason bash commands (on a CentOS 6.x server) are displaying more information than "normally" and that causes issues with certain scripts. I have no clue if there is a name for this, but hopefully someone knows a solution for this.
First example.
Correct / good server:
[root#goodserver ~]# vzctl enter 3567
entered into CT 3567
[root#example /]#
(this is the correct behaviour)
Incorrect / bad server:
[root#badserver /]# vzctl enter 3127
Entering CT
entered into CT 3127
Open /dev/pts/0
[root#example /]#
With the "bad" server it will display more information as usual, like:
Entering CT
Open /dev/pts/0
It's like it parsing extra information on what it's doing.
Ofcourse the above is purely something cosmetic, however with several bash scripts we use, these issues are really issues.
A part of the script we use, uses the following command (there are more, but this is mainly a example of what's wrong):
DOMAIN=`vzctl exec $VEID 'hostname -d'`
The result of the above information is parsed in /etc/named.conf.
On the GOOD server it would be added in the named.conf like this:
zone "example.com" {
type master;
file "example.com";
allow-transfer {
200.190.100.10;
200.190.101.10;
common-allow-transfer;
};
};
The above is correct.
On the BAD server it would be added in the named.conf like this:
zone "Executing command: hostname -d
example.com" {
type master;
file "Executing command: hostname -d
example.com";
allow-transfer {
200.190.100.10;
200.190.101.10;
common-allow-transfer;
};
};
So it's add stuff of the action it does, in this example "Executing command: hostname -d"
Another example here when I run the command on a good server and on the bad server.
Bad server:
[root#bad-server /]# DOMAIN=`vzctl exec 3333 'hostname -d'`
[root#bad-server /]# echo $DOMAIN
Executing command: hostname -d example.com
Good server:
[root#good-server ~]# DOMAIN=`vzctl exec 4444 'hostname -d'`
[root#good-server ~]# echo $DOMAIN
example.com
My knowledge is limited, but I have tried several things checking rsyslog and the grub.conf, but nothing seems out of the ordinary.
I have no clue why it's displaying the extra information.
Probably it's something simple / stupid, but I have been trying to solve this for hours now and I really have no clue...
So any help is really appreciated.
Added information:
Both servers use: kernel.printk = 7 4 1 7
(I don't know if that's useful)
Well (thanks to Aaron for pointing me in the right direction) I finally found the little culprit which was causing all the issues I experienced with this script (which worked for every other server, so no need to change that obviously).
The issues were caused by the VERBOSE leven set in vz.conf (located in /etc/vz/ directory). There is an option in there called "VERBOSE" and in my case it was set to 3.
According to OpenVZ's website it does the following:
Increments logging level up from the default. Can be used multiple times.
Default value is set to the value of VERBOSE parameter in the global
configuration file vz.conf(5), or to 0 if not set by VERBOSE parameter.
After I changed VERBOSE=3 to VERBOSE=0 my script worked fine once again (as it did for every other server). :-)
So a big shoutout to Aaron for pointing me in the right direction. The answer is easy when you know where to look!
Sorry to say, but I am kinda disappointed by ndim's reaction. This is the 2nd time he was very unhelpful and rude in his response after that. He clearly didn't read the issue I posted correctly. Oh well.
I would make sure to properly parse the output of the command. In this case, we are only interested in lines of the form
entered into CT 12345
One way of doing this would be to pipe everything through sed and having sed print only the number when the line looks as above (untested, and I always forget which braces/brackets/parens need a backslash in front of them):
whateverthecommand | sed -n 's/^entered into CT ([0-9]{1,})$/\1/p'
I am rebuilding an Icinga server that has been left behind by a previous employee. I have everything up and running, except for a bunch of MIB files for 3com switches that I cannot get to work.
The server is a CentOS 6 OpenVZ container.
In the original server there is a bunch of mib files in the default location at /usr/share/snmp/mibs/ and the 3com ones at /usr/share/snmp/mibs/3Com_4500/MIBs. The 3Com mibs work fine:
/usr/lib/nagios/plugins/check_snmp -H 10.10.111.11 -P 2c -C public -o hwDevMFanStatus.65536 -s "active(1)" -m A3COM-HUAWEI-LswDEVM-MIBSNMP OK - active(1) |
In the new server, the MIBs in the 3com folder do not get acknowledged and I get errors like the following:
/usr/lib/nagios/plugins/check_snmp -H 10.10.111.11 -P2c -C someuser -o hwDevMFanStatus.65536 -s "active(1)" -m A3COM-HUAWEI-LswDEVM-MIB
External command error: No log handling enabled - turning on stderr logging
Cannot find module (A3COM-HUAWEI-LswDEVM-MIB): At line 0 in (none)
hwDevMFanStatus.65536: Unknown Object Identifier (Sub-id not found: (top) -> hwDevMFanStatus)
/etc/snmp/snmpd.conf is identical for both servers and so is /etc/sysconfig/snmp.
set does not show any ENV variable related to snmp or mib.
Thanks
You are confusing snmpd.conf and snmp.conf the former being the configuration file for the SNMP daemon whereas Net-SNMP applications use snmp.conf.
The mibs/mibdirs directives you are interested in would be specified in snmp.conf (see also man snmp.conf.
I have a very simple script to run. It calls tcpreplay and then ask the user to type in something. Then the read will fail with read: read error: 0: Resource temporarily unavailable.
Here is the code
#!/bin/bash
tcpreplay -ieth4 SMTP.pcap
echo TEST
read HANDLE
echo $HANDLE
And the output is
[root#vse1 quick_test]# ./test.sh
sending out eth4
processing file: SMTP.pcap
Actual: 28 packets (4380 bytes) sent in 0.53 seconds. Rated: 8264.2 bps, 0.06 Mbps, 52.83 pps
Statistics for network device: eth4
Attempted packets: 28
Successful packets: 28
Failed packets: 0
Retried packets (ENOBUFS): 0
Retried packets (EAGAIN): 0
TEST
./test.sh: line 6: read: read error: 0: Resource temporarily unavailable
[root#vse1 quick_test]#
I am wondering if I need to close or clear up any handles or pipes after I run tcpreplay?
Apparently tcpreplay sets O_NONBLOCK on stdin and then doesn't remove it. I'd say it's a bug in tcpreplay. To work it around you can run tcpreplay with stdin redirected from /dev/null. Like this:
tcpreplay -i eth4 SMTP.pcap </dev/null
Addition: note that this tcpreplay behavior breaks non-interactive shells only.
Another addition: alternatively, if you really need tcpreplay to receive your
input you can write a short program which resets O_NONBLOCK. Like this one
(reset-nonblock.c):
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
int
main()
{
if (fcntl(STDIN_FILENO, F_SETFL,
fcntl(STDIN_FILENO, F_GETFL) & ~O_NONBLOCK) < 0) {
perror(NULL);
return 1;
}
return 0;
}
Make it with "make reset-nonblock", then put it in your PATH and use like this:
tcpreplay -i eth4 SMTP.pcap
reset-nonblock
While the C solution works, you can turn off nonblocking input in one line from the command-line using Python. Personally, I alias it to "setblocking" since it is fairly handy.
$ python3 -c $'import os\nos.set_blocking(0, True)'
You can also have Python print the previous state so that it may be changed only temporarily:
$ o=$(python3 -c $'import os\nprint(os.get_blocking(0))\nos.set_blocking(0, True)')
$ somecommandthatreadsstdin
$ python3 -c $'import os\nos.set_blocking(0, '$o')'
Resource temporarily unavailable is EAGAIN (or EWOULDBLOCK) which is the error code for nonblocking file descriptor when no further data is available (would block if wasn't in nonblocking mode). The previous command (tcpreplay in this case) erroneously left STDIN in nonblocking mode. The shell will not correct it, and the following process isn't meant to work with non- default nonblocking STDIN.
In your script, you can also turn off nonblocking with:
perl -MFcntl -e 'fcntl STDIN, F_SETFL, fcntl(STDIN, F_GETFL, 0) & ~O_NONBLOCK'