Advice on SNMP MIB trap organization - snmp

I am looking for some advice on SNMP MIB trap organization or best practices. I haven't found any material describing real world usage and expectations.
I have only briefly worked with SNMP in the past, and mostly just get/set; I have never had to deal with traps before.
Let me explain...
I recently joined a company and needed to look at their MIB, but the traps in it are not what I expected.
Each trap that raises an alarm condition (e.g. ‘over X threshold’ - severity critical, id 100) has a completely separate trap for its clear (‘over X threshold clear’ - severity clear, id 134). Each of the traps has an arbitrary ‘trap-id’ assigned to it, with no meaning or relationship information encoded in it. The only way to know that trap 134 clears trap 100 is to look at the textual name of the trap. This just doesn’t seem right.
For example, the fan failure trap is as follows (edited for brevity):
fooTrapFanFailure NOTIFICATION-TYPE
    OBJECTS {StampID, SerialNumber, Name, TrapID, Severity}
    DESCRIPTION "Fan failure, trap-id 105, severity major"
    ::= { fooTraps 8 }
fooTrapFanFailureClear NOTIFICATION-TYPE
    OBJECTS {StampID, SerialNumber, Name, TrapID, Severity}
    DESCRIPTION "Fan failure clear, trap-id 132, severity informational"
    ::= { fooTraps 11 }
The only way I know that 132 is a clear for 105 is to manually read the MIB or programmatically scan the MIB and build a table based on the trap name. This case is even more goofy as the Clear trap shows up with an ‘informational’ severity.
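The "scan the MIB and build a table based on the trap name" workaround might look roughly like the sketch below. It assumes the notification names and DESCRIPTION strings have already been extracted from the MIB into a dict (the extraction step is not shown), that every clear trap is named after its raise trap plus a "Clear" suffix, and that the trap-id appears in the DESCRIPTION as in the fan example. The function name build_clear_table is made up for illustration.

```python
import re

# Matches the "trap-id 105" fragment used in the DESCRIPTION clauses above.
TRAP_ID = re.compile(r"trap-id\s+(\d+)")


def build_clear_table(notifications, suffix="Clear"):
    """Return {raise_trap_id: clear_trap_id} for notifications paired by name.

    `notifications` maps a notification name to its DESCRIPTION text.
    """
    ids = {}
    for name, description in notifications.items():
        match = TRAP_ID.search(description)
        if match:
            ids[name] = int(match.group(1))

    table = {}
    for name, trap_id in ids.items():
        clear_name = name + suffix
        if clear_name in ids:
            table[trap_id] = ids[clear_name]
    return table


notifications = {
    "fooTrapFanFailure": "Fan failure, trap-id 105, severity major",
    "fooTrapFanFailureClear": "Fan failure clear, trap-id 132, severity informational",
}
clear_table = build_clear_table(notifications)
print(clear_table)
```

The table only works as long as the naming convention holds, which is exactly the fragility described above.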
I expected when an ‘over X threshold’ trap-id 100 is raised, it would be sent with its severity set to say ‘critical’ and when it clears, the very same trap-id 100 would be sent with a severity of ‘clear’.
Or it would be even better if there were just one generic alarm trap containing the trap-id and severity, instead of my 65 or so unique traps.
So, in short, the question is:
Is this 'two trap, one to raise and one to clear' normal?

It's not normal, but it's OK. I can see why they might have done it that way: it is easy to assign colors in HP OV NNM based on OID (I don't know the exact version, but it worked in the year 2000). Otherwise you might have to parse the packet to display colors on the manager/management station.
Generally it is a good idea to include the trap status as one of the trap's variable bindings.

I use my MIB files with SNMPc.
My file has only one trap, EventOccured, and all other logic (correlation) is done by SNMPc filters. I use Severity='CL' to clear traps.
So I don't understand why you need a Severity variable in the "fooTrapFanFailureClear" trap.

Related

How to show fieldnames and severity in SNMP CA Spectrum?

Conditions:
The CA Spectrum console server receives SNMP traps (events), and the MIB definition file is loaded into the CA system.
Problem:
In the events list I don't see the severity color, and I see only OIDs instead of the field names described by the MIB file.
Does anyone have any suggestions about that?
The problem is resolved. It was not in the MIB file but in the SNMP trap sender: the trap ID was being sent incorrectly, so of course CA Spectrum didn't recognize the event.
I used the Lextm.SharpSnmpLib library. The function Messenger.SendTrapV2 has a parameter called "enterprise", but you must not pass the enterprise OID to it; you need to pass the trap ID (!!!). It makes no sense, but it works!

How to define Severity in SNMP?

Hi, I am trying to understand the SNMP trap mechanism. I referred to http://docstore.mik.ua/orelly/networking_2ndEd/snmp/ch02_06.htm#enettdg-CHP-2-TABLE-8.html and understood that there are two types of traps, generic and enterprise-specific. Now, in my Java code, I want to capture the description from a specific OID:
// variable binding for Enterprise Specific objects, Severity (should be defined in MIB file)
pdu.add(new VariableBinding(new OID(trapOid), new OctetString("Major")));
Here, instead of "Major", what should I specify to get the severity for that specific OID?
Any help would be highly appreciated.
In general, the severity is not an attribute of an SNMP trap.
Usually a custom severity mapping is defined in the vendor-specific MIB file as a variable binding of the specific trap. Here is an example:
sysLogMessageSeverity OBJECT-TYPE
    SYNTAX INTEGER {
        emergency (0),      -- system is unusable
        alert (1),          -- action must be taken immediately
        critical (2),       -- critical conditions
        error (3),          -- error conditions
        warning (4),        -- warning conditions
        notice (5),         -- normal but significant condition
        informational (6),  -- informational messages
        debug (7)           -- debug-level messages
    }
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION
        "Severity level of the message"
    ::= { sysLogMibObjects 5 }
Please also note that most modern NMSs allow the user to assign a custom severity to any received SNMP trap based on user-defined rules.
One tool that can do this is NetDecision TrapVision. Find out more at: http://netmechanica.com/products/?prod_id=1003
I have used two approaches before:
1. Adding a severity variable to the MIB and including it in every trap sent.
2. Classifying the events that cause traps as Critical, Major, ... and assigning an enterprise trap id range to each class, e.g. traps with ids in the range (1,100) are Critical, traps with ids in the range (101,200) are Major, and so on.
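On the receiving side, the second approach boils down to a range lookup. The sketch below is only an illustration: the two ranges are the ones given above, further ranges are not listed in the answer and so are not filled in here, and the names SEVERITY_RANGES and severity_for_trap_id are made up.

```python
# (low, high, severity) with inclusive bounds, taken from the ranges above.
# Further classes would follow the same pattern; they are not listed in the
# answer, so they are left out here.
SEVERITY_RANGES = [
    (1, 100, "Critical"),
    (101, 200, "Major"),
]


def severity_for_trap_id(trap_id):
    """Return the severity class for an enterprise trap id, or None if unmapped."""
    for low, high, severity in SEVERITY_RANGES:
        if low <= trap_id <= high:
            return severity
    return None


print(severity_for_trap_id(42))   # falls in the Critical range
print(severity_for_trap_id(150))  # falls in the Major range
```

The trade-off against the first approach is that the severity is fixed by the id, so the same trap cannot be re-sent later with a different severity such as "clear".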

Best practices on setting exit status codes

When implementing my own scripts, is it the best practice to exit with different exit codes for different failure scenarios? Or should I just return exit code 1 for failure and 0 for success providing the reason on stderr?
Providing a descriptive error message to stderr is fine and well for interactive users, but if you expect your scripts to be used by other scripts/programs, you should have distinctive error codes for different failures, so the calling script could make an informed decision on how to handle the failure.
If the calling program does not need to handle different failures differently, it can simply check whether the return code is non-zero, but don't assume every caller will do that.
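As a rough sketch of what "an informed decision" can look like in a calling program, the example below maps exit codes to follow-up actions. The specific codes (2 and 3) and the action strings are hypothetical and chosen only for illustration; the one fixed convention is that 0 means success.

```python
import subprocess

# Hypothetical exit codes for an imaginary script; only 0 = success is a
# real convention, the rest are made up for this example.
EXIT_OK = 0
EXIT_BAD_INPUT = 2
EXIT_UNAVAILABLE = 3


def decide(returncode):
    """Pick a follow-up action from the child's exit code."""
    if returncode == EXIT_OK:
        return "continue"
    if returncode == EXIT_BAD_INPUT:
        return "fix input and stop"
    if returncode == EXIT_UNAVAILABLE:
        return "retry later"
    return "abort"  # any other non-zero code is treated as a generic failure


def run_and_decide(argv):
    """Run a command (list of arguments) and decide what to do with its result."""
    result = subprocess.run(argv)
    return decide(result.returncode)


print(decide(EXIT_UNAVAILABLE))
```

A caller that does not care about the distinction can ignore decide() and test only for a non-zero return code.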
There are some recommendations (see Wikipedia), but none of them is normative, except that the exit code is 0 if and only if the command succeeded:
http://en.wikipedia.org/wiki/Exit_status#POSIX
*In Unix and other POSIX-compatible systems, the wait system call sets a status value of type int packed as a bitfield with various types of child termination information. If the child terminated by exiting (as determined by the WIFEXITED macro; the usual alternative being that it died from an uncaught signal), SUS specifies that the low-order 8 bits of the exit status can be retrieved from the status value using the WEXITSTATUS macro in wait.h;[6][7] when using the POSIX waitid system call (added with POSIX-2001), the range of the status is no longer limited and can be in the full integer range.
POSIX-compatible systems typically use a convention of zero for success and non zero for error.[8] Some conventions have developed as to the relative meanings of various error codes; for example GNU recommend that codes with the high bit set be reserved for serious errors,[3] and FreeBSD have documented an extensive set of preferred interpretations.[9] Meanings for 15 status codes 64 through 78 are defined in sysexits.h. These historically derive from sendmail and other message transfer agents, but they have since found use in many other programs.[10]*

SNMP MIB DISPLAY-HINT or UNITS - which one has precedence?

I am writing a MIB and an SNMP agent. I am confused by an apparent conflict between DISPLAY-HINT and UNITS. Is it better for an NMS to have a DISPLAY-HINT, or knowledge of the UNITS?
The background for this question is as follows: One object in the MIB is mPowerVoltage:
FixedDiv10 ::= TEXTUAL-CONVENTION
    DISPLAY-HINT "d-1"
    STATUS current
    DESCRIPTION "Fixed point, one decimal"
    SYNTAX Integer32
mPowerVoltage OBJECT-TYPE
    SYNTAX FixedDiv10
    UNITS "V/10"
    MAX-ACCESS read-only
    STATUS current
    DESCRIPTION "Power Voltage in desiVolts"
    ::= { mPowerEntry 2 } -- an entry in a table with integer index
I understand how the value is actually transferred "on the wire": for instance, 10.8 V is transferred as 108 in an Integer32. This is my motivation for setting UNITS to "V/10" and describing the object as "Power Voltage in desiVolts". However, when I use snmpget I get:
snmpget -c public -v 1 -m -MY-MIB 192.168.1.3 mPowerVoltage.1
MY-MIB::mPowerVoltage.1 = INTEGER: 10.8 V/10
which is indeed what I specified, but is clearly wrong.
But I can hardly change the UNITS to "V"? Hence the question, should I remove the DISPLAY-HINT, or should I remove the UNITS?
Baard
As I understand it, they're two different things, so neither takes precedence.
DISPLAY-HINT tells the caller how to place the decimal point - so in your example it prints out an "on-the-wire" value of 108 as 10.8.
UNITS is just a bit of text that gets appended after the number, exactly as you typed it. In this case you should definitely change the units to "V" because you've told the caller to display the number in V by dividing it by 10.
It does seem a bit inconsistent that one is part of the textual convention, while the other is part of the object definition, however.
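To make the interaction concrete, here is a small sketch of how a client might combine the two. It is not how any particular NMS or net-snmp implements it; it only handles the integer hints "d" and "d-N", and the function name format_with_display_hint is made up.

```python
def format_with_display_hint(value, hint, units=""):
    """Render an integer using a 'd' or 'd-N' DISPLAY-HINT, then append UNITS."""
    if hint == "d":
        text = str(value)
    elif hint.startswith("d-"):
        places = int(hint[2:])
        if places == 0:
            text = str(value)
        else:
            sign = "-" if value < 0 else ""
            # Pad so there is at least one digit before the decimal point.
            digits = str(abs(value)).rjust(places + 1, "0")
            text = f"{sign}{digits[:-places]}.{digits[-places:]}"
    else:
        raise ValueError(f"unsupported DISPLAY-HINT: {hint!r}")
    # UNITS is appended verbatim; it does not affect the number at all.
    return f"{text} {units}".strip()


print(format_with_display_hint(108, "d-1", "V/10"))  # what the question observed
print(format_with_display_hint(108, "d-1", "V"))     # with UNITS changed to "V"
```

The hint has already moved the decimal point, so the appended UNITS string has to describe the displayed value, not the on-the-wire one.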

How to identify an unknown exit code?

I have a Mac application I wrote that often exits suddenly with exit code 33, which means nothing to me, and without any further indication of what went wrong. I have already searched the whole source code for the number 33 but couldn't find anything (I was hoping for a line of code like exit(33)).
Can you give me any hint on how to track down this problem? Is there a way, for example, to set a breakpoint on the exit function or something like that?
There are no predefined meanings for a process's exit code. The C standard defines EXIT_SUCCESS and EXIT_FAILURE without numeric values. On Unix-like systems they are defined as 0 and 1. Unix limits exit codes to an unsigned 8-bit integer, so they range from 0 to 255, but the meaning of each exit code (except 0 for success) is up to the developer.
FreeBSD defines a couple of values as documented on the sysexits(3) manpage. But the number 33 is not among them.
The best way to debug this problem would be to set a breakpoint on the various exit functions (exit, _exit) and see when and where they get called.
The problem was that there was a call to exit(12321) in my code, which gets reported in the console as 33. It turns out the status parameter of exit(int) cannot be an arbitrary int value: only the low-order 8 bits are reported to the parent, and 12321 mod 256 = 33.
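The truncation is easy to check with a little arithmetic. The sketch below assumes a POSIX system, where the parent sees only the low-order 8 bits of the status; the helper names are made up, and observed_exit_status simply spawns a child Python interpreter to compare against the real behavior.

```python
import subprocess
import sys


def reported_exit_status(code):
    """Status a POSIX parent sees: only the low-order 8 bits of `code`."""
    return code & 0xFF


def observed_exit_status(code):
    """Run a child process that exits with `code` and return what the parent sees."""
    child = [sys.executable, "-c", f"import sys; sys.exit({code})"]
    return subprocess.run(child).returncode


status = reported_exit_status(12321)
print(status)  # 12321 = 48 * 256 + 33
```

On a POSIX system, observed_exit_status(12321) should agree with this value. This also explains why searching the source for the literal 33 found nothing: any status congruent to 33 modulo 256 is reported the same way.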
