SNMP, OIDs and SCOM don’t seem to be a very exciting mix at first glance. However, when combined in a smart manner, they extend your monitoring solution in an awesome way. This posting is about just that. It describes at a high level how to go about it and highlights some potential pitfalls. And as an extra gift, it shows two short YouTube videos as a demonstration of the power of SNMP, OIDs and SCOM working together.
How about…
checking the UPS systems, whether they’re still powered and whether the batteries are being charged or not? And at what percentage the battery capacity is, with that value displayed in a graph in the SCOM Console? Or how about getting an Alert when the temperature in your datacenter is too high? And having the temperature plotted in a graph in near real time in SCOM as well? Or getting an Alert when water is detected in the datacenter?
All of this, and much, much more, can be realized with SCOM, some good equipment, good software for SNMP walks (available for free, like GetIf) and some testing.
Nowadays a data center thermometer with an Ethernet connection, or an ‘Industrial Ethernet Temperature, Humidity, Pressure Sensor With Relay Outputs’, can be bought for not too much money, like this one:
Many times these devices are white labeled and thus sold under different brands. One of the real manufacturers is Comet Systems, to be found here. In the Netherlands these devices are sold under another brand, like Atal. Even though this information seems trivial, it’s very important: it has everything to do with the related MIB files, about which I will tell more later on.
SNMP Get vs. SNMP Trap
Anyhow, devices like these are really awesome since they contain a whole SNMP stack of their own, which SCOM can query using a simple SNMP Get command. The advantage of this, compared to an SNMP Trap, is that a single Monitor can be built and targeted against a whole bunch of devices. With an SNMP Trap this won’t work, and a Monitor has to be built per device. Besides that, SNMP Traps have more downsides. So whenever I can, I stay away from SNMP Traps.
White label, other brands and the MIB mix-up
As stated before, many times these devices come from a couple of factories all over the globe. Companies buy them in bulk and rebrand them under a label of their own. However, it’s necessary to know exactly what type of device you’re using so you know exactly which MIB file to use. For instance, the device in the picture above is sold in the Netherlands under a totally different label and model.
However, the same MIB still applies, and it only refers to the brand and model names used by the real manufacturer. So this is the hardest part: searching for the original label and model type. Only then do you know which part of the MIB file relates to your device. But once you have tackled this, the rest is almost a walk in the park.
Let’s walk SNMP, some high level steps
Place the correct MIB file into the directory where GetIf loads its MIB files from. Start GetIf, enter the correct IP address and community string, and connect to the device. Go to the MBrowser tab, browse through the SNMP stack and find the OID you’re looking for, like temperature:
Write down this OID (highlighted in yellow) since you’re going to need it in SCOM later on.
Another interesting OID in this case is the one for flood detection, since this device is an Ethernet thermometer with additional inputs. Connected to one of these inputs is the LG-12 Flood detector, which works really simply and shows only two values: value 1 for all is OK (no water detected) and value 0 for ‘Houston, we’ve got a problem’ (water detected):
Also write down this OID.
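Since these OIDs get retyped into SCOM later on, it can help to check them once before using them. Below is a minimal Python sketch that validates a dotted-decimal OID and strips the leading dot some tools show. The function name, the dictionary and both OIDs are hypothetical placeholders of mine, not values from the device in this posting.

```python
import re

# Dotted-decimal OID: numeric arcs separated by dots, optional leading dot.
OID_PATTERN = re.compile(r"^\.?\d+(\.\d+)+$")


def normalize_oid(text: str) -> str:
    """Validate a dotted-decimal OID and return it without a leading dot."""
    candidate = text.strip()
    if not OID_PATTERN.match(candidate):
        raise ValueError(f"not a dotted-decimal OID: {text!r}")
    return candidate.lstrip(".")


# Placeholder OIDs for illustration only (99999 is a made-up enterprise
# number); replace them with the values your SNMP browser shows.
NOTED_OIDS = {
    "temperature": normalize_oid(".1.3.6.1.4.1.99999.1.1.0"),
    "flood_input": normalize_oid("1.3.6.1.4.1.99999.2.1.0"),
}
```

A typo such as a double dot raises a ValueError right away instead of surfacing later as a Monitor that never changes state.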
Let’s create a Flood Detection Monitor, some high level steps
Create this kind of Monitor: SNMP > Probe Based Detection > Simple Event Detection > Event Monitor – Single Event and Single Event.
Don’t forget to DISABLE the Monitor and enable it by using an override targeted against the Group containing all these devices! Of course, these devices need to be monitored by SCOM as network devices.
Use the same OID for both SNMP Probes (First and Second). And for Parameter Name (used in both Expressions, First and Second Expression) use this entry: /DataItem/SnmpVarBinds/SnmpVarBind[1]/Value.
Configure the Health and Alerting and save the Monitor. Don’t forget to enable the Monitor by using an override targeted against a Group containing these devices.
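The Parameter Name /DataItem/SnmpVarBinds/SnmpVarBind[1]/Value is an XPath-style path: it selects the value of the first variable binding the probe returned. The Python sketch below shows what that path picks out. The XML is a simplified shape I assume for illustration (a real data item carries more elements), and the OID in it is a placeholder.

```python
import xml.etree.ElementTree as ET

# Simplified, assumed shape of an SNMP probe data item; a real one carries
# more elements. The OID and the value are placeholders.
SAMPLE_DATA_ITEM = """
<DataItem>
  <SnmpVarBinds>
    <SnmpVarBind>
      <OID>1.3.6.1.4.1.99999.2.1.0</OID>
      <Value>0</Value>
    </SnmpVarBind>
  </SnmpVarBinds>
</DataItem>
"""


def first_varbind_value(data_item_xml: str) -> str:
    """Return the text selected by /DataItem/SnmpVarBinds/SnmpVarBind[1]/Value."""
    root = ET.fromstring(data_item_xml)  # the root element is <DataItem>
    # ElementTree paths are relative to the root; [1] is the first binding.
    node = root.find("SnmpVarBinds/SnmpVarBind[1]/Value")
    if node is None or node.text is None:
        raise ValueError("data item has no first variable binding value")
    return node.text.strip()
```

Because each probe here queries a single OID, the first binding is the only one, which is why the same path works in both Expressions.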
Time for a test of the Flood Detector Monitor
Let’s say the Flood Detector Monitor is properly built and configured, so it’s time for some testing. For this I have made two videos and uploaded them to YouTube.
Water Alert
In the first video the flood detector is put into a paper cup with some water:
Now the circuit closes (OID gets value 0) and SCOM raises an Alert:
Water is gone, Alert as well
In the second video the flood detector is removed from the paper cup:
Now the circuit is open again (OID gets value 1, all is well in SCOM) and the related Monitor is set to a Healthy state again, thus closing the Alert:
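What the two videos show can be summarized in a few lines of Python. This is only a sketch of the Monitor's logic, not how SCOM implements it: the state names and the choice of Critical for the unhealthy state are my assumptions (the actual severity is whatever you configured under Health), and a value other than 0 or 1 simply leaves the state unchanged because neither Expression matches.

```python
def flood_monitor_state(raw_value: str, previous: str = "Healthy") -> str:
    """Map the flood detector reading to a health state.

    "0" = circuit closed, water detected (First Expression matches).
    "1" = circuit open, no water (Second Expression matches).
    Any other value matches neither Expression, so the state is kept.
    """
    value = raw_value.strip()
    if value == "0":
        return "Critical"
    if value == "1":
        return "Healthy"
    return previous


def replay(readings):
    """Feed successive readings through the monitor and list each state."""
    state = "Healthy"
    states = []
    for reading in readings:
        state = flood_monitor_state(reading, state)
        states.append(state)
    return states
```

Replaying the readings "1", "0", "1" (dry, in the cup, dry again) walks through the same Healthy, Critical, Healthy sequence as the videos.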
Let’s create a Temperature Monitor
For this the same steps are used as for the Flood Detector Monitor.
Of course, a different OID and other values are at play here. Suppose you want an Alert when the temperature of your datacenter exceeds 25 degrees Celsius. The First Expression (situation is not OK) looks like this:
The Second Expression (situation is OK) looks like this:
Configure the Health and Alerting and save the Monitor. Don’t forget to enable the Monitor by using an override targeted against a Group containing these devices.
And now monitoring is in place and an Alert will be raised when the temperature of 25 degrees Celsius is exceeded.
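To make the two Expressions concrete, here is a small Python sketch of the comparison they perform. I assume the First Expression is ‘greater than 25’ and the Second Expression is its complement ‘less than or equal to 25’, and that the device reports plain degrees Celsius; if your device uses another unit or scale, convert first. The function names and state names are mine.

```python
THRESHOLD_CELSIUS = 25.0  # the limit used in this posting


def first_expression(raw_value: str) -> bool:
    """Not OK: the reading exceeds the threshold."""
    return float(raw_value) > THRESHOLD_CELSIUS


def second_expression(raw_value: str) -> bool:
    """OK: the reading is at or below the threshold (assumed complement)."""
    return float(raw_value) <= THRESHOLD_CELSIUS


def temperature_state(raw_value: str) -> str:
    """Return the health state that the matching Expression leads to."""
    if first_expression(raw_value):
        return "Critical"
    if second_expression(raw_value):
        return "Healthy"
    raise ValueError(f"reading matches neither Expression: {raw_value!r}")
```

The point of the design is that the two Expressions never match at the same time and together cover every numeric reading, so the Monitor always knows which state it is in.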
Let’s plot the temperature in near real time
For this a Rule is required, using the same OID as for the Temperature Monitor: Collection Rules > Performance Based > SNMP Performance:
Configure the SNMP Probe (nothing more than the OID and frequency of probing) and you’re done. Don’t forget to enable the Rule by using an override targeted against a Group containing these devices and you’re in business.
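Conceptually such a Rule does nothing more than read the OID at a fixed frequency and store one (time, value) point per poll. The Python sketch below illustrates just that idea; it is not SCOM code. The class, the function, the injected read_oid callable, the placeholder OID and the 300 second frequency are all assumptions of mine.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class SnmpPerformanceRule:
    oid: str                # the OID noted earlier (placeholder in the usage)
    frequency_seconds: int  # how often the probe runs


def collect_points(
    rule: SnmpPerformanceRule,
    read_oid: Callable[[str], str],
    start_time: int,
    polls: int,
) -> List[Tuple[int, float]]:
    """Simulate `polls` probe runs and return (timestamp, value) points."""
    points = []
    for i in range(polls):
        timestamp = start_time + i * rule.frequency_seconds
        points.append((timestamp, float(read_oid(rule.oid))))
    return points


# Usage with a fake reader that always answers 21.5 degrees.
rule = SnmpPerformanceRule(oid="1.3.6.1.4.1.99999.1.1.0", frequency_seconds=300)
points = collect_points(rule, lambda oid: "21.5", start_time=0, polls=3)
```

The shorter the frequency, the closer the graph gets to real time, at the cost of more queries against the device.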
In the SCOM Console add a Performance View targeted against these SNMP Network Devices or targeted against the Rule you created earlier. Be patient: within an hour or so data starts coming in :).
Conclusion
Even though SNMP, OIDs and SCOM might seem boring, there are many possibilities to extend your monitoring solution into places you didn’t expect. Many devices on the market have an SNMP stack. When you have the related MIB file and it contains some good OIDs, you can build almost anything. Happy SCOMming!