Tuesday, June 11, 2013

High Level Overview: NetApp SAN Monitoring With DATA ONTAP MP

Update 06-12-2013:
Cameron Fuller posted a blog article today all about tuning this MP. Awesome. So once everything is in place, see his article for more information about how to tune this MP. Thanks Cameron!

This posting gives a high level overview of the steps required to monitor a NetApp SAN with the DATA ONTAP MP, titled OnCommand PlugIn by NetApp. The overview is based on version 3.2 of the OnCommand PlugIn.

For a full installation manual please use the PDF files supplied by NetApp. These manuals are part of the downloadable executable (OnCommand-PlugIn-Microsoft_3.2_x64_NetApp.exe).

This MP has some dependencies. Without having them in place AND properly configured, the OnCommand Plugin won’t work. So make sure all is accounted for.

  1. PowerShell version 3.0 has to be installed on ALL OM12 Management Servers;
  2. NetApp OnCommand PlugIn has to be installed on ALL OM12 Management Servers;
  3. SNMP on ALL NetApp Filers must be enabled and configured;
  4. ALL NetApp Filers must be present in OM12 as network devices (so run a Discovery);
  5. The OM12 Action account requires permissions on the NetApp Filers;
  6. A SQL server for hosting the SQL database the OnCommand Plugin uses. The SQL Server hosting the OpsMgr SQL database will do the trick.

Dependencies 1, 3, 4 and 5 must be in place before you start installing the OnCommand PlugIn. Dependency 2 is taken care of while installing and configuring the OnCommand PlugIn. The SQL Server from dependency 6 is specified during the installation (see step 1.5).
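
The dependency list can be kept as a small pre-installation checklist. The Python sketch below is only an illustration: the `Dependency` class, the `done` flags and both helper functions are hypothetical names, not part of the OnCommand PlugIn or OM12. Treating the SQL Server (dependency 6) as a pre-installation item is an assumption, based on the installer asking for a SQL Server.

```python
from dataclasses import dataclass


@dataclass
class Dependency:
    number: int
    description: str
    before_install: bool  # must be in place before running the installer
    done: bool = False    # set by hand after checking the environment


# Mirrors the six dependencies listed above.
DEPENDENCIES = [
    Dependency(1, "PowerShell 3.0 on all OM12 Management Servers", True),
    Dependency(2, "OnCommand PlugIn on all OM12 Management Servers", False),
    Dependency(3, "SNMP enabled and configured on all NetApp Filers", True),
    Dependency(4, "All NetApp Filers discovered in OM12 as network devices", True),
    Dependency(5, "OM12 Action account has permissions on the NetApp Filers", True),
    # Assumption: the installer asks for a SQL Server, so treat it as a prerequisite.
    Dependency(6, "SQL Server available for the OnCommand PlugIn database", True),
]


def missing_before_install(deps):
    """Return the numbers of prerequisites that are not yet in place."""
    return [d.number for d in deps if d.before_install and not d.done]


def ready_to_install(deps):
    """True when every pre-installation dependency is marked done."""
    return not missing_before_install(deps)


if __name__ == "__main__":
    for dep in DEPENDENCIES:
        dep.done = dep.number in {1, 3}  # example: only 1 and 3 verified so far
    print("Still missing:", missing_before_install(DEPENDENCIES))
    print("Ready to install:", ready_to_install(DEPENDENCIES))
```

Marking each item by hand after verifying it keeps the sketch honest: it records what you checked, it does not probe the servers or the Filers itself.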

Installation & configuration
The installation starts simply enough: install the OnCommand PlugIn on ALL OM12 Management Servers. Please make sure PowerShell 3.0 is installed and operational before you start, otherwise the installation will fail.

  1. Installing OnCommand Plugin on all OM12 Management Servers
    1. Start the file OnCommand-PlugIn-Microsoft_3.2_x64_NetApp.exe with elevated permissions.
    2. Follow the wizard and select the required components, e.g. SCOM Management Packs, Storage Monitoring, SCOM Console Integration, Cmdlets, Documentation and OnCommand Discovery Agent;
    3. If you run SCORCH and/or Hyper-V, you can also select the components related to those technologies;
    4. The account you specify requires local admin access on the OM12 Management Servers. Using the OM12 Action account often works best;
    5. From version 3.2 this MP uses a SQL database as well. Using the same SQL server which hosts the OpsMgr database works fine for me.

  2. Configuring the NetApp MP
    Make sure all NetApp Filers are already discovered and monitored in OM12 as network devices.
    1. During the installation of the OnCommand PlugIn on the OM12 Management Servers two NetApp MPs are imported: OnCommand Data ONTAP and OnCommand Data ONTAP Reports;
    2. If the OM12 Console was open while you installed the OnCommand PlugIn, close it and open it again;
    3. Create an MP to store the overrides you make for the NetApp MPs;
    4. Go to Monitoring > Data ONTAP > Storage Systems > Management Server. Select one of the listed OM12 Management Servers and, on the right side of the OM12 Console under the header Health Service Tasks, click Data ONTAP: Add Controller;
    5. Add all NetApp controllers, one by one;
    6. Go to Monitoring > Data ONTAP > Storage Systems > Controllers. Select one of the listed Controllers and, on the right side of the OM12 Console under the header Data ONTAP Controller Tasks, click Data ONTAP: Manage Controller Credentials;
    7. Add the required credentials for each Controller. Best practice here is to use an AD based account. When SSL isn’t required, clear the selection. Because of a bug, removing the SSL requirement might throw an error. Simply click Continue and go on.

  3. Discovering the NetApp components
    Now all NetApp components need to be discovered. Otherwise there is no monitoring.
    1. Search for the Rule Data ONTAP: Discovery Rule. Use this shortcut for this search: go to Tools (top menu bar of the OM12 Console) > Search > Rules. Saves you a lot of time;
    2. By default this Rule is turned off. Enable it through an override and store it in the MP created in Step 2.3;
    3. Now the Discovery has to be started. Go to Monitoring > Data ONTAP > Storage Systems > Management Server. Select one of the listed Controllers and, on the right side of the OM12 Console under the header Data ONTAP Controller Tasks, click Data ONTAP: Run Discovery Task;
    4. When the OM12 Action account is authorized for accessing the NetApp SAN, you don’t need to enter credentials for running this task;
    5. After an hour or so all NetApp components are discovered; a bit later they will show a monitored state.

  4. Required: Tuning!
    This MP is really good and really appreciated by many of my customers. However, many of the Monitors in this MP have their thresholds set to zero, so those Monitors need some good tuning to get the best out of this MP.

    Other Monitors use wrong thresholds. This isn’t a bug but is done on purpose, forcing you to tune them according to your environment. Once that is done, this MP will really deliver added value.
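
    One way to keep track of the tuning work is to write down the current threshold of each Monitor and list the ones still at zero. The Python sketch below assumes you have typed those values into a dictionary by hand; the function name and the example monitor names and values are hypothetical and are not the MP's real defaults.

```python
def monitors_needing_tuning(thresholds):
    """Return names of monitors whose threshold is still zero, sorted by name."""
    return sorted(name for name, value in thresholds.items() if value == 0)


if __name__ == "__main__":
    # Hypothetical values typed in by hand; not the MP's real defaults.
    noted_thresholds = {
        "Example monitor A": 0,
        "Example monitor B": 85,
        "Example monitor C": 0,
    }
    for name in monitors_needing_tuning(noted_thresholds):
        print("Needs an override:", name)
```

    Each name it prints is a candidate for an override stored in the MP created in Step 2.3.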

Compliments to NetApp for delivering such a good MP. When properly tuned (like any other MP), this MP really delivers added value for any organization running one or more NetApp SANs and OM12. I have seen many third party MPs, but not many are of this level. A job well done, NetApp!


SCOM Artist said...

Hello. First of all thanks for this post as it is a great overview for configuring the MP.

I recently installed the 4.0 version and I'm slightly confused by the wording of the documentation.

I installed the MP, added my controllers, set up credentials, and completed the "Run Discovery Task" that is located inside the SCOM console. I also set up the SNMP alert destinations on the NetApp. The documentation mentions the discovery will occur by default at 4-hour intervals. So do I still need to enable the Rules for discovery?

Sven Wells said...

Just a word of caution about this MP/Plug-In: we’ve found that some of the rules that run/sync each day actually close all open alerts, e.g. those from Data ONTAP: Volume Space Utilization (%) monitoring; then, after a few minutes, all of the closed alerts re-open. This can cause 2 alert storms: 1. for Closed alerts 2. for New alerts. NetApp tech support states that the re-sync rule, the one that syncs up with the NetApp controllers every so often (e.g. every night), basically clears all such open alerts and then reopens them when it discovers threshold breaches. According to NetApp tech support, this is by design. I believe it is a flaw, but there is no word on whether or not they will fix or enhance this so that open alerts stay open and the repeat count gets incremented.
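
The close-then-reopen pattern described in this comment can be spotted in an exported alert history. The Python sketch below is only an illustration under stated assumptions: the tuple layout of the events, the `find_reopened` name and the 10-minute window are hypothetical, and it does not talk to OM12 or to the NetApp controllers.

```python
from datetime import datetime, timedelta


def find_reopened(events, window=timedelta(minutes=10)):
    """Return (alert, target) keys that were closed and re-raised within `window`.

    `events` is an iterable of (timestamp, alert, target, action) tuples,
    where action is "closed" or "new". The window length is an assumption.
    """
    last_closed = {}
    reopened = []
    for when, alert, target, action in sorted(events, key=lambda e: e[0]):
        key = (alert, target)
        if action == "closed":
            last_closed[key] = when
        elif action == "new" and key in last_closed:
            # Pair this new alert with the most recent close of the same alert.
            if when - last_closed.pop(key) <= window:
                reopened.append(key)
    return reopened


if __name__ == "__main__":
    start = datetime(2013, 6, 12, 2, 0)
    name = "Data ONTAP: Volume Space Utilization (%)"
    # Hypothetical sample events; "vol1" is a made-up target name.
    sample = [
        (start, name, "vol1", "closed"),
        (start + timedelta(minutes=3), name, "vol1", "new"),
    ]
    for alert, target in find_reopened(sample):
        print("Closed and reopened:", alert, "on", target)
```

Alerts flagged this way are the ones that produce both storms, so they are the ones to filter out of notifications first.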