Issue
When you close an Alert which is raised by a Monitor, you’re basically ‘undermining’ the proper working of SCOM. Simply because the related Monitor is still in an unhealthy state.
So when the unhealthy condition happens again, the related Monitor won’t act. Simply because it’s still in an unhealthy state. So no state change which means NO ALERT.
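To make the "no state change, no Alert" behaviour concrete, here is a tiny toy model in Python. It is only a sketch of the logic described above, not SCOM's actual object model or API: the `Monitor` class, its `observe` method and the monitor name are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    """Toy stand-in for a SCOM Monitor (hypothetical, not the real SDK)."""
    name: str
    healthy: bool = True
    alerts: list = field(default_factory=list)

    def observe(self, condition_ok: bool) -> None:
        # An Alert is raised only on a healthy -> unhealthy transition.
        if self.healthy and not condition_ok:
            self.alerts.append({"monitor": self.name, "state": "New"})
        self.healthy = condition_ok


monitor = Monitor("Disk Free Space")
monitor.observe(condition_ok=False)      # healthy -> unhealthy: Alert raised
monitor.alerts[0]["state"] = "Closed"    # someone closes the Alert by hand
monitor.observe(condition_ok=False)      # still unhealthy: no state change
alert_count = len(monitor.alerts)
print(alert_count)
```

Closing the Alert changes nothing about the Monitor itself, so the second unhealthy observation passes by silently and the Alert count stays where it was.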
Fix
The best way to go about it is to TRAIN the people involved. To make sure they know how SCOM works and what the differences are between Rules and Monitors. How they function. How Alerts should be handled.
So that’s the starting point. But what about when people make mistakes, and Alerts get closed even though they’re generated by Monitors? Sure, more training and awareness are required. But there are still Monitors out there in an unhealthy state, requiring a ‘fix’.
So why not use the Task Scheduler to run a PS script which checks for unhealthy Monitors with closed Alerts, and resets them for you? Of course, this is no excuse for people to close Alerts without giving it a second thought, but it will help you out as an additional safeguard.
In some SCOM environments a scheduled script like this is a no-go area, whereas in other environments it will be a welcome safeguard. So make up your own mind.
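The check such a script has to make boils down to one filter: the Monitor is unhealthy AND its Alert is closed. Below is a rough sketch of that selection step in Python, using plain dictionaries instead of real SCOM objects. The field names, monitor names and the `monitors_to_reset` helper are assumptions for illustration only; a real script would get this information from SCOM and reset the Monitors there.

```python
def monitors_to_reset(monitors):
    """Return the names of Monitors that are unhealthy while their Alert is closed."""
    return [
        m["name"]
        for m in monitors
        if m["health"] != "Healthy" and m["alert_state"] == "Closed"
    ]


# Hypothetical inventory; in practice this would be queried from SCOM.
inventory = [
    {"name": "Disk Free Space", "health": "Critical", "alert_state": "Closed"},
    {"name": "CPU Utilization", "health": "Critical", "alert_state": "New"},
    {"name": "Service Running", "health": "Healthy", "alert_state": "Closed"},
]

to_reset = monitors_to_reset(inventory)

# "Resetting" in this toy model simply flips the Monitor back to healthy,
# so the next unhealthy condition is a real state change again.
for m in inventory:
    if m["name"] in to_reset:
        m["health"] = "Healthy"

print(to_reset)
```

Monitors with an open Alert are left alone on purpose: someone is presumably still working on those.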
The PS script
I am not a PS hero. But I know how to Google. So I found this posting on the blog SystemCenterTipps which provided me with a GREAT script.
And indeed, as the same posting states, even though it’s written for a Runbook in Orchestrator, with some minor adjustments it can be used as a standalone PS script as well, run by a plain Scheduled Task.
For this customer it turned out to be a great help. So a BIG word of thanks to the author of this posting.
1 comment:
There used to be a "green machine" script that did the same thing in SCOM 2007. Looks like this would be the SCORCH equivalent...