Friday, May 8, 2009

Health Service Heartbeat Failure Alerts when monitoring SCCM servers

In environments where an SCCM infrastructure is in place and the SCCM servers are monitored by SCOM, Health Service Heartbeat Failure Alerts appear in the SCOM Console at irregular intervals. These Alerts also close automatically, because the related service returns to a running state within a minute or two.

The cause is an SCCM mechanism that backs up the SCCM installation folder on that server. While doing so, it pauses the SCOM Agent Health Service for a short while (a minute or so).

This event will be shown in the OpsMgr event log:

[Screenshot: event in the OpsMgr event log]

Usually the service is resumed within a minute, so SCOM does not alert on it. Sometimes, however, the backup process takes a bit longer and this Alert is shown in the Console:

[Screenshot: Health Service Heartbeat Failure Alert in the SCOM Console]
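Why a short pause goes unnoticed while a longer one raises an Alert can be illustrated with a small model. This is only a sketch under stated assumptions: the function names are hypothetical, the interval and threshold values are illustrative and not read from any SCOM configuration, and the real heartbeat logic in SCOM is more involved than a simple count.

```python
import math


def missed_heartbeats(pause_seconds: float, interval_seconds: float) -> int:
    """Count the whole heartbeat intervals that elapse while the agent is paused.

    Simplification (assumption): one heartbeat is missed per full interval.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return math.floor(pause_seconds / interval_seconds)


def would_alert(pause_seconds: float, interval_seconds: float, missed_allowed: int) -> bool:
    """Return True when more heartbeats are missed than the threshold tolerates."""
    return missed_heartbeats(pause_seconds, interval_seconds) > missed_allowed


if __name__ == "__main__":
    # Illustrative values only (assumptions, not actual SCOM settings).
    interval = 60       # seconds between heartbeats
    allowed = 1         # missed heartbeats tolerated before alerting
    for pause in (45, 150):
        print(pause, would_alert(pause, interval, allowed))
```

In this model a 45-second pause misses no heartbeat and stays silent, while a 150-second pause misses two and crosses the illustrative threshold.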

Even though the Agent Health Service is suspended (Maintenance Mode), an Alert will be raised. This is because the Agent Health Watcher is still running, and it fires off the Alert:

[Screenshot: Alert from the Agent Health Watcher]

For more information about Maintenance Mode, check this posting of mine.
