For a customer, quite a few Linux servers had to be monitored. During the roll-out of the OM12 SP1 Agents to these Linux systems, several errors popped up. Thanks to a highly experienced Linux guru working for this customer, these issues were sorted out pretty fast. Based on this experience I have made a table of the most frequently occurring errors, their possible causes and their fixes.
Issue | Cause & Resolution |
DNS Configuration error | 01: Faulty reverse DNS lookup zone. Once fixed, all went just fine. 02: The Linux system had multiple names, all registered in DNS. After a couple of retries the Agent landed properly. 03: The system resided in an old network segment which didn’t have a zone on the new DNS servers. Once fixed, all went just fine. |
Failed during SSH Discovery | 01: SSH access was locked down to root only. Once it was also opened up for the account OM12 SP1 uses on the Linux system, all went just fine. 02: An outdated version of SSH which isn’t compatible with the .NET SSH implementation Microsoft uses on the OM12 SP1 side. SSH requires an update. 03: An outdated version of SSH which doesn’t accept certain SSH calls. SSH requires an update. |
Failed to install kit | 01: The home folder of the OM12 SP1 Linux account was missing. After adding this folder all went just fine. 02: Certain files were locked. When the installation of the OM12 SP1 Agent was retried some hours later, all went just fine. |
Installation hangs | On some systems the installation of the OM12 SP1 Linux Agent just hung. I had to hard stop the OM12 SP1 Console, after which a second attempt went just fine. |
Unexpected Discovery Result | 01: Reason unknown. A second attempt (some hours later) ran just fine. 02: Restart the OM12 SP1 services on the OM12 SP1 Management Server running the Discovery (be careful though): http://www.opsman.co.za/?p=50 |
WinRM cannot complete the operation | A firewall was blocking the WinRM service. After opening the port (TCP 1270) it still didn’t work. See this posting to get it working: http://blogs.technet.com/b/chandanbharti/archive/2011/12/21/linux-agent-install-issue.aspx |
Agent verification failed | Multiple DNS issues: 01: The Linux system has a hostname that differs from its FQDN. Correct it (hostname or FQDN) and all is just fine. 02: The DNS record isn’t present. Add the record and all is just fine. |
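Many of the rows above come down to name resolution or a blocked port, so a quick pre-flight check before running the Discovery may save a retry or two. The Python sketch below is not part of OM12 and not something used at this customer; it only illustrates the checks from the table: forward and reverse DNS agreeing with each other, the hostname matching the FQDN, and the ports being reachable. The function names, the example host `web01.example.com` and the assumption that SSH listens on its default port 22 are all mine; TCP 1270 is the port mentioned in the table.

```python
import socket


def names_match(hostname, fqdn):
    """True when the short hostname equals the first label of the FQDN."""
    return hostname.split(".")[0].lower() == fqdn.split(".")[0].lower()


def check_dns(fqdn, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Return a list of DNS problems for fqdn (an empty list means none found).

    forward and reverse are injectable so the logic can be tried out
    without touching a real DNS server.
    """
    try:
        ip = forward(fqdn)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return [f"no forward (A) record for {fqdn}: {exc}"]
    try:
        ptr_name, aliases, _ = reverse(ip)
    except OSError as exc:  # socket.herror is a subclass of OSError
        return [f"no reverse (PTR) record for {ip}: {exc}"]
    known = {ptr_name.lower()} | {alias.lower() for alias in aliases}
    if fqdn.lower() not in known:
        return [f"reverse lookup of {ip} returns {ptr_name}, not {fqdn}"]
    return []


def port_open(host, port, timeout=3.0):
    """True when a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    target = "web01.example.com"  # hypothetical Linux system
    for problem in check_dns(target):
        print("DNS:", problem)
    # 22 = assumed default SSH port, 1270 = port named in the table above
    for port in (22, 1270):
        state = "open" if port_open(target, port) else "blocked or closed"
        print(f"TCP {port}: {state}")
```

Run it from the Management Server that performs the Discovery, since that is the machine whose view of DNS and the firewall matters. Note that TCP 1270 only answers once the Agent is installed and running, so a closed port before the first install says nothing about the firewall. A clean result does not rule out the SSH version or locked-file issues from the table; it only takes the DNS and firewall rows off the list.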
Other resources for troubleshooting OM12 SP1 UNIX/Linux Agent installation issues:
- Bob Cornelissen: http://www.bictt.com/blogs/bictt.php/2011/05/29/scom-trick-15-cross-platform
- Microsoft TechNet Wiki: http://social.technet.microsoft.com/wiki/contents/articles/4966.troubleshooting-unixlinux-agent-discovery-in-system-center-2012-operations-manager.aspx
- Stefan Roth: http://blog.scomfaq.ch/2012/09/11/scom-2012-linux-discovery-unspecified-failure/
- Enabling logging and debugging in OM12: http://technet.microsoft.com/en-us/library/hh212862
- Microsoft TechNet – Troubleshooting UNIX/Linux monitoring: http://technet.microsoft.com/en-us/library/hh212885
Other useful resources, all related to UNIX/Linux monitoring with OM12: