Tuesday, February 12, 2013

SCOM R2 DB Move: How To Get Rid Of The Nagging EventID 100

A new customer of mine had their SCOM R2 databases moved by someone working for another company. But soon it turned out many issues came to be after the move of the SCOM R2 databases.

Finally this customer called me because they needed those issues to be fixed, no matter what. So I started analyzing all the issues they had by checking the RMS, MS servers and related SQL Servers. Boy, there was so much wrong! Basic stuff actually. This was really bad. But already after a day of troubleshooting things started to look better and SCOM was getting healthier again.

The nagging EventID 100
One thing remained though: one of the MS servers kept on logging EventID 100 for three times, telling it couldn’t connect to SQL Server. However, the normal registry entries and database tables were set correctly, pointing to the new SQL Server. But no matter what I checked and tried EventID 100 kept coming back:

DetectDuplicateRelAgnToSrvMonitor.vbs : Script executed with Error Number: -2147467259 Error Details: [DBNETLIB][ConnectionOpen (Connect()).]SQL Server does not exist or access denied.

Also an EventID was created for not being able to run the grooming procedures. Which isn’t good as well…

But when checking this TechNet library article, all about moving the SCOM R2 database to a new SQL Server, all was OK. But still this EventID 100 kept coming back biting me. Time for a deep dive.

The deep dive
First I ran a check by exporting all unsealed MPs. Perhaps a MP was made with references to the old SQL Server? This customer has many customizations in place so it was a valid check. But as it turned out, the MPs were clean.

Even when the MS server had it’s cache cleaned, the errors came back. So it was pushed from the SQL database, telling me the culprit was to be found there. But the usual suspect (Step 6) was OK.

Finally I contacted the DBA and asked him to query the whole OperationsManager database looking for the entry of the old SQL Server where the SCOM databases resided first.

How the West was WON
Soon the DBA came back and showed me the results. Many entries were found in Alert and/or Event tables. But one entry really puzzled me. It was the table dbo.MT_OpsMgrDBWatcher.

There were some GUIDs present (BaseManagedEntityId), representing the RMS and MS servers. All but one had the correct SQL Server name in the field DatabaseServerName_88A7F2CA_2467_7715_544D_9FDAAB0CAD11.

Exactly! The faulty MS still referred to the old SQL Server. After a manual modification of that entry soon the nagging EventIDs 100 disappeared.

Whenever moving the SCOM R2 databases to another SQL Server, don’t forget to checkout the table dbo.MT_OpsMgrDBWatcher.

No comments: