Thursday, January 29, 2009

SCOM Discovery Wizard doesn’t work

Symptom
The SCOM Discovery Wizard can run forever without ever discovering a single system, even though the systems to be discovered are up and running and not blocked by a firewall.

Cause
The SQL Broker (SQL Server Service Broker) of the OperationsManager database is not enabled. Without it, the Discovery Wizard will not function.

Remedy

  1. Check whether the SQL Broker is enabled
    - Open SQL Server Management Studio
    - Select the right instance and the OpsMgr database
    - Start a new query on the OpsMgr database:
    SELECT is_broker_enabled FROM sys.databases WHERE name = 'OperationsManager'
    - Value = 0: SQL Broker is disabled. Go to Step 2.
    - Value = 1: SQL Broker is enabled. All is OK.

    Check here for another issue which might be causing the Discovery Wizard to run forever.
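The check in Step 1 can be extended a little. The query below is a minimal sketch, assuming the database is named 'OperationsManager' as elsewhere in this post. Besides the broker flag it returns the access mode and state of the database, which helps to spot a database that was left in single user mode by an earlier attempt.

```sql
-- Sketch: broker flag plus access mode and state in one result row.
-- Assumption: the database is named 'OperationsManager', as in this post.
SELECT name,
       is_broker_enabled,   -- 0 = SQL Broker disabled, 1 = enabled
       user_access_desc,    -- MULTI_USER, SINGLE_USER or RESTRICTED_USER
       state_desc           -- ONLINE, OFFLINE, RESTORING, ...
FROM sys.databases
WHERE name = 'OperationsManager';
```

If user_access_desc shows SINGLE_USER while nobody is working on the database, the last query of Step 2 (SET MULTI_USER) probably still has to be run.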

  2. Enabling the SQL Broker for the OpsMgr database
    - Open SQL Server Management Studio
    - Select the right instance and the OpsMgr database
    - Start a new query on the OpsMgr database:
    ALTER DATABASE OperationsManager SET SINGLE_USER WITH ROLLBACK IMMEDIATE
    - Click Execute
    - Start this query on the OpsMgr database:
    ALTER DATABASE OperationsManager SET ENABLE_BROKER
    - Click Execute
    - Close SQL Server Management Studio.
    Note: Closing SQL Server Management Studio closes the connection that holds the database in single user mode. Most of the time, all SCOM-related services on the RMS have to be stopped first, since these services keep connections open to this database. Without stopping them, one won't be able to run the next query.

    - Open SQL Server Management Studio
    - Select the right instance and the OpsMgr database
    - Start a new query on the OpsMgr database:
    ALTER DATABASE OperationsManager SET MULTI_USER
    - Click Execute.
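When one of the ALTER DATABASE statements in Step 2 keeps running, other sessions are usually still connected to the database. As a rough sketch, the query below lists those sessions through the sys.sysprocesses compatibility view, which is available on SQL Server 2005 and later. It assumes the database name 'OperationsManager' and a login that is allowed to see other sessions (for example a sysadmin).

```sql
-- Sketch: list the sessions currently connected to the OperationsManager database.
-- Assumption: the login may view other sessions (e.g. sysadmin or VIEW SERVER STATE).
SELECT spid,           -- session ID
       loginame,       -- login that owns the session
       hostname,       -- machine the connection comes from
       program_name,   -- application name reported by the client
       status
FROM sys.sysprocesses
WHERE dbid = DB_ID('OperationsManager');
```

The hostname and program_name columns usually show which server and which SCOM service still hold a connection, so the matching service can be stopped there.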

Repeat Step 1 to verify that the SQL Broker is now enabled (the value must be 1).
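The three statements of Step 2 can also be run from one query window, so the session that takes the database into single user mode is the same one that enables the broker and switches back. This is only a sketch under the assumptions of this post (database name 'OperationsManager', SCOM services stopped first); it has not been verified against every SQL Server version.

```sql
-- Sketch: all of Step 2 in a single session, followed by the check from Step 1.
-- Assumption: SCOM-related services have been stopped beforehand.
ALTER DATABASE OperationsManager SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
GO
ALTER DATABASE OperationsManager SET ENABLE_BROKER;
GO
ALTER DATABASE OperationsManager SET MULTI_USER;
GO
-- Expected result after a successful run: is_broker_enabled = 1
SELECT is_broker_enabled FROM sys.databases WHERE name = 'OperationsManager';
GO

-- Alternative reported in the comments below: skip single user mode and let
-- the termination clause roll back open transactions of other sessions.
-- ALTER DATABASE OperationsManager SET ENABLE_BROKER WITH ROLLBACK IMMEDIATE;
```

GO is the batch separator of SQL Server Management Studio, not a T-SQL statement. ROLLBACK IMMEDIATE disconnects other sessions and rolls back their open transactions, so either variant is best run during a maintenance window.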

12 comments:

HunterX3 said...

Pretty much nothing in System Center Essentials Monitoring is working. No alerts are coming in. The SCE server cannot communicate with the clients because the clients' OpsMgr Health Service service won't start. It fails with: The OpsMgr Health Service service terminated with service-specific error 2147500037 (0x80004005).

As for the events I posted before, the second one is 29106/OpsMgr Config Service/Warning. The first one I cannot find. I am also seeing these: Configuration state of OpsMgr Health Service "{44293b59-9080-52ec-b353-dbfcbd6da198}" running on "sceservername" may be out of date. It should contact OpsMgr Config Service to synchronize its configuration state. Those are 29102/OpsMgr Config Service/Information.

In AD, we created an overall OU for the company and placed all of our sub-OUs under it, but those were for users, not computers. We did delete a bunch of computer accounts, and I tried to stop them from being monitored under Agent Managed in Administration. I did not remove any management packs that I am aware of.

You ask: Did SCOM use SCP (Service Connection Point) which has been removed? I don't know what that is so I'm guessing not. Is there a way to check?

Marnix Wolf said...

Hi HunterX3.

Now I see: we are talking about two different products. SCE (System Center Essentials) is not SCOM (System Center Operations Manager), even though they have certain things in common.

SCE, however, is certainly not distributed the way SCOM is, so there is no such thing as an RMS and an MS. SCE is kind of an 'all-in-one' product: a bit of SCOM, WSUS & SCCM.

My blog is solely about SCOM and not SCE. I have not really worked with SCE.

Having said that, I can give it a try to help you. However, I can't give any guarantees.

Nor will my next reply be swift, since I will be onsite at multiple customer sites and will not have much time to respond.

So let's start.

1 When was SCE still up & running before this all started? Better: after which action(s) did the problems with SCE start? The more detail you can give, the better.

2 You talk about deleting computer accounts. Did you remove them from SCE BEFORE the removal from AD?

You say you tried. How? Did you delete the Agents or did you run an uninstall (which would only be successful if the computer accounts were still present in AD)?

Are these deleted computer accounts still present in SCE?

3 Are those computer accounts servers as well?

4 What about SCE? Is it all installed on one server (SCE and SQL)?

5 Are all SCE & SQL related services still up & running?

6 Is the SCE server account still present and OK in AD?

7 Is the SQL server account still present and OK in AD?

8 For the error about the Agent not starting on a server, check these pages: http://blogs.technet.com/smsandmom/archive/2008/04/30/opsmgr-2007-healthservice-service-fails-to-start-with-25362-warning.aspx

and

http://www.eggheadcafe.com/conversation.aspx?messageid=31822881&threadid=31822881.

9 The presence of an SCP can be seen in AD. Check here: http://social.technet.microsoft.com/Forums/en-US/systemcenterdeployment/thread/b42152cc-4b6a-4df1-86bb-6942ad58d88e

An SCP is a way to help SCOM/SCE Agents that haven't been pushed from SCOM/SCE to find their configuration information.

Suppose you have an OS image with the SCOM/SCE Agent included. When this OS starts, the Agent will contact the SCP for its configuration information (which server it has to report to).

But an SCP is only of interest when one doesn't push the SCOM/SCE Agent from SCOM/SCE.

Good luck with it all.

Jolivenom said...

I was following a Microsoft KB article, http://support.microsoft.com/kb/941409, and it doesn't include this line that you have:

ALTER DATABASE OperationsManager SET SINGLE_USER WITH ROLLBACK IMMEDIATE

So when I tried to execute
ALTER DATABASE OperationsManager SET ENABLE_BROKER
the query just kept running with no response or results.

Using an eval version of SQL2008 and Ops Manager 2007 R2

Dominique said...

Hello,
How long should the second command
ALTER DATABASE OperationsManager SET SINGLE_USER WITH ROLLBACK IMMEDIATE run?

It is already over 10 minutes...

Thanks,
Dom

Marnix Wolf said...

Hi Dom.

Normally this should finish within a couple of minutes at most. What matters here is which other SQL databases the server is hosting.

This blog post is based on personal experience with a dedicated SQL server.

Cheers,
Marnix

Dominique said...

Hi Marnix,

No other DB on this server... it is the RMS and it has its own DB on its own local SQL Server.

Thanks,
Dom

Dominique said...

I think I found the issue but I am not sure how to resolve it... All SCOM services are stopped but I have 27 connections to the database. Could they be from the MSs themselves? How do I drop them and/or find out who is connected?

Thanks,
Dom

Dominique said...

After successfully running ALTER DATABASE OperationsManager SET SINGLE_USER, the database is shown as Single User. But when running ALTER DATABASE OperationsManager SET ENABLE_BROKER I am getting:
Msg 5011, Level 14, State 5, Line 1
User does not have permission to alter database 'OperationsManager', the database does not exist, or the database is not in a state that allows access checks.
Msg 5069, Level 16, State 1, Line 1
ALTER DATABASE statement failed.
It seems contradictory! I cannot run the command because the database is in single user mode, but I need to be in single user mode to be able to run the command. Something is wrong!

Thanks,
Dom

Marnix Wolf said...

Hi Dom.

Do I understand it correctly that the RMS also runs SQL Server, hosting the SCOM DBs?

Cheers,
Marnix

Marnix Wolf said...

Hi Dom.

Disable the SDK service on the RMS and the Health Service on all the MS servers.

Close SQL Server Management Studio and start it again. Now you should be able to access the DB and run the queries against it.

If more direct contact is needed, please leave your e-mail address here.

Cheers,
Marnix

Unknown said...

By the way, I'm using SCOM 2012, and was experiencing the same Discovery Wizard problem. I went through the same SQL Server steps Marnix posted above and it resolved my issue. Thanks Marnix!

Mina said...

Don't change the DB to single user mode; you will get stuck.
I changed it to single user mode and kept trying to run the last query, but I couldn't, because other processes were accessing the DB, and each time I killed a process another one started.

This query works immediately:

alter database [] set enable_broker with rollback immediate