Wednesday, September 30, 2009

Crucial OpsMgr Services explained. Part IV: The Health Service

--------------------------------------------------------------------------------- 
Postings in the same series:
Part   I: The Basics.
Part  II: The SDK Service
Part III: The Config Service
---------------------------------------------------------------------------------

Meet the workhorse of OpsMgr!
image

The other two services (Config & SDK) are found only on the OpsMgr Management Servers. The Health Service is found there too, but also on every monitored server/workstation where an OpsMgr Agent is installed. And, last but not least, this service also runs on an OpsMgr Gateway Server, which can be looked upon as a ‘super’ OpsMgr Agent.

Ok, let’s continue. What does the Health Service do? It is commonly known as the Workflow Engine. It provides the means for running/executing the monitoring modules. These can be chained together in different ways (AKA workflows), thus enabling end-to-end monitoring scenarios.

The Health Service comes in two shapes, even though it is still the same service:

  1. Agent Health Service
    Runs on monitored servers/workstations where an OpsMgr Agent is installed. It collects performance data, executes tasks and so on. Even when it is disconnected from the Management Server it reports to, it continues to run. It queues the collected data and events on the disk of the monitored server/workstation. When the connection is restored it sends its collected data and events to the Management Server.

  2. Management Server Health Service
    Runs on a (Root) Management Server. Its functionality varies, based on the setup of the Management Group and the imported MPs.
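The queue-and-forward behaviour of the Agent Health Service can be pictured with a small toy model. This is only a sketch of the idea, not how the Health Service is implemented: the class name, the in-memory queue (the real agent queues on disk) and the `connected` flag are all made up for illustration.

```python
from collections import deque


class StoreAndForwardBuffer:
    """Toy model of an agent that keeps collecting while its server is unreachable."""

    def __init__(self):
        self.pending = deque()   # stands in for the agent's on-disk queue
        self.delivered = []      # stands in for what the Management Server received
        self.connected = True

    def collect(self, item):
        """Queue a collected item, then try to deliver everything that is pending."""
        self.pending.append(item)
        self.flush()

    def flush(self):
        """Send queued items in arrival order, but only while connected."""
        while self.connected and self.pending:
            self.delivered.append(self.pending.popleft())


agent = StoreAndForwardBuffer()
agent.collect("perf sample 1")
agent.connected = False            # connection to the Management Server is lost
agent.collect("event A")
agent.collect("perf sample 2")
agent.connected = True             # connection restored
agent.flush()
print(agent.delivered)
```

Nothing collected during the outage is lost; it is simply delivered later, in the order it was collected.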

But that is not all. The Health Service also has two other features which are good to know about:

  1. Extensibility
    When additional MPs are loaded with new functionality, the Health Service can be extended in order to support this new functionality.

  2. Credential Management
    It provides this functionality for other OpsMgr processes, so modules can be run/executed with a different set of credentials. How is this done? Well, hold on to your seats and fasten your seatbelts. :)

    The HealthService initiates the process MonitoringHost.exe and can spawn multiple MonitoringHost.exe processes as needed. Typically, you will see a couple of MonitoringHost processes executing under the Default Agent Action Account. In addition, the HealthService will launch MonitoringHost processes under any preconfigured Run-As accounts that are executing workflows on the agents, using those credentials. This gives the HealthService the credential management capability to support the execution of modules running as different users.

So when you look at the running processes on an OpsMgr Agent managed server/workstation you’ll see the process HealthService.exe and multiple processes MonitoringHost.exe, running under different credentials.
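One way to see this without clicking through Task Manager is to group the process list by account. The sketch below parses text in the column layout of `tasklist /V /FO CSV` as an English-language Windows prints it (column names are localized on other systems). The sample rows, PIDs and the `CONTOSO\svc-sqlmon` account are invented for illustration.

```python
import csv
import io
from collections import defaultdict

WATCHED = {"healthservice.exe", "monitoringhost.exe"}


def group_by_account(tasklist_csv):
    """Map each watched image name to the sorted list of accounts it runs under."""
    accounts = defaultdict(set)
    for row in csv.DictReader(io.StringIO(tasklist_csv)):
        image = row["Image Name"]
        if image.lower() in WATCHED:
            accounts[image].add(row["User Name"])
    return {image: sorted(users) for image, users in accounts.items()}


# Made-up sample in the column layout of `tasklist /V /FO CSV`.
SAMPLE = '''"Image Name","PID","Session Name","Session#","Mem Usage","Status","User Name","CPU Time","Window Title"
"HealthService.exe","1204","Services","0","45,120 K","Unknown","NT AUTHORITY\\SYSTEM","0:01:12","N/A"
"MonitoringHost.exe","2310","Services","0","60,332 K","Unknown","NT AUTHORITY\\SYSTEM","0:03:40","N/A"
"MonitoringHost.exe","2498","Services","0","38,004 K","Unknown","CONTOSO\\svc-sqlmon","0:00:25","N/A"
"svchost.exe","880","Services","0","12,000 K","Unknown","NT AUTHORITY\\SYSTEM","0:00:02","N/A"
'''

print(group_by_account(SAMPLE))
```

On a real agent, the output of `tasklist /V /FO CSV` can be fed to `group_by_account` instead of the sample text.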

To be even more precise, the process MonitoringHost.exe is the real workhorse here, since it performs all these tasks:

  • Monitoring and collecting Windows event log data
  • Monitoring and collecting Windows performance counter data
  • Monitoring and collecting Windows Management Instrumentation (WMI) data
  • Running actions such as scripts or batches

As you can see, there is more than meets the eye. I hope this series cleared things up a bit about the OpsMgr services. Of course there is ACS (Audit Collection Services) as well, which consists of the ACS Forwarder service and the ACS Collector service, but ACS is something different. Even though it needs OpsMgr as a basis, it performs a totally different job (audit collection).

Credits & used sources:
For this posting I have used – besides my personal experience from the field – SCOM Unleashed, the OpsMgr Security Guide and input from Kevin Holman.

Monday, September 28, 2009

Crucial OpsMgr Services explained. Part III: The Config Service

--------------------------------------------------------------------------------- 
Postings in the same series:
Part   I: The Basics.
Part  II: The SDK Service
---------------------------------------------------------------------------------

This service plays a very important role in the OpsMgr environment. First of all, it manages the relationships and topology of the OpsMgr environment (Management Group).

It also takes care of the distribution of the Management Packs to the monitored systems. It delivers the monitoring configuration to every OpsMgr Agent Health Service. When the Config Service sends monitoring configuration it may also include sensitive data, which is stored and maintained by the OpsMgr database. Here the SDK comes into play. The SDK API performs two critical jobs here:

  1. It prevents the Config Service from viewing this data
  2. It makes sure the data is delivered to the Config Service in an encrypted format (the public key of the target Health Service is being used here)

Now the Config Service  will deliver this information to the Health Service of the OpsMgr Agent.

To put it into plain English, the Config Service becomes a parcel delivery service between the OpsMgr database and the targeted Health Service on a monitored system. Here the Config Service only knows the sender and the receiver. It has no clue whatsoever about the contents of this parcel. And when it tries to take a sneak peek, the SDK API is there to give it a smack on the head… :)

The six steps of a monitoring configuration update process:
An update of the monitoring configuration essentially consists of six steps. The related events are to be found in the OpsMgr event log of each monitored system (as long as an OpsMgr Agent is in place, of course).

  1. The HealthService is being notified by the Config Service that it needs an update:
    EventID 29102, source OpsMgr Config Service:
    image
    Basically, the HealthService is told by the Config Service that its configuration is out of date and has to contact the OpsMgr Config Service in order to update/synchronize its configuration data.

  2. The OpsMgr Agent contacts the Config Service in order to request an update package:
    EventID 21024, source OpsMgr Connector:
    image 
    The update/synchronization process is being prepared.

  3. The Health Service has contacted the Config Service and is receiving the update package:
    EventID 29103, source OpsMgr Config Service:
    image
    The OpsMgr Agent kicks in. 

  4. The updated monitoring configuration has been received:
    EventID 21025, source OpsMgr Connector:
    image 

  5. The secure configuration information has been received:
    EventID 7023, source HealthService:
    image
    The sensitive information has been received as well.
    (Between steps 4 & 5 there are multiple EventIDs 7026 to be found. These are about the validation of the RunAs accounts.)

  6. The new monitoring configuration has become active:
    EventID 1210, source HealthService:
    image
    The new information has been processed by the Health Service.
    (Between steps 5 & 6 there are multiple EventIDs 7025, EventID 7024  and EventID 7028 to be found. These events are related to the validation/logon of the OpsMgr (RunAs) accounts)
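When an agent seems stuck on an old configuration, the six event IDs above can serve as a checklist. The sketch below takes a list of event IDs (oldest first) and reports how many of the six steps were reached, in order. The two sample streams are hypothetical; how the IDs are exported from the OpsMgr event log is left out.

```python
# Event IDs in the order the post lists them for one configuration update.
UPDATE_STEPS = [
    (29102, "Config Service notifies the Health Service"),
    (21024, "Agent requests the update package"),
    (29103, "Health Service receives the update package"),
    (21025, "Updated monitoring configuration received"),
    (7023, "Secure configuration received"),
    (1210, "New monitoring configuration active"),
]


def completed_steps(event_ids):
    """Return how many of the six steps appear, in order, in a stream of event IDs."""
    step = 0
    for event_id in event_ids:
        if step < len(UPDATE_STEPS) and event_id == UPDATE_STEPS[step][0]:
            step += 1
    return step


# Hypothetical event streams, oldest first; 7026/7024/7025/7028 are the
# RunAs validation events that show up between the main steps.
finished = [29102, 21024, 29103, 21025, 7026, 7026, 7023, 7025, 7024, 7028, 1210]
stalled = [29102, 21024, 29103, 21025, 7026]

print(completed_steps(finished), completed_steps(stalled))
```

A result lower than six points at the step where the update stopped: `UPDATE_STEPS[n][1]` describes the first step that was not seen.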

Credits & used sources:
For this posting I have used – besides my personal experience from the field – information from the book
SCOM Unleashed and input from Maarten Goet.

Friday, September 25, 2009

New OpsMgr Connect Portal

Yesterday Microsoft launched a new Connect Portal for the OpsMgr Community. Here one can provide feedback to the OpsMgr Team and participate in OpsMgr-related surveys, for instance about certain MPs.

Normally this kind of portal is reserved for TAP programs, so it is something special. All you need in order to participate is a registration on Connect. Want to know more? Check it out here: https://connect.microsoft.com/opsmgr

Thursday, September 24, 2009

While installing a Management Server this setup error occurs: ‘Setup cannot locate the SC database’

Bumped into this issue. The basic components of a new OpsMgr R2 environment (RMS & SQL) were already in place, based on W2K08 x64 SP2 and SQL 2K08 x64 SP1. Now it was time to install several Management Servers. But this screen kept on nagging me while trying to install the first Management Server:
image

Hmm. Strange. I checked the firewalls and all was OK. No issues there. The SQL server was online AND TCP/IP was enabled as well. But no matter what I tried, the same error message kept on coming back. So time for some deeper investigation. Since the firewalls weren’t the issue it had to be something with SQL itself.

I started SQL Server Configuration Manager > SQL Server Network Configuration > Protocols for MSSQLSERVER and checked TCP/IP:
image 

Indeed, it is enabled. But let’s take a deeper look: check its properties (double click it) and open the tab IP Addresses:
image 

I need the second listed IP. Even though it is active, it is NOT enabled. So I enabled it, restarted the SQL Server service as stated in this message:
image

and now I can install the Management Server without any issue:
image
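Before rerunning Setup, a quick TCP test from the new Management Server shows whether SQL answers on the IP address you need. This is a minimal sketch and rests on an assumption: port 1433 is the usual port for a default instance, but your instance may listen elsewhere (check the same IP Addresses tab).

```python
import socket


def can_connect(host, port=1433, timeout=3.0):
    """Return True when a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Usage (the address is a placeholder; use the IP listed on the IP Addresses tab):
#     can_connect("192.0.2.10")
```

A `False` for the IP you need while another IP of the same SQL server returns `True` matches the situation described above: the protocol is enabled, but that particular IP is not.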

Tuesday, September 22, 2009

New KB article: HTTP Error 403 when accessing OpsMgr Reporting after upgrading to Windows Server 2008

After having upgraded the Microsoft SQL Server 2005 Reporting Services server from Windows 2003 Server to Windows Server 2008, the virtual directory of SRS (http://localhost/reports) generates HTTP Error 403.

KB article KB975439 describes this issue and how to solve it.

Monday, September 21, 2009

Crucial OpsMgr Services explained. Part II: The SDK Service

--------------------------------------------------------------------------------- 
Postings in the same series:
Part   I: The Basics.
---------------------------------------------------------------------------------

As stated in Part I, the SDK Service is present on every OpsMgr Management Server but runs only on the RMS. Again I repeat myself: leave it disabled on the OpsMgr Management Servers. Only the RMS must have this service in a running state.

The SDK service in a nutshell: (Thanks to Maarten Goet, who corrected me on some points here)

  1. Provides access for the OpsMgr (Web) Console to the OpsMgr database
  2. Imports and stores Management Packs in the OpsMgr database
  3. Stores Management Group information in the OpsMgr database
  4. Provides the current state of a monitored object

Also good to know:

  1. Access for the OpsMgr (Web) Console
    Every launched OpsMgr (Web) Console connects to the SDK Service on the RMS in order to retrieve the data, no matter where the Console is being run from. So every opened OpsMgr (Web) Console causes an additional load on the RMS. The same goes for running the OpsMgr PowerShell extensions. This one also connects to the SDK Service.

    There is also a misunderstanding about the OpsMgr Web Console, since many people tend to think that it doesn’t put that much load on an RMS compared to the ‘full’ version of the Console. But that is not true. I saw a comment of Maarten Goet in a thread on the Microsoft TechNet OpsMgr Forum where he explained it. I quote his words:

    ‘…Anybody telling you that the webconsole "uses less memory on the RMS" is lying. The webconsole uses the SDK just as the 'full' console does. Listing the Active Alerts or Computers view (or anything else for that matter) uses the same API calls. Given that IIS hosts those web users you could even say that from a server perspective the Webconsole requires more resources…’

    This is something to reckon with when deploying an OpsMgr environment.


  2. Encryption Key 
    The SDK service is also the owner of the Encryption Key (information is stored in the registry of the RMS) for the Management Group. With this key the Run As Account information - stored in the OpsMgr database - is accessed. This Encryption Key is needed when promoting a Management Server to Root Management Server.

    Want to know more about the Encryption Key, how to make a backup of it? Go here. How to restore it when it is lost, or how to promote a Management Server to RMS? Check out this article.


  3. Keeping track of the SDK connections
    Even though it is not watertight (explained later) it still provides a way to see how many (approximately!) (Web) Console and OpsMgr PowerShell connections are running. It is based on a blog posting of Kevin Holman, found here.

    - Open the Console and go to Authoring > Rules > New Rule > Collection Rule > Performance Based > Windows Performance > Next
     image
    (Don’t forget to put this Rule in its own MP.)

    - Give it a proper Rule Name, select ‘Performance Collection’ as Rule Category, ‘Root Management Server’ as Rule Target and leave the rule enabled > Next
    image 

    - In the next screen hit the button Next and select the RMS as Computer, as Object ‘OpsMgr SDK Service’ and as counter ‘Client Connections’ > OK
    image
    - Leave the interval at its default setting (15 minutes) > Next

    - Here I do not use Optimized Performance Collection Settings because this will influence the accuracy negatively > Create
    image

    - Go to Monitoring > SDK Connections > right click it, select New > Performance View > give it a proper Name and Description
     image

    - Select as Condition ‘Selected by specific Rules’, click the link ‘specific’ in the Criteria Description box
    image 
    and select the earlier created Rule (SDK Connections) > OK > OK

    It takes some time (15 minutes timeframe) before some data is collected. For this screen dump I have lowered the Interval to 1 minute (don’t do this in production environments) and also altered the Time Range:
    image

    Hmm. Strange. I have only one Console open, and already five connections are being shown? Well, apparently other connections/processes are at work here as well.

    First of all, you have the SDK service running on the RMS, which counts for one connection. The Console I run is also counted as one connection. This leaves three connections which seem to be unaccounted for.

    But that is not true. Under the hood there are many processes busy which use the SDK service as well, like Connectors. So even though this Performance View gives better insight into what is happening, it still needs some ‘translation’. Be aware of this.

    Another example, with – besides an open Console – also a running Web Console and the OpsMgr PS extensions running:
    image

    - And yes, there is a report as well. Since it is a rule, it collects (performance) data to be put in a report: Go to this blog posting of mine. At step 4 select the RMS as Windows Computer, at step 6 select as Performance Object ‘OpsMgr SDK Service’ and as Counter ‘Client Connections’ (contained within the earlier created rule) and click OK (twice).
    image
    Select a date and run the report. Be aware though that the rule needs to run some time in order to collect sufficient data to be shown in the report.

OpsMgr now supports Windows 2008 R2

On the 18th of September an update of the MP for the Server OS has been released. Besides some fixes it also contains support for Windows Server 2008 R2.
image 
The MP is to be downloaded here.

This OS can not only be monitored by OpsMgr (from OpsMgr SP1 or later versions) but can also be used as a platform for an OpsMgr Management Server role, as stated in KB974722:
image

Friday, September 18, 2009

Crucial OpsMgr Services explained. Part I: The Basics

--------------------------------------------------------------------------------- 
Postings in the same series:
Part  II: The SDK Service
Part III: The Config Service
Part IV: The Health Service
---------------------------------------------------------------------------------

On many occasions I do get questions about the OpsMgr specific services, like:

  • What are they and what are their purposes?
  • Why are certain services only available on OpsMgr Management Servers and not on Agent Managed servers?
  • Why are certain OpsMgr services disabled on Management Servers but enabled on the Root Management Server?

Since there is much to tell and I strive for readable blog postings, I have decided to launch a new series in order to cover this topic. At the moment I do not know exactly how many postings this series will contain, but let’s start with Part I: The Basics and see how it goes from there.
image

As one has noticed, every OpsMgr Management Server has three OpsMgr Services (the first name listed is the RTM/SP1 name of the service, the second the R2 name; check out this posting):

  1. OpsMgr SDK Service / System Center Data Access
  2. OpsMgr Health Service / System Center Management
  3. OpsMgr Config Service / System Center Management Configuration

An OpsMgr managed server (a server which runs the OpsMgr Agent) runs only the second service, OpsMgr Health Service / System Center Management.

As expected, every service has its own purpose. Also, when checking a Management Server (not the Root Management Server!) one will notice that only the second service is running and the other two (SDK & Config) are disabled. This is by design and shouldn’t be altered.

  1. Question: Why does a Root Management Server run all three services?
    One can look upon the OpsMgr hierarchy as Windows NT4. Here one had a Primary Domain Controller (PDC) with a writable SAM and some or many Backup Domain Controllers (BDCs) with a read-only copy of the SAM. And no matter from what location the User Manager or Server Manager was being run, it always connected to the PDC since that server contained the only writable SAM. Any adjustment was done there and then replicated to the BDCs. 

    The RMS is just the PDC of OpsMgr and the Management Servers are the BDCs. So the RMS maintains the OpsMgr Management Group in every kind of way. Importing MPs? Done there. Adjusting MPs? Done there. (Web) Console connections? Authorizations? Notifications? Setting permissions? Scoping Views? Deleting objects? Setting Overrides? Yep! The RMS does it all. All changes are being put into the OpsMgr related DBs and replicated to the Management Servers (Better: The Management Servers are notified by the RMS things have changed so they have to update their configuration.)

    This makes the RMS a rather busy fellow. It has a lot of work to process.

    In order to do that, it needs all three OpsMgr Services to be in a running state. This also explains why enabling these three services on a Management Server is a recipe for disaster: the OpsMgr Management Group doesn’t know any more who the ‘leader’ is and ends up in a situation comparable to what is known in the Cluster world as a ‘Split-Brain’ scenario. That isn’t nice either…

    Only when a Management Server must be promoted to Root Management Server do these services need to be started and set to Automatic. But that is another story.

  2. Question: Doesn’t that make the RMS vulnerable?
    Hmmm. Yes. It does. Therefore in environments where Enterprise Monitoring is crucial, a Clustered RMS is advised. Personally I have some doubts about the mechanism behind Clustering but that is another discussion. But indeed, the RMS introduces a SPOF (Single Point Of Failure).

    On System Center Central there was a very interesting discussion going on about the new version of OpsMgr to be released in 201x. One (or more) person(s) raised the idea of distributing the RMS roles and thus bringing it on par with the Windows Domain Controllers as we know them now. That is really a good idea. Let’s hope it is being picked up.

  3. Question: What about the specifics of these OpsMgr Services?
    Good question. I will describe every service in a separate blog posting. The blog postings in this series will look like this: 
    - Crucial OpsMgr Services explained. Part II: The SDK Service.
    - Crucial OpsMgr Services explained. Part III: The Config Service.
    - Crucial OpsMgr Services explained. Part IV: The Health Service.

And perhaps some other postings on this topic as well, based on the feedback I do get.

Key Management Service (KMS) MP updated

24-09-2009 - Update: Do not use this MP directly. First check out the posting of Kevin Holman. Go here.

A new version of the KMS MP has been released. Taken directly from the website: ‘…The KMS MP monitors core Key Management Service offered by Windows Server 2008 R2, Windows 7, Windows Server 2008, Windows Server 2003 and Vista…’

Want to know more? Go check it out here.

Thursday, September 17, 2009

OpsMgr Effective Configuration Viewer & OpsMgr R2

The Resource Kit for OpsMgr also contains a tool known as ‘Effective Configuration Viewer’, a.k.a. ECV.

What does this tool do? Taken directly from the website: ‘…It displays the set of rules and monitors that are running on a computer, distributed application, or any other managed entity after any configured overrides have been applied…’

With OpsMgr RTM/SP1 this tool runs just fine and has been of great help to me on many occasions. But with OpsMgr R2, ECV doesn’t show all rules running on a computer. For instance, I know for sure there are 8 rules in place (and functional) but ECV only shows two. It seems like this tool needs an update.

I am curious whether I am the only one experiencing this issue. For now I do not use this tool anymore in R2 environments.

How to troubleshoot OpsMgr Console Performance in large environments

Even though the OpsMgr Console performance in R2 has increased significantly, sometimes one might bump into large environments where the performance of the OpsMgr Console needs attention. The OpsMgr Support Team blog has posted a very good and thorough article about how to troubleshoot it.

Even when one has a smaller environment and/or doesn’t have these issues at all, it is still a good read since one gets a better understanding of the inner workings of OpsMgr in general and the OpsMgr Console in particular.

Article to be found here.

All credits go to the OpsMgr Support Team.

OpsMgr R2 now supports an OpsMgr R2 Agent on Windows Embedded Standard 2009

Windows Embedded Standard 2009 is used (for instance) on POS (Point Of Sales) Terminals, Windows based thin-clients and ATMs. Now these systems can be monitored by OpsMgr R2 as well.

The System Center Operations Manager 2007 R2 Agent Prerequisites Macro for Windows Embedded Standard 2009 can be downloaded from the Subscriber Downloads:
image

Want to know more? Check it out here. Taken directly from that website:

‘…The OpsMgr 2007 R2 Agent Prerequisite Component for Windows Embedded installs the necessary components that support the OpsMgr 2007 R2 Agent on Windows Embedded. This allows you to then install the OpsMgr 2007 R2 Agent and monitor the health of Windows Embedded in the same way as you monitor the health of other Microsoft operating systems…’

and

‘…Supported Operating Systems: Windows XP Embedded, Windows Embedded Standard 2009…’

OpsMgr is becoming more and more versatile! Great!

Wednesday, September 16, 2009

OpsMgr R2 discovery issue

This issue happened to me at a customer’s site. When pushing OpsMgr Agents to newly managed systems using the Discovery Method ‘Browse for, or type-in computer names’ this error message popped up:
image

First I thought there were some issues with the servers being discovered, like some needed services being turned off on those servers, or not using the appropriate permissions. Or perhaps some ports were blocked? So I used Kevin Holman’s posting as a troubleshooting guide. Also this posting of his is a great help.

So I started to research the issue. But all seemed to be well: no blocked network ports, no authorization issues nor any needed service being turned off. Then a network sniffer on the Management Server was installed in order to see whether any network traffic was generated at all when running the above mentioned discovery.

No. Nothing! Yes, there was much traffic going on but not the discovery traffic flowing from the Management Server to the server which needed to be discovered. Not one packet whatsoever!

Even more puzzling was the fact that the discovery of most servers residing in the same domain as the OpsMgr servers went without any problem whatsoever. And for some other servers, residing in the same domain, the same error message popped up, while these servers are configured exactly the same…

Finally I decided to use the LDAP queries as a discovery method (Scan Active Directory) and that worked.

But it still puzzled me so I contacted Kevin Holman. He advised me to (un)check (*) the option ‘Verify discovered computers can be contacted’ in the ‘Auto or Advanced’ screen.
image
(*: When the Discovery doesn’t run with this option checked, uncheck it. And the other way round)

Tried it today, and yes, it works. I might be missing something here but it seems a bit like a bug.

So whenever Discoveries aren’t working as they should AND all is in place (check out the earlier mentioned blog postings of Kevin), try the (un)check ‘Verify discovered computers can be contacted’ method and all should be OK.

All credits go to Kevin. Thanks man!

Tuesday, September 15, 2009

How to configure POP3 for OpsMgr in test environments, using Windows 2008 Server

Got some feedback on this posting which described how to configure SMTP on W2K08 for isolated OpsMgr test environments. One of the most heard comments was that SMTP also needs POP3 (or IMAP) in order to deliver anything. And many test environments do not have an Exchange implementation in place and are so isolated these environments aren’t able to connect to anything outside the lab.

On top of that, POP3 isn’t available any more in W2K08. So how to go about it?

I searched the internet and found many solutions, many of which are really overkill: one ends up with a full-blown mail solution. But finally I found it: Visendo SMTP (pop3) Extender for Windows 2008 Server, to be found here.

It took me some time to get it up & running, but the same webpage also describes how to go about it, check out the ninth comment:
image

Yes, that did the trick. Now I have an OpsMgr test environment which is a real All-In-One Solution:

  • W2K08 DC
  • SQL2K08 SP1, SRS
  • RMS
  • SMTP
  • POP3

It works like a charm! Now I can take a VM with me which has everything on board to be used as a test environment. Runs good, even on a notebook serving as a host.

Free Cisco MP!

Even though Kristopher Bash hasn’t been running his blog for long, he isn’t a newbie to OpsMgr at all. He approaches OpsMgr from a programmatic angle and knows his stuff inside and out. Therefore his postings are of a very high level. I see him many times on the TechNet OpsMgr Forum helping out others.

Today he has released a Cisco MP built by himself, licensed under the GNU Public License. Deep respect! Go check it out yourself here.

Thanks Kris, for sharing it with the community.

Monday, September 14, 2009

Remotely managing manually installed OpsMgr Agents

Got this one from the Microsoft TechNet OpsMgr Forum. It’s an issue which keeps on coming back. In order to give it a bit more exposure I have put it into this blog posting.

Due to a design flaw up to OpsMgr SP1 a manually installed OpsMgr Agent could be remotely managed from the OpsMgr Console, even though it didn’t always give the expected results. So from the OpsMgr Console manually installed OpsMgr Agents could be repaired, uninstalled and moved to another Management Server.

However, in R2 this has been fixed and a manually installed Agent cannot be remotely managed from the Console. Now the server/workstation with the manually installed OpsMgr Agent has to be used in order to change/repair/remove the OpsMgr Agent.

When running W2K08 with UAC enabled this cannot be done from the Control Panel. This blog posting of mine describes how to go about it.

New KB article: one or more OpsMgr Management Servers and their managed devices are grayed out in the OpsMgr Console

Above mentioned issue happens to OpsMgr Servers like (R)MS, Gateway and Agent(s). On these computers also EventID 623, Source ESE, Category Transaction Manager in the OpsMgr event log is logged.

KB article KB975057 describes this issue, its cause and how to solve it.

Friday, September 11, 2009

Tech-Ed Berlin 2009

Wow! Got the green light: I am going to Tech-Ed Berlin 2009! Last year I went to Tech-Ed Barcelona which was really good. So I hope Tech-Ed Berlin 2009 will be of the same level of quality.
image

Will miss the Tapas though. Perhaps the German beer & bratwurst can make it up?
image 
Are you attending as well? Leave a comment on this posting when you do.

Thursday, September 10, 2009

Running a customized Report – Dell Unmanaged Servers

The Dell MP ‘Dell.WindowsServer.Scalable’ doesn’t contain a lot of reports (just one, to be accurate). And that report doesn’t do that much either.

I have found myself in situations when importing the new Dell MP (only the first two as I blogged about) that some or many Dell Servers aren’t properly recognized. These servers are to be identified in the OpsMgr Console.

Go to Monitoring > Discovered Inventory. Right click in the middle pane and select ‘Change Target Type’. Here one selects ‘Dell Unmanaged Windows Server’ and clicks OK. Now all the servers which the Dell MP doesn’t recognize properly are being shown.

These servers need attention. Sometimes OMSA is indeed outdated (older than 5.3) or needs to be (re)installed. When those actions have been done and the server is still not recognized, then the OpsMgr Agent needs a kick in the butt (excuse my French :) ).
image

Just stop the HealthService on a Dell Server where OMSA is OK, rename the folder ‘~:\Program Files\System Center Operations Manager 2007\Health Service State’ and restart the Health Service. Now the server should be recognized properly. This trick has helped me already on many of these occasions.
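When more than a few servers need this treatment, the three manual steps can be scripted. The sketch below is an assumption-laden illustration, not a supported procedure: it uses `net stop` / `net start` with the service name `HealthService`, must be run on the agent itself with administrative rights, and moves the folder aside under a timestamped name instead of deleting it. The function name and the backup naming are made up here.

```python
import os
import subprocess
import time

SERVICE_NAME = "HealthService"


def reset_agent_state(state_dir, run=subprocess.check_call):
    """Stop the agent service, move its state folder aside, start the service again."""
    backup_dir = "{}.old-{}".format(state_dir.rstrip("\\/"), time.strftime("%Y%m%d-%H%M%S"))
    run(["net", "stop", SERVICE_NAME])
    try:
        os.rename(state_dir, backup_dir)
    finally:
        # Start the service again even if the rename failed.
        run(["net", "start", SERVICE_NAME])
    return backup_dir


# Usage: pass the full path of the 'Health Service State' folder on the agent,
# i.e. the folder named above, on whichever drive the agent is installed.
```

The `run` parameter exists only so the function can be tried out with a stand-in that records the commands instead of executing them.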

But suppose, there are a LOT of servers not being properly recognized? Copying the list of servers is an option but not really a good idea. Screen dumps are a way to go but a bit unprofessional. Why not use the Report functionality of OpsMgr?

I know, the Dell report(s) won’t do this, but just follow this little procedure and you end up with a nice report showing all ‘Dell Unmanaged Windows Servers’ with also the reason why those servers aren’t recognized, which can be easily exported to pdf, html, xls and so on.

How to build my own Dell Unmanaged Servers Report?

  1. Go to Reporting > Microsoft Generic Report Library > Custom Configuration report. Double click it.

  2. The Report Parameter area will be displayed. Select for From: ‘Yesterday’. Click the Add Group button.

  3. Type ‘dell unmanaged’ and hit the Search button. The Dell Unmanaged Windows Servers Group will be found. Click Add > OK.

  4. For Report Fields select Display Name and Device Description (Beware: the list of Report Fields isn’t set in alphabetical order..)

  5. Hit the Run button and soon a report will be shown, containing all unrecognized Dell servers, including the name of the server AND the reason why the server isn’t recognized.
    image

Now you can schedule this report, using an offset. More about that to be found here.

Happy Reporting!

I know there is a lot to be said about Reporting of OpsMgr. But when one starts to understand the inner workings of it, it is a very powerful tool! It certainly pays off to spend time experimenting with it.

Post upgrade step after upgrading from OpsMgr SP1 to OpsMgr R2

It is commonly known that R2 not only adds new Management Pack Templates in the Authoring Pane, but that the ones which were already present in RTM/SP1 have been improved. And that is not just some marketing mumbo jumbo.

So whenever an upgrade from SP1 to R2 has been run, always check the monitors which have been built using the Management Pack Templates. These monitors need to be upgraded as well. But before starting that process make a backup of the MPs containing those monitors, for the Just-In-Case-It-Goes-Wrong scenario.

How to update a monitor built with a MP Template? Easy. Go to Authoring > Management Pack Templates and select the built monitor. Double click it and this message will be shown:
image

Click OK and now the editor containing the monitor will be shown. Click the button Apply (bottom left) and this monitor will be upgraded to R2.

Tuesday, September 8, 2009

New Dell MP, some inside information

At a site all the new Dell MPs were loaded and some issues popped up. This blog posting will describe these issues.
  1. Not all Dell servers are properly recognized
    Only 5% of the Dell servers were recognized properly. All remaining Dell servers weren’t shown at all in the View Dell > State View > Servers in the Monitoring Pane. When the View Discovered Inventory was opened in the Monitoring Pane and adjusted to show the Dell Unmanaged Windows Server, all the remaining Dell servers were shown. The reason was stated as well: OMSA needed to be of version 5.3 up to 6.1.

    But that was already the case: OMSA version 6.1 was installed and operational on all Dell servers. The OpsMgr event log showed no errors whatsoever. The Dell MPs landed properly but somehow didn’t recognize the Dell servers.

    Since all Dell MPs were loaded, I decided to bring this MP suite back to the basics. So I deleted the Detailed MP, DRAC MP and the CMC MP. Also the Information Alerts On MP was loaded. This one had to go as well, since Back-To-Basics is the credo.

    However, the Discovery being used here (Dell Server Discovery), runs every 24 hours. Which is neat but now I wanted to see some quick results, so I made an override which reduced it to 5 minutes. I put that override in a MP of its own so that a simple deletion of it will bring the discovery back to its normal rate, 24 hours.

    Bingo! Soon enough the other Dell servers started popping up in the Console as being Dell Servers and ended up being neatly monitored by the Dell MP. So the special MP containing the override on the frequency of the Discovery could be deleted. 

  2. Specific Dell Tasks run from the Console are using default credentials
    image
    When a Dell server is properly recognized there are 11 tasks available to be launched from the OpsMgr Console. However, many of these tasks are set up in such a manner that they use the default credentials (User: root, Password: calvin).
    image
    These cannot be changed. So when these tasks need to be used in an OMSA environment where the credentials have been changed (which for many companies is a compliance requirement), these tasks won’t run since the wrong credentials are being used. The only way to go about it is to disable these tasks and recreate them with the new credentials. However, the credentials are then stored in clear text. Which isn’t neat either…

    These tasks are:
    - Check Power Status
    - Power Reset
    - Power On
    - Turn LED Identification On
    - Turn LED Identification Off
    - Power Off Gracefully
    - Power Cycle
    - Force Power Off

How to configure SMTP for OpsMgr in test environments, using Windows 2008 Server

15-09-2009 Update: For mail delivery POP3 is needed besides SMTP. Check out this blog posting describing how to install POP3 on a W2K08 server.

When running an OpsMgr test environment which is isolated from the production environment, it is nice when the notification channel can be set up as well. This way OpsMgr can really be put through its paces and the needed experience can be gained with Notifications as well. However, in situations like these the Exchange team isn’t all too excited about having ‘their’ servers (ab)used for it. So how does one go about it without using Exchange AND without creating a server which can easily be used for relaying?

In situations like these I have found that W2K08 (SP2) delivers a nice feature named SMTP. Yes, I know. Nothing new. But with W2K08 the nicest part of the SMTP feature is that mail relay is disabled by default. And with an additional setting it can be even made more secure. So let’s start.

First of all, IIS 7.0 is a prerequisite, where the role services IIS 6.0 Metabase and IIS 6.0 Management Console (from the IIS 6.0 Management Compatibility group) are really needed. But these are automatically installed when the SMTP Server feature is added through the Server Manager Console.

  1. Open the Server Manager Console, go to the Features Node > Add Feature

  2. Select SMTP Server
    image

  3. Click the button Add Required Features > Next > Install. SMTP is being installed now. Click Close.
    image

  4. Start Menu > Administrative Tools > click Internet Information Services (IIS) 6.0 Manager

  5. In the Console go to [SMTP Virtual Server #1], right click it > Properties
    image
  6. Tab General, select the correct IP-address. Also take note of the FQDN of the SMTP Virtual Server. (This entry will be used in OpsMgr.) On the same tab additional settings like the number of allowed connections or the used port can be changed as well.
    image
  7. Tab Access, click the button Relay. This is the most important part!!!
    image 
  8. Select the option ‘Only the list below’, add the correct ip-address of the RMS using the SMTP and deselect the checkbox ‘Allow all computers which successfully authenticate to relay, regardless of the list above’
     image
    (The ip-address depicted here is fake.)

    Now SMTP is ready for usage by OpsMgr. Of course additional settings can be changed as well, but the riskiest setting (scoping mail relay to only the RMS and no other server) has been configured properly.

  9. One more thing left to do: Open the Services mmc and go to Simple Mail Transfer Protocol (SMTP) service and set it to start automatically:
    image 

  10. Open the OpsMgr Console > Administration > Notification > Channels > New E-Mail (SMTP) > Settings > Add
    image 
    Edit the settings and click OK. Go to Format, change as needed and click Finish.

Set up the subscribers and subscriptions in OpsMgr and you have a working SMTP solution. Happy learning!
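
To check the relay scoping without waiting for an OpsMgr alert, a small test message can be pushed through the SMTP Virtual Server by hand. The sketch below is only an illustration: the function names, the mail addresses and the host name smtp01.test.local are all made up for this example, and port 25 is assumed to be unchanged. Run from the RMS the message should be accepted; run from any other server the relay should refuse it.

```python
import smtplib
from email.message import EmailMessage


def build_test_message(sender, recipient, smtp_fqdn):
    """Build a plain-text test mail (nothing is sent here)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "OpsMgr SMTP relay test"
    msg.set_content(f"Test message relayed through {smtp_fqdn}.")
    return msg


def send_test_message(msg, smtp_fqdn, port=25, timeout=10):
    """Hand the message to the SMTP Virtual Server.

    smtplib raises an SMTPException subclass when the server refuses
    the message, for instance because relaying is not allowed.
    """
    with smtplib.SMTP(smtp_fqdn, port, timeout=timeout) as server:
        server.send_message(msg)


if __name__ == "__main__":
    # Hypothetical values; replace them with the FQDN noted in step 6
    # and with addresses that exist in the test environment.
    fqdn = "smtp01.test.local"
    message = build_test_message("opsmgr@test.local", "admin@test.local", fqdn)
    print(message)
    # Uncomment to really send the message:
    # send_test_message(message, fqdn)
```

Building and sending are kept apart on purpose, so the message can be inspected first before anything touches the mail server.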

Friday, September 4, 2009

MP Authoring Zen

We all know the famous quote from the movie Forrest Gump:

‘Life is like a box of chocolates… you never know what you're gonna get’.
image

Well this week is just like that.

After the most welcome but totally unexpected gift of the PKI Certificate Verification Management Pack, the same authors (Raphael Burri, Pete Zerger and Jaime Correia) are now publishing a six-part article series about how to build such an MP yourself. And here they strive to make it accessible to non-application developers (like me).

The first part is available for download (free registration needed).

All credits go to Raphael Burri (who had the lead in developing this MP), Pete Zerger and Jaime Correia from System Center Central.

New Dell MP explained, the outside

This posting won’t take a deep dive into the inner workings of the newly released Dell MP, version 4,00 A00. Instead it will describe this MP at a high level: what it monitors, what the requirements are and so on. Let’s start.

Package:

  • The downloadable package consists of six MPs, one manual (pdf) and four text files.

  • The manual describes 4 of these as MPs and refers to the other two as ‘utilities’, even though those are MPs as well

  • The 4 MPs are:
    - Dell.WindowsServer.Scalable
    - Dell.WindowsServer.Detailed
    - Dell.OutOfBand.CMC (Chassis Management Controller)
    - Dell.OutOfBand.DRAC (Dell Remote Access Controller)

  • The 2 ‘Utilities’ are:
    - Dell.Connections.HardwareLibrary
    - Dell.WindowsServer.InformationAlertsOn

MPs and Utilities explained:
Good news! Dell has finally decided to split the MPs up, based on what needs to be monitored. And not just that: one MP is targeted at small environments (shows more details/components, thus having more classes to alert upon), whereas another MP is targeted at large environments (shows fewer details, thus having fewer classes to alert upon, which can be good in large-scale deployments). Also, many Informational Alerts are turned off by default.

  • Dell.Connections.HardwareLibrary
    Even though Dell refers to it as a ‘Utility’, this is the base MP. All other MPs (in)directly depend on it: 
    image

  • Dell.WindowsServer.Scalable
    Monitors Dell hardware where the components are modeled at a high level (Example: the Memory Component is discovered at Memory Group level but Memory Unit instances aren’t modeled in this MP). This MP is advised for large OpsMgr environments where more than 300 Dell servers are being monitored.

  • Dell.WindowsServer.Detailed
    Is an extension of the Scalable MP. This MP models the details like the Memory Unit instances.

  • Dell.WindowsServer.InformationAlertsOn
    By default the Informational Alerts contained within the Scalable MP are turned off. This keeps the OpsMgr Console clean. With this ‘utility’ these Informational Alerts are overridden and turned on:
     image

  • Dell.OutOfBand.CMC
    Monitors CMC.

  • Dell.OutOfBand.DRAC
    Monitors DRAC.

Requirements – Dell servers:

  • On the servers Dell OMSA (OpenManage Server Administrator) needs to be installed, version 5.3 to 6.1 is supported.
  • When monitoring DRAC: the DRAC Agent needs to be installed as well.
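
The OMSA version requirement can be checked up front, before loading the MP. A minimal sketch, assuming the installed OMSA version is already available as a dotted string (how it is collected from the servers is not covered here, and the function names are made up for this example):

```python
# Supported OMSA range as stated above: 5.3 up to and including 6.1.
MIN_OMSA = (5, 3)
MAX_OMSA = (6, 1)


def parse_version(text):
    """Turn a dotted version string such as '6.1' or '6.0.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def omsa_supported(version_text):
    """Return True when major.minor falls inside the supported range."""
    major_minor = parse_version(version_text)[:2]
    # Pad a bare major version like '6' to (6, 0) before comparing.
    if len(major_minor) == 1:
        major_minor = (major_minor[0], 0)
    return MIN_OMSA <= major_minor <= MAX_OMSA


if __name__ == "__main__":
    for candidate in ("5.2", "5.3", "6.0.1", "6.1", "6.2"):
        print(candidate, omsa_supported(candidate))
```

Only the major and minor numbers are compared, since the requirement is stated at that level.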

Requirements – OpsMgr Servers: (Only needed when running certain Dell Tasks from the OpsMgr Console)

  • BMC Management Utility version 2.0 or higher needs to be installed.

Support – Dell systems:

  • All Dell systems supported by OMSA (version 5.3 – 6.1) are supported by this MP as well.

Support – Windows Server OS:
Finally! Windows 2008 (up to SP2) is supported. Hopefully it will work with R2 as well. But that isn’t sure yet.

  • W2K03 SP2 up to W2K08 SP2

Needed authorizations:
Mostly the default authorizations will suffice. For these two options Power User or Administrator permissions are needed:

  • Dell Monolithic Server-In-Band DRAC Discovery & Console Launch
  • Clearing ESM Log (can also be done by supplying alternative credentials)

Conclusion:
Since the MP hasn’t been tested yet, it is too early to draw any conclusion. However, this newly released set of MPs suggests that Dell has chosen a new approach to OpsMgr. The fact that this MP doesn’t allow an upgrade from previous versions also makes it look like Dell has broken with its past. Which is good, since the old MPs were buggy and noisy, which could cause significant issues in an OpsMgr environment.

Another wise decision made by Dell is to break the MPs into different parts, where most parts cover certain hardware/services, thus allowing organizations to select only those MPs which are applicable and needed for their environment. Combined with the decision to disable most Informational Alerts by default and to deliver an MP for large environments (Scalable MP) besides an MP for smaller environments (Detailed MP, an extension to the Scalable MP), I can only say this set of MPs looks very promising.

Nonetheless (as goes for any MP), test it before putting it to use in a production environment. And RTFM first (not just the pdf-file but also the text files, which contain the most recent information) before loading any MP. On top of that: start small with only the ‘utility’ Dell.Connections.HardwareLibrary and the MP Dell.WindowsServer.Scalable. Go from there and see what it does and what adjustments are needed.

Request:
If anyone out there has already gained any experience with this new MP in a test or production environment, please let me know. I am really curious about it. Thanks so much in advance. Of course I will only blog about it when agreed upon.

Thursday, September 3, 2009

HSLockdown explained

Sometimes I see questions about the HSLockdown tool. Yes, there are websites explaining what it does, but what about this fragment?
(Taken directly from the KB article explaining the usage of this tool)

‘…only the NT AUTHORITY\Authenticated Users security principal is allowed access to the Health Service. But when the Active Directory is hardened, or the agent is misconfigured, the Local System account cannot authenticate through the Authenticated Users security principal. Therefore, the agent cannot process Health Service configuration information…’

Sounds like true Geek mumbo jumbo, doesn’t it? I mean, I have worked in IT for 12+ years, have had my share of late night/early morning/weekend disaster recoveries, servers going wild, multiple outbreaks of viruses, management teams with (almost) impossible requirements to be met, users who almost turned me into a Hulk (Hit Any User To Continue…) and yet I survived it all. And learned from it. But that sentence makes me feel like I have been asleep all those years and am a total newbie to IT!

I will break it down into a more down-to-earth language. So that we all understand it.

Let’s take a look at the term Security Principal, taken from Wikipedia and ‘translated’ by me:

…is an entity that can be authenticated by a computer system or network. It can be assigned permissions and privileges for resources in the network…

In a way it can be seen as a kind of group. Keep this in the back of your mind.

Now a few steps back:

  • The OpsMgr Agent (AKA HealthService) uses the Run As Profile ‘Privileged Monitoring Account’ to run under
  • By default this profile uses the LocalSystem account (= NT AUTHORITY\SYSTEM Security Principal)
  • Under this account the OpsMgr Agent processes the Health Service configuration

So far so good. Still no mumbo jumbo?

Ok, now we are going to walk forward:

  • Suppose that on a DC not the LocalSystem account is chosen for the OpsMgr Agent but a Domain Account, specially made for this purpose
  • During the installation of the OpsMgr Agent the tool HSLockdown is run automatically and denies the LocalSystem account access to the HealthService
  • Now only the NT AUTHORITY\Authenticated Users Security Principal is allowed access to the HealthService

So the bottom line here is that the OpsMgr Agent (HealthService) can’t do anything as LocalSystem/System.

Let’s pick up some speed:

  • Suppose the Active Directory is hardened (made more secure) or the OpsMgr Agent is not properly configured
  • Due to this the LocalSystem account cannot run responses for the OpsMgr Agent
  • Now this DC will be grayed out in the OpsMgr Console, thus not being monitored:
    image
  • The OpsMgr event log shows errors with EventIDs 7022 and 1120

That’s all! No mumbo jumbo and everything is explained (I hope so…).

Time for some repair jobs:

  • The tool is to be found on every system running the OpsMgr Agent: ~:\Program Files\System Center Operations Manager 2007
  • On the problematic DC open a cmd-prompt and type hslockdown /L in order to get a list of which accounts are authorized to monitor the DC:
    image  
    As you can see the account LocalSystem is denied access

  • Change it like this: hslockdown /A "NT Authority\System"
    image 
    Restart the Health Service and you should be OK now.

  • It’s alive!
    image