Friday, January 28, 2011

SCOM R2: Am I Healthy? – Part IV – Is SCOM R2 Up-to-date or Outdated?

----------------------------------------------------------------------------------
Postings in the same series:
Part I – The Introduction
Part II – Know What You Have
Part III – Are the DBs OK?
Part V – Let’s Use The Community
----------------------------------------------------------------------------------

In the fourth posting of this series I will take a deeper dive into the status of SCOM R2, or rather the patch and hotfix level of the entire Management Group AND the monitored servers. This topic deserves a posting of its own since it is often neglected, even though it is key to the overall health, performance and availability of the entire SCOM R2 infrastructure.

So let’s start pinpointing outdated components and correcting them!

But before I start I want you to know that this posting is based upon the assumption that proper update management is in place AND functional. It goes without saying that all Windows Servers are patched and updated on a regular (monthly?) basis. So this posting will not describe this since it is a basic requirement for any IT environment in order to be in good shape.

01 – SCOM R2 Core MP
Let’s start very close to SCOM R2 itself. It is time to review the version of its Core MP. This MP is key to the overall health, performance and availability of SCOM R2.

If there is one MP I really appreciate AND which gets better with every release, it’s this MP. All the flows, Agent thresholds, queue sizes and the lot are configured in this MP. Remember the days when we had to configure certain thresholds (Private Bytes for instance) ourselves? Those days are over with the latest versions of the Core MP for SCOM R2.

The Product Team releases a new version roughly every three to four months. Besides bug fixes (if any), each version contains improvements based on Best Practices and input from the field, such as customers, the official Forums, PFEs and MVPs. Additional functionality is also added in order to improve SCOM R2 as a whole. And on top of it all, additional reports are added and existing ones improved.

When upgrading from an old version of the SCOM R2 Core MP (for example when upgrading from SCOM SP1 to SCOM R2) it is good to know ADDITIONAL reports have been added. When upgrading the Core MP from the online catalog these additional Reports won’t be added since they aren’t present in the original version of the SCOM R2 Core MP. In order to solve that, go to this blog posting of mine.

The latest version of the SCOM R2 Core MP contains a very nice feature: it detects whether WMI on the monitored Windows Servers is still functional and running. And when it’s not, SCOM R2 can take action in order to restore WMI. As we all know, WMI isn’t always as robust as it should be, so this MP takes away a lot of manual labor. Of course, it would be better if WMI were robust by itself, but that is a different topic altogether :).

Say what? How to know what version of the Core MP is the most recent? And where to download it? And how to know when a new version has arrived? Good questions! Fortunately I post about it on a regular basis. The posting about the most current version of the SCOM R2 Core MP is to be found here with all the information you need.

One word of caution however: as with any other update, patch, fix or updated MP, TEST IT before you put it into production. A single virtualized box can be used as a test environment, running an isolated AD Forest, SQL and SCOM R2, all based on trial licenses. Also keep a keen eye on the Official TechNet Forums in order to know whether the update is OK.

The latest version of the Core MP is a good one BUT needs some additional attention which I also blogged about, to be found here.

02 – The other MPs
In your SCOM R2 environment other MPs will be present as well (otherwise SCOM R2 is only monitoring itself…). Many times I bump into SCOM R2 environments which run outdated MPs – besides the Core MP for SCOM R2 – as well.

But updating those MPs without proper planning can be even worse than running outdated MPs. Why? Two reasons:

  1. As stated before, TESTING is required and of course, RTFM the MP Guide;
  2. Even though some MPs are updated, they are (unfortunately) not an improvement. The latest DHCP MP is a real drama (version 6.0.6709.0) so stay away from it. But then again, when TESTING an updated MP before putting it into production you would have found this out anyway…

Having said that, it is still best practice to run the latest versions since they deliver much added functionality and monitoring options. Some good and shiny examples are:

  • SQL MP;
  • Server OS MP;
  • Exchange 2010 MP (an update for this MP is to be expected soon…)
  • etc etc etc…
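Pinpointing outdated MPs comes down to comparing the installed version of each MP with the latest known one. Below is a minimal sketch in Python of such a comparison. The MP names, version numbers and both dictionaries are made up for illustration; in a real environment the installed list would have to be exported from SCOM R2 and the latest versions collected from the MP catalog by hand.

```python
def parse_version(text):
    """Turn a dotted version string such as '6.0.6709.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def find_outdated(installed, latest):
    """Return {mp_name: (installed_version, latest_version)} for MPs that lag behind."""
    outdated = {}
    for name, version in installed.items():
        newest = latest.get(name)
        # MPs without a known latest version (custom MPs, for instance) are skipped.
        if newest is not None and parse_version(version) < parse_version(newest):
            outdated[name] = (version, newest)
    return outdated


# Hypothetical inventory: names and version numbers are placeholders.
installed_mps = {
    "Example.SQL.MP": "6.0.5000.0",
    "Example.ServerOS.MP": "6.0.6794.0",
    "Example.Custom.MP": "1.0.0.0",
}
latest_mps = {
    "Example.SQL.MP": "6.1.314.36",
    "Example.ServerOS.MP": "6.0.6794.0",
}

for name, (have, want) in find_outdated(installed_mps, latest_mps).items():
    print(f"{name}: installed {have}, latest known {want}")
```

The versions are compared as tuples of integers because a plain string comparison would rank 6.0.10.0 below 6.0.9.0. Keep in mind that a higher version number only marks an MP as a candidate for TESTING, not as an automatic improvement.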

03 – Cumulative Updates (CUs)
We all remember the days when SCOM hit RTM. Soon SP1 came out, followed by a whole chain of hotfixes and patches. It was a challenge to know which hotfixes to install, when, and in what order. For myself I kept a Word document listing many hotfixes, with for each hotfix an overview of what it did and whether it was required by default or only under certain circumstances. Alongside that I kept a watchful eye on Kevin Holman’s blog, since he maintained a list of those hotfixes as well.

But it wasn’t very effective. Fortunately Microsoft listened and introduced the well-known CU system (already present in SQL and Exchange for instance) in SCOM R2 as well. Now all hotfixes and patches are combined and put into a CU, together with added functionality. So no more hassle with running multiple hotfixes/patches, just a single executable. Another good thing about the CUs is that each CU contains all the hotfixes/patches/updates of the previous ones. So by just installing the LATEST CU for SCOM R2, one is up to speed again.
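The cumulative nature of the CUs can be illustrated with a small Python sketch. The fix identifiers below are placeholders, not real KB numbers, and the split of fixes per CU is invented purely to show the idea that every CU carries the content of its predecessors.

```python
# Hypothetical data: which NEW fixes each CU introduced (placeholder identifiers).
NEW_FIXES_PER_CU = {
    1: {"fix-a", "fix-b"},
    2: {"fix-c"},
    3: {"fix-d", "fix-e"},
}


def fixes_included(cu_level):
    """Return every fix delivered by the given CU, including those of all earlier CUs."""
    included = set()
    for level, fixes in NEW_FIXES_PER_CU.items():
        if level <= cu_level:
            included |= fixes
    return included


def gained_by_upgrade(current_cu, target_cu):
    """Fixes gained by jumping straight from current_cu to target_cu (0 = no CU installed)."""
    return fixes_included(target_cu) - fixes_included(current_cu)


# Going from no CU at all straight to the latest CU in this model.
print(sorted(gained_by_upgrade(0, 3)))
```

In this model an environment without any CU receives all five placeholder fixes by installing CU 3 alone, which is why the intermediate CUs can be skipped.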

Again, I can’t say it enough times: TESTING is key, and so is RTFM. WAITING is another nice thing to do. So when a new CU is released, just wait some weeks, test it as well, and follow the blogs and the OpsMgr TechNet Forum. Also keep an eye on the official web pages of Microsoft about the CU since they will reveal new information as well. The section ‘List of known issues for this update’ will tell you many things you need to know. So READ it.

The latest CU level for SCOM R2 is CU#3. Want to know more? Go here and here.

I can recommend this CU, BUT be careful when you are running SCVMM and use PRO Tips. Read the ‘List of known issues for this update’ and you will know why…

04 – It’s all about patches and updates…
SCOM R2 and its Agents touch many core components of the monitored Windows Servers and of the Windows Servers on which the SCOM R2 infrastructure is installed. Some of these core components, like the JET database and WMI, need some additional patching, depending on the Windows Server version (2003 or 2008) you are running (mind you, some hotfixes are meant for both server versions).

Here is a list of hotfixes and patches I always advise my customers to install on their servers. Again <sigh> TESTING is required. In a customer’s environment the mentioned SP1 for XML Parser broke an in-house built application which ran on XML version 4.x… So be careful.


05 – CU and SP level for SQL Server
I know, just as with the previous Item 04, some IT shops go by the credo ‘If it ain’t broken, don’t fix it’. But personally I think it is better to prevent serious issues from occurring than to solve them afterwards.

Since SCOM R2 is based upon SQL Server this component needs serious attention as well.

Again, caution is required. For instance, does the SQL server instance run only the SCOM DBs or DBs for other applications as well? If so, do those applications support the latest SPs? Sometimes they don’t.

So inform yourself.

Having said that, there are some issues with SQL Server 2008 SP1 which are fixed with the release of CU#3 and later. Also know which versions of SQL Server are officially supported by Microsoft for running SCOM R2, to be found here.

And when you decide to install an update, it is perhaps better to skip the CUs (like CU#7 for SQL Server 2008 SP1) and install SP2 right away.

Conclusion
As you can see, much of the overall health, availability and performance of the entire SCOM R2 infrastructure, including the monitored Windows Servers, is directly related to hotfixes, patches and service packs for Windows Server and SQL Server. The versions of the Core MP for SCOM R2 and of the imported MPs play a significant role as well. Go and check it out yourself and, when needed, it is time for some RFCs to be filed… Have fun!

2 comments:

Anonymous said...

Cumulative Update section needs an update now that CU4's out Marnix!

Marnix Wolf said...

Hi Steve, you are right. I will update the post.
Cheers, Marnix