Monday, January 31, 2011

Opalis Operator Console 6.3 Installation: Utility to Automate the Installation

Kelverion has created and published a new tool that simplifies and automates the installation of the current Opalis 6.3 operator console.

Want to know more?

  1. Read more about the utility here;
  2. Watch a demo of it in action here;
  3. Register for the download here.


Friday, January 28, 2011

SCOM R2: Am I Healthy? – Part IV – Is SCOM R2 Up-to-date or Outdated?

----------------------------------------------------------------------------------
Postings in the same series:
Part  I – The Introduction
Part II  – Know What You Have
Part III – Are the DBs OK?
Part  V – Let’s Use The Community
----------------------------------------------------------------------------------

In the fourth posting of this series I will take a deeper dive into the status of SCOM R2, or more precisely the patch and hotfix level of the entire Management Group AND the monitored servers. This topic deserves an entire posting since it is often neglected while it’s key to the overall health, performance and availability of the entire SCOM R2 infrastructure.

So let’s start in order to pinpoint outdated stuff and correct it!

But before I start I want you to know that this posting is based upon the assumption that proper update management is in place AND functional. It goes without saying that all Windows Servers are patched and updated on a regular (monthly?) basis. So this posting will not describe this since it is a basic requirement for any IT environment in order to be in good shape.

01 – SCOM R2 Core MP
Let’s start very close to SCOM R2 itself. It is time to review the version of the Core MP of SCOM R2 itself. This MP is key to the overall health, performance and availability of SCOM R2.

If there is one MP I really appreciate AND which gets better with every release, it’s this MP. All the flows, thresholds of Agents, queue sizes and the lot are configured in this MP. Remember the days when we had to configure certain thresholds (Private Bytes for instance) ourselves? Those days are over with the latest versions of the Core MP for SCOM R2.

The Product Team releases a new version roughly every three to four months. Besides bug fixes (if any), each version contains improvements based on Best Practices and input from the field, such as customers, the official Forums, PFEs and MVPs. Additional functionality is added as well in order to improve SCOM R2 as a whole. And on top of it all, additional reports are added and existing ones improved.

When upgrading from an old version of the SCOM R2 Core MP (for example when upgrading from SCOM SP1 to SCOM R2) it is good to know ADDITIONAL reports have been added. When upgrading the Core MP from the online catalog these additional Reports won’t be added since they aren’t present in the original version of the SCOM R2 Core MP. In order to solve that, go to this blog posting of mine.

The latest version of the SCOM R2 Core MP contains a very nice feature: it detects whether WMI on the monitored Windows Servers is still functional and running. And when it’s not, SCOM R2 can take action in order to restore WMI. And as we all know, WMI isn’t always as robust as it should be. So this MP takes away a lot of manual labor. Of course, it would be better if WMI were robust by itself, but that is a different topic altogether :).

Say what? How to know what version of the Core MP is the most recent? And where to download it? And when to know a new version has arrived? Good question! Fortunately I post about it on a regular basis. The posting about the most current version of the SCOM R2 Core MP is to be found here with all the information you need.

One word of caution however: as with any other update, patch, fix or updated MP, TEST IT before you put it into production. A single box – virtualized – can be used as a test environment, running an isolated AD Forest, SQL and SCOM R2, all based on trial licenses. Also keep a keen eye on the Official TechNet Forums in order to know whether the update is OK.

The latest version of the Core MP is a good one BUT needs some additional attention which I also blogged about, to be found here.

02 – The other MPs
In your SCOM R2 environment other MPs will be present as well (otherwise SCOM R2 is only monitoring itself…). Many times I bump into SCOM R2 environments which run outdated MPs – besides the Core MP for SCOM R2 – as well.

But to update those MPs without any proper planning can be even worse compared to running outdated MPs. Why? Two reasons:

  1. As stated before, TESTING is required and, of course, RTFM the MP Guide;
  2. Even though some MPs are updated, they are (unfortunately) not an improvement. The latest DHCP MP is a real drama (version 6.0.6709.0) so stay away from it. But then again, when TESTING an updated MP before putting it into production you would have found this out anyway…

Having said that, it is still best practice to run the latest versions since they deliver much added functionality and monitoring options. Some good and shiny examples are:

  • SQL MP;
  • Server OS MP;
  • Exchange 2010 MP (an update for this MP is to be expected soon…)
  • etc etc etc…

03 – Cumulative Updates (CUs)
We all know the days SCOM hit RTM. Soon SP1 came out. After that a whole chain of hotfixes and patches. It was a challenge to know what hotfixes to install and when and in what order. For myself I had a Word document containing many hotfixes and per hotfix an overview what it did and whether it was required by default or under certain circumstances. Alongside I kept a watchful eye on Kevin Holman’s blog as well since he ran a list with those hotfixes as well.

But it wasn’t very effective. Luckily Microsoft listened and introduced the well-known CU system (already present in SQL and Exchange for instance) in SCOM R2 as well. So now all hotfixes and patches are combined and put into a CU, with added functionality as well. So no more hassle with running multiple hotfixes/patches, but a single executable. Another good thing about the CUs is that each CU contains all the hotfixes/patches/updates contained in the previous ones. So by just installing the LATEST CU for SCOM R2, one is up to speed again.

Again, I can’t say it enough times: TESTING is key, and so is RTFM. WAITING is another nice thing to do. So when a new CU is released, just wait some weeks, test it as well, and follow the blogs and the OpsMgr TechNet Forum. Also keep an eye on the official web pages of Microsoft about the CU since they will reveal new information as well. The section ‘List of known issues for this update’ will tell you many things you need to know. So READ it.

The latest CU level for SCOM R2 is CU#3. Want to know more? Go here and here.

I can recommend this CU BUT when you are running SCVMM and use PRO Tips, be careful though. Read the ‘List of known issues for this update’ and you know why…

04 – It’s all about patches and updates…
SCOM R2 and its Agents hit many core components of the monitored Windows Servers or of the Windows Servers on which the SCOM R2 infrastructure is installed. Some of these core components, like the JET database and WMI, need some additional patching, all depending on the version (2003 or 2008) you are running (mind you, some hotfixes are meant for both server versions).

Here is a list of hotfixes and patches I always advise my customers to install on their servers. Again <sigh> TESTING is required. In a customer’s environment the mentioned SP1 for XML Parser broke an in-house built application which ran on XML version 4.x… So be careful.


05 – CU and SP level for SQL Server
I know, just as with the previous Item 04, some IT shops go by the credo ‘If it ain’t broken, don’t fix it’. But personally I think it is better to prevent serious issues from occurring than to solve them afterwards.

Since SCOM R2 is based upon SQL Server this component needs serious attention as well.

Again, caution is required. For instance, does the SQL server instance run only the SCOM DBs or DBs for other applications as well? If so, do those applications support the latest SPs? Sometimes they don’t.

So inform yourself.

Having said that, there are some issues with SQL Server 2008 SP1 which are fixed with the release of CU#3 and later. Also know which versions of SQL Server are officially supported by Microsoft for running SCOM R2, to be found here.

And when you decide to install an update, it is perhaps better to skip the CUs (like CU#7 for SQL Server 2008 SP1) and install SP2 right away.

Conclusion
As you can see, much of the overall health, availability and performance of the entire SCOM R2 infrastructure – including the monitored Windows Servers – is directly related to hotfixes, patches and service packs for Windows Server and SQL Server. Also the versions of the Core MP for SCOM R2 and the imported MPs play a significant role. Go and check it out yourself and when needed, it is time for some RFC’s to be filed… Have fun!

Free White Paper: How to manage System Center Operations Manager using Groups

Fellow MVP Cameron Fuller has been working with Veeam to put together a white paper on how to leverage groups to better manage Operations Manager. This White Paper is available at no cost, to be found here.

In the past Cameron posted a four part series about the same topic. In conjunction with the earlier mentioned White Paper (there is some overlap of course) you will have a good understanding about Groups in SCOM, what they do, how they function and how to use them in a proper way.

Since I am talking about Groups, I want to mention some other postings as well since they give more deep information about Group creation as well:

The White Paper written by Cameron contains also many other links to web pages all about Group creation, calculation and Best Practices. So whenever you are bored, there is enough to read!

Wednesday, January 26, 2011

SCOM R2 and Dashboards article in WindowsITPro

Fellow MVP Cameron Fuller has written an excellent article for the magazine WindowsITPro about SCOM R2 Dashboards. For some time this article was only available to the subscribers of that magazine.

However, for a few days now this article has been made publicly available. Even though I have posted a whole series about Dashboards, Cameron’s article reveals new angles and approaches.

So when you are interested in Dashboards for SCOM R2 and want to know more about them, go here. New things to be learned.

Thanks Cameron and WindowsITPro for sharing this good information.

Some examples of the Dashboards Cameron created (nice job!):

Monday, January 24, 2011

Config Churn & Exchange 2010 MP

01-24-2011 Update: Based on a comment it turned out that the Discovery Cycle can’t be more than 24 hours (86400 seconds). This blog posting has been corrected based on that input. Thanks for keeping me sharp!

Based on this blog posting of Kevin Holman I changed some Discovery Rules in the Management Groups which I control for some customers of mine.

However, many more MGs from other customers I do not control anymore but I still want those MGs to be in good condition. So I contacted the system engineers running those environments and told them about the issue and sent them the url of the blog posting.

Some of them reported back with the issue being fixed. Others told me they could not find the Discoveries being described in Kevin’s posting. So this posting is all about how to find these Discoveries and to change them accordingly, or better, a quicker way to get the job done. So let’s start.

  1. Open the SCOM R2 Console with Admin permissions. Go to Monitoring (or whatever Wunderbar for that matter…). In the pane go to Tools > Search > and click on Object Discoveries.
    image

  2. The Search Window is opened. Type owningserver and click Search. The two Object Discovery Rules as described in Kevin’s posting are shown now:
    image

  3. Select the first one. In the Object Discovery Details pane more information is shown. Click on the View Knowledge link and the properties for this Discovery Rule will be shown.
    image

  4. Go to the last tab Overrides > click on the button Override… > select the first option For all objects of class: Mailbox. Set the override as stated in Kevin’s blog. However, even though it is Best Practice to save all overrides targeted against a certain MP into a MP of its own and NOT the Default MP, another approach is to be advised. As Kevin states in the same posting, the new MP will solve this issue. So by storing these particular overrides in a new MP, these overrides are easily removed when the newest version of the MP is imported by simply removing this particular MP containing these overrides. So your MP will look like this:
    image

    And now the override will look like this: 
    image  
    Click Apply and OK.

  5. Repeat Steps 3 and 4 for the other Discovery Rule and be happy.
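
To double-check an interval before typing it into the override, the conversion between hours and seconds can be sketched as below. This is only an illustration in Python and not part of Kevin’s posting. The function name and the constant are made up for this example; the only fact taken from this posting is the upper limit of 86400 seconds from the update note at the top.

```python
# Upper limit for the Discovery Cycle as mentioned in the update note (in seconds).
MAX_DISCOVERY_INTERVAL_SECONDS = 86400


def hours_to_interval_seconds(hours: float) -> int:
    """Convert hours to seconds and refuse values above the limit (hypothetical helper)."""
    seconds = int(hours * 3600)
    if seconds <= 0:
        raise ValueError("interval must be positive")
    if seconds > MAX_DISCOVERY_INTERVAL_SECONDS:
        raise ValueError(
            f"{seconds} seconds exceeds the limit of {MAX_DISCOVERY_INTERVAL_SECONDS}"
        )
    return seconds


# A full day is exactly the limit.
print(hours_to_interval_seconds(24))
```

Anything above a full day raises an error in this sketch, so a typo is caught before it ends up in an override.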

Never ever be afraid to ask any question because you think it might sound stupid. In my humble opinion there is only ONE truly stupid question, which is the question that is NEVER asked….

How To: Custom Report Authoring for Beginners

Jonathan Almquist has written an excellent posting about how to start with Report Authoring for SCOM R2.

In conjunction with this information it will jumpstart you in building your own SCOM R2 Reports. Thanks Jonathan for sharing!

How To: Install Opalis 6.3

If you want to know how to install Opalis 6.3 there are some good sources of information to be found on the internet, telling you all you need to know:
  1. Charles Joy
    Type of information: Video: http://blogs.technet.com/b/charlesjoy/archive/2011/01/07/installing-opalis-integration-server-6-3-video-tutorial.aspx

  2. Christopher Keyaert
    Type of information: Installation Guide: http://www.vnext.be/2011/01/07/how-to-opalis-6-3-installation/

Combined you will have a thorough understanding about what to do. Thanks Charles & Christopher for sharing!

Dell & Microsoft

A long time ago, when I had just started this blog, I didn’t like the efforts of Dell for System Center. But much has changed!

The latest Dell MP is a good showcase of how mature Dell has become with SCOM and of its willingness to learn & listen. Of course, as with any other MP, there is always room for improvement. But still the latest Dell MP is really good.

As it turns out, Microsoft and Dell are gaining momentum when it comes down to integrating Dell’s products with System Center. A video has been published in which Brad Anderson, CVP of the Management & Security Division, interviews Laurie Tolson, Vice President of Systems Management at Dell.

They also discuss Dell's solution for Hyper-V Cloud Fast Track, Microsoft's set of programs for private cloud deployment. And believe me, the cloud is coming. There is much cloud washing going on, but that doesn’t mean EVERYTHING is just vapor…

New Integration Packs for Opalis

Got this from a very good blog about Opalis and other SC stuff: Nexus SC: The System Center Team Blog.

On Codeplex three new IPs have been published:

Other cool stuff can be found there as well:

So go check it out yourself.

Thursday, January 20, 2011

New MP: Windows Storage Server 2008 R2 Monitoring

Some days ago Microsoft released a new MP, which monitors the health of Windows Storage Server 2008 R2.

Taken directly from the website:
image

MP to be downloaded from here.

OpsLogix Ping MP Target Host Export Script

Many SCOM environments are using the OpsLogix Ping MP. A nice MP it is since it adds basic and easy to use Ping functionality to SCOM. This MP is built in such a manner that one can easily import many devices based on their IP addresses and friendly names:
image

But wouldn’t it be nice to have an option to export the devices as well? Into the same format (IP address and friendly name) so when the list of devices in the MP gets corrupted or damaged in any kind of way, one is easily back on track again by importing the latest export file?

Yesterday I visited a customer of mine. One of his system engineers bumped into the issue where the list of devices got corrupted so he was forced to add many of the devices by hand since the import list – originally used - wasn’t up to date any more.

Since he is a real script kiddo, he built a PS-script which exports the list of devices present in the OpsLogix Ping MP. This script is managed by Task Scheduler so it runs on a regular basis. That way he has always a backup available in case ‘disaster’ strikes. Some time ago disaster struck again. But now he had this export file, ready to be used. Within a few minutes all the devices were back again!

I asked him whether I am allowed to blog about it and to share his script with the Community. And guess what: he agreed. But he doesn’t want me to mention his name, so let’s call him Mister X. This is his favorite lunch:

All credits go to him. Thank you Mister X! Much appreciated. I owe you a hamburger!

The export functionality comes in two parts:

  1. A batch file which calls the PS-script;
  2. The PS-script which exports the devices present in the OpsLogix Ping MP to a file.

The batch file is managed by the Task Scheduler in order to have it run on a scheduled basis. The following entries (marked in red in the original posting) need to be modified to match your environment:

  1. The location of the PS-script;
  2. The name of the RMS;
  3. The location of the export file.

Batch File:

@echo off
powershell d:\scripts\opslogix\opslogix.ps1

PS script (opslogix.ps1):

# Name of the Root Management Server (adjust to your environment)
$rootMS = "NAME OF RMS"
# Location of the export file (adjust to your environment)
$exportFile = "d:\scripts\opslogix\opslogix.log"

# ========================================================================================
# Initializing OpsMgr 2007 Powershell provider
# ========================================================================================
Add-PSSnapin "Microsoft.EnterpriseManagement.OperationsManager.Client" -ErrorVariable errSnapin
Set-Location "OperationsManagerMonitoring::" -ErrorVariable errSnapin
New-ManagementGroupConnection -ConnectionString:$rootMS -ErrorVariable errSnapin
Set-Location $rootMS -ErrorVariable errSnapin

# Remove the previous export, if any, so every run starts with an empty file
if (Test-Path $exportFile) { Remove-Item $exportFile }

# Get all Ping targets known to the OpsLogix Ping MP
$mclass = Get-MonitoringClass -Name "OpsLogix.IMP.Ping.Target"
Get-MonitoringObject -MonitoringClass:$mclass | ForEach-Object {
    # PathName holds the encoded target: decode "%002d" to "-" and keep what follows "%003a"
    $IP = $_.PathName
    $IP = $IP.Replace("%002d","-")
    $IP = $IP.SubString($IP.IndexOf("%003a")+5)
    Write-Host $IP
    # One line per device: friendly name;IP address
    $regel = $_.DisplayName + ";" + $IP
    Add-Content $exportFile $regel
}

I am sure many people will appreciate this. Nice!

Of course, before importing the file again, one or more valid Source Hosts must be present.
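
For those who want to see what the string handling in the script does, here is a small sketch in Python (chosen only for illustration; the export itself stays in PowerShell). The sample PathName value is made up, since the real values depend on your environment, and the function names are invented for this example. The only things taken from the script are the two operations: "%002d" becomes "-" and everything up to and including "%003a" is dropped.

```python
MARKER = "%003a"  # the marker the PS-script searches for in PathName


def extract_target(path_name: str) -> str:
    """Return the part of PathName after the "%003a" marker, with "%002d" decoded to "-"."""
    cleaned = path_name.replace("%002d", "-")
    index = cleaned.find(MARKER)
    if index == -1:
        # Assumption: without a marker the whole value is kept
        # (the PS-script does not handle this case explicitly).
        return cleaned
    return cleaned[index + len(MARKER):]


def export_line(display_name: str, path_name: str) -> str:
    """Build one line of the export file: friendly name;target."""
    return display_name + ";" + extract_target(path_name)


# Hypothetical input, for illustration only.
print(export_line("Core Switch", "src%002dhost%003a10.0.0.5"))
# prints: Core Switch;10.0.0.5
```

Each line of the export file has the same two fields, friendly name and IP address, that the import of the MP uses.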

Wednesday, January 19, 2011

SCOM R2: Am I Healthy? – Part III – Are The DBs OK?

----------------------------------------------------------------------------------
Postings in the same series:
Part  I – The Introduction
Part II  – Know What You Have
Part IV – Is SCOM R2 Up-to-date or Outdated?
Part  V – Let’s Use The Community
----------------------------------------------------------------------------------

The third posting in this series is all about the SCOM R2 DBs and their health. The overall health AND performance of SCOM R2 is directly related to the SCOM R2 DBs and their health. So it is key to have insight into their status. This posting will enable you to do just that, and in such a way that you don’t need to become a SQL or SCOM guru. Just like checking the oil of your car.

01 – Along came the Community
In the old days one had to run many different SQL queries against the DBs. However, thanks to the combined efforts of two fellow MVPs, Pete Zerger and Oskar Landman, this isn’t needed anymore.

Just import the SCC Health Check Management Pack Version 2 (RTFM is required since an additional data source has to be created) and you will get 27(!) new Reports, all targeted at showing the status of your SCOM R2 environment, including the Health of the DBs:
image
(The highlighted Reports are the ones related to the health of the SCOM DBs.)

And not just that. These Reports also give very good information, shown in the Report Details area when a Report is selected. Besides the explanation, some good url’s are shown as well. These url’s give you more information about Config Churn, how to identify it and how to battle it, the Known Issue with the LocalizedText table, eating away too much DB space and how to solve that one as well:
image

And:
image

02 – Let’s not forget the SCOM R2 Core MP…
On top of it all, the latest core MP for SCOM R2 does a really good job of monitoring the health of the SCOM DBs as well. The DBs are monitored in many ways so when something goes wrong you will certainly know it.

One important monitor to reckon with is the Operational Database Space Free (%) Monitor. As the name implies, it monitors the percentage of available free space of the OpsMgr DB and raises a Warning when it falls below 40% and an Error when it falls below 20%:
image

When the OpsMgr DB has too little free space, many issues might happen, like being unable to import any new MP, as I blogged about earlier on. So do not alter this Monitor in any kind of way. And when an Alert comes in from this Monitor, act upon it.

Mind you, this Monitor does not monitor the size of the DB as a file on the disks, but looks inside the DB itself and monitors the available free space within that DB.  Have had some serious discussions about this some time ago.
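
To make the two thresholds concrete, here is a minimal sketch in Python of how such a free-space check could be expressed. It is not how the Monitor is implemented in the MP; the function names and sample numbers are made up for this example. Only the 40% and 20% thresholds come from the Monitor as described above.

```python
WARNING_BELOW_PERCENT = 40.0  # Warning threshold as described above
ERROR_BELOW_PERCENT = 20.0    # Error threshold as described above


def free_space_percent(allocated_mb: float, used_mb: float) -> float:
    """Free space INSIDE the DB, as a percentage of the space allocated to it."""
    if allocated_mb <= 0:
        raise ValueError("allocated_mb must be positive")
    return (allocated_mb - used_mb) / allocated_mb * 100.0


def classify(free_percent: float) -> str:
    """Map a free-space percentage to the state described for the Monitor."""
    if free_percent < ERROR_BELOW_PERCENT:
        return "Error"
    if free_percent < WARNING_BELOW_PERCENT:
        return "Warning"
    return "Healthy"


# Hypothetical numbers: a 20 GB DB with 13 GB in use has 35% free space.
print(classify(free_space_percent(20000, 13000)))
```

Note that the calculation uses the space allocated to the DB and the space in use within it, not the free space on the disk holding the DB files.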

03 – Let’s check the Default Settings for the SCOM R2 DBs
While installing SCOM R2 the DBs are created and the settings for the Recovery Model, Autogrowth and Autoshrink are configured based on best practice and shouldn’t be altered UNLESS you know what you’re doing and know its consequences. But never ever enable Autoshrink because it will automatically ‘enable’ a lot of unwanted issues as well…

Having said that, it’s worthwhile checking these settings since it wouldn’t be the first time a SQL Operator changed one or more of these settings (they seem to love the setting Auto Shrink) without knowing the consequences or communicating it to the SCOM people involved…

OperationsManager DB settings
The Recovery Model is set to Simple, Autogrowth to None and Autoshrink to False:
image

And:
image 

Data Warehouse DB (OperationsManagerDW) settings
For this DB the settings are a bit different. The Recovery Model for the OperationsManagerDW DB is set to Simple and Autoshrink to False. So far so good:
image

But the Autogrowth settings are different compared to the OpsMgr DB:
image
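
The settings listed above lend themselves to a simple comparison. The sketch below, in Python and purely illustrative, compares settings you have looked up yourself (for instance in SQL Server Management Studio) with the values named in this posting. The dictionary keys and the function name are invented for this example. The Autogrowth setting of the Data Warehouse DB is left out on purpose, because its value is not stated in the text.

```python
# Expected values as named in this posting (keys are hypothetical labels).
EXPECTED = {
    "OperationsManager": {
        "recovery_model": "Simple",
        "autogrowth": "None",
        "autoshrink": False,
    },
    "OperationsManagerDW": {
        "recovery_model": "Simple",
        "autoshrink": False,
    },
}


def find_deviations(db_name: str, observed: dict) -> list:
    """Return a readable message for every setting that differs from EXPECTED."""
    deviations = []
    for setting, expected in EXPECTED[db_name].items():
        actual = observed.get(setting)
        if actual != expected:
            deviations.append(
                f"{db_name}: {setting} is {actual!r}, expected {expected!r}"
            )
    return deviations


# Example: a SQL operator has switched on Auto Shrink.
print(find_deviations(
    "OperationsManager",
    {"recovery_model": "Simple", "autogrowth": "None", "autoshrink": True},
))
```

An empty list means the observed settings match the values described in this posting.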

04 – Are the DBs properly dimensioned?
Based on my second posting you know by now how many objects are being monitored by SCOM R2. This number has a direct relationship with the size of the SCOM R2 DBs. So it is time to do some checking.

Do they match? Are the DBs properly sized or not? When they seem to have too much space, it is not an issue as long as there is enough disk space available and not required by other applications.

However, when the DBs are too small it is time to make them a bit bigger. This will do some good, especially for the OpsMgr DB. But how do you know what size the DBs need to have? Good question! The answer is simple: Go here, download the sheet and use it as intended. It will explain itself.

Also something to reckon with, besides this sheet, are the recommendations from the field. Based on real life experience. To be found here. See screen dump as well:
image 

So now you have gained good insight into your environment. But still there might be some issues at hand which need attention. And when taking a good look at those issues, they turn out to be related to the DBs as well. To name a few:

05 – The Ghost Computers. Am I being haunted?
Sometimes you might bump into an issue where you have removed some monitored Computers from the SCOM R2 environment. But they are still present in the SCOM R2 Console. No matter what you do, like clearing the cache of the Console for instance, they keep coming back, totally grayed out of course. So now what?

Good news! A much respected SCOM MVP for many years now (Maarten Goet) blogged about this issue some time ago. However, the solution is unsupported. So use it wisely and at your own risk. Since it is better to be safe than sorry, BACK UP the OpsMgr DB first before running the queries mentioned in his blog posting.

06 – Where are my new Reports?
When you import a new MP into SCOM R2, it will take a while before all the Reports show up. They need to be uploaded to the SQL Server Reporting Services (SSRS) instance. Sometimes however, they won’t appear at all or only partially. The latter might happen with the MP I mentioned in Item 01.

When one does not create the required data source BEFORE importing this MP, only some pieces of the Reports contained within this MP will be uploaded to the SSRS instance. In order to remedy it go through these steps:

  1. Remove the MP,
  2. Create the data source as described in the related guide,
  3. Open SSRS in IE, remove the partially uploaded reports coming from this MP,
  4. Check out SCOM R2 Reporting. When all Reports related to this MP are gone, import the MP again and you will be just fine.

SCOM R2 will tell you when Reports are not being uploaded to the SSRS instance. However, the Alert does not tell you how to remedy it:
image

The times I bumped into it, the cause was related to altered permissions on the SCOM R2 service accounts. So check them out thoroughly. Also read this posting of mine. Resetting the accounts mentioned here will force SCOM R2 to upload the Reports contained in the MPs to the SSRS instance. So when any Reports are missing, the upload will be tried again. And when it goes wrong, it will be logged in the OpsMgr event log of the RMS. So keep a keen eye on that log.

07 – Does the information flow into the DBs?
Sometimes a hiccup might occur and the information, collected from the monitored servers and other objects, might not flow from the Management Servers into the SCOM DBs. When that happens, SCOM R2 will tell you.

The times I bumped into this issue, the cause was to be found in altered permissions. So check the SCOM R2 service accounts. Other temporary issues were network related, like an enabled firewall on the SQL server without having set the proper exceptions, to be found here (under the header Operations Manager 2007 Firewall Scenarios).

08 – Does Grooming Work?
Important to know is whether grooming is still running and functional. Grooming means that closed Alerts are groomed out of the OpsMgr DB as configured in the Database Grooming settings:
image

In order to check whether grooming is doing fine and when not, how to solve it, go here.

As you can see, much of the Health of SCOM is directly related to the DBs. It goes without saying that the underlying Server OS and hardware (whether P or V) must be up to specs as well. So check it out yourself and regain control over SCOM R2. The next posting in this series will be all about updates and the lot. So stay tuned and see you all next time.

Tuesday, January 18, 2011

AVIcode Forum on TechNet

On the TechNet OpsMgr Forum an additional item has been added: AVIcode, to be found here.

When you are using AVIcode and have any questions, you can post them here. The AVIcode forum is staffed by Microsoft employees and MVPs with deep AVIcode knowledge and experience.

MP Authoring Videos: How Discoveries work

Brian Wren has made some cool videos about how Discoveries in SCOM work.

There are four videos available now:

  1. How Discovery Works;
  2. Registry Discoveries;
  3. WMI Discoveries;
  4. Discovery Scripts.

Good videos these are. Enjoy them.

Wednesday, January 12, 2011

SCOM R2: Am I Healthy? – Part II – Know What You Have

----------------------------------------------------------------------------------
Postings in the same series:
Part  I – The Introduction
Part III – Are the DBs OK?
Part IV – Is SCOM R2 Up-to-date or Outdated?
Part  V – Let’s Use The Community
 
----------------------------------------------------------------------------------

The second posting of this series is all about the inventory of your SCOM R2 environment. A good and clear picture of your environment is key to the overall health of that same environment.

Why? Suppose you have too many servers reporting to a single SCOM R2 Management Server. This will affect the total performance of your SCOM R2 environment negatively. Or suppose you think the RMS is beefed up enough but it turns out it isn’t. And when that turns out, it is always at the wrong moment of the day and/or week.

Or even worse, the non-clustered RMS fails and a MS has to be promoted to RMS. What MS is suited best for this temporary task? Where is the Encryption Key? Where is the password? Does the RMS also have an SMS-enabled device attached to it which needs to be connected to the MS which has become the temporary RMS?

It might sound stupid but many times I bump into SCOM R2 environments where the system engineers assume they have a clear picture of it. But when I ask questions like:

  • What AD accounts are used for SCOM;
  • Are those accounts written down;
  • Do you have the passwords available for those accounts;
  • Are there one or more Gateway Servers in place;
    • If yes, what are their names and in what Forests/Workgroups do they reside?
    • If yes, what PKI did you use?
    • If yes, is that PKI still operational?
  • The exact number of SCOM R2 Management Servers;
  • Whether the RMS is running on a physical box or not;
  • Whether the RMS is clustered;
  • How many Agents are pushed and how many manually installed;
  • What CU level the environment is running on;
  • Whether the SQL server is running on a physical box or not;
  • The size of the databases;
  • What dimension/size the SCOM R2 environment was initially designed for;
  • etc
  • etc

Some or many of those questions aren’t answered right away. And yes, it is understandable. Many times the people who first ran the SCOM R2 environment are working in different departments now or are working for another company. And not much is documented and when it is, it isn’t shared with the new system engineers. So one or more blind spots are present and need to be addressed.

When you don’t have a document describing the SCOM R2 environment it is time to do it now. Better to write it down in a moment of ‘calm and peace’ (I know, system engineers are always busy) than hitting into one or more blind spots while trying to recover from a disaster. Because at moments like that it is too late (and you hear a small voice nagging ‘I told you so…’).

So what do you write down about your SCOM environment?

Here is a ‘shortlist’ (duh!) of what kind of information is required in order to ‘Know What You Have’. Only then good management is plausible. Anything else is just an assumption and as we all know, ‘assumption is the mother of all …’.

‘One’ piece of advice: take your time to complete the document. An hour or so every day for a week should be enough. And when the document is completed, keep it up to date. Takes about five minutes. Max. Saves a lot of time. So good versioning is required. And store it in a good location which is available and accessible for the other team members as well, like SharePoint.

So let’s start. The document should contain information like this:

  1. Management Group information
    1. Name of the Management Group;
    2. How many Agents are being used (or: how many servers/clients are being monitored);
    3. AEM setting;
    4. ODR setting;
    5. CEIP setting;
    6. Database Grooming Setting;
    7. Number of allowed missed heart beats;
    8. Manually installed Agents:
      1. Are they allowed;
      2. Are they approved automatically;
    9. Connectors, per Connector:
      1. Name;
      2. Functionality;
      3. Used AD accounts with passwords (encrypted!);
      4. Used FQDNs and IP Addresses;
      5. Configured settings;
    10. Whether 3rd Party software (nWorks, Jalasoft, OpsLogix etc etc) is being used;
    11. History:
      1. When is it installed;
      2. What version was it originally;
      3. Has it ever been migrated to other hardware or from P to V or vice versa;
      4. Any major issues like outages and other major downtimes;
    12. Installed Language;
    13. Is the Reporting Component installed:
      1. What is the SSRS url;
      2. What SQL server is hosting the Data Warehouse;
    14. Is the Web Console installed:
      1. What is the url;
      2. Is SSL used;
      3. Is it published to the internet.

  2. Placement;
    1. FQDN of Forest;
    2. LAN segment;
    3. Environment, like Production, Testing or anything else.

  3. Version of SCOM (RTM (I hope not!), SP1, R2 and CU level when R2 is being used);

  4. RMS information;
    1. FQDN;
    2. IP address;
    3. LAN segment;
    4. Physical location (even when it is virtualized);
    5. P or V box;
    6. Amount of CPU, RAM, Disks;
    7. Server OS and patch level;
    8. Disk configuration, RAID settings and sizes;
    9. Whether it is clustered or not;
    10. SMS enabled device attached to it or not;
    11. AD accounts being used for the SCOM service accounts WITH passwords (or a reference to the encryption tool where they are stored):
      1. SCOM SDK Account;
      2. SCOM Action Account;
      3. SCOM Data Warehouse Read Account;
      4. SCOM Data Warehouse Write Account;
      5. SCOM Health Account;
      6. Any third party software account;
      7. Run-As-Profile accounts (like needed for the SQL/AD MPs).
    12. Backup of Encryption Key and its location (stored outside the RMS with the password as well);
    13. Does the RMS perform other tasks as well (do monitored servers report to it/is 3rd party software installed on it).

  5. SQL Server and SCOM databases
    1. FQDN;
    2. IP address;
    3. LAN segment;
    4. Physical location (even when it is virtualized);
    5. P or V box;
    6. Amount of CPU, RAM, Disks;
    7. Server OS and patch level;
    8. SQL Server:
      1. Version;
      2. Edition;
      3. Architecture;
      4. Patch Level (CUs, SPs and the lot);
      5. Installed features;
    9. Disk configuration, RAID settings and sizes;
    10. Whether it is clustered or not;
    11. Does the SQL server also host other DBs or SQL Instances;
    12. SCOM Databases sizes and locations.

  6. The total amount of MS servers and per MS server:
    1. FQDN;
    2. IP address;
    3. LAN segment;
    4. Physical location (even when it is virtualized);
    5. P or V box;
    6. Amount of CPU, RAM, Disks;
    7. Server OS and patch level;
    8. Disk configuration, RAID settings and sizes;
    9. Function of MS server: what is being monitored and how many.

  7. The total amount of Gateway Servers and per Gateway Server:
    1. FQDN;
    2. IP address;
    3. LAN segment;
    4. Physical location (even when it is virtualized);
    5. P or V box;
    6. Amount of CPU, RAM, Disks;
    7. Server OS and patch level;
    8. Disk configuration, RAID settings and sizes;
    9. Functions of Gateway server(s): what is being monitored and how many;
    10. Whether the Gateway server(s) is/are configured in a fail-over configuration;
    11. FQDN of PKI which is used;
    12. SCOM Action Account for the Forest (and its password) where the GW server resides.

  8. Management Packs
    1. What Microsoft MPs are loaded and configured;
    2. What MPs are custom made;
    3. Versions of MPs;
    4. What Third Party MPs are loaded and configured, if so:
      1. Are additional Connectors installed and if so how are they configured (accounts, FQDNs, IP addresses and the lot);
      2. Are additional servers installed, if so treat them the same as a Management Server while writing down the details.

  9. Backups
    1. Are the SCOM DBs backed up on a regular basis:
      1. What tooling is used;
      2. Where are the backups stored;
      3. What retention policy is used;
    2. Are the SCOM servers (RMS, MS server(s), Gateway Server(s), Third Party Server(s)) backed up on a regular basis:
      1. What tooling is used;
      2. Where are the backups stored;
      3. What retention policy is used;
    3. Are the unsealed MPs backed up on a regular basis;
    4. Are all the backups tested for validity on a regular basis.
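The per-server parts of the list above (RMS, MS, Gateway and SQL servers) share the same basic fields, so they could also be kept in a machine-readable form next to the document. What follows is only a minimal sketch in Python, not a tool that ships with SCOM: the class name `ServerRecord`, its field names and the example host name are all assumptions made up for illustration.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ServerRecord:
    """One server entry of the inventory (hypothetical field names)."""
    role: str                                    # e.g. "RMS", "MS", "Gateway", "SQL"
    fqdn: Optional[str] = None
    ip_address: Optional[str] = None
    lan_segment: Optional[str] = None
    physical_location: Optional[str] = None
    physical_or_virtual: Optional[str] = None    # "P" or "V"
    cpu_ram_disks: Optional[str] = None
    os_and_patch_level: Optional[str] = None
    disk_configuration: Optional[str] = None


def missing_fields(record: ServerRecord) -> List[str]:
    """Return the names of all fields that are still empty (the blind spots)."""
    return [f.name for f in fields(record)
            if getattr(record, f.name) in (None, "")]


if __name__ == "__main__":
    # Example values are invented; replace them with your own environment.
    rms = ServerRecord(role="RMS", fqdn="rms01.example.local")
    print("Still to document:", ", ".join(missing_fields(rms)))
```

Running `missing_fields` for every server gives a quick overview of what still has to be written down before the document can be called complete.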

When you have the document ready, you will have many benefits from it. I know, it is a lot of information to gather but one day you need it and then you are HAPPY with the effort you made while creating this document.

And when you have this document you have a better understanding of your SCOM environment as well. So when things look like they are going sour you know how to act.

Tuesday, January 11, 2011

SCOM R2: Am I Healthy? – Part I – Introduction

---------------------------------------------------------------------------------
Postings in the same series:
Part II – Know What You Have
Part III – Are the DBs OK?
Part IV – Is SCOM R2 Up-to-date or Outdated?
Part V – Let’s Use The Community
---------------------------------------------------------------------------------

In order for any SCOM R2 environment to operate at a good and reliable level, it needs to be healthy. Otherwise you end up with a situation which is described best as Garbage In, Garbage Out.
image

Sometimes I get questions/comments from people who have been running their SCOM R2 environment for a while without properly maintaining it. And that is not so good. Like every other ICT based solution it needs proper care. For SCOM R2, updates (Cumulative Updates) are published on a regular schedule. The core MP (the MP of SCOM R2 itself) is also regularly updated. And for the other MPs updates do come out as well.

Of course, it is not wise to implement every update as soon as it comes out. It needs some testing and sometimes it is good to wait some weeks as well in order to see what the community has to say about its experiences with the updated product. And when in doubt, go to the official TechNet SCOM Forum and post a question about it. You will definitely get an answer there.

Actually it is nothing new I am telling here. It is all about good Change Management and proper maintenance of your ICT assets. And SCOM R2 makes no difference here. But somehow, sometimes I have the feeling that SCOM R2 gets neglected. And when it does and issues start to occur, like a partially malfunctioning SCOM R2 environment, people start to complain.

Of course, it can be solved. But wouldn’t it be better to prevent it? So no hiccups will occur or even worse, outages of your SCOM R2 environment?

Therefore I have decided to post a whole new series about this topic, or better, how to check whether your SCOM R2 environment is still healthy and operating as it should.

This series will look like this:

  1. Posting I: Introduction;
  2. Posting II: Know What You Have;
  3. Posting III: Are The DBs OK?;
  4. Posting IV: Is SCOM R2 Up-to-date Or Outdated?;
  5. Posting V: Let’s Use The Community.

So stay tuned. See you all next time.

Friday, January 7, 2011

Exchange 2010 MP Helpfile (chm) has been released

Yesterday Microsoft released a Helpfile (chm) for the Exchange 2010 MP, containing MP alerts and topics for Microsoft Exchange Server 2010.
image

The file can be downloaded from here.

Community-developed X-Plat Agents for Debian 5 and Ubuntu 10

Got this one straight from System Center Central:

There are a couple of new X-Plat agents developed by Javier Ripoll and available on Codeplex, delivering monitoring to the Debian 5 and Ubuntu 10 UNIX distributions. If you’re interested in monitoring either of these platforms these agents and MPs are worth a second look.

Debian 5 (x86) System Center Operations Manager 2007 R2 Agent.
You will need the Debian 5 Management Pack

Ubuntu 10 Management Pack for SCOM 2007 R2.
http://scxagentubuntu10.codeplex.com

Requires the Agent (in this project). It is based on Red Hat 5 Management Pack.

Installation
Load both MP's per the normal Opsmgr procedure. You will need to install the agent if you’ve not completed the step already.

Free eBooks

Got this from another blog and copied it directly into this blog posting (some eBooks have been skipped by me though). All credits go to Michael Pietroforte, the owner of that blog.

Thanks Michael for sharing!

The books are ordered chronologically. Books without a publishing date are at the end of the list.

  1. The SysAdmin Handbook
    Red Gate Software, March, 2010.
    A collection of articles about Exchange, Virtualization, Windows Server, Powershell, and Unified Messaging.

  2. Understanding Microsoft Virtualization Solutions (Second Edition)
    Mitch Tulloch, Microsoft Press, February 2010.
    Hyper-V and Remote Desktop Services in Windows Server 2008 R2, Microsoft Virtual Desktop Infrastructure, Microsoft Application Virtualization 4.5, Microsoft Enterprise Desktop Virtualization, Windows Virtual PC and Windows XP Mode, System Center Virtual Machine Manager 2008, and Microsoft’s private and public cloud computing platforms including Windows Azure.

  3. The Complete Windows 7 Shortcuts eBook
    Nitin Agarwal, The Windows Club, February 2010.
    More than 200 keyboard shortcuts containing almost all the keyboard shortcuts that are available in Windows 7 and its default programs like Paint, WordPad, MS Office, Calculator, Help, Media Player, Media Center, Windows Journal, Internet Explorer.


  4. Introducing Windows Server 2008 R2
    Charlie Russel and Craig Zacker, Microsoft Press, 2010.
    Focuses on what is new and important, while giving you the context from Windows Server 2008. Chapters: What’s New, Installation and Configuration, Hyper-V, Remote Desktop Services and VDI, Active Directory, The File Services Role, IIS 7.5, DirectAccess and Network Policy Server, Other Features and Enhancements.

  5. Windows Server 2008 R2 Essentials
    Neil Smyth, Techotopia, 2010.
    Installing, configuring and administering Windows Server 2008 R2 systems including installation and upgrades, networking configuration, remote desktop services, disk and partition management, RAID configuration, security, BitLocker encryption, remote desktop access, print services, resource sharing, clustering, load balancing and user permission management.

  6. Hyper-V Essentials
    Neil Smyth, Techotopia, 2010.
    Overview of the Hyper-V architecture and components, Hyper-V role installation, the creation, management and migration of virtual machines, virtual networking architecture and remote access to virtual machines.

  7. TCP/IP Fundamentals for Microsoft Windows
    Microsoft, 2008.
    Introductory approach to the basic concepts and principles of the Transmission Control Protocol/Internet Protocol (TCP/IP) protocol suite, how the most important protocols function, and their basic configuration in the Microsoft Windows Vista, Windows Server 2008, Windows XP, and Windows Server 2003 families of operating systems.

  8. TCP/IP Tutorial and Technical Overview
    Lydia Parziale, David T. Britt, Chuck Davis, Jason Forrester, Wei Liu, Carolyn Matthews, Nicolas Rosselot
    IBM Redbooks, December 2006
    Chapters: Core TCP/IP protocols (including IPv6), TCP/IP application protocols, Advanced concepts and new technologies.

  9. Introduction to Storage Area Networks
    Jon Tate, Fabiano Lucchese, Richard Moore, IBM Redbooks, July 2006
    Illustrates where SANs are today, who are the main industry organizations and standard bodies active in the SAN world, and it positions IBM’s comprehensive, best-of-breed approach of enabling SANs with its products and services.

  10. VMM 2008 Essentials
    Techotopia.
    Chapters: VMM 2008 Components, VMM 2008 Architecture and Port Usage, VMM 2008 System Requirements, Installing VMM 2008 Components, A Guided Tour of the VMM Administrator Console, Managing Hosts with the VMM Administrator Console, Creating and Managing VMM 2008 Virtual Machine Templates, Managing Virtual Machines with the VMM Administrator Console, Managing VMM 2008 Library Servers, Performing Physical to Virtual (P2V) Conversions using VMM 2008, Converting VMware Virtual Machines to Hyper-V using VMM 2008 V2V, Understanding and Configuring VMM 2008 User Roles, Deploying a VMM 2008 Self-Service Portal.

  11. Security Concepts Book
    Theodore Parker.
    Examines the typical problems in computer security and related areas, and attempt to extract from them principles for defending systems; attempt to synthesize various fields of knowledge, including computer security, network security, cryptology, and intelligence.

  12. Security+ Essentials
    Techotopia.
    Provide the knowledge needed by IT professionals to pass the CompTIA Security+ exam; largely platform agnostic book.

  13. Windows Server 2008 Essentials
    Techotopia.
    Cover all aspects of installing, configuring and administering Windows Server 2008: installation and upgrades, networking configuration, terminal services, disk and partition management, RAID configuration, security, BitLocker encryption, remote desktop access, print services, resource sharing, clustering, load balancing and user permission management.

AVIcode: Application Monitoring

Microsoft acquired AVIcode a while ago. The newest version, 5.7, was released about a month ago.

What it does? Taken directly from the website:
image

However, I do not have that much experience with the product. But I have seen some very cool demonstrations. A fellow MVP, Simon Skinner, has written a whole series about AVIcode 5.7. Since he is an experienced user of this software these postings are really spot on. This is what he has written so far:

  1. System Center Central : AVIcode 5.7 – Part 1 : AVIcode 5.7 goes live
  2. System Center Central : AVIcode 5.7 – Part 2 : Licensing
  3. System Center Central : AVIcode 5.7 – Part 3 : Terminology
  4. System Center Central : AVIcode 5.7 – Part 4 : Install the SEviewer
  5. System Center Central : AVIcode 5.7 – Part 5 : Navigation of the SEviewer
  6. System Center Central : AVIcode 5.7 – Part 6 : Securing the SEviewer

And not just that, he and another fellow MVP, Pete Zerger, will give a webcast about AVIcode on January the 12th. Click here to register for it.

SCOM R2 now officially supports SQL Server 2008 SP2

Yesterday Microsoft updated the SCOM R2 Supported Configurations webpage to reflect that SQL Server 2008 SP2 is now officially supported for SCOM R2.
image

Thursday, January 6, 2011

Blog stats of 2010

So 2010 is definitely over now and 2011 is in full swing.

Time for a small review of that year in relation to this blog. So I pulled some statistics in order to get some insight into how this blog performed in 2010. Some information:

  1. Total amount of visits
    In 2010 the blog had 127,729 visits! Wow! Never ever expected that my blog would get so much attention. THANK YOU everybody.
    image

  2. Total amount of visitors
    The blog got a total of 61,803 visitors.

  3. Total amount of Pageviews
    Whoops! A shiny total of 194,885. Nice!

  4. New postings and comments
    In 2010 I wrote 304 new postings and got about 118 comments.

  5. Top Ten Most Visiting Countries
    image

  6. Top Ten Most Referring Sites
    image

  7. Top Ten Unique Page Views
    image

This is just a small part of all the statistics about my blog. But they are impressive, all the more so when you know that the blog is run solely by me. However, without the input from the community I would not have the energy and drive I have now.

So again, THANK YOU ALL for spending your time on my blog and for all your comments. After all, this blog is meant for the community and not for myself (even though I must admit I use it as my online knowledge base). I am nothing more than a tool, the blog itself the means and SCOM the product it’s all about, meant to aid the community in using SCOM in a better way. Looking at the numbers it looks like I am succeeding in that approach. Nice!

Tuesday, January 4, 2011

Error: Consolidator Module Failed Initialization

Got the above mentioned error on a couple of monitored servers after the Lync MP had been imported and configured. This error was shown in the SCOM R2 Console:
image

I searched the internet but didn’t find anything solid. So it was time to take a deeper dive on the servers as well. Soon I found something which seems to be the cause: WMI is experiencing issues. The servers involved are W2K08 R2. These are some WMI related events which these servers show:

  1. Application Log, EventID 1000:
    image
    WMI is faulting. Hmm, SCOM really needs WMI so this is bad.

  2. OpsMgr Log, EventID 11112:
    image  
    This error generates the Alert shown in the first screen dump.

  3. OpsMgr Log, EventID 10409:
    image
    This event tells that WMI cannot be enumerated.

  4. OpsMgr Log, EventID 4001:
    image
    Many scripts are failing. Scripts which tap into WMI that is…

So it is clear WMI is having serious issues which need to be addressed.

There is a hotfix available for fixing WMI on W2K08 R2 based servers: http://support.microsoft.com/kb/981314. It addresses a memory leak in WMI. Normally this leak will not become visible. It only shows when the Win32_Service class is queried frequently. And guess what SCOM does?

So before the Lync MP was imported, WMI on these servers didn’t get that much load. But after having imported the Lync MP, WMI on these servers is being queried with a higher frequency so the leak comes out.

And when running Perfmon against the process WmiPrvSE (all instances, counter Private Bytes) one will see its memory consumption hitting the maximum:
image
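When the Private Bytes samples are saved to a CSV file, the same check can be done without looking at a graph. The sketch below is only a crude heuristic written in Python and not something SCOM or Perfmon offers: it assumes a CSV layout with a header row, the timestamp in the first column and the Private Bytes value in the second, and the function names and the growth ratio of 1.5 are picked for illustration only.

```python
import csv
import io
from typing import List


def read_private_bytes(csv_text: str) -> List[float]:
    """Read the second column of a counter CSV export, skipping the header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    samples = []
    for row in rows[1:]:
        if len(row) < 2 or not row[1].strip():
            continue  # skip blank or incomplete samples
        samples.append(float(row[1]))
    return samples


def looks_like_leak(samples: List[float], min_growth_ratio: float = 1.5) -> bool:
    """Crude heuristic: memory never drops AND ends at least
    min_growth_ratio times higher than where it started."""
    if len(samples) < 3 or samples[0] <= 0:
        return False
    never_drops = all(later >= earlier
                      for earlier, later in zip(samples, samples[1:]))
    return never_drops and samples[-1] >= samples[0] * min_growth_ratio


if __name__ == "__main__":
    # Invented sample values, only meant to show how the functions are called.
    export = '"Time","Private Bytes"\n"t1","100"\n"t2","150"\n"t3","220"\n'
    print(looks_like_leak(read_private_bytes(export)))
```

A process that is simply busy will go up and down, while a leaking process only goes up. That is all this check looks at, so treat a positive result as a reason to investigate and not as proof.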

Tonight the hotfix will be applied and I am pretty sure this will help and resolve the issue. When the results do come in I will update this posting accordingly.

Monday, January 3, 2011

How To: Troubleshoot MPs which do not seem to land

Sometimes one bumps into an issue where a particular MP doesn’t seem to land on a monitored server. All other MPs are neatly in place and functional on the same server, except for that particular MP. So now what? How to deal with it?

This posting will show you some tips, tricks and advice on how to go about it. Some might seem too obvious, but still they are worth mentioning. So bear with me.

  1. RTFM (Read The Friendly Manual)
    Yeah, I know. The most obvious one indeed. And yet, an important one. Every MP comes with its related guide. RTFM is key in order to get the most out of the MP or even to get it running. Take the SharePoint 2010 MP (love that one), where a file needs to be modified, in conjunction with an account, in order to get the MP running.

  2. Agent Proxy
    Some MPs need the Agent Proxy setting enabled. MPs like that are for example (but not limited to): AD, Exchange and Cluster. Agents installed on servers which are running services like those must have their Agent Proxy set to enabled:
    image
    Again, RTFM is key here since the related MP guide will tell you when this setting must be enabled.

  3. HealthService store is corrupt
    There is an issue with the Jet DB Engine, present on any kind of Windows Server OS. SCOM uses this Jet DB Engine as well (HealthService store (~:\Program Files\System Center Operations Manager 2007\Health Service State\Health Service Store\HealthServiceStore.edb)). In order to solve this a hotfix has been released by Microsoft: http://support.microsoft.com/default.aspx?scid=kb;en-us;981263.

    However, when this is the cause it would not limit itself to a certain MP not landing on a particular server. The same server would experience other issues as well, like being grayed out in the SCOM Console.

  4. Security
    Some MPs need additional permissions in order to function properly, like the SQL MP. Another example: Exchange 2010 installations lock down the servers they are installed on, so some additional work is required. Normally SCOM will raise an Alert when the permissions aren’t sufficient. When the Console stays clean, another way to go about it is to check things out locally on the problematic server:
    - Stop the Agent on the problematic server;
    - Clear the OpsMgr event log on the problematic server;
    - Start the Agent on the problematic server;
    - Check the OpsMgr Event Log for any warning or error;
    - Open the errors/warnings (if any) one by one and read them thoroughly. When it’s a permissions related issue, detailed information will be given;
    - Solve the permissions issues.
    image
    Example of EventID 7026 which tells you the Action account has been validated successfully.

  5. Non forest/domain residing servers
    Any monitored server residing outside the security boundary of SCOM needs certificates in order to communicate with the SCOM Management Group. On top of that, and sometimes people forget this, additional accounts in SCOM are needed. Many times servers like these are not part of the Forest where the SCOM Management Group resides. So the accounts set in SCOM for the SQL MP (for instance) cannot be validated/used on those servers, so additional accounts are needed.
    image 
    Step 4 will also help out here in order to identify what’s happening and why.

  6. Has the MP landed?
    A MP can only do its work when it has landed on the server. So a good thing to know is whether the MP is in place on the server. There are many ways to go about it, but this is the way I prefer the most:
    - Stop the Agent on the problematic server;
    - Clear the OpsMgr event log on the problematic server;
    - Rename the folder ~:\Program Files\System Center Operations Manager 2007\Health Service State;
    - Start the Agent on the problematic server;
    - Check the OpsMgr Event Log for any Event with ID 1201;
      image
    - Run through every Event with the same ID in order to see whether the problematic MP has been received on the server;
    - When the MP isn’t mentioned in any of the Events with this ID (1201), check whether the other servers do get this MP AND check the SCOM Console for Alerts about this particular MP.

  7. Hotfixes, WMI and cscript.exe
    Other important things to reckon with are hotfixes, updates and the like. Not only for SCOM (CUs for SCOM R2) but also for the servers being monitored. As we all know the SCOM Agent relies heavily on certain basic Windows components like WMI, cscript.exe and so on. When WMI is not OK, many MPs will not function properly. When an old version of cscript.exe is in place, some or many scripts will not run properly either.

    W2K08 based servers need some additional attention as well. Check out these postings in order to know more about it: http://thoughtsonopsmgr.blogspot.com/2009/07/opsmgr-and-windows-2008-what-hotfixes.html,
    http://thoughtsonopsmgr.blogspot.com/2009/03/script-or-executable-failed-to-run-part.html,
    http://thoughtsonopsmgr.blogspot.com/2008/12/wmi-and-windows-2003-server.html

  8. MP itself is corrupt
    Haven’t seen this many times, but sometimes a MP might go wrong. With SCOM R2 and the latest CU this is not very likely to happen. So the best way to go about it is to remove the MP, wait an hour or two and import it again.

    Also when the issue is related to a MP gone bad, the problem of a not-landing MP wouldn’t be limited to a single server, unless only one server runs the application/service being covered by that MP.
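The manual steps of tip 6 (‘Has the MP landed?’) can be written down as a command list so they are always run in the same order. The Python sketch below only builds the command lines and does not execute anything. It rests on a few assumptions: that the Agent runs as the Windows service `HealthService`, that its event log is named `Operations Manager`, and that `wevtutil` is present on the server (it ships with W2K08 and later). The function name and the `.old` suffix are made up for this example, and the path of the Health Service State folder is passed in as a parameter since it differs per server.

```python
from typing import List

SERVICE_NAME = "HealthService"      # assumed service name of the SCOM Agent
EVENT_LOG = "Operations Manager"    # assumed name of the OpsMgr event log


def mp_landing_check_commands(state_folder: str) -> List[str]:
    """Build (not run) the commands for the 'Has the MP landed?' check."""
    return [
        # Stop the Agent
        f"net stop {SERVICE_NAME}",
        # Clear the OpsMgr event log
        f'wevtutil cl "{EVENT_LOG}"',
        # Rename the Health Service State folder so the Agent rebuilds it
        f'ren "{state_folder}" "Health Service State.old"',
        # Start the Agent
        f"net start {SERVICE_NAME}",
        # List every Event with ID 1201 from the OpsMgr event log
        f'wevtutil qe "{EVENT_LOG}" /q:"*[System[(EventID=1201)]]" /f:text',
    ]


if __name__ == "__main__":
    # <drive> is a placeholder; fill in the drive used on your own server.
    folder = (r"<drive>:\Program Files\System Center Operations Manager 2007"
              r"\Health Service State")
    for command in mp_landing_check_commands(folder):
        print(command)
```

Give the Agent some time between starting it and querying for EventID 1201, since the MPs have to be downloaded again first. Review the printed commands before running them on a production server.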

Hope this walk through helps in troubleshooting MPs which do not seem to land properly on one or more servers.