Monday, September 29, 2014

Tools: FREE Full Version Of Xian SNMP Device Simulator!!!

Jalasoft Xian Network Manager (Xian NM) is 10 years old this month and celebrates it in a BIG way.

One of those ways is a FREE full version of the SNMP Device Simulator valid for up to 10 devices.

The only ‘catch’ here is that you download the trial from their website on the 15th of October 2014. Jalasoft will send you a permanent license on that date.

Want to know more? Go here.

SCOM 2007x/2012x: Performance Views Show Counters When Perf Collection Rules Are Disabled

Issue
When performance collection rules are disabled in Microsoft System Center Operations Manager, performance views still show the counters, even after all the data has been groomed out. This clutters the related Views and can even make them useless because too many counters are shown.

Cause
This is by design (yikes!)….

Resolution
A SQL script removes the entries from PerformanceDataAllView for which no data is recorded. KB3002249 contains two SQL scripts: one for removing the entries, and another that shows which performance counters will be deleted for which objects before you run the delete script.

!!!WARNING!!!
The same KB states: ‘…Stop all the Operations Manager services on all Management Servers before you run the script…’ AND ‘…back up your OperationsManager Database before you run this script…’.

SO BE CAREFUL WITH THIS AND ONLY RUN IT WHEN REQUIRED.
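The 'stop all services' precondition from the KB could be scripted roughly like this. It is only a sketch: the three service names are my assumption for a default SCOM 2012 Management Server (they are not taken from the KB), and both function names are made up for this example. Check with Get-Service what is actually installed on your Management Servers.

```powershell
# Hypothetical helper. Assumption: these are the service names on a default
# SCOM 2012 Management Server. Verify them with Get-Service before relying on this.
function Get-ScomServiceNames {
    # HealthService = health service, OMSDK = Data Access, cshost = Management Configuration
    return @('HealthService', 'OMSDK', 'cshost')
}

function Stop-ScomServices {
    [CmdletBinding(SupportsShouldProcess = $true)]
    param()
    foreach ($name in (Get-ScomServiceNames)) {
        $service = Get-Service -Name $name -ErrorAction SilentlyContinue
        # Only touch services that exist on this machine and are running
        if ($service -and $service.Status -eq 'Running') {
            if ($PSCmdlet.ShouldProcess($name, 'Stop service')) {
                Stop-Service -Name $name -Force
            }
        }
    }
}
```

Running Stop-ScomServices -WhatIf on a Management Server only reports what would be stopped. Drop the -WhatIf once the database backup is done.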

WSUS: Guided Walkthrough Troubleshooting WSUS Agents Not Reporting To WSUS Server

Some weeks ago Microsoft released KB2993943, all about solving issues with WSUS Agents not reporting to their WSUS server.

For anyone running WSUS NOT being managed by SCCM 2012x, this KB is a great starting point for resolving WSUS Agent issues.

SCCM: Guided Walkthrough Troubleshooting Software Update Synchronization Issues

About a week ago Microsoft released KB2995743, a guided walkthrough about troubleshooting software update synchronization issues in SCCM 2012x.

Even though managing WSUS with SCCM 2012x has many advantages, it can be a challenge to get it right or to solve issues when they arise. This KB article helps you to get things running again.

Wednesday, September 24, 2014

SCCM 2012 R2: CU#3 Is Out!

Yesterday Microsoft released Cumulative Update #3 for SCCM 2012 R2 with build number 5.00.7958.1401. KB2994331 describes this CU in more detail.

Issues fixed in CU#3 involve these SCCM 2012 R2 components:

  1. Administrator Console
  2. Client
  3. Company portal
  4. Migration
  5. Mobile devices
  6. Operating system deployment
  7. Site servers and site systems
  8. Software distribution and application management
  9. Wake on LAN
  10. Windows PowerShell

A newly added feature is ‘Management Point Affinity’. This enables you to define at client level which MP a client connects to. Even though it sounds great, there are some caveats to reckon with. Justin Chalfant, a Microsoft PFE for SCCM, has written an excellent posting about it, to be found here.

I strongly advise you to read it thoroughly before you start implementing MP Affinity.

Monday, September 22, 2014

Lync Server 2010 MP Bug Alert: ‘Consolidator Module Failed Initialization’

Issue
A customer of mine runs a SCOM 2012 R2 environment with many servers and workloads in place, all monitored by SCOM. So far so good. A SINGLE Lync 2010 Server is in place as well and is monitored by SCOM.

But soon the Alert came in: ‘Consolidator Module Failed Initialization’. Even though I had seen this Alert in the past on a Lync 2010 server, the fix I found back then didn’t work. The fix for a WMI memory issue simply didn’t apply here, since it had already been superseded by newer file versions. So something else was at play here.

Cause
Some time ago I was on an assignment for a customer running many Lync Server environments, among them many Lync 2010 servers. And NOWHERE did this issue pop up. So what was different here?

Soon I found it. As it turned out, Lync isn’t really made to be installed on a single server. For a POC perhaps – even then I would not recommend it since you can’t test all the functionality of Lync – or a lab. But not for full production.

So in environments with just ONE Lync server this issue pops up! And the Alert is caused by just ONE single Monitor which is configured in the wrong way. But more about that one later on.

This is what the OpsMgr event log on the single Lync 2010 server told me:

Log Name:      Operations Manager
Source:        Health Service Modules
Event ID:      11112
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      XYZ.domain.com

Description:
The Microsoft Operations Manager Consolidator Module failed to initialize because the specified compare count is less than the minimum allowed.

Compare count value: 1
Minimum value allowed: 2

One or more workflows were affected by this. 

Workflow name: Microsoft.LS.2010.Monitoring.UnitMonitor.TimerResetEvent.PDP_PDP_IPADDRESS_NOT_CONFIGURED_IN_NCS
Instance name: LS Policy Decision Point Component [XYZ.domain.com]
Instance ID: {D97677B9-BA09-426C-2FFF-EE8CB1F6A774}
Management group: ABC

In this case the compare count value and the minimum value allowed triggered an Alert for me. When comparing items, you require at least TWO of them. You can’t compare just ONE item with itself!

So apparently this specific Unit Monitor was trying to compare at least TWO items but only found ONE (understandable since it’s only ONE Lync 2010 Server) and failed because of it!
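To spot this pattern faster next time, the relevant fields can be pulled out of the event description with a few regular expressions. This is a minimal sketch. The function name ConvertFrom-ConsolidatorEvent is made up for this example, and the sample text is copied from the event shown above.

```powershell
# Sketch: extract compare count, minimum value and workflow name from an
# event 11112 description. The function name is hypothetical.
function ConvertFrom-ConsolidatorEvent {
    param([Parameter(Mandatory = $true)][string]$Message)

    $result = [ordered]@{ CompareCount = $null; MinimumAllowed = $null; WorkflowName = $null }
    if ($Message -match 'Compare count value:\s*(\d+)')   { $result.CompareCount   = [int]$Matches[1] }
    if ($Message -match 'Minimum value allowed:\s*(\d+)') { $result.MinimumAllowed = [int]$Matches[1] }
    if ($Message -match 'Workflow name:\s*(\S+)')         { $result.WorkflowName   = $Matches[1] }
    return [pscustomobject]$result
}

# Sample text taken from the event above
$sample = @'
Compare count value: 1
Minimum value allowed: 2
Workflow name: Microsoft.LS.2010.Monitoring.UnitMonitor.TimerResetEvent.PDP_PDP_IPADDRESS_NOT_CONFIGURED_IN_NCS
'@

$parsed = ConvertFrom-ConsolidatorEvent -Message $sample
$parsed
```

On a monitored server the descriptions could be fed in from the event log itself, for example with Get-WinEvent -FilterHashtable @{ LogName = 'Operations Manager'; Id = 11112 } and then passing each event's Message property to the function.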

Now my search on the internet became much simpler because I knew now what to look for. And soon on the TechNet Forums I found this topic where I found the cause of it all, as described by ZYEngineer: ‘…On simple deployments of Lync, such as mine, there is no explicitly defined Region, Site, Subnet, Region Link or Region Route.  There's no need since there's only one front-end server.  This monitor is looking for the PDP IP Address in Network Configuration Settings (PDP_PDP_IPADDRESS_NOT_CONFIGURED_IN_NCS). …’

Tada!

Solution
So the only solution here is to disable the Unit Monitor. After some searching in the XML code of this MP I located it: the Unit Monitor IP addresses missed from Network Configuration Settings, targeted against LS Policy Decision Point Component.

As it turned out, this Unit Monitor has a serious issue (allowing a compare count of 1, DUH!) which crashes the SCOM Console when you try to look at its properties. The details of the Console crash show its cause.

So the SCOM Console detects the defect of this Unit Monitor without any issue. Why did it slip through QC before this MP was published?

Luckily, creating an Override wasn’t an issue at all. So I disabled this Unit Monitor, and some minutes later the error in the event log on the Lync 2010 Server and the related SCOM Alert were gone.

Tool: SMART Documentation & Conversion 2.0 Helper For Orchestrator Runbooks

Update, 09-22-2014
Based on this posting I got feedback from Bruno Saille, the Program Manager who’s responsible for this tooling (among other things). I’ve decided to incorporate his feedback in blue. Thanks Bruno for your feedback, awesome!

Some months ago Microsoft released an updated version of their tool for documenting your Orchestrator Runbooks, Orchestrator Visio and Word Generator.

However, as it turns out, this is more than just an update of that tool, since it incorporates an update of the tool SMART Runbook Conversion Helper as well. The result is the tool SMART Documentation and Conversion Helper 2.0.

The tool itself – with a long description about how to use it with some good examples – can be found here.

So far, so good. But since I’ve used this tool I want to share some of my personal experiences:

  1. Icons please?
    Bumped into it just once. Visio and Word output ended up without icons, even though the export PS script for Runbook icons (this is a 32-bit PS script!) had run successfully and all icon files (jpeg) were present in the same folder the script was run from. The only fix for me was to remove all the icon files and rerun the export PS script for Runbook icons. Afterwards Visio and Word output had icons again. Don’t know whether this was a one-time glitch.

  2. Output to Word is dead slow
    On a well dimensioned server it took a long time to convert some Runbooks to Word format, especially compared to the conversion to Visio. Something which I couldn’t solve and just had to ride it out.

    Feedback from Bruno Saille, Microsoft PM responsible for this software: ‘…Yes, Word output is unfortunately slow, and this is due to how PowerShell works with COM automation. The same code used to be much faster in .NET, and the Visio automation code is not slower with PowerShell. Good thing is that WMF 5.0 is supposed to have a few enhancements in COM automation speed, so this may help (I have not had a chance to give it a good try)…’

  3. Running multiple PS instances on the same computer doesn’t work for conversion to Word
    Tried to run two PS instances in order to make the conversion go faster. For Visio and PS this works great, but when converting Runbooks to Word documents, it doesn’t work. Somehow the document gets corrupted and only the last line is kept. The rest is overwritten by newer output of the conversion. No title, no table. Just a single line of text.

  4. Configure Visio before you start the conversion
    Visio needs some additional configuration before you start your first conversion to it. In my case Visio was installed on the D:\ drive, so I had to modify the path referring to a specific Visio startup file.

  5. Be patient
    The tool works but has some quirks, one of them being a lack of speed. Also, when you click something there will sometimes be a lag. Just be patient and wait. So far the tool hasn’t crashed on me, which would be far more unwelcome.

  6. Placement of the tool window
    When running the tool and starting a conversion (especially to Word), don’t forget to move the window of the tool to a place where it doesn’t ‘eat away’ most of your screen, because once a conversion is started this window can’t be dragged to another position. Just something to be aware of.

    Feedback from Bruno Saille, Microsoft PM responsible for this software: ‘…That is one of the limitation of doing this via a PowerShell XAML/WPF "GUI" : The tool will not give focus back until the processing has been finished. This is why we tried to add more extensive feedback in the console window at the same time, so you can confirm something is happening. You can move the console window around even when the GUI is "waiting". Adding multi-threading, etc. might be possible, but then it may bring more complexity than anything, vs a more "simple" PS script that you can modify to your needs…’

Besides this I am happy with the tool since it allows me to gain a good insight in an Orchestrator environment and more important, the Runbooks present. But don’t forget that this tool is the START of your journey in how the Runbooks are made, and not the end of it.

Having a cup of coffee with the people who built the Runbooks provides tons of information as well, which can’t be captured by any tooling. But then of course, those people must still be around AND available…

Friday, September 19, 2014

Core OS MP Version 6.0.7230.0: Undocumented Change

Some people already pointed this out to me, so the credit for this posting goes to them. Until now I hadn’t found the time to double check it, but it’s true: the latest version of the Core OS MP (Server OS MP), version 6.0.7230.0, has some changes which AREN’T documented but can affect your monitoring of available free disk space.

What’s happening?
A few versions of the Core OS MP ago, there were those really good free disk space Monitors which, when they fired an Alert, told you directly what was going on: the Alert told you exactly how much disk space was still left (in MBs and %).

However, those Monitors were replaced by new ones (for some good reasons). BUT from then on the Alerts no longer told you how much disk space was left. Kevin Holman made a special MP which fixed this annoying issue.

But now with the latest version of this Core OS MP (version 6.0.7230.0) these Monitors are back and turned on by default! The previous ones are disabled now by default.

Another thing to reckon with is the interval at which they run. The reinstated Monitors run once per hour (3600 seconds) by default. Of course you can modify this with an Override as required.

More details
In order to make it more clear what I am talking about, I give you more details.

  1. In total there are now 4 logical disk free space monitors per OS version. In my environment there are 12 of them (WS 2003x, 2008x and 2012x). Only ONE per OS version is enabled by default.

  2. The PREVIOUS ONE, with the Alert lacking good information, is now turned off by default: the Logical Disk Free Space Monitor. This one ran once per 15 minutes, but the Alert showed nothing about how much disk space was left.

  3. The ONLY Monitor which is enabled by default is the Windows 20xy Logical Disk Free Space Monitor. It runs once per hour (3600 seconds) AND it generates an Alert with GOOD information about how much disk space (MBs and %) is left.

So check your environment and overrides when this MP is in place. Hopefully the MP guide for this MP will be properly updated.

SCCM 2012x: The package data in WMI is not consistent to PkgLib

Issue
I’ve bumped into this issue quite a few times now. The DPs seem to be fine BUT in the monitoring pane of the SCCM 2012x Console the DPs have a warning icon. When looking in the relevant log file (smsdpmon.log) on the DPs involved this entry points to the cause of it: The package data in WMI is not consistent to PkgLib.

Cause
As it turns out, it happens when some packages are removed but their entry still ‘lives’ in WMI of the DPs involved. Already the SCCM Team posted an article how to solve it, to be found here. Even though it works, it’s time consuming. So I searched for another solution and found it on the TechNet Forums.

Solution
In this thread member JT_DPS posted some powerful PS scripts, helping to solve this issue really fast. His PS scripts come in three parts.

Part I: This PS script shows you what packages are in WMI and not in the Content Library AND vice versa.

# Package IDs that the DP has registered in WMI
$WMIPkgList = Get-WmiObject -Namespace Root\SCCMDP -Class SMS_PackagesInContLib | Select-Object -ExpandProperty PackageID | Sort-Object
# Location of the Content Library, read from the DP settings in the registry
$ContentLib = Get-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\SMS\DP -Name ContentLibraryPath
$PkgLibPath = $ContentLib.ContentLibraryPath + "\PkgLib"
# Package IDs in PkgLib: the INI file names without their extension
$PkgLibList = Get-ChildItem $PkgLibPath | Select-Object -ExpandProperty Name | Sort-Object
$PkgLibList = $PkgLibList | ForEach-Object { $_ -replace '\.INI$', '' }
# "<=" marks items found only in WMI, "=>" marks items found only in PkgLib
$PksinWMIButNotContentLib = Compare-Object -ReferenceObject $WMIPkgList -DifferenceObject $PkgLibList -PassThru | Where-Object { $_.SideIndicator -eq "<=" }
$PksinContentLibButNotWMI = Compare-Object -ReferenceObject $WMIPkgList -DifferenceObject $PkgLibList -PassThru | Where-Object { $_.SideIndicator -eq "=>" }
Write-Host "Items in WMI but not the Content Library"
Write-Host "========================================"
$PksinWMIButNotContentLib
Write-Host "Items in Content Library but not WMI"
Write-Host "===================================="
$PksinContentLibButNotWMI
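
For anyone not familiar with Compare-Object, here is a small self-contained sketch of what Part I does with the two lists. The package IDs are made up purely for illustration, so it can be run anywhere, not only on a DP.

```powershell
# Made-up package IDs, for illustration only
$wmiPackages    = @('ABC00001', 'ABC00002', 'ABC00003')
$pkgLibPackages = @('ABC00002', 'ABC00003', 'ABC00004')

$diff = Compare-Object -ReferenceObject $wmiPackages -DifferenceObject $pkgLibPackages

# '<=' : only in the reference list (WMI)
# '=>' : only in the difference list (PkgLib)
$onlyInWmi    = @($diff | Where-Object { $_.SideIndicator -eq '<=' } | ForEach-Object { $_.InputObject })
$onlyInPkgLib = @($diff | Where-Object { $_.SideIndicator -eq '=>' } | ForEach-Object { $_.InputObject })

"Only in WMI:    $($onlyInWmi -join ', ')"
"Only in PkgLib: $($onlyInPkgLib -join ', ')"
```

With these sample lists ABC00001 ends up in the WMI-only list and ABC00004 in the PkgLib-only list. Those are the two lists that Part II and Part III below clean up.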

Part II: This PS script removes the package from WMI (using the list from Part I):

Foreach ($Pkg in $PksinWMIButNotContentLib){ Get-WmiObject -Namespace Root\SCCMDP -Class SMS_PackagesInContLib -Filter "PackageID = '$Pkg'" | Remove-WmiObject -Confirm }

Part III: This PS script removes the INI file (using the list from Part I):

Foreach ($Pkg in $PksinContentLibButNotWMI){ Remove-Item -Path "$PkgLibPath\$Pkg.INI" -Confirm }

When you revalidate the content on your DPs they should turn to green icons again.

Credits
All credits for the PS scripts in this posting go to JT_DPS.

Wednesday, September 17, 2014

Updated MP: Service Manager 2012x, version 7.5.3079.183

Yesterday Microsoft released an updated version of the Service Manager 2012x MP, version 7.5.3079.183.

This MP requires SCOM 2012x. MP can be downloaded from here.

Tuesday, September 16, 2014

SCOrch: What SQL Server & Database Is Being Used?

Issue
Bumped into a System Center 2012 – Orchestrator (SCOrch) environment where it was unknown which SQL server and database were being used. This information was nowhere to be found in the registry. So where to look now?

Solution
As it turned out, the solution was a simple one (even though it still eludes me why this information isn’t stored in the registry, but apparently it’s stored in the SQL database as well….).

  1. Start the tool Deployment Manager (System Center 2012 R2 Orchestrator Deployment Manager);
  2. Right click on Orchestrator Management Server > select Properties;
  3. Go to the second tab Orchestrator Management Server. On this tab the Database Server and Data Store Name (SQL database) are shown.

Credits
I found the solution to this relatively simple question in this posting of the blog Kick That Computer.

Friday, September 12, 2014

SCOM 2012x & UX Monitoring. Guided walkthrough for troubleshooting UNIX and Linux agent discovery

KB2993901 is a comprehensive troubleshooting guide for anyone having issues with UNIX/Linux Agent discoveries and installations.

Glad to see Microsoft puts this kind of effort into it, since monitoring UX systems with SCOM 2012 is still a challenge.

SCCM 2012: ‘System Center 2012 R2 Configuration Manager UNLEASHED’ Is Out!!!

YES! The book System Center 2012 R2 Configuration Manager UNLEASHED is out! Awesome!

I am a BIG fan of the UNLEASHED series since they go so deep. And the ConfigMgr 2012 books aren’t any different in that respect. Even better, I learned SCCM 2012 from it. So I am very happy this book is out. Bought it right away for my Kindle.

You can buy it anywhere of course, and you can find it here on Amazon.

For anyone working with SCCM 2012 R2 this book is a MUST HAVE!!!

Wednesday, September 10, 2014

SQL MP Challenge: Run As Accounts & - Profiles.

Issue
As stated in this blog posting of Kevin Holman, SQL Server 2012 instances require additional attention. Otherwise SCOM can't monitor it.  Main reason here is that the Local System account (NT AUTHORITY\SYSTEM) is excluded by default from SA (Sys Admin) permissions in SQL Server 2012.

When you follow the earlier mentioned blog posting you'll be just fine and SCOM will monitor the SQL Server 2012 instances as well. But how about a situation where you have about 50 SQL Server instances based on SQL Server 2008x and only a few which are running SQL Server 2012? In this case you have to be more specific, so that only the SQL Server 2012 instances use the special (AD) account and all other instances are excluded.

In this scenario it turns out to be a lot harder to get things right. When they aren't you end up with many Alerts all about the credentials.

Example
Suppose you create a dedicated account in AD for monitoring the SQL Server 2012 instances. This account is used for a Run As Account (e.g. SQL Account) for discovering and monitoring SQL Server 2012 instances. In this example the account is only distributed to the health service running on the server hosting that SQL Server 2012 instance.

This Run As Account is used by two of the three SQL Run As Profiles (SQL Server Discovery Account & SQL Server Monitoring Account). For both Run As Profiles the Run As Account is used to manage All targeted objects.

However, SCOM doesn't like this kind of configuration and soon the dreaded Alerts start flowing in:

System Center Management Health Service Credentials Not Found Alert Message:

An account specified in the Run As profile "Microsoft.SQLServer.SQLDiscoveryAccount" cannot be resolved.

This condition may have occurred because the account is not configured to be distributed to this computer. To resolve this problem, you need to open the Run As profile specified below, locate the account entry as specified by its SSID, and either choose to distribute the account to this computer if appropriate, or change the setting in the profile so that the target object does not use the specified account.

Note: you may use the command shell to get the Run As account display name by its SSID.


Cause
Even though all these Alerts aren't a good thing, SCOM is doing exactly as configured:

  1. You created a Run As Account for monitoring the SQL Server 2012 instance;
  2. This account is ONLY distributed to one or more health services, running on servers hosting SQL Server 2012 instances;
  3. All other servers running other versions of SQL Server won't get the credentials of this Run As Account;
  4. In both SQL Run As Profiles HOWEVER this account is used for All targeted objects;
  5. So now ALL workflows related to the SQL MP running on ALL SQL Servers require the credentials as specified in the Run As Account;
  6. But that Run As Account ISN'T distributed to all those SQL servers;
  7. That's why all the Alerts start coming in...

Close but no cigar
Even though one might think to solve this issue by distributing the SQL Run As Account to ALL systems (Less secure - I want the credentials to be distributed automatically to all managed computers) it won't fly either.

Why you ask? Just keep on reading :).

Even though the Run As Account is AD based, it has only permissions on those specific SQL Server 2012 instances. Not on the other SQL Server 2008x instances. So now discovery and monitoring of the SQL Server 2012 instance will work as intended but the other SQL Server 2008x instances don't know this account at SQL instance level so it will be denied access.

Soon SCOM will raise new Alerts related to this issue: Run As Account does not exist on the target system or does not have enough permissions

Management Group: XYZ. Script: GetSQL2008DBFilesFreeSpace.vbs : Cannot login to database [XYZ][MSSQLSERVER:XYZ]

Another Alert that might pop up is: Run As Account Cannot Log On Locally. This Alert makes sense as well since the dedicated SQL account doesn't have Log on Locally permissions on all monitored servers, except for the SQL Server 2012 computers (when they're configured properly that is).

The Run As account needs to have the "Log On Locally" right.

Solution
This situation can be solved by following the steps outlined below. I start at the very beginning, creating the account in AD, so that nothing is left out and you get the COMPLETE solution to this challenge.

  1. Create an AD account, used for discovering and monitoring the SQL Server 2012 instances, e.g. _SVC_SCOM_SQL2012_Monitor;
  2. Make this account a member of Domain Users only. Give it a very strong password, and configure the account so that the password doesn't have to be changed at first logon and never expires;
  3. On the servers hosting the SQL Server 2012 instances to be monitored:
    1. Make the AD account member of the Local Admin group
    2. In SQL, add the AD account and give it Sys Admin permissions.
  4. SCOM - Run As Accounts
    1. Create a Run As Account (Windows) and use the AD account for it;
    2. Set Distribution to More secure - I want to manually select the computers to which the credentials will be distributed;
    3. Select the Health Services related to the servers hosting the SQL Server 2012 instances.
  5. SCOM - Run As Profiles
    1. Modify the Run As Profile SQL Server Discovery Account
      1. Select the earlier made Run As Account. Under the header This Run As Account will be used to manage the following objects, select the option A selected class, group or object > Select > Object. In the Object Search screen, set the Look for box to Windows Computer. Now search for the Windows Computer(s) hosting the SQL Server 2012 instances and add them one by one (Add button). This takes you back to the main screen of the Run As Profile.
      2. Save the configuration.
  6. Modify the Run As Profile SQL Server Monitoring Account
    1. Follow the same steps as described in Step 5.
  7. Now everything will work as intended:
    1. The SQL Server 2008x instances will be monitored by the Local System account and not raise Alerts about the Run As Account for discovering and monitoring SQL Server 2012 instances;
    2. The SQL Server 2012 instances will be managed by the dedicated Run As Account.
    3. When Alerts do come in, recycle the Health Service cache on those systems. Now those Alerts should be gone. When they come back, go through Steps 1 to 5 again because chances are you missed something :).

Tuesday, September 9, 2014

SCOM 2012 R2 UR#3 Tip: Application Log & Event ID 1022

Many times I am asked the question WHY the latest URs for SCOM 2012 R2 are so poor in reporting their installation status, like success or failed. To be frank, this eludes me as well. So one is obliged to run checks on file versions and so on.

And yes, the blog postings written by Kevin Holman all about applying a certain UR come in handy here. But still, IMHO, the updates themselves should be far more friendly by showing the installation status/results.

Having said that, there is a quick way to see whether the related SCOM 2012 R2 components were updated successfully. When a SCOM 2012 R2 component is updated by a UR, the application log on that same computer will contain multiple events with Event ID 1022. One of those is the most useful, since it states that the UR was applied successfully to that particular SCOM 2012 R2 component.

Some examples:

  1. SCOM 2012 R2 Management Server:

    Log Name:      Application
    Source:        MsiInstaller
    Date:          M/DD/YYYY
    Event ID:      1022
    Task Category: None
    Level:         Information
    Keywords:      Classic
    User:          <DOMAIN>\<USER>
    Computer:      <SCOM 2012 MS SERVER>.<DOMAIN>.<DC>
    Description:
    Product: System Center Operations Manager 2012 Server - Update 'System Center 2012 R2 Operations Manager UR3 Update Patch' installed successfully.

  2. SCOM 2012 R2 Web Console:

    Log Name:      Application
    Source:        MsiInstaller
    Date:          M/DD/YYYY
    Event ID:      1022
    Task Category: None
    Level:         Information
    Keywords:      Classic
    User:          <DOMAIN>\<USER>
    Computer:      <SCOM 2012 MS SERVER>.<DOMAIN>.<DC>
    Description:
    Product: System Center Operations Manager 2012 Web Console - Update 'System Center 2012 R2 Operations Manager UR3 Update Patch' installed successfully.

  3. SCOM 2012 R2 Console:

    Log Name:      Application
    Source:        MsiInstaller
    Date:          M/DD/YYYY
    Event ID:      1022
    Task Category: None
    Level:         Information
    Keywords:      Classic
    User:          <DOMAIN>\<USER>
    Computer:      <SCOM 2012 MS SERVER>.<DOMAIN>.<DC>
    Description:
    Product: System Center Operations Manager 2012 Console - Update 'System Center 2012 R2 Operations Manager UR3 Update Patch' installed successfully.

I agree, it’s a workaround, but at least it helps you to see whether everything went fine. And YES, please do check the file versions as well.
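
Instead of scrolling through the Application log by hand, the 'installed successfully' entries can be filtered with a bit of PowerShell. This is a minimal sketch. The function name Select-SuccessfulUpdate is made up for this example, and the sample descriptions are copied from the events above.

```powershell
# Sketch: keep only the MsiInstaller 1022 descriptions that report a
# successfully installed update. The function name is hypothetical.
function Select-SuccessfulUpdate {
    param([string[]]$Message)
    foreach ($text in $Message) {
        if ($text -match "^Product: (?<Product>.+?) - Update '(?<Update>.+?)' installed successfully\.") {
            [pscustomobject]@{ Product = $Matches['Product']; Update = $Matches['Update'] }
        }
    }
}

# Sample descriptions taken from the events above
$sample = @(
    "Product: System Center Operations Manager 2012 Server - Update 'System Center 2012 R2 Operations Manager UR3 Update Patch' installed successfully.",
    "Product: System Center Operations Manager 2012 Console - Update 'System Center 2012 R2 Operations Manager UR3 Update Patch' installed successfully."
)

$updates = @(Select-SuccessfulUpdate -Message $sample)
$updates | Format-Table -AutoSize
```

On a real server the descriptions could come from Get-WinEvent -FilterHashtable @{ LogName = 'Application'; ProviderName = 'MsiInstaller'; Id = 1022 }, passing the Message property of the returned events to the function.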

Monday, September 8, 2014

SCCM 2012 OSD & Regional Settings: How To Remove Keyboard Layouts?!

Situation
Suppose you’ve got to roll out Windows 7 SP1 to a bunch of systems. And this version of Windows 7 SP1 is localized, e.g. in the Dutch language. So you want these systems to run with the proper localization settings. In itself this isn’t difficult at all, and many well known SCCM pros have already blogged about it, like this posting from Kenny Buntinx.

I especially prefer his posting because he not only tells you how to go about it, but also how to use variables, so that per Collection you can decide which localization settings are to be applied.

So this one was easy to solve. Even when you have to cover the two well known architectures, 32 and 64 bit. Per architecture the correct Unattended.xml is built and used in the related Task Sequences. Nothing special and almost a Next > Next > Finish experience.

HOWEVER… removing the unwanted keyboard layout and keeping the correct one turned out to be a lot harder to solve. This is what I am talking about: removing the keyboard layout Dutch (Netherlands) – Dutch, so that afterwards only the correct layout is left.

And NO! HACKING THE REGISTRY WON’T WORK AND WILL EVEN WREAK HAVOC ON MANDATORY PROFILES!!! SO DON’T DO THAT!!!

Finally, after testing and searching on the internet, I found this MSDN article, all about Windows Vista Command Line Configuration of International Settings. And yes, even though this article is all about the long forgotten Windows Vista (at least Microsoft hopes so), it still works on Windows 7 (I wasn’t too surprised I might add).

This MSDN article tells you how to pass an XML file to INTL.cpl, which controls the Region and Language settings. This XML file enables you to configure them down to the smallest bit. Awesome!

So finally the means to an end were found. The next step was how to use it, or better, WHEN. Because these settings are on a per user basis, rolling this out while OSD is taking place won’t do the job.

Finally I ended up creating an Application for it (the XML file and a xyz.cmd file containing the syntax) which I deployed against a dedicated Device Collection.
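
To give an idea of what such an XML file looks like, here is a minimal sketch that generates one. The function name New-InputLanguageXml is made up for this example. The two layout IDs are my assumptions (0413:00000413 for Dutch (Netherlands) – Dutch, 0413:00020409 for the layout to keep), so look up the IDs that match your own situation in the MSDN article.

```powershell
# Sketch: build an XML answer file for intl.cpl that adds and removes
# keyboard layouts for the current user. The function name is hypothetical.
function New-InputLanguageXml {
    param(
        [string[]]$Add    = @(),
        [string[]]$Remove = @()
    )
    $lines = @(
        '<gs:GlobalizationServices xmlns:gs="urn:longhornGlobalizationUnattend">',
        '  <gs:UserList>',
        '    <gs:User UserID="Current"/>',
        '  </gs:UserList>',
        '  <gs:InputPreferences>'
    )
    foreach ($id in $Add)    { $lines += ('    <gs:InputLanguageID Action="add" ID="{0}"/>' -f $id) }
    foreach ($id in $Remove) { $lines += ('    <gs:InputLanguageID Action="remove" ID="{0}"/>' -f $id) }
    $lines += '  </gs:InputPreferences>'
    $lines += '</gs:GlobalizationServices>'
    return ($lines -join [Environment]::NewLine)
}

# Assumed layout IDs: keep 0413:00020409, remove Dutch (Netherlands) - Dutch
$xml = New-InputLanguageXml -Add '0413:00020409' -Remove '0413:00000413'
$xml
```

The .cmd file then only has to hand the saved XML file to the control panel applet, along the lines of control.exe intl.cpl,,/f:"C:\Temp\KeyboardLayout.xml" (the file path is just an example).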

It was a challenge to get it all working, but I must say I’ve learned tons of new things.

SCCM 2012 R2 & SCOM 2012 R2: Alert ‘ConfigMgr Site Role Issue’

Issue
Bumped into an Alert in SCOM 2012 R2, which monitors (among many other things) an SCCM 2012 R2 environment. The Alert ConfigMgr Site Role Issue kept popping up, while the Alert itself didn’t give much information.

Cause
Since the Alert was triggered by a Monitor the Health Explorer gave much more information about the real issue at hand here. As it turned out the ConfigMgr Site Role Issue Monitor is nothing but a rollup Monitor: the real work is done by Unit Monitors and their state changes rollup to the ConfigMgr Site Role Issue Monitor.

The Unit Monitor Notification Server Windows Firewall Block Monitor had a Warning status and the description told me the cause of it: ‘…This monitor indicates whether Windows Firewall on Management Point Name does not allow clients to connect to the Notification Server…’

However, there weren’t any issues at all. The Notification Server could be contacted without any issue by all the clients. So now what?

The FIX
As it turns out from this TechNet Forum question, there is an issue with the way SCCM itself installs the Notification Server and puts that information into the registry.

The registry entry HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_NOTIFICATION_SERVER\D07ACE61-FB84-4461-9F52-ABBA07C2EE3A\Severity should be set – after successful installation – to 1.

But somehow this doesn’t happen, instead the registry entry is left at 2, resulting in the earlier mentioned Unit Monitor to change state (Warning), rolling up to the rollup Monitor which fires the general Alert without any real information…

As stated in the same question, the ‘fix’ is to manually set the entry to 1. I only did this once I knew for certain that the Notification Server was running well and working as expected. Thankfully, every single bit processed by SCCM is logged (and not just once and in one place…), so by checking all the logs involved it was quite easy to verify that all was okay.
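
Checking and resetting the entry could be scripted roughly as follows. This is a sketch only: I am assuming here that Severity is a value under the GUID key mentioned above, and both function names are made up for this example. Run it on the site system involved and only after you have verified in the logs that the Notification Server works.

```powershell
# Assumption: 'Severity' is a value under this key (path taken from the text above)
$componentKey = 'HKLM:\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_NOTIFICATION_SERVER\D07ACE61-FB84-4461-9F52-ABBA07C2EE3A'

# Hypothetical helper: returns the current Severity, or nothing when the key is absent
function Get-NotificationServerSeverity {
    param([string]$Path = $componentKey)
    if (-not (Test-Path -Path $Path -ErrorAction SilentlyContinue)) { return $null }
    return (Get-ItemProperty -Path $Path -Name 'Severity' -ErrorAction SilentlyContinue).Severity
}

# Hypothetical helper: sets Severity back to 1, supports -WhatIf
function Reset-NotificationServerSeverity {
    [CmdletBinding(SupportsShouldProcess = $true)]
    param([string]$Path = $componentKey)
    if ($PSCmdlet.ShouldProcess($Path, 'Set Severity to 1')) {
        Set-ItemProperty -Path $Path -Name 'Severity' -Value 1
    }
}
```

Get-NotificationServerSeverity shows the current value. Reset-NotificationServerSeverity -WhatIf shows what would be changed without touching the registry.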

Tuesday, September 2, 2014

SCOM 2012x & UX Monitoring. What UX Agents Are Present? And What Version, Build & Architecture?

When using SCOM 2012x to monitor UX based workloads it’s a hard requirement to have the correct UX MPs imported. Simply because these very same UX MPs contain the UX Agents as well. And when you don’t have the correct UX MPs imported, monitoring won’t happen at all because the UX Agents are missing.

So how about a quick check in order to see what UX Agents are available in your SCOM 2012x environment and of what build, version and architecture they are?

Procedure 01: UX Agent check in SCOM 2012x Console

  1. Open the SCOM 2012x Console with SCOM Admin permissions > Monitoring > Monitoring > Discovered Inventory;
  2. Right click in the middle pane > Change Target Type > select the radio button View all targets > type in the Look for box: unix/linux supported agents. Only one target remains now;
  3. Select UNIX/Linux Supported Agents > OK.
  4. Now you’ll get an overview of all UX Agents present in your current SCOM 2012x environment:
    image

As you can see, my test lab is running behind in the regular update cycle.

This way you have a quick overview of the UX Agents present in your environment.
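The same overview can probably be pulled from the Operations Manager Shell. This is only a sketch: it assumes the class can be found by the display name shown in the Console (UNIX/Linux Supported Agents) and that the shell is already connected to the Management Group.

```powershell
Import-Module OperationsManager

# Assumption: the class display name matches the Console text 'UNIX/Linux Supported Agents'.
$agentClass = Get-SCOMClass -DisplayName 'UNIX/Linux Supported Agents'

# List every instance of that class, sorted by name.
Get-SCOMClassInstance -Class $agentClass |
    Sort-Object DisplayName |
    Select-Object DisplayName, FullName
```

When nothing is returned, check the display name in the Console first; in a localized Console it may differ.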

But this is only one step of a TWO step procedure. Because those very same packages must be present on your Management Servers and/or Gateway Servers as well.

Procedure 02: UX Agent Package check on SCOM 2012x MS/GWs

  1. On a SCOM 2012 R2 MS server: Go to ~:\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\AgentManagement\UnixAgents\DownloadedKits and check whether the packages are available;
  2. On a SCOM 2012 R2 GW server: Go to ~:\Program Files\System Center Operations Manager\Gateway\AgentManagement\UnixAgents\DownloadedKits and check whether the packages are available.
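A quick way to do this check without browsing the folder is a small PowerShell snippet. It is a sketch that assumes SCOM is installed on the C: drive; on a GW server, swap in the Gateway path from step 2.

```powershell
# Assumption: SCOM is installed on C:. On a GW server use the Gateway path from step 2 instead.
$kitPath = 'C:\Program Files\Microsoft System Center 2012 R2\Operations Manager\Server\AgentManagement\UnixAgents\DownloadedKits'

if (Test-Path -Path $kitPath) {
    # Show the UX Agent packages with their size and last modified date.
    Get-ChildItem -Path $kitPath |
        Sort-Object Name |
        Select-Object Name, Length, LastWriteTime
}
else {
    Write-Warning "Folder not found: $kitPath"
}
```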

Recap
This is a quick method to find out what shape SCOM 2012x is in before you start enabling the monitoring of UX based workloads. This way you know where to start (most of the time it starts with updating the UX MPs which are present by default and adding the required ones, suiting your UX monitoring requirements).

SCOM 2012x & UX Monitoring. Greyed Out UX Agents

As it turns out, in chained Gateway Server scenarios it’s a bit of a challenge to get things on the road when you want to monitor UX based workloads.

But finally when the UX Agents are in place AND the UX systems are in SCOM, they can stay in a greyed out status way too long. Think about days here…

What finally fixed it for me was to recycle the Health Service State folder on all the Gateway Servers involved, starting at the ones at the customer locations and ending at the Gateway Servers residing in the DMZ of the NOC.

And if that’s not sufficient, perform the same trick on the Management Servers as well. And of course, do it one by one, only starting on the next one when the previous one is back online and FULLY functional and operational.

How to recycle the HealthService:

  1. Start PS with admin permissions > Stop-Service HealthService;
  2. Empty the OpsMgr eventlog;
  3. Go to ~:\Program Files\System Center Operations Manager\Gateway and remove the folder Health Service State;
  4. In PS: Start-Service HealthService;
  5. In PS: Get-Service HealthService. You should see an output like this:
    image
    The Status should be Running.
  6. Now wait until the server is back to monitoring the assigned workloads before moving on to the next server.
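Steps 1 to 5 can be put together in one PowerShell script. This is a sketch, assuming Windows PowerShell running elevated, the Gateway install path from step 3 on the C: drive, and the event log name 'Operations Manager'. On a Management Server the install path is different, so adjust $installDir there.

```powershell
# Assumption: Gateway Server installed on C:. Adjust the path on a Management Server.
$installDir = 'C:\Program Files\System Center Operations Manager\Gateway'
$stateDir   = Join-Path $installDir 'Health Service State'

# Step 1: stop the HealthService.
Stop-Service -Name HealthService

# Step 2: empty the OpsMgr event log.
Clear-EventLog -LogName 'Operations Manager'

# Step 3: remove the Health Service State folder (it is recreated on start).
if (Test-Path -Path $stateDir) {
    Remove-Item -Path $stateDir -Recurse -Force
}

# Steps 4 and 5: start the service again and show its status.
Start-Service -Name HealthService
Get-Service -Name HealthService
```

Run it on one server at a time and wait, as described in step 6, before moving on to the next one.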

Monday, September 1, 2014

SCOM 2012x & UX Monitoring. How About Account Distribution & Management Servers?

When you want to use SCOM 2012x to monitor UX based systems, not only name resolution has to be 100% okay, but the related UX Run As Accounts must be properly distributed as well.

In the more straightforward scenarios where the UX monitoring is performed by a Resource Pool containing Management Servers, it’s obvious that the UX Run As Accounts involved have to be distributed to those very same Management Servers as well.

HOWEVER! In a NOC scenario where the UX systems reside behind one or more Gateway Servers and those Gateway Servers report to other Gateway Servers, rolling up to the SCOM 2012x Management Servers, those very same UX accounts have to be distributed to ALL Management Servers as well.

Yeah, I hear you. Why should you? Because those very same Management Servers don’t touch the UX systems at all…

When you don’t do this, and want to run the Management Pack Template used for monitoring UX based processes, you’ll get an error like this: Unable to retrieve processes. The task has invalid configuration. The most likely cause is that a $RunAs account was not distributed to the target health service.
image

When I made sure the SCOM 2012x Management Servers also received the credential set for the UX accounts, this error disappeared…

Recap
When monitoring UX systems with SCOM 2012x, distribute the UX Run As Accounts to all SCOM 2012x Management Servers, whether or not they ‘touch’ the UX systems directly. When running tasks targeted at UX systems, the SCOM Management Servers require the credential sets in order to execute the Tasks in a proper manner.
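To see where a UX Run As Account is currently distributed, something like the following can be run in the Operations Manager Shell. It is a read-only sketch: the account name is made up, and the shell is assumed to be connected to the Management Group.

```powershell
Import-Module OperationsManager

# Hypothetical account name; replace it with one of your own UX Run As Accounts.
$accountName = 'UX Monitoring Account'

# Read the distribution policy of the account (nothing is changed here).
$distribution = Get-SCOMRunAsAccount -Name $accountName | Get-SCOMRunAsDistribution

# Shows whether the account uses the more secure or the less secure distribution.
$distribution.Security

# With the more secure option: the health services and pools that received the credentials.
$distribution.SecureDistribution

# For comparison: the Management Servers (Gateway Servers left out) that should be in that list.
Get-SCOMManagementServer |
    Where-Object { -not $_.IsGateway } |
    Select-Object DisplayName
```

Any Management Server in the last list that is missing from the distribution list is a candidate for the error shown above.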

SCOM 2012x & UX Monitoring. Name Resolution Is The Key To Success!!!

When you want to use SCOM 2012x to monitor UX based systems, make sure name resolution is fully functional. And ALL the way. So not only the forward lookups (FQDN to IP address) but also the reverse lookups (IP address to FQDN) have to be just fine.

And no, don’t use HOSTS files please. One day it’s bound to go wrong. Just create a PinPoint DNS Zone and be done with it. And while you’re at it, create the reverse lookup zone as well.

Also something not to forget is that this kind of name resolution is crucial for ALL SCOM 2012x Management Servers involved WHETHER OR NOT THEY CAN REACH THE UX SYSTEM INVOLVED!

So even in a NOC scenario where the UX systems reside behind one or more Gateway Servers and those Gateway Servers report to other Gateway Servers, rolling up to the SCOM 2012x Management Servers, those very same Management Servers must be able to resolve the names of the UX systems involved.

Otherwise it won’t fly, even when your UX systems are being properly monitored. One thing you might bump into is that the UX systems related tasks won’t run and that the related Management Pack Templates won’t run either, to great frustration.
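Forward and reverse lookup can be checked in one go from every Management Server and Gateway Server involved. The sketch below uses Resolve-DnsName, which assumes Windows Server 2012 or later; the FQDN is only a placeholder.

```powershell
# Hypothetical FQDN; replace it with the UX system you want to check.
$fqdn = 'uxserver01.customer.example'

# Forward lookup: FQDN to IPv4 address(es).
$aRecords = Resolve-DnsName -Name $fqdn -Type A -ErrorAction Stop |
    Where-Object { $_.Type -eq 'A' }

foreach ($record in $aRecords) {
    # Reverse lookup: IP address back to a host name (PTR record).
    $ptr = Resolve-DnsName -Name $record.IPAddress -Type PTR -ErrorAction SilentlyContinue |
        Where-Object { $_.Type -eq 'PTR' }

    if ($ptr -and ($ptr.NameHost -contains $fqdn)) {
        Write-Host "$fqdn <-> $($record.IPAddress): forward and reverse lookup match"
    }
    else {
        Write-Warning "$fqdn -> $($record.IPAddress): reverse lookup is missing or returns a different name"
    }
}
```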

Xplat Monitoring With Chained Gateway Servers - The NOC Approach

2014-09-01 Update
As it turned out, even though the MS servers aren’t capable of directly contacting the monitored UX servers as described in the NOC approach with chained Gateway Servers, they still require name resolution, and on top of that, reverse name resolution as well.

When running the MP Templates for UX servers, the MS servers (ALL OF THEM, since they’re part of the All Management Servers Resource Pool) must be able to resolve the names of the UX machines involved. So make sure the MS servers can resolve those names, reverse as well.

Otherwise the Tasks related to the UX server specific MP Templates AND the normal Tasks for the UX servers won’t run at all, or only sometimes, when they happen to be executed by an MS server which is capable of resolving the names.

The challenge
Even though I’ve helped customers before with monitoring their UX systems with SCOM 2012x, I had a new challenge. In this particular scenario the customer has a NOC in place (Network Operations Center) where only the SCOM 2012x Management Group resides.

All the monitoring happens somewhere else, at many customer locations. Per customer location at least one Gateway Server is in place. Behind that Gateway Server the real monitoring workloads reside. And to make it even more challenging, the Gateway Server(s) residing at the customer locations don’t report directly to the Management Servers residing in the NOC.

Chained Gateway Servers
Instead these customer Gateway Servers report to special NOC Gateway Servers residing in a DMZ. And those Gateway Servers report – finally – to the SCOM 2012x Management Servers. This kind of setup (Gateway Servers communicating to other Gateway Servers and not directly with the Management Servers) is also known as Chained Gateway Servers.

So when monitoring UX based workloads, the chain of communication looks like this: UX based workloads > monitored by the customer Gateway Server(s) > NOC DMZ Gateway Servers > NOC Management Servers.

Additional load AND not for the Management Servers…
And as we all know, SCOM Agents on UX systems aren’t like SCOM Agents on Windows Servers. Where the latter manages itself (decides when to run what scripts and so on, manages its own workload and agenda, based on the imported MPs), the SCOM UX Agent is totally managed by the Management- or Gateway Server it reports to.

Also good to know is that not a single UX system would be managed by a Management Server but only by SCOM Gateway Server(s). And when it comes down to load, SCOM Gateway Servers are just like SCOM Agents, nothing like the robust healthservice running on a Management Server.

So this means the customer Gateway Server(s) will take an additional hit on their performance. Something to reckon with.

Not clustered but everywhere
The UX systems to be monitored don’t reside at a single location but at many different customer locations. This impacts the setup of the monitoring of the UX systems as well.

Security!!!
Different customers means different security policies. So even though ONE UX account for monitoring would be nice from an administration point of view, it wouldn’t fit the bill at all. So per customer location at least ONE UX account had to be created and distributed to the correct Gateway Servers.

Resource Pools please!
Yes. Monitoring UX based workloads (or SNMP based for that matter) REQUIRES the usage of Resource Pools. And yes, you can put Gateway Servers in Resource Pools, no problem. But in this case multiple Resource Pools are required since the UX systems reside at different locations behind different Gateway Servers.

Certificates?!
Yes. When SCOM installs a UX Agent, it gets a certificate which is created automatically. But that certificate also needs to be signed. For that the SCOM Management- or Gateway Server automatically creates a root certificate, used for signing the UX Agent client certificates.

And when the Resource Pool – used for UX monitoring – contains more than one Management- or Gateway Server, those root certificates must be exported and imported on the other Management- or Gateway Server(s) residing in the same Resource Pool. When that doesn’t happen, and another server takes over the monitoring of one or more UX systems, monitoring will come to a grinding halt because of the missing root certificate…
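For the export and import, SCOM ships with the scxcertconfig.exe tool. The sketch below shows the idea; the install folder (here the Gateway path on C:) and the certificate file location are assumptions, so look up where scxcertconfig.exe lives on your servers before running it.

```powershell
# Assumptions: C: install drive, the Gateway install folder, and a share both servers can reach.
$installDir = 'C:\Program Files\System Center Operations Manager\Gateway'
$certFile   = '\\fileserver\share\scx-gw01.cer'   # hypothetical file location
$tool       = Join-Path $installDir 'scxcertconfig.exe'

# On the server whose certificate has to be shared: export it to a file.
& $tool -export $certFile

# On every OTHER server in the same Resource Pool: import that file.
& $tool -import $certFile
```

Repeat this for every member of the Resource Pool, so each server trusts the certificates of all the others.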

Network AND DNS!!!
Yes, network connectivity is crucial. Also a fully functional DNS is very important. In the ‘Chained Gateway’ scenario it looks a bit more complicated at first, but it’s not that hard at all actually. It makes sense:

  • Customer Gateway Server managing/monitoring the UX system > UX system:
    • Port 22 (SSH), only during installation and updating the UX Agent;
    • Port 1270 (WSMAN): All the time
    • Must be able to resolve the FQDN of the UX systems, reverse as well.
  • NOC DMZ Gateway Server > Customer Gateway Server:
    • Port 5723
    • Must be able to resolve the FQDN of the customer Gateway Server
  • SCOM Management Server > NOC DMZ Gateway Server:
    • Port 5723
    • Must be able to resolve the FQDN of the NOC Gateway Server
    • Must be able to resolve the FQDN of the UX systems to be monitored, reverse as well.

Failover
Already configured, before the UX system monitoring question came to be, was failover of the customer Gateway Servers to the NOC DMZ Gateway Servers and – in some situations – the failover of the Agents residing behind the Gateway Servers to another Gateway Server. And yes, the NOC DMZ Gateway Servers are also configured to failover to another SCOM Management Server when their primary goes down.

Even though this doesn’t directly influence the monitoring of UX systems, it’s important to know what the Primary and Secondary are for the Gateway Server managing/monitoring the UX systems. More about that later.

Management Packs
MPs in SCOM are crucial. Even more so for monitoring UX systems. Because the UX MPs also contain the UX Agents! So always make sure you’ve got the latest version of the relevant UX MPs imported. In this case the latest version of the UX MPs is based on UR#3 for SCOM 2012 R2. Even when your SCOM 2012 R2 MG isn’t on UR#3 level, but UR#2 for instance, these UX MPs can be imported.

Credits
Before I continue I want to point out that all this information I’ve described so far is based on input from some people working at Microsoft USA. Thanks to their effort (one person in particular, thanks Steve!) I got it all up and running, along with some deep troubleshooting where finally, the culprit was something very simple…

Overview of what to do
In this section I describe how I went about it and got it running. In the end some troubleshooting was required, but my guess is that what happened here was just some bad luck.

  1. Updated the UX MPs and imported those missing. Please read the included MP Guides in order to know what MPs you require. Only import the ones required!!!
  2. Tested network connectivity. Especially the last ‘mile’, from the customer Gateway Server managing/monitoring the UX systems. Checked out DNS and ports 22 and 1270.
  3. Built the required Resource Pools and put only the Gateway Servers in their respective Resource Pools which would finally manage/monitor the UX systems. So I left out the NOC DMZ Gateway Servers and NOC Management Servers!
  4. Created the required UX accounts in SCOM and assigned them to the proper Resource Pools;
  5. Added these Run As Accounts to the three UX Run As Profiles in SCOM;
  6. Ended up by having the UX administrator installing the UX Agent and Certificate on the UX systems using the command line;
  7. The UX system was successfully discovered by SCOM and added to the OperationsManager database (Agent installation and certificate signing weren’t needed anymore, see Step 6).
  8. So far so good. But… the UX system never got to a monitored status. Instead it went from unmonitored to greyed out. That is the normal flow, but after some time (minutes) it should get a live status. Not in this case. After hours it was still greyed out. Time for some deep troubleshooting. See next section.

Troubleshooting
Yes. I learned a LOT. Also how to troubleshoot UX systems which don’t want to get a monitored status in SCOM. Nice!

The regular tests

  1. On the customer Gateway Server, managing/monitoring the problematic UX system, I ran telnet and tested whether I could connect to ports 22 (less important, only crucial during installation/updating the UX Agent through the Console) and 1270 (crucial, used by WSMAN). Telnet command: Open <UX system FQDN> <port number>.
  2. The UX admin checked whether a firewall was running on the UX system;
  3. On the customer Gateway Server the Windows firewall was checked for the presence of the correct rules allowing traffic on ports 22 and 1270;
  4. FQDN was checked on the customer Gateway Server, including reverse lookup.

These tests turned out to be okay. So no firewall and DNS issues. Check. On to the next series of checks.
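When telnet isn’t installed on the Gateway Server, the port test can be done from PowerShell as well. This sketch assumes Windows Server 2012 R2 or later (where Test-NetConnection is available); the FQDN is a placeholder.

```powershell
# Hypothetical FQDN; replace it with the problematic UX system.
$uxServer = 'uxserver01.customer.example'

# Port 22 (SSH, install/update only) and port 1270 (WSMAN, all the time).
foreach ($port in 22, 1270) {
    $result = Test-NetConnection -ComputerName $uxServer -Port $port -WarningAction SilentlyContinue

    if ($result.TcpTestSucceeded) {
        Write-Host "Port $port on $uxServer is reachable"
    }
    else {
        Write-Warning "Port $port on $uxServer is NOT reachable"
    }
}
```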

The deeper tests
Now it’s time to take a deeper dive into the functionality of SCOM itself. Starting at the customer Gateway Server and working from there.

  1. With two WINRM commands you can check whether WINRM on the customer Gateway Server can connect to the SCOM UX Agent on the problematic UX system:

    winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_Agent?__cimnamespace=root/scx -username:<USERNAME> -password:<PASSWORD> -r:https://<FQDN UX SERVER>:1270/wsman -auth:basic -skipCACheck -skipCNCheck -skiprevocationcheck -encoding:utf-8 

    You should get a whole list of information returned. This shows WSMAN is working. Now it’s time for the same command without the -skip parts:

    winrm enumerate http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_Agent?__cimnamespace=root/scx -username:<USERNAME> -password:<PASSWORD> -r:https://<FQDN UX SERVER>:1270/wsman -auth:basic -encoding:utf-8 

    When you get the same list back, WSMAN is working and the permissions are okay as well. When issues arise (the first command with the -skip parts worked, but the second one doesn’t) there might be a certificate issue.

    Source: http://social.technet.microsoft.com/Forums/en-US/18047ddf-bcef-4021-a6eb-6cf644e060ad/scom-2012-sp1-not-able-to-discover-linux-workgroup-servers?forum=operationsmanagerunixandlinux

  2. Start logging and debugging, as stated here: http://technet.microsoft.com/en-us/library/hh212862. Especially the section Enable EnableOpsmgrModuleLogging (no, this is NOT a typo) was new to me. There is also a section about how to enable logging on the UX Agent installed on the UX system. Very helpful and interesting. Verbose logging for SCOM I already knew.
  3. The whole TechNet article all about troubleshooting UX system monitoring: http://technet.microsoft.com/en-us/library/hh212885. Read the sections Certificate Issues, Management Pack Issues and Operating System Issues.

These tests told me the monitoring itself was properly functioning. WSMAN could connect to the UX Agent on the problematic UX system and get information from it. Also without the -skip parts WINRM worked like a charm, so no certificate issues either.

The logs on the customer Gateway Servers told me nothing special. So the issue was deeper down the line… Time for some more tests but now more on the SCOM side located at the NOC level (NOC DMZ Gateway Servers and Management Servers).

And YES! I had restarted the SCOM Console long before using the /ClearCache switch as well.

The even deeper tests
First I ran a script made by Kris Bash, also working at Microsoft. This PS script tells you exactly what the status of the monitored UX server in SCOM itself is.

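# FQDN of the UX server as it is known in SCOM
$hostname = "FQDN UX SERVER"

$blDiscovered = $false

# Classes for the UX operating system and for the UX computer itself
$class = Get-SCOMClass | Where-Object {$_.Name -eq "Microsoft.Unix.OperatingSystem"}
$ComputerClass = Get-SCOMClass | Where-Object {$_.Name -eq "Microsoft.Unix.Computer"}

# Lists ALL UX operating system instances; handy as a reference, can be removed
$class | Get-SCOMClassInstance

# The operating system instance is matched on its Path (the host name)
$instance = Get-SCOMClassInstance -Class:$class | Where-Object {$_.Path -eq $hostname}
if ( ($instance -ne $null) -and ($instance.IsManaged -eq $true) )
{
    $blDiscovered = $true
}

# Guard against a missing computer object: calling ToString() on $null throws
$Computer = Get-SCOMClassInstance -Class:$ComputerClass | Where-Object {$_.Name -eq $hostname}
if ($Computer -ne $null)
{
    $ComputerHealth = $Computer.HealthState.ToString()
}
else
{
    $ComputerHealth = "Not found"
}

Write-Host "Host: $hostname"
Write-Host "OS is discovered: $blDiscovered"
Write-Host "Computer health state: $ComputerHealth"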

I ran this script first on the Management Server and later on the customer Gateway Server managing/monitoring the problematic UX system. This output told me there was indeed an issue with this particular UX server. Somehow no status was created.

Time to check out the whole chain of communication, starting at the customer Gateway Server managing/monitoring the problematic UX system, down to the SCOM Management Server and the NOC DMZ Gateway Server in between.

For this I needed to find out what the primaries were for the Gateway Servers involved. See this posting by Jimmy Harper, section Commands to verify Gateway Server Failover.
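Something along these lines lists the primary and failover server per Gateway Server. It is a sketch in the spirit of that posting, not a copy of it, and it assumes the Operations Manager Shell is connected to the Management Group.

```powershell
Import-Module OperationsManager

# All Gateway Servers known in the Management Group.
$gateways = Get-SCOMManagementServer | Where-Object { $_.IsGateway }

foreach ($gw in $gateways) {
    Write-Host "Gateway Server: $($gw.DisplayName)"

    # The server this Gateway Server reports to by default.
    $primary = $gw.GetPrimaryManagementServer()
    Write-Host "  Primary : $($primary.DisplayName)"

    # The server(s) it fails over to when the primary goes down.
    foreach ($failover in $gw.GetFailoverManagementServers()) {
        Write-Host "  Failover: $($failover.DisplayName)"
    }
}
```

In a chained setup the primary of a customer Gateway Server is itself a Gateway Server (the NOC DMZ one), so follow the chain one hop at a time.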

When I checked the OpsMgr event log on the NOC DMZ Gateway Server which is the primary for the Gateway Server managing/monitoring the problematic UX system I noticed data for this server was being dropped since the NOC DMZ Gateway Server didn’t think it belonged to the environment.

And yes, a simple PS-cmdlet for restarting the HealthService (Restart-Service HealthService) fixed this issue. Within a few minutes the UX system got a HEALTHY status in SCOM.

Recap
Monitoring UX systems residing on locations behind SCOM Gateway Servers is a valid scenario. Just make sure you have all the requirements in place and go for it. And when something is amiss, use this posting to help you out.