Wednesday, October 27, 2010

Battling Config Churn

Yesterday at a customer's location I bumped into an issue where heavy Config Churn was involved.

So it was time to take some action to bring it down. It was a bit challenging, since much of the Config Churn was caused by a very important MP which could not be tuned by increasing the interval of the Discovery frequency, because those Discoveries are event driven. And disabling those very same Discoveries would render the MP almost useless. So that is a no-go area.

But let me start at the beginning. When do you know you are experiencing Config Churn? Because you can only start a fight when you know who or what your opponents are. So recognition is very important here.

First of all, complaints started to come in about sluggish Console performance. I first checked how many simultaneous SDK connections from the Consoles were running (Monitoring > Monitoring > Operations Manager > Management Server Performance > Console and SDK Connection Count), but that stayed well below 12. For a super-sized RMS and SQL server this is not an issue at all. These very same servers did not show any performance or connectivity issues, so the cause of the sluggish Consoles was to be found somewhere else.

Time to take a deeper dive. Could it be Config Churn? In the old days one had to run queries against the Data Warehouse. But today, with the SCC Health Check Reports, those same queries are contained within these Reports.

I really love these Reports and implement them in every SCOM R2 environment since they make life a whole lot easier. Another good thing about these Reports is the detail. Every Report is explained and contains good information, like URLs to blog articles about the subject. This way you have everything in a single place.

I ran the Report Config Churn – Discoveries Last 24 Hours (DW). With this report I knew what was possibly causing the complaints about the sluggish consoles. So I had identified something. Time for the next steps: battling Config Churn or better, its causes.

Based on the information found in the blog posting this Report refers to, I started to increase the intervals of the Discovery frequencies of some noisy Discoveries. As stated before, the noisiest ones did not allow this kind of tuning, so I had to work around them.

It took me some time but with the information found in the earlier mentioned blog posting, I found some Discoveries which allowed for tuning and adjusted them accordingly.
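The triage described above (skip the event-driven Discoveries, then work down the noisy interval-based ones) could be sketched roughly like this in Python. This is only an illustration under assumptions: the Discovery names, the change counts, the `interval_based` flag and the `min_changes` threshold are all hypothetical stand-ins for what one might copy out of the Report, not its actual columns or values.

```python
from dataclasses import dataclass


@dataclass
class DiscoveryRow:
    name: str             # hypothetical Discovery name
    changes: int          # hypothetical number of changes in the last 24 hours
    interval_based: bool  # False = event driven, so the interval cannot be raised


def tuning_candidates(rows, min_changes=100):
    """Return interval-based Discoveries at or above the threshold, noisiest first."""
    candidates = [r for r in rows if r.interval_based and r.changes >= min_changes]
    return sorted(candidates, key=lambda r: r.changes, reverse=True)


# Hypothetical sample data, for illustration only.
rows = [
    DiscoveryRow("Example.EventDriven.Discovery", 4000, False),
    DiscoveryRow("Example.Noisy.Interval.Discovery", 900, True),
    DiscoveryRow("Example.Quiet.Interval.Discovery", 20, True),
    DiscoveryRow("Example.Medium.Interval.Discovery", 300, True),
]

candidates = tuning_candidates(rows)
for row in candidates:
    print(f"{row.name}: {row.changes} changes")
```

The event-driven Discovery tops the raw list but is filtered out, which mirrors the situation here: the biggest offender is off limits, so the gain has to come from the next ones in line.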

Today I ran the same Report again and when compared to the first Report, which I ran yesterday, Config Churn has been reduced by 40%! And the SCOM Admins are happy again since the Consoles have become snappier!
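For anyone who wants to make the same before/after comparison, the arithmetic is simple. The Python sketch below uses made-up totals (they are not the numbers from this environment) and assumes you have summed the change counts of both Report runs yourself; the function name is hypothetical as well.

```python
def churn_reduction_percent(before, after):
    """Percentage by which the total number of changes dropped between two runs."""
    if before <= 0:
        raise ValueError("the 'before' total must be positive")
    return 100 * (before - after) / before


# Hypothetical totals from two Report runs, 24 hours apart.
changes_yesterday = 5000
changes_today = 3000

reduction = churn_reduction_percent(changes_yesterday, changes_today)
print(f"Config Churn reduced by {reduction:.0f}%")
```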

Thanks to the Community who provided these great Reports and the blog posting describing how to battle Config Churn! It made my efforts highly efficient.
