Wednesday, October 14, 2009

By popular demand: How to make a report showing %CPU time for a particular process?

A regular reader of this blog asked me a question which I deemed important/good enough to be shared with the community. So this blog posting will be about his question: How to create a report containing a graph on %CPU time for a particular process?

Well, sometimes making such a thing is like baking bread. We need some ingredients which have to be put together in a certain order, then the whole needs to be processed (baked) before we end up with a ‘consumable’ result (the report with the graph).

So let’s start baking! In this example we want to capture the %CPU Time of the sqlservr.exe process (SQL Server process).

The first ingredient we need is:

Performance Collection Rule
This is needed for capturing performance data, in this case the %CPU Time of the process sqlservr.exe.

Sometimes it is made easy, and the Performance Collection Rule is already in place since it is part of a certain MP. Download Boris Yanushpolsky’s MPViewer version 1.7 from here. Open the MP that might contain the rule and check it out. Use your common sense here: The IIS MP won’t contain a Collection Rule for the SQL Server process…

But for this posting let’s assume there isn’t a Performance Collection Rule in place. So we need to make it ourselves.

  1. Start the OpsMgr Console > Authoring > Rules. Right click it > New Rule > Collection Rules > Performance Based > Windows Performance:
    image
    Make sure to put this Rule into its own MP! Click Next.

  2. Give it a recognizable rule name and select a proper Rule Category (in this case Performance Collection is good).
     image
    The Rule Target must be chosen wisely. Since we want to measure and report on sqlservr.exe, we can target the rule at the SQL 2008 DB Engine. This way the rule only runs on computers running the SQL 2008 DB Engine, so there is no need to disable it. Suppose you targeted this rule at Windows Computer instead. The rule would then apply to all monitored Windows Computers, whether or not they run SQL Server 2008, so you would have to disable it and then enable it again with an override targeted at a group containing all SQL 2008 servers.

    So targeting is a very important issue when making rules/monitors.

    Click Next.

  3. Hit the button Select on the next screen and this screen will pop up:
    image
    Select a computer running the SQL 2008 DB Engine, select ‘Process’ as the Performance Counter Object, ‘% Processor Time’ as the counter and ‘sqlservr’ as the instance, and hit OK. Now you are back in the other screen:
    image
    As you can see, the Object, Counter and Instance are already filled in. The only thing you have to change is the interval. For this example I have set it to 1 minute, which is far too frequent for a production environment. The default of 15 minutes will suffice in most cases. Do not lower it in production, since a shorter interval collects a lot more data!

    Click Next.

  4. Here Optimization can be chosen. Of course, it affects the accuracy of the collected data. But when you have to run collections for a large number of processes, it is worth considering. In this example I have chosen not to use it.
    image
    Click Create.

  5. Go to the Monitoring Pane > _ShowCase JB (name of the MP selected in Step 1) > right click it > New > Performance View. Select ‘Collect by specific rules’ and in the Criteria Description area click the link ‘specific rules…’ 
    image

  6. In the Select Rules screen, select the earlier created rule (Step 2) and click OK twice.
    image

  7. Now the Performance View for this rule will be shown.
    image
    Select the performance rule and in the Actions Pane click Select Time Range. Set it to one hour (just to see whether data is being collected) and hit OK.
    image

    As you can see, data is coming in:
    image 
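
    To put the interval choice from Step 3 in numbers, here is a minimal Python sketch. It is not OpsMgr code: the helper names `counter_path` and `samples_per_day` are made up for illustration. It builds the Windows counter path that matches the Object, Counter and Instance chosen in the wizard, and counts how many samples a single rule produces per monitored instance per day.

```python
# Illustration only: these helpers are hypothetical and not part of OpsMgr.

def counter_path(obj: str, counter: str, instance: str) -> str:
    """Build a Windows performance counter path of the form \\Object(Instance)\\Counter."""
    return f"\\{obj}({instance})\\{counter}"


def samples_per_day(interval_minutes: int) -> int:
    """Number of samples one rule collects per monitored instance per day."""
    return (24 * 60) // interval_minutes


print(counter_path("Process", "% Processor Time", "sqlservr"))
for minutes in (1, 15):
    print(f"{minutes:>2} minute interval: {samples_per_day(minutes)} samples per day")
```

    A 1 minute interval gives 1440 samples per day against 96 for the default 15 minutes, which is why the short interval is only meant for this demo.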

So now we have the most important ingredient in place and in working order.
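
For those wondering what the Optimization option from Step 4 does to accuracy, the Python sketch below may help. It is a simplified model and not the actual OpsMgr implementation: it assumes that optimization stores a new sample only when it differs from the last stored value by more than an absolute tolerance. The function name `optimized` and the sample values are made up.

```python
from typing import Iterable, List

# Simplified model (an assumption, not OpsMgr's real logic): a sample is stored
# only when it differs from the last stored value by more than the tolerance.

def optimized(samples: Iterable[float], tolerance: float) -> List[float]:
    """Return the samples that would be stored under an absolute tolerance."""
    kept: List[float] = []
    for value in samples:
        if not kept or abs(value - kept[-1]) > tolerance:
            kept.append(value)
    return kept


raw = [10.0, 10.5, 11.0, 25.0, 24.0, 26.5, 12.0]  # made-up %CPU readings
stored = optimized(raw, tolerance=2.0)
print(f"stored {len(stored)} of {len(raw)} samples: {stored}")
```

In this model only three of the seven readings are stored. Less data ends up in the database, but the small fluctuations in between are lost, which is the accuracy trade-off mentioned in Step 4.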

Report
All we need now is a report showing this data. Luckily I have already written a series about that. The fourth posting in that series applies here. Of course we don’t want reports about disks but about the %CPU time for the sqlservr.exe process, so we need to change some items described in that posting:

  1. In Step 6 of the previously mentioned posting you need to select ‘Process’ as the Performance Object. Hit the button Search and the correct Collection Rule will be shown. Select it and click OK.
    image
  2. Skip Steps 7 and 8 and click OK.

  3. Follow Step 9. Now the report will run successfully. Be patient though! A report needs more data to be collected before it can show anything, so wait a couple of hours after having built the Performance Collection Rule.
    image
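
Why the wait? One plausible explanation, stated here as an assumption and not something covered in this posting, is that the report is drawn from hourly or daily aggregates and not from the raw samples, so at least one full aggregation period has to pass first. The Python sketch below shows what such an hourly roll-up could look like. The function name `hourly_aggregates` and the sample data are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Illustration only: a generic hourly roll-up, not the actual data warehouse logic.

def hourly_aggregates(samples):
    """Group (timestamp, value) pairs per hour and return avg/min/max/count."""
    buckets = defaultdict(list)
    for ts, value in samples:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {
        hour: {"avg": mean(vals), "min": min(vals), "max": max(vals), "count": len(vals)}
        for hour, vals in sorted(buckets.items())
    }


samples = [  # made-up %CPU samples at a 15 minute interval
    (datetime(2009, 10, 14, 9, 0), 12.0),
    (datetime(2009, 10, 14, 9, 15), 18.0),
    (datetime(2009, 10, 14, 9, 30), 30.0),
    (datetime(2009, 10, 14, 10, 0), 20.0),
]
for hour, stats in hourly_aggregates(samples).items():
    print(hour, stats)
```

With only a handful of samples there are just one or two hourly data points to plot, so a graph built right after creating the rule will look empty or nearly so.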

7 comments:

John Bradshaw said...

This is seriously valuable stuff Marnix. I owe you 10 cigars!!
Thankyou
John Bradshaw

Marnix Wolf said...

:)

John Bradshaw said...

Hi Marnix,
Just back from holidays...Had a great time on the beautiful Sunshine Coast in Queensland. Beautiful one day, Perfect the next !!

Hope u r well.

For some reason, I am having trouble with Step 1 of the Report.
Under Monitoring, I can see the data being collected (over the last couple of days), but when I choose Process, under Performance Object in the graph, I do not see my new Rule. I see LOTS of other things, but not the rule for the process I am monitoring.
If u have time to troubleshoot and it's easier for u I am at jDOTbradshawATunswDOTeduDOTau (Pls remove DOTs and AT)
Thx,
John Bradshaw

Larry said...

Hey Marnix,

Being an SCOM newbie, I could be wrong, but I don't believe that the rule will pick up metrics for servers running more than one instance of SQL Server, as additional instances/processes are named (sqlservr1, sqlservr2, etc.)

It is unfortunate that the SQL 2005 DB Engine class does not populate "Process" property that associates the instance to the process name. That way, we could have used a variable in the rule's "Instance" field. :(

Please let me know if my understanding is incorrect.

Thanks,

Larry

Marnix Wolf said...

Hi Larry.

Thanks for visiting my blog.

For targeting multiple instances and using a variable, take a look here: http://blogs.technet.com/kevinholman/archive/2007/12/14/how-do-i-collect-data-from-a-multi-instance-object-like-a-sql-db-instance.aspx

Best regards,
Marnix Wolf

Pal said...

Hi,

As we create this performance rule, it starts collecting that from the moment the rule is created. Can we use this procedure to run reports on historical data as well?

This report shows data on CPU utilization for a certain process. Can we run a similar report on CPU utilization of the entire server?

Also can we create similar reports on memory and disk utilization?

Thanks in advance,
Pallavi

Marnix Wolf said...

Hi Pal.

Thanks for visiting my blog.

The same procedure can be used to run Reports on historical data as well. However, the data won't be older than the date the Rule became effective.

To answer your other questions: SCOM already has some other mechanisms in place for this. Check out this posting of mine and adjust the counters as needed: http://thoughtsonopsmgr.blogspot.com/2009/08/opsmgr-and-empty-reports-part-4.html

Hope this helps.

Cheers,
Marnix