Welcome to the new world
Microsoft is reinventing itself. It’s in the middle of a huge transition from a company focused on ‘devices & services’ to an enterprise geared to the ‘mobile-first, cloud-first’ mantra. Even though Microsoft has brought marketing to a whole new level, in this particular case there is little marketing mumbo jumbo, if any at all.
The investment and speed of development in Microsoft’s cloud offering are unprecedented, all across the ‘Azure board’. New features are added to the whole Azure portfolio on an almost weekly basis. Some are kept low-key (like the Clutter feature in Office 365), whereas others get more exposure.
The fact is that Azure is an ever-evolving cloud environment gaining more traction by the day. Microsoft’s whole workforce has shifted direction and is working in unison on the development of the cloud.
OMS has the same speed of development
OMS is no exception. Quite recently Microsoft introduced a new feature in OMS: near real-time performance data collection. At first glance it might seem like a minor step but, after having tested it thoroughly, I think it’s a giant leap for OMS.
I’ll tell you why.
NRT & supposed impact
The interval for near real-time (NRT) performance data collection by OMS is set to 10 seconds by default, which makes sense since the name of the feature implies ‘near real-time’.
Being someone with a SCOM background, it made me wonder about the footprint of it all. How about memory and CPU load? How about network load? In other words, what kind of footprint does OMS with NRT performance data collection have on any given server?
Time to put it to the test.
The test environment
Any test is only as good as the environment used for it, together with the applied test scenario. So I decided to deploy two brand new, identical VMs in my own test lab. I also deployed a new OMS workspace to make sure the test wasn’t ‘contaminated’ by old settings I had tested in my other OMS workspaces.
Items:
- 2 identical Windows 2012 R2 VMs (3 GB RAM, 1 vCPU, 1 logical drive C:\, workgroup member), NRT01 and NRT02;
- Both VMs placed on the same Hyper-V host, using the same storage, compute and network resources;
- One new OMS workspace, named NRTLab.
Item configuration:
- Server NRT01 got the Windows Agent, downloadable from the OMS workspace NRTLab (the Windows Agent is the Microsoft Monitoring Agent (MMA) with OMS Workspace connection capabilities);
- The Windows Agent on NRT01 connects ONLY to the NRTLab OMS Workspace;
- NRTLab isn’t connected to any SCOM 2012 Management Group or Azure Storage Account;
- NRTLab Solutions configuration: Log Search and System Update Assessment;
- NRTLab Logs configuration. Log Name: Operations Manager (Error & Warning);
- NRTLab NRT Performance Data Collection settings: OMS default with the default sample interval;
- NRTLab is happy and reports a 100% complete configuration;
- And yes, NRT01 is connected properly to NRTLab and data is coming in.
Now I’ve got enough resources to run a good test. How about a valid test scenario?
Test scenario
Say what? NRT02 has NO Windows Agent? Yes, that’s correct! This server has only ONE purpose: it’s a reference server!
Now I can see what kind of CPU, RAM and network load this server has compared to NRT01 running the Windows Agent reporting to NRTLab while collecting NRT performance data, OpsMgr event log entries (errors & warnings) & checking whether the server is missing out on any crucial updates (performed by the System Update Assessment Solution).
On both servers I defined a new Data Collector Set in Performance Monitor, in order to collect specific performance data:
NRT01
- Logical Disk > Current Disk Queue Length (C:)
- Memory > Available MBytes
- Network Adapter > Bytes Total/Sec
- Network Adapter > Current Bandwidth
- Process > % Processor Time (HealthService.exe & MonitoringHost.exe)
- Process > IO Data Operations/sec (HealthService.exe & MonitoringHost.exe)
- Process > Working Set – Private (HealthService.exe & MonitoringHost.exe)
- Processor Information > % Processor Time
NRT02
- Logical Disk > Current Disk Queue Length (C:)
- Memory > Available MBytes
- Network Adapter > Bytes Total/Sec
- Network Adapter > Current Bandwidth
- Process > % Processor Time (_Total)
- Process > IO Data Operations/sec (_Total)
- Process > Working Set – Private (_Total)
- Processor Information > % Processor Time
I had these Data Collector Sets running for about 24 hours. No programs were open and all MMCs were closed (Performance Monitor included!), so the servers were simply running their own processes and services without being used.
I ran these Data Collector Sets multiple times in order to establish a baseline. The results in this posting are based on the last run, from 20:43 9/7/2015 until 21:21 9/8/2015.
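The comparison behind the results is simple: average every counter over the run on each server, then subtract the reference server's average from the agent server's. Here is a minimal Python sketch of that step. It assumes the Data Collector Set logs were exported to CSV (for example with relog) and that the column headers follow the usual \\HOST\Object(instance)\Counter pattern. The function names and the file names in the usage comment are made up for illustration and are not part of the original test.

```python
import csv
import io
from statistics import mean


def strip_host(column):
    # "\\NRT01\Memory\Available MBytes" -> "\Memory\Available MBytes"
    if column.startswith("\\\\"):
        cut = column.find("\\", 2)
        if cut != -1:
            return column[cut:]
    return column


def counter_means(csv_text):
    """Return {counter path without host prefix: mean value} for one CSV export."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Column 0 holds the timestamp; every other column is one counter.
    names = [strip_host(col) for col in header[1:]]
    samples = {name: [] for name in names}
    for row in reader:
        for name, cell in zip(names, row[1:]):
            try:
                samples[name].append(float(cell))
            except ValueError:
                pass  # skip blank or non-numeric cells (missing samples)
    return {name: mean(vals) for name, vals in samples.items() if vals}


def footprint(agent_means, reference_means):
    """Agent-server average minus reference-server average, per shared counter."""
    shared = agent_means.keys() & reference_means.keys()
    return {name: agent_means[name] - reference_means[name] for name in shared}


# Hypothetical usage with two exported logs:
#   agent = counter_means(open("NRT01.csv", newline="").read())
#   reference = counter_means(open("NRT02.csv", newline="").read())
#   print(footprint(agent, reference))
```

Only counters collected on both servers are compared, so the per-process counters (HealthService.exe, MonitoringHost.exe on NRT01 versus _Total on NRT02) drop out of the difference and have to be read directly from the NRT01 log.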
The results
And I must say this is the very reason I ran the Data Collector Sets multiple times: the results are very impressive.
Seeing is believing, so let’s take a look at the Report view of both Data Collector Sets:
NRT01
NRT02
As you can see, the memory footprint of the Windows Agent is really small. The counter Process / Working Set – Private shows the number of bytes in use by the two components of the Windows Agent: HealthService.exe (5.2 MB) and MonitoringHost.exe (11.8 MB).
This means that together they (the Windows Agent, actually) use 17 MB of RAM! I don’t know about you, but to me that’s really small.
Looking at the CPU footprint you can see it’s small as well. The Windows Agent consumes about 0.151 % Processor Time (% Processor Time NRT01 – % Processor Time NRT02).
When looking at process level, we see that HealthService.exe consumes 0.014 % Processor Time and MonitoringHost.exe 0.034 %. Together that is less than 0.05 % (0.048)!
And the load on the network (Bytes Total/sec) is also very low: 413.469 Bytes Total/sec (0.00039 megabytes per second!) for the Windows Agent (Bytes Total/sec NRT01 – Bytes Total/sec NRT02).
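For anyone who wants to redo the arithmetic, here is a small Python sketch using only the figures quoted above. The one assumption is that a megabyte means 1024 × 1024 bytes, which is what reproduces the 0.00039 figure. The per-day number is my own extrapolation from the average rate, not a value taken from the reports.

```python
# Averages quoted in the text for the roughly 24-hour run.
healthservice_mb = 5.2         # Working Set - Private, HealthService.exe
monitoringhost_mb = 11.8       # Working Set - Private, MonitoringHost.exe
healthservice_cpu = 0.014      # % Processor Time, HealthService.exe
monitoringhost_cpu = 0.034     # % Processor Time, MonitoringHost.exe
agent_bytes_per_sec = 413.469  # Bytes Total/sec, NRT01 minus NRT02

BYTES_PER_MB = 1024 * 1024     # assumption: binary megabytes
SECONDS_PER_DAY = 24 * 60 * 60

agent_ram_mb = healthservice_mb + monitoringhost_mb
agent_cpu_pct = healthservice_cpu + monitoringhost_cpu
agent_mb_per_sec = agent_bytes_per_sec / BYTES_PER_MB
agent_mb_per_day = agent_bytes_per_sec * SECONDS_PER_DAY / BYTES_PER_MB

print(f"RAM: {agent_ram_mb:.1f} MB")
print(f"CPU (process level): {agent_cpu_pct:.3f} %")
print(f"Network: {agent_mb_per_sec:.5f} MB/s, about {agent_mb_per_day:.0f} MB per day")
```

At that average rate the agent's total network traffic works out to roughly 34 MB per day.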
But how about the network load for NRT Performance data collection only? The OpsMgr Engineering Team states: ‘… for a particular computer, a given counter instance (e.g., Processor(_Total)\% Processor Time) with 10 second sample interval will send ~1MB per day (~1MB/day/counter instance)…’.
I contacted Microsoft about this and they told me this is UNCOMPRESSED data! Since it gets compressed, the actual values are even lower! They also assured me this has been thoroughly tested and triple-checked.
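To get a feel for what ‘~1MB/day/counter instance’ means in practice, here is a rough Python sketch. It assumes the quoted figure scales linearly with the number of counter instances and treats it as an uncompressed upper bound. The function names and the instance count in the usage comment are made up for illustration.

```python
SAMPLE_INTERVAL_SEC = 10       # OMS default NRT sample interval
MB_PER_DAY_PER_INSTANCE = 1.0  # quoted "~1MB/day/counter instance", uncompressed
BYTES_PER_MB = 1024 * 1024     # assumption: binary megabytes


def samples_per_day(interval_sec=SAMPLE_INTERVAL_SEC):
    """Number of samples one counter instance produces in 24 hours."""
    return 24 * 60 * 60 // interval_sec


def bytes_per_sample():
    """Implied uncompressed size of a single sample at the default interval."""
    return MB_PER_DAY_PER_INSTANCE * BYTES_PER_MB / samples_per_day()


def estimated_mb_per_day(counter_instances):
    """Uncompressed upper bound, assuming linear scaling with instance count."""
    return counter_instances * MB_PER_DAY_PER_INSTANCE


# Hypothetical usage: a server collecting 8 counter instances.
#   estimated_mb_per_day(8)  -> about 8 MB per day before compression
```

At a 10-second interval each counter instance produces 8,640 samples per day, so the quoted figure implies roughly 121 bytes per uncompressed sample.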
Recap
I am amazed! Never did I expect to see such a SMALL footprint of the OMS Agent (AKA Windows Agent) on any given monitored server.
Since OMS uses a cloud-based, state-of-the-art back end for data processing, it doesn’t have the potential bottlenecks we may see with on-prem SCOM installations. So data comes in, is processed very fast and is shown in your OMS workspace in a matter of seconds. Now that’s NEAR REAL-TIME!!!
Since the footprint of OMS is so small I see no reason NOT to use OMS on any important server. Connect the Windows Agent with an on-prem SCOM environment and you’ve got the best of both worlds: on-prem SCOM and state of the art (and ever evolving) OMS in the Cloud!
Check it yourself
Both Performance Monitor Reports used for this posting can be downloaded from my OneDrive and opened in Performance Monitor, so you can see it for yourself: NRT01 and NRT02.
But even better, start using OMS today and see what it can do for your environment.