EventID 21405, containing the message ‘The process started at x failed to create y, no errors detected in the output. The process exited with 1’, was logged many times in the OpsMgr event log of that server.
CSS to the rescue
Nothing we tried helped, including this solution. So finally Microsoft Customer Support Services (CSS) was contacted and a case opened. It took some time, since the server itself had to be traced and logged in full, which couldn’t be done during production hours.
Cause & Solution
Finally a giant log file was created and sent to CSS for thorough investigation. Soon the answer came in: ‘…changing the size of the desktop heap could fix it…’. CSS referred to two KB articles on how to do that, depending on the OS version:
- For Windows 2008 and later: http://support.microsoft.com/kb/947246/en-us
- For Windows 2003: http://support.microsoft.com/kb/184802/en-us
After applying the fix as described in the KB article, all was well again.
So somehow, somewhere, the heap size wasn’t correct anymore, which caused the scripts to fail. It turned out that it wasn’t a SCOM issue at all; the SCOM Agent merely made the issue with the server visible.
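To make the fix a bit more concrete: to my knowledge, the KB articles describe the desktop heap sizes as the three comma-separated numbers in the SharedSection entry of the Windows value under HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\SubSystems, where the third number applies to non-interactive desktops (the kind services run on). The Python sketch below only parses and rewrites such a string. The sample string and its numbers are illustrative, the function names are my own, and it does not touch the registry; follow the KB article for the actual change.

```python
import re

# Illustrative value data in the layout described in KB 184802.
# The numbers are sample values, not a recommendation.
SAMPLE = (r"%SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows "
          r"SharedSection=1024,20480,768 Windows=On SubSystemType=Windows")


def parse_shared_section(value):
    """Return the three SharedSection sizes (in KB) as a dict."""
    match = re.search(r"SharedSection=(\d+),(\d+),(\d+)", value)
    if match is None:
        raise ValueError("no three-part SharedSection entry found")
    shared, interactive, non_interactive = (int(g) for g in match.groups())
    return {
        "shared": shared,
        "interactive": interactive,
        "non_interactive": non_interactive,
    }


def with_non_interactive_heap(value, new_kb):
    """Return the value string with the third SharedSection number replaced."""
    sizes = parse_shared_section(value)
    replacement = "SharedSection={},{},{}".format(
        sizes["shared"], sizes["interactive"], new_kb
    )
    return re.sub(r"SharedSection=\d+,\d+,\d+", replacement, value, count=1)


if __name__ == "__main__":
    print(parse_shared_section(SAMPLE))
    print(with_non_interactive_heap(SAMPLE, 1024))
```

Parsing the current value first makes it easy to see which of the three numbers a change would affect before anything is edited by hand.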
Using SCOM to detect heap size issues
The key customer has created a Monitor in SCOM which scans all servers for EventID 21405 and alerts on it. So whenever a server is having heap size issues, they’ll know about it and know the fix for it as well.
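The logic behind such a Monitor can be sketched in a few lines of Python. This is not the customer's Monitor and not SCOM code: the event records, server names, and the threshold parameter are hypothetical, and only the event ID comes from this post. Since EventID 21405 reports a failed script in general, a hit is a hint to check the desktop heap, not proof.

```python
from collections import Counter

HEAP_SYMPTOM_EVENT_ID = 21405  # the event ID named in this post


def servers_with_symptom(events, threshold=1):
    """Return sorted server names that logged event 21405 at least `threshold` times."""
    counts = Counter(
        event["server"]
        for event in events
        if event["event_id"] == HEAP_SYMPTOM_EVENT_ID
    )
    return sorted(name for name, hits in counts.items() if hits >= threshold)


# Hypothetical sample records; a real source would be the OpsMgr event log.
SAMPLE_EVENTS = [
    {"server": "SRV01", "event_id": 21405},
    {"server": "SRV01", "event_id": 21405},
    {"server": "SRV02", "event_id": 1000},
    {"server": "SRV03", "event_id": 21405},
]

if __name__ == "__main__":
    print(servers_with_symptom(SAMPLE_EVENTS, threshold=2))
```

A threshold above 1 filters out servers where a script failed only once, which matches the symptom described here of the event being logged many times.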