I am doing some performance measurement of my code on a Windows box, and I am finding that I get dramatically different results between measurements. A quick bit of ad hoc exploration during a slow run shows the System Idle Process taking up almost 100% CPU in Task Manager.
Does anyone know what the System Idle Process actually is and what Windows features it may be running?
NB: I am not measuring performance using the task manager, I just used it to take a look at what else was running during a particularly slow measurement.
Please think before saying this is not programming related and closing the question. I would not ask it unless I thought there were grounds to say it is. In this case I believe it clearly is because it is detrimentally affecting my development and test environments and in order to sort it out I need to know a bit more about it. Programming does not start and end with the writing of the code.
The idle process typically doesn't do any useful work except execute the HLT instruction, which puts that CPU core into a lower power state (C1). However, the fact that your benchmark is not consuming 100% CPU time does open the door for speculation about what is going on.
If your application is single-threaded and your test system is multicore/hyperthreaded/multi-CPU, then you should expect to see around 50% idle CPU time for two cores, 75% for four, etc. This is because the CPU time percentages in Task Manager include all cores. (I believe that older versions of Windows had an option to change this, but I don't see it on Vista.)
If the idle process is consuming a lot of CPU, that may indicate that your application is spending a lot of time sleeping. It might be waiting for data from some external source (e.g. a disk or network). It might be spending a lot of time waiting for synchronization objects (e.g. mutexes or events). It also might be spending a lot of time calling the Sleep() function. Profiling your code should identify where it's spending the time.
Getting fully reproducible benchmark results may require you to disable processor/disk/network-intensive background applications and services (e.g. search indexing, SMS software inventory, virus scanning, Windows Update downloads, IncrediBuild/distcc) or to connect the machine to an isolated network (or to no network at all).
I'm assuming that you wrote a benchmark for your application, and that you're just trying to use Task Manager to diagnose why the benchmark results aren't what you expected. Task Manager isn't an accurate way to measure application performance.
The System Idle Process is a sort of default process that Windows runs on a processor when it has nothing else to schedule. It doesn't really do any work of its own; it mostly just lets the CPU idle, which saves system power.
If you're measuring the performance of your program, don't use the Windows Task Manager. Use Performance Monitor instead (which you can start by typing 'perfmon' at the command line). Or better still, use a profiler.
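If you just want rough numbers from inside the code itself, here is a minimal sketch using the high-resolution QueryPerformanceCounter API; the WorkUnderTest() function is only a placeholder for your own code:

```c
#include <windows.h>
#include <stdio.h>

/* Placeholder for the code being measured. */
static void WorkUnderTest(void)
{
    Sleep(50);
}

int main(void)
{
    LARGE_INTEGER freq, start, end;

    QueryPerformanceFrequency(&freq);   /* counter ticks per second */

    QueryPerformanceCounter(&start);
    WorkUnderTest();
    QueryPerformanceCounter(&end);

    printf("elapsed: %.3f ms\n",
           (end.QuadPart - start.QuadPart) * 1000.0 / (double)freq.QuadPart);
    return 0;
}
```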
If you "system idle process" is taking up 100% then essentially you machine is bored, nothing is going on. If you add up everything going on in task manager, subtract this number from 100%, then you will have the value of "system idle process." Notice it consumes almost no memory at all and cannot be affecting performance.
Related
Please make the technical difference between these three things on MS Windows systems clear once more. The first is the timer resolution, which you can set and query via the undocumented ntdll.dll functions NtSetTimerResolution and NtQueryTimerResolution, or inspect with Sysinternals' clockres.exe tool.
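For illustration, here is a small sketch of querying the current resolution the way clockres.exe does; note that the NtQueryTimerResolution prototype below is the commonly published one rather than an official SDK declaration, so treat the signature as an assumption:

```c
#include <windows.h>
#include <stdio.h>

/* Exported by ntdll.dll but undocumented; values are in 100 ns units.
   This prototype is the widely published one, not from the SDK headers. */
typedef LONG (NTAPI *NtQueryTimerResolution_t)(PULONG MinimumResolution,
                                               PULONG MaximumResolution,
                                               PULONG CurrentResolution);

int main(void)
{
    ULONG coarsest = 0, finest = 0, current = 0;
    NtQueryTimerResolution_t query = (NtQueryTimerResolution_t)
        GetProcAddress(GetModuleHandleW(L"ntdll.dll"), "NtQueryTimerResolution");

    if (query != NULL && query(&coarsest, &finest, &current) == 0)
    {
        printf("coarsest %.3f ms, finest %.3f ms, current %.3f ms\n",
               coarsest / 10000.0, finest / 10000.0, current / 10000.0);
    }
    return 0;
}
```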
This is one of the notorious tricks the Chrome browser used some time ago to perform better across the web. (At the moment they have left the high-resolution trick in place for the Flash plugin only.) https://bugs.chromium.org/p/chromium/issues/detail?id=153139
https://randomascii.wordpress.com/2013/07/08/windows-timer-resolution-megawatts-wasted/
In fact, Visual Studio and SQL Server do the same trick in some cases. Personally, I feel it makes the whole system perform better and feel crisper, not slow down as many people out there warn.
What is the difference between the timer resolution and the application I/O and memory priority (Realtime/High/Above normal/Normal/Low/Background/etc.) you may set via Task Manager, apart from the fact that the timer resolution is set for the whole system rather than a single application?
What is the difference between those and the Processor scheduling option you can adjust from CMD > SystemPropertiesPerformance.exe -> Advanced tab? By default, the client OS versions (XP/Vista/7/8/8.1/10) favor the performance of programs, while the server versions (2k3/2k8/2k12/2k16) favor background services. How does this option interact with the two above?
timeBeginPeriod() is the documented API to do this. It is documented to affect the accuracy of Sleep(). Dave Cutler probably did not enjoy implementing it, but allowing Win 3.1 code to port made it necessary. The multimedia API back then was necessary to keep anemic hardware with small buffers going without stuttering.
Very crude, but there is no other good way to do it in the kernel. The normal state for a processor core is to be stopped on an HLT instruction, consuming (almost) no power; the only way to revive it is with a hardware interrupt. Which is what this does: it cranks up the clock interrupt rate. The clock normally ticks 64 times per second; you can jack it up to 1000 with timeBeginPeriod(), 2000 with the native API.
And yes, it is pretty bad for power consumption. The clock interrupt handler also activates the thread scheduler, a fairly unsubtle chunk of code. That is the reason a Sleep() call can now wake up at (almost) the clock interrupt rate. This was tinkered with in Win8.1, by the way; the only thing I noticed about the changes is that it is not quite as responsive anymore and a 1 msec rate can cause up to 2 msec delays.
Chrome is indeed notorious for ab/using the heck out of it. I always assumed that it provided a competitive edge for a company that does big business in mobile operating systems and battery-powered devices. The guy that started this web site noticed something was wrong. The more responsible thing to do for a browser is to bump up the rate to 10 msec, necessary to get accurate GIF animation. Multi-media playback does not need it anymore.
This otherwise has no effect at all on scheduling priorities. One detail I did not check is if the thread quantum changes correspondingly (the number of ticks a thread may own a core before being evicted, 3 for a workstation). I suspect it does.
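To make the mechanics above concrete, here is a minimal sketch of the documented timeBeginPeriod()/timeEndPeriod() pairing (link against winmm.lib; the 1 ms value is just an example and must be matched by the timeEndPeriod() call):

```c
#include <windows.h>
#include <mmsystem.h>            /* timeBeginPeriod / timeEndPeriod */
#include <stdio.h>
#pragma comment(lib, "winmm.lib")

int main(void)
{
    /* Ask for a 1 ms clock interrupt rate for the whole system. */
    if (timeBeginPeriod(1) != TIMERR_NOERROR)
    {
        fprintf(stderr, "1 ms resolution not supported\n");
        return 1;
    }

    /* Sleep(1) now wakes after roughly 1-2 ms instead of the default
       ~15.6 ms (64 Hz) tick. */
    Sleep(1);

    /* Always undo the request; the raised rate costs power system-wide. */
    timeEndPeriod(1);
    return 0;
}
```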
Suppose you are making a GUI application, and you need to load/parse/calculate a bunch of things before a user can use a certain tool, and you know what you have to do beforehand.
Suddenly, it makes sense to start doing these calculations in the background over a period of time (as opposed to "in one go" on start-up or exactly when it is needed). However, doing too much in the background will slow down the responsiveness of the application.
Are there any standard practices in this kind of approach? Perhaps ways to detect low load on the CPU or the user being idle and execute code in those times? Arguments against this type of approach?
Thanks!
Without knowing your app or your audience, I can't give you specific advice.
My main argument against the approach is that unless you have a high-profile application which will see a lot of use by non-programmers, I wouldn't bother. This sounds like a lot of busy work that could be spent developing or refining features that actually allow people to do new things with your app.
That being said, if there is a reason to do it, there is nothing wrong with lazy-loading data.
The problem with waiting until idle time is that some people have programs like SETI@home installed on their computer, in which case their computer has little to no idle time. If loading at full throttle kills the responsiveness of your app, you could try injecting sleeps. This is what a lot of video games do when you specify a target frame rate, to avoid pegging the CPU. This would get the data loaded faster than waiting for idle time would.
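A rough sketch of that sleep-injection idea follows; the chunk size, delay, and LoadItems() routine are purely illustrative:

```c
#include <windows.h>

/* Hypothetical stand-in for loading one chunk of items. */
static void LoadItems(int first, int count)
{
    (void)first; (void)count;
    Sleep(5);   /* pretend to do a little work */
}

/* Load everything one chunk at a time, pausing briefly between chunks so
   the UI thread and other processes stay responsive. */
static void LoadInBackground(int totalItems)
{
    const int chunkSize = 25;               /* illustrative */
    int done;
    for (done = 0; done < totalItems; done += chunkSize)
    {
        LoadItems(done, chunkSize);
        Sleep(10);                          /* throttle; tune to taste */
    }
}
```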
If parts of your app depend on that data to work, and the user invokes that part of the app, you will have to abandon the lazy-loading approach and resume your full CPU/disk-taxing load. If that takes a long time, or makes the app unresponsive, you could display a loading dialog with a progress bar.
If your target audience will tend to have a multicore CPU, and if your app's startup and background initialization tasks won't contend for the same resources (e.g. disk I/O, network, ...) to the point of creating a new bottleneck, you might want to kick off a background thread to perform the initialization tasks (or even a thread per initialization task, if you have several tasks that can run in parallel). That will make more efficient use of multicore hardware.
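On the Win32 side, a minimal sketch of that idea might look like the following; LoadAndParseData() is a hypothetical stand-in for the real startup work:

```c
#include <windows.h>
#include <stdio.h>

/* Hypothetical stand-in for the expensive load/parse/calculate work. */
static void LoadAndParseData(void *state)
{
    (void)state;
    Sleep(2000);                /* pretend to do two seconds of work */
}

static DWORD WINAPI BackgroundInit(LPVOID param)
{
    /* Run below normal priority so the UI thread always wins the CPU. */
    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_BELOW_NORMAL);
    LoadAndParseData(param);
    return 0;
}

int main(void)
{
    /* Kick the work off as soon as the main window is up; keep the handle
       so you can wait on it if the user invokes the tool before it's done. */
    HANDLE hInit = CreateThread(NULL, 0, BackgroundInit, NULL, 0, NULL);

    /* ... the message loop / UI work would go here ... */

    WaitForSingleObject(hInit, INFINITE);
    CloseHandle(hInit);
    printf("background initialization finished\n");
    return 0;
}
```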
You didn't specify your target platform, but it's exceedingly easy to achieve this in .NET and so I have begun doing it in many of my desktop apps.
I recently got a shiny new development workstation. The only disadvantage of this is that the desktop apps I'm developing now run very, very fast, and so I fear that parts of the code that would be annoyingly slow on end users' machines will go unnoticed during my testing.
Is there a good way to slow down an application for testing? I've tried searching around, but all of the results I've been able to find seem pretty fiddly to set up (e.g., manually setting up a high-priority CPU-bound task on the same CPU core as the target app, or running a background process that rapidly interrupts and resumes the target app), and I don't know if the end result is actually a good representation of running on a slower computer (with its slower CPU, slower RAM, slower disk I/O...).
I don't think that this is a job for a profiler; I'm interested in the user's perception of end-to-end performance rather than in where the time goes for particular operations.
Set up a virtual machine, give it as little RAM as needed, and you can also have it use 1, 2, or more CPUs. I like VirtualBox myself. Install your app and test with different RAM configurations.
Personally, I'd get an old used crappy computer that is typical of what the users have and test on that. It should be cheap and you will see pretty fast how bad things are.
I think the only way to deal with this is through proper end-user testing, i.e. get yourself a "typical" system for testing and use that to identify any perceptible performance bottlenecks.
You can try out either Virtual PC or VMWare Player/Workstation, load an OS onto it, and then throttle back the resources. I know that with any of those tools you can reduce the memory to whatever you'd like. You can also specify the number of cores you want to use. You might even be able to adjust the clock speed in VMWare Workstation... I'm not sure.
I upvoted SQLMenace's answer, but, I also think that profiling needs to be mentioned, no matter how quickly the code is executing - you'll still see what's taking the most time. If you find yourself with some free time, I think profiling and investigating the results is a good way to spend it.
I am debugging an application which slows down the system very badly. The application loads a large amount of data (some 1000 files, each about half a MB) from the local hard disk. The files are loaded as memory-mapped files and are mapped only when needed. This means that at any given point in time the virtual memory usage does not exceed 300 MB.
I also checked the Handle count using handle.exe from sysinternals and found that there are at the most some 8000 odd handles opened. When the data is unloaded it drops to around 400. There are no handle leaks after each load and unload operation.
After 2-3 load/unload cycles, during one load, the system becomes very slow. I checked the virtual memory usage of the application as well as the handle counts at that point, and both were well within limits (VM about 460 MB, not much fragmentation either, handle count 3200).
I want to understand how an application could make the system very slow to respond. What other tools can I use to debug this scenario?
Let me be more specific: when I say "system" I mean that the whole of Windows is slowing down. Task Manager itself takes 2 minutes to come up, and most often a hard reboot is required.
The fact that the whole system slows down is very annoying; it means you cannot attach a profiler easily, and it would even be difficult to stop a profiling session in order to view the results (since you said it requires a hard reboot).
The tool best suited for the job in this situation is ETW (Event Tracing for Windows). These tools are great and will give you the exact answer you are looking for.
Check them out here
http://msdn.microsoft.com/en-us/library/cc305210.aspx
and
http://msdn.microsoft.com/en-us/library/cc305221.aspx
and
http://msdn.microsoft.com/en-us/performance/default.aspx
Hope this works.
Thanks
Tools you can use at this point:
Perfmon
Event Viewer
In my experience, when things happen to a system that prevent Task Manager from popping up, they're usually of the hardware variety -- the system event log in Event Viewer is sometimes just full of warnings or errors that some hardware device is timing out.
If Event Viewer doesn't indicate that any kind of loggable hardware error is causing the slowdown, then try Perfmon -- add counters for the System object to track file reads, exceptions, context switches, etc. per second and see if there's something obvious there.
Frankly the sort of behavior demonstrated is meant to be impossible - by design - for user-mode code to cause. WinNT goes to a lot of effort to insulate applications from each other and prevent rogue applications from making the system unusable. So my suspicion is some kind of hardware fault is to blame. Is there any chance you can simply run the same test on a different PC?
If you don't have profilers, you may have to do the same work by hand...
Have you tried commenting out all read/write operations, just to check whether the slowdown disappears?
"Divide and conquer" strategies will help you find where the problem lies.
If you run it under an IDE, run it until it gets real slow, then hit the "pause" button. You will catch it in the act of doing whatever takes so much time.
You can use tools like "IBM Rational Quantify" or "Intel VTune" to detect performance issues.
[EDIT]
As Benoît did, one good approach is to measure each task's time to identify which one is eating the CPU.
But remember, as you are working with many files, it is likely that page misses are causing memory to be swapped to disk.
When Task Manager is taking 2 minutes to come up, are you getting a lot of disk activity? Or is it CPU-bound?
I would try Process Explorer from Sysinternals. When your system is in the slowed-down state and you try running, say, Notepad, pay attention to the page fault deltas.
Windows is very greedy about caching file data. I would try removing file I/O as someone suggested, and also making sure you close the file mapping as soon as you are done with a file.
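As a rough sketch of that open/map/use/unmap/close sequence (error handling trimmed and the path hypothetical):

```c
#include <windows.h>

/* Map one file, read from it, then release everything promptly so the
   cache manager can let go of the pages and handles. Error checks trimmed. */
static void ProcessOneFile(const wchar_t *path)
{
    HANDLE file = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                              OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    HANDLE mapping = CreateFileMappingW(file, NULL, PAGE_READONLY, 0, 0, NULL);
    const BYTE *view = (const BYTE *)MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0);

    /* ... read from 'view' here ... */

    UnmapViewOfFile(view);      /* release the view first            */
    CloseHandle(mapping);       /* then the mapping object           */
    CloseHandle(file);          /* and finally the file handle       */
}
```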
I/O is probably causing your slowdown, especially if your files are on the same disk as the OS. Another way to test that would be to move your files to another disk and see if that alleviates the problem.
I am profiling some C code in Microsoft VS 2005 on an Intel Core 2 Duo platform.
I measure the time (secs:millisecs) consumed by my function. But I have some doubts about the accuracy of this measurement, as the operating system will not run my application continuously; it will schedule other apps/services in between the execution of my code. (Although I have no major applications running while I do the profile run, Windows still has a lot of code of its own which it runs by preempting my app.) Because of all this, I believe the profiling number (the time taken by my app to run) is not accurate.
So my question is: is there any way to find out the operating system overhead and scheduling overhead on a typical Windows system (I run Windows XP)? E.g. if my application says it ran for 60 milliseconds, out of those 60 msec, how much time was really used by my app, and how much time was it sitting idle because it was pre-empted by some other task scheduled by the OS?
or
At least, is there any ballpark figure for such OS overhead, based on your experience doing something similar?
@Kogus: Even if I run outside the debugger (a standalone app from a command prompt), it could still be preempted by the OS, causing an incorrect measurement of the time consumed by my app.
Isn't it?
-AD
I think you are going to have some problems with the granularity. See similar questions GetLocalTime() API time resolution and Is gettimeofday() guaranteed to be of microsecond resolution?
Also, you may want to take a look at the Windows Resource Kits Tools which include timeit.exe (similar to time on unix/linux) to give you elapsed and process times.
Suggestion
Try running on multi-CPU systems.
The best way of doing this is a dedicated profiling tool. There are lots out there. I haven't used one for C for a few years, someone else will hopefully be able to give better advice. As you are using Visual Studio 2005 this might be a good place to start:
AQ, but I've never used it.
1 - Put some debug logging in your code (include timestamps of course), and run it outside of the debugger
2 - Run again in the debugger
3 - Repeat many times, to get statistically valid data.
4 - Compare.
If there is a significant difference in the average execution time of the standalone vs. the debugger, then you are right to be suspicious of the OS (or the overhead of the debugger hooks themselves...). If no difference, then don't sweat it.
Edit0: Obviously the debug messages have some overhead of their own. You may want to leave those in the code even when you are running from the debugger. That way, both the standalone and the debugger are running the very same code.
Edit1: I misunderstood the question. I thought your concern was that, while debugging, the OS might interrupt your app more frequently than in a normal mode of execution. If you want to know how much time your app actually spent working, just compare the time taken to the "CPU Time" column in Task Manager.
Edit2: Compare the time returned by GetProcessTimes for your process to the actual execution time. The difference is the time spent by the CPU on somebody else.
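A minimal sketch of that comparison; the Sleep(60) call is just a placeholder for the work being measured, and note that GetProcessTimes() reports CPU time accumulated since the process started:

```c
#include <windows.h>
#include <stdio.h>

/* Convert a FILETIME (100 ns units) to milliseconds. */
static double FiletimeToMs(FILETIME ft)
{
    ULARGE_INTEGER u;
    u.LowPart  = ft.dwLowDateTime;
    u.HighPart = ft.dwHighDateTime;
    return u.QuadPart / 10000.0;
}

int main(void)
{
    FILETIME createTime, exitTime, kernelTime, userTime;
    LARGE_INTEGER freq, t0, t1;
    double wallMs, cpuMs;

    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&t0);

    Sleep(60);                  /* placeholder for the measured work */

    QueryPerformanceCounter(&t1);
    GetProcessTimes(GetCurrentProcess(), &createTime, &exitTime,
                    &kernelTime, &userTime);

    wallMs = (t1.QuadPart - t0.QuadPart) * 1000.0 / (double)freq.QuadPart;
    cpuMs  = FiletimeToMs(kernelTime) + FiletimeToMs(userTime);
    printf("wall %.1f ms, CPU %.1f ms, pre-empted/waiting %.1f ms\n",
           wallMs, cpuMs, wallMs - cpuMs);
    return 0;
}
```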