Operating System Overheads while profiling? - profile

I am doing profiling of a C code in Microsoft VS 2005 on a Intel Core-2Duo platform.
I measure the time(secs:millisecs) counsumed by my function. But i have some doubts about the accuracy of this measurement as the operating system will not continuously run my application, but instead schedule others apps/services in between the execution of my code.(Although i have no major applications running while i do the profile run, still windows will have lot of code of its own which it will run by preempting my app.). Because of all this i believe the profiling number(time taken by my app to run) is not accurate.
So my question is there any way to find out the Operating system overheads, scheduling overhead on a typical windows system(I run Windows XP)e.g. if my applications says it ran for 60 milliseconds, out of that 60 msec, how much time really was used by my app. and how much time it was sitting idle, due to being pre-empted by some other task scheduled by the OS?
or
Atleast is there any ball-park number to get such OS overhead, based on your experience you came across while doing something similar?

#Kogus: Even if i run outside debugger(standalone app. from a command prompt) it still could be preempted by OS and cause a incorrect measurement of the time consumed by my app.
Is'nt it?
-AD

I think you are going to have some problems with the granularity. See similar questions GetLocalTime() API time resolution and Is gettimeofday() guaranteed to be of microsecond resolution?
Also, you may want to take a look at the Windows Resource Kits Tools which include timeit.exe (similar to time on unix/linux) to give you elapsed and process times.

Suggestion
Try run on multi CPU systems.

The best way of doing this is a dedicated profiling tool. There are lots out there. I haven't used one for C for a few years, someone else will hopefully be able to give better advice. As you are using Visual Studio 2005 this might be a good place to start:
AQ, but I've never used it.

1 - Put some debug logging in your code (include timestamps of course), and run it outside of the debugger
2 - Run again in the debugger
3 - Repeat many times, to get statistically valid data.
4 - Compare.
If there is a significant difference in the average execution time of the standalone vs. the debugger, then you are right to be suspicious of the OS (or the overhead of the debugger hooks themselves...). If no difference, then don't sweat it.
Edit0: Obviously the debug messages have some overhead of their own. You may want to leave those in the code even when you are running from the debugger. That way, both the standalone and the debugger are running the very same code.
Edit1: I misunderstood the question. I thought your concern was that --while debugging--, the OS might interrupt your app more frequently than in a normal mode of execution. If you want to know how much time your app actually spent working, just compare the time taken to the "CPU Time" in the Task Manager.
Edit2: Compare the time returned by GetProcessTimes for your process to the actual execution time. The difference is the time spent by the CPU on somebody else.

Related

using visual studio #pragma to make project high priority

I'm trying to test my project in high priority because the wall-clock timing I have in code seemed to be too long. I have already tried setting my .exe to run in high priority through the command line. It brought back the same time. Someone suggested that I use visual studio's #pragma to make it high priority, however I cannot find how to do that.
Note, my current project is single thread.
Is there any other way?
specific project is: timing sound initialization of soundcard (digital->analog) and DAQ (analog->digital). It's taking 1.1013 to 0.998 seconds, and I'm going to need it to be more precise because I'm using the DAQ to acquire data of a process that only takes around 10ms
If a certain function in your software requires to run in high priority, use the Windows functions SetThreadPriority or SetPriorityClass.
As soon as running at high priority is not required anymore, your software should revert back to normal priority. Do not try to keep your application running always and generally as "high priority". This would be hostile behavior :)
More information at:
msdn.microsoft.com/en-us/library/windows/desktop/ms685100.aspx

Testing perceived performance

I recently got a shiny new development workstation. The only disadvantage of this is that the desktop apps I'm developing now run very, very fast, and so I fear that parts of the code that would be annoyingly slow on end users' machines will go unnoticed during my testing.
Is there a good way to slow down an application for testing? I've tried searching around, but all of the results I've been able to find seem pretty fiddly to set up (e.g., manually setting up a high-priority CPU-bound task on the same CPU core as the target app, or running a background process that rapidly interrupts and resumes the target app), and I don't know if the end result is actually a good representation of running on a slower computer (with its slower CPU, slower RAM, slower disk I/O...).
I don't think that this is a job for a profiler; I'm interested in the user's perception of end-to-end performance rather than in where the time goes for particular operations.
setup a virtual machine, give in as little ram as needed and also you can have it use 1,2 or more CPUs. I like VirtualBox myself install your app and test with different RAM configs
Personally, I'd get an old used crappy computer that is typical of what the users have and test on that. It should be cheap and you will see pretty fast how bad things are.
I think the only way to deal with this is through proper end-user testing, i.e. get yourself a "typical" system for testing and use that to identify any perceptible performance bottlenecks.
You can try out either Virtual PC or VMWare Player/Workstation, load an OS onto it, and then throttle back the resources. I know that with any of those tools you can reduce the memory to whatever you'd like. You can also specify the number of cores you want to use. You might even be able to adjust the clock speed in VMWare Workstation... I'm not sure.
I upvoted SQLMenace's answer, but, I also think that profiling needs to be mentioned, no matter how quickly the code is executing - you'll still see what's taking the most time. If you find yourself with some free time, I think profiling and investigating the results is a good way to spend it.

Program is slower when compiled

Any suggestions on why a VB6 program would be slower when compiled than when running in the debugger? I'm compiling it with "Optimize for fast code."
Notes:
I measure performance by running the compiled version and the non-compiled version on the same machine. I based my predictions on wall-clock time, since 30 minutes vs. 100 minutes is a big enough difference to be visible.
Several months ago, I configured a debugging tool to attach itself to my program whenever it ran. I totally forgot that I had done this.
Special thanks to Process Monitor for making this very obvious.
Turning it off made the program run fast.
AppVerifier, for those who are curious.
You should select the compile to Native Code option
The compile to P-code option forces your program to run in an interpreted mode, which can be slower.
There are some optimizations in the advanced section. Try them out too.
Some more points to consider:
Are you running the compliled application in the same environment? Is it taking the same data as input?
How did you know that it is slow? What if your timing program is wrong?
How do you measure the performance?
It is hard to measure the performance by what you just said. You have to ensure the running environment must be exactly same for compare the performance?
Are you running on the same machine? Do you connect to DB? Does DB has the same work load at different run? You need isolate other factors before reaching such a decision.

Windows System Idle Processes interfering with performance measurement

I am doing some performance measurement of my code on a Windows box and I am finding that I am getting dramatically different results between measurements. A quick bit of ad hoc exploration during a slow one shows in the task manager System Idle Processes taking up almost 100% CPU.
Does anyone know what System Idle Processes actually means and what Windows features it may be running?
NB: I am not measuring performance using the task manager, I just used it to take a look at what else was running during a particularly slow measurement.
Please think before saying this is not programming related and closing the question. I would not ask it unless I thought there were grounds to say it is. In this case I believe it clearly is because it is detrimentally affecting my development and test environments and in order to sort it out I need to know a bit more about it. Programming does not start and end with the writing of the code.
The idle process typically doesn't do any useful work except execute the HLT instruction, which puts that CPU core into a lower power state (C1). However, the fact that your benchmark is not consuming 100% CPU time does open the door for speculation about what is going on.
If your application is single-threaded and your test system is multicore/hyperthreaded/multi-CPU, then you should expect to see around 50% idle CPU time for two cores, 75% for four, etc. This is because the CPU time percentages in Task Manager include all cores. (I believe that older versions of Windows had an option to change this, but I don't see it on Vista.)
If the idle process is consuming a lot of CPU, that may indicate that your application is spending a lot of time sleeping. It might be waiting for data from some external source (e.g. a disk or network). It might be spending a lot of time waiting for synchronization objects (e.g. mutexes or events). It also might be spending a lot of time calling the Sleep() function. Profiling your code should identify where it's spending the time.
Getting fully reproducible benchmark results may require you to disable processor/disk/network-intensive background applications and services (e.g. search indexing, SMS software inventory, virus scanning, Windows Update downloads, IncrediBuild/distcc) or to connect the machine to an isolated network (or to no network at all).
I'm assuming that you wrote a benchmark for your application, and that you're just trying to use Task Manager to diagnose why the benchmark results aren't what you expected. Task Manager isn't an accurate way to measure application performance.
System Idle Process is a sort of a default process that Windows runs on the processor when it has nothing else to schedule for running. This process is like a housekeeper that does things like trying to save system power etc.
If you're measuring the performance of your program, don't use the Windows Task manager. Use Performance Monitor instead (which you can start by typing 'perfmon' at the commandline). Or better still, use a profiler.
If you "system idle process" is taking up 100% then essentially you machine is bored, nothing is going on. If you add up everything going on in task manager, subtract this number from 100%, then you will have the value of "system idle process." Notice it consumes almost no memory at all and cannot be affecting performance.

Finding GDI/User resource usage from a crash dump

I have a crash dump of an application that is supposedly leaking GDI. The app is running on XP and I have no problems loading it into WinDbg to look at it. Previously we have use the Gdikdx.dll extension to look at Gdi information but this extension is not supported on XP or Vista.
Does anyone have any pointers for finding GDI object usage in WinDbg.
Alternatively, I do have access to the failing program (and its stress testing suite) so I can reproduce on a running system if you know of any 'live' debugging tools for XP and Vista (or Windows 2000 though this is not our target).
I've spent the last week working on a GDI leak finder tool. We also perform regular stress testing and it never lasted longer than a day's worth w/o stopping due to user/gdi object handle overconsumption.
My attempts have been pretty successful as far as I can tell. Of course, I spent some time beforehand looking for an alternative and quicker solution. It is worth mentioning, I had some previous semi-lucky experience with the GDILeaks tool from msdn article mentioned above. Not to mention that i had to solve a few problems prior to putting it to work and this time it just didn't give me what and how i wanted it. The downside of their approach is the heavyweight debugger interface (it slows down the researched target by orders of magnitude which I found unacceptable). Another downside is that it did not work all the time - on some runs I simply could not get it to report/compute anything! Its complexity (judging by the amount of code) was another scare-away factor. I'm not a big fan of GUIs, as it is my belief that I'm more productive with no windows at all ;o). I also found it hard to make it find and use my symbols.
One more tool I used before setting on to write my own, was the leakbrowser.
Anyways, I finally settled on an iterative approach to achieve following goals:
minor performance penalties
implementation simplicity
non-invasiveness (used for multiple products)
relying on as much available as possible
I used detours (non-commercial use) for core functionality (it is an injectible DLL). Put Javascript to use for automatic code generation (15K script to gen 100K source code - no way I code this manually and no C preprocessor involved!) plus a windbg extension for data analysis and snapshot/diff support.
To tell the long story short - after I was finished, it was a matter of a few hours to collect information during another stress test and another hour to analyze and fix the leaks.
I'll be more than happy to share my findings.
P.S. some time did I spend on trying to improve on the previous work. My intention was minimizing false positives (I've seen just about too many of those while developing), so it will also check for allocation/release consistency as well as avoid taking into account allocations that are never leaked.
Edit: Find the tool here
There was a MSDN Magazine article from several years ago that talked about GDI leaks. This points to several different places with good information.
In WinDbg, you may also try the !poolused command for some information.
Finding resource leaks in from a crash dump (post-mortem) can be difficult -- if it was always the same place, using the same variable that leaks the memory, and you're lucky, you could see the last place that it will be leaked, etc. It would probably be much easier with a live program running under the debugger.
You can also try using Microsoft Detours, but the license doesn't always work out. It's also a bit more invasive and advanced.
I have created a Windbg script for that. Look at the answer of
Command to get GDI handle count from a crash dump
To track the allocation stack you could set a ba (Break on Access) breakpoint past the last allocated GDICell object to break just at the point when another GDI allocation happens. That could be a bit complex because the address changes but it could be enough to find pretty much any leak.

Resources