Crash dump collection and performance impact - Windows

Currently, I am searching for a solution that would allow me to monitor a .NET Windows Service application in a production environment and collect memory dumps. I'd like to collect them based on specific thresholds, on demand, and on application crashes. I am aware of various methods to achieve this, such as:
1. DebugDiag
2. ProcDump
3. Windows Error Reporting (WER)
4. ADPlus
5. WinDbg, etc.
Some of these methods facilitate collection during a crash, such as option #3, while others can trigger based on performance counters, such as #1 and #2. Any non-invasive debugger can help me achieve the collection, but I am not sure what the performance impact of having one attached is. For example, if I use ProcDump with the -e switch to collect memory dumps on unhandled exceptions, what would be the overhead for the monitored application? Bear in mind that I am referring to a production environment.
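To make the ProcDump scenario concrete, these are the kinds of invocations I have in mind (flags as I understand them from the Sysinternals documentation; "MyService.exe" is a placeholder - verify against "procdump -?" for your version):

```shell
rem Full dump on an unhandled exception; -w waits for the process to start
procdump -ma -e -w MyService.exe C:\dumps\

rem Threshold-based trigger: full dump when CPU stays above 80% for 10 seconds
procdump -ma -c 80 -s 10 MyService.exe C:\dumps\
```

The first form is the one whose monitoring overhead I am asking about, since ProcDump attaches as a debugger to see the exception events.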
I'd be grateful if you could point me to a source or method that explains the performance impact of attaching a non-invasive debugger for memory dump collection. Ideally that would be a quantitative measure, although that may exceed my expectations.
P.S.: I am not referring to the time needed for the memory dump to be written to disk, during which the application is completely frozen. That is a separate concern.

Related

How do you log all garbage collection events in CLR/.Net?

I'm looking for an equivalent of java -verbose:gc or any tool or code snippet to get the same information. Ideally, this would be something I could run in an unattended fashion on a service and log everything to a file. My use case is profiling GC-induced latency in a long-running service.
For non-invasive .NET GC profiling you have a few options: CLR Memory performance counters, CLR Memory event tracing (ETW), or a memory profiler (SciTech Memory Profiler has a nice command-line tool that allows you to collect CLR profiling data in a production environment - other .NET profilers probably expose a similar feature).
I suppose that performance counters are the least invasive method, but they don't give you detailed information on how the GC works - though you can see how many collections were performed (in each generation) as well as how much time your process spent in GC. To collect this information you may use perfmon, typeperf, or PowerShell (I once described different ways of using performance counters, so you may have a look: http://lowleveldesign.wordpress.com/2012/04/19/diagnosing-applications-using-performance-counters/)
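A minimal sketch of the typeperf approach, assuming English counter names and a process instance called "MyService" (adjust both for your machine):

```shell
rem Sample the CLR GC counters once a second and log them to a CSV file
typeperf "\.NET CLR Memory(MyService)\# Gen 0 Collections" ^
         "\.NET CLR Memory(MyService)\# Gen 2 Collections" ^
         "\.NET CLR Memory(MyService)\% Time in GC" ^
         -si 1 -o gc-counters.csv
```

This can run unattended for as long as you like, which matches the "log everything to a file" requirement in the question.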
ETW events provide much more detail on the GC's inner workings. You can configure the ETW provider manually (using logman or xperf, for example) or use the excellent tool PerfView (as @Marc pointed out in a comment). If you are only interested in GC events, check the "GC Only" checkbox in the Collect window.
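For the manual route, here is a sketch of enabling the CLR runtime provider with just the GC keyword via logman (provider name and keyword value are assumed from the CLR ETW documentation - verify them with "logman query providers"):

```shell
rem Start a trace session with the .NET runtime provider,
rem GC keyword (0x1) at verbose level (0x5), writing to clrgc.etl
logman start clrgc -p "Microsoft-Windows-DotNETRuntime" 0x1 0x5 -ets -o clrgc.etl

rem ... run the workload you want to observe ...

rem Stop the session; open the .etl file in PerfView or WPA afterwards
logman stop clrgc -ets
```

PerfView does essentially this for you under the hood when "GC Only" is checked.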
There is a great episode of Defrag Tools dedicated to CLR GC profiling (part 4): http://channel9.msdn.com/Shows/Defrag-Tools/Defrag-Tools-36-CLR-GC-Part-4. I also recommend checking out the other parts, as well as reading the PerfView documentation. PerfView is a really powerful tool and it even allows you to analyse the .NET heap and compare memory snapshots.
The last option (using a memory profiler) is probably the most invasive of the three methods, but it sometimes gives you even more detail on the GC heaps (especially when you would like to analyse object graphs). I can't think of any good free GC memory profiler, so you will probably need to pay for one of those tools. I have some experience with SciTech Memory Profiler (it's pretty good and, as I mentioned earlier, it has a command-line client that allows you to collect data in production). I also tried the Visual Studio memory profiler - it's not bad, but less powerful than the SciTech one. Finally, JetBrains and RedGate also sell memory profilers which are well known among .NET developers and probably comparable to SciTech's.

Windows Performance Recorder: record a specific process

Using the Windows Performance Recorder, is it possible to generate an ETL file based on the tracing of a single process? Recording all of the processes in the system results in ETL files measured in GBs for intervals as short as a couple of minutes.
ETW kernel-event tracing is system-wide and captures all processes. I don't think it is possible to record ETW traces for just one process (at least not with xperf or wpr). If your traces are too big, the best tactic is to make sure that the rest of the system is as quiet as possible so that it doesn't contribute too much data.
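For reference, a minimal WPR capture looks like this; the resulting trace is still system-wide, and filtering down to one process happens afterwards in WPA, not at capture time:

```shell
rem Start recording with the built-in CPU usage profile
wpr -start CPU

rem ... reproduce the scenario you want to analyse ...

rem Stop recording and save the (system-wide) trace
wpr -stop trace.etl
```

Keeping the capture window short is the other main lever for keeping the file size down.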
If the rest of the system is already quiet then the traces are probably big because ETW traces tend to be big. You can use trace compression to make them smaller on disk - see UIforETW for how this works - https://randomascii.wordpress.com/2015/09/24/etw-central/.
If the rest of the system is not already quiet then yes, it probably is contributing to bloat in the traces. Note that it may also be affecting performance, so that data is not irrelevant.
And if you really do need single-process profiling, consider using a different profiler. The Visual Studio profiler does per-process profiling.

Win32 threads synchronization/performance monitoring tools

Would the Intel VTune Thread Profiler be able to tell if thread synchronization was successful? I've never profiled an application before; where do I start?
What exactly are you trying to profile or measure?
If you are trying to protect a critical resource from being accessed by two or more threads at the same time, then use a synchronization primitive such as a mutex, critical section, or slim reader/writer (SRW) lock, and surround the writes to the critical resource with it.
If you are trying to figure out if there is any lock contention, then I believe the profilers out there will surely be able to help you out. I've never used the Intel profiler myself, so I can't say how well it works. The new tools in VS2010 (http://code.msdn.microsoft.com/VS2010ppa) are a great way to figure out problems in your code if your project is VS-based.
I can probably help out a little more if you provide more details.

Is a thread's stack reported as memory used in Task Manager?

My coworkers and I are trying to track down a memory issue in an application, and in my research I found a blog entry that talks about how each thread gets a 1 MB stack by default. Our application happens to create a lot of threads, so we wrote a quick test program to make sure we understood exactly what was happening. The test app (C#) just creates 300 threads, but Task Manager still only shows 22 MB of memory. Is stack memory not counted by Task Manager, or is something else going on?
Task Manager is not the best tool for determining memory consumption. Instead, download the free trial of a tool like MemProfiler or Red Gate's memory profiler.
Don't use the Mem Usage column in Task Manager for diagnostics or profiling. Use the Perfmon counters, especially Private Bytes and the specific .NET counters, which will reveal problems like memory leaks.
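A sketch of watching Private Bytes alongside the managed heap with typeperf (English counter names and the instance name "MyApp" are assumptions - substitute your process name):

```shell
rem Sample committed private memory and the total managed heap size every 5 seconds
typeperf "\Process(MyApp)\Private Bytes" ^
         "\.NET CLR Memory(MyApp)\# Bytes in all Heaps" ^
         -si 5
```

Note that thread stacks are mostly *reserved* address space, not committed memory, which is why Private Bytes (and Task Manager) can look small even with hundreds of threads.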
Might also be of interest: Memory Usage Auditing For .NET Applications

What could cause the application as well as the system to slowdown?

I am debugging an application which slows down the system very badly. The application loads a large amount of data (some 1000 files, each about half a MB) from the local hard disk. The files are loaded as memory-mapped files and are mapped only when needed. This means that at any given point in time the virtual memory usage does not exceed 300 MB.
I also checked the handle count using handle.exe from Sysinternals and found that at most some 8000-odd handles are open. When the data is unloaded it drops to around 400. There are no handle leaks after each load/unload operation.
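For anyone repeating this check, a per-type handle summary can be taken before and after each cycle (the "-s" summary flag is as I recall it from the Sysinternals handle documentation; "MyDataApp.exe" is a placeholder):

```shell
rem Print a count of open handles by type for the target process
handle -s -p MyDataApp.exe
```

Comparing two such snapshots makes leaks in a specific handle type (file, section, event, ...) easy to spot.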
After 2-3 load/unload cycles, during one load, the system becomes very slow. I checked the virtual memory usage of the application as well as the handle count at this point, and both were well within limits (VM about 460 MB with not much fragmentation, handle count 3200).
I want to understand how an application could make the system very slow to respond. What other tools can I use to debug this scenario?
Let me be more specific: when I say system, I mean all of Windows slows down. Task Manager itself takes 2 minutes to come up, and most often a hard reboot is required.
The fact that the whole system slows down is very annoying; it means you cannot attach a profiler easily, and it would even be difficult to stop the profiling session in order to view the results (since you said it requires a hard reboot).
The tool best suited for this situation is ETW (Event Tracing for Windows). These tools are great and will give you the exact answer you are looking for. Check them out here:
http://msdn.microsoft.com/en-us/library/cc305210.aspx
http://msdn.microsoft.com/en-us/library/cc305221.aspx
http://msdn.microsoft.com/en-us/performance/default.aspx
Hope this works.
Tools you can use at this point:
Perfmon
Event Viewer
In my experience, when things happen to a system that prevent Task Manager from even popping up, they're usually of the hardware variety - the system event log in Event Viewer is sometimes full of warnings or errors that some hardware device is timing out.
If Event Viewer doesn't indicate that any kind of loggable hardware error is causing the slowdown, then try Perfmon - add counters for system objects to track file reads, exceptions, context switches, etc. per second and see if there's something obvious there.
Frankly the sort of behavior demonstrated is meant to be impossible - by design - for user-mode code to cause. WinNT goes to a lot of effort to insulate applications from each other and prevent rogue applications from making the system unusable. So my suspicion is some kind of hardware fault is to blame. Is there any chance you can simply run the same test on a different PC?
If you don't have profilers, you may have to do the same work by hand...
Have you tried commenting out all read/write operations, just to check whether the slowdown disappears?
"Divide and conquer" strategies will help you find where the problem lies.
If you run it under an IDE, run it until it gets real slow, then hit the "pause" button. You will catch it in the act of doing whatever takes so much time.
You can use tools like IBM Rational Quantify or Intel VTune to detect performance issues.
[EDIT]
Like Benoît suggested, one good approach is measuring how long each task takes, to identify which one is eating CPU.
But remember, since you are working with many files, it is likely that page misses are what cause the memory-to-disk swapping.
When Task Manager is taking 2 minutes to come up, are you getting a lot of disk activity, or is it CPU-bound?
I would try Process Explorer from Sysinternals. When your system is in the slowed-down state, try running, say, Notepad, and pay attention to the page fault deltas.
Windows is very greedy about caching file data. I would try removing file I/O as someone suggested, and also making sure you close the file mapping as soon as you are done with a file.
I/O is probably causing your slowdown, especially if your files are on the same disk as the OS. Another way to test that would be to move your files to another disk and see if that alleviates the problem.
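If you go the ETW route suggested above, a kernel trace that includes disk I/O and hard faults should show whether the mapped-file reads are what is saturating the disk (kernel flag names as I recall them - check "xperf -providers k" on your machine):

```shell
rem Capture process/image info plus disk I/O, file I/O and hard page faults
xperf -on PROC_THREAD+LOADER+DISK_IO+FILE_IO+FILE_IO_INIT+HARD_FAULTS

rem ... reproduce the slowdown ...

rem Merge and save the trace for analysis in WPA or xperfview
xperf -d slowdown.etl
```

Hard faults are especially relevant here, since touching a memory-mapped file page that is not resident shows up as a hard fault backed by disk I/O.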
