Automatically create Visual C++ crash dump - visual-studio

Is there a way to create a crash dump file automatically on application crash (on Windows OS), same as if I could save with Visual Studio debugger attached? That is, I want to be able to debug my application in Visual Studio with automatically created crash dump files.

Update: Debug Diag 2.0 is now available. This release has native support for .NET dumps.
Yes it is possible using DebugDiag 1.2.
The Debug Diagnostic Tool (DebugDiag) is designed to assist in
troubleshooting issues such as hangs, slow performance, memory leaks
or fragmentation, and crashes in any user-mode process. The tool
includes additional debugging scripts focused on Internet Information
Services (IIS) applications, web data access components, COM+ and
related Microsoft technologies.
It even allows you to run crash/hang analysis on the dump and give you a nice report about the callstack and thread that are in sleep state (for hang dumps). It also allows you to take on the fly dumps too. The dumps taken by DebugDiag can be analyzed in DebugDiag, Visual Studio and WinDbg.
EDIT: The MSDN link to how to use DebugDiag is here.

Use SetUnhandledExceptionFilter to catch exceptions. And in this global exception handler, use MiniDumpWriteDump function to create a dump file.
There is a lot around exception handling this way, like you won't be able to catch all exceptions (exception from STL, pure-virtual function, abort C runtime call etc). Also, you may be in problem if more than one thread causes some exception. You either need to suspend all other running threads when you get exception in your global exception handler, or use some logic so that other threads won't interfere with your dump-generation code.
To handle all cases, you need to tweak around linker settings (like /EHsc flag), so that ALL exceptions can be handled by try-catch, enable debugging information even for release build so that .PDB is generated and you can get call stack. Use API hooking, so that C-runtime calls won't disable your global-exception handler and lot!
See these:
SetUnhandledExceptionFilter and the C/C++ Runtime Library
Own Crash Minidump with Call Stack
My only recommendation is you start with simpler approach, and don't bother about more complex scenarios. In short, just use SetUnhandledExceptionFilter, in a single threaded application, cause an Access Violation, and in global-exception-handler, use MinidumpWriteDump to create a MINI dump (i.e. without memory-dump).

Related

How does the Visual Studio Attach to Process work?

I've always wanted to know the inner-workings of Visual Studio's debugger and debuggers in general. How does it communicate and control your code, especially when it's running inside a host process or an external network server (attach to process)? Does the compiler or linker patch your code with callbacks so that the debugger is given control? If it indeed works this way, how do interpreted languages such as JavaScript containing no debug code work?
Generally speaking, Windows provides an API for writing debuggers that let you examine and modify memory in another process and to get notifications when exceptions happen in another process.
The debug process sits in a loop, waiting for notification from events from the process under inspection. To set a breakpoint, the debugger process modifies the code in the debugee to cause an exception (typically, an int 3 instruction for x86).
The compiler and linker work together to make the symbol information about a program available in a format that can be read by debuggers. On Windows, that's typically CodeView in a separate PDB file.
In the Unix-derived world, there's an API called ptrace that does essentially the same sorts of things as Windows debugging API.
For remote debugging, a small program is placed on the remote machine that communicates with and acts on behalf of the actual debugger running on the local machine.
For interpreted languages, like JavaScript, the debugger works with the interpreter to give the same sorts of functionality (inspecting memory, setting breakpoints, etc.).
Windows includes support for debuggers. A process has to enable debugger privilege, and once this is done that process can attach to any other process and debug it using windows debugger functions
http://msdn.microsoft.com/en-us/library/windows/desktop/ms679303(v=vs.85).aspx
For something like javascript, seems like you would need the equivalent of a javascript debugger.
In the case of a Visual Studio multi-process project, you typically have to switch which process the debugger is attached to in order to debug that process. I don't know if there's a way to have pending breakpoints set for multiple processes at the same time. There could be other debuggers that work better with multiple processes, but I haven't used such a tool.

How to debug a GPF crash?

I have an old VB6 app that uses a lot of 3rd party components, not just visual but also for audio processing, tcp/udp, VoIP, etc...
When I run the app as an EXE (e.g. not in the VB6 IDE), it will crash sometimes with a GPF. It happens after the program has been running for several days and occurs when there is no one around.
Unfortunately the user has clipped the screenshot, but it typically doesn't have any useful information anyway. The description of the crash reported that the crash occurred in ntdll.dll.
My questions:
What tools do I need in order to debug this?
How do I actually get started?
I have the PDB files from the compilation in VB6. The app is compiled to Optimize for Fast Code. What can I do with them in this situation?
I would use ntsd or windbg (link), and run the app under either of the user mode debuggers (if you're not familiar, they have the same commands - ntsd is a console debugger, while windbg is a GUI debugger). Both have a lot of command line options, but ntsd appname.exe will be enough to get started. Use the .sympath command to point the debugger toward the symbol, and you should be on your way. When the crash occurs, you can examine variables and stack in order to figure out what may be missing.
BTW - the error above is an invalid handle error, so the program probably passed a stale or NULL handle to a windows function. The debugger will tell you more.

Windows 7 and VB6: Event Error ID 1000

I have a completely random error popping up on a particular piece of software out in the field. The application is a game written in VB6 and is running on Windows 7 64-bit. Every once in a while, the app crashes, with a generic "program.exe has stopped responding" message box. This game can run fine for days on end until this message appears, or within a matter of hours. No exception is being thrown.
We run this app in Windows 2000 compatibility mode (this was its original OS), with visual themes disabled, and as an administrator. The app itself is purposely simple in terms of using external components and API calls.
References:
Visual Basic for Applications
Visual Basic runtime objects and procedures
Visual Basic objects and procedures
OLE Automation
Microsoft DAO 3.51 Object Library
Microsoft Data Formatting Object Library
Components:
Microsoft Comm Control 6.0
Microsoft Windows Common Controls 6.0 (SP6)
Resizer XT
As you can see, these are pretty straightforward, Microsoft-standard tools, for the most part. The database components exist to interact with an Access database used for bookkeeping, and the Resizer XT was inserted to move this game more easily from its original 800x600 resolution to 1920x1080.
There is no networking enabled on the kiosks; no network drivers, and hence no connections to remote databases. Everything is encapsulated in a single box.
In the Windows Application event log, when this happens, there is an Event ID 1000 faulting a seemingly random module -- so far, either ntdll.dll or lpk.dll. In terms of API calls, I don't see any from ntdll.dll. We are using kernel32, user32, and winmm, for various file system and sound functions. I can't reproduce as it is completely random, so I don't even know where to start troubleshooting. Any ideas?
EDIT: A little more info. I've tried several different versions of Dependency Walker, at the suggestion of some other developers, and the latest version shows that I am missing IESHIMS.dll and GRPSVC.dll (these two seems to be well-known bugs in Depends.exe), and that I have missing symbols in COMCTRL32.dll and IEFRAME.dll. Any clues there?
The message from the application event log isn't that useful - what you need is a post mortem process dump from your process - so you can see where in your code things started going wrong.
Every time I've seen one of these problems it generally comes down to a bad API parameter rather than something more exotic, this may be caused by bad data coming in, but usually it's a good ol fashioned bug that causes the problem.
As you've probably figured already this isn't going to be easy to debug; ideally you'd have a repeatable failure case to debug, instead of relying on capturing dump files from a remote machine, but until you can make it repeatable remote dumps are the only way forwards.
Dr Watson used to do this, but is no longer shipped, so the alternatives are:
How to use the Userdump.exe tool to create a dump file
Sysinternals ProcDump
Collecting User-Mode dumps
What you need to get is a minidump, these contain the important parts of the process space, excluding standard modules (e.g. Kernel32.dll) - and replacing the dump with a version number.
There are instructions for Automatically Capturing a Dump When a Process Crashes - which uses cdb.exe shipped with the debugging tools, however the crucial item is the registry key \\HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\AeDebug
You can change your code to add better error handling - especially useful if you can narrow down the cause to a few procedures and use the techniques described in Using symbolic debug information to locate a program crash. to directly process the map files.
Once you've got a minidump and the symbol files WinDbg is the tool of choice for digging into these dumps - however it can be a bit of a voyage to discover what the cause is.
The only other thing I'd consider, and this depends on your application structure, is to attempt to capture all input events for replay.
Another option is to find a copy of VMWare 7.1 which has replay debugging and use that as the first step in capturing a reproducible set of steps.
Right click your executable object and let it be WINXP compatible pending
when you discover source of the problem to finally solve it

Fastest way to break in WinDbg for specific exception? .net 4.0 app

Folks,
Debugging a .net 4.0 app using WinDbg (I'm a beginner to WinDbg). I'm trying to break when I hit a stack overflow:
(NTSTATUS) 0xc00000fd – A new guard page for the stack cannot be created
Unfortunately, this overflow happens about 2-hours into a long-running process and logs tells me that it doesn't always happen at the same time/place. If I attach to the process in the debugger, the program runs terribly slow...it might take a few days to hit the bug! Is there a way to speed up the app/WinDbg by telling WinDbg to ONLY break for this particular error?
You can instruct ADPLus to create dumps of the process when exceptions occur. John Robbins has a good article on the subject. You can then use WinDbg to debug the dump file(s).
Be aware, that the original adplus.vbs has been replaced by adplus.exe, which is supposed to provide the same functionality. In my experience there are a few problems with the new implementation, so you may need to use the old script, which is still available as adplus_old.vbs.
Usually, attaching a debugger would not slow down an application too much (compared to starting the application from the debugger, which will set the heap in debug mode).
But by default, the debugger will trace events (exceptions and OutputDebugString), and in your case, there may be too much of them. After attaching with the debugger, you can disable all exception handling. (Menu Debug/Event Filters, or command sxi). You have to change the handling for all events (sxi * means unknown events, and does not apply to all events). You can also disable all tracing with .outmask-0xFFFFFFFF. Then enable only the stack overflow event with sxe -c ".outmask /d" sov

How does a debugger work?

I keep wondering how does a debugger work? Particulary the one that can be 'attached' to already running executable. I understand that compiler translates code to machine language, but then how does debugger 'know' what it is being attached to?
The details of how a debugger works will depend on what you are debugging, and what the OS is. For native debugging on Windows you can find some details on MSDN: Win32 Debugging API.
The user tells the debugger which process to attach to, either by name or by process ID. If it is a name then the debugger will look up the process ID, and initiate the debug session via a system call; under Windows this would be DebugActiveProcess.
Once attached, the debugger will enter an event loop much like for any UI, but instead of events coming from the windowing system, the OS will generate events based on what happens in the process being debugged – for example an exception occurring. See WaitForDebugEvent.
The debugger is able to read and write the target process' virtual memory, and even adjust its register values through APIs provided by the OS. See the list of debugging functions for Windows.
The debugger is able to use information from symbol files to translate from addresses to variable names and locations in the source code. The symbol file information is a separate set of APIs and isn't a core part of the OS as such. On Windows this is through the Debug Interface Access SDK.
If you are debugging a managed environment (.NET, Java, etc.) the process will typically look similar, but the details are different, as the virtual machine environment provides the debug API rather than the underlying OS.
As I understand it:
For software breakpoints on x86, the debugger replaces the first byte of the instruction with CC (int3). This is done with WriteProcessMemory on Windows. When the CPU gets to that instruction, and executes the int3, this causes the CPU to generate a debug exception. The OS receives this interrupt, realizes the process is being debugged, and notifies the debugger process that the breakpoint was hit.
After the breakpoint is hit and the process is stopped, the debugger looks in its list of breakpoints, and replaces the CC with the byte that was there originally. The debugger sets TF, the Trap Flag in EFLAGS (by modifying the CONTEXT), and continues the process. The Trap Flag causes the CPU to automatically generate a single-step exception (INT 1) on the next instruction.
When the process being debugged stops the next time, the debugger again replaces the first byte of the breakpoint instruction with CC, and the process continues.
I'm not sure if this is exactly how it's implemented by all debuggers, but I've written a Win32 program that manages to debug itself using this mechanism. Completely useless, but educational.
In Linux, debugging a process begins with the ptrace(2) system call. This article has a great tutorial on how to use ptrace to implement some simple debugging constructs.
If you're on a Windows OS, a great resource for this would be "Debugging Applications for Microsoft .NET and Microsoft Windows" by John Robbins:
http://www.amazon.com/dp/0735615365
(or even the older edition: "Debugging Applications")
The book has has a chapter on how a debugger works that includes code for a couple of simple (but working) debuggers.
Since I'm not familiar with details of Unix/Linux debugging, this stuff may not apply at all to other OS's. But I'd guess that as an introduction to a very complex subject the concepts - if not the details and APIs - should 'port' to most any OS.
I think there are two main questions to answer here:
1. How the debugger knows that an exception occurred?
When an exception occurs in a process that’s being debugged, the debugger gets notified by the OS before any user exception handlers defined in the target process are given a chance to respond to the exception. If the debugger chooses not to handle this (first-chance) exception notification, the exception dispatching sequence proceeds further and the target thread is then given a chance to handle the exception if it wants to do so. If the SEH exception is not handled by the target process, the debugger is then sent another debug event, called a second-chance notification, to inform it that an unhandled exception occurred in the target process. Source
2. How the debugger knows how to stop on a breakpoint?
The simplified answer is: When you put a break-point into the program, the debugger replaces your code at that point with a int3 instruction which is a software interrupt. As an effect the program is suspended and the debugger is called.
Another valuable source to understand debugging is Intel CPU manual (Intel® 64 and IA-32 Architectures
Software Developer’s Manual). In the volume 3A, chapter 16, it introduced the hardware support of debugging, such as special exceptions and hardware debugging registers. Following is from that chapter:
T (trap) flag, TSS — Generates a debug exception (#DB) when an attempt is
made to switch to a task with the T flag set in its TSS.
I am not sure whether Window or Linux use this flag or not, but it is very interesting to read that chapter.
Hope this helps someone.
My understanding is that when you compile an application or DLL file, whatever it compiles to contains symbols representing the functions and the variables.
When you have a debug build, these symbols are far more detailed than when it's a release build, thus allowing the debugger to give you more information. When you attach the debugger to a process, it looks at which functions are currently being accessed and resolves all the available debugging symbols from here (since it knows what the internals of the compiled file looks like, it can acertain what might be in the memory, with contents of ints, floats, strings, etc.). Like the first poster said, this information and how these symbols work greatly depends on the environment and the language.

Resources