How to debug a GPF crash? - vb6

I have an old VB6 app that uses a lot of 3rd party components, not just visual but also for audio processing, tcp/udp, VoIP, etc...
When I run the app as an EXE (e.g. not in the VB6 IDE), it will crash sometimes with a GPF. It happens after the program has been running for several days and occurs when there is no one around.
Unfortunately the user has clipped the screenshot, but it typically doesn't have any useful information anyway. The description of the crash reported that the crash occurred in ntdll.dll.
My questions:
What tools do I need in order to debug this?
How do I actually get started?
I have the PDB files from the compilation in VB6. The app is compiled to Optimize for Fast Code. What can I do with them in this situation?

I would use ntsd or windbg (link), and run the app under either of the user mode debuggers (if you're not familiar, they have the same commands - ntsd is a console debugger, while windbg is a GUI debugger). Both have a lot of command line options, but ntsd appname.exe will be enough to get started. Use the .sympath command to point the debugger toward the symbol, and you should be on your way. When the crash occurs, you can examine variables and stack in order to figure out what may be missing.
BTW - the error above is an invalid handle error, so the program probably passed a stale or NULL handle to a windows function. The debugger will tell you more.

Related

How does the Visual Studio Attach to Process work?

I've always wanted to know the inner-workings of Visual Studio's debugger and debuggers in general. How does it communicate and control your code, especially when it's running inside a host process or an external network server (attach to process)? Does the compiler or linker patch your code with callbacks so that the debugger is given control? If it indeed works this way, how do interpreted languages such as JavaScript containing no debug code work?
Generally speaking, Windows provides an API for writing debuggers that let you examine and modify memory in another process and to get notifications when exceptions happen in another process.
The debug process sits in a loop, waiting for notification from events from the process under inspection. To set a breakpoint, the debugger process modifies the code in the debugee to cause an exception (typically, an int 3 instruction for x86).
The compiler and linker work together to make the symbol information about a program available in a format that can be read by debuggers. On Windows, that's typically CodeView in a separate PDB file.
In the Unix-derived world, there's an API called ptrace that does essentially the same sorts of things as Windows debugging API.
For remote debugging, a small program is placed on the remote machine that communicates with and acts on behalf of the actual debugger running on the local machine.
For interpreted languages, like JavaScript, the debugger works with the interpreter to give the same sorts of functionality (inspecting memory, setting breakpoints, etc.).
Windows includes support for debuggers. A process has to enable debugger privilege, and once this is done that process can attach to any other process and debug it using windows debugger functions
http://msdn.microsoft.com/en-us/library/windows/desktop/ms679303(v=vs.85).aspx
For something like javascript, seems like you would need the equivalent of a javascript debugger.
In the case of a Visual Studio multi-process project, you typically have to switch which process the debugger is attached to in order to debug that process. I don't know if there's a way to have pending breakpoints set for multiple processes at the same time. There could be other debuggers that work better with multiple processes, but I haven't used such a tool.

AdPlus & WinDbg: difference between taking a dump with AdPlus and with WinDbg?

The task - when the application crashes, it is required to find the crash cause.
I saw recommendations to take the crash dump with AdPlus and then load it into WinDbg for analysis.
What I do is attach WinDbg to a process and wait for the program crash to debug once WinDbg shows the exception.
Is there any advantages in using AdPlus instead of directly attaching WinDbg to the process?
In your case, there's no advantage in creating a dump using AdPlus. If you can attach WinDbg and debug on the target machine, having the complete heap at hand, that's the best you can get.
In general, AdPlus is merely a VB script that wraps CDB, which is a console debugger. When you use it, CDB effectively debugs your program, the same way WinDbg does. The gain you get from using AdPlus is the easy configuration and notification options. Also, since it's designed to create dumps, it does that nicely - creates a per dump folder etc. But that's just convenience - as far as your basic need of finding the bug goes, in your case I'd stick with WinDbg.
I would say ADPlus is only better for non-technical person.
For developers, load process into WinDbg is much more convenient.

Visual C++: Difference between Start with/without debugging in Release mode

What is the difference between Start Debugging (F5) and Start without Debugging (CTRL-F5) when the code is compiled in Release mode?
I am seeing that CTRL-F5 is 10x faster than F5 for some C++ code. If I am not wrong, the debugger is attached to the executing process for F5 and it is not for CTRL-F5. Since this is Release mode, the compiled code does not have any debugging information. So, if I do not have any breakpoints, the execution times should be the same across the two, isn't it?!
(Assume that the Release and Debug modes are the typical configurations you get when you create a new Visual C++ project.)
The problem is that Windows drops in a special Debug Heap, if it detects that your program is running under a Debugger. This appears to happen at the OS level, and is independent of any Debug/Release mode settings for your compilation.
You can work around this 'feature' by setting an environment variable: _NO_DEBUG_HEAP=1
This same issue has been driving me nuts for a while; today I found the following, from whence this post is derived:
http://blogs.msdn.com/b/larryosterman/archive/2008/09/03/anatomy-of-a-heisenbug.aspx
"Start without debugging" just tells Windows to launch the app as it would normally run.
"Start with debugging" starts the VS debugger and has it run the app within the debugger.
This really doesn't have much to do with the debug/release build settings.
When you build the default 'debug' configuration of your app, you'll have the following main differences to the release build:
The emitted code won't be optimised, so is easier to debug because it more closely matches your source
The compiler & linker will output a .PDB file containing lots of extra information to help a debugger - the presence or absence of this information makes no difference to the performance of the code, just the ease of debugging.
Conditional macros like ASSERT and VERIFY will be no-ops in a release build but active in a debug build.
Each one of these items is independent and optional! You can turn any or all of them on or off and still run the code under the debugger, you just won't find life so easy.
When you run 'with debugging' things perform differently for several reasons:
The VS debugger is very inefficient at starting, partly because everything in VS is slow - on versions prior to VS2010 every pixel of the screen will be repainted about 30 times as the IDE staggers into debug mode with much flashing and flickering.
Depending on how things are configured, the debugger might spend a lot of time at startup trying to load symbols (i.e. PDB files) for lots and lots of OS components which are part of your process - it might try fetching these files over the web, which can take an age in some circumstances.
A number of activities your application normally does (loading DLLs, starting threads, handling exceptions) all cause the debugger to be alerted. This has the effect both of slowing them down and of making them tend to run sequentially.
IsDebuggerPresent() and OutputDebugString() behave differently depending on whether a debugger is attached.
IsDebuggerPresent() simply returns another value, so your program can react to this value and behave differently on purpose. OutputDebugString() returns much faster when there's no debugger attached, so if it's called lots of times you'll see that the program runs much faster without the debugger.
When running 'with debugging' the debug heap is used, even for release mode. This causes severe slowdowns in code using a lot of malloc/free or new/delete, which can happen in C++ code without you noticing it because libraries/classes tend to hide this memory management stuff from you.

Debugging a minidump in Visual Studio where the call stack is null

I have a customer who is getting a 100% reproduceable crash that I can't replicate in my program compiled in Visual Studio 2005. I sent them a debug build of my program and kept all the PDB and DLL files handy. They sent me the minidump file, but when I open it I get:
"Unhandled exception at 0x00000000 in MiniDump.dmp: 0xC0000005: Access violation reading location 0x00000000."
Then the call stack shows only "0x00000000()" and the disassembly shows me a dump of the memory at 0x0. I've set up the symbol server, loaded my PDB symbols, etc. But I can't see any way of knowing which of the many DLLs actually caused the jump to null. This is a large project with many dependencies, and some of them are binaries that I don't have the source or PDBs for, as I am using an API as a 3rd party.
So how on earth is this minidump useful? How do I see which DLL caused the crash? I've never really used minidumps for debugging before, but all the tutorials I have read seem to at least display a function name or something else that gives you a clue in the call stack. I just get the one line pointing to null.
I also tried using "Depends" to see if there was some DLL dependency that was unresolved; however on my three test machines with various Windows OS's, I seem to get three different sets of OS DLL dependencies (and yet can't replicate the crash); so this doesn't seem a particularly reliable method either to diagnose the problem.
What other methods are available to determine the cause of this problem? Is there some way to step back one instruction to see which DLL jumped to null?
Well it looks like the answer in this instance was "Use WinDbg instead of Visual Studio for debugging minidumps". I couldn't get any useful info out of VS, but WinDbg gave me a wealth of info on the chain of function calls that led to the crash.
In this instance it still didn't help solve my problem, as all of the functions were in the 3rd party library I am using, so it looks like the only definitive answer to my specific problem is to use log files to trace the state of my application that leads to the crash.
I guess if anyone else sees a similar problem with an unhelpful call stack when debugging a minidump, the best practice is to open it with WinDgb rather than Visual Studio. Seems odd that the best tool for the job is the free Microsoft product, not the commerical one.
The other lesson here is probably "any program that uses a third party library needs to write a log file".
The whole idea behind all 'simple' ways of post mortem debugging is the capture of a stack trace. If your application overwrites the stack there is no way for such analysis. Only very sophisticated methods, that record the whole program execution in dedicated hardware could help.
The way to go in such a case are log files. Spread some log statements very wide around the area where the fault occurs and transmit that version to the customer. After the crash you'll see the last log statement in your log file. Add more log statements between that point and the next log statement that has not been recorded in the log file, ship that version again. Repeat until you found the line causing the problem.
I wrote a two part article about this at ddj.com:
About Log Files Part 1
About Log Files Part 2
Just an observation, but the the stack is getting truncated or over-written, might this be a simple case of using an uninitialised field, or perhaps a buffer overrun ?
That might be fairly easy to locate.
Have you tried to set WinDbg on a customer's computer and use it as a default debugger for any application that causes a crash? You just need to add pdb files to the folder where your application resides. When a crush happens WinDbg starts and you can try to get call stack.
Possibly you already know this, but here are some points about minidump debugging:
1. You need to have exactly the same executables and PDB files, as on the client computer where minidump was created, and they should be placed exactly in the same directories. Just rebuilding the same version doesn't help.
2. Debugger must be connected to MS Symbols server.
3. When debugger starts, it prints process loading log in the Output window. Generally, all libraries should be successfully loaded with debug information. Libraries without debug information are loaded as well, but "no debug info" is printed. Learn this log - it can give you some information.
If executable stack contains frames from a library without debug information, it may be not shown. This happens, for example, if your code is running as third-party library callback.
Try to create minidump on your own computer, by adding some code which creates unhandled exception, and debug it immediately. Does this work? Compare loading log in successful and unsuccessful debugging sessions.
You may have called null function pointer. Current executing function information is needed to show call stack information. Force set instruction pointer to start of any simple function, then you'll see call stack information again.
void SimpleFunc()
{ // <- set next statement here
}

How do I analyse a BSOD and the error information it will provide me?

Well, fortunately I haven't written many applications that cause a BSOD but I just wonder about the usefullness of the information on this screen. Does it contain any useful information that could help me to find the error in my code? If so, what do I need, exactly?
And then, the system restarts and probably has written some error log or other information to the system somewhere. Where is it, what does it contain and how do I use it to improve my code?
I did get a BSOD regularly in the past when I was interacting with a PBX system where the amount of documentation of it's drivers were just absent, so I had to do some trial-and-error coding. Fortunately, I now work for a different company and don't see any BSOD's as a result of my code.
If you want a fairly easy way to find out what caused an OS crash that will work ~90% of the time - assuming you have a crash dump available - then try the following:
Download WinDbg as part of the Debugging tools for Windows package. Note, you only need to install the component called Debugging Tools for Windows.
Run WinDbg
Select "Open Crash Dump" from the file menu
When the dump file has loaded type analyze -v and press enter
WinDbg will do an automated analysis of the crash and will provide a huge amount of information on the system state at the time of the crash. It will usually be able to tell you which module was at fault and what type of error caused the crash. You should also get a stack trace that may or may not be helpful to you.
Another useful command is kbwhich prints out a stack trace. In that list, look for a line contains .sys. This is normally the driver which caused the crash.
Note that you will have to configure symbols in WinDbg if you want the stack trace to give you function names. To do this:
Create a folder such as C:\symbols
In WinDbg, open File -> Symbol File Path
Add: SRV*C:\symbols*http://msdl.microsoft.com/download/symbols
This will cache symbol files from Microsoft's servers.
If the automated analysis is not sufficient then there are a variety of commands that WinDbg provides to enable you to work out exactly what was happening at the time of the crash. The help file is a good place to start in this scenario.
Generally speaking, you cannot cause a OS crash or bug check from within your application code. That said, if you are looking for general tips and stuff, I recommend the NTDebugging blog. Most of the stuff is way over my head.
What happens when the OS crashes is it will write a kernel dump file, depending on the current flags and so on, you get more or less info in it. You can load up the dump file in windbg or some other debugger. Windbg has the useful !analyze command, which will examine the dump file and give you hints on the bucket the crash fell into, and the possible culprits. Also check the windbg documentation on the general cause of the bug check, and what you can do to resolve it.

Resources