I've been using SetUnhandledExceptionFilter for a long time, and my handler walks the stack and uses dbghelp.dll to convert the addresses into file/line references. It then writes that to a log file and puts up a dialog with the same information for the user. This USED to work just fine. These days, however, I'm getting a completely useless stack:
1004bbaa: Lgid.dll, C:\Data\Code\Lgi\trunk\src\win32\Lgi\LgiException.cpp:175
10057de0: Lgid.dll, C:\Data\Code\Lgi\trunk\src\win32\Lgi\GApp.cpp:107
7c864191: kernel32.dll, UnhandledExceptionFilter+0x1c7
102158ed: MSVCRTD.dll, winxfltr.c:228
006dc1a7: Scribe.exe, crtexe.c:345
7c817077: kernel32.dll, RegisterWaitForInputIdle+0x49
00000000: Scribe.exe
Where 'Scribe.exe' is my application. Now, if I walk back up the stack in the debugger from the exception handler, after several frames I eventually get to a completely different temporary stack that actually includes all the calls that led up to the crash, which is the information I actually want to log for the user. It's as if the exception handler is executing on a separate stack from the main application.
What I need is the stack information for the actual application stack, that includes all the calls leading up to the crash. Is there some easy way to get that from inside the exception handler?
According to http://www.eptacom.net/pubblicazioni/pub_eng/except.html I can get the exception's EIP and EBP out of the EXCEPTION_POINTERS 'Context' member. So I tried passing that EBP to my stack walker as its initial point, and it could then walk the application stack correctly. As long as I put the EIP as the first point in the stack walk, I get the whole thing.
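The idea above can be sketched in portable C as a walk over the saved-EBP chain. This is a minimal model and not the poster's actual handler: `StackImage`, `read_word` and `walk_from_context` are hypothetical names, and a plain array stands in for process memory. In a real filter the two seed values would presumably come from the `Eip` and `Ebp` fields of the x86 `ContextRecord`, and the walk assumes the code was compiled with frame pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a 32-bit stack: words[i] lives at address base + 4*i. */
typedef struct {
    uint32_t base;          /* lowest address covered by the snapshot */
    const uint32_t *words;  /* stack contents, one 32-bit word per slot */
    size_t count;           /* number of words in the snapshot */
} StackImage;

/* Read one word; returns 0 on success, -1 if addr is unaligned or outside the image. */
static int read_word(const StackImage *s, uint32_t addr, uint32_t *out)
{
    size_t idx;
    if (addr < s->base || (addr - s->base) % 4 != 0) return -1;
    idx = (addr - s->base) / 4;
    if (idx >= s->count) return -1;
    *out = s->words[idx];
    return 0;
}

/* Walk the saved-EBP chain starting from the exception context.
   pcs[0] is the faulting EIP itself; every later entry is the return
   address stored at [ebp + 4] of the next frame out. Returns the number
   of entries written. */
static size_t walk_from_context(const StackImage *s, uint32_t eip, uint32_t ebp,
                                uint32_t *pcs, size_t max_pcs)
{
    size_t n = 0;
    assert(pcs != NULL);
    if (max_pcs == 0) return 0;
    pcs[n++] = eip;                       /* first point of the walk */
    while (n < max_pcs) {
        uint32_t next_ebp, ret;
        if (read_word(s, ebp, &next_ebp) != 0) break;      /* saved EBP */
        if (read_word(s, ebp + 4, &ret) != 0) break;       /* return address */
        if (ret == 0) break;
        pcs[n++] = ret;
        if (next_ebp <= ebp) break;       /* outer frames sit at higher addresses */
        ebp = next_ebp;
    }
    return n;
}
```

Seeding the walk with the handler's own EBP instead would start from the filter's frames, which matches the useless trace shown in the question.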
Are you using x64? Could you be hitting http://blog.paulbetts.org/index.php/2010/07/20/the-case-of-the-disappearing-onload-exception-user-mode-callback-exceptions-in-x64/ ?
Related
On Linux I get nice, healthy, full stack traces. On Windows, however, when something crashes (like a segfault or access violation), I only get the top one or two lines of the stack, followed by the entry 0x0 (which I cannot expand). This makes it very hard to debug.
You should probably start using WinDbg to debug your program instead of an IDE like Eclipse. It is a very powerful command-driven tool, and its functionality is very similar to GDB's.
On Windows, the UnhandledExceptionFilter function is called when no exception handler is defined to handle the exception that is raised. The function typically passes the exception up to Ntdll.dll, which catches and tries to handle it.
The EXCEPTION_POINTERS structure, which is passed as a parameter to the above function, contains the most useful information about what the exception is and where it occurred. The .exr and .cxr commands in WinDbg use this information to get the complete stack trace.
typedef struct _EXCEPTION_POINTERS {
    PEXCEPTION_RECORD ExceptionRecord;
    PCONTEXT ContextRecord;
} EXCEPTION_POINTERS, *PEXCEPTION_POINTERS;
ExceptionRecord: A pointer to an EXCEPTION_RECORD structure that contains a machine-independent description of the exception.
ContextRecord: A pointer to a CONTEXT structure that contains a processor-specific description of the state of the processor at the time of the exception.
For the complete steps on how to get a full backtrace and analysis from a dump file (like GDB) or a debug session, you may want to read and follow the steps in the following link:
http://support.microsoft.com/kb/313109
I'm working on a native fiber/coroutine implementation – fairly standard, for each fiber, a separate stack is allocated, and to switch contexts, registers are pushed onto the source context stack and popped from the target stack. It works well, but now I hit a little problem:
I need SEH to work within a fiber (it's okay if the program terminates or strange things start to happen when an exception propagates unhandled past the fiber's last stack frame; that won't happen). Just saving/restoring FS:[0] (along with FS:[4] and FS:[8], obviously) during the context switch and initially setting FS:[0] for newly allocated fibers to 0xFFFFFFFF (so that the exception handler set after the context switch will be the root of the chain) almost works.
To be precise, it works on all non-server Windows OSes I tested – the problem is that Windows Server 2008 and 2008 R2 have the exception chain validation (SEHOP, SEH overwrite protection) feature enabled by default, which makes RaiseException check if the original handler (somewhere in ntdll.dll) is still the root of the chain, and immediately terminates the program as if no handlers were installed otherwise.
Thus, I'm facing the problem of constructing an appropriate root frame on the stack to keep the validation code happy. Are there any (hidden?) API functions I can call to do that, or do I have to figure out what is needed to keep RtlDispatchException and friends happy and construct the appropriate _EXCEPTION_REGISTRATION entry myself? I can't just reuse the Windows-supplied one from the creating thread because it would be at the wrong address (the SEH implementation also checks if the handler address is in the boundaries given by FS:[4] and FS:[8], and possibly also if the address order is consistent).
Oh, and I'd strongly prefer not to resort to the CreateFiber WinAPI family of functions.
The approach I mentioned in the comments, generating a fake EXCEPTION_REGISTRATION entry pointing to ntdll!FinalExceptionHandler seems to work in practice indeed – at least, that's what we have in the D runtime now, and so far there have been no reports of problems:
https://github.com/D-Programming-Language/druntime/blob/c39de42dd11311844c0ef90953aa65f333ea55ab/src/core/thread.d#L4027
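To make the idea of a fake root record more concrete, here is a rough model in portable C. It is not the ntdll logic and not the druntime code: it only mimics the checks described in the question (records inside the FS:[8]..FS:[4] range, increasing addresses, and a final record whose handler is the expected one). `FiberStack`, `place_root_record` and `chain_is_valid` are hypothetical names, and the handler address used with them is just a placeholder for whatever ntdll!FinalExceptionHandler resolves to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHAIN_END 0xFFFFFFFFu   /* the "no next record" marker from the question */

/* Hypothetical model of a fiber stack: words[i] lives at base + 4*i.
   base plays the role of FS:[8] (stack limit) and base + 4*count the
   role of FS:[4] (stack base). */
typedef struct {
    uint32_t base;
    uint32_t *words;
    size_t count;
} FiberStack;

/* Write a root record {Next = CHAIN_END, Handler = final_handler} into the
   two highest words of the stack and return its address, which would be
   the initial FS:[0] value for the new fiber. */
static uint32_t place_root_record(FiberStack *s, uint32_t final_handler)
{
    size_t idx;
    assert(s->count >= 2);
    idx = s->count - 2;
    s->words[idx] = CHAIN_END;          /* Next */
    s->words[idx + 1] = final_handler;  /* Handler */
    return s->base + (uint32_t)(4 * idx);
}

/* Simplified stand-in for the chain validation: returns 1 if the chain
   starting at head stays inside the stack, moves toward higher addresses,
   and ends in a record whose handler equals final_handler. */
static int chain_is_valid(const FiberStack *s, uint32_t head, uint32_t final_handler)
{
    uint32_t top, rec = head;
    if (s->count < 2) return 0;
    top = s->base + (uint32_t)(4 * s->count);
    for (;;) {
        size_t idx;
        uint32_t next, handler;
        if (rec < s->base || rec > top - 8 || (rec - s->base) % 4 != 0) return 0;
        idx = (rec - s->base) / 4;
        next = s->words[idx];
        handler = s->words[idx + 1];
        if (next == CHAIN_END) return handler == final_handler;
        if (next <= rec) return 0;      /* address order must be consistent */
        rec = next;
    }
}
```

In this model, a fiber whose first application record points straight at 0xFFFFFFFF fails the check, while the same record linked to the placed root passes, which is the difference between the original approach and the fake-root approach.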
I'm trying to find a stack overflow in a project on MSP430, and found that it occurs mainly when an IRQ occurs after the stack is pretty full.
I've set a breakpoint on a stack pointer write with a value that is smaller than the start address of the stack, and the CPU halts in the IRQ handler.
The call stack display in IAR C-SPY then terminates at the handler function, however I'd be interested in what is below this, as this is what filled the stack.
Is there a way to display the call stack below the current interrupt handler?
If the interrupt handler is written in C, this should work correctly, as the generated CFI (call frame information) should be correct even for interrupt functions.
However, if this (for some reason) should not work, or if the interrupt routine is written in assembler (without proper CFI directives), you can use a little trick. You can manually modify the PC and SP registers in the register window by retrieving the PC from the stack and by "backing up" the SP the amount that it was adjusted inside the function. After this, the debugger will display the function that was executing when the interrupt occurred.
Note that in the traditional MSP430 core, the PC is stored as a plain 16-bit value. In the MSP430X core, however, the 20-bit PC is somewhat intertwined with the status register on the stack; see the architecture manual for details.
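The arithmetic behind that trick can be sketched as follows for the traditional 16-bit core, where interrupt entry pushes the PC and then the SR as two plain 16-bit words. This is only an illustration under stated assumptions: `StackSnapshot`, `InterruptedState` and `recover_interrupted_state` are hypothetical names, the memory is a plain array, and `handler_bytes` (how far the handler has moved SP since entry) has to be read off the handler's disassembly by hand. It does not apply to the MSP430X layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of RAM around the stack: words[i] lives at base + 2*i. */
typedef struct {
    uint16_t base;
    const uint16_t *words;
    size_t count;
} StackSnapshot;

typedef struct {
    uint16_t pc;   /* address that was about to execute when the interrupt hit */
    uint16_t sr;   /* status register saved on interrupt entry */
    uint16_t sp;   /* SP value before the interrupt was accepted */
} InterruptedState;

/* sp_now:        current SP inside the handler
   handler_bytes: bytes the handler itself pushed or reserved since entry
   Returns 0 on success, -1 if the saved words fall outside the snapshot. */
static int recover_interrupted_state(const StackSnapshot *s, uint16_t sp_now,
                                     uint16_t handler_bytes, InterruptedState *out)
{
    uint32_t sr_addr = (uint32_t)sp_now + handler_bytes; /* SR was pushed last */
    uint32_t pc_addr = sr_addr + 2;                      /* PC was pushed first */
    size_t sr_idx, pc_idx;
    assert(out != NULL);
    if (sr_addr < s->base || (sr_addr - s->base) % 2 != 0) return -1;
    sr_idx = (sr_addr - s->base) / 2;
    pc_idx = sr_idx + 1;
    if (pc_idx >= s->count) return -1;
    out->sr = s->words[sr_idx];
    out->pc = s->words[pc_idx];
    out->sp = (uint16_t)(pc_addr + 2);   /* SP "backed up" past PC and SR */
    return 0;
}
```

The `pc` and `sp` fields are the values one would type into the register window so that the debugger shows the function that was interrupted.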
I've the hellish problem of a third party DLL appearing to cause a recursive stack overflow crash when it gets unloaded. I wind up with this pattern on the stack (using windbg):
<Unloaded_ThirdParty.dll>+0xdd01
ntdll!ExecuteHandler2+0x26
ntdll!ExecuteHandler+0x24
ntdll!KiUserExceptionDispatcher+0xf
<Unloaded_ThirdParty.dll>+0xdd01
ntdll!ExecuteHandler2+0x26
ntdll!ExecuteHandler+0x24
ntdll!KiUserExceptionDispatcher+0xf
...
As you would guess, I don't have the source code to ThirdParty.dll.
Q: What does the prefix "Unloaded_" mean in the stack dump? I haven't run across this before.
This means that ThirdParty.dll was no longer being referenced and has already been removed from memory at the time that the crash occurs. To find out the actual stack trace, you need to reload the .dll at its original place in memory with the following command:
.reload /f ThirdParty.dll=0xaaaaaaaa
Of course you need to replace 0xaaaaaaaa with the original base address of the module. This may be somewhat hard to figure out if the module has already been unloaded, but if you have an HMODULE lying around that refers to the dll, the value of that HMODULE is the base address. Worst case, you can add a debugger trace statement to your code that logs the HMODULE of the dll just before you unload it.
I've had a crash just like this before, and as JS points out it means that the dll has been unloaded prior to the crash. However, having the stack trace into that dll may not necessarily give you the information you need to diagnose the problem.
Something in your code is unloading the library because it thinks it's finished with it, but you still have a pointer to it (or to a function inside it) somewhere. My guess would be a callback, perhaps from another thread. I'd suggest searching through your source for any calls to FreeLibrary() and also putting a breakpoint on the FreeLibrary symbol. Find out where the library is being unloaded, and then at that point, ensure that all data that references the dll has been reset. Use a mutex if you have multiple threads.
A tool that may be very useful for this is the excellent Process Monitor which I think shows you dll load and unload events, and will give you a stack trace for each one.
Is there a way to view the register contents in each stack frame in a crash dump?
The registers window seems to contain the registers when the exception occurred but it would be useful to be able to see their contents in each stack frame.
Depending on the calling convention, you can get some of the registers which are saved on the stack. For example, in the cdecl calling convention, all of the registers except for EAX, ECX, and EDX must be preserved across a call, so a function that wants to use one of them has to save it first. Those three registers are clobberable, so you generally won't be able to get their values from higher up in the call stack. If a function doesn't use a register that must be saved, then it won't save it, but since it doesn't use it, that register has the same value in the next higher stack frame.
After doing some research and thinking about this a bit, I realized that it is probably not possible. A crash minidump saves certain areas of process memory (depending on the flags passed to the MiniDumpWriteDump() function) and enough state information to re-create the environment where the crash happened in a debugger. It does not have the processor state at each instruction or even at each stack frame; it only knows about the processor state when the exception occurred.
In optimized builds, it's true that some information down the stack may get tossed, however, you can ask the debugger to try and show you the information for a given stack frame. First do "kn" to see the stack with frame numbers, then try ".frame /c [frame]" or ".frame /r [frame]".
Check out the help (".hh") for more information.
I don't think you can get it either when debugging. The only value you can get from registers is their value at the current instruction.