I'm debugging an intermittent problem in which an application (created using C++ in Visual Studio 2005) is faulting. The event log provides the following information:
faulting module msvcr80.dll
version 8.0.50727.1433
fault address 0x00008aa0
I did a Google search and found many other examples of applications crashing with this particular fault address, but no indication of what it means.
Is there any way to find out what msvcr80.dll is doing at this address?
I tried attaching to a running instance of the application from Visual Studio to see what code is located at 0x00008aa0 -- but there doesn't seem to be anything there!
More generally, given an address somewhere in a Windows DLL, is there a way to figure out what the code is doing?
Windows will never map anything to addresses lower than 0x10000, so you are definitely AV'ing.
Googling myself, someone suggested using dependency walker to find out which module you're using that is directly dependent on msvcr80.dll -- since you are using VS 2005.
That might give you a clue where to start isolating the bug.
Address this low usually indicates a null pointer access violation. The offset of the member access accessed to the base pointer is 8aa0. Looks like a pretty large object. I would suggest you add null-asserts when you dereference pointers to objects of large data type.
You can try to use Microsoft debug symbols, in this case you will see normal function name instead of address.
In VS2005 you should do:
Go to Tools -> Options -> Debugging -> Symbols
Insert http://msdl.microsoft.com/download/symbols as a symbol location
Attach VS to your app instance and repeat the crash
Related
I have a completely random error popping up on a particular piece of software out in the field. The application is a game written in VB6 and is running on Windows 7 64-bit. Every once in a while, the app crashes, with a generic "program.exe has stopped responding" message box. This game can run fine for days on end until this message appears, or within a matter of hours. No exception is being thrown.
We run this app in Windows 2000 compatibility mode (this was its original OS), with visual themes disabled, and as an administrator. The app itself is purposely simple in terms of using external components and API calls.
References:
Visual Basic for Applications
Visual Basic runtime objects and procedures
Visual Basic objects and procedures
OLE Automation
Microsoft DAO 3.51 Object Library
Microsoft Data Formatting Object Library
Components:
Microsoft Comm Control 6.0
Microsoft Windows Common Controls 6.0 (SP6)
Resizer XT
As you can see, these are pretty straightforward, Microsoft-standard tools, for the most part. The database components exist to interact with an Access database used for bookkeeping, and the Resizer XT was inserted to move this game more easily from its original 800x600 resolution to 1920x1080.
There is no networking enabled on the kiosks; no network drivers, and hence no connections to remote databases. Everything is encapsulated in a single box.
In the Windows Application event log, when this happens, there is an Event ID 1000 faulting a seemingly random module -- so far, either ntdll.dll or lpk.dll. In terms of API calls, I don't see any from ntdll.dll. We are using kernel32, user32, and winmm, for various file system and sound functions. I can't reproduce as it is completely random, so I don't even know where to start troubleshooting. Any ideas?
EDIT: A little more info. I've tried several different versions of Dependency Walker, at the suggestion of some other developers, and the latest version shows that I am missing IESHIMS.dll and GRPSVC.dll (these two seems to be well-known bugs in Depends.exe), and that I have missing symbols in COMCTRL32.dll and IEFRAME.dll. Any clues there?
The message from the application event log isn't that useful - what you need is a post mortem process dump from your process - so you can see where in your code things started going wrong.
Every time I've seen one of these problems it generally comes down to a bad API parameter rather than something more exotic, this may be caused by bad data coming in, but usually it's a good ol fashioned bug that causes the problem.
As you've probably figured already this isn't going to be easy to debug; ideally you'd have a repeatable failure case to debug, instead of relying on capturing dump files from a remote machine, but until you can make it repeatable remote dumps are the only way forwards.
Dr Watson used to do this, but is no longer shipped, so the alternatives are:
How to use the Userdump.exe tool to create a dump file
Sysinternals ProcDump
Collecting User-Mode dumps
What you need to get is a minidump, these contain the important parts of the process space, excluding standard modules (e.g. Kernel32.dll) - and replacing the dump with a version number.
There are instructions for Automatically Capturing a Dump When a Process Crashes - which uses cdb.exe shipped with the debugging tools, however the crucial item is the registry key \\HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\AeDebug
You can change your code to add better error handling - especially useful if you can narrow down the cause to a few procedures and use the techniques described in Using symbolic debug information to locate a program crash. to directly process the map files.
Once you've got a minidump and the symbol files WinDbg is the tool of choice for digging into these dumps - however it can be a bit of a voyage to discover what the cause is.
The only other thing I'd consider, and this depends on your application structure, is to attempt to capture all input events for replay.
Another option is to find a copy of VMWare 7.1 which has replay debugging and use that as the first step in capturing a reproducible set of steps.
Right click your executable object and let it be WINXP compatible pending
when you discover source of the problem to finally solve it
I have a customer who is getting a 100% reproduceable crash that I can't replicate in my program compiled in Visual Studio 2005. I sent them a debug build of my program and kept all the PDB and DLL files handy. They sent me the minidump file, but when I open it I get:
"Unhandled exception at 0x00000000 in MiniDump.dmp: 0xC0000005: Access violation reading location 0x00000000."
Then the call stack shows only "0x00000000()" and the disassembly shows me a dump of the memory at 0x0. I've set up the symbol server, loaded my PDB symbols, etc. But I can't see any way of knowing which of the many DLLs actually caused the jump to null. This is a large project with many dependencies, and some of them are binaries that I don't have the source or PDBs for, as I am using an API as a 3rd party.
So how on earth is this minidump useful? How do I see which DLL caused the crash? I've never really used minidumps for debugging before, but all the tutorials I have read seem to at least display a function name or something else that gives you a clue in the call stack. I just get the one line pointing to null.
I also tried using "Depends" to see if there was some DLL dependency that was unresolved; however on my three test machines with various Windows OS's, I seem to get three different sets of OS DLL dependencies (and yet can't replicate the crash); so this doesn't seem a particularly reliable method either to diagnose the problem.
What other methods are available to determine the cause of this problem? Is there some way to step back one instruction to see which DLL jumped to null?
Well it looks like the answer in this instance was "Use WinDbg instead of Visual Studio for debugging minidumps". I couldn't get any useful info out of VS, but WinDbg gave me a wealth of info on the chain of function calls that led to the crash.
In this instance it still didn't help solve my problem, as all of the functions were in the 3rd party library I am using, so it looks like the only definitive answer to my specific problem is to use log files to trace the state of my application that leads to the crash.
I guess if anyone else sees a similar problem with an unhelpful call stack when debugging a minidump, the best practice is to open it with WinDgb rather than Visual Studio. Seems odd that the best tool for the job is the free Microsoft product, not the commerical one.
The other lesson here is probably "any program that uses a third party library needs to write a log file".
The whole idea behind all 'simple' ways of post mortem debugging is the capture of a stack trace. If your application overwrites the stack there is no way for such analysis. Only very sophisticated methods, that record the whole program execution in dedicated hardware could help.
The way to go in such a case are log files. Spread some log statements very wide around the area where the fault occurs and transmit that version to the customer. After the crash you'll see the last log statement in your log file. Add more log statements between that point and the next log statement that has not been recorded in the log file, ship that version again. Repeat until you found the line causing the problem.
I wrote a two part article about this at ddj.com:
About Log Files Part 1
About Log Files Part 2
Just an observation, but the the stack is getting truncated or over-written, might this be a simple case of using an uninitialised field, or perhaps a buffer overrun ?
That might be fairly easy to locate.
Have you tried to set WinDbg on a customer's computer and use it as a default debugger for any application that causes a crash? You just need to add pdb files to the folder where your application resides. When a crush happens WinDbg starts and you can try to get call stack.
Possibly you already know this, but here are some points about minidump debugging:
1. You need to have exactly the same executables and PDB files, as on the client computer where minidump was created, and they should be placed exactly in the same directories. Just rebuilding the same version doesn't help.
2. Debugger must be connected to MS Symbols server.
3. When debugger starts, it prints process loading log in the Output window. Generally, all libraries should be successfully loaded with debug information. Libraries without debug information are loaded as well, but "no debug info" is printed. Learn this log - it can give you some information.
If executable stack contains frames from a library without debug information, it may be not shown. This happens, for example, if your code is running as third-party library callback.
Try to create minidump on your own computer, by adding some code which creates unhandled exception, and debug it immediately. Does this work? Compare loading log in successful and unsuccessful debugging sessions.
You may have called null function pointer. Current executing function information is needed to show call stack information. Force set instruction pointer to start of any simple function, then you'll see call stack information again.
void SimpleFunc()
{ // <- set next statement here
}
An error occure saying:
Unhandled exception at 0xfeeefeee in sgdoc.exe: 0xC0000005: Access violation reading location 0xfeeefeee.
While transferring application from VC++ 6.0 to Visual Studio 2005.
Please help me out.
Your program is trying to read from the address 0xFEEEFEEE.
Since this pattern is the marking free'd memory, your program is probably trying to access a pointer inside a struct that has already been deleted.
You will have to let your program crash with a debugger attached and look what is going on before the crash happens.
Without more data I'll venture 2 guesses.
some memory that was previously initialized with 0 is left uninitialized or is initialized with something else. Check your memory allocation and usage patterns.
I'm assuming you're using mfc, maybe something changed in the implementation since vc6. Try taking chunks out of your class declarations and see if you can identify the problem like that.
What would help more is the lines around the place where the crash happens, and any related declarations that pertain to that bit of code.
Every now and then (ahem...) my code crashes on some system; quite often, my users send screenshots of Windows crash dialogs. For instance, I recently received this:
Unhandled win32 exception # 0x3a009598 in launcher2g.exe:
0xC00000005: Access violation writing location 0x00000000.
It's clear to me (due to the 0xc0000005 code as well as the written out error message) that I'm following a null pointer somewhere in my launcher2g.exe process. What's not clear to me is the significance of the '0x3a009598' number. Is this the code offset in the process' address space where the assembler instruction is stored which triggered the problem?
Under the assumption that 0x3a000000 is the position where the launcher2g.exe module was loaded into the process, I used the Visual Studio debugger to check the assembler code at 0x3a009598 but unfortunately that was just lots of 'int 3' instructions (this was a debug build, so there's lots of int 3 padding).
I always wondered how to make the most of these # 0x12345678 numbers - it would be great if somebody here could shed some light on it, or share some pointers to further explanations.
UPDATE: In case anybody finds this question in the future, here's a very interesting read I found which explains how to make sense of error messages as the one I quoted above: Finding crash information using the MAP file.
0x3a009598 would be the address of the x86 instruction that caused the crash.
The EXE typically gets loaded at its preferred load address - usually 0x04000000 iirc. So its probably bloody far away from 0x3a009598. Some DLL loaded by the process is probably located at this address.
Crash dumps are usually the most useful way to debug this kind of thing if you can get your users to generate and send them. You can load them with Visual Studio 2005 and up and get automatic symbol resolution of system dlls.
Next up, the .map files produced by your build process should help you determine the offending function - assuming you do manage to figure out which exe/dll module the crash was inside, and what its actual load address was.
On XP users can use DrWatsn32 to produce and send you crash dumps. On Vista and up, Windows Error Reporting writes the crash dumps to c:\users\\AppData\Local\Temp*.mdmp
A client is running my company's program and it is halting before it gets anywhere. They sent this information from the Windows Event Log:
faulting module program.exe, version 1.2.3.4, fault address 0x00054321.
We don't have much else to go on so as a last ditch effort I've been trying to see if I can find where that position is in a disassembler. I run the program through Visual Studio, pause it, look at the Disassembly window and try scrolling to that address but all I get there is this:
00054321 ???
00054322 ???
00054323 ???
00054324 ???
00054325 ???
00054326 ???
00054327 ???
00054328 ???
00054329 ???
0005432A ???
Would this be because Visual Studio only disassembles part of the EXE near the pause position or something? It's hard for me to look through how much is actually disassembled because the scrollbar doesn't work fully. (I can't grab and move the scroll position; I have to scroll by line or by page.)
Thanks for any insight you may have!
The fault address could also be caused from a stack corruption problem, ie. the return address could be compromised and jumped back to the wrong address # 0x54321.
Also, depending on the tecnology used (Java, .NET) the code could change it's position between runs.
Visual studio makes a disassembly of the whole process space. ???? means that the position is not accessible.
You'd better need a stack-frame to see what's happening, from a core dump.
WinDbg may be your friend here, there you can load your executable and the symbols (.pdb), if you can get a (mini)dump as QbProg says that would definitely ease the search. But I have had experiences when it was easier doing this in WinDbg.
What are you expecting to see in the disassembly window? This approach is not going to work. If you are able to rebuild the exact same build configuration of that your client is running then you can enable the /MAP option in the project's link options. This will create a file that maps symbols to addresses and will allow you to see which function was executing when the crash occurred. You may have to do a bit of calculation to offset the raw mapped address against the address the module was loaded at on the client's PC.
As Fredrik says, WinDbg may be able to help too, especially if you can get a crash dump from your client's PC.