Symbols, in Microsoft Debugging Tools for Windows? - debugging

What is the need/use of 'symbols' in the Microsoft debugger?
I spent some time trying to figure out the debugger a while back and never was able to get it making any sense (I was trying to debug a server hang...). Part of my problem was not having the proper 'symbols'.
What are they? And why would I need them? Aren't I just looking for text?
Are there any better links out there to using it than How to solve Windows system crashes in minutes ?

You need symbols in order to translate addresses into meaningful names. For example, you have locations on your stack at every function call:
0x00003791
0x00004a42
Symbols allows the debugger to map these addresses to methods
0x00003791 myprog!methodnamea
0x00004a42 myprog!methodnameb
When you build a debug version of a program, the compiler emits symbols with the extension .PDB. It also contains line information so you can do source code debugging, etc..
You need to set your symbol search path correctly for the debugger to pick this up. IN the command window you can do
.sympath c:\symbols;c:\temp\symbols
in order to have it search for the .PDB in these directories. It will also look in the same directory that the executable is ran from.
It also might be helpful to use the Microsoft public symbols server so that you can resolve OS binaries such as NTDLL, GDI, etc.. with this path at the beginning:
.sympath SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols;c:\symbols
You will need to create c:\websymbols first.

On the Windows binary architecture, the information needed for debugging (function names, file and line numbers, etc.) aren't present in the binary itself. Rather, they're collected into a PDB file (Program DataBase, file extension .pdb), which the debugger uses to correlate binary instructions with the sorts of information you probably use while debugging.
So in order to debug a server hang, you need the PDB file both for the server application itself, and optionally for the Windows binaries that your server is calling into.
As a general note, my experience with WinDbg is that it was much, much harder to learn how to use compared to GDB, but that it had much greater power once you understood how to use it. (The opposite of the usual case with Windows/Linux tools, interestingly.)

If you just have the binary file, the only info you can typically get is the stack trace, and maybe the binary or IL(in .NET) instructions. Having the symbols lets you actually match that binary/IL instruction up with a corresponding line in the source code. If you have the source code, it also lets you hook up the debugger in Visual Studio and step through the source code.

Related

How is WinDbg used, what exactly is it, and does it relate to .dmp files?

In the past, I have heard references to parsing .dmp files using WinDbg (I think - I might be wrong).
I have also done fairly extensive debugging with the help of .map files, and I have done extensive debugging using standard logical heuristics and the Visual Studio debugger.
However, occasionally, the program I am developing crashes and creates a .dmp file. I have never been able to interpret the .dmp file. A while ago, I posted a SO question regarding how to interpret .dmp files ( How to view .dmp file on Windows 7? ), but after somewhat significant effort I was unable to figure out how to interpret .dmp files using the answer to that question.
Today, I was viewing an unrelated SO question ( C++ try/throw/catch => machine code ), and a useful comment underneath the accepted answer has, once again, made reference to WinDbg.
If you really want to find this out though, it's easy - just trace
through it in WinDbg
I would like to follow this advice. However, for me, it's not easy to "just trace through it in WinDbg". I've tried in the past and can't figure out what exactly this means or what to do!
So, I'm trying again. "For once and for all", I would like to have plain-and-simple instructions regarding:
What is WinDbg
Assuming WinDbg is related to .dmp files, what exactly is a dump file and how does it relate to WinDbg (and correct me if my assumption is wrong)
How do you create .dmp files and, correspondingly, how do you use WinDbg to analyze them (again, correct me if I'm wrong about the relationship between WinDbg and .dmp files).
If you could please answer this question from the "starting point" of a programmer who ONLY has Visual Studio installed and running.
Thanks!
WinDbg is a multipurpose debugger. It can debug a live process by attaching, set breakpoints, etc like you would with any other debugger. It can also analyze crash dump files as well, which are .dmp files. It functions by you giving it commands.
A .dmp file is a memory dump of something. What that something is depends on what the memory dump is for. It could be for a process, for example. It could also be for the kernel. What is in the memory dump depends, too. In your case, it's probably what your process looked like at the time of it crashing. What the memory dump contains can vary depending on the dump type.
There are various ways. On Windows Vista+, Server 2008+ - you can do it right from the task manager. Right click the process, and click "Create Memory Dump". WinDbg can make a memory dump from a live process too by using the .dump command. Other tools like adplus can be used to automatically create a memory dump on certain conditions, like when a process exceeds a memory or CPU threshold, or when it crashes.
WinDbg can open a Crash Dump easily enough. What is important is that you get your symbols loaded correctly, first. Usually in the form of .pdb files or from a symbol server (though not necessary, or always possible, it is greatly helpful).
Once you have WinDbg running, take a look at the list of commands available to poke around in your crash dump.
WinDbg is a Gui version of the command line debugger cdb.exe, both are user-process and kernel mode debuggers, it uses DbgHelp.dll to issue commands to your application or NT kernel (you can also do the same as it has an api).
.Dmp files are memory dumps of varying detail, some can have minimal detail enough for call stacks of all threads, whilst others will put the entire user-mode memory, handle information, thread information, memory information etc.. see this for more info. So dump files have nothing to do with WinDbg, other than it can open them, incidentally you can open .dmp files in Visual Studio
Like #vcsjones has already stated you can do this using task manager (at least you can from Vista onwards), you can use procdump, you can do this once WinDbg is attached, I usually do a full mini dump like this: .dump /ma c:\mem.dmp, you can also set Windows to do this when a crash happens using Dr. Watson
However, you must have the symbols for Windows and your application in order to be able to generate sensible call stacks, note that for obvious reasons you cannot step through or set breakpoints in a a memory dump, you can only do this for a live process. You can also have WinDbg attach non invasively, so Visual Studio could be attached and you can attach WinDbg non invasively and use the toolset in WinDbg to assist debugging.
For me the main advantage of WinDbg is its free, it is a small download and install, it is fast, it has a very rich toolset for diagnosing problems that are either difficult or impossible to do using visual studio.

WinDbg: APPLICATION_HANG_WRONG_SYMBOLS

I'm pretty new to WinDbg and I'm trying to find a bug which has my application hanging for no appearent reason. I'm not sure I'm doing things right, but I understand that I need both symbols for the system dlls aswell as the .exe I'm debugging. Thus, I set up my symbol path like this:
srv*c:\websymbols*http://msdl.microsoft.com/download/symbols;S:\MY\PATH
The second path pointing to a folder where I placed the .pdb that was generated by VS. I'm positive that's the correct .pdb file, but it was built on a different architecture (not sure if this is an issue). I would like to see a complete stack trace for a start, so I ran !analyze-v. The output looks like this. As you can see it lists APPLICATION_HANG_WRONG_SYMBOLS as primary problem. So I ran .reload /f, giving me this output. I have no symbols for dnAnalytics or Vertec.Interop, so these errors make sense, but there are some missing checksums and iphlpapi.pdb was not found.
So my questions would be: Why does WinDBG lists wrong symbols as primary problem, even though I'm positive I do have the right .pdb file available? (I'm running WinDBG on the same machine the dump was generated). To what extent can I trust the stack trace even thought my symbols are wrong, appearently? Does anyone see a obvious problem that could cause a hang in my application from the stack trace already? Any pointers appreciated!
The "wrong symbols" here is probably because you are using 64-bit CLR with version less than 4.0 and the !analyze extension has some trouble decoding the mixed native / managed stack.
Why is WinDBG looking for the BJM.exe
symbols on the microsoft server in
this case?
This is because you put the symbol server before you local path in the symbol path. Windbg does not know which module is yours and which is Microsoft's. It just looks for the PDB file for the module in the order speicified by the symbol path.
To what extent can I trust the stack trace even thought my symbols are wrong, appearently?
Stacks on x64 are very reliable since the stack-walk does not require symbols. Symbols are reliable (that is you do not have wrong symbols) unless you forced windbg to ignore wrong timestamp / checksum with .reload /f /i
In some cases the address -> symbol may seem wrong. This is usually due to small functions that have the same code (very common in C++ code if the functions are virtual or the code is not optimized)
Try using !sym noisy to get some more information about what it's actually looking at (docs). !itoldyouso may also be useful here (link)

How to extract stack traces from minidumps?

I've got a whole bunch of minidumps which were recorded during the runtime of an application through MiniDumpWriteDump. The minidumps were created on a machine with a different OS version than my development machine.
Now I'm trying to write a program to extract stack traces from the minidumps, using dbghelp.dll. I'm walking the MINIDUMP_MODULE_LIST and call SymLoadModule64, but this fails to download the pdbs (kernel32 etc.) from the public symbol server. If I add "C:\Windows\System32" to the symbol path it finds the dlls and downloads the symbols, but of course they don't match the dlls from the minidump, so the results are useless.
So how do I tell dbghelp.dll to download and use the proper pdbs?
[edit]
I forgot to state that SymLoadModule64 only takes a filename and no version/checksum information, so obviously with SymLoadModule64 alone it's impossible for dbghelp to figure out which pdb to download.
The information is actually available in the MINIDUMP_MODULE_LIST but I don't know how to pass it back to the dbghelp API.
There is SymLoadModuleEx which takes additional parameters, but I have no idea if that's what I need or what I should pass for the additional parameters.
[edit]
No luck so far, though I've noticed there's also dbgeng.dll distributed together with dbghelp.dll in the debugging SDK. MSDN looks quite well documented and says it's the same engine as windbg uses. Maybe I can use that to extract the stack traces.
If anyone can point me to some introduction to using dbgeng.dll to process minidumps that would probably help too, as the MSDN documents only the individual components but not how they work together.
Just in case anyone else wants to automate extracting stack traces from dumps, here's what I ended up doing:
Like I mentioned in the update it's possible to use dbgeng.dll instead of dbghelp.dll, which seems to be the same engine WinDbg uses. After some trial and error here's how to get a good stack trace with the same symbol loading mechanism as WinDbg.
call DebugCreate to get an instance of the debug engine
query for IDebugClient4, IDebugControl4, IDebugSymbols3
use IDebugSymbols3.SetSymbolOptions to configure how symbols are loaded (see MSDN for the options WinDbg uses)
use IDebugSymbols3.SetSymbolPath to set the symbol path like you would do in WinDbg
use IDebugClient4.OpenDumpFileWide to open the dump
use IDebugControl4.WaitForEvent to wait until the dump is loaded
use IDebugSymbols3.SetScopeFromStoredEvent to select the exception stored in the dump
use IDebugControl4.GetStackTrace to fetch the last few stack frames
use IDebugClient4.SetOutputCallbacks to register a listener receiving the decoded stack trace
use IDebugControl4.OutputStackTrace to process the stack frames
use IDebugClient4.SetOutputCallbacks to unregister the callback
release the interfaces
The call to WaitForEvent seems to be important because without it the following calls fail to extract the stack trace.
Also there still seems to be some memory leak in there, can't tell if it's me not cleaning up properly or something internal to dbgeng.dll, but I can just restart the process every 20 dumps or so, so I didn't investigate more.
An easy way to automate the analysis of multiple minidump files is to use the scripts written by John Robbins in his article "Automating Analyzing Tons Of Minidump Files With WinDBG And PowerShell" (you can grab the code on GitHub).
This is easy to tweak to have it perform whatever WinDbg commands you'd like it to, if the default setup is not sufficient.

Debugging a minidump in Visual Studio where the call stack is null

I have a customer who is getting a 100% reproduceable crash that I can't replicate in my program compiled in Visual Studio 2005. I sent them a debug build of my program and kept all the PDB and DLL files handy. They sent me the minidump file, but when I open it I get:
"Unhandled exception at 0x00000000 in MiniDump.dmp: 0xC0000005: Access violation reading location 0x00000000."
Then the call stack shows only "0x00000000()" and the disassembly shows me a dump of the memory at 0x0. I've set up the symbol server, loaded my PDB symbols, etc. But I can't see any way of knowing which of the many DLLs actually caused the jump to null. This is a large project with many dependencies, and some of them are binaries that I don't have the source or PDBs for, as I am using an API as a 3rd party.
So how on earth is this minidump useful? How do I see which DLL caused the crash? I've never really used minidumps for debugging before, but all the tutorials I have read seem to at least display a function name or something else that gives you a clue in the call stack. I just get the one line pointing to null.
I also tried using "Depends" to see if there was some DLL dependency that was unresolved; however on my three test machines with various Windows OS's, I seem to get three different sets of OS DLL dependencies (and yet can't replicate the crash); so this doesn't seem a particularly reliable method either to diagnose the problem.
What other methods are available to determine the cause of this problem? Is there some way to step back one instruction to see which DLL jumped to null?
Well it looks like the answer in this instance was "Use WinDbg instead of Visual Studio for debugging minidumps". I couldn't get any useful info out of VS, but WinDbg gave me a wealth of info on the chain of function calls that led to the crash.
In this instance it still didn't help solve my problem, as all of the functions were in the 3rd party library I am using, so it looks like the only definitive answer to my specific problem is to use log files to trace the state of my application that leads to the crash.
I guess if anyone else sees a similar problem with an unhelpful call stack when debugging a minidump, the best practice is to open it with WinDgb rather than Visual Studio. Seems odd that the best tool for the job is the free Microsoft product, not the commerical one.
The other lesson here is probably "any program that uses a third party library needs to write a log file".
The whole idea behind all 'simple' ways of post mortem debugging is the capture of a stack trace. If your application overwrites the stack there is no way for such analysis. Only very sophisticated methods, that record the whole program execution in dedicated hardware could help.
The way to go in such a case are log files. Spread some log statements very wide around the area where the fault occurs and transmit that version to the customer. After the crash you'll see the last log statement in your log file. Add more log statements between that point and the next log statement that has not been recorded in the log file, ship that version again. Repeat until you found the line causing the problem.
I wrote a two part article about this at ddj.com:
About Log Files Part 1
About Log Files Part 2
Just an observation, but the the stack is getting truncated or over-written, might this be a simple case of using an uninitialised field, or perhaps a buffer overrun ?
That might be fairly easy to locate.
Have you tried to set WinDbg on a customer's computer and use it as a default debugger for any application that causes a crash? You just need to add pdb files to the folder where your application resides. When a crush happens WinDbg starts and you can try to get call stack.
Possibly you already know this, but here are some points about minidump debugging:
1. You need to have exactly the same executables and PDB files, as on the client computer where minidump was created, and they should be placed exactly in the same directories. Just rebuilding the same version doesn't help.
2. Debugger must be connected to MS Symbols server.
3. When debugger starts, it prints process loading log in the Output window. Generally, all libraries should be successfully loaded with debug information. Libraries without debug information are loaded as well, but "no debug info" is printed. Learn this log - it can give you some information.
If executable stack contains frames from a library without debug information, it may be not shown. This happens, for example, if your code is running as third-party library callback.
Try to create minidump on your own computer, by adding some code which creates unhandled exception, and debug it immediately. Does this work? Compare loading log in successful and unsuccessful debugging sessions.
You may have called null function pointer. Current executing function information is needed to show call stack information. Force set instruction pointer to start of any simple function, then you'll see call stack information again.
void SimpleFunc()
{ // <- set next statement here
}

Post-mortem crash-dump debugging without having the exact version of a Windows DLL in the Symbol Server

Within my application, I use the MiniDumpWriteDump function (see dbghelp.dll) to write a crash dump file whenever my application crashes.
I also use a symbol server to store all my executables and pdb files, so that whenever a customer sends me a crash-dump file, the debugger automatically picks up the correct version of the executable and the debug information.
I also store Windows DLL's (ntdll.dll, kernel32.dll, ...) and their debug information in the symbol server (using SymChk). The debug information is fetched from Microsoft's public symbol server.
Most of the time this works perfect, except when:
a customer crashes in one of the Windows DLL's
and the customer uses DLL's that I haven't put in the symbol server
This is because it is quite undoable to store every flavor of every Windows DLL in the Symbol Server (especially with the weekly patches).
So, if a customer crashes in, let's say, version 5.2.123.456 of NTDLL.DLL, and I didn't put this exact version of the DLL in my Symbol Server, then I'm stuck. Even Microsoft's public symbol server doesn't help because it only provides the debug information, not the DLL's itself.
My current solution is to ask the customer for his DLL's, but that's not always easy. Therefore I'm looking for a better solution.
Is there a way to get the debugger showing a correct call stack, or loading the debug information of a specific DLL, even if you don't have the exact version of the DLL?
Alternatively, is there a way to get all versions of all (or the important) Windows DLL's (from Microsoft)?
EDIT:
In the mean time I found a really easy way to solve this problem.
With the utility ModuleRescue (see http://www.debuginfo.com/tools/modulerescue.html) you can generate dummy DLL's from a minidump file. With these dummy DLL's, the debugger is satisfied, and correctly starts loading the debug symbols from the Microsoft servers.
It is possible to relax WinDbg's symbol resolution; see my answer to a similar question. On the other hand, the solution that I propose here relies on the fact that the DLLs are identical other than having different GUIDs identifying their debug symbols. A different version of a DLL is likely going to have a different binary, so the symbols are probably not going to match properly even if you can get them to load.
I'm pretty sure Microsoft's symbol server also provides the binaries. I'm looking in my store and I see a ton of Microsoft .dll files. I have my _NT_SYMBOL_PATH defined as
SRV*F:\Symbols\Microsoft*http://msdl.microsoft.com/download/symbols
This way it will search my local store first before trying to copy them down from Microsoft's public server.
You can have multiple symbol servers in your symbol path. So simply set up the symbol path to point to your own server for your own private modules, and to the public MS server for OS modules, see Symbol Path:
This can be easily combined with the
Microsoft public symbol store by using
the following initial setting:
_NT_SYMBOL_PATH=srv*c:\mysymbols*http://msdl.microsoft.com/download/symbols;cache*c:\mysymbols
The Microsoft Public Symbol Store is documented as http://msdl.microsoft.com/download/symbols.
? What part isn't working?
I've never been in your situation, exactly, but I would expect the debugger to give you the correct portion of the call stack that was in your code, up to the call into the dark dll. Of course, from there down to the actual crash symbols would be unavailable, but can't you see which NTDLL API was being called and which arguments were passed to that call?
You don't say which tool you're using for minidump debugging: WinDBG or VS.

Resources