i have a Win32 compiler which, for years, has been able to create a DBG debug information file.
This has allowed debuggers, and tools like Process Explorer and Process Monitor, to have access to symbol information.
i recently learned that Visual Studio's debugger no longer accepts DBG files, only undocumented Program Database (PDB) files.
Since Microsoft keeps the PDB format secret, i assume they have a tool that will allow me to convert existing debugging information to a PDB (so i don't learn the secrets of their file format).
Bonus Reading
cv2pdb: how to use to convert other debug formats to pdb?
Undocumented
Even though Microsoft has a GitHub repository for PDB, the spec remains completely undocumented. The files on their repository are incomplete. There are missing types and declarations.
And even though i've created a PDBViewer, it doesn't get me anything, because Microsoft doesn't explain what any of it means.
The point isn't just to look at a PDB - we need to create one. And for that we need to know:
what goes in it
where
and what format
PDB is not documented, but you can collect very detailed information about the content of PDB files programmatically using the appropriate interfaces. See Sample.
The PDB format is now documented-through-code by Microsoft in a GitHub repository. LLVM also has a great overview, partly based on Microsoft's documentation.
That's not a complete answer because you'll still need to write the tool to do the conversion...
LLVM developers documented the PDB file format in order to make clang and lld able to read and produce PDB files. Microsoft's PDB GitHub repository was put up, in part, to support that work.
PDB is primarily a container for CodeView debug info, which is documented by Microsoft.
LLVM provides libraries for working with PDBs and COFF debug info, as well as command line tools for inspecting and generating them from YAML.
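As a small illustration of what that documentation makes possible: LLVM's write-up describes a PDB as an MSF container whose first block (the "superblock") starts with a 32-byte magic string followed by six little-endian 32-bit fields. The sketch below only reads that superblock. The function name and dictionary keys are my own labels (modelled on LLVM's field names), and it does not walk the stream directory, so treat it as a starting point rather than a PDB reader.

```python
import struct

# Magic string at the start of an MSF 7.00 (PDB) file, per LLVM's MSF notes.
MSF_MAGIC = b"Microsoft C/C++ MSF 7.00\r\n\x1aDS\x00\x00\x00"

# Labels for the six 32-bit fields after the magic (names are illustrative).
SUPERBLOCK_FIELDS = (
    "block_size",
    "free_block_map_block",
    "num_blocks",
    "num_directory_bytes",
    "unknown",
    "block_map_addr",
)


def parse_msf_superblock(data: bytes) -> dict:
    """Parse the MSF superblock from the first bytes of a PDB file."""
    header_size = len(MSF_MAGIC) + 4 * len(SUPERBLOCK_FIELDS)
    if len(data) < header_size:
        raise ValueError("too short to contain an MSF superblock")
    if data[: len(MSF_MAGIC)] != MSF_MAGIC:
        raise ValueError("not an MSF 7.00 file (magic mismatch)")
    values = struct.unpack_from("<6I", data, len(MSF_MAGIC))
    return dict(zip(SUPERBLOCK_FIELDS, values))
```

To try it on a real file, read the first 56 bytes of a .pdb and pass them in, for example `parse_msf_superblock(open(path, "rb").read(56))`, where `path` is whatever PDB you have at hand.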
Related
I have a .pdb file, downloaded from the MS symbol server. I need to fetch the list of symbols (functions, arguments, anything it has). There is a tool on CodeProject, but it only reports modules. There is the DbgHelp API, but it can only be attached to a running process. How can I read a .pdb file offline?
Good News for anyone still looking,
The information you seek is now open source!
https://github.com/Microsoft/microsoft-pdb
Some really interesting stuff there, like this pdbdump.cpp file, with its dumpPublics function or its main flow controls. Good documentation too.
You can also use Visual Studio's Dia2Dump sample program to dump human-readable output from a PDB file, including its public symbols.
Be sure to build it as a 32-bit application though, or you might run into some problems with it. (See dia2dump: CoCreateInstance failed - HRESULT = 80040154)
I don't want to tell the whole story but all in all it would lead to this simple question:
"Is there a way that allows me to read all stored debug information inside a PE or PDB file?"
I am using the C/C++ compiler delivered by Visual Studio 2010
If you want to do it yourself, using C++, have a look at this question. Matt Pietrek, the "urgestalt" for Portable Executable questions, shows how that can be done.
And here, the link to his own list of samples.
This may come down to my misunderstanding of PDB files and the build process, rather than any particular problem but I've struggled to find a good answer elsewhere.
We have recently been good little developers and started indexing and storing our pdb files on a central symbol server (all part of TFS). The problem is that our PDB files do not appear to include all the source information.
When trying to navigate to sources in Visual Studio, the pdb files of our assemblies are found, as shown by the output window:
PdbNavigator: Downloader: file://server/Symbols/my.assembly.pdb/1DB3F79EA3094EAAADFC6CDE6515FC871/my.assembly.pdb -> ok, 251 KB
PdbNavigator: No debugging information found on symbol servers for my.assembly, Version=1.0.1.1206, Culture=neutral, PublicKeyToken=4cd79aeab39b919b
But at the same time it says it found no sources. If I use some of the tools from the windows SDK I can see that the PDB file does not contain the information on about 30% of the source files in the project.
I think I read somewhere that PDB files only include the source for classes actually used within the project, but surely that creates a massive problem for any API type assemblies where multiple classes may have no function within the assembly, only when used from some other part of your project?
If anyone can shed light on this, please let me know.
Thanks.
A PDB (normally) doesn't store source code. It contains a list of "documents", which are the source code file names, and "method information", which maps source lines to offsets in the assembly or binary. A PDB matches when the signature and build date of the assembly match those in the PDB file. Chances are that MyAssembly.pdb has the correct version, but the signature and/or build date don't match.
The signature is not exposed as far as I know, but you may find some code on the Internet that says how to read a PE signature and a PDB signature so you can do a comparison.
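For what it's worth, here is a rough sketch of the PE side of that comparison. For PDB 7.0 symbols, the CodeView entry in the image's debug directory is an "RSDS" record: the four-byte tag, a 16-byte GUID, a 32-bit age, and the NUL-terminated PDB path. Symbol stores key the PDB by the GUID in hex followed by the age in hex, which appears to be what the long path component in the download URL in the question is. The function names below are made up for this example, and locating the record inside the .exe and reading the matching GUID and age out of the PDB itself are not shown.

```python
import struct
import uuid


def parse_rsds(record: bytes):
    """Split a CodeView 'RSDS' record into (guid, age, pdb_path)."""
    if record[:4] != b"RSDS":
        raise ValueError("not an RSDS (PDB 7.0) CodeView record")
    # The GUID is stored in Windows (mixed-endian) layout.
    guid = uuid.UUID(bytes_le=record[4:20])
    (age,) = struct.unpack_from("<I", record, 20)
    end = record.index(b"\x00", 24)  # path is NUL-terminated
    pdb_path = record[24:end].decode("utf-8", errors="replace")
    return guid, age, pdb_path


def symbol_server_key(guid: uuid.UUID, age: int) -> str:
    """GUID as 32 uppercase hex digits followed by the age in hex."""
    return f"{guid.hex.upper()}{age:X}"
```

If the key computed from the .exe differs from the directory name the PDB was stored under, the two files come from different builds, even when the assembly version is identical.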
I am experimenting with an analysis tool that can analyze executable files with embedded debug symbol information in Windows. While trying this tool on several open source projects, I realized that most of the builds do not keep symbolic information in the executable files. I am able to compile the source code with VS (2008), but the build normally keeps the debug information in a separate .pdb file, not in the .exe file (unfortunately I only want to read debug information from the .exe file and not the .pdb file :-().
Does anybody know a way to embed symbol debug information into a single .exe file using Visual Studio?
I know this is a pretty old issue but this feature has recently been merged into Roslyn: https://github.com/dotnet/roslyn/issues/12390
The MSDN says that it isn't possible.
It is not possible to create an .exe or .dll that contains debug information. Debug information is always placed in a .pdb file.
i don't know, yet, how to do it - but there's an article on MSDN that talks about it.
A portable executable (i.e. .exe or .dll) can have a flag present in the header: (archive)
IMAGE_FILE_DEBUG_STRIPPED
Debugging information was removed and stored separately in a .dbg file.
This implies that debugging information can be in the executable, and has the option of being removed and stored in a separate .dbg file.
From MSDN article DBG Files: (archive)
DBG files are portable executable (PE) format files that contain debug information in Codeview format for the Visual Studio debugger (and possibly other formats, depending on how the DBG was created). When you do not have source for certain code, such as libraries or Windows APIs, DBG files permit debugging. DBG files also permit you to do OLE RPC debugging.
DBG files have been superseded by PDB files, which are now more commonly used for debugging.
You can use the REBASE.EXE utility to strip debug information from a PE-format executable and store it in a DBG file. The file characteristic field IMAGE_FILE_DEBUG_STRIPPED in the PE file header tells the debugger that Codeview information has been stripped to a separate DBG file.
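Checking that flag yourself is straightforward. This is a minimal sketch, assuming a well-formed image: the DOS header stores the offset of the "PE\0\0" signature at 0x3C, the COFF file header follows the signature, and its Characteristics field is the last 16-bit word of that 20-byte header. IMAGE_FILE_DEBUG_STRIPPED is 0x0200 in the PE/COFF specification; the function names are mine.

```python
import struct

IMAGE_FILE_DEBUG_STRIPPED = 0x0200  # value from the PE/COFF specification


def coff_characteristics(image: bytes) -> int:
    """Return the Characteristics field of a PE image's COFF file header."""
    if image[:2] != b"MZ":
        raise ValueError("missing DOS 'MZ' header")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[pe_offset : pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # Signature (4 bytes) + 18 bytes of COFF header precede Characteristics.
    (characteristics,) = struct.unpack_from("<H", image, pe_offset + 4 + 18)
    return characteristics


def debug_info_stripped(image: bytes) -> bool:
    """True if the header says debug info was moved to a separate .dbg file."""
    return bool(coff_characteristics(image) & IMAGE_FILE_DEBUG_STRIPPED)
```

Only the first few hundred bytes of the file are needed, so reading the whole executable into memory is optional.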
A knowledge base article describing the COFF format mentions the dumpbin utility, and its /SYMBOLS option:
/SYMBOLS Setting this option causes DUMPBIN to display the COFF symbol
table. Symbol tables exist in all object files. A COFF symbol
table appears in an image file only if it is linked with
/DEBUG /DEBUGTYPE:COFF
The next step, and the part that would answer our question is:
what format is the embedded debugging information?
where in the PE is the embedded debugging information stored? (resource?, data section?)
But the answer "it cannot be done" seems to be incorrect.
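A partial pointer on the "where", going by the PE/COFF specification listed under See also: the optional header's data directories include a debug directory (index 6), which is an array of 28-byte IMAGE_DEBUG_DIRECTORY entries. Each entry gives a type (1 = COFF, 2 = CodeView, and so on), a size, and the file offset of the raw debug data. The sketch below lists those entries. It assumes a well-formed on-disk image, the function name and the trimmed type table are mine, and it does not interpret the debug data itself.

```python
import struct

# Subset of IMAGE_DEBUG_TYPE_* values from the PE/COFF specification.
DEBUG_TYPES = {1: "COFF", 2: "CODEVIEW", 3: "FPO", 4: "MISC"}
DEBUG_DIR_INDEX = 6   # position of the debug entry in the data directories
DEBUG_ENTRY_SIZE = 28  # sizeof(IMAGE_DEBUG_DIRECTORY)


def list_debug_directory(image: bytes):
    """Return [(type_name, file_offset, size)] for each debug directory entry."""
    (pe,) = struct.unpack_from("<I", image, 0x3C)
    if image[:2] != b"MZ" or image[pe : pe + 4] != b"PE\x00\x00":
        raise ValueError("not a PE image")
    (num_sections,) = struct.unpack_from("<H", image, pe + 6)
    (opt_size,) = struct.unpack_from("<H", image, pe + 20)
    opt = pe + 24
    (magic,) = struct.unpack_from("<H", image, opt)
    dirs = opt + (96 if magic == 0x10B else 112)  # PE32 vs PE32+
    (num_dirs,) = struct.unpack_from("<I", image, dirs - 4)
    if num_dirs <= DEBUG_DIR_INDEX:
        return []
    rva, size = struct.unpack_from("<II", image, dirs + DEBUG_DIR_INDEX * 8)
    if rva == 0 or size == 0:
        return []

    # Translate the directory's RVA to a file offset via the section table.
    sections = opt + opt_size
    for i in range(num_sections):
        vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<IIII", image, sections + i * 40 + 8
        )
        if vaddr <= rva < vaddr + max(vsize, raw_size):
            offset = rva - vaddr + raw_ptr
            break
    else:
        raise ValueError("debug directory RVA is not inside any section")

    entries = []
    for i in range(size // DEBUG_ENTRY_SIZE):
        fields = struct.unpack_from(
            "<IIHHIIII", image, offset + i * DEBUG_ENTRY_SIZE
        )
        dbg_type, data_size, data_ptr = fields[4], fields[5], fields[7]
        entries.append((DEBUG_TYPES.get(dbg_type, str(dbg_type)), data_ptr, data_size))
    return entries
```

An image linked with a separate PDB typically shows a single CODEVIEW entry whose data is just a pointer record naming the PDB, so a large COFF or CodeView entry is the thing to look for if the symbols are really embedded.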
See also
IMAGE_FILE_HEADER structure (archive)
LOADED_IMAGE structure (archive)
DBG Files (archive)
Microsoft PE and COFF Specification (archive)
An In-Depth Look into the Win32 Portable Executable File Format (archive)
KB121460 - Common Object File Format (COFF) (archive)
There is no built-in support in Visual Studio for this type of operation (at least for managed languages). The .PDB and .EXE files are created at the same time and have no option for embedding. I'm not even sure the .EXE format supports embedding PDB symbols although I could be wrong on this point.
The only course I can see is embedding the PDB as a resource in the .EXE. However that would have to be a post build step since the two are built at the same time. And there is the potential for invalidating parts of the PDB if you modify the EXE after it's been built.
Is there a particular reason you're trying to do this? I'm imagining it's going to end up causing you a lot of pain as 1) it's not supported AFAIK and 2) the tool chain is geared towards looking for PDB in the same directory not within the .EXE. Deploying 2 files is a bit annoying at first but it's how it's done at this point.
I'm pretty sure PDBs were always stand-alone files. VC++ used to have a switch that would cause it to emit (limited compared to PDB) symbol information to a "CodeView" .DBG file that by default was embedded in the EXE. However, that switch appears to no longer be supported in the newer (post 6.x ?) versions of the compiler.
I have heard using PDB files can help diagnose where a crash occurred.
My basic understanding is that you give Visual Studio the source file, the pdb file and the crash information (from Dr Watson?)
Can someone please explain how it all works / what is involved?
(Thank you!)
PDB files map an assembly's MSIL to the original source lines. This means that if you put the PDB that was compiled with the assembly in the same directory as the assembly, your exception stack traces will have the names and lines of the positions in the original source files. Without the PDB file, you will only see the name of the class and method for each level of the stack trace.
PDB files are generated when you build your project. They contain information relating to the built binaries which Visual Studio can interpret.
When a program crashes and it generates a crash report, Visual Studio is able to take that report and link it back to the source code via the PDB file for the application. The PDB files must come from the same build as the binary that generated the crash report!
There are some issues that we have encountered over time.
The machine that is debugging the crash report needs to have the source on the same path as the machine that built the binary.
Release builds often optimize to the extent where you cannot view the state of object member variables
If anyone knows how to defeat the former, I would be grateful for some input.
You should look into setting up a symbol server and indexing the PDB files to your source code control system. I just recently went through this process for our product and it works very well. You don't have to be concerned about making PDB files available with the binaries, nor how to get the appropriate source code when debugging dump files.
John Robbins' book: http://www.amazon.com/Debugging-Microsoft-NET-2-0-Applications/dp/0735622027/ref=pd_bbs_sr_1?ie=UTF8&s=books&qid=1222366012&sr=8-1
Look here for some sample code for generating minidumps (which don't have to be restricted to post-crash analysis -- you can generate them at any point in your code without crashing): http://www.codeproject.com/KB/debug/postmortemdebug_standalone1.aspx