How does the Visual Studio Attach to Process work? - debugging

I've always wanted to know the inner-workings of Visual Studio's debugger and debuggers in general. How does it communicate and control your code, especially when it's running inside a host process or an external network server (attach to process)? Does the compiler or linker patch your code with callbacks so that the debugger is given control? If it indeed works this way, how do interpreted languages such as JavaScript containing no debug code work?

Generally speaking, Windows provides an API for writing debuggers that let you examine and modify memory in another process and to get notifications when exceptions happen in another process.
The debug process sits in a loop, waiting for notification from events from the process under inspection. To set a breakpoint, the debugger process modifies the code in the debugee to cause an exception (typically, an int 3 instruction for x86).
The compiler and linker work together to make the symbol information about a program available in a format that can be read by debuggers. On Windows, that's typically CodeView in a separate PDB file.
In the Unix-derived world, there's an API called ptrace that does essentially the same sorts of things as Windows debugging API.
For remote debugging, a small program is placed on the remote machine that communicates with and acts on behalf of the actual debugger running on the local machine.
For interpreted languages, like JavaScript, the debugger works with the interpreter to give the same sorts of functionality (inspecting memory, setting breakpoints, etc.).

Windows includes support for debuggers. A process has to enable debugger privilege, and once this is done that process can attach to any other process and debug it using windows debugger functions
http://msdn.microsoft.com/en-us/library/windows/desktop/ms679303(v=vs.85).aspx
For something like javascript, seems like you would need the equivalent of a javascript debugger.
In the case of a Visual Studio multi-process project, you typically have to switch which process the debugger is attached to in order to debug that process. I don't know if there's a way to have pending breakpoints set for multiple processes at the same time. There could be other debuggers that work better with multiple processes, but I haven't used such a tool.

Related

How to debug a GPF crash?

I have an old VB6 app that uses a lot of 3rd party components, not just visual but also for audio processing, tcp/udp, VoIP, etc...
When I run the app as an EXE (e.g. not in the VB6 IDE), it will crash sometimes with a GPF. It happens after the program has been running for several days and occurs when there is no one around.
Unfortunately the user has clipped the screenshot, but it typically doesn't have any useful information anyway. The description of the crash reported that the crash occurred in ntdll.dll.
My questions:
What tools do I need in order to debug this?
How do I actually get started?
I have the PDB files from the compilation in VB6. The app is compiled to Optimize for Fast Code. What can I do with them in this situation?
I would use ntsd or windbg (link), and run the app under either of the user mode debuggers (if you're not familiar, they have the same commands - ntsd is a console debugger, while windbg is a GUI debugger). Both have a lot of command line options, but ntsd appname.exe will be enough to get started. Use the .sympath command to point the debugger toward the symbol, and you should be on your way. When the crash occurs, you can examine variables and stack in order to figure out what may be missing.
BTW - the error above is an invalid handle error, so the program probably passed a stale or NULL handle to a windows function. The debugger will tell you more.

Automatically create Visual C++ crash dump

Is there a way to create a crash dump file automatically on application crash (on Windows OS), same as if I could save with Visual Studio debugger attached? That is, I want to be able to debug my application in Visual Studio with automatically created crash dump files.
Update: Debug Diag 2.0 is now available. This release has native support for .NET dumps.
Yes it is possible using DebugDiag 1.2.
The Debug Diagnostic Tool (DebugDiag) is designed to assist in
troubleshooting issues such as hangs, slow performance, memory leaks
or fragmentation, and crashes in any user-mode process. The tool
includes additional debugging scripts focused on Internet Information
Services (IIS) applications, web data access components, COM+ and
related Microsoft technologies.
It even allows you to run crash/hang analysis on the dump and give you a nice report about the callstack and thread that are in sleep state (for hang dumps). It also allows you to take on the fly dumps too. The dumps taken by DebugDiag can be analyzed in DebugDiag, Visual Studio and WinDbg.
EDIT: The MSDN link to how to use DebugDiag is here.
Use SetUnhandledExceptionFilter to catch exceptions. And in this global exception handler, use MiniDumpWriteDump function to create a dump file.
There is a lot around exception handling this way, like you won't be able to catch all exceptions (exception from STL, pure-virtual function, abort C runtime call etc). Also, you may be in problem if more than one thread causes some exception. You either need to suspend all other running threads when you get exception in your global exception handler, or use some logic so that other threads won't interfere with your dump-generation code.
To handle all cases, you need to tweak around linker settings (like /EHsc flag), so that ALL exceptions can be handled by try-catch, enable debugging information even for release build so that .PDB is generated and you can get call stack. Use API hooking, so that C-runtime calls won't disable your global-exception handler and lot!
See these:
SetUnhandledExceptionFilter and the C/C++ Runtime Library
Own Crash Minidump with Call Stack
My only recommendation is you start with simpler approach, and don't bother about more complex scenarios. In short, just use SetUnhandledExceptionFilter, in a single threaded application, cause an Access Violation, and in global-exception-handler, use MinidumpWriteDump to create a MINI dump (i.e. without memory-dump).

Log DLLs loaded by a process

I'd like to add logging to our unit tests that records the DLLs they use, and where they're loaded from.
I can get the information I need from Sysinternals ListDLLs, but I'd need to run that while the test process is running, and I'd end up with race conditions: for instance, ListDLLs could run too early, and miss a DLL that's loaded half-way through the test run; or ListDLLs could run too late, after the test process exits.
Similarly, I can get the information I need from the Visual Studio debugger's Output and Modules windows, but I'd like to automate this on our build server.
Is there any command line tool that can run an arbitrary EXE, track the DLLs it uses, and log the information to a file?
You may write your own tool, which will use "debugging" features. This tool must
Start new process suspended
Attach to created process as debugger
Process debugging events, as I remember, you need LOAD_DLL_DEBUG_EVENT
http://msdn.microsoft.com/en-us/library/windows/desktop/ms679302(v=vs.85).aspx
The good news: it's not too hard to write it yourself using Detours. Hook the LoadLibraryA/W functions and log DLL names to a file (using GetModuleFileName against the value that the real LoadLibrary returns). Also hook CreateProcess, so that you can log DLLs loaded by child processes.
The bad news: I'd like to be able to post the source code that I used, but it's an internal tool that I won't be able to share.
Edit: I'm not convinced that this tool's Detours hooks are completely reliable, as during my testing, it's missed a few DLLs. Here's an alternative tool using the debugger API: https://github.com/timrobinson/logdlls
Note that SysInternals (now MSFT: http://technet.microsoft.com/en-US/sysinternals) has a great tools for tracking all sorts of events happening when loading your application: Process Monitor. You will have to filter out anything that is not related to the application you are examining. Also, you may want to set Operation="Load Image" filter.
Tried Tim Robinson's tool, but it seems it will only track Windows related dll's so not useful in my case.

Visual C++: Difference between Start with/without debugging in Release mode

What is the difference between Start Debugging (F5) and Start without Debugging (CTRL-F5) when the code is compiled in Release mode?
I am seeing that CTRL-F5 is 10x faster than F5 for some C++ code. If I am not wrong, the debugger is attached to the executing process for F5 and it is not for CTRL-F5. Since this is Release mode, the compiled code does not have any debugging information. So, if I do not have any breakpoints, the execution times should be the same across the two, isn't it?!
(Assume that the Release and Debug modes are the typical configurations you get when you create a new Visual C++ project.)
The problem is that Windows drops in a special Debug Heap, if it detects that your program is running under a Debugger. This appears to happen at the OS level, and is independent of any Debug/Release mode settings for your compilation.
You can work around this 'feature' by setting an environment variable: _NO_DEBUG_HEAP=1
This same issue has been driving me nuts for a while; today I found the following, from whence this post is derived:
http://blogs.msdn.com/b/larryosterman/archive/2008/09/03/anatomy-of-a-heisenbug.aspx
"Start without debugging" just tells Windows to launch the app as it would normally run.
"Start with debugging" starts the VS debugger and has it run the app within the debugger.
This really doesn't have much to do with the debug/release build settings.
When you build the default 'debug' configuration of your app, you'll have the following main differences to the release build:
The emitted code won't be optimised, so is easier to debug because it more closely matches your source
The compiler & linker will output a .PDB file containing lots of extra information to help a debugger - the presence or absence of this information makes no difference to the performance of the code, just the ease of debugging.
Conditional macros like ASSERT and VERIFY will be no-ops in a release build but active in a debug build.
Each one of these items is independent and optional! You can turn any or all of them on or off and still run the code under the debugger, you just won't find life so easy.
When you run 'with debugging' things perform differently for several reasons:
The VS debugger is very inefficient at starting, partly because everything in VS is slow - on versions prior to VS2010 every pixel of the screen will be repainted about 30 times as the IDE staggers into debug mode with much flashing and flickering.
Depending on how things are configured, the debugger might spend a lot of time at startup trying to load symbols (i.e. PDB files) for lots and lots of OS components which are part of your process - it might try fetching these files over the web, which can take an age in some circumstances.
A number of activities your application normally does (loading DLLs, starting threads, handling exceptions) all cause the debugger to be alerted. This has the effect both of slowing them down and of making them tend to run sequentially.
IsDebuggerPresent() and OutputDebugString() behave differently depending on whether a debugger is attached.
IsDebuggerPresent() simply returns another value, so your program can react to this value and behave differently on purpose. OutputDebugString() returns much faster when there's no debugger attached, so if it's called lots of times you'll see that the program runs much faster without the debugger.
When running 'with debugging' the debug heap is used, even for release mode. This causes severe slowdowns in code using a lot of malloc/free or new/delete, which can happen in C++ code without you noticing it because libraries/classes tend to hide this memory management stuff from you.

How does a debugger work?

I keep wondering how does a debugger work? Particulary the one that can be 'attached' to already running executable. I understand that compiler translates code to machine language, but then how does debugger 'know' what it is being attached to?
The details of how a debugger works will depend on what you are debugging, and what the OS is. For native debugging on Windows you can find some details on MSDN: Win32 Debugging API.
The user tells the debugger which process to attach to, either by name or by process ID. If it is a name then the debugger will look up the process ID, and initiate the debug session via a system call; under Windows this would be DebugActiveProcess.
Once attached, the debugger will enter an event loop much like for any UI, but instead of events coming from the windowing system, the OS will generate events based on what happens in the process being debugged – for example an exception occurring. See WaitForDebugEvent.
The debugger is able to read and write the target process' virtual memory, and even adjust its register values through APIs provided by the OS. See the list of debugging functions for Windows.
The debugger is able to use information from symbol files to translate from addresses to variable names and locations in the source code. The symbol file information is a separate set of APIs and isn't a core part of the OS as such. On Windows this is through the Debug Interface Access SDK.
If you are debugging a managed environment (.NET, Java, etc.) the process will typically look similar, but the details are different, as the virtual machine environment provides the debug API rather than the underlying OS.
As I understand it:
For software breakpoints on x86, the debugger replaces the first byte of the instruction with CC (int3). This is done with WriteProcessMemory on Windows. When the CPU gets to that instruction, and executes the int3, this causes the CPU to generate a debug exception. The OS receives this interrupt, realizes the process is being debugged, and notifies the debugger process that the breakpoint was hit.
After the breakpoint is hit and the process is stopped, the debugger looks in its list of breakpoints, and replaces the CC with the byte that was there originally. The debugger sets TF, the Trap Flag in EFLAGS (by modifying the CONTEXT), and continues the process. The Trap Flag causes the CPU to automatically generate a single-step exception (INT 1) on the next instruction.
When the process being debugged stops the next time, the debugger again replaces the first byte of the breakpoint instruction with CC, and the process continues.
I'm not sure if this is exactly how it's implemented by all debuggers, but I've written a Win32 program that manages to debug itself using this mechanism. Completely useless, but educational.
In Linux, debugging a process begins with the ptrace(2) system call. This article has a great tutorial on how to use ptrace to implement some simple debugging constructs.
If you're on a Windows OS, a great resource for this would be "Debugging Applications for Microsoft .NET and Microsoft Windows" by John Robbins:
http://www.amazon.com/dp/0735615365
(or even the older edition: "Debugging Applications")
The book has has a chapter on how a debugger works that includes code for a couple of simple (but working) debuggers.
Since I'm not familiar with details of Unix/Linux debugging, this stuff may not apply at all to other OS's. But I'd guess that as an introduction to a very complex subject the concepts - if not the details and APIs - should 'port' to most any OS.
I think there are two main questions to answer here:
1. How the debugger knows that an exception occurred?
When an exception occurs in a process that’s being debugged, the debugger gets notified by the OS before any user exception handlers defined in the target process are given a chance to respond to the exception. If the debugger chooses not to handle this (first-chance) exception notification, the exception dispatching sequence proceeds further and the target thread is then given a chance to handle the exception if it wants to do so. If the SEH exception is not handled by the target process, the debugger is then sent another debug event, called a second-chance notification, to inform it that an unhandled exception occurred in the target process. Source
2. How the debugger knows how to stop on a breakpoint?
The simplified answer is: When you put a break-point into the program, the debugger replaces your code at that point with a int3 instruction which is a software interrupt. As an effect the program is suspended and the debugger is called.
Another valuable source to understand debugging is Intel CPU manual (Intel® 64 and IA-32 Architectures
Software Developer’s Manual). In the volume 3A, chapter 16, it introduced the hardware support of debugging, such as special exceptions and hardware debugging registers. Following is from that chapter:
T (trap) flag, TSS — Generates a debug exception (#DB) when an attempt is
made to switch to a task with the T flag set in its TSS.
I am not sure whether Window or Linux use this flag or not, but it is very interesting to read that chapter.
Hope this helps someone.
My understanding is that when you compile an application or DLL file, whatever it compiles to contains symbols representing the functions and the variables.
When you have a debug build, these symbols are far more detailed than when it's a release build, thus allowing the debugger to give you more information. When you attach the debugger to a process, it looks at which functions are currently being accessed and resolves all the available debugging symbols from here (since it knows what the internals of the compiled file looks like, it can acertain what might be in the memory, with contents of ints, floats, strings, etc.). Like the first poster said, this information and how these symbols work greatly depends on the environment and the language.

Resources