WinDbg not showing register values - debugging

Basically, this is the same question that was asked here.
When performing kernel debugging of a machine running Windows 7 or older, with WinDbg version 6.2 and up, the debugger doesn't show anything in the registers window. Pressing the Customize... button results in a message box that reads Registers are not yet known.
At the same time, issuing the r command results in perfectly valid register values being printed out.
What is the reason for this behaviour, and can it be fixed?

TL;DR: I wrote an extension DLL that fixes the bug. Available here.
The Problem
To understand the problem, we first need to understand that WinDbg is basically just a frontend to Microsoft's Windows Symbolic Debugger Engine, implemented inside dbgeng.dll. Other frontends include the command-line kd.exe (kernel debugger) and cdb.exe (user-mode debugger).
The engine implements everything we expect from a debugger: working with symbol files, read and writing memory and registers, setting breakpoitns, etc. The engine then exposes all of this functionality through COM-like interfaces (they implement IUnknown but are not registered components). This allows us, for instance, to write our own debugger (like this person did).
Armed with this knowledge, we can now make an educated guess as to how WinDbg obtains the values of the registers on the target machine.
The engine exposes the IDebugRegisters interface for manipulating registers. This interface declares the GetValues method for retrieving the values of multiple registers in one go. But how does WinDbg know how many registers are there? That why we have the GetNumberRegisters method.
So, to retrieve the values of all registers on the target, we'll have to do something like this:
Call IDebugRegisters::GetNumberRegisters to get the total number of registers.
Call IDebugRegisters::GetValues with the Count parameter set to the total number of registers, the Indices parameter set to NULL, and the Start parameter set to 0.
One tiny problem, though: the second call fails with E_INVALIDARG.
Ehm, excuse me? How can it fail? Especially puzzling is the documentation for this return value:
The value of the index of one of the registers is greater than the number of registers on the target machine.
But I just asked you how many registers there are, so how can that value be out of range? Okay, let's continue reading the docs anyway, maybe something will become clear:
If the return value is not S_OK, some of the registers still might have been read. If the target was not accessible, the return type is E_UNEXPECTED and Values is unchanged; otherwise, Values will contain partial results and the registers that could not be read will have type DEBUG_VALUE_INVALID.
(Emphasis mine.)
Aha! So maybe the engine just couldn't read one of the registers! But which one? Turns out that the engine chokes on the xcr0 register. From the Intel 64 and IA-32 Architectures Software Developer’s Manual:
Extended control register XCR0 contains a state-component bitmap that specifies the user state components that software has enabled the XSAVE feature set to manage. If the bit corresponding to a state component is clear in XCR0, instructions in the XSAVE feature set will not operate on that state component, regardless of the value of the instruction mask.
Okay, so the register controls the operation of the XSAVE instruction, which saves the state of the CPU's extended features (like XMM and AVX). According to the last comment on this page, this instruction requires some support from the operating system. Although the comment states that Windows 7 (that's what the VM I was testing on was running) does support this instruction, it seems that the issue at hand is related to the OS anyway, as when the target is Windows 8 everything works fine.
Really, it's unclear whether the bug is within the debugger engine, which reports more registers than it can retrieve values for, or within WinDbg, which refuses to show any values at all if the engine fails to produce all of them.
The Solution
We could, of course, bite the bullet and just use an older version of WinDbg for debugging older Windows versions. But where's the challenge in that?
Instead, I present to you a debugger extension that solves this problem. It does so by hooking (with the help of this library) the relevant debugger engine methods and returning S_OK if the only register that failed was xcr0. Otherwise, it propagates the failure. The extension supports runtime unload, so if you experience problems you can always disable the hooks.
That's it, have fun!

Related

what is the purpose of the BeingDebugged flag in the PEB structure?

What is the purpose of this flag (from the OS side)?
Which functions use this flag except isDebuggerPresent?
thanks a lot
It's effectively the same, but reading the PEB doesn't require a trip through kernel mode.
More explicitly, the IsDebuggerPresent API is documented and stable; the PEB structure is not, and could, conceivably, change across versions.
Also, the IsDebuggerPresent API (or flag) only checks for user-mode debuggers; kernel debuggers aren't detected via this function.
Why put it in the PEB? It saves some time, which was more important in early versions of NT. (There are a bunch of user-mode functions that check this flag before doing some runtime validation, and will break to the debugger if set.)
If you change the PEB field to 0, then IsDebuggerPresent will also return 0, although I believe that CheckRemoteDebuggerPresent will not.
As you have found the IsDebuggerPresent flag reads this from the PEB. As far as I know the PEB structure is not an official API but IsDebuggerPresent is so you should stick to that layer.
The uses of this method are quite limited if you are after a copy protection to prevent debugging your app. As you have found it is only a flag in your process space. If somebody debugs your application all he needs to do is to zero out the flag in the PEB table and let your app run.
You can raise the level by using the method CheckRemoteDebuggerPresent where you pass in your own process handle to get an answer. This method goes into the kernel and checks for the existence of a special debug structure which is associated with your process if it is beeing debugged. A user mode process cannot fake this one but you know there are always ways around by simply removing your check ....

Visual Studio's Memory window: Inspecting a reference instead of the referenced value?

When I inspect a string variable text using Visual Studio's Memory window, I get to see its value:
Out of curiosity, is there a way to inspect (also in the Memory window) the location where that value gets referenced?
(Of course I can already see the memory location's address. I am asking this because I am curious how the CLR represents, and works with, class-type instances. Based on what the CLI specification states, I am assuming that the CLR represents them at least as a combination of a pointer, a type token, and a value. I am seeing the latter two above, but would like to see the pointer, and what else might be stored along with it.)
In general there's not just one location, especially since this is an interned string. But you do have one since you know that the text variable points to the string. So use the address-of operator to get the address of the reference, type &text in the Address box.
You'll probably want to make it a bit more recognizable, right-click the Memory window and select "8-byte integer". You'd see 000000000256D08. The area of memory you are looking at is the stack of the main thread.
Do beware that this is all a bit academic. This works because you are using the debugger and the jitter optimizer was disabled. In an optimized program, that pointer value is going to be stored in a cpu register. And in the specific case of your test method there would be nothing to look at because the assignment statement will be optimized away.
You can see the "real" code with the Release build and Tools + Options, Debugging, General, untick the "Suppress JIT optimization" option. Beware that it makes the debugger stupid, it no longer knows much about local variables. The most important debugging windows then are Debug + Windows + Disassembly to see the code and Debug + Windows + Registers to see the CPU registers. Right-click the latter window and tick SSE2 so you can see the XMM registers, the x64 jitter likes to use them.

How does a debugger set breakpoints if the image is in read-only memory?

How does a debugger set breakpoints if the image is in read-only memory? I know there are hardware breakpoints, but in the debugger I use (OllyDbg) those have to be set specially using a different dialog than normal breakpoints.
Explanation:
Here is a routine in a debugger that is comparing itself to a copy of itself. EDX points to the running image, EBX points to the known good copy of the image. The breakpoint on 4010CE only is reached if there is a mismatch. The character being compared is in the AL register. As you can see the debugger shows EB F6 at 10CE, but this is false. 10CE actually has CC in it, as you can see by looking at the AL register. This is because the debugger has secretely inserted the CC to perform the breakpoint.
The debugger first has to change the memory protection of the page it wants to write to. This can be done with VirtualProtectEx. After that it is able to write with WriteProcessMemory and then set the protection back to the original value.
Let me preface this with a disclaimer that I'm not familiar with your particular toolset.
If you haven't enabled hardware breakpoints, the only remaining breakpoint type is a software breakpoint. These are only hit (on x86 because that's what I'm most familiar with) when you replace the first byte of an instruction with a trap instruction, and will only be routed through the breakpoint mechanism of your OS to your debugger if the correct trap instruction for your OS is used and the debugger has already registered itself with the OS as a debugger for this process. In order to cause the software breakpoint to happen at the correct moment, the trap instruction must be written into your code segment over the first byte of your correct instruction.
The two answers that got here first explain the two scenarios which could get you here (at least, the only two I can think of):
The kernel always has write access everywhere, except for hardware-protected pages (ie on some sort of ROM), which your process' memory is almost certainly not. It has the ability to write the breakpoint instruction regardless of the permissions exposed to the user process being debugged.
The debugger must use some syscall to change the access rights on the memory of the target process before inserting the breakpoint.
Personally, I'm guessing the first thing is happening. The segment permissions are only in place to protect your target process from itself, not from a debugger process or from the kernel. Debugging mechanisms in operating systems pretty regularly violate "normal" permissions to allow the debugger to do whatever it wants to the target process. This, of course, is why some operating systems require you to enter a password before you're allowed to use the debugger in certain scenarios.
However, you can test if it's the second one by attempting to write to the code segment from inside the target process after a breakpoint has been set. If the write succeeds, you know the permissions have been lowered by the OS (to allow the process to be debugged). It would be pretty awkward for the OS to require a debugger to jump through this hoop since it can already insert arbitrary code into the writeable parts of memory and then force a jump to it by generating a stack frame overflow.
The debugger takes advantage of the WriteProcessMemory() function to alter the instruction in place. It'll keep a copy of the instruction. When the bp is hit it will reset the old byte value and set EIP back to the previous instruction so the real instruction can execute.

A debug register substitute?

I was reading some old articles about debugging, and one of them mentioned the debug registers. Reading some more about these registers and what they can do made me incredibly eager to have some fun with them. However when I tried looking for some more information about how to actually use them I read that they can only be accessed from ring 0 in windows.
I thought that that was the end then, since I'm not going to write a kernel driver just to play with a few registers. But then I thought about the memory editing tool I used to play around with. Cheat engine it's called, and one of the various options of the program was to specify to break on instructions/data that was being executed/accessed/read. That is exactly the same as the debug registers do. So I was wondering: Is there a substitute/replacement for the debug registers in windows? Since I'm sure that the program (cheat engine) doesn't use a kernel driver to set these values.
Thats not true at all, you can set HW debug register from ring3, indirectly (ollydbg does this), for this you need to use SetThreadContext under windows (example).
if you still want a substitute for HW registers, you can use INT3 for code break points and single step trapping for checking if a varibale has changed(highly inefficient).
a good reference is GDB and its source: http://developer.apple.com/library/mac/#documentation/DeveloperTools/gdb/gdbint/gdbint_3.html

State of Registers After Bootup

I'm working on a boot loader on an x86 machine.
When the BIOS copies the contents of the MBR to 0x7c00 and jumps to that address, is there a standard meaning to the contents of the registers? Do the registers have standard values?
I know that the segment registers are typically set to 0, but will sometimes be 0x7c0. What about the other hardware registers?
This early execution environment is highly implementation defined, meaning the implementation of your particular BIOS. Never make any assumptions on the contents of registers. They might be initialized to 0, but they might contain a random value just as well.
from the OS dev Wiki, which is where I get information when I'm playing with my toy OS's
Best option would be to assume nothing. If they have meaning, you will find that from the other side when you need the information they provide.
Undefined, I believe? I think it depends on the mainboard and CPU, and should be treated as random for your own good.
Safest bet is to assume undefined.
Always assume undefined, otherwise you'll hit bad problems if you ever try to port architectures.
There is nothing quite like the pain of porting code that assumes everything uninitialized will be set to zero.
The only thing that I know to be well defined is the processor state immediately after reset.
For the record you can find that in Intel's Software Developer's Manual Vol 3 chapter 8: "PROCESSOR MANAGEMENT AND INITIALIZATION" in the table titled " IA-32 Processor States Following Power-up, Reset, or INIT"
You can always initialize them yourself to start with a known state.

Resources