Tools for seeing FPU registers? - Windows

I can use a debugger like the one included in VC 2015, attach to a running process, pause it, and look at the current values of the registers. Is there a way to also see the FPU registers and the flags set by _control87() or _controlfp()?

Yes, the debugger has a window for that: use Debug > Windows > Registers. Right-click it and tick "Floating point"; you'll now see the eight STx registers and the CTRL register, the one affected by _controlfp().
Do beware that the FPU doesn't get used much anymore. The C/C++ compiler in VS2015, for example, no longer generates FPU instructions and hasn't done so since VS2010. You can add the SSE, SSE2 and AVX registers with that same context menu. And beware that the flags you pass to _controlfp() don't have the same values as the bits in the FPU's CTRL register or SSE's MXCSR register.

Related

GDB doesn't disassemble program running in RAM correctly

I have an application compiled using GCC for an STM32F407 ARM processor. The linker stores it in Flash, but is executed in RAM. A small bootstrap program copies the application from Flash to RAM and then branches to the application's ResetHandler.
memcpy(appRamStart, appFlashStart, appRamSize);
// run the application
__asm volatile (
    "ldr r1, =_app_ram_start\n\t" // load a pointer to the application's vectors
    "add r1, #4\n\t"              // increment vector pointer to the second entry (ResetHandler pointer)
    "ldr r2, [r1, #0x0]\n\t"      // load the ResetHandler address via the vector pointer
    // bit[0] must be 1 for THUMB instructions otherwise a bus error will occur
    "bx r2"                       // jump to the ResetHandler - does not return from here
);
This all works OK, except that when I try to debug the application from RAM (using GDB from Eclipse) the disassembly is incorrect. The curious thing is that the debugger gets the source code right, and will accept and halt on breakpoints that I have set. I can single-step the source code lines. However, when I single-step the assembly instructions, they make no sense at all, and the disassembly contains numerous undefined instructions. I'm assuming it is some kind of alignment problem, but it all looks correct to me. Any suggestions?
It is possible that GDB relies on the symbol table to determine the instruction-set mode, which can be Thumb(2) or ARM. When you move code to RAM it probably can't find this information and falls back to ARM mode.
You can use set arm force-mode thumb in GDB to force Thumb-mode disassembly.
As a side note, if you get illegal instructions when debugging an ARM binary, this is generally the problem, provided the output isn't complete nonsense such as an attempt to disassemble data sections.
I personally find it strange that tools don't try a heuristic approach when disassembling ARM binaries. In auto mode it shouldn't be hard to try both modes and count decoding errors to decide which mode to use as a last resort.
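For reference, a typical GDB session applying the fix might look like this (a sketch; the auto default is what falls back to ARM mode when the symbol information is missing):

```gdb
(gdb) show arm force-mode        # defaults to "auto"
(gdb) set arm force-mode thumb   # disassemble everything as Thumb
(gdb) x/8i $pc                   # re-examine the disassembly at the PC
```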

How does a debugger resume from a breakpoint?

Assume a debugger (a common x86 ring-3 debugger such as OllyDbg, IDA, or GDB) sets a software breakpoint at virtual address 0x1234.
This is accomplished by replacing whatever opcode is at 0x1234 with 0xCC.
Now let's assume that the debuggee process runs this 0xCC instruction, raises a software exception, and the debugger catches it.
The debugger inspects memory contents and registers, does some stuff, and now it wants to resume the debuggee process.
That much I know for sure; from here on is my assumption.
The debugger restores the original opcode of the debuggee (the one that was replaced with 0xCC) in order to resume execution.
The debugger sets the EIP in the debuggee's CONTEXT to point to the restored instruction.
The debugger handles the exception, and the debuggee resumes from the breakpoint.
But the debugger wants the breakpoint to remain.
How can the debugger manage this?
To answer the original question directly, from the GDB internals manual:
When the user says to continue, GDB will restore the original
instruction, single-step, re-insert the trap, and continue on.
In short, in plain words:
Entering and leaving the debug state is atomic on both x86 and ARM; the processor enters and exits it just as it executes any other instruction in the architecture.
The GDB documentation explains how this works and how it can be used.
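The restore / single-step / re-insert cycle from the GDB manual can be sketched as a toy simulation (one-byte "instructions" in a bytearray; the class and its layout are invented for illustration, not taken from any real debugger):

```python
# Toy simulation of how a debugger resumes from a software breakpoint.
# "Memory" is a bytearray of one-byte "instructions"; 0xCC stands in for int3.

BREAK = 0xCC

class ToyDebugger:
    def __init__(self, memory):
        self.mem = bytearray(memory)
        self.saved = {}          # address -> original byte
        self.pc = 0
        self.trace = []          # addresses of executed instructions

    def set_breakpoint(self, addr):
        self.saved[addr] = self.mem[addr]
        self.mem[addr] = BREAK   # overwrite the opcode with the trap

    def run(self):
        # Execute until we hit a breakpoint or run off the end.
        while self.pc < len(self.mem):
            if self.mem[self.pc] == BREAK:
                return self.pc   # trap: stop with pc at the breakpoint
            self.trace.append(self.pc)
            self.pc += 1
        return None

    def resume(self):
        # The GDB recipe: restore the original instruction, single-step it,
        # re-insert the trap, then continue.
        addr = self.pc
        self.mem[addr] = self.saved[addr]   # restore original instruction
        self.trace.append(addr)             # "single-step" that instruction
        self.pc += 1
        self.mem[addr] = BREAK              # re-insert the trap
        return self.run()                   # continue

dbg = ToyDebugger(bytes(range(1, 9)))  # 8 dummy one-byte instructions
dbg.set_breakpoint(3)
stop = dbg.run()       # stops at address 3
stop2 = dbg.resume()   # original byte executes, trap is re-armed, run to end
```

After resume() the breakpoint byte is back in memory, so a future pass over address 3 would trap again, which is exactly the property the question asks about.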
Here are some highlights from ARM and X86 specifications:
in ARM:
SW (Software) breakpoints are implemented by temporarily replacing the
instruction opcode at the breakpoint location with a special
"breakpoint" instruction immediately prior to stepping or executing
your code. When the core executes the breakpoint instruction, it will
be forced into debug state. SW breakpoints can only be placed in RAM
because they rely on modifying target memory.
A HW (Hardware) breakpoint is set by programming a watchpoint unit to monitor the core
busses for an instruction fetch from a specific memory location. HW
breakpoints can be set on any location in RAM or ROM. When debugging
code where instructions are copied (Scatterloading), modified or the
processor MMU remaps areas of memory, HW breakpoints should be used.
In these scenarios SW breakpoints are unreliable as they may be either
lost or overwritten.
In X86:
The way software breakpoints work is fairly simple. Speaking about x86
specifically, to set a software breakpoint, the debugger simply writes
an int 3 instruction (opcode 0xCC) over the first byte of the target
instruction. This causes an interrupt 3 to be fired whenever execution
is transferred to the address you set a breakpoint on. When this
happens, the debugger “breaks in” and swaps the 0xCC opcode byte with
the original first byte of the instruction when you set the
breakpoint, so that you can continue execution without hitting the
same breakpoint immediately. There is actually a bit more magic
involved that allows you to continue execution from a breakpoint and
not hit it immediately, but keep the breakpoint active for future use;
I’ll discuss this in a future posting.
Hardware breakpoints are, as you might imagine given the name, set
with special hardware support. In particular, for x86, this involves a
special set of perhaps little-known registers known as the “Dr”
registers (for debug register). These registers allow you to set up to
four (for x86, this is highly platform specific) addresses that, when
either read, read/written, or executed, will cause the processor to
throw a special exception that causes execution to stop and control to
be transferred to the debugger.

CPU Switches from Kernel mode to User Mode on X86 : When and How?

When and how does the CPU switch from kernel mode to user mode on x86? What exactly does it do? How does it make this transition?
In x86 protected mode, the current privilege level that the CPU is executing in is controlled by the two least significant bits of the CS register (the RPL field of the segment selector).
So a switch from kernel mode (CPL=0) to user mode (CPL=3) is accomplished by replacing a kernel-mode CS value with a user-mode one. There are many ways to do this, but a typical one is the IRET instruction, which pops the EIP, CS and EFLAGS registers from the stack.
iret does this, for example. See the code here (the INTERRUPT_RETURN macro).

How do I set a software breakpoint on an ARM processor?

How do I do the equivalent of an x86 software interrupt:
asm( "int $3" )
on an ARM processor (specifically a Cortex A8) to generate an event that will break execution under gdb?
Using the arm-none-eabi toolchain (arm-none-eabi-gdb.exe), this works great for me (thanks to Igor's answer):
__asm__("BKPT");
ARM does not define a specific breakpoint instruction. It can be different in different OSes. On ARM Linux it's usually an UND opcode (e.g. FE DE FF E7) in ARM mode and BKPT (BE BE) in Thumb.
With GCC compilers, you can usually use __builtin_trap() intrinsic to generate a platform-specific breakpoint. Another option is raise(SIGTRAP).
I have a simple library (scottt/debugbreak) just for this:
#include <debugbreak.h>
...
debug_break();
Just copy the single debugbreak.h header into your code and it'll correctly handle ARM, AArch64, i386, x86-64 and even MSVC.
__asm__ __volatile__ ("bkpt #0");
See BKPT man entry.
For Windows on ARM, the intrinsic __debugbreak() still works; it uses an undefined opcode:
nt!DbgBreakPointWithStatus:
defe __debugbreak
Although the original question asked about the Cortex-A8, which is ARMv7-A, on ARMv8 GDB uses
brk #0
On my armv7hl system (i.MX6q with Linux 4.1.15), to set a breakpoint in another process, I use:
ptrace(PTRACE_POKETEXT, pid, address, 0xe7f001f0)
I chose that value after strace'ing gdb :)
This works perfectly: I can examine the traced process, restore the original instruction, and restart the process with PTRACE_CONT.
We can use a breakpoint instruction:
For A64: use the BRK #imm instruction.
For A32 (ARM) and Thumb: use the BKPT #imm instruction.
Alternatively, we can use the UND pseudo-instruction to generate an undefined instruction, which will cause an exception if the processor attempts to execute it.

PPC breakpoints

How is a breakpoint implemented on PPC (On OS X, to be specific)?
For example, on x86 it's typically done with the INT 3 instruction (0xCC) -- is there an instruction comparable to this for ppc? Or is there some other way they're set/implemented?
With gdb and a function that hexdumps itself, I get 0x7fe00008. This appears to be the tw instruction:
0b01111111111000000000000000001000
011111      primary opcode (31)
11111       TO field: trap if lt, gt, eq, logically lt, or logically gt
00000       rA (r0)
00000       rB (r0)
0000000100  extended opcode (4 = tw)
0           reserved
i.e. compare r0 to r0 and trap on any result.
The GDB disassembly is simply the extended mnemonic trap
EDIT: I'm using "GNU gdb 6.3.50-20050815 (Apple version gdb-696) (Sat Oct 20 18:20:28 GMT 2007)"
EDIT 2: It's also possible that conditional breakpoints will use other forms of tw or twi if the required values are already in registers and the debugger doesn't need to keep track of the hit count.
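The field decode above can be checked mechanically. A small sketch, using PowerPC's big-endian bit numbering where bit 0 is the most significant (the helper function is invented for illustration):

```python
# Decode the PowerPC word 0x7fe00008 into the fields of tw.
# PPC bit numbering is big-endian: bit 0 is the most significant bit.
word = 0x7FE00008

def bits(value, start, end):
    """Extract PPC big-endian bits [start, end] from a 32-bit word."""
    width = end - start + 1
    return (value >> (31 - end)) & ((1 << width) - 1)

opcode = bits(word, 0, 5)    # primary opcode: 31
to     = bits(word, 6, 10)   # trap-on conditions: 0b11111 = trap always
ra     = bits(word, 11, 15)  # rA = r0
rb     = bits(word, 16, 20)  # rB = r0
xo     = bits(word, 21, 30)  # extended opcode: 4 = tw
print(opcode, bin(to), ra, rb, xo)
```

With the TO field all ones, at least one of the five trap conditions always holds when comparing r0 to itself, which is why the instruction traps unconditionally.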
Besides software breakpoints, PPC also supports hardware breakpoints, implemented via IABR (and possibly IABR2, depending on the core version) registers. These are instructions breakpoints, but there are also data breakpoints (implemented with DABR and, possibly, DABR2). If your core supports two sets of hardware breakpoint registers (i.e. IABR2 and DABR2 are present), you can do more than just trigger on a specific address: you can specify a whole contiguous range of addresses as a breakpoint target. For data breakpoints, you can also specify whether you want them to trigger on write, or read, or any access.
Best guess is a 'tw' or 'twi' instruction.
You could dig into the source code of PPC gdb, OS X probably uses the same functionality as its FreeBSD roots.
PowerPC architectures use "traps".
http://publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp?topic=/com.ibm.aix.aixassem/doc/alangref/twi.htm
Instruction breakpoints are typically realised with the TRAP instruction or with the IABR debug hardware register.
Example implementations:
ArchLinux, Apple, Wii and Wii U.
I'm told by a reliable (but currently inebriated, so take it with a grain of salt) source that it's a zero instruction which is illegal and causes some sort of system trap.
EDIT: Made into community wiki in case my friend is so drunk that he's talking absolute rubbish :-)
