I know that in GDB,
break *address
places a breakpoint at the specified instruction address, but when I run the code, does the program stop before that instruction executes or after it has run?
The WinDbg command tct executes a program until it reaches a call instruction or a ret instruction. I am wondering how the debugger implements this functionality under the hood.
I could imagine that the debugger scans forward from the current instruction for the next call or ret and sets breakpoints on the instructions it finds. However, I think this is unlikely, because it would also have to take jmp instructions into account, so there could be an arbitrary number of possible call or ret instructions on which such breakpoints would have to be set.
On the other hand, I wonder whether the x86/x64 CPU provides a mechanism that raises an exception, to be caught by the debugger, whenever the CPU is about to execute a call or ret instruction. Yet I have not heard of such a mechanism.
I'd guess that it single-steps repeatedly, until the next instruction is a call or ret, instead of trying to figure out where to set a breakpoint. (Which in the general case could be as hard as solving the Halting Problem.)
It's possible it could optimize that by scanning forward over "straight line" code and setting a breakpoint on the next jmp/jcc/loop or other control-transfer instruction (e.g. xabort), and also catching signals/exceptions that could transfer control to an SEH handler.
I'm also not aware of any HW support for breaking on a certain type of instruction or opcode: the x86 debug registers DR0..7 allow hardware breakpoints at code addresses without rewriting machine code to int3, and also hardware watchpoints (to trap data load/store to a specific address or range of addresses). But not filtering by opcode.
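As a rough illustration of the single-step approach, here is a minimal sketch for Linux/x86-64 using ptrace. A real debugger needs a full instruction decoder (prefixes, other opcode forms), so the check below handles only the common one-byte cases and is purely illustrative:

    #include <sys/types.h>
    #include <sys/ptrace.h>
    #include <sys/user.h>
    #include <sys/wait.h>

    /* Crude check of the first bytes at the next RIP: RET (C3/C2),
     * CALL rel32 (E8), or indirect CALL (FF /2, FF /3). Incomplete by
     * design; see the caveat above. */
    static int is_call_or_ret(long word)
    {
        unsigned char op    = word & 0xFF;
        unsigned char modrm = (word >> 8) & 0xFF;
        if (op == 0xC3 || op == 0xC2 || op == 0xE8)
            return 1;
        if (op == 0xFF) {
            unsigned char reg = (modrm >> 3) & 7;
            return reg == 2 || reg == 3;   /* FF /2, FF /3 are CALLs */
        }
        return 0;
    }

    /* Single-step the stopped tracee until the next instruction to
     * execute looks like a call or ret. */
    void trace_to_call_or_ret(pid_t pid)
    {
        struct user_regs_struct regs;
        int status;
        for (;;) {
            ptrace(PTRACE_GETREGS, pid, 0, &regs);
            long word = ptrace(PTRACE_PEEKTEXT, pid, (void *)regs.rip, 0);
            if (is_call_or_ret(word))
                break;                     /* stop before the call/ret runs */
            ptrace(PTRACE_SINGLESTEP, pid, 0, 0);
            waitpid(pid, &status, 0);
        }
    }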
Is there a SPARC equivalent to x86's single step mode? What I want is to stop execution after every instruction and move control flow to a trap handler or something similar.
I thought of using the ta instruction in the branch delay slot, but this would not work when the preceding instruction is a branch with the annul bit set.
SPARC lacks a single-step bit in the PSR, so it's harder to single-step. But I've used a trick to get closer. Set TPC to the address of the instruction you want to single-step, and set TNPC to an address someplace else where you've placed a trap instruction. When you execute the retry instruction to get back to the process context, it will single-step the one instruction you want, then it will next execute the trap instruction, which will bring you right back to the kernel, where you can do whatever you want. (N.B. this is for sparc64; not sure about sparc32.) This is a nice trick because you don't have to modify existing instructions in the user's address space. That was important to me, since I was single-stepping instructions in the kernel.
Another idea I had, but never tried, was to simply set TNPC to an illegal address. Then after the instruction at TPC was executed, you'd get an automatic trap back into the kernel. And since the trap handling code knows that the process is being single stepped, there would be no confusion over a "real" illegal address trap.
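To make the trick concrete, here is a minimal sketch in C. The trapframe layout and the trap_stub symbol are hypothetical stand-ins; real kernels expose TPC/TNPC through their own trap-frame structures:

    /* Hypothetical trap-frame layout exposing the saved TPC/TNPC. */
    struct trapframe { unsigned long tpc, tnpc; };

    extern char trap_stub[];   /* a 'ta' instruction we placed somewhere */

    /* Arm a one-instruction step: 'retry' will run the instruction at
     * insn_addr, then fall into our trap stub and come straight back
     * into the kernel's trap handler. */
    void arm_single_step(struct trapframe *tf, unsigned long insn_addr)
    {
        tf->tpc  = insn_addr;
        tf->tnpc = (unsigned long)trap_stub;
    }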
I recently set up my system for kernel debugging using qemu+gdb. At present, I can set breakpoints at, for example, __do_page_fault() and trace the call via gdb (with the win command). Now I want to do the following: take a simple C program with a "hello world" printf statement, and trace the call sequence starting from userspace down to the write() system call (or anything in kernel space that is invoked during the execution of that particular userspace program). I want to learn how a userspace program traps into a system call, specifically with respect to the Linux kernel.
My doubt is where to set the breakpoints. We have the kernel code as well as the C code of the program. How should I go about this? Please give an explanation with an example.
Thank you!
The easiest way, in my opinion, is to separate this into two pieces:
1. Place a breakpoint in the guest kernel using the host gdb.
2. Place a breakpoint in the user code before the trap instruction, using the in-guest target gdb. When it is hit, print the stack using the target (in-qemu) gdb. You will get the user-space stack trace.
3. Continue execution in the guest gdb.
4. The in-kernel breakpoint (set in step 1) will be hit in the host gdb. Print the kernel stack trace.
P.S.
If your kernel hits the breakpoint continuously (the write syscall, for example, is definitely used widely), you can use a conditional breakpoint so that it only triggers when certain parameters are passed.
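For example (the symbol and parameter names here are assumptions that vary with kernel version and debug info), a condition such as "break ksys_write if fd == 1" stops only on writes to file descriptor 1.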
How does a debugger set breakpoints if the image is in read-only memory? I know there are hardware breakpoints, but in the debugger I use (OllyDbg) those have to be set specially using a different dialog than normal breakpoints.
Explanation:
Here is a routine in a debugger that is comparing itself to a copy of itself. EDX points to the running image; EBX points to the known-good copy of the image. The breakpoint at 4010CE is only reached if there is a mismatch. The character being compared is in the AL register. As you can see, the debugger shows EB F6 at 10CE, but this is false: 10CE actually contains CC, as you can see by looking at the AL register. This is because the debugger has secretly inserted the CC to implement the breakpoint.
The debugger first has to change the memory protection of the page it wants to write to. This can be done with VirtualProtectEx. After that it is able to write with WriteProcessMemory and then set the protection back to the original value.
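A minimal sketch of that sequence, assuming a process handle opened with PROCESS_VM_OPERATION | PROCESS_VM_READ | PROCESS_VM_WRITE:

    #include <windows.h>

    /* Plant an int3 (0xCC) at 'addr' in the target, saving the original
     * byte so it can be restored later. */
    BOOL set_sw_breakpoint(HANDLE hProcess, LPVOID addr, BYTE *savedByte)
    {
        DWORD oldProtect;
        BYTE int3 = 0xCC;
        SIZE_T n;

        /* Make the page writable (code pages are normally execute/read). */
        if (!VirtualProtectEx(hProcess, addr, 1, PAGE_EXECUTE_READWRITE, &oldProtect))
            return FALSE;
        /* Save the original byte, then overwrite it with the trap. */
        if (!ReadProcessMemory(hProcess, addr, savedByte, 1, &n) ||
            !WriteProcessMemory(hProcess, addr, &int3, 1, &n))
            return FALSE;
        /* Don't let a stale instruction cache hide the new byte. */
        FlushInstructionCache(hProcess, addr, 1);
        /* Restore the original protection. */
        VirtualProtectEx(hProcess, addr, 1, oldProtect, &oldProtect);
        return TRUE;
    }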
Let me preface this with a disclaimer that I'm not familiar with your particular toolset.
If you haven't enabled hardware breakpoints, the only remaining breakpoint type is a software breakpoint. On x86 (because that's what I'm most familiar with), these work by replacing the first byte of an instruction with a trap instruction, and they are only routed through your OS's breakpoint mechanism to your debugger if the correct trap instruction for your OS is used and the debugger has already registered itself with the OS as the debugger for this process. For the software breakpoint to fire at the correct moment, the trap instruction must be written into your code segment over the first byte of the target instruction.
The two answers that got here first explain the two scenarios that could get you here (at least, the only two I can think of):
1. The kernel always has write access everywhere, except for hardware-protected pages (i.e., some sort of ROM), which your process's memory is almost certainly not. It can write the breakpoint instruction regardless of the permissions exposed to the user process being debugged.
2. The debugger must use some syscall to change the access rights on the memory of the target process before inserting the breakpoint.
Personally, I'm guessing the first thing is happening. The segment permissions are only in place to protect your target process from itself, not from a debugger process or from the kernel. Debugging mechanisms in operating systems pretty regularly violate "normal" permissions to allow the debugger to do whatever it wants to the target process. This, of course, is why some operating systems require you to enter a password before you're allowed to use the debugger in certain scenarios.
However, you can test whether it's the second one by attempting to write to the code segment from inside the target process after a breakpoint has been set. If the write succeeds, you know the permissions have been lowered by the OS (to allow the process to be debugged). It would be pretty awkward for the OS to require a debugger to jump through this hoop, since the debugger can already insert arbitrary code into the writable parts of memory and then force a jump to it by generating a stack-frame overflow.
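A quick way to run that experiment from inside the debuggee (the function name is arbitrary; the function-to-data-pointer cast is non-standard C but works on mainstream compilers):

    #include <stdio.h>

    void victim(void) { puts("victim"); }

    int main(void)
    {
        /* Set a debugger breakpoint on victim() first, then run this. */
        unsigned char *p = (unsigned char *)victim;
        unsigned char saved = *p;  /* reading code pages is always fine */
        *p = saved;                /* faults unless the OS lowered the
                                      page protection for the debuggee  */
        puts("code page was writable");
        return 0;
    }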
The debugger takes advantage of the WriteProcessMemory() function to alter the instruction in place, keeping a copy of the original byte. When the breakpoint is hit, it restores the old byte value and sets EIP back to the breakpoint address (the int 3 leaves EIP pointing one byte past it) so the real instruction can execute.
Assume a debugger (a common x86 ring-3 debugger such as OllyDbg, IDA, gdb, ...) sets a software breakpoint at virtual address 0x1234. This is accomplished by replacing whatever opcode is at 0x1234 with 0xCC. Now let's assume the debuggee process runs this 0xCC instruction and raises a software exception, and the debugger catches it. The debugger inspects memory contents and registers, does some stuff, and now wants to resume the debuggee process.
That much I know. From here on is my assumption:
1. The debugger restores the original opcode of the debuggee (the one that was replaced with 0xCC) in order to resume execution.
2. The debugger manipulates the EIP in the debuggee's CONTEXT to point at the recovered instruction.
3. The debugger handles the exception, and the debuggee resumes from the breakpoint.
But the debugger wants the breakpoint to remain. How can the debugger manage this?
To answer the original question directly, from the GDB internals manual:
When the user says to continue, GDB will restore the original
instruction, single-step, re-insert the trap, and continue on.
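On Linux, that dance can be sketched with ptrace roughly as follows (x86-64; addr is the breakpoint address and orig_word is the text word that was saved when the 0xCC byte was planted there):

    #include <sys/types.h>
    #include <sys/ptrace.h>
    #include <sys/user.h>
    #include <sys/wait.h>
    #include <stdint.h>

    /* Resume a stopped tracee past a software breakpoint while keeping
     * the breakpoint armed for future hits. */
    void step_over_breakpoint(pid_t pid, uintptr_t addr, long orig_word)
    {
        struct user_regs_struct regs;
        int status;

        /* The int3 trap leaves RIP one byte past the breakpoint; rewind it. */
        ptrace(PTRACE_GETREGS, pid, 0, &regs);
        regs.rip = addr;
        ptrace(PTRACE_SETREGS, pid, 0, &regs);

        /* Restore the original instruction bytes. */
        ptrace(PTRACE_POKETEXT, pid, (void *)addr, (void *)orig_word);

        /* Single-step exactly one instruction. */
        ptrace(PTRACE_SINGLESTEP, pid, 0, 0);
        waitpid(pid, &status, 0);

        /* Re-insert the trap so the breakpoint remains. */
        long word = ptrace(PTRACE_PEEKTEXT, pid, (void *)addr, 0);
        ptrace(PTRACE_POKETEXT, pid, (void *)addr,
               (void *)((word & ~0xFFL) | 0xCC));
    }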
In short, and in plain words:
Entering debug state is an atomic operation on x86 and on ARM; the processor enters and exits debug state just as it executes any other instruction in the architecture.
See the gdb documentation, which explains how this works and how it can be used.
Here are some highlights from ARM and X86 specifications:
In ARM:
SW (Software) breakpoints are implemented by temporarily replacing the
instruction opcode at the breakpoint location with a special
"breakpoint" instruction immediately prior to stepping or executing
your code. When the core executes the breakpoint instruction, it will
be forced into debug state. SW breakpoints can only be placed in RAM
because they rely on modifying target memory.
A HW (Hardware) breakpoint is set by programming a watchpoint unit to monitor the core
busses for an instruction fetch from a specific memory location. HW
breakpoints can be set on any location in RAM or ROM. When debugging
code where instructions are copied (Scatterloading), modified or the
processor MMU remaps areas of memory, HW breakpoints should be used.
In these scenarios SW breakpoints are unreliable as they may be either
lost or overwritten.
In x86:
The way software breakpoints work is fairly simple. Speaking about x86
specifically, to set a software breakpoint, the debugger simply writes
an int 3 instruction (opcode 0xCC) over the first byte of the target
instruction. This causes an interrupt 3 to be fired whenever execution
is transferred to the address you set a breakpoint on. When this
happens, the debugger “breaks in” and swaps the 0xCC opcode byte with
the original first byte of the instruction when you set the
breakpoint, so that you can continue execution without hitting the
same breakpoint immediately. There is actually a bit more magic
involved that allows you to continue execution from a breakpoint and
not hit it immediately, but keep the breakpoint active for future use;
I’ll discuss this in a future posting.
Hardware breakpoints are, as you might imagine given the name, set
with special hardware support. In particular, for x86, this involves a
special set of perhaps little-known registers known as the “Dr”
registers (for debug register). These registers allow you to set up to
four (for x86, this is highly platform specific) addresses that, when
either read, read/written, or executed, will cause the processor to
throw a special exception that causes execution to stop and control to
be transferred to the debugger.
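For a concrete feel for the “Dr” registers, here is a minimal sketch of arming an execute breakpoint in a Linux tracee via ptrace(PTRACE_POKEUSER). The DR7 bit layout (L0 enable in bit 0; condition bits 16-17 = 00b meaning break on instruction execution) is from the Intel SDM:

    #include <sys/types.h>
    #include <sys/ptrace.h>
    #include <sys/user.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Arm hardware breakpoint slot 0 on an instruction address. */
    int set_hw_breakpoint(pid_t pid, uintptr_t addr)
    {
        /* DR0 holds the linear address to watch. */
        if (ptrace(PTRACE_POKEUSER, pid,
                   (void *)offsetof(struct user, u_debugreg[0]),
                   (void *)addr) == -1)
            return -1;
        /* DR7 = 0x1: locally enable DR0; condition/length bits of zero
         * select "break on execution", 1 byte. */
        if (ptrace(PTRACE_POKEUSER, pid,
                   (void *)offsetof(struct user, u_debugreg[7]),
                   (void *)0x1UL) == -1)
            return -1;
        return 0;
    }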