What is the difference between hardware and software breakpoints? - debugging

What is the difference between hardware and software breakpoints?
Are hardware breakpoints are said to be faster than software breakpoints, if yes then how, and also then why would we need the software breakpoints at all?

This article provides a good discussion of pros and cons:
http://www.nynaeve.net/?p=80
To answer your question directly software breakpoints are more flexible because hardware breakpoints are limited in some functionality and highly architecture-dependant. One example given in the article is that x86 hardware has a limit of 4 hardware breakpoints.
Hardware breakpoints are faster because they have dedicated registers and less overhead than software breakpoints.

Hardware breakpoints are actually comparators, comparing the current PC with the address in the comparator (when enabled). Hardware breakpoints are the best solution when setting breakpoints. Typically set via the debug probe (using JTAG, SWD, ...). The downside of hardware breakpoints: They are limited. CPUs have only a limited number of hardware breakpoints (comparators). The number of available hardware breakpoints depends on the CPU. ARM 7/9 cores have 2, modern ARM devices (Cortex-M 0,3,4) between 2 and 6,
x86 usually 4.
Software breakpoints are in fact set by replacing the instruction to be breakpointed with a breakpoint instruction. The breakpoint instruction is present in most CPUs, and usually as short as the shortest instruction, so only one byte on x86 (0xcc, INT 3). On Cortex-M CPUs, instructions are 2 or 4 bytes, so the breakpoint instruction is a 2 byte instruction.
Software breakpoints can easily be set if the program is located in RAM (such as on a PC). A lot of embedded systems have the program located in flash memory. Here it is not so easy to exchange the instruction, as the flash needs to be reprogrammed, so hardware breakpoints are used primarily. Most debug probes support only hardware breakpoints if the program is located in flash memory. However, some (such as SEGGER's J-Link) allow reprogramming the flash memory with breakpoint instruction and aso allow an unlimited number of (software) breakpoints even when debugging a program located in flash.
More info about software breakpoints in flash memory

You can go through GDB internals, its very well explains the HW and SW breakpoints.
HW breakpoints are something that require support from MCU. The ARM controllers have special registers where you can write some address space, whenever PC (program counter) == sp register CPU halts. Jtag is usually required to write into those special registers.
SW breakpoints are implemented in GDB by inserting a trap, an illegal divide, or some other instruction that will cause an exception, and then when it’s encountered, gdb will take the exception and stop the program. When the user says to continue, gdb will restore the original instruction, single-step, re-insert the trap, and continue on.
There are a lot of advantages in using HW debuggers over SW debuggers especially if you are dealing with interrupts and memory bus devices. AFAIK interrupts cannot be debugged with software debuggers.

In addition to the answers above, it is also important to note that while software breakpoints overwrite specific instructions in the program to know where to stop, the more limited number of hardware breakpoints are actually part of the processor.
Justin Seitz in his book Gray Hat Python points out that the important difference here is that by overwriting instructions, software breakpoints actually change the CRC of the file, and so any sort of program such as a piece of malware which calculates its CRC can change its behavior in response to breakpoints being set, whereas with hardware breakpoints it is less obvious that the debugger is stopping and stepping through certain chunks of code.

In brief, hardware breakpoints make use of dedicated registers and hence are limited in number. These can be set on both volatile and non volatile memory.
Software breakpoints are set by replacing the opcode of instruction in RAM memory with breakpoint instruction. These can be set only in RAM memory(Flash memory is not feasible to be written) and are not limited.
This article provides good explanation about breakpoints.
Thanks and regards,
Shivakumar V W

Software breakpoints put an instruction in RAM that is executed like a TRAP when your program reaches that address.
While hardware breakpoints use a register of the CPU to implement the breakpoint itself. That is why the hardware breakpoints are much faster. And that is why we need software breakpoints: hardware breakpoints are limited to the processor number of registers dedicated to breakpoints.
I have learned it at work today :)

Watchpoints is where it makes a huge difference
This is a case where hardware handling is much faster:
watch var
rwatch var
awatch var
When you enter those commands on GDB 7.7 x86-64 it says:
Hardware watchpoint 2: var
This hardware capability for x86 is mentioned at: http://en.wikipedia.org/wiki/X86_debug_register
It is likely possible because of the existing paging circuit, which manages every memory access.
The "software" alternative is to single step the program, which is very slow.
Compare that to regular breakpoints, where at least the software implementation injects an int3 instruction at the breaking point and lets the program run, so you only pay overhead when a breakpoint is hit.

Some quote from the Intel System Debugger help doc:
Hardware vs. Software Breakpoints
The debugger can use both hardware
and software breakpoints, each of these has strengths and weaknesses:
Hardware Breakpoints are implemented using the DRx architectural
breakpoint registers described in the Intel SDM. They have the
advantage of being usable directly at reset, being non-volatile, and
being usable with flash or other read-only memory. The downside is
that they are a finite resource. Software Breakpoints require
modifying system memory as they are implemented by replacing the
opcode at the desired location with a special instruction. This makes
them an unlimited resource, but the memory dependency mean you cannot
install them prior to a module being loaded in memory, and if the
target software overwrites that memory then they will become invalid.
In general, any debug feature that must be enabled by the debugger
does not persist after a reset, and may be impacted after other
architectural mode transitions such as SMM entry/exit or VM
entry/exit. Specific examples include:
CPU Reset will clear all debug features, except for reset break. This
means for example that user-specified breakpoints will be invalid
until the target halts once after reset. Note that this halt can be
due to either a reset-break, or due to a user-initiated halt. In
either case the debugger will restore the necessary debug features.
SMM Entry/exit will disable/re-enable breakpoints, this means you
cannot specify a breakpoint in SMRAM while halted outside of SMRAM. If
you wish the break within SMRAM, you must first halt at the SMM
entry-break and manually apply the breakpoint. Alternatively you can
patch the BIOS to re-enable breakpoints when entering SMM, but this
requires the ability to modify the BIOS which cannot be used in
production code.

Related

Is it possible to debug BIOS image in GDB?

I have a custom bios ( I actually have the image of the SPI Flash Bios, but in my opinion it should be the same ) and I would like to debug with GDB. Is it possible to run with GDB ? I know where there is the first executed instruction ( 0h3FFFF0 ) inside the file, and the files it should be a sequence of assembler instruction. Is there a way to follow the flow and run step by step the bios ?
First, it's incorrect to assume that SPI image will contain only BIOS executable. It usually much more then that (ie. ME, Setup parameters, flash descriptor). To investigate what is inside your image I suggest to use binwalk <spi_image>, if you will be lucky, it will detect layout of your image. Then you can extract image parts by binwalk -e <spi_image>.
Second, since I don't know if we talk about modern hardware or some obsolete I will try to be general. I don't think GDB is able to run BIOS binary because those binaries doesn't contain any debugging information that GDB can understand. GDB need to know what executable file it deals with, what is the ABI used and what is the physical architecture of target being debugged. Even if it will know all those information the code could be only disassembled because BIOS starts with low level hardware initialization, which cannot be recreated in GDB. On the other hand modern hardware BIOS (UEFI firmware) is typically signed.
What is the best thing you can do with extracted BIOS binary is to get datasheet of your hardware and use disassembler like IDA Pro or radare2 to investigate code flow.

Is there a GDB backend for ARM Debug Architecture?

ARMv7-A&R architecture manual (DDI0-406B) section C1 defines a debug interface through a series of memory mapped (or CP14 mapped) registers. Features include hardware breakpoints, hardware watchpoints, vector catch, and execution of ARM instructions in debug state. It seems like the perfect alternative to JTAG debugging which usually involves expensive cable and software.
From playing around with my Cortex A9 MPcore device, I found that it is possible for one core to enter debug state and have another core control it (step debugging, breakpoint set, etc). I was wondering if anyone has gone beyond that and implemented the GDB remote serial protocol with this interface?

How to debug code which is present on ROM?

I wanna know if normal practice of setting the breakpoints, step-in & step-out works same for the code which reside on ROM also. Do we have to do something extra for ROM debugging.
It will depend largely on the processor an the debug hardware you use. Many microcontrollers include on-chip debug hardware that includes hardware breakpoints that are essentially program-counter comparators. Other facilities may be supported such as data access break-points and instruction trace - essentially an on-chip in-circuit emulator (ICE).
Hardware breakpoints are a necessarily limited resource; for example ARM7 devices have just two while ARM Cortex-M3/4 are endowed with eight.
Either way, to utilise on-chip debug you require suitable debugger hardware (often via JTAG, or a vendor proprietary interface) to interface the target to the host debugger software.
For chips without on-chip debug, you typically use an in-circuit emulator. This is debug hardware that connects to the target board in place of the processor and can be controlled directly by the host debug software. The emulator hardware executes instructions identically to the actual processor but can be halted and stepped and have breakpoints set. Essentially the ICE works like a special version of the target processor with debug support. A true ICE is uncommon on modern processors since on-chip debug capabilities are almost ubiquitous even on small devices such as PIC and AVR, however some external debug hardware can support features not available on on-chip debug. For example Segger's J-Link supports unlimited break-points on ARM7 and Cortex-M3/4.

Memory debugger for linux kernel

Is there any memory debugger for linux kernel?
We have issues with "NULL pointer dereference" kernel oops among other crashes on android/linux arm based hardware.
Thanks
Modern kernels contain a great deal of built-in diagnostic tools (those are available in "Kernel hacking" sub-menu of the kernel source configuration tool). However, on embedded targets one has also an option of using gdb with a good jtag debugger, such as Abatron BDI series (this will, of course, allow for the most precise diagnostics, including diagnostics of interrupt related problems).
In the absence of hardware debugger, the following options can be quite handy to detect memory leaks (don't forget to compile the kernel with "Compile the kernel with debug info" and "Compile the kernel with frame pointers" set):
Kernel memory leak detector - useful in catching kmalloc/kfree errors.
KGDB (with suboptions) - this will enable a built-in gdb server inside the kernel, which can be accessed from a gdb front-end over a serial port. There's also a KGDB_KDB option to do the same manually (by omitting the gdb front end and using a human manageable protocol).
kmemcheck - requires the least of human interaction and the most machine resources, but can be handy in doing initial memory related problem analysis.
There are plenty of other diagnostics options, useful with more specific classes of problems. Most of them are reasonably documented both with kernel configuration tool snippets as well as with separate documents in Documentation/ sub-directory of the source (+ various online publications).

How to monitor resources needed to set watchpoints in gdb?

On x86 GDB uses some special hardware resources (debug registers?) to set watchpoints. In some situations, when there is not enough of that resources, GDB will set the watchpoint, but it won't work.
Is there any way to programmatically monitor the availability of this resources on Linux? Maybe some info in procfs, or something. I need this info to choose machine in pool for debugging.
From GDB Internals:
"Since they depend on hardware resources, hardware breakpoints may be limited in number; when the user asks for more, gdb will start trying to set software breakpoints. (On some architectures, notably the 32-bit x86 platforms, gdb cannot always know whether there's enough hardware resources to insert all the hardware breakpoints and watchpoints. On those platforms, gdb prints an error message only when the program being debugged is continued.)"
"Too many different watchpoints requested. (On some architectures, this situation is impossible to detect until the debugged program is resumed.) Note that x86 debug registers are used both for hardware breakpoints and for watchpoints, so setting too many hardware breakpoints might cause watchpoint insertion to fail."
"The 32-bit Intel x86 processors feature special debug registers designed to facilitate debugging. gdb provides a generic library of functions that x86-based ports can use to implement support for watchpoints and hardware-assisted breakpoints."
I need this info to choose machine in pool for debugging.
No, you don't. The x86 debug registers (there are 4) are per-process resource, not per-machine resource [1]. You can have up to 4 hardware watchpoints for every process you are debugging. If someone else is debugging on the same machine, you are not going to interfere with each other.
[1] More precisely, the registers are multiplexed by the kernel: in the same way as e.g. the EAX register. Every process on the system and the kernel itself uses EAX, there is only a single EAX register on (single-core) CPU, yet it all works fine through the magic of time-slicing.

Resources