How does a breakpoint in a debugger work?

Breakpoints are one of the coolest features supported by most popular debuggers, such as GDB. But how does a breakpoint work? What code modifications does the compiler make to support breakpoints? Are any special hardware features used to support them?

The compiler does not need to "modify" the binary in any way to support breakpoints. However, it is important that:
The compiler includes enough information in the executable (not in the code itself but in special sections of the same file) so that the debugger can relate the source the user wants to debug to the machine code. One typical thing the debugger needs to know in order to set breakpoints (unless you specify addresses directly) is where, that is, at which address within the machine code, the program's functions and source lines start.
The code is not optimized by the compiler in a way that makes it impossible to relate source and machine code. Typically you will want to debug code that was not optimized, or code where only carefully selected optimizations were performed.
The rest of the work is then performed by the debugger itself.
Software breakpoints don't necessarily need special hardware features. Here the debugger relies on modifying the original binary (the copy of it that is loaded into memory). When you set a breakpoint, the debugger places a special instruction at the breakpoint location. This special instruction has to let the debugger detect when it is being executed. It can be an instruction that causes some kind of interrupt or exception that the debugger can hook onto, or an instruction that hands control to a debug unit. If the program runs under an OS, that OS needs to support modifying a running program (with something like ptrace peek/poke). The downside of software breakpoints is that the debugger must be able to modify the running program, which is not possible if the program runs from some kind of read-only memory (quite common in the embedded world).
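To make the patch-and-restore idea concrete, here is a minimal toy sketch in Python. It is not how GDB is implemented: the one-byte instruction set (INC, HALT) and the ToyDebugger class are invented for illustration, and the only real-world detail is that 0xCC is the x86 INT3 opcode. A real debugger would also re-insert the trap after stepping over the restored instruction, which this sketch skips.

```python
# Toy model of a software breakpoint: patch a trap opcode into the in-memory
# copy of the program, then restore the original byte when the trap is hit.
TRAP = 0xCC            # x86 INT3 opcode, reused here as the toy trap
INC, HALT = 0x01, 0x00  # hypothetical one-byte instructions for this sketch


class ToyDebugger:
    def __init__(self, program: bytes):
        self.memory = bytearray(program)  # writable in-memory copy of the binary
        self.saved = {}                   # breakpoint address -> original byte
        self.pc = 0
        self.acc = 0

    def set_breakpoint(self, addr: int) -> None:
        self.saved[addr] = self.memory[addr]  # remember the original instruction
        self.memory[addr] = TRAP              # overwrite it with the trap

    def run(self):
        """Run until a trap or HALT; return the breakpoint address, or None."""
        while True:
            op = self.memory[self.pc]
            self.pc += 1
            if op == TRAP:
                self.pc -= 1  # rewind the PC to the patched address
                # Restore the original byte (one-shot breakpoint in this sketch).
                self.memory[self.pc] = self.saved.pop(self.pc)
                return self.pc
            if op == HALT:
                return None
            if op == INC:
                self.acc += 1
```

With the program bytes([INC, INC, INC, HALT]) and a breakpoint at address 2, the first run() call returns 2 with acc equal to 2, and a second run() call executes the restored instruction and returns None with acc equal to 3.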
Hardware breakpoints (which need to be supported by the CPU) implement similar behavior without modifying the program binary. This is CPU-specific, but usually it lets you at least define a program address at which execution should hit a breakpoint. The CPU continuously compares the current PC with these breakpoint addresses, and once the condition is matched, it breaks execution. The number of these breakpoints is always limited.
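The address-comparison idea can be sketched the same way. This is a toy model, not a description of any particular CPU: the ToyCpu class and its resume parameter are made up for illustration, and the choice of four registers simply mirrors the four address registers (DR0 to DR3) that x86 provides.

```python
class ToyCpu:
    """Toy CPU with a small, fixed set of breakpoint address registers."""

    NUM_BP_REGS = 4  # mirrors x86 DR0-DR3; other architectures differ

    def __init__(self, program_length: int):
        self.program_length = program_length
        self.bp_regs = [None] * self.NUM_BP_REGS
        self.pc = 0

    def set_hw_breakpoint(self, addr: int) -> int:
        """Store addr in a free register and return its slot number."""
        for slot, value in enumerate(self.bp_regs):
            if value is None:
                self.bp_regs[slot] = addr
                return slot
        raise RuntimeError("no free hardware breakpoint register")

    def run(self, resume: bool = False):
        """Advance the PC until it matches a breakpoint register.

        The program memory is never modified. With resume=True the match at
        the current PC is ignored once, so execution can leave a breakpoint.
        """
        skip = resume
        while self.pc < self.program_length:
            if not skip and self.pc in self.bp_regs:
                return self.pc
            skip = False
            self.pc += 1
        return None
```

Setting breakpoints at addresses 3 and 7 in a ten-instruction program makes run() stop at 3, then run(resume=True) stop at 7. Asking for a fifth breakpoint raises an error, which is the "always limited" part in practice.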

To set a breakpoint, we first have to add some special information to the binary. We use the -g flag when compiling the C source files to include this information. The software debugger uses this information to place breakpoints. The best example of hardware breakpoint support that I have experienced is in VxWorks.
Basically, the processor halts at the breakpoint. So internally, any step that raises an exception in the processor can be used to implement a software breakpoint, while a hardware breakpoint works by matching the address stored in hardware registers to cause an exception. Hardware breakpoints are therefore very powerful, but they are heavily architecture-dependent.
A very good explanation is here:
What is the difference between hardware and software breakpoints?
A good introduction with processor-related information is given here:
http://processors.wiki.ti.com/index.php/How_Do_Breakpoints_Work

Related

What features of x86, if any, allow a user-mode program to be aware that it is being debugged?

I seem to recall hearing in the past that an x86 program can detect whether it is being debugged. This was related to malware analysis.
Yet while researching registers, I read up on the debug registers (DRs) and found this statement:
The debug registers are privileged resources; the MOV instructions
that access them can only be executed at privilege level zero. An
attempt to read or write the debug registers when executing at any
other privilege level causes a general protection fault.
So, could someone explain perhaps if there is some alternative feature in x86 that allows a program to become aware that it is being debugged? I think I recall reading something about debuggers actually patching things into a program's instruction memory, but I'm not yet sure (I haven't had time to study the inner workings of a debugger yet).
Could someone clarify whether there is a fundamental feature of x86 that leaks information suggesting the attachment of a debugger to a program? Or perhaps it's just due to something more invasive debuggers might be doing?
No, there is no mechanism built into the x86 architecture that lets a program detect whether it’s being debugged, but there are numerous heuristics for determining whether it’s probably being debugged, e.g.
check whether TracerPid in /proc/self/status is nonzero,
examine the difference between rdtsc readings (obviously generally unsuitable), or
inspect the trap flag
but these methods aren’t infallible.
Ultimately, it could be the case that your program isn’t even running on a bare-metal x86 processor.
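As a rough sketch of the first heuristic, the TracerPid check on Linux could look like the following. The function names tracer_pid and probably_traced are made up for this example; the only thing assumed about the system is the TracerPid: line in /proc/self/status. As said above, it is a heuristic and easy to defeat.

```python
def tracer_pid(status_text: str) -> int:
    """Return the TracerPid value from /proc/<pid>/status text, or 0 if absent."""
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            return int(line.split(":", 1)[1].strip())
    return 0


def probably_traced(path: str = "/proc/self/status") -> bool:
    """Heuristic only: True if some process is currently tracing this one."""
    try:
        with open(path) as f:
            return tracer_pid(f.read()) != 0
    except OSError:
        return False  # no procfs available (e.g. not Linux): nothing to report
```

Calling probably_traced() inside a process started under gdb or strace should report True, since both attach through ptrace; run normally it should report False.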

How to know the assembly code of a Windows system call using Visual Studio?

I am interested in finding out the differences (implementation-wise) between the timeGetTime() and GetTickCount() Windows API functions. As the source code is not public, I am thinking of analysing their implementation through their assembly code in Visual Studio. Can anyone suggest a better idea?
Use a debugger, like Ollydbg, x64dbg, IDA pro or Visual Studio itself for the user-mode code.
If you need to cross over to the kernel-mode side, use WinDBG.
Make a program, as simple as possible, that invokes the APIs you want to analyze, then:
Load it in the debugger, if it is not a system-wide debugger.
Put a breakpoint on the APIs.
Run the program.
Once it breaks, single-step as needed.
The more the debugger understands of Windows (read: the more debugging symbols it has), the easier the analysis.
WinDBG has a rich symbols library, but it is a bit hard to use.
To understand APIs like GetTickCount, you may find these pages useful:
What is the KUSER_SHARED_DATA.
How to show the KUSER_SHARED_DATA memory region in WinDBG.
Who updates the KUSER_SHARED_DATA.
KUSER_SHARED_DATA
As Peter Cordes supposed, Windows exposes frequently accessed information, like the tick counts, in a memory region shared across all user-mode processes.
GetTickCount only copies data from that region (with some synchronization in place).

Debugging an ARM assembly (Neon extension)

I am developing an algorithm that uses ARM Neon instructions. I am writing the code in an assembler file (.S, with no inline asm).
My question is: what is the best way to debug it, i.e. to view registers, memory, etc.?
Currently, I am using Android NDK to compile and my Android phone to run the algorithm.
Poor man's debug solutions...
You can use gdb / gdbserver to remotely control the execution of applications on an Android phone. I'm not giving full details here because they change all the time, but you can, for example, start with this answer or do a quick search on the Internet. Learning GDB might seem to involve a steep learning curve; however, the material on the web is exhaustive. You can easily find something to your taste.
Single-stepping an ARM core via software tools is hard; that's why the ARM ecosystem is full of expensive tools and extra hardware equipment.
The trick I use is to insert BRK instructions manually into the assembly code. BRK is the self-hosted debug breakpoint instruction. When the core sees this instruction, it stops and informs the OS about the situation. The OS then notifies the debugger and passes control to it. Once the debugger has control, you can check the contents of the registers and probably even change them. The last part of the operation is to make your process continue. Since the PC is still at the breakpoint instruction, you must advance the PC, setting it to the instruction after the BRK.
Since you mentioned that you use .S files instead of .s files, you can use gcc to do the preprocessing / macro work. This way, enabling and disabling BRK might become less of an issue.
The big downside of this way of working is turnaround time. If there is a certain point that you want to investigate with gdb, you must make sure there is a BRK instruction there, and this will probably require another build/push/debug cycle.

What are the possible side effects of using GCC profiling flag -pg?

There is a device driver for a camera device provided to us as a .so library file by the vendor.
Only a header file is available; it declares the API functions we can use to work with the device. Our application is linked against the .so library file provided by the vendor and uses these interface functions for our purposes.
When we wanted to measure the time our application spends handling different tasks, we added the GCC -pg flag and compiled and built our application.
But with the executable built with -pg, we observe random failures in the camera image acquire functions. Since we are using the .so library file, we do not know what is going wrong inside those functions.
So in general I want to understand the possible reasons for such a failure mode. Any pointers or documents that explain what goes on inside profiling and its side effects are appreciated.
This answer is a helpful overview of how the gcc -pg flag profiler actually works. The take-home point is mostly to do with possible changes to timing. If your library has any kind of time-sensitivity in it, introducing profiler overheads might be changing the time it takes to execute parts of the code, and perhaps violating some kind of constraint.
The gprof documentation explains the implementation details:
Profiling works by changing how every function in your program is
compiled so that when it is called, it will stash away some
information about where it was called from. From this, the profiler
can figure out what function called it, and can count how many times
it was called. This change is made by the compiler when your program
is compiled with the `-pg' option, which causes every function to call
mcount (or _mcount, or __mcount, depending on the OS and compiler) as
one of its first operations.
So the timing of your application would change quite a bit when you turn on -pg.
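To get a feel for what "every function calls mcount first" means, here is a loose Python analogy. It is not what GCC emits: the instrumented decorator, the call_arcs counter and the two example functions are hypothetical stand-ins, and sys._getframe is a CPython-specific way to find the caller. The point is only that each call now pays for extra bookkeeping on entry.

```python
import sys
from collections import Counter
from functools import wraps

call_arcs = Counter()  # (caller name, callee name) -> number of calls


def instrumented(func):
    """Stand-in for the compiler-inserted mcount call at function entry."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        caller = sys._getframe(1).f_code.co_name  # who called us (CPython only)
        call_arcs[(caller, func.__name__)] += 1   # extra work on every call
        return func(*args, **kwargs)
    return wrapper


@instrumented
def acquire_frame():
    # Hypothetical placeholder for a time-sensitive library call.
    return 42


@instrumented
def process():
    total = 0
    for _ in range(3):
        total += acquire_frame()
    return total
```

After one call to process(), call_arcs records three calls along the arc ("process", "acquire_frame"). The counting itself is the overhead: a function that is called very often, or one with tight timing constraints, runs measurably differently once instrumented.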
If you would like to instrument your code without significantly affecting the timings, you could possibly look at oprofile. It does not pose as significant an overhead as gprof does.
Another fairly recent tool that serves as a good lightweight profiling tool is perf.
Profiling tools are useful primarily for understanding the CPU-bound pieces of your library/application and can help you optimize those critical pieces. Most of the time they serve to identify some culprit function/method that wastes CPU cycles. So do not rely on them as the sole tool for debugging any and all issues.
Most vendor libraries would also provide means to turn on extra debugging or dumping extra information during runtime. They include means such as environment variables, log files, /proc or /sys interfaces for drivers, etc. and sometimes even tools to increase debugging levels at runtime. See if you can leverage these.
If you have defined APIs in a library/driver, you should run unit-tests on them instead of trying to debug the whole application you've built.
If you find a certain unit-test fails, send the source code of the unit-test to your vendor, and ask them to fix the bug. If it is not a bug, your vendor would at least point you towards the right set of APIs or the semantics to use.

Debugging an Operating System

I was going through some general material about operating systems and hit upon a question. How does a developer debug while developing an operating system, i.e. debug the OS itself? What tools are available to the OS developer for debugging?
Debugging a kernel is hard, because you probably can't rely on the crashing machine to communicate what's going on. Furthermore, the faulty code is probably in scary places like interrupt handlers.
There are four primary methods of debugging an operating system of which I'm aware:
Sanity checks, together with output to the screen.
Kernel panics on Linux (known as "Oops"es) are a great example of this. The Linux folks wrote a function that would print out what they could find out (including a stack trace) and then stop everything.
Even warnings are useful. Linux has guards set up for situations where you might accidentally go to sleep in an interrupt handler. The mutex_lock function, for instance, will check (in might_sleep) whether you're in an unsafe context and print a stack trace if you are.
Debuggers
Traditionally, under debugging, everything a computer does is output over a serial line to a stable test machine. With the advent of virtual machines, you can now wire one VM's execution serial line to another program on the same physical machine, which is super convenient. Naturally, however, this requires that your operating system publish what it is doing and wait for a debugger connection. KGDB (Linux) and WinDBG (Windows) are some such in-OS debuggers. VMWare supports this story explicitly.
More recently the VM developers out there have figured out how to debug a kernel without either a serial line or kernel extensions. VMWare has implemented this in their recent stuff.
The problem with debugging in an operating system is (in my mind) related to the Uncertainty principle. Interrupts (where most of your hard errors are sure to be) are asynchronous, frequent and nondeterministic. If your bug relates to the overlapping of two interrupts in a particular way, you will not expose it with a debugger; the bug probably won't even happen. That said, it might, and then a debugger might be useful.
Deterministic Replay
When you get a bug that only seems to appear in production, you wish you could record what happened and replay it, like a security camera. Thanks to a professor I knew at Illinois, you can now do this in a VMWare virtual machine. VMWare and related folks describe it all better than I can, and they provide what looks like good documentation.
Deterministic replay is brand new on the scene, so thus far I'm unaware of any particularly idiomatic uses. They say it should be particularly useful for security bugs, too.
Moving everything to User Space.
In the end, things are still more brittle in the kernel, so there's a tremendous development advantage to following the Nucleus (or Microkernel) design, where you shave the kernel-mode components to their bare minimum. For everything else, you can use the myriad of user-space dev tools out there, and you'll be much happier. FUSE, a user-space filesystem extension, is the canonical example of this.
I like this last idea, because it's like you wrote the program to be writeable. Cyclic, no?
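Going back to the first method in the list (sanity checks), the might_sleep style of guard can be sketched in a few lines. This is a user-space toy in Python, not kernel code: the in_atomic_context flag, the warnings_log list and the dictionary used as a lock are all invented for the example, and a real kernel tracks this state per CPU and prints to the console instead.

```python
import traceback

in_atomic_context = False  # hypothetical flag; set while "handling an interrupt"
warnings_log = []          # collected warnings, instead of printing to a console


def might_sleep() -> None:
    """Record a stack trace if called where sleeping is not allowed."""
    if in_atomic_context:
        warnings_log.append("".join(traceback.format_stack()))


def mutex_lock(lock: dict) -> None:
    might_sleep()          # sanity check before an operation that may block
    lock["held"] = True


def interrupt_handler(lock: dict) -> None:
    """Buggy handler: takes a sleeping lock in a context that must not sleep."""
    global in_atomic_context
    in_atomic_context = True
    try:
        mutex_lock(lock)
    finally:
        in_atomic_context = False
```

Calling mutex_lock from ordinary code logs nothing, while calling interrupt_handler leaves one stack trace in warnings_log that names interrupt_handler, so the offending call path is visible even though nothing actually crashed.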
In a bootstrap scenario (OS from scratch), you'd probably have to introduce remote debugging capabilities (memory dumping, logging, etc.) in the OS kernel early on, and use a separate machine. Or you could use a virtual machine/hypervisor.
Windows CE has a component called KITL - Kernel Independent Transport Layer. I guess the title speaks for itself.
You can use a VM, e.g. debug ring0 code with bochs/gdb
or Debugging NetBSD kernel with qemu
or a serial line with something like KDB.
printf logging
attach to process
serious unit tests
etc..
Remote debugging with kernel debuggers, which can also be done via virtualization.
Debugging an operating system is not for the faint of heart. Because it is the kernel that is being debugged, your options are quite limited. A copious amount of printf statements is one trick. Beyond that, it really depends on which part of the 'operating system' is being debugged; we could be talking about
Filesystem
Drivers
Memory management
Raw Disk input/output
Screen input/output
Kernel
Again, it is a widely varying exercise because all of the above interact with one another. It gets even more complicated: supposing you were to debug the kernel, how would you do it if the runtime environment is not properly set up (by that, I am talking about the kernel's responsibility for loading binary executables)?
Some kernels (not all of them) incorporate a simple debug monitor. In fact, if I recall rightly, in the book titled 'Developing your own 32bit Operating System' by Richard A Burgess, Sams publishing, the author incorporated a debug monitor that displays various states of the CPU, registers and so on.
Again, take into account the fact that binary executables require a certain loading mechanism. A gdb equivalent, for example, is itself such an executable, so if the environment for loading binaries is not set up, your options are quite limited.
Using a copious amount of printf statements to send errors, logs etc. to a separate terminal or to a file is the best line of debugging. It does sound like a nightmare, but it is worth the effort.
Hope this helps,
Best regards,
Tom.
