How to read GDTR and LDTR in kgdb?

This question deals with why you can't read the GDTR and LDTR in user-mode GDB. But I don't see why it shouldn't be possible when debugging a Linux kernel (with KGDB compiled in), using GDB on another machine over a serial cable.
The kernel being debugged should be able to tell the debugger the values of the GDTR and LDTR, but it doesn't seem there is any GDB command to make it do so. Is there a good reason for this? Is it just something which nobody has implemented?

As you say, nobody implemented it.
gdb in particular doesn't consider those valid registers, so the kernel debug interface doesn't even try to send them.
Unless you are willing to change gdb, you must use a workaround to get that information. One possibility I can think of is the qThreadExtraInfo remote-protocol query, which can carry an arbitrary string that gets printed in gdb. So you could add that information in kernel/debug/gdbstub.c.
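As a starting point, the registers themselves are easy to read on x86: the unprivileged sgdt/sldt instructions store them to memory. Below is a minimal user-space sketch (x86-64; the 10-byte operand layout is per the Intel SDM; on CPUs with UMIP enabled these instructions may trap or return dummy values in user mode, whereas inside the kernel stub they always work) of the kind of text the stub could format into such a message:

#include <stdint.h>
#include <stdio.h>

/* Memory operand of SGDT in 64-bit mode: 2-byte limit, 8-byte base. */
struct __attribute__((packed)) dtr {
    uint16_t limit;
    uint64_t base;
};

int main(void)
{
    struct dtr gdtr = {0};
    uint16_t ldtr = 0;

    __asm__ volatile("sgdt %0" : "=m"(gdtr));  /* store GDTR */
    __asm__ volatile("sldt %0" : "=m"(ldtr));  /* store LDTR selector */

    printf("GDTR base=0x%016llx limit=0x%04x LDTR=0x%04x\n",
           (unsigned long long)gdtr.base, gdtr.limit, ldtr);
    return 0;
}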

Related

GDB does not break on a read access to an address caught by PIN

I've captured a read memory trace (i.e. the addresses of all read accesses) of a program run with Intel PIN; ASLR was off. I can capture it several times and the trace is still exactly the same.
Then I take an address from the trace (a specific address I am interested in) and set a watchpoint on it in GDB with rwatch *0x7fffe309e643, but GDB does not notice the access. I'm doing this in the same terminal where the trace was taken, so the program should access the same addresses; when I capture the trace again, it is still exactly the same.
Do you have any ideas or hints why GDB does not catch it?
I have to note that the access very likely happens within a (C++) structure loaded from a Boost serialization and that the code has been compiled with -g3 -O1 flags (-O1 is really crucial for other reasons but a minimal example showed me that GDB breaks correctly even with -O1).
UPDATE:
As suggested, I tried the PIN debugger (which actually utilizes GDB), but there is some problem with watchpoints. To present this problem, I made a minimal example.
int main()
{
    unsigned char i = 4;
    i++;
    i = 3;
    return 0;
}
I acquired a read memory trace with PIN and chose 0x7fffffffd89f (or similar) as the address of interest. Then I started the PIN debugger and GDB in parallel terminals and connected GDB to PIN ((gdb) target remote :57946, as suggested by PIN). In GDB I set a watchpoint with (gdb) rwatch *0x7fffffffd89f and started execution, but I get
Continuing.
Warning:
Could not insert hardware watchpoint 1.
Could not insert hardware breakpoints:
You may have requested too many hardware breakpoints/watchpoints.
even though I only inserted a single watchpoint. I also tried to explicitly disable hardware watchpoints with set can-use-hw-watchpoints 0, but then it ran for an hour with no result and no processor load, so I killed it.
Is there any other way to set a functional watchpoint at a specific address when GDB is connected to the PIN debugger? Note that there is no such problem with standalone GDB.
I use GDB 7.7.1, PIN 2.14-71313 and GCC 4.4.7, running Ubuntu 14.04.
That would then likely solve the original problem as well.
From the Pin user guide:
The "pin" Executable (Launcher)
The kit's root directory contains a "pin" executable. This is a 32-bit
launcher, used for launching Pin in 32 and 64 bit modes. The launcher
sets up the environment to find the libraries supplied with the kit.
It is likely that the environment is different when running pinbin, and therefore stack addresses will also be different (0x7fffe309e643 looks like a possible stack address) even without ASLR.
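A quick way to convince yourself: the environment block sits at the top of the stack, so changing its size shifts every stack address below it, even with randomization off. A tiny demonstration (file name illustrative; setarch -R disables ASLR for the launched process):

#include <stdio.h>

int main(void)
{
    int local;
    printf("&local = %p\n", (void *)&local);  /* a stack address */
    return 0;
}

$ cc -o stackaddr stackaddr.c
$ setarch $(uname -m) -R ./stackaddr
$ setarch $(uname -m) -R env PADDING=xxxxxxxxxxxxxxxx ./stackaddr

The two runs print different addresses purely because the environment differs, which is exactly what happens between a plain run and a run under the pin launcher.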
Your best bet is probably to use Pin's application debugging interface.

Why is gdb automatically doing a read after my requested write?

I'm using gdb to communicate with a LEON2-based ASIC through a home-made gdb server (not sure, though, that "gdb server" is the correct phrase here). It works like this: the gdb client uses the ordinary gdb protocol to talk to the gdb server, which then translates the gdb requests into reads and writes from/to the HW and sends back the result to the client, if any. My gdb client is sparc-rtems-gdb 6.6 from RTEMS 4.8.0 on a Windows 7 PC.
When I start the gdb client I run the following command to attach to the gdb server:
target extended-remote localhost:5000
Then I want to change a word in RAM so I run this gdb command:
set *((unsigned int*) 0x40000000)=2
While debugging the gdb server I can see that it receives the following line, which is expected and correct according to the gdb protocol, i.e. writing 4 bytes, value 2 to address 0x40000000:
M40000000,4:00000002
Now the confusion: After the write request above, another request comes from the gdb client, read 4 bytes from address 0xABD37787:
mabd37787,4
Why is the gdb client trying to read from that address? As far as I know, I haven't done anything to request this read, I only wanted to perform the write. If gdb would have read back the address 0x40000000, for example to verify the write, it would be OK. But the out-of-nowhere-address 0xABD37787 does not exist on my HW, which causes problems for me.
Is there any way that I can debug the gdb client to determine exactly what (and why) it is sending and receiving? Or is there a setting in gdb that can explain this behaviour?
While debugging the gdb server I can see that it receives the following line
You don't need to be debugging the gdbserver. You can simply turn on set debug remote 1 in GDB, and have GDB print all sent and received packets.
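For example, with the write from the question:

(gdb) set debug remote 1
(gdb) set *((unsigned int*) 0x40000000)=2

GDB will then print a "Sending packet:" / "Packet received:" line for every exchange, including the mystery read.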
Why is the gdb client trying to read from that address?
There are several possibilities:
GDB believes that the program counter is currently at 0xABD37787
GDB believes that it needs to set a breakpoint there
GDB believes that there is some data that it needs to read
One possible way to figure out why GDB is trying to read that location is to set debug infrun 1. This will print a lot of info about what GDB itself is trying to do.
Another way is to debug GDB itself. Put a breakpoint on putpkt, and when the packet of interest is being sent, examine the stacktrace to see why it is being sent.
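A rough sketch of the latter approach (paths and program arguments illustrative):

$ gdb --args /path/to/sparc-rtems-gdb your-usual-arguments
(gdb) break putpkt
(gdb) run

Each time the inner gdb is about to send a packet, the outer gdb stops in putpkt; bt there shows what triggered the send.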

thread-aware gdb for the Linux kernel

I am using gdb attached to a serial port of a virtual machine to debug the Linux kernel.
I am wondering if there are any patches/plugins that can make gdb understand some of the Linux kernel's data structures and make it "thread aware".
By that I mean that under gdb I can see how many kernel threads there are, their status, and each thread's stack information.
libvmi
https://github.com/libvmi/libvmi
This project does "LibVMI: Simplified Virtual Machine Introspection" which sounds really close.
This project in particular https://github.com/Wenzel/pyvmidbg uses libvmi and features a demo video of debugging a Windows userland application from inside it, without memory conflicts.
As of May 2019 there are two limitations, both of which could be overcome with some work: https://github.com/Wenzel/pyvmidbg/issues/24
Linux memory parsing is not yet complete
requires Xen
The developer of that project also answered further at: https://stackoverflow.com/a/56369454/895245
Implementing it with those libraries would, in my opinion, be the best way to achieve this goal today.
Linaro lkd-python
First, this Linaro page claims to have a working setup: https://wiki.linaro.org/LandingTeams/ST/GDB that allows you to do the usual thread operations such as thread, bt, etc., but it relies on a GDB fork; I will test it out later. A 2016 talk, https://youtu.be/pqn5hIrz3A8, says that the implementation was in C rather than as Python scripts; Python would have been better, as it avoids forking GDB. The sketch for lkd-python can be found at: https://git.linaro.org/people/lee.jones/kieran.bingham/binutils-gdb.git/log/?h=lkd-python
Linux kernel in-tree GDB scripts + my brain
I then tried to see what I could do with the kernel in-tree Python scripts at v4.17 + some manual intervention as a prototype, but didn't quite get there yet.
I have tested using this highly automated QEMU + Buildroot setup.
First follow the procedure I described at: How to debug the Linux kernel with GDB and QEMU? to get GDB working.
Then, as described at: How to debug Linux kernel modules with QEMU? run GDB with:
gdb -ex "add-auto-load-safe-path /full/path/to/linux/kernel"
This loads the in-tree GDB Python scripts from scripts/gdb.
One of those scripts provides:
lx-ps
which lists all threads with format:
0xffff88000ed08000 1 init
0xffff88000ed08ac0 2 kthreadd
The first field is the address of the task_struct struct, so we can see the entire struct with:
p *(struct task_struct *)0xffff88000ed08000
which should in theory allow us to get any information we want about the process.
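For instance, individual fields can be read with the same cast (comm and pid are real task_struct fields; the address is the one printed by lx-ps above):

p ((struct task_struct *)0xffff88000ed08000)->comm
p ((struct task_struct *)0xffff88000ed08000)->pid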
Now I wanted to find the PC. For ARM, I've seen: Find program counter of process in kernel and I tried:
task_pt_regs((struct task_struct *)0xffffffc00e8f8000)->uregs[ARM_pc]
but task_pt_regs is a #define, and GDB cannot see defines unless the code was compiled with -ggdb3 (How do I print a #defined constant in GDB?), which the kernel apparently isn't.
I don't think GDB understands kernel data structures; that would make it version-dependent. GDB uses ptrace to gather information on any running process.
That's all I know :(
pyvmidbg developer here.
I will add some clarifications:
yes the goal of the project is indeed to have a cross-platform, guest-aware GDB stub.
Most of the implementation is already done for Windows, where we are aware of processes and their threads' context.
It's possible to intercept a specific process (cmd.exe in the demo) and single-step its execution (this is limited to 1 process with 1 thread for now), as well as to attach to a new process's entry point.
Regarding Linux, I looked at the internals and the resources that I could find, but I'm lacking the whole picture to figure out how I can:
- intercept a task when it's being scheduled (core/sched.c:switch_to() ?)
- read the task state (Windows's KTRAP_FRAME equivalent for Linux ?)
I asked a question on SO, but nobody answered :/
Linux context switch internals: how does a process goes back to userland after the switch?
If you can help with this, I can guide you through the implementation :)
Regarding hypervisor support, only Xen is fully supported by the libvmi interface at the moment.
I added a section in the README to describe where we are in terms of VMI APIs with other hypervisors.
Thanks !

I need to find the point in my userland code that crashes my kernel

I have a big system that crashes hard. When I boot up, I don't even have a core dump. If I could log every line that gets executed until the system goes down, I would find that evil code.
Can I log every source code line in GDB to a file?
UPDATE:
ok, I found the bug. It was nasty. The application I started did not take the system down. After learning about core dump inspection with mdb, and some gdb stepping, I found out that the system call causing the dump was not implemented. Updating the system to the latest kernel will fix my problem. Thanks to all of you.
MY LESSON:
Make sure you know which process causes the core dump. It's not always the one you started.
Sounds like a tricky little problem.
I often try to eliminate as many possible suspects as I can by commenting out large chunks of code, configuring the system to not run certain pieces (if it allows you to do that) etc. This amounts to doing an ad-hoc binary search on the problem, and is a surprisingly effective way of zooming in on offending code relatively quickly.
A potential problem with logging is that the log might not hit the disk before the system locks up - if you don't get a core dump, you might not get the log.
Speaking of core dumps, make sure you don't have a limit on your core dump size (man ulimit).
You could try to obtain a list of all the functions in your code using objdump, process it a little bit and create a bunch of GDB trace statements on those functions - basically creating a GDB script automatically. If that turns out to be overkill, then a binary search on the code using tracepoints can also help you zoom in on the problem.
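As a hedged sketch of that idea (myapp and trace.gdb are illustrative names), one could generate a breakpoint-plus-commands block for every text symbol and run the target under the resulting script:

nm --defined-only ./myapp | awk '$2 == "T" { print "break " $3; print "commands"; print "silent"; print "bt 1"; print "continue"; print "end" }' > trace.gdb
gdb -x trace.gdb ./myapp

Every function entry then logs one stack frame and continues, which approximates a (slow) function-level trace.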
And don't panic. You're smarter than the bug - you'll find it.
You cannot reasonably trace every line of your source using GDB (far too slow). Besides, a system crash is most likely the result of a system call, and libc is probably making the system call on your behalf. Even if you find the line of the application that caused the OS crash, you still don't really know anything.
You should start by clarifying which OS is crashing. For Linux, you can try the following approaches:
strace -fo trace.out /path/to/app
After reboot, trace.out will contain syscalls the application was doing just before the crash. If you are lucky, you'll see the last syscall-of-death, but I wouldn't count on it.
Alternatively, try to reproduce the crash under user-mode Linux, or on a kernel with KGDB compiled in.
These will tell you where the problem in the kernel is. Finding the matching system call in your application will likely be trivial.
Please clarify your problem: What part of the system is crashing?
Is it an application?
If so, which application? Is this an application which you have written yourself? Is this an application you have obtained from elsewhere? Can you obtain a clean interrupt if you use a debugger? Can you obtain a backtrace showing which functions are calling the section of code which crashes?
Is it a new hardware driver?
Is it based on an older driver? If so, what has changed? Is it based on a manufacturer's data sheet? Is that data sheet the latest and most correct?
Is it somewhere in the kernel? Which kernel?
What is the OS? I assume it is Linux, seeing that you are using the GNU debugger. But of course, that is not necessarily so.
You say you have no core dump. Have you enabled core dumps on your machine? Most systems these days do not have them enabled by default.
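On Linux, for example, core dumps are typically enabled per shell like this (the second command shows where the kernel will write them):

ulimit -c unlimited
cat /proc/sys/kernel/core_pattern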
Regarding logging GDB output, you may have some success, but whether the right output gets logged before the system crashes depends on where the problem is. There is plenty of delay in writing to disk. You may not catch it in time.
I'm not familiar with the gdb way of doing this, but with windbg the way to go is to have a debugger attached to the kernel and to control that debugger remotely over a serial cable (or FireWire) from a second debugger. I'm pretty sure gdb has similar capabilities; I could quickly find some hints here: http://www.digipedia.pl/man/gdb.4.html
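With KGDB compiled into the kernel, the gdb side of such a setup looks roughly like this (device and baud rate illustrative; set remotebaud is the spelling used by gdb of that era):

(gdb) set remotebaud 115200
(gdb) target remote /dev/ttyS0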

How does a debugger work?

I keep wondering: how does a debugger work? Particularly the kind that can be 'attached' to an already running executable. I understand that the compiler translates code to machine language, but then how does the debugger 'know' what it is being attached to?
The details of how a debugger works will depend on what you are debugging, and what the OS is. For native debugging on Windows you can find some details on MSDN: Win32 Debugging API.
The user tells the debugger which process to attach to, either by name or by process ID. If it is a name then the debugger will look up the process ID, and initiate the debug session via a system call; under Windows this would be DebugActiveProcess.
Once attached, the debugger will enter an event loop much like for any UI, but instead of events coming from the windowing system, the OS will generate events based on what happens in the process being debugged – for example an exception occurring. See WaitForDebugEvent.
The debugger is able to read and write the target process' virtual memory, and even adjust its register values through APIs provided by the OS. See the list of debugging functions for Windows.
The debugger is able to use information from symbol files to translate from addresses to variable names and locations in the source code. The symbol file information is a separate set of APIs and isn't a core part of the OS as such. On Windows this is through the Debug Interface Access SDK.
If you are debugging a managed environment (.NET, Java, etc.) the process will typically look similar, but the details are different, as the virtual machine environment provides the debug API rather than the underlying OS.
As I understand it:
For software breakpoints on x86, the debugger replaces the first byte of the instruction with CC (int3). This is done with WriteProcessMemory on Windows. When the CPU gets to that instruction, and executes the int3, this causes the CPU to generate a debug exception. The OS receives this interrupt, realizes the process is being debugged, and notifies the debugger process that the breakpoint was hit.
After the breakpoint is hit and the process is stopped, the debugger looks in its list of breakpoints, and replaces the CC with the byte that was there originally. The debugger sets TF, the Trap Flag in EFLAGS (by modifying the CONTEXT), and continues the process. The Trap Flag causes the CPU to automatically generate a single-step exception (INT 1) on the next instruction.
When the process being debugged stops the next time, the debugger again replaces the first byte of the breakpoint instruction with CC, and the process continues.
I'm not sure if this is exactly how it's implemented by all debuggers, but I've written a Win32 program that manages to debug itself using this mechanism. Completely useless, but educational.
In Linux, debugging a process begins with the ptrace(2) system call. This article has a great tutorial on how to use ptrace to implement some simple debugging constructs.
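To make the int3 mechanism above concrete, here is a minimal sketch for x86-64 Linux: a ptrace-based parent plants a breakpoint in a forked child (fork keeps the address space layout, so &target is valid in both processes). No error handling, a single breakpoint, and educational only, much like the Win32 experiment described above:

#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

void target(void) { puts("child: target() reached"); }
void (*volatile fp)(void) = target;    /* call via pointer to defeat inlining */

int main(void)
{
    pid_t child = fork();
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);                /* wait for parent to plant breakpoint */
        fp();
        return 0;
    }

    int status;
    waitpid(child, &status, 0);        /* child stopped itself with SIGSTOP */

    /* Plant the breakpoint: save the original word, patch first byte to 0xCC. */
    long orig = ptrace(PTRACE_PEEKTEXT, child, (void *)target, NULL);
    ptrace(PTRACE_POKETEXT, child, (void *)target,
           (void *)((orig & ~0xffL) | 0xCC));
    ptrace(PTRACE_CONT, child, NULL, NULL);

    waitpid(child, &status, 0);        /* SIGTRAP: the int3 was executed */
    printf("parent: breakpoint at %p hit\n", (void *)target);

    /* Undo: back RIP up over the consumed int3, restore the original byte. */
    struct user_regs_struct regs;
    ptrace(PTRACE_GETREGS, child, NULL, &regs);
    regs.rip -= 1;
    ptrace(PTRACE_SETREGS, child, NULL, &regs);
    ptrace(PTRACE_POKETEXT, child, (void *)target, (void *)orig);

    ptrace(PTRACE_CONT, child, NULL, NULL);
    waitpid(child, &status, 0);        /* child runs target() and exits */
    return 0;
}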
If you're on a Windows OS, a great resource for this would be "Debugging Applications for Microsoft .NET and Microsoft Windows" by John Robbins:
http://www.amazon.com/dp/0735615365
(or even the older edition: "Debugging Applications")
The book has a chapter on how a debugger works that includes code for a couple of simple (but working) debuggers.
Since I'm not familiar with details of Unix/Linux debugging, this stuff may not apply at all to other OS's. But I'd guess that as an introduction to a very complex subject the concepts - if not the details and APIs - should 'port' to most any OS.
I think there are two main questions to answer here:
1. How does the debugger know that an exception occurred?
When an exception occurs in a process that’s being debugged, the debugger gets notified by the OS before any user exception handlers defined in the target process are given a chance to respond to the exception. If the debugger chooses not to handle this (first-chance) exception notification, the exception dispatching sequence proceeds further and the target thread is then given a chance to handle the exception if it wants to do so. If the SEH exception is not handled by the target process, the debugger is then sent another debug event, called a second-chance notification, to inform it that an unhandled exception occurred in the target process. Source
2. How does the debugger know how to stop on a breakpoint?
The simplified answer is: when you put a breakpoint into the program, the debugger replaces your code at that point with an int3 instruction, which is a software interrupt. As a result the program is suspended and the debugger is called.
Another valuable source for understanding debugging is the Intel CPU manual (Intel® 64 and IA-32 Architectures Software Developer's Manual). Volume 3A, chapter 16, introduces the hardware support for debugging, such as special exceptions and hardware debugging registers. The following is from that chapter:
T (trap) flag, TSS — Generates a debug exception (#DB) when an attempt is
made to switch to a task with the T flag set in its TSS.
I am not sure whether Windows or Linux uses this flag, but that chapter is very interesting to read.
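As a companion to that chapter, here is a hedged sketch (x86-64 Linux, no error handling) of arming one of those hardware debug registers from a tracer: DR0 holds the watched address and DR7 arms it as a 4-byte write watchpoint:

#include <signal.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

volatile int watched;                  /* the variable to watch */

int main(void)
{
    pid_t child = fork();
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP);                /* let the parent arm the registers */
        watched = 42;                  /* this write should trigger the trap */
        return 0;
    }

    int status;
    waitpid(child, &status, 0);

    /* DR0 = watched address. DR7: L0=1 (bit 0) enables it locally,
     * R/W0=01 (bits 16-17) breaks on writes, LEN0=11 (bits 18-19) = 4 bytes. */
    ptrace(PTRACE_POKEUSER, child,
           offsetof(struct user, u_debugreg[0]), (void *)&watched);
    ptrace(PTRACE_POKEUSER, child,
           offsetof(struct user, u_debugreg[7]),
           (void *)(1L | (1L << 16) | (3L << 18)));

    ptrace(PTRACE_CONT, child, NULL, NULL);
    waitpid(child, &status, 0);        /* SIGTRAP when the child writes */
    printf("parent: watchpoint on &watched fired\n");

    ptrace(PTRACE_CONT, child, NULL, NULL);
    waitpid(child, &status, 0);
    return 0;
}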
Hope this helps someone.
My understanding is that when you compile an application or DLL file, whatever it compiles to contains symbols representing the functions and the variables.
When you have a debug build, these symbols are far more detailed than in a release build, thus allowing the debugger to give you more information. When you attach the debugger to a process, it looks at which functions are currently being accessed and resolves all the available debugging symbols from there (since it knows what the internals of the compiled file look like, it can ascertain what might be in memory, with the contents of ints, floats, strings, etc.). As the first poster said, this information and how these symbols work greatly depend on the environment and the language.
