Setting up two-machine kernel debugging via firewire - macos

The instructions found at Setting Up Kernel Debugging are what I used to get to this point. On the machine running the kext I want to debug, I do see the message "Connected to remote debugger". On the machine I am running gdb on, I do see:
(gdb) kdp-reattach localhost
Connected.
The problem is that 'showallkmods' returns an empty list and none of the other similar commands appear to be working:
(gdb) showallkmods
kmod address size id refs version name
(gdb) showalltasks
task vm_map ipc_space #acts pid process io_policy wq_state command
Invalid type combination in equality test.
(gdb) showregistry
Please load kgmacros after KDP attaching to the target.
(gdb) source /Volumes/KernelDebugKit/kgmacros
Loading Kernel GDB Macros package. Type "help kgm" for more info.
(gdb) showallkmods
kmod address size id refs version name
(gdb) showregistry
Please load kgmacros after KDP attaching to the target.
(gdb) showbootargs
Invalid cast.
I am running 10.6.8 and am using kernel_debug_kit_10.6.8_10k540.dmg
I am not sure what other details one might need to diagnose what has gone wrong, but if you want to ask questions in the comments, I can certainly attempt to provide additional details.

The error "Invalid type combination in equality test." indicates to me that gdb might be expecting a different CPU architecture than the kernel you're connecting to is running. The 10.6 kernel exists in both 32-bit and 64-bit variants, and by default it's determined by the hardware which one gets loaded. gdb normally defaults to x86_64 if your CPU supports it (true of all Intel Macs except the very early Core Duo based ones) so if you're connecting to a 32-bit kernel (the default on most Macs released before 2011) you need to pass the -arch i386 argument when starting gdb. You can check the current kernel CPU architecture by running the uname -a command.
Update: on OSX Mountain Lion, the kernel always runs in 64-bit (x86_64) mode. On OSX Lion, the kernel defaults to 64-bit mode on Macs which are capable of running Mountain Lion and in 32-bit mode otherwise.

Related

Cannot access kernel space when debugging xv6 with QEMU and GDB

I am self-studying the 2019 version of MIT 6.828/6.S081: Operating System Engineering.
I was trying to attach GDB to xv6 running on RISC-V using QEMU, to learn about what is going on when context switching happens between user mode and kernel mode.
After doing make qemu-gdb and gdb in the same directory, my GDB connected to QEMU successfully. However:
(gdb) x/2i $pc
=> 0xd8c: ecall
0xd90: ret
The problem is: Now if I stepi, it "jumps over" to 0xd90 instead of stepping into the kernel space.
Additionally, accessing any kernel addresses is not allowed, as if I was debugging a normal userland program:
(gdb) i r stvec
stvec 0x3ffffff000 274877902848
(gdb) x/i $stvec
0x3ffffff000: Cannot access memory at address 0x3ffffff000
Environment:
Host VM: Manjaro 19.0.2
sudo pacman -Syy
sudo pacman -S riscv64-linux-gnu-binutils riscv64-linux-gnu-gcc riscv64-linux-gnu-gdb qemu-arch-extra
GDB: 9.1
QEMU: 4.2.0
GCC: 9.2.0
Much appreciate anyone could share some insight about what is going on here. Thanks a lot!
I guess you run your code on ubuntu, that is the problem I experienced, then I change to mac, and flow mit tools tutorials, finally, it works.
run make CPUS=1 qemu-gdb in one window.
run riscv64-unknown-elf-gdb in another window.
ignore the Python Exception
I managed to get around this problem by building the riscv toolchain as explained here.
Building the toolchain as explained in the site, generates a generic ELF/Newlib toolchain identified with the prefix riscv64-unknown-elf- in contrast to the more sophisticated Linux-ELF/glibc toolchain identified by the prefix riscv64-unknown-linux-gnu-. The Newlib build allows the debugger to stepi into kernel space.
For crossdev users it is possible to build the toolchain with Newlib support by running:
crossdev --ex-gcc --ex-gdb --target riscv64-unknown-elf

GDB 8.2 macOS High Sierra - program stops immediately after 'run'

I finally managed to run GDB 8.2 on macOS. But now when I'm trying to debug something, I got the following:
(gdb) b main
Breakpoint 1 at 0x100001e94: file project/src/main.cpp, line 34.
(gdb) run
Starting program: project/cmake-build-debug/program
[New Thread 0x1203 of process 5140]
[New Thread 0xf03 of process 5140]
[5]+ Stopped sudo gdb beast
I also tried using it inside CLion. In that case, GDB freezes indefinitely with this:
For help, type "help".
Type "apropos word" to search for commands related to "word".
Function "__cxx_global_var_init" not defined.
Function "__libc_csu_init" not defined.
[New Thread 0x1003 of process 4186]
[New Thread 0xf03 of process 4186]
Warning:
Cannot insert breakpoint -1.
Cannot access memory at address 0xf7ce
Does anybody know what's going on?
[5]+ Stopped sudo gdb beast
This is a gdb bug. The typical cause of this is gdb trying to write something to stdout after it has already given the terminal to the inferior. I'm not sure whether this instance has been fixed, probably because this is most likely only showing up due to:
Warning:
Cannot insert breakpoint -1.
Cannot access memory at address 0xf7ce
... this. This is another bug in gdb, namely that it wasn't updated to account for dyld changes in High Sierra. This bug is fixed and will be in gdb 8.3 (or in any rate in the release after 8.2.1, regardless of what number it is finally given).
Building a gdb from git master will work fine.

Problems getting (usable) core dump under cygwin

I'm trying to develop code under 64-bit Cygwin, and I'm having trouble getting a core dump file that I can use under GDB. The code is compiled using GCC 7.3.0, and I've just updated my Cygwin bits. ulimit -c is unlimited.
I've got my $CYGWIN variable set to point to dumper, and that appears to be being launched on crashes. I get a pop-up, and the message
*** starting debugger for pid 5288, tid 9464
*** continuing pid 5288 from debugger call (1)
Aborted (core dumped)
and a core file (basic.exe.core) is created in the current dir.
When I try to run (the stock Cygwin) GDB on this
gdb tests/basic.exe --core=basic.exe.core
I get the normal version intro, Reading symbols..., and then a warning
warning: core file may not match specified executable file.
and GDB crashes (and dumps its own core file). The crashing program was launched from the Cygwin bash command line (as ./tests/basic.exe).
It's been a long time since I've tried to develop under Windows or Cygwin, so it's quite possible that I'm doing something stupid. Or, alternatively, it may be that GCC 7.3.0 is doing something wrong or that I configured it poorly when I built it.
Any help will be appreciated.

analyzing crash dumps with debian kernel packages

I’m trying to analyze a Linux kernel crash dump. The kernel was built out of 4.4.77 tree with some custom packages on top of that. The command to build the kernel was make-kpkg kernel_image debug_image, producing in 2 different debian packages. The idea is that the first package runs in production, the second package can be utilized for debugging if a problem is detected. So the “kernel_image” package was installed, configured for collecting crashes per instructions, ran, crashed and wrote a crash dump file.
I am using crash utility to analyze the dump. Running
crash vmlinux file.dump
for vmlinux file uncompressed per instructions outputs
crash: vmlinux: no .gnu_debuglink section
crash: vmlinux: no debugging data available
Installing the “debug_image” package does not change that.
I noticed that the “debug-image” package contains its own vmlinux file; it is placed in /usr/lib/debug/lib/modules/4.4.77+/ upon installation. Running
crash /usr/lib/debug/lib/modules/4.4.77+/vmlinux file.dump
Outputs
WARNING: kernels compiled by different gcc versions:
/usr/lib/debug/lib/modules/4.4.77+/vmlinux: (unknown)
dump.201802261029 kernel: 4.8.4
WARNING: kernel version inconsistency between vmlinux and dumpfile
crash: incompatible arguments:
/usr/lib/debug/lib/modules/4.4.77+/vmlinux is not SMP -- dump.201802261029 is SMP
What am I missing? Is it possible to analyze the the dump from “kernel_image” utilizing the info available in “debug_image” package?
Update: Apparently the system had a makedumpfile binary being too old for 4.4 kernels. At some point in the past the system's kernel was updated from 3.something to 4.4 but all user-mode binaries stayed as they were. That somehow caused an invalid crash dump. I could not just apt-get install newer makedumpfile (part of kdump-tools package) due to binary incompatibilities. The problem has been resolved after I re-built makedumpfile binary v 5.9 in the same dev environment as the one used to build user-mode applications for the system. Summarizing, pointing crash to vmlinux file from "debug-image" package worked for me at the end.

gdb loses line number information (on kernel modules) after breakpoint

I am connecting gdb to a virtual machine's kernel and trying to debug the kernel module. I am able to connect to the virtual machine. I have symbol information for kernel code, and can step through kernel code just fine.
When I add the symbol file for my kernel module (whether I do this before or after remote connection, incidentally), I am able to list <function_name> information about the function, until I set a breakpoint; after that:
(gdb) b function_name
Breakpoint 1 at 0xffffffffa01d0074 (3 locations)
(gdb) list function_name
No line number known for function_name.
Additional information:
Both host and guest are Fedora 16 64-bit.
The kernel I am debugging is 3.0.8 - note that this kernel worked fine on a prior 32-bit setup with a different environment and remote-connection setup.
I have tried this with gdb 7.2 and 7.3.50.
Any ideas on whats wrong? It would help if I even knew for certain whether the problem was my kernel, kernel module compilation, the connection, or gdb.
Update: With gdb 7.1, I get the following:
...
(gdb) b function_name
/gdb/breakpoint.c:7903: internal-error: expand_line_sal_maybe: Assertion `found' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
What does that mean?
A partial answer:
With gdb 7.1, recompiling the kernel and kernel module with -gdwarf-2, and the module with -O0 seems to have done the trick. I'm not sure which it is or why yet.

Resources