Debugging kernel using qemu and gdb - linux-kernel

I am trying to debug the kernel using QEMU and gdb. For this I used a bridge connection between the QEMU guest and the host machine. In the script I used tcp:17777:127.0.0.1:22 to connect to the QEMU machine for gdb.
But when I run ssh 17777 root@localhost (root is the user in the QEMU guest), I get no response.
Question 1: How will I know that I am on the right path, that is, that I can debug the kernel using QEMU?
When we do:
gdb vmlinux
target remote :1234
Question 2: When I run gdb vmlinux and target remote :1234 without booting the kernel I want to debug, I still get the following output (the same output I get when I boot that kernel with QEMU).
(gdb) target remote :1234
Remote debugging using :1234
default_idle () at arch/x86/kernel/process.c:299
299 current_thread_info()->status |= TS_POLLING;
Please help me understand the concept in detail, and share a link on debugging the kernel using QEMU and gdb.
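One way to approach Question 2 is to check whether something is already listening on the gdb port before you boot anything. If target remote :1234 connects while you believe no kernel is running, a plausible (but here unconfirmed) explanation is that an earlier QEMU instance is still alive with its gdb stub enabled. The Python sketch below is a minimal check under that assumption; the function name port_is_listening is hypothetical, and 1234 is simply the port used in the question.

```python
import socket


def port_is_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 1234 is the gdb stub port from the question; adjust if yours differs.
    if port_is_listening(1234):
        print("something is already listening on :1234")
    else:
        print("nothing is listening on :1234")
```

Be aware that opening a connection to a live gdb stub can affect the guest (QEMU may pause it when a debugger connects), so treat this as a one-off diagnostic rather than something to run in a loop.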

Related

GDB on Windows machine

Let us say I am on a Windows machine, and I go to its command-line terminal and type 'gdb' there. I get the gdb prompt (gdb), which means gdb.exe is installed on the machine.
My understanding is that GDB is a client-server application. Is this gdb.exe the gdbserver or the gdb client? If it is the former, where would the latter be, and if it is the latter, where would the former be in this case?
GDB can be a client server application, but it doesn't have to be.
What you started is gdb itself, that is, the client side. The server is a separate program called gdbserver.
Usually, you'd make use of gdbserver when you want to debug something running on a different machine over a network (though there's nothing to stop you running gdbserver on the same machine as gdb itself).
You can also use gdb to directly start an application to debug, so at the (gdb) prompt you might do:
(gdb) file /path/to/some/executable
(gdb) break main
(gdb) run
For further reading, the manual has lots of detail, including a simple example session and more information on remote debugging.
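To make the client/server split a little more concrete: when gdb talks to gdbserver (or to another stub, such as the one built into QEMU), it uses GDB's remote serial protocol, in which each packet is framed as $<data>#<checksum> and the checksum is the sum of the data bytes modulo 256, written as two hex digits. The Python sketch below only illustrates that framing. The helper names are hypothetical, and it deliberately ignores escaping, run-length encoding, and acknowledgements.

```python
def rsp_checksum(payload: bytes) -> int:
    """Sum of all payload bytes, modulo 256."""
    return sum(payload) % 256


def rsp_frame(payload: str) -> bytes:
    """Wrap a command such as 'g' or '?' as $<data>#<two hex digits>."""
    data = payload.encode("ascii")
    return b"$" + data + b"#" + format(rsp_checksum(data), "02x").encode("ascii")


def rsp_unframe(packet: bytes) -> str:
    """Check the framing and checksum of a packet and return its payload."""
    if len(packet) < 4 or not packet.startswith(b"$") or packet[-3:-2] != b"#":
        raise ValueError("not a framed packet")
    data = packet[1:-3]
    received = int(packet[-2:].decode("ascii"), 16)
    if rsp_checksum(data) != received:
        raise ValueError("checksum mismatch")
    return data.decode("ascii")


if __name__ == "__main__":
    # 'g' asks the stub for the general registers, '?' for the stop reason.
    for command in ("g", "?"):
        print(command, "->", rsp_frame(command))
```

In this picture gdb.exe is the side that builds and sends such packets, while gdbserver or the QEMU stub is the side that parses them and replies.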

Cannot access kernel space when debugging xv6 with QEMU and GDB

I am self-studying the 2019 version of MIT 6.828/6.S081: Operating System Engineering.
I was trying to attach GDB to xv6 running on RISC-V using QEMU, to learn about what is going on when context switching happens between user mode and kernel mode.
After doing make qemu-gdb and gdb in the same directory, my GDB connected to QEMU successfully. However:
(gdb) x/2i $pc
=> 0xd8c: ecall
0xd90: ret
The problem is that if I now stepi, it "jumps over" to 0xd90 instead of stepping into kernel space.
Additionally, accessing any kernel address is not allowed, as if I were debugging a normal userland program:
(gdb) i r stvec
stvec 0x3ffffff000 274877902848
(gdb) x/i $stvec
0x3ffffff000: Cannot access memory at address 0x3ffffff000
Environment:
Host VM: Manjaro 19.0.2
sudo pacman -Syy
sudo pacman -S riscv64-linux-gnu-binutils riscv64-linux-gnu-gcc riscv64-linux-gnu-gdb qemu-arch-extra
GDB: 9.1
QEMU: 4.2.0
GCC: 9.2.0
I would much appreciate it if anyone could share some insight into what is going on here. Thanks a lot!
I guess you are running your code on Ubuntu. I had the same problem there; after I switched to a Mac and followed the MIT tools tutorial, it finally worked:
Run make CPUS=1 qemu-gdb in one window.
Run riscv64-unknown-elf-gdb in another window.
Ignore the Python Exception.
I managed to get around this problem by building the RISC-V toolchain as explained here.
Building the toolchain as explained on that site generates a generic ELF/Newlib toolchain, identified by the prefix riscv64-unknown-elf-, in contrast to the more sophisticated Linux-ELF/glibc toolchain, identified by the prefix riscv64-unknown-linux-gnu-. With the Newlib build, the debugger can stepi into kernel space.
For crossdev users it is possible to build the toolchain with Newlib support by running:
crossdev --ex-gcc --ex-gdb --target riscv64-unknown-elf
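As a side note on the stvec value shown in the question: 0x3ffffff000 is the address of the highest page in the virtual address space as xv6 lays it out, which is where xv6 maps its trampoline page (the code that runs first when a trap such as ecall enters the kernel). The Python sketch below just reproduces that arithmetic. The constant names mirror xv6's PGSIZE, MAXVA and TRAMPOLINE definitions, but the script itself is illustrative and not part of xv6.

```python
# Constants named after the corresponding xv6 definitions.
PGSIZE = 4096                       # bytes per page
MAXVA = 1 << (9 + 9 + 9 + 12 - 1)   # one bit less than the full Sv39 range
TRAMPOLINE = MAXVA - PGSIZE         # highest page of the address space

if __name__ == "__main__":
    # Compare with the "i r stvec" output in the question.
    print(hex(TRAMPOLINE), TRAMPOLINE)
```

So the stvec value itself looks sane; the issue in the question is that gdb refuses to read or step into that address, which is what the toolchain change above addresses.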

Basic linux kernel development and testing environment using qemu

I want to learn about the Linux kernel, which is why I wanted a simple but sufficiently powerful way to test the kernel changes that I make.
I used the info on this page https://mgalgs.github.io/2015/05/16/how-to-build-a-custom-linux-kernel-for-qemu-2015-edition.html to start.
So now I can start a qemu session with the kernel I choose and also have busybox utilities.
The part I cannot figure out is how to transfer a kernel module (.ko) to this virtual machine so that I can load it into my modified kernel. I also tried transferring a C program by incorporating it into the initramfs, but when I try to run the program I receive the following error message:
"/bin/sh: ./proc1: not found" .
Should I use a virtual HDD image? If so, how do I create and use one? How do I transfer files from the host OS to the virtual HDD?
Thanks in advance.
The virtual HDD I created was not detected because I had not run mdev -s in the init file.
After adding that, I could mount sda in the QEMU session.
The C program that would not run was fixed by compiling it with the -static flag.
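A "not found" message for a file that clearly exists is the usual symptom of a dynamically linked program whose loader (the path stored in the ELF PT_INTERP program header) is missing from the root filesystem. That would be consistent with -static fixing it, although the post does not confirm the cause. The Python sketch below, run on the host, reports which loader a binary asks for. It is a minimal illustration that assumes a little-endian ELF64 file (for example x86-64), and the function name elf_interpreter is hypothetical.

```python
import struct
import sys
from typing import Optional

PT_INTERP = 3  # program header type that names the dynamic loader

# ELF64 file header and program header layouts (little-endian, no padding).
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
ELF64_PHDR = struct.Struct("<IIQQQQQQ")


def elf_interpreter(data: bytes) -> Optional[str]:
    """Return the loader path requested by an ELF image, or None if absent.

    Assumption: only little-endian ELF64 images are handled in this sketch.
    """
    if len(data) < ELF64_EHDR.size or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:  # EI_CLASS == ELFCLASS64, EI_DATA == LSB
        raise ValueError("only little-endian ELF64 is handled in this sketch")
    fields = ELF64_EHDR.unpack_from(data, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    for i in range(e_phnum):
        phdr = ELF64_PHDR.unpack_from(data, e_phoff + i * e_phentsize)
        p_type, p_offset, p_filesz = phdr[0], phdr[2], phdr[5]
        if p_type == PT_INTERP:
            return data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None  # no PT_INTERP: the program does not request a loader


if __name__ == "__main__":
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, "->", elf_interpreter(f.read()))
```

If the function returns a path, that file (and the shared libraries the program needs) would have to exist inside the initramfs; if it returns None, as for a binary built with -static, the program can run without them.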

Gdb not catching kernel panic

I have an ARM Linux kernel running as part of the android emulator where I'm doing a bit of testing. I start the emulator without the GUI stuff, and just use adb shell to access the emulator's internal memory.
I start up the emulator on an OS X machine as follows:
$ emulator -verbose -debug init -show-kernel -kernel ./zImage -avd debug -no-boot-anim -no-skin -no-audio -no-window -qemu -gdb tcp::1234
I attach gdb to the emulator as follows:
$ arm-eabi-gdb ./vmlinux
(gdb) target remote :1234
I know that the attaching works well, because if I attach the debugger earlier on, I can see that the boot process pauses till I press "c" in gdb. However, when a kernel panic arises in the emulator, I see a stack trace on the terminal that runs the emulator -- however, I don't see any change on the gdb side. The machine halts when the kernel panics, so I'd assume that gdb would show some indication of the same. Why doesn't this happen?
When I press Ctrl-C on the emulator side to stop QEMU, I get the message "emulator: Done with QEMU main loop. emulator: User-config was not changed." and gdb shows "Remote connection closed".
What am I missing here?
Placing a breakpoint on panic and inspecting the backtrace is a possible solution to the problem.
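The suggestion above can be sketched as a small Python helper that assembles a gdb invocation: connect to the stub, set a breakpoint on panic, and continue. The gdb binary name, the vmlinux path and the port are taken from the question; the helper name gdb_command is hypothetical, and -ex is gdb's option for running a command at startup.

```python
import shlex
from typing import List, Sequence


def gdb_command(gdb: str, vmlinux: str, port: int,
                breakpoints: Sequence[str] = ("panic",)) -> List[str]:
    """Build an argv list that attaches gdb and breaks on the given symbols."""
    argv = [gdb, vmlinux, "-ex", f"target remote :{port}"]
    for symbol in breakpoints:
        argv += ["-ex", f"break {symbol}"]
    # Resume the guest; gdb regains control when a breakpoint is hit.
    argv += ["-ex", "continue"]
    return argv


if __name__ == "__main__":
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(gdb_command("arm-eabi-gdb", "./vmlinux", 1234)))
```

Once gdb stops at the breakpoint, typing bt at the (gdb) prompt shows the backtrace that led to the panic.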

gdb loses line number information (on kernel modules) after breakpoint

I am connecting gdb to a virtual machine's kernel and trying to debug the kernel module. I am able to connect to the virtual machine. I have symbol information for kernel code, and can step through kernel code just fine.
When I add the symbol file for my kernel module (whether I do this before or after remote connection, incidentally), I am able to list <function_name> information about the function, until I set a breakpoint; after that:
(gdb) b function_name
Breakpoint 1 at 0xffffffffa01d0074 (3 locations)
(gdb) list function_name
No line number known for function_name.
Additional information:
Both host and guest are Fedora 16 64-bit.
The kernel I am debugging is 3.0.8 - note that this kernel worked fine on a prior 32-bit setup with a different environment and remote-connection setup.
I have tried this with gdb 7.2 and 7.3.50.
Any ideas on what is wrong? It would help even to know for certain whether the problem is my kernel, the kernel module compilation, the connection, or gdb.
Update: With gdb 7.1, I get the following:
...
(gdb) b function_name
/gdb/breakpoint.c:7903: internal-error: expand_line_sal_maybe: Assertion `found' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
What does that mean?
A partial answer:
With gdb 7.1, recompiling the kernel and the kernel module with -gdwarf-2, and the module additionally with -O0, seems to have done the trick. I am not yet sure which of the two changes made the difference, or why.
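Separately from the fix reported above, it can be worth ruling out wrong load addresses when adding a module's symbol file, since gdb needs to know where each module section was loaded in the guest. The Python sketch below builds an add-symbol-file command from a mapping of section names to addresses, such as the values a guest exposes under /sys/module/<name>/sections/. The helper name, the module file name and the addresses in the example are all hypothetical and are not taken from the question.

```python
from typing import Dict


def add_symbol_file_command(ko_path: str, sections: Dict[str, int]) -> str:
    """Build a gdb add-symbol-file command from section load addresses.

    The .text address is passed positionally; every other section is
    passed with -s <section> <address>.
    """
    if ".text" not in sections:
        raise ValueError("the .text load address is required")
    parts = ["add-symbol-file", ko_path, hex(sections[".text"])]
    for name in sorted(sections):
        if name != ".text":
            parts += ["-s", name, hex(sections[name])]
    return " ".join(parts)


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    example = {
        ".text": 0xFFFFFFFFA01D0000,
        ".data": 0xFFFFFFFFA01D2000,
        ".bss": 0xFFFFFFFFA01D2400,
    }
    print(add_symbol_file_command("mymod.ko", example))
```

The resulting line can be pasted at the (gdb) prompt after connecting to the guest.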

Resources