GDB stepping through instructions on a particular core in baremetal development on QEMU - debugging

I am learning baremetal development on ARM, for which I chose to simulate a Raspi3 on QEMU. Hence, it's a virtual ARM Cortex-A53 implementing the ARMv8 architecture. I have compiled the following simple baremetal code :
.global _start
_start:
1: wfe
b 1b
I launch it using :
qemu-system-aarch64 -M raspi3 -kernel kernel8.img -display none -S -s
and the GDB is connected to it from the other terminal using :
gdb-multiarch ./kernel8.elf -ex 'target remote localhost:1234' -ex 'break *0x80000' -ex 'continue'
So far everything is good and I can notice the breakpoint in gdb.
Reading symbols from ./kernel8.elf...
Remote debugging using localhost:1234
0x0000000000000000 in ?? ()
Breakpoint 1 at 0x80000: file start.S, line 5.
Continuing.
Thread 1 hit Breakpoint 1, _start () at start.S:5
5 1: wfe
(gdb) info threads
Id Target Id Frame
* 1 Thread 1.1 (CPU#0 [running]) _start () at start.S:5
2 Thread 1.2 (CPU#1 [running]) 0x0000000000000300 in ?? ()
3 Thread 1.3 (CPU#2 [running]) 0x000000000000030c in ?? ()
4 Thread 1.4 (CPU#3 [running]) 0x000000000000030c in ?? ()
(gdb) list
1 .section ".text.boot"
2
3 .global _start
4 _start:
5 1: wfe
6 b 1b
(gdb)
As per my understanding, on ARM all the cores execute the same code on reset, so ideally all the cores in my case should be running the same code. I just want to verify that by setting breakpoints, and that is the problem: the breakpoints for the other cores are not hit. If I am not wrong, the threads in my case are nothing but the cores. I tried setting a thread-specific breakpoint, but it does not work :
(gdb) break *0x80000 thread 2
Note: breakpoint 1 (all threads) also set at pc 0x80000.
Breakpoint 2 at 0x80000: file start.S, line 5.
(gdb) thread 2
[Switching to thread 2 (Thread 1.2)]
#0 0x0000000000000300 in ?? ()
(gdb) info threads
Id Target Id Frame
1 Thread 1.1 (CPU#0 [running]) _start () at start.S:5
* 2 Thread 1.2 (CPU#1 [running]) 0x0000000000000300 in ?? ()
3 Thread 1.3 (CPU#2 [running]) 0x000000000000030c in ?? ()
4 Thread 1.4 (CPU#3 [running]) 0x000000000000030c in ?? ()
(gdb) s
Cannot find bounds of current function
(gdb) c
Continuing.
[Switching to Thread 1.1]
Thread 1 hit Breakpoint 1, _start () at start.S:5
5 1: wfe
(gdb)
I deleted breakpoint 1 (the one for all threads), and then continuing on thread 2 hangs forever :
(gdb) info b
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000000000080000 start.S:5
breakpoint already hit 2 times
2 breakpoint keep y 0x0000000000080000 start.S:5 thread 2
stop only in thread 2
(gdb) delete br 1
(gdb) info break
Num Type Disp Enb Address What
2 breakpoint keep y 0x0000000000080000 start.S:5 thread 2
stop only in thread 2
(gdb) thread 2
[Switching to thread 2 (Thread 1.2)]
#0 0x000000000000030c in ?? ()
(gdb) c
Continuing.
What can I do to get a breakpoint on core 2? What am I doing wrong here?
EDIT
I tried set scheduler-locking on (assuming this is what I need), but this does not seem to work for me either.
(gdb) break *0x80000
Breakpoint 3 at 0x80000: file start.S, line 5.
(gdb) thread 2
[Switching to thread 2 (Thread 1.2)]
#0 0x000000000000030c in ?? ()
(gdb) set scheduler-locking on
(gdb) c
Continuing.
^C/build/gdb-OxeNvS/gdb-9.2/gdb/inline-frame.c:367: internal-error: void skip_inline_frames(thread_info*, bpstat): Assertion `find_inline_frame_state (thread) == NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) n
This is a bug, please report it. For instructions, see:
<http://www.gnu.org/software/gdb/bugs/>.
/build/gdb-OxeNvS/gdb-9.2/gdb/inline-frame.c:367: internal-error: void skip_inline_frames(thread_info*, bpstat): Assertion `find_inline_frame_state (thread) == NULL' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Create a core file of GDB? (y or n)
EDIT 2
Upon @Frank's advice, I built (latest) qemu 6.2.0 locally and used the gdb available in the arm toolchain.
naveen@workstation:~/.repos/src/arm64/baremetal/raspi3-tutorial/01_bareminimum$ /opt/qemu-6.2.0/build/qemu-system-aarch64 -version
QEMU emulator version 6.2.0
Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers
naveen@workstation:~/.repos/src/arm64/baremetal/raspi3-tutorial/01_bareminimum$ /opt/gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu-gdb -version
GNU gdb (GNU Toolchain for the A-profile Architecture 10.3-2021.07 (arm-10.29)) 10.2.90.20210621-git
But I am still having the problem. My other cores 2, 3 and 4 never hit the breakpoints. It seems they are not even running my code, as the addresses they are at do not look right.
(gdb) info threads
Id Target Id Frame
* 1 Thread 1.1 (CPU#0 [running]) _start () at start.S:5
2 Thread 1.2 (CPU#1 [running]) 0x000000000000030c in ?? ()
3 Thread 1.3 (CPU#2 [running]) 0x000000000000030c in ?? ()
4 Thread 1.4 (CPU#3 [running]) 0x000000000000030c in ?? ()
EDIT 3
The problem seems to be with my Makefile: when I used the build command suggested by Frank, it worked for me. Can someone please look at what's wrong with this Makefile :
CC = /opt/gcc-arm-10.3-2021.07-x86_64-aarch64-none-elf/bin/aarch64-none-elf
CFLAGS = -Wall -O2 -ffreestanding -nostdinc -nostartfiles -nostdlib -g
all: clean kernel8.img
start.o: start.S
	${CC}-gcc $(CFLAGS) -c start.S -o start.o
kernel8.img: start.o
	${CC}-ld -g -nostdlib start.o -T link.ld -o kernel8.elf
	${CC}-objcopy -O binary kernel8.elf kernel8.img
clean:
	rm kernel8.elf kernel8.img *.o >/dev/null 2>/dev/null || true
EDIT 4
It turns out that when I use kernel8.elf with QEMU for booting, everything works as expected. But when I use kernel8.img, which is a raw binary, I get the issue. With a bit of reading, I understand that the ELF contains the "extra" information required to make the example work. But for clarification, how can I make kernel8.img work?

You probably have an issue with the versions of gdb or qemu you are using, since I was not able to reproduce your problem with a version 10.1 of aarch64-elf-gdb and a version 6.2.0 of qemu-system-aarch64 compiled from scratch on an Ubuntu 20.04.3 LTS system:
wfe.s:
.global _start
_start:
1: wfe
b 1b
Building wfe.elf:
/opt/arm/10/gcc-arm-10.3-2021.07-x86_64-aarch64-none-elf/bin/aarch64-none-elf-gcc -g -ffreestanding -nostdlib -nostartfiles -Wl,-Ttext=0x80000 -o wfe.elf wfe.s
Looking at generated code:
/opt/arm/10/gcc-arm-10.3-2021.07-x86_64-aarch64-none-elf/bin/aarch64-none-elf-objdump -d wfe.elf
wfe.elf: file format elf64-littleaarch64
Disassembly of section .text:
0000000000080000 <_stack>:
80000: d503205f wfe
80004: 17ffffff b 80000 <_stack>
Starting qemu in a shell session:
/opt/qemu-6.2.0/bin/qemu-system-aarch64 -M raspi3b -kernel wfe.elf -display none -S -s
Starting gdb in another:
/opt/gdb/gdb-10.1-aarch64-elf-x86_64-linux-gnu/bin/aarch64-elf-gdb wfe.elf -ex 'target remote localhost:1234' -ex 'break *0x80000' -ex 'continue'
gdb session:
GNU gdb (GDB) 10.1
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "--host=x86_64-linux-gnu --target=aarch64-elf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Remote debugging using localhost:1234
warning: No executable has been specified and target does not support
determining executable automatically. Try using the "file" command.
0x0000000000080000 in ?? ()
Breakpoint 1 at 0x80000
Continuing.
[Switching to Thread 1.4]
Thread 4 hit Breakpoint 1, 0x0000000000080000 in ?? ()
(gdb) break *0x80000 thread 2
Note: breakpoint 1 (all threads) also set at pc 0x80000.
Breakpoint 2 at 0x80000
(gdb) info threads
Id Target Id Frame
1 Thread 1.1 (CPU#0 [running]) 0x0000000000080000 in ?? ()
2 Thread 1.2 (CPU#1 [running]) 0x0000000000080000 in ?? ()
3 Thread 1.3 (CPU#2 [running]) 0x0000000000080000 in ?? ()
* 4 Thread 1.4 (CPU#3 [running]) 0x0000000000080000 in ?? ()
(gdb) c
Continuing.
[Switching to Thread 1.2]
Thread 2 hit Breakpoint 1, 0x0000000000080000 in ?? ()
(gdb) info b
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000000000080000
breakpoint already hit 2 times
2 breakpoint keep y 0x0000000000080000 thread 2
stop only in thread 2
breakpoint already hit 1 time
(gdb) del 1
(gdb) info b
Num Type Disp Enb Address What
2 breakpoint keep y 0x0000000000080000 thread 2
stop only in thread 2
breakpoint already hit 1 time
(gdb) c
Continuing.
Thread 2 hit Breakpoint 2, 0x0000000000080000 in ?? ()
(gdb) c
Continuing.
Thread 2 hit Breakpoint 2, 0x0000000000080000 in ?? ()
(gdb) c
Continuing.
Thread 2 hit Breakpoint 2, 0x0000000000080000 in ?? ()
(gdb) c
Continuing.
Thread 2 hit Breakpoint 2, 0x0000000000080000 in ?? ()
(gdb) c
Continuing.
Thread 2 hit Breakpoint 2, 0x0000000000080000 in ?? ()
(gdb)
The answers to your two questions are therefore:
What can I do to get a breakpoint on core 2?
Exactly what you are doing.
What am I doing wrong here?
Nothing, but you may be using old/buggy versions of gdb and/or qemu. My guess would be that gdb is the culprit in your case, but I may be wrong.
You can easily verify by testing again using the version of gdb provided in the gcc toolchain available from Arm, AArch64 ELF bare-metal target (aarch64-none-elf) - I tried, and it worked fine as well:
/opt/arm/10/gcc-arm-10.3-2021.07-x86_64-aarch64-none-elf/bin/aarch64-none-elf-gdb wfe.elf -ex 'target remote localhost:1234' -ex 'break *0x80000' -ex 'continue'
GNU gdb (GNU Toolchain for the A-profile Architecture 10.3-2021.07 (arm-10.29)) 10.2.90.20210621-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "--host=x86_64-pc-linux-gnu --target=aarch64-none-elf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.linaro.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from wfe.elf...
Remote debugging using localhost:1234
_start () at wfe.s:3
3 1: wfe
Breakpoint 1 at 0x80000: file wfe.s, line 3.
Continuing.
Thread 1 hit Breakpoint 1, _start () at wfe.s:3
3 1: wfe
(gdb) break *0x80000 thread 2
Note: breakpoint 1 (all threads) also set at pc 0x80000.
Breakpoint 2 at 0x80000: file wfe.s, line 3.
(gdb) info threads
Id Target Id Frame
* 1 Thread 1.1 (CPU#0 [running]) _start () at wfe.s:3
2 Thread 1.2 (CPU#1 [running]) _start () at wfe.s:3
3 Thread 1.3 (CPU#2 [running]) _start () at wfe.s:3
4 Thread 1.4 (CPU#3 [running]) _start () at wfe.s:3
(gdb) c
Continuing.
[Switching to Thread 1.2]
Thread 2 hit Breakpoint 1, _start () at wfe.s:3
3 1: wfe
(gdb) info b
Num Type Disp Enb Address What
1 breakpoint keep y 0x0000000000080000 wfe.s:3
breakpoint already hit 2 times
2 breakpoint keep y 0x0000000000080000 wfe.s:3 thread 2
stop only in thread 2
breakpoint already hit 1 time
(gdb) del 1
(gdb) info b
Num Type Disp Enb Address What
2 breakpoint keep y 0x0000000000080000 wfe.s:3 thread 2
stop only in thread 2
breakpoint already hit 1 time
(gdb) c
Continuing.
Thread 2 hit Breakpoint 2, _start () at wfe.s:3
3 1: wfe
(gdb) c
Continuing.
Thread 2 hit Breakpoint 2, _start () at wfe.s:3
3 1: wfe
(gdb) c
Continuing.
Thread 2 hit Breakpoint 2, _start () at wfe.s:3
3 1: wfe
(gdb) c
Continuing.
Thread 2 hit Breakpoint 2, _start () at wfe.s:3
3 1: wfe
(gdb) c
Continuing.
Thread 2 hit Breakpoint 2, _start () at wfe.s:3
3 1: wfe
(gdb)
Please note that explaining how to build the latest versions of gdb and qemu is out of the scope of the current answer.
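On the EDIT 4 question (the ELF works, kernel8.img does not): the 0x300 and 0x30c addresses in your info threads output suggest that, when given a raw image, QEMU's raspi3 machine mimics the Pi firmware. Only core 0 jumps to 0x80000, while cores 1 to 3 are parked in a small stub that polls a per-core slot in a spin table until the slot holds a non-zero address. With an ELF there is no such stub and every core starts at the entry point. The sketch below only illustrates that mechanism. It assumes the spin table starts at 0xd8 with 8 bytes per core (my reading of the QEMU and Pi boot stubs, not something shown in the question), and the helper names are made up. It prints gdb commands that would release the parked cores to 0x80000.

```python
# Assumed layout of the boot stub's spin table (not taken from the question).
SPIN_TABLE_BASE = 0xD8   # slot for core n is assumed to be at base + 8 * n
KERNEL_ENTRY = 0x80000   # load address used in the question


def spin_table_slot(core):
    """Return the assumed address of the spin-table slot for a core (0..3)."""
    if not 0 <= core <= 3:
        raise ValueError("the raspi3 machine has cores 0..3")
    return SPIN_TABLE_BASE + 8 * core


def release_commands(cores=(1, 2, 3), entry=KERNEL_ENTRY):
    """Build gdb commands that store the entry address into each core's slot."""
    return [
        f"set {{unsigned long}}{spin_table_slot(core):#x} = {entry:#x}"
        for core in cores
    ]


if __name__ == "__main__":
    for command in release_commands():
        print(command)
```

Pasting the printed lines into the gdb session before continue should let threads 2 to 4 reach the breakpoint, if the assumption about the spin table holds. In a real kernel8.img, core 0 would perform the same stores itself and then execute sev to wake the waiting cores.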

Related

Debugging NASM local labels with gdb

I have been having some issues debugging code assembled by nasm with gdb: it seems like gdb doesn't do well with nasm local labels. nasm generates a local symbol named «function».label, which seems to confuse gdb, as it loses track of which function it is in.
Here is one scenario in which it gives a sub-optimal debugging experience:
section .text
global _start
_start:
call foo
ud2
foo:
push rbp
mov rbp, rsp
call bar
.end:
pop rbp
ret
bar:
ret
Compile and debug:
$ nasm -f elf64 -g -F DWARF example.asm -o example.o
$ ld example.o -o example
$ gdb ./example
Reading symbols from ./example...done.
(gdb) b foo
Breakpoint 1 at 0x400087: file example.asm, line 10.
(gdb) run
Starting program: /home/mvanotti/orga2/gdb/example
Breakpoint 1, foo () at example.asm:10
10 push rbp
(gdb) ni
11 mov rbp, rsp
(gdb) ni
12 call bar
(gdb) ni
Program received signal SIGILL, Illegal instruction.
_start () at example.asm:7
7 ud2
As you can see, nexti continues execution even after the return from the bar function call. I believe this is caused because the next instruction in foo belongs to the foo.end symbol, causing gdb to not recognize that as the return point of the function. Adding any other instruction before the .end label in the asm file fixes the issue.
Similarly, the backtrace gets all messed up when it steps into a local label:
(gdb)
foo.end () at example.asm:14
14 pop rbp
(gdb) bt
#0 foo.end () at example.asm:14
#1 0x0000000000000000 in ?? ()
(gdb)
This also affects yasm and lldb.
There is not a clear workaround for this. I couldn't find an option in nasm to not emit the function.label symbols, or an easy way to remove them. strip for example, lets you specify the --wildcard option, but the regexp syntax is too basic and cannot match something like .+\.*. The closest I got was strip --wildcard -N "*.*", but that also matches .something
In gas, this is solved by creating a label in the form of .Llocal_label$ which gets discarded automatically by ld.
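Since the glob accepted by strip --wildcard cannot tell foo.end apart from .something, one possible workaround is to select the names outside of strip. The following is a minimal sketch, assuming the symbol list comes from nm output and that the resulting names are then written to a file and passed to objcopy --strip-symbols=FILE. The function names are hypothetical, and I have not verified that removing these symbols is enough to restore gdb's stepping and backtraces, since the DWARF line information is separate from the symbol table.

```python
def parse_nm_names(nm_output):
    """Extract symbol names from nm output lines such as '0000000000400087 t foo'."""
    names = []
    for line in nm_output.splitlines():
        fields = line.split()
        if fields:
            names.append(fields[-1])
    return names


def nasm_local_labels(names):
    """Keep names of the form function.label; skip names that start with a dot."""
    return [name for name in names if not name.startswith(".") and "." in name]


if __name__ == "__main__":
    # Sample nm-style output for the example program; the addresses are illustrative only.
    sample = (
        "0000000000400080 T _start\n"
        "0000000000400087 t foo\n"
        "000000000040008f t foo.end\n"
        "0000000000400091 t bar\n"
    )
    for name in nasm_local_labels(parse_nm_names(sample)):
        print(name)
```

Filtering by name this way leaves ordinary dot-prefixed symbols untouched, which is the case the "*.*" glob gets wrong.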

Is there a way to 'Step Through' Fortran code like you can with VBA in excel?

I have some Fortran source code that I can understand the general idea behind, but I have never used Fortran before so I would like to see exactly what is happening as each line is being executed (like you can with VBA in Excel by stepping through the code line by line and observing what values variables and arrays have at any point in the code).
Is there a way to step through the source code with a user-interface of some kind so that I can see exactly what variables have been defined, what values they are taking etc..?
For some context: I work in Science and Engineering, but coding is not normally a significant part of my job (as you can probably tell from the content of my question); I normally only deal with simple scripts in VBA to manipulate data. I have the compiled version of the Fortran code and it works fine, but I know I will need to modify the source and recompile it for my purposes. Unfortunately the person who wrote the original code is not available for advice/input. Another note: I'm not sure how to tell which version of Fortran was used...
Thanks!
Any respectable debugger will allow you to single step through a code. Below is a simple example showing one, gdb, in use. The important points:
Make sure you compile and link with the -g flag
run runs the code
break sets a break point at a given line in the code, i.e. when the code is running it will stop at that line
step steps one line
step n steps n lines
finish runs until the current function (stack frame) returns; in the session below that is the end of the main program
Note this is very much an oversimplified introduction to show that what you want to do is possible. In real life you'll have to learn a bit more about your debugger; debuggers are very powerful and useful pieces of software with many abilities not even hinted at here. For gdb there are a number of tutorials you can find by searching, and the manual is at https://sourceware.org/gdb/onlinedocs/gdb/index.html
ian@eris:~/work/stack$ gfortran -fcheck=all -Wall -Wextra -std=f2008 -g step.f90 -o step
ian@eris:~/work/stack$ gdb step
GNU gdb (Ubuntu 8.1-0ubuntu3.2) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from step...done.
(gdb) list
1 Program single_step
2
3 Implicit None
4
5 Integer :: i
6
7 Do i = 1, 10
8 Write( *, * ) 'I is now ', i
9 End Do
10
(gdb) run
Starting program: /home/ian/work/stack/step
I is now 1
I is now 2
I is now 3
I is now 4
I is now 5
I is now 6
I is now 7
I is now 8
I is now 9
I is now 10
[Inferior 1 (process 23965) exited normally]
(gdb) break 1
Breakpoint 1 at 0x5555555548c6: file step.f90, line 1.
(gdb) run
Starting program: /home/ian/work/stack/step
Breakpoint 1, single_step () at step.f90:1
1 Program single_step
(gdb) step
7 Do i = 1, 10
(gdb) step
8 Write( *, * ) 'I is now ', i
(gdb) step
I is now 1
7 Do i = 1, 10
(gdb) step
8 Write( *, * ) 'I is now ', i
(gdb) step
I is now 2
7 Do i = 1, 10
(gdb) step
8 Write( *, * ) 'I is now ', i
(gdb) step 5
I is now 3
I is now 4
I is now 5
7 Do i = 1, 10
(gdb) step
8 Write( *, * ) 'I is now ', i
(gdb) step 2
I is now 6
8 Write( *, * ) 'I is now ', i
(gdb) finish
Run till exit from #0 single_step () at step.f90:8
I is now 7
I is now 8
I is now 9
I is now 10
0x0000555555554a30 in main (argc=1, argv=0x7fffffffe2fa) at step.f90:11
11 End Program single_step
(gdb) quit
A debugging session is active.
Inferior 1 [process 23969] will be killed.
Quit anyway? (y or n) y
ian@eris:~/work/stack$

Why does gdb not show debug symbols of kernel with debug info?

I am trying to learn more about kernel and driver development, so I decided to use KVM and gdb to establish a debug session with a custom-installed kernel (v5.1.0).
The kernel has debug info included, and here is a chunk of .config I used:
$ rg -i "(debug|kalls|GDB_SCRIPTS).*=y" .config
205:CONFIG_KALLSYMS=y
206:CONFIG_KALLSYMS_ALL=y
...
225:CONFIG_SLUB_DEBUG=y
...
9620:CONFIG_DEBUG_INFO=y
9623:CONFIG_DEBUG_INFO_DWARF4=y
9624:CONFIG_GDB_SCRIPTS=y
9640:CONFIG_DEBUG_KERNEL=y
...
By using "-s" option I can connect to Ubuntu 18.04 kernel in my VM, but gdb does not show any symbols:
Reading symbols from vmlinux...
(gdb) target remote :1234
Remote debugging using :1234
0xffffffff8ea4af66 in ?? ()
(gdb) bt
#0 0xffffffff8ea4af66 in ?? ()
#1 0xffffffff8f603e38 in ?? ()
#2 0xffffffff8ea4abb2 in ?? ()
#3 0x0000000000000000 in ?? ()
(gdb) i t
Ambiguous info command "t": target, tasks, terminal, threads, tp, tracepoints, tvariables, type-printers, types.
(gdb) i threads
Id Target Id Frame
* 1 Thread 1 (CPU#0 [halted ]) 0xffffffff8ea4af66 in ?? ()
2 Thread 2 (CPU#1 [halted ]) 0xffffffff8ea4af66 in ?? ()
(gdb) b printk
Breakpoint 1 at 0xffffffff81101fa3: file /home/ilukic/projects/kernel/linux-stable/kernel/printk/printk.c, line 2030.
(gdb) c
Continuing.
Warning:
Cannot insert breakpoint 1.
Cannot access memory at address 0xffffffff81101fa3
Command aborted.
(gdb) disassemble 0xffffffff81101f83,100
Dump of assembler code from 0xffffffff81101f83 to 0x64:
End of assembler dump.
(gdb) disassemble 0xffffffff81101f83,+100
Dump of assembler code from 0xffffffff81101f83 to 0xffffffff81101fe7:
0xffffffff81101f83 <kmsg_dump_rewind_nolock+19>: Cannot access memory at address 0xffffffff81101f83
(gdb) disassemble 0xffffffff81101fa3,+10
Dump of assembler code from 0xffffffff81101fa3 to 0xffffffff81101fad:
0xffffffff81101fa3 <printk+0>: Cannot access memory at address 0xffffffff81101fa3
At the end, when inspecting /proc/kallsyms on VM (e.g. searching for printk symbol from previous gdb session), no symbol is found:
~$ cat /proc/kallsyms | grep "t printk"
0000000000000000 t printk_safe_log_store
0000000000000000 t printk_late_init
~$ uname -a
Linux ubuntu18 5.1.0 #2 SMP Tue Nov 12 19:01:21 CET 2019 x86_64 x86_64 x86_64 GNU/Linux
On the other hand when using objdump, "printk" can be found in vmlinux and as seen, gdb does not complain about missing symbol when setting a breakpoint.
I am assuming that the installation of the kernel went well, as no errors were reported, but I still can't explain why I can't find the corresponding symbols in kallsyms.
Another thing that I find strange is that all the lines in /proc/kallsyms start with 0s.
Any ideas why gdb is not showing any symbols?
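As @IanAbbott suggested, KASLR was the cause: with CONFIG_RANDOMIZE_BASE=y the kernel is loaded at a randomized address that does not match the addresses in vmlinux.
Disabling KASLR, either by unsetting CONFIG_RANDOMIZE_BASE or by adding "nokaslr" to the kernel command line, fixes it.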
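A quick way to check which situation a guest is in is to look at both the build configuration and the boot command line. The snippet below is a small sketch of that check, not part of the original setup: kaslr_active is a made-up helper that takes the text of a .config file and of a kernel command line (for example the contents of /proc/cmdline on the guest) and reports whether KASLR would be in effect.

```python
def kaslr_active(config_text, cmdline):
    """Return True if KASLR is compiled in and was not disabled on the command line."""
    compiled_in = any(
        line.strip() == "CONFIG_RANDOMIZE_BASE=y"
        for line in config_text.splitlines()
    )
    disabled_at_boot = "nokaslr" in cmdline.split()
    return compiled_in and not disabled_at_boot


if __name__ == "__main__":
    # In-memory samples; in practice, pass the contents of .config and /proc/cmdline.
    sample_config = "CONFIG_DEBUG_INFO=y\nCONFIG_RANDOMIZE_BASE=y\n"
    print(kaslr_active(sample_config, "root=/dev/sda1 ro"))
    print(kaslr_active(sample_config, "root=/dev/sda1 ro nokaslr"))
```

When the check reports that KASLR is active, the addresses gdb takes from vmlinux will generally not match the running kernel, which is consistent with the "Cannot access memory" errors above.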

Debugging Linux Kernel using GDB in qemu unable to hit function or given address

I am trying to understand the kernel boot sequence step by step using GDB in a qemu environment.
Below is my setup:
In one terminal I'm running
~/Qemu_arm/bin/qemu-system-arm -M vexpress-a9 -dtb ./arch/arm/boot/dts/vexpress-v2p-ca9.dtb -kernel ./arch/arm/boot/zImage -append "root=/dev/mmcblk0 console=ttyAMA0" -sd ../Images/RootFS.ext3 -serial stdio -s -S
In other terminal
arm-none-linux-gnueabi-gdb vmlinux
Reading symbols from vmlinux...done.
(gdb) target remote :1234
Remote debugging using :1234
0x60000000 in ?? ()
My question is how to set up a breakpoint for the code in the /arch/arm/boot/compressed/* files.
E.g. I tried to set a breakpoint on decompress_kernel, defined in misc.c.
Case 1:
(gdb) b decompress_kernel
Function "decompress_kernel" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (decompress_kernel) pending.
(gdb) c
Continuing.
With the above, the breakpoint is never hit and qemu simply boots.
Case 2:
(gdb) b *0x80008000
Breakpoint 1 at 0x80008000: file arch/arm/kernel/head.S, line 89.
(gdb) c
Continuing.
In this case too, the breakpoint is not hit; instead qemu boots up.
Case 3:
(gdb) b start_kernel
Breakpoint 1 at 0x8064d8d8: file init/main.c, line 498.
(gdb) c
Continuing.
Breakpoint 1, start_kernel () at init/main.c:498
498 {
(gdb)
In this case the breakpoint is hit and I am able to debug step by step.
Note: I have enabled debug and early printk, and I also tried hbreak.
So my questions are:
Why are breakpoints on some functions never hit?
Is this a qemu limitation, or do I need to enable something more?
Do I need to append any extra parameters?
How can I debug early kernel boot?
You are not able to put breakpoints on any function preceding start_kernel because you are not loading symbols for them. In fact you are starting qemu with a zImage of the kernel but loading the symbols from vmlinux. They are not the same: zImage is basically vmlinux compressed as a data payload which is then attached to a stub which decompresses it in memory then jumps to start_kernel.
start_kernel is the entry point of vmlinux; any function preceding it, including decompress_kernel, is part of the stub and not present in vmlinux.
I don't know whether running "arm-none-linux-gnueabi-gdb zImage" instead would allow you to debug the stub. I have always done early debug of ARM kernels with JTAG debuggers on real hardware and never used qemu for that, sorry.
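A possible explanation for Case 2, offered as a guess rather than something I have tested on this setup: the code in head.S runs before the MMU is enabled, so the CPU is still executing at physical addresses, while the breakpoint was placed at the virtual (link) address 0x80008000. The sketch below shows the address arithmetic under two assumptions inferred from the question: a kernel virtual base of 0x80000000 (from the breakpoint address) and RAM starting at 0x60000000 (from the initial PC reported by gdb). The helper names are made up.

```python
# Assumed values, inferred from the question rather than confirmed.
PAGE_OFFSET = 0x80000000  # kernel virtual base (breakpoint was at 0x80008000)
PHYS_OFFSET = 0x60000000  # start of RAM (initial PC reported by gdb)


def virt_to_phys(vaddr, page_offset=PAGE_OFFSET, phys_offset=PHYS_OFFSET):
    """Translate a kernel link-time virtual address to its physical address."""
    if vaddr < page_offset:
        raise ValueError("address is below the kernel virtual base")
    return vaddr - page_offset + phys_offset


def early_breakpoint_command(vaddr):
    """Build a hardware-breakpoint command for code that runs with the MMU off."""
    return f"hbreak *{virt_to_phys(vaddr):#x}"


if __name__ == "__main__":
    print(early_breakpoint_command(0x80008000))
```

A hardware breakpoint is used here because a software breakpoint written into memory before the decompressor runs would likely be overwritten when the kernel is unpacked to that address.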

How to debug program with custom elf interpreter?

I can debug some program (say /bin/ls) like this:
[ks@localhost ~]$ gdb -q --args /bin/ls
Reading symbols from /bin/ls...Reading symbols from /bin/ls...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Missing separate debuginfos, use: debuginfo-install coreutils-8.22-19.fc21.x86_64
(gdb) start
Temporary breakpoint 1 at 0x402990
Starting program: /usr/bin/ls
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Temporary breakpoint 1, 0x0000000000402990 in main ()
(gdb)
Here I can set temporary breakpoint at main and stop at it.
But I have to run program with custom elf interpreter like this:
[ks@localhost ~]$ gdb -q --args /lib64/ld-linux-x86-64.so.2 /bin/ls
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/usr/lib64/ld-2.20.so.debug...done.
done.
(gdb) start
Function "main" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Temporary breakpoint 1 (main) pending.
Starting program: /usr/lib64/ld-linux-x86-64.so.2 /bin/ls
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
1234 glibc-2.20 python tmp
[Inferior 1 (process 2610) exited normally]
Missing separate debuginfos, use: debuginfo-install libacl-2.2.52-7.fc21.x86_64 libattr-2.4.47-9.fc21.x86_64 libcap-2.24-7.fc21.x86_64 pcre-8.35-8.fc21.x86_64
(gdb)
Here gdb did not stop at main because symbols for /bin/ls were not loaded.
How can I force gdb to load symbols and stop at main in this case?
Here is how you can do it:
cat t.c
#include <stdio.h>
#include <stdlib.h>
int main()
{
printf("Hello\n");
return 0;
}
gcc -g t.c
gdb -q --args /usr/lib64/ld-linux-x86-64.so.2 ./a.out
(gdb) start
Function "main" not defined.
Starting program: /usr/lib64/ld-linux-x86-64.so.2 ./a.out
Hello
[Inferior 1 (process 7134) exited normally]
So far everything is matching what you observed. Now for the solution:
(gdb) set stop-on-solib-events 1
(gdb) r
Starting program: /usr/lib64/ld-linux-x86-64.so.2 ./a.out
Stopped due to shared library event (no libraries added or removed)
(gdb) c
Continuing.
Stopped due to shared library event:
Inferior loaded /usr/lib64/ld-linux-x86-64.so.2
(gdb) c
Continuing.
Stopped due to shared library event:
Inferior loaded /usr/lib64/libc.so.6
At this point, ./a.out has also been loaded, and you can confirm that with:
(gdb) info proc map
process 7140
Mapped address spaces:
Start Addr End Addr Size Offset objfile
0x400000 0x401000 0x1000 0x0 /tmp/a.out
0x600000 0x601000 0x1000 0x0 /tmp/a.out
0x601000 0x602000 0x1000 0x1000 /tmp/a.out
0x555555554000 0x555555579000 0x25000 0x0 /usr/lib64/ld-2.19.so
0x555555779000 0x55555577a000 0x1000 0x25000 /usr/lib64/ld-2.19.so
0x55555577a000 0x55555577c000 0x2000 0x26000 /usr/lib64/ld-2.19.so
0x7ffff7c2a000 0x7ffff7c2d000 0x3000 0x0
0x7ffff7c2d000 0x7ffff7df0000 0x1c3000 0x0 /usr/lib64/libc-2.19.so
0x7ffff7df0000 0x7ffff7fef000 0x1ff000 0x1c3000 /usr/lib64/libc-2.19.so
0x7ffff7fef000 0x7ffff7ff3000 0x4000 0x1c2000 /usr/lib64/libc-2.19.so
0x7ffff7ff3000 0x7ffff7ff5000 0x2000 0x1c6000 /usr/lib64/libc-2.19.so
0x7ffff7ff5000 0x7ffff7ff9000 0x4000 0x0
0x7ffff7ff9000 0x7ffff7ffa000 0x1000 0x0 /etc/ld.so.cache
0x7ffff7ffa000 0x7ffff7ffd000 0x3000 0x0
0x7ffff7ffd000 0x7ffff7fff000 0x2000 0x0 [vdso]
0x7ffffffde000 0x7ffffffff000 0x21000 0x0 [stack]
0xffffffffff600000 0xffffffffff601000 0x1000 0x0 [vsyscall]
Unfortunately, GDB does not understand that it should also load symbols for ./a.out. You have to tell it:
(gdb) add-symbol-file ./a.out
The address where ./a.out has been loaded is missing
One would think that the address that GDB needs would be from the above info proc map: 0x400000. One would be wrong. The actual address GDB needs is the start of .text section, which you can get from readelf:
readelf -WS ./a.out | grep text
[13] .text PROGBITS 0000000000400440 000440 000182 00 AX 0 0 16
Back to GDB:
(gdb) add-symbol-file ./a.out 0x0000000000400440
add symbol table from file "./a.out" at
.text_addr = 0x400440
Reading symbols from ./a.out...done.
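The lookup of the .text address can be scripted. What follows is a rough sketch, assuming the input is the text printed by readelf -WS and that the executable is not position-independent (as in this example, where it is mapped at 0x400000); for a PIE the load base would have to be added. The function names are hypothetical.

```python
import re


def text_section_address(readelf_output):
    """Find the address of the .text section in 'readelf -WS' output."""
    for line in readelf_output.splitlines():
        # Section rows look like: [13] .text PROGBITS 0000000000400440 000440 ...
        match = re.search(r"\]\s+\.text\s+\S+\s+([0-9a-fA-F]+)\s", line)
        if match:
            return int(match.group(1), 16)
    raise ValueError("no .text section found")


def add_symbol_file_command(path, readelf_output):
    """Build the gdb command that loads symbols for path at its .text address."""
    return f"add-symbol-file {path} {text_section_address(readelf_output):#x}"


if __name__ == "__main__":
    # The section row shown above for ./a.out.
    sample = "  [13] .text PROGBITS 0000000000400440 000440 000182 00 AX 0 0 16\n"
    print(add_symbol_file_command("./a.out", sample))
```

For the section row shown above, this builds the same add-symbol-file command that was typed by hand.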
And now we can break on main:
(gdb) b main
Breakpoint 1 at 0x400531: file t.c, line 6.
(gdb) c
Continuing.
Breakpoint 1, main () at t.c:6
6 printf("Hello\n");
(gdb) n
Hello
7 return 0;
Voila!
P.S. Re-running the binary may give you some glitches:
(gdb) r
Starting program: /usr/lib64/ld-linux-x86-64.so.2 ./a.out
Error in re-setting breakpoint 1: Cannot access memory at address 0x40052d
Error in re-setting breakpoint 1: Cannot access memory at address 0x40052d
Stopped due to shared library event (no libraries added or removed)
This is happening because ld-linux has yet to map ./a.out. But you can continue:
(gdb) c
Continuing.
Stopped due to shared library event:
Inferior loaded /usr/lib64/ld-linux-x86-64.so.2
(gdb) c
Continuing.
Stopped due to shared library event:
Inferior loaded /usr/lib64/libc.so.6
And now, ./a.out has also been loaded, so you can re-enable the breakpoint(s):
(gdb) enable
(gdb) continue
Continuing.
Breakpoint 1, main () at t.c:6
6 printf("Hello\n");
