I'm using valgrind to debug a binary which uses loadable libraries via dlopen.
On debian stable the stacktrace does not contain symbols for calls inside the loadable lib.
| | ->11.55% (114,688B) 0x769492C: ???
| | | ->11.55% (114,688B) 0x7697289: ???
| | | ->11.55% (114,688B) 0x769806F: ???
| | | ->11.55% (114,688B) 0x419812: myfunc (main.c:1010)
Valgrind on Debian unstable works fine and the symbols are properly resolved, so I started looking at what is different.
I have these packages on both systems (valgrind was updated to 3.7 from unstable):
ii valgrind 1:3.7.0-1+b1
ii libtool 2.2.6b-2
ii gcc 4:4.4.5-1
ii binutils 2.20.1-16
The libs are not stripped and contain debuginfo:
ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, BuildID[sha1]=0x33ffd210859178c15bb3923c5491e1a1b6065015, not stripped
Looking closer, I noticed that the sizes of the libraries differ: on Debian unstable the lib is slightly bigger. Comparing the two with readelf shows that the debug info sections are bigger.
[26] .debug_aranges PROGBITS 0000000000000000 00a74c 000090 00 0 0 1
[27] .debug_pubnames PROGBITS 0000000000000000 00a7dc 000385 00 0 0 1
[28] .debug_info PROGBITS 0000000000000000 00ab61 00512f 00 0 0 1
[29] .debug_abbrev PROGBITS 0000000000000000 00fc90 0006e2 00 0 0 1
[30] .debug_line PROGBITS 0000000000000000 010372 002314 00 0 0 1
[31] .debug_str PROGBITS 0000000000000000 012686 0019d3 01 MS 0 0 1
[32] .debug_loc PROGBITS 0000000000000000 014059 000f24 00 0 0 1
[33] .debug_macinfo PROGBITS 0000000000000000 014f7d 179082 00 0 0 1
[34] .debug_ranges PROGBITS 0000000000000000 18dfff 000060 00 0 0 1
This makes me think that something is missing from the debug info section from the binaries built on debian stable. Now my question is: why and how are the binaries different? The tools (gcc, libtool, binutils) used in the build are the same, including the compiler/linker flags and commands (I checked with diff on make's output).
Update:
The debug_info section size difference came from the fact that the full path of the source file is stored there as well and the build home was different. Also there are different openssl versions on unstable/stable which added some different symbols to the debug_info section. Hence the difference in debug_info size.
Running valgrind in debug mode (-d -v -v) shows that it reads symbols from the loadable lib in both cases:
--19837-- Reading syms from /usr/lib/myplugin.so (0x6c62000)
If you are using dlopen for the loadable library, chances are that it was unloaded before the program terminated. Valgrind discards a library's debug information when the library is unloaded, so by the time it prints the stack traces it can no longer resolve those addresses and shows ??? instead. Try to avoid calling dlclose on this library. See http://valgrind.org/docs/manual/faq.html#faq.unhelpful for more information.
I need to access .symtab symbol table by parsing memory of the process.
At the moment, my algorithm is:
Get Dynamic segment (Program's header p_type == PT_DYNAMIC) and follow p_vaddr
Search in this Dynamic Section for the DT_SYMTAB d_tag and take ptr from +4 offset (d_ptr), which should be our actual .symtab Symbol Table.
However, instead of .symtab, for some reason, I'm receiving .dynsym, which is proved by comparing symbol names and other info retrieved from readelf -Ws.
So, how to get the actual .symtab ptr?
Thank you.
For reference, I'm using:
https://en.wikipedia.org/wiki/Executable_and_Linkable_Format#Program_header
http://labmaster.mi.infn.it/Laboratorio2/CompilerCD/clang/l1/ELF.html
More good resources are appreciated.
I need to access .symtab symbol table by parsing memory of the process.
This is generally impossible because .symtab is normally not loaded into the process memory at all.
E.g.
readelf -WS foo.o | egrep ' \.(data|text|symtab)'
[ 1] .text PROGBITS 0000000000000000 000040 00001b 00 AX 0 0 1
[ 5] .data PROGBITS 0000000000000000 0000d0 000000 00 WA 0 0 1
[ 9] .symtab SYMTAB 0000000000000000 000130 000120 18 10 10 8
Notice that .data and .text have A (allocatable) flag, while .symtab doesn't.
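The same check can be scripted. What follows is a minimal Python sketch rather than a replacement for readelf: it assumes a 64-bit little-endian ELF image that has already been read into memory, and the function name symbol_tables is made up for illustration. It lists every symbol-table section together with whether its SHF_ALLOC flag is set.

```python
import struct

SHT_SYMTAB, SHT_DYNSYM = 2, 11   # section types that hold symbol tables
SHF_ALLOC = 0x2                  # "A" flag: section occupies memory at run time


def symbol_tables(image):
    """Return [(section_name, is_allocated), ...] for an ELF64 LE image.

    `image` is the whole file as bytes. Extended section numbering
    (more than 0xff00 sections) is not handled in this sketch.
    """
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("expected a 64-bit little-endian ELF image")

    # Elf64_Ehdr: only e_shoff, e_shentsize, e_shnum, e_shstrndx are needed.
    hdr = struct.unpack_from("<16sHHIQQQIHHHHHH", image, 0)
    shoff, shentsize, shnum, shstrndx = hdr[6], hdr[11], hdr[12], hdr[13]

    # Elf64_Shdr: name, type, flags, addr, offset, size, link, info, align, entsize
    sections = [
        struct.unpack_from("<IIQQQQIIQQ", image, shoff + i * shentsize)
        for i in range(shnum)
    ]
    names_off = sections[shstrndx][4]  # file offset of the section name table

    def name(idx):
        start = names_off + idx
        return image[start:image.index(b"\0", start)].decode()

    return [
        (name(s[0]), bool(s[2] & SHF_ALLOC))
        for s in sections
        if s[1] in (SHT_SYMTAB, SHT_DYNSYM)
    ]
```

Calling symbol_tables(open("foo.o", "rb").read()) on the object file above would be expected to report .symtab as not allocated, matching the missing A flag in the readelf output.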
However, instead of .symtab, for some reason, I'm receiving .dynsym
.dynsym is the only symbol table used at runtime, and is the only symbol table you can get without reading the executable on disk.
P.S. Also note that a fully-stripped binary will not have a .symtab at all, while still being perfectly runnable.
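For reference, the DT_SYMTAB lookup described in the question can be sketched as below. This is only an illustration under stated assumptions: the dynamic section has already been copied out of the process as bytes, the target is 64-bit little-endian, and the name find_dt_symtab is hypothetical. On ELF64 each Elf64_Dyn entry is 16 bytes with the value at offset +8; the +4 offset mentioned in the question applies to 32-bit targets, where an entry is 8 bytes.

```python
import struct

DT_NULL, DT_SYMTAB = 0, 6
DYN = struct.Struct("<qQ")  # Elf64_Dyn: d_tag (signed), d_un (d_val / d_ptr)


def find_dt_symtab(dynamic):
    """Return the d_ptr of the DT_SYMTAB entry, or None if there is none.

    `dynamic` holds the raw bytes of the PT_DYNAMIC segment. The walk
    stops at the DT_NULL terminator.
    """
    for off in range(0, len(dynamic) - DYN.size + 1, DYN.size):
        d_tag, d_ptr = DYN.unpack_from(dynamic, off)
        if d_tag == DT_NULL:
            break
        if d_tag == DT_SYMTAB:
            return d_ptr
    return None
```

The address this returns is that of .dynsym, which is why the names match readelf's dynamic symbol listing and not .symtab.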
I'm new to bare-metal and kernel programming, and what better way to start my journey than with a hello world!
Sadly, when it comes to my architecture of choice, PPC64 (Using QEMU and OpenFirmware), I struggle to find relevant information or code examples on how to make a hello world program, using the firmware.
So far I've struggled to get even the simplest things working. I've tried using this start routine as my entry point, together with this linker script:
.section .boot, "aw"
.global start
start:
b start # Basically halt the machine.
ENTRY(start)
SECTIONS
{
. = 1M;
.text : {
*(.boot)
*(.text*)
}
.data : {
*(.data*)
*(.rodata*)
}
.bss : {
*(COMMON)
*(.bss)
}
}
I've tested it with:
clang --target=ppc64-unknown-elf -c <asm_file> -o <asm_file>.o
ld.lld --oformat elf_ppc64 --nostdlib -T <linkscript> <asm_file>.o -o output.elf
qemu-system-ppc64 -kernel output.elf -serial stdio
But so far the only outcome of my attempts has been this output of SLOF in QEMU emulation:
Detected RAM kernel at 400000 (4 bytes)
Welcome to Open Firmware
Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
This program and the accompanying materials are made available
under the terms of the BSD License available at
http://www.opensource.org/licenses/bsd-license.php
Booting from memory...
( 700 ) Program Exception [ 1dbf04c4 ]
R0 .. R7 R8 .. R15 R16 .. R23 R24 .. R31
8000000000002000 000000001e478200 0000000000000000 0000000000000000
000000001dc71000 8000000000000000 0000000000000000 0000000000000000
0000000000000000 000000001e477010 0000000000000000 0000000000000000
0000000000000000 0000000000000030 0000000000000000 0000000000000000
0000000000000000 000000000000005b 0000000000000000 0000000000000000
000000001dbf04c4 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
0000000000000000 0000000000000000 0000000000000000 0000000000000000
How could I get this little snippet to work? Is there any documentation I could use to finish the complete hello world program? Thanks in advance!
Have a look at the micropython powerpc port README here:
https://github.com/micropython/micropython/tree/master/ports/powerpc
It shows you how to run QEMU so that it skips Open Firmware and boots directly into your test program. You'll want a stripped binary rather than the ELF (see the objcopy in the Makefile).
In that directory there is a linker script and a head.S which shows you the basics.
Good luck!
I am generating an ELF file for ARM platform using linaro tool chain.
The file is an executable that is supposed to run bare-metal.
I use a linker script to select the locations of sections in the memory because I want to put specific sections in specific locations.
The problem is that when I move some section forward in the memory I see that the image size increases, although no additional data has been added.
When I run readelf -a elf_file I see that both the virtual address (see the Address field below) and the offset in the image (see the Offset field below) have increased.
Example:
The following lines in the linker script
. = 0x2000000;
.__translations_block_0 : { TM_TranslationTables.o(__translations_block_0) }
Result in the following offsets in the elf file (output from readelf)
[Nr] Name Type Address Offset Size EntSize Flags Link Info Align
[10] .tdata PROGBITS 0000000000279000 00279080 000000000000000c 0000000000000000 WAT 0 0 16
[11] .tbss NOBITS 0000000000279080 0027908c 0000000000011bcc 0000000000000000 WAT 0 0 16
[12] .__translations_b PROGBITS 0000000002000000 02000080 0000000000000008 0000000000000000 WA 0 0 8
[13] .__translations_b PROGBITS 0000000002001000 02001080 0000000000000008 0000000000000000 WA 0 0 8
My question is:
Is there a way to increase the address of some section without blowing up the image size? I just want the section to be loaded at memory address 0x2000000; I don't want the image size to be 0x2000000.
Any help would be appreciated.
I have learnt from this recent answer that gcc and clang include the source filename somewhere in the binary as metadata, even when debugging is not enabled.
I can't really understand why this should be a good idea. Besides the tiny privacy risks, this happens also when one optimizes for the size of the resulting binary (-Os), which looks inefficient.
Why do the compilers include this information?
GCC includes the filename mainly for debugging purposes: it allows a programmer to identify which source file a given symbol comes from, as (tersely) outlined in the ELF spec p1-17 and further expanded upon in some Oracle docs on linking.
An example of using the STT_FILE symbol is given by this SO question.
I'm still confused why both GCC and Clang include it even if you specify -g0, but you can stop them from emitting the STT_FILE symbol with -s. I couldn't find any explanation for this, nor could I find an "official reason" why STT_FILE is included in the ELF specification (which is very terse).
I have learnt from this recent answer that gcc includes the source filename somewhere in the binary as metadata, even when debugging is not enabled.
Not quite. In modern ELF object files the file name indeed is a symbol of type FILE:
$ readelf -a bignum.o # Source bignum.c
[...]
Symbol table (.symtab) contains 36 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS bignum.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
5: 0000000000000000 0 SECTION LOCAL DEFAULT 5
6: 0000000000000000 0 SECTION LOCAL DEFAULT 6
7: 0000000000000000 0 SECTION LOCAL DEFAULT 7
8: 0000000000000000 0 SECTION LOCAL DEFAULT 8
9: 00000000000003f0 172 FUNC GLOBAL DEFAULT 1 add
10: 00000000000004a0 104 FUNC GLOBAL DEFAULT 1 copy
However, once stripped, the symbol is gone:
$ strip bignum.o
$ readelf -all bignum.o | grep bignum.c
$
So to keep your privacy, strip the executable, or compile/link with -s.
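To check for such entries without readelf, a small sketch like the following can help. It assumes the raw bytes of .symtab and of its string table (.strtab) from a 64-bit little-endian object have already been extracted; the function name file_symbols is made up for this example. The symbol type lives in the low four bits of st_info, and STT_FILE has the value 4.

```python
import struct

STT_FILE = 4
SYM = struct.Struct("<IBBHQQ")  # Elf64_Sym: name, info, other, shndx, value, size


def file_symbols(symtab, strtab):
    """Return the names of all STT_FILE symbols in a raw ELF64 symbol table."""
    names = []
    for off in range(0, len(symtab) - SYM.size + 1, SYM.size):
        st_name, st_info = SYM.unpack_from(symtab, off)[:2]
        if st_info & 0xF == STT_FILE:          # low nibble is the symbol type
            end = strtab.index(b"\0", st_name)  # names are NUL-terminated
            names.append(strtab[st_name:end].decode())
    return names
```

An empty result for a given binary means no source file name is recorded in that symbol table; a stripped file has no .symtab to feed in at all.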
I'm trying to remotely debug an application using GDB command line.
The directory in which gdb is run on the PC is the build directory of the application. It contains the amixer executable and amixer.c.
The code is compiled with -g -O2 parameters.
The debug symbols seem to be present:
$ readelf -WS amixer
There are 38 section headers, starting at offset 0x1d24c:
...
[27] .debug_aranges PROGBITS 00000000 00a758 000140 00 0 0 8
[28] .debug_info PROGBITS 00000000 00a898 008c59 00 0 0 1
[29] .debug_abbrev PROGBITS 00000000 0134f1 00085a 00 0 0 1
[30] .debug_line PROGBITS 00000000 013d4b 001a8c 00 0 0 1
[31] .debug_frame PROGBITS 00000000 0157d8 000494 00 0 0 4
[32] .debug_str PROGBITS 00000000 015c6c 001f75 01 MS 0 0 1
[33] .debug_loc PROGBITS 00000000 017be1 004dff 00 0 0 1
[34] .debug_ranges PROGBITS 00000000 01c9e0 000700 00 0 0 1
Steps on remote device (stripped binary):
gdbserver 192.16.6.21:12345 amixer
Steps on PC (binary here not stripped):
$ gdb amixer
(gdb) set sysroot /correct/path/to/remote/device/sysroot
(gdb) target remote 192.16.6.12:12345
(gdb) break main
Breakpoint 1 at 0x11f58
(gdb) list main
(gdb) show directories
Source directories searched: $cdir:$cwd
(gdb) continue
...program executes on remote device...
Assumptions I've made:
break main doesn't throw an error so the executable debug symbols are available. I would expect to see the source file mentioned here already. Like in example: Breakpoint 1 at 0x62f4: file builtin.c, line 879.
there is .debug* in the output of readelf -WS amixer so the debug symbols are present
list main doesn't list the source of the main function. Something isn't right
show directories lists $cdir and $cwd. I'm guessing at least one of them is the directory from which I started gdb amixer, which is the build directory containing both the executable and the sources
I'm obviously doing something wrong so am looking for a review of the assumptions and debugging tips.
break main doesn't throw an error so the executable debug symbols are available.
You are mistaken: the fact that break main does not show any errors does not imply that debug symbols are available. And the rest of your output is consistent with debug symbols not being available.
So your first step should be to confirm that debug symbols are in fact present. If readelf -WS amixer does not show any .debug_* or .zdebug_* sections, that proves that no debug info is present. If so, re-check your build command lines for the presence of the -g flag on compile lines, and the absence of -Wl,-s or a similar flag on the link line.
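If you want to automate that check, here is a rough Python sketch. It only scans text in the format printed by readelf -WS; the helper name debug_sections is hypothetical, and the row pattern is an assumption based on the output shown in the question.

```python
import re

# Matches the "[Nr] Name" prefix of a section row in `readelf -WS` output.
# The unnamed section 0 is misread as "NULL", which is harmless here.
_ROW = re.compile(r"^\s*\[\s*\d+\]\s+(\S+)")


def debug_sections(readelf_output):
    """Return the .debug_* / .zdebug_* section names found in the text."""
    found = []
    for line in readelf_output.splitlines():
        m = _ROW.match(line)
        if m and m.group(1).startswith((".debug_", ".zdebug_")):
            found.append(m.group(1))
    return found
```

The text can be captured with subprocess.run(["readelf", "-WS", "amixer"], capture_output=True, text=True).stdout. An empty list means the file you inspected carries no debug sections, so make sure it is the same file that gdb actually loads.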