I have a question about the ELF dynamic symbol table. For symbols of type FUNC, I have noticed a value of 0 in some binaries, but a non-zero value in others. Both binaries were generated by gcc. Why is there a difference, and are there any compiler options that control it?
EDIT: This is the output of readelf --dyn-syms prog1
Symbol table '.dynsym' contains 5 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
2: 000082f0 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.4 (2)
3: 00008314 0 FUNC GLOBAL DEFAULT UND abort@GLIBC_2.4 (2)
4: 000082fc 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.4
Here the value of the printf symbol is 0x82f0, which happens to be the address of the PLT entry for printf.
Output of readelf --dyn-syms prog2
Symbol table '.dynsym' contains 6 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
2: 00000000 0 FUNC GLOBAL DEFAULT UND puts@GLIBC_2.4 (2)
3: 00000000 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.4 (2)
4: 00000000 0 FUNC GLOBAL DEFAULT UND abort@GLIBC_2.4 (2)
5: 00000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main@GLIBC_2.4
Here the values for all the symbols are zero.
The x86_64 System V ABI mandates that (emphasis mine):
To allow comparisons of function addresses to work as expected,
if an executable file references a function defined in a shared object,
the link editor will place the address of the procedure linkage table
entry for that function in its associated symbol table entry.
This will result in symbol table entries with section index of
SHN_UNDEF but a type of STT_FUNC and a non-zero st_value.
A reference to the address of a function from within a shared
library will be satisfied
by such a definition in the executable.
With my GCC, this program:
#include <stdio.h>
int main()
{
printf("hello %i\n", 42);
return 0;
}
when compiled directly into an executable generates a null value:
1: 0000000000000000 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.2.5 (2)
But this program with a comparison of the printf function:
#include <stdio.h>
int main()
{
printf("hello %i\n", 42);
if (printf == puts)
return 1;
return 0;
}
generates a non-null value:
3: 0000000000400410 0 FUNC GLOBAL DEFAULT UND printf@GLIBC_2.2.5 (2)
In the .o file, the first program generates:
000000000014 000a00000002 R_X86_64_PC32 0000000000000000 printf - 4
and the second:
000000000014 000a00000002 R_X86_64_PC32 0000000000000000 printf - 4
000000000019 000a0000000a R_X86_64_32 0000000000000000 printf + 0
The difference is caused by the extra R_X86_64_32 relocation for getting the address of the function.
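The condition described in the ABI quote can be written down as a small predicate over a symbol table entry. This is only a sketch: it assumes the glibc <elf.h> definitions and a 64-bit ELF file, and the function name undef_func_has_plt_address is made up for illustration.

```c
#include <elf.h>
#include <stdbool.h>

/* Returns true for the case the ABI text describes: a symbol that is
 * still undefined (SHN_UNDEF) and of type STT_FUNC, but whose st_value
 * is non-zero because the link editor stored a PLT entry address there. */
static bool undef_func_has_plt_address(const Elf64_Sym *sym)
{
    return ELF64_ST_TYPE(sym->st_info) == STT_FUNC
        && sym->st_shndx == SHN_UNDEF
        && sym->st_value != 0;
}
```

Applied to the two outputs above, the printf entry with value 0x400410 would satisfy the predicate, while the entry with value 0 would not.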
Some observations from running readelf on a binary:
All FUNC symbols that are undefined (Ndx UND) have size zero.
These undefined functions are the ones resolved from shared libraries. In my small ELF binary, all references to glibc are undefined with size zero.
From http://docs.oracle.com/cd/E19457-01/801-6737/801-6737.pdf, page 21:
The symbol table can hold three kinds of symbols. Two of them, undefined and tentative symbols, have no storage assigned. For the latter, the readelf output shows some symbols that are not undefined (they have a section index) yet have no storage.
To be clear, undefined symbols are referenced but have no storage assigned (they have not been created yet), while tentative symbols have been created but have no storage assigned yet, e.g. uninitialized symbols.
Edit:
If you are asking about the .plt: symbols from shared libraries are bound lazily.
For how to control the binding, see http://www.linuxjournal.com/article/1060:
This feature is known as lazy symbol binding. The idea is that if you have lots of shared libraries, it could take the dynamic loader lots of time to look up all of the functions to initialize all of the .plt slots, so it would be preferable to defer binding addresses to the functions until we actually need them. This turns out to be a big win if you only end up using a small fraction of the functions in a shared library. It is possible to instruct the dynamic loader to bind addresses to all of the .plt slots before transferring control to the application—this is done by setting the environment variable LD_BIND_NOW=1 before running the program. This turns out to be useful in some cases when you are debugging a program, for example. Also, I should point out that the .plt is in read-only memory. Thus the addresses used for the target of the jump are actually stored in the .got section. The .got also contains a set of pointers for all of the global variables that are used within a program that come from a shared library.
I have a simple NASM program, shown below. I want to change the value 43 (at offset +3 in the trx array) to 99.
section .data
trx db 25,21,17,43
section .text
global _start
_start:
mov [trx+3], byte 99
last:
mov rax, 60
mov rdi, 0
syscall
When I debug it and step past the instruction in _start, it works: the value 43 changes to 99.
(gdb) i var
All defined variables:
Non-debugging symbols:
0x00000000006000c4 trx
0x00000000006000c8 __bss_start
0x00000000006000c8 _edata
0x00000000006000c8 _end
(gdb) x/4b &trx
0x6000c4: 25 21 17 43
(gdb) break _start
Breakpoint 1 at 0x4000b0
(gdb) run
Starting program: /home/hexdemsion/Desktop/asm/exec
Breakpoint 1, 0x00000000004000b0 in _start ()
(gdb) stepi
0x00000000004000b8 in last ()
(gdb) x/4b &trx
0x6000c4: 25 21 17 99
Now, how can I set that value directly in GDB? I have tried these commands, but none of them work.
(gdb) set 0x00000000006000c4+3 = 99
Left operand of assignment is not an lvalue.
(gdb) set {int}0x00000000006000c4+3 = 99
Left operand of assignment is not an lvalue.
(gdb) set {b}0x00000000006000c4+3 = 99
No symbol table is loaded. Use the "file" command.
In addition, I don't generate any debug information at assembly time.
nasm -f elf64 -o obj.o source.asm; ld -o exec obj.o
You almost had it; use set {char}(0x00000000006000c4+3) = 99.
Here's a more detailed explanation:
In gdb's set statement, the expression to the left of the = can be a convenience variable, or a register name, or an lvalue corresponding to some object in the target.
An lvalue is an object that has an address, a type, and is assignable.
A literal or computed address such as 0x00000000006000c4 or 0x00000000006000c4+3 isn't an lvalue, but you can cast it to an lvalue using *(type *)(0x00000000006000c4+3) or {type}(0x00000000006000c4+3).
Gdb knows about C primitive types, plus whatever types the executable and libraries you're debugging may contain in their symbol tables or debug sections. In your case, since you want to set a byte, you'd use C's char type.
(gdb) x/4b &trx
0x6000c4: 25 21 17 43
(gdb) set {char}(0x00000000006000c4+3) = 99
(gdb) x/4b &trx
0x6000c4: 25 21 17 99
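The same rule holds in C itself, which may make the gdb syntax easier to remember. The sketch below (poke_byte is a hypothetical helper, not part of gdb) shows that a computed address only becomes assignable once it is cast to a pointer type and dereferenced, and that the chosen type decides how many bytes are written.

```c
#include <stdint.h>

/* C counterpart of gdb's  set {char}(base + offset) = value.
 * (base + offset) alone is just a number; casting it to char * and
 * dereferencing it yields a one-byte lvalue that can be assigned. */
static void poke_byte(uintptr_t base, uintptr_t offset, char value)
{
    *(char *)(base + offset) = value;
}
```

Using int instead of char here, like {int} in gdb, would store sizeof(int) bytes and so write past the end of the four-byte trx array.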
I'm using gcc to compile for mips32, and I declare a pointer to a struct called OSEvent within a global scope as follows:
OSEvent *__osMainEventQueue = NULL;
Additionally, code from within a certain function references this pointer during a call to a function:
__osEnqueueEvent(event, __osMainEventQueue);
That function is declared as follows:
extern void __osEnqueueEvent (OSEvent *event, OSEvent *queue);
However, when debugging this code, gcc seems to dereference the pointer to __osMainEventQueue despite me putting nothing there. You can see this in the disassembly as follows:
118: 3c020000 lui v0,0x0
118: R_MIPS_HI16 __osMainEventQueue
11c: 8c420000 lw v0,0(v0)
11c: R_MIPS_LO16 __osMainEventQueue
120: 00402825 move a1,v0
124: 8fc40018 lw a0,24(s8)
128: 0c000000 jal 0 <osScheduleEvent>
128: R_MIPS_26 __osEnqueueEvent
12c: 00000000 nop
Is there any reason gcc would dereference this pointer? Do I need to reference it with &? (This causes a type mismatch warning so I wouldn't consider this a satisfactory explanation / answer)
There's no pointer dereference. The code is simply loading the value of __osMainEventQueue into $a1 (i.e. the address it points to).
Consider the following scenario: the __osMainEventQueue is located at address 0x12345678 and contains the value 0xDEADBEEF. So what that lui and lw combo does is to first load $v0 with the value 0x12340000. Then it loads from 0x5678($v0), i.e. from (0x12345678), so you end up with 0xDEADBEEF in $v0. Never in this code is there an attempt to read from (0xDEADBEEF).
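A small C sketch makes the same point without reading assembly. The OSEvent layout and the names main_queue, enqueue and schedule are stand-ins invented for this example; only the shape of the call matches the question.

```c
#include <stddef.h>

typedef struct OSEvent { int id; } OSEvent;  /* hypothetical layout */

static OSEvent *main_queue = NULL;  /* stands in for __osMainEventQueue */
static OSEvent *last_queue_seen;    /* records what the callee received */

/* The callee only stores the pointer value; it never reads *queue. */
static void enqueue(OSEvent *event, OSEvent *queue)
{
    (void)event;
    last_queue_seen = queue;
}

static void schedule(OSEvent *event)
{
    /* Passing main_queue by value means reading the variable's contents
     * from memory (the lui/lw pair). With main_queue == NULL the callee
     * simply receives NULL; nothing is read through the NULL pointer. */
    enqueue(event, main_queue);
}
```

Writing &main_queue instead would pass the address of the variable itself, which has type OSEvent ** and explains the type mismatch warning.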
I'm compiling a very simple hello-world one-liner statically on a Debian 7 system on an x86_64 machine with gcc version 4.8.2 (Debian 4.8.2-21):
gcc test.c -static -o test
and I get an executable ELF file that includes the following sections:
[17] .tdata PROGBITS 00000000006b4000 000b4000
0000000000000020 0000000000000000 WAT 0 0 8
[18] .tbss NOBITS 00000000006b4020 000b4020
0000000000000030 0000000000000000 WAT 0 0 8
[19] .init_array INIT_ARRAY 00000000006b4020 000b4020
0000000000000010 0000000000000000 WA 0 0 8
[20] .fini_array FINI_ARRAY 00000000006b4030 000b4030
0000000000000010 0000000000000000 WA 0 0 8
[21] .jcr PROGBITS 00000000006b4040 000b4040
0000000000000008 0000000000000000 WA 0 0 8
[22] .data.rel.ro PROGBITS 00000000006b4060 000b4060
00000000000000e4 0000000000000000 WA 0 0 32
Note that .tbss section is allocated at addresses 0x6b4020..0x6b4050 (0x30 bytes) and it intersects with allocation of .init_array section at 0x6b4020..0x6b4030 (0x10 bytes), .fini_array section at 0x6b4030..0x6b4040 (0x10 bytes) and with .jcr section at 0x6b4040..0x6b4048 (8 bytes).
Note it does not intersect with the following sections, for example, .data.rel.ro, but that's probably because .data.rel.ro alignment is 32 and thus it can't be placed any earlier than 0x6b4060.
The resulting file runs OK, but I still don't exactly get how it works. From what I read in the glibc documentation, .tbss is just a .bss section for thread-local storage (i.e. allocated scratch memory, not really mapped in the physical file). Is .tbss so special that it can overlap other sections? Are .init_array, .fini_array and .jcr so useless (for example, no longer needed once the TLS-related code runs) that they can be overwritten by bss? Or is it some sort of a bug?
Basically, what do I get to read and write if I'll try to read address 0x6b4020 in my application? .tbss contents or .init_array pointers? Why?
The virtual address of .tbss is meaningless as that section only serves as a template for the TLS storage as allocated by the threading implementation in GLIBC.
This virtual address comes about because .tbss follows .tdata in the default linker script:
...
.gcc_except_table : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) }
/* Thread Local Storage sections */
.tdata : { *(.tdata .tdata.* .gnu.linkonce.td.*) }
.tbss : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) }
.preinit_array :
{
PROVIDE_HIDDEN (__preinit_array_start = .);
KEEP (*(.preinit_array))
PROVIDE_HIDDEN (__preinit_array_end = .);
}
.init_array :
{
PROVIDE_HIDDEN (__init_array_start = .);
KEEP (*(SORT(.init_array.*)))
KEEP (*(.init_array))
PROVIDE_HIDDEN (__init_array_end = .);
}
...
Therefore its virtual address is simply the virtual address of the preceding section (.tdata) plus the size of the preceding section (possibly with some padding to reach the required alignment). .init_array (or .preinit_array if present) comes next and its location should be determined the same way, but .tbss is so special that it is given hard-coded treatment deep inside GNU ld:
/* .tbss sections effectively have zero size. */
if ((os->bfd_section->flags & SEC_HAS_CONTENTS) != 0
|| (os->bfd_section->flags & SEC_THREAD_LOCAL) == 0
|| link_info.relocatable)
dotdelta = TO_ADDR (os->bfd_section->size);
else
dotdelta = 0; // <----------------
dot += dotdelta;
This is not a relocatable link (link_info.relocatable is false), .tbss has the SEC_THREAD_LOCAL flag set, and it has no contents (NOBITS), so the else branch is taken. In other words, no matter how large .tbss is, the linker does not advance the location counter (also known as "the dot"), so the section that follows starts at the same address.
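To see that this rule alone reproduces the addresses in the question, here is a simplified model of the address assignment. It is not GNU ld code: the struct, the function names and the sample table are made up for illustration, sizes and alignments are copied from the section headers in the question, and the relocatable-link case is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct section {
    const char *name;
    uint64_t size;
    uint64_t align;      /* must be a power of two */
    bool has_contents;   /* false for NOBITS sections */
    bool is_tls;         /* thread-local section */
    uint64_t addr;       /* filled in by assign_addresses() */
};

/* Sizes and alignments taken from the readelf output in the question. */
static struct section sample[] = {
    { ".tdata",       0x20,  8, true,  true,  0 },
    { ".tbss",        0x30,  8, false, true,  0 },
    { ".init_array",  0x10,  8, true,  false, 0 },
    { ".fini_array",  0x10,  8, true,  false, 0 },
    { ".jcr",         0x08,  8, true,  false, 0 },
    { ".data.rel.ro", 0xe4, 32, true,  false, 0 },
};

static uint64_t align_up(uint64_t value, uint64_t align)
{
    return (value + align - 1) & ~(align - 1);
}

/* Walk the sections in order. The location counter ("dot") advances by
 * the section size, except for a TLS section without contents (.tbss),
 * which is treated as if it had zero size. */
static void assign_addresses(struct section *s, size_t n, uint64_t dot)
{
    for (size_t i = 0; i < n; i++) {
        dot = align_up(dot, s[i].align);
        s[i].addr = dot;
        if (s[i].has_contents || !s[i].is_tls)
            dot += s[i].size;
    }
}
```

Calling assign_addresses(sample, 6, 0x6b4000) places .tbss and .init_array both at 0x6b4020 and .data.rel.ro at 0x6b4060, matching the section headers shown in the question.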
Note also that .tbss sits in a non-loadable ELF segment:
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000b1f24 0x00000000000b1f24 R E 200000
LOAD 0x00000000000b2000 0x00000000006b2000 0x00000000006b2000
0x0000000000002288 0x00000000000174d8 RW 200000
NOTE 0x0000000000000158 0x0000000000400158 0x0000000000400158
0x0000000000000044 0x0000000000000044 R 4
TLS 0x00000000000b2000 0x00000000006b2000 0x00000000006b2000 <---+
0x0000000000000020 0x0000000000000060 R 8 |
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000 |
0x0000000000000000 0x0000000000000000 RW 8 |
|
Section to Segment mapping: |
Segment Sections... |
00 .note.ABI-tag ... |
01 .tdata .ctors ... |
02 .note.ABI-tag ... |
03 .tdata .tbss <---------------------------------------------------+
04
This is rather simple if you understand two things:
1) What SHT_NOBITS is
2) What the tbss section is
SHT_NOBITS means that the section occupies no space inside the file.
Normally, NOBITS sections such as bss are placed after all PROGBITS sections, at the end of the loaded segments.
tbss is a special section that holds uninitialized thread-local data contributing to the program's memory image. Note that this section must hold separate data for each thread of the program.
Now let's talk about overlapping. There are two possible kinds of overlap: inside the binary file and inside memory.
1) Offset in the binary file:
There is no data to write for this section in the binary. It takes up no space in the file, so the linker starts the next section, init_array, immediately after tbss is declared. Think of its size not as a size but as service information for code like:
if (isTLSSegment) tlsStartAddr += section->memSize();
So it doesn't overlap anything inside the file.
2) Offset in memory:
The tdata and tbss sections may be possibly modified at startup time by the dynamic linker performing relocations, but after that the section data is kept around as the initialization image and not modified anymore. For each thread, including the initial one, new memory is allocated into which then the content of the initialization image is copied. This ensures that all threads get the same starting conditions.
This is what makes tbss (and tdata) so special.
Do not think of their memory offsets as statically known; they are more like "generation patterns" for per-thread work. So they cannot overlap with "normal" memory offsets either; they are processed in a different way.
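The "initialization image" idea can be modelled in a few lines of C. This is a conceptual sketch, not glibc's actual code, and init_tls_block is a hypothetical name: each thread gets its own block, the initialized part (.tdata, the segment's FileSiz) is copied from the image, and the remainder up to MemSiz (.tbss) is zero-filled.

```c
#include <stddef.h>
#include <string.h>

/* Set up one thread's TLS block from the shared initialization image.
 * filesz bytes come from the image (.tdata); the remaining
 * memsz - filesz bytes (.tbss) exist only in memory and start as zero. */
static void init_tls_block(unsigned char *block, const unsigned char *image,
                           size_t filesz, size_t memsz)
{
    memcpy(block, image, filesz);
    memset(block + filesz, 0, memsz - filesz);
}
```

For the TLS segment shown in the other answer, filesz would be 0x20 and memsz 0x60. Because every thread writes only to its own block, the image itself stays unchanged for the next thread.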
You may consult this paper to learn more.
Sorry if the questions are dumb, but they are really confusing me!
According to the ELF standard, a binary is divided into segments such as the text segment (containing code and read-only data) and the data segment (containing read-write data and BSS). These are loaded into memory when the program is executed and the process is created, and they provide the information needed to prepare the environment for process execution.
The question is: how is it decided how much stack to allocate to the process when I do not provide a stack size at process creation?
Also, from the data segment we can determine how much memory the process requires (for global variables), but once this memory is allocated, how are variables mapped to addresses inside it?
Lastly, is there any relation between this and scatter loading? I think not, since scatter loading is done when the image is loaded into memory, and once control is passed to the OS, the memory allocated to executables or applications is taken care of by the OS itself!
I know these are too many questions, but any help will be greatly appreciated.
If you can provide any reference books or links where I can study this in detail, that would also be appreciated.
Thanks a tonne! :)
The question is: how is it decided how much stack to allocate to the process when I do not provide a stack size at process creation?
When a new process is created, the execve() system call loads the new program as a process image, replacing the currently running process image. In other words, execve replaces the old .text and .data segments and the heap, and resets the stack. The ELF executable is then mapped into the address space, and the stack is initialized with the environment array and the argument array passed to main().
Inside do_execve_common(), the subroutine bprm_mm_init() handles tasks such as:
Creating a new mm_struct instance to manage the process address space, via mm_alloc().
Initializing this instance with init_new_context().
Initializing the stack.
search_binary_handler() then searches for a suitable binary format handler, i.e. load_binary or load_shlib, to load programs or shared libraries respectively. After that, memory is mapped into the virtual address space and the process is made ready to run when the scheduler picks it.
The stack therefore looks as shown below when main() starts executing. From then on, the frame of each function call, including its parameters and local variables, is pushed onto the stack dynamically as the calls happen.
-----------------
| | <--- Top of the Stack
| environmental |
| variables and |
| the other |
| parameters to |
| main() |
_________________ <--- Stack Pointer
| |
| Stack Space |
| |
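As for the size itself: on Linux the main thread's stack is not reserved in full up front. It grows on demand up to the RLIMIT_STACK resource limit, which is the value `ulimit -s` reports. A minimal sketch for reading that limit from inside a process follows; stack_limits is a hypothetical helper name, while getrlimit() is the standard POSIX call.

```c
#include <sys/resource.h>

/* Fetch the soft and hard stack size limits of the calling process.
 * Returns 0 on success and -1 if getrlimit() fails. A limit equal to
 * RLIM_INFINITY means "unlimited". */
static int stack_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;
    *hard = rl.rlim_max;
    return 0;
}
```

The soft limit is the one that applies to the running process; it can be raised up to the hard limit with setrlimit() before starting a program that needs a deeper stack.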
Also, from the data segment we can determine how much memory the process requires (for global variables), but once this memory is allocated, how are variables mapped to addresses inside it?
Let's try to figure out how variables are mapped to the different memory segments by debugging a simple C program:
/* File Name: elf.c : Demonstrating Global variables */
#include <stdio.h>
int add_numbers(void);
int value1 = 10; // Global Initialized: .data section
int value2; // Global Uninitialized: .bss section
int add_numbers(void)
{
int result; // Local Uninitialized: Stack section
result = value1 + value2;
return result;
}
int main(void)
{
int final_result; // Local Uninitialized: Stack section
value2 = 20;
final_result = add_numbers();
printf("The sum of %d + %d is %d\n",
value1, value2, final_result);
}
Use readelf to display the .data and .bss section headers:
$readelf -a elf
...
Section Headers:
[26] .data PROGBITS 00000000006c2060 000c2060
00000000000016b0 0000000000000000 WA 0 0 32
[27] .bss NOBITS 00000000006c3720 000c3710
0000000000002bc8 0000000000000000 WA 0 0 32
...
$readelf -x 26 elf
Hex dump of section '.data':
0x006c2060 00000000 00000000 00000000 00000000 ................
0x006c2070 0a000000 00000000 00000000 00000000 ................
...
Let's use GDB to look at what these sections contain:
(gdb) disassemble 0x006c2060
Dump of assembler code for function `data_start`:
0x00000000006c2060 <+0>: add %al,(%rax)
0x00000000006c2062 <+2>: add %al,(%rax)
0x00000000006c2064 <+4>: add %al,(%rax)
0x00000000006c2066 <+6>: add %al,(%rax)
End of assembler dump.
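The first address of the .data section corresponds to the data_start symbol. (GDB is decoding data bytes as instructions here, so the mnemonics are meaningless; only the addresses and symbol names matter.)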
(gdb) disassemble 0x006c2070
Dump of assembler code for function `value1`:
0x00000000006c2070 <+0>: or (%rax),%al
0x00000000006c2072 <+2>: add %al,(%rax)
End of assembler dump.
....
The disassembly above shows the address of the global variable value1, which is initialized to 10 (the 0a byte at 0x6c2070 in the hex dump). But we don't see the uninitialized global variable value2 at the following addresses.
Let's print the address of value2:
(gdb) p &value2
$1 = (int *) 0x6c5eb0
(gdb) info symbol 0x6c5eb0
value2 in section .bss
(gdb) disassemble 0x6c5eb0
Dump of assembler code for function `value2`:
0x00000000006c5eb0 <+0>: add %al,(%rax)
0x00000000006c5eb2 <+2>: add %al,(%rax)
End of assembler dump.
Tada! Looking up the address of value2 reveals that the variable is stored in the .bss section. This shows how uninitialized global variables are mapped into the process address space.
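The same check can be done from inside the program, without a debugger. This sketch assumes a GNU-style linker whose default script defines the symbols __bss_start and _end; those names are linker conventions rather than part of the C language, and in_bss is a helper invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: provided by the default GNU ld linker script. */
extern char __bss_start[];  /* start of the .bss area */
extern char _end[];         /* end of the program image */

int value1 = 10;  /* initialized global: expected in .data */
int value2;       /* uninitialized global: expected in .bss */

/* True when the address lies between __bss_start and _end. */
static bool in_bss(const void *p)
{
    uintptr_t addr = (uintptr_t)p;

    return addr >= (uintptr_t)__bss_start && addr < (uintptr_t)_end;
}
```

With such a toolchain, in_bss(&value2) is expected to be true and in_bss(&value1) false, which mirrors what info symbol reported in GDB.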
Lastly, is there any relation of this with scatter loading?
No.
I am attempting to use objcopy to convert an xml file to an object file that is then linked into and used by another shared library on RHEL5. I convert the file with this command:
objcopy --input-format binary --output-target i386-pc-linux-gnu --binary-architecture i386 baselines.xml baselines.0
The object file is created and using readelf I get the following:
Symbol table '.symtab' contains 5 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000 0 SECTION LOCAL DEFAULT 1
2: 00000000 0 NOTYPE GLOBAL DEFAULT 1 _binary_baselines_xml_sta
3: 0000132b 0 NOTYPE GLOBAL DEFAULT 1 _binary_baselines_xml_end
4: 0000132b 0 NOTYPE GLOBAL DEFAULT ABS _binary_baselines_xml_siz
So it looks like the size is in there. I dumped the file and verified the xml is embedded as ascii at offset 34 (specified by the .data value) and that it's correct. The data is 0x132b bytes in size, as specified by the variable.
Then in the code, I declare a couple variables:
extern "C"
{
extern char _binary_baselines_xml_start;
extern char _binary_baselines_xml_size;
}
static const char* xml_start = &_binary_baselines_xml_start;
const uint32_t xml_size = reinterpret_cast<uint32_t>(&_binary_baselines_xml_size);
When I step into this, the xml pointer is correct and I can see the xml text in the debugger. However, the size symbol shows the value as 0x132b (which is what I want) but it also indicates that "Address 0x132b is out of bounds". When I use the variable it is a very large incorrect random number. I've tried all sorts of other syntax to declare the extern variable such as char*, char[], int, int*, etc. The result is always the same. The value is there but I can't seem to get to it.
Another point of interest is that this code works fine on a windows machine without the prepended underscore on the extern variables but all else the same.
I can't seem to find much online about using objcopy in this manner so any help is greatly appreciated.
I am not sure what your actual issue is. The *_size symbol is an absolute symbol whose value indicates the size. You are not supposed to be able to actually dereference that location (unless by accident); it is just a way of sneaking an integer value through the linker without actually defining a data variable. The way you are using it is correct.
The best way to think about this would be if you had the following code:
char* psize = reinterpret_cast<char*>(0x1234);
int size = reinterpret_cast<int>(psize);
The only difference is the linker fills in the 0x1234 value for you via a symbol.
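If reading the absolute symbol keeps giving odd values, one alternative is to compute the size from the two address symbols instead. This is a sketch, not a confirmed fix for the problem in the question; it assumes the _binary_baselines_xml_end symbol listed in the readelf output, and blob_size is a hypothetical helper.

```c
#include <stddef.h>

/* Size of an embedded blob, computed as the distance between its start
 * and end markers. Both pointers must refer to the same object. */
static size_t blob_size(const char *start, const char *end)
{
    return (size_t)(end - start);
}

/* Intended use with the objcopy-generated symbols (needs the object
 * file at link time, so it is shown here only as a comment):
 *
 *   extern const char _binary_baselines_xml_start[];
 *   extern const char _binary_baselines_xml_end[];
 *   size_t n = blob_size(_binary_baselines_xml_start,
 *                        _binary_baselines_xml_end);
 */
```

Both symbols are ordinary section-relative addresses, so their difference does not depend on where the library ends up in memory.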