Controlling File Offset in linking - gcc

I have some assembler for the Microblaze that I want to load at address 0x00000000 (ie to ensure it is executed on a reset).
I have a linker script that should do this (I think):
SECTIONS
{
ENTRY(_start)
. = 0x0000;
.vectors.reset : { *(.vectors.reset) }
. = 0x0008;
.vectors.sw_exception : { *(.vectors.sw_exception) }
. = 0x0010;
.vectors.interrupt : { *(.vectors.interrupt) }
. = 0x0018;
.vectors.hw_exception : { *(.vectors.hw_exception) }
. = 0x100;
.text : { *(.text) }
.data : { *(.data) }
.bss : { *(.bss) }
}
But when the code is compiled it seems to be offset by 0x1000:
objdump -h startup.MICROBLAZE.elf
startup.MICROBLAZE.elf: file format elf32-big
Sections:
Idx Name Size VMA LMA File off Algn
0 .vectors.reset 00000008 00000000 00000000 00001000 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .vectors.sw_exception 00000008 00000008 00000008 00001008 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
2 .vectors.interrupt 00000008 00000010 00000010 00001010 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
3 .vectors.hw_exception 00000008 00000018 00000018 00001018 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
4 .text 00000020 00000100 00000100 00001100 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
Where does that offset come from and how can I supress/control it?
Edit: It seems the 0x1000 offset is the physical offset of the code section in the compiled file/object - is that correct?

According to your objdump listing everything is suppose to be laid out as you expect. I.e. VMA and LMA of your .text section are pointing to address 0x100.
The offset 0x1000 as you correctly guessed is offset of the .text section inside ELF file. But this section will be loaded to address 0x100.
If you try to disassemble your ELF with
$ objdump -S startup.MICROBLAZE.elf
you will see the proper instructions layout. It is also helpful to produce MAP file on a linkage stage with -Wl,-Map,output.map gcc flags.

Related

Why Does the .bss Section Output Show a Large Value Compared to the Memory Map Output?

I'm compiling someone else's code using GCC and the output from GCC shows a large .bss section:
text data bss dec
9468 1080 10892 21440
I run the following command to generate information related to each section so I can try and figure out what is taking up so much space in the .bss section:
objdump -x output.elf > output.dmp
Here is the output of the sections:
Sections:
Idx Name Size VMA LMA File off Algn
0 .mstack 00001000 20000000 20000000 00030000 2**0
ALLOC
1 .pstack 00001000 20001000 20001000 00030000 2**0
ALLOC
2 .nocache 00000000 30040000 30040000 0002042c 2**2
CONTENTS
3 .eth 00000000 30040000 30040000 0002042c 2**2
CONTENTS
4 .vectors 000002a0 08000000 08000000 00010000 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
5 .xtors 0000000c 080002a0 080002a0 000102a0 2**2
CONTENTS, ALLOC, LOAD, DATA
6 .text 00002054 080002b0 080002b0 000102b0 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
7 .init 00000004 08002304 08002304 00012304 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
8 .fini 00000004 08002308 08002308 00012308 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
9 .rodata 00000200 0800230c 0800230c 0001230c 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
10 .data 0000042c 24000000 0800250c 00020000 2**3
CONTENTS, ALLOC, LOAD, DATA
11 .bss 00000a8c 24000430 08002938 00020430 2**3
ALLOC
12 .ram0_init 00000000 24000ebc 24000ebc 0002042c 2**2
CONTENTS
13 .ram0 00000000 24000ebc 24000ebc 0002042c 2**2
CONTENTS
14 .ram1_init 00000000 30000000 30000000 0002042c 2**2
CONTENTS
15 .ram1 00000000 30000000 30000000 0002042c 2**2
CONTENTS
16 .ram2_init 00000000 30000000 30000000 0002042c 2**2
CONTENTS
17 .ram2 00000000 30000000 30000000 0002042c 2**2
CONTENTS
18 .ram3_init 00000000 30040000 30040000 0002042c 2**2
CONTENTS
19 .ram3 00000000 30040000 30040000 0002042c 2**2
CONTENTS
20 .ram4_init 00000000 38000000 38000000 0002042c 2**2
CONTENTS
21 .ram4 00000000 38000000 38000000 0002042c 2**2
CONTENTS
22 .ram5_init 00000000 20002000 20002000 0002042c 2**2
CONTENTS
23 .ram5 00000000 20002000 20002000 0002042c 2**2
CONTENTS
24 .ram6_init 00000000 00000000 00000000 0002042c 2**2
CONTENTS
25 .ram6 00000000 00000000 00000000 0002042c 2**2
CONTENTS
26 .ram7_init 00000000 38800000 38800000 0002042c 2**2
CONTENTS
27 .ram7 00000000 38800000 38800000 0002042c 2**2
CONTENTS
28 .ARM.attributes 00000030 00000000 00000000 0002042c 2**0
CONTENTS, READONLY
29 .comment 0000004c 00000000 00000000 0002045c 2**0
CONTENTS, READONLY
30 .debug_info 0001bdb6 00000000 00000000 000204a8 2**0
CONTENTS, READONLY, DEBUGGING, OCTETS
31 .debug_abbrev 000043c3 00000000 00000000 0003c25e 2**0
CONTENTS, READONLY, DEBUGGING, OCTETS
32 .debug_aranges 00000ae8 00000000 00000000 00040628 2**3
CONTENTS, READONLY, DEBUGGING, OCTETS
33 .debug_ranges 00000918 00000000 00000000 00041110 2**0
CONTENTS, READONLY, DEBUGGING, OCTETS
34 .debug_line 00007b18 00000000 00000000 00041a28 2**0
CONTENTS, READONLY, DEBUGGING, OCTETS
35 .debug_str 000031ff 00000000 00000000 00049540 2**0
CONTENTS, READONLY, DEBUGGING, OCTETS
36 .debug_frame 000027e8 00000000 00000000 0004c740 2**2
CONTENTS, READONLY, DEBUGGING, OCTETS
As you can see the .bss section only takes up 2700 bytes. Could someone please explain how I can track down where the extra bytes are coming from? Thank you.
Also not sure if this is relevant, but I'm using an ARM Cortex-M7 processor.
Nevermind. I figured out where the extra bytes were coming from.
The sections .mstack and .pstack have 4Kbytes allocated each and are being placed in RAM. When I take the 8Kbytes total (.mstack + .pstack) and add in the size of the .bss section, I get the GCC output value of 10892.

Windows kernel: why is my memory mapping not working?

I have a kmdf that allocates a single buffer using MmAllocateContiguousMemorySpecifyCache and gets its mdl:
auto ptr = MmAllocateContiguousMemorySpecifyCache(
BUFFER_SIZE,
lowestAcceptible,
highestAcceptible,
lowestAcceptible,
MmCached);
PQUEUE_CONTEXT queueContext = QueueGetContext(queue);
if (ptr)
{
RtlZeroMemory(ptr, BUFFER_SIZE);
queueContext->stage_buffer_ptr = ptr;
queueContext->stage_buffer_byte_size = BUFFER_SIZE;
queueContext->stage_buffer_bus_address = MmGetPhysicalAddress(ptr).QuadPart;
queueContext->mdl = IoAllocateMdl(
ptr,
static_cast<ULONG>(queueContext->stage_buffer_byte_size),
false,
false,
nullptr
);
Once that buffer is allocated, the driver handles a ioctl (method neither) that maps the pre-allocated buffer to the requesting process address space using MmMapLockedPagesSpecifyCache:
auto queueContext = QueueGetContext(Queue);
auto user_ptr = MmMapLockedPagesSpecifyCache(
queueContext->mdl,
UserMode,
MmCached,
nullptr,
false,
MM_PAGE_PRIORITY::HighPagePriority
);
A few lines after, I fill the memory with testing values:
auto vals = (int*)queueContext->stage_buffer_ptr; // kernel virtual address
for (auto i = 0; i < 10; ++i)
{
vals[i] = i;
}
but when the user loops on its mapped address, it gets garbage values.
I tried to debug this issue with WinDbg and when I break after the loop I see the following:
kernel virtual address:
1: kd> dc 0xffffb901`c6f5d000 (ok)
ffffb901`c6f5d000 00000000 00000001 00000002 00000003 ................
ffffb901`c6f5d010 00000004 00000005 00000006 00000007 ................
ffffb901`c6f5d020 00000008 00000009 00000000 00000000 ................
ffffb901`c6f5d030 00000000 00000000 00000000 00000000
user virtual address: (displays junk)
1: kd> dc 0x00000174`423f0000
00000174`423f0000 332c3000 ffffcc05 bc823afb 01d40a37 .0,3.....:..7...
00000174`423f0010 5e784ab3 01d5da99 5e784ab3 01d5da99 .Jx^.....Jx^....
00000174`423f0020 b1de413c 01d5db3f 00002000 00000000 <A..?.... ......
00000174`423f0030 00002000 00000000 00000010 00000000 . ..............
So why don't I see the same values via the pointer received by MmMapLockedPagesSpecifyCache?

MIPS32 router: module_init not called for kernel module

I'm developing a kernel module that I want to run on my router. The router model is DGN2200v2 by Netgear. It's running Linux 2.6.30 on MIPS. My problem is that when I load my module it seems that my module_init isn't getting called. I tried to narrow it down by modifying my module_init to return -3 (which indicates an error?) and insmod still reports success. I can see my module in the output of lsmod, but I don't see my printk output using dmesg.
For starters, I wanted to create the simplest possible module:
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
static int my_init(void)
{
printk(KERN_EMERG "init_module() called\n");
return -3;
}
static void my_cleanup(void)
{
printk(KERN_EMERG "cleanup_module() called\n");
}
module_init(my_init);
module_exit(my_cleanup);
This is the Makefile I'm using:
TOOLCHAIN=/home/user/buildroot-2016.08/output/host/usr/bin/mips-buildroot-linux-uclibc-
ARCH=mips
CC = $(TOOLCHAIN)gcc
KBUILD_CFLAGS:=.
EXTRA_CFLAGS := -I/home/user/buildroot-2016.08/output/build/linux-headers-2.6.30/include\
-I/home/user/buildroot-2016.08/output/build/linux-headers-2.6.30/arch/mips/include/asm/mach-mipssim\
-I/home/user/buildroot-2016.08/output/build/linux-headers-2.6.30/arch/mips/include/asm/mach-generic\
-fno-pic -mno-abicalls -O2
obj-m := module.o
KDIR := /home/user/buildroot-2016.08/output/build/linux-headers-2.6.30
PWD := $(shell pwd)
default:
$(MAKE) -C $(KDIR) SUBDIRS=$(PWD) modules
I'm running make like so:
make ARCH=mips CROSS_COMPILE=/home/user/buildroot-2016.08/output/host/usr/bin/mips-buildroot-linux-uclibc-
which passes successfully.
As you can see, I'm using Buildroot which I (hopefully) configured correctly. I can paste my .config if needed.
I ran objdump on my module and didn't find a problem. In particular, the module_init symbol seems to point to the same place as my my_init function, and it seems to have the code I expect it to:
module.ko: file format elf32-tradbigmips
module.ko
architecture: mips:isa32, flags 0x00000011:
HAS_RELOC, HAS_SYMS
start address 0x00000000
private flags = 50001001: [abi=O32] [mips32] [not 32bitmode] [noreorder]
MIPS ABI Flags Version: 0
ISA: MIPS32
GPR size: 32
CPR1 size: 0
CPR2 size: 0
FP ABI: Soft float
ISA Extension: None
ASEs:
None
FLAGS 1: 00000001
FLAGS 2: 00000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .MIPS.abiflags 00000018 00000000 00000000 00000038 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA, LINK_ONCE_SAME_SIZE
1 .reginfo 00000018 00000000 00000000 00000050 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA, LINK_ONCE_SAME_SIZE
2 .note.gnu.build-id 00000024 00000018 00000018 00000068 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
3 .text 00000040 00000000 00000000 00000090 2**4
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
4 .rodata.str1.4 00000038 00000000 00000000 000000d0 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
5 .modinfo 0000005c 00000000 00000000 00000108 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
6 .data 00000000 00000000 00000000 00000170 2**4
CONTENTS, ALLOC, LOAD, DATA
7 .gnu.linkonce.this_module 0000014c 00000000 00000000 00000170 2**2
CONTENTS, ALLOC, LOAD, RELOC, DATA, LINK_ONCE_DISCARD
8 .bss 00000000 00000000 00000000 000002c0 2**4
ALLOC
9 .comment 00000040 00000000 00000000 000002c0 2**0
CONTENTS, READONLY
10 .pdr 00000040 00000000 00000000 00000300 2**2
CONTENTS, RELOC, READONLY
11 .gnu.attributes 00000010 00000000 00000000 00000340 2**0
CONTENTS, READONLY
12 .mdebug.abi32 00000000 00000000 00000000 00000350 2**0
CONTENTS, READONLY
SYMBOL TABLE:
00000000 l d .MIPS.abiflags 00000000 .MIPS.abiflags
00000000 l d .reginfo 00000000 .reginfo
00000018 l d .note.gnu.build-id 00000000 .note.gnu.build-id
00000000 l d .text 00000000 .text
00000000 l d .rodata.str1.4 00000000 .rodata.str1.4
00000000 l d .modinfo 00000000 .modinfo
00000000 l d .data 00000000 .data
00000000 l d .gnu.linkonce.this_module 00000000 .gnu.linkonce.this_module
00000000 l d .bss 00000000 .bss
00000000 l d .comment 00000000 .comment
00000000 l d .pdr 00000000 .pdr
00000000 l d .gnu.attributes 00000000 .gnu.attributes
00000000 l d .mdebug.abi32 00000000 .mdebug.abi32
00000000 l df *ABS* 00000000 module.c
00000000 l F .text 0000002c my_init
0000002c l F .text 00000014 my_cleanup
00000000 l .rodata.str1.4 00000000 $LC0
0000001c l .rodata.str1.4 00000000 $LC1
00000000 l df *ABS* 00000000 module.mod.c
00000000 l O .modinfo 00000023 __mod_srcversion23
00000024 l O .modinfo 00000009 __module_depends
00000030 l O .modinfo 0000002c __mod_vermagic5
00000000 g O .gnu.linkonce.this_module 0000014c __this_module
0000002c g F .text 00000014 cleanup_module
00000000 g F .text 0000002c init_module
00000000 *UND* 00000000 printk
Disassembly of section .MIPS.abiflags:
00000000 <.MIPS.abiflags>:
0: 00002001 movf a0,zero,$fcc0
4: 01000003 0x1000003
...
10: 00000001 movf zero,zero,$fcc0
14: 00000000 nop
Disassembly of section .reginfo:
00000000 <.reginfo>:
0: a2000014 sb zero,20(s0)
...
14: 00007fef 0x7fef
Disassembly of section .note.gnu.build-id:
00000018 <.note.gnu.build-id>:
18: 00000004 sllv zero,zero,zero
1c: 00000014 0x14
20: 00000003 sra zero,zero,0x0
24: 474e5500 c1 0x14e5500
28: c8e5d654 lwc2 $5,-10668(a3)
2c: cb477d3d lwc2 $7,32061(k0)
30: dfa48d71 ldc3 $4,-29327(sp)
34: c2ea16da ll t2,5850(s7)
38: f6bcae7d sdc1 $f28,-20867(s5)
Disassembly of section .text:
00000000 <init_module>:
0: 27bdffe8 addiu sp,sp,-24
4: 3c040000 lui a0,0x0
4: R_MIPS_HI16 $LC0
8: 3c020000 lui v0,0x0
8: R_MIPS_HI16 printk
c: afbf0014 sw ra,20(sp)
10: 24420000 addiu v0,v0,0
10: R_MIPS_LO16 printk
14: 0040f809 jalr v0
18: 24840000 addiu a0,a0,0
18: R_MIPS_LO16 $LC0
1c: 8fbf0014 lw ra,20(sp)
20: 2402fffd li v0,-3
24: 03e00008 jr ra
28: 27bd0018 addiu sp,sp,24
modinfo output also matches what I expect (same modinfo output as for another .ko that's found on the router, except for the srcversion which my module has but the other module on the router doesn't):
filename: /home/user/module/module.ko
srcversion: B0BADBA395A121CF49B74DC
depends:
vermagic: 2.6.30 mod_unload MIPS32_R1 32BIT
It's entirely possible that I messed something up in my Buildroot configuration, or something doesn't quite match the CPU type of the router, but my init code is so minimal that I'm out of ideas as to what could be wrong.
It turns out that the problem was related to a different kernel configuration between my development environment and the router. Specifically, my kernel was using CONFIG_UNUSED_SYMBOLS whereas the router's was not.
The reason this caused a problem even in a trivial module is that when the kernel loads a module it doesn't only look up the module_init symbol in the module's symbol table. Rather, it reads the module struct from the module (from the .gnu.linkonce.this_module section), and then calls the init module through that struct.
The offset of the init function pointer inside the module struct depends on the kernel configuration, which explains why the kernel can't find the init function if the configuration is different.
Thanks to Sam Protsenko for investing a lot of time in helping me crack this!

How do debug symbols affect performance of a Linux executable compiled by GCC?

All other factors being equal (eg optimisation level), how does having debug symbols in an ELF or SO affect:
Load time.
Runtime memory footprint.
Runtime performance?
And what could be done to mitigate any negative effects?
EDIT
I've seen this question but I find the discussion unhelpful, as the code optimization factor has confused the issue there. Why does my code run slower with multiple threads than with a single thread when it is compiled for profiling (-pg)?
The debug symbols are located in totally different sections from the code/data sections. You can check it with objdump:
$ objdump -h a.out
a.out: file format elf64-x86-64
Sections:
Idx Name Size VMA LMA File off Algn
0 .interp 0000001c 0000000000400200 0000000000400200 00000200 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
1 .note.ABI-tag 00000020 000000000040021c 000000000040021c 0000021c 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
2 .note.gnu.build-id 00000024 000000000040023c 000000000040023c 0000023c 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
3 .hash 00000018 0000000000400260 0000000000400260 00000260 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
4 .gnu.hash 0000001c 0000000000400278 0000000000400278 00000278 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
5 .dynsym 00000048 0000000000400298 0000000000400298 00000298 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
6 .dynstr 00000038 00000000004002e0 00000000004002e0 000002e0 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
7 .gnu.version 00000006 0000000000400318 0000000000400318 00000318 2**1
CONTENTS, ALLOC, LOAD, READONLY, DATA
8 .gnu.version_r 00000020 0000000000400320 0000000000400320 00000320 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
9 .rela.dyn 00000018 0000000000400340 0000000000400340 00000340 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
10 .rela.plt 00000018 0000000000400358 0000000000400358 00000358 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
11 .init 00000018 0000000000400370 0000000000400370 00000370 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
12 .plt 00000020 0000000000400388 0000000000400388 00000388 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
13 .text 000001c8 00000000004003b0 00000000004003b0 000003b0 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
14 .fini 0000000e 0000000000400578 0000000000400578 00000578 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
15 .rodata 00000004 0000000000400588 0000000000400588 00000588 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
16 .eh_frame_hdr 00000024 000000000040058c 000000000040058c 0000058c 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
17 .eh_frame 0000007c 00000000004005b0 00000000004005b0 000005b0 2**3
CONTENTS, ALLOC, LOAD, READONLY, DATA
18 .ctors 00000010 0000000000600630 0000000000600630 00000630 2**3
CONTENTS, ALLOC, LOAD, DATA
19 .dtors 00000010 0000000000600640 0000000000600640 00000640 2**3
CONTENTS, ALLOC, LOAD, DATA
20 .jcr 00000008 0000000000600650 0000000000600650 00000650 2**3
CONTENTS, ALLOC, LOAD, DATA
21 .dynamic 000001a0 0000000000600658 0000000000600658 00000658 2**3
CONTENTS, ALLOC, LOAD, DATA
22 .got 00000008 00000000006007f8 00000000006007f8 000007f8 2**3
CONTENTS, ALLOC, LOAD, DATA
23 .got.plt 00000020 0000000000600800 0000000000600800 00000800 2**3
CONTENTS, ALLOC, LOAD, DATA
24 .data 00000010 0000000000600820 0000000000600820 00000820 2**3
CONTENTS, ALLOC, LOAD, DATA
25 .bss 00000010 0000000000600830 0000000000600830 00000830 2**3
ALLOC
26 .comment 00000039 0000000000000000 0000000000000000 00000830 2**0
CONTENTS, READONLY
27 .debug_aranges 00000030 0000000000000000 0000000000000000 00000869 2**0
CONTENTS, READONLY, DEBUGGING
28 .debug_pubnames 0000001b 0000000000000000 0000000000000000 00000899 2**0
CONTENTS, READONLY, DEBUGGING
29 .debug_info 00000055 0000000000000000 0000000000000000 000008b4 2**0
CONTENTS, READONLY, DEBUGGING
30 .debug_abbrev 00000034 0000000000000000 0000000000000000 00000909 2**0
CONTENTS, READONLY, DEBUGGING
31 .debug_line 0000003b 0000000000000000 0000000000000000 0000093d 2**0
CONTENTS, READONLY, DEBUGGING
32 .debug_str 00000026 0000000000000000 0000000000000000 00000978 2**0
CONTENTS, READONLY, DEBUGGING
33 .debug_loc 0000004c 0000000000000000 0000000000000000 0000099e 2**0
CONTENTS, READONLY, DEBUGGING
You can see the extra sections (27 through 33). These sections won't be loaded at runtime, so there won't be any performance penalty. Using gdb, you can also examine them at runtime:
$ gdb ./a.out
(gdb) break main
(gdb) run
(gdb) info files
// blah blah ....
Local exec file:
`/home/kghost/a.out', file type elf64-x86-64.
Entry point: 0x4003b0
0x0000000000400200 - 0x000000000040021c is .interp
0x000000000040021c - 0x000000000040023c is .note.ABI-tag
0x000000000040023c - 0x0000000000400260 is .note.gnu.build-id
0x0000000000400260 - 0x0000000000400278 is .hash
0x0000000000400278 - 0x0000000000400294 is .gnu.hash
0x0000000000400298 - 0x00000000004002e0 is .dynsym
0x00000000004002e0 - 0x0000000000400318 is .dynstr
0x0000000000400318 - 0x000000000040031e is .gnu.version
0x0000000000400320 - 0x0000000000400340 is .gnu.version_r
0x0000000000400340 - 0x0000000000400358 is .rela.dyn
0x0000000000400358 - 0x0000000000400370 is .rela.plt
0x0000000000400370 - 0x0000000000400388 is .init
0x0000000000400388 - 0x00000000004003a8 is .plt
0x00000000004003b0 - 0x0000000000400578 is .text
0x0000000000400578 - 0x0000000000400586 is .fini
0x0000000000400588 - 0x000000000040058c is .rodata
0x000000000040058c - 0x00000000004005b0 is .eh_frame_hdr
0x00000000004005b0 - 0x000000000040062c is .eh_frame
0x0000000000600630 - 0x0000000000600640 is .ctors
0x0000000000600640 - 0x0000000000600650 is .dtors
0x0000000000600650 - 0x0000000000600658 is .jcr
0x0000000000600658 - 0x00000000006007f8 is .dynamic
0x00000000006007f8 - 0x0000000000600800 is .got
0x0000000000600800 - 0x0000000000600820 is .got.plt
0x0000000000600820 - 0x0000000000600830 is .data
0x0000000000600830 - 0x0000000000600840 is .bss
// blah blah ....
So the only penalty is that you need extra disk space to store this information. You can also use strip to remove the debug information:
$ strip a.out
Use objdump to check it again, you'll see the difference.
EDIT:
Instead of looking at sections, actually the loader loads elf file according to its Program Header, which can be seen by objdump -p (the following example uses using a different elf binary):
$ objdump -p /bin/cat
/bin/cat: file format elf64-x86-64
Program Header:
PHDR off 0x0000000000000040 vaddr 0x0000000000000040 paddr 0x0000000000000040 align 2**3
filesz 0x00000000000001f8 memsz 0x00000000000001f8 flags r-x
INTERP off 0x0000000000000238 vaddr 0x0000000000000238 paddr 0x0000000000000238 align 2**0
filesz 0x000000000000001c memsz 0x000000000000001c flags r--
LOAD off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**21
filesz 0x00000000000078bc memsz 0x00000000000078bc flags r-x
LOAD off 0x0000000000007c28 vaddr 0x0000000000207c28 paddr 0x0000000000207c28 align 2**21
filesz 0x0000000000000678 memsz 0x0000000000000818 flags rw-
DYNAMIC off 0x0000000000007dd8 vaddr 0x0000000000207dd8 paddr 0x0000000000207dd8 align 2**3
filesz 0x00000000000001e0 memsz 0x00000000000001e0 flags rw-
NOTE off 0x0000000000000254 vaddr 0x0000000000000254 paddr 0x0000000000000254 align 2**2
filesz 0x0000000000000044 memsz 0x0000000000000044 flags r--
EH_FRAME off 0x0000000000006980 vaddr 0x0000000000006980 paddr 0x0000000000006980 align 2**2
filesz 0x0000000000000274 memsz 0x0000000000000274 flags r--
STACK off 0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
RELRO off 0x0000000000007c28 vaddr 0x0000000000207c28 paddr 0x0000000000207c28 align 2**0
filesz 0x00000000000003d8 memsz 0x00000000000003d8 flags r--
The program headers tell which segment will be loaded with what rwx flags; multiple sections with the same flags will be merged to a single segment.
BTW:
The loader doesn't care about sections when loading elf file, but it will look at several symbol-related sections to resolve symbols when needed.
You might want to look at Why does my code run slower with multiple threads than with a single thread when it is compiled for profiling (-pg)? for a quick explanations of how the debug symbols could affect optimization.
To answer your 3 questions:
Load time will be increased when the debug symbols are present over when not present
The on-disk footprint will be larger
If you compiled with zero optimization then you really lose nothing. If you set optimization, then the optimized code will be less optimized because of the debug symbols.

What is the section for uninitialized global data?

I am a little confused as to where uninitialized global variables go in the ELF file. I have this simple program to test in which sections the variables will be located:
const int a = 11;
int b = 10;
int c;
int main()
{
return 0;
}
I know that uninitialized global variable should be put into .bss section of ELF file, but objdump -h gives me the following output:
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 0000000a 00000000 00000000 00000034 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .data 00000004 00000000 00000000 00000040 2**2
CONTENTS, ALLOC, LOAD, DATA
2 .bss 00000000 00000000 00000000 00000044 2**2
ALLOC
3 .rodata 00000004 00000000 00000000 00000044 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
4 .comment 00000024 00000000 00000000 00000048 2**0
CONTENTS, READONLY
5 .note.GNU-stack 00000000 00000000 00000000 0000006c 2**0
CONTENTS, READONLY
So the variable a goes to .rodata, b goes to .data, and c goes nowhere? When i change the code to:
int c = 0;
everything is as expected - .bss section has the length 4, but what happens with the variable c when it is not initialized?
It goes into a "common section". You can see it with objdump -t or by using nm.
I'm not quite sure I understand what this is about, but the reference to the ld -warn-common flag says this:
int i;
A common symbol. If there are only
(one or more) common symbols for a
variable, it goes in the uninitialized
data area of the output file. The
linker merges multiple common symbols
for the same variable into a single
symbol. If they are of different
sizes, it picks the largest size. The
linker turns a common symbol into a
declaration, if there is a definition
of the same variable.
(Found via the nm man page.) There is more information after that in the man page itself.

Resources