There is a software package, elfutils, which includes a program called eu-elflint for checking ELF binaries (like lint for C, hence the name).
Out of curiosity I checked our own shared libraries with this tool, and it found a lot of issues, e.g.:
eu-elflint libUtils.so
section [ 2] '.dynsym': _DYNAMIC symbol size 0 does not match dynamic segment size 248
section [ 2] '.dynsym': _GLOBAL_OFFSET_TABLE_ symbol size 0 does not match .got.plt section size 3076
section [ 8] '.rel.plt': relocation 0: offset out of bounds
section [ 8] '.rel.plt': relocation 1: offset out of bounds
...
section [ 8] '.rel.plt': relocation 765: offset out of bounds
As a cross-check I built a trivial shared library from the source code below:
int foo(int a) {
    return a + 1;
}
// gcc -shared -fPIC -o libfoo.so foo.c
And tried again ...
eu-elflint libfoo.so
section [ 9] '.rel.plt': relocation 0: offset out of bounds
section [ 9] '.rel.plt': relocation 1: offset out of bounds
section [23] '.comment' has wrong flags: expected none, is MERGE|STRINGS
section [25] '.symtab': _GLOBAL_OFFSET_TABLE_ symbol size 0 does not match .got.plt section size 20
section [25] '.symtab': _DYNAMIC symbol size 0 does not match dynamic segment size 200
As you can see, even the trivial example shows a lot of issues.
BTW: I am on Ubuntu-Karmic-32bit with gcc v4.4.1
BTW: ... the same happens on Debian-Lenny-64bit with gcc v4.2.4
Is this something I should be concerned about?
Quick answer: "Is this something I should be concerned about?" No.
Longer answer: elflint checks not only ABI standards but also some ELF conventions. Both change over time: ABIs are extended while having to remain backward compatible, and ELF conventions evolve (mainly to gain new features). As a consequence, elflint's expectations have to be kept in sync with what your assembler/linker (the GNU binutils in this case) produces. You can find lots of reports to elflint about new ELF extensions introduced in GNU binutils which elflint only caught up with later. Thus, it is most probable that your version of elflint is too old for your installed binutils. As elflint is not widely used, it wouldn't surprise me if a Linux distro didn't keep the two in sync very well.
I'm attempting to convert an assembly file to C++ for use as a small and easy to insert "trampoline" loader for another library. It is injected into another program at runtime, then loads a library, runs a function inside of it, and frees it. This is simply to avoid needing multiple lengthy calls to WriteProcessMemory, and to allow certain runtime checks if needed.
Originally, I wrote the code in assembly as it gave me a high degree of control over the structure of the file. I ended up with a ~128 byte file structured as follows:
<Relocation Header> // Table of function pointers filled in by the loading code
<Code>
<Static Data>
The size/structure of the header is known at compile-time, also allowing the entry point to be calculated, so there is very little code needed to load this.
The problem is that sharing the structure of the header between my assembler (NASM) and compiler (GCC) is... difficult, hence the rewrite.
I've come up with this series of commands to compile/link the C++ code:
g++ -c -O3 -fpic Loader.cpp
g++ -O3 -shared -nostdlib Loader.o
Running objcopy -O binary -j .text a.exe then gives a binary file only about 95 bytes in size (I manually inserted some padding in the assembly version to make it clear when debugging where "sections" are).
Only one problem (at least for this question): the variable offsets haven't been relocated (obviously). Viewing the binary, I can see lines like mov rcx, QWORD PTR [rip+0x4fc9]. Clearly, this will not be valid in a 95 byte file. Is there a way (preferably using GCC or a program in Binutils) that I can get a stripped binary with correct offsets? The solution doesn't have to be a post-process like objcopy; it can happen during any part of the build process.
I'd really like to avoid any unneeded information in the file, it wouldn't necessarily be detrimental, but this is meant to be super lightweight. The file does not need to be directly runnable (the entry-point does not have to be 0).
Also, to be clear, I'm not asking for a simple addition/subtraction applied to all pointers. GCC's generated addresses are spread across memory, whereas they should sit right up against the code.
Although incomplete and needing some changes, I think I've come up with a functioning solution for now.
I compile as before, but link with a slightly different command: g++ -T lnkscrpt.txt -O3 -nostdlib Loader.o (-shared just makes the linker complain about missing a DllMain).
lnkscrpt.txt is an ld linker script (https://ftp.gnu.org/old-gnu/Manuals/ld-2.9.1/html_node/ld_5.html#SEC5) as follows:
SECTIONS
{
. = 0x00;
.bss : { *(.bss) }
.text : { *(.text) }
.data : { *(.rdata) *(.data) }
/DISCARD/ : {*(*)}
}
This preserves the order I want and discards any other default sections.
Finally I run objcopy -O binary -j .* --set-section-flags .bss=alloc,load,contents a.exe to copy over the remaining sections to a flat binary. The --set-section-flags option simply ensures that the binary contains space allocated for the .bss section.
This results in a 128 byte binary, laid out in the exact same way as my custom assembly version, using correct offsets, and not containing any unneeded data.
Edited to add: I have now cross-posted this to the GNU ARM Embedded Toolchain site, as I am fairly certain that it's a linker bug.
Also, I have noticed that it seems to happen when the first program segment fits into the first page in the ELF file (i.e. its starting offset within its page is >= the number of bytes in the ELF header). In this case the segment erroneously gets extended downwards to the beginning of the file. This would explain why the problem disappears if the in-page offset of the start address is reduced from 0x80 to 0x40.
I am implementing a stand-alone OS for ARM Cortex M0, and I have a weird problem with the linker. Here is my source file OS.c, stripped down to illustrate the problem:
int EntryPoint (void) { return 99 ; }
And here is my linker script file OS.ld, simply assigning all code to the region starting at 0x10080:
MEMORY
{
NVM (rx) : ORIGIN = 0x10080, LENGTH = 0x1000
}
SECTIONS
{
.text 0x10080 :
{
OS.o (.text)
} > NVM
}
I compile and link it:
arm-none-eabi-gcc.exe -march=armv6-m -mthumb -c OS.c
arm-none-eabi-gcc.exe -oOS.elf -Xlinker --script=OS.ld OS.o -nostartfiles -nodefaultlibs
And now when I list the program segments with readelf OS.elf -l, I get:
Elf file type is EXEC (Executable file)
Entry point 0x10080
There are 1 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x00010000 0x00010000 0x0008c 0x0008c R E 0x10000
According to this, the one and only program segment starts at offset 0x000000 in the ELF output file, which is crazy: that region contains ELF header info irrelevant to the OS. And the physical start address is 0x00010000, which doesn't exist in my hardware.
But the weird thing is that if I change both instances of 0x10080 to 0x10040 in the linker script file, it works! I get:
Elf file type is EXEC (Executable file)
Entry point 0x10040
There are 1 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x010040 0x00010040 0x00010040 0x0000c 0x0000c R E 0x10000
Now the program segment is in the right place in the file, and has length 0x0000c instead of 0x0008c. Unfortunately address 0x00010040 doesn't exist in my hardware either, so this is not a solution.
Is this a bug in the GCC ARM compiler? Running it with --version gives:
arm-none-eabi-gcc.exe (GNU Tools for Arm Embedded Processors 7-2018-q2-update) 7.3.1 20180622 (release) [ARM/embedded-7-branch revision 261907]
What you see might not be what you expect, but is nevertheless correct, IMHO.
ELF was created for System V, an OS that supports virtual memory and mmap() (a system call to map the contents of a file into memory).
You are looking at the ELF program headers (not the section headers, see below). The program headers tell the ELF loader of a virtual-memory-capable operating system where it is supposed to mmap() the (complete) ELF file into the virtual memory it has prepared as the process image. Such an OS would then just allocate one (or more) page(s) somewhere, call that (virtual) 0x10000 (for that process), map the file and jump to 0x10080 (the entry point).
For your second example, this would not work: you specified a (virtual) start address that lies before the end of the ELF file's headers (ELF header + program headers), so the loader cannot just map the file at a page boundary, which makes it more complicated (or even impossible) for the OS to do its mmap() trick.
For your bare metal OS (that most likely doesn't support virtual memory, at least not on startup), the ELF program header's information is probably completely irrelevant.
You should probably look at the section headers instead. They describe physical memory.
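The relationship between file offset and virtual address that makes the mmap() approach work can be checked by hand. The sketch below is only an illustration: it takes the Offset, VirtAddr and Align values from the two readelf listings in the question and tests the ELF rule that, for a loadable segment, the file offset and the virtual address must be congruent modulo the alignment. The function names are made up for this example.

```python
def congruent(p_offset: int, p_vaddr: int, p_align: int) -> bool:
    """ELF rule for PT_LOAD segments: p_offset == p_vaddr (mod p_align)."""
    if p_align in (0, 1):  # 0 or 1 means no alignment is required
        return True
    return p_offset % p_align == p_vaddr % p_align


def file_offset_of(addr: int, p_offset: int, p_vaddr: int) -> int:
    """File offset of a virtual address that lies inside the segment."""
    return p_offset + (addr - p_vaddr)


# Values copied from the two readelf listings in the question.
print(congruent(0x000000, 0x00010000, 0x10000))  # True (0x10080 build)
print(congruent(0x010040, 0x00010040, 0x10000))  # True (0x10040 build)

# In the first listing the entry point 0x10080 sits 0x80 bytes into the
# file, i.e. after the headers that the segment also covers.
print(hex(file_offset_of(0x10080, 0x000000, 0x00010000)))  # 0x80
```

Both listings satisfy the rule, which fits the reading that the linker output is valid for a loader that maps the file, even though a bare-metal target never uses it that way.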
I had a very similar issue with the GNU linker for an ARM Cortex-M platform (GNU ld (Atmel build: 508) 2.28.0.20170620). I had bootloader and application projects, and the application's linker was placing the ELF headers in the flash location where the bootloader code lives. I'm not an expert, but this modification tricked my linker into not putting the ELF header in the memory space before the entry point address (I will try to show it on your example):
Redefine the NVM space to include the first 0x80 bytes:
NVM (rx) : ORIGIN = 0x10000, LENGTH = 0x1000+0x80
In the SECTIONS part, add that offset:
SECTIONS
{
.text :
{
. += 0x80;
OS.o (.text)
} > NVM
}
I'm not sure whether this will work in your case, but perhaps it can serve as a hint for others.
What is the correct GNU assembler syntax for doing the following:
.section .data2
.asciz "******* Output Data ********"
total_sectors_written: .word 0x0
max_buffer_sectors: .word ((0x9fc00 - $data_buffer) / 512) # <=== need help here
.align 512
data_buffer: .asciz "<The actual data will overwrite this>"
Specifically, I'm writing a toy OS. The code above is in 16-bit real mode. I'm setting up a data buffer that will be dumped back to the boot disk. I want to calculate the number of sectors there are between where data_buffer gets placed in memory, and the upper bound of that data buffer. (Address 0x9fc00 is where the buffer would run into RAM reserved for other purposes.)
I know I could write assembly code to calculate this; but, since it is a constant known at build time, I'm curious if I can get the assembler to calculate it for me.
I'm running into three specific problems:
(1) If I use $data_buffer I get this error:
os_src/boot.S: Assembler messages:
os_src/boot.S:497: Error: missing ')'
os_src/boot.S:497: Error: can't resolve `L0' {*ABS* section} - `$data_buffer' {*UND* section}
which I find confusing, because I should use $ when I want the memory address of a label, correct?
(2) If I use data_buffer instead of $data_buffer, I get this error:
os_src/boot.S: Assembler messages:
os_src/boot.S:497: Error: missing ')'
os_src/boot.S:497: Error: value of 653855 too large for field of 2 bytes at 31
make: *** [obj/boot/dd_test.o] Error 1
which seems to suggest that the assembler is complaining about the size of the intermediate value (which does not need to fit in a 16-bit word).
(3) And, of course, what's up with the missing ')'?
When you use expressions in the GNU assembler, they have to resolve to absolute values. The GNU assembler doesn't know where the code will ultimately be placed in memory; that is what the linker is for. Because of that, data_buffer's absolute address isn't known until linking is done, so it is considered relocatable. If you take an absolute value like 0x9fc00 and subtract a relocatable value from it, you get a relocatable value. Relocatable values can't be used in constant (absolute) expressions.
All is not lost. The linker itself will know the absolute address once it arranges everything in memory. You seem to suggest you already use a linker script which means the work you have to do is minimal. You can use the linker to compute the value of max_buffer_sectors.
Your linker script will have a SECTIONS directive like:
SECTIONS
{
[your section contents here]
}
You can create a linker symbol max_buffer_sectors with something like:
SECTIONS
{
max_buffer_sectors = (0x9fc00 - (data_buffer)) / 512;
[your section contents here]
}
This allows the linker to compute the size, since it will know the absolute address of data_buffer in memory.
Your GNU assembly file will need a bit of tweaking:
.globl data_buffer
.section .data2
.asciz "******* Output Data ********"
total_sectors_written: .word 0x0
.align 512
data_buffer: .asciz "<The actual data will overwrite this>"
You'll notice I used .globl data_buffer. This exports the symbol and makes it global so that the linker can use it.
You can then use the symbol max_buffer_sectors in code like:
mov $max_buffer_sectors, %ax
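To sanity-check the value the linker will produce, the same arithmetic can be reproduced outside the toolchain. The following is a minimal sketch, not part of the build: the address 0x7E00 used for data_buffer is purely hypothetical (the real one depends on your linker script), while 0x9fc00 and the 512-byte sector size come from the question.

```python
BUFFER_LIMIT = 0x9FC00  # upper bound of the buffer, from the question
SECTOR_SIZE = 512       # bytes per sector, from the question


def max_buffer_sectors(data_buffer: int) -> int:
    """Mirror of the linker expression (0x9fc00 - data_buffer) / 512."""
    if data_buffer % SECTOR_SIZE != 0:
        raise ValueError("data_buffer is expected to be 512-byte aligned")
    if data_buffer > BUFFER_LIMIT:
        raise ValueError("data_buffer lies beyond the buffer limit")
    return (BUFFER_LIMIT - data_buffer) // SECTOR_SIZE


# Hypothetical placement of data_buffer, chosen only for illustration.
sectors = max_buffer_sectors(0x7E00)
print(sectors)            # 1215
print(sectors <= 0xFFFF)  # True: the result fits the 16-bit %ax register
```

Even with data_buffer at address 0 the result is only 1278, so the final value always fits in a 16-bit word, although the intermediate difference does not.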
I would like to load .symtab into memory with the gdb debugger.
At most two steps are required for a normal section (for some sections, e.g. .text or .data, step 1 can be skipped because the flag is set automatically by ld):
1 - Set the alloc flag on the section in the ELF (needed for a special section). For a normal section this can be done as follows:
arm-none-eabi-objcopy --set-section-flags .sectionName=alloc src.elf dst.elf
2 - Set the address of the section. AFAIK this can be done in two ways for a normal section:
A - Specifying the section memory area in the LD script e.g. for text section:
.text :
{
*(.text)
*(.text*)
} > FLASH
B - Using again objcopy
arm-none-eabi-objcopy --change-section-address .sectionName=0x0ABCD src.elf dst.elf
Since .symtab is generated automatically by the linker, I cannot treat it as a normal section, so neither of the steps above works.
Does anyone have any idea how to solve this?
I have already implemented a working workaround that generates a new ELF with all unneeded sections stripped, but then you have to load two ELFs, and I'm looking for a cleaner solution.
I am trying to compile a project on Contiki but I have this error:
/usr/lib/gcc/msp430/4.5.3/../../../../msp430/bin/ld: dora_main.sky section `.data' will not fit in region `rom'
/usr/lib/gcc/msp430/4.5.3/../../../../msp430/bin/ld: section .vectors loaded at [0000ffe0,0000ffff] overlaps section .data loaded at [0000ff0c,00010131]
/usr/lib/gcc/msp430/4.5.3/../../../../msp430/bin/ld: region `rom' overflowed by 338 bytes
collect2: ld returned 1 exit status
Someone told me that I have to reduce the ROM partition. Is that true? How could I do that?
Your project is simply too big for the MSP430's memory.
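The numbers in the linker output are consistent with each other, which can be verified with a little arithmetic. The sketch below is only an illustration: the address ranges are copied from the error messages, while the assumption that the rom region ends directly before .vectors (at 0xffdf) is mine and is not stated in the output.

```python
def ranges_overlap(a: tuple, b: tuple) -> bool:
    """True if two inclusive (first, last) address ranges intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def overflow_bytes(section_last: int, region_last: int) -> int:
    """Number of bytes by which a section runs past the end of a region."""
    return max(0, section_last - region_last)


# Load address ranges copied from the ld error messages.
DATA = (0xFF0C, 0x10131)
VECTORS = (0xFFE0, 0xFFFF)

# Assumption: the rom region ends on the byte just before .vectors.
ROM_LAST = VECTORS[0] - 1

print(ranges_overlap(DATA, VECTORS))      # True
print(overflow_bytes(DATA[1], ROM_LAST))  # 338
```

Under that assumption the result matches the "overflowed by 338 bytes" message, so at least 338 bytes have to be saved before the image fits.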
Your options are basically either to trim the binary or, if you are lucky, to update your compiler so that it can use all of the device's memory.
1. Trimming the binary
by checking that you compile with -Os
by removing debug output and other strings from the binary
by removing Contiki Apps you might not need
2. Use MSP430X
If you have an MSP430 with more than 32 kByte of flash (e.g. MSP430F5335), you can change the memory model with the following flags in your makefile:
CFLAGS += -mmemory-model=large \
-ffunction-sections -fdata-sections \
-mcode-region=far -mdata-region=far
LDFLAGS += -mmemory-model=large \
-Wl,-gc-sections \
-mcode-region=far -mdata-region=far
This will move your code and data past the 16-bit boundary so that all the memory the device supports can be used. See the MSP430X section of the Contiki Wiki for more information on how to do this.