mipsisa64-octeon-elf-gcc obj/zxmd_main.o obj/zxmd_mproc.o obj/zxmd_init.o obj/zxmd_pcie.o obj/libcvm-common.a obj/libcvm-pci-drv.a obj/libcvmhfao.a obj/libocteon-hfa.a /home/jianxi/Juson/JusonFlow/sdk/OCTEON-SDK/components/hfa/lib-octeon/pp/octeon/se/libpp.a obj/libcvmx.a obj/libzxexe.a obj/libfdt.a -mfix-cn63xxp1 -march=octeon2 -o cn63hw1.bin
gcc complain:
obj/libzxexe.a(zxmx_tim.o): In function `zxmx_init_tim':
/home/jianxi/Juson/JusonFlow/libexec/zxmx_tim.c:47: undefined reference to `cvmx_tim_setup'
But cvmx_tim_setup can be found in libcvmx.a:
[jianxi#jianxi obj]$ readelf -h libcvmx.a | grep "cvmx-tim.o" -A21
File: libcvmx.a(cvmx-tim.o)
ELF Header:
Magic: 7f 45 4c 46 01 02 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, big endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: MIPS R3000
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 13424 (bytes into file)
Flags: 0x808d4001, noreorder, octeon2, eabi64, mips64r2
Size of this header: 52 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 40 (bytes)
Number of section headers: 33
Section header string table index: 30
[jianxi#jianxi obj]$ readelf -s cvmx-tim.o
27: 00000000 92 FUNC GLOBAL DEFAULT 1 cvmx_tim_start
28: 00000000 40 OBJECT GLOBAL DEFAULT 16 cvmx_tim
29: 00000060 56 FUNC GLOBAL DEFAULT 1 cvmx_tim_stop
30: 00000098 276 FUNC GLOBAL DEFAULT 1 cvmx_tim_shutdown
31: 000001b0 752 FUNC GLOBAL DEFAULT 1 cvmx_tim_setup
32: 00000000 0 NOTYPE GLOBAL DEFAULT UND cvmx_clock_get_rate
33: 00000000 0 NOTYPE GLOBAL DEFAULT UND cvmx_bootmem_alloc
34: 00000000 0 NOTYPE GLOBAL DEFAULT UND memset
35: 00000000 0 NOTYPE GLOBAL DEFAULT UND puts
36: 00000000 0 NOTYPE GLOBAL DEFAULT UND printf
When i added cvmx-tim.o in the command , gcc will be executed successfully:
mipsisa64-octeon-elf-gcc obj/cvmx-tim.o obj/zxmd_main.o obj/zxmd_mproc.o obj/zxmd_init.o obj/zxmd_pcie.o obj/libcvm-common.a obj/libcvm-pci-drv.a obj/libcvmhfao.a obj/libocteon-hfa.a /home/jianxi/Juson/JusonFlow/sdk/OCTEON-SDK/components/hfa/lib-octeon/pp/octeon/se/libpp.a obj/libcvmx.a obj/libzxexe.a obj/libfdt.a -mfix-cn63xxp1 -march=octeon2 -o cn63hw1.bin
And if put obj/libcvmx.a in front of obj/zxmd_main.o , gcc will report more errors.
Why gcc can not find cvmx-tim.o in the libcvmx.a?
The order of *.o will cause problems?
It's the order of the libraries:
obj/libcvmx.a obj/libzxexe.a
by the time the linker searches obj/libzxexe.a it has already processed obj/libcvmx.a - it won't search it again for anything that was not already been pulled in when obj/libcvmx.a was processed the first time around.
Change the order of those libraries to:
obj/libzxexe.a obj/libcvmx.a
Besides changing the order of the libraries, you can also force the cvmx_tim_setup to be a marked as 'undefined' symbol in command line. If the symbol is known to be required, then the linker will be on the lookout for it and remember the first library defining it.
add this flag to gcc command: -Wl,--undefined=cvmx_tim_setup
Additionally you can also experiment with --start-group and --end-group in gcc. --start-group (list of binaries to link) --end-group. This will allow a full circular closure for searches. But will cost some link performance.
Ref:
http://eli.thegreenplace.net/2013/07/09/library-order-in-static-linking
Paxym
Related
I make an elf file from an intel-hex file using riscv64-unknown-elf-objcopy.
exact command:
riscv64-unknown-elf-objcopy -O elf32-littleriscv --set-start 0x10000000 --rename-section .sec1=.text image32.ihex image32.elf
the result looks like this:
$ riscv64-unknown-elf-readelf image32.elf -h
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: RISC-V
Version: 0x1
Entry point address: 0x10000000
Start of program headers: 0 (bytes into file)
Start of section headers: 136148 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 40 (bytes)
Number of section headers: 8
Section header string table index: 7
My purpose is to execute it using 'Spike'.
Spike does not accept my elf file:
spike --isa=RV32IM --priv=MU -m0x10000000:0x00200000 --real-time-clint image32.elf
spike: ../fesvr/elfloader.cc:36: std::map<std::__cxx11::basic_string<char>, long unsigned int> load_elf(const char*, memif_t*, reg_t*): Assertion `IS_ELF_EXEC(*eh64)' failed.
It does accept elf files with the following header:
$ riscv64-unknown-elf-readelf ok.elf -h
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: RISC-V
Version: 0x1
Entry point address: 0x10000000
Start of program headers: 52 (bytes into file)
Start of section headers: 54672 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 3
Size of section headers: 40 (bytes)
Number of section headers: 23
Section header string table index: 22
I guess I need to change the "Type" field from "REL" to "EXEC", but I do not see any option for that in objcopy.
This post suggests that the following sequence might do what you want:
riscv64-unknown-elf-objcopy -I ihex -O binary image32.ihex image32.bin
riscv64-unknown-elf-objcopy -I binary -O riscv64-unknown-elf image32.bin image32.elf
But I am just guessing :-(
I have an ELF file which we then convert to a binary format:
arm-none-eabi-objcopy -O binary MyElfFile.elf MyBinFile.bin
The ELF file is just under 300KB, but the binary output file is 446-times larger: 134000KB, or 130MB! How is this possible when the whole point of a binary is to remove symbols and section tables and debug info?
Looking at Reddit and SO it looks like the binary image should be smaller than the ELF, not larger.
so.s
b .
.section .data
.word 0x12345678
arm-none-eabi-as so.s -o so.o
arm-none-eabi-objdump -D so.o
so.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <.text>:
0: eafffffe b 0 <.text>
Disassembly of section .data:
00000000 <.data>:
0: 12345678 eorsne r5, r4, #120, 12 ; 0x78
arm-none-eabi-readelf -a so.o
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00000000 000034 000004 00 AX 0 0 4
[ 2] .data PROGBITS 00000000 000038 000004 00 WA 0 0 1
[ 3] .bss NOBITS 00000000 00003c 000000 00 WA 0 0 1
[ 4] .ARM.attributes ARM_ATTRIBUTES 00000000 00003c 000012 00 0 0 1
[ 5] .symtab SYMTAB 00000000 000050 000060 10 6 6 4
[ 6] .strtab STRTAB 00000000 0000b0 000004 00 0 0 1
[ 7] .shstrtab STRTAB 00000000 0000b4 00003c 00 0 0 1
so my "binary" has 8 bytes total. In two sections.
-rw-rw-r-- 1 oldtimer oldtimer 560 Oct 12 16:32 so.o
8 bytes relative to 560 for the object.
Link it.
MEMORY
{
one : ORIGIN = 0x00001000, LENGTH = 0x1000
two : ORIGIN = 0x00002000, LENGTH = 0x1000
}
SECTIONS
{
.text : { (.text) } > one
.data : { (.data) } > two
}
arm-none-eabi-ld -T so.ld so.o -o so.elf
arm-none-eabi-objdump -D so.elf
so.elf: file format elf32-littlearm
Disassembly of section .text:
00001000 <.text>:
1000: eafffffe b 1000 <.text>
Disassembly of section .data:
00002000 <.data>:
2000: 12345678 eorsne r5, r4, #120, 12 ; 0x7800000
arm-none-eabi-readelf -a so.elf
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00001000 001000 000004 00 AX 0 0 4
[ 2] .data PROGBITS 00002000 002000 000004 00 WA 0 0 1
[ 3] .ARM.attributes ARM_ATTRIBUTES 00000000 002004 000012 00 0 0 1
[ 4] .symtab SYMTAB 00000000 002018 000070 10 5 7 4
[ 5] .strtab STRTAB 00000000 002088 00000c 00 0 0 1
[ 6] .shstrtab STRTAB 00000000 002094 000037 00 0 0 1
Now...we need 4 bytes at 0x1000 and 4 bytes at 0x2000, if we want to use the -O binary objcopy that means it is going to take the entire memory space and start the file with the lowest address thing and end with the highest address thing. With this link the lowest thing is 0x1000 and highest is 0x2003, a total span of 0x1004 bytes:
arm-none-eabi-objcopy -O binary so.elf so.bin
ls -al so.bin
-rwxrwxr-x 1 oldtimer oldtimer 4100 Oct 12 16:40 so.bin
4100 = 0x1004 bytes
hexdump -C so.bin
00000000 fe ff ff ea 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001000 78 56 34 12 |xV4.|
00001004
The assumption here is the user knows that the base address is 0x1000 as there is no address info in the file format. And that this is a continuous memory image so that the four bytes also land at 0x2000. So -O binary pads the file to fill everything in.
If I change to this
MEMORY
{
one : ORIGIN = 0x00000000, LENGTH = 0x1000
two : ORIGIN = 0x10000000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > one
.data : { *(.data*) } > two
}
You can easily see where this is headed.
ls -al so.bin
-rwxrwxr-x 1 oldtimer oldtimer 268435460 Oct 12 16:43 so.bin
So my elf does not change size, but the -O binary format is 0x10000004 bytes in size, there are only 8 bytes I care about but the nature of objcopy -O binary has to pad the middle.
Since the sizes and spaces of things vary specific to your project and your linker script, no generic statements can be made relative to the size of the elf file and the size of an -O binary file.
ls -al so.elf
-rwxrwxr-x 1 oldtimer oldtimer 131556 Oct 12 16:49 so.elf
arm-none-eabi-strip so.elf
ls -al so.elf
-rwxrwxr-x 1 oldtimer oldtimer 131336 Oct 12 16:50 so.elf
arm-none-eabi-as -g so.s -o so.o
ls -al so.o
-rw-rw-r-- 1 oldtimer oldtimer 1300 Oct 12 16:51 so.o
arm-none-eabi-ld -T so.ld so.o -o so.elf
ls -al so.elf
-rwxrwxr-x 1 oldtimer oldtimer 132088 Oct 12 16:51 so.elf
arm-none-eabi-strip so.elf
ls -al so.elf
-rwxrwxr-x 1 oldtimer oldtimer 131336 Oct 12 16:52 so.elf
The elf binary file format does not have absolute rules on content, the consumer of the file can have rule as to what you have to put where, if any specific names of items have to be there, etc. It is a somewhat open file format, it is a container like a cardboard box, and you can fill it to some extent how you like. You cannot fit a cruise ship in it, but you can put books or toys and you can choose how you put the books or toys in it sometimes.
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00000000 010000 000004 00 AX 0 0 4
[ 2] .data PROGBITS 10000000 020000 000004 00 WA 0 0 1
[ 3] .ARM.attributes ARM_ATTRIBUTES 00000000 020004 000012 00 0 0 1
[ 4] .shstrtab STRTAB 00000000 020016 000027 00 0 0 1
Even after stripping there is still extra stuff there, if you study the file format you have a header, relatively small with number of program headers and number of section headers and then that many program headers and that many section headers. Depending on the consumer(s) of the file you may for example only need the main header stuff and two program headers in this case and that is it, a much smaller file (as you can see with the object version of the file).
arm-none-eabi-as so.s -o so.o
ls -al so.o
-rw-rw-r-- 1 oldtimer oldtimer 560 Oct 12 16:57 so.o
arm-none-eabi-strip so.o
ls -al so.o
-rw-rw-r-- 1 oldtimer oldtimer 364 Oct 12 16:57 so.o
readelf that
Size of this header: 52 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 40 (bytes)
Number of section headers: 6
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00000000 000034 000004 00 AX 0 0 4
[ 2] .data PROGBITS 00000000 000038 000004 00 WA 0 0 1
[ 3] .bss NOBITS 00000000 00003c 000000 00 WA 0 0 1
[ 4] .ARM.attributes ARM_ATTRIBUTES 00000000 00003c 000012 00 0 0 1
[ 5] .shstrtab STRTAB 00000000 00004e 00002c 00 0 0 1
Extra section headers we don't need which maybe can be removed in the linker script. But I assume for some consumers all you would need is the two program headers
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 2
Plus the 8 bytes and any padding for this file format.
Also note
arm-none-eabi-objcopy --only-section=.text -O binary so.elf text.bin
arm-none-eabi-objcopy --only-section=.data -O binary so.elf data.bin
ls -al text.bin
-rwxrwxr-x 1 oldtimer oldtimer 4 Oct 12 17:03 text.bin
ls -al data.bin
-rwxrwxr-x 1 oldtimer oldtimer 4 Oct 12 17:03 data.bin
hexdump -C text.bin
00000000 fe ff ff ea |....|
00000004
hexdump -C data.bin
00000000 78 56 34 12 |xV4.|
00000004
I have a simple nasm code:
test.nasm
section .text
global _start
_start:
jmp _start
ret
I build it as below:
nasm -f elf64 test.nasm -o test.o
ld test.o -o test -pie
Then I tried to run it:
./test
It gives me this:
bash: ./test: No such file or directory
So I checked the file header:
readelf -h test
It shows this:
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file) <====== NOT an executable
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x1f0
Start of program headers: 64 (bytes into file)
Start of section headers: 4616 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 6
Size of section headers: 64 (bytes)
Number of section headers: 12
Section header string table index: 11
But I can easily create a position independent executable with GCC for below program:
test2.c
#include <stdio.h>
void main(void)
{
printf("hello, pie!\n");
}
I build it with:
gcc test2.c -o test2 -pie
It runs like this:
hello, pie!
So how can I create a position independent executable with nasm and ld rather than a shared object?
ADD 1
I checked the test2 ELF header. It is also a shared object. So it may not be the problem. (Thanks to #Nate Eldredge)
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file) <===== ALSO a shared object
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x530
Start of program headers: 64 (bytes into file)
Start of section headers: 6440 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 9
Size of section headers: 64 (bytes)
Number of section headers: 29
Section header string table index: 28
So how can I use ld to get a position-independent executable from the object file generated with nasm?
I'm learning ELF file and using a C HelloWorld program built by GCC on x86_64 CentOS 7 as an example to investigate the executable.
I can understand how the name of a symbol is found for every symbol in the executable except two in dynamic symbol table (.dynsym).
Take symbol "main" in symbol table (.symtab) for example:
$ readelf -s HelloWorld
...
Symbol table '.symtab' contains 84 entries:
Num: Value Size Type Bind Vis Ndx Name
...
80: 0000000000400586 21 FUNC GLOBAL DEFAULT 13 main
...
Its original value is:
Name of this symbol
^^ ^^ ^^ ^^
00001e20 00 00 00 00 00 00 00 00 fb 03 00 00 12 00 0d 00 |................|
00001e30 86 05 40 00 00 00 00 00 15 00 00 00 00 00 00 00 |..#.............|
The value is 0x0000 03fb (small edian). It is actually an offset in section .strtab with the following content:
$ readelf -p .strtab HelloWorld
String dump of section '.strtab':
...
[ 3fb] main
...
So it is easy to understand that the name of this symbol is "main". And every symbol which is in either .symtab or .dynsym and has a name follows this rule, except two in .dynsym. The two exceptions are:
$ readelf -s HelloWorld
Symbol table '.dynsym' contains 6 entries:
Num: Value Size Type Bind Vis Ndx Name
...
2: 0000000000000000 0 FUNC GLOBAL DEFAULT UND puts#GLIBC_2.2.5 (2)
3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND __libc_start_main#GLIBC_2.2.5 (2)
...
Take symbol #2 for example, its original value is:
Name of the symbol
^^ ^^ ^^ ^^
000002e0 00 00 00 00 00 00 00 00 0b 00 00 00 12 00 00 00 |................|
000002f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
Offset of the symbol name 0x0000 000b. By looking at .dynstr:
$ readelf -p .dynstr HelloWorld
String dump of section '.dynstr':
[ 1] libc.so.6
[ b] puts
[ 10] __libc_start_main
[ 22] GLIBC_2.2.5
[ 2e] _ITM_deregisterTMCloneTable
[ 4a] __gmon_start__
[ 59] _ITM_registerTMCloneTable
I can see the the String at offset 0xb is "puts". But why a character "#" is appended by readelf? And why the String "GLIBC_2.2.5" (at offset 0x22) is appended, too?
But why a character "#" is appended by readelf? And why the String "GLIBC_2.2.5" (at offset 0x22) is appended, too?
These are GNU-versioned symbols. readelf appends that info to make it clear, otherwise you'll see e.g. this:
1232: 00000000000792d0 471 FUNC GLOBAL DEFAULT 13 fmemopen
1233: 0000000000078ed0 527 FUNC GLOBAL DEFAULT 13 fmemopen
and will be confused by how there could be two separate fmemopens at different addresses.
Current output:
1232: 00000000000792d0 471 FUNC GLOBAL DEFAULT 13 fmemopen#GLIBC_2.2.5
1233: 0000000000078ed0 527 FUNC GLOBAL DEFAULT 13 fmemopen##GLIBC_2.22
shows that these symbols have different version info attached to them, and that the GLIBC_2.22 version is the default one.
You can read more about GNU versioned symbols here.
I am learning how a C file is compiled to machine code. I know I can generate assembly from gcc with the -S flag, however it also produces a lot of code to do with main() and printf() that I am not interested in at the moment.
Is there a way to get gcc or clang to "compile" a function in isolation and output the assembly?
I.e. get the assembly for the following c in isolation:
int add( int a, int b ) {
return a + b;
}
There are two ways to do this for a specific object file:
The -ffunction-sections option to gcc instructs it to create a separate ELF section for each function in the sourcefile being compiled.
The symbol table contains section name, start address and size of a given function; that can be fed into objdump via the --start-address/--stop-address arguments.
The first example:
$ readelf -S t.o | grep ' .text.'
[ 1] .text PROGBITS 0000000000000000 00000040
[ 4] .text.foo PROGBITS 0000000000000000 00000040
[ 6] .text.bar PROGBITS 0000000000000000 00000060
[ 9] .text.foo2 PROGBITS 0000000000000000 000000c0
[11] .text.munch PROGBITS 0000000000000000 00000110
[14] .text.startup.mai PROGBITS 0000000000000000 00000180
This has been compiled with -ffunction-sections and there are four functions, foo(), bar(), foo2() and munch() in my object file. I can disassemble them separately like so:
$ objdump -w -d --section=.text.foo t.o
t.o: file format elf64-x86-64
Disassembly of section .text.foo:
0000000000000000 <foo>:
0: 48 83 ec 08 sub $0x8,%rsp
4: 8b 3d 00 00 00 00 mov 0(%rip),%edi # a <foo+0xa>
a: 31 f6 xor %esi,%esi
c: 31 c0 xor %eax,%eax
e: e8 00 00 00 00 callq 13 <foo+0x13>
13: 85 c0 test %eax,%eax
15: 75 01 jne 18 <foo+0x18>
17: 90 nop
18: 48 83 c4 08 add $0x8,%rsp
1c: c3 retq
The other option can be used like this (nm dumps symbol table entries):
$ nm -f sysv t.o | grep bar
bar |0000000000000020| T | FUNC|0000000000000026| |.text
$ objdump -w -d --start-address=0x20 --stop-address=0x46 t.o --section=.text
t.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000020 <bar>:
20: 48 83 ec 08 sub $0x8,%rsp
24: 8b 3d 00 00 00 00 mov 0(%rip),%edi # 2a <bar+0xa>
2a: 31 f6 xor %esi,%esi
2c: 31 c0 xor %eax,%eax
2e: e8 00 00 00 00 callq 33 <bar+0x13>
33: 85 c0 test %eax,%eax
35: 75 01 jne 38 <bar+0x18>
37: 90 nop
38: bf 3f 00 00 00 mov $0x3f,%edi
3d: 48 83 c4 08 add $0x8,%rsp
41: e9 00 00 00 00 jmpq 46 <bar+0x26>
In this case, the -ffunction-sections option hasn't been used, hence the start offset of the function isn't zero and it's not in its separate section (but in .text).
Beware though when disassembling object files ...
This isn't exactly what you want, because, for object files, the call targets (as well as addresses of global variables) aren't resolved - you can't see here that foo calls printf, because the resolution of that on binary level happens only at link time. The assembly source would have the call printf in there though. The information that this callq is actually to printf is in the object file, but separate from the code (it's in the so-called relocation section that lists locations in the object file to be 'patched' by the linker); the disassembler can't resolve this.
The best way to go would be to copy your function in a single temp.c C file and to compile it with the -c flag like this: gcc -c -S temp.c -o temp.s
It should produce a more tighten assembly code with no other distraction (except for the header and footer).