gcc assembler - create only the minimal instructions necessary - gcc

I have created a very minimal application in assembly. It sets some registers to 0 and does a multiplication. Nothing fancy.
However, gcc adds a lot of stuff to the machine code that I do not want.
A small list of the stuff I find in the objdump:
deregister_tm_clones
register_tm_clones
__do_global_dtors_aux
frame_dummy
__libc_fini_array
memset
and a few more
I know that I do not need them, but I have no idea how I can tell the compiler to stop including them. I tried to use optimization options, but this did not change anything.
I compile it basically like this: gcc -o ./main.elf ./main.S
Thank you very much for any help!

GCC automatically links the C / C++ runtime start-up code (crt0.o) and the standard library. You can provide your own start-up code to override the default, and pass command-line options to stop it from linking the standard library.
Options controlling startup and default libraries include:
-nostartfiles
-nostdlib
-nodefaultlibs
-nolibc
Each affects the link in a different way, but in this case -nostdlib will exclude both crt0.o and standard libraries. Of course if your code makes no reference to the standard library then nothing will be linked in any case, but explicitly excluding it will helpfully generate a link error if something does reference it.
See: https://gcc.gnu.org/onlinedocs/gcc/Link-Options.html
Be aware that if your code does not establish a valid C runtime environment, providing at a minimum static initialisation and a stack, then some C code may not run in the manner intended. You may also need to specify the entry point via --entry=entry if you do not use the same default entry point as crt0 (_start, I think).
Alternatively you can invoke gcc with the -c option and separately invoke the linker ld without specifying any library.
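As a rough illustration of the -nostdlib route described above, the following sketch writes a minimal source file and builds it without the start-up files or default libraries. The file names, the single nop instruction, and the use of -static are assumptions made for the example, not something taken from the question. The resulting program is only meant to be inspected with nm, not run.

```shell
# Hypothetical minimal source: one global label and one instruction.
cat > main.S <<'EOF'
    .text
    .globl _start
_start:
    nop
EOF

# Build only if a gcc driver is on PATH; use a cross prefix where needed.
if command -v gcc >/dev/null 2>&1; then
    # -nostdlib leaves out both the start-up files and the default libraries.
    gcc -nostdlib -static -o main.elf main.S
    # List the symbols that ended up in the executable.
    nm main.elf
fi
```

If the link succeeds, nm should list _start and a few linker-defined symbols, with none of the helper functions such as register_tm_clones.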

so.S:
nop
nop
then build.
as so.S -o so.o
ld -Ttext=0x1000 so.o -o so.elf
objdump -D so.elf
Disassembly of section .text:
0000000000001000 <__bss_start-0x200002>:
1000: 90 nop
1001: 90 nop
objcopy -O binary so.elf so.bin
hexdump -C so.bin
00000000 90 90 |..|
00000002
using gcc
gcc -nostartfiles -nostdlib -nodefaultlibs -ffreestanding so.S -Xlinker "-Ttext=0x1000" -o so.elf
this leaves extra garbage in the file, but
gcc so.S -c -o so.o
ld -Ttext=0x2000 so.o -o so.elf
ld: warning: cannot find entry symbol _start; defaulting to 0000000000002000
objdump -D so.elf
Disassembly of section .text:
0000000000002000 <__bss_start-0x200002>:
2000: 90 nop
2001: 90 nop
But if writing assembly language you might as well use the assembler not the compiler.
_start is not required unless you need an entry point defined in the file; in that case you need to do this:
.globl _start
_start:
plus possibly something in the linker to call that out as the entry point for file formats like elf, exe, etc.
works for cross compiling as well
arm-none-eabi-as so.s -o so.o
arm-none-eabi-ld -Ttext=0x3000 so.o -o so.elf
arm-none-eabi-ld: warning: cannot find entry symbol _start; defaulting to 0000000000003000
arm-none-eabi-objdump -D so.elf
so.elf: file format elf32-littlearm
Disassembly of section .text:
00003000 <__bss_end__-0x10008>:
3000: e1a00000 nop ; (mov r0, r0)
3004: e1a00000 nop ; (mov r0, r0)
pdp11-aout-as so.s -o so.elf
pdp11-aout-as so.s -o so.o
pdp11-aout-ld -Ttext=0x400 so.o -o so.elf
pdp11-aout-objdump -D so.elf
so.elf: file format a.out-pdp11
Disassembly of section .text:
00000400 <so.o>:
400: 00a0 nop
402: 00a0 nop
and so on.
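The entry point mentioned above can also be named on the linker command line with -e (or --entry). The sketch below assumes a native GNU as and ld, and uses a hypothetical label called begin in place of _start; the load address is the one used at the top of this answer.

```shell
# Hypothetical source whose entry label is deliberately not _start.
cat > so.S <<'EOF'
    .text
    .globl begin
begin:
    nop
    nop
EOF

# Assemble and link only if the binutils tools are installed.
if command -v as >/dev/null 2>&1 && command -v ld >/dev/null 2>&1; then
    as so.S -o so.o
    # -e names the entry symbol, so ld does not go looking for _start.
    ld -Ttext=0x1000 -e begin so.o -o so.elf
    # Show the entry address recorded in the ELF header.
    readelf -h so.elf | grep 'Entry point'
fi
```

Since begin is the first thing in .text, the reported entry address should match the value given to -Ttext.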

Related

Is there any reference to "main" in the default linker scripts of arm-none-eabi that could interfere when linking from the command line

When using the gnu toolchain particularly arm-none-eabi is there any reason why when using the command line linker option it resorts to what seems an incorrect address to the start of 'main'. However, when 'main' is anything else the correct starting address and stack is initialized. For example,
.thumb
.syntax unified
.globl _start
_start:
.word 0x20001000
.word reset
reset:
bl main
b .
int main ( void )
{
return(0);
}
arm-none-eabi-gcc -O2 -c -mthumb main.c -o main.o
arm-none-eabi-as start.s -o start.o
arm-none-eabi-ld -Ttext=0x08000000 start.o main.o -o main.elf
arm-none-eabi-objdump -d main.elf
main.elf: file format elf32-littlearm
Disassembly of section .text:
08000000 <main>:
8000000: 2000 movs r0, #0
8000002: 4770 bx lr
08000004 <_start>:
8000004: 20001000 .word 0x20001000
8000008: 0800000c .word 0x0800000c
0800000c <reset>:
800000c: f7ff fff8 bl 8000000 <main>
8000010: e7fe b.n 8000010 <reset+0x4>
In the disassembly above, main has been placed at 0x08000000, so the initial stack pointer (0x20001000) and the reset vector are no longer at the start of ROM, from what I notice. But..
.thumb
.syntax unified
.globl _start
_start:
.word 0x20001000
.word reset
reset:
bl notmain
b .
int notmain ( void )
{
return(0);
}
arm-none-eabi-gcc -O2 -c -mthumb main.c -o main.o
arm-none-eabi-as start.s -o start.o
arm-none-eabi-ld -Ttext=0x08000000 start.o main.o -o main.elf
arm-none-eabi-objdump -d main.elf
main.elf: file format elf32-littlearm
Disassembly of section .text:
08000000 <_start>:
8000000: 20001000 .word 0x20001000
8000004: 08000008 .word 0x08000008
08000008 <reset>:
8000008: f000 f802 bl 8000010 <notmain>
800000c: e7fe b.n 800000c <reset+0x4>
...
08000010 <notmain>:
8000010: 2000 movs r0, #0
8000012: 4770 bx lr
I tried looking through the toolchain files to find any other reference to main in the linker scripts, and got some other help along the way, but there doesn't seem to be a clear explanation of why this happens. Of course, if you write your own linker script or use a generated one you won't run into this problem, but I was just curious as I am trying to learn the tool a bit more.
..but I was just curious as I am trying to learn the tool a bit more
The arm-none-eabi toolchain is meant to be used with newlib (a guess, since you have not stated otherwise). It can process ELF files and it is a 'library', but there is no OS. If the newlib mechanics want main() to be first, the tool will set things up like this. You don't want an ELF file, but a binary. If you want a binary (ihex, srec, etc.), then use a linker script! This is what it is meant for.
Use ld --verbose to see the default linker script. You are complaining about the order of the emitted .text, but you have done nothing to define the ordering. The linker script may need main to be first so that some other library feature works. You have a CPU that initializes the stack and starts the initial code from a 'reset vector' table.
This table is still emitted in the 'bad case', but it is not placed correctly. You need a custom linker script that positions it as the first thing in the binary. Relying on the linker to place it correctly is error-prone; an upgrade of the tools can definitely change the order.
See: "Can _start be a thumb function?", where you have options like -nostartfiles -static -nostdlib. Use a custom linker script, because an ELF file is unlikely to be understood; you need to flash/burn a binary to whatever boot device (or CPU built-in) is going to read the reset vectors.
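A minimal sketch of such a linker script, under stated assumptions: the memory sizes are made up, the file names follow the question, and start.o is named explicitly so that its .text is emitted first. A .thumb_func line is added before reset on the assumption that the target is a Cortex-M part. A likely reason main moved to the front with the default script is that GCC, when optimising, places main in a section called .text.startup, which the default script lists ahead of plain .text; ld --verbose should show that ordering.

```shell
# Vector table and reset stub, following the question.
cat > start.s <<'EOF'
.thumb
.syntax unified
.globl _start
_start:
.word 0x20001000
.word reset
/* Mark reset as a Thumb function so the vector entry gets bit 0 set. */
.thumb_func
reset:
    bl main
    b .
EOF

cat > main.c <<'EOF'
int main ( void )
{
    return(0);
}
EOF

# Hypothetical memory map: adjust ORIGIN and LENGTH to the actual part.
cat > link.ld <<'EOF'
ENTRY(_start)
MEMORY
{
    rom (rx)  : ORIGIN = 0x08000000, LENGTH = 0x1000
    ram (rwx) : ORIGIN = 0x20000000, LENGTH = 0x1000
}
SECTIONS
{
    /* start.o first, so the vector table sits at the start of rom. */
    .text   : { start.o(.text*) *(.text*) } > rom
    .rodata : { *(.rodata*) } > rom
    .data   : { *(.data*) } > ram AT > rom
    .bss    : { *(.bss*) *(COMMON) } > ram
}
EOF

# Build only when the cross toolchain is installed.
if command -v arm-none-eabi-ld >/dev/null 2>&1; then
    arm-none-eabi-as start.s -o start.o
    arm-none-eabi-gcc -O2 -c -mthumb main.c -o main.o
    arm-none-eabi-ld -T link.ld start.o main.o -o main.elf
    arm-none-eabi-objdump -d main.elf
fi
```

With this script the order of .text no longer depends on the function name or on the default script shipped with a particular toolchain version.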

How to link 2 files with different starting addresses with ld

I am trying to build a small OS. I have an asm file that puts the processor in 64-bit mode with paging enabled. After this, I jump to my C code. I want the C code and the asm code to be linked into the same file, but with the C code based at 0xFFFFFF8000000000 and the asm code at 0x5000. How can I do this with ld?
This is what I have so far:
nasm -f elf64 os_init.asm -o ../bin/os_init.o
gcc -c -Os -nostdlib -nostartfiles -nodefaultlibs -fno-builtin vga/*.c utils/*.c *.c memory_management/*.c
ld -Ttext 0x5000 ../bin/os_init.o *.o -o ../bin/kernel.out
objcopy -S -O binary ../bin/kernel.out ../bin/kernel.bin
Currently both files are linked at 0x5000
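One common way to get two link addresses into a single output file is a linker script that gives the C sections a high virtual address (VMA) while keeping their load address (LMA) right after the bootstrap code. The following is only a sketch: it assumes os_init.asm puts its code and data in a section named .boot, which is not in the question, and the script name kernel.ld is likewise hypothetical. The C objects may also need a code model that can address that range (for example -mcmodel=large); that part is not shown.

```shell
cat > kernel.ld <<'EOF'
/* Sketch only: the .boot section name is an assumption. */
KERNEL_VMA = 0xFFFFFF8000000000;

SECTIONS
{
    /* Bootstrap code: linked and loaded at 0x5000. */
    . = 0x5000;
    .boot : { *(.boot) }

    /* From here on, link addresses are offset by KERNEL_VMA, while
       AT() keeps the load addresses contiguous after .boot. */
    . += KERNEL_VMA;
    .text   : AT(ADDR(.text)   - KERNEL_VMA) { *(.text .text.*) }
    .rodata : AT(ADDR(.rodata) - KERNEL_VMA) { *(.rodata .rodata.*) }
    .data   : AT(ADDR(.data)   - KERNEL_VMA) { *(.data .data.*) }
    .bss    : AT(ADDR(.bss)    - KERNEL_VMA) { *(COMMON) *(.bss .bss.*) }

    /* Keep unwanted sections out of the flat binary. */
    /DISCARD/ : { *(.eh_frame) *(.comment) }
}
EOF

# Link with the script instead of -Ttext, if the objects have been built.
if [ -f ../bin/os_init.o ]; then
    ld -T kernel.ld ../bin/os_init.o *.o -o ../bin/kernel.out
    # objdump -h prints both the VMA and the LMA of every section.
    objdump -h ../bin/kernel.out
fi
```

Note that the C sections then start at KERNEL_VMA plus their load offset rather than at exactly 0xFFFFFF8000000000, which matches a page-table mapping of physical address 0 to KERNEL_VMA.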

MinGW Win32 + nasm: "undefined reference"

I am currently developing an OS for learning purposes, and it's been working fine until now. Then I tried to call an assembler function, compiled with nasm, -fwin32, from C code, but all I got was an "undefined reference" error. I have created a small example in pure assembler, which has the same problem, but is easily understandable and way smaller:
It includes two files:
test.asm:
[bits 32]
global _testfunc
_testfunc:
ret
test2.asm:
[bits 32]
extern _testfunc
global _testfunc2
_testfunc2:
call _testfunc
ret
Here is my compiler / linker script (using windows batch files):
nasm.exe -f win32 test.asm -o test.o
nasm.exe -f win32 test2.asm -o test2.o
ld test.o test2.o -o output.tmp
This results in the error:
test2.o:test2.asm:(.text+0x1): undefined reference to `testfunc'
To extend the question, the same happens when the function is called from C:
test.c:
extern void testfunc(void);
void start()
{
testfunc();
}
With this linker script:
gcc -ffreestanding -c test.c -o testc.o
nasm.exe -f win32 test.asm -o test.o
ld test.o testc.o -o output.tmp
In test.o, test2.o and testc.o, it always says _testfunc, so the error has nothing to do with leading underscores!
In my MinGW setup you need a section directive before the code.
; foo.asm
[bits 32]
global _testfunc
section .text
_testfunc:
ret
Then assemble to win32 format:
nasm -fwin32 foo.asm -o foo.o
Now you can check that testfunc is there:
$ nm foo.o
00000000 a .absolut
00000000 t .text
00000001 a #feat.00
00000000 T _testfunc
The T means text section global, so we're good to go.
Note I'd avoid naming anything test since this is a shell command. This can cause endless grief.
The C function is as you showed it, but name the file something else:
// main.c
extern void testfunc(void);
int main(void)
{
testfunc();
return 0;
}
Then to build an executable let gcc do the heavy lifting because ld sometimes needs arcane arguments.
gcc -ffreestanding main.c foo.o -o main
You're missing something important: your code is not in a code section!
Your asm files should look like the following:
test.asm
global _testfunc
section .text ; <<<< This is important!!!
; all code goes below this!
_testfunc:
ret
test2.asm
extern _testfunc
global _testfunc2
section .text ; <<<< Again, this is important!!!
_testfunc2:
call _testfunc
ret

linux linking assembly with gcc gives many errors

I am trying to compile and link a simple "hello, world!" program with GCC. This program uses the "printf" C function. The problem that I am having is that the terminal throws back multiple errors. I am running Archlinux, compiling with NASM, linking with GCC. Here is my code:
; ----------------------------------------------------------------------------
; helloworld.asm
;
; Compile: nasm -f elf32 helloworld.asm
; Link: gcc helloworld.o
; ----------------------------------------------------------------------------
SECTION .data
message db "Hello, World",0
SECTION .text
global main
extern printf
section .text
_main:
push message
call printf
add esp, 4
ret
The errors that I receive are as follows:
/usr/bin/ld: skipping incompatible /usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.2/libgcc.a when searching for -lgcc
/usr/bin/ld: cannot find -lgcc
collect2: error: ld returned 1 exit status
Can someone tell me what is causing these errors and what I need to do to fix them?
Thanks in advance,
RileyH
For such things, you should first understand what exactly gcc is doing. So use
gcc -v helloworld.o -o helloworld
What is happening is that you have a 64-bit Linux and are linking a 32-bit object on it. So try with
gcc -m32 -v helloworld.o -o helloworld
But I think that you should avoid coding assembly today (optimizing compilers do a better job than you can reasonably do). If you absolutely need a few assembly instructions, put some asm in your C code.
BTW, you could compile with gcc -fverbose-asm -O -Wall -S helloworld.c and look inside the generated helloworld.s; you could also pass .s files to gcc.
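To see the mismatch directly, the ELF class of an object can be read from its header. The sketch below assumes an x86-64 Linux host with GNU binutils; tiny.s and the output names are placeholders, and the source is assembled twice, once as 32-bit and once as 64-bit, purely for comparison.

```shell
# Placeholder source: a single function that returns immediately.
cat > tiny.s <<'EOF'
    .text
    .globl main
main:
    ret
EOF

# --32 and --64 are x86-specific GNU as options, so only run on x86-64 Linux.
if [ "$(uname -s)" = "Linux" ] && [ "$(uname -m)" = "x86_64" ] \
        && command -v as >/dev/null 2>&1; then
    as --32 tiny.s -o tiny32.o
    as --64 tiny.s -o tiny64.o
    # "Class" is ELF32 for the first object and ELF64 for the second.
    readelf -h tiny32.o | grep 'Class'
    readelf -h tiny64.o | grep 'Class'
fi
```

An object that reports ELF32 needs gcc -m32 at link time, together with the 32-bit libraries that option relies on.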

link nasm program for mac os x

I have some problems linking a nasm program for macOS:
GLOBAL _start
SEGMENT .text
_start:
mov ax, 5
mov bx, ax
mov [a], ebx
SEGMENT .data
a DW 0
t2 DW 0
fry$ nasm -f elf test.asm
fry$ ld -o test test.o -arch i386
ld: warning: in test.o, file was built for unsupported file format which is not the architecture being linked (i386)
ld: could not find entry point "start" (perhaps missing crt1.
fry$ nasm -f macho test.asm
fry$ ld -o test test.o -arch i386
ld: could not find entry point "start" (perhaps missing crt1.o)
can anyone help me?
The Mac OS X linker can't link ELF objects. It works only with the Mach-O executable format. Unless you want to figure out how to translate the object files, you'll probably be better off writing code that works with the Mac OS X assembler.
Edit: As @Fry mentions in the comment below, you can make nasm put out Mach-O objects. In that case, the problem is simple - take the _ off of _start in both places in your source file. The result links fine.
nasm -f macho test.asm
ld -e _start -o test test.o
For people who need to stick with the elf format and develop on a mac, you need a cross compiler...
http://crossgcc.rts-software.org/doku.php?id=compiling_for_linux
Then you can proceed with something similar to this...
/usr/local/gcc-4.8.1-for-linux32/bin/i586-pc-linux-ld -m elf_i386 -T link.ld -o kernel kasm.o kc.o
