How to solve this linker error I get when trying to link C++ with Assembly? - macos

I'm trying to link a C++ file and an Assembly file. The Assembly file defines a function called add that the C++ code calls. I'm on 64-bit Mac OS. I get the following error when I try to build the program:
Undefined symbols for architecture x86_64:
"_add", referenced from:
_main in main-d71cab.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [main.o] Error 1
Makefile
hello: main.o hello.o
g++ main.o hello.o -o hello
hello.o: hello.asm
nasm -f macho64 hello.asm -o hello.o
main.o: main.cpp
g++ main.cpp -o main.o
hello.asm
global add
segment .text
add:
mov rax, rdi
add rax, rsi
ret
main.cpp
#include <iostream>
extern "C" int add(int a, int b);
int main(int argc, char* argv[]) {
int a, b;
std::cout << "Enter two numbers\n";
std::cin >> a;
std::cin >> b;
std::cout << "Sum of a and b is " << add(a, b) << std::endl;
return 0;
}
I'd really appreciate any help. Thanks!

On macOS (OS X), C symbol names get a leading underscore in object files (unless that is overridden), so a function must be exported with that underscore to be visible to C code. By putting extern "C" on this prototype you are saying that add uses C linkage:
extern "C" int add(int a, int b);
That is correct, but for your assembly code to conform, every assembly function that is global and visible to C/C++ code also needs a leading underscore. Your assembly code would start like this:
global _add
segment .text
_add:
The other problem you have is in your Makefile and the way you generate main.o from main.cpp.
main.o: main.cpp
g++ main.cpp -o main.o
This tells g++ to compile and link main.cpp into an executable called main.o. What you really want is to tell g++ to skip the link step and write an object file to main.o. To do that, change the rule to:
main.o: main.cpp
g++ -c main.cpp -o main.o
The -c option says to skip the linking stage (which would generate an executable) and simply output an object file that will be linked later.
Without this change, when make tries to build main.o, g++ thinks you want an executable and can't find the function _add because it isn't in any file it knows about. It is in hello.o, but the command g++ main.cpp -o main.o doesn't know that.
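One way to check the Makefile fix separately from the assembly is to swap in a C++ stand-in for hello.asm. The sketch below is an assumption for illustration only (the file name add_stub.cpp is hypothetical and not part of the question); it defines add with C linkage so the compiler emits the same symbol name the linker is looking for.

```cpp
// add_stub.cpp: hypothetical C++ stand-in for hello.asm (not from the question).
// extern "C" disables C++ name mangling, so the compiler emits the plain C
// symbol for this function: "_add" on macOS (Mach-O), "add" on Linux (ELF).
extern "C" int add(int a, int b) {
    return a + b;
}
```

Compiling it with g++ -c add_stub.cpp -o hello.o and running nm hello.o on macOS should list the symbol as _add, which is the name the assembly version has to export as well.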

Related

How to manually produce an ELF executable using ld?

I'm trying to get my head around how the linking process works when producing an executable. To do that I'm reading Ian Taylor's blog series about it, but a lot of it is beyond me at the moment - so I'd like to see how it works in practice.
At the moment I produce some object files and link them via gcc with:
gcc -m32 -o test.o -c test.c
gcc -m32 -o main.o -c main.c
gcc -m32 -o test main.o test.o
How do I replicate the gcc -m32 -o test main.o test.o stage using ld?
I've tried a very naive: ld -A i386 ./test.o ./main.o
But that returns me these errors:
ld: i386 architecture of input file `./test.o' is incompatible with i386:x86-64 output
ld: i386 architecture of input file `./main.o' is incompatible with i386:x86-64 output
ld: warning: cannot find entry symbol _start; defaulting to 00000000004000b0
./test.o: In function `print_hello':
test.c:(.text+0xd): undefined reference to `_GLOBAL_OFFSET_TABLE_'
test.c:(.text+0x1e): undefined reference to `puts'
./main.o: In function `main':
main.c:(.text+0x15): undefined reference to `_GLOBAL_OFFSET_TABLE_'
I'm most confused by _start and _GLOBAL_OFFSET_TABLE_ being missing - what additional info does gcc give to ld to add them?
Here are the files:
main.c
#include "test.h"
void main()
{
print_hello();
}
test.h
void print_hello();
test.c
#include <stdio.h>
void print_hello()
{
puts("Hello, world");
}
@sam: I am not the best person to answer your question because I am a beginner in compilation. I know how to compile programs but I do not really understand all the details (https://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools)
So I decided this year to try to understand how compilation works, and a few days ago I tried to do more or less the same things as you. As nobody has answered, I am going to describe what I did, but I hope an expert will supplement my answer.
Short answer: It is recommended not to use ld directly but to use gcc instead. Nevertheless, as you write, it is interesting to know how the linking process works. This command works on my computer:
ld -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o test test.o main.o /usr/lib/crt1.o /usr/lib/libc.so /usr/lib/crti.o /usr/lib/crtn.o
Very long answer:
How did I find the command above?
As n.m suggested, run gcc with -v option.
gcc -v -m32 -o test main.o test.o
... /usr/libexec/gcc/x86_64-redhat-linux/4.8.5/collect2 ... (many options and parameters) ...
If you run ld with these options and parameters (copy and paste), it should work.
Try your command with -m elf_i386 (cf. collect2 parameters)
ld -m elf_i386 test.o main.o
ld: warning: cannot find entry symbol _start; ....
Look for symbol _start in object files used in the full ld command.
readelf -s /usr/lib/crt1.o (or objdump -t)
Symbol table '.symtab' contains 18 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
   ...
    11: 00000000     0 FUNC    GLOBAL DEFAULT    2 _start
Add this object to your ld command:
ld -m elf_i386 test.o main.o /usr/lib/crt1.o
... undefined reference to `__libc_csu_fini'...
Look for this new reference in object files. It is not obvious which library/object files are used, because of the -L and -l options and because some .so files include other libraries (for example, see cat /usr/lib/libc.so). But ld with the --trace option helps. Try this command:
ld --trace ... (collect2 parameters)
At the end, you should find:
ld -m elf_i386 -o test test.o main.o /usr/lib/crt1.o /usr/lib/libc_nonshared.a /lib/libc.so.6 /usr/lib/crti.o
or shorter (cf. cat /usr/lib/libc.so):
ld -m elf_i386 -o test test.o main.o /usr/lib/crt1.o /usr/lib/libc.so /usr/lib/crti.o
It links, but the result does not run (try ./test). It needs the right -dynamic-linker option because it is a dynamically linked ELF executable (cf. the collect2 parameters to find it):
ld -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o test test.o main.o /usr/lib/crt1.o /usr/lib/libc.so /usr/lib/crti.o
It still does not run (Segmentation fault (core dumped)) because you need the epilogue of the _init and _fini functions (https://gcc.gnu.org/onlinedocs/gccint/Initialization.html). Add the crtn.o object:
ld -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o test test.o main.o /usr/lib/crt1.o /usr/lib/libc.so /usr/lib/crti.o /usr/lib/crtn.o
./test
Hello, world

GCC treats an object file and a static library differently regarding undefined symbols

Recently I discovered that the Linux linker does not fail due to undefined symbols in static libraries, but does fail due to the same undefined symbols if I link directly with the object files. Here is a simple example:
Source code:
$ cat main.c
int main() { return 0; }
$ cat src.c
int outerUnusedFunc() {
return innerUndefinedFunc();
}
int innerUndefinedFunc();
Creating *.o and *.a from it, comparing using "nm":
$ gcc -c -o main.o main.c
$ gcc -c -o src.o src.c
$ ar r src.a src.o
ar: creating src.a
$ nm src.o
U innerUndefinedFunc
0000000000000000 T outerUnusedFunc
$ nm src.a
src.o:
U innerUndefinedFunc
0000000000000000 T outerUnusedFunc
(Here we clearly see that both *.o and *.a contain the same symbol list)
And now...
$ ld -o exe main.o src.o
src.o: In function `outerUnusedFunc':
src.c:(.text+0xa): undefined reference to `innerUndefinedFunc'
$ echo $?
1
$ ld -o exe main.o src.a
$ echo $?
0
What is the reason for GCC to treat them differently?
If you read the static-libraries tag wiki
it will explain why no object files from src.a are linked into your program, and therefore
why it doesn't matter what undefined symbols are referenced in them.
The difference between an object file foo.o and a static library libfoo.a, as linker inputs, is
that an object file is always linked into your program, unconditionally, whereas the same object file
in a static library, libfoo.a(foo.o), is extracted from libfoo.a and linked into
the program only if the linker needs it to carry on the linkage, as explained by the tag wiki.
Naturally, the linker will give errors only for undefined references in object files that are linked into the program.
The behaviour you are observing is behaviour of the linker, whether or not you invoke it via a GCC
front-end.
Giving the linker foo.o tells it: I want this in the program. Giving the linker
libfoo.a tells it: Here are some object files that you might or might not need.
In the second case, with the static library, the command line says "build exe from main.o and add whatever is required from src.a".
ld simply ignores the library because main.o requires no external symbols (outerUnusedFunc is not referenced from main.o).
But in the first case the command line says "build exe from main.o and src.o".
ld must place the content of src.o into the output file.
Hence it is obliged to analyze the src.o module, add outerUnusedFunc to the output file, and resolve all symbols for outerUnusedFunc even though it is unused.
You can enable garbage collection of code sections:
gcc -ffunction-sections -Wl,--gc-sections -o exe main.c src.c
In this case outerUnusedFunc (as well as every other function) is placed in its own section. ld will see that this section is unused (none of its symbols are referenced) and remove the whole section from the output file, so innerUndefinedFunc is no longer referenced and does not need to be resolved: the same result as in the library case.
On the other hand, you can manually mark outerUnusedFunc as "undefined" so that ld has to find it in the library and add it to the output file:
ld -o exe main.o -u outerUnusedFunc src.a
In this case the same error (undefined reference to innerUndefinedFunc) is produced.
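The rule described in these answers (object files are always linked, archive members only when they define a symbol that is still undefined) can be sketched as a toy model. This is an illustration under stated assumptions, not how ld is implemented: the ObjectFile struct and the link_unresolved function are hypothetical names, and the model ignores sections, symbol types and command-line ordering.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy model of an object file: the symbols it defines and the ones it references.
struct ObjectFile {
    std::string name;
    std::set<std::string> defined;
    std::set<std::string> undefined;
};

// Objects are linked unconditionally. Archive members are pulled in only if
// they define a symbol that is wanted but not yet defined. forced_undefined
// models ld's -u option. Returns the symbols left unresolved at the end.
std::set<std::string> link_unresolved(const std::vector<ObjectFile>& objects,
                                      const std::vector<ObjectFile>& archive,
                                      const std::set<std::string>& forced_undefined = {}) {
    std::set<std::string> defined;
    std::set<std::string> wanted = forced_undefined;

    auto take = [&](const ObjectFile& o) {
        defined.insert(o.defined.begin(), o.defined.end());
        wanted.insert(o.undefined.begin(), o.undefined.end());
    };

    for (const auto& o : objects) take(o);  // always linked

    // Rescan the archive until no further member is needed.
    std::vector<bool> taken(archive.size(), false);
    bool progress = true;
    while (progress) {
        progress = false;
        for (std::size_t i = 0; i < archive.size(); ++i) {
            if (taken[i]) continue;
            bool needed = false;
            for (const auto& sym : archive[i].defined) {
                if (wanted.count(sym) && !defined.count(sym)) { needed = true; break; }
            }
            if (needed) { take(archive[i]); taken[i] = true; progress = true; }
        }
    }

    std::set<std::string> unresolved;
    for (const auto& sym : wanted)
        if (!defined.count(sym)) unresolved.insert(sym);
    return unresolved;
}
```

Modelling main.o as defining main and src.o as defining outerUnusedFunc while referencing innerUndefinedFunc, the model leaves innerUndefinedFunc unresolved when src.o is passed as an object or when outerUnusedFunc is forced with the -u equivalent, and leaves nothing unresolved when src.o is only an archive member.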

linking assembly object file with C object file on OS X and can't find symbol

I have a library defined in libadd.asm, it exposes one "function" _add. I have a .c source file that refers to add and I'm trying to get the two object files to link, but am encountering this error regardless of the order in which I link the object files:
Undefined symbols for architecture x86_64:
"_add", referenced from:
_main in prog.o
Here's the code:
// prog.c
#include <stdio.h>
int add();
int main() {
printf("%d\n", add(4, 5));
return 0;
}
And here's the assembly file. It almost certainly doesn't respect the appropriate calling convention. I don't really understand what I should be doing to shuffle the values between registers. (That's what I was trying to figure out originally.)
; libadd.asm
_add:
add eax, edx
ret
Here's what I'm using to build the tiny project. I'm intentionally shadowing the implicit .c.o rule with one that does as little as possible and ignores *FLAGS. I'm using cc to drive the linker because that's the simplest way I know to link in the C runtime/standard library/whatever it's called. I've also tried linking with prog.o and libadd.o in the other order.
all: prog
prog: prog.o libadd.o
$(CC) -o prog $^
%.o: %.asm
nasm -f macho64 -o $@ $<
%.o: %.c
$(CC) -c -o $@ $<
clean:
$(RM) $(wildcard *.o)
running make produces the following output
cc -c -o prog.o prog.c
nasm -f macho64 -o libadd.o libadd.asm
cc -o prog prog.o libadd.o
Undefined symbols for architecture x86_64:
"_add", referenced from:
_main in prog.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [prog] Error 1
Exit 2
libadd.o gets assembled successfully and appears to have the right symbol in it.
% nm libadd.o
0000000000000000 t _add
Why is ld complaining that it can't find the symbol?

MinGW Win32 + nasm: "undefined reference"

I am currently developing an OS for learning purposes, and it's been working fine until now. Then I tried to call an assembler function, compiled with nasm, -fwin32, from C code, but all I got was an "undefined reference" error. I have created a small example in pure assembler, which has the same problem, but is easily understandable and way smaller:
It includes two files:
test.asm:
[bits 32]
global _testfunc
_testfunc:
ret
test2.asm:
[bits 32]
extern _testfunc
global _testfunc2
_testfunc2:
call _testfunc
ret
Here is my compiler / linker script (using windows batch files):
nasm.exe -f win32 test.asm -o test.o
nasm.exe -f win32 test2.asm -o test2.o
ld test.o test2.o -o output.tmp
This results in the error:
test2.o:test2.asm:(.text+0x1): undefined reference to `testfunc'
To extend the question, the same happens when the function is called from C:
test.c:
extern void testfunc(void);
void start()
{
testfunc();
}
With this linker script:
gcc -ffreestanding -c test.c -o testc.o
nasm.exe -f win32 test.asm -o test.o
ld test.o testc.o -o output.tmp
In test.o, test2.o and testc.o, it always says _testfunc, so the error has nothing to do with leading underscores!
In my MinGW setup you need a section directive before the code.
; foo.asm
[bits 32]
global _testfunc
section .text
_testfunc:
ret
Then assemble to win32 format:
nasm -fwin32 foo.asm -o foo.o
Now you can check that testfunc is there:
$ nm foo.o
00000000 a .absolut
00000000 t .text
00000001 a @feat.00
00000000 T _testfunc
The T means text section global, so we're good to go.
Note I'd avoid naming anything test since this is a shell command. This can cause endless grief.
The C function is as you showed it, but name the file something else:
// main.c
extern void testfunc(void);
int main(void)
{
testfunc();
return 0;
}
Then to build an executable let gcc do the heavy lifting because ld sometimes needs arcane arguments.
gcc -ffreestanding main.c foo.o -o main
You're missing something important: your code is not in a code section!
Your asm files should look like the following:
test.asm
global _testfunc
section .text ; <<<< This is important!!!
; all code goes below this!
_testfunc:
ret
test2.asm
extern _testfunc
global _testfunc2
section .text ; <<<< Again, this is important!!!
_testfunc2:
call _testfunc
ret

How to make CMake append linker flags instead of prepending them?

CMake seems to prepend linker flags at the front of a GCC link command instead of appending them at the end. How to make CMake append linker flags?
Here is a simple example to reproduce the problem.
Consider this C++ code that uses clock_gettime:
// main.cpp
#include <iostream>
#include <time.h>
int main()
{
timespec t;
clock_gettime(CLOCK_REALTIME, &t);
std::cout << t.tv_sec << std::endl;
return 0;
}
This is a CMakeLists.txt to compile the C++ file above:
cmake_minimum_required(VERSION 2.8)
set(CMAKE_EXE_LINKER_FLAGS "-lrt")
add_executable(helloapp main.cpp)
Note that we have added -lrt since it has the definition of clock_gettime.
Compiling this using:
$ ls
CMakeLists.txt main.cpp
$ mkdir build
$ cd build
$ cmake ..
$ make VERBOSE=1
This produces the following error, even though you can see -lrt in the command:
/usr/bin/c++ -lrt CMakeFiles/helloapp.dir/main.cpp.o -o helloapp -rdynamic
CMakeFiles/helloapp.dir/main.cpp.o: In function `main':
main.cpp:(.text+0x15): undefined reference to `clock_gettime'
collect2: ld returned 1 exit status
make[2]: *** [helloapp] Error 1
The problem here is that the link command generated by CMake has -lrt prepended at the front. The link would work fine if the command had been:
/usr/bin/c++ CMakeFiles/helloapp.dir/main.cpp.o -o helloapp -rdynamic -lrt
How to make CMake append the linker flags at the end?
In general you can't (I think), but in the specific case that you want to link against a particular library, you should be using the syntax
target_link_libraries(helloapp rt)
instead. CMake knows that this corresponds to passing -lrt on the linker command line.
