Link issue under AIX with G++ 6.3 - c++11

After some struggle, I must admit I need some external help on this linker issue. AIX will kill me otherwise.
I am striving to migrate our build from GCC 4.8.5 to GCC 6.3.0 under AIX 7.1:
> uname -a
AIX wreckmach 1 7 00F63B074C00
> oslevel
7.1.0.0
Now, the whole compilation phase goes well, but at the link stage things start getting messy. My compilation line looks like the following (I removed include directories and some internal defines):
g++ -std=c++11 -Wno-deprecated -g -pthread -shared -maix64 -gdwarf -g3 -O3 -fPIC -fPIC -I./some/includes -D_REENTRANT=1 -DAIX53=1 -DAIX=1 -DNDEBUG=1 -DSVR4=1 -D__VACPP_MULTI__=1 -DRS6000=1 myCppFile.cpp -c -o /home/me/migration/build.AIX71_GCC_60300_64/myCppFile.cpp.1.o
And my link line is
g++ -g -ggdb3 -pthread -maix64 -Wl,-brtl -Wl,-bhalt:5 -Wl,-bnodelcsect -Wl,-brtl,-bexpfull -shared /home/me/migration/build.AIX71_GCC_60300_64/myCppFile.cpp.1.o /home/me/migration/build.AIX71_GCC_60300_64/myOtherCppFile.cpp.1.o -o /home/me/migration/build.AIX71_GCC_60300_64/libtest.so -L/BUILD/SOFT/compilers/AIX71/GCC/60300/lib64 -lgcc_s -lstdc++ -lpthread
This gives me the following output:
ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information.
collect2: error: ld returned 8 exit status
After adding the -bnoquiet option, I got:
(ld): save SRE .
SAVE: Section sizes = 69976+5192+280 (0x11158+0x1448+0x118 hex)
SAVE: Size of TOC: 1352 (0x548 hex)
ld: 0711-310 ERROR: Relocation entries from the .text section have been
written to the .loader section. The following csects are in error:
CSECT or (Symbol in CSECT) Inpndx Address TY CL Source-File(Object-File)
Symbols referenced with .loader section RLDs: TY CL Inpndx Name
--------------------------------------------- -- -- ------------------------
<_myCppFile.ro_> [260] 00004080 SD RO ../src/myCppFile.cpp(./build.AIX71_GCC_60300_64/myCppFile.cpp.1.o)
ER PR [140] __gcc_unwind_dbase
SD RW [264] <_myCppFile.rw_>
<_myOtherCppFile.ro_> [107] 00002240 SD RO ../src/myOtherCppFile.cpp(./build.AIX71_GCC_60300_64/myOtherCppFile.cpp.1.o)
SD RW [111] <_myOtherCppFile.rw_>
ER PR [56] __gcc_unwind_dbase
SAVE: The return code is 4.
(ld): rc
RC: Highest return code was 4.
A quick look at libgcc_s/libstdc++ shows that this symbol is defined and exported:
> nm -X64 /BUILD/SOFT/compilers/AIX71/GCC/60300/lib64/libgcc_s.a | grep __gcc_unwind
__gcc_unwind_dbase D 536871816
__gcc_unwind_dbase d 536871816 4
__gcc_unwind_dbase d 536876288 8
> nm -X64 /BUILD/SOFT/compilers/AIX71/GCC/60300/lib64/libstdc++.a | grep __gcc_unwind
__gcc_unwind_dbase D 536871608
__gcc_unwind_dbase d 536871608 4
__gcc_unwind_dbase d 537025144 8
What am I missing or doing wrong?
To be precise, the result is the same regardless of which ld I use (the system one, which I prefer, or the GNU binutils one).
EDIT: When I create a dummy library and link an executable against it, everything goes fine. It all uses the same options, except for the external libraries used in the project quoted above. I am now completely stuck... T_T

Related

How to manually produce an ELF executable using ld?

I'm trying to get my head around how the linking process works when producing an executable. To do that I'm reading Ian Taylor's blog series about it, but a lot of it is beyond me at the moment - so I'd like to see how it works in practice.
At the moment I produce some object files and link them via gcc with:
gcc -m32 -o test.o -c test.c
gcc -m32 -o main.o -c main.c
gcc -m32 -o test main.o test.o
How do I replicate the gcc -m32 -o test main.o test.o stage using ld?
I've tried a very naive: ld -A i386 ./test.o ./main.o
But that returns me these errors:
ld: i386 architecture of input file `./test.o' is incompatible with i386:x86-64 output
ld: i386 architecture of input file `./main.o' is incompatible with i386:x86-64 output
ld: warning: cannot find entry symbol _start; defaulting to 00000000004000b0
./test.o: In function `print_hello':
test.c:(.text+0xd): undefined reference to `_GLOBAL_OFFSET_TABLE_'
test.c:(.text+0x1e): undefined reference to `puts'
./main.o: In function `main':
main.c:(.text+0x15): undefined reference to `_GLOBAL_OFFSET_TABLE_'
I'm most confused by _start and _GLOBAL_OFFSET_TABLE_ being missing - what additional info does gcc give to ld to add them?
Here are the files:
main.c
#include "test.h"
void main()
{
print_hello();
}
test.h
void print_hello();
test.c
#include <stdio.h>
void print_hello()
{
puts("Hello, world");
}
@sam: I am not the best person to answer your question because I am a beginner in compilation. I know how to compile programs but I do not really understand all the details (https://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools).
So, I decided this year to try to understand how compilation works, and I tried, more or less, the same things you tried a few days ago. As nobody has answered, I am going to describe what I have done, but I hope an expert will supplement my answer.
Short answer: it is recommended not to use ld directly but to use gcc instead. Nevertheless, it is, as you write, interesting to know how the linking process works. This command works on my computer:
ld -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o test test.o main.o /usr/lib/crt1.o /usr/lib/libc.so /usr/lib/crti.o /usr/lib/crtn.o
Very long answer:
How did I find the command above?
As n.m suggested, run gcc with the -v option.
gcc -v -m32 -o test main.o test.o
... /usr/libexec/gcc/x86_64-redhat-linux/4.8.5/collect2 ... (many options and parameters) ...
If you run ld with these options and parameters (copy and paste), it should work.
Try your command with -m elf_i386 (cf. collect2 parameters)
ld -m elf_i386 test.o main.o
ld: warning: cannot find entry symbol _start; ....
Look for symbol _start in object files used in the full ld command.
readelf -s /usr/lib/crt1.o (or objdump -t)
Symbol table '.symtab' contains 18 entries:
Num: Value Size Type Bind Vis Ndx Name
...
11: 00000000 0 FUNC GLOBAL DEFAULT 2 _start
Add this object to your ld command:
ld -m elf_i386 test.o main.o /usr/lib/crt1.o
... undefined reference to `__libc_csu_fini'...
Look for this new reference in the object files. It is not obvious which libraries and object files are used, because of the -L and -l options and because some .so files pull in other libraries (for example, see cat /usr/lib/libc.so). But ld's --trace option helps. Try this command:
ld --trace ... (collect2 parameters)
At the end, you should find:
ld -m elf_i386 -o test test.o main.o /usr/lib/crt1.o /usr/lib/libc_nonshared.a /lib/libc.so.6 /usr/lib/crti.o
or shorter (cf. cat /usr/lib/libc.so):
ld -m elf_i386 -o test test.o main.o /usr/lib/crt1.o /usr/lib/libc.so /usr/lib/crti.o
It links, but the result does not run (try ./test). It needs the right -dynamic-linker option because it is a dynamically linked ELF executable (cf. the collect2 parameters to find it):
ld -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o test test.o main.o /usr/lib/crt1.o /usr/lib/libc.so /usr/lib/crti.o
It still does not run (Segmentation fault (core dumped)) because you need the epilogue of the _init and _fini functions (https://gcc.gnu.org/onlinedocs/gccint/Initialization.html). Add the crtn.o object:
ld -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o test test.o main.o /usr/lib/crt1.o /usr/lib/libc.so /usr/lib/crti.o /usr/lib/crtn.o
./test
Hello, world
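The "run gcc -v and copy the collect2 parameters" step can be automated. The following is only a sketch in Python: the helper name collect2_argv and the shortened SAMPLE text are made up for illustration, and a real gcc -v dump contains many more options than shown here.

```python
import shlex

def collect2_argv(gcc_verbose_output):
    """Return the argv of the collect2 line found in `gcc -v` output, or None."""
    for line in gcc_verbose_output.splitlines():
        words = line.split()
        # The link step is the line whose first word is the collect2 binary.
        if words and words[0].endswith("collect2"):
            return shlex.split(line.strip())
    return None

# Hypothetical, heavily shortened stand-in for what `gcc -v -m32 ...` prints on stderr.
SAMPLE = """\
COLLECT_GCC=gcc
 /usr/libexec/gcc/x86_64-redhat-linux/4.8.5/collect2 -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 -o test main.o test.o
"""

argv = collect2_argv(SAMPLE)
# Swap collect2 for ld and keep every parameter as-is.
ld_argv = ["ld"] + argv[1:]
print(" ".join(ld_argv))
```

On a real system the text would come from the stderr of gcc -v -m32 -o test main.o test.o rather than from a hard-coded string.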

Breaking NASM files into multiple with link errors on OS X

My base assembler file foidlrt.asm started getting a bit too large, so I broke it up into two. Here is the entirety of the second file, foidl_stdio.asm:
; foidl_stdio.asm
%include "foidlstnd.inc"
section .text
DEFAULT REL
global foidl_fclose ; Raw file close
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; foidl_fclose
; Raw file close
; REGISTERS (1):
; RDI file handle
; CALLS:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
foidl_fclose:
mov rax,SYSCALL_FILE_CLOSE ; 0x2000006
syscall
ret
However, when I build I now get this error from make, despite the global declaration in the new file:
nasm src/foidlrt.asm -f macho64 --prefix _ -g -O0 -Iincludes/ -o asmobjs/foildrt.o
nasm src/foidlrt.asm -f macho64 --prefix _ -g -O0 -Iincludes/ -o asmobjs/foidl_stdio.o
libtool -static -s -o libs/libfoidlrt.a asmobjs/foildrt.o asmobjs/foidl_stdio.o
gcc src/testlink.c -L libs -l foidlrt -Wall -g -L. -Wl,-pie -I. -o bin/testlink
Undefined symbols for architecture x86_64:
"_foidl_fclose", referenced from:
_main in testlink-4b5ad3.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Version information:
XCode - 7.2.1 (7C1002)
nasm - NASM version 2.12 compiled on Feb 28 2016
gcc - Apple LLVM version 7.0.2 (clang-700.1.81)
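RESOLVED
The error was all mine: the makefile rule was bad. Both nasm invocations in the log above assemble src/foidlrt.asm, so the new file was never assembled. Working as expected now.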

linker option to ignore unused dependencies

I would like to remove all unused symbols from my compiled C++ binary. I saw this, which gives an overview using gcc, which is the toolchain I'm using: How to remove unused C/C++ symbols with GCC and ld?
However, on my system, the linking option (-Wl,--gc-sections) is rejected:
$ gcc -fdata-sections -ffunction-sections a.c -o a.o -Wl,--gc-sections
ld: fatal: unrecognized option '--'
ld: fatal: use the -z help option for usage information
collect2: error: ld returned 1 exit status
I'm running on illumos, which is a (relatively) recent fork of Solaris, with GCC 4.7. Does anybody know the correct linker option to use here?
Edit: searching the man pages more closely turned up "-zignore":
-z ignore | record
Ignores, or records, dynamic dependencies that are not
referenced as part of the link-edit. Ignores, or
records, unreferenced ELF sections from the relocatable
objects that are read as part of the link-edit. By
default, -z record is in effect.
If an ELF section is ignored, the section is eliminated
from the output file being generated. A section is
ignored when three conditions are true. The eliminated
section must contribute to an allocatable segment. The
eliminated section must provide no global symbols. No
other section from any object that contributes to the
link-edit, must reference an eliminated section.
However the following sequence still puts FUNCTION_SHOULD_BE_REMOVED in the ELF section .text.FUNCTION:
$ cat a.c
int main() {
return 0;
}
$ cat b.c
int FUNCTION_SHOULD_BE_REMOVED() {
return 0;
}
$ gcc -fdata-sections -ffunction-sections -c a.c -Wl,-zignore
$ gcc -fdata-sections -ffunction-sections -c b.c -Wl,-zignore
$ gcc -fdata-sections -ffunction-sections a.o b.o -Wl,-zignore
$ elfdump -s a.out # I removed a lot of output for brevity
Symbol Table Section: .dynsym
[2] 0x08050e72 0x0000000a FUNC GLOB D 1 .text.FUNCTION FUNCTION_SHOULD_BE_REMOVED
Symbol Table Section: .symtab
[71] 0x08050e72 0x0000000a FUNC GLOB D 0 .text.FUNCTION FUNCTION_SHOULD_BE_REMOVED
Because the man pages say "no global symbols", I tried making the function "static" and that had the same end result.
The ld '-z ignore' option is positional: it applies to those input objects which occur after it on the command line. The example you gave:
gcc a.o b.o -Wl,-zignore
applies the option to no objects, so nothing is done.
gcc -Wl,-zignore a.o b.o
should work.
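To make the positional behaviour concrete, here is a small Python sketch that walks a command line the way the answer describes and reports which inputs fall under -z ignore. It is a toy model, not the linker's real parser: the function names are invented, only files ending in .o, .a or .so count as inputs, and the -z record default is taken from the man page excerpt in the question.

```python
def flatten_linker_args(args):
    """Expand gcc's -Wl,opt1,opt2 pass-through into separate ld options."""
    flat = []
    for arg in args:
        if arg.startswith("-Wl,"):
            flat.extend(arg[len("-Wl,"):].split(","))
        else:
            flat.append(arg)
    return flat

def inputs_covered_by_z_ignore(args):
    """Return the input files that appear while -z ignore is in effect."""
    tokens = flatten_linker_args(args)
    ignoring = False  # per the quoted man page, -z record is the default
    covered = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "-o":                  # skip the output file name
            i += 2
            continue
        if tok == "-z" and i + 1 < len(tokens):
            tok = "-z" + tokens[i + 1]   # normalise "-z ignore" to "-zignore"
            i += 1
        if tok == "-zignore":
            ignoring = True
        elif tok == "-zrecord":
            ignoring = False
        elif ignoring and tok.endswith((".o", ".a", ".so")):
            covered.append(tok)
        i += 1
    return covered

print(inputs_covered_by_z_ignore(["gcc", "a.o", "b.o", "-Wl,-zignore"]))  # []
print(inputs_covered_by_z_ignore(["gcc", "-Wl,-zignore", "a.o", "b.o"]))  # ['a.o', 'b.o']
```

With the option at the end, no input is covered, which matches the behaviour seen in the question.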

GCC 4.5 vs 4.4 linking with dependencies

I am observing a difference when trying to do the same operation on GCC 4.4 and GCC 4.5. Because the code I am doing this with is proprietary, I am unable to provide it, but I am observing a similar failure with this simple test case.
What I am basically trying to do is have one shared library (libb) depend on another shared library (liba). When loading libb, I assume that liba should be loaded as well - even though libb is not necessarily using the symbols in liba.
What I am observing is that when I compile with GCC 4.4, liba is loaded, but when I compile with GCC 4.5, liba is not loaded.
I have a small test case that consists of three source files, a.c, b.c and c.c, plus a script test.sh. The contents of the files:
//a.c
int a(){
return 0;
}
//b.c
int b(){
return 0;
}
//c.c
#include <stdio.h>
int a();
int b();
int main()
{
printf("%d\n", a()+b());
return 0;
}
//test.sh
$CC -o liba.so a.c -shared
$CC -o libb.so b.c -shared -L. -la -Wl,-rpath-link .
$CC c.c -L. -lb -Wl,-rpath-link .
LD_LIBRARY_PATH=. ./a.out
This is my output with different versions of GCC
$ CC=gcc-4.4 ./test.sh
1
$ CC=gcc-4.5 ./test.sh
/tmp/cceJhAqy.o: In function `main':
c.c:(.text+0xf): undefined reference to `a'
collect2: ld returned 1 exit status
./test.sh: line 4: ./a.out: No such file or directory
$ CC=gcc-4.6 ./test.sh
/tmp/ccoovR0x.o: In function `main':
c.c:(.text+0xf): undefined reference to `a'
collect2: ld returned 1 exit status
./test.sh: line 4: ./a.out: No such file or directory
$
Can anyone explain what is happening? Another extra bit of information is that ldd on libb.so does show liba.so on GCC 4.4 but not on GCC 4.5.
EDIT
I changed test.sh to the following:
$CC -shared -o liba.so a.c
$CC -L. -Wl,--no-as-needed -Wl,--copy-dt-needed-entries -la -shared -o libb.so b.c -Wl,-rpath-link .
$CC -L. c.c -lb -Wl,-rpath-link .
LD_LIBRARY_PATH=. ./a.out
This gave the following output with GCC 4.5:
/usr/bin/ld: /tmp/cc5IJ8Ks.o: undefined reference to symbol 'a'
/usr/bin/ld: note: 'a' is defined in DSO ./liba.so so try adding it to the linker command line
./liba.so: could not read symbols: Invalid operation
collect2: ld returned 1 exit status
./test.sh: line 4: ./a.out: No such file or directory
There seem to have been changes in how ld treats DT_NEEDED libraries during linking. Here's the relevant part of the current man ld:
With --copy-dt-needed-entries dynamic libraries mentioned on the command
line will be recursively searched, following their DT_NEEDED tags to other libraries, in order to resolve symbols required by the output binary. With the
default setting however the searching of dynamic libraries that follow it will stop with the dynamic library itself. No DT_NEEDED links will be traversed
to resolve symbols.
(part of the --copy-dt-needed-entries section).
Apparently, some time between GCC 4.4 and GCC 4.5 (see some reference here; I can't find anything really authoritative), the default changed from recursive search to no recursive search, which is what you are seeing with the newer GCCs.
In any case, you can (and should) fix it by specifying liba in your final link step:
$CC c.c -L. -lb -la -Wl,-rpath-link .
You can check that this linker setting is indeed (at least part of) the issue by running with your newer compilers and this command line:
$CC c.c -L. -Wl,--copy-dt-needed-entries -lb -Wl,--no-copy-dt-needed-entries \
-Wl,-rpath-link .
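The difference between the two search modes can be illustrated with a toy model in Python. This is not how ld is implemented: the LIBS table and find_definition are invented for illustration, and the model assumes libb.so actually records liba.so as a DT_NEEDED entry.

```python
# Toy model only: each "library" lists the symbols it defines and its DT_NEEDED entries.
LIBS = {
    "libb.so": {"defines": {"b"}, "needed": ["liba.so"]},
    "liba.so": {"defines": {"a"}, "needed": []},
}

def find_definition(symbol, cmdline_libs, libs, follow_needed):
    """Return the library that resolves `symbol`, or None if it stays undefined."""
    pending = list(cmdline_libs)
    seen = set()
    while pending:
        name = pending.pop(0)
        if name in seen:
            continue
        seen.add(name)
        if symbol in libs[name]["defines"]:
            return name
        if follow_needed:
            # Recursive search: also look in the libraries this one depends on.
            pending.extend(libs[name]["needed"])
    return None

# Old behaviour (like --copy-dt-needed-entries): -lb alone is enough to find a().
print(find_definition("a", ["libb.so"], LIBS, follow_needed=True))               # liba.so
# Newer default: the search stops at libb.so, so a() is undefined.
print(find_definition("a", ["libb.so"], LIBS, follow_needed=False))              # None
# The recommended fix: name liba on the command line as well.
print(find_definition("a", ["libb.so", "liba.so"], LIBS, follow_needed=False))   # liba.so
```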

How can I tell, with something like objdump, if an object file has been built with -fPIC?

The answer depends on the platform. On most platforms, if output from
readelf --relocs foo.o | egrep '(GOT|PLT|JU?MP_SLOT)'
is empty, then either foo.o was not compiled with -fPIC, or foo.o doesn't contain any code where -fPIC matters.
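The same filter can be applied from a script. Below is a minimal Python sketch, assuming the text passed in is the output of readelf --relocs foo.o; the function name and the sample relocation lines are made up for illustration and are not taken from a real object file.

```python
import re

# Same pattern as the egrep in the answer above.
PIC_RELOC = re.compile(r"GOT|PLT|JU?MP_SLOT")

def pic_relocation_lines(relocs_output):
    """Return the lines of `readelf --relocs` output that mention GOT/PLT relocations."""
    return [line for line in relocs_output.splitlines() if PIC_RELOC.search(line)]

# Made-up sample input, only to show the shape of the data.
SAMPLE_PIC = """\
00000009  00000d0a R_386_GOTPC       00000000   _GLOBAL_OFFSET_TABLE_
0000001a  00000e04 R_386_PLT32       00000000   puts
"""
SAMPLE_NON_PIC = """\
00000009  00000501 R_386_32          00000000   .rodata
"""

print(len(pic_relocation_lines(SAMPLE_PIC)))      # 2
print(len(pic_relocation_lines(SAMPLE_NON_PIC)))  # 0
```

An empty result has the same two possible meanings as in the answer: no -fPIC, or no code where -fPIC matters.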
I just had to do this on a PowerPC target to find which shared object (.so) was being built without -fPIC. What I did was run readelf -d libMyLib1.so and look for TEXTREL. If you see TEXTREL, one or more source files that make up your .so were not built with -fPIC. You can substitute readelf with elfdump if necessary.
E.g.,
[user@host lib]$ readelf -d libMyLib1.so | grep TEXT # Bad, not -fPIC
0x00000016 (TEXTREL)
[user@host lib]$ readelf -d libMyLib2.so | grep TEXT # Good, -fPIC
[user@host lib]$
And to help people searching for solutions, the error I was getting when I ran my executable was this:
root@target:/# ./program: error while loading shared libraries: /usr/lib/libMyLib1.so: R_PPC_REL24 relocation at 0x0fc5987c for symbol 'memcpy' out of range
I don't know whether this info applies to all architectures.
Source: blogs.oracle.com/rie
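For checking many libraries at once, the TEXTREL test can be wrapped in a few lines of Python. This is a sketch under the assumption that readelf is on the PATH; has_textrel and library_has_textrel are hypothetical helper names, and the first sample line is the one from the output shown above.

```python
import subprocess

def has_textrel(readelf_dynamic_output):
    """True if `readelf -d` output mentions TEXTREL anywhere."""
    return any("TEXTREL" in line for line in readelf_dynamic_output.splitlines())

def library_has_textrel(path):
    """Run `readelf -d` on a shared library and apply the check (needs readelf installed)."""
    result = subprocess.run(["readelf", "-d", path],
                            capture_output=True, text=True, check=True)
    return has_textrel(result.stdout)

print(has_textrel(" 0x00000016 (TEXTREL)\n"))                               # True
print(has_textrel(" 0x00000001 (NEEDED) Shared library: [libc.so.6]\n"))    # False
```

Calling library_has_textrel("libMyLib1.so") would then report the "bad" library from the example, with the same caveat as above that this may not apply to every architecture.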
I assume that what you really want to know is whether or not a shared library is composed of object files compiled with -fPIC.
As already mentioned, if there are TEXTRELs, then -fPIC was not used.
There is a great tool called scanelf which can show you the symbols that caused .text relocations.
More information can be found at HOWTO Locate and Fix .text Relocations TEXTRELs.
-fPIC means that the code will be able to execute at an address different from the address it was compiled for.
To do this, the disassembly will look something like this (pseudo-assembly):
call get_offset_from_compilation_address
get_offset_from_compilation_address: pop ax
sub ax, ax, &get_offset_from_compilation_address
Now ax holds an offset that we need to add to any memory access:
load bx, [ax + var_address]
readelf -a *.so | grep Flags
Flags: 0x50001007, noreorder, pic, cpic, o32, mips32
This should work most of the time.
Another way to tell whether your program was built with the -fPIC option,
provided that the code was compiled with -g3 -gdwarf-2 (other gcc debug formats may also contain the macro info).
Note that the $'..' syntax below assumes bash:
echo $'#include <stdio.h>\nint main() { printf("%d\\n", \n#ifdef __PIC__\n__PIC__\n#else\n0\n#endif\n); return 0; }' | gcc -fPIC -g3 -gdwarf-2 -o test -x c -
readelf --debug-dump=macro ./test | grep __PIC__
This method works because the gcc manual states that if -fpic is used, __PIC__ is defined to 1, and if -fPIC is used, __PIC__ is 2.
Checking the GOT, as in the answers above, is the better way, because the prerequisite of -g3 -gdwarf-2 is, I guess, seldom met.
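The grep result can be turned into a yes/no answer with a short Python helper. This is a sketch only: it assumes the macro dump contains the macro name followed by its value (for example a line ending in "macro : __PIC__ 2"), and pic_level is a hypothetical name.

```python
import re

def pic_level(macro_dump):
    """Return the value of __PIC__ found in a macro dump, or 0 if it is not defined."""
    match = re.search(r"\b__PIC__\s+(\d+)", macro_dump)
    return int(match.group(1)) if match else 0

# Meaning of the value, per the gcc manual statement cited above.
MEANING = {0: "__PIC__ not defined", 1: "-fpic", 2: "-fPIC"}

# Assumed shape of one line of `readelf --debug-dump=macro` output.
sample = " DW_MACINFO_define - lineno : 0 macro : __PIC__ 2"
print(MEANING[pic_level(sample)])
```

In practice the input would be the output of the readelf --debug-dump=macro command shown above.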
From The Linux Programming Interface:
On Linux/x86-32, it is possible to create a shared library using
modules compiled without the -fPIC option. However, doing so loses
some of the benefits of shared libraries, since pages of program text
containing position-dependent memory references are not shared across
processes. On some architectures, it is impossible to build shared
libraries without the -fPIC option.
In order to determine whether an existing object file has been
compiled with the -fPIC option, we can check for the presence of the
name _GLOBAL_OFFSET_TABLE_ in the object file’s symbol table, using
either of the following commands:
$ nm mod1.o | grep _GLOBAL_OFFSET_TABLE_
$ readelf -s mod1.o | grep _GLOBAL_OFFSET_TABLE_
Conversely, if either of the following equivalent commands yields any
output, then the specified shared library includes at least one object
module that was not compiled with -fPIC:
$ objdump --all-headers libfoo.so | grep TEXTREL
$ readelf -d libfoo.so | grep TEXTREL
However, neither the quote above nor any other answer to this question works for x86_64.
What I've observed on my x86_64 Ubuntu machine is that gcc generates a PIC .o whether or not -fPIC is specified. That is, after
gcc -g -Wall -c -o my_so.o my_so.c // has _GLOBAL_OFFSET_TABLE_
gcc -g -Wall -fPIC -c -o my_so_fpic.o my_so.c // has _GLOBAL_OFFSET_TABLE_
readelf -s my_so.o > 1.txt && readelf -s my_so_fpic.o > 2.txt && diff 1.txt 2.txt
shows no difference, and both my_so.o and my_so_fpic.o can be used to create a shared library.
To generate a non-PIC object file, I found a gcc flag called -fno-pic in the first comment of How to test whether a Linux binary was compiled as position independent code? .
This works,
gcc -g -Wall -fno-pic -c -o my_so_fnopic.o my_so.c // no _GLOBAL_OFFSET_TABLE_
and
gcc -g -Wall -shared -o libdemo.so my_so_fnopic.o
gives error:
/usr/bin/ld: my_so_fnopic.o: relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
collect2: error: ld returned 1 exit status
So a shared library cannot be created from a non-PIC .o.

Resources