LD: Different ways to use ALIGN() - gcc

What is the difference between these?
mysection ALIGN(4): {...}
and
mysection: {. = ALIGN(4); ...}
and
. = ALIGN(4);
mysection: {...}
Are the results the same?

See:
$ cat foo.c
int mysym __attribute__((section(".mysection"))) = 42;
$ gcc -c foo.c
Case 1
$ cat foo_1.lds
SECTIONS
{
. = 0x10004;
.mysection ALIGN(8): {
*(.mysection)
}
}
$ ld -T foo_1.lds foo.o -o foo1.out
$ readelf -s foo1.out
Symbol table '.symtab' contains 5 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000010008 0 SECTION LOCAL DEFAULT 1
2: 0000000000000000 0 SECTION LOCAL DEFAULT 2
3: 0000000000000000 0 FILE LOCAL DEFAULT ABS foo.c
4: 0000000000010008 4 OBJECT GLOBAL DEFAULT 1 mysym
$ readelf -t foo1.out | grep -A3 '.mysection'
[ 1] .mysection
PROGBITS PROGBITS 0000000000010008 0000000000010008 0
0000000000000004 0000000000000000 0 4
[0000000000000003]: WRITE, ALLOC
Case 2
$ cat foo_2.lds
SECTIONS
{
. = 0x10004;
. = ALIGN(8);
.mysection : {
*(.mysection)
}
}
$ ld -T foo_2.lds foo.o -o foo2.out
$ readelf -s foo2.out
Symbol table '.symtab' contains 5 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000010008 0 SECTION LOCAL DEFAULT 1
2: 0000000000000000 0 SECTION LOCAL DEFAULT 2
3: 0000000000000000 0 FILE LOCAL DEFAULT ABS foo.c
4: 0000000000010008 4 OBJECT GLOBAL DEFAULT 1 mysym
$ readelf -t foo2.out | grep -A3 '.mysection'
[ 1] .mysection
PROGBITS PROGBITS 0000000000010008 0000000000010008 0
0000000000000004 0000000000000000 0 4
[0000000000000003]: WRITE, ALLOC
Case 3
$ cat foo_3.lds
SECTIONS
{
. = 0x10004;
.mysection : {
. = ALIGN(8);
*(.mysection)
}
}
$ ld -T foo_3.lds foo.o -o foo3.out
$ readelf -s foo3.out
Symbol table '.symtab' contains 5 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000010004 0 SECTION LOCAL DEFAULT 1
2: 0000000000000000 0 SECTION LOCAL DEFAULT 2
3: 0000000000000000 0 FILE LOCAL DEFAULT ABS foo.c
4: 0000000000010008 4 OBJECT GLOBAL DEFAULT 1 mysym
$ readelf -t foo3.out | grep -A3 '.mysection'
[ 1] .mysection
PROGBITS PROGBITS 0000000000010004 0000000000010004 0
0000000000000008 0000000000000000 0 4
[0000000000000003]: WRITE, ALLOC
So, Case 1 is equivalent to Case 2. Both of them align .mysection to
the next 8-byte boundary, 0x10008, after 0x10004, and mysym is at the same address.
But Case 3 does not align .mysection to 0x10008. It remains at 0x10004.
Then the location counter is aligned to 0x10008 after the start of .mysection,
and mysym is at that address.
In all cases the address of the first symbol in .mysection is 0x10008, but
only in Case 1 and Case 2 is that also the address of .mysection.
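The arithmetic behind the three cases can be reproduced with a short Python sketch. It is only a model of the behaviour shown in the readelf output above, not of ld itself: align_up and the three case_* helpers are names invented for illustration, and the model assumes the input section's own alignment requirement (4 for an int) is already met by the starting address.

```python
def align_up(addr, n):
    """Round addr up to the next multiple of n (n must be a power of two)."""
    return (addr + n - 1) & ~(n - 1)

START = 0x10004  # value assigned to the location counter in every script

def case_1(start, n):
    # .mysection ALIGN(n) : {...}  -> the section address itself is aligned
    section = align_up(start, n)
    return section, section          # (section address, first symbol address)

def case_2(start, n):
    # . = ALIGN(n); .mysection : {...}  -> counter is aligned before the section
    section = align_up(start, n)
    return section, section

def case_3(start, n):
    # .mysection : { . = ALIGN(n); ... }  -> section starts unaligned and
    # padding is inserted inside it, before the first input section
    section = start
    return section, align_up(section, n)

for name, fn in (("Case 1", case_1), ("Case 2", case_2), ("Case 3", case_3)):
    section, symbol = fn(START, 8)
    print(f"{name}: section at {section:#x}, mysym at {symbol:#x}")
```

In Case 3 the 4 bytes of padding belong to the section, which is consistent with readelf reporting its size as 8 rather than 4.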
Later
How is case 2 affected if I have multiple sections placed into different memory regions?
Whenever the script invokes:
. = ALIGN(N);
it simply sets the location counter to the next N-byte aligned boundary after
its current position. That is all. So:
Case 4
$ cat bar.c
char aa __attribute__((section(".section_a"))) = 0;
char bb __attribute__((section(".section_b"))) = 0;
$ cat bar.lds
SECTIONS
{
. = 0x10004;
. = ALIGN(8);
.section_a : {
*(.section_a)
}
. = 0x20004;
.section_b : {
*(.section_b)
}
}
$ gcc -c bar.c
$ ld -T bar.lds bar.o -o bar.out
$ readelf -s bar.out
Symbol table '.symtab' contains 7 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000010008 0 SECTION LOCAL DEFAULT 1
2: 0000000000020004 0 SECTION LOCAL DEFAULT 2
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 FILE LOCAL DEFAULT ABS bar.c
5: 0000000000020004 1 OBJECT GLOBAL DEFAULT 2 bb
6: 0000000000010008 1 OBJECT GLOBAL DEFAULT 1 aa
$ readelf -t bar.out | egrep -A3 '(section_a|section_b)'
[ 1] .section_a
PROGBITS PROGBITS 0000000000010008 0000000000010008 0
0000000000000001 0000000000000000 0 1
[0000000000000003]: WRITE, ALLOC
[ 2] .section_b
PROGBITS PROGBITS 0000000000020004 0000000000020004 0
0000000000000001 0000000000000000 0 1
[0000000000000003]: WRITE, ALLOC
Here, . = ALIGN(8); has the effect that .section_a, and the first object within it,
aa, are aligned to the first 8-byte boundary, 0x10008, after 0x10004. But . = 0x20004;
moves the location counter to an address that happens not to be 8-byte aligned, so .section_b
and its first object bb are not 8-byte aligned. Indeed, if we deleted . = 0x20004; then
.section_b and the object bb would be placed right after aa, at 0x10009.
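As a rough illustration of how the location counter moves through bar.lds, here is a toy walk over the script's statements in Python. The statement tuples and the layout function are hypothetical, invented for this sketch; it ignores memory regions and the alignment of input sections (which is 1 for a char, so nothing is lost here).

```python
def align_up(addr, n):
    """Round addr up to the next multiple of n (n must be a power of two)."""
    return (addr + n - 1) & ~(n - 1)

def layout(statements):
    """Walk toy linker-script statements and return each section's address."""
    dot = 0
    placed = {}
    for stmt in statements:
        kind = stmt[0]
        if kind == "set":          # . = VALUE;
            dot = stmt[1]
        elif kind == "align":      # . = ALIGN(N);
            dot = align_up(dot, stmt[1])
        elif kind == "section":    # NAME : { ... } holding SIZE bytes
            _, name, size = stmt
            placed[name] = dot
            dot += size
    return placed

bar_lds = [
    ("set", 0x10004),
    ("align", 8),
    ("section", ".section_a", 1),
    ("set", 0x20004),
    ("section", ".section_b", 1),
]

with_assignment = layout(bar_lds)
without_assignment = layout([s for s in bar_lds if s != ("set", 0x20004)])

print({name: hex(addr) for name, addr in with_assignment.items()})
print({name: hex(addr) for name, addr in without_assignment.items()})
```

The second dictionary shows .section_b landing at 0x10009 once the explicit assignment is removed, matching the remark above.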

I did my own experiment based on @Mike Kinghan's cases. The case numbers don't line up with his, but the cases test the same techniques, this time using multiple memory regions.
Observations:
Mike's case 1 (my case 2 and 3) fails to link at all.
Mike's case 2 (my case 1) fails to align the variable bb in the expected way.
Mike's case 3 (my case 5) successfully aligns the variable bb
Using SUBALIGN (my case 4) correctly aligns the variable bb, but would also align every input section in that output section.
Case 1
$ cat bar.c
char aa __attribute__((section(".section_a"))) = 0;
char bb __attribute__((section(".section_b"))) = 0;
$ gcc -c bar.c
$ cat bar1.lds
MEMORY {
FLASH (rx) : ORIGIN = 0x00000001, LENGTH = 0x100000
RAM (rwx) : ORIGIN = 0x20000000, LENGTH = 0x10000
}
SECTIONS
{
. = 0x10004;
. = ALIGN(8);
.section_a : {
*(.section_a)
} > RAM
_myvar = .;
. = ALIGN(8);
.section_b : {
*(.section_b)
} > FLASH
}
$ ld -T bar1.lds bar.o -o bar.out
$ readelf -s bar.out
Symbol table '.symtab' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000020000000 0 SECTION LOCAL DEFAULT 1
2: 0000000000000001 0 SECTION LOCAL DEFAULT 2
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 FILE LOCAL DEFAULT ABS bar.c
5: 0000000000000001 1 OBJECT GLOBAL DEFAULT 2 bb
6: 0000000020000000 1 OBJECT GLOBAL DEFAULT 1 aa
7: 0000000020000001 0 NOTYPE GLOBAL DEFAULT 1 _myvar
Case 2
$ cat bar2.lds
MEMORY {
FLASH (rx) : ORIGIN = 0x00000001, LENGTH = 0x100000
RAM (rwx) : ORIGIN = 0x20000000, LENGTH = 0x10000
}
SECTIONS
{
. = 0x10004;
. = ALIGN(8);
.section_a : {
*(.section_a)
} > RAM
_myvar = .;
.section_b ALIGN(8) : {
*(.section_b)
} > FLASH
}
$ ld -T bar2.lds bar.o -o bar.out
ld: address 0x20000009 of bar.out section `.section_b' is not within region `FLASH'
ld: address 0x20000009 of bar.out section `.section_b' is not within region `FLASH'
Case 3
$ cat bar3.lds
MEMORY {
FLASH (rx) : ORIGIN = 0x00000001, LENGTH = 0x100000
RAM (rwx) : ORIGIN = 0x20000000, LENGTH = 0x10000
}
SECTIONS
{
. = 0x10004;
. = ALIGN(8);
.section_a : {
*(.section_a)
} > FLASH
_myvar = .;
.section_b ALIGN(8) : {
*(.section_b)
} > RAM
}
$ ld -T bar3.lds bar.o -o bar.out
ld: address 0x9 of bar.out section `.section_b' is not within region `RAM'
ld: address 0x9 of bar.out section `.section_b' is not within region `RAM'
Case 4
$ cat bar4.lds
MEMORY {
FLASH (rx) : ORIGIN = 0x00000001, LENGTH = 0x100000
RAM (rwx) : ORIGIN = 0x20000000, LENGTH = 0x10000
}
SECTIONS
{
. = 0x10004;
. = ALIGN(8);
.section_a : {
*(.section_a)
} > RAM
_myvar = .;
.section_b : SUBALIGN(8){
*(.section_b)
} > FLASH
}
$ ld -T bar4.lds bar.o -o bar.out
$ readelf -s bar.out
Symbol table '.symtab' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000020000000 0 SECTION LOCAL DEFAULT 1
2: 0000000000000008 0 SECTION LOCAL DEFAULT 2
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 FILE LOCAL DEFAULT ABS bar.c
5: 0000000000000008 1 OBJECT GLOBAL DEFAULT 2 bb
6: 0000000020000000 1 OBJECT GLOBAL DEFAULT 1 aa
7: 0000000020000001 0 NOTYPE GLOBAL DEFAULT 1 _myvar
Case 5
$ cat bar5.lds
MEMORY {
FLASH (rx) : ORIGIN = 0x00000001, LENGTH = 0x100000
RAM (rwx) : ORIGIN = 0x20000000, LENGTH = 0x10000
}
SECTIONS
{
. = 0x10004;
. = ALIGN(8);
.section_a : {
*(.section_a)
} > RAM
_myvar = .;
.section_b : {
. = ALIGN(8);
*(.section_b)
} > FLASH
}
$ ld -T bar5.lds bar.o -o bar.out
$ readelf -s bar.out
Symbol table '.symtab' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000020000000 0 SECTION LOCAL DEFAULT 1
2: 0000000000000001 0 SECTION LOCAL DEFAULT 2
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 FILE LOCAL DEFAULT ABS bar.c
5: 0000000000000008 1 OBJECT GLOBAL DEFAULT 2 bb
6: 0000000020000000 1 OBJECT GLOBAL DEFAULT 1 aa
7: 0000000020000001 0 NOTYPE GLOBAL DEFAULT 1 _myvar
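The outcomes of these cases can be summarised with a simplified Python model. It is a sketch under assumptions, not a description of ld's implementation: place_in_region and its parameters are invented names, and the model assumes that a section assigned to a region with no explicit address starts at the region's next free address, rounded up to the section's alignment, while a . = ALIGN(N); outside the braces does not move that region cursor.

```python
def align_up(addr, n):
    """Round addr up to the next multiple of n (n must be a power of two)."""
    return (addr + n - 1) & ~(n - 1)

FLASH_ORIGIN = 0x00000001   # deliberately odd, as in the scripts above
FLASH_LENGTH = 0x100000

def place_in_region(cursor, input_align=1, subalign=None, inner_align=None):
    """Return (section address, address of the first input section).

    cursor      -- next free address in the target memory region
    input_align -- alignment the input section asks for (1 for a char)
    subalign    -- value of SUBALIGN(...), if any
    inner_align -- N from a '. = ALIGN(N);' inside the braces, if any
    """
    align = subalign if subalign is not None else input_align
    section = align_up(cursor, align)
    first = section
    if inner_align is not None:
        first = align_up(first, inner_align)
    first = align_up(first, align)
    return section, first

# Case 1: '. = ALIGN(8);' sits outside the section, so it is not an input here.
print("Case 1:", [hex(a) for a in place_in_region(FLASH_ORIGIN)])
# Case 4: SUBALIGN(8) raises the alignment of the whole section.
print("Case 4:", [hex(a) for a in place_in_region(FLASH_ORIGIN, subalign=8)])
# Case 5: '. = ALIGN(8);' inside the braces pads before bb only.
print("Case 5:", [hex(a) for a in place_in_region(FLASH_ORIGIN, inner_align=8)])

# Case 2: an explicit ALIGN(8) address is computed from '.', which still
# points into RAM after .section_a (the value of _myvar in the tables).
dot_after_section_a = 0x20000001
explicit = align_up(dot_after_section_a, 8)
in_flash = FLASH_ORIGIN <= explicit < FLASH_ORIGIN + FLASH_LENGTH
print("Case 2: explicit address", hex(explicit), "inside FLASH?", in_flash)
```

The Case 2 result is consistent with the link error: a 1-byte section starting at 0x20000008 ends at the reported 0x20000009, which is nowhere near FLASH.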

Related

Keep all exported symbols when creating a shared library from a static library

I am creating a shared library from a static library for which I do not have the source code.
Many Stack Overflow questions provide answers on how to do that:
gcc -shared -o libxxx.so -Wl,--whole-archive libxxx.a -Wl,--no-whole-archive
However, some public functions of the static library are included as hidden functions in the shared library:
$ nm --defined-only libxxx.a | grep __intel_cpu_indicator_init
0000000000000000 T __intel_cpu_indicator_init
$ nm libxxx.so | grep __intel_cpu_indicator_init
00000000030bb160 t __intel_cpu_indicator_init
The __intel_cpu_indicator_init symbol went from exported to hidden.
It is not the only symbol that was hidden in the process:
$ nm libxxx.a | grep ' T ' | wc -l
37969
$ nm libxxx.so | grep ' T ' | wc -l
37548
$ nm libxxx.a | grep ' t ' | wc -l
62298
$ nm libxxx.so | grep ' t ' | wc -l
62727
Note that 37969 + 62298 = 100267 and 37548 + 62727 = 100275.
Is there anything I can do to have the linker produce a shared library in which all public symbols from the static library are also public?
What you observe results when some of the global symbol definitions in some of
the object files archived in libxxx.a were compiled with the function attribute
or variable attribute visibility("hidden").
This attribute has the effect that when the object file containing the
global symbol definition is linked into a shared library:
The linkage of the symbol is changed from global to local in the static symbol table (.symtab) of the output shared library,
so that when that shared library is linked with anything else, the linker cannot see the definition of the symbol.
The symbol definition is not added to the dynamic symbol table (.dynsym) of the output shared library (which by default it would be)
so that when the shared library is loaded into a process, the loader is likewise unable to find a definition of the symbol.
In short, the global symbol definition in the object file is hidden for the purposes of dynamic linkage.
Check this out with:
$ readelf -s libxxx.a | grep HIDDEN
and I expect you to get hits for the unexported global symbols. If you don't,
you need read no further because I have no other explanation of what you see
and wouldn't count on any workaround I suggested not to shoot you in the foot.
Here is an illustration:
a.c
#include <stdio.h>

void aa(void)
{
    puts(__func__);
}
b.c
#include <stdio.h>

void __attribute__((visibility("hidden"))) bb(void)
{
    puts(__func__);
}
de.c
#include <stdio.h>

void __attribute__((visibility("default"))) dd(void)
{
    puts(__func__);
}

void ee(void)
{
    puts(__func__);
}
We'll compile a.c and b.c like so:
$ gcc -Wall -c a.c b.c
And we can see that symbols aa and bb are defined and global in their respective object files:
$ nm --defined-only a.o b.o
a.o:
0000000000000000 T aa
0000000000000000 r __func__.2361
b.o:
0000000000000000 T bb
0000000000000000 r __func__.2361
But we can also observe this difference:
$ readelf -s a.o
Symbol table '.symtab' contains 13 entries:
Num: Value Size Type Bind Vis Ndx Name
...
10: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 aa
...
as compared with:
$ readelf -s b.o
Symbol table '.symtab' contains 13 entries:
Num: Value Size Type Bind Vis Ndx Name
...
10: 0000000000000000 19 FUNC GLOBAL HIDDEN 1 bb
...
aa is a GLOBAL symbol with DEFAULT visibility and bb is a GLOBAL
symbol with HIDDEN visibility.
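For a large archive it may be easier to filter the readelf -s output programmatically than by eye. The following Python sketch is one possible way to do that; hidden_globals is a name invented here, the sample text is assembled from rows quoted in this thread plus one undefined hidden symbol, and the parser assumes the eight-column layout shown above.

```python
def hidden_globals(readelf_output):
    """Return names of defined GLOBAL symbols whose visibility is HIDDEN."""
    names = []
    for line in readelf_output.splitlines():
        fields = line.split()
        # Symbol rows look like: Num: Value Size Type Bind Vis Ndx Name
        if len(fields) != 8 or not fields[0].endswith(":"):
            continue
        _num, _value, _size, _type, bind, vis, ndx, name = fields
        # Undefined symbols (Ndx == UND) are references, not definitions.
        if bind == "GLOBAL" and vis == "HIDDEN" and ndx != "UND":
            names.append(name)
    return names

sample = """\
Symbol table '.symtab' contains 13 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
    10: 0000000000000000    19 FUNC    GLOBAL DEFAULT    1 aa
    10: 0000000000000000    19 FUNC    GLOBAL HIDDEN     1 bb
    21: 0000000000000000     0 NOTYPE  GLOBAL HIDDEN   UND __dso_handle
"""

print(hidden_globals(sample))
```

In practice the text would come from running readelf -s on the archive and capturing its output, rather than from a literal string.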
We'll compile de.c differently:
$ gcc -Wall -fvisibility=hidden -c de.c
Here, we're instructing the compiler that any symbol shall be given hidden
visibility unless a countervailing visibility attribute is specified for
it in the source code. And accordingly we see:
$ readelf -s de.o
Symbol table '.symtab' contains 15 entries:
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
...
11: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 dd
...
14: 0000000000000013 19 FUNC GLOBAL HIDDEN 1 ee
Archiving these object files in a static library changes them in no way:
$ ar rcs libabde.a a.o b.o de.o
And then if we link all of them into a shared library:
$ gcc -o libabde.so -shared -Wl,--whole-archive libabde.a -Wl,--no-whole-archive
we find that:
$ readelf -s libabde.so | egrep '(aa|bb|dd|ee|Symbol table)'
Symbol table '.dynsym' contains 8 entries:
6: 0000000000001105 19 FUNC GLOBAL DEFAULT 12 aa
7: 000000000000112b 19 FUNC GLOBAL DEFAULT 12 dd
Symbol table '.symtab' contains 59 entries:
45: 0000000000001118 19 FUNC LOCAL DEFAULT 12 bb
51: 000000000000113e 19 FUNC LOCAL DEFAULT 12 ee
54: 0000000000001105 19 FUNC GLOBAL DEFAULT 12 aa
56: 000000000000112b 19 FUNC GLOBAL DEFAULT 12 dd
bb and ee, which were GLOBAL with HIDDEN visibility in the object files,
are LOCAL in the static symbol table of libabde.so and are absent altogether
from its dynamic symbol table.
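The rules at work in this illustration can be restated as a small Python sketch. It is a simplified model, not a description of the toolchain's internals: the function names are invented, only DEFAULT and HIDDEN visibility are covered, and the linker is assumed to run with default options (no version script or similar).

```python
def effective_visibility(attribute=None, fvisibility="default"):
    """Visibility the compiler records for a global definition: an explicit
    visibility attribute wins, otherwise the -fvisibility= setting applies."""
    return (attribute or fvisibility).upper()

def in_shared_library(visibility):
    """Return (binding in .symtab, listed in .dynsym?) after linking -shared."""
    if visibility == "HIDDEN":
        return "LOCAL", False
    if visibility == "DEFAULT":
        return "GLOBAL", True
    raise ValueError("only DEFAULT and HIDDEN are modelled here")

symbols = {
    "aa": effective_visibility(),                            # a.c, no attribute
    "bb": effective_visibility(attribute="hidden"),          # b.c
    "dd": effective_visibility(attribute="default", fvisibility="hidden"),
    "ee": effective_visibility(fvisibility="hidden"),        # de.c, no attribute
}

for name, vis in symbols.items():
    binding, exported = in_shared_library(vis)
    print(f"{name}: {vis:7} -> .symtab {binding:6} .dynsym {'yes' if exported else 'no'}")
```

The printed table lines up with the readelf output for libabde.so: aa and dd stay GLOBAL and appear in .dynsym, while bb and ee become LOCAL and do not.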
In this light, you may wish to re-evaluate your mission:
The symbols that have been given hidden visibility in the object files in libxxx.a have
been hidden because the person who compiled them had a reason for
wishing to conceal them from dynamic linkage. Do you have a countervailing need
to export them for dynamic linkage? Or do you maybe just want to export them because
you've noticed that they're not exported and don't know why not?
If you nonetheless want to unhide the hidden symbols, and cannot change the source code
of the object files archived in libxxx.a, your least worst resort is to:
Extract each object file from libxxx.a
Doctor it to replace HIDDEN with DEFAULT visibility on its global definitions
Put it into a new archive libyyy.a
Then use libyyy.a instead of libxxx.a.
The binutils tool for doctoring object files is objcopy.
But objcopy has no operations to directly manipulate the dynamic visibility of
a symbol and you'd have to settle for a circuitous kludge that "achieves the effect
of" unhiding the hidden symbols:
With objcopy --redefine-sym, rename each hidden global symbol S as, say, __hidden__S.
With objcopy --add-symbol, add a new global symbol S that has the same value as __hidden__S
but gets DEFAULT visibility by default.
ending up with two symbols with the same definition: the original hidden one
and a new unhidden alias for it.
Preferable to that would be a means of simply and solely changing the visibility of a symbol in
an ELF object file, and such a means is to hand in the LIEF library (Library to Instrument Executable Formats),
a Swiss Army Chainsaw for object and executable file alterations [1].
Here is a Python script that calls on pylief, the LIEF Python module, to unhide the
hidden globals in an ELF object file:
unhide.py
#!/usr/bin/python
# unhide.py - Replace hidden with default visibility on global symbols defined
#   in an ELF object file

import argparse, sys, lief
from lief.ELF import SYMBOL_BINDINGS, SYMBOL_VISIBILITY, SYMBOL_TYPES

def warn(msg):
    sys.stderr.write("WARNING: " + msg + "\n")

def unhide(objfile_in, objfile_out = None, namedsyms=None):
    if not objfile_out:
        objfile_out = objfile_in
    binary = lief.parse(objfile_in)
    allsyms = { sym.name for sym in binary.symbols }
    selectedsyms = set([])
    # Symbols that are "not applicable": anything that is not a
    # typed, global, hidden symbol
    nasyms = { sym.name for sym in binary.symbols if \
                            sym.type == SYMBOL_TYPES.NOTYPE or \
                            sym.binding != SYMBOL_BINDINGS.GLOBAL or \
                            sym.visibility != SYMBOL_VISIBILITY.HIDDEN }
    if namedsyms:
        namedsyms = set(namedsyms)
        nosyms = namedsyms - allsyms
        for nosym in nosyms:
            warn("No symbol " + nosym + " in " + objfile_in + ": ignored")
        for sym in namedsyms & nasyms:
            warn("Input symbol " + sym + \
                " is not a hidden global symbol defined in " + objfile_in + \
                ": ignored")
        selectedsyms = namedsyms - nosyms
    else:
        selectedsyms = allsyms

    selectedsyms -= nasyms
    unhidden = 0
    for sym in binary.symbols:
        if sym.name in selectedsyms:
            sym.visibility = SYMBOL_VISIBILITY.DEFAULT
            unhidden += 1
            print("Unhidden: " + sym.name)
    print("{} symbols were unhidden".format(unhidden))
    binary.write(objfile_out)

def get_args():
    parser = argparse.ArgumentParser(
        description="Replace hidden with default visibility on " + \
            "global symbols defined in an ELF object file.")
    parser.add_argument("ELFIN",help="ELF object file to read")
    parser.add_argument("-s","--symbol",metavar="SYMBOL",action="append",
        help="Unhide SYMBOL. " + \
            "If unspecified, unhide all hidden global symbols defined in ELFIN")
    parser.add_argument("--symfile",
        help="File of whitespace-delimited symbols to unhide")
    parser.add_argument("-o","--out",metavar="ELFOUT",
        help="ELF object file to write. If unspecified, rewrite ELFIN")
    return parser.parse_args()

def main():
    args = get_args()
    objfile_in = args.ELFIN
    objfile_out = args.out
    symlist = args.symbol
    if not symlist:
        symlist = []
    symfile = args.symfile
    if symfile:
        with open(symfile,"r") as fh:
            symlist += [word for line in fh for word in line.split()]
    unhide(objfile_in,objfile_out,symlist)

main()
Usage:
$ ./unhide.py -h
usage: unhide.py [-h] [-s SYMBOL] [--symfile SYMFILE] [-o ELFOUT] ELFIN
Replace hidden with default visibility on global symbols defined in an ELF
object file.
positional arguments:
ELFIN ELF object file to read
optional arguments:
-h, --help show this help message and exit
-s SYMBOL, --symbol SYMBOL
Unhide SYMBOL. If unspecified, unhide all hidden
global symbols defined in ELFIN
--symfile SYMFILE File of whitespace-delimited symbols to unhide
-o ELFOUT, --out ELFOUT
ELF object file to write. If unspecified, rewrite
ELFIN
And here is a shell script:
unhide.sh
#!/bin/bash

OLD_ARCHIVE=$1
NEW_ARCHIVE=$2
# List the member object files of the old archive
OBJS=$(ar t "$OLD_ARCHIVE")
for obj in $OBJS; do
    rm -f "$obj"
    ar xv "$OLD_ARCHIVE" "$obj"
    ./unhide.py "$obj"
done
rm -f "$NEW_ARCHIVE"
ar rcs "$NEW_ARCHIVE" $OBJS
echo "$NEW_ARCHIVE made"
that takes:
$1 = Name of an existing static library
$2 = Name for a new static library
and creates $2 containing the object files from $1, each modified
with unhide.py to unhide all of its hidden global definitions.
Back with our illustration, we can run:
$ ./unhide.sh libabde.a libnew.a
x - a.o
0 symbols were unhidden
x - b.o
Unhidden: bb
1 symbols were unhidden
x - de.o
Unhidden: ee
1 symbols were unhidden
libnew.a made
and confirm that worked with:
$ readelf -s libnew.a | grep HIDDEN; echo Done
Done
$ readelf -s libnew.a | egrep '(aa|bb|dd|ee)'
10: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 aa
10: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 bb
11: 0000000000000000 19 FUNC GLOBAL DEFAULT 1 dd
14: 0000000000000013 19 FUNC GLOBAL DEFAULT 1 ee
Finally if we relink the shared library with the new archive
$ gcc -o libabde.so -shared -Wl,--whole-archive libnew.a -Wl,--no-whole-archive
all of the global symbols from the archive are exported:
$ readelf --dyn-syms libabde.so | egrep '(aa|bb|dd|ee)'
6: 0000000000001105 19 FUNC GLOBAL DEFAULT 12 aa
7: 000000000000112b 19 FUNC GLOBAL DEFAULT 12 dd
8: 0000000000001118 19 FUNC GLOBAL DEFAULT 12 bb
9: 000000000000113e 19 FUNC GLOBAL DEFAULT 12 ee
[1] Download C/C++/Python libraries. Debian/Ubuntu provides the C/C++ dev package lief-dev.

Linker error using gcc and clang together on macos sierra

I have to compile C++ code with g++ 6.4.0 (Homebrew g++-6) to a static lib, which is then wrapped into a C static lib (Homebrew gcc-6) and linked to a clang++ (clang 8.1.0) app on macOS Sierra. So the picture is:
C++ (gcc) wrapped in C (gcc) linked to a clang app.
As a testcase I use shared-lib.cpp:
#include <iostream>
using namespace std;

void foo()
{
    cerr << "Hi from the shared lib" << endl;
}
together with shared-lib.h
extern void foo();
and wrapper-lib.c
#include "shared-lib.h"

int wrapper()
{
    foo();
    return 123;
}
along with wrapper-lib.h
#ifdef __cplusplus
extern "C"
{
#endif
extern int wrapper();
#ifdef __cplusplus
}
#endif
The main.cpp that uses all the libs looks like
#include <iostream>
#include <string>
#include "shared-lib.h"
#include "wrapper-lib.h"
using namespace std;

int main()
{
    auto s = "Hello world from main";
    cout << s << endl;
    foo(); // from c++ lib
    int result = wrapper(); // from c wrapper lib
    cout << "wrapper returned " << result << endl;
    return 0;
}
My test build script is
g++-6 --version
echo -----------------------
echo build shared-lib .o with g++
g++-6 -c -Wall -fpic -std=c++11 shared-lib.cpp
echo build a wrapper library in C with gcc
gcc-6 -c -Wall -fpic wrapper-lib.c
echo build static libshared-lib.a
ar rcs libshared-lib.a shared-lib.o
echo build static libwrapper-lib.a
ar rcs libwrapper-lib.a wrapper-lib.o
echo build main with clang
clang++ --version
echo ----------------------
clang++ -v -L/Users/worker -Wall -std=c++11 -stdlib=libstdc++ -lwrapper-lib -lshared-lib main.cpp -o main
echo start the app
./main
If I only call the gcc c++ function foo() then everything works fine.
If I call the C wrapper function wrapper(), then clang comes up with:
Undefined symbols for architecture x86_64:
"_foo", referenced from:
_wrapper in libwrapper-lib.a(wrapper-lib.o)
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
Maybe someone can simply spot, what's wrong with my workflow?
Note, for completeness the whole build script output
Note 2: since ar in the gcc@6 toolchain does not work (liblto_plugin.so missing), I use clang's ar tool...
mac-mini:~ worker$ ./build-test.sh
g++-6 (Homebrew GCC 6.4.0) 6.4.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
-----------------------
build shared-lib .o with g++
build a wrapper library in C with gcc
build static libshared-lib.a
build static libwrapper-lib.a
build main with clang
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 8.1.0 (clang-802.0.41)
Target: x86_64-apple-darwin16.7.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
----------------------
Apple LLVM version 8.1.0 (clang-802.0.41)
Target: x86_64-apple-darwin16.7.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
clang: warning: libstdc++ is deprecated; move to libc++ [-Wdeprecated]
"/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang" -cc1 -triple x86_64-apple-macosx10.12.0 -Wdeprecated-objc-isa-usage -Werror=deprecated-objc-isa-usage -emit-obj -mrelax-all -disable-free -disable-llvm-verifier -discard-value-names -main-file-name main.cpp -mrelocation-model pic -pic-level 2 -mthread-model posix -mdisable-fp-elim -masm-verbose -munwind-tables -target-cpu penryn -target-linker-version 278.4 -v -dwarf-column-info -debugger-tuning=lldb -resource-dir /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.1.0 -stdlib=libstdc++ -Wall -std=c++11 -fdeprecated-macro -fdebug-compilation-dir /Users/worker -ferror-limit 19 -fmessage-length 166 -stack-protector 1 -fblocks -fobjc-runtime=macosx-10.12.0 -fencode-extended-block-signature -fcxx-exceptions -fexceptions -fmax-type-align=16 -fdiagnostics-show-option -fcolor-diagnostics -o /var/folders/18/m18t0kxx03d7__31kg3wrsr40000gq/T/main-337db7.o -x c++ main.cpp
clang -cc1 version 8.1.0 (clang-802.0.41) default target x86_64-apple-darwin16.7.0
ignoring nonexistent directory "/usr/include/c++/4.2.1/i686-apple-darwin10/x86_64"
ignoring nonexistent directory "/usr/include/c++/4.0.0"
ignoring nonexistent directory "/usr/include/c++/4.0.0/i686-apple-darwin8/"
ignoring nonexistent directory "/usr/include/c++/4.0.0/backward"
#include "..." search starts here:
#include <...> search starts here:
/usr/include/c++/4.2.1
/usr/include/c++/4.2.1/backward
/usr/local/include
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.1.0/include
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include
/usr/include
/System/Library/Frameworks (framework directory)
/Library/Frameworks (framework directory)
End of search list.
"/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld" -demangle -lto_library /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/libLTO.dylib -no_deduplicate -dynamic -arch x86_64 -macosx_version_min 10.12.0 -o main -L/Users/worker -lwrapper-lib -lshared-lib /var/folders/18/m18t0kxx03d7__31kg3wrsr40000gq/T/main-337db7.o -lstdc++ -lSystem /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/8.1.0/lib/darwin/libclang_rt.osx.a
Undefined symbols for architecture x86_64:
"_foo", referenced from:
_wrapper in libwrapper-lib.a(wrapper-lib.o)
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
You compile shared-lib.cpp with:
g++-6 -c -Wall -fpic -std=c++11 shared-lib.cpp
And you compile wrapper-lib.c with:
gcc-6 -c -Wall -fpic wrapper-lib.c
Have a look at the symbol table of shared-lib.o. It's something like:
$ readelf -s shared-lib.o
Symbol table '.symtab' contains 24 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS shared-lib.cpp
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
5: 0000000000000000 0 SECTION LOCAL DEFAULT 5
6: 0000000000000000 1 OBJECT LOCAL DEFAULT 5 _ZStL19piecewise_construc
7: 0000000000000000 1 OBJECT LOCAL DEFAULT 4 _ZStL8__ioinit
8: 0000000000000032 73 FUNC LOCAL DEFAULT 1 _Z41__static_initializati
9: 000000000000007b 21 FUNC LOCAL DEFAULT 1 _GLOBAL__sub_I_shared_lib
10: 0000000000000000 0 SECTION LOCAL DEFAULT 6
11: 0000000000000000 0 SECTION LOCAL DEFAULT 9
12: 0000000000000000 0 SECTION LOCAL DEFAULT 10
13: 0000000000000000 0 SECTION LOCAL DEFAULT 8
14: 0000000000000000 50 FUNC GLOBAL DEFAULT 1 _Z3foov
15: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _GLOBAL_OFFSET_TABLE_
16: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _ZSt4cerr
17: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _ZStlsISt11char_traitsIcE
18: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _ZSt4endlIcSt11char_trait
19: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _ZNSolsEPFRSoS_E
20: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _ZNSt8ios_base4InitC1Ev
21: 0000000000000000 0 NOTYPE GLOBAL HIDDEN UND __dso_handle
22: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _ZNSt8ios_base4InitD1Ev
23: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND __cxa_atexit
(I'm working on Ubuntu, not OS X.)
Note that there is only one global function defined in this object file and
its name is _Z3foov.
That's the mangled name of the C++ function called foo in shared-lib.cpp. That's
the name the linker sees.
Now the symbol table of wrapper-lib.o:
Symbol table '.symtab' contains 11 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000000 0 FILE LOCAL DEFAULT ABS wrapper-lib.c
2: 0000000000000000 0 SECTION LOCAL DEFAULT 1
3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
5: 0000000000000000 0 SECTION LOCAL DEFAULT 6
6: 0000000000000000 0 SECTION LOCAL DEFAULT 7
7: 0000000000000000 0 SECTION LOCAL DEFAULT 5
8: 0000000000000000 21 FUNC GLOBAL DEFAULT 1 wrapper
9: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _GLOBAL_OFFSET_TABLE_
10: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND foo
This object file makes an undefined reference to foo, because wrapper-lib.c
is a C source file and you compiled it as such. C does not mangle names. No definition
of foo is provided by any object file in your linkage, so it fails with that
symbol unresolved.
To avoid this and accomplish your linkage, you can direct the C++ compiler
not to mangle the name foo, when compiling shared-lib.cpp. You do so like:
shared-lib.cpp
#include <iostream>
using namespace std;

extern "C" {

void foo()
{
    cerr << "Hi from the shared lib" << endl;
}

} //extern "C"
Enclosing the definition of foo in extern "C" {...} has no effect on
C++ compilation except the one you want: the symbol foo will be emitted
as a C symbol; not mangled.
Having done that, you must of course follow suit in shared-lib.h:
shared-lib.h
#ifndef SHARED_LIB_H
#define SHARED_LIB_H
#ifdef __cplusplus
extern "C" {
#endif
void foo();
#ifdef __cplusplus
}
#endif
#endif
With those corrections, let's try again:
$ g++-6 -c -Wall -fpic -std=c++11 shared-lib.cpp
and check the symbol table:
$ readelf -s shared-lib.o | grep foo
14: 0000000000000000 50 FUNC GLOBAL DEFAULT 1 foo
Now the one global function defined is foo, not _Z3foov, and your
linkage will succeed.
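The failure and the fix can be mimicked with a toy symbol-resolution model in Python. This is only a sketch: mangle_nullary and unresolved are invented helpers, the mangling rule covers nothing beyond a global-namespace function with no parameters, and the model leaves out the standard-library references and the leading underscore that macOS adds to symbol names.

```python
def mangle_nullary(name):
    """Itanium C++ ABI mangled name of a global-namespace function that takes
    no arguments, e.g. void foo() -> _Z3foov. Anything else is out of scope."""
    return "_Z{}{}v".format(len(name), name)

def unresolved(objects):
    """objects: list of (defined symbols, referenced symbols), one per object
    file. Returns the references that no object file defines."""
    defined = set()
    referenced = set()
    for defs, refs in objects:
        defined |= set(defs)
        referenced |= set(refs)
    return sorted(referenced - defined)

wrapper_o = (["wrapper"], ["foo"])              # wrapper-lib.c, compiled as C
shared_cpp = ([mangle_nullary("foo")], [])      # shared-lib.cpp as first written
shared_extern_c = (["foo"], [])                 # shared-lib.cpp with extern "C"

print(mangle_nullary("foo"))                    # _Z3foov
print(unresolved([wrapper_o, shared_cpp]))      # ['foo']
print(unresolved([wrapper_o, shared_extern_c])) # []
```

The first linkage leaves foo unresolved because the only definition on offer is named _Z3foov; once the definition is emitted under the unmangled name, nothing is left over.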
If you want to write a C++ library that exports a C++ API and not a C API to
the linker, then you cannot call its API from C except by discovering the
mangled names of the API (with readelf, nm, objdump) and explicitly
calling those mangled names from C. Thus without those extern "C" fixes,
your linkage would also succeed with:
wrapper-lib.c
extern void _Z3foov(void);

int wrapper()
{
    _Z3foov();
    return 123;
}

Cannot view std::string when compiled with clang

g++ (GCC) 5.2.0
clang version 3.7.1 (tags/RELEASE_371/final)
GNU gdb (GDB) 7.12
GDB is unable to locate the definition of std::string when the program is compiled with clang, for some reason. I have custom-compiled and built gcc and clang, as CentOS 6.5 comes with an older version of gcc.
Example code
#include <string>

int main()
{
    std::string s("This is a string");

    return 0;
}
Compile with g++ and debug - works just fine
[~]$ g++ -ggdb3 -std=c++14 stl.cpp
[~]$ gdb a.out
GNU gdb (GDB) 7.12
Reading symbols from a.out...done.
(gdb) break main
Breakpoint 1 at 0x400841: file stl.cpp, line 5.
(gdb) r
Starting program: /home/vagrant/a.out
Breakpoint 1, main () at stl.cpp:5
5 std::string s("This is a string");
(gdb) n
7 return 0;
(gdb) p s
$1 = {static npos = <optimized out>,
_M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0x612c20 "This is a string"}, _M_string_length = 16, {
_M_local_buf = "\020\000\000\000\000\000\000\000\300\b#\000\000\000\000", _M_allocated_capacity = 16}}
(gdb)
Check that it is linked against my RPM-built version of libstdc++ and not the system one
[~]$ ldd a.out
linux-vdso.so.1 => (0x00007ffd709e0000)
libstdc++.so.6 => /opt/spotx-gcc/lib64/libstdc++.so.6 (0x00007f29318fa000)
libm.so.6 => /lib64/libm.so.6 (0x00007f2931676000)
libgcc_s.so.1 => /opt/spotx-gcc/lib64/libgcc_s.so.1 (0x00007f293145f000)
libc.so.6 => /lib64/libc.so.6 (0x00007f29310cb000)
/lib64/ld-linux-x86-64.so.2 (0x00007f2931c93000)
[~]$ objdump -T -C a.out
a.out: file format elf64-x86-64
DYNAMIC SYMBOL TABLE:
0000000000000000 w D *UND* 0000000000000000 __gmon_start__
0000000000000000 w D *UND* 0000000000000000 _Jv_RegisterClasses
0000000000000000 DF *UND* 0000000000000000 GLIBC_2.2.5 __libc_start_main
0000000000000000 w D *UND* 0000000000000000 _ITM_deregisterTMCloneTable
0000000000000000 w D *UND* 0000000000000000 _ITM_registerTMCloneTable
0000000000000000 DF *UND* 0000000000000000 GLIBCXX_3.4.21 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()
0000000000000000 DF *UND* 0000000000000000 GLIBCXX_3.4 std::allocator<char>::~allocator()
0000000000000000 DF *UND* 0000000000000000 GLIBCXX_3.4.21 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)
0000000000000000 DF *UND* 0000000000000000 GLIBCXX_3.4 std::allocator<char>::allocator()
0000000000000000 DF *UND* 0000000000000000 GCC_3.0 _Unwind_Resume
0000000000400700 DF *UND* 0000000000000000 CXXABI_1.3 __gxx_personality_v0
All looks good. Now if I try the same with clang:
[~]$ clang++ -std=c++14 -g stl.cpp
[~]$ gdb a.out
GNU gdb (GDB) 7.12
Reading symbols from a.out...done.
(gdb) break main
Breakpoint 1 at 0x400853: file stl.cpp, line 5.
(gdb) r
Starting program: /home/vagrant/a.out
Breakpoint 1, main () at stl.cpp:5
5 std::string s("This is a string");
(gdb) n
7 return 0;
(gdb) p s
$1 = <incomplete type>
(gdb)
Now I get an incomplete type, but the same libraries are being used:
[~]$ ldd a.out
linux-vdso.so.1 => (0x00007fff5352d000)
libstdc++.so.6 => /opt/spotx-gcc/lib64/libstdc++.so.6 (0x00007f76b4023000)
libm.so.6 => /lib64/libm.so.6 (0x00007f76b3d9f000)
libgcc_s.so.1 => /opt/spotx-gcc/lib64/libgcc_s.so.1 (0x00007f76b3b88000)
libc.so.6 => /lib64/libc.so.6 (0x00007f76b37f4000)
/lib64/ld-linux-x86-64.so.2 (0x00007f76b43bc000)
[~]$ objdump -T -C a.out
a.out: file format elf64-x86-64
DYNAMIC SYMBOL TABLE:
0000000000000000 w D *UND* 0000000000000000 __gmon_start__
0000000000000000 w D *UND* 0000000000000000 _Jv_RegisterClasses
0000000000000000 DF *UND* 0000000000000000 GLIBC_2.2.5 __libc_start_main
0000000000000000 w D *UND* 0000000000000000 _ITM_deregisterTMCloneTable
0000000000000000 w D *UND* 0000000000000000 _ITM_registerTMCloneTable
0000000000000000 DF *UND* 0000000000000000 GLIBCXX_3.4.21 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()
0000000000000000 DF *UND* 0000000000000000 GLIBCXX_3.4 std::allocator<char>::~allocator()
0000000000000000 DF *UND* 0000000000000000 GLIBCXX_3.4.21 std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)
0000000000000000 DF *UND* 0000000000000000 GLIBCXX_3.4 std::allocator<char>::allocator()
0000000000000000 DF *UND* 0000000000000000 GCC_3.0 _Unwind_Resume
0000000000400700 DF *UND* 0000000000000000 CXXABI_1.3 __gxx_personality_v0
Does anyone have any advice on where to look, or on something that I've missed? Both compilers were bootstrapped when I built them and everything seems fine; it just appears that std::string is not defined when using clang.
As ks1322 mentioned, this is because clang has decided not to emit debug information for libstdc++.
You can force clang to do so by providing the following flag:
-D_GLIBCXX_DEBUG
I would only pass the flag for debug builds. If debug is the default and release builds are a special target, filter it out for that target:
release: CXXFLAGS := $(filter-out -D_GLIBCXX_DEBUG,$(CXXFLAGS)) -O2
This has fixed the same problem for me.
The last workaround mentioned in bug 24202 as linked by ks1322 is worth having a look at:
-fno-limit-debug-info will make your debug info larger, slow link (if you're not using -gsplit-dwarf) and debugger performance. But, yes, will address this.
Using -fno-limit-debug-info forces Clang to emit debug information for e.g. std::string at the cost of a larger binary while preserving compatibility with other libraries and the rest of the system/SDK.
As ks1322 and Kevin mentioned, one can instead use -D_GLIBCXX_DEBUG to switch libstdc++ into debug mode but this comes at a heavy price: any library you link against and with which you exchange STL containers (string, vector, etc.) must also be built with -D_GLIBCXX_DEBUG. Meaning: your system/SDK must either support this with a separate set of libraries or you will have to rebuild them yourself.
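The -fno-limit-debug-info route described above can be sketched as two small shell helpers. This is only an illustration: the function names rebuild_full_debug and print_s_in_gdb are hypothetical, CXX and GDB defaulting to clang++ and gdb is an assumption, and the gdb steps simply replay the session from the question in batch mode.

```shell
# Hypothetical helper (not from the thread): rebuild one source file with
# clang so that debug info for library types such as std::string is emitted.
# $1 = source file, $2 = output name (defaults to a.out)
rebuild_full_debug() {
    "${CXX:-clang++}" -std=c++14 -g -fno-limit-debug-info "$1" -o "${2:-a.out}"
}

# Hypothetical helper: replay the gdb steps from the question without
# an interactive prompt (break in main, run, step once, print s).
# $1 = program to debug (defaults to ./a.out)
print_s_in_gdb() {
    "${GDB:-gdb}" -batch -ex 'break main' -ex 'run' -ex 'next' -ex 'print s' "${1:-./a.out}"
}
```

Usage would be along the lines of: rebuild_full_debug stl.cpp && print_s_in_gdb. If the flag takes effect, the print should show the string contents instead of <incomplete type>.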
I've reproduced this issue on Fedora with system clang.
It appears that clang is not emitting debug information for std::string because it was told that libstdc++ provides it. See this comment from bug 24202:
Looks like you don't have debug information for libstdc++ installed:
Missing separate debuginfos, use: dnf debuginfo-install libgcc-5.1.1-4.fc22.x86_64 libstdc++-5.1.1-4.fc22.x86_64
Clang is not emitting debug information for std::string because it was
told that libstdc++ provides it (but in your case, it's not
installed); this is a debug size optimization that GCC apparently
doesn't perform.
Does this work if you install the debug information for libstdc++?
I've installed debug info for libstdc++ with the command dnf debuginfo-install libstdc++-6.2.1-2.fc25.x86_64 and that resolved the issue.
clang trusts that debugging symbols for libstdc++ are available, so you have to install them. See ks1322's answer for how to do that on Fedora. On Ubuntu, run:
sudo apt-get install libstdc++6-dbgsym
After that, things will just work.
Do not define _GLIBCXX_DEBUG, since that will break libstdc++'s ABI.
-fno-limit-debug-info will make clang emit debug info that's larger than necessary, so I'd advise against that too. Just install the debug info package for libstdc++.
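To check whether gdb actually finds debug information for libstdc++, before or after installing the package, one option is gdb's info sharedlibrary command, which marks libraries that lack debug information with (*). A minimal sketch; the helper name show_lib_debug_status is hypothetical and GDB defaulting to gdb is an assumption:

```shell
# Hypothetical helper: list the shared libraries gdb loaded for a program.
# In the "Syms Read" column, rows marked "(*)" are libraries for which
# gdb found no debugging information.
# $1 = program to inspect (defaults to ./a.out)
show_lib_debug_status() {
    "${GDB:-gdb}" -batch -ex 'break main' -ex 'run' -ex 'info sharedlibrary' "${1:-./a.out}"
}
```

If the libstdc++.so.6 row still carries the (*) marker, gdb did not pick up the separate debug info.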
For me, -fno-limit-debug-info is the real solution for clang / CLion.
_GLIBCXX_DEBUG causes link errors with some libraries.
clang version 10.0.0-4ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
CLion 2022.1.3
Build #CL-221.5921.27, built on June 21, 2022

Remove unused section using objcopy

Suppose I have the following source file:
// file.c:
void f() {}
void g() {}
I compile it into an object file using gcc -ffunction-sections:
$ gcc -c -ffunction-sections file.c -o file.o
# It now has at least two sections: `.text.f' (with `f'), `.text.g' (with `g').
Then I try to remove the section .text.g (with g) from the object file:
$ objcopy --remove-section .text.g file.o
objcopy: stQOLAU8: symbol `.text.g' required but not present
objcopy:stQOLAU8: No symbols
So, is there a way to remove a function-specific section from an object file (compiled with -ffunction-sections)?
Extra info:
The full list of symbols in file.o is:
$ objdump -t file.o
file.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 file.c
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l d .text.f 0000000000000000 .text.f
0000000000000000 l d .text.g 0000000000000000 .text.g
0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000000 g F .text.f 0000000000000007 f
0000000000000000 g F .text.g 0000000000000007 g
My goal is to eliminate some sections from the object file, similar to what ld --gc-sections does.
Or is there some theoretical reason why such a task is entirely outside the scope of objcopy and can only be performed with ld -r?
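The question already names ld -r as the alternative. A minimal sketch of that route, assuming GNU ld: its manual says --gc-sections may be combined with -r as long as the symbols to keep are named explicitly through --entry, --undefined (-u) or --gc-keep-exported. The helper name gc_object_sections and the output name file_gc.o are hypothetical, and this has not been run against the file.o shown above.

```shell
# Hypothetical helper: partially link an object file (-r) while discarding
# input sections that are not reachable from the given root symbol.
# $1 = root symbol to keep, $2 = input object, $3 = output object
gc_object_sections() {
    "${LD:-ld}" -r --gc-sections -u "$1" "$2" -o "$3"
}
```

For example, gc_object_sections f file.o file_gc.o treats f as the root; objdump -t file_gc.o would then show whether .text.g was discarded.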

Which library to link on cygwin for getnetbyname?

I am trying to track down a larger problem and here is the simplified test case.
#include <netdb.h>
int main(int argc, char** argv)
{
getnetbyname("localhost");
return 0;
}
I compile as:
$ gcc -c -Werror -Wall foo.c
$ gcc foo.o
foo.o:foo.c:(.text+0x16): undefined reference to `getnetbyname'
collect2: error: ld returned 1 exit status
$ gcc foo.o -llwres
foo.o:foo.c:(.text+0x16): undefined reference to `getnetbyname'
collect2: error: ld returned 1 exit status
$ gcc foo.o -lwsock32
foo.o:foo.c:(.text+0x16): undefined reference to `getnetbyname'
collect2: error: ld returned 1 exit status
$ gcc foo.o -lmswsock
foo.o:foo.c:(.text+0x16): undefined reference to `getnetbyname'
collect2: error: ld returned 1 exit status
$ gcc foo.o -lamIcrazy
/usr/lib/gcc/i686-pc-cygwin/4.8.3/../../../../i686-pc-cygwin/bin/ld: cannot find -lamIcrazy
collect2: error: ld returned 1 exit status
I am not sure where to go from here. I am pretty sure Perl uses this reference, but I cannot follow its build (yet). gcc foo.o works on CentOS 6.
Here are the .a files that contain the getnetbyname symbol:
Binary file /usr/lib/perl5/5.14/i686-cygwin-threads-64int/CORE/libperl.a matches
Binary file /usr/lib/w32api/libmswsock.a matches
Binary file /usr/lib/w32api/libwsock32.a matches
$ nm /usr/lib/w32api/libmswsock.a --demangle | grep -B 10 getnetbyname
dqsls00019.o:
00000000 b .bss
00000000 d .data
00000000 i .idata$4
00000000 i .idata$5
00000000 i .idata$6
00000000 i .idata$7
00000000 t .text
U _head_lib32_libmswsock_a
00000000 I _imp__getnetbyname#4
00000000 T getnetbyname#4
$ nm /usr/lib/w32api/libwsock32.a --demangle | grep -B 10 getnetbyname
duegs00043.o:
00000000 b .bss
00000000 d .data
00000000 i .idata$4
00000000 i .idata$5
00000000 i .idata$6
00000000 i .idata$7
00000000 t .text
U _head_lib32_libwsock32_a
00000000 I _imp__getnetbyname#4
00000000 T getnetbyname#4
It looks like it is not implemented in Cygwin, per https://cygwin.com/cygwin-api/std-notimpl.html.
I must have misunderstood the exports from libwsock32.a and libmswsock.a.
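One way to see why a library that "matches" still does not satisfy the reference is to compare the exact symbol name foo.o asks for with the names the archives define, including any leading underscore or suffix. A sketch assuming binutils nm; the helper names wanted_symbol and defined_symbol are hypothetical:

```shell
# Hypothetical helper: show the undefined symbols in an object file
# whose names contain the given text.
# $1 = text to search for, $2 = object file
wanted_symbol() {
    "${NM:-nm}" --undefined-only "$2" | grep -- "$1"
}

# Hypothetical helper: show matching symbols that are actually defined,
# prefixed with the archive and member they come from (-A).
# $1 = text to search for, remaining arguments = archives or objects
defined_symbol() {
    name=$1
    shift
    "${NM:-nm}" -A --defined-only "$@" | grep -- "$name"
}
```

For example, compare the output of wanted_symbol getnetbyname foo.o with that of defined_symbol getnetbyname /usr/lib/w32api/libwsock32.a /usr/lib/w32api/libmswsock.a. The linker only resolves the reference if the names are identical.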

Resources