xcode ld detect duplicate symbol in static libraries - xcode

This question has been asked previously for gcc, but Darwin's ld (clang?) appears to handle this differently.
Say I have a main() function defined in two files, main1.cc and main2.cc. If I attempt to compile these both together I'll get (the desired) duplicate symbol error:
$ g++ -o main1.o -c main1.cc
$ g++ -o main2.o -c main2.cc
$ g++ -o main main1.o main2.o
duplicate symbol _main in:
main1.o
main2.o
ld: 1 duplicate symbol for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
But if I instead stick one of these into a static library, when I go to link the application I won't get an error:
$ ar rcs libmain1.a main1.o
$ g++ -o main libmain1.a main2.o
(no error)
With gcc you can wrap the library in --whole-archive, and GNU ld will then produce an error. This option is not available in the ld that ships with Xcode.
Is it possible to get ld to print an error?

I'm sure you know that you're not supposed to put an object file
containing a main function in a static library. In case any of our readers
don't: a library is for containing functions that may be reused by many programs.
A program can contain only one main function, and the likelihood
is negligible that the main function of one program will be reusable as the main
function of another. So main functions don't go in libraries. (There are a few odd exceptions to this rule.)
On then to the problem you're worried about. For simplicity,
I'll exclude linkage of shared/dynamic libraries from consideration in the rest of this.
Your linker detects a duplicate symbol error (a.k.a. multiple definition error)
in the linkage when the competing definitions are in different input object files
but doesn't detect it when one definition is in an input object file and the other
is in an input static library. In that scenario, the GNU linker can detect
the multiply defined symbol if it is passed the --whole-archive option before
the static library. But your linker, the Darwin Mach-O linker,
doesn't have that option.
Note that while your linker doesn't support --whole-archive, it has an
equivalent option -all_load. But don't run away with that, because the worry is groundless anyhow. For both linkers:
There really is a multiple definition error in the linkage in the [foo.o ...
bar.o] case.
There really is not a multiple definition error in the linkage in the [foo.o ... libbar.a] case.
And in addition for the GNU linker:
There really is a multiple definition error in the linkage in the
[foo.o ... --whole-archive libbar.a] case.
In no case does either linker allow multiple definitions of a symbol to
get into your program undetected and arbitrarily use one of them.
What's the difference between linking foo.o and linking libfoo.a?
The linker will only add object files to your program.
More precisely, when it meets an input file foo.o, it adds to your program
all the symbol references and symbol definitions from foo.o. (For starters
at least: it may finally discard unused definitions if you've requested that,
and if it can do so without collaterally discarding any used ones).
A static library is just a bag of object files. When the linker meets an input file
libfoo.a, by default it won't add any of the object files in the bag to
your program.
It will only inspect the contents of the bag if it has to, at that point in the linkage.
It will have to inspect the contents of the bag if it has already added
some symbol references to your program that don't have definitions. Those
unresolved symbols might be defined in some of the object files in the bag.
If it has to look in the bag, then it will inspect the object files to
see if any of them contain definitions of unresolved symbols already in the
program. If there are any such object files then it will add them to the program and consider afresh whether it needs to keep looking in the bag. It stops looking in the bag when it finds no more object files in it that the program needs or has found definitions for all symbols referenced by the program, whichever comes first.
If any object files in the bag are needed, this adds at least one more symbol
definition to your program, and possibly more unresolved symbols. Then the linker carries on.
Once it has met libfoo.a and considered which, if any, object files in that bag it needs for your program,
it won't consider it again, unless it meets it again, later in the linkage
sequence.
So...
Case 1. The input files contain [foo.o ... bar.o]. Both foo.o and bar.o
define symbol A. Both object files must be linked, so both definitions of A must
be added to the program and that is a multiple definition error. Both linkers detect it.
Case 2. The input files contain [foo.o ... libbar.a].
libbar.a contains object files a.o and b.o.
foo.o defines symbol A and references, but does not define, symbol B.
a.o also defines A but does not define B, and defines no other symbols
that are referenced by foo.o.
b.o defines symbol B.
Then:-
At foo.o, the object file must be linked. The linker adds the
definition of A and an unresolved reference to B to the program.
At libbar.a, the linker needs a definition for unresolved reference B so it looks in the bag.
a.o does not define B or any other unresolved symbol. It is not linked. The second definition of A is not added.
b.o defines B, so it is linked. The definition of B is added to the program.
The linker carries on.
No two object files that both define A are needed in the program. There is no
multiple definition error.
Case 3. The input files contain [foo.o ... libbar.a].
libbar.a contains object files a.o and b.o.
foo.o defines symbol A. It references, but does not define, symbols B and C.
a.o also defines A and it defines B, and defines no other symbols
that are referenced by foo.o.
b.o defines symbol C.
Then:-
At foo.o, the object file is linked. The linker adds to the program the definition of A and unresolved references to B and C.
At libbar.a, the linker needs definitions for unresolved references B
and C so it looks in the bag.
a.o does not define C. But it does define B. So a.o is linked. That adds the required definition of B, plus the not-required, surplus definition of A.
That is a multiple definition error. Both linkers detect it. Linkage ends.
There is a multiple definition error if and only if two definitions
of some symbol are contained in object files that are linked in the program. Object files from a static library are linked only to provide definitions of symbols that the program references. If there is
a multiple definition error, then both linkers detect it.
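The rules above can be tried out on a toy model. What follows is a rough Python sketch, not how either linker is actually implemented: the Obj type, the link function and the DuplicateSymbol exception are names invented here, object files are reduced to two sets of symbol names, and a static library is represented as a plain list of Obj.

```python
from dataclasses import dataclass, field


@dataclass
class Obj:
    """A toy object file: the symbols it defines and the ones it references."""
    name: str
    defs: set = field(default_factory=set)
    refs: set = field(default_factory=set)


class DuplicateSymbol(Exception):
    pass


def link(inputs):
    """Process inputs left to right. An Obj is always linked; a list of Obj
    stands for a static library and contributes only the members needed."""
    defined = {}       # symbol -> name of the object that defined it
    undefined = set()  # referenced so far but not yet defined
    linked = []

    def add(obj):
        for sym in sorted(obj.defs):
            if sym in defined:
                raise DuplicateSymbol(f"{sym} in {defined[sym]} and {obj.name}")
            defined[sym] = obj.name
        undefined.difference_update(obj.defs)
        undefined.update(s for s in obj.refs if s not in defined)
        linked.append(obj.name)

    for item in inputs:
        if isinstance(item, Obj):
            add(item)
            continue
        # A library: keep rescanning while some member resolves something.
        pending = list(item)
        progress = True
        while progress and undefined:
            progress = False
            for member in list(pending):
                if member.defs & undefined:
                    pending.remove(member)
                    add(member)
                    progress = True
    return linked


# Case 2: a.o also defines A, but it is never pulled out of the library.
foo = Obj("foo.o", {"A"}, {"B"})
libbar = [Obj("a.o", {"A"}), Obj("b.o", {"B"})]
print(link([foo, libbar]))  # ['foo.o', 'b.o']
```

Feeding the same function the inputs of Case 1 (two Obj that both define A) or Case 3 (a.o defines both A and B, and foo.o references B) raises DuplicateSymbol, because in both cases two objects defining A really are linked.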
So why does the GNU linker option --whole-archive give different outcomes?
Suppose that libbar.a contains a.o and b.o. Then:
foo.o --whole-archive -lbar
tells the linker to link all the object files in libbar.a whether
they are needed or not. So this fragment of the linkage command is simply equivalent
to:
foo.o a.o b.o
Thus in case 2 above, the addition of --whole-archive is a way of
creating a multiple definition error where there is none without it. Not
a way of detecting a multiple definition error that was not detected without
it.
And if --whole-archive is mistakenly used as a way of "detecting" fictitious
multiple definition errors, then in those cases where the linkage nevertheless
succeeds, it is also a way of adding an unlimited amount of redundant code
to the program. The same goes for the -all_load option of the Mach-O linker.
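One way to picture what --whole-archive (or -all_load) does is that it flattens the library into its member list before the ordinary rules apply. Here is a small Python sketch under that assumption; the function names and the dict representation of an object file are made up for illustration.

```python
from collections import Counter


def flatten_whole_archive(inputs):
    """Replace every library (a list of members) by its members, in place in
    the sequence, which is effectively what --whole-archive / -all_load do."""
    flat = []
    for item in inputs:
        flat.extend(item if isinstance(item, list) else [item])
    return flat


def duplicate_definitions(objects):
    """Symbols defined by more than one unconditionally linked object."""
    counts = Counter(sym for obj in objects for sym in obj["defs"])
    return sorted(sym for sym, n in counts.items() if n > 1)


# Case 2 again: foo.o defines A; libbar.a holds a.o (defines A) and b.o (defines B).
foo = {"name": "foo.o", "defs": {"A"}}
libbar = [{"name": "a.o", "defs": {"A"}}, {"name": "b.o", "defs": {"B"}}]

objects = flatten_whole_archive([foo, libbar])
print([o["name"] for o in objects])    # ['foo.o', 'a.o', 'b.o']
print(duplicate_definitions(objects))  # ['A']
```

The duplicate on A appears only because a.o has been forced into the linkage; nothing in the program needed it.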
Not satisfied?
Even when all that is clear, maybe you still hanker for some way to make it
an error when an input object file in your linkage defines a symbol that
is also defined in another object file that is not needed by the linkage but
happens to be contained in some input static library.
Well, that might be a situation that you want to know about, but it just
isn't any kind of linkage error, multiple-definition or otherwise. The purpose
of static libraries in linkage is to provide default definitions of symbols
that you don't define in the input object files. Provide your own definition
in an object file and the library default is ignored.
If you don't want linkage to work like that - the way it is intended to work -
but:-
You still want to use a static library
You don't want any definition from an input object file ever to prevail over
one that's in a member of the static library
You don't want to link any redundant object files.
then the simplest solution (though not necessarily the least time-consuming at build time)
is this:
In your project build extract all the members of the static library as a
prerequisite of the link step in a manner that also gives you the list of
their names, e.g.:
$ LIBFOOBAR_OBJS=`ar xv libfoobar.a | sed 's/x - //'g`
$ echo $LIBFOOBAR_OBJS
foo.o bar.o
(But extract them someplace where they cannot clobber any object files you build). Then, again before the link step, run a preliminary throw-away
linkage in which $LIBFOOBAR_OBJS replaces libfoobar.a. E.g.,
instead of
cc -o prog x.o y.o z.o ... -lfoobar ...
run
cc -o deleteme x.o y.o z.o ... $LIBFOOBAR_OBJS ...
If the preliminary linkage fails - with a multiple definition error or
anything else - then stop there. Otherwise go ahead with the real linkage.
You won't link any redundant object files in prog. The price is performing
a linkage of deleteme that is redundant unless it fails with a multiple
definition error [1].
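If you would rather drive that pre-flight check from a build script, it might look roughly like the Python sketch below. The helper names are invented; it assumes ar and cc are on the PATH and that `ar xv` prints lines of the form "x - foo.o" (the same assumption the sed command above makes). It leaves out the other libraries and flags that the "..." in the commands above stands for.

```python
import os
import subprocess
import tempfile


def members_from_ar_output(text):
    """Parse `ar xv` output, whose lines look like 'x - foo.o'."""
    prefix = "x - "
    return [line[len(prefix):] for line in text.splitlines()
            if line.startswith(prefix)]


def preflight_argv(objects, members, output="deleteme", cc="cc"):
    """Command line for the throw-away linkage: the library members are
    passed as plain object files, so every one of them is linked."""
    return [cc, "-o", output, *objects, *members]


def preflight(objects, archive):
    """Return True if the throw-away linkage succeeds."""
    archive = os.path.abspath(archive)
    objects = [os.path.abspath(o) for o in objects]
    with tempfile.TemporaryDirectory() as scratch:
        # Extract in a scratch directory so nothing in the build tree is clobbered.
        out = subprocess.run(["ar", "xv", archive], cwd=scratch, check=True,
                             capture_output=True, text=True).stdout
        members = [os.path.join(scratch, m) for m in members_from_ar_output(out)]
        argv = preflight_argv(objects, members, os.path.join(scratch, "deleteme"))
        return subprocess.run(argv).returncode == 0
```

A build would call something like preflight(["x.o", "y.o", "z.o"], "libfoobar.a") and go on to the real linkage only if it returns True.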
In professional practice, nobody runs builds like that to head off the
remote possibility that a programmer has defined a function in
one of x.o y.o z.o that knocks out a function defined in a member of
libfoobar.a without meaning to. Competence and code-review are
counted on to avoid that, in the same way they are counted on to avoid
a programmer defining a function in x.o y.o z.o to do anything that
should be done using library resources.
[1] Rather than extracting all the object files from the static
library for use in the throw-away linkage, you might consider a
throwaway linkage using --whole-archive, with the GNU linker,
or -all_load, with the Mach-O linker. But there are potential pitfalls with
this approach I won't delve into here.

Related

Override GCC linker symbols in c code using weak declaration

I am building an ELF target. I have a linker script in which I specify some symbol locations, as below (these symbols are defined in a different location, such as ROM, whose addresses are given here):
A = 0x12345678;
B = 0x1234567c;
D = 0x1234568c;
In the C code I can use these variables A and B without declaring them which is expected.
I want to know whether I can override the symbol D, i.e., whether my current executable can have its own definition of D, in which case the linker should ignore the D from the script. Is there a way to declare the symbols in a linker script as 'weak', so that the linker uses the script's symbols only if they are not defined in any of the linked objects?
Use the PROVIDE directive:
PROVIDE(D = 0x1234568c);
From the ld documentation:
In some cases, it is desirable for a linker script to define a symbol only if it is referenced and is not defined by any object included in the link.
…
If, on the other hand, the program defines … the linker will silently use the definition in the program.
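The rule in that quotation is small enough to model directly. This is only an illustrative Python sketch of the documented behaviour, not of ld's internals: the resolve function is invented, and the address 0x8000 for the program's own D is a made-up value.

```python
def resolve(object_defs, object_refs, provided):
    """object_defs: symbol -> address, from the linked object files.
    object_refs: set of symbols referenced by the linked object files.
    provided:    symbol -> address, from PROVIDE(...) in the linker script.
    Returns the final symbol table."""
    table = dict(object_defs)
    for sym, addr in provided.items():
        # PROVIDE applies only if the symbol is referenced and nobody defined it.
        if sym in object_refs and sym not in object_defs:
            table[sym] = addr
    return table


def show(table):
    return {sym: hex(addr) for sym, addr in table.items()}


script = {"D": 0x1234568C}

# The program defines its own D (hypothetical address): the program wins.
print(show(resolve({"D": 0x8000}, {"D"}, script)))  # {'D': '0x8000'}
# The program only references D: the script's value is used.
print(show(resolve({}, {"D"}, script)))             # {'D': '0x1234568c'}
# Nobody references D: it is not defined at all.
print(show(resolve({}, set(), script)))             # {}
```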

GCC - how to tell linker not to skip unused sections

My problem is following:
I am trying to write an embedded application, which must have its own linker script supplied (using the arm-none-eabi-gcc compiler/linker).
The embedded bootloader loads the binary and starts it at address 0x8000, which is why I need a dedicated linker script that lets me put the desired startup function at this address. The script is as follows:
MEMORY
{
ram : ORIGIN = 0x8000, LENGTH = 0x1000
}
SECTIONS
{
.start : { *(.start) } > ram
.text : { *(.text*) } > ram
.bss : { *(.bss*) } > ram
}
With this in place, I want a function that is placed in the .start section, so that it sits at the very beginning, at 0x8000. For this I use the following function in my library:
__attribute__((section(".start"))) void notmain() {
main();
}
This seems to work fine, but later I link this library, which contains the function notmain, with the project, which defines the main() function. During linking I can see that the .start section no longer exists and the notmain symbol
is missing entirely. When I move the notmain function out of the library (into the project), it's all fine.
My understanding is that the linker sees that the .start section is not used at all in my application, which makes it skip the whole section. I already tried adding several attributes to the function notmain, such as __attribute__((used)) and __attribute__((externally_visible)), but that did not work either (notmain is still missing from the final binary).
The CMake source is as follows:
** Project **
project(AutomaticsControlExample)
enable_language(ASM)
set(CMAKE_CXX_STANDARD 14)
set(SOURCES main.cpp PID.hpp)
set(DEPENDENCIES RPIRuntime PiOS)
add_executable(${PROJECT_NAME} ${SOURCES})
target_link_libraries(${PROJECT_NAME} ${DEPENDENCIES})
add_custom_command(TARGET ${PROJECT_NAME} POST_BUILD
COMMAND ${CMAKE_OBJDUMP} -D ${PROJECT_NAME}
COMMAND ${CMAKE_OBJDUMP} -D ${PROJECT_NAME} > ${PROJECT_NAME}.list
COMMAND ${CMAKE_OBJCOPY} ${PROJECT_NAME} -O binary ${PROJECT_NAME}.bin
COMMAND ${CMAKE_OBJCOPY} ${PROJECT_NAME} -O ihex ${PROJECT_NAME}.hex)
** Library **
project(RPIRuntime)
enable_language(ASM)
set(CMAKE_CXX_STANDARD 14)
set(LINKER_SCRIPT memmap)
set(LINKER_FLAGS "-T ${CMAKE_CURRENT_SOURCE_DIR}/${LINKER_SCRIPT}")
set(SOURCES
notmain.cpp
assert.cpp)
add_library(${PROJECT_NAME} STATIC ${SOURCES})
target_link_libraries(${PROJECT_NAME} ${LINKER_FLAGS})
My question is: is there any way to prevent the linker from omitting the .start section?
As you know, a static library is an ar archive of object files.
Suppose libfoobar.a contains just foo.o and bar.o. A linkage:
g++ -o prog a.o foo.o bar.o # A
is not the same as the linkage:
g++ -o prog a.o -lfoobar # B
The linker unconditionally consumes every object file in the linkage sequence,
so in case A, it links a.o, foo.o, bar.o in prog.
The linker does not unconditionally consume every object file that is a member of
a static library in the linkage sequence. A static library is a way of offering to
the linker a bunch of object files from which to pick the ones it needs.
Suppose that a.o calls function foo, which is defined in foo.o, and that
a.o references nothing defined in bar.o.
In that case, the linker unconditionally links a.o into prog, after which
prog contains an undefined reference to foo, for which the linker needs a
definition. Next it reaches libfoobar.a and inspects the archive (by its index,
normally) to see if any member of the archive defines foo. It finds that foo.o does
so. So it extracts foo.o from the archive and links it. It needs no definitions
for any symbols defined in bar.o, so bar.o is not added to the linkage. The
linkage B is exactly the same as:
g++ -o prog a.o foo.o
Suppose on the other hand that a.o calls bar, which is defined in bar.o,
and references nothing defined in foo.o. In that case, the linkage B is
exactly the same as:
g++ -o prog a.o bar.o
So an object file that you insert into a static library for linkage with
your executable will never be linked, by default, unless it provides a definition
for at least one symbol that is referenced, but not defined, in an object file
that has already been linked.
Your function notmain is not referenced in the only object file, main.o, that
you are explicitly linking in your program. Therefore, when main.o is linked into your program,
the program contains no undefined reference to notmain: the linker requires no definition
of notmain - it has never heard of notmain - and will not link any object file
from within a static library to obtain a definition of notmain. This has nothing
to do with linkage sections.
When linking an ordinary program with static libraries, as a matter of course
you do it like:
g++ -o prog main.o x.o ... -ly -lz ....
where one of the *.o files - say main.o - is the object file that defines the main function. You never
put main.o in one of the static libraries. That's because, in an ordinary program,
main is not called in any of the other object files you are explicitly linking,
so if main.o was in one of your libraries, the linkage:
g++ -o prog x.o ... -ly -lz ...
would have no need to find a definition of main at any of -ly -lz ..., and no definition
of main would be linked.
The case is just the same with your notmain. If you want it linked you can do one of:-
1. Add -Wl,--undefined=notmain to your linkage options (replacing notmain with
the mangled name of notmain, for C++). This will make the linker assume it has an
undefined reference to notmain even though it hasn't seen any.
2. Add the command EXTERN(notmain) to your linker script (again with mangling
for C++). This is equivalent to 1.
3. Explicitly link an object file that defines notmain. Don't put it in a static library.
3 is effectively what you did when you discovered that:
When I move the notmain function out of the library (into the project), it's all fine.
For 3, however, you don't need to compile notmain.cpp in your project and any other
project that needs notmain.o. You can build it independently, install it
in /usr/local/lib and explicitly add /usr/local/lib/notmain.o to the
linkage of your project. That would be following the example of GCC itself, which explicitly
links the crt*.o startup files of an ordinary program just by appending their
absolute names to the linkage, e.g.
/usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/crti.o
...
/usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/crtn.o
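To see why options 1 and 2 work, here is a toy Python model of how the set of undefined symbols decides which library members get extracted. It is a sketch only: members_pulled is an invented helper, the library contents are guessed from the question's source list, assert_failed is a made-up symbol name, and plain names are used where a real C++ build would have mangled ones.

```python
def members_pulled(archive, undefined):
    """archive: member name -> (defs, refs). Returns the members a linker
    would extract, given the symbols that are undefined when it gets there."""
    undefined = set(undefined)
    defined = set()
    pulled = []
    progress = True
    while progress:
        progress = False
        for name, (defs, refs) in archive.items():
            if name not in pulled and defs & undefined:
                pulled.append(name)
                defined |= defs
                undefined = (undefined | refs) - defined
                progress = True
    return pulled


# Hypothetical contents of the RPIRuntime static library.
lib = {
    "notmain.o": ({"notmain"}, {"main"}),
    "assert.o": ({"assert_failed"}, set()),
}

# main.o never mentions notmain, so nothing is undefined and nothing is pulled.
print(members_pulled(lib, undefined=set()))        # []
# --undefined=notmain (or EXTERN(notmain)) seeds the undefined set.
print(members_pulled(lib, undefined={"notmain"}))  # ['notmain.o']
```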

gcc/ld: undefined reference to unused function

I'm using gcc 4.3.4 and ld 2.20.51 in Cygwin under Windows 7. Here's a simplified version of my problem:
foo.o contains function foo_bar() which calls bar() in bar.o
bar.o contains function bar()
main.c calls functions in foo.o, but foo_bar() is not in the call chain
If I try to compile main.c and link it to foo.o, I get an undefined reference to _foo_bar error from ld. As you can see from my Makefile excerpt below, I've tried using flags for putting each function in its own section and having the linker discard unused sections.
COMPILE_CYGWIN = gcc -iquote$(INCDIR)
COMPILE = $(COMPILE_CYGWIN) -g -MMD -MP -Wall -ffunction-sections -Wl,-gc-sections $(DEFINE)
main_OBJECTS = main.o foo.o
main.exe : $(main_OBJECTS)
$(COMPILE) -o main.exe $(main_OBJECTS)
The function foo_bar() is a short function that provides a connection between two networking layers in a protocol stack. Some programs don't need it, so they won't link in the other object files related to the upper layer of the stack. It's a small function, and seems inappropriate to put it into its own .o file.
I don't understand why ld throws the error -- nothing is calling foo_bar(), so there's no need to include bar() in the final executable. A coworker has just told me that ld is not a "smart linker", so maybe what I'm trying to do isn't possible?
Unless the linker is from Cyberdyne Systems it has no way to know exactly which functions will actually be called. It only knows which ones are referenced. Even Skynet's linker can't predict what run-time decisions will be made or what will happen if you load a module dynamically at run-time and it starts calling various global functions [1].
So, if you link in module m and it references function f, you will need to link with whatever module has f.
[1] This problem is related to the Halting Problem and has been proven undecidable.
I hit a similar issue and found this page:
http://lists.gnu.org/archive/html/bug-gnu-utils/2004-09/msg00098.html
Highlights:
The GNU linker still works at .o file granularity.
Gcc pulls in foo.o and then finds that bar() is undefined.
You'd better put foo_bar() into another .o file.
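The "object file granularity" point can be shown with a few lines of Python. This is a deliberately crude sketch: undefined_refs is an invented helper, each object file is just a pair of symbol sets, and section garbage collection is ignored.

```python
def undefined_refs(objects):
    """Symbols referenced by some linked object but defined by none of them.
    Whether a referencing function is ever called plays no part."""
    defined = set().union(*(o["defs"] for o in objects))
    referenced = set().union(*(o["refs"] for o in objects))
    return sorted(referenced - defined)


main_o = {"defs": {"main"}, "refs": {"foo"}}
foo_o = {"defs": {"foo", "foo_bar"}, "refs": {"bar"}}  # foo_bar() calls bar()
bar_o = {"defs": {"bar"}, "refs": set()}

print(undefined_refs([main_o, foo_o]))         # ['bar']
print(undefined_refs([main_o, foo_o, bar_o]))  # []
```

Linking foo.o brings in its reference to bar along with everything else in the file, even though nothing calls foo_bar().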

"Undefined reference" when linking C code in Linux

I have a code library (written entirely in C) that I typically compile under Windows, into a .DLL.
I want to compile it under Linux so I can distribute it. I don't care if I distribute it as a .a, a .so, or a bunch of .o files.
All of the individual .c files compile successfully. But when I try to compile a test executable that includes all the .o files, I get a bunch of undefined reference errors.
All of the .o files are on the command line as full paths, and I don't get any errors about missing files.
cc testctd.c -o testctd.out -lm -lc $LIBRARY-PATH/*.o
I also have undefined references to _open, _write, etc.
You have the -l options in the wrong place
-llibrary
-l library
Search the library named library when linking. (The second alternative with the library as a separate argument is only for POSIX compliance and is not recommended.)
It makes a difference where in the command you write this option; the linker searches and processes libraries and object files in the order they are specified. Thus,
foo.o -lz bar.o
searches library z after file foo.o but before bar.o. If bar.o refers to functions in z, those functions may not be loaded.
The linker searches a standard list of directories for the library, which is actually a file named liblibrary.a. The linker then uses this file as if it had been specified precisely by name.
You haven't given enough information for a complete answer, but I think I know one of your problems: The functions open, read, write, close, etc. have underscores in front of their names on Windows, but they do not on Linux (or any other Unix for that matter). The compiler should have warned you about that when you compiled the .c files -- if it didn't, turn on warnings! Anyway, you're going to have to remove all those underscores. I would recommend a header file that does something like the following:
#ifdef _WIN32
#define open(p, f, m) _open(p, f, m)
#define read(f, b, n) _read(f, b, n)
#define write(f, b, n) _write(f, b, n)
#define close(f) _close(f)
/* etc */
#endif
and then use only the no-underscore names in your actual code.
Also, -l options (such as -lm) must be placed after all object files. It is unnecessary to specify -lc (and it may cause problems, under circumstances which are too arcane to go into here).
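The ordering rule from the quoted documentation can be illustrated with a simplified Python model of a single left-to-right pass. This is a sketch, not the real algorithm: the unresolved function is invented, a library is treated as one flat set of symbols on offer rather than a collection of members, and the symbol names other than zlib's inflate are made up.

```python
def unresolved(sequence):
    """sequence: (kind, defs, refs) tuples in command-line order, where kind
    is "obj" or "lib". Returns the symbols still undefined at the end."""
    undefined, defined = set(), set()
    for kind, defs, refs in sequence:
        if kind == "obj":
            defined |= defs
            undefined = (undefined | refs) - defined
        else:
            # A library is searched once, for what is undefined right now.
            taken = defs & undefined
            defined |= taken
            undefined -= taken
    return sorted(undefined)


foo_o = ("obj", {"main"}, {"helper"})
bar_o = ("obj", {"helper"}, {"inflate"})
libz = ("lib", {"inflate"}, set())

print(unresolved([foo_o, libz, bar_o]))  # foo.o -lz bar.o -> ['inflate']
print(unresolved([foo_o, bar_o, libz]))  # foo.o bar.o -lz -> []
```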

Different ways to specify libraries to gcc/g++

I'd be curious to understand if there's any substantial difference in specifying libraries (both shared and static) to gcc/g++ in the two following ways (CC can be g++ or gcc)
CC -o output_executable /path/to/my/libstatic.a /path/to/my/libshared.so source1.cpp source2.cpp ... sourceN.cpp
vs
CC -o output_executable -L/path/to/my/libs -lstatic -lshared source1.cpp source2.cpp ... sourceN.cpp
I can only see a major difference being that passing directly the fully-specified library name would make for a greater control in choosing static or dynamic versions, but I suspect there's something else going on that can have side effects on how the executable is built or will behave at runtime, am I right?
Andrea.
OK, I can answer this myself, based on some experiments and a closer reading of the gcc documentation:
From gcc documentation: http://gcc.gnu.org/onlinedocs/gcc/Link-Options.html
[...] The linker handles an archive file by scanning through it for members which define symbols that have so far been referenced but not defined. But if the file that is found is an ordinary object file, it is linked in the usual fashion. The only difference between using an -l option and specifying a file name is that -l surrounds library with 'lib' and '.a' and searches several directories.
This also answers the related question about the third option, specifying object files directly on the gcc command line: in that case all the code in the object files becomes part of the final executable, whereas with archives only the object files that are really needed are pulled in.
