What is the right order of linker flags in gcc? - gcc

Normally, I would compile a program that requires a specific library, e.g. math, by passing the linker flag after the sources that need it like so:
gcc foo.c -lm
However, it seems that older versions of gcc work equally well with the reverse order (let's call this BAD ORDER):
gcc -lm foo.c
I wouldn't worry about it if some popular open-source projects I'm trying to compile didn't use the latter while my version of gcc (or is it ld that's the problem?) work only in the former case (also, the correct one in my opinion).
My question is: when did the BAD ORDER stop working and why? It seems that not supporting it breaks legacy packages.

when did the BAD ORDER stop working and why? It seems that not supporting it breaks legacy packages.
When?
Not dead sure but I think pre-GCC 4.5. Long ago. Subsequently, the --as-needed option is operative for shared libraries by default,
so like static libraries, they must occur in the linkage sequence later than the objects for which they provide definitions.
This is a change in the default options that the gcc/g++/gfortran etc. tool-driver passes to ld.
Why?
It was considered confusing to inexpert users that static libraries by default has to appear
later that the objects to which they provided definitions while shared libraries by default did
not - the difference between the two typically being concealed by the -l<name> convention
for linking either libname.a or libname.so.
It was perhaps an unforeseen consequence that inexpert users who
had formerly had a lot of luck with the mistaken belief that a GCC
[compile and] link command conforms to the normal Unix pattern:
command [OPTION...] FILE [FILE...]
e.g.
gcc -lthis -lthat -o prog foo.o bar.o
now fare much worse with it.

Related

Where is __builtin_va_start defined?

I'm trying to locate where __builtin_va_start is defined in GCC's source code, and see how it is implemented. (I was looking for where va_start is defined and then found that this macro is defined as __builtin_va_start.) I used cscope -r in GCC 9.1's source code directory to search the definition but haven't found it. Can anyone point where this function is defined?
That __builtin_va_start is not defined anywhere. It is a GCC compiler builtin (a bit like sizeof is a compile-time operator). It is an implementation detail related to the <stdarg.h> standard header (provided by the compiler, not the C standard library implementation libc). What really matters are the calling conventions and ABI followed by the generated assembler.
GCC has special code to deal with compiler builtins. And that code is not defining the builtin, but implementing its ad-hoc behavior inside the compiler. And __builtin_va_start is expanded into some compiler-specific internal representation of your compiled C/C++ code, specific to GCC (some GIMPLE perhaps)
From a comment of yours, I would infer that you are interested in implementation details. But that should be in your question
If you study GCC 9.1 source code, look inside some of gcc-9.1.0/gcc/builtins.c (the expand_builtin_va_start function there), and for other builtins inside gcc-9.1.0/gcc/c-family/c-cppbuiltin.c, gcc-9.1.0/gcc/cppbuiltin.c, gcc-9.1.0/gcc/jit/jit-builtins.c
You could write your own GCC plugin (in 2Q2019, for GCC 9, and the C++ code of your plugin might have to change for the future GCC 10) to add your own GCC builtins. BTW, you might even overload the behavior of the existing __builtin_va_start by your own specific code, and/or you might have -at least for research purposes- your own stdarg.h header with #define va_start(v,l) __my_builtin_va_start(v,l) and have your GCC plugin understand your __my_builtin_va_start plugin-specific builtin. Be however aware of the GCC runtime library exception and read its rationale: I am not a lawyer, but I tend to believe that you should (and that legal document requires you to) publish your GCC plugin with some open source license.
You first need to read a textbook on compilers, such as the Dragon book, to understand that an optimizing compiler is mostly transforming internal representations of your compiled code.
You further need to spend months in studying the many internal representations of GCC. Remember, GCC is a very complex program (of about ten millions lines of code). Don't expect to understand it with only a few days of work. Look inside the GCC resource center website.
My dead GCC MELT project had references and slides explaining more of GCC (the design philosophy and architecture of GCC changes slowly; so the concepts are still relevant, even if individual details changed). It took me almost ten years full time to partly understand some of the middle-end layers of GCC. I cannot transmit that knowledge in a StackOverflow answer.
My draft Bismon report (work in progress, funded by H2020, so lot of bureaucracy) has a dozen of pages (in its sections ยง1.3 and 1.4) introducing the internal representations of GCC.

G++/LD fails: can't find library when library isn't actually needed

I have a program foo I'm trying to compile and link and I'm running into a chicken and egg dillemma.
For reasons I'll explain below, Within a given directory I'm forced to add a link to several libraries we build (let's call them libA and libB) regardless of my target. I know I only actually need libA for my program; so after all libs are built and this binary is built I verified with ldd -u -r foo to show that libB is an unused direct dependency.
Being unused I altered the makefiles and flags such that libB is enveloped with -Wl --as-needed and -Wl --no-as-needed. I make rebuild, use ldd again and this time it doesn't show any unused deps. So far so good.
Now the fun part: Since its unused I would expect that if libB is not found/available/built that I should still be able to compile and link foo as long is libA is available. (example: If I did a fresh checkout and only built libA before trying to compile this specific test). But ld errors out with /usr/bin/ld: cannot find -lB
This suggests that ld needs to locate libB even if it won't need any of the symbols it provides? That doesn't seem to make sense. If all symbolic dependencies are already met, why does it even need to look at this other library? (That would explain the problem ld has and why this is not possible)
Is there a way I can say "Hey don't complain if you can't find this library and we shouldn't need to link with it?"
The promised reasons below
For various reasons beyond my control I have to share makeflags with many other tests in this directory due to the projects makefile hierarchy. There is a two level makefile for all these tests that says foo is a phony target, his recipe is make -f generictest.mk target=foo, and the generictest.mk just says that the source file is $(target).C, that this binary needs to use each library we build, specifies relative path to our root directory and then includes root's generic makefile. The root directory generic makefile expands all the other stuff out (flags, options, compiler, auto-gen of dependencies through g++ etc), and most importantly for each statement that said "use libX" in generictest.mk it adds -lX to the flags (or in my case enveloped in as-needed's)
While I'm well aware there are lots of things that are very unideal and/or horribly incorrect in terms of makefile best practices with this, I don't have the authority/physical ability to change it. And compared to the alternative employed in other folders, where others make individual concrete copies of this makefile for each target, I greatly prefer it; because that forces me to edit all of them whenever want to revise our whole make pattern, and yields lot of other typos and problems.
I could certainly create another generictest.mk like file to use for some tests and group together those using each based on actual library needs, but it would be kind of neat if I didn't have to as long as I said "you don't all of them, you need each of them but only if you actually use it".
There's no way that the linker can know that the library is not needed. Even after all your "normal" libraries are linked there are still lots and lots of unresolved symbols: symbols for the C runtime library (printf, etc.). The linker has no idea where those are going to come from.
Personally I'd be surprised if the linker didn't complain, even if every single symbol was already resolved. After all there may be fancy things at work here: weak bindings, etc. which may mean that symbols found later on the link line would be preferred over symbols found earlier (I'm not 100% sure this is possible but I wouldn't be surprised).
As for your situation, if you know that the library is not needed can't you just use $(filter-out ...) on the link command line to get rid of it? You'd have to write your own explicit rule for this with your own recipe, rather than using a default one, but at least you could use all the same variables.
Alternatively it MIGHT be possible to play some tricks with target-specific variables. Declare a target-specific variable for that target that resets the variable containing the "bad library" with a value that doesn't contain it (maybe by using $(filter-out ...) as above), and it will override that value for that target only. There are some subtle gotchas with target-specific variables overriding "more general" variables but I think it would work.

including static libraries with -all_load flag

In what cases exactly do you need -all_load flag?
Lets say I have something like
g++ source.cpp -o test libA.a libB.a libC.a
From what i recall if there is some reference to a symbol used in source.cpp that is present
in say libB.a file then that libB.a will be linked (just that symbol or whole code in that library? ) and libA.a and libC.a will be ignored (their code will not be present in final executable).
What happens to other libraries when i use -all_load flag as follows
g++ source.cpp -o test -Wl,-all_load libA.a libB.a libC.a
how does 'strip' command effect the output with all_load flag?
-all_load is for when you want to link compile units that are (to the linker) unnecessary. For instance, perhaps you will dynamically access functions within the static library at runtime that you know the addresses of, but haven't actually made any explicit function calls to. How would you do that? Well, the compiler could help you by storing a bunch of function pointers in the executable to be read at run time, and then you'd build a lookup system for finding those functions using a string, and you'd call the whole thing Objective-C, which is probably the most common user of -all_load (at least if Google is any guide).
The most common case of this in ObjC is when you have a category in its own compile unit. The complier may not be able to tell that you reference it and so won't link it. So ObjC programmers use -all_load (or -force_load) more often than other C-like programmers. In fact, -all_load is a Darwin-specific extension in gcc.
But there are cases where people might want to use -all_load outside of ObjC. For instance, there might be some inter-dependencies in libA and libB. Consider this case:
source.cpp requires A() and B()
libA defines A() in a.o and Aprime() in aprime.o
libB defines B() in b.o and requires Aprime()
This typically won't link (*). The compiler will start with source.o and make a list of requirements: A() and B(). It'll then look at libA and see that it defines A(), so it'll link a.o (but not aprime.o). Then it will look at libB and see that it defines B() and requires Aprime(). It is now out of libraries, and it hasn't resolved Aprime(). It fails.
(*) Actually, it will with clang because clang is quite smart about this. But it won't with g++ at least up through 4.6.
The best solution would be to reorder it so that libB comes first (**). But if the dependencies were circular, you could get completely stuck. -all_load and -force_load let you work around these situations by turning off the linker's optimization.
(**) The really best solution is usually to redesign your libraries to avoid this kind of interdependency, but that may be hoping too much.
If you want to play around with the issue, see https://gist.github.com/rnapier/5710509.
strip just removes symbols from executables. That's not particularly related to static linking and -all_load (though it does impact dynamic linking). strip(1) has lots of discussion of that.

Static library "interface"

Is there any way to tell the compiler (gcc/mingw32) when building an object file (lib*.o) to only expose certain functions from the .c file?
The reason I want to do this is that I am statically linking to a 100,000+ line library (SQLite), but am only using a select few of the functions it offers. I am hoping that if I can tell the compiler to only expose those functions, it will optimize out all the code of the functions that are never needed for those few I selected, thus dratically decreasing the size of the library.
I found several possible solutions:
This is what I asked about. It is the gcc equivalent of Windows' dllexpoort:
http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Code-Gen-Options.html (-fvisibility)
http://gcc.gnu.org/wiki/Visibility
I also discovered link-time code-generation. This allows the linker to see what parts of the code are actually used and get rid of the rest. Using this together with strip and -fwhole-program has given me drastically better results.
http://gcc.gnu.org/onlinedocs/gcc-4.6.1/gcc/Optimize-Options.html (see -flto and -fwhole-program)
Note: This flag only makes sense if you are not compiling the whole program in one call to gcc, which is what I was doing (making a sqlite.o file and then statically linking it in).
The third option which I found but have not yet looked into is mentioned here:
How to remove unused C/C++ symbols with GCC and ld?
That's probably the linkers job, not the compilers. When linking that as a program (.exe), the linker will take care of only importing the relevant symbols, and when linking a DLL, the __dllexport mechanism is probably what you are looking for, or some flags of ld can help you (man ld).

Linking include files in GCC

I can never remember what to type when linking include files in GCC, in fact the only one I can remember is -lm for math.h. The one I am specifically concerned with right now is sys/time.h.
This page clears things up some, but I would still like a list.
Does anyone know of a good list of linking options?
EDIT:
Maybe my question was not clear. I want to know what I need to type at the command line (like -lm for math or -lpthread for pthread) for the various libraries I might need to link when making C programs.
The functionality provided in <sys/time.h> is implemented in libc.so (C library). You don't need to link anything else in as gcc should automatically link to libc.so by itself. There is no 'linking of include files', rather you are linking against libraries that contain the symbols defined by code.
The -l flag is one of GCC's linker options and is used to specify additional libraries to link against.
edit because my gcc was performing optimizations on my source code at compile time
Also, the information in that link is a little outdated - you should not need an explicit link to libm (which is what -l m or -lm does) in modern GCC.
I'm not sure i understand your question but -lm is not an ld option, -l is an option and -lx links libx.a (or .so, it depends). you might want to look at the ld manual for a full list of options.
I think all other standard libraries other than math are included in libc.so(.a) (-lc)

Resources