GCC, PIE, PIC, archives and shared objects - what works with what?

I have a question about GCC, ThreadSanitizer and the use of archives, shared libraries and PIE and PIC.
I've been reading about this as best I can all morning, but I just can't find useful, clear information online.
I understand what PIC does. I think I understand that PIE is, if you like, an optimized version of PIC which is only for executables.
Now come the questions...
Can I compile an executable with PIC, rather than PIE?
If I compile a shared library (.so) with PIC, must I then use PIC with any executable which uses that library, rather than PIE?
If I compile an archive (.a), can I use PIE? (I have read -static and -pie should not be used together, which implies not).
I'm using ThreadSanitizer. This requires PIC (and perhaps PIE is okay too - but as you can see, I'm not clear about this). I have a library, which can be compiled as an archive (.a) or a shared library (.so). The library needs to use ThreadSanitizer. However, the binary which uses it also needs to use ThreadSanitizer (as it has some code of its own which needs checking).
The library, when built as a shared library, in fact fails to link when used with ThreadSanitizer - I think the link is failing to link to libtsan (but this is, I suspect, not a real library, but a bunch of compiler intrinsics built into GCC). This is almost certainly me getting something wrong somewhere.
What I really want to do is use an archive (.a) since the binary is a test programme and should be able to compile without the library being installed (so users can conveniently check/test the library - the makefile for the test binary has a hard coded path to the archive binary).
If I can use PIE with archives (.a), then I'd PIE the library and the test binary. If PIE cannot be used with archives, then I think I need to use PIC with both the library and test binary. I don't want to use a shared library at all, since ThreadSanitizer uses TLS (thread local store) heavily and shared libraries with PIC have absolutely terrible TLS performance.

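PIC and PIE both produce position-independent code. In GCC, -fPIC is what you use for shared libraries, whereas -fPIE is meant for executables: it lets the compiler assume the code ends up in the main program, which permits a few extra optimizations.
You can compile an executable's objects with -fPIC. They link into an executable (PIE or not) without trouble; you merely give up the optimizations that -fPIE allows. The reverse is the case to avoid: -fPIE objects are not meant to be linked into a shared library.
A shared library does not care whether the executable using it is a PIE or a normal executable. Resolving the shared library at run time is the dynamic linker's (ld.so) job.
An archive can be linked into a PIE, provided its object files were compiled with -fPIE or -fPIC. The archive's symbols are resolved at link time, like those of any other object file. The combination to be careful with is -static together with -pie, which concerns how the whole executable is linked, not whether a .a file is used.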
I'll answer the rest after work.


Create a statically linked shared library

Is it possible to create a shared library which is itself statically linked, i.e. it does not depend on other shared libraries?
Let me be a little bit more concrete..
I want to create a shared library, say mylib.so, which makes use of some other special libraries (in my case it's Intel MKL and OpenMP). Since I have installed these libraries I can build mylib.so and include it in other programs without any problem.
However, if I want to use the library (or the executables including it) on another machine I first have to install all the intel stuff. Is there a way to avoid this? My first try was to add the option -static when building mylib.so but this doesn't seem to do anything..
I'm using icc..
Is it possible to create a shared library which is itself statically linked, i.e. it does not depend on other shared libraries?
Not on Linux, not when using GLIBC (your shared library will always depend on at least ld-linux*.so*).
I want to create a shared library, say mylib.so, which makes use of some other special libraries (in my case its intel mkl and openMP).
There is no problem [1] statically linking Intel MKL and OpenMP libraries into mylib.so -- you just don't want to depend on these libraries dynamically (in other words, you are asking for an impossible thing which you don't actually need).
To do so, you need two things:
Link mylib.so with archive versions of the libraries you don't want to depend on dynamically, e.g. gcc -o mylib.so -shared mylib.c .../libmkl.a ...
The libraries which you want to statically link into mylib.so must have been built with position-independent code (i.e. with -fPIC flag).
What if the archived version isn't available?
Then you can't link it into your library.
E.g. I'm using intel/oneapi/intelpython/latest/lib/libstdc++.so and there is no corresponding .a file.
This is a special case: you wouldn't want to link that version into your library even if it were available.
Instead, your program should use the version installed on the target system.
Having two separate versions of libstdc++ (e.g. one statically linked, and the other dynamically linked) into a single process will end very badly -- either with a crash, or with silent stack or heap corruption.
[1] Note that linking in somebody else's library and distributing it may have licensing implications.

Compile single static library for Cortex M3, M4, M23 and M33

I'm currently working on a rather generic communication stack. It gets bytes in on one end, parses the packet and calls a callback.
I want to have this stack in a static library (i.e. libcommstack.a).
The library is aimed towards embedded ARM Cortex-M devices. At the moment we have specified that at least a Cortex-M3 should be used (but it should also work for an M4 or M33).
Right now I'm integrating it into another application to verify that linking it is possible. In the future the idea is that we will ship this .a file to customers so they can build their application around it, without having direct access to our sources (to encapsulate our IP).
We are using GCC ARM v7.2.1 to compile both the library and the application that is linked to it.
The application I'm trying to integrate it with is compiled for a Cortex-M33 with -mfloat-abi=hard -mfpu=fpv5-sp-d16.
The code for the library does not use any floating point and is compiled using -march=armv7-m (both have the -mthumb flag).
Linking seemed to all go well, until I actually called a function from the lib. At that point the linker starts to complain:
application.elf uses VFP register arguments, libcommstack.a(somefile.c.obj) does not
failed to merge target specific data of file libcommstack.a(somefile.c.obj)
Since I'm not using floating points in the library and I don't know (upfront) if the target application does or does not have an FPU (or even uses floats), I'm not sure how to approach this.
I figured there would be two approaches:
Compile a single version of the lib, using an instruction set that all of the microcontrollers understand. I was hoping that this would be the case with ARMv7 (although I'm not yet 100% confident that the M23/M33 also support this).
Compile a lot of different libs for the different flavors based on the different architectures, FPU, etc.
As you can imagine, I would prefer to keep it simple and go for option 1, but I'm not sure how to "convince" the linker to link these two (or perhaps how to convince the compiler NOT to care about floating points for the lib).
Does anyone know if option 1 is feasible and how it can be achieved?
If it is not feasible, what would be the variables to keep in mind to determine the different build flavors?
Does anyone know if option 1 is feasible
Well, feasible, probably.
how it can be achieved?
Take all the processors you want to support and determine the instruction sets available on all of them. Then compile for that common instruction set.
But, please don't, that is a workaround.
If it is not feasible, what would be the variables to keep in mind to determine the different build flavors?
GCC has a feature called multilibs. See the output of arm-none-eabi-gcc --print-multi-lib. If you have newlib installed, you can go to /usr/arm-none-eabi/lib/thumb/ and look at the directories there: newlib is compiled once per multilib, each build is installed into its own directory, and the matching one is picked up depending on the compiler options in use. Compile your library for each of those multilibs and install each build into the corresponding directory under /usr/arm-none-eabi/lib/, and the compiler will pick the right one up by itself (see the gcc -v output for the library search paths). The newlib sources contain an example of this, although I can't find the exact place right now. With CMake as the build system, for example, you could compile and install as follows:
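arm-none-eabi-gcc --print-multi-lib |
while IFS=';' read -r dir opts; do
    # --print-multi-lib writes options as "@mthumb@march=..."; turn each '@' into ' -'
    flags=$(printf '%s' "$opts" | sed 's/@/ -/g')
    # one build directory per multilib, installed under lib/<multilib dir>
    cmake -S . -B "builddir/$dir" -DCMAKE_C_FLAGS="$flags" -DCMAKE_INSTALL_LIBDIR="lib/$dir"
    cmake --build "builddir/$dir"
    cmake --install "builddir/$dir" --prefix "/usr/arm-none-eabi/"
done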
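One detail worth knowing when scripting this: each line of the --print-multi-lib output has the form directory;options, and the options field uses '@' in place of the leading '-' of every flag. The sketch below shows the conversion on a single sample line. The sample is illustrative only and is not copied from a particular toolchain build, so check the real output of your compiler.

```shell
# Sample line in the format printed by --print-multi-lib (illustrative).
line='thumb/v7-m/nofp;@mthumb@march=armv7-m@mfloat-abi=soft'

dir=${line%%;*}     # directory part, before the ';'
opts=${line#*;}     # option part, after the ';'

# Each '@' stands for the leading '-' of one option.
flags=$(printf '%s' "$opts" | sed 's/@/ -/g; s/^ //')

echo "dir:   $dir"
echo "flags: $flags"
```

The resulting flags string is what would be passed to the compiler for that build flavor, and dir is the subdirectory the finished library belongs in.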

Is ld called at both compile time and runtime?

I am trying to understand how linking and loading work. My understanding is that the Unix program "ld" contains both linking and loading functionality. When gcc is invoked, after preprocessing, compiling, and assembling, the linker is called which links all object files and .a files into an executable, along with minimal instructions for how shared libraries should be "connected" (what is the correct terminology here?) at runtime. This linker is ld.
At runtime, my understanding is that the executable is loaded into memory, although I'm not sure how. My specific questions are as follows:
1) Are shared object files being "linked" at compile time, or is there another word for what is happening?
2) At runtime, is ld being called for a second time? How can I see proof of this for my executable (on Linux and on MacOS)?
3) Are shared object files being "linked" at runtime, or is there another word for the process when shared objects are read from the location in LD_LIBRARY_PATH at runtime?
Is ld called at both compile time and runtime?
No: ld is not called at either compile or runtime.
When gcc is invoked, after preprocessing, compiling, and assembling, the linker is called which links all object files and .a files into an executable
Most moderately complicated programs use separate compilation and linking steps.
At compilation, a set of relocatable object files is produced (preprocessing, compilation and assembling are invoked at that step). Optionally the .o files are archived into an archive library (libsomething.a).
Then a link step is performed (often called "static linking", to differentiate it from the "dynamic loading" that happens at runtime), producing an executable or a shared library. Only at this step is /usr/bin/ld invoked. On Linux, ld is part of the binutils package.
along with minimal instructions for how shared libraries should be "connected"
The linker records which shared libraries are required at runtime, and possibly which versions of libraries or symbols are required.
It also records which runtime loader should be used to load the required shared libraries.
At runtime, my understanding is that the executable is loaded into memory, although I'm not sure how.
The kernel loads the executable into memory, and checks whether a runtime loader was requested at static link time. If it was, the dynamic loader is also loaded into memory, and execution control is passed to it (instead of the main executable).
It is then the job of the dynamic loader to examine the executable for instructions on which other libraries are required, check whether correct versions can be found, load them into memory, and arrange things such that symbol resolution will work between the main executable and the shared libraries. This is the runtime loading step, often also called dynamic linking.
The dynamic loader can be part of the OS, but on Linux it's part of libc (GLIBC, uClibc and musl each have their own loader).
No. ld does the linking, as in creating a library or executable; ld*.so does the loading. Also, ld*.so is part of the OS, not of the GCC suite, as far as I know. ld is generally part of (GNU) binutils on a GCC-based system (but is usually LLVM's lld on an LLVM-based system).
ld*.so is ld-linux-{arch}.so.2 on Linux and /libexec/ld-elf.so on e.g. FreeBSD.

How to statically link a library to another library with all symbols resolved

I have some XML parsing utility functions written inside C headers and source files based on expat library.
For this I have compiled my source files to a static library with expat statically linked to it.
I am able to use the functions from the resulting XML utilities library in my applications only if I statically link both the utility library and expat with my application. I was of the view that I should be able to build my application by statically linking only my utility library, without having to statically link expat again with the application executable. Linking my application with only the utility library gives an undefined symbol error for expat.
Can someone please tell me what I am missing? I am using the gcc compiler.
"I have compiled my source files to a static library with expat statically linked to it."
I'm afraid you haven't. A static library is not produced by the linker; no linkage is involved, so nothing can be linked to it.
A static library is nothing but a bag of object files in ar archive format.
When you are linking something that is produced by the linker - namely a program or a shared library -
you may offer such a bag to the linker. It will look in the bag and take out just the object files it needs to
carry on the linkage and link them into the target. The bag spares you the difficulty of
needing to know exactly which of the object files in it the linker will need, but the bag itself contributes nothing at all to the linkage.
How can I get expat static library included in my utilities library, so that I only need to link my executable with a single static library. I don't want to extract the two archives and merge the object files together.
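Extracting the object files and re-archiving them is the only way of combining two ar archives. Tools that offer a shortcut (for example GNU ar's MRI script mode, ar -M with addlib) do exactly that for you, and the result is still just a bigger bag of object files.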
Your resistance to linking libexpat is puzzling, without further context. It is available
through the package manager on any distro. You've made a library that depends on libexpat. Clients that link your
library will also need to link libexpat. This is an utterly routine sort of dependency
that you should simply document and - if you are packaging your library - include
in the package dependencies. Almost invariably when we write new libraries we are augmenting the
libraries already available to our target users. If every library statically
incorporated all of its own dependencies then they would all be the size of an
operating system and of no practical use.

Go code building linker error. Can I link manually?

I am building Go code that uses CGo heavily and this code must be compiled into a shared or static library (static is highly preferred). (code for reference)
It all works just fine on Linux and Mac, but on Windows it fails at the linker stage, either saying that all 4 build modes (c-shared, shared, c-archive, archive) are not available or, if I invoke go tool link -shared manually, complaining about missing Windows-specific instructions.
My understanding is that all I need to build usable lib.a is to compile everything I will use into object files (*.o) and then put it through ar to produce usable static library.
Now the question is whether I can completely skip Go's linker and based on prepared .o files create .a manually?
How would I go about doing that if that is even possible?
It looks like gcc on Windows is unable to automatically discover the necessary shared libraries. The problem was caused by GCC and not by Go.
That said, to compile Go I had to use a self-compiled master tip, as the current release (1.6.2) does not support shared/static libraries on windows/amd64.
Manually feeding gcc each shared library (ntdll, winmm etc.) from the default location (C:\Windows\SysWOW64) fixed the problem.
