Generate library from ELF file - gcc

I'm trying to generate a static library from a compiled ELF file.
Previously, I've been able to generate the library by compiling my source code to object files, then passing those objects to avr-ar to successfully create my library. In order to reduce code space of the project, I've switched over to using link-time optimisations to save ~1.5 kB of space - however, in order to do so I end up passing all my source and header files to avr-gcc in one invocation and it spits out a .elf file.
I can't seem to get the -flto option working with the linker (I'm using a custom linker script) and compiler driver, otherwise I'd have the object files I need.
Is it possible to take this generated .elf and push it through ar to generate a library?
Problem Context:
This is related to this problem. I've written the shared libraries and bootloader section, and am using this linker script to set out my flash space. Here's the Makefile that drives all this - it's very hacked together.
Ideally, what I'd like to happen is to be able to compile my src/ directory to separate object files in obj/, all with link time optimisation enabled to cut down on code space as much as possible but still leaving unused functions in the output (the shared library that is stored in flash is not fully utilised by the bootloader application, but may be linked against by the loaded applications). I'd then like to be able to link those objects together to create a .elf and libbootloader.a. The elf is then used to generate a binary to flash to my AVR and the bootloader library is referenced when building user applications that refer to the stored library in flash space already. (Perhaps I want to just link against a list of symbols referencing the shared library section?)

Related

How can shared object files be linked with other shared or regular objects to produce new object files?

I'm reading the ELF specification here: https://refspecs.linuxbase.org/elf/elf.pdf
On page 15:
A shared object file holds code and data suitable for linking in two contexts. First, the link
editor may process it with other relocatable and shared object files to create another object file.
Second, the dynamic linker combines it with an executable file and other shared objects to
create a process image.
I've seen multiple questions raised by others on SO asking about statically linking shared objects, which seems to be what this paragraph is suggesting, and yet the common answer seems to usually be that doing this is not possible.
Either I'm misunderstanding what this is saying (probably), or there isn't a consensus about what can be done with shared objects.
What does this paragraph mean?
"Either I'm misunderstanding what this is saying (probably)"
What the paragraph appears to try to say: there are two contexts in which a shared library may be used:
1. By the static linker (aka link editor), to build a new shared library or an executable out of relocatable objects (i.e. build a new ET_DYN or ET_EXEC from ET_RELs), and
2. By the dynamic linker, to build a process image.
Note that the new shared library built in case 1 does not include the existing shared library in it. The existing library is needed only so the static linker knows how the new shared library (or executable) should reference symbols from the existing library.
Most of the questions I've seen (and probably the ones you are referring to) are "how do I put existing libfoo.so into a new libbar.so?", and that is in fact impossible.
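To make the ET_REL / ET_EXEC / ET_DYN distinction above concrete, here is a minimal sketch that reads the e_type field from an ELF header. The function names are made up for illustration; the field offsets and type values are the ones in the ELF specification.

```python
import struct

# e_type values from the ELF specification
ELF_TYPES = {1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}

def elf_type(header: bytes) -> str:
    """Return the e_type name given the first 18+ bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian
    endian = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field right after the 16-byte e_ident array
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return ELF_TYPES.get(e_type, f"unknown ({e_type})")

def elf_type_of_file(path: str) -> str:
    """Convenience wrapper (hypothetical helper): classify a file on disk."""
    with open(path, "rb") as f:
        return elf_type(f.read(18))
```

A .o file should report ET_REL and a .so file ET_DYN. Note that position-independent executables, the default on many current Linux distributions, also report ET_DYN.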
Update:
"I'm still not sure I understand. Is #1 the initial creation of the shared library?"
Yes: creation of a new shared library or an executable.
"Because then an executable also has two contexts: 1) The static linker creating the executable out of relocatable objects and 2) By use of the loader to build a process image."
That is true, but only for dynamically linked executables. Fully-static executables do not involve the loader at all.
"I could say a similar thing for relocatable objects as well"
Not really: relocatable objects do not normally participate in process image building (there are exceptions, but they are really special and odd-ball), and they are certainly not handled by the dynamic linker (loader).
For all practical purposes, relocatable objects are only useful as building blocks for a shared library or an executable.

Is ld called at both compile time and runtime?

I am trying to understand how linking and loading work. My understanding is that the Unix program "ld" contains both linking and loading functionality. When gcc is invoked, after preprocessing, compiling, and assembling, the linker is called which links all object files and .a files into an executable, along with minimal instructions for how shared libraries should be "connected" (what is the correct terminology here?) at runtime. This linker is ld.
At runtime, my understanding is that the executable is loaded into memory, although I'm not sure how. My specific questions are as follows:
1) Are shared object files being "linked" at compile time, or is there another word for what is happening?
2) At runtime, is ld being called for a second time? How can I see proof of this for my executable (on Linux and on MacOS)?
3) Are shared object files being "linked" at runtime, or is there another word for the process when shared objects are read from the location in LD_LIBRARY_PATH at runtime?
Thanks!
"Is ld called at both compile time and runtime?"
No: ld is called at neither compile time nor runtime; it runs only at the link step.
"When gcc is invoked, after preprocessing, compiling, and assembling, the linker is called which links all object files and .a files into an executable"
Most moderately complicated programs use separate compilation and linking steps.
At compilation, a set of relocatable object files is produced (preprocessing, compilation and assembling are invoked at that step). Optionally the .o files are archived into an archive library (libsomething.a).
Then a link step is performed (often this is called "static linking", to differentiate this step from "dynamic loading" that will happen at runtime), producing an executable, or a shared library. Only at this step is /usr/bin/ld invoked. On Linux, ld is part of the binutils package.
"along with minimal instructions for how shared libraries should be 'connected'"
The linker records which shared libraries are required at runtime, and possibly which versions of libraries or symbols are required.
It also records which runtime loader should be used to load the required shared libraries.
"At runtime, my understanding is that the executable is loaded into memory, although I'm not sure how."
The kernel loads the executable into memory and checks whether a runtime loader was requested at static link time. If one was, the dynamic loader is also loaded into memory, and execution control is passed to it (instead of the main executable).
It is then the job of the dynamic loader to examine the executable for instructions on which other libraries are required, check whether the correct versions can be found, load them into memory, and arrange things such that symbol resolution works between the main executable and the shared libraries. This is the runtime loading step, often also called dynamic linking.
The dynamic loader can be part of the OS, but on Linux it's part of libc (GLIBC, uClibc and musl each have their own loader).
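One way to see which runtime loader was requested at static link time is to look for the PT_INTERP program header in the executable. The following is only a sketch: the function name is hypothetical, and it assumes the standard ELF32/ELF64 header layouts without any further validation.

```python
import struct

PT_INTERP = 3  # program header type that names the requested runtime loader

def requested_loader(image: bytes):
    """Return the PT_INTERP path of an ELF image, or None if there is none."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = image[4] == 2                  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    end = "<" if image[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    if is64:
        (e_phoff,) = struct.unpack_from(end + "Q", image, 32)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", image, 54)
    else:
        (e_phoff,) = struct.unpack_from(end + "I", image, 28)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", image, 42)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(end + "I", image, base)
        if p_type != PT_INTERP:
            continue
        # p_offset and p_filesz sit at different offsets in ELF32 and ELF64
        if is64:
            (p_offset,) = struct.unpack_from(end + "Q", image, base + 8)
            (p_filesz,) = struct.unpack_from(end + "Q", image, base + 32)
        else:
            (p_offset,) = struct.unpack_from(end + "I", image, base + 4)
            (p_filesz,) = struct.unpack_from(end + "I", image, base + 16)
        return image[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None  # no loader requested, e.g. a fully static executable
```

Calling it as requested_loader(open("a.out", "rb").read()) on a dynamically linked executable should give the loader path the kernel will map in; for a fully static executable it returns None.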
No. ld is linking as in creating a library or exe, ld*.so is the loading part. Also ld*.so is part of the OS, not the gcc suite afaik. ld is generally part of (GNU) binutils on a gcc based system (but e.g. usually LLVM lld in a LLVM based system)
ld*.so is ld-linux-{arch}.so.2 on Linux and /libexec/ld-elf.so on e.g. FreeBSD.

Linking binary against functions/data in specific location in memory

I'm currently in the process of writing an intermediate-memory bootloader for an ATMega.
I'd like to place a section of commonly used functions and data in a specific location in memory, such that:
the limited size of the bootloader section is not exceeded
library functions, drivers, etc. are not duplicated in the application section, wasting space
For illustrative purposes, a map of the desired memory layout is below:
Following some help in this thread on avrfreaks, I'm to the point where I've been able to move all code (in my bootloader + library development environment - applications will be developed in separate projects) not tagged with __attribute__ ((section(".boot"))) to the shared library section successfully via means of a custom linker script.
It was suggested in the avrfreaks thread that I can link my applications by using avr-objcopy --strip-all --keep-symbol=fred --keep-symbol=greg ... boot.elf dummy.elf to create a symbol reference of what I have in my shared library, and then linking my applications against this memory layout with avr-gcc -o app.elf -Wl,--just-symbols=dummy.elf app1.o app2.o ....
The problem I face here is that I need to specify each symbol I want to keep in my dummy.elf. I can use the --keep-symbols=<file> option to specify a text file listing the symbols to keep, but I still must generate this list.
I've noticed that there is a bunch of symbols that I don't want to include (stuff like C environment set-up code that is common in name, but different in functionality, for both the bootloader and application) that seems to start with the prefix '_' (but of course, there are some useful and large library functions with the same prefix, e.g. *printf and math routines). Perhaps there won't be conflicts if I link my application against the existing runtime code in the application/bootloader?
How can I generate a list of symbols for my library section that contains the code I've written (maybe with some sed magic and scanning of header files) and excludes any symbols that may conflict when linking the application?
The project can be viewed in its current state at this github repository.
Edit: I want to make clear that I could tag everything I want to be in the shared library section with __attribute__ ((section(".library"))), but as I also want to share some rather large libc stuff (vsprintf, etc) between the bootloader and application, this becomes cumbersome very quickly. As such, I've elected to put everything not tagged as boot in the library memory region via a linker script.
Perhaps I just need some advice on my linker script, as I'm not super sure what I'm doing there.
Consider using -R <file> as a linker option (gcc -Wl,-R -Wl,<file>). With GNU ld, -R followed by a file name is the short form of --just-symbols=<file>.
This will generate references to (global) symbols in <file> just as if they were linked normally, but not include the referenced code.
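If the --keep-symbols route from the question is still preferred, the list could be generated by filtering the text output of avr-nm. This is a rough sketch only: the function name, the sample symbol names and the rule "drop everything starting with an underscore unless explicitly allowed" are assumptions for illustration, not something taken from the project.

```python
def shared_symbols(nm_output: str, allow=()):
    """Pick symbol names to keep from nm text output ("address type name")."""
    keep = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # undefined symbols are listed without an address
        _addr, kind, name = parts
        if not kind.isupper() or kind == "U":
            continue  # for most types a lowercase letter marks a local symbol
        if name.startswith("_") and name not in allow:
            continue  # assumed here to be runtime/start-up code
        keep.append(name)
    return sorted(keep)

# Hypothetical nm output for a bootloader ELF
sample = """\
00001e00 T uart_init
00001e40 T vsprintf
00001f00 T __do_copy_data
00001f20 t helper_static
         U external_thing
00800100 B rx_buffer
"""
print("\n".join(shared_symbols(sample)))
```

Writing the resulting names one per line to a file gives something that can be passed to avr-objcopy via --keep-symbols=<file>. Underscore-prefixed functions that should be shared anyway would have to be listed in allow by hand.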

What happens when compiling against a shared library?

I understand that when linking against a static library i.e. libname.a, the binary code for the used functions is taken out of the archive and inserted in the application binary. Therefore, the static library MUST be present at compilation time.
However, with shared libraries I am lost. The function definitions are not copied. Then why is it needed that the shared library be provided on the linker command line? Also, are there different ways to link against shared libraries and what are they?
The shared libraries need to be fed to the linker's command line so that a reference to the specific functions, and to the file in which these functions reside, is stored in the executable. When the executable is run, the dynamic linker (/lib/ld-linux.so, /libexec/ld-elf.so, etc, depending on your system) is loaded first and checks these references. Once it finds the lib files, it maps them (using the mmap() system call) into your program's address space.
You can see these references by running
objdump -T a.out
or
nm -D a.out
For ELF executables, the existence of the .interp section implies that the program uses dynamic linking.
See the man pages for dlopen and dlsym for explicit dynamic link loader management.
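As a small illustration of explicit run-time loading, Python's ctypes module wraps dlopen and dlsym on POSIX systems. The sketch below assumes a POSIX system where the interpreter is dynamically linked against the C library; the function name is hypothetical.

```python
import ctypes

def strlen_via_dlsym(data: bytes) -> int:
    """Call the C library's strlen, looked up by name at run time."""
    # CDLL(None) corresponds to dlopen(NULL): a handle to the running program
    # and the shared libraries already loaded into it.
    process = ctypes.CDLL(None)
    # Attribute access performs the symbol lookup (dlsym under the hood).
    strlen = process.strlen
    strlen.argtypes = [ctypes.c_char_p]
    strlen.restype = ctypes.c_size_t
    return strlen(data)
```

Nothing about strlen was recorded when the calling code was built; the symbol is resolved only when the lookup runs, which is the difference from the link-time references shown by objdump -T and nm -D.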

shared and static libraries

Can anyone explain the difference between a shared and a static library in gcc (makefile)?
I read that a static library is independent code, but it increases the size of your executable file,
whereas a shared library links the functions dynamically and does not increase the size of your executable file.
I cannot understand the difference between these two.
Can anyone tell me when I should create a static library and when I should create a shared library?
They say a shared library is position dependent code.
What do we mean by position dependent code?
If a shared library does not increase the code size and a static library does,
then we can just go for the shared library, right?
But why do we have static libraries too? What is the real use of them?
Please help me, guys.
When you compile code with gcc, it can directly produce object files and a number of object files can be linked together to produce an executable file.
A static library is simply a collection of object files (usually with an index) which can be used by a linker when it creates an executable.
The difference between just linking object files together and linking with a static library is that with the library, the linker will only pick the object files from it that are absolutely needed, whereas linking with object files, the linker is forced to take all of the files.
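The "only pick the object files that are needed" behaviour can be shown with a toy model. This is not how any real linker is implemented; the member and symbol names are invented, and real linkers have additional ordering rules.

```python
def select_members(undefined, archive):
    """Toy model of linking against a static library.

    archive maps member name -> (symbols it defines, symbols it needs).
    Returns the members pulled in to satisfy the undefined symbols.
    """
    needed = set(undefined)
    resolved = set()
    pulled = []
    changed = True
    while changed:  # rescan until no further member is pulled in
        changed = False
        for member, (defines, requires) in archive.items():
            if member in pulled:
                continue
            if needed & set(defines):
                pulled.append(member)
                resolved |= set(defines)
                needed |= set(requires)
                needed -= resolved
                changed = True
    return pulled

# Hypothetical library with three object files
lib = {
    "uart.o": ({"uart_init", "uart_putc"}, set()),
    "printf.o": ({"my_printf"}, {"uart_putc"}),
    "adc.o": ({"adc_read"}, set()),
}
print(select_members({"my_printf"}, lib))
```

Here a program that only calls my_printf pulls in printf.o and, through it, uart.o, while adc.o is left out. Passing the three .o files directly on the command line would include all of them.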
A dynamic library is different again - very different. It is a collection of object files, but this time it's the output of a linking process with all of its internal links already resolved.
Linking with a dynamic library means that the linker only resolves symbols in the final executable, but adds none of the object code from the library at link time.
Running an executable linked with dynamic libraries means that at run-time some special software has to ask the operating system to load a specified dynamic library into memory (this is why a dynamic library has to be created by a linker) usually before the program gets to main().
This executable can be a very small file on disk, but when loaded into memory and after loading dynamic libraries the total memory image can be quite large as the whole of the library will have been loaded in.
I believe it's the case that if several different programs need the same dynamic library, the operating system can use virtual memory for each program to load the library, but this only uses one copy of physical memory - which can be a big saving in some cases.
If you ship an executable which needs a dynamic library, then you might be lucky and the file might already be on the user's machine. If not, you would have to install that as well. However, if you find bugs in your library, then just shipping a new version of your dynamic library will fix the bugs.
An executable built with static libraries comes ready to run, but fixing bugs would involve reshipping the whole executable.
In Summary:
                   object files   static library   dynamic library
produced by        compiler       librarian        linker
exe                large          smaller          smallest
ram x 1 copy       large          smaller          largest
ram x n copies     large          smaller          smallest
dep on files       independent    independent      dependent
upgrade lib        no             no               yes
You might want to create a single executable without the dependencies of the libraries.
As far as I know, static libraries should be somewhat faster.
Drawbacks: linking static libraries not only increases the executable size; the library code also gets loaded with each executable. If you use the library in multiple executables, it will be in memory multiple times.
Shared libraries are built as position-independent code, not position-dependent code. Position-independent means that when the library is created its exact location in memory isn't known, so only relative memory addressing is available (e.g. jump to +50 bytes from this instruction and continue execution from there).

Resources