Linking binary against functions/data in specific location in memory - gcc

I'm currently in the process of writing an intermediate-memory bootloader for an ATMega.
I'd like to place a section of commonly used functions and data in a specific location in memory, such that:
the limited size of the bootloader section is not exceeded
library functions, drivers, etc. are not duplicated in the application section, wasting space
For illustrative purposes, a map of the desired memory layout is below:
Following some help in this thread on avrfreaks, I've reached the point where a custom linker script successfully moves all code not tagged with __attribute__ ((section(".boot"))) into the shared library section. (This is in my bootloader + library development environment; applications will be developed in separate projects.)
It was suggested in the avrfreaks thread that I can link my applications by using avr-objcopy --strip-all --keep-symbol=fred --keep-symbol=greg ... boot.elf dummy.elf to create a symbol reference of what I have in my shared library, and then linking my applications against this memory layout with avr-gcc -o app.elf -Wl,--just-symbols=dummy.elf app1.o app2.o ....
The problem I face here is that I need to specify each symbol I want to keep in my dummy.elf. I can use the --keep-symbols=<file> option to specify a text file listing the symbols to keep, but I still must generate this list.
I've noticed that there are a number of symbols I don't want to include (such as C environment set-up code that shares names, but not functionality, between the bootloader and the application). These seem to start with the prefix '_', but of course there are some useful and large library functions with the same prefix (e.g. *printf and math routines). Perhaps there won't be conflicts if I link my application against the existing runtime code in the application/bootloader?
How can I generate a list of symbols for my library section that contains the code I've written (maybe with some sed magic and scanning of header files) and excludes any symbols that may conflict when linking the application?
The project can be viewed in its current state at this GitHub repository.
Edit: I want to make clear that I could tag everything I want to be in the shared library section with __attribute__ ((section(".library"))), but as I also want to share some rather large libc stuff (vsprintf, etc) between the bootloader and application, this becomes cumbersome very quickly. As such, I've elected to put everything not tagged as boot in the library memory region via a linker script.
Perhaps I just need some advice on my linker script, as I'm not super sure what I'm doing there.

Consider using -R <file> (the short form of --just-symbols=<file>) as a linker option (gcc -Wl,-R -Wl,<file>).
This will generate references to (global) symbols in <file> just as if they were linked normally, but not include the referenced code.
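If you still want an explicit keep list for avr-objcopy, one rough way to generate it is to filter the output of nm. The sketch below is only an illustration: the function name keep_list is made up, and the rule "drop every name starting with an underscore" is an assumption that will also drop underscore-prefixed helpers you may actually want, so treat the result as a starting point to edit by hand.

```shell
#!/bin/sh
# keep_list: read `nm -g --defined-only` style output on stdin
# ("<address> <type> <name>" per line) and print the names of global
# text/data/bss/read-only symbols that do not start with an underscore.
# Assumption: underscore-prefixed names are runtime/start-up symbols.
keep_list() {
    awk 'NF == 3 && $2 ~ /^[TDBR]$/ && $3 !~ /^_/ { print $3 }'
}

# Possible use (file names taken from the question, untested on a real project):
#   avr-nm -g --defined-only boot.elf | keep_list > keep.txt
#   avr-objcopy --strip-all --keep-symbols=keep.txt boot.elf dummy.elf
```

Lowercase nm type letters (local symbols) and undefined symbols are skipped, since only global, defined symbols are useful to an application linking against the library section.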

Related

Why would a library consist of both a .lib and .dll file? [duplicate]

I know very little about DLLs and LIBs other than that they contain vital code required for a program to run properly - libraries. But why do compilers generate them at all? Wouldn't it be easier to just include all the code in a single executable? And what's the difference between DLLs and LIBs?
There are static libraries (LIB) and dynamic libraries (DLL) - but note that .LIB files can be either static libraries (containing object files) or import libraries (containing symbols to allow the linker to link to a DLL).
Libraries are used because you may have code that you want to use in many programs. For example if you write a function that counts the number of characters in a string, that function will be useful in lots of programs. Once you get that function working correctly you don't want to have to recompile the code every time you use it, so you put the executable code for that function in a library, and the linker can extract and insert the compiled code into your program. Static libraries are sometimes called 'archives' for this reason.
Dynamic libraries take this one step further. It seems wasteful to have multiple copies of the library functions taking up space in each of the programs. Why can't they all share one copy of the function? This is what dynamic libraries are for. Rather than building the library code into your program when it is compiled, it can be run by mapping it into your program as it is loaded into memory. Multiple programs running at the same time that use the same functions can all share one copy, saving memory. In fact, you can load dynamic libraries only as needed, depending on the path through your code. No point in having the printer routines taking up memory if you aren't doing any printing. On the other hand, this means you have to have a copy of the dynamic library installed on every machine your program runs on. This creates its own set of problems.
As an example, almost every program written in 'C' will need functions from a library called the 'C runtime library', though few programs will need all of the functions. The C runtime comes in both static and dynamic versions, so you can choose which version your program uses depending on its particular needs.
Another aspect is security (obfuscation). Once a piece of code is extracted from the main application and put in a separate Dynamic-Link Library, it is easier to attack and analyse (reverse-engineer), since it has been isolated. When the same piece of code is kept in a LIB library, it becomes part of the compiled (linked) target application, and it is thus harder to isolate (differentiate) that piece of code from the rest of the target binary.
One important reason for creating a DLL/LIB rather than just compiling the code into an executable is reuse and relocation. The average Java or .NET application (for example) will most likely use several 3rd party (or framework) libraries. It is much easier and faster to just compile against a pre-built library, rather than having to compile all of the 3rd party code into your application. Compiling your code into libraries also encourages good design practices, e.g. designing your classes to be used in different types of applications.
A DLL is a library of functions that are shared among executable programs. Just look in your windows/system32 directory and you will find dozens of them. When your build creates a DLL, it normally also creates a .lib file (an import library) so that the application's *.exe can resolve the symbols exported by the DLL.
A .lib is a library of functions that are statically linked to a program; they are NOT shared by other programs. Each program that links with a *.lib file has all the code in that file. If you have two programs A.exe and B.exe that link with C.lib, then A and B will each contain the code in C.lib.
How you create DLLs and libs depends on the compiler you use. Each compiler does it differently.
One other difference lies in performance.
Because a DLL is loaded at run time by the .exe(s), its imports must be resolved when the program loads and calls into it go through an extra level of indirection, so performance can be slightly lower than with static linking.
On the other hand, a .lib is code that is linked statically at build time into every executable that requests it. Each .exe then carries its own copy of the code and calls it directly, which can be slightly faster.

How can shared object files be linked with other shared or regular objects to produce new object files?

I'm reading the ELF specification here: https://refspecs.linuxbase.org/elf/elf.pdf
On page 15:
"A shared object file holds code and data suitable for linking in two contexts. First, the link editor may process it with other relocatable and shared object files to create another object file. Second, the dynamic linker combines it with an executable file and other shared objects to create a process image."
I've seen multiple questions raised by others on SO asking about statically linking shared objects, which seems to be what this paragraph is suggesting, and yet the common answer seems to usually be that doing this is not possible.
Either I'm misunderstanding what this is saying (probably), or there isn't a consensus about what can be done with shared objects.
What does this paragraph mean?
"Either I'm misunderstanding what this is saying (probably)"
What the paragraph appears to be saying is that there are two contexts in which a shared library may be used:
1. By the static linker (aka link editor), to build a new shared library or an executable out of relocatable objects (i.e. build a new ET_DYN or ET_EXEC from ET_RELs), and
2. By the dynamic linker, to build a process image.
Note that the new shared library built in case 1 does not include the existing shared library in it. The existing library is needed only so the static linker knows how the new shared library (or executable) should reference symbols from the existing library.
Most of the questions I've seen (and probably the ones you are referring to) are "how do I put existing libfoo.so into a new libbar.so?", and that is in fact impossible.
Update:
"I'm still not sure I understand. Is #1 the initial creation of the shared library?"
Yes: creation of a new shared library or an executable.
"Because then an executable also has two contexts: 1) The static linker creating the executable out of relocatable objects and 2) By use of the loader to build a process image."
That is true, but only for dynamically linked executables. Fully-static executables do not involve the loader at all.
"I could say a similar thing for relocatable objects as well"
Not really: relocatable objects do not normally participate in process image building (there are exceptions, but they are really special and odd-ball), and they are certainly not handled by the dynamic linker (loader).
For all practical purposes, relocatable objects are only useful as building blocks for a shared library or an executable.
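To see which of these file types (ET_REL, ET_EXEC, ET_DYN) a given file actually is, you can look at the e_type field in the ELF header; readelf -h reports it as "Type:". The following is a minimal sketch of the same check done by hand. The function name elf_type is made up, it assumes a little-endian ELF file, and it does not verify the ELF magic number.

```shell
#!/bin/sh
# elf_type: print the e_type of an ELF file. e_type is the 16-bit field at
# byte offset 16, right after the 16-byte e_ident array.
# Assumption: the file is little-endian and the type value fits in the low
# byte, so only that single byte is read.
elf_type() {
    code=$(od -An -tu1 -j16 -N1 "$1" | tr -d ' \n')
    case "$code" in
        1) echo ET_REL ;;   # relocatable object (.o)
        2) echo ET_EXEC ;;  # executable
        3) echo ET_DYN ;;   # shared object
        *) echo "unknown ($code)" ;;
    esac
}
```

Running elf_type on a .o file and on a .so file produced from it should show the ET_REL to ET_DYN step that the static linker performs in the first context.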

Generate library from ELF file

I'm trying to generate a static library from a compiled ELF file.
Previously, I've been able to generate the library by compiling my source code to object files, then passing those objects to avr-ar to successfully create my library. In order to reduce the code space of the project, I've switched over to using link-time optimisation to save ~1.5 kB of space. However, to do so I end up passing all my source and header files to avr-gcc in one invocation, and it spits out a .elf file.
I can't seem to get the -flto option working with the linker (I'm using a custom linker script) and compiler driver, otherwise I'd have the object files I need.
Is it possible to take this generated .elf and push it through ar to generate a library?
Problem Context:
This is related to this problem. I've written the shared libraries and bootloader section, and am using this linker script to set out my flash space. Here's the Makefile that drives all this - it's very hacked together.
Ideally, I'd like to be able to compile my src/ directory to separate object files in obj/, all with link-time optimisation enabled to cut down on code space as much as possible, but still leaving unused functions in the output (the shared library that is stored in flash is not fully utilised by the bootloader application, but may be linked against by the loaded applications). I'd then like to be able to link those objects together to create a .elf and libbootloader.a. The .elf is then used to generate a binary to flash to my AVR, and the bootloader library is referenced when building user applications that refer to the library already stored in flash. (Perhaps I just want to link against a list of symbols referencing the shared library section?)

gcc linking in the same lib twice

This might seem like a strange idea, but I need the same library linked into my code twice.
A bit of background: I am writing a bit of firmware with a bootloader and an application. Both bits of code need to use the comms library (SPI) and some other system libs to run. I cannot replace those libraries from the bootloader while it is using them to run. Hence I would like to include the lib twice, once for the bootloader and once for the application.
Previously I have done this by making two programs and splicing the HEX files as part of the build process. This time I would like to make one ELF that contains both application and bootloader (with debugging symbols for both). Then I can generate the bootloader image by stripping it out in a post-build step. This allows me to build a complete image and use the linker to avoid collisions etc. without making my own tool, and means I can debug errors in the bootloader and the application easily. The only stumbling block would be having the lib multiple times.
I figure a solution might be to make two separate static libs, i.e. bootloader.a and application.a, that both already contain the other lib, but this seems messy. Does anyone know a better solution?

Using dynamic library

When I would like to compile a program which uses a dynamic library, do I have to install (i.e. copy to a specific place, say, /usr/share/lib) this library? Or is it ok, if I put this library to any place somewhere and later during linking I point the linker to it, e.g. '-L ./thelibfolder'?
"do I have to install (i.e. copy to a specific place, say, /usr/share/lib) this library?"
No.
For a UNIX shared library, you need to arrange for two things:
1. You have to make the library known to the static linker while linking the main executable. Usually this is achieved by adding -L/path/to/directory -lfoo to the link line.
2. You have to make the runtime loader search /path/to/directory as well. This is system-specific. On many systems, setting the LD_LIBRARY_PATH environment variable achieves the desired result, though this is usually not the preferred method. Another method is to encode this path into the application itself, e.g. on Linux one would add -Wl,-rpath=/path/to/directory to the application link line.
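The two requirements above can be tried out end to end with a throwaway library. This is only a sketch, assuming Linux with gcc and GNU ld; the function name build_and_run_demo and all file names (foo.c, main.c, libfoo.so) are made up for the illustration.

```shell
#!/bin/sh
# Hypothetical demo: build libfoo.so in a temporary, non-system directory,
# link a program against it, and run the program without LD_LIBRARY_PATH.
build_and_run_demo() {
    dir=$(mktemp -d) || return 1
    mkdir "$dir/lib" || return 1

    # A one-function library and a program that calls it.
    cat > "$dir/foo.c" <<'EOF'
int foo(void) { return 42; }
EOF
    cat > "$dir/main.c" <<'EOF'
#include <stdio.h>
int foo(void);
int main(void) { printf("%d\n", foo()); return 0; }
EOF

    # Build the shared library outside any system library directory.
    gcc -shared -fPIC -o "$dir/lib/libfoo.so" "$dir/foo.c" || return 1

    # -L/-l tell the static linker where libfoo.so is;
    # -Wl,-rpath records the directory so the runtime loader searches it.
    gcc -o "$dir/main" "$dir/main.c" -L"$dir/lib" -lfoo -Wl,-rpath="$dir/lib" || return 1

    "$dir/main"
    status=$?
    rm -rf "$dir"
    return "$status"
}
```

If you drop the -Wl,-rpath option, the link should still succeed but the program will typically fail to start, because the loader no longer knows where to find libfoo.so.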
