Undefining linker symbols in gcc - gcc

We have a program that runs on an embedded OS. We normally embed a version string in the output binary that identifies all the versions that went into the binary when it was generated. Usually the compilers we use can make sure that the version string is in the binary by creating an "undefined" symbol, which is then resolved by our version string.
However, we have now moved to a Linux based system and gcc.
gcc is removing the version string from the final exe. The final exe is created through linking in a bunch of libraries. Each library has a version string embedded.
gcc is removing the version string because nothing is referencing the string and we have turned on -Os optimisations.
Is there a way of making sure that gcc does not strip a collection of strings (there are about 5-10 version strings we need to embed)?
Thanks.

Try working with --retain-symbols-file (option to the linker)
From the ld man page:
--retain-symbols-file filename
Retain only the symbols listed in the file filename, discarding all others. filename is simply a flat file, with one symbol name per line. This option is especially useful in environments (such as VxWorks) where a large global symbol table is accumulated gradually, to conserve run-time memory.
--retain-symbols-file does not discard undefined symbols, or symbols needed for relocations.
You may only specify --retain-symbols-file once in the command line. It overrides -s and -S.
EDIT I just noticed the last line of the docs quoted above. It will override the 'strip all' option, so I'm not sure this will help you...

OK, to solve this we put the following in a C file:
const char _string_[] = "some string";
Then include the object file in the final link:
gcc <snip> -Wl,--start-group string.o <snip> -Wl,--end-group -Wl,--strip-all -o final.exe
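As a complementary sketch, assuming GCC: the `used` variable attribute asks the compiler to emit an object even when nothing in the translation unit references it, which matters for a `static` string under -Os. The names `app_version_string` and `app_version` and the string contents are hypothetical. The attribute does not by itself make the linker pull an object out of an archive, so naming the object file on the link line as shown above is still needed.

```c
#include <string.h>

/* Hypothetical version string. The GCC 'used' attribute asks the compiler
   to emit this object even though it is static and otherwise unreferenced. */
static const char app_version_string[] __attribute__((used)) = "myapp 1.0.0";

/* Hypothetical accessor: referencing the string from reachable code is
   another way to make sure it survives into the final binary. */
const char *app_version(void)
{
    return app_version_string;
}
```

Once linked, running `strings final.exe` and searching for the version text is a quick way to check that the string survived.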

Related

Does "-Wl,-soname" work on MinGW or is there an equivalent?

I'm experimenting a bit with building DLLs on Windows using MinGW.
A very good summary (in my opinion) can be found at:
https://www.transmissionzero.co.uk/computing/building-dlls-with-mingw/
There is even a basic project which can be used for the purpose of this discussion:
https://github.com/TransmissionZero/MinGW-DLL-Example/releases/tag/rel%2Fv1.1
Note there is a cosmetic mistake in this project which will make it fail out of the box: the Makefile does not create an "obj" directory. Either adjust the Makefile or create the directory manually.
So here is the real question.
How can I change the Windows DLL name so it differs from the actual DLL file name?
Essentially I'm trying to achieve on Windows, the effect which is very well described here on Linux:
https://www.man7.org/conf/lca2006/shared_libraries/slide4b.html
Initially I tried changing "InternalName" and "OriginalFilename" in the resource file used to create the DLL, but that does not work.
In a second step, I tried adding "-Wl,-soname,SoName.dll" on the command that performs the final link, to change the Windows DLL name.
However, that does not seem to have the expected effect (I'm using MinGW 7.3.0, x86_64-posix-seh-rev0).
Two things make me say that:
1/ The test executable still works (I would expect it to fail, because it tries to locate SoName.dll but can't find it).
2/ "pexports.exe AddLib.dll" produces the output below, where the library name hasn't changed:
LIBRARY "AddLib.dll"
EXPORTS
Add
bar DATA
foo DATA
Am I doing anything wrong? Are my expectations wrong perhaps?
Thanks for your help !
David
First of all, I would like to say it's important to use either a .def file for specifying the exported symbols or use __declspec(dllexport) / __declspec(dllimport), but never mix these two methods. There is also another method using the -Wl,--export-all-symbols linker flag, but I think that's ugly and should only be used when quick and dirty is what you want.
It is possible to tell MinGW to use a DLL filename that does not match the library name. In the link step use -o to specify the DLL and use -Wl,--out-implib, to specify the library file.
Let me illustrate by showing how to build chebyshev as both a static and a shared library. Its sources consist of only 2 files: chebyshev.h and chebyshev.c.
1. Compile
gcc -c -o chebyshev.o chebyshev.c -I. -O3
2. Create static library
ar cr libchebyshev.a chebyshev.o
3. Create a .def file (as it wasn't supplied and __declspec(dllexport) / __declspec(dllimport) wasn't used either). Note that this file doesn't contain a line with LIBRARY, allowing the linker to specify the DLL filename later.
There are several ways to do this if the .def file wasn't supplied by the project:
3.1. Get the symbols from the .h file(s). This may be hard, as sometimes you need to distinguish between, for example, type definitions (like typedef, enum, struct) and the actual functions and variables that need to be exported:
echo "EXPORTS" > chebyshev.def
sed -n -e "s/^.* \**\(chebyshev_.*\) *(.*$/\1/p" chebyshev.h >> chebyshev.def
3.2. Use nm to list symbols in the library file and filter out the type of symbols you need:
echo "EXPORTS" > chebyshev.def
nm -f posix --defined-only -p libchebyshev.a | sed -n -e "s/^_*\([^ ]*\) T .*$/\1/p" >> chebyshev.def
4. Link the static library into the shared library.
gcc -shared -s -mwindows -def chebyshev.def -o chebyshev-0.dll -Wl,--out-implib,libchebyshev.dll.a libchebyshev.a
If you have a project that uses __declspec(dllexport) / __declspec(dllimport) things are a lot easier. And you can even have the link step generate a .def file using the -Wl,--output-def, linker flag like this:
gcc -shared -s -mwindows -o myproject.dll -Wl,--out-implib,myproject.dll.a -Wl,--output-def,myproject.def myproject.o
This answer is based on my experiences with C. For C++ you really should use __declspec(dllexport) / __declspec(dllimport).
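For the __declspec(dllexport) / __declspec(dllimport) route, a common pattern is a single macro that switches between the two. The following is a minimal sketch, not taken from the example project: the names `ADDLIB_BUILD` and `ADDLIB_API` are hypothetical, and the header and source are shown together in one block only to keep the example self-contained. On non-Windows targets the macro expands to nothing.

```c
/* Normally passed as -DADDLIB_BUILD only when compiling the DLL itself;
   defined here so the single-file sketch takes the "export" branch. */
#define ADDLIB_BUILD

/* ---- what would live in the public header ---- */
#if defined(_WIN32)
#  ifdef ADDLIB_BUILD
#    define ADDLIB_API __declspec(dllexport)   /* building the DLL */
#  else
#    define ADDLIB_API __declspec(dllimport)   /* using the DLL */
#  endif
#else
#  define ADDLIB_API                           /* no-op elsewhere */
#endif

ADDLIB_API int Add(int a, int b);

/* ---- what would live in the DLL source file ---- */
int Add(int a, int b)
{
    return a + b;
}
```

Client code includes the same header without defining `ADDLIB_BUILD`, so it sees the dllimport form of the declaration.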
I believe I have found one mechanism to achieve on Windows, the effect described for Linux in https://www.man7.org/conf/lca2006/shared_libraries/slide4b.html
This involves dlltool.
In the example Makefile there was originally this line:
gcc -o AddLib.dll obj/add.o obj/resource.o -shared -s -Wl,--subsystem,windows,--out-implib,libaddlib.a
I simply replaced it with the 2 lines below instead:
dlltool -e obj/exports.o --dllname soname.dll -l libAddLib.a obj/resource.o obj/add.o
gcc -o AddLib.dll obj/resource.o obj/add.o obj/exports.o -shared -s -Wl,--subsystem,windows
Really, the key seems to be using dlltool to create an exports file in conjunction with --dllname. This exports file is linked with the object files that make up the body of the DLL, and it handles the interface between the DLL and the outside world. Note that dlltool also creates the "import library" at the same time.
Now I get the expected effect, and I can see that the "Internal DLL name" (not sure what the correct terminology is) has changed:
First evidence:
>> dlltool.exe -I libAddLib.a
soname.dll
Second evidence:
>> pexports.exe AddLib.dll
LIBRARY "soname.dll"
EXPORTS
Add
bar DATA
foo DATA
Third evidence:
>> AddTest.exe
Error: the code execution cannot proceed because soname.dll was not found.
Although the desired effect is achieved, this still seems to be some sort of workaround. My understanding (but I could well be wrong) is that the gcc option "-Wl,-soname" should achieve exactly the same thing. At least it does on Linux, but is this perhaps broken on Windows?

What do the symbols '-Wl,-R' and '-Wl,./lib' mean in a makefile?

Here is an example from a makefile:
LINKFLAGS += -L./lib -lqn -Wl,-R -Wl,./lib
What exactly are the symbols '-Wl,-R' and '-Wl,./lib'?
The symbols in question have no particular meaning to make. They are just text as far as it is concerned, so their meaning depends on how they are used.
If the name "LINKFLAGS" is to be taken as indicative, however, then these will be included among the command-line arguments to link commands make runs (but this is still a question of parts of the makefile that are not in evidence). Such flags are not standardized, so the meaning is still somewhat in question.
If you happen to be using the GNU toolchain then the -Wl option to gcc and g++ assists in passing arguments through to the underlying linker, which would be consistent with the apparent intention. Appearing together as you show them, and supposing that ./lib is a directory, the effect on the GNU linker is equivalent to using its -rpath option and specifying ./lib. That would be a somewhat odd thing to do, but not altogether senseless.
Those are options for the linker (or the link step done by the compiler). You can find them in the gcc man page.
-Wl,option
Pass option as an option to the linker. If option contains commas, it is
split into multiple options at the commas. You can use this syntax to pass
an argument to the option. For example, -Wl,-Map,output.map passes
-Map output.map to the linker. When using the GNU linker, you can also get
the same effect with -Wl,-Map=output.map.
So it is equivalent to passing the options -R and ./lib to the linker. The ld man page states that -R ./lib, when ./lib is a directory, is treated as -rpath=./lib:
-rpath=dir
Add a directory to the runtime library search path. This is used when linking
an ELF executable with shared objects. All -rpath arguments are concatenated
and passed to the runtime linker, which uses them to locate shared objects at
runtime. The -rpath option is also used when locating shared objects which are
needed by shared objects explicitly included in the link; see the description
of the -rpath-link option. If -rpath is not used when linking an ELF executable,
the contents of the environment variable "LD_RUN_PATH" will be used if it is
defined.
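The comma-splitting rule quoted from the gcc man page above can be illustrated with a toy model. This is only a sketch of the documented behaviour, not gcc's actual implementation, and the function name `split_wl` is hypothetical: it takes one `-Wl,...` argument and breaks the text after the prefix into the separate words the linker would receive.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of the documented -Wl rule (not gcc's real code).
   Copies the text after "-Wl," into buf, splits it at commas, and stores
   pointers to the pieces in out[]. Returns the number of pieces,
   0 if arg does not start with "-Wl,", or -1 if buf is too small. */
int split_wl(const char *arg, char *buf, size_t buflen,
             char *out[], int max_out)
{
    const char *prefix = "-Wl,";
    size_t prefix_len = strlen(prefix);
    const char *rest;
    char *piece;
    int count = 0;

    if (strncmp(arg, prefix, prefix_len) != 0)
        return 0;

    rest = arg + prefix_len;
    if (strlen(rest) + 1 > buflen)
        return -1;
    strcpy(buf, rest);

    /* Each comma-separated piece becomes one linker argument. */
    for (piece = strtok(buf, ","); piece != NULL && count < max_out;
         piece = strtok(NULL, ","))
        out[count++] = piece;

    return count;
}
```

Under this model `-Wl,-Map,output.map` yields the two words `-Map` and `output.map`, while `-Wl,-R` and `-Wl,./lib` each yield a single word, so together the linker sees `-R ./lib`.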
gcc documentation indicates that -Wl is used to pass options to the linker.
The GNU ld documentation and the ld.so man page describe what -R does: in summary, it registers in the executable a path where shared libraries are searched for when the executable is launched. The information about --enable-new-dtags and --disable-new-dtags may also be useful in understanding what happens.
The use of ./lib as the argument of -R is odd, since a relative path is resolved against the current working directory at launch; $ORIGIN, which expands to the directory containing the executable, is probably what is desired. Thus, with the various escape mechanisms needed:
LINKFLAGS += -L./lib -lqn -Wl,-R '-Wl,$$ORIGIN/lib'

Two ways of linking to static libraries

Here are a couple of ways to use functions from a static library built with ar (i.e. libSOMETHING.a):
ld -o result myapp.o -Lpath/to/library -lname
ld -o result myapp.o path/to/library/libname.a
Since we omit any dynamic libraries from the command line, this should build a static executable.
What are the differences? For example, are the whole libraries linked in the executable, or just the needed functions? In the second example, does switching the places of the lib and the object file matter?
(PS: some non-GNU ld linkers require all options like -o to be before the first non-option filename, in which case they'd only accept -L... -lname before myapp.o)
In the first line, within each directory the linker looks for a dynamic library (libname.so) before the static library (libname.a). The standard library path is also searched for libname.*, not just path/to/library.
From "man ld"
On systems which support shared libraries, ld may also search for
files other than libnamespec.a. Specifically, on ELF and SunOS
systems, ld will search a directory for a library called
libnamespec.so before searching for one called libnamespec.a. (By
convention, a ".so" extension indicates a shared library.)
The second line forces the linker to use the static library at path/to/library/libname.a.
If there is no dynamic library built (libname.so), and the only library available is path/to/library/libname.a, then the two lines will produce the same "result" binary.

How to force gcc to link like g++?

In this episode of "let's be stupid", we have the following problem: a C++ library has been wrapped with a layer of code that exports its functionality in a way that allows it to be called from C. This results in a separate library that must be linked (along with the original C++ library and some object files specific to the program) into a C program to produce the desired result.
The tricky part is that this is being done in the context of a rigid build system that was built in-house and consists of literally dozens of include makefiles. This system has a separate step for the linking of libraries and object files into the final executable but it insists on using gcc for this step instead of g++ because the program source files all have a .c extension, so the result is a profusion of undefined symbols. If the command line is manually pasted at a prompt and g++ is substituted for gcc, then everything works fine.
There is a well-known (to this build system) make variable that allows flags to be passed to the linking step, and it would be nice if there were some incantation that could be added to this variable that would force gcc to act like g++ (since both are just driver programs).
I have spent quality time with the gcc documentation searching for something that would do this but haven't found anything that looks right. Does anybody have suggestions?
Given such a terrible build system, write a wrapper around gcc that execs gcc or g++ depending on the arguments. Replace /usr/bin/gcc with this script, or modify your PATH so that this script is used in preference to the real binary.
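#!/bin/sh
# "wibble wobble" is a placeholder: test for whatever marks a plain C build.
if [ "$1" = "wibble wobble" ]
then
    # "$@" forwards every argument unchanged, even ones containing spaces.
    exec /usr/bin/gcc-4.5 "$@"
else
    exec /usr/bin/g++-4.5 "$@"
fi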
The problem is that a C compiler emits plain (unmangled) symbol names, whereas a C++ compiler emits C++-mangled names, so the two do not match at link time.
Your best bet is to use
extern "C"
before declarations in your C++ builds, and no prefix on your C builds.
You can detect C++ using
#ifdef __cplusplus
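A minimal sketch of how these two pieces are usually combined in a header shared by the C program and the C++ wrapper. The function name `cpplib_wrapper_sum` is hypothetical, and the definition is a plain stand-in for what would really be implemented in the C++ wrapper layer.

```c
/* ---- what would live in the shared wrapper header ---- */
#ifdef __cplusplus
extern "C" {            /* only seen by C++ compilers */
#endif

/* Hypothetical wrapper entry point, callable from C. */
int cpplib_wrapper_sum(const int *values, int count);

#ifdef __cplusplus
}
#endif

/* ---- stand-in for the definition in the wrapper library ---- */
int cpplib_wrapper_sum(const int *values, int count)
{
    int total = 0;
    int i;

    for (i = 0; i < count; ++i)
        total += values[i];
    return total;
}
```

Because the guard hides `extern "C"` from C compilers, the same header can be included unchanged from both the .c sources and the C++ wrapper.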
Many thanks to bmargulies for his comment on the original question. By comparing the output of running the link line with both gcc and g++ using the -v option and doing a bit of experimenting, I was able to determine that "-lstdc++" was the magic ingredient to add to my linking flags (in the appropriate order relative to other libraries) in order to avoid the problem of undefined symbols.
For those of you who wish to play "let's be stupid" at home, I should note that I have avoided any use of static initialization in the C++ code (as is generally wise), so I wasn't forced to compile the translation unit containing the main() function with g++ as indicated in item 32.1 of FAQ-Lite (http://www.parashift.com/c++-faq-lite/mixing-c-and-cpp.html).

Different ways to specify libraries to gcc/g++

I'd be curious to understand if there's any substantial difference in specifying libraries (both shared and static) to gcc/g++ in the two following ways (CC can be g++ or gcc)
CC -o output_executable /path/to/my/libstatic.a /path/to/my/libshared.so source1.cpp source2.cpp ... sourceN.cpp
vs
CC -o output_executable -L/path/to/my/libs -lstatic -lshared source1.cpp source2.cpp ... sourceN.cpp
I can only see a major difference being that passing directly the fully-specified library name would make for a greater control in choosing static or dynamic versions, but I suspect there's something else going on that can have side effects on how the executable is built or will behave at runtime, am I right?
Andrea.
OK, I can answer this myself based on some experiments and a closer reading of the gcc documentation:
From gcc documentation: http://gcc.gnu.org/onlinedocs/gcc/Link-Options.html
[...] The linker handles an archive file by scanning through it for members which define symbols that have so far been referenced but not defined. But if the file that is found is an ordinary object file, it is linked in the usual fashion. The only difference between using an -l option and specifying a file name is that -l surrounds library with 'lib' and '.a' and searches several directories.
This also answers the related question about the third option, specifying object files directly on the gcc command line: in that case all the code in the object files becomes part of the final executable, whereas with archives only the object files that are really needed are pulled in.

Resources