Using exceptions in a .dll makes any executable using it crash - gcc

I have a homemade library: mylib.dll
Within this library, C++ exceptions have recently been introduced - all are caught inside the library, i.e. none are thrown out to be caught by the users of the library.
Historically, this library has been statically linked with libgcc and libstdc++ - but as soon as exceptions were introduced, any executable using the shared library would crash on any exception (caught or not) inside mylib.dll.
The executable - let's call it main.exe - is dynamically linked with the library like this:
ld [args] -Wl,--start-group mylib.dll.a <other libraries> -Wl,--end-group ...
As long as mylib.dll is statically linked with the std libraries, any exception inside mylib.dll will make main.exe crash.
My guess is that this is because there are now several instances of the std library's exception machinery (one for the lib and one for the executable), which then cannot handle the exception properly, perhaps because of stack unwinding?
It does work when libgcc and libstdc++ are not statically linked into mylib.dll, as expected. In that case they are both linked dynamically, and main.exe works just fine with the exception handling inside mylib.dll.
It also worked when statically linking libgcc and libstdc++ into main.exe as well. Maybe this is because the exceptions were contained inside mylib.dll - and it would then not work if they were thrown from inside the .dll and caught in main.exe? This is only speculation; I don't know.
In any case, it seems that one should never statically link any standard library into a homemade .dll, and should always link the std libraries dynamically in this case. This is what I will go with from now on.
I would still like to understand, though, why it crashes in my case, since the exceptions are contained within mylib.dll.
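For concreteness, the "contained" pattern looks roughly like this (a minimal, hypothetical sketch - the function name, signature and error-code convention are made up, not the actual library code):

// mylib.cpp -- built into mylib.dll; the exception never crosses the DLL boundary.
#include <stdexcept>

extern "C" int mylib_do_work(int input)
{
    try {
        if (input < 0)
            throw std::invalid_argument("negative input");
        return input * 2;
    } catch (const std::exception&) {
        // Caught right here, inside the DLL; callers of mylib.dll only ever see a return code.
        return -1;
    }
}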

Related

Create a statically linked shared library

Is it possible to create a shared library which is itself statically linked, i.e. it does not depend on other shared libraries?
Let me be a little bit more concrete.
I want to create a shared library, say mylib.so, which makes use of some other special libraries (in my case it's Intel MKL and OpenMP). Since I have installed these libraries, I can build mylib.so and include it in other programs without any problem.
However, if I want to use the library (or the executables including it) on another machine, I first have to install all the Intel stuff. Is there a way to avoid this? My first try was to add the option -static when building mylib.so, but this doesn't seem to do anything.
I'm using icc.
Is it possible to create a shared library which is itself statically linked, i.e. it does not depend on other shared libraries?
Not on Linux, not when using GLIBC (your shared library will always depend on at least ld-linux*.so*).
I want to create a shared library, say mylib.so, which makes use of some other special libraries (in my case its intel mkl and openMP).
There is no problem1 statically linking Intel MKL and OpenMP libraries into mylib.so -- you just don't want to depend on these libraries dynamically (in other words, you are asking for an impossible thing which you don't actually need).
To do so, you need two things:
Link mylib.so with archive versions of the libraries you don't want to depend on dynamically, e.g. gcc -o mylib.so -shared mylib.c .../libmkl.a ...
The libraries which you want to statically link into mylib.so must have been built with position-independent code (i.e. with -fPIC flag).
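Putting those two points together, the build might look roughly like this (an illustrative sketch only - it assumes the archives were indeed built as position-independent code, and the actual MKL archive names and extra link flags depend on your MKL version and configuration):
gcc -fPIC -c mylib.c
gcc -shared -o mylib.so mylib.o .../libmkl.a ...
Afterwards, ldd mylib.so should no longer list the MKL shared objects among the dependencies.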
Update:
What if the archived version isn't available?
Then you can't link it into your library.
E.g. I'm using intel/oneapi/intelpython/latest/lib/libstdc++.so and there is no corresponding .a file.
This is a special case: you wouldn't want to link that version into your library even if it were available.
Instead, your program should use the version installed on the target system.
Having two separate versions of libstdc++ (e.g. one statically linked and the other dynamically linked) in a single process will end very badly -- either with a crash, or with silent stack or heap corruption.
1 Note that linking in somebody else's library and distributing it may have licensing implications.

Is ld called at both compile time and runtime?

I am trying to understand how linking and loading work. My understanding is that the Unix program "ld" contains both linking and loading functionality. When gcc is invoked, after preprocessing, compiling, and assembling, the linker is called which links all object files and .a files into an executable, along with minimal instructions for how shared libraries should be "connected" (what is the correct terminology here?) at runtime. This linker is ld.
At runtime, my understanding is that the executable is loaded into memory, although I'm not sure how. My specific questions are as follows:
1) Are shared object files being "linked" at compile time, or is there another word for what is happening?
2) At runtime, is ld being called for a second time? How can I see proof of this for my executable (on Linux and on MacOS)?
3) Are shared object files being "linked" at runtime, or is there another word for the process when shared objects are read from the location in LD_LIBRARY_PATH at runtime?
Thanks!
Is ld called at both compile time and runtime?
No: ld is not called at either compile time or runtime.
When gcc is invoked, after preprocessing, compiling, and assembling, the linker is called which links all object files and .a files into an executable
Most moderately complicated programs use separate compilation and linking steps.
At compilation, a set of relocatable object files is produced (preprocessing, compilation and assembling are invoked at that step). Optionally the .o files are archived into an archive library (libsomething.a).
Then a link step is performed (often this is called "static linking", to differentiate this step from the "dynamic loading" that will happen at runtime), producing an executable or a shared library. Only at this step is /usr/bin/ld invoked. On Linux, ld is part of the binutils package.
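Concretely, for a toy project the two steps might look like this (illustrative file names):
gcc -c main.c -o main.o        # compile step: preprocess, compile, assemble -> relocatable object
gcc -c util.c -o util.o
ar rcs libutil.a util.o        # optionally archive objects into a static library
gcc main.o libutil.a -o prog   # link step: the gcc driver invokes /usr/bin/ld here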
along with minimal instructions for how shared libraries should be "connected"
The linker records which shared libraries are required at runtime, and possibly which versions of libraries or symbols are required.
It also records which runtime loader should be used to load the required shared libraries.
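On Linux you can inspect what was recorded using readelf from binutils (on MacOS, otool -L shows roughly equivalent information), for example:
readelf -d prog | grep NEEDED        # the shared libraries recorded at link time
readelf -l prog | grep interpreter   # the runtime loader requested for this executable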
At runtime, my understanding is that the executable is loaded into memory, although I'm not sure how.
The kernel loads the executable into memory, and checks whether a runtime loader was requested at static link time. If it was, the dynamic loader is also loaded into memory, and execution control is passed to it (instead of the main executable).
It is then the job of the dynamic loader to examine the executable for instructions on which other libraries are required, check whether correct versions can be found, load them into memory, and arrange things such that symbol resolution will work between the main executable and the shared libraries. This is the runtime loading step, often also called dynamic linking.
The dynamic loader can be part of the OS, but on Linux it's part of libc (GLIBC, uClibc and musl each have their own loader).
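To see this runtime step happen for your own executable (question 2), two common ways on a GLIBC system are:
ldd ./prog              # asks the dynamic loader to resolve and list the required libraries
LD_DEBUG=libs ./prog    # makes GLIBC's loader print its library search and loading decisions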
No. ld does the linking, as in creating a library or an exe; ld*.so is the loading part. Also, ld*.so is part of the OS, not the gcc suite, afaik. ld is generally part of (GNU) binutils on a gcc-based system (but it is e.g. usually LLVM's lld on an LLVM-based system).
ld*.so is ld-linux-{arch}.so.2 on Linux and /libexec/ld-elf.so on e.g. FreeBSD.

How does gcc/ld find zlib.so?

I've used zlib for ages and never thought about the fact that it is named slightly unconventionally. While most libraries on Linux follow the naming convention of lib<name>.so for shared objects and lib<name>.a for archives, zlib is named zlib.so/zlib.a. My question is: how does gcc/ld know to look for zlib.so when I use -lz as a link flag?
I understand that for linking, gcc invokes ld, which searches for libraries in certain default paths and any path specified with -L, and it adds the lib prefix and the .so or .a suffix as necessary. Oddly, gcc's manual page for linking options only mentions that the linker can find archives; there is no mention of the .so extension. The man page for ld at least mentions both extensions, but still only mentions searching by prepending lib to the specified library name. How does ld know to add the lib after the z for zlib? I've never seen this happen to another library.
gcc has several different methods for linking libraries, shared or static. If you specify -lz, gcc is going to look for libz.so (possibly with some version bits between the libz and the .so, but the important part is the file name will start with libz and end with .so), or for libz.a (again, possibly with version info) if you are compiling statically, or as a fallback if the shared library does not exist. If you specify -lzlib it will look for libzlib.so (which is not the standard name - the package is often named zlib, but the library itself is libz). Another way of linking would be to not use the -l<lib> option, and just specify /path/to/zlib.so or -L /path/to zlib.so (or zlib.a if you want). In this case, the library doesn't have to have the lib prefix, but you would have to explicitly provide any version info, unless provisions are made for a symbolic link or something similar to provide the literal name zlib.so.
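If you want to see which file -lz actually resolves to on your system, you can ask GNU ld to trace its inputs, e.g. (main.c being your own source file):
gcc main.c -lz -Wl,--trace
This prints each input file the linker opens, including the library it found for -lz.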
Applications can also load shared libraries at runtime via dlopen() and its other associated functions, in which case the library can also be named whatever you want it to be (this doesn't work for static libraries, of course).
So, if the library you are looking at is actually called zlib.so, then it is not being found by gcc ... -lz, unless it just happens to be a symbolic link to libz.so (or vice versa, in which case gcc is really just using libz.so, which happens to have the same content as your zlib.so). However, gcc might be using it if the build process explicitly names the library in the link stage (not using -l<lib>), or if your application loads it via dlopen() (but in that case, it's not really linked to your program - it's just loaded at run time).
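For completeness, here is a minimal sketch of the dlopen() route mentioned above, where the file name can be anything (the path is hypothetical; on older glibc you also need to link with -ldl):

// load_zlib.cpp -- loads a library by an arbitrary file name at runtime
#include <dlfcn.h>
#include <cstdio>

int main()
{
    void *handle = dlopen("./zlib.so", RTLD_NOW);   // no "lib" prefix required here
    if (!handle) {
        std::fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }
    // Look up a symbol zlib is known to export, e.g. zlibVersion().
    typedef const char *(*version_fn)(void);
    version_fn get_version = reinterpret_cast<version_fn>(dlsym(handle, "zlibVersion"));
    if (get_version)
        std::printf("loaded zlib %s\n", get_version());
    dlclose(handle);
    return 0;
}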

Is there an advantage to linking a shared object library from gcc generated objects with g++?

I came across a project recently that created its shared object libraries by linking gcc-generated object files (using the CC gnu makefile macro) with g++.
Aside from (possibly) ensuring the source code is encapsulated inside #ifdef __cplusplus / extern "C" { / #endif constructs to avoid name mangling problems, is there any reason why this would be ... better?
If they only link with g++, then adding the preprocessor checks and extern "C" is useless: those only affect preprocessing and compilation, and by the linking stage that's already done.
They might have wanted to ensure that exceptions could propagate through their C library, but to do that they only need to link to libgcc not libstdc++.
Maybe they just wanted the shared library to depend on libstdc++, so that users of the library would also depend on libstdc++ and wouldn't have to link to it explicitly, although that might not work as they expected.
So in short, no, I can't think of any good reason if all the code is C code and not C++ code.
However just because something is compiled with gcc doesn't mean it's C code, you can use the gcc executable to compile C++ code and it will invoke the C++ front-end (cc1plus) instead of the C front-end (cc1). If the C++ code uses the standard library then you either need to link with -lstdc++ or use g++ to link (which automatically links with -lstdc++). So maybe the answer is that it's C++ code, and the fact they compiled the objects with gcc not g++ made you think it was C code.
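A quick way to see the difference, with a hypothetical C++ source file foo.cpp that uses the standard library:
gcc -c foo.cpp -o foo.o      # fine: the gcc driver sees the .cpp extension and runs the C++ front-end
gcc foo.o -o prog            # typically fails with undefined references to std:: symbols
g++ foo.o -o prog            # works: g++ adds -lstdc++ (and the other C++ defaults) automatically
gcc foo.o -o prog -lstdc++   # also works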

mingw, cross-compilation, gcc

Some context:
My program uses the library libfl.a (the flex library).
I compile it under Linux:
gcc lex.yy.c -lfl
I have the mingw compiler i586-mingw32msvc-gcc installed (simple 'hello world' stuff compiles without problems)
I use Ubuntu (probably does not matter)
I want to compile under Linux for Windows (produce a binary .exe file which would be usable on Windows)
My problem and questions:
When I try compiling my program
i586-mingw32msvc-gcc lex.yy.c -lfl
I get errors:
[...] undefined reference to '_yywrap'
[...] undefined reference to '_WinMain@16'
Do I understand correctly that I have to compile the content of libfl.a also with i586-mingw32msvc-gcc to be able to use it in this cross-compilation?
In the source code there is a function yywrap(), but not _yywrap(). Why do I get an error for the function with an underscore _?
What's up with _WinMain@16? (there is no usage of it in the source code)
My goal would be to understand what is happening here.
If I get it to work, then it's bonus points :)
Any help is appreciated
Yes, certainly. And here's why:
C++ encodes additional semantic information about functions, such as namespace/class affinity, parameter types etc., in the function name (this is called name mangling). Thus C++ symbol names are somewhat different from what you see in the source code. And each compiler does it in its own way, which is why you are generally unable to link against the C++ functions of a library built with a different compiler (C function names, though, don't get mangled).
Judging by the mangling style, the undefined symbols are brought in by the Microsoft C++ compiler. I don't know exactly why it needs WinMain, but after you recompile the libs with the cross-compiler, all of these errors will likely be gone. And yes: maybe the WinMain() thing arises from msvc using it instead of main(), whose presence is obligatory for a well-formed program? ;)
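As a side note towards the bonus points: libfl.a essentially just provides a default yywrap() and a default main(), and the missing main() is likely what makes the mingw startup code look for _WinMain@16 instead. So instead of cross-compiling flex's library, you can supply those two functions yourself and drop -lfl entirely. A hypothetical driver (shown as C++ with extern "C"; the same two definitions work just as well in a plain C file without the extern "C"):

// driver.cpp -- replaces what -lfl normally provides; compile it with the same cross-toolchain
// and link it together with lex.yy.c, so nothing from libfl.a is needed.
extern "C" int yylex(void);                 // the scanner generated by flex in lex.yy.c
extern "C" int yywrap(void) { return 1; }   // "no more input" -- normally supplied by libfl

int main()
{
    while (yylex() != 0)                    // yylex() returns 0 at end of input
        ;
    return 0;
}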

Resources