How to make C function dynamically exported - gcc

My application links against a static library that provides an extension API. The API can call an extension's init function either from an external shared library or from the "local" binary; that is, I can link the extension's init function statically into the main executable.
The local function is looked up with dlsym, so the init function must be dynamically exported from the main binary. That is, the following nm call:
nm -CD <binary>
should list my init function.
Let's assume the init function has this signature:
int init_func(INIT_STRUCT *);
This function is never called directly; it is only meant to be loaded through dlsym.
So I have two related questions:
How do I force the linker not to exclude this function from the generated binary?
How do I force the compiler/linker to export this function dynamically?
(I use gcc to compile and link my program)

Unfortunately, the GNU toolchain does not export symbols from executables by default (as opposed to shared libraries, which export all their symbols by default). You can use the big-hammer -rdynamic flag, which tells the linker to export all symbols from your executable. A less intrusive solution is to provide an explicit exports file via -Wl,--dynamic-list when linking (see the example usage in the Clang sources).
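As a rough illustration of the lookup side, here is a minimal sketch of how a host could search its own dynamic symbol table. Only the init_func signature comes from the question; the INIT_STRUCT layout, the find_local_init helper, and the body of init_func are assumptions made up for this example.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical stand-in for the real INIT_STRUCT from the extension API. */
typedef struct { int version; } INIT_STRUCT;

typedef int (*init_fn)(INIT_STRUCT *);

/* visibility("default") keeps the symbol exportable even when the file is
 * compiled with -fvisibility=hidden. It does not by itself put the symbol in
 * the dynamic symbol table of an executable: that still needs -rdynamic or a
 * --dynamic-list entry at link time. */
__attribute__((visibility("default")))
int init_func(INIT_STRUCT *s)
{
    s->version = 1;
    return 0;
}

/* Look up `name` in the running program and the libraries loaded with it.
 * Returns NULL when the symbol is not dynamically exported. */
init_fn find_local_init(const char *name)
{
    void *self = dlopen(NULL, RTLD_NOW); /* NULL = handle for the main program */
    if (self == NULL)
        return NULL;
    /* POSIX requires this object-to-function pointer conversion to work. */
    init_fn fn = (init_fn)dlsym(self, name);
    dlclose(self);
    return fn;
}
```

Linked with something like gcc -rdynamic (plus -ldl on older glibc), find_local_init("init_func") should return the address of init_func; without -rdynamic or a dynamic list it is expected to return NULL, which is exactly the symptom described in the question.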

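OK, I will post an answer based on the previous comments.
To make all functions dynamically exported, use -rdynamic.
To force a single function to be linked even if nothing references it, add -u <function> to the link line, before the library that defines it.
To link every object in a static library, even unreferenced ones, put --whole-archive before the library and --no-whole-archive after it to return to normal linking. When linking through gcc, pass these as -Wl,--whole-archive and -Wl,--no-whole-archive.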
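To make the link lines concrete, here is a sketch of an extension source file that is meant to live in a static library. The file names (ext.c, libext.a, main.o, app) and the INIT_STRUCT fields are hypothetical; the flags in the comment are the ones named in the answer above.

```c
/* ext.c: archived into a static library, for example
 *   gcc -c ext.c && ar rcs libext.a ext.o
 * Nothing in the host references init_func, so a plain
 *   gcc -o app main.o libext.a
 * leaves ext.o out of the executable. Either of these link lines pulls it
 * in and exports it:
 *   gcc -rdynamic -o app main.o -Wl,-u,init_func libext.a
 *   gcc -rdynamic -o app main.o -Wl,--whole-archive libext.a -Wl,--no-whole-archive
 */
#include <stddef.h>

/* Hypothetical layout; the real INIT_STRUCT comes from the extension API. */
typedef struct { const char *name; int ready; } INIT_STRUCT;

int init_func(INIT_STRUCT *ctx)
{
    if (ctx == NULL)
        return -1;
    ctx->name = "local-extension";
    ctx->ready = 1;
    return 0;
}
```

The -u option has to appear before libext.a on the command line, because the linker decides which archive members to pull in at the moment it scans the archive. Running nm -CD app afterwards is a quick way to check that init_func is listed.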

Related

Calling Internal Functions in a Shared Library

I want to call an unexported function in glibc. Precisely, I want to call ptmalloc_init(). The problem is that the symbol is not exported. I have access to the glibc source code, so I added a function called ptmalloc_init_caller() to the glibc source and compiled the library. But I still cannot see it in the nm -D output and, as a consequence, cannot call the added function from outside. Is there something special about building glibc that I am missing?
You need to make the ptmalloc_init function non-static and add it to malloc/Versions, e.g. under the GLIBC_PRIVATE section. Then it will be exported. Without the change to malloc/Versions, the function will not be mentioned in the generated version script (see libc.map in the build tree), and its symbol will have hidden visibility.

dlopen and dylib : main application and dylib address space

My main application statically links a static library A that contains a function ABC, and my dynamic library xyz.dylib statically links the same library A, so it has its own copy of ABC. The function ABC uses a globally defined variable.
The main application loads xyz.dylib with dlopen at runtime. The dylib's initializer runs and calls ABC, and that call ends up using the global variable from the main application's address space.
On OS X, when the same inline function exists in several images, the dynamic linker binds every use to the first copy that gets used. So, for example, if an inline function is used in your main executable first and then in the loaded dylib, the dylib will use the copy in the main executable.
This is normally fine, unless your inline function refers to a global symbol, in which case you are now using one copy of your globals for both the dylib and your executable.
Again, this is usually fine, since the same copy is used consistently.
The problem happens when you have two inline functions that reference a global that exists in both the executable and the dylib, and one function is first used in the executable while the other is first used in the dylib. Then you have a mismatched pair. For example:
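class MagicAlloc
{
public:
    // Defined in the class body, so both functions are implicitly inline.
    static void* Alloc() { return gAlloc.get(); }
    static void Free( void* v ) { gAlloc.free( v ); }
private:
    static RealAllocator gAlloc;
};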
Suppose you call MagicAlloc::Alloc in the executable and then call it in the dylib: now all allocations in both will use the gAlloc in the executable. Then the first call to MagicAlloc::Free happens in the dylib, so Free is bound to the dylib's copy. You will then try to free, through the dylib's gAlloc, something that was allocated from the executable's gAlloc.
There are two solutions:
Don't use inlines to reference globals/statics. Move the global structure and the function definitions into the same translation unit (object file). Mark the globals "static" so they aren't even visible outside the translation unit. Now your functions will be resolved statically in the link step and bound to the right global.
Hide all the symbols in the executable except the plugin API. Link as normal, but when linking the binary itself pass the following to the linker:
-Wl,-exported_symbols_list,export_file
Here export_file is a list of the link symbols that should be exported; at a minimum you will need "_main" in that file. Now when your dylib runs, it won't be able to dynamically link to the wrong inlines, because they won't be in the dynamic symbol table. The second solution is also more secure, since a malicious plugin won't be able to access globals as easily.
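The first solution can be sketched in plain C rather than C++. This is only an analogue of the MagicAlloc example: the fixed-size pool, the magic_alloc and magic_free names, and the slot sizes are invented for illustration and stand in for whatever RealAllocator does.

```c
#include <stddef.h>

#define POOL_SLOTS 8
#define SLOT_SIZE  64

/* File-scope static: the allocator state has internal linkage, so every image
 * (executable or dylib) that compiles this file gets its own private copy and
 * the dynamic linker never sees a symbol for it. */
static struct {
    unsigned char storage[POOL_SLOTS][SLOT_SIZE];
    int in_use[POOL_SLOTS];
} g_alloc;

/* Ordinary (non-inline) functions, defined next to the state they touch. */
void *magic_alloc(void)
{
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (!g_alloc.in_use[i]) {
            g_alloc.in_use[i] = 1;
            return g_alloc.storage[i];
        }
    }
    return NULL; /* pool exhausted */
}

/* Returns 0 on success, -1 if p is not a live slot of this copy of the pool. */
int magic_free(void *p)
{
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (p == (void *)g_alloc.storage[i] && g_alloc.in_use[i]) {
            g_alloc.in_use[i] = 0;
            return 0;
        }
    }
    return -1;
}
```

Because both functions and the state they use sit in one translation unit, the alloc/free pair in a given image always operates on the same pool, which is the property the inline version loses.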

strip some public symbols from a Windows static library

We want to produce Windows static libraries that have public symbols only for the documented API functions. We want to strip out all other symbols.
This is easy for a *NIX library; you can use the strip(1) utility and specify --keep-symbol=foo, and you can even put a whole list of symbols into a file and specify the file.
How can we do this for a Windows library?
A little more detail: suppose we are making a library that is built from multiple .c files.
util.c defines the functions util_foo() and util_bar(). math.c defines some_math_func(). Then lib.c defines the functions api_func_0() and api_func_1(). The API functions call the utility and math functions, so those functions cannot be declared static. When we compile each .c file, the public symbols are in the object file, and when we then combine the object files into a library, all the public symbols remain visible. Once the static library has been produced, we want only the symbols api_func_0 and api_func_1 to be visible.

how can i create dylib with init function

I am trying to create a dylib in Xcode. I am able to create the dylib by choosing the C/C++ Library template in Xcode.
I want to add an "init" method to this dylib, but I don't know how.
My idea is to call this "init" at runtime with the help of dlopen().
Thanks for your valuable feedback.
If you code in C++, you could have static objects in your dlopen-ed library; their constructors get called at dlopen time (and their destructors run at dlclose time).
If your code is compiled by gcc (be it in C, in C++, or perhaps even some other language), you could use the constructor and destructor function attributes.
(You could use the symbols _init and _fini, but this is an obsolete feature of dlopen (at least on Linux, and probably on MacOSX). You would then have to declare them as extern "C" void _init(void); in C++.)
Don't forget that dlsym deals with unmangled names, so declare as extern "C" any C++ function you intend to look up with it.
You could also have your own convention that your dynamically loaded things should have, for example, a function named my_initialization and your code doing the dlopen would later use dlsym to find it. You should have documented conventions on what symbols are dlsym-ed and how they are used.
I don't know MacOSX well, but I googled this documentation
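A minimal sketch of the attribute-based approach, combined with the explicit-convention approach, might look like this. The names plugin_ready, plugin_load, and plugin_unload are hypothetical; my_initialization is the example name used in the answer, and the attributes are GCC/Clang extensions rather than standard C.

```c
/* Hypothetical library state, set up when the library is loaded. */
static int plugin_ready = 0;

/* Runs automatically when the image is loaded: during dlopen for a shared
 * library, or at program start-up when linked into an executable. */
__attribute__((constructor))
static void plugin_load(void)
{
    plugin_ready = 1;
}

/* Runs when the image is unloaded (dlclose) or at normal process exit. */
__attribute__((destructor))
static void plugin_unload(void)
{
    plugin_ready = 0;
}

/* Explicit entry point for the "documented convention" approach: the host
 * would find it with dlsym(handle, "my_initialization") after dlopen. */
int my_initialization(void)
{
    return plugin_ready ? 0 : -1;
}
```

The constructor needs no cooperation from the host, while my_initialization lets the host pass control explicitly and check a return value; a library can reasonably offer both.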

How does GCC compile applications that reference a static library

I've read that the gcc compiler can perform certain optimization when compiling an application that references a static library, for instance - it will "pull" in only that code from the static library that the application depends upon. This helps keep the size of the application's executable to a minimum if portions of the static library are not being used by the app.
1) Is this true?
2) How does GCC know what code from the static library the application is actually using? Does it only look at the header files that are included (directly and indirectly) in the application and then pull code accordingly? Or does it actually look at what methods from the static library are being called?
A static library is just a bag of object files. The linker (ld) will keep track of which object files are used (i.e. contains a function referenced from somewhere), and not include unreferenced code in the final executable image.
gcc does nothing of the sort. Everything you describe is linking, which is handled by ld.
ld examines the symbol tables of the object files in order to determine which symbols need to be linked, and then pulls the relevant object files from the libraries and links them into the executable.
Answers
1) Yes, only the object files that are referenced will be pulled in. Besides the smaller size, there is also a gain in link speed, since the static library contains an index table of all the symbols exported by the library. Lookups in this table are quicker than searching the object files one by one.
Alternatively, if you want to pull in all the symbols in the static library regardless of whether they are referenced, you can pass the --whole-archive switch to ld.
2) It would be more correct to ask this question in the context of ld (the GNU linker), since that is what actually pulls in the references. GCC just invokes the linker after it's done compiling (unless you do gcc -c, which causes it to stop after compilation).
So, after compilation is done, ld is invoked with an ordered list of object (.o) files and libraries. ld processes the .o files one by one, and for each the linker
a) Notes down the external symbols needed by this file that cannot be resolved yet. Adds these to a (say) unresolved table.
b) Looks at the symbols (functions, global variables) exported by this file and resolves any previous references that it can.
This is a very simplified overview of the linking process.
Now when the linker comes to the static library, it essentially does the same thing, this time using the static library to resolve symbols. However, there is one difference: the linker pulls in only those member object files that define a currently unresolved symbol, plus whatever members those in turn depend on. So assume we have
a.o and libstatic.a which in turn contains b.o and c.o.
b.o defines bar() and moreBar();
c.o defines baz() and moreBaz();
a.o defines foo();
where foo calls bar which calls baz. Now when you do
gcc -o app a.o libstatic.a
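After processing a.o, the linker knows that it needs to resolve bar. It finds bar in b.o inside libstatic.a and pulls b.o in; b.o in turn needs baz, which is resolved by pulling c.o from the same library. Note that the unit of inclusion is the whole object file: moreBar() and moreBaz() have no references, but they still end up in the executable because they live in b.o and c.o. To drop them as well, compile with -ffunction-sections and link with -Wl,--gc-sections.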
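The selection process described above can be mimicked with a toy model. This is not how ld is implemented; it is a simplified sketch that tracks only symbol names at object-file granularity. The struct object type, the helper names, and the extra member d.o are made up for the illustration.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4

/* One object file: the symbols it defines and the ones it needs. */
struct object {
    const char *name;
    const char *defines[MAX_SYMS]; /* unused slots are NULL */
    const char *needs[MAX_SYMS];   /* unused slots are NULL */
    int linked;
};

static int defines_symbol(const struct object *o, const char *sym)
{
    for (int i = 0; i < MAX_SYMS && o->defines[i]; i++)
        if (strcmp(o->defines[i], sym) == 0)
            return 1;
    return 0;
}

/* Is sym defined by any object that is already part of the link? */
static int is_resolved(const struct object *objs, int n, const char *sym)
{
    for (int i = 0; i < n; i++)
        if (objs[i].linked && defines_symbol(&objs[i], sym))
            return 1;
    return 0;
}

/* Does some linked object need a still-unresolved symbol that m defines? */
static int member_is_needed(const struct object *objs, int n,
                            const struct object *m)
{
    for (int i = 0; i < n; i++) {
        if (!objs[i].linked)
            continue;
        for (int j = 0; j < MAX_SYMS && objs[i].needs[j]; j++) {
            const char *sym = objs[i].needs[j];
            if (!is_resolved(objs, n, sym) && defines_symbol(m, sym))
                return 1;
        }
    }
    return 0;
}

/* objs[0 .. first_member-1] are command-line objects and are always linked;
 * the rest are archive members. Returns the number of members pulled in. */
static int link_archive(struct object *objs, int n, int first_member)
{
    int pulled = 0, changed = 1;
    for (int i = 0; i < first_member; i++)
        objs[i].linked = 1;
    while (changed) { /* rescan until no new member is added */
        changed = 0;
        for (int i = first_member; i < n; i++) {
            if (!objs[i].linked && member_is_needed(objs, n, &objs[i])) {
                objs[i].linked = 1;
                pulled++;
                changed = 1;
            }
        }
    }
    return pulled;
}

/* The example from the text, plus a hypothetical unreferenced member d.o. */
static struct object demo[] = {
    { "a.o", { "foo" },            { "bar" }, 0 },
    { "b.o", { "bar", "moreBar" }, { "baz" }, 0 },
    { "c.o", { "baz", "moreBaz" }, { NULL },  0 },
    { "d.o", { "unused" },         { NULL },  0 },
};
```

Calling link_archive(demo, 4, 1) marks b.o and c.o as linked and leaves d.o out: b.o is pulled in because a.o needs bar, c.o because b.o needs baz, and nothing ever asks for a symbol that d.o defines.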

Resources