My main application statically links to a static library A with a function ABC and my dynamic library xyz.dylib also statically links to the same static library A which has the same function ABC. The function ABC uses a globally defined variable.
When the main application loads xyz.dylib with dlopen at runtime, the dylib's initializer gets called, and in it I call ABC. That call to ABC uses the global variable from the main application's address space.
On OS X, for inline functions the dynamic linker binds every caller to the first definition that gets used. So for example, if an inline function is used in your main executable first, and then used in the loaded dylib, the dylib will use the one in the main executable.
This is normally fine, unless your inline function references a global symbol, in which case you are now using a single copy of that global for both the dylib and the executable.
Again this is usually fine, since the same version is used consistently.
The problem happens when you have 2 inline functions that reference a global that is in both executable and dylib, and one function gets used first in the executable, and another one used first in the dylib. Then you have a mismatched pair. For example:
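class MagicAlloc
{
public:
    // Defined inside the class body, so both functions are implicitly inline.
    static void* Alloc() { return gAlloc.get(); }
    static void Free( void* v ) { gAlloc.free( v ); }
private:
    static RealAllocator gAlloc;
};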
Suppose you call MagicAlloc::Alloc in the executable, then call it in the dylib. Now all allocations in both will use the gAlloc in the executable. Then the first call to MagicAlloc::Free happens in the dylib, so Free binds to the dylib's copy. From then on you will try to free memory that was allocated from the executable's gAlloc using the dylib's gAlloc.
There are two solutions:
Don't use inlines to reference globals/statics. Move the global structure and the function definitions into the same translation unit (object file). Mark the globals "static" so they aren't even visible outside the translation unit. Now your functions will be resolved statically in the link step, and bound to the right global.
Hide all the symbols in the executable except the plugin api. Link as normal, but when linking the binary itself pass the following to the linker:
-Wl,-exported_symbols_list,export_file
where export_file is a file listing the link symbols that should be exported. For example, you will need to have at least "_main" in that file. Now when your dylib runs it won't be able to dynamically link to the wrong inlines, because they won't be in the executable's dynamic symbol table. The second solution is also more secure, since a malicious plugin won't be able to access globals as easily.
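A minimal sketch of the first solution, assuming everything lives in one .cpp file that is compiled into both the executable and the dylib. RealAllocator here is a hypothetical stand-in (the original type isn't shown), and LiveCount is an illustrative helper that is not part of the original class. The point is only that Alloc and Free are defined out of line, next to a global with internal linkage.

```cpp
// magic_alloc.cpp: out-of-line definitions next to a file-local global.
#include <cstdlib>

namespace {

// Hypothetical stand-in for the original RealAllocator.
class RealAllocator {
public:
    void* get() {
        void* p = std::malloc(16);
        if (p != nullptr) ++live_;
        return p;
    }
    void free(void* p) {
        if (p != nullptr) {
            --live_;
            std::free(p);
        }
    }
    int live() const { return live_; }

private:
    int live_ = 0;
};

// Internal linkage: each image that compiles this file gets its own
// private copy, invisible to the dynamic linker.
RealAllocator gAlloc;

}  // namespace

// In a real project this declaration would sit in a header.
class MagicAlloc {
public:
    static void* Alloc();
    static void Free(void* v);
    static int LiveCount();  // illustrative helper, not in the original
};

// Not inline: these are resolved at static link time, so Alloc and Free
// in one image always refer to the same gAlloc.
void* MagicAlloc::Alloc() { return gAlloc.get(); }
void MagicAlloc::Free(void* v) { gAlloc.free(v); }
int MagicAlloc::LiveCount() { return gAlloc.live(); }
```

With this layout a pointer obtained from MagicAlloc::Alloc in one image should still be freed by code in that same image, since each image now has its own allocator.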
Related
I want to call an unexported function in glibc. Precisely, I want to call ptmalloc_init(). The problem is that the symbol is not exported. I have access to the glibc source code. Therefore, I added a function called ptmalloc_init_caller() in glibc source code and compiled the library. But again I can not see anything in the nm -D output and, as a consequence, can not call the added function from outside. Is there something special about building glibc that is omitted?
You need to make the ptmalloc_init function non-static and add it to malloc/Versions, e.g. under the GLIBC_PRIVATE section. Then it will be exported. Without the change to malloc/Versions, the function will not be mentioned in the generated version script (see libc.map in the build tree), and its symbol will have hidden visibility.
My application works with a static library which has an extensions API. The API is able to call an extension init function either from an external shared library or from the "local" binary. That is, I can include the extension init function statically in the main executable binary.
The local function is looked up with a dlsym call, so the init function has to be dynamically exported from the main binary. That is, the following nm call:
nm -CD <binary>
should list my init function.
Let's assume init function has this signature:
int init_func(INIT_STRUCT *);
This function is not called directly - it is only supposed to be loaded by dlsym call.
So I have two related questions:
How do I force the linker not to exclude this function from the generated binary?
How do I force the compiler/linker to export this function dynamically?
(I use gcc to compile and link my program)
Unfortunately, the default behavior of the GNU toolchain is to not export symbols from executables (as opposed to shared libraries, which export all their symbols by default). You can use the big-hammer -rdynamic flag, which tells the linker to export all symbols from your executable. A less intrusive solution is to provide an explicit exports file via -Wl,--dynamic-list when linking (see example usage in Clang sources).
Ok, I will post an answer based on previous comments.
To make all functions dynamically exported: -rdynamic.
For a single function to be always linked (even if not referenced) you need to add -u<function> to the link line.
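To link all functions from a static library (even unreferenced ones), put -Wl,--whole-archive before the library on the link line. To return to normal linking for the libraries that follow, use -Wl,--no-whole-archive.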
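As a rough illustration of the pieces discussed above, here is how the init function itself might be written. The fields of INIT_STRUCT are hypothetical, since the question doesn't show them. The attributes only ask the compiler to keep the function and give it default visibility. They do not pull an unreferenced object out of a static library (that is what -u init_func is for), and they do not put the symbol in the executable's dynamic symbol table (that still needs -rdynamic or -Wl,--dynamic-list=exports.txt, where exports.txt is an assumed file containing `{ init_func; };`).

```cpp
// Hypothetical layout; the real INIT_STRUCT is not shown in the question.
struct INIT_STRUCT {
    int api_version;
    int initialized;
};

// extern "C" keeps the name unmangled so dlsym can find "init_func".
// used: ask the compiler not to discard the function.
// visibility("default"): keep it exportable even under -fvisibility=hidden.
extern "C" __attribute__((used, visibility("default")))
int init_func(INIT_STRUCT* s) {
    if (s == nullptr) {
        return -1;  // report failure to the caller
    }
    s->initialized = 1;
    return 0;
}
```

After linking with one of the export options, nm -CD on the binary should list init_func, and the host can look it up with dlsym on the handle returned by dlopen(NULL, RTLD_NOW).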
I am not sure if I am phrasing the question correctly, but basically I want to know how the call instruction is generated when calling an imported function from another library.
For example
GetModuleFileName(...)
is compiled to
call 0x4D0000
where 0x4D0000 is the address of the imported function which is dynamic.
How does Windows set up those calls, and would it be possible to circumvent it and set a custom address instead?
The address used in the call statement isn't dynamic. It's a relative address that's fixed at link time, like a call to any other function. That's because the call is actually to a stub, and the stub performs an indirect jump to the real function. The indirect jump uses a memory operand that refers to a location in the import table. When the executable (or DLL) is loaded, Windows updates the import table with the addresses of all the functions the executable or DLL uses in any DLLs it's linked to.
So if an executable has a call instruction like this:
call _GetModuleFileNameA@12
then somewhere else in the same executable is a stub like this:
_GetModuleFileNameA@12:
jmp [__imp__GetModuleFileNameA@12]
And somewhere in the import table there is a definition like this:
__imp__GetModuleFileNameA@12:
DD ?
Windows sets the value of __imp__GetModuleFileNameA@12 in the import table when the executable (or DLL) is loaded. There's not much you can do to change this, though it's not too hard to change the value after the executable (or DLL) has been loaded. Note that the import table might be located in a read-only section, meaning you may need to change the virtual memory protections in order to do this.
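The stub-plus-slot arrangement can be mimicked in portable C++ with a function pointer. This is only an analogy, not Windows code: imp_slot plays the role of the import table entry, stub plays the role of the jmp stub, and loader_bind stands in for what the loader does. All names here are made up for illustration.

```cpp
#include <string>

using NameFn = std::string (*)();

// Stand-ins for the function exported by a DLL and for a custom replacement.
std::string real_impl() { return "real"; }
std::string custom_impl() { return "custom"; }

// Plays the role of the import table slot; empty until "load time".
NameFn imp_slot = nullptr;

// Plays the role of the stub: callers always target this fixed function,
// and it forwards through whatever address the slot currently holds.
std::string stub() { return imp_slot(); }

// What the loader does conceptually: fill the slot with the real address.
void loader_bind() { imp_slot = &real_impl; }

// Redirect later calls by overwriting the slot; returns the previous target.
NameFn patch_slot(NameFn replacement) {
    NameFn old = imp_slot;
    imp_slot = replacement;
    return old;
}
```

After loader_bind(), stub() reaches real_impl. After patch_slot(&custom_impl), the same call site reaches custom_impl without being recompiled. In a real process the slot may sit in read-only memory, so the page protection would have to be changed first (for example with VirtualProtect) before writing to it.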
gperftools documentation says that libprofiler should be linked into a target program:
$ gcc myprogram.c -lprofiler
(without changing the program's code).
And then program should be run with a specific environment variable:
CPUPROFILE=/tmp/profiler_output ./a.out
The question is: how does libprofiler get a chance to start and stop the profiler when it is merely loaded, but none of its functions are called?
There is no constructor function in that library (proof).
All occasions of "CPUPROFILE" in library code do not refer to any place where profiler is started.
I am out of ideas, where to look next?
According to the documentation on the linked webpage, under Linking the library, the -lprofiler step is the same as loading the shared object file with the LD_PRELOAD option.
The shared object file isn't the same as just the header file. The header file contains function declarations, which are looked up when compiling a program so that the names of the functions resolve, but the names are just names, not implementations. The shared object file (.so) contains the implementations of the functions. For more information see the following StackOverflow answer.
The source file /trunk/src/profiler.cc has a CpuProfiler constructor on Line 182 that checks whether profiling should be enabled based on the CPUPROFILE environment variable (Line 187 and Line 230).
It then calls the Start function on Line 237. As per the comments in this file, the destructor calls the Stop function on Line 273.
To answer your question, I believe Line 132, CpuProfiler CpuProfiler::instance_;, is the line where the CpuProfiler is instantiated.
This lack of clarity in the gperftools documentation is a known issue, see here.
I think the profiler gets initialized with the REGISTER_MODULE_INITIALIZER macro seen at the bottom of profile-handler.cc (as well as in heap-checker.cc, heap-profiler.cc, etc). The macro comes from src/base/googleinit.h and defines a dummy static object whose constructor is called when the library is loaded. That dummy constructor then calls ProfileHandlerRegisterThread(), which uses a pthread_once variable to initialize the singleton object (ProfileHandler::instance_).
The REGISTER_MODULE_INITIALIZER macro simulates the module_init()/module_exit() functions seen in Linux loadable kernel modules.
(my answer is based on the 2.0 version of gperftools)
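The general pattern both answers describe (a static object whose constructor runs at load time and consults an environment variable) can be sketched as follows. This is not the gperftools code. ModuleInitializer, the accessor functions, and the EXAMPLE_PROFILE variable are all made-up names used to show the mechanism.

```cpp
#include <cstdlib>

namespace {

// Plain bools are initialized before any dynamic initializer runs,
// so the constructor below can safely write to them.
bool g_initializer_ran = false;
bool g_profiling_requested = false;

// Dummy type: its only job is to run code during static initialization,
// which happens when the library (or program) is loaded.
struct ModuleInitializer {
    ModuleInitializer() {
        g_initializer_ran = true;
        // EXAMPLE_PROFILE is a hypothetical variable standing in for CPUPROFILE.
        const char* path = std::getenv("EXAMPLE_PROFILE");
        g_profiling_requested = (path != nullptr && path[0] != '\0');
        // A real profiler would start collecting here if requested.
    }
    ~ModuleInitializer() {
        // A real profiler would stop and flush its output here.
    }
};

// Defining the object is what triggers the constructor at load time.
ModuleInitializer g_module_initializer;

}  // namespace

bool initializer_ran() { return g_initializer_ran; }
bool profiling_requested() { return g_profiling_requested; }
```

No function in this file is ever called by the program, yet by the time main starts the constructor has already run. That is why merely linking the library is enough.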
I am trying to create a dylib in Xcode. I am able to create the dylib by choosing the C/C++ Library template in Xcode.
I want to add an "init" method for this dylib, but I don't know how to do that.
My idea is to call this "init" at runtime with the help of dlopen().
Thanks for your valuable feedback.
If you code in C++, you could have static objects in your dlopen-ed library; their constructors get called at dlopen time (and their destructors run at dlclose time).
If your code is compiled by gcc (be it in C, or in C++, or perhaps even some other languages) you could use the constructor and destructor function attributes.
(You could use the obsolete symbols _init and _fini but this is an obsolete feature of dlopen (at least on Linux, and probably on MacOSX). Then you would have to declare them extern "C" void _init(void); in C++.)
Don't forget that dlsym deals with unmangled names, so you want to declare as extern "C" any C++ functions you look up with it.
You could also have your own convention that your dynamically loaded things should have, for example, a function named my_initialization and your code doing the dlopen would later use dlsym to find it. You should have documented conventions on what symbols are dlsym-ed and how they are used.
I don't know Mac OS X well, but I found this documentation.
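A small sketch combining two of the options above, assuming a GCC or Clang compiler: a function marked with the constructor attribute that runs automatically at load time, and an explicitly named entry point that the host would look up with dlsym. The names on_load, on_unload, load_count and the return convention of my_initialization are illustrative choices, not a required API.

```cpp
namespace {
int g_load_count = 0;
bool g_explicit_init_done = false;
}  // namespace

// Runs automatically when the library is loaded (for example by dlopen).
__attribute__((constructor))
static void on_load() {
    ++g_load_count;
}

// Runs automatically when the library is unloaded (for example by dlclose).
__attribute__((destructor))
static void on_unload() {
    // Release anything acquired in on_load here.
}

// Explicit entry point following a documented convention.
// extern "C" keeps the name unmangled so dlsym(handle, "my_initialization") works.
extern "C" int my_initialization() {
    g_explicit_init_done = true;
    return 0;  // 0 means success in this sketch
}

// Illustrative helper so a host can see that on_load already ran.
extern "C" int load_count() {
    return g_load_count;
}
```

On the host side, the usual sequence would be dlopen on the library path, dlsym for "my_initialization", a cast of the result to int (*)(), and then the call. The constructor function needs no lookup at all.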