BACKGROUND:
If an executable file has a external data reference, which is defined in a shared object, the compiler will use copy relocation and place a copy in its .bss section.
Copy relocation is detailed in this site:
http://www.shrubbery.net/solaris9ab/SUNWdev/LLM/p22.html#CHAPTER4-84604
However, my question is:
Is it possible to implement it through GOT, just like the external data reference in shared object? The executable can indirectly accesses this external code through its GOT entry, and this GOT entry can be stuffed with the real address of this symbol in run-time.
I don't know why GCC doesn't implement it like this. What's the upside of copy relocation?
In languages like C and C++ addresses of objects with static storage duration qualify as address constants. It means that conceptually, at language level they are treated as if their values are "known" at compile time.
Of course, this is not the case in reality, when it comes to the matter in question. To counter that the compiler-linker-loader combination has to implement a dynamic mechanism that would provide full support the language-level concept of address constant. Intuitively a GOT-based mechanism, being based on full run-time indirection, would be much farther away from that concept than a load-time relocation-based mechanism.
For one thing, C language was designed as a language that requires no dynamic initialization of objects with static storage duration, i.e. conceptually there's no initializing startup code and no issues related to the order of initialization. But in a GOT-based implementation an initialization of a global variable with such address constant would require startup code to extract the actual value from GOT and place it into the variable. Meanwhile, a relocation-based approach produces a full illusion of such global variable beginning its life with the proper value without any startup code.
If you look at the features provided by the relocation mechanism, you will notice that they are in sync with the C specification of address constant. E.g. the final value might involve adding a fixed offset, which is intended to act as a loader-side implementation of C [] and -> operators, permissible in C address constant expressions.
Is it possible to implement it through GOT, just like the external data reference in shared object?
Yes. For this to work, you'll need to build code that is linked into the main executable with -fPIC. Since that is often less efficient (extra indirection), and usually not done, the linker has to do copy relocations instead.
More info here.
Related
I am currently developing a library for QNX (x86) using GCC, and I want to make some symbols which are used exclusively in the library and are invisible to other modules, notably to the code which uses the library.
This works already, but, while doing the research how to achieve it, I have found a very worrying passage in GCC's documentation (see http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Code-Gen-Options.html#Code-Gen-Options, explanation for flag -fvisibility):
Despite the nomenclature, default always means public; i.e., available
to be linked against from outside the shared object. protected and
internal are pretty useless in real-world usage so the only other
commonly used option is hidden. The default if -fvisibility isn't
specified is default, i.e., make every symbol public—this causes the
same behavior as previous versions of GCC.
I am very interested in how visibility "internal" is pretty useless in real-world-usage. From what I have understood from another passage from GCC's documentation (http://gcc.gnu.org/onlinedocs/gcc-4.8.2/gcc/Function-Attributes.html#Function-Attributes, explanation of the visibility attribute), visibility "internal" is even stronger (more useful for me) than visibility "hidden":
Internal visibility is like hidden visibility, but with additional
processor specific semantics. Unless otherwise specified by the psABI,
GCC defines internal visibility to mean that a function is never
called from another module. Compare this with hidden functions which,
while they cannot be referenced directly by other modules, can be
referenced indirectly via function pointers. By indicating that a
function cannot be called from outside the module, GCC may for
instance omit the load of a PIC register since it is known that the
calling function loaded the correct value.
Could anybody explain in depth?
If you just want to hide your internal symbols, just use -fvisibility=hidden. It does exactly what you want.
The internal flag goes much further than the hidden flag. It tells the compiler that ABI compatibility isn't important, since nobody outside the module will ever use the function. If some outside code does manage to call the function, it will probably crash.
Unfortunately, there are plenty of ways to accidentally expose internal functions to the outside world, including function pointers and C++ virtual methods. Plenty of libraries use callbacks to signal events, for example. If your program uses one of these libraries, you must never use an internal function as the callback. If you do, the compiler and linker won't notice anything wrong, and your program will have subtle, hard-to-debug crash bugs.
Even if your program doesn't use function pointers now, it might start using them years down the road when everyone (including you) has forgotten about this restriction. Sacrificing safety for tiny performance gains is usually a bad idea, so internal visibility is not a recommended project-wide default.
The internal visibility is more useful if you have some heavily-used code that you are trying to optimize. You can mark those few specific functions with __attribute__ ((visibility ("internal"))), which tells the compiler that speed is more important than compatibility. You should also leave a comment for yourself, so you remember to never take a pointer to these functions.
I cannot provide in-depth answer, but I think that "internal" might be unpractical because it is processor dependent. You might get expected behaviour on some systems, but on others you get only "hidden".
I apologize ahead of time if my terminology is incorrect.
Let's say I have a shared library called libVectorMath.so. In it are two interesting functions, addVector() and subtractVector(). The two functions are prototyped in vectorMath.h. I also have an executable called testVectorMath, which uses those two functions, and is dynamically linked to libVectorMath.so.
Generally speaking, to build testVectorMath, I need to build libVectorMath.so as well. Why is this? Is the header file vectorMath.h not sufficient to tell testVectorMath what symbols it should expect to find in libVectorMath.so?
In other words, can't testVectorMath have some instructions in it to say "look for a library called libVectorMath.so and then look for symbols named addVector() and subtractVector() within it"?
Read this link. Its tells about the same in a very good way!
An Excerpt from above is as follows:
All shared library schemes work essentially the same way. At link time, the linker searches through libraries as usual to find modules that resolve otherwise undefined external symbols. But rather than copying the contents of the module into the output file, the linker makes a note of what library the module came from, and puts a list of the libraries in the executable. When the program is loaded, startup code finds those libraries and maps them into the program's address space before the program starts, Figure 1. Standard operating system file mapping semantics automatically share pages that are mapped read-only or copy-on-write. The startup code that does the mapping may be in the operating system, the executable, in a special dynamic linker mapped into the process' address space, or some combination of the three.
Why do we need boost::thread_specific_ptr, or in other words what can we not easily do without it?
I can see why pthread provides pthread_getspecific() etc. These functions are useful for cleaning up after dead threads, and handy to call from C-style functions (the obvious alternative being to pass a pointer everywhere that points to some memory allocated before the thread was created).
In contrast, the constructor of boost:thread takes a callable class by value, and everything non-static in that class becomes thread local once it is copied. I cannot see why I would want to use boost::thread_specific_ptr in preference to a class member any more than I would want to use a global variable in OOP code.
Do I horribly misunderstand anything? A very brief example would help, please. Many thanks.
thread_specific_ptr simply provides portable thread local data access. You don't have to be managing your threads with Boost.Thread to get value from this. The canonical example is the one cited in the Boost docs for this class:
One example is the C errno variable,
used for storing the error code
related to functions from the Standard
C library. It is common practice (and
required by POSIX) for compilers that
support multi-threaded applications to
provide a separate instance of errno
for each thread, in order to avoid
different threads competing to read or
update the value.
I'm currently at a point where I need to link in several modules (basically ELF object files) to my main executable due to a limitation of our target (background: kernel, targeting the ARM architecture). On other targets (x86 specifically) these object files would be loaded at runtime and a specific function in them would be called. At shutdown another function would be called. Both of these functions are exposed to the kernel as symbols, and this all works fine.
When the object files are statically linked however there's no way for the kernel to "detect" their presence so to speak, and therefore I need a way of telling the kernel about the presence of the init/fini functions without hardcoding their presence into the kernel - it needs to be extensible. I thought a solution to this might be to put all the init/fini function pointers into their own section - in much the same way you'd expect from .ctors and .dtors - and call through them at the relevant time.
Note that they can't actually go into .ctors, as they require specific support to be running by the time they're called (specifically threads and memory management, if you're interested).
What's the best way of going about putting a bunch of arbitrary function pointers into a specific section? Even better - is it possible to inject arbitrary data into a section, so I could also store stuff like module name (a struct rather than a function pointer, basically). Using GCC targeted to arm-elf.
GCC attributes can be used to specify a section:
__attribute__((section("foobar")))
What is the best way to design a C API for dlls which deals with the problem of passing "objects" which are C runtime dependent (FILE*, pointer returned by malloc, etc...). For example, if two dlls are linked with a different version of the runtime, my understanding is that you cannot pass a FILE* from one dll to the other safely.
Is the only solution to use windows-dependent API (which are guaranteed to work across dlls) ? The C API already exists and is mature, but was designed from a unix POV, mostly (and still has to work on unix, of course).
You asked for a C, not a C++ solution.
The usual method(s) for doing this kind of thing in C are:
Design the modules API to simply not require CRT objects. Get stuff passed accross in raw C types - i.e. get the consumer to load the file and simply pass you the pointer. Or, get the consumer to pass a fully qualified file name, that is opened , read, and closed, internally.
An approach used by other c modules, the MS cabinet SD and parts of the OpenSSL library iirc come to mind, get the consuming application to pass in pointers to functions to the initialization function. So, any API you pass a FILE* to would at some point during initialization have taken a pointer to a struct with function pointers matching the signatures of fread, fopen etc. When dealing with the external FILE*s the dll always uses the passed in functions rather than the CRT functions.
With some simple tricks like this you can make your C DLLs interface entirely independent of the hosts CRT - or in fact require the host to be written in C or C++ at all.
Neither existing answer is correct: Given the following on Windows: you have two DLLs, each is statically linked with two different versions of the C/C++ standard libraries.
In this case, you should not pass pointers to structures created by the C/C++ standard library in one DLL to the other. The reason is that these structures may be different between the two C/C++ standard library implementations.
The other thing you should not do is free a pointer allocated by new or malloc from one DLL that was allocated in the other. The heap manger may be differently implemented as well.
Note, you can use the pointers between the DLLs - they just point to memory. It is the free that is the issue.
Now, you may find that this works, but if it does, then you are just luck. This is likely to cause you problems in the future.
One potential solution to your problem is dynamically linking to the CRT. For example,you could dynamically link to MSVCRT.DLL. That way your DLL's will always use the same CRT.
Note, I suggest that it is not a best practice to pass CRT data structures between DLLs. You might want to see if you can factor things better.
Note, I am not a Linux/Unix expert - but you will have the same issues on those OSes as well.
The problem with the different runtimes isn't solvable because the FILE* struct belongs
to one runtime on a windows system.
But if you write a small wrapper Interface your done and it does not really hurt.
stdcall IFile* IFileFactory(const char* filename, const char* mode);
class IFile {
virtual fwrite(...) = 0;
virtual fread(...) = 0;
virtual delete() = 0;
}
This is save to be passed accross dll boundaries everywhere and does not really hurt.
P.S.: Be careful if you start throwing exceptions across dll boundaries. This will work quiet well if you fulfill some design creterions on windows OS but will fail on some others.
If the C API exists and is mature, bypassing the CRT internally by using pure Win32 API stuff gets you half the way. The other half is making sure the DLL's user uses the corresponding Win32 API functions. This will make your API less portable, in both use and documentation. Also, even if you go this way with memory allocation, where both the CRT functions and the Win32 ones deal with void*, you're still in trouble with the file stuff - Win32 API uses handles, and knows nothing about the FILE structure.
I'm not quite sure what are the limitations of the FILE*, but I assume the problem is the same as with CRT allocations across modules. MSVCRT uses Win32 internally to handle the file operations, and the underlying file handle can be used from every module within the same process. What might not work is closing a file that was opened by another module, which involves freeing the FILE structure on a possibly different CRT.
What I would do, if changing the API is still an option, is export cleanup functions for any possible "object" created within the DLL. These cleanup functions will handle the disposal of the given object in the way that corresponds to the way it was created within that DLL. This will also make the DLL absolutely portable in terms of usage. The only worry you'll have then is making sure the DLL's user does indeed use your cleanup functions rather than the regular CRT ones. This can be done using several tricks, which deserve another question...