How much of a shared object is loaded into memory - gcc

Suppose there is a shared object file, libComponent.so, built from two object files, Component_1.o and Component_2.o.
An application links against libComponent.so but only uses functions from Component_1.o.
When the application runs, will the entire shared object (libComponent.so) be loaded into memory, or just the part that came from Component_1.o?
Is there a gcc option to toggle this behaviour, so that only the required symbols are loaded from a shared object?

Well, it depends on what you mean by 'loaded'.
The dynamic linker will map all of the library into the process's virtual memory space and will fill in entries in the executable's import table for each library function used with the addresses of functions in the shared library. But filling in the import table doesn't actually load from those addresses, so they won't be loaded into physical memory.
From then on, the library code will be paged into physical memory on demand when the function is called, just like any other pageable memory in the process's virtual address space. If a function is never called (directly from the application or indirectly from another library function called by the application), it won't be paged in. (Well, paging occurs with page size granularity, so you might pull in a function the application doesn't call if it's next to a function it does call. Some compilers use profile-guided optimization to place functions commonly called together next to each other to minimize the number of pages used.)
(Aside: if your library wasn't compiled as position-independent code and it's loaded at an address other than its default base address, the dynamic linker will need to fix up addresses in the code when it's loaded, which would cause the entire library to be paged in. This could be done lazily when each page is first loaded, though I'm not sure which dynamic linkers do this.)
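The on-demand behaviour described above can be observed with mincore(), which reports which pages of a mapping are currently resident. The following is only a minimal sketch for Linux: it uses an anonymous mapping as a stand-in for a mapped library (an assumption made so the result does not depend on what is already in the page cache), and the helper names resident_pages and demo_demand_paging are made up for this illustration.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count how many pages of [addr, addr + len) are resident in RAM.
   addr must be page-aligned. Returns -1 on error. */
static long resident_pages(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t npages = (len + page - 1) / page;
    unsigned char *vec = malloc(npages);
    long count = -1;

    if (vec == NULL)
        return -1;
    if (mincore(addr, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < npages; i++)
            count += vec[i] & 1;   /* bit 0 set = page is resident */
    }
    free(vec);
    return count;
}

/* Map npages of memory, touch only the first page, and report the
   resident page count before and after the access.
   Returns 0 on success, -1 on error. */
static int demo_demand_paging(size_t npages, long *before, long *after)
{
    size_t len = npages * (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    *before = resident_pages(p, len);  /* mapped, but nothing touched yet */
    p[0] = 1;                          /* first access triggers a page fault */
    *after = resident_pages(p, len);

    munmap(p, len);
    return 0;
}
```

Calling demo_demand_paging(16, &before, &after) should show that mapping alone brings nothing into RAM, while the single access makes at least one page resident. For a file-backed mapping such as a real library, mincore() also counts pages that are already in the page cache, so the numbers there can be higher.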

Related

Loading data segment of already loaded shared library

For the global offset table to work, the GOT must be at a fixed offset from the text segment. Now assume that a program needs a shared library, and that the shared library has already been loaded by the OS for some other process. For our program, since the text section of the shared library is already loaded, only the data segment needs to be loaded. The shared library's text section is mapped into the virtual address space of our process. But what if there is already some data, text or anything else at that fixed offset from the virtual address of our shared library? How does the dynamic linker resolve that conflict? One approach would be to leave R_386_GOTPC in the text section until load time and let the dynamic linker change it to the new offset. Is this how it is done in practice?
On GNU, even the same DSO is mapped at different addresses in different processes. No data at all is shared between them. This means that the GOT is just private data (like .data), and is initialized at load time with the proper addresses (either stubs or the proper function addresses with BIND_NOW).
(This assumes that prelink is not in use, which is somewhat broken anyway.)
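To see where the dynamic linker actually placed each object in a given process, glibc offers dl_iterate_phdr(). The sketch below is only an illustration, assuming glibc on Linux; the names print_object and list_loaded_objects are hypothetical. Running a program that calls it twice will typically print different base addresses for the same DSO when address space randomization is enabled.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stdio.h>

/* Called once per loaded object (main program, shared libraries, vDSO). */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = data;
    const char *name = (info->dlpi_name && info->dlpi_name[0])
                           ? info->dlpi_name
                           : "(main program or unnamed)";

    (void)size;
    /* dlpi_addr is the load bias: the base this object was mapped at. */
    printf("%-45s base=%#lx\n", name, (unsigned long)info->dlpi_addr);
    (*count)++;
    return 0; /* returning 0 continues the iteration */
}

/* Print every loaded object and return how many there are. */
static int list_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```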

Entire shared object loaded to RAM or only used symbols?

I'm currently implementing an embedded Linux based system. The persistent data is loaded from a NAND flash. One of the first applications in userland is using some functions of libglib. For the system, a low startup time is very important.
Because glib is large and NAND is slow, many people argue that the start is slowed down, because the entire glib has to be loaded to RAM! I don't believe in this "urban legend".
My points are:
The gcc linker supports lazy loading
A shared library is handled like a memory mapped file. Therefore, the entire library is NOT loaded to RAM, but only sections containing the symbols, when they are accessed.
Are my assumptions correct and does someone have a reference to a text describing the loading of shared objects (not the symbol resolution with GOT, but the "loading" into RAM)?
Many thanks in advance!
Best regards
Jean-Pierre
Regarding your first point ("The gcc linker supports lazy loading"):
There is no such thing as a "gcc linker", and the static linker plays no part in what gets loaded into RAM at run time.
Regarding your second point ("A shared library is handled like a memory mapped file. Therefore, the entire library is NOT loaded to RAM, but only sections containing the symbols, when they are accessed."):
This is correct: Linux does demand paging from "disk". So if your disk is actually flash, you don't use a compressed filesystem, and your shared library is correctly built with no text relocations (-Wl,-z,text), then only the referenced parts of the library's code and data will be paged into RAM.
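Whether a library carries text relocations can be checked at run time by looking for DT_TEXTREL (or the DF_TEXTREL bit in DT_FLAGS) in its dynamic section. The following is a rough sketch that assumes glibc on Linux; the struct textrel_stats and the function names are invented for this example, and it only inspects objects already loaded into the current process.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>

/* Hypothetical result type for this sketch. */
struct textrel_stats {
    int objects;       /* loaded objects seen */
    int with_textrel;  /* objects whose dynamic section flags text relocations */
};

static int check_object(struct dl_phdr_info *info, size_t size, void *data)
{
    struct textrel_stats *st = data;

    (void)size;
    st->objects++;
    for (int i = 0; i < info->dlpi_phnum; i++) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_DYNAMIC)
            continue;
        /* The dynamic section lives at load bias + p_vaddr. */
        const ElfW(Dyn) *dyn =
            (const ElfW(Dyn) *)(info->dlpi_addr + ph->p_vaddr);
        for (; dyn->d_tag != DT_NULL; dyn++) {
            if (dyn->d_tag == DT_TEXTREL ||
                (dyn->d_tag == DT_FLAGS && (dyn->d_un.d_val & DF_TEXTREL))) {
                st->with_textrel++;
                break;
            }
        }
    }
    return 0; /* keep iterating */
}

/* Scan all objects loaded into this process. */
static struct textrel_stats count_textrel_objects(void)
{
    struct textrel_stats st = {0, 0};
    dl_iterate_phdr(check_object, &st);
    return st;
}
```

A non-zero with_textrel means at least one loaded object needs its code pages patched at load time, which defeats sharing and demand paging for those pages.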

Using LoadLibrary: does it affect performance?

When you dynamically load a library at runtime using LoadLibrary on Windows (C++), does it load into memory the same way as the rest of your program, or might there be some overhead associated with calling functions referenced from that library?
In other words, if you plan on making frequent calls to a function, will it be just as fast from the library as it would be if you linked it into your program at compile time, or do you lose some performance?
(This is not related to libraries that are linked into a program at compile time via .lib/.a files.)
Once the DLL is loaded and the function pointer variable has been initialized by GetProcAddress, there isn't any additional overhead in the function call.

Sharing GlobalAlloc() memory from DLL to multiple Win32 applications

I want to move my caching library to a DLL and allow multiple applications to share a single pointer allocated within the DLL using GlobalAlloc(). How could I accomplish this, and would it result in a significant performance decrease?
You could certainly do this and there won't be any performance implication for a single pointer.
Rather than use GlobalAlloc, a legacy API, you should opt for a different shared heap. For example the simplest to use is the COM allocator, CoTaskMemAlloc. Or you can use HeapAlloc passing the process heap obtained by GetProcessHeap.
For example, and neglecting to show error checking:
void *mem = HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, size);
Note that you only need to worry about heap sharing if you expect the memory to be deallocated in a different module from the one where it was allocated. If your DLL both allocates and frees the memory then you can use plain old malloc. Because all modules live in the same process address space, memory allocated by any module in that process can be used by any other module.
Update
On first reading of the question I failed to pick up on the possibility that you may want multiple processes to have access to the same memory. If that's what you need, then it is only possible with memory-mapped files, or perhaps with some form of IPC.

Memory mapping of binary to VAS

When a new process is created, its address space is created using fork(), i.e. new page table entries are created for the new process which are exactly the same as those of the parent process.
After fork(), exec() is called. What happens during the exec() system call?
I read in the book "Operating System Concepts" that when a new program is executed, the process is given a new, empty VAS. Does that mean that the page table entries created during fork() get deleted or modified? What is the meaning of an empty VAS?
How is the memory mapping of the binary to the VAS performed? How does the loader know which addresses of the VAS should be mapped to which parts of the binary file?
I am really confused here.
When you call exec, the kernel loads the binary and sets up a whole new set of page tables (replacing the old ones).
The loader gets the addresses to load the binary at from the binary itself (basically it does a read() to get the headers and other stuff that's not code, then mmap() to actually load the code and data in the binary).
So it looks at the binary and figures out how it should be loaded, then does mmap(), passing in an address to map at for each part of the binary that needs to be in a different place (i.e. the code and data sections are probably two different calls to mmap(); the .bss section would be mapped from /dev/zero or an equivalent anonymous zero-filled mapping).
Note that, depending on the OS and the binary being loaded, some of this may be handled by the kernel directly or by a userspace loader (on UNIXish systems ld.so, the dynamic linker, would be the loader; it handles shared object loading).
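As a rough illustration of the "read the headers, then map each part" step, the sketch below reads the ELF header and program headers of a file and lists its PT_LOAD segments, which are the ranges a loader would pass to mmap(). It is not how any particular kernel or loader is implemented. It assumes Linux with glibc's <link.h>, only handles ELF files of the native word size, and count_load_segments is a name invented for this example.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <link.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Print the PT_LOAD segments of an ELF file and return how many it has.
   Returns -1 if the file cannot be read or is not a native-class ELF. */
static int count_load_segments(const char *path)
{
    ElfW(Ehdr) eh;
    int count = -1;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (pread(fd, &eh, sizeof eh, 0) == (ssize_t)sizeof eh &&
        memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
        eh.e_phentsize == sizeof(ElfW(Phdr))) {
        count = 0;
        for (int i = 0; i < eh.e_phnum; i++) {
            ElfW(Phdr) ph;
            off_t off = (off_t)eh.e_phoff + (off_t)i * (off_t)sizeof ph;

            if (pread(fd, &ph, sizeof ph, off) != (ssize_t)sizeof ph) {
                count = -1;
                break;
            }
            if (ph.p_type != PT_LOAD)
                continue;
            /* memsz > filesz means the tail is zero-filled (.bss). */
            printf("PT_LOAD vaddr=%#lx filesz=%#lx memsz=%#lx flags=%c%c%c\n",
                   (unsigned long)ph.p_vaddr,
                   (unsigned long)ph.p_filesz,
                   (unsigned long)ph.p_memsz,
                   (ph.p_flags & PF_R) ? 'r' : '-',
                   (ph.p_flags & PF_W) ? 'w' : '-',
                   (ph.p_flags & PF_X) ? 'x' : '-');
            count++;
        }
    }
    close(fd);
    return count;
}
```

For example, count_load_segments("/proc/self/exe") lists the segments of the running program itself. A segment whose memsz exceeds its filesz is the one carrying .bss.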

Resources