I am aware that implicitly linking to libraries at load time can lead to performance increases, and as such I was wondering whether it is good practice to link in this way at compile time, thus increasing executable size (admittedly only marginally), compared to linking explicitly at runtime. My question is: when linking against Microsoft Windows DLL files located in System32, is it 'better' to link at load time, since you can be mostly certain that the libraries will be present, or to follow the explicit approach?
The language used is Delphi (Pascal) and the library in question is WTsAPI32.dll - Terminal Services.
EDIT: As pointed out, my choice of language was incorrect and has been amended. Also, since I have only ever really linked extensively to libraries on Unix, my comments about executable size can be omitted; I believed at the time I WAS in fact referring to static linking, which bundles the library code into the executable, and I now realise this is impossible when using DLL files (DUH!). Thanks all.
The two forms of DLL linking are perhaps better named implicit and explicit. Implicit linking is what you refer to as static linking, and explicit linking is what you refer to as runtime linking.
For implicit linking, the linker writes entries into the import table of the executable file. This import table is metadata that the loader uses to resolve DLL imports at module load time. A stub function, only a few bytes in size, is included for each implicit import. The executable size implications of implicit linking are negligible.
With explicit linking, the imported function's address is resolved by a call to GetProcAddress. This call is made whenever the programmer chooses. If the DLL or the function cannot be resolved, the programmer can code fallback behaviour. The size implications of explicit linking are, I estimate, similar to those of implicit linking. If the function address is resolved once and remembered between calls, then the performance characteristics are similar to implicit linking.
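To make that concrete, here is a minimal C++ sketch of explicit linking against the asker's wtsapi32.dll, resolving each export once and caching the pointers; the fallback shown is just one possible strategy (with implicit linking, the same call would be a plain declaration plus the wtsapi32.lib import library):

    #include <windows.h>
    #include <wtsapi32.h>   // only for the WTS types; wtsapi32.lib is NOT linked
    #include <stdio.h>

    // Function-pointer types matching the two exports we need.
    typedef BOOL (WINAPI *PFN_EnumSessions)(HANDLE, DWORD, DWORD,
                                            PWTS_SESSION_INFOW*, DWORD*);
    typedef VOID (WINAPI *PFN_FreeMemory)(PVOID);

    int main()
    {
        HMODULE wtsapi = LoadLibraryW(L"wtsapi32.dll");
        if (!wtsapi) {
            fprintf(stderr, "wtsapi32.dll not found; degrading gracefully\n");
            return 1;
        }
        // Resolve once and remember the addresses: subsequent calls then
        // cost about the same as implicitly linked calls.
        PFN_EnumSessions enumSessions =
            (PFN_EnumSessions)GetProcAddress(wtsapi, "WTSEnumerateSessionsW");
        PFN_FreeMemory freeMemory =
            (PFN_FreeMemory)GetProcAddress(wtsapi, "WTSFreeMemory");
        if (enumSessions && freeMemory) {
            PWTS_SESSION_INFOW sessions = NULL;
            DWORD count = 0;
            if (enumSessions(WTS_CURRENT_SERVER_HANDLE, 0, 1, &sessions, &count)) {
                printf("%lu session(s)\n", (unsigned long)count);
                freeMemory(sessions);
            }
        }
        FreeLibrary(wtsapi);
        return 0;
    }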
My advice is as follows:
Prefer implicit linking. It is more convenient to code.
If the DLL may not be present, use explicit linking.
If the DLL must be loaded using a full path, use explicit linking.
If you want to unload the DLL during program execution, use explicit linking.
You specifically mention Windows DLLs. You can safely assume that they will be present. Don't try to code to allow your program to run in case user32.dll is missing. Some functions may not be present in older versions of Windows. If you support those older versions you'll need to use explicit linking and provide a fallback. Decide which version you support and use MSDN to be sure that a function is available on your minimum supported platform.
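For example, GetTickCount64 is a kernel32 export that only appeared in Windows Vista; a minimal C++ sketch of the explicit-link-with-fallback pattern for such a version-dependent function:

    #include <windows.h>

    // GetTickCount64 exists only on Vista and later. On older Windows we
    // fall back to GetTickCount and accept its 49.7-day wraparound.
    typedef ULONGLONG (WINAPI *PFN_GetTickCount64)(void);

    ULONGLONG TickCount64Compat()
    {
        // kernel32.dll is always loaded, so GetModuleHandle suffices here.
        static PFN_GetTickCount64 pGetTickCount64 =
            (PFN_GetTickCount64)GetProcAddress(
                GetModuleHandleW(L"kernel32.dll"), "GetTickCount64");
        return pGetTickCount64 ? pGetTickCount64() : GetTickCount();
    }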
If your only two options are static linking and run-time dynamic linking, then the latter is the best choice for linking with Windows DLLs because it's your only choice. You cannot link statically to a DLL because DLLs are exclusively for dynamic linking; that's what the D stands for. Microsoft does not provide static libraries for the OS modules, so you cannot link to them statically.
But those typically aren't your only two options. There's a third, namely load-time dynamic linking.
In Delphi, you use load-time dynamic linking by marking a function declaration external and specifying the name of the DLL where the function resides. If you use the function, then an entry is created in your module's import table, and when the OS loads your module, it reads the table, loads the referenced DLL, looks up the address of the function, and stores the address in your program's memory image so that your program can call it directly.
You use run-time dynamic linking by declaring a function pointer, and then using LoadLibrary and GetProcAddress to look up the function's address prior to calling it. In newer Delphi versions, you can also declare a function in the same style that load-time dynamic linking uses, but then mark it with delayed. In that case, the Delphi run-time library will call LoadLibrary and GetProcAddress on your behalf the first time you call the function.
The size differences are negligible. Run-time dynamic linking requires your program to contain code to load and link to libraries, but load-time dynamic linking stores more function references in the import table.
Run-time dynamic linking offers more flexibility in the face of uncertain DLL availability. With load-time dynamic linking, if a DLL is missing, or if it doesn't have all the functions mentioned in your import table, then the OS will fail to load your program — none of your code will run. With run-time dynamic linking, however, you have the opportunity to recover from the problem. You can disable certain parts of your program that the missing DLL depends on, or you can search for DLLs in non-standard places, or you can provide alternative implementations of missing functions.
If the functions you're calling are integral to your program's ability to operate, and there's ample reason to expect the functions to be present wherever your program is installed, then you should choose to link at load time. It allows you to write simpler code. You can be confident that you'll have the required functions if they are available on a certain version of Windows that you check for in your installer, or if they're provided by DLLs that you distribute with your program.
On the other hand, if the functions you're calling are optional, then you should prefer to link at run time. Use that for loading plug-ins, or for taking advantage of advanced OS features while maintaining backward compatibility. (For example, you might want to take advantage of Windows Vista theme support when it's present, but still allow your program to run on Windows XP.)
Why do you think that compile-time linking to dynamic libraries would increase EXE size? I believe you are misled by the somewhat poor choice of terms used in Windows programming since long ago. Let us instead use the relative terms 'early binding' and 'late binding' for the choice of who should search for procedure names: the compiler/loader, or the programmer's custom code.
Using early binding (aka static linking against a dynamic library), your EXE contains these values in special tables:
    DLL1 name:
        procedure "aaaaa" into the variable $1234
        procedure "bbbbb" into the variable $5678
    DLL2 name:
        procedure "ccccc" into the variable $4567
    ...et cetera.
Now, when you turn this into runtime loading (dynamic linking against dynamic libraries), it would look like this:
    VarH1 := SafeLoadLibrary(DLL1Name);
    if Error-Loading-DLL then do-error-handling;
    Var1234 := GetProcAddress(VarH1, 'aaaaa');
    if Error-Searching-For-Function then do-error-handling;
    Var5678 := GetProcAddress(VarH1, 'bbbbb');
    if Error-Searching-For-Function then do-error-handling;
    ...et cetera.
Obviously, in the latter case your EXE contains all the same values as in the first case, but on top of that it contains a lot of code to deal with those values, code that was simply absent before.
So, while the EXE size difference is not really large by today's memory standards, it is still in favor of early binding (static linking against a dynamic library).
Then what are the benefits of late binding? For example, you can load different DLLs from different paths, determined at runtime by configuration: flexibility, and avoidance of DLL Hell (funny how the concept of avoiding DLL Hell works against the concept of saving space). You can make your application work with limited functionality if a DLL fails to load, whereas a statically bound EXE would simply fail to start: the graceful-degradation concept. And, not least, you can give the user much better, more meaningful error messages than Windows ever could.
And a last word on where you got that notion about EXE size. I believe you took it from discussions of, attention, static linking against static libraries. That is when OBJ/LIB/DCU files are not part of the distribution but are just temporary code containers that ultimately take their place inside the monolithic EXE. Then yes, your EXE has all those libraries inside itself and thus grows larger. However, that case has nothing to do with dynamic libraries (DLLs).
The wording chosen long ago overuses the static/dynamic terms for two closely related topics: how the library is loaded (at compile/load time vs. at runtime) and how the functions inside the library are located, or bound (by the developer's custom code, or by some OS-provided or compiler-provided toolset well before the first line of your source starts executing). Due to that ambiguity, these close but different concepts overlap, and sometimes this leads to total confusion.
Now, what more can static linking give you on modern Windows versions? The WinSxS folder. Nowadays Windows tends to keep multiple versions of each system DLL, and your program may ask for a specific version of it (while the System32 folder would hold only the most recent version, which your program may not be prepared for). You can then create a special MANIFEST resource and compile it into the EXE, asking Windows to load DLLs not by name alone but by name plus version. You can replicate that functionality with dynamic loading as well, but using the Windows-provided toolset it is much easier.
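As an illustration, a manifest is just an XML resource; the snippet below is the canonical Common Controls example (the assembly name and publicKeyToken are the well-known values for comctl32 version 6, used here purely as a sketch):

    <!-- app.exe.manifest: ask the loader for a specific assembly version -->
    <assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
      <dependency>
        <dependentAssembly>
          <assemblyIdentity type="win32" name="Microsoft.Windows.Common-Controls"
                            version="6.0.0.0" processorArchitecture="*"
                            publicKeyToken="6595b64144ccf1df" language="*"/>
        </dependentAssembly>
      </dependency>
    </assembly>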
Now you can decide which of those options do or do not matter in your particular case and make a somewhat better informed choice.
HTH.
Related
I know very little about DLLs and LIBs other than that they contain vital code required for a program to run properly: libraries. But why do compilers generate them at all? Wouldn't it be easier to just include all the code in a single executable? And what's the difference between DLLs and LIBs?
There are static libraries (LIB) and dynamic libraries (DLL) - but note that .LIB files can be either static libraries (containing object files) or import libraries (containing symbols to allow the linker to link to a DLL).
Libraries are used because you may have code that you want to use in many programs. For example if you write a function that counts the number of characters in a string, that function will be useful in lots of programs. Once you get that function working correctly you don't want to have to recompile the code every time you use it, so you put the executable code for that function in a library, and the linker can extract and insert the compiled code into your program. Static libraries are sometimes called 'archives' for this reason.
Dynamic libraries take this one step further. It seems wasteful to have multiple copies of the library functions taking up space in each of the programs. Why can't they all share one copy of the function? This is what dynamic libraries are for. Rather than building the library code into your program when it is compiled, the library can be mapped into your program's address space as it is loaded into memory. Multiple programs running at the same time that use the same functions can all share one copy, saving memory. In fact, you can load dynamic libraries only as needed, depending on the path through your code. No point in having the printer routines taking up memory if you aren't doing any printing. On the other hand, this means you have to have a copy of the dynamic library installed on every machine your program runs on. This creates its own set of problems.
As an example, almost every program written in 'C' will need functions from a library called the 'C runtime library', though few programs will need all of the functions. The C runtime comes in both static and dynamic versions, so you can determine which version your program uses depending on particular needs.
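With MSVC, for instance, that choice is a compiler switch (these are the standard cl options; app.c is a placeholder name):

    cl /MT app.c
    cl /MD app.c

/MT copies the static C runtime into app.exe, while /MD makes app.exe import the runtime from its DLL at load time.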
Another aspect is security (obfuscation). Once a piece of code is extracted from the main application and put in a 'separate' dynamic-link library, it is easier to attack and analyse (reverse-engineer) the code, since it has been isolated. When the same piece of code is kept in a LIB library, it is part of the compiled (linked) target application, and it is thus harder to isolate (differentiate) that piece of code from the rest of the target binaries.
One important reason for creating a DLL/LIB rather than just compiling the code into an executable is reuse and relocation. The average Java or .NET application (for example) will most likely use several 3rd party (or framework) libraries. It is much easier and faster to just compile against a pre-built library, rather than having to compile all of the 3rd party code into your application. Compiling your code into libraries also encourages good design practices, e.g. designing your classes to be used in different types of applications.
A DLL is a library of functions that are shared among other executable programs. Just look in your Windows\System32 directory and you will find dozens of them. When your program creates a DLL, it also normally creates a .lib file so that application .exe programs can resolve symbols that are declared in the DLL.
A .lib is a library of functions that are statically linked to a program; they are NOT shared by other programs. Each program that links with a .lib file has all the code in that file. If you have two programs A.exe and B.exe that link with C.lib, then A and B will each contain the code in C.lib.
How you create DLLs and LIBs depends on the compiler you use. Each compiler does it differently.
One other difference lies in performance. Because a DLL is loaded at runtime by the .exe(s) and calls into it go through an extra level of indirection, performance is slightly lower relative to static linking. On the other hand, a .lib is code that is linked statically at compile time into every process that requests it, so the .exe calls that code directly inside its own memory image, which improves the performance of the process.
Came across the following paragraph from a page on the MySQL website here:
You can write plugins in C or C++ (or another language that can use C calling conventions). Plugins are loaded and unloaded dynamically, so your operating system must support dynamic loading and you must have compiled the calling application dynamically (not statically). For server plugins, this means that mysqld must be compiled dynamically.
What is meant by dynamic compiling? I know about dynamic linking, but I'm not sure what they mean by dynamic compiling.
Also, on Windows 10 (x64), how can I make sure that an exe has been compiled dynamically? Is it possible to figure it out from the output of dumpbin? Here's the dumpbin output for mysqld.exe (version 5.7):
Note: I reviewed this old question, which did not provide me with that much information. The depends tool it suggests no longer ships with Windows.
Compiling dynamically simply means that you are compiling the code such that the compiled output is suitable for dynamic linking.
On Windows, the process of creating a DLL necessarily compiles it such that it's suitable for dynamic linking, because DLLs are always dynamically linked.
I believe that most platforms today always compile dynamically and produce relocatable output, even if they're subsequently linked statically.
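As for the dumpbin part of the question: a module's import table lists the DLLs it links to dynamically, so inspecting it is one way to confirm dynamic linking (/dependents and /imports are standard dumpbin switches):

    DUMPBIN /dependents mysqld.exe
    DUMPBIN /imports mysqld.exe

If the executable imports the C runtime from a DLL rather than embedding it, the runtime DLL will show up among the dependents.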
Can anyone explain the difference between a shared and a static library in gcc (makefiles)? I read that a static library is independent code but that it increases the size of your executable file, whereas a shared library links the functions dynamically and does not increase the size of your executable. I cannot understand the difference between these two.
Can anyone tell me when I should create a static library and when I should create a shared library? They say a shared library is position-independent code. What do we mean by position-independent code?
If a shared library does not increase the code size and a static library does, then we could just always go for a shared library, right? But why do we have static libraries too; what is their real use? Please help me, guys.
When you compile code with gcc, it can directly produce object files and a number of object files can be linked together to produce an executable file.
A static library is simply a collection of object files (usually with an index) which can be used by a linker when it creates an executable.
The difference between just linking object files together and linking with a static library is that with the library, the linker will only pick the object files from it that are absolutely needed, whereas when given object files directly, the linker is forced to take all of them.
A dynamic library is different again, very different. It is also a collection of object files, but this time it's the output of a linking process, with all of its internal links already resolved.
Linking with a dynamic library means that the linker only resolves symbols in the final executable, but adds none of the object code from the library at link time.
Running an executable linked with dynamic libraries means that at run time the dynamic loader has to ask the operating system to load each specified dynamic library into memory (this is why a dynamic library has to be created by a linker), usually before the program gets to main().
This executable can be a very small file on disk, but when loaded into memory and after loading dynamic libraries the total memory image can be quite large as the whole of the library will have been loaded in.
I believe it's the case that if several different programs need the same dynamic library, the operating system can use virtual memory in each program to map the library while only one copy occupies physical memory, which can be a big saving in some cases.
If you ship an executable which needs a dynamic library, then you might be lucky and the file might already be on the user's machine. If not, you would have to install that as well. However, if you find bugs in your library, then just shipping a new version of your dynamic library will fix the bugs.
An executable built with static libraries comes ready to run, but fixing bugs would involve reshipping the whole executable.
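To make this concrete, a minimal sketch of both build styles with gcc (hello.c and main.c are placeholder names; ar, -fPIC and -shared are the standard tools and flags):

    # Static library: an indexed archive of object files
    gcc -c hello.c                        # compile to hello.o
    ar rcs libhello.a hello.o             # the 'librarian' step
    gcc main.c -L. -lhello -o app_static  # needed objects copied into the exe

    # Shared library: a linked, position-independent object
    gcc -fPIC -shared hello.c -o libhello.so
    gcc main.c -L. -lhello -o app_shared  # only symbols resolved; loaded at run time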
In Summary:
                      object files   static library   dynamic library
    produced by       compiler       librarian        linker
    exe size          large          smaller          smallest
    ram x 1 copy      large          smaller          largest
    ram x n copies    large          smaller          smallest
    dep. on files     independent    independent      dependent
    upgrade lib       no             no               yes
You might want to create a single executable without dependencies on external libraries.
As far as I know, statically linked libraries should be somewhat faster.
Drawbacks: linking a library statically not only increases the executable size, the library code also gets loaded with each executable, so if you use the library in multiple executables it will be in memory multiple times.
Position-independent code means that at the creation of the library the exact location of the library in memory isn't known, so only relative memory addressing is available (e.g., jump to +50 bytes from this instruction and continue execution from there).
Consider a C++ API defined as a series of __declspec(dllexport/dllimport) classes.
Further, assume that the caller is never permitted to call the ordinary operator new(size_t) on these classes. Either a static factory method does the new-ing or there is a class-specific operator new. And ditto on the delete side as needed (frequently just a virtual destructor).
Now, if you compile and link a DLL and an import library (implib) with the tools from VS2010, can you hand that implib and DLL to a user of VS2005 and expect it to work?
MFC is not involved here at all.
I'd be particularly grateful to any reference to any relatively formal Microsoft statement on the subject.
So long as the name mangling of the C++ API is identical (it is), and the API does not use STL-specific parameter types, such as basic_string or std::map, whose implementations may have changed between releases of the compiler (and they have), then it should just work.
Of course, you'll want to make sure you either compiled your DLL in /MT mode (statically linked runtime) or include the redistributables for the VS2010 runtime with your supplied libraries and link targets.
EDIT: Expanding on "don't pass in types that have version-specific implementations". A partial list is most easily found by looking at the exports of MSVCP100.DLL:
cd %VS100COMNTOOLS%\..\VC\redist\x86\Microsoft.VC100.CRT
DUMPBIN /exports MSVCP100.DLL
The next issue will be header-only implementations of things like map or set which have changed under the hood between versions of the compiler.
This is why it's highly recommended that only scalar types be passed across boundaries between memory arenas; then simple tests will pass and remain reliable.
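A hedged sketch of the pattern the question describes (Widget and the macro names are hypothetical): allocation and deallocation both happen on the DLL's side of the boundary, and only scalar types cross it:

    // widget.h -- shared between the DLL and its clients (hypothetical API)
    #ifdef BUILDING_WIDGET_DLL
    #  define WIDGET_API __declspec(dllexport)
    #else
    #  define WIDGET_API __declspec(dllimport)
    #endif

    class WIDGET_API Widget {
    public:
        static Widget* Create(int initialValue); // factory news inside the DLL
        virtual ~Widget();          // virtual dtor: delete dispatches into the DLL
        virtual int Value() const;  // only scalar types cross the boundary
    protected:
        explicit Widget(int v);
        int value_;
    };

    // widget.cpp -- compiled into the DLL with BUILDING_WIDGET_DLL defined
    Widget* Widget::Create(int v) { return new Widget(v); } // DLL's operator new
    Widget::Widget(int v) : value_(v) {}
    Widget::~Widget() {}
    int Widget::Value() const { return value_; }

Because both operator new and the deleting destructor run inside the DLL, the client and the DLL can even use different CRT versions without corrupting each other's heaps, which is the crux of the VS2005/VS2010 question.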
You have not mentioned whether you used MFC to create the DLLs. If you have, regular MFC DLLs should work, but I don't think extension DLLs will, as the latter link to the MFC DLLs. I am including links for your reference.
http://www.codeguru.com/cpp/cpp/cpp_mfc/tutorials/article.php/c4017
http://www.experts-exchange.com/Programming/System/Windows__Programming/MFC/Q_20385543.html
http://msdn.microsoft.com/en-us/library/26h8x9sy%28v=VS.100%29.aspx
EDIT
If it's a normal DLL, there should not be any problem. It also depends on the linkage type.
When should one implicitly or explicitly link to a DLL, and what are the common practices or pitfalls?
It is fairly rare to explicitly link a DLL. Mostly because it is painful and error prone. You need to write a function pointer declaration for the exported function and get the LoadLibrary + GetProcAddress + FreeLibrary code right. You'd do so only if you need a runtime dependency on a plug-in style DLL or want to select from a set of DLLs based on configuration. Or to deal with versioning, an API function that's only available on later versions of Windows for example. Explicit linking is the default for COM and .NET DLLs.
More background info in this MSDN Library article.
I'm assuming you refer to linking using a .lib vs loading a DLL dynamically using LoadLibrary().
Loading a DLL statically by linking to its .lib is generally safer. The linking stage checks at build time that all the entry points exist, and there is no chance you'll load a DLL that doesn't have the function you're expecting. It is also easier not to have to use GetProcAddress().
So generally you should use dynamic loading only when it is absolutely required.
I agree with the others who answered you already (Hans Passant and shoosh). I want to add only two things:
1) One common scenario where you have to use LoadLibrary and GetProcAddress is the following: you want to use some new API that exists only in newer versions of Windows, but the API is not critical to your application. So you test with LoadLibrary and GetProcAddress whether the function you need exists, and use it if so. What your program does if the function does not exist depends entirely on your implementation.
2) There is one important option which you did not include in your question: delayed loading of DLLs. In this case the DLL is loaded when one of its functions is first called, not at application start. This allows you to use import libraries (.lib files) in some scenarios where explicit linking would seem to be required at first glance. Moreover, it improves the startup time of applications and is widely used by Windows itself. So this approach is also recommended.
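For MSVC C++, a minimal hedged sketch of delay loading (the /DELAYLOAD linker option together with delayimp.lib is the standard mechanism; wtsapi32.dll reprises the example from the top of the thread):

    // Build (MSVC command line, as an illustration):
    //   cl app.cpp wtsapi32.lib delayimp.lib /link /DELAYLOAD:wtsapi32.dll
    #include <windows.h>
    #include <wtsapi32.h>
    #include <stdio.h>

    int main()
    {
        PWTS_SESSION_INFOW info = NULL;
        DWORD count = 0;
        // The code reads like implicit linking, but wtsapi32.dll is only
        // loaded here, on the first call, by the delay-load helper.
        if (WTSEnumerateSessionsW(WTS_CURRENT_SERVER_HANDLE, 0, 1, &info, &count)) {
            printf("%lu session(s)\n", (unsigned long)count);
            WTSFreeMemory(info);
        }
        return 0;
    }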