I have an executable that needs a dll file for dependencies. I wonder if it's possible to actually patch a PE file that needs the dll that would read the entry point from a pointer which is located either in memory or in a resource. If this would be possible how can I do it?
Thanks for your help :)
It's not simply an entry point in the DLL that the dependent executable needs. The DllMain is simply invoked by the system (PE loader) when the DLL is loaded. The dynamic loader resolves imported addresses in structures like the Import Address Table and at runtime the application can even resolve addresses dynamically via GetProcAddress.
It is theoretically possible to relocate/base the DLL to some unused part of memory and patch all references to its functions to the relocated code but this is extremely difficult and would require an intimate understanding of the OS. I'm not sure what kind of answer you are looking for. The steps required to achieve this? It would be highly non-trivial and I'm not aware it has been done before (the closest I've seen is ILMerge which is for .NET assemblies). Essentially you are transforming the code to the equivalent of if the DLL had been statically linked at compile time.
Related
I know very little about DLL's and LIB's other than that they contain vital code required for a program to run properly - libraries. But why do compilers generate them at all? Wouldn't it be easier to just include all the code in a single executable? And what's the difference between DLL's and LIB's?
There are static libraries (LIB) and dynamic libraries (DLL) - but note that .LIB files can be either static libraries (containing object files) or import libraries (containing symbols to allow the linker to link to a DLL).
Libraries are used because you may have code that you want to use in many programs. For example if you write a function that counts the number of characters in a string, that function will be useful in lots of programs. Once you get that function working correctly you don't want to have to recompile the code every time you use it, so you put the executable code for that function in a library, and the linker can extract and insert the compiled code into your program. Static libraries are sometimes called 'archives' for this reason.
Dynamic libraries take this one step further. It seems wasteful to have multiple copies of the library functions taking up space in each of the programs. Why can't they all share one copy of the function? This is what dynamic libraries are for. Rather than building the library code into your program when it is compiled, it can be run by mapping it into your program as it is loaded into memory. Multiple programs running at the same time that use the same functions can all share one copy, saving memory. In fact, you can load dynamic libraries only as needed, depending on the path through your code. No point in having the printer routines taking up memory if you aren't doing any printing. On the other hand, this means you have to have a copy of the dynamic library installed on every machine your program runs on. This creates its own set of problems.
As an example, almost every program written in 'C' will need functions from a library called the 'C runtime library, though few programs will need all of the functions. The C runtime comes in both static and dynamic versions, so you can determine which version your program uses depending on particular needs.
Another aspect is security (obfuscation). Once a piece of code is extracted from the main application and put in a "separated" Dynamic-Link Library, it is easier to attack, analyse (reverse-engineer) the code, since it has been isolated. When the same piece of code is kept in a LIB Library, it is part of the compiled (linked) target application, and this thus harder to isolate (differentiate) that piece of code from the rest of the target binaries.
One important reason for creating a DLL/LIB rather than just compiling the code into an executable is reuse and relocation. The average Java or .NET application (for example) will most likely use several 3rd party (or framework) libraries. It is much easier and faster to just compile against a pre-built library, rather than having to compile all of the 3rd party code into your application. Compiling your code into libraries also encourages good design practices, e.g. designing your classes to be used in different types of applications.
A DLL is a library of functions that are shared among other executable programs. Just look in your windows/system32 directory and you will find dozens of them. When your program creates a DLL it also normally creates a lib file so that the application *.exe program can resolve symbols that are declared in the DLL.
A .lib is a library of functions that are statically linked to a program -- they are NOT shared by other programs. Each program that links with a *.lib file has all the code in that file. If you have two programs A.exe and B.exe that link with C.lib then each A and B will both contain the code in C.lib.
How you create DLLs and libs depend on the compiler you use. Each compiler does it differently.
One other difference lies in the performance.
As the DLL is loaded at runtime by the .exe(s), the .exe(s) and the DLL work with shared memory concept and hence the performance is low relatively to static linking.
On the other hand, a .lib is code that is linked statically at compile time into every process that requests. Hence the .exe(s) will have single memory, thus increasing the performance of the process.
I have searched the web , but with different answers to my query. I am not an expert with Windows but I would like to understand it exactly.
When an application is compiled for Windows, which will involve the need for runtime-linking of libraries (DLLs), like using a core library kernel32.dll or some other user-created dll, does the application need to know that the dll exists before run-time.
I have read that a dll must be accompanied by a .lib file which must be linked in at compile time but somewhere it states that the .lib file is not required.
Does the application just execute and expect that it will find the functions needed in a dll and just fail if not found?
Thanks in advance.
List of dynamically linked functions is kept in import section of PECOFF executable file, as defined with the second directory item in optional header. Usual name of the section is .idata or IMPORT.
The only necessary information kept in the import directory is
name of the function
filename of the dynamic library (without path)
So the answer is negative, linker does not need to know anything about the DLL and it doesn't necessarily need to exist at link-time. It is the loader's job to find the refered DLL at run-time (and report error Entry point FunctionName was not found in the library eventually).
Import libraries .lib exist because of older compilers and linkers which lack the directive to declare imported function and its DLL, so they need the linked library which contains import definition and also indirect proxy jumps to IAT. But good linker doesn't need this, as this information can be created on the fly.
I am currently looking at an issue where a project is generating a DLL. Now it is built upon a chain of other projects which are all C++, so they're just .lib files being linked in.
In this case one project uses OpenCL, however I don't believe those code paths are being run. However it would appear that just having OpenCL linked in causes the output .DLL to have a dependency on OpenCL.dll.
Please correct me if I am wrong here (in which case I'll go over the code with a fine-toothed comb to ensure no OpenCL calls are being executed).
I am not sure how Visual Studio (or dependency walker?) figures out which DLL's are dependencies for a given DLL. However I don't want the OpenCL.dll dependency.
What are my options?
One possible one is to take the project with the OpenCL code and refactor so some portion of it can be build with the OpenCL portion excluded from the build. However this will be a fair chunk of work so I am really hoping for something simpler.
If you link a lib you depend on the dll. The easiest thing for you would be to add /DELAYLOAD for this dll:
The /DELAYLOAD option causes the DLL that's specified by dllname to be loaded only on the first call by the program to a function in that DLL.
This generates a 'soft' dependency which only kicks in if you call a function that actually needs the DLL. Make sure you read Constraints of Delay Loading DLLs.
The other option (which I don't recommend) is to use runtime binding via LoadModule and GetProcAddress, then invoke functions via the pointer to function. This removes any dependency, but you are tasked to implement all check, all errors if the DLL is missing, and it can easy go astray if you mismatch the function signature/call convention. Ultimately, you'd be implementing /DELOAYLOAD manually.
I am aware that implicitly linking to libraries at load time can lead to performance increases and as such I was wondering if it was good practice to link in this way at compile time thus increasing executable size (admittedly this is only marginal) compared to linking explicitly at runtime. My question is when linking against Microsoft Windows dll files located in System32, is it 'better' to link at load time as you can be mostly certain that the libraries will be present or follow the explicit approach?
Language used is Delphi (pascal) and the library in question is the WTsAPI32.dll - Terminal Services.
EDIT: As pointed out - my choice of language was incorrect and has been amended. Also, due to having only really every extensively linked to libraries in Unix, my comments about executable size can be omitted, I believed at the time I WAS in fact referring to static linking which bundles the library code into the executable and I now realise this is impossible when using dll files (DUH!). Thanks all.
The two forms of DLL linking are perhaps better named implicit and explicit. Implicit linking is what you refer to as static linking. And explicit linking is what you refer to as runtime linking
For implicit linking the linker writes entries into the import table of the executable file. This import table is metadata that is used by the loader to resolve DLL imports at module load time. A stub function is included for each implicit import that is only a few bytes in size. The executable size implications of implicit linking are negligible.
With explicit linking the imported function's address is resolved by a call to GetProcAddress. This call is made when the programmer chooses. If the DLL or the function cannot be resolved, the programmer can code fall back behaviour. There are size implications to explicit linking that I estimate to be similar to implicit linking. If the function address is evaluated once and remembered between calls then the performance characteristics are similar to implicit linking.
My advice is as follows:
Prefer implicit linking. It is more convenient to code.
If the DLL may not be present, use explicit linking.
If the DLL must be loaded using a full path, use explicit linking.
If you want to unload the DLL during program execution, use explicit linking.
You specifically mention Windows DLLs. You can safely assume that they will be present. Don't try to code to allow your program to run in case user32.dll is missing. Some functions may not be present in older versions of Windows. If you support those older versions you'll need to use explicit linking and provide a fallback. Decide which version you support and use MSDN to be sure that a function is available on your minimum supported platform.
If your only two options are static linking and run-time dynamic linking, then the latter is the best choice for linking with Windows DLLs because it's your only choice. You cannot link statically to a DLL because DLLs are exclusively for dynamic linking; that's what the D stands for. Microsoft does not provide static libraries for the OS modules, so you cannot link to them statically.
But those typically aren't your only two options. There's a third, namely load-time dynamic linking.
In Delphi, you use load-time dynamic linking by marking a function declaration external and specifying the name of the DLL where the function resides. If you use the function, then an entry is created in your module's import table, and when the OS loads your module, it reads the table, loads the referenced DLL, looks up the address of the function, and stores the address in your program's memory image so that your program can call it directly.
You use run-time dyanmic linking by declaring a function pointer, and then using LoadLibrary and GetProcAddress to look up the function's address prior to calling it. In newer Delphi versions, you can also declare a function in the same style that load-time dynamic linking uses, but then mark it with delay. In that case, the Delphi run-time library will call LoadLibrary and GetProcAddress on your behalf the first time you call the function.
The size differences are negligible. Run-time dynamic linking requires your program to contain code to load and link to libraries, but load-time dynamic linking stores more function references in the import table.
Run-time dynamic linking offers more flexibility in the face of uncertain DLL availability. With load-time dynamic linking, if a DLL is missing, or if it doesn't have all the functions mentioned in your import table, then the OS will fail to load your program — none of your code will run. With run-time dynamic linking, however, you have the opportunity to recover from the problem. You can disable certain parts of your program that the missing DLL depends on, or you can search for DLLs in non-standard places, or you can provide alternative implementations of missing functions.
If the functions you're calling are integral to your program's ability to operate, and there's ample reason to expect the functions to be present wherever your program is installed, then you should choose to link at load time. It allows you to write simpler code. You can be confident that you'll have the required functions if they are available on a certain version of windows that you check for in your installer, or if they're provided by DLLs that you distribute with your program.
On the other hand, if the functions you're calling are optional, then you should prefer to link at run time. Use that for loading plug-ins, or for taking advantage of advanced OS features while maintaining backward compatibility. (For example, you might want to take advantage of Windows Vista theme support when it's present, but still allow your program to run on Windows XP.)
Why do you think that compile-time linking to dynamic libraries would increase EXE size ? I believe you are mislead by somewhat poor choice of terms, used in windows programming from far ago. Let us better use relative terms "early binding" and "late binding" instead for the choice who should search for procedure names, compiler/loader or programmer's custom code.
Using early binding (aka static linking against dynamic library) your EXE contains the values (in a special tables):
DLL1 Name:
procedure "aaaaa" into the variable $1234
procedure "bbbbb" into the variable $5678
.
DLL2 Name:
procedure "ccccc" into the variable $4567
...et cetera.
Now, when you turn this into runtime loading (dynamic linking against dynamic libraries) it would look like
VarH1 := SafeLoatLibrary(DLL1 Name);
if Error-Loading-DLL then do-error-handling;
Var1234 := GetProcAfdress(VarH1, "aaaaa");
if Error-Searching-For-Function then do-error-handling;
Var5678 := GetProcAfdress(VarH1, "bbbbb");
if Error-Searching-For-Function then do-error-handling;
et cetera.
Obviously in the latter case your EXE contains all those values like in the 1st case, but more so - it contains a lot of code to deal with those values, that was just absent before.
So, while EXE size difference is not really large for today memory sizes, it is still in favor of early binding (static compilation against dynamic library).
Then what are the benefits for late binding? For example you can load different DLLs from different paths, determined in runtime by configuration - the flexibility and avoiding of DLL Hell (funny, concept of avoiding DLL Hell is against concept of volume saving). You can make your application work with limited functionality, if DLL load failed while statically binded EXE would just not load - graceful degradation concept. And at least you may give user much better, full of semantics, error messages than Windows could ever do.
And the last word, where you got that concept of EXE size from. I believe you mistaken it from talks about - attention! - static linking against static libraries. That is when OBJ/LIB/DCU files are not the part of distribution, but are just temporary code containers, that ultimately takes its place inside the monolythic EXE. Then yes - then your EXE has all those libraries insideitself and thus grows larger. However this case have nothing about dynamic libraries - DLLs.
The wording chosen once ago overuses static/dynamic terms in two closely related topics: how the library is loaded (compile-time vs runtime) and how functions inside the library are located (or bound. By developer's custom codeing ro by some OS-provided or compiler-provided toolset way before 1st line of your sources started execution).
Due to that ambiguity those close but different concepts start overlapping and sometimes this leads to a total confusion.
Now, what more static linking may give you in modern Windows versions. That is WinSxS folder Novadays Windows tends to keep multiple versions of each system DLL and your program may ask for the specific version of it (while in System32 folder there would be the most recent version that your program may be not get used to. Then you can make a special MANIFEST resource and compile it into EXE asking windows to load not DLLs not be name, but by name+version instead. You can replicaty that functionality with dynamic loading as well, but using Windows-provided toolset it is much easier.
Now you can decide which of those options do or do not have importance for your particular case and make somewhat better informed choice.
HTH.
When one should implicitly or explicitly link to a DLL and what are common practices or pitfalls?
It is fairly rare to explicitly link a DLL. Mostly because it is painful and error prone. You need to write a function pointer declaration for the exported function and get the LoadLibrary + GetProcAddress + FreeLibrary code right. You'd do so only if you need a runtime dependency on a plug-in style DLL or want to select from a set of DLLs based on configuration. Or to deal with versioning, an API function that's only available on later versions of Windows for example. Explicit linking is the default for COM and .NET DLLs.
More background info in this MSDN Library article.
I'm assuming you refer to linking using a .lib vs loading a DLL dynamically using LoadLibrary().
Loading a DLL statically by linking to its .lib is generally safer. The linking stage checks that all the entry points exist in compile time and there is no chance you'll load a DLL that doesn't have the function you're expecting. It is also easier not to have to use GetProcAddress().
So generally you should use dynamic loading only when it is absolutely required.
I agree with other who answered you already (Hans Passant and shoosh). I want add only two things:
1) One common scenario when you have to use LoadLibrary and GetProcAddress is the following: you want use some new API existing in new versions of Windows only, but the API are not critical in your application. So you test with LoadLibrary and GetProcAddress whether the function which you need exist, and use it in the case. What your program do if the functions not exist depend total from your implementation.
2) There are one important options which you not included in your question: delayed loading of DLLs. In this case the operating system will load the DLL when one of its functions is called and not at the application start. It allows to use import libraries (.lib files) in some scenarios where explicitly linking should be used at the first look. Moreover it improve the startup time of the applications and are wide used by Windows itself. So the way is also recommended.