dynamic link library - windows

I know that dynamic link libraries are loaded into memory when an application is loaded, and that references to them are resolved by the operating system loader. On Windows, for example, there are kernel32.dll, user32.dll and gdi32.dll. If my application references a function in user32.dll, for example CreateWindow, must the whole DLL be loaded into the process, or just part of it?
Thanks

The whole thing, but don't worry: it's not reloading the DLL over and over. There is one copy shared by all the programs that use it. On Unix the equivalent of a DLL is an .so, or shared object, and that's the whole point: to share.
http://en.wikipedia.org/wiki/Dynamic_link_library

You reference one function, you get the whole DLL. You can't load just part of a DLL.
It's annoying because you get all of Shell32.dll just to find where someone's home directory is. Sigh.

Don't worry about this so much. When you "load" a DLL, it's really just a memory-mapped file; Windows uses the page-fault mechanism to bring in pages on demand, so if you only use a small piece of the DLL you aren't actually going to fault the whole thing in.
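To make the on-demand behaviour concrete, here is a toy model in portable C++. It is only a sketch: real paging is done by the OS, not by user code, and the names `MappedImage` and `touch` as well as the 4 KiB page size are assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a mapped DLL image (hypothetical, not a Windows API).
// The whole image is "mapped" up front, but a page only becomes resident
// the first time something inside it is touched.
constexpr std::size_t kPageSize = 4096;  // assumed page size

class MappedImage {
public:
    explicit MappedImage(std::size_t image_size)
        : resident_((image_size + kPageSize - 1) / kPageSize, false) {}

    // Simulates accessing one byte of the image.
    // Returns true if the access caused a (simulated) page fault.
    bool touch(std::size_t offset) {
        const std::size_t page = offset / kPageSize;
        assert(page < resident_.size() && "offset outside the mapped image");
        if (resident_[page]) return false;  // already resident, no fault
        resident_[page] = true;             // "fault in" just this page
        return true;
    }

    // Number of pages covered by the mapping.
    std::size_t mapped_pages() const { return resident_.size(); }

    // Number of pages that have actually been brought in.
    std::size_t resident_pages() const {
        std::size_t n = 0;
        for (bool r : resident_) n += r ? 1 : 0;
        return n;
    }

private:
    std::vector<bool> resident_;
};
```

In this model, touching a single offset in a 1 MiB image leaves one of its 256 mapped pages resident, which is the distinction between "the whole DLL is mapped" and "the whole DLL is in memory".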

Only the parts of the DLL that your code actually touches need to be resident in memory. Do not worry about wasting memory: most of these DLLs are standard system DLLs, and because they are mapped on demand, the pages your code never uses are not brought in, even though the entire DLL is mapped into the process.
Hope this helps,
Best regards,
Tom.

Related

Does the machine code that implements Win APIs stay in OS kernel memory space, or is it compiled in as part of the app?

If this question deals with too basic a matter, please forgive me.
As a somewhat-close-to-beginner-level programmer, I really wonder about this: is the underlying code of every Win API function compiled into an app when the app is built, or does the machine code that implements the Win APIs stay in memory as part of the OS from the time the PC boots, with apps merely using it?
All the APIs of an OS are used by many apps by means of function calls. So I thought that, rather than every individual app including the API machine code on its own, apps just contain the header or signature needed to call the APIs, and the addresses of the API machine code are mapped when the app is launched.
I am sorry that I failed to make this question succinct due to my poor English. I really would like to get your insights. Thank you.
The implementation for (most) API calls is provided by the system by way of compiled modules (Portable Executable images). Application code only contains enough information so that the system can identify and load the required modules, and resolve the respective imports.
As an example consider the following code that shows a message box, waits for it to close, and then exits the program:
#include <Windows.h>

int main()
{
    ::MessageBoxW(nullptr, L"Foo", L"Bar", MB_OK);
}
Given the function signature (declared in WinUser.h, which gets pulled in from Windows.h) the compiler can almost generate a call instruction. It knows the number of arguments, their expected types, and the order and location the callee expects them in. What's missing is the actual target address inside user32.dll. That is only known after the process has been fully initialized and the user32.dll module has been mapped into its address space.
Clearly, the compiler cannot postpone code generation until after load time. It needs to generate a call instruction now. Since we know that "all problems in computer science can be solved by another level of indirection" that's what the compiler does, too: Instead of emitting a direct call instruction it generates an indirect call. The difference is that, while a direct call immediately needs to provide the target address, an indirect call can specify the address at which the target address is stored.
In x86 assembly, instead of having to say
call _MessageBoxW@16 ; uh-oh, not yet known
the compiler can conveniently delegate the call to the Import Address Table (IAT):
call dword ptr [__imp__MessageBoxW@16]
Disaster averted: we've bought ourselves just enough time to fix things up before the code actually executes.
Once a process object is created the system hands over control to its primary thread to finish initialization. Part of that initialization is loading dependencies (such as user32.dll here). Once that has completed, the system finally knows the load address (and ultimately the address of imported symbols, such as _MessageBoxW@16), and can overwrite the IAT entry at address __imp__MessageBoxW@16 with the imported function address.
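The indirection can be mimicked in portable C++ with a function-pointer slot standing in for one IAT entry. This is only a sketch of the idea, not how the Windows loader is implemented; `imp_message_box`, `real_message_box`, `app_code` and `resolve_imports` are hypothetical names, and the export lookup is reduced to a map from name to address.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Signature shared by the imported functions in this sketch (hypothetical).
using ImportedFn = int (*)(int);

// Stand-in for a function that lives in a separately loaded module.
int real_message_box(int flags) { return flags + 1; }

// One slot of a simulated import address table. The "compiler" emits calls
// through this slot; the "loader" fills it in before any call happens.
ImportedFn imp_message_box = nullptr;

// What the compiled application code looks like: an indirect call through
// the slot, comparable to `call dword ptr [__imp__...]`.
int app_code(int flags) {
    assert(imp_message_box != nullptr && "import not resolved yet");
    return imp_message_box(flags);
}

// Simulated loader step: look the export up by name and patch the slot.
bool resolve_imports(const std::unordered_map<std::string, ImportedFn>& exports) {
    auto it = exports.find("MessageBox");
    if (it == exports.end()) return false;  // a real loader would fail the load
    imp_message_box = it->second;
    return true;
}
```

`app_code` is compiled without knowing where `real_message_box` lives; it only knows where the slot is. Once `resolve_imports` has run, the same unchanged call reaches the real function.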
And that is approximately how the system provides implementations for system services without requiring client applications to know where (physically) they will find them.
I'm saying "approximately" because things are somewhat more involved in reality. If that is something you'll want to learn about, I'll leave it up to Raymond Chen. He has published a series of blog entries covering this topic in far more detail:
How were DLL functions exported in 16-bit Windows?
How were DLL functions imported in 16-bit Windows?
How are DLL functions exported in 32-bit Windows?
Exported functions that are really forwarders
Rethinking the way DLL exports are resolved for 32-bit Windows
Calling an imported function, the naive way
How a less naive compiler calls an imported function
Issues related to forcing a stub to be created for an imported function
What happens when you get dllimport wrong?
Names in the import library are decorated for a reason
Why can't I GetProcAddress a function I dllexport'ed?

{$IMAGEBASE $13140000} directive in a unit from an advanced hooking / injection library: explanation needed

What I have done so far:
I found it in the AfxCodeHook.pas unit by Aphex.
I have also skimmed a bunch of interesting sample codes using it:
Inject Library (how to inject a DLL into another process).
Inject Library Ex (how to inject a DLL into another process using the Ex method).
Create Process Ex (how to inject a DLL into a created process using the Ex method).
Inject Executable (InjectLibraryEx's true power: the ability to inject EXE files).
Simple Api Hooking (how to use afxcodehook to manipulate calls to windows apis).
I have also read:
Embarcadero's RAD Studio help entry on Image base address and
Base address page in Wikipedia.
Question:
I am looking for an informed opinion and a simple explanation, in layman's terms, of the {$IMAGEBASE $13140000} directive from seasoned Delphi coders.
This specifies the preferred base address of the DLL. If the DLL can be loaded at this address, then the loader will do so. If it cannot, then it needs to be relocated and all the absolute addresses in the DLL need to be adjusted to the new address.
When the loader attempts to map a DLL into a process address space, it first reads the preferred base address. Then it works out the size of the DLL. Finally it checks to see if a contiguous block of memory stretching from the base address to the base address + size can be found. If so then the DLL is loaded at the preferred base address. If another DLL, or the exe resides at the preferred base address, then the DLL will need to be relocated. If the application has reserved heap memory that overlaps with the preferred DLL load address space, then the DLL will need to be relocated.
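The decision described above can be sketched as a small overlap check followed by a fixup pass. This is a simplified model under stated assumptions, not the actual loader: `Region`, `range_is_free`, `choose_base` and `apply_relocations` are hypothetical names, the fallback address is supplied by the caller rather than searched for, and fixups are modelled as a plain list of absolute addresses.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// A half-open address range [begin, end) already in use in the process
// (another DLL, the exe, a reserved heap block, ...).
struct Region {
    std::uint64_t begin;
    std::uint64_t end;
};

// Returns true if [base, base + size) overlaps none of the occupied regions.
bool range_is_free(std::uint64_t base, std::uint64_t size,
                   const std::vector<Region>& occupied) {
    const std::uint64_t end = base + size;
    assert(end >= base && "address range must not wrap around");
    for (const Region& r : occupied) {
        if (base < r.end && r.begin < end) return false;  // ranges intersect
    }
    return true;
}

// Picks the load address: the preferred base if it is free, otherwise the
// caller-supplied fallback (a real loader finds a free range itself).
std::uint64_t choose_base(std::uint64_t preferred, std::uint64_t size,
                          const std::vector<Region>& occupied,
                          std::uint64_t fallback) {
    return range_is_free(preferred, size, occupied) ? preferred : fallback;
}

// Relocation: shift every recorded absolute address by the difference
// between the actual and the preferred base.
void apply_relocations(std::vector<std::uint64_t>& absolute_addresses,
                       std::uint64_t preferred, std::uint64_t actual) {
    const std::uint64_t delta = actual - preferred;  // unsigned wrap is intended
    for (std::uint64_t& a : absolute_addresses) a += delta;
}
```

If the preferred base is free, `choose_base` returns it and no fixups are needed; otherwise every absolute address has to be rewritten, which is the work a well-chosen image base avoids.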
If a DLL needs to be relocated then its physical pages cannot be shared between processes. The Windows system DLLs have carefully chosen base addresses to ensure that there are no collisions and that they can be shared.
Nowadays, Address Space Layout Randomization (ASLR) complicates matters even further.
You can learn more from these articles:
Dr. Dobbs: Rebasing Win32 DLLs
Peering Inside the PE: A Tour of the Win32 Portable Executable File Format

Working around FLS limitations with too many statically linked CRTs?

When loading external DLLs (not under our control) via LoadLibrary, we're hitting a problem where the statically linked CRTs in those DLLs fail to allocate fiber-local storage (FLS). This is similar to MSKB 193462, except that this is FLS and there are only 128 slots.
Are there any useful ways to work around the problem? The CRT is using GetProcAddress to find FlsAlloc anyway (since that apparently never existed in XP), so does it even really need it?
(This is on Vista, where FlsAlloc actually exists; the DLLs appear to be using MSVC8)
There is frankly no solution here, short of loading fewer DLLs.
You could hook the DLL's import address table, but that will happen too late: you can only install an IAT hook once LoadLibrary returns, and the CRT initialization code probably executes in response to DLL_PROCESS_ATTACH, which will already have been processed.
You could, I guess, find the kernel32.dll module in memory and patch the export address for GetProcAddress, or perhaps FlsAlloc, to point to your implementation. But that approach is getting seriously hackish.

Attempted to read or write protected memory when calling native C DLL

I have a native C DLL that exports one function besides DllEntryPoint, FuncX. I'm trying to find out how FuncX communicates with its caller, because it has a void return type and no parameters. When I call it from a C# harness, I get an AccessViolationException: Attempted to read or write protected memory.
I have a hunch that its client application may allocate a buffer for sending or receiving values from the dll. Is this a valid hunch?
I can't debug the client application because for some reason it doesn't run, so I can't start it and attach to the process. I can, however, disassemble it in IDA Pro, but I don't know whether it is possible to debug it in there, or how.
If the DLL in question has any static or global symbols, it's possible that all communication is done via those symbols. Do you have any API code that looks like it might be doing this?
It is unlikely that the DLL is using a client supplied buffer, as both client and server would need to know the base address of that buffer, and you can't ask calloc or malloc for a "preferred" address at call time.
You might also try running link /dump /exports and pointing it at your DLL. That will show you the list of exported symbols in your DLL. Good luck!
I would try loading the DLL itself into IDA Pro. Hopefully C# preserves the native call stack, and you can look at the code around where the DLL crashes.
Side note: the Decompiler plugin is pretty awesome.

How to intercept dll method calls?

How to intercept dll method calls?
What are the techniques available for it?
Can it be done only in C/C++?
How to intercept method calls from all running processes to a given dll?
How to intercept method calls from a given process to a given dll?
There are two standard ways I can think of for doing this:
DLL import table hook.
For this you need to parse the PE header of the DLL, find the import table, and write the address of your own function in place of what is already written there. You can save the address of the original function to be able to call it later. The references in the external links of this Wikipedia article should give you all the information you need to be able to do this.
Direct modification of the code.
Find the actual code of the function you want to hook and modify its first opcodes to jump to your own code. You need to save the opcodes that were there so they will eventually get executed. This is simpler than it sounds, mostly because it has already been implemented by no less than Microsoft themselves in the form of the Detours library.
This is a really neat thing to do. With just a couple of lines of code you can, for instance, replace all calls to GetSystemMetrics() from, say, outlook.exe and watch the wonders that occur.
The advantages of one method are the disadvantages of the other. The first method lets you add a surgical hook to exactly the DLL you want, while all other DLLs stay unhooked. The second method gives you the most global kind of hook, intercepting all calls to the function.
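The import table approach can be illustrated without any Windows APIs by treating each module's import entry as a function-pointer slot. This is a toy sketch, not working hook code for a real PE image: all names (`import_slot`, `other_import_slot`, `install_hook`, `real_get_metric` and so on) are hypothetical, and the PE parsing step is left out entirely.

```cpp
#include <cassert>

using MetricsFn = int (*)(int);

// Stand-in for the real API implementation (hypothetical).
int real_get_metric(int index) { return index * 10; }

// Simulated import table slots of two different modules. Each module calls
// the API through its own slot, so they can be hooked independently.
MetricsFn import_slot = &real_get_metric;        // module we want to hook
MetricsFn other_import_slot = &real_get_metric;  // module we leave alone

// Saved original pointer, so the hook can still call through.
MetricsFn original_get_metric = nullptr;

// Replacement function: calls the original and alters the result.
int hooked_get_metric(int index) {
    assert(original_get_metric != nullptr && "hook used before installation");
    return original_get_metric(index) + 1;
}

// Installs the hook: remember the old pointer, then overwrite the slot.
void install_hook(MetricsFn* slot, MetricsFn replacement, MetricsFn* saved) {
    *saved = *slot;
    *slot = replacement;
}

// Code inside each module always goes through that module's own slot.
int caller_in_hooked_module(int index) { return import_slot(index); }
int caller_in_other_module(int index) { return other_import_slot(index); }
```

After `install_hook(&import_slot, &hooked_get_metric, &original_get_metric)`, calls from the hooked module go through the replacement while the other module is unaffected, which is the "surgical" property of the first method. An inline (Detours-style) hook would instead change `real_get_metric` itself and so affect both callers.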
Provided that you know all the DLL functions in advance, one technique is to write your own wrapper DLL that will forward all function calls to the real DLL. This DLL doesn't have to be written in C/C++. All you need to do is to match the function calling convention of the original DLL.
See Microsoft Detours for a library with a C/C++ API. It's a bit non-trivial to inject it in all other programs without triggering virus scanners/malware detectors. But your own process is fair game.
On Linux, this can be done with the LD_PRELOAD environment variable. Set this variable to point at a shared library that contains a symbol you'd like to override, then launch your app.

Resources