Loading/calling ntdll from DllMain - windows

One should not use functions other than those in kernel32.dll from DllMain:
From MS documentation:
Because Kernel32.dll is guaranteed to be loaded in the process address space when the entry-point function is called, calling functions in Kernel32.dll does not result in the DLL being used before its initialization code has been executed. Therefore, the entry-point function can call functions in Kernel32.dll that do not load other DLLs. For example, DllMain can create synchronization objects such as critical sections and mutexes, and use TLS. Unfortunately, there is not a comprehensive list of safe functions in Kernel32.dll.
...
Calling functions that require DLLs other than Kernel32.dll may result in problems that are difficult to diagnose. For example, calling User, Shell, and COM functions can cause access violation errors, because some functions load other system components. Conversely, calling functions such as these during termination can cause access violation errors because the corresponding component may already have been unloaded or uninitialized.
My question:
But the documentation does not mention ntdll.dll. Can I call LoadLibrary for "ntdll" and use functions in ntdll from DllMain:
1) during DLL_PROCESS_ATTACH (load and use functions of ntdll)?
2) during DLL_PROCESS_DETACH (use functions of previously loaded ntdll)?

The answer to the question "is it safe in DllMain" always defaults to "no". In this case, calling LoadLibrary is never okay.
Generally speaking, calling anything in ntdll.dll is not recommended, even in places where it is safe to do so.
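One common way to follow this advice is to keep DllMain trivial and move the real setup to the first call of an exported function, where the loader lock is no longer held. The following is only a minimal sketch: the names EnsureInitialized, DoWork and g_initCount are hypothetical and do not come from the question, and a function-local static is used as the one-time mechanism purely for illustration.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

static int g_initCount = 0;  // hypothetical counter, only here to show that setup runs once

// Runs on the first call to an export, outside DllMain, so work such as
// LoadLibrary / GetProcAddress would be acceptable inside the lambda.
static bool EnsureInitialized() {
    // C++11 guarantees that this initializer runs exactly once, even with several threads.
    static const bool done = [] {
        ++g_initCount;  // place the deferred setup here instead of in DllMain
        return true;
    }();
    return done;
}

// Hypothetical exported function: every export initializes lazily before doing its work.
extern "C" int DoWork(int x) {
    EnsureInitialized();
    return x + 1;
}

#ifdef _WIN32
// DllMain stays trivial: it neither loads nor calls into any other DLL.
BOOL WINAPI DllMain(HINSTANCE, DWORD, LPVOID) {
    return TRUE;
}
#endif
```

The cost is one cheap check per exported call; in exchange, nothing that might load another module runs under the loader lock.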

Related

Safe place to put unsafe DLL cleanup code on Windows?

We hit a case where it would be the best solution for us to put a FreeLibrary call into DllMain / DLL_PROCESS_DETACH.
Of course, you must not do that:
It is not safe to call FreeLibrary from DllMain.
The use case is that we have a situation like this:
(unknown client dll or exe) links dynamically or statically to ->
-> DLL_1, loads dynamically -> DLL_x
DLL_1 should load DLL_x transparently with respect to its client code, and it should load DLL_x dynamically. The loading can be done lazily, so the LoadLibrary call need not reside in the DLL_PROCESS_ATTACH part of DLL_1.
But once the client is done with DLL_1, when/before DLL_1 is unloaded from the process, it should also unload (== FreeLibrary) DLL_x.
Is there any way to do this without an explicit DLL_1/Uninitialize function that must be called by the client?
I'll note:
DllMain, and thus also any C++ global static destructor, cannot be used.
Is there any other callback mechanism in either kernel32/ntdll or maybe in the shared MS CRT to make this happen?
Are there other patterns to make this use case work?
The correct approach is an explicit Uninitialize function in DLL_1.
However, if you can't do that, you can work around the problem by launching a helper thread to do the unload for you. If you want to play it safe, launch the thread at the same time you load DLL_x and have it wait on an event object. (For the record, though, it is generally considered safe to launch a thread from DllMain so long as you respect the fact that it won't start up until DllMain has exited.)
Obviously, the helper thread's code can't be in DLL_1. If you can modify DLL_x you can put it there. If not, you'll need a helper DLL. In either case, the DLL containing the helper thread's code can safely self-unload using the FreeLibraryAndExitThread function.
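The helper-thread approach could look roughly like the sketch below. It is Windows-only and rests on several assumptions: the code is built into a separate helper DLL (not DLL_1), and the names g_stopEvent, g_dllX, UnloadHelperThread and StartUnloadHelper are hypothetical. Only documented kernel32 functions are used.

```cpp
#ifdef _WIN32
#include <windows.h>

// Assumption: this code lives in a helper DLL (or DLL_x), never in DLL_1.
static HANDLE  g_stopEvent = nullptr;  // signalled when DLL_1 is done with DLL_x
static HMODULE g_dllX      = nullptr;  // handle returned earlier by LoadLibrary

static DWORD WINAPI UnloadHelperThread(LPVOID param) {
    HMODULE self = static_cast<HMODULE>(param);  // reference that keeps this code mapped
    WaitForSingleObject(g_stopEvent, INFINITE);
    if (g_dllX != nullptr) {
        FreeLibrary(g_dllX);            // fine here: this thread is not inside DllMain
    }
    FreeLibraryAndExitThread(self, 0);  // drops our own reference and never returns
}

// Call right after DLL_x has been loaded lazily (hypothetical entry point).
bool StartUnloadHelper(HMODULE dllX) {
    HMODULE self = nullptr;
    // Take an extra reference on the module that contains the thread function.
    if (!GetModuleHandleExW(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS,
                            reinterpret_cast<LPCWSTR>(&UnloadHelperThread), &self)) {
        return false;
    }
    g_dllX = dllX;
    g_stopEvent = CreateEventW(nullptr, TRUE, FALSE, nullptr);  // manual-reset, not signalled
    HANDLE thread = g_stopEvent
        ? CreateThread(nullptr, 0, UnloadHelperThread, self, 0, nullptr)
        : nullptr;
    if (thread == nullptr) {
        if (g_stopEvent != nullptr) { CloseHandle(g_stopEvent); g_stopEvent = nullptr; }
        FreeLibrary(self);  // undo the extra reference taken above
        return false;
    }
    CloseHandle(thread);    // the thread keeps running; only the handle is released
    return true;
}
#endif
```

With this in place, the only thing left for DLL_1 to do when it is finished is to signal the event with SetEvent, which is a kernel32 call. How DLL_1 reaches the event (for example through a small export of the helper DLL) is left open here.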

Are Win32 applications automatically linked against ntdll.dll?

I've just found out by accident that calling GetModuleHandle("ntdll.dll") works without a previous call to LoadLibrary("ntdll.dll").
This means ntdll.dll is already loaded in my process.
Is it safe to assume that ntdll.dll will always be loaded in Win32 applications, so that a call to LoadLibrary is not necessary?
From MSDN on LoadLibrary() (emphasis mine):
The system maintains a per-process reference count on all loaded
modules. Calling LoadLibrary increments the reference count. Calling
the FreeLibrary or FreeLibraryAndExitThread function decrements the
reference count. The system unloads a module when its reference count
reaches zero or when the process terminates (regardless of the
reference count).
In other words, continue to call LoadLibrary() and ensure you get your handle to ntdll.dll to be safe -- but the system will almost certainly be bumping a reference count as it should already be loaded.
As for "is it really always loaded?", see Windows Internals on the Image Loader (the short answer is yes, ntdll.dll is part of the loader itself and is always present).
The relevant paragraph is:
The image loader lives in the user-mode system DLL Ntdll.dll and not in the kernel library. Therefore, it behaves just like standard code that is part of a DLL, and it is subject to the same restrictions in terms of memory access and security rights. What makes this code special is the guaranty that it will always be present in the running process (Ntdll.dll is always loaded) and that it is the first piece of code to run in user mode as part of a new application. (When the system builds the initial context, the program counter, or instruction pointer is set to an initialization function inside Ntdll.dll.)
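Given that guarantee, looking up an ntdll.dll export without LoadLibrary can be sketched as follows. The helper name FindNtdllExport is hypothetical, and the code is Windows-only. Note that GetModuleHandleW does not add a reference, which is harmless here only because ntdll.dll is never unloaded.

```cpp
#ifdef _WIN32
#include <windows.h>

// Returns the address of an ntdll.dll export, or nullptr if it cannot be found.
// GetModuleHandleW loads nothing and leaves the module reference count unchanged.
FARPROC FindNtdllExport(const char* name) {
    HMODULE ntdll = GetModuleHandleW(L"ntdll.dll");
    if (ntdll == nullptr) {
        return nullptr;  // not expected in a Win32 process, but checked anyway
    }
    return GetProcAddress(ntdll, name);
}
#endif
```

A caller would cast the returned pointer to the correct function type before use, for example for RtlGetVersion.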

DllMain DLL_PROCESS_DETACH and GetMessage Function reentrancy

I have written a global hook that uses SetWindowsHookEx to install WH_GETMESSAGE, WH_CALLWNDPROC and WH_CALLWNDPROCRET hooks.
The hook dll creates a new thread in the hooked process, which, among other things, checks the audio state of the process and calls IAudioSessionManager2::GetSessionEnumerator().
Now the interesting part: I called UnhookWindowsHookEx() from the hook host while my DLL's worker thread was inside the call to IAudioSessionManager2::GetSessionEnumerator(). DllMain with DLL_PROCESS_DETACH was then invoked on that same thread, on top of the GetSessionEnumerator() call in its call stack.
I assume the reason is that GetSessionEnumerator() calls GetMessage() somewhere and the latter is reentrant. Unfortunately I do not remember precisely, but I think I saw that in the call stack.
But there are multiple important things I wonder about and things that remain unclear. So here come my related questions:
1) Can DllMain with DLL_PROCESS_DETACH be invoked at any time, even in a thread that is running functions from the DLL that is being unloaded?
2) What happens to the functions up the stack when DllMain DLL_PROCESS_DETACH exits? Will the code in the functions up the call stack execute eventually?
3) What if these functions do not exit? When will the DLL be unloaded?
4) Can DllMain DLL_PROCESS_DETACH be similarly invoked during the callbacks for the WH_GETMESSAGE, WH_CALLWNDPROC and WH_CALLWNDPROCRET hooks? I know and have experimentally confirmed that sometimes, although not too often, these functions are reentrant, so a call to them can be injected while a previous call is still running on the same stack, but I do not know whether calls to DllMain can be injected in a similar manner.
5) When precisely can DllMain be invoked in a thread: are there specific Windows API functions that need to be called and which in turn may lead to a DllMain DLL_PROCESS_DETACH call, or could it happen at any instruction?
6) If the DllMain DLL_PROCESS_DETACH call can be "injected" at any time AND the functions up the call stack do not get executed anymore, how do I know where precisely the function up the call stack was interrupted, so that inside DllMain I could release handles or resources allocated by that function?
7) Is there any way to temporarily prevent or postpone calls to DllMain DLL_PROCESS_DETACH? Locks obviously do not help if the call/interruption occurs on the same stack.
Unfortunately I probably cannot answer these questions experimentally, since my hooking (and unhooking) code had been running on multiple computers for months before this situation with DllMain occurred during unhooking. Though for some reason it then occurred with four different programs at once...
So thanks to Hans I now know regarding point (4) that DllMain DLL_PROCESS_DETACH will not be reentrant with hook procedures.
Regarding the thread that the hook created, my log files currently indicate that if DllMain DLL_PROCESS_DETACH is injected into the stack of this thread then that thread indeed will be terminated after DllMain exits and will NOT run to completion. That should answer points (2) and (3). The question itself implicitly answers the point (1).
To solve the problem for the thread that the hook created, I assume that DllMain DLL_PROCESS_DETACH can be prevented by calling
    GetModuleHandleEx(
        GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS,
        (LPCTSTR)DllMain,
        &hModule_thread
    );
and, before the thread terminates, calling
    FreeLibraryAndExitThread(hModule_thread, 0);
So using GetModuleHandleEx should answer the point (7), which in turn makes all other points irrelevant. Of course I have to use some IPC to trigger the thread termination in the hooked processes.
The remaining interesting question is point (5), but it is just out of curiosity:
"When precisely the DllMain DLL_PROCESS_DETACH can be invoked in a thread - are there some specific Windows API functions that need to be called and which in turn may lead to DllMain DLL_PROCESS_DETACH call, or could it happen at any instruction?"

Using LoadLibrary: effect on performance?

When you dynamically load a library at runtime using LoadLibrary on Windows (C++), is it loaded into memory the same way as the rest of your program, or might there be some overhead associated with calling functions from that library?
In other words, if you plan on making frequent calls to a function, will it be just as fast from the library as it would be if you linked it into your program at compile time, or do you lose some performance?
(This is not related to libraries that link to or against a program during compile-time via .lib/.a files.)
Once the DLL is loaded and the function pointer variable has been initialized by GetProcAddress, there is no additional overhead in the function call.
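The point can be illustrated with a small sketch in which the lookup happens once and every later call is a plain indirect call through the cached pointer. Everything named here is an assumption for illustration: the wrapper class CachedAdd, the library "mylib.dll", and its export add with the signature int(int, int) and the default calling convention.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

using AddFn = int (*)(int, int);  // assumed signature of the exported function

// Hypothetical wrapper: holds the resolved pointer; a call is one indirect call.
class CachedAdd {
public:
    explicit CachedAdd(AddFn fn) : fn_(fn) {}
    bool ok() const { return fn_ != nullptr; }
    int operator()(int a, int b) const { return fn_(a, b); }  // no lookup on this path
private:
    AddFn fn_;
};

#ifdef _WIN32
// Assumed names: "mylib.dll" exporting extern "C" int add(int, int).
inline CachedAdd LoadAdd() {
    HMODULE lib = LoadLibraryW(L"mylib.dll");                // one-time cost
    FARPROC p = lib ? GetProcAddress(lib, "add") : nullptr;  // one-time cost
    return CachedAdd(reinterpret_cast<AddFn>(p));
}
#endif
```

A caller would create the wrapper once (for example `CachedAdd add = LoadAdd();`), check `add.ok()`, and then call `add(2, 3)` in the hot path. Calls made through an import library also go through a pointer in the import address table, so the per-call cost is comparable.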

A GetModuleHandle implementation

I need to do it this way because I am in DllMain(), and therefore the loader lock is held. I've read that GetModuleHandle() also uses the loader lock [page #6], which would result in a deadlock.
How could GetModuleHandle() be implemented? Some code would be a plus.
Update: Since I am using SetWindowsHookEx on WinXP only, I am going to take the advice in the comments, go the easy way, and call GetModuleHandle() the first time the callback gets called.
You can call GetModuleHandle from DllMain. It doesn't load any libraries and doesn't increment the module reference count. LoadLibrary is a different story: never call it from DllMain.

Resources