Unloading DLL from all processes after unhooking global CBT hook - winapi

How do you properly unload a DLL from all processes when the system-wide hook that loaded them gets unloaded?
From MSDN:
You can release a global hook
procedure by using
UnhookWindowsHookEx, but this function
does not free the DLL containing the
hook procedure. This is because global
hook procedures are called in the
process context of every application
in the desktop, causing an implicit
call to the LoadLibrary function for
all of those processes. Because a call
to the FreeLibrary function cannot be
made for another process, there is
then no way to free the DLL. The
system eventually frees the DLL after
all processes explicitly linked to the
DLL have either terminated or called
FreeLibrary and all processes that
called the hook procedure have resumed
processing outside the DLL.
So what I am looking for, is a method to detect when the hook is unhooked, and then call FreeLibrary from all the processes that were hooked. Are there any other ways to cause instant unloading of a DLL when the hook is unloaded?

Hook dll are unloaded in their message loop. Forcing them to pass in the message loop help to unload them.
Add this after your UnhookWindowsHookEx to force all message loops to wake up :
DWORD dwResult;
SendMessageTimeout(HWND_BROADCAST, WM_NULL, 0, 0, SMTO_ABORTIFHUNG|SMTO_NOTIMEOUTIFNOTHUNG, 1000, &dwResult);
However I still have the issue from time to time. I don't know where it's coming from. I suppose a locked process could prevent the dll to unload, but I have no proof of that.

In general you should use global windows hooking if FreeLibrary is not a required be called. If you do want to do this you can use DLL injection (see for example http://www.codeproject.com/KB/threads/winspy.aspx and http://www.codeproject.com/KB/system/hooksys.aspx) with respect of CreateRemoteThread and LoadLibrary technique. In the case you can do what you want in the remote process. You can combine both techniques.
If you want call FreeLibrary only to do an update of the DLL you can do this in another way. Every DLL which is loaded could be renamed (in cmd.exe for example) to a temporary name and you can call MoveFileEx with MOVEFILE_DELAY_UNTIL_REBOOT flag. Then you can already copy and use new version of the DLL. The old one DLL will be deleted at the next reboot of computer.

Related

DLL injected with SetWindowsHookEx is being unloaded unexpectedly

My program calls SetWindowsHookEx to inject my DLL into a single target process. My DLL contains both a WH_GETMESSAGE and a WH_CALLWNDPROC hook:
hGetMessageProcHook = SetWindowsHookEx(WH_GETMESSAGE,
MyGetMsgProc,
hMyDLLModule,
dwTargetThreadId);
hCallWndProcHook = SetWindowsHookEx(WH_CALLWNDPROC,
MyCallWndProc,
hMyDLLModule,
dwTargetThreadId);
I never call UnhookWindowsHookEx. Windows unloads my DLL automatically, after my program terminates (cleanly or not), when the target app next pumps a Windows message. Both hooks are working well.
But on one machine running Windows Server 2016, Process Explorer shows that my DLL sometimes is unloaded without warning, of course my hook doesn't function any more. I have never seen this before. Is this some protection in Windows Server?

How Does GetStringType Cause a Deadlock in Dll Main?

I'm looking at an application which hangs on Server 2016 but runs find on Server 2008 R2. I have traced it to a hang caused by deadlock when a particular DLL loads. Analysing the DLL (I don't have the source code) I can see that it violates the guideline here
Call GetStringTypeA, GetStringTypeEx, or GetStringTypeW (either
directly or indirectly). This can cause a deadlock or a crash
Specifically it calls GetStringTypeW from DllMain.
I'm trying to understand how this function can cause deadlocks in DllMain.
Your DllMain function runs inside the loader lock, one of the few
times the OS lets you run code while one of its internal locks is
held. This means that you must be extra careful not to violate a lock
hierarchy in your DllMain; otherwise, you are asking for a deadlock. (Refer to "Another reason not to do anything scary in your DllMain: Inadvertent deadlock")
DllMain is called while the loader-lock is held, so if GetStringType acquires the loader lock, deadlock appears.
Significant restrictions are imposed on the functions that can be
called within DllMain. As such, DllMain is designed to perform minimal
initialization tasks, by using a small subset of the Microsoft®
Windows® API. You cannot call any function in DllMain that directly or
indirectly tries to acquire the loader lock. Otherwise, you will
introduce the possibility that your application deadlocks or crashes.
We can't see source code of GetStringType function. It is suggested to follow Dynamic-Link Library Best Practices.
Calling a function from DllMain that will (directly or indirectly) attempt to load another module is going to deadlock.
Module loading is serialized. The system uses a lock to ensure that at any given time no more than a single DLL entry point is entered. If you are calling LoadLibrary (directly or indirectly) from your DLL entry point, you are holding the loader lock. LoadLibrary will attempt to acquire the loader lock itself prior to calling into the module's entry point. So while your code is waiting for LoadLibrary to return, LoadLibrary is waiting for the lock you are holding. That's a deadlock.
This is likely the reason why calling GetStringType from DllMain deadlocks. Without access to source code I cannot verify this, though. You may be able to find evidence by enabling loader snaps.

Windows Server 2012 won't unload DLL in a process

I've created a program that creates a thread, loads a DLL there (LoadLibrary), calls one function in this DLL and unloads it using FreeLibrary.
LoadLibrary and FreeLibrary do not return any error codes, but ProcessHacket2 shows that this DLL still exists in memory. I've verified that my code does not reference this library.
After some time my code repeats these actions and another instance of this DLL appears in the process memory.
If my code will repeats these actions again and again a new instance will remain in RAM. Allocated virtual memory will growth.
I cannot repeat this problem on my development machine and other test computers. Only on one specific server.
It looks like (libmysql.dll):

Why does process loads modules(dlls) in different phases?

The phenomenon goes like this.
I'm trying to implement dll injection. My application create process in 'suspended' state, (Using CreateProcess with CREATE_SUSPENDED), wait for an important dll to be loaded(kernel32.dll of course), and then executes injection. The suspended process is resumed using ResumeThread after the injection.
I'm using suspended process creation in the hope that dll is injected prior to any other execution except dll load. But while executing this, I found something interesting.
I've simply tested this injection using notepad.exe and it worked just fine. But when I've duplicated the executable as notepad2.exe, my injection waits for infinite time. Kernel32.dll never loads up until the main thread of the process is resumed.
It seems there's some sort of 'authenticated' executable that modules are loaded beforehand (or maybe some security solutions installed in my computer that already uses dll injection might cause this different execution).
Does Anyone knows if there's any sort of thing that modules are loaded in different manner? (Frankly, I still don't understand the life cycle of windows process...)
I think I've found an answer. The answer is that though the process is in SUSPENDED state, and therefore main thread is stopped, when certain thread in that process tries to access to a module, then the required modules are loaded (Which means, that windows application seems to 'Lazy Load' require modules)
In the comment, I said that the process protected by certain security solution installed on my PC uses Detour to inject DLL, and therefore modules in certain Process loaded while the process is in SUSPENDED state.
But that was not a specific case of Detour, but general issues on every processes.
I was stupid that the injection technique I used was, to list all the modules loaded in the target process, find kernel32.dll, calculate the offset of LoadLibraryW, and call that function using remote thread. In this technique, the injector enumerates all the modules BEFORE LoadLibrary functions are ever called, and therefore LazyLoading never happens, and therefore I can never see kernel32.dll loaded on the process.
When I used simple technique that use GetProcAddress from current process, get the address of LoadLibrary function, and remotely calling the process, kernel32.dll was successfully loaded and the injection succeeded.
In case of several applications protected through the 'security solutions using Detour', even though I suspended the process, the injection tries do happen, and that's why kernel32.dll was loaded on notepad.exe (which is in the list of the security solution), but not loaded on notepad2.exe (as image name differed, it is not in the list).
I was just stupid, trying to use more complicated methods for dll injection, and that's what caused all these problems.
And thanks a lot for the comments.

System calls on Windows

I just want to ask, I know that standart system calls in Linux are done by int instruction pointing into Interrupt Vector Table. I assume this is similiar on Windows. But, how do you call some higher-level specific system routines? Such as how do you tell Windows to create a window? I know this is handled by the code in the dll, but what actually happend at assembler-instruction level? Does the routine in dll calls software interrupt by int instruction, or is there any different approach to handle this? Thanks.
Making a Win32 call to create a window is not really related to an interrupt. The client application is already linked with the .dll that provides the call which exposes the address for the linker to use. Since you are asking about the difference in calling mechanism, I'm limiting the discussion here to those Win32 calls that are available to any application as opposed to kernel-level calls or device drivers. At an assembly language level, it would be the same as any other function call since most Win32 calls are user-level calls which internally make the needed kernel calls. The linker provides the address of the Win32 function as the target for some sort of branching instruction, the specifics would depend on the compiler.
[Edit]
It looks like you are right about the interrupts and the int. vector table. CodeGuru has a good article with the OS details on how NT kernel calls work. Link:
http://www.codeguru.com/cpp/w-p/system/devicedriverdevelopment/article.php/c8035
Windows doesn't let you call system calls directly because system call numbers can change between builds, if a new system call gets added the others can shift forward or if a system call gets removed others can shift backwards. So to maintain backwards compatability you call the win32 or native functions in DLLs.
Now there are 2 groups of system calls, those serviced by the kernel (ntoskrnl) and by the win32 kernel layer (win32k).
Kernel system call stubs are easily accessible from ntdll.dll, while win32k ones are not exported, they're private within user32.dll. Those stubs contain the system call number and the actual system call instruction which does the job.
So if you wanted to create a window, you'd call CreateWindow in user32.dll, then the extended version CreateWindowEx gets called for backwards compatability, that calls the private system call stub NtUserCreateWindowEx, which invokes code within the win32k window manager.

Resources