I'm trying to understand in a bit more detail how the OS loader lock is used in relation to the loading and unloading of DLLs in Windows.
I understand that every loaded DLL gets notified when a new thread is created/destroyed and/or a new DLL is loaded/unloaded.
So does that mean that the DllMain function is run inside a lock and no other thread can access it while it is running, and if you were to create another thread in that function, you could hang the process or even the OS?
Is my understanding correct?
Is there an article somewhere that explains this?
A deadlock can happen when two threads try to acquire two locks in different sequence.
Thread A gets lock A and then tries to get lock B
Meanwhile thread B gets lock B and then tries to get lock A
A thread that's running DllMain has already acquired an implicit O/S lock: therefore they (Microsoft) reckon that it may be unsafe for that thread to try to acquire any other, second lock (e.g. because a different thread might already own that lock and be currently blocked on the implicit O/S lock).
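The two-thread scenario above cannot happen if every thread takes the locks in the same order. Here is a minimal C sketch using POSIX threads rather than the Windows loader; `lock_pair`, `unlock_pair`, `run_ordered_workers` and the other names are made up for illustration. It orders the two mutexes by address, so both workers end up acquiring them in the same sequence even though they name them in opposite order:

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Lock two distinct mutexes in a fixed global order (lowest address first). */
static void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

static void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    /* Unlock order does not matter for deadlock avoidance. */
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

/* "Thread A": asks for A then B. */
static void *worker_ab(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; ++i) {
        lock_pair(&lock_a, &lock_b);
        ++shared_counter;
        unlock_pair(&lock_a, &lock_b);
    }
    return NULL;
}

/* "Thread B": asks for B then A, which would deadlock with naive locking. */
static void *worker_ba(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; ++i) {
        lock_pair(&lock_b, &lock_a);
        ++shared_counter;
        unlock_pair(&lock_b, &lock_a);
    }
    return NULL;
}

/* Runs both workers to completion and returns the final counter value. */
long run_ordered_workers(void)
{
    pthread_t t1, t2;
    shared_counter = 0;
    pthread_create(&t1, NULL, worker_ab, NULL);
    pthread_create(&t2, NULL, worker_ba, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return shared_counter;
}
```

Code in DllMain cannot apply this trick to the loader lock, because that lock is already held by the time DllMain runs. That is why the advice is to avoid taking any second lock there at all.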
That is correct.
Any such execution is illegal because
it can lead to deadlocks and to use of
DLLs before they have been initialized
by the operating system's loader.
More information can be found here: LoaderLock MDA (MSDN Website)
I'm looking at an application which hangs on Server 2016 but runs fine on Server 2008 R2. I have traced it to a deadlock that occurs when a particular DLL loads. Analysing the DLL (I don't have the source code), I can see that it violates the following guideline on what not to do in DllMain:
Call GetStringTypeA, GetStringTypeEx, or GetStringTypeW (either
directly or indirectly). This can cause a deadlock or a crash
Specifically it calls GetStringTypeW from DllMain.
I'm trying to understand how this function can cause deadlocks in DllMain.
Your DllMain function runs inside the loader lock, one of the few
times the OS lets you run code while one of its internal locks is
held. This means that you must be extra careful not to violate a lock
hierarchy in your DllMain; otherwise, you are asking for a deadlock. (Refer to "Another reason not to do anything scary in your DllMain: Inadvertent deadlock")
DllMain is called while the loader lock is held, so if GetStringType tries to acquire the loader lock, a deadlock can result.
Significant restrictions are imposed on the functions that can be
called within DllMain. As such, DllMain is designed to perform minimal
initialization tasks, by using a small subset of the Microsoft®
Windows® API. You cannot call any function in DllMain that directly or
indirectly tries to acquire the loader lock. Otherwise, you will
introduce the possibility that your application deadlocks or crashes.
We can't see the source code of the GetStringType function, so the safest course is to follow the Dynamic-Link Library Best Practices.
Calling a function from DllMain that will (directly or indirectly) attempt to load another module is going to deadlock.
Module loading is serialized. The system uses a lock to ensure that at any given time no more than a single DLL entry point is entered. If you are calling LoadLibrary (directly or indirectly) from your DLL entry point, you are holding the loader lock. LoadLibrary will attempt to acquire the loader lock itself prior to calling into the module's entry point. So while your code is waiting for LoadLibrary to return, LoadLibrary is waiting for the lock you are holding. That's a deadlock.
This is likely the reason why calling GetStringType from DllMain deadlocks. Without access to source code I cannot verify this, though. You may be able to find evidence by enabling loader snaps.
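A common way to stay out of this situation is to keep the entry point trivial and defer the real work to the first call of an exported function, which runs outside the loader lock. The sketch below shows the pattern in portable C, with `pthread_once` standing in for a Windows one-time-initialization primitive. The names `lib_is_alpha`, `build_tables` and `lib_init_runs` are hypothetical, and the table is only a placeholder for whatever the DLL in question computes with GetStringTypeW:

```c
#include <pthread.h>
#include <stddef.h>
#include <wctype.h>

static pthread_once_t tables_once = PTHREAD_ONCE_INIT;
static unsigned char is_alpha_table[128];
static int init_runs; /* counts how often the initializer actually ran */

/* The expensive setup that would otherwise be tempting to do in DllMain. */
static void build_tables(void)
{
    for (wint_t c = 0; c < 128; ++c)
        is_alpha_table[c] = iswalpha(c) ? 1 : 0;
    ++init_runs;
}

/* Exported function: initializes lazily, exactly once, on first use. */
int lib_is_alpha(wchar_t c)
{
    pthread_once(&tables_once, build_tables);
    return (unsigned long)c < 128 && is_alpha_table[(unsigned long)c];
}

/* Diagnostic helper for the example only. */
int lib_init_runs(void)
{
    return init_runs;
}
```

With this shape, the entry point itself has nothing to do beyond, at most, recording the module handle.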
We develop a user-space process running on Linux 3.4.11 in an embedded MIPS system. The process creates multiple (>10) threads using pthreads. The process has a SIGSEGV signal handler which, among other things, generates a log message which goes to our log file. As part of this flow, it acquires a semaphore (bad, I know...).
During our testing the process appeared to hang. We're currently unable to build gdb for the target platform, so I wrote a CLI tool that uses ptrace to extract the register values and USER data using PTRACE_PEEKUSR.
What surprised me is that all of our threads were inside our crash handler, trying to acquire the semaphore. This (obviously?) indicates a deadlock on the semaphore, which means that a thread died while holding it. When I dug up the stack, it seemed that almost all of the threads (except one) were in a blocking call (recv, poll, sleep) when the signal handler started running. Manual stack reconstruction on MIPS is a pain so we have not fully done it yet. One thread appeared to be in the middle of a malloc call, which to me indicates that it crashed due to heap corruption.
A couple of things are still unclear:
1) Assuming one thread crashed in malloc, why would all other threads be running the SIGSEGV handler? As I understand it, a SIGSEGV signal is delivered to the faulting thread, no? Does it mean that each and every one of our threads crashed?
2) Looking at the sigcontext struct for MIPS, it seems it does not contain the memory address which was accessed (badaddr). Is there another place that has it? I couldn't find it anywhere, but it seemed odd to me that it would not be available.
And of course, if anyone can suggest ways to continue the analysis, it would be appreciated!
Yes, it is likely that all of your threads crashed in turn, assuming that you have captured the thread state correctly.
siginfo_t has a si_addr member, which should give you the address of the fault. Whether your kernel fills that in is a different matter.
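To check whether a given kernel fills in si_addr, one option is a small probe along these lines. This is a sketch assuming a POSIX system with mmap; `fault_address_is_reported` and `on_segv` are names invented for the example. It deliberately touches a PROT_NONE page, records si_addr in an SA_SIGINFO handler, and jumps back out with siglongjmp. The handler takes no locks and calls nothing that could block:

```c
#define _DEFAULT_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf recover_point;
static void *volatile fault_addr;

/* SA_SIGINFO handler: record the faulting address and escape. */
static void on_segv(int sig, siginfo_t *info, void *ucontext)
{
    (void)sig;
    (void)ucontext;
    fault_addr = info->si_addr;
    siglongjmp(recover_point, 1);
}

/* Returns 1 if si_addr matched the address we touched, 0 if it did not,
 * and -1 if the probe could not be set up. */
int fault_address_is_reported(void)
{
    long page = sysconf(_SC_PAGESIZE);
    char *guard = mmap(NULL, (size_t)page, PROT_NONE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (guard == MAP_FAILED)
        return -1;

    struct sigaction sa, old;
    sa.sa_sigaction = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_SIGINFO;
    if (sigaction(SIGSEGV, &sa, &old) != 0) {
        munmap(guard, (size_t)page);
        return -1;
    }

    fault_addr = NULL;
    if (sigsetjmp(recover_point, 1) == 0) {
        /* Reading an inaccessible page raises SIGSEGV. */
        volatile char c = *(volatile char *)(guard + 16);
        (void)c;
    }

    sigaction(SIGSEGV, &old, NULL); /* restore the previous handler */
    int ok = (fault_addr == (void *)(guard + 16));
    munmap(guard, (size_t)page);
    return ok;
}
```

If this returns 0 on the target, the address has to come from somewhere else, for example the architecture-specific part of the ucontext argument.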
In-process crash handlers will always be unreliable. You should use an out-of-process handler, and set kernel.core_pattern to invoke it. In current kernels, it is not necessary to write the core file to disk; you can either read the core file from standard input, or just map the process memory of the zombie process (which is still available when the kernel invokes the crash handler).
I plan to use the WinAPI CreateMutex function to allow only one running instance of my application. But I wonder what happens if the app crashes. Is the created mutex automatically released by the OS if the main process dies? I can't find an answer to this in the MS Knowledge Base.
TIA!
A mutex is a kernel object whose lifetime is controlled by its references. When a process terminates, however it terminates, all the references to kernel objects held by that process are removed. If this leaves a kernel object having no remaining references to it, that kernel object is destroyed.
We have a Win32 application in which one thread can stop another to inspect its state [PC, etc.], by doing SuspendThread/GetThreadContext/ResumeThread:
// SuspendThread returns a DWORD (unsigned), so "< 0" can never be true;
// failure is reported as (DWORD)-1.
if (SuspendThread((HANDLE)hComputeThread[threadId]) == (DWORD)-1) // freeze thread
    ThreadOperationFault("SuspendThread","InterruptGranule");
CONTEXT Context, *pContext;
Context.ContextFlags = (CONTEXT_INTEGER | CONTEXT_CONTROL);
if (!GetThreadContext((HANDLE)hComputeThread[threadId],&Context))
    ThreadOperationFault("GetThreadContext","InterruptGranule");
Extremely rarely, on a multicore system, GetThreadContext returns error code 5 (Windows system error code "Access Denied").
The SuspendThread documentation seems to clearly indicate that the targeted thread is suspended, if no error is returned. We are checking the return status of SuspendThread and ResumeThread; they aren't complaining, ever.
How can it be the case that I can suspend a thread, but can't access its context?
This blog
http://www.dcl.hpi.uni-potsdam.de/research/WRK/2009/01/what-does-suspendthread-really-do/
suggests that SuspendThread, when it returns, may have started the suspension of the other thread, but that thread hasn't yet suspended. In this case, I can kind of see how GetThreadContext would be problematic, but this seems like a stupid way to define SuspendThread. (How would the caller of SuspendThread know when the target thread was actually suspended?)
EDIT: I lied. I said this was for Windows.
Well, the strange truth is that I don't see this behavior under Windows XP 64 (at least not in the last week and I don't really know what happened before that)... but we have been testing this Windows application under Wine on Ubuntu 10.x. The Wine source for the guts of GetThreadContext contains an Access Denied return response on line 819 when an attempt to grab the thread state fails for some reason. I'm guessing, but it appears that Wine GetThreadStatus believes that a thread just might not be accessible repeatedly. Why that would be true after a SuspendThread is beyond me, but there's the code. Thoughts?
EDIT2: I lied again. I said we only saw the behavior on Wine. Nope... we have now found a Vista Ultimate system that seems to produce the same error (again, rarely). So, it appears that Wine and Windows agree on an obscure case. It also appears that the mere enabling of the Sysinternals Process Monitor program aggravates the situation and causes the problem to appear on Windows XP 64; I suspect a Heisenbug. (The Process Monitor doesn't even exist on the Wine-tasting (:-) machine or the XP 64 system I use for development).
What on earth is it?
EDIT3: Sept 15 2010. I've added careful checking to the error return status, without otherwise disturbing the code, for SuspendThread, ResumeThread, and GetThreadContext. I haven't seen any hint of this behavior on Windows systems since I did that. Haven't gotten back to the Wine experiment.
Nov 2010: Strange. It seems that if I compile this under Visual Studio 2005, it fails on Windows Vista and 7, but not earlier OSes. If I compile under Visual Studio 2010, it doesn't fail anywhere. One might point a finger at Visual Studio 2005, but I'm suspicious of a location-sensitive problem, and the different optimizers in VS 2005 and VS 2010 place the code in slightly different places.
Nov 2012: Saga continues. We see this failure on a number of XP and Windows 7 machines, at a pretty low rate (once every several thousand runs). Our Suspend activities are applied to threads that mostly execute pure computational code but that sometimes make calls into Windows. I don't recall seeing this issue when the PC of the thread was in our computational code. Of course, I can't see the PC of the thread when it hangs because GetThreadContext won't give it to me, so I can't directly confirm that the problem only happens when executing system calls. But all our system calls are channeled through one point, and so far the evidence is that point was executed when we get the hang. So the indirect evidence suggests GetThreadContext on a thread only fails if a system call is being executed by that thread. I haven't had the energy to build a critical experiment to test this hypothesis yet.
Let me quote from Richter and Nasarre's "Windows via C/C++" (5th edition), which may shed some light:
DWORD SuspendThread(HANDLE hThread);
Any thread can call this function to
suspend another thread (as long as you
have the thread's handle). It goes
without saying (but I'll say it
anyway) that a thread can suspend
itself but cannot resume itself. Like
ResumeThread, SuspendThread returns
the thread's previous suspend count. A
thread can be suspended as many as
MAXIMUM_SUSPEND_COUNT times (defined
as 127 in WinNT.h). Note that
SuspendThread is asynchronous with
respect to kernel-mode execution, but
user-mode execution does not occur
until the thread is resumed.
In real life, an application must be
careful when it calls SuspendThread
because you have no idea what the
thread might be doing when you attempt
to suspend it. If the thread is
attempting to allocate memory from a
heap, for example, the thread will
have a lock on the heap. As other
threads attempt to access the heap,
their execution will be halted until
the first thread is resumed.
SuspendThread is safe only if you know
exactly what the target thread is (or
might be doing) and you take extreme
measures to avoid problems or
deadlocks caused by suspending the
thread.
...
Windows actually lets you look inside
a thread's kernel object and grab its
current set of CPU registers. To do
this, you simply call
GetThreadContext:
BOOL GetThreadContext( HANDLE
hThread, PCONTEXT pContext);
To call this function, just allocate a
CONTEXT structure, initialize some
flags (the structure's ContextFlags
member) indicating which registers you
want to get back, and pass the address
of the structure to GetThreadContext.
The function then fills in the members
you've requested.
You should call SuspendThread before
calling GetThreadContext; otherwise,
the thread might be scheduled and the
thread's context might be different
from what you get back. A thread
actually has two contexts: user mode
and kernel mode. GetThreadContext can
return only the user-mode context of a
thread. If you call SuspendThread to
stop a thread but that thread is
currently executing in kernel mode,
its user-mode context is stable even
though SuspendThread hasn't actually
suspended the thread yet. But the
thread cannot execute any more
user-mode code until it is resumed, so
you can safely consider the thread
suspended and GetThreadContext will
work.
My guess is that GetThreadContext may fail if you just called SuspendThread, while the thread is in kernel mode, and the kernel is locking the thread context block at this time.
Perhaps on multicore systems one core is still handling the kernel-mode execution of the thread whose user-mode execution was just suspended, and keeps the thread's CONTEXT structure locked at exactly the moment another core calls GetThreadContext.
Since this behaviour is not documented, I suggest contacting Microsoft.
There are some particular problems surrounding suspending a thread that owns a CriticalSection. I can't find a good reference to it now, but there is one mention of it on Raymond Chen's blog and another mention on Chris Brumme's blog. Basically, if you are unlucky enough to call SuspendThread while the thread is accessing an OS lock (e.g., heap lock, DllMain lock, etc.), then really strange things can happen. I would assume that this is the case that you are running into extremely rarely.
Does retrying the call to GetThreadContext work after a processor yield like Sleep(0)?
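The retry idea can be sketched generically. The following is portable C, with `sched_yield` playing the role of Sleep(0); `retry_with_yield`, `attempt_fn` and `flaky_attempt` are hypothetical names. In the real program the attempt callback would wrap the GetThreadContext call and return whether it succeeded:

```c
#include <sched.h>
#include <stdbool.h>

/* An operation that may fail transiently; returns true on success. */
typedef bool (*attempt_fn)(void *state);

/* Try the operation up to max_attempts times, yielding the processor
 * between failures. Returns the 1-based attempt number that succeeded,
 * or 0 if every attempt failed. */
int retry_with_yield(attempt_fn attempt, void *state, int max_attempts)
{
    for (int i = 1; i <= max_attempts; ++i) {
        if (attempt(state))
            return i;
        sched_yield(); /* give the other thread a chance to make progress */
    }
    return 0;
}

/* Stand-in for a flaky call: fails *state times, then succeeds. */
bool flaky_attempt(void *state)
{
    int *failures_left = state;
    if (*failures_left > 0) {
        --*failures_left;
        return false;
    }
    return true;
}
```

Capping the number of attempts matters here: if the failure turns out not to be transient, an unbounded loop would turn a rare error into a hang.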
This is an old issue, but it's good to see that you kept it updated with status changes over more than two years of running into it.
The cause of your problem is that there is a bug in the translation layer of the x64 version of WoW64, as per:
http://social.msdn.microsoft.com/Forums/en/windowscompatibility/thread/1558e9ca-8180-4633-a349-534e8d51cf3a
There is a rather critical bug in GetThreadContext under WoW64 which makes it return stale contents, which makes it unusable in many situations. The contents are stored in user mode. This is why you think the value is non-null, but in the stale contents it is still null.
This is why it fails on newer OSes but not older ones. Try running it on a 32-bit Windows 7 OS.
As for why this bug seems to happen less often with solutions built with Visual Studio 2010 / 2012, it is likely that the compiler is doing something which mitigates most of the problem. To find out, you should inspect the code generated by both 2005 and 2010 and see what the differences are. For example, does the problem happen if the project is built without optimizations?
Finally, some further reading:
http://www.nynaeve.net/?p=129
Maybe a thread safety issue. Are you sure that the hComputeThread struct isn't changing out from under you? Maybe the thread was exiting when you called suspend? This may cause suspend to succeed, but by the time you call get context it is gone and the handle is invalid.
Calling SuspendThread on a thread that owns a synchronization object, such as a mutex or critical section, can lead to a deadlock if the calling thread tries to obtain a synchronization object owned by a suspended thread.
- MSDN
I'm developing an add-on for AutoCAD 2009. The project output is a class library. When I attempt to debug and load the class library, I get this "LoaderLock was detected" message. I've been writing these add-ons for a while and this is the first message of this type I've seen.
Where do I start trying to figure this out?
What is LoaderLock and why is it bothering me now?
LoaderLock was detected
Message: Attempting managed execution inside OS Loader lock. Do not attempt to run managed code inside a DllMain or image initialization function since doing so can cause the application to hang.
I went to Debug -> Exceptions -> "Managed Debugging Assistants", found "LoaderLock" and unchecked the "Thrown" checkbox.
I can debug again but what did I do and why did I have to do it? Will this cause other problems for me?
The loader lock is a process-wide lock used by the system to synchronize the loading of DLLs into a process address space. Functions that load DLLs, free DLLs, query DLL info, etc., all acquire the loader lock. What typically impacts developers the most is that the loader lock is held while DllMain is running as well. This means that an OS lock that you aren't normally aware of can be held while running your code.
The loader lock can be viewed as being at a very low level in the lock hierarchy. Code running under the loader lock during DllMain can be the cause of deadlocks. For instance, the CLR has its own set of internal locks which it could hold while loading DLLs. If you call managed code from within your DllMain, you could cause the CLR on your thread to acquire one of these locks while holding the loader lock. If the CLR on another thread had acquired that lock (causing the original thread in DllMain to block) and then tried to load a DLL, which would acquire the loader lock, your process would deadlock.
It sounds like the CLR is trying to preemptively detect running managed code under the loader lock. When you see the stack from this failure in the debugger, identify what is causing your managed code to be running from within a DllMain and remove it.
In my experience with AutoCAD, the LoaderLock warning can be safely ignored. It's not a sign of your code doing something wrong, but rather the warning is raised because of the way AutoCAD is loading and initializing your application.
This is a bug in Visual Studio 2005. Read this article for more details: http://support.microsoft.com/kb/913996