The API function InitializeCriticalSectionAndSpinCount allows you to set a spin count, so that when EnterCriticalSection is called, the thread spins trying to acquire the resource some number of times. Only if all of those attempts fail does the thread transition to kernel mode and enter a wait state.
If the 'normal' InitializeCriticalSection() is called instead, is there a 'default' spin count set? (Or is it 0, no spin?)
Quoting from this article:
SpinCount ... This field defaults to zero, but can be set to a different value with the InitializeCriticalSectionAndSpinCount API
So the default is no spin.
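For illustration, here is a minimal sketch of the two initialization paths (the spin count of 4000 is just an arbitrary example value, not a recommendation):

#include <windows.h>

CRITICAL_SECTION g_cs;

void InitLock(void)
{
    // InitializeCriticalSection(&g_cs) would leave the spin count at its
    // default of 0, so a contended EnterCriticalSection goes straight to a
    // kernel wait.
    //
    // This asks for some spinning before falling back to the kernel wait;
    // 4000 is only an example value.
    InitializeCriticalSectionAndSpinCount(&g_cs, 4000);
}

void UseLock(void)
{
    EnterCriticalSection(&g_cs);
    // ... touch the shared state ...
    LeaveCriticalSection(&g_cs);
}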
There are several questions posted (like Send flag to cancel CopyFileEx after it has started) that reference the ability to use the pbCancel parameter of the Win32 CopyFileEx() function to cancel an in-progress copy. What is not clear to me is why it is safe to set that boolean from another thread without any sort of synchronization mechanism (mutex, etc.). This functionality is really only useful if another thread sets that boolean to true, as CopyFileEx() blocks until the file copy is finished.
Is this relying on a peculiarity of the Windows platform?
If you are simply setting a flag-like variable (whose value only matters as 0 or non-zero) with no connection to any other data, you do not need any synchronization. What would it even accomplish?
One thread sets the variable to true; the other thread reads either 0 or non-zero from it. Even if you performed the write and the read inside a critical section, what would that change? Nothing. The reading thread still loads either 0 or non-zero from the variable.
Synchronization is needed in other cases: usually when we write to other memory locations before storing true in the flag, and we want all of those other modifications to already be visible once another thread reads true from the flag. But in the case of this cancel flag there is no other associated data.
Likewise, if we wrote complex data to the variable (not just 0 or non-zero) and the write were not atomic, synchronization would be needed to prevent reading a partial state. But here any "partial state" is impossible by design.
For those who do not understand:
It does not matter whether the write to or the read from pbCancel is atomic; in any case, some value will eventually be read from pbCancel. If that value is interpreted as TRUE during the copy operation, the operation is canceled; otherwise the copy continues to completion. The documentation says:
If this flag is set to TRUE during the copy operation, the operation is canceled. Otherwise, the copy operation will continue to completion.
Even if some "transient state" were read, so what? The value is only ever used in an if/else test, and as a result the copy operation is either canceled or continues.
Nowhere is it specified (and it would contradict common sense) that a strict check for exactly 0 (FALSE) or 1 (TRUE) is performed, with an exception or undefined behavior for any other value.
On the contrary, it is clearly stated that otherwise (i.e. if the flag is not set to TRUE) the copy operation will continue to completion. There is not a word about exceptions, UB, etc.
If you look at the declaration of CopyFileExW, the SAL annotation shows in more detail how the pbCancel value is interpreted:
_When_(pbCancel != NULL, _Pre_satisfies_(*pbCancel == FALSE))
_Inout_opt_ LPBOOL pbCancel,
So the check is effectively (and this is the most natural reading):
if (pbCancel == NULL || *pbCancel == FALSE)
{
    // continue copy
}
else
{
    // cancel copy
}
here no any "transient state". here or 0 or not 0. even if you write 1 to pbCancel but another thread read from it say 0x5FD38FCA - this will be interpreted as TRUE and copy operation will be canceled.
anyway - if you write true (in strict sense 1) to variable - another thread sooner or later read 1 from this variable. do this in critical section - nothing change - again only sooner or later another thread read this value. not faster.
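To make that concrete, here is a minimal sketch of the pattern the question describes (the file paths and the two-second delay are made up for illustration): one thread flips a shared BOOL, and the thread blocked inside CopyFileExW observes it through pbCancel.

#include <windows.h>
#include <stdio.h>

static volatile BOOL g_cancel = FALSE;   // must be FALSE before the copy starts

static DWORD WINAPI CancelAfterDelay(LPVOID param)
{
    (void)param;
    Sleep(2000);       // e.g. the user pressed "Cancel" two seconds in
    g_cancel = TRUE;   // plain store; CopyFileEx only tests zero vs non-zero
    return 0;
}

int main(void)
{
    HANDLE h = CreateThread(NULL, 0, CancelAfterDelay, NULL, 0, NULL);

    if (!CopyFileExW(L"C:\\src\\big.bin", L"C:\\dst\\big.bin",
                     NULL, NULL, (LPBOOL)&g_cancel, 0))
    {
        if (GetLastError() == ERROR_REQUEST_ABORTED)
            printf("copy canceled\n");
    }

    WaitForSingleObject(h, INFINITE);
    CloseHandle(h);
    return 0;
}

When the flag becomes non-zero, CopyFileExW returns FALSE and GetLastError() reports ERROR_REQUEST_ABORTED.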
I am curious about the event parameter that gets passed to IOLockWakeup and IOLockSleep{Deadline}.
I understand that the event is an address that gets passed to both functions. I am assuming this address is used to essentially notify the thread.
So my question is: assuming i is an int and we are using its address, how do these functions know when to sleep and wake up?
Is the assumption that:
when IOLockWakeup is called, the contents of event are 0 (which it then changes to a non-zero value), and
when IOLockSleepDeadline is called, the contents of event were 0 at the time it was called, and it will stop sleeping once the contents become non-zero?
And when we keep calling these functions (in a workloop context), are the contents of the event parameter automatically set back to zero when IOLockSleep* is called (and when it wakes up), since IOLockWakeup presumably changes them to a non-zero value?
You'll notice that the event parameter is of type void*, not int*:
int IOLockSleep( IOLock * lock, void *event, UInt32 interType);
The event parameter is an arbitrary pointer; it is never dereferenced, and it doesn't matter what's stored there. It is used purely for identification purposes, so for example don't pass NULL, because that's not a unique value.
IOLockSleep always suspends the running thread, and IOLockWakeup wakes up any thread that’s sleeping on that address. If no such thread is waiting, nothing at all happens. This is why you’ll usually want to pair the sleep/wakeup with some condition that’s protected by the lock, and send the wakeup while holding the lock - the thing to avoid is going to sleep after the wakeup was sent, in which case your sleeping thread might sleep forever.
So, you'll have some condition for deciding whether or not to sleep, and you'll update that condition before calling wakeup, while holding the lock:
IOLock* myLock;
bool shouldSleep;
…
// sleep code:
IOLockLock(myLock);
while (shouldSleep)
{
    IOLockSleep(myLock, &shouldSleep, THREAD_UNINT);
}
IOLockUnlock(myLock);
…
// wakeup code:
IOLockLock(myLock);
shouldSleep = false;
IOLockWakeup(myLock, &shouldSleep, true /* or false, if we want to wake up multiple sleeping threads */);
IOLockUnlock(myLock);
Here, I've used the address of shouldSleep for the event parameter, but it could be anything; it's just convenient to use that address because I know no other kext will be sleeping or waking on that pointer, as no other kext has access to that variable.
The MSDN page for SleepConditionVariableCS states that
Condition variables are subject to spurious wakeups (those not associated with an explicit wake) and stolen wakeups (another thread manages to run before the woken thread). Therefore, you should recheck a predicate (typically in a while loop) after a sleep operation returns.
As a result, the condition-variable wait has to be enclosed in a while loop, i.e.
while (check_predicate())
{
    SleepConditionVariableCS(...);
}
If I were to use events instead of condition variables, can I do away with the while loop when waiting (WaitForSingleObject) for the event to be signaled?
For WaitForSingleObject(), there are no spurious wakeups, so you can eliminate the loop.
If you use WaitForMultipleObjectsEx() with bAlertable=TRUE, MsgWaitForMultipleObjects() with a wake mask, or MsgWaitForMultipleObjectsEx() with bAlertable=TRUE or a wake mask, then the wait can end on other conditions before the event is actually signaled.
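As an illustration of the first point (the event and worker names here are made up), a plain WaitForSingleObject on an auto-reset event needs no recheck loop, since the wait only returns WAIT_OBJECT_0 once the event has actually been signaled:

#include <windows.h>

static HANDLE g_workReady;   // auto-reset event

static DWORD WINAPI Worker(LPVOID param)
{
    (void)param;
    // No spurious wakeups: this returns WAIT_OBJECT_0 only after SetEvent
    // has been called (the auto-reset event is reset as the wait completes).
    if (WaitForSingleObject(g_workReady, INFINITE) == WAIT_OBJECT_0)
    {
        // ... handle the work ...
    }
    return 0;
}

static void Setup(void)
{
    // Auto-reset, initially non-signaled.
    g_workReady = CreateEventW(NULL, FALSE, FALSE, NULL);
}

If you switch to one of the alertable or message-aware waits mentioned above, you do have to check the return value and loop, since those waits can return before the event is signaled.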
The kqueue mechanism has an event flag, EV_RECEIPT, which according to the linked man page:
... is useful for making bulk changes to a kqueue without draining any pending events. When passed as input, it forces EV_ERROR to always be returned. When a filter is successfully added the data field will be zero.
My understanding, however, is that it is trivial to make bulk changes to a kqueue without draining any pending events, simply by passing 0 for the nevents parameter to kevent and thus drawing no events from the queue. With that in mind, why is EV_RECEIPT necessary?
Some sample code in Apple documentation for OS X actually uses EV_RECEIPT:
kq = kqueue();
EV_SET(&changes, gTargetPID, EVFILT_PROC, EV_ADD | EV_RECEIPT, NOTE_EXIT, 0, NULL);
(void) kevent(kq, &changes, 1, &changes, 1, NULL);
But, seeing as the changes array is never examined after the kevent call, it's totally unclear to me why EV_RECEIPT was used in this case.
Is EV_RECEIPT actually necessary? In what situation would it really be useful?
If you are making bulk changes and one of them causes an error, then the event will be placed in the eventlist with EV_ERROR set in flags and the system error in data.
Therefore it is possible to identify which changelist element caused the error.
If you set nevents to zero, you get the error code but no indication of which event caused the error.
So EV_RECEIPT allows you to set nevents to a non-zero value without draining any pending events.
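As a sketch of what that buys you (the descriptor array and counts here are made up), each change comes back as its own receipt with EV_ERROR set, so a failing element can be identified by its index, and no pending events are consumed:

#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <stdio.h>
#include <string.h>

#define NCHANGES 3

static int register_fds(int kq, const int fds[NCHANGES])
{
    struct kevent changes[NCHANGES];

    for (int i = 0; i < NCHANGES; i++)
        EV_SET(&changes[i], fds[i], EVFILT_READ, EV_ADD | EV_RECEIPT, 0, 0, NULL);

    // With EV_RECEIPT, kevent() fills the event list with one receipt per
    // change instead of returning pending events from the queue.
    int n = kevent(kq, changes, NCHANGES, changes, NCHANGES, NULL);
    if (n < 0)
        return -1;

    for (int i = 0; i < n; i++)
    {
        // data is 0 on success, or the errno for the change that failed.
        if ((changes[i].flags & EV_ERROR) && changes[i].data != 0)
            fprintf(stderr, "change %d (fd %d) failed: %s\n",
                    i, (int)changes[i].ident, strerror((int)changes[i].data));
    }
    return 0;
}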
I'm looking for a wait/signal synchronization primitive in IO/Kit working like :
Thread1 : wait(myEvent) // Blocking thread1
Thread2 : wait(myEvent) // Blocking thread2
Thread3 : signal(myEvent) // Release one of thread1 or thread2
This can't be done using an IOLock since the lock/unlock operations would be made from different threads, which is a bad idea according to some doc I've read.
Thread1, 2, 3 can be user threads or kernel threads.
I'd also like to have an optional time out with the wait operation.
Thanks for your help!
You want the function IOLockSleepDeadline(), declared in <IOKit/IOLocks.h>.
You set up a single IOLock somewhere with IOLockAlloc() before you begin. Then, threads 1 and 2 lock the IOLock with IOLockLock() and immediately relinquish the lock and go to sleep by calling IOLockSleepDeadline(). When thread 3 is ready, it calls IOLockWakeup() (with oneThread = true if you only want to wake a single thread). This causes thread 1 or 2 to wake up and immediately acquire the lock (so they need to Unlock or sleep again).
IOLockSleep() works similarly, but without the timeout.
You can do something similar using IOCommandGate's commandSleep() method, which may be more appropriate if your driver is already centred around an IOWorkLoop.
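A rough sketch of that arrangement (the lock, flag, and function names are illustrative, not from any real kext):

#include <IOKit/IOLocks.h>

static IOLock *gEventLock;   // allocate once with IOLockAlloc()
static bool    gSignalled;   // the condition the waiters sleep on

// Threads 1 and 2: block until signalled, or until 'deadline' passes.
static bool waitForSignal(AbsoluteTime deadline)
{
    bool gotSignal = true;
    IOLockLock(gEventLock);
    while (!gSignalled && gotSignal)
    {
        // Atomically drops the lock, sleeps on &gSignalled, and re-acquires
        // the lock before returning.
        if (IOLockSleepDeadline(gEventLock, &gSignalled, deadline, THREAD_UNINT) == THREAD_TIMED_OUT)
            gotSignal = false;
    }
    if (gotSignal)
        gSignalled = false;   // consume the signal
    IOLockUnlock(gEventLock);
    return gotSignal;
}

// Thread 3: release exactly one waiter.
static void signalOne(void)
{
    IOLockLock(gEventLock);
    gSignalled = true;
    IOLockWakeup(gEventLock, &gSignalled, true /* oneThread */);
    IOLockUnlock(gEventLock);
}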
The documentation of the IOLockLock function (in IOLocks.h) states the following:
Lock the mutex. If the lock is held by any thread, block waiting for its unlock. This function may block and so should not be called from interrupt level or while a spin lock is held. Locking the mutex recursively from one thread will result in deadlock.
So it will certainly block the other threads (T1 and T2) until the thread holding the lock releases it (T3). One thing it doesn't seem to support is a timeout.