Busy Waiting with semaphores in win32 - winapi

I want to implement a barrier. To do so, I busy-wait until the value of the semaphore is 0.
I've managed to do this on POSIX using the value of the semaphore. Is there any way to do this in Windows?

Don't use a simple integer with a busy loop. Use an actual semaphore object via CreateSemaphore(), and use WaitForSingleObject() (or related function) to tell you when the semaphore is in a state to pass through your barrier.

Related

Prevent WaitForSingleObject from hanging on a named semaphore when other process is terminated?

When using a globally named mutex to synchronize across two processes, and one of the two processes is killed (say in Task Manager, or due to a fault), the other process returns from WaitForSingleObject() with the appropriate error code and can continue.
When using a globally named semaphore, the waiting process is not released if the other process is killed or terminated. WaitForSingleObject() will wait until it times out (and the timeout may be INFINITE or hours).
How do I stop WaitForSingleObject() from waiting when the other process is killed or terminated?
In this case, there is a single count on the semaphore used to control read/write requests of a shared buffer. The Requester signals the Provider to provide certain data, the Provider updates the buffer and signals back to the Requester that it can now read the buffer.
I suggest that you switch to using WaitForMultipleObjects and wait for the handle of the process that might get terminated (or the thread, if you want to do this within a single process) in addition to your semaphore handle. That way you can continue to use INFINITE timeouts. You just have to check the return value to see which object was signalled.
Also, I would consider a process terminating while holding a semaphore somewhat of a bug, particularly a semaphore used for actual inter-process communication.
Adding to the accepted answer.
I added logic so that, if waitms is going to be longer than some value maxwaitms, the requester and provider exchange the provider's process ID (GetCurrentProcessId()) before the long operation. The requester opens a handle to the provider process (OpenProcess()) and waits on both the semaphore and the process handle to know when writing is done (or the process has terminated).

How to force GetQueuedCompletionStatus() to return immediately?

I have a hand-made thread pool. The threads read from a completion port and do some other work. One particular thread has to be ended. How do I interrupt its wait if it is blocked in GetQueuedCompletionStatus() or GetQueuedCompletionStatusEx()?
A finite timeout (100-1000 ms) combined with an exit flag is far from elegant and causes delays, so I'm keeping it as a last resort.
CancelIo(completionPortHandle) within APC in target thread causes ERROR_INVALID_HANDLE.
CancelSynchronousIo(completionPortHandle) causes ERROR_NOT_FOUND.
PostQueuedCompletionStatus() with a termination packet doesn't let me choose which thread receives it.
A crude TerminateThread() guarded by a mutex should work (I haven't tested it), but is it good practice?
I tried waiting on both a special event and the completion port. WaitForMultipleObjects() returned immediately, as if the completion port were signalled, but GetQueuedCompletionStatus() then returned nothing.
I read Overlapped I/O: How to wake a thread on a completion port event or a normal event? and googled a lot.
Probably the problem itself (ending one specific thread's work) is a sign of bad design, and all my threads should be equal and combined into a normal thread pool. In that case, the PostQueuedCompletionStatus() approach should work. (Although I doubt that this approach is elegant and concise, especially if threads use GetQueuedCompletionStatusEx() to get multiple packets at once.)
If you just want to reduce the size of the thread pool it doesn't matter which thread exits.
However, if for some reason you need to signal a particular thread that it needs to exit, rather than allowing any thread to exit, you can use this method.
If you use GetQueuedCompletionStatusEx you can do an alertable wait, by passing TRUE for fAlertable. You can then use QueueUserAPC to queue an APC to the thread you want to quit.
https://msdn.microsoft.com/en-us/library/windows/desktop/ms684954(v=vs.85).aspx
If the thread is busy then you will still have to wait for the current work item to be completed.
Certainly don't call TerminateThread.
Unfortunately, I/O completion port handles are always in a signaled state and as such cannot really be used in WaitFor* functions.
GetQueuedCompletionStatus[Ex] is the only way to block on the completion port. With an empty queue, the function will return only if the thread becomes alerted. As mentioned by @Ben, QueueUserAPC will make the thread alerted and cause GetQueuedCompletionStatusEx to return.
However, QueueUserAPC allocates memory and thus can fail in low-memory conditions or when memory quotas are in effect. The same holds for PostQueuedCompletionStatus. As such, using any of these functions on an exit path is not a good idea.
Unfortunately, the only robust way seems to be calling the undocumented NtAlertThread exported by ntdll.dll.
extern "C" NTSTATUS __stdcall NtAlertThread(HANDLE hThread);
Link with ntdll.lib. This function will put the target thread into an alerted state without queuing anything.

pthread_mutex_lock() and EnterCriticalSection

Maybe I misunderstood something, but...
When I call pthread_mutex_lock() and then call pthread_mutex_lock() again from the same thread without calling pthread_mutex_unlock(), the second call to pthread_mutex_lock() blocks.
But when I call EnterCriticalSection() and then call EnterCriticalSection() again from the same thread without calling LeaveCriticalSection(), the second call to EnterCriticalSection() does NOT block, because it is made from the same thread (which seems like very weird behaviour to me).
So my question is: is there a WinAPI function that behaves like pthread_mutex_lock() and locks independently of the thread context?
I'm aware of libpthread for Windows but I prefer to have a WinAPI function here.
You could use a Semaphore with the maximum count set to one.
See Semaphore Objects
When you successfully acquire the semaphore, its count is decremented: going to zero in our case.
No other thread can acquire it, including the current one.
pthread_mutex_lock documentation:
If the mutex type is PTHREAD_MUTEX_RECURSIVE, then the mutex maintains the concept of a lock count. When a thread successfully acquires a mutex for the first time, the lock count is set to one. Every time a thread relocks this mutex, the lock count is incremented by one. Each time the thread unlocks the mutex, the lock count is decremented by one. When the lock count reaches zero, the mutex becomes available for other threads to acquire. If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, an error will be returned.
MSDN ReleaseMutex states:
A thread can specify a mutex that it already owns in a call to one of the wait functions without blocking its execution. This prevents a thread from deadlocking itself while waiting for a mutex that it already owns. However, to release its ownership, the thread must call ReleaseMutex one time for each time that it obtained ownership (either through CreateMutex or a wait function).
The wait functions are the equivalent to pthread_mutex_lock.
See Mutex Objects (Windows) to get more details about this API.
And see this stackoverflow entry for what the CRITICAL_SECTION object contains. It shows that the CRITICAL_SECTION object holds, among other fields, a LockCount value to allow recursive use. See the EnterCriticalSection function to learn about this feature.

Mutex vs Event in Windows

Can somebody please explain what the difference is if I do
mutex = createMutex
waitForSingleObject
Release(mutex)
and
event = createEvent
waitForSingleObject
Release(event)
I'm so confused. Can I use both versions for synchronization? Thanks in advance for any help.
You use a mutex to ensure that only one thread of execution can be accessing something. For example, if you want to update a list that can potentially be used by multiple threads, you'd use a mutex:
acquire mutex
update list
release mutex
With a mutex, only one thread at a time can be executing the "update list".
You use a manual reset event if you want multiple threads to wait for something to happen before continuing. For example, you started multiple threads, but they're all paused waiting for some other event before they can continue. Once that event happens, all of the threads can start running.
The main thread would look like this:
create event, initial value false (not signaled)
start threads
do some other initialization
signal event
Each thread's code would be:
do thread initialization
wait for event to be signaled
do thread processing
Yes, both can be used for synchronization but in different ways.
A mutex is a mutual exclusion object and can be acquired by only a single thread at a time. It is used to avoid the simultaneous use of a common resource, such as a global variable, by pieces of computer code.
An event is an object that can be explicitly set to a state by use of the SetEvent function.

Forcing context switch in Windows

Is there a way to force a context switch in C++ to a specific thread, assuming I have the thread handle or thread ID?
No, you won't be able to force the operating system to run the thread you want. You can use yield to force a context switch, though.
In the Win32 API, yield is the SwitchToThread function. If no other thread is ready to run, it returns zero and the current thread keeps running.
You can only encourage the Windows thread scheduler to pick a certain thread; you can't force it. You do so first by making the thread block on a synchronization object and signaling it, and second by bumping up its priority.
Explicit context switching is supported, but you'll have to use fibers. Review SwitchToFiber(). A fiber is not a thread by a long shot; it is similar to a co-routine of old. Fibers' heyday has come and gone, and they are not competitive with threads anymore. They have very crappy cpu cache locality and cannot take advantage of multiple cores.
The only way to force a particular thread to run is by using process/thread affinity, but I can't imagine ever having a problem for which this was a reasonable solution.
The only way to force a context switch is to force a thread onto a different processor using affinity.
In other words, what you are trying to do isn't really viable.
Calling SwitchToThread() will result in a context switch if there is another thread ready to run that is eligible to run on this processor. The documentation states it as follows:
If calling the SwitchToThread function causes the operating system to switch execution to another thread, the return value is nonzero.
If there are no other threads ready to execute, the operating system does not switch execution to another thread, and the return value is zero.
You can temporarily raise the priority of the other thread while looping with Sleep(0) calls, which pass control to other threads. Suppose that the other thread has incremented a lock variable and you need to wait until it becomes zero again:
// Wait until the other thread releases the lock (lock is a volatile LONG).
SetThreadPriority(otherThread, THREAD_PRIORITY_ABOVE_NORMAL);
// InterlockedCompareExchange(&lock, 0, 0) reads the value atomically without changing it.
while (InterlockedCompareExchange(&lock, 0, 0) != 0)
    Sleep(0);
SetThreadPriority(otherThread, THREAD_PRIORITY_NORMAL);
I would check out the book Concurrent Programming for Windows. The scheduler seems to do a few things worth noting.
Sleep(0) only yields to higher priority threads (or possibly others at the same priority). This means you cannot fix priority inversion situations with just a Sleep(0), where other lower priority threads need to run. You must use SwitchToThread, Sleep a non-zero duration, or fully block on some kernel HANDLE.
You can create two synchronization objects (such as two events) and use the API SignalObjectAndWait.
If the hObjectToWaitOn is non-signaled and your other thread is waiting on the hObjectToSignal, the OS can theoretically perform quick context switch inside this API, before end of time slice.
And if you want the current thread to resume automatically, simply pass a small value (such as 50 or 100) as dwMilliseconds.

Resources