Make parent thread wait till child thread finishes in VC - windows

According to MSDN:
The WaitForSingleObject function can wait for the following objects:
Change notification
Console input
Event
Memory resource notification
Mutex
Process
Semaphore
Thread
Waitable timer
Then we can use WaitForSingleObject to make the parent-thread wait for child ones.
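#include <windows.h>

DWORD WINAPI child(LPVOID param); // thread routine, defined elsewhere

int main()
{
    // create a thread in VC with default security, stack size and flags
    HANDLE h_child_thread = CreateThread(NULL, 0, child, NULL, 0, NULL);
    WaitForSingleObject(h_child_thread, INFINITE); // so the parent thread will wait
    return 0;
}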
Question
Is there any other way to make parent-thread wait for child ones in VC or Windows?
I don't quite understand the usage of WaitForSingleObject here. Does it mean that the thread's handle becomes available when the thread terminates?

Threads can communicate in many ways, and a terminating thread could signal the waiting thread itself. It could be as simple as writing a special value to a shared memory location that the waiting thread checks. But this does not guarantee that the terminating thread has actually terminated when the waiting thread sees the value (ordering and race conditions), or that it terminates shortly afterwards (it can hang or block on something), or that the value is ever set before the thread terminates (the thread can crash). A thread handle becomes signaled when the thread terminates, so WaitForSingleObject (and its companion WaitForMultipleObjects) is a sure way to learn of a thread's termination when it occurs. Just use it.
The handle remains available after the thread terminates, in the sense that its value does not go away. It is of little practical use at that point, except that you need it to get the thread's exit code. You still have to close the handle in the end, unless you're OK with handle/memory leaks.

For the first question: yes. The method commonly used here is "Join"; its usage is language dependent.
In C++ on .NET (Managed C++) you can use the Thread class's Join method. This example is from MSDN:
Thread* newThread = new Thread(new ThreadStart(0, Test::Work));
newThread->Start();
if(newThread->Join(waitTime + waitTime))
{
    Console::WriteLine(S"New thread terminated.");
}
else
{
    Console::WriteLine(S"Join timed out.");
}
Secondly, the thread has already terminated by the time WaitForSingleObject returns, but the handle is still valid (it now refers to a terminated thread). So you still need to close it explicitly with CloseHandle.
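The same "join" idea can also be sketched in standard C++ (C++11 and later) with std::thread. This is only an illustration of the pattern, not the Win32 or .NET API discussed above; child_work, g_result and run_child_and_wait are hypothetical names introduced for the example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical shared result; stands in for whatever the child produces.
static std::atomic<int> g_result{0};

// Hypothetical work function; plays the role of the question's `child` routine.
void child_work() {
    g_result.store(42);
}

// Starts a child thread, blocks until it has finished, and returns its result.
int run_child_and_wait() {
    std::thread child(child_work);
    child.join();               // the parent blocks here until child_work returns
    assert(!child.joinable());  // after join() the object no longer owns a thread
    return g_result.load();
}
```

Unlike a raw Win32 handle, the std::thread object releases its resources in join(), so there is nothing comparable to CloseHandle left to call.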

Related

Thread wait reasons

I've been using code that I found in the following post:
How to get thread state (e.g. suspended), memory + CPU usage, start time, priority, etc
I'm examining thread state, and there's the following enum that describes the reasons for thread 'waiting' status -
enum KWAIT_REASON
{
Executive,
FreePage,
PageIn,
PoolAllocation,
DelayExecution,
Suspended,
UserRequest,
WrExecutive,
WrFreePage,
WrPageIn,
WrPoolAllocation,
WrDelayExecution,
WrSuspended,
WrUserRequest,
WrEventPair,
WrQueue,
WrLpcReceive,
WrLpcReply,
WrVirtualMemory,
WrPageOut,
WrRendezvous,
Spare2,
Spare3,
Spare4,
Spare5,
Spare6,
WrKernel,
MaximumWaitReason
};
Can anyone explain what WrQueue is, and perhaps what the difference between WrUserRequest and UserRequest is?
The information is obtained using NtQuerySystemInformation() with SystemProcessInformation.
WrQueue means the thread is waiting on a KQUEUE object (see its definition in wdm.h) in the kernel. This can be the result of a call to ZwRemoveIoCompletion or its Win32 wrapper GetQueuedCompletionStatus (an IOCP is exactly a KQUEUE object), or, starting with Vista, of a call to ZwWaitForWorkViaWorkerFactory (a worker factory uses a KQUEUE internally). It is also possible that a thread in the kernel calls KeRemoveQueue directly; system worker threads usually do this.
WrUserRequest is used by the win32k.sys subsystem, usually when a thread calls GetMessage. So if we see WrUserRequest, we can be fairly sure that the thread is waiting for window messages.
UserRequest means that the thread is waiting on one or more objects via WaitForSingleObject[Ex], WaitForMultipleObjects[Ex] or MsgWaitForMultipleObjects[Ex] (or their equivalents).
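The three cases above could be summarized in a small lookup helper. This is a sketch only: WaitReason repeats just three enumerators of the question's KWAIT_REASON, with explicit values taken from their positions in that listing, and describe_wait_reason and wait_reason_demo are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Subset of the KWAIT_REASON enum from the question. The explicit values
// match each enumerator's position in that listing (Executive == 0).
enum WaitReason { UserRequest = 6, WrUserRequest = 13, WrQueue = 15 };

// Maps a wait reason to the explanation given in the answer above.
std::string describe_wait_reason(int reason) {
    switch (reason) {
    case UserRequest:
        return "waiting on object(s), e.g. WaitForSingleObject";
    case WrUserRequest:
        return "waiting for window messages, e.g. GetMessage";
    case WrQueue:
        return "waiting on a KQUEUE, e.g. GetQueuedCompletionStatus";
    default:
        return "other";
    }
}

// Usage: a thread blocked on an IOCP would be reported with reason 15.
void wait_reason_demo() {
    assert(describe_wait_reason(WrQueue).find("KQUEUE") != std::string::npos);
}
```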

Completion object race condition

What happens if complete_all() is called on a completion object (from task B) before the task A gets to do wait_for_completion() on the completion object? Is there some API to find if object is already completed at time of wait and return right away? One way could be using a mutex which is locked before sending the message and unlocked before the wait. That lock needs to be acquired before complete_all() and released after but wondering if there is a cleaner/better way. Any ideas are welcome.
More context: task A initializes the completion object, sends a request to task B along with the address of the completion object and then waits for the completion. Task B does some processing when it gets the message and then does complete_all() on the completion object.
If complete() or complete_all() is called before wait_for_completion() for a particular completion object, then wait_for_completion() will return immediately. A completion object is roughly like a semaphore:
Internally, a completion object has a done counter that is initialized to 0.
wait_for_completion() sleeps until done > 0 (or proceeds immediately if done is already greater than 0), and atomically decrements done before returning.
complete() increments done and wakes up the first process sleeping in wait_for_completion().
complete_all() sets done to UINT_MAX / 2 (effectively infinity) and wakes up everyone sleeping in wait_for_completion().
So if I'm understanding your question correctly, there is no need for additional locking; the completion object's internal wait.lock spinlock already synchronizes access to the counter, so the case you're worried about is handled correctly.
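To make the counter semantics concrete, here is a user-space model in standard C++. It is not the kernel implementation: the Completion class and its member names are assumptions made for illustration, a std::mutex stands in for the spinlock, and the wait is timed only so that the example cannot hang.

```cpp
#include <cassert>
#include <chrono>
#include <climits>
#include <condition_variable>
#include <mutex>

// User-space model of the "done" counter described above (illustrative only).
class Completion {
public:
    // Models complete(): add one token and wake one waiter.
    void complete() {
        std::lock_guard<std::mutex> guard(lock_);
        ++done_;
        cv_.notify_one();
    }

    // Models complete_all(): make the counter effectively infinite, wake everyone.
    void complete_all() {
        std::lock_guard<std::mutex> guard(lock_);
        done_ = UINT_MAX / 2;
        cv_.notify_all();
    }

    // Models a timed wait: returns true if done > 0 was observed (and one
    // token was consumed), false if the timeout expired first.
    bool wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> guard(lock_);
        if (!cv_.wait_for(guard, timeout, [this] { return done_ > 0; }))
            return false;
        --done_;
        return true;
    }

private:
    std::mutex lock_;              // stands in for the internal spinlock
    std::condition_variable cv_;
    unsigned int done_ = 0;
};

// Usage: completing before waiting does not lose the wakeup.
void completion_demo() {
    Completion c;
    c.complete_all();                                   // "task B" finishes first
    assert(c.wait_for(std::chrono::milliseconds(10)));  // "task A" returns at once
}
```

Because the counter is checked under the same lock that the completer takes, it does not matter whether the waiter or the completer gets there first.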

IOCP loop termination may cause memory leaks? How to close IOCP loop gracefully

I have the classic IOCP callback that dequeues pending i/o requests, processes them, and deallocates them, in this way:
struct MyIoRequest { OVERLAPPED o; /* ... other params ... */ };
bool is_iocp_active = true;
DWORD WINAPI WorkerProc(LPVOID lpParam)
{
    ULONG_PTR dwKey;
    DWORD dwTrans;
    LPOVERLAPPED io_req;
    while(is_iocp_active)
    {
        GetQueuedCompletionStatus((HANDLE)lpParam, &dwTrans, &dwKey, (LPOVERLAPPED*)&io_req, WSA_INFINITE);
        // NOTE, i could use GetQueuedCompletionStatusEx() here ^ and set it in the
        // alertable state TRUE, so i can wake up the thread with an APC request from another thread!
        printf("dequeued an i/o request\n");
        // [ process i/o request ]
        ...
        // [ destroy request ]
        destroy_request(io_req);
    }
    // [ clean up some stuff ]
    return 0;
}
Then, in the code I will have somewhere:
MyIoRequest * io_req = allocate_request(...params...);
ReadFile(..., (OVERLAPPED*)io_req);
and this just works perfectly.
Now my question is: what if I want to close the IOCP queue immediately without causing leaks (e.g. the application must exit)?
I mean: if I set is_iocp_active to 'false', the next i/o request that GetQueuedCompletionStatus() dequeues will be the last one: the call returns, the thread exits, and when a thread exits all of its pending i/o requests are simply canceled by the system, according to MSDN.
But the 'MyIoRequest' structures that I allocated when calling ReadFile() won't be destroyed at all: the system has canceled the pending i/o requests, but I have to destroy the structures I created manually, or I will leak all pending i/o requests when I stop the loop!
So, how could I do this? Am I wrong to stop the IOCP loop by just setting that variable to false? Note that this would happen even if I used APC requests to stop an alertable thread.
The solution that comes to mind is to add every 'MyIoRequest' structure to a queue/list, and then remove it when GetQueuedCompletionStatusEx returns, but wouldn't that create a bottleneck, since enqueueing and dequeueing those MyIoRequest structures must be interlocked? Maybe I've misunderstood how to use the IOCP loop. Can someone shed some light on this topic?
The way I normally shut down an IOCP thread is to post my own 'shut down now please' completion. That way you can cleanly shut down and process all of the pending completions and then shut the threads down.
The way to do this is to call PostQueuedCompletionStatus() with 0 for the number of bytes, the completion key and pOverlapped. This means that the completion key is a unique value (you won't have a valid file or socket with a zero handle/completion key).
Step one is to close the sources of completions, so close or abort your socket connections, close files, etc. Once all of those are closed you can't be generating any more completion packets so you then post your special '0' completion; post one for each thread you have servicing your IOCP. Once the thread gets a '0' completion key it exits.
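The drain-then-exit pattern described above can be modeled in portable C++. This is a sketch, not IOCP code: CompletionQueue is a hypothetical stand-in for the completion port, a null pointer plays the role of the all-zero packet posted with PostQueuedCompletionStatus(), and Request stands in for MyIoRequest. For brevity the demo runs the worker loop on the calling thread; in a real program it would run on each worker thread, with one sentinel posted per thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>

// Hypothetical stand-in for the question's MyIoRequest.
struct Request { int id; };

// Portable stand-in for a completion port: a blocking FIFO of Request pointers.
class CompletionQueue {
public:
    void post(Request* r) {
        std::lock_guard<std::mutex> guard(lock_);
        items_.push_back(r);
        cv_.notify_one();
    }
    // Blocks until a packet is available, then removes and returns it.
    Request* take() {
        std::unique_lock<std::mutex> guard(lock_);
        cv_.wait(guard, [this] { return !items_.empty(); });
        Request* r = items_.front();
        items_.pop_front();
        return r;
    }
private:
    std::mutex lock_;
    std::condition_variable cv_;
    std::deque<Request*> items_;
};

// Worker loop: destroys every request it dequeues and exits on the sentinel.
// Returns the number of requests destroyed.
int worker_loop(CompletionQueue& q) {
    int destroyed = 0;
    for (;;) {
        Request* r = q.take();
        if (r == nullptr) break;  // shutdown packet: everything queued before it is done
        delete r;                 // corresponds to destroy_request() in the question
        ++destroyed;
    }
    return destroyed;
}

// Usage: three pending requests, then the shutdown packet.
int shutdown_demo() {
    CompletionQueue q;
    for (int i = 0; i < 3; ++i) q.post(new Request{i});
    q.post(nullptr);              // posted only after the sources are closed
    int n = worker_loop(q);
    assert(n == 3);               // nothing was leaked behind the sentinel
    return n;
}
```

Because the queue is FIFO and the sentinel is posted last, every request queued before it is processed and freed before the worker exits.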
If you are terminating the app and there's no overriding reason not to (e.g. closing DB connections, interprocess shared memory issues), call ExitProcess(0).
Failing that, call CancelIo() for all socket handles and process all the cancelled completions as they come in.
Try ExitProcess() first!

Synchronization primitive with IO/Kit

I'm looking for a wait/signal synchronization primitive in IO/Kit working like :
Thread1 : wait(myEvent) // Blocking thread1
Thread2 : wait(myEvent) // Blocking thread2
Thread3 : signal(myEvent) // Release one of thread1 or thread2
This can't be done using an IOLock since the lock/unlock operations would be made from different threads, which is a bad idea according to some doc I've read.
Thread1, 2, 3 can be user threads or kernel threads.
I'd also like to have an optional time out with the wait operation.
Thanks for your help !
You want the function IOLockSleepDeadline(), declared in <IOKit/IOLocks.h>.
You set up a single IOLock somewhere with IOLockAlloc() before you begin. Then, threads 1 and 2 lock the IOLock with IOLockLock() and immediately relinquish the lock and go to sleep by calling IOLockSleepDeadline(). When thread 3 is ready, it calls IOLockWakeup() (with oneThread = true if you only want to wake a single thread). This causes thread 1 or 2 to wake up and immediately acquire the lock (so they need to Unlock or sleep again).
IOLockSleep() works similarly, but without the timeout.
You can do something similar using the IOCommandGate's commandSleep() method which may be more appropriate if your driver already is centred around an IOWorkLoop.
The documentation of method IOLocks::IOLockLock states the following:
Lock the mutex. If the lock is held by any thread, block waiting for
its unlock. This function may block and so should not be called from
interrupt level or while a spin lock is held. Locking the mutex
recursively from one thread will result in deadlock.
So it will certainly block the other threads (T1 and T2) until the thread holding the lock (T3) releases it. One thing that it doesn't seem to support is the timeout.

PostThreadMessage returns ERROR_INVALID_THREAD_ID

I have a multi-threaded simulation running on Windows Vista. When I use PostThreadMessage to send messages between threads, I am getting ERROR_INVALID_THREAD_ID, even though I am quite certain (from stepping through the debugger) that the thread id is valid, and the thread has a message queue, since I call PeekMessage from every thread after I create them, as specified in MSDN. It's likely the target thread is suspended, but that should not be a problem, as far as I can tell.
Any clues on what to try? I am simulating an RTOS based application, so I'm hoping not to have to put in too much Windows specific code.
EDIT -
Another clue - if I remove all the semaphore blocking, the messages work fine (although there are some known race conditions). But message queues should not be affected by thread blocking, right?
Edit 2
The code also has the following retry mechanism, as suggested by MSDN. But it still does not work - the retry always fails. hmmm.....
BOOL bResult = FALSE;
int retry = 0;
DWORD dwError = 0;
do
{
    bResult = PostThreadMessage(pTaskHandle->dwThreadID,0,0,(LPARAM)pMessage);
    if (!bResult)
    {
        dwError = GetLastError();
        retry++; // should only happen once, if the dest thread has no msg queue
                 // the retry establishes the queue
        Sleep(500);
    }
} while (!bResult && retry<3); // MSDN says try this a few times to start msg queue
You mention that you call PeekMessage after creating the threads, but do these threads have full, active message loops that are dispatching the messages? MSDN says:
Call PostThreadMessage. If it fails, call the Sleep function and call PostThreadMessage again. Repeat until PostThreadMessage succeeds.
which sounds a little goofy if the only requirement is that the thread called PeekMessage once.
Also be aware that messages posted via PostThreadMessage don't get dispatched by DispatchMessage. This seems obvious, since there's no window for the message to go to, but I've seen people do it, especially when using MsgWaitForMultipleObjects and such to wait on a handle. In this case it seems unlikely that you'd get ERROR_INVALID_THREAD_ID, though; more likely you'd just miss the message.

Resources