Is it possible to advance a deadlocked thread? stuck at WaitForSingleObject - windows

If I have an app that is creating threads which do their work and then exit, and one or more threads get themselves into a deadlock (possibly through no fault of my own!), is there a way of programmatically forcing one of the threads to advance past the WaitForSingleObject it might be stuck at, and thus resolving the deadlock?
I don't necessarily want to terminate the thread, I just want to have it move on (and thus allow the threads to exit "gracefully").
(Yes, I know this sounds like a duplicate of my earlier question Delphi 2006 - What's the best way to gracefully kill a thread and still have the OnTerminate handler fire?, but the situation is slightly different. What I'm asking here is whether it is possible to make a WaitForSingleObject (Handle, INFINITE) behave like a WaitForSingleObject (Handle, ItCantPossiblyBeWorkingProperlyAfterThisLong).)
Please be gentle with me.
* MORE INFO *
The problem is not necessarily in code I have the source to. The actual situation is a serial COM port library (AsyncFree) that is thread based. When the port is USB-based, the library seems to have a deadlock between two of the threads it creates on closing the port. I've already discussed this at length in this forum. I did recode one of the WaitForSingleObject calls to not be infinite, and that cured that deadlock, but then another one appeared later in the thread shutdown sequence, this time in the Delphi TThread.Destroy routine.
So my rationale for this is simple: when my threads deadlock, I fix the code if I can. If I can't, or one appears that I don't know about, I just want the thread to finish. It doesn't have to be pretty. I can't afford to have my app choke.

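You can make the handle used in WaitForSingleObject invalid by closing it from some other thread. In that case WaitForSingleObject should return WAIT_FAILED and your thread will be 'moved on'. Be aware, though, that the WaitForSingleObject documentation describes the behaviour as undefined when the handle is closed while the wait is still pending, so treat this as a last resort.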

If you don't use INFINITE but set a finite timeout instead, you can check whether the call returned because the timeout expired or because the handle you were waiting for became signalled. Your code can then decide what to do next: enter another waiting cycle, or simply exit anyway, perhaps reporting somewhere 'hey, I was waiting but it took too long and I terminated anyway'.
Another option is to use WaitForMultipleObjects and add something like an event to the wait, so that the wait can be ended on demand. The advantage is that it doesn't need the timeout to expire.
Of course, once the thread is awakened it must be able to handle the "exceptional" condition of continuing even though the "main" handle it was waiting for was not signalled in time.
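The timed-wait idea above can be sketched in a few lines. This is only an illustration in Python, not Win32 code: threading.Event stands in for the handle, Event.wait(timeout) plays the role of WaitForSingleObject with a finite timeout, and the helper name wait_with_deadline and its parameters are made up for the example.

```python
import threading


def wait_with_deadline(signal, slice_seconds, max_slices):
    """Wait in bounded slices instead of forever and report how the wait ended.

    `signal` is a threading.Event standing in for the waited-on handle
    (an assumption of this sketch, not the Win32 API).
    """
    for _ in range(max_slices):
        # Event.wait returns True if the event was set, False if the
        # timeout elapsed first (compare WAIT_OBJECT_0 vs WAIT_TIMEOUT).
        if signal.wait(timeout=slice_seconds):
            return "signalled"
        # A real thread could log, check a shutdown flag, etc. here
        # before starting the next waiting cycle.
    return "gave up"
```

The caller then has to treat "gave up" as the exceptional path and clean up without assuming the awaited work ever finished.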

Related

waitpid in infinite wait state after PTRACE_ATTACH

I have integrated Google-Breakpad in my C++ application. Now, I am deliberately crashing the application, but it hangs on my Ubuntu i686 system. I had to put printf calls everywhere in Breakpad to find out where exactly it hangs. In Breakpad, a child process is created with clone, and in that process ptrace(PTRACE_ATTACH, pid, NULL, NULL) followed by waitpid(pid, NULL, __WALL) is called on every thread. With one particular thread, waitpid enters an infinite wait state and I then have to kill the application manually.
Does anyone know why exactly this is happening? Why does waitpid() go into an infinite wait state with this one particular thread? Is there any solution?
Thanks.
In general, PTRACE_ATTACH does not guarantee that a process will have anything to report. After PTRACE_ATTACH, waitpid will return only if one of two things happens:
The debuggee receives a signal.
The debuggee exits.
Some events are tantamount to one of those things. For example, if the debuggee calls execve, then after a successful execution the kernel makes it appear as if the debuggee received a SIGTRAP signal.
If none of those situations happens, there is no reason for PTRACE_ATTACH to do anything at all.
If you want waitpid to return (say, because you want to change the debuggee's state), then simply send a signal to the thread after calling PTRACE_ATTACH. This will guarantee that the thread has something to tell you.

How to force GetQueuedCompletionStatus() to return immediately?

I have a hand-made thread pool. Threads read from a completion port and do some other stuff. One particular thread has to be ended. How do I interrupt its wait if it is blocked in GetQueuedCompletionStatus() or GetQueuedCompletionStatusEx()?
A finite timeout (100-1000 ms) combined with an exit flag is far from elegant and causes delays, so I'm keeping it as a last resort.
CancelIo(completionPortHandle) within APC in target thread causes ERROR_INVALID_HANDLE.
CancelSynchronousIo(completionPortHandle) causes ERROR_NOT_FOUND.
PostQueuedCompletionStatus() with termination packet doesn't allow to choose thread.
Rough TerminateThread() with mutex should work. (I haven't tested it.) But is it ideologically good?
I tried to wait on both a special event and the completion port. WaitForMultipleObjects() returned immediately, as if the completion port were signalled, yet GetQueuedCompletionStatus() then didn't return anything.
I read Overlapped I/O: How to wake a thread on a completion port event or a normal event? and googled a lot.
Probably the problem itself, ending one specific thread's work, is a sign of bad design, and all my threads should be equal and combined into a normal thread pool. In that case, the PostQueuedCompletionStatus() approach should work. (Although I have doubts that this approach is beautiful and laconic, especially if threads use GetQueuedCompletionStatusEx() to get multiple packets at once.)
If you just want to reduce the size of the thread pool it doesn't matter which thread exits.
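The "any thread may exit" case can be sketched as follows. This is an analogy in Python rather than IOCP code: queue.Queue stands in for the completion port, putting the hypothetical STOP marker on it plays the role of PostQueuedCompletionStatus() with a termination packet, and the names worker and shrink_pool are invented for the example.

```python
import queue
import threading

STOP = object()  # hypothetical "termination packet" for this sketch


def worker(work_queue, exited, name):
    """Pull work items until a STOP marker is dequeued, then exit."""
    while True:
        item = work_queue.get()   # blocks, like waiting on the port
        if item is STOP:
            exited.append(name)   # record which worker happened to exit
            return
        item()                    # run the work item (a callable)


def shrink_pool(work_queue, count=1):
    """Ask `count` workers to exit; there is no way to choose which ones."""
    for _ in range(count):
        work_queue.put(STOP)
```

Whichever worker dequeues the marker first is the one that exits, which is exactly why this approach cannot target a particular thread.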
However, if for some reason you need to signal to a particular thread that it needs to exit, rather than allowing any thread to exit, you can use this method.
If you use GetQueuedCompletionStatusEx you can do an alertable wait, by passing TRUE for fAlertable. You can then use QueueUserAPC to queue an APC to the thread you want to quit.
https://msdn.microsoft.com/en-us/library/windows/desktop/ms684954(v=vs.85).aspx
If the thread is busy then you will still have to wait for the current work item to be completed.
Certainly don't call TerminateThread.
Unfortunately, I/O completion port handles are always in a signaled state and as such cannot really be used in WaitFor* functions.
GetQueuedCompletionStatus[Ex] is the only way to block on the completion port. With an empty queue, the function will return only if the thread becomes alerted. As mentioned by @Ben, QueueUserAPC will make the thread alerted and cause GetQueuedCompletionStatusEx to return, provided the wait is alertable.
However, QueueUserAPC allocates memory and thus can fail in low-memory conditions or when memory quotas are in effect. The same holds for PostQueuedCompletionStatus. As such, using any of these functions on an exit path is not a good idea.
Unfortunately, the only robust way seems to be calling the undocumented NtAlertThread exported by ntdll.dll.
extern "C" NTSTATUS __stdcall NtAlertThread(HANDLE hThread);
Link with ntdll.lib. This function will put the target thread into an alerted state without queuing anything.

Detecting whether any thread is waiting for an event

Let's say I have a manual-reset event handle h (created with CreateEvent, with manual reset enabled).
There are several threads in my application, and some of them might be waiting for this event (WaitForSingleObject, WaitForMultipleObjects).
At certain times in my application, I want to assert that no thread is waiting for this handle h.
Is there a Windows API function that tells me whether any thread is waiting for event h at that moment in time?
I don't believe that the Windows API provides any public mechanism for giving out that information (whether or not threads are waiting for a synchronization object). It is something that a typical application should not need to know and would likely result in race conditions if it were provided.
For example, if the application checked to verify that no threads were waiting and then made a decision based on that, it could easily be wrong because a thread may in the very next clock cycle actually start waiting for the event, so the information would be stale and potentially wrong immediately after the check.

Unblocking accept()

I have a blocking call to accept(). From another thread I close the socket, hoping that it'll unblock the accept() call, which it does but I have a case when it doesn't: e.g. thread A enters accept(), thread B closes the socket, thread A doesn't return from accept().
Question: what could cause closing a socket to not unblock an accept()?
One hacky trick to unblock accept(2) is to actually connect(2) to the listening end from your other thread. Flip some flag indicating it's time to stop the loop, connect(2), close(2) the connecting socket. That way the accept(2)-ing thread would know to close the socket and shut itself down.
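A minimal sketch of that trick, written in Python for brevity (the socket calls map directly onto accept(2), connect(2) and close(2)). The function names accept_loop and stop_accept_loop and the stop_flag parameter are assumptions made for the example.

```python
import socket
import threading


def accept_loop(listener, stop_flag):
    """Accept connections until stop_flag is set, then close the listener."""
    while True:
        conn, _peer = listener.accept()   # blocks until someone connects
        if stop_flag.is_set():
            conn.close()                  # this was only the wake-up connection
            break
        conn.close()                      # a real server would hand conn off here
    # The accepting thread itself closes the socket, so no other thread
    # frees it while accept() may still be using it.
    listener.close()


def stop_accept_loop(address, stop_flag):
    """Set the flag first, then connect to the listener to unblock accept()."""
    stop_flag.set()
    with socket.create_connection(address, timeout=5):
        pass                              # connecting is enough; close at once
```

Setting the flag before connecting matters: the accepting thread must already see the flag when it wakes up, otherwise it would treat the wake-up connection as a real client.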
You must not ever free a resource in one thread while another thread is or might be using it. You will never get this to work reliably. For one thing, you can never be 100% sure the thread is actually blocked in accept, as opposed to about to block in it. So there will always be race conditions.
And, of course, shutdown won't work because the socket is not connected.
There are a couple of ways to handle this problem. For example, you can set a flag that the thread checks when it returns from accept and then make a connection yourself. That will cause the thread to return from accept and then it will see the flag and terminate.
You can also switch to non-blocking sockets. Have the thread call select or poll with a timeout and check if the thread should terminate when it returns from select or poll. You can also select or poll on both the socket and a pipe. Then just send a byte on the pipe to unblock the thread. pthread_kill is another possibility, as is pthread_cancel.
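The socket-plus-pipe variant might look like the sketch below. It is written in Python and assumes a POSIX system, where select() accepts pipe descriptors; the name accept_until_woken and the on_connection callback are hypothetical.

```python
import os
import select
import socket


def accept_until_woken(listener, wake_fd, on_connection):
    """Accept connections until a byte arrives on the pipe read end wake_fd."""
    listener.setblocking(False)   # accept() must never block after select()
    while True:
        # Sleep until the listening socket or the pipe becomes readable.
        readable, _, _ = select.select([listener, wake_fd], [], [])
        if wake_fd in readable:
            os.read(wake_fd, 1)   # drain the wake-up byte
            return                # the caller still owns and closes the listener
        try:
            conn, _peer = listener.accept()
        except BlockingIOError:
            continue              # the connection vanished before accept()
        on_connection(conn)
```

Another thread unblocks the loop with os.write(write_end, b"x") on the other end of the pipe; no socket is closed from under the waiting thread.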
Not knowing the details of your problem, my guess would be the best solution is to rearchitect so you don't have a thread whose sole job is to wait forever in accept. That way, you won't even have a thread you need to kill. If you don't want to keep accepting connections, just rig things so that your threads stop doing that, but let the threads keep going doing other things. (The number of running threads you have should be dependent on the number of things you can usefully do at once, not the total number of things you have to do.)
Try calling shutdown() followed by close() from thread B. You should be checking the return codes on these calls too as they may help you figure out what is going wrong when thread A fails to become unblocked.

Why use ReadDirectoryChangesW asynchronously?

I've read the documentation for ReadDirectoryChangesW() and also looked at the CDirectoryChangeWatcher project, but neither says why one would want to call it asynchronously. I understand that the current thread will not block, but, at least in the CDirectoryChangeWatcher code that uses a completion port, the thread that calls GetQueuedCompletionStatus() blocks anyway (if there are no changes).
So if I call ReadDirectoryChangesW() synchronously in a separate thread that I don't mind blocking, why would I ever want to call ReadDirectoryChangesW() asynchronously?
When you call it asynchronously, you have more control over which thread does the waiting. It also allows you to have a single thread wait for multiple things, such as a directory change, an event, and a message. Finally, even if you're doing the waiting in the same thread that set up the watch in the first place, it gives you control over how long you're willing to wait. GetQueuedCompletionStatus has a timeout parameter that ReadDirectoryChangesW doesn't offer by itself.
You would call ReadDirectoryChangesW such that it returns its results asynchronously if you ever needed the calling thread to not block. A tautology, but the truth.
Candidates for such threads: the UI thread & any thread that is solely responsible for servicing a number of resources (Sockets, any sort of IPC, independent files, etc.).
Not being familiar with the project, I'd guess the CDirectoryChangeWatcher doesn't care if its worker thread blocks. Generally, that's the nature of worker threads.
I tried using ReadDirectoryChangesW synchronously in a worker thread, and guess what: it blocked, so the thread wouldn't exit by itself when the program exited.
So if you don't want to use evil things like TerminateThread, you should use asynchronous calls.
