Cancelling an asynchronous ReadDirectoryChangesW command - winapi

An asynchronous call to ReadDirectoryChangesW() can be cancelled with the CancelIo() function. However, by the time CancelIo() is called, the notification buffer associated with ReadDirectoryChangesW() may already be partially filled. What happens to those notifications? Should they still be processed in the normal way?
More specifically, I issued an overlapped ReadDirectoryChangesW() command with a completion routine, and then cancelled it by means of CancelIo(). When my completion routine is called with an ERROR_OPERATION_ABORTED error, should I still check the notification buffer for possible notifications?
Clarification:
My File System Listener component has served my company successfully for more than ten years. Now I'm going to modify the component to implement a more sophisticated monitoring policy. Under this policy, cancelling a particular ReadDirectoryChangesW() request does not mean that monitoring stops, and I do not want to miss a single notification.

OK, I found experimentally that for ReadDirectoryChangesW() the second parameter of the completion routine (dwNumberOfBytesTransfered) contains the number of bytes written to the notification buffer. Unfortunately, I could not find clear confirmation of that in the documentation.
However, the hypothesis seems reasonable, and if it always holds, I can detect the presence of notifications by checking that parameter regardless of the completion routine's first parameter (dwErrorCode). This solves the problem.
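
As an illustration, here is a minimal sketch (not the original component's code) of a completion routine written around that assumption; the global buffer and the names are placeholders, and the idea that a cancelled request can still report a non-zero byte count is the experimental observation above, not a documented guarantee.

    #include <windows.h>

    // Notification buffer that was passed to ReadDirectoryChangesW(); in real code
    // this would normally live next to the OVERLAPPED structure rather than in a global.
    static BYTE g_buffer[64 * 1024];

    VOID CALLBACK DirChangesCompletion(DWORD dwErrorCode,
                                       DWORD dwNumberOfBytesTransfered,
                                       LPOVERLAPPED /*lpOverlapped*/)
    {
        // Process whatever made it into the buffer, even when the request was cancelled
        // with CancelIo() (dwErrorCode == ERROR_OPERATION_ABORTED). Relying on the byte
        // count in the aborted case is the experimentally observed behaviour.
        if (dwNumberOfBytesTransfered > 0)
        {
            BYTE* p = g_buffer;
            for (;;)
            {
                FILE_NOTIFY_INFORMATION* fni = reinterpret_cast<FILE_NOTIFY_INFORMATION*>(p);
                // ... handle fni->Action and fni->FileName here ...
                if (fni->NextEntryOffset == 0)
                    break;
                p += fni->NextEntryOffset;
            }
        }

        if (dwErrorCode == ERROR_OPERATION_ABORTED)
            return;  // this particular request was cancelled; do not re-issue it here

        // Otherwise re-issue ReadDirectoryChangesW() with the same buffer to keep monitoring.
    }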


Latest Windows threadpool API usage for I/O

I don't understand part of the latest Windows threadpool API. I need help with that.
From the documentation, the recipe to use it for I/O (in my case, for SOCKET) can be summarized as follows (a code sketch follows the list):
Call CreateThreadpoolIo.
Call StartThreadpoolIo. You can find this warning there:
You must call this function before initiating each asynchronous I/O operation on the file handle bound to the I/O completion object. Failure to do so will cause the thread pool to ignore an I/O operation when it completes and will cause memory corruption.
Call the operation on the file handle (e.g., WSARecvFrom). If it fails, call CancelThreadpoolIo; otherwise, process the result when it becomes available. When used asynchronously, WSARecvFrom takes a WSAOVERLAPPED (which you have to create beforehand) but nothing that links it to the preceding call to StartThreadpoolIo. CancelThreadpoolIo takes only the PTP_IO, with no additional information to identify a specific asynchronous operation.
Repeat steps 2 and 3.
Call CloseThreadpoolIo to finish. You can find this warning there:
It may be necessary to cancel threadpool I/O notifications to prevent memory leaks. For more information, see CancelThreadpoolIo.
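
The following is a minimal sketch of how those steps fit together for a single overlapped WSARecvFrom; the names (OnIoComplete, ReceiveOne) are placeholders rather than anything from the question, and error handling is reduced to the Start/Cancel pairing.

    #include <winsock2.h>
    #include <windows.h>

    #pragma comment(lib, "ws2_32.lib")

    // Step 1 registers this callback once per socket:
    //   PTP_IO io = CreateThreadpoolIo(reinterpret_cast<HANDLE>(s), OnIoComplete, nullptr, nullptr);
    VOID CALLBACK OnIoComplete(PTP_CALLBACK_INSTANCE, PVOID /*context*/,
                               PVOID /*overlapped*/, ULONG /*ioResult*/,
                               ULONG_PTR /*bytesTransferred*/, PTP_IO /*io*/)
    {
        // Step 3, completion side: process the datagram described by the overlapped
        // structure and the byte count, then typically issue another receive.
    }

    bool ReceiveOne(SOCKET s, PTP_IO io, WSAOVERLAPPED* ov, WSABUF* buf)
    {
        DWORD flags = 0;

        StartThreadpoolIo(io);                  // step 2: before *each* operation

        int rc = WSARecvFrom(s, buf, 1, nullptr, &flags,
                             nullptr, nullptr, ov, nullptr);
        if (rc == SOCKET_ERROR && WSAGetLastError() != WSA_IO_PENDING)
        {
            CancelThreadpoolIo(io);             // step 3: the operation never started, so
            return false;                       // tell the pool not to expect a completion
        }
        return true;                            // the completion will arrive in OnIoComplete
    }

    // Step 5, teardown, after all pending operations have completed or been cancelled:
    //   WaitForThreadpoolIoCallbacks(io, FALSE);
    //   CloseThreadpoolIo(io);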
I usually need it for UDP, so I strive to have several reception operations queued (asynchronous WSARecvFrom operations started) at any given time. That way I don't have to rush to start another reception operation at the beginning of the callback function, nor synchronize access to the reception buffers (I can have a pool of them, each one able to hold a datagram, and reissue the reception operation when I finish processing each message; in the interim, the other queued operations keep the receiver busy). Datagrams are independent and self-contained. I'm aware that this approach may not be valid for TCP.
StartThreadpoolIo/CancelThreadpoolIo seem to me to be the source of the problem: StartThreadpoolIo and WSARecvFrom are not directly bound (they don't share any arguments). So:
How can the framework know which operation to cancel when you call CancelThreadpoolIo? How does it cancel just the operation that failed and not any of the pending ones?
You can say, "don't call StartThreadpoolIo concurrently". I can live without several concurrent WSARecvFrom's, but I can't live without concurrent WSARecvFrom and WSASendTo. So I think being unable to have several asynchronous operations at the same time can't be the way the API was designed.
You can say, "call StartThreadpoolIo only once, that will suffice to register the callback; it is an on/off process". But the documentation says:
You must call this function before initiating each asynchronous I/O operation on the file handle...
You can say, "it cancels the operation started by the same thread that just called StartThreadpoolIo". But then the advice of calling CancelThreadpoolIo in the context of calling CloseThreadpoolIo doesn't make sense (I will call CloseThreadpoolIo from the thread that triggers stopping, which will be completely independent from the threads issuing the asynchronous operations; and a single call to CancelThreadpoolIo may not be enough to cancel several operations). Being unable to trigger cancellation from a different thread is a serious limitation, anyway. I'm aware of the existence of CreateThreadpoolCleanupGroup, but my question is more fundamental. I want to understand how this API can be fundamentally right and useful.
You can say "call CreateThreadpoolIo several times, so that you have independent PTP_IO's to work with". It doesn't work. When I call CreateThreadpoolIo a second time, nullptr is returned.
Am I wrong, or is this API awkward? Normally, other asynchronous APIs work with one of these patterns:
Create an operation and receive a handle => call methods passing the handle.
Create a reusable handle => call methods (including starting operations) passing the handle.
The latest Windows threadpool API uses neither: the handle seems to be implicit, or there are several handles for the same operation (the TP_IO, the WSAOVERLAPPED, the StartThreadpoolIo call) that aren't all explicitly linked together.
Thank you very much for your help.
How can the framework know which operation to cancel when you call CancelThreadpoolIo? How does it cancel just the operation that failed and not any of the pending ones?
CancelThreadpoolIo() doesn't cancel I/O. It is the reciprocal of StartThreadpoolIo(). StartThreadpoolIo() prepares the thread pool to accept a completion. If the thread pool doesn't expect a completion, it won't wait for it, and you may miss it. If the thread pool expects a completion that never happens, it may waste resources.
CancelThreadpoolIo() undoes whatever StartThreadpoolIo() did.

Async callback call for a sync WinHTTP request

I'm using WinHTTP in sync mode, without passing the WINHTTP_FLAG_ASYNC flag, and I assumed the callback would always be called synchronously. That is indeed what happens most of the time, but sometimes, when calling WinHttpCloseHandle, the callback isn't called with the WINHTTP_CALLBACK_STATUS_HANDLE_CLOSING notification right away. Instead, it's called afterwards from a different thread.
Is that expected behavior? Why does it become async in some cases, if the session is synchronous? I know how to fix it (waiting for the WINHTTP_CALLBACK_STATUS_HANDLE_CLOSING notification if I don't get it right away), but I don't understand why I'm seeing this behavior.
WinHTTP does not promise synchronous "same thread" callbacks in synchronous mode. On the contrary, MSDN explicitly warns:
The callback function must be threadsafe and reentrant because it can be called on another thread for a separate request, and reentered on the same thread for the current request. It must therefore be coded to handle reentrance safely while processing. When the dwInternetStatus parameter is equal to WINHTTP_CALLBACK_STATUS_HANDLE_CLOSING, the callback does not need to be able to handle reentrance for the same request, because this callback is guaranteed to be the last, and does not occur when other messages for this request are handled.
This means that the symptom you are seeing is behavior by design and is not related to async mode: some callback calls may be delivered to you from worker threads, and thread racing can make them reach your callback late. You need to take this into account and either ignore those late calls, synchronize with them, or explicitly reset the callback early enough that you do not receive late notifications.
Regarding WINHTTP_CALLBACK_STATUS_HANDLE_CLOSING specifically, MSDN explains exactly what you can rely on (see the quote above).
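
One way to cope with the late HANDLE_CLOSING call, sketched below with placeholder names (RequestContext, StatusCallback), is to have the status callback signal an event and then wait on that event after WinHttpCloseHandle before releasing anything the callback might still touch. This assumes the callback was registered with WinHttpSetStatusCallback and the context pointer was attached via WINHTTP_OPTION_CONTEXT_VALUE.

    #include <windows.h>
    #include <winhttp.h>

    #pragma comment(lib, "winhttp.lib")

    struct RequestContext
    {
        HANDLE closed;  // manual-reset event, signalled on HANDLE_CLOSING
    };

    void CALLBACK StatusCallback(HINTERNET /*hInternet*/, DWORD_PTR dwContext,
                                 DWORD dwInternetStatus,
                                 LPVOID /*lpvStatusInformation*/,
                                 DWORD /*dwStatusInformationLength*/)
    {
        RequestContext* ctx = reinterpret_cast<RequestContext*>(dwContext);
        if (ctx && dwInternetStatus == WINHTTP_CALLBACK_STATUS_HANDLE_CLOSING)
            SetEvent(ctx->closed);  // guaranteed to be the last callback for this handle
    }

    void CloseRequest(HINTERNET hRequest, RequestContext* ctx)
    {
        WinHttpCloseHandle(hRequest);

        // The HANDLE_CLOSING notification may arrive later, on a worker thread.
        // Wait for it before freeing anything the callback still references.
        WaitForSingleObject(ctx->closed, INFINITE);

        CloseHandle(ctx->closed);
        delete ctx;
    }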

Coalescing GCD file system events

I have a class that implements a file-monitoring service to detect when a file I am interested in has been changed by something other than my application. I use the standard technique of opening the file (with the O_EVTONLY flag) and binding the file descriptor to a Grand Central Dispatch source of type DISPATCH_SOURCE_TYPE_VNODE. When I get an event, I notify my main thread with NSNotificationCenter's postNotificationName:object:userInfo:, which calls an observer in my app delegate. So far so good. It works great. But, in general, if the triggering event is an attributes change (i.e. the DISPATCH_VNODE_ATTRIB flag is set on return from dispatch_source_get_data()), I usually get two closely-spaced events. The behaviour is easily exhibited if I touch(1) the object I am monitoring. I hypothesise this is due to the file's mtime and atime being set non-atomically, although I can't verify this. This can lead to spurious notifications being sent to my observer, which raises the possibility of race conditions, etc.
What is the best way of dealing with this? I thought of storing a timestamp for the last event received and only sending a notification if the current event is later than that timestamp by some amount (a few tens of milliseconds?). Does this sound like a reasonable solution?
You can't ever escape the "race condition" in this situation, because the notification of your GCD event source in your process is not synchronous with the other process's modification of the underlying file. So, no matter what, you must always be tolerant of the possibility that the change you're being notified for could already be "gone."
As for coalescing, do whatever makes sense for your app. There are two obvious strategies. You can act immediately on a received event and then drop subsequent events received within some time window on the floor, or you can delay every event for some time period, during which you drop other events for the same file on the floor. It really just depends on what's more important: acting quickly, or having a higher likelihood of a quiescent state (knowing that you can never be sure things are quiescent).
The only thing I would add is to suggest that you do all your coalescing before dispatching anything to the main thread. The main thread has things like tracking loops, etc., that will make it harder to get time-based coalescing right in certain cases.
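
As an illustration of the second strategy (delay every event and drop the others that land inside the window), here is a sketch written against the plain C GCD API; the names (Monitor, kQuietInterval) and the 50 ms window are assumptions, and error handling is omitted. Because the vnode events and the delayed FireNotification both run on the monitor's own serial queue, the notificationPending flag needs no locking, and the coalescing is finished before anything reaches the main thread.

    #include <dispatch/dispatch.h>
    #include <fcntl.h>
    #include <cstdint>

    struct Monitor
    {
        int fd = -1;
        dispatch_queue_t queue = nullptr;   // serial queue owning all Monitor state
        dispatch_source_t source = nullptr;
        bool notificationPending = false;   // true while a coalescing window is open
    };

    static const int64_t kQuietInterval = 50 * NSEC_PER_MSEC;  // coalescing window

    // Runs on monitor->queue after the quiet interval: forward one coalesced notification.
    static void FireNotification(void* context)
    {
        Monitor* monitor = static_cast<Monitor*>(context);
        monitor->notificationPending = false;
        // Post the single notification here, e.g. by dispatching to dispatch_get_main_queue().
    }

    // Event handler, runs on monitor->queue for every vnode event.
    static void HandleEvent(void* context)
    {
        Monitor* monitor = static_cast<Monitor*>(context);
        if (monitor->notificationPending)
            return;  // drop events that arrive while a window is already open
        monitor->notificationPending = true;
        dispatch_after_f(dispatch_time(DISPATCH_TIME_NOW, kQuietInterval),
                         monitor->queue, monitor, FireNotification);
    }

    Monitor* StartMonitoring(const char* path)
    {
        Monitor* monitor = new Monitor;
        monitor->fd = open(path, O_EVTONLY);
        monitor->queue = dispatch_queue_create("file.monitor", DISPATCH_QUEUE_SERIAL);
        monitor->source = dispatch_source_create(
            DISPATCH_SOURCE_TYPE_VNODE, static_cast<uintptr_t>(monitor->fd),
            DISPATCH_VNODE_WRITE | DISPATCH_VNODE_ATTRIB |
            DISPATCH_VNODE_DELETE | DISPATCH_VNODE_RENAME,
            monitor->queue);
        dispatch_set_context(monitor->source, monitor);
        dispatch_source_set_event_handler_f(monitor->source, HandleEvent);
        dispatch_resume(monitor->source);
        return monitor;
    }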

cancelPreviousPerformRequest vs cancelAllOperations

I currently use both, but I'm not sure which is better. What exactly is the difference? Fuzzy question, I know. Preparing for WWDC.
Sending a cancelAllOperations message to an operation queue cancels all the operations in that queue (that is, it tells the operations to cancel), whereas cancelPreviousPerformRequestsWithTarget: tells the target object to cancel all delayed performs it had previously been told to do.
There is no “better” here; the two methods are incomparable. One cancels NSOperations; the other cancels delayed-perform requests. Which cancellation you use depends entirely on whether you made an NSOperation and put it in an NSOperationQueue or sent a delayed-perform request.

I Need an Analogy: Triggers and Events

For another question, I'm running into a misconception that seems to arise here at SO occasionally. Some questioners seem to think that Triggers are to Databases as Events are to OOP.
Does anyone have a good analogy to explain why this is a flawed comparison, and the consequences of misapplying it?
EDIT:
Bill K. has hit it correctly, but maybe doesn't see the importance of the critical difference between the event and the callback function that strikes me. Triggers actually cause code to execute every time the event occurs; callbacks only occur when one has been registered for an event (which is not true for the vast majority of events); and even then, in most cases the callback's first action is to deregister itself (or at least the callback contains a qualification exit so that it only executes once).
If you write a trigger, it will unfailingly execute every time the event occurs, because there's no way to register or deregister the code segment.
Triggers are a way to interpose repeating logic synchronously into the thread of execution (i.e. synchronicity). Events are a means to defer logic until later (i.e. implement asynchronicity).
There are exceptions and mitigations in both cases, but the basic patterns of triggers and callbacks are mostly opposite in intention and implementation. Often the distinction doesn't seem to have fully sunk in. (IMHO, YMMV). :D
They're not the same thing, but they're not unrelated.
In both cases, the mechanism can be described approximately as follows:
Some block of code declares "interest" for changes in state.
Your application affects some change.
The system runs the block of code in response to the change.
Perhaps a database trigger is more like a callback function that has registered interest in a specific event.
Here's an analogy: the event is a rubber ball that you throw. The trigger is a dog that chases after a thrown ball.
If there's some other difference that you have in mind that makes it "dangerous" (note: OP has edited this choice of word out of the question) to compare triggers and events, you can describe what you mean.
Triggers are a way to interpose repeating logic synchronously into the thread of execution (i.e. synchronicity). Events are a means to defer logic until later (i.e. implement asynchronicity).
Okay, I see what you mean more clearly. But I think it's in some ways subject to the implementation. I wouldn't assume an event handler has to deregister itself; it depends on the system you're using. A UNIX signal handler, for example, has to prevent itself from catching a new signal while it's already handling one. But a Java servlet inside a Tomcat container should be thread-safe because it may be called concurrently by multiple threads. They're both event handlers, of different kinds.
Event handlers may be synchronous or asynchronous. Can a handler in a publish/subscribe system read messages that were posted recently, but prior to the handler registering its interest? Or only messages posted concurrently?
There's another important reason to treat triggers as different from event handlers: I frequently recommend against doing anything in a trigger that affects state outside the database.
For example, sending an email, writing to a file, posting to a web service, or forking a process is inappropriate inside a trigger, if for no other reason than that the transaction that spawned the trigger may be rolled back, and you can't roll back those external effects. You may not even be using explicit transactions: say you send an email from a BEFORE trigger, and then the operation fails because of a NOT NULL constraint or something.
Instead, all such work should be done by code in one's application, after one has confirmed that the SQL operation was successful and the transaction committed.
It's too bad that people keep trying to do inappropriate work inside a trigger. There are senior developers at MySQL who promote UDFs to read and write data in memcached. Wow -- I just noticed these have made it into the MySQL 6.0 product!! Shocking!
So here's another attempt at an analogy, comparing triggers and events to the process of a criminal trial:
A BEFORE trigger is an allegation.
An AFTER trigger is an indictment.
COMMIT is a conviction after a guilty verdict.
ROLLBACK is an acquittal after an innocent verdict.
You only want to put the perpetrator in prison after they are convicted.
Whereas an EVENT is the crime itself.
