Named pipes and OVERLAPPED on Windows - winapi

I'm about to implement my first Windows service. The service will connect to a Bluetooth dongle and relay commands and data to a single client process.
Each process (client, server) shall have at least two threads: one blocking on Read(), another crunching business logic and doing an occasional Write().
Checking the alternatives, I've decided to go with Named Pipes for IPC, but I'm having trouble understanding some of the settings. Specifically:
I wish to allow simultaneous reads and writes. Do I need to create the pipe with FILE_FLAG_OVERLAPPED, even though I do not intend to do reads and writes on the same thread?
If the answer to the above is 'yes', do I still need to pass an OVERLAPPED structure to ReadFile(), WriteFile(), use GetOverlappedResult() etc? If so, what is the rationale behind this?
What is so great about having a single thread do non-blocking reads and writes anyway? What are the use cases?
EDIT
I wish to clarify the question:
Assuming both threads (Read / Write) access the pipe simultaneously, will one block until the other is done due to some internal pipe mutex?
Will setting FILE_FLAG_OVERLAPPED change this behavior?

You don't need to use FILE_FLAG_OVERLAPPED if all you're doing is reading and writing using the same handle at the same time. Other threads reading or writing to the same end of the pipe won't cause it to block. You only need it if you want to perform asynchronous I/O, which apparently you don't.
If you do use FILE_FLAG_OVERLAPPED, you must pass a valid OVERLAPPED structure to ReadFile and WriteFile via the lpOverlapped argument. If you don't use this flag and the handle isn't seekable (e.g. a named pipe), then you must pass NULL instead.
The big advantage of using single-threaded asynchronous I/O over a multi-threaded implementation is that you don't have to worry about concurrency issues. You can't have race conditions and deadlocks if you only have one thread. (Actually in your case, since you would still have two threads, one in the server and one in the client, you can still have deadlocks and maybe race conditions if you really try, but asynchronous I/O would still make it easier to avoid them.)

To enable bi-directional access to a pipe, you have to specify the PIPE_ACCESS_DUPLEX flag for the dwOpenMode parameter. Asynchronous operation (FILE_FLAG_OVERLAPPED) is not strictly required to enable bi-directional mode.
Using asynchronous I/O, however, is recommended for bi-directional pipes. It allows you to issue both read and write operations at the same time, on the same thread (see WaitForMultipleObjects). Either operation is signaled upon completion. This prevents a lengthy write operation from blocking a potential read, for example, and you'll be able to respond to either one in a timely manner. Maybe even more importantly, since you never know when data is available, you'll typically want to always issue a read operation, without having it block your thread.
This is outlined in the documentation for CreateNamedPipe:
If [overlapped] mode is enabled, functions performing read, write, and connect operations that may take a significant time to be completed can return immediately. This mode enables the thread that started the operation to perform other operations while the time-consuming operation executes in the background. For example, in overlapped mode, a thread can handle simultaneous input and output (I/O) operations on multiple instances of a pipe or perform simultaneous read and write operations on the same pipe handle. If overlapped mode is not enabled, functions performing read, write, and connect operations on the pipe handle do not return until the operation is finished. The ReadFileEx and WriteFileEx functions can only be used with a pipe handle in overlapped mode. The ReadFile, WriteFile, ConnectNamedPipe, and TransactNamedPipe functions can execute either synchronously or as overlapped operations.

Related

Is WriteFile/ReadFile on Same Handle Thread-Safe Opened Without FILE_FLAG_OVERLAPPED but using OVERLAPPED structure?

It's clear that if you open a file using FILE_FLAG_OVERLAPPED, you need to provide the OVERLAPPED structure, and you need to wait when the call returns ERROR_IO_PENDING; if you don't provide hEvent, the wait is on the file handle. Waiting on the file handle, in this case, is not reliable because any operation that completes signals the file handle.
Now if opened without the FILE_FLAG_OVERLAPPED you can still provide the OVERLAPPED structure. Say you provide it without an hEvent or didn't provide OVERLAPPED at all, what does it do internally? If it's waiting on the file handle, it seems it would be unreliable in multi-threading applications that use the same handle in multiple threads to do file IO.
If it is unreliable with multiple threads and an hEvent would be needed for every I/O, how much overhead is involved in CreateEvent? If not, does it internally create an event, and does that have the same overhead?
I need to offer, in a support library, the ability to open a PhysicalDrive in overlapped mode, yet still support synchronous operations. A new set of functions for overlapped read/write would be created. For the existing function calls, I was going to wait on the handle, but I think that is a problem. So I could either create an event each time, or create a one-time event that is shared and use a mutex to serialize the requests, only that could kill any NCQ-type performance gain, especially if not using write cache. Understanding what Windows does internally would help a lot.
TIA!!
Yes, it's safe.
Signalling an event or the file handle is for the benefit of user-mode code waiting for operations. The driver internally is using a completely different synchronization scheme -- the IRP (I/O request packet). Multiple operations will not accidentally complete the wrong request as you seem to worry about.
(As a matter of fact, there is no synchronous I/O model behind the scenes. All I/O is done using IRPs and continuation-passing-style. Synchronous operations in user mode are emulated by performing an async kernel I/O and marking the current thread non-runnable pending that operation. Note that it is pended on the operation, not the event object or file object.)

Is there any way to use IOCP to notify when a socket is readable / writeable?

I'm looking for some way to get a signal on an I/O completion port when a socket becomes readable/writeable (i.e. the next send/recv will complete immediately). Basically I want an overlapped version of WSASelect.
(Yes, I know that for many applications, this is unnecessary, and you can just keep issuing overlapped send calls. But in other applications you want to delay generating the message to send until the last moment possible, as discussed e.g. here. In these cases it's useful to do (a) wait for socket to be writeable, (b) generate the next message, (c) send the next message.)
So far the best solution I've been able to come up with is to spawn a thread just to call select and then PostQueuedCompletionStatus, which is awful and not particularly scalable... is there any better way?
It turns out that this is possible!
Basically the trick is:
Use the WSAIoctl SIO_BASE_HANDLE to peek through any "layered service providers"
Use DeviceIoControl to submit an AFD_POLL request for the base handle, to the AFD driver (this is what select does internally)
There are many, many complications that are probably worth understanding, but at the end of the day the above should just work in practice. This is supposed to be a private API, but libuv uses it, and MS's compatibility policies mean that they will never break libuv, so you're fine. For details, read the thread starting from this message: https://github.com/python-trio/trio/issues/52#issuecomment-424591743
For detecting that a socket is readable, it turns out that there is an undocumented but well-known piece of folklore: you can issue a "zero byte read", i.e., an overlapped WSARecv with a zero-byte receive buffer, and that will not complete until there is some data to be read. This has been recommended for servers that are trying to do simultaneous reads from a large number of mostly-idle sockets, in order to avoid problems with memory usage (apparently IOCP receive buffers get pinned into RAM). An example of this technique can be seen in the libuv source code. They also have an additional refinement, which is that to use this with UDP sockets, they issue a zero-byte receive with MSG_PEEK set. (This is important because without that flag, the zero-byte receive would consume a packet, truncating it to zero bytes.) MSDN claims that you can't combine MSG_PEEK with overlapped I/O, but apparently it works for them...
Of course, that's only half of an answer, because there's still the question of detecting writability.
It's possible that a similar "zero-byte send" trick would work? (Used directly for TCP, and adding the MSG_PARTIAL flag on UDP sockets, to avoid actually sending a zero-byte packet.) Experimentally I've checked that attempting to do a zero-byte send on a non-writable non-blocking TCP socket returns WSAEWOULDBLOCK, so that's a promising sign, but I haven't tried with overlapped I/O. I'll get around to it eventually and update this answer; or alternatively if someone wants to try it first and post their own consolidated answer then I'll probably accept it :-)

IOCP: how does the kernel decide to complete WSASend synchronously or asynchronously?

We wrote software that leverages I/O Completion Ports, and uses WSASend on SOCKET objects, WriteFile on named pipes.
In both situations we are finding that those APIs return SOCKET_ERROR / WSA_IO_PENDING [1] a lot sooner than we expected (or the equivalent for a named pipe WriteFile operation).
It seems we have wrongly assumed that asynchronous completion would trigger if we fill up the send buffer (nInBufferSize in CreateNamedPipe), instead it seems to be a lot more aggressive and unrelated to the size of the send buffer. For both sockets and named pipes, a large send buffer (100k+) and small messages (a few bytes) will always be completed asynchronously on the second write if done fast enough.
Can anyone confirm this? Does anyone have information on the heuristics that the Windows implementation follows in deciding when to complete I/O operations asynchronously, versus doing a synchronous completion?
[1] "If the overlapped operation is successfully initiated and will complete later, WSASend returns SOCKET_ERROR and indicates error code WSA_IO_PENDING." - http://msdn.microsoft.com/en-us/library/windows/desktop/ms742203(v=vs.85).aspx
Why do you think you need to know or care? It's not a documented part of the API, and it's possibly affected by the drivers and any layered service providers in the stack at the time.
You have to write the correct code to handle either a successful 'synchronous' send or a pending 'asynchronous' send so what difference does it make how often you get either kind of result?
Also, unless you are using SetFileCompletionNotificationModes() to enable FILE_SKIP_COMPLETION_PORT_ON_SUCCESS the same code path is used for both sync and pending results.

Async signal or notification between processes on Windows

There are 2 processes running on Windows. They communicate with each other through a named pipe. When one of them is ready to send a message, I want to notify the other process asynchronously, like a signal on Linux, so that the other process doesn't need to check the pipe continuously. Is there a mechanism on Windows similar to signals, or some other way to solve my problem?
A direct signal mechanism which conceptually works the same way does not exist (one could probably simulate it with a thread injection hack, but don't even think about that). It is not much of a problem, since you can do otherwise.
Every waitable kernel object which can take a name such as an event or a semaphore can be accessed by different processes.
You can WaitForSingleObject on the synchronization primitive until the other process signals it. That would be a Unix-like readiness notification mechanism (not quite as elegant, but to the same effect).
However, that isn't even necessary. Named pipes (not true for anonymous pipes!) can be used with overlapped I/O, which means you can use ReadFileEx to initiate a read from the pipe, and it will linger there in the background until it can complete.
You can think of this kind of I/O as "fire and forget". Your process continues running while the read operation is blocked. When the read operation completes, it signals an event or posts a completion message to a completion port (which you can query) or posts an asynchronous procedure call ("APC", a more fancy name for "callback") to the thread that originally called it. That's as close to a "signal" as you can get under Windows.
Unluckily, APCs don't quite work as one would wish, since they only execute at well-defined points (when a thread is in an "alertable wait state", which you must enter explicitly by setting the alertable flag in a wait function or calling NtTestAlert).
The likely reason the Windows designers made it that way is that this is "safer", but it is also more annoying from a usability point of view. Alas, that is how it works.
Note that the overlapped I/O model is the exact opposite of the readiness notification system under e.g. Linux. Rather than asking the OS whether a descriptor is ready to be read, you tell the OS to read it, and you can have yourself be notified (or verify) whether this has completed.

According to MSDN ReadFile() Win32 function may incorrectly report read operation completion. When?

The MSDN states in its description of ReadFile() function:
If hFile is opened with FILE_FLAG_OVERLAPPED, the lpOverlapped parameter must point to a valid and unique OVERLAPPED structure, otherwise the function can incorrectly report that the read operation is complete.
I have some applications that are violating the above recommendation and I would like to know the severity of the problem. I mean the program uses named pipe that has been created with FILE_FLAG_OVERLAPPED, but it reads from it using the following call:
ReadFile(handle, &buf, n, &n_read, NULL);
That means it passes NULL as the lpOverlapped parameter. That call should not work correctly in some circumstances according to documentation. I have spent a lot of time trying to reproduce the problem, but I was unable to! I always got all data in right place at right time. I was testing only Named Pipes though.
Would anybody know when I can expect ReadFile() to incorrectly return and report successful completion even though the data are not yet in the buffer? What would have to happen in order to reproduce the problem? Does it happen with files, pipes, sockets, consoles, or other devices? Do I have to use a particular version of the OS? Or a particular kind of reading (like registering the handle with an I/O completion port)? Or a particular synchronization of the reading and writing processes/threads?
Or when would that fail? It works for me :/
Please help!
With regards, Martin
Internally the system only supports asynchronous I/O. For synchronous I/O the system creates a temporary OVERLAPPED structure with hEvent = NULL;, issues an asynchronous I/O request passing in this temporary, and then waits for completion using GetOverlappedResult( bWait = TRUE ).
Recall that the hEvent of the temporary OVERLAPPED structure is NULL and pay attention to the Remarks section of GetOverlappedResult:
If the hEvent member of the OVERLAPPED structure is NULL, the system uses the state of the hFile handle to signal when the operation has been completed.
A file HANDLE is a waitable object that becomes unsignaled when an I/O operation begins, and signaled when an I/O operation ends.
Now consider a scenario where an asynchronous file HANDLE has a pending I/O request at the time you issue a synchronous I/O request. The system creates an OVERLAPPED structure and waits on the hFile HANDLE for completion. In the meantime the asynchronous I/O completes, thereby signaling the HANDLE causing the synchronous I/O to return prematurely without having actually completed.
Worse though is that by the time the asynchronous I/O that was initiated in response to the synchronous I/O request completes it will update the temporary OVERLAPPED structure that no longer exists. The result is memory corruption.
The full story can be found at The Old New Thing.
Seems like you are in a situation where you are deliberately calling an API in contravention of the documented best practices. In such situations all bets are off. It may work, it may not. It may work on this OS, but not on the next iteration of the OS, or the next service pack of the same OS. What happens when you port to Win64? Will it still work then?
Does calling GetLastError() (or looking at #ERR,hr in the debugger) give any value that is useful in addition to the error code?
I recommend that you call it with a valid OVERLAPPED structure, get it working and remove all doubt (and possibility of random failure). Why have possibly buggy code (and very hard to reproduce bugs) in your software when you can fix the problem easily by using a valid OVERLAPPED structure?
Why ask the question rather than fix the code to call the API as it was intended?
I suspect it always appears to work because, even though this is an asynchronous I/O, it completes very quickly. Depending on how you're testing for success, it's possible the function is incorrectly reporting that the operation completed, but it actually completes before you test the results.
The real test would be to do a read on the pipe before there's data to be read.
But really, you should just fix the code. If your architecture cannot handle asynchronous I/O, then remove the FILE_FLAG_OVERLAPPED from the creation of the named pipe.
When they say
If hFile is opened with FILE_FLAG_OVERLAPPED, the lpOverlapped parameter must point to a valid and unique OVERLAPPED structure, otherwise the function can incorrectly report that the read operation is complete.
they mean that there's nothing in the code preventing it working, but there's also a path through their code that can produce erroneous results. Just because you can't reproduce the problem with your particular hardware does not mean there is no problem.
If you really want to reproduce this problem, leave the code as is and go on with your life. Right about the time you've forgotten all about this problem, strange behavior will surface that will not have any obvious relations to calling ReadFile. You'll spend days pulling your hair out, and the problem will appear to come and go randomly. Eventually you'll find it and kick yourself for not following the instructions. Been there, done that, no fun!
The other way to recreate the problem is to schedule an important demo for your customer. It's sure to fail then!
If you don't want to splatter your code with OVERLAPPED structures and all of the related return value checks, Waits, Events, etc, you can write a wrapper function that takes a handle from which to read, and a timeout. Simply replace your calls to ReadFile with this handy-dandy wrapper.

Resources