WinSock: How to properly time out receive using overlapped I/O - windows

Problem criteria:
my service is Windows-only, so portability is not a constraint for me
my service uses threadpools with overlapped I/O
my service needs to open a connection to a remote service, ask a question and receive a reply
the remote service may refuse to answer (root cause is not important)
The solution is trivial to describe: set a timeout on the read.
The implementation of said solution has been elusive.
I think I may have finally tracked down something viable, but I am so weary from false starts that I would like approval from someone who has done this sort of thing before I move ahead with it.
By calling GetOverlappedResultEx with a non-zero timeout:
https://learn.microsoft.com/en-us/windows/win32/api/ioapiset/nf-ioapiset-getoverlappedresultex
If dwMilliseconds is nonzero, and an I/O completion routine or APC is queued, GetLastError returns WAIT_IO_COMPLETION.
If dwMilliseconds is nonzero and the specified timeout interval elapses, GetLastError returns WAIT_TIMEOUT.
Thus, I can sit and wait until IO has been alerted or the timeout exceeded and react accordingly:
WAIT_TIMEOUT: CancelIoEx on the overlapped structure from the WSARecv, which will trigger my IO complete callback and allow me to do something meaningful (e.g. force the socket closed).
WAIT_IO_COMPLETION: Do nothing. Timeout need not be enforced.
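The decision described above can be sketched as a small, portable function. This is only an illustration under stated assumptions: the enum and function names are hypothetical, and the two constants repeat the values from the Windows SDK headers so that the snippet compiles without <windows.h>. It does not perform any I/O; it only maps the outcome of GetOverlappedResultEx to the reaction listed above.

```cpp
#include <cassert>
#include <cstdint>

// Values repeated from the Windows SDK headers so the sketch compiles
// without <windows.h>; real code would use the SDK macros directly.
constexpr std::uint32_t kWaitIoCompletion = 192; // WAIT_IO_COMPLETION
constexpr std::uint32_t kWaitTimeout = 258;      // WAIT_TIMEOUT

// Hypothetical names, for illustration only.
enum class NextStep { UseResult, LeaveReadAlone, CancelRead, ReportError };

// succeeded: the BOOL returned by GetOverlappedResultEx.
// lastError: GetLastError() captured immediately after a failed call.
inline NextStep afterWait(bool succeeded, std::uint32_t lastError)
{
    if (succeeded)
        return NextStep::UseResult;      // the read already finished
    switch (lastError)
    {
    case kWaitIoCompletion:
        return NextStep::LeaveReadAlone; // no timeout to enforce
    case kWaitTimeout:
        return NextStep::CancelRead;     // caller would call CancelIoEx here
    default:
        return NextStep::ReportError;    // anything else is unexpected
    }
}

// Usage example: the timeout case leads to cancelling the read.
inline void afterWaitExample()
{
    assert(afterWait(false, kWaitTimeout) == NextStep::CancelRead);
}
```

Keeping the mapping in one pure function makes it easy to check each branch without a socket or a thread pool.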
Is it really that simple, though? I have yet to find any questions or example code that closely resemble what I have going on here (which is largely based on a codebase I inherited), so I have found nothing to confirm that this approach is appropriate.
Demo program: https://github.com/rguilbault-mt/rguilbault-mt/blob/main/WinSock.cpp
to run:
-p -d -t -gor
Make the read delay > timeout to force the timeout condition.
Relevant bits for this question:
StartThreadpoolIo(gIoTp[s]);
if (WSARecv(s, bufs, 1, &readBytes, &dwFlags, &ioData->ol, NULL) == SOCKET_ERROR)
{
    std::lock_guard<std::mutex> log(gIoMtx);
    switch (WSAGetLastError())
    {
    case WSA_IO_PENDING:
        std::cout << preamble(__func__) << "asynchronous" << std::endl;
        break;
    default:
        std::cerr << preamble(__func__) << "WSARecv() failed: " << WSAGetLastError() << std::endl;
        CancelThreadpoolIo(gIoTp[s]);
        return false;
    }
}
else
{
    std::lock_guard<std::mutex> log(gIoMtx);
    std::cout << preamble(__func__) << "synchronous - " << readBytes << " read" << std::endl;
}
if (gGetOverlappedResult)
{
    {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cout << preamble(__func__) << "wait until I/O occurs or we timeout..." << std::endl;
    }
    DWORD bytesTransferred = 0;
    if (!GetOverlappedResultEx((HANDLE)s, &ioData->ol, &bytesTransferred, gTimeout, true))
    {
        DWORD e = GetLastError();
        std::lock_guard<std::mutex> log(gIoMtx);
        switch (e)
        {
        case WAIT_IO_COMPLETION:
            std::cout << preamble(__func__) << "read activity is forthcoming" << std::endl;
            break;
        case WAIT_TIMEOUT:
            // we hit our timeout, cancel the I/O
            CancelIoEx((HANDLE)s, &ioData->ol);
            break;
        default:
            std::cerr << preamble(__func__) << "GetOverlappedResult error is unhandled: " << e << std::endl;
        }
    }
    else
    {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cerr << preamble(__func__) << "GetOverlappedResult success: " << bytesTransferred << std::endl;
    }
}
Confirmation/other suggestions welcomed/appreciated.

I was debating what the proper protocol was and decided to just answer my own question for the benefit of the world (in case anyone bumps into similar criteria/issues), even though I would have preferred that @HansPassant get credit for the answer.
Anyway, following his suggestion, the wait mechanism provided by Microsoft allows me to pull off what I need without orchestrating any thread-based monitoring of my own. Here are the relevant bits:
after calling WSARecv, register a wait callback:
else if (gRegisterWait)
{
    if (!RegisterWaitForSingleObject(&ioData->waiter, (HANDLE)s, waitOrTimerCallback, ioData, gTimeout, WT_EXECUTEONLYONCE))
    {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cerr << preamble(__func__) << "RegisterWaitForSingleObject failed: " << GetLastError() << std::endl;
    }
    else
    {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cout << preamble(__func__) << "RegisterWaitForSingleObject success: " << ioData->waiter << std::endl;
    }
}
when the wait callback is invoked, use the second parameter to decide if the callback was called because of a timeout (true) or other signal (false):
VOID CALLBACK waitOrTimerCallback(
    PVOID lpParameter,
    BOOLEAN TimedOut
)
{
    IoData* ioData = (IoData*)lpParameter;
    {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cout << preamble(__func__) << (TimedOut ? "true" : "false") << std::endl;
        std::cout << "\tSocket: " << ioData->socket << std::endl;
    }
    if (!TimedOut)
    {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cout << preamble(__func__) << "read activity is forthcoming" << std::endl;
    }
    else
    {
        // we hit our timeout, cancel the I/O
        CancelIoEx((HANDLE)ioData->socket, &ioData->ol);
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cout << preamble(__func__) << "timeout reached, cancelling I/O" << std::endl;
    }
    // need to unregister the waiter but not supposed to do it in the callback
    if (!TrySubmitThreadpoolCallback(unregisterWaiter, &ioData->waiter, NULL))
    {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cerr << preamble(__func__) << "failed to unregister waiter...does this mean I have a memory leak?" << std::endl;
    }
}
per the recommendations of the API:
https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-registerwaitforsingleobject
When the wait is completed, you must call the UnregisterWait or UnregisterWaitEx function to cancel the wait operation. (Even wait operations that use WT_EXECUTEONLYONCE must be canceled.) Do not make a blocking call to either of these functions from within the callback function.
submit the unregistering of the waiter to the threadpool to be dealt with outside of the callback:
VOID CALLBACK unregisterWaiter(
    PTP_CALLBACK_INSTANCE Instance,
    PVOID Context
)
{
    PHANDLE pWaitHandle = (PHANDLE)Context;
    {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cout << preamble(__func__) << std::endl;
        std::cout << "\tHandle: " << (HANDLE)*pWaitHandle << std::endl;
    }
    if (!UnregisterWait(*pWaitHandle))
    {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cerr << preamble(__func__) << "UnregisterWait failed: " << GetLastError() << std::endl;
    }
}
Managing the pointer to the handle created needs to be accounted for, but I think you can tuck it into the structure wrapping the overlapped IO and then pass the pointer to your wrapper around. Seems to work fine. The documentation makes no indication of whether I'm on the hook for freeing anything, so I assume that is why we're required to call the UnregisterWait function regardless of whether we're only executing once, etc. That detail can be considered outside the scope of the question.
Note, for others' benefit: I've updated the GitHub link in my question with the latest version of the code.
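The idea of tucking the wait handle into the structure that wraps the overlapped I/O could look roughly like the sketch below. It is a minimal illustration, not the layout used in the linked demo: the stand-in types replace OVERLAPPED, SOCKET and HANDLE so that the snippet compiles without the Windows headers, the member order is an assumption (the code above only shows that the wrapper has ol, socket and waiter members), and fromOverlapped is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-ins so the sketch compiles anywhere. On Windows these would be
// OVERLAPPED, SOCKET and HANDLE from the SDK headers.
struct OverlappedStandIn { std::uintptr_t fields[4]; void* hEvent; };
using SocketStandIn = std::uintptr_t;
using HandleStandIn = void*;

// Hypothetical wrapper layout (assumption: ol is kept as the first member).
struct IoDataSketch
{
    OverlappedStandIn ol;   // what would be passed to WSARecv as &ioData->ol
    SocketStandIn socket;   // needed again when cancelling the I/O
    HandleStandIn waiter;   // would be filled in by RegisterWaitForSingleObject
};

// With ol first, a pointer to ol and a pointer to the wrapper are the same address.
static_assert(offsetof(IoDataSketch, ol) == 0, "ol must be the first member");

// Recover the wrapper from the overlapped pointer handed to a completion callback.
inline IoDataSketch* fromOverlapped(OverlappedStandIn* ol)
{
    return reinterpret_cast<IoDataSketch*>(ol);
}

// Usage example: the round trip yields the original wrapper and its fields.
inline void wrapperRoundTrip()
{
    IoDataSketch data{};
    data.socket = 42;
    assert(fromOverlapped(&data.ol) == &data);
    assert(fromOverlapped(&data.ol)->socket == 42);
}
```

Because the wait callback, the I/O completion callback and the unregister step can all reach the same wrapper, the wait handle only needs to live as long as that wrapper does.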

Related

What's the order of use of Win32 APIs for Server-client Pipe comm. in C++ (CreateNamedPipe, WriteFile, CreateFile, ReadFile)

I am trying to write a server/client program in C++, in Visual Studio 2019, using Win32 APIs.
This is the referred documentation: Named Pipe Open Modes
I have used 4 APIs:
On the server side (the one creating the pipe and writing to it): CreateNamedPipe(), WriteFile()
On the client side (the one connecting and reading from the pipe): CreateFile(), ReadFile()
However, I observe the server is NOT able to write to the pipe.
Following is the code I have used.
Servermain.cpp
#include <iostream>
#include <windows.h>
using namespace std;
void namedPipeServer()
{
HANDLE hPipeServer;
char Wbuffer[1024] = "Hello, from the pipe server!";
DWORD dwWrite;
BOOL writeSuccessFlag;
//Create a named pipe
hPipeServer = CreateNamedPipe(
TEXT("\\\\.\\pipe\\Agentpipe"), //lpName
PIPE_ACCESS_OUTBOUND, //dwOpenMode
PIPE_TYPE_BYTE, //dwPipeMode
1, //nMaxInstances
1024 * 16, //nOutBufferSize
1024 * 16, //nInBufferSize
NMPWAIT_USE_DEFAULT_WAIT, //nDefaultTimeOut
NULL); //lpSecurityAttributes
cout << "Inside namedPipeServer()" << endl;
if (hPipeServer != INVALID_HANDLE_VALUE)
{
cout << "Just writing to pipe" << endl;
writeSuccessFlag = WriteFile(
hPipeServer, //HANDLE hFile
Wbuffer, //LPCVOID lpBuffer
30, //DWORD nNumberOfBytesToWrite
&dwWrite,
NULL //LPOVERLAPPED lpOverlapped
);
if (writeSuccessFlag)
{
cout << "Server has written to pipe!" << endl;
}
else
{
cout << "Unsuccessful write to pipe, From Agent" << endl;
}
}
else
{
cout << "Unsuccesful pipe connection. hPipeServer: " << hPipeServer << endl;
}
}
int main()
{
cout << "Inside Agent server. Creating a named pipe.\n" << endl;
namedPipeServer();
while (1);
return 0;
}
Clientmain.cpp:
#include <iostream>
#include <windows.h>
using namespace std;
void readFromPipe()
{
HANDLE hPipeClient;
char rBuffer[1024];
DWORD dwRead;
BOOL readSuccessFlag = 0;
//Connect to the server pipe: \\.\\pipe\\Agentpipe
cout << "Inside readFromPipe()." << endl;
hPipeClient = CreateFile(
TEXT("\\\\.\\pipe\\Agentpipe"), //lpFileName
GENERIC_READ,
FILE_SHARE_READ,
NULL,
OPEN_EXISTING,
NULL,
NULL
);
while (hPipeClient != INVALID_HANDLE_VALUE)
{
cout << "Just connecting to pipe" << endl;
readSuccessFlag = ReadFile(
hPipeClient, //HANDLE hFile,
rBuffer, //LPVOID lpBuffer,
30, //DWORD nNumberOfBytesToRead,
&dwRead, //LPDWORD lpNumberOfBytesRead,
NULL //LPOVERLAPPED lpOverlapped
);
if (readSuccessFlag)
{
cout << "Client has read from pipe of Agent!" << endl;
cout << "From Agent Pipe: " << rBuffer << endl;
}
else
{
cout << "Unsuccessful Pipe read!" << endl;
}
}
if(hPipeClient == INVALID_HANDLE_VALUE)
{
cout << "Unsuccesful pipe connection at client end. hPipeClient: " << hPipeClient << endl;
}
}
int main()
{
cout << "Inside the client. Calling readFromPipe()" << endl;
readFromPipe();
while (1);
return 0;
}
When the above program is executed, it shows that the server is NOT able to write to the pipe, and the output on the server-side is:
Inside Agent server. Creating a named pipe.
Inside namedPipeServer()
Just writing to pipe
Unsuccessful write to pipe, From Agent
Output on the client console is:
Inside the client. Calling readFromPipe()
Inside readFromPipe().
Just connecting to pipe
Upon looking into the sample program in the Win32 documentation, I have observed that the order in which these Win32 APIs are used is different. It looks like this:
Pipe Server program:
main(){
...
namedPipeServer()
...
}
void namedPipeServer()
{
...
CreateFile()
WriteFile()
...
}
Pipe Client program:
main(){
...
readFromPipe()
...
}
void readFromPipe()
{
...
CreateNamedPipe()
ReadFile()
...
}
I would be happy if anyone can provide me with clarity on the use of CreateNamedPipe() & CreateFile() especially.
Does the server have to use CreateFile() first (to create the pipe, before writing to it), or can I use CreateNamedPipe()?
Is the order of use of the APIs in MY program posted incorrect? If it is, please specify why.

Full duplex named pipe lockup when written to

I'm trying to use one named pipe for bidirectional IPC. In my mind (and I can't find more information on MSDN), one full-duplex pipe should be sufficient. Here's my code.
//Compiled with these commands during my test:
//g++ -DCLIENT -o client.exe xxx.cpp
//g++ -DSERVER -o server.exe xxx.cpp
#include <iostream>
#include <windows.h>
using namespace std;
DWORD WINAPI ReadingThread(LPVOID a)
{
HANDLE pipe = (HANDLE)a;
BOOL result;
char buffer[256];
DWORD numBytesRead;
while (true)
{
result = ReadFile(pipe, buffer, sizeof(buffer) - 1, &numBytesRead, NULL);
if (result)
{
buffer[numBytesRead] = 0;
cout << "[Thread] Number of bytes read: " << numBytesRead << endl;
cout << "[Thread] Message: " << endl
<< buffer << endl
<< endl;
}
else
{
cout << "[Thread] Failed to read data from the pipe. err=" << GetLastError() << endl;
break;
}
}
return 0;
}
int main(int argc, const char **argv)
{
#ifdef CLIENT
cout << "[Main] Connecting to pipe..." << endl;
HANDLE pipe = CreateFileA("\\\\.\\pipe\\PipeTest", GENERIC_READ | GENERIC_WRITE, FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
#else
cout << "[Main] Creating an instance of a named pipe..." << endl;
HANDLE pipe = CreateNamedPipeA("\\\\.\\pipe\\PipeTest", PIPE_ACCESS_DUPLEX, PIPE_TYPE_BYTE, 1, 0, 0, 0, NULL);
#endif
if (pipe == NULL || pipe == INVALID_HANDLE_VALUE)
{
cout << "[Main] Failed to acquire pipe handle." << endl;
return 1;
}
#ifdef CLIENT
#else
cout << "[Server] Waiting for a client to connect to the pipe..." << endl;
BOOL result = ConnectNamedPipe(pipe, NULL);
if (!result)
{
cout << "[Server] Failed to make connection on named pipe." << endl;
CloseHandle(pipe);
return 1;
}
cout << "[Server] Client is here!" << endl;
{
const char *buf = "Hello pipe!\n";
WriteFile(pipe, buf, strnlen(buf, 30), 0, 0);
}
#endif
CreateThread(0, 0, ReadingThread, pipe, 0, 0);
cout << "[Main] Ready to send data." << endl;
while (true)
{
char buffer[128];
DWORD numBytesWritten = 0;
BOOL result;
cin >> buffer;
if (!strcmp(buffer, "q"))
{
break;
}
cout << "[Main] Writing data to pipe..." << endl;
result = WriteFile(pipe, buffer, strnlen(buffer, _countof(buffer)), &numBytesWritten, 0);
if (result)
{
cout << "[Main] Written " << numBytesWritten << " bytes to the pipe." << endl;
}
else
{
cout << "[Main] Failed to write data to the pipe. err=" << GetLastError() << endl;
}
}
CloseHandle(pipe);
cout << "[Main] Done." << endl;
return 0;
}
I can get the "Hello pipe!" message from the server side to the client side. I'm expecting to type some string in either program's terminal, press Enter, and see it on the other side.
However, after the hello message, both programs get stuck on the WriteFile call. Meanwhile, the thread is stuck in the ReadFile call. How can I make it work, or did I leave something out?
When a file is opened for synchronous I/O (the FO_SYNCHRONOUS_IO flag is present in its FILE_OBJECT), all I/O operations on that file are serialized: a new operation waits in the I/O manager, before it is passed to the driver, until the current operation (if any) completes. Only a single I/O request can execute at a time. If a dedicated thread is blocked in a read, every other I/O request on this file is blocked until that read completes. This applies not only to writes; even querying the file name or attributes will block. As a result, moving the read to a separate thread does not help here: the first write attempt blocks. The solution is to use asynchronous file handles, which allow any number of I/O operations to execute concurrently.
Named Pipes in Windows are HALF DUPLEX. As demonstrated on Windows 10. The MSDN Documentation is Wrong. A request has been submitted to Microsoft to correct their documentation.
While a pipe can be opened on the client to be "Generic Read | Generic Write" you can NOT do both at the same time.
And Overlapped IO submitted after the First Overlapped IO will break the pipe.
You can submit overlapped io. Then Wait for it to finish. Then submit the next overlapped io. You can not simultaneously Submit overlapped Reads AND overlapped Writes.
This is by definition, "Half Duplex".

How do I correctly fork() and exit a child process when I'm using ZeroMQ

I have a simple application that listens on a ZeroMQ socket. When the client connects and requests a worker node, I fork() my process, the forked child process creates a new context and a new ZeroMQ socket. The client and the worker node perform a REQ-REP formal behaviour on that socket.
My problem is how I can gracefully handle a shutdown of my worker node.
The client sends an EXIT message to my worker node, who needs to close its socket and its context (?)
From what I can see, the child process exits; however, new clients can no longer talk to my original parent process.
Pseudocode
while (looping) {
zmq::message_t request;
try {
socket.recv(&request); // Wait
string reqStr = string(static_cast<char *>(request.data()), request.size());
if (reqStr.compare("exit") == 0) {
LOG(INFO) << "exiting.." << endl;
looping = false;
}
LOG(INFO) << "******************************************************************" << endl;
LOG(INFO) << "Received request for an endpoint " << reqStr << endl;
int port = doFork(reqStr);
if (port > 0) {
LOG(INFO) << "Returning endPoint: " << reqStr << " on port: " << port << endl;
string result = NumberToString(port);
zmq::message_t reply(result.length());
memcpy((void *) reply.data(), result.c_str(), result.length());
socket.send(reply);
}
else {
// Child process exiting OR error in fork
looping = false;
child = true;
}
}
catch (zmq::error_t &e) {
LOG(INFO) << "W: Caught Exception OR Interrupt: " << e.what() << " : and pid is " << getpid() << endl;
}
}
if (!child) {
socket.close();
context.close();
LOG(INFO) << "Closed socket and context for pid " << getpid() << endl;
}}
int Forker::doFork(string reqStr) {
pid_t pid;
int port = ++startingPort;
switch (pid = fork()) {
case -1:
LOG(INFO) << "Error in fork";
return -1;
case 0:
LOG(INFO) << "Created child process with pid: " << getpid() << endl;
{
ServicePtr servicePtr(new Service(NumberToString(port)));
LOG(INFO) << "Spawning on port: " << port << endl;
servicePtr->spawn();
}
LOG(INFO) << "Child Process exiting on port: " << port << endl;
return 0;
default:
LOG(INFO) << "Parent process. My process id is " << getpid() << endl;
}
return port;
}
Your pseudocode isn't enough to get much of anywhere. Notably, you don't define your sockets there, so I don't even know which side is the REQ and which side is the REP (or, indeed, that you're actually using those socket types), or which side binds and which side connects.
But, my first guess is that you've got an uneven send/receive pairing, and something like the following is happening:
Client binds on REQ socket, worker connects on REP socket
Client sends a request (client expects recv next)
Worker sends response (client expects send next)
Client sends a follow up message (client expects recv next)
Worker shuts down
New worker spins up
Client sends a request (ERROR - REQ sockets are strictly send/recv ordering)
IF that's what's causing your issue, you can either make sure to respond back to reset the client socket, or you can use a DEALER socket instead of REQ.
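To make the strict send/recv ordering concrete, here is a toy model of a REQ socket's lockstep. It is only a sketch: ReqSocketModel and replayStuckClient are hypothetical names, the class does not use libzmq at all, and throwing std::logic_error merely stands in for the error a real REQ socket would report when it is used out of order.

```cpp
#include <cassert>
#include <stdexcept>

// Toy model of the REQ send/recv lockstep; this is not the libzmq API.
class ReqSocketModel
{
public:
    void send()
    {
        if (awaitingReply_)
            throw std::logic_error("REQ: send while a reply is still expected");
        awaitingReply_ = true;
    }

    void recv()
    {
        if (!awaitingReply_)
            throw std::logic_error("REQ: recv before any request was sent");
        awaitingReply_ = false;
    }

    bool awaitingReply() const { return awaitingReply_; }

private:
    bool awaitingReply_ = false;
};

// Replays the sequence listed above from the client's point of view.
// Returns true if the request to the new worker is rejected.
inline bool replayStuckClient()
{
    ReqSocketModel client;
    client.send();                  // client sends a request
    client.recv();                  // worker sends a response
    client.send();                  // follow-up message; worker shuts down without replying
    assert(client.awaitingReply()); // the client still expects a recv
    try
    {
        client.send();              // request intended for the new worker
    }
    catch (const std::logic_error&)
    {
        return true;
    }
    return false;
}
```

In this model the only way out of the stuck state is a recv, which matches the first suggestion above: have the worker reply to the follow-up message before it shuts down.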

OpenCL function found deprecated by Visual Studio

I am getting started with OpenCL in VS using this tutorial:
https://opencl.codeplex.com/wikipage?title=OpenCL%20Tutorials%20-%201
I am having trouble with setting up the host program. This is the code so far:
const char* clewErrorString(cl_int error) {
//stuff
}
int main(int argc, char **argv) {
cl_int errcode_ret;
cl_uint num_entries;
// Platform
cl_platform_id platforms;
cl_uint num_platforms;
num_entries = 1;
cout << "Getting platform id..." << endl;
errcode_ret = clGetPlatformIDs(num_entries, &platforms, &num_platforms);
if (errcode_ret != CL_SUCCESS) {
cout << "Error getting platform id: " << clewErrorString(errcode_ret) << endl;
exit(errcode_ret);
}
cout << "Success!" << endl;
// Device
cl_device_type device_type = CL_DEVICE_TYPE_GPU;
num_entries = 1;
cl_device_id devices;
cl_uint num_devices;
cout << "Getting device id..." << endl;
errcode_ret = clGetDeviceIDs(platforms, device_type, num_entries, &devices, &num_devices);
if (errcode_ret != CL_SUCCESS) {
cout << "Error getting device id: " << clewErrorString(errcode_ret) << endl;
exit(errcode_ret);
}
cout << "Success!" << endl;
// Context
cl_context context;
cout << "Creating context..." << endl;
context = clCreateContext(0, num_devices, &devices, NULL, NULL, &errcode_ret);
if (errcode_ret < 0) {
cout << "Error creating context: " << clewErrorString(errcode_ret) << endl;
exit(errcode_ret);
}
cout << "Success!" << endl;
// Command-queue
cl_command_queue queue;
cout << "Creating command queue..." << endl;
queue = clCreateCommandQueue(context, devices, 0, &errcode_ret);
if (errcode_ret != CL_SUCCESS) {
cout << "Error creating command queue: " << clewErrorString(errcode_ret) << endl;
exit(errcode_ret);
}
cout << "Success!" << endl;
return 0;
}
This doesn't compile, though: I get error C4996: 'clCreateCommandQueue': was declared deprecated. I don't understand the whole setup process yet, so I don't know whether I have messed something up. According to Khronos, the function doesn't seem to be deprecated, though:
https://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/clCreateCommandQueue.html
If I remove the command queue part, the rest runs without problems. How can I make this work?
The clCreateCommandQueue function was deprecated as of OpenCL 2.0, and replaced with clCreateCommandQueueWithProperties. If you are only targeting devices that support OpenCL 2.0 (some recent Intel and AMD processors at the time of writing), you can safely use this new function.
If you need your code to run on devices that don't yet support OpenCL 2.0, you can continue using the deprecated clCreateCommandQueue function by using the preprocessor macros that the OpenCL headers provide, e.g.:
#define CL_USE_DEPRECATED_OPENCL_1_2_APIS
#include <CL/cl.h>
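If the same binary has to handle both kinds of device, one option is to decide at run time from the device's version string. The helper below is a sketch under assumptions: supportsQueueWithProperties is a hypothetical name, and it expects the string that clGetDeviceInfo returns for CL_DEVICE_VERSION, which has the form "OpenCL <major>.<minor> <vendor-specific information>". It does not call OpenCL itself, so it compiles without the OpenCL headers.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// deviceVersion: the CL_DEVICE_VERSION string of the device, in the form
// "OpenCL <major>.<minor> <vendor-specific information>".
// Returns true when the device reports OpenCL 2.0 or newer, i.e. when
// clCreateCommandQueueWithProperties can be expected to be available.
inline bool supportsQueueWithProperties(const std::string& deviceVersion)
{
    int major = 0;
    int minor = 0;
    if (std::sscanf(deviceVersion.c_str(), "OpenCL %d.%d", &major, &minor) != 2)
        return false; // unexpected format: stay on the older API
    return major >= 2;
}

// Usage example with made-up vendor text after the version number.
inline void versionCheckExamples()
{
    assert(!supportsQueueWithProperties("OpenCL 1.2 vendor text"));
    assert(supportsQueueWithProperties("OpenCL 2.0 vendor text"));
}
```

A host program could call clCreateCommandQueueWithProperties when this returns true and fall back to clCreateCommandQueue (with the macro above defined) otherwise.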

Why Intel Pin cannot identify the image/routine of some executed instructions?

I am creating a large pintool and I have two questions:
The tool (abridged below to the relevant part only) sometimes cannot identify the image/routine for particular executed instructions. Does anybody know when/why that can happen?
The tool (when instrumenting a Barnes-Hut benchmark) always terminates with an out-of-memory (OOM) error after running for a while (although the benchmark, when run standalone, completes successfully). Which tools can I use to debug/trace the OOM error of Pin-instrumented applications?
int main(int argc, char *argv[])
{
PIN_InitSymbols();
if( PIN_Init(argc, argv) )
{
return 0;
}
INS_AddInstrumentFunction(Instruction, 0);
PIN_StartProgram();
return 0;
}
VOID Instruction(INS ins, VOID *v)
{
INS_InsertPredicatedCall( ins,
IPOINT_BEFORE,
(AFUNPTR) handle_ins_execution,
IARG_INST_PTR,
.....);
}
VOID handle_ins_execution (ADDRINT addr, ...)
{
PIN_LockClient();
IMG img = IMG_FindByAddress(addr);
RTN rtn = RTN_FindByAddress(addr);
PIN_UnlockClient();
if( IMG_Valid(img) ) {
std::cerr << "From Image : " << IMG_Name( img ) << std::endl;
} else {
std::cerr << "From Image : " << "(UNKNOWN)" << std::endl;
}
if( RTN_Valid(rtn) ) {
std::cerr << "From Routine : " << RTN_Name(rtn) << std::endl;
} else {
std::cerr << "From Routine : " << "(UNKNOWN)" << std::endl;
}
}
I recently asked this on the PinHeads forum, and I'm awaiting a response. According to the documentation, the IMG_FindByAddress function works as follows: "for each image, check if the address is within the mapped memory region of one of its segments." It is therefore possible that some executed instructions do not fall within any of those ranges.
The best way to know which image such an instruction belongs to is to look at the context. My pintool (based on DebugTrace) continues to run even without knowing which image it is in, so you can look at the log entries before and after this occurs. I see this all the time in dyld on OS X.
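The quoted lookup rule can be illustrated with a toy model that does not use Pin at all. Everything here is hypothetical (SegmentRange, imageNameFor, the addresses and the image names), and treating the upper bound as inclusive is an assumption of the model. It only shows why an address that lies in no mapped segment ends up reported as unknown.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one mapped segment of a loaded image.
struct SegmentRange
{
    std::uint64_t low;   // first address of the segment
    std::uint64_t high;  // last address of the segment (inclusive in this model)
    std::string image;   // name of the image the segment belongs to
};

// For each segment, check whether addr lies inside it; otherwise report unknown.
inline std::string imageNameFor(const std::vector<SegmentRange>& segments, std::uint64_t addr)
{
    for (const SegmentRange& seg : segments)
    {
        if (addr >= seg.low && addr <= seg.high)
            return seg.image;
    }
    return "(UNKNOWN)";
}

// Usage example with made-up address ranges.
inline void lookupExamples()
{
    const std::vector<SegmentRange> segments = {
        {0x400000, 0x400fff, "app"},
        {0x7f0000, 0x7f0fff, "libc"},
    };
    assert(imageNameFor(segments, 0x400010) == "app");
    assert(imageNameFor(segments, 0x500000) == "(UNKNOWN)");
}
```

Logging the raw address alongside the "(UNKNOWN)" marker, as the context-based approach above suggests, makes it possible to see afterwards which known ranges the address sits between.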
