How do I correctly fork() and exit a child process when I'm using ZeroMQ?

I have a simple application that listens on a ZeroMQ socket. When a client connects and requests a worker node, I fork() my process; the forked child process creates a new context and a new ZeroMQ socket. The client and the worker node then follow the formal REQ-REP pattern on that socket.
My problem is how to gracefully handle the shutdown of a worker node.
The client sends an EXIT message to the worker node, which needs to close its socket and its context (?)
From what I can see, the child process exits; however, new clients can no longer talk to my original parent process.
Pseudocode:
while (looping) {
    zmq::message_t request;
    try {
        socket.recv(&request); // Wait
        string reqStr = string(static_cast<char *>(request.data()), request.size());
        if (reqStr.compare("exit") == 0) {
            LOG(INFO) << "exiting.." << endl;
            looping = false;
        } else {
            LOG(INFO) << "******************************************************************" << endl;
            LOG(INFO) << "Received request for an endpoint " << reqStr << endl;
            int port = doFork(reqStr);
            if (port > 0) {
                // Parent: reply to the client with the worker's port
                LOG(INFO) << "Returning endPoint: " << reqStr << " on port: " << port << endl;
                string result = NumberToString(port);
                zmq::message_t reply(result.length());
                memcpy((void *)reply.data(), result.c_str(), result.length());
                socket.send(reply);
            } else {
                // Child process exiting OR error in fork
                looping = false;
                child = true;
            }
        }
    } catch (zmq::error_t &e) {
        LOG(INFO) << "W: Caught Exception OR Interrupt: " << e.what() << " : and pid is " << getpid() << endl;
    }
}
if (!child) {
    // close the socket and context here
    LOG(INFO) << "Closed socket and context for pid " << getpid() << endl;
}

int Forker::doFork(string reqStr) {
    pid_t pid;
    int port = ++startingPort;
    switch (pid = fork()) {
    case -1:
        LOG(INFO) << "Error in fork";
        return -1;
    case 0: {
        // Child: braces scope servicePtr so the default label stays reachable
        LOG(INFO) << "Created child process with pid: " << getpid() << endl;
        ServicePtr servicePtr(new Service(NumberToString(port)));
        LOG(INFO) << "Spawning on port: " << port << endl;
        LOG(INFO) << "Child Process exiting on port: " << port << endl;
        return 0;
    }
    default:
        LOG(INFO) << "Parent process. My process id is " << getpid() << endl;
    }
    return port;
}
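One common POSIX pattern for ending a forked child cleanly is to have the child terminate with _exit() instead of returning into the parent's request loop, and to have the parent reap it with waitpid(). The sketch below shows only that pattern; it uses no ZeroMQ calls, `runInChild` is a hypothetical helper name, and the post does not confirm that this resolves the problem described.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs `work` in a forked child and returns the child's exit code,
// or -1 if fork() or waitpid() fails. `runInChild` is a hypothetical helper.
int runInChild(int (*work)()) {
    pid_t pid = fork();
    if (pid == -1) {
        return -1;  // fork failed
    }
    if (pid == 0) {
        // Child: do the work, then terminate right here. _exit() skips the
        // atexit handlers and static destructors inherited from the parent,
        // and the child never falls back into the parent's request loop.
        _exit(work());
    }
    // Parent: reap the child so it does not linger as a zombie.
    int status = 0;
    if (waitpid(pid, &status, 0) == -1) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `runInChild([]() { return 7; })` yields 7 in the parent. A long-running server would not block in waitpid() like this; it would more likely reap children with WNOHANG or from a SIGCHLD handler.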

Your pseudocode isn't enough to go on: notably, you don't define your sockets, so I don't know which side is the REQ and which is the REP (or, indeed, whether you're actually using those socket types), or which side binds and which side connects.
But, my first guess is that you've got an uneven send/receive pairing, and something like the following is happening:
Client binds on REQ socket, worker connects on REP socket
Client sends a request (client expects recv next)
Worker sends response (client expects send next)
Client sends a follow up message (client expects recv next)
Worker shuts down
New worker spins up
Client sends a request (ERROR - REQ sockets are strictly send/recv ordering)
If that's what's causing your issue, you can either make sure a response is sent back to reset the client socket, or you can use a DEALER socket instead of REQ.
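The strict ordering described in the steps above can be illustrated with a toy state machine. This is a model only, not the libzmq API: `ReqModel` and its methods are hypothetical names, and a real REQ socket reports the violation as an error from the send call rather than through this exception.

```cpp
#include <stdexcept>

// Toy model of the strict send/recv alternation of a REQ socket.
// NOT the libzmq API; `ReqModel` is a hypothetical class for illustration.
class ReqModel {
public:
    // A request may only be sent when no reply is outstanding.
    void send() {
        if (awaitingReply_) {
            throw std::logic_error("REQ: send() while a reply is still pending");
        }
        awaitingReply_ = true;
    }

    // A reply may only be received after a request was sent.
    void recv() {
        if (!awaitingReply_) {
            throw std::logic_error("REQ: recv() without a pending request");
        }
        awaitingReply_ = false;
    }

    bool canSend() const { return !awaitingReply_; }

private:
    bool awaitingReply_ = false;
};
```

In this model, a client that sends a follow-up message and never gets a reply (because the worker shut down) stays in the "awaiting reply" state, so its next send() fails. Receiving a reply is the only thing that resets it.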


WinSock: How to properly time out receive using overlapped I/O

Problem criteria:
my service is Windows-only, so portability is not a constraint for me
my service uses threadpools with overlapped I/O
my service needs to open a connection to a remote service, ask a question and receive a reply
the remote service may refuse to answer (root cause is not important)
The solution is trivial to describe: set a timeout on the read.
The implementation of said solution has been elusive.
I think I may have finally tracked down something viable, but I am so weary from false starts that I would like approval from someone who has done this sort of thing before I move ahead with it.
By calling GetOverlappedResultEx with a non-zero timeout:
If dwMilliseconds is nonzero, and an I/O completion routine or APC is queued, GetLastError returns WAIT_IO_COMPLETION.
If dwMilliseconds is nonzero and the specified timeout interval elapses, GetLastError returns WAIT_TIMEOUT.
Thus, I can sit and wait until IO has been alerted or the timeout exceeded and react accordingly:
WAIT_TIMEOUT: CancelIoEx on the overlapped structure from the WSARecv, which will trigger my IO complete callback and allow me to do something meaningful (e.g. force the socket closed).
WAIT_IO_COMPLETION: Do nothing. Timeout need not be enforced.
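The wait-then-cancel decision described above can be sketched portably. The example below is a stand-in, not Windows code: std::condition_variable::wait_for replaces GetOverlappedResultEx, and setting a `cancelled` flag marks the spot where the real code would call CancelIoEx. `PendingRead`, `WaitResult` and `waitOrCancel` are hypothetical names.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

enum class WaitResult { Completed, TimedOutAndCancelled };

// Stand-in for one outstanding overlapped read (hypothetical type).
struct PendingRead {
    std::mutex m;
    std::condition_variable cv;
    bool completed = false;  // set by the I/O completion path
    bool cancelled = false;  // set by the waiter when the timeout elapses

    // Called by whoever finishes the read.
    void complete() {
        {
            std::lock_guard<std::mutex> lock(m);
            completed = true;
        }
        cv.notify_all();
    }
};

// Waits up to `timeout` for the read to complete. On timeout the read is
// marked cancelled, which is where the real code would call CancelIoEx.
WaitResult waitOrCancel(PendingRead& read, std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(read.m);
    if (read.cv.wait_for(lock, timeout, [&] { return read.completed; })) {
        return WaitResult::Completed;
    }
    read.cancelled = true;
    return WaitResult::TimedOutAndCancelled;
}
```

With a read that never completes, `waitOrCancel(read, std::chrono::milliseconds(20))` returns `TimedOutAndCancelled`; if `complete()` was called first, it returns `Completed` and leaves the read untouched.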
Is it really that simple, though? I have yet to find any questions or example code that closely resemble what I have going on here (which is largely based on a codebase I inherited), so I have found nothing to confirm that this approach is appropriate.
Demo program:
to run:
-p -d -t -gor
Make the read delay > timeout to force the timeout condition.
Relevant bits for this question:
if (WSARecv(s, bufs, 1, &readBytes, &dwFlags, &ioData->ol, NULL) == SOCKET_ERROR) {
    std::lock_guard<std::mutex> log(gIoMtx);
    switch (WSAGetLastError()) {
    case WSA_IO_PENDING:
        std::cout << preamble(__func__) << "asynchronous" << std::endl;
        break;
    default:
        std::cerr << preamble(__func__) << "WSARecv() failed: " << WSAGetLastError() << std::endl;
        return false;
    }
} else {
    std::lock_guard<std::mutex> log(gIoMtx);
    std::cout << preamble(__func__) << "synchronous - " << readBytes << " read" << std::endl;
}
if (gGetOverlappedResult) {
    {   // hold the log mutex only while printing, not while waiting
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cout << preamble(__func__) << "wait until I/O occurs or we timeout..." << std::endl;
    }
    DWORD bytesTransferred = 0;
    if (!GetOverlappedResultEx((HANDLE)s, &ioData->ol, &bytesTransferred, gTimeout, true)) {
        DWORD e = GetLastError();
        std::lock_guard<std::mutex> log(gIoMtx);
        switch (e) {
        case WAIT_IO_COMPLETION:
            std::cout << preamble(__func__) << "read activity is forthcoming" << std::endl;
            break;
        case WAIT_TIMEOUT:
            // we hit our timeout, cancel the I/O
            CancelIoEx((HANDLE)s, &ioData->ol);
            break;
        default:
            std::cerr << preamble(__func__) << "GetOverlappedResult error is unhandled: " << e << std::endl;
        }
    } else {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cerr << preamble(__func__) << "GetOverlappedResult success: " << bytesTransferred << std::endl;
    }
}
Confirmation/other suggestions welcomed/appreciated.
I was debating what the proper protocol was and decided to answer my own question for the benefit of the world (in case anyone runs into similar criteria or issues), even though I would have preferred that @HansPassant get credit for the answer.
Anyway, following his suggestion, using the wait mechanism provided by Microsoft allows me to pull off what I need without orchestrating any thread-based monitoring of my own. Here are the relevant bits:
after calling WSARecv, register a wait callback:
else if (gRegisterWait) {
    if (!RegisterWaitForSingleObject(&ioData->waiter, (HANDLE)s, waitOrTimerCallback, ioData, gTimeout, WT_EXECUTEONLYONCE)) {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cerr << preamble(__func__) << "RegisterWaitForSingleObject failed: " << GetLastError() << std::endl;
    } else {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cout << preamble(__func__) << "RegisterWaitForSingleObject success: " << ioData->waiter << std::endl;
    }
}
when the wait callback is invoked, use the second parameter to decide if the callback was called because of a timeout (true) or other signal (false):
VOID CALLBACK waitOrTimerCallback(
    PVOID lpParameter,
    BOOLEAN TimedOut)
{
    IoData* ioData = (IoData*)lpParameter;
    {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cout << preamble(__func__) << (TimedOut ? "true" : "false") << std::endl;
        std::cout << "\tSocket: " << ioData->socket << std::endl;
    }
    if (!TimedOut) {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cout << preamble(__func__) << "read activity is forthcoming" << std::endl;
    } else {
        // we hit our timeout, cancel the I/O
        CancelIoEx((HANDLE)ioData->socket, &ioData->ol);
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cout << preamble(__func__) << "timeout reached, cancelling I/O" << std::endl;
    }
    // need to unregister the waiter but not supposed to do it in the callback
    if (!TrySubmitThreadpoolCallback(unregisterWaiter, &ioData->waiter, NULL)) {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cerr << preamble(__func__) << "failed to unregister waiter...does this mean I have a memory leak?" << std::endl;
    }
}
per the recommendations of the API:
When the wait is completed, you must call the UnregisterWait or UnregisterWaitEx function to cancel the wait operation. (Even wait operations that use WT_EXECUTEONLYONCE must be canceled.) Do not make a blocking call to either of these functions from within the callback function.
submit the unregistering of the waiter to the threadpool to be dealt with outside of the callback:
VOID CALLBACK unregisterWaiter(
    PTP_CALLBACK_INSTANCE Instance,
    PVOID Context)
{
    PHANDLE pWaitHandle = (PHANDLE)Context;
    {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cout << preamble(__func__) << std::endl;
        std::cout << "\tHandle: " << (HANDLE)*pWaitHandle << std::endl;
    }
    if (!UnregisterWait(*pWaitHandle)) {
        std::lock_guard<std::mutex> log(gIoMtx);
        std::cerr << preamble(__func__) << "UnregisterWait failed: " << GetLastError() << std::endl;
    }
}
Managing the pointer to the handle created needs to be accounted for, but I think you can tuck it into the structure wrapping the overlapped IO and then pass the pointer to your wrapper around. Seems to work fine. The documentation makes no indication of whether I'm on the hook for freeing anything, so I assume that is why we're required to call the UnregisterWait function regardless of whether we're only executing once, etc. That detail can be considered outside the scope of the question.
Note, for others' benefit, I've updated the github link from my question with the latest version of the code.

OMNET++: How do I send a message to a specific node based on probability?

I want to build a simulation in which node 1 sends its message to either node 2 or node 3 based on a given probability, and node 2 does the same. However, if node 3 receives the message at any time, the message is deleted. I tried to implement this myself, but it is not working as I planned. out1 is an output that goes to either node 1 or node 2, while out2 is an output that goes to node 3. When the message starts at node 1 and goes to node 3 first, it gets properly deleted, but other times the simulation immediately reports that there are no more events and that it has completed. I attached my node's .cc file, and I am sure the other connections are correct. Any advice would be much appreciated; I'm still very new to OMNeT++. Thanks!
#include "nodes.h"

void Nodes::initialize()
{
    prob1 = .9;
    if (strcmp("node1", getName()) == 0) {
        if (uniform(0, 1) > prob1) {
            EV << "Sending initial message\n";
            cMessage *msg = new cMessage("hw4Msg");
            send(msg, "out1");
        }
        else {
            EV << "Sending initial message\n";
            cMessage *msg = new cMessage("hw4Msg");
            send(msg, "out2");
        }
    }
}

void Nodes::handleMessage(cMessage *msg) {
    counter++;
    if ((counter == 1) && (strcmp("node3", getName()) == 0)) {
        EV << getName() << "'s counter is " << counter << ", meaning ";
        EV << getName() << " has captured the packet. The message will now be deleted.";
        delete msg;
    }
    prob2 = .9;
    if (strcmp("node1", getName()) == 0) {
        if (uniform(0, 1) > prob2) {
            EV << getName() << " Received message " << msg->getName() << " ,sending it out again\n";
            EV << getName() << "'s counter is " << counter;
            send(msg, "out1");
        }
        else {
            EV << getName() << " Received message " << msg->getName();
            send(msg, "out2");
        }
    }
    if (strcmp("node2", getName()) == 0) {
        if (uniform(0, 1) < prob2) {
            EV << getName() << " Received message " << msg->getName() << " ,sending it out again\n";
            EV << getName() << "'s counter is " << counter;
            send(msg, "out1");
        }
        else {
            EV << getName() << " Received message " << msg->getName();
            send(msg, "out2");
        }
    }
}
What is missing from your model is how you intend to generate packets. In your current code, you generate a single message (in the initialize() method of node1) and then send it towards the other nodes. Once it finds its way to node3, the message is deleted and there are no more events for the simulation to process, so it stops.
Other than that, it cannot finish immediately: the first message either goes straight to node3 and is deleted there in the next event, or it goes to node2, where it is forwarded to either node1 or node3 according to your code.
Unless you have mistyped the node2 name in your NED file. Either way, you should single-step this in Qtenv and look at the events one by one.
As a side note, this scenario can be (almost) fully modeled using the modules from samples/queueinglib (a Source, a Sink and a Classifier module). I highly recommend taking a look at that sample.
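The single-message behaviour described in this answer can be reproduced outside OMNeT++ with a small plain C++ sketch. It assumes one comparison direction for every node (a draw below `probOut1` selects out1), whereas the code in the question uses `>` in node1 and `<` in node2, so those two nodes route with opposite probabilities. `Gate`, `pickGate` and `sendsUntilSink` are hypothetical names, and std::mt19937 stands in for the simulator's uniform() call.

```cpp
#include <cassert>
#include <random>

enum class Gate { Out1, Out2 };

// Maps one uniform draw in [0, 1) to a gate: Out1 with probability probOut1,
// otherwise Out2. `pickGate` is a hypothetical helper, not an OMNeT++ call.
Gate pickGate(double draw, double probOut1) {
    assert(probOut1 >= 0.0 && probOut1 <= 1.0);
    return draw < probOut1 ? Gate::Out1 : Gate::Out2;
}

// Counts how many sends happen before the single message leaves on Out2
// (towards node 3), where it is deleted and no further events remain.
int sendsUntilSink(std::mt19937& rng, double probOut1) {
    assert(probOut1 < 1.0);  // with probability 1 the message would never reach node 3
    std::uniform_real_distribution<double> uniform01(0.0, 1.0);
    int sends = 0;
    Gate gate;
    do {
        gate = pickGate(uniform01(rng), probOut1);
        ++sends;
    } while (gate == Gate::Out1);  // Out1 keeps it bouncing between node 1 and node 2
    return sends;
}
```

Because only one message is ever created, the run always ends after a finite number of sends. To keep the simulation going, something has to create new messages, which is the role the Source module plays in queueinglib.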

Full duplex named pipe lockup when written to

I'm trying to use one named pipe for bidirectional IPC. In my mind (and I can't find more information on MSDN), one full-duplex pipe should be sufficient. Here's my code.
//Compiled with these commands during my test:
//g++ -DCLIENT -o client.exe xxx.cpp
//g++ -DSERVER -o server.exe xxx.cpp
#include <iostream>
#include <windows.h>
using namespace std;
DWORD WINAPI ReadingThread(LPVOID a)
{
    HANDLE pipe = (HANDLE)a;
    BOOL result;
    char buffer[256];
    DWORD numBytesRead;
    while (true)
    {
        result = ReadFile(pipe, buffer, sizeof(buffer) - 1, &numBytesRead, NULL);
        if (result)
        {
            buffer[numBytesRead] = 0;
            cout << "[Thread] Number of bytes read: " << numBytesRead << endl;
            cout << "[Thread] Message: " << endl
                 << buffer << endl
                 << endl;
        }
        else
        {
            cout << "[Thread] Failed to read data from the pipe. err=" << GetLastError() << endl;
            break;
        }
    }
    return 0;
}
int main(int argc, const char **argv)
#ifdef CLIENT
cout << "[Main] Connecting to pipe..." << endl;
cout << "[Main] Creating an instance of a named pipe..." << endl;
HANDLE pipe = CreateNamedPipeA("\\\\.\\pipe\\PipeTest", PIPE_ACCESS_DUPLEX, PIPE_TYPE_BYTE, 1, 0, 0, 0, NULL);
if (pipe == NULL || pipe == INVALID_HANDLE_VALUE)
cout << "[Main] Failed to acquire pipe handle." << endl;
return 1;
#ifdef CLIENT
cout << "[Server] Waiting for a client to connect to the pipe..." << endl;
BOOL result = ConnectNamedPipe(pipe, NULL);
if (!result)
cout << "[Server] Failed to make connection on named pipe." << endl;
return 1;
cout << "[Server] Client is here!" << endl;
const char *buf = "Hello pipe!\n";
WriteFile(pipe, buf, strnlen(buf, 30), 0, 0);
CreateThread(0, 0, ReadingThread, pipe, 0, 0);
cout << "[Main] Ready to send data." << endl;
while (true)
char buffer[128];
DWORD numBytesWritten = 0;
BOOL result;
cin >> buffer;
if (!strcmp(buffer, "q"))
cout << "[Main] Writing data to pipe..." << endl;
result = WriteFile(pipe, buffer, strnlen(buffer, _countof(buffer)), &numBytesWritten, 0);
if (result)
cout << "[Main] Written " << numBytesWritten << " bytes to the pipe." << endl;
cout << "[Main] Failed to write data to the pipe. err=" << GetLastError() << endl;
cout << "[Main] Done." << endl;
return 0;
I can get the "Hello pipe!" message from the server side to the client side. I expect to be able to type a string in either program's terminal, press Enter, and see it on the other side.
However, after the hello message, both programs get stuck in the WriteFile call, while the thread is stuck in the ReadFile call. How can I make this work, or did I leave something out?
When a file is opened for synchronous I/O (the FO_SYNCHRONOUS_IO flag is set in its FILE_OBJECT), all I/O operations on that file are serialized: a new operation waits in the I/O manager, before it is passed to the driver, until the current one (if any) completes. Only a single I/O request can execute at a time. If a dedicated thread is blocked in a read, every other I/O request on that file is blocked until the read completes. This is not limited to writes; even querying the file name or attributes will block. As a result, moving the read to a separate thread does not help here: the first write attempt blocks. The solution is to open the file for asynchronous I/O, which lets any number of I/O operations execute concurrently.
Named Pipes in Windows are HALF DUPLEX. As demonstrated on Windows 10. The MSDN Documentation is Wrong. A request has been submitted to Microsoft to correct their documentation.
While a pipe can be opened on the client to be "Generic Read | Generic Write" you can NOT do both at the same time.
Any overlapped I/O submitted after the first overlapped I/O will break the pipe.
You can submit overlapped io. Then Wait for it to finish. Then submit the next overlapped io. You can not simultaneously Submit overlapped Reads AND overlapped Writes.
This is by definition, "Half Duplex".

Serial Communication data problem between Windows and embedded System (STM32) (C/C++)

I am currently trying to set up communication between a Windows program and a µC.
I'll show you the code to initialize the port:
int serialCommunication::serialInit(void){
//non overlapped communication
hComm = CreateFile( gszPort.c_str(),
cout << "Error opening port." << endl;
return 0;
cout << "Opened Port successfully." << endl;
if (SetCommMask(hComm, EV_RXCHAR) == FALSE){
cout << "Error setting communications mask." << endl;
return 0;
SetCommMask(hComm, EV_RXCHAR);
cout << "Communications mask set successfully." << endl;
if (GetCommState(hComm, &dcbSerialParams) == FALSE){
cout << "Error getting CommState." << endl;
return 0;
GetCommState(hComm, &dcbSerialParams);
cout << "CommState retrieved successfully" << endl;
dcbSerialParams.BaudRate = CBR_115200; // Setting BaudRate = 115200
dcbSerialParams.ByteSize = 8; // Setting ByteSize = 8
dcbSerialParams.StopBits = ONESTOPBIT; // Setting StopBits = 1
dcbSerialParams.Parity = NOPARITY; // Setting Parity = None
if (SetCommState(hComm, &dcbSerialParams) == FALSE){
cout << "Error setting CommState" << endl;
return 0;
SetCommState(hComm, &dcbSerialParams);
cout << "CommState set successfully" << endl << endl;
cout << "+---CommState Parameters---+" << endl;
cout << "Baudrate = " << dcbSerialParams.BaudRate << endl;
cout << "ByteSize = " << static_cast<int>(dcbSerialParams.ByteSize) << endl; // static_cast so an int is printed rather than a char
cout << "StopBits = " << static_cast<int>(dcbSerialParams.StopBits) << endl; // static_cast so an int is printed rather than a char
cout << "Parity = " << static_cast<int>(dcbSerialParams.Parity) << endl; // static_cast so an int is printed rather than a char
cout << "+--------------------------+" << endl;
/*------------------------------------ Setting Timeouts --------------------------------------------------*/
timeouts.ReadIntervalTimeout = 50;
timeouts.ReadTotalTimeoutConstant = 50;
timeouts.ReadTotalTimeoutMultiplier = 10;
timeouts.WriteTotalTimeoutConstant = 50;
timeouts.WriteTotalTimeoutMultiplier = 10;
if (SetCommTimeouts(hComm, &timeouts) == FALSE){
cout << "Error setting timeouts" << endl;
return 0;
SetCommTimeouts(hComm, &timeouts);
cout << "Timeouts set successfully." << endl;
cout << "+--------------------------+" << endl;
return 1;
My Read function looks like this:
void serialCommunication::serialRead(void){
bool readStatus;
bool purgeStatus = 0;
bool correctData = 0;
cout << "Waiting for Data..." << endl; // Programm waits and blocks Port (like Polling)
readStatus = WaitCommEvent(hComm, &dwEventMask, 0);
if (readStatus == FALSE){
cout << "Error in setting WaitCommEvent." << endl;
cout << "Data received." << endl;
readStatus = ReadFile(hComm, &TempChar, sizeof(TempChar), &NoBytesRead, 0);
SerialBuffer += TempChar; // add tempchar to the string
}while (NoBytesRead > 0);
SerialBuffer.pop_back(); // Delete last sign in buffer, otherwise one "0" too much shows up, for example "23900" instead of "2390"
cout << endl << SerialBuffer << endl;
SerialBuffer = ""; // Reset string
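The pop_back() in the read function compensates for a do/while loop that appends TempChar once more after the final ReadFile call has returned zero bytes, which would explain the duplicated last character ("23900" instead of "2390"). A minimal sketch of a loop that appends only after a successful read is shown below. It is portable C++ rather than Windows code: `readByte` is a hypothetical callback standing in for the one-byte ReadFile call, and `readUntilIdle` is a hypothetical name.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// `readByte` is a hypothetical stand-in for a one-byte ReadFile call: it
// stores a byte in its argument and returns the number of bytes read
// (0 once the read times out with nothing available).
std::string readUntilIdle(const std::function<std::size_t(char&)>& readByte) {
    std::string buffer;
    char ch = 0;
    // Check the byte count first and append second, so the stale character
    // from the previous iteration is never added after a zero-byte read.
    while (readByte(ch) > 0) {
        buffer += ch;
    }
    return buffer;
}
```

With this ordering no trailing character has to be removed afterwards, and an empty read yields an empty string instead of calling pop_back() on an empty buffer.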
So at some point, my µC sends the string "Init complete...!\r\n" after initializing some things. This works well.
[Image: Init complete proof]
Now after that, the communication produces errors. I am getting data I should not receive. The µC can only send data if a specific string is sent to it by the PC. While debugging I could see that the µC never receives this specific string and therefore never sends data. In the following picture, I show you the gibberish I am constantly receiving, though.
[Image: Receiving gibberish]
/EDIT: I am constantly receiving the same gibberish
The funny thing is, I even receive that data when the µC is completely switched off (the serial cables are still connected). So there has to be some data at the port which just is not deleted. I tried restarting the PC as well, but it didn't help either.
I will also show you my while loop on PC:
while (testAbbruch != 1){
pointer = acMessung(anzahlMessungen, average); // measurement with external multimeter
cout << endl;
cout << "Average: " << average << endl << endl;
if (average >= 30){
testAbbruch = 1; // there won't be a next while iteration
befehl = "stopCalibration\r\n";
cout << "Aktion: ";
std::getline (cin, befehl);
befehl = "increment"; //for debugging
if (befehl == "increment"){
befehl.append("\r\n"); // adding it, so the µC can detect the string correctly
serialTest.serialRead(); // µC has to answer
else if(befehl == "decrement"){
befehl.append("\r\n"); // adding it, so the µC can detect the string correctly
serialTest.serialRead(); // µC has to answer
befehl = ""; // clear the string for the next call
I know my program is far from perfect, but if I understood the serial Communication with Windows correctly, the buffer is deleted while reading.
Is there any clue you could give me?
EDIT// I just wrote a program that expects one of two inputs: One input is called "increment" the other one is called "decrement". Those inputs are sent to the µC via the serial communication port. Every time I try to send "increment" and instantly after that I am reading from the port, I receive the weird data from this picture. Now, every time I try to send "decrement" and instantly after that I am reading from the port, I receive the weird data from that picture.
So my guess is that the data somehow is changed and then looped back to the PC? But why and how?!

CreateFile() returns INVALID_HANDLE_VALUE but GetLastError() is ERROR_SUCCESS

I am opening a serial port using CreateFile(). I've got a test case (too complicated to redistribute) that consistently causes CreateFile() to return INVALID_HANDLE_VALUE and GetLastError() to return ERROR_SUCCESS. By the looks of it, this bug only occurs if one thread opens the port at the exact same time that another thread closes it. The thread opening the port is the one that runs into this problem.
I don't know if this makes a difference, but later on in the code I associate the port with a CompletionPort using CreateIoCompletionPort.
Here is my code:
HANDLE port = CreateFile(L"\\\\.\\COM1",
0, // must be opened with exclusive-access
0, // default security attributes
0); // hTemplate must be NULL for comm devices
DWORD errorCode = GetLastError();
cerr << L"CreateFile() failed with error: " << errorCode << endl;
I'm pretty sure this sort of thing should not happen. Am I doing anything wrong? How do I get the API to return a correct result?
MORE DETAILS: This code is taken from a serial-port library I've developed: JPeripheral
Here is the actual (unsanitized) source-code:
JLong SerialChannel::nativeOpen(String name)
cerr << "nativeOpen(" << name << ")" << endl;
wstring nameWstring = name;
HANDLE port = CreateFile((L"\\\\.\\" + nameWstring).c_str(),
0, // must be opened with exclusive-access
0, // default security attributes
0); // hTemplate must be NULL for comm devices
cerr << "nativeOpen.afterCreateFile(" << name << ")" << endl;
cerr << "port: " << port << ", errorCode: " << GetLastError() << endl;
DWORD errorCode = GetLastError();
switch (errorCode)
throw PeripheralNotFoundException(jace::java_new<PeripheralNotFoundException>(name, Throwable()));
throw PeripheralInUseException(jace::java_new<PeripheralInUseException>(name, Throwable()));
throw IOException(jace::java_new<IOException>(L"CreateFile() failed with error: " +
// Associate the file handle with the existing completion port
HANDLE completionPort = CreateIoCompletionPort(port, ::jperipheral::worker->completionPort, Task::COMPLETION, 0);
if (completionPort==0)
throw AssertionError(jace::java_new<AssertionError>(L"CreateIoCompletionPort() failed with error: " +
cerr << "nativeOpen.afterCompletionPort(" << name << ")" << endl;
// Bind the native serial port to Java serial port
SerialPortContext* result = new SerialPortContext(port);
cerr << "nativeOpen.afterContext(" << name << ")" << endl;
return reinterpret_cast<intptr_t>(result);
Here is the actual output I get:
port: 00000374, errorCode: 0
port: FFFFFFFF, errorCode: 0 CreateFile() failed with error: The operation completed successfully.
HANDLE port = CreateFile(...);
cerr << "nativeOpen.afterCreateFile(" << name << ")" << endl;
cerr << "port: " << port << ", errorCode: " << GetLastError() << endl;
DWORD errorCode = GetLastError();
Writing to cerr invokes WinAPI calls under the hood, which resets the thread's last-error value returned by GetLastError(). Fix:
HANDLE port = CreateFile(...);
int err = GetLastError();
// etc, use err instead...
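The same capture-first rule applies to POSIX errno, which makes it easy to show in a portable form. The sketch below swaps in open() and errno for CreateFile() and GetLastError(); it is an analogy rather than Windows code, and `openAndCaptureError` is a hypothetical helper name.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <iostream>
#include <unistd.h>

// Tries to open `path` read-only. Returns 0 on success, or the errno value
// captured immediately after the failed call.
// `openAndCaptureError` is a hypothetical helper.
int openAndCaptureError(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        const int savedError = errno;  // capture before any other call
        // Logging may make library or system calls that overwrite errno,
        // so only the saved copy is used from here on.
        std::cerr << "open(" << path << ") failed with error " << savedError << '\n';
        return savedError;
    }
    close(fd);
    return 0;
}
```

For a path that does not exist, the function returns ENOENT regardless of what the logging line does to errno afterwards.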
