Handling WSAENETDOWN - winapi

I'm new to Winsock programming and I'm trying to learn how to use asynchronous sockets with WSAEventSelect(). I'm a bit unsure on how to handle a WSAENETDOWN error.
What exactly happens when I get a WSAENETDOWN error? Are my sockets and event objects automatically destroyed? What sort of cleanup do I need to do? What is the proper way of handling a WSAENETDOWN error if I'd like to try to reconnect? Is it ok to call connect() again, should I close and recreate my sockets and event objects, or should I call WSACleanup() and start over from scratch?

WSAENETDOWN means that on this socket a network error occured and sending and receiving data is not possible anymore. To handle this error you should close this one socket. There is no need to close other sockets or WASCleanup as other sockets can still be functionable (think about a computer with two network cards where one network is down but the other still functions). The sockets and events are not destroyed automatically.

Related

ZeroMQ connect to physically non connected socket

I'm trying to understand if ZeroMQ can connect pub or sub socket to non existing (yet) ip address. Will it automatically connect when this IP address will appear in the future?
Or should I check up existance first before connecting?
Is the behavior same for PUB and SUB sockets?
The answer is buried somewhat in the manual, here:
for most transports and socket types the connection is not performed immediately but as needed by ØMQ. Thus a successful call to zmq_connect() does not mean that the connection was or could actually be established. Because of this, for most transports and socket types the order in which a server socket is bound and a client socket is connected to it does not matter. The ZMQ_PAIR sockets are an exception, as they do not automatically reconnect to endpoints.
As that quote says, the order of binding and connecting does not matter. This is extremely useful, as you don't then have to worry about start-up order; the client will be quite happy waiting for a server to come online, able to run other things without blocking on the connect.
Other Things That Are Useful
The direction of bind/connect is independent of the pattern used on top; thus a PUB socket can be connected to a SUB socket that has been bound to an interface (whereas the other way round might feel more natural).
The other thing that I think a lot of people don't realise is that you can bind (or connect) sockets more than once, to different transports. So a PUB socket can quite happily send to SUB clients that are both local in-process threads, other processes on the same machine via ipc, and to clients on remote machines via tcp.
There are other things that you can do. If you use the ZMQ_FD option from here, you can get ZMQ_EVENT notifcations in some way or other (I can't remember the detail) which will tell you when the underlying connection has been successfully made. Using the file descriptor allows you to include that in a zmq_poll() (or some other reactor like epoll() or select()). You can also exploit the heartbeat functionality that a socket can have, which will tell you if the connection dies for some reason or other (e.g. crashed process at the other end, or network cable fallen out). Use of a reactor like zmq_poll(), epoll() or select() means that you can have a pure actor model event-driven system, with no need to routinely check up on status flags, etc.
Using these facilities in ZMQ allows for the making of very robust distributed applications/system that know when various bits of themselves have died, come back to life, taken a network-out holiday, etc. For example, just knowing that a link is dead perhaps means that a node in your distributed app changes its behaviour somehow to adapt to that.

ZeroMQ REQ/REP server error handling

I am trying to use the ZeroMQ rep/req and cannot figure out how to handle server side errors. Look at the code from here:
socket.bind("tcp://*:%s" % port)
while True:
# Wait for next request from client
message = socket.recv()
print "Received request: ", message
time.sleep (1)
socket.send("World from %s" % port)
My problem is what happens if the client calls socket.send() and then hangs or crashes. Wouldn't the server just get stuck on socket.send() or socket.recv() forever?
Note that it is not a problem with TCP sockets. With TCP sockets I can simply break the connection. With ZMQ, the connections are implicitly managed for me and I don't know if it is possible to break a 'session' or 'connection' and start over.
You can terminate ZMQ sockets much the same way you terminate TCP sockets.
socket.close()
If you need to wait on a message but only up for a finite amount of time you can pass a timeout flag to socket.recv(timeout=1024) and then handle the timeout error case the same way you would when a TCP socket timeouts or disconnects. If you need to manage several sockets all of which may be in an error state then the Poller class will let you accomplish this.
The ZMQ Z-guide offers lots of good hints on how to structure your services to handle different scenarios.
I think chapter 4 can be of interest to you, especially the Lazy Pirate Pattern.
Check out the examples of Lazy Pirate Server and Lazy Pirate Client.
In general,
Make sure you setsockopt() on the socket such that send and recv will not block forever. (Temporary blocking – be it client or server – is OK, but infinite blocking is bad because your application cannot do anything else)
In the event that any of the I/O got an error,
If you are the client, close() the current socket and re-create a new one to establish a new connection
If you are the server, there's nothing else to do, you simply are waiting for a new connection. You will want to explore the Poller class.

IOCP, AcceptEx, overlapped and WSAEINVAL

I have a server that uses IOCPs, sockets and overlapped. Initially everything is just wonderful. The listening socket hands off to a newly created socket using AcceptEx on an IOCP. I can handle thousands of connections just fine.
When the server process falls behind in processing, it will close and disconnect the listening port. When it catches back up, it will reestablish the listening port with a new IOCP.
The issue I have run into is that on after reestablishing the listening port, and a new connection arrives, I attempt to accept using the exact same code path as above. The AcceptEx fails with WSAEINVAL.
I know I have left out some details (and the devil is always in the details, no?) -- but would appreciate assistance on where I should be looking.
If a curious soul would like more information, I'd be happy to supply.
It's hard to guess at what your problem might be given you don't show any source code, but...
There's no need to close the listening socket, simply stop posting new AcceptEx() calls and the server will not be able to accept any new connections.
if you really want to close the listening socket as well then do not close the IOCP and make sure you use the same IOCP when you recreate the listening socket.
I will answer my own question, because I have figured out what the underlying issue was. One thing that was critical to the issue, but was not stated in the problem statement was that the server had sub-processes.
It turns out that while the default behavior in windows is to not have handles inherited by sub-processes, the behavior of winsock is the opposite: handles are inherited by sub-processes unless explicitly set to no-inherit on creation.
Creating sockets with non-inheritable handles solves this problem. I hope that this helps someone out there that runs into this issue.

boost pion comet-like httpserver

I'm trying to efficiently implement comet-like functionality using HTTPServer class of boost::pion.
Basically, in my 'handleURI' function, I would like to postpone returning results to the client, until the server is ready to respond (for instance, until another user has sent a message to the first user, to use a simple comet 'hello world' application).
What should I do? Put the state on the stack, and exit silently, without creating a HTTPResponseWriter?
Cheers!
Setup a timeout ASIO event for your connection so that you can reap the connection after 20min or something reasonable like that. I don't know about Boost Pion, but in ASIO you'd want to register a read handler that catches when the connection closes and a timeout handler to alert you for when the connection has actually timed out. Enable TCP keep alives on your socket to detect when the socket should be reaped in the event that it just vanishes (though tcp keep alives aren't a guarantee so don't rely exclusively on them - not all clients support tcp keep alives). As for the timer, check out the following timer example:
https://github.com/sean-/Boost.Examples/blob/master/asio/timer/timer.cc

WinSock best accept() practices

Imagine you have a server which can handle only one client at a time. The server uses WSAAsyncSelect to be notified of new connections. In this case, what is the best way of handling FD_ACCEPT messages:
A > Accept the connection attempt right away but queue the client until its turn?
B > Do not accept the next connection attempt until we are done serving the currently connected client?
What do you guys think is the most efficient?
Here I describe the cons that I'm aware for both options. Hopefully this might help you decide.
A)
Upon a new client connection, it could send tons of data making your receive buffer become full, which causes unnecessary packets to be transmitted (see this). If you don't plan to receive any data from the client, shutdown receiving on that socket, thus if the client sends any data after that, the connection is reset. Moreover, if your protocol has strict rules, disconnect the client.
If the connection stays idle for too long, the system might disconnect it. To solve this, use setsockopt to set SO_KEEPALIVE on each client socket.
B)
If you don't accept the connection after a certain period (I guess the default is 60 seconds), it will timeout. In a normal (or most common) situation this indicates the server is overloaded, thus unable to answer in time. However, if the client is also designed by you, make the socket non-blocking, try to connect, then manage the timeout as you wish.
Ask yourself: what do you want the user experience to be at the other end? Do you want them to be stuck? Do you want them to time out? Do you want them to get a polite message?

Resources