IOCP, AcceptEx, overlapped and WSAEINVAL - windows

I have a server that uses IOCPs, sockets and overlapped I/O. Initially everything is just wonderful. The listening socket hands off to a newly created socket using AcceptEx on an IOCP. I can handle thousands of connections just fine.
When the server process falls behind in processing, it will close and disconnect the listening port. When it catches back up, it will reestablish the listening port with a new IOCP.
The issue I have run into is that after reestablishing the listening port, when a new connection arrives, I attempt to accept it using the exact same code path as above. The AcceptEx call fails with WSAEINVAL.
I know I have left out some details (and the devil is always in the details, no?) -- but would appreciate assistance on where I should be looking.
If a curious soul would like more information, I'd be happy to supply.

It's hard to guess at what your problem might be given you don't show any source code, but...
There's no need to close the listening socket, simply stop posting new AcceptEx() calls and the server will not be able to accept any new connections.
If you really want to close the listening socket as well, then do not close the IOCP, and make sure you use the same IOCP when you recreate the listening socket.

I will answer my own question, because I have figured out what the underlying issue was. One thing that was critical to the issue, but was not stated in the problem statement was that the server had sub-processes.
It turns out that while the default behavior in Windows is to not have handles inherited by sub-processes, the behavior of Winsock is the opposite: socket handles are inherited by sub-processes unless explicitly set to no-inherit on creation.
Creating sockets with non-inheritable handles solves this problem. I hope that this helps someone out there that runs into this issue.
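To make the fix concrete, here is a minimal sketch in Python rather than the original Winsock code (which was not posted). It assumes Python 3.4 or later, where `socket.set_inheritable()` is available; on Windows that call clears the handle's inherit flag. In C, the usual equivalents are passing `WSA_FLAG_NO_HANDLE_INHERIT` to `WSASocket()` or clearing `HANDLE_FLAG_INHERIT` with `SetHandleInformation()`. The helper name `make_non_inheritable_listener` is made up for illustration.

```python
import socket


def make_non_inheritable_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket whose handle child processes do not inherit.

    Hypothetical helper for illustration; port 0 lets the OS pick a free port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Python 3.4+ already creates non-inheritable sockets by default;
    # the explicit call documents the intent and guards against changes.
    sock.set_inheritable(False)
    sock.bind((host, port))
    sock.listen()
    return sock


if __name__ == "__main__":
    listener = make_non_inheritable_listener()
    print("inheritable:", listener.get_inheritable())
    listener.close()
```

With the handle marked non-inheritable, a sub-process spawned later does not hold its own copy of the listening socket, so closing it in the parent really releases it.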

Related

ZeroMQ connect to physically non connected socket

I'm trying to understand whether ZeroMQ can connect a PUB or SUB socket to an IP address that does not exist yet. Will it automatically connect when this IP address appears in the future?
Or should I check for its existence first before connecting?
Is the behavior the same for PUB and SUB sockets?
The answer is buried somewhat in the manual, here:
for most transports and socket types the connection is not performed immediately but as needed by ØMQ. Thus a successful call to zmq_connect() does not mean that the connection was or could actually be established. Because of this, for most transports and socket types the order in which a server socket is bound and a client socket is connected to it does not matter. The ZMQ_PAIR sockets are an exception, as they do not automatically reconnect to endpoints.
As that quote says, the order of binding and connecting does not matter. This is extremely useful, as you don't then have to worry about start-up order; the client will be quite happy waiting for a server to come online, able to run other things without blocking on the connect.
Other Things That Are Useful
The direction of bind/connect is independent of the pattern used on top; thus a PUB socket can be connected to a SUB socket that has been bound to an interface (whereas the other way round might feel more natural).
The other thing that I think a lot of people don't realise is that you can bind (or connect) sockets more than once, to different transports. So a PUB socket can quite happily send to SUB clients that are both local in-process threads, other processes on the same machine via ipc, and to clients on remote machines via tcp.
There are other things that you can do. If you use the ZMQ_FD option from here, you can get ZMQ_EVENT notifications in some way or other (I can't remember the detail) which will tell you when the underlying connection has been successfully made. Using the file descriptor allows you to include that in a zmq_poll() (or some other reactor like epoll() or select()). You can also exploit the heartbeat functionality that a socket can have, which will tell you if the connection dies for some reason or other (e.g. crashed process at the other end, or network cable fallen out). Use of a reactor like zmq_poll(), epoll() or select() means that you can have a pure actor model event-driven system, with no need to routinely check up on status flags, etc.
Using these facilities in ZMQ allows for the making of very robust distributed applications/systems that know when various bits of themselves have died, come back to life, taken a network-out holiday, etc. For example, just knowing that a link is dead perhaps means that a node in your distributed app changes its behaviour somehow to adapt to that.

Why might an EventMachine outbound data buffer stop sending and just fill up forever (while other connections can still send)

I have an EventMachine server sending TCP data down to a Mac client (via GCDAsyncSocket). It always works flawlessly for a while, but inevitably the server suddenly stops sending data on a connection-by-connection basis. The connection is still maintained, and the server still receives data from the client, but it doesn't go the other way.
When this happens, I've discovered via connection#get_outbound_data_size that the connection send buffer is filling up infinitely (via #send_data) and not being sent to the client.
Are there specific (and hopefully fixable) reasons why this might occur? The reactor keeps humming along, and other active connections to the server continue working fine (though they sometimes fall into buffer hell as well).
I see one reason at least: the remote client is no longer reading data from its side of the TCP connection (with a recv() call or whatever).
Then the scenario is: the receiving TCP buffer on the client side becomes full, and the OS can no longer accept TCP packets from its peer, since it has nowhere to queue them. As a consequence, the sending TCP buffer on the server side becomes full too, as your application continues to send packets on the socket! Soon your server is no longer able to write to the socket, since the send() system call will either:
block indefinitely (waiting for the buffer to empty enough for the new packet), or
return with an EWOULDBLOCK error (if you configured your socket as non-blocking).
I usually run into this kind of situation in a test environment when I put a breakpoint in my code on the client side.
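The scenario above can be reproduced on a single machine. The following is a rough sketch in plain Python sockets (not EventMachine or GCDAsyncSocket): a loopback receiver accepts a connection but never calls recv(), and a non-blocking sender writes until the OS refuses more data. The function name `fill_send_buffer` and its parameters are made up for illustration, and the number of bytes accepted before blocking depends on the OS buffer sizes.

```python
import socket


def fill_send_buffer(chunk_size=65536, max_bytes=1 << 30):
    """Send to a peer that never reads; return (bytes_accepted, blocked).

    Hypothetical helper: 'blocked' is True once send() would block, which
    is the non-blocking equivalent of EWOULDBLOCK.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sender.connect(listener.getsockname())
    receiver, _ = listener.accept()  # accepted, but recv() is never called

    sender.setblocking(False)
    chunk = b"x" * chunk_size
    sent = 0
    blocked = False
    try:
        while sent < max_bytes:
            try:
                sent += sender.send(chunk)
            except BlockingIOError:
                # Both the peer's receive buffer and our send buffer are full.
                blocked = True
                break
    finally:
        sender.close()
        receiver.close()
        listener.close()
    return sent, blocked


if __name__ == "__main__":
    accepted, blocked = fill_send_buffer()
    print(f"OS accepted {accepted} bytes before blocking: {blocked}")
```

An event-driven framework hides this by queueing outgoing data in user space instead of failing, which is why the symptom shows up as an outbound buffer that keeps growing rather than as an error.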
A patch was applied to GCDAsyncSocket on March 23 that prevents the reads from stopping. Did this patch solve your problem?

Handling WSAENETDOWN

I'm new to Winsock programming and I'm trying to learn how to use asynchronous sockets with WSAEventSelect(). I'm a bit unsure on how to handle a WSAENETDOWN error.
What exactly happens when I get a WSAENETDOWN error? Are my sockets and event objects automatically destroyed? What sort of cleanup do I need to do? What is the proper way of handling a WSAENETDOWN error if I'd like to try to reconnect? Is it ok to call connect() again, should I close and recreate my sockets and event objects, or should I call WSACleanup() and start over from scratch?
WSAENETDOWN means that a network error occurred on this socket and that sending and receiving data is no longer possible. To handle this error you should close this one socket. There is no need to close other sockets or call WSACleanup(), as other sockets can still be functional (think about a computer with two network cards where one network is down but the other still works). The sockets and events are not destroyed automatically.
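As a rough illustration of "close this one socket and leave the rest alone", here is a small sketch using Python's standard socket module rather than Winsock and WSAEventSelect(). The helper name `replace_failed_socket` is hypothetical, and the sketch assumes the caller has already seen the error on `failed_sock`.

```python
import socket


def replace_failed_socket(failed_sock, address, timeout=5.0):
    """Close only the socket that reported the error and open a fresh one.

    Hypothetical helper: other sockets in the process are left untouched,
    and no library-wide cleanup is performed.
    """
    # The socket is not destroyed automatically after an error,
    # so it has to be closed explicitly.
    failed_sock.close()
    # Reconnect with a brand-new socket; this raises OSError if the
    # network is still down, so callers may want to retry later.
    return socket.create_connection(address, timeout=timeout)
```

A caller would typically wrap this in its own retry policy, since the network may still be down at the first attempt.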

Seeking info on how to use the VB6 Winsock, flow of events, etc

I'm using the MS Winsock control in VB6 and I want to understand things like "when does the Server close the connection (triggering the Winsock_Close() event)?", and a related question: "How do you know when all the data from a Post has been returned?"
More info:
I should have mentioned: I've already read the MSDN description, etc., but it doesn't actually explain what's happening. E.g., it explains that the Close() event fires when the Server ends the connection, but doesn't explain what would cause the connection to end, whether a broken connection would trigger a Close event, etc.
And none of the MSDN descriptions explain how to know when all the data has arrived. (I suspect it's the Close event firing.)
You might want to try out the following walkthrough
tcp.oflameron.com/
You can find the complete code here
If you have any Qs in particular, plz ask here...
GoodLUCK!!
- CVS
Using the Winsock Control at http://msdn.microsoft.com/en-us/library/aa733709(VS.60).aspx
MSDN Search of "Winsock control" at http://social.msdn.microsoft.com/Search/en-US?query=Winsock+control&ac=8
Documentation Lacks
The documentation will not provide the information you are asking for. This is an ActiveX control that allows you to connect computers through TCP/IP protocol stacks.
The information you want applies to how these computers "talk" (the protocol). That totally depends on the server application and client application that are communicating. For instance, if I am connecting to the FTP Service of another computer, the server will not close the connection until I send the appropriate command or until the server detects an idle connection. On the other hand, some services will close the connection on any invalid command; SMTP servers in particular do this to tighten security.
You need to check out the documentation of the service you are connecting with. The documentation will tell you how to send commands, the command format, response codes, how commands are acknowledged, and so on.
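For one common case, a minimal sketch: assuming the service signals "end of response" by closing the connection (an assumption about the protocol, not something the Winsock control guarantees), the client simply reads until the peer closes. This is written with Python's standard socket module rather than the VB6 control, and `read_until_closed` is a made-up helper name.

```python
import socket


def read_until_closed(sock, bufsize=4096):
    """Collect data until the peer closes its side of the connection.

    Hypothetical helper. recv() returning b"" means an orderly close by
    the peer, roughly what the Close event reports in the VB6 control.
    """
    chunks = []
    while True:
        data = sock.recv(bufsize)
        if not data:
            break  # peer closed: nothing more will arrive
        chunks.append(data)
    return b"".join(chunks)
```

If the protocol keeps the connection open after a response, this loop would wait forever, and the end of the data has to be detected from the protocol itself (a length field, a terminator line, a response code).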
SAMPLE: VBFTP.EXE: Implementing FTP Using WinInet API from VB at http://support.microsoft.com/kb/175179

WinSock best accept() practices

Imagine you have a server which can handle only one client at a time. The server uses WSAAsyncSelect to be notified of new connections. In this case, what is the best way of handling FD_ACCEPT messages:
A > Accept the connection attempt right away but queue the client until its turn?
B > Do not accept the next connection attempt until we are done serving the currently connected client?
What do you guys think is the most efficient?
Here I describe the cons that I'm aware of for both options. Hopefully this will help you decide.
A)
Upon a new client connection, it could send tons of data making your receive buffer become full, which causes unnecessary packets to be transmitted (see this). If you don't plan to receive any data from the client, shutdown receiving on that socket, thus if the client sends any data after that, the connection is reset. Moreover, if your protocol has strict rules, disconnect the client.
If the connection stays idle for too long, the system might disconnect it. To solve this, use setsockopt to set SO_KEEPALIVE on each client socket.
B)
If you don't accept the connection after a certain period (I guess the default is 60 seconds), it will time out. In a normal (or most common) situation this indicates the server is overloaded, and thus unable to answer in time. However, if the client is also designed by you, make the socket non-blocking, try to connect, then manage the timeout as you wish.
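Two of the suggestions above can be sketched briefly. This uses Python's standard socket module rather than Winsock with WSAAsyncSelect, and the helper names `enable_keepalive` and `connect_with_timeout` are made up for illustration. The first sets SO_KEEPALIVE on an accepted client socket (option A); the second lets a client you control pick its own connect timeout (option B). Python's `settimeout()` puts the socket in non-blocking mode internally and waits up to the given time.

```python
import socket


def enable_keepalive(sock):
    """Turn on TCP keepalive probes for an already accepted client socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


def connect_with_timeout(address, timeout_seconds):
    """Connect, giving up after a caller-chosen timeout.

    Hypothetical helper: raises OSError (including a timeout error)
    if the connection cannot be made in time.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout_seconds)
    try:
        sock.connect(address)
    except OSError:
        sock.close()
        raise
    sock.settimeout(None)  # back to blocking mode once connected
    return sock
```

The keepalive probe interval is a system setting by default, so how quickly a dead peer is detected depends on the platform's configuration.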
Ask yourself: what do you want the user experience to be at the other end? Do you want them to be stuck? Do you want them to time out? Do you want them to get a polite message?

Resources