Connection pooling: interaction of MaxIdleConnsPerHost and IdleConnTimeout in http.Transport (Go)

I am trying to write a heavy-duty proxy to a set of web APIs in Go. To prevent port exhaustion, I have decided not to use the DefaultClient and instead to create a customized http.Client instance. There are many interesting settings in http.Transport that I can play around with.
I have come across the MaxIdleConnsPerHost and IdleConnTimeout fields and I have this question.
If I increase the value of MaxIdleConnsPerHost, there will be more idle connections, but are they reusable idle connections? In other words, to build a decent connection pool, should I increase MaxIdleConnsPerHost together with the IdleConnTimeout value accordingly, or do the two settings work against each other?

Yes, idle connections are reusable, since they are keep-alive connections. But to make Go honour the reusability of keep-alive connections, you need to ensure two things in your application:
Read until the response is complete (e.g. ioutil.ReadAll(resp.Body))
Call Body.Close()
Here's a link with more explanation.
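For illustration, a minimal sketch of such a customized client (the pool sizes, timeout, and URL are placeholders, not recommendations):

    package main

    import (
        "io"
        "net/http"
        "time"
    )

    func main() {
        client := &http.Client{
            Transport: &http.Transport{
                MaxIdleConns:        100,              // cap on idle connections across all hosts
                MaxIdleConnsPerHost: 20,               // cap on idle connections per upstream host
                IdleConnTimeout:     90 * time.Second, // how long an unused connection stays in the pool
            },
        }

        resp, err := client.Get("https://example.com/")
        if err != nil {
            panic(err)
        }
        // Drain and close the body so the connection can return to the idle pool.
        io.Copy(io.Discard, resp.Body)
        resp.Body.Close()
    }

Raising MaxIdleConnsPerHost only enlarges the pool, while IdleConnTimeout decides how long an unused pooled connection survives, so the two settings work together rather than against each other.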

Related

Something like the ZeroMQ REP/REQ model, but without having to reply?

Currently I have a REP/REQ model up and running in my code.
However, I do not need either side to send replies, so replies are just wasting time. I don't know if that matters in the real world or not.
Basically it looks like this.
Client PCs - Connect - REQ
These guys all connect to the server and update it on a regular basis with info they have. They don't care if the server missed a particular message, nor do they need any info back from the server.
There are many of these guys, but not an excessive number: let's say between 10 and 100, all hitting the same server. Well, probably not; more likely they will be in groups, with one group hitting one server and another group hitting another. Clients would send messages several times a second, but not much more than that. I have not really done any timing, and I don't know how to time things on my computer at better than 1-2 ms resolution, so I don't know what to expect or what is feasible in terms of performance and how many REQ clients can be served by one REP server.
Server PC - Bind - REP
This guy sits running in a loop on his own separate thread, waiting for REQs to come in. He sends replies to the REQs because he has to, not because he really wants or needs to.
Alternate Models
From some googling, it seems that PUSH/PULL is recommended if you just want to send messages and don't care about replies.
However, I couldn't figure out how to fit that into my architecture, because the binds and connects seem to be reversed from what I need: I would like my bind to be on the server, because the client "connect" guys are not always available to be reached.
Solutions
1) A good alternate model
A good alternate model that works and is relatively simple would be great. I'm not sure there really is one, but apart from REP/REQ and PUB/SUB I don't know much about the other models.
2) Am I worrying about nothing?
If replies from REP to REQ are always going to be really fast, and the reception of those replies by REQ is also really fast, then I guess I'm worrying about nothing. That would be good to know, so feel free to tell me if this is the case.
The Connection question
I don't really understand what connecting sockets does.
On a client REQ socket, should I connect at the start of each loop iteration, before sending that one single message? Or should I connect once before the loop, on the socket I also created before the loop?
I also don't understand what this means in terms of reliability: do I have to make special checks about the connected status and reconnect myself, or is that done automatically?
To sum up
I have a "global" context.. created at the start, disposed of at the end
This daddy context has 1 or 2 sockets (connected to the same address, including port) - I'm still debugging this dual socket on the same address thing so I'm not sure if that is ok or it just doesn't work that way - clarification would be nice
These context(s) are lazy initialized and outside the loop scope, so we are not recreating sockets on a regular basis
connect calls for the sockets occur currently outside of the loop scope, but I'm not sure if it is not better to have them inside the loop scope.
I think I'm getting mixed up here.. I think the dual sockets are on my PUB/SUB model .. 1 PUB with 2 SUB sockets on each client, but anyhow please let me know if that would be a problem as well.
If you do not need Request-Reply, do not use it.
Request-Reply is generally slow because you need a round trip to the server for every message. This means you pay twice the network latency, i.e. the time a network packet needs to travel over the network. That does not matter when traffic is low, but it becomes a bottleneck when traffic is high, for example multiple messages per second.
As you already mentioned, Push-Pull is a valid alternative for one-way traffic. With Push-Pull you create a Pull socket on the server and bind it to an endpoint (this is similar to the Reply socket), and you create a Push socket on the clients and connect it to the server endpoint (this is similar to the Request socket).
If you send multiple messages from the client to the same server, you should connect only once. Setting up a network connection is a costly operation, because it requires multiple network round trips, at least for TCP.
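For a concrete picture, here is a minimal Push-Pull sketch in Go (assuming the github.com/pebbe/zmq4 bindings; the endpoint and message text are placeholders):

    package main

    import zmq "github.com/pebbe/zmq4"

    // runServer binds a PULL socket and receives one-way messages
    // (it plays the role the REP socket played, minus the replies).
    func runServer() {
        pull, err := zmq.NewSocket(zmq.PULL)
        if err != nil {
            panic(err)
        }
        defer pull.Close()
        if err := pull.Bind("tcp://*:5555"); err != nil {
            panic(err)
        }
        for {
            msg, err := pull.Recv(0)
            if err != nil {
                panic(err)
            }
            _ = msg // handle the update; no reply is ever sent
        }
    }

    // runClient connects a PUSH socket once, then sends in a loop
    // (it plays the role of REQ, minus waiting for replies).
    func runClient(endpoint string) {
        push, err := zmq.NewSocket(zmq.PUSH)
        if err != nil {
            panic(err)
        }
        defer push.Close()
        if err := push.Connect(endpoint); err != nil { // connect once, outside the send loop
            panic(err)
        }
        for {
            if _, err := push.Send("status update", 0); err != nil {
                panic(err)
            }
        }
    }

    func main() { runServer() } // run runClient("tcp://server-host:5555") on the clients

Note the bind stays on the server and the connect stays on the clients, which matches the architecture described in the question.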

What is the correct structure of an async TCP server in Go?

I have a client sending around 500k requests (messages) per minute. Each message is around 200 bytes to 2 KB, and each message will be saved in a database (such as Couchbase).
What is the correct way to structure a Go TCP server in terms of cores, ports, connections and goroutines to handle this load?
Like JimB mentions, a TCP server shouldn't be difficult to stand up and start benchmarking for your needs. A simple layout is to wait for incoming TCP connections and then spawn a goroutine to handle each one. In that goroutine you can put whatever blocking code you want, in this case a write out to a DB. Here is a link to a simple example, with a sketch of that layout just below it:
Simple example
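Something like this minimal sketch (the port, the newline framing, and the saveToDB stand-in are all assumptions):

    package main

    import (
        "bufio"
        "log"
        "net"
    )

    // saveToDB is a stand-in for your Couchbase write.
    func saveToDB(msg []byte) { _ = msg }

    func handle(conn net.Conn) {
        defer conn.Close()
        scanner := bufio.NewScanner(conn) // assumes newline-delimited messages
        for scanner.Scan() {
            saveToDB(scanner.Bytes()) // copy the slice if you keep it past this call
        }
    }

    func main() {
        ln, err := net.Listen("tcp", ":9000")
        if err != nil {
            log.Fatal(err)
        }
        for {
            conn, err := ln.Accept()
            if err != nil {
                log.Print(err)
                continue
            }
            go handle(conn) // one goroutine per connection; blocking DB writes only stall that connection
        }
    }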
Once you get that working, you can make it more sophisticated if it doesn't meet your performance targets. Here is a nice example of using a worker pool to handle 1M HTTP requests per minute:
More sophisticated example
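The gist of the worker-pool idea, as a rough sketch (the pool and buffer sizes are illustrative, and saveToDB is again a stand-in):

    package main

    import "sync"

    type job []byte // one message to persist

    func saveToDB(msg job) { _ = msg } // stand-in for the real write

    // startPool launches a fixed number of workers that drain the jobs channel,
    // so the accept/read side stays responsive while workers absorb DB latency.
    func startPool(workers int, jobs <-chan job) *sync.WaitGroup {
        var wg sync.WaitGroup
        for i := 0; i < workers; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for msg := range jobs {
                    saveToDB(msg)
                }
            }()
        }
        return &wg
    }

    func main() {
        jobs := make(chan job, 1024) // buffering smooths out bursts
        wg := startPool(64, jobs)
        // ... connection handlers enqueue with: jobs <- msg ...
        close(jobs) // at shutdown
        wg.Wait()
    }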

How can I limit total concurrent subscriber connections to a ZeroMQ publisher endpoint?

When building a pub-sub service using ZeroMQ on a Linux system, is there any way to enforce concurrent subscriber limits?
For example, I might want to create a ZeroMQ publisher service on a resource-limited system, and want to prevent overloading the system by setting a limit of, say, 100 concurrent connections to the tcp publisher endpoint. After that limit is reached, all subsequent connection attempts from ZeroMQ subscribers would fail.
I understand ZeroMQ doesn't provide notifications about connect/disconnect, but I've been looking for socket options that might allow such limits -- so far, no luck.
Or is this something that should be handled at some other level, perhaps within the protocol?
Yes, ZeroMQ is a Can-Do messaging framework:
Besides the trivial Formal Communication Pattern Framework elements (the library primitives), the strongest power behind ZeroMQ is the ability to develop one's own messaging system(s).
In your case, it is enough to enrich the scene with a few additional things: a SUB-process -> PUB-process message-flow channel, so that the PUB-side process can count the number of SUB-process instances concurrently connected, and a disconnect step once the limit is reached (a step delegated rather "back" to the SUB-process side as a suicide move, since the classical PUB-process, intentionally, has no instrumentation to manage subscriptions).
Plus add some dynamics for the inter-node signalling to trigger re-counting, and/or equip the SUB-process side(s) with a self-advertising mechanism that pushes keep-alive signals to the PUB side. Expect this signalling to be a weak, informative-only indication: in the real world there are many collisions where a decentralised node simply fails to deliver a "guaranteed-delivery" message, and a well-designed, distributed, low-latency, high-performance system has to cope well with this reality and have self-healing state-recovery policies designed and built into its own behaviour.
[Fig. courtesy imatix/ZeroMQ]
The ZeroMQ library is better thought of as a very powerful LEGO-style toolbox for designing cool distributed systems than as a ready-made, batteries-included, stiff quasi-solution for just a few academic cases (well, it might be considered such, but only for some no-brainer's life, while our lives are much more colourful & teasing, aren't they?).
So, "How to?"
It is worth, definitely worth, a few days to read both of Pieter Hintjens' books, and a few weeks to shift one's mind to start designing with the full powers of ZeroMQ on one's side, plus a few Python add-on habits (an early zmq.Context() setup, and not forgetting a finally: aContext.term()).
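A rough sketch of such a keep-alive back-channel, counted on the PUB side (here in Go with the github.com/pebbe/zmq4 bindings; the endpoints, the 5-second expiry, and the payloads are all made up for illustration):

    package main

    import (
        "time"

        zmq "github.com/pebbe/zmq4"
    )

    func main() {
        pub, err := zmq.NewSocket(zmq.PUB)
        if err != nil {
            panic(err)
        }
        defer pub.Close()
        pub.Bind("tcp://*:5556")

        sig, err := zmq.NewSocket(zmq.PULL) // back-channel: subscribers push keep-alives here
        if err != nil {
            panic(err)
        }
        defer sig.Close()
        sig.Bind("tcp://*:5557")

        alive := map[string]time.Time{} // subscriber id -> last keep-alive seen
        for {
            if id, err := sig.Recv(zmq.DONTWAIT); err == nil {
                alive[id] = time.Now()
            }
            for id, t := range alive { // expire subscribers that went silent
                if time.Since(t) > 5*time.Second {
                    delete(alive, id)
                }
            }
            // len(alive) approximates the concurrent subscriber count; when it exceeds
            // the limit, publish a control message telling latecomers to disconnect.
            pub.Send("tick", 0)
            time.Sleep(100 * time.Millisecond)
        }
    }

As the answer stresses, treat this count as weak and informative-only; keep-alives can be lost, so the limit is advisory rather than exact.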
There's no way that I'm aware of to configure ZMQ to limit connections automatically... however, you have other options to accomplish what you're looking for. Perhaps the "traditional" way to accomplish this is with a second set of "network communication" sockets... perhaps REQ/REP from subscriber to publisher, asking for permission to connect.
You also have the option, depending on your version of ZMQ (I've never used it, and I can't find it in five minutes of searching, so I don't know how recent your version must be), to use XPUB/XSUB sockets, which support bi-directional communication. You can connect with XSUB, send a subscribe request, then receive a positive or negative response (you might have to play with your subscriber topics to communicate directly with just the single subscriber, I'm not sure), and react accordingly; a sketch of the XPUB side follows below.
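As a rough illustration of the XPUB half (again Go with the github.com/pebbe/zmq4 bindings; the endpoint is a placeholder, and note this counts topic subscriptions rather than TCP connections):

    package main

    import (
        "log"

        zmq "github.com/pebbe/zmq4"
    )

    func main() {
        xpub, err := zmq.NewSocket(zmq.XPUB)
        if err != nil {
            log.Fatal(err)
        }
        defer xpub.Close()
        if err := xpub.Bind("tcp://*:5558"); err != nil {
            log.Fatal(err)
        }

        subs := 0
        for {
            // An XPUB socket delivers subscription traffic upstream as messages:
            // first byte 1 = subscribe, 0 = unsubscribe, remainder is the topic.
            frame, err := xpub.RecvBytes(0)
            if err != nil {
                log.Fatal(err)
            }
            if len(frame) > 0 && frame[0] == 1 {
                subs++
            } else {
                subs--
            }
            log.Printf("active subscriptions: %d", subs)
        }
    }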
Either way, you'll be allowing a connection of some sort between the two systems and then either accepting it or terminating it depending on the situation. This could be less than ideal, since you'll have to carve out a little overhead to handle connections that you'll be refusing... say you're saturated at 100 clients and suddenly get 100 new subscribe requests; you may or may not be able to cope with that sort of burst traffic.
You can test out the overhead of alternative communication mediums... for example, you could publish a web service that reports subscriber status for clients to check first, but it may not be any better to have clients connecting that way.
If you're absolutely at the limit of your resources, you'll have to set up a second server to handle subscriber status:
Server 1 is your publisher. You could set it up with a PUB socket and a REP socket.
Server 2 is your status server. It has a REQ socket. Have it subscribe to something like "system-status" or some such thing as that. It will also have your mechanism for communicating with new subscribers, be that a ZMQ socket or a web service or whatever else.
A client will request status from your status server. The status server will send a request to your publisher, which will either increment its subscriber count and reply with success, or keep its subscriber count and reply with failure. This success or failure will be communicated back to the subscriber, which will use that information to connect or not.
Disconnections will have to be communicated in a similar way... and you'll have to use some sort of round-robin heartbeating to confirm clients weren't victims of catastrophic failure.
This will allow your publisher to make intelligent choices about whether it has resources or not. If you just want to set a static number, you don't even need the connection between the status server and the publisher; you can just keep count on the status server. But to ensure the overall health of the network, it's probably best not to go that simplistic route.
Anyway, those are just some ideas to accomplish what you're looking for. ZMQ gives you options with which to craft your solutions moreso than actual solutions.

Listening to multiple sockets: select vs. multi-threading

A server needs to listen to incoming data from several sockets (10-20). After some initializations, those sockets are created and do not change (i.e. no new sockets accepted, and none of them is expected to close during the lifetime of the server).
One option is to select() on all sockets, then deal with incoming data per socket (i.e. route to proper handling function).
Another option is to open one thread per socket and let each thread recv() and handle the input.
(The first option has the benefit of allowing a timeout, but this is not an issue in this case, since all the sockets are quite active.)
Assuming the following: a Windows server with enough memory that 20 MB (for the 20 threads) is a non-issue, is either of those options expected to be faster than the other?
There's not much in it for your app. Typically, using a thread per socket is easier than asynchronous approaches, because the overall structure is simpler and it's easier to maintain per-socket state.
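For instance, sketched in Go (though the question is platform-generic), thread-per-socket is just one reader goroutine per connection with a channel fan-in; the init phase and the message routing here are placeholders:

    package main

    import "net"

    // readLoop owns one socket; per-socket state can live in plain local variables.
    func readLoop(conn net.Conn, out chan<- []byte) {
        buf := make([]byte, 4096)
        for {
            n, err := conn.Read(buf)
            if err != nil {
                return // socket closed; not expected during this server's lifetime
            }
            out <- append([]byte(nil), buf[:n]...) // copy, since buf is reused
        }
    }

    func main() {
        var conns []net.Conn // the 10-20 long-lived sockets, created during init
        data := make(chan []byte)
        for _, c := range conns {
            go readLoop(c, data) // one "thread" per socket
        }
        for msg := range data {
            _ = msg // route to the proper handling function
        }
    }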

Is this a scalable named pipe server implementation?

Looking at this example of named pipes using overlapped I/O from MSDN, I am wondering how scalable it is. If you spin up a dozen clients and have them hit the server 10x/sec, with each batch of 10 sent immediately after the previous one before sleeping for a full second, it seems that eventually some of the client instances are starved.
Server implementation:
http://msdn.microsoft.com/en-us/library/aa365603%28VS.85%29.aspx
Client implementation (assuming the call is made 10x/sec, and there are a dozen instances):
http://msdn.microsoft.com/en-us/library/aa365603%28VS.85%29.aspx
The fact that the web page points out that:
The pipe server creates a fixed number of pipe instances.
and
Although the example shows simultaneous operations on different pipe instances, it avoids simultaneous operations on a single pipe instance by using the event object in the OVERLAPPED structure. Because the same event object is used for read, write, and connect operations for each instance, there is no way to know which operation's completion caused the event to be set to the signaled state for simultaneous operations using the same pipe instance
you can probably safely assume that it's not as scalable as it could be; it's an API usage example, after all, and demonstration of functionality is usually the most important design constraint for such code.
If you need 12 clients making 10 connections per second, then I'd personally have the server able to handle MORE than just 12 clients, to allow for the period when the server is preparing for a new client to connect... Personally, I'd switch to using sockets, but that's just me (and I'm skewed that way because I've done lots of high-performance sockets work and so already have all the code)...
