Go(lang): about MaxIdleConnsPerHost in the http client's transport

In case MaxIdleConnsPerHost is set to a high number, let's say 1000, the number of open connections will still depend on the other host, right? I mean, allowing 1000 idle connections to the same host will result in 1000 open connections as long as these are not closed by the other host?
So, effectively, setting this value to a high number will result in never closing a connection, but waiting for the other host to do it? Am I interpreting this correctly?

Your understanding is correct. MaxIdleConnsPerHost limits how many connections to a single host the client keeps open even though they are not actively serving requests, rather than closing them.
Idle connections are useful for web browsers because they can keep reusing connections for subsequent HTTP requests to the same server. Idle connections have a cost for the server, though. They use kernel resources, and you may run up against per-process or kernel limits on the number of open connections, files, or handles, which may cause unexpected errors in your program, or even in other programs on the same machine.
As such, be careful when increasing MaxIdleConnsPerHost to a large number. It only makes sense to increase idle connections if you are seeing many connections in a short period from the same clients.
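For reference, here is a minimal sketch of how these knobs are usually set on a custom Transport; note that IdleConnTimeout lets the client close idle connections itself rather than waiting for the other host. The values and URL below are illustrative, not recommendations.

```go
package main

import (
	"io"
	"net/http"
	"time"
)

func main() {
	// A Transport with explicit idle-connection limits.
	transport := &http.Transport{
		MaxIdleConns:        100,              // total idle connections across all hosts
		MaxIdleConnsPerHost: 10,               // idle connections kept per host
		IdleConnTimeout:     90 * time.Second, // the client closes idle connections itself after this long
	}

	client := &http.Client{Transport: transport, Timeout: 10 * time.Second}

	resp, err := client.Get("https://example.com/") // placeholder URL
	if err != nil {
		panic(err)
	}
	// Draining and closing the body is what allows the connection to be
	// returned to the idle pool for reuse.
	io.Copy(io.Discard, resp.Body)
	resp.Body.Close()
}
```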

Related

HTTPS connection keep alive performance

Does anyone know what’s the difference, in milliseconds and percentage, between the total time it takes to make an HTTPS request that is allowed to use keep alive vs one that doesn’t? For the sake of this question, let’s assume a web server that has one GET endpoint called /time that simply returns the server’s local time, and that clients call this endpoint on average once a minute.
My guess is that, putting the server on my home LAN, and calling /time from my laptop on the LAN would take 200ms. With keep-alive it’s probably going to be 150ms. So that’s 50ms difference, and 25% improvement.
My second question is similar, but only considers server processing time. Let’s say the server takes 100ms to process a GET /time request, but only 50ms to process the same with keep-alive. That’s 50ms faster, but a 50% performance gain, which is very meaningful as it increases the server’s capacity.
I think you have confused a few things here. The Keep-Alive header in the HTTP protocol suggests to the client that the server wouldn't mind accepting multiple requests over the same connection.
A connection is a concept of the underlying TCP protocol, and there is overhead (the three-way handshake) in establishing one. On the other hand, too many connections at once hurt the server's performance. That's why those options exist.
HTTPS adds a security layer on top of the HTTP protocol, and I suspect it bears no relevance in the context of your question whatsoever.
So if you are talking about one request a minute, there is no noticeable difference. The overhead of connection establishment is on the order of dozens of milliseconds, so you will start noticing a difference at hundreds of requests a second.
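If you want to measure the difference yourself, a rough Go sketch like the one below times repeated requests with keep-alive enabled and disabled. The URL and request count are placeholders.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

// timeRequests issues n GET requests with the given client and returns the elapsed time.
func timeRequests(client *http.Client, url string, n int) time.Duration {
	start := time.Now()
	for i := 0; i < n; i++ {
		resp, err := client.Get(url)
		if err != nil {
			panic(err)
		}
		io.Copy(io.Discard, resp.Body) // drain so the connection can be reused
		resp.Body.Close()
	}
	return time.Since(start)
}

func main() {
	const url = "https://example.com/time" // placeholder endpoint
	const n = 50

	reuse := &http.Client{Transport: &http.Transport{}}                        // keep-alive on (default)
	fresh := &http.Client{Transport: &http.Transport{DisableKeepAlives: true}} // new connection per request

	fmt.Println("with keep-alive:   ", timeRequests(reuse, url, n))
	fmt.Println("without keep-alive:", timeRequests(fresh, url, n))
}
```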
Here is my experiment. It's not HTTP, but it illustrates well the benefits of keeping the connection alive.
My setup is a network of servers that create their own secure connections.
I wrote a stress test that creates 100 threads on Server1. Each thread opens a TCP connection and establishes a secure channel with Server2. The thread on Server2 sends the numbers 1..1000, and the thread on Server1 simply reads them and sends "OK" to Server2. The TCP connections and secure channels are "kept alive".
First run:
100 threads are created on Server1
100 TCP connections are established between Server1 and Server2
100 threads are created on Server2 (to serve the Server1 requests)
100 secure channels are established, one per thread
total runtime : 10 seconds
Second run:
100 threads are created on Server1 (but those might have been reused by the JVM from the previous runs)
No new TCP connections are needed. The old ones are reused.
No threads are created on Server2. They are still waiting for requests.
No secure channels are established
total runtime : 1 second
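The same effect can be seen with a much smaller sketch. The Go program below only measures connection plus TLS handshake setup rather than reproducing the full threaded experiment; the address and count are placeholders.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"time"
)

func main() {
	const addr = "example.com:443" // placeholder server
	const n = 20

	// Cost of n fresh connections: a TCP handshake and a TLS handshake each time.
	start := time.Now()
	for i := 0; i < n; i++ {
		conn, err := tls.Dial("tcp", addr, nil)
		if err != nil {
			panic(err)
		}
		conn.Close()
	}
	fmt.Println("n fresh TLS connections:", time.Since(start))

	// Cost when the connection is kept alive: the handshakes are paid only once.
	start = time.Now()
	conn, err := tls.Dial("tcp", addr, nil)
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	fmt.Println("one kept-alive connection (handshake paid once):", time.Since(start))
}
```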

How long can a websocket connection last?

How long can a client be connected to a server via websockets? Is there a time limit, can they potentially be connected together for years?
A WebSocket connection can in theory last forever. Assuming the endpoints remain up, one common reason why long-lived TCP connections eventually terminate is inactivity. WebSockets have a ping-pong mechanism which, among other things, can prevent closure by "smart" network routers that often have an inactivity timeout of a few hours. But if your application actively sends data (in either direction) at least once an hour, that's probably enough.
Still, "forever" is unlikely to be achieved in practice, because TCP connections on most networks eventually get terminated by some outage or other. You'll want intelligent reconnection logic if reliability is important.

nodeJS being bombarded with reconnections after restart

We have a node instance that has about 2500 client socket connections. Everything runs fine except that occasionally something happens to the service (a restart or failover event in Azure); when the node instance comes back up and all the socket connections try to reconnect, the service comes to a halt and the log just shows repeated socket connects/disconnects. Even if we stop the service and start it again, the same thing happens. We currently send out a package to our on-premise servers to kill the users' Chrome sessions, and then everything works fine as users begin logging in again. We have the clients currently connecting with 'forceNew' and forcing WebSockets only, not the default long polling then upgrade. Has anyone seen this or have ideas?
In your socket.io client code, you can force the reconnects to be spread out in time more. The two configuration variables that appear to be most relevant here are:
reconnectionDelay
Determines how long socket.io will initially wait before attempting a reconnect (it should back off from there if the server is down for a while). You can increase this to make it less likely that they all try to reconnect at the same time.
randomizationFactor
This is a number between 0 and 1.0 and defaults to 0.5. It determines how much the above delay is randomly modified to try to make client reconnects be more random and not all at the same time. You can increase this value to increase the randomness of the reconnect timing.
See client doc here for more details.
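For what it's worth, the underlying scheme, an exponentially backed-off delay with random jitter, is easy to sketch. This Go snippet is a generic illustration of the idea rather than socket.io's actual implementation.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// nextDelay computes a reconnect delay using exponential backoff with jitter,
// roughly mirroring socket.io's reconnectionDelay / randomizationFactor knobs.
func nextDelay(attempt int, base, max time.Duration, randomization float64) time.Duration {
	d := base << uint(attempt) // exponential backoff
	if d > max {
		d = max
	}
	// Apply +/- randomization, e.g. 0.5 spreads delays over 50%..150% of d.
	jitter := 1 + randomization*(2*rand.Float64()-1)
	return time.Duration(float64(d) * jitter)
}

func main() {
	for attempt := 0; attempt < 6; attempt++ {
		fmt.Println(nextDelay(attempt, time.Second, 30*time.Second, 0.5))
	}
}
```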
You may also want to explore your server configuration to see if it is as scalable as possible with moderate numbers of incoming socket requests. While nobody expects a server to handle 2500 simultaneous connections all at once, the server should be able to queue up these connection requests and serve them as it gets time, without immediately failing any incoming connection that can't be handled right away. There is a desirable middle ground where some number of connections are held in a queue (usually controllable by server-side TCP configuration parameters), and when the queue gets too large connections are failed immediately, at which point socket.io should back off and try again a little later. Adjusting the above variables will tell it to wait longer before retrying.
Also, I'm curious why you are using forceNew. That does not seem like it would help you. Forcing WebSockets only (no initial polling) is a good thing.

TCP connection time in Windows

I am doing some performance testing with a large number of threads. Each thread is sending HTTP requests to another IP. It looks like at some stages the connections are closed (because there are too many threads) and then of course have to be reopened.
I am looking to get some ballpark figures for how long it takes Windows to open TCP connections.
Is there any way I can get this?
Thanks.
This is highly dependent on the endpoints you're trying to connect to, is it not?
As an extreme best case, you can test it yourself by targeting an IIS on localhost.
I wouldn't be surprised if routers and servers that you are connecting through drop connections as a measure against what could be perceived as connection storms or even denial-of-service attacks.
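If you want ballpark numbers for your own environment, something like this Go sketch times raw TCP connection setup against an address you pick. The address and count below are placeholders.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	const addr = "127.0.0.1:80" // placeholder: point this at your own server
	const n = 100

	var total time.Duration
	for i := 0; i < n; i++ {
		start := time.Now()
		conn, err := net.DialTimeout("tcp", addr, 5*time.Second)
		if err != nil {
			panic(err)
		}
		total += time.Since(start)
		conn.Close()
	}
	fmt.Printf("average TCP connect time over %d connections: %v\n", n, total/time.Duration(n))
}
```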

Performance Implication of Creating New TCP Connection Per Message

In the past, our server application was designed so that a client creates one TCP connection, keeps this connection established indefinitely and sends messages when needed. These messages may come in high volume bursts or with longer idle periods in between. We are switching to a different connection protocol now, where the client will create a new connection per message and then disconnect after sending.
I have a few questions:
I know there is some overhead from the three-way handshake needed to establish the connection. But is this overhead significant (CPU, memory, bandwidth, etc.)?
Is there any difference between the latency of message being transferred for an established TCP connection that has been idle for minutes vs. creating a new connection and sending the message?
Are there any other factors/considerations that I should be considering if I'm trying to determine the performance impact of switching to this new connection protocol compared to the old one?
Any help at all is greatly appreciated.
Opening and closing a lot of TCP sessions may impact connection tracking firewalls and load balancers, causing them to slow down or even fail and reject the connection. Some, like the Linux iptables conntrack, have moderate default values for the number of tracked connections.
The program might run out of available local port numbers if it cycles messages too quickly. There is a TCP timeout (the TIME_WAIT state) before a socket can be considered fully closed, and there is often an operating system timer to clean up these closed connections. If too many sockets are opened too quickly, the operating system may not have had time to clean them up.
The handshake adds about an extra 80 bytes to your bandwidth cost. The TCP connection close also involves FIN or RST packets, although these flags may be combined with the data packet.
Latency in an established TCP session might be a tiny bit higher if the Nagle algorithm is turned on for the message sender. Nagle causes the OS to wait for more data before sending a partially filled packet. The TCP session that is being closed will flush all data. The same effect can be had in the open session by disabling Nagle with the TCP_NODELAY flag.
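As a concrete illustration, in Go the Nagle behavior can be toggled per connection with SetNoDelay (Go's net package actually disables Nagle by default). A minimal sketch, with a placeholder address:

```go
package main

import (
	"net"
	"time"
)

func main() {
	conn, err := net.DialTimeout("tcp", "example.com:80", 5*time.Second) // placeholder address
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	tcpConn := conn.(*net.TCPConn)

	// SetNoDelay(true) disables Nagle (already Go's default), so small writes
	// go out immediately instead of waiting to be coalesced into fuller packets.
	// SetNoDelay(false) re-enables Nagle.
	if err := tcpConn.SetNoDelay(true); err != nil {
		panic(err)
	}

	// A small message that would otherwise be a candidate for coalescing.
	if _, err := tcpConn.Write([]byte("ping\n")); err != nil {
		panic(err)
	}
}
```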
