Does anyone know what’s the difference, in milliseconds and percentage, between the total time it takes to make an HTTPS request that is allowed to use keep alive vs one that doesn’t? For the sake of this question, let’s assume a web server that has one GET endpoint called /time that simply returns the server’s local time, and that clients call this endpoint on average once a minute.
My guess is that, putting the server on my home LAN, and calling /time from my laptop on the LAN would take 200ms. With keep-alive it’s probably going to be 150ms. So that’s 50ms difference, and 25% improvement.
My second question is similar, but only considers server processing time. Let’s say the server takes 100ms to process a GET /time request, but only 50ms to process the same with keep-alive. That’s 50ms faster, but a 50% performance gain, which is very meaningful as it increases the server’s capacity.
I think you have confused a few things here. The Keep-Alive header in HTTP tells the client that the server is willing to accept multiple requests over the same connection.
A connection is a concept from the underlying TCP protocol, and establishing one carries overhead (the three-way handshake). On the other hand, too many open connections at once hurt the server's performance. That's why these options exist.
HTTPS adds a security handshake on top of the HTTP protocol, and I suspect it bears no relevance to your question whatsoever.
So if you are talking about one request a minute, there is no noticeable difference. The overhead of connection establishment is on the order of dozens of milliseconds, so you will only start to notice a difference at hundreds of requests a second.
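If you want real numbers rather than guesses, it is easy to measure. Here is a minimal Go sketch (the LAN address and request count are placeholders, not from your setup) that times the same endpoint with keep-alive enabled and disabled; with an https:// URL the gap widens further, because the TLS handshake is also repeated on every new connection:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

// timeRequests issues n GET requests with the given client and returns the total elapsed time.
func timeRequests(client *http.Client, url string, n int) time.Duration {
	start := time.Now()
	for i := 0; i < n; i++ {
		resp, err := client.Get(url)
		if err != nil {
			panic(err)
		}
		io.Copy(io.Discard, resp.Body) // drain the body so the connection can be reused
		resp.Body.Close()
	}
	return time.Since(start)
}

func main() {
	// Placeholder for your LAN server's /time endpoint.
	url := "http://192.168.1.10/time"

	reuse := &http.Client{Transport: &http.Transport{}}                          // keep-alive on (default)
	fresh := &http.Client{Transport: &http.Transport{DisableKeepAlives: true}}   // new connection per request

	fmt.Println("with keep-alive:   ", timeRequests(reuse, url, 100))
	fmt.Println("without keep-alive:", timeRequests(fresh, url, 100))
}
```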
Here is my experiment. It's not HTTP, but it illustrates well the benefits of keeping the connection alive.
My setup is a network of servers that create their own secure connections.
I wrote a stress test that creates 100 threads on Server1. Each thread opens a TCP connection and establishes a secure channel with Server2. The thread on Server2 sends the numbers 1..1000, and the thread on Server1 simply reads them and sends "OK" back to Server2. The TCP connections and secure channels are kept alive.
First run:
100 threads are created on Server1
100 TCP connections are established between Server1 and Server2
100 threads are created on Server2 (to serve the Server1 requests)
100 secure channels are established, one per thread
total runtime: 10 seconds
Second run:
100 threads are created on Server1 (but those might have been reused by the JVM from the previous runs)
No new TCP connections are needed. The old ones are reused.
No threads are created on Server2. They are still waiting for requests.
No secure channels are established
total runtime: 1 second
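This isn't the original JVM test, but the effect is easy to reproduce in a few lines. A rough Go sketch of the same idea (the address is hypothetical, and certificate verification is skipped purely to keep the sketch short): the first loop pays for a fresh TCP connection and TLS handshake per message, the second reuses one kept-alive connection:

```go
package main

import (
	"crypto/tls"
	"fmt"
	"time"
)

const addr = "server2.example.com:9443" // hypothetical peer

func main() {
	cfg := &tls.Config{InsecureSkipVerify: true} // test-only shortcut; never skip verification in production

	// Variant 1: new connection (and new TLS handshake) per message.
	start := time.Now()
	for i := 0; i < 100; i++ {
		conn, err := tls.Dial("tcp", addr, cfg)
		if err != nil {
			panic(err)
		}
		conn.Write([]byte("OK"))
		conn.Close()
	}
	fmt.Println("new connection per message:", time.Since(start))

	// Variant 2: one connection kept alive for all messages.
	start = time.Now()
	conn, err := tls.Dial("tcp", addr, cfg)
	if err != nil {
		panic(err)
	}
	for i := 0; i < 100; i++ {
		conn.Write([]byte("OK"))
	}
	conn.Close()
	fmt.Println("one connection reused:", time.Since(start))
}
```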
We had issues with a lot of applications connecting to our MQ server without properly disconnecting, so we introduced DISCINT on our server connection channels with a value of 1800 seconds, which we found ideal for our transactions. But our keep-alive interval is pretty high at 900 seconds, and we would like to reduce it to less than 300 as suggested by the mqconfig utility. Before doing that, I would like to know whether this will affect our disconnect interval value, and whether it will override it and cause more frequent disconnects, which would be a performance hit for us.
How do these two values work, and how are they related?
Thanks
TCP KeepAlive works below the application layer in the protocol stack, so it does not affect the channel disconnection configured by DISCINT.
However, lowering the value can result in more frequent disconnects if your network is unreliable, for example if it has intermittent periods with no packets flowing that are very short (shorter than the current KeepAlive interval, but longer than the new one).
I think the main difference is that DISCINT disconnects a technically working channel that has not been used for a given period, while KeepAlive detects a TCP connection that has stopped working.
MQ also provides a means to detect dead connections at the application layer, configured by the heartbeat interval.
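To make the layering concrete: TCP keep-alive is something the operating system does on an otherwise idle socket, independently of any application-level timer such as DISCINT or the heartbeat interval. A minimal Go sketch of enabling it on a connection (the address and the 300-second period are illustrative only, not MQ settings):

```go
package main

import (
	"log"
	"net"
	"time"
)

func main() {
	// Placeholder address; the point is only the socket options below.
	conn, err := net.Dial("tcp", "mqhost.example.com:1414")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	tcpConn := conn.(*net.TCPConn)
	// Ask the kernel to probe the idle connection; a dead peer is detected
	// at the TCP layer, regardless of any application-level disconnect timer.
	if err := tcpConn.SetKeepAlive(true); err != nil {
		log.Fatal(err)
	}
	if err := tcpConn.SetKeepAlivePeriod(300 * time.Second); err != nil {
		log.Fatal(err)
	}
}
```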
These may help:
http://www-01.ibm.com/support/knowledgecenter/SSFKSJ_7.5.0/com.ibm.mq.con.doc/q015650_.htm
http://www-01.ibm.com/support/knowledgecenter/SSFKSJ_7.5.0/com.ibm.mq.ref.con.doc/q081900_.htm
http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/overview.html
http://www-01.ibm.com/support/knowledgecenter/SSFKSJ_7.5.0/com.ibm.mq.ref.con.doc/q081860_.htm
If MaxIdleConnsPerHost is set to a high number, let's say 1000, the number of open connections will still depend on the other host, right? I mean, allowing 1000 idle connections to the same host will result in 1000 open connections as long as they are not closed by the other host?
So, effectively, setting this value to a high number will result in never closing a connection, but waiting for the other host to do it? Am I interpreting this correctly?
Your understanding is correct. MaxIdleConnsPerHost restricts how many connections there are which are not actively serving requests, but which the client has not closed.
Idle connections are useful for web browsers because they can keep reusing connections for subsequent HTTP requests to the same server. Idle connections have a cost for the server, though. They use kernel resources, and you may run up against per process limits or kernel limits on the number of open connections, files, or handles, which may cause unexpected errors in your program, or even for other programs on the same machine.
As such, be careful when increasing MaxIdleConnsPerHost to a large number. It only makes sense to increase idle connections if you are seeing many connections in a short period from the same clients.
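For context, here is a minimal sketch of how those knobs sit on Go's http.Transport (the numbers are illustrative only). Note that IdleConnTimeout lets the client close idle connections on its own schedule instead of waiting for the server to do it:

```go
package main

import (
	"net/http"
	"time"
)

func main() {
	transport := &http.Transport{
		MaxIdleConns:        100,              // idle pool size across all hosts
		MaxIdleConnsPerHost: 10,               // idle pool size per host (the default is only 2)
		IdleConnTimeout:     90 * time.Second, // close idle connections ourselves after this
	}
	client := &http.Client{Transport: transport, Timeout: 10 * time.Second}

	// Requests made with this client reuse pooled connections where possible.
	resp, err := client.Get("https://example.com/")
	if err == nil {
		resp.Body.Close()
	}
}
```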
After noticing a drastically slow load time on one of my websites, I started running some tests on Pingdom - http://tools.pingdom.com/
I've been comparing 2 sites, and the drastic difference is the 'Connect' time. On the slower site it's around 2.5 seconds, whereas on my other sites it's down around 650ms. I suppose it's worth mentioning the slower site is hosted by a different company.
The only definition Pingdom offers is "The web browser is connecting to the server". I was hoping someone could:
Elaborate on this a little for me, and
Point me in the direction of resolving it.
Thanks in advance
Every new TCP connection goes through a three-way handshake before the client can issue a request, e.g. a GET, to the web server.
Client sends SYN to server, server responds with SYN-ACK, client responds with ACK and then sends the request.
How long this process takes is latency-bound, i.e. if the round-trip time to the server is 100ms then the full handshake will take 150ms, but since the client sends the request just after it sends the ACK, you can work on the basis that it costs one round-trip.
Congestion and other factors can also affect the TCP connect time.
Connect times should be in the milliseconds range, not in the seconds range - my round-trip time from the UK to a server in NY is 100ms, so that's roughly what I'd expect the TCP connect time to be if I was requesting something from a server there.
See Ilya Grigorik's High Performance Browser Networking for a really in-depth discussion / explanation - http://chimera.labs.oreilly.com/books/1230000000545/ch02.html#TCP_HANDSHAKE
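If you want to see where the time goes from your own machine rather than from Pingdom's probe, Go's net/http/httptrace can break a request down into DNS, TCP connect, and TLS handshake phases. A small sketch (the URL is just an example):

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"net/http/httptrace"
	"time"
)

func main() {
	req, _ := http.NewRequest("GET", "https://example.com/", nil)

	var dnsStart, connStart, tlsStart time.Time
	trace := &httptrace.ClientTrace{
		DNSStart:          func(httptrace.DNSStartInfo) { dnsStart = time.Now() },
		DNSDone:           func(httptrace.DNSDoneInfo) { fmt.Println("dns:    ", time.Since(dnsStart)) },
		ConnectStart:      func(_, _ string) { connStart = time.Now() },
		ConnectDone:       func(_, _ string, _ error) { fmt.Println("connect:", time.Since(connStart)) },
		TLSHandshakeStart: func() { tlsStart = time.Now() },
		TLSHandshakeDone:  func(tls.ConnectionState, error) { fmt.Println("tls:    ", time.Since(tlsStart)) },
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))

	// Use the transport directly so the connection is fresh and every phase is traced.
	if resp, err := http.DefaultTransport.RoundTrip(req); err == nil {
		resp.Body.Close()
	}
}
```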
I am using haproxy with round robin and it works perfectly, but now I am facing a problem: one of my backend servers is overloaded.
I want to know if I can balance the traffic according to the load on the backend servers. Also, if one server reaches its max connection limit, does traffic go to the other backend servers?
What is the difference between leastconn and round robin, and between global maxconn, default maxconn, and server maxconn?
If a server is more loaded than other ones, then mechanically it will see more concurrent connections for the same request rate. That's where it becomes useful to switch to the leastconn algorithm, which will ensure that all servers always run with the same number of concurrent connections. This is useful for instance if some of your requests are much longer than other ones (eg: complex requests in a database).
For the second point, I'll be short because everything is in the doc, but leastconn focuses on the number of concurrent connections while round robin focuses on the cumulative number of connections. With round robin, each server gets a request in turn, so the requests on the same server are optimally spaced. This is normally better for static servers or for applications with stickiness where users make a large number of requests once attached to a server, since this ensures you have the same number of users on each server.

Global maxconn is the total number of concurrent connections a single haproxy process will support. It will stop accepting incoming connections when the limit is reached.

The default maxconn applies to frontends only, and when a frontend's maxconn is reached, that frontend only will stop accepting new connections.

The server maxconn ensures that haproxy never sends too many connections to a server. When the limit is reached, another server is selected when possible (no cookie, etc), or the request is queued until the server releases a connection to pick it up. If your servers are overloaded, you should check the number of connections and apply a server maxconn slightly below this value to protect them.
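As a rough illustration of where each of those settings lives in the configuration (the directive names are real, but the numbers are made up and need tuning for your own servers):

```
global
    maxconn 20000        # total concurrent connections one haproxy process will accept

defaults
    maxconn 10000        # default per-frontend limit, unless a frontend sets its own

backend app
    balance leastconn    # pick the server with the fewest active connections
    server app1 10.0.0.1:80 check maxconn 200   # beyond 200 concurrent, requests for app1 are queued
    server app2 10.0.0.2:80 check maxconn 200
```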
I am doing some performance testing with a large number of threads. Each thread is sending HTTP requests to another IP. It looks like at some stages the connections are closed (because there are too many threads) and then of course have to be reopened.
I am looking to get some ballpark figures for how long it takes Windows to open TCP connections.
Is there any way I can get this?
Thanks.
This is highly dependent on the endpoints you're trying to connect to, is it not?
As an extreme best case, you can test it yourself by targeting an IIS on localhost.
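One cheap way to get your own ballpark figure is to time the connect call directly. A small Go sketch (the address is a placeholder; point it at localhost and then at your real target and compare):

```go
package main

import (
	"fmt"
	"log"
	"net"
	"time"
)

func main() {
	addr := "127.0.0.1:80" // placeholder: the IIS on localhost, or the remote endpoint you care about

	for i := 0; i < 10; i++ {
		start := time.Now()
		conn, err := net.DialTimeout("tcp", addr, 5*time.Second)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("connect %d took %v\n", i, time.Since(start)) // roughly one round-trip plus OS overhead
		conn.Close()
	}
}
```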
I wouldn't be surprised if routers and servers that you are connecting through may drop connections as a measure against what could be perceived as connection storms or even denial-of-service attacks.