If we fire hundreds of requests in one second, is it better (faster) to use the HTTP version of the URL rather than HTTPS?
http://example.com/response.json
vs
https://example.com/response.json
Are there any technical reasons, in how cURL works under the hood, for that?
HTTPS has overhead when compared to HTTP: in bandwidth (at the start of the connection) and in processing (encryption/decryption). The latter is usually negligible since you'll almost always hit bandwidth limit before the CPU limit. The bandwidth overhead is negligible if you have long-lasting connections (either big transfers, or a persistent connection); but if you open many short-lived connections, the connection overhead piles up.
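Since the per-request crypto is cheap, what dominates with hundreds of short requests is whether each one pays for a fresh TCP + TLS handshake. A minimal sketch of the difference, assuming Python with the third-party requests library (the question was about cURL, where reusing the same handle gives similar keep-alive behaviour); the URL is the hypothetical one from the question:

```python
# Sketch: hundreds of HTTPS requests with and without connection reuse.
# Assumes Python + the third-party "requests" library; example.com stands in
# for the hypothetical endpoint from the question.
import time
import requests

URL = "https://example.com/response.json"

def fetch_many(n, reuse_connection):
    start = time.perf_counter()
    if reuse_connection:
        # One Session = keep-alive: the TCP + TLS handshake is paid once.
        with requests.Session() as s:
            for _ in range(n):
                s.get(URL, timeout=10)
    else:
        # requests.get() opens (and closes) a new connection every time,
        # so every request pays the full handshake cost.
        for _ in range(n):
            requests.get(URL, timeout=10)
    return time.perf_counter() - start

print("fresh connection per request:", fetch_many(100, False))
print("one reused connection:      ", fetch_many(100, True))
```

With connection reuse, the handshake cost is paid once, so the HTTP-vs-HTTPS gap largely disappears for the remaining requests.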
Related
I’m working on a site with the goal of being as fast as possible. This goal requires letting mobile clients make the initial HTTP request in one round-trip. (HTTP/2’s HPACK should take care of subsequent requests for the same page.)
The conventional wisdom is that 14 kilobytes of compressed response is as much as you can expect out of the first round trip for a web page (because of TCP Slow Start), but calculations along the lines of that theory don't match the results I get when testing.
My target connection has the following characteristics:
3rd-generation cellular data protocol (3G)
200ms latency (400ms RTT)
400Kb/s maximum download bandwidth
300Kb/s maximum upload bandwidth
0% packet loss, at least for the purposes of this question
HTTP/2 over TLS
Assume Android Chrome for the client
Ultimately, I want to set performance goals for how big the app-controllable request headers can be; mainly Etag and Cookie. (I can’t really control Referer and such, but at least they have a known maximum size in practice.)
You can't deliver an HTTP/2 page (nor an HTTPS page) in one round trip, and you pretty much never could, even with HTTP/1.1.
This is because the TLS handshake requires at least one round trip (though TLSv1.3 does have a 0-RTT resumption handshake, which is not usually supported by browsers and servers).
HTTP/2 requires further messages on top which, while they do not need to be acknowledged at the HTTP/2 level (so technically no extra round trip), will result in TCP acknowledgements, so the congestion window (CWND) will have grown beyond 14KB by this point. Additionally, as you start to stream the first response, its TCP packets will also be acknowledged, increasing the CWND further.
I recently wrote a blog post on this: https://www.tunetheweb.com/blog/critical-resources-and-the-first-14kb/
So how much do you really have to play with for that first response, if it's not 14KB? That's impossible to say realistically, because it very much depends on the TCP (and TLS and HTTP/2) stacks on each side. My advice is not to obsess over this number and just deliver your website in as little data as possible. In particular, don't worry whether you are delivering 15KB or 16KB; you don't have to kill yourself to get under this 14KB magic number.
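For a rough feel of where that 14KB figure comes from, and why it is fuzzy in practice, here is a back-of-the-envelope sketch (my own numbers, not from the blog post above):

```python
# Back-of-the-envelope sketch of the "first flight" budget (illustrative only):
# a common initial congestion window is 10 segments (RFC 6928) of roughly
# 1460 bytes each on an MTU-1500 path.
MSS = 1460
INITCWND = 10

first_flight = MSS * INITCWND
print(f"raw first flight: {first_flight} bytes (~{first_flight / 1024:.1f} KiB)")

# TLS record and HTTP/2 frame/header overhead eats a little of that; the exact
# amount depends on the stacks on each side, which is why a hard 14KB target
# is not worth obsessing over.
assumed_overhead = 500  # assumption, order of magnitude only
print(f"roughly usable for the response: ~{first_flight - assumed_overhead} bytes")
```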
That said, while Cookies can be large (though ETags typically are not), they are not usually more than a KB or two. So if you are trying to make your space savings there, you are probably looking in the wrong place - or you have a really well-optimised site where these headers are the last place left to optimise!
Reading the SPDY whitepaper at http://dev.chromium.org/spdy/spdy-whitepaper, it seems like supporting it will improve my HTTP latency. However, I'm not clear on a few of the claims.
1) "Because HTTP can only fetch one resource at a time (HTTP pipelining helps, but still enforces only a FIFO queue), a server delay of 500 ms prevents reuse of the TCP channel for additional requests." -- Where did this 500ms number come from?
2) "We discovered that SPDY's latency savings increased proportionally with increases in packet loss rates, up to a 48% speedup at 2%." -- But doesn't putting all the requests on a single TCP connection mean that congestion control will slow down all your requests whereas is you had multiple connections, 1 TCP stream would slow down but others would not?
3) "[With pipelining] any delays in the processing of anything in the stream (either a long request at the head-of-line or packet loss) will delay the entire stream." -- This implies that packet loss would not delay the entire stream using SPDY. Why not?
The 500ms reference is simply an example; the number could be 50ms or 5s, but the point is still the same: HTTP forces FIFO processing, which results in inefficient use of the underlying TCP connection. As the paper notes, pipelining can help in theory, but in practice pipelining is not used because of the many intermediaries that break when you turn it on. Hence, you're stuck with the worst-case scenario: full RTT + server processing time, and FIFO ordering.
Re, packet loss. Yup, you're exactly right. One of the downsides of using a single connection is that in the case of packet loss, the throughput of the entire connection is cut in half, as opposed to 1/2 of one of the N connections in flight. Having said that, there are also some benefits! For example, when you saturate a single connection, you get much faster recovery due to triple ACKs, plus a potentially much wider congestion window to begin with. Because most HTTP transfers are relatively small (tens of KBs), it is not unusual for many connections to terminate even before they exit the slow-start phase!
Re, pipelining. A lost packet would still delay the stream - that's TCP. The win is in eliminating head-of-line blocking, which enables a lot more and a lot smarter optimization by the browser, followed by some of the wins I described above.
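To make the head-of-line-blocking point concrete, here is a toy model (my own, purely illustrative): six responses, the first of which takes 500 ms of server time, fetched over a FIFO pipeline versus a multiplexed connection.

```python
# Toy model of head-of-line blocking (ignores bandwidth sharing and loss).
server_time = [0.5, 0.05, 0.05, 0.05, 0.05, 0.05]  # seconds of server work per response

# HTTP/1.1 pipelining: strict FIFO, so every response waits for all earlier ones.
fifo_done, t = [], 0.0
for s in server_time:
    t += s
    fifo_done.append(round(t, 2))

# SPDY/HTTP2-style multiplexing: responses are interleaved on the one connection,
# so the fast responses are not stuck behind the slow one.
mux_done = server_time

print("FIFO completion times:       ", fifo_done)   # the 500 ms response delays everything
print("multiplexed completion times:", mux_done)
```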
#GroovyDotCom: Here's some hands-on proof of HTTP2's (SPDY's) performance benefits:
http://www.httpvshttps.com/
I have recently activated gzip compression on an IIS6 web server. I use both static and dynamic compression (static level 10 and dynamic level 1). This was a measure to improve server response time. However, it seems like pages load slower after compression was activated. All my measurements in Firebug indicate this.
Has anyone else had this problem? What can be the cause?
You are doing more work on the server and the client, so it is normal that response times increase. On low-bandwidth connections you may make up for it with reduced transfer times.
If you are on a high-bandwidth connection, the compression will not have a significant impact on the transfer delay, as it is already short uncompressed. However, you will pay 100% of the CPU penalty.
Zipping large responses takes quite some CPU power; if the server CPUs are already loaded, the response times might even get worse.
My advice: check the server CPU, and if the load is non-negligible then either turn off zipping or buy a bigger box. If you have a large population on mobile or in remote locations with poor internet connectivity, then zipping might be useful; otherwise it will make little difference.
You might also look in using a reverse proxy to reduce the load of the server.
How much bandwidth do you have between your browser and your server?
Compressing and decompressing the stream is more work, so on a fast network it may actually be slower -- is this an intranet application? You will see the most gain from compression if you have tight bandwidth constraints (either lots of traffic, or a lower-bandwidth connection).
How much compression helps will also depend upon what kind of content your site is delivering.
The best thing to do is to test and measure under the same conditions that your site will be under when it is in production.
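As one way to "test and measure", here is a rough break-even sketch (mine, with assumed numbers): compression is a win only when the transfer time it saves exceeds the extra CPU time it costs.

```python
# Rough gzip break-even check (illustrative; measure with your real payloads).
import gzip
import time

payload = b"<html>" + b"lorem ipsum dolor sit amet " * 2000 + b"</html>"  # synthetic body

start = time.perf_counter()
compressed = gzip.compress(payload, compresslevel=1)  # rough analogue of "dynamic level 1"
cpu_cost = time.perf_counter() - start

link_bytes_per_sec = 3_000_000 / 8  # assume a 3 Mbit/s link
time_saved = (len(payload) - len(compressed)) / link_bytes_per_sec

print(f"{len(payload)} B -> {len(compressed)} B")
print(f"CPU cost {cpu_cost * 1000:.2f} ms, transfer time saved {time_saved * 1000:.2f} ms")
print("compression wins" if time_saved > cpu_cost else "compression loses on this link")
```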
Static compression works quite well because a copy of the gzipped file is placed in a temporary folder; dynamically compressed responses, however, have to be re-gzipped every time, and unless bandwidth is a big problem I don't think it is worth it.
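A minimal sketch of that "compress once, serve many times" idea (mine; the ./static directory and .css pattern are assumptions): pre-write a .gz copy next to each static file so the server never has to compress it per request.

```python
# Pre-compress static assets once so the CPU cost is not paid per request.
import gzip
import pathlib

for path in pathlib.Path("static").glob("*.css"):       # assumed layout
    gz_path = pathlib.Path(str(path) + ".gz")
    gz_path.write_bytes(gzip.compress(path.read_bytes(), compresslevel=9))
    # The web server can then serve the .gz copy directly to gzip-capable clients.
```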
We are using a business Ethernet connection (3Mbit upload, 3Mbit download) and trying to understand issues with our tested bandwidth speeds. When uploading a large file we sustain 340 KB/s; downloading we sustain 340 KB/s. However, when we run these transfers simultaneously, the two transfer speeds rise and fall erratically, with an average speed for both of around 250 KB/s. We're using a Hatteras HN404 CPi and we've bypassed the router (plugged a machine directly into the Hatteras; set the NIC to full-duplex).
Is this expected? Should a max upload interfere with a max download on this type of Internet connection?
Are you sure the bottleneck is your connection?
Do you also see this behavior when the simultaneous upload and download are occurring on different systems, or only when one system is handling both the upload and download?
If the problem goes away when independent machines are doing the work, the bottleneck is likely closer to the hard drive.
This sounds expected from my experience with lower end lines. On a home line, I've found that traffic shaping and changing buffer sizes can be a huge help.
TCP/IP without any unusual traffic shaping will favor the most aggressive traffic at the expense of everything else. In your case, this means the outgoing ACKs for the download (and similar small packets) will be delayed or maybe even dropped. See if your HN404 supports class-based queuing or something similar and try it out.
Yes it is expected. This is symptomatic of any case in which you have a throttled or capped connection. If you saturate your uplink it will affect your downlink and vice versa.
This is because your connection's rate-limiting delays the TCP acknowledgement (ACK) packets and disrupts the normal "balance" of how these packets flow.
This is very thoroughly described on this page about Cable Modem Troubleshooting Tips, although it is not limited to cable modems:
If you saturate your cable modem's upload cap with an upload, the ACK packets of your download will have to queue up waiting for a gap between the congested upload data packets. So your ACKs will be delayed getting back to the remote download server, and it will therefore believe you are on a very slow link, and slow down the transmission of further data to you.
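A rough back-of-the-envelope (my own numbers) of how much upstream the download's ACK stream actually needs, and hence what gets starved when the uplink is saturated:

```python
# Upstream bandwidth consumed by ACKs for a 3 Mbit/s download (illustrative).
download_bytes_per_sec = 3_000_000 / 8
mss = 1460          # bytes of data per segment
ack_size = 40       # rough size of a bare TCP ACK in bytes

segments_per_sec = download_bytes_per_sec / mss
acks_per_sec = segments_per_sec / 2                 # delayed ACK: ~1 ACK per 2 segments
ack_upstream_kbit = acks_per_sec * ack_size * 8 / 1000

print(f"~{acks_per_sec:.0f} ACKs/s needing ~{ack_upstream_kbit:.0f} Kbit/s of upstream")
# A few tens of Kbit/s is trivial on an idle uplink, but if the uplink queue is
# full of upload packets, each of those ACKs waits in line behind them.
```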
So how do you avoid this? The best way is to implement some sort of traffic-shaping or QoS (Quality of Service) on individual sessions to limit them to a maximum throughput based on a percentage of your total available bandwidth.
For example, on my home network I have it set up so that no outbound connection can utilize more than 67% (2/3rd) of my 192Kbps uplink. That means any single outbound session can only utilize 128Kbps, therefore protecting my downlink speed by preventing the uplink from becoming saturated.
In most cases you are able to perform this kind of traffic-shaping based on any available criteria such as source ip, destination ip, protocol, port, time of day, etc.
It appears that I was wrong about the simultaneous transfer speeds. The 250KB/s speeds up and down were miscalculated by the transfer program (it seems to have been showing a high average speed). Apparently the Business Ethernet (in this case an XO circuit provisioned by Speakeasy) only supports 3Mbit total, not up AND down (which would be 6Mbit total). So if I am transferring up and down at the same time, in theory I should only have 1.5Mbit each way, or 187.5KB/s at the maximum (if there were zero overhead).
I know there's no single hard-and-fast answer, but is there a generic order-of-magnitude approximation for the encryption overhead of SSL versus unencrypted socket communication? I'm talking only about the communication processing and wire time, not counting application-level processing.
Update
There is a question about HTTPS versus HTTP, but I'm interested in looking lower in the stack.
(I replaced the phrase "order of magnitude" to avoid confusion; I was using it as informal jargon rather than in the formal CompSci sense. Of course if I had meant it formally, as a true geek I would have been thinking binary rather than decimal! ;-)
Update
Per request in comment, assume we're talking about good-sized messages (range of 1k-10k) over persistent connections. So connection set-up and packet overhead are not significant issues.
Order of magnitude: zero.
In other words, you won't see your throughput cut in half, or anything like it, when you add TLS. Answers to the "duplicate" question focus heavily on application performance, and how that compares to SSL overhead. This question specifically excludes application processing, and seeks to compare non-SSL to SSL only. While it makes sense to take a global view of performance when optimizing, that is not what this question is asking.
The main overhead of SSL is the handshake. That's where the expensive asymmetric cryptography happens. After negotiation, relatively efficient symmetric ciphers are used. That's why it can be very helpful to enable SSL sessions for your HTTPS service, where many connections are made. For a long-lived connection, this "end-effect" isn't as significant, and sessions aren't as useful.
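A minimal client-side sketch (mine, not from the answer) of what enabling SSL sessions buys, using Python's standard ssl module; example.com is a placeholder host:

```python
# TLS session reuse from a client's point of view.
import socket
import ssl

HOST = "example.com"  # placeholder
ctx = ssl.create_default_context()

# First connection: full handshake with the expensive asymmetric crypto.
with socket.create_connection((HOST, 443)) as raw:
    with ctx.wrap_socket(raw, server_hostname=HOST) as tls:
        session = tls.session  # with TLS 1.3 the ticket may only arrive after some I/O

# Second connection: offer the saved session; if the server still has it cached
# (or accepts the ticket), the expensive part of the handshake is skipped.
with socket.create_connection((HOST, 443)) as raw:
    with ctx.wrap_socket(raw, server_hostname=HOST, session=session) as tls:
        print("session reused:", tls.session_reused)
```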
Here's an interesting anecdote. When Google switched Gmail to use HTTPS, no additional resources were required; no network hardware, no new hosts. It only increased CPU load by about 1%.
I second #erickson: the pure data-transfer speed penalty is negligible. Modern CPUs reach a crypto/AES throughput of several hundred MBit/s. So unless you are on a resource-constrained system (a mobile phone), TLS/SSL is fast enough for slinging data around.
But keep in mind that encryption makes caching and load balancing much harder. This might result in a huge performance penalty.
But connection setup is really a showstopper for many applications. On low-bandwidth, high-packet-loss, high-latency connections (a mobile device out in the countryside), the additional round trips required by TLS might turn something slow into something unusable.
For example, we had to drop the encryption requirement for access to some of our internal web apps - they were next to unusable when used from China.
Assuming you don't count connection set-up (as you indicated in your update), it strongly depends on the cipher chosen. Network overhead (in terms of bandwidth) will be negligible. CPU overhead will be dominated by cryptography. On my mobile Core i5, I can encrypt around 250 MB per second with RC4 on a single core. (RC4 is what you should choose for maximum performance.) AES is slower, providing "only" around 50 MB/s. So, if you choose correct ciphers, you won't manage to keep a single current core busy with the crypto overhead even if you have a fully utilized 1 Gbit line. [Edit: RC4 should not be used because it is no longer secure. However, AES hardware support is now present in many CPUs, which makes AES encryption really fast on such platforms.]
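To put numbers like these on your own hardware, a quick single-core micro-benchmark sketch (assumes the third-party cryptography package; AES-GCM is what modern TLS typically negotiates for bulk data):

```python
# Single-core AES-128-GCM throughput micro-benchmark (illustrative only).
import os
import time
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=128)
aead = AESGCM(key)
chunk = os.urandom(1024 * 1024)  # 1 MiB per call
nonce = os.urandom(12)           # nonce reuse is acceptable only because this is a benchmark

n = 200
start = time.perf_counter()
for _ in range(n):
    aead.encrypt(nonce, chunk, None)
elapsed = time.perf_counter() - start

print(f"AES-128-GCM: ~{n / elapsed:.0f} MB/s on one core")
```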
Connection establishment, however, is different. Depending on the implementation (e.g. support for TLS false start), it will add round trips, which can cause noticeable delays. Additionally, expensive crypto takes place on the first connection establishment (the above-mentioned CPU could only accept 14 connections per core per second if you foolishly use 4096-bit keys, and 100 if you use 2048-bit keys). On subsequent connections, previous sessions are often reused, avoiding the expensive crypto.
So, to summarize:
Transfer on established connection:
Delay: nearly none
CPU: negligible
Bandwidth: negligible
First connection establishment:
Delay: additional round-trips
Bandwidth: several kilobytes (certificates)
CPU on client: medium
CPU on server: high
Subsequent connection establishments:
Delay: additional round-trip (not sure if one or multiple; may be implementation-dependent)
Bandwidth: negligible
CPU: nearly none