CDN server with HTTP/1.1 vs. web server with HTTP/2

I have a hosted web server with HTTP/2 (medium fast) and, additionally, I have space on a fast CDN server that only supports HTTP/1.1.
Is it recommended to load some resources from the CDN, or should I use only the web server because of HTTP/2?
Could loading too many resources from the CDN become a bottleneck due to HTTP/1.1?
I'd be grateful for some hints...

You need to test. It really depends on your app, your users and your servers.
Under HTTP/1.1 you are limited to 6 connections to a domain. So hosting content on a separate domain (e.g. static.example.com) or loading it from a CDN was a way to increase that limit beyond 6. These separate domains are also often cookie-less, which is good for performance and security. And finally, if loading jQuery from code.jquery.com, you might benefit from the user having already downloaded it for another site and so save that download completely (though with the number of versions of libraries and CDNs out there, the chance of a commonly used library already being downloaded and sitting in the browser cache is questionable in my opinion).
However separate domains requires setting up a separate connection. Which means a DNS lookup, a TCP connection and usually an HTTPS handshake too. This all takes time and especially if downloading just one asset (e.g. jQuery) then those can often eat up any benefits from having the assets hosted on a separate site! This is in fact why browsers limit the connections to 6 - there was a diminishing rate of return in increasing it beyond that. I've questioned the value of sharded domains for a while because of this and people shouldn't just assume that they will be faster.
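To make that setup cost concrete, here is a minimal Python sketch (a rough illustration rather than a proper benchmark; the hostname is just an illustrative CDN-style domain) that times the three phases of opening a fresh HTTPS connection:

import socket, ssl, time

host = "code.jquery.com"   # example CDN-style host, substitute your own

t0 = time.perf_counter()
addr = socket.getaddrinfo(host, 443)[0][4][0]             # DNS lookup
t1 = time.perf_counter()
sock = socket.create_connection((addr, 443), timeout=5)   # TCP handshake
t2 = time.perf_counter()
ctx = ssl.create_default_context()
tls = ctx.wrap_socket(sock, server_hostname=host)         # TLS handshake
t3 = time.perf_counter()

print(f"DNS {1000*(t1-t0):.0f} ms, TCP {1000*(t2-t1):.0f} ms, TLS {1000*(t3-t2):.0f} ms")
tls.close()

All of that has to happen before the first byte of the asset is even requested, which is the cost the paragraph above is describing.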
HTTP/2 aims to remove the need for separate (aka sharded) domains by allowing multiplexing on a single connection, thereby effectively removing the limit of 6 "connections", but without the downsides of separate connections. It also allows HTTP header compression, reducing the performance downside of sending large cookies back and forth.
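For illustration, a minimal sketch of what multiplexing looks like from the client side, using the third-party httpx library (pip install "httpx[http2]"); the URL and paths are placeholders, not anything from the question:

import httpx

with httpx.Client(http2=True) as client:
    # All of these requests can share one connection; over HTTP/2 they are
    # multiplexed as streams instead of queuing behind a 6-connection limit.
    for path in ["/styles.css", "/app.js", "/logo.png"]:
        r = client.get("https://example.com" + path)
        print(path, r.status_code, r.http_version)   # "HTTP/2" if negotiated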
So in that sense I would recommend just serving everything from your local server. Not everyone will be on HTTP/2 of course, but support is incredibly strong, so most users should be.
However, the other benefit of a CDN is that they are usually globally distributed. So a user on the other side of the world can connect to a local CDN server, rather than come all the way back to your server. This helps with connection time (as the TCP and HTTPS handshakes happen over shorter distances) and content can also be cached there. Though if the CDN has to refer back to the origin server for a lot of content then there is still a lag (though the benefits for the TCP and HTTPS setup are still there).
So in that sense I would advise using a CDN. However I would say put all the content through this CDN rather than just some of it as you are suggesting, though you are right that HTTP/1.1 could limit the usefulness of that. That's odd though, as most commercial CDNs support HTTP/2, and you also say you have a "CDN server" (rather than a network of servers - plural), so maybe you mean a static domain rather than a true CDN?
Either way it all comes down to testing because, as stated at the beginning of this answer, it really depends on your app, your users and your servers, and there is no one true, definitive answer here.
Hopefully that gives you some idea of the things to consider. If you want to know more, because Stack Overflow really isn't the place for some of this and this answer is already long enough, then I've just written a book which spends large parts discussing all this: https://www.manning.com/books/http2-in-action

Related

Why might an HTTPS page request be faster than HTTP?

This article mentions, and this site seems designed to show, that HTTPS can be faster than HTTP. I'm surprised; I thought HTTPS was just HTTP plus encryption, which adds a small, likely negligible amount of work but doesn't remove any.
Why might an HTTPS page load be faster than one over HTTP?
It's a bit of a con to be honest.
HTTPS is slower than HTTP. There's no denying that. HTTPS works over HTTP so has to do everything HTTP does and more. Now, with good web server config, the computational cost of HTTPS is almost non-existent to the average user on today's modern hardware but it is there. But it also slows down the first page render as it takes a few hundred extra milliseconds to set up the HTTPS connection. Again not a big deal for most people but it is there.
Now there is the argument that someone - be it a mobile network or ISP or whatever - can alter HTTP traffic by injecting ads and the like, potentially slowing down a website, but that's not the reason for the speed difference here.
The reason that website is faster is because it is using HTTP/2 when using HTTPS and not when using HTTP. HTTP/2 is faster than HTTP/1.1 - especially for websites with lots of resources.
Of course you can say that HTTP/2 is only available over HTTPS, and while that is true*, the converse is not - implementing HTTPS does not automatically give you HTTP/2.
*Well technically it's not true that HTTP/2 requires HTTPS as per the spec, but all the browser makers have said they will only support this over HTTPS so it basically is true to all intents and purposes.
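If you want to see this negotiation yourself: browsers (and other clients) pick HTTP/2 via ALPN during the TLS handshake, which you can observe with the Python standard library alone; the host below is just an example:

import socket, ssl

host = "www.google.com"   # example host
ctx = ssl.create_default_context()
ctx.set_alpn_protocols(["h2", "http/1.1"])   # offer HTTP/2, fall back to 1.1

with socket.create_connection((host, 443), timeout=5) as sock:
    with ctx.wrap_socket(sock, server_hostname=host) as tls:
        print(tls.selected_alpn_protocol())   # "h2" if the server speaks HTTP/2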
Additionally the sample website loads 360 small and near identical (but crucially not identical) resources - precisely the sort of thing that HTTP/2 is very good at. And while average web pages are growing, most of them don't load 360 near identical images where network latency is basically the only bottleneck. Most have other issues as well that are nothing to do with the network latency issues that HTTP/2 massively improves.
The speed gains for HTTP/2 are hugely impressive and it is the future and everyone should use it, as latency is a major bottleneck. But that test site is an extreme example of it. Depending on the exact site's make up, HTTP/2 will mostly offset the cost of HTTPS and in many cases more than offset it - but that does not mean HTTPS itself is faster.
There are very good reasons to use HTTPS, and the article is fantastic for listing them all (except for that first one). In my opinion HTTPS should be the default and everyone should move to it - precisely for the other reasons listed. But it's a lie to say HTTPS is faster than HTTP. Or, at the very least, it obfuscates the truth by not explaining why it can be faster. And then it lists HTTP/2 as a second, seemingly unrelated, reason, which further confuses the reader! I just don't understand why they couldn't combine these two points into one and fully explain this, so that questions like this didn't need to be asked. Same for that sample site - why is there no FAQ to explain why HTTPS is apparently faster?
Historically yes, HTTPS was HTTP + SSL/TLS, so it was slower.
But now with SPDY/HTTP/2 it's a new protocol, which can be faster than HTTP when dealing with multiple requests:
it can compress headers, and if you send the same header multiple times (like cookies) it doesn't need to send it in full each time, just an ID
it can reuse TCP connections, so it avoids the overhead of opening multiple TCP connections and streams data efficiently
If you use some kind of network scanner (e.g. a component of an antivirus, a proxy or a firewall), it may scan plain HTTP traffic, causing a slowdown. At the same time, it won't touch encrypted HTTPS traffic unless you have installed a special root certificate that lets the intermediate scanner process HTTPS traffic. So if there is some kind of intermediate service scanning HTTP traffic, but not HTTPS, using HTTPS will be much faster.

Do websites share cached files?

We're currently doing optimizations to our web project, and our lead has told us to push the use of CDNs for external libraries as opposed to including them in a compile+compress process and serving them from a cache-enabled nginx setup.
His assumption is that if the user visits example.com, which uses a CDN'ed version of jQuery, the jQuery is cached at that point. If the user then happens to visit example2.com, which happens to use the same CDN'ed jQuery, the jQuery will be loaded from cache instead of over the network.
So my question is: Do domains actually share their cache?
I argued that even if the browser does share the cache, the problem is that we are running on the assumption that the previously visited sites used the exact same CDN'ed file from the exact same CDN. What are the chances of a user having browsed a site using the same CDN'ed file? He said to use the largest CDN to increase the chances.
So the follow-up question would be: If the browser does share cache, is it worth the hassle to optimize based on his assumption?
I have looked up topics about CDNs and I have found nothing about this "shared domain cache" or CDNs being used this way.
Well, your lead is right; this is basic HTTP.
All you are doing is indicating to the client where it can find the file.
The client then handles sending a request to the CDN in compliance with their caching rules.
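As a small illustration of that, you can inspect the caching rules a public CDN publishes for a library file yourself; this sketch uses the third-party requests library and an example jQuery URL that may change over time:

import requests

# The browser caches a CDN asset purely by its URL plus the response headers,
# so the CDN's published Cache-Control/ETag headers are what drive reuse.
r = requests.head("https://code.jquery.com/jquery-3.6.0.min.js", timeout=5)
print(r.status_code)
print(r.headers.get("Cache-Control"))   # e.g. a long max-age set by the CDN
print(r.headers.get("ETag"))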
But you shouldn't over-use CDNs for libraries either; keep in mind that if you need a specific version of a library, especially an older one, you are unlikely to get many cache hits because of version fragmentation.
For widely used and heavy libraries like jQuery, using the latest version is recommended.
If you can take them all from the same CDN (e.g. Google's), all the better, especially as HTTP/2 is coming.
Additionally they save you bandwidth, which can amount to a lot when you have high loads of traffic, and they can reduce the load time for users far from your server (Google's is great for this).

Good practice or bad practice to force entire site to HTTPS?

I have a site that works very well when everything is in HTTPS (authentication, web services, etc.). If I mix HTTP and HTTPS it requires more coding (cross-domain problems).
I don't seem to see many web sites that are entirely in HTTPS so I was wondering if it was a bad idea to go about it this way?
Edit: Site is to be hosted on Azure cloud where Bandwidth and CPU usage could be an issue...
EDIT 10 years later: The correct answer is now to use https only.
You lose a lot of features with HTTPS (mainly related to performance):
Proxies cannot cache pages
You cannot use a reverse proxy for performance improvement
You cannot host multiple domains on the same IP address
Obviously, the encryption consumes CPU
Maybe that's no problem for you though, it really depends on the requirements
HTTPS decreases server throughput so may be a bad idea if your hardware can't cope with it. You might find this post useful. This paper (academic) also discusses the overhead of HTTPS.
If you have HTTP requests coming from an HTTPS page you'll force the user to confirm the loading of insecure data. That is annoying on some websites I use.
This question and especially the answers are OBSOLETE. This question should be tagged: <meta name="robots" content="noindex"> so that it no longer appears in search results.
To make THIS answer relevant:
Google is now penalizing website search rankings when they fail to use TLS/https. You will ALSO be penalized in rankings for duplicate content, so be careful to serve a page EITHER as http OR https BUT NEVER BOTH (Or use accurate canonical tags!)
Google is also aggressively indicating insecure connections which has a negative impact on conversions by frightening-off would-be users.
This is in pursuit of a TLS-only web/internet, which is a GOOD thing. TLS is not just about keeping your passwords secure — it's about keeping your entire world-facing environment secure and authentic.
The "performance penalty" myth is really just based on antiquated obsolete technology. This is a comparison that shows TLS being faster than HTTP (however it should be noted that page is also a comparison of encrypted HTTP/2 HTTPS vs Plaintext HTTP/1.1).
It is fairly easy and free to implement using LetsEncrypt if you don't already have a certificate in place.
If you DO have a certificate, then batten down the hatches and use HTTPS everywhere.
TL;DR, here in 2019 it is ideal to use TLS site-wide, and advisable to use HTTP/2 as well.
</soapbox>
If you've no side effects then you are probably okay for now and might be happy not to create work where it is not needed.
However, there is little reason to encrypt all your traffic. Certainly login credentials and other sensitive data need it. One of the main things you would be losing out on is downstream caching: your servers, the intermediate ISPs and users cannot cache the HTTPS content. This may not be completely relevant, as it reads as though you are only providing services. However, it completely depends on your setup, whether there is any opportunity for caching, and whether performance is an issue at all.
It is a good idea to use all-HTTPS - or at least provide knowledgeable users with the option for all-HTTPS.
If there are certain cases where HTTPS is completely useless and in those cases you find that performance is degraded, only then would you default to or permit non-HTTPS.
I hate running into pointlessly all-HTTPS sites that handle nothing that really requires encryption, mainly because they all seem to be 10x slower than every other site I visit. For example, most of the documentation pages on developer.mozilla.org force you to view them over HTTPS for no reason whatsoever, and they always take a long time to load.

Will it ever be possible to run all web traffic via HTTPS?

I was considering what it would take (technologically) to move all web traffic to HTTPS. I thought that computers are getting faster and faster, so some time from now it will be possible to run all traffic via HTTPS without any noticeable cost.
But then again, I thought, encryption strength will have to evolve to counter the loss of security. If computers get 10x faster, encryption will have to be 10x stronger, or it will be 10x easier to break.
So, will we ever be able to encrypt all web traffic "for free"?
Edit: I'm asking only about the logic of performance increases in computing vs encryption. If we can use the same crypto algorithms and keys in 20 years, they will consume a far lower percentage of the overall computing capacity of a server (or client), and in effect, that will make it "free" to encrypt and sign everything that we transmit over networks.
One of the big issues with using HTTPS is that it's considered secure, and so most web browsers don't do any caching, or at least do very limited caching.
Without the cache, you'll notice that HTTPS pages load significantly slower than a non-encrypted page would.
HTTPS should be used to protect sensitive information.
I have no idea about the CPU impact of running everything through SSL. I would say that on the client side the CPU isn't an issue, since most workstations are running idle most of the time anyway. The big problem would be on the web server side, due to the sheer number of concurrent requests being handled.
In order to get to the point that SSL is basically 'free', you'd have to have dedicated hardware for encryption (which already exists today).
EDIT: Based on the comments, the question's author suggests this is the answer he was looking for:
"Using crypto is already pretty fast, particularly considering that we're using CPU cycles vs. data transmission. Crypto keys do not need to get longer. I don't think there's any technical reason why this is impractical." - David Thornley
UPDATE: I just read that Google's SPDY protocol (designed to replace HTTP) looks like it will use SSL on every connection. So, it looks like Google thinks that it's possible!
"To make SSL the underlying transport protocol, for better security and compatibility with existing network infrastructure. Although SSL does introduce a latency penalty, we believe that the long-term future of the web depends on a secure network connection. In addition, the use of SSL is necessary to ensure that communication across existing proxies is not broken."
Chris Thompson mentions browser caching, but that's easily fixable in the browser. What isn't fixable on switching everything to HTTPS is proxy caching. Because HTTPS is encrypted end-to-end, transparent HTTP proxies don't work. There are a lot of places where transparent proxying can speed things up (for instance at NAT boundaries).
Dealing with the additional bandwidth from losing transparent proxying is probably doable - allegedly HTTP traffic is trivial compared with p2p anyway, so it's not as if transparent proxies are the only thing keeping the internet online. It will hit latency irrevocably, and make a slashdotting even worse than it is currently. But then with cloud hosting, both those might be dealt with by tech. Of course "secure server" takes on a different meaning with cloud hosting, or even with other forms of de-centralisation of content across the network like akamai.
I don't think the CPU overhead is that significant. Sure, if your server is currently CPU bound at least some of the time, then switching all traffic from HTTP to HTTPS will kill it stone dead. Some servers may decide that HTTPS is not worth the monetary cost of a CPU that can handle the load, and they will prevent literally everyone adopting it. But I doubt it will be a major barrier for long. For instance, Google has crossed it already and happily serves apps (although not searches) as https without fuss. And the more work servers are doing per connection, the less proportional extra work is required to SSL-secure that connection. SSL can be and is hardware accelerated where necessary.
There's also the management/economic problem that HTTPS relies on trusted CAs, and trusted CAs cost money. There are other ways to design a PKI than the one SSL actually uses, but there are reasons SSL works how it does. For example SSH places the responsibility on the user to obtain a key fingerprint from the server by a secure side-channel, and this is the result: some users don't think that level of inconvenience is justified by its security purpose. If users don't want security, then they won't get it unless it's impossible for them to avoid it.
If users just auto-click "accept" for untrusted SSL certificates, then you pretty much might as well not have it, since these days a man-in-the-middle attack is not significantly more difficult than plain eavesdropping. So, again, there's a significant block of servers which just aren't interested in paying for (working) HTTPS.
Encryption would not have to get 10x stronger in the sense that you would not need to use 10x more bits. The difficulty of brute force cracking increases exponentially with an increasing key length. At most key lengths would have to get slightly longer.
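A quick back-of-the-envelope illustration of why: brute-forcing an n-bit key takes on the order of 2^n trials, so a 10x speed-up in the attacker's hardware is absorbed by only a few extra key bits:

import math

# If attackers' hardware gets 10x faster, you need ~log2(10) extra key bits
# to restore the original safety margin - roughly 3.3 bits, not 10x more bits.
extra_bits = math.log2(10)
print(extra_bits)        # ~3.32
print(2 ** extra_bits)   # ~10.0 -> the original work factor is restored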
What would be the point of running all traffic through SSL, even stuff where there is obviously no advantage? This seems incredibly wasteful. For example, it seems ridiculous to download a Linux distro through SSL.
The cost isn't that great nowadays.
Also... having a computer that is 10x faster will in no way make it necessary to change the encryption. AES (a common cipher used with SSL) is strong enough that it would take a very, very long time to break.
Will it be possible? YES
Will it be advisable? NO
For a few reasons.
extra cpu cycles on server and client would use more power which incurs cost and emissions
ssl certs would be required for every server
it's useless to encrypt data that doesn't need to be hidden
IMO, the answer is no. The main reason is that if you consider how many pages include items from multiple sources, each of those sources would have to use HTTPS and have a valid certificate, and I don't think that would work for some of the big companies that would have to change all their links.
It isn't a bad idea, and maybe some Web x.0 will have more secure communications by default, but I don't think HTTP will be that protocol.
Just to give a couple of examples, though I am from Canada which may affect how these sites render:
www.msn.com :
atdmt.com
s-msn.com
live.com
www.cnn.com :
revsci.net
cnn.net
turner.com
dl-rms.com
Those were listed through "NoScript", which notes that this page has code from "google-analytics.com" and "quantserve.com" besides stackoverflow.com itself - a third example of this.
A major difference with https is that a session is kept open until you close it. Saves a lot of hassle with session cookies but puts a load on the server.
How long should google keep the https session with you alive after you send a query?
Do you want a persistent connection to every popup ad?

HTTPS on Apache; Will it slow Apache?

Our company runs a website which currently supports only http traffic.
We plan to support https traffic too as some of the customers who link to our pages want us to support https traffic.
Our website gets a moderate amount of traffic, but that is expected to increase over time.
So my question is this:
Is it a good idea to make our website https only? (redirect all http traffic to https)
Will this bring down the website's performance?
Has anyone done any sort of measurement?
PS: I am a developer who also doubles up as an Apache admin.
Yes, it will impact performance, but it's usually not too bad compared to running all the DB queries that go into the typical dynamically generated page.
Of course the real answer is: don't guess, benchmark it. Try it both ways and see the difference. You can use tools like siege and ab to simulate traffic.
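As a rough sketch of the "try it both ways" idea in Python (for real load testing use ab or siege as mentioned; the URLs are placeholders for your own site), you could time the same page over HTTP and over HTTPS and compare medians:

import time, statistics, requests

def median_fetch_ms(url, n=20):
    times = []
    for _ in range(n):
        t0 = time.perf_counter()
        requests.get(url, timeout=10)   # new connection each time, so handshake cost is included
        times.append((time.perf_counter() - t0) * 1000)
    return statistics.median(times)

print("http :", median_fetch_ms("http://example.com/"))
print("https:", median_fetch_ms("https://example.com/"))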
Also, I think you may have more luck with this question over at http://www.serverfault.com/
I wouldn't worry about the load on the server; unless you are serving high volumes of static content, the encryption itself won't create much of a burden, in my experience.
However, using SSL dramatically slows down web sites by creating a lot more latency in connection setup.
An encrypted session requires about* three times as much time to set up as an unencrypted one, and the exact time depends on the latency.
Even on low latency connections, it is noticeable to the end user, but on higher latency (e.g. different continents, especially Australasia where latency to America/Europe is quite high) it makes a dramatic difference and will severely impact the user experience.
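The rough arithmetic behind that "about three times" estimate, assuming a classic TLS 1.2 full handshake (TLS 1.3 reduces this to one round trip) and a made-up intercontinental round-trip time:

rtt_ms = 150                             # e.g. an Australasia <-> Europe link
plain_setup = 1 * rtt_ms                 # TCP three-way handshake only
tls_setup = plain_setup + 2 * rtt_ms     # plus ~2 round trips for the TLS 1.2 handshake
print(plain_setup, tls_setup)            # 150 ms vs 450 ms before the first request is sent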
There are things you can do to mitigate it, such as ensuring that keep-alives are on (But don't turn them on without understanding exactly what the impact is), minimising the number of requests and maximising the use of browser cache.
Using HTTPS also affects browser behaviour in some cases. Certain optimisations tend to get turned off for security reasons, and some web browsers don't store objects loaded over HTTPS in the disc cache, which means they'll need to get them again in a later session, further impacting the user experience.
* An estimate based on some informal measurement
"Is it a good idea to make our website https only? (redirect all http traffic to https) Will this bring down the website's performance?"
I'm not sure if you really mean all HTTP traffic or just page traffic. A lot of sites unnecessarily encrypt images, JavaScript and a bunch of other content that doesn't need to be hidden. This kind of content comprises most of the data transferred in a request, so if you do feel that HTTPS is taking too much out of the system, you can recommend the programmers separate the content that needs to be secured from the content that does not.
Most web servers, unless severely underpowered, use only a small fraction of their CPU power for serving up content. Most production servers I've seen are under 10%, even when serving some SSL traffic. I think it would be best to see where your current CPU usage is at, and then do some of your own benchmarking to see how much extra CPU an SSL request uses. I would guess it isn't that much.
No, it is not a good idea to make any website HTTPS-only. Page loading speed might be a little slower, because your server has to perform a redirect for each plain-HTTP page request. It is a better idea to make only the pages that may contain secure/personal/sensitive information of users or the organization HTTPS. Even where user information passes through other pages, you can use HTTPS there too. Web pages with information that can be shown to anyone in the world can normally use HTTP. Finally, it is up to your requirements: if all pages contain secure information, you may make the website HTTPS-only.
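For completeness, if you do decide to redirect all HTTP traffic to HTTPS as the question describes, the redirect itself is just a one-line response. Here is a minimal pure-Python sketch with a placeholder hostname, though in practice you would configure this in Apache (e.g. mod_rewrite or a Redirect directive) rather than in application code:

from http.server import BaseHTTPRequestHandler, HTTPServer

class RedirectToHTTPS(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(301)   # permanent redirect
        self.send_header("Location", "https://www.example.com" + self.path)
        self.end_headers()

if __name__ == "__main__":
    # Port 8080 to avoid needing root; real deployments listen on port 80.
    HTTPServer(("", 8080), RedirectToHTTPS).serve_forever()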
