Are CDNs that don't cache useful?

Is there any benefit to (HTTP-) serving a non-cacheable resource over a CDN?
(my use case: I'm serving a static Single Page App and I'd like to improve its load time, but I don't want index.html to get cached, because I want every new release to be reflected immediately. Specifically, this static site is hosted on AWS S3, and the CDN is AWS CloudFront.)
I assume that most of the performance benefits of CDNs come from caching, but I could imagine other benefits from, say, privileged network infrastructure. As I don't know the first thing about networks, this may sound like a silly question.

Yes, it can be useful by moving the content closer to the user. Most CDNs will serve your static file from a geographic location as close to the user as possible, which typically means lower latency.
Of course, you need to have users across the globe for this to make sense to you.
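For the CloudFront/S3 use case in the question, one common pattern is to let the CDN cache everything except index.html. Below is a minimal sketch using the AWS SDK for JavaScript (v2); the bucket name, file names, and region are hypothetical:

// Upload a SPA so that CloudFront can cache fingerprinted assets for a year,
// while index.html must be revalidated on every request.
const AWS = require('aws-sdk');
const fs = require('fs');
const s3 = new AWS.S3({ region: 'us-east-1' }); // hypothetical region

async function deploy() {
  // The entry point: "no-cache" means caches may store it but must
  // revalidate against the origin before serving it, so a new release
  // shows up on the next request.
  await s3.putObject({
    Bucket: 'my-spa-bucket', // hypothetical bucket
    Key: 'index.html',
    Body: fs.readFileSync('dist/index.html'),
    ContentType: 'text/html',
    CacheControl: 'no-cache, must-revalidate',
  }).promise();

  // A fingerprinted bundle: the hash in the filename changes on every build,
  // so the old URL is never reused and the file is safe to cache forever.
  await s3.putObject({
    Bucket: 'my-spa-bucket',
    Key: 'assets/app.3f9a1c.js', // hypothetical hashed filename
    Body: fs.readFileSync('dist/assets/app.3f9a1c.js'),
    ContentType: 'application/javascript',
    CacheControl: 'public, max-age=31536000, immutable',
  }).promise();
}

deploy().catch(console.error);

Even the uncached index.html still benefits from CloudFront: the revalidation round trip rides on CloudFront's persistent connections to the origin, which is typically faster than a cold client connection straight to S3.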

Related

Why is it recommended to store images on a remote server?

Sorry for such a basic question, but I can't find any article on the web covering the pros and cons. I'm guessing it is about async uploading and downloading, but that's just a guess. Is there detailed info on this somewhere?
It's mostly about specialization, data locality, and concurrency.
Servers that specialize in serving static content typically do so much faster than dynamic web servers (they are optimized for that specific use case).
You also have the advantage of storing your content in many zones to achieve better performance (the content is physically closer to the person requesting it), whereas web applications typically should be near their other dependencies, such as databases.
Lastly, browsers (for HTTP/1.x at least) only allow a fixed number of connections per host, so if your images and API calls are served from separate hosts, one cannot starve the other in terms of request scheduling.
There are a lot of other reasons I'm sure, but these are just off the top of my head.
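To make the connection-limit point concrete, here is a toy sketch of the classic HTTP/1.x trick of spreading assets across hostnames (the hostnames are hypothetical). Note that under HTTP/2, which multiplexes many requests over one connection, this sharding becomes counterproductive:

// Domain sharding: the browser's per-host connection limit (~6 under
// HTTP/1.x) applies to each hostname separately, so spreading assets
// over two hosts roughly doubles parallel downloads.
const SHARDS = ['cdn1.example.com', 'cdn2.example.com']; // hypothetical hosts

// Pick a shard deterministically from the path, so a given asset always
// maps to the same hostname and stays cacheable.
function assetUrl(path) {
  let hash = 0;
  for (const ch of path) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  return 'https://' + SHARDS[hash % SHARDS.length] + path;
}

console.log(assetUrl('/images/logo.png'));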

Not getting improvements by using CDN

I've just added a CDN distribution using Amazon CloudFront to my Rails application on Heroku, and it's working OK.
My homepage serves around 11 static assets. I've run some tests using http://www.webpagetest.org/ and there is no difference (in terms of performance and load times) between using the CDN and not using it.
Is there any particular reason why this could be happening?
My region is Latin America btw, so it's using the All locations edge option.
Thanks.
The main benefits of using a CDN from Amazon or others are that the assets are hosted on fast, reliable servers and that the traffic is offloaded from your own server. If you already have a fast dedicated server, you won't see a considerable boost.
But another benefit is that the assets may already be cached in the user's browser (from visiting other websites that use the same CDN), so the visitor will have a better experience even the first time they visit your site.
A couple of suggestions.
If the site CSS is one of the static assets that you have moved to CloudFront then I would try moving it back to your main server.
Since page display can't start until the site CSS is downloaded, you want to serve this as fast as possible. If it's coming from a CDN, the browser first has to resolve and connect to a second host before it can even request the file.
Also, use the waterfall display from webpagetest.org to pinpoint where the bottlenecks are.
Good luck!

Is it better to use Cache or CDN?

I was studying browser performance when loading static files, and this question came up.
Some people say that using CDN-hosted static files (e.g. Google Code, the latest jQuery, the AJAX CDN, ...) is better for performance, because they are requested from a different domain than the web page itself.
Another way to improve performance is to set the Expires header to some months in the future, forcing the browser to cache the static files and cutting down on requests.
I'm wondering which approach is best for performance, and whether I can combine both.
Ultimately it is better to employ both techniques if you are doing web performance optimization (WPO) of a site, also known as front-end optimization (FEO). They can work amazingly well hand in hand. Although if I had to pick one over the other, I'd definitely pick caching any day. In fact, I'd say it's imperative that you set up proper resource caching for all web projects, even if you are going to use a CDN.
Caching
Setting Expires headers and caching resources is a must and should be done 100% of the time. There really is no excuse for not caching. On Apache this is super easy to configure after enabling mod_expires and mod_headers. The HTML5 Boilerplate project has a good implementation example in its .htaccess file, and if your server is something else like nginx, lighttpd or IIS, check out these other server configs.
Here's a good read if anyone is interested in learning about caching: Mark Nottingham's Caching Tutorial
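If your stack is Node rather than Apache, a rough equivalent of that far-future Expires setup might look like the sketch below (using Express; the paths and durations are illustrative):

// Far-future caching for static assets, short-lived caching for HTML.
const express = require('express');
const app = express();

// Fingerprinted assets: cache for a year and never revalidate.
app.use('/assets', express.static('public/assets', {
  maxAge: '1y',
  immutable: true, // adds "immutable" to the Cache-Control header
}));

// HTML: always revalidate, so deploys are picked up immediately.
app.get('/', (req, res) => {
  res.set('Cache-Control', 'no-cache');
  res.sendFile('index.html', { root: 'public' });
});

app.listen(3000);

The same split applies whatever the server: long max-age (or Expires) for versioned resources, revalidation for the documents that reference them.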
Content Delivery Network
You mentioned Google Code, the latest jQuery, and the AJAX CDN, and I want to touch on CDNs in general, including those you pay for to host your own resources; but the same applies if you are simply using the hosted jQuery files or loading something from http://cdnjs.com/, for example.
I would say a CDN is less important than setting server-side caching headers, but a CDN can provide significant performance gains. Your content delivery network performance will also vary depending on the provider.
This is especially true if you serve a worldwide audience and the CDN provider has many worldwide edge/peer locations. It will also significantly reduce your web hosting bandwidth and (a bit) your CPU usage, since you're offloading some of the delivery work to the CDN.
A CDN can, in some rarer cases, have a negative impact on performance, if the latency of the CDN ends up being higher than your server's. Also, if you over-optimize and employ too much parallelization of resources (using multiple subdomains like cdn1, cdn2, cdn3, etc.), it is possible to end up slowing down the user experience and causing overhead with extra DNS lookups. A good balance is needed here.
One other negative impact that can happen is if the CDN is down. It has happened, and it will happen again, especially with free CDNs. If the CDN goes down, for whatever reason, so does your site. It is yet another potential single point of failure (SPOF). For JavaScript resources you can get clever: load the resource from the CDN and, should that fail for whatever reason, detect it and load a local copy. Here's an example of loading jQuery from ajax.googleapis.com with a fallback (taken from the HTML5 Boilerplate):
<!-- Try the CDN copy first; if window.jQuery is still undefined afterwards, write a script tag pointing at the local fallback -->
<script src="//ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min.js"></script>
<script>window.jQuery || document.write('<script src="js/vendor/jquery-1.8.2.min.js"><\/script>')</script>
Aside from the obvious free resources out there (jQuery, Google APIs, etc.), if you're using a CDN you may have to pay a usage fee, which adds to hosting costs. For some CDNs you even have to pay extra to get access to certain locations; for example, Asian nodes might cost more than North American ones.
For public applications, go for a CDN.
Caching helps for repeated requests, but not for the first request.
To ensure a fast load on the first page visit, use a CDN; chances are pretty good that the file is already in the browser cache from another site using the same CDN.
As others have already mentioned, CDN responses are of course heavily cached too.
However if you have an intranet website you might want to host the files yourself as they typically load faster from an internal source than from a CDN.
You then also have the option to combine several files into one to reduce the number of requests.
A CDN has the benefit of providing multiple servers and automatically routing your traffic to the closest location to your client. This can result in faster delivery, optimized by location.
Also, static content doesn't require special application servers (like dynamic content does), so offloading it to a CDN removes that traffic from your application servers entirely. A streaming video clip may be too big to cache, or should not be cached, but you don't necessarily want to support that bandwidth yourself. A CDN will take on that traffic for you.
It is not always about the cache. A small application web server may just want to provide the dynamic content but needs a solution for the heavy hitting media that rarely changes. CDNs handle the scaling issue for you.
Agree with @Anthony_Hatzopoulos (+1)
A CDN complements caching; in some cases, it will even help optimize caching directives.
For example, a company I work for has integrated behavior-learning algorithms into its CDN, to identify and dynamically cache generated objects.
Ordinarily, these objects would be uncacheable (i.e. served with a "Cache-Control: max-age=0" HTTP header). But in this case, the system is able to identify caching opportunities and override the original HTTP header directives. (For example: a dynamically generated popular product page that should be cached, or a popular search results page that, while generated dynamically, is still presented time after time in the same form to thousands of users.)
And yes, before you ask, the system can also identify personalized data and verify freshness, to prevent false positives... :)
Implementing such an algorithm was only possible due to a reverse-proxy CDN technology. This is an example of how CDN and Caching can complement each other, to create better and smarter acceleration solutions.
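As a toy illustration of the general idea (not the actual product logic; the threshold and lifetime here are invented), a reverse proxy can count requests per URL and start overriding the origin's Cache-Control once a page has proven popular:

// Sketch: override Cache-Control for dynamically generated pages that
// keep being requested in the same form.
const express = require('express');
const onHeaders = require('on-headers'); // runs a callback just before headers are sent
const app = express();

const hits = new Map();           // path -> request count
const POPULARITY_THRESHOLD = 100; // invented threshold

app.use((req, res, next) => {
  const count = (hits.get(req.path) || 0) + 1;
  hits.set(req.path, count);

  onHeaders(res, () => {
    // Even if the origin handler set "max-age=0", a popular page gets a
    // short cache lifetime so downstream caches can absorb the load.
    if (count > POPULARITY_THRESHOLD) {
      res.set('Cache-Control', 'public, max-age=60');
    }
  });
  next();
});

A production system would of course also need the personalization and freshness checks mentioned above before it could safely do this.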
The expert answers above explain CDN technology and caching perfectly, so I'll just add my personal experience.
I worked on a Joomla VirtueMart site that unfortunately couldn't be updated to new Joomla and VirtueMart versions, because there were too many customized fields in the product pages. Once traffic grew to about 900 visitors/day, lots of users could not put items in their basket, because ordering an item triggered lots of JS and AJAX calls and took too much time.
After optimizing the site we decided to use a CDN, and performance got noticeably better; going by the GTmetrix record, the YSlow score was 50% at first, and after optimization + CDN it rose to 74%:
https://gtmetrix.com/reports/www.florihana.com/jWlY35im
And from the CDN dashboard you can see which data center serves the most traffic and incurs the most data charges, which helps you target your marketing.
But yes, when configuring a CDN you have to be careful with purge times, and balance the number of CDN resources, because if one goes down you need to figure out which CDN resource is causing the problem.
Hope this helps.

What's a good alternative to page caching on Heroku?

I understand page caching isn't a good option on Heroku, since each dyno has an ephemeral filesystem (so they wouldn't share files, and the cache would get wiped out on each restart).
So I'm wondering what the best alternative is. I have a large number of potential files that could get generated in a traditional page caching scenario (say 10GB-100GB), so redis/memcached don't seem like good options here. Redis can write out to disk, but my understanding is that once you exceed its memory capacity, it's not the right solution to start reading off of disk.
Has anyone found a good solution here? I'm thinking maybe MongoStore. (And some way to run this in conjunction with redis since I'm using redis for some other scenarios.) Thanks!
If your site is 100% static content and never going to be dynamic, S3 may be a good option. You can then create a CNAME to the S3 domain, which also allows you to leverage CloudFront should you need it. Otherwise, 100GB would have to go into the database, which your application would then have to pull up.
Heroku's cedar stack allows for custom buildpacks. This one vendors nginx. This would be good if you envision transitioning to a more dynamic site.
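Building on the S3 suggestion: if the content isn't fully static, one sketch of page caching without a local filesystem is to treat S3 as a shared cache-aside store for rendered pages, so every dyno sees the same cache and restarts don't wipe it. Bucket and key names here are hypothetical:

// Cache-aside page caching backed by S3 (AWS SDK for JavaScript, v2).
const AWS = require('aws-sdk');
const s3 = new AWS.S3();
const BUCKET = 'page-cache-bucket'; // hypothetical bucket

async function getPage(path, render) {
  const key = 'pages' + path + '.html';
  try {
    // Cache hit: serve the stored copy.
    const cached = await s3.getObject({ Bucket: BUCKET, Key: key }).promise();
    return cached.Body.toString('utf8');
  } catch (err) {
    if (err.code !== 'NoSuchKey') throw err;
    // Cache miss: render once, store for all dynos, then serve.
    const html = await render();
    await s3.putObject({
      Bucket: BUCKET,
      Key: key,
      Body: html,
      ContentType: 'text/html',
    }).promise();
    return html;
  }
}

Invalidation is the usual catch: on a content change you'd delete the affected keys (or version them), just as you would expire files in classic page caching.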

What are the speediest web hosting choices out there that can scale to large traffic spikes and handle fast page loads?

Is cloud hosting the way to go? Or is there something better that delivers fast page loads?
The reason I ask is because I run a BuddyPress site on a Bluehost dedicated server, but it seems to run slow at most times of the day. This scares me because at the moment the site's not live, and I'm afraid that when it gets traffic it'll get worse and my visitors will lose interest. I use Amazon's cloud to handle all my media, JS, and CSS files, along with a caching plugin, but it still loads slow at times.
I feel like the problem is Bluehost, because I visit other sites running BuddyPress and they seem to load instantly. I'm not web-hosting savvy, so can someone please point me in the right direction here?
The hosting choice depends on many factors such as technical requirements, growth rates, burst rates, budgets and more.
Bigger Hardware
To scale up hosting operation, your first choice is often just using a more powerful server, VPS, or cloud instance. The point is not so much cloud vs. dedicated but that you simply bring more compute power to the problem. Cloud can make scaling up easier - often with a few clicks.
Division of Labor
The next step often is division of labor. You offload the database, static content, caching, or other items to specific servers or services. For example, you could offload static content to a CDN, or use a dedicated database server.
Once again, cloud vs non-cloud is not the issue. The point is to bring more resources to your hosting problems.
Pick the Right Application Stack
I cannot stress enough how important picking the right underlying technology is for your needs. For example, I recently helped a client switch from an Apache/PHP stack to a Varnish/Nginx/PHP-FPM stack for a very busy WordPress operation (>100 million page views/mo). This change boosted capacity by nearly 5X with modest hardware changes.
Same App. Different Story
Also, just because you are using a specific application does not mean the same hosting setup will work for you. I don't know the specific app you are using, but with Drupal, WordPress, Joomla, vBulletin and others, the plugins, site design, themes and other items are critical to overall performance.
To complicate matters, user behavior is something to consider as well. Consider a discussion forum with a 95:1 read:post ratio. What if you do something in the design to encourage more posts, and that ratio moves to 75:1? That means more database writes, less caching, etc.
In short, details matter, so get a good understanding of your application before you start to scale out hosting.
A hosting service is part of the solution. Another part is proper server configuration.
For instance, this guy optimized his setup to serve 10 million requests a day off a micro instance on AWS.
I think you should look at your server config first, then shop for other hosts. If you can't control server configuration, try AWS, Rackspace or other cloud services.
Just an FYI: you can sign up for AWS and use a micro instance free for one year. In the link I posted, he just optimized on the same server. You might have to upgrade to a small instance, because Amazon has stated that micro instances are only meant to handle spikes, not sustained traffic.
Good luck.
