Domain aliasing vs. edge side includes for CDN caching

I'm designing a web application to support use of a CDN in the future.
Two options I've considered:
(1) Use domain aliasing for static content on the site, including CSS, JS, and some images.
(2) Use "edge side includes" to designate static content regions.
(1) is simpler and I've implemented it before. For example, we would prefix each IMG src with http://images1.mysite.com/, and then later update the corresponding DNS to use the CDN. The drawback I've heard from users of our internal "pre-production" site is that they would have to push the images to images1.mysite.com to preview their changes internally -- ideally, files would not get pushed to images1.mysite.com until they're ready for production. (NOTE - hosts file changes and DNS tricks are not an option here.)
Instead, they would like to simply use relative or absolute paths for static content, e.g. /images/myimage.gif.
(2) is not as familiar to me and I would like more info. Would this allow our "pre-production" team to reference static content with a relative path in the "pre-production" environment and yet have it work with the CDN in production without HTML modifications?
Could someone compare the two options, in terms of ease of development, flexibility, and cost?

Here's a variation on the second option to consider.
Leave relative image URLs alone in your HTML. On your production server, have image requests return a server-side redirect to the image location on the CDN. This generates marginally more traffic than the other techniques, but it generates an access log entry for each image hit, keeps your HTML and site structure simple, factors specific CDN dependencies out of your site source, and lets you enable, disable or switch CDN-based image service on the fly.
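For example, a minimal sketch of that redirect in an Apache .htaccess at the document root, assuming mod_rewrite is enabled and using a hypothetical cdn.example.com as the CDN hostname:

RewriteEngine On
# Bounce image requests to the CDN with a temporary (302) redirect;
# removing this rule switches image service back to the local copies.
RewriteRule ^images/(.*)$ http://cdn.example.com/images/$1 [R=302,L]

Because each image hit still reaches your server before being redirected, it shows up in your access log, as noted above.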
If you are using a demand-pulled CDN such as Coral, you also need to ensure that requests either issued by or declined by the CDN are served directly from your production server. See Using CoralCDN as a server operator for more information on this technique.

Related

Use Google-hosted jQuery UI or self-host a custom download of jQuery UI?

I'm working on a site where we are using the slide function from jquery-ui.
The Google-hosted minified version of jquery-ui weighs 63KB - this is for the whole library. The custom download of just the slide function weighs 14KB.
Obviously, if a user has already cached the Google-hosted version it's a no-brainer, but if they haven't, it will take longer to load than if I just lumped the custom jquery-ui slide function into my main.js file.
I guess it comes down to how many other sites use jquery-ui (if this were just plain jquery the above would be a no-brainer, as loads of sites use jquery, but I'm less sure about how widely jquery-ui is used)...
I can't work out what the best thing to do is in the above scenario.
I'd say if the custom selective build is that small, both absolutely and relatively, there's a good reason to choose that path.
Loading a JavaScript resource has several implications, in the following order of events:
Loading: Request/response communication or, in the case of a cache hit, fetching from the cache. Keep in mind that, CDN or not, the communication cost only affects the first page load. If your site is built in a traditional "full page request" style (as opposed to SPAs and the like), this is essentially a non-issue.
Parsing: The JS engine needs to parse the entire resource.
Executing: The JS engine executes the entire resource. That means that any initialization / loading code is executed, even if that's initialization for features that aren't used in the hosting page.
Memory usage: The memory usage depends on the entire resource. That includes static objects as well as functions (which are also objects).
With that in mind, having a smaller resource is advantageous in ways beyond simple loading. Moreover, a request for such a small resource is negligible in terms of communication. You wouldn't even think twice about it had it been a mini version of the company logo somewhere at the bottom of the screen where nobody even notices it.
As a side note and potential optimization, if your site serves any proprietary library, or a group of less common libraries, you can bundle all of these together, including the jQuery UI subset, and your users will only have a single request, again making this advantageous.
Go with the Google hosted version
It is likely that the user has recently visited another website that loads jQuery-UI from Google's servers.
It will take load off your server and let other elements load faster.
Browsers only load a limited number of resources concurrently from one domain. Loading jQuery-UI from Google's servers ensures it is downloaded in parallel with the resources that reside on your own servers.
The Yahoo Developer Network recommends using a CDN; their full reasons are posted here: https://developer.yahoo.com/performance/rules.html
This quote from their site really seals it in my mind.
"Deploying your content across multiple, geographically dispersed servers will make your pages load faster from the user's perspective."
I am not an expert, but here are my two cents anyway. With a CDN you can be sure of reduced latency, plus, as mentioned, the user is most likely to have already picked it up from some other website that uses the Google-hosted copy. Also, the thing I always care about: it saves bandwidth.

Should I use https for static files (images, css)

I use https protocol for my login, registration, admin pages of my web app.
If I don't write some htaccess rule, all my static files (images, css, js, etc.) are loaded through https too.
Does this decrease the performance of my app, and would it be better to use http for all static resources of my app?
If you attempt to include a static file over HTTP while the original dynamic page was served through HTTPS, the browser might emit a warning that the page is trying to serve non-secure content over a secure channel, so you should avoid doing that. There is, of course, a penalty for serving a resource over HTTPS, but static files are usually cached by browsers, so that shouldn't be much of a problem. You might also consider minifying and combining your scripts into a single file in order to reduce the number of HTTP(S) requests made to the server; that's where you will gain most.
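On the caching point, a rough sketch of what that could look like in Apache, assuming mod_expires and mod_headers are available (the extension list is just an example):

# Let browsers cache static assets for a month so the HTTPS overhead
# is mostly paid on the first request only.
<FilesMatch "\.(css|js|png|jpg|gif)$">
    ExpiresActive On
    ExpiresDefault "access plus 1 month"
    Header append Cache-Control "public"
</FilesMatch>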
For your images you might also consider using a technique called CSS sprites.

Cloudfront, EC2 and Relative URLs

This is probably a simple question, but I can't find a straightforward answer anywhere.
My website is hosted on Amazon EC2.
I want to take advantage of Amazon Cloudfront to speed up the loading of the images, Javascript and CSS on my site.
Throughout my site, I've used relative URLs to point to the images, Javascript and CSS that reside on my EC2 server.
So, now, in order to take advantage of Cloudfront, do I need to change all my relative URLs into absolute URLs which point to Cloudfront, or will this be handled automatically by Amazon/EC2/Cloudfront? Or, maybe a better way to ask the question is, can I leave all my URLs as relative URLs and still get all the advantages of Cloudfront?
The short answer is no, your relative URLs will not work as expected on CloudFront. The exception, as Gio Hunt mentions, is that once your page loads a CSS file from CloudFront, any relative url inside that CSS file will itself resolve to CloudFront, but this probably isn't very useful in your case.
See this answer for a solution using SASS that pretty closely matches what I've done in the past:
I used SASS - http://sass-lang.com
I have a mixin called cdn.scss with content like $image_path: "/images/";
Import that mixin in the sass stylesheet: @import "cdn.scss";
Update image paths as such: background:url($image_path + "image.png");
On deployment I change the $image_path variable in cdn.scss and then rerun sass.
Basically you regenerate your CSS to use the CDN (CloudFront) base url by creating a variable that all your stylesheets respect. The difficulty involved in doing this will depend on how many references and files you need to change, but a simple search and replace for relative paths is pretty easy to accomplish.
Good luck.
If you want to leave everything as is, you could pass everything through CloudFront by setting up your site as a custom origin. This can work pretty well if your site is mostly static.
If you do want to take advantage of CloudFront without sending everything through it, you will need to update your relative URLs to absolute ones. CSS files can keep relative URLs as long as the CSS file itself is served via CloudFront.

Using mod_proxy to redirect asset requests to a CDN

I'm interested in using a CDN for storing some assets in my web application. However, instead of hardcoding the CDN url into each of my assets (Javascript and CSS), I'd like to use a simple RewriteRule to redirect requests for assets to a CDN.
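Something along these lines is what I have in mind, where cdn.example.com is just a stand-in for the real CDN hostname and /assets/ stands in for wherever the static files live (one rule or the other, not both):

RewriteEngine On
# Proxy variant: the server fetches the asset from the CDN and relays it
# to the client (requires mod_proxy); the URL the client sees never changes.
RewriteRule ^assets/(.*)$ http://cdn.example.com/assets/$1 [P,L]
# Redirect variant: the client is sent a 302 and fetches the asset from
# the CDN directly; the server still sees and logs every asset request.
RewriteRule ^assets/(.*)$ http://cdn.example.com/assets/$1 [R=302,L]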
However, I'm wondering if there are some disadvantages to this method. For one, the server still has to process every asset request, since it needs to identify and redirect such requests. For another, I'm concerned that the CDN will see the location of my server rather than the location of my client.
Has anyone ever dealt with this kind of strategy, and what was your solution? Thanks!
That's not a great strategy, as it completely nullifies any benefit of using a CDN. For every request for a static asset, your server has to process a request, which is exactly what you're trying to avoid.
Bite the bullet and set up your application to be configurable (you do use basic configuration, correct?) enough that you can change the base URL for all your static assets.

Question about using subdomains to force caching

I haven't had a huge opportunity to research the subject but I figure I'll just ask the question and see if we can create a knowledge base on the subject here.
1) Using subdomains will supposedly force a client-side cache. Is this the default behaviour, or is there an easy way for a client to disable it? I'm mostly curious about what percentage of users I should expect this to affect.
2) What all will be cached? Images? Stylesheets? Flash SWFs? Javascripts? Everything?
3) I remember reading that you must use a subdomain or www in your URL for this to work, is this correct? (and does this mean SO won't allow it?)
I plan on integrating this into all of my websites eventually, but first I am going to try it on a network of Flash game websites. I'm thinking www.example.com for the website will remain the same, but instead of using www.example.com/images, www.example.com/stylesheets, www.example.com/javascript, and www.example.com/swfs, I will just create subdomains that point to them (img.example.com, css.example.com, js.example.com and swf.example.com respectively) -- is this the best course of action?
Using subdomains for content elements isn't so much to force caching, but to trick a browser into opening more connections than it might otherwise do. This can speed up page load time.
Caching of those elements is entirely down to the HTTP headers delivered with that content.
For static files like CSS, JS etc, a server will typically tell the client when the file was modified, which allows a browser to ask for the file "If-Modified-Since" that timestamp. Specifics of how to improve on this by adding some extra caching headers would depend on which webserver you use. For example, with Apache you can use the mod_expires module to set the Expires header, or the Header directive to output other types of cache control headers.
As an example, if you had a subdirectory with your css files in it, and wanted to ensure they were cached for at least an hour, you could place a .htaccess file in that directory with these contents:
ExpiresActive On
ExpiresDefault "access plus 1 hours"
Check out YSlow's documentation. YSlow is a plugin for Firebug, the amazing Firefox web development tool. There is lots of good info on a number of ways to speed up your page loads, one of which is using one or more subdomains to encourage the browser to do more parallel object loads.
One thing I've done on two Django sites is to use a custom template tag to create pseudo-paths to images, css, etc. The path contains the time-last-modified as a pseudo directory. This path component is stripped out by an Apache .htaccess mod_rewrite rule. The object is then given a 10 year time-to-live (ExpiresDefault "now plus 10 years") so the browser will only load it once. If the object changes, the pseudo path changes and the browser will fetch the updated object.
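For what it's worth, a rough sketch of the Apache side of that technique, assuming mod_rewrite and mod_expires, and assuming the pseudo-path looks like /static/1340000000/logo.png with the timestamp as the middle component (the /static/ prefix and the extension list are just placeholders):

RewriteEngine On
# Strip the timestamp pseudo-directory so /static/1340000000/logo.png
# is actually served from /static/logo.png on disk.
RewriteRule ^static/[0-9]+/(.+)$ /static/$1 [L]

# Far-future expiry: when a file changes, its pseudo-path changes too,
# so the browser fetches the new object instead of reusing the stale one.
<FilesMatch "\.(css|js|png|jpg|gif)$">
    ExpiresActive On
    ExpiresDefault "now plus 10 years"
</FilesMatch>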
