Understanding how images are served and cached

So I'm wondering how browsers treat requests for images. I'm hoping to use a CDN to serve the product images on my website. I'd also like to use the CDN for button images and the images used in my CSS.
The problem with this is that I don't have control over the Expires headers (Rackspace Cloud Files is what I'm looking into).
See, say I have a large image file as the background on my home page. The page is accessed often, but the image stays the same. Is the browser going to request this image every time?
Or should I just use a CDN for my product images?

Caching is quite a broad subject. I suggest you start by reading about the different kinds of caching here: http://www.mnot.net/cache_docs/#BROWSER and about how caching works here: http://www.web-caching.com/mnot_tutorial/how.html
Now, to answer your question: assuming the user has caching enabled and the CDN's response headers are properly configured, a user visiting your page multiple times will only request that background image once, until the cached copy expires or is evicted to make room for other files.
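For reference, "properly configured" usually means the image response carries explicit freshness information. As a rough illustration, here is a minimal PHP sketch of the kind of headers you'd want on a long-lived image (the filename and the one-year lifetime are assumptions for the example, not anything Rackspace-specific):

<?php
// Hypothetical example: serve an image with a one-year freshness lifetime
// so returning visitors read it from their local cache instead of re-requesting it.
$file = 'background.jpg'; // assumed filename

header('Content-Type: image/jpeg');
header('Cache-Control: public, max-age=31536000'); // one year, in seconds
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 31536000) . ' GMT');
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', filemtime($file)) . ' GMT');
readfile($file);

With headers like these, the browser won't even send a conditional request for the image until the lifetime runs out (explicit reloads aside).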

No, AFAIK you do need to add caching headers to your images to enable browser caching. This is a great tutorial about it.
Additionally you can read this article from Yahoo to get a quick overview of the topic.
Review especially these topics of the article:
Minimize HTTP Requests
Add an Expires or a Cache-Control Header
Use a Content Delivery Network
Hope it helps you

Related

Employing a CDN for a dynamic website

I have a website forum where users exchange photos and text with one another on the home page. The home page shows the 20 latest objects - be they photos or text. The 21st object is pushed out of view. A new photo is uploaded every 5 seconds. A new text string is posted every second. In around 20 seconds, a photo that appeared at the top has disappeared at the bottom.
My question is: would I get a performance improvement if I introduced a CDN in the mix?
Since the content is changing, it seems I shouldn't be doing it. However, when I think about it logically, it does seem I'll get a performance improvement from introducing a CDN for my photos. Here's how. Imagine a photo is posted, appearing on the page at t=1 and remaining there till t=20. The first person to access the page (closer to t=1) will cause the photo to be pulled to an edge server. Thereafter, anyone accessing the photo will be receiving it from the CDN; this will last till t=20, after which the photo disappears. This is a veritable performance boost.
Can anyone comment on what the flaws in my reasoning are, and/or what I am failing to consider? It would also be good to know what alternative performance optimizations I can make for a website like mine. Thanks in advance.
You've got it right. As long as someone accesses the photo within the 20 seconds that the image is within view it will be pulled to an edge server. Then upon subsequent requests, other visitors will receive a cached response from the nearest edge server.
As long as you're using the CDN for delivering just your static assets, there should be no issues with your setup.
Additionally, you may want to look into a CDN which supports HTTP/2. This will provide you with improved performance. Check out cdncomparison.com for a comparison between popular CDN providers.
You need to consider all requests hitting your server, which includes the primary dynamically generated HTML document, but also all static assets like CSS files, JavaScript files and, yes, image files (both static and user-uploaded content). An HTML document will reference several other assets, each of which needs to be downloaded separately and thus incurs a server hit. Assuming for the sake of argument that each visitor has an empty local cache, a single page load may incur, say, ~50 resource hits for your server.
Probably the only request which actually needs to be handled by your server is the dynamically generated HTML document, if it's specific to the user (because they're logged in). All other 49 resource requests are identical for all visitors and can easily be shunted off to a CDN. Those will just hit your server once [per region], and then be cached by the CDN and rarely bother your server again. You can even have the CDN cache public HTML documents: for non-logged-in users, you can let the CDN cache HTML documents for ~5 seconds, depending on how up-to-date you want your site to appear; that way the CDN can handle an entire browsing session without hitting your server at all.
If you have roughly one new upload per second, there are likely an order of magnitude more passive visitors per second. If you can let a CDN handle ~99% of requests, that's a dramatic reduction in actual hits to your server. If you are clever with what you cache and for how long, and depending on your particular mix of anonymous and authenticated users, you can easily reduce server load by an order of magnitude or two. At the same time, you're speeding up page load times accordingly for your visitors.
For every single HTML document and other asset, think carefully about whether it can be cached and for how long:
For HTML documents, is the user logged in? If no, and there's no other specific cookie tracking or similar things going on, then the asset is static and public for all intents and purposes and can be cached. Decide on a maximum age for the document and let the CDN cache it. Even caching it for just a second makes a giant difference when you get 1000 hits per second.
If the user is logged in, set Cache-Control to private, but still let the visitor's browser cache it for a few seconds. These headers must be decided upon by your forum software while it's generating the document (see the sketch after this list).
For all other assets which aren't access restricted: let the CDN cache them for a long time and you can practically forget about ever having to serve those particular files again. These headers can be statically configured for entire directories in the web server.
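As a rough illustration, the header logic in the forum software could look something like this minimal PHP sketch (the session key and the exact lifetimes are assumptions you'd tune for your own mix of users):

<?php
// Hypothetical sketch: decide cache headers while generating the HTML document.
session_cache_limiter(''); // stop PHP from emitting its own no-cache headers
session_start();

if (isset($_SESSION['user_id'])) { // assumed session key for logged-in users
    // Personalised page: only this visitor's browser may cache it, briefly.
    header('Cache-Control: private, max-age=5');
} else {
    // Anonymous page: identical for everyone, so shared caches may keep it too.
    // max-age applies to the browser, s-maxage to shared caches like the CDN.
    header('Cache-Control: public, max-age=5, s-maxage=5');
}

Even a 5-second shared lifetime means the CDN only fetches each public page from your server at most once every 5 seconds per edge location, no matter how many visitors request it.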

How to cache images on IIS7

I have a web application that gets a large number of images via a server-side proxy script, which re-sizes them to specific thumbnail sizes.
So the URL of a given image would be something like:
myDomain.com/scripts/myScript.php?url=anotherDomain.com/images/someImage.jpg&width=120&height=80
Due to some requirements I have to proxy the images this way, and I need to cache them to ease the processing and re-sizing load on the server.
How do I go about configuring IIS7 to cache these urls?
I don't have in-depth knowledge of headers, so if needed, your elaboration will be very much appreciated.
thanks
In IIS Manager, click your site, open "HTTP Response Headers", then click "Set Common Headers...". Enable "Expire Web content", select "After" instead of "Immediately", and specify when you'd like the content to expire.
Source
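Note that the IIS expiry settings above are aimed at content IIS serves directly; since your thumbnails come out of a PHP script, it's worth making the script itself cache-friendly as well. Here is a rough sketch of the idea, assuming nothing about your existing code (the cache directory and the resizeImage() helper are made up for illustration):

<?php
// Hypothetical sketch for myScript.php: keep resized thumbnails on disk so
// each url+size combination is only processed once, and send caching headers.
$url    = $_GET['url'];    // validate/whitelist this in real code!
$width  = (int) $_GET['width'];
$height = (int) $_GET['height'];

// One cache file per unique url+size combination.
$cacheFile = __DIR__ . '/cache/' . md5($url . '|' . $width . 'x' . $height) . '.jpg';

if (!file_exists($cacheFile)) {
    // resizeImage() stands in for your existing fetching/re-sizing code.
    file_put_contents($cacheFile, resizeImage($url, $width, $height));
}

header('Content-Type: image/jpeg');
header('Cache-Control: public, max-age=86400'); // let clients keep it for a day
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', filemtime($cacheFile)) . ' GMT');
readfile($cacheFile);

The disk cache eases the re-sizing load regardless of what any browser does, and the Cache-Control header spares you repeat requests from the same client on top of that.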

Caching a website image gallery

I have a website that shows hundreds or thousands of images to users. Users reload the gallery fairly frequently, but their caches are invariably not big enough to keep the images from visit to visit, so they have to slowly re-download them.
What are good ways to ensure that images get cached? I've already ruled out using localstorage. What advice can you give me?
Just to be clear, the HTTP headers are set correctly (folks with big caches get instantaneous loads); it's just that the browsers' default caches are not big enough.
Have you considered caching on the server side, like Varnish in front of your webserver? If you have enough memory on your server(s), this will help a lot with download times for your users (and with your server load).
Or are you just interested in the client-side cache? If so, it would help if you could post your HTTP cache headers.

Image server HTTP vs HTTPS in IIS

Using IIS6 on Windows Server 2003: if I have an image server, is it more efficient to serve up images over HTTP than HTTPS? Also, is it possible to serve up images over HTTP on a web page that is HTTPS?
For example, say the HTML file Secure.html is on a server that forces HTTPS, at https://www.myultrahighsecurewebsite.com/Secure.html/
and it contains an image like so:
<img src="http://www.notsosecureatall.com/imagecode/serveupanimage.aspx?id=1&auth=1" />
Is this a good way to serve up images? Is there a better way? Also, I wanted to have a de facto site to serve up images so I don't have to push images as well as code and markup between development and live. Should images, or certain types of images like TIFF, PNG, SVG, etc., be on HTTPS vs HTTP, or does it matter?
If I do implement an image server or a page dedicated to keeping track of images, I just want an easy system to push images from development to my production environment without getting snagged on installation problems, but also without having to worry about customers having access to images until a version has been properly approved.
It will be more efficient from a CPU (client & server) and network point of view to use plain old HTTP. You'll avoid the overhead of TLS traffic and encryption/decryption of data.
The scenario (mixed HTTP/HTTPS) you present will often annoy a user, though. You'll get a warning similar to "This page has unsecured content" (aimed at the HTTP-only images).
See Loading http content on https website for more info (and additional gotchas) on the previous paragraph.
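If you do end up mixing the two, one common workaround is a protocol-relative URL, e.g. src="//www.notsosecureatall.com/imagecode/serveupanimage.aspx?id=1&auth=1", so the image is fetched over whichever scheme the page itself uses. That avoids the mixed-content warning, but note it requires the image host to support HTTPS as well.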
Are you sure that you need to worry about efficient loading of images? Don't overengineer this. Please share some more details if you've noticed that this is a problem.

Question about using subdomains to force caching

I haven't had a huge opportunity to research this, but I figure I'll just ask the question and see if we can create a knowledge base on the subject here.
1) Using subdomains will supposedly force a client-side cache; is this by default, or is there an easy way for a client to disable it? I'm mostly curious about what percentage of users I should expect this to affect.
2) What all will be cached? Images? Stylesheets? Flash SWFs? JavaScript? Everything?
3) I remember reading that you must use a subdomain or www in your URL for this to work, is this correct? (and does this mean SO won't allow it?)
I plan on integrating this into all of my websites eventually, but first I am going to try it on a network of Flash game websites. I am thinking www.example.com for the website will remain the same, but instead of using www.example.com/images, www.example.com/stylesheets, www.example.com/javascript, & www.example.com/swfs I will just create subdomains that point to them (img.example.com, css.example.com, js.example.com & swf.example.com respectively) -- is this the best course of action?
Using subdomains for content elements isn't so much to force caching, but to trick a browser into opening more connections than it might otherwise do. This can speed up page load time.
Caching of those elements is entirely down to the HTTP headers delivered with that content.
For static files like CSS, JS etc, a server will typically tell the client when the file was modified, which allows a browser to ask for the file "If-Modified-Since" that timestamp. Specifics of how to improve on this by adding some extra caching headers would depend on which webserver you use. For example, with Apache you can use the mod_expires module to set the Expires header, or the Header directive to output other types of cache control headers.
As an example, if you had a subdirectory containing your CSS files and wanted to ensure they were cached for at least an hour, you could place a .htaccess file in that directory with these contents:
# Requires mod_expires to be enabled
ExpiresActive On
# Send an Expires header one hour after each access
ExpiresDefault "access plus 1 hour"
Check out YSlow's documentation. YSlow is a plugin for Firebug, the amazing Firefox web development extension. There is lots of good info on a number of ways to speed up your page loads, one of which is using one or more subdomains to encourage the browser to do more parallel object loads.
One thing I've done on two Django sites is to use a custom template tag to create pseudo-paths to images, css, etc. The path contains the time-last-modified as a pseudo directory. This path component is stripped out by an Apache .htaccess mod_rewrite rule. The object is then given a 10 year time-to-live (ExpiresDefault "now plus 10 years") so the browser will only load it once. If the object changes, the pseudo path changes and the browser will fetch the updated object.
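The same cache-busting idea carries over to other stacks. Here is a minimal PHP sketch of a simpler variant that puts the last-modified time in a query string rather than a pseudo directory, which avoids needing a mod_rewrite rule (though beware that some older proxies refuse to cache URLs containing query strings):

<?php
// Hypothetical helper: build an asset URL that changes whenever the file does,
// so the file itself can safely be served with a far-future Expires header.
function versioned_url($path) {
    return $path . '?v=' . filemtime($_SERVER['DOCUMENT_ROOT'] . $path);
}
?>
<link rel="stylesheet" href="<?php echo versioned_url('/css/site.css'); ?>">
<img src="<?php echo versioned_url('/images/logo.png'); ?>">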
