website Caching image gallery - caching

I have a website that shows hundreds / thousands of images to users. Users reload the gallery fairly frequently, but their caches are invariably not big enough to save the images from visit to visit, and have to slowly re-download.
What are good ways to ensure that images get cached? I've already ruled out using localstorage. What advice can you give me?
Just to be clear, the http headers are set correctly (folks with big caches get instantaneous loads), just that the browser default caches are not big enough.

Have you considered caching on the server-side like varnish in front of your webserver? If you have enough memory on your server(s) this will help alot on the download-time for your users (and your serverload).
Or are you just interested in client-side cache? If so, it would help if you could post your http cache-headers.

Related

Employing a CDN for a dynamic website

I have a website forum where users exchange photos and text with one another on the home page. The home page shows 20 latest objects - be they photos or text. The 21st object is pushed out out of view. A new photo is uploaded every 5 seconds. A new text string is posted every second. In around 20 seconds, a photo that appeared at the top has disappeared at the bottom.
My question is: would I get a performance improvement if I introduced a CDN in the mix?
Since the content is changing, it seems I shouldn't be doing it. However, when I think about it logically, it does seem I'll get a performance improvement from introducing a CDN for my photos. Here's how. Imagine a photo is posted, appearing on the page at t=1 and remaining there till t=20. The first person to access the page (closer to t=1) will enable to photo to be pulled to an edge server. Thereafter, anyone accessing the photo will be receiving it from the CDN; this will last till t=20, after which the photo disappears. This is a veritable performance boost.
Can anyone comment on what are the flaws in my reasoning, and/or what am I failing to consider? Also would be good to know what alternative performance optimizations I can make for a website like mine. Thanks in advance.
You've got it right. As long as someone accesses the photo within the 20 seconds that the image is within view it will be pulled to an edge server. Then upon subsequent requests, other visitors will receive a cached response from the nearest edge server.
As long as you're using the CDN for delivering just your static assets, there should be no issues with your setup.
Additionally, you may want to look into a CDN which supports HTTP/2. This will provide you with improved performance. Check out cdncomparison.com for a comparison between popular CDN providers.
You need to consider all requests hitting your server, which includes the primary dynamically generated HTML document, but also all static assets like CSS files, Javascript files and, yes, image files (both static and user uploaded content). An HTML document will reference several other assets, each of which needs to be downloaded separately and thus incurs a server hit. Assuming for the sake of argument that each visitor has an empty local cache, a single page load may incur, say, ~50 resource hits for your server.
Probably the only request which actually needs to be handled by your server is the dynamically generated HTML document, if it's specific to the user (because they're logged in). All other 49 resource requests are identical for all visitors and can easily be shunted off to a CDN. Those will just hit your server once [per region], and then be cached by the CDN and rarely bother your server again. You can even have the CDN cache public HTML documents, e.g. for non-logged in users, you can let the CDN cache HTML documents for ~5 seconds, depending on how up-to-date you want your site to appear; so the CDN can handle an entire browsing session without hitting your server at all.
If you have roughly one new upload per second, that means there is likely a magnitude more passive visitors per second. If you can let a CDN handle ~99% of requests, that's a dramatic reduction in actual hits to your server. If you are clever with what you cache and for how long and depending on your particular mix of anonymous and authenticated users, you can easily reduce server loads by a magnitude or two. On the other side, you're speeding up page load times accordingly for your visitors.
For every single HTML document and other asset, really think whether this can be cached and for how long:
For HTML documents, is the user logged in? If no, and there's no other specific cookie tracking or similar things going on, then the asset is static and public for all intents and purposes and can be cached. Decide on a maximum age for the document and let the CDN cache it. Even caching it for just a second makes a giant difference when you get 1000 hits per second.
If the user is logged in, set the cache pragma to private, but still let the visitor's browser cache it for a few seconds. These headers must be decided upon by your forum software while it's generating the document.
For all other assets which aren't access restricted: let the CDN cache it for a long time and you can practically forget about ever having to serve those particular files ever again. These headers can be statically configured for entire directories in the web server.

Fixing slow response time for resources

I have a Magento website and I have been noticing an increase in warnings from Catchpoint that various images, CSS files, and javascript files are taking longer than usual to load. We use Edgecast for our CDN and have all images, CSS, and JS files hosted there. I have been in contact with them and they determined that the delays happen when the cache for the resource has expired and it must contact the origin for an updated file. The problem is that I can't figure out why it would take longer than a second to return a small image file. If I load the offending image off our server (not from the CDN) in my browser it always returns quickly. I assume that if you call up an image file directly using the full URL to the image file (say a product image, for example), that would bypass any Magento logic or database access and simply return the image to you. This should happen quickly, and it normally does, but sometimes it doesn't.
We have a number of things in play that may have an effect. There are API calls to the server for various integrations, though they are directed at a secondary server and not the web frontend. We may also have a large number of stale images since Magento doesn't delete any images even if you replace them or delete the product.
I realize this is a fairly open ended question, and I'm sorry if it breaks SO protocol, but I'm grasping at straws here. If anyone has any ideas on where to look or what could cause small resource files, like images, to take upwards of 8 seconds to load, I'm all ears. As an eCommerce site, it's getting close to peak season, and I can feel the hot breath of management on my neck. Any help would be greatly appreciated.
Thanks!
Turns out we had stumbled upon some problems with the CDN that they were somewhat aware of and not quick to admit. They made some changes to our account to work around the issues and things are much better now.

Understanding how images are served and cached

So I'm wondering how browsers treat requests for images. I'm hoping to use a cdn for serving product images on my website. I'd also like to use the cdn for serving button images and images used in my css.
The problem with this is that I don't have control over the expires headers (Rackspace files is what I'm looking into).
See, say I have a large image file as a background on my home page. So the page is accessed often, but the image stays the same. Is the browser going to request this image every time?
Or should I just use a cdn for my product images?
caching is quite a broad subject. I suggest you start by reading about the different kinds of caching here http://www.mnot.net/cache_docs/#BROWSER and how caching works here http://www.web-caching.com/mnot_tutorial/how.html
Now, to answer your question: assuming the user has caching enabled and the cdn response headers are properly configured a user visiting your page multiple times will only request that background image once until the cache expires or those files are cleaned.
No, AFAIK you need necessarily to add the 'cache' header to your images to enable browser caching. This is a great tutorial about it.
Additionally you can read this article from Yahoo to get a very brief view of the topics.
Review specially these topics of the article:
Minimize HTTP Requests
Add an Expires or a Cache-Control Header
Use a Content Delivery Network
Hope it helps you

Why is my Symfony site so slow? Can someone decipher this pingdom?

my site is really really slow, 4603prospect.com? Sometimes its ok, sometimes its slow. I am caching thumbnails, I dont understand what to make of this pingdom report.... http://tools.pingdom.com/?url=4603prospect.com&treeview=0&column=objectID&order=1&type=0&save=false
Thanks
Todd
Things that might help to check:
Are you using shared hosting? Your server may be responding slowly due to the activity on other websites on the server
Are your images falling foul of an htaccess rule which results in redirects?
How many images are you loading at once? Most browsers can only make ~8 http requests at a time. You could try combining some of your CSS and JS files together to limit the HTTP requests.

Mixing Secure and Non-Secure Content on Web Pages - Is it a good idea?

I'm trying to come up with ways to speed up my secure web site. Because there are a lot of CSS images that need to be loaded, it can slow down the site since secure resources are not cached to disk by the browser and must be retrieved more often than they really need to.
One thing I was considering is perhaps moving style-based images and javascript libraries to a non-secure sub-domain so that the browser could cache these resources that don't pose a security risk (a gradient isn't exactly sensitive material).
I wanted to see what other people thought about doing something like this. Is this a feasible idea or should I go about optimizing my site in other ways like using CSS sprite-maps, etc. to reduce requests and bandwidth?
Browsers (especially IE) get jumpy about this and alert users that there's mixed content on the page. We tried it and had a couple of users call in to question the security of our site. I wouldn't recommend it. Having users lose their sense of security when using your site is not worth the added speed.
Do not mix content, there is nothing more annoying then having to go and click the yes button on that dialog. I wish IE would let me always select show mixed content sites. As Chris said don't do it.
If you want to optimize your site, there are plenty of ways, if SSL is the only way left buy a hardware accelerator....hmmm if you load an image using http will it be cached if you load it with https? Just a side question that I need to go find out.
Be aware that in IE 7 there are issues with mixing secure and non-secure items on the same page, so this may result in some users not being able to view all the content of your pages properly. Not that I endorse IE 7, but recently I had to look into this issue, and it's a pain to deal with.
This is not advisable at all. The reason browsers give you such trouble about insecure content on secure pages is it exposes information about the current session and leaves you vulnerable to man-in-the-middle attacks. I'll grant there probably isn't much a 3rd party could do to sniff venerable info if the only insecured content is images, but CSS can contain reference to javascript/vbscript via behavior files (IE). If your javascript is served insecurely, there isn't much that can be done to prevent a rouge script scraping your webpage at an inopportune time.
At best, you might be able to get a way with iframing secure content to keep the look and feel. As a consumer I really don't like it, but as a web developer I've had to do that before due to no other pragmatic options. But, frankly, there's just as many if not more defects with that, too, as after all, you're hoping that something doesn't violate the integrity of the insecure content so that it may host the secure content and not some alternate content.
It's just not a great idea from a security perspective.

Resources