RFC2616 13.13, Browser History and Caching

I've been trying to get my head around the whole issue of browser history vs. caching and RFC2616 13.13.
Does this section of the RFC mean that if a user goes "Back" in the browser, for example, it should always display the page from its local storage, ignoring any cache directives, unless the user has configured it otherwise?
So browsers that reload the page when navigating the history, even if caching directives are instructing them to do so, are not complying with the specification? And the spec is saying this is bad because "this will tend to force service authors to avoid using HTTP expiration controls and cache controls when they would otherwise like to."
Also, even though a directive may instruct the browser not to cache, e.g. using Cache-Control: no-store, it can/should still store the page in its history?
From what I've read, it seems that most browsers violate the standard, apart from Opera. Is this because the security concerns around the re-display of pages with sensitive data from history are seen as more important than the issue the standard talks about?
I'd be grateful if anyone is able to shed some light/clarification on this area, thanks.

History and cache are completely separate. We're trying to clarify this in httpbis; see https://svn.tools.ietf.org/svn/wg/httpbis/draft-ietf-httpbis/latest/p6-cache.html#history.lists
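To make the distinction concrete, here is a minimal sketch (assuming Node with Express; the /account route is hypothetical) of the strictest caching directives a server can send. Even with these, a spec-compliant history mechanism may still redisplay the page when the user goes "Back":

import express from "express";

const app = express();

// Hypothetical route serving a page with sensitive data.
app.get("/account", (req, res) => {
  // no-store forbids caches from keeping a copy, but per RFC2616 13.13
  // the browser's history list may still redisplay the page on "Back".
  res.set("Cache-Control", "no-store, no-cache, must-revalidate");
  res.set("Pragma", "no-cache"); // HTTP/1.0 fallback
  res.set("Expires", "0");
  res.send("sensitive account page");
});

app.listen(3000);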

Is output of debugClientLibs flag cached in CQ/AEM

I've been using the debugClientLibs flag in my AEM pages (helpful for debugging clientlib-related issues) like this: localhost:4502/content/geometrixx/en.html?debugClientLibs=true.
Recently, I was seeing a caching-related issue with JS. I noticed that when using the debugClientLibs flag, the no-cache header was not included in the request headers of the individual JS files.
It does not make sense to cache these individual files, as that would defeat the purpose of debugging clientlibs (I would not want to see cached JS and CSS files when I'm using the debugClientLibs flag on my pages). Attaching a screenshot of the request and response headers I got.
My question here is:
Are these individual clientlib files cached in the browser?
Short answer - it depends.
Every browser has its own implementation of networking and caching rules. Response headers are hints to the browser to help it be more efficient, but a browser may choose to do its own thing. Even more confusingly, the behavior may change between versions of a given browser. Further, even if the browser's default behavior is to follow (or ignore) such headers, the user may configure different behavior. So don't assume anything, especially globally.
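So if you genuinely need fresh files while debugging, don't rely on headers at all; make each URL unique instead. A rough sketch (buildDebugUrl is a hypothetical helper, not part of AEM):

// Hypothetical helper, not part of AEM: a fresh timestamp per request makes
// the URL unique, so the browser cannot answer from cache regardless of
// what the response headers said last time.
function buildDebugUrl(base: string): string {
  const sep = base.includes("?") ? "&" : "?";
  return `${base}${sep}debugClientLibs=true&ts=${Date.now()}`;
}

// buildDebugUrl("http://localhost:4502/content/geometrixx/en.html")
// -> ".../en.html?debugClientLibs=true&ts=1700000000000"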

How do different browsers handle caching for static content without an Expires Header?

After running the YSlow plugin on a site, I saw that one of the recommendations was to add far-future Expires headers to the scripts, stylesheets, and images.
I asked a different question about how to set this up in IIS, but I am actually just curious about how each browser behaves.
I have read that IE will cache items per browsing session, so once you reopen the site after closing the browser, it will need to reload all of the content. I believe that Firefox will go ahead and set an expiration date on its own. I have also heard that IE does not cache at all when connecting over HTTPS. I am not sure if these are at all accurate, though, and was wondering if someone could clear up any misconceptions I may have. Thanks!
You are right about Firefox setting its own expiration date. See the second item in this blog post:
http://blog.httpwatch.com/2008/10/15/two-important-differences-between-firefox-and-ie-caching/
IE, like Firefox, can cache HTTPS-based content. However, you need to set Cache-Control: public for persistent caching across browser sessions in Firefox. See Tip #3 in this blog post:
http://blog.httpwatch.com/2009/01/15/https-performance-tuning/
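For completeness, a minimal sketch (assuming Node with Express serving files from a hypothetical public/ directory) of the far-future Expires plus Cache-Control: public combination discussed above:

import express from "express";

const app = express();
const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60;

app.use("/static", express.static("public", {
  setHeaders: (res) => {
    // public lets shared caches keep a copy and lets Firefox persist HTTPS
    // content across sessions; Expires covers older HTTP/1.0 caches.
    res.setHeader("Cache-Control", `public, max-age=${ONE_YEAR_SECONDS}`);
    res.setHeader("Expires", new Date(Date.now() + ONE_YEAR_SECONDS * 1000).toUTCString());
  },
}));

app.listen(3000);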

Clear all website cache?

Is it possible to clear all site cache? I would like to do this when the user logs out or the session expires instead of instructing the browser not to cache on each request.
As far as I know, there is no way to instruct the browser to clear all the pages it has cached for your site. The only control that you, as a website author, have over caching of a page occurs when the browser tries to access that page. You can specify that cached versions of your pages should expire at a certain time using the Expires header, but even then the browser won't actually clear the page from its cache at that time.
I certainly hope not - that would give the web site destructive powers over the client machine!
If security is your main concern here, why not use HTTPS? Browsers generally don't cache content received via HTTPS to disk (or cache it only in memory).
One tricky way to mimic this would be to include the session-id as a parameter when referencing any static piece of content on the site. When the user establishes the session, the browser will recognize all the pieces of content as new due to the inclusion of this parameter. For the duration of the session the browser will use the static content in its cache. After the user logs out and logs back in again, the session-id parameter for the static content will be different, so the browser will recognize this as completely new content and will download everything again.
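For illustration, the trick might look roughly like this (assetUrl and getSessionId are hypothetical stand-ins for however your app builds URLs and exposes the session):

// Hypothetical stand-in for however your framework exposes the session id.
declare function getSessionId(): string;

// Same session -> same URL -> the browser reuses its cached copy.
// New session -> new URL -> the browser downloads everything afresh.
function assetUrl(path: string): string {
  return `${path}?sid=${encodeURIComponent(getSessionId())}`;
}

// assetUrl("/css/site.css") -> "/css/site.css?sid=abc123"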
That being said... this is a hack and I wouldn't recommend pursuing it. For what reason do you want the user's cache to be cleared after their session expires? There's probably a better solution that can fit your situation as opposed to what you are currently asking for.
If you are talking about ASP.NET cache objects, you can use this:
' Remove every entry from the ASP.NET application cache.
For Each elem As DictionaryEntry In Cache
    Cache.Remove(CStr(elem.Key))
Next
to remove items from the cache, but that may not be the full-extent of what you are trying to accomplish.

Mixing Secure and Non-Secure Content on Web Pages - Is it a good idea?

I'm trying to come up with ways to speed up my secure web site. Because there are a lot of CSS images that need to be loaded, it can slow down the site since secure resources are not cached to disk by the browser and must be retrieved more often than they really need to.
One thing I was considering is perhaps moving style-based images and javascript libraries to a non-secure sub-domain so that the browser could cache these resources that don't pose a security risk (a gradient isn't exactly sensitive material).
I wanted to see what other people thought about doing something like this. Is this a feasible idea or should I go about optimizing my site in other ways like using CSS sprite-maps, etc. to reduce requests and bandwidth?
Browsers (especially IE) get jumpy about this and alert users that there's mixed content on the page. We tried it and had a couple of users call in to question the security of our site. I wouldn't recommend it. Having users lose their sense of security when using your site is not worth the added speed.
Do not mix content; there is nothing more annoying than having to go and click the Yes button on that dialog. I wish IE would let me permanently choose to show mixed-content sites. As Chris said, don't do it.
If you want to optimize your site, there are plenty of ways; if SSL is the only bottleneck left, buy a hardware accelerator. Hmm, if you load an image using HTTP, will it be cached if you then load it with HTTPS? Just a side question that I need to go find out.
Be aware that in IE 7 there are issues with mixing secure and non-secure items on the same page, so this may result in some users not being able to view all the content of your pages properly. Not that I endorse IE 7, but recently I had to look into this issue, and it's a pain to deal with.
This is not advisable at all. The reason browsers give you such trouble about insecure content on secure pages is that it exposes information about the current session and leaves you vulnerable to man-in-the-middle attacks. I'll grant there probably isn't much a third party could do to sniff sensitive info if the only insecure content is images, but CSS can contain references to JavaScript/VBScript via behavior files (IE). If your JavaScript is served insecurely, there isn't much that can be done to prevent a rogue script scraping your webpage at an inopportune time.
At best, you might be able to get away with iframing secure content to keep the look and feel. As a consumer I really don't like it, but as a web developer I've had to do that before due to no other pragmatic options. Frankly, though, there are just as many, if not more, defects with that approach; after all, you're hoping that nothing violates the integrity of the insecure page hosting the secure content, swapping in some alternate content instead.
It's just not a great idea from a security perspective.

IE6 and Caching

It seems that IE6 ignores any form of cache invalidation sent via HTTP headers. I've tried setting Pragma to no-cache and setting the cache expiration to the current time, yet in IE6, hitting Back will always pull up a cached version of the page I am working on.
Is there a specific HTTP header that IE6 does listen to?
Cache-Control: private, max-age=0 should fix it. From classic ASP this is done with Response.Expires=-1.
Keep in mind when testing that just because your server is serving pages with caching turned off doesn't mean that the browser will obey that when it has an old cached page that it was told was okay to cache. Clear the cache or use F5 to force that page to be reloaded.
Also, for those cases where the server is serving cached content, you can use Ctrl+F5 to signal the server not to serve it from cache.
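A rough Express-middleware equivalent of the Cache-Control: private, max-age=0 advice above (a sketch, not the poster's classic ASP setup):

import { Request, Response, NextFunction } from "express";

// private: only the end user's browser may cache the response.
// max-age=0: the browser must revalidate before reusing it, which is
// what defeats the stale "Back" behavior.
export function noStaleCache(req: Request, res: Response, next: NextFunction): void {
  res.set("Cache-Control", "private, max-age=0");
  next();
}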
You must be careful. If you are using AJAX via XMLHttpRequest (XHR), cache "recommendations" set in the headers are not respected by IE6.
The fix is to append a random number to the URL queries used in AJAX requests. For example:
http://test.com?nonce=0123
A good generator for this is a timestamp, e.g. JavaScript's new Date().getTime(), which yields a practically unique value per request... that is, unless the user messes with their system clock.
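A sketch of that fix with era-appropriate XMLHttpRequest (getFresh is a hypothetical helper name):

// Hypothetical helper: appends a per-request timestamp so IE6 never sees
// the same AJAX URL twice and therefore cannot answer from its cache.
function getFresh(url: string, onDone: (body: string) => void): void {
  const sep = url.indexOf("?") >= 0 ? "&" : "?";
  const xhr = new XMLHttpRequest();
  xhr.open("GET", url + sep + "nonce=" + new Date().getTime(), true);
  xhr.onreadystatechange = () => {
    if (xhr.readyState === 4 && xhr.status === 200) onDone(xhr.responseText);
  };
  xhr.send();
}

// getFresh("http://test.com/data", (body) => alert(body));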
Have you tried setting an ETag in the header? They're a pretty reliable way to indicate that content has changed (see the W3C spec and Wikipedia).
Beyond that, a slightly cruder way is to append a random query string parameter to the request, such as the current Unix timestamp. As I said, crude, but then IE6 is not the most subtle of beasts.
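For reference, a minimal sketch of how ETag revalidation works on the server side (assuming Node with Express and a hypothetical /data route; the original answer doesn't include code):

import express from "express";
import { createHash } from "crypto";

const app = express();

app.get("/data", (req, res) => {
  const body = "current content";
  // Derive the ETag from the body so it changes whenever the content does.
  const etag = '"' + createHash("md5").update(body).digest("hex") + '"';
  if (req.headers["if-none-match"] === etag) {
    res.status(304).end(); // unchanged: the browser reuses its cached copy
    return;
  }
  res.set("ETag", etag);
  res.send(body);
});

app.listen(3000);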
See the question "How to control web page caching, across all browsers?" - I think this should help out with your problem too.
Content with "Content-Encoding: gzip" Is Always Cached Although You Use "Cache-Control: no-cache"
http://support.microsoft.com/kb/321722
You could also disable gzip just for IE6.
A little note: from experience I know that IE6 will load JavaScript from cache even if forced to reload the page via Ctrl+F5. So if you are working on JavaScript, always empty the cache.
The IE web developer toolbar can help immensely with this. There's a button for clearing the cache.
