Here's the situation:
I have a web application which responds to a request for a list of resources, let's say:
/items
This is initially requested directly by the web browser by navigating to that path. The browser sends its standard "Accept" header, which includes "text/html", and my application notices this and returns the HTML content for the item list.
Within the returned HTML is some JavaScript (jQuery), which then does an ajax request to retrieve the actual data:
/items
Only this time, the "Accept" header is explicitly set to "application/json". Again, my application notices this and JSON is correctly returned to the request, the data is inserted into the page, and everything is happy.
Here comes the problem: The user navigates to another page, and later presses the BACK button. They are then prompted to save a file. This turns out to be the JSON data of the item list.
So far I've confirmed this to happen in both Google Chrome and Firefox 3.5.
There are two possible types of answers here:
1) How can I fix the problem? Is there some magic combination of Cache-Control headers, or other voodoo, that will cause the browser to do the right thing here?
2) If you think I am doing something horribly wrong here, how should I go about this? I'm seeking correctness, but also trying not to sacrifice flexibility.
If it helps, the application is a JAX-RS web application, using Restlet 2.0m4. I can provide sample request/response headers if it's helpful but I believe the issue is completely reproducible.
Is there some magic combination of Cache-Control headers, or other voodoo, that will cause the browser to do the right thing here?
If you serve different responses to different Accept: headers, you must include the header:
Vary: Accept
in your response. The Vary header should also contain any other request headers that influence the response, so for example if you do gzip/deflate compression you'd have to include Accept-Encoding.
IE, unfortunately, handles many values of Vary poorly, breaking caching completely, which might or might not matter to you.
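For the JAX-RS side of things, here's a rough sketch of what adding that header could look like, assuming a standard Response builder; the helper methods and the simplified content negotiation are made up for illustration:

import java.util.List;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.core.Context;
import javax.ws.rs.core.HttpHeaders;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

@Path("/items")
public class ItemsResource {

    @GET
    public Response getItems(@Context HttpHeaders headers) {
        // Very simplified negotiation: serve JSON only if it is the client's
        // highest-priority acceptable type (the ajax case), otherwise HTML.
        List<MediaType> accepted = headers.getAcceptableMediaTypes();
        boolean wantsJson = !accepted.isEmpty()
                && accepted.get(0).isCompatible(MediaType.APPLICATION_JSON_TYPE);

        Object entity = wantsJson ? jsonForItems() : htmlForItems();   // hypothetical helpers
        MediaType type = wantsJson ? MediaType.APPLICATION_JSON_TYPE
                                   : MediaType.TEXT_HTML_TYPE;

        return Response.ok(entity, type)
                .header("Vary", "Accept")   // the representation depends on the Accept header
                .build();
    }

    private String jsonForItems() { return "[]"; }
    private String htmlForItems() { return "<html><body>items</body></html>"; }
}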
If you think I am doing something horribly wrong here, how should I go about this?
I don't think the idea of serving different content for different types at the same URL is horribly wrong, but you are letting yourself in for more compatibility problems than you really need. Relying on the Accept header to switch formats isn't really a great idea in practice; you'd be best off just having a different URL, such as /items/json or /items?format=json.
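For example, a sketch of the separate-URL approach in JAX-RS might look like the following (the class name and placeholder bodies are hypothetical):

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

@Path("/items")
public class ItemsResource {

    // Browser navigation to /items gets the HTML page.
    @GET
    @Produces(MediaType.TEXT_HTML)
    public String itemsPage() {
        return "<html><body>item list page</body></html>";   // placeholder markup
    }

    // The ajax call fetches the data from the distinct /items/json URL.
    @GET
    @Path("json")
    @Produces(MediaType.APPLICATION_JSON)
    public String itemsData() {
        return "[{\"id\": 1, \"name\": \"example\"}]";        // placeholder data
    }
}

Because the two representations never share a URL, the browser's history cache can't confuse them when the user presses back.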
I know this question is old, but just in case anyone else runs into this:
I was having this same problem with a Rails application using jQuery, and I fixed it by telling the browser not to cache the JSON response, using the solution given for a different question:
jQuery $.getJSON works only once for each control. Doesn't reach the server again
The problem only seemed to occur with Chrome and Firefox. Safari was handling the back behavior okay without explicitly having to tell it to not cache.
Old question, but for anyone else seeing this, there is nothing wrong with the questioner's usage of the Accept header.
This is a confirmed bug in Chrome. (Previously also in Firefox but since fixed.)
http://code.google.com/p/chromium/issues/detail?id=94369
I've been using the debugClientLibs flag in my AEM pages (helpful for debugging clientlibs-related issues), like this: localhost:4502/content/geometrixx/en.html?debugClientLibs=true.
Recently, I was seeing a caching-related issue with JS. I noticed that, when using the debugClientLibs flag, the no-cache header was not included in the request headers of the individual JS files.
It does not make sense to cache these individual files, as that would defeat the purpose of debugging clientlibs (I would not want to see cached JS and CSS files when I'm using the debugClientLibs flag in my pages). Attaching a screenshot of the request and response headers I got.
My Question here is:
Are these individual clientlib files cached in the browser?
Short answer - it depends.
Every browser has its own implementation of networking and caching rules. Response headers are hints to the browser to help it be more efficient, but the browser may choose to do its own thing. Even more confusingly, the behavior may change between versions of a given browser. Further, even if a browser's default behavior is to follow (or ignore) such headers, the user may configure different behavior. So don't assume anything, especially globally.
I've got a bug report from the field that essentially boils down to image caching. Namely, an image from the same URL is getting cached and it's causing confusion because the image itself is supposed to change.
My fix is to do this bit here, which I'm certain will work.
However, I can't freaking reproduce this. I would prefer not to use the methods I've seen here, because they require code modification, and I'd rather test the code as it exists now before I test a fix.
Is there any way in a browser like IE to force it to cache like mad? Just temporarily, of course.
You can use Fiddler to force things to cache or not to cache; just use the Filters tab and add a caching header like
Cache-Control: public,max-age=3600
You can have the customer use www.fiddlercap.com to collect a traffic capture so you can see exactly what they see.
You should also understand that the proper way to control caching is by setting HTTP headers rather than forcing the browser to guess: http://blogs.msdn.com/b/ie/archive/2010/07/14/caching-improvements-in-internet-explorer-9.aspx
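As a sketch of that header-based approach for an image that changes under a fixed URL (the servlet name and helper are hypothetical, and the exact directives depend on how fresh you need the image to be):

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical servlet serving the image that changes; the headers tell caches
// to revalidate instead of reusing a stale copy.
public class ChangingImageServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        resp.setContentType("image/png");
        resp.setHeader("Cache-Control", "no-cache, must-revalidate");
        resp.setDateHeader("Expires", 0);                     // extra hint for older IE
        resp.getOutputStream().write(currentImageBytes());    // hypothetical helper
    }

    private byte[] currentImageBytes() {
        return new byte[0];                                   // placeholder
    }
}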
Well, I'm trying to optimize my application and am currently using Page Speed for this. One of the strongest recommendations was that I needed to leverage browser caching. The report sent me to this page:
http://code.google.com/intl/pt-BR/speed/page-speed/docs/caching.html#LeverageBrowserCaching
In this page there is this quote:
If the Last-Modified date is sufficiently far enough in the past, chances are the browser won't refetch it.
My point is: no matter what value I set for the Last-Modified header (I tried 10 years in the past), when I access and reload my application (always clearing the browser's recent history) I get status 200 for the first access, and 304 for the remaining ones.
Is there any way I can get the behavior described in the Google documentation? That is, so the browser doesn't try to fetch the static resources from my site at all?
You might have better success using the Expires header (also listed on that Google doc link).
Also keep in mind that all of these caching-related headers are hints or suggestions for browsers to follow. Different browsers can behave differently.
The method of testing is a good example. In your case you mentioned getting status 304 for the remaining requests, but are you getting those by doing a manual browser refresh? Browsers will usually issue a conditional request in that case.
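For what it's worth, here's a minimal sketch of setting those headers with a servlet filter; the filter name and the assumption that it's mapped to something like /static/* are mine. Keep in mind that a manual reload will usually still trigger conditional requests, which matches what you're seeing:

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletResponse;

// Hypothetical filter mapped to /static/* that sets a far-future expiry so the
// browser can reuse resources without even asking for a 304.
public class StaticCacheFilter implements Filter {
    private static final long ONE_YEAR_MS = 365L * 24 * 60 * 60 * 1000;

    public void init(FilterConfig config) { }

    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletResponse resp = (HttpServletResponse) res;
        resp.setHeader("Cache-Control", "public, max-age=31536000");             // one year
        resp.setDateHeader("Expires", System.currentTimeMillis() + ONE_YEAR_MS);
        chain.doFilter(req, res);
    }

    public void destroy() { }
}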
I am using CouchDB for a web app and having problems with IE8 caching the results of a view. From my reading it seems one solution would be to change the "Cache-Control" HTTP header to "no-cache". Right now the CouchDB returns the value "must-revalidate".
Is there a way to change the value of this header at all? Can it just be changed for the view?
FYI, if you are using jQuery, just remember to include cache: false as one of your options in your $.ajax() calls.
CouchDB should be using ETags that change whenever the view content or code changes. However, looking through CouchDB's bug database, it looks like there is a cache issue with Internet Explorer that's been unsolved for a while. If that looks like the problem you're having, it might be helpful to propose a fix in the bug ticket or at least remind the CouchDB mailing list/IRC of the issue.
It looks like the issue is simply IE's bug though, so some sort of workaround might be necessary, like querying the view with the age-old extra random parameter hack to make the URL unique when you know you would otherwise encounter a cache issue.
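As a very rough illustration of that random-parameter workaround (shown here as a standalone Java client; the URL and parameter name are made up, and you'd want to confirm the server tolerates the extra parameter):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Append a throwaway query parameter so the browser/proxy treats every request
// as a brand-new URL. The base URL and parameter name are hypothetical.
public class CacheBustingExample {
    public static void main(String[] args) throws Exception {
        String viewUrl = "http://localhost:5984/mydb/_design/app/_view/items";
        String busted = viewUrl + "?nocache=" + System.currentTimeMillis();

        HttpResponse<String> response = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder(URI.create(busted)).GET().build(),
                HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}

This is the same trick jQuery's cache: false option applies for you by appending a timestamp parameter.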
What is the requirement for the browser to show the ubiquitous "this page has expired" message when the user hits the back button?
What are some user-friendly ways to prevent the user from using the back button in a webapp?
Well, by default, whenever you're dealing with a form POST and the user hits back and then refresh, they'll see the message indicating that the browser is resubmitting data. But if the page is set to expire immediately, they won't even have to hit refresh; they'll see the "page has expired" message as soon as they hit back.
To avoid both messages there are a couple things to try:
1) Use a form GET instead. It depends on what you're doing, but this isn't always a good solution, as there are still size restrictions on a GET request, and the information is passed along in the query string, which isn't the most secure of options.
-- or --
2) Perform a server-side redirect to a different page after the form POST (see the sketch below).
Looks like a similar question was answered here:
Redirect with a 303 after POST to avoid "Webpage has expired": Will it work if there are more bytes than a GET request can handle?
As a third option, one could prevent the user from going back in their browser at all. The only time I've felt a need to do this was to prevent them from doing something stupid, such as paying twice, although there are better server-side methods to handle that. If your site uses sessions, you can prevent them from paying twice by first disabling caching on the checkout page and setting it to expire immediately, and then using a flag of some sort stored in the session which actually changes the behavior of the page if you go back to it.
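Here's a minimal sketch of option 2, the redirect after the POST, assuming a plain servlet; the class name and confirmation URL are hypothetical:

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Handle the POST, then answer with a 303 pointing at a GET-able confirmation
// page; pressing back afterwards lands on a GET, so there is nothing to
// re-submit and no "page has expired" message.
public class CheckoutServlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        processOrder(req);                                     // hypothetical business logic
        resp.setStatus(HttpServletResponse.SC_SEE_OTHER);      // 303 See Other
        resp.setHeader("Location", "/order/confirmation");
    }

    private void processOrder(HttpServletRequest req) {
        // placeholder
    }
}

The key point is answering the POST with a 303, so the address the browser remembers for "back" is a harmless GET.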
You need to set the Pragma / Cache-Control options in the HTTP headers:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
However, from a usability point of view, this is a discouraged approach to the matter. I strongly encourage you to look for other options.
PS: as proposed by Steve, redirection via GET is the proper way (or check for page movement with JS).
Try using the following code in Page_Load:
Response.Cache.SetCacheability(HttpCacheability.Private)
Use one of the following before session_start():
session_cache_expire(60); // cache expiry in minutes
ini_set('session.cache_limiter', 'private');
(Note: the language here is PHP.)
I'm not sure if this is standard practice, but I typically solve this issue by not sending a Vary header for IE only. In Apache, you can put the following in httpd.conf:
BrowserMatch MSIE force-no-vary
According to the RFC:
The Vary field value indicates the set of request-header fields that fully determines, while the response is fresh, whether a cache is permitted to use the response to reply to a subsequent request without revalidation.
The practical effect is that when you go "back" to a POST, IE simply gets the page from the history cache. No request at all goes to the server side. I can see this clearly in HTTPWatch.
I would be interested to hear potential bad side-effects of this approach.