I've been using the debugClientLibs flag in my AEM pages (helpful for debugging clientlib-related issues), like this: localhost:4502/content/geometrixx/en.html?debugClientLibs=true.
Recently I was seeing a caching-related issue with JS. I noticed that, when using the debugClientLibs flag, the no-cache header was not included in the request headers of the individual JS files.
It does not make sense to cache these individual files, as that would defeat the purpose of debugging clientlibs (I would not want to see cached JS and CSS files when I'm using the debugClientLibs flag in my pages). Attaching a screenshot of the request and response headers I got.
My question here is:
Are these individual clientlib files cached in the browser?
Short answer: it depends.
Every browser has its own implementation of networking and caching rules. Response headers are hints that help the browser be more efficient, but a browser may choose to do its own thing. Even more confusingly, the behavior may change between versions of a given browser. Further, even if the browser's default behavior is to follow (or ignore) such headers, the user may configure different behavior. So don't assume anything, especially globally.
Traditionally a browser will parse HTML and then send further requests to the server for all related data. This seems inefficient to me, since it might require a large number of requests, even though my server already knows that a browser that wants to use this web application will need all of its resources.
I know that JS and CSS could be inlined, but that complicates server-side code, and image data as base64 bloats the size of the data... I'm aware as well that rendering can start before all assets are downloaded, which would potentially no longer work (depending on the implementation). I still feel that streaming an entire application in one go should be faster on slow connections than making tens of requests separately.
Ideally I would like the server to stream an entire directory into one HTTP response.
Does any model for this exist?
Does the reasoning make sense?
PS: If browser support for this is completely lacking, I'm wondering about a two-step approach: download a small JavaScript file which downloads a compressed web app file, extracts it, and plugs the resources into the page. Is anyone already doing something like this?
Update
I found one: http://blog.another-d-mention.ro/programming/read-load-files-from-zip-in-javascript/
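For what it's worth, here is a minimal sketch of that two-step idea, assuming the JSZip library is loaded on the page and that a hypothetical bundle.zip containing app.css and app.js is served by the same server:

    // Fetch one archive, unpack it client-side, and inject the resources into the page.
    // JSZip.loadAsync() and file().async('string') are JSZip 3.x calls; the file
    // names and the bundle URL are made up for this example.
    fetch('/bundle.zip')
      .then(function (response) { return response.arrayBuffer(); })
      .then(function (buffer) { return JSZip.loadAsync(buffer); })
      .then(function (zip) {
        return Promise.all([
          zip.file('app.css').async('string'),
          zip.file('app.js').async('string')
        ]);
      })
      .then(function (contents) {
        var style = document.createElement('style');   // inline the extracted CSS
        style.textContent = contents[0];
        document.head.appendChild(style);

        var script = document.createElement('script'); // run the extracted JS
        script.textContent = contents[1];
        document.body.appendChild(script);
      });

Note that resources injected this way bypass the browser's normal per-file caching, which is part of the trade-off discussed below.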
I started to research related issues in order to find the best results possible without changing web standards, and I wondered about caching. If I could send the last-modified date of every subresource of a page along with the initial HTML page, a browser could avoid sending If-Modified-Since requests once it has loaded every resource at least once. This would in effect be better than sending all resources with the initial request, since that would be beneficial only on the first load and detrimental on subsequent loads, when it is better for browsers to use their cache (as Barmar pointed out).
Now it turns out that even with a web extension you cannot get hold of the If-Modified-Since header, so you certainly can't tell the browser to use the cached version instead of contacting the server.
I then found this post from Facebook on how they tried to reduce traffic by hashing their static files and giving them a one-year expiry date. This means that the URL guarantees the content of the file. They still saw plenty of unnecessary If-Modified-Since requests, and they managed to convince Firefox and Chrome to change the behaviour of their reload buttons to no longer reload static resources. For Firefox this requires a new Cache-Control: immutable header; for Chrome it doesn't.
I then remembered that I had seen something like that before, and it turns out there is a solution for this problem which is more convenient than hashing the contents of resources and serving them from a database for at least ten years. It is to just put a new version number in the filename. The even more convenient solution would be to just add a version query string, but it turns out that that doesn't always work.
Admittedly, changing your filenames all the time is a nuisance, because files referencing these files also need to change. However, the files themselves don't actually need to be renamed. If you control the server, it might be as simple as writing a redirect rule to make sure that logo.vXXXX.png is redirected to logo.png (where XXXX is the last-modified timestamp in seconds since the epoch)[1]. Now let your template system automatically generate the timestamped URL, as WordPress's wp_enqueue_script does (WordPress actually contents itself with the query-string technique); a small sketch of such a helper follows below. Now you can set the expiration date far in the future and use the immutable cache header. If browsers respect the cache control, you can safely ignore ETags and If-Modified-Since headers, since they are now completely redundant.
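As a rough sketch of that idea (the helper name and the timestamp value are made up for illustration, not taken from WordPress or any particular server):

    // Turn "/img/logo.png" plus a last-modified time (seconds since the epoch)
    // into "/img/logo.v1700000000.png"; a server-side rewrite rule then maps
    // the versioned name back to the real file on disk.
    function versionedUrl(path, lastModifiedSeconds) {
      return path.replace(/(\.[^.\/]+)$/, '.v' + lastModifiedSeconds + '$1');
    }

    versionedUrl('/img/logo.png', 1700000000); // -> "/img/logo.v1700000000.png"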
This solution guarantees that the browser never asks for cache validation, and yet you will never see a stale resource, without having to decide on an expiry date in advance.
It doesn't answer the original question here about how to avoid having to do multiple requests to fetch the resources on the same page on a clean cache, but ever after (as long as the browser cache doesn't get cleared), you're good! I suppose that's good enough for me.
[1] You can even avoid the server overhead of checking the timestamp on every resource every time a page references it by using the version number of your application. In debug mode, for development, one can use the timestamp to avoid having to bump the version on every modification of the file.
I've been trying to get my head around the whole issue of browser history vs. caching and RFC 2616 section 13.13.
Does this section of the RFC mean that if a user goes "Back" in the browser, for example, it should always display the page from its local storage, ignoring any cache directives, unless the user has configured it otherwise?
So browsers that reload the page when navigating the history, even if caching directives instruct them to do so, are not complying with the specification? And the spec is saying this is bad because "this will tend to force service authors to avoid using HTTP expiration controls and cache controls when they would otherwise like to."
Also, even though a directive may instruct the browser not to cache, e.g. using Cache-Control: no-store, can/should it still store the page in its history cache?
From what I've read, it seems that most browsers violate the standard, apart from Opera. Is this because the security concerns around the re-display of pages with sensitive data from history are seen as more important than the issue the standard talks about?
I'd be grateful if anyone is able to shed some light/clarification on this area, thanks.
History and cache are completely separate. We're trying to clarify this in httpbis; see https://svn.tools.ietf.org/svn/wg/httpbis/draft-ietf-httpbis/latest/p6-cache.html#history.lists
I've got a bug report from the field that essentially boils down to image caching. Namely, an image from the same URL is getting cached and it's causing confusion because the image itself is supposed to change.
My fix is to do this bit here, which I'm certain will work.
However, I can't freaking reproduce this. I would prefer not to use the methods I've seen here because they require code modification, and I'd rather test this on the code as it exists now before I test a fix.
Is there any way in a browser like IE to force it to cache like mad? Just temporarily, of course.
You can use Fiddler to force things to cache or not to cache; just use the Filters tab and add a caching header like
Cache-Control: public,max-age=3600
You can have the customer use www.fiddlercap.com to collect a traffic capture so you can see exactly what they see.
You should also understand that the proper way to control caching is by setting HTTP headers rather than forcing the browser to guess: http://blogs.msdn.com/b/ie/archive/2010/07/14/caching-improvements-in-internet-explorer-9.aspx
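For example (the values here are illustrative), headers along these lines remove the guesswork:

    Cache-Control: max-age=3600      (the image may be reused for an hour without revalidation)
    Cache-Control: no-cache          (the image may be stored, but must be revalidated before each use)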
I'm using Play Framework (v1.1.1) and I have a question about the #{cache} tag.
I suppose the question would be "when should I use it?", but that's quite generic.
So besides that, I would like to know if someone has checked its behaviour with JavaScript. I understand that it will cache the output of other tags embedded in its body, but will it also cache JavaScript? More specifically, if I include some script tags that reference external resources (like a CDN), will the file get cached too, or only the tag?
The purpose of the #{cache} tag is to cache the output that the server sends to the client. JavaScript, images and any other resources referenced from the code sent to the client side are not cached by it; the browser only caches them if specifically told to do so by caching headers (for example, HTTP response headers or meta tags in the <head> of your HTML).
By default, Play (if you extend the main.html) does not specify any Cache-Control headers, so your scripts will be cached based on the browser's standard caching policy. This should be "no-cache" according to the HTTP spec, but I am doubtful whether this is the case.
Here's the situation:
I have a web application which responds to a request for a list of resources, let's say:
/items
This is initially requested directly by the web browser by navigating to that path. The browser uses its standard "Accept" header, which includes "text/html", and my application notices this and returns the HTML content for the item list.
Within the returned HTML is some JavaScript (jQuery), which then does an ajax request to retrieve the actual data:
/items
Only this time, the "Accept" header is explicitly set to "application/json". Again, my application notices this, JSON is correctly returned for the request, the data is inserted into the page, and everything is happy.
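The ajax call looks roughly like this (illustrative; the question doesn't show the actual code):

    // Request the JSON representation of the same URL the browser just loaded as HTML.
    $.ajax({
      url: '/items',
      dataType: 'json',
      beforeSend: function (xhr) {
        xhr.setRequestHeader('Accept', 'application/json'); // ask for JSON instead of HTML
      },
      success: function (items) {
        // insert the returned item data into the page
      }
    });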
Here comes the problem: The user navigates to another page, and later presses the BACK button. They are then prompted to save a file. This turns out to be the JSON data of the item list.
So far I've confirmed this to happen in both Google Chrome and Firefox 3.5.
There are two possible types of answers here:
1. How can I fix the problem? Is there some magic combination of Cache-Control headers, or other voodoo, which causes the browser to do the right thing here?
2. If you think I am doing something horribly wrong here, how should I go about this? I'm seeking correctness, but also trying not to sacrifice flexibility.
If it helps, the application is a JAX-RS web application, using Restlet 2.0m4. I can provide sample request/response headers if it's helpful but I believe the issue is completely reproducible.
Is there some magic combination of Cache-Control headers, or other voodoo which cause the browser to do the right thing here?
If you serve different responses to different Accept: headers, you must include the header:
Vary: Accept
in your response. The Vary header should also contain any other request headers that influence the response, so for example if you do gzip/deflate compression you'd have to include Accept-Encoding.
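For example (illustrative responses), both representations of /items would then carry the same Vary header:

    HTTP/1.1 200 OK
    Content-Type: text/html; charset=utf-8
    Vary: Accept

    HTTP/1.1 200 OK
    Content-Type: application/json
    Vary: Accept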
IE, unfortunately, handles many values of Vary poorly, breaking caching completely, which might or might not matter to you.
If you think I am doing something horribly wrong here, how should I go about this?
I don't think the idea of serving different content for different types at the same URL is horribly wrong, but you are letting yourself in for more compatibility problems than you really need. Relying on request headers to select the JSON representation isn't really a great idea in practice; you'd be best off just having a different URL, such as /items/json or /items?format=json.
I know this question is old, but just in case anyone else runs into this:
I was having this same problem with a Rails application using jQuery, and I fixed it by telling the browser not to cache the JSON response, using the solution given in this answer to a different question:
jQuery $.getJSON works only once for each control. Doesn't reach the server again
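In jQuery that typically comes down to the cache: false option (shown here as a general illustration, not necessarily the exact code from the linked answer), which appends a unique "_=<timestamp>" parameter to each GET request so the browser can't reuse a cached response:

    // Globally disable ajax response caching...
    $.ajaxSetup({ cache: false });

    // ...or do it per request:
    $.ajax({
      url: '/items',
      dataType: 'json',
      cache: false,                 // forces a unique URL for every request
      success: function (items) { /* use the fresh data */ }
    });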
The problem only seemed to occur with Chrome and Firefox. Safari was handling the back behavior okay without explicitly having to tell it to not cache.
Old question, but for anyone else seeing this, there is nothing wrong with the questioner's usage of the Accept header.
This is a confirmed bug in Chrome. (Previously also in Firefox but since fixed.)
http://code.google.com/p/chromium/issues/detail?id=94369