When reading some answers to a question on clearing cache for JS files, somebody pointed to this part of the HTTP spec. It basically says that URLs containing a ? should not be pulled from the cache unless a specific expiry date is given. How do query-string-free URLs, which are so common with MVC websites (RoR, ASP.NET MVC, etc.), get cached, and is the behaviour different than with more traditional query-string-based URLs?
AFAIK there is no difference on the part of browsers, as both Firefox and IE will (incorrectly) cache the response from a URL with a query string in the same way they cache the response from a URL without one. Safari respects the spec and doesn't cache URLs with query strings. HTTP proxies tend to be a tad erratic about what they consider cacheable.
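To make the rule concrete, here is a minimal sketch of how a spec-following cache might decide, assuming the rule is exactly as the question describes it (a URL containing a ? needs an explicit expiry). The function names are made up for illustration, and real caches look at many more headers than this.

```python
def has_explicit_expiry(headers):
    """True if the response carries Expires or a max-age/s-maxage directive."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if "expires" in lowered:
        return True
    directives = [d.strip().lower() for d in lowered.get("cache-control", "").split(",")]
    return any(d.startswith(("max-age=", "s-maxage=")) for d in directives)


def may_treat_as_fresh(url, headers):
    """Hypothetical helper: may a strict cache consider this response fresh?"""
    url_without_fragment = url.partition("#")[0]
    if "?" in url_without_fragment:
        # Query-string URL: only an explicit expiry makes it reusable.
        return has_explicit_expiry(headers)
    # No query string: the cache is free to apply its usual heuristics.
    return True
```

So /posts/42 and /posts?id=42 are treated the same as soon as the server sends an explicit Expires or max-age, which is another reason to set the headers yourself.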
It pays to have the headers set correctly and it's worth investigating ETags.
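As a rough illustration of what ETags buy you, the sketch below derives a validator from the response body and answers a conditional request with 304 when the client's copy is still current. It is framework-agnostic and simplified: make_etag and respond are hypothetical names, the max-age value is arbitrary, and a real If-None-Match header may carry several tags or "*".

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a strong validator from the body (quoted, as ETags are)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict):
    """Return (status, headers, payload) for a GET, honouring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=3600"}
    if request_headers.get("If-None-Match") == etag:
        # The client already holds this exact version: send no body.
        return 304, headers, b""
    return 200, headers, body
```

The first request gets a 200 with the ETag; when the browser revalidates with If-None-Match, it gets an empty 304 for as long as the content is unchanged.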
I believe you manage caching in ASP.NET MVC using the OutputCache attribute (on your controller methods).
Related
I've searched around but couldn't find anything about this. Am I the only one who has experienced that CSRF protection in CodeIgniter doesn't work with page caching?
What do I have:
A webpage which will be cached through this line:
$this->output->cache( 120 );
In the JavaScript on that page I've got an Ajax call where the data contains the CSRF token too. Everything works fine when caching is disabled or when I disable CSRF protection.
Does somebody know a workaround or something so I can have caching and CSRF protection enabled?
Thanks!
I'm somewhat surprised that form_open() doesn't handle that for you, in the same way that the benchmarking functions' output isn't cached.
Here are two possible workarounds.
Use the caching driver (!= output->cache())
Instead of using Output class caching, which caches a completely rendered page, you could employ the caching driver's key-value cache to save rendered portions of your page.
If the form containing this problematic CSRF token is complex and contains a lot of dynamic content from an external data source, cache those database results (either with the caching driver or by enabling database result caching) and feed the cached values into a dynamic form.
Warning about file-based cache from the manual:
Unlike caching from the Output Class, the driver file-based caching
allows for pieces of view files to be cached. Use this with care, and
make sure to benchmark your application, as a point can come where
disk I/O will negate positive gains by caching.
Of course, if you have access to memcached or APC, use that instead.
Disable output caching for that page and profile.
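The idea is language-independent, so here is a minimal sketch in Python rather than CodeIgniter code: the expensive fragment goes through a key-value cache with a TTL, while the form carrying the CSRF token is rendered fresh on every request. FragmentCache, render_page and the 120-second TTL are illustrative assumptions, not CodeIgniter's API.

```python
import html
import time


class FragmentCache:
    """Tiny in-memory key-value cache with a per-entry time to live."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get_or_render(self, key, ttl_seconds, render):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: reuse the cached fragment
        fragment = render()
        self._store[key] = (now + ttl_seconds, fragment)
        return fragment


def render_page(cache, csrf_token, render_product_list):
    # Expensive, token-free part: cached for 120 seconds.
    products = cache.get_or_render("product_list", 120, render_product_list)
    # Cheap, per-visitor part: never cached, so the token is always current.
    token = html.escape(csrf_token, quote=True)
    form = f'<form method="post"><input type="hidden" name="csrf_token" value="{token}"></form>'
    return products + form
```

Two visitors within the TTL share the cached product list, but each gets a form with their own token.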
Intercepting the output cache (fully-rendered page) and replacing the CSRF token value
I came across an interesting solution on Caching forms with CSRF tokens (in Symfony). To paraphrase the original author:
Before setting the cached response, find and replace CSRF tokens.
Store the position of the tokens with the response (so it gets cached as well).
Before returning a response from the cache inject fresh CSRF tokens.
In CodeIgniter, intercepting the cache seems to require use of the pre_system hook-point, though in your case, you may be able to use cache_override. Take a look at this excellent article on the way in which CodeIgniter implements CSRF tokens for inspiration. I don't think it would be trivial to implement, though.
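To show the shape of that approach without tying it to any framework, here is a small Python sketch. Note that it swaps the token for a placeholder string instead of storing token positions as the original author does; the placeholder and function names are made up for illustration.

```python
# Assumed marker; it must be a string that can never occur in real page content.
PLACEHOLDER = "%%CSRF_TOKEN%%"


def prepare_for_cache(rendered_html, token_in_page):
    """Before caching: strip the per-visitor token out of the rendered page."""
    return rendered_html.replace(token_in_page, PLACEHOLDER)


def serve_from_cache(cached_html, fresh_token):
    """Before responding from cache: inject the current visitor's token."""
    return cached_html.replace(PLACEHOLDER, fresh_token)
```

The token injected at serve time has to be the one the server will actually validate for that visitor, so the injection step needs access to whatever issues the tokens.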
Don't cache that page and don't worry about it
This is obviously the simplest solution. Test it. Depending on your page complexity, the pain of implementing either of the above two solutions may well outweigh the negative performance impact of not caching that subset of pages. (Since we don't know what your views or controller look like, it isn't immediately obvious whether this is an acceptable solution in your case.) If it's an isolated login form in an SPA, you can probably get away with it.
I have a specific Adobe CQ5 (5.5) content template that authors will use to create pages. I want to exclude any page that is created from this template from the dispatcher cache. As I understand it currently, the only way I know to prevent caching is to configure dispatcher.any to not cache a particular URL. But in this case, the URL isn't known until a web author uses the template to create a page. I don't want to have to go back and modify dispatcher.any every time a page is created--or at least I want to automate this if there is no other way. I am using IIS for the dispatcher. The reason I don't want to cache the pages is because the underlying JSPs that render the content for these pages produce dynamic content, and the pages don't use querystrings and won't carry authentication headers. The pages will be created in unpredictable directories, so I don't know the URL pattern ahead of time.
How can I configure things so that any page that is created from a certain template will be automatically excluded from the dispatcher cache?
It seems like CQ ought to have some mechanism to respect HTTP response/caching headers. If the HTTP response headers specify that the response shouldn't be cached, it seems like the dispatcher shouldn't cache it--regardless of what dispatcher.any says. This is the CQ5 documentation I have been referencing.
I don't know about the IIS version of the Dispatcher, but certainly with the Apache module, if you add a custom HTTP response header "Dispatcher: no-cache", it will not cache the page in the Dispatcher. You would need to change the code to add this, which would be something like:
response.setHeader("Dispatcher", "no-cache");
It might also work as meta tags in the html, but I've not tried this.
This is documented here: http://dev.day.com/content/kb/home/Dispatcher/faq-s/DispatcherNoCache.html
You might use cache control tags in the template's head. See info on PRAGMA and Cache-Control meta tags here: HTTP Cache-Control.
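For reference, the behaviour the question hopes for (a cache that backs off whenever the response itself says so) boils down to a check like the one below. This is only a toy model in Python with a made-up function name; it is not how the Dispatcher decides, and whether the Dispatcher honours these headers is exactly what is in doubt here.

```python
def response_forbids_caching(headers):
    """True if the response headers ask shared caches not to store the page."""
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    directives = {
        directive.strip().split("=")[0]
        for directive in lowered.get("cache-control", "").split(",")
    }
    if directives & {"no-store", "no-cache", "private"}:
        return True
    # Pragma: no-cache is the HTTP/1.0-era equivalent.
    return "no-cache" in lowered.get("pragma", "")
```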
Imagine a website that is highly cached where the output of almost every GET action is cached into a html file that is accessible directly from the HTTP server without having to perform a server-side CGI operation. Now imagine that in addition to that, JavaScript is used to filter the response of the HTML request using AJAX. The AJAX response contains only the appropriate response of the page (so for standard HTML pages it will contain everything except for the surrounding layout, for modals it will contain only the modal box HTML, etc...).
Now let's imagine that the HTML content may be cached neutrally (when nobody is logged in) or cached for someone who is logged in. There are certain areas of the page that are tied to session data (like the welcome message, the profile link, etc.) and that data is specific to the session. But since we're using JavaScript, we can buffer the AJAX response, change the session element values, and then stick it into the DOM, all while the user is unaware of any session hot swapping. This relies, of course, only on GET requests and pages where the actual content is not 100% session dependent.
Now here is my question. If I were to implement this (and trust me I will) then how might I actually keep track of the session activity while the user is browsing the page? With a traditional server-side operation, whenever the user accesses a page then the server-side framework will update the session and keep tabs on the session-related variables. With a static HTTP request operation then all server-side involvement is avoided. So I will need to figure out some way of keeping track of what's going on with the session; here are my approaches:
1) Perform two AJAX requests (or an additional one when needed):
Once the user queries a page then the contents will be downloaded as static HTML. But at the same time that page is queried then another AJAX request will be serviced to a session-specific URL/server updating/querying the status of the session. This can be done side by side or can be performed after every few requests are made.
Pros = HTML files are left unchanged, HTML files can be set to have an ETag or future expires header, JavaScript can cache only the static HTML and use it for offline browsing, a session-server can be dedicated, optimized and configured for session activity.
Cons = Two AJAX requests are performed, excessive polling for potentially redundant data, session handling may be separated from content server.
2) Use a midway proxy that appends the session-data as a trailing session JSON
A request is made to the server. A proxy in between accesses the session data locally and then performs another HTTP request (either locally or remotely) for the page, whose response it concatenates with the session data it just fetched. The browser receives a clean copy of the HTML followed by the session content for JavaScript to apply, and everything is updated at the same moment.
Pros = Everything is downloaded at once, only one connection required, works like a normal HTTP request would
Cons = Caching gets difficult when a dynamic content proxy is used, the Content-Length header may need to be rewritten to account for the appended data, may not work with some browsers?
3) Use Comet for session data
A persistent, reverse-AJAX comet connection could be established at the start of the website connection. Then, all static-HTML requests could be accessed normally. All session-related requests could be accessed from the comet connection.
Pros = Separation of static content and dynamic content.
Cons = Comet isn't supported very well and doesn't work very well, server latency, may conflict with same origin policy.
How do you guys think this problem should be solved? Do you think it's doable?
The solution I've found is to keep templated data and dynamic data separate from each other. It's too much work and too messy to implement this on your own, so you can go as far as using an MVC framework that feeds JSON responses into templates (AngularJS, KnockoutJS, EmbedJS, etc.) or you can just stick to using templates in general. Keep in mind that this destroys SEO.
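To illustrate the split, here is a small sketch. It is written in Python for brevity, although in practice the merge would happen in the browser with one of those frameworks. The template is the cacheable, session-free part and the JSON payload is the only thing the session server has to produce; all names and fields are invented for the example.

```python
import html
import json
from string import Template

# Static and cacheable: identical for every visitor.
CACHED_PAGE = Template(
    "<header>Welcome, $display_name</header><main>Static article body</main>"
)


def session_payload(session):
    """What the session endpoint would return: data only, no markup."""
    return json.dumps({"display_name": session.get("display_name", "guest")})


def merge(template, payload_json):
    """Fill the cached template with the session values, escaped for HTML."""
    data = json.loads(payload_json)
    safe = {key: html.escape(str(value)) for key, value in data.items()}
    return template.substitute(safe)
```

The page can be cached with far-future headers because nothing session-specific is ever baked into it.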
I have a little issue with a sublayout outputting data which should differ between SSL and non-SSL requests.
To replicate, create a sublayout and render out the URL in its code-behind. Then add this sublayout to a page through the Sitecore interface (e.g. Presentation > Details), turn caching on, and set all the caching variables to on. Publish so the page is viewable. (It behaves the same if you call the control directly through a sublayout control in code.)
If you execute this page in non-ssl mode (http://URL) you will get a URL such as; http://URL...
Then if you execute this page in ssl mode (https://URL) your output will still be http://URL...
So does anyone know of a way to set this up so we can cache both instances?
Regards,
Chris
You can use the existing VaryByParm functionality to create a VaryByUrl behaviour by programmatically setting the cache key for the sublayout with the complete request URL, including the scheme. I believe this would be the same process as described in Mark Ursino's response here: Customizing sublayout caching in Sitecore
As a side note, if you are on a multilanguage site with language prefixes, be aware that Sitecore.Context.RawUrl will give you the request URL with the language prefixes stripped out by the StripLanguage preProcessRequest pipeline step.
Paul
I think I see your issue -- you are outputting absolute URLs for your images, and based on whether or not SSL is used on the first request, the SSL URL of your images may or may not be included in the cached output.
My first suggestion is to disable absolute URLs, if possible. Is there a reason you need them?
My backup would be to point you at the renderingControls configuration in Web.config. If we're talking about sublayouts here, you could potentially subclass sublayout, and create a new factory for sublayout rendering. When you subclass sublayout, override its GetCacheKey method to add a flag if the request is ssl...
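// Tag the cache key when the request came in over SSL (IsSecureConnection is from System.Web)
var key = base.GetCacheKey();
return HttpContext.Current.Request.IsSecureConnection ? key + "#ssl" : key;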
Fair warning that I've never done this, just making an educated suggestion based on available Sitecore config and APIs.
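The underlying idea, independent of Sitecore, is just that the scheme has to be part of the cache key so the http and https renderings end up in separate entries. A minimal Python sketch, with a made-up key format and function name:

```python
from urllib.parse import urlsplit


def cache_key(base_key, request_url):
    """Append an #ssl marker to the key for requests made over https."""
    scheme = urlsplit(request_url).scheme.lower()
    return base_key + "#ssl" if scheme == "https" else base_key


# Each scheme now gets its own cached rendering of the same sublayout.
output_cache = {}
for url in ("http://example.com/page", "https://example.com/page"):
    output_cache[cache_key("sublayout:/layouts/header", url)] = f"rendered for {url}"
```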
Good luck.
I'm using Play Framework (v1.1.1) and I have a doubt about the #{cache} tag.
I suppose the question would be "when should I use it?" but I think it's quite generic.
So besides that, I would like to know if someone has checked its behaviour with JavaScript. I understand that it will cache the output of other tags embedded in its body, but will it also cache JavaScript? More specifically, if I include some script tags that reference external resources (like a CDN), will the file get cached too, or only the tag?
The purpose of the Cache tag is to cache the output that the server sends to the client. JavaScript, images and any other resources referenced within the code sent to the client side are not cached, unless specifically told to do so by the headers set in the <head> tag of your HTML.
By default, Play (if you extend the main.html) does not specify any cache-control headers, so your scripts will be cached based on the browser's standard caching policy. This should be "no-cache" according to the HTTP spec, but I am doubtful of whether this is the case.