Do XMLHttpRequests cache the response headers?

If I make an XMLHttpRequest and the browser caches the response, will it also cache the HTTP response headers? That is, the next time I make the same request, will I get the same values back from response.getResponseHeader?
Is this browser-dependent?

I'm certain that at least the major browsers don't try to cache the headers. However, if you want to prevent caching altogether, you may need to send special headers. If you want to run a quick test of caching behavior, there's a page here:
http://www.mnot.net/javascript/xmlhttprequest/cache.html
And if you want to see what's actually happening on the wire, I recommend getting a packet sniffer such as Wireshark and checking for yourself. I can imagine the browser performing at least a HEAD request for an XMLHttpRequest even when it gives you the cached body, but I could be wrong.

Headers are either not cached or not reused. Since a request is sent anyway, the headers are received first, and only then can the browser decide whether the cached entry is still valid. By that point the new headers have already been downloaded, so there is no point in reusing the old ones.
Only the response body is cached.
You can of course test this very easily: make an XHR request to a static resource (an image or text file, say) and check the Date header; see the sketch below.
I don't think it's browser-dependent. A browser caching and reusing HTTP headers would be very, very strange.
Edit:
jQuery adds (by default, I think) anti-caching arguments to GET requests (very annoying), which would sort of answer your question: nothing is cached like that.
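A minimal sketch of that test, assuming a same-origin static resource at /logo.png (the path is just an illustration):

// Request the same static resource twice and compare the Date headers.
// If the second response comes from the cache, getResponseHeader returns
// the headers stored with it, so the Date will not have advanced.
function fetchDateHeader(url, callback) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url, true);
  xhr.onload = function () {
    callback(xhr.getResponseHeader('Date'));
  };
  xhr.send();
}

fetchDateHeader('/logo.png', function (first) {
  // Wait a couple of seconds so a fresh response would carry a later Date.
  setTimeout(function () {
    fetchDateHeader('/logo.png', function (second) {
      console.log(first, second,
        first === second ? '(stored headers reused)' : '(fresh headers)');
    });
  }, 2000);
});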

Related

Prevent AJAX POST responses from being cached

I have AJAX POST requests generated from my webpage, and there may be multiple post requests with the same post data. But the response may vary, and I want to make sure I am not getting cached responses to any of these requests. I need each request to hit the webpage.
Am I right in assuming that responses to POST requests will not be cached?
There are two levels of caching involved in this process:
Browser caching
Server caching
To eliminate the first one, you have to trick your browser into treating each request as unique by adding a fake parameter (a code sketch follows this answer), e.g.
www.example.com/api/ajax?123
www.example.com/api/ajax?1234
At the server level, you have to make sure that no caching has been configured for such a URL; for example, some developers cache any file ending in .json, and a service like Cloudflare will automatically cache any static content.
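A minimal sketch of that cache-busting trick, assuming the hypothetical /api/ajax endpoint above (jQuery's cache: false option does the same thing by appending a _=timestamp parameter):

// Append the current time so every URL is unique and can never
// match an entry already sitting in the browser cache.
var url = '/api/ajax?_=' + new Date().getTime();
var xhr = new XMLHttpRequest();
xhr.open('POST', url, true);
xhr.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded');
xhr.send('query=example');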

What are the security benefits of CORS preflight requests?

I've been working on a classic SPA where the front-end app lives on app.example.com while the API lives on api.example.com, hence requiring CORS requests. I have set up the server to return the CORS headers, and it works fine.
Whenever an AJAX request is not simple, the browser makes an extra OPTIONS request to the server to determine whether it may make the call with that payload (see Simple Requests on MDN).
The question is: What are the actual benefits of doing the OPTIONS request, especially in regards to security?
Some users of my app have significant geographical latency and since the preflight cache doesn't last long, the preflight requests cause latencies to be multiplied.
I'm hoping to make my POST requests simple, but merely setting the Content-Type to application/json negates that. One potential solution is to "hack" around it by using text/plain or encoding the payload in the URL. Hence, I hope to leave with a full understanding of what CORS preflight requests do for web security. Thanks.
As noted in the article you linked to:
These are the same kinds of cross-site requests that web content can already issue, and no response data is released to the requester unless the server sends an appropriate header. Therefore, sites that prevent cross-site request forgery have nothing new to fear from HTTP access control.
Basically it was done to make sure CORS does not introduce any extra means for cross-domain requests to be made that would otherwise be blocked without CORS.
For example, without CORS, the following form content types could only be sent cross-domain via an actual <form> tag, not by an AJAX request:
application/x-www-form-urlencoded
multipart/form-data
text/plain
Therefore any server receiving a request with one of the above content-types knows that there is a possibility of it coming from another domain and knows to take measures against attacks such as Cross Site Request Forgery. Other content types such as application/json could previously only be made from the same domain, therefore no extra protection was necessary.
Similarly, requests with extra headers (e.g. X-Requested-With) were previously protected in the same way, since they could only have come from the same domain (a <form> tag cannot add extra headers, and forms were previously the only way to make a cross-domain POST). GET and POST are also the only methods a form supports; HEAD is on the list too because it behaves identically to GET, just without a message body in the response.
So, in a nutshell, the preflight stops a "non-simple" request from being made in the first place until OPTIONS has been invoked to ensure that both client and server are talking the CORS language. Remember that the Same Origin Policy only prevents reads from different origins, so the preflight mechanism is still needed to prevent writes from taking place, i.e. unsafe methods being executed in a CSRF scenario.
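To illustrate the boundary, here are both kinds of request against a hypothetical https://api.example.com/things endpoint (the URL and payload are made up):

// Simple request: text/plain is one of the form-encodable content types,
// so the browser sends the POST immediately; a <form> could have made
// the same request before CORS existed.
var simple = new XMLHttpRequest();
simple.open('POST', 'https://api.example.com/things', true);
simple.setRequestHeader('Content-Type', 'text/plain');
simple.send('name=Nintendo');

// Non-simple request: application/json is not a form content type, so
// the browser first issues an OPTIONS preflight and only sends the POST
// if the server's CORS response headers allow it.
var preflighted = new XMLHttpRequest();
preflighted.open('POST', 'https://api.example.com/things', true);
preflighted.setRequestHeader('Content-Type', 'application/json');
preflighted.send(JSON.stringify({ name: 'Nintendo' }));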
You might be able to increase performance using the Access-Control-Max-Age header, which tells the browser how long (in seconds) it may cache the result of a preflight.
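A sketch of what that might look like, written as a bare Node.js handler (the origin, methods, and one-hour value are assumptions to adapt):

const http = require('http');

http.createServer((req, res) => {
  if (req.method === 'OPTIONS') {
    // Answer the preflight and let the browser cache the verdict for
    // an hour so repeated calls skip the extra round trip.
    res.writeHead(204, {
      'Access-Control-Allow-Origin': 'https://app.example.com',
      'Access-Control-Allow-Methods': 'GET, POST',
      'Access-Control-Allow-Headers': 'Content-Type',
      'Access-Control-Max-Age': '3600'
    });
    res.end();
    return;
  }
  res.writeHead(200, {
    'Access-Control-Allow-Origin': 'https://app.example.com',
    'Content-Type': 'application/json'
  });
  res.end('{"ok":true}');
}).listen(8080);

Note that browsers cap how long they will honor Access-Control-Max-Age, so very large values are silently reduced.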

How to invalidate the cache of an arbitrary URL?

According to http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.10 clients must invalidate the cache associated with a URL after a POST, PUT, or DELETE request.
Is it possible to instruct a web browser to invalidate the cache of an arbitrary URL, without making an HTTP request to it?
For example:
PUT /companies/Nintendo creates a new company called "Nintendo"
GET /companies lists all companies
Every time I create a new company, I want to invalidate the cache associated with GET /companies. The browser doesn't do this automatically because the two operate on different URLs.
Is the Cache-Control mechanism inappropriate for this situation? Should I use no-cache along with ETag instead? What is the best-practice for this situation?
I know I can pass no-cache the next time I GET /companies, but that requires the application to keep track of URL invalidation instead of pushing the responsibility to the browser. In other words, I want to invalidate the URL after step 1 rather than having to persist this information and apply it at step 2. Any ideas?
Yes, you can (within the same domain). From this answer (slightly paraphrased):
In response to a PUT or POST request, if the Content-Location header URI is different from the request URI, then the cache for the Content-Location URI is invalidated.
So in your case, include a Content-Location: /companies header in response to your POST request. This will invalidate the browser's cached version of /companies.
Note that this does not work for GET requests.
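A sketch of such a response as a bare Node.js handler (the payload is made up; the paths follow the question):

const http = require('http');

http.createServer((req, res) => {
  if (req.method === 'POST' && req.url === '/companies') {
    // ...create the company, then name the collection URI whose cached
    // representation this request has made stale.
    res.writeHead(201, {
      'Content-Location': '/companies',
      'Content-Type': 'application/json'
    });
    res.end(JSON.stringify({ name: 'Nintendo' }));
    return;
  }
  res.writeHead(404);
  res.end();
}).listen(8080);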
No, in HTTP/1.1 you may only invalidate a client's cache for a resource in a response to a request for that resource. It may be in response to a PUT, POST or DELETE rather than a GET (see RFC 7234, section 4.4 for details).
If you have a resource where you need clients to confirm that they have the latest version then no-cache and an entity tag is an ideal solution.
HTTP/2 allows a server to push cache invalidations (see "Nine Things to Expect from HTTP/2", item 4: Cache Pushing).
In the link you have given, "the phrase 'invalidate an entity' means that the cache will either remove all instances of that entity from its storage, or will mark these as 'invalid' and in need of a mandatory revalidation before they can be returned in response to a subsequent request." Now the question is: where are these caches? I believe the cache the article is talking about is the server cache.
I have worked on a project in VC++ where, whenever a model changed, the cache was updated; there was explicit programming logic involved to achieve this. Your article rightly says "There is no way for the HTTP protocol to guarantee that all such cache entries are marked invalid": the HTTP protocol cannot invalidate caches on its own.
In our project we used a publish-subscribe mechanism. Whenever an object of class A is updated or inserted, it is published to a bus. Controllers register to listen for objects on the bus; a controller interested in changes to objects of type A is not called back when an object of type B is changed and published. When an object of type A is changed and published, that controller's listener updates the cache with the latest changes, and a subsequent GET /companies request gets the latest data from the cache. There is a time gap between changing the object and the cache being refreshed; to prevent anything going wrong in that gap, the object is marked dirty before it is changed, so a request arriving in between waits for the dirty flag to be cleared.
There is also the browser cache; I recall ETags are used to validate it. An ETag is a checksum of the resource, and the client keeps the old ETag value. If the checksum of the resource has changed, the new resource is sent with HTTP 200; otherwise HTTP 304 (use the local copy) is sent.
[Update]
PUT /companies/Nintendo
GET /companies
are two different resources. Only the cache for /companies/Nintendo is expected to be updated, not the one for /companies (I am talking about the client-side cache), when the PUT /companies/Nintendo request is executed. If you later call GET /companies/Nintendo, the response is returned based on the HTTP headers; GET /companies is a brand-new request, as it points to a different resource.
Now the question is: what should the HTTP headers be? That is purely application-specific. If it were a stock quote I would not cache it; if it were a news item I would cache it for a certain time. Your reference link http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html has all the details of the cache-related HTTP headers. The one thing not covered in much depth is ETag usage: an ETag can hold a checksum of the resource. Check http://en.wikipedia.org/wiki/HTTP_ETag and also https://devcenter.heroku.com/articles/increasing-application-performance-with-http-cache-headers
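To make the ETag flow concrete, a small client-side sketch (the /companies URL follows the question; for ordinary cached GETs the browser manages If-None-Match itself, so hand-rolling it is only needed when you store responses yourself):

// Remember the ETag from the last successful fetch and send it back as
// If-None-Match; a 304 reply means the stored copy is still current.
var lastETag = null;
var lastBody = null;

function loadCompanies(onData) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', '/companies', true);
  if (lastETag) {
    xhr.setRequestHeader('If-None-Match', lastETag);
  }
  xhr.onload = function () {
    if (xhr.status === 304) {
      onData(lastBody); // not modified: reuse the stored copy
    } else if (xhr.status === 200) {
      lastETag = xhr.getResponseHeader('ETag');
      lastBody = xhr.responseText;
      onData(lastBody);
    }
  };
  xhr.send();
}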

No expires header sent, content cached, how long until browser makes conditional GET request?

Assume browser default settings, and content is sent without expires headers.
user visits website, browser caches images etc.
user does not close browser, or refresh page.
user continues to surf site normally.
assume the browser doesn't dump the cache for any reason.
The browser will cache images etc. as the user surfs, but it's unclear when it will issue a conditional GET request to ask about content freshness (apart from refreshing the page). If this is a browser-specific setting, where can I see its value (for browsers like Safari, IE, Firefox, Chrome)?
[edit: yes - I understand that you should always send expires headers. However, this research is aimed at understanding how the browser works with content w/o expires headers.]
From the HTTP caching spec (section 13.4): "Unless specifically constrained by a cache-control (section 14.9) directive, a caching system MAY always store a successful response (see section 13.8) as a cache entry, MAY return it without validation if it is fresh, and MAY return it after successful validation." This means that a user agent is free to do whatever it wants if no cache-control header is sent. Most browsers use a combination of user settings and heuristics to decide whether (and for how long) to cache in this situation.
HTTP/1.1 defines a selection of caching mechanisms; the expires header is merely one, there is also the cache-control header.
To directly answer your question: for a resource returned with no expires header, you must consider the returned cache-control directives.
HTTP/1.1 defines no caching behaviour for a resource served with no cache-related headers. If a resource is sent with no cache-control or expires headers, you must assume the client will make a regular (non-conditional) request the next time the same resource is requested.
Any deviation from this behaviour qualifies the client as not fully conformant, in which case the question becomes: what behaviour is to be expected from a non-conformant HTTP client? There is no way to answer that.
HTTP caching is complex, to fully understand what a conformant client should do in a given scenario, read and understand the HTTP caching spec.
Unless you send an expires header, most browsers will make a GET request on each subsequent refresh and will receive either HTTP 200 OK (the content is downloaded again) or HTTP 304 Not Modified (the cached data is used).
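Given that uncertainty, the practical move is to be explicit rather than leaving freshness to browser heuristics; a minimal Node.js sketch (the one-hour lifetime is just an example):

const http = require('http');

http.createServer((req, res) => {
  res.writeHead(200, {
    'Content-Type': 'text/html',
    // An explicit freshness lifetime: the browser may reuse this
    // response for an hour without any conditional GET, and no
    // heuristic guessing is involved.
    'Cache-Control': 'max-age=3600'
  });
  res.end('<p>hello</p>');
}).listen(8080);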

Can you reliably set or delete a cookie during the server side processing of an Ajax (XHR) call?

I have done a bit of testing on this myself (during the server-side processing of a DWR Framework Ajax request handler, to be exact), and it seems you CAN successfully manipulate cookies, but this goes against much of what I have read about Ajax best practices and how browsers interpret the response from an XmlHttpRequest. Note I have tested on:
IE 6 and 7
Firefox 2 and 3
Safari
and in all cases standard cookie operations on the HttpServletResponse object during Ajax request handling were correctly interpreted by the browser. But I would like to know whether it is best practice to push the cookie manipulation to the client side, or whether this (much cleaner) server-side cookie handling can be trusted.
I would welcome answers both specific to the DWR Framework and Ajax in general.
XMLHttpRequest always uses the web browser's connection framework. This is a requirement for AJAX programs to work correctly, as the user would get logged out if the XHR object lacked access to the browser's cookie pool.
It's theoretically possible for a web browser to make XHR connections without sharing session cookies through its connection framework, but this has never (to my knowledge) happened in practice. Even the Flash plugin uses the web browser's connections.
Thus the end result is that it IS safe to manipulate cookies via AJAX. Just keep in mind that the AJAX call might never happen; AJAX calls are not guaranteed events, so don't count on them.
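For what it's worth, on the wire this is nothing more than an ordinary Set-Cookie header on the XHR response. A minimal Node.js sketch (the cookie name and value are made up; a Servlet's HttpServletResponse.addCookie produces the equivalent header):

const http = require('http');

http.createServer((req, res) => {
  // The browser stores this cookie exactly as it would for a full page
  // load, and sends it back on later requests, XHR or otherwise.
  res.writeHead(200, {
    'Set-Cookie': 'token=abc123; Path=/; HttpOnly',
    'Content-Type': 'application/json'
  });
  res.end('{"ok":true}');
}).listen(8080);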
In the context of DWR it may not be "safe".
From reading the DWR site it says:
It is important that you treat the HTTP request and response as read-only. While HTTP headers might get through OK, there is a good chance that some browsers will ignore them.
I've taken this to mean that setting cookies or request attributes is a no-no.
Having said that, I have code which does set request attributes (code I wrote before I read that page) and it appears to work fine (apart from deleting cookies, which I mentioned in my comment above).
Manipulating cookies on the client side is rather the opposite of "best practice". And it shouldn't be necessary, either. HttpOnly cookies weren't introduced for nothing.
