Caching with SSL certification - caching

I read that if a request is authenticated or secure, it won't be cached. We have already put work into our caching and are now planning to purchase an SSL certificate.
If caching cannot be done over an SSL connection, does that mean our work on caching is useless?
Reference: http://www.mnot.net/cache_docs/

Your reference is wrong. Content sent over HTTPS will be cached by modern browsers, but it obviously cannot be cached by intermediate proxies. See http://arstechnica.com/business/2011/03/https-is-great-here-is-why-everyone-needs-to-use-it-so-ars-can-too/ or https://blog.httpwatch.com/2011/01/28/top-7-myths-about-https/ for example.

You can use the Cache-Control: public header to allow a representation served over HTTPS to be cached.
While the document you refer to says "If the request is authenticated or secure (i.e., HTTPS), it won’t be cached.", it's within a paragraph starting with "Generally speaking, these are the most common rules that are followed [...]".
The same document goes into more details after this:
Useful Cache-Control response headers include:
public — marks authenticated responses as cacheable; normally, if HTTP authentication is required, responses are automatically private.
(What applies to HTTP with authentication also applies to HTTPS.)
Obviously, documents that actually contain sensitive information only aimed for the authenticated user should not be served with this header, since they really shouldn't be cached. However, using this header for items that are suitable for caching (e.g. common images and scripts) should improve the performance of your website (as expected for caching over plain HTTP).
What will never happen with HTTPS is the caching of resources by intermediate proxy servers (between the client and your web-server, at least the external part, if you have a load-balancer or similar). Some CDNs will serve content over HTTPS (assuming it's suitable for your system to trust these CDNs). In general, these proxy servers wouldn't fall under the control of your cache design anyway.
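As a rough illustration of the split described above (shared assets marked public, user-specific documents kept out of caches), here is a minimal Python sketch. The function name, the suffix list and the one-day lifetime are assumptions made for the example, not something taken from the referenced document.

```python
def cache_control_for(path: str, is_user_specific: bool) -> str:
    """Pick a Cache-Control value for a response served over HTTPS."""
    # Hypothetical list of suffixes treated as shared static assets.
    static_suffixes = (".png", ".jpg", ".gif", ".css", ".js")
    if is_user_specific:
        # Sensitive, per-user content: ask every cache not to store it.
        return "no-store"
    if path.endswith(static_suffixes):
        # Common images and scripts: explicitly cacheable, here for one day.
        return "public, max-age=86400"
    # Anything else: storable by the browser only, revalidated before reuse.
    return "private, max-age=0"
```

With this sketch, `cache_control_for("/img/logo.png", False)` yields the public value, while a page such as an account overview would be served with `no-store`.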

Related

How to make clients request over HTTPS without HSTS preload?

If I request our website using HTTP, http://example.com, the response is 301 Moved Permanently with the Location header set to https://example.com - which, of course, is insecure because it is open to a man-in-the-middle (MITM) attack.
Is there not a way to just respond to the browser with something along the lines of "make the same request again but this time over HTTPS" instead of explicitly telling the browser the URL?
I was expecting to find this kind of solution on Troy Hunt's blog post but the only suggestion there is to use HSTS preload (ie. register our site with Google) which we do not want to do.
HTTP Strict-Transport-Security (HSTS) allows you to send a HTTP Header to say “next time you use this domain - make sure it’s over HTTPS even if the user types http:// or uses a link beginning http://“.
In Apache it is set with the following config:
Header always set Strict-Transport-Security "max-age=60;"
This sends the message telling the browser to remember this header for 60 seconds. You should increase this as you confirm there are no issues. A setting of 63072000 (2 years) is often recommended.
So this is more secure than a redirect as it happens automatically without needing an insecure HTTP request to be sent which could be intercepted, read and even changed on an insecure network.
For example let’s imagine you have logged on to your internet banking previously on your home WiFi, the browser has remembered the HSTS setting and then you visit your local coffee shop. Here you try to connect to the free WiFi but actually connect to a hacker’s WiFi instead. If you go to your internet banking with an HTTP link, bookmark or by typing the URL, then HSTS will kick in and you will go over HTTPS straight away and the hacker cannot decrypt your traffic (within reason).
So. All is good. You can also add the includeSubDomains attribute:
Header always set Strict-Transport-Security "max-age=63072000; includeSubDomains"
Which adds extra security.
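If the header is produced by application code rather than by Apache, the same value can be assembled programmatically. The following is only a sketch; the function name and its parameters are made up for illustration.

```python
from typing import Tuple


def hsts_header(max_age_seconds: int, include_subdomains: bool = False) -> Tuple[str, str]:
    """Build a (name, value) pair for the Strict-Transport-Security header."""
    if max_age_seconds < 0:
        raise ValueError("max-age must not be negative")
    # No space is allowed between "max-age=" and the number.
    value = f"max-age={max_age_seconds}"
    if include_subdomains:
        value += "; includeSubDomains"
    return ("Strict-Transport-Security", value)
```

Starting with `hsts_header(60)` and moving to `hsts_header(63072000, True)` once everything is confirmed to work mirrors the gradual roll-out described above.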
The one flaw with HSTS is it requires that initial connection to load this HTTP header and protect you in future. It also times out after the max-age time. That’s where preload comes in. You can submit your domain to the browsers and they will load this domain’s HSTS setting into the browser code and make this permanent so even that first connection is secure.
However I really don’t like preload to be honest. I just find the fact it’s out of your control dangerous. So if you discover some domain is not using HTTPS (e.g. http://blog.example.com or http://intranet.example.com or http://dev.example.com) then as soon as the preload comes into effect - BANG you’ve forced yourself to upgrade these, and quickly, as they are inaccessible until then. Getting removed from the browsers’ preload lists takes months at least and few can live with that downtime. Of course you should test this, but that requires going to https://example.com (instead of https://www.example.com) and using includeSubDomains to fully replicate what preload will do, and not everyone does that. There are many, many examples of sites getting this wrong.
You’ve also got to ask what you are protecting against and what risks you are exposing yourself to. With a http:// link a hacker intercepting could get access to cookies (which the site can protect against by using the secure attribute on cookies) and possibly intercept the traffic by keeping you on http:// instead of upgrading to https:// (which is mostly mitigated with HSTS and is increasingly flagged by the browser anyway). Remember that even on an attacker’s WiFi network the green padlock means the connection is secure (within reasonable limitations). So as long as you look for this (and your users do, which is more difficult I admit) the risks are reasonably small. This is why the move to HTTPS everywhere and then HTTPS by default is so important. So for most sites I think HSTS without preload is sufficient, and leaves the control with you the site owner.

WebSocket and the Origin header field

The following is quoted from RFC6455 - WebSocket protocol.
Servers that are not intended to process input from any web page but
only for certain sites SHOULD verify the |Origin| field is an origin
they expect. If the origin indicated is unacceptable to the server,
then it SHOULD respond to the WebSocket handshake with a reply
containing HTTP 403 Forbidden status code.
The |Origin| header field protects from the attack cases when the
untrusted party is typically the author of a JavaScript application
that is executing in the context of the trusted client. The client
itself can contact the server and, via the mechanism of the |Origin|
header field, determine whether to extend those communication
privileges to the JavaScript application. The intent is not to prevent
non-browsers from establishing connections but rather to ensure that
trusted browsers under the control of potentially malicious JavaScript
cannot fake a WebSocket handshake.
I just cannot be sure about what the 2nd paragraph means, especially the italic part. Could anyone explain it a bit? Or maybe an example.
My understanding so far is like this:
If server CAN be sure that requests DO come from Web pages, the ORIGIN header can be used to prevent access from un-welcomed Web pages.
If server CANNOT be sure that requests come from Web pages, the ORIGIN header is merely advisory.
Your understanding seems to be correct, but...
I would rephrase it: you can be sure that a JavaScript client running in a browser will send a proper Origin header. You don't know what will be sent by other clients (or whether the value is correct).
This should prevent other pages from connecting to "your" WebSocket endpoints (which is a big deal, imagine injected JavaScript somewhere on jsfiddle or some frequently visited page), but if you need to make sure that no other client will be able to connect to it, you'll need to introduce some other security measures.
I believe this is meant only as prevention of browser based "data stealing" or "DDoSing", nothing else; you can still do that by using some other client.
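A server-side check along these lines could look like the sketch below. It only decides on a status code for the handshake; the allow-list contents and the function name are assumptions for the example, and how to treat a missing Origin header (which non-browser clients may simply omit) is left as an explicit policy flag.

```python
from typing import Mapping

# Hypothetical allow-list of pages that may open a WebSocket to this server.
ALLOWED_ORIGINS = {"https://example.com", "https://www.example.com"}


def handshake_status(headers: Mapping[str, str], allow_missing_origin: bool = False) -> int:
    """Return 101 to accept the WebSocket handshake or 403 to refuse it."""
    # Header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}
    origin = normalised.get("origin")
    if origin is None:
        # Browsers always send Origin; other clients may omit or forge it.
        return 101 if allow_missing_origin else 403
    return 101 if origin in ALLOWED_ORIGINS else 403
```

This only stops browsers running somebody else's page; a custom client can send any Origin value it likes, so authentication is still needed if the endpoint must be restricted.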

Amazon Cloudfront: private content but maximise local browser caching

For JPEG image delivery in my web app, I am considering using Amazon S3 (or Amazon Cloudfront if it turns out to be the better option) but have two, possibly opposing, requirements:
The images are private content; I want to use signed URLs with short expiration times.
The images are large; I want them cached long-term by the users' browser.
The approach I'm thinking is:
User requests www.myserver.com/the_image
Logic on my server determines the user is allowed to view the image. If they are allowed...
Redirect the browser (is HTTP 307 best ?) to a signed Cloudfront URL
Signed Cloudfront URL expires in 60 seconds but its response includes "Cache-Control: max-age=31536000, private"
The problem I foresee is that the next time the page loads, the browser will be looking for www.myserver.com/the_image but its cache will be for the signed Cloudfront URL. My server will return a different signed Cloudfront URL the second time, due to very short expiration times, so the browser won't know it can use its cache.
Is there a way round this without having my webserver proxy the image from Cloudfront (which obviously negates all the benefits of using Cloudfront)?
Wondering if there may be something I could do with etag and HTTP 304 but can't quite join the dots...
To summarize, you have private images you'd like to serve through Amazon Cloudfront via signed urls with a very short expiration. However, while access by a particular url may be time limited, it is desirable that the client serve the image from cache on subsequent requests even after the url expiration.
Regardless of how the client arrives at the cloudfront url (directly or via some server redirect), the client cache of the image will only be associated with the particular url that was used to request the image (and not any other url).
For example, suppose your signed url is the following (expiry timestamp shortened for example purposes):
http://[domain].cloudfront.net/image.jpg?Expires=1000&Signature=[Signature]
If you'd like the client to benefit from caching, you have to send it to the same url. You cannot, for example, direct the client to the following url and expect the client to use a cached response from the first url:
http://[domain].cloudfront.net/image.jpg?Expires=5000&Signature=[Signature]
There are currently no cache control mechanisms to get around this, including ETag, Vary, etc. The nature of client caching on the web is that a resource in cache is associated with a url, and the purpose of the other mechanisms is to help the client determine when its cached version of a resource identified by a particular url is still fresh.
You're therefore stuck in a situation where, to benefit from a cached response, you have to send the client to the same url as the first request. There are potential ways to accomplish this (cookies, local storage, server scripting, etc.), and let's suppose that you have implemented one.
You next have to consider that caching is only a suggestion, never a guarantee. If you expect the client to have the image cached and serve it the original url to benefit from that caching, you run the risk of a cache miss. In the case of a cache miss after the url expiry time, the original url is no longer valid. The client is then left unable to display the image (from the cache or from the provided url).
The behavior you're looking for simply cannot be provided by conventional caching when the expiry time is in the url.
Since the desired behavior cannot be achieved, you might consider your next best options, each of which will require giving up on one aspect of your requirement. In the order I would consider them:
If you give up short expiry times, you could use longer expiry times and rotate urls. For example, you might set the url expiry to midnight and then serve that same url for all requests that day. Your client will benefit from caching for the day, which is likely better than none at all. Obvious disadvantage is that your urls are valid longer.
If you give up content delivery, you could serve the images from a server which checks for access with each request. Clients will be able to cache the resource for as long as you want, which may be better than content delivery depending on the frequency of cache hits. A variation of this is to trade Amazon CloudFront for another provider, since there may be other content delivery networks which support this behavior (although I don't know of any). The loss of the content delivery network may be a disadvantage or may not matter much depending on your specific visitors.
If you give up the simplicity of a single static HTTP request, you could use client side scripting to determine the request(s) that should be made. For example, in javascript you could attempt to retrieve the resource using the original url (to benefit from caching), and if it fails (due to a cache miss and lapsed expiry) request a new url to use for the resource. A variation of this is to use some caching mechanism other than the browser cache, such as local storage. The disadvantage here is increased complexity and compromised ability for the browser to prefetch.
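The first option (longer expiry times with rotating urls) comes down to rounding the expiry up to a fixed boundary, so that every request within the same window is signed with the same Expires value. A minimal sketch, assuming a hypothetical helper name and assuming the signing step produces the same url for identical inputs:

```python
def bucketed_expiry(now_epoch: int, bucket_seconds: int = 86400) -> int:
    """Round the expiry up to the next bucket boundary (midnight UTC by default)."""
    # Integer division finds the current bucket; +1 moves to its end.
    return (now_epoch // bucket_seconds + 1) * bucket_seconds
```

Requests made at 09:00 and at 17:00 UTC on the same day then get the same Expires value, so the browser sees one url for the whole day and can reuse its cached copy.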
Save a list of user+image+expiration time -> cloudfront links. If a user has an non-expired cloudfront link use it for an image and don't generate a new one.
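One way to read this suggestion is as a small in-memory table keyed by user and image. The sketch below assumes a hypothetical `sign(image, expires)` callable that returns the signed url; it is not a CloudFront API, just a placeholder for whatever signing code is already in use.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class SignedUrlCache:
    """Remember the signed link issued per (user, image) until it expires."""

    def __init__(self, sign: Callable[[str, int], str], ttl_seconds: int = 60) -> None:
        self._sign = sign  # hypothetical signer: (image, expires) -> url
        self._ttl = ttl_seconds
        self._links: Dict[Tuple[str, str], Tuple[str, int]] = {}

    def url_for(self, user: str, image: str, now: Optional[int] = None) -> str:
        now = int(time.time()) if now is None else now
        cached = self._links.get((user, image))
        if cached is not None and cached[1] > now:
            # Link has not expired yet: hand out the same url again.
            return cached[0]
        expires = now + self._ttl
        url = self._sign(image, expires)
        self._links[(user, image)] = (url, expires)
        return url
```

Note that this only keeps the url stable while the link is still valid; once it expires a new url is issued and the browser cache entry for the old one is no longer used.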
It seems you have already solved the issue. You said that your server issues an HTTP 307 redirect to the signed CloudFront URL, so the browser caches only the CloudFront URL, not your URL (www.myserver.com/the_image). So the scenario is as follows:
Client 1 checks www.myserver.com/the_image -> is redirected to the CloudFront URL -> content is cached
The CloudFront URL now expires.
Client 1 checks www.myserver.com/the_image again -> is redirected to the same CloudFront URL -> retrieves the content from cache without fetching it from CloudFront again.
Client 2 checks www.myserver.com/the_image -> is redirected to the CloudFront URL, which denies access because the signature has expired.

What makes cross domain ajax insecure?

I'm not sure I understand what types of vulnerabilities this causes.
When I need to access data from an API I have to use ajax to request a PHP file on my own server, and that PHP file accesses the API. What makes this more secure than simply allowing me to hit the API directly with ajax?
For that matter, it looks like using JSONP http://en.wikipedia.org/wiki/JSONP you can do everything that cross-domain ajax would let you do.
Could someone enlighten me?
I think you're misunderstanding the problem that the same-origin policy is trying to solve.
Imagine that I'm logged into Gmail, and that Gmail has a JSON resource, http://mail.google.com/information-about-current-user.js, with information about the logged-in user. This resource is presumably intended to be used by the Gmail user interface, but, if not for the same-origin policy, any site that I visited, and that suspected that I might be a Gmail user, could run an AJAX request to get that resource as me, and retrieve information about me, without Gmail being able to do very much about it.
So the same-origin policy is not to protect your PHP page from the third-party site; and it's not to protect someone visiting your PHP page from the third-party site; rather, it's to protect someone visiting your PHP page, and any third-party sites to which they have special access, from your PHP page. (The "special access" can be because of cookies, or HTTP AUTH, or an IP address whitelist, or simply being on the right network — perhaps someone works at the NSA and is visiting your site, that doesn't mean you should be able to trigger a data-dump from an NSA internal page.)
JSONP circumvents this in a safe way, by introducing a different limitation: it only works if the resource is JSONP. So if Gmail wants a given JSON resource to be usable by third parties, it can support JSONP for that resource, but if it only wants that resource to be usable by its own user interface, it can support only plain JSON.
Many web services are not built to resist XSRF, so if a web-site can programmatically load user data via a request that carries cross-domain cookies just by virtue of the user having visited the site, anyone with the ability to run javascript can steal user data.
CORS is a planned secure alternative to XHR that solves the problem by not carrying credentials by default. The CORS spec explains the problem:
User agents commonly apply same-origin restrictions to network requests. These restrictions prevent a client-side Web application running from one origin from obtaining data retrieved from another origin, and also limit unsafe HTTP requests that can be automatically launched toward destinations that differ from the running application's origin.
In user agents that follow this pattern, network requests typically use ambient authentication and session management information, including HTTP authentication and cookie information.
EDIT:
The problem with just making XHR work cross-domain is that many web services expose ambient authority. Normally that authority is only available to code from the same origin.
This means that a user that trusts a web-site is trusting all the code from that website with their private data. The user trusts the server they send the data to, and any code loaded by pages served by that server. When the people behind a website and the libraries it loads are trustworthy, the user's trust is well-placed.
If XHR worked cross-origin, and carried cookies, that ambient authority would be available to code to anyone that can serve code to the user. The trust decisions that the user previously made may no longer be well-placed.
CORS doesn't inherit these problems because existing services don't expose ambient authority to CORS.
The pattern of JS->Server(PHP)->API makes it possible, and not only best practice but essential practice, to sanity-check what you get while it passes through the server. In addition, things like poisoned local resolvers (aka DNS worms) are much less likely on a server than on some random client.
As for JSONP: This is not a walking stick, but a crutch. IMHO it could be seen as an exploit against a misfeature of the HTML/JS combo that can't be removed without breaking existing code. Others might think differently about this.
While JSONP allows you to blindly execute code from somewhere out in the big bad world, nobody forces you to do so. Sane implementations of JSONP always use some sort of hashing etc. to verify that the provider of that code is trustworthy. Again, others might think differently.
With cross-site scripting, a web page would be able to pull data from anywhere and run it in the same context as the other data on the page, in theory with access to cookies and other security information that you would not want it to have access to. Cross-site scripting would be very insecure in this respect: you could visit any page and, if it were allowed, the script on that page could load data from anywhere and start executing bad code, which is why it is not allowed.
JSONP, on the other hand, lets you get data in JSON format because you provide the callback that the data is passed into. This gives you a measure of control: the data will not be executed by the browser unless the callback function does an exec or otherwise tries to execute it. The data arrives as JSON that you can do whatever you wish with, but it is not executed, which is why it is safer and why it is allowed.
The original XHR was never designed to allow cross-origin requests. The reason was a tangible security vulnerability that is primarily known by CSRF attacks.
In this attack scenario, a third party site can force a victim’s user agent to send forged but valid and legitimate requests to the origin site. From the origin server's perspective, such a forged request is indistinguishable from other requests by that user which were initiated by the origin server’s web pages. The reason is that it’s actually the user agent that sends these requests, and it automatically includes any credentials such as cookies, HTTP authentication, and even client-side SSL certificates.
Now such requests can be easily forged: Starting with simple GET requests by using <img src="…"> through to POST requests by using forms and submitting them automatically. This works as long as it’s predictable how to forge such valid requests.
But this is not the main reason to forbid cross-origin requests for XHR. Because, as shown above, there are ways to forge requests even without XHR and even without JavaScript. No, the main reason that XHR did not allow cross-origin requests is because it would be the JavaScript in the web page of the third party the response would be sent to. So it would not just be possible to send cross-origin requests but also to receive the response that can contain sensitive information that would then be accessible by the JavaScript.
That’s why the original XHR specification did not allow cross-origin requests. But as technology advances, there were reasonable requests for supporting cross-origin requests. That’s why the original XHR specification was extended to XHR level 2 (XHR and XHR level 2 are now merged) where the main extension is to support cross-origin requests under particular requirements that are specified as CORS. Now the server has the ability to check the origin of a request and is also able to restrict the set of allowed origins as well as the set of allowed HTTP methods and header fields.
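On the server side, the origin check described here boils down to comparing the request's Origin header with an allow-list and only then emitting the CORS response headers. A minimal sketch follows; the function name and parameters are made up for illustration, while the header names are the standard CORS ones.

```python
from typing import Dict, Iterable, Optional


def cors_headers(
    request_origin: Optional[str],
    allowed_origins: Iterable[str],
    allowed_methods: Iterable[str] = ("GET",),
) -> Dict[str, str]:
    """Return the CORS response headers, or none if the origin is not allowed."""
    if request_origin is None or request_origin not in set(allowed_origins):
        # No CORS headers: the browser will not expose the response to the page.
        return {}
    return {
        "Access-Control-Allow-Origin": request_origin,
        # Consulted by the browser when it sends a preflight request.
        "Access-Control-Allow-Methods": ", ".join(allowed_methods),
        # The response differs per origin, so caches must key on it.
        "Vary": "Origin",
    }
```

A request from an origin that is not on the list simply gets no CORS headers back, and the browser then refuses to hand the response to the calling script.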
Now to JSONP: To get the JSON response of a request in JavaScript and be able to process it, it would either need to be a same-origin request or, in case of a cross-origin request, your server and the user agent would need to support CORS (of which the latter is only supported by modern browsers). But to be able to work with any browser, JSONP was invented that is simply a valid JavaScript function call with the JSON as a parameter that can be loaded as an external JavaScript via <script> that, similar to <img>, is not restricted to same-origin requests. But as well as any other request, a JSONP request is also vulnerable to CSRF.
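To make the "function call with the JSON as a parameter" concrete, here is a small sketch of how a server might build a JSONP response body. The function name is hypothetical; validating the callback name is an extra precaution added in this example so that the caller cannot inject arbitrary script through it.

```python
import json
import re

# Accept only a plain JavaScript identifier as the callback name.
_CALLBACK_NAME = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def jsonp_body(callback: str, data: object) -> str:
    """Wrap JSON data in a call to the client-supplied callback function."""
    if not _CALLBACK_NAME.fullmatch(callback):
        raise ValueError("invalid callback name")
    return f"{callback}({json.dumps(data)});"
```

A page that includes `<script src="...?callback=handle">` would then receive `handle({...});` and the browser runs it as ordinary script, which is exactly why the same-origin restriction does not apply.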
So to conclude it from the security point of view:
XHR is required to make requests for JSON resources to get their responses in JavaScript
XHR2/CORS is required to make cross-origin requests for JSON resources to get their responses in JavaScript
JSONP is a workaround to circumvent cross-origin requests with XHR
But also:
Forging requests is laughably easy, although forging valid and legitimate requests is harder (but often quite easy as well)
CSRF attacks are a threat that should not be underestimated, so learn how to protect against CSRF

Is a change required only in the code of a web application to support HSTS?

If I want a client to always use an HTTPS connection, do I only need to include the headers in the code of the application, or do I also need to make a change on the server? Also, how is this different from simply redirecting a user to an HTTPS page every single time they attempt to use HTTP?
If you just have HTTP -> HTTPS redirects a client might still try to post sensitive data to you (or GET a URL that has sensitive data in it) - this would leave it exposed publicly. If it knew your site was HSTS then it would not even try to hit it via HTTP and so that exposure is eliminated. It's a pretty small win IMO - the bigger risks are the vast # of root CAs that everyone trusts blindly thanks to policies at Microsoft, Mozilla, Opera, and Google.
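On the "code of the application" part of the question: the header can be added from application code, for example with a small WSGI wrapper like the sketch below. The wrapper name is hypothetical, and it assumes the application sees `wsgi.url_scheme` as `https` for secure requests, which may not hold behind a TLS-terminating proxy; the server still has to be configured to serve HTTPS in the first place.

```python
def add_hsts(app, max_age=63072000):
    """Wrap a WSGI app so that HTTPS responses carry an HSTS header."""

    def wrapped(environ, start_response):
        def start_with_hsts(status, headers, exc_info=None):
            # Only send the header on secure responses; browsers ignore
            # Strict-Transport-Security received over plain HTTP.
            if environ.get("wsgi.url_scheme") == "https":
                headers = list(headers) + [
                    ("Strict-Transport-Security", f"max-age={max_age}")
                ]
            return start_response(status, headers, exc_info)

        return app(environ, start_with_hsts)

    return wrapped
```

Used as `application = add_hsts(application)`, this leaves the existing HTTP -> HTTPS redirect in place and adds the header to every response sent over HTTPS.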
