What is the purpose of http2 pseudo-headers :authority & :method? - http2

What is the purpose of http2 pseudo-headers :authority & :method ? I feel confused because :authority & :method seems to repeat the Request URL(the host) and Request Method in http 1.1
Compared to :path pseudo-headers, as explained in https://developers.google.com/web/fundamentals/performance/http2#header_compression I can see it can be used for the consecutive requests for other resource. So I suspect :authority & :method maybe an optimization for that purpose too. But I can't figure out how exactly. For example, if :authority, :method and :path are all different from the original Request URL and Request Method, shouldn't browser issue a new request?

The HTTP/2 :method pseudo-header is equivalent to the HTTP/1.1 request method (the first token of the HTTP/1.1 request line).
The HTTP/2 :authority pseudo-header is a stricter, mandatory, information about the host authority (i.e. host name and host port).
In HTTP/1.1, the host authority is derived from multiple sources, and may even be absent, causing a number of confusing behaviors that depend on server implementations.
For example, the authority could be present in the HTTP/1.1 request target, when it is in absolute form.
If the HTTP/1.1 request target is in origin form, typically the authority is defined by the Host header.
However, a HTTP/1.1 request could be of this form:
GET / HTTP/1.1\r\n
Host: \r\n
\r\n
where the Host field is empty and hence there is no authority. Requests of this type are uncommon, but technically valid and different server implementations may behave differently.
Furthermore, the authority may be overridden by the Forwarded header and its obsolete predecessors X-Forwarded-* headers, but again with fuzzy rules.
The purpose of the HTTP/2 :authority pseudo-header is to clear out the confusion about the authority so that there is a single source and no more multiple sources such as the HTTP/1.1 absolute form or Host header, etc.
Regarding the optimizations, it is true that HTTP/2 may optimize the send of the :authority information, but that is orthogonal to the purpose of the :authority pseudo-header.
The optimization works (very roughly) in this way: when a user agent (such as a browser) makes a request for a page, it is likely that it will make subsequent requests to the same authority for other, secondary, resources that are necessary to render the page (for example, CSS resources, JavaScript resources, images, etc.).
The HTTP/2 protocol indexes the :authority string, let's say at index 17 of the HPACK context, so the browser sends to the server the first request with the information :authority => (17, "veryverylongdomainname.com").
For the second request, the browser and the server now share this common information, so the browser can just send to the server :authority => 17, and the server will look up the authority from index 17, saving the send of the authority string bytes over the network.
This HPACK mechanism is valid for most headers, not only for pseudo-headers. For further information please see the HPACK specification.
Browser must make a request for every resource. If they make requests for the same authority, the HPACK optimization may kick in and reduce the number of bytes over the network.
If browsers make requests for different authorities (imagine a web spider), then the authority is likely to be different in every request and the HPACK optimization will likely not kick in because the authority is always different.
In the worst case (authorities always different), HTTP/2 is as good as HTTP/1.1; but in the common case (web pages with many resources with the same authority), HTTP/2 is better than HTTP/1.1, as less bytes are sent over the network.

Related

What is the HTTP/1 equivalent of the HTTP/2 `:scheme` header?

I'm writing a proxy from HTTP/2 to HTTP/1 and vice-versa.
When I have an incoming HTTP/2 request, which defines :scheme, what header should I map that to for my proxied HTTP/1 request?
The closest thing I can find is https://www.rfc-editor.org/rfc/rfc7239#section-5.4
Mapping the HTTP/2 :scheme pseudo header to HTTP/1.1 X-Forwarded-Proto header would be correct.
You basically shouldn't map it.
For a start HTTP has no direct equivalent to the :scheme pseudo-header. The request was a relative path (e.g. /path/page/) rather than an absolute path (e.g. https://www.example.com/path/page/) and the Host header contained just the server name and not the scheme.
So the connection knows whether it is HTTP or HTTPS and this is exposed to webservers and the like (e.g. in the REQUEST_SCHEME variable for Apache) but at an HTTP level it doesn't know.
If acting as an intercepting proxy and taking one HTTP/2 connection, and forwarding requests to another, then you should open a HTTP or HTTPS connection for that second connection, as you see fit depending what the downstream system supports.
As sbordet points out if you want to make the downstream system aware of what the original scheme was then you can use X-Forwarded-Proto header (technically obseleted but still used) or the Forwarded header, but that's more for informational purposes rather than a direct mapping of what was in the original request. The scheme is related to the current request.
According to RFC 7540, section 8.1.2:
While HTTP/1.x used the message start-line (see [RFC7230],
Section 3.1) to convey the target URI, the method of the request, and
the status code for the response, HTTP/2 uses special pseudo-header
fields beginning with ':' character (ASCII 0x3a) for this purpose.
And:
The ":scheme" pseudo-header field includes the scheme portion of
the target URI ([RFC3986], Section 3.1).
":scheme" is not restricted to "http" and "https" schemed URIs. A
proxy or gateway can translate requests for non-HTTP schemes,
enabling the use of HTTP to interact with non-HTTP services.
So, if you're proxing HTTP, it should be "http" and if you're proxying HTTPS, it should be "https".
Reading again, I can see that I may have had the sense of the question the wrong way around (I was thinking HTTP1 client, HTTP2 server). But the above two quotes are still the relevant ones. You don't put :scheme in an HTTP1 header - it forms part of the URI that you place in the message start line.

CORS with client https certificates

I have a site with two https servers. One (frontend) serves up a UI made of static pages. The other (backend) serves up a microservice. Both of them happen to be using the same (test) X509 certificate to identify themselves. Individually, I can connect to them both over https requiring the client certificate "tester".
We were hiding CORS issues until now by going through an nginx setup that makes the frontend and backend appear that they are same Origin. I have implemented the headers 'Access-Control-Allow-Origin', 'Access-Control-Allow-Credentials' for all requests; with methods, headers for preflight check requests (OPTIONS).
In Chrome, cross-site like this works just fine. I can see that front-end URLs and backend URLs are different sites. I see the OPTIONS requests being made before backend requests are made.
Even though Chrome doesn't seem to need it, I did find the xmlhttprequest object that will be used to perform the request and did a xhr.withCredentials = true on it, because that seems to be what fetch.js does under the hood when it gets "credentials":"include". I noticed that there is an xhr.setRequestHeader function available that I might need to use to make Firefox happy.
Firefox behaves identically for the UI calls. But for all backend calls, I get a 405. When it does this, there is no network connection being made to the server. The browser just decided that this is a 405 without executing any https request. Even though this is different behavior from Chrome, it kind of makes sense. Both the front-end UI and backend service need a client certificate to be chosen. I chose the certificate "tester" when I connected to the UI. When it goes to make a backend request, it could assume that the same client certificate should be used to reach the back-end. But maybe it assumes that it could be different, and there is something else I need to tell Firefox.
Is anybody here using CORS in combination with 2 way SSL certificates like this, and had this Firefox problem and fixed it somewhere. I suspect that it's not a server-side fix, but something that the client needs to do.
Edit: see the answer here: https://stackoverflow.com/a/74744206/537554
I haven't actually tested this using client certificates, but I seem to recall that Firefox will not send credentials if Access-Control-Allow-Origin is set to the * wildcard instead of an actual domain. See this page on MDN.
Also there's an issue with Firefox sending a CORS request to a server that expects the client certificate to be presented in the TLS handshake. Basically, Firefox will not send the certificate during the preflight, creating a chicken and the egg problem. See this bug on bugzilla.
When using CORS with credentials (basic auth, cookies, client certificate, etc.):
Access-Control-Allow-Credentials must be true
Access-Control-Allow-Origin must not be *
Access-Control-Allow-Origin must not be multi-value (neither duplicated nor comma-delimited)
Access-Control-Allow-Origin must be set to exactly the value from the request's Origin header in order for the request to work (either hard-coded that way or if it passes a whitelist of allowed values)
The preflight OPTIONS request must not require credentials (including the client certificate). Part of the purpose of the preflight is to ask what is allowed in a CORS request, and therefore sending credentials before knowing if they are allowed is incorrect.
The preflight OPTIONS request must return a 200-level response, generally 204
Note: For Access-Control-Allow-Origin, you may want to consider allowing the value null since redirect chains (like the ones typically used for OAuth) can cause that Origin value in a request from a browser.

Caching with SSL certification

I read if the request is authenticated or secure, it won't be cached. We previously worked on our cache and now planning to purchase a SSL certificate.
If caching cannot be done with SSL connection then is that mean our work on caching is useless?
Reference: http://www.mnot.net/cache_docs/
Your reference is wrong. Content sent over https will be cached in modern browsers, but they obviously cannot be cached in intermediate proxies. See http://arstechnica.com/business/2011/03/https-is-great-here-is-why-everyone-needs-to-use-it-so-ars-can-too/ or https://blog.httpwatch.com/2011/01/28/top-7-myths-about-https/ for example.
You can use the Cache-Control: public header to allow a representation served over HTTPS to be cached.
While the document you refer to says "If the request is authenticated or secure (i.e., HTTPS), it won’t be cached.", it's within a paragraph starting with "Generally speaking, these are the most common rules that are followed [...]".
The same document goes into more details after this:
Useful Cache-Control response headers include:
public — marks authenticated responses as cacheable; normally, if HTTP authentication is required, responses are automatically private.
(What applies to HTTP with authentication also applies to HTTPS.)
Obviously, documents that actually contain sensitive information only aimed for the authenticated user should not be served with this header, since they really shouldn't be cached. However, using this header for items that are suitable for caching (e.g. common images and scripts) should improve the performance of your website (as expected for caching over plain HTTP).
What will never happen with HTTPS is the caching of resources by intermediate proxy servers (between the client and your web-server, at least the external part, if you have a load-balancer or similar). Some CDNs will serve content over HTTPS (assuming it's suitable for your system to trust these CDNs). In general, these proxy servers wouldn't fall under the control of your cache design anyway.

What is the motivation behind the introduction of preflight CORS requests?

Cross-origin resource sharing is a mechanism that allows a web page to make XMLHttpRequests to another domain (from Wikipedia).
I've been fiddling with CORS for the last couple of days and I think I have a pretty good understanding of how everything works.
So my question is not about how CORS / preflight work, it's about the reason behind coming up with preflights as a new request type. I fail to see any reason why server A needs to send a preflight (PR) to server B just to find out if the real request (RR) will be accepted or not - it would certainly be possible for B to accept/reject RR without any prior PR.
After searching quite a bit I found this piece of information at www.w3.org (7.1.5):
To protect resources against cross-origin requests that could not originate from certain user agents before this specification existed a
preflight request is made to ensure that the resource is aware of this
specification.
I find this is the hardest to understand sentence ever. My interpretation (should better call it 'best guess') is that it's about protecting server B against requests from server C that is not aware of the spec.
Can someone please explain a scenario / show a problem that PR + RR solves better than RR alone?
I spent some time being confused as to the purpose of the preflight request but I think I've got it now.
The key insight is that preflight requests are not a security thing. Rather, they're a not-changing-the-rules thing.
Preflight requests have nothing to do with security, and they have no bearing on applications that are being developed now, with an awareness of CORS. Rather, the preflight mechanism benefits servers that were developed without an awareness of CORS, and it functions as a sanity check between the client and the server that they are both CORS-aware. The developers of CORS felt that there were enough servers out there that were relying on the assumption that they would never receive, e.g. a cross-domain DELETE request that they invented the preflight mechanism to allow both sides to opt-in. They felt that the alternative, which would have been to simply enable the cross-domain calls, would have broken too many existing applications.
There are three scenarios here:
Old servers, no longer under development, and developed before CORS. These servers may make assumptions that they'll never receive e.g. a cross-domain DELETE request. This scenario is the primary beneficiary of the preflight mechanism. Yes these services could already be abused by a malicious or non-conforming user agent (and CORS does nothing to change this), but in a world with CORS the preflight mechanism provides an extra 'sanity check' so that clients and servers don't break because the underlying rules of the web have changed.
Servers that are still under development, but which contain a lot of old code and for which it's not feasible/desirable to audit all the old code to make sure it works properly in a cross-domain world. This scenario allows servers to progressively opt-in to CORS, e.g. by saying "Now I'll allow this particular header", "Now I'll allow this particular HTTP verb", "Now I'll allow cookies/auth information to be sent", etc. This scenario benefits from the preflight mechanism.
New servers that are written with an awareness of CORS. According to standard security practices, the server has to protect its resources in the face of any incoming request -- servers can't trust clients to not do malicious things. This scenario doesn't benefit from the preflight mechanism: the preflight mechanism brings no additional security to a server that has properly protected its resources.
What was the motivation behind introducing preflight requests?
Preflight requests were introduced so that a browser could be sure it was dealing with a CORS-aware server before sending certain requests. Those requests were defined to be those that were both potentially dangerous (state-changing) and new (not possible before CORS due to the Same Origin Policy). Using preflight requests means that servers must opt-in (by responding properly to the preflight) to the new, potentially dangerous types of request that CORS makes possible.
That's the meaning of this part of the original specification: "To protect resources against cross-origin requests that could not originate from certain user agents before this specification existed a preflight request is made to ensure that the resource is aware of this specification."
Can you give me an example?
Let's imagine that a browser user is logged into their banking site at A.com. When they navigate to the malicious B.com, that page includes some Javascript that tries to send a DELETE request to A.com/account. Since the user is logged into A.com, that request, if sent, would include cookies that identify the user.
Before CORS, the browser's Same Origin Policy would have blocked it from sending this request. But since the purpose of CORS is to make just this kind of cross-origin communication possible, that's no longer appropriate.
The browser could simply send the DELETE and let the server decide how to handle it. But what if A.com isn't aware of the CORS protocol? It might go ahead and execute the dangerous DELETE. It might have assumed that—due to the browser's Same Origin Policy—it could never receive such a request, and thus it might have never been hardened against such an attack.
To protect such non-CORS-aware servers, then, the protocol requires the browser to first send a preflight request. This new kind of request is something that only CORS-aware servers can respond to properly, allowing the browser to know whether or not it's safe to send the actual DELETE.
Why all this fuss about the browser, can't the attacker just send a DELETE request from their own computer?
Sure, but such a request won't include the user's cookies. The attack that this is designed to prevent relies on the fact that the browser will send cookies (in particular, authentication information for the user) for the other domain along with the request.
That sounds like Cross-Site Request Forgery, where a form on site B.com can be submitted to A.com with the user's cookies and do damage.
That's right. Another way of putting this is that preflight requests were created so as to not increase the CSRF attack surface for non-CORS-aware servers.
But POST is listed as a method that doesn't require preflights. That can change state and delete data just like a DELETE!
That's true! CORS does not protect your site from CSRF attacks. Then again, without CORS you are also not protected from CSRF attacks. The purpose of preflight requests is just to limit your CSRF exposure to what already existed in the pre-CORS world.
Sigh. OK, I grudgingly accept the need for preflight requests. But why do we have to do it for every resource (URL) on the server? The server either handles CORS or it doesn't.
Are you sure about that? It's not uncommon for multiple servers to handle requests for a single domain. For example, it may be the case that requests to A.com/url1 are handled by one kind of server and requests to A.com/url2 are handled by a different kind of server. It's not generally the case that the server handling a single resource can make security guarantees about all resources on that domain.
Fine. Let's compromise. Let's create a new CORS header that allows the server to state exactly which resources it can speak for, so that additional preflight requests to those URLs can be avoided.
Good idea! In fact, the header Access-Control-Policy-Path was proposed for just this purpose. Ultimately, though, it was left out of the specification, apparently because some servers incorrectly implemented the URI specification in such a way that requests to paths that seemed safe to the browser would not in fact be safe on the broken servers.
Was this a prudent decision that prioritized security over performance, allowing browsers to immediately implement the CORS specification without putting existing servers at risk? Or was it shortsighted to doom the internet to wasted bandwidth and doubled latency just to accommodate bugs in a particular server at a particular time?
Opinions differ.
Well, at the very least browsers will cache the preflight for a single URL?
Yes. Though probably not for very long. In WebKit browsers the maximum preflight cache time is currently 10 minutes.
Sigh. Well, if I know that my servers are CORS-aware, and therefore don't need the protection offered by preflight requests, is there any way for me to avoid them?
Your only real option is to make sure that your requests use CORS-safe methods and headers. That might mean leaving out custom headers that you would otherwise include (like X-Requested-With), changing the Content-Type, or more.
Whatever you do, you must make sure that you have proper CSRF protections in place, since CORS will not block all unsafe requests. As the original specification puts it: "resources for which simple requests have significance other than retrieval must protect themselves from Cross-Site Request Forgery".
Consider the world of cross-domain requests before CORS. You could do a standard form POST, or use a script or an image tag to issue a GET request. You couldn't make any other request type other than GET/POST, and you couldn't issue any custom headers on these requests.
With the advent of CORS, the spec authors were faced with the challenge of introducing a new cross-domain mechanism without breaking the existing semantics of the web. They chose to do this by giving servers a way to opt-in to any new request type. This opt-in is the preflight request.
So GET/POST requests without any custom headers don't need a preflight, since these requests were already possible before CORS. But any request with custom headers, or PUT/DELETE requests, do need a preflight, since these are new to the CORS spec. If the server knows nothing about CORS, it will reply without any CORS-specific headers, and the actual request will not be made.
Without the preflight request, servers could begin seeing unexpected requests from browsers. This could lead to a security issue if the servers weren't prepared for these types of requests. The CORS preflight allows cross-domain requests to be introduced to the web in a safe manner.
CORS allows you to specify more headers and method types than was previously possible with cross-origin <img src> or <form action>.
Some servers could have been (poorly) protected with the assumption that a browser cannot make, e.g. cross-origin DELETE request or cross-origin request with X-Requested-With header, so such requests are "trusted".
To make sure that server really-really supports CORS and not just happens to respond to random requests, the preflight is executed.
I feel that the other answers aren't focusing on the reason pre-fight enhances security.
Scenarios:
1) With pre-flight. An attacker forges a request from site dummy-forums.com while the user is authenticated to safe-bank.com
If the Server does not check for the origin, and somehow has a flaw, the browser will issue a pre-flight request, OPTION method. The server knows none of that CORS that the browser is expecting as a response so the browser will not proceed (no harm whatsoever)
2) Without pre-flight. An attacker forges the request under the same scenario as above, the browser will issue the POST or PUT request right away, the server accepts it and might process it, this will potentially cause some harm.
If the attacker sends a request directly, cross origin, from some random host it's most likely one is thinking about a request with no authentication. That's a forged request, but not a xsrf one. so the server has will check credentials and fail.
CORS doesn't attempt to prevent an attacker who has the credentials to issue requests, although a whitelist could help reduce this vector of attack.
The pre-flight mechanism adds safety and consistency between clients and servers.
I don't know if this is worth the extra handshake for every request since caching is hardy use-able there, but that's how it works.
Here's another way of looking at it, using code:
<!-- hypothetical exploit on evil.com -->
<!-- Targeting banking-website.example.com, which authenticates with a cookie -->
<script>
jQuery.ajax({
method: "POST",
url: "https://banking-website.example.com",
data: JSON.stringify({
sendMoneyTo: "Dr Evil",
amount: 1000000
}),
contentType: "application/json",
dataType: "json"
});
</script>
Pre-CORS, the exploit attempt above would fail because it violates the same-origin policy. An API designed this way did not need XSRF protection, because it was protected by the browser's native security model. It was impossible for a pre-CORS browser to generate a cross-origin JSON POST.
Now CORS comes on the scene – if opting-in to CORS via pre-flight was not required, suddenly this site would have a huge vulnerability, through no fault of their own.
To explain why some requests are allowed to skip the pre-flight, this is answered by the spec:
A simple cross-origin request has been defined as congruent with those
which may be generated by currently deployed user agents that do not
conform to this specification.
To untangle that, GET is not pre-flighted because it is a "simple method" as defined by 7.1.5. (The headers must also be "simple" in order to avoid the pre-flight).
The justification for this is that "simple" cross-origin GET request could already be performed by e.g. <script src=""> (this is how JSONP works). Since any element with a src attribute can trigger a cross-origin GET, with no pre-flight, there would be no security benefit to requiring pre-fight on "simple" XHRs.
Additionally, for HTTP request methods that can cause side-effects on
user data (in particular, for HTTP methods other than GET, or for POST
usage with certain MIME types), the specification mandates that
browsers "preflight" the request
Source
In a browser supporting CORS, reading requests (like GET) are already protected by the same-origin policy: A malicious website trying to make an authenticated cross-domain request (for example to the victim's internet banking website or router's configuration interface) will not be able to read the returned data because the bank or the router doesn't set the Access-Control-Allow-Origin header.
However, with writing requests (like POST) the damage is done when the request arrives at the webserver.* A webserver could check the Origin header to determine if the request is legit, but this check is often not implemented because either the webserver has no need for CORS or the webserver is older than CORS and is therefore assuming that cross-domain POSTs are completely forbidden by the same-origin policy.
That is why webservers are given the chance to opt-in into receiving cross-domain write requests.
* Essentially the AJAX version of CSRF.
Aren't the preflighted requests about Performance? With the preflighted requests a client can quickly know if the operation is allowed before send a large amount of data, e.g., in JSON with PUT method. Or before travel sensitive data in authentication headers over the wire.
The fact of PUT, DELETE, and other methods, besides custom headers, aren't allowed by default(They need explicit permission with "Access-Control-Request-Methods" and "Access-Control-Request-Headers"), that sounds just like a double-check, because these operations could have more implications to the user data, instead GET requests.
So, it sounds like:
"I saw that you allow cross-site requests from http://foo.example, BUT are you SURE that you'll allow DELETE requests? Did you consider the impacts that these requests might cause in the user data?"
I didn't understand the cited correlation between the preflighted requests and the old servers benefits. A Web Service that was implemented before CORS, or without a CORS awareness, will never receive ANY cross-site request, because first their response won't have the "Access-Control-Allow-Origin" header.

Use of JSON-P with Sensitive Information

I have a secured website that requires a user to authenticate, and would like to return sensitive data to the client from my API via JSON-P so that I can get around ajax cross-domain issues. I own both the client and server, so I am not concerned about the security from the client perspective (i.e. reading malicious js from the server).
I have been researching ways to secure the JSON-P to prevent Cross-Site Request Forgery, but haven't been able to clearly determine whether checking the Referer is a foolproof method for securing the data. As I understand it, the Referer header cannot be spoofed in this situation because the calls would be from javascript, and Headers cannot be changed. Is this a correct assumption?
I would like some clear-cut examples of why or why not checking the Referer would/wouldn't work to secure JSON-P.
Thanks!
EDIT:
Just to clarify - the JSON-P is secured via Spring Security, so it wouldn't only be secured by the Referer header. I am mostly concerned here about session hijacking...
Jsonp urls can be called using normal curl code. Http refer can easily be forged.
I would like some clear-cut examples of why or why not checking the Referer would/wouldn't work to secure JSON-P.
Referer is not guaranteed to be sent, so:
if you require it to be present and match a trusted site, you will be breaking the app for everyone whose browser or network setup doesn't send it;
if you permit it to be absent to get around that, you open yourself to attack not just for those users, but for everyone where the attacker can induce Referer not to be sent (most notably, from HTTPS pages;
also, to behave properly with proxies you would have to no-cache all your responses (or Vary: Referer, but that won't work right in IE)
Referrer-checking is a weak and problematic method which sometimes sees use as a desperate last measure... it's not something you should build when you've got the choice. If you control both servers you can easily include a request token on one page that gets recognised by the script on the either.

Resources