What does "258EAFA5-E914-47DA-95CA-C5AB0DC85B11" mean in the WebSocket protocol?

I don't understand the meaning of "258EAFA5-E914-47DA-95CA-C5AB0DC85B11" in RFC 6455.
Why does the server need this magic string?
And why does the WebSocket protocol need this mechanism?

The RFC explains it. It is a GUID, selected because it is "unlikely to be used by network endpoints that do not understand the WebSocket Protocol". See RFC 6455.
If you're interested in the specifics of the format for GUIDs like that, see RFC 4122.

From the Quora answer:
There is no reason for choosing the magic string. The particular magic string GUID was chosen to add some level of integrity to the WebSockets protocol, because the string is globally unique.
The RFC (RFC 6455 - The WebSocket Protocol) only says:
...concatenate this with the Globally Unique Identifier (GUID,
[RFC4122]) "258EAFA5-E914-47DA-95CA-C5AB0DC85B11" in string form,
which is unlikely to be used by network endpoints that do not
understand the WebSocket Protocol.
Hope that answers your question.

Why does the WebSocket protocol need this mechanism?
A WebSocket connection is requested by a browser, simply with the code below:
new WebSocket("wss://echo.websocket.org")
In the debugger we can see a GET request answered with status 101 (Switching Protocols), and by inspecting the request headers we can see this particular entry:
Sec-WebSocket-Key: qcq+klmT4W41IrmG3/fseA==
This is a random 16-byte nonce, base64-encoded, that the browser generates for each connection; it does not identify the browser.
On the server side, the key is received as $client_key. The server appends the GUID, takes the SHA-1 of the result, and base64-encodes the raw digest. In PHP, the response header looks like this:
"Sec-WebSocket-Accept: " . base64_encode(sha1($client_key . "258EAFA5-E914-47DA-95CA-C5AB0DC85B11", true))
The browser gets back the response (example below). The value is the base64-encoded SHA-1 of the sent key concatenated with the WebSocket GUID 258EAFA5-E914-47DA-95CA-C5AB0DC85B11.
Sec-WebSocket-Accept: r1Km05q03xuNRYy7mxkCRRgbh2M=
The browser then checks whether the value matches its own calculation, done under the hood. If so, the handshake is complete: the remote server has shown that it actually understands the WebSocket protocol, and the connection is established and kept open.
https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API/Writing_WebSocket_servers
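The same calculation can be sketched in Python using only the standard library. The function name websocket_accept is made up for illustration; the sample key and expected value are the ones given in RFC 6455, section 1.3.

```python
import base64
import hashlib

# The fixed GUID defined by RFC 6455.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(client_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a Sec-WebSocket-Key."""
    # SHA-1 over key + GUID, then base64 of the raw 20-byte digest.
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455, section 1.3.
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A plain HTTP server that does not know the GUID cannot produce this value by accident, which is all the mechanism is meant to show.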

Related

What is the HTTP/1 equivalent of the HTTP/2 `:scheme` header?

I'm writing a proxy from HTTP/2 to HTTP/1 and vice-versa.
When I have an incoming HTTP/2 request, which defines :scheme, what header should I map that to for my proxied HTTP/1 request?
The closest thing I can find is https://www.rfc-editor.org/rfc/rfc7239#section-5.4
Mapping the HTTP/2 :scheme pseudo header to HTTP/1.1 X-Forwarded-Proto header would be correct.
You basically shouldn't map it.
For a start, HTTP/1.1 has no direct equivalent to the :scheme pseudo-header. The request target is normally a relative path (e.g. /path/page/) rather than an absolute URI (e.g. https://www.example.com/path/page/), and the Host header contains just the server name, not the scheme.
So the connection knows whether it is HTTP or HTTPS, and this is exposed to web servers and the like (e.g. in the REQUEST_SCHEME variable for Apache), but the HTTP/1.1 message itself doesn't carry it.
If you are acting as an intercepting proxy, taking one HTTP/2 connection and forwarding requests to another, then you should open an HTTP or HTTPS connection for that second connection as you see fit, depending on what the downstream system supports.
As sbordet points out, if you want to make the downstream system aware of the original scheme, you can use the X-Forwarded-Proto header (technically obsoleted but still used) or the Forwarded header, but that is more for informational purposes than a direct mapping of what was in the original request. The scheme is related to the current request.
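A minimal sketch of that translation, assuming the HTTP/2 request has already been parsed into a dict of pseudo-headers and a dict of regular headers. The helper name h2_to_h1 and the dict layout are hypothetical, not part of any HTTP library.

```python
def h2_to_h1(pseudo: dict, headers: dict) -> str:
    """Build an HTTP/1.1 request head from HTTP/2 pseudo-headers (hypothetical helper)."""
    lines = [
        # :method and :path go into the start line, :authority becomes Host.
        f"{pseudo[':method']} {pseudo[':path']} HTTP/1.1",
        f"Host: {pseudo[':authority']}",
    ]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # :scheme has no HTTP/1.1 counterpart; it is passed along as information only.
    lines.append(f"Forwarded: proto={pseudo[':scheme']}")
    lines.append(f"X-Forwarded-Proto: {pseudo[':scheme']}")
    return "\r\n".join(lines) + "\r\n\r\n"


head = h2_to_h1(
    {":method": "GET", ":path": "/path/page/",
     ":authority": "www.example.com", ":scheme": "https"},
    {"accept": "*/*"},
)
print(head)
```

Whether the outgoing connection itself uses TLS is decided separately by the proxy; the two forwarding headers only report what the client used.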
According to RFC 7540, section 8.1.2:
While HTTP/1.x used the message start-line (see [RFC7230],
Section 3.1) to convey the target URI, the method of the request, and
the status code for the response, HTTP/2 uses special pseudo-header
fields beginning with ':' character (ASCII 0x3a) for this purpose.
And:
The ":scheme" pseudo-header field includes the scheme portion of
the target URI ([RFC3986], Section 3.1).
":scheme" is not restricted to "http" and "https" schemed URIs. A
proxy or gateway can translate requests for non-HTTP schemes,
enabling the use of HTTP to interact with non-HTTP services.
So, if you're proxying HTTP, it should be "http" and if you're proxying HTTPS, it should be "https".
Reading again, I can see that I may have had the sense of the question the wrong way around (I was thinking HTTP/1 client, HTTP/2 server). But the above two quotes are still the relevant ones. You don't put :scheme in an HTTP/1 header; it forms part of the URI that you place in the message start line.

Is it possible to specify HTTP headers in the URL?

I've got a (Spring) handler that I'd like users to be able to bookmark. As it's coded now, they get different formats (CSV, JSON) back based on the Accept header.
Would there be any way for users to specify the URL so that they can say what header they want? Or am I going to have to give URL-level parameter for the different formats?
Would there be any way for users to specify the URL so that they can say what header they want?
No, there is no way to do that magically.
Or am I going to have to give URL-level parameter for the different formats?
Yes, this is valid.
This is quoted from xml.com:
Server-driven negotiation. The service provider determines the right representation from prior knowledge of its clients or uses the information provided in HTTP headers like Accept, Accept-Charset, Accept-Encoding, Accept-Language, and User-Agent. The drawback of this approach is that the server may not have the best knowledge about what a client really wants.
Client-driven negotiation. A client initiates a request to a server. The server returns a list of available representations. The client then selects the representation it wants and sends a second request to the server. The drawback is that a client needs to send two requests.
Proxy-driven negotiation. A client initiates a request to a server through a proxy. The proxy passes the request to the server and obtains a list of representations. The proxy selects one representation according to preferences set by the client and returns the representation back to the client.
URI-specified representation. A client specifies the representation it wants in the URI query string.
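As a rough sketch of the URI-specified approach, the handler can look for a query parameter first and fall back to the Accept header. The parameter name format, the function name pick_media_type, and the JSON default are assumptions for illustration; the Accept matching is deliberately naive and ignores q-values.

```python
from urllib.parse import parse_qs, urlparse

# Formats the hypothetical handler can produce.
FORMATS = {"csv": "text/csv", "json": "application/json"}


def pick_media_type(url: str, accept: str = "") -> str:
    """Choose a response media type from the URL, then from the Accept header."""
    query = parse_qs(urlparse(url).query)
    requested = query.get("format", [""])[0].lower()
    if requested in FORMATS:
        # A bookmarkable URL such as /report?format=csv wins over the header.
        return FORMATS[requested]
    for media_type in FORMATS.values():
        if media_type in accept:
            return media_type
    return FORMATS["json"]  # assumed default


print(pick_media_type("https://example.com/report?format=csv", "application/json"))
```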

How to prevent others from tampering with the response message in HTTPS?

In HTTPS, only the server holds the private key and is able to decode the message.
My doubt is whether the server will encode the response before sending it to the client.
If so, how does the client decode it, since it does not have the private key?
If not, how does it prevent others from tampering with the response message?
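I think I can answer my own question. The server does encrypt the response, but not with the public or private key directly: during the TLS handshake, the client and server agree on shared symmetric session keys, and the server's private key is used only during that handshake. Both sides then encrypt and decrypt with the session keys. In addition, every record carries a keyed checksum (a message authentication code), which acts like a signature. It is computed with the shared keys, which only the two endpoints know, so it is hard for others to fabricate. Thus, if anyone tampers with the message, it won't match the checksum.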
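The tamper-detection idea can be illustrated with a keyed checksum (HMAC) from the Python standard library. This is only a sketch of the principle, not how a TLS stack is implemented: the key below is a made-up constant standing in for a secret that both endpoints share, and the names tag and verify are hypothetical.

```python
import hashlib
import hmac


def tag(key: bytes, message: bytes) -> bytes:
    """Compute a keyed checksum over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Recompute the checksum and compare it in constant time."""
    return hmac.compare_digest(tag(key, message), received_tag)


shared_key = b"example-session-key"  # assumption: known to both endpoints only
response = b"balance=100"
mac = tag(shared_key, response)

print(verify(shared_key, response, mac))        # True
print(verify(shared_key, b"balance=999", mac))  # False
```

Someone who alters the message in transit cannot produce a matching checksum without the key, so the receiver rejects it.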

WebSocket and the Origin header field

The following is quoted from RFC6455 - WebSocket protocol.
Servers that are not intended to process input from any web page but
only for certain sites SHOULD verify the |Origin| field is an origin
they expect. If the origin indicated is unacceptable to the server,
then it SHOULD respond to the WebSocket handshake with a reply
containing HTTP 403 Forbidden status code.
The |Origin| header field protects from the attack cases when the
untrusted party is typically the author of a JavaScript application
that is executing in the context of the trusted client. The client
itself can contact the server and, via the mechanism of the |Origin|
header field, determine whether to extend those communication
privileges to the JavaScript application. The intent is not to prevent
non-browsers from establishing connections but rather to ensure that
trusted browsers under the control of potentially malicious JavaScript
cannot fake a WebSocket handshake.
I just cannot be sure about what the 2nd paragraph means, especially the italic part. Could anyone explain it a bit? Or maybe an example.
My understanding so far is like this:
If the server CAN be sure that requests DO come from web pages, the Origin header can be used to prevent access from unwelcome web pages.
If the server CANNOT be sure that requests come from web pages, the Origin header is merely advisory.
Your understanding seems to be correct, but...
I would rephrase it: you can be sure that a JavaScript client will send a proper Origin header. You don't know what will be sent by other clients (or whether the value is correct).
This should prevent other pages from connecting to "your" WebSocket endpoints (which is a big deal; imagine injected JavaScript somewhere on jsfiddle or some frequently visited page), but if you need to make sure that no other client will be able to connect, you'll need to introduce some other security measures.
I believe this is meant only as prevention of browser based "data stealing" or "DDoSing", nothing else; you can still do that by using some other client.

Ensure WebSockets only connecting from known domain

How can I make sure only a script hosted on a specific list of domains is allowed to connect to my WebSocket application?
Or to prevent opinion based closevotes, is there a state-of-the-art or native way?
I do not intend to implement user authentication.
The mechanism for this with WebSocket is the origin header.
This HTTP header is set by browsers to the origin (scheme, host, and port) of the page that served the HTML containing the JavaScript which opened the WebSocket connection.
A WebSocket server can inspect the Origin header during the initial opening handshake of the WebSocket protocol. The server can then allow the connection to proceed only if the origin matches a known whitelist.
The header cannot be modified from JavaScript, and all browsers are required by the RFC6455 specification to include it.
Caution: a non-browser WebSocket client can of course fake the origin header to any value it likes.
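A minimal sketch of such a check, assuming the handshake headers are available as a dict keyed by their canonical names. The allowlist entries and the function name handshake_status are placeholders; a real WebSocket library will expose its own hook for this.

```python
# Hypothetical allowlist of origins that may open a connection.
ALLOWED_ORIGINS = {"https://www.example.com", "https://app.example.com"}


def handshake_status(headers: dict) -> int:
    """Return 101 to continue the handshake or 403 to refuse it."""
    origin = headers.get("Origin")
    # A missing Origin header is treated the same as an unknown origin.
    return 101 if origin in ALLOWED_ORIGINS else 403


print(handshake_status({"Origin": "https://www.example.com"}))   # 101
print(handshake_status({"Origin": "https://evil.example.net"}))  # 403
```

This only constrains browsers; as noted above, a non-browser client can send any Origin value it likes.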
@oberstet gave you the right answer.
If you are worried about bots or programmatic HTTP agents, then you are going to have a bad time. Everything in an HTTP request can be spoofed. Your only option is to use cookies to attach a token with limited time validity that certifies the user went through an allowed website to get that script. Read that cookie in the WebSocket handshake and decide whether to allow the connection.
E.g.: when a user visits your site, or one of your sites, return a cookie with a symmetrically encrypted token based on the user's IP address, User-Agent header, and Origin header. When the user initiates a WebSocket connection to a server in the same second-level domain, the browser will send the cookie; if the data adds up, allow the connection, otherwise reject it. If the WebSocket server is in another domain, you will have to forget about cookies and rely on a WebSocket message sent once the connection is established to check the validity of the connection.
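Here is one way the token idea might look, using only the Python standard library. Note the swap: the standard library has no symmetric cipher, so this sketch signs the data with HMAC instead of encrypting it, which is enough to detect forgery but does not hide the expiry time. SECRET, issue_token, and check_token are hypothetical names, and the now parameter exists only to make the example deterministic.

```python
import hashlib
import hmac
import time

SECRET = b"server-side-secret"  # assumption: known only to your servers


def _sign(expires: int, ip: str, user_agent: str, origin: str) -> str:
    payload = f"{expires}|{ip}|{user_agent}|{origin}".encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def issue_token(ip, user_agent, origin, ttl=300, now=None) -> str:
    """Create a cookie value that is valid for ttl seconds."""
    expires = int((time.time() if now is None else now) + ttl)
    return f"{expires}.{_sign(expires, ip, user_agent, origin)}"


def check_token(token, ip, user_agent, origin, now=None) -> bool:
    """Validate the cookie value seen during the WebSocket handshake."""
    try:
        expires_text, signature = token.split(".", 1)
        expires = int(expires_text)
    except ValueError:
        return False  # malformed token
    expected = _sign(expires, ip, user_agent, origin)
    current = time.time() if now is None else now
    return current < expires and hmac.compare_digest(
        expected.encode(), signature.encode()
    )


token = issue_token("203.0.113.7", "ExampleBrowser/1.0", "https://www.example.com", now=1000)
print(check_token(token, "203.0.113.7", "ExampleBrowser/1.0", "https://www.example.com", now=1100))
```

The server would set the token as a cookie when serving the page and call check_token with the values from the handshake request.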

Resources