Does the connection get closed at any point during the WebSocket handshake or immediately after? - websocket

According to the Wikipedia article: http://en.wikipedia.org/wiki/WebSocket,
The server sends back this response to the client during handshake:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Sec-WebSocket-Protocol: chat
Does this close the connection (as HTTP responses usually do) or it is kept open throughout the entire handshake and it can start sending WebSocket frames straight away (assuming that it succeeds)?

An HTTP socket going through the handshake process to be upgraded to the webSocket protocol is not closed during that process. The same open socket goes through the whole process and then becomes the socket used for the webSocket protocol. As soon as the upgrade is complete, that very socket is ready for messages to be sent per the webSocket protocol.
It is this use of the exact same socket that enables a webSocket connection to run on the same port as an HTTP request (no extra port is needed) because it literally starts out as an HTTP request (with some extra headers attached) and then when those headers are recognized and both sides agree, the socket from that original HTTP request on the original web port (often port 80) is then switched to use the webSocket protocol. No additional connection on some new port is needed.
I actually find it a relatively elegant design because it makes for easy coexistence with a web server which was an important design parameter. And, a slight extra bit of connection overhead (protocol upgrade negotiation) is generally not an issue because webSocket connections by their very nature are designed to be long running sockets which you open once and use over an extended period of time so a little extra overhead to open them doesn't generally bother their use.
If, for any reason, the upgrade is not completed (both sides don't agree on the upgrade to webSocket), then the socket would remain an HTTP socket and would behave as HTTP sockets normally do (likely getting closed right away, but subject to normal HTTP interactions).
You can see this answer for more details on the back and forth during an upgrade to webSocket: SocketIO tries to connect using same port as the browser used to get web page

Related

How to keep long connection in HTTP2?

I am reading the documentation of Alexa Voice Service capabilities and came across the part on managing HTTP2 connection. I don't really understand how this down channel works behind the scenes. Is it using server push? Well, could server push be used to keep a long connection? Or is it just using some tricks to keep the connection alive for a very long time?
As stated on the documentation, the client needs to establish a down channel stream with the server.
Based on what I read here https://www.rfc-editor.org/rfc/rfc7540, From this state diagram:
once the stream sends a HEADER frame, followed by an END STREAM flag, the state will be half-closed(local) on the point of view of the client. So, this is how half-closed state for the device happened, as stated in above image. Correct me that if I am wrong.
For managing the HTTP connection, this is what it says.
Based on my understanding: the client sets a timeout of 60minutes for the GET request. After the request is sent, the server will not send any response. Then the connection will remain open for 60minutes. But once a response is sent from the server, the connection should be closed. Isn't that supposed to happen? Or, is it because when the server sends response through the down channel stream, it did not send an END STREAM flag so the stream will not be closed?
But once a response is sent from the server, the connection should be closed.
HTTP/1.1 and HTTP/2 use persistent connections, which means that a single connection can be used not just for one request/response, but for several request/response cycles.
Only HTTP/1.0 was closing the connection after the response, and so for HTTP/2 this is not the case, the connection will remain open until either peer decides to explicitly close it.
The recommendations about the idle timeouts are exactly to prevent the client to explicitly close the connection too early when it sees no network traffic, independently from requests or responses.

Will websockets over HTTP2 also be multiplexed in streams?

I am trying to clarify/understand whether websockets over HTTP/2 will also be multiplexed over a TCP connection using streams. Section 5 of RFC8441 seems to suggest it
After successfully processing the opening handshake, the peers should proceed with the WebSocket Protocol [RFC6455] using the HTTP/2 stream from the CONNECT transaction as if it were the TCP connection referred to in [RFC6455]. The state of the WebSocket connection at this point is OPEN, as defined by [RFC6455], Section 4.1.
The HTTP/2 stream closure is also analogous to the TCP connection closure of [RFC6455]. Orderly TCP-level closures are represented as END_STREAM flags ([RFC7540], Section 6.1). RST exceptions are represented with the RST_STREAM frame ([RFC7540], Section 6.4) with the CANCEL error code ([RFC7540], Section 7).
But my confusion arises from the fact that even with HTTP/1.1, while tabs in a browser share the underlying TCP connections (e.g. chrome makes 6 TCP connections) to the same host, creating a websocket to the same host in different tabs leads to distinct TCP connection in each tab.
I am not sure why the difference between the two & if it is likely to be the same for websockets over HTTP/2 as well.
Any experts out here who can clarify. Thanks.
But my confusion arises from the fact that even with HTTP/1.1, while tabs in a browser share the underlying TCP connections (e.g. chrome makes 6 TCP connections) to the same host, creating a websocket to the same host in different tabs leads to distinct TCP connection in each tab.
You are right that this is the current state of affairs, unfortunately, for HTTP/1.1.
RFC 8441, as you point out, has been specified to solve this problem and piggy back WebSocket "connections" over HTTP/2 streams, so it would be possible to open just one TCP connection to an origin server and use that connection for both HTTP/2 communication and for WebSocket communication.
The difference between HTTP/1.1 and HTTP/2 stems from the fact that HTTP/1.1 WebSocket connections cannot be (efficiently) pooled.
Every WebSocket connection is tied to a specific URI (e.g. ws://host/path1) and it's more typical for an application to open different WebSocket connections for different URIs (rather than many WebSocket connections for the same URI).
Because they cannot be pooled, browsers basically have to allow an unlimited number of them, a new one every time you call new WebSocket(...) from JavaScript.
With HTTP/2 instead, you will be able to open a new HTTP/2 stream inside the same HTTP/2 connection.
The number of concurrent streams depends on browser implementations, but it's typically around 100 if not more, which leaves plenty of concurrency for both HTTP/2 and WebSocket (unless the client application is really abusing WebSocket).
Fortunately, client applications won't need to be changed to leverage this feature.
When browsers and server will support it, your application will use less resources (just one TCP connection) rather than the many it's using now.
[Disclaimer, I'm the implementer of such feature in Jetty]
We have seen a few browsers implementing this feature and we are finalizing the implementation of this feature in the Jetty 10.0.x server, see https://github.com/eclipse/jetty.project/issues/3537.
creating a websocket to the same host in different tabs leads to distinct TCP connection in each tab
A WebSocket connection is always a new TCP connection, since it has to perform an HTTP/S request that Upgrades to a WebSocket connection and is therefore no longer an HTTP/S connection if successful. WebSocket connections are distinct and can't be shared or reused, unlike HTTP/S connections (assuming keep-alives are used).

How does browsers handle DNS lookup and TLS with WebSockets?

I'm currently considering the option of implementing websockets in my application. But before doing so, I want to make sure I understand correctly how it works and if it's gonna be worth it.
I understand the basics: Via WebSockets the handshake will be made only once via HTTP and then will talk to the server in order to switch to a lower level TCP layer, at that point, we have a full-duplex channel between the server and the client.
Currently I'm measuring the ajax requests made to my server (which are many), I've got this information:
The "DNS Lookup", "Initial connection" and "SSL" times are what I'm trying to eliminate (if possible)
For my understanding these times are part of the handshaking process and I'm assuming that using websockets it will happen only on the beginning (the handshake), but I'm not sure.
So my question is: Am I correct? Implementing WebSockets will ensure that the "DNS lookup" and the "Initial connection" steps happen only on the handshake?
Thanks in advance for your help, and sorry if my understanding is wrong.
I understand the basics: Via WebSockets the handshake will be made only once via HTTP and then will talk to the server in order to switch to a lower level TCP layer, at that point, we have a full-duplex channel between the server and the client.
It will not switch to a lower level TCP layer. Instead it will switch the protocol from plain HTTP (request, response) to a message based protocol - which like HTTP is on top of TCP at the application layer and not at a lower level. It's just a different protocol on the same level. It behaves a bit like TCP in that you can send and receive messages at any time without being restricted to request/response scheme of HTTP. But for example TCP is a data stream while WebSockets implements a message oriented protocol.
And, DNS is outside of WebSockets. DNS is only needed to lookup the IP address to establish the TCP connection which then gets used to do the initial HTTP handshake which is needed before the protocol switch to WebSockets.
The situation is similar with TLS. After the DNS lookup to get the IP address the TCP connection is established, then the TLS session on top of the TCP connection is established and then the initial HTTP handshake preceding the switch to WebSockets is done: i.e. HTTP inside a TLS tunnel inside a TCP connection - in other words HTTPS. The WebSocket protocol is then also spoken inside this TLS tunnel, similar how it is done with HTTP.
So my question is: Am I correct? Implementing WebSockets will ensure that the "DNS lookup" and the "Initial connection" steps happen only on the handshake?
Correct. At the beginning of each ws:// connection you might have a DNS lookup if the IP for the name is not already cached. You will have a TCP handshake and you will have a HTTP handshake which then leads to the protocol switch. This is true for every ws:// connection. And for wss:// there is additionally the creation of the TLS tunnel after the TCP connection was established and before the HTTP handshake gets started.

How do websockets work in detail?

There's a fantastic answer which goes into detail as to how REST apis work.
How do websockets work in the similar detail?
Websockets create and represent a standard for bi-directional communication between a server and client. This communication channel creates a TCP connection which is outside of HTTP and is run on a seperate server.
To start this process a handshake is performed between the server and client.
Here is the work flow
1) The user makes an HTTP request to the server with an upgrade header, indicating that the client wishes to establish a WebSocket connection.
2) If the server uses the WebSocket protocol, then it will accept the upgrade and send a response back.
3) With the handshake finished, the WebSocket protocol is used from now on. All communications will use the same underlying TCP port. The new returning status code, 101, signifies Switching Protocols.
As part of HTML5 it should work with most modern browsers.

ruby HttpClient library closing socket after response with persistent connection?

I'm using the HTTPClient gem (http://github.com/nahi/httpclient) for ruby, to post data to IIS 6.1. Even though both support HTTP 1.1 it seems to be closing the socket after each request made, rather than using persistent connections. I haven't added any flags to enable persistent connections (mainly because having poked about the source code it appears that they should be enabled by default).
The reason I think the socket is being close is that if I watch the requests in Wireshark once each request is made I see FIN/ACK TCP packets sent from the client to the server, then the same sent back the other way.
Am I misreading that or does that mean the socket is being closed?
Wikipedia's article on TCP suggests that the FIN/ACK packets are a signal to terminate the connection. Check which of the client or server initiated the sending of the FIN packet - that's the party requesting that the connection is closed.
As you saw in the source, an HTTP 1.1 implementation should assume that connections are persistent by default.
Is the client specifying HTTP 1.1 in its request and is the server responding accordingly?

Resources