As I simply understands HTTP2 is m:1 pattern where you put m logical connections into 1 TCP stream
Is it possible to do m:n pattern in http2?
m streams are demuxed into n connections for better reliability, because often one single TCP breaks all h2 hangs.
It would be possible, but in practice it is not done.
Browsers especially try hard to open just one connection to a domain, and even reuse the same connection for different subdomains if they can figure out that it resolves to the same IP address and same certificate.
Other clients may implement a m:n scheme (for example, Jetty 9.4.x HTTP/2 client does - disclaimer: I'm the maintainer).
The problem of choosing a good n may not be trivial, and it risks to go back to HTTP/1.1 6-8 TCP connections per domain.
Since each connection would be multiplexed anyway, the failure of a single HTTP/2 connection would be worse than the failure of a single HTTP/1.1 connection (because it would fail multiple requests rather than just one), so I guess that it would not make that much of a difference with respect to a single HTTP/2 connection.
Google's QUIC protocol aims at resolving this issue since it is based on UDP and has built-in in support for connection migration (i.e. switching from WiFi to mobile network).
Related
I am trying to clarify/understand whether websockets over HTTP/2 will also be multiplexed over a TCP connection using streams. Section 5 of RFC8441 seems to suggest it
After successfully processing the opening handshake, the peers should proceed with the WebSocket Protocol [RFC6455] using the HTTP/2 stream from the CONNECT transaction as if it were the TCP connection referred to in [RFC6455]. The state of the WebSocket connection at this point is OPEN, as defined by [RFC6455], Section 4.1.
The HTTP/2 stream closure is also analogous to the TCP connection closure of [RFC6455]. Orderly TCP-level closures are represented as END_STREAM flags ([RFC7540], Section 6.1). RST exceptions are represented with the RST_STREAM frame ([RFC7540], Section 6.4) with the CANCEL error code ([RFC7540], Section 7).
But my confusion arises from the fact that even with HTTP/1.1, while tabs in a browser share the underlying TCP connections (e.g. chrome makes 6 TCP connections) to the same host, creating a websocket to the same host in different tabs leads to distinct TCP connection in each tab.
I am not sure why the difference between the two & if it is likely to be the same for websockets over HTTP/2 as well.
Any experts out here who can clarify. Thanks.
But my confusion arises from the fact that even with HTTP/1.1, while tabs in a browser share the underlying TCP connections (e.g. chrome makes 6 TCP connections) to the same host, creating a websocket to the same host in different tabs leads to distinct TCP connection in each tab.
You are right that this is the current state of affairs, unfortunately, for HTTP/1.1.
RFC 8441, as you point out, has been specified to solve this problem and piggy back WebSocket "connections" over HTTP/2 streams, so it would be possible to open just one TCP connection to an origin server and use that connection for both HTTP/2 communication and for WebSocket communication.
The difference between HTTP/1.1 and HTTP/2 stems from the fact that HTTP/1.1 WebSocket connections cannot be (efficiently) pooled.
Every WebSocket connection is tied to a specific URI (e.g. ws://host/path1) and it's more typical for an application to open different WebSocket connections for different URIs (rather than many WebSocket connections for the same URI).
Because they cannot be pooled, browsers basically have to allow an unlimited number of them, a new one every time you call new WebSocket(...) from JavaScript.
With HTTP/2 instead, you will be able to open a new HTTP/2 stream inside the same HTTP/2 connection.
The number of concurrent streams depends on browser implementations, but it's typically around 100 if not more, which leaves plenty of concurrency for both HTTP/2 and WebSocket (unless the client application is really abusing WebSocket).
Fortunately, client applications won't need to be changed to leverage this feature.
When browsers and server will support it, your application will use less resources (just one TCP connection) rather than the many it's using now.
[Disclaimer, I'm the implementer of such feature in Jetty]
We have seen a few browsers implementing this feature and we are finalizing the implementation of this feature in the Jetty 10.0.x server, see https://github.com/eclipse/jetty.project/issues/3537.
creating a websocket to the same host in different tabs leads to distinct TCP connection in each tab
A WebSocket connection is always a new TCP connection, since it has to perform an HTTP/S request that Upgrades to a WebSocket connection and is therefore no longer an HTTP/S connection if successful. WebSocket connections are distinct and can't be shared or reused, unlike HTTP/S connections (assuming keep-alives are used).
As far as i know, TCP break down a message into segments. So, Why is multiplexing again on HTTP2? What are the benefits of multiplexing twice?
TCP isn’t multiplexed. TCP is just a guaranteed messaging stream (i.e. missing packets are re-requested and the TCP stream is basically temporarily blocked while this happens).
TCP, as a packet based protocol, can be used for multiplexed connections if the higher level application protocol (e.g. HTTP) allows sending of multiple messages. Unfortunately HTTP/1.1 does not allow this: once a HTTP/1.1 message is sent, no other message can be sent on that connection until that message is returned in full (ignoring the badly supported pipelining concept). This means HTTP/1.1 is basically synchronous and, if the full bandwidth is not used and other HTTP messages are queued, then it wastes any extra capacity that could be used on the underlying TCP connection.
To get around this more TCP connections can be opened, which basically allows HTTP/1.1 to act like a (limited) multiplexed protocol. If the network bandwidth was fully utilised then those extra connections would not add any benefit - it’s the fact there is capacity and that the other TCP connections are not being fully utilised that means this makes sense.
So HTTP/2 adds multiplexing to the protocol to allow a single TCP connection to be used for multiple in flight HTTP requests.
It does this by changing the text-based HTTP/1.1 protocol to a binary, packet-based protocol. These may look like TCP packets but that’s not really relevant (in the same way that saying TCP is similar to IP because it’s packet based is not relevant). Splitting messages into packets is really the only way of allowing multiple messages to be in flight at the same time.
HTTP/2 also adds the concept of streams so that packets can belong to different requests - TCP has no such concept - and this is what really makes HTTP/2 multiplexed.
In fact, because TCP doesn’t allow separate, independents streams (i.e. multiplexing), and because it is guaranteed, this actually introduces a new problem where a single dropped TCP packet holds up all the HTTP/2 streams on that connection, despite the fact that only one stream should really be affected and the other streams should be able to carry on despite this. This can even make HTTP/2 slower in certain conditions. Google is experimenting with moving away from TCP to QUIC to address this.
More details on what multiplexing means under HTTP/2 (and why it is a good improvement!) in my answer here: What does multiplexing mean in HTTP/2
TCP doesn't do multiplexing. The TCP segments just means that the (single) stream data is chopped up into pieces that can be sent in IP packets. Each TCP segment is only identified with a stream offset (sequence number), not with any useful way to identify separate streams. (We'll ignore the rarely-useful Urgent Pointer thing.)
So to do multiplexing, you need to put something on top of TCP. Which HTTP/2 does.
HTTP & HTTP/2 are both application level protocols that must utilize a lower level protocol like TCP to actually talk on the Internet. The protocol of the Internet is generally TCP over IP over Ethernet.
It looks like this:
As you can see HTTP is sitting above TCP. Below TCP is IP. One of the main protocols of the Internet. IP itself deals with packets which are switched/multiplexed. I think that's where you might be getting the idea that TCP is multiplexed, it's not. Think of a TCP connection as being like a single lane road tunnel where no one can pass. Lets say it has one single lane in each direction. This is what a TCP connection would look like. A tunnel where you put data in one end, and it comes out the other in the same order it went in. That is TCP. You can see there is no multiplexing on that. However, TCP does provides a reliable connection protocol for which other protocols may be built on top of like HTTP. And reliability is essential for HTTP.
HTTP 1.1 is simply a request response protocol. But as you know, it's not multiplexed. So only allow one outstanding request at a time and has to send the whole response to each request at a time. Previously the browsers got around that limitation by creating multiple TCP connections (tunnels) to the server with which to make more requests.
HTTP 2 actually splits the data up again and allows multiplexing over the one connection so that no further connections need to be created. It means the server can start servicing multiple requests and multiplex the responses so that the browser can start receiving images, pages and other resources at the same time, not one at a time.
Hope that makes it clear.
HTML5 websockets currently use a form of TCP communication. However, for real-time games, TCP just won't cut it (and is great reason to use some other platform, like native). As I probably need UDP to continue a project, I'd like to know if the specs for HTML6 or whatever will support UDP?
Also, are there any reliable benchmarks for WebSockets that would compare the WS protocol to a low-level, direct socket protocol?
On a LAN, you can get Round-trip times for messages over WebSocket of 200 microsec (from browser JS to WebSocket server and back), which is similar to raw ICMP pings. On MAN, it's around 10ms, WAN (over residential ADSL to server in same country) around 30ms, and so on up to around 120-200ms via 3.5G. The point is: WebSocket does add virtually no latency to the one you will get anyway, based on the network.
The wire level overhead of WebSocket (compared to raw TCP) is between 2 octets (unmasked payload of length < 126 octets) and 14 octets (masked payload of length > 64k) per message (the former numbers assume the message is not fragmented into multiple WebSocket frames). Very low.
For a more detailed analysis of WebSocket wire-level overhead, please see this blog post - this includes analysis covering layers beyond WebSocket also.
More so: with a WebSocket implementation capable of streaming processing, you can (after the initial WebSocket handshake), start a single WebSocket message and frame in each direction and then send up to 2^63 octets with no overhead at all. Essentially this renders WebSocket a fancy prelude for raw TCP. Caveat: intermediaries may fragment the traffic at their own decision. However, if you run WSS (that is secure WS = TLS), no intermediaries can interfere, and there you are: raw TCP, with a HTTP compatible prelude (WS handshake).
WebRTC uses RTP (= UDP based) for media transport but needs a signaling channel in addition (which can be WebSocket i.e.). RTP is optimized for loss-tolerant real-time media transport. "Real-time games" often means transferring not media, but things like player positions. WebSocket will work for that.
Note: WebRTC transport can be over RTP or secured when over SRTP. See "RTP profiles" here.
I would recommend developing your game using WebSockets on a local wired network and then moving to the WebRTC Data Channel API once it is available. As #oberstet correctly notes, WebSocket average latencies are basically equivalent to raw TCP or UDP, especially on a local network, so it should be fine for you development phase. The WebRTC Data Channel API is designed to be very similar to WebSockets (once the connection is established) so it should be fairly simple to integrate once it is widely available.
Your question implies that UDP is probably what you want for a low latency game and there is truth to that. You may be aware of this already since you are writing a game, but for those that aren't, here is a quick primer on TCP vs UDP for real-time games:
TCP is an in-order, reliable transport mechanism and UDP is best-effort. TCP will deliver all the data that is sent and in the order that it was sent. UDP packets are sent as they arrive, may be out of order, and may have gaps (on a congested network, UDP packets are dropped before TCP packets). TCP sounds like a big improvement, and it is for most types of network traffic, but those features come at a cost: a delayed or dropped packet causes all the following packets to be delayed as well (to guarantee in-order delivery).
Real-time games generally can't tolerate the type of delays that can result from TCP sockets so they use UDP for most of the game traffic and have mechanisms to deal with dropped and out-of-order data (e.g. adding sequence numbers to the payload data). It's not such a big deal if you miss one position update of the enemy player because a couple of milliseconds later you will receive another position update (and probably won't even notice). But if you don't get position updates for 500ms and then suddenly get them all out once, that results in terrible game play.
All that said, on a local wired network, packets are almost never delayed or dropped and so TCP is perfectly fine as an initial development target. Once the WebRTC Data Channel API is available then you might consider moving to that. The current proposal has configurable reliability based on retries or timers.
Here are some references:
WebRTC Introduction
WebRTC FAQ
WebRTC Data Channel Proposal
To make a long story short, if you want to use TCP for multiplayer games, you need to use what we call adaptive streaming techniques. In other words, you need to make sure that the amount of real-time data sent to synchronize the game world among the clients is governed by the currently available bandwidth and latency for each client.
Dynamic throttling, conflation, delta delivery, and other mechanisms are adaptive streaming techniques, which don't magically make TCP as efficient as UDP, but make it usable enough for several types of games.
I tried to explain these techniques in an article: Optimizing Multiplayer 3D Game Synchronization Over the Web (http://blog.lightstreamer.com/2013/10/optimizing-multiplayer-3d-game.html).
I also gave a talk on this topic last month at HTML5 Developer Conference in San Francisco. The video has just been made available on YouTube: http://www.youtube.com/watch?v=cSEx3mhsoHg
There's no UDP support for Websockets (there really should be), however you can apparently use WebRTC's RTCDataChannel API for UDP-like communication. There's a good article here:
http://www.html5rocks.com/en/tutorials/webrtc/datachannels/
RTCDataChannel actually uses SCTP which has configurable reliability and ordered delivery. You can get it to act like UDP by telling it to deliver messages unordered, and setting the maximum number of retransmits to 0.
I haven't tried any of this though.
I'd like to know if the specs for HTML6 or whatever will support UDP?
WebSockets won't. One of the benefits of WebSockets is that it piggybacks the existing HTTP connection. This means that to proxies and firewalls WebSockets looks like HTTP so they don't get blocked.
It's likely arbitrary UDP connections will never be part of any web specification because of security concerns. The closest thing to what you're after will likely come as part of WebRTC and it's associated JSEP protocol.
are there any reliable benchmarks ... that .. compare the WS protocol to a low-level, direct socket protocol?
Not that I'm aware of. I'm going to go out on a limb and predict WebSockets will be slower ;)
An XMMP server sends push notifications to a client behind a NAT using a public endpoint( IP + Port) supplied by NAT to client. But how long this endpoint is assigned to this specific client by NAT, what will happen if NAT assigns same endpoint to another client ? How this problem can be solved?
XMPP uses a standard TCP connection. NATs will keep the association for as long as the connection is alive (unless they are horribly broken).
Update: The last part of my statement could have been expanded a bit. Horribly broken NAT implementations do exist. Generally these are a small percentage, but many (most?) popular XMPP clients do ensure they send some kind of keepalive over idle connections.
There are three kinds of keepalive you can use I'll list them here in order of bandwidth/processing requirements:
TCP keepalives are a good lightweight option, especially as once they are enabled, they are automatically handled by the OS. How to enable them will depend on your language and framework, but at the lowest level, you need to enable the SO_KEEPALIVE option on the socket.
There are two problems with TCP keepalives. One is that you can't control them from your application (unless you write platform-specific code). The second problem is that some NAT implementations are so broken that they will ignore TCP keepalives too! But you're hopefully down to a very small percentage now.
So another option is whitespace keepalives. Since these involve data going across the stream, you should be safe from even the broken NATs that ignore keepalives.
Whitespace keepalives simply involve sending the space character (' ') across the XMPP stream at any time it is idle. XML and XMPP allow unlimited whitespace between elements, and it is simply ignored by the recipient.
Finally, you can use fully-fledged XMPP pings (XEP-0199). These involve ending an actual <iq/> 'get' stanza to the server, which then must reply. This ensures data flows in both directions, and should make even the most broken NAT implementations keep your connection alive.
Ok, I should mention that there is an even worse class of NAT. I have seen NATs that will simply 'forget' about your mapping for a range of reasons, including their mapping table being full, or just after a timer. There is nothing you can do to work around these, they don't work with any long-lived TCP connections. The best you could probably do at that point is use BOSH (essentially XMPP over HTTP).
Conclusion: If you are concerned that your application may run behind some of these devices, I suggest something like the following algorithm (exact times may be tweaked, but I recommend these as minimum values):
If you have not sent any data for 60s, send a single space character.
If you have not received any data for 120s, send an XMPP ping to your server.
If the server doesn't reply to the ping within a reasonable amount of time, reconnect.
Because the behaviour of broken NAT devices is beyond any standard protocol specification, it is naturally impossible to devise a perfect solution that will work with all of them, all of the time. You just have to accept that these are a small minority, and none of this matters for working NAT devices (though there are other kinds of network breakages that may make regular keepalives/pings a good idea, depending on the needs of your application).
The Solution is sending keep alive messages to maintain the NAT entry. XMPP whitespace is typically used. Send it eg every Ten minutes to preserve reachability of the nated client.
You have to keep in mind that NAT is no standardized technique. Thus there are different implementations. The provided RFCs in the comment above is from the BEHAVE working group.
In a recent series of question I have asked alot about UDP, boost::asio and c++ in general.
My latest question, which doesn't seem to have an answer here at Stackoverflow, is this:
In a client/server application, it is quite okay to require that the server open a port in any firewall, so that messages are allowed in. However, doing the same for clients is definately not a great user experience.
TCP-connections typically achieve this due to the fact that most routers support stateful packet inspection, allowing response packets through if the original request originated from the local host.
It is not quite clear to me how this would work with UDP, since UDP is stateless, and there is no such thing as "response packets" (to my knowledge). How should I account for this in my client application?
Thanks for any answers!
UDP itself is stateless, but the firewall typically is not. The convention on UDP is that if a request goes out from client:port_A to server:port_B, then the response will come back from server:port_B to client:port_A.
The firewall can take advantage of this. If it sees a UDP request go out from the client, it adds an entry to its state table that lets it recognise the response(s), to allow them in. Because UDP is stateless and has no indication of connection termination, the firewall will typically implement a timeout - if no traffic occurs between that UDP address pair for a certain amount of time, the association in the firewall's state table is removed.
So - to take advantage of this in your client application, simply ensure that your server sends responses back from the same port that it uses to receive the requests.