Why does the WebSocket RFC allow control frames to be interleaved with fragmented messages - websocket

From the WebSocket RFC 6455,
it's possible for control frames to be interleaved with the frames of a fragmented message.
I don't understand the need for this, as it makes the design more complex for both the sending and the receiving side.
Currently, a control frame can be "Close", "Ping" or "Pong" (everything else is reserved).
If the control frame is "Close", then receiving the end of the fragmented message is useless, so no interleaving would be required (the fragmenting side could just send the "Close" opcode and stop sending any more fragments, since you are not supposed to send anything after a "Close").
If the control frame is "Ping" or "Pong", it does not make any sense either. The fragmenting side is sending data to the client, so why would it ping the client to check whether it's alive (it already has this information from the send system call)? Or reply to a ping immediately, while it's actually sending data to the client?
So, why do we need this mechanism (of interleaved control frame) at all ?

It is to detect half-open connections: http://blog.stephencleary.com/2009/05/detection-of-half-open-dropped.html
The other side could be sending you data but unable to receive yours. By being able to interleave pings and pongs, it is possible to check that the other end can at least understand your messages and reply to them.
It does not make it much more complex. You have to read delimited frames anyway, when you find a control frame, take action and continue reading more frames.
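A minimal sketch of such a receive loop, in Go. The frame type, the readFrame and sendPong callbacks, and the opcode constants are hypothetical stand-ins for whatever your WebSocket layer provides, not from any specific library:

```go
package ws

import "errors"

var errClosed = errors.New("connection closed by peer")

// Hypothetical parsed frame as produced by the framing layer.
type frame struct {
	fin     bool
	opcode  byte // 0x0 continuation, 0x1 text, 0x2 binary, 0x8 close, 0x9 ping, 0xA pong
	payload []byte
}

const (
	opClose = 0x8
	opPing  = 0x9
	opPong  = 0xA
)

// readMessage reassembles one (possibly fragmented) message, answering any
// Ping that arrives in the middle and ignoring Pongs.
func readMessage(readFrame func() (frame, error), sendPong func([]byte) error) ([]byte, error) {
	var message []byte
	for {
		f, err := readFrame()
		if err != nil {
			return nil, err
		}
		switch f.opcode {
		case opClose:
			return nil, errClosed
		case opPing:
			// A control frame in the middle of a fragmented message:
			// answer it with a Pong carrying the same payload, then
			// keep reassembling the data message.
			if err := sendPong(f.payload); err != nil {
				return nil, err
			}
		case opPong:
			// Liveness signal only; nothing to assemble.
		default:
			// Data or continuation frame: append and stop at FIN.
			message = append(message, f.payload...)
			if f.fin {
				return message, nil
			}
		}
	}
}
```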
http://www.whatwg.org/specs/web-apps/current-work/multipage/network.html#ping-and-pong-frames
Ping and Pong frames
The WebSocket protocol specification defines Ping and Pong frames that
can be used for keep-alive, heart-beats, network status probing,
latency instrumentation, and so forth. These are not currently exposed
in the API.
User agents may send ping and unsolicited pong frames as desired, for
example in an attempt to maintain local network NAT mappings, to
detect failed connections, or to display latency metrics to the user.
User agents must not use pings or unsolicited pongs to aid the server;
it is assumed that servers will solicit pongs whenever appropriate for
the server's needs.
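As the quoted text suggests, the typical server-side use is a heartbeat: ping periodically and treat a missing pong as a dead (half-open) connection. A rough sketch of that loop, where sendPing, lastPong and closeConn are hypothetical hooks into your WebSocket layer:

```go
package ws

import (
	"log"
	"time"
)

// heartbeat detects half-open connections via Ping/Pong. sendPing writes a
// Ping control frame, lastPong reports when the receive loop last saw a
// Pong, and closeConn tears the connection down.
func heartbeat(sendPing func() error, lastPong func() time.Time,
	closeConn func(), interval, timeout time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for range ticker.C {
		if time.Since(lastPong()) > timeout {
			// No Pong within the timeout: the peer is gone or the
			// connection is half-open, so give up on it.
			log.Println("heartbeat: no pong, closing connection")
			closeConn()
			return
		}
		if err := sendPing(); err != nil {
			closeConn()
			return
		}
	}
}
```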

Why can't http2 streams be reused?

According to RFC7540:
An HTTP request/response exchange fully consumes a single stream. A request starts with the HEADERS frame that puts the stream into an "open" state. The request ends with a frame bearing END_STREAM, which causes the stream to become "half-closed (local)" for the client and "half-closed (remote)" for the server. A response starts with a HEADERS frame and ends with a frame bearing END_STREAM, which places the stream in the "closed" state.
Knowing that a stream cannot be reopened once it's closed, this means that if I want to implement a long-lived connection where the client sends a stream of requests to the server, I will have to use a new stream for each request. But there is a finite number of streams available, so in theory, I could run out of streams and have to restart the connection.
Why did the writers of the specification design a request/response exchange to completely consume a stream? Wouldn't it have been easier to make a stream a single thread of exchanges, where multiple exchanges can be carried out serially on one stream?
The point of having many streams multiplexed in a single connection is to interleave them, so that if one cannot proceed, others can.
Reusing a stream for more than one request just means reusing its stream id. I don't see much benefit in reusing 4-byte integers -- on the contrary, the implementation would become quite a bit more complicated.
For example, the server can inform the client of the last stream that it processed when it's about to close a connection. If stream ids are reused, it would not be possible to report this reliably.
Also, imagine the case where the client sends requestA on stream5; this arrives on the server where its processing takes time; the client times out, sends a RST_STREAM for stream5 (to cancel requestA) and then requestB on stream5. While these are in-flight, the server finishes the processing of requestA and sends the response for requestA on stream5. Now the client reads a response, but it does not know if it is that of requestA or that of requestB.
But there is a finite number of streams available, so in theory, I could run out of streams and have to restart the connection.
That is correct. At 1 ms per exchange, it will take about 12 days to consume the stream ids for a single connection ((2^31-1)/1000/3600/24/2=12.4 days) -- remember that stream ids are incremented by 2 (clients only send odd stream ids).
While this is possible, I have never encountered this case in all the HTTP/2 deployments that I have seen -- typically the connection goes idle and gets closed well before consuming all stream ids.
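That back-of-the-envelope number is easy to check; a throwaway sketch of the arithmetic (nothing HTTP/2-specific is involved):

```go
package main

import "fmt"

func main() {
	// Clients use only odd stream ids, so a connection has roughly
	// (2^31 - 1) / 2 client-initiated streams available.
	streams := float64(1<<31-1) / 2
	exchangesPerSecond := 1000.0 // one exchange per millisecond
	seconds := streams / exchangesPerSecond
	fmt.Printf("%.1f days\n", seconds/3600/24) // ≈ 12.4 days
}
```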
The specification preferred simplicity and stable features over the ability to reuse stream ids.
Also, bear in mind that HTTP/2 was designed mostly with the web in mind, where browsers make a number of requests to download a web page and its resources, but then stay idle for a while.
The case where an HTTP/2 connection is bombarded with non-stop requests is definitely possible, but much rarer, and as such it has probably not been deemed important enough in the design -- using 8 bytes for stream ids seems overkill, a cost that would be paid for every request even though the 4-byte limit is practically never reached.

How to create two UDP sockets where one is sending requests and the other one is receiving the answers?

I'm looking for a proper way to have one goroutine sending out request packets to specific servers while a second goroutine receives the responses and handles them, maybe even spawning a new goroutine to handle each response.
The architecture of the game is that there are multiple masterservers, which can be asked for ip lists of registered servers.
After getting the ips and ports from the masterservers, each of the ips gets a request for its data, like server name, map, players, etc.
Also, are there better ways to handle this?
Currently I am creating a goroutine per request that also waits for a response afterwards.
The wait for a response times out after 35 ms, after which 1.2 times the previous number of request packets is sent, as a small burst of requests. The timeout is also doubled on every retry.
I'd like to know if there are better strategies that have proven to be more robust and have a lower latency, that are not too complex.
Edit:
I only create the client-side sockets, but, if there is no better approach, I would have a client that sends UDP request packets carrying a different socket's address as the sender value, so that the answers are received on a different socket that acts somewhat like a server, where all the response packets are collected. The point is to separate the sending socket from the receiving socket.
This question is tagged as client-server because one of the sockets is supposed to act like a server, even though all it does is receive the expected answers to the request packets sent by the client socket.
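For what it's worth, a minimal sketch of one way to lay this out in Go, using a single UDP socket shared between a sending and a receiving goroutine (net.UDPConn is safe for concurrent use, and servers reply to the packet's source address, so a second socket is not strictly required; the addresses and wire format below are placeholders):

```go
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	// One local UDP socket with an OS-chosen port; servers reply to the
	// packet's source address, so answers come back to this same socket.
	conn, err := net.ListenUDP("udp", nil)
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Receiver goroutine: collects every response, whichever request it
	// belongs to, and could spawn a handler goroutine per packet.
	go func() {
		buf := make([]byte, 2048)
		for {
			n, from, err := conn.ReadFromUDP(buf)
			if err != nil {
				return // socket closed
			}
			fmt.Printf("got %d bytes from %v\n", n, from)
		}
	}()

	// Sender: one request per (placeholder) server address.
	for _, s := range []string{"203.0.113.1:8303", "203.0.113.2:8303"} {
		addr, err := net.ResolveUDPAddr("udp", s)
		if err != nil {
			continue
		}
		if _, err := conn.WriteToUDP([]byte("info request"), addr); err != nil {
			fmt.Println("send failed:", err)
		}
	}

	// Give the receiver a moment in this sketch; real code would track
	// per-request timeouts and retries instead of sleeping.
	time.Sleep(2 * time.Second)
}
```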

Websocket questions: framing, masking

Couple questions about websockets protocol sending BINARY data:
Why is the payload masked? Doesn't TCP guarantee data integrity?
What exactly is fragmentation? does it mean that, if I send a single frame of 1000 byte payload, the other end (due to intermediate proxies) may receive four separate frames of 200, 300, 270, and 230 bytes each (with only the final frame having the FIN bit set?)
The payload sent from client to server (not server to client) is masked neither for reasons of data integrity nor authenticity, but to prevent rogue scripts from confusing (and potentially attacking) old intermediaries (Web proxies and the like).
Any WebSocket client that conforms to RFC6455 MUST mask client-to-server frames. Nevertheless, some libraries allow you to turn off masking for clients, and to turn off failing on non-masked client frames (e.g. AutobahnPython).
The latter can be useful to eliminate the CPU overhead associated with masking. It may be acceptable when both endpoints are under your control and either the route between them is fully under your control (e.g. talking WebSocket over loopback, Unix domain sockets, or a LAN) or you are using TLS, in which case (in most situations) no intermediary will be able to look inside the traffic anyway.
Fragmentation works like this: a WebSocket message may be split into multiple WebSocket frames, and frames may also be coalesced at any time, not only by the sender but also by any intermediaries on the way to the receiver. And yes, only the last WebSocket frame of the sequence of frames for a given message will have the FIN bit set.
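For reference, the masking transform itself is just an XOR with a 4-byte key that cycles over the payload; a small sketch in Go (the key and the expected bytes are taken from the "Hello" example in RFC 6455):

```go
package main

import "fmt"

// maskPayload applies the RFC 6455 client-to-server masking transform:
// each payload octet i is XORed with masking-key octet i mod 4.
// The same function unmasks, since XOR is its own inverse.
func maskPayload(payload []byte, key [4]byte) []byte {
	out := make([]byte, len(payload))
	for i, b := range payload {
		out[i] = b ^ key[i%4]
	}
	return out
}

func main() {
	key := [4]byte{0x37, 0xfa, 0x21, 0x3d} // masking key from the RFC 6455 example
	masked := maskPayload([]byte("Hello"), key)
	fmt.Printf("% x\n", masked)                   // 7f 9f 4d 51 58
	fmt.Println(string(maskPayload(masked, key))) // Hello
}
```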

How much data is transferred during the first HTTP request

I would like to have a lightning-fast website focused on mobile. Therefore I would like to inline as many graphics, styles and scripts as possible and use only one or two fast HTTP requests to display the first part of the page.
My question is how much I can inline, i.e. how big my document may get before its delivery gets divided across additional round trips.
As far as I know, HTTP uses TCP to send the IP packets, and TCP has a window that limits how far apart the last sent and the highest acknowledged packet may be, and it scales this window.
But how much payload can be transported before the server has to wait for an ACK from my client in the worst case (first window sent, no ACKs received yet)? And what does it depend on: the browser, the OS, the device?
But how much payload can be transported before the server has to wait for an ACK from my client in the worst case (first window sent, no ACKs received yet)? And what does it depend on: the browser, the OS, the device?
It depends on the size of the socket receive buffer in the receiver.
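If it helps to see the knob in code, here is a minimal, purely illustrative sketch in Go of adjusting that receive buffer on the receiving side; the 256 KiB value is arbitrary and the kernel may round or clamp it:

```go
package main

import (
	"fmt"
	"net"
)

// The window a TCP receiver can advertise is bounded by its socket receive
// buffer (SO_RCVBUF); in Go it can be influenced with SetReadBuffer.
func main() {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer ln.Close()

	// Connect to ourselves so the example is self-contained.
	go net.Dial("tcp", ln.Addr().String())

	conn, err := ln.Accept()
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	if err := conn.(*net.TCPConn).SetReadBuffer(256 * 1024); err != nil {
		fmt.Println("SetReadBuffer:", err)
	} else {
		fmt.Println("receive buffer of 256 KiB requested for this connection")
	}
}
```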

boost::asio sending data faster than receiving over TCP. Or how to disable buffering

I have created a client/server program, the client starts
an instance of Writer class and the server starts an instance of
Reader class. The Writer will then write DATA_SIZE bytes of data
asynchronously to the Reader every USLEEP milliseconds.
Every successive async_write request by the Writer is done
only if the "on write" handler from the previous request had
been called.
The problem is, if the Writer (client) is writing more data into the
socket than the Reader (server) is capable of receiving, this seems
to be the behaviour:
The Writer will start writing into (I think) the system buffer and even
though the data has not yet been received by the Reader it will be
calling the "on write" handler without an error.
When the buffer is full, boost::asio won't fire the "on write"
handler anymore, until the buffer gets smaller.
In the meanwhile, the Reader is still receiving small chunks
of data.
The fact that the Reader keeps receiving bytes after I close
the Writer program seems to prove this theory correct.
What I need to achieve is to prevent this buffering, because the
data needs to be "real time" (as much as possible).
I'm guessing I need to use some combination of the socket options that
asio offers, like the no_delay or send_buffer_size, but I'm just guessing
here as I haven't had success experimenting with these.
I think the first solution one thinks of is to use
UDP instead of TCP. That will happen anyway, as I'll need to switch to
UDP for other reasons in the near future, but I would
first like to find out how to do it with TCP, just for the sake
of having it straight in my head in case I run into a similar
problem some other day in the future.
NOTE1: Before I started experimenting with asynchronous operations in asio library I had implemented this same scenario using threads, locks and asio::sockets and did not experience such buffering at that time. I had to switch to the asynchronous API because asio does not seem to allow timed interruptions of synchronous calls.
NOTE2: Here is a working example that demonstrates the problem: http://pastie.org/3122025
EDIT: I've done one more test. In NOTE1 I mentioned that when I was using asio::iosockets I did not experience this buffering. So I wanted to be sure and created this test: http://pastie.org/3125452 It turns out that the buffering is there even with asio::iosockets, so there must have been something else that made it go smoothly, possibly a lower FPS.
TCP/IP is definitely geared towards maximizing throughput, as the intention of most network applications is to transfer data between hosts. In such scenarios it is expected that a transfer of N bytes will take T seconds, and clearly it doesn't matter if the receiver is a little slow to process the data. In fact, as you noticed, the TCP/IP protocol implements a sliding window which allows the sender to buffer some data so that it is always ready to be sent, but leaves the ultimate throttling control up to the receiver. The receiver can go full speed, pace itself or even pause the transmission.
If you don't need throughput and instead want to guarantee that the data your sender is transmitting is as close to real time as possible, then what you need is to make sure the sender doesn't write the next packet until it receives an acknowledgement from the receiver that the previous data packet has been processed. So instead of blindly sending packet after packet until you are blocked, define a message structure for control messages to be sent from the receiver back to the sender.
Obviously with this approach your trade-off is that each sent packet is closer to the sender's real time, but you are limiting how much data you can transfer while slightly increasing the total bandwidth used by your protocol (i.e. the additional control messages). Also keep in mind that "close to real time" is relative, because you will still face delays in the network as well as in the receiver's ability to process data. So you might also take a look at the design constraints of your specific application to determine how "close" you really need to be.
If you need to be very close, but at the same time you don't care if packets are lost because old packet data is superseded by new data, then UDP/IP might be a better alternative. However, a) if you have reliable delivery requirements, you might end up reinventing a portion of TCP/IP's wheel, b) keep in mind that certain networks (corporate firewalls) tend to block UDP/IP while allowing TCP/IP traffic, and c) even UDP/IP won't be exactly real time.
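To make the acknowledgement idea concrete, here is a minimal sketch in Go rather than asio (the 4-byte length prefix and the single ACK byte are invented for the example): the sender writes one message and then blocks until the receiver has confirmed processing it, so the sender's pace is set by the receiver instead of by the kernel socket buffers.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"net"
	"time"
)

const ackByte = 0x06 // invented ACK marker for this example

// sendWithAck writes one length-prefixed message, then blocks until the
// receiver acknowledges it with a single ACK byte.
func sendWithAck(conn net.Conn, payload []byte) error {
	var lenPrefix [4]byte
	binary.BigEndian.PutUint32(lenPrefix[:], uint32(len(payload)))
	if _, err := conn.Write(lenPrefix[:]); err != nil {
		return err
	}
	if _, err := conn.Write(payload); err != nil {
		return err
	}
	// Do not send anything else until the receiver says it has processed
	// this message; this is what keeps the data "real time" instead of
	// piling up in kernel buffers.
	var ack [1]byte
	if _, err := io.ReadFull(conn, ack[:]); err != nil {
		return err
	}
	if ack[0] != ackByte {
		return fmt.Errorf("unexpected ack byte %#x", ack[0])
	}
	return nil
}

func main() {
	// Tiny in-process demo: the receiver reads each message, "processes"
	// it, then acknowledges; the sender's loop is paced by those ACKs.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		for {
			var lenPrefix [4]byte
			if _, err := io.ReadFull(conn, lenPrefix[:]); err != nil {
				return
			}
			msg := make([]byte, binary.BigEndian.Uint32(lenPrefix[:]))
			if _, err := io.ReadFull(conn, msg); err != nil {
				return
			}
			time.Sleep(50 * time.Millisecond) // simulate slow processing
			conn.Write([]byte{ackByte})
		}
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	for i := 0; i < 3; i++ {
		if err := sendWithAck(conn, []byte(fmt.Sprintf("sample %d", i))); err != nil {
			panic(err)
		}
		fmt.Println("message", i, "acknowledged")
	}
}
```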
