Is WebSocket message oriented? - websocket

I am checking out the behavior of WebSocket.
Is WebSocket message oriented, unlike a TCP stream?
For example, when I send data ABC, DEF, GHI, is it guaranteed that I receive ABC, DEF, GHI? With a TCP stream this is not guaranteed: we may receive AB, DEFG, HI.

Yes, it is message-oriented (well, actually frame-oriented).
Per RFC 6455:
After a successful handshake, clients and servers transfer data back
and forth in conceptual units referred to in this specification as
"messages". On the wire, a message is composed of one or more
frames. The WebSocket message does not necessarily correspond to a
particular network layer framing, as a fragmented message may be
coalesced or split by an intermediary.
...
The WebSocket Protocol is designed on the principle that there should
be minimal framing (the only framing that exists is to make the
protocol frame-based instead of stream-based and to support a
distinction between Unicode text and binary frames). It is expected
that metadata would be layered on top of WebSocket by the application
layer, in the same way that metadata is layered on top of TCP by the
application layer (e.g., HTTP).
Conceptually, WebSocket is really just a layer on top of TCP that
does the following:
- adds a web origin-based security model for browsers
- adds an addressing and protocol naming mechanism to support
  multiple services on one port and multiple host names on one IP
  address
- layers a framing mechanism on top of TCP to get back to the IP
  packet mechanism that TCP is built on, but without length limits
- includes an additional closing handshake in-band that is designed
  to work in the presence of proxies and other intermediaries
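
To make the framing idea concrete, here is a minimal sketch of how a length-prefixed frame header lets a receiver recover ABC, DEF, GHI even when TCP delivers the bytes in arbitrary pieces. It is deliberately simplified and only handles unmasked, unfragmented text frames with payloads of at most 125 bytes; the names encodeTextFrame, FrameDecoder and concatBytes are made up for this illustration and are not part of any library.

```javascript
// Build a single unmasked text frame (simplified: payload <= 125 bytes).
function encodeTextFrame(text) {
  const payload = new TextEncoder().encode(text);
  if (payload.length > 125) throw new Error("sketch handles short payloads only");
  const frame = new Uint8Array(2 + payload.length);
  frame[0] = 0x81;           // FIN = 1, opcode = 0x1 (text)
  frame[1] = payload.length; // MASK = 0, 7-bit payload length
  frame.set(payload, 2);
  return frame;
}

// Join several byte arrays into one, as TCP would see them on the wire.
function concatBytes(parts) {
  const total = parts.reduce((sum, part) => sum + part.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    out.set(part, offset);
    offset += part.length;
  }
  return out;
}

// Hypothetical helper: buffers incoming chunks and emits only complete messages.
class FrameDecoder {
  constructor() {
    this.buffer = new Uint8Array(0);
  }

  push(chunk) {
    this.buffer = concatBytes([this.buffer, chunk]);
    const messages = [];
    while (this.buffer.length >= 2) {
      const length = this.buffer[1] & 0x7f;
      if (this.buffer.length < 2 + length) break; // frame incomplete, wait for more bytes
      messages.push(new TextDecoder().decode(this.buffer.subarray(2, 2 + length)));
      this.buffer = this.buffer.slice(2 + length);
    }
    return messages;
  }
}

// Three frames of 5 bytes each, delivered in chunks that ignore frame boundaries.
const streamBytes = concatBytes(["ABC", "DEF", "GHI"].map(encodeTextFrame));
const frameDecoder = new FrameDecoder();
const deliveredMessages = [];
for (const [start, end] of [[0, 4], [4, 11], [11, 15]]) {
  deliveredMessages.push(...frameDecoder.push(streamBytes.subarray(start, end)));
}
console.log(deliveredMessages); // [ 'ABC', 'DEF', 'GHI' ]
```

A real implementation also has to handle masking, extended payload lengths, fragmentation and control frames, which this sketch leaves out.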

Related

Is sending redundant data over websocket over HTTP/2 practically free?

I'm writing a web app feature that would use WebSocket messages to transmit JSON structures between client and server. The simplest protocol for the app would be to keep repeatedly sending mostly redundant parts back and forth. Can HTTP/2 compression effectively compress redundant parts in separate messages going back and forth? I know this should be possible in theory, but how about in practice?
Example case:
Assume that the shared_state is a string that is mostly same but not identical between different messages:
Client connects:
ws = new WebSocket("wss://myserver.example/foo/bar");
Client sends message over the WebSocket connection:
{ command: "foo", shared_state: "...long data here..." }
Server sends:
{ command: "bar", shared_state: "...long data here slightly modified..." }
Client sends:
{ command: "zoo", shared_state: "...long data here slightly modified again..." }
All these messages will be passed over a single HTTP/2 connection using a single websocket.
Will the messages going in both directions be compressed by HTTP/2? This would mean that the later data packets effectively could just use some references to already seen data in previously transmitted data in the same HTTP/2 connection. It would simplify the custom protocol that I need to implement if I can keep sending the shared state instead of just delta without causing high bandwidth usage in practice. I don't need to care about old clients that cannot support HTTP/2.
I'm assuming the delta between messages to be less than 1 KB, but the whole message including the redundant part would usually be in the range of 10-60 KB.
The way WebSocket over HTTP/2 works is that WebSocket frames are carried as opaque bytes in HTTP/2 DATA frames.
A logical WebSocket connection is carried by a single HTTP/2 stream, with "infinite" request content and "infinite" response content, as DATA frames (containing WebSocket frames) continue to flow from client to server and from server to client.
Since WebSocket bytes are carried by DATA frames, there is no "compression" going on at the HTTP/2 level.
HTTP/2 only offers "compression" for HTTP headers via HPACK in HEADERS frames, but WebSocket over HTTP/2 cannot leverage that, because WebSocket messages travel in DATA frames rather than HEADERS frames.
Your custom protocol has key/value pairs, but it's a WebSocket payload carried by DATA frames.
The only "compression" that you're going to get is driven by WebSocket's permessage-deflate functionality.
For example, a browser would open a connection, establish WebSocket over HTTP/2 (with the permessage-deflate extension negotiated during WebSocket upgrade), and then send a WebSocket message. The WebSocket message will be compressed, the compressed bytes packed into WebSocket frames, the WebSocket frames packed in HTTP/2 DATA frames and then sent over the network.
If your shared_state compresses well, then you are trading network bandwidth (fewer bytes over the network) for CPU usage (to compress and decompress).
If it does not compress well, then it's probably not worth it.
I'd recommend that you look into existing protocols over WebSocket; there may be ones that already do what you need (for example, Bayeux).
Also, consider not using JSON as the format, since JavaScript now supports binary data, so you may be more efficient (see for example CBOR).
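
As a rough way to estimate what permessage-deflate could save for payloads like these, the sketch below compresses a second message with and without the previous message as compression history. It only approximates the extension: it uses Node's zlib with a preset dictionary to stand in for the sliding window that permessage-deflate can carry over between messages, and the payloads are invented. Note also that each endpoint keeps its own compression context, so history only helps between messages sent in the same direction, and a peer may negotiate "no_context_takeover", which resets it for every message.

```javascript
const zlib = require("zlib");

// Hypothetical payloads: a large shared_state that changes only slightly.
const sharedState = Array.from(
  { length: 1500 },
  (_, i) => `item-${i}:${(i * 7919) % 1000}`
).join(",");
const firstMessage = JSON.stringify({ command: "foo", shared_state: sharedState });
const secondMessage = JSON.stringify({
  command: "zoo",
  shared_state: sharedState.replace("item-1000:", "item-1000-modified:"),
});

// Fresh compressor for the message: no knowledge of earlier traffic.
const independent = zlib.deflateRawSync(secondMessage);

// Approximation of context takeover: the previous message acts as history,
// so repeated content can be encoded as back-references into it.
const history = Buffer.from(firstMessage);
const withHistory = zlib.deflateRawSync(secondMessage, { dictionary: history });

// The receiver needs the same history to decompress.
const restoredSecond = zlib
  .inflateRawSync(withHistory, { dictionary: history })
  .toString();

console.log({
  uncompressed: Buffer.byteLength(secondMessage),
  independent: independent.length,
  withHistory: withHistory.length,
  roundTripOk: restoredSecond === secondMessage,
});
```

Because deflate can only refer back 32 KB, messages near the upper end of the 10-60 KB range would see less benefit from history than this example suggests.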

Does the Websocket protocol manage the sending of large data in chunks

Hi guys, I was just wondering if the WebSocket protocol already handles the sending of large data in chunks. Knowing that it does would at least save me the time of doing so myself.
According to the base framing in RFC 6455, a single frame can carry a payload of up to 2^63 - 1 bytes, which means the practical limit depends on your client library implementation.
I was just wondering if the websocket protocol already handles the sending of large data in chunks...
Depends what you mean by that.
The WebSockets protocol is frame based (not stream based)
If what you're wondering about is "will a huge payload arrive in one piece?" - the answer is always "yes".
The WebSockets protocol is a frame / message based protocol, not a streaming protocol. This means the protocol wraps and unwraps messages in a way that's designed to guarantee message ordering and integrity. A message will not get truncated in the middle (unlike TCP/IP, which is a stream-based protocol where ordering is preserved but message boundaries are not).
The WebSockets protocol MAY use fragmented "packets"
According to the standard, the protocol may break large messages into smaller chunks. It doesn't have to.
There's a 32 bit compatibility concern that makes some clients / servers fragment messages into smaller fragments and later put them back together on the receiving end (before the onmessage callback is called).
Application layer "chunking" is required for multiplexing
Sending large payloads over a single WebSocket connection will cause a pipelining issue, where other messages will have to wait until the huge payload is sent, received and (if required) re-assembled.
In practice, this means that large payloads should be fragmented by the application layer. This "chunked" application layer approach will enable multiplexing the single WebSocket connection.
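
The application-layer chunking described above could look roughly like this. The envelope fields (id, seq, last, data) and the names chunkMessage and Reassembler are assumptions made for this sketch, not part of WebSocket or any library; each JSON string would be sent as its own WebSocket message, which lets a small message slip in between the chunks of a large one.

```javascript
// Split one logical payload into small envelopes (hypothetical format).
function* chunkMessage(id, payload, chunkSize) {
  let seq = 0;
  let offset = 0;
  do {
    const data = payload.slice(offset, offset + chunkSize);
    offset += chunkSize;
    yield JSON.stringify({ id, seq: seq++, last: offset >= payload.length, data });
  } while (offset < payload.length);
}

// Collect chunks per logical message id and hand back completed payloads.
class Reassembler {
  constructor() {
    this.partial = new Map();
  }

  // Returns { id, payload } when a message is complete, otherwise null.
  accept(wireMessage) {
    const { id, seq, last, data } = JSON.parse(wireMessage);
    const parts = this.partial.get(id) || [];
    if (seq !== parts.length) throw new Error(`unexpected chunk ${seq} for ${id}`);
    parts.push(data);
    if (!last) {
      this.partial.set(id, parts);
      return null;
    }
    this.partial.delete(id);
    return { id, payload: parts.join("") };
  }
}

// A small message is interleaved between the chunks of a larger one.
const bigChunks = [...chunkMessage("upload-1", "x".repeat(10), 4)];
const smallChunks = [...chunkMessage("ping-1", "hi", 4)];
const sendOrder = [bigChunks[0], smallChunks[0], bigChunks[1], bigChunks[2]];

const reassembler = new Reassembler();
const completed = sendOrder
  .map((wireMessage) => reassembler.accept(wireMessage))
  .filter((result) => result !== null);
console.log(completed.map((message) => message.id)); // [ 'ping-1', 'upload-1' ]
```

The sketch relies on WebSocket preserving message order within a connection, so seq is only used as a sanity check.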

Websocket: binary data messages order [duplicate]

If we send two messages over the same HTML5 WebSocket a fraction of a millisecond apart from each other,
is it theoretically possible for the messages to arrive in a different order than they were sent?
Short answer: No.
Long answer:
WebSocket runs over TCP, so on that level @EJP's answer applies. WebSocket can be "intercepted" by intermediaries (like WS proxies): those are allowed to reorder WebSocket control frames (i.e. WS pings/pongs), but not message frames when no WebSocket extension is in place. If there is a negotiated extension in place that in principle allows reordering, then an intermediary may only do so if it understands the extension and the reordering rules that apply.
It's not possible for them to arrive in your application out of order. Anything can happen on the network, but TCP will only present you the bytes in the order they were sent.
At the transport layer, TCP is supposed to guarantee that bytes arrive in order. At the application layer, errors can occur in the code and cause your messages to be out of order in the logic of your code. It could be the network stack your application is using or your application code itself.
If you asked me whether my Node.js application can guarantee sending and receiving messages in order, I would have to say no. I've run WebSocket applications over WiFi with high latency and a weak signal, and it causes very strange behavior, as if packets are dropped and messages are out of sequence.
This article is a good read: https://samsaffron.com/archive/2015/12/29/websockets-caution-required
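
If you suspect that your own application code is processing messages out of order, one simple check is to stamp every message with a sequence number and verify it on the receiving side. This is only a sketch: the { seq, body } envelope and the helper names makeSender and makeOrderChecker are invented for the example.

```javascript
// Sender side: wrap each body with an increasing sequence number.
function makeSender() {
  let nextSeq = 0;
  return (body) => JSON.stringify({ seq: nextSeq++, body });
}

// Receiver side: report whether each message arrived with the expected number.
function makeOrderChecker() {
  let expectedSeq = 0;
  return (wireMessage) => {
    const { seq, body } = JSON.parse(wireMessage);
    const inOrder = seq === expectedSeq;
    expectedSeq = seq + 1; // resynchronize so one gap is reported once
    return { body, seq, inOrder };
  };
}

const stamp = makeSender();
const stamped = ["first", "second", "third"].map(stamp);

// Delivered as sent: every message is in order.
const checkInOrder = makeOrderChecker();
const orderedFlags = stamped.map((m) => checkInOrder(m).inOrder);
console.log(orderedFlags); // [ true, true, true ]

// Simulate a bug that swaps the last two messages before processing.
const checkSwapped = makeOrderChecker();
const swappedFlags = [stamped[0], stamped[2], stamped[1]].map(
  (m) => checkSwapped(m).inOrder
);
console.log(swappedFlags); // [ true, false, false ]
```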

What does X mean when used with ØMQ socket types?

ZeroMQ defines a series of socket types, commonly referred to as SUB, PUB, XSUB, XPUB, DEALER...
Looking through some API code, there are methods such as XSend, XHasIn, XHiccupped.
These X characters seem to be used as semantic modifiers. Is there any pattern or significance to their usage?
When applied to socket types, the x is a signifier that the socket exposes the raw elements of the protocol in some way that is otherwise hidden by the more common socket type.
Take, for example, xSUB & xPUB.
PUB/SUB typically only communicates one way, PUB sends and SUB receives. But with xPUB/xSUB, the otherwise hidden element of the protocol is exposed: xSUB sends a "subscribe" message to xPUB, and xPUB can receive that message and you can see it to act on it in some more interesting way than just maintaining and sending data for that subscription.
Likewise, there used to be xREP and xREQ. In ZMQ, REP/REQ have very strict requirements for the order in which they send/receive messages. These requirements are enforced by low level elements of the socket protocols which are not exposed. xREP and xREQ allowed you to break those requirements by exposing the elements that forced you to follow those message patterns. These socket types were so useful, they eventually evolved into ROUTER and DEALER sockets.
I haven't looked at a low enough level to say exactly why there are methods like xsend, etc., but my gut feeling is that they are the versions of these methods in the x socket types that are designed to expose these protocol elements.
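
To illustrate the "subscribe" message that xSUB sends and xPUB exposes: according to the ZeroMQ socket documentation, it is a single frame whose first byte is 1 for a subscription or 0 for an unsubscription, followed by the topic prefix. The sketch below only builds and parses that byte layout; it does not use any ZeroMQ binding, and the helper names are made up.

```javascript
// Build the frame an XSUB socket would send upstream for a topic prefix.
function encodeSubscription(topic, subscribe = true) {
  const topicBytes = new TextEncoder().encode(topic);
  const frame = new Uint8Array(1 + topicBytes.length);
  frame[0] = subscribe ? 1 : 0; // 1 = subscribe, 0 = unsubscribe
  frame.set(topicBytes, 1);
  return frame;
}

// Interpret a frame received on an XPUB socket; null if it is not a
// subscription message.
function parseSubscription(frame) {
  if (frame.length === 0 || frame[0] > 1) return null;
  return {
    subscribe: frame[0] === 1,
    topic: new TextDecoder().decode(frame.subarray(1)),
  };
}

console.log(parseSubscription(encodeSubscription("weather.")));
// { subscribe: true, topic: 'weather.' }
console.log(parseSubscription(encodeSubscription("weather.", false)));
// { subscribe: false, topic: 'weather.' }
```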

Websocket questions: framing, masking

A couple of questions about the WebSocket protocol when sending BINARY data:
Why is the payload masked? Doesn't TCP guarantee data integrity?
What exactly is fragmentation? Does it mean that if I send a single frame with a 1000-byte payload, the other end (due to intermediate proxies) may receive four separate frames of 200, 300, 270, and 230 bytes each (with only the final frame having the FIN bit set)?
The payload sent from client to server (not server to client) is masked neither for reasons of data integrity nor authenticity, but to prevent rogue scripts from confusing (and potentially attacking) old intermediaries (Web proxies and the like).
Any WebSocket client that conforms to RFC6455 MUST mask client-to-server frames. Nevertheless, some libraries allow you to turn off masking for client, and turn off failing on non-masked client frames (e.g. AutobahnPython).
The latter can be useful to eliminate the CPU overhead associated with masking. It may be acceptable when both endpoints are under your control and either the route between them is fully under your control (e.g. talking WebSocket over loopback, Unix domain sockets, or a LAN) or you are using TLS, so that (in most situations) no intermediary will be able to look inside the traffic anyway.
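
For reference, the masking operation itself is just an XOR of each payload byte with a 4-byte key, and applying it twice restores the original. The sketch below reproduces the masked "Hello" example from RFC 6455; the function name applyMask is made up, and a real client must pick a fresh, unpredictable key for every frame instead of the fixed one used here.

```javascript
// XOR every payload byte with the masking key, cycling through its 4 bytes.
function applyMask(payload, maskingKey) {
  const out = new Uint8Array(payload.length);
  for (let i = 0; i < payload.length; i++) {
    out[i] = payload[i] ^ maskingKey[i % 4];
  }
  return out;
}

// Format bytes as lowercase hex for easy comparison.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");
}

// Fixed key taken from the RFC 6455 example; real keys must be random.
const maskingKey = Uint8Array.of(0x37, 0xfa, 0x21, 0x3d);
const plainPayload = new TextEncoder().encode("Hello");

const maskedPayload = applyMask(plainPayload, maskingKey);
console.log(toHex(maskedPayload)); // 7f 9f 4d 51 58

// Unmasking is the same operation with the same key.
const unmaskedText = new TextDecoder().decode(applyMask(maskedPayload, maskingKey));
console.log(unmaskedText); // Hello
```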
Fragmentation works like this: a WebSocket message may be split into multiple WebSocket frames, and frames may be split further or coalesced at any time, not only by the sender but also by any intermediaries on the way to the receiver. And yes, only the last WebSocket frame of a sequence of frames for a given message will have the FIN bit set.
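
A small sketch of that rule, using the 200/300/270/230 split from the question: the first fragment carries the real opcode, later fragments use the continuation opcode (0x0), and only the last one has FIN set. Frames are modeled as plain objects rather than bytes, the payload is ASCII only, and control frames interleaved between fragments are ignored; fragmentText and reassembleFrames are hypothetical helper names.

```javascript
const OPCODE_CONTINUATION = 0x0;
const OPCODE_TEXT = 0x1;

// Split a text message into fragments of the given sizes.
function fragmentText(text, sizes) {
  const frames = [];
  let offset = 0;
  sizes.forEach((size, index) => {
    frames.push({
      fin: index === sizes.length - 1,
      opcode: index === 0 ? OPCODE_TEXT : OPCODE_CONTINUATION,
      payload: text.slice(offset, offset + size),
    });
    offset += size;
  });
  return frames;
}

// Rebuild whole messages from a sequence of data frames.
function reassembleFrames(frames) {
  const messages = [];
  let parts = null; // null means no message is currently in progress
  for (const frame of frames) {
    if (frame.opcode !== OPCODE_CONTINUATION) {
      if (parts !== null) throw new Error("new message started before the previous one finished");
      parts = [];
    } else if (parts === null) {
      throw new Error("continuation frame without a starting frame");
    }
    parts.push(frame.payload);
    if (frame.fin) {
      messages.push(parts.join(""));
      parts = null;
    }
  }
  return messages;
}

const originalMessage = "a".repeat(1000);
const fragments = fragmentText(originalMessage, [200, 300, 270, 230]);
console.log(fragments.map((f) => f.fin)); // [ false, false, false, true ]

const rebuilt = reassembleFrames(fragments);
console.log(rebuilt.length, rebuilt[0] === originalMessage); // 1 true
```

Most WebSocket APIs, including the browser one, do this reassembly for you and only deliver the complete message.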
