How to transfer data structures between browser and server - websocket

In the context of interactive web apps, especially those libraries utilizing technologies such as websockets, how can we transfer data structures (e.g. maps, lists, sets, etc) between the client browser and the server? The examples I've come across so far only pass strings.
Does support for this depend on the libraries used, case by case, or is a more general mechanism available?

You can send three things over a websocket (from the client perspective):
Strings
ArrayBuffers
Blobs
If you have a compound JavaScript data structure (a hierarchy of maps and arrays) then you should use JSON to serialize it to a string and send it over the WebSocket connection as a string.
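As a rough sketch of that approach: plain objects and arrays pass through JSON.stringify directly, but Map and Set instances do not (they come out as empty objects), so they need a replacer/reviver pair. The `__type` tag and the helper names `encodeMessage` and `decodeMessage` are illustrative assumptions, not part of any standard or library.

```javascript
// Tag Map and Set values so they survive a JSON round trip.
function replacer(key, value) {
  if (value instanceof Map) {
    return { __type: "Map", entries: Array.from(value.entries()) };
  }
  if (value instanceof Set) {
    return { __type: "Set", values: Array.from(value.values()) };
  }
  return value;
}

// Rebuild tagged values on the receiving side.
function reviver(key, value) {
  if (value && value.__type === "Map") return new Map(value.entries);
  if (value && value.__type === "Set") return new Set(value.values);
  return value;
}

function encodeMessage(data) {
  return JSON.stringify(data, replacer);
}

function decodeMessage(text) {
  return JSON.parse(text, reviver);
}

const text = encodeMessage({
  users: new Map([["alice", { tags: new Set(["admin"]) }]]),
  scores: [1, 2, 3],
});
const roundTripped = decodeMessage(text);
// In a browser: ws.send(text) on one side,
// decodeMessage(event.data) in the onmessage handler on the other.
```

The server side needs the matching convention in whatever language it is written in, which is why sticking to plain objects and arrays is often simpler when that is an option.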
If you are interested in sending binary byte or file data over the WebSocket connection you could still serialize to a string (inefficient bandwidth-wise) or you can send the data as ArrayBuffers or Blobs.
Note 1: Sending an ArrayBuffer or Blob results in a binary WebSocket frame on the wire, so your server needs to support binary frames.
Note 2: The client gets to choose what type of object is returned when the server sends a binary frame. This is the binaryType property on the WebSocket object which can be set to either "arraybuffer" or "blob".
Note 3: If your browser only supports the older WebSocket Hixie protocol series (e.g. iOS Safari) then there is no binary data support and you can only send and receive strings.
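For the binary route, a minimal sketch of packing a record into an ArrayBuffer with DataView could look like this. The record layout (a uint32 id, a float64 reading, a uint16 count) and the function names are hypothetical, chosen only to illustrate the idea; both ends must agree on whatever layout you pick.

```javascript
// Hypothetical fixed layout:
// bytes 0-3 uint32 id, bytes 4-11 float64 reading, bytes 12-13 uint16 count.
const RECORD_SIZE = 14;

function encodeRecord(record) {
  const buffer = new ArrayBuffer(RECORD_SIZE);
  const view = new DataView(buffer);
  view.setUint32(0, record.id); // DataView writes big-endian by default
  view.setFloat64(4, record.reading);
  view.setUint16(12, record.count);
  return buffer;
}

function decodeRecord(buffer) {
  const view = new DataView(buffer);
  return {
    id: view.getUint32(0),
    reading: view.getFloat64(4),
    count: view.getUint16(12),
  };
}

const buffer = encodeRecord({ id: 4789, reading: 21.5, count: 12 });
const decoded = decodeRecord(buffer);
// In a browser: set ws.binaryType = "arraybuffer", call ws.send(buffer),
// and pass event.data to decodeRecord in the onmessage handler.
```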

The usual way is JSON or similar.

Related

Is sending redundant data over websocket over HTTP/2 practically free?

I'm writing a web app feature that would use WebSocket messages to transmit JSON structures between client and server. The simplest protocol for the app would be to keep repeatedly sending mostly redundant parts back and forth. Can HTTP/2 compression effectively compress redundant parts in separate messages going back and forth? I know this should be possible in theory, but how about in practice?
Example case:
Assume that shared_state is a string that is mostly the same, but not identical, between different messages:
Client connects:
ws = new WebSocket("wss://myserver.example/foo/bar");
Client sends message over the WebSocket connection:
{ command: "foo", shared_state: "...long data here..." }
Server sends:
{ command: "bar", shared_state: "...long data here slightly modified..." }
Client sends:
{ command: "zoo", shared_state: "...long data here slightly modified again..." }
All these messages will be passed over a single HTTP/2 connection using a single websocket.
Will the messages going in both directions be compressed by HTTP/2? This would mean that the later data packets effectively could just use some references to already seen data in previously transmitted data in the same HTTP/2 connection. It would simplify the custom protocol that I need to implement if I can keep sending the shared state instead of just delta without causing high bandwidth usage in practice. I don't need to care about old clients that cannot support HTTP/2.
I'm assuming the delta between messages to be less than 1 KB, but the whole message including the redundant part would usually be in the range of 10-60 KB.
The way WebSocket over HTTP/2 works is that WebSocket frames are carried as opaque bytes in HTTP/2 DATA frames.
A logical WebSocket connection is carried by a single HTTP/2 stream, with "infinite" request content and "infinite" response content, as DATA frames (containing WebSocket frames) continue to flow from client to server and from server to client.
Since WebSocket bytes are carried by DATA frames, there is no "compression" going on at the HTTP/2 level.
HTTP/2 only offers "compression" for HTTP headers via HPACK in HEADERS frames, but WebSocket messages cannot leverage that: HEADERS frames are only used to set up the stream, and the messages themselves travel in DATA frames.
Your custom protocol has key/value pairs, but it's a WebSocket payload carried by DATA frames.
The only "compression" that you're going to get is driven by WebSocket's permessage-deflate functionality.
For example, a browser would open a connection, establish WebSocket over HTTP/2 (with the permessage-deflate extension negotiated during WebSocket upgrade), and then send a WebSocket message. The WebSocket message will be compressed, the compressed bytes packed into WebSocket frames, the WebSocket frames packed in HTTP/2 DATA frames and then sent over the network.
If your shared_state compresses well, then you are trading network bandwidth (fewer bytes over the network) for CPU usage (to compress and decompress).
If it does not compress well, then it's probably not worth it.
I'd recommend that you look into existing protocols over WebSocket, there may be ones that do what you need already (for example, Bayeux).
Also, consider not using JSON as the format, since JavaScript now supports binary data, so you may be more efficient (see for example CBOR).
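To get a feel for how much a shared compression context can save on mostly redundant messages, here is a rough Node.js experiment. It does not implement permessage-deflate; it only approximates the effect by passing the previous message as a preset dictionary to zlib's raw deflate. The `makeState` helper, the 20 KB state size and the message contents are assumptions made up for the test. Keep in mind that in real permessage-deflate each direction has its own compression context, that reuse across messages depends on context takeover being negotiated, and that the DEFLATE window is at most 32 KB, so redundancy further back than that cannot be referenced.

```javascript
const zlib = require("zlib");

// Deterministic pseudo-random lowercase text standing in for a large shared_state.
function makeState(length, seed) {
  let x = seed;
  let out = "";
  for (let i = 0; i < length; i++) {
    x = (x * 48271) % 2147483647; // MINSTD generator, stays within safe integers
    out += String.fromCharCode(97 + (x % 26));
  }
  return out;
}

const state = makeState(20000, 42);
const modified = state.slice(0, 10000) + "CHANGED" + state.slice(10000);

const first = Buffer.from(JSON.stringify({ command: "foo", shared_state: state }));
const second = Buffer.from(JSON.stringify({ command: "bar", shared_state: modified }));

// No shared context: the second message is compressed on its own.
const standalone = zlib.deflateRawSync(second);

// Approximated shared context: the previous message acts as a preset dictionary.
const withContext = zlib.deflateRawSync(second, { dictionary: first });

// The receiver needs the same dictionary to decompress.
const restored = zlib.inflateRawSync(withContext, { dictionary: first });

console.log("raw:", second.length,
  "standalone:", standalone.length,
  "with context:", withContext.length);
```

Running this with your real payloads, rather than the synthetic state used here, should give a better indication of whether resending the full state is affordable.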

Socket.io - different maxHttpBufferSize values depending on the nature of the request

I am creating an application that allows users to submit JSON or Base64 image data via socket.io
The goal I am trying to achieve is:
if JSON is submitted, the message can have a maximum size of 1MB
if a Base64 image is submitted, the message can have a maximum size of 5MB
From the socket.io docs I can see that:
you can specify a maxHttpBufferSize option value that allows you to limit the maximum message size
namespaces allow you to split logic over a single connection
However, I can't figure out the correct way to get the functionality to work the way I have described above.
Would I need to:
set up 2 separate io instances on the server, one for JSON data and the other for Base64 images (therefore allowing me to set separate maxHttpBufferSize values for each), and then the client can use the correct instance, depending on what they want to submit (if so, what is the correct way of doing this?)
set up 1 instance with a maxHttpBufferSize of 5MB, and then add in my own custom logic to determine message sizes and prevent further actions if the data is JSON and over 1MB in size
set this up in some totally different way that I haven't thought of
Many thanks
From what I can see in the API, maxHttpBufferSize is a parameter for the underlying Engine.IO server (of which there is one instance per Socket.IO Server Instance). Obviously you're free to set up two servers but I doubt it makes sense to separate the system into two entirely different applications.
Talk of using Namespaces to separate logic is more about handling different messages at different endpoints (for example you would register a removeUserFromChat message handler to a user connecting via an /admin namespace, but you wouldn't want to register this to a user connecting via the /user namespace).
In the most recent socket server I set up, I defined my own protocol where part of the response would contain a HTTP status code, as well as a description that could be displayed to the user. For example I would return 200 on success. If I was uploading a file via a REST HTTP Interface, I would expect a 400 (BAD REQUEST) response if my request couldn't be processed - and I believe that this makes sense for your use case. Alternatively you could define your own custom 4XX error code if the file is too large, and handle this in your UI purely based on the code returned. Obviously you don't need to follow the HTTP protocol, and the design decisions are ultimately up to you, but in my opinion it makes sense to return some kind of error response in your message handler.
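A minimal sketch of that idea, assuming a single server instance whose maxHttpBufferSize is set high enough for the larger of the two limits. The message shape (`kind` and `body`), the function name `checkUpload` and the choice of 413 as the "too large" code are assumptions for illustration only.

```javascript
// Limits taken from the question: 1 MB for JSON, 5 MB for Base64 images.
const LIMITS = new Map([
  ["json", 1 * 1024 * 1024],
  ["image", 5 * 1024 * 1024],
]);

// Hypothetical message shape: { kind: "json" | "image", body: string }
function checkUpload(message) {
  const limit = message ? LIMITS.get(message.kind) : undefined;
  if (limit === undefined || typeof message.body !== "string") {
    return { status: 400, description: "Unknown message kind or missing body" };
  }
  const size = new TextEncoder().encode(message.body).length; // size in UTF-8 bytes
  if (size > limit) {
    return {
      status: 413,
      description: `Payload of ${size} bytes exceeds the ${limit} byte limit`,
    };
  }
  return { status: 200, description: "OK" };
}

const okResult = checkUpload({ kind: "json", body: '{"a":1}' });
const tooBigResult = checkUpload({ kind: "json", body: "a".repeat(1024 * 1024 + 1) });
```

In a socket.io handler the result could be passed back through an acknowledgement callback, for example `socket.on("upload", (message, ack) => ack(checkUpload(message)))`, where the event name "upload" is again just a placeholder.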
I suspect that maxHttpBufferSize serves a different purpose, at a lower level than your use case. When sending content over the network, the content is split into packets of n bytes, and once an application has written n bytes, a packet is sent over the network (the smaller n is, the more overhead from network headers; the larger n is, the higher the latency from waiting to accumulate n bytes before sending). The documentation is not clear about maxHttpBufferSize, but it could be the packet size (n) configuration rather than a limit on the maximum data per connection.
It seems the HTTP request header Content-Length might serve your purpose. It gives the actual object size, and you can make a decision based on that.

Does the Websocket protocol manage the sending of large data in chunks

Hi guys I was just wondering if the websocket protocol already handles the sending of large data in chunks. At least knowing that it does will save me the time of doing so myself.
According to RFC 6455, the base framing has a maximum payload size of 2^63 bytes per frame, which means the practical limit depends on your client library implementation.
I was just wondering if the websocket protocol already handles the sending of large data in chunks...
Depends what you mean by that.
The WebSockets protocol is frame based (not stream based)
If what you're wondering about is "will a huge payload arrive in one piece?" - the answer is always "yes".
The WebSockets protocol is a frame / message based protocol, not a streaming protocol. This means that the protocol wraps and unwraps messages in a way that's designed to guarantee message ordering and integrity. A message will not get truncated in the middle (unlike TCP/IP, which is a streaming based protocol, where ordering is preserved, but not message boundaries).
The WebSockets protocol MAY use fragmented "packets"
According to the standard, the protocol may break large messages into smaller chunks. It doesn't have to.
There's a 32 bit compatibility concern that makes some clients / servers fragment messages into smaller fragments and later put them back together on the receiving end (before the onmessage callback is called).
Application layer "chunking" is required for multiplexing
Sending large payloads over a single WebSocket connection will cause a pipelining issue, where other messages will have to wait until the huge payload is sent, received and (if required) re-assembled.
In practice, this means that large payloads should be fragmented by the application layer. This "chunked" application layer approach will enable multiplexing the single WebSocket connection.
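A rough sketch of such application-layer chunking is shown below. The envelope fields (`id`, `seq`, `total`, `data`) and the names `toChunks` and `Reassembler` are made up for illustration; a real protocol would also want timeouts and size limits for incomplete messages.

```javascript
// Split a string payload into chunk envelopes of at most chunkSize characters.
function toChunks(messageId, payload, chunkSize) {
  const total = Math.max(1, Math.ceil(payload.length / chunkSize));
  const chunks = [];
  for (let seq = 0; seq < total; seq++) {
    const data = payload.slice(seq * chunkSize, (seq + 1) * chunkSize);
    chunks.push({ id: messageId, seq, total, data });
  }
  return chunks;
}

// Collects chunks per message id; returns the full payload once all have arrived.
class Reassembler {
  constructor() {
    this.pending = new Map();
  }

  accept(chunk) {
    let parts = this.pending.get(chunk.id);
    if (!parts) {
      parts = { received: 0, data: new Array(chunk.total) };
      this.pending.set(chunk.id, parts);
    }
    if (parts.data[chunk.seq] === undefined) {
      parts.data[chunk.seq] = chunk.data;
      parts.received++;
    }
    if (parts.received < chunk.total) return null; // still waiting for more chunks
    this.pending.delete(chunk.id);
    return parts.data.join("");
  }
}

// Chunks of a large message interleaved with a small one on the same connection.
const big = toChunks("big", "abcdefghij", 4);
const small = toChunks("small", "hi", 4);
const wireOrder = [big[0], small[0], big[1], big[2]];

const reassembler = new Reassembler();
const completed = [];
for (const chunk of wireOrder) {
  // On a real connection each chunk would go through
  // ws.send(JSON.stringify(chunk)) and JSON.parse(event.data).
  const payload = reassembler.accept(chunk);
  if (payload !== null) completed.push(payload);
}
```

Because the small message completes after the first chunk of the large one, it does not have to wait for the whole large payload to be transmitted.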

Fastest most elegant way to split websocket buffer into different variables so that they can be compared

Best way to send & receive an array over websockets
On the client side I'm using javascript that sends the data.
I'm sending to a microcontroller (ESP8266) that is programmed in c++ using the websocket library with the arduino IDE
At the moment I'm sending the variable which I build up on the client side.
It is then sent to the microcontroller and received by the payload buffer.
I am sending this from the client
#,tank,pond,1537272000,1537272000,Normal,4789,12
I received here in the code:
case WStype_TEXT: Serial.printf("[%u] get Text: %s\n", num, payload);
this is the result of what I receive
[0] here it is: #,tank,pond,1537272000,1537272000,Normal,4789,12
I am using a hash(#) to mark the start of the data.
I have been googling and searching forums for days but can't fathom which is the best way to do this.
What is the fastest most elegant code to split this up into different variables so that they can be compared?
Briefly, I prefer JSON. You can use the library below for ESP8266:
https://github.com/bblanchon/ArduinoJson
I assume the other end of your websocket connection is written in a high level language, so neither parsing nor composing JSON will be an issue for you there.
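On the JavaScript side, turning the comma-separated record into JSON before sending could be sketched like this. The field names in `FIELDS` are hypothetical, since the question does not say what each column means; the microcontroller would then read the values by name with ArduinoJson instead of splitting the string by hand.

```javascript
// Hypothetical field names; the original record does not name its columns.
const FIELDS = ["source", "target", "start", "end", "mode", "value", "count"];

function recordToJson(line) {
  const parts = line.split(",");
  if (parts[0] !== "#" || parts.length !== FIELDS.length + 1) {
    throw new Error("Unexpected record format: " + line);
  }
  const record = {};
  FIELDS.forEach((name, i) => {
    const raw = parts[i + 1];
    // Keep purely numeric columns as numbers so the receiver can compare them directly.
    record[name] = /^\d+$/.test(raw) ? Number(raw) : raw;
  });
  return JSON.stringify(record);
}

const json = recordToJson("#,tank,pond,1537272000,1537272000,Normal,4789,12");
// In the browser this string would be passed to ws.send(json).
```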
Good luck.

Go RPC, HTTP or WebSockets: which is fastest for transferring many small pieces of data, repeatedly, from one server to another

Background
I'm experimenting creating a memory + cpu profiler in go, and wish to transfer the information quickly, maybe every second, from the program/service being profiled to a server which will do all of the heavy lifting by saving the data to a database and/or serving it via http to a site; this will reduce the load on the program being profiled for more accurate measurements. It will be small pieces of data being transferred. I know there are some libraries out there already, but like I said, experimenting.
Transfer Content Type
I have not decided on a concrete transfer type but looks like JSON for HTTP or Websockets and just the Struct for RPC (if I've done my research correctly)
Summary
I will likely try each just to see for myself, but have little experience using RPC and Websockets and would like some opinions or recommendations on which may be faster or more suitable for what I'm trying to do:
HTTP
RPC
Websockets
Anything else I'm not thinking about
As you mentioned in your comment, HTTP is not a requirement.
In that case, in search of the fastest transfer solution, I would drop the HTTP transport layer entirely and use plain TCP socket connections, since HTTP adds considerable overhead when transferring just a few bytes.
Define your own protocol (which may be very simple), open a TCP connection to the server, and send the data packets every second or so, as your requirements dictate.
Your protocol for sending (and receiving) data can be as simple as:
Do an optional authenticating or client/server identification (to ensure you connected to the server/program you wanted to).
Use the encoding/gob package from the standard library to send data in binary form over the connection.
So basically the profiled program (client) should open the TCP connection and use gob.NewEncoder() wrapping the connection to send data. The server should accept the incoming TCP connection and use gob.NewDecoder() wrapping the connection to receive data.
The client calls Encoder.Encode() to send profiling info, which would typically be a struct value. The server calls Decoder.Decode() to receive the profiling info, i.e. the struct that the client sent. That's all.
Sending data in binary form using the encoding/gob package requires you to use the same type to describe the profiling data on both sides. If you want more flexibility, you may also use the encoding/json package to send/receive profiling info as JSON text. The downside is that JSON will require more data to be sent and it takes more time to produce and parse the JSON text compared to the binary representation.
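To illustrate what a "very simple" custom protocol over a TCP byte stream can look like with the JSON variant, here is a sketch of length-prefixed framing. It is written in JavaScript rather than Go and is not gob; the frame layout (a 4-byte big-endian length followed by UTF-8 JSON), the sample fields and the names `encodeFrame` and `FrameDecoder` are assumptions for illustration. In Go, json.NewEncoder and json.NewDecoder wrapping the connection would play a comparable role.

```javascript
// Frame = 4-byte big-endian body length + UTF-8 encoded JSON body.
function encodeFrame(value) {
  const body = new TextEncoder().encode(JSON.stringify(value));
  const frame = new Uint8Array(4 + body.length);
  new DataView(frame.buffer).setUint32(0, body.length);
  frame.set(body, 4);
  return frame;
}

// TCP delivers a byte stream without message boundaries,
// so the decoder buffers input until a whole frame is available.
class FrameDecoder {
  constructor() {
    this.buffer = new Uint8Array(0);
  }

  push(chunk) {
    const merged = new Uint8Array(this.buffer.length + chunk.length);
    merged.set(this.buffer, 0);
    merged.set(chunk, this.buffer.length);
    this.buffer = merged;

    const values = [];
    while (this.buffer.length >= 4) {
      const view = new DataView(
        this.buffer.buffer, this.buffer.byteOffset, this.buffer.byteLength);
      const length = view.getUint32(0);
      if (this.buffer.length < 4 + length) break; // frame not complete yet
      const body = this.buffer.subarray(4, 4 + length);
      values.push(JSON.parse(new TextDecoder().decode(body)));
      this.buffer = this.buffer.subarray(4 + length);
    }
    return values;
  }
}

// Two hypothetical profiling samples, delivered in arbitrarily split pieces.
const frameA = encodeFrame({ heapBytes: 1048576, goroutines: 12 });
const frameB = encodeFrame({ heapBytes: 2097152, goroutines: 15 });
const stream = new Uint8Array(frameA.length + frameB.length);
stream.set(frameA, 0);
stream.set(frameB, frameA.length);

const decoder = new FrameDecoder();
const received = [
  ...decoder.push(stream.subarray(0, 7)),
  ...decoder.push(stream.subarray(7, frameA.length + 3)),
  ...decoder.push(stream.subarray(frameA.length + 3)),
];
```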
If losing some profiling packets (or receiving duplicates) is not an issue, you may want to experiment with UDP instead of TCP, which may be even more efficient.

Resources