Any CDN out there that is able to cache WebSocket data - caching

Is there any CDN provider able to offer some sort of WebSocket caching either via "Edge Workers" or other mechanism?
I would like some of the WebSocket requests we make to be served as close to the user as possible, without us having to set up our entire infrastructure in multiple datacenters.

No, there is no way to cache WebSocket messages: the protocol does not have any "addressable" resources the way HTTP does. HTTP has caching designed into the protocol. For example, GET requests are typically cacheable but POST requests are not, and you can add cache-related headers (for example, how long a response may be cached).
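To make the "addressable resource" point concrete, here is a minimal sketch of what an edge node can do with HTTP and cannot do with an opaque WebSocket message stream. The names EdgeCache and demo_origin are hypothetical, and the model is deliberately simplified (GET only, URL as cache key, a single max-age value); it is not how any particular CDN is implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class CachedResponse:
    body: bytes
    expires_at: float  # absolute time (seconds) after which the entry is stale


class EdgeCache:
    """Toy model of a CDN edge node (hypothetical, simplified)."""

    def __init__(self, origin: Callable[[str, str], Tuple[bytes, Optional[int]]]):
        # origin(method, url) -> (body, max_age_seconds, or None if uncacheable)
        self.origin = origin
        self.store: Dict[str, CachedResponse] = {}
        self.origin_hits = 0

    def request(self, method: str, url: str, now: float) -> bytes:
        # Only GET is treated as cacheable here; the URL is the cache key.
        if method == "GET":
            entry = self.store.get(url)
            if entry is not None and now < entry.expires_at:
                return entry.body  # served from the edge, origin not contacted
        self.origin_hits += 1
        body, max_age = self.origin(method, url)
        if method == "GET" and max_age:
            self.store[url] = CachedResponse(body, now + max_age)
        return body


def demo_origin(method: str, url: str) -> Tuple[bytes, Optional[int]]:
    # Assumed origin: GET responses behave as if sent with "Cache-Control: max-age=60".
    if method == "GET":
        return (f"content of {url}".encode(), 60)
    return (b"created", None)
```

A WebSocket message has no method or URL for the edge to key on, which is why the same trick does not carry over.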
For this reason, I would recommend using HTTP instead of WebSocket unless you have a use case where you really have to use WebSockets. With HTTP you have access to all the CDN and proxy infrastructure that is available globally.
It is getting more and more common to use e.g. Server-Sent Events, HTTP/2 streams or possibly gRPC for use cases where WebSockets were used before. The upcoming WebTransport protocol will probably be the replacement(?), at least when most other traffic is HTTP/3.

Seconded, caching WebSocket messages is not possible/not recommended. I would encourage you to look into transforming your WebSocket messaging into a form of Edge Computing.
For instance:
Akamai offers EdgeWorkers.
Cloudflare offers Workers.
AWS offers Lambda@Edge.
This might offer you the ability to handle your workflow instead of relying on WebSockets.
/Mike (and yes, I work at Akamai)

Is there a reason why HTTP2 is required when websocket is already available?

Nathan Aw (Singapore)
Why do we need HTTP/2 when we have WebSockets? Well why did we need WebSockets when we have TCP? Or even IP? Protocols are basically agreed standards that can be implemented by independent parties.
WebSockets are good for two way communication but are mostly unstructured on top of that and application specific. HTTP is (mostly) a series of one-way requests to the server (ask for a resource, receive an answer) — though HTTP/2 enhances that slightly with HTTP/2 push and the binary framing layer could in theory be used more for proper two way push. So the full two way nature of WebSockets — the very thing they are good at — is not really needed for most HTTP use cases.
HTTP has various extras that WebSockets does not have, including defined methods, headers and compression. This allows a well-defined understanding between various HTTP implementations to facilitate communications for its use case, including features like multiplexing, caching, compression, redirects, error handling, etc. If you had to reinvent all of those on top of WebSockets (which is a very raw protocol), you’d end up with an HTTP/2-like protocol.
Could HTTP/2 have used WebSockets to act as its underlying transport layer? Possibly, but that’s an unnecessary extra level of abstraction (IP->TCP->WS->HTTP2->HTTP), not to mention that WebSockets are often established over HTTP initially. HTTP is big enough to have its own transport protocol, so in fact they’ve gone the other way and specified WebSockets over HTTP/2.
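As an illustration of WebSockets being established over HTTP initially, the following sketch computes the Sec-WebSocket-Accept value a server returns in the opening handshake defined by RFC 6455. The function names accept_key and upgrade_response are made up for this example; the GUID constant and the SHA-1 plus Base64 steps come from the RFC.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Derive Sec-WebSocket-Accept from the client's Sec-WebSocket-Key header."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_response(sec_websocket_key: str) -> str:
    """Build the HTTP/1.1 response that switches the connection to WebSocket."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_key(sec_websocket_key)}\r\n"
        "\r\n"
    )
```

After this response is sent, the connection no longer carries HTTP messages, only WebSocket frames.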
Finally, it should also be noted that HTTP/2 does not make WebSockets obsolete either. They are different, with different advantages and disadvantages.

Is there a Socket.IO alternative that is not based on WebSockets?

I built a realtime application that, thanks to Socket.IO, can serve a lot of different client types (C#, Java, Browser, ...)! I know that there are a lot of Socket.IO alternatives, but from my understanding, everything is more or less based on WebSockets. (I know that Socket.IO has fallbacks if WebSockets are not working, but they are more or less "inferior workarounds", so to speak...)
My question is: Is there any comparable real-time engine available that is NOT based on WebSockets, but can still serve all those different clients?
You don't say what your endpoints are. If one of the endpoints is a browser with purely the built-in capabilities of the browser and Javascript, then a webSocket is your only way to get a continuous connection from the browser to some other destination.
If a webSocket is not supported (in an older browser), then the other socket.io fallbacks (such as xhr-long-polling) are the next best alternatives. As the browser has limited communication capabilities, if you can't use a webSocket, then an ajax call is your only other generally supported option without requiring plug-ins on each browser (such as Flash or Java or something like that). socket.io already supports the next-best options that are available in a browser - you can't do better than that if you're talking about a standard browser with no custom plug-ins.
If your endpoints don't necessarily include a browser and you can use any language or library you want, then you can use plain TCP sockets and then use whatever protocol you want over a TCP socket.
The WebSocket protocol establishes a bidirectional communication channel between server and client; they kind of speak more naturally with each other. The server can just send something to the client and the other way around. In http it just goes in one direction, there's a request and a response and everything needs to be initiated with a request from the client.
From my experience, realtime webApps like a multiplayer game or a chat become easier to develop with WebSockets, and it apparently creates less overhead than using http. Still, you can do the same things more or less elegantly with http as well (see e.g. long polling).
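For readers unfamiliar with long polling, the idea is that the server holds each request open until it has something to say or a timeout expires, and the client immediately asks again. The sketch below models only the server-side waiting logic with an in-process class; LongPollChannel is a hypothetical name, and a real deployment would wrap this in HTTP request handlers.

```python
import threading
from collections import deque
from typing import Deque, Optional


class LongPollChannel:
    """In-process stand-in for a long-polling endpoint (illustrative only)."""

    def __init__(self) -> None:
        self._messages: Deque[str] = deque()
        self._cond = threading.Condition()

    def publish(self, message: str) -> None:
        # Called by the server when it has something to push.
        with self._cond:
            self._messages.append(message)
            self._cond.notify()

    def poll(self, timeout: float) -> Optional[str]:
        # Blocks like a held-open HTTP request; None models an empty timeout response.
        with self._cond:
            self._cond.wait_for(lambda: len(self._messages) > 0, timeout=timeout)
            if self._messages:
                return self._messages.popleft()
            return None
```

The client side is just a loop that calls poll again as soon as the previous call returns, which is where the extra per-request overhead compared with a WebSocket comes from.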
Look at gmail or other existing webApps, they all use http (so does Socket.io as a fallback) and it works quite well.

SPDY as replacement for Websockets?

First off - I understand SPDY and Websockets aren't the same thing, and that you can run Websockets over SPDY like you can with HTTP, etc.
However - I am wondering if SPDY would be a viable replacement for websockets if I am trying to provide a REST (like) API that also supports server push (bi-directional calls over the same connection).
My current prototype uses websockets (node+socket.io), and works fine. However, my issue with websockets is I am having to dream up my own JSON protocol for routing requests both to and from the server. I'd much rather use REST-style URIs and Headers in requests, which fits better in a REST-based architecture. SPDY seems like it would support this better.
Also, because of the lack of headers, I'm concerned websockets won't fit well in our deployment network, and thinking SPDY would be a better fit again.
However, I've not seen many examples of bidirectional SPDY requests, apart from pushing files to the browser. I would like to push events and data to the browsers, such as:
Content-Type: application/json
{
"id": "ca823f3e233233",
"name": "Greg Brady"
}
but it's not clear to me how the browser/Javascript might "listen" and react to these, as I would with the WebSocket and socket.io APIs.
Let's start from the beginning: why would you want to run WebSockets over SPDY, as opposed to doing an HTTP upgrade? If you upgrade an HTTP connection to WS, then nothing else can use that TCP stream - the WS connection can be idle, but the connection is blocked nonetheless. With SPDY, you can mux multiple requests/responses, and a websocket connection (or even multiple) over the same underlying TCP stream. On a practical note, as of July 2012, WS over SPDY is still a work in progress, so you will have to wait to use SPDY for WebSockets - hopefully not too long though!
But let's assume the support is there... The reason why it's not clear how to listen for "SPDY Push" from JavaScript is that there is no way to do that! A pushed resource goes into your browser's cache, nothing more, nothing less. If you need to stream data to your JavaScript callbacks, then WebSockets or Server-Sent Events (SSE) is the answer.
So, putting it all together:
HTTP adds a lot of overhead for individual small requests (headers, etc)
WebSockets gives you a low-overhead channel, but requires you to implement your own routing
SPDY will significantly reduce the overhead and cost of small HTTP requests (win)
SSE is a good, simple alternative to pushing data to the client (which works today, over SPDY)
You could use SPDY+SSE to meet your goals, and all of that communication can run over the same TCP channel. SPDY requests to the server, SSE push from the server.
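As a rough sketch of the SSE half of that suggestion, this is how a server could serialise the JSON payload from the question into the text/event-stream wire format. The helper name format_sse and the event name "user" are assumptions made for the example; the field names (event, id, data) and the blank-line terminator are part of the SSE format.

```python
import json
from typing import Any, Optional


def format_sse(data: Any, event: Optional[str] = None, event_id: Optional[str] = None) -> str:
    """Serialise one message for a response sent with Content-Type: text/event-stream."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    if event_id is not None:
        lines.append(f"id: {event_id}")
    # Each line of the payload needs its own "data:" field.
    for line in json.dumps(data).splitlines():
        lines.append(f"data: {line}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


message = format_sse({"id": "ca823f3e233233", "name": "Greg Brady"}, event="user", event_id="1")
```

In the browser, the matching listener would be an EventSource with addEventListener("user", ...), which gives the "listen and react" hook the question asks for.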
First some clarifications:
The base WebSocket protocol (RFC 6455) is not layered onto HTTP. The initial handshake for WebSocket connections is HTTP compatible, but once the handshake is completed, the protocol is a framed, bi-directional full-duplex connection with very low overhead (often just 2 bytes per frame of header).
The WebSocket over SPDY idea is a proposal that may or may not see the light of day. In this case, WebSocket is in fact being layered on SPDY. The initial connection/handshake may happen faster due to the nature of SPDY versus HTTP, however, the data frames will have more overhead because the WebSocket header fields are mapped into SPDY header fields.
SPDY aims to be a more efficient replacement for HTTP. WebSocket is an entirely different beast that enables very low-latency bi-directional/full-duplex messaging between the client and server.
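To show where the "2 bytes per frame" figure in the first clarification comes from, here is a minimal encoder for a single unmasked, unfragmented text frame as laid out in RFC 6455. The function name encode_text_frame is made up; it covers only the server-to-client direction, since frames sent by a client must additionally carry a 4-byte masking key.

```python
import struct

OPCODE_TEXT = 0x1


def encode_text_frame(text: str) -> bytes:
    """Encode one unmasked, unfragmented text frame (server-to-client direction)."""
    payload = text.encode("utf-8")
    first = 0x80 | OPCODE_TEXT  # FIN bit set + text opcode
    n = len(payload)
    if n < 126:
        header = struct.pack("!BB", first, n)  # 2-byte header
    elif n < 65536:
        header = struct.pack("!BBH", first, 126, n)  # 16-bit extended length
    else:
        header = struct.pack("!BBQ", first, 127, n)  # 64-bit extended length
    return header + payload
```

Payloads under 126 bytes get the 2-byte header; longer ones cost 4 or 10 bytes, still far less than a typical set of HTTP headers.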
If what you are interested in is server push with a simple API and you don't need super low latency, then you might consider Server-Sent Events, which have an API that is simple and similar to the WebSocket API. Or you could look into one of the many good Comet libraries, which enable server push and, unlike any of the above solutions, also support old browsers.
"However, my issue with websockets is I am having to dream up my own JSON protocol for routing requests both to and from the server."
I wrote a thin RPC layer over socket.io wrapping network calls in promises just for that reason. You can take a peek at it here.
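The library linked above is not reproduced here. As a generic illustration of what such a layer does, the sketch below routes JSON messages by a REST-style method and path and echoes back a correlation id so the caller can match replies to pending requests. MessageRouter and the envelope fields (id, method, path, body, status) are all hypothetical choices, not part of socket.io or of that author's code.

```python
import json
from typing import Any, Callable, Dict, Tuple

Handler = Callable[[Dict[str, Any]], Any]


class MessageRouter:
    """Dispatch JSON envelopes received over a socket to registered handlers."""

    def __init__(self) -> None:
        self._routes: Dict[Tuple[str, str], Handler] = {}

    def route(self, method: str, path: str, handler: Handler) -> None:
        self._routes[(method.upper(), path)] = handler

    def handle(self, raw: str) -> str:
        msg = json.loads(raw)
        key = (str(msg.get("method", "")).upper(), msg.get("path", ""))
        handler = self._routes.get(key)
        if handler is None:
            reply = {"id": msg.get("id"), "status": 404, "body": None}
        else:
            reply = {"id": msg.get("id"), "status": 200, "body": handler(msg.get("body") or {})}
        # The echoed "id" lets the sender resolve the matching promise or future.
        return json.dumps(reply)
```

Because the same envelope works in both directions, the server can also send requests to the client through an identical router on the other side.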

With websockets, is there a place for AJAX?

I'm currently building a realtime application using Node. I'm using socket.io to power my real-time interactions, but have jQuery loaded, so I have AJAX available to me. I initially used socket.io for all my communication between the server and client.
I'm starting to think that AJAX might be better suited for certain cases like doing RESTful transactions asynchronously, because I don't have to write a separate message case in my socket to handle each new transaction as well as write the RESTful routing.
I'm wondering if I am on to something, or if it's best to use sockets for performance or something else I'm not thinking about.
Thanks!
Matt Mueller
Yes, WebSockets (RFC 6455) and Ajax are quite different and serve different purposes.
As you say, with Ajax you can do RESTful requests. This means that you can take advantage of existing HTTP infrastructure, e.g. proxies that cache responses, and use conditional GET requests. An Ajax request may be quite heavyweight, since every Ajax request contains HTTP headers and includes cookies.
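A conditional GET can be sketched in a few lines. The example assumes the server derives an ETag from the response body and compares it with a single If-None-Match value; the function names are hypothetical, and real servers also handle lists of tags, weak validators and "*".

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    # Assumed scheme: a quoted, truncated content hash.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET that may carry If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match == etag:
        return 304, headers, b""  # client copy is still valid, send no body
    return 200, headers, body
```

The 304 path is what lets browsers and intermediate caches revalidate cheaply, something a WebSocket message stream has no equivalent for.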
WebSockets is designed for low-latency bi-directional communication. By design, WebSockets has very little overhead in each message. E.g. WebSocket messages don't have to include any HTTP headers, and may in future be used for VoIP and streaming in both directions.
Another difference is that Ajax can be used with stateless servers. E.g. if you have your web tier load balanced across multiple servers, any server can handle an Ajax request, even after a reboot (or upgrade). WebSockets are "connected" and use a stateful server, so it may be harder to use multiple servers with them.
There are also Server-Sent Events, which are similar to WebSockets in that the server can push data to the client (which can't be done with Ajax without hacks such as Comet), and which also handle automatic reconnection. But they only carry messages in one direction (server to client). See HTML5 Server-Side Event: EventSource vs. wrapped WebSocket.
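To give a feel for what an SSE client has to understand, here is a simplified parser for the event stream text format. parse_sse is a hypothetical helper that handles only "\n" line endings and a fully received string; browsers do this incrementally inside EventSource.

```python
from typing import Dict, Iterator, List


def parse_sse(stream: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per event found in a text/event-stream payload."""
    event: Dict[str, str] = {}
    data_lines: List[str] = []
    for line in stream.split("\n"):
        if line == "":
            # A blank line ends the current event; events without data are dropped.
            if data_lines:
                event["data"] = "\n".join(data_lines)
                yield event
            event, data_lines = {}, []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # one leading space after the colon is not part of the value
        if field == "data":
            data_lines.append(value)
        elif field in ("event", "id", "retry"):
            event[field] = value
```

The id field is what makes automatic reconnection useful: on reconnect the browser sends the last id it saw in a Last-Event-ID request header, so the server can resume from there.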
Those are two completely different technologies and could be used together: with AJAX every exchange is initiated by a request from the client, while with WebSockets, once the connection is open, the server can initiate messages in order to push data to the client.

Any HTTP proxies with explicit, configurable support for request/response buffering and delayed connections?

When dealing with mobile clients it is very common to have multisecond delays during the transmission of HTTP requests. If you are serving pages or services out of a prefork Apache the child processes will be tied up for seconds serving a single mobile client, even if your app server logic is done in 5ms. I am looking for a HTTP server, balancer or proxy server that supports the following:
1. A request arrives at the proxy. The proxy starts buffering the request in RAM or on disk, including headers and POST/PUT bodies. The proxy DOES NOT open a connection to the backend server. This is probably the most important part.
2. The proxy server stops buffering the request when:
   a. A size limit has been reached (say, 4KB), or
   b. The request has been received completely, headers and body
3. Only now, with (part of) the request in memory, a connection is opened to the backend and the request is relayed.
4. The backend sends back the response. Again the proxy server starts buffering it immediately (up to a more generous size, say 64KB.)
5. Since the proxy has a big enough buffer, the backend response is stored completely in the proxy server in a matter of milliseconds, and the backend process/thread is free to process more requests. The backend connection is immediately closed.
6. The proxy sends back the response to the mobile client, as fast or as slow as it is capable of, without having a connection to the backend tying up resources.
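The decision in the first half of that list (keep buffering, or open the backend connection now) can be sketched as a small class. RequestBuffer is a hypothetical name and this is not taken from any existing proxy; it assumes requests that declare a Content-Length and does not handle chunked bodies.

```python
class RequestBuffer:
    """Decides when a buffering proxy should first contact the backend."""

    def __init__(self, size_limit: int = 4096) -> None:
        self.size_limit = size_limit
        self.data = bytearray()

    def _complete(self) -> bool:
        head, sep, body = bytes(self.data).partition(b"\r\n\r\n")
        if not sep:
            return False  # headers not finished yet
        length = 0
        for line in head.split(b"\r\n")[1:]:  # skip the request line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        return len(body) >= length

    def feed(self, chunk: bytes) -> bool:
        """Add bytes from the client; True means 'open the backend connection now'."""
        self.data.extend(chunk)
        return len(self.data) >= self.size_limit or self._complete()
```

A slow mobile client can trickle bytes into feed for seconds while no backend worker is occupied; only when it returns True does the proxy relay what it has.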
I am fairly sure you can do 4-6 with Squid, and nginx appears to support 1-3 (and looks fairly unique in this respect). My question is: is there any proxy server that emphasizes these buffering and not-opening-connections-until-ready capabilities? Maybe there is just a bit of Apache config-fu that makes this buffering behaviour trivial? Is there one that is not a dinosaur like Squid and that supports a lean single-process, asynchronous, event-based execution model?
(Siderant: I would be using nginx but it doesn't support chunked POST bodies, making it useless for serving stuff to mobile clients. Yes, cheap $50 handsets love chunked POSTs... sigh)
What about using both nginx and Squid (client — Squid — nginx — backend)? When returning data from a backend, Squid does convert it from C-T-E: chunked to a regular stream with Content-Length set, so maybe it can normalize POST also.
Nginx can do everything you want. The configuration parameters you are looking for are
http://wiki.codemongers.com/NginxHttpCoreModule#client_body_buffer_size
and
http://wiki.codemongers.com/NginxHttpProxyModule#proxy_buffer_size
Fiddler, a free tool from Telerik, does at least some of the things you're looking for.
Specifically, go to Rules | Custom Rules... and you can add arbitrary Javascript code at all points during the connection. You could simulate some of the things you need with sleep() calls.
I'm not sure this method gives you the fine buffering control you want, however. Still, something might be better than nothing?
Squid 2.7 can support 1-3 with a patch:
http://www.squid-cache.org/Versions/v2/HEAD/changesets/12402.patch
I've tested this and found it to work well, with the proviso that it only buffers to memory, not disk (unless it swaps, of course, and you don't want this), so you need to run it on a box that's appropriately provisioned for your workload.
Chunked POSTs are a problem for most servers and intermediaries. Are you sure you need support? Usually clients should retry the request when they get a 411.
Unfortunately, I'm not aware of a ready-made solution for this. In the worst case scenario, consider developing it yourself, say, using Java NIO -- it shouldn't take more than a week.
