There are many blogs and discussions about WebSocket and HTTP, and many developers and sites strongly advocate WebSockets, but I still can not understand why.
For example (arguments of WebSocket lovers):
HTML5 Web Sockets represents the next evolution of web communications—a full-duplex, bidirectional communications channel that operates through a single socket over the Web. - websocket.org
HTTP supports streaming: request body streaming(you are using it while uploading large files) and response body streaming.
During making the connection with WebSocket, client, and server exchange data per frame which is 2 bytes each, compared to 8 kilobytes of HTTP header when you do continuous polling.
Why do that 2 bytes not include TCP and under TCP protocols overhead?
GET /about.html HTTP/1.1
Host: example.org
This is ~48 bytes HTTP header.
HTTP chunked encoding - Chunked transfer encoding:
23
This is the data in the first chunk
1A
and this is the second one
3
con
8
sequence
0
So, the overhead per each chunk is not big.
Also, both protocols work over TCP, so all TCP issues with long-live connections are still there.
Questions:
Why is the WebSockets protocol better?
Why was it implemented instead of updating the HTTP protocol?
1) Why is the WebSockets protocol better?
WebSockets is better for situations that involve low-latency communication especially for low latency for client to server messages. For server to client data you can get fairly low latency using long-held connections and chunked transfer. However, this doesn't help with client to server latency which requires a new connection to be established for each client to server message.
Your 48 byte HTTP handshake is not realistic for real-world HTTP browser connections where there is often several kilobytes of data sent as part of the request (in both directions) including many headers and cookie data. Here is an example of a request/response to using Chrome:
Example request (2800 bytes including cookie data, 490 bytes without cookie data):
GET / HTTP/1.1
Host: www.cnn.com
Connection: keep-alive
Cache-Control: no-cache
Pragma: no-cache
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.68 Safari/537.17
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cookie: [[[2428 byte of cookie data]]]
Example response (355 bytes):
HTTP/1.1 200 OK
Server: nginx
Date: Wed, 13 Feb 2013 18:56:27 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: CG=US:TX:Arlington; path=/
Last-Modified: Wed, 13 Feb 2013 18:55:22 GMT
Vary: Accept-Encoding
Cache-Control: max-age=60, private
Expires: Wed, 13 Feb 2013 18:56:54 GMT
Content-Encoding: gzip
Both HTTP and WebSockets have equivalent sized initial connection handshakes, but with a WebSocket connection the initial handshake is performed once and then small messages only have 6 bytes of overhead (2 for the header and 4 for the mask value). The latency overhead is not so much from the size of the headers, but from the logic to parse/handle/store those headers. In addition, the TCP connection setup latency is probably a bigger factor than the size or processing time for each request.
2) Why was it implemented instead of updating HTTP protocol?
There are efforts to re-engineer the HTTP protocol to achieve better performance and lower latency such as SPDY, HTTP 2.0 and QUIC. This will improve the situation for normal HTTP requests, but it is likely that WebSockets and/or WebRTC DataChannel will still have lower latency for client to server data transfer than HTTP protocol (or it will be used in a mode that looks a lot like WebSockets anyways).
Update:
Here is a framework for thinking about web protocols:
TCP: low-level, bi-directional, full-duplex, and guaranteed order transport layer. No browser support (except via plugin/Flash).
HTTP 1.0: request-response transport protocol layered on TCP. The client makes one full request, the server gives one full response, and then the connection is closed. The request methods (GET, POST, HEAD) have specific transactional meaning for resources on the server.
HTTP 1.1: maintains the request-response nature of HTTP 1.0, but allows the connection to stay open for multiple full requests/full responses (one response per request). Still has full headers in the request and response but the connection is re-used and not closed. HTTP 1.1 also added some additional request methods (OPTIONS, PUT, DELETE, TRACE, CONNECT) which also have specific transactional meanings. However, as noted in the introduction to the HTTP 2.0 draft proposal, HTTP 1.1 pipelining is not widely deployed so this greatly limits the utility of HTTP 1.1 to solve latency between browsers and servers.
Long-poll: sort of a "hack" to HTTP (either 1.0 or 1.1) where the server does not respond immediately (or only responds partially with headers) to the client request. After a server response, the client immediately sends a new request (using the same connection if over HTTP 1.1).
HTTP streaming: a variety of techniques (multipart/chunked response) that allow the server to send more than one response to a single client request. The W3C is standardizing this as Server-Sent Events using a text/event-stream MIME type. The browser API (which is fairly similar to the WebSocket API) is called the EventSource API.
Comet/server push: this is an umbrella term that includes both long-poll and HTTP streaming. Comet libraries usually support multiple techniques to try and maximize cross-browser and cross-server support.
WebSockets: a transport layer built-on TCP that uses an HTTP friendly Upgrade handshake. Unlike TCP, which is a streaming transport, WebSockets is a message based transport: messages are delimited on the wire and are re-assembled in-full before delivery to the application. WebSocket connections are bi-directional, full-duplex and long-lived. After the initial handshake request/response, there is no transactional semantics and there is very little per message overhead. The client and server may send messages at any time and must handle message receipt asynchronously.
SPDY: a Google initiated proposal to extend HTTP using a more efficient wire protocol but maintaining all HTTP semantics (request/response, cookies, encoding). SPDY introduces a new framing format (with length-prefixed frames) and specifies a way to layering HTTP request/response pairs onto the new framing layer. Headers can be compressed and new headers can be sent after the connection has been established. There are real world implementations of SPDY in browsers and servers.
HTTP 2.0: has similar goals to SPDY: reduce HTTP latency and overhead while preserving HTTP semantics. The current draft is derived from SPDY and defines an upgrade handshake and data framing that is very similar the the WebSocket standard for handshake and framing. An alternate HTTP 2.0 draft proposal (httpbis-speed-mobility) actually uses WebSockets for the transport layer and adds the SPDY multiplexing and HTTP mapping as an WebSocket extension (WebSocket extensions are negotiated during the handshake).
WebRTC/CU-WebRTC: proposals to allow peer-to-peer connectivity between browsers. This may enable lower average and maximum latency communication because as the underlying transport is SDP/datagram rather than TCP. This allows out-of-order delivery of packets/messages which avoids the TCP issue of latency spikes caused by dropped packets which delay delivery of all subsequent packets (to guarantee in-order delivery).
QUIC: is an experimental protocol aimed at reducing web latency over that of TCP. On the surface, QUIC is very similar to TCP+TLS+SPDY implemented on UDP. QUIC provides multiplexing and flow control equivalent to HTTP/2, security equivalent to TLS, and connection semantics, reliability, and congestion control equivalentto TCP. Because TCP is implemented in operating system kernels, and middlebox firmware, making significant changes to TCP is next to impossible. However, since QUIC is built on top of UDP, it suffers from no such limitations. QUIC is designed and optimised for HTTP/2 semantics.
References:
HTTP:
Wikipedia HTTP Page
W3C List of HTTP related Drafts/Protocols
List of IETF HTTP/1.1 and HTTP/2.0 Drafts
Server-Sent Event:
W3C Server-Sent Events/EventSource Candidate Recommendation
W3C Server-Sent Events/EventSource Draft
WebSockets:
IETF RFC 6455 WebSockets Protocol
IETF RFC 6455 WebSocket Errata
SPDY:
IETF SPDY Draft
HTTP 2.0:
IETF HTTP 2.0 httpbis-http2 Draft
IETF HTTP 2.0 httpbis-speed-mobility Draft
IETF httpbis-network-friendly Draft - an older HTTP 2.0 related proposal
WebRTC:
W3C WebRTC API Draft
List of IETF WebRTC Drafts
IETF WebRTC Overview Draft
IETF WebRTC DataChannel Draft
Microsoft CU-WebRTC Proposal Start Page
QUIC:
QUIC Chrominum Project
IETF QUIC Draft
You seem to assume that WebSocket is a replacement for HTTP. It is not. It's an extension.
The main use-case of WebSockets are Javascript applications which run in the web browser and receive real-time data from a server. Games are a good example.
Before WebSockets, the only method for JavaScript applications to interact with a server was through XmlHttpRequest. But these have a major disadvantage: The server can't send data unless the client has explicitly requested it.
But the new WebSocket feature allows the server to send data whenever it wants. This allows to implement browser-based games with a much lower latency and without having to use ugly hacks like AJAX long-polling or browser plugins.
So why not use normal HTTP with streamed requests and responses
In a comment to another answer you suggested to just stream the client request and response body asynchronously.
In fact, WebSockets are basically that. An attempt to open a WebSocket connection from the client looks like a HTTP request at first, but a special directive in the header (Upgrade: websocket) tells the server to start communicating in this asynchronous mode. First drafts of the WebSocket protocol weren't much more than that and some handshaking to ensure that the server actually understands that the client wants to communicate asynchronously. But then it was realized that proxy servers would be confused by that, because they are used to the usual request/response model of HTTP. A potential attack scenario against proxy servers was discovered. To prevent this it was necessary to make WebSocket traffic look unlike any normal HTTP traffic. That's why the masking keys were introduced in the final version of the protocol.
A regular REST API uses HTTP as the underlying protocol for communication, which follows the request and response paradigm, meaning the communication involves the client requesting some data or resource from a server, and the server responding back to that client. However, HTTP is a stateless protocol, so every request-response cycle will end up having to repeat the header and metadata information. This incurs additional latency in case of frequently repeated request-response cycles.
With WebSockets, although the communication still starts off as an initial HTTP handshake, it is further upgraded to follow the WebSockets protocol (i.e. if both the server and the client are compliant with the protocol as not all entities support the WebSockets protocol).
Now with WebSockets, it is possible to establish a full-duplex and persistent connection between the client and a server. This means that unlike a request and a response, the connection stays open for as long as the application is running (i.e. it’s persistent), and since it is full-duplex, two-way simultaneous communication is possible i.e now the server is capable of initiating communication and 'push' some data to the client when new data (that the client is interested in) becomes available.
The WebSockets protocol is stateful and allows you to implement the Publish-Subscribe (or Pub/Sub) messaging pattern which is the primary concept used in the real-time technologies where you are able to get new updates in the form of server push without the client having to request (refresh the page) repeatedly. Examples of such applications are Uber car's location tracking, Push Notifications, Stock market prices updating in real-time, chat, multiplayer games, live online collaboration tools, etc.
You can check out a deep dive article on Websockets which explains the history of this protocol, how it came into being, what it’s used for and how you can implement it yourself.
Here's a video from a presentation I did about WebSockets and how they are different from using the regular REST APIs: Standardisation and leveraging the exponential rise in data streaming
For the TL;DR, here are 2 cents and a simpler version for your questions:
WebSockets provides these benefits over HTTP:
Persistent stateful connection for the duration of the connection
Low latency: near-real-time communication between server/client due to no overhead of reestablishing connections for each request as HTTP requires.
Full duplex: both server and client can send/receive simultaneously
WebSocket and HTTP protocol have been designed to solve different problems, I.E. WebSocket was designed to improve bi-directional communication whereas HTTP was designed to be stateless, distributed using a request/response model. Other than sharing the ports for legacy reasons (firewall/proxy penetration), there isn't much common ground to combine them into one protocol.
Why is the WebSockets protocol better?
I don't think we can compare them side by side like who is better. That won't be a fair comparison simply because they are solving two different problems. Their requirements are different. It will be like comparing apples to oranges. They are different.
HTTP is a request-response protocol. The client (browser) wants something, the server gives it. That is. If what the data client wants is big, the server might send streaming data to avoid unwanted buffer problems. Here the main requirement or problem is how to make the request from clients and how to response the resources(hypertext) they request. That is where HTTP shine.
In HTTP, only client requests. The server only responds.
WebSocket is not a request-response protocol where only the client can request. It is a socket(very similar to TCP socket). Mean once the connection is open, either side can send data until the underlining TCP connection is closed. It is just like a normal socket. The only difference with TCP socket is WebSocket can be used on the web. On the web, we have many restrictions on a normal socket. Most firewalls will block other ports than 80 and 433 that HTTP used. Proxies and intermediaries will be problematic as well. So to make the protocol easier to deploy to existing infrastructures WebSocket use HTTP handshake to upgrade. That means when the first time connection is going to open, the client sent an HTTP request to tell the server saying "That is not HTTP request, please upgrade to WebSocket protocol".
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
Once the server understands the request and upgraded to WebSocket protocol, none of the HTTP protocols applied anymore.
So my answer is Neither one is better than each other. They are completely different.
Why was it implemented instead of updating the HTTP protocol?
Well, we can make everything under the name called HTTP as well. But shall we? If they are two different things, I will prefer two different names. So do Hickson and Michael Carter .
The other answers do not seem to touch on a key aspect here, and that is you make no mention of requiring supporting a web browser as a client. Most of the limitations of plain HTTP above are assuming you would be working with browser/ JS implementations.
The HTTP protocol is fully capable of full-duplex communication; it is legal to have a client perform a POST with a chunked encoding transfer, and a server to return a response with a chunked-encoding body. This would remove the header overhead to just at init time.
So if all you're looking for is full-duplex, control both client and server, and are not interested in extra framing/features of WebSockets, then I would argue that HTTP is a simpler approach with lower latency/CPU (although the latency would really only differ in microseconds or less for either).
Related
Sorry for what I think might be a silly question but:
My understanding of HTTP 2 is that you create a connection, establish TLS if you want to, then upgrade to 2, and do lots of little queries on that one connection :. getting a speed increase because you're not re-establishing connections and TLS.
If this is true, When it comes to sessions - is each request over the connection treated as an unique request and :. has cookies and headers, or whether they're all treated as sub-requests of the original request?
This has impact on proxies whether you could merge requests from clients into a single stream or not?
My understanding of HTTP 2 is that you create a connection, establish TLS if you want to, then upgrade to 2
Not quite. If you use TLS then using HTTP/2 will be negotiated as part of the TLS negotiation. If you are not using TLS then you can upgrade but browsers only support HTTP/2 over HTTPS so that's the majority of the use cases.
When it comes to sessions - is each request over the connection treated as an unique request and :. has cookies and headers, or whether they're all treated as sub-requests of the original request?
Each request is part of what's call a stream in HTTP/2. It is a unique request, with it's own Cookies and Headers and is unrelated to the previous requests (though see note below for some caveats on this). Conceptually it's really no different than the fact that HTTP/1.1 allows you to send multiple unique, unrelated requests on the one connection - but unlike HTTP/1.1 multiple requests can all be in transit at the same time in HTTP/2 thanks to multiplexing.
This diagram might help explain it: https://freecontent.manning.com/mental-model-graphic-how-is-http-1-1-different-from-http-2/
This has impact on proxies whether you could merge requests from clients into a single stream or not?
I'm not sure what you mean by this?
Note: While HTTP/2 requests are independent of each other and HTTP is still stateless at a higher level, there are a few places when you go lower level where the strict wording of that could be challenged and where there is technically some connection between the requests. For example:
HTTP/2 actually uses HPACK header compression to compress HTTP headers so if the same header is sent twice on different requests (e.g. the same cookie) then the second call will include a reference to the previously received header, rather than repeat the data.
Each request has a unique stream id, which is an increasing integer which is either odd (client initiated) or even (server initiated) so of course the stream ids must be managed by the HTTP/2 implementation so arguable.
HTTP/2 push resources are pushed in response to a previous request stream id.
But these are all really just connection level issues and efficiencies. To a HTTP/2 user (e.g. a web developer) each HTTP/2 request is as independent as it was under HTTP/1.1 and HTTP is still stateless.
The new HTTP/2 protocol comes with some promising features. Some of them:
Multiplexing - a single TCP connection can be used to make multiple HTTP/2 requests and receive multiple responses (to a single origin)
HTTP/2 Server Push - sending server responses to the client without receiving requests, i.e. initiated by the server
Bidirectional connection - HTTP/2 spec - Streams and Multiplexing:
A "stream" is an independent, bidirectional sequence of frames
exchanged between the client and server within an HTTP/2 connection.
The motivation behind HTTP/2 is explained here HTTP/2 FAQ:
HTTP/1.1 has served the Web well for more than fifteen years, but its
age is starting to show.
and
The goal of the Working Group is that typical uses of HTTP/1.x can use HTTP/2 and see some benefit.
So HTTP/2 is nice and comes to replace HTTP/1.x. Unfortunately, HTTP/2 does not support WebSockets. In this question Does HTTP/2 make websockets obsolete? it is made clear that the HTTP/2 Server Push is not an alternative, neither are Server-Sent Events.
Now to the question: What do we use if we want WebSockts functionality over HTTP/2?
Current forms of HTTP/2 Protocol Negotiation:
HTTP/2 connections start in one of three ways:
In an encrypted connection (TLS/SSL) using ALPN (Application Layer Protocol Negotiation). Most browsers require TLS/SSL for HTTP/2 and use this method for HTTP/2 connection establishment.
In clear text, using the HTTP/1.1 Upgrade header (same as Websockets). Most browsers require TLS/SSL for HTTP/2, so this is limited in it's support.
In clear text, using a special string at the beginning of an HTTP/1.1 connection (which could allow HTTP/2 servers in clear text to disable HTTP/1.1 support). Limited client support.
Negotiating the Websocket Protocol, present tense:
Negotiating Websocket connections, at the moment, requires HTTP/1.1 support and makes use of the HTTP/1.1 Upgrade header.
This is often performed by the same application server that listens to the HTTP/1.1 and HTTP/2 connections. Web applications that support concurrency (whether evented or thread based) are usually protocol agnostic (as long as HTTP semantics are preserved) and work well enough on both protocols.
This allows HTTP data to be used during connection establishment (and perhaps effect the Websocket connection state/authentication procedure).
Once the Websocket connection is established, it's totally independent from the HTTP semantics / layer.
Negotiating the Websocket Protocol in an HTTP/2 world:
In an HTTP/2 (only) world, which might be a while into the future, there could be a number of possible approaches to Websocket protocol negotiation: an ALPN based approach and an HTTP/2 "tunnel" (or "stream") approach.
The ALPN approach preserves protocol independence at the expense of the pre-upgrade (HTTP) stage, while the "stream" approach provides the HTTP pre-"upgrade" (or Connect) stage at the expense of high coupling and complexity.
The ALPN Approach:
One possible future approach will simply add the Websocket protocol to the ALPN negotiation table.
At the moment, ALPN is used to select (or default to) the "http/1.1" protocol and the Upgrade request is handled by the HTTP/1.1 server. Which means that Websocket still provides us with the HTTP header data during protocol negotiation (while using it's own TCP/IP connection)
In the future, ALPN might simply add "wss" as an available choice.
Using this approach, the Websocket (which is currently established using the HTTP/1.1 Upgrade header, both in encrypted and clear text forms) could easily be negotiated using the ALPN extension to the TLS/SSL layer.
This will keep the Websocket protocol independent from the HTTP/2 protocol and allow it's use even when HTTP isn't supported.
However, this will come with the downside that cookies and other HTTP headers might be no longer available as part of the protocol negotiation. Another difference (both good and bad) is that this approach will require a separate TCP/IP connection.
The HTTP/2 "tunnel" / "stream" approach
Another possible future approach, which is reflected in this proposed draft, will dispose of the HTTP/1.1 variation of the Websocket protocol in favor of an HTTP/2 "stream" approach.
HTTP/2 "streams" are the way HTTP/2 implements multiplexing and allows multiple requests to be handled concurrently. Each request receives a stream number ID and any data pertaining to this request (headers, responses etc') is identified using the same numerical stream ID.
Under this approach, "Websocket" data will be contain within the HTTP/2 wrapper and the stream ID will be used to identify the "Websocket" stream.
Although this might provide some benefits (HTTP headers and cookies could be provided as part of the Websocket negotiation), it's not without its downfalls.
Higher complexity and tighter protocol coupling are just two examples, both of which are very serious downfalls.
Conclusion:
At the time of this writing, HTTP/1.1 Upgrade semantics are required for Websocket connections, both when using clear text (ws) and encrypted (wss) connections.
The future is, as of yet, undecided and it will probably take a long time before the current Upgrade process (using HTTP/1.1) is phased out
Well your timing is rather apt!
A new version of the internet standards draft was literally just published:
Bootstrapping WebSockets with HTTP/2
Additional information here:
https://github.com/mcmanus/draft-h2ws/blob/master/README.md
And you can follow the discussion in it here:
https://lists.w3.org/Archives/Public/ietf-http-wg/2017OctDec/0032.html
Until this is approved, and then implemented by browsers and servers, I would say that Daniel Haxx’s post that you included in your question represents a very good summary of the current status.
One of your links actually has one answer: you can just use SSE.
Semantically, you can achieve the same things with either websockets or (SSE + POST ). The view that the two technologies address different use cases is, roughly speaking, bikeshedding around "this syntax works better for this".
There are ongoing efforts to port something similar to websockets to HTTP/2, but unless those technologies make possible new uses cases or efficiencies, I see no point.
HTTP/2 is definitely the future trend because it is now the standard of HTTP protocol. As we can see in Can I use, 70.15 percent of browsers support the HTTP/2. But HTTP/2 is so new that there are browsers that only support HTTP/1.x and there are many servers that only support HTTP/1.x. I knew that a client can use HTTP upgrade mechanism to negotiate a proper protocol to communicate with the server. For example, if the server supports HTTP/2, their communicating protocol will switch to HTTP/2, otherwise, HTTP/1.x is used. But this only applies to the situdation where the browser the clients used supports both HTTP/2 and HTTP/1.x, right?
But what if a user on a browser that only supports HTTP/1.x wants to communicate with HTTP/2 only server? Will the server ignore the request or send an error back to the user?
And what if a user on a brower that only supports HTTP/2 wants to communicate with HTTP/1.1 only server? I am thinking the process might go like this: The user sends a connecion preface to the server, the server cannot recognize the request, so the user might receive a connection error message. Is this right?
Or is there any browser that supports only HTTP/2?
It's important to take in consideration that the most implementations of HTTP/2 uses it over TLS 1.2 with ALPN protocol (Application Layer Protocol Negotiation). Thus the client just start the standard TLS connection. As the part of such communication the client sends "Client Hello" to the server with some information:
It's like: "Hi, Tom! It's Bob. I speak German, Russian and English. Let's talk a little". And the server send "Server Hello":
"Hi, Bob! I suggest to speak German or English". Then the client send one more short message "OK, then let's speak German" and he start to speak German without waiting of any response from the server:
The whole communication looks like on the picture below
Because both the client and the server start the communication just using TLS 1.2, which the both know. They start the main communication after the protocol negotiation. Thus the problem which you describe could not exist in the practice.
If a browser only supports HTTP/1.1 and the server only supports HTTP/2, they cannot communicate. The server will not recognize what the client sends (in particular there will be no connection preface, which the server treats - following the specification - as a connection error), and will close the connection.
"A browser that only supports HTTP/2" does not exist; if they support HTTP/2, they also support HTTP/1.1. But let's assume that such browser exist.
In this latter case, the server will see the connection preface and will not recognize the PRI method. What exactly the server does in this case depends on the server. It may return a 400 Bad Request, or perhaps just close the connection, or it may trigger an internal server error.
I've tried to visit a http2 only server with curl --http1.1 -i, here is what I got
HTTP/1.0 403 Forbidden
Content-Type: text/plain
Unknown ALPN Protocol, expected `h2` to be available.
If this is a HTTP request: The server was not configured with the `allowHTTP1` option or a listener for the `unknownProtocol` event.
Recently I've found out of Server-Sent events as a much simpler alternative to WebSockets for doing push from the server. Most places that compare them (like here, here and here) say that if you don't need full duplex communications between client and server, then WebSockets is overkill and SSE are good enough.
My question is what would be the downside of using SSE when you do need bidirectional communications (like a chat for example), using regular ajax requests for sending messages from the client and the server stream for receiving them? Considering that I have to do little to no configuration on the server side to use SSE, it seems to be a much more appealing option.
SSE Advantages over WebSockets:
No special web server or web proxy changes required.
Define custom events (otherwise, client API is basically the same)
Easier integration of existing authentication mechanisms (OAuth, OpenID, etc)
SSE Disadvantages compared to WebSockets:
Unidirectional communication channel (server to client). Client to server requires a separate channel.
Browser support is more limited (no native IE support whereas WebSockets is supported in IE 10): WebSockets, SSE
Relies on client to verify origin (possibly more vulnerable to XSS attacks than WebSockets)
No native support for binary types (WebSockets supports raw frames using ArrayBuffers and Blobs).
Requires a full fledged web server even if the SSE endpoint is not serving static web content (a standalone WebSocket server can be fairly simple)
SSE with AJAX for bi-directional communication will have MUCH higher round-trip latency and higher client->server bandwidth than using a WebSocket connection. This is due to the overhead of connection setup for every client->server AJAX request. Also, server->client latency can have spikes with SSE since in many configurations the long-held connection will eventually be closed (often every 30 seconds) and need to be re-opened causing a temporary spike in server->client latency as well.
References:
http://www.html5rocks.com/en/tutorials/eventsource/basics/
https://developer.mozilla.org/en-US/docs/Server-sent_events
https://developer.mozilla.org/en-US/docs/Server-sent_events/Using_server-sent_events
http://dev.w3.org/html5/eventsource/
Ajax requests are huge compared to small WebSocket messages. Standard HTTP requests (Ajax) include a lot of headers including cookies with every request while WebSocket messages is just a few bytes.
The good thing with HTTP (Ajax) request is that they are easier to cache if that is a benefit for your problem.
First off - I understand SPDY and Websockets aren't the same thing, and that you can run Websockets over SPDY like you can with HTTP, etc.
However - I am wondering if SPDY would be a viable replacement for websockets if I am trying to provide a REST (like) API that also supports server push (bi-directional calls over the same connection).
My current prototype uses websockets (node+socket.io), and works fine. However, my issue with websockets is I am having to dream up my own JSON protocol for routing requests both to and from the server. I'd much rather use REST-style URIs and Headers in requests, which fits better in a REST-based architecture. SPDY seems like it would support this better.
Also, because of the lack of headers, I'm concerned websockets won't fit well in our deployment network, and thinking SPDY would be a better fit again.
However, I've not seen many examples of bidirectional SPDY requests, apart from pushing files to the browser. I would like to push events and data to the browsers, such as:
Content-Type: application/json
{
"id": "ca823f3e233233",
"name": "Greg Brady"
}
but it's not clear to me how the browser/Javascript might "listen" and react to these, as I would with the WebSocket and socket.io APIs.
Let's start from the beginning: why would you want to run WebSockets over SPDY, as opposed to doing an HTTP upgrade? If you upgrade an HTTP connection to WS, then nothing else can use that TCP stream - the WS connection can be idle, but the connection is blocked nonetheless. With SPDY, you can mux multiple requests/responses, and a websocket connection (or even multiple) over the same underlying TCP stream. On a practical note, as of July 2012, WS over SPDY is still a work in progress, so you will have to wait to use SPDY for WebSockets - hopefully not too long though!
But let's assume the support is there... The reason why it's not clear how to listen for "SPDY Push" from JavaScript is because there is no way to do that! A pushed resource goes into your browsers cache - nothing more, nothing less. If you need to stream data to your javascript callbacks, then WebSockets, or Server-Sent Events (SSE) is the answer.
So, putting it all together:
HTTP adds a lot of overhead for individual small requests (headers, etc)
WebSockets gives you a low overhead channel, but requires you implement own routing
SPDY will significantly reduce the overhead and cost of small HTTP requests (win)
SSE is a good, simple alternative to pushing data to the client (which works today, over SPDY)
You could use SPDY+SSE to meet your goals, and all of that communication can run over the same TCP channel. SPDY requests to the server, SSE push from the server.
First some clarifications:
The base WebSocket protocol (IETF 6455) is not layered onto HTTP. The initial handshake for WebSocket connections is HTTP compatible, but once the handshake is completed, the protocol is a framed, bi-directional full-duplex connection with very low overhead (often just 2 bytes per frame of header).
The WebSocket over SPDY idea is a proposal that may or may not see the light of day. In this case, WebSocket is in fact being layered on SPDY. The initial connection/handshake may happen faster due to the nature of SPDY versus HTTP, however, the data frames will have more overhead because the WebSocket header fields are mapped into SPDY header fields.
SPDY aims to be a more efficient replacement for HTTP. WebSocket is an entirely different beast that enables very low-latency bi-directional/full-duplex messaging between the client and server.
If what you are interested in server-push with a simple API and you don't need super low-latency, then you might consider server-sent events which has an API that is simple and similar to the WebSocket API. Or you could look into one of the many good Comet libraries which enable server-push but will better support old browsers unlike any of the above solutions.
However, my issue with websockets is I am having to dream up my own
JSON protocol for routing requests both to and from the server.
I wrote a thin RPC layer over socket.io wrapping network calls in promises just for that reason. You can take a peek at it here.