Websockets: how to handle sending different data to many clients - websocket

I'm having a play around with websockets and I'm having a bit of trouble wrapping my head around some stuff. Specifically, being able to send a whole bunch of subscribers different data without using a stupid amount of resources.
For example, if you had some sort of twitter like service, how would you send all followers of a person a newly posted tweet that they have made (and do the same for the other hundreds of people doing the same). It just seems that handling that many separate people is a bit absurd.
Can someone talk me through how you would go about treating each client individually? Please tell me if I have the whole idea of websockets wrong.
Thanks in advance!
P.S. for reference, I'm probably going to play around using either node or clojure (with aleph)

Use an established messaging protocol and broker on top of websockets.
It seems you are looking at websockets at the application layer when it is more of a network protocol. A variety of messaging APIs exists (such as JMS) with open source message brokers that are designed to do the complex and scalable message routing.

Related

SPA: using websockets only. Why not?

I am redesigning a web application which previously has been rendered server side to a Single Page Application and started to read about websockets . The web application will be using sockets to have new records and/or messages pushed to the client. I have been wondering why most pages which make use of sockets don't handle all their communication over the socket. Most of the times there is RESTful backend in addition to the websocket. Would it be a bad idea to have the client query for new resources over the socket? If so why - other than that a RESTful api might be easier to use with other devices?
I can imagine that using websockets would probably not be the best idea in case the network connection is kind of bad like on mobile devices, but that probably should work quite well with a reasonable connection to the web.
I found this related question, however it is from 2011 and seems a little outdated:
websocket api to replace rest api?
No, it won´t be a bad idea. Actually I work in an application that uses a WebSocket connection for all what is data interaction, the web server only handles requests for resources, views under different languages, dimensions .. etc..
The problem may be the lack of frameworks/tools based on a persistent connection. For many years most of frameworks, front and back end, have been designed and built around the request/response model. The approach shift may be no so easy to accept.
Coming back to this question a few years later, I would like to point out a few aspects to illustrate that having all your communication through websockets does have its drawbacks:
there is no common support for compression. You can easily configure your webserver to compress http requests and browsers have been known to happily accept compressed responses for years, however for web sockets it is still not that easy (even though the situation has improved)
client frameworks often are build upon commonly used standards like rest. The further away you move from frameworks expectations, the less addons or features will be available.
caching in the browser is not as easy. By now this goes a long way, reaching into the realm of offline availability and PWAs.
when using technology, that is only used by a subset of users, it is more likely to find new bugs, or bugs might take longer to fix. And if it's not bugs, there might be an edge case somewhere around the corner. This isn't an issue per se - but something to be aware of. If one runs into those things, they often easily take up quite some time to fix or work around.

realtime app need push service advice

I am looking for a realtime hosted push/socket service (paid is fine) which will handle many connections/channels from many clients (JS) and server api which can subscribe/publish to those channels from a PHP script.
Here is an example:
The client UI has a fleet of 100 trucks rendered, when a truck is modified its data is pushed to channel (eg. /updates/truck/34) to server (PHP subscriber), DB is updated and receipt/data is sent back to the single truck channel.
We have a prototype working in Firebase.io but we don't need the firebase database, we just need the realtime transmission. One of the great features of firebase.io is that its light and we can subscribe to many small channels at once. This helps reduce payload as only that object data that has changed is transmitted.
Correct me if I am wrong but I think pusher and pubnub will allow me to create 100 truck pub/subs (in this example) for each client that opens the site?
Can anyone offer a recommendation?
I can confirm that you can use Pusher to achieve this - I work for Pusher.
PubNub previously counted each channel as a connection, but they now seem to have introduced multiplexing. This FAQ states you can support 100 channels over the multiplexed connection.
So, both of these services will be able to achieve what you are looking for. There will also be more options available via this Realtime Web Tech guide which I maintain.
[I work for Firebase]
Firebase should continue to work well for you even if you don't need the persistence features. We're not aware of any case where our persistence actually makes things harder, and in many cases it actually makes your life a lot easier. For example, you probably want to be able to ask "what is the current position of a truck" without needing to wait for the next time an update is sent.
If you've encountered a situation where persistence is actually a negative for you, we'd love to hear about it. That certainly isn't our intention.
Also - we're not Firebase.io -- we're just Firebase (though we do own the firebase.io domain name).

nodejs: Ajax vs Socket.IO, pros and cons

I thought about getting rid of all client-side Ajax calls (jQuery) and instead use a permanent socket connection (Socket.IO).
Therefore I would use event listeners/emitters client-side and server-side.
Ex. a click event is triggered by user in the browser, client-side emitter pushes the event through socket connection to server. Server-side listener reacts on incoming event, and pushes "done" event back to client. Client's listener reacts on incoming event by fading in DIV element.
Does that make sense at all?
Pros & cons?
There is a lot of common misinformation in this thread that is very inaccurate.
TL/DR;
WebSocket replaces HTTP for applications! It was designed by Google with the help of Microsoft and many other leading companies. All browsers support it. There are no cons.
SocketIO is built on top of the WebSocket protocol (RFC 6455). It was designed to replace AJAX entirely. It does not have scalability issues what-so-ever. It works faster than AJAX while consuming an order of magnitude fewer resources.
AJAX is 10 years old and is built on top of a single JavaScript XMLHTTPRequest function that was added to allow callbacks to servers without reloading the entire page.
In other words, AJAX is a document protocol (HTTP) with a single JavaScript function.
In contrast, WebSocket is a application protocol that was designed to replace HTTP entirely. When you upgrade an HTTP connection (by requesting WebSocket protocol), you enable two-way full duplex communication with the server and no protocol handshaking is involved what so ever. With AJAX, you either must enable keep-alive (which is the same as SocketIO, only older protocol) or, force new HTTP handshakes, which bog down the server, every time you make an AJAX request.
A SocketIO server running on top of Node can handle 100,000 concurrent connections in keep-alive mode using only 4gb of ram and a single CPU, and this limit is caused by the V8 garbage collection engine, not the protocol. You will never, ever achieve this with AJAX, even in your wildest dreams.
Why SocketIO so much faster and consumes so much fewer resources
The main reasons for this is again, WebSocket was designed for applications, and AJAX is a work-around to enable applications on top of a document protocol.
If you dive into the HTTP protocol, and use MVC frameworks, you'll see a single AJAX request will actually transmit 700-900 bytes of protocol load just to AJAX to a URL (without any of your own payload). In striking contrast, WebSocket uses about 10 bytes, or about 70x less data to talk with the server.
Since SocketIO maintains an open connection, there's no handshake, and server response time is limited to round-trip or ping time to the server itself.
There is misinformation that a socket connection is a port connection; it is not. A socket connection is just an entry in a table. Very few resources are consumed, and a single server can provide 1,000,000+ WebSocket connections. An AWS XXL server can and does host 1,000,000+ SocketIO connections.
An AJAX connection will gzip/deflate the entire HTTP headers, decode the headers, encode the headers, and spin up a HTTP server thread to process the request, again, because this is a document protocol; the server was designed to spit out documents a single time.
In contrast, WebSocket simply stores an entry in a table for a connection, approximately 40-80 bytes. That's literally it. No polling occurs, at all.
WebSocket was designed to scale.
As far as SocketIO being messy... This is not the case at all. AJAX is messy, you need promise/response.
With SocketIO, you simply have emitters and receivers; they don't even need to know about each-other; no promise system is needed:
To request a list of users you simply send the server a message...
socket.emit("giveMeTheUsers");
When the server is ready, it will send you back another message. Tada, you're done. So, to process a list of users you simply say what to do when you get a response you're looking for...
socket.on("HereAreTheUsers", showUsers(data) );
That's it. Where is the mess? Well, there is none :) Separation of concerns? Done for you. Locking the client so they know they have to wait? They don't have to wait :) You could get a new list of users whenever... The server could even play back any UI command this way... Clients can connect to each other without even using a server with WebRTC...
Chat system in SocketIO? 10 lines of code. Real-time video conferencing? 80 lines of code Yes... Luke... Join me. use the right protocol for the job... If you're writing an app... use an app protocol.
I think the problem and confusion here is coming from people that are used to using AJAX and thinking they need all the extra promise protocol on the client and a REST API on the back end... Well you don't. :) It's not needed anymore :)
yes, you read that right... a REST API is not needed anymore when you decide to switch to WebSocket. REST is actually outdated. if you write a desktop app, do you communicate with the dialog with REST? No :) That's pretty dumb.
SocketIO, utilizing WebSocket does the same thing for you... you can start to think of the client-side as simple the dialog for your app. You no longer need REST, at all.
In fact, if you try to use REST while using WebSocket, it's just as silly as using REST as the communication protocol for a desktop dialog... there is absolutely no point, at all.
What's that you say Timmy? What about other apps that want to use your app? You should give them access to REST? Timmy... WebSocket has been out for 4 years... Just have them connect to your app using WebSocket, and let them request the messages using that protocol... it will consume 50x fewer resources, be much faster, and 10x easier to develop... Why support the past when you're creating the future?
Sure, there are use cases for REST, but they are all for older and outdated systems... Most people just don't know it yet.
UPDATE:
A LOT of people have been asking me recently how can they start writing an app in 2018 (and now soon 2019) using WebSockets, that the barrier seems really high, that once they play with Socket.IO they don't know where else to turn or what to learn.
Fortunately the last 3 years have been very kind to WebSockets...
There are now 3 major frameworks that support BOTH REST and WebSocket, and even IoT protocols or other minimal/speedy protocols like ZeroMQ, and you don't have to worry about any of it; you just get support for it out of the box.
Note: Although Meteor is by far the most popular, I am leaving it out because although they are a very, very well-funded WebSocket framework, anyone who has coded with Meteor for a few years will tell you, it's an internal mess and a nightmare to scale. Sort of like WordPress is to PHP, it is there, it is popular, but it is not very well made. It's not well-thought out, and it will soon die. Sorry Meteor folks, but check out these 3 other projects compared to Meteor, and you will throw Meteor away the same day :)
With all of the below frameworks, you write your service once, and you get both REST and WebSocket support. What's more, it's a single line of config code to swap between almost any backend database.
Feathers Easiest to use, works the same on the front and backend, and supports most features, Feathers is a collection of light-weight wrappers for existing tools like express. Using awesome tools like feathers-vuex, you can create immutable services that are fully mockable, support REST, WebSocket and other protocols (using Primus), and get free full CRUD operations, including search and pagination, without a single line of code (just some config). Also works really great with generated data like json-schema-faker so you can not only fully mock things, you can mock it with random yet valid data. You can wire up an app to support type-ahead search, create, delete and edit, with no code (just config). As some of you may know, proper code-through-config is the biggest barrier to self-modifying code. Feathers does it right, and will push you towards the front of the pack in the future of app design.
Moleculer Moleculer is unfortunately an order of magnitude better at the backend than Feathers. While feathers will work, and let you scale to infinity, feathers simply doesn't even begin to think about things like production clustering, live server consoles, fault tolerance, piping logs out of the box, or API Gateways (while I've built a production API gateway out of Feathers, Moleculer does it way, way better). Moleculer is also the fastest growing, both in popularity and new features, than any WebSocket framework.
The winning strike with Moleculer is you can use a Feathers or ActionHero front-end with a Moleculer backend, and although you lose some generators, you gain a lot of production quality.
Because of this I recommend learning Feathers on the front and backend, and once you make your first app, try switching your backend to Moleculer. Moleculer is harder to get started with, but only because it solves all the scaling problems for you, and this information can confuse newer users.
ActionHero Listed here as a viable alternative, but Feathers and Moleculer are better implementations. If anything about ActionHero doesn't Jive with you, don't use it; there are two better ways above that give you more, faster.
NOTE: API Gateways are the future, and all 3 of the above support them, but Moleculer literally gives you it out of the box. An API gateway lets you massage your client interaction, allowing caching, memoization, client-to-client messaging, blacklisting, registration, fault tolerance and all other scaling issues to be handled by a single platform component. Coupling your API Gateway with Kubernetes will let you scale to infinity with the least amount of problems possible. It is the best design method available for scalable apps.
Update for 2021:
The industry has evolved so much that you don't even need to pay attention to the protocol. GraphQL now uses WebSockets by default! Just look up how to use subscriptions, and you're done. The fastest way to handle it will occur for you.
If you use Vue, React or Angular, you're in luck, because there is a native GraphQL implementation for you! Just call your data from the server using a GraphQL subscription, and that data object will stay up to date and reactive on it's own.
GraphQL will even fall-back to REST for you when you need to use legacy systems, and subscriptions will still update using sockets. Everything is solved when you move to GraphQL.
Yes, if you thought "WTH?!?" when you heard you can simply subscribe, like with FireBase, to a server object, and it will update itself for you. Yes. That's now true. Just use a GraphQL subscription. It will use WebSockets.
Chat system? 1 line of code.
Real time video system? 1 line of code.
Video game with 10mb of open world data shared across 1m real-time users? 1 line of code. The code is just your GQL query now.
As long as you build or use the right back-end, all this realtime stuff is now done for you with GQL subscriptions. Make the switch as soon as you can and stop worrying about protocols.
Socket.IO uses persistent connection between client and server, so you will reach a maximum limit of concurrent connections depending on the resources you have on server side, while more Ajax async requests can be served with the same resources.
Socket.IO is mainly designed for realtime and bi-directional connections between client and server and in some applications there is no need to keep permanent connections. On the other hand Ajax async connections should pass the HTTP connection setup phase and send header data and all cookies with every request.
Socket.IO has been designed as a single process server and may have scalability issues depending server resources that you are bound to.
Socket.IO in not well suited for applications when you are better to cache results of client requests.
Socket.IO applications face with difficulties with SEO optimization and search engine indexing.
Socket.IO is not a standard and not equivalent to W3C Web Socket API, It uses current Web Socket API if browser supports, socket.io created by a person to resolve cross browser compatibility in real time apps and is so young, about 1 year old. Its learning curve, less developers and community resources compared with ajax/jquery, long term maintenance and less need or better options in future may be important for developer teams to make their code based on socket.io or not.
Sending one way messages and invoking callbacks to them can get very messy.
$.get('/api', sendData, returnFunction); is cleaner than
socket.emit('sendApi', sendData); socket.on('receiveApi', returnFunction);
Which is why dnode and nowjs were built on top of socket.io to make things manageable. Still event driven but without giving up callbacks.

How to structure a client-server application with 'push' notifications

EDIT: I forgot to include the prime candidate for web applications: JSON over HTTP/REST + Comet. It combines the best features of the others (below)
Persevere basically bundles everything I need in a server
The focus for Java and such is definitely on Comet servers, but it can't be too hard to use/write a client.
I'm embarking on an application with a server holding data, and clients executing operations which would affect this data, and thus require some sort of notification across all interested/subscribed clients.
The first client will probably be written in WPF, but we'll probably need to add clients written in other languages, e.g. a Java (Swing?) client, and possibly, a web client.
The actual question(s): What protocol should I use to implement this? How easy would it be to integrate with JS, Java and .NET (precisely, C#) clients?
I could use several interfaces/protocols, but it'd be easier overall to use one that is interoperable. Given that interoperability is important, I have researched a few options:
JSON-RPC
lightweight
supports notifications
The only .NET lib I could find, Jayrock doesn't support notifications
works well with JS
also true of XML-based stuff (and possibly, even binary protocols) BUT this would probably be more efficient, thanks to native support
Protobuf/Thrift
IDL makes it easy to spit out model classes in each language
doesn't seem to support notifications
Thrift comes with RPC out of the box, but protobufs don't
not sure about JS
XML-RPC
simple enough, but doesn't support notifications
SOAP: I'm not even sure about this one; I haven't grokked this yet.
seems rather complex
Message Queues/PubSub approach: Not strictly a protocol, but might be fitting
I hardly know anything about them, and got lost amongst the buzzwords`-- JMS? **MQ?
Perhaps combined with some RPC mechanism above, although that might not be strictly necessary, and possibly, overkill.
Other options are, of course, welcome.
I am partial to the pub/sub design you've suggested. I'd take a look at ZeroMQ. It has bindings to C#, Java, and many other platforms.
Bindings list: http://www.zeromq.org/bindings:clr
I also found this conversation on the ZeroMQ dev listing that may answer some questions you have about multiple clients and ZeroMQ: http://lists.zeromq.org/pipermail/zeromq-dev/2010-February/002146.html
As XMPP was mentioned, SIP has a similar functionality. This might be more accessible for you.
We use Servoy for this. It does automatic data broadcasting to web-clients and java-clients. I'm not sure if broadcasts can be sent to other platforms, you might be able to find an answer to that on their forum.
If you want to easily publish events to clients across networks, you may wish to look at a the XMPP standard. (Used by, amongst other things, Jabber and Google Talk.)
See the extension for publish-subscribe functionality.
There are a number of libraries in different languages including C#, Java and Javascript.
You can use SOAP over HTTP to modify the data on the server and SOAP over SMTP to notify the subscribed clients.
OR
The server doesn't know anything about the subscription and the clients call the server by timeout to track updates they are interested in, using XML-RPC, SOAP (generated using WSDL), or simply HTTP GET if there is no need to pass back complex data on tracking.

WebSocket cross-connection communication (Tornado?)

I'm fumbling around a bit with WebSockets, and was pretty pleased with how easy it was to get a Tornado server running that does basic websocket connections. I've never used Tornado before today, and while I like what I've seen there's a few questions that I have regarding it's use.
Primarily, I'm using WebSockets so that I can have low-overhead communications between two or more client machines. (For the purposes of conversation let's just say it's a chat client) Obviously I can connect into the server from multiple machines, and they can all push messages to the server and the server can respond, which is great! But that's not too much better than your standard AJAX requests. If I have a persistent connection I want to be able to push data to the clients as well. The simplest possible scenario is user 1 posts a message to the server and upon receiving it the server immediately pushes it to user 2.
So what would be a good way to accomplish that? As far as I can see in Tornado there's no way to communicate between connections other than placing the message in a datastore somewhere and having all the other connections poll for new info. That strikes me as terribly clunky though, because all you're really doing at that point is moving the polling process from the client to the server.
Of course, I may be barking up the wrong tree entirely here. It's certainly plausible that Tornado simply isn't the right tool for this job, and if that's the case I'd be happy to hear suggestions for alternatives!
Here is a chat server using tornado, WebSockets and redis: https://gist.github.com/pelletier/532067 (Updated: link fixed, thanks #SamidhT)
Though the answer has already been accepted: Using a different service still seems very inefficient to me. Why don't you just go with shared memory + conditional variables / semaphores? You sound like you got a standard Consumer-Producer problem

Resources