Which group messaging technology to use? - ruby

I'm a bit confused: for about 24 hours I have been trying to decide which group broadcasting technology to use in my project.
Basically, what I need is:
create groups (by some backend process)
broadcast messages by any client (1:N, N:N)
(potentially) direct messages (1:1)
(important) authenticate/authorize clients with my own backend (say, through some kind of HTTP API)
to be able to kick specific clients by backend process (or server plugin)
Here is what I will have:
Backend-related process(es) in either Ruby or Haxe
Frontend in JS+Haxe (Flash9), running in the browser, so ideally communicating over ports 80/443, but not necessarily.
So, this technology will have to be easily accessible in Haxe for Flash and preferably Ruby.
I've been thinking about: RabbitMQ (or OpenAMQ), RabbitMQ+STOMP, ejabberd, ejabberd+BOSH, juggernaut (with a need to write a Haxe lib for it).
Any ideas/suggestions?

RabbitMQ, Haxe and as3: http://geekrelief.wordpress.com/2008/12/15/hxamqp-amqp-with-haxe/
RabbitMQ, Ruby and ACLs: http://pastie.org/pastes/368315
You might also want to look at using Nanite with RabbitMQ to manage backend groups: http://brainspl.at/articles/2008/10/11/merbcamp-keynote-and-introducing-nanite
You say you need:
* broadcast messages by any client (1:N, N:N)
* (potentially) direct messages (1:1)
You can easily do both using RabbitMQ. RabbitMQ supports both cases, 1:N pubsub and 1:1 messaging, with 'direct' exchanges.
The direct exchange pattern is as follows:
Any publisher (group member) sends a message to the broker with a 'routing key' such as "yurii". RabbitMQ matches this key with subscription bindings in the routing table (aka "exchange") for you. Each binding represents a subscription by a queue, expressing interest in messages with a given routing key. When the routing and binding keys match, the message is then routed to queues for subsequent consumption by clients (group members). This works for 1:N and 1:1 cases; with N:N building on 1:N.
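The matching step described above can be sketched as a small in-memory model. This is only an illustration of exact-key routing, not the RabbitMQ client API; the class name, the queue names and the "group.chat" key are made up for the example.

```python
from collections import defaultdict


class DirectExchange:
    """Toy in-memory model of direct-exchange routing (not the RabbitMQ API)."""

    def __init__(self):
        # binding key -> names of the queues bound with that key
        self.bindings = defaultdict(set)
        # queue name -> messages waiting to be consumed
        self.queues = defaultdict(list)

    def bind(self, queue, binding_key):
        self.bindings[binding_key].add(queue)

    def publish(self, routing_key, message):
        # A direct exchange delivers only when routing and binding keys match exactly.
        matched = sorted(self.bindings.get(routing_key, ()))
        for queue in matched:
            self.queues[queue].append(message)
        return matched


exchange = DirectExchange()
exchange.bind("yurii-inbox", "yurii")       # 1:1, a key only one queue binds to
exchange.bind("yurii-inbox", "group.chat")  # 1:N, a key shared by group members
exchange.bind("alice-inbox", "group.chat")

direct_targets = exchange.publish("yurii", "hello yurii")
group_targets = exchange.publish("group.chat", "hello everyone")
```

The same mechanism covers both cases: a key bound by one queue behaves like a direct message, and a key bound by many queues behaves like a group broadcast.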
Introduction to the routing model: http://blogs.digitar.com/jjww/2009/01/rabbits-and-warrens/
General intro: http://google-ukdev.blogspot.com/2008/09/rabbitmq-tech-talk-at-google-london.html
You also require:
* (important) authenticate/authorize clients with my own backend (say, through some kind of HTTP API)
Please see the ACLs code for this (link above). There is also an HTTP interface to RabbitMQ, but we have not yet combined the HTTP front end with the ACL code. That shouldn't hold you back though. Please come to the rabbitmq-discuss list, where this topic has been discussed recently.
You also require:
* create groups (by some backend process)
* to be able to kick specific clients by backend process (or server plugin)
I suggest looking at how tools like Nanite and Workling do this. Group creation is not usually part of a messaging system; instead, in RabbitMQ, you create routing patterns using subscriptions. You can kick specific clients by sending messages to them using whichever key they used to bind their consuming queue to the exchange.
Hope this helps!

If you are going to be doing Flash dev, have you looked at SmartfoxServer? It has everything you want and has native Flash client libraries. I used it on a project to manage tens of thousands of connected users.

Well group communication is a slightly different beast than simple messaging / queuing.
Most group communication systems are commercial but there are two (that I know of) open-source / free you can take a look at:
Spread Toolkit
OpenAIS
Ruby bindings for both of these might be tough to find, though. Spread, and probably OpenAIS, view clients as trusted, so a browser-based client doesn't make sense. You'd need to have your browser front-ends talk to one or more group clients on the back-end.

We've been using ActiveMQ. Our vendor who supplies our HR system is using Ruby/ActiveMQ to broadcast and receive updates.

Other open source message brokers which support the Stomp protocol are OpenMQ, which is included in GlassFish V3 and GlassFish 2.1.1 but also works standalone, and soon the JBoss message broker, HornetQ V2.1.
OpenMQ supports temporary queues which are useful for a RPC style communication, but ActiveMQ offers some interesting features in the Stomp adapter too.


Bidirectional client-server communication using Server-Sent Events instead of WebSockets?

It is possible to achieve two-way communication between a client and server using Server Sent Events (SSE) if the clients send messages using HTTP POST and receive messages asynchronously using SSE.
It has been mentioned here that SSE with AJAX would have higher round-trip latency and higher client->server bandwidth, since every HTTP request includes headers, and that WebSockets are better in this case. However, isn't it an advantage of SSE that it can be used with consistent data compression? WebSockets' permessage-deflate supports selective compression, meaning some messages might be compressed while others aren't.
Your best bet in this scenario would be to use an existing WebSockets server: building a WS implementation from scratch is time-consuming, and there is little point since the problem has already been solved. As you've tagged Socket.io, that's a good option to get started. It's an open source tool and easy to use and follow from the documentation.
However, since it is open-source, it doesn't provide some functionality that is critical when you want to stream data in a production level application. There are issues like scalability, interoperability (for endpoints operating on protocols other than WebSockets), fault tolerance, ensuring reliable message ordering, etc.
The real-time messaging infrastructure plus these critical production level features mentioned above are provided as a service called a 'Data Stream Network'. There are a couple of companies providing this, such as Ably, PubNub, etc.
I've worked extensively with Ably, so I'm comfortable sharing an example in Node.js that uses it:
var Ably = require('ably');
var realtime = new Ably.Realtime('YOUR-API-KEY');
var channel = realtime.channels.get('data-stream-a');
// subscribe on devices or database
channel.subscribe(function(message) {
  console.log("Received: " + message.data);
});
// publish from Server A
channel.publish("example", "message data");
You can create a free account to get an API key with 3m free messages per month, should be enough for trying it out properly afaik.
There's also a concept of Reactor functions, which is essentially invoking serverless functions in realtime on AWS, Azure, Gcloud, etc. You can place a database on one side too and log data as it arrives.
Hope this helps!
Yes, it's possible.
You can have more than 1 parallel HTTP connection open, so there's nothing stopping you.
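For the server-to-client half of such a setup, each message is written to a long-lived response served as text/event-stream, while the client sends its own messages with ordinary POST requests on a separate connection. A minimal sketch of the framing, assuming a hypothetical helper called format_sse (it is not part of any framework):

```python
def format_sse(data, event=None, event_id=None):
    """Serialize one message in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # A multi-line payload becomes one "data:" field per line.
    for part in str(data).split("\n"):
        lines.append(f"data: {part}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


frame = format_sse("hello\nworld", event="chat", event_id=7)
```

Sending an id with each event lets a reconnecting browser report the last one it saw through the Last-Event-ID request header, which helps when recovering from dropped connections.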

What are the technologies for building real-time servers?

I am a backend developer and I would like to know what are the common technologies for building real-time servers. I know I could use a service like Firebase, but I really want to create it. I have some experience using Websockets on Java, but I would like to know more ways to achieve a real-time server. When I say real-time, I mean something like Facebook. I also would like to know how to scale real-time servers.
Thank you all!
I've asked the same question in multiple forums. Strangely enough, the common answer is still:
Server-Sent Events (SSE)
But those are mainly ways of transporting or streaming events to the clients. Something needs to be built on top of it. And there are multiple other things to consider, such as:
Considerations for real-time APIs
What events to send to the client
How to send each client only the events they need
How to handle authorization for events
Where to keep state on the event subscriptions (for stateless services)
How to recover from missed events due to lost connections and service crashes
Producing events for search or pagination queries
How to scale
Publish/Subscribe solutions
There are multiple pub/sub solutions out there, such as:
But because of the limitations of a topic-based pub/sub architecture, some of the above questions are still left unanswered and have to be dealt with yourself. One example is lost connections, where Pusher has no fallback, neither does SocketCluster, and PubNub has a limited queue.
Resgate - Realtime API Gateway
An alternative to the traditional topic based pub/sub pattern is using a resource-aware realtime API Gateway, such as Resgate.
Instead of the client subscribing to topics, the gateway keeps track of which resources (objects or arrays) the client has fetched, keeping the client data up to date until it unsubscribes.
As a developer of Resgate, I can really recommend checking it out, as it addresses all of the above questions, is language agnostic, simple and light-weight, and blazingly fast.
Read more at NATS blog.
Let's say you want to scale both in the number of concurrent clients and the number of events that are produced. You will eventually need to ensure each client only gets the data they are interested in, through either traditional topic-based publish/subscribe or resource subscriptions. All of the above solutions handle that.
I also assume all of the above-mentioned solutions scale concurrent clients by allowing you to add more nodes/servers that handle the persistent WebSocket connections.
With Resgate, the first level of scaling is done by simply running multiple instances (it is a simple executable) and adding a load balancer that distributes the connections evenly between them.
Handling 100M concurrent clients
Let's say a single Resgate instance handles 10000 persistent WebSocket connections, and you can add 10000 Resgates (distributed to multiple data centers) to a single NATS Server. This would allow a total of 100M connections. Of course, depending on your data, you might have other scaling issues as well, such as network traffic ;) .
A second layer of scaling (and adding redundancy) would be to replicate the whole setup to different data centers, and have the services synchronize their data between the data centers using other tools like Kafka, CockroachDB, etc.
Scaling data retrieval
With the traditional publish/subscribe solution that only deals with events, you will also have to handle scaling for the HTTP (REST) requests.
With Resgate, this is not required, as resource data is also fetched over the WebSocket connection. This allows Resgate not only to ensure that resource data and events are synchronized (another issue with separate pub/sub solutions), but also to cache the data. If multiple clients request the same data, Resgate will only need to fetch it from the service once, effectively improving scalability.
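The fetch-once idea can be illustrated with a toy gateway-side cache. This is a sketch of the general technique only, not Resgate's implementation or API; ResourceCache, fetch_from_service and the resource id "article.42" are hypothetical names.

```python
class ResourceCache:
    """Toy gateway-side cache: one upstream fetch per resource, many subscribers."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that asks the backing service
        self._data = {}          # resource id -> cached value
        self._subscribers = {}   # resource id -> set of client ids

    def get(self, client_id, resource_id):
        if resource_id not in self._data:
            self._data[resource_id] = self._fetch(resource_id)
        self._subscribers.setdefault(resource_id, set()).add(client_id)
        return self._data[resource_id]

    def unsubscribe(self, client_id, resource_id):
        clients = self._subscribers.get(resource_id, set())
        clients.discard(client_id)
        if not clients:
            # Nobody is watching any more, so drop the cached copy.
            self._subscribers.pop(resource_id, None)
            self._data.pop(resource_id, None)


upstream_calls = []


def fetch_from_service(resource_id):
    upstream_calls.append(resource_id)
    return {"id": resource_id, "title": "example"}


cache = ResourceCache(fetch_from_service)
first = cache.get("client-1", "article.42")
second = cache.get("client-2", "article.42")
```

Both clients receive the same cached object and the service is asked only once. A real gateway would also have to apply incoming events to the cached copy, which this sketch leaves out.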
Butterfly Server .NET is a real-time server written in C# allowing you to create real-time apps. You can see the source at https://github.com/firesharkstudios/butterfly-server-dotnet.

Netty and Channels and Websockets

So, I have built a Netty 3.6.2 based Websockets server application. This application will have many, many users.
The idea is that clients register to listen for information on a topic, and when information flows through the server, the server sends the information to the clients. Sound straightforward so far, right?
I implemented this by building a giant map held in memory, mapping each topic to the clients' Channels. When the server wants to send a message about a topic to all interested clients, it loops over all channels mapped to that topic. Seems straightforward, right?
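A minimal sketch of that topic-to-channel map, written in Python rather than against Netty's Java API. FakeChannel is a stand-in for a real connection object, and all names here are hypothetical.

```python
from collections import defaultdict


class FakeChannel:
    """Stand-in for one client connection; records what was written to it."""

    def __init__(self, name):
        self.name = name
        self.sent = []

    def write(self, message):
        self.sent.append(message)


class TopicRegistry:
    def __init__(self):
        self._topics = defaultdict(set)  # topic -> channels subscribed to it

    def subscribe(self, topic, channel):
        self._topics[topic].add(channel)

    def disconnect(self, channel):
        # Drop a closed channel from every topic so it is never written to again.
        for channels in self._topics.values():
            channels.discard(channel)

    def broadcast(self, topic, message):
        channels = self._topics.get(topic, ())
        for channel in channels:
            channel.write(message)
        return len(channels)


registry = TopicRegistry()
a, b = FakeChannel("a"), FakeChannel("b")
registry.subscribe("prices", a)
registry.subscribe("prices", b)
registry.subscribe("news", a)
delivered = registry.broadcast("prices", "tick")
registry.disconnect(b)
delivered_after = registry.broadcast("prices", "tock")
```

Cleaning up on disconnect matters in a design like this: if closed channels stay in the map, one client that reconnects will appear under several channels.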
However, in some preliminary multi-user testing, I find myself realizing there is not a one-to-one mapping between channel and client. How do I specifically target sending a message to a particular client if not through the channel? I am at a loss....
There should be a one-to-one ratio of clients to open channels. The fact that there is not points to some issue that is not related to Netty.
Thank you for your help.

can I develop a publish subscribe system without using MOM

I am trying to develop a publish/subscribe system.
To this end, I have read some papers and articles regarding it.
They all describe a messaging service as an integral part of a publish/subscribe system.
My question is: can I develop a publish/subscribe system without using MOM such as JMS?
Or am I missing or oversimplifying things?
I do not think you are oversimplifying things. There are stand-alone products available that provide advanced functionality based on publish/subscribe, without being part of a larger MOM system.
One of them is a group of products implementing the Data Distribution Service (DDS) specification, as standardized by the Object Management Group (OMG). Check out this Wikipedia entry for a very brief introduction and list of references.
DDS supports many advanced data management features, such as a strongly typed and content-aware databus, distributed state management and historical data access. Its rich set of Quality of Service settings lets you offload a lot of the complexity from your applications to the middleware. This is all based on the publish/subscribe paradigm.
If you tell me more about your application, I might be able to point you to similar use cases using this technology, if you are interested.
It depends what you mean by "MOM". If you think MOM = JMS then yes, there are plenty of pub/sub applications which are not JMS servers (off the top of my head): 0MQ, TIBCO Rendezvous and the many AMQP implementations around.
I guess my definition of MOM is an infrastructure for reliably getting a message from one system to another in an asynchronous manner. Pub/sub is a feature on top of the message transport which allows a message to be distributed to multiple other systems. Once you get beyond the point of opening a socket and stuffing a bunch of bytes down it, I would argue you are in the realm of MOM.
So, no you don't need JMS to do pub/sub....there are plenty of open-source and closed-source alternatives out there. Which one depends on your requirements and skills.
You can look at multicast, which provides one-to-many communication. Multicast does not require MOM; instead it requires a multicast-enabled IP network. Usually the network routers take care of creating copies of each message and delivering them to the destinations.
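As a rough sketch of what this looks like with plain UDP sockets: the group address and port below are arbitrary values chosen for illustration, and the helper names are hypothetical.

```python
import socket
import struct

GROUP = "239.1.2.3"  # hypothetical group in the administratively scoped range
PORT = 5007          # hypothetical port


def membership_request(group, interface="0.0.0.0"):
    """Pack an ip_mreq struct: group address followed by local interface."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(interface))


def open_receiver(group=GROUP, port=PORT):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining the group is what makes the network deliver copies to this host.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership_request(group))
    return sock


def send(message, group=GROUP, port=PORT, ttl=1):
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # TTL 1 keeps datagrams on the local network segment.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    sock.sendto(message, (group, port))
    sock.close()
```

A subscriber would call open_receiver() and then recvfrom() on the returned socket, while a publisher calls send(b"..."). Plain UDP multicast gives no delivery or ordering guarantees, so anything beyond best-effort delivery has to be built on top.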

Advice on using ZeroMQ

I'm developing a new client-server app (.Net) and have up until now been using WCF, which suits the app's request-response approach nicely. However I've been asked to replace this with a socket-based solution, partly to support non-.Net clients, and future pub-sub/broadcast requirements (I realise WCF is capable, but there are other drivers behind the decision). Having failed miserably at writing my own async socket solution, I'm now looking at ZeroMQ.
My client app has a couple of background threads that periodically request data from the server. Additionally, certain UI actions (e.g. a button click) can trigger a message to the server. WCF made this easy - the code simply called the relevant method on a singleton WCF service proxy (actually I use the Castle Windsor WCF facility which gives me async calling capabilities, but that's probably irrelevant to my question).
I'm not too sure how this approach would translate to ZeroMQ, particularly with regards to managing the sockets - I'm very new to ZeroMQ and still reading the guide. Am I right in saying that I'll need a separate socket for each thread (i.e. the two b/g threads and the UI)? What about socket lifetime - do I create one each time I want to send/receive (presumably inefficient), or create the socket when the thread starts and reuse it for the entire lifetime of the thread?
One thing has to be very clear. ZMQ sockets can connect and talk to ZMQ sockets only.
This means that if I am building a distributed application whose components communicate with each other, I am free to choose any communication approach, as external clients are not exposed to it.
Choosing ZMQ Sockets for such means is a good idea. It allows you to instantly build on many communication patterns like req/rep, push/pull, pub/sub etc and also build more complicated topologies using ZMQ devices.
However, this constraint is not to be taken lightly when external clients are concerned. It forces all external clients to use ZMQ sockets, which might not be ideal. If one of the clients happens to be a browser consuming your web services, then you will need to provide those services through a regular (non-ZMQ) interface.
Is your client app using regular sockets?
Can it be re-written to use ZMQ sockets?
If not, then don't use ZMQ sockets for the external interface,
but only for your internal component communication.
[Edit: Further notes]
ZMQ is a wrapper over sockets that does a few things which are hard to get right by hand:
It achieves higher messaging throughput by batching multiple messages together
It optimizes use of the underlying socket at the same time
A plain socket can send messages to only one other socket, whereas a ZMQ socket can connect to multiple ZMQ sockets
A ZMQ socket based solution can take immediate advantage of various patterns: REQ/REP, PUSH/PULL, PUB/SUB, etc.
However, it is common to mistake ZMQ for a message queue.
A message queue product has other properties, such as message persistence and delivery guarantees, which it provides by implementing a queue for storage.
ZMQ stands for "Zero Messaging Queue"
I have only been learning ZMQ in recent times and have been very happy to use it.
Check out my mini tutorial on ZMQ and see if it makes sense for you to use it:
Regarding castle integration, check out what Henry's done on his fork:
