How to set up a load balancer for clustered MQTT brokers - Spring

I'm working on a project to develop a real-time mobile messaging application that needs advanced message filtering based on message content and the user's balance: if the user has run out of balance, or is sending content that violates policy, the messages have to be blocked.
For this reason I need a load-balancing solution that scans published messages and can decide whether a message should be blocked based on the rules above. A basic proxy won't do, since special rules must be applied to each message.
The difficult part:
The mobile app should preferably receive subscription messages (and connection acknowledgements) without them passing through the load balancer (see my next point).
The problem is that the only way I could forward subscription messages to the mobile app would be to handle connections and subscriptions in the load balancer, which is disastrous. I need the connections to be transparent and the load balancer stateless.
How can I accomplish this? (If it's of any help, my current design involves a Java component built with Spring Boot for load balancing and VerneMQ as the message broker.)

Try:
Mobile App --> MQTT Broker --> Message Scan / Block Algorithm --> MQTT Broker --> Subscriber
Your mobile app should have the intelligence to stop sending messages once, after the first message, it learns that the user has run out of balance.
So the MQTT broker should not send the message to its subscribers directly at its own layer. It should only send out messages that were received back after processing.
Not sure anyone has a ready-made solution for this flow, but it's doable.
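
If you want the scan/block step without terminating the MQTT connections yourself, one option worth checking is VerneMQ's webhooks plugin, which can call an HTTP endpoint for each publish (the auth_on_publish hook). The broker keeps owning connections, acknowledgements, and subscriptions, and your Spring Boot component stays stateless. Below is a minimal sketch of such an endpoint; BalanceService and PolicyService are hypothetical, and the exact JSON field names and response format should be verified against the VerneMQ webhooks documentation.

```java
import java.util.Base64;
import java.util.Map;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical services standing in for your balance and policy checks.
interface BalanceService { boolean hasBalance(String clientId); }
interface PolicyService { boolean violates(byte[] payload); }

// Stateless webhook endpoint for VerneMQ's auth_on_publish hook.
// VerneMQ POSTs one JSON document per PUBLISH; answering "ok" lets the
// message through, answering with an error blocks it. Connections and
// subscriptions stay entirely inside the broker.
@RestController
public class PublishHookController {

    private final BalanceService balanceService;
    private final PolicyService policyService;

    public PublishHookController(BalanceService balanceService,
                                 PolicyService policyService) {
        this.balanceService = balanceService;
        this.policyService = policyService;
    }

    @PostMapping("/vmq/auth_on_publish")
    public Map<String, Object> authOnPublish(@RequestBody Map<String, Object> hook) {
        String clientId = (String) hook.get("client_id");
        // With the base64payload option enabled, the payload arrives
        // base64-encoded in the hook's JSON body.
        byte[] payload = Base64.getDecoder().decode((String) hook.get("payload"));

        if (!balanceService.hasBalance(clientId) || policyService.violates(payload)) {
            return Map.of("result", Map.of("error", "not_allowed"));
        }
        return Map.of("result", "ok");
    }
}
```

Because the hook endpoint holds no connection state, you can run several instances of it behind a plain HTTP load balancer.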

Related

Routing messages from Kafka to web socket clients connected to application server cluster

I would like to figure out the best way to route messages from Kafka to WebSocket clients connected to a load-balanced application server cluster. I understand that spring-kafka facilitates consuming messages from and publishing messages to a Kafka topic, but how does this work in a load-balanced application server scenario when connecting to a distributed Kafka topic? Here are the requirements I would like to satisfy, with the overall goal of facilitating peer-to-peer messaging in an application with a very, very large volume of users:
Web clients can connect to a Tomcat application server over a WebSocket connection via a load balancer.
A web client can send a message/notification to another client that's connected to a different Tomcat application server.
Messages are saved in the database and published to a Kafka topic/partition that can be consumed by the appropriate web clients/users.
Kafka can be scaled to many brokers with many consumers.
I can see how this could be implemented quite easily in a single application server scenario, where the consumer consumes all messages from a Kafka topic and redistributes them via Spring messaging/WebSockets. But I can't figure out how this would work in a load-balanced scenario where there are consumers on each application server forming an overall consumer group for the Kafka topic. Assuming each application server consumes a subset of the topic's partitions, how does it know which server its intended recipients are connected to? And even if it knew which server the recipients were connected to, how would it route the message to them via WebSockets?
I considered making the load balancing work by logging users with a particular routing key (usernames starting with 'A', etc.) on to a specific application server, and then only consuming messages for users starting with 'A' on that server. But this seems like it would be difficult to maintain and would make autoscaling very difficult. This seems like a common scenario to implement, yet I can't find any tools or approaches that fit it.
Sounds like every single consumer should live in its own consumer group. That way, all the available consumers consume all the messages sent to the topic, and therefore all the connected WebSocket clients get notified of those messages.
If you need more complex logic with those messages after consuming, e.g. filtering, routing, transforming, aggregating, etc., you should consider involving Spring Integration in your project: https://spring.io/projects/spring-integration
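
One way to express "every consumer in its own group" with spring-kafka is to derive a unique group id per application instance, so Kafka broadcasts every record to every server, and each server relays it to its locally connected clients. A minimal sketch, assuming a STOMP/WebSocket setup with SimpMessagingTemplate; the topic and destination names are made up.

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.messaging.simp.SimpMessagingTemplate;
import org.springframework.stereotype.Component;

// Each application server instance generates its own consumer group id,
// so Kafka delivers every record on the topic to every instance. Each
// instance then relays the message to its locally connected WebSocket
// users; recipients connected elsewhere are simply not present here.
@Component
public class BroadcastConsumer {

    private final SimpMessagingTemplate messagingTemplate;

    public BroadcastConsumer(SimpMessagingTemplate messagingTemplate) {
        this.messagingTemplate = messagingTemplate;
    }

    // SpEL: a random UUID per instance -> one consumer group per server.
    @KafkaListener(topics = "chat-messages",
                   groupId = "#{T(java.util.UUID).randomUUID().toString()}")
    public void onMessage(String message) {
        // "chat-messages" and the destination below are assumed names.
        messagingTemplate.convertAndSend("/topic/messages", message);
    }
}
```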
Broadcasting to all the consumers may work, but the most efficient solution routes each message precisely to the node holding the WebSocket connection for the target user. As far as I know, routing in a distributed system can be done as follows:
Put the routing information in a middleware such as Redis, or implement a service yourself that keeps track of all the sessions. That is, solve it in a centralized way (see the sketch below).
Let the WebSocket servers discover the routes by themselves. In this case, a gossip-style protocol should be taken into consideration.
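
A minimal sketch of the first (centralized) option, assuming Spring Data Redis: on WebSocket connect, each server records itself as the route for that user, and Kafka consumers look the route up before forwarding. The key layout and the INSTANCE_ID environment variable are assumptions.

```java
import java.time.Duration;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Component;

// Centralized route table in Redis: userId -> id of the server instance
// currently holding that user's WebSocket session.
@Component
public class SessionRouteRegistry {

    private static final Duration TTL = Duration.ofMinutes(30);
    private final StringRedisTemplate redis;
    private final String instanceId; // assumed to come from the environment

    public SessionRouteRegistry(StringRedisTemplate redis) {
        this.redis = redis;
        this.instanceId = System.getenv().getOrDefault("INSTANCE_ID", "node-1");
    }

    // Called from the WebSocket connect handler.
    public void register(String userId) {
        redis.opsForValue().set("ws:route:" + userId, instanceId, TTL);
    }

    // Called from the disconnect handler.
    public void unregister(String userId) {
        redis.delete("ws:route:" + userId);
    }

    // Consumers check whether the target user is connected to this node
    // before pushing the message down a local WebSocket.
    public boolean isLocal(String userId) {
        return instanceId.equals(redis.opsForValue().get("ws:route:" + userId));
    }
}
```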

WebSocket client disconnect due to network loss doesn't get intercepted by Spring server

I have an application in which clients use WebSockets to connect to a server running Spring Boot with Tomcat.
My question is whether there is a way for the server to detect a client disconnect due to network loss.
Thanks.
If you are using STOMP, check SessionDisconnectEvent.
For raw WebSocket connections, you can use WebSocketHandler#afterConnectionClosed.
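
A minimal sketch of both hooks (the event listener covers STOMP sessions, the handler override covers raw WebSocket connections). Note that on a hard network loss neither may fire until the underlying TCP connection times out, which is why the heartbeat approach in the next answer is often added on top.

```java
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;
import org.springframework.web.socket.CloseStatus;
import org.springframework.web.socket.WebSocketSession;
import org.springframework.web.socket.handler.TextWebSocketHandler;
import org.springframework.web.socket.messaging.SessionDisconnectEvent;

// STOMP: Spring publishes a SessionDisconnectEvent when a STOMP session ends.
@Component
class StompDisconnectListener {

    @EventListener
    public void onDisconnect(SessionDisconnectEvent event) {
        System.out.println("STOMP session closed: " + event.getSessionId()
                + ", status: " + event.getCloseStatus());
    }
}

// Raw WebSocket: override afterConnectionClosed on the handler.
class MyWebSocketHandler extends TextWebSocketHandler {

    @Override
    public void afterConnectionClosed(WebSocketSession session, CloseStatus status)
            throws Exception {
        System.out.println("WebSocket closed: " + session.getId()
                + ", status: " + status);
    }
}
```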
I have searched for this before, and the solution I was able to find was to implement a ping-pong mechanism between the server and the clients.
For example, every few seconds send a dummy message to the client on a specific topic and receive a dummy reply back; if you don't get a reply for a configured period, you can consider the client disconnected (see the sketch after the quote below).
As mentioned here:
STOMP and Spring also allow us to set up topics, where every subscriber will receive the same message. This is going to be very useful for tracking active users. In the UI, each user subscribes to a topic that reports back which users are active, and in our example that topic will produce a message every 2 seconds. The client will reply to every message containing a list of users with its own heartbeat, which then updates the message being sent to other clients. If a client hasn't checked in for more than 5 seconds (i.e. missed two heartbeats), we consider them offline. This gives us near real-time resolution of users being available to chat. Users will appear in a box on the left-hand side of the screen, clicking on a name will pull up a chat window for them, and names with an envelope next to them have new messages.
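
A minimal sketch of such a heartbeat over STOMP, roughly following the quoted scheme (ping every 2 seconds, offline after 5 seconds of silence). It assumes @EnableScheduling is active on the application and uses made-up destination names.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.messaging.handler.annotation.MessageMapping;
import org.springframework.messaging.simp.SimpMessagingTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Controller;

// Application-level heartbeat over STOMP. Destinations and the 5-second
// timeout are assumptions, not a fixed Spring convention.
@Controller
public class PresenceController {

    private static final Duration TIMEOUT = Duration.ofSeconds(5);
    private final Map<String, Instant> lastSeen = new ConcurrentHashMap<>();
    private final SimpMessagingTemplate messagingTemplate;

    public PresenceController(SimpMessagingTemplate messagingTemplate) {
        this.messagingTemplate = messagingTemplate;
    }

    // Clients reply to each ping by sending their id to /app/heartbeat.
    @MessageMapping("/heartbeat")
    public void heartbeat(String userId) {
        lastSeen.put(userId, Instant.now());
    }

    // Every 2 seconds: ping all subscribers, then drop anyone who has
    // been silent longer than the timeout (i.e. missed two heartbeats).
    @Scheduled(fixedRate = 2000)
    public void pingAndSweep() {
        messagingTemplate.convertAndSend("/topic/presence", "ping");
        Instant cutoff = Instant.now().minus(TIMEOUT);
        lastSeen.entrySet().removeIf(e -> e.getValue().isBefore(cutoff));
    }

    public boolean isOnline(String userId) {
        return lastSeen.containsKey(userId);
    }
}
```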

MSMQ WCF life cycle on a web site

In the past I've created a console application that listens to a queue using WCF, and had no problems with that implementation.
My question:
If, instead of listening to the queue in a console application, I listen to a queue through my website, when would the message be picked up? Would it be instant, as is the case with the console app? Or would the message only be received when someone requests a page on the site?
Regards.
A website is not a good host container for an MSMQ client. The reason is that the app pool unloads during times of low traffic.
So effectively you are correct: you will not consume messages until the app pool is loaded.
However, that does not prevent others from sending you messages, as the queue receives them regardless of whether your client is loaded or not. They are then stored until the client comes back to consume them (provided the queues are durable).
A Windows service would be a much more appropriate container.

Build durable architecture with WebSphere MQ clients

How can you create a durable architecture using MQ client and server if the clients neither let you persist messages nor provide assured delivery?
Just trying to figure out how you can build a scalable/durable architecture if the clients don't appear to contain any of the components required to persist data.
Thanks,
S
Middleware messaging was born of the need to persist data locally to mitigate the effects of failures of the remote node or of the network. The idea at the time was that the queue manager was installed locally on the box where the application lives and was treated as part of the transport stack. For instance you might install TCP and WMQ as a transport and some apps would use TCP while others used WMQ.
In the intervening 20 years, the original problems that led to the creation of MQSeries (Now WebSphere MQ) have largely been solved. The networks have improved by several nines of availability and high availability hardware and software clustering have provided options to keep the different components available 24x7.
So the practices in widespread use today to address your question follow two basic approaches. Either make the components highly available so that the client can always find a messaging server, or put a QMgr where the application lives in order to provide local queueing.
The default operation of MQ is that when a message is sent (MQPUT or in JMS terms producer.send), the application does not get a response back on the MQPUT call until the message has reached a queue on a queue manager. i.e. MQPUT is a synchronous call, and if you get a completion code of OK, that means that the queue manager to which the client application is connected has received the message successfully. It may not yet have reached its ultimate destination, but it has reached the protection of an MQ Server, and therefore you can rely on MQ to look after the message and forward it on to where it needs to get to.
Whether client connected, or locally bound to the queue manager, applications sending messages are responsible for their data until an MQPUT call returns successfully. Similarly, receiving applications are responsible for their data once they get it from a successful MQGET (or JMS consumer.receive) call.
There are multiple levels of message protection available.
If you are using non-persistent messages and asynchronous PUTs, then you are effectively saying it doesn't matter too much whether the messages reach their destination (although they generally will).
If you want MQ to really look after your messages, use synchronous PUTs as described above, persistent messages, and perform your PUTs and GETs within transactions (aka syncpoint) so you have full application control over the commit points (a minimal JMS sketch of this combination follows below).
If you have very unreliable networks such that you expect to regularly fail to get the messages to a server, and expect to need regular retries such that you need client-side message protection, one option you could investigate is MQ Telemetry (e.g. in WebSphere MQ V7.1) which is designed for low bandwidth and/or unreliable network communications, as a route into the wider MQ.
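
A minimal JMS sketch of the full-protection combination described above: persistent messages sent inside a transacted session, with the commit as the point after which MQ owns the data. Connection-factory setup is omitted (it would typically be an MQConnectionFactory configured for your queue manager), and the queue name is a placeholder.

```java
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.DeliveryMode;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import javax.jms.TextMessage;

public class ProtectedSender {

    public void send(ConnectionFactory connectionFactory, String text) throws Exception {
        Connection connection = connectionFactory.createConnection();
        try {
            // transacted = true: nothing is visible to consumers until commit.
            Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
            Queue queue = session.createQueue("APP.REQUEST.QUEUE"); // placeholder
            MessageProducer producer = session.createProducer(queue);

            // Persistent: the queue manager hardens the message to disk.
            producer.setDeliveryMode(DeliveryMode.PERSISTENT);

            TextMessage message = session.createTextMessage(text);
            producer.send(message);

            // Until this returns, the application still owns the data;
            // after a successful commit, MQ is responsible for delivery.
            session.commit();
        } finally {
            connection.close();
        }
    }
}
```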

Socket.IO with RabbitMQ?

I'm currently using Socket.IO with the Redis store, and I'm using the Room feature with it. So I'm totally okay with Room join (subscribe) and leave (unsubscribe) with Socket.IO.
I just saw this page:
http://www.rabbitmq.com/blog/2010/11/12/rabbitmq-nodejs-rabbitjs/
and I have found that some people are using Socket.IO with RabbitMQ.
Why is using Socket.IO alone not good enough? Is there any good reason to use Socket.IO with RabbitMQ?
Socket.IO is a browser --> server transport mechanism, whereas RabbitMQ is a server --> server message bus.
The two can be implemented together to create a very responsive system in scenarios where a user journey consists of a message starting life on a browser and ending up in, say, some persistence layer (such as a database).
A message would be transported to the web server via Socket.IO and then, instead of the web server being responsible for persisting the message, it would drop it on a Rabbit queue and leave some other process responsible for persisting it. This way, the web server is free to return to its web-serving responsibilities and, crucially, its load is lessened.
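
Socket.IO itself is Node.js, but since this thread is otherwise Spring-centric, here is the same hand-off sketched with Spring WebSocket and Spring AMQP: the socket handler's only job is to drop the inbound message onto a Rabbit exchange and return. The exchange and routing-key names are made up, and a separate worker (not shown) would consume the queue and write to the database.

```java
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.web.socket.TextMessage;
import org.springframework.web.socket.WebSocketSession;
import org.springframework.web.socket.handler.TextWebSocketHandler;

// The web tier does not persist anything itself: it forwards the raw
// message to RabbitMQ and immediately goes back to serving sockets.
public class RelayWebSocketHandler extends TextWebSocketHandler {

    private final RabbitTemplate rabbitTemplate;

    public RelayWebSocketHandler(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
    }

    @Override
    protected void handleTextMessage(WebSocketSession session, TextMessage message)
            throws Exception {
        // "chat.exchange" and "chat.persist" are placeholder names.
        rabbitTemplate.convertAndSend("chat.exchange", "chat.persist",
                message.getPayload());
    }
}
```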
Take a look at SockJS: http://sockjs.org.
It's made by the RabbitMQ team.
It's simpler than Socket.IO.
There's an Erlang server for SockJS.
Apart from that, there is an experimental project within the RabbitMQ team that intends to provide a SockJS plugin for RabbitMQ.
I just used RabbitMQ with Socket.IO for a totally different reason than in the accepted answer. It wasn't that relevant in 2012, which is why I'm updating here.
I'm using a Docker Swarm deployment of a chat application, built for scalability and high availability. I have three replicas of the chat application (which uses Socket.IO) running in the cluster. The swarm cluster automatically load-balances the incoming requests, and at any given time a client might be connected to any of the three replicas.
In this scenario, it becomes really necessary to sync the WebSocket responses across the replicas, because two clients connected to two different instances of the application wouldn't get each other's messages, having been connected to different WebSockets.
This is where RabbitMQ comes in: it syncs all the instances of the application, so whenever a message is pushed from a WebSocket on one replica, it gets pushed by all replicas.
Complete details of the project have been given here. This is a potential use case for Socket.IO and RabbitMQ in conjunction, and it applies to any application using Socket.IO in a distributed environment with high availability and scalability.
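
A sketch of that replica-sync pattern, translated to Spring AMQP and STOMP since this thread is otherwise Java-centric: each replica binds its own auto-named, auto-delete queue to one shared fanout exchange, so a chat message published by any replica reaches all of them, and each relays it to its local WebSocket clients. The exchange and destination names are assumptions.

```java
import org.springframework.amqp.core.AnonymousQueue;
import org.springframework.amqp.core.Binding;
import org.springframework.amqp.core.BindingBuilder;
import org.springframework.amqp.core.FanoutExchange;
import org.springframework.amqp.core.Queue;
import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.messaging.simp.SimpMessagingTemplate;

// Every replica declares its own anonymous (auto-named, auto-delete)
// queue and binds it to one shared fanout exchange, so each published
// chat message reaches every replica and thus every connected client.
@Configuration
public class ReplicaSyncConfig {

    @Bean
    public FanoutExchange chatExchange() {
        return new FanoutExchange("chat.fanout"); // placeholder name
    }

    @Bean
    public Queue replicaQueue() {
        return new AnonymousQueue(); // unique per replica
    }

    @Bean
    public Binding binding(FanoutExchange chatExchange, Queue replicaQueue) {
        return BindingBuilder.bind(replicaQueue).to(chatExchange);
    }

    @Bean
    public ChatRelay chatRelay(SimpMessagingTemplate messagingTemplate) {
        return new ChatRelay(messagingTemplate);
    }

    public static class ChatRelay {
        private final SimpMessagingTemplate messagingTemplate;

        ChatRelay(SimpMessagingTemplate messagingTemplate) {
            this.messagingTemplate = messagingTemplate;
        }

        // Deliver the fanned-out message to this replica's local clients.
        @RabbitListener(queues = "#{replicaQueue.name}")
        public void relay(String message) {
            messagingTemplate.convertAndSend("/topic/chat", message);
        }
    }
}
```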
