shared node wise queue - data-structures

I am building a proxy server in Java. The application is deployed in Docker containers (multiple instances).
Below are the requirements I am working on.
Clients send HTTP requests to my proxy server.
The proxy server forwards those requests, in the order it received them, to the destination node server.
When the destination is not reachable, the proxy server stores those requests and forwards them once the destination becomes available again.
Similarly, when a request fails, it is retried after "X" time.
I implemented a node-wise queue (a HashMap whose key is the node name and whose value is the reachability status plus a queue of requests in the order they were received).
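A minimal single-instance sketch of the structure described above. It assumes requests can be represented as plain strings; the class and method names (NodeQueue, NodeQueues, pollIfReachable, requeueAtHead) are illustrative and not taken from the actual proxy.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-node state: a reachability flag plus a FIFO of pending requests. */
final class NodeQueue {
    private boolean reachable = true;
    private final Deque<String> pending = new ArrayDeque<>();

    synchronized void setReachable(boolean value) {
        reachable = value;
    }

    /** Appends a request at the tail so arrival order is kept. */
    synchronized void enqueue(String request) {
        pending.addLast(request);
    }

    /** Returns the oldest pending request, or null if the node is down or nothing is queued. */
    synchronized String pollIfReachable() {
        return reachable ? pending.pollFirst() : null;
    }

    /** Puts a failed request back at the head so it is retried before newer requests. */
    synchronized void requeueAtHead(String request) {
        pending.addFirst(request);
    }
}

/** Map of node name to its queue; entries are created lazily on first use. */
final class NodeQueues {
    private final Map<String, NodeQueue> byNode = new ConcurrentHashMap<>();

    NodeQueue forNode(String nodeName) {
        return byNode.computeIfAbsent(nodeName, n -> new NodeQueue());
    }
}
```

This state lives in one JVM's memory, which is why it stops working once several proxy instances receive requests for the same node.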
The above solution works well when there is only one instance, but how can I solve this when there are multiple instances? Is there a shared data structure I can use for this, such as ActiveMQ, Redis, or Kafka? (I am very new to shared memory / processing.)
Any help would be appreciated.
Thanks in advance.
Ajay

There is an Open Source REST Proxy for Kafka based on Jetty which you might get some implementation ideas from.
https://github.com/confluentinc/kafka-rest
This proxy doesn’t store messages itself because Kafka clusters are highly available for writes, and there are typically at least 3 Kafka nodes available for message persistence. The Kafka client in the proxy can be configured to retry if the cluster is temporarily unavailable for writes.
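As a rough illustration of the retry configuration mentioned above, the sketch below builds producer settings with plain java.util.Properties. The keys are standard Kafka producer configuration names; the values and the ProxyProducerConfig class are assumptions chosen for the example, not settings taken from kafka-rest.

```java
import java.util.Properties;

/** Hypothetical helper that assembles retry-oriented Kafka producer settings. */
final class ProxyProducerConfig {

    /** The broker list is a placeholder supplied by the caller. */
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Wait for all in-sync replicas to acknowledge each write.
        props.setProperty("acks", "all");
        // Avoid duplicates and reordering when a send is retried.
        props.setProperty("enable.idempotence", "true");
        // Keep retrying until delivery.timeout.ms expires.
        props.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
        props.setProperty("delivery.timeout.ms", "120000");
        props.setProperty("retry.backoff.ms", "500");
        return props;
    }
}
```

These properties, together with key and value serializers, would be passed to a KafkaProducer. Using the destination node name as the record key sends all requests for one node to the same partition, and Kafka preserves order within a partition.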

Related

Working of websocket services in clustered deployment

Let's say I have a websocket implemented in Spring Boot. The architecture is microservice-based. I have deployed the service in a Kubernetes cluster with 2 running instances of the service; the socket implementation uses STOMP with Redis as the broker.
Now the first connection is created between a client and one of the service instances. Does all the data flow occur between the client and the connected instance? Would the other instance also have a connection? In case the current instance goes down, would the other instance open up a connection?
Now let's say I'm sending some data back to the client, and that data comes through a Kafka topic. Either instance could read it. If so, would either of them be able to send the data back to the client?
Can someone help me understand these scenarios?
A websocket is a permanent connection. After opening it, it will be routed through kubernetes to a fixed pod. No other pod will receive the connection.
If the pod goes down, the connection is terminated.
If a new connection is created, for example by a different user, it may be routed to a different pod.
What data is transmitted, for example with kafka as source, is not relevant in this context. It could be anything.

Routing messages from Kafka to web socket clients connected to application server cluster

I would like to figure out the best way to route messages from Kafka to web socket clients connected to a load balanced application server cluster. I understand that spring-kafka facilitates consuming and publishing messages to a kafka topic, but how does this work in a load balanced application server scenario when connecting to a distributed kafka topic. Here are the requirements that I would like to satisfy, with the overall goal of facilitating peer to peer messaging in an application with a very, very large volume of users:
Web clients can connect to a Tomcat application server over a websocket connection via a load balancer.
A web client can send a message/notification to another client that's connected to a different Tomcat application server.
Messages are saved in the database and published to a kafka topic/partition that can be consumed by the appropriate web clients/users.
Kafka can be scaled to many brokers with many consumers.
I can see how this can be implemented quite easily in a single application server scenario, where the consumer consumes all messages from a Kafka topic and redistributes them via Spring messaging/websockets. But I can't figure out how this would work in a load balanced application server scenario where there are consumers on each application server forming an overall consumer group for the Kafka topic. Assuming that each of the application servers is consuming a subset of the partitions of the Kafka topic, how do they know which server their intended recipients are connected to? And even if they knew which server their recipients were connected to, how would they route the message to them via websockets?
I considered that the application server load balancing could work by logging users with a particular routing key (user names starting with 'A', etc.) on to a specific application server, then only consuming messages for users starting with 'A' on that application server. But this seems like it would be difficult to maintain and would make autoscaling very difficult. This seems like it should be a common scenario to implement, but I can't find any tools or approaches that fit it.
Sounds like every single consumer should live in its own consumer group. This way all the available consumers are going to consume all the messages sent to the topic. Therefore all the connected websocket clients are going to be notified with those messages.
If you need more complex logic for those messages after consuming them, e.g. filtering, routing, transforming, aggregating etc., you should consider involving Spring Integration in your project: https://spring.io/projects/spring-integration
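A minimal sketch of the one-consumer-group-per-instance idea above, using plain java.util.Properties. The keys group.id and auto.offset.reset are standard Kafka consumer settings; the "ws-instance-" prefix and the BroadcastConsumerConfig class are made up for the example.

```java
import java.util.Properties;
import java.util.UUID;

/** Hypothetical helper: gives every application server instance its own consumer group. */
final class BroadcastConsumerConfig {

    /** The broker list is a placeholder supplied by the caller. */
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // A unique group per instance means every instance receives every message.
        props.setProperty("group.id", "ws-instance-" + UUID.randomUUID());
        // A brand-new group has no committed offsets, so start from new messages only.
        props.setProperty("auto.offset.reset", "latest");
        return props;
    }
}
```

Each instance would then push a consumed message only to the recipients that are connected locally and drop the rest, so the cost of this approach is that every instance reads the full topic.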
Broadcasting to all consumers may work, but the most efficient solution is to route each message precisely to the node that holds the websocket connection for the target user. As far as I know, routing in a distributed system can be done as follows:
Put the routing information in a middleware such as Redis, or implement a service yourself to keep track of all the sessions. That is, solve it in a centralized way.
Let the websocket servers find the route by themselves. In this case, a decentralized protocol such as gossip should be taken into consideration.
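The centralized option can be sketched as a small route table that maps each user to the node holding their websocket session. This is an in-memory stand-in, assuming string user and node identifiers; in a real deployment the map would live in a shared store such as Redis, and the SessionRouteRegistry name and its methods are hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical route table: user id to the id of the node holding that user's session. */
final class SessionRouteRegistry {
    private final Map<String, String> nodeByUser = new ConcurrentHashMap<>();

    /** Called by a node when a user's websocket session is established on it. */
    void onConnect(String userId, String nodeId) {
        nodeByUser.put(userId, nodeId);
    }

    /** Removes the route only if it still points at the given node, so a stale
     *  disconnect cannot erase a newer connection made on another node. */
    void onDisconnect(String userId, String nodeId) {
        nodeByUser.remove(userId, nodeId);
    }

    /** Looks up which node should deliver a message to the user, if any. */
    Optional<String> nodeFor(String userId) {
        return Optional.ofNullable(nodeByUser.get(userId));
    }
}
```

A sender would look up the recipient's node and forward the message there, for example through a per-node topic or queue, instead of broadcasting to every node.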

All JMSs Message from Distributed Queue across the Cluster

I am currently using WebLogic and Distributed Queues. I know from the documentation that Distributed Queues allow you to retrieve a connection to any of the Queues across a cluster by using the global JNDI name. It seems one of the main pieces of functionality a Distributed Queue gives you is load balanced connections across multiple managed servers. We have 4 Managed Servers (two on each physical machine, communicating over multicast), and each Managed Server has an individual JMS Server which is configured with its own Data Store.
I am 99% certain I already know the answer to this, but it appears that if you want to consume a message off of a Queue, and that Queue exists on each Mgd Server in the Cluster, you cannot technically pull a Message off of any of the Queues (you can only pull the Message off the Queue to which you are connected). So if I have a Message on Mgd Server 4, and I connect to Mgd Server 1, I won't see the messages on the Queue from Mgd Server 4.
So is there a way in Java EE or WLS to consume a message from all the nodes of a Queue (across the Cluster), like a view into every instance of the Queue on each Mgd Server? It doesn't appear so, and the documentation makes it seem like this is not possible, as does this video (around minute 5):
http://www.youtube.com/watch?v=HAKixK_wp0Q
No, you cannot consume a message that is delivered to one managed server when your client is connected to another managed server of the same cluster.
Here's how it works.
When using a UDQ, WLS provides a JNDI name that resolves internally into 4 distinct JNDI names, one for each of the managed servers; the JMS servers on each of the managed servers are distinct.
When you post a message using the UDQ JNDI name, it goes to one of the 4 managed servers according to the algorithm you chose and other configuration done in your connection factory.
When a message consumer listens to the UDQ, it gets pinned to the JMS server on one of the managed servers. It has no visibility into messages on the other servers.
Usually a UDQ is used in scenarios where you want the messages to be consumed concurrently by more than one managed server. You would normally deploy an MDB to the cluster, meaning the MDB will be deployed to each of the managed servers and each of these will be able to consume the messages from its local JMS server.
I believe you can if your message store is configured to use a database. If so, then I would think removing an item from the queue would remove it from the shared DB table, i.e. all JMS servers are pointing to the same DB instance and table. That should be pretty easy to test, too.

Design of queues with JMS & QPID

I have been assigned the task of writing an application that uses the JMS API, with Apache Qpid as the JMS provider.
Basically, my application will have multiple instances of a server running. Each server will serve a unique set of desks, so each instance will only have data for the desks it is serving.
There will also be multiple instances of clients, each configured by desk as well.
When a client starts up, it will request the data for the desk it is serving from the servers. The request should only go to the server that has that desk's data loaded, and the response should only go back to the client that requested the data for that desk.
I am thinking of using queues for this. I am not sure whether I should create a single request queue used by all the servers or separate queues for each server.
For responses, I am planning to use temporary queues.
Note that requests from the client to the server are infrequent; each client may send around 50 requests a day.
Can someone please tell me if this is a good design?

Configuring JMS over a Weblogic Cluster

I have a setup of 2 WLS managed servers configured as part of a WLS cluster.
1) The requirement is to send requests to another system and receive responses using JMS as interface.
2) The request could originate from either of the Managed Servers. So the corresponding response should reach the managed server which originated the request.
3) The external system (to which requests are sent) should not be aware of how many managed servers are in the cluster (not a must have requirement)
How should JMS be configured to meet these requirements?
Simple! Set up a response queue for each managed server and add a "reply-to" field to the messages you send to the other system. The other system will then ask the request where to send the reply. Deploy one Message Driven Bean (MDB) on each managed server (i.e. not on the cluster, one per managed server) to consume reply messages sent to the reply queues. Note that you might want to use clustered reply queues and persistent messages for load balancing and failover.
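The reply-to idea can be sketched without a JMS provider by simulating queues in memory. In real JMS the field would be set with Message.setJMSReplyTo and read with getJMSReplyTo; everything else here (InMemoryBroker, ReplyToMessage, ExternalSystemStub, and the queue names) is invented for illustration.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical message carrying the name of the queue that should receive the reply. */
final class ReplyToMessage {
    final String body;
    final String replyTo; // null on reply messages

    ReplyToMessage(String body, String replyTo) {
        this.body = body;
        this.replyTo = replyTo;
    }
}

/** In-memory stand-in for a JMS provider: named queues created on first use. */
final class InMemoryBroker {
    private final Map<String, BlockingQueue<ReplyToMessage>> queues = new ConcurrentHashMap<>();

    BlockingQueue<ReplyToMessage> queue(String name) {
        return queues.computeIfAbsent(name, n -> new LinkedBlockingQueue<>());
    }

    void send(String queueName, ReplyToMessage message) {
        queue(queueName).add(message);
    }
}

/** Stand-in for the external system: it knows only the request queue, not the servers. */
final class ExternalSystemStub {
    /** Handles one request, if present, and replies to the address the request carries. */
    static void handleOne(InMemoryBroker broker, String requestQueue) {
        ReplyToMessage request = broker.queue(requestQueue).poll();
        if (request == null) {
            return;
        }
        broker.send(request.replyTo, new ReplyToMessage("reply to " + request.body, null));
    }
}
```

Because the reply address travels with each request, the external system does not need to know how many managed servers exist, which also covers the third requirement.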
This is actually a combination of the Request-Reply and the Return Address patterns and is illustrated by the picture below:

Resources