Spring Integration Cluster with aggregator - cluster-computing

we are building a cluster using SI that has an aggregator in it's flow. Here we found we have two options for the aggregator:
Use a LockRegistry (Redis, Zookeeper, etc)
Use a Zookeeper leader listener
After reading the docs I couldn't find what's the best option. In which scenarios is better a LockRegistry than a leader listener?
I appreciate any help with this doubt
thanks in advance
Regards,
Guzmán

It depends on your application requirements; with the leader listener, typically, only one instance is active; the other instance will take over when the first one dies.
If you want active-active, use global locks.
A shared MessageGroupStore is required in both situations.

Related

How can I connect to multiple Rabbitmq nodes with Spring Rabbitmq?

I am writing a service with Spring and I am using Spring AMQP in order to connect to Rabbitmq.
I have two rabbitmq clusters, one is only for publishing messages(the messages are sent to the other cluster via the federation plugin) and the other cluster is for declaring queues that end users will consume from.
The nodes sit behind aws lb, each cluster has a lb.
I am using CachingConnectionFactory and RabbitTemplate,RabbitAdmin in my code and I want to have connections to all the nodes so I can use them.
For the cluster that will contain the queues I added to the config the queue-master-locator=random so new queues will be declared in all the nodes in the cluster even if my service does not have a connection to them.
With the cluster that publishes messages I have more of a problem because I need a direct connection in my service to each of the nodes so I will be able to separate the load between the nodes.
So my problem is, how do I create connections in my service to all the nodes in the cluster so they will all be used for declaring queues and sending messages?
Now, after I will have some sort of solution to this issue, the next issue will be what happens when a new node is added to the cluster? How can I create a connection to it and start using it as well?
I am using Rabbitmq - 3.7.9, Spring - 2.0.5, Spring AMQP - 2.0.5
Thanks alot!
There is currently no mechanism to do anything like that.
By default, Spring AMQP opens only one connection (optionally two, one for publishing, one for consuming).
Even when using CacheMode.CONNECTION, you'll get a new connection for each consumer (and connections will be created and cached on demand for producers), you won't get any control as to which node it connects to; that's a function of the LB.
The framework does provide the LocalizedQueueConnectionFactory which will try to consume from the node that hosts a queue, but it won't work with a load balancer in place.
In general, however, such optimization is rarely needed.
Are you trying to solve an actual problem you are experiencing now, or something that you perceive that might be a problem?
It is generally best not to perform premature optimization.

Spring Batch remote partitioning how to shutdown slaves

I want to use Spring Batch remote partitioning to handle large workloads on the cloud, and spin up/shutdown VMs on demand.
However, when configuring the slave steps, I'm using the StepExecutionRequestHandler to handle the step requests from a JMS queue. Right now the application just hangs. How can I shut down the application after the queue is depleted?
How can I shut down the application after the queue is depleted?
In a remote partitioning setup, workers are listeners on a queue on which StepExecutionRequests are coming. The question is how to know, from the listener point of view, that the queue is depleted? This is a tricky design problem. There are some known solutions like the "End-Of-Stream" message or "Poison" record but those are tricky too since you have to make sure all listeners get one such message.
If you are using Spring Cloud Task to launch your workers, you can use the DeployerPartitionHandler which provides an elegant way to dynamically create workers on demand up to a maximum configurable number. You can find more details about it here: https://docs.spring.io/spring-cloud-task/docs/2.0.0.RELEASE/reference/htmlsingle/#batch-partitioning and an example in this github repo: https://github.com/mminella/scaling-demos/blob/master/partitioned-demo/src/main/java/io/spring/batch/partitiondemo/configuration/BatchConfiguration.java#L75
The ice on the cake is that this is based on Spring Cloud Deployer which means you can use it on any cloud provider that implements the SCD SPI. Here is how to do it for:
Kubernetes: https://docs.spring.io/spring-cloud-task/docs/2.0.0.RELEASE/reference/htmlsingle/#_notes_on_developing_a_batch_partitioned_application_for_the_kubernetes_platform
cloud foundry: https://docs.spring.io/spring-cloud-task/docs/2.0.0.RELEASE/reference/htmlsingle/#_notes_on_developing_a_batch_partitioned_application_for_the_cloud_foundry_platform

How to get Kafka bootstrap configuration setting from Kafka connector

Question for Kafka experts:
Anyone knows how to get worker's config setting 'bootstrap.servers' either from SinkConnector or SinkTask? Or how to get Kafka cluster information from connector?
Thank you
AFAIK, the Connect API doesn't provide this information right now.
If you need this functionality, perhaps your best bet is to open a JIRA on the Apache Kafka project and explain the use-case (i.e. what are you planning on doing this information).

how to configure redis, hsqldb,zookeeper, multiple admin in spring xd

I want to create a distributed cluster in spring xd.
I am able to create a cluster with single admin, one zookeeper, one instance of redis and hsqldb.
But when i'm trying to do that with multiple instance of zookeeper , hsqldb, redis ,i'm not able to configure it correctly.
You should only have a single instance of zookeeper, hsqldb and redis. All xd-admins should be configured to connect to the same instance of each of these services and so should the xd-containers be.
Like Thomas has mentioned, the idea is that you have your (multiple) instances of admin and containers deployed, and all connect to the same zk,redis, hsqldb & rabbitmq.
Why do you want to start multiple instances of these applications?
Zookeeper provides the topology of the cluster and manages deployments. Also, it makes sure to note when nodes go up and down - avoiding single point of failures when you have many xd-admin instances (one is leader and the others replicate, they will become leader if the current one fails).
Or are you talking about making those instance parallel to avoid a SPOF? In that case, you should try to dedicate an entire VM for each of those applications.

ActiveMQ single consumer multiple producers

Can anybody point out a reference on how to implement
a single consumer multiple producer in activemq? Or could give a very simple implementation.
This will be very helpful.
Thanks
Matt Raible's AppFuse project is a good skeleton project which is implemented using different libraries. You can pick the one which uses Spring and introduce ActiveMQ as Bharati Raja has explained in his blog post jms with appfuse1x.
There is no special implementation required for this. This is the core business of MessageBrokers. The only thing you need to make sure of:
If you decide to give an ID to your producers, make sure they are different from each other.. You cannot have multiple producers with the same ID. Same goes for consumers.
If you need to guarantee that a message can be consumed by only one consumer, then this is the point-to-point communication model which can be implemented using a JMS Queue in ActiveMQ.
Many producers can send messages to the same queue. Only one active consumer will receive a message from the queue.

Resources