Scaling consumers with @StreamListener - spring-boot

We're using Spring Cloud Stream to serve asynchronous tasks. I wonder if there is any way to scale listeners set up with @StreamListener? The goal is to have multiple workers within one application instance.
I read about spring.cloud.stream.instanceCount, but I don't want to replicate the whole application, only increase the worker count.

You should be able to accomplish that via the spring.cloud.stream.bindings.input.consumer.concurrency consumer property.
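As an illustration, here is a minimal sketch assuming the classic annotation-based programming model with the default Sink binding named "input"; the handler and payload type are placeholders:

```java
// The concurrency setting itself lives in application.properties:
//   spring.cloud.stream.bindings.input.consumer.concurrency=3
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Sink;

@EnableBinding(Sink.class)
public class TaskListener {

    // With concurrency=3 the binder runs three listener threads for this
    // binding, so three messages can be processed in parallel inside a
    // single application instance -- no extra instances required.
    @StreamListener(Sink.INPUT)
    public void handle(String task) {
        // process the task here
    }
}
```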

Related

Spring Boot: Should multiple listeners (with selectors) on a single queue be defined as separate microservices?

I have a single queue with multiple listeners, each using a selector so that it processes only a specific set of messages. Do I need to create each listener as a separate microservice? My understanding is that since the MQ infrastructure spawns multiple instances of a consumer based on load, we need not split them into separate microservices. I am new to this area; please help me with the right design for my use case. Thanks.
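For context, a hypothetical sketch of the setup being described, assuming Spring's JMS support (the queue name and selector property are invented): two listeners inside one application sharing a queue, each filtered by a message selector.

```java
// Requires @EnableJms on a configuration class and a configured
// JmsListenerContainerFactory; both omitted here for brevity.
import org.springframework.jms.annotation.JmsListener;
import org.springframework.stereotype.Component;

@Component
public class OrderListeners {

    // Consumes only messages whose "type" string property is "CREATE".
    @JmsListener(destination = "orders", selector = "type = 'CREATE'")
    public void onCreate(String payload) {
        // handle creation messages
    }

    // Consumes only messages whose "type" string property is "CANCEL".
    @JmsListener(destination = "orders", selector = "type = 'CANCEL'")
    public void onCancel(String payload) {
        // handle cancellation messages
    }
}
```

The listener container can also scale each listener's concurrency independently, which is usually what absorbs load, rather than splitting the listeners into separate services.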

Scheduling jobs while consuming Kafka messages

I want to build a single Spring Boot application which performs multiple different tasks concurrently. I did some research on the internet but could not find a way to do it. Let me get into detail.
I would like to start jobs at certain intervals, for example once a day; I can do that with Quartz through Spring. I would also like to listen for messages arriving at a dedicated address from the Apache Kafka platform, so I would like to use the Kafka integration for the Spring framework.
Is this practical (listening for messages all the time while executing scheduled jobs on time)?
Functionally speaking, this design is fine: a single Spring Boot app can consume Kafka messages while also executing Quartz jobs.
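For illustration, a minimal sketch of such a combined application, assuming spring-kafka and Spring's scheduling support (a Quartz trigger would occupy the same role as @Scheduled here); the topic, group id, and cron expression are made up:

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
@EnableScheduling
public class CombinedWorker {

    // Invoked continuously, for every record arriving on the topic.
    @KafkaListener(topics = "events", groupId = "combined-worker")
    public void onMessage(String message) {
        // process the Kafka message
    }

    // Fires once a day at midnight, on a scheduler thread independent of the
    // Kafka consumer threads, so the two workloads do not block each other.
    @Scheduled(cron = "0 0 0 * * *")
    public void dailyJob() {
        // run the daily batch work
    }
}
```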
At a higher level, though, you should ask why these two functions belong in a single app. Is there some inherent relationship between the Quartz jobs and the Kafka messages being consumed? Or are you combining them solely to stay at one app and save on compute/memory resources?
You should also consider the impact on scalability. What if you need to increase the rate at which you consume Kafka messages? If you scale your app out to get more Kafka consumers, you now have to worry about multiple apps firing your Quartz jobs.
So yes, it can be done, but without any more detail it sounds like you should split this design into two separate applications: one for Quartz and one for Kafka consumption.

Spring Batch remote partitioning: how to shut down slaves

I want to use Spring Batch remote partitioning to handle large workloads on the cloud, and spin up/shut down VMs on demand.
However, when configuring the slave steps, I'm using the StepExecutionRequestHandler to handle the step requests from a JMS queue. Right now the application just hangs. How can I shut down the application after the queue is depleted?
How can I shut down the application after the queue is depleted?
In a remote partitioning setup, workers are listeners on a queue on which StepExecutionRequests arrive. The question is how to know, from the listener's point of view, that the queue is depleted. This is a tricky design problem. There are known solutions like the "End-Of-Stream" message or "poison" record, but those are tricky too, since you have to make sure all listeners get one such message.
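For concreteness, a hypothetical sketch of the poison-record idea (the control queue name and marker value are invented); it only works if every worker is guaranteed to receive its own marker:

```java
import org.springframework.context.ConfigurableApplicationContext;
import org.springframework.jms.annotation.JmsListener;
import org.springframework.stereotype.Component;

@Component
public class EndOfStreamListener {

    private final ConfigurableApplicationContext context;

    public EndOfStreamListener(ConfigurableApplicationContext context) {
        this.context = context;
    }

    // Listens on a control queue alongside the StepExecutionRequest listener.
    @JmsListener(destination = "worker-control")
    public void onControlMessage(String message) {
        if ("END_OF_STREAM".equals(message)) {
            // Queue depleted: close this worker's context so the JVM exits
            // and the VM can be reclaimed.
            context.close();
        }
    }
}
```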
If you are using Spring Cloud Task to launch your workers, you can use the DeployerPartitionHandler, which provides an elegant way to dynamically create workers on demand, up to a configurable maximum number (see the sketch after the platform links below). You can find more details about it here: https://docs.spring.io/spring-cloud-task/docs/2.0.0.RELEASE/reference/htmlsingle/#batch-partitioning and an example in this GitHub repo: https://github.com/mminella/scaling-demos/blob/master/partitioned-demo/src/main/java/io/spring/batch/partitiondemo/configuration/BatchConfiguration.java#L75
The icing on the cake is that this is based on Spring Cloud Deployer, which means you can use it on any cloud provider that implements the SCD SPI. Here is how to do it for:
Kubernetes: https://docs.spring.io/spring-cloud-task/docs/2.0.0.RELEASE/reference/htmlsingle/#_notes_on_developing_a_batch_partitioned_application_for_the_kubernetes_platform
Cloud Foundry: https://docs.spring.io/spring-cloud-task/docs/2.0.0.RELEASE/reference/htmlsingle/#_notes_on_developing_a_batch_partitioned_application_for_the_cloud_foundry_platform
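For orientation, a rough sketch of wiring a DeployerPartitionHandler, loosely modeled on the linked scaling-demos example; the worker artifact coordinates, step name, application name, and worker cap are illustrative assumptions:

```java
import org.springframework.batch.core.explore.JobExplorer;
import org.springframework.cloud.deployer.spi.task.TaskLauncher;
import org.springframework.cloud.task.batch.partition.DeployerPartitionHandler;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;
import org.springframework.core.io.ResourceLoader;

@Configuration
public class PartitionHandlerConfig {

    @Bean
    public DeployerPartitionHandler partitionHandler(TaskLauncher taskLauncher,
                                                     JobExplorer jobExplorer,
                                                     ResourceLoader resourceLoader) {
        // The artifact the deployer launches for each worker (assumed coordinates).
        Resource workerApp = resourceLoader
                .getResource("maven://io.spring.batch:partitioned-demo:0.0.1-SNAPSHOT");

        DeployerPartitionHandler handler = new DeployerPartitionHandler(
                taskLauncher, jobExplorer, workerApp, "workerStep");

        // Workers are launched on demand, up to this cap, and exit when their
        // partition is done -- no idle listeners waiting on a depleted queue.
        handler.setMaxWorkers(4);
        handler.setApplicationName("partitioned-demo-worker");
        return handler;
    }
}
```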

How to make a Spring Kafka client distributed

I have messages coming in from Kafka, so I am planning to write a listener with an "onMessage" method; I want to process each message and push it into Solr.
My question is more architectural: I have worked on web apps all my career, so in the big-data world, how do I deploy the Spring Kafka listener so that I can process thousands of messages a second?
How do I make my Spring code use multiple nodes to distribute the load? I am planning to write a Spring Boot application to run in a Tomcat container.
If you use the same group id for all instances, different partitions will be assigned to different consumers (instances of your application).
So, make sure you have specified enough partitions on the topic you are going to consume from.
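For illustration, a minimal sketch assuming spring-kafka (the topic and group id are made up); deploying N identical instances of this application distributes the topic's partitions among them:

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class SolrIndexingListener {

    // All instances share one group id, so Kafka assigns each instance a
    // disjoint subset of the topic's partitions. Parallelism is capped by
    // the partition count: 12 partitions can feed at most 12 consumers,
    // hence "create enough partitions" above.
    @KafkaListener(topics = "documents", groupId = "solr-indexer")
    public void onMessage(String record) {
        // transform the record and push it to Solr here
    }
}
```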

How does Spring XD load balance between instances of the same module in different containers

I have read this post, but it does not cover my case and is not clear enough:
How does load balancing in Spring XD get done?
I have a composed job with different instances of the same sub-jobs deployed in different containers. My composed job is scheduled to run periodically. I need to know how Spring XD chooses which sub-job instances to invoke for each new request to the composed job.
The same question applies to a stream triggered every X minutes.
It's handled by the transport (Rabbit, Redis).
Each downstream module competes for messages: with Rabbit it will generally be round-robin; with Redis it will be more random.
