Kafka concurrency config Spring setConsumerTaskExecutor and setConcurrency - spring-boot

What is the difference between setConsumerTaskExecutor() and setConcurrency() in Spring Kafka?
What happens if the max pool size of the ConsumerTaskExecutor differs from the setConcurrency value?
ThreadPoolTaskExecutor customExecutor = new ThreadPoolTaskExecutor();
customExecutor.setCorePoolSize(3);
customExecutor.setMaxPoolSize(6);
ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<String, String>();
factory.setConcurrency(10);
factory.getContainerProperties().setConsumerTaskExecutor(customExecutor);
Thank you

concurrency:
The maximum number of concurrent KafkaMessageListenerContainers running. Messages from within the same partition will be processed sequentially.
ConsumerTaskExecutor:
Set the executor for threads that poll the consumer.
https://docs.spring.io/spring-kafka/docs/current/api/org/springframework/kafka/listener/ConcurrentMessageListenerContainer.html#getConcurrency()
https://docs.spring.io/spring-kafka/docs/current/api/org/springframework/kafka/listener/ContainerProperties.html#setConsumerTaskExecutor(org.springframework.core.task.AsyncListenableTaskExecutor)
In other words, ConcurrentMessageListenerContainer creates one or more KafkaMessageListenerContainers based on concurrency. Every KafkaMessageListenerContainer shares the ContainerProperties and uses the ConsumerTaskExecutor to run a ListenerConsumer, which delegates to a Kafka consumer to do the polling.
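To illustrate what a pool that is smaller than the concurrency can lead to, here is a minimal sketch in plain java.util.concurrent, not Spring Kafka code. It assumes each child container submits one long-running polling task to the executor, and that the executor behaves like a ThreadPoolExecutor with core size 3, max size 6 and an unbounded queue (the default queue capacity of ThreadPoolTaskExecutor). The class and method names are made up for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolVsConcurrencyDemo {

    /** Returns {tasks running, tasks still queued} after submitting `concurrency` long-running tasks. */
    public static int[] submitPollers(int corePoolSize, int maxPoolSize, int concurrency)
            throws InterruptedException {
        // Unbounded queue: extra tasks are queued instead of growing the pool towards maxPoolSize.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, maxPoolSize, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        CountDownLatch stop = new CountDownLatch(1);
        CountDownLatch started = new CountDownLatch(Math.min(corePoolSize, concurrency));
        try {
            for (int i = 0; i < concurrency; i++) {
                // Stand-in for a consumer poll loop: it occupies its thread until told to stop.
                pool.execute(() -> {
                    started.countDown();
                    try {
                        stop.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Wait until the core threads have picked up their tasks before measuring.
            started.await(5, TimeUnit.SECONDS);
            return new int[] { pool.getActiveCount(), pool.getQueue().size() };
        } finally {
            stop.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = submitPollers(3, 6, 10);
        System.out.println("running=" + result[0] + ", queued=" + result[1]);
    }
}
```

Under these assumptions only three of the ten tasks get a thread and seven wait in the queue, because a pool with an unbounded queue never grows past its core size. By analogy, a consumer task executor should probably have at least as many threads available as the configured concurrency.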

Related

Number of connections to Kafka boot strap server

I have two topics with 10 partitions each, and I am using the code below to listen for messages. How many connections will be established? Is it 20?
Will each connection be at the partition level or at the namespace (bootstrap server) level?
@Bean
public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<Object, Object>> kafkaListenerContainerFactory() {
    final ConcurrentKafkaListenerContainerFactory<Object, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    return factory;
}
@KafkaListener(topics = "#{'${spring.kafka.topics}'.split(',')}",
        concurrency = "20",
        clientIdPrefix = "client123",
        groupId = "group123")
public void listen(final ConsumerRecord<Object, Object> inputEvent) throws Exception {
    handleMessage(inputEvent);
}
concurrency is the number of consumer threads. Each thread creates its own Consumer, and they run in parallel, each responsible for the partitions assigned to it. In your case, 20 consumer threads will start in the same consumer group and read from the two topics. Because each topic has only 10 partitions, 10 of those threads will probably not get any partitions assigned and will sit idle.
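The arithmetic behind the idle threads can be sketched as follows. This is a simplified model of a per-topic range split (the style used by Kafka's default RangeAssignor), not Kafka's own code, and the class and method names are hypothetical. A different assignor, such as round-robin, could spread the 20 partitions differently, which is why the answer says "probably".

```java
import java.util.Arrays;

class RangeAssignmentSketch {

    /** Returns, per consumer, how many partitions it would own under a per-topic range split. */
    public static int[] assign(int topics, int partitionsPerTopic, int consumers) {
        int[] owned = new int[consumers];
        for (int t = 0; t < topics; t++) {
            int base = partitionsPerTopic / consumers;  // partitions every consumer gets for this topic
            int extra = partitionsPerTopic % consumers; // the first `extra` consumers get one more
            for (int c = 0; c < consumers; c++) {
                owned[c] += base + (c < extra ? 1 : 0);
            }
        }
        return owned;
    }

    /** Counts consumers that ended up with no partitions at all. */
    public static long idleConsumers(int[] owned) {
        return Arrays.stream(owned).filter(n -> n == 0).count();
    }

    public static void main(String[] args) {
        int[] owned = assign(2, 10, 20); // two topics, 10 partitions each, concurrency 20
        System.out.println("partitions per consumer: " + Arrays.toString(owned));
        System.out.println("idle consumers: " + idleConsumers(owned));
    }
}
```

In this model the first 10 consumers each own one partition from each topic and the remaining 10 own nothing, because the split is done topic by topic and each topic has only 10 partitions to hand out.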

Spring Data Redis - StreamMessageListenerContainer only spawning one thread

I am using spring data redis to subscribe to the 'task' redis stream to process tasks.
For some reason the Redis stream consumer only spawns one thread and processes one message at a time, sequentially, even though I explicitly provide a ThreadPoolTaskExecutor.
I expect it to delegate the creation of threads to the provided thread pool and spawn threads up to the pool's configured limits. I can see that it is using the given TaskExecutor, but it's not spawning more than one thread.
Even when I don't specify my own taskExecutor, and it internally defaults to SimpleAsyncTaskExecutor, the problem persists. Tasks are processed sequentially, one after the other, even when they are long-running tasks.
What am I missing here?
@Bean
public Subscription redisTaskStreamListenerContainer(
        RedisConnectionFactory connectionFactory,
        @Qualifier("task") RedisTemplate<String, Task<TransportEnvelope>> redisTemplate,
        @Qualifier("task") StreamListener<String, MapRecord<String, String, String>> listener,
        @Qualifier("task") Executor taskListenerExecutor) {
    StreamMessageListenerContainerOptions<String, MapRecord<String, String, String>> containerOptions =
            StreamMessageListenerContainerOptions.builder()
                    .pollTimeout(Duration.ofMillis(consumerPollTimeOutInMilli))
                    .batchSize(consumerReadBatchSize)
                    .executor(taskListenerExecutor)
                    .build();
    StreamMessageListenerContainer<String, MapRecord<String, String, String>> container =
            StreamMessageListenerContainer.create(connectionFactory, containerOptions);
    StreamMessageListenerContainer.ConsumerStreamReadRequest<String> readOptions =
            StreamMessageListenerContainer.StreamReadRequest
                    .builder(StreamOffset.create(streamName, ReadOffset.lastConsumed()))
                    // Turn off auto shutdown of the stream consumer if an error occurs.
                    .cancelOnError((ex) -> false)
                    .consumer(Consumer.from(groupId, consumerId))
                    .build();
    Subscription subscription = container.register(readOptions, listener);
    container.start();
    return subscription;
}
@Bean
@Qualifier("task")
public Executor redisListenerThreadPoolTaskExecutor() {
    ThreadPoolTaskExecutor threadPoolTaskExecutor = new ThreadPoolTaskExecutor();
    threadPoolTaskExecutor.setCorePoolSize(30);
    threadPoolTaskExecutor.setMaxPoolSize(50);
    threadPoolTaskExecutor.setQueueCapacity(Integer.MAX_VALUE);
    threadPoolTaskExecutor.setThreadNamePrefix("redis-listener-");
    threadPoolTaskExecutor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
    return threadPoolTaskExecutor;
}

Spring kafka consume records with some delay

I'm using Spring Kafka in my application. I want to add a delay of 15 minutes before consuming records for one of the listeners, kafkaRetryListenerContainerFactory. I have two listeners. Below is my configuration:
@Bean
public ConcurrentKafkaListenerContainerFactory<String, Object> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(primaryConsumerFactory());
    return factory;
}
@Bean
public ConcurrentKafkaListenerContainerFactory<String, Object> kafkaRetryListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(retryConsumerFactory());
    factory.setConcurrency(this.kafkaConfigProperties.getConsumerConcurrency());
    factory.setAutoStartup(true);
    factory.getContainerProperties().setAckMode(AckMode.MANUAL_IMMEDIATE);
    return factory;
}
Kafka retry listener:
@KafkaListener(topics = "${spring.kafka.retry.topic}", groupId = "${spring.kafka.consumer-group-id}",
        containerFactory = "kafkaRetryListenerContainerFactory", id = "retry.id")
public void retryMessage(ConsumerRecord<String, String> record, Acknowledgment acknowledgment) throws InterruptedException {
    // Delay consumption by 15 minutes (Thread.sleep throws the checked InterruptedException).
    Thread.sleep(900000);
    LOG.info(String.format("Consumed retry message -> %s", record.toString()));
    acknowledgment.acknowledge();
}
After I added Thread.sleep(), I keep getting this rebalancing error in the logs:
Attempt to heartbeat failed since group is rebalancing
My spring kafka version is 2.3.4
Below are the config values:
max.poll.interval.ms = 1200000 (this is higher than thread.sleep)
heartbeat.interval.ms = 3000
session.timeout.ms = 10000
I have also tried ack.nack(900000), but I still get the rebalancing error.
Any help will be appreciated.
A filter is not the right approach; you need to Thread.sleep() the thread and make sure that max.poll.interval.ms is larger than the total sleep and processing time for the records received by the poll.
In 2.3, the container has an option to sleep between polls (the idleBetweenPolls container property); with earlier versions, you have to do the sleep yourself.
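The max.poll.interval.ms constraint is easy to violate because a record listener sleeps once per record, not once per poll. The following is a minimal sketch of that arithmetic in plain Java. The helper names are hypothetical, processing time is taken as zero, and since the question does not state max.poll.records, the value 500 (Kafka's default) is only an assumption.

```java
class PollIntervalBudget {

    /** Worst-case time between two poll() calls when every record in the batch is delayed. */
    public static long timeBetweenPollsMs(int recordsPerPoll, long sleepPerRecordMs, long processingPerRecordMs) {
        return recordsPerPoll * (sleepPerRecordMs + processingPerRecordMs);
    }

    /** The consumer stays in the group only if it polls again before max.poll.interval.ms expires. */
    public static boolean staysInGroup(long timeBetweenPollsMs, long maxPollIntervalMs) {
        return timeBetweenPollsMs < maxPollIntervalMs;
    }

    public static void main(String[] args) {
        long maxPollIntervalMs = 1_200_000L; // max.poll.interval.ms from the question
        long sleepMs = 900_000L;             // Thread.sleep(900000) in the listener
        // 500 is assumed here as the default max.poll.records; the question does not give it.
        for (int records : new int[] {1, 2, 500}) {
            long total = timeBetweenPollsMs(records, sleepMs, 0);
            System.out.println(records + " record(s) per poll: " + total
                    + " ms between polls, within budget: " + staysInGroup(total, maxPollIntervalMs));
        }
    }
}
```

With these numbers a single record per poll fits within the 1,200,000 ms budget, but two or more records returned by one poll already exceed it, which would be consistent with the rebalancing described in the question.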
EDIT
I just found this in my server.properties (homebrew on Mac OS):
############################# Group Coordinator Settings #############################
# The following configuration specifies the time, in milliseconds, that the GroupCoordinator will delay the initial consumer rebalance.
# The rebalance will be further delayed by the value of group.initial.rebalance.delay.ms as new members join the group, up to a maximum of max.poll.interval.ms.
# The default value for this is 3 seconds.
# We override this to 0 here as it makes for a better out-of-the-box experience for development and testing.
# However, in production environments the default value of 3 seconds is more suitable as this will help to avoid unnecessary, and potentially expensive, rebalances during application startup.
group.initial.rebalance.delay.ms=0
That explains why we see the partitions initially assigned to the first consumer (see comment below).
Setting it back to the default 3000 works for me.

Rabbitmq concurrent consumers in Spring boot

I'm using the @RabbitListener annotation and a SimpleRabbitListenerContainerFactory bean to process RabbitMQ messages in parallel, and I set the min and max concurrent consumers in the following way:
@Bean
public SimpleRabbitListenerContainerFactory rabbitListenerContainerFactory() {
    SimpleRabbitListenerContainerFactory factory = new SimpleRabbitListenerContainerFactory();
    factory.setConnectionFactory(connectionFactory());
    factory.setConcurrentConsumers(MIN_RABBIT_CONCURRENT_CONSUMERS);
    factory.setMaxConcurrentConsumers(MAX_RABBIT_CONCURRENT_CONSUMERS);
    factory.setConsecutiveActiveTrigger(1);
    factory.setAcknowledgeMode(AcknowledgeMode.MANUAL);
    return factory;
}
The min limit is 3 and the max limit is 10. With this configuration, only 3 messages are processed in parallel, even though there are 12 messages in the queue.
Please tell me what is wrong with the config.
You can also set the consumer concurrency directly on the @RabbitListener annotation:
@RabbitListener(queues = "your-queue-name", concurrency = "4")
public void customCheck(Object requestObject) {
    // process
}
With the default configuration, a new consumer will be added every 10 seconds if the other consumers are still busy.
The algorithm (and properties to affect it) is described here.
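As a rough illustration of why only 3 consumers are active at first, here is a simplified model of that ramp-up in plain Java. It assumes at most one consumer is added per 10-second interval while all existing consumers stay busy, and it ignores the other tuning properties of the real container. The class and method names are hypothetical.

```java
class ConsumerRampUpSketch {

    /** Consumers running after `elapsedMs`, if one is added per interval while all stay busy. */
    public static int consumersAfter(int initial, int max, long elapsedMs, long minIntervalMs) {
        long added = elapsedMs / minIntervalMs;
        return (int) Math.min(max, initial + added);
    }

    /** Earliest time at which `target` consumers can be running in this model. */
    public static long msToReach(int target, int initial, long minIntervalMs) {
        return Math.max(0, target - initial) * minIntervalMs;
    }

    public static void main(String[] args) {
        long interval = 10_000L; // the 10 seconds mentioned in the answer
        System.out.println("after 0 s:  " + consumersAfter(3, 10, 0, interval));
        System.out.println("after 30 s: " + consumersAfter(3, 10, 30_000L, interval));
        System.out.println("ms to reach 10 consumers: " + msToReach(10, 3, interval));
    }
}
```

In this model, going from the minimum of 3 to the maximum of 10 consumers takes at least 70 seconds, so a short burst of 12 messages may be finished before the container has scaled up noticeably.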

Spring Kafka listenerExecutor

I'm setting up a kafka listener in a spring boot application and I can't seem to get the listener running in a pool using an executor. Here's my kafka configuration:
@Bean
ThreadPoolTaskExecutor messageProcessorExecutor() {
    logger.info("Creating a message processor pool with {} threads", numThreads);
    ThreadPoolTaskExecutor exec = new ThreadPoolTaskExecutor();
    exec.setCorePoolSize(200);
    exec.setMaxPoolSize(200);
    exec.setKeepAliveSeconds(30);
    exec.setAllowCoreThreadTimeOut(true);
    exec.setQueueCapacity(0); // Yields a SynchronousQueue
    exec.setThreadFactory(ThreadFactoryFactory.defaultNamingFactory("kafka", "processor"));
    return exec;
}
@Bean
public ConsumerFactory<String, PollerJob> consumerFactory() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    props.put(ConsumerConfig.GROUP_ID_CONFIG, consumerGroup);
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
    DefaultKafkaConsumerFactory<String, PollerJob> factory = new DefaultKafkaConsumerFactory<>(props,
            new StringDeserializer(),
            new JsonDeserializer<>(PollerJob.class));
    return factory;
}
@Bean
public ConcurrentKafkaListenerContainerFactory<String, PollerJob> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, PollerJob> factory
            = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setConcurrency(Integer.valueOf(kafkaThreads));
    factory.getContainerProperties().setListenerTaskExecutor(messageProcessorExecutor());
    factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL);
    return factory;
}
The ThreadFactoryFactory used by the ThreadPoolTaskExecutor just makes sure the thread is named like 'kafka-1-processor-1'.
The ConsumerFactory has the ENABLE_AUTO_COMMIT_CONFIG flag set to false and I'm using manual mode for the acknowledgement which is required to use executors according to the documentation.
My listener looks like this:
@KafkaListener(topics = "my_topic",
        group = "my_group",
        containerFactory = "kafkaListenerContainerFactory")
public void listen(@Payload SomeJob job, Acknowledgment ack) {
    ack.acknowledge();
    logger.info("Running job {}", job.getId());
    ....
}
Using the Admin Server I can inspect all the threads, and only one kafka-N-processor-N thread is being created, although I expected to see up to 200. The jobs all run one at a time on that one thread, and I can't figure out why.
How can I get this setup to run the listeners using my executor with as many threads as possible?
I'm using Spring Boot 1.5.4.RELEASE and kafka 0.11.0.0.
If your topic has only one partition, then according to the consumer group policy only one consumer is able to poll that partition.
The ConcurrentMessageListenerContainer does create as many target KafkaMessageListenerContainer instances as the provided concurrency, but only when it does not know the number of partitions in the topic.
When the consumer group rebalances, only one consumer gets the partition to consume. All the work is then done there on a single thread:
private void startInvoker() {
    ListenerConsumer.this.invoker = new ListenerInvoker();
    ListenerConsumer.this.listenerInvokerFuture = this.containerProperties.getListenerTaskExecutor()
            .submit(ListenerConsumer.this.invoker);
}
One partition means one thread, which processes the records sequentially.
