Question on Spring Kafka Listener Consumer Offset Acknowledgement - spring-boot

I have created the below consumer factory.
@Bean
public ConcurrentKafkaListenerContainerFactory<String, Object> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, Object> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setAutoStartup(autoStart);
    factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);
    return factory;
}
The Kafka listener is given below.
@KafkaListener(id = "${topic1}",
        topics = "${topic1}",
        groupId = "${consumer.group1}", concurrency = "1", containerFactory = "kafkaListenerContainerFactory")
public void consumeEvents1(String jsonObject, @Headers Map<String, String> header, Acknowledgment acknowledgment) {
    LOG.info("Message - {}", jsonObject);
    LOG.info(header.get(KafkaHeaders.GROUP_ID) + header.get(KafkaHeaders.RECEIVED_TOPIC) + String.valueOf(header.get(KafkaHeaders.OFFSET)));
    acknowledgment.acknowledge();
}
In the consumer factory, I did not set factory.setBatchListener(true). My understanding is that the above listener code is called for each message, since it is not a batch listener, and that is the behavior I saw. With a batch listener, I get a list of messages instead of one message at a time.
As the listener is not batch-based, is acknowledgment.acknowledge() going to behave the same way for MANUAL and MANUAL_IMMEDIATE? Is that the correct understanding?
I referred to the material below.

With MANUAL, the commit is queued until the whole batch is processed; this is more efficient, but increases the possibility of getting redeliveries.
With MANUAL_IMMEDIATE, the commit occurs right away, as long as you call it on the listener thread.
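For contrast with the record listener above, here is a minimal sketch of the batch variant, where the listener receives the whole poll as a list (the bean name batchFactory is illustrative, and it assumes the same consumerFactory() bean as above):

@Bean
public ConcurrentKafkaListenerContainerFactory<String, Object> batchFactory() {
    ConcurrentKafkaListenerContainerFactory<String, Object> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setBatchListener(true); // listener now receives a List per poll
    factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);
    return factory;
}

@KafkaListener(topics = "${topic1}", groupId = "${consumer.group1}", containerFactory = "batchFactory")
public void consumeBatch(List<String> messages, Acknowledgment acknowledgment) {
    messages.forEach(m -> LOG.info("Message - {}", m));
    acknowledgment.acknowledge(); // acknowledges the whole batch
}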

Related

Spring Kafka - Display Custom Error Message with @Retry

I am currently using Spring Kafka to consume messages from a topic, along with Spring's @Retry. So basically, I retry processing the consumed message in case of an error. But while doing so, I want to avoid the exception message thrown by KafkaMessageListenerContainer. Instead, I want to display a custom message. I tried adding an error handler in the ConcurrentKafkaListenerContainerFactory, but on doing so my retry does not get invoked.
Can someone guide me on how to display a custom exception message along with the @Retry scenario as well? Below are my code snippets:
ConcurrentKafkaListenerContainerFactory Bean Config
@Bean
ConcurrentKafkaListenerContainerFactory<?, ?> concurrentKafkaListenerContainerFactory(ConcurrentKafkaListenerContainerFactoryConfigurer configurer, ConsumerFactory<Object, Object> kafkaConsumerFactory) {
    ConcurrentKafkaListenerContainerFactory<Object, Object> kafkaListenerContainerFactory =
            new ConcurrentKafkaListenerContainerFactory<>();
    configurer.configure(kafkaListenerContainerFactory, kafkaConsumerFactory);
    kafkaListenerContainerFactory.setConcurrency(1);
    // Set Error Handler
    /*kafkaListenerContainerFactory.setErrorHandler(((thrownException, data) -> {
        log.info("Retries exhausted");
    }));*/
    return kafkaListenerContainerFactory;
}
Kafka Consumer
@KafkaListener(
        topics = "${spring.kafka.reprocess-topic}",
        groupId = "${spring.kafka.consumer.group-id}",
        containerFactory = "concurrentKafkaListenerContainerFactory"
)
@Retryable(
        include = RestClientException.class,
        maxAttemptsExpression = "${spring.kafka.consumer.max-attempts}",
        backoff = @Backoff(delayExpression = "${spring.kafka.consumer.backoff-delay}")
)
public void onMessage(ConsumerRecord<String, String> consumerRecord) throws Exception {
    // Consume the record
    log.info("Consumed Record from topic : {} ", consumerRecord.topic());
    // process the record
    messageHandler.handleMessage(consumerRecord.value());
}
Below is the exception that I am getting:
You should not use @Retryable as well as the SeekToCurrentErrorHandler (which has been the default error handler since 2.5, so I presume you are using that version).
Instead, configure a custom SeekToCurrentErrorHandler with max attempts, back off, and retryable exceptions.
That error message is normal; it is logged by the container. Its logging level can be reduced from ERROR to INFO or DEBUG by setting the logLevel property on the SeekToCurrentErrorHandler. You can also add a custom recoverer to it, to log your custom message after the retries are exhausted.
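A minimal sketch of that suggestion (the attempt count, back-off values, and log message here are illustrative, not taken from the question; the SeekToCurrentErrorHandler constructor taking a recoverer and a FixedBackOff, from org.springframework.util.backoff, is available in spring-kafka 2.3 and later):

@Bean
ConcurrentKafkaListenerContainerFactory<?, ?> concurrentKafkaListenerContainerFactory(
        ConcurrentKafkaListenerContainerFactoryConfigurer configurer,
        ConsumerFactory<Object, Object> kafkaConsumerFactory) {
    ConcurrentKafkaListenerContainerFactory<Object, Object> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    configurer.configure(factory, kafkaConsumerFactory);
    // Retry in the error handler instead of @Retryable:
    // FixedBackOff(1000L, 2L) = two more delivery attempts, 1 second apart, after the first failure.
    SeekToCurrentErrorHandler errorHandler = new SeekToCurrentErrorHandler(
            (record, exception) -> log.error("Retries exhausted for offset {}", record.offset()), // custom recoverer
            new FixedBackOff(1000L, 2L));
    errorHandler.setLogLevel(KafkaException.Level.INFO); // reduce the container's ERROR logging
    factory.setErrorHandler(errorHandler);
    return factory;
}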
My event retry template:
@Bean(name = "eventRetryTemplate")
public RetryTemplate eventRetryTemplate() {
    RetryTemplate template = new RetryTemplate();
    ExceptionClassifierRetryPolicy retryPolicy = new ExceptionClassifierRetryPolicy();
    Map<Class<? extends Throwable>, RetryPolicy> policyMap = new HashMap<>();
    policyMap.put(NonRecoverableException.class, new NeverRetryPolicy());
    policyMap.put(RecoverableException.class, new AlwaysRetryPolicy());
    retryPolicy.setPolicyMap(policyMap);
    ExponentialBackOffPolicy backOffPolicy = new ExponentialBackOffPolicy();
    backOffPolicy.setInitialInterval(backoffInitialInterval);
    backOffPolicy.setMaxInterval(backoffMaxInterval);
    backOffPolicy.setMultiplier(backoffMultiplier);
    template.setRetryPolicy(retryPolicy);
    template.setBackOffPolicy(backOffPolicy);
    return template;
}
My Kafka listener using the retry template:
@KafkaListener(
        groupId = "${kafka.consumer.group.id}",
        topics = "${kafka.consumer.topic}",
        containerFactory = "eventContainerFactory")
public void eventListener(ConsumerRecord<String, String> events, Acknowledgment acknowledgment) {
    eventRetryTemplate.execute(retryContext -> {
        retryContext.setAttribute(EVENT, "my-event");
        eventConsumer.consume(events, acknowledgment);
        return null;
    });
}
My Kafka consumer container factory:
private ConcurrentKafkaListenerContainerFactory<String, String> getConcurrentKafkaListenerContainerFactory(
        KafkaProperties kafkaProperties) {
    ConcurrentKafkaListenerContainerFactory<String, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.getContainerProperties().setAckMode(AckMode.MANUAL_IMMEDIATE);
    factory.getContainerProperties().setAckOnError(Boolean.TRUE);
    kafkaErrorEventHandler.setCommitRecovered(Boolean.TRUE);
    factory.setErrorHandler(kafkaErrorEventHandler);
    factory.setConcurrency(1);
    factory.setConsumerFactory(getEventConsumerFactory(kafkaProperties));
    return factory;
}
The kafkaErrorEventHandler is my custom error handler; it extends SeekToCurrentErrorHandler and overrides the handle method, somewhat like this:
@Override
public void handle(Exception thrownException, List<ConsumerRecord<?, ?>> records,
        Consumer<?, ?> consumer, MessageListenerContainer container) {
    log.info("Non recoverable exception. Publishing event to Database");
    super.handle(thrownException, records, consumer, container);
    ConsumerRecord<String, String> consumerRecord = (ConsumerRecord<String, String>) records.get(0);
    FailedEvent event = createFailedEvent(thrownException, consumerRecord);
    failedEventService.insertFailedEvent(event);
    log.info("Successfully Published eventId {} to Database...", event.getEventId());
}
Here failedEventService is another custom class of mine, which puts these failed events into a queryable relational DB (I chose this to be my DLQ).

Error handling - Consumer - apache kafka and spring

I am learning to use Kafka. I have two services, a producer and a consumer.
The producer produces messages that require processing (queries to services and a database). These messages are received by the consumer, which is responsible for processing them and saving the result in a database.
Producer
@Autowired
private KafkaTemplate<String, String> kafkaTemplate;
...
kafkaTemplate.send(topic, message);
Consumer
@KafkaListener(topics = "....")
public void listen(@Payload String message) {
    ....
}
I would like all messages to be processed correctly by the consumer.
I do not know how to handle errors on the consumer side in this context. For example, a database might be temporarily down and unable to handle certain messages.
What should I do in these cases?
I know that the responsibility belongs to the consumer.
I could do retries, but retrying several times in a row while a database is down does not seem like a good idea. And if I continue to consume messages, the offset advances and I lose the events that I failed to process.
You have control over the Kafka consumer in the form of committing the offset of records read. Kafka will continue to return the same records unless the offset is committed. You can set offset committing to manual and, based on the success of your business logic, decide whether to commit or not. See a sample below:
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "test");
props.put("enable.auto.commit", "false");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
final int minBatchSize = 200;
List<ConsumerRecord<String, String>> buffer = new ArrayList<>();
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        buffer.add(record);
    }
    if (buffer.size() >= minBatchSize) {
        insertIntoDb(buffer);
        consumer.commitSync();
        buffer.clear();
    }
}
consumer.commitSync() commits the offset.
Also see the Kafka consumer documentation to understand consumer offsets.
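A sketch of the same idea, committing record by record instead of in batches, so the committed offset only moves past records that were handled successfully (not part of the original answer; processRecord() is a hypothetical stand-in for your business logic):

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        try {
            processRecord(record); // hypothetical business logic
            // Commit the offset of the NEXT record to read for this partition.
            consumer.commitSync(Collections.singletonMap(
                    new TopicPartition(record.topic(), record.partition()),
                    new OffsetAndMetadata(record.offset() + 1)));
        } catch (Exception e) {
            // The offset stays uncommitted; a restarted consumer (or one that seeks back)
            // will see this record again.
            break;
        }
    }
}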
This link was very helpful https://dzone.com/articles/spring-for-apache-kafka-deep-dive-part-1-error-han
Spring provides the DeadLetterPublishingRecoverer class, which performs correct handling of errors.
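A minimal sketch of wiring it in (assumes spring-kafka 2.3+ and an available KafkaOperations/KafkaTemplate bean); after the retries are exhausted, the failed record is published to a "<topic>.DLT" topic by default. Depending on your Boot version you may need to set the handler on the factory explicitly with factory.setErrorHandler(...):

@Bean
public SeekToCurrentErrorHandler errorHandler(KafkaOperations<Object, Object> template) {
    DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(template);
    // Two retries, one second apart, then publish to the dead-letter topic.
    return new SeekToCurrentErrorHandler(recoverer, new FixedBackOff(1000L, 2L));
}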

How to set acknowledgement in case when retry gets exhausted in Kafka consumer

I have a Kafka consumer which retries 5 times, and I am using Spring Kafka with a retry template. Now, if all retries fail, how does acknowledgement work in that case? Also, if I have set the acknowledge mode to manual, how do I acknowledge those messages?
Consumer
#Bean("kafkaListenerContainerFactory")
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(RetryTemplate retryTemplate) {
ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
factory.setConsumerFactory(consumerFactory());
factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL_IMMEDIATE);
factory.setRetryTemplate(retryTemplate);
factory.setRecoveryCallback(context -> {
log.error("Maximum retry policy has been reached {}", context.getAttribute("record"));
Acknowledgment ack = (Acknowledgment) context.getAttribute(RetryingMessageListenerAdapter.CONTEXT_ACKNOWLEDGMENT);
ack.acknowledge();
return null;
});
factory.setConcurrency(Integer.parseInt(kafkaConcurrency));
return factory;
}
Kafka Listener
#KafkaListener(topics = "${kafka.topic.json}", containerFactory = "kafkaListenerContainerFactory")
public void recieveSegmentService(String KafkaPayload, Acknowledgment acknowledgment) throws Exception {
KafkaSegmentTrigger kafkaSegmentTrigger;
kafkaSegmentTrigger = TransformUtil.fromJson(KafkaPayload, KafkaSegmentTrigger.class);
log.info("Trigger recieved from segment service {}", kafkaSegmentTrigger);
processMessage(kafkaSegmentTrigger);
acknowledgment.acknowledge();
}
If you have a RecoveryCallback that handles the record when retries are exhausted, and you are NOT using manual acks, the container will commit the offset.
If you are using MANUAL acks, you can commit the offset in the recovery callback.
The context object passed to the callback has an attribute RetryingMessageListenerAdapter.CONTEXT_ACKNOWLEDGMENT ('acknowledgment').
It also has CONTEXT_CONSUMER and CONTEXT_RECORD.
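A sketch of pulling those attributes out in the recovery callback (the log message is illustrative, and MANUAL ack mode is assumed):

factory.setRecoveryCallback(context -> {
    ConsumerRecord<?, ?> record = (ConsumerRecord<?, ?>)
            context.getAttribute(RetryingMessageListenerAdapter.CONTEXT_RECORD);
    Consumer<?, ?> consumer = (Consumer<?, ?>)
            context.getAttribute(RetryingMessageListenerAdapter.CONTEXT_CONSUMER);
    Acknowledgment ack = (Acknowledgment)
            context.getAttribute(RetryingMessageListenerAdapter.CONTEXT_ACKNOWLEDGMENT);
    log.error("Retries exhausted for {}", record);
    ack.acknowledge(); // with MANUAL acks, commit so the failed record is not redelivered
    return null;
});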

Spring kafka Batch Listener- commit offsets manually in Batch

I am implementing a Spring Kafka batch listener, which reads a list of messages from a Kafka topic and posts the data to a REST service.
I would like to understand the offset management: in case the REST service goes down, the offsets for the batch should not be committed and the messages should be processed again on the next poll. I have read the Spring Kafka documentation, but there is confusion in understanding the difference between the Listener Error Handler and the Seek To Current container error handlers for batches. I am using spring-boot-2.0.0.M7 and below is my code.
Listener Config:
@Bean
KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setConcurrency(Integer.parseInt(env.getProperty("spring.kafka.listener.concurrency")));
    // factory.getContainerProperties().setPollTimeout(3000);
    factory.getContainerProperties().setBatchErrorHandler(kafkaErrorHandler());
    factory.getContainerProperties().setAckMode(AckMode.BATCH);
    factory.setBatchListener(true);
    return factory;
}
@Bean
public Map<String, Object> consumerConfigs() {
    Map<String, Object> propsMap = new HashMap<>();
    propsMap.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, env.getProperty("spring.kafka.bootstrap-servers"));
    propsMap.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG,
            env.getProperty("spring.kafka.consumer.enable-auto-commit"));
    propsMap.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG,
            env.getProperty("spring.kafka.consumer.auto-commit-interval"));
    propsMap.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, env.getProperty("spring.kafka.session.timeout"));
    propsMap.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    propsMap.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    propsMap.put(ConsumerConfig.GROUP_ID_CONFIG, env.getProperty("spring.kafka.consumer.group-id"));
    return propsMap;
}
Listener Class:
#KafkaListener(topics = "${spring.kafka.consumer.topic}", containerFactory = "kafkaListenerContainerFactory")
public void listen(List<String> payloadList) throws Exception {
if (payloadList.size() > 0)
//Post to the service
}
Kafka Error Handler:
public class KafkaErrorHandler implements BatchErrorHandler {

    private static Logger LOGGER = LoggerFactory.getLogger(KafkaErrorHandler.class);

    @Override
    public void handle(Exception thrownException, ConsumerRecords<?, ?> data) {
        LOGGER.info("Exception occurred while processing::" + thrownException.getMessage());
    }
}
How do I handle the Kafka listener so that, if something happens while processing a batch of records, I don't lose data?
With Apache Kafka we never lose data. There is indeed an offset in the partition log, so we can seek to any arbitrary position.
On the other hand, when we consume records from a partition there is no requirement to commit their offsets - the current consumer holds its state in memory. We only need to commit for the benefit of other, new consumers in the same group, which take over when the current one is dead. Independently of the error, the current consumer always moves on to poll new data beyond its current in-memory position.
So, to reprocess the same data in the same consumer, we definitely have to use a seek operation to move the consumer back to the desired position. That's why Spring Kafka introduces the SeekToCurrentErrorHandler:
This allows implementations to seek all unprocessed topic/partitions so the current record (and the others remaining) will be retrieved by the next poll. The SeekToCurrentErrorHandler does exactly this.
https://docs.spring.io/spring-kafka/reference/htmlsingle/#_seek_to_current_container_error_handlers
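For a batch listener like the one above, a minimal sketch would plug the batch flavor of that handler into the factory (SeekToCurrentBatchErrorHandler ships with spring-kafka alongside SeekToCurrentErrorHandler; check it is available in your version), so a failed REST call makes the whole batch be redelivered on the next poll instead of being skipped:

factory.getContainerProperties().setBatchErrorHandler(new SeekToCurrentBatchErrorHandler());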

Spring Kafka listenerExecutor

I'm setting up a kafka listener in a spring boot application and I can't seem to get the listener running in a pool using an executor. Here's my kafka configuration:
@Bean
ThreadPoolTaskExecutor messageProcessorExecutor() {
    logger.info("Creating a message processor pool with {} threads", numThreads);
    ThreadPoolTaskExecutor exec = new ThreadPoolTaskExecutor();
    exec.setCorePoolSize(200);
    exec.setMaxPoolSize(200);
    exec.setKeepAliveSeconds(30);
    exec.setAllowCoreThreadTimeOut(true);
    exec.setQueueCapacity(0); // Yields a SynchronousQueue
    exec.setThreadFactory(ThreadFactoryFactory.defaultNamingFactory("kafka", "processor"));
    return exec;
}
@Bean
public ConsumerFactory<String, PollerJob> consumerFactory() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    props.put(ConsumerConfig.GROUP_ID_CONFIG, consumerGroup);
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
    DefaultKafkaConsumerFactory<String, PollerJob> factory = new DefaultKafkaConsumerFactory<>(props,
            new StringDeserializer(),
            new JsonDeserializer<>(PollerJob.class));
    return factory;
}
@Bean
public ConcurrentKafkaListenerContainerFactory<String, PollerJob> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, PollerJob> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setConcurrency(Integer.valueOf(kafkaThreads));
    factory.getContainerProperties().setListenerTaskExecutor(messageProcessorExecutor());
    factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.MANUAL);
    return factory;
}
The ThreadFactoryFactory used by the ThreadPoolTaskExecutor just makes sure the thread is named like 'kafka-1-processor-1'.
The ConsumerFactory has the ENABLE_AUTO_COMMIT_CONFIG flag set to false and I'm using manual mode for the acknowledgement which is required to use executors according to the documentation.
My listener looks like this:
#KafkaListener(topics = "my_topic",
group = "my_group",
containerFactory = "kafkaListenerContainerFactory")
public void listen(#Payload SomeJob job, Acknowledgment ack) {
ack.acknowledge();
logger.info("Running job {}", job.getId());
....
}
Using the Admin Server I can inspect all the threads, and only one kafka-N-processor-N thread is being created, but I expected to see up to 200. The jobs are all running one at a time on that one thread and I can't figure out why.
How can I get this setup to run the listeners using my executor with as many threads as possible?
I'm using Spring Boot 1.5.4.RELEASE and kafka 0.11.0.0.
If your topic has only one partition, then according to the consumer group policy only one consumer is able to poll that partition.
The ConcurrentMessageListenerContainer indeed creates as many target KafkaMessageListenerContainer instances as the provided concurrency, and it does that only when it doesn't know the number of partitions in the topic.
When the rebalance in the consumer group happens, only one consumer gets the partition for consuming. All the work is really done there in a single thread:
private void startInvoker() {
    ListenerConsumer.this.invoker = new ListenerInvoker();
    ListenerConsumer.this.listenerInvokerFuture = this.containerProperties.getListenerTaskExecutor()
            .submit(ListenerConsumer.this.invoker);
}
One partition means one thread for sequential record processing.
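So, to actually get parallelism, the topic needs at least as many partitions as the desired concurrency. A sketch (this assumes a newer spring-kafka with a KafkaAdmin bean that registers NewTopic beans; on the older version in the question, create the partitions with the kafka-topics command-line tool instead):

@Bean
public NewTopic myTopic() {
    // 10 partitions, replication factor 1 - allows up to 10 concurrent consumers in the group
    return new NewTopic("my_topic", 10, (short) 1);
}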
