When will Kafka retry processing the messages that have not been acknowledged? - spring-boot

I have a consumer configured with the manual ack mode:
@Bean
public ConcurrentKafkaListenerContainerFactory<String, MessageAvro> kafkaListenerContainerFactory() {
    final ConcurrentKafkaListenerContainerFactory<String, MessageAvro> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL);
    factory.setConsumerFactory(consumerFactory());
    return factory;
}
And a consumer with a @KafkaListener method that does some work, for example:
@KafkaListener(
        topics = "${tpd.topic-name}",
        containerFactory = "kafkaListenerContainerFactory",
        groupId = "${tpd.group-id}")
public void messageListener(final ConsumerRecord<String, MessageAvro> msg, @Payload final MessageAvro message, final Acknowledgment ack) {
    if (someCondition) {
        // do something
        ack.acknowledge();
    } else {
        // do not acknowledge the message here in order to retry it later.
    }
}
In the case where the condition is false and we go into the "else" branch, when will my consumer try to read the unacknowledged message again?
And if it doesn't, how do I tell my @KafkaListener to take the unacknowledged messages into account?

As soon as you commit (or "acknowledge") an offset, all previous offsets are also committed, in the sense that the consumer group will not try to read them again.
That means: if you hit the "else" branch and your job keeps running until it eventually hits the "if" branch with the acknowledgment, all offsets up to that point are committed.
The reason behind this is that a KafkaConsumer reports back to the brokers which offset to read next. To achieve this, Kafka stores that information in an internal topic called __consumer_offsets as a key/value pair, where
key: ConsumerGroup, Topic name, Partition
value: next offset to read
That internal topic is compacted, which means it eventually keeps only the latest value for a given key. As a consequence, Kafka does not track the "un-acknowledged" messages in between.
Workaround
What people usually do is fork those "un-acknowledged" messages into another topic so they can be inspected and consumed together at a later point in time. That way you will not block your actual application from consuming further messages, and you can deal with the un-acknowledged messages separately.
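As a rough sketch of that pattern (assuming an autowired KafkaTemplate<String, MessageAvro> and a hypothetical retry topic name "my-topic.retry"; someCondition comes from the question), the listener could look like:
@KafkaListener(
        topics = "${tpd.topic-name}",
        containerFactory = "kafkaListenerContainerFactory",
        groupId = "${tpd.group-id}")
public void messageListener(final ConsumerRecord<String, MessageAvro> msg, final Acknowledgment ack) {
    if (someCondition) {
        // happy path: process the record, then commit its offset
        ack.acknowledge();
    } else {
        // fork the record to a separate topic for later inspection/reprocessing,
        // then acknowledge so the main topic is not blocked
        kafkaTemplate.send("my-topic.retry", msg.key(), msg.value());
        ack.acknowledge();
    }
}
The send-then-acknowledge order matters: if the application crashes between the two calls, the record is redelivered rather than lost.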

Related

Kafka consumer error handling offset reset

I am consuming events from Kafka streams in a Spring Boot application, version 2.4. The Kafka client version is 2.3. There are two consumers consuming the events. I want to put the events back into Kafka in case of any error. I do NOT want to put the failed event in a dead letter queue. I am using ConsumerAwareListenerErrorHandler.
@Override
public Object handleError(Message<?> message, ListenerExecutionFailedException exception, Consumer<?, ?> consumer) {
    ConsumerRecord record = (ConsumerRecord) message.getPayload();
    // consumer.seek(new TopicPartition(record.topic(), record.partition()), record.offset());
    Collection collection = Arrays.asList(new TopicPartition(record.topic(), record.partition()));
    consumer.seekToBeginning(collection);
    return null;
}
Now what I want is: if I stop this consumer, the same error event should be consumed by the other running consumer. Kindly help.
Thanks
That won't work because any other records fetched by the previous poll() will still be processed; use a SeekToCurrentErrorHandler instead.
https://docs.spring.io/spring-kafka/docs/2.5.5.RELEASE/reference/html/#seek-to-current
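For reference, a minimal sketch of wiring that up on the container factory (assuming spring-kafka 2.3+; the back-off values are illustrative, and FixedBackOff comes from org.springframework.util.backoff):
// retry the failed record up to 2 more times, 1 second apart, then skip it
factory.setErrorHandler(new SeekToCurrentErrorHandler(new FixedBackOff(1000L, 2L)));
Because the handler seeks the unprocessed partitions back, the failed record (and the rest of that poll) is fetched again on the next poll instead of being processed out of order.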

Error handling - Consumer - apache kafka and spring

I am learning to use Kafka. I have two services, a producer and a consumer.
The producer produces messages that require processing (queries to services and a database). The consumer receives these messages, processes them, and saves the result in a database.
Producer
@Autowired
private KafkaTemplate<String, String> kafkaTemplate;
...
kafkaTemplate.send(topic, message);
Consumer
@KafkaListener(topics = "....")
public void listen(@Payload String message) {
    ....
}
I would like all messages to be processed correctly by the consumer.
I do not know how to handle errors on the consumer side in this context. For example, a database might be temporarily unavailable and unable to handle certain messages.
What to do in these cases?
I know that the responsibility belongs to the consumer.
I could do retries, but retrying several times in a row while a database is down does not seem like a good idea. And if I continue to consume messages, the offset advances and I lose the events that I could not process.
You have control over the Kafka consumer in the form of committing the offsets of the records you read. If the offsets are not committed, the records will be returned again after the consumer restarts or a rebalance occurs. You can set offset commits to manual and, based on the success of your business logic, decide whether to commit or not. See a sample below:
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", "test");
props.put("enable.auto.commit", "false");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("foo", "bar"));
final int minBatchSize = 200;
List<ConsumerRecord<String, String>> buffer = new ArrayList<>();
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(100);
    for (ConsumerRecord<String, String> record : records) {
        buffer.add(record);
    }
    if (buffer.size() >= minBatchSize) {
        insertIntoDb(buffer);
        consumer.commitSync();
        buffer.clear();
    }
}
consumer.commitSync() commits the offsets.
Also see the Kafka consumer documentation to understand consumer offsets here.
This link was very helpful: https://dzone.com/articles/spring-for-apache-kafka-deep-dive-part-1-error-han
Spring provides the DeadLetterPublishingRecoverer class, which implements correct error handling.
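As a rough sketch of wiring it up (assuming spring-kafka 2.3+, an available KafkaTemplate bean, and a listener container factory named factory; the back-off values are illustrative):
// after 2 failed retries, publish the record to the default "<topic>.DLT" dead-letter topic
DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(kafkaTemplate);
factory.setErrorHandler(new SeekToCurrentErrorHandler(recoverer, new FixedBackOff(1000L, 2L)));
For a temporarily unavailable database, a longer or exponential back-off is usually a better fit than immediate retries.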

Kafka listener receiving List<ConsumerRecord<String, String>>, is it possible to consume?

I am super new to Kafka and frankly have no idea about this type of consumer (as far as I understand, it looks like this because it is batch-ready), so I am struggling to figure out how to consume the list of these events.
I have something like this:
@KafkaListener(topics = "#{'${kafka.listener.list-of-topics}'.split(',')}")
public void readMessage(List<ConsumerRecord<String, String>> records,
        final Acknowledgment acknowledgment) {
    try {
        ....
I know that when I receive an event (at least a single one) it is of type "MyObject", so I can handle it fine when I get a single message.
I believe there must be a way to read/cast this List<ConsumerRecord<String, String>>, but I cannot figure out how.
Any ideas?
See the reference manual: Batch Listeners.
Starting with version 1.1, @KafkaListener methods can be configured to receive the entire batch of consumer records received from the consumer poll. To configure the listener container factory to create batch listeners, set the batchListener property:
@Bean
public KafkaListenerContainerFactory<?> batchFactory() {
    ConcurrentKafkaListenerContainerFactory<Integer, String> factory =
            new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setBatchListener(true); // <<<<<<<<<<<<<<<<<<<<<<<<<
    return factory;
}
...
You can also receive a list of ConsumerRecord<?, ?> objects but it must be the only parameter (aside from optional Acknowledgment, when using manual commits, and/or Consumer<?, ?> parameters) defined on the method:
...
When using Spring Boot, set the property spring.kafka.listener.type=batch.
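As a rough sketch (assuming the record values are JSON, a Jackson ObjectMapper named objectMapper is available, and MyObject is the domain type from the question), the batch listener can iterate the records and deserialize each value itself:
@KafkaListener(topics = "#{'${kafka.listener.list-of-topics}'.split(',')}",
        containerFactory = "batchFactory")
public void readMessage(List<ConsumerRecord<String, String>> records,
        final Acknowledgment acknowledgment) throws Exception {
    for (ConsumerRecord<String, String> record : records) {
        // each value is still a plain String here; map it to the domain type yourself
        MyObject event = objectMapper.readValue(record.value(), MyObject.class);
        // ... process event ...
    }
    acknowledgment.acknowledge();
}
Alternatively, configure a JsonDeserializer for the value so the listener can receive the domain objects directly instead of raw Strings.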

Spring kafka Batch Listener- commit offsets manually in Batch

I am implementing a Spring Kafka batch listener, which reads a list of messages from a Kafka topic and posts the data to a REST service.
I would like to understand the offset management in case the REST service goes down: the offsets for the batch should not be committed and the messages should be processed again on the next poll. I have read the Spring Kafka documentation, but I am confused about the difference between the listener error handler and the seek-to-current container error handlers for batches. I am using spring-boot-2.0.0.M7 and below is my code.
Listener Config:
@Bean
KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setConcurrency(Integer.parseInt(env.getProperty("spring.kafka.listener.concurrency")));
    // factory.getContainerProperties().setPollTimeout(3000);
    factory.getContainerProperties().setBatchErrorHandler(kafkaErrorHandler());
    factory.getContainerProperties().setAckMode(AckMode.BATCH);
    factory.setBatchListener(true);
    return factory;
}
@Bean
public Map<String, Object> consumerConfigs() {
    Map<String, Object> propsMap = new HashMap<>();
    propsMap.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, env.getProperty("spring.kafka.bootstrap-servers"));
    propsMap.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG,
            env.getProperty("spring.kafka.consumer.enable-auto-commit"));
    propsMap.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG,
            env.getProperty("spring.kafka.consumer.auto-commit-interval"));
    propsMap.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, env.getProperty("spring.kafka.session.timeout"));
    propsMap.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    propsMap.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    propsMap.put(ConsumerConfig.GROUP_ID_CONFIG, env.getProperty("spring.kafka.consumer.group-id"));
    return propsMap;
}
Listener Class:
@KafkaListener(topics = "${spring.kafka.consumer.topic}", containerFactory = "kafkaListenerContainerFactory")
public void listen(List<String> payloadList) throws Exception {
    if (payloadList.size() > 0) {
        // Post to the service
    }
}
Kafka Error Handler:
public class KafkaErrorHandler implements BatchErrorHandler {
    private static Logger LOGGER = LoggerFactory.getLogger(KafkaErrorHandler.class);

    @Override
    public void handle(Exception thrownException, ConsumerRecords<?, ?> data) {
        LOGGER.info("Exception occurred while processing::" + thrownException.getMessage());
    }
}
How should I handle the Kafka listener so that if something happens while processing a batch of records, I don't lose data?
With Apache Kafka you never lose the data: the partition log keeps the records, and there is always an offset you can seek back to.
On the other hand, when we consume records from a partition there is no requirement to commit their offsets - the current consumer holds its position in memory. Committing only matters for other, new consumers in the same group that take over after the current one dies. Regardless of any error, the current consumer always moves on and polls new data from its current in-memory position.
So, to reprocess the same data in the same consumer we definitely have to use a seek operation to move the consumer back to the desired position. That's why Spring Kafka introduces SeekToCurrentErrorHandler:
This allows implementations to seek all unprocessed topics/partitions so the current record (and the others remaining) will be retrieved by the next poll. The SeekToCurrentErrorHandler does exactly this.
https://docs.spring.io/spring-kafka/reference/htmlsingle/#_seek_to_current_container_error_handlers
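For a batch listener like the one above, a minimal sketch (assuming a spring-kafka version that ships SeekToCurrentBatchErrorHandler) would be to replace the logging-only handler so an unprocessed batch is re-polled instead of skipped:
// seek all partitions of the failed batch back, so the next poll retries the whole batch
factory.getContainerProperties().setBatchErrorHandler(new SeekToCurrentBatchErrorHandler());
Be aware that this re-delivers the whole batch, so the REST call should be idempotent.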

Spring handling RabbitMQ messages concurrently

I am fairly new to message-handling with Spring, so bear with me.
I would like my RabbitMQ message-handler to handle messages concurrently in several threads.
@Component
public class ConsumerService {

    @RabbitListener(queues = {"q"})
    public void messageHandler(@Payload M msg) {
        System.out.println(msg);
    }
}
...
@Configuration
@Import({MessageConverterConfiguration.class, ConsumerService.class})
public class ConsumerConfiguration {

    @Autowired
    private ConnectionFactory connectionFactory;

    @Bean
    public List<Declarable> declarations() {
        return Arrays.asList(
                new DirectExchange("e", true, false),
                new Queue("q", true, false, false),
                new Binding("q", Binding.DestinationType.QUEUE, "e", "q", null)
        );
    }

    @Bean
    public SimpleRabbitListenerContainerFactory rabbitListenerContainerFactory(MessageConverter contentTypeConverter, SimpleRabbitListenerContainerFactoryConfigurer configurer) {
        SimpleRabbitListenerContainerFactory factory = new SimpleRabbitListenerContainerFactory();
        factory.setConcurrentConsumers(10);
        configurer.configure(factory, connectionFactory);
        factory.setMessageConverter(contentTypeConverter);
        return factory;
    }
}
In my small test there are 4 messages on queue "q". I get to process them all; that is fine. But I process them one by one. If I set a breakpoint in ConsumerService.messageHandler (essentially delaying the completion of handling a message), I would like to end up with 4 threads sitting in that breakpoint. But I never have more than one thread: as soon as I let it run to complete handling of a message, the next message gets handled. What do I need to do to handle the messages concurrently?
There are two ways of achieving this:
Either use a thread pool to handle message processing at your consumer,
or create multiple consumers.
I see you are using the concurrentConsumers property to have Spring AMQP create multiple consumers automatically. Try setting prefetchCount to 1 and also set maxConcurrentConsumers.
Most probably, because the default prefetch count is large, a single consumer prefetches all the messages already present on the queue and consumes them alone.
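As a rough sketch of those suggestions applied to the factory from the question (the numbers are only illustrative):
factory.setConcurrentConsumers(4);      // start several consumers up front
factory.setMaxConcurrentConsumers(10);  // allow scaling further under load
factory.setPrefetchCount(1);            // stop one consumer from grabbing the whole queue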
Sorry, I forgot to write that I got it working. Essentially what I have now is:
...
factory.setConcurrentConsumers(10);
factory.setMaxConcurrentConsumers(20);
factory.setConsecutiveActiveTrigger(1);
factory.setConsecutiveIdleTrigger(1);
factory.setPrefetchCount(100);
...
I do believe that with concurrentConsumers alone it will eventually (under enough load) handle messages in parallel. The problem was that I had only 4 messages in my little test, and it never bothered to activate more than one consumer thread for that. Setting consecutiveActiveTrigger to 1 helps here. I guess prefetchCount also plays a role. Anyway, case closed.
