Spring Kafka doesn't respect max.poll.records (strange behavior)

Well, I'm trying the following scenario:
In application.properties, set max.poll.records to 50.
In application.properties, set enable-auto-commit=false and ack-mode to manual.
In my @KafkaListener method, I don't commit any message; I just read and log it, but never acknowledge it.
My Kafka topic has 500 messages to be consumed, so I'm expecting the following behavior:
Spring Kafka polls 50 messages (offsets 0 to 50).
Since I don't commit anything, the 50 messages are only logged.
On the next poll(), Spring Kafka returns the same 50 messages (offsets 0 to 50) as in step 1. In my understanding, Spring Kafka should keep looping over steps 1-3, always reading the same messages.
But what actually happens is the following:
Spring Kafka polls 50 messages (offsets 0 to 50).
Since I don't commit anything, the 50 messages are only logged.
On the next poll(), Spring Kafka returns the NEXT 50 messages (offsets 50 to 100), different from step 1.
Spring Kafka reads all 500 messages in blocks of 50 without committing anything. If I shut the application down and start it again, the 500 messages are received again.
So, my doubts:
If I configured max.poll.records to 50, how does Spring Kafka get the next 50 records when I didn't commit anything? My understanding was that poll() should return the same records.
Does Spring Kafka have some cache? If so, this could be a problem if I end up with 1 million uncommitted records in the cache.

Your first question:
If I configured max.poll.records to 50, how does Spring Kafka get the next 50 records when I didn't commit anything? My understanding was that poll() should return the same records.
First, to be sure you are not committing anything, you need to understand the following three parameters (which I believe you already do).
ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG: set it to false (which is also what Spring Kafka recommends). When it is false, note that auto.commit.interval.ms becomes irrelevant. Check out this documentation:
Because the listener container has its own mechanism for committing
offsets, it prefers the Kafka ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG
to be false. Starting with version 2.3, it unconditionally sets it to
false unless specifically set in the consumer factory or the
container’s consumer property overrides.
factory.getContainerProperties().setAckMode(AckMode.MANUAL); You take responsibility for acknowledging (ignored when transactions are being used), and ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG can't be true at the same time.
factory.getContainerProperties().setSyncCommits(true/false); Sets whether the container calls consumer.commitSync() or commitAsync() when it is responsible for commits (default true). This only controls how the commit is sent to Kafka: if true, the call blocks until Kafka responds.
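For reference, here is a minimal sketch (not the original poster's configuration) of how those three settings are typically wired together in a Java @Configuration class, assuming Spring Kafka 2.3+; the bootstrap address, group id and String deserializers are assumptions for illustration:

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.listener.ContainerProperties;

@Configuration
public class KafkaConsumerConfig {

    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // assumption
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);           // 1. no auto commit
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 50);                // the 50-record batches from the question
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        return new DefaultKafkaConsumerFactory<>(props);
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.MANUAL); // 2. manual acks
        factory.getContainerProperties().setSyncCommits(true);                           // 3. blocking commits
        return factory;
    }
}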
Secondly, no, the consumer's poll() will not return the same records. The currently running consumer tracks its position in memory internally, independent of the committed offsets. Please also see @GaryRussell's explanation here.
In short, he explained:
Once the records have been returned by the poll (and offsets not
committed), they won't be returned again unless you restart the
consumer or perform seek() operations on the consumer to reset the
offset to the unprocessed ones.
Your second question:
Does Spring Kafka have some cache? If so, this could be a problem if I end up with 1 million uncommitted records in the cache.
There is no "cache", it's all about offsets and commits, explanation as per above.
Now, to achieve what you wanted, you can consider doing one of two things after fetching the first 50 records, i.e. before the next poll():
Either restart the container programmatically,
Or call consumer.seek(partition, offset); (see the sketch below).
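As an illustration of the second option, here is a hedged sketch of a consumer-aware listener that rewinds to the record it just read, so the next poll() delivers it again; the topic name, group id and the rewind-every-record logic are illustrative assumptions, not the original poster's code:

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.TopicPartition;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class RereadingListener {

    private static final Logger log = LoggerFactory.getLogger(RereadingListener.class);

    // "my-topic" and "demo-group" are illustrative assumptions.
    @KafkaListener(topics = "my-topic", groupId = "demo-group")
    public void listen(ConsumerRecord<String, String> record, Consumer<String, String> consumer) {
        log.info("offset={} value={}", record.offset(), record.value());
        // Rewind this partition to the record we just read, so the next poll()
        // re-delivers it. This is safe because the listener runs on the consumer thread.
        consumer.seek(new TopicPartition(record.topic(), record.partition()), record.offset());
    }
}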
BONUS:
Whatever configuration you choose, you can always check the results by looking at the LAG column of this output:
kafka-consumer-groups.bat --bootstrap-server localhost:9091 --describe --group your_group_name

A consumer not committing the offset only has an impact in situations like:
Your consumer crashed after reading 200 messages; when you restart it, it will start again from 0.
Your consumer is no longer assigned a partition.
So in a perfect world, you don't need to commit at all and the consumer will still read all the messages, because it first asks for 1-50, then 51-100, and so on.
But if the consumer crashes, nobody knows which offset it had reached. If the consumer had committed its offsets, then on restart it can check the offsets topic to see where the crashed consumer left off and continue from there.
max.poll.records defines how many records to fetch at one go but it does not define which records to fetch.
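To see this with the plain Kafka client (independent of Spring), here is a minimal sketch; the broker address, topic and group id are assumptions. Even with auto-commit disabled and no manual commits, each poll() returns the next batch, because the in-memory position advances:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class NoCommitDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumption
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");              // assumption
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "50");
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic")); // assumption
            for (int i = 0; i < 3; i++) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
                // No commitSync()/commitAsync() here, yet each iteration prints new offsets.
                records.forEach(r -> System.out.printf("partition=%d offset=%d%n", r.partition(), r.offset()));
            }
        }
        // Restarting this program rewinds to the last committed offset (there is none),
        // so the messages are delivered again, matching the behavior described above.
    }
}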

Related

Polling behavior when using ReactiveKafkaConsumerTemplate

I have a Spring Boot application using ReactiveKafkaConsumerTemplate to consume messages from Kafka.
I consume messages using kafkaConsumerTemplate.receive(), so I'm manually acknowledging each message. Since I'm working asynchronously, messages are not processed sequentially.
I'm wondering how the commit and poll process works in this scenario: if I polled 100 messages but acknowledged only 99 of them (the unacknowledged message is in the middle of the 100 I polled, say number 50), what happens on the next poll? Will it poll again only after all 100 messages are acknowledged (and the offset is committed), and until then keep delivering the unacknowledged message over and over until I acknowledge it?
Kafka maintains 2 offsets for a consumer group/partition - the current position() and the committed offset. When a consumer starts, the position is set to the last committed offset.
Position is updated after each poll, so the next poll will never return the same record, regardless of whether it has been committed (unless a seek is performed).
However, with reactor, you must ensure that commits are performed in the right order, since records are not acknowledged individually, just the committed offset is retained.
If you commit out of order and restart your app, you may get some processed messages redelivered.
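To make the two offsets concrete, here is a small sketch using the plain consumer API; the partition is assumed to be already assigned and polled at least once, and the Set-based committed() call assumes kafka-clients 2.4+. position() advances with every poll, while committed() only moves when a commit happens:

import java.util.Collections;
import java.util.Map;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class OffsetInspection {
    // Prints the two offsets Kafka tracks for a partition. The consumer is assumed
    // to be already assigned this partition and to have polled at least once.
    static void printOffsets(KafkaConsumer<String, String> consumer, TopicPartition tp) {
        long position = consumer.position(tp); // advances after every poll()
        Map<TopicPartition, OffsetAndMetadata> committed = consumer.committed(Collections.singleton(tp));
        OffsetAndMetadata committedOffset = committed.get(tp); // null if nothing was ever committed
        System.out.printf("position=%d committed=%s%n",
                position, committedOffset == null ? "none" : committedOffset.offset());
    }
}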
We recently added support in the framework for out-of-order commits.
https://projectreactor.io/docs/kafka/release/reference/#_out_of_order_commits
The current version is 1.3.11, including this feature.

How does the Kafka poll method work behind the scenes in Spring Boot?

In Spring for Apache Kafka, I see that the default max-poll-records value is 500.
My question is: if 500 messages are not present in the topic, will the consumer wait until it has 500 records before the poll method returns a batch?
I am a bit confused about what checks happen before messages are pulled from the topic.
Kafka uses a hybrid polling strategy: usually a combination of a number of records (or bytes) and a time interval.
All of these properties can be overridden to fit your consumption expectations.
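For example, the broker-side wait is governed by fetch.min.bytes and fetch.max.wait.ms, while max.poll.records only caps how many records a single poll() hands back; poll() returns as soon as data is available, it does not wait for 500 records. A hedged sketch of the relevant settings (the values are illustrative only):

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class PollTuning {
    // Illustrative values only; tune them for your own throughput/latency trade-off.
    static Map<String, Object> pollTuningProps() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 500);  // upper bound per poll(), not a minimum
        props.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1);     // broker replies as soon as any data exists
        props.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 500); // ...or after this many ms, even if empty
        return props;
    }
}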

What will happen if my Kafka consumer group is changed after each restart?

Let's say, for instance, my Kafka consumer (in consumer group 1) is reading messages from Kafka topic A.
Now suppose that consumer consumes 12 messages before failing.
When the consumer starts up again, it has a different consumer group (i.e. consumer group 2).
Question 1: On restart, will it continue from where it left off (because that offset is stored by Kafka and/or ZooKeeper), or will it start consuming from the first message?
Question 2: Is there a way to ensure that on restart (when the consumer has a different consumer group), it still starts consuming from where it left off before restarting?
Just to give you context, I am trying to update in-memory caches on each node/server when a message arrives on a Kafka topic. To do that, I am using a different consumer group per node/server so that each message is consumed by all nodes/servers. Please let me know if there are better ways to do this. Thanks!
Consumer offsets are maintained per consumer group and hence if you have a different consumer group on each restart you can make use of the auto.offset.reset property
The auto.offset.reset property specifies
What to do when there is no initial offset in Kafka or if the current offset does not exist any more on the server (e.g. because that data has been deleted):
earliest: automatically reset the offset to the earliest offset
latest: automatically reset the offset to the latest offset
none: throw an exception to the consumer if no previous offset is found for the consumer's group
anything else: throw an exception to the consumer
Regarding the current approach: I believe you should revisit the design. It is fine to have a different consumer group per node, but make sure each node keeps the same consumer group name even after a restart (see the sketch below). This is a suggestion based on the information provided; there could be better solutions after going into the details of the design/implementation.
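One hedged way to do that, assuming each node exposes a stable identifier (here a hypothetical node.id property set per machine), is to derive the group id from it so the same group is reused across restarts:

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class CacheUpdateListener {

    // "cache-topic" and the "node.id" property are illustrative assumptions; node.id
    // should be something stable per machine (hostname, instance id, ...), so the
    // same group id is reused after every restart and committed offsets are honored.
    @KafkaListener(topics = "cache-topic", groupId = "cache-updater-${node.id}")
    public void onMessage(String message) {
        // update the local in-memory cache here
    }
}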

Spring @KafkaListener auto commit offset or manual: which is recommended?

From what I read on the internet, a method annotated with Spring's @KafkaListener will commit the offset every 5 seconds by default.
Suppose the offset is committed after 5 seconds but processing is still going on, and in between the consumer crashes because of some issue. In that case, after rebalancing, the partition is assigned to another consumer, which starts processing from the next message because the previous message's offset was already committed.
This will result in the loss of that message.
So, do I need to commit the offset manually after processing completes? What is the recommended approach?
Conversely, if processing is done and the consumer crashes just before the commit, how do I avoid message duplication in that case?
Please suggest an approach that avoids both message loss and duplication. I am using Spring's @KafkaListener with the default configuration.
As usual, this depends on your use case and how you want to deal with issues during processing. Using auto-commit changes the delivery semantics of your application.
Enabling auto-commit gives you roughly "at-most-once" semantics, because you read the data and commit it before you have actually processed it. If your processing fails, the message was already committed and you will not read it again; it is therefore "lost" for your application (for your particular consumer group, to be more precise).
Disabling auto-commit gives you roughly "at-least-once" semantics, because you commit only after processing the data. Imagine you fetch 100 messages from the topic, 50 of them are processed successfully, and your application fails while processing the 51st. Since you disabled auto-commit and commit all or nothing at the end of processing, you have not committed any of the 100 messages, so the next time your application reads the same 100 messages again. However, the first 50 are now processed a second time, creating duplicates.
To conclude, you need to figure out whether your use case can better tolerate data loss or duplicates. Duplicates can be handled safely if your application is idempotent.
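If you go the manual route, the typical Spring Kafka pattern is to inject an Acknowledgment and acknowledge only after processing succeeds (this requires AckMode.MANUAL or MANUAL_IMMEDIATE on the container factory). A minimal sketch with an assumed topic name:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.stereotype.Component;

@Component
public class OrderListener {

    private static final Logger log = LoggerFactory.getLogger(OrderListener.class);

    // "orders" is an illustrative topic name. Requires the container factory to be
    // configured with AckMode.MANUAL (or MANUAL_IMMEDIATE) and auto-commit disabled.
    @KafkaListener(topics = "orders")
    public void listen(String message, Acknowledgment ack) {
        process(message);   // if this throws, the offset is never acknowledged
        ack.acknowledge();  // commit only after successful processing: at-least-once
    }

    private void process(String message) {
        log.info("processing {}", message);
    }
}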
Asking "how to prevent both data loss and duplicates" means you are really asking for "exactly-once semantics". That is a big topic in distributed streaming systems; check the spring-kafka docs for whether and under which configuration it is supported, depending on the output operation of your application.
Please also check the comment of GaryRussell on this post:
"the Spring team does not recommend using auto commit; the listener container Ackmode (BATCH or RECORD) will commit the offsets in a deterministic manner; recent versions of the framework disable auto commit (unless specifically enabled)"
If the consumer takes 5+ seconds to process a message, then you have a problem in the code that needs to be fixed.
Auto-commit is risky in production as it can lead to problem scenarios (message loss, etc.).
It is better to go with manual commits to have better control.
Make the consumer idempotent so that duplicate messages and the consumer's work-in-progress state are not a problem. For example, maintain a processing status in the consumer's database: if processing is half done, then on restart the consumer can clear the in-progress state and process the message afresh; if the status is already Complete, then on restart it sees that and simply acknowledges the duplicate message (see the sketch below).
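A hedged sketch of that idea; ProcessingStatusRepository, the Status enum, the "payments" topic and the assumption that the payload itself is the business identifier are all hypothetical names invented for illustration:

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.stereotype.Component;

@Component
public class IdempotentListener {

    enum Status { IN_PROGRESS, COMPLETE }

    private final ProcessingStatusRepository statusRepository; // hypothetical DB-backed repository

    public IdempotentListener(ProcessingStatusRepository statusRepository) {
        this.statusRepository = statusRepository;
    }

    // "payments" is an illustrative topic; the payload is assumed to be the business key.
    @KafkaListener(topics = "payments")
    public void listen(String paymentId, Acknowledgment ack) {
        if (statusRepository.statusOf(paymentId) == Status.COMPLETE) {
            ack.acknowledge();   // already processed before a crash: just commit and skip
            return;
        }
        statusRepository.save(paymentId, Status.IN_PROGRESS);
        process(paymentId);      // redo from scratch if a previous attempt was half done
        statusRepository.save(paymentId, Status.COMPLETE);
        ack.acknowledge();
    }

    private void process(String paymentId) { /* business logic */ }

    interface ProcessingStatusRepository {
        Status statusOf(String key);
        void save(String key, Status status);
    }
}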

Spring Kafka Auto Commit Offset In Case of Failures

I am using Spring Kafka 1.2.2.RELEASE. I have a Kafka listener as a consumer that listens to a topic and indexes the documents in Elasticsearch.
My auto-commit offset property is set to true (the default).
I was under the impression that if there is an exception in the listener (e.g. Elasticsearch is down), the offsets should not be committed and the same message should be processed on the next poll.
However, this is not happening and the consumer commits the offset on the next poll. After reading posts and documentation, I learned that with auto-commit set to true, the next poll will commit all offsets.
My doubt is: why is the consumer calling the next poll at all, and how can I prevent any offset from being committed with auto-commit set to true? Or do I need to set this property to false and commit manually?
I prefer to set it to false; it is more reliable for the container to manage the offsets for you.
Set the container's AckMode to RECORD (it defaults to BATCH) and the container will commit the offset for you after the listener returns.
Also consider upgrading to at least 1.3.3 (current version is 2.1.4); 1.3.x introduced a much simpler threading model, thanks to KIP-62
EDIT
With auto-commit, the offset will be committed regardless of success/failure. The container won't commit after a failure, unless ackOnError is true (another reason to not use auto commit).
However, that still won't help because the broker won't send the same record again. You have to perform a seek operation on the Consumer for that.
In 2.0.1 (current version is 2.1.4), we added the SeekToCurrentErrorHandler which will cause the failed and unprocessed records to be re-sent on the next poll. See the reference manual.
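A hedged sketch of wiring that error handler into the container factory; note the exact setter has moved between releases (this uses the later 2.x factory method, and from 2.8 onward DefaultErrorHandler replaces SeekToCurrentErrorHandler):

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.listener.SeekToCurrentErrorHandler;

@Configuration
public class ErrorHandlingConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // On a listener exception, seek back to the failed record so the next poll()
        // re-delivers it (and the rest of the batch) instead of silently moving on.
        factory.setErrorHandler(new SeekToCurrentErrorHandler());
        return factory;
    }
}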
You can also use a ConsumerAwareListener to perform the seek yourself (also added in 2.0).
With older versions (>= 1.1) you have to use a ConsumerSeekAware listener which is quite a bit more complicated.
Another alternative is to add retry so the delivery will be re-attempted according to the retry settings.
Apparently, there can be message loss with a Spring Kafka <= 1.3.3 @KafkaListener even with ackOnError=false, if you expect Spring Kafka to automatically take care of this (or at least document it) by retrying and "simply not polling again". The default behavior is just to log the error.
We were able to reproduce the message loss/skip on a consumer even with spring-kafka 1.3.3.RELEASE (no Maven sources), with a single-partition topic, concurrency(1), AckOnError(false), and BatchListener(true) with AckMode(BATCH), for any runtime exception. We ended up doing retries inside the template, or exploring ConsumerSeekAware.
@GaryRussell, regarding "the broker won't send the same record again" versus continuing to return the next batch without a commit: is this because the consumer's poll is based on the current position it has seeked to, and not on the last committed offsets? Basically, a consumer need not commit at all (assuming runtime exceptions on every record) and will still keep consuming all the messages on the topic; only a restart will resume from the last committed offset (causing duplicates).
Upgrading to 2.0+ to use ConsumerAwareListenerErrorHandler seems to require upgrading to at least Spring 5.x, which is a major upgrade.
