How can I reset Kafka state to "start of universe"? - apache-kafka-streams

I'm still working on a Kafka Streams application that I described in
Why isn't Kafka consumer producing results?. In that posting, I asked why setting
kstreams_props.put( ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
doesn't appear to reset the state of Kafka to "start of the universe" before any data are pushed to any topic. I am now encountering a variant of that issue:
My application consists of a producer program that pushes data to a Kafka stream and a consumer program that groups the data, aggregates the groups, and then converts the resulting KTable back into a stream, which I print out.
The aggregation step is essentially adding up all the values, then putting those sums into the output stream as new data. What I observe, though, is that every time I run the program, the resulting aggregated values get bigger and bigger, almost as if Kafka is somehow retaining the previous results and including those in the aggregation.
In order to try fixing this, I deleted all my topics (except for __consumer_offsets, which Kafka would not allow), then re-ran my application, but the aggregated values continue to grow, as if Kafka were retaining the result of previous computations even though I thought that deleting the intermediate topics would fix things. I even tried stopping and restarting the Kafka server, to no avail.
What's going on here and, more to the point, how can I fix this? I've tried various suggestions about setting AUTO_OFFSET_RESET_CONFIG, also with no effect. I should mention that one aspect of my application is that my original producer creates its own Kafka timestamps in the Producer.send call, although disabling that also seemed to have no effect.
Thanks in advance, -- Mark

AUTO_OFFSET_RESET_CONFIG only takes effect if there are no committed offsets: when an application starts, it first looks for committed offsets and applies the reset policy only if no valid offsets are found.
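To make that lookup order concrete, here is a minimal sketch in plain Java. It is a simplified model, not Kafka client code: the class ResetPolicyModel, the method startPosition, and the logStart/logEnd parameters are hypothetical names chosen for illustration.

```java
import java.util.OptionalLong;

// Simplified model of how a consumer picks its starting position.
// Not Kafka client code; all names here are hypothetical.
class ResetPolicyModel {

    static long startPosition(OptionalLong committed, String policy,
                              long logStart, long logEnd) {
        // A committed offset that still lies within the log always wins.
        if (committed.isPresent()
                && committed.getAsLong() >= logStart
                && committed.getAsLong() <= logEnd) {
            return committed.getAsLong();
        }
        // Only when no valid offset exists is the reset policy consulted.
        switch (policy) {
            case "earliest": return logStart;
            case "latest":   return logEnd;
            default:
                throw new IllegalStateException("no valid offset and no reset policy");
        }
    }

    public static void main(String[] args) {
        // Committed offset present: "earliest" is ignored.
        System.out.println(startPosition(OptionalLong.of(42), "earliest", 0, 100));
        // No committed offset: "earliest" applies.
        System.out.println(startPosition(OptionalLong.empty(), "earliest", 0, 100));
    }
}
```

In this model, a group that has already committed offset 42 resumes at 42 no matter what the policy says, which matches the behaviour described in the question.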
Furthermore, for a Kafka Streams application, resetting offsets would not be sufficient and you should use the reset tool bin/ -- this blog post explains the tool in detail:


duplicate events by consumer

We observed that one of our consumers picks up the same events multiple times from a Kafka topic. We have the settings below on the consumer application side.
spring.kafka.consumer.enable-auto-commit=false &
How can the consumer application avoid these duplicates?
Do we need to fine-tune the above configuration settings to keep the consumer from picking up the events multiple times from the Kafka topic?
Since you've disabled auto commits, you need to control when you actually commit a record; otherwise you can end up with at-least-once processing, meaning records may be delivered again after a failure.
You could also read the examples of the exactly-once processing capabilities using transactions and idempotent producers.
The auto.offset.reset setting only applies if your consumer group has been removed or never existed at all (for example, because you're not committing anything). In that case, with the value earliest, you're always going to read from the beginning of the topic.
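To illustrate the point about commit timing, here is a small plain-Java model of a consumer that processes a record first and commits afterwards. It does not use the Kafka client; CommitTimingModel, run, and crashAt are hypothetical names, and the "crash" is simulated by returning early.

```java
import java.util.ArrayList;
import java.util.List;

// Model of "process, then commit". Not Kafka client code; names are hypothetical.
class CommitTimingModel {

    // Processes offsets [committed, end) and commits after each record,
    // unless a simulated crash hits after processing crashAt but before
    // its commit. Returns the committed offset visible after the run.
    static long run(long committed, long end, long crashAt, List<Long> processed) {
        for (long offset = committed; offset < end; offset++) {
            processed.add(offset);      // the side effect happens first
            if (offset == crashAt) {
                return committed;       // crash: the commit for this record is lost
            }
            committed = offset + 1;     // commit the next offset to read
        }
        return committed;
    }

    public static void main(String[] args) {
        List<Long> processed = new ArrayList<>();
        long committed = run(0, 5, 2, processed);     // crashes after processing offset 2
        committed = run(committed, 5, -1, processed); // restart, no crash this time
        System.out.println(processed);                // prints [0, 1, 2, 2, 3, 4]
    }
}
```

Offset 2 is handled twice because its commit never happened. Committing before processing would avoid the duplicate but could lose the record instead, which is why duplicates under manual commits usually call for idempotent handling or transactions rather than configuration changes alone.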

What is the most efficient way to know that a Kafka event is visible in a K-Table?

We use Kafka topics as both events and a repository. Using the kafka-streams API we define a simple K-Table that represents all the events in the topic.
In our use case we publish events to the topic and subsequently reference the K-Table as the backing repository. The main issue is that the published events are not immediately visible on the K-Table.
We tried transactions and exactly-once semantics as described here, but there is always a delay we cannot control.
Publish Event
Undetermined amount of time
Published Event is visible in the K-Table
Is there a way to eliminate the delay or otherwise know that a specific event has been consumed by the K-Table?
NOTE: We tried both partition and global tables with similar results.
Because Kafka is an asynchronous system, the observed delay is expected and you cannot do anything to avoid it.
However, if you publish a message to a topic, the KafkaProducer allows you to pass in a Callback to the send() method and the callback will be executed after the message was written to the topic providing the record's metadata like topic, partition, and offset.
After Kafka Streams has processed messages, it will eventually commit the offsets (you can configure the commit interval, too). Thus, you know the message is in the KTable once its offset has been committed. By default, committing happens only every 30 seconds, and it's not recommended to use a very short commit interval because it implies large overhead. Thus, I am not sure if this would help in your case, as it seems you want a more timely "response".
As an alternative, you can also disable caching on the KTable and use a toStream().process() step -- after each update to the KTable, the changelog stream provided by toStream() will contain the record and you can access the record metadata (including its offset) in the Processor via the given ProcessorContext object. This should also allow you to figure out when the record is available in the KTable.
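The two knobs mentioned above could be set roughly like this. This is only a sketch: the class name and the one-second interval are illustrative assumptions, and the cache setting shown is the older application-wide one, so check the exact name against your Kafka Streams version.

```java
import java.util.Properties;

// Hypothetical helper that collects the two settings discussed above.
class LowLatencyTableConfig {

    static Properties build() {
        Properties props = new Properties();
        // Commit more often than the 30 s default (illustrative value only;
        // very short intervals add overhead).
        props.put("commit.interval.ms", "1000");
        // A cache size of 0 disables record caching, so every KTable update
        // is emitted downstream immediately.
        props.put("cache.max.bytes.buffering", "0");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

These properties would be merged with the rest of the application's Streams configuration before the KafkaStreams instance is created.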

Kafka 2.1 behaviour change for retentions and Kafka Streams application, what can we do so that retention works?

Following is from the Kafka Documentation for 2.1.
Offset expiration semantics has slightly changed in this version.
According to the new semantics, offsets of partitions in a group will
not be removed while the group is subscribed to the corresponding
topic and is still active (has active consumers). If group becomes
empty all its offsets will be removed after default offset retention
period (or the one set by broker) has passed (unless the group becomes
active again). Offsets associated with standalone (simple) consumers,
that do not use Kafka group management, will be removed after default
offset retention period (or the one set by broker) has passed since
their last commit.
If I understand this correctly, as long as the Stream Thread consumers are connected, no retention setting will be effective?
I also started to observe following Exception after the restart of stream application
stream thread - Restoring Stream Tasks failed. Deleting StreamTasks stores to recreate from scratch.
org.apache.kafka.clients.consumer.OffsetOutOfRangeException: Offsets out of range with no configured reset policy for partitions:' but stream application uses the property 'StreamsConfig.consumerPrefix(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG), "earliest"'...
I think it has to do something with retention but I can't tell what?
If I understand this correctly, as long as the Stream Thread consumers are connected, no retention setting will be effective?
This applies to the __consumer_offsets topic only, which is a Kafka-internal topic. For all regular/user topics, retention time is applied the same way as in all previous versions. Also note that this only applies if you upgrade your brokers to 2.1.
For the log message of Streams: you don't need to worry about it. It seems that your application was offline for a longer time, and thus your local store is no longer in a consistent state. It is therefore deleted and recreated from scratch from the changelog topic.

Which guarantees does Kafka Stream provide when using a RocksDb state store with changelog?

I'm building a Kafka Streams application that generates change events by comparing every new calculated object with the last known object.
So for every message on the input topic, I update an object in a state store and every once in a while (using punctuate), I apply a calculation on this object and compare the result with the previous calculation result (coming from another state store).
To make sure this operation is consistent, I do the following after the punctuate triggers:
write a tuple to the state store
compare the two values, create change events and context.forward them. So the events go to the results topic.
swap the tuple by the new_value and write it to the state store
I use this tuple for scenarios where the application crashes or rebalances, so I can always send out the correct set of events before continuing.
Now, I noticed the resulting events are not always consistent, especially if the application frequently rebalances. It looks like in rare cases the Kafka Streams application emits events to the results topic, but the changelog topic is not up to date yet. In other words, I produced something to the results topic, but my changelog topic is not at the same state yet.
So, when I do a stateStore.put() and the method call returns successfully, are there any guarantees when it will be on the changelog topic?
Can I enforce a changelog flush? When I do context.commit(), when will that flush+commit happen?
To get complete consistency, you will need to enable processing.guarantee="exactly_once" -- otherwise, if an error occurs, you might get inconsistent results.
If you want to stay with "at_least_once", you might want to use a single store and update the store after processing is done (i.e., after calling forward()). This minimizes the time window in which inconsistencies can occur.
And yes, if you call context.commit(), then before the input topic offsets are committed, all stores will be flushed to disk and all pending producer writes will also be flushed.
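The single-store ordering suggested above (forward first, update the store last) can be sketched with in-memory stand-ins. This is not Kafka Streams code: the HashMap plays the role of the state store, the list plays the role of context.forward(), and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// In-memory model of "forward the change event, then update the store".
class ForwardThenStoreSketch {
    final Map<String, String> store = new HashMap<>();  // stand-in for the state store
    final List<String> forwarded = new ArrayList<>();   // stand-in for context.forward()

    void onPunctuate(String key, String newValue) {
        String previous = store.get(key);
        // Emit a change event only if the calculation result changed.
        if (!Objects.equals(previous, newValue)) {
            forwarded.add(key + ": " + previous + " -> " + newValue);
        }
        // The store is updated last, after the event has been forwarded.
        store.put(key, newValue);
    }

    public static void main(String[] args) {
        ForwardThenStoreSketch sketch = new ForwardThenStoreSketch();
        sketch.onPunctuate("a", "1");
        sketch.onPunctuate("a", "1"); // unchanged, nothing forwarded
        sketch.onPunctuate("a", "2");
        System.out.println(sketch.forwarded); // prints [a: null -> 1, a: 1 -> 2]
    }
}
```

With this ordering, a crash between the forward and the store update leads to the same event being emitted again after restart, rather than to a store that is ahead of the results topic.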

Kafka Streams: How to avoid forwarding downstream twice when repartitioning

In my application I have KafkaStreams instances with a very simple topology: there is one processor, with a key-value store, and each incoming message gets written to the store and is then forwarded downstream to a sink.
I would like to increase the number of partitions I have for my source topic, and then reprocess the data, so that each store will contain only keys relevant to its partition. (I understand this is done using the Application Reset Tool.) However, while reprocessing the data, I don't want to forward anything downstream; I want only new data to be forwarded. (Otherwise, consumers of the result topic will handle old values again.) My question: is there an easy way to achieve this? Is there any built-in mechanism that can help me tell reprocessed data and new data apart?
Thank you in advance
There is no built-in mechanism. But you might be able to just remove the sink operation that writes to the result topic while you reprocess your data -- when reprocessing is done, you stop the application, add the sink again, and restart. Not sure if this works for you.
Another possible solution might be to use transform() and implement an offset-based filter. For each input topic partition, you get the offset of the first new message (this is something you need to do manually before you write the Transformer). You use this information to implement a filter as a custom Transformer: for each input record, you check the record's partition and offset and drop the record if its offset is smaller than the offset of the first new message of this partition.
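The filtering decision itself is small. Below is a minimal sketch of just that check in plain Java, assuming you have already collected the first new offset per partition by hand; OffsetFilterSketch and shouldForward are hypothetical names, and wiring this into an actual Transformer (reading partition and offset from the ProcessorContext) is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Core check of an offset-based filter. Names are hypothetical.
class OffsetFilterSketch {
    // partition -> offset of the first "new" message, gathered manually.
    private final Map<Integer, Long> firstNewOffset;

    OffsetFilterSketch(Map<Integer, Long> firstNewOffset) {
        this.firstNewOffset = new HashMap<>(firstNewOffset);
    }

    boolean shouldForward(int partition, long offset) {
        Long first = firstNewOffset.get(partition);
        // Assumption: a partition without an entry holds no old data,
        // so everything from it is forwarded.
        return first == null || offset >= first;
    }

    public static void main(String[] args) {
        Map<Integer, Long> boundaries = new HashMap<>();
        boundaries.put(0, 100L);
        OffsetFilterSketch filter = new OffsetFilterSketch(boundaries);
        System.out.println(filter.shouldForward(0, 99));  // false: reprocessed record
        System.out.println(filter.shouldForward(0, 100)); // true: first new record
    }
}
```

Inside the Transformer, a record for which shouldForward returns false would still be written to the store but would not be passed downstream.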
