I've read that Kafka provides a consumer client library that allows recovery by saving the last offset read in ZooKeeper (I'm not 100% sure where it's stored).
Is it possible to do the same with Sarama consumers?
Let's say that I'm reading until offset 550, my consumer crashes for 5 min, we are now at offset 700 but I want to resume consuming from offset 550.
Is that possible without having to save the state by myself?
I would assume it is, but I don't understand how.
I've found sarama.OffsetNewest/OffsetOldest, but that's not what I'm looking for...
Kafka consumers used to store offsets in Zookeeper but now they store them directly in Kafka. See the Consumer section in the Kafka docs.
Sarama handles this well: by default, Sarama consumer groups commit (store) the offsets of the messages you mark as processed in Kafka.
Have a look at the Sarama Consumer example. Initially this example starts at the end of the topic but upon restarting, it will restart from its last position.
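To make the resume behaviour concrete, here is a minimal in-memory sketch of the bookkeeping involved. It is not Sarama's API and does not talk to a broker; the types `groupPartition` and `offsetStore` and the group/topic names are hypothetical, chosen only to replay the 550/700 scenario from the question.

```go
package main

import "fmt"

// groupPartition identifies one consumer group's place in one partition.
// Committed offsets are tracked at this granularity.
type groupPartition struct {
	Group     string
	Topic     string
	Partition int32
}

// offsetStore is a toy stand-in for the broker-side offset storage.
type offsetStore map[groupPartition]int64

// commit records the offset of the next message the group should read.
func (s offsetStore) commit(k groupPartition, next int64) { s[k] = next }

// resumeFrom returns the committed offset, or fallback if the group has
// never committed anything for this partition.
func (s offsetStore) resumeFrom(k groupPartition, fallback int64) int64 {
	if next, ok := s[k]; ok {
		return next
	}
	return fallback
}

// scenario replays the situation from the question and returns the offset
// at which the restarted consumer resumes.
func scenario() int64 {
	store := offsetStore{}
	key := groupPartition{Group: "my-group", Topic: "my-topic", Partition: 0}

	// The consumer has handled everything before offset 550 and commits that.
	store.commit(key, 550)

	// The consumer is down for a while; producers move the log end to 700.
	logEnd := int64(700)

	// On restart with the same group, the committed offset wins over the
	// "start at the end of the topic" fallback.
	return store.resumeFrom(key, logEnd)
}

func main() {
	fmt.Println("resuming at offset", scenario())
}
```

The fallback argument plays the role of sarama.OffsetNewest/OffsetOldest: it only matters when the group has no committed offset yet.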
Let's say, for instance, that my Kafka consumer (in consumer group 1) is reading messages from Kafka topic A.
Suppose that consumer consumes 12 messages before failing.
When the consumer starts up again, it has a different consumer group (i.e. consumer group 2).
Question 1: On restart, will it continue from where it left off (because that offset is stored by Kafka and/or ZooKeeper), or will it start consuming from the first message?
Question 2: Is there a way to ensure that on restart (when the consumer has a different consumer group) it still starts consuming from where it left off before restarting?
Just to give you the context, I am trying to update in-memory caches in each node/server on receiving a message on a Kafka topic. In order to do that, I am using a different consumer group for each node/server so that each message is consumed by all the nodes/servers to update the in-memory cache. Please let me know if there are better ways to do this. Thanks!
Consumer offsets are maintained per consumer group, so a consumer that comes back with a different consumer group has no committed offset to resume from. In that case the auto.offset.reset property decides where consumption starts.
The auto.offset.reset property specifies:
What to do when there is no initial offset in Kafka or if the current offset does not exist any more on the server (e.g. because that data has been deleted):
earliest: automatically reset the offset to the earliest offset
latest: automatically reset the offset to the latest offset
none: throw exception to the consumer if no previous offset is found for the consumer's group
anything else: throw exception to the consumer.
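The decision described in that quote can be sketched as a small function. This is an illustration of the rule only, not code from any Kafka client; `startOffset`, its parameters, and the convention that a negative `committed` value means "no stored offset" are assumptions made for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoUsableOffset is returned when the policy is "none" and the group has
// no committed offset that still exists on the server.
var errNoUsableOffset = errors.New("no usable committed offset and reset policy is none")

// startOffset picks where a consumer begins reading one partition.
// committed < 0 means the group has no stored offset (a convention of this sketch).
// earliest is the first offset still on the server; latest is the log end offset.
func startOffset(committed, earliest, latest int64, policy string) (int64, error) {
	// A committed offset that still exists on the server is always used as is.
	if committed >= earliest && committed <= latest {
		return committed, nil
	}
	// Otherwise the reset policy decides.
	switch policy {
	case "earliest":
		return earliest, nil
	case "latest":
		return latest, nil
	case "none":
		return 0, errNoUsableOffset
	default:
		return 0, fmt.Errorf("invalid auto.offset.reset value %q", policy)
	}
}

func main() {
	// A brand-new group (no committed offset) on a partition holding offsets 100 to 699.
	for _, policy := range []string{"earliest", "latest", "none"} {
		off, err := startOffset(-1, 100, 700, policy)
		fmt.Println(policy, off, err)
	}
}
```

Note that the policy is consulted only when there is nothing valid to resume from; a group that keeps its name across restarts never reaches the switch.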
That said, I believe you should revisit the design: keep a different consumer group per node, but make sure each node keeps the same consumer group name after a restart. This is a suggestion based on the information provided; there could be better solutions once the details of the design/implementation are known.
I'm using segmentio/kafka-go client to read messages from a topic.
I'm unable to find out how to start reading from the last/newest message.
Every time I start the code, it starts reading from the beginning offset of that partition.
What you need to know about consuming messages from Kafka is that each consumer client is part of a Consumer Group. Kafka stores the already processed offset for each Consumer Group at Topic-Partition level in an internal Kafka topic called __consumer_offsets. This enables a consumer of a Consumer Group to continue consumption from where it left off after a re-start.
In your case it means you need to set the Consumer Group (in the KafkaConsumer API it is the configuration "group.id") and keep it constant. Only then will you be able to continue reading from where you left off, rather than starting from the beginning after a restart.
NOTE: NOT A DUPLICATE OF How to get consumer group offsets for partition in Golang Kafka 10 does not answer my question, it's not even a working solution
I'm trying to write a function in Go that queries Kafka for all consumer group offsets for all topics.
To do that, I was hoping to read all the messages in the __consumer_offsets topic and parse them.
However, in all the Kafka Go libraries I looked through, I could not find a way to just read all the messages from __consumer_offsets without consuming them.
(kafka-go either gives me a way to read from a single partition, or to consume messages from the entire topic.)
So my question is, simply put: Is there a way, using any kafka library out there, to get consumer group offsets for all the groups for all the topics?
If not, is there a way to get the offset for a given topic and group id?
I am new to Kafka. Currently I am experimenting with the Channel Consumer example from Confluent Inc's GitHub repo.
From what I know, consumers are separated into groups, and each group has its own offset in the partition. Let's say I have 40 messages in a particular topic; let's call it owner_commands. A consumer that belongs to the dog group joins and begins to consume those 40 messages.
When I disconnected and reconnected this consumer, I noticed that the messages don't show up anymore. It says that I have reached the end of the file. However, if I join the cluster with another consumer that belongs to a different group (say cat), I get to read those 40 messages again.
Do you know if there is a way for consumers in the dog group to rewind and replay those messages using Kafka's Go API? I looked at the source code for the Kafka Golang API, but I couldn't find anything that indicates I can rewind and look at a particular message in the past.
Thank you
You could use CommitOffsets and just commit back to the offset you want to rewind to. The next poll will start from that offset.
CommitOffsets is documented here:
http://docs.confluent.io/current/clients/confluent-kafka-go/index.html#Consumer.CommitOffsets
Outside of the API, the kafka-consumer-groups command can also move the position of a consumer group. This functionality was released with Apache Kafka 0.11.
I'm interested in using Kafka in one of my projects, but there is a requirement that the messaging broker has to keep the messages when one of the subscribers (consumers) is disconnected.
I see that JMS has this feature.
The website says that Kafka has durability features.
Is that the same as in JMS, or does it have a different meaning?
The consumer pulls the data from Kafka (the brokers) and specifies the offset from which it wants to read. If the consumer disconnects and comes back, it can continue where it left off. It can also start consuming from an earlier point by changing the offset.
Kafka does support a durable consumer style pattern, but there are a few ways to achieve it.
First you need to understand the concept of Offsets and Consumer Position:
Kafka maintains a numerical offset for each record in a partition.
This offset acts as a unique identifier of a record within that
partition, and also denotes the position of the consumer in the
partition. For example, a consumer which is at position 5 has consumed
records with offsets 0 through 4 and will next receive the record with
offset 5. There are actually two notions of position relevant to the
user of the consumer: The position of the consumer gives the offset of
the next record that will be given out. It will be one larger than the
highest offset the consumer has seen in that partition. It
automatically advances every time the consumer receives messages in a
call to poll(Duration).
The committed position is the last offset that has been stored
securely. Should the process fail and restart, this is the offset that
the consumer will recover to. The consumer can either automatically
commit offsets periodically; or it can choose to control this
committed position manually by calling one of the commit APIs (e.g.
commitSync and commitAsync).
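The difference between the two positions can be seen in a small single-partition model. This is a sketch for illustration, not the Java KafkaConsumer or any Go client; the `consumer` type and its `poll`, `commit` and `restart` helpers are hypothetical names.

```go
package main

import "fmt"

// consumer models the two positions for a single partition.
type consumer struct {
	position  int64 // offset of the next record poll will hand out
	committed int64 // last position that was stored durably
}

// poll hands out up to max record offsets below logEnd and advances position.
func (c *consumer) poll(logEnd int64, max int) []int64 {
	var out []int64
	for len(out) < max && c.position < logEnd {
		out = append(out, c.position)
		c.position++
	}
	return out
}

// commit stores the current position, like a synchronous commit would.
func (c *consumer) commit() { c.committed = c.position }

// restart models a crash: the in-memory position is lost and the new
// consumer recovers to the committed position.
func restart(old consumer) consumer {
	return consumer{position: old.committed, committed: old.committed}
}

func main() {
	c := consumer{}
	c.poll(10, 3) // receives offsets 0, 1, 2
	c.commit()    // committed position is now 3
	c.poll(10, 2) // receives offsets 3, 4 but does not commit

	fmt.Println(c.position, c.committed) // prints "5 3"

	r := restart(c)
	fmt.Println(r.poll(10, 2)) // prints "[3 4]": uncommitted records are delivered again
}
```

Records received after the last commit are seen a second time after a restart, which is why processing should tolerate duplicates when commits lag behind consumption.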
The offset can be stored/persisted on either the Kafka server or the client side:
1. The Kafka server persists/holds the consumer's position. In this case there are 2 sub-options:
   a. The consumer explicitly commits the message consumption
   b. The consumer automatically commits the message consumption
2. The client application persists/holds the consumer's position
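For the client-side option, one simple approach is to keep the next offset in a local file and read it back on startup. The sketch below assumes a single partition and a local filesystem; `saveOffset`, `loadOffset`, `roundTrip` and the file name are hypothetical, and a real application would then pass the loaded value to its client's seek or explicit-offset API.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// saveOffset writes the next offset to read. It writes a temporary file and
// renames it over the target so a crash mid-write is less likely to leave a
// truncated offset file behind.
func saveOffset(path string, next int64) error {
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, []byte(strconv.FormatInt(next, 10)), 0o644); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

// loadOffset returns the saved offset, or fallback when no file exists yet.
func loadOffset(path string, fallback int64) (int64, error) {
	data, err := os.ReadFile(path)
	if errors.Is(err, os.ErrNotExist) {
		return fallback, nil
	}
	if err != nil {
		return 0, err
	}
	return strconv.ParseInt(strings.TrimSpace(string(data)), 10, 64)
}

// roundTrip loads before anything was saved, saves 550, and loads again,
// all inside a throwaway temporary directory.
func roundTrip() (int64, int64, error) {
	dir, err := os.MkdirTemp("", "offsets")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "my-topic-0.offset")
	before, err := loadOffset(path, 0)
	if err != nil {
		return 0, 0, err
	}
	if err := saveOffset(path, 550); err != nil {
		return 0, 0, err
	}
	after, err := loadOffset(path, 0)
	return before, after, err
}

func main() {
	before, after, err := roundTrip()
	fmt.Println(before, after, err)
}
```

Storing the offset in the same place as the processing results (for example in the same database transaction) is the usual reason to choose this option, since it lets the two be updated together.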
This is all as per https://kafka.apache.org/22/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html.