My application is built with Spring Boot and runs in a cluster (two different OpenShift instances).
Each instance has one consumer that reads messages from a replicated topic.
I would like to find a mechanism to block a message in the replicated topic from being read if it has already been read by one of the two consumers.
Example:
CONSUMER CLIENT A -- READ MSG_1 --> BROKER_1
- Offset increase
- Commit OK
CONSUMER CLIENT B --> NOT READ MSG_1 --> BROKER_1
-- Correct, because the offset is already committed
Now BROKER_1 goes down and the new leader is BROKER_2.
How can I block the already-read message on BROKER_2?
Thanks all!
Giuseppe.
Replication factor doesn't control if/how consumers read messages. The partition count does. If the topic has only one partition, then only one consumer instance in a consumer group is able to read messages, and all other instances in that group are "blocked". And if the message has already been read and committed, then it doesn't matter which broker is the leader, because committed offsets are maintained per consumer group for each topic partition, not per replica.
If you have more than one partition and you still want to block consumers from being able to read data, then you'll need to implement some external, coordinated lock, via ZooKeeper for example.
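To make the point about leaders concrete, here is a minimal toy model in Python. It is not a Kafka client: the `ToyCluster` class, the group name `my-app` and the topic name `topic-a` are all made up for illustration. It only shows the idea that a committed offset is looked up by consumer group and topic partition, so a leader change does not make an already committed message readable again.

```python
# Toy model, NOT a real Kafka client. All names here are hypothetical.
class ToyCluster:
    def __init__(self, brokers, leader):
        self.brokers = set(brokers)
        self.leader = leader
        self.log = []        # records of one partition, assumed replicated to every broker
        self.committed = {}  # (group, topic, partition) -> next offset to read

    def produce(self, value):
        self.log.append(value)

    def fail_leader(self, new_leader):
        # The old leader disappears; a replica takes over as leader.
        self.brokers.discard(self.leader)
        self.leader = new_leader

    def poll(self, group, topic="topic-a", partition=0):
        key = (group, topic, partition)
        start = self.committed.get(key, 0)
        records = self.log[start:]
        self.committed[key] = len(self.log)  # commit right after reading
        return records


cluster = ToyCluster(brokers=["BROKER_1", "BROKER_2"], leader="BROKER_1")
cluster.produce("MSG_1")

first_read = cluster.poll(group="my-app")      # client A reads MSG_1 and commits
cluster.fail_leader(new_leader="BROKER_2")     # BROKER_1 goes down
after_failover = cluster.poll(group="my-app")  # client B, same group, new leader

print(first_read)      # ['MSG_1']
print(after_failover)  # []
```

The committed-offset lookup never mentions a broker, which is why the second read returns nothing after the failover.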
Related
We have one topic with one partition because of message-ordering requirements. We have two consumers running on different servers with the same configuration (groupId, consumerId, consumerGroup), i.e.
1 Topic -> 1 Partition -> 2 Consumers
The same code is deployed to both servers. We noticed that when a message arrives, both consumers consume it rather than only one processing it. The reason for running consumers on two separate servers is that if one server crashes, the other can continue processing messages. But it looks like when both are up, both consume messages. The Kafka docs say that if there are more consumers than partitions, some stay idle, but we don't see that happening. Is there anything we are missing on the configuration side apart from consumerId and groupId? Thanks
As @Gary Russell said, as long as the two consumer instances each have their own consumer group, they will both consume every event that is written to the topic. Just put them into the same consumer group. You can set the consumer group ID (group.id) in consumer.properties.
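The difference between "same group" and "one group each" can be sketched with a small simulation. This is a simplified round-robin assignment written for illustration only; the function `assign_partitions` and the group names are hypothetical, and Kafka's real assignors are more involved. The rule it demonstrates is that each partition goes to exactly one member of each group.

```python
from collections import defaultdict


def assign_partitions(partitions, consumers):
    """Simplified stand-in for a group assignment (hypothetical helper).

    consumers is a list of (consumer_name, group_id) pairs.
    Within each group, every partition is given to exactly one member.
    """
    members = defaultdict(list)
    for name, group in consumers:
        members[group].append(name)

    assignment = {name: [] for name, _ in consumers}
    for group_members in members.values():
        group_members = sorted(group_members)
        for i, partition in enumerate(partitions):
            owner = group_members[i % len(group_members)]
            assignment[owner].append(partition)
    return assignment


# 1 topic -> 1 partition -> 2 consumers in the SAME group: one stays idle.
same_group = assign_partitions(
    [0], [("server-1", "orders"), ("server-2", "orders")]
)

# The same two consumers, each in its OWN group: both read partition 0.
different_groups = assign_partitions(
    [0], [("server-1", "orders-1"), ("server-2", "orders-2")]
)

print(same_group)        # {'server-1': [0], 'server-2': []}
print(different_groups)  # {'server-1': [0], 'server-2': [0]}
```

If both servers receive every message, the second case is the likely explanation: the two instances are effectively not sharing a group ID.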
Let's say, for instance, my Kafka consumer (in consumer group 1) is reading messages from Kafka topic A.
The consumer consumes 12 messages before failing.
When the consumer starts up again, it has a different consumer group (i.e. consumer group 2).
Question 1 -> On restart, will it continue from where it left off (because that offset is stored by Kafka and/or ZooKeeper), or will it start consuming from the first message?
Question 2 -> Is there a way to ensure that on restart (when the consumer has a different consumer group), it still starts consuming from where it left off before restarting?
Just to give you the context, I am trying to update in-memory caches in each node/server on receiving a message on a Kafka topic. In order to do that, I am using a different consumer group for each node/server so that each message is consumed by all the nodes/servers to update the in-memory cache. Please let me know if there are better ways to do this. Thanks!
Consumer offsets are maintained per consumer group, so a consumer that comes back with a new consumer group has no committed offset to resume from. Where it starts is then decided by the auto.offset.reset property.
The auto.offset.reset property specifies:
What to do when there is no initial offset in Kafka or if the current offset does not exist any more on the server (e.g. because that data has been deleted):
- earliest: automatically reset the offset to the earliest offset
- latest: automatically reset the offset to the latest offset
- none: throw exception to the consumer if no previous offset is found for the consumer's group
- anything else: throw exception to the consumer.
As for the current approach, I believe you should revisit the design: it would be better to keep a different consumer group per node, but make sure each node keeps the same consumer group name even after a restart. This is a suggestion based on the information provided; there could be better solutions once you go into the details of the design/implementation.
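The decision described above can be written down as a small pure function. This is only a sketch of the rule as stated in the property description, not Kafka's implementation; `starting_offset` and `NoOffsetForPartition` are hypothetical names, and `log_end` is assumed to mean the offset the next written record would get.

```python
class NoOffsetForPartition(Exception):
    """Hypothetical stand-in for the exception the consumer would raise."""


def starting_offset(committed, log_start, log_end, auto_offset_reset="latest"):
    """Return the offset a consumer starts from (illustrative sketch).

    committed: the group's committed offset, or None if the group has none.
    log_start / log_end: first available offset and next offset to be written.
    """
    # A valid committed offset always wins; the reset policy is not consulted.
    if committed is not None and log_start <= committed <= log_end:
        return committed
    if auto_offset_reset == "earliest":
        return log_start
    if auto_offset_reset == "latest":
        return log_end
    raise NoOffsetForPartition(f"no usable offset, policy={auto_offset_reset!r}")


# Topic A holds offsets 0..19; consumer group 1 committed offset 12 before failing.
same_group_restart = starting_offset(12, 0, 20)            # resumes where it left off
new_group_earliest = starting_offset(None, 0, 20, "earliest")
new_group_latest = starting_offset(None, 0, 20, "latest")

print(same_group_restart)  # 12
print(new_group_earliest)  # 0
print(new_group_latest)    # 20
```

With a new group name on every restart, the consumer can only start from the beginning or from the end, never from offset 12, which is why keeping a stable group name per node matters.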
I'm using the segmentio/kafka-go client to read messages from a topic.
I'm unable to find how to start reading from the last/new message.
Every time I start the code, it starts reading from the beginning offset in that partition.
What you need to know about consuming messages from Kafka is that each consumer client is part of a Consumer Group. Kafka stores the already processed offset for each Consumer Group at Topic-Partition level in an internal Kafka topic called __consumer_offsets. This enables a consumer of a Consumer Group to continue consumption from where it left off after a re-start.
In your case it means you need to set the Consumer Group (in the KafkaConsumer API it is the configuration "group.id") and keep it constant. Only then will you be able to continue reading from the latest/newest message and not start from the beginning after a restart.
We have an IBM MQ JMS queue and want to distribute the data across multiple consumers for load balancing. So if we write two JMS clients to consume from the same JMS queue, what will happen? Will messages be equally distributed across both consumers, since one consumer will delete the data after it is read? Is there a possibility of data duplication, e.g. if the same message is read by both consumers in a race condition?
My comments below are based on destructive get and not a browse get.
So if we write two JMS Clients to consume from same JMS queue what will happen?
They will both consume messages.
Will Messages be equally distributed across both consumers since one consumer will delete the data after it is read?
No. The "hot" consumer will be fed the next available message, assuming it is "getting" a message again before the next message arrives.
Is there a possibility for data duplication, like if the same message is read by both consumers in a race condition?
Not if you are performing a destructive get (the default).
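The behaviour of a destructive get can be illustrated with Python's standard `queue.Queue` and two threads. This is an analogy only, not IBM MQ or JMS: the client names and message values are made up. Each get removes the message from the queue, so two competing consumers never see the same message, and nothing guarantees an even split between them.

```python
import queue
import threading


def drain(q, name, seen, lock):
    """Consume until the queue is empty, recording who got which message."""
    while True:
        try:
            msg = q.get_nowait()  # destructive get: the message leaves the queue
        except queue.Empty:
            return
        with lock:
            seen.append((name, msg))


q = queue.Queue()
for i in range(100):
    q.put(f"msg-{i}")

seen = []
lock = threading.Lock()
workers = [
    threading.Thread(target=drain, args=(q, name, seen, lock))
    for name in ("client-1", "client-2")
]
for w in workers:
    w.start()
for w in workers:
    w.join()

delivered = [msg for _, msg in seen]
print(len(delivered), len(set(delivered)))  # 100 100
```

Every message is delivered exactly once, but how many each client receives varies from run to run and can be very uneven, which matches the "hot" consumer remark above.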
I'm interested in using Kafka in one of my projects, but there is a requirement that the message broker has to keep the messages when one of the subscribers (consumers) is disconnected.
I see that JMS has this feature.
The website says that Kafka has durability features.
Is that the same as in JMS, or does it have a different meaning?
Consumers pull data from Kafka (the brokers). A consumer specifies the offset from which it wants to read. If a consumer disconnects and comes back, it can continue where it left off. It can also start consuming from an earlier point (by changing the offset).
Kafka does support a durable consumer style pattern, but there are a few ways to achieve it.
First you need to understand the concept of Offsets and Consumer Position:
Kafka maintains a numerical offset for each record in a partition.
This offset acts as a unique identifier of a record within that
partition, and also denotes the position of the consumer in the
partition. For example, a consumer which is at position 5 has consumed
records with offsets 0 through 4 and will next receive the record with
offset 5. There are actually two notions of position relevant to the
user of the consumer: The position of the consumer gives the offset of
the next record that will be given out. It will be one larger than the
highest offset the consumer has seen in that partition. It
automatically advances every time the consumer receives messages in a
call to poll(Duration).
The committed position is the last offset that has been stored
securely. Should the process fail and restart, this is the offset that
the consumer will recover to. The consumer can either automatically
commit offsets periodically; or it can choose to control this
committed position manually by calling one of the commit APIs (e.g.
commitSync and commitAsync).
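The two notions of position in the quoted passage can be sketched with a toy consumer. This is not the KafkaConsumer API: `ToyConsumer` and its methods are hypothetical and only mimic the described behaviour, where polling advances the position and committing stores it.

```python
class ToyConsumer:
    """Illustrative model of 'position' versus 'committed position'."""

    def __init__(self, log, committed=0):
        self.log = log
        self.committed = committed  # offset recovered after a restart
        self.position = committed   # offset of the next record to hand out

    def poll(self, max_records=5):
        records = self.log[self.position:self.position + max_records]
        self.position += len(records)  # advances automatically on every poll
        return records

    def commit(self):
        self.committed = self.position  # store the position durably


log = [f"record-{i}" for i in range(10)]

consumer = ToyConsumer(log)
consumer.poll()               # receives offsets 0..4, position becomes 5
consumer.commit()             # committed position becomes 5
consumer.poll(max_records=3)  # receives offsets 5..7, position becomes 8
crashed_position, committed = consumer.position, consumer.committed

# The process fails before the next commit and restarts from the committed offset.
restarted = ToyConsumer(log, committed=committed)
replayed = restarted.poll(max_records=3)

print(crashed_position, committed)  # 8 5
print(replayed)                     # ['record-5', 'record-6', 'record-7']
```

Records 5 to 7 are handed out a second time after the restart, so processing should tolerate repeats whenever commits lag behind the position.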
The offset can be stored/persisted on either the Kafka server or the client side:
- Kafka server persists/holds the consumer's position. In this case there are 2 sub-options:
  - Consumer explicitly commits the message consumption
  - Consumer automatically commits the message consumption
- Client application persists/holds the consumer's position
This is all as per https://kafka.apache.org/22/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html.