Kafka multiple consumer instances do not receive messages - Spring

I have an app that I want to run in multiple instances, and I need each instance to receive the message. Currently each instance receives the message only if the instances have different group ids. I configured more partitions than actual instances and it still does not work. The app works as consumer/producer at the same time.
Code:
@EnableKafka
@Configuration
public class KafkaProducerConfigForDepartment {

    @Value(value = "${kafka.bootstrapAddress}")
    private String bootstrapAddress;

    @Bean
    public ProducerFactory<String, MessageEventForDepartment> producerFactoryForDepartment() {
        Map<String, Object> configProps = new HashMap<>();
        configProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapAddress);
        configProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        configProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);
        return new DefaultKafkaProducerFactory<>(configProps);
    }

    @Bean
    public NewTopic topic1() {
        return TopicBuilder.name("MARCEL")
                .partitions(10)
                .compact()
                .build();
    }

    @Bean
    public KafkaTemplate<String, MessageEventForDepartment> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactoryForDepartment());
    }
}
@Configuration
public class KafkaTopicConfig {

    @Value(value = "${kafka.bootstrapAddress}")
    private String bootstrapAddress;

    /* @Value(value = "${kafka.configId}")
    private String configId; */

    @Bean
    public ConsumerFactory<String, MessageEventForDepartment> consumerFactoryForDepartments() {
        Map<String, Object> props = new HashMap<>();
        props.put(JsonDeserializer.TRUSTED_PACKAGES, "*");
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapAddress);
        //props.put(ConsumerConfig.GROUP_ID_CONFIG, configId);
        return new DefaultKafkaConsumerFactory<>(props, new StringDeserializer(),
                new JsonDeserializer<>(MessageEventForDepartment.class));
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, MessageEventForDepartment>
            kafkaListenerContainerFactoryForDepartments() {
        ConcurrentKafkaListenerContainerFactory<String, MessageEventForDepartment> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactoryForDepartments());
        return factory;
    }
}
@Component
@Slf4j
public class DepartmentKafkaService {

    @Autowired
    private DepartmentRepository departmentRepository;

    @KafkaListener(topics = "MARCEL", groupId = "ren", containerFactory = "kafkaListenerContainerFactoryForDepartments")
    public void listenGroupFoo(MessageEventForDepartment message) {
        ...
Do you know why this is happening? I thought that having more partitions than instances in the group would make this work.

That is how Kafka is designed: for multiple instances to receive all the records, they must be in different consumer groups.
When they are in the same group, the partitions are distributed across the active instances. Each time a member joins or leaves, the partitions are rebalanced.
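If you need every instance to see every record, one common approach is to give each instance its own consumer group, for example by generating the group id at startup. A minimal sketch, reusing the existing container factory; the SpEL expression and the generated suffix are illustrative, not part of the original post:
// Hypothetical sketch: every instance joins its own consumer group,
// so each instance receives every record published to the topic.
@KafkaListener(
        topics = "MARCEL",
        // unique group per running instance (the expression is resolved once at startup)
        groupId = "ren-#{T(java.util.UUID).randomUUID().toString()}",
        containerFactory = "kafkaListenerContainerFactoryForDepartments")
public void listenGroupFoo(MessageEventForDepartment message) {
    // each instance processes its own copy of the message
}
Note that a generated group id means offsets are effectively per instance lifetime; if that is not acceptable, derive the group id from something stable such as the hostname or instance id instead.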

Related

KafkaConsumer not processing all the requests posted in Spring Boot

I have implemented a KafkaProducer and KafkaConsumers using Spring Boot.
Once a message is received at the consumer, I invoke a REST API to save the record into MongoDB.
But not all the records posted to the Kafka consumer are getting saved to MongoDB.
For example, out of 5 records posted to the Kafka consumer, only 3 records are saved to MongoDB.
Attaching the Kafka consumer code:
KafkaConsumerConfiguration
@EnableKafka
@Configuration
public class KafkaConsumerConfiguration {

    @Value("${spring.kafka.hostname}")
    private String kafkaHostname;

    public static final String GROUP_ID_CONFIG = "group_json1";
    public static final String ENABLE_AUTO_COMMIT_CONFIG = "false";
    public static final String AUTO_OFFSET_RESET_CONFIG = "earliest";

    @Bean
    public ConsumerFactory<String, UserKafkaDTO> userConsumerFactory() {
        Map<String, Object> config = new HashMap<>();
        config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaHostname);
        config.put(ConsumerConfig.GROUP_ID_CONFIG, GROUP_ID_CONFIG);
        config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
        config.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, ENABLE_AUTO_COMMIT_CONFIG);
        config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, AUTO_OFFSET_RESET_CONFIG);
        JsonDeserializer<UserKafkaDTO> deserializer = new JsonDeserializer<>(UserKafkaDTO.class);
        deserializer.setRemoveTypeHeaders(false);
        deserializer.addTrustedPackages("*");
        deserializer.setUseTypeMapperForKey(true);
        return new DefaultKafkaConsumerFactory<>(config, new StringDeserializer(), deserializer);
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, UserKafkaDTO> userKafkaListenerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, UserKafkaDTO> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(userConsumerFactory());
        factory.setBatchListener(true);
        return factory;
    }

    @Bean
    @ConditionalOnMissingBean(name = "kafkaListenerContainerFactory")
    public ConcurrentKafkaListenerContainerFactory<String, Object> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setBatchListener(true);
        return factory;
    }
}
KafkaConsumerListener
@EnableKafka
@Configuration
public class KafkaConsumerListener {

    @KafkaListener(topics = { "topicName" }, containerFactory = "userKafkaListenerFactory", autoStartup = "${listen.auto.start}")
    public void consumeJson(UserKafkaDTO userKafkaDTO) {
        invokeAPItoSaveRecordTOMongoDB(userKafkaDTO);
    }
}
Try changing
factory.setBatchListener(true);
to
factory.setBatchListener(false);
because your listener handles a single object only.
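Alternatively, if you do want batch consumption, keep setBatchListener(true) and change the listener signature to accept a list. A rough sketch based on the code above (the loop body simply reuses the existing save call):
@KafkaListener(topics = { "topicName" }, containerFactory = "userKafkaListenerFactory", autoStartup = "${listen.auto.start}")
public void consumeJson(List<UserKafkaDTO> userKafkaDTOs) {
    // with a batch listener the whole poll is delivered at once,
    // so every record must be processed explicitly
    for (UserKafkaDTO dto : userKafkaDTOs) {
        invokeAPItoSaveRecordTOMongoDB(dto);
    }
}
With the original single-object signature and a batch factory, records beyond the first in each batch are effectively dropped, which matches the "3 out of 5 saved" symptom.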

Exactly Once Two Kafka Clusters

I have 2 Kafka clusters, Cluster A and Cluster B. These clusters are completely separate.
I have a Spring Boot application that listens to a topic on Cluster A, transforms the event and then produces it onto Cluster B. I require exactly-once semantics as these are financial events. I have noticed that with my current application I sometimes get duplicates as well as miss some events. I have tried to implement exactly-once as best I could. One of the posts said Flink would be a better option than Spring Boot. Should I move over to Flink? Please see the Spring code below.
ConsumerConfig
@Configuration
public class KafkaConsumerConfig {

    @Value("${kafka.server.consumer}")
    String server;

    @Value("${kafka.kerberos.service.name:}")
    String kerberosServiceName;

    @Bean
    public Map<String, Object> consumerConfigs() {
        Map<String, Object> config = new HashMap<>();
        config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, server);
        config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        config.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 30000);
        config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, AvroDeserializer.class);
        config.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
        // skip kerberos if no value is provided
        if (kerberosServiceName.length() > 0) {
            config.put("security.protocol", "SASL_PLAINTEXT");
            config.put("sasl.kerberos.service.name", kerberosServiceName);
        }
        return config;
    }

    @Bean
    public ConsumerFactory<String, AccrualSchema> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(consumerConfigs(), new StringDeserializer(),
                new AvroDeserializer<>(AccrualSchema.class));
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, AccrualSchema> kafkaListenerContainerFactory(ConsumerErrorHandler errorHandler) {
        ConcurrentKafkaListenerContainerFactory<String, AccrualSchema> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.setAutoStartup(true);
        factory.getContainerProperties().setAckMode(AckMode.RECORD);
        factory.setErrorHandler(errorHandler);
        return factory;
    }

    @Bean
    public KafkaConsumerAccrual receiver() {
        return new KafkaConsumerAccrual();
    }
}
ProducerConfig
@Configuration
public class KafkaProducerConfig {

    @Value("${kafka.server.producer}")
    String server;

    @Bean
    public ProducerFactory<String, String> producerFactory() {
        Map<String, Object> config = new HashMap<>();
        config.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, server);
        config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        config.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        config.put(ProducerConfig.ACKS_CONFIG, "all");
        config.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "prod-1");
        config.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");
        config.put(ProducerConfig.LINGER_MS_CONFIG, "10");
        config.put(ProducerConfig.BATCH_SIZE_CONFIG, Integer.toString(32 * 1024));
        return new DefaultKafkaProducerFactory<>(config);
    }

    @Bean
    public KafkaTemplate<String, String> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }
}
KafkaProducer
@Service
public class KafkaTopicProducer {

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    public void topicProducer(String payload, String topic) {
        kafkaTemplate.executeInTransaction(kt -> kt.send(topic, payload));
    }
}
KafkaConsumer
public class KafkaConsumerAccrual {

    @Autowired
    KafkaTopicProducer kafkaTopicProducer;

    @Autowired
    Gson gson;

    @KafkaListener(topics = "topicname", groupId = "groupid", id = "listenerid")
    public void consume(AccrualSchema accrual,
            @Header(KafkaHeaders.RECEIVED_PARTITION_ID) Integer partition, @Header(KafkaHeaders.OFFSET) Long offset,
            @Header(KafkaHeaders.CONSUMER) Consumer<?, ?> consumer) {
        AccrualEntity accrualEntity = convertBusinessObjectToAccrual(accrual, partition, offset);
        kafkaTopicProducer.topicProducer(gson.toJson(accrualEntity, AccrualEntity.class), accrualTopic);
    }

    public AccrualEntity convertBusinessObjectToAccrual(AccrualSchema ao, Integer partition,
            Long offset) {
        // Transform code goes here
        return ae;
    }
}
Exactly-once semantics are not supported across clusters; the key is producing records and committing the consumed offset(s) within one atomic transaction.
That is not possible in your environment, since a transaction can't span two clusters.
See this specific draft KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-656%3A+MirrorMaker2+Exactly-once+Semantics
The best you can expect right now is at-least-once semantics between the two clusters.
It implies that you have to protect against duplication on the consumer side of the target broker.
As an example, you can identify a set of properties that must be unique within a specific window of time, but it will really depend on your use case.
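To make that concrete, here is a minimal, hedged sketch of consumer-side de-duplication on the target cluster. The class name, the notion of a "business key" and the in-memory set are all illustrative assumptions; in production you would typically back this with a store that enforces a unique constraint (and bound the window of remembered keys).
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical idempotent-write helper: only the first occurrence of a key is written.
public class DeduplicatingWriter {

    private final Set<String> processedKeys = ConcurrentHashMap.newKeySet();

    public void writeOnce(String businessKey, Runnable write) {
        // Set.add returns false if the key was already seen, so duplicates are skipped
        if (processedKeys.add(businessKey)) {
            write.run();
        }
    }
}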

How to use "li-apache-kafka-clients" in a Spring Boot app to send large messages (above 1 MB) from a Kafka producer?

How to use li-apache-kafka-clients in a Spring Boot app to send large messages (above 1 MB) from a Kafka producer to a Kafka consumer? Below is the GitHub link of li-apache-kafka-clients:
https://github.com/linkedin/li-apache-kafka-clients
I have imported the .jar file of li-apache-kafka-clients and put the below configuration for the producer:
props.put("large.message.enabled", "true");
props.put("max.message.segment.bytes", 1000 * 1024);
props.put("segment.serializer", DefaultSegmentSerializer.class.getName());
and for the consumer:
message.assembler.buffer.capacity,
max.tracked.messages.per.partition,
exception.on.message.dropped,
segment.deserializer.class
but I am still getting an error for large messages. Please help me solve the error.
Below is my code; please let me know where I need to create the LiKafkaProducer:
@Configuration
public class KafkaProducerConfig {

    @Value("${kafka.boot.server}")
    private String kafkaServer;

    @Bean
    public ProducerFactory<String, String> producerFactory() {
        return new DefaultKafkaProducerFactory<>(producerConfig());
    }

    @Bean
    public KafkaTemplate<String, String> kafkaTemplate() {
        return new KafkaTemplate<String, String>(producerFactory());
    }

    @Bean
    public Map<String, Object> producerConfig() {
        Map<String, Object> config = new HashMap<>();
        config.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        config.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        config.put("bootstrap.servers", "localhost:9092");
        config.put("acks", "all");
        config.put("retries", 0);
        config.put("batch.size", 16384);
        config.put("linger.ms", 1);
        config.put("buffer.memory", 33554432);
        // The following properties are used by LiKafkaProducerImpl
        config.put("large.message.enabled", "true");
        config.put("max.message.segment.bytes", 1000 * 1024);
        config.put("segment.serializer", DefaultSegmentSerializer.class.getName());
        config.put("auditor.class", LoggingAuditor.class.getName());
        return config;
    }
}
@RestController
@RequestMapping("/kafkaProducer")
public class KafkaProducerController {

    @Autowired
    private KafkaSender sender;

    @PostMapping
    public ResponseEntity<List<Student>> sendData(@RequestBody List<Student> student) {
        sender.sendData(student);
        return new ResponseEntity<List<Student>>(student, HttpStatus.OK);
    }
}
@Service
public class KafkaSender {

    private static final Logger LOGGER = LoggerFactory.getLogger(KafkaSender.class);

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    @Value("${kafka.topic.name}")
    private String topicName;

    public void sendData(List<Student> student) {
        Map<String, Object> headers = new HashMap<>();
        headers.put(KafkaHeaders.TOPIC, topicName);
        headers.put("payload", student.get(0));
        // Construct a JSONObject from a Map.
        JSONObject HeaderObject = new JSONObject(headers);
        System.out.println("\nMethod-2: Using new JSONObject() ==> " + HeaderObject);
        final String record = HeaderObject.toString();
        Message<String> message = MessageBuilder.withPayload(record).setHeader(KafkaHeaders.TOPIC, topicName)
                .setHeader(KafkaHeaders.MESSAGE_KEY, "Message")
                .build();
        kafkaTemplate.send(topicName, message.toString());
    }
}
You would need to implement your own ConsumerFactory and ProducerFactory to create the LiKafkaConsumer and LiKafkaProducer respectively.
You should be able to subclass the default factories provided by the framework.
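A rough sketch of what such a producer factory could look like, assuming LiKafkaProducerImpl exposes a constructor that accepts the configuration map and that LiKafkaProducer implements the plain Kafka Producer interface; check the li-apache-kafka-clients API for the exact constructor and package names before using this:
import java.util.Map;

import org.apache.kafka.clients.producer.Producer;
import org.springframework.kafka.core.ProducerFactory;

import com.linkedin.kafka.clients.producer.LiKafkaProducerImpl;

// Sketch only: a ProducerFactory that hands Spring a LiKafkaProducer instead of the
// default KafkaProducer. Depending on your spring-kafka version, additional
// ProducerFactory methods may need to be overridden.
public class LiKafkaProducerFactory implements ProducerFactory<String, String> {

    private final Map<String, Object> config;

    public LiKafkaProducerFactory(Map<String, Object> config) {
        this.config = config;
    }

    @Override
    public Producer<String, String> createProducer() {
        // the LiKafka producer handles large-message segmentation transparently
        return new LiKafkaProducerImpl<>(config);
    }
}
You would then expose this factory as the producerFactory() bean and build the KafkaTemplate from it; the consumer side would follow the same pattern with a ConsumerFactory that returns a LiKafkaConsumer, so the segments are reassembled before your listener sees the message.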

How to access MessageHeaders in Kafka batch listener

I have the below auto-configuration class for Kafka:
@Configuration
@EnableKafka
@ConditionalOnClass(KafkaReceiver.class)
@ConditionalOnProperty({"spring.kafka.bootstrap-servers"})
public class KafkaAutoConfiguration<T> {

    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;

    private KafkaProperties kafkaConfig;
    private String groupId;

    @Autowired
    public void setKafkaProperties(KafkaProperties properties) {
        this.kafkaConfig = properties;
    }

    @Bean
    public Map<String, Object> consumerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                org.apache.kafka.common.serialization.ByteArrayDeserializer.class);
        props.put(ConsumerConfig.GROUP_ID_CONFIG, kafkaConfig.getGroupId());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
        return props;
    }

    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(consumerConfigs());
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        return factory;
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaBatchListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        factory.setBatchListener(true);
        return factory;
    }

    @Bean
    public DefaultKafkaHeaderMapper headerMapper() {
        return new DefaultKafkaHeaderMapper();
    }

    @Bean("simpleReceiver")
    @ConditionalOnMissingBean(name = "simpleReceiver")
    @ConditionalOnProperty({"service.kafka.consumer.topics"})
    public KafkaReceiver simpleReceiver() {
        return new KafkaSimpleReceiver();
    }

    @Bean("batchReceiver")
    @ConditionalOnMissingBean(name = "batchReceiver")
    @ConditionalOnProperty({"service.kafka.consumer.batch-topics"})
    public KafkaReceiver batchReceiver() {
        return new KafkaBatchReceiver();
    }
}
Simple listener bean:
public class KafkaSimpleReceiver implements KafkaReceiver {

    @KafkaListener(topics = "#{'${service.kafka.consumer.topics}'.split(',')}", containerFactory = "kafkaListenerContainerFactory")
    public void receive(ConsumerRecord record, @Headers MessageHeaders headers) throws KafkaException {
    }
}
Batch listener bean:
public class KafkaBatchReceiver implements KafkaReceiver {

    @KafkaListener(topics = "#{'${service.kafka.consumer.batch-topics}'.split(',')}", containerFactory = "kafkaBatchListenerContainerFactory")
    public void receive(List<ConsumerRecord> records, @Headers MessageHeaders headers) throws KafkaException {
    }
}
The simple listener is working fine, but I am getting the following error for the batch listener. How can we access MessageHeaders in this case?
A parameter of type 'List<ConsumerRecord>' must be the only parameter (except for an optional 'Acknowledgment' and/or 'Consumer')
EDIT
This is what I did to convert ConsumerRecord to MessageHeaders
public void receive(List<ConsumerRecord> records) {
    for (ConsumerRecord record : records) {
        Map<String, Object> headersList = new HashMap<>();
        for (final Header h : record.headers()) {
            headersList.put(h.key(), new String(h.value()));
        }
        MessageHeaders headers = new MessageHeaders(headersList);
    }
}
You can get a list of Message<?>.
@KafkaListener(topics = "so54086076", id = "so54086076")
public void listen(List<Message<?>> records) {
    System.out.println(records.size() + ":" + records);
}
The message payloads will be the ConsumerRecord.value(); the other ConsumerRecord properties will be in the headers.
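For example, a sketch of reading per-record metadata from each Message in the batch (the listener id is illustrative; the header names are standard KafkaHeaders constants):
@KafkaListener(topics = "so54086076", id = "so54086076-headers")
public void listen(List<Message<?>> records) {
    for (Message<?> record : records) {
        // the payload is the ConsumerRecord value; Kafka metadata lives in the headers
        Object topic = record.getHeaders().get(KafkaHeaders.RECEIVED_TOPIC);
        Object offset = record.getHeaders().get(KafkaHeaders.OFFSET);
        System.out.println(topic + "@" + offset + " -> " + record.getPayload());
    }
}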

Unable to read same message on same topic by multiple consumers

I am publishing a message to topicX. To consume this message with multiple consumers, I have written the following ConsumerConfig (note, each consumer has a different group_id).
@EnableKafka
@Configuration
public class ConsumerConfig {

    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;

    @Bean
    public Map<String, Object> consumerConfigs(String groupId) {
        Map<String, Object> props = new HashMap<>();
        props.put(BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(GROUP_ID_CONFIG, "my.groupid." + groupId);
        props.put(AUTO_OFFSET_RESET_CONFIG, "earliest");
        return props;
    }

    @Bean
    public ConsumerFactory<String, Car> consumerFactory(String groupId) {
        return new DefaultKafkaConsumerFactory<>(consumerConfigs(groupId), new StringDeserializer(),
                new JsonDeserializer<>(Car.class));
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, Car> consumerOneKafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, Car> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory("one"));
        return factory;
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, Car> consumerTwoKafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, Car> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory("two"));
        return factory;
    }
}
And then on a specific consumer, I have added the following annotation (specifying the group id and container factory):
@KafkaListener(id = "my.groupid.one", topics = "${app.topic.topicname}", containerFactory = "consumerOneKafkaListenerContainerFactory")
But when I publish the message, only one consumer consumes it. How can I configure Kafka so that each published message is consumed by multiple consumers?
I have tried this code with 1 partition and with n partitions (where n = number of consumers), and I still have this issue.
UPDATE
Here is my topic producer class
@Service
@Slf4j
public class TopicProducer {

    @Autowired
    private KafkaTemplate<String, TopicPayload> kafkaTemplate;

    public void send(String topicName, TopicPayload payload) {
        log.info("sending message={} to topic={}", payload, topicName);
        kafkaTemplate.send(topicName, payload);
    }
}
And here is my TopicProducerConfig
@Configuration
public class TopicProducerConfig {

    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;

    @Bean
    public Map<String, Object> producerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class);
        return props;
    }

    @Bean
    public ProducerFactory<String, TopicPayload> producerFactory() {
        return new DefaultKafkaProducerFactory<>(producerConfigs());
    }

    @Bean
    public KafkaTemplate<String, TopicPayload> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }
}
I post a message by calling producer.send(topicName, payload).
