Consuming Kafka Messages in Spring - spring-boot

By following this tutorial, I was able to create a simple producer-consumer example. In my example there was only one topic and I was listening to that topic, so the code in ReceiverConfig makes sense, especially the point about GROUP_ID_CONFIG: I created the topic topic_name and then referenced it in this configuration. Now my question is: what if I have more than one topic, say topic_1, topic_2 and so on? Should I create a ReceiverConfig for each individual topic?
@EnableKafka
@Configuration
public class ReceiverConfig {

    @Value("${spring.kafka.bootstrap-servers}")
    private String bootstrapServers;

    @Bean
    public Map<String, Object> consumerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(GROUP_ID_CONFIG, "topic_name");
        props.put(AUTO_OFFSET_RESET_CONFIG, "earliest");
        return props;
    }

    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        return new DefaultKafkaConsumerFactory<>(consumerConfigs());
    }

    @Bean
    public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        return factory;
    }
}

The short answer is no, you do not need to create a separate configuration for each topic.
Before going any further, it is worth pointing out that the groupId (the consumer group the consumer process belongs to) and the topics that the consumer process reads from are two different things.
With the line below you only tell the consumer that it belongs to the topic_name group, nothing more.
props.put(GROUP_ID_CONFIG, "topic_name");
If you want a consumer to read data from multiple topics, there is a subscribe method that takes a Collection as a parameter; that way you specify all the topics to read from without having to create a new configuration for each one.
Check out this example; you will see the method I mentioned:
// Subscribe to the topic.
consumer.subscribe(Collections.singletonList(TOPIC));
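In Spring Kafka you normally don't call subscribe yourself; the listener container does that for you, and you can simply list several topics on the @KafkaListener annotation. A minimal sketch, reusing the containerFactory from the question (the group name "my-group" is just an assumed example):

@Component
public class MultiTopicListener {

    // One listener subscribed to several topics; the group id can be set here
    // instead of (or in addition to) GROUP_ID_CONFIG in the consumer properties.
    @KafkaListener(topics = {"topic_1", "topic_2"}, groupId = "my-group",
            containerFactory = "kafkaListenerContainerFactory")
    public void listen(String message) {
        System.out.println("received: " + message);
    }
}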

Related

How can I test that I have configured ChainedKafkaTransactionManager correctly in my spring boot service

My Spring Boot service needs to consume Kafka events off one topic, do some processing (including writing to the db with JPA) and then produce some events on a new topic. Under no circumstances can I end up having published events without updating the database, and if anything goes wrong I want the next poll of the consumer to retry the event. My processing logic, including the db update, is idempotent, so retrying it is fine.
I think I have achieved exactly-once semantics as described at https://docs.spring.io/spring-kafka/reference/html/#exactly-once by using a ChainedKafkaTransactionManager like so:
@Bean
public ChainedKafkaTransactionManager chainedTransactionManager(JpaTransactionManager jpa, KafkaTransactionManager<?, ?> kafka) {
    kafka.setTransactionSynchronization(SYNCHRONIZATION_ON_ACTUAL_TRANSACTION);
    return new ChainedKafkaTransactionManager(kafka, jpa);
}

@Bean
public ConcurrentKafkaListenerContainerFactory<?, ?> kafkaListenerContainerFactory(
        ConcurrentKafkaListenerContainerFactoryConfigurer configurer,
        ConsumerFactory<Object, Object> kafkaConsumerFactory,
        ChainedKafkaTransactionManager chainedTransactionManager) {
    ConcurrentKafkaListenerContainerFactory<Object, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
    configurer.configure(factory, kafkaConsumerFactory);
    factory.getContainerProperties().setTransactionManager(chainedTransactionManager);
    return factory;
}
The relevant Kafka config in my application.yaml file looks like:
kafka:
  ...
  consumer:
    group-id: myGroupId
    auto-offset-reset: earliest
    properties:
      isolation.level: read_committed
  ...
  producer:
    transaction-id-prefix: ${random.uuid}
  ...
Because the commit order is critical to my application, I would like to write an integration test to prove that the commits happen in the desired order and that, if an error occurs during the commit to Kafka, the original event is consumed again. However, I am struggling to find a good way of causing a failure between the db commit and the Kafka commit.
Any suggestions or alternative ways I could do this?
Thanks
You could use a custom ProducerFactory to return a MockProducer (provided by kafka-clients).
Set the commitTransactionException so that it is thrown when the KTM tries to commit the transaction.
EDIT
Here is an example; it doesn't use the chained TM, but that shouldn't make a difference.
@SpringBootApplication
public class So66018178Application {

    public static void main(String[] args) {
        SpringApplication.run(So66018178Application.class, args);
    }

    @KafkaListener(id = "so66018178", topics = "so66018178")
    public void listen(String in) {
        System.out.println(in);
    }
}
spring.kafka.producer.transaction-id-prefix=tx-
spring.kafka.consumer.auto-offset-reset=earliest
@SpringBootTest(classes = { So66018178Application.class, So66018178ApplicationTests.Config.class })
@EmbeddedKafka(bootstrapServersProperty = "spring.kafka.bootstrap-servers")
class So66018178ApplicationTests {

    @Autowired
    EmbeddedKafkaBroker broker;

    @Test
    void kafkaCommitFails(@Autowired KafkaListenerEndpointRegistry registry, @Autowired Config config)
            throws InterruptedException {

        registry.getListenerContainer("so66018178").stop();
        AtomicReference<Exception> listenerException = new AtomicReference<>();
        CountDownLatch latch = new CountDownLatch(1);
        ((ConcurrentMessageListenerContainer<String, String>) registry.getListenerContainer("so66018178"))
                .setAfterRollbackProcessor(new AfterRollbackProcessor<>() {

                    @Override
                    public void process(List<ConsumerRecord<String, String>> records, Consumer<String, String> consumer,
                            Exception exception, boolean recoverable) {

                        listenerException.set(exception);
                        latch.countDown();
                    }
                });
        registry.getListenerContainer("so66018178").start();

        Map<String, Object> props = KafkaTestUtils.producerProps(this.broker);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        DefaultKafkaProducerFactory<String, String> pf = new DefaultKafkaProducerFactory<>(props);
        KafkaTemplate<String, String> template = new KafkaTemplate<>(pf);
        template.send("so66018178", "test");
        assertThat(latch.await(10, TimeUnit.SECONDS)).isTrue();
        assertThat(listenerException.get()).isInstanceOf(ListenerExecutionFailedException.class)
                .hasCause(config.exception);
    }

    @Configuration
    public static class Config {

        RuntimeException exception = new RuntimeException("test");

        @Bean
        public ProducerFactory<Object, Object> pf() {
            return new ProducerFactory<>() {

                @Override
                public Producer<Object, Object> createProducer() {
                    MockProducer<Object, Object> mockProducer = new MockProducer<>();
                    mockProducer.commitTransactionException = Config.this.exception;
                    return mockProducer;
                }

                @Override
                public Producer<Object, Object> createProducer(String txIdPrefix) {
                    Producer<Object, Object> producer = createProducer();
                    producer.initTransactions();
                    return producer;
                }

                @Override
                public boolean transactionCapable() {
                    return true;
                }
            };
        }
    }
}
Do not use ChainedKafkaTransactionManager anymore; it is deprecated.
According to the docs:
https://docs.spring.io/spring-kafka/reference/html/#container-transaction-manager
"The ChainedKafkaTransactionManager is now deprecated, since version 2.7; see the javadocs for its super class ChainedTransactionManager for more information. Instead, use a KafkaTransactionManager in the container to start the Kafka transaction and annotate the listener method with #Transactional to start the other transaction."
In my tests, where I tried to simulate exception in Producer after DB transaction committed, I simply left mandatory field empty in Kafka event (used Avro schema), and in the second test I deleted the topic for producing with the help of Kafka Admin. And then I wrote some asserts to verify that Kafka Listener was called again, when retrying.
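For reference, here is a minimal sketch of the replacement the docs recommend, assuming Spring Boot auto-configures a KafkaTransactionManager (because transaction-id-prefix is set) and that the JPA transaction manager bean is named transactionManager; the listener id and topic below are illustrative, not taken from the original posts:

@Bean
public ConcurrentKafkaListenerContainerFactory<Object, Object> kafkaListenerContainerFactory(
        ConcurrentKafkaListenerContainerFactoryConfigurer configurer,
        ConsumerFactory<Object, Object> kafkaConsumerFactory,
        KafkaTransactionManager<Object, Object> kafkaTransactionManager) {
    ConcurrentKafkaListenerContainerFactory<Object, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
    configurer.configure(factory, kafkaConsumerFactory);
    // The container starts (and commits last) the Kafka transaction...
    factory.getContainerProperties().setTransactionManager(kafkaTransactionManager);
    return factory;
}

// ...and the listener method starts the JPA transaction, which commits first,
// so the db is updated before the Kafka transaction is committed.
@KafkaListener(id = "myListener", topics = "myTopic")
@Transactional("transactionManager")
public void listen(String in) {
    // process, write to the db with JPA, send to the output topic
}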

Consumer Class Listener method not getting triggered to receive messages from topic. Kafka with Spring Boot App

I'm using Kafka with Spring Boot. I use REST controllers to call the producer/consumer APIs. The producer class is able to add messages to the topic; I verified this using the command-line utility (kafka-console-consumer.sh). However, my consumer class is not able to receive them in Java for further processing.
The @KafkaListener on my consumer class's listener method should receive messages when my producer class posts messages to the topic, but that is not happening. Any help appreciated.
Is it still necessary for the consumer to subscribe and poll for records when I have already created a KafkaListenerContainerFactory that is responsible for invoking the consumer listener method when a message is posted to the topic?
Consumer Class
@Component
public class KafkaListenersExample {

    private final List<KafkaPayload> messages = new ArrayList<>();

    @KafkaListener(topics = "test_topic", containerFactory = "kafkaListenerContainerFactory")
    public void listener(KafkaPayload data) {
        synchronized (messages) {
            messages.add(data);
        }
        //System.out.println("message from kafka :"+data);
    }

    public List<KafkaPayload> getMessages() {
        return messages;
    }
}
Consumer Config
@Configuration
class KafkaConsumerConfig {

    private String bootstrapServers = "localhost:9092";

    @Bean
    public ConsumerFactory<String, KafkaPayload> consumerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
        return new DefaultKafkaConsumerFactory<>(props);
    }

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, KafkaPayload> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, KafkaPayload> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerConfigs());
        return factory;
    }
}
The listener container creates the consumer, subscribes, and takes care of the polling.
Turning on DEBUG logging should help determine what's wrong.
If the records are already in the topic, you need to set ConsumerConfig.AUTO_OFFSET_RESET_CONFIG to earliest. Otherwise, the consumer starts consuming from the end of the topic (latest).
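A minimal sketch of that change inside the question's consumerConfigs() bean (the group name "test_group" is just an assumed example; consumers also need a group.id in order to subscribe, which the config above does not set):

props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); // start from the beginning if no committed offset
props.put(ConsumerConfig.GROUP_ID_CONFIG, "test_group");        // assumed example group id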

Infinite retries with SeekToCurrentErrorHandler in kafka consumer

I've configured a Kafka consumer with SeekToCurrentErrorHandler in a Spring Boot application using spring-kafka. My consumer configuration is:
@Bean
public ConsumerFactory<String, String> consumerFactory() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafkaserver");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "group-id");
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, ErrorHandlingDeserializer2.class);
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, ErrorHandlingDeserializer2.class);
    props.put(ErrorHandlingDeserializer2.KEY_DESERIALIZER_CLASS, StringDeserializer.class);
    props.put(ErrorHandlingDeserializer2.VALUE_DESERIALIZER_CLASS, StringDeserializer.class.getName());
    props.put(JsonDeserializer.KEY_DEFAULT_TYPE, "java.lang.String");
    props.put(JsonDeserializer.VALUE_DEFAULT_TYPE, "java.lang.String");
    return new DefaultKafkaConsumerFactory<>(props);
}

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
    SeekToCurrentErrorHandler seekToCurrentErrorHandler = new SeekToCurrentErrorHandler(5);
    seekToCurrentErrorHandler.setCommitRecovered(true);
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.getContainerProperties().setAckOnError(false);
    factory.getContainerProperties().setAckMode(AckMode.MANUAL_IMMEDIATE);
    factory.setErrorHandler(seekToCurrentErrorHandler);
    return factory;
}
To test the SeekToCurrentErrorHandler config, I pushed a record into Kafka in an incorrect format so that it fails with a deserialization exception. As per my understanding, the error handler should try to handle the failed record 5 times and after that log it and move on to the next record.
But it keeps reading the failed record an infinite number of times.
Please tell me where I am going wrong.
I have exactly the same problem, and the only fix that worked for me was to make sure the concurrency level is the same as the number of partitions of the topic. Otherwise it keeps retrying infinitely.
Sounds like a bug in Spring Kafka.
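A minimal sketch of that workaround in the question's kafkaListenerContainerFactory() bean (the value 3 is an assumed partition count, not taken from the original post):

// Match the container concurrency to the topic's partition count (assumed to be 3 here).
factory.setConcurrency(3);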

AWS SQS (queue) with Spring Boot - performance issues

I have a service that reads all messages from AWS SQS.
@Slf4j
@Configuration
@EnableJms
public class JmsConfig {

    private SQSConnectionFactory connectionFactory;

    public JmsConfig(
            @Value("${amazon.sqs.accessKey}") String awsAccessKey,
            @Value("${amazon.sqs.secretKey}") String awsSecretKey,
            @Value("${amazon.sqs.region}") String awsRegion,
            @Value("${amazon.sqs.endpoint}") String awsEndpoint) {
        connectionFactory = new SQSConnectionFactory(
                new ProviderConfiguration(),
                AmazonSQSClientBuilder.standard()
                        .withCredentials(new AWSStaticCredentialsProvider(
                                new BasicAWSCredentials(awsAccessKey, awsSecretKey)))
                        .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(awsEndpoint, awsRegion))
                        .build());
    }

    @Bean
    public DefaultJmsListenerContainerFactory jmsListenerContainerFactory() {
        DefaultJmsListenerContainerFactory factory =
                new DefaultJmsListenerContainerFactory();
        factory.setConnectionFactory(this.connectionFactory);
        factory.setDestinationResolver(new DynamicDestinationResolver());
        factory.setConcurrency("3-10");
        factory.setSessionAcknowledgeMode(Session.CLIENT_ACKNOWLEDGE);
        factory.setReceiveTimeout(2000L); //??????????
        return factory;
    }

    @Bean
    public JmsTemplate defaultJmsTemplate() {
        return new JmsTemplate(this.connectionFactory);
    }
}
I've heard about long polling, so I wonder how I could use it in my case. I also wonder how this listener works; I do not want to make unnecessary calls to AWS SQS.
My listener, which reads messages, converts them to an object, and saves them to a Redis db:
@JmsListener(destination = "${amazon.sqs.destination}")
public void receive(String requestJSON) throws JMSException {
    log.info("Received");
    try {
        Trace trace = Trace.fromJSON(requestJSON);
        traceRepository.save(trace);
        (...)
I'd like to know your opinions: what is the best approach to minimize unnecessary calls to SQS to get messages?
Should I use, for example,
factory.setReceiveTimeout(2000L);
Unfortunately there is very little information about this on the Internet.
Thanks,
Matthew
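As an illustration of the long polling mentioned in the question (not part of the original post): with SQS, long polling is enabled by the queue's ReceiveMessageWaitTimeSeconds attribute (up to 20 seconds), which makes an empty receive call wait for messages instead of returning immediately. A sketch using the AWS SDK v1 client, where "my-queue" is a hypothetical queue name:

// Enable long polling on the queue so empty receives wait up to 20 seconds,
// reducing the number of empty ReceiveMessage API calls.
AmazonSQS sqs = AmazonSQSClientBuilder.standard().build(); // or reuse the client built in JmsConfig
String queueUrl = sqs.getQueueUrl("my-queue").getQueueUrl(); // hypothetical queue name
sqs.setQueueAttributes(new SetQueueAttributesRequest()
        .withQueueUrl(queueUrl)
        .withAttributes(Collections.singletonMap("ReceiveMessageWaitTimeSeconds", "20")));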

Spring Integration - kafka Outbound adapter not taking topic value exposed as spring bean

I have successfully integrated the Kafka outbound channel adapter with a fixed topic name. Now I want to make the topic name configurable and hence want to expose it via application properties.
application.properties contains the following entry:
kafkaTopic:testNewTopic
My configuration class looks like below:
@Configuration
@Component
public class KafkaConfig {

    @Value("${kafkaTopic}")
    private String kafkaTopicName;

    @Bean
    public String getTopic() {
        return kafkaTopicName;
    }

    @Bean
    public KafkaTemplate<String, String> kafkaTemplate() {
        return new KafkaTemplate<>(producerFactory());
    }

    @Bean
    public ProducerFactory<String, String> producerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); //this.brokerAddress);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
        // set more properties
        return new DefaultKafkaProducerFactory<>(props);
    }
}
and in my si-config.xml I have used the following (e.g. topic="getTopic"):
<int-kafka:outbound-channel-adapter
id="kafkaOutboundChannelAdapter" kafka-template="kafkaTemplate"
auto-startup="true" sync="true" channel="inputToKafka" topic="getTopic">
</int-kafka:outbound-channel-adapter>
However, the configuration is unable to pick up the topic name when it is exposed via a bean, but it works fine when I hard-code the value of the topic name.
Can someone please suggest what I am doing wrong here?
Does the topic attribute within the Kafka outbound channel adapter accept a value referenced as a bean?
How do I externalize it, given that every application using my utility will supply a different Kafka topic name?
The topic attribute expects a plain string value.
However it supports property placeholder resolution:
topic="${kafkaTopic}"
and also SpEL evaluation for the aforementioned bean:
topic="#{getTopic}"
This works simply because it is allowed by the XML parser configuration.
Note, however, that the KafkaTemplate you inject into the <int-kafka:outbound-channel-adapter> has a defaultTopic property, so you may not need to set the topic in the XML at all.
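For example, a minimal sketch using that defaultTopic property on the kafkaTemplate bean from the question, so the topic attribute can be omitted from the XML:

@Bean
public KafkaTemplate<String, String> kafkaTemplate() {
    KafkaTemplate<String, String> template = new KafkaTemplate<>(producerFactory());
    // Messages that do not specify a topic fall back to this default,
    // resolved from the kafkaTopic entry in application.properties.
    template.setDefaultTopic(this.kafkaTopicName);
    return template;
}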
And one more option available to you is Spring Integration annotation configuration, where you can define a @ServiceActivator for a KafkaProducerMessageHandler @Bean:
@ServiceActivator(inputChannel = "inputToKafka")
@Bean
public KafkaProducerMessageHandler<String, String> kafkaOutboundChannelAdapter() {
    KafkaProducerMessageHandler<String, String> adapter = new KafkaProducerMessageHandler<>(kafkaTemplate());
    adapter.setSync(true);
    adapter.setTopicExpression(new LiteralExpression(this.kafkaTopicName));
    return adapter;
}
