Restart consumer from the beginning when I restart my Spring Boot app - spring-boot

I have a Kafka topic with data, called "topic01"
I want to create a consumer that starts reading that topic from the beginning every time I start my Spring Boot 2 application.
I have the following code. New records added to the topic do reach my listener, but on the first start it does not read the topic from the beginning.
@KafkaListener(topics = "topic01")
public void listenTopic01(ConsumerRecord<String, MiDTO> consumerRecord) throws Exception {
    logger.info("KafkaHandler");
    logger.info(consumerRecord.value().toString());
    logger.info(consumerRecord.key().toString());
    latch.countDown();
}
application.properties:
spring.kafka.consumer.group-id=XXXXX
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.consumer.value-deserializer=org.springframework.kafka.support.serializer.JsonDeserializer
What configuration should I add so that this @KafkaListener reads the topic from the beginning every time I restart my application?

Either use a unique (random) group-id each time, or have your listener class implement ConsumerSeekAware and add

@Override
public void onPartitionsAssigned(Consumer<?, ?> consumer, Collection<TopicPartition> partitions) {
    consumer.seekToBeginning(partitions);
}

or

@KafkaListener(topics = "topic01",
        groupId = "#{T(java.util.UUID).randomUUID().toString()}")

Related

Is it possible to selectively disable queue consumption with @JmsListener in Spring Boot?

I'm using Spring Boot along with @JmsListener to retrieve IBM MQ messages from multiple queues within the same queue manager. So far I can get messages without any issues, but there could be scenarios where I have to stop consuming messages from one of these queues temporarily. It doesn't have to be dynamic.
I'm not using any custom ConnectionFactory methods. When needed, I would like to make config changes in application.properties to disable consumption from that particular queue and restart the process. Is this possible? I can't find any specific info for this scenario. Would appreciate any suggestions. TIA.
@Component
public class MyJmsListener {

    @JmsListener(destination = "${ibm.mq.queue.queue01}")
    public void handleQueue01(String message) {
        System.out.println("received: " + message);
    }

    @JmsListener(destination = "${ibm.mq.queue.queue02}")
    public void handleQueue02(String message) {
        System.out.println("received: " + message);
    }
}
application.properties:
ibm.mq.queue.queue01=IBM.QUEUE01
ibm.mq.queue.queue02=IBM.QUEUE02
If you give each @JmsListener an id property, you can start and stop them individually using the JmsListenerEndpointRegistry bean.
registry.getListenerContainer(id).stop();
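A minimal sketch of that idea, assuming a made-up listener id (queue01Listener) and a hypothetical app.queue01.enabled flag in application.properties:

import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.event.EventListener;
import org.springframework.jms.config.JmsListenerEndpointRegistry;
import org.springframework.stereotype.Component;

@Component
public class QueueToggler {

    private final JmsListenerEndpointRegistry registry;

    // hypothetical flag; set app.queue01.enabled=false in application.properties to pause queue01
    @Value("${app.queue01.enabled:true}")
    private boolean queue01Enabled;

    public QueueToggler(JmsListenerEndpointRegistry registry) {
        this.registry = registry;
    }

    @EventListener(ApplicationReadyEvent.class)
    public void applyToggles() {
        if (!queue01Enabled) {
            // the id must match @JmsListener(id = "queue01Listener", destination = "${ibm.mq.queue.queue01}")
            registry.getListenerContainer("queue01Listener").stop();
        }
    }
}

Flipping the property and restarting the application then disables only that listener while the others keep consuming.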

spring-cloud-stream - Kafka producer prefix unique per node

I want to send something to a Kafka topic in a producer-only (not read-process-write) transaction using an output channel.
I read the documentation and another topic on Stack Overflow (Spring cloud stream kafka transactions in producer side).
The problem is that I need to set a unique transactionIdPrefix per node.
Any suggestion on how to do it?
Here is one way...
import java.util.Properties;
import java.util.UUID;

import org.springframework.context.EnvironmentAware;
import org.springframework.core.env.Environment;
import org.springframework.core.env.PropertiesPropertySource;
import org.springframework.core.env.StandardEnvironment;
import org.springframework.stereotype.Component;

@Component
class TxIdCustomizer implements EnvironmentAware {

    @Override
    public void setEnvironment(Environment environment) {
        Properties properties = new Properties();
        properties.setProperty("spring.cloud.stream.kafka.binder.transaction.transactionIdPrefix",
                UUID.randomUUID().toString());
        ((StandardEnvironment) environment).getPropertySources()
                .addLast(new PropertiesPropertySource("txId", properties));
    }
}

How to stop consuming messages from Kafka when an error occurs, and restart consuming again after some time, in Spring Boot

This is the first time I am using Kafka. I have a Spring Boot application and I am consuming messages from Kafka topics and storing them in a DB. I have a requirement to handle DB failover: if the DB is down, the message should not be committed and consumption should be suspended for some time, after which the listener can start consuming again. What is the best approach to do this?
I am using spring-kafka:2.2.8.RELEASE, which internally uses Kafka 2.0.1.
Configure a ContainerStoppingErrorHandler and throw an exception from your listener.
https://docs.spring.io/spring-kafka/docs/2.2.13.RELEASE/reference/html/#container-stopping-error-handlers
You can restart the container later when you have detected that your DB is back online.
https://docs.spring.io/spring-kafka/docs/2.2.13.RELEASE/reference/html/#kafkalistener-lifecycle
EDIT
@SpringBootApplication
public class So62125817Application {

    public static void main(String[] args) {
        SpringApplication.run(So62125817Application.class, args);
    }

    @Bean
    TaskScheduler scheduler() {
        return new ThreadPoolTaskScheduler();
    }

    @Bean
    public NewTopic topic() {
        return TopicBuilder.name("so62125817").partitions(1).replicas(1).build();
    }
}

@Component
class Listener {

    private final TaskScheduler scheduler;

    private final KafkaListenerEndpointRegistry registry;

    public Listener(TaskScheduler scheduler, KafkaListenerEndpointRegistry registry,
            AbstractKafkaListenerContainerFactory<?, ?, ?> factory) {
        this.scheduler = scheduler;
        this.registry = registry;
        factory.setErrorHandler(new ContainerStoppingErrorHandler());
    }

    @KafkaListener(id = "so62125817.id", topics = "so62125817")
    public void listen(String in) {
        System.out.println(in);
        // run this code if you want to stop the container and restart it in 60 seconds
        this.scheduler.schedule(() -> {
            this.registry.getListenerContainer("so62125817.id").start();
        }, new Date(System.currentTimeMillis() + 60_000));
        throw new RuntimeException("test restart");
    }
}
There are two approaches I can think of for doing this:
First approach: leave the auto-commit option for consuming messages set to true (the configuration for this is enable.auto.commit; by default it is true, so you do not need to change anything). Whenever your DB operation fails, put the message on a different topic, say one named failed_events (see the sketch after this answer). You can then have the same application (the one that populates the DB) run, say, once a day to consume the messages from the failed_events topic and populate the DB again. This way you can keep track of how many times the DB write has failed. One small thing to consider is what to do if the DB is also down during that run. You can decide how to handle that case: probably discard the message if it is OK to do so, or do a certain number of retries.
Second approach: if it is deterministic to know how long the DB will be down, and that period is very small, then it is better to simply sleep on DB write failure, say for 10 minutes, before retrying. You will not have to create a separate topic in this case.
The advantage of this approach is that you don't have to run a separate instance of the same application to fetch from a different topic; you can do everything in a single application, which makes it relatively easier to maintain.
The disadvantage of this approach is that if the DB is down for a very long period, say a day, you will end up losing the message.
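To illustrate the first approach, here is a minimal sketch that forwards a record to a failed_events topic when the DB write throws; the KafkaTemplate bean, the OrderRepository type, and the orders topic name are all assumptions made for the example:

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;

@Component
class DbWritingListener {

    private final KafkaTemplate<String, String> kafkaTemplate; // assumed auto-configured by Spring Boot

    private final OrderRepository repository; // hypothetical DB access bean

    DbWritingListener(KafkaTemplate<String, String> kafkaTemplate, OrderRepository repository) {
        this.kafkaTemplate = kafkaTemplate;
        this.repository = repository;
    }

    @KafkaListener(id = "mainListener", topics = "orders")
    public void listen(String in) {
        try {
            this.repository.save(in);
        }
        catch (Exception ex) {
            // DB write failed: park the record for a later replay instead of losing it
            this.kafkaTemplate.send("failed_events", in);
        }
    }
}

A scheduled job, or a second run of the same application pointed at failed_events, can then replay the parked records once the DB is back.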

Spring Kafka batch listener - commit offsets manually in batch

I am implementing a Spring Kafka batch listener, which reads a list of messages from a Kafka topic and posts the data to a REST service.
I would like to understand the offset management in case the REST service goes down: the offsets for the batch should not be committed, and the messages should be processed again on the next poll. I have read the Spring Kafka documentation, but I am confused about the difference between the Listener Error Handler and the Seek to Current Container Error Handlers for batches. I am using spring-boot-2.0.0.M7 and below is my code.
Listener Config:
@Bean
KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setConcurrency(Integer.parseInt(env.getProperty("spring.kafka.listener.concurrency")));
    // factory.getContainerProperties().setPollTimeout(3000);
    factory.getContainerProperties().setBatchErrorHandler(kafkaErrorHandler());
    factory.getContainerProperties().setAckMode(AckMode.BATCH);
    factory.setBatchListener(true);
    return factory;
}

@Bean
public Map<String, Object> consumerConfigs() {
    Map<String, Object> propsMap = new HashMap<>();
    propsMap.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, env.getProperty("spring.kafka.bootstrap-servers"));
    propsMap.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG,
            env.getProperty("spring.kafka.consumer.enable-auto-commit"));
    propsMap.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG,
            env.getProperty("spring.kafka.consumer.auto-commit-interval"));
    propsMap.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, env.getProperty("spring.kafka.session.timeout"));
    propsMap.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    propsMap.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    propsMap.put(ConsumerConfig.GROUP_ID_CONFIG, env.getProperty("spring.kafka.consumer.group-id"));
    return propsMap;
}
Listener Class:
@KafkaListener(topics = "${spring.kafka.consumer.topic}", containerFactory = "kafkaListenerContainerFactory")
public void listen(List<String> payloadList) throws Exception {
    if (payloadList.size() > 0) {
        // Post to the service
    }
}
Kafka Error Handler:
public class KafkaErrorHandler implements BatchErrorHandler {

    private static final Logger LOGGER = LoggerFactory.getLogger(KafkaErrorHandler.class);

    @Override
    public void handle(Exception thrownException, ConsumerRecords<?, ?> data) {
        LOGGER.info("Exception occurred while processing::" + thrownException.getMessage());
    }
}
How should I handle the Kafka listener so that if something happens while processing a batch of records, I don't lose data?
With Apache Kafka we never lose data: there is always an offset in the partition log, and we can seek to any arbitrary position.
On the other hand, when we consume records from a partition there is no requirement to commit their offsets; the current consumer holds its position in memory. We only need to commit for the benefit of other, new consumers in the same group that take over when the current one dies. Independently of any error, the current consumer always moves on to poll new data from its current in-memory position.
So, to reprocess the same data in the same consumer we definitely have to use a seek operation to move the consumer back to the desired position. That's why Spring Kafka introduces the SeekToCurrentErrorHandler:
This allows implementations to seek all unprocessed topic/partitions so the current record (and the others remaining) will be retrieved by the next poll. The SeekToCurrentErrorHandler does exactly this.
https://docs.spring.io/spring-kafka/reference/htmlsingle/#_seek_to_current_container_error_handlers
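As a rough sketch of wiring that into the question's factory, the logging-only KafkaErrorHandler can be swapped for a SeekToCurrentBatchErrorHandler (the batch counterpart; verify it exists in the exact spring-kafka version pulled in by spring-boot-2.0.0.M7). This mirrors the question's own setBatchErrorHandler call and assumes the consumerFactory() bean from the question:

import org.springframework.context.annotation.Bean;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.config.KafkaListenerContainerFactory;
import org.springframework.kafka.listener.ConcurrentMessageListenerContainer;
import org.springframework.kafka.listener.SeekToCurrentBatchErrorHandler;

@Bean
KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<String, String>> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    factory.setBatchListener(true);
    // re-seek the unprocessed records so the whole failed batch is redelivered on the next poll
    factory.getContainerProperties().setBatchErrorHandler(new SeekToCurrentBatchErrorHandler());
    return factory;
}

Note that enable-auto-commit should be false for this pattern, so that the container (not the Kafka client) decides when offsets are committed.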

Spring Cloud Stream does not create a queue

I'm trying to configure a simple Spring Cloud Stream application with RabbitMQ. The code I use is mostly taken from spring-cloud-stream-samples.
I have an entry point:
@SpringBootApplication
public class DemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}
and a simple messages producer from the example:
@EnableBinding(Source.class)
public class SourceModuleDefinition {

    private String format = "yyyy-MM-dd HH:mm:ss";

    @Bean
    @InboundChannelAdapter(value = Source.OUTPUT, poller = @Poller(fixedDelay = "${fixedDelay}", maxMessagesPerPoll = "1"))
    public MessageSource<String> timerMessageSource() {
        return () -> new GenericMessage<>(new SimpleDateFormat(this.format).format(new Date()));
    }
}
Additionally, here is the application.yml configuration:
fixedDelay: 5000
spring:
  cloud:
    stream:
      bindings:
        output:
          destination: test
When I run the example, it connects to Rabbit and creates an exchange called test. But my problem is that it doesn't create a queue and a binding automatically. I can see traffic going into Rabbit, but all my messages are then gone, while I need them to stay in a queue until they are read by a consumer.
Maybe I misunderstand something, but from all the topics I have read, it seems like Spring Cloud Stream should create the queue and the binding automatically. If not, how do I configure it so that my messages are persisted?
I'm using Spring Cloud Brixton.SR5 and Spring Boot 1.4.0.RELEASE.
A queue would be created as soon as you start a consumer application.
In the case of RabbitMQ, where we have separate queues for each consumer group and we cannot know all the groups beforehand, you can use the requiredGroups property of the producer if you want queues to be created automatically for consumer groups that are known in advance. This will ensure that messages are persisted until a consumer from that group is started.
See details here: http://docs.spring.io/spring-cloud-stream/docs/Brooklyn.BUILD-SNAPSHOT/reference/htmlsingle/#_producer_properties
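For example, extending the question's application.yml, a sketch of that producer property (the group name testGroup is made up for illustration):

spring:
  cloud:
    stream:
      bindings:
        output:
          destination: test
          producer:
            # pre-declares a queue for the group and binds it to the test exchange,
            # so messages accumulate there even before any consumer is running
            requiredGroups: testGroup

The Rabbit binder names the queue after destination.group (here test.testGroup), and messages published to the exchange are stored there until a consumer from that group starts.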
You will need a consumer in order to have a queue created.
Here you can find an example of a producer and a consumer using RabbitMQ:
http://ignaciosuay.com/how-to-implement-asyncronous-communication-between-microservices-using-spring-cloud-stream/
