Spring kafka application with multiple consumer groups stops consuming messages - spring-boot

Kafka version: 2.3.1
Spring Boot version: 2.2.5.RELEASE
I have a Spring Boot Kafka application with 3 consumer groups. It stops consuming messages because of a failing heartbeat. I tried updating the consumer configuration as suggested by multiple Stack Overflow threads, but even after that I am facing the issue.
According to the logs, consumers take less than one second to consume each message up to the point where they suddenly stop. Also, some of the processing in the consumer happens in an asynchronous thread.
Below is the configuration for one of the consumer factories.
I allowed a 10-second buffer for each record and configured MAX_POLL_INTERVAL_MS_CONFIG based on that (50 records per poll × 10 s = 500 s, i.e. 50*10*1000 ms).
@Bean
public ConsumerFactory<Object, Object> reqConsumerFactory()
{
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "req-event-group");
    props.put(ConsumerConfig.CLIENT_ID_CONFIG, "req-event-group");
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, brokerConfig.getBootstrapAddress());
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
    props.put(ConsumerConfig.RETRY_BACKOFF_MS_CONFIG, 1000);
    props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 50*15*1000);   // 750 seconds
    props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 5000);
    props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 50*10*1000); // 500 seconds
    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 50);
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    return new DefaultKafkaConsumerFactory<>(props);
}
@Bean
public KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<Object, Object>> reqKafkaListenerContainerFactory()
{
    ConcurrentKafkaListenerContainerFactory<Object, Object> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(reqConsumerFactory());
    factory.getContainerProperties().setAckMode(AckMode.MANUAL_IMMEDIATE);
    factory.setErrorHandler(new SeekToCurrentErrorHandler(2));
    factory.setConcurrency(2);
    return factory;
}
One of the consumer methods:
@KafkaListener(topicPattern = "${process.update.requirement.topic.name}", containerFactory = "reqKafkaListenerContainerFactory", groupId = "req-event-group")
public void handleCompleteAndErrorRequirement(ConsumerRecord<String, Object> consumerRecord, Acknowledgment acknowledgment)
{
    RequirementEventMsg requirementEventMsg = (RequirementEventMsg) consumerRecord.value();
    acknowledgment.acknowledge();
    // asynchronous method call here
}
I don't see any errors other than these:
2020-06-23 13:28:06.815 [INFO ] AbstractCoordinator:855 - [Consumer clientId=consumer-6, groupId=process-consumer-group] Attempt to heartbeat failed since group is rebalancing
2020-06-23 13:28:06.835 [INFO ] AbstractCoordinator:855 - [Consumer clientId=consumer-4, groupId=process-consumer-group] Attempt to heartbeat failed since group is rebalancing
2020-06-23 13:28:07.175 [INFO ] ConsumerCoordinator:472 - [Consumer clientId=consumer-4, groupId=process-consumer-group] Revoking previously assigned partitions [UPDATE_REQUIREMENT_TOPIC-1, UPDATE_REQUIREMENT_TOPIC-0]
2020-06-23 13:28:07.176 [INFO ] KafkaMessageListenerContainer:394 - partitions revoked: [UPDATE_REQUIREMENT_TOPIC-1, UPDATE_REQUIREMENT_TOPIC-0]
2020-06-23 13:28:07.177 [INFO ] AbstractCoordinator:509 - [Consumer clientId=consumer-4, groupId=process-consumer-group] (Re-)joining group
2020-06-23 13:28:07.233 [INFO ] ConsumerCoordinator:472 - [Consumer clientId=consumer-6, groupId=process-consumer-group] Revoking previously assigned partitions [PROCESS_EVENT_TOPIC-0, PROCESS_EVENT_TOPIC-1]
2020-06-23 13:28:07.233 [INFO ] KafkaMessageListenerContainer:394 - partitions revoked: [PROCESS_EVENT_TOPIC-0, PROCESS_EVENT_TOPIC-1]

Related

Spring Kafka - Batch processing not working

I have a Spring Kafka consumer and I want to consume 50 records every 60 seconds. I referred to a few documents and configured my application like this:
Consumer Configurations
@Bean
public ConsumerFactory<String, DeviceInfo> consumerFactory() {
    Map<String, Object> config = new HashMap<>();
    config.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaConfig.getConsumerBootstrapServers());
    config.put(ConsumerConfig.GROUP_ID_CONFIG, "fixit-airwatch-etl");
    config.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    config.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, JsonDeserializer.class);
    config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, AUTO_OFFSET_RESET_CONFIG);
    config.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "50");
    config.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, "60000");
    return new DefaultKafkaConsumerFactory<>(config, new StringDeserializer(),
            new JsonDeserializer<>(DeviceInfo.class));
}
@Bean
public ConcurrentKafkaListenerContainerFactory<String, DeviceInfo> kafkaListenerFactory(ConsumerFactory<String, DeviceInfo> consumerFactory) {
    ConcurrentKafkaListenerContainerFactory<String, DeviceInfo> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    factory.setBatchListener(true);
    factory.setBatchErrorHandler(new BatchLoggingErrorHandler());
    return factory;
}
Kafka listener
@KafkaListener(topics = "${app.kafka.topic}", groupId = "etl-group", containerFactory = "kafkaListenerFactory")
public void receive(@Payload List<DeviceInfo> messages) {
    log.info("Got these many records from the topic {}", messages.size());
}
application.properties
spring.kafka.listener.type=batch
Despite all these configurations, I'm not seeing the expected behavior. The log statements are below.
2022-07-04 12:07:22.533 INFO 89732 --- [ntainer#0-0-C-1] c.w.g.g.f.consumer.KafkaConsumer : Got these many records from the topic 9
2022-07-04 12:07:22.533 INFO 89732 --- [ntainer#0-0-C-1] c.w.g.g.f.consumer.KafkaConsumer : Got these many records from the topic 4
2022-07-04 12:07:22.534 INFO 89732 --- [ntainer#0-0-C-1] c.w.g.g.f.consumer.KafkaConsumer : Got these many records from the topic 6
2022-07-04 12:07:22.534 INFO 89732 --- [ntainer#0-0-C-1] c.w.g.g.f.consumer.KafkaConsumer : Got these many records from the topic 8
2022-07-04 12:07:22.535 INFO 89732 --- [ntainer#0-0-C-1] c.w.g.g.f.consumer.KafkaConsumer : Got these many records from the topic 8
2022-07-04 12:07:22.535 INFO 89732 --- [ntainer#0-0-C-1] c.w.g.g.f.consumer.KafkaConsumer : Got these many records from the topic 6
2022-07-04 12:07:22.536 INFO 89732 --- [ntainer#0-0-C-1] c.w.g.g.f.consumer.KafkaConsumer : Got these many records from the topic 11
Even though I set the batch size to 50, it fetches a random number of records. Also, the delay between batches is not what I configured. Did I miss anything? Please share your thoughts. TIA.
Your configuration looks fine, but keep one thing in mind: for this consumer to work as expected, 50 records must be available in the topic every 60 seconds. Otherwise, you need to adjust the values of the following properties.
fetch.min.bytes
This property allows a consumer to specify the minimum amount of data that it wants to receive from the broker when fetching records. If a broker receives a request for records from a consumer but the new records amount to fewer bytes than fetch.min.bytes, the broker will wait until more messages are available before sending the records back to the consumer. This reduces the load on both the consumer and the broker, as they have to handle fewer back-and-forth messages in cases where the topics don't have much new activity (or during lower-activity hours of the day). You will want to set this parameter higher than the default if the consumer is using too much CPU when there isn't much data available, or to reduce the load on the brokers when you have a large number of consumers.
fetch.max.wait.ms
By setting fetch.min.bytes, you tell Kafka to wait until it has enough data to send before responding to the consumer. fetch.max.wait.ms lets you control how long to wait. By default, Kafka will wait up to 500 ms. This results in up to 500 ms of extra latency in case there is not enough data flowing into the Kafka topic to satisfy the minimum amount of data to return. If you want to limit the potential latency (usually due to SLAs controlling the maximum latency of the application), you can set fetch.max.wait.ms to a lower value. If you set fetch.max.wait.ms to 100 ms and fetch.min.bytes to 1 MB, Kafka will receive a fetch request from the consumer and will respond with data either when it has 1 MB of data to return or after 100 ms, whichever happens first.
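For example, a minimal sketch of how these two properties could be set in the consumerFactory() bean above (the 1 MB and 100 ms values are illustrative assumptions, not recommendations):
// Ask the broker for at least ~1 MB per fetch, but wait at most 100 ms for it.
config.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 1024 * 1024);
config.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 100);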

Enabling exactly once causes streams shutdown due to timeout while initializing transactional state

I've written a simple example to test the join functionality. As I sometimes get duplicated messages in the resulting topic and sometimes miss messages there, I wanted to enable exactly-once semantics while pinpointing the problem. However, when doing this through:
props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
I get a timeout that causes Kafka Streams to shut down in my app:
2019-05-02 17:02:32.585 INFO 153056 --- [-StreamThread-1] o.a.kafka.common.utils.AppInfoParser : Kafka version : 2.0.1
2019-05-02 17:02:32.585 INFO 153056 --- [-StreamThread-1] o.a.kafka.common.utils.AppInfoParser : Kafka commitId : fa14705e51bd2ce5
2019-05-02 17:02:32.593 INFO 153056 --- [-StreamThread-1] o.a.k.c.p.internals.TransactionManager : [Producer clientId=join-test-90a0aa93-dfd8-4d4f-894b-85a3c5634f72-StreamThread-1-0_0-producer, transactionalId=join-test-0_0] ProducerId set to -1 with epoch -1
2019-05-02 17:03:32.599 ERROR 153056 --- [-StreamThread-1] o.a.k.s.p.internals.StreamThread : stream-thread [join-test-90a0aa93-dfd8-4d4f-894b-85a3c5634f72-StreamThread-1] Error caught during partition assignment, will abort the current process and re-throw at the end of rebalance: {}
org.apache.kafka.common.errors.TimeoutException: Timeout expired while initializing transactional state in 60000ms.
2019-05-02 17:03:32.599 INFO 153056 --- [-StreamThread-1] o.a.k.s.p.internals.StreamThread : stream-thread [join-test-90a0aa93-dfd8-4d4f-894b-85a3c5634f72-StreamThread-1] partition assignment took 60044 ms.
current active tasks: []
current standby tasks: []
previous active tasks: []
2019-05-02 17:03:32.601 INFO 153056 --- [-StreamThread-1] o.a.k.s.p.internals.StreamThread : stream-thread [join-test-90a0aa93-dfd8-4d4f-894b-85a3c5634f72-StreamThread-1] State transition from PARTITIONS_ASSIGNED to PENDING_SHUTDOWN
2019-05-02 17:03:32.601 INFO 153056 --- [-StreamThread-1] o.a.k.s.p.internals.StreamThread : stream-thread [join-test-90a0aa93-dfd8-4d4f-894b-85a3c5634f72-StreamThread-1] Shutting down
2019-05-02 17:03:32.615 INFO 153056 --- [-StreamThread-1] o.a.k.s.p.internals.StreamThread : stream-thread [join-test-90a0aa93-dfd8-4d4f-894b-85a3c5634f72-StreamThread-1] State transition from PENDING_SHUTDOWN to DEAD
2019-05-02 17:03:32.615 INFO 153056 --- [-StreamThread-1] org.apache.kafka.streams.KafkaStreams : stream-client [join-test-90a0aa93-dfd8-4d4f-894b-85a3c5634f72] State transition from REBALANCING to ERROR
2019-05-02 17:03:32.615 WARN 153056 --- [-StreamThread-1] org.apache.kafka.streams.KafkaStreams : stream-client [join-test-90a0aa93-dfd8-4d4f-894b-85a3c5634f72] All stream threads have died. The instance will be in error state and should be closed.
2019-05-02 17:03:32.615 INFO 153056 --- [-StreamThread-1] o.a.k.s.p.internals.StreamThread : stream-thread [join-test-90a0aa93-dfd8-4d4f-894b-85a3c5634f72-StreamThread-1] Shutdown complete
Exception in thread "join-test-90a0aa93-dfd8-4d4f-894b-85a3c5634f72-StreamThread-1" org.apache.kafka.streams.errors.StreamsException: stream-thread [join-test-90a0aa93-dfd8-4d4f-894b-85a3c5634f72-StreamThread-1] Failed to rebalance.
at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:870)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:810)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:767)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:736)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timeout expired while initializing transactional state in 60000ms.
static String ORIGINAL = "original-sensor-data";
static String ERROR = "error-score";

public static void main(String[] args) throws IOException {
    SpringApplication.run(JoinTest.class, args);
    Properties props = getProperties();
    final StreamsBuilder builder = new StreamsBuilder();
    final KStream<String, OriginalSensorData> original = builder.stream(ORIGINAL, Consumed.with(Serdes.String(), new OriginalSensorDataSerde()));
    final KStream<String, ErrorScore> error = builder.stream(ERROR, Consumed.with(Serdes.String(), new ErrorScoreSerde()));
    KStream<String, ErrorScore> result = original.join(
            error,
            (originalValue, errorValue) -> new ErrorScore(new Date(originalValue.getTimestamp()), errorValue.getE(),
                    originalValue.getData().get("TE700PV").doubleValue(), errorValue.getT(), errorValue.getR()),
            // KStream-KStream joins are always windowed joins, hence we must provide a join window.
            JoinWindows.of(Duration.ofMillis(3000).toMillis()),
            Joined.with(
                    Serdes.String(),               /* key */
                    new OriginalSensorDataSerde(), /* left value */
                    new ErrorScoreSerde()          /* right value */
            )
    ).through("atl-joined-data-repartition", Produced.with(Serdes.String(), new ErrorScoreSerde()));
    result.foreach((key, value) -> System.out.println("Join Stream: " + key + " " + value));
    KafkaStreams streams = new KafkaStreams(builder.build(), props);
    streams.start();
}
private static Properties getProperties() {
    Properties props = new Properties();
    // URL of the Kafka broker; this can also be found in the Aiven console
    props.put("bootstrap.servers", "localhost:9095");
    props.put("group.id", "join-test");
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put("application.id", "join-test");
    props.put("default.timestamp.extractor", "com.my.SensorDataTimestampExtractor");
    // The key of a message is a string
    props.put("key.deserializer", StringDeserializer.class.getName());
    props.put("value.deserializer", StringDeserializer.class.getName());
    props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
    props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 1000);
    return props;
}
I expect the app to start without the timeout and continue working.

Spring boot rabbitMq setExceptionHandler

How can I convert the following code using the Spring framework?
ConnectionFactory factory = new ConnectionFactory();
factory.setExceptionHandler(new BrokerExceptionHandler(logger, instance));
public final class BrokerExceptionHandler extends StrictExceptionHandler {
    @Override
    public void handleReturnListenerException(Channel channel, Throwable exception) {
        logger.log(Level.SEVERE, "ReturnListenerException detected: ReturnListener.handleReturn", exception);
        this.publishAlert(exception, "ReturnListener.handleReturn");
        logger.log(Level.SEVERE, "Close application", exception);
        System.exit(-1);
    }
    ....
}
Basically, I need to specify a custom exception handler for when a RabbitMQ exception occurs and then stop the application.
How can I publish a RabbitMQ message every time there is an exception?
EDIT
I modified my configuration class in this way:
@Bean
SimpleMessageListenerContainer containerPredict(ConnectionFactory connectionFactory,
        MessageListenerAdapter listenerPredictAdapter) {
    SimpleMessageListenerContainer container = new SimpleMessageListenerContainer();
    container.setConnectionFactory(connectionFactory);
    container.setDefaultRequeueRejected(false);
    container.setErrorHandler(new BrokerExceptionHandler());
    container.setQueueNames(getQueueName());
    container.setMessageListener(listenerPredictAdapter);
    return container;
}
and this is my BrokerExceptionHandler class
public class BrokerExceptionHandler implements ErrorHandler {
    private final Logger logger = Logger.getLogger(getClass().getSimpleName());

    @Autowired
    private Helper helper;

    @Override
    public void handleError(Throwable t) {
        logger.log(Level.SEVERE, "Exception Detected. Publishing error alert");
        String message = "Exception detected. Message: " + t.getMessage();
        // Notify the error to the system by sending a new RabbitMQ message
        System.out.println("---> Before convertAndSend");
        rabbitTemplate.convertAndSend(exchange, routing, message);
        System.out.println("---> After convertAndSend");
    }
}
I can see the log Exception Detected. Publishing error alert and ---> Before convertAndSend in the console, but the new alert is not published and the log ---> After convertAndSend doesn't appear in the console.
Here it is the log:
2018-10-17 09:32:02.849 ERROR 1506 --- [tainer****-1] BrokerExceptionHandler : Exception Detected. Publishing error alert
---> Before convertAndSend
2018-10-17 09:32:02.853 INFO 1506 --- [tainer****-1] o.s.a.r.l.SimpleMessageListenerContainer : Restarting Consumer#4f5b08d: tags=[{amq.ctag-yUcUmg5BCo20ucG1wJZoWA=myechange}], channel=Cached Rabbit Channel: AMQChannel(amqp://admin#XXX.XXX.XXX.XXX:5672/testbed_simulators,1), conn: Proxy#3964d79 Shared Rabbit Connection: SimpleConnection#61f39bb [delegate=amqp://admin#XXX.XXX.XXX.XXX:5672/testbed_simulators, localPort= 51528], acknowledgeMode=AUTO local queue size=0
2018-10-17 09:32:02.905 INFO 1506 --- [tainer****-2] o.s.amqp.rabbit.core.RabbitAdmin : Auto-declaring a non-durable, auto-delete, or exclusive Queue (myexchange) durable:false, auto-delete:true, exclusive:true. It will be redeclared if the broker stops and is restarted while the connection factory is alive, but all messages will be lost.
2018-10-17 09:32:02.905 INFO 1506 --- [tainer****-2] o.s.amqp.rabbit.core.RabbitAdmin : Auto-declaring a non-durable, auto-delete, or exclusive Queue (myexchange) durable:false, auto-delete:true, exclusive:true. It will be redeclared if the broker stops and is restarted while the connection factory is alive, but all messages will be lost.
EDIT
Debugging, I see that before sending the new message, the following code is called:
File: SimpleMessageListenerContainer.class, line 1212
if (!isActive(this.consumer) || aborted) {
    .....
}
else {
--->logger.info("Restarting " + this.consumer);
    restart(this.consumer);
}
EDIT 2
Example code: http://github.com/fabry00/spring-boot-rabbitmq
It depends on how you are doing your configuration; if you are using Spring Boot's auto-configured connection factory...
@Bean
public InitializingBean connectionFactoryConfigurer(CachingConnectionFactory ccf) {
    return () -> ccf.getRabbitConnectionFactory().setExceptionHandler(...);
}
If you are wiring up your own beans (e.g. via a RabbitConnectionFactoryBean) then set it directly.
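For example, a minimal sketch of that second option, building the low-level client factory yourself and wrapping it (the host value is an illustrative assumption; the handler is the one from the question):
@Bean
public CachingConnectionFactory connectionFactory() {
    // Low-level RabbitMQ client factory with the custom exception handler attached
    com.rabbitmq.client.ConnectionFactory rabbitFactory = new com.rabbitmq.client.ConnectionFactory();
    rabbitFactory.setHost("localhost"); // assumption: local broker
    rabbitFactory.setExceptionHandler(new BrokerExceptionHandler(logger, instance));
    // Hand it to Spring AMQP's caching connection factory
    return new CachingConnectionFactory(rabbitFactory);
}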
EDIT
You are throwing a NullPointerException in your error handler...
2018-10-17 11:51:58.733 DEBUG 38975 --- [containerKpis-1] o.s.a.r.l.SimpleMessageListenerContainer : Consumer raised exception, processing can restart if the connection factory supports it
java.lang.NullPointerException: null
at com.test.BrokerExceptionHandler.handleError(BrokerExceptionHandler.java:27) ~[main/:na]
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.invokeErrorHandler(AbstractMessageListenerContainer.java:1243) ~[spring-rabbit-2.0.6.RELEASE.jar:2.0.6.RELEASE]
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.handleListenerException(AbstractMessageListenerContainer.java:1488) ~[spring-rabbit-2.0.6.RELEASE.jar:2.0.6.RELEASE]
at org.springframework.amqp.rabbit.listener.AbstractMessageListenerContainer.executeListener(AbstractMessageListenerContainer.java:1318) ~[spring-rabbit-2.0.6.RELEASE.jar:2.0.6.RELEASE]
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.doReceiveAndExecute(SimpleMessageListenerContainer.java:817) ~[spring-rabbit-2.0.6.RELEASE.jar:2.0.6.RELEASE]
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.receiveAndExecute(SimpleMessageListenerContainer.java:801) ~[spring-rabbit-2.0.6.RELEASE.jar:2.0.6.RELEASE]
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer.access$700(SimpleMessageListenerContainer.java:77) ~[spring-rabbit-2.0.6.RELEASE.jar:2.0.6.RELEASE]
at org.springframework.amqp.rabbit.listener.SimpleMessageListenerContainer$AsyncMessageProcessingConsumer.run(SimpleMessageListenerContainer.java:1042) ~[spring-rabbit-2.0.6.RELEASE.jar:2.0.6.RELEASE]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
2018-10-17 11:51:58.734 INFO 38975 --- [containerKpis-1] o.s.a.r.l.SimpleMessageListenerContainer : Restarting Consumer#1aabf50d: tags=[{amq.ctag-VxxHKiMsWI_w8DIooAsySA=myapp.mydomain.KPIS}], channel=Cached Rabbit Channel: AMQChannel(amqp://guest#127.0.0.1:5672/,1), conn: Proxy#b88a7d6 Shared Rabbit Connection: SimpleConnection#25dc64a [delegate=amqp://guest#127.0.0.1:5672/, localPort= 55662], acknowledgeMode=AUTO local queue size=0
To turn on DEBUG logging, add
logging.level.org.springframework.amqp=debug
to your application.properties.
this.helper is null because the error handler is not a Spring bean - @Autowired only works if Spring manages the object; you are using new BrokerExceptionHandler().
EDIT2
I added these 2 beans
@Bean
public BrokerExceptionHandler errorHandler() {
    return new BrokerExceptionHandler();
}

@Bean
public MessageConverter json() { // Boot auto-configures in template
    return new Jackson2JsonMessageConverter();
}
and now...
---> Before publishing Alert event
--- ALERT
2018-10-17 12:14:45.304 INFO 43359 --- [containerKpis-1] Helper : publishAlert
2018-10-17 12:14:45.321 DEBUG 43359 --- [containerKpis-1] o.s.a.r.c.CachingConnectionFactory : Creating cached Rabbit Channel from AMQChannel(amqp://guest#127.0.0.1:5672/,3)
2018-10-17 12:14:45.321 DEBUG 43359 --- [containerKpis-1] o.s.amqp.rabbit.core.RabbitTemplate : Executing callback RabbitTemplate$$Lambda$638/975724213 on RabbitMQ Channel: Cached Rabbit Channel: AMQChannel(amqp://guest#127.0.0.1:5672/,3), conn: Proxy#77f3f419 Shared Rabbit Connection: SimpleConnection#10c86af1 [delegate=amqp://guest#127.0.0.1:5672/, localPort= 56220]
2018-10-17 12:14:45.321 DEBUG 43359 --- [containerKpis-1] o.s.amqp.rabbit.core.RabbitTemplate : Publishing message (Body:'{"timestamp":1539792885303,"code":"ERROR","severity":"ERROR","message":"Exception detected. Message: Listener method 'kpisEvent' threw exception"}' MessageProperties [headers={sender=myapp, protocolVersion=1.0.0, senderType=MY_COMPONENT_1, __TypeId__=com.test.domain.Alert, timestamp=1539792885304}, contentType=application/json, contentEncoding=UTF-8, contentLength=146, deliveryMode=PERSISTENT, priority=0, deliveryTag=0])on exchange [myevent.ALERT], routingKey = [/]
--- ALERT 2
---> After publishing Alert event
2018-10-17 12:14:45.323 DEBUG 43359 --- [pool-1-thread-6] o.s.a.r.listener.BlockingQueueConsumer : Storing delivery for consumerTag: 'amq.ctag-eYbzZ09pCw3cjdtSprlZMQ' with deliveryTag: '1' in Consumer#4b790d86: tags=[{amq.ctag-eYbzZ09pCw3cjdtSprlZMQ=myapp.myevent.ALERT}], channel=Cached Rabbit Channel: AMQChannel(amqp://guest#127.0.0.1:5672/,2), conn: Proxy#77f3f419 Shared Rabbit Connection: SimpleConnection#10c86af1 [delegate=amqp://guest#127.0.0.1:5672/, localPort= 56220], acknowledgeMode=AUTO local queue size=0
2018-10-17 12:14:45.324 DEBUG 43359 --- [ontainerReset-1] o.s.a.r.listener.BlockingQueueConsumer : Received message: (Body:'{"timestamp":1539792885303,"code":"ERROR","severity":"ERROR","message":"Exception detected. Message: Listener method 'kpisEvent' threw exception"}' MessageProperties [headers={sender=myapp, protocolVersion=1.0.0, senderType=MY_COMPONENT_1, __TypeId__=com.test.domain.Alert, timestamp=1539792885304}, contentType=application/json, contentEncoding=UTF-8, contentLength=0, receivedDeliveryMode=PERSISTENT, priority=0, redelivered=false, receivedExchange=myevent.ALERT, receivedRoutingKey=/, deliveryTag=1, consumerTag=amq.ctag-eYbzZ09pCw3cjdtSprlZMQ, consumerQueue=myapp.myevent.ALERT])
2018-10-17 12:14:45.324 INFO 43359 --- [ontainerReset-1] Application : ---> kpisAlert RECEIVED
2018-10-17 12:14:45.325 ERROR 43359 --- [ontainerReset-1] Application : ---> Message: Exception detected. Message: Listener method 'kpisEvent' threw exception
2018-10-17 12:14:45.326 DEBUG 43359 --- [containerKpis-1] o.s.a.r.listener.BlockingQueueConsumer : Rejecting messages (requeue=false)
EDIT3
Or, if you prefer Gson...
@Bean
public MessageConverter json() {
    Gson gson = new GsonBuilder().create();
    return new MessageConverter() {

        @Override
        public Message toMessage(Object object, MessageProperties messageProperties) throws MessageConversionException {
            return new Message(gson.toJson(object).getBytes(), messageProperties);
        }

        @Override
        public Object fromMessage(Message message) throws MessageConversionException {
            throw new UnsupportedOperationException();
        }

    };
}
EDIT4
I changed the current version of your app as follows:
@Bean
public MessageConverter jsonConverter() {
    Gson gson = new GsonBuilder().create();
    EventKpisCollected collected = new EventKpisCollected();
    return new MessageConverter() {

        @Override
        public Message toMessage(Object object, MessageProperties messageProperties) throws MessageConversionException {
            System.out.println("toMessage");
            return new Message(gson.toJson(object).getBytes(), messageProperties);
        }

        @Override
        public Object fromMessage(Message message) throws MessageConversionException {
            System.out.println("fromMessage");
            return collected.decode(new String(message.getBody()));
        }

    };
}
...
@Bean
SimpleMessageListenerContainer containerKpis(ConnectionFactory connectionFactory,
        MessageListenerAdapter listenerKpisAdapter) {
    SimpleMessageListenerContainer container = new SimpleMessageListenerContainer();
    container.setConnectionFactory(connectionFactory);
    container.setDefaultRequeueRejected(false);
    container.setErrorHandler(errorHandler());
    container.setQueueNames(getQueueKpis());
    container.setMessageListener(listenerKpisAdapter);
    return container;
}

@Bean
SimpleMessageListenerContainer containerReset(ConnectionFactory connectionFactory,
        MessageListenerAdapter listenerAlertAdapter) {
    SimpleMessageListenerContainer container = new SimpleMessageListenerContainer();
    container.setConnectionFactory(connectionFactory);
    container.setDefaultRequeueRejected(false);
    container.setErrorHandler(errorHandler());
    container.setQueueNames(getQueueAlert());
    container.setMessageListener(listenerAlertAdapter);
    return container;
}
@Bean
MessageListenerAdapter listenerKpisAdapter(Application receiver) {
    MessageListenerAdapter messageListenerAdapter = new MessageListenerAdapter(receiver, "kpisEvent");
    messageListenerAdapter.setMessageConverter(jsonConverter());
    return messageListenerAdapter;
}

@Bean
MessageListenerAdapter listenerAlertAdapter(Application receiver) {
    MessageListenerAdapter messageListenerAdapter = new MessageListenerAdapter(receiver, "alertEvent");
    // messageListenerAdapter.setMessageConverter(jsonConverter()); // converter only handles events
    return messageListenerAdapter;
}
and
fromMessage
2018-10-19 13:46:53.734 INFO 10725 --- [containerKpis-1] Application : ---> kpisEvent RECEIVED
2018-10-19 13:46:53.734 INFO 10725 --- [containerKpis-1] Application : ---> kpisEvent DECODED, windowId: 1522751098000-1522752198000
With the event decoding done by the framework (just for the event currently - you will need a second converter for the alerts).

How to stop micro service with Spring Kafka Listener, when connection to Apache Kafka Server is lost?

I am currently implementing a micro service which reads data from an Apache Kafka topic. I am using "spring-boot, version: 1.5.6.RELEASE" for the micro service and "spring-kafka, version: 1.2.2.RELEASE" for the listener in the same micro service. This is my Kafka configuration:
@Bean
public Map<String, Object> consumerConfigs() {
    return new HashMap<String, Object>() {{
        put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, servers);
        put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        put(ConsumerConfig.GROUP_ID_CONFIG, groupIdConfig);
        put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, autoOffsetResetConfig);
    }};
}
@Bean
public ConsumerFactory<String, String> consumerFactory() {
    return new DefaultKafkaConsumerFactory<>(consumerConfigs());
}

@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory());
    return factory;
}
I have implemented the listener via the @KafkaListener annotation:
@KafkaListener(topics = "${kafka.dataSampleTopic}")
public void receive(ConsumerRecord<String, String> payload) {
    // business logic
    latch.countDown();
}
I need to be able to shut down the micro service when the listener loses connection to the Apache Kafka server.
When I kill the Kafka server, I get the following message in the Spring Boot log:
2017-11-01 19:58:15.721 INFO 16800 --- [ 0-C-1] o.a.k.c.c.internals.AbstractCoordinator : Marking the coordinator 192.168.0.4:9092 (id: 2145482646 rack: null) dead for group TestGroup
When I start the Kafka server, I get:
2017-11-01 20:01:37.748 INFO 16800 --- [ 0-C-1] o.a.k.c.c.internals.AbstractCoordinator : Discovered coordinator 192.168.0.4:9092 (id: 2145482646 rack: null) for group TestGroup.
So clearly the Spring Kafka listener in my micro service is able to detect when the Kafka server is up and running and when it's not. In the Confluent book Kafka: The Definitive Guide, in the chapter "But How Do We Exit?", it is said that the wakeup() method needs to be called on the Consumer so that a WakeupException is thrown. So I tried to capture the two events (Kafka server down and Kafka server up) with the @EventListener annotation, as described in the Spring for Apache Kafka documentation, and then call wakeup(). But the example in the documentation shows how to detect an idle consumer, which is not my case. Could someone please help me with this? Thanks in advance.
I don't know how to get a notification of the server down condition (in my experience, the consumer goes into a tight loop within the poll()).
However, if you figure that out, you can stop the listener container(s) which will wake up the consumer and exit the tight loop...
@Autowired
private KafkaListenerEndpointRegistry registry;
...
this.registry.stop();
2017-11-01 16:29:54.290 INFO 21217 --- [ad | so47062346] o.a.k.c.c.internals.AbstractCoordinator : Marking the coordinator localhost:9092 (id: 2147483647 rack: null) dead for group so47062346
2017-11-01 16:29:54.346 WARN 21217 --- [ntainer#0-0-C-1] org.apache.kafka.clients.NetworkClient : Connection to node 0 could not be established. Broker may not be available.
...
2017-11-01 16:30:00.643 WARN 21217 --- [ntainer#0-0-C-1] org.apache.kafka.clients.NetworkClient : Connection to node 0 could not be established. Broker may not be available.
2017-11-01 16:30:00.680 INFO 21217 --- [ntainer#0-0-C-1] essageListenerContainer$ListenerConsumer : Consumer stopped
You can improve the tight loop by adding reconnect.backoff.ms, but the poll() never exits so we can't emit an idle event.
spring:
  kafka:
    consumer:
      enable-auto-commit: false
      group-id: so47062346
      properties:
        reconnect.backoff.ms: 1000
I suppose you could enable idle events and use a timer to detect if you've received no data (or idle events) for some period of time, and then stop the container(s).
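A minimal sketch of that idea, assuming idle events are enabled via the container's idleEventInterval and scheduling is enabled with @EnableScheduling (the names and the 30-second threshold are illustrative assumptions, not part of the original answer):
@Autowired
private KafkaListenerEndpointRegistry registry;

// Refreshed by the @KafkaListener on every record, and below on idle events.
private volatile long lastActivity = System.currentTimeMillis();

@EventListener
public void onIdle(ListenerContainerIdleEvent event) {
    // Idle events still prove the consumer can reach the broker.
    lastActivity = System.currentTimeMillis();
}

@Scheduled(fixedDelay = 5000)
public void checkLiveness() {
    if (System.currentTimeMillis() - lastActivity > 30000) {
        // Neither data nor idle events for 30 s: assume the broker is gone and
        // stop the containers, which wakes the consumers out of the tight loop.
        this.registry.stop();
    }
}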

reconsume Kafka 0.10.1 topic in Spring

In Spring-Kafka I want to reconsume a Kafka topic from the beginning. Doing this by changing the group.id to something unknown to Kafka of course works:
@KafkaListener(topics = "sensordata.t")
public void receiveMessage(String message) {
    ...
}
@Bean
public Map<String, Object> consumerConfigs() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "NewGroupID");
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // it still commits though...
    return props;
}
However, starting over by setting the offset to 0 fails.
@KafkaListener(topicPartitions =
        { @TopicPartition(topic = "sensordata.t",
                partitionOffsets = @PartitionOffset(partition = "0", initialOffset = "0")) })
public void receiveMessage(String message) {
    ...
}
@Bean
public Map<String, Object> consumerConfigs() {
    Map<String, Object> props = new HashMap<>();
    ...
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "NewGroupID");
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
    props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "10000"); // making the timeout window larger seems to have no influence
    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "1");       // setting max records to 1 makes no difference
    return props;
}
The error I get:
2016-11-14 14:07:59.018 INFO 8165 --- [ main] c.i.t.s.server.SpringKafkaApplication : Started SpringKafkaApplication in 4.134 seconds (JVM running for 4.745)
2016-11-14 14:07:59.125 INFO 8165 --- [afka-consumer-1] o.a.k.c.c.internals.AbstractCoordinator : Discovered coordinator bto:9092 (id: 2147483647 rack: null) for group spring8.
2016-11-14 14:07:59.125 INFO 8165 --- [afka-consumer-1] o.a.k.c.c.internals.AbstractCoordinator : Discovered coordinator bto:9092 (id: 2147483647 rack: null) for group spring8.
2016-11-14 14:07:59.129 INFO 8165 --- [afka-consumer-1] o.a.k.c.c.internals.ConsumerCoordinator : Revoking previously assigned partitions [] for group spring8
2016-11-14 14:07:59.129 INFO 8165 --- [afka-consumer-1] o.s.k.l.KafkaMessageListenerContainer : partitions revoked:[]
2016-11-14 14:07:59.129 INFO 8165 --- [afka-consumer-1] o.a.k.c.c.internals.AbstractCoordinator : (Re-)joining group spring8
2016-11-14 14:07:59.338 ERROR 8165 --- [afka-consumer-1] essageListenerContainer$ListenerConsumer : Container exception
org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:600) ~[kafka-clients-0.10.0.1.jar:na]
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:541) ~[kafka-clients-0.10.0.1.jar:na]
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:679) ~[kafka-clients-0.10.0.1.jar:na]
at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:658) ~[kafka-clients-0.10.0.1.jar:na]
at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167) ~[kafka-clients-0.10.0.1.jar:na]
at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133) ~[kafka-clients-0.10.0.1.jar:na]
at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107) ~[kafka-clients-0.10.0.1.jar:na]
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:426) ~[kafka-clients-0.10.0.1.jar:na]
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:278) ~[kafka-clients-0.10.0.1.jar:na]
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360) ~[kafka-clients-0.10.0.1.jar:na]
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224) ~[kafka-clients-0.10.0.1.jar:na]
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192) ~[kafka-clients-0.10.0.1.jar:na]
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:163) ~[kafka-clients-0.10.0.1.jar:na]
at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.commitOffsetsSync(ConsumerCoordinator.java:426) ~[kafka-clients-0.10.0.1.jar:na]
at org.apache.kafka.clients.consumer.KafkaConsumer.commitSync(KafkaConsumer.java:1059) ~[kafka-clients-0.10.0.1.jar:na]
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.commitIfNecessary(KafkaMessageListenerContainer.java:939) ~[spring-kafka-1.1.1.RELEASE.jar:na]
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.processCommits(KafkaMessageListenerContainer.java:816) ~[spring-kafka-1.1.1.RELEASE.jar:na]
at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:526) ~[spring-kafka-1.1.1.RELEASE.jar:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_92]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_92]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92]
Anyone familiar with this?
I'm using Kafka 0.10.1.0 and
<properties>
    <java.version>1.8</java.version>
    <spring-kafka.version>1.1.1.RELEASE</spring-kafka.version>
</properties>
Why have you decided that the problem is offset 0?
Your stack trace says that the time between your poll() calls is longer than session.timeout.ms:
Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
How about adjusting them properly?
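For instance, since max.poll.records is already 1 in the configuration above, the remaining knob the error message points at is the session timeout (30 s here is an illustrative assumption):
// Allow more time between poll() calls before the coordinator
// declares this consumer dead and rebalances the group.
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000");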
