How to retry a kafka message when there is an error - spring cloud stream - spring

I'm pretty new to Kafka. I'm using spring cloud stream Kafka to produce and consume
#StreamListener(Sink.INPUT)
public void process(Order order) {
try {
// have my message processing
}
catch( exception e ) {
//retry here that record..
}
}
}
Just want to know how can I implement a retry ? Any help on this is highly appreciated

Hy
There are multiple ways to handle "retries" and it depends on the kind of events you encounter.
For basic issues kafka framework will retry for you to recover from an error condition, for example in case of a short network downtime the consumer and producer api implement auto retry.
In particular kafka support "built-in producer/consumer retries" to correctly handle a large variety of errors without loss of messages, but as a developer, you must still be able to handle other types of errors with the try-catch block you mention.
Error in kafka can be divided in the following categories:
(producer & consumer side) Nonretriable broker errors such as errors regarding message size, authorization errors, etc -> you must handle them in "design phase" of your app.
(producer side) Errors that occur before the message was sent to the broker—for example, serialization errors --> you must handle them in the runtime app execution
(producer & consumer sideErrors that occur when the producer exhausted all retry attempts or when the
available memory used by the producer is filled to the limit due to using all of it to store messages while retrying -> you should handle these errors.
Another point of attention regarding "how to retry" is how to handle correctly the order of commits in case of auto-commit option is set to false.
A common and simple pattern to get commit order right is to use a monotonically increasing sequence number. Increase the sequence number every time you commit and add the sequence number at the time of the commit to the commit function.
When you’re getting ready to send a retry, check if the
commit sequence number the callback got is equal to the instance
variable; if it is, there was no newer commit and it is safe to retry. If
the instance sequence number is higher, don’t retry because a
newer commit was already sent.

Related

How to invoke CommonContainerStoppingErrorHandler once retries are exhausted with Batch listener

I am using spring boot (version 2.7.1) with spring cloud stream kafka binder (2.8.5) for processing Kafka messages
I've functional style consumer that consumes messages in batches. Right now its retrying 10 times and commits the offset for errored records.
I want now to introduce the mechanism of retry for certain numbers (works using below error handler) then stop processing messages and fail entire batch messages without auto committing offset.
I read through the documents and understand that CommonContainerStoppingErrorHandler can be used for stopping the container from consuming messages.
My handler looks below now and its retries exponentially.
#Bean
public ListenerContainerCustomizer<AbstractMessageListenerContainer<String, Message>> errorHandler() {
return (container, destinationName, group) -> {
container.getContainerProperties().setAckMode(ContainerProperties.AckMode.BATCH);
ExponentialBackOffWithMaxRetries backOffWithMaxRetries = new ExponentialBackOffWithMaxRetries(2);
backOffWithMaxRetries.setInitialInterval(1);
backOffWithMaxRetries.setMultiplier(2.0);
backOffWithMaxRetries.setMaxInterval(5);
container.setCommonErrorHandler(new DefaultErrorHandler(backOffWithMaxRetries));
};
}
How do I chain CommonContainerStoppingErrorHandler along with above error handler, so the failed batch is not commited and replayed upon restart ?
with BatchListenerFailedException from consumer, it is possible to fail entire batch (including one or other valid records before any problematic record in that batch) ?
Add a custom recoverer to the error handler - see this answer for an example: How do you exit spring boot application programmatically when retries are exhausted, to prevent kafka offset commit
No; records before the failed one will have their offsets committed.

RabbitMQ operation basic.ack caused a channel exception precondition_failed: unknown delivery tag 3 - Golang [duplicate]

We have a PHP app that forwards messages from RabbitMQ to connected devices down a WebSocket connection (PHP AMQP pecl extension v1.7.1 & RabbitMQ 3.6.6).
Messages are consumed from an array of queues (1 per websocket connection), and are acknowledged by the consumer when we receive confirmation over the websocket that the message has been received (so we can requeue messages that are not delivered in an acceptable timeframe). This is done in a non-blocking fashion.
99% of the time, this works perfectly, but very occasionally we receive an error "RabbitMQ PRECONDITION_FAILED - unknown delivery tag ". This closes the channel. In my understanding, this exception is a result of one of the following conditions:
The message has already been acked or rejected.
An ack is attempted over a channel the message was not delivered on.
An ack is attempted after the message timeout (ttl) has expired.
We have implemented protections for each of the above cases but yet the problem continues.
I realise there are number of implementation details that could impact this, but at a conceptual level, are there any other failure cases that we have not considered and should be handling? or is there a better way of achieving the functionality described above?
"PRECONDITION_FAILED - unknown delivery tag" usually happens because of double ack-ing, ack-ing on wrong channels or ack-ing messages that should not be ack-ed.
So in same case you are tying to execute basic.ack two times or basic.ack using another channel
(Solution below)
Quoting Jan Grzegorowski from his blog:
If you are struggling with the 406 error message which is included in
title of this post you may be interested in reading the whole story.
Problem
I was using amqplib for conneting NodeJS based messages processor with
RabbitMQ broker. Everything seems to be working fine, but from time to
time 406 (PRECONDINTION-FAILED) message shows up in the log:
"Error: Channel closed by server: 406 (PRECONDITION-FAILED) with message "PRECONDITION_FAILED - unknown delivery tag 1"
Solution <--
Keeping things simple:
You have to ACK messages in same order as they arrive to your system
You can't ACK messages on a different channel than that they arrive on If you break any of these rules you will face 406
(PRECONDITION-FAILED) error message.
Original answer
It can happen if you set no-ack option of a Consumer to true that means you souldn't call ack function manually:
https://www.rabbitmq.com/amqp-0-9-1-reference.html#basic.consume.no-ack
The solution: set no-ack flag to false.
If you aknowledge twice the same message you can have this error.
A variation of what they said above about acking it twice:
there is an "obscure" situation where you are acking a message more than once, which is when you ack a message with multiple parameter set to true, which means all previous messages to the one you are trying to ack, will be acked too.
And so if you try to ack one of the messages that were "auto acked" by setting multiple to true then you would be trying to "ack" it multiple times and so the error, confusing but hope you understand it after a few reads.
Make sure you have the correct application.properties:
If you use the RabbitTemplate without any channel configuration, use "simple":
spring.rabbitmq.listener.simple.acknowledge-mode=manual
In this case, if you use "direct" instead of "simple", you will get the same error message. Another one looks like this:
spring.rabbitmq.listener.direct.acknowledge-mode=manual

does Spring allow to configure retry and recovery mechanism for KafkaTemplate send method?

Trying to build a list of possible errors that can potentially happen during the execution of kafkaTemplate.send() method:
Errors related serialization process;
Some network issues or broker is down;
Some technical issues on broker side, for example acknowledgement not received from broker, etc.
And now I need to find a way how to handle all possible errors in the right way:
Based on the business requirements: in case of any exceptions I need to do the following things:
Retry 3 times;
If all 3 retries failed - log appropriate message.
I found that configuration property spring.kafka.producer.retries available, and I believe it exactly what I need.
But have can I configure recovery method (method that will be executed when all retries failed)?
Probably that spring.kafka.producer.retries is not what you are looking for.
This auto-configuration property is mapped directly to ConsumerConfig:
map.from(this::getRetries).to(properties.in(ProducerConfig.RETRIES_CONFIG));
and then we go and read docs for that ProducerConfig.RETRIES_CONFIG property:
private static final String RETRIES_DOC = "Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error."
+ " Note that this retry is no different than if the client resent the record upon receiving the error."
+ " Allowing retries without setting <code>" + MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION + "</code> to 1 will potentially change the"
+ " ordering of records because if two batches are sent to a single partition, and the first fails and is retried but the second"
+ " succeeds, then the records in the second batch may appear first. Note additionally that produce requests will be"
+ " failed before the number of retries has been exhausted if the timeout configured by"
+ " <code>" + DELIVERY_TIMEOUT_MS_CONFIG + "</code> expires first before successful acknowledgement. Users should generally"
+ " prefer to leave this config unset and instead use <code>" + DELIVERY_TIMEOUT_MS_CONFIG + "</code> to control"
+ " retry behavior.";
As you see spring-retry is fully not involved in the process and all the retries are done directly inside Kafka Client and its KafkaProducer infrastructure.
Although this is not all. Pay attention to the KafkaProducer.send() contract:
Future<RecordMetadata> send(ProducerRecord<K, V> record);
It returns a Future. And if we take a look closer to the implementation, we will see that there is a synchronous part - topic metadata request and serialization, - and enqueuing for the batch for async sending to Kafka broker. The mentioned ProducerConfig.RETRIES_CONFIG has an effect only in that Sender.completeBatch().
I believe that the Future is completed with an error when those internal retries are exhausted. So, you probably should think about using a RetryTemplate manually in the service method around KafkaTemplate to be able to control a retry (and recovery, respectively) around metadata and serialization which are really sync and blocking in the current call. The actual send you also can control in that method with retry, but if you call Future.get() to block it for a response or error from Kafka client on send.

How to use messageKeyGenerator in StatefulRetryOperationsInterceptor

I am trying implement sample retry mechanism for RabbitMQ using Spring's StatefulRetryOperationsInterceptor.
As stated in documentation, I need to setup message key generator as the message id is absent. What I don't understand is the real usage of unique id generated per message. i.e. when I used below implementation I did not have any issue with retry:
StatefulRetryOperationsInterceptor interceptor =
RetryInterceptorBuilder.stateful()
.maxAttempts(3)
.backOffOptions(2000, 1, 2000)
.messageKeyGenerator(
new MessageKeyGenerator() {
#Override
public Object getKey(Message message) {
return 1;
}
);
container.setAdviceChain(new Advice[] {interceptor});
Stateful retry needs the originating message to be somehow unique - so the retry "state" for the message can be determined - the simplest way is to have the message publisher add a unique message id header.
However, it's possible something in your message body or some other header might be used to uniquely identify the message. Enter the MessageKeyGenerator which is used to determine the unique id.
Using a constant (1 in your case) won't work because every message has the same message key and will all be considered to be deliveries of the same message (from a retry perspective).
The framework does provide a MissingMessageIdAdvice which can provide limited support for stateful retry (if added to the advice chain before the retry advice). It adds a messageId to the incoming message.
"Limited" means full stateful retry support is not available - only one redelivery attempt is allowed.
If the redelivery fails, the message is rejected which causes it to be discarded or routed to the DLX/DLQ if so configured. In all cases the "temporary" state is removed from the cache.
Generally, if full retry support is needed and there is no messageId property and there is no way to generate a unique key with a MessageKeyGenerator, I would recommend using stateless retry.

Changing state of messages which are "in delivery"

In my application, I have a queue (HornetQ) set up on JBoss 7 AS.
I have used Spring batch to do some work once the messages is received (save values in database etc.) and then the consumer commits the JMS session.
Sometimes when there is an exception while processing the message, the excecution of consumer is aborted abruptly.
And the message remains in "in delivery" state. There are about 30 messages in this state on my production queue.
I have tried restarting the consumer but the state of these messages is not changed. The only way to remove these
messages from the queue is to restart the queue. But before doing that I want a way to read these messages so
that they can be corrected and sent to the queue again to be processed.
I have tried using QueueBrowser to read them but it does not work. I have searched a lot on Google but could not
find any way to read these messages.
I am using a Transacted session, where once the message is processed, I am calling:
session.commit();
This sends the acknowledgement.
I am implementing spring's
org.springframework.jms.listener.SessionAwareMessageListener
to recieve messages and then to process them.
While processing the messages, I am using spring batch to insert some data in database.
For a perticular case, it tries to insert data too big to be inserted in a column.
It throws an exception and transaction is aborted.
Now, I have fixed my producer and consumer not to have such data, so that this case should not happen again.
But my question is what about the 30 "in delivery" state messages that are in my production queue? I want to read them so that they can be corrected and sent to the queue again to be processed. Is there any way to read these messages? Once I know their content, I can restart the queue and submit them again (after correcting them).
Thanking you in anticipation,
Suvarna
It all depends on the Transaction mode you are using.
for instance if you use transactions:
// session here is a TX Session
MessageConsumer cons = session.createConsumer(someQueue);
session.start();
Message msg = consumer.receive...
session.rollback(); // this will make the messages to be redelivered
if you are using non TX:
// session here is auto-ack
MessageConsumer cons = session.createConsumer(someQueue);
session.start();
// this means the message is ACKed as we receive, doing autoACK
Message msg = consumer.receive...
//however the consumer here could have a buffer from the server...
// if you are not using the consumer any longer.. close it
consumer.close(); // this will release messages on the client buffer
Alternatively you could also set consumerWindowSize=0 on the connectionFactory.
This is on 2.2.5 but it never changed on following releases:
http://docs.jboss.org/hornetq/2.2.5.Final/user-manual/en/html/flow-control.html
I"m covering all the possibilities I could think of since you're not being specific on how you are consuming. If you provide me more detail then I will be able to tell you more:
You can indeed read your messages in the queue using jmx (with for example jconsole)
In Jboss As7 you can do it the following way :
MBeans>jboss.as>messaging>default>myJmsQueue>Operations
listMessagesAsJson
[edit]
Since 2.3.0 You have a dedicated method for this specific case :
listDeliveringMessages
See https://issues.jboss.org/browse/HORNETQ-763

Resources