As you know, in Masstransit, we can set up a UseMessageRetry. This method can help us to retry the message sending process once more.
This process continues as long as we hit errors.
The problem that I have encountered is that this setting is a singleton for every message.
What I seek to have is to set a different configuration for a particular message(courier).
the setting I have set for all messages is as follows:
config.UseMessageRetry(x =>
{
x.Incremental(100,
TimeSpan.FromSeconds(1),
TimeSpan.FromMilliseconds(100));
});
I would be thankful if somebody can help me out.
When configuring message retry at the bus configuration level, that retry policy applies to all receive endpoints. To configure a different policy for specific receive endpoints, you'd need to specify a different retry policy in the receive endpoint configuration.
A better way to configure the receive endpoints if you're using ConfigureEndpoints() is to register an IConfigureReceiveEndpoint type and check the queue name to see if it matches an execute or compensate queue name, and apply a different retry in that scenario.
Related
Im using Http Client Processor app in one of my stream which is a standard app of SCDF.The puspose of this app is to make http calls against provided URL with a message payload. I tried to enable the retry mechanism of this app by keeping the boolean httpclient.retry.enabled to 'true'.
But when I do that it try to repost the message against the http end point even if the first attempt is succecssfull. It looks like it is working with the concept of 'write at least once'. The problem with this approach is, it creates duplicates in the target system.
Is there a way we can configure it for 'write just once if the call is successful else, retry'. If not can we expect a fix from Spring ?
The Http Client Processor is no longer supported. I recommend upgrading to HttpRequestProcessor. This uses the common retry mechanism included in the messaging binder. The behavior is as you describe, the request will be retried only if the consumer fails to acknowledge the message. With at least once guarantees, you still have the potential for duplicates.
I have a spring service subcribing for messages from a topic in the google cloud pubsub (pulling). It is working correctly in general. But I want to have more control over resent messages. My service need sometimes to nack the message or just let the ackDeadline pass so that I would get the message later on again. While testing with single messages, the nacked message comes back to me almost immidetaly, and the ones I don't ack or nack at all, come back after 10 sec default for ackDeadline. I would like it to postpone the repeated consuming of these messages. I thought the retry setting are designed for such cases.
I should mention as well that I am currently testing locally with an emulator and create the subscription from code. I am using the PubSubAdmin for managing.
According to this docu I have tried to set those configuration in my profile config. like this:
spring.cloud.gcp.pubsub.subscriber.retry.initial-retry-delay-second: 4
spring.cloud.gcp.pubsub.subscriber.retry.max-attempts: 5
spring.cloud.gcp.pubsub.subscriber.retry.initial-rpc-timeout-seconds: 4
spring.cloud.gcp.pubsub.subscriber.retry.max-rpc-timeout-seconds: 8
spring.cloud.gcp.pubsub.subscriber.retry.max-retry-delay-seconds: 7
spring.cloud.gcp.pubsub.subscriber.retry.total-timeout-seconds: 3000
but it had no effect on the time of reoccuring of the messages.
Do I understand the meaning of retry settings wrongly? maybe they only take effect if there are some connection problems but not in nacking or lacking of acknowledgment cases? Or do I have to set the setting while using deploymentManager for creating the subscriptions and am not allowed to set them from the code? Or maybe setting them in (development) profile configs won't work with the PubSubAdmin?
Thanks for any suggestions!
edit: I want the first retry to happen after 5 seconds, but next retry 10 seconds later, etc. Plus I want to set the max retry number. So what I am not interested in is setting the ackDeadline just to a bigger number.
edit2: why nacking: one of the services (let's call it a bridge) is subscribing for the messages, has to validate each message and if ok pass it to another external system. this service is acting as a bridge for this system, as we can't work on this second system directly. in some cases the message need some extra information, so the bridge will try to fetch it somewhere else (there are a lot of microservices included) and it happens sometimes, that at this moment in time the extra information is not there (yet). So the first idea was to not ack the message and let it come later again. but I don't want to ask every 10 sec for the next 7 days (with ackDeadline), I want to just try few times, and if it is not there after 2 hours, it will never came. so we tried to nack and hoped those retry settings can help to manage the resending. But as they don't, I suppose the only way to go will be to build something for managing these messages in the bridge by myself. Maybe store message ids and the number of retry so that I can ack after for example 5 times and push the message to another topic to deal with it differently. Or are there any better solutions known?
Cloud Pub/Sub does not provide exponential backoff for specific messages. A nack has no effect other than to tell Cloud Pub/Sub that you were not able to handle the message.
I could provide a more useful answer if you were to document why you needed to nack the messages. If you are unable to handle the current load, you can use the flow control options described here to reduce the number of outstanding messages or bytes to your client. If you have messages that are known to be bad, you should instead ack them after pushing to another dead letter topic to be handled separately.
Response to edit 2:
If you have this scenario where the action to supplement the messages can fail, implement whatever backoff mechanism you want on that action yourself in your service. Set the max ack extension period when constructing your subscriber (setMaxAckExtensionPeriod in java) to ensure that your client will extend the ack deadline for each message long enough for your chain of retries.
Edit 2
Note that Pub/Sub now has built in support for Dead Lettering.
You can use PubSubSubscriberTemplate.modifyAckDeadline() to programmatically extend the deadlines of a batch of messages retrieved through pull. Each individual AcknowledgeablePubsubMessage also has a modifyAckDeadline() method, if you only need to extend deadline for a select few stragglers.
If all messages on that particular subscription need to have a longer acknowledgement period, a default can be set in GCP Console by editing the subscription and updating the "Acknowledgement Deadline" field.
I am looking for a way for each consumer instance to receive a message that is published to RabbitMQ via MassTransit. The scenario would be, we have multiple microservices that need to invalidate a cache on notification. Pub-Sub won't work in this instance as there will be 5 consumers of the same type as its the same code per service instance, so only one would receive the message in a traditional PubSub.
Message observation could be an option but this means the messages would never be consumed and hang around forever on the bus.
Can anyone suggest a pattern to use in the context of MassTransit?
Thanks in advance.
You should create a management endpoint in each service, which could even be a temporary queue (just request a receive endpoint without a queue name and one will be dynamically generated). Then, put your queue invalidation consumers on that endpoint. Each service instance will receive a unique instance of the message (when Publish is called), and those queues and bindings will automatically be removed once the service exits.
This is exactly how the bus endpoint works, but in your case, you're creating a receive endpoint which can have consumer message type bindings, so that published messages are received, one copy per service.
cfg.ReceiveEndpoint(cfg => { ... });
Note that the queue name is not specified, and will be automatically generated uniquely.
The retries parameter as it states
Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error.
What I want to confirm is are the retries made automatically by the kafka framework or any additional handling is required from the client side.
And if it's done automatically, and say retries is set to 1. And if the sending of record fails on second attempt also, then will kafka inform me differently with appropriate error messages when it's going to retry and when it's going to stop retrying.
Yes you are right, the retries are made automatically by the Kafka client without the need for any additional handling by the user application.
The client library doesn't inform you when it's going to retry but it follows your configuration with the retry.backoff.ms parameter which defines "The amount of time to wait before attempting to retry a failed request to a given topic partition" as described in the official documentation.
When all the attempts fail then you will receive an exception in the callback passed to the KafkaProducer send method (with RecordMetadata null because no record was sent) or through the Future returned by send.
I has some strange behaviour on production deployment for azure queue messages:
Some of the messages in queues appears with big delay - minutes, and sometimes 10 minutes.
Befere you ask about setting delayTimeout when we put message to queue - we do not set delayTimeout for that message, so message should appear almost immedeatly after it was placed in queue.
At that moments we do not have a big load. So my instances has no work load, and able to process message fast, but they just don't appear.
Our service process millions of messages per month, we able to identify that 10-50 messages processed with very big delay, by that we fail SLA in front of our customers.
Does anyone have any idea what can be reason?
How to overcome?
Did anyone faced similar issues?
Some general ideas for troubleshooting:
Are you certain that the message was queued up for processing - ie the queue.addmessage operation returned successfully and then you are waiting 10 minutes - meaning you can rule out any client side retry policies etc as being the cause of the problem.
Is there any chance that the time calculation could be subject to some kind of clock skew problems. eg - if one of the worker roles pulling messages has its close out of sync with the other worker roles you could see this.
Is it possible that in the situations where the message is appearing to be delayed that a worker role responsible for pulling the messages is actually failing or crashing. If the client calls GetMessage but does not respond with an appropriate acknowledgement within the time specified by the invisibilityTimeout setting then the message will become visible again as the Queue Service assumes the client did not process the message. You could tell if this was a contributing factor by looking at the dequeue count on these messages that are taking longer. More information can be found here: http://msdn.microsoft.com/en-us/library/dd179474.aspx.
Is it possible that the number of workers you have pulling items from the queue is insufficient at certain times of the day and the delays are simply caused by the queue being populated faster than you can pull messages from the queue.
Have you enabled logging for queues and then looked to see if you can find the specific operations (look at e2elatency and serverlatency).
http://blogs.msdn.com/b/windowsazurestorage/archive/tags/analytics+2d00+logging+_2600_amp_3b00_+metrics/. You should also enable client logging and try to determine if the client is having connectivity problems and the retry logic is possibly kicking in.
And finally if none of these appear to help can you please send me the server logs (and ideally the client side logs as well) along with your account information (no passwords) to JAHOGG at Microsoft dot com.
Jason
Azure Service bus has a property in the BrokeredMessage class called ScheduledEnqueueTimeUtc, it allows you to set a time for when the message is added to the queue (effectively creating a delay).
Are you sure that in your code your not setting this property, and this might be the cause for the delay?
You can find more info on this at this url: https://www.amido.com/azure-service-bus-how-to-delay-a-message-being-sent-to-the-queue/
If you are using WebJobs to process messages from the queue, it can be due to WebJobs configuration.
From an MSDN forum post by pranav rastogi:
Starting with 0.4.0-beta, the (WebJobs) SDK implements a random exponential back-off algorithm. As a result of this if there are no messages on the queue, the SDK will back off and start polling less frequently.
The following setting allows you to configure this behavior.
MaxPollingInterval for when a queue remains empty, the longest period of time to wait before checking for a message to. Default is 10min.
static void Main()
{
JobHostConfiguration config = new JobHostConfiguration();
config.Queues.MaxPollingInterval = TimeSpan.FromMinutes(1);
JobHost host = new JobHost(config);
host.RunAndBlock();
}