Spring XD recovery semantics


What are the recovery semantics for Spring XD? I have gone through all the publicly available resources, but I still have not been able to find a definite answer.
To keep my question simple:
a) What happens when a Source fails?
b) What happens when a Processor fails?
c) What happens when a Sink fails?

It depends what you mean by "fails". If you mean "fails to process a message", then for processors and sinks the message bus can be configured with various retry options. With Rabbit, permanent failures (after retries are exhausted) can be sent to a dead letter exchange/queue or, with Redis, to an ERROR list. See Error Handling (Message Delivery Failures).
For sources, it depends on the source and the nature of the failure; in general, the error will be reported to the sender in some fashion. However, for some sources (e.g. rabbit, jms), the message may be redelivered.
For all modules, if the container fails, the admin will redeploy its modules if there's another container available.
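
To make the retry-then-dead-letter idea concrete, here is a minimal in-memory sketch in Python. It is not Spring XD's implementation or configuration: the names deliver_with_retry, flaky_handler and max_attempts are made up for illustration, and a plain list stands in for the dead letter queue (or the Redis ERROR list).

```python
def deliver_with_retry(messages, handler, max_attempts=3):
    """Try each message up to max_attempts times; park permanent failures."""
    processed = []
    dead_letters = []  # stands in for a dead letter queue or ERROR list
    for message in messages:
        for attempt in range(1, max_attempts + 1):
            try:
                processed.append(handler(message))
                break  # success: stop retrying this message
            except Exception as error:
                if attempt == max_attempts:
                    # Retries exhausted: keep the message and the reason.
                    dead_letters.append((message, str(error)))
    return processed, dead_letters


def flaky_handler(message):
    """Hypothetical processor that always rejects the payload 'bad'."""
    if message == "bad":
        raise ValueError("cannot process")
    return message.upper()


processed, dead_letters = deliver_with_retry(["a", "bad", "b"], flaky_handler)
print(processed)     # ['A', 'B']
print(dead_letters)  # [('bad', 'cannot process')]
```

The point of the sketch is only that a message that keeps failing ends up parked somewhere inspectable instead of blocking the stream or being silently dropped.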

Related

MassTransit consumers didn't acknowledge some messages

I have a question about some strange consumer behaviour.
Recently we had a strange situation in our production environment. Two consumers in two different microservices got stuck on some messages. The first was holding 20 messages from a RabbitMQ queue and the second was holding 2, and neither was processing them. These messages were visible as Unacked in RabbitMQ for two days. They went back to the Ready state only when the two microservices were restarted. At the time the consumers took these messages, the whole system was processing thousands of messages per hour, so our Saga and all the other consumers were working. Once the messages went back to the Ready state they were processed within a second, so I don't think the problem is with the messages themselves.
The messages are published by the Saga to an exchange. Besides the two stuck consumers, we also have an EventLogger consumer subscribed to all messages, and it processed these 22 messages normally, without any problems (from its own queue). We have also connected Application Insights to the consumers, and it contains no record of these two consumers receiving the 22 messages (there are records of the EventLogger receiving them).
The other day we had the same issue with one message in the test environment.
We recently updated MassTransit in our project from version 6.2.0 to 7.1.6. We didn't notice any similar issues with consumers before that, but maybe it's just a coincidence. We also use the retry, redelivery, circuit breaker and in-memory outbox mechanisms, but I don't think they are the problem, because the consumers didn't even start to process these 22 messages.
Do you have any suggestions as to what could have happened to these consumers?

Usually when a consumer doesn't even start to consume the message once it has been delivered to MassTransit by RabbitMQ, it could be an issue resolving the consumer from the container, such as a dependency to another backing service (database, log server, file, network connection, device, etc.).
The message remains unacknowledged on the broker because the transport/delivery mechanism to the consumer is waiting for a resource to become available. If there isn't anything in the logs for that time period indicating an issue with a resource, it's hard to know what could have blocked those messages from being consumed. The fact that they were ultimately consumed once the services were restarted seems to indicate the message content itself was fine.
Monitoring the lack of message consumption (and likely an associated queue depth increase) would give an indication that the situation has occurred. I'd also increase the logging detail levels so that, if the issue occurs again, its cause can be identified.

RabbitMQ/Spring AMQP - Leave message in a queue

I created a Spring Boot/Spring AMQP project where I configured a listener on a RabbitMQ queue. Question: is there any way to leave the message in the queue? Let me explain: I consume the message and do some things (e.g. save to the DB); if something goes wrong, I would like to be able to consume the message again.
Thanks in advance

You should configure your listener container with transactions, so that when the DB call fails, the transaction is rolled back and the AMQP message is not acked on RabbitMQ.
See docs for more info: https://docs.spring.io/spring-amqp/docs/current/reference/html/#transactions

I don't know about the "Spring" way of accomplishing this, but what you describe is the normal behavior for AMQP consumers that do not automatically acknowledge.
From the documentation:
In automatic acknowledgement mode, a message is considered to be successfully delivered immediately after it is sent.
When you turn off automatic acknowledgment, your consumer must explicitly acknowledge the message, otherwise it will not be dequeued (or as you put it, it will be left "in the queue"). You will then need to simply ACK the message at the very end of your operation, when you are certain that your operation succeeded (and perhaps coordinated with your database transaction).
There is always the question of what to do first: acknowledge first, or commit your database transaction first? Without adding complexity, you must choose based on which failure mode is less problematic for you, i.e. would you rather tolerate a duplicated message or a missing message?
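
The trade-off between the two orderings can be shown with a small simulation. This is a sketch in plain Python, not Spring AMQP or RabbitMQ client code: run_consumer, ack_first and crash_between_steps are hypothetical names, a list stands in for the queue, and another list stands in for the database.

```python
def run_consumer(queue, ack_first, crash_between_steps):
    """Process one message; return (rows committed to db, messages left in queue)."""
    db = []
    message = queue[0]  # delivered but not yet acknowledged
    steps = ["ack", "commit"] if ack_first else ["commit", "ack"]
    for index, step in enumerate(steps):
        if step == "ack":
            queue.pop(0)        # broker removes the message from the queue
        else:
            db.append(message)  # database transaction commits
        if crash_between_steps and index == 0:
            break               # consumer dies after the first step
    return db, queue


# Ack first, then crash: nothing in the db, nothing in the queue (message lost).
print(run_consumer(["order-1"], ack_first=True, crash_between_steps=True))
# Commit first, then crash: row is saved and the message is still queued,
# so it will be delivered again (duplicate).
print(run_consumer(["order-1"], ack_first=False, crash_between_steps=True))
```

If you choose commit-then-ack, the consumer has to tolerate seeing the same message twice, for example by making the database write idempotent.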

Is AMQP's DistributionMode analogous to autoacknowledge in Tibco?

We are migrating from Tibco to ActiveMQ Artemis. Tibco offers several ack settings, but we haven't found anything directly comparable in Artemis. We are using the amqpnetlite .NET library to interface with Artemis, and our code sets DistributionMode to either move or copy based on a boolean configuration flag that we call UseAutoAcknowledge. I haven't found much documentation about DistributionMode apart from this, which isn't very clear - http://docs.oasis-open.org/amqp/core/v1.0/amqp-core-messaging-v1.0.html.
My question is: if DistributionMode is set to move, does Artemis send an acknowledgement to the client, and does it not send one when it is set to copy?

I can't speak to Tibco but I can try to explain AMQP DistributionMode. Essentially the DistributionMode is a setting as to the behaviour of the receiver - a receiver with a move mode is expecting the messages to be sent only to it, not to other receivers - this is the normal behaviour of a consumer on a queue. A receiver with a copy mode is expecting other receivers to also receive the message (like a queue browser, or - sort of - like a subscriber to a topic). In a traditional Client-Broker topology, the DistributionMode is only really interesting when receiving messages from the Broker, and is unlikely to have effect when sending messages to the Broker.
Acknowledgement is separate from the DistributionMode. AMQP has the concept of Disposition which is similar to but not the same as Acknowledgement. Disposition is ultimately the action that the sender will apply at the completion of the message transfer (and so interacts with DistributionMode for messages sent by the Broker). Conceptually for each message transfer a Broker might decide that the transfer has completed successfully; that it has failed - but in a way that retrying might succeed; that it has failed in a way that will not succeed on retry; or some other more subtle outcome. Here the behaviour at the Broker is probably different depending upon whether the DistributionMode was move or copy (the specification left this vague to allow flexibility in implementations). If the receiver is asking for messages to be moved, and it declares that the transfer was unsuccessful, a broker is likely to make that message available for all competing consumers. If the receiver was asking for copy, then it never held an exclusive lock on the message, and so the choice is only whether to retry sending the copy to that same consumer.
Perhaps the simplest thing here is if you can describe the behaviour that you desire, and experts on Apache Artemis can weigh in on if/how that can be achieved.

Two consumers on same Websphere MQ JMS Queue, both receiving same message

I am working with someone who is trying to achieve a load-balancing behavior using JMS Queues with IBM Websphere MQ. As such, they have multiple Camel JMS consumers configured to read from the same Queue. Although this behavior is undefined according to the JMS spec (last time I looked anyway), they expect a sort of round-robin / load-balancing behavior. And, while the spec leaves this undefined, I'm led to believe that the normal behavior of Websphere MQ is to deliver the message to only one of the consumers, and that it may do some type of load-balancing. See here, for example: When multi MessageConsumer connect to same queue(Websphere MQ),how to load balance message-consumer?
But in this particular case, it appears that both consumers are receiving the same message.
Can anyone who is more of an expert with Websphere MQ shed any light on this? Is there any situation where this behavior is expected? Is there any configuration change that can alleviate this?
I'm leaning towards telling everyone here to use the native Websphere MQ clustering facility and go away from having multiple consumers pointing at the same Queue, but that will be a big change for them, so I'd love to discover a way to make this work.
Not that I'm a fan of relying on anything that's undefined, but if they're willing to rely on IBM specific behavior, I'll leave that up to them.

The only ways for both of them to receive the same message are:
1. There are multiple copies of the message.
2. The apps are browsing the message without a lock, then circling back to delete it.
3. The apps are backing out a transaction and making the message available again.
4. The connection is severed before the app acknowledges the message.
Having multiple apps compete for messages in a queue is a recommended practice. If one app goes down the queue is still served. In a cluster this is crucial because the cluster will continue to direct messages to the un-served queue instance until it fills up.
If it's a Dev system, install SupportPac MA0W and tell it to trace just that one queue and you will be able to see exactly what is happening.
See section 4.4 of the JMS spec. The provider must never deliver a second copy of an acknowledged message. An exception is made for session handling in 4.4.13, which I cover in #4 above. That's pretty unambiguous and part of the official spec, so it is not an IBM-specific behavior.
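
As an illustration of the third cause in the list above (a transaction being backed out), here is a toy in-memory queue in Python. It is only a sketch of the general idea and does not use the Websphere MQ or JMS APIs; ToyQueue and its get, commit and rollback methods are hypothetical names.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a queue with transactional gets."""

    def __init__(self, messages):
        self.ready = deque(messages)
        self.in_flight = {}  # consumer name -> message held under a transaction

    def get(self, consumer):
        message = self.ready.popleft()
        self.in_flight[consumer] = message  # invisible to other consumers
        return message

    def commit(self, consumer):
        del self.in_flight[consumer]  # message is gone for good

    def rollback(self, consumer):
        # Backing out puts the message at the front of the queue again.
        self.ready.appendleft(self.in_flight.pop(consumer))


queue = ToyQueue(["m1", "m2"])
seen = {"A": [], "B": []}

seen["A"].append(queue.get("A"))  # A receives m1
queue.rollback("A")               # A backs out; m1 is ready again
seen["B"].append(queue.get("B"))  # B now receives the same m1
queue.commit("B")
seen["A"].append(queue.get("A"))  # A moves on to m2
queue.commit("A")

print(seen)  # {'A': ['m1', 'm2'], 'B': ['m1']}
```

Both consumers saw m1, yet the queue never delivered a second copy of an acknowledged message: A's delivery was never committed.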

Message Queues: Are messages lost on network failure?

I am wondering about the reliability of message delivery in messaging systems such as WebsphereMQ or ActiveMQ (used via JMS). As far as I know, messages can be buffered if the recipient is unavailable and will be delivered later.
Now I am wondering what happens if the sender temporarily cannot reach the network. Is there some kind of local buffering which will send the messages later? I assume this depends on where the message broker is running. Are there local brokers on all machines or just a central one?
To pinpoint my question: is a messaging system the right choice if I need to ensure that messages are received eventually, even in the face of temporary network failure? Is a certain setup required to achieve this reliability?
Any pointers to relevant documentation would be appreciated.

The common solution is called "store and forward". In such systems, once you've handed off the message to the local message agent, it becomes the agent's responsibility. This agent might not be a full broker. If the messaging system has basic delivery guarantees, the local agent will still need persistent buffering of messages until they're handed off to a real broker.

If you really can't afford to lose messages I'd recommend implementing a reliable messaging pattern at the endpoints if you can, i.e. the sender re-sends if no acknowledgement is received within a certain time period and the receiver has duplicate detection to cope with receiving the same message more than once.
Guaranteed delivery comes with a performance overhead and usually doesn't give any guarantee as to how long your message might take to get there.
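
A minimal sketch of that endpoint pattern (resend until acknowledged, plus duplicate detection on the receiver) in Python is shown below. Everything here is hypothetical and in-memory: Receiver, send_reliably and the ack_lost_on_attempts parameter are names invented for the example, and lost acknowledgements are simulated rather than caused by a real network.

```python
class Receiver:
    """Applies each message id once, but acknowledges every delivery."""

    def __init__(self):
        self.seen_ids = set()
        self.applied = []

    def receive(self, message_id, payload):
        if message_id not in self.seen_ids:
            self.seen_ids.add(message_id)
            self.applied.append(payload)
        return True  # ack, including for duplicates


def send_reliably(receiver, message_id, payload, ack_lost_on_attempts, max_attempts=5):
    """Resend until an ack arrives; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        ack = receiver.receive(message_id, payload)
        if ack and attempt not in ack_lost_on_attempts:
            return attempt
        # A real sender would wait for a timeout here before resending.
    raise RuntimeError("no acknowledgement received")


receiver = Receiver()
attempts = send_reliably(receiver, "msg-1", "debit 10", ack_lost_on_attempts={1, 2})
print(attempts, receiver.applied)  # 3 ['debit 10']
```

The message reached the receiver three times but was applied once, which is why the duplicate check has to be keyed on a stable message id chosen by the sender.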
