GCP PubSub Spring Boot repeat extract message - spring-boot

I need help with a problem with GCP Pub/Sub. I have a process that sends 100 messages with filters to Pub/Sub, and another application (in Spring Boot) receives these messages. When the Spring Boot application receives messages from Pub/Sub (not pull), it processes the 100 messages, but while processing it receives more messages. The number varies between runs: sometimes it receives 120, sometimes 140, and other times more than 200. I haven't found any solution for this. This is my code:
@Bean
@ServiceActivator(inputChannel = "pubsubInputChannel")
public MessageHandler messageReceiver() {
    return message -> {
        System.out.println("Message arrived! Payload: " + new String((byte[]) message.getPayload()));
        //other process of app (call other api)
        AckReplyConsumer consumer = (AckReplyConsumer) message.getHeaders().get(GcpPubSubHeaders.ACKNOWLEDGEMENT);
        consumer.ack();
    };
}
Please help me!

Duplicate messages can happen for different reasons in Google Cloud Pub/Sub. One thing to keep in mind is that Cloud Pub/Sub offers at-least-once delivery, meaning that some amount of duplicates is always possible, so your application must be resilient to them. That many duplicates does seem a bit high, though. In general, duplicates can happen for the following reasons:
Messages are being sent by the publisher more than once. This can happen if the publisher got disconnected from Cloud Pub/Sub and sent the same message again. If this type of duplication occurs, then the messages will have different message IDs.
The subscriber is taking too long to acknowledge messages. In your code, you have //other process of app (call other api). How long does this process take? If it is longer than the deadline for acknowledging the message, then the message will be redelivered. Keep in mind that if this other process requires locks to be grabbed for all messages, there could be a contention issue with too many requests trying to get those locks at the same time, resulting in processing delays. By default, the ack deadline for a message is ten seconds. When using the Java client library, the deadline is automatically extended for up to the maxAckExtensionPeriod, which defaults to one hour. This property can be set in the DefaultSubscriberFactory for Spring as well.
Messages are not acked at all. If an exception prevents the call to ack or there is deadlock resulting in that line of code never being reached, then the message will be redelivered.
The use case is one of a large backlog of small messages. In this situation, buffers are prone to fill up in the client in a way that results in redelivery of messages.
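Because at-least-once delivery makes some duplicates unavoidable, one way to make the subscriber resilient is to remember which message IDs have already been handled and skip repeats. The following is only a minimal sketch under stated assumptions: `MessageDeduplicator` and `firstDelivery` are hypothetical names (not part of any Pub/Sub or Spring library), the set lives in memory and grows without bound, and several application instances would need a shared store instead. It only catches redeliveries of the same message; duplicates published twice arrive with different message IDs and need a business key.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper for illustration; not part of any Pub/Sub client library.
class MessageDeduplicator {
    // Thread-safe set of message IDs that have already been accepted.
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    /** Returns true only the first time a given message ID is offered. */
    boolean firstDelivery(String messageId) {
        return seen.add(messageId);
    }
}
```

In a handler like the one in the question, the idea would be to run the expensive processing only when `firstDelivery(...)` returns true, and to ack the message in both cases.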

Related

Circuit breaker for asynchronous microservices..?

There is an ActiveMQ queue (QueueA). A service (MyService) subscribes to its messages, processes them, and sends them to another ActiveMQ queue (QueueB).
QueueA -> MyService -> QueueB
Consider a scenario where thousands of messages are in QueueA. At the same time, QueueB is down. I want to stop processing if a certain number of messages (say, 100) consecutively fail while being sent to QueueB. It should test over a rolling window in a certain time period (say, 100 consecutive messages failed in 60 seconds) and stop consuming from QueueA. It should then test whether the service is up after 15 minutes or so by sending one more message. If it still fails, it should again stop consuming from QueueA for another 15 minutes.
Right now, what is happening is that all the messages are erroring out and we have to reprocess every message again. There is a recovery mechanism, but it is getting overloaded because of the limitations of the current architecture.
Is there any pattern for this? Is it the same circuit breaker pattern (I am aware of it in a synchronous context)? If so, I am not sure whether there is a solution in Java / Spring Boot / Apache Camel, which is the technology stack we are currently on. Any guidelines for the pattern will also help, even if you do not know this specific technology platform.
I have also read the following question on Stack Overflow.
Is circuit breaker pattern applicable for asynchronous requests also?
Thanks and appreciate your time in helping me with this.
Have a look at the Camel RoutePolicy of type ThrottlingExceptionRoutePolicy, which is based on the CircuitBreakerLoadBalancer.
Using this policy should allow you to stop consuming from the endpoint when the circuit is in the open state (compare this with the standard circuit breaker behaviour: bypass the service call and fall back to another response).
@Bean
public ThrottlingExceptionRoutePolicy myCustomPolicy() {
    // Important: do not open circuit for this kind of exceptions
    List<Class<?>> handledExceptions = Arrays.asList(MyException.class);
    return new ThrottlingExceptionRoutePolicy(failureThreshold, failureWindow, halfOpenAfter, handledExceptions);
}

from("jms:queue:QueueA")
    .routePolicy(myCustomPolicy)
    .to("mock:MyService");
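To make the closed / open / half-open behaviour concrete, here is a small stand-alone sketch of the state machine such a policy relies on. It is not Camel's implementation: `ConsumerCircuitBreaker` and its methods are hypothetical names, the clock is injected so the logic can be exercised without waiting, and the three parameters merely mirror the `failureThreshold`, `failureWindow` and `halfOpenAfter` arguments above (assumed here to be milliseconds).

```java
import java.util.function.LongSupplier;

// Hypothetical illustration of the pattern; not Camel's ThrottlingExceptionRoutePolicy.
class ConsumerCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long failureWindowMillis;
    private final long halfOpenAfterMillis;
    private final LongSupplier clock; // supplies "now" in milliseconds

    private State state = State.CLOSED;
    private int failures = 0;
    private long firstFailureAt = 0;
    private long openedAt = 0;

    ConsumerCircuitBreaker(int failureThreshold, long failureWindowMillis,
                           long halfOpenAfterMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.failureWindowMillis = failureWindowMillis;
        this.halfOpenAfterMillis = halfOpenAfterMillis;
        this.clock = clock;
    }

    /** True if the consumer may take the next message from the source queue. */
    synchronized boolean allowConsume() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= halfOpenAfterMillis) {
            state = State.HALF_OPEN; // waited long enough: allow a probe message
        }
        return state != State.OPEN;
    }

    /** A successful send closes the circuit and resets the failure count. */
    synchronized void onSuccess() {
        failures = 0;
        state = State.CLOSED;
    }

    /** A failed send counts toward the threshold; a failed probe reopens at once. */
    synchronized void onFailure() {
        long now = clock.getAsLong();
        if (state == State.HALF_OPEN) {
            open(now);
            return;
        }
        if (failures == 0 || now - firstFailureAt > failureWindowMillis) {
            failures = 0;          // start a new failure window
            firstFailureAt = now;
        }
        failures++;
        if (failures >= failureThreshold) {
            open(now);
        }
    }

    private void open(long now) {
        state = State.OPEN;
        openedAt = now;
        failures = 0;
    }

    synchronized State state() {
        return state;
    }
}
```

With the numbers from the question this would be constructed with a threshold of 100, a window of 60 seconds and a half-open delay of 15 minutes; the consumer would call `allowConsume()` before reading from QueueA and report each send to QueueB through `onSuccess()` or `onFailure()`.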

How does Mass Transit handle retries, deduplication, and message ID generation when using the in-memory outbox

Mass Transit has an in-memory "outbox" implementation that I think will handle the majority of the concerns / challenges I am looking to overcome; however, I cannot find much documentation that describes its capabilities in the detail I am looking for. A lot of these questions came about after watching a video where Udi Dahan explains how to handle reliable messaging without distributed transactions (https://vimeo.com/111998645).
Does the in-memory outbox handle failures that may happen when trying to send a message to the queue? So for example: A consumer generates 3 messages that are collected in the outbox. The consumer completes without issue. The collected messages in the outbox start being processed.
If for some reason there is a network issue (or other issue) while processing the collected messages and message 2 fails to be sent, what will happen to messages 2 and 3? Is there any sort of retry policy?
What happens if a message being processed in the outbox is successfully added to the queue but is unsuccessfully marked as sent in the outbox? Will there be another attempt to send the message to the queue?
Assuming the outbox will retry sending a message to a queue if there is some sort of failure is the message ID guaranteed to be consistent between attempts? Having a consistent Message ID is important for de-duplication to ensure we do not process the same message multiple times.
When a message is consumed is there any de-duplication that takes place? (This ties back to 1.C)
How does Mass Transit track processed records for each consumer? Do the storage engines take care of this responsibility?
Is there any sort of "transaction" exposed to the consumer that allows you to clear the collected message in the outbox without throwing an exception or is throwing an exception the only way to rollback the outbox?
What about messages that are generated outside of a consumer? Is there a way to roll back messages collected in the outbox (example: a WebAPI controller action)?
Is there a recommendation to use the DTC features of Mass Transit instead of outbox or vice versa or use them both?
Currently Mass Transit does not have an outbox implementation that can survive a process crash. Is there a plan to include such a feature? Is there a road map this is tracked on?
The in-memory outbox defers any message send/publish/respond calls until the consumer has completed all processing. This includes regular consumers and sagas. The very last thing the consumer does is send/publish any deferred messages, after which the incoming message is acknowledged (and removed from the queue). With that said, most of the remaining items in your question aren't relevant, because it isn't writing messages to a database, and then processing them afterwards.
No
No
Don't use the DTC, it isn't even supported in .NET Core
No plans, nothing on the roadmap
As you said at the start, the in-memory outbox handles 99.9% of the cases. A well-designed saga and supporting services can push that even higher, ensuring idempotency and eventually successful command (or event) processing. Anything beyond what's there today is typically to support poorly designed systems and just creates way too much complexity with extra dependencies.
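The deferral described in this answer can be pictured with a small conceptual sketch. It is written in Java purely for illustration and is not MassTransit code: `InMemoryOutbox`, `send` and `consume` are hypothetical names, messages are plain strings, and the boolean result stands in for "acknowledge the incoming message".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Conceptual sketch only; these types are hypothetical and are not MassTransit APIs.
class InMemoryOutbox {
    private final List<String> deferred = new ArrayList<>();
    private final Consumer<String> transport; // actually puts a message on the broker

    InMemoryOutbox(Consumer<String> transport) {
        this.transport = transport;
    }

    /** Called by the handler; only buffers the message. */
    void send(String message) {
        deferred.add(message);
    }

    /** Runs the handler; returns true if the incoming message may be acknowledged. */
    boolean consume(Runnable handler) {
        try {
            handler.run();                // handler calls send(...), nothing leaves yet
            deferred.forEach(transport);  // last step: really send the buffered messages
            return true;                  // the caller may now ack the incoming message
        } catch (RuntimeException e) {
            return false;                 // no ack, so the broker redelivers the incoming message
        } finally {
            deferred.clear();             // buffered messages never outlive one attempt
        }
    }
}
```

If the transport fails partway through the flush, some messages have already gone out and the incoming message is retried, which is why the answer stresses idempotent consumers.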

AWS SQS - Queue not delivering any messages until Visibility Timeout expires for one message

EDIT: Solved this one while I was writing it up :P -- I love those kind of solutions. I figured I'd post it anyway, maybe someone else will have the same problem and find my solution. Don't care about points/karma, etc. I just already wrote the whole thing up, so figured I'd post it and the solution.
I have an SQS FIFO queue. It is using a dead letter queue. Here is how it had been configured:
I have a single producer microservice, and I have 10 ECS images that are running as consumers.
It is important that we process the messages close to the time they are delivered in the queue for business reasons.
We're using a fairly recent version of the AWS SDK Golang client package for both producer and consumer code (if important, I can go look up the version, but it is not terribly outdated).
I capture the logs for the producer so I know exactly when messages were put in the queue and what the messages were.
I capture aggregate logs for all the consumers, so I have a full view of all 10 consumers and when messages were received and processed.
Here's what I see under normal conditions looking at the logs:
Message put in the queue at time x
Message received by one of the 10 consumers at time x
Message processed by consumer successfully
Message deleted from queue by consumer at time x + (0-2 seconds)
Repeat ad infinitum for up to about 700 messages / day at various times per day
But the problem I am seeing now is that some messages are not being processed in a timely manner. Occasionally we fail processing a message deliberately b/c of the state of the system for that message (e.g. maybe users still logged in, so it should back off and retry...which it does). The problem is if the consumer fails a message it is causing the queue to stop delivering any other messages to any other consumers.
"Failure to process a message" here just means the message was received, but the consumer declared it a failure, so we just log an error, and do not proceed to delete it from the queue. Thus, the visibility timeout (here 5m) will expire and it will be re-delivered to another consumer and retried up to 10 times, after which it will go to the dead letter queue.
After delving into the logs and analyzing it, here's what I'm seeing:
1. Process begins like above (message produced, consumed, deleted).
2. New message received at time x by consumer
3. Consumer fails -- logs error and just returns (does not delete)
4. Same message is received again at time x + 5m (visibility timeout)
5. Consumer fails -- logs error and just returns (does not delete)
6. Repeat up to 10x -- message goes to dead-letter queue
7. New message received but it is now 50 minutes late!
Now all messages that were put in the queue between steps 2-7 are 50 minutes late (5m visibility timeout * 10 retries)
All the docs I've read tell me the queue should not behave this way, but I've verified it several times in our logs. Sadly, we don't have a paid AWS support plan, or I'd file a ticket with them. But just consider the fact that we have 10 separate consumers all reading from the same queue. They only read from this queue. We don't have any other queues it is using.
For de-duplication we are using the automated hash of the message body. Messages are small JSON documents.
My expectation would be if we have a single bad message that causes a visibility timeout, that the queue would still happily deliver any other messages it has available while there are available consumers.
OK, so turns out I missed this little nugget of info about FIFO queues in the documentation:
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/FIFO-queues.html
When you receive a message with a message group ID, no more messages
for the same message group ID are returned unless you delete the
message or it becomes visible.
I was indeed using the same Message Group ID. Hadn't given it a second thought. Just be aware, if you do that and any one of your messages fails to process, it will back up all other messages in the queue, until the time that the message is finally dealt with. The solution for me was to change the message group id. There is some business logic id I can postfix on it that will work for me.
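The effect of that rule can be reproduced with a toy in-memory model. This is a sketch for illustration only, not the AWS SDK: `FifoGroupModel` and its methods are hypothetical, and it models just the window in which a received message is in flight (a failed message becoming visible again after the timeout is left out).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Toy model of the documented FIFO rule; it does not call or imitate the AWS SDK.
class FifoGroupModel {
    static final class Msg {
        final String groupId;
        final String body;

        Msg(String groupId, String body) {
            this.groupId = groupId;
            this.body = body;
        }
    }

    private final List<Msg> queue = new ArrayList<>();
    private final Set<String> inFlightGroups = new HashSet<>();

    void send(String groupId, String body) {
        queue.add(new Msg(groupId, body));
    }

    /** Returns the oldest message whose group has nothing in flight, if any. */
    Optional<Msg> receive() {
        for (int i = 0; i < queue.size(); i++) {
            Msg m = queue.get(i);
            if (!inFlightGroups.contains(m.groupId)) {
                queue.remove(i);               // removes by index
                inFlightGroups.add(m.groupId); // the group is now blocked
                return Optional.of(m);
            }
        }
        return Optional.empty();
    }

    /** Deleting a message frees its group for further deliveries. */
    void delete(Msg m) {
        inFlightGroups.remove(m.groupId);
    }
}
```

With a single shared group ID, a second `receive()` returns nothing while the first message is still in flight; with one group ID per business key, other consumers keep receiving work.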

Need help to handle MDB Exception in two ways

I'm trying to handle two different types of problems while processing a message.
The first problem is if the remote database is down. In that case, the message should stop processing, and try again later. This message should never go to a DLQ, and should keep trying until the remote database is up.
The second problem is when there is a problem with the message. In that case, it should go to the DLQ.
How should I be structuring the following code?
@Override
public void onMessage(Message message) {
    try {
        // Do some processing
        messageProcessing(message); // Should DLQ if message is bad
        // Save to the database
        putNamedLocation(message); // <<--- Exception when external DB is down
    } catch (Exception e) {
        logger.error(e.getMessage());
        mdc.setRollbackOnly();
    }
}
Assuming you can detect bad messages definitively in the code body of the MDB, I would write the bad messages to the DLQ directly. This gives you a bit more freedom to perhaps categorize the error and optionally send different types of bad messages to different "DLQ-like" queues, and/or apply a time-to-live to DLQ'ed messages so that no-hope-of-ever-being-processed type messages don't pile up in the queue forever. You can add @Resource annotated instance variables to your MDB class referencing the ConnectionFactory and Queue references to support the sending of the messages to the target DLQ. The bottom line is, make sure you detect the error and DLQ the message yourself.
As for the DB being down, you can detect this by catching exceptions when acquiring a connection or writing your updates. In this case, clean up your resources and throw a RuntimeException. This will cause the message to be redelivered, but you will want to check the JMS configuration for two things:
1. Make sure the max-redelivery count is high enough, otherwise the count will tick over and the message will be DLQed eventually anyway.
2. If your JMS implementation supports it, add a redelivery delay to rejected messages to allow some time for the DB to come back up, otherwise your messages will endlessly spin in a deliver/reject loop.
To avoid #2 (which is tricky if your JMS implementation does not support redelivery delay, like WebSphereMQ), you can use the JBoss JMX management interface for the MDB to stop (and later restart) delivery on the MDB. However, you can't do this inside the MDB in the same thread that is processing the message because the MDB will wait for the message to complete processing, which it can't because it is waiting for the MDB to stop, which it can't because...[and so on] so... your best bet is to start some sort of sentry that polls the DB and when it finds it down, stops the MDB and when it finds it up again, restarts it. See this question for a snippet on how to do that.
That last part should help deal with any unexpected exceptions resulting from message validations (i.e. the DB is fine, but for some reason the message is totally fubar, resulting in uncaught exceptions which cause the message to be redelivered). Since down-DB messages should not be redelivered more than a few times (on account of your sentry), you can check a message's redelivery count, and if it is ridiculously high then you know you have a poison message and you can ditch it, or DLQ it.
Hope that's helpful.
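The two failure paths described in this answer can be sketched as follows. This is a simplified stand-alone illustration, not a real MDB: `MessageRouter` and `Step` are hypothetical names, the message is a plain string instead of a javax.jms.Message, and the dead-letter sender is an injected callback where a real bean would use a JMS producer built from the injected ConnectionFactory and Queue.

```java
import java.util.function.Consumer;

// Hypothetical stand-in for the MDB body; a real bean would work with javax.jms types.
class MessageRouter {
    /** A processing step that may fail with any exception. */
    interface Step {
        void apply(String body) throws Exception;
    }

    private final Step validate;        // fails when the message itself is bad
    private final Step saveToDatabase;  // fails when the remote database is down
    private final Consumer<String> deadLetterSender;

    MessageRouter(Step validate, Step saveToDatabase, Consumer<String> deadLetterSender) {
        this.validate = validate;
        this.saveToDatabase = saveToDatabase;
        this.deadLetterSender = deadLetterSender;
    }

    /** Mirrors the body of onMessage. */
    void onMessage(String body) {
        try {
            validate.apply(body);
        } catch (Exception e) {
            // Bad message: send it to the DLQ ourselves and return normally,
            // so the container treats the delivery as complete.
            deadLetterSender.accept(body);
            return;
        }
        try {
            saveToDatabase.apply(body);
        } catch (Exception e) {
            // Database problem: fail the delivery so the message is redelivered later.
            throw new RuntimeException("database unavailable, requesting redelivery", e);
        }
    }
}
```

The design choice is to keep the two try blocks separate, so that a validation failure can never be mistaken for a database outage and vice versa.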

Configure a JMS (ActiveMQ) queue so that it only contains the last message

We have a Quartz process that polls an ActiveMQ JMS queue.
We know that we could get several messages a minute, and we would like to respond only to the most recent message at a configured polling rate of a minute or more.
We don't need to process any of the previous messages.
Is there a way to configure the queue to get this behavior?
It seems like a topic has the ability to do this via the subscription recovery policy using a count of 1. We would like to do this using a queue to guarantee (more or less) a single delivery of the message.
Or is there a conceptual flaw in our assumptions...
Thanks
In my opinion there is no standard operation for this, so you will have to write some code....
One possible solution would be to use a QueueBrowser together with a QueueReceiver:
Through the QueueBrowser you would get an Enumeration of the messages in the queue. For each message you can now perform a receive on the QueueReceiver with a MessageSelector on the JMSMessageID, as long as hasMoreElements() returns true. The last message will be the one you want to have....
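A simpler variant of the same idea is to drain everything that is currently queued on each poll and keep only the newest message. The sketch below swaps in that technique and makes assumptions: a java.util.Queue stands in for the JMS queue, `poll()` plays the role of a non-blocking receive (in JMS, MessageConsumer.receiveNoWait() likewise returns null when nothing is available), and `LatestMessagePoller` is a hypothetical name.

```java
import java.util.Optional;
import java.util.Queue;

// Sketch only: a java.util.Queue stands in for the JMS queue.
class LatestMessagePoller {
    /** Consumes everything currently queued and returns only the newest message. */
    static <T> Optional<T> drainToLatest(Queue<T> queue) {
        T latest = null;
        T next;
        while ((next = queue.poll()) != null) {
            latest = next; // older messages are consumed and dropped
        }
        return Optional.ofNullable(latest);
    }
}
```

The scheduled job would call this once per polling interval and act only on the returned message, if there is one.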
When using ActiveMQ, you can use "image caching" on topics. One of the settings there is to always keep the last message sent.
Take a look at the Subscription Recovery Policy settings:
http://activemq.apache.org/subscription-recovery-policy.html
