MassTransit consumers didn't acknowledge some messages

MassTransit consumers didn't acknowledge some messages - masstransit

I have a question about some strange behaviour of consumer.
Recently we had strange situation on production environment. Two consumers on two different microservices were stuck at some messages. The first one was holding 20 messages from rabbitMQ queue and the second one 2 messages and they weren't processing them. These messages were visible as Unacked in RabbitMQ for two days. They went back to Ready state just when that two microservices were restarted. At that time when consumers took this messages the whole program was processing thousands messages per hour, so basically our Saga and all consumers were working. When these messages went back to Ready state they were processed in one second after that so I don't think that it's problem with them.
The messages are published by Saga to Exchange and besides these two stucked consumers we have also EventLogger consumer subscribed to all messages and this EventLogger processed this 22 messages normally without any problems (from his own queue). Also we have connected Application Insights to consumers and there is no information about receiving these 22 messages by these two consumers (there are information about receiving it by EventLogger).
The other day we had the same issue with one message on test environment.
Recently we updated version of MassTransit in our project from version 6.2.0 to 7.1.6 and before that we didn't notice any similar issues with consumers but maybe it's just coincidence. We also have retry, redelivery, circuit breaker and in memory outbox mechanisms but I don't think that's problem with them because the consumer didn't even start to process these 22 messages.
Do you have any suggestions what could happened to this consumers?

Usually when a consumer doesn't even start to consume the message once it has been delivered to MassTransit by RabbitMQ, it could be an issue resolving the consumer from the container, such as a dependency to another backing service (database, log server, file, network connection, device, etc.).
The message remains unacknowledged on the broker because the transport/delivery mechanism to the consumer is waiting for a resource to become available. If there isn't anything in the logs for that time period indicating an issue with a resource, it's hard to know what could have blocked those messages from being consumed. The fact that they were ultimately consumed once the services were restarted seems to indicate the message content itself was fine.
Monitoring the lack of message consumption (and likely an associated queue depth increase) would give an indication that the situation has occurred. If it happens again, I'd increase the logging detail levels to see if the issue occurs again and can then be identified.

Related

ActiveMQ not delivering/dispatching persistent messages on queues

I am using ActiveMQ v5.10.0 and having an issue almost every weekend where my ActiveMQ instance stops delivering persistent messages sent on queues to the consumers. I have not been able to figure out what could be causing this.
While the issue was happening I tried following things:
I added a new consumer on the affected queue but it didn't receive
any messages.
I restarted the original consumer but it didn't receive any messages after the restart.
I purged the messages that were held on the queue but then messages started accumulating again and broker didn't deliver any of the new messages. When I purged the expiry count didn't increase neither the dequeue and dispatch counters.
I sent 100 non-persistent messages on the affected queue, surprisingly it received those messages.
I tried sending 100 persistent messages on that queue, it didn't deliver anyone of them, all the messages were held by broker.
I created a fresh new queue and sent 100 persistent messages and none of them was delivered to the consumer whereas all the non-persistent messages were delivered.
The same things happen if I send persistent or non-persistent messages from STOMP producers. Surprisingly all this happened only for queues, topic consumers were able to receive persistent as well as non-persistent messages.
I have already posted this on ActiveMQ user forum: http://activemq.2283324.n4.nabble.com/Broker-not-delivering-persistent-messages-to-consumer-on-queue-td4691245.html but no one from ActiveMQ has suggested anything.
The jstack output also isn't very helping.
More details:
1. I am not using any selectors, message groups feature
2. I have disabled producer flow control in my setup
I want some suggestions as to what configuration values might cause this issue- memory limits, message TTL etc.

JMS consumer inside a Netty handler?

I'm designing a quite complicated system and was wondering what the best way is to put a jms consumer (activemq, vm protocol, non persitent) inside a netty handler.
Let me explain, i have several clients connecting to my netty server using websockets. For every client connection i create a jms consumer that listens for interesting messages on one or more topics. If a interesting message arrives i need to do a extra step (additional filtering) before sending the message to the client using the websocket.
Is the following a good way to do this:
inside a SimpleChannelInboundHandler i declare a private non static consumer
the consumer is initialized in channelActive
the consumer is destroyed in channelInactive
when a message is received by consumer i do the extra filter a send it using ctx.channel().write()
In this setup i'm a bit worried that the consumer might turn into slow consumer and slow everything down, cause the websocket goes over the internet.
I came up with a more complex one to decouple the "receiving of message by consumer" and "sending of message through a websocket".
inside a SimpleChannelInboundHandler i declare a private non static consumer
the consumer is initialized in channelActive
the consumer is destroyed in channelInactive
when a message is received by consumer i put it in a blockedqueue
every minute i let a thread (created for every client) look in the queue and send the found messages to the client using ctx.channel().write().
At this point i'm a bit worried about the extra thread per client.
Or is there maybe a better way to accomplish this task?

This is a classic slow consumer problem and the first step to resolving it is to determine what the appropriate action is when a slow consumer is detected. If it is acceptable that the slow consumer misses messages then the solution is some variation on dropping messages or unsubscribing them from the feed. For example, if it's acceptable that the client misses messages then, when one is received from JMS, check if the channel is writable. If it isn't, drop the message. If you want to give yourself a bit more of a buffer (although OS buffers are quite large) you can track the number of write completion future's that haven't completed (ie the messages haven't been written to the OS send buffer) and drop messages if there are too many outstanding write requests.
If the client may not miss messages, and is consistently slow, then the problem is more difficult. One option might be to divert messages to a JMS queue with a specific header value, then open a new consumer that reads messages from that queue using a JMS selector. This will put more load on the JMS server but might be appropriate for temporary slowness and hopefully it won't interfere with you main topic feeds. Alternatively you might want to stash the messages in a different store, such as a database, so you can poll for messages when they can be sent. If you do this right a single polling thread can cope with many clients (query for clients which have outstanding messages, then for each client, load a bunch of messages). However this isn't as convenient as using JMS.
I wouldn't go with option 2 because the blocking queue is only going to solve the problem temporarily, and you can achieve the same thing by tracking how many write operations are waiting to complete.

ActiveMq, what happens if Client terminates before Ack

I have a persistent queue, non-transacted, client-acknowledge, the consumers read with jms.prefetchPolicy.queuePrefetch=1&wireFormat.maxInactivityDuration=50000
and once a consumer processes a message, it ack's the message.
If the consumer reads the message, and before it can send an ack, the process terminates abruptly, what happens in ActiveMq? (What ActiveMq parameters come into play here?)
How is that different than if the the consumer will take 10 minutes to process the message (so the consumer task is alive and working), how does ActiveMq know the message is still being worked on? (Does it monitor the TCP/IP connection, if the connection dies, it assumes the message will not be Ack'ed?)
How do I determine if a message is a "poison pill", i.e. it makes the consumers crash? (the redelivery count seems to be valid if the consumer task does not die; is there an internal counter in the message that says "it was been read n times without being successfully ack'ed?")
As an experiment, I sent 6 messages, one of them being a "poison pill" (kills the consumer before the consumer can send the ack), with 2 simultaneous consumers running (and automatically restarting consumers to bring the count to 2 whenever a consumer dies). Looking at the queue (using jconsole, I enable jmx using broker.setUseJmx(true)), 4 messages were delivered, 2 are in-flight. Why would there be 2 in-flight instead of just one?
I've been reading the ActiveMq and JMS specs for a while without clear/conclusive answers, so any insights on what parameters come into play, and if there are any known bugs, will be greatly useful.

This is purely based on my understanding of JMS - may not be completely correct:
If the consumer reads the message, and before it can send an ack, the process terminates abruptly, what happens in ActiveMq
My understanding is that since this happens in the context of a session with the JMS provider, JMS provider knows if the session is no longer active or has failed and any message not acknowledged as part of the session will be redelivered when the session is re-established.
How do I determine if a message is a "poison pill", i.e. it makes the consumers crash?
Like you have mentioned, the JMS provider keeps track of the # of times the message was redelivered possibly in the header of the message
4 messages were delivered, 2 are in flight
Not sure about this point

ActiveMQ with slow consumer skips 200 messages

I'm using ActiveMQ along with Mule (a kind of ESB based on Spring).
We got a fast producer and a slow consumer.
It's synchronous configuration with only one consumer.
Here the configuration of the consumer in spring style: http://pastebin.com/vweVd1pi
The biggest requirement is to keep the order of the messages.
However, after hours of running this code, suddenly, ActiveMQ skips 200 messages, and send the next ones.The 200 messages are still there in the activeMQ, they are not lost.
But our client (Mule), does have some custom code to check the order of the messages, using an unique identifier.
I had this issue already a few month ago. We change the consumer by using the parameter "jms.prefetchPolicy.queuePrefetch=1". It seemed to have worked well and to be the fix we needed unti now when the issue reappeared on another consumer.
Is it a bug, or a configuration issue ?

I can't talk about the requirement from a Mule perspective, but there are a couple of broker features that you should take a look at. There are two ways to guarantee message ordering in ActiveMQ:
Message groups are a way of ensuring that a set of related messages will be consumed by the same consumer in the order that they are placed on a queue. To use it you need to specify a JMSXGroupID header on related messages, and assign them an incrementing JMSXGroupSeq number. If a consumer dies, remaining messages from that group will be sent to another single consumer, while still preserving order.
Total message ordering applies to all messages on a topic. It is configured on the broker on a per-destination basis and requires no particular changes to client code. It comes with a synchronisation overhead.
Both features allow you to scale out to more than one consumer.

JMS Messages not consumed till producer connection close :-(

I am relatively new to JMS and have encountered a weird problem implementing my first real application. I'm desporate for any help or advice.
Background: I use AtiveMQ (java) as the message broker with non-transacted, non-persitent queues.
The Design: I have a straight forward producer/consumer system based around a single queue. A number of nodes(currently 2) place messages onto/ consume from the queue. Selectors are used to filter which messages a node recieves.
The Problem: The producer succesfully places its items on to the queue (i have verified they are there using the web interface) however the consumers remain blocked and do not read them. Only when i close the JMS connection in the producer do the consumers jump into life and consume the messages as expected.
This bevaior seems very weird to me, surely you shouldnt have to completely hang up the producer connection for the consumers to be able to read from the queue. I must have made a mistake somewhere(possibly with sessions) but the at the moment the number of things that could be wrong is to large and i have no idea what would cause this behaviour.
Any hints as to a solution, the cause of the problem or just how to continue debugging would be greatly appreciated.
Thanks for your time,
P.S If you requrie any additional information i am happy to provide it

Hard to say without seeing the code, but it sounds like the producer is transacted. You should not have to close the producer in order for the consumers to receive a message but a transacted producer won't send it messages until you call commit. Other things to check is that the connection has been started. Also if you have many consumers you should look at the prefetch setting to ensure that one consumer doesn't hog all the messages, setting to prefetch of 1 might be needed, but hard to say without further insight into your use case.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio