We have a situation in front of us, explained below:
We have a durable subscriber subscribed to a topic. The subscriber is a Perl script run by a daemon.
The script uses STOMP to connect to the broker.
The script wakes up every 5 minutes, checks for messages on the topic, and processes them in a batch by prefetching them.
The subscriber uses client acknowledgement and acknowledges only the last message of the batch.
We are using ActiveMQ 5.5 with KahaDB persistence.
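For reference, the subscription frame the script sends looks roughly like this (destination and subscription name are illustrative; ack:client, activemq.subscriptionName, and activemq.prefetchSize are ActiveMQ's STOMP headers for client acknowledgement, durable subscriptions, and prefetch):

    SUBSCRIBE
    destination:/topic/some.topic
    ack:client
    activemq.subscriptionName:my-durable-sub
    activemq.prefetchSize:100

    ^@

(^@ stands for the NUL byte that terminates a STOMP frame.)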
Now what we see is:
Even though the messages are processed in a batch and the last message is acknowledged, the in-flight count does not come down.
The enqueue, dequeue, and dispatch counts do not match.
The journal files are not getting cleaned up.
I do understand that the journal files are cleaned up once the references to the messages are removed (i.e. the messages are consumed). But does that have anything to do with the various count attributes I see on the topic?
Also, should I expect the in-flight count to come down to 0 if the client crashes and then consumes all messages after coming back up?
Please let me know if there is any other reason that could cause the journal files to stay around.
Thank you
Hari
Actually I was disconnecting immediately after sending the ack.
Waiting for the ack's receipt before disconnecting solved the issue.
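For anyone who hits the same thing: STOMP lets you ask the broker to confirm any client frame by adding a receipt header, so the fix is to send the ACK with a receipt and only DISCONNECT once the matching RECEIPT frame comes back. A rough sketch of the exchange (ids illustrative):

    ACK
    message-id:ID:broker-3-12345
    subscription:0
    receipt:ack-42

    ^@

    RECEIPT
    receipt-id:ack-42

    ^@

Disconnecting before the RECEIPT arrives means the broker may never process the ACK, which is why the messages stayed referenced in the journal.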
I have a question about some strange consumer behaviour.
Recently we had a strange situation in our production environment. Two consumers on two different microservices were stuck on some messages: the first was holding 20 messages from a RabbitMQ queue and the second 2 messages, and neither was processing them. These messages were visible as Unacked in RabbitMQ for two days, and they only went back to the Ready state when the two microservices were restarted. At the time the consumers took these messages, the system as a whole was processing thousands of messages per hour, so our saga and all other consumers were working. Once the messages went back to Ready they were processed within a second, so I don't think the problem is with the messages themselves.
The messages are published by the saga to an exchange. Besides the two stuck consumers, we also have an EventLogger consumer subscribed to all messages, and it processed these 22 messages normally from its own queue without any problems. We also have Application Insights connected to the consumers, and there is no record of the two stuck consumers ever receiving these 22 messages (there are records of EventLogger receiving them).
The other day we had the same issue with one message on the test environment.
We recently updated MassTransit in our project from version 6.2.0 to 7.1.6; before that we hadn't noticed any similar consumer issues, but maybe that's just a coincidence. We also use the retry, redelivery, circuit breaker, and in-memory outbox mechanisms, but I don't think the problem is there, because the consumers never even started processing these 22 messages.
Do you have any suggestions as to what could have happened to these consumers?
Usually when a consumer doesn't even start to consume a message once it has been delivered to MassTransit by RabbitMQ, it could be an issue resolving the consumer from the container, such as a dependency on another backing service (database, log server, file, network connection, device, etc.).
The message remains unacknowledged on the broker because the transport/delivery mechanism to the consumer is waiting for a resource to become available. If there isn't anything in the logs for that time period indicating an issue with a resource, it's hard to know what could have blocked those messages from being consumed. The fact that they were ultimately consumed once the services were restarted seems to indicate the message content itself was fine.
Monitoring for a lack of message consumption (and the associated queue depth increase that would likely accompany it) would give an indication that the situation has occurred. If it happens again, I'd increase the logging detail levels so that the issue can be identified.
I am using ActiveMQ v5.10.0 and having an issue almost every weekend where my ActiveMQ instance stops delivering persistent messages sent on queues to the consumers. I have not been able to figure out what could be causing this.
While the issue was happening, I tried the following things:
I added a new consumer on the affected queue, but it didn't receive any messages.
I restarted the original consumer but it didn't receive any messages after the restart.
I purged the messages held on the queue, but then messages started accumulating again and the broker didn't deliver any of the new ones. When I purged, neither the expiry count nor the dequeue and dispatch counters increased.
I sent 100 non-persistent messages to the affected queue; surprisingly, the consumer received those messages.
I tried sending 100 persistent messages to that queue; the broker didn't deliver any of them and held them all.
I created a fresh queue and sent 100 persistent messages; none of them were delivered to the consumer, whereas all the non-persistent messages were delivered.
The same thing happens when I send persistent or non-persistent messages from STOMP producers. Surprisingly, all of this happened only for queues; topic consumers were able to receive both persistent and non-persistent messages.
I have already posted this on ActiveMQ user forum: http://activemq.2283324.n4.nabble.com/Broker-not-delivering-persistent-messages-to-consumer-on-queue-td4691245.html but no one from ActiveMQ has suggested anything.
The jstack output isn't very helpful either.
More details:
1. I am not using selectors or the message groups feature.
2. I have disabled producer flow control in my setup.
I want suggestions as to what configuration values might cause this issue: memory limits, message TTL, etc.
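For reference, the limits I'm asking about live in the systemUsage block of activemq.xml; mine is close to the stock one below (the values here are illustrative, not my production settings). As far as I understand, persistent messages are bounded by storeUsage while non-persistent ones use memoryUsage/tempUsage, which is why I suspect these settings given that only persistent delivery is affected:

    <systemUsage>
        <systemUsage>
            <memoryUsage>
                <memoryUsage limit="256 mb"/>
            </memoryUsage>
            <storeUsage>
                <storeUsage limit="10 gb"/>
            </storeUsage>
            <tempUsage>
                <tempUsage limit="1 gb"/>
            </tempUsage>
        </systemUsage>
    </systemUsage>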
I have a setup with an ActiveMQ broker and a single consumer. The consumer gets a message that it is not able to process because a service it depends on has a bug (once that is fixed, it will be fine). So the message keeps being redelivered (consumer redelivery); we use JMS sessions. With our current configuration it will keep being redelivered every 10 minutes for 1 day. That obviously causes a problem, because other messages are not being consumed in the meantime.
To solve this I accessed the queue through JMX and tried to delete the message, but it is not there. I guess it is cached on the consumer and not visible at the broker.
Is there any way to delete this message other than restarting the application?
Is it possible to configure the redelivery mechanism so that such a message (one that eventually causes a livelock) is put at the end of the queue, so that other messages can be processed?
The 10 minutes for 1 day redelivery policy should stay as is.
I think you're right that the messages are stuck in the consumer's prefetch buffer, and I don't know of a way to delete them from there.
I'd change your redelivery policy to send to the DLQ after the second failure, with a much shorter interval between them, like 30 seconds, and I'd configure the DLQ strategy as an individualDeadLetterStrategy so you get a separate DLQ containing only messages from this particular queue. Then set up a consumer on this DLQ to move the messages to (the end of) the main queue whenever your reprocessing condition is met (whether that's after a certain delay, or based on reading some flag value from a database, or whatever). This consumer is where you'd implement "every 10 minutes for 1 day" logic, instead of in the redelivery policy where you currently have it.
That will keep the garbage ones out of the main queue so they don't delay other messages from being consumed, but still ensure that they will be reprocessed later. And it will put them on the broker instead of in the consumer's prefetch buffer, where you can view and delete them.
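A minimal sketch of that setup, assuming ActiveMQ 5.x (broker URL, queue pattern, and delays are illustrative). On the client, the redelivery policy on the connection factory sends to the DLQ after the second failure:

    import org.apache.activemq.ActiveMQConnectionFactory;
    import org.apache.activemq.RedeliveryPolicy;

    ActiveMQConnectionFactory factory =
        new ActiveMQConnectionFactory("tcp://localhost:61616");
    RedeliveryPolicy policy = factory.getRedeliveryPolicy();
    policy.setInitialRedeliveryDelay(30000); // 30 s before the first retry
    policy.setRedeliveryDelay(30000);        // 30 s between retries
    policy.setUseExponentialBackOff(false);
    policy.setMaximumRedeliveries(1);        // one retry, then off to the DLQ

On the broker, an individualDeadLetterStrategy in activemq.xml gives each queue its own DLQ (prefixed with DLQ. here):

    <destinationPolicy>
        <policyMap>
            <policyEntries>
                <policyEntry queue=">">
                    <deadLetterStrategy>
                        <individualDeadLetterStrategy queuePrefix="DLQ."
                            useQueueForQueueMessages="true"/>
                    </deadLetterStrategy>
                </policyEntry>
            </policyEntries>
        </policyMap>
    </destinationPolicy>

The "every 10 minutes for 1 day" logic then lives in your own consumer on the per-queue DLQ, which re-produces messages to the main queue when the retry condition is met.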
The only way to get it to the back of the queue is to produce it to the queue again. Redelivery policies can only be configured down to the destination level on the connection factory.
Given that you already have a connection, it shouldn't be too hard to create a producer that can either move the given message to a DLQ or produce it back to the queue when you run into that particular bug.
Setting jms.nonBlockingRedelivery=true on the connection factory resolved the problem. Now even if a message is being redelivered, it does not block processing of the other messages.
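For anyone else: the flag maps to a setter on the ActiveMQ connection factory, so it can go on the broker URL or be set programmatically (the URL below is illustrative):

    import org.apache.activemq.ActiveMQConnectionFactory;

    // via the connection URL
    ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory(
        "failover:(tcp://localhost:61616)?jms.nonBlockingRedelivery=true");

    // or equivalently, programmatically
    factory.setNonBlockingRedelivery(true);

With this set, a pending redelivery delay no longer holds up delivery of the messages behind it, at the cost of no longer preserving strict message order.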
My program will be receiving messages rather slowly, and I want them to persist in the queue until I have received and acknowledged all of them. I don't know whether I have enough messages until I have received a bunch of them.
My question: will the queue block, waiting for the acknowledgement from the first message before delivering the second?
Well, I ran a test on this using the sample producer/consumer code. The consumer actually has code for this (if you switch over to CLIENT_ACKNOWLEDGE): it receives a bunch of messages (10 of them) and only acks the last one.
When you set the acknowledge mode to Session.CLIENT_ACKNOWLEDGE you can receive as many messages as you need. The messages will be locked on the server, so no other consumer can retrieve them in the meantime. So the answer is no, the queue won't block (though there might be provider-specific settings that can make it block, which I don't know about).
However, you can only acknowledge them all at once: when you have received 10 messages and acknowledge one of them (it doesn't matter which), all of them are acknowledged.
For reference, see Controlling Message Acknowledgment.
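A minimal sketch of that test in plain JMS, assuming an ActiveMQ connection factory and a queue named TEST.QUEUE (both illustrative):

    import javax.jms.*;
    import org.apache.activemq.ActiveMQConnectionFactory;

    public class BatchAckExample {
        public static void main(String[] args) throws JMSException {
            ConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://localhost:61616");
            Connection connection = factory.createConnection();
            connection.start();
            try {
                Session session =
                    connection.createSession(false, Session.CLIENT_ACKNOWLEDGE);
                MessageConsumer consumer =
                    session.createConsumer(session.createQueue("TEST.QUEUE"));

                Message last = null;
                for (int i = 0; i < 10; i++) {
                    Message m = consumer.receive(5000); // wait up to 5 s each
                    if (m == null) break;               // queue drained for now
                    last = m;
                }
                if (last != null) {
                    // acking any message acknowledges every message
                    // consumed so far in this session
                    last.acknowledge();
                }
            } finally {
                connection.close();
            }
        }
    }

Until acknowledge() is called, the broker keeps all ten messages locked to this session, which is the behaviour described above.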
I have a persistent queue, non-transacted, client-acknowledge; the consumers read with jms.prefetchPolicy.queuePrefetch=1&wireFormat.maxInactivityDuration=50000,
and once a consumer processes a message, it acks the message.
If the consumer reads a message and the process terminates abruptly before it can send the ack, what happens in ActiveMQ? (What ActiveMQ parameters come into play here?)
How is that different from the consumer taking 10 minutes to process the message (so the consumer task is alive and working)? How does ActiveMQ know the message is still being worked on? (Does it monitor the TCP/IP connection, and if the connection dies, assume the message will not be acked?)
How do I determine whether a message is a "poison pill", i.e. one that makes the consumers crash? (The redelivery count seems to be valid only if the consumer task does not die; is there an internal counter in the message that says "it has been read n times without being successfully acked"?)
As an experiment, I sent 6 messages, one of them a "poison pill" (it kills the consumer before the consumer can send the ack), with 2 simultaneous consumers running (consumers are automatically restarted to bring the count back to 2 whenever one dies). Looking at the queue (using jconsole; I enable JMX with broker.setUseJmx(true)), 4 messages were delivered and 2 are in-flight. Why would there be 2 in-flight instead of just one?
I've been reading the ActiveMQ docs and the JMS spec for a while without clear or conclusive answers, so any insight into which parameters come into play, and whether there are any known bugs, would be greatly appreciated.
This is purely based on my understanding of JMS - may not be completely correct:
If the consumer reads the message, and before it can send an ack, the process terminates abruptly, what happens in ActiveMq
My understanding is that since this happens in the context of a session with the JMS provider, the provider knows when the session is no longer active or has failed, and any message not acknowledged as part of that session will be redelivered when the session is re-established.
How do I determine if a message is a "poison pill", i.e. it makes the consumers crash?
As you mentioned, the JMS provider keeps track of the number of times the message has been redelivered, typically in a message header.
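In JMS that shows up as the JMSRedelivered header and, on providers that support it (ActiveMQ does), the JMSXDeliveryCount property. A small consumer-side sketch of a poison pill check (the threshold of 5 is illustrative):

    // true if the broker believes this message was dispatched before
    boolean redelivered = message.getJMSRedelivered();

    // JMS-defined optional property: 1 on the first delivery,
    // incremented by the provider on each redelivery
    int deliveryCount = message.getIntProperty("JMSXDeliveryCount");

    if (deliveryCount > 5) {
        // treat it as a poison pill: log and route it aside
        // rather than letting it crash the consumer again
    }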
4 messages were delivered, 2 are in flight
Not sure about this point