I have configured Spring's DefaultMessageListenerContainer as an ActiveMQ consumer consuming messages from a queue. Let's call it "Test.Queue".
I have this code deployed on 4 different machines, and all the machines are configured against the same ActiveMQ instance to process messages from the same "Test.Queue" queue.
I set the max consumer size to 20. As soon as all 4 machines are up and running, I see the consumer count against the queue as 80 (4 * max consumer size = 80).
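For reference, the container setup on each machine looks roughly like the following (a sketch only; the class name, wiring, and broker URL are assumptions):

    import javax.jms.ConnectionFactory;
    import javax.jms.MessageListener;
    import org.apache.activemq.ActiveMQConnectionFactory;
    import org.springframework.jms.listener.DefaultMessageListenerContainer;

    public class TestQueueListenerConfig {

        public DefaultMessageListenerContainer testQueueContainer(MessageListener listener) {
            // All 4 machines point at the same broker (URL assumed).
            ConnectionFactory connectionFactory =
                    new ActiveMQConnectionFactory("tcp://activemq-host:61616");
            DefaultMessageListenerContainer container = new DefaultMessageListenerContainer();
            container.setConnectionFactory(connectionFactory);
            container.setDestinationName("Test.Queue");
            container.setConcurrentConsumers(20);     // "max consumer size" of 20 per machine
            container.setMaxConcurrentConsumers(20);  // 4 machines x 20 = 80 consumers on the queue
            container.setMessageListener(listener);
            return container;
        }
    }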
Everything is fine even when the number of messages produced and sent to the queue grows high.
But when there are thousands of messages and one of the 80 consumers gets stuck, it freezes ActiveMQ, which stops sending messages to the other consumers.
All messages are then stuck in ActiveMQ forever.
Since I have 4 machines with up to 80 consumers, I have no way to tell which consumer failed to acknowledge.
I stop and restart all 4 machines, and as soon as I stop the machine that has the stuck consumer, messages start flowing again.
I don't know how to configure DefaultMessageListenerContainer to abandon the bad consumer and signal ActiveMQ immediately so it resumes sending messages.
I was able to create the scenario even without Spring as follows:
I produced up to 5000 messages and sent them to the "Test.Queue" queue.
I created 2 consumers (Consumer A and Consumer B). In Consumer B's onMessage() method, I put the thread to sleep for a long time (Thread.sleep(Long.MAX_VALUE)) whenever the condition "current time % 13 is 0" holds (sketched after this list).
Ran these 2 consumers.
Went to ActiveMQ and found that the queue has 2 consumers.
Both A and B are processing messages.
At some point Consumer B's onMessage() gets called, the "current time % 13 is 0" condition is satisfied, and it puts the thread to sleep.
Consumer B is now stuck and cannot acknowledge to the broker.
I went back to the ActiveMQ web console: it still shows 2 consumers, but no messages are being dequeued.
Then I created another consumer (Consumer C) and ran it.
The consumer count in ActiveMQ went up from 2 to 3, but that was the only change.
Consumer C does not consume anything, because the broker holds all the messages back while it is still waiting for Consumer B to acknowledge.
I also noticed that Consumer A is not consuming anything either.
As soon as I kill Consumer B, all the messages are drained.
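For completeness, the "bad" consumer from this scenario looks roughly like the sketch below (plain JMS; the broker URL and the CLIENT_ACKNOWLEDGE mode are assumptions, but any mode where the acknowledgement depends on onMessage() completing shows the same behaviour):

    import java.util.concurrent.CountDownLatch;
    import javax.jms.Connection;
    import javax.jms.Message;
    import javax.jms.MessageListener;
    import javax.jms.Session;
    import org.apache.activemq.ActiveMQConnectionFactory;

    public class ConsumerB {
        public static void main(String[] args) throws Exception {
            Connection connection =
                    new ActiveMQConnectionFactory("tcp://activemq-host:61616").createConnection();
            connection.start();
            Session session = connection.createSession(false, Session.CLIENT_ACKNOWLEDGE);
            session.createConsumer(session.createQueue("Test.Queue")).setMessageListener(new MessageListener() {
                @Override
                public void onMessage(Message message) {
                    try {
                        if (System.currentTimeMillis() % 13 == 0) {
                            Thread.sleep(Long.MAX_VALUE); // "stuck": never returns, never acknowledges
                        }
                        message.acknowledge();            // normal path: acknowledge the message
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            });
            new CountDownLatch(1).await();                // keep the consumer alive
        }
    }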
Let's say A, B, and C are managed by Spring's DefaultMessageListenerContainer. How do I tweak DefaultMessageListenerContainer to take that bad consumer off the pool (Consumer B in my case) after it has failed to acknowledge for X seconds, and signal the broker immediately so that it does not hold onto messages forever?
Thanks for your time.
I'd appreciate a solution to this problem.
Here are a few options to try:
Set the queue prefetch to 0 to promote better distribution across consumers and reduce 'stuck' messages on specific consumers; see http://activemq.apache.org/what-is-the-prefetch-limit-for.html
Set "?useKeepAlive=false&wireFormat.maxInactivityDuration=20000" on the connection URL to time out the slow consumer after the specified inactivity period (a client-side sketch of these first two options follows the policy example below).
Set the destination policy "slowConsumerStrategy -> abortSlowConsumerStrategy", again to time out and abort a slow consumer:
    <policyEntry ...>
        ...
        <slowConsumerStrategy>
            <abortSlowConsumerStrategy />
        </slowConsumerStrategy>
        ...
    </policyEntry>
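Putting the first two options together on the client side, a minimal sketch (broker host and port are assumptions):

    import org.apache.activemq.ActiveMQConnectionFactory;

    public class TestQueueConnectionFactory {
        public static ActiveMQConnectionFactory create() {
            // Option 2: disable keep-alive and time out the connection after 20s of inactivity.
            ActiveMQConnectionFactory factory = new ActiveMQConnectionFactory(
                    "tcp://activemq-host:61616?useKeepAlive=false&wireFormat.maxInactivityDuration=20000");
            // Option 1: queue prefetch of 0 so the broker does not pre-dispatch messages
            // to a consumer that may get stuck.
            factory.getPrefetchPolicy().setQueuePrefetch(0);
            return factory;
        }
    }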
Related
I have ActiveMQ Artemis. A producer generates 1000 messages and a single consumer processes them one by one. Now I want to process this queue with the help of two consumers. When I start a new consumer, new messages are distributed between the two running consumers. My question: is it possible to redistribute the old messages between all started consumers?
Once messages are dispatched by the broker to a consumer then the broker can't simply recall them as the consumer may be processing them. It's up to the consumer to cancel the messages back to the queue (e.g. by closing its connection/session).
My recommendation would be to tune your consumerWindowSize (set on the client's URL) so that a suitable number of messages are dispatched to your consumers. The default consumerWindowSize is 1M (1024 * 1024 bytes). A smaller consumerWindowSize would mean that more clients would be able to receive messages concurrently, but it would also mean that clients would need to conduct more network round-trips to tell the broker to dispatch more messages when they run low. You'll need to run benchmarks to find the right consumerWindowSize value for your use-case and performance needs.
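As an illustration, here is a minimal sketch of setting it on the client URL, assuming the Artemis core JMS client (the host and the extreme value of 0, which disables client-side buffering entirely, are only for illustration):

    import javax.jms.Connection;
    import org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory;

    public class ArtemisConsumerConnection {
        public static Connection connect() throws Exception {
            // consumerWindowSize=0: no client-side buffer, messages stay on the broker
            // until a consumer is actually ready for them.
            ActiveMQConnectionFactory factory =
                    new ActiveMQConnectionFactory("tcp://artemis-host:61616?consumerWindowSize=0");
            Connection connection = factory.createConnection();
            connection.start();
            return connection;
        }
    }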
EDIT: Solved this one while I was writing it up :P -- I love those kinds of solutions. I figured I'd post it anyway; maybe someone else will have the same problem and find my solution. I don't care about points/karma, etc. I had already written the whole thing up, so I figured I'd post it along with the solution.
I have an SQS FIFO queue backed by a dead-letter queue. Here is how it had been configured:
I have a single producer microservice, and I have 10 ECS images that are running as consumers.
It is important that we process the messages close to the time they are delivered in the queue for business reasons.
We're using a fairly recent version of the AWS SDK Golang client package for both producer and consumer code (if important, I can go look up the version, but it is not terribly outdated).
I capture the logs for the producer so I know exactly when messages were put in the queue and what the messages were.
I capture aggregate logs for all the consumers, so I have a full view of all 10 consumers and when messages were received and processed.
Here's what I see under normal conditions looking at the logs:
Message put in the queue at time x
Message received by one of the 10 consumers at time x
Message processed by consumer successfully
Message deleted from queue by consumer at time x + (0-2 seconds)
Repeat ad infinitum for up to about 700 messages / day at various times per day
But the problem I am seeing now is that some messages are not being processed in a timely manner. Occasionally we deliberately fail processing a message because of the state of the system for that message (e.g. a user is still logged in, so the consumer should back off and retry, which it does). The problem is that when a consumer fails a message, the queue stops delivering any other messages to any of the other consumers.
"Failure to process a message" here just means the message was received, but the consumer declared it a failure, so we just log an error, and do not proceed to delete it from the queue. Thus, the visibility timeout (here 5m) will expire and it will be re-delivered to another consumer and retried up to 10 times, after which it will go to the dead letter queue.
After delving into the logs and analyzing it, here's what I'm seeing:
Process begins like above (message produced, consumed, deleted).
New message received at time x by consumer
Consumer fails -- logs error and just returns (does not delete)
Same message is received again at time x + 5m (visibility timeout)
Consumer fails -- logs error and just returns (does not delete)
Repeat up to 10x -- message goes to dead-letter queue
New message received but it is now 50 minutes late!
Now all messages that were put in the queue between steps 2-7 are 50 minutes late (5m visibility timeout * 10 retries)
All the docs I've read tell me the queue should not behave this way, but I've verified it several times in our logs. Sadly, we don't have a paid AWS support plan, or I'd file a ticket with them. But just consider the fact that we have 10 separate consumers all reading from the same queue. They only read from this queue, and we don't use any other queues.
For de-duplication we are using content-based deduplication, i.e. the automatic hash of the message body. Messages are small JSON documents.
My expectation would be if we have a single bad message that causes a visibility timeout, that the queue would still happily deliver any other messages it has available while there are available consumers.
OK, so it turns out I missed this little nugget of info about FIFO queues in the documentation:
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/FIFO-queues.html
When you receive a message with a message group ID, no more messages for the same message group ID are returned unless you delete the message or it becomes visible.
I was indeed using the same message group ID for every message. I hadn't given it a second thought. Just be aware: if you do that and any one of your messages fails to process, it will back up all the other messages in the queue until that message is finally dealt with. The solution for me was to change the message group ID; there is a business-logic ID I can postfix onto it that will work for me.
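For anyone hitting the same thing, this is roughly what the fix looks like on the producer side. The code in question uses the Go SDK; the sketch below shows the same idea with the AWS SDK for Java v2, and the queue URL, group-ID prefix, and business ID are all assumptions:

    import software.amazon.awssdk.services.sqs.SqsClient;
    import software.amazon.awssdk.services.sqs.model.SendMessageRequest;

    public class FifoProducer {
        public static void send(SqsClient sqs, String queueUrl, String businessId, String body) {
            sqs.sendMessage(SendMessageRequest.builder()
                    .queueUrl(queueUrl)
                    .messageBody(body)
                    // One group per business entity instead of one shared group for everything:
                    // a failing/retrying message now only blocks messages in its own group.
                    .messageGroupId("order-" + businessId)
                    .build());
            // Content-based deduplication is enabled on the queue, so no explicit
            // messageDeduplicationId is set here.
        }
    }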
I am in charge of maintaining production software written in Golang which uses RabbitMQ as its message queue.
Consider the following situation:
A number of goroutines publish to a queue named logs.
Another set of goroutines reads from the queue and writes the messages to a MongoDB collection.
Each publisher and each consumer has its own connection and its own channel; they run in an infinite loop and never die. (The connections and channels are established when the program starts.)
autoAck, exclusive, and noWait are all set to false, and prefetch is set to 20 with global set to false for all channels. All queues are durable, with autoDelete, exclusive, and noWait all set to false.
The basic assumption was that each message in the queue would be delivered to one and only one consumer, so each message would be inserted into the database exactly once.
The problem is that there are duplicate messages in the MongoDB collection.
I would like to know: is it possible that more than one consumer gets the same message, causing the duplicate inserts?
The one case I could see with your setup where a message would be processed more than once is if one of the consumers has an issue at some point.
The situation would follow such a scenario:
Consumer gets a bunch of messages from the queue
Consumer starts processing a message
Consumer commits the message to MongoDB
Due to a RabbitMQ channel/connection issue, or some other issue on the consumer side, the consumer never acknowledges the message
Since it has not been acknowledged, the message is requeued at the head of the queue
The same message is processed again, causing the duplication
Such cases should show some errors in your consumers logs.
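The consumers in question are written in Go, but the window where this happens is easiest to see in a sketch; the one below uses the RabbitMQ Java client, and the queue name, prefetch value, and MongoWriter placeholder stand in for the real code:

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;
    import com.rabbitmq.client.DeliverCallback;
    import java.nio.charset.StandardCharsets;

    public class LogsConsumer {
        public static void consume(MongoWriter writer) throws Exception {
            Connection connection = new ConnectionFactory().newConnection();
            Channel channel = connection.createChannel();
            channel.basicQos(20);                      // prefetch 20, per-consumer (global=false)

            DeliverCallback deliverCallback = (consumerTag, delivery) -> {
                if (delivery.getEnvelope().isRedeliver()) {
                    // A redelivery means an earlier attempt was never acked; this is
                    // exactly the window in which a duplicate can reach MongoDB.
                    System.out.println("redelivered message, possible duplicate");
                }
                writer.insert(new String(delivery.getBody(), StandardCharsets.UTF_8));
                // Ack only after the insert; if the channel dies between the insert
                // and this ack, the message comes back and is inserted again.
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            };
            channel.basicConsume("logs", false, deliverCallback, consumerTag -> { });
        }

        public interface MongoWriter {                 // placeholder for the real MongoDB insert
            void insert(String message);
        }
    }

Logging the redelivered flag (or doing an idempotent insert keyed on a message ID) is usually the quickest way to confirm this is what is happening.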
I have a RabbitMQ topic with multiple subscribers (say 2), running in a load-balanced application server cluster (say 3 nodes).
So will a message get delivered to all (2 x 3) subscribers across all listeners in the clustered environment, or only to 2 listeners?
There's no such thing as a JMS-style "topic" in RabbitMQ (AMQP).
The closest thing to a JMS topic for your scenario is a fanout exchange with 2 queues bound to it. Each queue gets a reference to a message sent to the exchange, so both consumers (one per queue) get a copy of the message.
If you have multiple consumers (e.g. 3) on each queue, the messages in that queue are distributed round-robin fashion to those consumers. Only one consumer per queue gets a message.
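A minimal sketch of that layout with the RabbitMQ Java client (exchange and queue names are illustrative):

    import com.rabbitmq.client.BuiltinExchangeType;
    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;

    public class FanoutTopology {
        public static void declare() throws Exception {
            Connection connection = new ConnectionFactory().newConnection();
            Channel channel = connection.createChannel();

            // The "topic" equivalent: a fanout exchange.
            channel.exchangeDeclare("events", BuiltinExchangeType.FANOUT, true);

            // Two queues bound to it: every message published to the exchange is copied to both.
            channel.queueDeclare("subscriber-a", true, false, false, null);
            channel.queueDeclare("subscriber-b", true, false, false, null);
            channel.queueBind("subscriber-a", "events", "");
            channel.queueBind("subscriber-b", "events", "");

            // If each of the 3 cluster nodes attaches a consumer to the same queue, the
            // messages in that queue are shared round-robin among those 3 consumers.
            channel.close();
            connection.close();
        }
    }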
We have 10 messages in ActiveMQ and we started 2 consumers, but only the first consumer consumes and processes the messages. The second consumer does not consume any messages.
If I send one more message to the queue while the first consumer is busy processing, the second consumer consumes and processes only that particular message (the one sent while the first consumer was busy). After that, it does not consume any of the pending messages.
So what I understand is that all pending messages are processed by the first consumer, not by the remaining consumers.
I want all consumers to be involved in processing the pending messages.
Thanks.
I think what you are looking at is the prefetch limit causing one consumer to hog a bunch of messages up front and thereby starving the other consumers. You need to lower the consumer prefetch limit so that the broker won't eagerly dispatch messages to the first connected consumer and allow other consumers to come online to help balance the load.
In your case a prefetch limit of one would allow all consumers to jump in and get some work.
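A minimal sketch of lowering the limit on the consumer side (broker host and port are assumptions; the limit can also be set per destination or in the broker's policy entries):

    import org.apache.activemq.ActiveMQConnectionFactory;

    public class LowPrefetchConnectionFactory {
        public static ActiveMQConnectionFactory create() {
            // jms.prefetchPolicy.queuePrefetch=1: each consumer is dispatched only one
            // message at a time, so both consumers get a share of the 10 pending messages.
            return new ActiveMQConnectionFactory(
                    "tcp://activemq-host:61616?jms.prefetchPolicy.queuePrefetch=1");
        }
    }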