Is it possible that multiple consumers of a RabbitMQ queue get the same message? - go

I am in charge of maintaining production software written in Go which uses RabbitMQ as its message queue.
Consider the following situation:
A number of goroutines publish to a queue named logs.
Another set of goroutines reads from the queue and writes the messages to a MongoDB collection.
Each publisher or consumer has its own connection and its own channel; they run in an infinite loop and never exit. (The connections and channels are established when the program starts.)
autoAck, exclusive and noWait are all set to false, and prefetch is set to 20 with global set to false, for all channels. All queues are durable, with autoDelete, exclusive and noWait all set to false.
The basic assumption was that each message in the queue would be delivered to one and only one consumer, so each message would be inserted into the database exactly once.
The problem is that there are duplicate messages in the MongoDB collection.
I would like to know: is it possible that more than one consumer gets the same message, causing them to insert duplicates?

The one case I could see with your setup where a message would be processed more than once is if one of the consumers has an issue at some point.
The situation would follow such a scenario:
Consumer gets a bunch of messages from the queue
Consumer starts processing a message
Consumer commits the message to mongodb
Either due to a RabbitMQ channel/connection issue, or some other issue on the consumer side, the consumer never acknowledges the message
Since the message hasn't been acknowledged, it is requeued at the head of the queue
The same message is processed again, causing the duplication
Such cases should show some errors in your consumers' logs.
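To make that window concrete, here is a minimal consumer sketch, assuming the rabbitmq/amqp091-go client; insertIntoMongo is a hypothetical stand-in for your actual MongoDB write. The insert happens before the Ack, so if the channel or connection drops between those two calls, the broker requeues the message and another consumer inserts it a second time.

package main

import (
	"log"

	amqp "github.com/rabbitmq/amqp091-go"
)

// insertIntoMongo is a hypothetical stand-in for the real MongoDB write.
func insertIntoMongo(body []byte) error { return nil }

func main() {
	conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	ch, err := conn.Channel()
	if err != nil {
		log.Fatal(err)
	}

	// prefetch 20, global=false, as described in the question
	if err := ch.Qos(20, 0, false); err != nil {
		log.Fatal(err)
	}

	// autoAck=false, exclusive=false, noLocal=false, noWait=false
	msgs, err := ch.Consume("logs", "", false, false, false, false, nil)
	if err != nil {
		log.Fatal(err)
	}

	for d := range msgs {
		if err := insertIntoMongo(d.Body); err != nil {
			d.Nack(false, true) // failed before the write: requeue for another consumer
			continue
		}
		// If the channel or connection dies right here, the document is already in
		// MongoDB but the broker never sees the Ack, so the message is redelivered
		// and inserted again -> duplicate.
		if err := d.Ack(false); err != nil {
			log.Printf("ack failed, message will be redelivered: %v", err)
		}
	}
}

A common mitigation is to make the insert idempotent, for example by using a unique message ID as the MongoDB _id, so a redelivered message produces a duplicate-key error instead of a second document.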

Related

Does EventStoreDB provide message ordering by an event-key on the consumer side?

I have been exploring EventStoreDB and trying to understand more about the ordering of messages on the consumer side. I have read about persistent subscriptions and also the Pinned consumer strategy here.
I have a scenario wherein inventory updates get pushed to EventStoreDB, and different streams get created for the different unique inventoryIds in the inventory events.
We have multiple consumers with the same consumerGroup name to read these inventory events. We are using Pinned Persistent Subscription with ResolveLinkTos enabled.
My question:
Will every message from a particular stream always go to the same consumer instance of the consumerGroup?
If the answer to the above question is yes, will every message from that particular stream reach the particular consumer instance in the same order as the events were ingested?
The documentation warns that ordered message processing using persistent subscriptions is not guaranteed. Any strategy delivers messages with best-effort ordering guarantees, if applicable.
There are a few reasons for this, some of those are:
Spreading out messages across a consumer group leads to a non-linearised checkpoint commit. This means that some messages can be processed before other messages.
Persistent subscriptions attempt to buffer messages, but when a timeout happens on the client side, the whole buffer is redelivered, which can eventually break the processing order
Built-in retry policies essentially can break the message order at any time
Most event log-based brokers, if not all, don't even attempt to guarantee ordered message delivery across multiple consumers. I often hear "but Kafka does it", ignoring the fact that Kafka delivers messages from one partition to at most one consumer in a group. There's no load balancing of one partition between multiple consumers due to exactly the same issue. That being said, EventStoreDB is still not a broker, but a database for events.
So, here are the answers:
Will every message from a particular stream always go to the same consumer instance of the consumer group?
No. It might work most of the time, but it will eventually break.
Will every message from that particular stream reach the particular consumer instance in the same order as the events were ingested?
Most of the time, yes, but again, if a message is being retried, you might get the next message before the previous one is Acked.
Overall, load-balanced ordered processing of messages that aren't pre-partitioned on the server is not an easy task. At most, you get messages re-delivered if the checkpoint fails to persist at some point and the consumers restart.
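If redeliveries are the main concern, one defensive pattern on the consumer side is to remember the last event number processed per stream and skip anything already seen. The sketch below is a plain Go illustration of that idea (it is not the EventStoreDB client API); it only deduplicates retries, it does not restore ordering, which matches the best-effort guarantee described above.

package dedup

import "sync"

// StreamTracker remembers the highest event number processed per stream so a
// consumer can skip events it has already handled when a persistent
// subscription redelivers them.
type StreamTracker struct {
	mu   sync.Mutex
	last map[string]uint64 // streamID -> last processed event number
}

func NewStreamTracker() *StreamTracker {
	return &StreamTracker{last: make(map[string]uint64)}
}

// ShouldProcess reports whether the event is new for its stream and records it.
// Note: if an event arrives ahead of an earlier one (a true ordering break),
// the earlier event will later be skipped; handling that requires parking or
// retrying events, which is exactly why the docs call the guarantee best-effort.
func (t *StreamTracker) ShouldProcess(streamID string, eventNumber uint64) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if n, ok := t.last[streamID]; ok && eventNumber <= n {
		return false
	}
	t.last[streamID] = eventNumber
	return true
}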

AWS SQS - Queue not delivering any messages until Visibility Timeout expires for one message

EDIT: Solved this one while I was writing it up :P -- I love those kind of solutions. I figured I'd post it anyway, maybe someone else will have the same problem and find my solution. Don't care about points/karma, etc. I just already wrote the whole thing up, so figured I'd post it and the solution.
I have an SQS FIFO queue. It is using a dead letter queue. Here is how it had been configured:
I have a single producer microservice, and I have 10 ECS images that are running as consumers.
It is important that we process the messages close to the time they are delivered in the queue for business reasons.
We're using a fairly recent version of the AWS SDK Golang client package for both producer and consumer code (if important, I can go look up the version, but it is not terribly outdated).
I capture the logs for the producer so I know exactly when messages were put in the queue and what the messages were.
I capture aggregate logs for all the consumers, so I have a full view of all 10 consumers and when messages were received and processed.
Here's what I see under normal conditions looking at the logs:
Message put in the queue at time x
Message received by one of the 10 consumers at time x
Message processed by consumer successfully
Message deleted from queue by consumer at time x + (0-2 seconds)
Repeat ad infinitum for up to about 700 messages / day at various times per day
But the problem I am seeing now is that some messages are not being processed in a timely manner. Occasionally we deliberately fail processing a message because of the state of the system for that message (e.g. maybe users are still logged in, so it should back off and retry... which it does). The problem is that when a consumer fails a message, the queue stops delivering any other messages to any other consumers.
"Failure to process a message" here just means the message was received, but the consumer declared it a failure, so we just log an error, and do not proceed to delete it from the queue. Thus, the visibility timeout (here 5m) will expire and it will be re-delivered to another consumer and retried up to 10 times, after which it will go to the dead letter queue.
After delving into the logs and analyzing it, here's what I'm seeing:
Process begins like above (message produced, consumed, deleted).
New message received at time x by consumer
Consumer fails -- logs error and just returns (does not delete)
Same message is received again at time x + 5m (visibility timeout)
Consumer fails -- logs error and just returns (does not delete)
Repeat up to 10x -- message goes to dead-letter queue
New message received but it is now 50 minutes late!
Now all messages that were put in the queue between steps 2-7 are 50 minutes late (5m visibility timeout * 10 retries)
All the docs I've read tell me the queue should not behave this way, but I've verified it several times in our logs. Sadly, we don't have a paid AWS support plan, or I'd file a ticket with them. But just consider the fact that we have 10 separate consumers all reading from the same queue. They read only from this queue; no other queues are involved.
For de-duplication we are using the automated hash of the message body. Messages are small JSON documents.
My expectation would be if we have a single bad message that causes a visibility timeout, that the queue would still happily deliver any other messages it has available while there are available consumers.
OK, so it turns out I missed this little nugget of info about FIFO queues in the documentation:
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/FIFO-queues.html
When you receive a message with a message group ID, no more messages for the same message group ID are returned unless you delete the message or it becomes visible.
I was indeed using the same Message Group ID. Hadn't given it a second thought. Just be aware that if you do that and any one of your messages fails to process, it will hold back all other messages in the queue until that message is finally dealt with. The solution for me was to change the message group ID; there is a business-logic ID I can append to it that will work for me.
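A sketch of that fix, assuming aws-sdk-go v1 (queueURL and orderID are placeholder names, and content-based deduplication is assumed to be enabled on the queue, as described above): derive the MessageGroupId from a per-entity business ID rather than a single fixed value, so a message that fails processing only blocks later messages for the same entity.

package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/sqs"
)

func main() {
	svc := sqs.New(session.Must(session.NewSession()))
	queueURL := "https://sqs.us-east-1.amazonaws.com/123456789012/example.fifo" // placeholder
	orderID := "order-42"                                                       // hypothetical business-logic ID

	_, err := svc.SendMessage(&sqs.SendMessageInput{
		QueueUrl:    aws.String(queueURL),
		MessageBody: aws.String(`{"orderId":"order-42","action":"close"}`),
		// One group per business entity: a stuck message now only holds back
		// messages for this entity, not the whole queue.
		MessageGroupId: aws.String(fmt.Sprintf("orders-%s", orderID)),
	})
	if err != nil {
		log.Fatal(err)
	}
}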

Go NATS queueing with multiple QueueSubscribe

I am creating a NATS Go queue subscriber client as follows:
nc.QueueSubscribe("foo", "my_queue", func(msg *nats.Msg) {
log.Printf("Message :%s", string(msg.Data))
})
So whenever I publish a message to the "foo" subject, sometimes it is received and sometimes not.
For example, if I send 10 messages to the "foo" subject, it receives 2 or 3 at most.
My requirements are as follows:
There should be a queue subscription.
All input events should be processed.
How do I implement QueueSubscribe in concurrent mode?
Any help appreciated.
If you start multiple queue subscribers with the same name (in your example my_queue), then a message published on "foo" goes to only one of those queue subscribers.
I am not sure from your statement if you imply that the queue subscriber sometimes misses messages or not. Keep in mind one thing: there is no persistence in NATS (there is in NATS Streaming). So if you publish messages before the subscriber is created, and if there is no other subscriber on that subject, the messages will be lost.
If you were experimenting and starting the queue subscriber from one connection and then in the same application sending messages from another connection, it is possible that the server did not register the queue subscription before it started to receive messages (again, if you were using 2 connections). If that is the case, you would need to flush the connection after creating the subscription and before starting sending: nc.Flush().
Finally, there is nothing special about using queue subscribers in concurrent mode. This is what they are for: load balancing the processing of messages on the same subject across subscribers belonging to the same group. The only thing you have to be careful of, if you create multiple queue subscribers in the same application, is either not to share the message handler or, if you do, to use locking, since the message handler will be invoked concurrently if messages arrive fast enough.
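A minimal sketch of those points, using github.com/nats-io/nats.go: two subscribers in the same queue group share the load, the connection is flushed after subscribing so the server registers the subscriptions before anything is published, and the shared counter in the handler is protected by a mutex because handlers can run concurrently. (This uses a single connection; with two connections, the Flush before publishing is what matters.)

package main

import (
	"log"
	"sync"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	var mu sync.Mutex
	received := 0

	handler := func(msg *nats.Msg) {
		mu.Lock() // shared handler: lock around shared state
		received++
		mu.Unlock()
		log.Printf("Message: %s", string(msg.Data))
	}

	// Two members of the same queue group: each message published on "foo"
	// is delivered to only one of them.
	if _, err := nc.QueueSubscribe("foo", "my_queue", handler); err != nil {
		log.Fatal(err)
	}
	if _, err := nc.QueueSubscribe("foo", "my_queue", handler); err != nil {
		log.Fatal(err)
	}

	// Make sure the server has registered the subscriptions before publishing.
	if err := nc.Flush(); err != nil {
		log.Fatal(err)
	}

	for i := 0; i < 10; i++ {
		nc.Publish("foo", []byte("event"))
	}
	nc.Flush()

	time.Sleep(time.Second) // demo only: give the handlers time to run

	mu.Lock()
	log.Printf("received %d of 10 messages", received)
	mu.Unlock()
}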

JMS queue redelivery order in jboss

I send a Java object to a queue from a thread. The relevant MDB's onMessage is invoked with a message from the queue. In onMessage, I match a key present in the message with a key in a cache; if the key is not present, I throw a custom RuntimeException just to make the container redeliver the message. (I have another autonomous system that adds the key to the cache from the external system's response; it may lag by 3-5 seconds.)
In such a case, does the container add this unprocessed message to the end of the queue, or is it redelivered immediately? Is there a way to delay the redelivery, assuming the queue is always filled with ~550 messages every second?
There's currently a redelivery delay feature on HornetQ, but all the subsequent messages are delivered fine.
There's a feature request in place to hold the queue for some time if a redelivery happens but that has not been implemented yet.
But if you have multiple consumers on the queue, the order will be spread across your consumers anyway. You could use message grouping and add a sleep in your onMessage if deliveryCount > 1. The message grouping guarantees that no other consumer (or another MDB instance) will receive the messages out of order.
Depending on how your application is built, and depending on your requirements, you may want to allow only a single instance of your MDB.
Also, look at consumer-window-size, where you can select no buffering on the client, which behaves better when you have multiple consumers or multiple MDB instances.

How to tell in Oracle AQ which messages have been consumed from a multiple consumer queue

I'm new to Oracle AQ.
I have created a table and a queue like so:
EXEC dbms_aqadm.create_queue_table(queue_table=>'MY_QUEUE_TABLE',
queue_payload_type=>'sys.aq$_jms_text_message',
multiple_consumers=>TRUE);
EXEC dbms_aqadm.create_queue(queue_name=>'CONTACT_INFO_QUEUE',
queue_table=>'MY_QUEUE_TABLE',
max_retries=>24,
retry_delay=>60,
retention_time=>3600);
Then I wrote a Listener to the queue in Java. When I start the Listener, it waits 6 minutes and then collects all the messages from the queue.
But I can't tell in MY_QUEUE_TABLE which messages have been consumed. Because I want a multiple consumer queue, I think the messages should stick around. However, how does Oracle AQ keep track of which messages each listener has consumed?
Each queue will keep track and ensure that all consumers have dequeued. You can look at the actual queue table to see how many consumers have consumed a message. Check aq$_my_queue_table and aq$_my_queue_table_I to see the status of messages.
