I have a setup with two queues (no exchanges), let's say queue A and queue B.
One parser puts messages on queue A, which are consumed by the Elasticsearch RabbitMQ river.
What I want now is to move messages from queue A to queue B when the ES river sends an ack to queue A, so that I can do further processing on the acked messages, knowing that ES has already processed them.
Is there any way in RabbitMQ to do this? If not, is there any other setup that can guarantee me that a message is only in queue B after being processed by ES?
Thanks in advance
I don't think this is supported by either AMQP or the rabbitmq extensions.
You could drop the river and let your consumer also publish to elasticsearch.
Since the queues are normally empty, you can just retry reading the entries from Elasticsearch a few times (with exponential backoff), so even if Elasticsearch loses the initial race, the consumer backs off a bit and can then perform the task. This might require tuning the prefetch size/count in your clients.
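To illustrate the retry idea, here is a minimal sketch of reading with exponential backoff. The `fetch` callable is hypothetical: it stands in for whatever lookup you run against Elasticsearch and is assumed to return `None` while the document is not indexed yet. The retry count and delays are arbitrary placeholders.

```python
import time


def read_with_backoff(fetch, retries=5, base_delay=0.1, sleep=time.sleep):
    """Call fetch() until it returns something, doubling the wait each time.

    fetch is a hypothetical callable that returns None while the entry
    is not yet visible in Elasticsearch.
    """
    delay = base_delay
    for attempt in range(retries):
        result = fetch()
        if result is not None:
            return result
        if attempt < retries - 1:
            sleep(delay)   # back off before the next attempt
            delay *= 2     # exponential growth of the wait
    return None            # caller decides what to do (e.g. requeue)
```

If the function returns `None` after the last attempt, the consumer can reject the message so it is tried again later instead of being lost.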
I have been exploring EventStoreDB and trying to understand more about the ordering of messages on the consumer side. Read about persistent subscriptions and also the Pinned consumer strategy here.
I have a scenario wherein inventory updates get pushed to eventstore and different streams get created by the different unique inventoryIds in the inventory event.
We have multiple consumers with the same consumerGroup name to read these inventory events. We are using Pinned Persistent Subscription with ResolveLinkTos enabled.
My question:
Will every message from a particular stream always go to the same consumer instance of the consumerGroup?
If the answer to the above question is yes, will every message from that particular stream reach the particular consumer instance in the same order as the events were ingested?
The documentation has a warning that ordered message processing using persistent subscriptions is not guaranteed. Any strategy delivers messages with the best-effort level of ordering guarantees, if applicable.
There are a few reasons for this, some of those are:
Spreading out messages across consumer groups leads to a non-linearised checkpoint commit. It means that some messages can be processed before other messages.
Persistent subscriptions attempt to buffer messages, but when a timeout happens on the client side, the whole buffer is redelivered, which can eventually break the processing order.
Built-in retry policies can essentially break the message order at any time.
Most event log-based brokers, if not all, don't even attempt to guarantee ordered message delivery across multiple consumers. I often hear "but Kafka does it", ignoring the fact that Kafka delivers messages from one partition to at most one consumer in a group. There's no load balancing of one partition between multiple consumers due to exactly the same issue. That being said, EventStoreDB is still not a broker, but a database for events.
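As a rough illustration of the "one partition, at most one consumer in a group" rule, here is a simplified model. It is not Kafka's real assignor and not EventStoreDB's Pinned strategy; the function names and the CRC32-based hashing are assumptions made for the sketch.

```python
import zlib


def partition_for(stream_id, partitions):
    """Map a stream id to a partition with a stable hash (CRC32 here)."""
    return zlib.crc32(stream_id.encode("utf-8")) % partitions


def assign_partitions(partitions, consumers):
    """Give every partition exactly one owner; a consumer may own several."""
    return {p: consumers[p % len(consumers)] for p in range(partitions)}


def consumer_for(stream_id, partitions, consumers):
    """Find the single consumer that sees all events of this stream."""
    owners = assign_partitions(partitions, consumers)
    return owners[partition_for(stream_id, partitions)]
```

Because the mapping is fixed up front, all events for one inventory id land on one consumer and keep their order. The price is that a partition is never load-balanced across several consumers.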
So, here are the answers:
Will every message from a particular stream always go to the same consumer instance of the consumer group?
No. It might work most of the time, but it will eventually break.
will every message from that particular stream reach the particular consumer instance in the same order as the events were ingested?
Most of the time, yes, but again, if a message is being retried, you might get the next message before the previous one is Acked.
Overall, load-balancing ordered processing of messages that aren't pre-partitioned on the server is not an easy task. At most, you get messages re-delivered if the checkpoint fails to persist at some point, and the consumers restart.
We have an IBM MQ JMS queue and want to distribute the data into multiple consumers for load balancing. So if we write two JMS Clients to consume from same JMS queue what will happen? Will Messages be equally distributed across both consumers since one consumer will delete the data after it is read? Is there a possibility for data duplication, like if the same message is read by both consumers in a race condition?
My comments below are based on destructive get and not a browse get.
So if we write two JMS Clients to consume from same JMS queue what
will happen?
They will both consume messages.
Will Messages be equally distributed across both consumers since one
consumer will delete the data after it is read?
No. The "hot" consumer will be fed the next available message, assuming it is "getting" a message again before the next message arrives.
Is there a possibility for data duplication, like if the same message
is read by both consumers in a race condition?
Not if you are performing a destructive get (the default).
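The behaviour can be modelled in memory. This is only an analogy using Python's `queue.Queue`, not IBM MQ or JMS code: each `get_nowait()` removes the message, which plays the role of a destructive get, so two competing consumers never see the same message. How the messages split between the consumers is not fixed and is usually uneven.

```python
import queue
import threading


def run_competing_consumers(messages, consumer_count=2):
    """Let several threads drain one queue; return what each one received."""
    q = queue.Queue()
    for m in messages:
        q.put(m)
    received = {i: [] for i in range(consumer_count)}

    def consume(idx):
        while True:
            try:
                msg = q.get_nowait()  # removes the message from the queue
            except queue.Empty:
                return                # nothing left, this consumer stops
            received[idx].append(msg)

    threads = [threading.Thread(target=consume, args=(i,))
               for i in range(consumer_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Running it with a few thousand messages shows every message exactly once across both lists, with no guarantee that the two lists have the same length.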
I am in charge of maintaining production software written in Golang which uses RabbitMQ as its message queue.
Consider the following situation:
A number of goroutines are publishing to a queue named logs.
Another set of goroutines reads from the queue and writes the messages to a MongoDB collection.
Each publisher or consumer has its own connection and its own channel; they work in an infinite loop and never die. (The connections and channels are established when the program starts.)
autoAck, exclusive and noWait are all set to false and prefetch is set to 20 with global set to false for all
channels. All queues are durable with autoDelete, exclusive
and noWait all set to false.
The basic assumption was that each message in the queue will be delivered to one and only one consumer, so each message would be inserted in the database exactly once.
The problem is that there are duplicate messages in the MongoDB collection.
I would like to know if it is possible that more than one consumer gets the same message causing them to insert duplicates?
The one case I could see with your setup where a message would be processed more than once is if one of the consumers has an issue at some point.
The situation would follow such a scenario:
Consumer gets a bunch of messages from the queue
Consumer starts processing a message
Consumer commits the message to mongodb
Either due to a RabbitMQ channel/connection issue or some other issue on the consumer side, the consumer never acknowledges the message
Because it hasn't been acknowledged, the message is requeued at the top of the queue
The same message is processed again, causing the duplication
Such cases should show some errors in your consumers' logs.
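One common way to live with redelivery is to make the write idempotent. The sketch below is an in-memory model, not MongoDB or RabbitMQ client code, and it assumes that each published message carries a unique `message_id` (a hypothetical field the publisher would have to set). In MongoDB, a unique index on that field would play the role of the dictionary key.

```python
class IdempotentStore:
    """Stand-in for a collection with a unique key on message_id."""

    def __init__(self):
        self.docs = {}

    def insert_once(self, message_id, body):
        """Insert only if this message_id has not been stored before."""
        if message_id in self.docs:
            return False          # duplicate delivery, ignore it
        self.docs[message_id] = body
        return True


def consume(deliveries, store):
    """Process deliveries; a redelivered message hits the same key."""
    for d in deliveries:
        store.insert_once(d["message_id"], d["body"])


# The second delivery models a requeue after a lost ack.
deliveries = [
    {"message_id": "m1", "body": "first log line"},
    {"message_id": "m1", "body": "first log line"},
    {"message_id": "m2", "body": "second log line"},
]
naive_inserts = [d["body"] for d in deliveries]  # what a plain insert stores
store = IdempotentStore()
consume(deliveries, store)
```

The plain insert ends up with three documents, while the keyed store holds two, one per distinct message.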
What is the best way to aggregate messages from many different sources (actually queues/topics) into a single queue/topic and then consume it? I am trying to design an application to receive messages from different topics in JMS using WebLogic.
You could write your own "aggregator" as a stand-alone Java application:
For each queue/topic have a reader in its own thread.
Each reader sends its received message again on an "aggregate queue".
Have another thread to listen on the "aggregate queue".
As a variation, you could use a JVM Queue (like java.util.concurrent.ArrayBlockingQueue) as the "aggregate queue". This is faster, does not require another MQ queue, does not need network bandwidth, but it's not persistent.
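The in-memory variation can be sketched as follows. This is a Python analogy, with a bounded `queue.Queue` standing in for `ArrayBlockingQueue` and plain iterables standing in for the JMS queues/topics; the function names and the sentinel mechanism are assumptions of the sketch, not part of any JMS API.

```python
import queue
import threading

_DONE = object()  # marker a reader puts on the queue when its source ends


def reader(source, aggregate):
    """Forward every message of one source to the aggregate queue."""
    for msg in source:
        aggregate.put(msg)        # blocks while the bounded queue is full
    aggregate.put(_DONE)


def aggregate_all(sources, capacity=100):
    """Run one reader thread per source and collect everything in one place."""
    aggregate = queue.Queue(maxsize=capacity)
    threads = [threading.Thread(target=reader, args=(s, aggregate))
               for s in sources]
    for t in threads:
        t.start()
    results, finished = [], 0
    while finished < len(sources):
        item = aggregate.get()
        if item is _DONE:
            finished += 1
        else:
            results.append(item)  # the single "aggregate" consumer
    for t in threads:
        t.join()
    return results
```

Order is preserved per source but interleaving between sources is arbitrary, and anything still sitting in the in-memory queue is lost if the process crashes.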
Another idea is to use a "Message driven bean (MDB)" for each incoming queue/topic:
Again, each of these MDBs just reads the message and resends it to the "aggregate queue".
Have another MDB listening on the "aggregate queue".
A few suggestions on quality requirements. I believe you have to consider them.
They will be closely related to your technical solution.
Is message loss acceptable?
If not, client acknowledgement could be considered.
E.g. with an in-memory queue sitting in the middle (incoming queue 1...n -> ArrayBlockingQueue in memory -> outgoing queue), the data in the ArrayBlockingQueue will be lost when the app crashes.
Are duplicate messages acceptable on the single outgoing queue?
I would suggest yes.
Set the applicable PossibleDuplicateFlag to make the client aware of that.
How fast do messages arrive (per second) on the different incoming queues?
One queue session is served by only a single thread, so performance has to be considered in advance.
I was wondering what is the difference between a JMS Queue and JMS Topic.
ActiveMQ page says
Topics
In JMS a Topic implements publish and subscribe semantics. When you publish a message it goes to all the subscribers who are
interested - so zero to many subscribers will receive a copy of the
message. Only subscribers who had an active subscription at the time
the broker receives the message will get a copy of the message.
Queues
A JMS Queue implements load balancer semantics. A single message will be received by exactly one consumer. If there are no
consumers available at the time the message is sent it will be kept
until a consumer is available that can process the message. If a
consumer receives a message and does not acknowledge it before closing
then the message will be redelivered to another consumer. A queue can
have many consumers with messages load balanced across the available
consumers.
I want to have 'something' that will send a copy of the message to each subscriber in the same sequence as that in which the message was received by the ActiveMQ broker.
Any thoughts?
That means a topic is appropriate. A queue means a message goes to one and only one possible subscriber. A topic goes to each and every subscriber.
It is simple as that:
Queues = Insert > Withdraw (send to single subscriber) 1:1
Topics = Insert > Broadcast (send to all subscribers) 1:n
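A toy in-memory model makes the difference concrete. This is not JMS or ActiveMQ code; the class and method names are made up for illustration, and it models a non-durable topic only.

```python
from collections import deque


class Topic:
    """Broadcast: every current subscriber gets a copy of each message."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, msg):
        for callback in self.subscribers:  # zero subscribers: msg is dropped
            callback(msg)


class PointToPointQueue:
    """Withdraw: a message is stored until exactly one receiver takes it."""

    def __init__(self):
        self.pending = deque()

    def send(self, msg):
        self.pending.append(msg)

    def receive(self):
        return self.pending.popleft() if self.pending else None
```

With two subscribers on the topic, both see every message in publish order. With two receivers on the queue, each message is handed out once and the second `receive()` gets the next message or nothing.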
Topics are for the publisher-subscriber model, while queues are for point-to-point.
A JMS topic is the type of destination in a 1-to-many model of distribution.
The same published message is received by all consuming subscribers. You can also call this the 'broadcast' model. You can think of a topic as the equivalent of a Subject in an Observer design pattern for distributed computing. Some JMS providers efficiently choose to implement this as UDP instead of TCP. For topics, message delivery is 'fire-and-forget': if no one listens, the message just disappears. If that's not what you want, you can use 'durable subscriptions'.
A JMS queue is a 1-to-1 destination of messages. The message is received by only one of the consuming receivers (please note: consistently using 'subscribers' for topic clients and 'receivers' for queue clients avoids confusion). Messages sent to a queue are stored on disk or in memory until someone picks them up or they expire. So queues (and durable subscriptions) need some active storage management, and you need to think about slow consumers.
In most environments, I would argue, topics are the better choice because you can always add additional components without having to change the architecture. Added components could be monitoring, logging, analytics, etc.
You never know at the beginning of the project what the requirements will be like in 1 year, 5 years, 10 years. Change is inevitable, embrace it :-)
Queues
Pros
Simple messaging pattern with a transparent communication flow
Messages can be recovered by putting them back on the queue
Cons
Only one consumer can get the message
Implies a coupling between producer and consumer as it’s a one-to-one relation
Topics
Pros
Multiple consumers can get a message
Decoupling between producer and consumers (publish-and-subscribe pattern)
Cons
More complicated communication flow
A message cannot be recovered for a single listener
As for the order preservation, see this ActiveMQ page. In short: order is preserved for single consumers, but with multiple consumers order of delivery is not guaranteed.
If you have N consumers then:
JMS Topics deliver messages to N of N
JMS Queues deliver messages to 1 of N
You said you are "looking to have a 'thing' that will send a copy of the message to each subscriber in the same sequence as that in which the message was received by the ActiveMQ broker."
So you want to use a Topic in order that all N subscribers get a copy of the message.
TOPIC: a topic is one-to-many communication (multipoint or publish/subscribe).
Example: imagine a publisher publishes a video on YouTube; all of its subscribers get a notification.
QUEUE: a queue is one-to-one communication.
Example: when you publish a request for a recharge, it goes to only one queue receiver.
Always remember that if the request went to all queue receivers, multiple recharges would happen, so while developing, analyze which one fits your application.
A queue is a JMS managed object used for holding messages that are waiting for a consumer. Once a message has been consumed, it is removed from the queue.
With a topic, all subscribers to the topic receive the same message when the message is published.