Does EventStoreDB provide message ordering by an event-key on the consumer side? - event-sourcing

I have been exploring EventStoreDB and trying to understand more about the ordering of messages on the consumer side. I have read about persistent subscriptions and the Pinned consumer strategy.
In my scenario, inventory updates are pushed to EventStoreDB, and a separate stream is created for each unique inventoryId in the inventory event.
We have multiple consumers with the same consumerGroup name to read these inventory events. We are using Pinned Persistent Subscription with ResolveLinkTos enabled.
My question:
Will every message from a particular stream always go to the same consumer instance of the consumerGroup?
If the answer to the above question is yes, will every message from that particular stream reach the particular consumer instance in the same order as the events were ingested?

The documentation has a warning that ordered message processing using persistent subscriptions is not guaranteed. Any strategy delivers messages with the best-effort level of ordering guarantees, if applicable.
There are a few reasons for this, some of those are:
Spreading out messages across consumer groups leads to a non-linearised checkpoint commit. It means that some messages can be processed before other messages.
Persistent subscriptions attempt to buffer messages, but when a timeout happens on the client side, the whole buffer is redelivered, which can eventually break the processing order.
Built-in retry policies can essentially break the message order at any time.
Most event log-based brokers, if not all, don't even attempt to guarantee ordered message delivery across multiple consumers. I often hear "but Kafka does it", ignoring the fact that Kafka delivers messages from one partition to at most one consumer in a group. There's no load balancing of one partition between multiple consumers due to exactly the same issue. That being said, EventStoreDB is still not a broker, but a database for events.
So, here are the answers:
Will every message from a particular stream always go to the same consumer instance of the consumer group?
No. It might work most of the time, but it will eventually break.
Will every message from that particular stream reach the particular consumer instance in the same order as the events were ingested?
Most of the time, yes, but again, if a message is being retried, you might get the next message before the previous one is Acked.
Overall, load-balancing ordered processing of messages that aren't pre-partitioned on the server is not an easy task. At most, you get messages re-delivered if the checkpoint fails to persist at some point, and the consumers restart.
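One common way to cope with best-effort ordering is to make the consumer tolerate duplicates and gaps. The sketch below is a minimal illustration in Python, not EventStoreDB client code: OrderedStreamGuard and its accept method are hypothetical names. It assumes that each delivered event carries its stream ID and its event number within that stream (numbered from 0), that the subscription starts from the beginning of each stream, and that the guard's state is visible to whichever consumer instance receives the event (for example through a shared store), since a stream may move to another instance.

```python
from dataclasses import dataclass, field


@dataclass
class OrderedStreamGuard:
    """Hypothetical helper that restores per-stream order on the consumer side."""

    # stream id -> next event number we are allowed to process
    next_expected: dict = field(default_factory=dict)
    # stream id -> {event number: payload} for events that arrived too early
    parked: dict = field(default_factory=dict)

    def accept(self, stream_id, event_number, payload):
        """Return the payloads that are now safe to process, in stream order."""
        expected = self.next_expected.get(stream_id, 0)
        if event_number < expected:
            # Duplicate caused by a retry or buffer redelivery: already handled.
            return []
        waiting = self.parked.setdefault(stream_id, {})
        waiting[event_number] = payload
        ready = []
        # Release the contiguous run starting at the expected event number.
        while expected in waiting:
            ready.append(waiting.pop(expected))
            expected += 1
        self.next_expected[stream_id] = expected
        return ready


guard = OrderedStreamGuard()
first = guard.accept("inventory-42", 0, "created")       # in order
early = guard.accept("inventory-42", 2, "adjusted")      # arrived before event 1
filled = guard.accept("inventory-42", 1, "reserved")     # gap filled, releases 1 and 2
duplicate = guard.accept("inventory-42", 1, "reserved")  # redelivery, ignored
```

An event parked behind a gap waits until the missing event arrives, so a real implementation would also need a timeout or an alert for gaps that never fill.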

Related

ActiveMQ - Competing Consumers with Selector - messages starve in the queue

ActiveMQ 5.15.13
Context: I have a single queue with multiple Consumers. I want to stop some consumers from processing certain messages. This has to be dynamic, I don't want to create separate queues for this. This works without any problems. e.g. Consumer1 ignores Stocks -> Consumer1 can process all invoices and Consumer2 can process all Stocks
But if there is a large number of messages already in the Queue (of one type, e.g. stocks) and I send a message of another type (e.g. invoices), Consumer1 won't process the message of type invoices. It will instead be idle until Consumer2 has processed all Stocks messages. It does not happen every time, but quite often.
Is there any option to change the order of the new messages coming into the queue, such that an idle consumer with matching selector picks up the new message?
Things I've already tried:
using a PendingMessageLimitStrategy -> it seems like it does not work for queues
increasing the maxPageSize and maxBrowsePageSize in the hope that once all Messages are in RAM, the Consumers will search for their messages.
Exclusive Consumers aren't an option since I want to be able to use more than one Consumer per message type.
I'm pretty sure that there is some configuration which allows this type of usage. I'm aware that there are better solutions for this issue, but sadly I can't use them easily due to other constraints.
Thanks a lot in advance!
EDIT: I noticed that when I'm refreshing on the localhost queue browser, the stuck messages get executed immediately. It seems like this action performs some sort of queue refresh where the messages get filtered based on their selector again. So I just need this action whenever a new message enters the queue...
This is a 'window' problem where the next set of 'stocks' data needs to be processed before the 'invoicing' data can be processed.
The gotcha with window problems like this is that you need to account for the fact that some messages may never come through, or a consumer may never come back online either. Also, eventually you will be asked 'how many invoices or stocks are left to be processed' (in other words, observability).
ActiveMQ has you covered: check out wildcard destinations and consumers.
Produce 'stocks' to:
queue://data.stocks.input
Produce 'invoices' to:
queue://data.invoices.input
You then set up consumers to connect to:
queue://data.*.input
Note the wildcard '*'.
ActiveMQ will match queues based on the wildcard pattern, and then process data accordingly. As a bonus, you can still use a selector.
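To make the matching rule concrete, here is a small Python sketch of how a '*' wildcard selects destinations. It is an illustration only, not ActiveMQ code: the function name matches is hypothetical, and it models just the '*' form, where names are separated by '.' and '*' stands for exactly one name.

```python
def matches(pattern: str, destination: str) -> bool:
    """Return True if a dot-separated destination fits a pattern using '*'."""
    pattern_parts = pattern.split(".")
    destination_parts = destination.split(".")
    # '*' replaces exactly one name, so both sides need the same number of names.
    if len(pattern_parts) != len(destination_parts):
        return False
    return all(
        p == "*" or p == d
        for p, d in zip(pattern_parts, destination_parts)
    )


queues = ["data.stocks.input", "data.invoices.input", "data.stocks.output"]
selected = [q for q in queues if matches("data.*.input", q)]
```

With this rule, a consumer on data.*.input sees both input queues, while each message type still sits in its own queue and can be counted separately.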

duplicate events by consumer

We observed that one of our consumers picks up the same events multiple times from a Kafka topic. We have the settings below on the consumer application side:
spring.kafka.consumer.enable-auto-commit=false and spring.kafka.consumer.auto-offset-reset=earliest.
How can the consumer application avoid these duplicates?
Do we need to fine-tune the above configuration settings to stop the consumer from picking up the events multiple times from the Kafka topic?
Since you've disabled auto commits, you do need to fine tune when you actually commit a record, otherwise you could have at least once processing.
You could also read the examples of the exactly-once processing capabilities using transactions and idempotent producers.
The auto.offset.reset setting only applies if your consumer group has been removed, or never existed at all (you're not committing anything). In that case, you're always going to read from the beginning of the topic.
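The effect of committing only after processing can be shown with a small simulation. This is a sketch in plain Python, not Kafka client or Spring code: consume, the in-memory log, and the processed_ids set are all hypothetical stand-ins, and it assumes every event carries a unique ID. It shows that a crash between processing and commit causes redelivery (at-least-once), and that remembering processed IDs keeps the handler from running twice.

```python
def consume(log, committed_offset, processed_ids, handle, crash_before_commit_at=None):
    """Process records from the committed offset; commit only after handling.

    Returns the committed offset the consumer group would have afterwards.
    """
    for offset in range(committed_offset, len(log)):
        event_id, payload = log[offset]
        if event_id not in processed_ids:
            handle(payload)
            # In practice this set would live in a durable store.
            processed_ids.add(event_id)
        if offset == crash_before_commit_at:
            # Simulated crash: the record was handled but its offset was never committed.
            return committed_offset
        committed_offset = offset + 1
    return committed_offset


log = [("e1", "a"), ("e2", "b"), ("e3", "c")]
handled = []
seen = set()

# First run crashes after handling e2 but before committing it.
offset_after_crash = consume(log, 0, seen, handled.append, crash_before_commit_at=1)
# Restart: e2 is delivered again, recognised as processed, and skipped.
offset_after_restart = consume(log, offset_after_crash, seen, handled.append)
```

Without the processed_ids check, the restart would handle "b" a second time, which is the duplicate behaviour described in the question.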

How does Mass Transit handle retries deduplication and message id generation when using in-memory outbox

Mass Transit has an in-memory "outbox" implementation that I think will handle the majority of the concerns and challenges I am looking to overcome. However, I cannot find much documentation that describes its capabilities in the detail I am looking for. A lot of these questions came about after watching a video where Udi Dahan explains how to handle reliable messaging without distributed transactions (https://vimeo.com/111998645).
Does the in-memory outbox handle failures that may happen when trying to send a message to the queue? So for example: A consumer generates 3 messages that are collected in the outbox. The consumer completes without issue. The collected messages in the outbox start being processed.
If for some reason while processing the collected messages there is a network issue (or other issue) and message 2 fails to be sent, what will happen to messages 2 and 3? Is there any sort of retry policy?
What happens if a message being processed in the outbox is successfully added to the queue but is unsuccessfully marked as sent in the outbox? Will there be another attempt to send the message to the queue?
Assuming the outbox will retry sending a message to a queue if there is some sort of failure is the message ID guaranteed to be consistent between attempts? Having a consistent Message ID is important for de-duplication to ensure we do not process the same message multiple times.
When a message is consumed is there any de-duplication that takes place? (This ties back to 1.C)
How does Mass Transit track processed records for each consumer? Do the storage engines take care of this responsibility?
Is there any sort of "transaction" exposed to the consumer that allows you to clear the collected message in the outbox without throwing an exception or is throwing an exception the only way to rollback the outbox?
What about messages that are generated outside of a consumer, Is there a way to rollback messages collected in the outbox (example: A WebAPI controller action)?
Is there a recommendation to use the DTC features of Mass Transit instead of outbox or vice versa or use them both?
Currently Mass Transit does not have an outbox implementation that can survive a process crash. Is there a plan to include such a feature? Is there a road map this is tracked on?
The in-memory outbox defers any message send/publish/respond calls until the consumer has completed all processing. This includes regular consumers and sagas. The very last thing the consumer does is send/publish any deferred messages, after which the incoming message is acknowledged (and removed from the queue). With that said, most of the remaining items in your question aren't relevant, because it isn't writing messages to a database, and then processing them afterwards.
No
No
Don't use the DTC, it isn't even supported in .NET Core
No plans, nothing on the roadmap
As you said at the start, the in-memory outbox handles 99.9% of the cases. A well-designed saga and supporting services can push that even higher, ensuring idempotency and eventually successful command (or event) processing. Anything beyond what's there today is typically to support poorly designed systems and just creates way too much complexity with extra dependencies.
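The deferral described in this answer can be sketched as follows. This is a conceptual illustration in Python, not MassTransit code or its API: InMemoryOutbox, run_consumer, and transport_send are hypothetical names. It shows only the ordering of steps: collect outgoing messages while the consumer runs, send them once the consumer has finished, then acknowledge the incoming message.

```python
class InMemoryOutbox:
    """Hypothetical buffer that holds outgoing messages until the consumer is done."""

    def __init__(self, transport_send):
        self._send = transport_send
        self._pending = []

    def publish(self, message):
        # Nothing reaches the transport yet; the message is only remembered.
        self._pending.append(message)

    def flush(self):
        for message in self._pending:
            self._send(message)
        self._pending.clear()

    def discard(self):
        self._pending.clear()


def run_consumer(consumer, incoming, transport_send):
    """Return True if the incoming message would be acknowledged."""
    outbox = InMemoryOutbox(transport_send)
    try:
        consumer(incoming, outbox)
        # The last step before acknowledging: send everything that was deferred.
        outbox.flush()
    except Exception:
        # Unsent messages are dropped and the incoming message stays on the
        # queue for redelivery. Anything sent before a mid-flush failure has
        # already gone out, which is why receivers should be idempotent.
        outbox.discard()
        return False
    return True


def ok_consumer(message, outbox):
    outbox.publish(message + "-step1")
    outbox.publish(message + "-step2")


def failing_consumer(message, outbox):
    outbox.publish(message + "-step1")
    raise RuntimeError("business rule violated")


sent = []
acked_ok = run_consumer(ok_consumer, "order-1", sent.append)
acked_failed = run_consumer(failing_consumer, "order-2", sent.append)
```

Because nothing is written to a database, a process crash during the flush loses the buffered messages, and the unacknowledged incoming message is simply delivered again.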

Multiple consumers working as single consumer with Masstransit

My system has a constraint for a specific consumer: messages must be handled in order, one after the other. To implement that, we set the concurrency to 1.
Now we want to scale out and add more instances of this consumer.
To keep the order I want to use distributed lock manager like 'RedLock'. It can tell each consumer if it is OK to fetch the next message.
I work with RabbitMq and my question is if there is kind of observer event that comes before getting messages from the queue. In other words I need a way to enable/disable the operation of polling messages from the queue.

How to read messages in an order from the Queue using MDB?

I have an MDB which listens to WebSphere MQ. It does not pick up the messages in the order in which they were received by the queue. How can I make it read them in that order? Is it possible? Should I not use an MDB?
In general, WMQ delivers messages in the order that they were received. However, several things can impact that...
If the queue is set to priority instead of FIFO delivery and messages arrive in different priorities, they will be delivered "out of order".
Distinguish between order produced and order delivered. If the messages are produced on a remote QMgr and there are multiple paths to the local QMgr, messages may arrive out of order.
Difference in persistence - if messages are produced on a remote QMgr and are of different persistences, the non-persistent messages may arrive faster than the persistent ones, especially with channel NPMSPEED(FAST) set.
Multiple readers/writers - Any dependency on sequence implies a single producer sending to a single consumer over a single path. Any redundancy in producers, consumers or paths between them can result in messages delivered out of sequence.
Syncpoint - To preserve sequence, ALL messages must be written and consumed under syncpoint or else ALL must be written and consumed outside of syncpoint.
Selectors - These specifically are intended to deliver messages out of order with respect to the context of all messages in the queue.
Message groups - Retrieval of grouped messages typically waits until the entire group is present. If groups are interleaved, messages are delivered out of sequence.
DLQ - if the target queue fills, messages may be delivered to the DLQ. As the target queue is drained, messages start going back there. With a queue near capacity, messages can alternate between the target queue and DLQ.
So when an MDB is receiving messages out of order, any of these things, or even several of them in combination, may be the cause. Either eliminate the dependency on message sequence (best choice) or else go back over the design and reconcile all the factors that may lead to out-of-sequence processing.
To add to T.Rob's list, MDBs use the application server WorkManager to schedule message delivery, so message order is also dependent on the order in which the WorkManager starts Work items. This is outside the control of WMQ. If you limit the MDB ServerSessionPool depth to one, then this limit is removed as there will only ever be one in-flight Work instance, but at the cost of reducing maximum throughput.
If you're running in WebSphere application server, then non-ASF mode with ListenerPorts can preserve message order subject to some transactional/backout caveats. There's a support technote here:
http://www-01.ibm.com/support/docview.wss?uid=swg21446463

Resources