CQRS - out of order messages

CQRS - out of order messages - microservices

Suppose we have 3 different services producing events, each of them publishing to its own event store.
Each of these services also consumes the events produced by the other services.
This is because each service has to react to the other services' events AND build its own projection. Each service runs on multiple instances.
The most straightforward way to do it (for me) was to put "something" in front of each ES that picks up events and publishes them (pub/sub) to the queues of every other service.
This is perfect because every service can subscribe to any topic it likes, while the event publisher does the job, and if a service is unavailable the events are still delivered. This seems to me to guarantee high scalability and availability.
My problem is the queue. I can't get an easily scalable queue that guarantees message ordering. What it actually gives me is "slightly out of order" with at-least-once delivery: to be clear, it's AWS SQS.
So, the ordering problems are:
No order guaranteed across events from the same event stream.
No order guaranteed across events from the same ES.
No order guaranteed across events from different ES (different services).
I thought I could solve the first two problems just by keeping track of the "sequence number" of the events coming from the same ES.
This would be done by tracking the last sequence number of each topic from which we are consuming events.
This should be easy both for reacting to events and for building our projection.
Then, when I pop an event from the queue, if eventSequenceNumber > previousAppliedEventSequenceNumber + 1 I re-enqueue it (or make it invisible for a certain time).
But it turns out that this solution destroys performance when events are produced at high rates (I can use a visibility timeout or other tricks; the result should be the same).
This is because when I'm expecting event 10 and I set event 11 aside for a moment, I also have to set aside every event (from that ES) with a sequence number after 11, until event 11 shows up again and is actually processed.
Other difficulties were:
Where to keep track of the event's sequence number for building the projection.
How to keep track of the event's sequence number for building the projection so that, when applying it, I have a consistent lastSequenceNumber.
What am I missing?
P.S.: for the third problem, think of the following scenario. We have a UserService and a CartService. The CartService has a projection that keeps track, for each user, of the products in the cart. Each cart's projection must also have the user's name and other information that comes from the UserCreated event published by the UserService. If UserCreated comes after ProductAddedToCart, the normal flow requires throwing an exception because the user doesn't exist yet.

What am I missing?
You are missing flow -- consumers pull messages from sources, rather than having sources push the messages to the consumers.
When I wake up, I check my bookmark to find out which of your messages I read last, and then ask you if there have been any since. If there have, I retrieve them from you in order (think "document message"), also writing down the new bookmarks. Then I go back to sleep.
The primary purpose of push notifications is to interrupt the sleep period (thereby reducing latency).
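The bookmark-and-pull loop described above can be sketched roughly as follows. This is a minimal in-memory illustration, not an SQS or event store client: `EventStore`, `PullConsumer`, `read_after` and `wake_up` are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class EventStore:
    """Hypothetical append-only store; positions are 1-based."""
    events: list = field(default_factory=list)

    def append(self, payload):
        self.events.append(payload)

    def read_after(self, position):
        # Return (position, payload) pairs strictly after the bookmark, in order.
        return [(i + 1, e) for i, e in enumerate(self.events) if i + 1 > position]


@dataclass
class PullConsumer:
    store: EventStore
    bookmark: int = 0                      # position of the last event we handled
    handled: list = field(default_factory=list)

    def wake_up(self):
        """Called on a timer, or early when a push notification arrives."""
        for position, payload in self.store.read_after(self.bookmark):
            self.handled.append(payload)
            self.bookmark = position       # real code would persist this with the side effect


store = EventStore()
consumer = PullConsumer(store)
store.append("UserCreated")
store.append("ProductAddedToCart")
consumer.wake_up()
consumer.wake_up()                         # nothing new, so nothing is re-applied
store.append("CartCheckedOut")
consumer.wake_up()
```

Because the consumer asks for "everything after my bookmark", ordering comes from the source's own log rather than from the transport.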
With SQS acting as a queue, the idea is that you read all of the enqueued messages at once. If there are no gaps, then you can order the collection then start processing them and acking them. If there are gaps, you either wait (leaving the messages in the queue) or you go to the event store to fetch copies of the missing messages.
There's no magic -- if the message pipeline is promising "at least once" delivery, then the consumers must take steps to recognize duplicate messages as they arrive.
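One way to picture the "read a batch, check for gaps, drop duplicates" step is a small resequencer per source. The sketch below assumes each message carries a per-source sequence number; `Resequencer`, `receive` and `missing` are hypothetical names, and fetching the missing events from the event store is left out.

```python
class Resequencer:
    """Buffers out-of-order, possibly duplicated messages from one source."""

    def __init__(self, last_applied=0):
        self.last_applied = last_applied   # highest sequence number already applied
        self.pending = {}                  # seq -> payload, waiting for a gap to close

    def receive(self, batch):
        """Take a batch of (seq, payload) and return the payloads now safe to apply, in order."""
        for seq, payload in batch:
            if seq > self.last_applied:    # at or below the bookmark means duplicate
                self.pending[seq] = payload
        ready = []
        while self.last_applied + 1 in self.pending:
            self.last_applied += 1
            ready.append(self.pending.pop(self.last_applied))
        return ready

    def missing(self):
        """Sequence numbers to fetch from the event store if waiting takes too long."""
        if not self.pending:
            return []
        upper = max(self.pending)
        return [s for s in range(self.last_applied + 1, upper) if s not in self.pending]


r = Resequencer()
first = r.receive([(2, "b"), (1, "a"), (4, "d"), (2, "b")])   # 3 has not arrived yet
gap = r.missing()
second = r.receive([(3, "c"), (1, "a")])                      # late arrival plus a duplicate
```

Events that arrive early are held rather than re-enqueued, so one late message does not force every later message back through the queue.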
If UserCreated comes after ProductAddedToCart the normal flow requires to throw an exception because the user doesn't exist yet.
Review Race Conditions Don't Exist, by Udi Dahan: "A microsecond difference in timing shouldn’t make a difference to core business behaviors."
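Applied to the cart example, that advice suggests a projection that accepts either arrival order instead of throwing. A minimal sketch, assuming a dictionary-backed read model; `CartProjection` and its handler names are hypothetical, and duplicate handling is deliberately left out:

```python
class CartProjection:
    """Read model keyed by user id; tolerates events arriving in either order."""

    def __init__(self):
        self.carts = {}

    def _cart(self, user_id):
        # Create a placeholder row on first sight of a user id, whichever event came first.
        return self.carts.setdefault(user_id, {"user_name": None, "products": []})

    def on_product_added(self, user_id, product):
        self._cart(user_id)["products"].append(product)

    def on_user_created(self, user_id, name):
        self._cart(user_id)["user_name"] = name


p = CartProjection()
p.on_product_added("u1", "book")     # arrives before the user is known
p.on_user_created("u1", "Alice")     # fills in the placeholder later
```

The projection ends up in the same state regardless of which event arrived first.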

The basic issue is assuming we can get messages IN ORDER...
This is a fallacy in distributed computing...
I suggest you design for no message ordering in your system.
As for your issues, try and use UTC time in the message body/header created by the originator and try and work around this data point. Sequence numbers are going to fail unless you have a central deterministic sequence creator (which will be a non-scalable, single point of failure).
Using sagas or state machines is one way to make sense of (business) event ordering.

Related

In event-driven architecture, is it ok to have all services send their event to a component that forwards it to the proper service?

Let's say I want to set up and event-driven architecture with services A-D where the events propagate as follows
    A
   / \
  B   C
     /
    D
In other words,
(1) A publishes an event
(2) Subscribers B and C receive A's event
(3) C publishes an event
(4) Subscriber D receives C's event
One way is to have services B and C directly listen to a queue into which A posts messages. But the issue I see with this is maintenance. Once the system becomes complicated with 1000s of subscriptions, it becomes difficult to have any visibility into how the updates are propagating.
A solution I propose to this problem is to have another service X that knows the tree in the first diagram and is responsible for directing the propagation of events according to the tree. Every service publishes its event to X, and X publishes the event to the listening services. So it's kind of a middleman, like
    A
    |
    X
   / \
  B   C
      |
      X
      |
      D
This also makes it easier to track the event propagation.
Are there any downsides to this (other than the extra cost associated with twice as much message transfer)?

You’re thinking of events like they are implemented in a Winforms UI where the publisher sends the event directly to the subscriber. That’s not how events work in an EDA architecture. The word “event” has taken on a whole new meaning.
Before we start, you’re jumbling together the ideas of a message and an event when they really need to be kept separate. A message is a request for some action to happen, while an event is notification that something has already happened. The important distinction for this discussion is that a message publisher assumes 1 or more other processes will receive and process the message. If the message is not processed by something, downstream errors will occur. An event has no such assumption and can go unread without adversely affecting anything. Another difference is that once messages are processed they are typically thrown away, whereas events are kept for an extended period (days, or weeks).
With that in mind, the ‘X’ service you talk about already exists (please don’t build one) and is integral to the process – it’s called the bus. There are two types of bus: a message bus (think RabbitMQ, MSMQ, ZeroMQ, etc.) or an event bus (Kafka, Kinesis, or Azure Event Hubs). In either case, a publisher puts a message on to the bus and subscribers get it from the bus. You may implement the bus as multiple physical servers, but when imagining it think of them all as the same logical bus.
The key point that’s tripping you up, and it’s a subtle difference, is thinking that the message bus has business logic indicating where messages go. The business logic of who gets what message is determined by the subscribers – the message bus is just a holding place for the messages to wait for pickup.
In your example, A publishes an event to the bus with a message type of “MT1”. B and C both tell the bus that they are interested in events of type “MT1”. When the bus receives the request from B and C to be notified of “MT1” messages, the bus creates a queue for B and a queue for C. When A publishes the message, the bus puts a copy in the “B-MT1” queue and a copy in the “C-MT1” queue. Note that the bus doesn’t know why B and C want to receive those messages, only that they’ve subscribed.
These messages sit there until processed by their respective subscribers (the processes can poll or the bus can push the messages, but the key idea is that the messages are held until processed). Once processed, the messages are thrown away.
For C to communicate with D, D will subscribe to messages of type “MT2” and C will publish them to the bus.
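The A/B/C/D walkthrough can be mimicked with a toy in-memory bus. This is only a sketch of the idea that subscriptions, not the bus, decide who gets what; the `Bus` class and its methods are hypothetical and do not correspond to any real broker API.

```python
from collections import defaultdict, deque


class Bus:
    """Toy in-memory bus: one queue per (message type, subscriber)."""

    def __init__(self):
        self.queues = defaultdict(dict)   # message type -> {subscriber: deque}

    def subscribe(self, subscriber, message_type):
        self.queues[message_type].setdefault(subscriber, deque())

    def publish(self, message_type, body):
        # Copy the message into every subscriber queue; the bus has no routing rules of its own.
        for queue in self.queues[message_type].values():
            queue.append(body)

    def receive(self, subscriber, message_type):
        queue = self.queues[message_type][subscriber]
        return queue.popleft() if queue else None


bus = Bus()
bus.subscribe("B", "MT1")
bus.subscribe("C", "MT1")
bus.subscribe("D", "MT2")

bus.publish("MT1", "event from A")
got_b = bus.receive("B", "MT1")
got_c = bus.receive("C", "MT1")
bus.publish("MT2", "event from C")    # C reacts by publishing its own event
got_d = bus.receive("D", "MT2")
```

Adding a new consumer of "MT1" only requires another `subscribe` call; neither A nor the bus needs to know why.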
Constantin’s answer above has a point that this is a single point of failure, but it can be managed with standard network architecture like failover servers, local message persistence, message acknowledgements, etc.
One of your concerns is that with 1000’s of subscriptions it becomes difficult to follow the path, and you’re right. This is an inherent downside of EDA and there’s nothing you can do about it. Eventual consistency is also something the business is going to complain about, but it’s part of the beast and is actually a good thing from a technical perspective because it enables more scalability. The biggest problem I’ve found using the term Eventual Consistency is that the business thinks it means hours or days, not seconds.
BTW, this whole discussion assumes the message publishers and subscribers are different apps. All the same ideas can be applied within the same address space, just with a different bus. If you’re a .NET shop, look at MediatR. For other tech stacks, there are similar solutions that I’m sure Google knows about.

If your main concern is visibility into the propagation of events (which is a very valid concern for debugging and long-term application maintenance of a distributed system), you can use a correlation identifier to trace the generation of messages from the initial event through the entire chain. You don't need to build another layer of orchestration -- let your messaging platform handle that for you.
Most messaging platforms/libraries have the concept built in: e.g., NServiceBus defines a ConversationId field in the message headers, and AMQP defines a correlation-id field in the basic messaging model.
Your system should have some kind of logging that allows you to audit messages -- the correlation ID will allow you to group all messages that result from a single command/request to make debugging distributed logic much simpler.
If you set a GUID in the client requests, you can even correlate actions in the UI to the backend API, right through all the events recursively generated.
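As a rough illustration of how a correlation identifier travels through a chain of messages, the sketch below copies it from each message to the messages it causes. The `Message` type, the `caused_by` helper and the event names are hypothetical; real platforms usually carry this value in a message header.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Message:
    type: str
    correlation_id: str
    message_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def caused_by(parent, new_type):
    """Create a follow-up message that inherits the parent's correlation id."""
    return Message(type=new_type, correlation_id=parent.correlation_id)


audit_log = []

request = Message(type="PlaceOrder", correlation_id=str(uuid.uuid4()))  # set by the client
a_event = caused_by(request, "OrderPlaced")
c_event = caused_by(a_event, "StockReserved")
unrelated = Message(type="Ping", correlation_id=str(uuid.uuid4()))
audit_log.extend([request, a_event, unrelated, c_event])

# Everything triggered by the original request, in the order it was logged.
trace = [m.type for m in audit_log if m.correlation_id == request.correlation_id]
```

Filtering the audit log on one identifier reconstructs the propagation path without any central orchestrator.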

It is OK, but the microservices shouldn't care how they get the messages in the first place. From their point of view the input messages just arrive. You will then be tempted to design your system to depend on some global order of events, which is hard in a distributed scalable system. Resist that temptation and design your system to rely only on local ordering of events (i.e. the ordering in an event stream emitted by an Aggregate in Event Sourcing + DDD).
One downside that I see is that availability and scalability may be hurt. You will then have a single point of failure for the entire system: if it fails, everything fails. And when it needs to be scaled up, you will have problems again, because you will end up with a distributed messaging system.

How to handle side effects based on multiple events in a message driven microservice system?

We are currently working in a message-driven microservice environment, and some of our messages/events are event sourced (using Apache Kafka). Now we are struggling with implementing more complex business requirements, where we have to take multiple events into account to create new events and side effects.
In the current situation we are working with devices that can produce errors and we already process them and have a single topic which contains ERROR_OCCURRED and ERROR_RESOLVED events (so they are in order). We also make sure, that all messages regarding a specific device always go onto the same partition. And both messages share an ID that identifies that specific error incident. We already have a projection that consumes those events and provides an API for our customers, s.t. they can see all occurred errors and their current state.
Now we have to deal with the following requirement:
Reporting Errors
We need a push system that reports errors of devices to our external partners, but only after 15 minutes and if they have not been resolved in that timeframe. Our first approach was to consume all ERROR_RESOLVED events, store the IDs and have another consumer that is handling the ERROR_OCCURRED events in a delayed fashion (e.g. by only consuming the next ERROR_OCCURRED event on the topic if its timestamp is at least 15 minutes old). We would then be able to know if that particular error has already been resolved and does not need to be reported (since they share a common ID with the corresponding ERROR_RESOLVED event). Otherwise we send an HTTP request to our external partner and create an ERROR_REPORTED event on a new topic. Is there any better approach for delayed and conditional message processing?
We also have to take the following special use cases into account:
Service restarts: currently we are planning to keep the list of resolved errors in memory, so if a service restarts, that list has to be created from scratch. We could just replay the ERROR_RESOLVED messages, but that may take some time, and in that time no ERROR_OCCURRED events should be processed because that may result in reporting errors that have been resolved in less than 15 minutes without us being aware of it. Are there any good practices regarding replay vs. "normal" processing?
Scaling: we may increase or decrease the number of instances of our service at any time, so the partition assignment may change during runtime. That should not be a problem if we create a consumer group for each service instance when consuming the ERROR_RESOLVED events, s.t. every instance knows all resolved errors while still only handling the ERROR_OCCURRED events of its assigned partitions (in another consumer group which is shared by all instances). Is there a better approach for handling partition reassignment and internal state?
Thanks in advance!

For side effects, I would record all "side" actions in the event store. In your particular example, when it is time to send a notification, I would issue a SEND_NOTIFICATION command that emits a NOTIFICATION_SENT event. These events would be processed by some worker process that does the actual HTTP request.
Actually I would elaborate this even further: since notifications could fail, I would have, say, two events, NOTIFICATION_REQUIRED and NOTIFICATION_SENT, so we can retry failed notifications.
And finally your logic would be "if the error was not resolved in 15 minutes and a notification was not sent, send a notification (or just discard it if it missed its timeframe)".
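That rule can be expressed as a fold over the recorded events. The sketch below assumes each event is a `(kind, error_id, timestamp)` tuple and that the caller would issue NOTIFICATION_REQUIRED for every id returned; the function name `errors_to_report` and the tuple layout are assumptions for illustration, not part of any Kafka API.

```python
from datetime import datetime, timedelta

REPORT_AFTER = timedelta(minutes=15)


def errors_to_report(events, now):
    """Fold the event history and return the error ids that still need a notification."""
    occurred, resolved, sent = {}, set(), set()
    for kind, error_id, at in events:
        if kind == "ERROR_OCCURRED":
            occurred[error_id] = at
        elif kind == "ERROR_RESOLVED":
            resolved.add(error_id)
        elif kind == "NOTIFICATION_SENT":
            sent.add(error_id)
    # Overdue, still unresolved, and no successful notification recorded yet.
    return sorted(
        error_id
        for error_id, at in occurred.items()
        if now - at >= REPORT_AFTER and error_id not in resolved and error_id not in sent
    )


t0 = datetime(2024, 1, 1, 12, 0)
history = [
    ("ERROR_OCCURRED", "e1", t0),
    ("ERROR_OCCURRED", "e2", t0),
    ("ERROR_RESOLVED", "e2", t0 + timedelta(minutes=5)),
    ("ERROR_OCCURRED", "e3", t0),
    ("NOTIFICATION_SENT", "e3", t0 + timedelta(minutes=15)),
    ("ERROR_OCCURRED", "e4", t0 + timedelta(minutes=10)),
]
pending = errors_to_report(history, now=t0 + timedelta(minutes=20))
```

Because the decision is derived entirely from recorded events, a restarted worker reaches the same answer by replaying the history, and a failed HTTP call simply leaves the id in the result for the next run.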

If nobody needs reliable messaging on transport level, how to implement reliable PubSub on business level?

This question is mostly out of curiosity. I read this article about WS-ReliableMessaging by Marc de Graauw some time ago and agreed that reliable messaging should be applied on the business level whenever possible.
Now, the question is, he explains clearly what his approach is in a point-to-point fashion. However, I fail to see how you could implement reliable messaging on the business level in a Publish/Subscribe situation.
I will try to demonstrate the difference by showing commands (point-to-point) vs. events (publish/subscribe). Note that these examples are highly simplified.
Command: Transfer(uniqueId, amount, sourceAccount, recipientAccount)
If the account holder sends this transfer, he could wait for the confirmation MoneyTransferred (assuming this event will contain a reference to the uniqueId in the Transfer command).
If the account holder doesn't receive the MoneyTransferred event within a given timeout period, he could send the same command again (of course assuming the command processor is idempotent).
So I see how reliable messaging could work on business level in a point-to-point fashion.
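A small sketch of that point-to-point scheme: the sender resends the same command until it sees a confirmation, and the processor deduplicates on the unique id. `Accounts`, `send_with_retry` and the simulated lost reply are hypothetical constructs for illustration only.

```python
class Accounts:
    """Hypothetical command processor that is idempotent on the command's unique id."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.processed = {}    # unique_id -> confirmation already produced

    def transfer(self, unique_id, amount, source, recipient):
        if unique_id in self.processed:          # a resend: answer again, move no money
            return self.processed[unique_id]
        self.balances[source] -= amount
        self.balances[recipient] += amount
        confirmation = ("MoneyTransferred", unique_id)
        self.processed[unique_id] = confirmation
        return confirmation


def send_with_retry(accounts, command, lose_first_reply=True, attempts=3):
    """Resend the same command until a confirmation comes back."""
    for attempt in range(attempts):
        reply = accounts.transfer(*command)
        if lose_first_reply and attempt == 0:
            continue           # simulate the confirmation being lost in transit
        return reply
    return None


accounts = Accounts({"alice": 100, "bob": 0})
reply = send_with_retry(accounts, ("tx-1", 30, "alice", "bob"))
```

The command is processed twice from the sender's point of view, but the money moves only once.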
Now, say the previous command succeeded and produced a MoneyTransferred event. Somewhere in the system we have an event processor (MoneyTransferEmailNotifier) that handles MoneyTransferred events and will send an email notification to the recipient of the transfer.
This MoneyTransferEmailNotifier is subscribed to MoneyTransferred events. But note that the system sending the MoneyTransferred event does not really care who or how many listeners there are to this event. The whole point is the decoupling here. I raise an event and don't care if there are zero or 20 listeners that subscribe to this event.
At this point, if there is no reliable messaging (minimally at-least-once-delivery) provided by the infrastructure, how can we prevent the loss of the MoneyTransferred event? I do want the recipient to get his e-mail notification.
I fail to see how any real 'business-level' solution will resolve this.
(1) One of the solutions I can think of is by explicitly subscribing to events on 'business level' and thereby bypassing any infrastructure component. But aren't we at that moment introducing infrastructure in our business?
(2) The other 'solution' would be by introducing a process manager that does something like this:
PM receives Transfer command
PM forwards Transfer command to the accounts subsystem
If successful, sends command SendEmailNotification(recipient) to the notification subsystem
This does seem to be the solution that DDD prescribes, correct? But doesn't this introduce more coupling?
What do you think?
Edit 2016-04-16
Maybe the root question is a little bit more simplistic: If you do not have an infrastructural component that ensures at-least or exactly-once delivery, how can you ensure (when you're in an at-most-once infrastructure) that your events emitted will be received?
Not all events need to be delivered but there are many that are key (like the example of sending the confirmation email)

This MoneyTransferEmailNotifier is subscribed to MoneyTransferred events. But note that the system sending the MoneyTransferred event does not really care who or how many listeners there are to this event. The whole point is the decoupling here. I raise an event and don't care if there are zero or 20 listeners that subscribe to this event.
Your tangle, I believe, is here - that only the publish subscribe middleware can deliver events to where they need to go.
Greg Young covers this in his talk on polyglot data (slides).
Summarizing: the pub/sub middleware is in the way. A pull based model, where consumers retrieve data from the durable event store gives you a reliable way to retrieve the messages from the store. So you pull the data from the store, and then use the business level data to recognize previous work as before.
For instance, upon retrieving the MoneyTransferred event with its business data, the process manager looks around for an EmailSent event with matching business data. If the second event is found, the process manager knows that at least one copy of the email was successfully delivered, and no more work need be done.
The push based models (pub/sub, UDP multicast) become latency optimizations -- the arrival of the push message tells the subscriber to pull earlier than it normally would.
In the extreme push case, you pack into the pushed message enough information that the subscriber(s) can act upon it immediately, and trust that the idempotent handling of the message will prevent problems when the redundant copy of the message arrives on the slower channel.
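The pull-and-compare idea might look like the following, with the event store modeled as a plain list of dictionaries. The function name, the `transfer_id` field and the `send_email` callback are assumptions made for this sketch.

```python
def run_email_process_manager(event_store, send_email):
    """Pull the whole log and send an email for every transfer that has none recorded."""
    already_sent = {e["transfer_id"] for e in event_store if e["type"] == "EmailSent"}
    for event in list(event_store):              # iterate over a snapshot; we append below
        if event["type"] == "MoneyTransferred" and event["transfer_id"] not in already_sent:
            send_email(event["recipient"])
            event_store.append({"type": "EmailSent", "transfer_id": event["transfer_id"]})
            already_sent.add(event["transfer_id"])


outbox = []
log = [
    {"type": "MoneyTransferred", "transfer_id": "tx-1", "recipient": "bob@example.com"},
    {"type": "MoneyTransferred", "transfer_id": "tx-2", "recipient": "eve@example.com"},
    {"type": "EmailSent", "transfer_id": "tx-1"},
]
run_email_process_manager(log, outbox.append)
run_email_process_manager(log, outbox.append)    # a second pass finds nothing left to do
```

A push notification would only change when this function runs, not what it decides.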

If nobody needs reliable messaging on transport level, how to implement reliable PubSub on business level?
The original article does not state that "nobody needs reliable messaging on transport level"; it states that the ordering of messages should be enforced at the business level because, in some cases, this ordering is an important characteristic of the business.
In any case, PubSub is at the infrastructure level, you can't say that you implement PubSub at the business level. It doesn't make sense.
But then how could you ensure only-once delivery at the business level? By using a Saga/Process manager. One of their important responsibilities is exactly that. You can combine that with idempotent Aggregates. Also, you could identify terms from the Ubiquitous Language that emphasize ordering, such as transaction phase, and include them in your domain models (for example as properties of the events).
If you do not have an infrastructural component that ensures at-least
or exactly-once delivery, how can you ensure (when you're in an
at-most-once infrastructure) that your events emitted will be
received?
If you do not have at-least-once delivery, then you could start from the first event, the one that initiates the whole process. I would use event polling and a Saga that ensures that every important step in the process is reached at the right moment.
In your case, as the sending of the email is an important business aspect, I would include it as a step in the process.

An event store could become a single point of failure?

For a couple of days I've been trying to figure out how to inform the rest of the microservices that a new entity was created in microservice A, which stores that entity in MongoDB.
I want to:
Have low coupling between the microservices
Avoid distributed transactions between microservices like Two Phase Commit (2PC)
At first a message broker like RabbitMQ seems to be a good tool for the job, but then I see the problem that committing the new document in MongoDB and publishing the message to the broker are not atomic.
Why event sourcing? by eventuate.io:
One way of solving this issue is to make the schema of the documents a bit dirtier by adding a flag that says whether the document has been published to the broker, and to have a scheduled background process that searches for unpublished documents in MongoDB and publishes them to the broker using confirmations; when the confirmation arrives, the document is marked as published (using at-least-once and idempotency semantics). This solution is proposed in this and this answers.
Reading an Introduction to Microservices by Chris Richardson I ended up in this great presentation of Developing functional domain models with event sourcing where one of the slides asked:
How to atomically update the database and publish events without 2PC? (dual write problem).
The answer is simple (on the next slide)
Update the database and publish events
This is a different approach to this one that is based on CQRS a la Greg Young.
The domain repository is responsible for publishing the events, this
would normally be inside a single transaction together with storing
the events in the event store.
I think that delegating the responsibilities of storing and publishing the events to the event store is a good thing because it avoids the need for 2PC or a background process.
However, in a certain way it's true that:
If you rely on the event store to publish the events you'd have a
tight coupling to the storage mechanism.
But we could say the same if we adopt a message broker for communication between the microservices.
The thing that worries me more is that the Event Store seems to become a Single Point of Failure.
If we look this example from eventuate.io
we can see that if the event store is down, we can't create accounts or money transfers, losing one of the advantages of microservices (although the system will continue responding to queries).
So, is it correct to say that the Event Store as used in the eventuate example is a Single Point of Failure?

What you are facing is an instance of the Two Generals' Problem. Basically, you want two entities on a network to agree on something, but the network is not fail-safe. It has been proved that this is impossible.
So no matter how many new entities you add to your network, the message queue being one, you will never have 100% certainty that agreement will be reached. In fact, the opposite takes place: the more entities you add to your distributed system, the less certain you can be that an agreement will eventually be reached.
A practical answer to your case is that 2PC is not that bad if the alternative is adding even more complexity and single points of failure. If you absolutely do not want a single point of failure and want to assume that the network is reliable (in other words, that the network itself cannot be a single point of failure), you can try a P2P algorithm such as DHT, but for two peers I bet it reduces to simple 2PC.

We handle this with the Outbox approach in NServiceBus:
http://docs.particular.net/nservicebus/outbox/
This approach requires that the initial trigger for the whole operation came in as a message on the queue but works very well.

You could also create a flag for each entry inside of the event store which tells if this event was already published. Another process could poll the event store for those unpublished events and put them into a message queue or topic. The disadvantage of this approach is that consumers of this queue or topic must be designed to de-duplicate incoming messages because this pattern does only guarantee at-least-once delivery. Another disadvantage could be latency because of the polling frequency. But since we have already entered the eventually consistent area here this might not be such a big concern.
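The flag-and-poll approach could be sketched like this, with the event store as a list of dictionaries and the broker as a callable. `publish_pending` and `FlakyBroker` are hypothetical names; the broker deliberately fails once so that the retry on the next poll is visible.

```python
def publish_pending(event_store, broker_publish):
    """Publish every event not yet flagged; flag it only after the broker accepts it."""
    published = 0
    for event in event_store:
        if event["published"]:
            continue
        try:
            broker_publish(event["id"], event["data"])
        except ConnectionError:
            break                     # leave the flag unset; the next poll retries
        event["published"] = True     # a crash before this line causes a duplicate later
        published += 1
    return published


class FlakyBroker:
    """Stand-in broker that fails on its second call, to show the retry."""

    def __init__(self):
        self.calls = 0
        self.delivered = []

    def publish(self, event_id, data):
        self.calls += 1
        if self.calls == 2:
            raise ConnectionError("broker unavailable")
        self.delivered.append(event_id)


store = [{"id": i, "data": f"event-{i}", "published": False} for i in (1, 2, 3)]
broker = FlakyBroker()
first_poll = publish_pending(store, broker.publish)    # publishes 1, fails on 2
second_poll = publish_pending(store, broker.publish)   # retries 2, then publishes 3
```

Stopping at the first failure keeps the publish order aligned with the store order, at the cost of some latency; the window between publishing and setting the flag is where the at-least-once duplicates come from.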

How about if we have two event stores, and whenever a Domain Event is created, it is queued onto both of them. And the event handler on the query side, handles events popped from both the event stores.
Of course, every event should be idempotent.
But wouldn’t this solve our problem of the event store being a single point of entry?

Not particularly a MongoDB solution, but have you considered leveraging the Streams feature introduced in Redis 5 to implement a reliable event store? Take a look at this intro.
I find that it has a rich set of features like message tailing, message acknowledgement, and the ability to extract unacknowledged messages easily. This surely helps to implement at-least-once messaging guarantees. It also supports load balancing of messages using the "consumer group" concept, which can help with scaling the processing part.
Regarding your concern about being the single point of failure, as per the documentation, streams and consumer information can be replicated across nodes and persisted to disk (using regular Redis mechanisms I believe). This helps address the single point of failure issue. I'm currently considering using this for one of my microservices projects.

nservicebus: events and dead letter queue

Using the Pub/Sub model with NSB, the following two scenarios seemingly cause the dead-letter queue to fill up, eventually resulting in an "Insufficient resources" error.
1) Publishing an event type that has no subscribers
2) Subscriber is offline
For our purposes we are not interested in historical events when the subscriber starts up, so the incoming queue is purged on startup. Events published while the subscriber is offline fill up the dead-letter queue, however.
Have I misunderstood commands vs. events? This is the behaviour I was expecting from commands, but I expected events to disappear if not subscribed to.

When using NServiceBus, events are considered just as important as commands, and thus are subject to the same guarantees regarding durability, delivery, etc.
So, if your subscriber does not care about events when it is offline, it could unsubscribe before shutting down. That way, it's an explicit decision made by your subscriber that it does not care about what happens when it's not around to hear it. Just make sure that it doesn't get confused or choke if there are a few (old) events lying in its input queue when it comes back online, because events might get published between the time the unsubscribe message is sent and the time it reaches the publisher.
Another option is to supply the [TimeToBeReceived(...)] attribute on your event messages, but that should only be used if it can be safely determined that the event contents lose their relevance after a fixed time for all subscribers.
