I'm working on architecting a microservice solution where most code will be C#, most likely with Angular for any front end. My question is about message chaining. I am still figuring out which message broker to use: Azure Service Bus, RabbitMQ, etc. There is a concept I haven't found much about.
How do I handle cases where I want to fire a message when a specific set of messages has fired? An example (not part of my actual solution): I want to notify someone when they pay a bill. We send a message "PAIDBILL",
which fires off microservices that process it independently:
FinanceService: debit the ledger and fire "PaymentPosted"
EmailService: email the customer saying thank you for paying the bill, then fire "CustomerPaymentEmailSent"
DiscountService: check if they get a discount for paying on time, then send "CustomerCanGetPaymentDiscount"
If all three messages have fired for the same PAIDBILL ("PaymentPosted", "CustomerPaymentEmailSent", "CustomerCanGetPaymentDiscount"),
then I want to email the customer that they will get a discount on their next bill. It must be done after all three have triggered, and the order doesn't matter. How do I schedule a new "EmailNextTimeDiscount" message without having to poll for which messages have fired every minute, hour, or day?
All I can think of is to have a SQL table which marks each step as complete (by locking the table), and when the last one is filled in, send off the message. Would this be a good solution? It feels like an anti-pattern for the microservice & message queue design.
If you're using messages (e.g. Service Bus / RabbitMQ), then I think the solution you have described is the best one. This type of design - where services have knowledge about the other domains in the system - is typically known as choreography.
You'll want to pick a service which will be responsible for this business logic. That service will need to receive all the preceding types of messages so that it can determine when (if) all the conditions have been met, which it probably wants to do by recording in a database which of the gates have already passed.
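Here is a minimal sketch of such a gate-tracking handler. It assumes Dapper for data access, a PostgreSQL-style upsert, and a hypothetical IBus abstraction over whichever broker you pick; none of these names come from the question.

```csharp
using System.Data;
using System.Threading.Tasks;
using Dapper; // assumed for the ExecuteAsync/ExecuteScalarAsync extensions

public record EmailNextTimeDiscount(string BillId);

// Stand-in for whichever broker you choose (Azure Service Bus, RabbitMQ, ...).
public interface IBus
{
    Task PublishAsync(object message);
}

public class DiscountEligibilityTracker
{
    private readonly IDbConnection _db; // assumed open
    private readonly IBus _bus;

    public DiscountEligibilityTracker(IDbConnection db, IBus bus) =>
        (_db, _bus) = (db, bus);

    // Called from the handlers of PaymentPosted, CustomerPaymentEmailSent
    // and CustomerCanGetPaymentDiscount with the corresponding gate name.
    public async Task RecordGateAsync(string billId, string gateName)
    {
        int gates;
        // Serializable isolation (or an explicit lock, as the question
        // suggests) prevents two concurrent handlers from each seeing only
        // two gates and neither firing the follow-up message.
        using (var tx = _db.BeginTransaction(IsolationLevel.Serializable))
        {
            // Idempotent insert: replaying the same event records no new row
            // (PostgreSQL syntax; adjust for your database).
            await _db.ExecuteAsync(
                @"INSERT INTO bill_gates (bill_id, gate) VALUES (@billId, @gateName)
                  ON CONFLICT (bill_id, gate) DO NOTHING;",
                new { billId, gateName }, tx);

            gates = await _db.ExecuteScalarAsync<int>(
                "SELECT COUNT(DISTINCT gate) FROM bill_gates WHERE bill_id = @billId;",
                new { billId }, tx);

            tx.Commit();
        }

        // Only the handler that records the third gate sees the count reach 3.
        if (gates == 3)
            await _bus.PublishAsync(new EmailNextTimeDiscount(billId));
    }
}
```

With a unique constraint on (bill_id, gate), replayed events are no-ops, and only the transaction that records the third gate publishes the follow-up message.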
One alternative you could consider is chaining the business processes instead of doing them in parallel. So...
PAIDBILL causes FinanceService to debit the ledger and fire "PaymentPosted"
"PaymentPosted" causes EmailService to email the customer saying thank you for paying the bill and broadcast "CustomerPaymentEmailSent"
"CustomerPaymentEmailSent" causes DiscountService to check if they get a discount for paying on time, then send "CustomerCanGetPaymentDiscount"
The email you want to send is just triggered by "CustomerCanGetPaymentDiscount".
If I'm honest, I would switch around the dependency model you're using at this last stage. So, instead of some component listening for "CustomerCanGetPaymentDiscount" events from DiscountService and sending an email, I think I would instead have the DiscountService tell some other component to send an email. It seems natural to me for something that calculates discounts to know that an email should be sent. It seems less natural for something that sends emails to know about discounts (and everything else that needs emails sent). This is why I don't like architectures where the assumption is that every message should be an event and every action should be triggered by an event: it removes a lot of decisions about where domain logic can live, because the message receiver always has to know about the domain of the message sender, never vice versa.
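To illustrate that inversion, here is a sketch (the SendEmail command, the sender abstraction and the discount rule are all made up): instead of publishing "CustomerCanGetPaymentDiscount" and hoping some email component is listening, the DiscountService explicitly commands one.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical command and sender abstraction, for illustration only.
public record SendEmail(string Recipient, string Subject, string Body);

public interface ICommandSender
{
    Task SendAsync(string destinationQueue, object command);
}

public class DiscountService
{
    private readonly ICommandSender _sender;

    public DiscountService(ICommandSender sender) => _sender = sender;

    public async Task HandleBillPaidAsync(string customerEmail, DateTime paidOn, DateTime dueDate)
    {
        // The discount rule and the decision to notify live together here.
        if (paidOn <= dueDate)
        {
            await _sender.SendAsync("email-service", new SendEmail(
                customerEmail,
                "You earned a discount",
                "Thanks for paying on time; your next bill will be discounted."));
        }
    }
}
```

The email component now knows nothing about discounts; it just executes SendEmail commands from any sender.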
Related
We're trying to understand how to compensate a "saga compensation failure".
We have two microservices, and two databases, one per microservice.
Customer microservice
Contract microservice
Use case: Customer alias modification.
1. Request is sent to "Customer microservice".
   a. Customer alias is modified on the customer table, but its state is pending.
   b. A customer modified event is sent.
2. The customer modified event is received by "Contract microservice".
   a. The received customer is updated on all contracts (we're using MongoDB), since customer information is embedded in each contract.
   b. A contract updated event is sent.
3. The contract updated event is received by "Customer microservice".
   a. The customer's state is set to confirmed.
If 3.a fails, a compensation action is performed, but what if the compensation itself fails?
This can be handled with a combination of the approaches below:
1. Implement the Retry pattern for the compensation action.
2. Exception handling: the exception can be saved and resolved later through an automated process, such as a retry mechanism running in a separate application.
3. As an extension of approach #2, if the automated process is unable to resolve it, generate an exception report that can be reviewed manually, with action taken based on the issue.
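A minimal sketch of that retry-then-escalate flow, assuming made-up backoff values and a hypothetical IFailedCompensationStore for failures that exhaust their retries:

```csharp
using System;
using System.Threading.Tasks;

// Where failures that exhaust their retries get saved for the separate
// automated process / manual review described above (hypothetical).
public interface IFailedCompensationStore
{
    Task SaveAsync(string businessKey, string error);
}

public class CompensationRunner
{
    // Illustrative backoff schedule; tune for your system.
    private static readonly TimeSpan[] Backoff =
        { TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(30) };

    public async Task RunAsync(Func<Task> compensate, IFailedCompensationStore store, string businessKey)
    {
        for (var attempt = 0; attempt < Backoff.Length; attempt++)
        {
            try
            {
                await compensate();
                return; // compensation succeeded
            }
            catch (Exception ex)
            {
                if (attempt == Backoff.Length - 1)
                {
                    // Out of retries: persist the failure so the automated
                    // process (or a human) can resolve it later.
                    await store.SaveAsync(businessKey, ex.ToString());
                    return;
                }
                await Task.Delay(Backoff[attempt]);
            }
        }
    }
}
```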
It looks like you are using the term saga but you really mean you want a transaction. If you really need a transaction, use one (you can look at solutions like https://docs.temporal.io/ for providing that).
[Personally I think transactions between services are bad, and if I need a transaction between services I try to rethink my design, but your mileage may vary.]
You didn't specify why contracts would reject the change. If there are business rules involved, that's one thing; but if these are "technical reasons" like availability etc., then the thing to do is to make sure the event is persistent and was sent (e.g. with the outbox pattern on the sending side) and have the consuming service(s) handle it when they can.
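For the technical-failures case, here is a minimal outbox sketch, assuming Dapper, an outbox table, and JSON payloads (all illustrative, not from the question):

```csharp
using System;
using System.Data;
using System.Text.Json;
using System.Threading.Tasks;
using Dapper; // assumed for ExecuteAsync

public class CustomerAliasUpdater
{
    private readonly IDbConnection _db; // assumed open

    public CustomerAliasUpdater(IDbConnection db) => _db = db;

    public async Task ChangeAliasAsync(Guid customerId, string newAlias)
    {
        using var tx = _db.BeginTransaction();

        // The state change and the event row commit atomically in one
        // local transaction, so the event can't be lost even if the
        // broker is down at this moment.
        await _db.ExecuteAsync(
            "UPDATE customers SET alias = @newAlias, state = 'pending' WHERE id = @customerId;",
            new { customerId, newAlias }, tx);

        await _db.ExecuteAsync(
            @"INSERT INTO outbox (id, type, payload, dispatched)
              VALUES (@id, 'CustomerModified', @payload, FALSE);",
            new
            {
                id = Guid.NewGuid(),
                payload = JsonSerializer.Serialize(new { customerId, newAlias })
            }, tx);

        tx.Commit();
        // A separate relay polls the outbox, publishes undispatched rows to
        // the broker, then marks them dispatched. Consumers see at-least-once
        // delivery and must therefore be idempotent.
    }
}
```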
If there are business rules involved then maybe it is a bad example, but I'd expect a person can still change their alias regardless, and the compensation would be keeping some of the contracts with the old alias, or something along these lines.
By the way, it seems you have a design issue that causes needless temporal coupling between your services.
If the alias is important in contracts but owned by the customers service, the alias stored in the contracts should be considered a cached copy.
In this case the customers service can close the update regardless of what other services do: it can fire the event, and the contracts service can complete the process whenever it can. When a contract is read, you can check whether there's a newer version of the customer and, if so, fetch it. You may also (depending on the business requirements) specify that the data is correct as of the last update.
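As a sketch of that read-side check (the directory client, the version field and all type names are assumptions for illustration):

```csharp
using System;
using System.Threading.Tasks;

// Customer data as embedded in a contract, including the version it was
// cached at (hypothetical shape).
public record EmbeddedCustomer(Guid Id, string Alias, long Version);

// Client for the customers service, which owns the data (hypothetical).
public interface ICustomerDirectory
{
    Task<EmbeddedCustomer> GetAsync(Guid customerId);
}

public class ContractReader
{
    private readonly ICustomerDirectory _customers;

    public ContractReader(ICustomerDirectory customers) => _customers = customers;

    public async Task<EmbeddedCustomer> ResolveCustomerAsync(EmbeddedCustomer cached)
    {
        var latest = await _customers.GetAsync(cached.Id);
        // Depending on business requirements you might instead return the
        // cached copy and declare it "correct as of the last update".
        return latest.Version > cached.Version ? latest : cached;
    }
}
```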
BASE vs. ACID:
ISOLATION: As local transactions are committed while the Saga is running, their changes are already visible to other concurrent transactions, despite the possibility that the Saga will fail eventually, causing all previously applied transactions to be compensated. I.e., from the perspective of the overall Saga, the isolation level is comparable to “read uncommitted.”
Eventually other services will read those inconsistent events, make wrong decisions based on them, and emit further events that should never have happened at all.
In the end there will be tons of events to roll back (and how is that even possible if your system lets users do more than is allowed in the real world? Can you take back an ice cream from a kid you sold it to five minutes ago?)
I'm new to event sourcing and have some questions on my mind. Here is an example diagram.
Let's say we have 2 instances of service BookShop and 2 instances of service Wallet.
A user asks BookService_1 to buy him a book. The book service creates a BuyBookRequestCreated event and pushes it to the event bus. The event bus emits this event to the two instances of the Wallet service. Both instances now try to reserve enough money from the user's wallet, and they both emit a BookMoneyReserved event? Now on the other side the two instances of the BookShop service receive 2 events and they both try to emit a BookBought event? Or maybe the event bus will send BuyBookRequestCreated to only one subscriber? But then what happens when this selected service fails?
How is this pattern handled from the API consumer's point of view? If I make a call to some API to buy me a book, I expect it to return 200 when the book is bought. In the event sourcing pattern there is no awaiting a response from other services, so if some other service must emit an event to complete the book purchase, there is no way of telling the customer whether his purchase actually failed or not.
I'm a little lost in the whole microservice world. On one hand we have gRPC, protobufs and service meshes, but on the other we have event sourcing and event-driven architecture. When to use which? From what I can see and understand, I could use event sourcing and gRPC together? I could just use gRPC to communicate between services and save events as a form of state persistence, or maybe I completely didn't get it and should read through the articles again?
Thanks for the help.
I don't understand event sourcing
That's not your fault. The literature sucks.
Here is an example diagram.
OK, so the most important lesson I can give you about that diagram: it has nothing to do with event sourcing. It has quite a bit to do with messaging, and distributed systems, and event driven. But event sourcing is a different animal.
At the high level, a fundamental concern is that distributed systems are imperfect. So we need to accept that as a constraint in our designs, and work with it. Pat Helland's Memories, Guesses, and Apologies is a good starting point.
In normal operation, we would never have two wallet services doing the same work. They might be sharing the work (in much the same way that the Book Services share the work from the load balancer).
One way that you might share the work is to assign each message a unique number; the top wallet processes the odd-numbered messages and ignores the even-numbered ones, while the bottom wallet processes the even-numbered messages and ignores the odd-numbered ones.
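A toy sketch of that split (illustrative only; in practice a broker's competing-consumers support usually does this for you):

```csharp
using System.Threading.Tasks;

public class WalletWorker
{
    private readonly int _parity; // 0 = handle even message numbers, 1 = odd

    public WalletWorker(int parity) => _parity = parity;

    public Task HandleAsync(long messageNumber, string payload)
    {
        // The other instance owns messages of the opposite parity, so this
        // instance deliberately ignores them.
        if (messageNumber % 2 != _parity)
            return Task.CompletedTask;

        // ... reserve the money and emit BookMoneyReserved exactly once ...
        return Task.CompletedTask;
    }
}
```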
Of course, normal operation is what you want, but not necessarily what you get -- after all, distributed systems are imperfect. For some kinds of problems, there are comfortable patterns -- locking, or idempotent message handling -- and your system can continue to deliver business value.
For other kinds of problems, the answer is that some human being gets on the phone, and tells somebody else that there has been a mistake, and can we work things out?
My team is considering whether we can use MassTransit as our primary solution for sagas over RabbitMQ (vs. NServiceBus). I admit that our experience with solutions like MassTransit and NServiceBus is minimal, and we have only just started to introduce messaging into our system, so I'm sorry if my question is simple or even stupid.
However, when I reviewed the MassTransit documentation, I wasn't sure whether it is possible to solve one of our cases.
The case looks like:
One of our components will produce up to 100 messages which will be sent to a queue. These messages are the result of a single operation in the system. All of the messages will have the same correlation id and the same internal publication id.
1) Is it possible to define a single saga instance (by correlation id) which will wait until it receives all messages from the queue and then process them as a single batch?
2) Otherwise, is there any way to ensure that all of the sent messages were processed (a consistent batch)? I assume that the correlation id will serve as a way to find the existing saga instance (singleton). In the ideal case, I would like to complete a saga instance when the system has processed every message belonging to a single group (to one publication).
I looked at CompositeEvent too, but I am not sure whether I could use it to ensure that every message was processed before completing the saga for a specific correlation id.
Can you explain how this could be achieved? And what mechanism should I look at in order to correlate many messages with the same id to a single saga and then complete it once all of the messages have been consumed?
Thank you in advance for any response
What you describe is how correlation by id works. It is like that out of the box.
So, in short - when you configure correlation for your messages correctly, all messages with the same correlation id will be handled by the same saga instance.
Concerning the second question: unless you publish a separate event that informs the saga how many messages it should expect, how would it know? You can definitely schedule a long timeout and assume that all the messages will be received by the saga within it, but that's not reliable.
Composite events won't help here, since they are for messages of different types to be handled as one when all of them have arrived; they don't count the number of messages of each type. A composite event just waits for one message of each type.
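For the first question, here is a rough sketch of such a counting state machine. It uses MassTransit v8-style APIs (details vary between versions), and the PublicationStarted/PublicationPart message types, carrying the expected count up front, are assumptions:

```csharp
using System;
using MassTransit;

// Hypothetical messages: the starter carries the expected count.
public record PublicationStarted(Guid PublicationId, int ExpectedCount);
public record PublicationPart(Guid PublicationId);

public class PublicationSaga : SagaStateMachineInstance
{
    public Guid CorrelationId { get; set; }
    public string CurrentState { get; set; }
    public int Expected { get; set; }
    public int Received { get; set; }
}

public class PublicationStateMachine : MassTransitStateMachine<PublicationSaga>
{
    public State Collecting { get; private set; }
    public Event<PublicationStarted> Started { get; private set; }
    public Event<PublicationPart> PartReceived { get; private set; }

    public PublicationStateMachine()
    {
        InstanceState(x => x.CurrentState);

        // All messages for one publication correlate to the same instance.
        Event(() => Started, x => x.CorrelateById(ctx => ctx.Message.PublicationId));
        Event(() => PartReceived, x => x.CorrelateById(ctx => ctx.Message.PublicationId));

        Initially(
            When(Started)
                .Then(ctx => ctx.Saga.Expected = ctx.Message.ExpectedCount)
                .TransitionTo(Collecting));

        // Note: parts arriving before Started would need extra handling
        // (e.g. also letting PartReceived start the instance).
        During(Collecting,
            When(PartReceived)
                .Then(ctx => ctx.Saga.Received++)
                .If(ctx => ctx.Saga.Received >= ctx.Saga.Expected,
                    x => x.Finalize()));

        SetCompletedWhenFinalized(); // instance is removed once complete
    }
}
```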
The ability to receive a series of messages and then operate on them in a batch is a common case, so much so that there is a sample showing how to do just that:
Batch Sample
Each saga instance has a unique correlation identifier, and as long as those messages can be correlated to that single instance, MassTransit will manage the concurrency (either optimistic or pessimistic, and depending upon the saga storage engine).
I'd suggest reviewing the state machine in the sample, and seeing how that compares to your scenario.
This question is mostly out of curiosity. I read this article about WS-ReliableMessaging by Marc de Graauw some time ago and agreed that reliable messaging should be applied at the business level whenever possible.
Now, the question is, he explains clearly what his approach is in a point-to-point fashion. However, I fail to see how you could implement reliable messaging on the business level in a Publish/Subscribe situation.
I will try to demonstrate the difference by showing commands (point-to-point) vs. events (publish/subscribe). Note that these examples are highly simplified.
Command: Transfer(uniqueId, amount, sourceAccount, recipientAccount)
If the account holder sends this transfer, he could wait for the confirmation MoneyTransferred (assuming this event will contain a reference to the uniqueId in the Transfer command).
If the account holder doesn't receive the MoneyTransferred event within a given timeout period, he could send the same command again (of course assuming the command processor is idempotent).
So I see how reliable messaging could work on business level in a point-to-point fashion.
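A toy sketch of that resend-until-confirmed loop, with everything in-process for brevity (the processor, the confirmation string and the retry limits are stand-ins for the real messaging):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public record Transfer(Guid UniqueId, decimal Amount, string SourceAccount, string RecipientAccount);

public class TransferProcessor
{
    // Durable storage in a real system; the uniqueId is what makes retries safe.
    private readonly HashSet<Guid> _processed = new();

    public string Handle(Transfer cmd)
    {
        if (_processed.Add(cmd.UniqueId))
        {
            // First delivery: actually move the money.
            // ... debit cmd.SourceAccount, credit cmd.RecipientAccount ...
        }
        // First delivery or duplicate, the confirmation is the same.
        return $"MoneyTransferred:{cmd.UniqueId}";
    }
}

public class AccountHolder
{
    // Resend the identical command (same uniqueId) until confirmed.
    public async Task<string> TransferWithRetryAsync(TransferProcessor processor, Transfer cmd)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                // In reality: send the command, then await MoneyTransferred
                // with a timeout instead of calling in-process.
                return processor.Handle(cmd);
            }
            catch when (attempt < 5) // e.g. timed out waiting for confirmation
            {
                await Task.Delay(TimeSpan.FromSeconds(attempt));
            }
        }
    }
}
```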
Now, say the previous command succeeded and produced a MoneyTransferred event. Somewhere in the system we have an event processor (MoneyTransferEmailNotifier) that handles MoneyTransferred events and will send an email notification to the recipient of the transfer.
This MoneyTransferEmailNotifier is subscribed to MoneyTransferred events. But note that the system sending the MoneyTransferred event does not really care who or how many listeners there are for this event. The whole point is the decoupling here: I raise an event and don't care if there are zero or 20 listeners that subscribe to it.
At this point, if there is no reliable messaging (minimally at-least-once-delivery) provided by the infrastructure, how can we prevent the loss of the MoneyTransferred event? I do want the recipient to get his e-mail notification.
I fail to see how any real 'business-level' solution will resolve this.
(1) One of the solutions I can think of is explicitly subscribing to events at the 'business level' and thereby bypassing any infrastructure component. But aren't we at that moment introducing infrastructure into our business?
(2) The other 'solution' would be by introducing a process manager that does something like this:
PM receives Transfer command
PM forwards Transfer command to the accounts subsystem
If successful, sends command SendEmailNotification(recipient) to the notification subsystem
This does seem to be the solution that DDD prescribes, correct? But doesn't this introduce more coupling?
What do you think?
Edit 2016-04-16
Maybe the root question is a little bit more simplistic: If you do not have an infrastructural component that ensures at-least or exactly-once delivery, how can you ensure (when you're in an at-most-once infrastructure) that your events emitted will be received?
Not all events need to be delivered but there are many that are key (like the example of sending the confirmation email)
This MoneyTransferEmailNotifier is subscribed to MoneyTransferred events. But note that the system sending the MoneyTransferred event does not really care who or how many listeners there are for this event. The whole point is the decoupling here: I raise an event and don't care if there are zero or 20 listeners that subscribe to it.
Your tangle, I believe, is here: the assumption that only the publish/subscribe middleware can deliver events to where they need to go.
Greg Young covers this in his talk on polyglot data (slides).
Summarizing: the pub/sub middleware is in the way. A pull-based model, where consumers retrieve data from the durable event store, gives you a reliable way to retrieve the messages. So you pull the data from the store, and then use the business-level data to recognize previous work, as before.
For instance, upon retrieving the MoneyTransferred event with its business data, the process manager looks around for an EmailSent event with matching business data. If the second event is found, the process manager knows that at least one copy of the email was successfully delivered, and no more work need be done.
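A sketch of that pull-based check (the store reader, the email gateway and the event types are all hypothetical):

```csharp
using System;
using System.Threading.Tasks;

public record MoneyTransferred(Guid TransferId, string RecipientEmail);
public record EmailSent(Guid TransferId);

// Hypothetical read access to the durable event store.
public interface IEventStoreReader
{
    Task<bool> ExistsAsync<TEvent>(Func<TEvent, bool> predicate);
}

public interface IEmailGateway
{
    Task SendAsync(string to, string subject);
}

public class MoneyTransferEmailNotifier
{
    private readonly IEventStoreReader _store;
    private readonly IEmailGateway _email;

    public MoneyTransferEmailNotifier(IEventStoreReader store, IEmailGateway email) =>
        (_store, _email) = (store, email);

    public async Task ProcessAsync(MoneyTransferred evt)
    {
        // If an EmailSent event with the same business key already exists,
        // a previous (possibly redundant) delivery of this event did the work.
        if (await _store.ExistsAsync<EmailSent>(e => e.TransferId == evt.TransferId))
            return;

        await _email.SendAsync(evt.RecipientEmail, "Your transfer completed");
        // Record the EmailSent event so future replays are no-ops.
    }
}
```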
The push based models (pub/sub, UDP multicast) become latency optimizations -- the arrival of the push message tells the subscriber to pull earlier than it normally would.
In the extreme push case, you pack into the pushed message enough information that the subscriber(s) can act upon it immediately, and trust that the idempotent handling of the message will prevent problems when the redundant copy of the message arrives on the slower channel.
If nobody needs reliable messaging at the transport level, how do you implement reliable pub/sub at the business level?
The original article does not state that "nobody needs reliable messaging on the transport level"; it states that the ordering of messages should be enforced at the business level because, in some cases, this ordering is an important characteristic of the business.
In any case, PubSub is at the infrastructure level; you can't say that you implement PubSub at the business level. It doesn't make sense.
But then how could you ensure once-only delivery at the business level? By using a Saga/Process Manager. One of their important responsibilities is exactly that. You can combine that with idempotent Aggregates. You could also identify terms from the ubiquitous language that emphasize ordering, like transaction phase, and include them in your domain models (for example as properties of the events).
If you do not have an infrastructural component that ensures at-least or exactly-once delivery, how can you ensure (when you're in an at-most-once infrastructure) that your events emitted will be received?
If you do not have at-least-once delivery, then you could rely on the first event, the one that initiates the whole process. I would use event polling and a Saga that ensures that every important step in the process is reached at the right moment.
In your case, as the sending of the email is an important business aspect, I would include it as a step in the process.
How do you deal with correlated events in an event-driven architecture? Concretely, what if multiple events must be triggered in order for some action to be performed? For example, I have a microservice that listens to two events, foo and bar, and only performs an action when both of the events have arrived and have the same correlation id.
One way would be to keep an internal data structure inside the microservice that does the bookkeeping, and when everything is satisfied the appropriate action is triggered. However, the problem with this approach is that the microservice is not immutable anymore.
Is there a better approach?
A classic example is where an order comes in at sales and an event is published. Both Finance and Shipping are subscribed to the event, but shipping is also subscribed to the event coming from finance.
The funny thing is that you have no idea of the order in which the messages arrive. The event from sales might cause a technical error because the database is offline. It might get queued again, or end up in an error queue for operations to retry. In the meantime the event from finance might arrive. So theoretically the event from sales should arrive first and then the finance event, but in practice it can be the other way around.
There are a number of solutions here, but I've never liked the graphical ones. As a .NET developer I've used K2 and Windows Workflow Foundation in the past, but the solutions most flexible are created in code, not via a graphical interface.
I currently would use NServiceBus or MassTransit for this. As a side note, I currently work at Particular Software, and we make NServiceBus. NServiceBus has Sagas for this kind of work (documentation), and you can also read my weblog about a presentation, incl. code on GitHub.
The term saga is kind of loaded, but it basically handles long running (business) processes. Gregor Hohpe calls it a Process Manager (link).
To summarize what sagas do: they are instantiated by incoming messages and have state. Incoming messages are bound/dispatched to a specific saga instance based on a correlation id, for example a customer id or order id. Once the message (event) is processed, the state is stored until a new message arrives, or until the code marks the saga as completed and the state is removed from storage.
As said, in the .NET world MassTransit and NServiceBus support this, but there are most likely alternatives in other environments.
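For the order example above, here is a sketch in the spirit of the NServiceBus saga documentation (NServiceBus 7-style mapping; the OrderPlaced, OrderBilled and ShipOrder types are illustrative):

```csharp
using System.Threading.Tasks;
using NServiceBus;

public class OrderPlaced : IEvent { public string OrderId { get; set; } }  // from sales
public class OrderBilled : IEvent { public string OrderId { get; set; } }  // from finance
public class ShipOrder : ICommand { public string OrderId { get; set; } }

public class ShippingPolicyData : ContainSagaData
{
    public string OrderId { get; set; }
    public bool OrderPlaced { get; set; }
    public bool OrderBilled { get; set; }
}

public class ShippingPolicy : Saga<ShippingPolicyData>,
    IAmStartedByMessages<OrderPlaced>,
    IAmStartedByMessages<OrderBilled>
{
    protected override void ConfigureHowToFindSaga(SagaPropertyMapper<ShippingPolicyData> mapper)
    {
        // Either event may arrive first; both correlate on the order id.
        mapper.MapSaga(saga => saga.OrderId)
              .ToMessage<OrderPlaced>(msg => msg.OrderId)
              .ToMessage<OrderBilled>(msg => msg.OrderId);
    }

    public Task Handle(OrderPlaced message, IMessageHandlerContext context)
    {
        Data.OrderPlaced = true;
        return ShipWhenReady(context);
    }

    public Task Handle(OrderBilled message, IMessageHandlerContext context)
    {
        Data.OrderBilled = true;
        return ShipWhenReady(context);
    }

    private async Task ShipWhenReady(IMessageHandlerContext context)
    {
        if (Data.OrderPlaced && Data.OrderBilled)
        {
            await context.Send(new ShipOrder { OrderId = Data.OrderId });
            MarkAsComplete(); // the state is removed from storage
        }
    }
}
```

Whichever event arrives second triggers the ShipOrder command, so the arrival order genuinely doesn't matter.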
If I understand correctly, it looks like you need a CEP (complex event processor), like WSO2 CEP or similar, which does exactly that.
CEPs can aggregate events and perform actions when certain conditions have been met.