I am new to Event Sourcing and I have encountered an example which I am not quite sure the pros and cons of different approaches.
Let's say this is a bank example, I have three entities Account, Deposit and Transfer.
My idea is, when a use deposits, command bank.deposit will create two events:
deposit.created and account.deposited. Can I or should I include the deposit.created event uuid in account.deposited as a reference?
Taking to the next step, if later the bank has a transfer feature, should I made a separate event account.transfer_received or I should created a more general event account.credited to be used by both deposit and transfer?
Thanks in advance.
A good article to review is Nobody Needs Reliable Messaging. One key observation is that you often need identifiers at the domain level.
For instance, when I look at my bank account, and see that the account history includes a specific deposit, there is an identifier for the deposit that is reported in the view.
If you imagine it from an event sourced perspective, before the deposit the balance was X, and the history did not include deposit 12345; after processing the deposit, the balance was X+Y and deposit 12345 was in the account history.
(This means, among other things, that if a second copy of deposit 12345 were to appear, the domain model would know to ignore it even if the identifier for the event were different).
Now, there are reasons that you might want to keep various message ids around. See Hohpe's work on Enterprise Integration Patterns; in particular Correlation Identifier.
should I made a separate event
Usually. "Make the implicit, explicit". The fact that two events happen to have similar representations is not a reason to blur them when the ubiquitous language distinguishes the two.
It's somewhat analogous, in motivation, to providing a task based ui or eschewing the user of generic repositories.
Related
Quick question on Foreign key in Microservices. I already tried looking for answer. But, they did not give me the exact answer I was looking for.
Usecase : Every blog post will have many comments. Traditional monolith will have comments table with foreign key to blog post. However in microservice, we will have two services.
Service 1 : Post Microservie with these table fields (PostID, Name, Content)
Service 2 : Comments Microservie with these table fields (CommentID, PostID, Cpmment)
The question is, Do we need "PostID" in service 2 (Comments Microservice) ? I guess the answer is yes, as we need to know which comment belongs to which post. But then, it will create tight coupling? I mean if I delete service 1(Blog post service), it will impact service 2(Comments service) ?
I'm going to use another example I'm more familiar with to explain how I believe most people would do this.
Consider an Order Management System (OMS) and an Inventory Management System (IMS).
When a customer places an order in the company web site, we ask the OMS to create an order entry in the backend (e.g. via an HTTP endpoint).
The OMS system then broadcasts an event e.g. OrderPlaced containing all the details of the customer order. We may have a pub/sub (e.g. Redis), or a queue (e.g. RabbitMQ), or an event stream (e.g. Kafka) where we place the event (although this can be done in many other ways).
The thing is that we have one or more subscribers interested in this event. One of those could be the IMS, which has the responsibility of assigning the best inventory available every time an order is placed.
We can expect that the IMS will keep a copy of the relevant order information it received when it processed the OrderPlaced event such that it does not ask every little detail of the order to the OMS all the time. So, if the IMS needed a join with the order, instead of calling an endpoint in the Order API, it would probably just do a join with its local copy of the orders table.
Say now that our customer called to cancel her order. A customer service representative then cancelled it in the OMS Web User Interface. At that point an event OrderCanceled is broadcast. Guess who is listening for that event? Correct, the IMS receives notification and acts accordingly reversing the inventory assignation and probably even deleting the order record because it is no longer necessary on this domain.
So, as you can see, the best way to do this is by using events and making copies of the relevant details on the other domain.
Since events need time to get broadcast and processed by interested parties, we say that the order data in the IMS is eventually consistent.
Followup Questions
Q: So, if I understood right in microservises we prefer to duplicate data and get better performance? That is the concept? I mean I know the concept is scaling and flexibility but when we must share data we will just duplicate it?
Not really. That´s definitively not what I meant although it may have sounded like that due to my poor choice of words in the original explanation. It appears to me that at the heart of your question lies a lack of sufficient understanding of the concept of a bounded context.
In my explanation I meant to indicate that the OMS has a domain concept known as the order, but so does the IMS. Therefore, they both have an entity within their domain that represents it. There is a good chance that the order entity in the OMS is much richer than the corresponding representation of the same concept in the IMS.
For example, if the system I was describing was not for retail, but for wholesale, then the same concept of a "sales order" in our system corresponds to the concept of a "purchase order" in that of our customers. So you see, the same data, mapped under a different name, simply because under a different bounded context the data may have a different perspective and meaning.
So, this is the realization that a given concept from our model may be represented in multiple bounded contexts, perhaps from a different perspective and names from our ubiquitous language.
Just to give another example, the OMS needs to know about the customer, but the representation of the idea of a customer in the OMS is probably different than the same representation of such a concept or entity in the CRM. In the OMS the customer's name, email, shipping and billing addresses are probably enough representation of this idea, but for the CRM the customer encompasses much more.
Another example: the IMS needs to know the shipping address of the customer to choose the best inventory (e.g. the one in a facility closest to its final destination), but probably does not care much about the billing address. On the other hand, the billing address is fundamental for the Payment Management System (PMS). So, both the IMS and PMS may have a concept of an "order", it is just that it is not exactly the same, neither it has the same meaning or perspective, even if we store the same data.
One final example: the accounting system cares about the inventory for accounting purposes, to be able to tell how much we own, but perhaps accounting does not care about the specific location of the inventory within the warehouse, that's a detail only the IMS cares about.
In conclusion, I would not say this is about "copying data", this is about appropriately representing a fundamental concept within your bounded context and the realization that some concepts from the model may overlap between systems and have different representations, sometimes even under different names and levels of details. That's why I suggested that you investigate the idea of context mapping some more.
In other words, from my perspective, it would be a mistake to assume that the concept of an "order" only exists in the OMS. I could probably say that the OMS is the master of record of orders and that if something happens to an order we should let other interested systems know about those events since they care about some of that data because those other systems could have mapping concepts related to orders and when reacting to the changes in the master of record, they probably want to change their data as well.
From this point of view, copying some data is a side effect of having a proper design for the bounded context and not a goal in itself.
I hope that answers your question.
I am learning to develop microservices using DDD, CQRS, and ES. It is HTTP RESTful service. The microservices is about online shop. There are several domains like products, orders, suppliers, customers, and so on. The domains built in separate services. How to do the validation if the command payload relates to other domains?
For example, here is the addOrderItemCommand payload in the order service (command-side).
{
"customerId": "CUST111",
"productId": "SKU222",
"orderId":"SO333"
}
How to validate the command above? How to know that the customer is really exists in database (query-side customer service) and still active? How to know that the product is exists in database and the status of the product is published? How to know whether the customer eligible to get the promo price from the related product?
Is it ok to call API directly (like point-to-point / ajax / request promise) to validate this payload in order command-side service? But I think, the performance will get worse if the API called directly just for validation. Because, we have developed an event processor outside the command-service that listen from the event and apply the event to the materalized view.
Thank you.
As there are more than one bounded contexts that need to be queried for the validation to pass you need to consider eventual consistency. That being said, there is always a chance that the process as a whole can be in an invalid state for a "small" amount of time. For example, the user could be deactivated after the command is accepted and before the order is shipped. An online shop is a complex system and exceptions could appear in any of its subsystems. However, being implemented as an event-driven system helps; every time the ordering process enters an invalid state you can take compensatory actions/commands. For example, if the user is deactivated in the meantime you can cancel all its standing orders, release the reserved products, announce the potential customers that have those products in the wishlist that they are not available and so on.
There are many kinds of validation in DDD but I follow the general rule that the validation should be done as early as possible but without compromising data consistency. So, in order to be early you could query the readmodel to reject the commands that couldn't possible be valid and in order for the system to be consistent you need to make another check just before the order is shipped.
Now let's talk about your specific questions:
How to know that the customer is really exists in database (query-side customer service) and still active?
You can query the readmodel to verify that the user exists and it is still active. You should do this as a command that comes from an invalid user is a strong indication of some kind of attack and you don't want those kind of commands passing through your system. However, even if a command passes this check, it does not necessarily mean that the order will be shipped as other exceptions could be raised in between.
How to know that the product is exists in database and the status of the product is published?
Again, you can query the readmodel in order to notify the user that the product is not available at the moment. Or, depending on your business, you could allow the command to pass if you know that those products will be available in less than 24 hours based on some previous statistics (for example you know that TV sets arrive daily in your stock). Or you could let the customer choose whether it waits or not. In this case, if the products are not in stock at the final phase of the ordering (the shipping) you notify the customer that the products are not in stock anymore.
How to know whether the customer eligible to get the promo price from the related product?
You will probably have to query another bounded context like Promotions BC to check this. This depends on how promotions are validated/used.
Is it ok to call API directly (like point-to-point / ajax / request promise) to validate this payload in order command-side service? But I think, the performance will get worse if the API called directly just for validation.
This depends on how resilient you want your system to be and how fast you want to reject invalid commands.
Synchronous call are simpler to implement but they lead to a less resilient system (you should be aware of cascade failures and use technics like circuit breaker to stop them).
Asynchronous (i.e. using events) calls are harder to implement but make you system more resilient. In order to have async calls, the ordering system can subscribe to other systems for events and maintain a private state that can be queried for validation purposes as the commands arrive. In this way, the ordering system continues to work even of the link to inventory or customer management systems are down.
In any case, it really depends on your business and none of us can tell you exaclty what to do.
As always everything depends on the specifics of the domain but as a general principle cross domain validation should be done via the read model.
In this case, I would maintain a read model within each microservice for use in validation. Of course, that brings with it the question of eventual consistency.
How you handle that should come from your understanding of the domain. Factors such as the length of the eventual consistency compared to the frequency of updates should be considered. The cost of getting it wrong for the business compared to the cost of development to minimise the problem. In many cases, just recording the fact there has been a problem is more than adequate for the business.
I have a blog post dedicated to validation which you can find here: How To Validate Commands in a CQRS Application
Scenario:
I have 2 Microservices (which both use CQRS + Event Sourcing internally)
Microservice 1 manages Contacts (= Aggregate Root)
Microservice 2 manages Invoices (= Aggregate Root)
The recipient of an invoice must be a valid contact.
CreateInvoiceCommand:
{
"content": "my invoice content",
"recipient": "42"
}
I now read lot's of times, that the write side (= the command handler) shouldn't call the read side.
Taking this into account, the Invoices Microservice must listen to all ContactCreated and ContactDeleted events in order to know if the given recipient id is valid.
Then I'd have thousands of Contacts within the Invoices Microservice, even if I know that only a few of them will ever receive an Invoice.
Is there any best practice to handle those scenarios?
The recipient of an invoice must be a valid contact.
So the first thing you need to be aware of - if two entities are part of different aggregates, you can't really implement "apply a change to this entity only if that entity satisfies a specification", because that entity could change between the moment you evaluate the specification and the moment you perform the write.
In other words - you can only get eventual consistency across an aggregate boundary.
The aggregate is the authority for its own state, but everything else (for example, the contents of the command message), it pretty much has to accept that some external authority has checked the data.
There are a couple approaches you can take here
1) You can blindly accept that the recipient specified in the command is valid.
2) You can try to verify the validity of the recipient from some external authority (aka: a read model of some other aggregate) between receiving it from the untrusted source and submitting it to the domain model.
3) You can blindly accept the command as described, but treat the invoice as provisional until the validity of the recipient is confirmed. That means there is a second command to run on the invoice that certifies the recipient.
Note - from the point of view of the model, these different commands are equivalent, but at the application layer they don't need to be -- you can restrict access to the command to trusted sources (don't make it part of the public api, require authorization that is only available to trusted sources, etc).
Approach #3 is the most microservicy, as the two commands can be separated in time -- you can accept the CreateInvoice command as soon as it arrives, and certify the recipient asynchronously.
Where would you put approach 4), where the Invoices Microservice has it's own Contacts Store which gets updated whenever there's a ContactCreated or ContactDeleted event? Then both entities are part of the same service and boundary. Now it should be possible to make things consistent, right?
No. You've made the two entities part of the same service, but the problem was never that they were in different services, but that they are in separate aggregates -- meaning we can be changing the entity states concurrently, which means that we can't ensure that they are immediately synchronized.
If you wanted immediate consistency, you need a model that draws your boundaries differently.
For instance, if the invoice entities were modeled as part of the Contacts aggregate, then the aggregate can ensure the invariant that new invoices require a valid recipient -- the domain model uses the copy of the state in memory to confirm that the recipient was valid when we loaded, and the write into the book of record verifies that the book of record hadn't changed since the load happened.
The write of the aggregate state is a compare-and-swap in the book of record; if some concurrent process had invalidated the recipient, the CAS operation would fail.
The trade off, of course, is that any change to the Contact aggregate would also cause the invoice to fail; concurrent editing of different invoices with the same recipient goes out the window.
Aggregates are all or nothing; they aren't separable.
Now, one out might be that your Invoice aggregate has a part that must be immediately consistent with the recipient, and another part where eventually consistent, or even inconsistent, is acceptable. In which case your goal is to refactor the model.
The recipient of an invoice must be a valid contact.
This is a business rule. The question should be asked, what does this business rule mean for my application? Who should take responsibility for implementing this rule, or can the responsibility be shared?
One possibility is that, yes, the business rule is about invoices so it should be the responsibility of the Invoices Service to implement it.
However, the business rule is really about the creation of invoices. And the owner of invoice creation in your architecture is, strangely, not the Invoices Service. The reason for this is that the name of the command is CreateInvoiceCommand.
Let's think about this - the Invoices Service will never just create an invoice on its own. It just provides the capability. In this architecture, the actual owner of invoice creation is the sender of the command.
Using this line of reasoning, if the business rule is saying that invoice creation cannot happen against an invalid recipient, then it becomes the responsibility of the command sender to ensure this business rule is implemented.
This would be a very different scenario if, rather than receiving a command, the Invoices Service subscribed to events. As an example, an event called WidgetSold. In this scenario, the owner of invoice creation clearly would be the Invoicing service, and so the business rule would be implemented there instead.
If the user clicks the create invoice for contact 42 button, it's the
user's responsibility to take care that contact 42 exists
Yes, that is correct. The user's intention is to create an invoice. The business rules around invoice creation should, therefore, be enforced at this point. How this happens (or whether this happens at all) is a different question.
But what if the user doesn't care? Then it would create an invoice
with an invalid recipient id.
Also correct. As you say, there are side-effects to this approach, one of which is that you can end up with inconsistent data across your system. That is one of the realities of SOA.
Isn't this somehow similar to this: The Invoice has a currencyCode
property, it's a String.
I don't know if I agree or not. Is asking is this a valid ISO currency? different to asking is entity 42 valid according to another system?. I would think so.
Isn't it kinda the same as given recipient is not null and is valid
according to my Contacts Database?
I agree that in reality, you could implement this validation in the service. I am just saying that I don't think it's the right place for it. If you wanted to do this, you would have to either call out the another service or store all contacts locally, as you framed your question originally. I think it's simpler to just do it outside of the service.
I think that the answer depends on how resilient you want the system to be, that is, how to handle the situation in wich the Contacts Microservice is down (not responding or very slow).
1. You want to be very resilient
If the Contacts Microservice is down, you want to be able to emit invoices for some (maybe most) of the contacts. In this case you listen to the ContactCreated and ContactDeleted and maintain a (eventually consistent) local list of valid contacts; they should be named accordingly to the Ubiquitous language in this bounded context, like Payers (or something like that). Then, in the Application layer, when building the CreateInvoiceCommand you check that Payer is valid and create the command.
2. You don't need to be resilient
If the Contacts Microservice is down, you refuse to generate invoices. In this case, when building the command you make a request to the Invoices Microservice API endpoint and verify that the Payer is valid.
In any case, you check for contact's validity before the command is dispatched.
My application is an inpatient acute pain service, but similar patterns would exist for other in-hospital and ambulatory service (nutrition, physio and other therapies, social work) - ie any time a service is brought in by the treating team but then manages its own schedule of interaction based on the services' understanding of requirements and ongoing need assessments.
The task from a team/service level is to identify:
who are are our current patients?
when do we need to see them again?
So this involves tracking:
a referral (which would imply a degree of urgency: "please see today/tomorrow")
individual encounters (which would plan the next visit), and finally
a "discharge" event ("we're done here, let us know if you need us again")
By itself none of this is awfully complicated (and managed on spreadsheets and back of envelopes all around the world), but struggling to find the right FHIR resources to drop all this into.
It seems that:
care would be triggered somehow by a ReferralRequest
each visit should be in an encounter linked back to the incomingReferral
an order would allow tracking the "next visit", although potentially an appointment would do that job
obviously there are lots of observations that the service can record along the way
This leaves these questions:
What is the role of a care plan in the mix?
How to track an episode of care?
What is the role of a clinical impression?
Via what resource would we make a "summary of care given by the service" available?
What is the event that triggers the completion of a referral?
CarePlan is used to share information about what the intended course of care is for a patient - what activities are you going to do, when are you going to do them, how are they going, etc. If you wanted to track a plan to have 5 encounters over the course of 6 months, maintain a daily pain log, do a set of exercises at least twice a week, etc., CarePlan could be used. No requirement to use it if not needed though.
EpisodeOfCare is used to link activities related to a single condition that span multiple encounters. You can link encounters, procedures, etc. to EpisodeOfCare
ClinicalImpression is a new, evolving resource. Think of it as a specialized type of Observation that's intended to tie together a bunch of other observations and make an overall assessment.
A complete summary of care would typically be represented as a FHIR document - that's a Bundle instance starting with a Composition that would then organize relevant information about the care into a series of sections. If you don't want the overhead of full document, you can skip the Composition and just have a Bundle containing relevant information.
Completion of Referrals is dependent on business process. Typically the ReferralRequest instance is owned by the placing/initiating system. They decide when to mark the request as complete - be that on receiving back a report, knowledge that the transfer of care is done, sufficient elapsed time or other means. The Order/OrderResponse (to be replaced by Task) can be used to communicate back and forth between placer and filler systems to help coordinate when work is deemed to be complete.
It seems the core information is representable by a ReferralRequest that may trigger an appointment, which becomes an encounter generating a Clinical impression including an appointment for the next visit.
Clinical impression
Is often one to one with an encounter. It allows a record of new (and excluded) conditions and also recording a plan which can include an appointment action to cover "see again tomorrow"
Orders
Include a timing element so maybe useful for fixed repeating schedules.
ReferralRequest
ReferralRequest.status has values of (requested | active | cancelled | accepted | rejected | completed) make it a suitable repository to hold the episode of care by the service
Episode of care
This resource is similar in many ways to an encounter. Unlike encounters there is no mechanism to nest Episode of cares, although each encounter may relate to 0..* episode of care.s
I am designing some events that will be raised when actions are performed or data changes in a system. These events will likely be consumed by many different services and will be serialized as XML, although more broadly my question also applies to the design of more modern funky things like Webhooks.
I'm specifically thinking about how to describe changes with an event and am having difficulty choosing between different implementations. Let me illustrate my quandry.
Imagine a customer is created, and a simple event is raised.
<CustomerCreated>
<CustomerId>1234</CustomerId>
<FullName>Bob</FullName>
<AccountLevel>Silver</AccountLevel>
</CustomerCreated>
Now let's say Bob spends lots of money and becomes a gold customer, or indeed any other property changes (e.g.: he now prefers to be known as Robert). I could raise an event like this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<FullName>Bob</FullName>
<AccountLevel>Gold</AccountLevel>
</CustomerModified>
This is nice because the schema of the Created and Modified events are the same and any subscriber receives the complete current state of the entity. However it is difficult for any receiver to determine which properties have changed without tracking state themselves.
I then thought about an event like this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<AccountLevel>Gold</AccountLevel>
</CustomerModified>
This is more compact and only contains the properties that have changed, but comes with the downside that the receiver must apply the changes and reassemble the current state of the entity if they need it. Also, the schemas of the Created and Modified events must be different now; CustomerId is required but all other properties are optional.
Then I came up with this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<Before>
<FullName>Bob</FullName>
<AccountLevel>Silver</AccountLevel>
</Before>
<After>
<FullName>Bob</FullName>
<AccountLevel>Gold</AccountLevel>
</After>
</CustomerModified>
This covers all bases as it contains the full current state, plus a receiver can figure out what has changed. The Before and After elements have the exact same schema type as the Created event. However, it is incredibly verbose.
I've struggled to find any good examples of events; are there any other patterns I should consider?
You tagged the question as "Event Sourcing", but your question seems to be more about Event-Driven SOA.
I agree with #Matt's answer--"CustomerModified" is not granular enough to capture intent if there are multiple business reasons why a Customer would change.
However, I would back up even further and ask you to consider why you are storing Customer information in a local service, when it seems that you (presumably) already have a source of truth for customer. The starting point for consuming Customer information should be getting it from the source when it's needed. Storing a copy of information that can be queried reliably from the source may very well be an unnecessary optimization (and complication).
Even if you do need to store Customer data locally (and there are certainly valid reasons for need to do so), consider passing only the data necessary to construct a query of the source of truth (the service emitting the event):
<SomeInterestingCustomerStateChange>
<CustomerId>1234</CustomerId>
</SomeInterestingCustomerStateChange>
So these event types can be as granular as necessary, e.g. "CustomerAddressChanged" or simply "CustomerChanged", and it is up to the consumer to query for the information it needs based on the event type.
There is not a "one-size-fits-all" solution--sometimes it does make more sense to pass the relevant data with the event. Again, I agree with #Matt's answer if this is the direction you need to move in.
Edit Based on Comment
I would agree that using an ESB to query is generally not a good idea. Some people use an ESB this way, but IMHO it's a bad practice.
Your original question and your comments to this answer and to Matt's talk about only including fields that have changed. This would definitely be problematic in many languages, where you would have to somehow distinguish between a property being empty/null and a property not being included in the event. If the event is getting serialized/de-serialized from/to a static type, it will be painful (if not impossible) to know the difference between "First Name is being set to NULL" and "First Name is missing because it didn't change".
Based on your comment that this is about synchronization of systems, my recommendation would be to send the full set of data on each change (assuming signal+query is not an option). That leaves the interpretation of the data up to each consuming system, and limits the responsibility of the publisher to emitting a more generic event, i.e. "Customer 1234 has been modified to X state". This event seems more broadly useful than the other options, and if other systems receive this event, they can interpret it as they see fit. They can dump/rewrite their own data for Customer 1234, or they can compare it to what they have and update only what changed. Sending only what changed seems more specific to a single consumer or a specific type of consumer.
All that said, I don't think any of your proposed solutions are "right" or "wrong". You know best what will work for your unique situation.
Events should be used to describe intent as well as details, for example, you could have a CustomerRegistered event with all the details for the customer that was registered. Then later in the stream a CustomerMadeGoldAccount event that only really needs to capture the customer Id of the customer who's account was changed to gold.
It's up to the consumers of the events to build up the current state of the system that they are interested in.
This allows only the most pertinent information to be stored in each event, imagine having hundreds of properties for a customer, if every command that changed a single property had to raise an event with all the properties before and after, this gets unwieldy pretty quickly. It's also difficult to determine why the change occurred if you just publish a generic CustomerModified event, which is often a question that is asked about the current state of an entity.
Only capturing data relevant to the event means that the command that issues the event only needs to have enough data about the entity to validate the command can be executed, it doesn't need to even read the whole customer entity.
Subscribers of the events also only need to build up a state for things that they are interested in, e.g. perhaps an 'account level' widget is listening to these events, all it needs to keep around is the customer ids and account levels so that it can display what account level the customer is at.
Instead of trying to convey everything through payload xmls' fields, you can distinguish between different operations based on -
1. Different endpoint URLs depending on the operation(this is preferred)
2. Have an opcode(operation code) as an element in the xml file which tells which operation is to used to handle the incoming request.(more nearer to your examples)
There are a few enterprise patterns applicable to your business case - messaging and its variants, and if your system is extensible then Enterprise Service Bus should be used. An ESB allows reliable handling of events and processing.