What is event driven design and Domain driven design?
What are the specific benefits using of Domain driven design, event driven design in MicroServices.
Event sourcing as an implementation strategy for the persistence of state, e.g. of aggregates. This strategy should not be exposed beyond the boundaries of aggregates. The events from event sourcing should therefore only be used internally in the corresponding aggregate or in the context of CQRS to build related read models.
Domain events, on the other hand, represent a specific fact or happening that is relevant regardless of the type of persistence strategy for aggregates, for example, for integrating bounded contexts.
Event sourcing and domain events can of course be used both at the same time, but should not influence each other. The two concepts are used for different purposes and should therefore not be mixed.
Please, read from the link below to learn more: check here
Related
How do event sourcing and CQRS help to achieve decoupled architecture for microservices.
We can have microservices which own their data and others accessing that data via service even by having traditional means of persistence. Isn't?
Event sourcing and CQRS are not intended to be used to decouple services.
The main goal achived by using CQRS is to improve the performance of a service because you can use different persistence type for writes (command) and for reads (query).
Thanks to that, you can use a higly performance write type persistence like event log to store all events that happen in the service and use a relational model for example for reads where you store the information in the way that you need for your queries.
The way to achieve consistency between the two models is normally by using events generated by the command model that are consumed by the query to update the read model. The drawback of this appproach is the eventual consistency, because the update of the read model does not happen inmediately.
Higly related with cqrs is event sourcing that states that all modifications in the model should be stored like events in an event store. This way you have an historic of all the actions made in the application. The advantage of this is that you have an historic of all changes for audit purposes and that writes are extremely fast. The drawback is that if you want to get the current state you have to replay all the events since the begining.
To solve this, you use cqrs to get the actual state to make queries
I am designing and developing a microservice platform based on the specifications of http://microservices.io/
The entire framework integrates through socket thus removing the overhead of multiple HTTP requests (like most REST APIs).
A service registry host receives the registry of multiple microservice hosts, each microservice is responsible for a domain of the business. Another host we call a router (or API gateway) is responsible for exposing the microservices for consumption by third parties.
We will use the structure of Sagas (in choreography style) to distribute the requisitions, so we have some doubts:
Should a microservice issue the event in any process manager or should it be passed directly to the next microservice responsible for the chain of events? (the same logic applies to rollback)
Who should know how to build the Saga chain of events? The first microservice that receives a certain work or the router?
If an event needs to pass a very large volume of data to the next Saga event, how is this done in terms of the request structure? Is it divided into multiple Sagas for example (as a result pagination type)?
I think the main point is that in this router and microservice structure, who is responsible for building the Sagas and propagating their events.
The article Patterns for Microservices — Sync vs. Async does a great job defining many of the terms used here and has animated gifs demonstrating sync vs. async and orchestrated vs. choreographed as well as hybrid setups.
I know the OP answered his own question for his use case, but I want to try and address the questions raised a bit more generally in lieu of the linked article.
Should a microservice issue the event in any process manager or should it be passed directly to the next microservice responsible for the chain of events?
To use a more general term, a process manager is an orchestrator. A concrete implementation of this may involve a stateful actor that orchestrates a workflow, keeping track of the progress in some way. Since a saga is workflow itself (composed of both forward and compensating actions), it would be the job of the process manager to keep track of the state the saga until completion (success or failure). This typically involves the actor sending synchronous* calls to services waiting for some result before going to the next step. Parallel operations can of course be introduced and what not, but the point is that this actor dictates the progression of the saga.
This is fundamentally different from the choreography model. With this model there is no central actor keeping track of the state of a saga, but rather the saga progresses implicitly via the events that each step emits. Arguably, this is a more pure case of an event-driven model since there is no coordination.
That said, the challenge with this model is observing the state at any given point in time. With the orchestration model above, in theory, each actor could be queried for the state of the saga. In this choreographed model, we don't have this luxury, so in practice a correlation ID is added to every message corresponding to (in this case) a saga. If the messages are queryable in some way (the event bus supports it or through some other storage means), then the messages corresponding to a saga could be queried and the saga state could be reconstructed.. (effectively an event sourced modeled).
Who should know how to build the Saga chain of events? The first microservice that receives a certain work or the router?
This is an interesting question by itself and one that I have been thinking about quite a lot. The easiest and default answer would be.. hard code the saga plans and map them to the incoming message types. E.g. message A triggers plan X, message B triggers plan Y, etc.
However, I have been thinking about what a control plane might look like that manages these plans and provides the mechanism for pushing changes dynamically to message handlers and/or orchestrators dynamically. The two specific use cases in mind are changes in authorization policies or dynamically adding new steps to a plan.
If an event needs to pass a very large volume of data to the next Saga event, how is this done in terms of the request structure? Is it divided into multiple Sagas for example (as a result pagination type)?
The way I have approached this is to include references to the large data if these are objects such as a file or something. For data that are inherently streams themselves, a parallel channel could be referenced that a consumer could read from once it receives the message. I think the important distinction here is to decouple thinking about the messages driving the workflow from where the data is physically materialized which depends on the data representation.
For microservices, every microservice should be responsible for its domain business.
Should a microservice issue the event in any process manager or should it be passed directly to the next microservice responsible for the chain of events? (the same logic applies to rollback)
All events are not passed to the next microservice, but are published, then all microservices interested in the events should subscribe to them.
If there is rollback, you should consider orchestration.
Who should know how to build the Saga chain of events? The first microservice that receives a certain work or the router?
The microservice who publish the event will certainly know how to build it. There are no chain of events, because every microservice interested in the event will subscribe it separately.
If an event needs to pass a very large volume of data to the next Saga event, how is this done in terms of the request structure? Is it divided into multiple Sagas for example (as a result pagination type)?
Only publish the data others may be interested, not all. In most cases, the data are not large, and message queue can handle them efficiently
In a classical microservice architecture, you have relevant domain events published on some messaging system which allows other parts of the system to react.
Now imagine you have three microservices: Customers, Orders and Recommendation. The Recommendation microservice needs information from Customers and Orders to provide its functionality, such as the list of all customers and all the orders, which is going to be analyzed from some machine learning algorithm. Now, you need to have the state of Customers "join" Orders on the Recommandation microservice:
You have the Recommandation microservice listen to domain events published by Customers and Orders and built its own state. This leads to logic duplication since you probably have that same logic inside Customers and Orders already
On each relevant domain message from Customers and Orders, you just go to them and ask the state of a specific customer or order. This works fine, however if you have N services rather than just one which needs to build a materialized view, you will cause a big load on Customers and Orders
You get Customers and Orders themselves publish "heavy-weight" events (not domain events) that allows any other microservice to build a materialized view without processing domain events. This allows you both a) not to duplicate the logic b) not to keep asking the same information
Has pattern n.3 some drawbacks we couldn't figure out and if not, how do you implement it in Lagom?
I will try to explain a few more bits in the hope to give you some more perspective on that matter and how you can achieve it in a reliable way in Lagom.
We have a few concepts that we must keep in mind. The most important one which is the source of all is Event Sourcing itself. Event Sourcing means that any State in the system has its source in Events.
The first State that we will deal with is the State of the PersistentEntity. This State is prominent because, together with the Command and Event Handler, it defines the consistency boundary of your model.
But there other States in the system. Actually, we can create as much as we want because we have the Event Journal. A read-model is also a State and it’s also generated from the events.
There are many reasons why you shouldn’t publish the State of the PersistentEntity to other systems. The first one being a matter of avoiding coupling. You don’t want your data to leak to other services. That’s all about having an anti-corruption layer (ACL).
So, from here we could say: before publishing Order and Customer to Recommendation Service, I will transform it to OrderView and CustomerView (ACL 101).
The question now is when will you do it? If you try to publish it in Kafka after you have handled a command, you don’t have any guarantee that the State will be published. There are no XA transactions between the event journal and the Kafka topic. So, there is a chance that the events are persisted, but for some reason, the State is not published in Kafka.
If you want data to get out of a service in a reliable way and without creating coupling between services, you have the following options:
Use the broker API and publish the events to a topic. You should not publish the events as they are, but transform them into the format of your external API (ACL).
Use a read-side processor to generate a view of it, again the external API format you want to make available. If you want, you can publish that ViewState to a topic so other services can consume it directly.
That said, there is nothing wrong in publishing something in a topic that is not a real event, but some derived State. The problem is how you can guarantee that it is effectively published. Doing that from inside the PersistentEntity is risky because you have at-most-once semantics. The most reliable way of doing it is a read-side process that gives you at-least-once semantics.
Further comments inline...
Listen to domain events from customer and orders and rebuild the state
in the recommandation service. This is a horrible idea because you
would need to duplicate the logic that handles events across different
bounded context
That's not a horrible idea. That's how you make your services independent from each other. The logic that you will need to implement to consume the events are not the same. As you said, it's a different bounded context, as such it only gets what it needs.
Leaking the State from a BC to another is more problematic for the reasons I mentioned above (anti-corruption layer).
To achieve decoupling you do need more coding and there is nothing wrong with that. At the end of the day, the reason for building microservices is to avoid coupling and be able to let the services evolve and scale without interfering with each other. There is a price to pay for that and the price is to write more code. You need to evaluate the thread-offs.
You can consume your own events, produce an OrderView and CustomerView and publish into Kafka, but that's the same as consuming the events directly on the Recommendation Service.
Note that you also need to store OrderView and CustomerView somewhere in the Recommendation Service. So you end up storing it three times. On the original service (view table), in Kafka and in the Recommendation Services.
That's why publishing events in a topic is the best option to propagate data between services.
Every time we receive a domain event from customers or orders, go to
them and ask them the state. This is horrible because if you have more
than one microservice that needs their state, you will end up
producing load on customers and orders
That is indeed a horrible idea because you will make the Recommendation Service be dependent on the other two services. If Order or Customer is down, the Recommendation will be down as well. That's what a broker helps to solve.
Have customers and orders not only publish events but also state and
having all the services that need to build materialized views listen
the state they need How do you apply the last pattern with Lagom? We
found no way to listen to state changes, just to events. One solution
we considered implied publishing with pubSub the state in the onEvent
handler of a persistent entity but I am not sure this is the right
place to make it happen.
Using pubSub in the onEvent handler is the worst solution of all. For the following reasons:
pubSub has at-most-once sematincs (see comments above)
Event handlers are called many times. Whenever you re-hydrate an Entity, the events are replayed and the the event handlers will be used for that. Which mean that you will re-publish the state each time. Actually, you would solve the at-most-once pubSub problem, but not the way you might expect/desire.
You could use the afterPersist callback for that, but that's not reliable neither because pubSub is at-most-once.
PubSub inside a PersistentEntity should not be used for something that you need to be reliable. It's a best-effort capability, that's all.
I am quite new in context of Micro-service architecture and reading this post : http://microservices.io/patterns/data/event-sourcing.html to get familiar with Event sourcing and data storage in Microservice architecture.
I have read many documents about 3 important aspect of system :
Using event sourcing instead of a simply shared DB and ORM and
row update
Events are JAVA objects.
In case of saving data permanently
, we need to use DB (either relational or noSQL)
Here are my questions :
How database comes along with event sourcing? I have read CQRS
pattern, but I can not understand how CQRS pattern is related to
event store and event objects ?
Can any body provide me a
complete picture and set of operations happens with all players to
gather: CQRS pattern , Event sourcing (including event storage
module) and finally different microservices?
In a system
composed of many microservices, should we have one event storage or
each microservice has its own ? or both possible ?
same
question about CQRS. This pattern is implemented in all
microservices or only in one ?
Finally, in case of using
microservice architecture, it is mandatory to have only one DB or
each Microserivce should have its own ?
As you can see, I have understood all small pieces of game , but I can not relate them together to compose a whole image. Specially relevance between CQRS and event sourcing and storing data in DB.
I read many articles for example :
https://ookami86.github.io/event-sourcing-in-practice/
https://msdn.microsoft.com/en-us/library/jj591577.aspx
But in all of them small players are discussed. Even a hand drawing piece of image will be appreciated.
How database comes along with event sourcing? I have read CQRS pattern, but I can not understand how CQRS pattern is related to event store and event objects ?
"Query" part of CQRS instructs you how to create a projection of events, which is applicable in some "bounded context", where the database could be used as a means to persist that projection. "Command" part allows you to isolate data transformation logic and decouple it from the "query" and "persistence" aspects of your app. To simply put it - you just project your event stream into the database in many ways (projection could be relational as well), depending on the task. In this model "query" and "command" have their own way of projecting and storing events data, optimised for the needs of that specific part of the application. Same data will be stored in events and in projections, this will allow achieving simplicity and loose coupling among subdomains (bounded contexts, microservices).
Can any body provide me a complete picture and set of operations happens with all players to gather: CQRS pattern , Event sourcing (including event storage module) and finally different microservices?
Have you seen Greg Young's attempt to provide simplest possible implementation? If you still confused, consider creating more specific question about his example.
In a system composed of many microservices, should we have one event storage or each microservice has its own ? or both possible ?
It is usually one common event storage, but there definitely could be some exceptions, edge cases where you really will need multiple storages for different microservices here and there. It all depends on the business case. If you not sure - most likely you just need a single storage for now.
same question about CQRS. This pattern is implemented in all microservices or only in one ?
It could be implemented in most performance-demanding microservices. It all depends on how complex your implementation becomes when you are introducing CQRS into it. If it gets simpler - why not implement it everywhere? But if people in your team become more and more confused by the need to perform more explicit synchronisation between commands and queries parts - maybe cqrs is too much for you. It all depends on your team, on your domain ... there is no single simple answer, unfortunately.
Finally, in case of using microservice architecture, it is mandatory to have only one DB or each Microservice should have its own ?
If same microservices sharing same tables - this is usually considered as an antipattern, as it increases coupling, the system becomes more fragile. You can still share the same database, but there should be no shared tables. Also, tables from one microservice better not have FK's to tables in another microservice. Same reason - to reduce coupling.
PS: consider not to ask coarse-grained questions, as it will be harder to get a response from people. Several smaller, more specific questions will have better chance to be answered.
I'm new to EDA and I've read a lot about benefits and would probably be interested to apply it during my next project but still haven't understood something.
When raising an event, which pattern is the most suited:
Name the event "CustomerUpdate" and include all information (updated or not) about the customer
Name the event "CustomerUpdate" and include only information that have really been updated
Name the event "CustomerUpdate" and include minimum information (Identifier) and/or a URI to let the consumer retrieves information about this Customer.
I ask the question because some of our events could be heavy and frequent.
Thx for your answers and time.
Name the event "CustomerUpdate"
First let's start with your event name. The purpose of an event is to describe something which has already happenned. This is different from a command, which is to issue an instruction for something yet to happen.
Your event name "CustomerUpdate" sounds ambiguous in this respect, as it could be describing something in the past or something in the future.
CustomerUpdated would be better, but even then, Updated is another ambiguous term, and is nonspecific in a business context. Why was the customer updated in this instance? Was it because they changed their payment details? Moved home? Were they upgraded from silver to gold status? Events can be made as specific as needed.
This may seem at first to be overthinking, but event naming becomes especially relevant as you remove data and context from the event payload, moving more toward skinny events (the "option 3" from your question, which I discuss below).
That is not to suggest that it is always appropriate to define events at this level of granularity, only that it is an avenue which is open to you early on in the project which may pay dividends later on (or may swamp you with thousands of event types).
Going back to your actual question, let's take each of your options in turn:
Name the event "CustomerUpdate" and include all information (updated
or not) about the customer
Let's call this "pattern" the Fat message.
Fat messages (also called snapshots) represent the state of the described entity at a given point in time with all the event context present in the payload. They are interesting because the message itself represents the contract between service and consumer. They can be used for communicating changes of state between business domains, where it may be preferred that all event context be present during message processing by the consumer.
Advantages:
Self consistent - can be consumed entirely without knowledge of other systems.
Simple to consume (upsert).
Disadvantages:
Brittle - the contract between service and consumer is coupled to the message itself.
Easy to overwrite current data with old data if messages arrive in the wrong order (hint: you can mitigate this by using the event sourcing pattern)
Large.
Name the event "CustomerUpdate" and include only information that have
really been updated
Let's call this pattern the Delta message.
Deltas are similar to fat messages in many ways, though they are generally more complex to generate and consume. A good example here is the JSONPatch standard.
Because they are only a partial description of the event entity, deltas also come with a built-in assumption that the consumer knows something about the event being described. For this reason they may be less suitable for sending outside a business domain, where the event entity may not be well known.
Deltas really shine when synchronising data between systems sharing the same entity model, ideally persisted in non-relational storage (eg, no-sql). In this instance an entity can be retrieved, the delta applied, and then persisted again with minimal effort.
Advantages:
Smaller than Fat messages
Excels in use cases involving shared entity models
Portable (if based on a standard such as jsonpatch, or to a lesser extent, diffgram)
Disadvantages:
Similar to the Fat message, assumes complete knowledge of the data entity.
Easy to overwrite current data with old data.
Complex to generate and consume (except for specific use cases)
Name the event "CustomerUpdate" and include minimum information
(Identifier) and/or a URI to let the consumer retrieves information
about this Customer.
Let's call this the Skinny message.
Skinny messages are different from the other message patterns you have defined, in that the service/consumer contract is no longer explicit in the message, but implied in that at some later time the consumer will retrieve the event context. This decouples the contract and the message exchange, which is a good thing.
This may or may not lend itself well to cross-business domain communication of events, depending on how your enterprise is set up. Because the event payload is so small (usually an ID with some headers), there is no context other than the name of the event on which the consumer can base processing decisions; therefore it becomes more important to make sure the event is named appropriately, especially if there are multiple ways a consumer could handle a CustomerUpdated message.
Additionally it may not be good practice to include an actual resource address in the event data - because events are things which have already happened, event messages are generally immutable and therefore any information in the event should be true forever in case the events need to be replayed. In this instance a resource address could easily become obsolete and events would not be re-playable.
Advantages:
Decouples service contract from message.
Information about the event contained in the event name.
Naturally idempotent (with time-stamp).
Generally tiny.
Simple to generate and consume.
Disadvantages:
Consumer must make additional call to retrieve event context - requires explicit knowledge of other systems.
Event context may have become obsolete at the point where the consumer retrieves it, making this approach generally unsuitable for some real-time applications.
When raising an event, which pattern is the most suited?
I think the answer to this is: it depends on lots of things, and there is probably no one right answer.
Update from comments: Also worth reading, a very old, classic, blog post on messaging: https://learn.microsoft.com/en-gb/archive/blogs/nickmalik/killing-the-command-message-should-we-use-events-or-documents (also here: http://vanguardea.com/killing-the-command-message-should-we-use-events-or-documents/)
Martin Fowler gave a great talk about "The Many Meanings of Event-Driven Architecture" (the content is based on this paper) in which he mentioned the Event-Carried State Transfer pattern.
It seems to be close to your second option "Delta message" with the difference that it doesn't try to describe an entity, but instead describe a named business fact that happened and carry over all the necessary data to understand this fact.
I don't think it matters how you have modeled your persistence layer when it comes to designing domain events. Likewise, I don't think it matters how your consumer has modeled its own persistence layer when designing domain events.
Thus, I don't think it's wise to put as an advantage the fact that you can apply the event as a patch directly on your data (from a consumer point of view), because it pushes the producer to design their events given the persistence model of a consumer.
In that case, I would tend to think that you're designing persistence patches, instead of domain events.
What do you think?