Where to apply business logic in EventSourcing - microservices

In eventsourcing, I am having bit confusion on where exactly have to apply Business logic? I have already searched in google, but all examples are very basic ie., Updating state of an object inside Handler from an event object, but in my other scenario, had some confusion didnt understood on where exactly have to apply Business logic.
For eg: lets take a scenario to update status of IntervieweeVO, which exists inside Interview aggregate class as below:
class Interview extends AggregateRoot {
private IntervieweeVO IntervieweeVO;
}
class IntervieweeVO {
int performance;
String status;
}
class IntervieweeSelectedEvent extends BaseEvent {
private IntervieweeVO IntervieweeVO;
}
I have a business logic, ie., if interviewee performance < 3, then status = REJECTED, otherwise status should be SELECTED.
So, my doubt is: where should I keep above business logic? Below are 3 scenarios:
1) Before Applying an Event: Do Business Logic, then apply(IntervieweeSelectedEvent) and then eventstore.save(intervieweeSelectedEvent)
2) Inside EventHandler: Apply Business logic inside EventHandler class, like handle(IntervieweeSelectedEvent intervieweeSelectedEvent) , check Business logic and then update Object state in ReadModel table.
3) Applying Business Logic in both places ie., Before Applying an event and also while handing the event (combining above 1 + 2)
Please clarify me on above.

The main issue with event sourcing is that it is hard to produce a viable example using synthetic scenarios.
But probably I could suggest something a little bit better than Interview. If you compare pre-computer era event sourced systems, you'll find that an event stream, which is the store of events composing the lifecycle of some entity, it rather a long-living thing. Events in an entity could span a few days (a list that tracks some document flow), a year (accounting period for some organisation) or tens of years (medical records for some person).
A single event stream usually represents a single entity - a legal process, a ledger or a person... Each event is a transactional (as in ACID) change to the state of the entity.
In your case such an entity could be, say, a position. Which is opened, announced, interviewee invited, invitation accepted, skills assessed, offer made, offer accepted, position closed. From the top of my head.
When an event is added to an entity, it means that the entity's state has changed. It is the new truth about the entity. You want to be careful about changing the truth. So, that's where business logic happens. You run some business logic to make up the decision whether to change the truth or not. It you decide to update the state of the truth - you save the event. That being said, "Interviewee rejected" is a valid event in this case.
Since an event is persisted, all the saved events of an entity are unconditionally the part of the truth about the entity, in their respective order. You then don't decide whether to "accept" or "reject" a persisted event - only how it would affect a projection.

You should be able to reconstruct the entity's state as of a specific point in time from the event stream.
This implies that applying events should NOT contain any logic other than state mapping logic. All state necessary to project the AR's state from the events must be explicitly defined in those events.
Events are an expressive way to define state changes, not operations/commands. For instance, if IntervieweeRejected means IntervieweeStatusChanged(rejected) then that meaning can't ever change. The IntervieweeRejected event can't ever imply anything else than status = rejected, unless there's some other state captured in the event's data (e.g. reason).
Obviously, the way the state is represented can always change, but the meaning must not. For example the AR may have started by only projecting the current status and later on projected the entire status history.
apply(IntervieweeRejected) => status = REJECTED //at first
apply(IntervieweeRejected) => statusHistory.add(REJECTED) //later
I have a business logic, ie., if interviewee performance < 3, then
status = REJECTED, otherwise status should be SELECTED.
Business logic would be placed in standard public AR methods. In this specific case you may expect interviewee.assessPerformance(POOR) to yield IntervieweePerformanceAssessed(POOR) and IntervieweeRejected events. Should you need to reevaluate that smart screening policy at a later time (e.g. if it has changed) then you could implement a reevaluateSmartScreeningPolicy operation.
Also, please note that such logic may not even belong in the Interviewee AR itself. The smart screening policy may be seen as something that happend after/in response to the IntervieweePerformanceAssessed event. Furthermore, I can easily see how a smart screening policy could become very complex, AI-driven which could justify it living in a dedicated Screening bounded context.
Your question actually made me think about how to effectively capture the context or why events occurred and I've asked about that here :)

you tagged your question cqrs but this is acutally the missing part in your example.
Eventsourcing is merely a way to look at the current state of an object. You either save that state as it appears now, or you source it from everything that happend. (eg Bank accounts current banalance as value or sum of all transactions)
So an event is a "fact" of something that happend. In your case that would be the interview with a certain score. And (dependent on your business logic) it COULD also state the status if the barrier is expected to change over time.
The crucial point is here that you should always adhere to the following chain:
"A command gets validated and if it passes it creates an unchangeable event that is persisted"
This means that in your case I would go for option 1. A SelectIntervieweeCommand should be validated and if everything is okay create an IntervieweeSelectedEvent which is an unchangeable fact. Thus the business logic wether the interviewee passed or not, must reside in the command handler function.

Related

Event Sourcing: multiple events vs a single "StatusChanged"

Assuming the common "Order" aggregate, my view of events is that each should be representative of the command that took place. E.g. OrderCreated, OrderePicked, OrderPacked, OrderShipped.
Applying these events in the aggregate changes the status of the order accordingly.
The problem:
I have a projector that lists all orders in the system and their statuses. So it consumes the events, and like with the aggregate "apply" method, it implements the logic that changes the status of the order.
So now the logic exists in two places, which is... not good.
A solution to this is to replace all the above events with a single StatusChanged event that contains a property with the new status.
Pros: both aggregate and projectors just need to handle one event type, and set the status to what's in that event. Zero logic.
Cons: the list of events is now very implicit. Instead of getting a list of WHAT HAPPENED (created, packed, shipped, etc.), we now have a list of the status changes events.
How do you prefer to approach this?
Note: this is not the full list of events. other events contain other properties, so clearly they don't belong to this problem. the problem is with events that don't contain any info, just change the status of an order.
In general it's better to have more finer-grained events, because this preserves context (and means that you don't have to write logic to reconstruct the context in your consumers).
You typically will have at most one projector which is duplicating your aggregate's event handler. If its purpose is actually to duplicate the aggregate's event handler (e.g. update a datastore which facilitates cross-aggregate querying), you may want to look at making that explicit as a means of making the code DRY (e.g. function-as-value, strategy pattern...).
For the other projectors you write (and there will be many as you go down the CQRS/ES road), you're going to be ignoring events that aren't interesting to that projection and/or doing radically different things in response to the events you don't ignore. If you go down the road of coarse events (CRUD being about the limit of coarseness: a StatusChanged event is just the "U" in CRUD), you're setting yourself up for either:
duplicating the aggregate's event handling/reconstruction in the projector
carrying oldState and newState in the event (viz. just saying StatusChanged { newState } isn't sufficient)
before you can determine what changed and the code for determining whether a change is interesting will probably be duplicated and more complex than the aggregate's event-handling code.
The coarser the events, the greater the likelihood of eventually having more duplication, less understandability, and worse performance (or higher infrastructure spend).
So now the logic exists in two places, which is... not good.
Not necessarily a problem. If the logic is static, then it really doesn't matter very much. If the logic is changing, but you can coordinate the change (ex: both places are part of the same deployment bundle), then its fine.
Sometimes this means introducing an extra layer of separation between your "projectors" and the consumers - ex: something that is tightly coupled to the aggregate watching the events, and copying status changes to some (logical) cache where other processes can read the information. Thus, you preserve the autonomy of your component without compromising your event stream.
Another possibility to consider is that we're allowed to produce more than one event from a command - so you could have both an OrderPicked event and a StatusChanged event, and then use your favorite filtering method for subscribers only interested in status changes.
In effect, we've got two different sets of information to track to remember later - inputs (information in the command, information copied from local caches), and also things we have calculated from those inputs, prior state, and the business policies that are now in effect.
So it may make sense to separate those expressions of information anyway.
If event sourcing is a good approach for the problems you are solving, then you are probably working on problems that are pretty important to the business, where specialization matters (otherwise, licensing an off the shelf product and creating adapters would be more cost effective). In which case, you should probably be expecting to invest in thinking deeply about the different trade offs you need to make, rather than hoping for a one-size-fits-all solution.

Is it acceptable to have an invalid state in eventsourcing after event upgrading and before patching?

Let say I have a stream of persisted events that build a valid state according to some "schema" I have defined.
I change the schema and the events are upgraded to reflect this.
However, some state could not be made valid just by upgrading events, I also needed to add more events to patch the state to make it fully valid.
Firstly, is this reasoning at all valid in terms of event sourcing?
If so, how do I handle cases where a specific version of a state no longer becomes valid? I mean is this acceptable? Should it still be possible to rehydrate a version with invalid state? If this is a write model and it's not the latest version, I could not modify this state anyway so maybee it's no big deal?
However, some state could not be made valid just by upgrading events, I also needed to add more events to patch the state to make it fully valid.
"Compensating events" is the usual term; there is a clerical error in the book of record, so we need to add a new event to the history that corrects the mistake.
If so, how do I handle cases where a specific version of a state no longer becomes valid?
As a rule, you want to be wary, extremely wary, of introducing any automated validation that prevents you from loading an invalid history. Remember, state is just state; the business rules constrain the way the domain is allowed to change. Leaving broken states readable, but broken, is safe.
In particular, if you allow the state to load, it is a straight forward exercise to enumerate your event streams, test the final state of the object, and produce an exception report for any streams that produce an invalid state, escalating them to operators/management for handling, and so on.
Assuming that you are reasonable careful about input validation, and comparing whether your proposed command is consistent with latest known state (aggregates enforce business rules, but they don't need to hoard those rules for themselves), then you can probably achieve error rates low enough that you don't need aggressive data validation. That's especially true when the errors are easy to detect and cheap to fix.
Failing that, freezing any aggregates while they are in an invalid state is a good way to prevent further damage.
But if you really need the state to stay valid, there's a trick that you can play with compensating events.
Consider: the basic pattern of event sourcing looks something like
History history = repository.getHistoryById(id)
State current = State.SEED
for (Event e : history) {
current = current.apply(e)
}
There's actually a hidden concept here, which encapsulates the logic for processing the events prior to passing them to the state. Hidden, because the null case just passes the enumerated events straight through to the target.
History history = repository.getHistoryById(id)
Historian historian = new Historian();
State current = State.SEED
for (Event e : historian.reviewEvents(history)) {
current = current.apply(e)
}
The historian gives you a place to put your compensating event logic - based on its own state, the historian passes through most events, but fixes the ones that knows needs edits/compensation/redactions
Where does the historian state come from? Why, from the history of the historian, of course. You load the history of the event corrections, which will typically be short, into the historian, and then let the historian clean up the events for the aggregate.
And if you need corrections for the historian? It's turtles all the way down! Each stream has a unique historian; the identifier for the historian's stream is calculated from the stream it filters (named UUID's, for example, would allow you to do this). So for each stream, you check to see if a historian stream exists; when you find one that doesn't, you know to stop searching and use the null historian, roll up the changes, process the final sequence of events to regenerate the state of your real object, and off you go.
Mind you, I haven't seen a reference implementation of this idea anywhere; it's whiteboard sound, but the truth is I've been deferring this requirement in my own designs.

Design of notification events

I am designing some events that will be raised when actions are performed or data changes in a system. These events will likely be consumed by many different services and will be serialized as XML, although more broadly my question also applies to the design of more modern funky things like Webhooks.
I'm specifically thinking about how to describe changes with an event and am having difficulty choosing between different implementations. Let me illustrate my quandry.
Imagine a customer is created, and a simple event is raised.
<CustomerCreated>
<CustomerId>1234</CustomerId>
<FullName>Bob</FullName>
<AccountLevel>Silver</AccountLevel>
</CustomerCreated>
Now let's say Bob spends lots of money and becomes a gold customer, or indeed any other property changes (e.g.: he now prefers to be known as Robert). I could raise an event like this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<FullName>Bob</FullName>
<AccountLevel>Gold</AccountLevel>
</CustomerModified>
This is nice because the schema of the Created and Modified events are the same and any subscriber receives the complete current state of the entity. However it is difficult for any receiver to determine which properties have changed without tracking state themselves.
I then thought about an event like this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<AccountLevel>Gold</AccountLevel>
</CustomerModified>
This is more compact and only contains the properties that have changed, but comes with the downside that the receiver must apply the changes and reassemble the current state of the entity if they need it. Also, the schemas of the Created and Modified events must be different now; CustomerId is required but all other properties are optional.
Then I came up with this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<Before>
<FullName>Bob</FullName>
<AccountLevel>Silver</AccountLevel>
</Before>
<After>
<FullName>Bob</FullName>
<AccountLevel>Gold</AccountLevel>
</After>
</CustomerModified>
This covers all bases as it contains the full current state, plus a receiver can figure out what has changed. The Before and After elements have the exact same schema type as the Created event. However, it is incredibly verbose.
I've struggled to find any good examples of events; are there any other patterns I should consider?
You tagged the question as "Event Sourcing", but your question seems to be more about Event-Driven SOA.
I agree with #Matt's answer--"CustomerModified" is not granular enough to capture intent if there are multiple business reasons why a Customer would change.
However, I would back up even further and ask you to consider why you are storing Customer information in a local service, when it seems that you (presumably) already have a source of truth for customer. The starting point for consuming Customer information should be getting it from the source when it's needed. Storing a copy of information that can be queried reliably from the source may very well be an unnecessary optimization (and complication).
Even if you do need to store Customer data locally (and there are certainly valid reasons for need to do so), consider passing only the data necessary to construct a query of the source of truth (the service emitting the event):
<SomeInterestingCustomerStateChange>
<CustomerId>1234</CustomerId>
</SomeInterestingCustomerStateChange>
So these event types can be as granular as necessary, e.g. "CustomerAddressChanged" or simply "CustomerChanged", and it is up to the consumer to query for the information it needs based on the event type.
There is not a "one-size-fits-all" solution--sometimes it does make more sense to pass the relevant data with the event. Again, I agree with #Matt's answer if this is the direction you need to move in.
Edit Based on Comment
I would agree that using an ESB to query is generally not a good idea. Some people use an ESB this way, but IMHO it's a bad practice.
Your original question and your comments to this answer and to Matt's talk about only including fields that have changed. This would definitely be problematic in many languages, where you would have to somehow distinguish between a property being empty/null and a property not being included in the event. If the event is getting serialized/de-serialized from/to a static type, it will be painful (if not impossible) to know the difference between "First Name is being set to NULL" and "First Name is missing because it didn't change".
Based on your comment that this is about synchronization of systems, my recommendation would be to send the full set of data on each change (assuming signal+query is not an option). That leaves the interpretation of the data up to each consuming system, and limits the responsibility of the publisher to emitting a more generic event, i.e. "Customer 1234 has been modified to X state". This event seems more broadly useful than the other options, and if other systems receive this event, they can interpret it as they see fit. They can dump/rewrite their own data for Customer 1234, or they can compare it to what they have and update only what changed. Sending only what changed seems more specific to a single consumer or a specific type of consumer.
All that said, I don't think any of your proposed solutions are "right" or "wrong". You know best what will work for your unique situation.
Events should be used to describe intent as well as details, for example, you could have a CustomerRegistered event with all the details for the customer that was registered. Then later in the stream a CustomerMadeGoldAccount event that only really needs to capture the customer Id of the customer who's account was changed to gold.
It's up to the consumers of the events to build up the current state of the system that they are interested in.
This allows only the most pertinent information to be stored in each event, imagine having hundreds of properties for a customer, if every command that changed a single property had to raise an event with all the properties before and after, this gets unwieldy pretty quickly. It's also difficult to determine why the change occurred if you just publish a generic CustomerModified event, which is often a question that is asked about the current state of an entity.
Only capturing data relevant to the event means that the command that issues the event only needs to have enough data about the entity to validate the command can be executed, it doesn't need to even read the whole customer entity.
Subscribers of the events also only need to build up a state for things that they are interested in, e.g. perhaps an 'account level' widget is listening to these events, all it needs to keep around is the customer ids and account levels so that it can display what account level the customer is at.
Instead of trying to convey everything through payload xmls' fields, you can distinguish between different operations based on -
1. Different endpoint URLs depending on the operation(this is preferred)
2. Have an opcode(operation code) as an element in the xml file which tells which operation is to used to handle the incoming request.(more nearer to your examples)
There are a few enterprise patterns applicable to your business case - messaging and its variants, and if your system is extensible then Enterprise Service Bus should be used. An ESB allows reliable handling of events and processing.

EventSourcing fix business logic bugs

I'm making some study of eventsourcing before applying it (or not).
Quick question : When using EventSourcing pattern we can imagine this scenario to handle an event :
command sent
command handler receive the previous command, validate it then
command handler persist this event and publish it
business model apply (business logic algorithm v1 for example) this event mutating its internal state
We can replay all the events and reconstruct the business object state.
How to handle business logic bugs (business logic algorithm v1 contains a nasty bugs).
I read we can fix the bug and replay the events and then we got the business model in a valid state once again.
But what happens if when fixing the business rule when applying event#1 would have caused the 'futurs' commands to fails ? In other words, the event#2, event#3, event#n was dependend of the state of the domain model after applying event#0. How can we fix the cascading events failure ?
I don't have a specific usecase : but we can imagine an account where balance is currently positive. Applying Event#0 increment the balance but this was a bug, the developer wanted to reduce the balance. Event#1 is a purchase that was valid because of the positive balance at this time.
The developer fixes the bug and replay the events. Event#0 decrease the balance which becomes negative. Event#1 is replayed : what happens ?
Do we need to handle this case with 'compensation' ? how ?
thanks in advance for your comments, external ressources that can be of any help (articles, blogs).
bye
Minor correction
When using EventSourcing pattern we can imagine this scenario to handle an event
command sent
command handler receive the previous command, validate it then
business model verifies that the command can be satisfied without violating the business invariant, and calculates the ensuing events
command handler persist this events and publish them
The command handler (specifically, the anti-corruption layer) is responsible for making sure that the command is well formed. The business model decides if the command is permitted by the business.
The good news: the events are just state changes; all of the rule validation is already done. When you fix the bug in the domain object so that it produces the correct events in response to the command, you aren't changing the way the event is applied.
And you certainly aren't changing the history -- if the ATM gave away $20 that it wasn't supposed to, you can't get the money back by editing the record.
What that means is that deploying the bug fix keeps the problem from getting worse; but it doesn't do anything for the event histories that are incorrect.
Compensating events are the right answer here. Ever have a grocery clerk double scan an item, and have to back one of them out? If you look closely, you'll see all three items
+1 candy bar
+1 candy bar
-1 candy bar
That's the idiom of the compensating event being appended to end of the stream.
So if the error showed first appeared in event #0, and then [event #1 .. event #99] have been played on top of that, the remedy for the error is to publish a compensating event #100.
Notice that this is exactly what book keepers would do. You put the wrong sign on the entry on line #1, add a bunch more entries, realize your mistake, and add a new entry that compensates for the earlier mistake.
More good news: in mature business processes, there are already mitigation procedures in place to handle various contingencies. So you can grab a meeting with your domain experts, and doodle on the whiteboard explaining the problem, and your experts should be able to show you the right way to compensate for it. Everything after that is feature management (does the mitigation need to be automated? Does the system need to do the mitigation automatically, or can it let human experts tell it what mitigation to apply, etc. etc.)

Keeping of track of the states of object - SOS

Hi the wise folks at SO. This is an SOS.
I'm in a deep trouble. In my web application there is an object (Say it is a request for something). User submits his/her request. After this it comes to the people who can approve/disapprove that request. During the period from submission to approval/disapproval many actions can be taken on the request. I have to present user with actions panel (collections of links) using which he/she can modify the state of the request.
Now based on which stage of processing the request is some actions are not allowed. Also if some action has already been taken it excludes the possibility of other actions.
Overall it creates a pretty complex matrix of allowed/forbidden actions that my tiny head is not able to take care of it.
I've create some static classes/methods which returns the arrays of allowed actions based on the state of the request. There are about 20 states that a application can be in. I've taken care based on state to remove/disable links for actions that are not possible in that state.
Now problem arises is that suppose request is in state X.
Now if in past action l has been taken on request we may not allow l or based on this some arbitrary actions m,n,o.
After writing all the methods to get arrays of links for 20 states, I have to filter the arrays based on the past history of actions (which is stored in sql db) which is very very big task.
Please suggest me some pattern which is easier to implement and efficient. It is getting on my nerves.
As I understand you have a real-world workflow scenario. In this case I would:
Model entire state as a single entity if possible (a single row with fixed number of fields). I would not model this as a set of actions.
Model each action as some change in the row. It is quite obvious when user enters some data, but I would also model each acceptance as either - a boolean field or a state field - depending on whether the acceptance is done by independent departments or it is a cascade of acceptance in a single department.
Also there may be a situation when an acceptance is given for some particular parameter and the parameter may change in the future, requiring new acceptance. In this case I would model such scenario as two fields. On for the parameter value and the second one for the accepted value. I would make the decision on whether an acceptance is still needed based on the difference of this two fields. This allows for implementing some thresholds.
Having a state modeled as a single row I would implement independent predicates for action allowance.
I think that point 4 is the most important one. If your are able to implement independent predicates for enabling actions then you will be able to easily modify them in the future.
Having 1-3 properly implemented you will be able to easily implement acceptance revoking, which may be required and in this case may make overall code size smaller.
Sounds like a job for a state machine workflow, or a few giant nested switches (which ever you prefer).
First thing that came into my mind: Statemachine. Each State is some kind of object. All states have some method "processRequest" that transits the execution into the next state.
The second thing that came into my mind - theses states have to be organized like a tree or graph. The graph represents the history of requests. You start in the initial State. You get Request A, you proceed to State A. After that, you get request B, you proceed to AB. Wether state AB is equal to BA is not clear by your description.
That way, you get far more states then your 20 states you have now, but each state includes the history. I'd suggest a naming convention after the path you had to take to get there (like AB before). And perhaps you can reuse state A and B in AB, to minimize coding.

Resources