Event sourcing: non-business important events - event-sourcing

This is a general architectural question about ES. The concern is the need to keep a large number of events that are unimportant to the business: they affect intermediate state, but at the end of the day we won't care about them and will simply ignore them.
Say we have a User that has a list of items (e.g. Tasks), and the user may quite often add, remove, or edit different fields of a task. If we are building with ES, we should treat each update as an individual event, for example TaskNameChange, TaskCommentChange, etc., or we may have a single TaskModified event. In our case, task state changes are not actually important to us, i.e. we don't get much from the task change history. From the business standpoint we will only ever care about the last ones (for example the last TaskNameChange), but we still have to track and record all the events.
Again, my concern is that we have to record and keep a large number of events that are meaningless to the business in the event store.
Has anyone met such situation and what are ideas about it?

Has anyone met such situation and what are ideas about it?
Horses for courses
If the costs associated with keeping a complete event backed history of your document exceed the business value that you can accrue from that history, then don't design your system to keep all of the history. Set up a document store, on each save overwrite the previous version of the document, and get on with it.
Greg Young: a whole system based on event sourcing is an anti pattern.
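As a rough illustration of the "overwrite on each save" alternative described above, here is a minimal in-memory sketch. The DocumentStore class and the task fields are hypothetical names made up for the example; a real system would use an actual document database.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DocumentStore:
    """Hypothetical store that keeps only the latest version of each document."""
    docs: dict[str, dict[str, Any]] = field(default_factory=dict)

    def save(self, doc_id: str, doc: dict[str, Any]) -> None:
        # Overwrite the previous version; earlier states are not kept.
        self.docs[doc_id] = dict(doc)

    def load(self, doc_id: str) -> dict[str, Any]:
        return dict(self.docs[doc_id])


store = DocumentStore()
store.save("task-1", {"name": "Write report", "comment": ""})
store.save("task-1", {"name": "Write Q3 report", "comment": "due Friday"})
latest = store.load("task-1")
```

Only the last task name survives, which is exactly the trade-off: no history to store, and no history to query.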

Related

Event Sourcing: multiple events vs a single "StatusChanged"

Assuming the common "Order" aggregate, my view of events is that each should be representative of the command that took place, e.g. OrderCreated, OrderPicked, OrderPacked, OrderShipped.
Applying these events in the aggregate changes the status of the order accordingly.
The problem:
I have a projector that lists all orders in the system and their statuses. So it consumes the events, and like with the aggregate "apply" method, it implements the logic that changes the status of the order.
So now the logic exists in two places, which is... not good.
A solution to this is to replace all the above events with a single StatusChanged event that contains a property with the new status.
Pros: both aggregate and projectors just need to handle one event type, and set the status to what's in that event. Zero logic.
Cons: the list of events is now very implicit. Instead of getting a list of WHAT HAPPENED (created, packed, shipped, etc.), we now have a list of status change events.
How do you prefer to approach this?
Note: this is not the full list of events. Other events contain other properties, so clearly they don't belong to this problem. The problem is with events that don't contain any info and just change the status of an order.
In general it's better to have finer-grained events, because this preserves context (and means that you don't have to write logic to reconstruct the context in your consumers).
You typically will have at most one projector which is duplicating your aggregate's event handler. If its purpose is actually to duplicate the aggregate's event handler (e.g. update a datastore which facilitates cross-aggregate querying), you may want to look at making that explicit as a means of making the code DRY (e.g. function-as-value, strategy pattern...).
For the other projectors you write (and there will be many as you go down the CQRS/ES road), you're going to be ignoring events that aren't interesting to that projection and/or doing radically different things in response to the events you don't ignore. If you go down the road of coarse events (CRUD being about the limit of coarseness: a StatusChanged event is just the "U" in CRUD), you're setting yourself up for either:
duplicating the aggregate's event handling/reconstruction in the projector
carrying oldState and newState in the event (viz. just saying StatusChanged { newState } isn't sufficient)
before you can determine what changed and the code for determining whether a change is interesting will probably be duplicated and more complex than the aggregate's event-handling code.
The coarser the events, the greater the likelihood of eventually having more duplication, less understandability, and worse performance (or higher infrastructure spend).
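One way to read the "function-as-value" suggestion above is to put the status transition in a single function that both the aggregate and the order-list projector call. The sketch below assumes the event names from the question; the status strings, next_status, and the class names are hypothetical.

```python
from dataclasses import dataclass

# Event names come from the question; the status strings are assumptions.
STATUS_AFTER = {
    "OrderCreated": "created",
    "OrderPicked": "picked",
    "OrderPacked": "packed",
    "OrderShipped": "shipped",
}


def next_status(current: str, event_type: str) -> str:
    """Single place where an event type is turned into an order status."""
    # Events that do not affect the status leave it unchanged.
    return STATUS_AFTER.get(event_type, current)


@dataclass
class Order:
    status: str = ""

    def apply(self, event_type: str) -> None:
        self.status = next_status(self.status, event_type)


class OrderListProjector:
    def __init__(self) -> None:
        self.rows: dict[str, str] = {}  # order_id -> status

    def handle(self, order_id: str, event_type: str) -> None:
        current = self.rows.get(order_id, "")
        self.rows[order_id] = next_status(current, event_type)


order = Order()
projector = OrderListProjector()
for event_type in ["OrderCreated", "OrderPicked", "OrderPacked"]:
    order.apply(event_type)
    projector.handle("o-1", event_type)
```

The events stay fine-grained, and the mapping lives in one place, so the aggregate and this one projector cannot drift apart.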
So now the logic exists in two places, which is... not good.
Not necessarily a problem. If the logic is static, then it really doesn't matter very much. If the logic is changing, but you can coordinate the change (e.g. both places are part of the same deployment bundle), then it's fine.
Sometimes this means introducing an extra layer of separation between your "projectors" and the consumers - ex: something that is tightly coupled to the aggregate watching the events, and copying status changes to some (logical) cache where other processes can read the information. Thus, you preserve the autonomy of your component without compromising your event stream.
Another possibility to consider is that we're allowed to produce more than one event from a command - so you could have both an OrderPicked event and a StatusChanged event, and then use your favorite filtering method for subscribers only interested in status changes.
In effect, we've got two different sets of information to track to remember later - inputs (information in the command, information copied from local caches), and also things we have calculated from those inputs, prior state, and the business policies that are now in effect.
So it may make sense to separate those expressions of information anyway.
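A small sketch of the "more than one event per command" idea, under the assumption that a pick command records both what happened and the derived status. The handler name, the picker field, and the "picked" status string are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPicked:
    order_id: str
    picker: str  # hypothetical input copied from the command


@dataclass(frozen=True)
class StatusChanged:
    order_id: str
    new_status: str  # derived from inputs, prior state, and policy


def handle_pick_order(order_id: str, picker: str) -> list:
    """Hypothetical command handler that emits both kinds of event."""
    return [OrderPicked(order_id, picker), StatusChanged(order_id, "picked")]


events = handle_pick_order("o-1", "alice")
# A subscriber that only cares about status filters by event type.
status_only = [e for e in events if isinstance(e, StatusChanged)]
```

Subscribers interested in what happened read OrderPicked; subscribers interested only in the status read StatusChanged and need no transition logic.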
If event sourcing is a good approach for the problems you are solving, then you are probably working on problems that are pretty important to the business, where specialization matters (otherwise, licensing an off the shelf product and creating adapters would be more cost effective). In which case, you should probably be expecting to invest in thinking deeply about the different trade offs you need to make, rather than hoping for a one-size-fits-all solution.

Dispatch one event on updating multiple data or dispatch an event for every single field

At the moment, I am learning Event Sourcing. I have used CRUD for a long time and I guess I'm still kind of stuck in the CRUD way of thinking.
Well, now to my question:
I event-sourced a part of my application, where I create something called a Job. A Job can have:
title
description
created_at
So creating this Job is easy - but what do I do when it comes to updating?
Is it an anti pattern to dispatch an event like JobUpdated, which contains changes to the title and possibly the description? Or should I dispatch multiple events like:
JobTitleChanged
JobDescriptionChanged
In this particular case, where Updated events seem to be the best you can do, a lot will come down to whether it's more common for one of title or description to be edited or for them to be edited together. If the former, specific field updates (e.g. JobTitleUpdated) are better, as they at least allow consumers which don't care about the title field to easily ignore those events. But if a particular transaction issues both a JobTitleUpdated and a JobDescriptionUpdated event, the context that those events were in the same transaction is difficult to reliably reconstruct from the separate events.
In general, Updated events aren't particularly rich: they capture what changed, but lose other context (most often the why). In a hotel, for instance, you could have a RoomStatusUpdated event (e.g. RoomStatusUpdated(VacantDirty)), but there are a lot of different reasons for that, so it might be better to have GuestCheckedIn, GuestCheckedOut, RoomCleaned, RoomOutOfOrder etc. events.
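To make the hotel example concrete, here is a sketch in which room status is derived from intent-revealing events, while a second consumer reacts to the reason rather than the status. Only VacantDirty appears in the text above; the other status names, the STATUS_FOR mapping, and the invoicing consumer are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuestCheckedIn:
    room: str


@dataclass(frozen=True)
class GuestCheckedOut:
    room: str


@dataclass(frozen=True)
class RoomCleaned:
    room: str


# Assumed status names; only VacantDirty is taken from the text.
STATUS_FOR = {
    GuestCheckedIn: "Occupied",
    GuestCheckedOut: "VacantDirty",
    RoomCleaned: "VacantClean",
}


def room_status(events) -> dict:
    """Derive the current status per room from the event history."""
    status = {}
    for event in events:
        status[event.room] = STATUS_FOR[type(event)]
    return status


def rooms_to_invoice(events) -> list:
    """A consumer that cares why the room changed, not what its status is."""
    return [e.room for e in events if isinstance(e, GuestCheckedOut)]


history = [
    GuestCheckedIn("101"),
    GuestCheckedOut("101"),
    RoomCleaned("101"),
    GuestCheckedIn("102"),
]
```

With only RoomStatusUpdated(VacantDirty) in the stream, the invoicing consumer would have to guess whether a checkout was the cause.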

Is it ok to have FAT events with event sourcing?

I have recently been building an application on top of Greg Young's EventStore as my persistence layer, and I have been pondering how big I should allow an event to get.
For example I have an UK Address Aggregate with the following fields
UK_Address
-BuildingName
-Street
-Locality
-Town
-Postcode
Now I'm building the UI using React/Redux and was wondering whether I should create a single FAT addressUpdated event containing all the above fields.
Or should I create an event for each of the different fields (buildingNameUpdated, streetUpdated, localityUpdated) and batch them within the client until the Save event is fired?
I'm not sure the answer is as black and white as I have asked it. What I would really like to know is what conditions/constraints you could use to make the decision.
Should I create an event for each of the different fields?
No. The representations of your events are part of the API -- so you want to use spellings that make sense at the level of the business, not at the level of the implementation.
Now I'm building the UI using React/Redux and was thinking should I create a single FAT updateAddress Event containing all the above fields?
You don't need to constrain the data that you send to your UI to match that which is in the persistence store. The UI is just a cached representation of a read model; there's no reason that representation needs to have the same form as what is in your event store.
Consider the React model itself -- your code makes changes to the "in memory" representation of your data, and then the library computes the new DOM and replaces it, which in turn causes the browser to update its view, which in turn causes the pixels on the screen to change.
So taking a fat event from the store, and breaking it into field level events for the UI is fine. Taking multiple events from the store and aggregating them into a single message for the UI is also fine. Taking events from the event store and transforming them into a spelling that the UI will recognize is also fine.
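For instance, splitting one stored fat event into field-level messages for the UI could look roughly like this. The event shape (a "fields" dictionary) and the message naming are assumptions, not something EventStore or Redux prescribes.

```python
def to_ui_messages(address_updated: dict) -> list[dict]:
    """Split one stored fat event into field-level messages for the UI."""
    return [
        {"type": f"{name}Updated", "value": value}
        for name, value in address_updated["fields"].items()
    ]


# Hypothetical stored event; the field values are made up.
stored_event = {
    "type": "addressUpdated",
    "fields": {"street": "1 High St", "town": "Leeds", "postcode": "LS1 1AA"},
}
ui_messages = to_ui_messages(stored_event)
```

The translation lives at the edge, so the stored event can keep its business-level spelling while the UI receives whatever shape suits it.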
Do you have any comment on Arien's answer about keeping fields that need to be consistent together, so that whenever you snapshot the current state of the world it is in a valid state?
I don't believe that this makes sense, and I'm not sure if it is possible in general.
It doesn't make sense, because "valid state" is a write model concern only; events are things that have happened, and it's too late to vote on whether they are valid or not. For instance, if you deploy a new model, with a new invariant, it still needs to respect the history of what happened before. So you can build a snapshot for that new model, but the snapshot may not be "valid". Too bad.
Given that, I don't think it makes sense to worry over whether each individual event in a commit leaves the snapshot in a valid state.
In particular, if a particular transaction involves multiple entities, it is very likely that the domain language will suggest an event for each entity (we "debit cash" and "credit accounts receivable"). The entities themselves, of course, are capable of changing independently of each other -- it's the aggregate that maintains the balance.
You have to bundle all the information together in one event when the data has to be consistent with each other.
Otherwise, when you update an address one field at a time, a reader can end up seeing an address that never existed.
This happens when the client has not yet processed all the events at a certain point in time, due to eventual consistency.
Example:
Change address (City=1, Street=1, Housenumber=1) to (City=2, Street=2, Housenumber=2)
When you do this with 3 events and you have just processed one at the time of reading you could get the address: (City=2, Street=1, Housenumber=1).
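The example above can be replayed in a few lines. This is only a sketch: events are modelled as plain dictionaries of changed fields, and "processed so far" is simulated by slicing the event list.

```python
def apply_all(state: dict, events: list[dict]) -> dict:
    """Fold a list of change events into an address read model."""
    result = dict(state)
    for event in events:
        result.update(event)
    return result


old = {"City": 1, "Street": 1, "Housenumber": 1}

# Three field-level events versus one event carrying the whole address.
fine_grained = [{"City": 2}, {"Street": 2}, {"Housenumber": 2}]
fat = [{"City": 2, "Street": 2, "Housenumber": 2}]

# A reader that has processed only the first of the three events:
partial_view = apply_all(old, fine_grained[:1])

# With the fat event the reader sees either the old or the new address.
before_fat = apply_all(old, fat[:0])
after_fat = apply_all(old, fat[:1])
```

Here partial_view is the mixed address from the example, while the fat event never exposes an intermediate combination.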
If in doubt, try the solution that is easier to implement. I guess a "FAT" event will be easier: you will end up spending less time implementing, debugging, and supporting it.
This is usually referred to as the YAGNI, KISS, or Occam's Razor principle.
In theory, and I find it to be a good rule of thumb, your commands and events should reflect the intent of the user, staying true to DDD. You can find a good explanation of the pros and cons of event granularity here: https://medium.com/#hugo.oliveira.rocha/what-they-dont-tell-you-about-event-sourcing-6afc23c69e9a

FHIR Managing planned reviews and discharge from a service

My application is an inpatient acute pain service, but similar patterns would exist for other in-hospital and ambulatory services (nutrition, physio and other therapies, social work), i.e. any time a service is brought in by the treating team but then manages its own schedule of interaction based on the service's understanding of requirements and ongoing need assessments.
The task from a team/service level is to identify:
who are our current patients?
when do we need to see them again?
So this involves tracking:
a referral (which would imply a degree of urgency: "please see today/tomorrow")
individual encounters (which would plan the next visit), and finally
a "discharge" event ("we're done here, let us know if you need us again")
By itself none of this is awfully complicated (it is managed on spreadsheets and the backs of envelopes all around the world), but I am struggling to find the right FHIR resources to drop all this into.
It seems that:
care would be triggered somehow by a ReferralRequest
each visit should be in an encounter linked back to the incomingReferral
an order would allow tracking the "next visit", although potentially an appointment would do that job
obviously there are lots of observations that the service can record along the way
This leaves these questions:
What is the role of a care plan in the mix?
How to track an episode of care?
What is the role of a clinical impression?
Via what resource would we make a "summary of care given by the service" available?
What is the event that triggers the completion of a referral?
CarePlan is used to share information about what the intended course of care is for a patient - what activities are you going to do, when are you going to do them, how are they going, etc. If you wanted to track a plan to have 5 encounters over the course of 6 months, maintain a daily pain log, do a set of exercises at least twice a week, etc., CarePlan could be used. No requirement to use it if not needed though.
EpisodeOfCare is used to link activities related to a single condition that span multiple encounters. You can link encounters, procedures, etc. to EpisodeOfCare
ClinicalImpression is a new, evolving resource. Think of it as a specialized type of Observation that's intended to tie together a bunch of other observations and make an overall assessment.
A complete summary of care would typically be represented as a FHIR document - that's a Bundle instance starting with a Composition that would then organize relevant information about the care into a series of sections. If you don't want the overhead of full document, you can skip the Composition and just have a Bundle containing relevant information.
Completion of Referrals is dependent on business process. Typically the ReferralRequest instance is owned by the placing/initiating system. They decide when to mark the request as complete - be that on receiving back a report, knowledge that the transfer of care is done, sufficient elapsed time or other means. The Order/OrderResponse (to be replaced by Task) can be used to communicate back and forth between placer and filler systems to help coordinate when work is deemed to be complete.
It seems the core information is representable by a ReferralRequest that may trigger an appointment, which becomes an encounter generating a Clinical impression including an appointment for the next visit.
Clinical impression
Is often one to one with an encounter. It allows a record of new (and excluded) conditions and also recording a plan which can include an appointment action to cover "see again tomorrow"
Orders
Include a timing element so maybe useful for fixed repeating schedules.
ReferralRequest
ReferralRequest.status has values of (requested | active | cancelled | accepted | rejected | completed), which makes it a suitable repository to hold the episode of care by the service.
Episode of care
This resource is similar in many ways to an encounter. Unlike encounters, there is no mechanism to nest episodes of care, although each encounter may relate to 0..* episodes of care.

Design of notification events

I am designing some events that will be raised when actions are performed or data changes in a system. These events will likely be consumed by many different services and will be serialized as XML, although more broadly my question also applies to the design of more modern funky things like Webhooks.
I'm specifically thinking about how to describe changes with an event and am having difficulty choosing between different implementations. Let me illustrate my quandary.
Imagine a customer is created, and a simple event is raised.
<CustomerCreated>
<CustomerId>1234</CustomerId>
<FullName>Bob</FullName>
<AccountLevel>Silver</AccountLevel>
</CustomerCreated>
Now let's say Bob spends lots of money and becomes a gold customer, or indeed any other property changes (e.g.: he now prefers to be known as Robert). I could raise an event like this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<FullName>Bob</FullName>
<AccountLevel>Gold</AccountLevel>
</CustomerModified>
This is nice because the schema of the Created and Modified events are the same and any subscriber receives the complete current state of the entity. However it is difficult for any receiver to determine which properties have changed without tracking state themselves.
I then thought about an event like this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<AccountLevel>Gold</AccountLevel>
</CustomerModified>
This is more compact and only contains the properties that have changed, but comes with the downside that the receiver must apply the changes and reassemble the current state of the entity if they need it. Also, the schemas of the Created and Modified events must be different now; CustomerId is required but all other properties are optional.
Then I came up with this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<Before>
<FullName>Bob</FullName>
<AccountLevel>Silver</AccountLevel>
</Before>
<After>
<FullName>Bob</FullName>
<AccountLevel>Gold</AccountLevel>
</After>
</CustomerModified>
This covers all bases as it contains the full current state, plus a receiver can figure out what has changed. The Before and After elements have the exact same schema type as the Created event. However, it is incredibly verbose.
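With the Before/After layout, a receiver can work out what changed without tracking any state of its own. A minimal sketch using Python's standard xml.etree.ElementTree module; the changed_fields helper is a hypothetical name.

```python
import xml.etree.ElementTree as ET

EVENT_XML = """
<CustomerModified>
  <CustomerId>1234</CustomerId>
  <Before>
    <FullName>Bob</FullName>
    <AccountLevel>Silver</AccountLevel>
  </Before>
  <After>
    <FullName>Bob</FullName>
    <AccountLevel>Gold</AccountLevel>
  </After>
</CustomerModified>
"""


def changed_fields(event_xml: str) -> dict:
    """Return {field: (before, after)} for fields whose text differs."""
    root = ET.fromstring(event_xml.strip())
    before = {child.tag: child.text for child in root.find("Before")}
    after = {child.tag: child.text for child in root.find("After")}
    return {
        tag: (before.get(tag), value)
        for tag, value in after.items()
        if before.get(tag) != value
    }


changes = changed_fields(EVENT_XML)
```

The cost is visible in the payload: every unchanged field is sent twice so that the one changed field can be detected.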
I've struggled to find any good examples of events; are there any other patterns I should consider?
You tagged the question as "Event Sourcing", but your question seems to be more about Event-Driven SOA.
I agree with @Matt's answer: "CustomerModified" is not granular enough to capture intent if there are multiple business reasons why a Customer would change.
However, I would back up even further and ask you to consider why you are storing Customer information in a local service, when it seems that you (presumably) already have a source of truth for customer. The starting point for consuming Customer information should be getting it from the source when it's needed. Storing a copy of information that can be queried reliably from the source may very well be an unnecessary optimization (and complication).
Even if you do need to store Customer data locally (and there are certainly valid reasons for needing to do so), consider passing only the data necessary to construct a query of the source of truth (the service emitting the event):
<SomeInterestingCustomerStateChange>
<CustomerId>1234</CustomerId>
</SomeInterestingCustomerStateChange>
So these event types can be as granular as necessary, e.g. "CustomerAddressChanged" or simply "CustomerChanged", and it is up to the consumer to query for the information it needs based on the event type.
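A sketch of this signal-then-query style, assuming the consumer has some way to call the source of truth. CustomerService here is an in-memory stand-in for what would normally be a remote API, and all names are hypothetical.

```python
class CustomerService:
    """Stand-in for the source of truth; a real one would be a remote API."""

    def __init__(self) -> None:
        self._customers = {"1234": {"FullName": "Bob", "AccountLevel": "Gold"}}

    def get_customer(self, customer_id: str) -> dict:
        return dict(self._customers[customer_id])


class Consumer:
    def __init__(self, source: CustomerService) -> None:
        self.source = source
        self.cache: dict[str, dict] = {}

    def on_event(self, event: dict) -> None:
        # The event carries only the id; details are queried when needed.
        customer_id = event["CustomerId"]
        self.cache[customer_id] = self.source.get_customer(customer_id)


consumer = Consumer(CustomerService())
consumer.on_event({"type": "CustomerChanged", "CustomerId": "1234"})
```

The event schema stays tiny and stable, at the price of a round trip to the source for every notification the consumer acts on.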
There is not a "one-size-fits-all" solution; sometimes it does make more sense to pass the relevant data with the event. Again, I agree with @Matt's answer if this is the direction you need to move in.
Edit Based on Comment
I would agree that using an ESB to query is generally not a good idea. Some people use an ESB this way, but IMHO it's a bad practice.
Your original question and your comments to this answer and to Matt's talk about only including fields that have changed. This would definitely be problematic in many languages, where you would have to somehow distinguish between a property being empty/null and a property not being included in the event. If the event is getting serialized/de-serialized from/to a static type, it will be painful (if not impossible) to know the difference between "First Name is being set to NULL" and "First Name is missing because it didn't change".
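One common workaround for the "null versus not included" ambiguity is a sentinel default that is distinct from None. The sketch below shows the idea in Python; the UNCHANGED sentinel and the field names are assumptions, and a statically typed language would need its own equivalent (an optional-of-optional, for example).

```python
from dataclasses import dataclass
from typing import Union


class _Unchanged:
    """Sentinel type meaning 'this field was not part of the event'."""

    def __repr__(self) -> str:
        return "UNCHANGED"


UNCHANGED = _Unchanged()


@dataclass
class CustomerModified:
    customer_id: str
    full_name: Union[str, None, _Unchanged] = UNCHANGED
    account_level: Union[str, None, _Unchanged] = UNCHANGED


def apply(state: dict, event: CustomerModified) -> dict:
    """Apply only the fields that were actually present in the event."""
    result = dict(state)
    if event.full_name is not UNCHANGED:
        result["full_name"] = event.full_name  # may legitimately be None
    if event.account_level is not UNCHANGED:
        result["account_level"] = event.account_level
    return result


state = {"full_name": "Bob", "account_level": "Silver"}
cleared = apply(state, CustomerModified("1234", full_name=None))
untouched = apply(state, CustomerModified("1234", account_level="Gold"))
```

The sentinel works in memory, but it still has to be mapped onto the wire format (an absent XML element versus an explicitly empty one), which is where most of the pain described above comes from.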
Based on your comment that this is about synchronization of systems, my recommendation would be to send the full set of data on each change (assuming signal+query is not an option). That leaves the interpretation of the data up to each consuming system, and limits the responsibility of the publisher to emitting a more generic event, i.e. "Customer 1234 has been modified to X state". This event seems more broadly useful than the other options, and if other systems receive this event, they can interpret it as they see fit. They can dump/rewrite their own data for Customer 1234, or they can compare it to what they have and update only what changed. Sending only what changed seems more specific to a single consumer or a specific type of consumer.
All that said, I don't think any of your proposed solutions are "right" or "wrong". You know best what will work for your unique situation.
Events should be used to describe intent as well as details. For example, you could have a CustomerRegistered event with all the details for the customer that was registered, then later in the stream a CustomerMadeGoldAccount event that only really needs to capture the customer Id of the customer whose account was changed to gold.
It's up to the consumers of the events to build up the current state of the system that they are interested in.
This allows only the most pertinent information to be stored in each event. Imagine having hundreds of properties for a customer: if every command that changed a single property had to raise an event with all the properties before and after, it would get unwieldy pretty quickly. It's also difficult to determine why the change occurred if you just publish a generic CustomerModified event, and that is a question often asked about the current state of an entity.
Only capturing data relevant to the event means that the command that issues the event only needs to have enough data about the entity to validate that the command can be executed; it doesn't even need to read the whole customer entity.
Subscribers of the events also only need to build up a state for things that they are interested in, e.g. perhaps an 'account level' widget is listening to these events, all it needs to keep around is the customer ids and account levels so that it can display what account level the customer is at.
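The 'account level' widget mentioned above might be sketched like this. The event dictionaries, their field names, and the CustomerAddressChanged event used as an ignored example are assumptions for illustration.

```python
class AccountLevelWidget:
    """Read model that keeps only customer ids and account levels."""

    def __init__(self) -> None:
        self.levels: dict[str, str] = {}

    def handle(self, event: dict) -> None:
        kind = event["type"]
        if kind == "CustomerRegistered":
            # Assumed field names; every other detail is ignored.
            self.levels[event["customer_id"]] = event["account_level"]
        elif kind == "CustomerMadeGoldAccount":
            # The event name itself carries the new level.
            self.levels[event["customer_id"]] = "Gold"
        # All other event types are irrelevant to this widget.


widget = AccountLevelWidget()
for event in [
    {"type": "CustomerRegistered", "customer_id": "1234",
     "full_name": "Bob", "account_level": "Silver"},
    {"type": "CustomerAddressChanged", "customer_id": "1234"},
    {"type": "CustomerMadeGoldAccount", "customer_id": "1234"},
]:
    widget.handle(event)
```

The widget never sees or stores the full customer, only the two facts it displays.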
Instead of trying to convey everything through the fields of the XML payload, you can distinguish between different operations based on:
1. Different endpoint URLs depending on the operation (this is preferred).
2. An opcode (operation code) element in the XML that tells the receiver which operation should handle the incoming request (this is closer to your examples).
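The opcode variant could be sketched as a lookup from opcode to handler. The CustomerEvent wrapper element, the Opcode element name, and the handler functions are all hypothetical; nothing here is a standard schema.

```python
import xml.etree.ElementTree as ET


def on_created(root: ET.Element) -> str:
    return f"create customer {root.findtext('CustomerId')}"


def on_level_changed(root: ET.Element) -> str:
    customer_id = root.findtext("CustomerId")
    level = root.findtext("AccountLevel")
    return f"customer {customer_id} is now {level}"


# Hypothetical opcodes mapped to the functions that handle them.
HANDLERS = {
    "CustomerCreated": on_created,
    "AccountLevelChanged": on_level_changed,
}


def dispatch(xml_text: str) -> str:
    """Read the opcode element and route the message to its handler."""
    root = ET.fromstring(xml_text)
    opcode = root.findtext("Opcode")
    if opcode not in HANDLERS:
        raise ValueError(f"unknown opcode: {opcode!r}")
    return HANDLERS[opcode](root)


message = (
    "<CustomerEvent>"
    "<Opcode>AccountLevelChanged</Opcode>"
    "<CustomerId>1234</CustomerId>"
    "<AccountLevel>Gold</AccountLevel>"
    "</CustomerEvent>"
)
result = dispatch(message)
```

With separate endpoint URLs the same routing is done by the HTTP layer instead of by a field in the payload.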
There are a few enterprise patterns applicable to your business case - messaging and its variants, and if your system is extensible then Enterprise Service Bus should be used. An ESB allows reliable handling of events and processing.
