Do I need to store the last state of an object in a separate table in Event Sourcing?

I'm still learning event sourcing and I don't understand something.
When I get a command to change an object, do I first recreate that object from the event store, then change it and save the event? Or should I have a separate table that holds the last state?
What is the practice here?

The first rule of optimization: Don't.
For handling commands, all of the information that you need to have is stored in your event history; simply loading the history and recomputing any state you need will get the job done.
In the case where you need low latency in your command handler, AND recomputing the state you need from the event history is too slow to meet your service level targets, then you might look into saving a "snapshot", and using that to speed up the load of your data.
Current consensus is that snapshots should be saved separately from the event history (i.e., a snapshot is not another kind of event), as though the snapshot were just another "read model".
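To make the "just replay the history" approach concrete, here is a minimal TypeScript sketch of a command handler that rehydrates state by folding the event stream, with an optional snapshot as a starting point. The store APIs (eventStore.readStream, snapshotStore.tryLoad, and so on) are hypothetical placeholders, not any particular product's API.

```typescript
// Hypothetical account events; the shapes are illustrative only.
type AccountEvent =
  | { type: "Deposited"; amount: number }
  | { type: "Withdrawn"; amount: number };

interface Snapshot { version: number; balance: number }

interface EventStore {
  readStream(streamId: string, fromVersion: number): Promise<AccountEvent[]>;
  append(streamId: string, event: AccountEvent): Promise<void>;
}
interface SnapshotStore {
  tryLoad(streamId: string): Promise<Snapshot | null>;
}

declare const eventStore: EventStore;       // assumed to exist
declare const snapshotStore: SnapshotStore; // assumed to exist

// Fold one event into the state the command handler needs.
function apply(state: { balance: number }, e: AccountEvent): { balance: number } {
  switch (e.type) {
    case "Deposited": return { balance: state.balance + e.amount };
    case "Withdrawn": return { balance: state.balance - e.amount };
  }
}

async function handleWithdraw(streamId: string, amount: number): Promise<void> {
  // Optional optimization: start from a snapshot instead of event zero.
  const snapshot = await snapshotStore.tryLoad(streamId);
  let state = snapshot ? { balance: snapshot.balance } : { balance: 0 };

  // Replay only the events after the snapshot (or the whole history).
  const history = await eventStore.readStream(
    streamId,
    snapshot ? snapshot.version + 1 : 0
  );
  state = history.reduce(apply, state);

  // Decide, then record the decision as a new event.
  if (state.balance < amount) throw new Error("insufficient funds");
  await eventStore.append(streamId, { type: "Withdrawn", amount });
}
```

Note that the snapshot only changes where the replay starts; the decision logic is identical with or without it.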

Related

Dispatch one event on updating multiple data or dispatch an event for every single field

At the moment, I am learning Event Sourcing. I have used CRUD for a long time, and I guess I'm still kinda stuck in the CRUD way.
Well, now to my question:
I event-sourced a part of my application, where I create something called a Job. A Job can have:
title
description
created_at
So creating this Job is easy - but what do I do when it comes to updating?
Is it an anti-pattern to dispatch an event like JobUpdated, which contains changes to the title and possibly the description? Or should I dispatch multiple events like:
JobTitleChanged
JobDescriptionChanged
In this particular case, where Updated events seem to be the best you can do, a lot comes down to whether it's more common for the title or the description to be edited alone, or for both to be edited together. If the former, field-specific events (e.g. JobTitleUpdated) are better, as they at least allow consumers which don't care about the title field to easily ignore those events. But if a particular transaction issues both a JobTitleUpdated and a JobDescriptionUpdated event, the fact that those events belonged to the same transaction is difficult to reliably reconstruct from the separate events.
In general, Updated events aren't particularly rich: they capture what changed, but lose other context (most often the why). In a hotel, for instance, you could have a RoomStatusUpdated event (e.g. RoomStatusUpdated(VacantDirty)), but there are a lot of different reasons for that, so it might be better to have GuestCheckedIn, GuestCheckedOut, RoomCleaned, RoomOutOfOrder etc. events.
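A rough sketch of the difference using the hotel example (all type and function names here are invented for illustration): the generic update event records only the new value, while intent-revealing events keep the reason, and a projection can still derive the status whenever it needs it.

```typescript
// A generic update event captures the new value but loses the "why":
type RoomStatusUpdated = {
  type: "RoomStatusUpdated";
  room: string;
  status: "Occupied" | "VacantDirty" | "VacantClean" | "OutOfOrder";
};

// Intent-revealing events keep the reason; status stays derivable.
type RoomEvent =
  | { type: "GuestCheckedIn"; room: string; guest: string }
  | { type: "GuestCheckedOut"; room: string }
  | { type: "RoomCleaned"; room: string }
  | { type: "RoomOutOfOrder"; room: string; reason: string };

type RoomStatus = "Occupied" | "VacantDirty" | "VacantClean" | "OutOfOrder";

// Projection: fold the richer events back down to a status read model.
function roomStatus(history: RoomEvent[]): RoomStatus {
  let status: RoomStatus = "VacantClean";
  for (const e of history) {
    switch (e.type) {
      case "GuestCheckedIn": status = "Occupied"; break;
      case "GuestCheckedOut": status = "VacantDirty"; break;
      case "RoomCleaned": status = "VacantClean"; break;
      case "RoomOutOfOrder": status = "OutOfOrder"; break;
    }
  }
  return status;
}
```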

Is it ok to have FAT events with event sourcing?

I have recently been building an application on top of Greg Young's EventStore as my persistence layer, and I have been pondering how big I should allow an event to get.
For example, I have a UK Address aggregate with the following fields:
UK_Address
-BuildingName
-Street
-Locality
-Town
-Postcode
Now I'm building the UI using React/Redux and was thinking: should I create a single FAT addressUpdated event containing all the above fields?
Or should I create an event for each of the different fields and batch them within the client until the Save event is fired? buildingNameUpdated, streetUpdated, localityUpdated, and so on.
I'm not sure the answer is as black and white as I have asked it; what I really would like to know is what conditions/constraints you could use to make the decision.
should I create an event for each of the different fields?
No. The representations of your events are part of the API -- so you want to use spellings that make sense at the level of the business, not at the level of the implementation.
Now I'm building the UI using React/Redux and was thinking: should I create a single FAT addressUpdated event containing all the above fields?
You don't need to constrain the data that you send to your UI to match that which is in the persistence store. The UI is just a cached representation of a read model; there's no reason that representation needs to have the same form as what is in your event store.
Consider the React model itself -- your code makes changes to the "in memory" representation of your data, and then the library computes the new DOM and replaces it, which in turn causes the browser to update its view, which in turn causes the pixels on the screen to change.
So taking a fat event from the store, and breaking it into field level events for the UI is fine. Taking multiple events from the store and aggregating them into a single message for the UI is also fine. Taking events from the event store and transforming them into a spelling that the UI will recognize is also fine.
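As an illustration of that freedom, here is a sketch (with invented names) that fans a single fat addressUpdated event from the store out into field-level Redux-style actions for the UI:

```typescript
// One fat event as stored on the write side (illustrative shape).
interface AddressUpdated {
  type: "addressUpdated";
  buildingName: string;
  street: string;
  locality: string;
  town: string;
  postcode: string;
}

// Field-level actions in the spelling the UI understands.
type UiAction =
  | { type: "ui/buildingNameChanged"; value: string }
  | { type: "ui/streetChanged"; value: string }
  | { type: "ui/localityChanged"; value: string }
  | { type: "ui/townChanged"; value: string }
  | { type: "ui/postcodeChanged"; value: string };

// The read side is free to reshape: one store event, five UI actions.
function toUiActions(e: AddressUpdated): UiAction[] {
  return [
    { type: "ui/buildingNameChanged", value: e.buildingName },
    { type: "ui/streetChanged", value: e.street },
    { type: "ui/localityChanged", value: e.locality },
    { type: "ui/townChanged", value: e.town },
    { type: "ui/postcodeChanged", value: e.postcode },
  ];
}
```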
Do you have any comment regarding Arien's answer about keeping fields that need to be consistent together, so that regardless of when you snapshot the current state of the world, it would be in a valid state?
I don't believe that this makes sense, and I'm not sure if it is possible in general.
It doesn't make sense, because "valid state" is a write model concern only; events are things that have happened, and it's too late to vote on whether they are valid or not. For instance, if you deploy a new model with a new invariant, it still needs to respect the history of what happened before. So you can build a snapshot for that new model, but the snapshot may not be "valid". Too bad.
Given that, I don't think it makes sense to worry over whether each individual event in a commit leaves the snapshot in a valid state.
In particular, if a particular transaction involves multiple entities, it is very likely that the domain language will suggest an event for each entity (we "debit cash" and "credit accounts receivable"). The entities themselves, of course, are capable of changing independently of each other -- it's the aggregate that maintains the balance.
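A small sketch of that accounting example (invented names, not a real ledger API): the aggregate emits one event per entity within a single commit, and the invariant is enforced by the aggregate before anything is emitted.

```typescript
// One event per entity; both belong to the same commit.
type LedgerEvent =
  | { type: "CashDebited"; amount: number }
  | { type: "AccountsReceivableCredited"; amount: number };

class Ledger {
  // The balance invariant lives in the aggregate (write model),
  // not in the individual events.
  collectReceivable(amount: number): LedgerEvent[] {
    if (amount <= 0) throw new Error("amount must be positive");
    // Emitted together: consumers reading whole commits see a balanced
    // pair; consumers of single events may observe them separately.
    return [
      { type: "CashDebited", amount },
      { type: "AccountsReceivableCredited", amount },
    ];
  }
}
```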
You have to bundle all the information together in one event when the data has to be consistent with itself.
Otherwise, when you update one field of an address at a time, a reader may see an unwanted address: this happens when the client has not yet processed all the events at the time of reading, due to eventual consistency.
Example:
Change the address (City=1, Street=1, Housenumber=1) to (City=2, Street=2, Housenumber=2).
If you do this with three events, and at the time of reading only one of them has been processed, you could get the address (City=2, Street=1, Housenumber=1).
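That torn read can be seen directly in a projection sketch (invented names): applying the three field-level events one at a time walks the view through addresses that never existed.

```typescript
type AddressEvent =
  | { type: "CityChanged"; city: number }
  | { type: "StreetChanged"; street: number }
  | { type: "HouseNumberChanged"; houseNumber: number };

let view = { city: 1, street: 1, houseNumber: 1 };

const events: AddressEvent[] = [
  { type: "CityChanged", city: 2 },
  { type: "StreetChanged", street: 2 },
  { type: "HouseNumberChanged", houseNumber: 2 },
];

for (const e of events) {
  switch (e.type) {
    case "CityChanged": view = { ...view, city: e.city }; break;
    case "StreetChanged": view = { ...view, street: e.street }; break;
    case "HouseNumberChanged": view = { ...view, houseNumber: e.houseNumber }; break;
  }
  // A client reading here after the first event sees
  // { city: 2, street: 1, houseNumber: 1 } -- a mix of old and new.
  console.log(view);
}
```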
If puzzled, try the solution that is easier to implement. I would guess the "FAT" event will be easier: you will end up spending less time implementing, debugging, and supporting it.
This is usually referred to as the YAGNI, KISS, and Occam's Razor principles.
In theory, and I find it to be a good rule of thumb, your commands and events should reflect the intent of the user, staying true to DDD. You can find a good explanation of the pros and cons of event granularity here: https://medium.com/@hugo.oliveira.rocha/what-they-dont-tell-you-about-event-sourcing-6afc23c69e9a

CQRS+ES: Client log as event

I'm developing a small CQRS+ES framework and developing applications with it. In my system, I need to log some client actions and use them for analytics and statistics, and maybe later act on them in the domain. For example, a client (on the web) downloads some resource(s), and I need to save the date, time, type (download, partial, ...), and the region or country (maybe from the IP), etc. Then, in some view, the client can see the download count or some complex report. I'm not sure how to implement this feature.
The first solution is to create an analytics context with some aggregate, and on each client action send a command like IncreaseDownloadCounter(resourceId), handle the command, raise domain events, and update the view. But in this scenario the download has already occurred by the time I send the command, so it is not really a command, and on the other side version conflicts increase.
The second solution is to raise an event from the client side and update the view model based on it. But with this kind of handling, my event is not stored in the event store, because it was not raised by a command and never changes any domain context. And if I do store it in the event store, there is no aggregate to handle it when it is fetched for some other use.
The third solution is to raise an event from the client side and store it in another database, maybe with a special table for each event type. But then I have multiple event stores with different schemas, which makes it difficult to recreate view models and to trace events when rebuilding context state; so if in the future I add some domain that uses this type of event, the events are difficult to use.
What is the best approach and solution for this scenario?
The first solution is to create an analytics context with some aggregate
Unquestionably the wrong answer; the event has already happened, so it is too late for the domain model to complain.
What you have is a stream of events. Putting them in the same event store that you use for your aggregate event streams is fine. Putting them in a separate store is also fine. So you are going to need some other constraint to make a good choice.
Typically, reads vastly outnumber writes, so one concern might be that these events are going to saturate the domain store. That might push you towards storing these events separately from your data model (prior art: we typically keep the business data in our persistent book of record, but the sequence of http requests received by the server is typically written instead to a log...)
If you are supporting an operational view, push on the requirement that the state be recovered after a restart. You might be able to get by with building your view off of an in-memory model of the event counts, and use something more practical for the representations of the events.
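A minimal sketch of that shape, assuming a hypothetical append-only analyticsLog kept separate from the domain store: client events are appended to the log, a trivial in-memory projection keeps the counts, and recovery after a restart is just re-reading the log.

```typescript
// Client-originated event; the shape is illustrative.
interface ResourceDownloaded {
  type: "ResourceDownloaded";
  resourceId: string;
  at: string;        // ISO timestamp
  country?: string;  // derived from the IP, if available
}

interface AnalyticsLog {
  append(e: ResourceDownloaded): Promise<void>;
  readAll(): Promise<ResourceDownloaded[]>;
}
declare const analyticsLog: AnalyticsLog; // hypothetical separate store

// In-memory operational view: resourceId -> download count.
const downloadCounts = new Map<string, number>();

function project(e: ResourceDownloaded): void {
  downloadCounts.set(e.resourceId, (downloadCounts.get(e.resourceId) ?? 0) + 1);
}

async function recordDownload(e: ResourceDownloaded): Promise<void> {
  await analyticsLog.append(e); // durable, outside the domain store
  project(e);                   // keep the operational view current
}

// After a restart, rebuild the view by re-reading the log.
async function recover(): Promise<void> {
  for (const e of await analyticsLog.readAll()) project(e);
}
```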
Thanks for your complete answer. So I should create something like the event store schema, minus some fields (aggregate name or type, version, etc.), and collect client events in that repository; then some offline process reads them and updates the read model, or creates commands to do something in the domain space.
Something like that, yes. If the view for the client doesn't actually require any validation by your model at all, then building the read model from the externally provided events is fine.
Are you recommending saving some claim or authorization token of the user and the sending app, for validation in another process?
Maybe, maybe not. The token describes the authority of the event; our own event handler is the authority for the command(s) that is/are derived from the events. It's an interesting question that probably requires more context -- I'd suggest you open a new question on that point.

How to update/migrate data when using CQRS and an EventStore?

So I'm currently diving into the CQRS architecture along with the EventStore "pattern".
It opens applications to a new dimension of scalability and flexibility as well as testing.
However I'm still stuck on how to properly handle data migration.
Here is a concrete use case:
Let's say I want to manage a blog with articles and comments.
On the write side I'm using MySQL, and on the read side ElasticSearch; every time I process a Command, I persist the data on the write side and dispatch an Event to persist the data on the read side.
Now let's say I have some sort of ViewModel called ArticleSummary, which contains an id and a title.
I have a new feature request to include the article tags in my ArticleSummary, so I would add a dictionary to my model to hold the tags.
Given that the tags already exist in my write layer, I would need to update or use a new "table" to properly use the newly included data.
I'm aware of the EventLog Replay strategy, which consists of replaying all the events to "update" all the ViewModels, but, seriously, is it viable when we have a billion rows?
Are there any proven strategies? Any feedback?
I'm aware of the EventLog Replay strategy, which consists of replaying all the events to "update" all the ViewModels, but, seriously, is it viable when we have a billion rows?
I would say "yes" :)
You are going to write a handler for the new summary feature that updates your query side anyway, so you already have the code. Writing special once-off migration code may not buy you all that much. I would go with migration code when you have to do a once-off initial update of, say, a new system that requires some data transformation; but in this case your infrastructure would already exist.
You would need to send only the relevant events to the new handler so you also wouldn't replay everything.
In any event, if you have a billion rows of data your servers would probably be able to handle the load :)
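A sketch of what that handler-as-migration might look like, with an assumed store.readByTypes query and a summaries read-model writer (both invented for illustration): the new handler replays only the event types it cares about.

```typescript
type ArticleEvent =
  | { type: "ArticlePublished"; id: string; title: string }
  | { type: "ArticleTagged"; id: string; tags: string[] };

interface Store {
  readByTypes(types: string[]): Promise<ArticleEvent[]>;
}
interface SummaryWriter {
  upsert(id: string, fields: Partial<{ title: string; tags: string[] }>): Promise<void>;
}
declare const store: Store;             // hypothetical query over the event store
declare const summaries: SummaryWriter; // e.g. writes to the ElasticSearch index

// The same code serves as the live handler and as the migration replay.
async function rebuildArticleSummaries(): Promise<void> {
  const relevant = await store.readByTypes(["ArticlePublished", "ArticleTagged"]);
  for (const e of relevant) {
    switch (e.type) {
      case "ArticlePublished":
        await summaries.upsert(e.id, { title: e.title });
        break;
      case "ArticleTagged":
        await summaries.upsert(e.id, { tags: e.tags });
        break;
    }
  }
}
```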
I'm currently using NEventStore by JOliver.
When we started, we were replaying our entire store back through our denormalizers/event handlers when the application started up.
We were initially keeping all our data in memory but knew this approach wouldn't be viable in the long term.
The approach we use currently is that we can replay an individual denormalizer, which makes things a lot faster, since you aren't unnecessarily replaying events through denormalizers that haven't changed.
The trick we found, though, was that we needed another representation of our commits so we could query all the events we handle by event type -- a query that cannot be performed against the normal store.
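A sketch of that "other representation" (all names invented; NEventStore's real API differs): as commits are written, record which event types each commit contains, so that replaying a single denormalizer only has to touch the commits it cares about.

```typescript
interface Commit {
  id: string;
  events: { type: string }[];
}

// Secondary index: event type -> ids of commits containing it.
const byType = new Map<string, string[]>();

// Maintain the index as each commit is persisted.
function indexCommit(commit: Commit): void {
  for (const e of commit.events) {
    const ids = byType.get(e.type) ?? [];
    ids.push(commit.id);
    byType.set(e.type, ids);
  }
}

// Which commits does a denormalizer handling these types need to replay?
function commitsFor(types: string[]): string[] {
  return [...new Set(types.flatMap((t) => byType.get(t) ?? []))];
}
```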

What's the best way to implement deletion of user objects where there are multiple viewers of the object?

Let's say I have a GUI with multiple types of viewers of user objects. For example, a tree view, a list view and a diagram view. The three views show the same objects. If a user deletes an object from one view, I would like to fire off an event to notify the other two views. I currently do this by exposing an event on the object itself. So if the object is deleted from View 1, View 1 will call delete on the object, which will then fire an event to the subscribers (all 3 views). Each subscriber has the chance to cancel the deletion.
There are a few problems as I see it. If a subscriber cancels a deletion after another subscriber has already approved of the deletion, then I have to instruct those subscribers to undo the deletion.
Are there any good patterns to implement this kind of common scenario?
If an object is to be deleted from all views, or from no view at all:
1. Ask every subscriber if it's OK to delete the item; if yes, issue a "delete item" call to remove the object from the source (a soft delete, or whatever you'd like).
2. Update each view. This is the observer part: listen for an "object deleted" call and take appropriate action, for example manually remove the now-deleted object from each view.
If you always want the user to be able to delete the object from their own view:
1. Step 2 from above, with the addition that the object is only deleted either 1) for that user, or 2) for that user in that view.
2. Step 1 from above, and continue (this might be skipped, depending on how coherent you'd like the views to be).
The twist here is that each subscriber has the chance to cancel the deletion. Normally, when you use the words "view" and "subscribe", it means that you are being passive and just reacting to what you see.
That doesn't mean that what you're trying to do is impossible, but it's definitely tricky. For example, you could try a sort of two-phase commit, where you mark the object as deleted and then wait for all of the viewers to acknowledge the deletion before really removing the object. (This is basically the "ask every subscriber if it's OK to delete the item" approach that chelmertz suggests.) However, this means you need to know exactly how many viewers there are, and all viewers will need to respond before you can complete the deletion. Do you always have three viewers? Are there ever only two? What if there is an error in one of the viewers: should the delete fail, or do you want to go ahead and delete the object anyway?
The nice thing about an event-driven system is that you don't normally have to worry about these sorts of questions: You just make your change to the model (in this case, delete the object) and fire a change event. You don't need to know anything about your viewers.
So, if this were my system, I would try to figure out a way to make model changes cancelable only before they are applied to the model, rather than trying to apply changes to other views through the model and then trying to roll back those changes later.
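A sketch of that shape (invented names, not a specific framework): cancellation is only possible before the model changes, so there is never anything to roll back; once the delete is applied, views merely react.

```typescript
type Veto = (id: string) => boolean;      // return false to cancel
type Listener = (deletedId: string) => void;

class Model {
  private objects = new Map<string, unknown>();
  private vetoes: Veto[] = [];
  private listeners: Listener[] = [];

  onBeforeDelete(v: Veto): void { this.vetoes.push(v); }
  onDeleted(l: Listener): void { this.listeners.push(l); }

  delete(id: string): boolean {
    // Phase 1: anyone may cancel, but nothing has changed yet,
    // so there is nothing to undo.
    if (!this.vetoes.every((veto) => veto(id))) return false;
    // Phase 2: apply the change, then notify. Views cannot cancel here;
    // they only update themselves in response.
    this.objects.delete(id);
    this.listeners.forEach((l) => l(id));
    return true;
  }
}
```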
