Does event store store state? - event-sourcing

Theoretically when using event sourcing you don't store "state" but events. But I saw in many implementations that you store snapshots of state in a column in a format like JSON or just a BLOB. For example:
Using an RDBMS as event sourcing storage
The events table has Data column which stores entire object. To me, it's like storing state for that given time when some event occurred.
Also this picture(taken from Streamstone):
It has Data column with a serialized state. So it stores state too but inside an Event?
So how to replay from the initial state then, If I can simply pick some event and access Data to get the state directly.
What exactly is stored inside Data, is it a state of the entire object or it's serialized event?
Let's say I have a person object (in C#)
public class Person
{
public string Name { get; set }
public int Age { get; set; }
}
What should my event store when I create a person or change properties like name or age.
When I create a Person I most likely will send something like PersonCreatedEvent with the initial state, so the entire object.
But what if I change Name or Age should they be 2 separate events or just 1? PersonChangedEvent or PersonChangedAgeEvent and PersonChangedNameEvent?
What should be stored in the event in this case?

What exactly is stored inside Data, is it a state of the entire object or it's serialized event?
That will usually be a serialized representation of the event.
One way of thinking about it: a stream of events is analogous to a stream of patch documents. Current state is calculated by starting from some known default state and then applying each patch in turn -- aka a "fold". Previous states can be recovered by choosing a point somewhere in the stream, and applying the patches up to that point.
The semantics of events, unlike patches, tends to be domain specific. So Checked-In, Checked-Out rather than Patched, Patched.
We normally keep the events compact - you won't normally record state that hasn't changed.
Within your domain specific event language, the semantics of a field in an event may be "replace" -- for example, when recording a change of a Name, we probably just store the entire new name. In other cases, it may make sense to "aggregate" instead -- with something like an account balance, you might record credits and debits leaving the balance to be derived, or you might update the total balance (like a gauge).
In most mature domains (banking, accounting), the domain language has semantics for recording changes, rather than levels. We write new entries into the ledger, we write new entries into the checkbook register, we read completed and pending transactions in our account statement.
But what if I change Name or Age should they be 2 separate events or just 1? PersonChangedEvent or PersonChangedAgeEvent and PersonChangedNameEvent?
It depends.
There's nothing wrong with having more than one event produced by a transaction.
There's nothing wrong with having a single event schema, that can be re-used in a number of ways.
There's nothing wrong with having more than one kind of event that changes the same field(s). NameLegallyChanged and SpellingErrorCorrected may be an interesting distinction to the business.
Many of the same concerns that motivate task based UIs apply to the design of your event schema.
It still seems to me like PersonChangedEvent will contain all person properties that can change. Even if they didn't change
In messaging (and event design takes a lot of lessons from message design), we often design our schema with optional fields. So the event schema can be super flexible, while any individual representation of an event would be compact (limited to only the information of interest).

To answer your question, an event that is stored should be the event data only. Not the objects state.
When you need to work on your Entity, you read up all the events and apply them to get the latest state every time. So events should be stored with the events data only. (ofc together with AggregateId, Version etc)
The "Objects State" will be the computation of all events, but if you have an Eventlistener that listens to all your published events you can populates a separate ReadModel for you. To query against and use as read only from a users perspective.
Hope it helps!
Updated answer to updated question:
Really depends on your model, if you do the update at the same time Age and Name, yes the new age and name values should be stored in a new event.
The event should only contain this data "name and age with aggregateId, version etc"
The event listener will listen specifically on each event (created, updated etc), find the aggregates read model that you have stored and only update these 2 properties (in this example).
For createevent you create the object for the read model.

Related

How do you store the mutated state, event driven programming?

I have the following case which I am scratching my head. I have an Aggregate lets call it Reservation and I have an Event. Some of the events will lead to the state of the aggregate to be mutated. Some of this state is functional and it naturaly belongs to the aggregate - like "calculated tax" for example. Some of the state I would say is more technical than functional, like if a message is sent to third party system lets call it "isMessageSentToASystem". In the database I have two tables one for the aggregate one for the event. I see two optioons to preserve the state:
1)I can keep only what has changed in a third table bound to the event. This way I will effectivly recieve a revision log. I don't think this fits well my application though. It will keep my aggregate immutable
2) I will accept that my aggregate is mutable and I will write all functional important state into it ilke "calculatedTax". But here comes y question, what should I do with the technical state like "isMessageSenttoSystemA" somehow I have the feeling that this state does not belong to the aggregate itself but it is a side effect of the event.
Can I create a third table and bind it to the event where I can write my technical state? How am I supposed to name such table? I realy have difficulties finding a correct name ?
UPDATE: I am not sure if it becomes clear from the question.But I am mostly interested in how to model the data in the database. I use RDBMS.
UPDATE2: I don't want to implement Event sourcing, and I think this is not a prerequisite in order to have Event driven architecture.
The unmutated state AND mutating events belongs to the aggregate.
I strongly suggest you to download the code of Vaughn Vernon, he is the author of Implementing Domain Driven Design (IDDD) book.
Here is the class that contains answers to your questions:
https://github.com/VaughnVernon/IDDD_Samples/blob/master/iddd_common/src/main/java/com/saasovation/common/domain/model/EventSourcedRootEntity.java
Notice the logic here and the relation with the EventStore, there is a list of mutating events, the events that are effectively changing your entity, and a pointer to the unmutated version. Those are used in the implementation of event store, check the MySQL implementation to understand.
https://github.com/VaughnVernon/IDDD_Samples/blob/master/iddd_common/src/main/java/com/saasovation/common/port/adapter/persistence/eventsourcing/mysql/MySQLJDBCEventStore.java
On the loading side, all events from the event store are loaded and applied to the Event sourced root entity but not stored in the mutating event list. So the state of your entity is restored to the last version of your audit log, and any modifications are traced in memory in this list, which is flushed into the database when saved.
As for the structure of the event store itself, it is very straightforward:
CREATE TABLE `tbl_es_event_store` (
`event_id` bigint(20) NOT NULL auto_increment,
`event_body` varchar(65000) NOT NULL,
`event_type` varchar(250) NOT NULL,
`stream_name` varchar(250) NOT NULL,
`stream_version` int(11) NOT NULL,
KEY (`stream_name`),
UNIQUE KEY (`stream_name`, `stream_version`),
PRIMARY KEY (`event_id`)
) ENGINE=InnoDB;
isMessageSentToSystemA is a query side service, you can check that constraint before dispatching command and changing the aggregate.
Also you can use SAGA to perform distributed transaction and manage workflow of your use-case

Event Sourcing - Aggregate modeling

Two questions
1) How to model aggregate and reference between them
2) How to organise/store events so that they can be retrieved efficiently
Take this typical use case as example, we have Order and LineItem (they are an aggregate, Order is the aggregate root), and Product aggregate.
As LineItem needs to know which Product, so there are two options 1) LineItem has direct reference to Product aggregate (which seems not a best practice, as it violate the idea of aggregate being a consistence boundary because we can update Product aggregate directly from Order aggregate) 2) then LineItem only has ProductId.
It looks like 2nd option is the way to go...What do you think here?
However, another problem arises which is about building a Order read/view model. In this Order view model, it needs to know which Products are in Order (i.e. ProductId, Type, etc.). The typical use case is reporting, and CommandHandler also can use this Product object to perform logic such as whether there are too many particular products, etc. In order to do it, given the fact that those data are in two separate aggregate, then we need 1+ database roundtrips. As we are using events to build model, so the pseudo code looks like below
1) for a given order id (guid, order aggregate id), we load all the events for it; -- 1st database access
2) then build a Order aggregate, then we know which ProductId are referenced in Order;
3) for the list of ProductIds, we load all events for it; -- 2nd database access
If we build a really big graph of objects (a lot of different aggregates), then this may end up with a few more database access (each of which is slow)...What's your idea in here?
Thanks
Take this typical use case as example, we have Order and LineItem (they are an aggregate, Order is the aggregate root), and Product aggregate.
The Order aggregate makes sense the way you have described it. "Product aggregate" is more suspicious; do you ask the model if the product is allowed to change, or are you telling the model that the product has changed?
If Product can change without first consulting with the order, then the LineItem must not include the product. A reference to the product (aka the ProductId) is ok.
If we build a really big graph of objects (a lot of different aggregates), then this may end up with a few more database access (each of which is slow)...What's your idea in here?
For reads, reports, and the like -- where you aren't going to be adding new events to the history -- one possible answer is to do the slow work in advance. An asynchronous process listens for writes in the event store, and then publishes those events to a bus. Subscribers build new versions of the reports when new events are observed, and cache the results. (search keyword: cqrs)
When a client asks for a report, you give them one out of the cache. All the work is done, so it's very quick.
For command handlers, the answer is more complicated. Business rules are supposed to be in the domain model, so having the command handler try to validate the command (as opposed to the domain model) is a bit broken.
The command handler can load the products to see what the state might look like, and pass that information to the aggregate with the command data, but it's not clear that's a good idea -- if the client is going to send a command to be run, and you need to flesh out the Order command with Product data, why not instead have the command add the product data to the command directly, and skip the middle man.
CommandHandler also can use this Product object to perform logic such as whether there are too many particular products, etc.
This example is a bit vague, but taking a guess: you are thinking about cases where you prevent an order from being placed if the available inventory is insufficient to fulfill the order.
For real world inventory - a physical book in a warehouse - that's probably the wrong approach to take. First, the model itself is wrong; if you want to know how much product is in the warehouse, you should be querying the warehouse, not the product. Second, a physical warehouse is not constrained by your model -- calling the addProduct method on a warehouse aggregate doesn't cause the product to magically appear there.
Third, it probably doesn't match very well with what your domain experts want anyway. If the model says that the warehouse doesn't have enough product, do you think the stake holders want the system to
tell the shopper to buy the product somewhere else, or...
accept the order, and contact the supplier for a new delivery.
Hint: when in doubt, carefully review how amazon.com does it.

Determine new record in PreWriteRecord event handler and check value of joined field

There is custom field "Lock Flag" in Account BC, namely in S_ORG_EXT_X table. This field is made available in Opportunity BC using join to above table. The join specification is as follows: Opportunity.Account Id = Account.Id. Account Id is always populated when creating new opportunity. The requirement is that for newly created records in Opportunity BC if "Lock Flag" is equal to 'Y', then we should not allow to create the record and we should show custom error message.
My initial proposal was to use a Runtime Event that is calling Data Validation Manager business service where validation rule is evaluated and error message shown. Assuming that we have to decide whether to write record or not, the logic should be placed in PreWriteRecord event handler as long as WriteRecord have row already commited to database.
The main problem was how to determine if it is new record or updated one. We have WriteRecordNew and WriteRecordUpdated runtime events but they are fired after record is actually written so it doesn't prevent user from saving record. My next approach was to use eScript: write custom code in BusComp_PreWriteRecord server script and call BC's method IsNewRecordPending to determine if it is new record, then check the flag and show error message if needed.
But unfortunately I am faced with another problem. That joined field "Lock Flag" is not populated for newly created opportunity records. Remember we are talking about BC Opportunity and field is placed in S_ORG_EXT_X table. When we create new opportunity we pick account that it belongs to. So it reproduceable: OpportunityBC.GetFieldValue("Lock Flag") returns null for newly created record and returns correct value for the records that was saved previously. For newly created opportunities we have to re-query BC to see "Lock Flag" populated. I have found several documents including Oracle's recomendation to use PreDefaultValue property if we want to display joined field value immediately after record creation. The most suitable expression that I've found was Parent: BCName.FieldName but it is not the case, because active BO is Opportunity and Opportunity BC is the primary one.
Thanks for your patience if you read up to here and finally come my questions:
Is there any way to handle PreWrite event and determine if it is new record or not, without using eScript and BC.IsNewRecordPending method?
How to get value of joined field for newly created record especially in PreWriteRecord event handler?
It is Siebel 8.1
UPDATE: I have found an answer for the first part of my question. Now it seems so simple to me that I am wondering how I haven't done it initially. Here is the solution.
Create Runtime Event triggered on PreWriteRecord. Specify call to Data Validation Manager business service.
In DVM create a ruleset and a rule where condition is
NOT(BCHasRows("Opportunity", "Opportunity", "[Id]='"+[Id]+"'", "AllView"))
That's it. We are searching for record wth the same Row Id. If it is new record there should't be anything in database yet (remember that we are in PreWriteRecord handler) and function returns FALSE. If we are updating some row then we get TRUE. Reversing result with NOT we make DVM raise an error for new records.
As for second part of my question credits goes to #RanjithR who proposed to use PickMap to populate joined field (see below). I have checked that method and it works fine at least when you have appropriate PickMap.
We Siebel developers have used scripting to correctly determine if record is new. One non scripting way you could try is to use RuntimeEvents to set a profileattribute during the BusComp NewRecord event, then check that in the PreWrite event to see if the record is new. However, there is always a chance that user might undo a record, those scenarios are tricky.
Another option, try invokine the BC Method:IsNewRecordPending from RunTime event. I havent tried this.
For the second part of the query, I think you could easily solve your problem using a PickMap.
On Opportunity BC, when your pick Account, just add one more pickmap to pick the Locked flag from Account and set it to the corresponding field on Opportunity BC. When the user picks the Account, he will also pick the lock flag, and your script will work in PreWriteRecord.
May I suggest another solution, again, I haven't tried it.
When new records are created, the field ModificationNumber will be set to 0. Every time you modify it, the ModificationNumber will increment by 1.
Set a DataValidationManager ruleset, trigger it from PreSetFieldValue event of Account field on Opportunity BC. Check for the LockFlag = Y AND (ModificationNumber IS NULL OR ModificationNumber = 0)) and throw error. DVM should throw error when new records are created.
Again, best practices say don't use the ModNumbers. You could set a ProfileAttribute to signal NewRecord, then use that attribute in the DVM. But please remember to clear the value of ProfileAttribute in WriteRecord and UndoRecord.
Let us know how it went !

Which objects are responsible for maintaining references between aggregates?

Suppose I have one aggregate, Ticket. A Ticket will have one assigned Department and one or more assigned Employee.
When instantiating a Ticket, should a TicketFactory be responsible for ensuring that a Ticket is created with a valid/existent Department and Employee?
Likewise, when decommissioning a Department or Employee, what is responsible for ensuring that a new Department or Employee is assigned to a Ticket so as to maintain its invariants? Could there be a service in the domain responsible for decommissioning, or is this a case where eventual consistency or some form of event listening should be adopted?
The TicketFactory would be declare that in order to create a Ticket you need references to both a Department and an Employee. It would not verify that those actually exist. It would be the responsibility of the calling code to obtain the appropriate references.
If using eventual consistency, the decommissioning of a Department and Employee would publish events indicating the decommission. There would be a handler associated with a Ticket which would subscribe to that event and either assign a new department and employee or send some sort of warning to task.
Take a look at Effective Aggregate Design for more on this.
I've recently started exploring DDD, so I have ran into some of the issues you mention.
I think that TicketFactory should always return validated/properly built Ticket instances. If you model is complex, you can have a domain service that validates that a given Department or Employee can be attached to it and then the factory uses it. Otherwise, you can just put it all in the factory. But what comes out of the factory should be a proper ticket.
I'd say that if e.g. only Ticket knows about the other two, a domain service that uses the Department and Employee repos would get the job done. If the relationship is bidirectional, then you can utilize event sourcing. Also, if it's really a event that should be captured in your domain model, and has other consequences other than reshuffling tickets, you can attach one of the handlers to this event to be InvalidTicketHandler. But if it's a small scale thing, keep it simple, just have a domain service that maintains the invariants.
Sidenote: If the Department and/or Employee are aggregates themselves, then you can reference them within Ticket via their identifier (e.g. employee's company ID or ID-code of the department). In that way you'll achieve consistency easier as you will not cross consistency boundaries between different aggregates.
A FACTORY is responsible for ensuring that all invariants are met for the object or AGGREGATE it creates; yet you should always think twice before removing the rules applying to an object outside that object. The FACTORY can delegate invariant checking to the product, and this is often best. [Domain-Driven Design: Tackling Complexity at the Heart of Software]
A depends on question type, but from the look of it it seems like a great candidate for an application layer functionality, i wouldn't go for the event solution though cause i find it only suitable in between layers and not between objects in the same layer.

Do I need to use state pattern for data approval process?

Users of our system are able to submit un-validated contact data. For example:
Forename: null
Surname: 231
TelephoneNumber: not sure
etc
This data is stored in a PendingContacts table.
I have another table - ApprovedContacts. This table has a variety of constraints to improve consistency and integrity. This table shouldn't contain any dirty or incomplete data.
I need a process to move data from one table to another. Structure of both tables is nearly identical, however, one table has the constraints, when another doesn't.
I have two states: Pending and Approved, gut feeling tells me that I should use a state pattern details here. In theory this should allow me to change contact's state from Pending to Approved, depending on whether the contact has been successfully validated. Problem is that I don't see how is this going to work.
Am I going in a right direction or should I be looking at something completely different?
Presentation layer is in MVC 3, so I have view models for pending contacts and approved contacts, as well as domain models for pending contacts and approved contacts. My view models are generally DTOs with some validation routines, but now my view models represent state too. This doesn't seem right.
For example, all contacts must have a state and they can be saved and removed , so I need an interface for that:
public interface IContactViewModelState
{
void Save(ContactViewModel item);
}
I then add an implementation for saving pending contacts into the PendingContacts table:
public class PendingContactViewModelState: IContactViewModelState
{
public void Save(ContactViewModel item)
{
// Save to the pending contacts table
// I don't like this as my view model now references data access layer
}
}
Short answer: no, because you only have two states. You'd use a state pattern to help deal with complex situations with many states and rules. The only reason you might want to go with a full-blown state pattern based implementation is if you there's a very high chance such a situation is imminent.
If the result of a success transition to Approved is the record ending up in the approved table then you really just need to decide where you want to enforce the constraints. This decision will/can be based on many factors including the likely frequency of change (to the constraints) and where other logic resides.
A lot of patterns (but not all) tend to deal with how to structure an application, but here I think it's just a case of deciding where and how implementing some logic. In other words - you might just be accidentally over-analyzing the problem - it's easily done :)

Resources