Amount of properties per command/event in event sourcing - event-sourcing

I'm learning cqrs/event sourcing, and recently I listen some speach and speaker told that you need pass as few parameters to event as possible, in other words to make events tiny as possible. The main reason for that is it's impossible to change events later as it will break the event history, and its easelly to design small events correctly. But what if for example in UI you need fill in for example form with 10 fields to create new aggregate, and same situation can be with updating the aggregate? How to be in such a case? And how to be if business later consider to change something, but we have huge event which updating 10 fields?

The decision is always context-specific and each case deserves its own review of using thin events vs fat events.
The motivation for using thin domain events is to include just enough information that is required to ensure the state transition.
As for fat events, your projections might require a piece of entity state to avoid using any logic in the projection itself (best practice).
For integration, you'd prefer emitting fat events because you hardly know who will consume your event. Still, the content of the event should convey the information related to the meaning of the event itself.
References:
Putting your events on a diet
Patterns for Decoupling in Distributed Systems: Fat Event

recently I listen some speach and speaker told that you need pass as few parameters to event as possible, in other words to make events tiny as possible.
I'm not convinced that holds up. If you are looking for good ides about designing events, you should review Greg Young's e-book on versioning.
If you are event sourcing, then you are primarily concerned with ensuring that your stream of events allows you to recreate the state of your domain model. The events themselves should be representations of changes that a domain expert will recognize. If you find yourself trying to invent smaller events just to fit some artificial constraint like "no more than three properties per event" then you are going to end up with data that doesn't really match the way your domain experts think -- which is to say, technical debt.

Related

Novice question about structuring events in Tcl/Tk

If one is attempting to build a desktop program with a semi-complex GUI, especially one in which users can open multiple instances of identical GUI components such as having a "project" GUI and permitting users to open multiple projects concurrently within the main window, is it good practice to push the event listeners further up the widget hierarchy and use the event detail to determine upon which widget the event took place, as opposed to placing event listeners on each individual widget?
For example, in doing something similar in a web browser, there were no event listeners on any individual project GUI elements. The listeners were on the parent container that held the multiple instances of each project GUI. A project had multiple tabs within its GUI, but only one tab was visible within a project at a time and only one project was visible at any one time; so, it was fairly easy to use classes on the HTML elements and then the e.matches() method on the event.target to act upon the currently visible tab within the currently visible project in a manner that was independent of which project it was that was visible. Without any real performance testing, it was my unqualified impression as an amateur that having as few event listeners as possible was more efficient and I got most of that by reading information that wasn't very exact.
I read recently in John Ousterhout's book that Tk applications can have hundreds of event handlers and wondered whether or not attempting to limit the number of them as described above really makes any difference in Tcl/Tk.
My purpose in asking this question is solely to understand events better in order to start off the coding of my Tcl/Tk program correctly and not have to re-code a bunch of poorly structured event listeners. I'm not attempting to dispute anything written in the mentioned book and don't know enough to do so if I wanted to.
Thank you for any guidance you may be able to provide.
Having hundreds of event handlers is usually just a mark that there's a lot of different events possibly getting sent around. Since you usually (but not always) try to specialize the binding to be as specific as possible, the actual event handler is usually really small, but might call a procedure to do the work. That tends to work out well in practice. Myself, my rule of thumb is that if it is not a simple call then I'll put in a helper procedure; it's easier to debug them that way. (The main exception to my rule is if I want to generate a break.)
There are four levels you can usually bind on (plus more widget-specific ones for canvas and text):
The individual widget. This is the one that you'll use most.
The widget class. This is mostly used by Tk; you'll usually not want to change it because it may alter the behaviour of code that you just use. (For example, don't alter the behaviour of buttons!)
The toplevel containing the widget. This is ideal for hotkeys. (Be very careful though; some bindings at this level can be trouble. <Destroy> is the one that usually bites.) Toplevel widgets themselves don't have this, because of rule 1.
all, which does what it says, and which you almost never need.
You can define others with bindtags… but it's usually not a great plan as it is a lot of work.
The other thing to bear in mind is that Tk supports virtual events, <<FooBarHappened>>. They have all sorts of uses, but the main one in a complex application (that you should take note of) is for defining higher-level events that are triggered by a sequence of low-level events occasionally, and yet which other widgets than the originator may wish to take note of.

Event source the whole system is bad

I'm learning a proper microservice architecture using CQRS, MassTransit and different type of storage for the read side. One thing which often comes along CQRS is the event sourcing. I do understand it's not mandatory at all. However, I can't think of why using it on the whole system is really an anti pattern.
Having an store for all events as a single source of truth can help you build / rebuild a read store on the fly whenever you want.
You are not locked in to any vendor (except for the event store)
For me, the question is more like is it easier to not start with event sourcing (and still have separate data storage depending on which the microservices. eg: elasticsearch, mongodb, etc etc) and migrating / provisioning whenever it's needed or on the other hand, start with event sourcing everything so that you don't have to deal with migration later on.
I can't think of why using it on the whole system is really an anti pattern.
I agree -- calling it an "anti pattern" is an overstatement.
The spelling I believe? Using event sourcing on the whole system isn't cost effective today.
It could be tomorrow, as we get more practice with it, and the costs of designing these systems goes down and we learn to extract more benefit from them.
In the mean time - how valuable are the temporal queries that you get from event sourcing? In your core domain, where you get competitive advantage, they could be quite valuable. In places where you are just doing bookkeeping of information provided to you by the outside world? Not so much - you may be getting everything you need out of simpler solutions that only keep track of "now".
I recently published a blog post about this issue. It explains why event sourcing is a persistence strategy and shouldn't be used at global scale.
To summarize it: Event Sourcing forces you to emit an event for every changed data. This can result in very fine grained events. If you use Event Sourcing for inter microservice communication, you expose those events to the outside world.
In the end you expose the your persistence layer, comparable to exposing your (relational) database schema in a CRUD based persistence strategy.

Does an append-only event store result in an append-only codebase?

When implementing an application with event sourcing, the persistence engine at work is an event store. That is, an append-only log of events, in past tense, in the order or occurrence. By simply replaying the events through the application, the state at any point in time can be reproduced.
My concern – doesn't this append-only event store inevitably lead to an append-only codebase? How can you maintain a codebase if removing, or even altering code, might leave the application unable to replay the sequence of events? Can the number of source lines of code ever decrease?
What if a business rule has to be modified, or perhaps worse, what if a nasty bug early in the early days of the application allowed it to enter into a forbidden state? Must the faulty code be kept alive indefinitely? Of course, a lot of these issues can – in theory – be dealt with using event versioning, event schemas, snapshot versioning etc. But hasn't event sourcing become a burden at that point?
Event sourcing is a fairly new technology, at least in production. I suspect that there are few applications that have been running on it for more than a couple of years. What will they look like in 10 years? That's not an unrealistic age for an enterprise application.
My concern – doesn't this append-only event store inevitably lead to an append-only codebase?
No, it implies an append-only schema, which is decoupled from your implementation.
What if a business rule has to be modified, or perhaps worse, what if a nasty bug early in the early days of the application allowed it to enter into a forbidden state? Must the faulty code be kept alive indefinitely?
Not really - the domain is decoupled from the durable representations.
Yes, there are some common scenarios that you need to incorporate into your design; like the idea that you may need to compensate for errors earlier in the event history.
It's not, fundamentally, different from what you would do if you were only storing current state. If you have a representation of an aggregate in your database that is in the wrong state, you just update it in place, right? by changing some of the fields to what they are supposed to be.
The idea is the same in event sourcing; you have an event stream that produces a state that you don't want to be in. You figure out what additional events are necessary to reach the state you should be in, and append them. Tada.
Of course, a lot of these issues can – in theory – be dealt with using event versioning, event schemas, snapshot versioning etc. But hasn't event sourcing become a burden at that point?
Not really? Yes, you need to design flexibility into your schema, so that you can evolve your model aggressively, but at it's core it's not different from storing current state - you can still migrate if you have to.
But you also have other levers to play with.
It does, perhaps, require more upfront design capital - you have to think about things like schema lifetimes, and the fact that your book of record accumulates data from multiple revisions of your model.
That doesn't mean it's a shoe for all feet. Designing good message schema is an investment. If the consumers of that schema (which in this case really means your model, and the subscribers) don't need to evolve independently, then maybe that investment doesn't make sense.

Event model or schema in event store

Events in an event store (event sourcing) are most often persisted in a serialized format with versions to represent a changed in the model or schema for an event type. I haven't been able to find good documentation showing the actual model or schema for an actual event (often data table in event store schema if using a RDBMS) but understand that ideally it should be generic.
What are the most basic fields/properties that should exist in an event?
I've contemplated using json-api as a specification for my events but perhaps that's too "heavy". The benefits I see are flexibility and maturity.
Am I heading down the "wrong path"?
Any well defined examples would be greatly appreciated.
I've contemplated using json-api as a specification for my events but perhaps that's too "heavy". The benefits I see are flexibility and maturity.
Am I heading down the "wrong path"?
Don't overlook forward and backward compatibility.
You should plan to review Greg Young's book on event versioning; it doesn't directly answer your question, but it does cover a lot about the basics of interpreting an event.
Short answer: pretty much everything is optional, because you need to be able to change it later.
You should also review Hohpe's Enterprise Integration Patterns, in particular his work on messaging, which details a lot of cases you may care about.
de Graauw's Nobody Needs Reliable Messaging helped me to understan an important point.
To summarize: if reliability is important on the business level, do it on the business level.
So while there are some interesting bits of meta data tracking that you may want to do, the domain model is really only going to look at the data; and that is going to tend to be specific to your domain.
You also have the fun that the representation of events that you use in the service that produces them may not match the representation that it shares with other services, and in particular may not be the same message that gets broadcast.
I worked through an exercise trying to figure out what the minimum amount of information necessary for a subscriber to look at an event to understand if it cares. My answers were an id (have I seen this specific event before?), a token that tells you the semantic meaning of the message (is that something I care about?), and a location (URI) to get a richer representation if it is something I care about.
But outside of the domain -- for example, when you are looking at the system as a whole trying to figure out what is going on, having correlation identifiers and causation identifiers, time stamps, signatures of the source location, and so on stored in a consistent location in the meta data can be a big help.
Just modelling with basic types that map to Json to write as you would for an API can go a long way.
You can spend a lot of time generating overly complex models if you throw too much tooling at it - things like Apache Thrift and/or Protocol Buffers (or derived things) will provide all sorts of IDL mechanisms for you to generate incidental complexity with.
In .NET land and many other platforms, if you namespace the types you can do various projections from the types
Personally, I've used records and DUs in F# as a design and representation tool
you get intellisense, syntax hilighting, and types you can use from F# or C# for free
if someone wants to look, types.fs has all they need

Using Drools to perform an Action based on Events

I wanted to use drools for one of our projects - Online Buying and Selling.
Events are like Buying a book, Buying a pen, Buying a kindle.
These events are stored in a database.
Now, based on the events happened before, I want to decide the consequence.
Like say if a person had the following sequence,
1. Buy a book at a price.
2. Sell the same book at a higher price.
Then
Do something based on that.
If someone has done this,
1. Buy a kindle.
2. Purchase a book in Science Section of books.
Then
show him the relevant content in the UI.
I have all the listed things as Events in the database.
Now I have written an interface for the Actions to be done and I have also done the interface to pass a Customer when an event happens.
Now what would give me the best performance to process the events and make a decision based on the sequence of events. I cannot store all the events in memory for sure as I have a whole lot of those.
There are different aspects to consider:
For recommending additional items to customers, there are Recommendation Engines. You may want to use one of those, if most/all of Your use-cases are recommendations.
Storing "all the events in memory" is not neccessarily required. In fact, Drools removes events that are no longer relevant to the rule base. The documentation says
"Events may be automatically expired after some time in the working memory. Typically this happens when, based on the existing rules in the knowledge base, the event can no longer match and activate any rules. Although, it is possible to explicitly define when an event should expire."
To allow early removal of events, I would use Drools to generate aggregated data like "likes science topics", "owns a e-book reader", etc. Those can be inferred from the events but consume less memory.

Resources