Eventual consistency - Axon conflict resolver - events

I'm working on a PoC to evaluate the use of Axon framework for the development of a new application.
My concern is about the eventual consistency with the CQRS pattern since consistency is a requirement for us.
There are a lot of articles and threads about this topic, so I apologize if I'm creating a duplicate thread.
Axon offers a conflict resolver, but I'm not sure I understand how it works.
I found an example in an open source project.
This solution stores the version of the aggregate in both the event store and the read model. The client then reads the version from the read model.
What if I have different read models? Could there be version conflicts between them?
How does Axon solve the conflicts?
Thanks

Before we dive into how Axon deals with consistency, there are a few things that I'd like to point out in the context of CQRS as a concept.
There is a lot of misconception around consistency in combination with CQRS. The concept of eventual consistency applies between the different models that you have defined within your application. For example, a Command Model may have changed state recently, but the Query Model doesn't reflect that state yet. The Query Model is eventually consistent with the Command Model. However, the information within that Query Model is still consistent in itself.
More importantly, this allows you to make conscious choices around where consistency is important and where it can be relaxed. Typically, Command Models make decisions in which consistency is important. You'd want to make sure each decision is made with the relevant knowledge of recent changes. That's the purpose of the Aggregate. An Aggregate will always make decisions that are consistent with its state.
I recommend reading up on the Reactive Principles document [1], namely Section V [2].
Then Axon. Axon implements the concepts of DDD and CQRS very strictly. Consistency is sacred within an Aggregate. For example, when using Event Sourcing, the events within an Aggregate's stream are guaranteed to have been generated based on a State that included all previous events in that stream. In other words, event number 9 in the stream was created with the knowledge of events number 0 through 8. Guaranteed.
When events are published, this doesn't mean any projections are already up to date. This may take a few milliseconds. Relaxing consistency here allows us to scale our system. The only downside is that a user may execute a command, perform a query and not see the results yet. This is actually much more common in systems than you think. There are numerous ways to prevent this from being a problem. Updating user interfaces in real-time is a powerful way of working with this. Then it doesn't matter which user made the change; they see it practically immediately.
The other way round may pose a challenge. A user observes the system state through a Query. This may (and always will, even without CQRS) provide stale data; the data may have been altered while the user is watching it. The user decides to make a change. However, in parallel, the information has already been changed. This other change may be such that, had the user known, it would have never submitted that Command.
In Axon, you can use Conflict Resolvers to detect these "unseen" parallel actions. You can use the "aggregate sequence" from incoming events and store them with your projection. If a user action results in a Command towards that aggregate, pass the Aggregate Sequence as Expected Aggregate Version. If the actual Aggregate's version doesn't match this (because it has been altered in the meantime), you get to decide whether that is problematic. There is a short explanation in the Reference Guide [3].
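To make that concrete, here is a minimal sketch of what this can look like with Axon Framework 4.x, based on the conflict-resolution section of the Reference Guide [3]. All command, event and field names are invented for illustration, so treat it as a sketch rather than a definitive implementation:

    import static org.axonframework.modelling.command.AggregateLifecycle.apply;

    import org.axonframework.commandhandling.CommandHandler;
    import org.axonframework.eventsourcing.conflictresolution.ConflictResolver;
    import org.axonframework.eventsourcing.conflictresolution.Conflicts;
    import org.axonframework.modelling.command.AggregateIdentifier;
    import org.axonframework.modelling.command.TargetAggregateIdentifier;
    import org.axonframework.modelling.command.TargetAggregateVersion;
    import org.axonframework.spring.stereotype.Aggregate;

    // Hypothetical command: the client copies the aggregate sequence it read
    // from the projection into expectedVersion.
    class RenameProductCommand {
        @TargetAggregateIdentifier
        final String productId;
        @TargetAggregateVersion
        final Long expectedVersion;
        final String newName;

        RenameProductCommand(String productId, Long expectedVersion, String newName) {
            this.productId = productId;
            this.expectedVersion = expectedVersion;
            this.newName = newName;
        }
    }

    // Hypothetical event.
    class ProductRenamedEvent {
        final String productId;
        final String newName;

        ProductRenamedEvent(String productId, String newName) {
            this.productId = productId;
            this.newName = newName;
        }
    }

    @Aggregate
    class Product {

        @AggregateIdentifier
        private String productId;

        protected Product() {
            // required by Axon for event-sourced reconstruction
        }

        // Creation command handler and @EventSourcingHandler methods omitted for brevity.

        // When the expected version on the command is older than the Aggregate's actual
        // version, Axon hands the unseen events to the injected ConflictResolver.
        @CommandHandler
        public void handle(RenameProductCommand command, ConflictResolver conflictResolver) {
            // Only unseen ProductRenamedEvents count as a real conflict for this decision;
            // if one is present, the command is rejected with a conflict exception.
            conflictResolver.detectConflicts(Conflicts.payloadTypeOf(ProductRenamedEvent.class));
            apply(new ProductRenamedEvent(productId, command.newName));
        }
    }

The idea is that the projection stores the aggregate sequence number of each event it handles, the client sends that number back as the expected version, and the command handler decides which unseen events actually constitute a conflict.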
I hope this sheds some light on consistency in the context of CQRS and Axon.
[1] https://principles.reactive.foundation
[2] https://principles.reactive.foundation/principles/tailor-consistency.html
[3] https://docs.axoniq.io/reference-guide/axon-framework/axon-framework-commands/modeling/conflict-resolution

Related

Number of properties per command/event in event sourcing

I'm learning CQRS/event sourcing, and I recently listened to a talk in which the speaker said you should pass as few parameters to an event as possible, in other words make events as tiny as possible. The main reasons given were that it's impossible to change events later, since that would break the event history, and that small events are easier to design correctly. But what if, for example, the UI has a form with 10 fields that must be filled in to create a new aggregate, and the same applies when updating the aggregate? How do you handle such a case? And what do you do if the business later decides to change something, but we have one huge event that updates 10 fields?
The decision is always context-specific and each case deserves its own review of using thin events vs fat events.
The motivation for using thin domain events is to include just enough information that is required to ensure the state transition.
As for fat events, your projections might require a piece of entity state so that they can avoid any logic in the projection itself (keeping projections logic-free is a best practice).
For integration, you'd prefer emitting fat events because you hardly know who will consume your event. Still, the content of the event should convey the information related to the meaning of the event itself.
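To make the distinction concrete, here is a rough sketch using Java records, with all names invented: a thin domain event next to a fat integration event describing the same change.

    import java.time.Instant;

    // Thin domain event: just enough information to replay the state transition
    // inside the aggregate.
    record CustomerAddressChanged(String customerId, String newAddress) { }

    // Fat integration event: carries enough context that external consumers and
    // projections do not have to look anything up or apply extra logic.
    record CustomerAddressChangedIntegrationEvent(
            String customerId,
            String customerName,
            String oldAddress,
            String newAddress,
            Instant changedAt) { }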
References:
Putting your events on a diet
Patterns for Decoupling in Distributed Systems: Fat Event
recently I listened to a talk in which the speaker said you should pass as few parameters to an event as possible, in other words make events as tiny as possible.
I'm not convinced that holds up. If you are looking for good ideas about designing events, you should review Greg Young's e-book on versioning.
If you are event sourcing, then you are primarily concerned with ensuring that your stream of events allows you to recreate the state of your domain model. The events themselves should be representations of changes that a domain expert will recognize. If you find yourself trying to invent smaller events just to fit some artificial constraint like "no more than three properties per event" then you are going to end up with data that doesn't really match the way your domain experts think -- which is to say, technical debt.

Event sourcing the whole system is bad

I'm learning a proper microservice architecture using CQRS, MassTransit and different types of storage for the read side. One thing that often comes along with CQRS is event sourcing. I understand it's not mandatory at all. However, I can't think of why using it on the whole system is really an anti-pattern.
Having a store for all events as a single source of truth can help you build / rebuild a read store on the fly whenever you want.
You are not locked in to any vendor (except for the event store)
For me, the question is more like: is it easier to not start with event sourcing (and still have separate data storage per microservice, e.g. Elasticsearch, MongoDB, etc.) and migrate / provision it whenever it's needed, or, on the other hand, to start with event sourcing everything so that you don't have to deal with migration later on?
I can't think of why using it on the whole system is really an anti-pattern.
I agree -- calling it an "anti pattern" is an overstatement.
The claim I believe is more accurate: using event sourcing on the whole system isn't cost effective today.
It could be tomorrow, as we get more practice with it, the costs of designing these systems go down, and we learn to extract more benefit from them.
In the meantime - how valuable are the temporal queries that you get from event sourcing? In your core domain, where you get competitive advantage, they could be quite valuable. In places where you are just doing bookkeeping of information provided to you by the outside world? Not so much - you may be getting everything you need out of simpler solutions that only keep track of "now".
I recently published a blog post about this issue. It explains why event sourcing is a persistence strategy and shouldn't be used at global scale.
To summarize it: Event Sourcing forces you to emit an event for every piece of data that changes. This can result in very fine-grained events. If you use Event Sourcing for inter-microservice communication, you expose those events to the outside world.
In the end, you expose your persistence layer, comparable to exposing your (relational) database schema in a CRUD-based persistence strategy.

How to validate from aggregate

I am trying to understand validation from the aggregate entity on the command side of the CQRS pattern when using event sourcing.
Basically I would like to know what the best practice is in handling validation for:
1. Uniqueness of, say, a code.
2. The correctness/validation of an external aggregate's id.
My initial thoughts:
I've thought about passing a service into the constructor, but this seems wrong, as the "Create" of the entity should only take the values to assign.
I've thought about validation outside the aggregate, but this seems to put logic somewhere that I assume should be the responsibility of the aggregate itself.
Can anyone give me some guidance here?
Uniqueness of, say, a code.
Ensuring uniqueness is a specific example of set validation. The problem with set validation is that, in effect, you perform the check by locking the entire set. If the entire set is included within a single "aggregate", that's easily done. But if the set spans aggregates, then it is kind of a mess.
A common solution for uniqueness is to manage it at the database level; RDBMS are really good at set operations, and are effectively serialized. Unfortunately, that locks you into a database solution with good set support -- you can't easily switch to a document database, or an event store.
Another approach that is sometimes appropriate is to have the single aggregate check for uniqueness against a cached copy of the available codes. That gives you more freedom to choose your storage solution, but it also opens up the possibility that a data race will introduce the duplication you are trying to avoid.
In some cases, you can encode the code uniqueness into the identifier for the aggregate. In effect, every identifier becomes a set of one.
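One way to picture that "set of one" approach, sketched under the assumption that your event store rejects appending a creation event to a stream that already exists (or supports an equivalent expected-version check); class and method names are invented:

    import java.nio.charset.StandardCharsets;
    import java.util.UUID;

    // Derive the aggregate identifier deterministically from the code. Two create
    // attempts with the same code then target the same event stream, and the second
    // append fails the store's "stream must not exist yet" check instead of
    // silently introducing a duplicate.
    final class ProductCodeId {

        static UUID fromCode(String code) {
            // Name-based UUID: the same code always yields the same aggregate identifier.
            return UUID.nameUUIDFromBytes(code.trim().toLowerCase().getBytes(StandardCharsets.UTF_8));
        }

        public static void main(String[] args) {
            System.out.println(ProductCodeId.fromCode("ABC-123"));
            System.out.println(ProductCodeId.fromCode("abc-123")); // same UUID -> same stream
        }
    }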
Keep in mind Greg Young's question
What is the business impact of having a failure?
Knowing how expensive a failure is tells you a lot about how much you are permitted to spend to solve the problem.
The correctness/validation of an external aggregate's id.
This normally comes in two parts. The easier one is to validate the data against some agreed upon schema. If our agreement is that the identifier is going to be a URI, then I can validate that the data I receive does satisfy that constraint. Similarly, if the identifier is supposed to be a string representation of a UUID, I can test that the data I receive matches the validation rules described in RFC 4122.
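A minimal sketch of that first, purely local check, assuming the agreed schema is "a UUID or an absolute URI"; the class and method names are invented:

    import java.net.URI;
    import java.net.URISyntaxException;
    import java.util.UUID;

    // Validating the identifier against the agreed-upon schema needs no lookup at all.
    final class IdentifierFormat {

        static boolean isUuid(String value) {
            try {
                UUID.fromString(value);   // enforces the RFC 4122 textual form
                return true;
            } catch (IllegalArgumentException e) {
                return false;
            }
        }

        static boolean isAbsoluteUri(String value) {
            try {
                return new URI(value).isAbsolute();
            } catch (URISyntaxException e) {
                return false;
            }
        }
    }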
But if you need to check that the identifier is in use somewhere else? Then you are going to have to ask.... The main question in this case is whether you need the answer to that right away, or if you can manage to check that asynchronously (for instance, by modeling "unverified identifiers" and "verified identifiers" separately).
And of course you once again get to reconcile all of the races inherent in distributed computing.
There is no magic.

Does an append-only event store result in an append-only codebase?

When implementing an application with event sourcing, the persistence engine at work is an event store. That is, an append-only log of events, in past tense, in the order of occurrence. By simply replaying the events through the application, the state at any point in time can be reproduced.
My concern – doesn't this append-only event store inevitably lead to an append-only codebase? How can you maintain a codebase if removing, or even altering code, might leave the application unable to replay the sequence of events? Can the number of source lines of code ever decrease?
What if a business rule has to be modified, or perhaps worse, what if a nasty bug in the early days of the application allowed it to enter into a forbidden state? Must the faulty code be kept alive indefinitely? Of course, a lot of these issues can – in theory – be dealt with using event versioning, event schemas, snapshot versioning etc. But hasn't event sourcing become a burden at that point?
Event sourcing is a fairly new technology, at least in production. I suspect that there are few applications that have been running on it for more than a couple of years. What will they look like in 10 years? That's not an unrealistic age for an enterprise application.
My concern – doesn't this append-only event store inevitably lead to an append-only codebase?
No, it implies an append-only schema, which is decoupled from your implementation.
What if a business rule has to be modified, or perhaps worse, what if a nasty bug in the early days of the application allowed it to enter into a forbidden state? Must the faulty code be kept alive indefinitely?
Not really - the domain is decoupled from the durable representations.
Yes, there are some common scenarios that you need to incorporate into your design; like the idea that you may need to compensate for errors earlier in the event history.
It's not, fundamentally, different from what you would do if you were only storing current state. If you have a representation of an aggregate in your database that is in the wrong state, you just update it in place, right? By changing some of the fields to what they are supposed to be.
The idea is the same in event sourcing; you have an event stream that produces a state that you don't want to be in. You figure out what additional events are necessary to reach the state you should be in, and append them. Tada.
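A toy sketch of that idea in plain Java (17+, no framework), with an invented account example: the stream is never rewritten, a corrective event is appended, and replay produces the intended state.

    import java.math.BigDecimal;
    import java.util.List;

    // Invented example: a bug once allowed a bad balance. History stays as it is;
    // we append a corrective event and the replayed state ends up where it should be.
    record AmountDeposited(String accountId, BigDecimal amount) { }
    record AmountWithdrawn(String accountId, BigDecimal amount) { }
    record BalanceCorrected(String accountId, BigDecimal correction, String reason) { }

    class AccountState {
        BigDecimal balance = BigDecimal.ZERO;

        void apply(Object event) {
            if (event instanceof AmountDeposited e)       balance = balance.add(e.amount());
            else if (event instanceof AmountWithdrawn e)  balance = balance.subtract(e.amount());
            else if (event instanceof BalanceCorrected e) balance = balance.add(e.correction());
        }

        static AccountState replay(List<Object> stream) {
            AccountState state = new AccountState();
            stream.forEach(state::apply);   // replay the full, untouched history
            return state;
        }
    }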
Of course, a lot of these issues can – in theory – be dealt with using event versioning, event schemas, snapshot versioning etc. But hasn't event sourcing become a burden at that point?
Not really? Yes, you need to design flexibility into your schema, so that you can evolve your model aggressively, but at its core it's no different from storing current state - you can still migrate if you have to.
But you also have other levers to play with.
It does, perhaps, require more upfront design capital - you have to think about things like schema lifetimes, and the fact that your book of record accumulates data from multiple revisions of your model.
That doesn't mean it's a shoe for all feet. Designing good message schema is an investment. If the consumers of that schema (which in this case really means your model, and the subscribers) don't need to evolve independently, then maybe that investment doesn't make sense.
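As a rough illustration of what "your book of record accumulates data from multiple revisions of your model" can mean in practice (names and payload shape invented): newer code can upgrade old event payloads on read, filling in fields that did not exist when the event was written - commonly called upcasting in event-sourcing circles.

    import java.util.HashMap;
    import java.util.Map;

    // Invented example: revision 1 of OrderPlaced had no "currency" field, revision 2 does.
    // Old events stay untouched in the store; they are upgraded as they are read.
    final class OrderPlacedUpcaster {

        static Map<String, Object> upcast(Map<String, Object> payload, int revision) {
            if (revision >= 2) {
                return payload; // already the current shape
            }
            Map<String, Object> upgraded = new HashMap<>(payload);
            upgraded.put("currency", "EUR"); // business-approved default for legacy events
            return upgraded;
        }
    }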

Event model or schema in event store

Events in an event store (event sourcing) are most often persisted in a serialized format, with versions to represent a change in the model or schema for an event type. I haven't been able to find good documentation showing the actual model or schema for an actual event (often a data table in the event store schema if using an RDBMS), but I understand that ideally it should be generic.
What are the most basic fields/properties that should exist in an event?
I've contemplated using json-api as a specification for my events but perhaps that's too "heavy". The benefits I see are flexibility and maturity.
Am I heading down the "wrong path"?
Any well defined examples would be greatly appreciated.
I've contemplated using json-api as a specification for my events but perhaps that's too "heavy". The benefits I see are flexibility and maturity.
Am I heading down the "wrong path"?
Don't overlook forward and backward compatibility.
You should plan to review Greg Young's book on event versioning; it doesn't directly answer your question, but it does cover a lot about the basics of interpreting an event.
Short answer: pretty much everything is optional, because you need to be able to change it later.
You should also review Hohpe's Enterprise Integration Patterns, in particular his work on messaging, which details a lot of cases you may care about.
de Graauw's Nobody Needs Reliable Messaging helped me understand an important point.
To summarize: if reliability is important on the business level, do it on the business level.
So while there are some interesting bits of meta data tracking that you may want to do, the domain model is really only going to look at the data; and that is going to tend to be specific to your domain.
You also have the fun that the representation of events that you use in the service that produces them may not match the representation that it shares with other services, and in particular may not be the same message that gets broadcast.
I worked through an exercise trying to figure out the minimum amount of information a subscriber needs in order to look at an event and understand whether it cares. My answers were: an id (have I seen this specific event before?), a token that tells you the semantic meaning of the message (is that something I care about?), and a location (URI) to get a richer representation if it is something I care about.
But outside of the domain -- for example, when you are looking at the system as a whole trying to figure out what is going on, having correlation identifiers and causation identifiers, time stamps, signatures of the source location, and so on stored in a consistent location in the meta data can be a big help.
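Pulling those pieces together, one possible shape for such an envelope might look like the sketch below; the field names are my own invention, not a standard, and the domain payload stays specific to your domain.

    import java.net.URI;
    import java.time.Instant;
    import java.util.Map;
    import java.util.UUID;

    // One possible minimal envelope; "data" is where the domain-specific payload lives.
    record EventEnvelope(
            UUID eventId,            // have I seen this specific event before?
            String eventType,        // semantic meaning, e.g. "invoice-paid" (add a revision if you version types)
            URI self,                // where to fetch a richer representation, if needed
            Instant recordedAt,      // when the event was recorded
            UUID correlationId,      // which user interaction / process this belongs to
            UUID causationId,        // the message that directly caused this event
            Map<String, Object> data // the domain payload; specific to your domain
    ) { }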
Just modelling with basic types that map to JSON, written as you would for an API, can go a long way.
You can spend a lot of time generating overly complex models if you throw too much tooling at it - things like Apache Thrift and/or Protocol Buffers (or derived tools) will provide all sorts of IDL mechanisms for you to generate incidental complexity with.
In .NET land and on many other platforms, if you namespace the types you can do various projections from them.
Personally, I've used records and discriminated unions (DUs) in F# as a design and representation tool: you get IntelliSense, syntax highlighting, and types you can use from F# or C# for free, and if someone wants to look, types.fs has all they need.
