In this simple example, we load an entity from a repository, ask it to perform an operation and then insert the result (a new state) through the repository again.
const doSomething = (personDao: PersonRepository) => (personId: PersonId) => {
  const person = personDao.findBy(personId)
  const newState = person.doSomething()
  personDao.insert(newState)
  return newState
}
However, if person.doSomething() returns not only its new state but also a domain event, how am I supposed to persist the state and publish the event transactionally?
const doSomething = (eventPublisher: EventPublisher) => (personDao: PersonRepository) => (personId: PersonId) => {
  const person = personDao.findBy(personId)
  const [event, newState] = person.doSomething()
  personDao.insert(newState)
  eventPublisher.publish(event)
  return newState
}
That is, if personDao.insert() goes through but eventPublisher.publish() does not, then the insertion should be rolled back. Making sure events are published is essential, since other applications will need them.
Possible solution
PersonRepository should persist the person and also the events in a separate table.
Something (?) should poll the events table to check for new events, and then send them to a message broker.
I'm not sure how feasible this idea is.
And what if I am using a database that does not support transactions, such as Cassandra? Or MongoDB, in which case I would have to store the events inside the Person document? I am not using either; it is just a thought I had.
Thanks.
There are two broad solutions to this:
First is the transactional outbox pattern, in which you persist the domain events and the updated entity in one atomic transaction. The events can be tracked by polling or via a change-data-capture to feed a message broker.
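For concreteness, here is a minimal sketch of that outbox write in the style of the question's TypeScript. The withTransaction helper, the Tx interface, and the outbox method names are all hypothetical, not from any real library:

// Hypothetical persistence interfaces; the concrete driver only needs to be
// able to run both writes inside a single database transaction.
interface PersonState { id: string }
interface DomainEvent { type: string; payload: unknown }

interface Tx {
  insertPerson(state: PersonState): Promise<void>
  insertOutboxEvent(event: DomainEvent): Promise<void>
}

interface Db {
  // Runs the callback inside one transaction; rolls everything back if it throws.
  withTransaction<T>(work: (tx: Tx) => Promise<T>): Promise<T>
}

const doSomething =
  (db: Db) => (person: { doSomething(): [DomainEvent, PersonState] }) =>
    db.withTransaction(async (tx) => {
      const [event, newState] = person.doSomething()
      // The entity table and the outbox table live in the same database,
      // so both writes commit or roll back together.
      await tx.insertPerson(newState)
      await tx.insertOutboxEvent(event)
      return newState
    })

A separate relay (a poller or a change-data-capture connector such as Debezium) then reads the outbox table, forwards each event to the broker, and marks or deletes the row once the broker acknowledges it.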
Second is the event sourcing approach, in which you persist only the events (domain events, but also, potentially, events which are "implementation details" of your representation). The events can then be used to reconstruct the entity; as an optimization, another process can poll (or change-data-capture) the events and update a snapshot of the entity. In that case, reconstructing an entity means loading the latest snapshot (which includes a version) and then applying the events after that version to the snapshot.
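As a rough sketch of that rehydration step, assuming a hypothetical event store API that can return the latest snapshot and the events recorded after a given version (all names below are invented for illustration):

interface PersistedEvent { version: number; type: string; payload: unknown }
interface Snapshot<S> { version: number; state: S }

interface EventStore<S> {
  loadLatestSnapshot(id: string): Promise<Snapshot<S> | undefined>
  loadEventsAfter(id: string, version: number): Promise<PersistedEvent[]>
}

// Rebuild an entity: start from the latest snapshot (or an initial state)
// and fold the newer events over it.
async function rehydrate<S>(
  store: EventStore<S>,
  id: string,
  initial: S,
  apply: (state: S, event: PersistedEvent) => S
): Promise<S> {
  const snapshot = await store.loadLatestSnapshot(id)
  const fromVersion = snapshot?.version ?? 0
  const events = await store.loadEventsAfter(id, fromVersion)
  return events.reduce(apply, snapshot?.state ?? initial)
}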
Because your events and your snapshots/entities are different data models, for datastores which are table-oriented, you will almost certainly want to have those be in different tables; for datastores which are more stream-oriented, likewise you'll want those in separate streams. Accordingly, the suitability of a transactional outbox approach rests on the ability to atomically write to multiple tables/streams. Event sourcing doesn't have that requirement.
I would characterize the approach of writing an entity and the events together as event sourcing (where the events happen to include the snapshot). If going that route, I would then have a consumer (fed by polling or by change-data-capture) demultiplex the paired events and snapshot.
Related
I want to implement an Axon projection providing a subscription query. The projection persists entities of a specific type based on the events (CRUD). The following should happen if a replay of the projection is executed:
Inform about an empty collection via the subscription query.
Process the events until the projection catches up with the event store.
Inform about the current state of the entities via the subscription query.
My intention is that the subscription query should not emit many updates in a very short time, and I want to prevent it from reporting intermediate states (e.g. if a specific entity is created and then deleted during the replay, it should not show up in the subscription query, because it is not available after catching up).
Currently I cannot implement the third step, because I am missing a hook for reacting to the moment when the Axon projection has caught up again.
I use Axon v4.5 and Spring Boot v2.4.5.
At the moment, the Reference Guide is sadly not extremely specific on how to achieve this. I can assure you though that work is underway to improve the entire Event Processor section, including a clearer explanation of the Replay process and the hooks you have.
However, the hooks that are available right now are the documented ones (you can find them here).
What you can do to know whether your event handlers are replaying or not is to utilize the ReplayStatus enumeration. This enum can be used as an additional parameter on your @EventHandler annotated method, and it holds just two states:
REPLAY
REGULAR
This enumeration allows for conditional logic within an event handler on what to do during replays. If an event handler for example not only updates a projection but also sends an email, you'd want to make sure the email isn't sent again when replaying.
To further clarify how to use this, consider the following snippet:
@EventHandler
public void on(SomeEvent event, ReplayStatus replayStatus) {
    if (replayStatus == REGULAR) {
        // perform tasks that may only happen when the processor is not replaying
    }
    // perform tasks that can happen during both regular processing and replaying
}
It's within the if-block where you could invoke the QueryUpdateEmitter, to only emit updates when the event handler is performing the regular event handling process.
I use event sourcing to store my object.
Changes are captured via domain events, holding only the minimal information required, e.g.
GroupRenamedDomainEvent
{
    string Name;
}

GroupMemberAddedDomainEvent
{
    int MemberId;
    string Name;
}
However elsewhere in my application I want to be notified if a Group is updated in general. I don’t want to have to accumulate or respond to a bunch of more granular and less helpful domain events.
My ideal event to subscribe to is:
GroupUpdatedIntegrationEvent
{
    int Id;
    string Name;
    List<Member> Members;
}
So what I have done is the following:
Update group aggregate.
Save down generated domain events.
Use these generated domain events to see whether to trigger my integration event.
For the example above, this might look like:
var groupAggregate = _groupAggregateRepo.Load(id);
groupAggregate.Rename("Test");
groupAggregate.AddMember(1, "John");
_groupAggregateRepo.Save(groupAggregate);
var domainEvents = groupAggregate.GetEvents();
if (domainEvents.Any())
{
    _integrationEventPublisher.Publish(
        new GroupUpdatedIntegrationEvent
        {
            Id = id,
            Name = groupAggregate.Name,
            Members = groupAggregate.Members
        });
}
This means my integration events used throughout the application are not coupled to what data is used in my event sourcing domain events.
Is this a sensible idea? Has anyone got a better alternative? Am I misunderstanding these terms?
Of course you're free to create and publish as many events as you want, but I don't see (m)any benefits there.
You still have coupling: you've just shifted the coupling from one event to another. Here it really depends on how many event consumers you've got, whether everything runs in memory or is stored in a DB, and whether your consumers need some kind of replay mechanism.
Your integration events can grow over time and use a lot of bandwidth: if your Group contains 1000 members and you add 5 new members, you'll trigger 5 integration events that each contain all the members, instead of just the small delta. That uses much more network bandwidth and disk space (if persisted).
You're coupling your Integration Event to your Domain Model. I think this is not good at all. You won't be able to simply change the Member class in the future, because all Event Consumers depend on it. A solution could be to instead use a separate MemberDTO class for the Integration Event and write a MemberToMemberDTO converter.
Your Event Consumers can't decide which changes they want to handle, because they always just receive the full-blown current state. The information about what actually changed is lost.
The only real benefit I see is that you don't have to write code again to apply your Domain Events.
In general it looks a bit like Read Model in CQRS. Maybe that's what you're looking for?
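If that Read Model direction is what you're after, here is a hedged sketch (in TypeScript rather than the question's C#, with invented event and view shapes) of a projector that folds the granular domain events into a group view; consumers then read or subscribe to the view instead of a full-state integration event:

interface Member { id: number; name: string }
interface GroupView { id: number; name: string; members: Member[] }

type GroupDomainEvent =
  | { type: 'GroupRenamed'; groupId: number; name: string }
  | { type: 'GroupMemberAdded'; groupId: number; memberId: number; name: string }

// Applies one granular domain event to the read model; the "what changed"
// information stays available to anyone who still wants the raw events.
function applyToView(view: GroupView, event: GroupDomainEvent): GroupView {
  switch (event.type) {
    case 'GroupRenamed':
      return { ...view, name: event.name }
    case 'GroupMemberAdded':
      return {
        ...view,
        members: [...view.members, { id: event.memberId, name: event.name }],
      }
  }
}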
But of course it depends. If your solution fits your application's needs, then it'll be fine to do it that way. Rules are made to show you the right direction, but they're also meant to be broken when they get in your way (and you know what you're doing).
I am trying to save multiple documents to multiple collections at once, in one transaction, so that if one of the saves fails, all the saved documents are rolled back.
I am using Spring Boot and Spring Data, and the MongoDB API to connect to Cosmos DB in Azure. I have read in their portal that this can be done by writing a stored procedure. But is there a way to do it from code, like how Spring has the @Transactional annotation?
Any help is really appreciated.
The only way you can write transactionally is with a stored procedure, or via Transactional Batch operations (SDK-based, in a subset of the language SDKs, currently .NET and Java). But that won't help in your case:
Stored procedures are specific to the core (SQL) API; they're not for the MongoDB API.
Transactions are scoped to a single partition within a single collection. You cannot transactionally write across collections, regardless whether you use the MongoDB API or the core (SQL) API.
You'll really need to ask whether transactions are absolutely necessary. Maybe you can use some type of durable-messaging approach to manage your content updates. But there's simply nothing built in that will allow you to do this natively in Cosmos DB.
Transactions across partitions and collections are indeed not supported. If you really need a rollback mechanism, it might be worthwhile to check out the event sourcing pattern, as you might then be able to capture events instead of updating master entities. You could then easily delete those events, but other processes might still have executed using the incorrect events.
We created a sort of unit of work. We register all changes to the data model, including the events and messages that are to be sent. Only when we call commit are the changes persisted to the database, in the following order:
Commit updates
Commit deletes
Commit inserts
Send messages
Send events
Still, it's not watertight, but it avoids sending out messages/events/modifications to the data model as long as the calling process is not ready to do so (e.g. due to an error). This UnitOfWork is passed through our domain services to allow all operations of our command to be handled in one batch. It's then up to the developer to decide whether a certain operation can be committed as part of a bigger operation (same UoW) or independently (new UoW).
We then wrapped our command handlers in a Polly policy to retry in case of update conflicts. Theoretically, though, we could get an update conflict on the 2nd update, which could cause an inconsistent data model, but we keep this in mind when using the UoW.
It's not watertight, but hopefully it helps!
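For illustration only, a tiny sketch of that idea (the real implementation is more involved and not in TypeScript; all names below are made up):

type Work = () => Promise<void>

// Collects pending work and flushes it in a fixed order on commit:
// updates, deletes, inserts, then messages and events.
class UnitOfWork {
  private updates: Work[] = []
  private deletes: Work[] = []
  private inserts: Work[] = []
  private messages: Work[] = []
  private events: Work[] = []

  registerUpdate(w: Work) { this.updates.push(w) }
  registerDelete(w: Work) { this.deletes.push(w) }
  registerInsert(w: Work) { this.inserts.push(w) }
  registerMessage(w: Work) { this.messages.push(w) }
  registerEvent(w: Work) { this.events.push(w) }

  async commit(): Promise<void> {
    // Nothing leaves the unit of work until commit() is called; if an earlier
    // step throws, the later steps (including messages/events) never run.
    for (const work of [
      ...this.updates,
      ...this.deletes,
      ...this.inserts,
      ...this.messages,
      ...this.events,
    ]) {
      await work()
    }
  }
}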
Yes, transactions are supported in Cosmos DB with the MongoDB API. I believe it's a fairly new addition, but it's simple enough and described here.
I don't know how well it's supported in Spring Boot, but at least it's doable.
// start transaction
var session = db.getMongo().startSession();
var friendsCollection = session.getDatabase("users").friends;
session.startTransaction();

// operations in transaction
try {
    friendsCollection.updateOne({ name: "Tom" }, { $set: { friendOf: "Mike" } });
    friendsCollection.updateOne({ name: "Mike" }, { $set: { friendOf: "Tom" } });
} catch (error) {
    // abort transaction on error
    session.abortTransaction();
    throw error;
}

// commit transaction
session.commitTransaction();
We are using microservices, CQRS, and an event store with nodejs cqrs-domain. Everything works like a charm, and the typical flow goes like this:
1. REST -> 2. Service -> 3. Command validation -> 4. Command -> 5. Aggregate -> 6. Event -> 7. Event store (transactional data) -> 8. Returns aggregate with aggregate ID -> 9. Store in the microservice's local DB (essentially the read DB) -> 10. Publish event to the queue
The problem with the flow above is that the transactional save (persistence to the event store) and the write to the microservice's read DB happen in different transaction contexts. If there is any failure at step 9, how should I handle the event which has already been persisted to the event store and the aggregate which has already been updated?
Any suggestions would be highly appreciated.
The problem with the flow above is that the transactional save (persistence to the event store) and the write to the microservice's read DB happen in different transaction contexts. If there is any failure at step 9, how should I handle the event which has already been persisted to the event store and the aggregate which has already been updated?
You retry it later.
The "book of record" is the event store. The downstream views (the "published events", the read models) are derived from the book of record. They are typically behind the book of record in time (eventual consistency) and are not typically synchronized with each other.
So you might have, at some point in time, 105 events written to the book of record, but only 100 published to the queue, and a representation in your service database constructed from only 98.
Updating a view is typically done in one of two ways. You can, of course, start with a brand new representation and replay all of the events into it as part of each update. Alternatively, you track in the metadata of the view how far along in the event history you have already gotten, and use that information to determine where the next read of the event history begins.
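A hedged sketch of that second option, tracking the checkpoint in the view's metadata (all names below are hypothetical):

interface StoredEvent { position: number; type: string; payload: unknown }

interface EventStoreReader {
  // Returns up to `limit` events recorded after the given position.
  readFrom(position: number, limit: number): Promise<StoredEvent[]>
}

interface ViewStore {
  loadCheckpoint(viewName: string): Promise<number>            // last processed position
  apply(event: StoredEvent): Promise<void>                     // update the read model
  saveCheckpoint(viewName: string, position: number): Promise<void>
}

// Pulls the events the view has not seen yet and advances the checkpoint.
// Running this on a schedule (or when the queue signals that new events exist)
// keeps the read model eventually consistent with the book of record.
async function updateView(
  store: EventStoreReader,
  view: ViewStore,
  viewName: string
): Promise<void> {
  let position = await view.loadCheckpoint(viewName)
  for (;;) {
    const batch = await store.readFrom(position, 100)
    if (batch.length === 0) break
    for (const event of batch) {
      await view.apply(event)
      position = event.position
    }
    await view.saveCheckpoint(viewName, position)
  }
}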
Inside your event store, you could track whether read-side replication was successful.
As soon as step 9 succeeds, you can flag the event as 'replicated'.
That way, you could introduce a component that watches for unreplicated events and triggers step 9 again. You could also track whether the replication failed multiple times.
Updating the read side (step 9) and flagging an event as replicated should happen consistently. You could use a saga pattern here.
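A minimal sketch of such a watcher, with invented names, assuming the event store can report which events are not yet flagged as replicated:

interface PendingEvent { id: string; payload: unknown; attempts: number }

interface EventFlags {
  findUnreplicated(limit: number): Promise<PendingEvent[]>
  markReplicated(eventId: string): Promise<void>
  recordFailure(eventId: string): Promise<void>
}

// Periodically retries step 9 for events that were stored but never made it
// into the read DB, and records how often each one has failed.
async function replicatePending(
  flags: EventFlags,
  replicate: (event: PendingEvent) => Promise<void>
): Promise<void> {
  const pending = await flags.findUnreplicated(50)
  for (const event of pending) {
    try {
      await replicate(event)            // step 9: update the read-side DB
      await flags.markReplicated(event.id)
    } catch {
      await flags.recordFailure(event.id)
    }
  }
}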
I think I have now understood it to a better extent.
The aggregate would still be created; the answer is that all validations for any type of consistency should happen before my aggregate is constructed. A failure while updating the read-side DB of the microservice is beyond the purview of that code and needs to be handled.
So in the ideal case the aggregate would be created, but the associated event would remain undispatched until all the read dependencies are updated; if they are not, it stays undispatched and can be handled separately.
The event store will still have all the events, and eventual consistency is maintained this way.
I was trying to understand ES+CQRS and which tech stack can be used.
As per my understanding, the flow should be as below.
UI sends a request to the Controller (HTTP Adapter).
Controller calls the application service, passing the Request Object as a parameter.
Application Service creates a Command from the Request Object passed from the controller.
Application Service passes this Command to the Message Consumer.
Message Consumer publishes the Command to the message broker (RabbitMQ).
Two subscribers will be listening for the above command:
a. One subscriber will generate the Aggregate from the event store using the command and will apply the command; the generated event will then be stored in the event store.
b. Another subscriber will be at the VIEW end; it will populate data in the view database/cache.
Kindly suggest whether my understanding is correct.
Kindly suggest whether my understanding is correct
I think you've gotten a bit tangled in your middleware.
As a rule, CQRS means that the writes happen to one data model, and reads in another. So the views aren't watching commands, they are watching the book of record.
So in the subscriber that actually processes the command, the command handler will load the current state from the book of record into memory, update the copy in memory according to the domain model, and then replace the state in the book of record with the updated version.
Having updated the book of record, we can now trigger a refresh of the data model that backs the view; no business logic is run here, this is purely a transform of the data from the model we use for writes to the model we use for reads.
When we add event sourcing, this pattern is the same -- the distinction is that the data model we use for writes is a history of events.
How is atomicity achieved between writing data to the event store and writing data to the VIEW model?
It's not -- we don't try to make those two actions atomic.
how do we handle it if the event is stored in the EventStore but the system crashed before we sent the event to the Message Queue
The key idea is to realize that we typically build new views by reading events out of the event store; not by reading the events out of the message queue. The events in the queue just tell us that an update is available. In the absence of events appearing in the message queue, we can still poll the event store watching for updates.
Therefore, if the event store is unreachable, you just leave the stale copy of the view in place, and wait for the system to recover.
If the event store is reachable, but the message queue isn't, then you update the view (if necessary) on some predetermined schedule.
This is where the eventual consistency part comes in. Given a successful write into the event store, we are promising that the effects of that write will be visible in a finite amount of time.