Event Aggregator Error Handling With Rollback

I've been studying many of the common ways that developers design/architect an application using domain-driven design (I'm still trying to understand the concept as a whole). Some of the examples I saw included the use of events via an event aggregator. I liked the concept because it truly keeps the different elements/domains of an application decoupled.
A concern that I have is: how do you roll back an operation in the case of an error?
For example:
Say I have an order application that has to save an order to the database and also save a copy of the order as a PDF to a CMS. The application fires an event that a new order has been created, and the PDF service that subscribes to this event saves the PDF. Meanwhile, when committing the order changes to the database, an exception is thrown. The problem is that the PDF has been saved but there isn't a matching database record.
Should I cache the previously handled events and fire a new error event that looks to the cache for "undo" operations? Use something like the command pattern for this?
Or... is the event aggregator just not a good pattern for this?
Edit
I'm starting to think that maybe events should be used for less "mission critical" items, such as emailing and logging.
My initial thought was to limit dependencies by using the event aggregator pattern.

You want the event to be committed in the same transaction as the operation on your database.
In this particular scenario, you can push the event onto a queue that enlists in your transaction, so that the event will never go out unless the aggregate is persisted. This makes creating the PDF eventually consistent; if creating the PDF fails, you can fix the problem and have it automatically retried.
Maybe you can find more inspiration in one of my previous posts on eventually consistent domain events with RavenDB and IronMQ.
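The same effect can be sketched with a plain database-backed queue table: a minimal example in JDBC (table names and the connection string are made up), where the order row and the event row commit or roll back together, so the PDF service never sees an event for an order that was never persisted.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class OrderService {
    public void createOrder(String orderId, String payload) throws Exception {
        try (Connection con = DriverManager.getConnection("jdbc:postgresql://localhost/orders")) {
            con.setAutoCommit(false); // one atomic unit of work
            try (PreparedStatement order = con.prepareStatement(
                     "INSERT INTO orders (id, payload) VALUES (?, ?)");
                 PreparedStatement event = con.prepareStatement(
                     "INSERT INTO event_queue (event_type, body) VALUES (?, ?)")) {
                order.setString(1, orderId);
                order.setString(2, payload);
                order.executeUpdate();
                event.setString(1, "OrderCreated");
                event.setString(2, payload);
                event.executeUpdate();
                con.commit(); // both rows or neither; the PDF service consumes the queue later
            } catch (Exception e) {
                con.rollback(); // DB failure: the event row is rolled back too, no orphan PDF
                throw e;
            }
        }
    }
}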

Handling an event before it actually happened (committed) only works if the event handler participates in the transaction. Make the event handler transactional (for instance by storing the PDF in a database), or publish and handle events after the transaction committed.
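For example, with Spring you could defer the handler to after the commit. A minimal sketch, assuming Spring's @TransactionalEventListener; the OrderCreatedEvent and PdfService types are hypothetical:

import org.springframework.stereotype.Component;
import org.springframework.transaction.event.TransactionPhase;
import org.springframework.transaction.event.TransactionalEventListener;

@Component
public class PdfEventHandler {

    private final PdfService pdfService; // hypothetical service that writes to the CMS

    public PdfEventHandler(PdfService pdfService) {
        this.pdfService = pdfService;
    }

    // Runs only AFTER the surrounding database transaction has committed, so
    // the PDF is never written for an order that failed to persist. Trade-off:
    // if the PDF step fails here, you need your own retry mechanism.
    @TransactionalEventListener(phase = TransactionPhase.AFTER_COMMIT)
    public void on(OrderCreatedEvent event) {
        pdfService.savePdf(event.getOrderId());
    }
}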

Related

Atomically update database and send message. Outbox pattern or not?

You have a command/operation where you both need to save something in the database and send an event/message to another system. For example, you have an OrderService, and when a new order is created you want to publish an "OrderCreated" event for another system (or systems) to react on (either as a direct message or via a message broker) and do something.
The easiest (and naive) implementation is to save in the db and, if successful, then send the message. But of course this is not bulletproof, because the other service/message broker may be down, or your service may crash before sending the message.
One (and common?) solution is to implement the "outbox pattern": instead of publishing messages directly, you save the message to an outbox table in your local database as part of your database transaction (in this example, saving to the outbox table as well as the order table), and have a different process (polling the db or using change data capture) read the outbox table and publish the messages.
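To illustrate, a sketch of the relay side as I understand it (Spring JdbcTemplate/RabbitTemplate; table and exchange names invented):

import java.util.List;
import java.util.Map;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class OutboxRelay {
    private final JdbcTemplate jdbc;
    private final RabbitTemplate rabbit;

    public OutboxRelay(JdbcTemplate jdbc, RabbitTemplate rabbit) {
        this.jdbc = jdbc;
        this.rabbit = rabbit;
    }

    @Scheduled(fixedDelay = 1000) // poll every second; requires @EnableScheduling
    public void publishPending() {
        List<Map<String, Object>> rows = jdbc.queryForList(
            "SELECT id, event_type, body FROM outbox WHERE published = false");
        for (Map<String, Object> row : rows) {
            // Publish first, then mark; a crash in between means the row is
            // re-sent next round, i.e. at-least-once delivery (consumers must
            // therefore be idempotent).
            rabbit.convertAndSend("orders-exchange",
                    (String) row.get("event_type"), (String) row.get("body"));
            jdbc.update("UPDATE outbox SET published = true WHERE id = ?", row.get("id"));
        }
    }
}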
What is your solution to this dilemma, i.e. "update the database and send the message, or do neither"? Note: I am not talking about using SAGAs (this could be part of a SAGA, but that is the next level).
I have in the past used different approaches:
"Do nothing", i.e just try to send the message and hope it will be sent. Which might be fine in some cases especially with a stable message broker running on same machine.
Using DTC (in my case MSDTC). Besides all the problems with DTC, it might not work with your current solution.
Outbox pattern
Using an orchestrator which will retry the process if it has not received a "completed" event.
In my current project this is not handled well, IMO, and I want to change it to be more resilient and self-correcting. Sometimes, when a service calls another service and the call fails, the user might retry and it might work fine. But some operations might require our support team to fix things (if the failure is even discovered).
ATM it is not a microservice solution, but rather two large (legacy) monoliths communicating while running on the same server; we are moving to a microservice architecture in the near future, though, and might then run on multiple machines.

Spring boot Axon complete rollback

How do I roll back a transaction for an event which created data in the database? I don't want to change a status or anything; I need to roll back (not persist) the changes. Could anyone provide or suggest a way to achieve this?
Axon Framework will roll back operations to a database depending on the type of failure you're getting.
If the exception is thrown by the @EventHandler annotated method, the ListenerInvocationErrorHandler catches it. By default, this logs the message and proceeds. Differently put, the operation is not rolled back.
If you decide to rethrow the exception in the ListenerInvocationErrorHandler or the exception occurs during the processing of the event batch, Axon's ErrorHandler will catch it.
By default, the ErrorHandler rethrows the exception, causing the Event Processor to catch it. This will make it so that the event handling task is rolled back.
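For instance, the switch from "log and proceed" to "rethrow and roll back" can be made with Axon's built-in PropagatingErrorHandler. A minimal sketch of the configuration, assuming standard Spring Boot wiring:

import org.axonframework.config.EventProcessingConfigurer;
import org.axonframework.eventhandling.PropagatingErrorHandler;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Configuration;

@Configuration
public class AxonErrorConfig {

    @Autowired
    public void configure(EventProcessingConfigurer configurer) {
        // Rethrow instead of log-and-proceed: the Event Processor now catches
        // the exception, rolls back the event handling task, and retries it.
        configurer.registerDefaultListenerInvocationErrorHandler(
                conf -> PropagatingErrorHandler.instance());
    }
}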
However, from your comments, it seems you're not looking for an Event Handling task to roll back on its own, but for an Event Handling task to roll back a previous Event Handling task. Note that from Axon's perspective, the Event was handled, so the change simply occurred.
If you want something like this to be undone, you could think about options like replaying altogether, adjusting the database yourself, or dispatching a compensating action. Whether any of those is the right fit is unclear to me, as I am unaware of what you're actually trying to roll back, and why.
Hence, I'd like you to provide a bit more background on what your exact scenario is, @ray. If you could please update your question with a description of the use case, that would be great.
In the meantime, here are some useful links from the Axon Framework documentation around Event Processing:
Event Processors
Event Processor Error Handling
Streaming Event Processors

How to migrate to Event-Sourcing?

We are migrating from a legacy monolith application to a microservice architecture. We use the CQRS and event sourcing patterns and a message broker (RabbitMQ) as the communication mechanism. Now we are facing a challenge: how can we convert the old database to the new architecture, and how can we use event sourcing for it? Assuming the old database did not have events, can we do the data conversion without creating events? What is the starting point of our old database's data in the event sourcing pattern?
One important thing to remember is that many databases internally event source: every write goes to a log and that log is used to update tables, replicate etc., after which the log is truncated. It's equivalent to event sourcing with a lot of snapshots and very little retention of events and old snapshots.
In these databases (which include the likes of Postgres, MySQL, Oracle, SQL Server, Cassandra, CosmosDB, to name ones I know from experience do this), there's a technique called Change Data Capture which essentially taps into the log and exposes a stream of changes to the database which can be treated as events from the database (or by extension as commands: "one service's events are another service's commands"). Debezium can be used to write CDC records to Kafka; for RabbitMQ you may need to roll something yourself, in which case you'll want to get acquainted with how CDC is exposed in your database.
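As a sketch of that "roll something yourself" route, Debezium's embedded engine lets you receive change events in-process and forward them wherever you like (e.g. to RabbitMQ). Connector properties are abbreviated and names invented:

import io.debezium.engine.ChangeEvent;
import io.debezium.engine.DebeziumEngine;
import io.debezium.engine.format.Json;

import java.util.Properties;
import java.util.concurrent.Executors;

public class CdcToRabbit {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("name", "orders-cdc");
        props.setProperty("connector.class", "io.debezium.connector.postgresql.PostgresConnector");
        props.setProperty("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
        props.setProperty("offset.storage.file.filename", "/tmp/offsets.dat");
        // ...database hostname/port/user/password/dbname settings elided...

        DebeziumEngine<ChangeEvent<String, String>> engine =
            DebeziumEngine.create(Json.class)
                .using(props)
                .notifying(record -> {
                    // Each change row becomes a message; publish it to RabbitMQ
                    // here (e.g. with the RabbitMQ Java client or RabbitTemplate).
                    System.out.println("change: " + record.value());
                })
                .build();

        Executors.newSingleThreadExecutor().execute(engine);
    }
}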
Even if the database doesn't support CDC, if the data isn't that large, you can often turn it into an ersatz event stream by periodically dumping its data (if the records are timestamped, this can even work if the data is particularly slow moving) and implementing a service to track what changed: this won't tell you about changes that netted out, but it's often better than nothing. This sort of dump is also likely to be required if you need a "genesis" event to ensure that your initial state is current to when you moved to event-sourcing or CDC.
This whole broad family of techniques has limitations compared to full event sourcing: reifying what changed is not as valuable as reifying what changed and why it changed. But it can be a useful middle ground in migrating to event-sourcing.
Referring to @alexey-zimarev's answer on this post: it's essential to have a starting event in your event-sourced database. You cannot rehydrate an event-sourced aggregate without replaying its events. Therefore, you need to map the legacy snapshot to an initial domain event for each relevant aggregate.
Either way, consider the event sourcing definition by Martin Fowler:
"The fundamental idea of Event Sourcing is that of ensuring every change to the state of an application is captured in an event object, and that these event objects are themselves stored in the sequence they were applied for the same lifetime as the application state itself."
So it's not an appropriate solution to migrate legacy snapshots into the new system without extracting and storing domain events. That would turn your event-sourced project into a semi-event-sourced project, which is not a recognized paradigm for design and development.
You have an event store, which is a database for events. You can create the event data that you need from the old database and insert it into the event store. After that, replay the events to create the read models.
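A sketch of that migration loop (the EventStore interface and the OrderMigratedFromLegacy event are hypothetical stand-ins for whatever store client and event types you actually use):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LegacyMigration {

    interface EventStore { // stand-in for your real event store client
        void append(String streamId, Object event);
    }

    record OrderMigratedFromLegacy(String orderId, String status, String payload) {}

    public void migrate(EventStore store) throws Exception {
        try (Connection con = DriverManager.getConnection("jdbc:postgresql://localhost/legacy");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT id, status, payload FROM orders")) {
            while (rs.next()) {
                // One "genesis" event per aggregate captures the snapshot state
                // at the moment of migration; later changes become real events.
                store.append("Order-" + rs.getString("id"),
                    new OrderMigratedFromLegacy(rs.getString("id"),
                        rs.getString("status"), rs.getString("payload")));
            }
        }
    }
}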

Should we store Events in a database? (Event Driven Design)

We have several services that publish and subscribe to Domain Events. What we usually do is log events whenever we publish them and whenever we process them. We basically use this to apply the choreography pattern.
We are not doing Event Sourcing in these systems, and there's no programmatic use for the events after publishing/processing. That's the main reason we opted not to store them in a durable container, like a database or event store.
Question is, are we missing some fundamental thing by doing this?
Is storing Events a must?
I consider queued messages as system messages, even if they represent some domain event in an event-driven architecture (pub/sub messaging).
There is absolutely no hard-and-fast rule about their storage. If you would like to keep them around you could have your messaging mechanism forward them to some auditing endpoint for storage and then remove them after some time (if necessary).
You are not missing anything fundamental by not storing them.
You're definitely not missing out on anything (but there is a catch), especially if the business has no need for it. An event-sourced system would definitely store all the events generated by the system in a database (or some other event store).
The main use of an event store is to be able to restore the system to its current state after a failure by replaying the messages. To make this recovery process faster, we have snapshots.
In your case, since these events are only relevant until the process is completed, it would not make sense to store them, until you have a failure (this is the catch), especially in a distributed-transaction scenario.
What would I suggest?
Don't store the events themselves, but log the relevant details about these events, and maybe use an ELK stack or Grafana to store those logs.
Use either the Saga pattern or the Routing Slip pattern in the case of a distributed transaction, and log those steps as well.
If a failure occurs while processing an event, put that event into an exception queue and handle it there. If it's part of a distributed transaction, make sure the messages either all have the same TransactionId or carry a CorrelationId, so you can look up the logs and save your system.
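As a sketch of such an exception queue using RabbitMQ's dead-lettering (all queue/exchange names invented):

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

import java.util.Map;

public class ExceptionQueueSetup {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        try (Connection conn = factory.newConnection();
             Channel ch = conn.createChannel()) {
            // The exception queue, fed by a dead-letter exchange.
            ch.exchangeDeclare("orders.dlx", "fanout", true);
            ch.queueDeclare("orders.exception", true, false, false, null);
            ch.queueBind("orders.exception", "orders.dlx", "");

            // Messages rejected from the work queue (basicReject/basicNack
            // without requeue) are routed to the DLX; keep the CorrelationId
            // message property so failures can be matched against your logs.
            ch.queueDeclare("orders.work", true, false, false,
                    Map.of("x-dead-letter-exchange", "orders.dlx"));
        }
    }
}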
For reliably performing your business transactions in a distributed architecture, you somehow need to make sure that your events are published at least once.
So a service that publishes events needs to persist such an event within the same transaction as the operation that caused it.
Considering you are publishing events via infrastructure services (e.g. a messaging service), you cannot rely on them being available all the time.
Also, your own service instance could go down after persisting your newly created or changed aggregate but before it had the chance to publish the event via, for instance, a messaging service.
Question is, are we missing some fundamental thing by doing this? Is storing Events a must?
It doesn't matter that you are not doing event sourcing. Unless it is okay from the business perspective to sometimes lose an event forever, you need to temporarily persist your event within your local transaction until it has been published.
You can look into the Transactional Outbox Pattern to achieve reliable event publishing.
Note: Logging/tracking your events somehow for monitoring or later analyzing/reporting purpose is a different thing and has another motivation.

How to share events code in a microservice architecture

I'm working on a "microservice-like" architecture. Each microservice can fire some events to RabbitMQ. The events are identified by an event code. At the moment, the code of the event triggered is a hard-coded const string declared inside the microservice that fires the event.
My problem is that each microservice that wants to subscribe to this event must duplicate this event code string. This is error-prone, especially when an event code is renamed, because all the microservices that subscribed to it need to be changed accordingly... which is very bad.
I see the following possible alternatives:
Declare the event code only in the microservice that fires the event, and let the consumer microservices directly access the code declared there. In this case, the event is declared once, but it creates a source-code dependency between microservices... which is bad.
Create a source file (outside all microservices) that contains all the event codes of the whole application, shared by all microservices. In this case, each event is declared once, but it creates a global dependency for all microservices, which is against the single responsibility principle... which is bad.
How do you tackle this problem?
At the moment, the code of the event triggered is a hard-coded const string declared inside the microservice that fires the event. My problem is that each microservice that wants to subscribe to this event must duplicate this event code string. This is error-prone, especially when an event code is renamed, because all the microservices that subscribed to it need to be changed accordingly... which is very bad.
Events are messages. All of the constraints that we use to manage the evolution of messages applies to events as well.
In a microservices architecture, we expect to be able to deploy instances of the services independently of one another. Requiring that all of the services shut down together to coordinate a change in message schema kind of misses the point. That in turn implies that we need to design reasonable behaviors for the cases where the producer and consumer don't have matching understandings of the message.
In practice, this means something like
We never introduce a new required field, only optional fields (with documented default values).
Unrecognized fields are ignored (but forwarded)
Consumers of optional fields know which default value to use when an expected field is missing.
When these constraints cannot be satisfied, then you are introducing a new message.
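A sketch of what these constraints look like on the consuming side, using Jackson as the (assumed) serializer; the event shape is illustrative:

import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

public class TolerantConsumer {

    public static class OrderCreated {
        public String orderId;              // required from day one
        public String currency = "EUR";     // optional, documented default
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper()
            // a newer producer may add fields this consumer doesn't know yet
            .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);

        OrderCreated evt = mapper.readValue(
            "{\"orderId\":\"42\",\"discountCode\":\"NEW\"}", OrderCreated.class);
        System.out.println(evt.orderId + " " + evt.currency); // prints: 42 EUR
    }
}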
If you have the message contracts in place, then you aren't restricting yourself to microservice implementations that share the same runtime platform (because two different implementations of the same contract are equivalent).
Recommend reading:
ZeroMQ RFC 42/C4, specifically section 2.6 which describes the evolution of public contracts
Versioning in an Event Sourced System, specifically "Basic Type Based Versioning"
