Is there any way to replay events in a date range?

I am implementing an example with spring-boot and Axon. I have two events
(deposit and withdraw account balance). Is there any way to get the state of the Account Aggregate at a given date?
I want not just the final state, but to replay the events in a range of dates.

I think I can help with this.
In the context of Axon Framework, you can start a replay of events by telling a given TrackingEventProcessor to 'reset' its tokens. By the way, the current description of this in the Reference Guide can be found here.
These TrackingTokens are the objects which know how far a given TrackingEventProcessor is in terms of handling events from the Event Stream. Thus resetting/adjusting these TrackingTokens is what issues a replay of events.
Knowing all this, the second step is to look at the methods TrackingEventProcessor provides to reset tokens, of which there are three:
TrackingEventProcessor#resetTokens()
TrackingEventProcessor#resetTokens(Function<StreamableMessageSource, TrackingToken>)
TrackingEventProcessor#resetTokens(TrackingToken)
Option one will reset your tokens to the beginning of the event stream, which will thus replay everything.
Options two and three, however, give you the opportunity to provide a TrackingToken.
Thus, you could provide a TrackingToken starting from several points on the Event Stream. So, how do you go about creating such a TrackingToken at a specific point in time? To that end, you should take a look at the StreamableMessageSource interface, which has the following operations:
StreamableMessageSource#createTailToken()
StreamableMessageSource#createHeadToken()
StreamableMessageSource#createTokenAt(Instant)
StreamableMessageSource#createTokenSince(Duration)
Option 1 creates a token at the start (tail) of the stream, whilst option 2 creates a token at the head of the stream.
Options 3 and 4, however, allow you to create a token at a specific point in time, thus allowing you to replay all the events from the given instant up to now.
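Putting those pieces together, a minimal sketch of triggering such a replay could look like the following. It assumes Axon 4 with an injected EventProcessingConfiguration; the processor name "accountProjection" and the instant are placeholders:

import java.time.Instant;
import org.axonframework.config.EventProcessingConfiguration;
import org.axonframework.eventhandling.TrackingEventProcessor;

public class ReplayService {

    private final EventProcessingConfiguration processingConfiguration;

    public ReplayService(EventProcessingConfiguration processingConfiguration) {
        this.processingConfiguration = processingConfiguration;
    }

    public void replaySince(Instant pointInTime) {
        processingConfiguration
                .eventProcessor("accountProjection", TrackingEventProcessor.class)
                .ifPresent(processor -> {
                    processor.shutDown();       // a processor must be stopped before resetting tokens
                    processor.resetTokens(      // option two: build the token from the message source
                            source -> source.createTokenAt(pointInTime));
                    processor.start();          // event handling resumes, replaying from the new token
                });
    }
}

Note that this replays events from the given instant up to now; there is no built-in "stop at date", so to honour the end of a date range, one common approach is to let the event handlers themselves ignore events timestamped past the range's end.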
There is one caveat in this scenario however. You're asking to replay an Aggregate. From Axon's perspective, by default the Aggregate is the Command Model in a CQRS set up, thus dealing with Commands going into your system. In the majority of applications, you want Commands (e.g. the requests to change something) to occur on the current state of the application. As such, the Repository provided to retrieve an Aggregate does not allow specifying a point in time.
The above described solution in regards to replaying is thus solely tied to Query Model creation, as the TrackingEventProcessor is part of the Event Handling side of your application, most often used to create views. This idea also ties in with your question: you want to know the "state of the Account Aggregate" at a given point in time. That's not a command, but a query, as you have 'a request for data' instead of 'a request to change state'.
Hope this helps you out, @Safe!

Related

GA3 Event Push: Necessary fields in Request

I am trying to push an event towards GA3, mimicking an event sent by a browser to GA. With this event I want to fill Custom Dimensions (visible in the user explorer) and relate them to a GA ID which has visited the website earlier. Could this be done without influencing website data too much? I want to enrich someone's data from an external source.
So far I can't seem to find the minimum fields which have to be in the event call for this to work. I've got these so far:
v=1&
_v=j96d&
a=1620641575&
t=event&
_s=1&
sd=24-bit&
sr=2560x1440&
vp=510x1287&
je=0&
_u=QACAAEAB~&
jid=&
gjid=&
_u=QACAAEAB~&
cid=GAID&
tid=UA-x&
_gid=GAID&
gtm=gtm&
z=355736517&
uip=1.2.3.4&
ea=x&
el=x&
ec=x&
ni=1&
cd1=GAID&
cd2=Companyx&
dl=https%3A%2F%2Fexample.nl%2F&
ul=nl-nl&
de=UTF-8&
dt=example&
cd3=CEO
So far the Custom Dimension fields don't get overwritten with new values. Who knows which one is missing, or can share a list of necessary fields and example values?
Ok, a few things:
1. The CD value will be overwritten only if, in GA, this CD's scope is set to the user level. Make sure it is.
2. You need to know the client id of the user. You can confirm that you have the right CID by using the user explorer in the GA interface, unless you track it in a CD; it allows filtering by client id.
3. You want to make this hit non-interactional, otherwise you're inflating the session count, since GA will generate sessions for normal hits. A non-interactional hit has ni=1 among the params.
4. Wait. Scope calculations don't happen immediately in real time; they happen later on. Give it two days, then check the results and re-conduct your experiment.
5. Use a throwaway/test/lower GA property to experiment. You don't want to affect production data while not knowing exactly what you're doing.
A good use case for such an activity would be something like updating a lifetime value of existing users, wanting to enrich the data with it without waiting for all of them to come in. That's useful for targeting, attribution and more.
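On the minimum-fields question: per the Measurement Protocol parameter reference, every hit requires v, tid, cid (or uid) and t, and an event hit additionally requires ec and ea. A minimal sketch of such a hit, with placeholder values plus the ni=1 and user-scoped CD parts discussed here (Java 11 HttpClient; validating against the /debug/collect endpoint rather than production):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class MinimalGaEventHit {
    public static void main(String[] args) throws Exception {
        // Only the required event-hit fields, plus ni=1 and the user-scoped
        // custom dimension to (over)write. All values are placeholders.
        String payload = String.join("&",
                "v=1",              // protocol version (required)
                "tid=UA-XXXXX-Y",   // property id (required)
                "cid=GAID",         // client id of the existing user (required)
                "t=event",          // hit type (required)
                "ec=enrichment",    // event category (required for t=event)
                "ea=update",        // event action (required for t=event)
                "ni=1",             // non-interaction: don't inflate sessions
                "cd1=Companyx");    // user-scoped custom dimension value

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://www.google-analytics.com/debug/collect"))
                .POST(HttpRequest.BodyPublishers.ofString(payload))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // /debug/collect returns a JSON validation report; the real endpoint
        // (/collect) returns 200 even for invalid hits.
        System.out.println(response.body());
    }
}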
Thank you.
This is the case; all CDs are user-scoped.
This is the case; we are collecting them.
ni=1 is within the parameters of each event call.
There are so many parameters; which ones are necessary?
We are using a test property for this.
We also got the Bot filtering option checked:
[screenshot: Bot filtering]
It's hard to test when the User Explorer has a delay of two days and we are still not sure which parameters to use and which not. Who could help on the parameter part? My only goal is to update the CDs on the person. Who knows which parameters need to be part of the event call?

Compensating Events on CQRS/ES Architecture

So, I'm working on a CQRS/ES project in which we are having some doubts about how to handle trivial problems that would be easy to handle in other architectures.
My scenario is the following:
I have a customer CRUD REST API, and each customer has a unique document (number), so when I'm registering a new customer I have to verify that there is no other customer with that document, to avoid duplicates. But when it comes to a CQRS/ES architecture, where we have eventual consistency, I found out that this kind of validation can be very hard to address.
It is important to notice that my problem is not across microservices, but between the command application and the query application of the same microservice.
Also we are using eventstore.
My current solution:
So what I do today is: in my command application, before saving the CustomerCreated event, I ask the query application (using PostgreSQL) if there is a customer with that document, and if not, I allow the event to go on. But that doesn't guarantee 100%, right? Because my query side can be desynchronized, so I cannot trust it 100%. That's when my second validation kicks in: when my query application is processing the events and saving them to my PostgreSQL, I check again if there is a customer with that document, and if there is, I reject that event and emit a compensating event to undo/cancel/inactivate the customer with the duplicated document, therefore finishing that customer stream on eventstore.
Although this works, there are two things that bother me here. The first is my command application relying on the query application: if my query application is down, my command side is affected (today I just return false in my validation if the query side is down, but still...). The second is: should a query/read model really be able to emit events? And if so, what is the correct way of doing it? Should the command side have some kind of API for that? Or should the query side emit the event directly to eventstore using some common shared library? And if I have more than one view/read model, which one should I choose to handle this?
I really hope someone can shine a light on these questions and help me with these matters.
For reference, you may want to be reviewing what Greg Young has written about Set Validation.
I ask the query application (using PostgreSQL) if there is a customer with that document, and if not, I allow the event to go on. But that doesn't guarantee 100%, right?
That's exactly right - your read model is a stale copy, and may not have all of the information collected by the write model.
That's when my second validation kicks in, when my query application is processing the events and saving them to my PostgreSQL, I check again if there is a customer with that document and if there is, I reject that event and emit a compensating event to undo/cancel/inactivate the customer with the duplicated document, therefore finishing that customer stream on eventstore.
This spelling doesn't quite match the usual designs. The more common implementation is that, if we detect a problem when reading data, we send a command message to the write model, telling it to straighten things out.
This is commonly referred to as a process manager, but you can think of it as the automation of a human supervisor of the system. Conceptually, a process manager is an event sourced collection of messages to be sent to the command model.
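A rough sketch of such a process manager is below; every type in it is invented for illustration, not taken from any particular framework:

// All types here are hypothetical.
interface CommandGateway { void send(Object command); }
interface DocumentIndex {
    String ownerOf(String documentNumber);              // customer currently holding it, or null
    void register(String documentNumber, String customerId);
}
record CustomerCreated(String customerId, String documentNumber) {}
record DeactivateCustomer(String customerId, String reason) {}

class DocumentUniquenessProcessManager {

    private final CommandGateway commandGateway;
    private final DocumentIndex documentIndex;

    DocumentUniquenessProcessManager(CommandGateway commandGateway, DocumentIndex documentIndex) {
        this.commandGateway = commandGateway;
        this.documentIndex = documentIndex;
    }

    // Invoked for every CustomerCreated event the projection processes.
    void on(CustomerCreated event) {
        String owner = documentIndex.ownerOf(event.documentNumber());
        if (owner != null && !owner.equals(event.customerId())) {
            // Don't emit an event from the read side; send a command so the
            // write model can straighten things out in its own transaction.
            commandGateway.send(new DeactivateCustomer(
                    event.customerId(), "duplicate document " + event.documentNumber()));
        } else {
            documentIndex.register(event.documentNumber(), event.customerId());
        }
    }
}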
You might also want to consider whether you are modeling your domain correctly. If documents are supposed to be unique, then maybe the command model should be using the document number as a key in the book of record, rather than using the customer. Or perhaps the document id should be a function of the customer data, rather than being an arbitrary input.
As far as I know, eventstore doesn't have transactions across different streams
Right - one of the things you really need to be thinking about in general is where your stream boundaries lie. If set validation has significant business value, then you really need to be thinking about getting the entire set into a single stream (or by finding a way to constrain uniqueness without using a set).
How should I send a command message to the write model? via API? via a message broker like Kafka?
That's plumbing; it doesn't really matter how you do it, so long as you are sure that the command runs within its own transaction/unit of work.
So what I do today is, in my command application, before saving the CustomerCreated event, I ask the query application (using PostgreSQL) if there is a customer with that document, and if not, I allow the event to go on. But that doesn't guarantee 100%, right? Because my query can be desynchronized, so I cannot trust it 100%.
No, you cannot safely rely on the query side, which is eventually consistent, to prevent the system from stepping into an invalid state.
You have two options:
You permit the system to enter a temporary, pending state and then, eventually, bring it into a valid, permanent state. For this you could allow the command to pass, yield a CustomerRegistered event, and, using a Saga/Process manager, verify against a collection uniquely indexed by document and issue a compensating command (not an event!), i.e. UnregisterCustomer.
Instead of sending a command, you create and start a Saga/Process that preallocates the document in a collection uniquely indexed by document and, if successful, then sends the RegisterCustomer command. You can model the Saga as an entity.
So, in both solutions you use a Saga/Process manager. In order for the system to be resilient, you should make sure that the RegisterCustomer command is idempotent (so you can resend it if the Saga fails or is restarted).
You've butted up against a fairly common problem. I think the other answer by VoiceOfUnreason is worth reading. I just wanted to make you aware of a few more options.
A simple approach I have used in the past is to create a lookup table. Your command tries to register the key in a unique constraint table. If it can reserve the key the command can go ahead.
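As a sketch of that reservation step, assuming plain JDBC and a table created as CREATE TABLE document_reservation (document_number VARCHAR PRIMARY KEY):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;
import javax.sql.DataSource;

public class DocumentReservation {

    private final DataSource dataSource;

    public DocumentReservation(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // Returns true if the document number was free and is now reserved,
    // in which case the command may go ahead.
    public boolean tryReserve(String documentNumber) throws SQLException {
        try (Connection conn = dataSource.getConnection();
             PreparedStatement stmt = conn.prepareStatement(
                     "INSERT INTO document_reservation (document_number) VALUES (?)")) {
            stmt.setString(1, documentNumber);
            stmt.executeUpdate();   // the unique constraint rejects duplicates atomically
            return true;
        } catch (SQLIntegrityConstraintViolationException e) {
            return false;           // another customer already holds this document number
        }
    }
}

One subtlety of this design: the reservation must be durable before the command is accepted, and you need a cleanup story for reservations whose commands never complete.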
Depending on the nature of the data and the domain, you could let this 'problem' occur and raise additional events to mark it. If it is something that's important to the business or the way the application works, then you can deal with it either manually or at the time via compensating commands. If the latter, it would make sense to use a process manager.
In some (rare) cases where speed/capacity is less of an issue then you could consider old-fashioned locking and transactions. Admittedly these are much better suited to CRUD style implementations but they can be used in CQRS/ES.
I have more detail on this in my blog post: How to Handle Set Based Consistency Validation in CQRS
I hope you find it helpful.

How to use compensating measures in a CQRS and DDD based application

Let's assume we host two microservices: RealEstate and Candidate.
The RealEstate service is responsible for managing rental properties, landlords and so forth.
The Candidate service provides commands to apply for a rental property.
There would be a CandidateForRentalProperty command which requires the RentalPropertyId and all necessary Candidate information.
Now the crucial point: Different types of RentalPropertys require a different set of Candidate information.
Therefore the commands and aggregates got split up:
Commands: CandidateForParkingLot, CandidateForFlat, and so forth.
Aggregates: ParkingLotCandidature, FlatCandidature, and so forth.
The UI asks the read model to decide which command has to be called.
It's reasonable for me to validate the Candidate information and all the business logic involved with that in the Candidate domain layer, but to leave out validating whether the correct command was called based on the given RentalPropertyId. Reason: multiple aggregates are involved in this validation.
The microservice should be autonomous, and its read model consumes events from the RealEstate domain, hence it's not guaranteed to be up to date. We don't want to reject candidates based on that, but rather use eventual consistency.
Yes, this could lead to inept Candidate information used for a certain kind of RentalProperty. Someone could just call the CandidateForFlat command with a parking lot rental property id.
But how do we handle the cases in which this happens?
The RealEstate domain does not know anything about Candidates.
Would there be an event handler which checks if there is something wrong and execute an appropriate command to compensate?
On the other hand, this "mapping" is domain logic and I'd like to accommodate it in the domain layer. But I don't know who's accountable for this kind of compensating measure. Would the Candidate aggregate be informed, with something like IneptApplicationTypeUsed?
As an aside - commands are usually imperative verbs. ApplyForFlat might be a better spelling than CandidateForFlat.
The pattern you are probably looking for here is that of an exception report; when the candidate service matches a CandidateForFlat message with a ParkingLot identifier, then the candidate service emits as an output a message saying "hey, we've got a problem here".
If a follow up message fixes the problem -- the candidate service gets an updated message that fixes the identifier in the CandidateForFlat message, or the candidate service gets an update from real estate announcing that the identifier actually points to a Flat -- then the candidate service can emit another message: "never mind, the problem has been fixed".
I tend to find in this pattern that the input commands to the service are really all just variations of handle(Event); the user submitted, the http request arrived; the only question is whether or not the microservice chooses to track that event. In other words, the "command" stream is just another logical event source that the microservice is subscribed to.
As you said, validation of commands should be performed at the point of command generation - at client side - where read models are available.
Command processing is performed by aggregate, so it cannot and should not check validity or existence of other aggregates. So it should trust a command issuer.
If commands come from an untrusted environment like a public API, then your API gateway becomes a client, and it should have the necessary read models to validate references.
If you want to accept a command fast and check it later, then log events like ClientAppliedForParkingLot, and have a Saga/Process manager handle further workflow by keeping its internal state, and issuing commands like AcceptApplication or RejectApplication.
I understand the need for validation but I don't think the example you gave calls for cross-Aggregate (or cross-microservice for that matter) compensating measures as stated in the Q title.
Verifications like checking that the ID the client gave along with the flat rental command matches a flat and not a parking lot, that the client has permission to do that, and so forth, are legitimate. But letting the client create such commands in the wild and waiting for an external actor to come around and enforce these rules seems subpar because the rules could be made intrinsic properties of the object originating the process.
So what I'd recommend is to change the entry point into the operation - to create the Candidature Aggregate Root as part of another Aggregate Root's behavior. If that other Aggregate (RentalProperty in our case) lives in another Bounded Context/microservice, you can maintain a list of RentalProperties in the Candidate Bounded Context with just the amount of info needed, and initiate the Candidature from there.
So you would have:
FlatCandidatureHandler ==loads==> RentalProperty ==creates==> FlatCandidature
or
FlatCandidatureHandler ==checks existence==> local RentalProperty data ==creates==> FlatCandidature
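A sketch of the first flow, with all type names invented for illustration:

// Illustrative only: the Candidature can only be created through the
// RentalProperty's behaviour, so "is this actually a flat?" is intrinsic
// to the model instead of being checked afterwards by a compensating actor.
record CandidateInfo(String name) {}
record FlatCandidature(String rentalPropertyId, CandidateInfo candidate) {}

class RentalProperty {

    enum Kind { FLAT, PARKING_LOT }

    private final String id;
    private final Kind kind;

    RentalProperty(String id, Kind kind) {
        this.id = id;
        this.kind = kind;
    }

    FlatCandidature applyForFlat(CandidateInfo candidate) {
        if (kind != Kind.FLAT) {
            throw new IllegalArgumentException("Rental property " + id + " is not a flat");
        }
        return new FlatCandidature(id, candidate);
    }
}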
As a side note, what could actually necessitate compensating actions are factors extrinsic to the root object of the process. For instance, if the property becomes unavailable in the meantime. Then whatever Aggregate holds that information should emit an event when that happens, and the compensation should be initiated.

Design of notification events

I am designing some events that will be raised when actions are performed or data changes in a system. These events will likely be consumed by many different services and will be serialized as XML, although more broadly my question also applies to the design of more modern funky things like Webhooks.
I'm specifically thinking about how to describe changes with an event and am having difficulty choosing between different implementations. Let me illustrate my quandary.
Imagine a customer is created, and a simple event is raised.
<CustomerCreated>
  <CustomerId>1234</CustomerId>
  <FullName>Bob</FullName>
  <AccountLevel>Silver</AccountLevel>
</CustomerCreated>
Now let's say Bob spends lots of money and becomes a gold customer, or indeed any other property changes (e.g.: he now prefers to be known as Robert). I could raise an event like this.
<CustomerModified>
  <CustomerId>1234</CustomerId>
  <FullName>Bob</FullName>
  <AccountLevel>Gold</AccountLevel>
</CustomerModified>
This is nice because the schemas of the Created and Modified events are the same, and any subscriber receives the complete current state of the entity. However, it is difficult for any receiver to determine which properties have changed without tracking state itself.
I then thought about an event like this.
<CustomerModified>
  <CustomerId>1234</CustomerId>
  <AccountLevel>Gold</AccountLevel>
</CustomerModified>
This is more compact and only contains the properties that have changed, but comes with the downside that the receiver must apply the changes and reassemble the current state of the entity if they need it. Also, the schemas of the Created and Modified events must be different now; CustomerId is required but all other properties are optional.
Then I came up with this.
<CustomerModified>
  <CustomerId>1234</CustomerId>
  <Before>
    <FullName>Bob</FullName>
    <AccountLevel>Silver</AccountLevel>
  </Before>
  <After>
    <FullName>Bob</FullName>
    <AccountLevel>Gold</AccountLevel>
  </After>
</CustomerModified>
This covers all bases as it contains the full current state, plus a receiver can figure out what has changed. The Before and After elements have the exact same schema type as the Created event. However, it is incredibly verbose.
I've struggled to find any good examples of events; are there any other patterns I should consider?
You tagged the question as "Event Sourcing", but your question seems to be more about Event-Driven SOA.
I agree with @Matt's answer -- "CustomerModified" is not granular enough to capture intent if there are multiple business reasons why a Customer would change.
However, I would back up even further and ask you to consider why you are storing Customer information in a local service, when it seems that you (presumably) already have a source of truth for customers. The starting point for consuming Customer information should be getting it from the source when it's needed. Storing a copy of information that can be queried reliably from the source may very well be an unnecessary optimization (and complication).
Even if you do need to store Customer data locally (and there are certainly valid reasons for needing to do so), consider passing only the data necessary to construct a query of the source of truth (the service emitting the event):
<SomeInterestingCustomerStateChange>
  <CustomerId>1234</CustomerId>
</SomeInterestingCustomerStateChange>
So these event types can be as granular as necessary, e.g. "CustomerAddressChanged" or simply "CustomerChanged", and it is up to the consumer to query for the information it needs based on the event type.
There is not a "one-size-fits-all" solution -- sometimes it does make more sense to pass the relevant data with the event. Again, I agree with @Matt's answer if this is the direction you need to move in.
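For illustration, a sketch of a consumer of such a 'thin' event; every type here is an assumption:

// Hypothetical subscriber: the event only identifies the customer, and the
// consumer queries the source of truth for whatever state it actually needs.
record SomeInterestingCustomerStateChange(String customerId) {}
record Customer(String customerId, String fullName, String accountLevel) {}
interface CustomerServiceClient { Customer findById(String customerId); } // source of truth
interface LocalCustomerStore { void upsert(Customer customer); }          // local persistence

class CustomerChangeSubscriber {

    private final CustomerServiceClient customerService;
    private final LocalCustomerStore localStore;

    CustomerChangeSubscriber(CustomerServiceClient customerService, LocalCustomerStore localStore) {
        this.customerService = customerService;
        this.localStore = localStore;
    }

    void on(SomeInterestingCustomerStateChange event) {
        Customer current = customerService.findById(event.customerId());
        localStore.upsert(current); // this consumer chooses to keep a full local copy
    }
}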
Edit Based on Comment
I would agree that using an ESB to query is generally not a good idea. Some people use an ESB this way, but IMHO it's a bad practice.
Your original question and your comments to this answer and to Matt's talk about only including fields that have changed. This would definitely be problematic in many languages, where you would have to somehow distinguish between a property being empty/null and a property not being included in the event. If the event is getting serialized/de-serialized from/to a static type, it will be painful (if not impossible) to know the difference between "First Name is being set to NULL" and "First Name is missing because it didn't change".
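To make that ambiguity concrete, consider this hypothetical static type:

// After deserializing either of these payloads into the class below,
// firstName is null in both cases -- the distinction is lost:
//   {"customerId":"1234","accountLevel":"Gold"}                   (name not changed)
//   {"customerId":"1234","firstName":null,"accountLevel":"Gold"}  (name cleared)
public class CustomerModified {
    public String customerId;
    public String firstName;    // null: "didn't change" or "explicitly cleared"?
    public String accountLevel;
}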
Based on your comment that this is about synchronization of systems, my recommendation would be to send the full set of data on each change (assuming signal+query is not an option). That leaves the interpretation of the data up to each consuming system, and limits the responsibility of the publisher to emitting a more generic event, i.e. "Customer 1234 has been modified to X state". This event seems more broadly useful than the other options, and if other systems receive this event, they can interpret it as they see fit. They can dump/rewrite their own data for Customer 1234, or they can compare it to what they have and update only what changed. Sending only what changed seems more specific to a single consumer or a specific type of consumer.
All that said, I don't think any of your proposed solutions are "right" or "wrong". You know best what will work for your unique situation.
Events should be used to describe intent as well as details. For example, you could have a CustomerRegistered event with all the details of the customer that was registered, and later in the stream a CustomerMadeGoldAccount event that only needs to capture the id of the customer whose account was changed to gold.
It's up to the consumers of the events to build up the current state of the system that they are interested in.
This allows only the most pertinent information to be stored in each event. Imagine having hundreds of properties for a customer: if every command that changed a single property had to raise an event with all the properties before and after, this would get unwieldy pretty quickly. It's also difficult to determine why a change occurred if you just publish a generic CustomerModified event, which is often a question that is asked about the current state of an entity.
Only capturing data relevant to the event means that the command that issues the event only needs enough data about the entity to validate that the command can be executed; it doesn't even need to read the whole customer entity.
Subscribers of the events also only need to build up state for the things they are interested in. For example, perhaps an 'account level' widget is listening to these events; all it needs to keep around is the customer ids and account levels, so that it can display what account level each customer is at.
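As a sketch of that idea (the field sets are illustrative):

// Intent is captured by the event type itself, not by diffing properties.
class CustomerRegistered {
    final String customerId;
    final String fullName;
    final String accountLevel;

    CustomerRegistered(String customerId, String fullName, String accountLevel) {
        this.customerId = customerId;
        this.fullName = fullName;
        this.accountLevel = accountLevel;
    }
}

class CustomerMadeGoldAccount {
    final String customerId;    // nothing more is needed for this event to be meaningful

    CustomerMadeGoldAccount(String customerId) {
        this.customerId = customerId;
    }
}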
Instead of trying to convey everything through the payload XML's fields, you can distinguish between different operations based on:
1. Different endpoint URLs depending on the operation (this is preferred).
2. An opcode (operation code) element in the XML which tells which operation is to be used to handle the incoming request (nearer to your examples).
There are a few enterprise patterns applicable to your business case - messaging and its variants - and if your system needs to be extensible, then an Enterprise Service Bus can be used. An ESB allows reliable handling and processing of events.

Events changing state in CQRS

This should be easy to follow, but after some reading I still can't find an answer.
So, say that the user needs to change his mobile number. To accomplish that, we might have a command such as ChangedUserMobileNumber, holding the new number. The domain responsible for handling the command will perform the change in the aggregate and publish an event: UserMobilePhoneChanged.
There is a subscriber for that event in another domain, which also holds the user's mobile number in its aggregate. But according to our software architect, events cannot hold any data, so what we end up with is rather stupid, to say the least:
Domain 1 receives the command to update the mobile number; the number is updated and one event is published. Also, because the event cannot hold data, the command handler in Domain 1 issues yet another command, which is sent to Domain 2. The subscriber of the event lives in Domain 2 too, so we have a Saga to handle both the event and the command.
In terms of implementation we are using NServiceBus, so we have this saga to handle these messages, and in it we have this line of code, where the entity.IsMobilePhoneUpdated field stored in a saga entity is changed when the event is handled.
bool isReady = (entity.IsMobilePhoneUpdated && entity.MobilePhoneNumber != null);
Effectively the Saga is started by both the command and the event raised in Domain 1, and until this condition is met, the saga is kept alive.
If it was up to me, I would be sending the mobile number in the event itself, I just want to get a few other opinions on this.
Thanks
I'm not sure how a UserMobilePhoneChanged event could be useful in any way unless it contained the new phone number. User asks to change a number, the event shoots out that it has. Should be very simple indeed. Why does your architect say that events shouldn't contain any information?
In the first event-based system I designed, events also had no data. I enforced that rule too. At the time it sounded like a clever decision. After a while I realised it was dumb, and I was making a lot of workarounds because of it. It also caused a lot of querying from the event subscribers, even for trivial data. I had no problem changing this "rule" once I realised I was doing it wrong.
Events should have all the data required to make them meaningful. They should also have only the data that makes sense for that event. (No point in having the user's address in a ChangePhoneNumber message.)
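For example, a minimal shape for that event (illustrative):

// The event carries exactly the data that makes it meaningful: who, and the new number.
final class UserMobilePhoneChanged {
    private final String userId;
    private final String newMobileNumber;

    UserMobilePhoneChanged(String userId, String newMobileNumber) {
        this.userId = userId;
        this.newMobileNumber = newMobileNumber;
    }

    String userId() { return userId; }
    String newMobileNumber() { return newMobileNumber; }
}

With the number in the event, Domain 2's subscriber can update its own aggregate directly, and the extra command, the saga and its IsMobilePhoneUpdated bookkeeping all disappear.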
If your architect imposes such a restriction, it's not going to be easy to develop a CQRS system. How are the read models updated? Since the events have no data, you either query something to get the data (the write side?) or find some way of sending a command to the read model (but then what's the point of publishing events?). To fix your problem you should try to have a professional discussion with this architect, preferably including other tech heads, and without offending anybody try to get him to relax this constraint.
One argument you could use is Event Sourcing. Event Sourcing is complementary to CQRS and would not make sense without events that carry data; moreover, when using event sourcing, the only data you have is the data stored in the events. Even if you don't actually implement event sourcing, you can use its existence as a reason for events to have data.
There is little point in finding a technical solution to a people problem.
