Am I right in separating integration events from domain events?

I use event sourcing to store my object.
Changes are captured via domain events, holding only the minimal information required, e.g.
GroupRenamedDomainEvent
{
    string Name;
}

GroupMemberAddedDomainEvent
{
    int MemberId;
    string Name;
}
However, elsewhere in my application I want to be notified when a Group is updated in general. I don't want to have to accumulate or respond to a bunch of more granular, less helpful domain events.
My ideal event to subscribe to is:
GroupUpdatedIntegrationEvent
{
    int Id;
    string Name;
    List<Member> Members;
}
So what I have done is the following:
Update group aggregate.
Save down generated domain events.
Use these generated domain events to see whether to trigger my integration event.
For the example above, this might look like:
var groupAggregate = _groupAggregateRepo.Load(id);
groupAggregate.Rename("Test");
groupAggregate.AddMember(1, "John");
_groupAggregateRepo.Save(groupAggregate);

var domainEvents = groupAggregate.GetEvents();
if (domainEvents.Any())
{
    _integrationEventPublisher.Publish(
        new GroupUpdatedIntegrationEvent
        {
            Id = id,
            Name = groupAggregate.Name,
            Members = groupAggregate.Members
        });
}
This means my integration events used throughout the application are not coupled to what data is used in my event sourcing domain events.
Is this a sensible idea? Has anyone got a better alternative? Am I misunderstanding these terms?

Of course you're free to create and publish as many events as you want, but I don't see (m)any benefits there.
You still have coupling: You just shifted the coupling from one Event to another. Here it really depends on how many Event Consumers you've got, whether everything is running in-memory or stored in a DB, and whether your Consumers need some kind of Replay mechanism.
Your integration Events can grow over time and use a lot of bandwidth: If your Group contains 1000 Members and you add 5 new Members, you'll trigger 5 integration events that each contain all members, instead of just the small delta. That uses much more network bandwidth and hard drive space (if persisted).
You're coupling your Integration Event to your Domain Model. I think this is not good at all. You won't be able to simply change the Member class in the future, because all Event Consumers depend on it. A solution could be to instead use a separate MemberDTO class for the Integration Event and write a MemberToMemberDTO converter.
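A minimal sketch of that DTO idea (the MemberDto and converter names are made up for illustration, assuming the domain Member class has Id and Name, and System.Collections.Generic is imported):

// Hypothetical DTO that shields event consumers from the domain Member class.
public class MemberDto
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class MemberToMemberDtoConverter
{
    public static MemberDto Convert(Member member) =>
        new MemberDto { Id = member.Id, Name = member.Name };
}

// The integration event then exposes only DTOs, so the domain Member class can evolve freely.
public class GroupUpdatedIntegrationEvent
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<MemberDto> Members { get; set; }
}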
Your Event Consumers can't decide which changes they want to handle, because they always just receive the full blown current state. The information what actually changed is lost.
The only real benefit I see is that you don't have to write code again to apply your Domain Events.
In general it looks a bit like a Read Model in CQRS. Maybe that's what you're looking for?
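A minimal sketch of what a Read Model here might look like, assuming hypothetical GroupView/MemberView classes and an in-memory store (System.Collections.Generic is imported); the granular domain events from the question are folded into the full group view that the rest of the application reads:

public class MemberView
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class GroupView
{
    public int Id { get; set; }
    public string Name { get; set; }
    public List<MemberView> Members { get; } = new List<MemberView>();
}

public class GroupViewProjection
{
    private readonly Dictionary<int, GroupView> _views = new Dictionary<int, GroupView>();

    public void When(int groupId, GroupRenamedDomainEvent e) =>
        GetOrCreate(groupId).Name = e.Name;

    public void When(int groupId, GroupMemberAddedDomainEvent e) =>
        GetOrCreate(groupId).Members.Add(new MemberView { Id = e.MemberId, Name = e.Name });

    private GroupView GetOrCreate(int id)
    {
        if (!_views.TryGetValue(id, out var view))
            _views[id] = view = new GroupView { Id = id };
        return view;
    }
}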
But of course it depends. If your solution fits your application's needs, then it'll be fine to do it that way. Rules are made to show you the right direction, but they're also meant to be broken when they get in your way (and you know what you're doing).

Related

CQRS / Event Sourcing - transient events

During development of my application, I found that I need to emit some events that don't actually modify the state of the aggregate, but are needed in order to update read models (transient events?). For example, if in my code (domain model) I hold the state of a hierarchy of numbers in layers like:
1 4 7
5 8
3 9
and the read model is doing projection of events like (top number from left to right):
1
5
3
then, when I trigger the RemovedNumber(1) event in the aggregate root, and if this is the only event I trigger (since it is enough to update the aggregate state), the read model will not know that it needs to replace number 1 with 4.
? <--- SHOULD BE 4 SINCE 4 IS UNDER 1
5
3
So basically I need to additionally trigger NowShowNumber(4 instead of 1), and then the read model will know to project:
4
5
3
The RemovedNumber(1) event should be kept in the event store, since it affects the internal state of the aggregate. The NowShowNumber(4 instead of 1) event should also be stored in the event store, since it affects the read model (and should be replayed when re-projecting it), but it should probably not be used during reconstruction of the aggregate root from the event stream.
Is this standard practice in CQRS/Event Sourcing systems? Is there some alternative solution?
Why doesn't the Read model know to show number 4?
Didn't the Aggregate emit an AddNumber(4) prior to AddNumber(1)?
Then the Read model has the necessary state replicated on its side (basically a stack of numbers) and can pull the previous number and show it.
In CQRS, in order to help the Read models when state changes and an Event is emitted, the Aggregate can also include bits of the previous state in the Event.
In your case, the emitted Event could have the following signature RemovedNumber( theRemovedNumber, theNewCurrentNumber), and in particular RemovedNumber(1, 4).
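As a sketch (names taken from this answer, not from any framework), that event could simply carry both values:

// Sketch: the event carries the removed number plus the number that becomes visible in its place.
public class RemovedNumber
{
    public int TheRemovedNumber { get; set; }
    public int TheNewCurrentNumber { get; set; }
}

// e.g. new RemovedNumber { TheRemovedNumber = 1, TheNewCurrentNumber = 4 }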
I call these events out of band events and save them to a different stream than I hydrate aggregates with.
I haven't heard of anyone else doing it, but I also haven't heard any good arguments not to do it - especially if you have a legitimate case for posting events that have no effect at all on the aggregate.
In your case if I understand your problem well enough I would just have the domain write a TopLevelNumberChanged event which the read model would see and process.
And obviously it would not read that event when hydrating.
I can't see that it is an issue at all to have events that don't effect changes in your projections. Depending on the projection, it may ignore many events.
That being said, if you are saying that these two events go hand-in-hand you may need to have another look at the design / intention. How do you know to call the second command? Would a single command not perhaps do the trick? The event could return the full change:
NumberReplacedEvent ReplaceNumber(1, 4);
The event would contain all the state:
public class NumberReplacedEvent
{
    int ReplacedNumber { get; set; }
    int WithNumber { get; set; }
}
From my understanding, there's no single correct answer. CQRS / Event Sourcing is just a tool for helping you to model your data flow. But it's still your data, your business rules, your use case. In other words: some other company could use the exact same data model, but have a different event structure, because it fits their use case better.
Some example:
Let's imagine we have an online shop. And every time a customer buys a product, we decrease the inStock value for that product. If the customer sends the product back, we increase the value.
The command is pretty simple: BuyProduct(id: "123", amount: 4)
For the resulting event we have (at least) 2 options:
ProductBuyed(id: "123", amount: 4) (delta value)
ProductBuyed(id: "123", newInStockValue: 996) (new total value)
(you could also publish 4 times a simple ProductBuyed(id: "123") event)
Or you can have multiple resulting events at the same time:
ProductBuyed(id: "123", amount: 4)
InStockValueForProductChanged(id: "123", newValue: 996)
An online shop will possibly have multiple read models that are interested in these events. The Product Page wants to display "only 996 items left!", and the Shop Statistics Page wants to display "sold 4 items today". So both options (total and delta) can be useful.
But both Pages could also work if only one of the two events exists. Then the read side must do the calculation itself: oldTotal - newTotal = delta, or oldTotal - delta = newTotal.
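As a rough sketch (class and handler names are made up), a statistics read model that only receives the total-value event could derive the delta itself:

// Sketch: the read model remembers the last known total per product and derives the delta.
// Assumes System.Collections.Generic is imported.
public class ShopStatisticsReadModel
{
    private readonly Dictionary<string, int> _lastKnownTotal = new Dictionary<string, int>();
    private readonly Dictionary<string, int> _soldToday = new Dictionary<string, int>();

    // Handles something like InStockValueForProductChanged(id, newValue).
    public void Handle(string productId, int newInStockValue)
    {
        if (_lastKnownTotal.TryGetValue(productId, out var oldTotal))
        {
            var delta = oldTotal - newInStockValue;   // oldTotal - newTotal = delta
            _soldToday.TryGetValue(productId, out var soldSoFar);
            _soldToday[productId] = soldSoFar + delta;
        }
        _lastKnownTotal[productId] = newInStockValue;
    }
}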
There are even more possible solutions. For example:
Checkout Service publishes ProductBuyed(id: "123", amount: 4) event
Stock Service receives this event, decreases the stock and then publishes the InStockValueForProductChanged(id: "123", newValue: 996) event
It really depends on the needs of your business.
My suggestions:
I prefer when the write model is only responsible for managing the business rules. Get Command, validate it, publish event(s) which look pretty similar to the command contents.
And the read model should be as simple as possible, too. Get Event, update model.
If calculations have to be done, there are a few options:
The calculation is part of a business rule? Then your write side has to compute the result anyway. In this case you already have written the algorithm, the CPU has done its work, and you have the resulting value for free. (Just include the result with the published event)
The calculation is really complex and/or there are multiple event consumers that need the result. Then it might be better to compute it once and include the result in an event, instead of computing it n times for every involved event consumer. Complex could mean:
Takes a lot of time
Very CPU / memory intensive
Needs special / huge external libs (imagine you had to include some Image Processing library with every read service)
The calculation is the result of a combination of a lot of different events (i.e. it's getting complex): Build an external service, which is responsible for the calculation. This way you can easily scale out by providing multiple instances of this service.
If the calculation is not part of a business rule, it's simple, and only a single service needs the result, or if it's only relevant for the read model: place it in the read side.
In the end it's a tradeoff:
Duplicate algorithms? You could have multiple event consumers written in different programming languages. Do you want to implement the algorithm multiple times?
More network traffic / bigger event store? If you include the calculation result with the event, there's more data to store and transfer between the services. Can your infrastructure handle that?
Can your write / read service take the additional load?

Is it ok to have FAT events with event sourcing?

I have recently been building an application on top of Greg Young's EventStore as my persistence layer, and I have been pondering how big I should allow an event to get.
For example, I have a UK Address aggregate with the following fields:
UK_Address
-BuildingName
-Street
-Locality
-Town
-Postcode
Now I'm building the UI using React/Redux and was thinking: should I create a single FAT addressUpdated event containing all the above fields?
Or should I create an event for each of the different fields and batch them within the client until the Save event is fired? E.g. buildingNameUpdated, streetUpdated, localityUpdated.
I'm not sure the answer is as black and white as I have asked it; what I really would like to know is what conditions/constraints you could use to make the decision.
should I create an event for each of the different fields?
No. The representations of your events are part of the API -- so you want to use spellings that make sense at the level of the business, not at the level of the implementation.
Now I'm building the UI using React/Redux and was thinking should I create a single FAT updateAddress Event containing all the above fields?
You don't need to constrain the data that you send to your UI to match that which is in the persistence store. The UI is just a cached representation of a read model; there's no reason that representation needs to have the same form as what is in your event store.
Consider the React model itself -- your code makes changes to the "in memory" representation of your data, and then the library computes the new DOM and replaces it, which in turn causes the browser to update its view, which in turn causes the pixels on the screen to change.
So taking a fat event from the store, and breaking it into field level events for the UI is fine. Taking multiple events from the store and aggregating them into a single message for the UI is also fine. Taking events from the event store and transforming them into a spelling that the UI will recognize is also fine.
Do you have any comment regarding Arien's answer about keeping fields that need to be consistent together, so that regardless of when you snapshot the current state of the world, it would be in a valid state?
I don't believe that this makes sense, and I'm not sure if it is possible in general.
It doesn't make sense, because "valid state" is a write model concern only; events are things that have happened, and it's too late to vote on whether they are valid or not. For instance, if you deploy a new model, with a new invariant, it still needs to respect the history of what happened before. So you can build a snapshot for that new model, but the snapshot may not be "valid". Too bad.
Given that, I don't think it makes sense to worry over whether each individual event in a commit leaves the snapshot in a valid state.
In particular, if a particular transaction involves multiple entities, it is very likely that the domain language will suggest an event for each entity (we "debit cash" and "credit accounts receivable"). The entities themselves, of course, are capable of changing independently of each other -- it's the aggregate that maintains the balance.
You have to bundle all the information together in one event when the data has to be consistent.
Otherwise, when you update one field of an address at a time, you can end up with an unwanted address.
This will happen when the client has not yet processed all the events at the time of reading, due to eventual consistency.
Example:
Change address (City=1, Street=1, Housenumber=1) to (City=2, Street=2, Housenumber=2)
When you do this with 3 events and only one of them has been processed at the time of reading, you could get the address (City=2, Street=1, Housenumber=1).
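A sketch of that bundling, using the field names from the question (nothing framework-specific): one event carries all the address fields that must stay consistent, so a reader can never observe a half-updated address.

// Sketch: every field that must be consistent travels in the same event.
public class AddressChanged
{
    public string BuildingName { get; set; }
    public string Street { get; set; }
    public string Locality { get; set; }
    public string Town { get; set; }
    public string Postcode { get; set; }
}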
If puzzled, try the solution that is easier to implement. I guess the "FAT" event will be easier: you will end up spending less time implementing/debugging/supporting it.
This is usually referred to as the YAGNI / KISS / Occam's Razor principle.
In theory (and I find it to be a good rule of thumb) your commands and events should reflect the intent of the user, staying true to DDD. You can find a good explanation of the pros and cons of event granularity here: https://medium.com/@hugo.oliveira.rocha/what-they-dont-tell-you-about-event-sourcing-6afc23c69e9a

Design of notification events

I am designing some events that will be raised when actions are performed or data changes in a system. These events will likely be consumed by many different services and will be serialized as XML, although more broadly my question also applies to the design of more modern funky things like Webhooks.
I'm specifically thinking about how to describe changes with an event and am having difficulty choosing between different implementations. Let me illustrate my quandary.
Imagine a customer is created, and a simple event is raised.
<CustomerCreated>
<CustomerId>1234</CustomerId>
<FullName>Bob</FullName>
<AccountLevel>Silver</AccountLevel>
</CustomerCreated>
Now let's say Bob spends lots of money and becomes a gold customer, or indeed any other property changes (e.g.: he now prefers to be known as Robert). I could raise an event like this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<FullName>Bob</FullName>
<AccountLevel>Gold</AccountLevel>
</CustomerModified>
This is nice because the schema of the Created and Modified events are the same and any subscriber receives the complete current state of the entity. However it is difficult for any receiver to determine which properties have changed without tracking state themselves.
I then thought about an event like this.
<CustomerModified>
<CustomerId>1234</CustomerId>
<AccountLevel>Gold</AccountLevel>
</CustomerModified>
This is more compact and only contains the properties that have changed, but comes with the downside that the receiver must apply the changes and reassemble the current state of the entity if they need it. Also, the schemas of the Created and Modified events must be different now; CustomerId is required but all other properties are optional.
Then I came up with this.
<CustomerModified>
    <CustomerId>1234</CustomerId>
    <Before>
        <FullName>Bob</FullName>
        <AccountLevel>Silver</AccountLevel>
    </Before>
    <After>
        <FullName>Bob</FullName>
        <AccountLevel>Gold</AccountLevel>
    </After>
</CustomerModified>
This covers all bases as it contains the full current state, plus a receiver can figure out what has changed. The Before and After elements have the exact same schema type as the Created event. However, it is incredibly verbose.
I've struggled to find any good examples of events; are there any other patterns I should consider?
You tagged the question as "Event Sourcing", but your question seems to be more about Event-Driven SOA.
I agree with @Matt's answer--"CustomerModified" is not granular enough to capture intent if there are multiple business reasons why a Customer would change.
However, I would back up even further and ask you to consider why you are storing Customer information in a local service when it seems that you (presumably) already have a source of truth for customers. The starting point for consuming Customer information should be getting it from the source when it's needed. Storing a copy of information that can be queried reliably from the source may very well be an unnecessary optimization (and complication).
Even if you do need to store Customer data locally (and there are certainly valid reasons for needing to do so), consider passing only the data necessary to construct a query of the source of truth (the service emitting the event):
<SomeInterestingCustomerStateChange>
<CustomerId>1234</CustomerId>
</SomeInterestingCustomerStateChange>
So these event types can be as granular as necessary, e.g. "CustomerAddressChanged" or simply "CustomerChanged", and it is up to the consumer to query for the information it needs based on the event type.
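A rough sketch of that signal-plus-query flow (the ICustomerServiceClient query API and CustomerSnapshot type are assumed for illustration, not part of any particular framework; System.Collections.Generic is imported):

// Sketch: the event only signals which customer changed; the consumer queries
// the owning service for whatever details it actually needs.
public interface ICustomerServiceClient
{
    CustomerSnapshot GetCustomer(int customerId);   // assumed query API on the source of truth
}

public class CustomerSnapshot
{
    public int CustomerId { get; set; }
    public string FullName { get; set; }
    public string AccountLevel { get; set; }
}

public class SomeInterestingCustomerStateChangeHandler
{
    private readonly ICustomerServiceClient _customers;
    private readonly Dictionary<int, CustomerSnapshot> _localCopy = new Dictionary<int, CustomerSnapshot>();

    public SomeInterestingCustomerStateChangeHandler(ICustomerServiceClient customers)
    {
        _customers = customers;
    }

    public void Handle(int customerId)
    {
        // The event only told us *which* customer changed; fetch the details we need.
        _localCopy[customerId] = _customers.GetCustomer(customerId);
    }
}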
There is not a "one-size-fits-all" solution--sometimes it does make more sense to pass the relevant data with the event. Again, I agree with @Matt's answer if this is the direction you need to move in.
Edit Based on Comment
I would agree that using an ESB to query is generally not a good idea. Some people use an ESB this way, but IMHO it's a bad practice.
Your original question and your comments to this answer and to Matt's talk about only including fields that have changed. This would definitely be problematic in many languages, where you would have to somehow distinguish between a property being empty/null and a property not being included in the event. If the event is getting serialized/de-serialized from/to a static type, it will be painful (if not impossible) to know the difference between "First Name is being set to NULL" and "First Name is missing because it didn't change".
Based on your comment that this is about synchronization of systems, my recommendation would be to send the full set of data on each change (assuming signal+query is not an option). That leaves the interpretation of the data up to each consuming system, and limits the responsibility of the publisher to emitting a more generic event, i.e. "Customer 1234 has been modified to X state". This event seems more broadly useful than the other options, and if other systems receive this event, they can interpret it as they see fit. They can dump/rewrite their own data for Customer 1234, or they can compare it to what they have and update only what changed. Sending only what changed seems more specific to a single consumer or a specific type of consumer.
All that said, I don't think any of your proposed solutions are "right" or "wrong". You know best what will work for your unique situation.
Events should be used to describe intent as well as details. For example, you could have a CustomerRegistered event with all the details of the customer that was registered, and then later in the stream a CustomerMadeGoldAccount event that only really needs to capture the customer id of the customer whose account was changed to gold.
It's up to the consumers of the events to build up the current state of the system that they are interested in.
This allows only the most pertinent information to be stored in each event. Imagine having hundreds of properties for a customer: if every command that changed a single property had to raise an event with all the properties before and after, this would get unwieldy pretty quickly. It's also difficult to determine why a change occurred if you just publish a generic CustomerModified event, and that is often a question that is asked about the current state of an entity.
Only capturing data relevant to the event means that the command that issues the event only needs enough data about the entity to validate that the command can be executed; it doesn't even need to read the whole customer entity.
Subscribers of the events also only need to build up a state for things that they are interested in, e.g. perhaps an 'account level' widget is listening to these events, all it needs to keep around is the customer ids and account levels so that it can display what account level the customer is at.
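As a sketch of that widget's state (the event shapes are assumed, following the answer's examples), it can be as small as a dictionary from customer id to account level:

// Hypothetical event shapes, following this answer's examples.
public class CustomerRegistered
{
    public int CustomerId { get; set; }
    public string AccountLevel { get; set; }
}

public class CustomerMadeGoldAccount
{
    public int CustomerId { get; set; }
}

// Sketch: the widget's read model keeps only what it displays.
public class AccountLevelWidgetProjection
{
    private readonly Dictionary<int, string> _accountLevels = new Dictionary<int, string>();

    public void Handle(CustomerRegistered e) => _accountLevels[e.CustomerId] = e.AccountLevel;

    public void Handle(CustomerMadeGoldAccount e) => _accountLevels[e.CustomerId] = "Gold";

    public string GetAccountLevel(int customerId) =>
        _accountLevels.TryGetValue(customerId, out var level) ? level : "unknown";
}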
Instead of trying to convey everything through the payload XML's fields, you can distinguish between different operations based on:
1. Different endpoint URLs depending on the operation (this is preferred).
2. An opcode (operation code) as an element in the XML file which tells which operation should be used to handle the incoming request (closer to your examples).
There are a few enterprise patterns applicable to your business case - messaging and its variants - and if your system is extensible, an Enterprise Service Bus can be used. An ESB allows reliable handling and processing of events.

Dojo: topics vs events, what design considerations should be taken in account?

I've been using Dojo in various contexts and never found a good explanation on events versus topics. What I understand from using both mechanisms is the following:
Both are event or more generally message mechanisms.
Both work more or less the same, in that you subscribe to a topic/event by setting a callback.
Events are tightly coupled to an object/widget, as in, you need the actual instance of an object or widget to register listeners for specific events.
The topic mechanism on the other hand provides a more decoupled approach, as you can subscribe for any topic without knowing which component is publishing the topic, or even without knowing if the topic will be published at all.
An approach I used a couple of times when developing custom widgets with Dojo was to let them publish to certain topics. Other components would subscribe to these topics and react appropriately. However, this leads to code that is hard to follow, because when you find a piece of code that subscribes to a certain topic, you start wondering who is publishing to that topic, and vice versa. Currently I tend to let my custom widgets emit events and have a controller listening to these events and dispatching them to other widgets that should react to them.
So in the first approach, the topic mechanism is the glue between widgets, but it is decentralized which makes it hard to maintain the code on the longer term in my experience. In the second approach, a controller class (following the MVC pattern) is the glue, which centralizes event handling.
I'd be interested in knowing if this is a correct understanding of the two mechanisms. I'd also be interested in any design considerations one should take into account when choosing one of the two (or even mixing them). Any pointers to an elaborate discussion on the topic would be appreciated as well. I have been looking at http://dojotoolkit.org/documentation/tutorials/1.9/events/ but that mainly describes how both mechanisms work and gives little insight into how to structure a complex application.
I'm having the exact same idea about topics and events as you. As JavaScript is event-driven both are of course event-ish (like you describe in your first point).
Events are indeed coupled to the widget itself while topics aren't. I usually see it as the following:
When you have a master-slave kind of structure (like a list having many items), using widgets and events is probably the best approach to handle your problem.
When both widgets are unrelated to each other, topics are probably the best way for them to communicate.
You're right, topics make it harder to know what the origin is, but if you think about it, you don't need to know the origin. The topics provide you an API that decouples the source from the destination, making it so that you don't need to know the source.
Because both widgets are unrelated (that's the approach I follow, described before), you normally shouldn't need to know what the origin is when maintaining the code.
What you need is a well-written API, and to make sure both source and destination follow it. If the API changes (code maintenance), you can use your IDE to find out which widgets are publishing/subscribing (for example by searching for the topic name) and make sure each of them is updated.
You can also choose to encapsulate the publish/subscribe behavior and providing a more high level API by creating a module like this:
define([ "dojo/topic", "dojo/_base/array" ], function(topic, arrayUtils) {
    var MY_TOPIC = "/my/topic";

    var module = {
        observers: [],
        notify: function(/** String */ name, /** Integer */ age) {
            topic.publish(MY_TOPIC, {
                name: name,
                age: age
            });
        },
        addObserver: function(/** Function */ callback) {
            return this.observers.push(callback) - 1;
        },
        removeObserver: function(/** Integer */ index) {
            this.observers[index] = null;
        }
    };

    // Sole subscriber: fans the topic data out to the registered observers.
    topic.subscribe(MY_TOPIC, function(data) {
        arrayUtils.forEach(module.observers, function(observer) {
            if (observer !== null && data.name !== undefined && data.age !== undefined) {
                observer(data.name, data.age);
            }
        });
    });

    return module;
});
You publish using the notify() function (providing the correct function parameters) and you add/remove observers with the other functions. Then you will make this component your sole subscriber and make it notify all observers.
This way you don't need to know about the topic and the API is uniform. You only need to make sure that the callbacks use the arguments correctly. To maintain your code you just change the high level API and look for modules that use this high level component. This is way easier to detect since it's in the require() function.
When I use topics I usually create a high level API like this (might change a bit depending on the use of it). But I think the point made is clear, it's easier to change the topic and to modify the data that is sent through.
In terms of design patterns and software architecture, topics seem to be the perfect mechanism to implement Flux in Dojo. Found an article with the basic idea here.

Events changing state in CQRS

This should be easy to follow, but after some reading I still can't find an answer.
So, say that the user needs to change his mobile number. To accomplish that, we might have a command such as ChangedUserMobileNumber,
holding the new number. The domain responsible for handling the command will perform the change in the aggregate and publish an event: UserMobilePhoneChanged
There is a subscriber for that event in another domain, which also holds the user's mobile number in its aggregate. But according to our software architect, events cannot hold any data, so what we end up with is rather stupid, to say the least:
Domain 1 receives the command to update the mobile number, the number is updated, and one event is published. Also, because the event cannot hold data, the command handler in Domain 1 issues yet another command, which is sent to Domain 2. The subscriber of that event lives in Domain 2 too, so we then have a Saga to handle both the event and the command.
In terms of implementation we are using NServiceBus, so we have this saga to handle these messages, and in it we have this line of code, where the entity.IsMobilePhoneUpdated field stored in the saga entity is changed when the event is handled:
bool isReady = (entity.IsMobilePhoneUpdated && entity.MobilePhoneNumber != null);
Effectively the Saga is started by both the command and the event raised in the Domain 1, and until this condition is met, the saga is kept alive.
If it was up to me, I would be sending the mobile number in the event itself, I just want to get a few other opinions on this.
Thanks
I'm not sure how a UserMobilePhoneChanged event could be useful in any way unless it contained the new phone number. User asks to change a number, the event shoots out that it has. Should be very simple indeed. Why does your architect say that events shouldn't contain any information?
In the first event-based system I designed, events also had no data. I enforced that rule as well. At the time that sounded like a clever decision. After a while I realised that it was dumb, and I was making a lot of workarounds because of it. It also caused a lot of querying from the event subscribers, even for trivial data. I had no problem changing this "rule" after I realised I was doing it wrong.
Events should have all the data required to make them meaningful. Also, they should only have the data that makes sense for that event (no point in having the user's address in a ChangePhoneNumber message).
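For the question's example, that would mean an event shaped roughly like this (a sketch; the property names are made up and nothing here is NServiceBus-specific):

// Sketch: the event carries exactly the data that makes it meaningful.
public class UserMobilePhoneChanged
{
    public Guid UserId { get; set; }            // System.Guid
    public string NewMobileNumber { get; set; }
}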
If your architect imposes such a restriction, it's not going to be easy to develop a CQRS system. How are the read models updated? Since the events have no data, you either query something to get the data (the write side?) or find some way of sending a command to the read model (then what's the point of publishing events?). To fix your problem you should try to have a professional discussion with this architect, preferably including other tech leads, and without offending anybody try to get him to relax this constraint.
One argument you could use is Event Sourcing. Event Sourcing is complementary to CQRS and would not make sense without events that have data. Even more, when using event sourcing, the only data you have is the data stored in the events. Even if you don't actually implement event sourcing, you can use its existence as a reason for events to have data.
There is little point in finding a technical solution to a people problem.
