How to deal with concurrent events in an event-driven architecture - events

Suppose I have an eCommerce application designed in an event-driven architecture. I would publish events like ProductCreated and ProductPriceUpdated. Typically both events are published in separate channels.
Now a consumer of those events comes into play and would react on these, for example to generate a price-chart for specific products.
In fact this consumer has the requirement to consume the ProductCreated event first, to create a Product entity with the necessary information in its own bounded context. Only once a product has been created can price points be added to the chart. Depending on the consumer's performance it can easily happen that those events arrive "out-of-order".
What are the possible strategies to fulfill this requirement?
The following came to my mind:
Publish both events onto the same channel with ordering guarantees. For example in Kafka both events would be published in the same partition. However this would mean that a topic/partition would grow with its events, I would have to deal with different schemas and the documentation would grow.
Use documents over events. Simply publishing every state change of the product entity as a single ProductUpdated event or similar. This way I would lose semantics from the message and need to figure out what exactly changed on consumer-side.
Defer event consumption. So if my consumer would consume a ProductPriceUpdated event and I don't have such a product created yet, I postpone the consumption by storing it in a database and come back at a later point or use retry-topics in Kafka terms.
Create a minimal entity. Once I receive a ProductPriceUpdated event I would probably have a correlation id or something to identify the entity and simply create an entity with just this id, and once a ProductCreated event arrives fill in the missing information.

Just thought of giving you some inline comments, based on my understanding of your requirements (#1, #3 and #4).
Publish both events onto the same channel with ordering guarantees. For example in Kafka both events would be published in the same partition. However this would mean that a topic/partition would grow with its events, I would have to deal with different schemas and the documentation would grow.
[Chris] : Apache Kafka preserves the order of messages within a partition. But, the mapping of keys to partitions is consistent only as long as the number of partitions in a topic does not change. So as long as the number of partitions is constant, you can be sure the order is guaranteed. When partitioning keys is important, the easiest solution is to create topics with sufficient partitions and never add partitions.
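To illustrate why adding partitions breaks the key-to-partition mapping, here is a rough sketch. It assumes a generic "hash of the key modulo partition count" partitioner; `partition_for` is a hypothetical helper and uses crc32 purely as a stand-in (Kafka's default partitioner uses a different hash function, but the modulo effect is the same).

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in partitioner: stable hash of the key, modulo the partition count.
    # crc32 is an assumption for illustration, not Kafka's actual hash.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

keys = [f"product-{i}" for i in range(1000)]

# Same keys, before and after growing the topic from 6 to 8 partitions.
before = {k: partition_for(k, 6) for k in keys}
after = {k: partition_for(k, 8) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys now map to a different partition")
```

Any key that moves can have its older events in one partition and its newer events in another, so per-key ordering is no longer guaranteed across the change.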
Defer event consumption. So if my consumer would consume a ProductPriceUpdated event and I don't have such a product created yet, I postpone the consumption by storing it in a database and come back at a later point or use retry-topics in Kafka terms.
[Chris]: If latency is not a concern, and if we are okay with the additional operational overhead of adding a new component to your solution, such as a storage layer, this pattern looks fine.
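A minimal sketch of the deferral idea, assuming an in-memory dict stands in for the database or retry topic. The handler names and the event fields (`product_id`, `name`, `price`) are made up for illustration.

```python
from collections import defaultdict

products = {}               # product_id -> product record
parked = defaultdict(list)  # product_id -> price events waiting for their product

def on_price_updated(event):
    pid = event["product_id"]
    if pid not in products:
        # Product not known yet: defer the event instead of failing.
        parked[pid].append(event)
        return
    products[pid]["prices"].append(event["price"])

def on_product_created(event):
    pid = event["product_id"]
    products[pid] = {"name": event["name"], "prices": []}
    # Drain events that arrived early, in their arrival order.
    for early in parked.pop(pid, []):
        on_price_updated(early)

# The price event arrives before the product exists.
on_price_updated({"product_id": "p1", "price": 9.99})
on_product_created({"product_id": "p1", "name": "Kettle"})
on_price_updated({"product_id": "p1", "price": 8.49})
print(products["p1"]["prices"])
```

In a real deployment the parked events would have to survive a restart, which is where the extra storage layer (and its latency) comes in.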
Create a minimal entity. Once I receive a ProductPriceUpdated event I would probably have a correlation id or something to identify the entity and simply create an entity with just this id, and once a ProductCreated event arrives fill in the missing information.
[Chris] : This is a fairly common integration pattern (Messaging Layer -> Backend REST API) that we adopt; it works off a unique identifier, in this case a correlation id.
This can be easily achieved if you have separate topics and consumers per event type and the order of messages from the producer is guaranteed. Thus, option #1 becomes obsolete.
From my perspective, option #3 and #4 look one and the same, and #4 would be ideal.
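Option #4 could look roughly like the sketch below. It assumes both events carry the same identifier (called `product_id` here) and uses a plain dict as the consumer's store; the `Product` class and handler names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Product:
    product_id: str
    name: Optional[str] = None  # unknown until ProductCreated arrives
    prices: list = field(default_factory=list)

    @property
    def complete(self) -> bool:
        return self.name is not None

store = {}

def get_or_create(product_id: str) -> Product:
    # Create a placeholder holding only the id if the product is not known yet.
    return store.setdefault(product_id, Product(product_id))

def on_price_updated(event: dict) -> None:
    get_or_create(event["product_id"]).prices.append(event["price"])

def on_product_created(event: dict) -> None:
    # Fill in the missing details on the placeholder, or create the entity.
    get_or_create(event["product_id"]).name = event["name"]

on_price_updated({"product_id": "p1", "price": 9.99})  # arrives first
early = store["p1"].complete
on_product_created({"product_id": "p1", "name": "Kettle"})
print(early, store["p1"])
```

Both handlers are upserts, so either arrival order ends in the same state. Readers of the store have to tolerate incomplete entities until the creation event shows up.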
On another note, if you are thinking of bringing Kafka Streams/Tables into your solution, just go for it: there is a strong relationship between streams and tables, which is called duality.
The duality of streams and tables lets your application support more elastic, fault-tolerant stateful transactions and run interactive queries. KSQL adds more flavour to it, because this use case is just one of data enrichment at the integration layer.

Related

Migrating an asynchronous business flow to an event-driven system

In an effort to redesign an asynchronous, flow-based functional service into an event-driven one, we have come up with changes to different parts of the system. The service receives various statuses from external services through the API, does computations on them and persists the result into the data store. The core logic has now been moved out of the API by introducing a queue (Kafka). Similarly, the query functionality is provided through another interface (API) fronted by a web UI. With this, the command and query sides are separated. See below the diagram.
I have a few questions on the approach
Is it right to have the query API (read) service & the event-complete-handler (write) operate on the same database with both dependent on the DB schema? Or is it better to have the query-api read from the replica DB?
The core-business-logic, at the end of computation, writes only to database and not to db+Kafka in a single transaction. Persisting to the database is handled by the event-complete-handler. Is this approach better?
Say in the future, if the core-business-logic needs to query the database to do the computation on every event, can it directly read from the database? Again, does it not create DB schema dependency between the services?
Is it right to have the query API (read) service & the event-complete-handler (write) operate on the same database with both dependent on the DB schema? Or is it better to have the query-api read from the replica DB?
"Right" is a loaded term. The idea behind CQRS is that the pattern can allow you to separate commands and queries so that your system can be distributed and scaled out. Typically they would be using different databases in a SOA/Microservice architecture. One service would process the command which produces an event on the service bus. Query handlers would listen to this event to change their data for querying.
For example:
A service which processes the CreateWidgetCommand would produce an event onto the bus with the properties of the command.
Any query services which are interested in widgets for producing their data views would subscribe to this event type.
When the event is produced, the subscribed query handlers will consume the event and update their respective databases.
When the query is invoked, they interrogate their own database.
This means you could, in theory, make the command handler as simple as throwing the event onto the bus.
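The steps above can be sketched with an in-process bus. This is only an illustration: `Bus`, the `WidgetCreated` event name and the two views are made up, and delivery is synchronous here, whereas a real service bus delivers asynchronously and the views live in separate databases.

```python
from collections import defaultdict

class Bus:
    """Toy in-process stand-in for a service bus."""
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._subscribers[event_type]:
            handler(payload)

bus = Bus()

# Command side: as simple as throwing the event onto the bus.
def handle_create_widget(command):
    bus.publish("WidgetCreated", dict(command))

# Query side: each view keeps its own store, shaped for its own queries.
widgets_by_id = {}
widget_count_by_color = defaultdict(int)

def update_lookup_view(event):
    widgets_by_id[event["id"]] = event

def update_color_view(event):
    widget_count_by_color[event["color"]] += 1

bus.subscribe("WidgetCreated", update_lookup_view)
bus.subscribe("WidgetCreated", update_color_view)

handle_create_widget({"id": "w1", "color": "red"})
handle_create_widget({"id": "w2", "color": "red"})
print(widgets_by_id["w1"], widget_count_by_color["red"])
```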
The core-business-logic, at the end of computation, writes only to database and not to db+Kafka in a single transaction. Persisting to the database is handled by the event-complete-handler. Is this approach better?
No. If your question is about the transactionality of distributed systems, you cannot rely on traditional transactions, since any command may affect any number of distributed data stores. Transactionality in distributed systems is often handled with a compensating transaction, where you code the steps to reverse the mutations made by consuming the bus messages.
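A compensating transaction can be sketched as a list of (action, undo) pairs. The helper below is a hypothetical, single-process illustration; in a distributed system each action and each undo would be a message to another service.

```python
def run_with_compensation(steps):
    """Run (action, compensate) pairs; on failure, undo completed steps in reverse."""
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        for compensate in reversed(completed):
            compensate()
        raise

stock = {"widget": 5}
refunds = []

def reserve_stock():
    stock["widget"] -= 1

def release_stock():
    stock["widget"] += 1

def charge_card():
    raise RuntimeError("payment declined")

def refund_card():
    refunds.append("refund")

try:
    run_with_compensation([(reserve_stock, release_stock),
                           (charge_card, refund_card)])
except RuntimeError as exc:
    print("rolled back:", exc, stock)
```

Only steps that completed are compensated: the reservation is released, but no refund is issued because the charge never went through.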
Say in the future, if the core-business-logic needs to query the database to do the computation on every event, can it directly read from the database? Again, does it not create DB schema dependency between the services?
If you follow the advice in the first response, the approach here should be obvious. All distinct queries are built from their own database, which are kept "eventually consistent" by consuming events from the bus.
Typically these architectures have major complexity downsides, especially if you are concerned with consistency and transactionality.
People don't generally implement this type of architecture unless there is a specific need.
You can however design your code around CQRS and DDD so that in the future, transitioning to this type of architecture can be relatively painless.
The topic of DDD is too dense for this answer. I encourage you to do some independent learning.

system design - How to update cache only after persisted to database?

After watching this awesome talk by Martin Kleppmann about how Kafka can be used to stream events so that we can get rid of 2-phase-commits, I have a couple of questions related to updating a cache only when the database is updated properly.
Problem Statement
Let's say you have a Redis cache which stores the user's profile pic and a Postgres database which is used for all the user-related operations (creation, update, deletion, etc.)
I want to update my Redis cache only when a new user has been successfully added to my database.
How can I do that using Kafka ?
If I am to take the example given in the video then the workflow would follow something like this:
User registers
Request is handled by User Registration Microservice
User Registration Microservice inserts a new entry into the User table.
Then generates a User Creation Event in the user_created topic.
Cache population microservice consumes the newly created User Creation Event
Cache population microservice updates the redis cache.
The problem is: what would happen if the User Registration Microservice crashed just after writing to the database, before it could send the event to Kafka?
What would be the correct way of handling this ?
Does the User Registration Microservice maintain the last event it published ? How can it reliably do that ? Does it write to a DB ? Then the problem starts all over again, what if it published the event to Kafka but failed before it could update its last known offset.
There are three broad approaches one can take for this:
There's the transactional outbox pattern, wherein, in the same transaction as inserting the new entry into the user table, a corresponding user creation event is inserted into an outbox table. Some process then eventually queries that outbox table, publishes the events in that table to Kafka, and deletes the events in the table. Since the inserts are in the same transaction, they either both occur or neither occurs; barring a bug in the process which publishes the outbox to Kafka, this guarantees that every user insert eventually has an associated event published (at least once) to Kafka.
There's a more event-sourcingish pattern, where you publish the user creation event to Kafka and then some consuming process inserts into the user table based on the event. Since this happens with a delay, this strongly suggests that the user registration service needs to keep state of which users it has published creation events for (with the combination of Kafka and Postgres being the source of truth for this). Since Kafka allows a message to be consumed by arbitrarily many consumers, a different consumer can then update Redis.
Change data capture (e.g. Debezium) can be used to tie into Postgres' write-ahead log (as Postgres actually event sources under the hood...) and publish an event that essentially says "this row was inserted into the user table" to Kafka. A consumer of that event can then translate that into a user created event.
CDC in some sense moves the transactional outbox into the infrastructure, at the cost of requiring that the context it inherently throws away be reconstructed later (which is not always possible).
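As a rough sketch of the transactional outbox approach described above, using SQLite from the standard library in place of Postgres. The table and column names, `register_user`, `relay_outbox` and the `publish` callback (standing in for a Kafka producer) are all assumptions for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);
    CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,
                         topic TEXT NOT NULL, payload TEXT NOT NULL);
""")

def register_user(email):
    # One transaction: the user row and the outbox row commit together or not at all.
    with conn:
        cur = conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        payload = json.dumps({"user_id": cur.lastrowid, "email": email})
        conn.execute("INSERT INTO outbox (topic, payload) VALUES (?, ?)",
                     ("user_created", payload))

def relay_outbox(publish):
    # Publish first, delete afterwards: a crash in between means the event
    # is published again on the next run (at-least-once delivery).
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox ORDER BY id").fetchall()
    for row_id, topic, payload in rows:
        publish(topic, payload)
        with conn:
            conn.execute("DELETE FROM outbox WHERE id = ?", (row_id,))

sent = []
register_user("ada@example.com")
relay_outbox(lambda topic, payload: sent.append((topic, json.loads(payload))))
print(sent)
```

Because delivery is at-least-once, the consumer that updates the cache should be idempotent.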
That said, I'd strongly advise against having ____ creation be a microservice and I'd likewise strongly advise against a RInK store like Redis. Both of these smell like attempts to paper over architectural deficiencies by adding microservices and caches.
The one-foot-on-the-way-to-event-sourcing approach isn't one I'd recommend, but if one starts there, the requirement to make the registration service stateful suddenly opens up possibilities which may remove the need for Redis, limit the need for a Kafka-like thing, and allow you to treat the existence of a DB as an implementation detail.

Implementing CQRS / ES the proper way

Recently I've been looking to implement the CQRS / ES (event sourcing) pattern in my microservices.
I've been reading about these patterns, but I have some questions that I couldn't find an answer to anywhere:
When doing CQRS / ES, should each microservice still have its own local database (within the microservice)?
I know that there will be an event store for writes and a read-only projection database, and I totally understand their purpose, but do microservices need their own local database for any reason? (Advantages / disadvantages)
Example: Order microservice could have a local orders database, item service a local items database, etc., apart from the event source DB and the projections database.
How to validate if some data exists in a microservice before actually issuing a command?
Let's say I want to make a new order, so I assume first I have to check if that item is still in stock, then perform the other operation/s.
However, if I want to check if an item is still in stock, where do I query that data? Will it be the projection (read-only) database, or a local database that each microservice has?
I've read many articles about CQRS / ES at this point, but most of them just explain the concept rather than actually diving into real-life scenarios / explaining how to implement it. I would appreciate if you had any recommendations.
Much appreciated
In general, when dealing with microservices, it's recommended (regardless of whether or not you're doing CQRS/ES) that no two microservices use the same database, or at the very least that no two microservices be writing to the same database. This allows each microservice to control its schema, which only needs to change if the microservice needs it to. One other advantage of this is that the database becomes entirely encapsulated within the service: it's purely an implementation detail.
It's entirely possible that a microservice implementing a read-model might not have a database: it might be able to keep all state in memory (an example might be a read-model which exposes metrics for your monitoring infrastructure), or it might simply be translating events from the write-model into commands to another service (so all of its state is just its position in the event stream).
if i want to check if an item is still in stock, where do i query that data, will it be the projection (read-only) database, or a local database that each microservice has?
In an event-sourced system, every view that's not the stream of events is a projection. So, depending on your requirements, your service can query another service or maintain its own view based on the events.
Note that at any given instant there may exist an event which has been published to the event stream (i.e. it has indisputably happened) but for which there also exists a projection which has not processed the event: the projections are eventually consistent with the event stream. So any check of whether an item is in stock will only tell you that the item was in stock at some point in the past (never mind, to use Greg Young's example, that no in-stock data can guarantee that nothing's been stolen from the warehouse unless the thieves happened to have the decency to update the count as they walked out with their loot). The nanosecond after your query, it might receive word of an event which makes it out-of-stock before you placed your order.
Accordingly, it may just be worth sending a command and letting the write-side reject your order if the item is not in stock. The write-side (which is the more strongly consistent part of the system, though it should be remembered that in many cases, one component's events are another component's commands) is under no obligation to accept every command; "command" in this context really means "polite request to publish events to the event stream which are conformant with my desired state of the universe".
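A small sketch of "send the command and let the write-side decide", assuming a single in-memory `Inventory` object plays the write-side; the class, event names and `OutOfStock` exception are invented for illustration.

```python
class OutOfStock(Exception):
    pass

class Inventory:
    """Toy write-side: validates commands against its own state, then records events."""
    def __init__(self):
        self.events = []   # the event stream (source of truth)
        self.on_hand = {}  # state derived from the events

    def _apply(self, event):
        kind, sku, qty = event
        delta = qty if kind == "StockAdded" else -qty
        self.on_hand[sku] = self.on_hand.get(sku, 0) + delta
        self.events.append(event)

    def add_stock(self, sku, qty):
        self._apply(("StockAdded", sku, qty))

    def place_order(self, sku, qty):
        # The command is a request, not a fact: reject it if it cannot be honoured.
        if self.on_hand.get(sku, 0) < qty:
            raise OutOfStock(sku)
        self._apply(("OrderPlaced", sku, qty))

inv = Inventory()
inv.add_stock("sku-1", 1)
inv.place_order("sku-1", 1)      # accepted, event recorded
try:
    inv.place_order("sku-1", 1)  # rejected, no event recorded
    rejected = False
except OutOfStock:
    rejected = True
print(rejected, inv.events)
```

The caller never queries a possibly stale projection first; it just handles the rejection.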

How to implement Event sourcing and a database in a microservice architecture?

I have been learning lately about microservices architecture and its features.
In this source it appears that event sourcing is replacing a database, however, it is later stated:
The event store is difficult to query since it requires typical queries to reconstruct the state of the business entities. That is likely to be complex and inefficient. As a result, the application must use Command Query Responsibility Segregation (CQRS) to implement queries.
In the CQRS Page the author seems to describe a singular database that listens to all events and reconstructs itself.
My question(s) is:
What is actually needed to implement event sourcing with a queryable database? particularly:
Where is the events database? Where is the queryable database? Do I need to have multiple event stores, one for every service, or can I store events in a message broker like Kafka? Is the CQRS database actually one "whole" database that collects all the events? And how can all of this scale?
I'm sorry if I'm not clear with my question, I am very confused myself. I guess I'm looking for a full example architecture of how things will look in the grand picture.
Where is the queryable database?
I'm guessing this is the most useful starting point, because it will be most familiar. The queryable database is in the same place that your this-is-the-entire-database was when you weren't doing event sourcing.
That could be a database exclusively to support this microservice, or it could be a database that is shared by several microservices, with some part of the schema where this microservice has exclusive write authority. Another way of thinking about this: the microservices are using different logical databases, which might be physically deployed together.
Where is the events database?
Same general idea - you can have one events database per microservice; or you could have several different microservices sharing the same database. Again, you have partitioning of authority, and the same logical vs physical separation to consider.
What changes with the introduction of events and CQRS is that the query/reporting database no longer stores the authoritative copy of the information that is used by the microservice. The authoritative information lives in the event store, and the query/reporting database acts more like a cache.
Our command handlers will typically load information only from the authoritative store (aka the events); that's the data that we lock if we are processing commands concurrently.
We copy information that is stored in the events into the query/reporting database(s). Depending on our needs, that can be done synchronously by the command handlers, but it is more common to use background batch processing to do that work, meaning that the data in the reporting database will often be a little bit stale.
can I store events in a message broker like Kafka?
Current consensus is that Kafka cannot reliably be used for event sourcing as understood by the CQRS community.
https://issues.apache.org/jira/browse/KAFKA-2260
https://cwiki.apache.org/confluence/display/KAFKA/KIP-27+-+Conditional+Publish
Roughly, the problem is this: when you have two processes with the authority to write events, how do you ensure that they don't introduce inconsistencies? With event stores we can use locks, or conditional writes (aka compare and swap), to ensure that nobody came along and snuck in a few extra events that might change the events we are writing.
With Kafka, there doesn't seem to be a mechanism that supports prevention, so you need to lean more into apologies, or something.
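To make the conditional-write idea concrete, here is a minimal in-memory sketch. `EventStore`, `expected_version` and `ConcurrencyError` are illustrative names, not the API of any particular event store product.

```python
import threading

class ConcurrencyError(Exception):
    pass

class EventStore:
    """Toy event store with compare-and-swap style appends per stream."""
    def __init__(self):
        self._streams = {}
        self._lock = threading.Lock()

    def load(self, stream_id):
        with self._lock:
            events = list(self._streams.get(stream_id, []))
        return events, len(events)

    def append(self, stream_id, new_events, expected_version):
        with self._lock:
            stream = self._streams.setdefault(stream_id, [])
            # Refuse the write if someone else appended since we loaded.
            if len(stream) != expected_version:
                raise ConcurrencyError(
                    f"expected version {expected_version}, found {len(stream)}")
            stream.extend(new_events)
            return len(stream)

store = EventStore()
_, version_a = store.load("order-1")  # writer A reads version 0
_, version_b = store.load("order-1")  # writer B reads version 0
store.append("order-1", ["OrderPlaced"], expected_version=version_a)
try:
    store.append("order-1", ["OrderCancelled"], expected_version=version_b)
    conflict = False
except ConcurrencyError:
    conflict = True  # writer B must reload and decide again
print(conflict, store.load("order-1"))
```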
the CQRS database actually is one "whole" database that collects all the events?
Logically? No. But you certainly can combine them physically into the same appliance. For example, message-db is "just" a postgres schema with some tables, functions, and so on. You certainly could combine that with the tables you use for queries and reports.
I'm looking for a full example architecture of how things will look in the grand picture.
The materials published by Greg Young in 2010 might be a decent starting point.
Event sourcing is not replacing the DB. It has some benefits and some challenges, so we should choose it wisely. If you are not comfortable with it, don't choose it. You can implement a microservice style without event sourcing.
Queryable DB - A simple solution is to implement the CQRS pattern and keep your query DB in sync with the event source DB.
The event DB should live with the owning service: if you are keeping events about orders, it should be in the Order service. (Yes, other services can have a replica of it.)
You may use Kafka as intermediate storage for events but not as the final one.
CQRS is not about one DB. It is a pattern where we use two DB models, one for commands and another for queries.
If you understand Java then please refer to the book "Microservice Patterns - Chris Richardson", and if you are from C# or the Microsoft technology stack then you may refer to "https://github.com/dotnet-architecture/eShopOnAzure".

Text search for microservice architectures

I am investigating how to implement text search on a microservice-based system. We will have to search for data that spans more than one microservice.
E.g. say we have two services for managing Organisations and managing Contacts. We should be able to search for organisations by contact details in one search operation.
Our preferred search solution is Elasticsearch. We already have a working solution based on embedded objects (and/or parent-child) where when a parent domain is updated the indexing payload is enriched with the dependent object data, which is held in a cache (we avoid making calls to the service managing child directly for this purpose).
I am wondering if there is a better solution. Is there a microservice pattern applicable to such scenarios?
What I would suggest is not specifically a microservice pattern, but it fits perfectly into microservices: it's called event sourcing.
Event sourcing describes an architectural pattern in which events are generated by different sources. An event will now trigger 0 or more so called Projections which then use the data contained in the event to aggregate information in the form it is needed.
This is directly applicable to your problem: Whenever the organisation service changes its internal state (added / removed / updated an organization) it can fire an event. If an organization is added, a projection will for example aggregate the contacts of this organization and store this aggregate. The search is now trivial: look up the organization's id in the aggregated information (this can be indexed) and get back the contacts associated with this organization. Of course the same works if contacts are added to the contact service: it just fires a message with the contact creation information and the corresponding projections now alter different aggregates that can again be indexed and searched quickly.
You can have multiple projections responding to a single event - which enables you to aggregate information in many different forms - exactly the way you'd like to query it later. Don't be afraid of duplicated data: event sourcing takes this trade-off intentionally and since this is not the data your business-services rely on and you do not need to alter it manually - this duplication will not hurt you.
If you store the events in the chronological order they happened (which I seriously advise you to do!), you can 'replay' these events over and over again. This helps for example if a projection was buggy and has to be fixed!
If you're interested I suggest you read up on event sourcing and look for some kind of event store:
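Replaying can be sketched like this, assuming the events are kept in an ordered list and a projection is just a function that folds one event into a view. The event shapes and projection names are invented for illustration.

```python
# The stored events, in the order they happened.
log = [
    {"type": "OrganisationAdded", "org_id": "o1"},
    {"type": "ContactAdded", "org_id": "o1", "name": "Ada"},
    {"type": "ContactAdded", "org_id": "o1", "name": "Grace"},
]

def replay(events, projection):
    # Rebuild a view from scratch by feeding every stored event to the projection.
    view = {}
    for event in events:
        projection(view, event)
    return view

def contacts_by_org(view, event):
    if event["type"] == "OrganisationAdded":
        view.setdefault(event["org_id"], [])
    elif event["type"] == "ContactAdded":
        view.setdefault(event["org_id"], []).append(event["name"])

# A projection added (or fixed) later can be built from the very same log.
def org_by_contact(view, event):
    if event["type"] == "ContactAdded":
        view[event["name"].lower()] = event["org_id"]

print(replay(log, contacts_by_org))
print(replay(log, org_by_contact))
```

Fixing a buggy projection then means deploying the corrected function, dropping its old view and replaying the log.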
event sourcing
event store
We use event sourcing to aggregate an array of different searches in our system and we aggregate millions of records every day into mongodb. All projections have their own collection and create their own indexes, and until now we never had to resort to different systems / patterns like elastic search or the like!
Let me know if this helped!
Amendment
use the data contained in the event to aggregate information in the form it is needed
An event should contain all the information necessary to aggregate more information. For example if you have an organization creation event, you need to at least provide some information on what the organization's name is, an ID of some kind, creation date, parent organization's ID etc. As a rule of thumb, we send all the information we gather in the service that gets the request (don't take it directly from the request ;-) check it first, then write it to the event and send it off) because we do not know what we're gonna need in the future. Just stay cautious - payloads should not get too large!
We can now have multiple projections responding to this event: one that adds the organization to its parent's aggregate (to get an easy lookup for all children of a given organization), one that just adds it to the search set of all organizations and maybe a third that aggregates all the parents of a given child organization so the lookup for the parent organizations is easy and fast.
The same service that processes client requests also processes these events. The motivation behind it is that the schema of the data that your projections create is tightly coupled to the way it is read by the service that the client interacts with. This does not have to be that way and it could be separated into two services - but you create an almost invisible dependency there and releasing these two services independently becomes even more challenging. But if you do not mind that additional level of complexity - you can separate the two.
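The three projections just described might look roughly like this. It is only a sketch: the event fields (`id`, `name`, `parent_id`) are assumed, plain dicts stand in for the mongodb collections, and it assumes a parent's creation event is processed before its children's.

```python
children_by_parent = {}  # parent id -> ids of direct children
search_index = {}        # lower-cased name -> organization id
ancestors = {}           # organization id -> parent chain, nearest first

def project_children(event):
    if event["parent_id"] is not None:
        children_by_parent.setdefault(event["parent_id"], []).append(event["id"])

def project_search(event):
    search_index[event["name"].lower()] = event["id"]

def project_ancestors(event):
    parent = event["parent_id"]
    # Assumes the parent's own event has already been projected.
    ancestors[event["id"]] = [] if parent is None else [parent] + ancestors.get(parent, [])

PROJECTIONS = [project_children, project_search, project_ancestors]

def on_organization_created(event):
    # One event fans out to every projection that is interested in it.
    for projection in PROJECTIONS:
        projection(event)

on_organization_created({"id": "root", "name": "Acme", "parent_id": None})
on_organization_created({"id": "a", "name": "Acme UK", "parent_id": "root"})
on_organization_created({"id": "b", "name": "Acme London", "parent_id": "a"})
print(children_by_parent, ancestors["b"])
```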
We're currently also considering writing a generic service for aggregating information from events for things like searches, where projections could be scripted. That only makes the invisible dependencies problem less conspicuous, it does not solve it.

Resources