With CQRS Pattern how are you still limiting to one service per database - microservices

According to my understanding
We should only have one service connecting to a database
With CQRS you will be keeping two databases in sync, hypothetically using some “service” glueing them together
Doesn’t that now mean there’s one service whose only purpose is to keep the two in sync, and another service to access the data?
Questions
Doesn’t that go against the rule above? Or does this pattern only apply when native replication is being used?
Also, other than being able to independently scale the replicated database for more frequent reads, doesn’t the process of keeping both in sync take away from that benefit? Either way, we’re writing the same data to both in the end.
Ty!

We should only have one service connecting to a database
I would rephrase this to: each service's data should be accessible only via that service's API, and all internals, like the database, should be completely hidden. Hence, there should be no (logical) database sharing between services.
With CQRS you will be keeping two databases in sync, hypothetically using some “service” glueing them together
CQRS is a pattern for splitting how a service talks to its data layer. A typical example is separating reads from writes, since those are fundamentally different. E.g. you do writes as commands via a queue and reads as exports via some stream.
CQRS is just an access pattern; using it (or not using it) does nothing for synchronization. If you do need a service to keep two other ones in sync, then you should still use the services' APIs instead of going into the data layer directly. CQRS could sit under those APIs to optimize data processing.
The text above might address your first question. As for the second one: keeping a database encapsulated in a service does allow that database (and service) to be scaled as needed. So if you are using replication for reads, that would be a reasonable solution (assuming you address async vs. sync replication).
As for "writing data on both ends", I am not sure what that means.
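To make the "CQRS is an access pattern inside one service" point concrete, here is a minimal sketch. All names (RenameUser, UserService, submit, get_name) are made up for illustration, and an in-memory deque stands in for a real queue. The point is only that callers see commands and queries, never the storage.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RenameUser:
    """A command: an intent to change state (hypothetical example)."""
    user_id: str
    new_name: str


class UserService:
    """Owns its data; callers only ever see submit() and get_name()."""

    def __init__(self) -> None:
        self._commands = deque()   # write path: commands wait in a queue
        self._names = {}           # private storage, never shared with other services

    # --- command side ---
    def submit(self, command: RenameUser) -> None:
        self._commands.append(command)

    def process_pending(self) -> int:
        """Apply queued commands in order; returns how many were applied."""
        applied = 0
        while self._commands:
            command = self._commands.popleft()
            self._names[command.user_id] = command.new_name
            applied += 1
        return applied

    # --- query side ---
    def get_name(self, user_id: str) -> Optional[str]:
        return self._names.get(user_id)


service = UserService()
service.submit(RenameUser("u1", "Ada"))
before = service.get_name("u1")   # command accepted but not processed yet
service.process_pending()
after = service.get_name("u1")
```

Whether the write path and the read path end up in one database or two is an internal detail of the service; nothing outside it needs to know.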

Related

Does storing another service's data violate the Single Responsibility Principle of Microservice

Say I have a service that manages warehouses (which is not updated very frequently). I have a sales service that requires the list of stores (to search through and use as necessary). Suppose I get the list of stores from the store service and save it (let's say in Redis) inside my sales service, and ensure that Redis is updated whenever the list of stores changes. Would that violate the single responsibility principle of microservice architecture?
No, it does not. It is actually quite a common approach in microservice architecture for a service to store a copy of related data from other services and use some mechanism to keep it in sync (usually asynchronous communication via a message broker).
Storing a copy of the data does not transfer ownership of that data from the service that manages it.
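A rough sketch of that idea, under some loud assumptions: InMemoryBroker is a synchronous stand-in for a real message broker, the topic name "store.changed" is invented, and a plain dict plays the role of Redis. Only the store service writes store data; the sales service merely mirrors it.

```python
from collections import defaultdict


class InMemoryBroker:
    """Stand-in for a real message broker (hypothetical, synchronous)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self._subscribers[topic]:
            handler(message)


class StoreService:
    """Owner of store data: the only place where stores are created or changed."""

    def __init__(self, broker):
        self._broker = broker
        self._stores = {}

    def upsert_store(self, store_id, name):
        self._stores[store_id] = name
        self._broker.publish("store.changed", {"id": store_id, "name": name})


class SalesService:
    """Keeps a read-only copy of the stores for fast local lookups."""

    def __init__(self, broker):
        self._store_copy = {}   # would be Redis in the question's setup
        broker.subscribe("store.changed", self._on_store_changed)

    def _on_store_changed(self, event):
        self._store_copy[event["id"]] = event["name"]

    def find_store(self, store_id):
        return self._store_copy.get(store_id)


broker = InMemoryBroker()
stores = StoreService(broker)
sales = SalesService(broker)
stores.upsert_store("s1", "Downtown")
stores.upsert_store("s1", "Downtown (renamed)")
```

With a real broker the update arrives with some delay, so the copy is eventually consistent rather than always current.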
It is common, and there is a microservice pattern for it (CQRS).
If you need some information from other services / microservices to join with your data, then you need to store that information.
Whenever you are making a design decision about whether to always issue requests against the downstream system or to use a local copy, you are basically making a trade-off between performance and data freshness.
If you always issue RPC calls, then you prefer data freshness over performance
How often you need to issue RPC calls has a direct impact on performance
If you utilize caching to gain performance then there is a chance to use stale data (depending on your business it might be okay or unacceptable)
Cache invalidation is a pretty tough problem domain, so it can cause headaches
Caching one microservice's data does not violate data ownership, because caching just reads the data; it does not delete or update the existing data. It is similar to having a single-leader (master), multiple-followers setup or a read-write lock. As long as there is only one place where data can be created, modified or deleted, data ownership is implemented correctly.
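The freshness-versus-performance trade-off can be illustrated with a small read-through cache with a time-to-live. This is only a sketch: TtlCache and fetch_store_name are hypothetical names, the "RPC" is a local function, and the clock is injected so the example behaves deterministically.

```python
import time


class TtlCache:
    """Read-through cache: serves a local copy until it is older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # the RPC to the owning service
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the example is deterministic
        self._entries = {}           # key -> (value, fetched_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]          # fast, but may be stale
        value = self._fetch(key)     # slow, but fresh
        self._entries[key] = (value, now)
        return value


rpc_calls = []


def fetch_store_name(store_id):
    """Stand-in for a remote call; returns a new 'version' on every call."""
    rpc_calls.append(store_id)
    return f"{store_id}-v{len(rpc_calls)}"


fake_now = [0.0]
cache = TtlCache(fetch_store_name, ttl_seconds=30, clock=lambda: fake_now[0])
first = cache.get("s1")    # miss: one RPC
second = cache.get("s1")   # hit: no RPC, same (possibly stale) value
fake_now[0] = 31.0
third = cache.get("s1")    # expired: fetched again
```

A shorter TTL means fresher data and more RPC calls; a longer one means the opposite. The cache never writes back, so ownership stays with the source service.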

Eventually consistent DB : How to deal with relational data?

So let's say we have microservices that use an event broker to communicate with each other.
To preserve data sovereignty, each microservice keeps denormalized documents.
So whenever data A changes, the service that changed it fires a 'DataAHasChanged' event. Next, all the microservices that have subscribed to this event update their own documents to keep data A consistent. (A here is not a foreign key but the actual data, since it's denormalized.)
This does not seem good to me if services have multiple documents that contain data A, and if data A changes often. I would rather just send an API call to the other service, using data A's ID as a foreign key.
A real-world use case would be:
A user creates 'contract requests', and each has information about multiple vendors.
Vendor information changes often.
So if there are 2000 contract requests, whenever a vendor changes their information we have to go through every contract request and update the denormalized document.
Is eventual consistency still the best practice in this case? Or should I just use a synchronous call to read the data from the vendor service?
Thank you.
I would revisit the microservices decoupling and ask: who is the source of truth for each type of data? You'll probably arrive at one service owning the documents, and that service will be responsible for updating them as well.
Even with a dedicated service owning the documents, you still have to decide what consistency guarantees you need. Usually you start with SLAs: how available should your service be? How is the data stored? Often the underlying data storage will dictate those.
Also, I would like to note that even with synchronous calls your system will be eventually consistent: since it takes time to execute all those calls, there will be a period when the system as a whole might see stale data.
If you really need true strong consistency, you may have to pick the right storage for that. I would go with a strongly consistent option, assuming my performance and availability goals are met. The reason for preferring strong consistency is that it is much easier to reason about; hence the system gets simpler.
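To put rough shape on the two options from the question (embedding vendor data versus storing only the vendor ID), here is a toy comparison. The data, field names and functions are all invented for illustration; in a real system the vendor dict would live in the vendor service and the lookup would be a remote call.

```python
# Hypothetical data; in a real system each structure would live in a different service.
vendors = {"v1": {"name": "Acme", "phone": "111"}}

# Option A: every contract request embeds a copy of the vendor data.
embedded_requests = [
    {"id": i, "vendor_id": "v1", "vendor": dict(vendors["v1"])}
    for i in range(2000)
]


def on_vendor_changed(vendor_id, new_data):
    """Event handler for option A: rewrites every document that embeds the vendor."""
    touched = 0
    for request in embedded_requests:
        if request["vendor_id"] == vendor_id:
            request["vendor"] = dict(new_data)
            touched += 1
    return touched


# Option B: contract requests store only the vendor ID.
referencing_requests = [{"id": i, "vendor_id": "v1"} for i in range(2000)]


def read_request(request, fetch_vendor):
    """Read path for option B: one lookup (an RPC in practice) per read."""
    return {"id": request["id"], "vendor": fetch_vendor(request["vendor_id"])}


# The vendor changes its phone number.
vendors["v1"] = {"name": "Acme", "phone": "222"}
documents_rewritten = on_vendor_changed("v1", vendors["v1"])              # option A pays on write
fresh_view = read_request(referencing_requests[0], vendors.__getitem__)   # option B pays on read
```

Option A does 2000 writes per vendor change but reads are local; option B does no fan-out but every read depends on the vendor service being reachable. Which cost is acceptable depends on the read/write ratio and the availability goals mentioned above.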

Event sourcing, hold read side consistent

I'm new to ES and still trying to sort everything out in my head. I have heard that ES actually solves the consistency issue between the write and read databases (with some delay, for sure). But I still do not fully understand how.
If a command comes into the domain and the aggregate root fires an event to update the event store, is the same event also sent to update the read side? But what if the message is lost? Then we will have an outdated read side.
Are projections the only solution? So instead of updating from the event, the read side walks through the event store and reproduces the aggregate (from the beginning or from some snapshot). But in that case it probably breaks some rules, as the read side should be simple and should not know about the domain. Also, the read side is usually a separate application, so it can't know about the aggregate.
For sure we can also use RabbitMQ or some other message broker so as not to lose messages, and I actually think we need to. But I also read that to make it consistent "you can use Rabbit or ES"; but again, how can ES make it consistent on its own?
Benjamin is completely right about the purpose of Event Sourcing.
My answer aims to add some more details.
First:
Read models and projections aren't supposed to represent the aggregate state.
Projections are the way for event-sourced systems to build the read model for CQRS. CQRS in essence postulates that write and read models usually serve different purposes and therefore it makes perfect sense to use another model for the read side.
Therefore, you often find multiple projections building different, narrowly purposed models, targeting specific needs for queries.
Second:
By "solving consistency issues" you probably mean that in event-sourced systems each state transition is represented as an event (or multiple events). Therefore, writes are always transactional. The database you choose as your event store should support (possibly via some library or additional tool) real-time subscriptions that allow you to receive new events in your projection, in order. A new projection will start reading from the beginning of the stream and eventually catch up to real time. Subscriptions usually need to keep the current processing position in the global stream of events, so that when the projection restarts it resumes receiving events from the last position known to it.
By doing this, you will guarantee that every state transition in the write model will be reflected in the read model. This is probably what you mean in your original question.
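The subscription-with-position idea can be sketched in a few lines. This is an in-memory toy, not any particular event store's API: EventStore, read_from and OrderStatusProjection are hypothetical names, and a real system would persist the checkpoint together with the read model.

```python
class EventStore:
    """Append-only global stream; an event's position is its index in the list."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def read_from(self, position):
        """Yield (position, event) pairs in order, starting at `position`."""
        for index in range(position, len(self._events)):
            yield index, self._events[index]


class OrderStatusProjection:
    """Read model: latest status per order, plus the position it has processed up to."""

    def __init__(self, statuses=None, checkpoint=0):
        self.statuses = dict(statuses or {})
        self.checkpoint = checkpoint  # position of the next event to process

    def catch_up(self, store):
        handled = 0
        for position, event in store.read_from(self.checkpoint):
            self.statuses[event["order_id"]] = event["status"]
            self.checkpoint = position + 1
            handled += 1
        return handled


store = EventStore()
store.append({"order_id": "o1", "status": "placed"})
store.append({"order_id": "o1", "status": "paid"})

projection = OrderStatusProjection()
first_run = projection.catch_up(store)    # new projection: reads from the start

# Simulate a restart: state and checkpoint are reloaded, then a new event arrives.
restarted = OrderStatusProjection(projection.statuses, projection.checkpoint)
store.append({"order_id": "o1", "status": "shipped"})
second_run = restarted.catch_up(store)    # resumes after the last known position
```

Because the projection pulls from the stored stream instead of relying on a one-off message, a missed notification only delays the read side; the next catch-up still sees every event, in order.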
Third:
Now, all those things above imply that you cannot use a message bus (only) to deliver events to projections. Brokers typically give no strict ordering guarantees and can deliver a message more than once. Also, message brokers usually don't keep history, so you cannot build new projections at will.
However, it doesn't mean that you can't use brokers at all. Some projections don't require ordering and are idempotent. But the feed for events to publish via a broker is the same subscription, so you get guaranteed delivery and can read past events if necessary.
Fourth:
CQRS doesn't imply separate databases. Sometimes, using CQRS just means that you use some persistence layer for your domain objects, so you read and write aggregates. But for queries, you just query at will, whatever you want. A database view is a technical example of CQRS.
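As a small illustration of "a database view is a technical example of CQRS", the following sketch uses Python's built-in sqlite3 module with one in-memory database. The table, view and function names are invented for the example; writes go to the table, queries go through the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer    TEXT NOT NULL,
        total_cents INTEGER NOT NULL
    );

    -- Query model: shaped for reading, derived from the write model.
    CREATE VIEW customer_totals AS
        SELECT customer,
               COUNT(*)         AS order_count,
               SUM(total_cents) AS total_cents
        FROM orders
        GROUP BY customer;
""")


def place_order(customer, total_cents):
    """Command side: writes go to the orders table only."""
    with conn:
        conn.execute(
            "INSERT INTO orders (customer, total_cents) VALUES (?, ?)",
            (customer, total_cents),
        )


def customer_summary(customer):
    """Query side: reads go through the view, never the table layout."""
    return conn.execute(
        "SELECT order_count, total_cents FROM customer_totals WHERE customer = ?",
        (customer,),
    ).fetchone()


place_order("alice", 1200)
place_order("alice", 800)
place_order("bob", 500)
alice = customer_summary("alice")
```

Here the read model is always consistent with the write model because both live in the same database; the separation is in the code paths, not in the storage.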
Almost there:
Projections need to have little to no logic, it is true. The main point here is to ensure idempotency, if possible, so projections usually should not use operations to calculate new values based on old values and information from events.
But projections will know about your domain. Everything in your system should know about your domain.
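The idempotency point is easiest to see with a duplicate delivery. In this sketch (event shape and function names are assumptions made for the example) the same event is applied twice: the handler that computes the new value from the old one drifts, while the handler that copies the value carried by the event does not.

```python
def apply_delta(view, event):
    """Not idempotent: the new value is computed from the old value."""
    view[event["sku"]] = view.get(event["sku"], 0) + event["delta"]


def apply_absolute(view, event):
    """Idempotent: the event already carries the resulting value."""
    view[event["sku"]] = event["stock_after"]


# Hypothetical event: 3 widgets were taken out of stock, leaving 7.
event = {"sku": "widget", "delta": -3, "stock_after": 7}

delta_view = {"widget": 10}
absolute_view = {"widget": 10}

for _ in range(2):  # the same event is delivered twice
    apply_delta(delta_view, event)
    apply_absolute(absolute_view, event)
```

This only works if the write side puts the resulting state into the event, which is a modelling decision rather than something the projection can fix on its own.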
And last:
You can definitely use different databases for write and read models without going as far as Event Sourcing. You just need to choose a database that supports a change feed. SQL Server, Postgres, Cosmos DB and other databases have such functionality.
P.S. I'd suggest spending some time studying those concepts. I can point to the book repository, it has CQRS and Event Sourcing examples: https://github.com/PacktPublishing/Hands-On-Domain-Driven-Design-with-.NET-Core
I have heard that ES is actually solving the consistency issue between write and read database
To the best of my knowledge, Event Sourcing has NOTHING to do with consistency between reads and writes to your DB. Read/write consistency has more to do with the type of database you are using: relational databases are mostly ACID, whereas non-relational databases are often eventually consistent.
ES is not meant for that. Instead, ES is about this: "Capture all changes to an application state as a sequence of events" (Martin Fowler).
ES works like a time machine: it allows you to restore the state of your application as of a specific date and time in the past.
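The "time machine" property follows directly from keeping the events: state at any past moment is just a replay that stops early. A minimal sketch, with a made-up shopping-cart event log:

```python
from datetime import datetime

# Hypothetical event log, ordered by time: (timestamp, event type, item).
events = [
    (datetime(2024, 1, 1), "ItemAdded", "apple"),
    (datetime(2024, 1, 2), "ItemAdded", "pear"),
    (datetime(2024, 1, 3), "ItemRemoved", "apple"),
]


def state_at(event_log, moment):
    """Rebuild the cart by replaying every event recorded up to `moment`."""
    cart = set()
    for timestamp, kind, item in event_log:
        if timestamp > moment:
            break  # the log is ordered, so everything after this is in the future
        if kind == "ItemAdded":
            cart.add(item)
        elif kind == "ItemRemoved":
            cart.discard(item)
    return cart


cart_on_jan_2 = state_at(events, datetime(2024, 1, 2))
cart_on_jan_3 = state_at(events, datetime(2024, 1, 3))
```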

Microservices: model sharing between bounded contexts

I am currently building a microservices-based application with the MEAN stack and am running into several situations where I need to share models between bounded contexts.
As an example, I have a User service that handles the registration process as well as login (generating a JWT), logout, etc. I also have a File service which handles the uploading of profile pics and other images the user happens to upload. Additionally, I have a Friends service that keeps track of the associations between members.
Currently, I am adding the GUID of the user from the user table used by the User service, as well as the first, middle and last name fields, to the File table and the Friend table. This way I can query for these fields whenever I need them in the other services (Friend and File) without needing to make REST calls to get the information every time it is queried.
Here is the caveat:
The downside seems to be that I have to notify the File and Friend tables (I chose Seneca with RabbitMQ) whenever a user updates their information in the User table.
1) Should I be worried about the services getting too chatty?
2) Could this lead to any performance issues if a lot of updates take place over an hour, let's say?
3) In trying to isolate boundaries, I am just not seeing another way of pulling this off. What is the recommended approach to solving this issue, and am I on the right track?
It's a trade-off. I would personally not store the user details alongside the user identifier in the dependent services. But neither would I query the users service to get this information. What you probably need is some kind of read-model for the system as a whole, which can store this data in a way which is optimized for your particular needs (reporting, displaying together on a webpage etc).
The read-model is a pattern which is popular in the event-driven architecture space. There is a really good article that talks about these kinds of questions (in two parts):
https://www.infoq.com/articles/microservices-aggregates-events-cqrs-part-1-richardson
https://www.infoq.com/articles/microservices-aggregates-events-cqrs-part-2-richardson
Many common questions about microservices seem to be largely around the decomposition of a domain model, and how to overcome situations where requirements such as querying resist that decomposition. This article spells the options out clearly. Definitely worth the time to read.
In your specific case, it would mean that the File and Friends services would only need to store the primary key for the user. However, all services should publish state changes which can then be aggregated into a read-model.
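A sketch of what such a read-model could look like for the User, File and Friends services. The event types and field names below are invented for illustration; the idea is that each service publishes its own state changes keyed by the user's ID, and a separate component folds them into one query-friendly document.

```python
class ProfileReadModel:
    """Aggregates events from several services into one document per user."""

    def __init__(self):
        self._profiles = {}

    def _profile(self, user_id):
        return self._profiles.setdefault(
            user_id, {"name": None, "avatar_url": None, "friend_ids": set()}
        )

    def handle(self, event):
        profile = self._profile(event["user_id"])
        if event["type"] == "UserNameChanged":            # published by the User service
            profile["name"] = event["name"]
        elif event["type"] == "ProfilePictureUploaded":   # published by the File service
            profile["avatar_url"] = event["url"]
        elif event["type"] == "FriendAdded":              # published by the Friends service
            profile["friend_ids"].add(event["friend_id"])

    def get(self, user_id):
        return self._profiles.get(user_id)


read_model = ProfileReadModel()
for event in [
    {"type": "UserNameChanged", "user_id": "u1", "name": "Ada Lovelace"},
    {"type": "ProfilePictureUploaded", "user_id": "u1", "url": "https://example.invalid/u1.png"},
    {"type": "FriendAdded", "user_id": "u1", "friend_id": "u2"},
    {"type": "UserNameChanged", "user_id": "u1", "name": "Ada King"},
]:
    read_model.handle(event)

profile = read_model.get("u1")
```

The File and Friends services never store the user's name; a name change touches only the read-model.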
If you are worried about a high volume of messages and high TPS (for example, 100,000 TPS for producing and consuming events), I suggest using Apache Kafka or NATS instead of RabbitMQ (the Go version, because NATS also has a Ruby version) in order to support a high volume of messages per second.
Also, regarding database design, you should design each microservice around business capabilities and a bounded context, according to domain-driven design (DDD). Unlike in SOA, it is suggested that each microservice have its own database, so you should not worry about normalization: you may have to repeat many structures, fields, tables and features in each microservice in order to keep them decoupled from each other and let them work independently, which improves availability and scalability.
You can also use Event Sourcing + CQRS or transaction log tailing to avoid 2PC (two-phase commit), which is not recommended when implementing microservices, in order to exchange events between your microservices and manipulate state with eventual consistency, in line with the CAP theorem.

Event sourcing, CQRS and database in Microservice

I am quite new to microservice architecture and am reading this post: http://microservices.io/patterns/data/event-sourcing.html to get familiar with event sourcing and data storage in a microservice architecture.
I have read many documents about three important aspects of the system:
Using event sourcing instead of a simple shared DB with an ORM and row updates.
Events are Java objects.
To save data permanently, we need to use a DB (either relational or NoSQL).
Here are my questions:
How does a database fit in with event sourcing? I have read about the CQRS pattern, but I cannot understand how it is related to the event store and event objects.
Can anybody provide a complete picture and the set of operations that happen with all the players together: the CQRS pattern, event sourcing (including the event storage module) and finally the different microservices?
In a system composed of many microservices, should we have one event storage, or does each microservice have its own? Or are both possible?
Same question about CQRS: is this pattern implemented in all microservices or only in one?
Finally, when using a microservice architecture, is it mandatory to have only one DB, or should each microservice have its own?
As you can see, I have understood all the small pieces of the game, but I cannot relate them to each other to compose the whole picture, especially the relationship between CQRS, event sourcing and storing data in a DB.
I read many articles for example :
https://ookami86.github.io/event-sourcing-in-practice/
https://msdn.microsoft.com/en-us/library/jj591577.aspx
But all of them discuss only the small players. Even a hand-drawn picture would be appreciated.
How database comes along with event sourcing? I have read CQRS pattern, but I can not understand how CQRS pattern is related to event store and event objects ?
The "query" part of CQRS tells you how to create a projection of events that is applicable in some bounded context, where a database can be used as a means to persist that projection. The "command" part allows you to isolate data transformation logic and decouple it from the "query" and "persistence" aspects of your app. To put it simply: you just project your event stream into the database in many ways (a projection could be relational as well), depending on the task. In this model, "query" and "command" each have their own way of projecting and storing event data, optimised for the needs of that specific part of the application. The same data will be stored in events and in projections; this allows you to achieve simplicity and loose coupling among subdomains (bounded contexts, microservices).
Can any body provide me a complete picture and set of operations happens with all players to gather: CQRS pattern , Event sourcing (including event storage module) and finally different microservices?
Have you seen Greg Young's attempt to provide the simplest possible implementation? If you are still confused, consider asking a more specific question about his example.
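As a much smaller illustration than a full sample application, the idea of projecting one event stream "in many ways" can be sketched like this. The event types and projection names are assumptions made for the example; each projection is just a fold over the same list of events, and its result could be persisted in whatever database suits the query.

```python
# One hypothetical event stream, as it would be read from the event store.
events = [
    {"type": "OrderPlaced", "order_id": "o1", "customer": "alice", "total": 30},
    {"type": "OrderPlaced", "order_id": "o2", "customer": "bob", "total": 20},
    {"type": "OrderCancelled", "order_id": "o2"},
    {"type": "OrderPlaced", "order_id": "o3", "customer": "alice", "total": 5},
]


def project_open_orders(stream):
    """Projection 1: open orders keyed by ID (could be rows in a relational table)."""
    orders = {}
    for event in stream:
        if event["type"] == "OrderPlaced":
            orders[event["order_id"]] = {
                "customer": event["customer"],
                "total": event["total"],
            }
        elif event["type"] == "OrderCancelled":
            orders.pop(event["order_id"], None)
    return orders


def project_order_history(stream):
    """Projection 2: every order ID a customer ever placed, cancelled or not."""
    history = {}
    for event in stream:
        if event["type"] == "OrderPlaced":
            history.setdefault(event["customer"], []).append(event["order_id"])
    return history


open_orders = project_open_orders(events)
order_history = project_order_history(events)
```

The same events feed both views, and a third view can be added later by replaying the stream from the beginning.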
In a system composed of many microservices, should we have one event storage or each microservice has its own ? or both possible ?
It is usually one common event storage, but there could definitely be exceptions: edge cases where you really do need multiple storages for different microservices here and there. It all depends on the business case. If you are not sure, most likely you just need a single storage for now.
same question about CQRS. This pattern is implemented in all microservices or only in one ?
It could be implemented in the most performance-demanding microservices. It all depends on how complex your implementation becomes when you introduce CQRS into it. If it gets simpler, why not implement it everywhere? But if people in your team become more and more confused by the need to perform more explicit synchronisation between the command and query parts, maybe CQRS is too much for you. It all depends on your team, on your domain ... there is no single simple answer, unfortunately.
Finally, in case of using microservice architecture, it is mandatory to have only one DB or each Microservice should have its own ?
If different microservices share the same tables, this is usually considered an antipattern: it increases coupling, and the system becomes more fragile. You can still share the same database, but there should be no shared tables. Also, tables from one microservice had better not have FKs to tables in another microservice, for the same reason: to reduce coupling.
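A tiny sketch of "same database, no shared tables, no cross-service FKs", using Python's built-in sqlite3 module. The table names and the prefix-per-service convention are assumptions for the example, not a standard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Owned by the (hypothetical) users service.
    CREATE TABLE users_user (
        id   TEXT PRIMARY KEY,
        name TEXT NOT NULL
    );

    -- Owned by the (hypothetical) orders service. user_id is a plain value:
    -- there is no REFERENCES clause, so the schema does not couple the services.
    CREATE TABLE orders_order (
        id      TEXT PRIMARY KEY,
        user_id TEXT NOT NULL,
        total   INTEGER NOT NULL
    );
""")

# The orders service can write without touching, or even knowing about, users_user.
conn.execute("INSERT INTO orders_order VALUES ('o1', 'u-unknown', 100)")

cross_service_fks = conn.execute("PRAGMA foreign_key_list(orders_order)").fetchall()
order_count = conn.execute("SELECT COUNT(*) FROM orders_order").fetchone()[0]
```

The price is that the database no longer guarantees the user exists; that check, if needed, moves to the services' APIs or to events.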
PS: consider not asking coarse-grained questions, as it will be harder to get a response from people. Several smaller, more specific questions have a better chance of being answered.

Resources