We have started migrating our existing project to a microservice architecture. After going through a lot of videos and lectures, we concluded that a service should do one task, and only one task, and be great at it. Services should be designed around nouns and verbs.
We have an entity that basically has CRUD operations. Add, update and delete are the least used operations, while GET requests are far more frequent. Typically, add/update/delete are done by admins.
What we thought of is breaking the CRUD entity into two services:
EntityCUDService (create/update/delete)
EntityLookupService (get)
Both of these services point to the same collection in MongoDB (or, say, the same table in some SQL database).
If EntityCUDService makes changes to the collection/table, EntityLookupService fails.
We have heard of maintaining semantic versioning, which sounds okay, but we have also heard that microservices should not share a model or data source. So what would be the optimal way to handle this, given that we have tons of gets but only tens of updates/adds for the same entity?
Any help is greatly appreciated.
Typically, a microservice should manage a single entity. So in your case you can have one microservice that manages the entity (all of the operations on it). If you then split that service by read and write operations, you are following the CQRS pattern: you end up with two services over the same entity, a command service and a query service. I would suggest starting with one service that manages the entity and, only if required, splitting it further into separate services for reads and writes. If you do go with CQRS, have a look at event sourcing as well, since it fits nicely with CQRS in a microservices design.
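To make the command/query split concrete, here is a minimal in-memory sketch. The class names, the "upserted"/"deleted" change kinds and the plain list standing in for a message bus are all assumptions made for illustration; a real system would use a real store and broker.

```python
from dataclasses import dataclass, field


@dataclass
class EntityCommandService:
    """Write side: handles create/update/delete and records each change."""
    store: dict = field(default_factory=dict)    # hypothetical write store
    changes: list = field(default_factory=list)  # stand-in for a message bus

    def create(self, entity_id, data):
        if entity_id in self.store:
            raise KeyError(f"{entity_id} already exists")
        self.store[entity_id] = dict(data)
        self.changes.append(("upserted", entity_id, dict(data)))

    def update(self, entity_id, data):
        self.store[entity_id].update(data)
        self.changes.append(("upserted", entity_id, dict(self.store[entity_id])))

    def delete(self, entity_id):
        del self.store[entity_id]
        self.changes.append(("deleted", entity_id, None))


@dataclass
class EntityQueryService:
    """Read side: keeps its own read model, built only from published changes."""
    read_model: dict = field(default_factory=dict)
    applied: int = 0  # how many changes have been applied so far

    def sync(self, changes):
        # Apply only the changes that have not been seen yet.
        for kind, entity_id, data in changes[self.applied:]:
            if kind == "upserted":
                self.read_model[entity_id] = data
            else:
                self.read_model.pop(entity_id, None)
        self.applied = len(changes)

    def get(self, entity_id):
        return self.read_model.get(entity_id)
```

The query side never touches the write store, so the write side can change its storage layout as long as the published changes keep their shape. The price is that reads are only as fresh as the last sync.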
Related
I have a microservice that needs some data it does not own. It needs a read-only cache of data that is owned by another service. I am looking for guidance on how to implement this.
I don't want my microservice to call another microservice. I have too much data that is used in a join for this to be successful. In addition, I don't want my service to be dependent on another service (which may be dependent on another ...).
Currently, I am publishing an event to a queue. My service subscribes and maintains a copy of the data. I am having problems staying in sync with the source system. Plus, our DBAs are complaining about data duplication. I don't see a lot of information on this topic.
Is there a pattern for this? What is it called?
First of all, there are a couple of ways to share data, and you mention two of them.
One service calls the other service to get the data when it is required. This is good because you get up-to-date data and the consuming service needs no extra management. The problem is that if you call it too many times, the other service's performance may suffer.
Another solution is to maintain a local copy of that data in the consuming service using a pub/sub mechanism.
Depending on your requirements and architecture, you can keep this copy in the consuming service's actual database or in some type of persisted cache.
The downside here is consistency. In a distributed architecture you will not get strong consistency; you have to rely on eventual consistency.
Another solution, depending on your requirements, is to separate out the tables that need to be joined into a separate service. It depends on your use case.
If you still want consistency, then instead of having the first service update its data and then publish, create a mediator component that calls the two services synchronously. Things get complicated here, as you are now trying to implement a transaction over a distributed system.
One more point: when a product is built around a microservice architecture, it is not only a technical move. As an organization and as a team, you need to understand that something that works in a monolith does not work the same way in microservices. The DBAs need to understand this too: in microservices, duplication of data across schemas (and of other things, such as code) is preferred over reuse.
Last but not least, if it is always necessary to call another service to get data, it is worth checking the service boundaries as well. Sometimes services need to be merged because the business functionality has to stay together.
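One common way to keep such a local copy from drifting is to have the owning service put a per-record version number on every event, so the consumer can ignore duplicates and out-of-order deliveries. The sketch below assumes exactly that; the event fields ("key", "version", "payload") and the class name are hypothetical, not part of any particular broker's API.

```python
class LocalReplica:
    """Read-only copy of data owned by another service, updated from events."""

    def __init__(self):
        self._rows = {}  # key -> (version, payload); payload None marks a delete

    def handle(self, event):
        """Apply one event; return True if it changed the replica."""
        key, version = event["key"], event["version"]
        current = self._rows.get(key)
        if current is not None and current[0] >= version:
            return False  # stale or duplicate delivery, safe to ignore
        # Deletes are kept as tombstones so that late, older events stay ignored.
        self._rows[key] = (version, event.get("payload"))
        return True

    def get(self, key):
        row = self._rows.get(key)
        return None if row is None else row[1]
```

With this in place the consumer still lags behind the owner (eventual consistency), but redelivered or reordered messages can no longer overwrite newer data.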
Is it considered good practice to connect to two different databases in one microservice API? Or do I need to implement another microservice for working with the second database and call the new microservice's API from the first one?
The main thing is that you have only one microservice per database, but it is ok to have multiple databases per microservice if the business case requires it.
Your microservice can abstract multiple data sources, connect them, etc., and then just give a consistent API to whoever is using it. Whoever is using it doesn't care how many data sources there actually are.
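As a rough illustration of one service hiding two data sources behind a single API, here is a small sketch. The service name, the two stores and the field names are made up for the example, and plain dictionaries stand in for the two databases.

```python
class CustomerProfileService:
    """One service, two private data sources, one consistent API."""

    def __init__(self, accounts_db, preferences_db):
        # Both stores belong to this service only; callers never see them.
        self._accounts = accounts_db
        self._preferences = preferences_db

    def get_profile(self, customer_id):
        account = self._accounts.get(customer_id)
        if account is None:
            return None
        preferences = self._preferences.get(customer_id, {})
        # The caller gets one merged view and cannot tell how many sources exist.
        return {
            "id": customer_id,
            "name": account["name"],
            "preferences": preferences,
        }
```

The service could later merge the two stores, or split one of them, without any change for its callers.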
It becomes an issue if the same database is abstracted by multiple microservices. Then your microservice is no longer isolated and can break, because the data source you are using was changed by another team that uses the same data source.
As far as my limited experience allows me to understand, one of the core concepts of a "microservice" is that it relies on its own database, which is independent of other microservices.
Diving into how to handle distributed transactions in a microservices system, the best strategy seems to be the Event Sourcing pattern whose core is the Event Store.
Is the event store shared between different microservices? Or are there multiple independent event store databases, one for each microservice, and a single common event broker?
If the first option is the solution, using CQRS I can now assume that every microservice's database is intended as query-side, while the shared event store is on the command-side. Is it a wrong assumption?
And since we are on the topic: how many retries do I have to do in case of a concurrent write to a stream using optimistic locking?
A very big big thanks in advance for every piece of advice you can give me!
Is the event store shared between different microservices? Or are there multiple independent event store databases, one for each microservice, and a single common event broker?
Every microservice should write to its own Event store, from its point of view. This could mean separate instances or separate partitions inside the same instance. This allows the microservices to be scaled independently.
If the first option is the solution, using CQRS I can now assume that every microservice's database is intended as query-side, while the shared event store is on the command-side. Is it a wrong assumption?
Kinda. As I wrote above, each microservice should have its own Event store (or a partition inside a shared instance). A microservice should not append events to another microservice's Event store.
Regarding reading events, I think that reading should in general be permitted. Polling the Event store is the simplest (and, in my opinion, the best) way to propagate changes to other microservices. It has the advantage that the remote microservice polls at the rate it can handle and for the events it wants. This scales very nicely by creating as many Event store replicas as needed.
There are some cases where you would not want to publish every domain event from the Event store. Some say that there can be internal domain events that other microservices should not depend on. In this case you could mark each event as free (or not) for external consumption.
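A minimal sketch of the polling idea, including a flag for events that are not meant for external consumers. The `EventStore` and `Poller` classes, the `public` flag and the method names are invented for this example and do not correspond to a specific event store product.

```python
class EventStore:
    """Append-only list of events; positions start at 1."""

    def __init__(self):
        self._events = []

    def append(self, event_type, data, public=True):
        self._events.append({
            "position": len(self._events) + 1,
            "type": event_type,
            "data": data,
            "public": public,  # False marks an internal domain event
        })

    def read_after(self, position, limit=100):
        """Return (public events in the next batch, new position)."""
        batch = self._events[position:position + limit]
        # The position moves past internal events too, so they are skipped once.
        return [e for e in batch if e["public"]], position + len(batch)


class Poller:
    """Remote microservice that reads at its own pace and picks what it wants."""

    def __init__(self, store, wanted_types, batch_size=100):
        self.store = store
        self.wanted_types = set(wanted_types)
        self.batch_size = batch_size
        self.position = 0  # last position this consumer has read up to

    def poll(self):
        events, self.position = self.store.read_after(self.position, self.batch_size)
        return [e for e in events if e["type"] in self.wanted_types]
```

Each consumer owns its position, so a slow consumer simply falls behind without affecting the writer or other consumers.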
The cleanest solution to propagate changes from a microservice is to have live queries to which other microservices can subscribe. It has the advantage that the projection logic does not leak into other microservices, but the disadvantage that the emitting microservice must define and implement those queries; you can do this when you notice that other microservices duplicate the projection logic. An example of such a query is the total order price in an ecommerce application. You could have a query like WhatIsTheTotalPriceOfTheOrder whose result is published every time an item is added to, removed from or updated in an Order.
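The order-total live query could look roughly like the sketch below. Only the idea of WhatIsTheTotalPriceOfTheOrder comes from the answer; the `Order` class, its method names and the use of integer cents for prices are assumptions for the example, and in-process callbacks stand in for real subscriptions.

```python
class Order:
    """Owns the projection logic and publishes the result on every change."""

    def __init__(self, order_id):
        self.order_id = order_id
        self._items = {}        # sku -> (unit_price_cents, quantity)
        self._subscribers = []  # callbacks standing in for remote subscribers

    def subscribe_total(self, callback):
        """Subscribe to the hypothetical WhatIsTheTotalPriceOfTheOrder query."""
        self._subscribers.append(callback)

    def set_item(self, sku, unit_price_cents, quantity):
        # Covers both adding a new item and updating an existing one.
        self._items[sku] = (unit_price_cents, quantity)
        self._publish()

    def remove_item(self, sku):
        self._items.pop(sku, None)
        self._publish()

    def total(self):
        return sum(price * qty for price, qty in self._items.values())

    def _publish(self):
        for callback in self._subscribers:
            callback(self.order_id, self.total())
```

Subscribers only ever see the computed total, so they do not need to know which events affect it or how it is calculated.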
And since we are on the topic: how many retries do I have to do in case of a concurrent write to a stream using optimistic locking?
As many as you need, i.e. until the write succeeds. You could have a limit of 99999, just to detect when something is horribly wrong with the retry mechanism. In any case, a write should be retried only when another write is done at the same time on the same stream (for one Aggregate instance), not on the entire Event store.
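Here is a small sketch of such a retry loop with optimistic locking on a single stream. The `Stream` class, `ConcurrencyError`, `append_with_retry` and the default limit of 5 are all made up for illustration; the point is only that each retry re-reads the stream, re-decides, and tries to append against the version it just read.

```python
class ConcurrencyError(Exception):
    """Raised when the stream changed since the caller last read it."""


class Stream:
    """Event stream of one aggregate instance; version = number of events."""

    def __init__(self):
        self.events = []

    @property
    def version(self):
        return len(self.events)

    def append(self, expected_version, event):
        if expected_version != self.version:
            raise ConcurrencyError(
                f"expected version {expected_version}, stream is at {self.version}"
            )
        self.events.append(event)


def append_with_retry(stream, decide, max_retries=5):
    """Re-read, re-decide and re-append until it works; return retries used."""
    for attempt in range(max_retries + 1):
        version = stream.version
        event = decide(list(stream.events))  # decision based on current history
        try:
            stream.append(version, event)
            return attempt
        except ConcurrencyError:
            continue  # someone else wrote to this stream first, try again
    raise RuntimeError(f"gave up after {max_retries} retries; check the retry logic")
```

The limit exists only as a safety net: under normal contention a write to one aggregate's stream should succeed after very few attempts.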
As a rule: in service architectures, which include microservices, each service tracks its state in a private database.
"Private" here primarily means that no other service is permitted to write to or read from it. This could mean that each service has a dedicated database server of its own, or services might share a single appliance but only have access permissions for their own piece.
Expressed another way: services communicate with each other by sharing information via the public API, not by writing messages into each other's databases.
For services using event sourcing, each service would have read and write access only to its streams. If those streams happen to be stored on the same home - fine; but the correctness of the system should not depend on different services storing their events on the same appliance.
TLDR: All of these patterns apply to a single bounded context (service if you like), don't distribute domain events outside your bounded context, publish integration events onto an ESB (enterprise service bus) or something similar, as the public interface.
Ok so we have three patterns here to briefly cover individually and then together.
Microservices
CQRS
Event Sourcing
Microservices
https://learn.microsoft.com/en-us/azure/architecture/microservices/
Core objective: Isolate and decouple changes in a system to individual services, enabling independent deployment and testing without collateral impact.
This is achieved by encapsulating change behind a public API and limiting runtime dependencies between services.
CQRS
https://learn.microsoft.com/en-us/azure/architecture/patterns/cqrs
Core objective: Isolate and decouple write concerns from read concerns in a single service.
This can be achieved in a few ways, but the core idea is that the read model is a projection of the write model optimised for querying.
Event Sourcing
https://learn.microsoft.com/en-us/azure/architecture/patterns/event-sourcing
Core objective: Use the business domain rules as your data model.
This is achieved by modelling state as an append-only stream of immutable domain events and rebuilding the current aggregate state by replaying the stream from the start.
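As a small illustration of rebuilding state by replay, the sketch below folds a stream of events into the current state of an order. The event names (ItemAdded, ItemRemoved), their fields and the function names are hypothetical and chosen only for the example.

```python
from functools import reduce


def apply(state, event):
    """Return the next state; never mutate the previous one."""
    kind, data = event
    if kind == "ItemAdded":
        sku = data["sku"]
        return {**state, sku: state.get(sku, 0) + data["quantity"]}
    if kind == "ItemRemoved":
        return {sku: qty for sku, qty in state.items() if sku != data["sku"]}
    return state  # unknown event types are ignored in this sketch


def replay(events):
    """Rebuild the current state by folding over the stream from the start."""
    return reduce(apply, events, {})


# The stream is the source of truth; the state is derived from it on demand.
stream = (
    ("ItemAdded", {"sku": "a", "quantity": 1}),
    ("ItemAdded", {"sku": "b", "quantity": 2}),
    ("ItemAdded", {"sku": "a", "quantity": 3}),
    ("ItemRemoved", {"sku": "b"}),
)
current_state = replay(stream)
```

Because the state is only ever derived, replaying a prefix of the stream gives the state as it was at that point in time.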
All Together
There is a lot of great content here https://learn.microsoft.com/en-us/previous-versions/msp-n-p/jj554200(v=pandp.10)
Each of these has its own complexity, trade-offs and challenges, and while it is a fun exercise, you should consider whether the costs outweigh the benefits. All of them apply within a single service or bounded context. As soon as you start sharing a data store between services, you open yourself up to issues: the shared data store cannot be changed in isolation, because it is now a public interface.
Rather, try publishing integration events to a shared bus as the public interface for other services and bounded contexts to consume and use to build projections of other domain contexts' data.
It's a good idea to publish integration events as idempotent snapshots of the current aggregate state (upsert X, delete X), especially if your bus is not persistent. This allows you to republish integration events from a domain if needed without producing an inconsistent state between consumers.
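To show why snapshot-style integration events are safe to republish, here is a minimal consumer sketch. The event shape ("op", "id", "state") and the function name are assumptions for the example; the only idea taken from the text is that each event is either "upsert X" or "delete X" and carries the full current state.

```python
def apply_integration_event(projection, event):
    """Apply one snapshot-style integration event to a consumer's projection."""
    if event["op"] == "upsert":
        # The event carries the full current state, so it simply replaces
        # whatever the consumer had before.
        projection[event["id"]] = event["state"]
    elif event["op"] == "delete":
        projection.pop(event["id"], None)  # deleting twice is harmless
    else:
        raise ValueError(f"unknown op: {event['op']}")
    return projection


events = [
    {"op": "upsert", "id": "job-1", "state": {"title": "Engineer"}},
    {"op": "upsert", "id": "job-2", "state": {"title": "Analyst"}},
    {"op": "upsert", "id": "job-1", "state": {"title": "Senior Engineer"}},
    {"op": "delete", "id": "job-2"},
]

once = {}
for event in events:
    apply_integration_event(once, event)

# Republishing the whole sequence leaves the consumer in the same state.
twice = {}
for event in events + events:
    apply_integration_event(twice, event)
```

A delta-style event such as "increase quantity by 1" would not have this property, which is why snapshots are the safer choice when the bus cannot replay history itself.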
I am working on a jobs site where I am thinking of breaking out the jobs matching section into a micro service - everything else is a monolith.
But when thinking about how the microservice should have its own separate database, that would mean having the microservice have a separate copy of all the jobs, given the monolith would still handle all job crud functionality.
Am I thinking about this the right way and is it normal to have multiple copies of the same data spread out across different microservices?
The idea of having different databases with the same data scares me a bit, since that creates the potential for things to get out of sync.
You are trying to move away from a monolith, and the approach you are taking is very common: take out a part of the monolith that can be converted into a microservice. The monolith shrinks over time and you end up with more microservices (MSs).
Coming to your question about data duplication: yes, this is a challenge, and some data needs to be duplicated, but it varies case by case and is difficult to judge without looking into the application.
You may expose an API so that the monolith can get/create the data if needed. I strongly suggest not sacrificing or compromising the microservice's data model to avoid duplication, because the MS is going to be more important than your monolith in the future. Keep in mind that you should avoid adding any new code to the monolith, and even if you have to, ask the MS for the data rather than the monolith.
One more thing you can try: instead of REST API calls between microservices, you can use a caching mechanism with an event bus. Every microservice publishes its CRUD changes to the event bus, and interested microservices consume those events and update their local caches accordingly.
The problem with a REST call is that when the service we depend on is down, we cannot query it, which can sometimes become a bottleneck.
I'm a beginner in microservice architecture and I have read in a lot of blogs that in a microservice architecture it is mandatory that each micro service has its own database. In my case it may be very expensive.
My question is: is it possible to make the persistence layer a microservice in itself, whose function would be to give other microservices read/write access to the database?
Thanks
To answer your question, let's first understand this:
it is mandatory that each micro service has its own database. In my case it may be very expensive.
Yes, it is said that every microservice should have its own database.
What is meant is that the tables/collections of each microservice should be separate (you could use a single scalable database instance), and one microservice should access the data of other microservices only through API calls.
Benefits of having a separate model are:
The model will be clean. E.g. in e-commerce, Customer has a different meaning for the Shipping microservice, the Order microservice, the Customer Management microservice and so on. If we put all the data required by multiple microservices into one Customer object, it will become very big.
Microservices can evolve independently. If we have a single Customer object and one microservice, let's say the Order one, wants to add something to the schema, all microservices need to change.
If we have a single database schema, we will get into a big mess.
In my case it may be very expensive.
If expensive means that the read model actually requires data from multiple microservices, then it is better to listen to events from those microservices and build a single read model; a little duplication of data is OK.
If it means anything else, please ask a more specific question.
Having all microservices access the same database will result in low cohesion and tight coupling.
Try to see if you can define a separate schema for each of the microservices, so that you can ensure that no microservice refers to the tables of other microservices.
This way, you can seamlessly move to a separate database for each service in the future, once infrastructure cost is no longer a concern.
Microservices follow the database-per-service model.