Take amazon.com as an example: they rely on a microservice architecture, and order and payment are probably separate microservices, yet when you check out an order on amazon.com you immediately see the order id and details. If that isn't an eventual-consistency approach, what is it? Maybe 2PC?
To generalize my question: what if eventual consistency is not appropriate for a business transaction (the end user should see the result at the end of the transaction), but separate microservices are still meaningful (like order and payment)?
How do you handle immediate consistency?
There are several techniques that can provide cross-service transactions (atomicity): 2PC, Percolator's transactions, and sagas.
Percolator's transactions provide a serializable isolation level. They are well known in the industry; see Amazon's DynamoDB transaction library, the CockroachDB database, and Google's Percolator system itself. A step-by-step visualization of Percolator's transactions may help you understand how they work.
The saga pattern was described in the late 80s in the Sagas paper but became more relevant with the rise of microservices. Please see the Applying the Saga Pattern talk for inspiration.
But since you mentioned eventual consistency, it's important to note that all of these techniques require the individual services to be linearizable (strongly consistent) and to support compare-and-set.
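To make this concrete, here is a minimal, synchronous sketch of a saga in Java. The OrderService and PaymentService interfaces are hypothetical (a real saga would typically be asynchronous, message-driven, and persist its progress); the point is only that each forward step is paired with a compensating action that semantically undoes it.

```java
// Minimal saga sketch: hypothetical order/payment services, where each
// forward step has a compensation that semantically undoes it on failure.
public class CreateOrderSaga {

    interface OrderService {
        String createPendingOrder(String customerId); // forward step 1
        void approveOrder(String orderId);            // final step
        void rejectOrder(String orderId);             // compensation for step 1
    }

    interface PaymentService {
        boolean charge(String customerId, long amountCents); // forward step 2
    }

    private final OrderService orders;
    private final PaymentService payments;

    public CreateOrderSaga(OrderService orders, PaymentService payments) {
        this.orders = orders;
        this.payments = payments;
    }

    public String execute(String customerId, long amountCents) {
        // Step 1: create the order in a PENDING state.
        String orderId = orders.createPendingOrder(customerId);
        boolean charged;
        try {
            // Step 2: attempt the payment in the other service.
            charged = payments.charge(customerId, amountCents);
        } catch (RuntimeException e) {
            orders.rejectOrder(orderId); // compensate on failure
            throw e;
        }
        if (!charged) {
            orders.rejectOrder(orderId); // compensate: payment declined
            throw new IllegalStateException("payment declined");
        }
        // Step 3: payment succeeded, make the order APPROVED.
        orders.approveOrder(orderId);
        return orderId;
    }
}
```

Note that between steps 1 and 3 the order is observable in its PENDING state; that lack of isolation is inherent to sagas.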
I am trying to understand microservices. While going through the Saga pattern, I came across this sentence:
Whilst somewhat close to having ACID guarantees, the saga pattern is still missing isolation. This means that it is possible to read and write data from an incomplete transaction, thus introducing various isolation anomalies.
I am still not clear on why the saga pattern lacks isolation. Can someone please explain with an example?
There's no isolation, since the intermediate states are manifested in the services that are part of the saga:
when you call another service in a saga, it performs the action right away (e.g. while buying a ticket, you reserve a seat for a show and are still in the process of paying for it). The action isn't isolated, because any other interaction with the same service will see its effect (e.g. that specific seat will appear taken to anyone else trying to purchase a ticket).
There are six countermeasures in total:
Semantic lock (sketched after this list)
Commutative update
Pessimistic view
Reread value
Versioning
By value
You can read more about each of them in this book.
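As an illustration, here is a hedged sketch of the first countermeasure, the semantic lock, applied to the seat example from the answer above. The class and state names are made up for the example.

```java
// Semantic lock sketch: a saga marks records it has touched with a
// PENDING state so concurrent transactions can see that the value is dirty.
// Types and names here are illustrative, not from a specific framework.
public class Seat {

    enum State { AVAILABLE, PENDING, SOLD }

    private State state = State.AVAILABLE;
    private String holdingSagaId;

    // Forward step of the saga: take the semantic lock.
    public synchronized void reserve(String sagaId) {
        if (state != State.AVAILABLE) {
            throw new IllegalStateException("seat is " + state);
        }
        state = State.PENDING;   // visible marker of an in-flight saga
        holdingSagaId = sagaId;
    }

    // Final step: the saga completed, so the lock is released by committing.
    public synchronized void confirm(String sagaId) {
        requireHolder(sagaId);
        state = State.SOLD;
        holdingSagaId = null;
    }

    // Compensation: the saga failed, so the lock is released by rolling back.
    public synchronized void release(String sagaId) {
        requireHolder(sagaId);
        state = State.AVAILABLE;
        holdingSagaId = null;
    }

    private void requireHolder(String sagaId) {
        if (state != State.PENDING || !sagaId.equals(holdingSagaId)) {
            throw new IllegalStateException("not held by saga " + sagaId);
        }
    }
}
```

The PENDING state is the "lock": other sagas that find a seat in this state can wait, abort, or skip it instead of reading a value that may still be rolled back.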
We have a monolithic Web API layer in our application with a hundred endpoints. I am trying to break it into microservices using Azure Service Fabric.
When we break them into multiple services, we may end up having duplicate code.
Example: let's say we have Account Services to create an account, and there is a Payment service to apply payments to transactions.
In this case, both services need the Customer class/domain. The Account Services probably need an exhaustive customer with full details, but the Payment service might only need a lightweight one.
The question is: do we need to duplicate several domain entities and other layers like this? Doesn't that create more maintenance issues?
If we don't, we end up sharing the code between the services, effectively keeping one monolithic service, the same as the existing Web API.
Any thoughts on this?
Secondly, we have some cases where transactions are involved today. If we separate the services, is there a good design for recording failures and rolling back without trying too hard to maintain distributed transactions?
Breaking a monolith up into proper microservices with appropriate boundaries for your domain is certainly more of an art than a science. The prerequisite to taking on such a task is a thorough understanding of your domain and the interactions within it, and you won't get it right the first time. One of the points that Evans makes in his book on Domain-Driven Design is that for any sufficiently complex domain, the domain model continually evolves because your understanding of the domain is continually evolving; you will understand it a little better tomorrow than you do today. That said, don't be afraid to start when you have an understanding that is "good enough", and be willing to adapt/evolve your model.
I don't know your domain, but it sounds to me like you need to first figure out in which bounded context Customer primarily belongs. Yes, you want to minimize duplication of domain logic, and though it may not fit completely and neatly into a single service, the more you make one service take primary responsibility for accessing, persisting, manipulating, validating, and ensuring the integrity of a Customer, the better off you'll be.
From your question, I see two possibilities:
The Account Services bounded context is the primary stakeholder in Customer, and Customer has non-trivial ties to other Account Services entities and services. It's difficult to draw clear boundaries around a Customer in isolation. In this case, Customer belongs in the Account Services bounded context.
Customer is an independent enough concept to merit its own microservice. A Customer can stand alone. In this case, Customer belongs in its own bounded context.
In either case, great care should be taken to ensure that the Customer-specific domain logic stays centralized in the Customer microservice behind strong boundaries. Other services might use Customer, or perhaps a light-weight (even read-only) CustomerView, but their interactions should go through the Customer service to the extent that they can.
In your question, you indicate that the Payments bounded context will need access to Customer, but it might just need a lightweight version. It should communicate with the Customer service to get that lightweight object. If, during payment processing, you need to update the Customer's billing address for example, Payments should call into the Customer microservice telling it to update its billing address. Payments need not know anything about how to update a Customer's billing address other than the single API call; any domain logic, validation, firing of domain events, etc. that needs to happen as part of that operation is contained within the Customer microservice.
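As a hedged illustration (the names below are hypothetical, not from any framework), that boundary might look like a single client interface exposing a lightweight view plus coarse-grained operations:

```java
// Hypothetical boundary of the Customer microservice: other services (e.g.
// Payments) see only this API, never the Customer domain model itself.
public interface CustomerServiceClient {

    // Light-weight, read-only projection for consumers like Payments.
    record CustomerView(String customerId, String displayName, String billingAddress) {}

    CustomerView getCustomerView(String customerId);

    // All domain logic for this change (validation, domain events, ...)
    // lives inside the Customer service, behind this one call.
    void updateBillingAddress(String customerId, String newBillingAddress);
}
```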
Regarding your second question: it's true that atomic transactions become more complex/difficult in a distributed architecture. Do some reading on the Saga pattern: https://blog.couchbase.com/saga-pattern-implement-business-transactions-using-microservices-part/. Also, Jimmy Bogard is currently in the midst of a blog series called Life Beyond Distributed Transactions: An Apostate's Implementation that may offer some good insights.
Hope this helps!
I am currently building a microservices-based application developed with the MEAN stack and am running into several situations where I need to share models between bounded contexts.
As an example, I have a User service that handles the registration process as well as login (generating a JWT), logout, etc. I also have a File service which handles the uploading of profile pics and other images the user happens to upload. Additionally, I have a Friends service that keeps track of the associations between members.
Currently, I am adding the GUID of the user from the user table used by the User service, as well as the first, middle, and last name fields, to the File table and the Friend table. This way I can query these fields whenever I need them in the other services (Friend and File) without needing to make REST calls to get the information every time it is queried.
Here is the caveat:
The downside seems to be that I have to notify the File and Friend services (I chose Seneca with RabbitMQ) whenever a user updates their information in the User table.
1) Should I be worried about the services getting too chatty?
2) Could this lead to any performance issues if a lot of updates take place in an hour, say?
3) In trying to isolate boundaries, I just don't see another way of pulling this off. What is the recommended approach to solving this issue, and am I on the right track?
It's a trade-off. I would personally not store the user details alongside the user identifier in the dependent services, but neither would I query the User service to get this information. What you probably need is some kind of read-model for the system as a whole, which can store this data in a way that is optimized for your particular needs (reporting, displaying together on a web page, etc.).
The read-model is a pattern which is popular in the event-driven architecture space. There is a really good article that talks about these kinds of questions (in two parts):
https://www.infoq.com/articles/microservices-aggregates-events-cqrs-part-1-richardson
https://www.infoq.com/articles/microservices-aggregates-events-cqrs-part-2-richardson
Many common questions about microservices seem to be largely around the decomposition of a domain model, and how to overcome situations where requirements such as querying resist that decomposition. This article spells the options out clearly. Definitely worth the time to read.
In your specific case, it would mean that the File and Friends services would only need to store the primary key for the user. However, all services should publish state changes which can then be aggregated into a read-model.
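As a rough sketch of what that could look like (all names here are illustrative, and an in-memory map stands in for a real table or document store), the read-model is just a consumer that folds user-change events into a denormalized view:

```java
// Sketch of a read-model updater: it consumes "user updated" events and
// maintains a denormalized view, so File/Friends data can be displayed with
// current user names without calling the User service. Names are made up.
public class UserReadModelUpdater {

    public record UserUpdatedEvent(String userId, String firstName,
                                   String middleName, String lastName) {}

    // In a real system this would be a table or document store optimized
    // for the queries you actually run; a map stands in for it here.
    private final java.util.Map<String, UserUpdatedEvent> readModel =
            new java.util.concurrent.ConcurrentHashMap<>();

    // Called by the message-bus subscriber (RabbitMQ, Kafka, ...).
    public void onUserUpdated(UserUpdatedEvent event) {
        // Last-writer-wins upsert keyed by the user's id; the File and
        // Friends services only ever store that id.
        readModel.put(event.userId(), event);
    }

    public UserUpdatedEvent findUser(String userId) {
        return readModel.get(userId);
    }
}
```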
If you are worried about a high volume of messages and a high TPS, for example 100,000 TPS for producing and consuming events, I suggest using Apache Kafka or NATS (the Go version, since NATS also has a Ruby version) instead of RabbitMQ, in order to support a high volume of messages per second.
Regarding database design, you should design each microservice around business capabilities and a bounded context, following domain-driven design (DDD). Unlike in SOA, it is suggested that each microservice have its own database, so you should not worry about normalization: you may have to repeat many structures, fields, tables, and features for each microservice in order to keep them decoupled from each other and let them work independently, which raises availability and gives you scalability.
You can also use the Event Sourcing + CQRS technique or Transaction Log Tailing to circumvent 2PC (two-phase commit), which is not recommended when implementing microservices, in order to exchange events between your microservices and manipulate state with eventual consistency, in line with the CAP theorem.
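For instance, publishing a domain event with the standard Kafka Java client (org.apache.kafka:kafka-clients) might look like the following sketch; the topic name and event payload are assumptions for the example:

```java
// Minimal sketch of publishing a domain event to Kafka with the standard
// Java client. Topic name and event JSON are assumptions for the example.
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EventPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Key by aggregate id so all events of one aggregate stay in
            // order within a single partition.
            producer.send(new ProducerRecord<>(
                    "customer-events",                       // topic (assumed)
                    "customer-42",                           // key = aggregate id
                    "{\"type\":\"CustomerCreated\",\"id\":\"customer-42\"}"));
        }
    }
}
```

Keying by the aggregate id keeps all events of one aggregate in a single partition, so consumers see them in order.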
I'll illustrate my question with Twitter. For example, Twitter has a microservice-based architecture, which means that different processes run on different servers and have different databases.
A new tweet appears: server A stores some data in its own database, generates new events, and fires them. Servers B and C haven't received these events yet, haven't stored anything in their databases, and haven't processed anything.
The user who created the tweet wants to edit it. For that to work, all three services A, B, and C should have processed all the events and stored all the required data, but services B and C aren't consistent yet. That means we can't provide edit functionality at the moment.
As I see it, one possible workaround is switching to immediate consistency, but that would take away all the benefits of a microservice-based architecture and could cause problems with tight coupling.
Another workaround is to restrict the user's actions for some time until the data is consistent across all the necessary services. That's probably a solution, depending on the customer and their business requirements.
Yet another workaround is to add additional logic, or perhaps a service D, that stores edits as user actions and applies them to the data only once it is consistent. The drawback is a large increase in the system's complexity.
And there are two-phase commits, but they are 1) not really reliable and 2) slow.
I think slowness is a huge drawback under loads like Twitter's, though it could probably be worked around, whereas the lack of reliability cannot be, again, without increasing the complexity of the solution.
So, the questions are:
Are there any nice solutions to the illustrated situation, or only the things I mentioned as workarounds? Maybe some programming platforms or databases?
Did I misunderstand something, and are some of the workarounds incorrect?
Is there any other approach besides eventual consistency that will guarantee that all data is stored and all the necessary actions are executed by the other services?
Why has eventual consistency been picked for this use case? As far as I can see, right now it is the only way to guarantee that some data will be stored or some action will be performed in an event-driven approach, where services start their work when some event is fired; following my example, that event would be "tweet is created". So if services B and C go down, I need to be able to perform the action successfully when they come back up.
The things I would like to achieve are reliability, the ability to bear high loads, and an adequately complex solution. Any links on related subjects will be very much appreciated.
If there are natural limitations to this approach and what I want cannot be achieved with this paradigm, that's okay too. I just need to know that this problem really isn't solved yet.
It is all about trade-offs. With eventual consistency, in your example it may mean that the user cannot edit for a few seconds, since most eventually consistent technologies would not take long to replicate the data across nodes. In this use case that is absolutely acceptable, since users are pretty slow in their actions.
For example:
MongoDB is consistent by default: reads and writes are issued to the primary member of a replica set. Applications can optionally read from secondary replicas, where data is eventually consistent by default.
(from the official MongoDB FAQ)
Another alternative that is getting more popular is to use a streaming platform such as Apache Kafka, where it is up to your architecture design how fast the stream consumer will process the data (for eventual consistency). Since the streaming platform itself is very fast, making the data available in the right place is mostly a matter of how quick your stream processor is. So we are talking about milliseconds, not seconds, in most cases.
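A minimal consumer sketch with the standard Kafka Java client shows where that lag lives; the topic and group names are assumptions for the example:

```java
// Sketch of a Kafka consumer keeping a downstream service's store
// eventually consistent with the event stream. Names are assumed.
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class TweetEventConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "service-b");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("tweet-events"));
            while (true) {
                // The delay between the producer's send and this poll is the
                // window of eventual inconsistency for this service.
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    // Apply the event to this service's own store.
                    System.out.printf("applied %s -> %s%n",
                            record.key(), record.value());
                }
            }
        }
    }
}
```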
The key thing in these sorts of architectures is to have each service be autonomous when it comes to writes: it can take the write even if none of the other application-level services are up.
So in the example of a Twitter-like service, you would model it as:
Service A manages the content of a post
So when a user makes a post, a write happens in Service A's DB and from that instant the post can be edited because editing is just a request to A.
If there's some other service that consumes the "post content" change events from A and after a "new post" event exposes some functionality, that functionality isn't going to be exposed until that service sees the event (yay tautologies). But that's just physics: the sun could have gone supernova five minutes ago and we can't take any action (not that we could have) until we "see the light".
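A small sketch of that modeling (the storage map and event bus interface are stand-ins, not a real API):

```java
// Sketch: Service A owns post content, so edits never wait on B or C.
// The storage and event bus here are stand-ins for the real things.
public class PostService {

    public interface EventBus { void publish(String topic, String payload); }

    private final java.util.Map<String, String> postsById =
            new java.util.concurrent.ConcurrentHashMap<>();
    private final EventBus bus;

    public PostService(EventBus bus) { this.bus = bus; }

    public void createPost(String postId, String content) {
        postsById.put(postId, content);                  // write to A's own DB
        bus.publish("post-events", "created:" + postId); // B and C catch up later
    }

    public void editPost(String postId, String newContent) {
        // The edit only touches A's store; downstream services become
        // consistent whenever they consume the event.
        postsById.put(postId, newContent);
        bus.publish("post-events", "edited:" + postId);
    }
}
```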
I am quite new to the context of microservice architecture and am reading this post, http://microservices.io/patterns/data/event-sourcing.html, to get familiar with event sourcing and data storage in a microservice architecture.
I have read many documents about three important aspects of the system:
Using event sourcing instead of simply a shared DB, an ORM, and row updates
Events are Java objects.
For saving data permanently, we need to use a DB (either relational or NoSQL)
Here are my questions:
How does a database come along with event sourcing? I have read about the CQRS pattern, but I cannot understand how the CQRS pattern is related to the event store and event objects.
Can anybody provide me a complete picture and the set of operations that happen with all the players together: the CQRS pattern, event sourcing (including the event storage module), and finally the different microservices?
In a system composed of many microservices, should we have one event store, or should each microservice have its own? Or are both possible?
The same question about CQRS: is this pattern implemented in all microservices or only in one?
Finally, in the case of using a microservice architecture, is it mandatory to have only one DB, or should each microservice have its own?
As you can see, I have understood all the small pieces of the game, but I cannot relate them to each other to compose a whole picture, especially the relationship between CQRS, event sourcing, and storing data in the DB.
I have read many articles, for example:
https://ookami86.github.io/event-sourcing-in-practice/
https://msdn.microsoft.com/en-us/library/jj591577.aspx
But in all of them only the small players are discussed. Even a hand-drawn picture would be appreciated.
How does a database come along with event sourcing? I have read about the CQRS pattern, but I cannot understand how the CQRS pattern is related to the event store and event objects.
"Query" part of CQRS instructs you how to create a projection of events, which is applicable in some "bounded context", where the database could be used as a means to persist that projection. "Command" part allows you to isolate data transformation logic and decouple it from the "query" and "persistence" aspects of your app. To simply put it - you just project your event stream into the database in many ways (projection could be relational as well), depending on the task. In this model "query" and "command" have their own way of projecting and storing events data, optimised for the needs of that specific part of the application. Same data will be stored in events and in projections, this will allow achieving simplicity and loose coupling among subdomains (bounded contexts, microservices).
Can anybody provide me a complete picture and the set of operations that happen with all the players together: the CQRS pattern, event sourcing (including the event storage module), and finally the different microservices?
Have you seen Greg Young's attempt to provide the simplest possible implementation? If you're still confused, consider asking a more specific question about his example.
In a system composed of many microservices, should we have one event store, or should each microservice have its own? Or are both possible?
It is usually one common event store, but there could certainly be exceptions, edge cases where you really will need multiple stores for different microservices here and there. It all depends on the business case. If you're not sure, most likely you just need a single event store for now.
The same question about CQRS: is this pattern implemented in all microservices or only in one?
It could be implemented in the most performance-demanding microservices. It all depends on how complex your implementation becomes when you introduce CQRS into it. If it gets simpler, why not implement it everywhere? But if the people on your team become more and more confused by the need to perform explicit synchronization between the command and query parts, maybe CQRS is too much for you. It all depends on your team and your domain; there is no single simple answer, unfortunately.
Finally, in the case of using a microservice architecture, is it mandatory to have only one DB, or should each microservice have its own?
If microservices share the same tables, that is usually considered an antipattern, as it increases coupling and the system becomes more fragile. You can still share the same database, but there should be no shared tables. Also, tables from one microservice should preferably not have FKs to tables in another microservice, for the same reason: to reduce coupling.
PS: consider not asking coarse-grained questions, as it is harder for people to respond to them. Several smaller, more specific questions will have a better chance of being answered.