Isolation in Saga patterns - microservices

I am trying to understand microservices. While going through Saga pattern, I came across this sentence
Whilst somewhat close to having ACID guarantees, the saga pattern is
still missing isolation. This means that it is possible to read and
write data from an incomplete transaction, thus introducing various
isolation anomalies
I am still not clear on why saga pattern lacks isolation. Can someone please explain with an example.

There's no isolation since the intermediate states are manifested in the services that are part of the saga
when you call another service in a saga it performs the action (e.g. you are buying a ticket, if you reserve a seat for a show and you're still on the process to pay for the ticket). It isn't isolated as any other interaction with the same service will see the effect of your action (e.g. the specific seat will seem taken for others who are trying to purchase a ticket).

There is total of 6 solutions including:
Semantic lock
Commutative update
Pessimistic view
Reread value
Versioning
By value
you can read more about each of them in this book

Related

Saga Compensating Transaction

Currently working on initial phase of Microservice Architecture for product. It is evident that for many operation it required to have distributed transaction or say there are couple of operation required across different microservice in-order to complete one business process.
For this purpose I found that Saga is useful. Now for ideal case or where everything goes correct then it works fine but when something is not correct or some activity failed at that moment we may have to rollback those operation. For this there is something called "Compensating" transaction or operation required. Now It is completely possible when operation performed specially for successful operation, it may possible that other transaction also performed on that service so db might be in different state then when actually operation performed.
What will be the solution for this ? One solution I think is that somehow state needs to preserve so it can revisit but like for stock may be change to some other transaction so I feel that compensating transaction would be a problem.

How to achieve immediate consistency in microservice architecture?

For example amazon.com; they rely on microservice architecture and probably order and payment are seperate micro services but when you checkout order on amazon.com you can finally see the order id and details.If it's not eventual consistency approach what is it? Maybe 2PC?
I'm generalizing my question; what if eventual consistency is not
appropriate for business transaction(end user should see the result end of transaction) but seperate microservices is meaningful(like order and payment)
how to handle immediate consistency?
There are several techniques which can provide cross-service transactions (atomicity): 2PC, Percolator's transactions and Sagas.
Percolator's transactions have serializable isolation level. They are known in the industry, see: Amazon's DynamoDB transaction library, CockroachDB database, and the Google's Pecolator system itself. A step-by-step visualization of the Percolator's transactions may help you to understand how they work.
The saga pattern was described in the late 80s in the Sagas paper but became more relevant with the rise of microservices. Please see the Applying the Saga Pattern talk for inspiration.
But since you mentioned eventual consistency it's important to notice that all of the techniques require individual services to be linearizable (strong consistency) and to support compare and set.

Eventual Consistency in microservice-based architecture temporarily limits functionality

I'll illustrate my question with Twitter. For example, Twitter has microservice-based architecture which means that different processes are in different servers and have different databases.
A new tweet appears, server A stored in its own database some data, generated new events and fired them. Server B and C didn't get these events at this point and didn't store anything in their databases nor processed anything.
The user that created the tweet wants to edit that tweet. To achieve that, all three services A, B, C should have processed all events and stored to db all required data, but service B and C aren't consistent yet. That means that we are not able to provide edit functionality at the moment.
As I can see, a possible workaround could be in switching to immediate consistency, but that will take away all microservice-based architecture benefits and probably could cause problems with tight coupling.
Another workaround is to restrict user's actions for some time till data aren't consistent across all necessary services. Probably a solution, depends on customer and his business requirements.
And another workaround is to add additional logic or probably service D that will store edits as user's actions and apply them to data only when they will be consistent. Drawback is very increased complexity of the system.
And there are two-phase commits, but that's 1) not really reliable 2) slow.
I think slowness is a huge drawback in case of such loads as Twitter has. But probably it could be solved, whereas lack of reliability cannot, again, without increased complexity of a solution.
So, the questions are:
Are there any nice solutions to the illustrated situation or only things that I mentioned as workarounds? Maybe some programming platforms or databases?
Do I misunderstood something and some of workarounds aren't correct?
Is there any other approach except Eventual Consistency that will guarantee that all data will be stored and all necessary actions will be executed by other services?
Why Eventual Consistency has been picked for this use case? As I can see, right now it is the only way to guarantee that some data will be stored or some action will be performed if we are talking about event-driven approach when some of services will start their work when some event is fired, and following my example, that event would be “tweet is created”. So, in case if services B and C go down, I need to be able to perform action successfully when they will be up again.
Things I would like to achieve are: reliability, ability to bear high loads, adequate complexity of solution. Any links on any related subjects will be very much appreciated.
If there are natural limitations of this approach and what I want cannot be achieved using this paradigm, it is okay too. I just need to know that this problem really isn't solved yet.
It is all about tradeoffs. With eventual consistency in your example it may mean that the user cannot edit for maybe a few seconds since most of the eventual consistent technologies would not take too long to replicate the data across nodes. So in this use case it is absolutely acceptable since users are pretty slow in their actions.
For example :
MongoDB is consistent by default: reads and writes are issued to the
primary member of a replica set. Applications can optionally read from
secondary replicas, where data is eventually consistent by default.
from official MongoDB FAQ
Another alternative that is getting more popular is to use a streaming platform such as Apache Kafka where it is up to your architecture design how fast the stream consumer will process the data (for eventual consistency). Since the stream platform is very fast it is mostly only up to the speed of your stream processor to make the data available at the right place. So we are talking about milliseconds and not even seconds in most cases.
The key thing in these sorts of architectures is to have each service be autonomous when it comes to writes: it can take the write even if none of the other application-level services are up.
So in the example of a twitter like service, you would model it as
Service A manages the content of a post
So when a user makes a post, a write happens in Service A's DB and from that instant the post can be edited because editing is just a request to A.
If there's some other service that consumes the "post content" change events from A and after a "new post" event exposes some functionality, that functionality isn't going to be exposed until that service sees the event (yay tautologies). But that's just physics: the sun could have gone supernova five minutes ago and we can't take any action (not that we could have) until we "see the light".

Event sourcing, CQRS and database in Microservice

I am quite new in context of Micro-service architecture and reading this post : http://microservices.io/patterns/data/event-sourcing.html to get familiar with Event sourcing and data storage in Microservice architecture.
I have read many documents about 3 important aspect of system :
Using event sourcing instead of a simply shared DB and ORM and
row update
Events are JAVA objects.
In case of saving data permanently
, we need to use DB (either relational or noSQL)
Here are my questions :
How database comes along with event sourcing? I have read CQRS
pattern, but I can not understand how CQRS pattern is related to
event store and event objects ?
Can any body provide me a
complete picture and set of operations happens with all players to
gather: CQRS pattern , Event sourcing (including event storage
module) and finally different microservices?
In a system
composed of many microservices, should we have one event storage or
each microservice has its own ? or both possible ?
same
question about CQRS. This pattern is implemented in all
microservices or only in one ?
Finally, in case of using
microservice architecture, it is mandatory to have only one DB or
each Microserivce should have its own ?
As you can see, I have understood all small pieces of game , but I can not relate them together to compose a whole image. Specially relevance between CQRS and event sourcing and storing data in DB.
I read many articles for example :
https://ookami86.github.io/event-sourcing-in-practice/
https://msdn.microsoft.com/en-us/library/jj591577.aspx
But in all of them small players are discussed. Even a hand drawing piece of image will be appreciated.
How database comes along with event sourcing? I have read CQRS pattern, but I can not understand how CQRS pattern is related to event store and event objects ?
"Query" part of CQRS instructs you how to create a projection of events, which is applicable in some "bounded context", where the database could be used as a means to persist that projection. "Command" part allows you to isolate data transformation logic and decouple it from the "query" and "persistence" aspects of your app. To simply put it - you just project your event stream into the database in many ways (projection could be relational as well), depending on the task. In this model "query" and "command" have their own way of projecting and storing events data, optimised for the needs of that specific part of the application. Same data will be stored in events and in projections, this will allow achieving simplicity and loose coupling among subdomains (bounded contexts, microservices).
Can any body provide me a complete picture and set of operations happens with all players to gather: CQRS pattern , Event sourcing (including event storage module) and finally different microservices?
Have you seen Greg Young's attempt to provide simplest possible implementation? If you still confused, consider creating more specific question about his example.
In a system composed of many microservices, should we have one event storage or each microservice has its own ? or both possible ?
It is usually one common event storage, but there definitely could be some exceptions, edge cases where you really will need multiple storages for different microservices here and there. It all depends on the business case. If you not sure - most likely you just need a single storage for now.
same question about CQRS. This pattern is implemented in all microservices or only in one ?
It could be implemented in most performance-demanding microservices. It all depends on how complex your implementation becomes when you are introducing CQRS into it. If it gets simpler - why not implement it everywhere? But if people in your team become more and more confused by the need to perform more explicit synchronisation between commands and queries parts - maybe cqrs is too much for you. It all depends on your team, on your domain ... there is no single simple answer, unfortunately.
Finally, in case of using microservice architecture, it is mandatory to have only one DB or each Microservice should have its own ?
If same microservices sharing same tables - this is usually considered as an antipattern, as it increases coupling, the system becomes more fragile. You can still share the same database, but there should be no shared tables. Also, tables from one microservice better not have FK's to tables in another microservice. Same reason - to reduce coupling.
PS: consider not to ask coarse-grained questions, as it will be harder to get a response from people. Several smaller, more specific questions will have better chance to be answered.

Event-driven architecture and structure of events

I'm new to EDA and I've read a lot about benefits and would probably be interested to apply it during my next project but still haven't understood something.
When raising an event, which pattern is the most suited:
Name the event "CustomerUpdate" and include all information (updated or not) about the customer
Name the event "CustomerUpdate" and include only information that have really been updated
Name the event "CustomerUpdate" and include minimum information (Identifier) and/or a URI to let the consumer retrieves information about this Customer.
I ask the question because some of our events could be heavy and frequent.
Thx for your answers and time.
Name the event "CustomerUpdate"
First let's start with your event name. The purpose of an event is to describe something which has already happenned. This is different from a command, which is to issue an instruction for something yet to happen.
Your event name "CustomerUpdate" sounds ambiguous in this respect, as it could be describing something in the past or something in the future.
CustomerUpdated would be better, but even then, Updated is another ambiguous term, and is nonspecific in a business context. Why was the customer updated in this instance? Was it because they changed their payment details? Moved home? Were they upgraded from silver to gold status? Events can be made as specific as needed.
This may seem at first to be overthinking, but event naming becomes especially relevant as you remove data and context from the event payload, moving more toward skinny events (the "option 3" from your question, which I discuss below).
That is not to suggest that it is always appropriate to define events at this level of granularity, only that it is an avenue which is open to you early on in the project which may pay dividends later on (or may swamp you with thousands of event types).
Going back to your actual question, let's take each of your options in turn:
Name the event "CustomerUpdate" and include all information (updated
or not) about the customer
Let's call this "pattern" the Fat message.
Fat messages (also called snapshots) represent the state of the described entity at a given point in time with all the event context present in the payload. They are interesting because the message itself represents the contract between service and consumer. They can be used for communicating changes of state between business domains, where it may be preferred that all event context be present during message processing by the consumer.
Advantages:
Self consistent - can be consumed entirely without knowledge of other systems.
Simple to consume (upsert).
Disadvantages:
Brittle - the contract between service and consumer is coupled to the message itself.
Easy to overwrite current data with old data if messages arrive in the wrong order (hint: you can mitigate this by using the event sourcing pattern)
Large.
Name the event "CustomerUpdate" and include only information that have
really been updated
Let's call this pattern the Delta message.
Deltas are similar to fat messages in many ways, though they are generally more complex to generate and consume. A good example here is the JSONPatch standard.
Because they are only a partial description of the event entity, deltas also come with a built-in assumption that the consumer knows something about the event being described. For this reason they may be less suitable for sending outside a business domain, where the event entity may not be well known.
Deltas really shine when synchronising data between systems sharing the same entity model, ideally persisted in non-relational storage (eg, no-sql). In this instance an entity can be retrieved, the delta applied, and then persisted again with minimal effort.
Advantages:
Smaller than Fat messages
Excels in use cases involving shared entity models
Portable (if based on a standard such as jsonpatch, or to a lesser extent, diffgram)
Disadvantages:
Similar to the Fat message, assumes complete knowledge of the data entity.
Easy to overwrite current data with old data.
Complex to generate and consume (except for specific use cases)
Name the event "CustomerUpdate" and include minimum information
(Identifier) and/or a URI to let the consumer retrieves information
about this Customer.
Let's call this the Skinny message.
Skinny messages are different from the other message patterns you have defined, in that the service/consumer contract is no longer explicit in the message, but implied in that at some later time the consumer will retrieve the event context. This decouples the contract and the message exchange, which is a good thing.
This may or may not lend itself well to cross-business domain communication of events, depending on how your enterprise is set up. Because the event payload is so small (usually an ID with some headers), there is no context other than the name of the event on which the consumer can base processing decisions; therefore it becomes more important to make sure the event is named appropriately, especially if there are multiple ways a consumer could handle a CustomerUpdated message.
Additionally it may not be good practice to include an actual resource address in the event data - because events are things which have already happened, event messages are generally immutable and therefore any information in the event should be true forever in case the events need to be replayed. In this instance a resource address could easily become obsolete and events would not be re-playable.
Advantages:
Decouples service contract from message.
Information about the event contained in the event name.
Naturally idempotent (with time-stamp).
Generally tiny.
Simple to generate and consume.
Disadvantages:
Consumer must make additional call to retrieve event context - requires explicit knowledge of other systems.
Event context may have become obsolete at the point where the consumer retrieves it, making this approach generally unsuitable for some real-time applications.
When raising an event, which pattern is the most suited?
I think the answer to this is: it depends on lots of things, and there is probably no one right answer.
Update from comments: Also worth reading, a very old, classic, blog post on messaging: https://learn.microsoft.com/en-gb/archive/blogs/nickmalik/killing-the-command-message-should-we-use-events-or-documents (also here: http://vanguardea.com/killing-the-command-message-should-we-use-events-or-documents/)
Martin Fowler gave a great talk about "The Many Meanings of Event-Driven Architecture" (the content is based on this paper) in which he mentioned the Event-Carried State Transfer pattern.
It seems to be close to your second option "Delta message" with the difference that it doesn't try to describe an entity, but instead describe a named business fact that happened and carry over all the necessary data to understand this fact.
I don't think it matters how you have modeled your persistence layer when it comes to designing domain events. Likewise, I don't think it matters how your consumer has modeled its own persistence layer when designing domain events.
Thus, I don't think it's wise to put as an advantage the fact that you can apply the event as a patch directly on your data (from a consumer point of view), because it pushes the producer to design their events given the persistence model of a consumer.
In that case, I would tend to think that you're designing persistence patches, instead of domain events.
What do you think?

Resources