Spring Boot distributed transaction - spring-boot

We need to find the best way to address distributed transaction management in our microservices architecture.
Here is the problem statement:
We have one composite microservice which interacts with 2 underlying atomic microservices (each meant for a specific purpose, obviously), each with its own separate database. For example, we can consider these 2 microservices as
STUDENT_SERVICE (STU_DB)
TEACHER_SERVICE (TEACHR_DB)
The use case in the composite service is that a user (administrator) can assign a teacher to a student for a specific course, etc.
I wonder how we can address this problem in one transaction, as each service (STUDENT_SERVICE and TEACHER_SERVICE) has a separate DB, and everything should happen in one transaction: either commit or rollback.
Since those 2 services are separate, I see that JTA would not be of help, as it is meant for having these 2 applications (services) deployed on the same application server.
I have ruled out JTA, as mentioned above.
// Pseudo code
class CompositeService {
    void assignStaff(Request request) {
        // txn start
        updateStudentServiceApi(request);
        updateTeacherServiceApi(request);
        // txn end
    }
}
The system should be in a consistent state after the API execution.

This is a tricky question, even if it's not obvious at first sight.
The functionality you are asking for is generally considered an anti-pattern for microservice architecture.
A microservice architecture is in general a distributed system. Transactions in distributed systems are hard (see https://martin.kleppmann.com/2015/09/26/transactions-at-strange-loop.html). Your application consists of two services.
JTA is a Java API for ACID-style transactions. ACID transactions usually require locks to be established in databases. As the transaction spans multiple services (in your case there are two), a failure of one service can block processing of the other service. In such a case you are losing the advantages of the microservice architecture: loose coupling and independence of the services. You can end up building a distributed monolith (see this nice article: https://blog.christianposta.com/microservices/the-hardest-part-about-microservices-data/).
By the way, there are several discussions on the topic of transactions in microservices here on Stack Overflow. Just search or check e.g.:
Distributed transactions in microservices
Transactions in microservices
Transactions across REST microservices?
What are your options
(Disclaimer: I'm a developer for http://narayana.io and the presented options are from the perspective of Java EE and Narayana. There could be other projects providing similar functionality. Plus, even though Narayana integrates nicely with Spring, you will possibly need to handle some integration issues.)
You really need to run ACID-style transactions in your project - i.e. you insist on the transactional behaviour the way you describe it. Then you need to span the transaction over the services. If the services communicate over REST, you can consider for example Narayana REST-AT (http://jbossts.blogspot.com/2011/03/rest-cloud-and-transactions.html; start looking into the quickstart here: https://github.com/jbosstm/quickstart/tree/master/rts).
You relax your requirements for atomicity and consider a transaction model that relaxes consistency (you are fine with being eventually consistent). You can consider for example LRA (https://github.com/eclipse/microprofile-lra/blob/master/spec/src/main/asciidoc/microprofile-lra-spec.adoc). (Unfortunately the spec and implementation are still not ready, but a PoC could be run against the current state.)
You use a completely different approach for transaction processing. Then you can investigate event sourcing. You would deploy e.g. Apache Kafka and send events for updates to the event store. Each service then reads those events and updates its own DB independently; a minimal sketch of this follows below.
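For illustration, here is a minimal sketch of that last option, assuming Spring for Apache Kafka; the topic name, the event record and the listener wiring are hypothetical, invented for this example:

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

// Hypothetical event carrying the assignment data.
record AssignStaffEvent(String studentId, String teacherId, String courseId) {}

@Service
class CompositeService {
    private final KafkaTemplate<String, AssignStaffEvent> kafka;

    CompositeService(KafkaTemplate<String, AssignStaffEvent> kafka) {
        this.kafka = kafka;
    }

    void assignStaff(AssignStaffEvent event) {
        // Publish once instead of calling both services; each service
        // consumes the event and updates its own DB, so the system is
        // eventually (not immediately) consistent.
        kafka.send("staff-assignments", event.studentId(), event);
    }
}

// Inside STUDENT_SERVICE; TEACHER_SERVICE would have a similar listener.
@Service
class StudentAssignmentListener {
    @KafkaListener(topics = "staff-assignments", groupId = "student-service")
    void on(AssignStaffEvent event) {
        // Update STU_DB locally; retries/dead-letter topics handle failures.
    }
}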

Related

Running multiple Quarkus instances on one machine

I have an application separated into various OSGi bundles which run on a single Apache Karaf instance. However, I want to migrate to a microservice framework because
Apache Karaf is pretty tough to set up due to its dependency mechanism, and
I want to be able to bring the application to the cloud later (AWS, GCloud, whatever).
I did some research, had a look at various frameworks and concluded that Quarkus might be the right choice due to its container-based approach, its performance and possible cloud integration opportunities.
Now, I am struggling at one point and I haven't found a solution so far, but maybe I might have a misunderstanding here: my plan is to migrate almost every OSGi bundle of my application into a separate microservice. That way, I would be able to scale horizontally only the services for which this is necessary, and I could also update/deploy them separately without having to restart the whole application. Thus, I assume that every service needs to run in a separate Quarkus instance. However, Quarkus does not seem to support this out of the box; instead I would need to create a separate configuration for each Quarkus instance.
Is this really the way to go? How can the services discover each other? And is there a way for a service A to communicate with a service B not only via REST calls, but also by using objects, classes and methods of service B, with service A incorporating a dependency on service B?
Thanks a lot for any ideas on this!
I think you are mixing up some points between microservices and OSGi-based applications. With microservices you usually have an independent process running each microservice, which can be deployed on the same or other machines. Because of that you can scale as you said and gain benefits. But the communication model is not process-to-process: it has to use a different approach, and it is highly recommended that you use a standard integration mechanism. You can use REST, JSON-RPC, SOAP, or queues or topics for event-driven communication. Through these mechanisms you invoke the 'other' service's operations as you do in OSGi, but using a different interface: instead of a local invocation you do a remote invocation.
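As an illustration of that last point, here is a minimal, hypothetical sketch of such a remote invocation using the MicroProfile REST Client that Quarkus ships with; the interface, path and config key are made up for this example:

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import org.eclipse.microprofile.rest.client.inject.RegisterRestClient;

// What used to be a local OSGi service call becomes a typed remote call;
// the base URL is resolved from configuration (e.g. a "teacher-service"
// entry in application.properties).
@Path("/teachers")
@RegisterRestClient(configKey = "teacher-service")
public interface TeacherServiceClient {

    @GET
    @Path("/{id}")
    String findById(@PathParam("id") String id);
}

Injecting this interface with @RestClient then looks like an ordinary dependency from service A's point of view, even though every call actually goes over the network.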
Service discovery is something that you can do with just virtual IPs, accessing other services through a common DNS name and a load balancer, or using Kubernetes DNS if you go for Kubernetes as the platform. You could also use a central configuration service or let each service register itself in a central registry. There are already plenty of different flavours of solutions to tackle this complexity.
Also, more importantly, you will have to be aware of your new complexities, though some of them you already have:
Contract versioning and design
Synchronous or asynchronous communication between services.
How to deal with security at the boundary of the services / do I even need security in most of my services, or do I just need information about the user's identity?
Increased maintenance cost and redundant supporting code for common features (here Quarkus helps you a lot with its extensions, and you also have MicroProfile compatibility).
...
Deciding to go with microservices is not an easy decision, and not one that should be taken in a single step. My recommendation is that you analyse your application domain and check whether your design is suitable for microservices (in terms of separation of concerns and model cohesion), and extract small parts of your OSGi platform into microservices. Otherwise you will mostly be forced to make changes to your service interfaces, which is more difficult to do, due to the service-to-service contract dependency, than changing a method and some invocations.

Transaction management in microservices

We are rewriting a legacy app using microservices. Each microservice has its own DB. There are certain API calls that require calling another microservice and persisting data into both DBs. How can we implement distributed transaction management effectively in this case?
Since we have not migrated completely to the new microservices environment, we still write data back to the old monolith. For this, when a microservice endpoint is called, we call a monolith service from the microservice API to write back the same data. How do we deal with the same problem in this case as well?
Thanks in advance.
There are different distributed transaction frameworks, usually included and maintained as part of heavyweight application servers like JBoss and WebLogic.
The standard usually used by such servers is Jakarta Transactions (JTA; formerly the Java Transaction API).
Tomcat and Spring don't support distributed transactions out of the box. You can add this functionality using a third-party framework like Atomikos (just googled; I've never used it).
But remember, a microservice with JTA is not "micro" anymore :-)
Here is a small overview of available technologies and possible workarounds:
https://www.baeldung.com/transactions-across-microservices
If you can afford to write to the legacy system later (i.e. allow some latency between updating the microservice and the legacy system), you can use the outbox pattern.
Essentially that means you write to the microservice database in a transactional way, both to the tables you usually write to and to an additional "outbox" table of changes to apply, and then have a separate process that reads that table and updates the legacy system.
You can also achieve something similar with a change data capture mechanism on the DB used in the microservice(s). A sketch of the outbox write path follows below.
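As an illustration, here is a minimal sketch of the outbox write path, assuming Spring and JPA; the entities, repositories and the toJson helper are hypothetical names invented for this example:

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
class AssignmentService {
    private final AssignmentRepository assignments;
    private final OutboxRepository outbox;

    AssignmentService(AssignmentRepository assignments, OutboxRepository outbox) {
        this.assignments = assignments;
        this.outbox = outbox;
    }

    @Transactional
    void assign(Assignment assignment) {
        // Both writes hit the same microservice DB, so one local ACID
        // transaction covers them: either both rows commit or neither does.
        assignments.save(assignment);
        outbox.save(new OutboxEvent("ASSIGNMENT_CREATED", toJson(assignment)));
        // A separate poller (or a CDC tool reading the outbox table) then
        // applies the recorded change to the legacy system asynchronously.
    }

    private String toJson(Assignment assignment) {
        // Placeholder; use your JSON mapper of choice here.
        return "{}";
    }
}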
Check out this answer on "Why is 2-phase commit not suitable for a microservices architecture?": https://stackoverflow.com/a/55258458/3794744

Should microservices connected with Axon share the Axon Framework-related tables?

I am starting a project where I want to have multiple services that communicate with each other using Axon Server.
I have more than one service with the following stack:
Spring Boot 2.3.0.RELEASE (with starters: Data, JPA, web, mysql)
Axon
Spring Boot Starter - 4.2.1
Each one of the services uses a different schema in the MySQL server.
When I start a Spring Boot service with the Axon Framework activated, some tables for tokens, sagas, etc. are created in the database schema of each application.
I have two questions:
1. In the architecture that I am trying to build, should I have only one database for all the 'Axon-enabled' services, so that the sagas, tokens, events, etc. are only in one place?
2. If so, can anyone provide an example of how to configure a custom EntityManagerProvider to have the database of the service separated from the database of Axon?
I assume each of your microservices models a sub-domain. Since the events model a (sub)domain, along with aggregates, entities and value objects, I very much favor keeping the Axon-related schemas separated, most likely along with the databases/schemas corresponding to each service. I would thus prefer a modeling-first approach when considering such technical options.
That is what we're currently doing in our microservices ecosystem.
There is at least one more technical reason to go with the same schema (one per sub-domain, that is) for both Axon assets and application-specific assets; it was pointed out to me by my colleague Marian. If you (will) use event sourcing (thus reconstructing the state of an aggregate by fetching and applying all past events that resulted from handling the commands), then you will most likely need transactions which encompass this fetching as well as the command-handling code, which might in turn trigger (through events) writes to your microservice-specific database.
Axon can require five tables, depending on your usage of Axon, of course.
These are:
The Event table.
The Snapshot Event table.
The Token table.
The Saga table.
The Association Value Entry table.
When using Axon Server, tables 1 and 2 will not be created, since Axon Server is the storage solution for events and snapshots.
When not using Axon Server, I would indeed suggest having a dedicated datasource for these.
Table 3, which serves the TokenStore, should be as close as possible to your query models. The tokens portray how far a given EventProcessor has come in handling events. As these EventProcessors typically service projectors which create your query models, keeping them together is sensible from a transactional perspective.
Tables 4 and 5 are both required for sagas. The "Saga table" stores the serialized sagas, whereas the "Association Value Entry table" carries the association values between events and sagas, so that the framework can load the right sagas. I'd store these either in a dedicated database or along with the other tables of the given (micro)service.
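Regarding the second question above, a hedged sketch of pointing Axon at its own persistence unit might look roughly like this, assuming Axon's JPA integration; the persistence-unit and bean names are illustrative, so check the Axon reference guide for the exact wiring in your version:

import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.axonframework.common.jpa.EntityManagerProvider;
import org.axonframework.common.jpa.SimpleEntityManagerProvider;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class AxonPersistenceConfig {

    // A second EntityManager bound to a dedicated "axon" persistence unit,
    // which is configured elsewhere with its own datasource/schema.
    @PersistenceContext(unitName = "axon")
    private EntityManager axonEntityManager;

    // Axon's JPA-based stores (tokens, sagas, etc.) can resolve their
    // EntityManager through this provider instead of the application's one.
    @Bean
    public EntityManagerProvider axonEntityManagerProvider() {
        return new SimpleEntityManagerProvider(axonEntityManager);
    }
}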

Do we need to maintain different instances of state machine for each transaction

We are analyzing a state machine to implement in one of our microservice solutions (Spring Boot).
The service handles transactions and internally calls other payment providers' APIs. My concern is: do we have to create separate instances of the state machine with respect to the transactionId?
Any leads would be appreciated.
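For context, the typical way to get one state machine per transaction with Spring Statemachine is to build instances from a factory keyed by the transactionId; the following is only a hedged sketch, and the states, events and class names are invented for this example:

import org.springframework.statemachine.StateMachine;
import org.springframework.statemachine.config.StateMachineFactory;
import org.springframework.stereotype.Service;

// Hypothetical payment states and events.
enum PayState { NEW, AUTHORIZED, CAPTURED, FAILED }
enum PayEvent { AUTHORIZE, CAPTURE, FAIL }

@Service
class PaymentFlow {
    private final StateMachineFactory<PayState, PayEvent> factory;

    PaymentFlow(StateMachineFactory<PayState, PayEvent> factory) {
        this.factory = factory;
    }

    void start(String transactionId) {
        // Each transaction gets its own machine instance, and therefore its
        // own current state; the id can later be used to rehydrate the
        // machine via a StateMachinePersister if the flow outlives a request.
        StateMachine<PayState, PayEvent> machine = factory.getStateMachine(transactionId);
        machine.startReactively().block();
    }
}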

Microservices: Service discovery/ circuit breaker for Event-driven architecture

I'm fairly new to microservices...
I've taken an interest in learning more about two main patterns, service discovery and circuit breaker, and I have conducted research on how these could be implemented.
As a Java developer, I'm using Spring Boot. From what I understand, these patterns are useful if microservices communicate via HTTP.
One of the topics I've recently seen is the importance of event-driven architecture, which makes use of an event message bus: services send messages to the bus, and other services subscribe to the bus and process the messages.
Given this event-driven nature, how can service discovery and circuit breakers be achieved/implemented, given that these are commonly applicable to services communicating via HTTP?
From what I understand, these patterns are useful if microservices communicate via HTTP.
It is irrelevant that the communication is HTTP. The circuit breaker is useful for preventing cascading failures, which are more likely to occur in architectures that use a synchronous communication style.
Event-driven architectures are in general asynchronous, so cascading failures are less likely to occur.
Service discovery is used so that microservices can discover each other, but in event-driven architectures microservices communicate only with the messaging infrastructure (i.e. the event store in event sourcing), so discoverability is needed only at the infrastructure level.
I. Circuit breaker and service discovery are patterns; being patterns, they can be implemented in any programming language. The HTTP protocol is just for the transfer of data.
A circuit breaker can be implemented in Java. You can find many implementations (of course, with varying capabilities and interpretations of the pattern) on GitHub.
Some of the well-known, built-for-purpose implementations are:
Hystrix from Netflix OSS. For using Hystrix, you can follow the Spring guide: Spring Circuit Breaker
Apache Polygene, which has an example of a JMX circuit breaker
Resilience4j
II. About:
"Given this event-driven nature, how can service discovery and circuit breakers be achieved/implemented, given that these are commonly applicable for services communicating via HTTP?"
It seems you need a bit more research on the topic of microservice interactions.
There are two ways in which microservice interactions are possible. You have to choose one over the other; you cannot/should not mix both.
Orchestration: an interaction style that has an intelligent controller which dispatches events to processes. Please note the word 'processes', which here means business processes. The orchestration style was preferred in old SOA implementations as well.
Choreography: an interaction style that allows processes to subscribe to events and handle them independently, or through integration with other processes, without the need for a central controller.
These topics are greatly covered under
Orchestration vs. Choreography
The need for service discovery:
With choreography, two or more microservices can coordinate their activities and processes to share information and value.
But these microservices may not be aware of each other's existence, i.e. there are no hard-coded service references or dependency endpoints configured or coded into them. We do this to avoid any kind of coupling between the services. So the question remains: how will one service, when required, find another service's endpoint? This is where a service discovery mechanism comes in.
Another perspective: with microservices deployed in containers, microservice endpoints will not even be tied to particular hosts (due to containers spinning up and down). For this case as well, we need a service discovery mechanism.
So, in a service discovery mechanism, a centralized service discovery tool helps services to register themselves and to discover other services via a DNS or HTTP interface.
Service discovery can be implemented with:
1. Server-side service discovery
2. Client-side service discovery
Consul, etcd and ZooKeeper are some of the key tools within the service discovery space.
Spring Boot integrates well with Spring Cloud, and Spring Cloud provides Eureka (for service discovery) as well as Hystrix (for circuit breaker patterns). There is also Spring Cloud Stream for event-driven patterns. All of these are very easy to use with Spring Boot; a sketch follows below.
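As a rough, hedged sketch of how those pieces fit together (the service names, endpoint and fallback are invented for this example, and newer Spring Cloud versions replace Hystrix with Resilience4j):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.circuitbreaker.EnableCircuitBreaker;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestTemplate;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;

@SpringBootApplication
@EnableDiscoveryClient  // registers this app with Eureka on startup
@EnableCircuitBreaker   // enables Hystrix command processing
public class TeacherApp {
    public static void main(String[] args) {
        SpringApplication.run(TeacherApp.class, args);
    }
}

@Service
class StudentClient {
    private final RestTemplate rest = new RestTemplate();

    // Calls are wrapped in a circuit breaker; if the student service keeps
    // failing, the circuit opens and the fallback answers instead.
    @HystrixCommand(fallbackMethod = "studentUnavailable")
    String getStudent(String id) {
        // With a @LoadBalanced RestTemplate, "student-service" would be
        // resolved through Eureka; a plain hostname is shown for brevity.
        return rest.getForObject("http://student-service/students/" + id, String.class);
    }

    String studentUnavailable(String id) {
        return "student " + id + " temporarily unavailable";
    }
}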
I believe there is a misunderstanding in the question, in that you assume that event-driven architectures cannot be implemented on top of HTTP.
An event-driven architecture may be implemented in many different ways and (when the architecture is that of a distributed system) on top of many different protocols.
It can be implemented using a message broker (e.g. Kafka, RabbitMQ, ActiveMQ, etc.), as you suggested. However, this is just one choice and certainly not the only way to do it.
For example, the seminal book Building Microservices by Sam Newman, in Chapter 4: Integration, under Implementing Asynchronous Event-Based Collaboration, says:
“Another approach is to try to use HTTP as a way of propagating events. ATOM is a REST-compliant specification that defines semantics (among other things) for publishing feeds of resources. Many client libraries exist that allow us to create and consume these feeds. So our customer service could just publish an event to such a feed when our customer service changes. Our consumers just poll the feed, looking for changes. On one hand, the fact that we can reuse the existing ATOM specification and any associated libraries is useful, and we know that HTTP handles scale very well. However, HTTP is not good at low latency (where some message brokers excel), and we still need to deal with the fact that the consumers need to keep track of what messages they have seen and manage their own polling schedule. I have seen people spend an age implementing more and more of the behaviors that you get out of the box with an appropriate message broker to make ATOM work for some use cases. For example, the Competing Consumer pattern describes a method whereby you bring up multiple worker instances to compete for messages, which works well for scaling up the number of workers to handle a list of independent jobs. However, we want to avoid the case where two or more workers see the same message, as we’ll end up doing the same task more than we need to. With a message broker, a standard queue will handle this. With ATOM, we now need to manage our own shared state among all the workers to try to reduce the chances of reproducing effort. If you already have a good, resilient message broker available to you, consider using it to handle publishing and subscribing to events. But if you don’t already have one, give ATOM a look, but be aware of the sunk-cost fallacy. If you find yourself wanting more and more of the support that a message broker gives you, at a certain point you might want to change your approach.”
Likewise, if your design uses a message broker for the event-driven architecture, then I'm not sure a circuit breaker is needed, because in that case the consumer applications control the rate at which event messages are consumed from the queues. The producer application can publish event messages at its own pace, and the consumer applications can add as many competing consumers as they want to keep up with that pace. If the producer application is down, the consumer applications can still continue consuming any remaining messages in the queues, and once the queues are empty they will simply wait for more messages to arrive. None of this puts any burden on the producer application. The producer and consumer applications are decoupled in this scenario, and all the work the circuit breaker does in other scenarios is handled by the message broker.
Something similar can be said of the service discovery feature: since the producer and the consumer do not talk to each other directly, but only through the message broker, the only service you need to discover is the message broker itself.
