When should the BPMN process start in a microservices architecture with Camunda orchestration?

Consider an architecture like this:
API Gateway - responsible for aggregating services
Users microservice - CRUD operations on the user (users, addresses, consents, etc)
Notification microservice - sending email and SMS notifications
Security microservice - a service responsible for granting / revoking permissions to users and clients. For example, by connecting to Keycloak, it creates a user account with basic permission
Client - any application that connects to API Gateway in order to perform a given operation, e.g. user registration
Now, we would like to use Camunda for the entire process.
For example:
Client-> ApiGateway-> UsersMicroservice.Register-> SecurityMicroservice.AddDefaultPermition-> NotificationMicroservice.SendEmail
We would like to make this simplified flow with the use of e.g. Camunda.
Should the process start in UsersMicroservice.RegisterUser after receiving "POST api/users/", that is, should UsersMicroservice.RegisterUser start the process in Camunda? And how does this endpoint know which specific process to run in Camunda?
What if the BPMN process in Camunda is designed so that a Business Rule Task validates the input immediately after the process starts and, if there is no "Name", for example, interrupts the registration process? How will UsersMicroservice find out that the process has been interrupted and that it should not perform any further standard operation, such as return this.usersService.Create (userInput);?
Should the call to Camunda be in the Controller or rather in the Service layer?
How, in the architecture above, can we use Camunda to change the default Client-> UsersMicroservice-> UsersService-> Database flow, for example by adding input validation before calling return this.usersService.Create (someInput);?

If your intention is to let the process engine orchestrate the business process, then why not start the business process first? Either expose the start process API or a facade, which gets called by the API gateway when the desired business request should be served. Now let the process model decide which steps need to be taken to serve the request and deliver the desired result/business value. The process may start with a service task to create a user. However, as you wrote, the process may evolve and perform additional checks before the user is created. Maybe a DMN decision validates the data. Maybe it is followed by a gateway that leads to a rejection path, a path that calls an additional blacklist service, a path with a manual review, and the "happy path" with automated creation of the user. Whatever needs to happen, this is business logic, which you can make flexible by giving control to the process engine first.
The process should be started by the controller via a start process endpoint, before, and not from, UsersMicroservice.RegisterUser. You use a fixed process definition key to start it. From here on, everything can be changed in the process model. You could potentially have an initial routing process ("serviceRequest") first, which determines based on process data ("request type") what kind of request it is ("createUser", "disableUser",...) and dispatches to the correct specific process for the given request ("createUser" -> "userCreationProcess").
The UsersMicroservice should be stateless (request state is managed in the process engine) and should not need to know. If the process is started first, the request may never reach UsersMicroservice. this.usersService.Create will only be called if the business logic in the process has determined that it is required - same for any subsequent service calls. If a subsequent step fails, error handling can include retries, handling of a business error (e.g. "email address already exists") via an exceptional error path in the model (BPMNError), or eventually triggering a 'rollback' of operations already performed (compensation).
Controller - see above. The process will call the service if needed.
Call the process first, then let it decide what needs to happen.
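As a rough sketch of "start by a fixed process definition key", the snippet below builds (but does not send) the HTTP request a controller or facade could issue. It assumes the Camunda 7 REST API, whose start endpoint has the form POST /process-definition/key/{key}/start; the base URL, the PROCESS_KEYS mapping, the "userDeactivationProcess" key and the variable names are illustrative assumptions, not part of the original setup.

```python
import json
import urllib.request

# Hypothetical routing table: request type -> BPMN process definition key.
PROCESS_KEYS = {
    "createUser": "userCreationProcess",
    "disableUser": "userDeactivationProcess",  # assumed name, for illustration
}


def build_start_request(base_url, request_type, payload):
    """Build (without sending) the request that starts a process instance."""
    key = PROCESS_KEYS[request_type]
    body = {
        # Camunda 7 expects each variable as {"value": ..., "type": ...}.
        "variables": {
            name: {"value": value, "type": "String"}
            for name, value in payload.items()
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/process-definition/key/{key}/start",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_start_request(
    "http://localhost:8080/engine-rest",  # assumed engine location
    "createUser",
    {"name": "Alice", "email": "alice@example.com"},
)
print(request.get_method(), request.full_url)
```

The caller only knows the request type; which steps run afterwards (validation, rejection, user creation) is decided by the process model.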

Related

Microservices: how to track fallen down services?

Problem:
Suppose there are two services A and B. Service A makes an API call to service B.
After a while, service A goes down or the call is lost due to network errors.
How can other services detect that an outbound call from service A was lost or never happened? I need another concurrent application that automatically reacts (runs emergency code) if an outbound call from service A is lost.
What cutting-edge solutions exist?
My thoughts, for example:
service A registers a call event in some middleware (event info, "running" status, timestamp, etc).
If this call is not completed after N seconds, some "call timeout" event in the middleware automatically starts the emergency code.
If the call is completed at the proper time service A marks the call status as "completed" in the same middleware and the emergency code will not be run.
P.S. I'm on Java stack.
Thanks!
I recommend looking into patterns such as Retry, Timeout, Circuit Breaker, Fallback and Healthcheck. You can also look into the Bulkhead pattern if concurrent calls and fault isolation are your concern.
There are many resources where these well-known patterns are explained, for instance:
https://www.infoworld.com/article/3310946/how-to-build-resilient-microservices.html
https://blog.codecentric.de/en/2019/06/resilience-design-patterns-retry-fallback-timeout-circuit-breaker/
I don't know which technology stack you are on, but usually there is already some functionality for these concerns that you can incorporate into your solution. There are libraries that take care of this resilience functionality, and you can, for instance, set them up so that your custom code is executed when events such as failed retries, timeouts, or activated circuit breakers occur.
E.g. for the Java stack, Hystrix has been widely used (it is now in maintenance mode, and Resilience4j is a common alternative); for .NET you can look into Polly to make use of retry, timeout, circuit breaker, bulkhead or fallback functionality.
Concerning health checks, you can look into Spring Boot Actuator for Java, and ASP.NET Core already provides a health check middleware that more or less provides that functionality out of the box.
But before using any libraries, I suggest first getting familiar with the purpose and concepts of the listed patterns so that you can choose and integrate those that best fit your use cases and major concerns.
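To make the circuit breaker and fallback ideas more concrete, here is a minimal, library-free sketch. The class name, thresholds and the "emergency code ran" fallback are illustrative assumptions; a production system would more likely use one of the libraries mentioned above.

```python
import time


class CircuitBreaker:
    """Fail fast with a fallback after too many consecutive failures."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds the circuit stays open
        self.clock = clock              # injectable, which simplifies testing
        self.failures = 0
        self.opened_at = None           # None means the circuit is closed

    def call(self, operation, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()       # open: do not even try the call
            self.opened_at = None       # half-open: allow one trial call
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0               # a success closes the circuit again
        return result


def call_service_b():
    # Stand-in for the real outbound call; here it always fails.
    raise ConnectionError("service B is unreachable")


breaker = CircuitBreaker(max_failures=2, reset_after=60.0)
results = [
    breaker.call(call_service_b, lambda: "emergency code ran")
    for _ in range(3)
]
```

After two failures the breaker opens, so the third attempt skips the call to service B and goes straight to the fallback.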
Update
We have to differentiate between two well-known problems here:
1.) How can service A robustly handle temporary outages of service B (or the network connection between service A and B which comes down to the same problem)?
To address the related problems the above mentioned patterns will help.
2.) How to make sure that the request that should be sent to service B will not get lost if service A itself goes down?
To address this kind of problem there are different options at hand.
2a.) The component that performed the request to service A (which then triggers service B) also applies the resilience patterns mentioned and will retry its request until service A successfully answers that it has performed its tasks (which also includes the successful request to service B).
There can also be several instances of each service and some kind of load balancer in front of these instances, which will distribute and direct the requests to an available instance of the specific service (based on regularly performed health checks). Or you can use a service registry (see https://microservices.io/patterns/service-registry.html).
You can of course chain several API calls one after another, but this can lead to cascading failures. So I would rather go with an asynchronous communication approach as described in the next option.
2b.) Let's consider that it is of utmost importance that some instance of service A will reliably perform the request to service B.
You can use message queues in this case as follows:
Let's say you have a queue where jobs to be performed by service A are collected.
Then you have several instances of service A running (see horizontal scaling) where each instance will consume the same queue.
You will use the message locking features of the message queue service, which make sure that as soon as one instance of service A reads a message from the queue, the other instances won't see it. If service A was able to complete its job (i.e. call service B, save some state in service A's persistence, and whatever other tasks you need to include for successful processing), it will delete the message from the queue afterwards so that no other instance of service A processes the same message.
If service A goes down during processing, the queue service will automatically unlock the message for you, and another instance of service A (or the same instance after it has restarted) will read the message (i.e. the job) from the queue and try to perform all the tasks (call service B, etc.).
You can combine several queues e.g. also to send a message to service B asynchronously instead of directly performing some kind of API call to it.
The key point is that the queue service is a highly available and redundant service that makes sure no message gets lost once it has been published to a queue.
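The locking behaviour described above can be illustrated with a small in-memory stand-in for a queue service. This is only a sketch of the idea (lock on receive, delete on success, automatic unlock after a timeout); the class, the method names and the explicit `now` parameter are assumptions made for the example, and real queue services expose their own APIs for this.

```python
class LockingQueue:
    """In-memory stand-in for a queue with per-message locks."""

    def __init__(self, lock_seconds):
        self.lock_seconds = lock_seconds
        self.messages = {}      # message id -> body
        self.locked_until = {}  # message id -> time at which the lock expires
        self.next_id = 0

    def publish(self, body):
        self.next_id += 1
        self.messages[self.next_id] = body
        return self.next_id

    def receive(self, now):
        """Return and lock the first visible message, or None."""
        for message_id, body in self.messages.items():
            if self.locked_until.get(message_id, 0) <= now:
                self.locked_until[message_id] = now + self.lock_seconds
                return message_id, body
        return None

    def delete(self, message_id):
        """Called by a consumer once the job has been fully processed."""
        self.messages.pop(message_id, None)
        self.locked_until.pop(message_id, None)


queue = LockingQueue(lock_seconds=30)
queue.publish("call service B for order 42")

first = queue.receive(now=0)     # instance A1 takes the job, then crashes
hidden = queue.receive(now=10)   # lock still held: instance A2 sees nothing
retried = queue.receive(now=31)  # lock expired: instance A2 gets the job
queue.delete(retried[0])         # A2 finished, so the job is removed
```

Because the message is only deleted after successful processing, a crash of one instance delays the job but does not lose it. The flip side is that a job may be processed more than once, so the work should be idempotent.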
Of course, you could also keep the jobs to be performed in service A's own database, but consider that when service A receives a request there is always a chance that it goes down before it can save the status of the job to its persistent storage for later processing. Queue services already address that problem for you if chosen thoughtfully and used correctly.
For instance, if you consider Kafka as the messaging service, you can look into this Stack Overflow answer, which covers the solution to this problem with that specific technology: https://stackoverflow.com/a/44589842/7730554
There are many ways to solve your problem.
I guess you are talking about two topics: design patterns in microservices and the circuit breaker.
https://dzone.com/articles/design-patterns-for-microservices
To solve your problem, I normally put a message queue between services and use service discovery to detect which services are alive. If a service dies or is overloaded, use the circuit breaker pattern.

Need clarification on microservices

I need some clarifications on microservices.
1) As I understand it, only choreography needs event sourcing, and in choreography we use the publish/subscribe pattern. We also use a program like RabbitMQ to handle communication between publishers and subscribers.
2) Orchestration does not use event sourcing. It uses the observer pattern and communicates directly with observers, so it doesn't need a bus or message broker (like RabbitMQ). To coordinate the whole process in orchestration, we use the mediator pattern.
Is that correct?
In microservice orchestration, a centralized approach is followed: an orchestrator makes the decisions and controls execution. The orchestrator has to communicate directly with the respective service, wait for the response, and decide what to do based on it; hence it is tightly coupled. It is more of a synchronous approach, with business logic predominantly in the orchestrator, which takes ownership of sequencing with respect to the business logic. The orchestration approach typically follows a request/response pattern with point-to-point connections between the services.
In microservice choreography, a decentralized approach is followed, with more liberty: every microservice can execute its function independently, is self-aware, and does not require any instruction from a centralized entity. It is more of an asynchronous approach, with business logic spread across the microservices, where every microservice listens to other services' events and makes its own decision whether or not to perform an action. Accordingly, the choreography approach relies on a message broker (publish/subscribe) for communication between the microservices, with each service observing the events in the system and acting on them autonomously.
TLDR: Choreography is the one that doesn't need persistence of the status of the process; orchestration needs to keep the status of the process somewhere.
I think you got this somewhat mixed up with implementation details.
Orchestration is called such because there is a central process manager (sometimes referred to as a saga, wrongly imho) which directs (read: orchestrates) operations across other services. In this pattern, the process manager directs actions to BCs (bounded contexts), but needs to keep state on previous operations in order to undo, roll back, or take any corrective or reporting actions deemed necessary. This state can be held in an event stream, a normal-form db, or even implicitly and in memory (as in a method executing requests one by one and undoing the previous ones on an error), if the outbound requests are done through web requests, for example. Please note that orchestrators may use synchronous, request-response communication (like making web requests). In that case the orchestrator still keeps state; it's just that this state is either implicit (order of operations) or in-mem. State still exists though, and if you want to achieve resiliency (to be able to recover from an exception or any catastrophic failure), you would again need to persist that state on disk so that you could recover.
Choreography is called such because the pieces of business logic doing the operations observe and respond to each other. So, for example, when service A does something, it raises an event which is observed by B to do a follow-up action, and so on and so forth, instead of having a process manager ask A, then ask B, etc. Choreography may or may not need persistence. This really depends on the corrective actions that the different services need to take.
An example: As a practical example, let's say that on a purchase you want to reserve goods, take payment, then manifest a shipment with a courier service, then send an email to the recipient.
The order of the operations matters in both cases (because you want to be able to take corrective actions if possible), so we decide to take the payment after manifesting the shipment with the courier.
With orchestration, we'd have a process manager called PM, and the process would do:
PM is called when the user attempts to make a purchase
Call the Inventory service to reserve goods
Call the Courier integration service to manifest the shipment with a carrier
Call the Payments service to take a payment
Send an email to the user that they're receiving their goods.
If the PM notices an error in the last step (sending the email), the only corrective action is to retry sending the email, and then report. If there was an error during payment, the PM would directly call the Courier integration service to cancel the shipment, then call Inventory to un-reserve the goods.
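The orchestration flow above, including the undo logic, can be sketched as a process manager that runs each step in order and compensates the completed ones in reverse when a step fails. The function and step names are illustrative assumptions, and the actions are stubs standing in for real service calls.

```python
class PaymentDeclined(Exception):
    pass


def run_purchase(steps):
    """Run (name, action, compensation) steps; undo completed ones on failure."""
    completed = []  # state the process manager keeps about previous steps
    log = []
    for name, action, compensation in steps:
        try:
            action()
        except Exception as error:
            log.append(f"{name} failed: {error}")
            # Compensate in reverse order: cancel shipment, then un-reserve.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated {done_name}")
            return log
        log.append(f"{name} done")
        completed.append((name, compensation))
    return log


def decline():
    raise PaymentDeclined("card declined")


steps = [
    ("reserve goods", lambda: None, lambda: None),
    ("manifest shipment", lambda: None, lambda: None),
    ("take payment", decline, lambda: None),
    ("send email", lambda: None, lambda: None),
]
log = run_purchase(steps)
```

Here `completed` is exactly the in-memory state mentioned earlier; to survive a crash of the process manager itself, it would have to be persisted.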
With choreography, what would happen is:
An OrderMade event is raised and observed by all services that need data
Inventory handles the OrderMade event and raises an OrderReserved
CourierIntegration handles the OrderReserved event and raises ShipmentManifested
Payments service handles the ShipmentManifested and on success raises PaymentMade
The email service handles PaymentMade and sends a notification.
The rollback would be the opposite of the above process. If the Payments service raised an error, Courier Integration would handle it and raise a ShipmentCancelled event, which in turn is handled by Inventory to raise OrderUnreserved, which in turn may be handled by the email service to send a notification.
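The choreographed version can be sketched with a tiny in-process event bus, where each service only subscribes to events and raises new ones. The bus itself, the "PaymentFailed" and "NotificationSent" event names, and the `payment_succeeds` switch are assumptions added for the example; in practice the bus would be a message broker.

```python
from collections import defaultdict


class EventBus:
    """Minimal synchronous publish/subscribe bus."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.history = []  # every event raised, in order

    def subscribe(self, event_name, handler):
        self.handlers[event_name].append(handler)

    def publish(self, event_name, order_id):
        self.history.append(event_name)
        for handler in list(self.handlers[event_name]):
            handler(order_id)


def wire_services(bus, payment_succeeds):
    # Inventory
    bus.subscribe("OrderMade", lambda o: bus.publish("OrderReserved", o))
    # CourierIntegration
    bus.subscribe("OrderReserved", lambda o: bus.publish("ShipmentManifested", o))
    # Payments (outcome is a parameter here, to show both paths)
    bus.subscribe(
        "ShipmentManifested",
        lambda o: bus.publish("PaymentMade" if payment_succeeds else "PaymentFailed", o),
    )
    # Email service
    bus.subscribe("PaymentMade", lambda o: bus.publish("NotificationSent", o))
    # Rollback chain: each service reacts to the failure of the next one.
    bus.subscribe("PaymentFailed", lambda o: bus.publish("ShipmentCancelled", o))
    bus.subscribe("ShipmentCancelled", lambda o: bus.publish("OrderUnreserved", o))


bus = EventBus()
wire_services(bus, payment_succeeds=False)
bus.publish("OrderMade", 42)
print(bus.history)
```

No component knows the whole flow; the sequence emerges from the subscriptions, which is what distinguishes this from the process manager approach.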

Micro-services architecture, need advice

We are working on a system that is supposed to 'run' jobs on distributed systems.
When jobs are accepted they need to go through a pipeline before they can be executed on the end system.
We've decided to go with a micro-services architecture, but there is one thing that bothers me, and I'm not sure what the best practice would be.
When a job is accepted it will first be persisted into a database, then - each micro-service in the pipeline will do some additional work to prepare the job for execution.
I want the persisted data to be updated at each such station in the pipeline to reflect the actual state of the job, or its status in the pipeline.
In addition, while a job is being executed on the end system - its status should also get updated.
What would be the best practice for updating the database (the job's status) at each station:
Each such station (micro-service) in the pipeline accesses the database directly and updates the job's status
There is another micro-service that exposes the data (REST) and serves as DAL, each micro-service in the pipeline updates the job's status through this service
Other?....
Help/advice would be highly appreciated.
Thanx a lot!!
To add to what was said by @Anunay and @Mohamed Abdul Jawad:
I'd consider writing the state from the units of work in your pipeline to a view (an insert-only table or cache). You can use messaging or simply insert a row into that view, and have the readers of the state pick up the correct state based on some logic (date, state, or a composite key). As this view is not really owned by any domain service, it can be available for any reader to consume (read-only).
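A minimal sketch of such an insert-only view is shown below, with an in-memory list standing in for the table or cache. The row fields, the station names and the "latest timestamp wins" rule are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatusRow:
    job_id: str
    station: str
    status: str
    recorded_at: int  # e.g. a timestamp or sequence number


status_view = []  # insert-only: rows are appended, never updated


def record_status(job_id, station, status, recorded_at):
    """Called by each station in the pipeline (directly or via messaging)."""
    status_view.append(StatusRow(job_id, station, status, recorded_at))


def current_status(job_id):
    """Reader-side logic: the most recent row for the job wins."""
    rows = [row for row in status_view if row.job_id == job_id]
    return max(rows, key=lambda row: row.recorded_at) if rows else None


record_status("job-1", "validation", "ACCEPTED", recorded_at=1)
record_status("job-1", "scheduling", "QUEUED", recorded_at=2)
record_status("job-1", "execution", "RUNNING", recorded_at=3)
latest = current_status("job-1")
```

Since nothing is ever overwritten, the same rows also serve as a history of the job's path through the pipeline.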
Also consider the Saga pattern.
A Saga is a sequence of local transactions where each transaction updates data within a single service. The first transaction is initiated by an external request corresponding to the system operation, and then each subsequent step is triggered by the completion of the previous one.
http://microservices.io/patterns/data/saga.html
https://dzone.com/articles/saga-pattern-how-to-implement-business-transaction
https://medium.com/@tomasz_96685/saga-pattern-and-microservices-architecture-d4b46071afcf
If you would like to code the workflow:
Microservice A, which accepts the job and the commands for updating the job
Microservice B, which provides the read model for the job
Based on JobCreatedEvents, use a message queue to process and update the job through the queue pipeline, and keep updating JobStatus at every node in the pipeline.
I am assuming you are familiar with queues and consumers.
I am new to Camunda (a workflow engine) myself; it might be usable here, but I'm not completely sure.
Accessing a shared database from multiple microservices is strongly discouraged, as it violates a basic rule of microservices architecture.
A microservice must be autonomous and keep its own logic and data.
Also, to achieve a good microservice design, you should loosely couple your microservices.
Multiple microservices accessing the same database is not recommended. Here you have the case where each of the services needs to be triggered, then they update the data and then somehow call the next service.
You really need a mechanism to orchestrate the services. A workflow engine might fit the bill.
I would, however, suggest an event-driven system. I might be overreaching given my limited knowledge of your data. Have one service that gives you basic CRUD on the data, and other services that contain the logic to change the data. (At this point I would like to ask why you want different services to change the state; if it's a business requirement, that's fine.) Once the data is written, just create an event that services can subscribe and react to.
This will allow you to easily add more states to your pipeline in future.
You will need a service to manage the event queue.
As far as logging the state is concerned, it can be done easily by logging the events.
If you opt for the workflow route, you may use Amazon SWF or Camunda; there are quite a few options out there.
If you go the event route, you need to look into event-driven systems in microservices.

linking Microservices and allowing for one to be unavailable

I'm new to the microservices architecture and am seeing that it is possible under the model to call a microservice from another via HTTP request. However, I am reading that if a service is down all other services should still operate.
My question is, how is this generally achieved?
For example, a microservice that handles all Car record manipulation may need access to the service which handles the Vehicle data. How can the Car microservice complete its operations if that service is down or doesn't respond?
You should generally aim for almost no synchronous communication between microservices. (If you still want synchronous communication, consider circuit breakers, which allow your service to keep responding, albeit with a logical error message; without circuit breaking, dependent services will also go down completely.) This can be achieved by questioning the consistency requirements of the microservice.
Sometimes these things are not directly visible. For example, let's say there are two services, an order service and a customer service. The order service exposes an API that places an order for a customer ID, and the business says you cannot place an order for an unknown customer.
One implementation is to call the customer service synchronously from the order service. In this case, the customer service being down will impact your service. Now let's question whether we really need this.
A scenario could happen where a customer has just placed an order and somebody then deletes that customer from the customer service; now we have an order that doesn't belong to any customer. Consistency cannot be guaranteed anyway.
In the new solution, we allow the order service to place the order without checking the customer ID, and do one of the following:
Using a ProcessManager, check the customer's validity and update the status of the order to invalid if the check fails; and when a customer gets deleted, use the ProcessManager to update the order status to invalid or perform other business logic
Do not check at all, because merely placing an order doesn't commit anything; when the order is being dispatched, that service will check the customer status anyway
The point is to try to achieve more asynchronous communication between microservices; you will mostly be able to find the solution in the consistency required by the business. But if your business wants the check to be 100% reliable, you have to call the other service, and if the other service is down, your service will return logical errors.
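The first option (accept the order immediately, validate it later through a process manager) might look roughly like the sketch below. The status values PENDING, CONFIRMED and INVALID, and the method names, are assumptions for illustration; the event delivery that would trigger the process manager is left out.

```python
class OrderService:
    def __init__(self):
        self.orders = {}  # order id -> {"customer_id": ..., "status": ...}

    def place_order(self, order_id, customer_id):
        # No synchronous call to the customer service: the order is accepted
        # even if that service is currently down.
        self.orders[order_id] = {"customer_id": customer_id, "status": "PENDING"}


class ProcessManager:
    """Reacts to asynchronous results and events from the customer service."""

    def __init__(self, order_service):
        self.order_service = order_service

    def on_customer_checked(self, order_id, customer_exists):
        order = self.order_service.orders[order_id]
        order["status"] = "CONFIRMED" if customer_exists else "INVALID"

    def on_customer_deleted(self, customer_id):
        for order in self.order_service.orders.values():
            if order["customer_id"] == customer_id:
                order["status"] = "INVALID"


orders = OrderService()
manager = ProcessManager(orders)

orders.place_order("o-1", "c-1")           # works while customer service is down
manager.on_customer_checked("o-1", True)   # later: validity check result arrives
orders.place_order("o-2", "c-2")
manager.on_customer_deleted("c-2")         # later: customer c-2 is deleted
```

The order service stays available at all times; the price is that an order can briefly exist in a state the business considers not yet valid.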

In DDD, who should be responsible for handling domain events?

Who should be responsible for handling domain events? Application services, domain services or the entities themselves?
Let's use a simple example for this question.
Let's say we work on a shop application, and we have an application service dedicated to order operations. In this application, Order is an aggregate root and, following the rules, we can work with only one aggregate within a single transaction. After an Order is placed, it is persisted in a database. But there is more to be done. First, we need to change the number of items available in the inventory, and second, notify some other part of the system (probably another bounded context) that the shipping procedure for that particular order should be started. Because, as already stated, we can modify only one aggregate within a transaction, I am thinking about publishing an OrderPlacedEvent that will be handled by some components in separate transactions.
The question arises: which components should handle this type of event?
I'd go with:
1) Application layer if the event triggers modification of another Aggregate in the same bounded context.
2) Application layer if the event triggers some infrastructure service.
e.g. An email is sent to the customer. An application service is needed to load the order for the mail content and recipient, and then invoke the infrastructure service to send the mail.
3) Personally, I prefer a Domain Service if the event triggers some operations in another bounded context.
e.g. Shipping or Billing; an infrastructure implementation of the Domain Service is responsible for integrating with the other bounded context.
4) Infrastructure layer if the event needs to be split among multiple consumers. Each consumer then falls under 1), 2) or 3).
For me, the conclusion is: Application layer if the event leads to a separate acceptance test for your bounded context.
By the way, what's your infrastructure to ensure durability of your events? Do you include the event publishing in the transaction?
These kinds of handlers belong to the application layer. You should probably create a supporting application service method too. This way you can start a separate transaction.
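As a rough illustration of handling the event in the application layer, the sketch below has the Order aggregate record an OrderPlacedEvent, and an application service dispatch it to a handler that updates the inventory separately. All class names other than Order and OrderPlacedEvent are assumptions, and persistence and transaction boundaries are only indicated in comments.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OrderPlacedEvent:
    order_id: str
    quantities: dict  # sku -> quantity ordered


@dataclass
class Order:  # aggregate root
    order_id: str
    quantities: dict
    events: list = field(default_factory=list)

    def place(self):
        # The aggregate only records what happened; it does not handle it.
        self.events.append(OrderPlacedEvent(self.order_id, dict(self.quantities)))


class InventoryHandler:
    """Application-layer handler: changes another aggregate separately."""

    def __init__(self, stock):
        self.stock = stock  # sku -> items available

    def handle(self, event):
        # Would run in its own transaction in a real system.
        for sku, quantity in event.quantities.items():
            self.stock[sku] -= quantity


class OrderApplicationService:
    def __init__(self, handlers):
        self.handlers = handlers

    def place_order(self, order):
        order.place()               # transaction 1: persist the Order (omitted)
        for event in order.events:  # after commit: dispatch to the handlers
            for handler in self.handlers:
                handler.handle(event)
        order.events.clear()


stock = {"sku-1": 5}
service = OrderApplicationService([InventoryHandler(stock)])
order = Order("o-1", {"sku-1": 2})
service.place_order(order)
```

A handler that notifies the shipping context could be added to the same list without touching the Order aggregate.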
I think the most common place to put the EventHandlers is the application layer. By analogy with CQRS, EventHandlers are very similar to CommandHandlers, and I usually put them close to each other (in the application layer).
This article from Microsoft also gives some examples of putting handlers there. Look at the image below, taken from the related article:
