Stateful workflow engine vs Orchestrated idempotent services - microservices

I realize the benefits of a workflow engine, such as easy-to-understand communication, easy waiting, parallelism and compensating actions, together with an informative graphical model. The concept is great and more manageable than a dogmatic event-driven architecture without a central coordinator and a specified flow.
We are currently using a legacy workflow engine to orchestrate microservices in the insurance business. Over time, chunks of business logic and little helper scripts have crept into the process model, which is not a developer-friendly solution to maintain and test by continuous integration standards. Also, the lack of available expertise and future support is a huge risk from the project management perspective.
I played around with Camunda and Activiti, but immediately faced compatibility issues with Spring Boot 3 and a lack of up-to-date examples and general knowledge outside of a relatively small user community. This gives me a bad feeling that in the future we will be drowning in the same swamp we are in now.
We plan to design our own Java-based orchestrator, which just invokes specified microservices in a specified order when the process is started or a user task is completed. The orchestrator will also handle monitoring and versioning of the process flow. It's up to the microservices to validate their business context and halt the process by raising user tasks if necessary. When a user task is completed, the orchestrator restarts the whole process from the beginning with all tasks cleared. It is the responsibility of the microservices to no-op when their work was already done in a previous run. Eventually, the process will reach its end and finish. This solution would be a good balance of modern DX and coordinated process management.
Are there examples of, or a name for, such an idempotent orchestrated architecture?
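To make the restart-from-the-beginning idea concrete, here is a minimal sketch, written in Python for brevity rather than Java. The step names, the `ProcessContext` class and the convention that a step returns the name of a user task to halt the run are all hypothetical, invented for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set

# Hypothetical convention: a step returns None when its work is done (or was
# already done), or the name of a user task that must be completed first.
StepResult = Optional[str]


@dataclass
class ProcessContext:
    """Stand-in for state that the services themselves would persist."""
    data: Dict[str, object] = field(default_factory=dict)
    completed_user_tasks: Set[str] = field(default_factory=set)
    log: List[str] = field(default_factory=list)


def validate_claim(ctx: ProcessContext) -> StepResult:
    if "validated" in ctx.data:        # no-op: done in a previous run
        return None
    ctx.data["validated"] = True
    ctx.log.append("validate_claim")
    return None


def approve_payout(ctx: ProcessContext) -> StepResult:
    if "approve" not in ctx.completed_user_tasks:
        return "approve"               # halt the run by raising a user task
    if "approved" in ctx.data:         # no-op: done in a previous run
        return None
    ctx.data["approved"] = True
    ctx.log.append("approve_payout")
    return None


def pay_out(ctx: ProcessContext) -> StepResult:
    if "paid" in ctx.data:             # no-op: done in a previous run
        return None
    ctx.data["paid"] = True
    ctx.log.append("pay_out")
    return None


def run(steps: List[Callable[[ProcessContext], StepResult]],
        ctx: ProcessContext) -> StepResult:
    """Always start at the first step; stop at the first raised user task."""
    for step in steps:
        task = step(ctx)
        if task is not None:
            return task
    return None


steps = [validate_claim, approve_payout, pay_out]
ctx = ProcessContext()
pending = run(steps, ctx)               # first run halts at the user task
ctx.completed_user_tasks.add(pending)   # the user completes the task
finished = run(steps, ctx)              # second run restarts from the top
```

In the second run `validate_claim` is invoked again but does nothing, which is the idempotency requirement the design places on every service.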

You only get into the challenge of aligning dependencies between your services and the process engine (and other components) if you tightly couple the orchestration / engine with the services. This has happened to me many times in the past, too. If you separate the engine (called remote process engine with Camunda 7, the only architecture with Camunda 8), then you are not influenced by its dependencies. Try, for instance, the Camunda RUN distribution and the external task pattern, or C8 SaaS, to get to a cleaner, decoupled architecture. See Bernd Ruecker's reasoning here.
Details will depend on your specific requirements, but I would definitely advise anyone against building a homegrown solution. There are enough options on the market, and the times of homegrown engines are over. Requirements grow over time. There are security vulnerabilities to be aware of and to fix, etc. The maintenance effort is high, there is no market for resources and there are no synergies; you would need to maintain proprietary knowledge in the company and could not achieve the same level of quality and feature richness as a more broadly used solution. For a list of options see, for instance, Bernd Ruecker's articles. Among the available options I would personally prefer an orchestrator that uses a graphical process modelling approach based on the BPMN 2 standard. It helps clarity, knowledge transfer, and Business-IT alignment, and the standard is a vendor-independent skill set.

There is no need to build your own. Use the temporal.io open source project. Besides the Java SDK, it supports Go, TypeScript/JavaScript, Python, and PHP.
The project started at Uber in 2016. There are hundreds of companies using it for mission critical applications.

Related

Should event driven architecture be targeted for all data & analytics platforms?

For example,
You have an IT estate where a mix of batch and real-time data sources exists across multiple systems, e.g. ERP, project management, asset, website, monitoring, etc.
The aim is to integrate the data sources into a cloud environment (agnostic).
There is a need for reporting and analytics on combinations of all data sources.
Inevitably, some source systems are not capable of streaming, hence batch loading is required.
There are potential use cases for triggering functionality, changes, or updates based on the ingested data.
Given a steer to create a future-proofed platform, how would you look to design it architecturally?
It's a very open-ended question, but there are some good principles you can adopt to help point you in the right direction:
Avoid point-to-point integration, and get everything going through a few common points, ideally one. An API gateway can be a good place to start; the big players (Azure, AWS, GCP) all have their own options, plus there are lots of decent independent ones like Tyk or Kong.
Batches and event-streams are totally different, but even then you can still potentially route them all through the gateway so that you get the centralised observability (reporting, analytics, alerting, etc).
Use standards-based API specifications where possible. A good REST-based API built on a proper resource model is a non-trivial undertaking, and I'm not sure it fits what you are doing if you are dealing with lots of disparate legacy integrations. If you are going to adopt REST, use OpenAPI to specify the APIs. Using this standard not only makes things easier for consumers, but also gives you better tooling, as many design, build and test tools support OpenAPI. There's also AsyncAPI for event/async APIs.
Do some architecture. Moving sh*t to cloud doesn't remove the sh*t - it just moves it to the cloud. Don't recreate old problems in a new place.
Work out the logical components in your new solution: what does each of them do (what's its reason to exist)? Don't forget ancillary components like API catalogues, etc.
Think about layering the integration (usually depending on how they will be consumed and what role they need to play, e.g. system interface, orchestration, experience APIs, etc).
Want to handle data in a consistent way regardless of source (your 'agnostic' comment)? You'll need to think through how data is ingested and processed. This might lead you into more data / ETL centric considerations rather than integration ones.
Co-design. Is the integration mainly data coming in or going out? Is the integration with 3rd parties or strictly internal?
If you are designing for external / 3rd party consumers then a co-design process is advised, since you're essentially designing the API for them.
If the APIs are for internal use, consider designing them for external use so that when/if you decide to open them up later it's not so hard.
Take a step back:
Continually ask yourselves "what problem are we trying to solve?". Usually, a technology initiative is successful if there's a well-understood reason for doing it, which has solid buy-in from the business (non-IT).
Who wants the reporting, and why - what problem are they trying to solve?
As you mentioned, it's an IT estate, i.e. an enterprise-level solution with a mix of batch and real-time sources, so first you have to identify the end goal of this migration. You can think about refactoring applications. If you are trying to make it event driven, then assess the refactoring effort and cost. Separation of responsibility is the key factor for refactoring and migration.
If you are thinking about future-proofing your solution, then consider the cloud for storing and processing your data. It will not necessarily be cheap, but a mix of cloud and on-prem could be a way forward. Cloud providers offer services to move your data at minimal cost, and cloud-native solutions exist for performing analysis on your data. A database migration service in AWS or Azure can move data and then capture ongoing changes. So you can keep using your on-prem DB and apps and perform the analysis for reporting in the cloud. This will ease the load on your transactional DB. Most data sync from on-prem to cloud is near real time.

Passively Logging React App Performance in Production

I'm wondering if there are any utilities/patterns/paradigms/standards for monitoring React applications in production.
I've seen a lot of documentation about React performance debugging that recommends the Chrome Dev Tools (which are great, but aren't a passive way to monitor end user performance).
How could I log data to know how long users are waiting for components to mount or render?
The only thing I've thought of so far is creating a Loggable[Pure]Component that extends React.[Pure]Component whose constructor, componentWillMount/Update, and componentDidMount/Update methods log render/mount times to a server. Then, components I want to monitor can extend these components and, if need be, call super() in the lifecycle methods before doing their own work. To specifically know which components these metrics go to, I'd have to expose a method in the Loggable[Pure]Component class that does something silly like setUniqueId and then each derived class would have to call it in the constructor.
This all seems terrible and I'm very much hoping there are some things people out there have implemented, but I haven't found anything thus far.
I would have a look at some APM tools; they handle frontend monitoring as well as backend monitoring. They all support React, and folks use them all the time for that use case. It really depends on your goals for the monitoring: are you doing this for fun? Do you have a startup? Are you working for a large enterprise? There are 3 major players in this market.
AppDynamics - Enterprise APM, handles the most complex apps. Unified product offering delivered SaaS or on-premises. Has deep database, server, and other monitoring.
Dynatrace - Enterprise APM, handles complex apps well. Fragmented portfolio, but the SaaS product is good. The SaaS product has limited depth in some ways. Handles server and cloud infrastructure monitoring well.
New Relic - Easy and cheap(er than others), not as in-depth as some other options. Tends to be popular with small companies. Does a good job monitoring cloud infrastructure services.
These products all do what you are looking for, but it depends on your goals with the data and how you plan to analyze it.
If you want something free and less functional there are ways to do this with open source, but you'll have to stand up and manage a pretty complex stack. Here is one option.
Check out boomerang, which can log/extract the metrics you are looking for. It doesn't "understand" React, but it should work. This data can be posted to many different systems. The best suited is likely the ELK stack (open source log analytics, and more). Here is one of several examples that marries the two together to provide analysis of browser performance: https://github.com/naukri-engineering/NewMonk

Microservices, Dependencies and Events

I’ve been doing a lot of googling regarding managing dependencies between microservices. We’re trying to move away from a big monolithic app to microservices in order to scale organizationally and be able to develop faster, with multiple teams working in parallel.
However, as we’re trying to functionally partition the monolith into microservices, we see how intertwined the business logic and data really are. This was not a problem when we were sitting on top of one big DB and were able to do big relational joins. But with microservices, this becomes a problem.
One solution is to make microservice-A go to 5-10 other microservices to get the necessary data (this is the equivalent of a DB view with a join). Another solution is to make microservice-A listen to events from 5-10 other services and populate local storage with the relevant info (this is the equivalent of a materialized view). Either way, microservice-A is coupled with 5-10 other services, and if new info is needed in microservice-A, then some of the services that it depends upon might need to be released prior to microservice-A. Please note that microservice-A is itself depended upon by other services. Bottom line, we end up with DISTRIBUTED dependency hell.
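The second option (a local "materialized view" fed by events) could look roughly like the sketch below. It is in Python, purely illustrative: the `Event` envelope, its `version` field and the `LocalView` class are assumptions, not part of any particular framework, and a real service would persist the view and consume from a broker.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical event envelope published by an upstream service."""
    source: str       # e.g. "customer-service"
    entity_id: str
    version: int      # assumed to increase with every change to the entity
    payload: dict


class LocalView:
    """Local read model kept by microservice-A (the 'materialized view')."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], dict] = {}
        self._versions: Dict[Tuple[str, str], int] = {}

    def apply(self, event: Event) -> bool:
        """Merge an event into the view; return False if it was ignored."""
        key = (event.source, event.entity_id)
        # Ignore duplicate and stale deliveries so that replays are harmless.
        if event.version <= self._versions.get(key, 0):
            return False
        self._rows[key] = {**self._rows.get(key, {}), **event.payload}
        self._versions[key] = event.version
        return True

    def get(self, source: str, entity_id: str) -> dict:
        """Return a copy of the locally stored fields for one entity."""
        return dict(self._rows.get((source, entity_id), {}))


view = LocalView()
view.apply(Event("customer-service", "c1", 1, {"name": "Ann"}))
view.apply(Event("customer-service", "c1", 2, {"tier": "gold"}))
duplicate = view.apply(Event("customer-service", "c1", 2, {"tier": "silver"}))
row = view.get("customer-service", "c1")
```

Note that the coupling has not gone away: microservice-A still depends on the shape of each upstream payload, which is exactly the release-ordering concern raised above.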
Many articles advocate for second solution – i.e. something along the lines of Event Sourcing, Choreography, etc.
I would appreciate any shared experiences, recommendations and insights.
Philometor.
While not technically an "answer", I can definitely share some of my observations and experiences. Your question concerning services calling other services for database operations reminded me of a project where an architect sold senior management on the idea of "decoupling" persistence from the rest of the applications by implementing hundreds of REST interfaces in what essentially was a distributed DAO pattern in front of a very large enterprise database. The project ended up exactly the way I predicted - a dismal failure.
Microservices aren't about turning a monolithic application into a distributed monolithic application. In my example project above, the monolith was turned into a stove-piped, fragile, chaotic mess, with the coupling only moved to service contracts instead of Java class method signatures, and with a performance hit so bad the application was unusable. Last I heard they are still running their original monolith.
Microservices should be more of a vertical partitioning of your application and not a horizontal one. In my opinion it's better to think in terms of business function partitioning rather than "converting" an existing monolith. There's no rule that determines how big a microservice must be, but it should be big enough to do one complete synchronous function without needing to directly depend on outside services (as much as possible) to complete its work. If a microservice performs a complex business function that affects 50 tables, so be it! It owns those many tables. Ideally if a service goes down, it should affect only that business functionality it's responsible for, and not directly affect other services. As you can see, this thinking is the complete opposite from that which produced the distributed mess in my project example.
Not only do you need to ensure that the motivation behind replacing monoliths with microservices is sound, but also you need to step outside the monolith and revisit the actual business and begin partitioning that instead. Like everything else, baby steps are the way to go. Start with one small complete business function, and convert that into a single microservice instead of trying to replace a monolith all at once.

Best Practices when Migrating to Microservices

To anyone with real world experience breaking a monolith into separate modules and services.
I am asking this question having already read the MonolithFirst blog entry by Martin Fowler. When taking a monolith and breaking it into microservices, the "size" element of the equation is the one that I ponder the most. Specifically, how to approach breaking a monolithic application (we're talking 2001: A Space Odyssey; as in it is that old and that large) into microservices without getting overly fine-grained or staying too monolithic. The end goal is creating separate modules that can be upgraded independently and scaled independently.
What are some recommended best practices based on personal experience of breaking a monolith into microservices?
The rule of thumb is to break the monolith along bounded contexts. The most common way of defining a bounded context is by BU (Business Unit). For example, the module that does the actual payment is mostly a separate BU.
The second thing to consider is the overhead microservices bring. You should analyse the hardware, monitoring, and infra pieces before completely breaking up the service. What I have seen is people taking smaller microservices out of the monolith one at a time instead of writing, say, 10 new services and deprecating the monolith.
My advice is to take an incremental approach. Take the first BU that is being worked on out of the monolith. This will also give the whole team a good learning curve.
You should clearly distinguish the sub-domain areas (bounded contexts) within your domain.
Usually (if everything is fine with your architecture) you already have some separate components in your monolith application, each responsible for one sub-domain. These components interact with each other in one process (in the monolith application), and you should think about how to put them into separate processes. Of course, you will need to do a lot of refactoring when moving parts of the monolith to microservices one by one.
Always remember that every microservice is responsible for some sub-domain.
I strongly recommend that you learn Domain-Driven Design.
Domain-Driven Design: Tackling Complexity in the Heart of Software by Eric Evans
Implementing Domain-Driven Design by Vaughn Vernon
Also learn CQRS pattern
At the beginning you should also decide how your microservices will interact with each other.
There are several options:
Direct calls from one service to another
Send messages through some dispatcher service, which abstracts the client service from the knowledge of where the called (destination) services are located. This approach is similar to how a proxy server like NGINX works.
Interact through some messaging bus (middleware), like RabbitMQ
You can combine these options; for example, Query requests can be processed through the Dispatcher Service, and Commands and Events through the message bus.
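As a rough illustration of combining the options, the Python sketch below routes queries through a dispatcher and events through a bus. Both classes are in-memory stand-ins invented for this example; `MessageBus` only mimics the publish/subscribe shape of a broker such as RabbitMQ and uses none of its actual API.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List


class Dispatcher:
    """Routes queries by logical service name, hiding where services live."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[dict], dict]] = {}

    def register(self, service: str, handler: Callable[[dict], dict]) -> None:
        self._handlers[service] = handler

    def query(self, service: str, request: dict) -> dict:
        # The caller only knows the logical name, never the location.
        return self._handlers[service](request)


class MessageBus:
    """In-memory stand-in for a message broker (illustration only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[dict], None]]] = (
            defaultdict(list)
        )

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Deliver the message to every handler subscribed to the topic.
        for handler in self._subscribers[topic]:
            handler(message)


dispatcher = Dispatcher()
bus = MessageBus()

# Hypothetical "catalog" service answering price queries.
dispatcher.register("catalog", lambda req: {"sku": req["sku"], "price": 10})

# Hypothetical shipping service reacting to order events.
received_by_shipping: List[dict] = []
bus.subscribe("order.placed", received_by_shipping.append)

price = dispatcher.query("catalog", {"sku": "A1"})["price"]   # query path
bus.publish("order.placed", {"sku": "A1", "price": price})    # event path
```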
From my experience, the biggest problem will be moving away from the single database that monolith applications usually use.
In addition, some good practices:
Put each microservice in its own repository. This removes the ability to directly use the code of one microservice in another. You also get faster checkouts and builds of each microservice on CI.
Interactions with any service should occur only through its public contracts.
Aim for each microservice to have its own database.
Example of the sub-domains (bounded contexts) for some Tourism Industry application.
Each bounded context can be serviced by a microservice.
We also started our journey some time back, and I started writing a blog series on exactly this: https://dzone.com/articles/how-i-started-my-journey-in-micro-services-and-how
Basically, what I understood is that to break my problem into different microservices, I need a design framework, which Domain-Driven Design gives (see the book Domain-Driven Design Distilled by Vaughn Vernon).
Then, to implement the design (using CQRS, Event Sourcing, and so on), I need a framework that provides support for all of the above.
I found Lagom good for this (Eventuate and Spring Microservices are some other choices).
Sample Microservices Domain analysis using Domain Driven Design by Microsoft: https://learn.microsoft.com/en-us/azure/architecture/microservices/domain-analysis
One more analysis is: http://cqrs.nu/tutorial/cs/01-design
After reading up on Domain-Driven Design, I think Lagom and the links above will help you build an end-to-end application. If you still have any doubts, please ask :)

Preferred methods for interacting with a rules engine

I am about to dive into a rules-oriented project (using ILOG's Rules for .NET, now IBM). I have read a couple of different perspectives regarding how to set up the rules processing and how to interact with the rule engine.
The two main approaches I have seen are as follows. One is to centralize the rule engine (into its own farm of servers) and program against the farm via a web service API (or in ILOG's case via WCF). The other is to run an instance of the rule engine on each of your app servers and interact with it locally, with each instance having its own copy of the rules.
The up side to centralization is the ease of deployment of the rules to a centralized location. The rules scale as they need to rather than scaling each time you expand your application server configuration. This reduces waste from a purchased license perspective. The down side to this set up is the added overhead of making service calls, network latency, etc.
The upside/downside to locally running the rule engine is the exact opposite of the centralized configuration's upside/downside. No slow service calls (fast API calls), no network issues, each app server relies on itself. On the other hand, managing deployment of rules becomes more complex, and each time you add a node to your app cloud you will need more licenses for rule engines.
In reading white papers I see that Amazon is running the rule engine per app server configuration. They appear to do a slow deployment of rules and recognize that the lag in rule publishing is "acceptable" even though business logic is out of sync for a given period of time.
Question: From your experiences, what is the best way to start integrating rules into a .NET-based web app for a shop that has not yet spent much time working in a rules-driven world?
I never liked the centralization argument. It means that everything is coupled into the rules engine, which becomes a dumping ground for all the rules in the system. Pretty soon you can't change anything for fear of the unknown: "What will we break?"
I much prefer following Amazon's idea of services as isolated, autonomous components. I interpret that to mean that services own their data and their rules.
This has the added benefit of partitioning the rules space. A rule set becomes harder to maintain as it grows; better to keep them to a manageable size.
If parts of the rule set are shared, I'd prefer a data-driven, DI approach where a service can have its own instance of a rules engine and load the common rules from a database on startup. This might not be feasible if your iLog license makes multiple instances cost-prohibitive. That would be a case where a product that's supposed to be helping might actually be dictating architectural choices that will bring grief. It would be a good argument for a less expensive alternative (e.g., JBoss Rules in Java-land).
What about a data-driven decision tree approach? Is a Rete rules engine really necessary, or is the "enterprise tool" decision driving your choice?
I'd try to set up the rules engine so it was as decoupled from the rest of the enterprise as possible. I wouldn't have it calling out to databases or services if I could. Better to make that the responsibility of the objects asking for a decision. Let them call to the necessary web services and databases to assemble the necessary data, pass it to the rules engine, and let it do its thing. Coupling is your enemy: Try to design your system to minimize it. Keeping rules engines isolated is a good way to do it.
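A minimal sketch of that separation, in Python rather than .NET and not tied to ILOG or any other product: the caller assembles the facts, and the engine only evaluates them. The `Rule` and `RulesEngine` classes, the rule names and the stubbed `assemble_facts` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    name: str
    condition: Callable[[Dict[str, Any]], bool]
    decision: str


class RulesEngine:
    """Pure evaluation: no database or service calls happen in here."""

    def __init__(self, rules: List[Rule]) -> None:
        self._rules = list(rules)

    def decide(self, facts: Dict[str, Any]) -> List[str]:
        # Return the decisions of all matching rules, in rule order.
        return [r.decision for r in self._rules if r.condition(facts)]


def assemble_facts(order_id: str) -> Dict[str, Any]:
    """Caller-side data gathering; stubbed here instead of real lookups."""
    return {"order_id": order_id, "total": 1200, "customer_tier": "gold"}


rules = [
    Rule("large-order", lambda f: f["total"] > 1000, "manual-review"),
    Rule("gold-discount", lambda f: f["customer_tier"] == "gold",
         "apply-discount"),
    Rule("blocked", lambda f: f.get("blocked", False), "reject"),
]

engine = RulesEngine(rules)
decisions = engine.decide(assemble_facts("o-1"))
```

Because `decide` has no side effects and no outside dependencies, the rule set can be tested with plain dictionaries, and the calling code stays in charge of acting on the decisions.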
We're using ILOG For DotNet and have a deployed pilot project.
Here's a summary of our immature Rules Architecture:
All data-access done outside of rules.
Rules are deployed the same way as code (source control, release process, yada yada).
Projects (services) that use Rules have a reference to ILOG.Rules.dll and new-up RuleEngines via a custom pooling class. RuleEngines are pooled because it is expensive to bind a RuleSet to a RuleEngine.
Almost all rules are written to expect Assert'd objects, rather than RuleFlow parameters.
Since the rules run in the same memory space, instances that are modified by the rules are the same instances in the program - which is immediate propagation of state.
Almost all rules are run via RuleFlow (even if it is a single RuleStep in the RuleFlow).
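The pooling idea from the list above can be sketched as follows. This is Python, not the custom .NET pooling class described, and `FakeRuleEngine` is a made-up stand-in whose constructor merely counts how often the (assumed expensive) bind step would run.

```python
import queue
from contextlib import contextmanager
from typing import Iterator, List


class FakeRuleEngine:
    """Stand-in for a real engine; counts how often the costly bind runs."""
    binds = 0

    def __init__(self, ruleset: str) -> None:
        FakeRuleEngine.binds += 1      # represents binding a rule set
        self.ruleset = ruleset
        self.facts: List[object] = []  # represents asserted objects

    def reset(self) -> None:
        self.facts.clear()


class EnginePool:
    """Fixed-size pool: engines are bound once and then reused."""

    def __init__(self, ruleset: str, size: int) -> None:
        self._idle: "queue.Queue[FakeRuleEngine]" = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(FakeRuleEngine(ruleset))

    @contextmanager
    def engine(self, timeout: float = 5.0) -> Iterator[FakeRuleEngine]:
        # Blocks while all engines are in use; raises queue.Empty on timeout.
        eng = self._idle.get(timeout=timeout)
        try:
            yield eng
        finally:
            eng.reset()                # clear state before returning it
            self._idle.put(eng)


pool = EnginePool("pricing", size=2)
for n in range(10):
    with pool.engine() as eng:
        eng.facts.append(n)            # use the engine, then hand it back
```

Ten uses trigger only two binds here, which is the saving the pool is meant to provide.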
We're looking at RuleExecutionServer as a hosting platform as well as RuleTeamServerForSharePoint as the host for rules source. Eventually, we will have Rules deployed to production outside of the code release process.
The primary obstacle in all our Rule endeavors is Modeling and Rule Authoring skillsets.
I don't have much to say on the "which server" question but I would urge you to develop decision services - callable services that use rules to make decisions but that do not change the state of the business. Letting the calling application/service/process decide what data changes to make as a result of calling the decision service and having the calling component actually initiate the action(s) suggested by the decision service makes it easier to use the decision service over and over again (across channels, processes etc). The cleaner and less tied to the rest of the infrastructure the decision service the more reusable and manageable it is going to be.
The discussion here on ebizQ might be worth reading in this regard.
In my experience with rules engines, we've applied a pretty basic set of practices to govern interaction with the rules engine. First of all, these have always been commercial rules engines (iLog, Corticon) and not open source (Drools), hence deploying locally to each of the app servers has never really been a viable option due to licensing costs. So we've always gone with the centralized model, albeit in two primary flavors:
Remote Execution of Web Service - In the same way you specified in your question, we make calls to SOAP-based services provided by the rules engine product. Within the web service realm, we have come upon several options: (1) "Boxcar" the requests, allowing the application to queue up rules processing requests and send them over in chunks as opposed to one-off messages; (2) Tune the threading and process options provided by the vendor. This includes separating decision services out by function and allocating each a W3WP and/or using web gardens. There is an awful lot of tweaking you can do with boxcars, threads, and processes, and getting the right mix is more a process of trial and error (and knowing your rulesets and data) than an exact science.
Remotely Call the Rules Engine in Process - A classic batch-style trick to avoid the overhead of serialization and de-serialization. Remotely make a call that fires up an in-process call to the rules engine. This can be done either on a schedule (e.g. batch) or on demand (i.e. "boxcars" of requests). Either way, a lot of the overhead of the service calls can be avoided by interacting directly with the process and the database. The downside of this approach is that you don't have IIS or your EJB/Servlet container managing the threads for you, so you have to do it yourself.
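The "boxcar" idea can be sketched as below. This is an illustrative Python sketch, not vendor code: the `Boxcar` class, its `size` parameter and the `fake_remote` function standing in for the remote rules service are all assumptions.

```python
from typing import Callable, List, Sequence


class Boxcar:
    """Buffers requests and sends them to the remote engine in chunks."""

    def __init__(self,
                 send_batch: Callable[[Sequence[dict]], List[str]],
                 size: int) -> None:
        self._send_batch = send_batch
        self._size = size
        self._pending: List[dict] = []
        self.results: List[str] = []

    def submit(self, request: dict) -> None:
        self._pending.append(request)
        if len(self._pending) >= self._size:
            self.flush()               # a full boxcar goes out immediately

    def flush(self) -> None:
        """Send whatever is buffered, even if the boxcar is not full."""
        if self._pending:
            batch, self._pending = self._pending, []
            self.results.extend(self._send_batch(batch))


calls: List[int] = []


def fake_remote(batch: Sequence[dict]) -> List[str]:
    """Stand-in for one round trip to the rules service."""
    calls.append(len(batch))
    return ["ok" for _ in batch]


boxcar = Boxcar(fake_remote, size=3)
for i in range(7):
    boxcar.submit({"id": i})
boxcar.flush()                         # send the partially filled last boxcar
```

Seven requests cost three round trips instead of seven; the right `size` is something to find by trial and error, as described above.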

Resources