Designing microservices in practice

Yet another question on how to or how not to split up a microservice :-D
The scenario:
What do we need?
Sending emails at different points in time within the workflow of an e-commerce order process. These mails will contain order information.
What do we have?
1 x persistence service which retrieves order information
Several services which subscribe to order events and process the relevant use case (e.g. confirmation, delivery, invoice)
1 x service which can be triggered to send a mail
What's the next step?
Designing the architectural component which transforms the order information so that it fits the data structure of the email rendering service.
The current options are
1. Having each processing service transform the existing order information for the mail template itself and send it to the mail rendering service.
2. Having each processing service call a new service which aggregates and transforms the order information and then calls the mail rendering service (sketched below).
Currently we're not sure yet whether the data structures for the mail templates will be mostly common or whether there will be differences.
So what do you think of these options in terms of cohesion, coupling and separation of concerns?
Do you need any more information? Any constructive thoughts are welcome!
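To make option 2 concrete, here is a rough sketch (all names are invented, nothing here is settled) of what the new transformation service could look like: one composer that each processing service calls, and that owns the mapping from order data to mail-template data.

```java
import java.util.Map;

// Hypothetical sketch of option 2: a dedicated mail-composition service that
// each processing service calls with just an order id and a template name.
record MailPayload(String template, String recipient, Map<String, String> fields) {}

interface OrderClient { Map<String, String> fetchOrder(String orderId); }   // faces the persistence service
interface MailRenderer { void send(MailPayload payload); }                  // faces the mail service

class MailComposer {
    private final OrderClient orders;
    private final MailRenderer renderer;

    MailComposer(OrderClient orders, MailRenderer renderer) {
        this.orders = orders;
        this.renderer = renderer;
    }

    // The single place that knows how order data maps onto a mail template.
    void compose(String orderId, String template) {
        Map<String, String> order = orders.fetchOrder(orderId);
        renderer.send(new MailPayload(template, order.get("email"),
                Map.of("orderNumber", order.get("number"))));
    }
}

public class MailComposerSketch {
    public static void main(String[] args) {
        // Stub the two downstream services; real code would make HTTP calls.
        var composer = new MailComposer(
                id -> Map.of("email", "a@b.c", "number", id),
                payload -> System.out.println("render " + payload));
        composer.compose("order-1", "confirmation");  // called by e.g. the delivery service
    }
}
```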

Your software architecture should reflect your organizational structure; see Conway's law.
Do you have multiple teams, and do you want to minimize dependencies between them?
Are "services" large and complex enough to justify them being separated into modules?
Does the size of the product justify having advanced devops in place to orchestrate the microservices?
Do you need the flexibility in terms of deployment and replaceability of individual "services"?
If you can answer yes to most of these questions, it would make sense to go for microservices. Otherwise, you are just making your life complicated.
Frankly, microservices require a lot of coordination overhead, which makes sense only if the product is large enough. Most (small) projects are just fine with a monolithic MVC architecture.

This is how I propose to proceed; it's how the architecture of one of my projects does all SMTP-related stuff.
The API receives an HTTP request.
It persists the data it needs to the database.
It offloads the long-running and memory-intensive processing to a mail builder.
Optionally, the mail builder builds attachment files (XLSX, PDF, etc.).
The mail builder uploads them to a file server.
The mail builder offloads generic SMTP sending to an SMTP service.
I suggest this layout because it allows you to scale the instances of each piece (the mail builder will have tons of instances) depending on the bottlenecks in your processing pipeline.
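To illustrate the hand-off, here is a minimal sketch in plain Java. An in-process BlockingQueue stands in for the broker (RabbitMQ, SQS, ...) you would use between real services, and all the names (MailJob, handleRequest, mailBuilderWorker) are hypothetical:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// A hypothetical "build mail" job handed from the API to the mail builder.
record MailJob(long orderId, String recipient) {}

public class MailPipelineSketch {
    // Stand-in for a real broker between the two services.
    private static final BlockingQueue<MailJob> queue = new LinkedBlockingQueue<>();

    // API side: persist first, then enqueue and return immediately.
    static void handleRequest(long orderId, String recipient) {
        // ... persist order data to the database here ...
        queue.add(new MailJob(orderId, recipient));   // offload; don't block the request
    }

    // Mail-builder side: a worker loop that can run in many instances.
    static void mailBuilderWorker() throws InterruptedException {
        while (true) {
            MailJob job = queue.take();               // blocks until a job arrives
            byte[] attachment = buildAttachment(job); // e.g. render a PDF or XLSX
            // ... upload the attachment to the file server,
            //     then hand the assembled message to the SMTP service ...
        }
    }

    private static byte[] buildAttachment(MailJob job) {
        return new byte[0]; // placeholder for the long-running, memory-intensive work
    }

    public static void main(String[] args) {
        new Thread(() -> { try { mailBuilderWorker(); } catch (InterruptedException ignored) {} }).start();
        handleRequest(42L, "customer@example.com");
    }
}
```

Because the queue decouples the two sides, you can start as many worker instances as your bottleneck requires without touching the API.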

Given that you have asked this question in microservices, I am assuming you are asking the question in reference to cloud native patterns.
I suggest you start by looking at microservice patterns. An excellent site for these patterns is https://microservices.io/patterns/microservices.html.
Your question does not have the necessary details to provide educated advice on which patterns are suitable and which are not. So, I suggest you look at these few patterns:
https://microservices.io/patterns/data/shared-database.html
https://microservices.io/patterns/data/database-per-service.html
Also take a look at the event sourcing pattern:
https://microservices.io/patterns/data/event-sourcing.html
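As a tiny illustration of the event sourcing pattern (all names hypothetical): state is kept as an append-only log of events, and the current state is rebuilt by replaying them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical order events; the event log is the source of truth.
sealed interface OrderEvent permits OrderPlaced, ItemAdded {}
record OrderPlaced(String orderId) implements OrderEvent {}
record ItemAdded(String orderId, String sku, int qty) implements OrderEvent {}

public class EventSourcingSketch {
    private final List<OrderEvent> eventStore = new ArrayList<>(); // append-only

    void append(OrderEvent e) { eventStore.add(e); }               // never update in place

    // Rebuild current state by replaying all events for one order.
    int itemCount(String orderId) {
        return eventStore.stream()
                .filter(e -> e instanceof ItemAdded a && a.orderId().equals(orderId))
                .mapToInt(e -> ((ItemAdded) e).qty())
                .sum();
    }

    public static void main(String[] args) {
        var store = new EventSourcingSketch();
        store.append(new OrderPlaced("o-1"));
        store.append(new ItemAdded("o-1", "sku-9", 2));
        store.append(new ItemAdded("o-1", "sku-9", 1));
        System.out.println(store.itemCount("o-1")); // 3
    }
}
```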
Hope this helps.

Related

Design guides for Event Sourced microservices

I am wondering about the best way to structure your microservices. In the past, the team I worked with used Axon Framework and PostgreSQL; each microservice had its own event store in the PostgreSQL database, and we built the communication between them using REST.
I am thinking it would be smarter to have all microservices talk to the same event store, as we would be able to share events faster instead of rebuilding the communication lines using REST.
The questions that follow from the backstory are:
What is the best practice for having an event store?
Would each service have its own? Would they share the same event store?
Where would I find information to inspire me and gather more answers? Searching the internet for best practices on how to structure the event store feels like searching for a needle in a haystack.
Bear in mind, the question is in no way aimed at Axon Framework specifically, but at the general idea of building scalable and good code, where each application works with its own event store for the write model and read models.
Thank you for reading and I wish you all the best
-- Me
I'd add a slightly different notion to Tore's response, although the mainline is identical to what I'm sharing here. So, I don't aim to overrule Tore, just hoping to provide additional insight.
If the (micro)services belong to the same Bounded Context, then they're allowed to "learn about each other's language."
This language thus includes the events these applications publish and store.
Whenever there's communication required between different Bounded Contexts, you'd separate the stores, as one context shouldn't be bothered by the specifics of another context.
Hence it is beneficial to deduce what services belong to which Bounded Context since that would dictate the required separation.
Axon aims to support this by allowing multiple contexts with the Axon Server, as you can read here.
It simply allows the registration of applications to specific contexts, within which it will completely separate all message streams (so commands, events, and queries) and the Event Store.
You can also set this up from scratch yourself, of course. Tore's recommendation of Kafka is what's used quite broadly for Event Streaming needs between applications. Honestly, any broadcast type of infrastructure suits event distribution, as that's how events are typically propagated.
You want to have one event store per service, just as you would want one relational database per service in a non-event-sourced system.
Sharing a database/event store between services creates coupling, and we have all learned the hard way that this is an anti-pattern.
If you want to use an event log to share events across services, then Kafka is a popular choice.
It is important to remember that you only do event sourcing within a service's bounded context.
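A minimal sketch of that split, assuming the kafka-clients library on the classpath and a hypothetical topic name: the service keeps appending to its own private event store as before, and additionally publishes the events that other contexts care about to Kafka.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Each service keeps its private event store; only the integration events
// it wants to share across contexts are published to Kafka.
public class EventPublisherSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");        // assumed broker address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

        try (var producer = new KafkaProducer<String, String>(props)) {
            // Hypothetical topic and payload; keying by aggregate id keeps all
            // events of one aggregate in the same partition, in order.
            producer.send(new ProducerRecord<>(
                    "orders.integration-events",                  // hypothetical topic
                    "order-42",                                   // aggregate id as key
                    "{\"type\":\"OrderShipped\",\"orderId\":\"order-42\"}"));
        }
    }
}
```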

Should event driven architecture be targeted for all data & analytics platforms?

For example,
You have an IT estate with a mix of batch and real-time data sources from multiple systems, e.g. ERP, project management, asset management, website, monitoring, etc.
The aim is to integrate the data sources into a cloud environment (agnostic).
There is a need for reporting and analytics on combinations of all data sources.
Inevitably, some source systems are not capable of streaming, hence batch loading is required.
There are potential use cases for triggering functionality/changes/updates based on the ingested data.
Given the steer to create a future-proofed platform, how would you design it architecturally?
It's a very open-ended question, but there are some good principles you can adopt to help steer you in the right direction:
Avoid point-to-point integration and get everything going through a few common points, ideally one. Using an API gateway can be a good place to start; the big players (Azure, AWS, GCP) all have their own options, plus there are lots of decent independent ones like Tyk or Kong.
Batches and event-streams are totally different, but even then you can still potentially route them all through the gateway so that you get the centralised observability (reporting, analytics, alerting, etc).
Use standards-based API specifications where possible. A good REST API, based on a proper resource model, is a non-trivial undertaking, so it may not fit if you are dealing with lots of disparate legacy integrations. If you do adopt REST, use OpenAPI to specify the APIs. Using this standard not only makes things easier for consumers, but also gets you better tooling, as many design, build and test tools support OpenAPI. There's also AsyncAPI for event/async APIs.
Do some architecture. Moving sh*t to the cloud doesn't remove the sh*t; it just moves it to the cloud. Don't recreate old problems in a new place.
Work out the logical components in your new solution: what does each of them do (what's its reason to exist)? Don't forget ancillary components like API catalogues, etc.
Think about layering the integration (usually depending on how they will be consumed and what role they need to play, e.g. system interface, orchestration, experience APIs, etc).
Want to handle data in a consistent way regardless of source (your 'agnostic' comment)? You'll need to think through how data is ingested and processed. This might lead you into more data/ETL-centric considerations rather than integration ones.
Co-design. Is the integration mainly data coming in or going out? Is the integration with 3rd parties or strictly internal?
If you are designing for external / 3rd party consumers then a co-design process is advised, since you're essentially designing the API for them.
If the API's are for internal use, consider designing them for external use so that when/if you decide to do that later it's not so hard.
Take a step back:
Continually ask yourselves "what problem are we trying to solve?". Usually, a technology initiative is successful if there's a well-understood reason for doing it that has solid buy-in from the business (non-IT).
Who wants the reporting, and why - what problem are they trying to solve?
As you mentioned, it's an IT estate, i.e. an enterprise-level solution mixing batch and real-time sources, so first you have to identify the end goal of this migration. You can think about refactoring applications. If you are trying to make it event-driven, then assess the refactoring effort and cost. Separation of responsibility is the key factor for refactoring and migration.
If you are thinking about future-proofing your solution, then consider the cloud for storing and processing your data. It won't necessarily be cheap, but a mix of cloud and on-prem could be the way. There are services from cloud providers to move your data at minimal cost, and cloud-native solutions exist for performing analysis on it. A database migration service in AWS or Azure can move the data and then capture ongoing changes, so you can keep using your on-prem DB and apps while performing reporting analysis in the cloud. That will ease the load on your transactional DB. Most data syncs from on-prem to the cloud are near real time.

Microservice Aggregator Service BFF

Let's say I have 22 microservices, developed locally with Docker.
The client wants to get product model data, which combines data from 3 different services, and aggregate them.
Should I use an aggregator gateway API, or should the SPA fetch from each service separately? Does an aggregator service couple the services?
These microservice patterns always come with trade-offs. You need to consider more than just coupling when going with the aggregator pattern (backend for frontend).
The following are some of the points you need to think about before going with this pattern.
The latency problem. If you want this implementation to work well without latency problems, your services and the aggregator should be in the same location or the same data center. Avoid third-party calls from the aggregator.
This can introduce a single point of failure. Make sure you've designed it in such a way that the service is highly available.
Implement a resilient design and timeouts, since this aggregator is calling other services and getting data. If one or more service calls take too long, it should time out and return a partial set of data; consider how your application will handle this scenario (see the sketch below).
Monitor your aggregator and its child service calls. Implement distributed tracing using correlation IDs to track each call.
Ensure the aggregator has adequate performance to handle the load and can be scaled to meet your anticipated growth.
These are the best practices I can suggest; you are the best person to decide, based on your system requirements and these points.
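As a sketch of the timeout-and-partial-data point, using only the JDK's HttpClient; the service URLs and the response shape are hypothetical:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.concurrent.CompletableFuture;

// Hypothetical aggregator endpoint: fan out to three services, time out
// slow calls, and return whatever arrived (partial data) instead of failing.
public class AggregatorSketch {
    private static final HttpClient client = HttpClient.newHttpClient();

    static CompletableFuture<String> fetch(String url) {
        HttpRequest req = HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(2))          // per-call timeout
                .build();
        return client.sendAsync(req, HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body)
                .exceptionally(ex -> null);               // degrade to "missing"
    }

    public static void main(String[] args) {
        // Hypothetical service URLs; in reality these come from service discovery.
        var product = fetch("http://catalog/api/product/42");
        var price   = fetch("http://pricing/api/price/42");
        var stock   = fetch("http://inventory/api/stock/42");

        CompletableFuture.allOf(product, price, stock).join();

        // Assemble a partial response: null fields mean "that service timed out".
        System.out.printf("{\"product\":%s,\"price\":%s,\"stock\":%s}%n",
                product.join(), price.join(), stock.join());
    }
}
```

In a real system you would wrap each call with retries, circuit breakers, and the correlation-ID tracing mentioned above.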
There are some compelling advantages to using a BfF service as an orchestration layer that aggregates calls to various backend data services.
It will reduce the complexity in the data access areas of your SPA.
It can also reduce load times.
Over time, your frontend devs will be less likely to get blocked on the backend devs, assuming the BfF is maintained by the frontend devs.
Take a look at this article on Consistency, Coupling, and Complexity at the Edge that goes into more detail on this and proposes some best practices such as GraphQL vs REST.

System API in MuleSoft

I have a requirement to persist some data in a single table. The data comes from the UI. Do I need to write just the system API and persist the data, or do I need to write both a process API and a system API? I don't see a use for a process API in this case. Please advise: is it always necessary to access a system API through a process API, or can the system API be invoked without a process API as well?
I would recommend a fine-grained approach to this. We should follow it through the experience layer even though we do not have much customization of the data.
In short: an experience-layer API directly calling the system-layer API (if there is no orchestration/data conversion/formatting needed).
Why do we need a system API and an experience API? A couple of points:
The system API should be closely attached to the underlying system, so that if the system changes in the future, the change does not impact any of the clients.
Secondly, an upper layer gives us the flexibility to apply different SLAs, policies, logging and lots more to different clients. Even if you have a single client right now, it's better to architect for the future. Reuse is the key advantage of these APIs.
Please check Pattern 2 in this document
That is a question for the enterprise architect in your organisation. In this case, the process API would probably be a simple proxy for the system API, but that might not always be the case in the future. Also, it is sometimes useful to follow a standard architectural pattern even if it creates some spurious complexity in the implementation. As always, there are design trade-offs, and the answer will depend on factors that cannot be known by people outside of your organisation.

Sharing huge data between microservices

I am designing a review analysis platform with a microservices architecture.
The application works like this:
All product reviews are retrieved from ecommerce-site-a (site-a) as an Excel file.
Reviews are uploaded to the system via the Excel file.
An analysis agent can list all reviews, edit some of them, and delete or approve them.
The analysis agent can export all reviews for site-a.
Automated regexp-based checks are applied to each review on upload and on editing.
I have 3 microservices.
Reviews: responsible for review CRUD operations, plus operations like approve/reject.
Validations: responsible for defining and applying validation rules to reviews.
Export/Import: the export service exports huge files given a site name (like site-a).
The problem is that at some point the validation service needs to get all reviews for site-a, apply the validation rules, and generate errors if there are any. I know that sharing database schemas and entities breaks the microservices architecture.
One possible solution is:
Whenever the validation service requires reviews for a site, it calls the gateway; the gateway routes the request to the Reviews service and returns the response.
Two possible drawbacks of this approach are:
The validation service knows about the gateway. Does that introduce a dependency?
If I have 1B reviews for a site, getting all of them via a REST request may be a problem (or not; I could make paginated requests from the validation service to the gateway).
So what is the best practice for sharing huge data between microservices without
sharing entities
and duplicating data?
I have read a lot about messaging queues, but I think in my case it is not a good idea to use a messaging queue to share gigabytes of data.
Edit 1: Instead of sharing entities, could using data stores with a REST API be a solution? Assuming I use MongoDB, instead of sharing my entity object between microservices, I could use a REST interface over Mongo (http://restheart.org/) and query data whenever needed.
Your problem here is not "sharing huge data", but rather the boundaries along which you choose to separate your microservices.
I can tell from your requirements that the 3 microservices you chose (Reviews, Validations, Import/Export) actually operate in the same context and business domain, which is Reviews.
I would encourage you to reconsider your design decision and treat Reviews as a single microservice that handles all review operations and logic as a black box.
I assume that reviews are independent from each other and that validating a review therefore requires only that review, and no other reviews.
You don't want to share entities, which rules out things like shared databases, Hadoop clusters, or data stores like Redis. You also don't want to duplicate data, which rules out plain file copies or trigger-based replication at the database level.
In summary, I'd say your aim should be a stream. Let the Validator request everything about site-a from Reviews, not in one bulk transfer, but as a stream of single reviews or small batches.
The Validator can then process the reviews one after the other, with constant memory and processor consumption. To gain performance, you can run multiple instances of the Validator that pull different, disjoint pieces of the stream at the same time. Similarly, you can run multiple instances of the Reviews microservice if one alone cannot answer the pulls fast enough.
The Validator does not persist this stream; it produces only the errors and stores or sends them somewhere. This should fulfill your no-sharing, no-duplication requirements.
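A minimal sketch of the Validator side of such a stream, using plain JDK HTTP calls against a hypothetical paginated endpoint on the Reviews service; the end-of-stream convention (an empty body) is also an assumption:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical Validator loop: pull reviews for one site page by page,
// validate each page, and keep only the errors. Memory use stays constant
// no matter how many reviews site-a has.
public class StreamingValidatorSketch {
    private static final HttpClient client = HttpClient.newHttpClient();

    public static void main(String[] args) throws Exception {
        int page = 0, pageSize = 500;
        while (true) {
            // Hypothetical paginated endpoint on the Reviews service.
            var uri = URI.create("http://reviews/api/sites/site-a/reviews"
                    + "?page=" + page + "&size=" + pageSize);
            var resp = client.send(HttpRequest.newBuilder(uri).build(),
                    HttpResponse.BodyHandlers.ofString());

            if (resp.body().isBlank()) break;   // assumed convention: no more pages

            validate(resp.body());              // apply the regexp rules, emit errors
            page++;                             // next page; the previous one is discarded
        }
    }

    private static void validate(String pageOfReviews) {
        // ... run the validation rules and store/send only the errors ...
    }
}
```

Running several instances of this loop over disjoint page ranges gives you the parallel, disjoint pulls described above.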
