We are planning to migrate from a monolith to a microservices-based architecture. I now own the responsibility of taking a module out of the monolith.
Existing Monolith:
1) Code is very tightly coupled.
2) APIs are called recursively with different parameters.
3) Some of the calls within the module I am planning to extract go out to a system that takes approximately 9 minutes to complete. Unfortunately, that call is synchronous.
Points to note:
1) I am starting with the migration of a single API, which is very important and is not performing well.
2) This API makes parallel calls to another system to perform a bunch of tasks. All the calls are blocking and time-consuming (consider the average response time to be 5-6 minutes).
Moving to a microservice-based architecture: two approaches come to mind for moving the aforementioned API from the monolith into a separate microservice while also solving the problem of threads blocked by these long calls.
a) Moving in phases (sketched in code below):
- Create a separate module.
- In this module, provide an API to push events to Kafka; another module will in turn process the request and push the response back to Kafka.
- The monolith, for now, will call the above-mentioned API to push events to Kafka.
- The new module will in turn call back the monolith when the task completes (response received on a separate topic in Kafka).
- Once the monolith has responses for all the tasks, it will trigger some post-processing activity.
Advantage:
1) It solves the problem of synchronous blocking calls.
Disadvantage:
1) Changes are required in the monolith, which could introduce bugs.
2) No fallback is available if a bug does get introduced.
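To make the phased option concrete, here is a minimal sketch of the monolith-side bridge, using the plain Kafka Java client. The topic names, class name, and postProcess hook are all hypothetical placeholders, not taken from any existing codebase:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TaskEventBridge {

    // Hypothetical topic names; adjust to your own conventions.
    private static final String REQUEST_TOPIC  = "task-requests";
    private static final String RESPONSE_TOPIC = "task-responses";

    private final KafkaProducer<String, String> producer;

    // producerProps/consumerProps need bootstrap.servers plus String (de)serializers.
    public TaskEventBridge(Properties producerProps) {
        this.producer = new KafkaProducer<>(producerProps);
    }

    /** Called by the monolith instead of the old blocking call; returns immediately. */
    public void publishTask(String taskId, String payload) {
        // Keyed by taskId so the response can be correlated back to the task.
        producer.send(new ProducerRecord<>(REQUEST_TOPIC, taskId, payload));
    }

    /** Runs on a dedicated thread; fires post-processing as responses arrive. */
    public void listenForResponses(Properties consumerProps) {
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(List.of(RESPONSE_TOPIC));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    postProcess(record.key(), record.value());
                }
            }
        }
    }

    private void postProcess(String taskId, String result) {
        // Track completed tasks here and trigger the monolith's post-processing
        // once responses for all tasks in the batch have arrived.
    }
}
```

The monolith's request thread is freed as soon as publishTask returns; completion is detected on the consumer thread.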
b) Move the API to the microservice at once:
Initially the microservice will share a common data source with the monolith, and the problem of blocking calls is solved by introducing Kafka between the new microservice and the module that takes so long to process requests (see the worker sketch below).
Advantage:
1) A fallback is available in the monolith.
Disadvantage:
1) Initially the data source is shared between the two systems.
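In either approach, the component that sits between Kafka and the slow system could look roughly like the hypothetical worker below. One non-obvious detail: with a ~9 minute synchronous call inside the poll loop, the consumer's max.poll.interval.ms (5 minutes by default) must be raised, or the broker will assume the worker died and rebalance:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SlowSystemWorker {
    public static void main(String[] args) {
        Properties cp = new Properties();
        cp.put("bootstrap.servers", "localhost:9092");
        cp.put("group.id", "slow-system-workers");
        cp.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        cp.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        cp.put("max.poll.interval.ms", "900000"); // 15 min: must exceed the slowest call

        Properties pp = new Properties();
        pp.put("bootstrap.servers", "localhost:9092");
        pp.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        pp.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(cp);
             KafkaProducer<String, String> producer = new KafkaProducer<>(pp)) {
            consumer.subscribe(List.of("task-requests")); // hypothetical topic
            while (true) {
                for (ConsumerRecord<String, String> rec : consumer.poll(Duration.ofSeconds(10))) {
                    // The long synchronous call now blocks only this worker process,
                    // not a request thread in the monolith or microservice.
                    String result = callSlowSystem(rec.value());
                    producer.send(new ProducerRecord<>("task-responses", rec.key(), result));
                }
            }
        }
    }

    private static String callSlowSystem(String payload) {
        return "done"; // placeholder for the real ~9 minute downstream call
    }
}
```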
What is the best approach for these kinds of complex tasks?
You have to take care of several things.
First
Going to microservices will usually be slower than a monolith (90% of the time) because you introduce network latency. Never forget that when you go down this path.
Second
You ask whether Kafka is a good way to go. In most cases I would say yes, but you mentioned that today the process is synchronous. If it is synchronous for transactional reasons, a message broker won't solve that, because you would be turning a strongly consistent system into an eventually consistent one. https://en.wikipedia.org/wiki/Eventual_consistency
I am not saying it is a bad solution, only that it changes your workflow and may impact some business rules.
As a solution I offer this:
1 - Break the seams in your monolith by introducing functional keys and API composition inside the monolith (Sam Newman's book can help here).
2 - Introduce eventual consistency inside the monolith to test whether it fits the purpose (see the sketch at the end of this answer). It will be easier to roll back if it does not.
Now you have two possibilities:
If the second step went well, go ahead and move the code of the service into a microservice outside the monolith.
If the second step did not fit, consider doing the risky part in a dedicated service, or use distributed transactions (be careful with this solution; it can be hard to manage).
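To illustrate step 2, here is a minimal sketch (assuming a Spring monolith, since the question does not say) of decoupling a call in-process first: the caller gets an acknowledgement immediately and the slow work happens off the request thread. If this eventually consistent shape breaks a business rule, you find out while rolling back is still trivial:

```java
import org.springframework.context.ApplicationEventPublisher;
import org.springframework.context.event.EventListener;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

// Hypothetical event standing in for what used to be a direct method call.
record TaskRequested(String taskId, String payload) {}

@Service
class TaskFacade {
    private final ApplicationEventPublisher events;

    TaskFacade(ApplicationEventPublisher events) {
        this.events = events;
    }

    /** Callers now return immediately; the result arrives later. */
    public void submit(String taskId, String payload) {
        events.publishEvent(new TaskRequested(taskId, payload));
    }
}

@Service
class SlowTaskHandler {
    @Async // requires @EnableAsync on a configuration class
    @EventListener
    public void on(TaskRequested event) {
        // The former synchronous 5-9 minute call runs here, off the request thread.
        // Once this shape is validated, this handler is what moves behind Kafka.
    }
}
```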
I believe the best approach would be option 1: moving in phases. However, it is possible to do it while keeping a fallback strategy: keep a version of the untouched backend to serve as a backup if your new service encounters issues.
The approach is described in more detail in the article Low risk monolith to microservice evolution. It explains the implementation and the thinking behind why a phased approach carries lower risk. The need to change the backend would still be present, but hopefully mitigated through unit testing.
Related
I am reading various posts and books on microservice architecture in search of an answer to my question, which relates to decomposition strategies. The question is: should we create a new microservice specifically to handle a batch job?
In my context, the batch job reads data from the database and makes REST calls to an external system if the data is in a particular state. Additionally, the batch job is supposed to run only once a day.
My questions related to this are
Is it an industry norm/practice that a batch job should live in its own microservice, given that a batch job consumes resources that can hinder incoming traffic and increase latency?
Does running a batch job affect the latency of the APIs exposed to clients?
I would say yes, it makes sense. Usually batch jobs have a very different development lifecycle and deployment frequency.
I've done something similar myself and I'm totally sure it's worth it.
It would then be possible to spin up an instance to run the job once a day, which can save money in cloud environments.
Latency: it depends on that other system. You might want to throttle your requests to the other system so you don't put it down under heavy load.
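A simple way to do that throttling, assuming the Guava library is acceptable (the 5 requests/second figure is a made-up example, not derived from your system):

```java
import com.google.common.util.concurrent.RateLimiter;

public class ThrottledBatchClient {
    // Hypothetical cap on outbound REST calls to the external system.
    private final RateLimiter limiter = RateLimiter.create(5.0); // permits per second

    public void process(Iterable<String> eligibleRecords) {
        for (String record : eligibleRecords) {
            limiter.acquire();          // blocks until a permit is available
            callExternalSystem(record); // the batch job's REST call
        }
    }

    private void callExternalSystem(String record) {
        // placeholder for the real HTTP call
    }
}
```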
Reading Spring in Action, 5th edition, chapter 11, last paragraph in section 11.1.2:
By accepting a Mono as input, the method is invoked immediately
without waiting for the Taco to be resolved from the request body. And
because the repository is also reactive, it’ll accept a Mono and
immediately return a Flux, from which you call next() and return
the resulting Mono … all before the request is even processed!
How can the service return immediately, before the request is even processed? Isn't that counter-intuitive?
I mean, shouldn't the request be processed first, before returning a response?
The book has everything you need. It is a well-written book; just make sure to read carefully while actually running the code (download the source code from Manning). It will help you understand better.
From the book (https://livebook.manning.com/book/spring-in-action-fifth-edition/chapter-11/v-7/6):
11.1 Working with Spring WebFlux
Typical Servlet-based web frameworks, such as Spring MVC, are blocking
and multithreaded in nature, using a single thread per connection. As
requests are handled, a worker thread is pulled from a thread pool to
process the request. Meanwhile, the request thread is blocked until it
is notified by the worker thread that it is finished.
Consequently, blocking web frameworks do not scale effectively under
heavy request volume. Latency in slow worker threads makes things even
worse, because it will take longer for the worker thread to be
returned to the pool to be ready to handle another request.
In some use cases, this arrangement is perfectly acceptable. In fact,
this is largely how most web applications have been developed for well
over a decade. But times are changing and the clients of these web
applications have grown from people occasionally viewing websites on
the web browser to people frequently consuming content and using
applications that consume APIs. And these days the so-called "Internet
of Things" where humans aren’t even involved while cars, jet engines,
and other non-traditional clients are constantly exchanging data with
our APIs. With an increasing number of clients consuming our web
applications, scalability is more important than ever.
Asynchronous web frameworks, in contrast, achieve higher scalability
with fewer threads—generally one per CPU core. By applying a technique
known as event looping (as illustrated in Figure 11.1), these
frameworks are able to handle many requests per thread, making the
per-connection cost much cheaper.
In an event loop, everything is handled as an event, including
requests and callbacks from intensive operations (such as database and
network operations). When a costly operation is needed, the event loop
registers a callback for that operation to be performed in parallel
while the event loop moves on to handle other events. When the
operation is complete, the completion is treated as an event by the
event loop the same as requests. As a result, asynchronous web
frameworks are able to scale better under heavy request volume with
fewer threads (and thus reduced overhead for thread management).
Read the rest of this section and it will clarify any other concern.
Also, check Reactor https://github.com/reactor/reactor-core
For a complete example if you are still having difficulties https://www.baeldung.com/spring-webflux
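To make this concrete, here is a sketch of the controller the quoted passage describes (the Taco and repository types are reduced to minimal stand-ins; the book's real sample uses a reactive Spring Data repository). The key point for your question: the handler method returns immediately, but what it returns is a description of work. WebFlux subscribes to that Mono, and the HTTP response is written only when the pipeline completes, so the client still waits for the result; it is the server thread that is freed, not the client.

```java
import org.springframework.http.HttpStatus;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseStatus;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

// Minimal stand-ins for the book's types, just to keep the sketch self-contained.
record Taco(String name) {}
interface TacoRepository { Flux<Taco> saveAll(Mono<Taco> tacos); }

@RestController
@RequestMapping("/tacos")
class TacoController {
    private final TacoRepository tacoRepo;

    TacoController(TacoRepository tacoRepo) {
        this.tacoRepo = tacoRepo;
    }

    @PostMapping(consumes = "application/json")
    @ResponseStatus(HttpStatus.CREATED)
    public Mono<Taco> postTaco(@RequestBody Mono<Taco> tacoMono) {
        // No blocking here: this line only assembles a reactive pipeline.
        // The save actually runs when the framework subscribes, and the
        // response goes out when the resulting Mono emits.
        return tacoRepo.saveAll(tacoMono).next();
    }
}
```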
We are currently designing a web-service-based process in which we will use the web-service invoke and receive steps to communicate with a Microsoft BizTalk server.
Our main concern is that a task on the receive step can wait for a long time (up to one week) until BizTalk responds, which (we think) would incur a performance penalty on the workflow system as it polls for the response.
My question is: are there any known performance considerations for the receive step, especially when work items are left waiting for extended periods?
No, I don't think there will be any undue "overhead". Yes, internally the process engine "polls" for just about anything, including invoking components and executing timers. But from a system perspective, you're just waiting for a request.
It sounds like a "receive" step is exactly the right solution here.
I'm trying to get to grips with service fabric and I'm struggling a little bit. Some questions:
Are all Service Fabric service instances single-threaded? I created a stateless Web API with one instance and a method that did a Task.Delay and then returned a string. Two requests to this service were served one after the other, not concurrently. Am I right in thinking, then, that the number of concurrent requests that can be served is purely a function of the service instance count in the application manifest? Edit: Thinking about this, it is probably to do with the setup of the OWIN Web API. Could it be blocking by session? I assumed there is no session by default?
I have long-running operations that I need to perform in Service Fabric (they can take several hours). Is there a recommended pattern I can use for this in Service Fabric? These are currently handled using a storage queue that triggers a WebJob. Maybe something with Reliable Queues and a RunAsync loop?
It seems you have handled the first part, so I will comment on the second part: long-running operations.
Long-running operations and workflows were being handled long before Service Fabric came about. For this reason, we can stand on the shoulders of giants by looking at the design patterns that software experts have been using for decades, for example the famous and all-inclusive Process Manager. Mind you, this pattern is sometimes overkill; if that is the case for you, check out the rest of the related patterns in the Enterprise Integration Patterns book (by Gregor Hohpe).
As for reliable collections: those are an implementation detail when choosing a data structure to support the chosen design pattern.
I hope that helps
Regarding your second point: it really depends on the nature of your long-running task.
Is it the kind of workload that runs on an isolated thread, depends on local OS/VM-level resources, and eventually comes back with a result (A)? Or is it the kind of long-running task that goes through stages and builds up its result through a series of persisted state changes (B)?
From what I understand of Service Fabric, it isn't really designed for running long-running workloads of type (A); it is geared more towards writing horizontally scalable, highly available systems.
If you are absolutely keen on using Service Fabric (and your workload is more like B than A), I would definitely find a way to break those long-running tasks into pieces that can be processed in parallel across the cluster. Even then, there are probably more appropriate technologies designed for this, such as Azure Batch.
P.S. If you are going to put a long-running process in the RunAsync method, design the workload so that it is interruptible and its state is persisted in a way that allows it to resume on another node in the cluster (a sketch of this pattern follows at the end of this answer):
In a stateful service, only the primary replica has write access to
state and thus is generally when the service is performing actual
work. The RunAsync method in a stateful service is executed only when
the stateful service replica is primary. The RunAsync method is
cancelled when a primary replica's role changes away from primary, as
well as during the close and abort events.
P.P.S. Long-running operations are the devil when you're trying to write scalable systems. Tackle that now and save yourself the future pain if possible.
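To illustrate the pattern from the first P.S., here is a minimal sketch in plain Java (deliberately not Service Fabric's .NET API; the CheckpointStore is a hypothetical stand-in for a reliable collection or any durable store). The task does small repeatable units of work, persists progress after each one, honours cancellation, and can resume from the last checkpoint on another node:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ResumableJob {

    /** Hypothetical durable store (reliable collection, table storage, ...). */
    public interface CheckpointStore {
        int loadProgress(String jobId);              // 0 if the job never ran
        void saveProgress(String jobId, int step);
    }

    private final CheckpointStore store;
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    public ResumableJob(CheckpointStore store) {
        this.store = store;
    }

    /** Called when the replica loses primacy or the service is closing. */
    public void cancel() {
        cancelled.set(true);
    }

    public void run(String jobId, int totalSteps) {
        int step = store.loadProgress(jobId);        // resume where a previous node stopped
        while (step < totalSteps && !cancelled.get()) {
            doStep(step);                            // one small, repeatable unit of work
            store.saveProgress(jobId, ++step);       // persist after every unit
        }
    }

    private void doStep(int step) {
        // one slice of the multi-hour workload
    }
}
```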
To the first point: this was purely a client issue. Chrome saw my requests as identical and so delayed the second request until the first got a response. Varying the parameters of the requests allowed them to be served concurrently.
I'm working on a web application front end to a legacy system that involves a lot of CPU-bound background processing. The application is also stateful on the server side, and the domain objects need to be held in memory for the entire session as the user operates on them via the web-based interface. Think of it as a web UI front end to Photoshop, where each filter can take 20-30 seconds to execute on the server side, so the app still has to interact with the user in real time while they wait.
The main problem is that each server instance can only support around 4-8 instances of each "workspace" at once, and I need to support a few hundred concurrent users. I'm going to build this on Amazon EC2 to make use of the auto-scaling functionality. To summarize, the system is:
A web application front end to a legacy backend system
Tasks performed are CPU bound
Stateful: most calls will be some sort of RPC, and the user will perform multiple actions that interact with the stateful objects held in server-side memory
Most tasks are semi-realtime: they execute for 20-30 seconds and then return results to the user in the same session
Uses Amazon AWS auto scaling
I'm wondering what is the best way to make a system like this distributed.
Obviously I will need a web server to interact with the browser, and then send the CPU-bound tasks from the web server to a bunch of dedicated servers that do the background processing. The question is how best to hook the two tiers together for my specific needs.
I've been looking at message queue systems such as RabbitMQ, but these seem to be geared towards one-shot tasks where any worker node can simply grab a job from a queue, execute it, and forget the state. My needs are a little different, since there can be multiple tasks that need to be "sticky": for example, if step 1 is started on node 1, then step 2 for the same workspace has to go to the same worker process.
Another problem I see is that most worker queue systems seem to be geared towards background tasks that can be processed at any time, rather than a system that has to provide the user feedback I'm dealing with.
My question is: is there an off-the-shelf solution for something like this that will allow me to easily build a system that can scale? Would love to hear your thoughts.
RabbitMQ has an RPC tutorial. I haven't used this pattern in particular, but I am running RabbitMQ on a couple of nodes and it can handle hundreds of connections and millions of messages. With a little monitoring work you can detect when there is more work to do than you have consumers for. Messages can also time out, so queues won't back up too much. To scale out capacity you can create multiple RabbitMQ nodes/clusters. You could run multiple rounds of RPC so that after the first response you include the information required to route the second message to the correct destination.
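For reference, the client side of that RPC pattern looks roughly like this with the RabbitMQ Java client (following the shape of the official tutorial; the queue name and payload here are hypothetical):

```java
import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import java.nio.charset.StandardCharsets;
import java.util.UUID;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class RpcClient {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            // Exclusive, auto-delete reply queue private to this client.
            String replyQueue = channel.queueDeclare().getQueue();
            String corrId = UUID.randomUUID().toString();
            AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
                    .correlationId(corrId)
                    .replyTo(replyQueue)
                    .build();
            // "filter_jobs" is a hypothetical work queue the backend workers consume.
            channel.basicPublish("", "filter_jobs", props,
                    "apply-filter:blur".getBytes(StandardCharsets.UTF_8));

            BlockingQueue<String> response = new ArrayBlockingQueue<>(1);
            channel.basicConsume(replyQueue, true, (tag, delivery) -> {
                // Match on correlation ID so stale replies are ignored.
                if (corrId.equals(delivery.getProperties().getCorrelationId())) {
                    response.offer(new String(delivery.getBody(), StandardCharsets.UTF_8));
                }
            }, tag -> {});
            System.out.println("Result: " + response.take());
        }
    }
}
```

A worker consuming "filter_jobs" replies to the replyTo queue, copying the correlation ID; stickiness could be layered on by routing follow-up messages using the worker identity carried in the first reply, as described above.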
0MQ has this as a basic pattern and will fan out work as needed. I've only played with it, but it is simpler to code and possibly simpler to maintain (it doesn't need a broker, though devices can provide one). It may not handle stickiness by default, but it should be possible to write your own routing layer to handle that.
Don't discount HTTP for this either. When you want request/reply, strict per-backend-node throughput, and something that scales well, HTTP is well supported. With AWS you can easily put an ELB in front of an auto-scaling group to route from front end to back end. ELB supports sticky sessions as well.
I'm a big fan of RabbitMQ, but if this is the whole scope then HTTP would work nicely and have fewer moving parts in AWS than the other solutions.