TL;DR: which is the better pattern? Using Mutiny + imperative RESTEasy, or just using RESTEasy Reactive?
My understanding is that Mutiny allows me to pass Quarkus a longer-running action and have it handle the specifics of how that code gets run in context. Does using RESTEasy Reactive provide equal or greater benefit than Mutiny + imperative? If it is equal or better from a thread-handling perspective, then Reactive would be great, as it requires less code to maintain (creating Unis, etc.). However, if passing a Uni back is significantly better, then it might make sense to use that.
https://quarkus.io/guides/getting-started-reactive#imperative-vs-reactive-a-question-of-threads
Mutiny + imperative:
@GET
@Path("/getState")
@Produces(MediaType.APPLICATION_JSON)
public Uni<State> getState() throws InterruptedException {
    return this.serialService.getStateUni();
}
Reactive:
@GET
@Path("/getState")
@Produces(MediaType.APPLICATION_JSON)
public State getState() throws InterruptedException {
    return this.serialService.getState();
}
As always, it depends.
First, I recommend you read https://quarkus.io/blog/resteasy-reactive-smart-dispatch/, which explains the difference between the two approaches.
It's not about longer actions (async != long-running); it's about dispatching strategies.
When using RESTEasy Reactive, it uses the I/O thread (event loop) to process the request and switches to a worker thread only if the signature of the endpoint requires it. Using the I/O thread allows better concurrency (as you do not use worker threads), reduces memory usage (because you do not need to create the worker thread), and also tends to make the response time lower (as you save a few context switches).
Quarkus detects whether your method can be called on the I/O thread or not. The heuristics are based on the signature of the methods (including annotations). To reuse the example from the question (see the sketch after this list):
a method returning a State is considered blocking and so will be called on a worker thread
a method returning a Uni<State> is considered non-blocking and so will be called on the I/O thread
a method returning a State but explicitly annotated with @NonBlocking is considered non-blocking and so will be called on the I/O thread
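For illustration, here is a sketch of all three dispatch variants in one resource, reusing the State endpoint from the question (imports assume the javax.* namespace of older Quarkus; getCachedState() is a hypothetical non-blocking service call):

import io.smallrye.common.annotation.NonBlocking;
import io.smallrye.mutiny.Uni;
import javax.inject.Inject;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

@Path("/state")
@Produces(MediaType.APPLICATION_JSON)
public class StateResource {

    @Inject
    SerialService serialService;

    // blocking signature: Quarkus dispatches this to a worker thread
    @GET
    @Path("/blocking")
    public State getState() {
        return serialService.getState();
    }

    // reactive signature: stays on the I/O thread
    @GET
    @Path("/reactive")
    public Uni<State> getStateUni() {
        return serialService.getStateUni();
    }

    // blocking signature, explicitly marked non-blocking: stays on the
    // I/O thread, so the body must genuinely not block
    @GET
    @Path("/cached")
    @NonBlocking
    public State getCachedState() {
        return serialService.getCachedState(); // hypothetical non-blocking call
    }
}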
So, the question is, which dispatching strategy should you use?
It really depends on your application and context. If you do not expect many concurrent requests (it's hard to give a general threshold, but it's often between 200 and 500 req/sec), it is perfectly fine to use a blocking/imperative approach. If your application acts as an API Gateway with potential peaks of requests, non-blocking will provide better results.
Remember that even if you choose the imperative/blocking approach, RESTEasy Reactive provides many benefits. As most of the heavy-lifting request/response processing is done on the I/O thread, you get a faster application that uses less memory, for free.
Related
We started a new project with Quarkus and Mutiny, and created a bunch of endpoints with Quarkus @Funq; everything has been working fine so far. Now we want to process something very time-consuming in one of the endpoints. What we expect is: once a user clicks a button to send the HTTP request from the frontend and hits this specific endpoint, we return 202 Accepted immediately, leave the time-consuming operation processing on another thread in the backend, and then send a notification email to the user once it completes.
I understand this can be done with @Async or CompletableFuture, but now we want to do it with Mutiny. Based on how I read the Mutiny documentation at https://smallrye.io/smallrye-mutiny/guides/imperative-to-reactive, runSubscriptionOn will avoid blocking the caller thread by running the time-consuming method on another thread, and my testing showed the time-consuming code did get executed on a different thread. However, the HTTP request does not return immediately; it is still pending until the time-consuming method finishes executing (as I observe in the browser's developer tools). Did I misunderstand how runSubscriptionOn works? How do I implement this feature with Mutiny?
My @Funq endpoint looks like this:
@Inject
MyService myService;

@Funq("api/report")
public Uni<String> sendReport(MyRequest request) {
    ExecutorService executor = Executors.newFixedThreadPool(10, r -> new Thread(r, "CUSTOM_THREAD"));
    return Uni.createFrom()
            .item(() -> myService.timeConsumingMethod(request))
            .runSubscriptionOn(executor);
}
Edit: I found the solution using Uni, based on @Ladicek's answer. After digging deeper into Quarkus and Uni I have a follow-up question:
Currently most of our blocking methods do not return Uni at the service level; instead we create a Uni from what they return (i.e. an object or a list) and return that Uni at the controller level in their endpoints, like this:
return Uni.createFrom().item(() -> myService.myIOBlockingMethod(request));
As @Ladicek explained, I do not have to use .runSubscriptionOn explicitly, as the I/O-blocking method will automatically run on a worker thread (because my service-level method does not return Uni). Is there any downside to this? My understanding is that this will lead to a longer response time because it has to jump between the I/O thread and a worker thread; am I correct?
What is the best practice for this? Should I always return Uni from those blocking methods at the service level so that they can run on the I/O threads as well? If so, I guess I will always need to call .runSubscriptionOn to run them on a different worker thread so that the I/O thread is not blocked, correct?
By returning a Uni, you're basically saying that the response is complete when the Uni completes. What you want is to run the action on a thread pool and return a complete response (Uni or not, that doesn't matter).
By the way, you're creating an extra thread pool in the method, for each request, and you don't shut it down. That's wrong. You want to create one thread pool for all requests (e.g. in a @PostConstruct method) and ideally also shut it down when the application ends (in a @PreDestroy method).
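A minimal sketch of that advice applied to the question's endpoint (assuming the question's MyService and MyRequest; newer Quarkus versions use the jakarta.* variants of these annotations):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.annotation.PostConstruct;
import javax.annotation.PreDestroy;
import javax.inject.Inject;
import io.quarkus.funqy.Funq;

public class ReportFunction {

    @Inject
    MyService myService;

    private ExecutorService executor;

    @PostConstruct
    void init() {
        // one pool shared by all requests, created once
        executor = Executors.newFixedThreadPool(10, r -> new Thread(r, "CUSTOM_THREAD"));
    }

    @PreDestroy
    void shutdown() {
        executor.shutdown();
    }

    @Funq("api/report")
    public String sendReport(MyRequest request) {
        // fire and forget: submit the slow work, respond immediately
        executor.submit(() -> myService.timeConsumingMethod(request));
        return "Accepted";
    }
}

The response no longer depends on a Uni completing; the slow work finishes on the pool, and the notification email can be sent from there.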
I understand that when using blocking operations in reactive streams we should use Publisher<Object>.publishOn(Schedulers.elastic()).subscribe(/* blocking operations go here */).
I understand that it makes sense when my publisher publishes a list of items (e.g. a Flux): future items do not have to wait for the current item to be blocked by a blocking operation. But in the case of a Mono, is it necessary? There will be only one item flowing in my pipe.
P.S. I am using a Spring Boot 2 reactive WebFlux controller, something like this:
@RestController
public class ItemController {

    @PostMapping("/item")
    public Mono<Response> saveItem(@RequestBody Mono<Item> item) {
        return item
                .publishOn(Schedulers.elastic()) // Do I need this?
                .map(i -> blockingDB.save(i))
                .map(saved -> new Response(saved));
    }
}
Yes, absolutely!
If you don't do it you are blocking on the main processing/event loop threads. Of these, you should have only as many as your machine has (effective) CPUs.
Let's say that's 8. This means with just 8 concurrent requests that are waiting for the blocking operation you bring your application to a full stop!
Also, make sure to move the processing after the blocking operation back to a thread pool intended for CPU-intensive work.
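For example, a sketch of that hop-back pattern using the question's hypothetical blockingDB (note that Schedulers.boundedElastic() has since replaced the deprecated Schedulers.elastic()):

public Mono<Response> saveItem(Mono<Item> item) {
    return item
            // hop off the event loop before the blocking call
            .publishOn(Schedulers.boundedElastic())
            .map(i -> blockingDB.save(i))      // blocking I/O runs on the elastic pool
            // hop back to a CPU-oriented pool for post-processing
            .publishOn(Schedulers.parallel())
            .map(saved -> new Response(saved));
}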
When using a classical Tomcat approach, you can give your server a maximum number of threads it can use to handle web requests from users. Using the Reactive Programming paradigm, and Reactor in Spring 5, we are able to scale better vertically, making sure we are blocked minimally.
It seems to me that it makes this less manageable than the classical Tomcat approach, where you simply define the max number of concurrent requests. When you have a max number of concurrent requests, it's easier to estimate the maximum memory your application will need and scale accordingly. When you use Spring 5's Reactive Programming this seems like more of a hassle.
When I talk about these new technologies to sysadmin friends, they reply with worries about applications running out of RAM, or even running out of threads at the OS level. So how can we deal with this better?
No blocking I/O at ALL
First of all, if you don't have any blocking operations, then you should not worry at all about how many threads to provide for managing concurrency. In that case, we have only one worker which processes all connections asynchronously and non-blockingly. We may then easily scale connection-serving workers that process all connections without contention or coherence overhead (each worker has its own queue of received connections and works on its own CPU), and the application scales better that way (shared-nothing design).
Summary: in that case you manage the max number of web threads the same way as before, by configuring the application container (Tomcat, WebSphere, etc.) or its equivalent for non-Servlet servers like Netty, or the hybrid Undertow. The benefit: you may process far more user requests with the same resource consumption.
Blocking Database and Non-Blocking Web API (such as WebFlux over Netty).
In case we have to deal with blocking I/O, for instance communication with a DB over blocking JDBC, the most appropriate way to keep your app as scalable and efficient as possible is to use a dedicated thread pool for I/O.
Thread-pool requirements
First of all, we should create a thread pool with exactly the same number of workers as there are available connections in the JDBC connection pool. Hence, we will have exactly as many threads as will blockingly wait for a response, and we utilize our resources as efficiently as possible, so no more memory is consumed for thread stacks than actually needed (in other words, a thread-per-connection model).
How to configure the thread pool according to the size of the connection pool
Since access to these properties varies by database and JDBC driver, we may externalize that configuration as a dedicated property, which in turn means it may be configured by devops or a sysadmin.
A configuration of the thread pool (in our example, the configuration of a Project Reactor 3 Scheduler) may look like the following:
@Configuration
public class ReactorJdbcSchedulerConfig {

    @Value("${my.awesome.scheduler-size}")
    int schedulerSize;

    @Bean
    public Scheduler jdbcScheduler() {
        return Schedulers.fromExecutor(new ForkJoinPool(schedulerSize));
        // similarly:
        // ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        // taskExecutor.setCorePoolSize(schedulerSize);
        // taskExecutor.setMaxPoolSize(schedulerSize);
        // taskExecutor.setQueueCapacity(schedulerSize);
        // taskExecutor.initialize();
        // return Schedulers.fromExecutor(taskExecutor);
    }
}
...
@Autowired
Scheduler jdbcScheduler;

public Mono myJdbcInteractionIsolated(String id) {
    return Mono.fromCallable(() -> jpaRepo.findById(id))
            .subscribeOn(jdbcScheduler)
            .publishOn(Schedulers.single());
}
...
As may be noted, with that technique we delegate our thread-pool configuration to an external team (sysadmins, for instance) and allow them to manage the memory consumed by the created Java threads.
Keep your blocking I/O thread pool only for I/O work
This means that an I/O thread should be used only for operations that blockingly wait. In turn, once the thread has finished awaiting the response, you should move result processing to another thread.
That is why in the above code snippet I put .publishOn right after .subscribeOn.
So, to summarize: with that technique we allow an external team to manage application sizing by matching the thread-pool size to the connection-pool size. All result processing will be executed on one thread, so there will be no redundant, uncontrolled memory consumption.
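To make the externalized sizing concrete, a hypothetical application.properties pairing (spring.datasource.hikari.maximum-pool-size is the real HikariCP property; my.awesome.scheduler-size is the illustrative property from the config above):

# JDBC connection pool size (HikariCP)
spring.datasource.hikari.maximum-pool-size=10
# scheduler sized to exactly match the connection pool
my.awesome.scheduler-size=10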
Finally, Blocking API (Spring MVC) and blocking I/O (Database access)
In that case, there is no need for the reactive paradigm at all, since you don't profit from it. First of all, Reactive Programming requires a particular mind shift, especially in understanding the usage of functional techniques with Reactive libraries such as RxJava or Project Reactor. In turn, for unprepared users it adds complexity and causes more "What ****** is going on here???". So, in the case of blocking operations on both ends, you should think twice about whether you really need Reactive Programming here.
Also, there is no magic for free. Reactive Extensions come with a lot of internal complexity, and with all those magical .map, .flatMap, etc., you may lose overall performance and memory instead of winning as in the case of end-to-end non-blocking, async communication.
That means good old imperative programming will be more suitable here, and it will be much easier to control your application's memory sizing using good old Tomcat configuration management.
Can you try this:
@Configuration
@EnableAsync
public class AsyncConfig implements AsyncConfigurer {

    @Override
    public Executor getAsyncExecutor() {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setCorePoolSize(15);
        taskExecutor.setMaxPoolSize(100);
        taskExecutor.setQueueCapacity(100);
        taskExecutor.initialize();
        return taskExecutor;
    }
}
This works for async in Spring 4, but I'm not sure it'll work in Spring 5 with reactive.
A fasthttp-based server is up to 10 times faster than net/http.
Which implementation details make fasthttp so much faster? Moreover, how does it manage incoming requests better than net/http?
The article "http implementation fasthttp in golang" from husobee mentions:
Well, this is a much better implementation for several reasons:
The worker pool model is a zero-allocation model, as the workers are already initialized and ready to serve, whereas in the stdlib implementation the go c.serve() has to allocate memory for the goroutine.
The worker pool model is easier to tune, as you can increase/decrease the buffer size of the number of work units you are able to accept, versus the fire-and-forget model in the stdlib.
The worker pool model allows handlers to be more connected with the server through channel communications; if the server needs to shut down, for example, it can communicate with the workers more easily than in the stdlib implementation.
The handler function definition signature is better, as it takes in only a context which includes both the request and writer needed by the handler. This is HUGELY better than the standard library, as all you get from the stdlib is a request and response writer… The work in go1.7 to include context within the request is pretty much a hack to give people what they really want (context) without breaking anyone.
Overall it is just better to write a server with a worker pool model for serving requests, as opposed to just spawning a "thread" per request, with no way of throttling out of the box.
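To make the worker-pool idea concrete outside of Go, here is a hypothetical sketch in Java (the language used elsewhere in this thread): a fixed pool of pre-created workers with a bounded queue gives you the up-front allocation and throttling the quote describes, versus spawning a thread per request.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class WorkerPool {
    public static void main(String[] args) {
        // workers are created up front and reused; the bounded queue
        // throttles how many requests may wait, instead of creating
        // a new thread per request with no back-pressure
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                8, 8,                                  // fixed number of workers
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(100),         // bounded work queue
                new ThreadPoolExecutor.AbortPolicy()); // reject when saturated

        for (int i = 0; i < 10; i++) {
            final int request = i;
            pool.submit(() -> System.out.println("handling request " + request));
        }
        pool.shutdown();
    }
}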
I have an ASP.NET MVC 5 application that has:
Web UI layer
Business Logic layer
Data repositories layer
They are also referenced in this order. UI only accesses business logic, and business logic references repositories.
As with 99% of applications, everything can and should be executed synchronously except calls into the database (or other I/O-expensive operations). That's why I would like to make the data layer asynchronous without forcing the upper layers to make all their calling methods async (all the way up to controller actions).
Is that possible?
What I was thinking to do
I was thinking of changing things this way.
Data layer method
public async Task<SomeEntity> GetData()
{
    return await Task.Run<SomeEntity>(() => ...);
}
Business logic method
public SomeEntity GetData()
{
    return this.repo.GetData().Result;
}
Questions
Does this make sense and would I actually get my code to execute in asynchronous manner?
Update
After reading Stephen Cleary's blog post, it became clearer to me that the whole call stack down to the bottom (the data layer that introduces the asynchrony) is split by the async data call, and hence all calls up the stack should be async as well.
If this thinking is correct, then is my assumption correct when I say that:
in order not to have the whole synchronous call stack converted to async, we should create a separate thread that works asynchronously, and our synchronous thread would use it.
Question 2
Is this assumption correct and if it is, is that the only way to keep some parts synchronous?
As with 99% of applications everything can and should be executed synchronously except calls into database.
Not at all. Anything that is I/O-based should be asynchronous.
So, the data layer does database I/O, and should be asynchronous.
The business logic layer uses the data layer, which is I/O-based, and should be asynchronous.
The UI layer uses the business logic layer, which is I/O-based, and should be asynchronous.
Of course, only those methods that are actually I/O-based should be made asynchronous; the rest should be synchronous. But I find that in data-access-heavy applications, they should be almost entirely asynchronous.
Does this make sense and would I actually get my code to execute in asynchronous manner?
No. Sorry, but you should never wrap asynchronous code in Task.Run and block on it in an ASP.NET application. All that does is use up more threads than necessary for processing your request. It would be better to keep it all synchronous than to use multiple threads to keep it synchronous.
On ASP.NET, you have to allow the asynchrony to propagate through all layers in order to have asynchronous actions/handlers (and all the benefits that come with it, namely, scalability).
Mixing sync with async in this manner can be dangerous, as it invites deadlocks. Stephen Cleary explains this very well here:
http://blog.stephencleary.com/2012/07/dont-block-on-async-code.html