I decided to rewrite my web app in Java (previously it was in Python).
In my app I used non-blocking I/O and had a worker pool (Celery + Eventlet threads) to which I passed tasks, each consisting of hundreds of API calls.
Now I'm using Spring WebFlux, but I can't figure out how to create a worker pool, pass my tasks to it, and then collect the results and do some calculations.
(I know it's possible to create a ThreadPoolTaskExecutor, but its threads are blocking threads.)
If you're using non-blocking APIs, you don't need to schedule tasks on specific threads; Reactor does that for you. With Spring WebFlux, the threads used for processing work are either managed by Reactor or reused from Netty.
Check out the Schedulers and threading sections of the Reactor reference documentation.
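For example, if the hundreds of API calls are themselves made with a non-blocking client such as WebClient, you can fan them out with flatMap and let Reactor schedule everything; no explicit worker pool is needed. A rough sketch (the base URL, path and result type are made up):

import java.util.List;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class BatchCaller {

    private final WebClient client = WebClient.create("https://api.example.com");

    public Mono<List<String>> callAll(List<String> ids) {
        return Flux.fromIterable(ids)
                .flatMap(id -> client.get()
                        .uri("/items/{id}", id)
                        .retrieve()
                        .bodyToMono(String.class), 50) // at most 50 calls in flight
                .collectList();                        // then do your calculations on the list
    }
}

The Netty event-loop threads drive the I/O, so the whole batch completes without tying up a pool of worker threads.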
Related
I'm learning about threads and how threads work when building a web application.
As per my understanding, the flow is as follows:
The user sends an HTTP request.
Tomcat assigns a thread to run the controller.
Spring Boot runs an @Async-annotated method, which executes on a separate thread pool created by the Spring Boot app.
The Tomcat thread is released to handle more requests until the async method completes.
Am I correct in my understanding?
Does Spring Boot create its own thread pool to run async operations on, freeing the main Tomcat thread?
When the asynchronous method is called, the Tomcat thread isn't "released" or "freed". It proceeds with the next instruction after the async method call and keeps on going (unless it does something like call get() on a Future returned by the async method, in which case it blocks until the future completes). Both threads execute concurrently.
It is true that Spring has its own separate thread pool that it uses for async methods. The Executor used is configurable: @Async takes the name of the executor as an argument, so different methods can use different pools if needed. And Tomcat has a pool for its threads, of course.
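For illustration, a minimal sketch of that (it assumes @EnableAsync is present on a configuration class; ReportService and the executor name reportExecutor are invented):

import java.util.concurrent.CompletableFuture;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

@Service
public class ReportService {

    @Async("reportExecutor") // runs on the named executor bean, not the Tomcat thread
    public CompletableFuture<String> buildReport() {
        // long-running work happens here, on the executor's thread
        return CompletableFuture.completedFuture("report");
    }
}

A controller calling reportService.buildReport() gets the CompletableFuture back immediately and keeps running on the Tomcat thread; it only blocks if it calls get() or join() on that future.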
I have seen a few projects where developers are using RxJava with Tomcat and MySQL on Spring Boot.
As per my knowledge:
The main advantage of reactive streams is that a single thread can serve multiple requests, and hence the database connection should also be non-blocking.
Tomcat creates a thread per request.
Spring Data JPA is blocking.
I know that there are libraries for non-blocking relational database access (like R2DBC).
So I am specifically confused about the benefits of RxJava on Tomcat.
I would like to know the benefits of RxJava for the following scenarios:
REST API on Tomcat with Spring Data JPA (MySQL).
REST API on Tomcat with R2DBC (MySQL).
Thanks.
Benefits of Spring MVC and JPA (blocking): linear code that is easy to write and debug. The downside is that slow clients may be slowing your app down.
Reactive Spring:
A small pool of threads handles many more requests, so less memory is consumed.
Downside: it takes time to start thinking 'reactively'.
More:
https://www.baeldung.com/spring-mvc-async-vs-webflux
Also:
https://dzone.com/articles/micronaut-mastery-using-reactor-mono-and-flux
If your REST API doesn't always go to the database, you could benefit from that approach.
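To make the two scenarios concrete, the repository layer looks roughly like this in each case (the User entity and query method are invented; the reactive version is shown with Reactor's Flux, which Spring Data can, as far as I know, also adapt to RxJava types such as Flowable):

import java.util.List;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.repository.reactive.ReactiveCrudRepository;
import reactor.core.publisher.Flux;

// Scenario 1: blocking - the Tomcat request thread waits while MySQL responds.
public interface UserJpaRepository extends JpaRepository<User, Long> {
    List<User> findByStatus(String status);
}

// Scenario 2: non-blocking - R2DBC pushes rows as they arrive, freeing the thread.
interface UserR2dbcRepository extends ReactiveCrudRepository<User, Long> {
    Flux<User> findByStatus(String status);
}

With scenario 1, wrapping the result in RxJava types doesn't change the fact that the underlying JDBC call blocks a thread somewhere; with scenario 2 the whole pipeline can stay non-blocking.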
I have a question. I am just starting out with reactive programming and I am using Quarkus. I made a demo with Hibernate Reactive with Panache and one with the reactive SQL clients.
I want each of my REST APIs to run on a non-blocking thread. With Hibernate Reactive with Panache, whenever I did a blocking action I got a message about it, and the logs showed that the API was running on a Vert.x event loop thread, so everything was fine.
With the reactive SQL clients, everything runs on executor thread 0. Does that mean my APIs aren't asynchronous (reactive) from input to output? And when I run a blocking action, no error is shown.
Quarkus tries in a lot of places to put the proper guard-rails in place when it comes to the threads that can be used; however, it can't catch all mistakes.
If you are seeing your code executed on a thread that has executor in its name, then the request is being serviced by the worker thread pool, not the event loop.
If you are using quarkus-resteasy, for example, this is the only way that RESTEasy can handle requests: no matter what your code does, the request is always handled on an executor thread.
For this reason, Quarkus provides RESTEasy Reactive (which is the preferred REST API layer), which allows you to choose whether you want a request to be serviced on an executor thread or an event-loop thread.
See this for more details.
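For example, with RESTEasy Reactive you can choose the thread per endpoint. A rough sketch (the jakarta.* package names assume a recent Quarkus 3.x; older versions use javax.*):

import io.smallrye.common.annotation.Blocking;
import io.smallrye.mutiny.Uni;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;

@Path("/greetings")
public class GreetingResource {

    @GET
    @Path("/reactive")
    public Uni<String> reactive() {
        // Dispatched on a Vert.x event-loop thread; must not block.
        return Uni.createFrom().item("hello from the event loop");
    }

    @GET
    @Path("/blocking")
    @Blocking // explicitly move this endpoint to the worker (executor) pool
    public String blocking() {
        // Safe place for blocking calls such as classic JDBC.
        return "hello from a worker thread";
    }
}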
I am working on a microservice developed using Spring Boot. I have implemented the following layers:
Controller layer: invoked when a user sends an API request.
Service layer: processes the request; it either calls a third-party service or queries the database.
Repository layer: used to interact with the database.
Methods in all of the above layers return CompletableFuture. I have the following questions related to this setup:
Is it good practice to return CompletableFuture from all methods across all layers?
Is it always recommended to use the @Async annotation when using CompletableFuture? What happens when I use the default fork-join pool to process the requests?
How can I configure the threads for the above methods? Would it be a good idea to configure a thread pool per layer? What other configurations should I consider here?
Which metrics should I focus on while optimizing performance for this microservice?
If the work your application is doing can be done on the request thread without too much latency, I would recommend it. You can always move to an async model if you find that your web server is running out of worker threads.
The @Async annotation basically helps with scheduling. If you can, use it: it keeps the code free of references to the thread pool on which the work will be scheduled. As for what thread actually does your async work, that's really up to you. If you can, use your own pool. That will make sure you can add instrumentation and expose configuration options that you may need once your service is running.
Technically you will have two pools in play. One that Spring will use to consume the result of your future, and another that you will use to do the async work. If I recall correctly, Spring Boot will configure its pool if you don't already have one, and will log a warning if you didn't explicitly configure one. As for your worker threads, start simple. Consider using Spring's ThreadPoolTaskExecutor.
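A minimal sketch of such a pool (the bean name asyncExecutor and the sizes are arbitrary starting points); @Async methods can then reference it by name and return CompletableFuture as you do now:

import java.util.concurrent.Executor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
@EnableAsync
public class AsyncConfig {

    @Bean("asyncExecutor")
    public Executor asyncExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(8);                // size for your expected concurrency
        executor.setMaxPoolSize(32);
        executor.setQueueCapacity(100);
        executor.setThreadNamePrefix("svc-async-"); // easy to spot in thread dumps and metrics
        executor.initialize();
        return executor;
    }
}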
Regarding which metrics to monitor, start first by choosing how you will monitor. Using something like Spring Sleuth coupled with Spring Actuator will give you a lot of information out of the box. There are a lot of services that can collect all the metrics actuator generates into time-based databases that you can then use to analyze performance and get some ideas on what to tweak.
One final recommendation: Spring WebFlux is designed from the start to be async. It has a learning curve for sure, since reactive code is very different from the usual MVC stuff. However, that framework also addresses all the questions you are asking, so it might be better suited for your application, especially if you want to make everything async by default.
Is it possible to handle actuator requests like health within a separate thread pool from the "main" application?
Why am I asking?
I've got an application that might sometimes use up all available threads, and the Kubernetes health check fails because no thread is available to handle the health endpoint request.
I want to make sure that every health request is processed no matter how much load the application is under.
I was thinking about maybe defining a separate thread pool for the actuators to operate with, but I am not sure how to do this.
We had a similar problem with some of our apps when running in Kubernetes. We looked at different ways of creating multiple Tomcat connectors and changing the Spring management port to get the desired effect, but never quite got it.
In the end, we attacked the root of the problem, which was resource starvation within the pod. We found that the apps experiencing the health check timeouts had lots of extra threads from various third-party thread pools. In some cases we had apps with close to 500 threads, so even under what we considered moderate load, the Tomcat pools would get starved and couldn't handle new requests.
FWIW, the biggest culprit we found was the effect of the pod's CPU request on the JDK. When we didn't set any CPU request, the JDK would see every CPU on the node when it queried the number of processors. We found there are lots of places in the Java ecosystem where the number of processors is used to initialize different thread pools.
In our case, each node had 36 processors, and we found around 10-12 thread pools using this number to determine their size... not hard to see how an app could quickly grow to 500 threads.
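You can verify this from inside a pod with a couple of lines; without a CPU request (or container-awareness JVM settings), the JVM reports the node's processor count, and defaults such as the common fork-join pool are derived from it:

// Prints the processor count the JVM sees and the resulting common-pool size.
public class CpuCheck {
    public static void main(String[] args) {
        System.out.println("availableProcessors = "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("commonPool parallelism = "
                + java.util.concurrent.ForkJoinPool.commonPool().getParallelism());
    }
}

Many libraries size their own pools from that same availableProcessors() value, which is how the thread count multiplies.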
I believe that switching to the non-blocking stack (WebFlux) could solve your issue, should that be an option for you. If you rely on some blocking API (e.g. JDBC), you can publish it on a separate thread pool (e.g. Schedulers.elastic()). That way, the HTTP request threads are always available for processing incoming traffic (including the health check), and the long-running, blocking operations are processed in a dedicated thread pool. I believe a similar effect should be possible using the asynchronous Servlet API or anything that builds on top of it.
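For the blocking-API case, the usual Reactor pattern is to wrap the call and subscribe it on a dedicated scheduler (Schedulers.boundedElastic() replaces Schedulers.elastic() in current Reactor versions). ReportJdbcRepository and Report below are hypothetical stand-ins for your blocking code:

import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

public class ReportHandler {

    private final ReportJdbcRepository jdbcRepository; // hypothetical blocking repository

    public ReportHandler(ReportJdbcRepository jdbcRepository) {
        this.jdbcRepository = jdbcRepository;
    }

    public Mono<Report> loadReport(long id) {
        // The blocking JDBC call runs on the bounded-elastic pool, so the
        // Netty/Reactor request threads stay free for other traffic,
        // including the health check.
        return Mono.fromCallable(() -> jdbcRepository.findReport(id))
                .subscribeOn(Schedulers.boundedElastic());
    }
}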
If you are using Spring Boot >= 2.2, you can use the separate library spring-boot-async-health-indicator to run your health checks on a separate thread pool.
Simply annotate your HealthIndicator with @AsyncHealth:

@AsyncHealth
@Component
public class AsynchronousHealthCheck implements HealthIndicator {

    @Override
    public Health health() { // will be executed on a separate thread pool
        actualCheck();
        return Health.up().build();
    }
}
Disclaimer: I created this library for this exact purpose