Spring Boot Actuator to run in separate thread pool

Is it possible to handle actuator requests like health within a separate thread pool from the "main" application?
Why am I asking?
I've got an application that might sometimes use up all available threads, and the Kubernetes health check is failing due to the unavailability of a thread to compute the health endpoint request.
I want to make sure that every health request is processed no matter how much load the application is under.
I was thinking about defining a separate thread pool for the actuator endpoints to operate on, but I am not sure how to do this.

We had a similar problem with some of our apps when running in Kubernetes. We looked at different ways of creating multiple Tomcat connectors and changing the Spring management port to get the desired effect, but never quite got it working.
In the end, we attacked the root of the problem, which was resource starvation within the pod. We found that the apps experiencing the health check timeouts had lots of extra threads for various 3rd party thread pools. In some cases we had apps with close to 500 threads, so even under what we considered moderate load, the tomcat pools would get starved and couldn't handle new requests.
FWIW, the biggest culprit we found was the effect of CPU request on a pod and the JDK. When we didn't set any request, the JDK would see every CPU on the node when it queried for numbers of processors. We found there are lots of places in the Java ecosystem where number of processors is used to initialize different thread pools.
In our case, each node had 36 processors, and we found around 10-12 thread pools using this number to determine their size... not hard to see how an app could quickly grow to 500 threads.
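As a quick illustration, a snippet like the one below prints the value those pools are sized from; with a CPU request/limit set on the pod, or the -XX:ActiveProcessorCount JVM flag, this number reflects the cap instead of every CPU on the node:

// Illustrative check of what the JVM reports inside the container.
// Many libraries size their default pools from this value; for example the
// ForkJoinPool common pool defaults to (availableProcessors - 1) workers.
public class ProcessorCountCheck {
    public static void main(String[] args) {
        System.out.println("availableProcessors = "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("commonPool parallelism = "
                + java.util.concurrent.ForkJoinPool.commonPool().getParallelism());
        // With a Kubernetes CPU request/limit (or -XX:ActiveProcessorCount=N),
        // availableProcessors reflects that cap rather than the whole node.
    }
}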

I believe that switching to the non-blocking stack (WebFlux) could solve your issue, should this be an option for you. If you rely on some blocking API (e.g. JDBC) you can publish it on a separate thread pool (e.g. Schedulers.elastic()). Thus, the HTTP request threads should always be available for processing the incoming traffic (including health checks), and the long-running, blocking operations would be processed in a dedicated thread pool. I believe a similar effect should be possible using the asynchronous Servlet API or anything that builds on top of it.
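To illustrate the publishing-on-a-separate-pool idea, here is a minimal sketch; the repository type and method are made up, and newer Reactor versions would use Schedulers.boundedElastic() in place of the now-deprecated Schedulers.elastic():

// Sketch: publish a hypothetical blocking JDBC call on a dedicated scheduler so
// the event-loop threads stay free for incoming traffic (including health checks).
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

public class UserHandler {

    // Hypothetical blocking repository, e.g. backed by plain JDBC.
    interface BlockingUserRepository {
        String findNameById(String id);
    }

    private final BlockingUserRepository repository;

    public UserHandler(BlockingUserRepository repository) {
        this.repository = repository;
    }

    public Mono<String> findUserName(String id) {
        return Mono.fromCallable(() -> repository.findNameById(id)) // blocking call
                .subscribeOn(Schedulers.boundedElastic());          // off the event loop
    }
}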

If you are using Spring Boot >= 2.2, you can use the separate library spring-boot-async-health-indicator to run your health checks on a separate thread pool.
Simply annotate your HealthIndicator with @AsyncHealth:
@AsyncHealth
@Component
public class AsynchronousHealthCheck implements HealthIndicator {

    @Override
    public Health health() { // will be executed on a separate thread pool
        actualCheck();
        return Health.up().build();
    }
}
Disclaimer: I created this library for this exact purpose

Related

Advisable to run a Kafka producer + consumer in same application?

Spring + Apache Kafka noob here. I'm wondering if it's advisable to run a single Spring Boot application that handles both producing and consuming messages.
A lot of the applications I've seen using Kafka lately usually have one separate application send/emit the message to a Kafka topic, and another one that consumes/processes the message from that topic. For larger applications, I can see a case for separate producer and consumer applications, but what about smaller ones?
For example: say I have a simple app that processes HTTP requests and sends them to a third-party service, but to ensure retryability, I put each request on a Kafka queue and consume it with a service that uses the @Retryable annotation?
And what other considerations might come into play since it would be on the Spring framework?
Note: As your question states, what I'll say is more advice based on my beliefs and experience than some absolute truth written in stone.
Your use case seems more like a proxy than an actual application with business logic. You should make sure that making this an asynchronous service makes sense - maybe it's good enough to simply hold the connection until you get a response from the 3p, and let your client handle retries if you get an error - of course, you can also retry until some timeout.
This would avoid common asynchronous issues such as making your client need to poll or have a webhook in order to get a result, or making sure a record still makes sense to be processed after a lot of time has elapsed after an outage or a high consumer lag.
If your client doesn't care about the result as long as it gets done, and you don't expect high-throughput on either side, a single Spring Boot application should be enough for handling both producer and consumer sides - while also keeping it simple.
If you do expect high throughput, I'd look into building a WebFlux based application with the reactor-kafka library - high throughput proxies are an excellent use case for reactive applications.
Another option would be having a simple serverless function that handles the http requests and produces the records, and a standard Spring Boot application to consume them.
TBH, I don't see a use case where having two full-fledged Java applications to handle a proxy duty would pay off, unless maybe you have infrastructure sound enough that managing two applications instead of one makes no real difference and the extra resource usage is not an issue.
Actually, if you expect really high traffic and a serverless function wouldn't work, or maybe you want to stick to Java-based solutions, then you could have a simple WebFlux-based application to handle the HTTP requests and send the messages, and a standard Spring Boot or another WebFlux application to handle consumption. This way you'd be able to scale up the former to accommodate the high traffic, and independently scale the latter according to your performance requirements.
As for the retry part, if you stick to non-reactive Spring Kafka applications, you might want to look into the non-blocking retries feature from Spring Kafka. This will enable your consumer application to process other records while waiting to retry a failed one - the @Retryable approach is deprecated in favor of DefaultErrorHandler and both will block consumption while waiting.
Note that with that you lose ordering guarantees, so use it only if the order the requests are processed is not important.
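For reference, a rough sketch of what those non-blocking retries look like, assuming Spring Kafka 2.7+; the topic name, payload type and third-party call are made up:

// Sketch only: failed records are republished to retry topics with a backoff,
// so the main consumer keeps processing other records in the meantime.
import org.springframework.kafka.annotation.DltHandler;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.annotation.RetryableTopic;
import org.springframework.retry.annotation.Backoff;
import org.springframework.stereotype.Component;

@Component
public class ThirdPartyForwarder {

    @RetryableTopic(attempts = "4", backoff = @Backoff(delay = 1000, multiplier = 2.0))
    @KafkaListener(topics = "outbound-requests")
    public void forward(String payload) {
        callThirdParty(payload); // an exception here sends the record to the next retry topic
    }

    @DltHandler
    public void handleDeadLetter(String payload) {
        // all retries exhausted; log, alert, or persist for manual follow-up
    }

    private void callThirdParty(String payload) {
        // hypothetical HTTP call to the third-party service
    }
}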

Thread model for Async API implementation using Spring

I am working on a micro-service developed using Spring Boot. I have implemented the following layers:
Controller layer: Invoked when user sends API request
Service layer: Processes the request. Either sends the request to a third-party service or to the database
Repository layer: Used to interact with the database.
Methods in all of the above layers return a CompletableFuture. I have the following questions related to this setup:
Is it good practice to return Completable future from all methods across all layers?
Is it always recommended to use the @Async annotation when using CompletableFuture? What happens when I use the default fork-join pool to process the requests?
How can I configure the threads for the above methods? Would it be a good idea to configure a thread pool per layer? What other configurations can I consider here?
Which metrics should I focus on while optimizing performance for this micro-service?
If the work your application is doing can be done on the request thread without too much latency, I would recommend it. You can always move to an async model if you find that your web server is running out of worker threads.
The #Async annotation is basically helping with scheduling. If you can, use it - it can keep the code free of the references to the thread pool on which the work will be scheduled. As for what thread actually does your async work, that's really up to you. If you can, use your own pool. That will make sure you can add instrumentation and expose configuration options that you may need once your service is running.
Technically you will have two pools in play. One that Spring will use to consume the result of your future, and another that you will use to do the async work. If I recall correctly, Spring Boot will configure its pool if you don't already have one, and will log a warning if you didn't explicitly configure one. As for your worker threads, start simple. Consider using Spring's ThreadPoolTaskExecutor.
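As an illustration of that setup, a dedicated executor can be declared as a bean and referenced by name from @Async; the bean name, pool sizes and service method below are made up rather than recommendations:

// Sketch: one explicitly configured executor for the async service work.
import java.util.concurrent.CompletableFuture;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.Async;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import org.springframework.stereotype.Service;

@Configuration
@EnableAsync
class AsyncConfig {

    @Bean("serviceExecutor")
    ThreadPoolTaskExecutor serviceExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(8);
        executor.setMaxPoolSize(32);
        executor.setQueueCapacity(200);
        executor.setThreadNamePrefix("svc-async-");
        return executor; // Spring initializes the pool when the bean is created
    }
}

@Service
class OrderService {

    @Async("serviceExecutor") // runs on the pool above, not the ForkJoin common pool
    public CompletableFuture<String> fetchOrder(String id) {
        return CompletableFuture.completedFuture("order-" + id);
    }
}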
Regarding which metrics to monitor, start first by choosing how you will monitor. Using something like Spring Sleuth coupled with Spring Actuator will give you a lot of information out of the box. There are a lot of services that can collect all the metrics actuator generates into time-based databases that you can then use to analyze performance and get some ideas on what to tweak.
One final recommendation is that Spring WebFlux is designed from the start to be async. It has a learning curve for sure, since reactive code is very different from the usual MVC stuff. However, that framework is also thinking about all the questions you are asking, so it might be better suited for your application, especially if you want to make everything async by default.

How to ensure my Reactive application is running in event loop style

I am using Spring Boot 2.0.4.RELEASE. My doubt is whether my application is running in event-loop style or not. I am using Tomcat as my server.
I am running some performance tests on my application, and after a certain time I see strange behaviour. After the load reaches 500 req/second, my application is not able to serve more than 500 req/second. Via Prometheus I was able to figure out that the max threads for Tomcat were 200 by default. Looks like all the threads were consumed and that's why it was not able to serve more than 500 req/second. Please correct me if I am wrong.
Can the tomcat server run in event-loop style ?
How can I change the event-loop size for tomcat server if possible.
Tried changing it to Jetty; still the same issue. Wondering if my application is running in event-loop style.
Hey, I think that you are doing something wrong in your project; maybe one of your dependencies does not support reactive programming. If you want to benefit from async (reactive) programming, your code must be 100% reactive; even for security you must use reactive Spring Security.
Normally a reactive Spring application will run on Netty, not on Tomcat, so check your dependencies, because Tomcat is not reactive.
This is more of an analysis. After running some performance tests on my local machine, I was able to figure out what was actually happening inside my application.
What I did was run a performance test on my local machine and analyse the application through JConsole.
As I said, I scheduled all my blocking DB calls on Schedulers.elastic(). What I realised is that this was causing the bottleneck: since my DB connections are limited and I am using Hikari for connection pooling, the number of threads I create out of the elastic pool doesn't matter.
Since reactive programming is more about consuming resources to the fullest with a smaller number of threads, and the threads were being created in an unbounded way, it was no different from a normal application.
So, as part of the resolution, I limited the number of threads to be used for DB calls to 100. And bang, the number jumped from 500 tps to 2300 tps.
I know this is not the number one should expect out of a reactive application; it has much more capability. But right now I do not have any choice but to bear with non-reactive drivers, waiting for production-grade availability of reactive drivers for MSSQL Server.
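A rough sketch of that kind of cap, using a fixed executor wrapped as a Reactor scheduler; the pool size of 100 and the repository are illustrative, and Reactor 3.4+ also offers Schedulers.newBoundedElastic for the same purpose:

// Sketch: bound the threads used for blocking DB calls instead of an unbounded elastic pool.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Scheduler;
import reactor.core.scheduler.Schedulers;

public class DbCallScheduling {

    // Hypothetical blocking repository backed by the non-reactive driver.
    interface BlockingOrderRepository {
        int count();
    }

    // At most 100 threads ever issue blocking DB calls.
    private static final ExecutorService DB_POOL = Executors.newFixedThreadPool(100);
    private static final Scheduler DB_SCHEDULER = Schedulers.fromExecutorService(DB_POOL);

    public Mono<Integer> countOrders(BlockingOrderRepository repository) {
        return Mono.fromCallable(repository::count)  // blocking JDBC call
                .subscribeOn(DB_SCHEDULER);          // capped, instead of Schedulers.elastic()
    }
}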

Spring boot service higher response times under heavy load

The response time of my Spring Boot REST service running on embedded Tomcat sometimes goes really high. I have isolated the external dependencies, and all of that is pretty quick.
I am at a point where I think it has something to do with Tomcat's default thread pool size of 200, which it reserves for incoming requests to the service.
What I believe is that under heavy load (100 requests per second) all 200 threads are held up, and other requests are queued and lead to higher response times.
I was wondering if there is a definitive way to find out whether the incoming requests are really getting queued? I have done extensive research on the Tomcat documentation and the Spring Boot embedded container documentation. Unfortunately I don't see anything relevant.
Does anyone have any ideas on how to check this?
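One way to make the queueing visible is to watch the Tomcat thread-pool gauges that Micrometer registers. The sketch below assumes spring-boot-starter-actuator is on the classpath, @EnableScheduling is enabled somewhere, and (on Spring Boot 2.2+) server.tomcat.mbeanregistry.enabled=true so the thread gauges are present:

// Sketch: periodically log Tomcat's thread-pool gauges. If tomcat.threads.busy sits at
// tomcat.threads.config.max while response times climb, incoming requests are queueing.
import io.micrometer.core.instrument.MeterRegistry;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class TomcatSaturationLogger {

    private final MeterRegistry registry;

    public TomcatSaturationLogger(MeterRegistry registry) {
        this.registry = registry;
    }

    @Scheduled(fixedRate = 5000)
    public void logThreadPoolState() {
        // registry.get(...) throws if the gauge is not registered; see the note above
        double busy = registry.get("tomcat.threads.busy").gauge().value();
        double max = registry.get("tomcat.threads.config.max").gauge().value();
        System.out.printf("tomcat threads: busy=%.0f max=%.0f%n", busy, max);
    }
}

The same gauges are also exposed through /actuator/metrics/tomcat.threads.busy if you prefer polling the metrics endpoint instead of logging.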

Limit number of parallel requests to spring boot actuator health

We are using Spring Boot Actuator to get the health status of an application. My understanding is that a health check request will be handled by a thread out of the same thread pool that is used to serve actual service requests.
Is there a way to limit the number of requests to the health endpoint to prevent DDoS-type starvation?
You can use the Spring Boot Throttling community library. I think you could restrict DDoS access to your endpoints (Actuator or otherwise) using its configuration.
https://github.com/weddini/spring-boot-throttling
Another possibility to reduce DDoS vulnerability on the /health endpoint is to have your health checks run on a separate thread pool.
This ensures that:
no more than one health indicator runs concurrently at any given time against an underlying service
your /health endpoint returns instantly (as it returns health statuses pre-calculated on separate threads).
For this purpose, and if you are using Spring Boot >= 2.2, you can use the separate library spring-boot-async-health-indicator to run your health checks on a separate thread pool by simply annotating them with @AsyncHealth.
Disclaimer: I created this library to address this issue (among others)
