I am running a Spring Boot 2.6.x application (bundled with Tomcat 9.0.56) with the following configuration:
server.tomcat.accept-count = 100
server.tomcat.threads.max = 1000
server.tomcat.threads.min-spare = 10
on a machine with 16 CPU cores and 32 GB of RAM.
I am load-testing my server: I open multiple (500) connections, and each one sends an HTTP request every second.
Expected behavior: Tomcat will use as many threads as it needs in order to maximize throughput.
Actual behavior: Tomcat always sticks to 10 threads (the number configured by "min-spare") and never adds threads above that amount. I know this from observing its JMX endpoint (currentThreadCount is always 10). This happens even though it is clearly unable to process all requests in time, since the number of pending requests in my client keeps growing.
Can anyone explain this behavior? On what basis is Tomcat (the NIO thread pool) supposed to decide whether to add threads?
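For anyone who wants to repeat the JMX observation from code rather than from a JMX console, here is a minimal sketch. It assumes it runs inside the application's JVM, that Tomcat's MBeans are registered under the "Tomcat" domain (in Spring Boot 2.2+ this needs server.tomcat.mbeanregistry.enabled=true), and that the ThreadPool MBean exposes currentThreadCount, currentThreadsBusy and maxThreads. The class name TomcatPoolProbe is made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

class TomcatPoolProbe {

    // Returns one line per Tomcat ThreadPool MBean found in this JVM.
    // The list is empty when no such MBean is registered (e.g. outside Tomcat).
    static List<String> snapshot() {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName pattern;
        try {
            // Assumption: embedded Tomcat registers its MBeans under the "Tomcat" domain.
            pattern = new ObjectName("Tomcat:type=ThreadPool,*");
        } catch (MalformedObjectNameException e) {
            throw new IllegalStateException(e);
        }
        List<String> lines = new ArrayList<>();
        for (ObjectName name : server.queryNames(pattern, null)) {
            try {
                Object current = server.getAttribute(name, "currentThreadCount");
                Object busy = server.getAttribute(name, "currentThreadsBusy");
                Object max = server.getAttribute(name, "maxThreads");
                lines.add(name + " threads=" + current + " busy=" + busy + " max=" + max);
            } catch (Exception e) {
                // MBeans that do not expose these attributes are skipped.
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = snapshot();
        if (lines.isEmpty()) {
            System.out.println("No Tomcat ThreadPool MBean found in this JVM");
        }
        lines.forEach(System.out::println);
    }
}
```

If busy stays well below max while the client is backing up, the server is probably not the side that is limiting concurrency.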
Turns out the issue was in my client.
For issuing requests, I was using RestTemplate, which internally was using HttpClient. HttpClient manages connections internally, and by default it has ridiculously low limits configured: max 20 concurrent connections...
I solved the issue by configuring a PoolingHttpClientConnectionManager (which is supposed to deliver better throughput in a multi-threaded environment) and increasing the limits:
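import org.apache.http.client.HttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

// Raise the pool limits so the client can open enough concurrent connections.
PoolingHttpClientConnectionManager connManager = new PoolingHttpClientConnectionManager();
connManager.setMaxTotal(10000);           // total connections across all routes
connManager.setDefaultMaxPerRoute(10000); // connections per target host
HttpClientBuilder clientBuilder = HttpClientBuilder.create();
clientBuilder.setConnectionManager(connManager);
HttpClient httpClient = clientBuilder.build();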
After doing that, the number of requests issued per second increased greatly, which made Tomcat add new threads, as expected.
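To illustrate why the server never saw enough concurrent work to grow its pool, here is a small stdlib-only simulation. It is not HttpClient code: a Semaphore stands in for the client's connection pool, and the class and method names are made up for the example. However many caller threads there are, the number of requests in flight at once cannot exceed the pool limit.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ClientPoolCap {

    // Runs `callers` simulated requests through a pool of `maxConnections`
    // permits and returns the highest number of requests in flight at once.
    static int peakInFlight(int callers, int maxConnections) {
        Semaphore connections = new Semaphore(maxConnections);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService callersPool = Executors.newFixedThreadPool(callers);
        for (int i = 0; i < callers; i++) {
            callersPool.execute(() -> {
                try {
                    connections.acquire(); // wait for a free "connection"
                    try {
                        int now = inFlight.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(20); // simulated request time
                    } finally {
                        inFlight.decrementAndGet();
                        connections.release();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        callersPool.shutdown();
        try {
            callersPool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 500 callers, but only 20 "connections": the peak never exceeds 20.
        System.out.println("peak in flight: " + peakInFlight(500, 20));
    }
}
```

With the limit at 20, the server side only ever has about 20 requests to work on, no matter how many client threads are waiting.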
Related
I am using HikariCP for connection pooling in my reactive Spring Boot application running in a Kubernetes cluster. There will be lots of blocking calls and multiple database queries, so ideally a larger number of database connections would help, provided CPU cores are available.
Giving all the CPU cores to one Kubernetes container would waste resources, as the spike in requests will not always be there. So I am exploring how to use the Kubernetes autoscaler so that new application containers can be spun up as the number of requests increases. Two concerns:
I tried the Hikari setting com.zaxxer.hikari.blockUntilFilled=true to keep the pool filled during application startup. But when using the autoscaler with an increasing number of requests, this delays responses, since creating the connections in the pool takes time. Is it better to let Hikari create connections dynamically as demand spikes rather than creating all of them at once during startup?
Also, each Kubernetes container will be a new instance of the application, so how do we manage the number of database connections created?
I did a sample load test with JMeter and saw improved performance (and no timeouts, etc.) with a large number of requests when using a fixed number of active database connections. There were many thread-interrupted exceptions when no fixed connection pool size was set and connections were created dynamically as the number of requests increased.
Any insights will help.
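On the second concern, one simple way to reason about it is a connection budget: every replica the autoscaler starts brings its own pool, so the per-pod pool size times the maximum replica count has to fit under the database's connection limit. The sketch below is only that arithmetic. All numbers are hypothetical, and the class and method names are made up for the example.

```java
class PoolBudget {

    // Largest per-pod pool size such that maxReplicas pods together stay
    // within the database limit, after setting aside reserved connections
    // (e.g. for admin tools or other services).
    static int maxPoolSizePerPod(int dbMaxConnections, int reservedConnections, int maxReplicas) {
        if (maxReplicas <= 0) {
            throw new IllegalArgumentException("maxReplicas must be positive");
        }
        int usable = dbMaxConnections - reservedConnections;
        return Math.max(0, usable / maxReplicas);
    }

    public static void main(String[] args) {
        // Hypothetical: the DB allows 200 connections, 20 are reserved,
        // and the autoscaler may scale up to 6 replicas.
        System.out.println(maxPoolSizePerPod(200, 20, 6)); // prints 30
    }
}
```

The result would be an upper bound for Hikari's maximumPoolSize in each pod. Whether to pre-fill the pool or let it grow on demand is a separate trade-off between startup time and first-request latency.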
Is it possible to handle actuator requests like health within a separate thread pool from the "main" application?
Why am I asking?
I've got an application that might sometimes use up all available threads, and the Kubernetes health check is failing due to the unavailability of a thread to compute the health endpoint request.
I want to make sure that every health request is processed no matter how much load the application is under.
I was thinking about maybe defining a separate thread pool for the actuators to operate with, but I am not sure how to do this.
We had a similar problem with some of our apps when running in Kubernetes. We looked at different ways of creating multiple Tomcat connectors and changing the Spring management port to get the desired effect, but never quite got it.
In the end, we attacked the root of the problem, which was resource starvation within the pod. We found that the apps experiencing the health check timeouts had lots of extra threads for various 3rd party thread pools. In some cases we had apps with close to 500 threads, so even under what we considered moderate load, the Tomcat pools would get starved and couldn't handle new requests.
FWIW, the biggest culprit we found was the effect of the CPU request on a pod and the JDK. When we didn't set any request, the JDK would see every CPU on the node when it queried for the number of processors. We found there are lots of places in the Java ecosystem where the number of processors is used to initialize different thread pools.
In our case, each node had 36 processors, and we found around 10-12 thread pools using this number to determine their size... not hard to see how an app could quickly grow to 500 threads.
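The arithmetic behind that is easy to check. The sketch below prints the processor count the JVM reports (the number libraries typically read via Runtime.availableProcessors()) and multiplies it out for the figures mentioned above. The class and method names are made up for the example, and "one thread per visible CPU per pool" is a simplifying assumption.

```java
class CpuSizedPools {

    // Rough thread count when `pools` independent thread pools each create
    // one thread per CPU the JVM can see.
    static int threadsFromCpuSizedPools(int visibleCpus, int pools) {
        return visibleCpus * pools;
    }

    public static void main(String[] args) {
        // What this JVM reports; inside a container this depends on the
        // CPU settings of the pod and on the JDK version.
        System.out.println("availableProcessors = " + Runtime.getRuntime().availableProcessors());

        // 36 visible CPUs and 12 CPU-sized pools: 432 threads before any load.
        System.out.println(threadsFromCpuSizedPools(36, 12));

        // The same 12 pools when the JVM only sees 2 CPUs: 24 threads.
        System.out.println(threadsFromCpuSizedPools(2, 12));
    }
}
```

If setting CPU values on the pod is not an option, the JVM flag -XX:ActiveProcessorCount=<n> can be used to override the detected count.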
I believe that switching to the non-blocking stack (WebFlux) could solve your issue, should this be an option for you. If you rely on some blocking API (e.g. JDBC), you can publish it on a separate thread pool (e.g. Schedulers.elastic()). Thus, the HTTP request threads should always be available for processing the incoming traffic (including health checks), and the long-running, blocking operations would be processed in a dedicated thread pool. I believe that a similar effect should be possible using the asynchronous servlet API or anything that builds on top of it.
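The idea of moving blocking work off the request threads can be sketched without Reactor. The example below uses a plain JDK executor and CompletableFuture as a stand-in for a dedicated scheduler. slowQuery() is a hypothetical placeholder for a blocking JDBC call, and all names are made up for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class BlockingOffload {

    // Dedicated, bounded pool for blocking work. Threads are named so they
    // are easy to tell apart from request threads in a thread dump.
    static ExecutorService blockingPool(int size) {
        AtomicInteger counter = new AtomicInteger();
        return Executors.newFixedThreadPool(size, task -> {
            Thread t = new Thread(task, "blocking-" + counter.incrementAndGet());
            t.setDaemon(true);
            return t;
        });
    }

    // Hypothetical stand-in for a blocking call; returns the name of the
    // thread it ran on.
    static String slowQuery() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return Thread.currentThread().getName();
    }

    static String runOffloaded() {
        ExecutorService pool = blockingPool(4);
        try {
            CompletableFuture<String> result =
                    CompletableFuture.supplyAsync(BlockingOffload::slowQuery, pool);
            // The calling thread is free to do other work at this point.
            // join() is only here so the demo can return the value; a real
            // handler would return or compose the future instead.
            return result.join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("query ran on: " + runOffloaded());
    }
}
```

The blocking call runs on a "blocking-N" thread, so the caller's own thread pool is not tied up by it.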
If you are using Spring Boot >= 2.2, you can use the separate library spring-boot-async-health-indicator to run your healthchecks on a separate thread pool.
Simply annotate your HealthIndicator with @AsyncHealth:
@AsyncHealth
@Component
public class AsynchronousHealthCheck implements HealthIndicator {
    @Override
    public Health health() { // will be executed on a separate thread pool
        actualCheck();
        return Health.up().build();
    }
}
Disclaimer: I created this library for this exact purpose
This is probably a rather peculiar question. I am using Spring Boot 2.0.2 with the default Tomcat container. In order to set up a test in our QA environment that simulates many servers, I would like to set up a Spring Boot-based REST service that listens on a very large number of ports simultaneously. I'm able to do this using the technique previously described in another SO post (Configure Spring Boot with two ports), which basically adds connectors using TomcatServletWebServerFactory.addAdditionalTomcatConnectors().
The difficulty is that a large number of threads seem to be created for each additional port activated; some empirical measurements show the total to be 17 + (15 * number of ports). This means listening on 250 ports results in 3767 threads created and 500 ports results in 7517 threads created, and I would like to go somewhat beyond that number. The test program used to take the above measurements is the bare minimum to bring up a Spring service, and there is no code that creates threads explicitly, so, as far as I know, all of those threads were created by Spring/Tomcat.
Is there a way to accomplish this using Spring that doesn't use so many threads per active port? Would an alternate container like Jetty be more efficient?
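The measured formula from the question, 17 + (15 * number of ports), can be turned around to see how many ports fit into a given thread budget. The sketch below only encodes that empirical relationship. The constants come from the question's measurements, not from Tomcat documentation, and the class and method names are made up.

```java
class ConnectorThreads {

    static final int BASE_THREADS = 17;     // measured baseline from the question
    static final int THREADS_PER_PORT = 15; // measured per-port cost from the question

    // Estimated total threads for a given number of listening ports.
    static int estimatedThreads(int ports) {
        return BASE_THREADS + THREADS_PER_PORT * ports;
    }

    // Largest number of ports whose estimated thread count fits the budget.
    static int maxPortsFor(int threadBudget) {
        return Math.max(0, (threadBudget - BASE_THREADS) / THREADS_PER_PORT);
    }

    public static void main(String[] args) {
        System.out.println(estimatedThreads(250)); // prints 3767
        System.out.println(estimatedThreads(500)); // prints 7517
        System.out.println(maxPortsFor(10000));    // prints 665
    }
}
```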
You can configure the embedded Tomcat container in the Spring Boot configuration file and set appropriate values for these properties to limit the threads created by Tomcat:
# Maximum number of worker threads.
server.tomcat.max-threads=200
# Minimum number of worker threads.
server.tomcat.min-spare-threads=10
I am using Spring Boot 2.0.4.RELEASE. My question is whether my application is running in event-loop style or not. I am using Tomcat as my server.
I am running some performance tests on my application, and after a certain time I see strange behaviour. Once the load reaches 500 req/second, my application is not able to serve more than 500 req/second. Via Prometheus I was able to figure out that the max thread count for Tomcat was 200 by default. It looks like all the threads were consumed, and that's why it was not able to serve more than 500 req/second. Please correct me if I am wrong.
Can the Tomcat server run in event-loop style?
How can I change the event-loop size for the Tomcat server, if possible?
I tried changing it to Jetty, but I still have the same issue. I am wondering whether my application is running in event-loop style.
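A quick sanity check on the numbers in the question: if each request holds one worker thread for its whole duration (an assumption, but the usual case on the servlet stack), then throughput is capped at threads / average latency. The sketch below applies that relationship. The class and method names are made up for the example.

```java
class ThreadCeiling {

    // Upper bound on throughput when every request occupies one thread
    // for avgLatencyMillis on average.
    static double maxRequestsPerSecond(int threads, double avgLatencyMillis) {
        return threads * 1000.0 / avgLatencyMillis;
    }

    // Average latency implied by a pool of `threads` saturating at
    // `requestsPerSecond`.
    static double impliedLatencyMillis(int threads, double requestsPerSecond) {
        return threads * 1000.0 / requestsPerSecond;
    }

    public static void main(String[] args) {
        // 200 threads topping out at 500 req/s implies about 400 ms per request.
        System.out.println(impliedLatencyMillis(200, 500));
        // Conversely, 200 threads at 400 ms each cannot exceed 500 req/s.
        System.out.println(maxRequestsPerSecond(200, 400));
    }
}
```

So a ceiling of 500 req/second with 200 threads is consistent with requests that block for roughly 400 ms each, which fits the suspicion that the thread pool is the limit.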
Hey, I think you are doing something wrong in your project; maybe one of your dependencies does not support reactive programming. If you want to benefit from async (reactive) programming, your code must be 100% reactive; even for security you must use reactive Spring Security.
Normally a reactive Spring application runs on Netty, not on Tomcat, so check your dependencies: with the servlet stack on Tomcat, requests are handled thread-per-request rather than on an event loop.
This is more of an analysis. After running some performance tests on my local machine, I was able to figure out what was actually happening inside my application.
What I did was run a performance test on my local machine and analyse the application through JConsole.
As I said, I scheduled all my blocking DB calls on Schedulers.elastic. What I realised is that this was causing the bottleneck: since my DB connections are limited and I am using Hikari for connection pooling, it doesn't matter how many threads I create from the elastic pool.
Reactive programming is about using resources to the fullest with fewer threads; since the threads were being created in an unbounded way, it was no different from a normal application.
So, as the resolution, I limited the number of threads used for DB calls to 100. And bang, the number jumped from 500 tps to 2300 tps.
I know this is not the number one should expect from a reactive application; it is capable of much more. Right now I have no choice but to bear with non-reactive drivers. I am waiting for production-grade reactive drivers for MS SQL Server.
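The difference between an unbounded and a bounded pool for blocking calls can be shown with plain JDK executors. These are a stand-in for the Reactor schedulers mentioned above, not the actual Reactor API, and the class and method names are made up. Each task blocks until released, like a thread waiting for a DB connection, and the sketch reports how many threads each pool ended up creating.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedVsUnbounded {

    // Submits `tasks` jobs that all block until released, then returns the
    // largest number of threads the pool had alive at the same time.
    static int peakThreads(ThreadPoolExecutor pool, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // simulated wait for a DB connection
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return pool.getLargestPoolSize();
    }

    // Unbounded: a new thread for every task that finds no idle thread.
    static int peakUnbounded(int tasks) {
        return peakThreads(new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                60L, TimeUnit.SECONDS, new SynchronousQueue<>()), tasks);
    }

    // Bounded: at most `limit` threads; extra tasks wait in the queue.
    static int peakBounded(int tasks, int limit) {
        return peakThreads(new ThreadPoolExecutor(limit, limit,
                0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>()), tasks);
    }

    public static void main(String[] args) {
        System.out.println("unbounded: " + peakUnbounded(50));   // 50 threads
        System.out.println("bounded:   " + peakBounded(50, 10)); // 10 threads
    }
}
```

With a limited connection pool behind it, the extra threads in the unbounded case only add waiting and context switching. That matches the improvement seen after capping the pool.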
The response time of my Spring Boot REST service running on embedded Tomcat sometimes gets really high. I have isolated the external dependencies, and all of them are pretty quick.
At this point I think it has something to do with Tomcat's default thread pool size of 200, which it reserves for incoming requests to the service.
What I believe is that under heavy load (100 requests per second) all 200 threads are held up, and other requests are queued, which leads to higher response times.
Is there a definitive way to find out whether incoming requests are really getting queued? I have done extensive research in the Tomcat documentation and the Spring Boot embedded container documentation. Unfortunately, I don't see anything relevant.
Does anyone have any ideas on how to check this?