Spring Boot thread pool executor RestTemplate behavior when queueCapacity is 0 is decreasing performance for a REST API application

I am stuck with a strange problem and not able to find out its root cause. This is my RestTemplate thread pool executor configuration:
connectionRequestTimeout: 60000
connectTimeout: 60000
socketTimeout: 60000
responseTimeout: 60000
connectionpoolmax: 900
defaultMaxPerRoute: 20
corePoolSize: 10
maxPoolSize: 300
queueCapacity: 0
keepAliveSeconds: 1
allowCoreThreadTimeOut: true
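For reference, a rough sketch (assumed wiring, not the asker's actual code) of how settings like these are usually applied: the timeouts and connection limits go to the Apache HttpClient behind the RestTemplate, and the pool sizes go to a ThreadPoolTaskExecutor.
import org.apache.http.client.config.RequestConfig;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.springframework.http.client.HttpComponentsClientHttpRequestFactory;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import org.springframework.web.client.RestTemplate;

public class RestClientConfigSketch {

    // Timeouts and connection pool limits for the RestTemplate's HTTP client.
    RestTemplate restTemplate() {
        RequestConfig requestConfig = RequestConfig.custom()
                .setConnectionRequestTimeout(60000)
                .setConnectTimeout(60000)
                .setSocketTimeout(60000)
                .build();
        CloseableHttpClient httpClient = HttpClients.custom()
                .setDefaultRequestConfig(requestConfig)
                .setMaxConnTotal(900)       // connectionpoolmax
                .setMaxConnPerRoute(20)     // defaultMaxPerRoute
                .build();
        return new RestTemplate(new HttpComponentsClientHttpRequestFactory(httpClient));
    }

    // The executor the question is about: queueCapacity 0 means a SynchronousQueue.
    ThreadPoolTaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);
        executor.setMaxPoolSize(300);
        executor.setQueueCapacity(0);       // <= 0 makes ThreadPoolTaskExecutor use a SynchronousQueue
        executor.setKeepAliveSeconds(1);
        executor.setAllowCoreThreadTimeOut(true);
        executor.initialize();
        return executor;
    }
}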
1) I know that when queueCapacity is 0, the thread pool executor is going to create a SynchronousQueue. The first issue is that if I give it a positive integer value such as 50, application performance decreases. As per my understanding, we should only be using a SynchronousQueue in rare cases, not in a Spring Boot REST API based application like mine.
2) The second thing is, I want to understand how a SynchronousQueue works in a Spring Boot REST API application deployed on a server (Tomcat). I know a SynchronousQueue has zero capacity, so a producer blocks until a consumer is available or a thread is created. But who are the consumer and the producer in this case, given that all the requests are served by a web or application server? How will the SynchronousQueue basically work in this case?
I am checking the performance by running a JMeter script on my machine. The application can handle more load with queueCapacity 0 than with some value > 0.
I really appreciate any insight.

1) Don't set the queueCapacity explicitly; otherwise it is bound to degrade performance, since we would be limiting the number of incoming requests that can reside in the queue, and each of them is only taken up once one of the threads from the fixed thread pool becomes available.
ThreadPoolTaskExecutor has a default configuration of the core pool
size of 1, with unlimited max pool size and unlimited queue capacity.
https://docs.spring.io/spring-framework/docs/current/javadoc-api/org/springframework/scheduling/concurrent/ThreadPoolTaskExecutor.html
2) In a SynchronousQueue, pairs of insert and remove operations always occur simultaneously, so the queue never actually contains anything. It passes data synchronously to another thread; it waits for the other party to take the data instead of just putting the data and returning.
Read more:
https://javarevisited.blogspot.com/2014/06/synchronousqueue-example-in-java.html#ixzz6PFz4Akom
https://www.baeldung.com/thread-pool-java-and-guava
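To make the hand-off concrete, here is a small standalone sketch (not from the question) showing a SynchronousQueue blocking the producer until a consumer takes the element, and the equivalent behaviour inside a ThreadPoolExecutor:
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class SynchronousQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        SynchronousQueue<String> queue = new SynchronousQueue<>();

        // Consumer: take() blocks until a producer hands an element over.
        Thread consumer = new Thread(() -> {
            try {
                System.out.println("consumer received: " + queue.take());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // Producer: put() returns only once the consumer has taken the element.
        queue.put("request-1");
        consumer.join();

        // Inside a thread pool (this is what queueCapacity = 0 gives you), the same
        // hand-off means offer() fails unless a worker thread is already waiting,
        // so the pool creates a new thread (up to maxPoolSize) instead of queueing.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                10, 300, 1, TimeUnit.SECONDS, new SynchronousQueue<>());
        pool.execute(() -> System.out.println("handled by " + Thread.currentThread().getName()));
        pool.shutdown();
    }
}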
I hope my answer helps you in some way.

Related

Monitoring a frequently changing value using Micrometer and Prometheus, with restrictions

Background:
I have a Spring Boot application, and I want to monitor the max and average number of requests per minute.
Since the server assigns a thread to each request, I can observe the thread number. I use Micrometer to expose the metric and Prometheus to pull the metrics. I chose the Gauge type to track the thread number.
AtomicInteger concurrentNumber = meterRegistry.gauge("concurrent_thread_number", new AtomicInteger(0));
But in Prometheus, the gauge value remains 0 while I make lots of requests to the Spring Boot application.
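(For context, a gauge like this is presumably incremented and decremented around each request, roughly like the following sketch; the servlet filter is an assumption for illustration, not code from the question.)
import io.micrometer.core.instrument.MeterRegistry;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class ConcurrentRequestGaugeFilter implements Filter {

    private final AtomicInteger concurrentNumber;

    public ConcurrentRequestGaugeFilter(MeterRegistry meterRegistry) {
        // Same registration as in the question; the gauge reports whatever
        // the AtomicInteger holds at scrape time.
        this.concurrentNumber = meterRegistry.gauge("concurrent_thread_number", new AtomicInteger(0));
    }

    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        concurrentNumber.incrementAndGet();      // a request (thread) starts
        try {
            chain.doFilter(request, response);
        } finally {
            concurrentNumber.decrementAndGet();  // requests finish quickly, so it is usually back to 0 when Prometheus scrapes
        }
    }
}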
I have found the reason.
My application deals with requests pretty fast; it may finish processing many requests in 1 second. I set the Prometheus scrape_interval to 30s (because I don't want my machines to suffer a heavy load). So between scrapes the thread number changes from 0 to 1, 2, 3, 4, ... and finally back to 0, and the sampled values are all 0.
My Question:
I don't want to shorten the scrape_interval. Is there any trick to monitor the max and average number of requests per minute while the scrape_interval is 30s? Maybe choosing another type of metric instead of a gauge? Any advice would be appreciated.

What happens if I give a large value to server.tomcat.max-threads to handle load on my application?

There are around 1000+ jobs running through our service in a day, and around 70-80 jobs start at the same time and run in parallel.
To handle this, we figured that increasing the server.tomcat.max-threads property of our Spring application to a large number should work, but I am not fully confident about the side effects of setting a huge value like 800 for this property.
Can you please help here?
The default installation of Tomcat sets the maximum number of HTTP servicing threads at 200. Effectively, this means that the system can handle a maximum of 200 simultaneous HTTP requests. When the number of simultaneous HTTP requests exceeds this count, the unhandled requests are placed in a queue, and the requests in this queue are serviced as processing threads become available. This default queue length is 100. At these default settings, a large web load that can generate over 300 simultaneous requests will surpass the thread availability, resulting in service unavailable (HTTP 503).
More reference: https://docs.bmc.com/docs/brid91/en/tomcat-container-workload-configuration-825210082.html
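In a Spring Boot application (the question mentions server.tomcat.max-threads), the same connector attributes can also be set programmatically. A rough sketch, assuming the default NIO connector; the value 800 is taken from the question and the class name is made up:
import org.apache.coyote.http11.Http11NioProtocol;
import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;

public class TomcatThreadsCustomizer implements WebServerFactoryCustomizer<TomcatServletWebServerFactory> {

    @Override
    public void customize(TomcatServletWebServerFactory factory) {
        factory.addConnectorCustomizers(connector -> {
            Http11NioProtocol protocol = (Http11NioProtocol) connector.getProtocolHandler();
            protocol.setMaxThreads(800);                  // what server.tomcat.max-threads controls (default 200)
            connector.setProperty("acceptCount", "100");  // queue length once all threads are busy (default 100)
        });
    }
}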
How to run multiple servlet executions in parallel on Tomcat?
If this is a batch-job-like setup, you can use Spring Batch.

Creation of parallel threads for bulk request handling?

I have a REST service and want to handle almost 100 requests in parallel for this service. I have set the number of threads and the number of connections to create to 100 in my application.yml, but I did not see 100 connections created to handle requests.
Here is what I did in my application.yml:
server.tomcat.max-threads=100
server.tomcat.max-connections=100
I am using YourKit to see the internals, but when I start the application it creates only 10 connections to handle requests, and when I send multiple requests the count of request-handling threads does not increase either; it remains at 10. See the attachment I took from YourKit.
You're setting the maximum number of threads, not the minimum. Tomcat in this case has decided the minimum should be 10.
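If the goal is to have more worker threads alive up front, the relevant connector attribute is minSpareThreads (whose default of 10 matches what YourKit shows). A rough sketch using the same customizer pattern as above, assuming Spring Boot and the default NIO connector; the value 100 is only an illustration:
import org.apache.coyote.http11.Http11NioProtocol;
import org.springframework.boot.web.embedded.tomcat.TomcatServletWebServerFactory;
import org.springframework.boot.web.server.WebServerFactoryCustomizer;

public class MinSpareThreadsCustomizer implements WebServerFactoryCustomizer<TomcatServletWebServerFactory> {

    @Override
    public void customize(TomcatServletWebServerFactory factory) {
        factory.addConnectorCustomizers(connector -> {
            Http11NioProtocol protocol = (Http11NioProtocol) connector.getProtocolHandler();
            protocol.setMinSpareThreads(100); // keep 100 worker threads alive instead of the default 10
            protocol.setMaxThreads(100);      // matches server.tomcat.max-threads=100 from the question
        });
    }
}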

Spring Boot Kafka: Commit cannot be completed since the group has already rebalanced

Today, in my single-instance Spring Boot Kafka application, I faced the following issue:
org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot
be completed since the group has already rebalanced and assigned the
partitions to another member. This means that the time between
subsequent calls to poll() was longer than the configured
max.poll.interval.ms, which typically implies that the poll loop is
spending too much time message processing. You can address this either
by increasing the session timeout or by reducing the maximum size of
batches returned in poll() with max.poll.records.
What may be the reason for this, and how do I fix it? As far as I understand, my consumer was blocked for a long time and didn't respond to the heartbeat, and I should adjust Kafka properties in order to address it. Could you please tell me exactly which properties I should adjust, and where: on the Kafka side, or on my application's Spring Kafka side?
By default Kafka will return a batch of records as soon as fetch.min.bytes (default 1) of data is available, up to either max.poll.records (default 500) or fetch.max.bytes (default 52428800); otherwise it will wait up to fetch.max.wait.ms (default 500) before returning a batch of data. Your consumer is expected to do some work on that data and then call poll() again. Your consumer's work is expected to be completed within max.poll.interval.ms (default 300000, i.e. 5 minutes). If poll() is not called before this timeout expires, then the consumer is considered failed and the group will rebalance in order to reassign the partitions to another member.
So to fix your issue, reduce the number of messages returned per poll, or increase the max.poll.interval.ms property to avoid timing out and rebalancing.
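For example, a minimal sketch (the values are placeholders to be tuned to your actual processing time) of passing these two settings into the consumer configuration, e.g. the map handed to Spring Kafka's DefaultKafkaConsumerFactory:
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;

public class ConsumerTuningSketch {

    // Extra properties to merge into the consumer configuration.
    static Map<String, Object> pollTuning() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 100);         // fewer records per poll()
        props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 600000);  // allow 10 minutes between polls
        return props;
    }
}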

Performance tuning Tomcat for an 8-core system

I have a Tomcat application running on an 8-core system. I observed that when I changed the maxThreads count from 16 to 2, there was a dramatic improvement in performance at a throughput of 13 req/sec.
So I started printing the active thread count; it seems that when Tomcat's maxThreads was set to 2, the active threads averaged around 8, so basically 8 threads operating on 8 cores, the best possible outcome.
However, when I increased the throughput to 30-40 req/sec, I saw requests queueing up. So what happened here is that, with maxThreads of only 2, requests started piling up.
And when I then set maxThreads to a very high value like 10k, I saw the JVM taking long again due to context switching.
My question is: is there any property in Tomcat where I can specify how many requests are to be picked up for parallel processing in the JVM?
The acceptCount property won't help, because it only defines the threshold for requests queueing up.
There is another property called acceptorThreadCount, which is defined as the number of threads to be used to accept connections. Is this the property I need to tune, or is there some other property, or is there anything I am missing here?
According to the Connector documentation for maxThreads (I'm assuming that this is where you changed your maxThreads configuration):
The maximum number of request processing threads to be created by this
Connector, which therefore determines the maximum number of
simultaneous requests that can be handled. If not specified, this
attribute is set to 200. If an executor is associated with this
connector, this attribute is ignored as the connector will execute
tasks using the executor rather than an internal thread pool. Note
that if an executor is configured any value set for this attribute
will be recorded correctly but it will be reported (e.g. via JMX) as
-1 to make clear that it is not used.
There's no problem (quite the opposite) with setting the thread count higher than the number of available cores, as not every thread is working all the time (quite often they're waiting for external input, e.g. data from a database).
In case I've missed the point and you changed a different maxThreads configuration, please clarify. On the other hand, if your question is about the configuration that specifies how many requests are handled in parallel: Tomcat's default is 200, and it can be changed in the Connector's configuration (or, as the documentation says, with an Executor).
