task-max-threads value in WildFly 10.1 - performance

I want to support 7k requests per minute on my system. Considering there are network calls and database calls which might take around 4-5 seconds to complete, how should I configure task-max-threads and max connections to achieve that?

This is just math.
7k requests/minute is roughly 120 requests/second.
If each request takes 5 s, then you will have roughly 5 x 120 = 600 in-flight requests.
That's 600 HTTP connections, 600 threads and possibly 600 database connections.
These numbers are a little simplistic, but I think you get the picture.
Note that the standard Linux stack size for each thread is 8 MB, so 600 threads are going to want nearly 5 GB of memory just for the stacks. This is configurable at the OS level - but how do you size it safely?
So you're going to be in for some serious OS tuning if you're planning to run this on a single server instance.
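The arithmetic above is a back-of-the-envelope application of Little's Law (in-flight requests = arrival rate × average service time); a minimal sketch using the figures from the question:

```python
# Little's Law sizing sketch: concurrent requests = arrival rate * service time.
# The request rate and service time come from the question; the 8 MB stack
# size is the usual Linux per-thread default (ulimit -s).
requests_per_minute = 7000
avg_service_time_s = 5.0
stack_size_mb = 8

requests_per_second = requests_per_minute / 60            # ~117 req/s
inflight = requests_per_second * avg_service_time_s       # ~583 in-flight requests
stack_address_space_gb = inflight * stack_size_mb / 1024  # ~4.6 GB for stacks alone

print(f"~{requests_per_second:.0f} req/s, ~{inflight:.0f} threads/connections, "
      f"~{stack_address_space_gb:.1f} GB of stack address space")
```

Note that the 8 MB figure is reserved virtual address space, not memory that is necessarily resident, and JVM-managed threads typically use a smaller stack configured via -Xss.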

Related

Azure Table Increased Latency

I'm trying to create an app which can efficiently write data into Azure Table. In order to test storage performance, I created a simple console app which sends hardcoded entities in a loop. Each entity is 0.1 kB, and data is sent in batches (100 items per batch, 10 kB per batch). For every batch, I prepare entries with the same partition key, which is generated by incrementing a global counter - so I never send more than one request to the same partition. I also control the degree of parallelism by increasing/decreasing the number of threads. Each thread sends batches synchronously (no request overlapping).
If I use 1 thread, I see 5 requests per second (5 batches, 500 entities). At that time the Azure portal metrics show table latency below 100 ms - which is quite good.
If I increase the number of threads up to 12, I see a 12x increase in outgoing requests. This rate stays stable for a few minutes. But then, for some reason, I start being throttled: I see latency increase and the request rate drop.
Below you can see the account metrics - the highlighted point shows 2.31K transactions (batches) per minute, which is about 3,850 entities per second. If threads are increased up to 50, then latency increases up to 4 seconds, and the transaction rate drops to 700 requests per second.
According to the documentation, I should be able to send up to 20K transactions per second within one account (my test account is used only for this performance test). 20K batches would mean 2M entities. So the question is: why am I being throttled at under 4K entities per second?
Test details:
Azure Datacenter: West US 2.
My location: Los Angeles.
The app is written in C# and uses the CosmosDB.Table NuGet package with the following configuration: ServicePointManager.DefaultConnectionLimit = 250; Nagle's algorithm is disabled.
The host machine is quite powerful, with a 1 Gb internet link (i7, 8 cores; no high CPU or memory usage is observed during the test).
PS: I've read the docs:
The system's ability to handle a sudden burst of traffic to a partition is limited by the scalability of a single partition server until the load balancing operation kicks-in and rebalances the partition key range.
and waited for 30 mins, but the situation didn't change.
EDIT
I got a comment that E2E latency doesn't necessarily reflect a server-side problem.
So below is a new graph which shows not only the E2E latency but also the server's. As you can see, they are almost identical, which makes me think the source of the problem is not on the client side.
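For reference, the ideal (linear) scaling implied by the numbers above can be sketched as follows; the 5 batches/s per thread and 100 entities per batch are the single-thread figures observed in the test:

```python
# Ideal linear scaling of the test described above: each thread sends
# batches synchronously, so total rate = threads * per-thread rate.
BATCHES_PER_S_PER_THREAD = 5   # observed with a single thread
ENTITIES_PER_BATCH = 100

def entities_per_second(threads):
    return threads * BATCHES_PER_S_PER_THREAD * ENTITIES_PER_BATCH

print(entities_per_second(1))   # 500 entities/s, as observed
print(entities_per_second(12))  # 6000 entities/s in theory; throttling set in near ~3850
```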

JMeter TPS adjustment

Do we need to adjust the throughput reported by JMeter to find out the actual TPS of the system?
For example: I am getting 100 TPS with 250 concurrent users, and the test ran for 10 hours. Can I conclude that my software can handle 100 transactions per second, or do I need to make some adjustment to get a corrected value? I am asking because when the load starts, the system takes some time to reach an adequate performance level (warm-up time). If so, how do I account for this? Please help me to understand.
By default JMeter sends requests as fast as it can; the main factors affecting the TPS rate are:
number of threads (virtual users) - this you can define in the Thread Group
your application's response time - this is not something you can control
Ideally, when you increase the number of threads, TPS should increase by the same factor, i.e. if you have 250 users and are getting 100 TPS, you should get 200 TPS for 500 users. If this is not the case, those 500 users are beyond the saturation point and your application's bottleneck is somewhere between 250 and 500 users (if not earlier).
With regards to "warm-up" time: the recommended approach is to apply the load gradually. This way you allow your application to get prepared for the increasing load, warm up its caches, let the JIT compiler/optimizer do their work, etc. Moreover, this way you will be able to correlate the increasing load with increasing/decreasing throughput, response time, number of errors, etc., while 250 users released at once doesn't tell the full story. See:
The system warm-up period varies from one system to another. The warm-up period is where configurations are cached, different libraries are initialized (e.g. Builder.init()) and other initial functions run that usually don't happen for subsequent calls. If you study the results of the load test, there is a slow period at the very beginning. For most systems, it could be as small as 5 to 10 minutes. These values could even be negligible if the test is as long as 10 hours. But then again, the average calculation can be affected if the results give extremely low values at the start (it always depends on the jump from the initial warm-up period to normal operations).
As for the JMeter configuration, this thread may explain it: How to exclude warmup time from JMeter summary?
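The scaling rule described above (TPS should grow linearly with users until the saturation point) can be expressed as a quick sanity check; the 500-user figures below are hypothetical:

```python
def scaled_linearly(users_a, tps_a, users_b, tps_b, tolerance=0.9):
    """True if TPS at the higher load is within `tolerance` of the
    linear extrapolation from the lower load."""
    expected = tps_a * (users_b / users_a)
    return tps_b >= tolerance * expected

# 250 users -> 100 TPS (from the question); what would 500 users tell us?
print(scaled_linearly(250, 100, 500, 200))  # True: linear scaling, not yet saturated
print(scaled_linearly(250, 100, 500, 120))  # False: bottleneck between 250 and 500 users
```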

Is the throughput value related to the response time of requests in JMeter?

I'm getting the following results, where the throughput barely changes even when I increase the number of threads.
Scenario#1:
Number of threads: 10
Ramp-up period: 60
Throughput: 5.8/s
Avg: 4025
Scenario#2:
Number of threads: 20
Ramp-up period: 60
Throughput: 7.8/s
Avg: 5098
Scenario#3:
Number of threads: 40
Ramp-up period: 60
Throughput: 6.8/s
Avg: 4098
My JMeter test plan consists of a single Thread Group that contains a single GET request.
When I perform the request against an endpoint where the response time is faster (less than 300 ms), I can achieve a throughput greater than 50 requests per second.
Can you see where the bottleneck is?
Is there a relationship between response time and throughput?
It's simple; as the JMeter user manual states:
Throughput = (number of requests) / (total time)
Now, assuming your test contains only a single GET, throughput will correlate with the average response time of your requests.
Notice that Ramp-up period: 60 will create the threads over 1 minute, so it adds to the total execution time; you can try reducing it to 10, or to a value equal to the number of threads.
But you may have other samplers/controllers/components that affect the total time.
Also, in your case, especially in Scenario 3, some requests may have failed, in which case you are not calculating the throughput of successful transactions only.
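With a single sampler and no think time, throughput is also bounded above by (number of threads) / (average response time); a small sketch with hypothetical numbers:

```python
def max_throughput_per_s(threads, avg_response_ms):
    """Upper bound on requests/sec when each thread fires requests back to back."""
    return threads / (avg_response_ms / 1000.0)

# Hypothetical: 10 threads against a 500 ms endpoint can never exceed 20 req/s,
# regardless of ramp-up settings; a 4-second endpoint caps them at 2.5 req/s.
print(max_throughput_per_s(10, 500))   # 20.0
print(max_throughput_per_s(10, 4000))  # 2.5
```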
In an ideal world, if you increase the number of threads by a factor of 2, throughput should increase by the same factor.
In reality the "ideal" scenario is hardly achievable, so it looks like a bottleneck in your application. The process of identifying the bottleneck normally looks as follows:
Amend your test configuration to increase the load gradually, e.g. start with 1 virtual user and ramp up to 100 virtual users over 5 minutes.
Run your test and look into the Active Threads Over Time, Response Times Over Time and Server Hits Per Second listeners. This way you will be able to correlate the increasing load with increasing response time and identify the point where performance starts degrading. See What is the Relationship Between Users and Hits Per Second? for more information.
Once you figure out the saturation point, you need to find what prevents your application from serving more requests. The reasons could be:
The application simply lacks resources (CPU, RAM, network, disk, etc.); make sure to monitor these, which can be done using e.g. the JMeter PerfMon Plugin.
The infrastructure configuration is not suitable for high loads (e.g. incorrect application or database thread pool settings).
The problem is in your application code (inefficient algorithms, large objects, slow DB queries); these can be identified using a profiler.
Also make sure you're following JMeter Best Practices, as it might be the case that JMeter is not capable of sending requests fast enough, due to either a lack of resources on the JMeter load generator side or an incorrect JMeter configuration (too low a heap, running the test in GUI mode, using listeners, etc.).

WebSphere JDBC Connection Pool advice

I am having a hard time understanding what is happening in our WebSphere 7 on AIX environment. We have a JDBC data source that has a connection pool with a min/max of 1/10.
We are running a performance test with HP LoadRunner, and when the test finishes we gather the data for the JDBC connection pool.
The max pool size shows as 10, the average pool size shows as 9, and the percent used is 12%. With just this info, would you make any changes or keep things the same? The pool size grows from 1 to 9 during our test, but it says it's only 12% used overall. The final question: every time our test is in the last 15 minutes before stopping, we see an average wait time of 1.8 seconds and an average thread wait of 0.5 seconds, but the percent used is still 10%. FYI, the last 15 minutes of our test do not add additional users or load; it's steady.
Can anyone provide any clarity or recommendations on whether we should make any changes? Thanks!
First, I'm not an expert in this, so take this for whatever it's worth.
You're looking at WebSphere's PMI data, correct? PercentUsed is "Average percent of the pool that is in use." The pool size includes connections that were created, but not all of those will be in-use at any point in time. See FreePoolSize, "The number of free connections in the pool".
Based on just that, I'd say your pool is large enough for the load you gave it.
Your decreasing performance at the end of the test, though, does seem to indicate a performance bottleneck of some sort. Have you isolated it enough to know for certain that it's in database access? If so, can you tell if your database server, for instance, may be limiting things?
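As a rough cross-check (a sketch, not tied to WebSphere's PMI internals), Little's Law relates the average number of in-use connections to throughput and connection hold time, which is one way to judge whether a pool of 10 is plausibly large enough:

```python
def avg_connections_in_use(requests_per_s, hold_time_s):
    """Little's Law: average concurrently held connections =
    arrival rate * average time each connection is held."""
    return requests_per_s * hold_time_s

# Hypothetical numbers: 100 req/s, each holding a connection for 50 ms,
# needs ~5 connections on average - well inside a pool of 10.
print(avg_connections_in_use(100, 0.050))  # 5.0
```

If the measured hold time grows (e.g. the database slows down late in the test), the same arrival rate demands proportionally more connections, which would show up as wait time even at a modest PercentUsed.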

Maximum number of concurrent connections jBoss

We are currently developing a servlet that will stream large image files to clients. We are trying to determine how many JBoss nodes we would need in our cluster behind an Apache mod_jk load balancer. I know that it takes roughly 5000 milliseconds to serve a single request. I am trying to use the formula here http://people.apache.org/~mturk/docs/article/ftwai.html to figure out how many connections are possible, but I am having an issue because they don't explain each of the numbers in the formula. Specifically, they say that you should limit each server to 200 requests per CPU, but I don't know whether I should use that in the formula or not. Each server we are using will have 8 cores, so I think the formula should go either like this:
Concurrent Users = (500/5000) * 200 * 8 = 160 concurrent users
Or like this:
Concurrent Users = (500/5000) * (200 * 8) * 8 = ~1280 concurrent users
It makes a big difference which one they meant. Without an example in their documentation, it is hard to tell. Could anyone clarify?
Thanks in advance.
I guess these images aren't static, or you'd have stopped at this line?
First thing to ease the load from the Tomcat is to use the Web server for serving static content like images, etc.
Even if not, you've got larger issues than a factor of 8: the purpose of his formula is to determine how many concurrent connections you can handle without the AART (average application response time) exceeding 0.5 seconds. Your application takes 5 seconds to serve a single request. The formula, as you're applying it, is telling you that 9 women can produce a baby in one month.
If you agree that 0.5 seconds is the maximum acceptable AART, then you first have to be able to serve a single request in <=0.5 seconds.
Otherwise, you need to replace his value for maximum AART in ms (500) with yours (which must be greater than or equal to your actual AART).
Finally, as to the question of whether his CPU term should account for cores: it's going to vary depending on CPU & workload. If you're serving large images, you're probably IO-bound, not CPU-bound. You need to test.
Max out Tomcat's thread pools & add more load until you find the point where your AART degrades. That's your actual value for the second half of his equation. But at that point you can keep testing and see the actual value for "Concurrent Users" by determining when the AART exceeds your maximum.
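For what it's worth, the two readings in the question differ by exactly the core count; evaluating both with the question's numbers (this only shows the arithmetic, not which reading the article intended, and as noted above the formula only applies once your AART is at or below the 500 ms limit):

```python
AART_LIMIT_MS = 500   # the article's maximum acceptable average response time
AART_MS = 5000        # the questioner's measured time per request
PER_CPU_LIMIT = 200   # the article's suggested requests-per-CPU cap
CORES = 8

# Reading 1: multiply the per-CPU limit by the core count once.
reading_1 = (AART_LIMIT_MS / AART_MS) * PER_CPU_LIMIT * CORES            # 160.0
# Reading 2: treat (200 * cores) as the per-server limit, then multiply by cores again.
reading_2 = (AART_LIMIT_MS / AART_MS) * (PER_CPU_LIMIT * CORES) * CORES  # 1280.0
print(reading_1, reading_2)
```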