Improper results - Load test for java rest api application - jmeter

We are planning a load test for a REST API application written in Java, but the initial results are a bit confusing. After developing the script in JMeter:
1. We execute the script for 1, 2, 5, 10 and 25 vusers.
2. Each test runs for 30 minutes with a ramp-up of about 5 seconds.
3. Each request has a random think time of 2 to 3 seconds.
When this test is executed, we see that for a few APIs the 95th percentile response time for 2, 5 and 10 vusers is far lower than for 1 vuser. But the same test gives different results after a restart of Tomcat.
I am confused as to how the response time can decrease as the number of vusers increases.
Response time graphs, when tomcat instance is not restarted :
Response time graphs, when tomcat instance is restarted :

There is one Java runtime feature to consider: just-in-time (JIT) compilation. Java bytecode gets translated into native code once a method has been invoked often enough. The threshold is controllable via the -XX:CompileThreshold option (the documented default is 1,500 invocations for the client JVM and 10,000 for the server JVM, and the option is ignored when tiered compilation is enabled).
That could be the explanation for the situation you're facing: the Java runtime optimizes functions according to their usage, hence a function's execution time might decrease if you call it repeatedly.
Also, don't expect the response time for 2 virtual users to be twice as high as for 1 virtual user. The application might scale up to a certain extent: as you increase the load, throughput increases while response time remains the same.
At some point response time will start growing and throughput will go down; this is known as a performance bottleneck. However, the chance that you will hit the application's limits with 25 users is minimal on modern hardware.
So consider applying the following performance testing types:
Load testing: start with 1 user and gradually increase the load up to the anticipated number of virtual users, while watching throughput and response time. If you do not detect performance degradation as the number of users grows, you can report that the application is ready for production usage.
Stress testing: start with 1 user and gradually increase the load until response time starts growing or errors start occurring. This will tell you the maximum number of users the application can support and which component fails first.
Check that your API does not return 200 for invalid responses at scale.
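One way to check whether warm-up effects such as JIT compilation explain the difference is to compute the 95th percentile with and without the first part of each run. The following is only a minimal sketch: it assumes the results were exported as (seconds since test start, response time in ms) pairs, and the function names, the nearest-rank percentile method and the 60-second warm-up window are illustrative assumptions, not JMeter settings.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]

def p95_after_warmup(samples, warmup_s=60):
    """samples: iterable of (seconds_since_test_start, response_time_ms).

    warmup_s is an assumed cut-off; pick it from your own response time graph.
    """
    steady = [rt for t, rt in samples if t >= warmup_s]
    return percentile(steady, 95)

# Hypothetical data: slow during the first minute, fast afterwards.
samples = [(t, 900 if t < 60 else 120) for t in range(0, 600, 3)]
whole_run_p95 = percentile([rt for _, rt in samples], 95)
steady_p95 = p95_after_warmup(samples)
print(whole_run_p95, steady_p95)
```

If the two numbers differ a lot for the 1 vuser run but not for the later runs, the first run was probably paying the warm-up cost.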
Getting so high average response time in Jmeter

I am testing a scenario with 400 threads. Although I am getting almost no errors, the average response time is very high. What could cause this? It seems the server does not time out but responds very late. I've added the summary report. It is as follows:
This table doesn't tell the full story. If the response time seems "so high" to you, that is definitely the bottleneck and you can report it already.
What you can do to localize the problem is:
Consider using a longer ramp-up period, i.e. start with 1 user and add 1 more user every 5 seconds (adjust these numbers to your scenario), so that you have an arrival phase, a "plateau" and a load decrease phase. This approach lets you correlate increasing load with increasing response time by looking at the Active Threads Over Time and Response Times Over Time charts. This way you will be able to state that:
response time remains the same up to X concurrent users
after X concurrent users it starts growing so throughput is going down
after Z concurrent users response time exceeds acceptable threshold
It would also be good to see CPU, RAM, etc. usage on the server side as increased response time might be due to lack of resources, you can use JMeter PerfMon Plugin for this
Inspect your server configuration as you might need to tune it for high loads (same applies to JMeter, make sure to follow JMeter Best Practices)
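The X and Z values mentioned above can be read off the charts, or estimated from the numbers behind them. A minimal sketch, assuming you have already aggregated the results into (concurrent users, average response time in ms) pairs; the function name, the 20% tolerance and the 2000 ms threshold are hypothetical choices for illustration:

```python
def find_thresholds(results, tolerance=0.2, acceptable_ms=2000):
    """results: list of (concurrent_users, avg_response_ms), sorted by users.

    Returns (x, z):
      x = highest user count up to which response time stays within
          `tolerance` of the first (baseline) measurement
      z = first user count where response time exceeds acceptable_ms,
          or None if that never happens
    """
    baseline = results[0][1]
    x = results[0][0]
    for users, rt in results:
        if rt <= baseline * (1 + tolerance):
            x = users
        else:
            break
    z = next((users for users, rt in results if rt > acceptable_ms), None)
    return x, z

# Hypothetical measurements, not taken from the question.
results = [(50, 200), (100, 210), (150, 230), (200, 480), (300, 1500), (400, 3200)]
thresholds = find_thresholds(results)
print(thresholds)
```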
JMeter sending less requests than expected

I'm using JMeter for a performance test. To keep things short: I read the initial data from a JSON file. I have a single thread group in which, after reading the data, I randomize certain values to prevent data duplication where needed, and then pass the final data to the endpoint using variables. This ends up in a JSON body that is received by the endpoint, which basically generates a new transaction in the database. I also added a Constant Timer to add a 7-second delay between requests, with a test duration of 10 minutes and no ramp-up. I calculated the request rate like this:
One minute has 60 seconds and I have a delay of 7 seconds per request, so I'm sending approximately 8.5 requests per minute (60/7 = 8.5). If the test lasts 10 minutes, then 8.5*10 = 85, giving a total of 85 transactions in 10 minutes, so I should see that exact number of transactions created in the database after the test completes.
This is true when I'm running 10, 20 or 40 users: after the load test run I query the DB and get the exact same number of transactions. However, as I increase the users in the thread group this doesn't happen anymore. For example, if I set 1000 users I should be able to generate 8500 transactions in 10 minutes, but this is not the case; the DB only creates around 5.1k transactions.
What is happening, and what is wrong? Why does it initially work as expected but not as I increase the users? I can provide more information if needed. Please help.
There could be 2 possible reasons for this:
You discovered your application bottleneck. When you add more users, the application response time increases and therefore throughput decreases. There is a term called saturation point, which stands for the maximum performance of the system. If you go beyond this point, the system will respond slower and you will get fewer TPS than initially. On the application under test side you should look into the following areas:
It might be the case that your application simply lacks resources (CPU, RAM, network, etc.). Make sure that it has enough headroom to operate, using e.g. the JMeter PerfMon Plugin
Your application middleware (application server, database, load balancer, etc.) is not properly set up for high loads. Identify your application infrastructure stack and make sure to follow the performance tuning guidelines for each component
It is also possible that your application code needs optimization. You can detect the most time- and resource-consuming functions, largest objects, slowest DB queries, idle times, etc. using profiling tools
JMeter is not sending requests fast enough
Just like for the application under test check that JMeter machine(s) have enough resources (CPU, RAM, etc.)
Make sure to follow JMeter Best Practices
Consider going for Distributed Testing
Can you please check the CPU and memory utilization (RAM and Java heap) of the JMeter load generator while running JMeter with 1000 users? If it is high or reaching the maximum, it may affect requests/sec. Also, to confirm requests/sec from the JMeter side, can you please add a listener to the JMeter script to track hits/sec or TPS?
The expected figure (8.5K requests in a 10-minute test) will also be reached if your API response time is 1 second and you have provided enough ramp-up time for those 1000 users.
So the possible reasons are:
You did not provide enough ramp-up time for 1000 users.
Your API average response time is more than 1 second while you are running the test with 1000 users.
Possible workarounds:
First, try to measure the API response time for 1 user.
Then calculate how many users you need to reach 8500 requests in 10 minutes. Use this formula:
Number of users = TPS * max response time in seconds
Give proper ramp-up time for 1000 users. Check this thread to understand how you should calculate ramp-up time.
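The formula above is a form of Little's law. As a rough sketch of how it can be applied: the function name is hypothetical, and the optional think-time term is an assumption added here (for a test with timers, each user spends the pause as well as the response time on every iteration), not part of the formula as quoted.

```python
import math

def users_needed(target_tps, response_s, think_s=0.0):
    """Little's law for a closed workload: users = throughput * time per iteration.

    think_s is an added assumption covering timers or pauses between requests.
    """
    return math.ceil(target_tps * (response_s + think_s))

# Hypothetical targets, not the numbers from the question.
no_pause = users_needed(100, 0.5)          # users send requests back to back
with_pause = users_needed(100, 0.5, 2.0)   # each user also pauses 2 s per iteration
print(no_pause, with_pause)
```

The result is a lower bound: if response time grows under load, the same number of users will deliver fewer requests per second.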
Is it good, sample time increasing gradually along with increase in number of users

At the start of the script the sample time is low, and then it starts increasing as the load increases. Is this the correct way to do load testing for a website?
Please help: which is the correct way to do load testing for a website?
Not really. In an ideal world response time should remain the same as the load increases, like:
1 user - response time 1 second - throughput 1 request per second
100 users - response time 1 second - throughput 100 requests per second
200 users - response time 1 second - throughput 200 requests per second
The point at which response time starts increasing while throughput stops growing is called the saturation point - it is the maximum throughput your application can support.
The situation when response time starts increasing as you start more threads (virtual users) is known as the bottleneck, and the question is whether performance is still acceptable for the number of users defined in the NFR and/or SLA. If yes - you're good to go; if not - you need to report this issue (it would be beneficial if you could try to determine the reason for it).
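To make the ideal-versus-real comparison concrete, here is a small sketch that computes the ideal throughput from the table above and looks for the saturation point in measured data. The function names, the 5% minimum gain and the sample measurements are all assumptions made for illustration.

```python
def ideal_throughput(users, response_s):
    """Ideal closed-system throughput with no think time: users / response time."""
    return users / response_s

def saturation_point(measurements, min_gain=0.05):
    """measurements: list of (users, throughput_rps), sorted by users.

    Returns the user count after which adding users increases throughput
    by less than min_gain (5% by default), or None if it is still growing.
    """
    for (u1, t1), (u2, t2) in zip(measurements, measurements[1:]):
        if t2 < t1 * (1 + min_gain):
            return u1
    return None

# Hypothetical measurements: throughput flattens out around 200 users.
measured = [(1, 1.0), (100, 99.0), (200, 195.0), (300, 198.0), (400, 190.0)]
ideal_200 = ideal_throughput(200, 1.0)
saturated_at = saturation_point(measured)
print(ideal_200, saturated_at)
```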
JMeter JDBC database testing - Max Wait (ms)

What is the best practice for Max Wait (ms) value in JDBC Connection Configuration?
I am executing 2 types of tests:
20 loops for each number of threads - to get max Throughput
30min runtime for each number of Threads - to get Response time
With Max Wait = 10000ms I can execute JDBC request with 10,20,30,40,60 and 80 Threads without an error. With Max Wait = 20000ms I can go higher and execute with 100, 120, 140 Threads without an error. It seems to be logical behaviour.
Now question.
Can I increase Max Wait value as desired? Is it correct way how to get more test results?
Should I stop testing and do not increase number of Threads if any error occur in some Report? I got e.g. 0.06% errors from 10000 samples. Is this stop for my testing?
Everything depends on what your requirements are and how you defined performance baseline.
Can I increase Max Wait value as desired? Is it correct way how to get more test results?
If you are OK with higher response times and only need the functionality to work, you can set Max Wait as high as you want. In practice, though, there will be a threshold for response times (e.g. 2 seconds to perform a login transaction), which you define as part of your performance SLA or performance baseline. So, although you make your requests succeed by increasing Max Wait, they are eventually considered failed requests because the high response time crosses the threshold.
Note: Higher response times for DB operations eventually result in higher response times for web applications (or end users)
Should I stop testing and do not increase number of Threads if any error occur in some Report?
The same applies to error rates. If the SLA says that some % error rate is agreed, then the test meets the SLA or performance baseline if the actual error rate is less than that. E.g. if the requirements say 0% error rate, then 0.1% is also considered a failure.
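The pass/fail reasoning for both response time and error rate can be written down as a simple check. This is only a sketch: the function and parameter names are hypothetical, and the thresholds must come from your own SLA or baseline. The 6 failures out of 10000 samples correspond to the 0.06% from the question, and 2000 ms to the 2-second example above.

```python
def meets_sla(total_samples, failed_samples, p95_ms, max_error_pct, max_p95_ms):
    """Return True only if both the error rate and the response time are within limits."""
    error_pct = 100.0 * failed_samples / total_samples
    return error_pct <= max_error_pct and p95_ms <= max_p95_ms

# 6 failures out of 10000 samples is a 0.06% error rate.
strict = meets_sla(10000, 6, 1800, max_error_pct=0.0, max_p95_ms=2000)
relaxed = meets_sla(10000, 6, 1800, max_error_pct=0.1, max_p95_ms=2000)
# No errors at all, but responses are far slower than the agreed threshold.
slow = meets_sla(10000, 0, 9500, max_error_pct=0.1, max_p95_ms=2000)
print(strict, relaxed, slow)
```

The last case is the one a very large Max Wait tends to produce: every request succeeds, yet the run still fails on response time.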
Is this stop for my testing?
You can stop the test at whatever point you want. It depends entirely on which metrics you want to capture. To my knowledge, it is suggested to continue the test until it reaches a point where there is no point in continuing, like an error rate of 99%. If you are getting an error rate of 0.06%, then I suggest continuing the test to find the breaking point of the system, such as a server crash, response times reaching unacceptable values, memory issues, etc.
Following are some good references:
difference between baseline and benchmark in performance of an application
This setting maps to DBCP -> BasicDataSource -> maxWaitMillis parameter, according to the documentation:
The maximum number of milliseconds that the pool will wait (when there are no available connections) for a connection to be returned before throwing an exception, or -1 to wait indefinitely
It should match the relevant setting of your application database configuration. If your goal is to determine the maximum performance - just put -1 there and the timeout will be disabled.
In regards to "Is this stop for my testing?" - it depends on multiple factors, like what the application is doing, what you are trying to achieve and what type of testing is being conducted. If you test a database which orchestrates nuclear plant operation, then a zero error threshold is the only acceptable one. And if this is a picture gallery of cats, this error level can be considered acceptable.
In majority of cases performance testing is divided into several test executions like:
Load Testing - putting the system under anticipated load to see if it is capable of handling the forecasted number of users
Soak Testing - basically the same as Load Testing, but keeping the load for a prolonged duration. This makes it possible to detect e.g. memory leaks
Stress Testing - determining the boundaries of the application, saturation points, bottlenecks, etc. Start from zero load and gradually increase it until the application breaks, noting the maximum number of users, the correlation of other metrics like Response Time, Throughput, Error Rate, etc. with the increasing number of users, whether the application recovers when the load gets back to normal, etc.
Load testing using Visual Studio 2008 : troubles with analyzing results

I use Visual Studio Team System 2008 Team Suite for load testing my web application (it uses ASP.NET MVC).
Load pattern: Constant (this means I have a constant number of virtual users all the time).
I specify a configuration of 1000 users to analyze the performance of my web application under real stress conditions. I run the same load test multiple times while making some changes in my application.
But while analyzing the load test results I came across a strange dependency: when the average page response time becomes larger, the requests per second value increases too! And vice versa: when the average page response time is lower, the requests per second value is lower. This does not reproduce when the number of users is small (5-50 users).
How can you explain such results?
Perhaps there is a misunderstanding of the term Requests/Sec here. Requests/Sec, as I understand it, is just a representation of how many requests the test is pushing into the application (not the number of requests completed per second).
If you look at it that way, this might make sense.
High Requests/Sec will cause a higher Avg. Response Time (due to a bottleneck somewhere, i.e. CPU bound, memory bound or IO bound).
So as your Requests/Sec goes up and you have tons of objects in memory, the memory is under pressure, thus triggering Garbage Collection, which will slow down your Response Time.
Or as your Requests/Sec goes up and your CPU gets hammered, you might have to wait for CPU time, thus making your Response Time higher.
Or as your Requests/Sec goes up and your SQL is not tuned properly, blocking and deadlocking occur, thus making your Response Time higher.
These are just examples of why you might see this correlation. You might have to track it down further in terms of CPU, memory usage and IO (network, disk, SQL, etc.)
A few more details about the problem: we are load testing our rendering engine [NDjango][1] against the standard ASP.NET aspx. The web app we are using to load test is very basic - it consists of 2 static pages - no database, no heavy processing, just rendering. What we see is that in terms of avg response time aspx as expected is considerably faster, but to my surprise the number of requests per second as well as total number of requests for the duration of the test is much lower.
Leaving aside what we are testing against what, I agree with Jimmy, that higher request rate can clog the server in many ways. But it is my understanding that this would cause the response time to go up - right?
If the numbers we are getting really reflect what's happening on the server, I do not see how this rule can be broken. So for now the only explanation I have is that the numbers are skewed - something is wrong with the way we are configuring the tool.
[1]: NDjango
This is a normal result: as the number of users increases, you load the server with a higher number of requests per second. Any server will take longer to deal with more requests per second, meaning the average page response time increases.
Requests per second is a measure of the load being applied to the application, and average page response time is a measure of the application's performance, where a high number means a slow response.
You will be better off using a stepped number of users or a warm-up period where the load is applied gradually to the server.
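A stepped load is usually configured in the load testing tool itself, but the shape of the profile can be sketched independently of any tool. The helper below is hypothetical and only illustrates the staircase; the step size and step duration are assumed values.

```python
def stepped_schedule(max_users, step_users, step_duration_s):
    """Return (start_second, active_users) pairs for a staircase load profile."""
    schedule = []
    users = step_users
    start = 0
    while users < max_users:
        schedule.append((start, users))
        users += step_users
        start += step_duration_s
    # The final step is capped at max_users.
    schedule.append((start, max_users))
    return schedule

# Hypothetical profile: reach 1000 users in steps of 250, holding each step for 2 minutes.
schedule = stepped_schedule(1000, 250, 120)
print(schedule)
```

Holding each step long enough to collect stable numbers makes it possible to see at which user count the response time starts to climb, instead of applying all 1000 users at once.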
