How are server hits/second more than active thread count? | JMeter

I'm running a load test to test the throughput of a server by making HTTP requests through JMeter.
I'm using the Thread Stepper plugin that allows me to increase the number of threads I'm using to make the requests after a particular time period.
The following graphs show the number of active threads over time and the corresponding hits per second I was able to make.
The third graph shows the latencies of the requests, and the fourth shows the responses per second.
I'm not able to correlate the four graphs together.
In the server hits per second, I'm able to make a maximum of around 240 requests per second with only 50 active threads. However, the latency of the request is around 1 second.
My understanding is that a single thread would make a request, and then wait for the response to return before making the second request.
Since the minimum latency in my case is around 1 second, how is JMeter able to hit 240 requests per second with only 50 threads?
Server hits per second, max of 240 with only 50 threads. How?
Response latencies (minimum latency of 1 sec)
Active threads with time (50 threads when server hits are 240/sec)
Response per second (max of 300/sec, how?)

My expectation is that the reasons could be:
Response time is less than 1 second, therefore JMeter is able to send more than one request per second with every thread (see the sketch below).
It might also be connected with HTTP redirects and/or embedded resources processing, as per the plugin's documentation:
Hits includes child samples from transactions and embedded resources hits.
For example, a single HTTP Request with 1 user can result in 20 sub-samples, all of which are counted by the "Server Hits Per Second" plugin.
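A back-of-the-envelope sketch of these two points (only the 50 threads, ~1 s latency and 240 hits/s come from the question; everything else is illustrative):

```java
// Little's Law sketch: throughput ≈ active threads / response time,
// and the Server Hits Per Second plugin counts child samples too.
public class HitsSanityCheck {
    public static void main(String[] args) {
        int activeThreads = 50;
        double responseTimeSeconds = 1.0;    // minimum latency from the question
        double topLevelPerSecond = activeThreads / responseTimeSeconds; // at most ~50
        double observedHitsPerSecond = 240;
        // If only ~50 top-level requests/s are possible, 240 hits/s implies
        // roughly 240 / 50 ≈ 5 samples per request once redirects and
        // embedded resources are counted as child samples.
        System.out.printf("Implied samples per request: %.1f%n",
                observedHitsPerSecond / topLevelPerSecond);
    }
}
```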

I took some time analyzing the four graphs you provided, and the JMeter graphs seem to be plotted reasonably well (since you feel JMeter is plotting incorrectly, I will try to explain why the graphs look normal to me). Taking a clue from point 1 of @Dmitri T's answer, here is my analysis:
1. As @Dmitri T pointed out, responses are coming back faster than hits (requests) are being sent to the server. This can be seen in the number-of-responses-per-second graph: the first batch of hits goes out at between 50 and 70 per second over the first five minutes, while the responses for that set arrive at a noticeably faster rate of 60 to 90 per second. The same trend holds for the hits fired between minutes 5 and 10 (100 to 150 responses compared to 85 to 130 hits). Because of this continuing trend, the load generator can keep sending more and more hits with the same 50 active threads, which, coupled with the Thread Stepper plugin's stepping, produces the upward slope.
Hence the hits and responses graphs move in lockstep (marching in unison), with the responses graph having a steeper slope than the hits-per-second graph.
This upward trend continues until a queuing effect sets in at around minute 23, once the server's entire processing capacity is in use. From that point on, every graph shows the opposite of the behaviour it exhibited for the first 23 minutes.
Response latency (the time taken to get a response) increases from the 23rd minute onwards. At the same time there is a drop in hits per second, likely because the load generator does not have enough free threads to fire the next request: the threads (users) are stuck in the queue and have not yet finished their current request. This drop in requests in turn lowers the rate at which responses arrive, as the number-of-responses graph shows. Even so, the "service center" is still processing requests efficiently, sending responses back faster than requests arrive; in queuing-theory terms the service rate still exceeds the arrival rate, which reinforces point 1 of this analysis.
At a load of 60 users, something happens: queuing! (Confirm this by checking whether the response-time graph rises at the same time the throughput graph drops; if so, requests were piling up, i.e. queued, at the server.) This is the point where all the service centers are busy, hence the rise in response time, which keeps the user threads from generating new hits and causes the drop in hits per second.
The error codes seen in the number-of-responses-per-second graph, namely 400, 403, 500 and 504, all appear from the 10-user load onwards, which may indicate a time-bound or data issue (for example, the first 10 users in your CSV have proper data in the database and the rest don't).
Or it could be related to the "credit" and "debit" transactions, since the two may conflict with, or deadlock on, the same bank account.
If you look at the pattern of the error codes, they are most numerous where the volume of responses is highest, i.e. up to minute 23, and thin out once queuing reduces the response rate from the 23rd minute onwards; they are directly proportional to the response volume. The 504 (Gateway Timeout) errors are a sure sign that processing is taking long enough for the web server to time out, which means the load is high. So we can consider the load up to 80 users, i.e. at the 40th minute, a reasonable load-bearing capacity of the system. (Obviously, if more 504 errors are observed earlier, we should take that point as the unstressed load the system can handle.)
Important: check your hits-per-second graph configuration. Another observation is that the metering interval used to plot the graph may not match the expected scale, i.e. one second. You are expecting hits per second, but if the graph is configured to plot per 500 ms, i.e. half a second, the plot could read higher than expected, e.g. more than 50 hits for 50 users.

Related

JMeter Transactions Per Second do not represent actual requests processed per second

I am confused about which parameter is the right one for finding how many requests my service can handle in a second.
For example, according to the docs and this post, TPS (transactions/sec) is calculated based on the elapsed time of the request, which seems fair when you have one service instance. E.g. my elapsed time is 1 second, so my TPS is 1, which makes sense. But the calculation fails when I have 3 service instances (horizontally scaled): the elapsed time remains the same, yet I can now process 3 concurrent requests in that same second, which should ideally read back as 3 TPS, but it doesn't.
Q: So what is the right parameter in the JMeter report to check for this? Or is my theory wrong?
As per JMeter Glossary:
Throughput is calculated as requests/unit of time. The time is calculated from the start of the first sample to the end of the last sample. This includes any intervals between samples, as it is supposed to represent the load on the server.
The formula is: Throughput = (number of requests) / (total time).
And a request is something produced by a JMeter Sampler.
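As a quick illustration of the formula (all numbers made up), note that throughput is measured at the load generator, so it only scales with extra service instances if JMeter actually sends more concurrent requests:

```java
// Throughput = (number of requests) / (total time), per the JMeter Glossary.
public class ThroughputExample {
    public static void main(String[] args) {
        double requests = 300;           // samples executed in the test
        double totalTimeSeconds = 100;   // first sample start -> last sample end
        double throughput = requests / totalTimeSeconds;
        System.out.printf("Throughput: %.1f requests/second%n", throughput); // 3.0
        // With 1 thread and a 1 s elapsed time you get ~1 TPS no matter how
        // many backend instances exist; to observe 3 TPS you need ~3 threads.
    }
}
```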
If you're doing some scalability testing you can measure it as follows:
Run a stress test with 1 service instance, i.e. start with 1 user and gradually increase the load at the same time looking at TPS. At some point you will reach the stage where increasing the number of users won't result in increased TPS due to some bottleneck. Measure the number of users and the TPS just before the bottleneck hits you.
Re-run your test with 3 service instances; you should see that the number of users and the TPS before the bottleneck are higher now.

What is the difference between replenishRate and burstCapacity?

In the Redis implementation of the RequestRateLimiter, we must specify two properties redis-rate-limiter.replenishRate and redis-rate-limiter.burstCapacity as arguments for the RequestRateLimiter filter.
According to the documentation,
The redis-rate-limiter.replenishRate is how many requests per second
do you want a user to be allowed to do, without any dropped requests.
This is the rate that the token bucket is filled.
The redis-rate-limiter.burstCapacity is the maximum number of requests
a user is allowed to do in a single second. This is the number of
tokens the token bucket can hold. Setting this value to zero will
block all requests.
From what I see, replenishRate is the rate at which requests are allowed to be made, and burstCapacity is the maximum number of requests that can be made (both within one second).
However, I can't seem to understand the difference between the two in a practical scenario.
It's easier to grasp with different time units, e.g:
replenish rate: 1000 requests per minute
burst capacity: 500 requests per second
The former controls that you never get more than 1000 requests in a minute while the latter allows you to support temporary load peaks of up to 500 requests in the same second. You could have one 500 burst in second 0, another 500 burst in second 1 and you would've reached the rate limit (1000 requests within the same minute), so new requests in the following 58 seconds would be dropped.
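To make the two knobs concrete, here is a minimal token-bucket sketch. It is a simplified model of what the RequestRateLimiter implements (the real Redis version is a Lua script), and the class and field names are illustrative:

```java
// Simplified token bucket: burstCapacity caps the bucket size,
// replenishRate refills it continuously.
public class TokenBucket {
    private final double replenishRate;  // tokens added per second
    private final double burstCapacity;  // maximum tokens the bucket can hold
    private double tokens;
    private long lastRefillNanos;

    public TokenBucket(double replenishRate, double burstCapacity) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity;     // start with a full bucket
        this.lastRefillNanos = System.nanoTime();
    }

    public synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        double elapsed = (now - lastRefillNanos) / 1_000_000_000.0;
        // Refill at replenishRate, never above burstCapacity.
        tokens = Math.min(burstCapacity, tokens + elapsed * replenishRate);
        lastRefillNanos = now;
        if (tokens >= 1) {
            tokens -= 1;                 // one token per request
            return true;                 // request allowed
        }
        return false;                    // over the limit: drop (HTTP 429)
    }
}
```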
In the context of Spring Cloud Gateway (SCG) the documentation is kind of ambiguous (the rate limiter needs to be allowed some time...):
A steady rate is accomplished by setting the same value in
replenishRate and burstCapacity. Temporary bursts can be allowed by
setting burstCapacity higher than replenishRate. In this case, the
rate limiter needs to be allowed some time between bursts (according
to replenishRate), as two consecutive bursts will result in dropped
requests (HTTP 429 - Too Many Requests).
Extrapolating from the previous example I'd say that SCG works like this:
replenish rate: 1000 requests per second
burst capacity: 2000 requests per second
You are allowed a burst (peak) of 2000 requests within the same second (second 0). Since your replenish rate is 1000 requests per second, you have already consumed two cycles' allowance, so further requests are only let through as tokens are replenished, and a full 2000-request burst is only possible again roughly two seconds later.
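Tracing that scenario with the sketch above (again, a simplified model, not the exact SCG algorithm):

```java
// replenishRate = 1000 tokens/s, burstCapacity = 2000 tokens.
TokenBucket bucket = new TokenBucket(1000, 2000);
// Second 0: the full bucket absorbs a 2000-request peak (tryAcquire()
// succeeds until the tokens run out).
// Tokens then trickle back at 1000/s, so ~1000 requests can pass during
// second 1, and a full 2000-request burst is only possible again once the
// bucket has refilled, about 2 seconds after it was emptied. Requests
// arriving while the bucket is empty are rejected (HTTP 429).
```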

Is it good, sample time increasing gradually along with increase in number of users

At the start of the script the sample time is low, and then it starts increasing as the load increases. Is this expected behaviour when load testing a website?
Please help: what is the correct way to do load testing for a website?
Not really; in an ideal world response time should remain the same as the load increases, like:
1 user - response time 1 second - throughput 1 request per second
100 users - response time 1 second - throughput 100 requests per second
200 users - response time 1 second - throughput 200 requests per second
etc.
The point where adding more users no longer increases throughput, while response time is still flat, is called the saturation point: it is the maximum throughput your application can support.
The situation when response time starts increasing as you start more threads (virtual users) indicates a bottleneck, and the question is whether performance is still acceptable for the number of users defined in the NFR and/or SLA. If yes, you're good to go; if not, you need to report this issue (it would be beneficial if you could try to determine the reason for it).
The correct way of load testing a website is to simulate end-user activity as closely as possible, including the workload model. Remember to increase the load gradually; this way you will be able to correlate the increasing load with metrics like response time, throughput and the number of errors. It is also good to decrease the load gradually to see whether your website recovers when the load gets back to normal/zero.
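A tiny simulation of this behaviour (all numbers hypothetical): below the saturation point throughput scales with users; past it, throughput plateaus and, by Little's Law, response time grows instead:

```java
// Ideal scaling until the app's capacity is reached, queuing afterwards.
public class SaturationSketch {
    public static void main(String[] args) {
        double capacity = 150;             // max requests/s the app can serve
        double nominalResponseTime = 1.0;  // seconds, under light load
        for (int users = 50; users <= 250; users += 50) {
            double offered = users / nominalResponseTime;
            double throughput = Math.min(offered, capacity);
            double responseTime = users / throughput; // Little's Law: N = X * R
            System.out.printf("%3d users -> %.0f req/s, %.2f s response time%n",
                    users, throughput, responseTime);
        }
    }
}
```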

How Throughput and Response time are related

I ran a JMeter test with 193 samples,
where I could see an average response time of 5915 ms and a throughput of 1.19832.
I just want to know how exactly they are related.
All the answers are in JMeter Glossary
Elapsed time. JMeter measures the elapsed time from just before sending the request to just after the last response has been received.
Throughput is calculated as requests/unit of time. The time is calculated from the start of the first sample to the end of the last sample. This includes any intervals between samples, as it is supposed to represent the load on the server.
The formula is: Throughput = (number of requests) / (total time).
The relationship is: higher response time - lower throughput and vice versa.
You can use charts like Transactions per Second for throughput and Response Times Over Time for response times to get them plotted on your test timeline and Composite Graph to put them together. This way you will be able to track the trends.
All 3 charts can be installed using JMeter Plugins Manager
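For the test in question, the figures can be tied together like this (a sketch assuming the reported throughput of 1.19832 is per second; the unit isn't stated in the question):

```java
// Little's Law ties the figures together: concurrency = throughput * response time.
public class TestNumbers {
    public static void main(String[] args) {
        double samples = 193;
        double throughputPerSecond = 1.19832;        // assumed unit
        double avgResponseSeconds = 5915 / 1000.0;   // 5915 ms
        double totalTimeSeconds = samples / throughputPerSecond;              // ≈ 161 s
        double impliedConcurrency = throughputPerSecond * avgResponseSeconds; // ≈ 7 threads
        System.out.printf("Test duration ≈ %.0f s, implied active threads ≈ %.1f%n",
                totalTimeSeconds, impliedConcurrency);
    }
}
```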
TL;DR
No, but yes.
The two aren't directly related, but increasing throughput will probably affect server response time due to the load/stress on the server.
If there are timeout errors, response time will probably increase.
But for validation or firewall errors, response time will probably decrease.
There's a long explanation in the JMeter archives; the last one uses Disney to demonstrate:
Think of your last trip to Disney or your favorite amusement park. Let's define the capacity of the ride to be the number of people that can sit on the ride per turn (think roller coaster). Throughput will be the number of people that exit the ride per unit of time. Let's define service time to be the amount of time you get to sit on the ride. Let's define response time, or latency, to be your time queuing for the ride (dead time) plus the service time.
In terms of load/performance testing, throughput and response times are inversely proportional, i.e.:
With an increase in response time, throughput should decrease.
With an increase in throughput, response time should decrease.
You can get more detailed definitions in this blog:
https://nirajrules.wordpress.com/2009/09/17/measuring-performance-response-vs-latency-vs-throughput-vs-load-vs-scalability-vs-stress-vs-robustness/
Throughput increases to some extent and then remains stable once all the resources become busy. If user requests increase further at that point, response time grows. If the response-time increase is caused only by internal queuing, the system is still accepting more requests while response time rises, so throughput does not change; once the queues are full, additional requests should fail. If the response-time increase is due to a delay in processing or serving the request, for example a slow database query, then the system is not accepting more requests while response time rises, and consequently throughput drops.
Just a general explanation.
Response time: the time measured from when the user sends the request until the request finishes.
Throughput: a server property describing how many transactions or requests can be handled in a given amount of time. Here, 1.19832/minute means the server can handle 1.19832 samples per minute.
As response time increases, throughput decreases.

Why is the number of requests reduced when the number of threads is increased?

I have a test suite with many HTTP requests. Each HTTP request has a different number of threads, but all use 30 seconds as the ramp-up time.
Set 1:
Set 2:
The only difference between Set 1 and Set 2 is the number of threads: it is exactly doubled in Set 2. But you can see the total count is reduced. Why is this? I was expecting the number of requests to go up as well when the number of threads is increased.
Can someone please shed some light on this?
Your tables don't tell the full story and there could be multiple explanations, for example:
You increase the number of threads by a factor of 2
Your application becomes overloaded, hence response time increases
So, assuming the same test duration, JMeter is able to execute fewer requests, as each thread waits for the response to its previous request before sending a new one (see the sketch below)
So pay attention not only to the number of requests, but also check the response time for all the samplers and the correlation between the increased number of active users and response time by looking into e.g. the Response Times vs Threads and Transaction Throughput vs Threads charts.
Aforementioned graphs can be installed using JMeter Plugins Manager
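A rough model of why the total can drop (the durations and response times below are hypothetical):

```java
// Each thread is synchronous: total requests ≈ threads * duration / responseTime.
public class ThreadDoublingSketch {
    public static void main(String[] args) {
        double durationSeconds = 600;  // same test duration for both sets
        double set1 = 10 * durationSeconds / 1.0;  // 10 threads, 1.0 s responses
        double set2 = 20 * durationSeconds / 2.5;  // 20 threads, overloaded: 2.5 s responses
        System.out.printf("Set 1 ≈ %.0f requests, Set 2 ≈ %.0f requests%n", set1, set2);
        // Doubling threads more than doubled response time, so the total fell.
    }
}
```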
