Jmeter response time peaks every ~5 minutes - performance

I have an issue when running load tests using JMeter - the response peaks every ~5 minutes. These response time peaks are repeating in every run and for different processes or even single endpoints.
Below is the response time graph for one of the endpoints I am testing. The graph shows merged results of 4 different runs and the response time peaks are present in all of them - repeating every ~5 minutes.
The test configuration is 100 users, ramp-up time 3500s and thread duration 3600s.
Response time graph
This can also be observed in response time vs threads graph:
Response time vs threads
This looks like some JMeter misconfiguration, but I couldn't find any relevant info for such repeating peaks.

Re-run your test with:
Monitoring your operating system metrics like CPU, RAM, Network, Disk, etc. usage as it might be a third-party periodic activity which has an impact. The majority of operating systems have monitoring toolchains out of the box, if yours don't - you can consider using JMeter PerfMon Plugin
Doing the same for the JVM metrics using a tool like JVisualVM or equivalent (the aforementioned JMeter PerfMon Plugin can read the statistics via JMX) as the pattern you're getting looks like a full GC
Doing the same for the system under test as it might be the case JMeter works just fine and the problem is on the application side.
If you confirm that the problem is in JMeter:
Make sure to follow JMeter Best Practices
If you have followed them already and the issue is still there - you might have to switch to Distributed Testing

Related

Jmeter threads stuck during the load test

I am running a load test using JMeter with 200 users for approx 1hr. So, the observation is that a few threads are stuck even after the duration completes. Like 60 out of 200 get stuck. When I take the thread dump and observe that these threads are in a Runnable state. Any suggestions for resolving this issue? And I do not see anything meaningful from the JMeter log file.
You will find an unexpected increase in response time at the end of that time.
This is because of the thread's insufficient ramp-down time. Some of your threads were active and made requests to the server and didn't receive the response but threads were closed forcefully. If your JMeter test is stopped forcefully, all the active threads will be closed immediately. So the requests generated by those threads will get higher response time.
You can use Ultimate Thread Group for graceful shutdown time(ramp-down time) of threads just like the ramp-up time.
Here is an example setting:
This is not a normal behaviour for a JMeter test, most probably it indicates that either JMeter engine is overloaded (not properly configured for high loads) or the machine where JMeter is running is overloaded (i.e. lacks RAM and starts intensive swapping)
Make sure to follow JMeter Best Practices (run your test in non-GUI mode, remove all Listeners and test elements you don't need, increase JMeter heap size, etc.)
Make sure to monitor the essential health metrics of the machine where JMeter is running (CPU, RAM, Network and Disk IO, Swap file usage). You can use JMeter PerfMon Plugin for this if you don't have any better software
It might be the case you'll have to switch to Distributed Testing, 200 virtual users doesn't seem to be a "high" load to me, but it depends on what exactly these users are doing, if they're uploading/downloading large files it may be sufficient to cause the problems
Going forward consider adding the thread dump and jmeter log file contents to your question as it doesn't contain any clues so we can only come up with "blind shot" answers
You may want to check your HTTP timeouts.
I usually set Connect Timeout to 5000 milliseconds, and Response Time out to 30000.
Your values may vary for your specific environment/ application.
In this way, if things go bad on the server under test, all requests terminate within the timeout (with errors).
You have also to consider that, if you are retrieving an HTML page with all its embedded objects, and the web server is stuck, you need to wait for multiple timeouts to expire before the operation terminate.

How to determine if connection error is due to test or host machine

I have come across a situation where I can`t decide what scenario would be best. I have written my test in JMeter as follows:
I have one test plan that runs test in consecutive.
I have 4 thread groups and each thread group has the following properties:
No of threads: 8000
Ramp-up period: 60 sec
Loop count : 10
Same user on each iteration: true
I was having connection error, connection time out error.
So, to make it work , when I test from localhost (same machine), I have to enter a response time out of 1800000 ms, whereas when I do the same test on a remote server, I have to enter the response time out of 3600000 ms.
Can someone please advise :
Is it a good idea to include response time? Is there any other issue I should look for instead of including a response time? Is it an alert for other issue?
Can I improve the test without using response time?
First of all, a couple of questions to you:
Do you really think that the real user will really wait for 1 hour for getting the response from the application?
How did you come to this 8000 users?
Now recommendations:
Never have JMeter and the system under test on the same machine, they are both resource intensive and they will start struggling for CPU cycles, memory pages, etc. and you won't be able to tell whether JMeter is not capable of sending requests fast enough or your application cannot respond properly
Follow JMeter Best Practices
Although the number of virtual users you can simulate with JMeter is very high it's limited by the machine/operating system resources so make sure JMeter has enough headroom to operate in terms of CPU, RAM, Network and Disk IO. The metrics can be checked using JMeter PerfMon Plugin. Once you notice that any of monitored metrics start exceeding i.e. 80% of total available capacity - mention how many users are online and this is how many you can simulate from this machine for this test. If you need more - go for Distributed Testing
The same for the system under test, if you hit the resources consumption limits you need either to upgrade the hardware or to deploy your application in clustered mode (if it supports such a mode)

Apache Jmeter Concurrent Users Performance Testing

I want to test 400 Concurrency Users Which allow us to pass our load testing scenario as I am using below configuration setting in Apache JMeter which will through us lots of errors.
Number of Thread (Users): 400
Ramp-Up Time: 1
Loop Count: Forever Until ( Period of 1 minutes )
We are not telepathic enough to tell what's wrong with your setup without seeing the configuration and the nature of errors.
Several generic hints:
Run your test with 1-2 users/iterations to ensure it works fine and does what it is supposed to be doing. Check requests and responses details using View Results Tree listener
Make sure to run your test in command-line non-GUI mode and disable all the Listeners while your test is running.
It is better to increase and decrease the load gradually so consider using longer ramp-up time and increase test duration accordingly. I.e.
During the first minute virtual users arrive
They then hold the load for another minute
During the last minute virtual users leave
This way you will be able to tell what was the load when the errors started occurring, what is the maximum number of users your application can support, where is the saturation point, does it recover when the load gets back to normal, etc. See JMeter Ramp-Up - The Ultimate Guide article for more details.
It might be the case you found the bottleneck, i.e. your application fails to support 400 concurrent users, now you need to find the reason which may be in:
incorrect middleware configuration (wrong web server, database, load balancer settings)
your application simply lacks resources (CPU, RAM, Network, Swap, etc.). You can check this using JMeter PerfMon Plugin
if infrastructure configuration is OK and there is enough headroom for the application to operate most probably the reason is in the application code, you need to inspect what it is doing using APM or Profiler tools and report the issue.

The Average response time differs largely while load testing the Same API Endpoint with same load using Jmeter

I am using JMeter to test performance of an API Endpoint.
The Number of Threads(users) applied is 100.
When I execute the test for the very first time the average response time is 35345 ms.
for all the following tests with the same number of threads on the same API Endpoint the average response time is somewhere around 4705 ms.
what is the reason for such a big difference in average response time ?
Does JMeter cache any files on the first test and use the same cached files on all following tests ?
If yes how do I avoid this ?
I am new to JMeter any help in this regards will be much appreciated.
JMeter doesn't cache anything when it comes to API testing, it has HTTP Cache Manager which can represent browser cache when it comes to handling embedded resources like images, scripts and styles when it comes to web testing (mimicking HTTP requests send by real browsers) but it is not your case.
So my expectation is that it is something on your application under test side, i.e. it needs to "warm up" its own caches and first few requests are processed longer due to lazy initialization of components on the first access.
So my recommendations are:
Make sure you use reasonable ramp-up period so the load increases gradually, this way you will be able to correlate main metrics like response time, hits per second, etc. with the increased load
Monitor your application under test health from OS perspective, i.e. CPU, RAM, Swap consumption, etc. You can use JMeter PerfMon Plugin for that
Use profiling tools which are relevant for your application to detect "heavy" methods, long-running DB queries, etc.

Why difference in out when using Jmeter to load test vs HP Load runner?

Here is the scenario
We are load testing a web application. The application is deployed on two VM servers with a a hardware load balancer distributing the load.
There are tow tools used here
1. HP Load Runner (an expensive tool).
2. JMeter - free
JMeter was used by development team to test for a huge number of users. It also does not have any licensing limit like Load Runner.
How the tests are run ?
A URL is invoked with some parameters and web application reads the parameter , process results and generates a pdf file.
When running the test we found that for a load of 1000 users spread over period of 60 seconds, our application took 4 minutes to generate 1000 files.
Now when we pass the same url through JMeter, 1000 users with a ramp up time of 60 seconds,
application takes 1 minutes and 15 seconds to generate 1000 files.
I am baffled here as to why this huge difference in performance.
Load runner has rstat daemon installed on both servers.
Any clues ?
You really have four possibilities here:
You are measuring two different things. Check your timing record structure.
Your request and response information is different between the two tools. Check with Fiddler or Wireshark.
Your test environment initial conditions are different yielding different results. Test 101 stuff, but quite often overlooked in tracking down issues like this.
You have an overloaded load generator in your loadrunner environment which is causing all virtual users to slow. For example you may be logging everything resulting in your file system becoming a bottleneck for the test. Deliberately underload your generators, reduce your logging levels and watch how you are using memory for correlations so you don't create a physical memory oversubscribed condition which results in high swap activity.
As to the comment above as to JMETER being faster, I have benchmarked both and for very complex code the C based solution for Loadrunner is faster upon execution from iteration to iteration than the Java based solution in JMETER. (method: complex algorithm for creating data files on the fly for upload for batch mortgage processing. p3: 800Mhz. 2GB of RAM. LoadRunner 1.8 million iterations per hour ungoverned for a single user. JMETER, 1.2 million) Once you add in pacing it is the response time of the server which is determinate to both.
It should be noted that LoadRunner tracks its internal API time to directly address accusations of the tool influencing the test results. If you open the results set database set (.mdb or Microsoft SQL server instance as appropriate) and take a look at the [event meter] table you will find a reference for "Wasted Time." The definition for wasted time can be found in the LoadRunner documentation.
Most likely the culprit is in HOW the scripts are structured.
Things to consider:
Think / wait time: When recording,
Jmeter does not automatically put in
waits.
Items being requested: Is
Jmeter ONLY requesting/downloading
HTML pages while Load runner gets all
embedded files?
Invalid Responses:
are all 1000 Jmeter responses valid?
If you have 1000 threads from a
single desktop, I would suspect you
killed Jmeter and not all your
responses were valid.
Dont forget that the testing application itself measures itself, since the arrival of the response is based on the testing machine time. So from this perspective it could be the answer, that JMeter is simply faster.
The second thing to mention is the wait times mentioned by BlackGaff.
Always check results with result tree in jmeter.
And always put the testing application onto separate hardware to see real results, since testing application itself loads the server.

Resources