PerfMon Server Agent responsive for 2 hours only - jmeter-5.0

I am trying to run long soak tests (24 h) in JMeter while monitoring server CPU/RAM utilization, using the PerfMon Server Agent and plugin. Tests run headless, using a JMeter Docker image. Everything is set up and the job runs fine in Jenkins. Measurements are sent to the server every 10 minutes, and the test results are saved to a CSV file during the test.
However, although the test runs for 24 hours, the PerfMon agent seems to send data for only around 2 hours; that is how much data I can see in the CSV file. Whether the test runs for 5 hours or 24, only 2 hours of data are saved.
I wonder what causes this. Ideally I would like to see data saved for the whole duration of the test. I would appreciate any comments. Cheers

The issue was caused by a 'packet_write_wait: connection to 3.126.218.168 port 22: broken pipe' error, i.e. the SSH session was being dropped.
This was resolved by keeping the SSH session alive: edit ~/.ssh/config and set suitable values for ServerAliveInterval and ServerAliveCountMax under 'Host *'.
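The answer above does not say which values were used. As a minimal sketch, the helper below renders such a stanza and estimates how long an unresponsive server is tolerated before ssh gives up. The function names and the values 60 and 3 are illustrative assumptions, not the settings from the original fix.

```python
def keepalive_stanza(interval_s: int = 60, count_max: int = 3, host: str = "*") -> str:
    """Render an ~/.ssh/config stanza that sends keepalive probes.

    interval_s and count_max are placeholder values (assumptions);
    pick ones that suit your network.
    """
    return (
        f"Host {host}\n"
        f"    ServerAliveInterval {interval_s}\n"
        f"    ServerAliveCountMax {count_max}\n"
    )


def seconds_until_disconnect(interval_s: int, count_max: int) -> int:
    """Approximate time ssh waits on a silent server before disconnecting.

    ssh sends a probe after interval_s seconds without data and gives up
    once count_max probes in a row go unanswered.
    """
    return interval_s * count_max


stanza = keepalive_stanza()
print(stanza)
print("tolerated silence (s):", seconds_until_disconnect(60, 3))
```

The printed stanza can be pasted into ~/.ssh/config on the machine that opens the SSH connection.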

Related

Scheduler execution gets over/stopped before specified time

We are running JMeter scripts in Scheduler mode for 1 hour (one master, 4 slave machines), but the execution stops before the hour is up (e.g. it stops after 40 minutes).
Setup: JMeter version 5.1
Thread Group: Scheduler is checked with a Duration of 1 hour; Loop Count is set to Forever.
CSV config: Recycle on EOF is set to True, Stop thread on EOF is set to False. The CSV has 20 rows, and these rows were read multiple times during the 40 minutes.
--HTTP Sampler
Tried multiple times with different duration, still the same issue. No errors logged in jmeter.log file.
Referred below resource as well:
JMeter ignore Duration time when using Scheduler
Please suggest how to make it run for the full specified duration.
Unfortunately we're not telepathic enough to guess what's wrong without seeing the jmeter.log file from the master machine and the jmeter-server.log files from the slaves. The answer should be either in these or in the .jtl results file.
Your test configuration looks very good, just check 3 points:
Make sure that the OS time is synchronised on all the slaves and the master
Copy your CSV file to all the slaves
Make sure none of the Stop Thread/Stop Test/Stop Test Now radio buttons in the Thread Group is selected
Also note that, according to the article 9 Easy Solutions for a JMeter Load Test “Out of Memory” Failure, you should always use the latest version of JMeter. Consider upgrading to JMeter 5.2.1 (or whatever the latest stable version on the JMeter Downloads page is) at the next opportunity, since you may be hitting a bug that has already been fixed.

The JVM should have exited but did not

During distributed testing with JMeter 3.3 in non-GUI mode I'm getting the error below. How can I fix this?
I'm using the same version of JMeter and JDK on the master and slave machines.
The JVM should have exited but did not.
The following non-daemon threads are still running (DestroyJavaVM is OK):
Thread[main,5,main],
stackTrace:java.net.SocketInputStream#socketRead0
java.net.SocketInputStream#socketRead
java.net.SocketInputStream#read
java.net.SocketInputStream#read
java.io.BufferedInputStream#fill
java.io.BufferedInputStream#read
java.io.DataInputStream#readByte
sun.rmi.transport.StreamRemoteCall#executeCall
sun.rmi.server.UnicastRef#invoke
java.rmi.server.RemoteObjectInvocationHandler#invokeRemoteMethod
java.rmi.server.RemoteObjectInvocationHandler#invoke
com.sun.proxy.$Proxy19#rrunTest
org.apache.jmeter.engine.ClientJMeterEngine#runTest at line:149
org.apache.jmeter.engine.DistributedRunner#start at line:132
org.apache.jmeter.engine.DistributedRunner#start at line:149
org.apache.jmeter.JMeter#runNonGui at line:1005
org.apache.jmeter.JMeter#startNonGui at line:910
org.apache.jmeter.JMeter#start at line:538
sun.reflect.NativeMethodAccessorImpl#invoke0
sun.reflect.NativeMethodAccessorImpl#invoke
sun.reflect.DelegatingMethodAccessorImpl#invoke
java.lang.reflect.Method#invoke
org.apache.jmeter.NewDriver#main at line:248
I strongly recommend using this jmeter property:
jmeterengine.force.system.exit=true
It is documented here. These Chinese-language web pages (link, link) tipped me off.
You can add -Jjmeterengine.force.system.exit=true on the command line when launching JMeter, or add jmeterengine.force.system.exit=true to JMETER_HOME/bin/jmeter.properties.
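If JMeter is launched from a script (for example a CI job), the -J option can be assembled programmatically. The sketch below only builds the argument list; the helper name build_jmeter_command and the file names plan.jmx and results.jtl are hypothetical.

```python
from typing import Dict, List, Optional


def build_jmeter_command(
    test_plan: str,
    results_file: Optional[str] = None,
    properties: Optional[Dict[str, str]] = None,
    executable: str = "jmeter",
) -> List[str]:
    """Build a non-GUI JMeter command line with -J property overrides."""
    cmd = [executable, "-n", "-t", test_plan]  # -n: non-GUI, -t: test plan
    if results_file:
        cmd += ["-l", results_file]            # -l: results (JTL) file
    for name, value in (properties or {}).items():
        cmd.append(f"-J{name}={value}")        # -J: set a JMeter property
    return cmd


cmd = build_jmeter_command(
    "plan.jmx",      # hypothetical test plan
    "results.jtl",   # hypothetical results file
    {"jmeterengine.force.system.exit": "true"},
)
print(" ".join(cmd))
# prints: jmeter -n -t plan.jmx -l results.jtl -Jjmeterengine.force.system.exit=true
```

The resulting list can be handed to subprocess.run(cmd, check=True) on a machine where JMeter is on the PATH.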
How I Confirmed This Fix
With JMeter 5.1 and java version "1.8.0_231" on MS-Win10, we're using a customized version of this JMeter InfluxDB backend Listener.
After my 60-second test run from the command line (jmeter.bat -n -t plan.jtl), the command line hung after displaying this output (very similar to the OP's):
Tidying up ... # Wed Jan 29 14:41:04 CST 2020 (1580330464874)
... end of run
The JVM should have exited but did not.
The following non-daemon threads are still running (DestroyJavaVM is OK):
Thread[DestroyJavaVM,5,main], stackTrace:
Thread[pool-2-thread-3,5,main], stackTrace:sun.misc.Unsafe#park
java.util.concurrent.locks.LockSupport#parkNanos
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#awaitNanos
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ThreadPoolExecutor#getTask
java.util.concurrent.ThreadPoolExecutor#runWorker
java.util.concurrent.ThreadPoolExecutor$Worker#run
java.lang.Thread#run
Thread[pool-2-thread-4,5,main], stackTrace:sun.misc.Unsafe#park
java.util.concurrent.locks.LockSupport#parkNanos
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#awaitNanos
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ThreadPoolExecutor#getTask
java.util.concurrent.ThreadPoolExecutor#runWorker
java.util.concurrent.ThreadPoolExecutor$Worker#run
java.lang.Thread#run
Thread[pool-2-thread-1,5,main], stackTrace:sun.misc.Unsafe#park
java.util.concurrent.locks.LockSupport#parkNanos
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject#awaitNanos
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue#take
java.util.concurrent.ThreadPoolExecutor#getTask
java.util.concurrent.ThreadPoolExecutor#runWorker
java.util.concurrent.ThreadPoolExecutor$Worker#run
java.lang.Thread#run
After modifying my command line as follows, jmeter.bat exited cleanly instead of hanging, and the ugly stack trace went away too:
jmeter.bat -n -Jjmeterengine.force.system.exit=true -t plan.jtl
To confirm that the problem was caused by our customized JMeter InfluxDB backend Listener, I removed it from the .jmx and I also removed the jmeterengine.force.system.exit=true. No hang, no ugly stacktrace (I actually love stacktraces).
I have not taken the next step to discover whether the problem is with the official JMeter InfluxDB backend Listener or with our customized variant, which is not (and will never be) available publicly.
I should mention one gap in this story. I feel this test conclusively points to our customized backend listener (or JMeter's). However, it's odd that none of the threads in the above thread dump seem to belong to the backend listener. So I applaud JMeter for doing the right thing by dumping the stack trace; few other apps go to the extent of auto-dumping when appropriate for troubleshooting. But in this case, perhaps that JMeter auto-dump code needs to be enhanced, because it did not point to the culprit backend listener code. Anyone over there at Apache listening in on this?
Good luck.
Most probably your JMeter engine(s) are overloaded and therefore cannot gracefully shut down running threads when you ask them to.
Make sure you follow JMeter Best Practices
The very first "best practice" says to always use the latest version of JMeter, so consider migrating to JMeter 5.0 or whatever latest version is available at the JMeter Downloads page
Make sure your JMeter instances have enough headroom to operate in terms of CPU, RAM and so on. You can use the JMeter PerfMon Plugin for this if you don't have other monitoring software in place/in mind.
Take a thread dump and examine it; this way you will know exactly where your test is stuck
Introduce reasonable timeout values in HTTP Request Defaults so that when the server fails to respond, JMeter fails with an error rather than waiting indefinitely
And finally (although I wouldn't recommend this) you can suppress this check by adding the following line to the user.properties file:
jmeter.exit.check.pause=-1
If you go for this, keep in mind that JMeter slaves may still be trying to execute something even after your test ends, so you will need to kill and restart the processes manually or with a script.
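When examining the threads JMeter reports as still running, it can help to pull the thread names and their top frames out of the console output. The sketch below assumes the output has the shape quoted earlier in this thread ("Thread[name,priority,group], stackTrace:" followed by one frame per line); the function name is hypothetical and the parser is not part of JMeter.

```python
import re
from typing import Dict, List

# Header line as printed by JMeter's exit check, e.g.
# "Thread[pool-2-thread-3,5,main], stackTrace:sun.misc.Unsafe#park"
THREAD_HEADER = re.compile(
    r"^Thread\[(?P<name>.+),(?P<priority>\d+),(?P<group>[^\]]*)\], stackTrace:(?P<frame>.*)$"
)


def parse_leftover_threads(output: str) -> Dict[str, List[str]]:
    """Map each still-running thread name to its list of stack frames."""
    threads: Dict[str, List[str]] = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        match = THREAD_HEADER.match(line)
        if match:
            current = match.group("name")
            threads[current] = []
            if match.group("frame"):  # first frame shares the header line
                threads[current].append(match.group("frame"))
        elif current is not None and "#" in line:
            threads[current].append(line)  # frames look like Class#method
    threads.pop("DestroyJavaVM", None)  # JMeter itself says this one is OK
    return threads


SAMPLE = """\
The JVM should have exited but did not.
The following non-daemon threads are still running (DestroyJavaVM is OK):
Thread[DestroyJavaVM,5,main], stackTrace:
Thread[pool-2-thread-3,5,main], stackTrace:sun.misc.Unsafe#park
java.util.concurrent.locks.LockSupport#parkNanos
java.lang.Thread#run
"""

leftovers = parse_leftover_threads(SAMPLE)
for name, frames in leftovers.items():
    print(name, "->", frames[0] if frames else "(no frames)")
```

Threads parked in an executor, as in the sample, usually belong to a thread pool that was never shut down; the frames alone may not name the component that created it.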

Ubuntu + Jmeter : Execution results are not showing in console while running test in Non GUI mode (Distributed Test)

I am running a distributed test from an Ubuntu machine using JMeter. When I run the test from the master machine, the result details (active threads, average response time) are not shown in the console.
I tried adding the "Console Status Logger" listener, but it still does not show the results.
Possible reasons for that are the following:
You don't have enough results yet, see doc, properties time_threshold, num_sample_threshold
You have a connectivity issue between slaves and master, see this doc and this one. This might be due to a Firewall between those components, ensure you open required ports.

JMeter distributed testing - how to get aggregated report?

I have three slaves (jmeter-servers) running on EC2 instances, and in one case – (1) JMeter GUI on a local laptop, on another – same test plan (2) running from a command line on yet another EC2 instance.
In case of GUI I can see all the aggregated numbers for Throughput, 99%, etc. in – well, GUI. I'm creating a jtl file with Aggregate Report listener.
From watching Datadog charts monitoring the application server parameters (CPU usage, memory, etc.) I see that when everything runs on EC2 from the command line, the load is more than twice as high as when my local laptop is communicating with the jmeter-servers, which probably means that the network becomes a bottleneck. So I want to run everything on EC2.
But then – how do I get access to the same aggregated numbers when I'm running from the command line when all four machines are EC2 instances? The huge jtl file contains records for each transaction, not the aggregated one line of the entire run result.
When I download that jtl from EC2 and try to open it in the GUI on my local laptop, it produces an error instead of showing aggregated data.
Am I using a wrong listener to get to the summary data? (Tried Summary Report – it creates even larger jtl file, not the one line I'm looking for.)
The problem in this case is not that the scripts are run via the JMeter GUI; it is related to the network.
I had a similar distributed setup in an EC2 environment and successfully executed heavy load tests in GUI mode. In my case, all my JMeter instances (master and slaves) were running on EC2 instances (Windows environment). So I recommend setting up your JMeter master on EC2 and running the scripts in GUI mode.
If you still want to run in command-line mode, simply pass the option that creates a JTL file while the script runs. Later you can use this JTL to generate any JMeter report you need. For more details, see:
Jmeter - Run .jmx file through command line and get the summary report in a excel
jmeter -n -t /path/to/your/test.jmx -l /path/to/results/file.jtl
Please refer to Dmitri's answer to the following question for ways to reduce the JTL size.
How can we control size of JTL file while running test from Non GUI Mode
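If the GUI cannot open a large JTL, the one-line summary can also be computed directly from the file. The sketch below assumes a CSV-format JTL with a header row containing at least timeStamp, elapsed and success, and that timeStamp marks the start of each sample. The function name is hypothetical, and the nearest-rank percentile used here may differ slightly from the figure JMeter's Aggregate Report shows.

```python
import csv
import io
import math
from typing import Dict, TextIO


def aggregate_jtl(stream: TextIO) -> Dict[str, float]:
    """Reduce a CSV JTL to one line of aggregate numbers."""
    elapsed = []
    errors = 0
    first_start = None
    last_end = None
    for row in csv.DictReader(stream):
        ms = int(row["elapsed"])
        start = int(row["timeStamp"])  # assumed to be the sample start time
        elapsed.append(ms)
        if row["success"].strip().lower() != "true":
            errors += 1
        first_start = start if first_start is None else min(first_start, start)
        end = start + ms
        last_end = end if last_end is None else max(last_end, end)
    if not elapsed:
        raise ValueError("no samples found in JTL")
    elapsed.sort()
    rank = max(1, math.ceil(0.99 * len(elapsed)))  # nearest-rank 99th percentile
    duration_s = max((last_end - first_start) / 1000.0, 0.001)
    return {
        "samples": len(elapsed),
        "average_ms": sum(elapsed) / len(elapsed),
        "p99_ms": elapsed[rank - 1],
        "error_pct": 100.0 * errors / len(elapsed),
        "throughput_per_s": len(elapsed) / duration_s,
    }


# Tiny made-up JTL, for illustration only.
SAMPLE = """\
timeStamp,elapsed,label,responseCode,success
1000,100,home,200,true
2000,300,home,200,true
3000,200,home,500,false
4000,400,home,200,true
"""

stats = aggregate_jtl(io.StringIO(SAMPLE))
print(stats)
```

For a real run, pass an open file instead of the in-memory sample, e.g. with open("/path/to/results/file.jtl", newline="") as f: aggregate_jtl(f). JMeter can also build an HTML dashboard from an existing JTL with jmeter -g results.jtl -o report_dir.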

Jenkins+Yandex-tank+Jmeter and hanged jobs

I am using Jenkins CI to automate load testing with yandex-tank + JMeter. I am using distributed testing and starting 10k threads in total. My problem is that the test does not finish when it should because, I think, some threads on the remote machines are stuck.
Also, I tried to use these settings in jmeter.properties file:
jmeterengine.threadstop.wait=1000
jmeterengine.remote.system.exit=true
jmeterengine.stopfail.system.exit=true
jmeterengine.force.system.exit=true
jmeter.exit.check.pause=1000
But it does not help. Is there another way to force JMeter to stop without killing the Java process?
I am having a similar problem. I tried the Jenkins option:
Abort the build if it's stuck, Strategy: Absolute, 120 minutes
It didn't help me, but it may be worth a try.
