JMeter: with distributed testing, execution starts after a huge delay

I am doing distributed testing in JMeter.
I have 2 slave machines and 1 master machine. When I run the script in the distributed environment, the command prompt on both slave machines shows (Starting test on host....).
But the execution starts only after a long time. Sometimes the delay is so long that I have to stop the test.
Note: I am also using the PerfMon Metrics Collector.
JMeter log:
2014/07/17 12:42:42 INFO - jmeter.samplers.BatchSampleSender: Using batching (client settings) for this run. Thresholds: num=100, time=60000
2014/07/17 12:42:42 INFO - jmeter.samplers.DataStrippingSampleSender: Using DataStrippingSampleSender for this run
Please let me know if I need to add any more info

Related

Not able to get the NiFi processor group status

I am new to NiFi. I have defined a processor group in NiFi and started the dataflow.
From the backend, how can I check whether the processor group is running or not?
I tried
/bin/nifi.sh status
But it only reports whether NiFi as a whole is running or not.
You can't see the status of a process group, because process groups do not have a status. NiFi just writes an "added" log entry like this:
2021-04-09 13:26:44,766 INFO [main] o.a.nifi.groups.StandardProcessGroup StandardProcessGroup[identifier=feffff20-c806-305a-5d38-2b8def09bebe] added to StandardProcessGroup[identifier=1be26a7f-0175-1000-6d70-e5784c0dde33]
You can see the IDs in the 'Operate' table on the right side of the canvas.
Processor-level log entries, on the other hand, can be seen:
2021-04-09 13:40:59,290 INFO [Timer-Driven Process Thread-2] o.a.n.c.s.TimerDrivenSchedulingAgent Scheduled QuickFixInitiator[id=31ee54ea-5043-3415-6f6e-4b8df429188f] to run with 3 threads
2021-04-09 13:45:31,164 INFO [Timer-Driven Process Thread-2] o.a.n.c.s.TimerDrivenSchedulingAgent Stopped scheduling QuickFixInitiator[id=31ee54ea-5043-3415-6f6e-4b8df429188f] to run
The processor's state is also available via the REST API:
http://localhost:8080/nifi-api/processors/31ee54ea-5043-3415-6f6e-4b8df429188f
The NiFi REST API documentation is hard to follow. All UI requests go through the REST API, so the best way to learn it is to watch the UI's requests in the browser's developer console, under the Network tab.
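As a rough illustration of the REST approach, here is a minimal Python sketch. It assumes an unsecured NiFi at http://localhost:8080, that a processor entity carries its scheduled state under component.state, and that a process group entity (fetched from process-groups/<id>) exposes runningCount, stoppedCount, invalidCount and disabledCount. The helper names are made up for this example, so check the field names against the responses you see in the Network tab.

```python
import json
import urllib.request

# Assumption: an unsecured NiFi instance at this address, as in the URL above.
NIFI_API = "http://localhost:8080/nifi-api"


def fetch_entity(path, base_url=NIFI_API):
    """GET a NiFi REST resource such as 'processors/<id>' and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/{path}") as response:
        return json.load(response)


def processor_state(entity):
    """Return the scheduled state of a processor entity.

    Assumption: the state is stored under component.state (e.g. RUNNING, STOPPED).
    """
    return entity.get("component", {}).get("state", "UNKNOWN")


def group_counts(entity):
    """Collect the per-state component counters of a process group entity.

    Assumption: the entity exposes runningCount, stoppedCount, invalidCount and
    disabledCount; counters that are missing are reported as 0.
    """
    keys = ("runningCount", "stoppedCount", "invalidCount", "disabledCount")
    return {key: entity.get(key, 0) for key in keys}
```

With NiFi running, processor_state(fetch_entity("processors/31ee54ea-5043-3415-6f6e-4b8df429188f")) would report the processor from the log above, and group_counts(fetch_entity("process-groups/feffff20-c806-305a-5d38-2b8def09bebe")) would summarise how many components in the group are running or stopped.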

Apache Flink on Windows

First, I am a complete newbie with Flink. I have installed Apache Flink on Windows.
I start Flink with start-cluster.bat. It prints out
Starting a local cluster with one JobManager process and one
TaskManager process. You can terminate the processes via CTRL-C in the
spawned shell windows. Web interface by default on
http://localhost:8081/.
However, when I submit the job, I get a bunch of messages like this:
DEBUG org.apache.flink.runtime.rest.RestClient - Received response
{"status":{"id":"IN_PROGRESS"}}.
In the log in the web UI at http://localhost:8081/, I see:
2019-02-15 16:04:23.571 [flink-akka.actor.default-dispatcher-4] WARN
akka.remote.ReliableDeliverySupervisor
flink-akka.remote.default-remote-dispatcher-6 - Association with
remote system [akka.tcp://flink@127.0.0.1:56404] has failed, address
is now gated for [50] ms. Reason: [Disassociated]
If I go to the Task Manager tab, it is empty.
I checked whether any port needed by Flink was already in use, but that does not seem to be the case.
Any idea how to solve this?
I ran Flink locally from IntelliJ, using the Maven archetype that gives you ready-to-go examples:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/projectsetup/java_api_quickstart.html
You do not necessarily have to install Flink unless you are running it as a service on a cluster.
The IDE will compile the job just fine and start a one-off local Flink instance for that single run.

For Spark applications running on YARN, which deploy mode is better - client or cluster?

I understand the major differences between client and cluster mode for Spark applications on YARN.
Major differences include:
Where the driver runs - locally in client mode, in the Application Master (AM) in cluster mode
Client running duration - in client mode, the client needs to run for the entire duration; in cluster mode, the client need not keep running, as the AM takes care of it
Interactive usage - spark-shell and pyspark. Cluster mode is not well suited, as these require the driver to run on the client
Scheduling work - in client mode, the client schedules the work by communicating directly with the containers; in cluster mode, the AM schedules the work by communicating directly with the containers
Similarities in both cases:
Who handles the executor requests from YARN - the Application Master
Who starts the executor processes - the YARN NodeManager
My question is: in real-world scenarios (production environments), where we do not need interactive mode and the client is not required to run for a long duration, is cluster mode the obvious choice?
Are there any benefits to client mode, such as:
running the driver on the client machine rather than in the AM
allowing the client to schedule work, rather than the AM
From the documentation:
A common deployment strategy is to submit your application from a
gateway machine that is physically co-located with your worker
machines (e.g. Master node in a standalone EC2 cluster). In this
setup, client mode is appropriate. In client mode, the driver is
launched directly within the client spark-submit process, with the
input and output of the application attached to the console. Thus,
this mode is especially suitable for applications that involve the
REPL (e.g. Spark shell).
Alternatively, if your application is submitted from a machine far
from the worker machines (e.g. locally on your laptop), it is common
to use cluster mode to minimize network latency between the drivers
and the executors. Note that cluster mode is currently not supported
for standalone clusters, Mesos clusters, or python applications.
It looks like the main reason is this: when we run spark-submit from a remote machine, cluster mode is preferred in order to reduce the latency between the executors and the driver.
From my experience, in a production environment the only reasonable mode is cluster mode, with 2 exceptions:
when the Hadoop nodes do not have resources needed by the application, for example when the Spark job, at the end of its execution, performs an ssh to a server that is not accessible from the Hadoop nodes
when you use Spark Streaming and want to shut it down gracefully (killing a cluster-mode application forces the streaming to close, whereas in client mode you can call ssc.stop(stopGracefully = true))
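To make the choice between the two modes concrete, the following Python sketch assembles a spark-submit command line for YARN. The --master, --deploy-mode and --class options are standard spark-submit options; the function name, the jar path and the class name are placeholders invented for this example.

```python
def spark_submit_command(app, deploy_mode="cluster", main_class=None, app_args=()):
    """Build the argument list for submitting an application to YARN.

    deploy_mode decides where the driver runs: "client" keeps it in the
    spark-submit process, "cluster" runs it inside the Application Master.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError(f"unsupported deploy mode: {deploy_mode!r}")
    command = ["spark-submit", "--master", "yarn", "--deploy-mode", deploy_mode]
    if main_class:
        command += ["--class", main_class]
    command.append(app)          # application jar or Python file
    command.extend(app_args)     # arguments passed through to the application
    return command


# Hypothetical jar and class names, used only for illustration.
production = spark_submit_command("my-job.jar", "cluster", "com.example.MyJob")
interactive = spark_submit_command("my-job.jar", "client", "com.example.MyJob")
```

The resulting list can be passed to subprocess.run on a machine where spark-submit is on the PATH. Only the --deploy-mode value differs between the two variants, so switching a job from client to cluster mode is a one-word change in the submission script.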

Server performance in non-GUI mode in JMeter

How can I measure server-side performance in non-GUI mode? I have been running the Server Agent on the server side, but I am still not able to generate any server-side graph in the JP@GC Graphs Generator. Can anyone suggest how to proceed?
I found a way to measure server-side performance. Using the PerfMon Metrics Collector, an extra plugin for JMeter, I can record performance data by writing it to a file (JTL/CSV).
Steps:
1. Add the PerfMon Metrics Collector to your .jmx file as a listener.
2. Add the metrics you want to measure and, in the "Write results to file" option, point it at a JTL/CSV file.
3. Run the Server Agent on the server side.
4. Run the test from the command prompt or a batch file.
5. After the test completes, you will find the JTL/CSV file in the configured location. You can open it in the PerfMon Metrics Collector to see the graphs.
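Step 4 can be scripted. The sketch below builds the non-GUI command line using JMeter's standard -n (non-GUI), -t (test plan) and -l (results log) options; the function name and the file names are placeholders for this example, and it assumes the jmeter launcher is on the PATH.

```python
def jmeter_non_gui_command(test_plan, results_file):
    """Build the argument list for a non-GUI JMeter run.

    -n selects non-GUI mode, -t names the .jmx test plan and -l names the
    file that receives the sample results.
    """
    return ["jmeter", "-n", "-t", test_plan, "-l", results_file]


# Hypothetical file names, used only for illustration.
command = jmeter_non_gui_command("perfmon_test.jmx", "results.jtl")
```

Passing the list to subprocess.run(command, check=True) starts the test. The PerfMon Metrics Collector writes to the file configured inside the test plan in step 2, which is separate from the -l results file given here.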

JMeter Summary report in distributed mode

I am running a JMeter performance test in distributed mode (2 slaves + master). In my test script I have configured a Summary Report, which should save some data to a CSV file.
The file location is configured with the fixed value "reports/summary.csv".
When I connect successfully from the master to both slaves, the tests finish on the slaves, but no data is returned to "reports/summary.csv" on the master. When I ran the setup with one slave and the master, the master collected this data. What could be the problem when I have 2 slaves? A name conflict, maybe?
Actually, this problem was caused by the "jmeter-server" process not being able to connect to the master over RMI. This network issue made jmeter-server hang for a very long time on the second node, which stopped the JMeter process on the master from finishing and finalizing the results in the summary file.
Once I got it working, the important thing to know is that if you use the Summary Report or Aggregate Report component in your test plan in a distributed environment, the master takes care of collecting the data from each slave.
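A quick way to rule out this kind of network problem before a run is a plain TCP reachability check between the machines. This is only a sketch: the helper name is made up, the host and port you pass are placeholders that depend on your jmeter.properties (for example server_port or client.rmi.localport), and a successful TCP connection does not prove that the RMI handshake itself will succeed.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Run it on a slave against the master's address and RMI port, and on the master against each slave, since in distributed mode connections are opened in both directions.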
