Tool to test a Big Data pipeline's performance end to end?

I have this pipeline: Kafka->Logstash->ElasticSearch->Kibana
I have found a producer performance tool that can be invoked with the script bin/kafka-producer-perf-test.sh.
I am wondering if anyone has suggestions for end-to-end performance testing?
Thanks.

Your pipeline Kafka->Logstash->ElasticSearch->Kibana involves four components: Kafka, Logstash, Elasticsearch, and Kibana, each of which serves a different purpose and has its own performance numbers and characteristics.
The bin/kafka-producer-perf-test.sh script you mentioned is Kafka's performance test tool, which measures the performance of Kafka alone, not the other components. If you configure the pipeline to read, process, and display the data generated by that tool, you can get an overall pipeline number, but you still won't be able to identify the limiting component in the pipeline.
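For reference, a typical invocation of that script looks like the following sketch (the topic name, record count, and broker address are assumptions for illustration):

    bin/kafka-producer-perf-test.sh --topic pipeline-test --num-records 1000000 \
        --record-size 100 --throughput -1 \
        --producer-props bootstrap.servers=localhost:9092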
I suggest you generate data that replicates what your pipeline requires; the custom Kafka producer PepperBox is a good tool for that. Deploy monitoring on all components using InfluxDB/Graphite (or any time-series DB of your choice) and measure the end-to-end throughput as well as the per-component throughput.
A sample benchmark is YSB (the Yahoo Streaming Benchmark). This will help you get started.
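If you roll your own generator instead of PepperBox, a minimal Java sketch like the one below (the broker address and topic name are assumptions) embeds a send timestamp in each message, so end-to-end latency can be computed when the record finally arrives in Elasticsearch:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class TimestampedProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // assumed broker
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                for (int i = 0; i < 100_000; i++) {
                    // Carry the producer-side timestamp in the payload; at the
                    // Elasticsearch end, ingest time minus sentAt gives latency.
                    String value = "{\"sentAt\":" + System.currentTimeMillis()
                            + ",\"seq\":" + i + "}";
                    producer.send(new ProducerRecord<>("logs", value)); // assumed topic
                }
            }
        }
    }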

Related

How to perform root cause analysis using JMeter integrated with Prometheus and Grafana?

The requirement is to perform root cause analysis using Prometheus and Grafana, and also to find out which method is taking the most time, along with CPU and memory utilisation.
Can anyone please help me?
Integrate a Prometheus client library into your app (a minimal Java sketch of this step follows the list)
Configure a Grafana dashboard to visualize the metrics you want to monitor from your app
Additionally, you can integrate JMeter with the Prometheus Listener for JMeter plugin (it can be installed using the JMeter Plugins Manager)
Come up with a test plan which simulates real-life usage of your application
Run a stress test (start with 1 user and gradually increase the load until response times exceed an acceptable threshold or errors start occurring)
Inspect the Grafana dashboard to see why the application is slow, i.e. trace the "slow" request down to the underlying function. Additionally, you can use a profiler tool, which can provide a better picture
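A minimal sketch of the first step using the Prometheus Java simpleclient library (the metric names and the exporter port are assumptions for illustration):

    import io.prometheus.client.Counter;
    import io.prometheus.client.Histogram;
    import io.prometheus.client.exporter.HTTPServer;

    public class InstrumentedApp {
        static final Counter REQUESTS = Counter.build()
                .name("app_requests_total").help("Total requests handled.").register();
        static final Histogram LATENCY = Histogram.build()
                .name("app_request_seconds").help("Request latency in seconds.").register();

        public static void main(String[] args) throws Exception {
            HTTPServer server = new HTTPServer(9400); // exposes /metrics for scraping
            while (true) {
                Histogram.Timer timer = LATENCY.startTimer();
                try {
                    handleRequest(); // stand-in for your real business logic
                    REQUESTS.inc();
                } finally {
                    timer.observeDuration();
                }
            }
        }

        static void handleRequest() throws InterruptedException {
            Thread.sleep(50); // simulated work
        }
    }

Grafana can then plot PromQL expressions such as rate(app_requests_total[1m]) or histogram quantiles over the latency metric to show where time is being spent.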

Streaming pipeline on Apache Flink runner (parallelism and grouping of tasks written using Apache Beam)

I already have a streaming pipeline written in Apache Beam. Earlier I ran it on Google Dataflow and it worked like a charm. Now, due to changing business needs, I need to run it using the Flink runner.
The Beam version currently in use is 2.38.0 and the Flink version is 1.14.5; I validated that this is a supported and valid combination of versions.
The pipeline code is written using the Apache Beam SDK and uses multiple ParDos and PTransforms. The pipeline is somewhat complicated in nature, as it involves a lot of interim operations (catered for via these ParDos and PTransforms) between source and sink. The source in my case is an Azure Service Bus topic, which I read using JmsTopicIO reads. Up to here all works fine, i.e. the stream of data enters the pipeline and gets processed normally. The problem occurs when load testing is performed: I see many operators going into backpressure and eventually failing to read and process messages from the topic, even though the CPU and memory usage of the Job and Task Managers remain well under control.
Actual issue/problem/question: while troubleshooting these performance issues, I observed that Flink chains and groups these ParDos and PTransforms into operators by itself. With my implementation, I see that many heavy processing tasks get combined into the same operator, which slows the processing of all such operators. Also, the parallelism I have set (20 for now) is at the pipeline level, which means each operator runs with parallelism 20.
flinkOptions.setParallelism(20);
Question 1. Using the Apache Beam SDK or any Flink configuration, is there any way I can control or manage the chaining and grouping of these ParDos/PTransforms into operators (through code or config), so that I can distribute the load uniformly myself?
Question 2. With an Apache Beam implementation, how can I set the parallelism of each individual operator (rather than of the complete pipeline) based on the load on it? That way I would be able to allocate more resources to the heavy computing operators (sets of tasks).
Please suggest answers to the above questions, along with any other pointers I can work on for Flink performance improvements in my deployment.
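For context, setting pipeline-level parallelism with Beam's Flink runner typically looks like the sketch below (a minimal illustration; the option names come from Beam's FlinkPipelineOptions, while the transform stages are placeholders):

    import org.apache.beam.runners.flink.FlinkPipelineOptions;
    import org.apache.beam.runners.flink.FlinkRunner;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;

    public class FlinkParallelismSetup {
        public static void main(String[] args) {
            FlinkPipelineOptions flinkOptions =
                    PipelineOptionsFactory.fromArgs(args).as(FlinkPipelineOptions.class);
            flinkOptions.setRunner(FlinkRunner.class);
            // This value applies to every operator in the job; these options do
            // not offer a per-transform parallelism knob, which is exactly the
            // limitation described in the question above.
            flinkOptions.setParallelism(20);

            Pipeline pipeline = Pipeline.create(flinkOptions);
            // ... apply JmsTopicIO reads, ParDos, and PTransforms here ...
            pipeline.run();
        }
    }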

Limitation when using K6 (Load impact) for load testing on APIs

I ran a few tests using k6 (OSS) by Load Impact and found it great in terms of usability compared to JMeter.
I am doing a feasibility study to choose a load testing tool to help me do API testing. I am inclined towards using k6 because I believe it is developer-friendly, but I could not find resources that advise on the maximum load I can simulate using k6.
Would it be possible to simulate 1 million RPS (requests per second) using k6? If yes, how should I go about achieving this?
In theory, yes: if you use multiple k6 instances, you can achieve however many requests per second you want. A single k6 instance can produce anywhere from thousands to tens of thousands of requests per second, depending on a lot of different factors: machine specs, script complexity, VUs, sleep times, network conditions, etc.
Right now k6 doesn't have a native distributed execution mode though, so you'd have to schedule the different instances yourself. There's a REST API (https://docs.k6.io/docs/rest-api) and you can output metrics to a centralized collector like InfluxDB (https://docs.k6.io/docs/results-output), but it'd take some work to execute a single test on multiple machines. A native k6 distributed execution mode is planned, but work on it hasn't started yet.
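For instance, each machine can run its own copy of the test and ship metrics to a shared collector with the documented --out flag; the InfluxDB host and database name below are assumptions:

    k6 run --out influxdb=http://localhost:8086/k6results script.js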
You can run k6 on the Load Impact (https://loadimpact.com) cloud (Cloud Execution mode) to access multiple k6 instances executing in parallel. Then, as noted, you can generate a large number of requests per second with the specific RPS being highly dependent on your script and other factors.

Live User monitoring by jmeter

I am looking to monitor live user responses through JMeter. Can the Backend Listener in JMeter be used to record live (end) users? I am not talking about the virtual users that we set up in JMeter, but the real end users. How can this be achieved?
Editing to add more details:
Our requirement is to monitor the real users, in 2-3 geographical locations, all throughout the business hours, say from 8 to 5.
For this purpose, do you think I need a dedicated machine with JMeter, Grafana, and InfluxDB for monitoring alone? I have other testing going on using JMeter and I don't want to use the same machine for both monitoring and testing. Do you think this is achievable with JMeter? Any suggestions?
You can use the following tools in combination to achieve live monitoring:
JMeter Backend Listener - to send results to InfluxDB
InfluxDB - to store the results sent by the Backend Listener
Grafana - to run continuous queries for metrics and plot graphs such as average response times
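A minimal sketch of the Backend Listener settings for the bundled InfluxDB client (the URL, database name, and application label are assumptions for illustration):

    Backend Listener implementation: org.apache.jmeter.visualizers.backend.influxdb.InfluxdbBackendListenerClient
    influxdbUrl:    http://localhost:8086/write?db=jmeter
    application:    myapp
    measurement:    jmeter
    summaryOnly:    false
    samplersRegex:  .*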
Follow the steps mentioned here:
http://jmeter.apache.org/usermanual/realtime-results.html - First Option
https://www.linkedin.com/pulse/jmeter-live-performance-monitoring-dashboard-grafana-influxdb-sarker
http://www.testautomationguru.com/jmeter-real-time-results-influxdb-grafana/
http://techblogsearch.com/a/live-performance-result-monitoring-with-jmeter-grafana-influxdb.html
We used to perform general production app monitoring (here: Scada-LTS) with JavaMelody, but this gives you only general statistics. For per-user monitoring, it seems you should use log4j + ELK or another, simpler syslog-analyzing tool.
JMeter should rather be used in a test environment, for stress tests.

How can I monitor my application server or database server from JMeter scripts? Can we check CPU, memory utilization, etc.?

I need to know to what extent we can analyze our application using Apache JMeter.
My script creation is complete, parametrized and correlated. Now I need a deep understanding of analysis.
Earlier, I just used to focus on response time, standard deviation, throughput, etc.
But now my boss wants me to do more analysis. Please help me, guys.
You can use these samplers from the JMeter Plugins project (a plain-JDBC sketch of what such a probe boils down to follows the links):
http://jmeter-plugins.org/wiki/DbMon/
http://jmeter-plugins.org/wiki/JMXMon/
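To illustrate what a database-monitoring sampler essentially does, here is a minimal plain-JDBC sketch (not the DbMon plugin itself; the connection details are assumptions and the status query shown is MySQL-specific):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class DbProbe {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/mysql", "monitor", "secret");
                 Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery(
                         "SHOW GLOBAL STATUS LIKE 'Threads_connected'")) {
                while (rs.next()) {
                    // Each row is a (name, value) pair, e.g. Threads_connected = 12
                    System.out.println(rs.getString(1) + " = " + rs.getString(2));
                }
            }
        }
    }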
It is still best to separate the tasks from the means of solving them. If you need to monitor server utilization parameters, use the appropriate tools, for example Zabbix. If you need to understand how many resources your server applications consume, turn to the appropriate monitoring tools and plug-ins, such as Zorka for WebSphere together with Zabbix.
