Shell script to divide CSV files automatically using thread group count

Currently I am manually dividing the CSV files for distributed JMeter testing from 3 machines, but I need a shell script which will automatically divide the CSV files based on the thread group count.

If you want to give each thread unique values, you can simply do this by changing a few settings in the CSV Data Set Config:
Recycle on EOF: False
Stop Thread on EOF: True
Sharing mode: All Threads
With these settings, each thread in your .jmx will get unique values from your CSV file.

Normally it's not necessary to split the CSV file by the Thread Group count; you just need to choose the appropriate Sharing Mode in the CSV Data Set Config.
As an exception, here is an example shell script which splits the CSV file by the number of Thread Groups in the .jmx script:
#!/usr/bin/env bash
# Count the Thread Groups in the test plan
threadGroups=$(grep -c "\"ThreadGroup\"" test.jmx)
# Split test.csv line-wise into one chunk per Thread Group: 000.csv, 001.csv, ...
split --suffix-length=3 --additional-suffix=.csv -d --number="l/${threadGroups}" test.csv ""
Replace test.jmx and test.csv with the names/locations of your .jmx and .csv files.
It will generate .csv files in the form of 000.csv, 001.csv, etc.
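For example, assuming test.jmx contains 3 Thread Groups and the script is saved under the hypothetical name split-csv.sh next to test.csv, a run would leave you with one chunk per Thread Group:
./split-csv.sh
ls *.csv
000.csv  001.csv  002.csv  test.csv
Each numbered file holds roughly an equal share of the lines of test.csv.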
More information: Split Command in Linux with Examples

Related

JMeter number of threads (users) variable from CSV file not working

I'm reading different parameters from the CSV file with the "CSV Data Set Config" component.
I added a threadCount column to the CSV. It could have, for example, the value 100.
Then, in the "Thread Group" component, I use the variable ${threadCount} in the "Number of Threads (users)" field, and nothing happens when I run the test plan.
There is in the log file:
2021-09-29 09:34:19,704 DEBUG o.a.j.e.u.ValueReplacer: About to replace in property of type: class org.apache.jmeter.testelement.property.StringProperty: ${threadCount}
2021-09-29 09:34:19,704 DEBUG o.a.j.t.p.AbstractProperty: Not running version, return raw function string
2021-09-29 09:34:19,704 DEBUG o.a.j.e.u.ValueReplacer: Replacement result: ${threadCount}
If I add the threadCount variable to the "User Defined Variables" component, then the program runs correctly.
Could you please tell me where the problem is?
I don't think you can configure the number of threads in the Thread Group using the CSV Data Set Config, because the Thread Group is initialized before the CSV Data Set Config is processed.
If you want to make the number of threads externally configurable, you can define it using the __P() function, like:
${__P(threadCount,)}
Once done, you should be able to define the value in the following ways:
Via the user.properties file, like:
threadCount=100
Via your custom .properties file in the same way, but you need to pass this file to JMeter via the -q command-line argument:
jmeter -q /path/to/your/custom.properties
And you can override the value via the -J command-line argument, like:
jmeter -JthreadCount=100
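Putting it together: assuming the Thread Group's "Number of Threads (users)" field is set to ${__P(threadCount,)}, a complete non-GUI run overriding the value could look like:
jmeter -n -t test.jmx -JthreadCount=100 -l result.jtl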
More information:
Full list of command-line options
Configuring JMeter
Apache JMeter Properties Customization Guide

How to stop JMeter from writing anything to the .csv results file

Straight and short, I want JMeter to stop writing any output to the .csv results file.
I've heard that preventing the creation of the file is not possible, but that we can set the results file configuration properties to decide which values to output. So I went over all those properties: I read them all, set the ones I could set to false to false, the ones I could set to none to none, and the ones I could set to their default values to their defaults. In theory the file shouldn't print anything and should remain at 0 KB. However, this is not the case: the file still grows in size and prints a zero on every line, like this:
0
0
0
0
and this is how I have my properties in the user.properties file:
jmeter.save.saveservice.output_format=csv
jmeter.save.saveservice.assertion_results_failure_message=false
jmeter.save.saveservice.assertion_results=none
jmeter.save.saveservice.data_type=false
jmeter.save.saveservice.label=false
jmeter.save.saveservice.response_code=false
jmeter.save.saveservice.response_data=false
jmeter.save.saveservice.response_data.on_error=false
jmeter.save.saveservice.response_message=false
jmeter.save.saveservice.successful=false
jmeter.save.saveservice.thread_name=false
jmeter.save.saveservice.time=false
jmeter.save.saveservice.subresults=false
jmeter.save.saveservice.assertions=false
jmeter.save.saveservice.latency=false
jmeter.save.saveservice.connect_time=false
jmeter.save.saveservice.samplerData=false
jmeter.save.saveservice.responseHeaders=false
jmeter.save.saveservice.requestHeaders=false
jmeter.save.saveservice.encoding=false
jmeter.save.saveservice.bytes=false
jmeter.save.saveservice.url=false
jmeter.save.saveservice.filename=false
jmeter.save.saveservice.hostname=false
jmeter.save.saveservice.thread_counts=false
jmeter.save.saveservice.sample_count=false
jmeter.save.saveservice.idle_time=false
jmeter.save.saveservice.timestamp_format=none
jmeter.save.saveservice.default_delimiter=,
jmeter.save.saveservice.print_field_names=false
jmeter.save.saveservice.xml_pi=
jmeter.save.saveservice.base_prefix=~/
jmeter.save.saveservice.autoflush=false
Where is that 0 coming from? Is it impossible to prevent?
If you don't want the results file, just don't provide its location, i.e. omit the -l bit when you launch the JMeter test in command-line non-GUI mode:
jmeter -n -t test.jmx
The above command will trigger the test execution but no .jtl results file will be created.
Not sure why you want to create a .csv file when you're not storing anything in it.
Maybe just don't give any file name there.

How can I provide different CSV files via the command line for the same JMX file in JMeter

I would like to run the same JMX file for different loads/threads using different CSV config files. Let's say the CSV files contain data for the username and password. For test1 the CSV file has 1000 rows, for test2 the CSV file has 2000 rows, and so on.
How can I provide different CSV files via the command line for the same JMX file for different thread counts?
I know I can pass the threads, rampup, rampdown and duration by using the __P() function like ${__P(threads,)} via the command line, as below:
jmeter -Jthreads=200 -Jrampup=10 -Jduration=1000 -Jrampdown=10 -n -t test1.jmx -l result1.jtl
Thanks,
Raj
You can do it just the same way as you do for Threads, Rampup, Duration, etc.
In your CSV Data Set Config define the Filename using __P() function like:
${__P(csvFile,test1.csv)}
this will tell the CSV Data Set Config to read the file name from the csvFile JMeter property and to use test1.csv if the property is not set (so you can debug your test in GUI mode)
That's it, now you will be able to pass the file name using the -J command-line argument, like:
jmeter -JcsvFile=/path/to/file2.csv -Jthreads=200 ....
An alternative way of setting up the property is putting the value in the user.properties file. Check out the Apache JMeter Properties Customization Guide for more information.
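For example, the equivalent user.properties entry for the csvFile property assumed above would be:
csvFile=/path/to/file2.csv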

How to save the request body in JMeter?

I am fetching data from the CSV file and giving it as input to my request. How can I save all the requests in the same file when I run a test for an hour?
One more requirement: if the result is a success, then I have to write the data I used from the CSV into another file, so that we have only the working data in a separate file.
Please suggest.
The best way would be using JMeter's built-in Sample Variables property.
Add the next line to the user.properties file:
sample_variables=foo
Replace foo with the name of the variable you're getting from the CSV file.
Next time you run your JMeter test in command-line non-GUI mode like:
jmeter -n -t test.jmx -l result.csv
your result.csv file will have an extra column called foo, holding the foo variable value for each and every request. You will also be able to determine which data caused failures by looking into the "success" column.
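For the second requirement (keeping only the data that worked), one option is to post-process result.csv after the run. A minimal sketch, assuming the default comma delimiter, a header row in the results file (the default), and no embedded commas in the values; working-data.csv is a hypothetical output name:
awk -F',' 'NR==1 {for (i=1; i<=NF; i++) {if ($i=="success") s=i; if ($i=="foo") f=i}; next} $s=="true" {print $f}' result.csv > working-data.csv
This looks up the "success" and "foo" column positions from the header row and prints the foo value of every successful sample into working-data.csv.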

How to use Airflow for real-time data processing

I have a scenario where I want to process a CSV file and load it into some other database:
Cases:
pick a CSV file and load it into MySQL with the same name as the CSV
then do some modification on the loaded rows using a Python task file
after that, extract the data from MySQL and load it into some other database
CSV files are coming from a remote server into a folder on one Airflow server.
We have to pick these CSV files up and process them through a Python script.
Suppose I pick one CSV file; then I need to pass this CSV file to the rest of the operators in a dependency chain, like:
filename: abc.csv
task1 >> task2 >> task3 >> task4
So abc.csv should be available for all the tasks.
Please tell me how to proceed.
Your scenario doesn't have anything to do with real time. This is ingesting on a schedule/interval. Or perhaps you could use a sensor operator to detect data availability.
Implement each of your requirements as functions and call them from operator instances.
Add the operators to a DAG with a schedule appropriate for your incoming feed.
The ways to pass and access params are (see the sketch after the links below):
- keyword arguments to the python_callable when initializing an operator
- context['param_key'] in the execute method when extending an operator
- Jinja templates
Relevant:
airflow pass parameter from cli
execution_date in airflow: need to access as a variable
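A minimal sketch of the first option, using the same old-style Airflow 1.x API as the answer below; the csv_pipeline and process_csv names are illustrative, only abc.csv comes from the question:
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

def process(filename, **context):
    # filename arrives via op_kwargs; the execution date via the context
    print("processing %s for %s" % (filename, context['execution_date']))

dag = DAG('csv_pipeline', start_date=datetime(2018, 1, 1), schedule_interval='@daily')

task1 = PythonOperator(
    task_id='process_csv',
    python_callable=process,
    op_kwargs={'filename': 'abc.csv'},
    provide_context=True,
    dag=dag,
)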
The way tasks communicate in Airflow is using XCom, but it is meant for small values, not for file content.
If you want your tasks to work with the same CSV file, you should save it to some location and then pass the path to that location in the XCom.
We are using the LocalExecutor, so the local file system is fine for us.
We decided to create a folder for each DAG, with the name of the DAG. Inside that folder we generate a folder for each execution date (we do this in the first task, which we always call start_task). Then we pass the path of this folder to the subsequent tasks via XCom.
Example code for the start_task:
import os
from airflow.operators.python_operator import PythonOperator

# DATE_FORMAT and _create_folder_delete_if_exists are defined elsewhere in our codebase
def start(share_path, **context):
    # Build <share_path>/my_dag_name/<execution_date> for this run
    execution_date_as_string = context['execution_date'].strftime(DATE_FORMAT)
    execution_folder_path = os.path.join(share_path, 'my_dag_name', execution_date_as_string)
    _create_folder_delete_if_exists(execution_folder_path)
    # Publish the folder path so downstream tasks can pick it up
    task_instance = context['task_instance']
    task_instance.xcom_push(key="execution_folder_path", value=execution_folder_path)

start_task = PythonOperator(
    task_id='start_task',
    provide_context=True,
    python_callable=start,
    op_args=[share_path],
    dag=dag
)
The share_path is the base directory for all DAGs; we keep it in an Airflow Variable.
Subsequent tasks can get the execution folder with:
execution_folder_path = task_instance.xcom_pull(task_ids='start_task', key='execution_folder_path')
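For completeness, a downstream task gets the task_instance from its context before making the xcom_pull call above; a minimal sketch (the process_task name and its callable are illustrative):
def process(**context):
    task_instance = context['task_instance']
    execution_folder_path = task_instance.xcom_pull(task_ids='start_task', key='execution_folder_path')
    # read and write files inside execution_folder_path here

process_task = PythonOperator(
    task_id='process_task',
    provide_context=True,
    python_callable=process,
    dag=dag
)

start_task >> process_task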
