I have a requirement to ensure that data, once read, is never read again. Previously I used HttpSimpleTableServer with keep=false when I only had to run one loop. Now I need to run 2 loops, and that option no longer works because the same CSV is read from the start again on the second loop. So I was wondering if there is a way to read data from a different CSV file per loop. If not, how can I make sure that different data is read from the CSV on every loop and no data is ever repeated? My JMeter version is 5.3.
You can use the CSV Data Set Config component to read the data from CSV files.
Set the `Recycle on EOF` flag to false to read the data only once.
You may set the remaining flags based on your needs.
You may add two different CSV Data Set Config elements to work with different CSV files.
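For example, one possible layout (the file and variable names here are just placeholders, not prescribed ones):

Thread Group
    Loop Controller #1
        CSV Data Set Config (Filename: first.csv, Variable Names: colA, Recycle on EOF: False)
        HTTP Request (uses ${colA})
    Loop Controller #2
        CSV Data Set Config (Filename: second.csv, Variable Names: colB, Recycle on EOF: False)
        HTTP Request (uses ${colB})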
If you want to handle this programmatically, the API documentation will be useful.
If you need to read 2 different files in 2 different loops, you should consider using the `__CSVRead()` function instead.
Create 2 files, like `file0.csv` and `file1.csv`.
Once done, you will be able to use:
`${__CSVRead(file${__jm__Thread Group__idx}.csv,0)}` - read the first column
`${__CSVRead(file${__jm__Thread Group__idx}.csv,1)}` - read the second column
`${__CSVRead(file${__jm__Thread Group__idx}.csv,next)}` - proceed to the next row
etc.
The `__CSVRead()` function will proceed to the next file on the next Thread Group iteration, because `${__jm__Thread Group__idx}` resolves to the current Thread Group iteration number.
More information: How to Pick Different CSV Files at JMeter Runtime
I have a project in Spring Batch where I must read from two .txt files: one has many lines, and the other is a control file that holds the number of lines that should be read from the first file. I know that I must use partitioning to process these files, because the first one is very large and I need to divide it and be able to restart it in case it fails, but I don't know how the reader should handle these files, since the two files do not have the same line width.
Neither file has a header or a separator in its lines, so I have to obtain the fields according to character ranges, mainly in the first one.
One of my doubts is whether I should read both in the same reader, and how I should set up the reader's FixedLengthTokenizer and DefaultLineMapper to handle both files if I do use the same reader?
These are examples of the input file and the control file
- input file
09459915032508501149343120020562580292792085100204001530012282921883101
(the txt file can contain up to 50000 lines)
- control file
00128*
It only has one line
Thanks!
I must read from two .txt files, one has many lines and the other is a control file that has the number of lines that should be read from the first file
Here is a possible way to tackle your use case:
Create a first step (tasklet) that reads the control file and puts the number of lines to read into the job execution context (to share it with the next step)
Create a second step (chunk-oriented) with a step-scoped reader that is configured to read only the number of lines calculated by the first step (getting the value from the job execution context)
You can read more about sharing data between steps here: https://docs.spring.io/spring-batch/docs/4.2.x/reference/html/common-patterns.html#passingDataToFutureSteps
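As a rough sketch, not a drop-in solution: the record type MyRecord, the file paths, the field names and the column ranges below are made-up placeholders, and the reference documentation linked above uses an ExecutionContextPromotionListener where this tasklet writes to the job execution context directly.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.FixedLengthTokenizer;
import org.springframework.batch.item.file.transform.Range;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.core.io.FileSystemResource;

// Step 1 (tasklet): read the one-line control file, e.g. "00128*",
// and publish the line count in the job execution context.
public class ControlFileTasklet implements Tasklet {
    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        String line = Files.readAllLines(Paths.get("control.txt")).get(0);
        int linesToRead = Integer.parseInt(line.replace("*", "").trim()); // -> 128
        chunkContext.getStepContext().getStepExecution()
                .getJobExecution().getExecutionContext()
                .putInt("linesToRead", linesToRead);
        return RepeatStatus.FINISHED;
    }
}
```

The second step then uses a step-scoped reader that stops after that many lines (declared in a @Configuration class):

```java
@Bean
@StepScope
public FlatFileItemReader<MyRecord> inputReader(
        @Value("#{jobExecutionContext['linesToRead']}") Integer linesToRead) {
    FixedLengthTokenizer tokenizer = new FixedLengthTokenizer();
    tokenizer.setNames("fieldA", "fieldB");                    // placeholder field names
    tokenizer.setColumns(new Range(1, 10), new Range(11, 71)); // placeholder column ranges
    DefaultLineMapper<MyRecord> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(tokenizer);
    lineMapper.setFieldSetMapper(fs -> new MyRecord(fs.readString("fieldA"), fs.readString("fieldB")));
    FlatFileItemReader<MyRecord> reader = new FlatFileItemReader<>();
    reader.setResource(new FileSystemResource("input.txt"));
    reader.setLineMapper(lineMapper);
    reader.setMaxItemCount(linesToRead); // read only the count taken from the control file
    return reader;
}
```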
I have the following Test Plan:
Test Plan
Thread Group
Java Request
CSV Data Set Config
My Thread Group has 1 thread looping forever. To my understanding, the thread should go down the CSV file line by line, 1 line per loop. However, it stays on the first line. If I have two threads, then the first thread stays on the first line, the second thread on the second line, and so on.
I have tried all the different options in CSV Data Set Config (even those it didn't make sense to try), including:
Checked path to file is correct
Tried file encoding as empty, UTF-8, UTF-16
Checked delimiter was correct in CSV
Checked variable names were correct
Allow quoted data true and false
Recycle on EOF true and false
Stop thread on EOF true and false
Tried all sharing modes
I also ensured the CSV file had no empty lines. I am using JMeter 2.13, and the line-break character in the CSV is CR LF, if that helps.
I've looked at tutorials and other JMeter questions on here; it seems that by default the threads should go down the CSV file. I remember it behaving properly a while back, but I'm unsure when it started behaving this way.
It is hard to say anything without seeing the code of the Java Request sampler that reads the variable from the CSV, and your CSV Data Set Config settings.
If you want each thread to read the next line from the CSV file on each iteration, you need to set the Sharing Mode to All Threads.
Try using another sampler, e.g. the Debug Sampler, as it might be the case that your approach to reading the variable from the CSV file is not valid (see the sketch below for one known-good way).
According to JMeter Best Practices you should always be using the latest version of JMeter, and you are on a 4-year-old version. It might be the case that you are suffering from an issue which has already been fixed, so consider migrating to JMeter 5.1.1 or whatever the latest stable version available at the JMeter Downloads page is.
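For reference, here is a minimal sketch of a Java Request sampler that reads a JMeter variable; the variable name csvValue is an assumption, so use whatever your CSV Data Set Config declares under Variable Names:

```java
import org.apache.jmeter.protocol.java.sampler.AbstractJavaSamplerClient;
import org.apache.jmeter.protocol.java.sampler.JavaSamplerContext;
import org.apache.jmeter.samplers.SampleResult;
import org.apache.jmeter.threads.JMeterContextService;

// Minimal Java Request sampler: echoes one CSV-provided variable.
// "csvValue" is an assumed name - match it to your CSV Data Set Config.
public class CsvEchoSampler extends AbstractJavaSamplerClient {

    @Override
    public SampleResult runTest(JavaSamplerContext context) {
        SampleResult result = new SampleResult();
        result.sampleStart();
        // CSV columns become JMeter variables in the thread context,
        // so read them from there rather than from sampler parameters:
        String value = JMeterContextService.getContext().getVariables().get("csvValue");
        result.setResponseData("read: " + value, "UTF-8");
        result.setSuccessful(value != null);
        result.sampleEnd();
        return result;
    }
}
```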
I'm very experienced with Apache Camel and EIPs, and I am struggling to understand how to implement the equivalents in NiFi. I understand that NiFi uses a different paradigm (flow-based programming), but I don't think what I'm trying to do is unreasonable.
In a nutshell, I want the contents of each file to be sent to many REST services, and I want to aggregate the responses into a single document which will be stored in Elasticsearch. I might also do some further processing and cleanup to improve what is stored (but this isn't my immediate issue).
The screenshot is a quick mock-up of what I'm trying to achieve, but I don't understand enough about NiFi to know how to implement this pattern correctly.
If you are going to take a single piece of data, fork it to multiple parts of the flow, and then converge back, there needs to be a way for MergeContent to know which pieces go together.
There are generally two ways this can be done...
The first is using MergeContent in "defragment mode". Think of this as reversing a split operation that was performed by one of the split processors like SplitText. For example, you split a file of 100 lines into 100 flow files of 1 line each, then do some stuff to each one, then want to converge back. The split processors produce a standard set of split attributes (described in the docs of the processors) and the defragment mode knows how to bin the splits accordingly and merge them back together. This probably doesn't apply to your example since you didn't start with a split processor.
The second approach is the "Correlation Attribute" in MergeContent. This tells MergeContent to only merge flow files that have the same value for the specified attribute. In your example, when a file gets picked up by GetFile and sent to 3 InvokeHTTP processors, 3 flow files are created, and they all should have their "filename" attribute set to the name of the file picked up from disk. So telling MergeContent to correlate on filename should do the trick, probably together with setting the min and max number of entries to the number you expect (e.g. 3), and a maximum bin age in case one of the requests fails or hangs.
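For the GetFile-to-3-InvokeHTTP example, the MergeContent settings might look like the following; the entry counts and bin age are assumptions to adapt to your flow:

```
Merge Strategy             = Bin-Packing Algorithm
Correlation Attribute Name = filename
Minimum Number of Entries  = 3      (one per InvokeHTTP branch)
Maximum Number of Entries  = 3
Max Bin Age                = 5 min  (flushes an incomplete bin if a branch fails or hangs)
```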
I am using PStore to store the results of some computer simulations. Unfortunately, when the file becomes too large (more than 2GB, from what I can see) I am no longer able to write the file to disk, and I receive the following error:
Errno::EINVAL: Invalid argument - <filename>
I am aware that this is probably an IO limitation, but I was wondering whether there is a workaround. For example, to read large JSON files, I would first split the file and then read it in parts. The definitive solution would probably be to switch to a proper database backend, but because of some limitations of the specific Ruby (SketchUp) I am using, this is not always possible.
I am going to assume that your data has a field that could be used as a crude key.
Therefore I would suggest that instead of dumping data into one huge file, you could put your data into different files/buckets.
For example, if your data has a name field, you could take the first 1-4 chars of the name, create a file named with those chars, like `rojj-datafile.pstore`, and add the entry there. Any record whose name starts with 'rojj' goes in that file.
A more structured version is to use the first char as a directory and put the file inside it, like `r/rojj-datafile.pstore`.
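To make the scheme concrete, here is the path-derivation rule as a small sketch (written in Java purely to illustrate the logic; it translates directly to Ruby, where the resulting path would be handed to PStore.new):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class PstoreBuckets {
    // Derive the bucket file for a record from its name field, e.g.
    // "rojjberg" -> data/r/rojj-datafile.pstore (4-char prefix as in the example above).
    static Path bucketFor(String name, Path baseDir) {
        String prefix = name.toLowerCase().substring(0, Math.min(4, name.length()));
        String shardDir = prefix.substring(0, 1); // first char picks the directory
        return baseDir.resolve(shardDir).resolve(prefix + "-datafile.pstore");
    }

    public static void main(String[] args) {
        System.out.println(bucketFor("rojjberg", Paths.get("data"))); // data/r/rojj-datafile.pstore
    }
}
```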
Obviously your mechanism for reading/writing will have to take this new file structure into account, and it will undoubtedly end up slower to process the data into the pstores.
I want to run 5 threads, and each thread should pull in data from a different .csv file. For example, thread 1 maps to data_1.csv, and so on. I do NOT want to create 5 Thread Groups.
Please help. Thank you!
To be able to open different CSV files in the same test plan execution, you have to build the file name with the `__threadNum` function.
Following your example, you would set the filename to `data_${__threadNum}.csv` in the CSV Data Set Config so that the 5 threads load your 5 files.
The files are shared based on their filenames, so the sharing mode is not an issue.
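In other words, with 5 threads the single config element resolves to 5 different files:

```
Filename: data_${__threadNum}.csv
  thread 1 -> data_1.csv
  thread 2 -> data_2.csv
  ...
  thread 5 -> data_5.csv
```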
According to the user manual, by default the file is only opened once, and each thread will use a different line from the file. You can change the sharing mode, but you cannot open several different CSV files that way, i.e. one file for each thread.
Update: on the other hand, if you don't have a lot of threads, you can try this.