Need help with file write operations in JMeter

I need help with file write operations in JMeter, using the code below:
a = vars.get("PARAM_1");
b = vars.get("PARAM_2");
f = new FileOutputStream("FILEPATH/filename.csv", true);
p = new PrintStream(f);
this.interpreter.setOut(p);
print(a +","+ b);
f.close();
I am using this in a Beanshell PostProcessor. The above code is able to add 2 variables to the CSV file, but I tried defining more than 2, like:
a = vars.get("PARAM_1");
b = vars.get("PARAM_2");
c = vars.get("PARAM_3");
d = vars.get("PARAM_4");
e = vars.get("PARAM_5");
f = new FileOutputStream("FILEPATH/filename.csv", true);
p = new PrintStream(f);
this.interpreter.setOut(p);
print(a +","+ b","+ c","+ d","+ e);
f.close();
With the above code, not all 5 variables are written, only 2.
Also, the data needs to be written starting from the 2nd line of the CSV, not the 1st line, because the 1st line holds the header of the data.

Using Beanshell is a form of performance anti-pattern; since JMeter 3.1 you're supposed to be using Groovy for scripting. More information: Beanshell vs. JSR223 vs. Java For JMeter: Complete Showdown
You can amend your code to look like:
def file = new File('FILEPATH/filename.csv')
1.upto(5, {
    file << vars.get('PARAM_' + it)
    if (it != 5) {
        file << ','
    }
})
file << System.getProperty('line.separator')
If you run your script with 2 or more users you might face a race condition: when 2 or more threads write into the same file concurrently, the result can be data corruption or loss. So it's better to consider using e.g. the Flexible File Writer instead.
In any case, make sure that the variables exist and have their respective values using the Debug Sampler and View Results Tree listener combination; if a variable is not there, a null value will be written to the file.
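The question also asks for the data to start on the 2nd line, below a header row. A minimal sketch of one way to do that in the same Groovy script, assuming hypothetical header names HEADER_1..HEADER_5 and the same FILEPATH placeholder:
def file = new File('FILEPATH/filename.csv')
if (!file.exists()) {
    // hypothetical header names - replace with the real column titles
    file << 'HEADER_1,HEADER_2,HEADER_3,HEADER_4,HEADER_5'
    file << System.getProperty('line.separator')
}
file << (1..5).collect { vars.get('PARAM_' + it) }.join(',')
file << System.getProperty('line.separator')
Because the << operator appends to the file, the header is written only when the file does not exist yet, and all data rows land below it.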


How to create dynamic request variables in JMeter

I have a few values in CSV files. I need to pass these values to a JSR223 Sampler. Suppose a CSV file has 4 values in its first row; then I need to write two lines, as below.
Example 1 in the JSR223 Sampler:
('usr', '${Var1}', '${Var2}')
('usr', '${Var3}', '${Var4}')
If 6 values are present in the second row of the CSV file, then my script should have three lines.
Example 2 in the JSR223 Sampler:
('usr', '${Var1}', '${Var2}')
('usr', '${Var3}', '${Var4}')
('usr', '${Var5}', '${Var6}')
My scenario is that each row will have a different count of values. So how can I make the number of requests in my JSR223 Sampler depend on the count of values present in the CSV file? How can I create this scenario in JMeter?
Something like:
def lines = new File('test.csv').readLines()
lines.each { line ->
    log.info('Line from CSV: ' + line)
    def request = new StringBuilder()
    def values = line.split(',')
    def counter = 0
    1.upto(values.size().intdiv(2), { it ->
        request.append("('usr', '").append(values[counter]).append("', '").append(values[counter + 1]).append("')")
        if (it != values.size().intdiv(2)) {
            request.append(System.getProperty('line.separator'))
        }
        counter = counter + 2
    })
    log.info('Generated request: ' + System.getProperty('line.separator') + request.toString())
}
should do the trick for you.
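If the generated requests need to be consumed by other test elements rather than only logged, one possible follow-up (with a hypothetical variable naming scheme, same test.csv assumption) is to store each generated request in a numbered JMeter variable inside the loop:
def lines = new File('test.csv').readLines()
lines.eachWithIndex { line, idx ->
    def values = line.split(',')
    // assumes an even number of values per line, as in the question's examples
    def request = (0..<values.size()).step(2).collect { i ->
        "('usr', '" + values[i] + "', '" + values[i + 1] + "')"
    }.join(System.getProperty('line.separator'))
    vars.put('request_' + (idx + 1), request)
}
Each generated request can then be referenced later as ${request_1}, ${request_2}, and so on.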
More information:
Reading a File in Groovy
The Groovy Templates Cheat Sheet for JMeter

Use same random variable in all threads

I've got a test plan:
Thread Group (users 3, loop 2)
Random Variable
HTTP Request
I want the variable to change only per loop, so in each iteration all three threads should send the same value.
So I want something like this:
request where random var = X
request where random var = X
request where random var = X
request where random var = Y
request where random var = Y
request where random var = Y
I tried a lot of workarounds but can't find a proper solution.
P.S. I don't want to read variables from a file. I need to generate them.
No matter what you "want", the best option would be pre-generating the random values somewhere in a setUp Thread Group, writing them to a file, and then using a CSV Data Set Config in the "main" Thread Group to read the values.
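As a rough illustration of that option, a JSR223 Sampler in the setUp Thread Group could pre-generate one value per loop and write the values to a file; the file name, loop count and value range below are assumptions:
def loops = 2 // assumption: matches the main Thread Group loop count
def file = new File('random-values.csv') // hypothetical file name
file.text = '' // start with an empty file on each test run
1.upto(loops, {
    file << org.apache.commons.lang3.RandomUtils.nextInt(0, 100)
    file << System.getProperty('line.separator')
})
The "main" Thread Group would then read random-values.csv via the CSV Data Set Config.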
However, if this is still not something you "want", here is yet another "workaround"; hopefully it's "proper" enough for you:
Add a JSR223 PreProcessor as a child of the request which you "want" to parameterize with the random variable
Put the following code into the "Script" area:
if (props.get('foo_' + vars.getIteration()) == null) {
    props.put('foo_' + vars.getIteration(), org.apache.commons.lang3.RandomUtils.nextInt(0, 100))
}
Refer the "generated" random value using the following __groovy() function where required:
${__groovy(props.get('foo_' + vars.getIteration()),)}

Generate random values in JSR223 sampler in JMeter

def index = [];
def randoms = [];
def size = new File("C:/Users/320027671/Desktop/JmeterPerformanceSuit/CompleteSuit/STU3/Post/index.csv").readLines().size();
File file = new File("C:/Users/320027671/Desktop/JmeterPerformanceSuit/CompleteSuit/STU3/Post/index.csv");
file.each { line ->
index << line
randoms << __Random(0,size,)
}
The script is giving an error:
the method does not exist
The script works up to index << line; the problem is with the random function.
I assume you use Groovy as the language (otherwise it won't work).
You can't use JMeter functions like __Random() inside JSR223 scripts.
You can randomize every line using, for example, RandomUtils:
org.apache.commons.lang3.RandomUtils.nextInt(0, size-1);
Your approach may fail to produce "random" numbers: especially with smaller file sizes you can get duplicate values in the randoms list, so I would recommend doing something like:
1.upto(size, { i ->
    randoms.add(i)
})
Collections.shuffle(randoms)
This will populate the randoms list with the numbers from 1 to size and then call the Collections.shuffle() function in order to "randomise" the list.
Just in case check out Writing JMeter Functions in Groovy for more insights.
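Putting the pieces together, a minimal corrected sketch of the original script (same file path as in the question, reading the file only once) could look like:
def index = []
def randoms = []
def file = new File("C:/Users/320027671/Desktop/JmeterPerformanceSuit/CompleteSuit/STU3/Post/index.csv")
file.eachLine { line ->
    index << line
}
1.upto(index.size(), { i ->
    randoms.add(i)
})
Collections.shuffle(randoms)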

Code organization in MATLAB [duplicate]

This question already has answers here:
In MATLAB, can I have a script and a function definition in the same file?
(7 answers)
Closed 6 years ago.
I am writing a program in MATLAB, in a .m file. It's 300+ lines of code now, so it has become not very comfortable to read. My idea is to organize it like in C++: I want to create local functions at the end of this file and put pieces of code into them. It would be easier to read and it would consist of logical parts.
But I ran into the fact that local functions can be created only in the body of another function! So I can't do this:
x = 1;
y = 2;
z = myLocalFnc(x,y);
function res = myLocalFnc (a,b)
res = a.*b;
end
This generates the error:
Function definitions are not permitted in this context.
I can do it by wrapping the whole code in one function:
function myBigFcn
x = 1;
y = 2;
z = myLocalFnc(x,y);
end
function res = myLocalFnc (a,b)
res = a.*b;
end
But now all the variables become local and nothing is returned to the workspace.
Modifying the inputs/outputs of this function every time I create a variable doesn't feel like the right way... The advantages and disadvantages of this method are described here.
I know I can create a function in an additional .m file, save it and use it in my script. OK, but a lot of them are single-operation functions, and I don't want to create so many new files.
My question is: ARE there any other methods of organizing code?
If you really want your main thing to be a script (I don't recommend this), you can put all the other functions in separate .m files and call those from within the script.
The other option, as you've noted is to put the entire contents of your "main" part into a function at the top of your code. You can return all the necessary values via the output arguments or save them to a .mat file if you need access to them later.
As a side note, the ability to put local functions within a script (what you're trying to do) is present in R2016b which will be released later this Fall.
Update
If you want an easy way to convert your code to a function, you can automatically save all variables to a struct at the end of your function and just return this struct.
function output = my_script_that_is_now_a_function(inputs)
% Do stuff
% Now save all variables in a struct and return it
tmpfile = tempname;
save(tmpfile);
output = load(tmpfile);
delete(tmpfile);
end
One option, if you don't mind saving a file locally, is to restructure your code into multiple functions (all in the same file) and pass the necessary variables as output. You then have a short script which calls this function and creates the necessary variables.
For example, say your script looks like this
numPoints = 5;
a = randn(numPoints);
b = sin(a);
c = a + b;
You could restructure it to look like
function data = main()
data.numPoints = 5;
data.a = getA(data.numPoints);
data.b = getB(data.a);
data.c = getC(data.a, data.b);
function a = getA(numPoints)
a = randn(numPoints);
function b = getB(a)
b = sin(a);
function c = getC(a, b)
c = a + b;
and create a script file which looks like
data = main();
You then have a struct called data which contains all of your variables. If you really want them to be contained in separate variables (i.e. not in a struct) then there are a couple of ways to do it. One is to unpack them manually,
a = data.a;
b = data.b;
c = data.c;
clear data;
Another is to save the struct, and then reload it (this has the advantage of keeping a copy of the workspace that you used for this run of the function, for later analysis).
save('workspace.mat', '-struct', 'data');
load('workspace.mat');

Huge memory consumption in Map Task in Spark

I have a lot of files that contain roughly 60.000.000 lines. All of my files are formatted in the format {timestamp}#{producer}#{messageId}#{data_bytes}\n
I walk through my files one by one and also want to build one output file per input file.
Because some of the lines depend on previous lines, I grouped them by their producer. Whenever a line depends on one or more previous lines, their producer is always the same.
After grouping up all of the lines, I give them to my Java parser.
The parser then will contain all parsed data objects in memory and output it as JSON afterwards.
To visualize how I think my job is processed, I threw together the following "flow graph". Note that I did not visualize the groupByKey shuffling process.
My problems:
I expected Spark to split up the files, process the splits with separate tasks and save each task output to a "part"-file.
However, my tasks run out of memory and get killed by YARN before they can finish: Container killed by YARN for exceeding memory limits. 7.6 GB of 7.5 GB physical memory used
My Parser is throwing all parsed data objects into memory. I can't change the code of the Parser.
Please note that my code works for smaller files (for example two files with 600.000 lines each as the input to my Job)
My questions:
How can I make sure that Spark will create a result for every file-split in my map task? (Maybe they will if my tasks succeed but I will never see the output as of now.)
I thought that my map transformation val lineMap = lines.map ... (see the Scala code below) produces a partitioned RDD. Thus I expect the values of the RDD to be split in some way before calling my second map task.
Furthermore, I thought that calling saveAsTextFile on this RDD lineMap would produce an output task that runs after each of my map tasks has finished. If my assumptions are correct, why do my executors still run out of memory? Is Spark doing several (too) big file splits and processing them concurrently, which leads to the Parser filling up the memory?
Is repartitioning the lineMap rdd to get more (smaller) inputs for my Parser a good idea?
Is there somewhere an additional reducer step which I am not aware of? Like results being aggregated before getting written to file or similar?
Scala code (I left out irrelevant code parts):
def main(args: Array[String]) {
  val inputFilePath = args(0)
  val outputFilePath = args(1)
  val inputFiles = fs.listStatus(new Path(inputFilePath))
  inputFiles.foreach( filename => {
    processData(filename.getPath, ...)
  })
}

def processData(filePath: Path, ...) {
  val lines = sc.textFile(filePath.toString())
  val lineMap = lines.map(line => (line.split(" ")(1), line)).groupByKey()
  val parsedLines = lineMap.map{ case(key, values) => parseLinesByKey(key, values, config) }
  //each output should be saved separately
  parsedLines.saveAsTextFile(outputFilePath.toString() + "/" + filePath.getName)
}

def parseLinesByKey(key: String, values: Iterable[String], config : Config) = {
  val importer = new LogFileImporter(...)
  importer.parseData(values.toIterator.asJava, ...)
  //importer from now contains all parsed data objects in memory that could be parsed
  //from the given values.
  val jsonMapper = getJsonMapper(...)
  val jsonStringData = jsonMapper.getValueFromString(importer.getDataObject)
  (key, jsonStringData)
}
I fixed this by removing the groupByKey call and implementing a new FileInputFormat as well as a RecordReader to remove my limitation that lines depend on other lines. For now, I implemented it so that each split contains a 50.000 byte overhead from the previous split. This ensures that all lines that depend on previous lines can be parsed correctly.
I will now go ahead and still look through the last 50.000 Bytes of the previous split, but only copy over lines that actually affect the parsing of the current split. Thus, I minimize the overhead and still get a highly parallelizable task.
The following links pointed me in the right direction. Because the topic of FileInputFormat/RecordReader is quite complicated at first sight (it was for me, at least), it is good to read through these articles and understand whether this is suitable for your problem or not:
https://hadoopi.wordpress.com/2013/05/27/understand-recordreader-inputsplit/
http://www.ae.be/blog-en/ingesting-data-spark-using-custom-hadoop-fileinputformat/
Relevant code parts from the ae.be article, just in case the website goes down. The author (#Gurdt) uses this to detect whether a chat message contains an escaped line return (by having the line end with "\") and appends the escaped lines together until an unescaped \n is found. This allows him to retrieve messages that span two or more lines. The code is written in Scala:
Usage
val conf = new Configuration(sparkContext.hadoopConfiguration)
val rdd = sparkContext.newAPIHadoopFile("data.txt", classOf[MyFileInputFormat],
classOf[LongWritable], classOf[Text], conf)
FileInputFormat
class MyFileInputFormat extends FileInputFormat[LongWritable, Text] {
  override def createRecordReader(split: InputSplit, context: TaskAttemptContext):
    RecordReader[LongWritable, Text] = new MyRecordReader()
}
RecordReader
class MyRecordReader() extends RecordReader[LongWritable, Text] {
  var start, end, pos = 0L
  var reader: LineReader = null
  var key = new LongWritable
  var value = new Text

  override def initialize(inputSplit: InputSplit, context: TaskAttemptContext): Unit = {
    // split position in data (start one byte earlier to detect if
    // the split starts in the middle of a previous record)
    val split = inputSplit.asInstanceOf[FileSplit]
    start = 0.max(split.getStart - 1)
    end = start + split.getLength

    // open a stream to the data, pointing to the start of the split
    val stream = split.getPath.getFileSystem(context.getConfiguration)
      .open(split.getPath)
    stream.seek(start)
    reader = new LineReader(stream, context.getConfiguration)

    // if the split starts at a newline, we want to start yet another byte
    // earlier to check if the newline was escaped or not
    val firstByte = stream.readByte().toInt
    if(firstByte == '\n')
      start = 0.max(start - 1)
    stream.seek(start)

    if(start != 0)
      skipRemainderFromPreviousSplit(reader)
  }

  def skipRemainderFromPreviousSplit(reader: LineReader): Unit = {
    var readAnotherLine = true
    while(readAnotherLine) {
      // read next line
      val buffer = new Text()
      start += reader.readLine(buffer, Integer.MAX_VALUE, Integer.MAX_VALUE)
      pos = start

      // detect if delimiter was escaped
      readAnotherLine = buffer.getLength >= 1 && // something was read
        buffer.charAt(buffer.getLength - 1) == '\\' && // newline was escaped
        pos <= end // seek head hasn't passed the split
    }
  }

  override def nextKeyValue(): Boolean = {
    key.set(pos)

    // read newlines until an unescaped newline is read
    var lastNewlineWasEscaped = false
    while (pos < end || lastNewlineWasEscaped) {
      // read next line
      val buffer = new Text
      pos += reader.readLine(buffer, Integer.MAX_VALUE, Integer.MAX_VALUE)

      // append newly read data to previous data if necessary
      value = if(lastNewlineWasEscaped) new Text(value + "\n" + buffer) else buffer

      // detect if delimiter was escaped
      lastNewlineWasEscaped = buffer.charAt(buffer.getLength - 1) == '\\'

      // let Spark know that a key-value pair is ready!
      if(!lastNewlineWasEscaped)
        return true
    }

    // end of split reached?
    return false
  }
}
Note: You might need to implement getCurrentKey, getCurrentValue, close and getProgress in your RecordReader as well.
