JMeter testdata for distributed testing - jmeter

Here is my JMeter setup:
testing web services
distributed testing, 1 master, 20 slaves (potentially 100 if we decide to go with BlazeMeter)
a file containing test data, one integer per line (see [1] for an example)
a thread group with 20 users (20x20=400 requests)
CSV Data Set Config, with \n as separator
[1] Example of testdata file, each line represents an id that will be used as Web Service parameter:
23
8677
10029
29957
1001
My question is: how can I distribute the data among the slaves so that each machine uses a distinct part of the test file and selects test data items at random? One way would be to split the test file into separate parts, but is it possible to make this more dynamic? I am thinking along the lines of "machine x reads lines 0-20, machine y reads lines 21-40, and so on". In the answer to this question it is mentioned that CSV files are local, but is it possible to read different lines of the CSV dynamically?

If you do go with BlazeMeter, they have a built-in function that does exactly this. In the advanced options there is a checkbox that says:
[ ] Split any CSV file to unique files and distribute among the load servers.

Have you looked at the split command?
$ split --help
Usage: split [OPTION] [INPUT [PREFIX]]
Output fixed-size pieces of INPUT to PREFIXaa, PREFIXab, ...; default
size is 1000 lines, and default PREFIX is `x'. With no INPUT, or when INPUT
is -, read standard input.
Mandatory arguments to long options are mandatory for short options too.
-a, --suffix-length=N use suffixes of length N (default 2)
-b, --bytes=SIZE put SIZE bytes per output file
-C, --line-bytes=SIZE put at most SIZE bytes of lines per output file
-d, --numeric-suffixes use numeric suffixes instead of alphabetic
-l, --lines=NUMBER put NUMBER lines per output file
--verbose print a diagnostic to standard error just
before each output file is opened
--help display this help and exit
--version output version information and exit
You could do something like:
split -l 20 filename
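If the chunks should also be drawn at random rather than in file order, the split can be done in a small script before the files are copied to the slaves. The following is only a sketch under assumptions: the function names, the `testdata_` file prefix and the `seed` parameter are hypothetical, and the resulting files would still have to be distributed to the slaves by some other means.

```python
import random


def partition_test_data(lines, num_slaves, seed=None):
    """Shuffle the ids and deal them into num_slaves disjoint chunks."""
    items = [line.strip() for line in lines if line.strip()]
    rng = random.Random(seed)
    rng.shuffle(items)
    # Round-robin dealing keeps chunk sizes within one item of each other.
    return [items[i::num_slaves] for i in range(num_slaves)]


def write_chunks(chunks, prefix="testdata_"):
    """Write each chunk to its own file (hypothetical naming scheme)."""
    paths = []
    for index, chunk in enumerate(chunks):
        path = f"{prefix}{index:02d}.csv"
        with open(path, "w") as handle:
            handle.writelines(item + "\n" for item in chunk)
        paths.append(path)
    return paths
```

Each slave then points its CSV Data Set Config at its own chunk file, so no id is used by two machines.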

Related

Text file search tool for Windows (command line) with an extremely large pattern list

Is there an efficient way to search a list of strings from another text file or from a piped output?
I have tried the following methods:
FINDSTR /G:patternlist.txt <filetocheck>
or
Some program whose output is piped to FINDSTR
SOMEPROGRAM | FINDSTR /G:patternlist.txt
Similarly, tried GREP from MSYS, UnixUtils, GNU package etc.,
GREP -w -F -f patternlist.txt <filetocheck>
or
Some program whose output is piped to GREP
SOMEPROGRAM | GREP -w -F -f patternlist.txt
For example, Pattern List file is a text file which contains one literal string per line.
For example
Patternlist.txt
65sM547P
Bu83842T
t400N897
s20I9U10
i1786H6S
Qj04404e
tTV89965
etc.,
And the file_to_be_checked contains similar texts, but there might be multiple words in a single line in some cases.
For example
Filetocheck.txt
3Lo76SZ4 CjBS8WeS
iI9NvIDC
TIS0jFUI w6SbUuJq joN2TOVZ
Ee3z83an rpAb8rWp
Rmd6vBcg
O2UYJOos
hKjL91CB
Dq0tpL5R R04hKmeI W9Gs34AU
etc.,
They work as expected if the number of pattern literals is less than 50000, and sometimes work very slowly with up to 100000 patterns.
Also, filetocheck.txt will contain up to 250000 lines and grow up to 30 MB in size.
The problem comes when the pattern file becomes larger than this. I have an instance of patternfile which is around 20 MB and contains 600000 string literals.
Matching this against a list or output of 250000 to 300000 lines of text literally stalls the processor.
I tried SIFT, and multiple other text search tools, but they just kill the system with the memory requirements and processor usage and make the system unresponsive.
I require a commandline based solution or utility which could help in achieving this task because this is a part of another big script.
I have tried multiple programs and methods to speed this up, such as indexing the pattern file and sorting the file alphabetically, but all in vain.
Since the input will be from a program, there is no option to split the input file as well. It is all in one big piped command.
Example:
PASSWORDGEN | <COMMAND_TO_FILTER_KNOWN_PASSWORDS> >> FILTERED_OUTPUT
The problem is in the part above, where the system hangs or takes a very long time to filter the stdout stream or a saved results file.
System configuration details, if they are of any help:
I am running this on a modest machine with 8 GB RAM, a SATA HDD and a Core i7 under Win 7 64-bit, and I do not have any better configuration available at the moment.
Any help with this issue is much appreciated.
I am also trying to find a solution, if not write specific code to achieve this (help is appreciated in that sense as well).
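One way to approach the task described above is to load the pattern list into a hash set once and test each whitespace-separated word of the input against it, so the cost per line does not grow with the number of patterns. This is a minimal sketch, not a tested replacement for FINDSTR or GREP: the function names are hypothetical, and splitting on whitespace is an assumption that only approximates the word matching of `grep -w`.

```python
def load_patterns(lines):
    """Build a set of literal patterns, one per non-empty line."""
    return {line.strip() for line in lines if line.strip()}


def filter_lines(lines, patterns, invert=False):
    """Yield lines containing at least one word from patterns.

    With invert=True, yield only the lines that contain none of them,
    which is what filtering out known passwords would need.
    """
    for line in lines:
        hit = any(token in patterns for token in line.split())
        if hit != invert:
            yield line
```

In a pipeline, `lines` could be `sys.stdin` and the yielded lines could be written to `sys.stdout`, so the generator's output never has to be saved or split first.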

text manipulation using unix commands only

I have a task where I need to parse through files and extract information. I can do this easily using bash, but I have to get it done through unix commands only.
For example, I have a file similar to the following:
Set<tab>one<tab>two<tab>three
Set<tab>four<tab>five<tab>six
ENDSET
Set<tab>four<tab>two<tab>nine
ENDSET
Set<tab>one<tab>one<tab>one
Set<tab>two<tab>two<tab>two
ENDSET
...
So on and so forth. I want to be able to extract a certain number of sets, say the first 10. Also, I want to be able to extract info from the columns.
Once again, this is a trivial thing to do using bash scripting, but I am unsure of how to do this with unix commands only. I can combine the commands together in a shell script but, once again, only unix commands.
Without an output example, it's hard to know your goal, but anyway, one UNIX command you can use is AWK.
Examples:
Extract 2 sets from your data sample (without including "ENDSET" or blank lines):
$ awk '/ENDSET/{ if(++count==2) exit(0);next; }NF{print}' file.txt
Set one two three
Set four five six
Set four two nine
Extract 3 sets and print the 2nd column only (note that the 1st column is always "Set"):
$ awk '/ENDSET/{ if(++count==3) exit(0);next; }$2{print $2}' file.txt
one
four
four
one
two
And so on... (more info: $ man awk)
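For comparison, the same two extractions can be sketched in Python. This is an illustration only: the function names are hypothetical, and unlike the awk pattern `/ENDSET/`, which matches the text anywhere in a line, this version treats only a line consisting of `ENDSET` as a terminator.

```python
def first_sets(lines, limit):
    """Return the first `limit` sets; each set is a list of tab-split rows."""
    sets, current = [], []
    for line in lines:
        line = line.rstrip("\n")
        if line.strip() == "ENDSET":
            sets.append(current)
            current = []
            if len(sets) == limit:
                break
        elif line.strip():
            current.append(line.split("\t"))
    return sets


def column(sets, index):
    """Collect one column (0-based index) from every row of every set."""
    return [row[index] for rows in sets for row in rows if len(row) > index]
```

Calling `column(first_sets(lines, 3), 1)` corresponds to the second awk example, since index 1 is the field right after "Set".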

How do I open / manipulate multiple files in bash?

I have a bash script that takes advantage of a local toolbox to perform an operation.
My question is fairly simple.
I have multiple files that hold the same quantities at different time steps. I would like to first untar them all and then use the toolbox to perform some manipulation, but I am not sure if I am on the right track.
=============================================
The file is as follows
INPUTS
fname = a very large number of files with the same name but different numbering
e.g wnd20121.grb
wnd20122.grb
.......
wnd2012100.grb
COMMANDS
> cdo -f nc copy fname ofile(s)
(If this is ofile(s), the output file, how can I store it for subsequent use? I want to take the output file from this command and use it as input to the next one, producing a new, subsequently numbered set of outputs, ofile(s)2.)
>cdo merge ofile(s) ofile2
(Then automatically take ofile(s)2 and pass it as input to the next command, and so on, always producing an array of new output files with a set name that I specify but different numbering to distinguish them.)
>cdo sellon ofile(s)2 ofile(s)3
------------------------------------
To make my question clearer: I would like to know how a bash script can "grab" multiple files that usually share the same name but have different numbering to distinguish their recorded times,
e.g. file1 file2 ...file n
and then produce multiple outputs, with each output corresponding to the number of the file it was converted from,
e.g. output1 output2 ...outputn
How can I set these parameters so the moment they are generated they are stored for subsequent use in the script, in later commands?
Your question isn't clear, but perhaps the following will help; it demonstrates how to use arrays as argument lists and how to parse command output into an array, line by line:
#!/usr/bin/env bash
# Create the array of input files using pathname expansion.
inFiles=(wnd*.grb)
# Pass the input-files array to another command and read its output
# - line by line - into a new array, `outFiles`.
# The example command here simply prepends 'out' to each element of the
# input-files array and outputs each (modified) element on its own line.
# Note: The assumption is that the filenames have no embedded newlines
# (which is usually true).
IFS=$'\n' read -r -d '' -a outFiles < \
<(printf "%s\n" "${inFiles[@]}" | sed 's/^/out-/')
# Note: If you use bash 4, you could use `readarray -t outFiles < <(...)` instead.
# Output the resulting array.
# This also demonstrates how to use an array as an argument list
# to pass to _any_ command.
printf "%s\n" "${outFiles[@]}"
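The idea of keeping each output tied to the number of its input file can also be sketched by planning the file pairs first and building the commands from that plan. This is a hedged illustration: the function names and the `.nc` output suffix are assumptions, only the `cdo -f nc copy` form is taken from the question, and nothing is executed here.

```python
import os


def plan_conversions(in_files, suffix=".nc"):
    """Pair each input file with an output name that keeps its numbering."""
    plan = []
    for in_file in sorted(in_files):  # lexicographic order, not numeric
        stem, _ = os.path.splitext(in_file)
        plan.append((in_file, stem + suffix))
    return plan


def copy_commands(plan):
    """Build one `cdo -f nc copy` argument list per (input, output) pair."""
    return [["cdo", "-f", "nc", "copy", src, dst] for src, dst in plan]
```

The inputs could come from `glob.glob("wnd*.grb")`, and each argument list could be passed to `subprocess.run(cmd, check=True)`. The output names in the plan are then available as the inputs of the next step.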

Comparing two text files and counting number of occurrences

I'm trying to write a blog post about the dangers of having a common access point name.
So I did some wardriving to get a list of access point names, and I downloaded a list of the 1000 most common access point names (for which rainbow tables exist) from Renderlab.
But how can I compare those two text files to see how many of my collected access point names are open to attacks from rainbow tables?
The text files are built like this:
collected.txt:
linksys
internet
hotspot
The file of most common access point names is called
SSID.txt:
default
NETGEAR
Wireless
WLAN
Belkin54g
So the script should sort the lines, compare them and show how many times the lines from collected.txt are found in SSID.txt.
Does that make any sense? Any help would be appreciated :)
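If you don't mind using a Python script:
from collections import Counter

# Load the common SSIDs into a set for exact, fast membership tests.
with open('SSID.txt', 'r') as ssid_file:
    ssids = {line.strip() for line in ssid_file if line.strip()}

# Count every collected name that appears in the common-SSID list.
found = Counter()
with open('collected.txt', 'r') as collected_file:
    for line in collected_file:
        name = line.strip()
        if name in ssids:
            found[name] += 1

# Print each matched name with its number of occurrences.
for name, count in found.most_common():
    print(count, name)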
...it can be run in the dir containing these files - collected.txt and SSID.txt - it will return a list looking like this:
5 NETGEAR
3 default
(...)
The script reads file 1 line by line and compares each line to the whole of file 2. It can easily be modified to take the file names from the command line.
First, take a look at a simple tutorial on the sdiff command, such as "How do I Compare two files under Linux or UNIX". Notepad++ also supports this.
To find the number of times each line in file A appears in file B, you can do:
awk 'FNR==NR{a[$0]=1; next} $0 in a { count[$0]++ }
END { for( i in a ) print i, count[i] }' A B
If you want the output sorted, pipe the output to sort, but there's no need to sort just to find the counts. Note that the $0 in a clause can be omitted at the cost of consuming more memory, which may be a problem if file B is very large.

Split text file into multiple files

I have a large text file containing 1000 abstracts, with an empty line between each abstract. I want to split this file into 1000 text files.
My file looks like
16503654 Three-dimensional structure of neuropeptide k bound to dodecylphosphocholine micelles. Neuropeptide K (NPK), an N-terminally extended form of neurokinin A (NKA), represents the most potent and longest lasting vasodepressor and cardiomodulatory tachykinin reported thus far.
16504520 Computer-aided analysis of the interactions of glutamine synthetase with its inhibitors. Mechanism of inhibition of glutamine synthetase (EC 6.3.1.2; GS) by phosphinothricin and its analogues was studied in some detail using molecular modeling methods.
You can use split and set "NUMBER lines per output file" to 2. Each file would have one text line and one empty line.
split -l 2 file
Something like this:
awk 'NF{print > $1;close($1);}' file
This will create 1000 files, each named after its abstract number. The awk code writes each record to a file whose name is taken from the first field ($1). This is done only if the number of fields (NF) is greater than 0, so empty lines are skipped.
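You could always use the csplit command. This is a file splitter, but based on a regex.
Something along the lines of:
csplit -ks -f /tmp/files INPUTFILENAMEGOESHERE '/^$/' '{*}'
It is untested and may need a little tweaking. The '{*}' argument (a GNU csplit extension) repeats the pattern as many times as possible; without it, the file is split only at the first empty line.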
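The approach of naming each output file after the abstract number can also be sketched in Python. This is an illustration under assumptions: the function names and the `.txt` extension are hypothetical, and, like the awk one-liner, it assumes each abstract sits on a single line that starts with its id.

```python
import os


def split_abstracts(text):
    """Map each abstract id (the first field) to the full abstract line."""
    abstracts = {}
    for line in text.splitlines():
        fields = line.split()
        if fields:  # skip the empty separator lines
            abstracts[fields[0]] = line
    return abstracts


def write_abstracts(abstracts, out_dir):
    """Write every abstract to out_dir/<id>.txt (hypothetical naming)."""
    os.makedirs(out_dir, exist_ok=True)
    for abstract_id, body in abstracts.items():
        path = os.path.join(out_dir, abstract_id + ".txt")
        with open(path, "w") as handle:
            handle.write(body + "\n")
```

Keeping the parsing separate from the writing makes it easy to check the ids before any files are created.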