Exponential vs Uniform vs Exact Mean Response Times - performance

So I'm having a hard time with this question. It asks which option I should choose to get the fastest mean response time.
For option 1 I have an exponential distribution with a service rate of 2 per minute, which gives me a mean service time of 0.5 min = 30 s.
For option 2 I have a uniform distribution between 10 s and 50 s, so the average is the midpoint of the interval, which is 30 s.
For option 3 I have a 50% probability of an exact 10 s response time and a 50% probability of an exact 50 s response time. So if I do this calculation: (0.5)(10/60) + (0.5)(50/60), I get 0.5 min, or 30 s.
All these options give me the same mean response time, so I'm not sure what to choose here.

You want to know the expected value of the random variable in each of these cases.
For an exponential RV:
E[X] = 1/lambda (lambda = rate)
E[X] = 1/2 minute = 30 s
For a uniform distribution on [a, b], the expected value is the midpoint of the interval: E[X] = (a + b)/2.
For option 3, use the definition of expected value:
E[X] = sum(x_i * P(X = x_i)) over all i
Then your best option is the one with the lowest expected value. If they are all the same, and that is your only selection criterion, then it doesn't matter which option you pick.
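For a quick numerical sanity check (just a sketch in MATLAB, not part of the original answer; the variable names are mine):
lambda = 2;                       % exponential: service rate of 2 per minute
m_exp  = (1/lambda) * 60;         % E[X] = 1/lambda = 0.5 min = 30 s
m_uni  = (10 + 50) / 2;           % uniform on [10 s, 50 s]: midpoint = 30 s
m_disc = 0.5*10 + 0.5*50;         % two-point distribution: 30 s
fprintf('%g %g %g\n', m_exp, m_uni, m_disc)   % prints 30 30 30
All three means come out to 30 s, so the options are indeed tied on expected response time.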

Related

Finding Big O Notation Clarification without function

So this example was on this site and it was pretty clear, but what if you have this instead:
N = 1,000: 2.7 seconds
N = 2,000: 3.04 seconds
N = 4,000: 3.6 seconds
N = 8,000: 3.7 seconds
N = 16,000: 4 seconds
N doubles every time (2*N) and the average time starts to level off. I can guess from looking at the examples below that this is O(log N), but can someone clarify how you would calculate it?
O(1): known as Constant complexity
1 item: 1 second
10 items: 1 second
100 items: 1 second
The number of items is still increasing by a factor of 10, but the scaling factor of O(1) is always 1.
O(log n): known as Logarithmic complexity
1 item: 1 second
10 items: 2 seconds
100 items: 3 seconds
1000 items: 4 seconds
You'd do a regression analysis based on a log curve fit. You can start by plotting your data to get a visual confirmation.
A log fit in Wolfram Alpha, for example, shows that you're right and the growth seems to be logarithmic (for the provided data).
However, be aware that time measurements are not equal to an actual complexity analysis which is a formal proof rather than a curve fit to empirical data (which can be distorted for a number of reasons).
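As a rough illustration of that curve fit (a sketch only, using the timings from the question; the variable names are mine), you could fit time = a*log(N) + b in MATLAB:
N = [1000 2000 4000 8000 16000];
t = [2.7 3.04 3.6 3.7 4];
p = polyfit(log(N), t, 1);                     % least-squares fit of t against log(N)
fprintf('a = %.3f, b = %.3f\n', p(1), p(2))
plot(N, t, 'o', N, polyval(p, log(N)), '-')    % measured points vs. fitted log curve
set(gca, 'XScale', 'log')                      % a log curve looks linear on a log x-axis
If the fitted curve tracks the measurements closely, that supports (but does not prove) logarithmic growth.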

Thresholding algorithm - positive and negative threshold?

I recently posed a question on how to implement an algorithm which adaptively changes a threshold in real time, so that a time series reaches that threshold every N seconds. I was told that if my time series has a constant time interval (it does), I could take the absolute value, reverse sort it, and find the index in the array that gives me the average time resolution I want. I implemented it in MATLAB in this manner:
x = abs(timeseries); % Get the absolute value
x = flipud(sort(x)); % Reverse sort
N = length(x); % Size of the time series
idx = round(N/goal_time); % Find the right index
threshold = x(idx); % Set the threshold
Where goal_time is the average time at which I want to detect a 'hit' (timeseries > threshold). My problem is twofold. First, I can't tell whether the algorithm is not accurate enough or whether my data is too noisy (when using it, the average time between hits doesn't really approximate my goal very well). And secondly, how would I modify this algorithm to calculate a 'hit' time, where a hit is defined as the time series dipping below a threshold?

Adjustment algorithm?

I am trying to find a function in MATLAB, or at least the name of an algorithm, that does the following:
Let's say that I am analyzing a time series in real time. I initially start with thresholds of 10 and -10, so that when the time series goes above 10 or below -10, it's considered a 'HIT'. Let's say it initially takes the time series 5 minutes to produce a 'HIT', but I want to adjust the threshold so that, on average, it takes only 1 minute for a 'HIT' to be produced. I know it would look something like: start with 10 and -10; if it takes too long, drop to 5 and -5; if hits come too quickly, raise the threshold again; and so on.
I know there's a specific name for this type of algorithm, and there are probably built-in functions for it, but the name is eluding me. Can somebody help?
I don't know what the time resolution of your time series is, or if it's constant, so I'll leave that to you. However, here is what you can do in MATLAB if you have a constant time resolution. First take the absolute value of the values in your time series. Then sort these values in reverse order using the sort() command. Then choose the value whose index in the sorted array gives you the average time resolution you desire. For example, if your time series has size N, the time resolution is 0.1 seconds, and you want an alert on average every 1 second, then after sorting you would choose the threshold at (reverse-order) sorted position N/10.
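A minimal MATLAB sketch of that approach (the names dt and goal_time are assumptions, not from the original post):
x = flipud(sort(abs(timeseries)));   % absolute values, sorted in reverse order
N = length(x);
idx = round(N * dt / goal_time);     % e.g. position N/10 for dt = 0.1 s and goal_time = 1 s
threshold = x(idx);                  % a 'HIT' is then abs(timeseries) > threshold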

Can anyone explain the columns of summary report in Jmeter?

Is the Average shown in milliseconds? How is the Error % calculated, and on what basis are the Average and Error % totals calculated?
As per the Load Reports guide:
#Samples is the number of samples with the same label.
Average is the average time of a set of results.
Median is the time in the middle of a set of results: half of the samples took no more than this time and the remainder took at least as long (some samples may equal the median). This is a standard statistical measure; the Median is the same as the 50th Percentile.
90% Line (90th Percentile) means that 90% of the samples took no more than this time.
Min is the shortest time for the samples with the same label.
Max is the longest time for the samples with the same label.
Error % is the percent of requests with errors.
Throughput is measured in requests per second/minute/hour. The time unit is chosen so that the displayed rate is at least 1.0. When the throughput is saved to a CSV file, it is expressed in requests/second, i.e. 30.0 requests/minute is saved as 0.5.
Kb/sec is the throughput measured in Kilobytes per second. (The times in the report are in milliseconds.)

Mean max subset of array values

I am working on an algorithm to compute multiple mean max values of an array. The array contains time/value pairs, such as HR data recorded on a Garmin device over a 5 hour run. The data comes approximately once a second for an unknown period, but has no guaranteed frequency. An example would be a 10 minute mean maximum, which is the maximum average over any 10 minute window. Assume "mean" just means average value for this discussion. The desired duration is arbitrary: 1 min, 5 min, 60 min. And I'm likely going to need many of them, at least 30, but ideally any duration on demand if it isn't a lengthy request.
Right now I have a straightforward algorithm to compute one value:
1) Start at beginning of array and "walk" forward until the subset is equal to or 1 element past the desired duration. Stop if end of array is reached.
2) Find the average of those subset values. Store as max avg if larger than current max.
3) Shift a single value off the left side of array.
4) Repeat from 1 until end of array met.
It basically computes every possible consecutive average and returns the max.
It does this for each duration. And it recomputes the full average for every window instead of sliding it by removing the left point and adding the right, like one could do for a simple-moving-average series. It takes about 3-10 seconds per mean max value depending on the total array size.
I'm wondering how to optimize this. For instance, the series of all mean max values will be an exponential-looking curve, with the 1 s value highest, decreasing until it reaches the average of the entire series. Can this curve, and all the values, be interpolated from a certain number of points? Or is there some other optimization to the above heavy computation that still maintains accuracy?
"And it computes a real avg computation continuously instead of sliding it somehow by removing the left point and adding the right, like one could do for a Simple-moving-average series."
Why don't you just slide it (i.e. keep a running sum and divide by the number of elements in that sum)?
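A minimal sketch of that sliding-sum idea in MATLAB, assuming roughly one sample per second so a D-second window is just D consecutive elements (the function and variable names are mine):
function m = meanmax(hr, D)
    % hr: vector of values (e.g. heart-rate samples); D: window length in samples
    s = sum(hr(1:D));                 % sum of the first window
    m = s / D;
    for i = D+1:length(hr)
        s = s + hr(i) - hr(i-D);      % slide the window: add the new point, drop the old
        m = max(m, s / D);            % keep the largest window average seen so far
    end
end
This turns each mean max query into a single pass over the array instead of recomputing every window average from scratch; if the uneven sampling matters, the timestamps would have to be handled explicitly rather than assuming one sample per second.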

Resources