I have read the Gaussian Random Timer section in the JMeter user manual, but it is difficult to understand. If anyone has experience with it, an explanation with an example would be highly appreciated. Thanks in advance.
The Gaussian Random Timer applies a random deviation (based on the Gaussian distribution) around the constant delay offset.
For example:
Deviation: 100 ms
Constant Delay Offset: 300 ms
The delay will vary between 200 ms (300 - 100) and 400 ms (300 + 100) in about 68% of the cases, following the Gaussian distribution.
I'll try to explain it with one of the examples already posted:
Constant delay offset: 1000 ms
Deviation: 500 ms
Approximately 68% of the delays will be between [500, 1500] ms (=[1000 - 500, 1000 + 500] ms).
According to the docs (emphasis mine):
The total delay is the sum of the Gaussian distributed value (with mean 0.0 and standard deviation 1.0) times the deviation value you specify, and the offset value
Apache JMeter invokes Random.nextGaussian()*range to calculate the delay. As explained on Wikipedia, the value of nextGaussian() will be between [-1, 1] in only about 68% of the cases. In theory it could take any value (though the probability of getting values far outside this interval decreases very quickly with the distance from it).
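To make that formula concrete, here is a minimal C++ sketch (JMeter itself is Java; this is only an illustration of the same computation): a standard-normal sample is scaled by the deviation and added to the constant offset.

#include <iostream>
#include <random>

int main() {
    const double offset = 3000.0;    // constant delay offset in ms
    const double deviation = 2000.0; // deviation in ms

    std::mt19937 rng(std::random_device{}());
    std::normal_distribution<double> gauss(0.0, 1.0); // mean 0.0, std dev 1.0

    for (int i = 0; i < 10; ++i) {
        // same shape as Random.nextGaussian() * range + offset
        double delay = gauss(rng) * deviation + offset;
        std::cout << "delay ~ " << delay << " ms\n";
    }
    return 0;
}

About 68% of the printed delays fall within [1000, 5000] ms, but individual values can land well outside that interval.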
As a proof, I have written a simple JMeter test that launches one thread with a dummy sampler and a Gaussian Random Timer: 3000 ms constant delay, 2000 ms deviation:
To rule out CPU load issues, I have configured an additional concurrent thread with another dummy sampler and a Constant Timer: 5000 ms:
The results are quite enlightening:
Take for instance samples 10 and 12: 9h53'04.449" - 9h52'57.776" = 6.674", that is a deviation of 3.674" in contrast to the 2.000" configured! You can also verify that the constant timer only deviates about 1ms if at all.
I found a very nice explanation of these Gaussian timers in the Gmane JMeter users' list: Timer Question.
The Gaussian Random Timer is very similar to the Uniform Random Timer.
In the Uniform Random Timer the variation around the constant offset follows a uniform distribution.
In the Gaussian Random Timer, the variation around the constant offset follows a Gaussian (normal) distribution.
Constant delay offset (μ) = 300 ms, deviation (σ) = 100 ms
μ - σ = 200, μ + σ = 400: there is a 68% chance that the time gap between two threads falls in the range [200, 400] ms.
μ - 2σ = 100, μ + 2σ = 500: there is a 95% chance that the time gap between two threads falls in the range [100, 500] ms.
μ - 3σ = 0, μ + 3σ = 600: there is a 99.7% chance that the time gap between two consecutive threads falls in the range [0, 600] ms.
If you keep widening the interval like this, the probability gets arbitrarily close to 100%.
I am restricting myself to three standard deviations because μ - 4σ yields a negative value, and elapsed time is always positive.
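A quick way to convince yourself of these percentages is to sample the same distribution and count how many delays fall inside each interval. This is a rough C++ sketch of that check, not anything JMeter ships with:

#include <cmath>
#include <iostream>
#include <random>

int main() {
    const double mu = 300.0;    // constant delay offset in ms
    const double sigma = 100.0; // deviation in ms
    const int samples = 1000000;

    std::mt19937 rng(12345);
    std::normal_distribution<double> delay(mu, sigma);

    int within[3] = {0, 0, 0}; // counts for mu +/- 1, 2, 3 sigma
    for (int i = 0; i < samples; ++i) {
        double d = delay(rng);
        for (int k = 1; k <= 3; ++k)
            if (std::fabs(d - mu) <= k * sigma)
                ++within[k - 1];
    }
    for (int k = 1; k <= 3; ++k)
        std::cout << "within " << k << " sigma: "
                  << 100.0 * within[k - 1] / samples << "%\n";
    // prints roughly 68%, 95%, and 99.7%
    return 0;
}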
That said, it can be unrealistic to depend on the Gaussian timer when we also have the Constant Timer and the Constant Throughput Timer, which have no standard deviation (σ).
Hope that it helps.
Related
How to find delay, throughput, maximum operating frequency for my circuit in vivado?
The values that I have are: Worst Negative Slack = 2.055 ns, Total Negative Slack = 0 ns, number of failing endpoints = 0, total number of endpoints = 22082.
Delay and throughput depend on your design. A worst slack of about 2 ns means the critical path meets timing with roughly 2 ns to spare, so the clock period could be about 2 ns shorter than the current constraint (i.e. the maximum frequency is higher than what you are running at now).
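To put numbers on that, a positive worst slack means the achievable clock period is roughly the constrained period minus the slack. A rough sketch of the arithmetic, assuming a hypothetical 10 ns target period since the actual constraint is not given in the question:

#include <iostream>

int main() {
    const double target_period_ns = 10.0; // hypothetical clock constraint
    const double worst_slack_ns = 2.055;  // reported worst (positive) slack

    double achievable_period_ns = target_period_ns - worst_slack_ns;
    double max_freq_mhz = 1000.0 / achievable_period_ns; // period in ns -> MHz

    std::cout << "achievable period ~ " << achievable_period_ns << " ns\n";
    std::cout << "max frequency ~ " << max_freq_mhz << " MHz\n"; // ~125.9 MHz
    return 0;
}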
I have read a bunch of posts on SO regarding the computation of end-to-end delay in Veins, but have not found an answer that satisfactorily explains why the delay seems to be too low.
I am using:
Veins 4.7
Sumo 0.32.0
Omnetpp 5.3
Channel switching is turned off.
I have the following code, sending a message from the transmitting node:
if (sendMessage) {
    WaveShortMessage* wsm = new WaveShortMessage();
    sendDown(wsm);
}
The receiving node computes the delay using the wsm creation time, but I have also tried setting the timestamp on the transmitting side. The result is the same.
simtime_t delay = simTime() - wsm->getCreationTime();
delayVector.record(delay);
The sample output for the delay vector is as follows:
Item#  Event#  Time              Value
0      165     14.400239402394   2.39402394E-4
1      186     14.500240403299   2.40403299E-4
2      207     14.600241404069   2.41404069E-4
3      228     14.700242404729   2.42404729E-4
This means the end-to-end delay (from creation to reception) is roughly a quarter of a millisecond, which seems quite low, and a fair bit below what is typically reported in the literature. It also seems consistent with what other people have reported as an issue (e.g. end to end delay in Veins).
Am I missing something in this computation? I have tried adding load on the network by adding a high number of vehicular nodes (21 nodes within a 1000x50 sandbox on a straight highway, with an average speed of 50 km/h), but the result seems to be the same. The difference is negligible. I have read several research papers that suggest that end-to-end delay should increase dramatically in high vehicular densities.
This end-to-end delay is to be expected. If your application's simulation model does not explicitly model processing delay (e.g., by an application running on a slow general purpose computer), all you would expect to delay a frame is propagation delay (lightspeed, so negligible here) and queueing delay on the MAC (time from inserting frame into TX queue until transmission finishes).
To give an example, for a 2400 bit frame sent at 6 Mbit/s this delay is roughly 0.45 ms. You are likely using slightly shorter frames, so your values appear to be reasonable.
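As a rough cross-check of that figure (an illustrative sketch, not Veins code; the PHY overhead value below is an assumption), the time on air is approximately the PHY preamble/header time plus the payload bits divided by the data rate:

#include <iostream>

int main() {
    const double payload_bits = 2400.0;  // example frame size from above
    const double data_rate_bps = 6e6;    // 6 Mbit/s
    const double phy_overhead_s = 40e-6; // assumed preamble/header overhead

    double tx_time_s = phy_overhead_s + payload_bits / data_rate_bps;
    std::cout << "time on air ~ " << tx_time_s * 1000.0 << " ms\n"; // ~0.44 ms
    return 0;
}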
For background information, see F. Klingler, F. Dressler, C. Sommer: "The Impact of Head of Line Blocking in Highly Dynamic WLANs" (DOI 10.1109/TVT.2018.2837157), which also includes a comparison of theory vs. Veins vs. real measurements.
My beacons have advertisement interval of 330ms. I use an iOS device to scan the advertisement packet whose scanning rate is 1 scan per second on average. I want to use the moving average filter to smooth the fluctuating RSSI values. Considering the walking speed of 1.2 m/s and the advertisement interval of 330 ms, what should be the size of a window in the moving average filter? Is there any mathematical relationship between them?
Thank you.
There is no one correct answer here. It is a trade-off between noise in the distance estimate and lag time.
The larger (and longer) your statistical sample, the more lag there will be in a running average. A 20-second window will tell you where you were on average over the last 20 seconds and will filter out a lot of noise. A 5-second running average will tell you where you were on average over the last 5 seconds, but with much more noise in the calculation.
How much lag and how much noise you can tolerate both depend on your use case. Use cases that are very time sensitive may sacrifice accuracy for the sake of less lag. Conversely, use cases needing greater accuracy may accept more lag to filter out more noise in the estimate.
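As a practical starting point you could smooth over a few seconds of readings (at roughly one scan per second, that is simply a window of a few samples) and tune from there. Below is a minimal sketch of such a filter; the class name and the window size of 5 are hypothetical choices for illustration only:

#include <cstddef>
#include <deque>
#include <iostream>

// Simple moving average over the last `window` RSSI readings.
class MovingAverage {
public:
    explicit MovingAverage(std::size_t window) : window_(window) {}

    double add(double rssi) {
        values_.push_back(rssi);
        sum_ += rssi;
        if (values_.size() > window_) {
            sum_ -= values_.front();
            values_.pop_front();
        }
        return sum_ / values_.size();
    }

private:
    std::size_t window_;
    std::deque<double> values_;
    double sum_ = 0.0;
};

int main() {
    MovingAverage filter(5); // ~5 s of history at 1 scan/s
    double readings[] = {-68.0, -74.0, -61.0, -70.0, -72.0, -66.0};
    for (double rssi : readings) {
        std::cout << "smoothed RSSI: " << filter.add(rssi) << " dBm\n";
    }
    return 0;
}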
I have a book describing energy-saving compiler algorithms in which a variable uses "cycles" as the unit of measure for the "distance" until something happens (an HDD is put into idle mode).
But the results for efficiency of the algorithm have just "time" on one axis of a diagram, not "cycles". So is it safe to assume (i.e. my understanding of the cycle concept) that unless something like dynamic frequency scaling is used, cycles are equal to real physical time (seconds for example)?
Cycles map directly to real physical time: a CPU running at 1 GHz executes 1,000,000,000 cycles per second, which is 1/1,000,000,000 seconds per cycle, or in other words one cycle per nanosecond. With dynamic frequency scaling, that mapping changes whenever the frequency changes.
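As a small sketch of the conversion, with an assumed fixed 1 GHz clock:

#include <iostream>

int main() {
    const double frequency_hz = 1e9;        // assumed fixed 1 GHz clock
    const long long cycles = 2500000000LL;  // some "distance" measured in cycles

    double seconds = cycles / frequency_hz; // 2.5 s at 1 GHz
    std::cout << cycles << " cycles ~ " << seconds << " s\n";
    return 0;
}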
So I want to know how to calculate the total memory effective bandwidth for:
cublasSdot(handle, M, devPtrA, 1, devPtrB, 1, &curesult);
where that function belongs to cublas_v2.h.
That function runs in 0.46 ms, and each vector is 10000 * sizeof(float) bytes.
So do I get ((10000 * 4) / 10^9) / 0.00046 = 0.086 GB/s?
I'm wondering because I don't know what happens inside the cublasSdot function, and I don't know whether that needs to be taken into account.
In your case, the size of the input data is 10000 * 4 * 2 bytes, since you have 2 input vectors, and the size of the output data is 4 bytes. The effective bandwidth should be about 0.172 GB/s.
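Written out as a quick sketch of that arithmetic, using the byte counts described above:

#include <iostream>

int main() {
    const double n = 10000.0;            // vector length
    const double bytes_read = 2 * n * 4; // two float input vectors
    const double bytes_written = 4;      // one float result
    const double time_s = 0.46e-3;       // 0.46 ms kernel time

    double bandwidth_gbs = (bytes_read + bytes_written) / 1e9 / time_s;
    std::cout << "effective bandwidth ~ " << bandwidth_gbs << " GB/s\n"; // ~0.17 GB/s
    return 0;
}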
Basically cublasSdot() does little more than the computation itself.
The profiling result shows that cublasSdot() invokes 2 kernels to compute the result. An extra 4-byte device-to-host memory transfer is also invoked if the pointer mode is CUBLAS_POINTER_MODE_HOST, which is the default mode for the cuBLAS library.
If the kernel time is plugged into the formula in milliseconds, a multiplication factor of 1000 is needed to convert it to seconds; applying that factor to a time that is already expressed in seconds (0.00046 s) is what would inflate the result to 86 GB/s.
As an example, refer to the Matrix Transpose sample provided by NVIDIA at http://docs.nvidia.com/cuda/samples/6_Advanced/transpose/doc/MatrixTranspose.pdf. The entire code is on the last page. There the effective bandwidth is computed as 2. * 1000 * mem_size / (1024 * 1024 * 1024) / (time in ms).
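That formula can be wrapped in a small helper; here is a sketch, assuming mem_size is the number of bytes moved in one direction and the factor of 2 accounts for one read plus one write, as in the transpose sample (the matrix size and time in the example are hypothetical):

#include <iostream>

// Effective bandwidth in GB/s, following the transpose sample's formula:
// 2 * 1000 * mem_size / (1024^3) / (time in ms)
double effectiveBandwidthGBs(double mem_size_bytes, double time_ms) {
    return 2.0 * 1000.0 * mem_size_bytes / (1024.0 * 1024.0 * 1024.0) / time_ms;
}

int main() {
    // Hypothetical example: a 1024 x 1024 matrix of floats transposed in 1.5 ms.
    double mem_size = 1024.0 * 1024.0 * sizeof(float);
    std::cout << effectiveBandwidthGBs(mem_size, 1.5) << " GB/s\n"; // ~5.2 GB/s
    return 0;
}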