bash loop with curl evidencing non-linear scaling of response times - performance

I wrote this simple Bash script to count how often a URL returns an error page:
date;
iterations=10000;
count_error=0;
count_expected=0;
for ((counter = 0; counter < iterations; ++counter)); do
    if curl -s http://www.example.com/example/path | grep -iq error;
    then
        ((count_error++));
    else
        ((count_expected++));
    fi;
    sleep 0.1;
done;
date;
echo count_error=$count_error count_expected=$count_expected
I'm finding that total execution time does not scale linearly with the iteration count: 10 iterations take 00:00:12, 100 take 00:01:46, 1000 take 00:17:24, 10000 take ~50 minutes, and 100000 take ~10 hours.
Can anyone provide insight into the non-linearity, or suggest improvements to the script? Is curl unable to fire requests at a rate of 10 per second? Is a garbage collector periodically having to clear internal buffers that fill up with response text?
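To make the scaling easier to judge, the totals quoted above can be converted to an average time per iteration. This is only a sketch of the arithmetic: the function name per_iteration is made up for illustration, and the last two totals are the approximate figures from the question (~50 minutes, ~10 hours), not exact measurements.

```bash
# Average seconds per iteration for each run reported above.
# The last two totals are the question's approximations (~50 min, ~10 h).
per_iteration() {
    # $1 = iteration count, $2 = total run time in seconds
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%d iterations: %.3f s each\n", n, t / n }'
}

per_iteration 10     12      # 00:00:12
per_iteration 100    106     # 00:01:46
per_iteration 1000   1044    # 00:17:24
per_iteration 10000  3000    # ~50 min
per_iteration 100000 36000   # ~10 h
```

On these figures the cost per iteration stays between roughly 1.0 and 1.2 seconds up to 1000 iterations and falls to roughly 0.3 seconds for the two longest runs, so the non-linearity is a speed-up on long runs rather than a slowdown.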

Here are a few thoughts:
You are not creating 10 requests per second here (as you stated in the question). The requests run sequentially: each iteration waits for curl to finish and then sleeps for 0.1 seconds, so the rate always stays below 10 per second and depends on the response time.
The ; at the end of each line is not required in Bash.
When testing your script from my machine against a different URL, 10 iterations take 3 seconds, 100 take 31 seconds, and 1000 take 323 seconds, so the execution time scales linearly in this range.
You could try using htop or top to identify performance bottlenecks on your client or server.
ApacheBench (ab) is a standard tool for benchmarking web servers and is available on most distributions. See the ab(1) man page for more information.
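One way to see where the time goes is to record how long each individual request takes. The following is a minimal sketch, not a definitive rewrite of the script: the function names fetch and probe are invented for this example, the sleep from the original loop is left out, and the only curl features used are -s and the -w '%{time_total}' write-out variable.

```bash
# fetch prints the response body followed by a final line holding curl's
# total request time in seconds (-w '%{time_total}').
fetch() {
    curl -s -w '\n%{time_total}\n' "$1"
}

# probe URL COUNT: request URL COUNT times sequentially, print one line per
# request ("<index> <seconds> <error|ok>"), then a summary line.
probe() {
    url=$1
    count=$2
    errors=0
    i=1
    while [ "$i" -le "$count" ]; do
        response=$(fetch "$url")
        # The last line is the timing; everything before it is the body.
        seconds=$(printf '%s\n' "$response" | tail -n 1)
        body=$(printf '%s\n' "$response" | sed '$d')
        if printf '%s\n' "$body" | grep -iq error; then
            status=error
            errors=$((errors + 1))
        else
            status=ok
        fi
        echo "$i $seconds $status"
        i=$((i + 1))
    done
    echo "count_error=$errors count_expected=$((count - errors))"
}
```

It could be run as: probe http://www.example.com/example/path 100. If the per-request times are stable, the total should grow linearly with the count; if they drift, the cause is more likely on the network or server side than in the loop itself.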

Related

Understanding MPI multi-host mode

I have two hosts with the same number of cores (20 = 10 virtual + 10 real).
To test my MPI performance I use a simple matrix multiplication program.
My understanding of MPI with two hosts is "use the maximum number of cores on every host".
The problem is the following:
In single-node mode I run mpirun -n 20 mm and get an execution time of about 0.5 s.
Then in multi-node mode I run mpirun -n 20 --host srv1:20,srv2:20 mm and get an execution time of about 0.5 s as well.
So using two hosts gives no better performance, although I expected an improvement.
Which settings, options, or configuration files should I check and fix to get the expected result?
Thanks.

How does lftp calculate the throughput in parallel mode?

I'm using lftp (lftp --version shows Version 4.0.9) in mirror mode to test the performance of some SFTP servers. I'm especially interested in the throughput (bytes/sec) when I run lftp with different numbers of concurrent connections.
When I ran the test with 25 concurrent connections, it reported a rather strange 5866 seconds as the download time. To check the real time spent on the download I used the time command (as suggested in this related question).
The output was:
$ time lftp -e 'mirror --parallel=25 (rest of the command-line)'
21732837094 bytes transferred in 5866 seconds (3.53M/s)
real 4m31.315s
user 1m25.977s
sys 1m38.041s
My first thought was that those 5866 seconds were the sum of the time spent by every connection, so dividing by 25 gives 234.64 seconds (or 3m54.64s), which is still some way off 4m31.315s.
Does anyone have an insight on how the numbers from lftp are calculated?
Before lftp-4.5.0, mirror (incorrectly) summed the overlapping durations of the parallel transfers. This was fixed in 4.5.0, which counts the wall-clock time during which any of the transfers was active.
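The figures quoted in the question can be checked with a little arithmetic. This is a sketch using only the numbers from the output above; the function name lftp_rates is invented for the example, and treating "M" as MiB (1024 * 1024 bytes) is an assumption, although it is the interpretation that reproduces the printed 3.53M/s.

```bash
# Figures copied from the lftp and time output above.
bytes=21732837094
reported_seconds=5866    # duration printed by lftp 4.0.9
real_seconds=271.315     # 4m31.315s from the time command
connections=25

lftp_rates() {
    awk -v b="$bytes" -v s="$reported_seconds" -v r="$real_seconds" -v c="$connections" 'BEGIN {
        mib = 1024 * 1024    # assumption: lftp "M" means MiB
        printf "rate over reported duration: %.2f MiB/s\n", b / s / mib
        printf "rate over wall-clock time:   %.2f MiB/s\n", b / r / mib
        printf "reported duration / %d:      %.2f s\n", c, s / c
    }'
}

lftp_rates
```

The first line comes out at 3.53 MiB/s, matching what lftp printed, so the reported rate is the byte count divided by the summed duration. Dividing by the wall-clock time instead gives about 76.39 MiB/s, which is the figure to use when comparing runs with different --parallel values on this lftp version.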

JMeter showing different request time

I have written a script that combines 10 HTTP requests, each with a different number of threads but the same ramp-up period of 1 second. I looked at Kibana after the test run. What I can see is that the API requests are sent over 5 seconds even though the ramp-up period is set to 1 second. The command I used to generate the graphs and results file is below.
jmeter -n -t S:\roshTests\Cocktail\Cocktail.jmx -l S:\roshTests\Cocktail\results.csv
Even the results.csv file shows timestamps spread over a gap of 5 seconds.
Can someone answer the following?
a) Does the results.csv file show the response time or the time at which JMeter sent the request?
b) A few thread counts are in the 1000 range and a few are below 100. Is there any limit on the maximum number of threads that can be started in one second using JMeter?
c) Why does results.csv show a gap of 5 seconds with the ramp-up set to 1 second?
d) Is there any graph that shows whether the requests are sent within one second?

Why does using a higher Loop Count improve site performance in JMeter?

I've noticed that when load testing with JMeter, a single loop gives a fairly long average time for my test. With a Loop Count of, say, 10, the average time peaks early on and then drops way down. For example, if I set up a test on a simple GET request for a page with the following settings:
Number of Threads (users): 500
Ramp-up Period(in seconds): 5
Loop Count: 1
My average time is about 4 seconds. If I change it to 10 loops:
Number of Threads (users): 500
Ramp-up Period(in seconds): 5
Loop Count: 10
I get an average time of 1.4 seconds.
Apache's documentation states that the Loop Count is:
The number of times the subelements of this controller will be
iterated each time through a test run.
Is it possible that this means the first request will actually do something on the server and the subsequent 9 requests will be pulling from cache?
How exactly is the Loop Count being used that would cause the results I'm seeing?
Yes, the remaining 9 requests are most likely being served from cache.
The Loop Controller is a simple loop executor; it does nothing magic internally.
The improved performance comes from cached results on the server.
One thing you can try: keep the Loop Controller but substitute different parameters so that a different request is sent to the server every time (I know the Loop Controller is meant for repeating the same values, but this is just to confirm the effect of caching), then compare the results.
I hope this clears up the doubt :)
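As a rough plausibility check of the caching explanation, the two averages from the question imply an average for the later loops. This is a hypothetical back-of-the-envelope sketch: it assumes the first loop of the 10-loop run averages the same 4 seconds as the single-loop run, which neither test measured directly, and the function name later_loops_average is made up for the example.

```bash
# Hypothetical back-of-the-envelope check of the caching explanation.
# Assumes the first loop of the 10-loop run averages the same 4 s as the
# single-loop run; neither test measured that directly.
later_loops_average() {
    # $1 = first-loop average (s), $2 = overall average (s), $3 = loop count
    awk -v first="$1" -v overall="$2" -v loops="$3" \
        'BEGIN { printf "%.2f\n", (overall * loops - first) / (loops - 1) }'
}

later_loops_average 4 1.4 10
```

Under that assumption, loops 2 to 10 would average about 1.11 seconds each, which is the kind of drop a warm server-side cache could produce.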

jMeter - performance degrading with higher loop count

I need a little help debugging this. My current JMeter scenario seems to run fine as long as I keep the loop count at 1; when I add more loops, the performance degrades a lot.
I have a thread group with 225 threads, a 110 s ramp-up, and a loop count of 1, and my total response time is about 8-9 s. I ran this several times to confirm, and each run shows similar response times.
Then I ran the same test with only the loop count changed to 3, and the performance dropped sharply: the total response time is about 30-40 s.
I was under the impression that three runs of 1 loop would be more or less equivalent to one run of 3 loops. That does not seem to be the case. Could anyone explain why?
Or, if they should be equivalent, any idea where to look for the cause of the degraded performance?
What you're saying is that the response times degrade if you increase the throughput (as in requests per second).
Based on 225 threads making a single request with a ramp-up of 110 seconds, your throughput is going to be in the region of 2 requests every second. Increasing the loop count to 3 is going to raise that by roughly a factor of 3, to about 6 requests a second (assuming no timers). Except, of course, that if the response times are increasing then you will not reach this level of throughput, which is your problem.
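The estimate above can be reproduced with a one-line calculation. This is a deliberately crude sketch: it ignores response time, timers, and the overlap between loops, exactly as the rough estimate does, and the function name request_rate is invented for the example.

```bash
# Rough request-rate estimate from the thread-group settings in the question.
# Ignores response time, timers, and overlap between loops.
request_rate() {
    # $1 = threads, $2 = ramp-up in seconds, $3 = loop count
    awk -v threads="$1" -v rampup="$2" -v loops="$3" \
        'BEGIN { printf "%.2f requests/s\n", threads / rampup * loops }'
}

request_rate 225 110 1    # loop count 1
request_rate 225 110 3    # loop count 3
```

This gives about 2.05 requests per second for one loop and about 6.14 for three, in line with the "2" and "6" quoted above.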
Given that this request is already taking 8-9 seconds, which is not especially fast, it could be assumed that there is some heavy thinking going on behind the scenes and that you have simply hit a bottleneck, somewhere...
Try using fewer threads and a longer ramp-up, then monitor the response times and the throughput rate. At some point, as the load increases, you will see response times start to degrade; at that point you need to roll up your sleeves and look at what is happening in your application under test (AUT).
Note. 3 x 1 loop is not the same as 1 x 3 loops. The delay between iterations will cause one thread with multiple iterations to have a different throughput vs. more threads with one iteration where the throughput is decided by the rampup, not the delay. That said, this is not what you describe in your question - you mention that the number of threads is consistent.
In addition to the answer from Oliver: try using a custom listener such as the Active Threads Over Time Listener to monitor your load scenario.
You can also rerun both of the scenarios described above with this listener; you should see the difference in the graphs.
