Different results in ApacheBench with and without concurrent requests

I am trying to get some statistics on response time at my production server.
When calling ab -n100 -c1 "http://example.com/search?q=something" I get the following results:
Connection Times (ms)
              min  mean[+/-sd] median    max
Connect:       24    25   0.7      24     29
Processing:   526   874 116.1     868   1263
Waiting:      313   608 105.1     596   1032
Total:        552   898 116.1     892   1288
But when I call ab -n100 -c3 "http://example.com/search?q=something" the results are much worse:
Connection Times (ms)
              min  mean[+/-sd] median    max
Connect:       24    25    0.8     25     30
Processing:   898  1872 1065.6   1689   8114
Waiting:      654  1410  765.5   1299   7821
Total:        923  1897 1065.5   1714   8138
Taking into account that the site is in production, so there are requests besides mine, I can't explain why calls with no concurrency are so much faster than calls with even a small amount of concurrency.
Any suggestions?

If you have a concurrency of 1, that means you are telling AB to hit this URL as fast as it can, using one thread. The value -c3 tells AB to do the same thing but using 3 threads, which is probably going to result in a greater volume of calls and which, in your case, appears to have slowed things down. (Note that AB is single-threaded, so it doesn't actually use multiple OS threads, but the analogy still holds.)
It's a bit like having more lanes at a tollbooth: one lane can only process cars so fast, but with three lanes you will get more throughput. No matter how many lanes you have, though, the width of the tunnel the cars must pass through after the tollbooth also limits throughput, which is probably what you are seeing.
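You can see this in your own numbers. By Little's law, throughput is roughly concurrency divided by mean response time, so the two runs work out to:

-c1: 1 / 0.898 s ≈ 1.1 requests per second
-c3: 3 / 1.897 s ≈ 1.6 requests per second

The three concurrent workers push roughly 40% more requests through per second, and each individual request takes longer because they contend for the same backend.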
As a general note, a better approach to load testing is to decide what level of traffic your app needs to support, and then design a test that generates that level of throughput and no more. Running threads as fast as they can, like AB does, tends to make any kind of controlled testing hard. JMeter is better for this.
Also, you might want to think about setting up a test server for this sort of thing; it's less risky...

Related

JMeter - Throughput Shaping Timer does not keep the requests/sec rate

I am using the Ultimate Thread Group with a fixed count of 1020 threads for the entire test duration of 520 seconds.
I've made a throughput diagram as follows:
The load increases over 10 seconds, so the spikes shouldn't be very steep. Since the max RPS is 405 and the max response time is around 25000 ms, 1020 threads should be enough.
However, when I run the test (jmeter -t spikes-nomiss.jmx -l spikes-nomiss.csv -e -o spikes-nomiss -n), I get the following graph of hits per second.
The threads stop for a few seconds and then suddenly 'wake up'. I can't find a reason for it. The final minute has a much higher call frequency. I've set the heap size to 2 GB and resources are available: CPU usage does not exceed 50% during peaks, and memory is around 80% (4 GB of RAM on the machine). Seeking any help to fix the freezes.
Make sure to monitor JMeter's JVM using JConsole, as it might be that JMeter is not capable of creating the spikes due to insufficient resources. The slowdowns can be caused by excessive garbage collection.
It might also be that 1020 threads are not enough to reach the desired throughput, since the achievable rate depends mainly on your application's response time: the most you can get is threads / response time, so 405 RPS with 1020 threads requires an average response time of no more than 1020 / 405 ≈ 2.5 seconds, while yours peaks around 25 seconds. It might be a better idea to consider using the Concurrency Thread Group, which can be connected to the Throughput Shaping Timer via its Schedule Feedback function.
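For example (a sketch, not your exact plan: the first argument must match the name of your Throughput Shaping Timer element, and the numeric values are placeholders), the Target Concurrency field of the Concurrency Thread Group would hold the feedback function:

${__tstFeedback(shaper,10,1020,100)}

Here 10 and 1020 are the minimum and maximum concurrency, and 100 is the spare concurrency to keep in reserve.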

File read time in C increases unexpectedly

I'm currently facing an annoying problem: I have to read a big data file (500 GB) which is stored on an SSD, a RevoDrive 350.
I read the file using the fread function in big memory chunks (roughly 17 MB per chunk).
At the beginning of my program everything goes smoothly: it takes 10 ms to read 3 chunks. Then, after about 10 seconds, read performance collapses and varies between 60 and 90 ms per chunk.
I don't know why this is happening, or whether it is possible to keep the read time stable.
Thank you in advance
Rob
17 MB per chunk, 10 ms for 3 chunks -> 51 MB per 10 ms, i.e. about 5.1 GB/s.
10 sec = 1000 x 10 ms -> at that rate, about 51 GB read after 10 seconds!
How much memory do you have? Is your pagefile on the same disk?
The system may swap memory!
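One way to narrow this down (a minimal sketch, assuming Linux/POSIX; the file name is a placeholder, and the 17 MB chunk size mirrors the question) is to wall-clock-time every fread and watch when the per-chunk time jumps; if the jump coincides with free memory running out, the cache/swap explanation fits:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define CHUNK (17 * 1024 * 1024)  /* ~17 MB per read, as in the question */

/* Wall-clock time in milliseconds. CLOCK_MONOTONIC is used instead of
   clock(), which measures CPU time and would miss time spent waiting
   on I/O. */
static double now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

int main(void)
{
    FILE *f = fopen("bigfile.dat", "rb");  /* placeholder path */
    if (!f) { perror("fopen"); return 1; }

    char *buf = malloc(CHUNK);
    if (!buf) { fclose(f); return 1; }

    size_t n;
    int i = 0;
    do {
        double t0 = now_ms();
        n = fread(buf, 1, CHUNK, f);
        printf("chunk %d: %zu bytes in %.1f ms\n", i++, n, now_ms() - t0);
    } while (n == CHUNK);  /* stop at EOF or on a short read */

    free(buf);
    fclose(f);
    return 0;
}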

Increase apache requests per second

I want to increase Apache's requests-per-second figure.
I'm using ApacheBench to measure it, and it's not going over 500.
ab -n 100 -c 100 http://localhost/
This is the command I'm using; it gives me just under 500 RPS:
Concurrency Level: 100
Time taken for tests: 0.212 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 17925 bytes
HTML transferred: 900 bytes
Requests per second: 472.05 [#/sec] (mean)
Time per request: 211.843 [ms] (mean)
Time per request: 2.118 [ms] (mean, across all concurrent requests)
Transfer rate: 82.63 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median    max
Connect:        9     9   0.2       9      9
Processing:    20   150  36.8     160    200
Waiting:       19   148  36.6     159    200
Total:         30   159  36.8     169    209
Percentage of the requests served within a certain time (ms)
50% 169
66% 176
75% 182
80% 187
90% 200
95% 206
98% 209
99% 209
100% 209 (longest request)
This is the whole output.
I'm using the worker MPM for this, with the following config:
<IfModule mpm_worker_module>
ServerLimit 200
StartServers 200
MaxClients 5000
MinSpareThreads 1500
MaxSpareThreads 2000
ThreadsPerChild 64
MaxRequestsPerChild 0
</IfModule>
I suppose these are pretty high figures; nevertheless, I keep increasing them and nothing seems to change.
The application itself doesn't contain anything; it only prints 'Hello World' with CherryPy.
I want to increase it to around 2000 RPS. My RAM is 5 GB (I'm using a VM).
The numbers you've set in your configuration look wrong - but the only way to get the right numbers is by measuring how your system behaves with real traffic. Note also that with -n 100 and -c 100 the whole test is a single burst of 100 requests that finishes in about 0.2 seconds, which is far too short to measure a steady-state rate.
Measuring response time across the loopback interface is not very meaningful. Measuring response time for a single URL is not very meaningful. Measuring response time with a load generator running on the same machine as the webserver is not very meaningful.
Making your site go faster / increasing its capacity is very difficult and needs much more testing, data, and analysis than is appropriate for this forum.
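As a bare minimum, run the load generator from a separate machine and give it enough requests to reach a steady state - something like (the host is a placeholder):

ab -n 10000 -c 100 http://your-server/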

Slow Apigee response on cached items

I've set up a responseCache policy and added a header for cache hits and misses, based on the example on GitHub. The responses seem to work as expected: the first call for a specific key returns a miss, but the second returns a hit in the headers. The problem is that there seems to be a lot of latency even on a cache hit - in the 500 ms to 1000 ms range, which seems really high to me. Is this because it's on a developer account?
I also tried the trace tool, and those responses seem quick as expected (around 20 ms), but not in the app from my laptop in Chrome.
Here are some details for the 10K request.
Compare those times to stackoverflow.com, for example (logged in), which takes about 200 ms for 40 KB of page data. For fun, I added Stack Overflow to the system, enabled the cache, and got similarly slow responses.
ab -n 100 -c 1 http://frank_zivtech-test.apigee.net/schedule/get?date=12/12/12
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking frank_zivtech-test.apigee.net (be patient).....done
Server Software: Apache/2.2.16
Server Hostname: frank_zivtech-test.apigee.net
Server Port: 80
Document Path: /schedule/get?date=12/12/12
Document Length: 45664 bytes
Concurrency Level: 1
Time taken for tests: 63.421 seconds
Complete requests: 100
Failed requests: 0
Write errors: 0
Total transferred: 4640700 bytes
HTML transferred: 4566400 bytes
Requests per second: 1.58 [#/sec] (mean)
Time per request: 634.208 [ms] (mean)
Time per request: 634.208 [ms] (mean, across all concurrent requests)
Transfer rate: 71.46 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median    max
Connect:       36    47  20.8      43    166
Processing:   385   587 105.5     574    922
Waiting:      265   435  80.7     437    754
Total:        428   634 114.6     618   1055
Percentage of the requests served within a certain time (ms)
50% 618
66% 624
75% 630
80% 652
90% 801
95% 884
98% 1000
99% 1055
100% 1055 (longest request)
Here it is using webpagetest.org with a different cached endpoint with very little data:
http://www.webpagetest.org/result/140228_7B_40G/
This shouldn't be related to a free vs. paid account.
If trace is showing ~20 ms for the time spent in Apigee, I would factor in network latency from your client (laptop) to Apigee. Time between the client and Apigee can also be higher depending on the payload size and on whether it is compressed (gzip, deflate).
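To separate network time from time spent in Apigee, you could time the individual request phases from your laptop, e.g. with curl (the URL is the one from your post):

curl -s -o /dev/null -w "connect: %{time_connect}s first byte: %{time_starttransfer}s total: %{time_total}s\n" "http://frank_zivtech-test.apigee.net/schedule/get?date=12/12/12"

If the first-byte time stays high on cache hits while the connect time is small, the wait is server-side or in the payload transfer; if the connect time itself is large, it's network distance.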

Cassandra on Amazon EC2, lots of IOWait

We have the following stats for single-node Cassandra on an Amazon EC2/RightScale m1.large instance with 2 ephemeral disks in RAID 0 (7.6 GB total memory).
4 GB of RAM is allocated to the Cassandra heap; the heap NEW size is 800 MB.
The following stats are from OpsCenter Community 2.0:
Read Requests 285 to 340 per second
Write Requests 257 to 720 per second
OS Load 15.15 to 17.15
Write Request Latency 293 to 685 micros
OS Sent Network Traffic 18 MB to 30 MB per second
OS Received Network Traffic 22 MB to 34 MB per second
OS Disk Queue Size 23 to 26 requests
Read Requests Pending 8 to 20
Read Request Latency 69140 to 92885 micros
OS Disk latency 37 to 42 ms
OS Disk Throughput 12 to 14 MB per second
Disk IOPs Reads 600 to 740 per second
Disk IOPs Writes 2 to 7 per second
IOWait 60 to 70 % CPU avg
Idle 24 to 30 % CPU avg
Rowcache is disabled.
Are the above stats satisfactory for the provided configuration, or how could we tweak it to get less IOWait? We think we are experiencing a lot of IOWait; how could we tune this for the best results?
Read requests are mixed: some go to a super column family and some to a standard one, each with more than a million keys. The super column family has a varying number of super columns (max 14), each with anywhere from 1 to 10,000 subcolumns; the standard column family has a varying number of columns (max 14). The subcolumns are very thin, with 0-byte values and 8-byte names.
The process removes data from the super column family and writes the processed data to the standard one.
Would EBS disks work better on Amazon EC2?
I'm not positive whether you can tweak your config easily to get more disk performance, but using Snappy compression could help a good deal by reducing how much your app needs to read overall. It may also help to use the new composite-column layout instead of super columns.
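With a 1.x-era cassandra-cli, enabling Snappy on a column family looks roughly like this (the column family name and chunk size are placeholders; check the docs for your exact version):

UPDATE COLUMN FAMILY MyCF WITH compression_options = {sstable_compression: SnappyCompressor, chunk_length_kb: 64};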
One thing I can say for sure: EBS will NOT work better. Stay away from that at all costs if you care about latency.
