today I had a debate which is the better benchmarking strategy: local machine vs remote machine. And I would like to know some other opinions on that topic. It is more general question, but in this case it was a java project.
The test was about how well some template engines perform.
After a warming up phase of 30 seconds the benchmark runs 120 seconds with 100 threads.
The tested page has some static and dynamic parts. like random items in a table and list, but a static header for example (no database operations).
Senario 1:
The webserver and the benchmarking tools runs on the same machine.
The tests were made with ab and jmeter.
Pro arguments:
the production would also be not an ideal system.
other java process may also run on the production system.
even if the benchmarking tools slow down the result in average the results are still valid ( the better technology would be better on any variant)
Con arguments:
the same machine tries to beat itself
benchmarks slow down real performance
Senario 2:
A remote machine tests the other system:
Pro arguments:
one system can try to reach the limits of the other
benchmarks (jmeter) does not run on the same jvm
Con arguments:
production system would not be an ideal environment
network may slow down the results
First I would like to know which is the better way to go. Second if some of the arguments are not valid. Or in the end it doesn't even matter. And I would like to know other arguments for any of both variants. I hope this question does not get closed, because I think it is not to general. One option must be the better one.
Scenario 2, in almost all cases, is going to be closer to production behavior. I would argue that the cons listed for Scenario 1 effectively invalidate any of the results. However, as long as you are consistent with the approach you should be able to use results for comparisons even if they do not match what you expect to see from production.
Addressing the cons of Scenario 2:
Production System would not be an ideal environment.
If you are about to load test production, something else is wrong to begin with.
Network may slow down results.
While this might be true, the networking for a distributed internal test environment is still better than your customers will have from external sources. In almost all cases you will want to do bandwidth network simulations to ensure production-like behavior.
To accomplish this, look for the following line in the jmeter.properties file:
# Define characters per second > 0 to emulate slow connections
#httpclient.socket.http.cps=0
#httpclient.socket.https.cps=0
This will limit the "bandwidth" of each JMeter thread. (quoted because the throttling does not happen at the network level) Do a simple study to see what CPS means in terms of your applications.
The Short Answer:
Always try to test in a distributed setup because you have more flexibility. Also, consider the scenario where you need multiple load generator machines to actually stress a target. In that case you have no choice at all, so I would argue "best practice" is to configure isolated test environments with separate load generator machines.
Also, to effectively maximize the output of each load generator machine it is advised to customize open file descriptor limits (ulimits) and JVM memory settings for the load generator machines. In most cases those configurations are not acceptable for application servers.
Hope that helps, please let me know if I can elaborate further,
-Addled
Related
I have seen some questions about the limits of jmeter, like "What is the highest number of threads?" and "What are the physical limits of jmeter?". As some answers indicate, there's no specific limit to jmeter, but rather to jmeter configurations used on specific hardware setups. However, folks do indicate there's a limit & give tips on how to optimize.
My question is more basic - "how can I tell if I'm hitting the limits of my client (Jmeter + hardware)?"
I'm not talking about OOM errors (like described in this blog post), which are pretty obvious, but rather if jmeter is lagging. In the aggregate report, I can see throughput, and I could also count number of responses received in csv output & divide by time. Should I just check if that's equal to my desired QPS? Achieving a desired QPS in jmeter generally seems trickier than just blasting the server with users though, and the math from number of users -> QPS seems a bit tricky.
Finally, how can I tell if it's my server lagging or jmeter lagging? I'm wondering if I can test with some simple static webpage first to confirm jmeter's behavior, and then test my actual server. Any recommendations for a simple static page that can take a high amount of QPS?
Apologies if that's too many questions, but feel free to ask for more details or only answer the primary "how to tell if I'm hitting limits" question.
JMeter doesn't have many "limits", at least they're too high to worry about, you can kick off as many as 2,147,483,647 threads given the underlying hardware/OS allows it and JMeter is properly configured
The easiest solution is switching to Distributed Mode of JMeter execution, i.e. if you're "hitting the limits of JMeter" when you add another instance of JMeter as an extra load generator the throughput should go up given the server is capable of handling the load.
Another option is first of all making sure that you're following JMeter Best Practices and setting up monitoring of baseline resources usage like CPU, RAM, Network, Disk, etc. on the machine where JMeter is running, if any of monitored metrics exceeds i.e. 90% of maximum available capacity - you're "hitting the limits" of the machine where JMeter is running.
I was able to confirm my setup worked by checking the Aggregate Report's throughput metric reached my desired QPS. Initially, when I did not reach the desired throughput and was testing against my server, I was not able to confirm whether the problem was my server or my load testing setup.
To confirm the load testing worked, I swapped the load test to hit a very simple 'hello world' service with an excess of resources. Here, the desired throughput was met.
For reference on actual setup, I ran jmeter on a 5.2xlarge EC2 instance which had 8vCPUs, up to 10Gbps network bandwidth, and 16GiB of RAM and reached 1K QPS. I have yet to see how much further I can push this particular setup.
I have User Registration, Flight Search, Book Tickets modules in my application. I have created my JMeter test & I have different thread groups for each module in my test. I verified & it works well.
Thread Group 1: XX number of users - access the site - click on regression , enter the details & register. (bold -> loop- happening again and again)
Thread Group 2: XX number of users - access the site - login, - search for flights - (bold -> loop - happening again and again)
Thread Group 3: XX number of users - access the site - login, book ticket - (bold -> loop - happening again and again)
Issue:
My manager says we need to run all modules (all thread groups) together with appropriate users as that is how It is going to be in Production. Even though i can run them all together, - in case of issues - i would not know which feature of the application caused the issue.
My aim is to run each module separately & find its performance. I think that doing the module wise would be the correct approach to get the response time, resource utilization etc.
Clarify:
I do not have much experience in performance testing. What is the correct approach / How do you do your test for your application?
If i have to find server's optimal load (at which it performs better) - what should my approach be?
Intentionally tagging loadrunner as this question is not specific to JMeter & it is generic.
If your goal is to represent human behavior to assess the risk of deployment then testing each business process atomically will not accomplish your goal.
You appear to be engaging in a process that is more appropriately termed performance unit testing. This is very common with developers (as differentiated from performance testers) who seek to qualify the performance of an individual business process across some number of users. These are also typically classified with non-normal think times (often eliminated altogether), small data sets, smaller than useful test environments and extremely short test durations, such as 5-15 minutes.
You can mark this business scenario as transactions it means the HTTP requests for each module will be grouped for ex- Login requests in one group or transaction , Search flight as one group or transaction and similarly for Book tickets etc. By following this you will be testing it in a integrated manner and it would be a production like scenario too. After your run due to grouping you can easily find out which group of request taking more time either search, book tickets etc... In this way you will get the accurate performance statistics and you will achieve the production like scenario too
The approach really depends on what your goal for the testing exercise is. If you're looking to optimize or profile a particular module, it makes sense to test it in isolation.
However, if you're trying to check if your server scales, or if you have enough capacity, you should test all your modules at once, at or above your expected load levels.
A counter example to your isolated approach:
Say you have to modules A and B. They are both CPU intensive and take up 80% CPU when you run them. You first tested A, it used 80% CPU, you had 20% to spare and it performed fine. Now you test B alone, same result.
Now you go to production and users try to use both A and B modules, both are trying to use 80% CPU and suddenly you don't have sufficient CPU and your performance suffers.
I know this is kind of late but still.
I do not have much experience in performance testing. What is the
correct approach / How do you do your test for your application?
As James mentioned, the approach to conducting a performance test in a normal scenario would be to run all the critical business flows at the same time and not in an isolated fashion.
In order to identify issues, we would group the requests under transactions and name the business flows appropriately. This will help in identifying which requests have failed and which feature/portion of the application is at fault.
Running them individually will not provide you more insights simply because, a load testing tool will only be able to confirm the presence of a bottleneck but not the root cause irrespective of the number of business flows involved.
If i have to find server's optimal load (at which it performs better)
- what should my approach be?
In order to identify the optimal load for the server, it is mandatory to run all the scripts together as the end users are going to access the application (all critical scenarios) concurrently and not in a modularized manner.
I want to do load testing for 10 million users for my site. The site is a Java based web-app. My approach is to create a Jmeter test plan for all the links and then take a report for the 10 million users. Then use jvisualVM to do profiling and check if there are any bottlenecks.
Is there any better way to do this? Is there any existing demo for doing this? I am doing this for the first time, so any assistance will be very helpful.
You are on the correct path, but your load limit is of with a high factor.
Why I'm saying this is cause your site probably will need more machine to handle 10Milj Concurrent users. A process alone would probably struggle to handle concurrent 32K TCP-streams. Also do some math of the bandwidth it would take to actually handle 10Milj users.
Now I do not know what kind of service you thinking of providing on your site, but when thinking of that JVisualVM slows down processing by a factor 10 (or more for method tracing), you would not actually measure the "real world" if you got JMeter and JVisualVM to work at the same time.
JVisualVM is more useful when you run on lower loads.
To create a good measurement first make sure your have a good baseline.
Make a test with 10 concurrent users, connect up JVisuamVM and let it run for a while, not down all interesting values.
After you have your baseline, then you can start adding more load.
Add 10times the load (ea: 100 users), look at the changes in JVisualVM. Continue this until it becomes obvious that JVisualVM slows you down, for every time to add extra load, make sure you have written down the numbers your are interested in. Plot down the numbers in a graph.
Now... Interpolate the graph (by hand) for the number of users you want. This works for memory usage, disc access etc, but not for used CPU time, cause JVisualVM will eat CPU and give you invalid numbers on that (especially if you have method tracing turned on).
If you really want to go as high as 10Milj users, I would not trust JMeter either, I would write a little test program of my own that performs the test you want. This would be okey, since the the setting up the site to handle 10Milj will also take time, so spending a little extra time of the test tools are not a waste.
Just because you have 10 million users in the database, doesn't mean that you need to load test using that many users. Think about it - is your site really going to have 10 million simultaneous users? For web applications, a ratio of 1:100 registered users is common i.e. you are unlikely to have more than 100K users at any moment.
Can JMeter handle that kind of load? I doubt it. Please try faban instead. It is very light-weight and can support thousands of users on a single VM. You also have much better flexibility in creating your workload and can also automate monitoring of your entire test infrastructure.
Now to the analysis part. You didn't say what server you were using. Any Java appserver will provide sufficient monitoring support. Commercial servers provide nice GUI tools while Tomcat provides extensive monitoring via JMX. You may want to start here before getting down to the JVM level.
For the JVM, you really don't want to use VisualVM while running such a large performance test. Besides to support such a load, I assume you are using multiple appserver/JVM instances. The major performance issue is usually GC, so use the JVM options to collect and log GC information. You will have to post-process the data.
This is a non-trivial exercise - good luck!
There are two types of load testing - bottleneck identification and throughput. The question leads me to believe this is about bottlenecks, so number of users is a something of a red herring, instead the goal being for a given configuration finding areas that can be improved to increase concurrency.
Application bottlenecks usually fall into three categories: database, memory leak, or slow algorithm. Finding them involves putting the application in question under stress (i.e. load) for an extended period of time - at least an hour, perhaps up to several days. Jmeter is a good tool for this purpose. One of the things to consider is running the same test with cookie handling enabled (i.e. Jmeter retains cookies and sends with each subsequent request) and disabled - sometimes you get very different results and this is important because the latter is effectively a simulation of what some crawlers do to your site. Details for bottleneck detection follow:
Database
Tables without indices or SQL statements involving multiple joins are frequent app bottlenecks. Every database server I've dealt with, MySQL, SQL Server, and Oracle has some way of logging or identifying slow running SQL statements. MySQL has the slow query log, whereas SQL Server has dynamic management views that track the slowest running SQL. Once you've got your hands on the slow statements use explain plan to see what the database engine is trying to do, use any features that suggest indices, and consider other strategies - such as denormalization - if those two options do not solve the bottleneck.
Memory Leak
Turn on verbose garbage collection logging and a JMX monitoring port. Then use jConsole, which provides much better graphs, to observe trends. In particular leaks usually show up as filling the Old Gen or Perm Gen spaces. Leaks are a bottleneck with the JVM spends increasing amounts of time attempting garbage collection unsuccessfully until an OOM Error is thrown.
Perm Gen implies the need to increase the space as a command line parameter to the JVM. While Old Gen implies a leak where you should stop the load test, generate a heap dump, and then use Eclipse Memory Analysis Tool to identify the leak.
Slow Algorithm
This is more difficult to track down. The most frequent offenders are synchronization, inter process communication (e.g. RMI, web services), and disk I/O. Another common issue is code using nested loops (look mom O(n^2) performance!).
Best way I've found to find these issues absent some deeper knowledge is generating stack traces. These will tell what all threads are doing at a given point in time. What you're looking for are BLOCKED threads or several threads all accessing the same code. This usually points at some slowness within the codebase.
I blogged, the way I proceeded with the performance test:
Make sure that the server (hardware can be as per the staging/production requirements) has no other installations that can affect the performance.
For setting up the users in DB, a procedure can be used and can be called as a part of jmeter test plan.
Install jmeter on a separate machine, so that jmeter won't affect the performance.
Create a test plan in jmeter (as shown in the figure 1) for all the uri's, with response checking and timer based requests.
Take the initial benchmark, using jmeter.
Check for the low performance uri's. These are the points to expect for bottlenecks.
Try different options for performance improvement, but focus on only one bottleneck at a time.
Try any one fix from step 6 and then take an benchmark. If there is any improvement commit the changes and repeat from step 5. Otherwise revert and try for any other options from step 6.
The next step would be to use load balancing, hardware scaling, clustering, etc. This may include some physical setup and hardware/software cost. Give the results with the scalability options.
For detailed explanation: http://www.daemonthread.com/2011/06/site-performance-tuning-using-jmeter.html
I started using JMeter plugins.
This allows me to gather application metrics available over JMX to use in my Load Test.
I have some basic questions around understanding fundamentals of Performance testing. I know that under various circumstances we might want to do
- Stress Testing
- Endurance Testing etc.
But my main objective here is to ensure that response time is decent from application under a set of load which is towards a higher end or in least above average load.
My questions are as follows:
When you start to plan your expected response time of application; what do you consider. If thats the first step at all. I mean, I have a web application now. Do I just pull out a figure from air and say "I would expect application to take 3 seconds to respond to each request". and then go about figuring out what my application is lacking to get that response time?
OR is it the other way round, and you start performance test with a given set of hardware and say, lets see what response time I get now, and then look at results and say, well it's 8 seconds right now, I'd like it to be 3 seconds at max, so lets see how we can optimize it to be 3 seconds? But again is 3 seconds out of air? I am sure, scaling up machines only will not get response time up. It'll get response time up only when single machine/server is under load and you start clustering?
Now for one single user I have response time as 3 seconds but as the load increases it goes down exponentially; so where do I draw the line between "I need to optimize code further" (which has it's upper limit) and "I need to scale up my servers" (Which has a limit too)
What are the best free tools to do performance and load testing? I have used Jmeter a bit. But is there anything else, that is good and open source?
If I have to optimize code, I start profiling the specific flows which took lot of time responding to requests?
Basically I'd like to see how one goes about from end to end doing performance testing for their application. Any links or articles would be very helpful.
Thanks.
The Performance Testing Council is your gateway to freely exchange experiences, knowledge, and practice of performance testing.
Also read Microsoft Patterns & Practises for Performance testing. This guide shows you an end-to-end approach for implementing performance testing.
phoenix mentioned the Open Source tools.
First of all you can read
Best Practices for Speeding Up Your Web Site
For tools
Open source performance testing tools
performance: tools
This link and this show an example and method of performance tuning an application when the application does not have any obvious "bottlenecks". It works most intuitively on individual threads. I have no experience using it on web applications, although other people do. I agree that profiling is not easy, but I've always relied on this technique, and I think it is pretty easy / effective.
First of all, design your application properly.
Use a profiler, see where the bottlenecks in your application are, and take them away if possible. MEASURE performance before improving it.
I will try to provide basic step by step guide, which can be used for implementing Performance testing in you project.
1 - Before you start testing you should know amount of physical memory and amount of memory allocated for JVM, or whatever. DB size collect as much metrics as possible for your current environment. Know you environment
2 - Next step would be to identify common DB production size and expected yearly growth. You will want to test how your application will behave after year, two, five etc.,
3 - Automate environment setup, this is will help you a lot in future for regression testing and defect fix validation. So you need to have DB dumps for your tests. With current (baseline), one year, five year volume.
4 - Once you're done if gathering basic information - Think about monitoring your servers under load, maybe you already have some monitoring solution like http://newrelic.com/ this will help you to identify cause of performance degradation (CPU/Mem/Amount of threads etc.,) Some performance testing tools do have built in monitoring systems.
At this you are ready to move with tooling and load selection, there is already provided materials on how to do that so I will skip part with workload selection.
5 - Select tool I think that JMeter + http://blazemeter.com/ is what you need at this point, both do have a lot nice articles and education materials, for your script recording I would recommend to use blazemeters Chrom Extension instead of inbuilt JMeters solution. If you still think that you do lack knowledge on how things are done in JMeter I recommend to get this book - Performance Testing With JMeter 2.9 by Bayo Erinle
6 - Analyze results, review test plan and take corresponding actions.
I'm about to start testing an intranet web application. Specifically, I've to determine the application's performance.
Please could someone suggest formal/informal standards for how I can judge the application's performance.
Use some tool for stress and load testing. If you're using Java take a look at JMeter. It provides different methods to test you application performance. You should focus on:
Response time: How fast your application is running for normal requests. Test some read/write use case
Load test: How your application behaves in high traffic times. The tool will submit several requests (you can configure that properly) during a period of time.
Stress test: Do your application can operate during a long period of time? This test will push your application to the limits
Start with this, if you're interested, there are other kinds of tests.
"Specifically, I have to determine the application's performance...."
This comes full circle to the issue of requirements, the captured expectations of your user community for what is considered reasonable and effective. Requirements have a number of components
General Response time, " Under a load of .... The Site shall have a general response time of less than x, y% of the time..."
Specific Response times, " Under a load of .... Credit Card processing shall take less than z seconds, a% of the time..."
System Capacity items, " Under a load of .... CPU|Network|RAM|DISK shall not exceed n% of capacity.... "
The load profile, which is the mix of the number of users and transactions which will take place under which the specific, objective, measures are collected to determine system performance.
You will notice the the response times and other measures are no absolutes. Taking a page from six sigma manufacturing principals, the cost to move from 1 exception in a million to 1 exception in a billion is extraordinary and the cost to move to zero exceptions is usually a cost not bearable by the average organization. What is considered acceptable response time for a unique application for your organization will likely be entirely different from a highly commoditized offering which is a public internet facing application. For highly competitive solutions response time expectations on the internet are trending towards the 2-3 second range where user abandonment picks up severely. This has dropped over the past decade from 8 seconds, to 4 seconds and now into the 2-3 second range. Some applications, like Facebook, shoot for almost imperceptible response times in the sub one second range for competitive reasons. If you are looking for a hard standard, they just don't exist.
Something that will help your understanding is to read through a couple of industry benchmarks for style, form, function.
TPC-C Database Benchmark Document
SpecWeb2009 Benchmark Design Document
Setting up a solid set of performance tests which represents your needs is a non-trivial matter. You may want to bring in a specialist to handle this phase of your QA efforts.
On your tool selection, make sure you get one that can
Exercise your interface
Report against your requirements
You or your team has the skills to use
You can get training on and will attend with management's blessing
Misfire on any of the four elements above and you as well have purchased the most expensive tool on the market and hired the most expensive firm to deploy it.
Good luck!
To test the front-end then YSlow is great for getting statistics for how long your pages take to load from a user perspective. It breaks down into stats for each specfic HTTP request, the time it took, etc. Get it at http://developer.yahoo.com/yslow/
Firebug, of course, also is essential. You can profile your JS explicitly or in real time by hitting the profile button. Making optimisations where necessary and seeing how long all your functions take to run. This changed the way I measure the performance of my JS code. http://getfirebug.com/js.html
Really the big thing I would think is response time, but other indicators I would look at are processor and memory usage vs. the number of concurrent users/processes. I would also check to see that everything is performing as expected under normal and then peak load. You might encounter scenarios where higher load causes application errors due to various requests stepping on each other.
If you really want to get detailed information you'll want to run different types of load/stress tests. You'll probably want to look at a step load test (a gradual increase of users on system over time) and a spike test (a significant number of users all accessing at the same time where almost no one was accessing it before). I would also run tests against the server right after it's been rebooted to see how that affects the system.
You'll also probably want to look at a concept called HEAT (Hostile Environment Application Testing). Really this shows what happens when some part of the system goes offline. Does the system degrade successfully? This should be a key standard.
My one really big piece of suggestion is to establish what the system is supposed to do before doing the testing. The main reason is accountability. Get people to admit that the system is supposed to do something and then test to see if it holds true. This is key because because people will immediately see the results and that will be the base benchmark for what is acceptable.