Importance of baseline testing over SLAs - performance

Why are we comparing performance test results with a baseline if we already have SLAs?
How are they related?
For example:
Transaction response time in the main test is 3 seconds.
The SLA for the same transaction is 5 seconds.
The baseline for that transaction was 2 seconds.
How do we compare these?

If the time is over the SLA, you have a critical production issue that needs to be addressed.
If the time is over the baseline, your server is suffering performance degradation that needs to be analysed, but with lesser urgency.
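A minimal sketch of those two checks, using the 3 s / 5 s / 2 s figures from the example above (the function name, tolerance and messages are placeholders, not part of any particular tool):

def assess_response_time(measured_s, sla_s, baseline_s, tolerance=0.10):
    # Classify a measured transaction time against its SLA and its baseline.
    # The tolerance lets the baseline drift a little (10% here) before we
    # report a degradation.
    findings = []
    if measured_s > sla_s:
        findings.append("CRITICAL: SLA breached - treat as a production issue")
    if measured_s > baseline_s * (1 + tolerance):
        findings.append("WARNING: slower than baseline - investigate degradation")
    return findings or ["OK: within SLA and baseline"]

# The example above: 3 s measured, 5 s SLA, 2 s baseline.
print(assess_response_time(3.0, sla_s=5.0, baseline_s=2.0))
# -> ['WARNING: slower than baseline - investigate degradation']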
Read more in testingperformance:
Any user action where the response time seems higher than expected can be traced, monitored and checked to determine if there are any inefficiencies.
As the workload is increased, the performance tester can watch how the response times of transactions deviate from the baseline.

This is a difficult question to answer - are you the recipient of an SLA (as in your system uses an external system with an SLA) or do you have to guarantee an SLA?
Typically people use "baseline" to mean the application as it is now, running in typical conditions and under typical load.
Typically, a response time SLA includes upper limits on load, or some kind of commercial ladder - guaranteeing a response time for unlimited traffic is often impossible without additional financial resources.
If your first performance test suggests that the actual response time is higher than baseline, it suggests that either you disagree about "typical" conditions, or that you've exceeded those typical conditions, or that the application's performance has deteriorated since the baseline was established. That's important information.
In general terms, response times and load do not have a linear relationship - if response time is 1 second with 100 users, it's usually not 10 seconds with 1000 users. Instead, response time tends to rise very slowly with load until you hit a bottleneck, at which point it rises very steeply.
I typically use performance testing to explore those bottlenecks, so I can decide how they fit with my desired performance characteristics, and work out how to move that bottleneck further away.
It's also worth noting that most systems have multiple bottlenecks, and the slowest element determines the overall performance characteristics. So even if you have an SLA for 5 second transactions in one part of your architecture, there may be other parts that are slower (or reach their bottleneck sooner).
So, why do you compare your load tests with baselines, even if you have an SLA?
Make sure that the baseline is still valid.
Make sure you understand the overall performance characteristics and can exceed the SLA in other parts of the system.
Verify you can meet the SLA.

Related

JMeter performance benchmark criteria

Could anyone share standard JMeter performance benchmark criteria for performance testing with a minimum of 1000 users? In other words, how do we decide which parameters (or thresholds) to use as load/performance testing criteria, with root causes and proposed solutions, after generating the HTML report from a non-GUI run of the completed scenario?
Thanks
Amit G
There is no "standard", the acceptance criteria normally are being dictated by the business requirements which might differ depending on the nature of the application
For real-time trading, a couple of milliseconds of delay is critical; companies invest in locating their high-performance servers physically close to the exchange servers because even the speed of light matters in those scenarios.
For normal applications used by people (i.e. news portals, e-commerce websites, etc.), the accepted load time is 2-3 seconds; if people have to wait longer, most probably they will switch somewhere else and never return. Moreover, search engines tend to rank slow websites lower.
For internal applications used inside the company, response time doesn't really matter because people will have to use this particular application, but you can still report the cost of large response times, like: "if a person who earns $18 per hour has to wait 10 seconds for each operation, the number of operations per day is 100 and the number of personnel is 3000, the organization loses $15,000 every single day, or $5,475,000 a year" (the arithmetic is checked in the sketch below).
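A quick back-of-the-envelope check of the figures in that last example, using the same assumed numbers ($18/hour, 10 seconds per operation, 100 operations per day, 3000 staff):

hourly_wage = 18.0           # dollars per hour
wait_per_operation_s = 10.0  # seconds of waiting per operation
operations_per_day = 100
staff = 3000

wasted_hours_per_person = wait_per_operation_s * operations_per_day / 3600
cost_per_day = wasted_hours_per_person * hourly_wage * staff
print(f"${cost_per_day:,.0f} per day, ${cost_per_day * 365:,.0f} per year")
# -> $15,000 per day, $5,475,000 per year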
So I would recommend taking the following steps:
Check for existing SLAs or NFRs; it might be the case that the document you got these 1000 users from contains some information regarding maximum response time or minimum throughput (requests per unit of time).
If there are no formal acceptance criteria defined, you could go for other performance testing types:
Soak testing - putting your application under prolonged load to see whether it performs consistently and there are no obvious memory leaks.
Stress testing - starting with a low number of users and gradually increasing the load until errors start occurring. This way you will be able to report the correlation between the increasing number of users and the increasing response time/number of errors, and identify the saturation point (the maximum number of users the application can efficiently support), the bottleneck (breaking point), etc. (see the sketch below).
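As a minimal sketch of that last step, here is one way the ramp-up results could be summarised to locate the saturation point. The per-step numbers and thresholds below are invented for illustration; they are not a JMeter output format or a standard.

# Hypothetical per-step results from a stress-test ramp-up:
# (concurrent users, average response time in seconds, error rate)
steps = [
    (100, 0.8, 0.000),
    (250, 0.9, 0.000),
    (500, 1.1, 0.001),
    (750, 1.9, 0.010),
    (1000, 4.5, 0.080),  # latency and errors climbing sharply
]

MAX_RESPONSE_S = 3.0   # assumed acceptance threshold
MAX_ERROR_RATE = 0.01  # assumed acceptance threshold

saturation = None
for users, resp, err in steps:
    if resp > MAX_RESPONSE_S or err > MAX_ERROR_RATE:
        break
    saturation = users

print(f"Highest load meeting both thresholds: {saturation} users")  # 750 in this made-up data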

How to calculate the average case after doing an HTTP benchmark

If I do a benchmark and, for example, find the following:
With 1 concurrent user, the API gives 150 req/s (9,000 req/minute).
With more than 300 concurrent users, the API starts throwing exceptions.
An app makes 1 request every 30 minutes.
Is it correct if I say:
The best case is that the API could handle 30 × 9,000 = 270,000 users; that is, within 30 minutes there would be 270,000 sequential requests, each coming from a different user.
The worst case would be when 300 users post requests at the same time.
And if that's true, is there any way to calculate the average case?
Is it the same as calculating the worst-case and average-case complexity of an algorithm?
One theoretical tool to answer these questions is http://en.wikipedia.org/wiki/Queueing_theory. It says that you are very unlikely to get the level of performance that you are assuming, because the load applied to the system fluctuates, so that there are busy periods and quiet periods. If the system has nothing to do in quiet periods it is forced into idleness that you haven't accounted for. In busy periods, on the other hand, it will typically build up long queues of pending work, until the queues get so long that customers walk away, or the queues become longer than the system can support and it collapses, or both.
Figure 1 on page 3 of http://pages.cs.wisc.edu/~dsmyers/cs547/lecture_12_mm1_queue.pdf shows response time vs applied load for what is probably the most optimistic even vaguely realistic situation. You can see that response time gets very large as you approach maximum load.
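The shape of that curve is easy to reproduce for the simplest M/M/1 queue, whose mean response time is 1/(μ − λ) for service rate μ and arrival rate λ. A short sketch, taking the 150 req/s single-user figure from the question as an assumed service rate:

service_rate = 150.0  # requests per second the server can complete when otherwise idle

for utilisation in (0.1, 0.5, 0.8, 0.9, 0.95, 0.99):
    arrival_rate = utilisation * service_rate
    mean_response = 1.0 / (service_rate - arrival_rate)  # M/M/1 mean time in system
    print(f"load {utilisation:4.0%}: {mean_response * 1000:8.1f} ms")

# Response time stays near 1/150 s (about 6.7 ms) at low load, but is roughly
# 10x worse at 90% load and about 100x worse at 99% load.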
By far the most sensible thing to do is to run tests which apply a realistic load to your application - this is important enough for people to build things like http://jmeter.apache.org/. If you want a rule of thumb I'd say don't plan to stress the system at more than 50% of theoretical capacity as you originally calculated.

What are the parameters to collect after performance testing [closed]

I know this may be a repeat of other questions, but I have started using WebPerformanceTest and LoadTest in my web projects and can run both.
Now, what are the parameters/statistics that I need to share with the Dev team or business team? I can think of these, but it would be great if someone shared the other parameters I might have to consider sharing:
1. No. of users the application can support
2. Response time the application can give under a sustainable load
The following things you can consider sharing:
If SLAs are mentioned by the Dev team or stakeholders and your performance test shows that the web application is not meeting those SLAs, you can share that.
The next question that comes to your mind and theirs is why (try finding out which part/tier is taking the most time or is a bottleneck). This can be done by analysing logs or using a profiler, which will show you costly operations and slow components.
The next question is the job of the performance engineer (how to resolve them and improve the performance of the application). If you know the application very well, try tuning it and get the improvement results after tuning, which should be shared with the Dev team or stakeholders.
The maximum number of users may be confusing if you do not limit response time. For 100 ms requests, 10 simultaneous users mean 100 rps (requests per second), while for 10 s requests, 100 simultaneous users mean only 10 rps.
If you use simple hit-based testing (e.g. testing single-page or specific-request performance) it could be better to use an rps metric instead.
For response time, the mean can be confusing as well, especially in the case of high variance of response times; it's better to provide response times at several percentiles.
I.e. 50% in 50 ms, 75% in 55 ms, 90% in 60 ms, 95% in 70 ms, 99% in 90 ms and 100% in 10 s, with an average time of 150 ms. For some services 150 ms is very good, but about 1% of really slow answers is unacceptable, and you can hardly find that problem using just the mean and median response time.
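A small sketch of why the mean hides that 1% of slow responses; the latencies are synthetic, chosen to resemble the example above:

# 99 fast responses around 60 ms plus one 10-second outlier.
latencies_ms = [60] * 99 + [10_000]

mean = sum(latencies_ms) / len(latencies_ms)
ordered = sorted(latencies_ms)

def percentile(sorted_data, pct):
    # Nearest-rank percentile of an already sorted list.
    index = max(0, int(round(pct / 100 * len(sorted_data))) - 1)
    return sorted_data[index]

print(f"mean  : {mean:.0f} ms")                    # ~159 ms - looks mediocre but survivable
print(f"median: {percentile(ordered, 50)} ms")     # 60 ms
print(f"p95   : {percentile(ordered, 95)} ms")     # 60 ms
print(f"p100  : {percentile(ordered, 100)} ms")    # 10000 ms - the real problem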
Also, in my experience, collecting resource usage stats (cpu, memory, I/O intensity and network usage) is very helpful for determining bottlenecks (i.e. service slow-down due to high I/O because of insufficient amount of memory for caches).
Are you asking the right question?
For me a big part of load and performance testing is deciding what my customer wants to learn about the system being tested. There is an element of "what data can I show the customer?" but that is based on interpreting what they ask for. The customer may not know what to ask, your job as a tester is to understand what the customer wants and provide them with the answers they want.
The two topics you list show how the system appears to its users: when it will break and how fast it responds. There are several variations on those factors based on rate-of-change of user load and on duration of the test.
Other factors include the performance of the various parts of the server computers that are being tested. Visual Studio load tests can collect performance data from other computers while the test runs. So they can monitor the web server(s), database server(s), application server(s) and so on. On each of these servers data about CPU and memory usage, SQL and IIS performance, and many more can be collected. All this data can be compared (most easily via graphs) against user load, error rates and transaction times to determine which parts of the system have plenty of headroom, which are busy and where the bottlenecks occur. Monitoring all this data may also reveal threshold warnings from the various servers, they should be checked against the Microsoft documentation and, perhaps, other sources to determine whether they are adversely affecting system performance and whether they should be investigated in more detail.
These and many other ideas are possible but it all goes back to working out what your customer wants to learn.
The same question was asked on another forum and the above words are almost identical to the answer I posted there.
You can furnish following details to your clients:
Response Time
Hits per Second
Throughput
Connections Per Second
Time to first buffer
Number of Errors
Transactions Graph
CPU, Memory, and Disk Utilization
Network Utilization (if applicable)
Number of database inserts/updates/deletes records
It sounds like you simply have no (or exceedingly poor) requirements and that you don't have great depth in the field of performance testing and engineering. As for what to collect:
Before the test:
Full load profile of business functions that make up the load.
Documentation of each business function. Items to time within each business function.
Expected response times for each of the timed business functions
Pay special attention to think times and iteration pacing
Web logs from the current system so you can objectively measure how many people are on the system at any given time, not how many sessions are alive and have not yet timed out.
Test Environment with some defined match level to the production environment to scale your load appropriately.
In the test
Response times matched to the timing of the business functions on the requirements / user stories
Other enumerated datapoints for requirements (hits, volume returned, etc...)
A measurement of any finite resource in the system under test for bottleneck identification for slow response times. You can start at the top level (CPU, DISK, MEMORY, NETWORK) and work your way down through those stats as you find a resource constriction at the top level.
Post Test:
Executive overview: did you hit the requirements (YES|NO)?
Detailed data: response times, monitor peaks
Analysis: where is the likely bottleneck holding you back?
If you are attempting to represent human behavior then under no circumstances should you eliminate think time. Think time, or the time between requests on an individual session, is baked into the definition of the client-server model, and as you reduce it to zero your test becomes less and less a predictor of what will happen in production.
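A rough illustration of that point, using Little's law (concurrency = throughput × time per iteration) and made-up numbers:

users = 1000            # virtual users in the scenario
response_time_s = 2.0   # assumed time the server spends on each request

for think_time_s in (30.0, 10.0, 0.0):
    requests_per_second = users / (response_time_s + think_time_s)
    print(f"think time {think_time_s:4.0f} s -> ~{requests_per_second:6.1f} req/s")

# With a realistic 30 s think time the 1000 users offer about 31 req/s;
# with think time removed the same script hammers the server at 500 req/s,
# which no longer resembles human behavior.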
Typically, it is based on the benchmark that you want to achieve with the given hardware and environment.
The following are the key parameters:
No. of concurrent users (manual and system threads)
Load of transactional and existing data
Response time (typically page)
Throughput
Utilization of CPU, memory, disk I/O and network bandwidth (applicable where there is an integration with peripheral systems)
Success percentage

Maximum number of concurrent requests a webserver can serve, assuming the average service time is known

Is it logical to say: "If the average service time for a request is X and the affordable waiting time for requests is Y, then the maximum number of concurrent requests to serve would be Y / X"?
I think what I'm asking is whether there are any hidden factors that I'm not taking into account.
If you're talking specifically about webservers, then no, your formula doesn't work, because webservers are designed to handle multiple, simultaneous requests, using forking or threading.
This turns the formula into something far harder to quantify - in my experience, web servers can handle LOTS (i.e. hundreds or thousands) of concurrent requests which consume little or no time, but tend to reduce that concurrency quite dramatically as the requests consume more time.
That means that "average service time" isn't massively useful - it can hide wide variations, and it's actually the outliers that affect you the most.
Broadly yes, but your service provider (the webserver in your case) is capable of handling more than one request in parallel, so you should take that into account. I assume you measured the end-to-end service time and haven't already averaged it by the number of parallel streams. One other thing you didn't, and cannot realistically, measure is the delay to/from your website.
What you are heading towards is the Erlang unit (not the language of the same name), which is used to describe how much load a system can take. Erlangs are unitless (it is just a number) and originated in old-school telephony (POTS), where they were used to describe how many wires were needed to handle X calls per time period with a low blocking probability. Beyond Erlang is Engset, which is used more for high-capacity systems, such as mobile systems.
It also gets used in expensive consultant reports on real-time computer systems and databases to describe the point at which performance degradation is likely to occur. Wikipedia has an article on this, http://en.wikipedia.org/wiki/Erlang_(unit), and the book 'Fixed and Mobile Telecommunications, Network Systems and Services' has a good chapter on performance analysis.
While aimed at telephone systems, just replace the word with webserver and it behaves the same. A webserver is the same concept: load arrives at random intervals at a system with finite parallel capacity. In your case, you can probably calculate the total load with load tools more easily than the parallel capacity, and then back-calculate via the formulas. This is widely done to gain a level of confidence in overall system models.
Erlang/Engset formulas are really useful when you have a randomly arriving load over parallel streams (i.e. web requests) and a service time that can only be averaged or estimated (i.e. it varies in real life). You can then calculate the blocking probability, which is the probability that a new request will need to wait while current requests are serviced, and how long it will wait. It also helps you analyse whether you need to handle more requests in parallel, or make each one faster (#lines and holding time in Erlang speak).
You will probably look into queueing systems analysis next, since as soon as requests block (queue), the models change slightly.
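For reference, here is a sketch of the classic Erlang B blocking-probability calculation in its standard recursive form, with made-up traffic figures; treat the worker counts and rates as assumptions, not recommendations:

def erlang_b(offered_load_erlangs, servers):
    # Probability that a new request is blocked because all parallel
    # "servers" (worker threads, or lines in telephony) are busy.
    # Standard recurrence: B(E, 0) = 1; B(E, m) = E*B(E, m-1) / (m + E*B(E, m-1)).
    b = 1.0
    for m in range(1, servers + 1):
        b = offered_load_erlangs * b / (m + offered_load_erlangs * b)
    return b

# Example: 120 requests/s arriving, each taking 0.2 s on average,
# gives an offered load of 120 * 0.2 = 24 erlangs.
offered = 120 * 0.2
for workers in (24, 30, 40):
    print(f"{workers} workers: blocking probability {erlang_b(offered, workers):.3f}")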
Many factors are not taken into account:
memory limits
data locking constraints such as people wanting to update the same data
application latency
caching mechanisms
different users will have different tasks on the site and generate different loads
That said, one easy way to get a rough estimate is with the Apache ab tool (ApacheBench).
Example, get 1000 times the homepage with 100 requests at a time:
ab -c 100 -n 1000 http://www.example.com/

Which are the most Relevant Performance Parameters / Measures for a Web Application

We are re-implementing (yes, from scratch) a web application which is currently in production. We have decided to start doing some performance tests on the new app, to get some early information about its capabilities.
As the old application is currently in production and performs well, we would like to extract some performance parameters and then use these parameters as a reference or baseline goal for the performance of the new application.
Which do you think are the most relevant performance parameters we should obtain from the current production application?
Thanks!
Determine what pages are used the most.
Measure a latency histogram for the total time it takes to answer the request. Don't just measure the mean, measure a histogram.
From the histogram you can see what percentage of requests have which latency in milliseconds. You can choose key performance indicators by taking the values at 50% and 95%. These will tell you the median latency and the latency that all but the slowest 5% of requests stay below.
Those two numbers alone will bring you great confidence regarding the experience your users will have.
Throughput does not matter for users, but for capacity planning.
I also recommend that you track the performance values over time and review them twice a year.
Just in case you need an HTTP client, there is weighttp, a multi-threaded client written by the guys from Lighttpd.
It has the same syntax used by ApacheBench, but weighttp lets you use several client worker threads (AB is single-threaded so it cannot saturate a modern SMP Web server).
The answer of "usr" is valid, but you can as well record the minnimum, average and maximum latencies (that's useful to see in which range they play). Here is a public-domain C program to automate all this on a given concurrency range.
Disclamer: I am involved in the development of this project.

Resources