Replicate production load on a different environment - performance

Is there any tool with which I can record all the requests/traffic that currently hits my production website and then replay that load on a different environment to check the performance of the new environment?
Basically, I want to be able to test the performance of my application on the AWS cloud, and determine what configuration is required to handle the current production load if it is migrated to AWS.

You could use JMeter's Access Log Sampler (see also "Access log replay for load testing?" and "JMeter Pitfalls and Competitors").
This would allow you to take the logs from your production server and replay the traffic against your new server. I'm not sure about it replicating the exact load profile - real traffic tends to be spread over the day, with peaks and troughs in visits depending on your time zone and your users - and it also doesn't deal with POST requests.
In fact, for any web app that isn't just about retrieving web pages, replaying historical traffic is likely to be problematic. If users have to log in, for instance, you need to know their passwords; if they browse a product catalogue in an ecommerce site, you need the right data to reflect the catalogue as it was when you recorded the log file.
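To make the replay idea concrete, here is a bare-bones sketch (not the Access Log Sampler itself) that re-issues the GET requests found in a combined-format access log against a new host. The target https://new-env.example.com is a placeholder, and POSTs are skipped - which is exactly the limitation described above:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    // Bare-bones replay sketch: re-issue every GET found in an access log
    // against a new host. POSTs and malformed lines are skipped.
    public class LogReplay {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            for (String line : Files.readAllLines(Paths.get(args[0]))) {
                int q = line.indexOf("\"GET ");
                if (q < 0) continue;                  // skip POSTs, malformed lines
                String path = line.substring(q + 5, line.indexOf(' ', q + 5));
                HttpResponse<Void> r = client.send(
                        HttpRequest.newBuilder(
                                URI.create("https://new-env.example.com" + path)).build(),
                        HttpResponse.BodyHandlers.discarding());
                System.out.println(r.statusCode() + " " + path);
            }
        }
    }

Note that this replays requests back to back; a real replay would also need to preserve the original request timing to reproduce the load profile.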
Far more useful, in my view, is to build a performance model based on your current traffic, and to understand the peak number of page requests per second you need to support for each type of page.
For instance, if you know that today you have 10K visitors/hour, and you know the most common user journeys, you can translate those 10K users into "login page requests per second", "product home page requests per second" and "payment page requests per second"; you can then use a tool like JMeter to model those journeys and ramp up the load until you exceed your targets.
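As a rough illustration of turning a log into such a model, the sketch below computes the peak requests per second for each page type from a combined-format access log; the page groupings (/login, /product, /payment) are hypothetical and would need to match your own URLs:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Sketch: derive peak requests/second per page type from an access log.
    public class PeakRateFromLog {
        // e.g. 127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /login HTTP/1.1" 200 512
        private static final Pattern LINE =
                Pattern.compile("\\[([^\\]]+)\\] \"\\w+ (\\S+)");

        public static void main(String[] args) throws IOException {
            // pageType -> (timestamp-to-the-second -> request count)
            Map<String, Map<String, Integer>> counts = new HashMap<>();
            for (String line : Files.readAllLines(Paths.get(args[0]))) {
                Matcher m = LINE.matcher(line);
                if (!m.find()) continue;
                String second = m.group(1);      // full timestamp = one-second bucket
                String path = m.group(2);
                String type = path.startsWith("/login") ? "login"
                            : path.startsWith("/product") ? "product"
                            : path.startsWith("/payment") ? "payment" : "other";
                counts.computeIfAbsent(type, k -> new HashMap<>())
                      .merge(second, 1, Integer::sum);
            }
            counts.forEach((type, perSecond) -> System.out.printf("%s: peak %d req/s%n",
                    type, perSecond.values().stream()
                                   .mapToInt(Integer::intValue).max().orElse(0)));
        }
    }

The peak rates this produces become the targets you ramp JMeter towards.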

Related

Measure average web app response time from the client side during a long period of time

My company has over a hundred users of a specific CRM web application, which is provided to us as a service by another company.
The users of this application are very dissatisfied with its average response time, and I need to find a way to gather metrics over a certain period of time (let's say .. a week) to prove to the service provider that they are really providing a bad service.
If the application were mine, I would get the metrics from New Relic or some other equivalent monitoring service, but since it is not, I'm looking for something that could do some sort of client side monitoring.
I have already checked Page Speed from Google and YSlow from Yahoo, but both are only useful when you want to test the application for a few seconds at a time. They are not meant for the long-term monitoring I need.
Would anybody know a way to get this kind of monitoring from a client side perspective?
LoadRunner is free of charge for up to 50 users, but what you really need is not a test tool but a synthetic user monitor which runs every n minutes and pulls the stats. You can build one yourself using LoadRunner 12, JMeter, or any other HTTP sampling technology. You could also use a service like Gomez for sampling, or mPulse from SOASTA for tracking every page component across all users.
Keep in mind that your browser's developer tools will time all of the components of the request to give you page times, as will Dynatrace for the web client.
If you have access to the web server, then consider configuring the web server logs to capture the W3C time-taken field, which will track every request. Depending on the server, the granularity can be down to the millionth of a second (a microsecond) on each and every request.
You could also look at a service like LiteSquare which can process those web logs and provide ammunition for changes to the server to improve performance on a no-gain, no-charge model.
One (expensive) solution would be to use LoadRunner's endurance test feature.
Another tool is Oracle OATS.
JMeter is a free tool, though I'm not sure whether it's reliable enough to run for a whole week.
These are load generation tools, so if you are testing as a single client, you should carefully choose your load amount (e.g. one user).
Last but not least, you could create your own webservice client, and create a cron job to run it on your specified time of day and log the access time.
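A minimal sketch of such a client, assuming a plain HTTP GET against a placeholder URL is representative of the transaction and that millisecond timing is enough:

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Sketch of a synthetic monitor: fetch one page every 5 minutes and log
    // timestamp, HTTP status and elapsed milliseconds as CSV on stdout.
    // https://crm.example.com/login is a placeholder URL.
    public class ResponseTimeMonitor {
        public static void main(String[] args) {
            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(() -> {
                long start = System.currentTimeMillis();
                int status = -1;
                try {
                    HttpURLConnection conn = (HttpURLConnection)
                            new URL("https://crm.example.com/login").openConnection();
                    conn.setConnectTimeout(10_000);
                    conn.setReadTimeout(30_000);
                    status = conn.getResponseCode();
                    try (InputStream in = conn.getInputStream()) {
                        while (in.read() != -1) { /* drain body to time the full download */ }
                    }
                } catch (IOException e) {
                    // a timeout or connection failure is itself a data point
                }
                System.out.printf("%d,%d,%d%n", start, status,
                        System.currentTimeMillis() - start);
            }, 0, 5, TimeUnit.MINUTES);
        }
    }

Leave it running for the week (or trigger a single-shot variant from cron); the CSV output can then be averaged and graphed as evidence.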
If what you want is to get data from their server, this is impossible ... without hacking into it. All you can do is monitor the website as a client, using some of the tools above, make a report and present that to them. But even then, they could challenge your bandwidth, your test method, etc.
I recommend that you negotiate with them to share their logs and to prove that their system can support a certain amount of load. If you are a customer of theirs, you can file a complaint or evaluate alternative offers.
Dynatrace was already mentioned in combination with load testing. Since you said you want to monitor your live system, I want to bring Dynatrace up again: most of the time it is used for live-system monitoring, to understand what end users are actually doing. It is also available as a 30-day trial - so there is no need to buy it - use it for your sanity check: http://bit.ly/dttrial

Performance Testing Methodology

I'm looking for a "concrete" methodology to individuate performance bottleneck of a service provided through a web application. I'm looking for an holistic approach that includes testing of computer network, database and web applications.
Suppose that you are in front of a web application that allows you to download pdf files once logged in your company network.
You access to the application with a browser.
The end user requirement is that the web application must allows to download pdf files (with size up to 5MB) in no more than 1 minute.
Some technical details:
- The application consists of a database, a document management system (e.g., Alfresco) and pieces of Java code.
- A user authenticates by providing a username and password to the application; the application in turn sends them to the LDAP server (the LDAP server is deployed on another physical server). A Java servlet does this work and additionally queries the DB to determine the role of the user (a user can be an administrator, a reader or a writer).
- An authenticated user accesses a search page; after searching for a document, the file can be downloaded. The search works in this way: the user fills in some fields (e.g., the name of the document), the fields are sent to the document management system, which performs the actual search for the file and returns the results to the application.
When the user clicks the download button, the application retrieves the document from the document management system.
The underlying network should be 1 Gb Ethernet with some routers/bridges and a load balancer; we have broad knowledge of the network topology.
My question is: if there is a performance bottleneck somewhere (in the network or in the web application, e.g., poor coding) that violates the former requirement (1-second download time), how can we discover it? From which element should we start? For instance, by trying to understand network performance first, then the document management system, and finally the whole system (application, network, database)? How should we incrementally increase the number of download requests?
I'm looking for a methodology, I've already read
http://www.agileload.com/performance-testing/performance-testing-methodology/test-methodology
http://msdn.microsoft.com/en-us/library/bb924375.aspx
What performance testing methodology are you using for your webapps?
All of them contain nice suggestions, but I want a more practical methodology with specific reference to the testing of web applications.
Thank you in advance
Is it one minute or one second for a 5MB file? Can you post a diagram of how the various pieces are connected?
There is a way to determine how the network latency and application processing contribute towards the total response time.
It requires instrumenting the browser and the other components that make up the complete system, i.e. writing code in JavaScript, Java, C/C++, Perl, Python, etc. and embedding it into each application component so that the components can report events to a central collector.
If instrumentation cannot easily be added to the components, the alternative is to insert event-collecting proxies between components and have those report events to a central collector. You can determine and factor out the delays introduced by the proxies themselves by running a few tests with and without proxies in the path.
Once the events arrive at the central collector, one can get good visibility into how the response time is made up.
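As a sketch of the reporting side, assuming a collector listening for UDP datagrams at a placeholder address (a real system would batch events and probably use a reliable transport):

    import java.net.DatagramPacket;
    import java.net.DatagramSocket;
    import java.net.InetAddress;
    import java.nio.charset.StandardCharsets;

    // Sketch: each component embeds a reporter like this and emits
    // "component|event|epoch-millis" datagrams to a central collector, which
    // can later reconstruct where the response time was spent per request.
    // collector.example.com:9999 is a placeholder address.
    public class EventReporter {
        private final DatagramSocket socket;
        private final InetAddress collector;
        private final String component;

        public EventReporter(String component) throws Exception {
            this.socket = new DatagramSocket();
            this.collector = InetAddress.getByName("collector.example.com");
            this.component = component;
        }

        public void report(String event) {
            byte[] payload = (component + "|" + event + "|" + System.currentTimeMillis())
                    .getBytes(StandardCharsets.UTF_8);
            try {
                socket.send(new DatagramPacket(payload, payload.length, collector, 9999));
            } catch (Exception e) {
                // instrumentation must never break the component it observes
            }
        }
    }

Each component would then call something like report("download-start") and report("download-end"), letting the collector apportion the total response time across network, application and document management system.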

How do you achieve acceptable performance metrics for your web application?

What analysis do you currently perform to achieve performance metrics that are acceptable? Metrics such as page weight, response time, etc. What are the acceptable metrics that are currently recommended?
This is performance related, so 'it depends' :)
Do you have existing metrics (an existing application) to compare against? Do you have users that are complaining - can you figure out why?
The other major factor will depend on what network sits between the application and the users. On a LAN, page weight probably doesn't matter. On a very slow WAN, page size (especially with respect to TCP windowing) is going to dwarf the impact of server time.
As far as analysis:
1) Server response time, measured by a load test tool on the same network as the app
2) Client response time, as measured by a browser/client on either a real or a simulated network
The workload for 1) follows the 80/20 rule in terms of transaction mix. For 2), I look at some subset of pages for a browser app and run empty cache and full cache cases to simulate new vs returning users.
Use webpagetest.org to get waterfall charts.
Do RUM (Real User Monitoring) using the Google Analytics snippet with Site Speed JavaScript enabled.
Those are the bare minimum to do.

Performance testing application for bottle necks using production data

I have been tasked with looking for a performance testing solution for one of our Java applications running on a Weblogic server. The requirement is to record production requests (both GET and POST including POST data) and then run these requests in a performance test environment with a copy of the production database.
The reasons for using production requests instead of a test script are:
It is a large application with no existing test scripts, so it would be a large amount of work to write scripts covering the entire application.
Some performance issues only appear when users do a number of actions in a particular order.
To test using actual user interaction with the system, not an estimate of how users may interact with it. We all know that users will do things we have not thought of.
I want to be able to fix performance issues and rerun the requests against the fixed code before releasing to production.
I have looked at using JMeter's Access Log Sampler with server access logs; however, the access logs do not contain POST data, and the Access Log Sampler only looks at the request URL, so it cannot simulate users submitting form data.
I have also looked at using the JMeter HTTP Proxy Server; however, this can record the actions of only one user and requires the user to configure their browser to use the proxy. The same limitation exists with Tsung and The Grinder.
I have looked at using Wireshark and tcpreplay, but recording at the packet level is excessive and will not give any useful reports at the request level.
Is there a better way to analyze production performance considering I need to be able to test fixes before releasing to production?
That is going to be a hard ask. I work with Visual Studio Test Edition to load test my applications, and we are only able to "estimate" user activity on the site.
It is possible to look at the logs and gather information on the likelihood of certain paths through your app. You can then look at the production database for the likely values entered in any POST requests. From that you will have to build load tests that approximate the usage patterns of your production site.
With current tools I don't think it is possible to record and play back actual user interaction.
It is possible to alter your web app so that it records and logs every request and POST against session and datetime. This custom logging could then be used to generate load-test requests against a test website. This would mean serious code changes to your existing site and would likely have performance impacts.
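For illustration, a sketch of what that custom logging could look like as a servlet filter (assuming the javax.servlet API; in practice you would also mask sensitive fields such as passwords before writing them anywhere):

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpSession;

    // Sketch: record timestamp, session id, method, URI and form parameters
    // for every request so the stream can later be replayed against a test
    // environment in the original per-session order.
    public class RequestRecordingFilter implements Filter {
        @Override public void init(FilterConfig cfg) {}
        @Override public void destroy() {}

        @Override
        public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
                throws IOException, ServletException {
            HttpServletRequest http = (HttpServletRequest) req;
            HttpSession session = http.getSession(false);
            StringBuilder params = new StringBuilder();
            http.getParameterMap().forEach((k, v) ->
                    params.append(k).append('=').append(String.join(",", v)).append(' '));
            System.out.printf("%d session=%s %s %s %s%n",
                    System.currentTimeMillis(),
                    session != null ? session.getId() : "-",
                    http.getMethod(), http.getRequestURI(), params);
            chain.doFilter(req, res);
        }
    }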
That said, I have worked with web apps that do this level of logging and the ability to analyse the exact series of page posts/requests that caused an error is quite valuable to a developer.
So in summary: It is possible, but I have not heard of any off the shelf tools that do it.
Please check out this whitepaper by Impetus Technologies on this page: http://www.impetus.com/plabs/sandstorm.html
Honestly, I'm not sure the task you're being asked to do is even possible, let alone a good idea. Depending on how complex the application's backend is, and how perfectly you can recreate its state (i.e. all the way down to external SOA services or the time/clock), it may not be possible to make those GET and POST requests reproduce the same behavior.
That said, performance testing against production data is always great, but it usually requires application-specific knowledge of how to stress that data. Simply repeating HTTP GETs and POSTs will almost certainly not yield useful results.
Good luck!
I would suggest the following to get the production requests and simulate the accurate workload:
1) Use Coremetrics: it provides solutions with which you can learn the application's usage patterns. This helps in coming up with an accurate workload model. The model can then be converted into test scripts and executed against a masked copy of the production database, which will give you accurate results about the application's performance.
2) Another option would be to create a small utility using AOP (aspect-oriented programming) so that it can trace all the requests and the corresponding method traces. This helps in identifying the production usage pattern and, in turn, an accurate simulation of the workload. AOP frameworks such as AspectJ can be used (see the sketch below). This would not require any changes to the code; the instrumentation can be done on the fly. The other benefit is that it can be enabled only for a specific time window and then turned off.
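A minimal sketch of such an aspect, assuming AspectJ annotation-style with load-time weaving (the aspectjweaver agent) and using com.example.app as a placeholder for the application's base package:

    import org.aspectj.lang.ProceedingJoinPoint;
    import org.aspectj.lang.annotation.Around;
    import org.aspectj.lang.annotation.Aspect;

    // Sketch: time every public method under the application's base package
    // and log the signature plus elapsed time, without touching application code.
    @Aspect
    public class RequestTraceAspect {

        @Around("execution(public * com.example.app..*(..))")
        public Object trace(ProceedingJoinPoint pjp) throws Throwable {
            long start = System.nanoTime();
            try {
                return pjp.proceed();
            } finally {
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                System.out.printf("%d %s %d ms%n", System.currentTimeMillis(),
                        pjp.getSignature().toShortString(), elapsedMs);
            }
        }
    }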

Performance testing scenarios required

What performance testing scenarios should be considered for a website with heavy traffic? Is there any way to identify the parts of the code that are adversely affecting site performance?
Please provide something like a checklist of generalised scenarios to be tested to ensure proper performance testing.
It would be good to start with a load testing tool like JMeter or PushToTest and run it against your web application. JMeter simulates HTTP traffic and loads the server that way. You can do that, as well as load test the AJAX parts of your application with PushToTest, because it can use Selenium scripts.
If you don't have the resources (computers to run load tests) you can always use a service like BrowserMob to run the scripts against a web accessible server.
It sounds like you need more of a test plan than a suggestion of tools to use. In performance testing, it is best to look at the users of the application -
How many will use the application on a light day? How many will use the app on a heavy day?
What type of users make up your user population?
What transactions will each of these user types perform?
Using this information, you can identify the major transactions and come up with different user levels (e.g. 10, 25, 50, 100) and percentages of user types (30% user A, 50% user B, ...) to test these transactions with. Time each of these transactions for each test you execute and examine how the transaction times change as compared to your user levels.
After gathering some metrics, since you should be able to narrow transactions down to individual pieces of code, you will know where to focus your code improvements. If you still need to narrow things down further, finer-grained tests within each transaction can provide more granular results.
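As a rough sketch of that approach - the transaction URLs and the 30/50/20 user-type mix are placeholders, and a real tool like JMeter adds ramp-up, think time and reporting for you:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.List;
    import java.util.concurrent.*;
    import java.util.concurrent.atomic.LongAdder;

    // Sketch: for each user level, run that many concurrent workers for a fixed
    // window, pick transactions by weight, and report mean latency per level.
    public class SteppedLoadSketch {
        static final List<String> MIX = List.of(
                "login", "login", "login",                       // 30% user A
                "search", "search", "search", "search", "search",// 50% user B
                "download", "download");                         // 20% user C

        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            for (int users : new int[] {10, 25, 50, 100}) {
                ExecutorService pool = Executors.newFixedThreadPool(users);
                LongAdder totalMs = new LongAdder();
                LongAdder count = new LongAdder();
                long deadline = System.currentTimeMillis() + 60_000;  // 1-minute step
                for (int u = 0; u < users; u++) {
                    pool.submit(() -> {
                        while (System.currentTimeMillis() < deadline) {
                            String tx = MIX.get(ThreadLocalRandom.current().nextInt(MIX.size()));
                            long start = System.currentTimeMillis();
                            try {
                                client.send(HttpRequest.newBuilder(
                                        URI.create("https://app.example.com/" + tx)).build(),
                                        HttpResponse.BodyHandlers.discarding());
                            } catch (Exception e) { /* count failures separately in real tests */ }
                            totalMs.add(System.currentTimeMillis() - start);
                            count.increment();
                        }
                    });
                }
                pool.shutdown();
                pool.awaitTermination(2, TimeUnit.MINUTES);
                System.out.printf("%d users: %d transactions, mean %.1f ms%n",
                        users, count.sum(),
                        count.sum() == 0 ? 0.0 : totalMs.doubleValue() / count.sum());
            }
        }
    }

Watching how the mean (and, in a real test, the percentiles) degrades between user levels tells you where capacity runs out.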
Concurrency will kill you here, as you need to test your maximum projected concurrent users (plus wiggle room) hitting the database, website and any other web services simultaneously. It really depends on the technologies you're using, but if you have a large mix of different web technologies, you may want to check out NeoLoad. I've had nothing but success with this web stress tool, and the support is top notch if you need to emulate specific, complicated behavior (such as mocking AMF traffic, or using responses from web pages to dictate request behavior).
If you have a DB layer then this should be the initial focus of your attention, once the system is stable (i.e. no memory leaks or other resource issues). If the DB is not the bottleneck (or not relevant) then you need to correlate CPU/memory/disk IO and network traffic with the increasing load and increasing response times. This gives you an idea of capacity and of the correlation (but not the cause) with resource usage.
To find the cause of a given resource issue you need to establish a Six Sigma-style project where you define the problem and perform root cause analysis in order to pinpoint the piece of code (or resource configuration) that is the bottleneck. Once you have done this a couple of times in your environment, you will notice patterns of workload, resource usage and countermeasures (solutions) that will guide you in your future performance testing 'projects'.
To choose the correct performance scenarios you need to go through the following basic checklist:
- High-priority scenarios from the business logic perspective. For example: login/order transactions, etc.
- The scenarios most used by end users. Here you may need information from monitoring tools like New Relic, etc.
- Search/filtering functionality (if applicable)
- Scenarios which involve different user roles/permissions
A performance test is a comparison test, either with a previous release of the same application or with the existing players in the market.
Case 1 - Existing application
1) Carry out the test for the same scenarios as covered before, to get a clear picture of the application's response before and after the upgrade.
2) If you need to dig deeper, you can go back to the database team to understand which functionalities get the most requests. Also ask them for the total number of requests on an average day, so that you can decide what user load and test duration to use.
Case 2 - New application
1) Look at existing market players and design your test around the critical functions of the rival product (e.g. Gmail might support many functions, but what is used most often is launch -> login -> compose mail -> inbox -> outbox).
2) You can always go back to your clients and ask what they consider to be business-critical scenarios, or the scenarios that will be used most often.
