MBUnit test matrix optimization: performance problems in automated UI tests

We're currently using MBUnit for both unit testing and UI testing. For UI testing, the setup cost for the test matrix axes is pretty high (login, browser instance, navigating to a page, etc.). To avoid setting these up for each test case, we partly rely on AssemblyFixture to manage some of them.
However, because it's not possible to filter out combinations where a given setup is not applicable, we can't really use that optimization. So currently we do some of the setup per test case, which is horribly inefficient.
We could put if statements inside the test code to check for valid combinations, but we don't want that either: it pollutes the test code.
How do you handle such optimizations, or test matrix management in general? Is there a better practice, perhaps in another testing framework?

Until recently, I've always thought of UI automation as black box testing, where my UI tests drive against a fully standalone web site or application. As a result, the tests run under the constraints of normal execution and are subject to a host of environment overhead issues.
I've recently adopted the notion of "shallow" and "deep" UI tests, where each set of tests runs under an optimized configuration to smooth over environmental differences and speed things up. For example, the login controller is swapped out for a mechanism that avoids OAuth login overhead and is hard coded with fixed usernames. The product catalog skips the database lookup and is hard coded with a few fixed items. The ecommerce backend is swapped out for one that performs speedy operations and accepts/rejects transactions based on credit card and amount.
Under a "shallow" configuration I can perform "deep" testing against the UI logic. When I switch to a "deep" configuration, it resembles production and I can perform "shallow" testing of fully integrated components such as login, product catalog, search, etc.
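A minimal sketch of that kind of swap, in Python for brevity. The class names, the "shallow"/"deep" mode strings and the fixed user are all made up for illustration; the real OAuth wiring is deliberately left out.

```python
from typing import Dict, Protocol


class LoginService(Protocol):
    def login(self, username: str, password: str) -> bool: ...


class FixedUserLogin:
    """Shallow configuration: hard-coded users, no OAuth round trip."""

    def __init__(self, users: Dict[str, str]) -> None:
        self._users = users

    def login(self, username: str, password: str) -> bool:
        return self._users.get(username) == password


class OAuthLogin:
    """Deep configuration placeholder: would call the real identity provider."""

    def login(self, username: str, password: str) -> bool:
        raise NotImplementedError("wire up the real OAuth flow here")


def build_login_service(mode: str) -> LoginService:
    # The mode would normally come from the test environment's settings.
    if mode == "shallow":
        return FixedUserLogin({"alice": "secret"})  # hypothetical fixed user
    if mode == "deep":
        return OAuthLogin()
    raise ValueError(f"unknown test configuration: {mode!r}")
```

The UI tests only ever see the LoginService interface, so the same test code can run against either configuration.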
A mix of testing strategies is required.

Maybe the ui-test-automation-best-practices article is helpful for you. It has some examples of how to improve the performance of automated UI testing by minimizing logins and context changes.
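One framework-agnostic way to minimize logins and context changes is to filter the matrix up front and then group the remaining cases by their expensive axes, so each browser/login session is opened once per group. This is only a sketch in Python, not MbUnit API; the axis values and the applicability rule are assumptions made up for the example.

```python
from itertools import groupby, product

BROWSERS = ["firefox", "chrome"]
USERS = ["admin", "guest"]
PAGES = ["dashboard", "settings"]


def is_applicable(browser, user, page):
    # Hypothetical rule: guests cannot open the settings page.
    return not (user == "guest" and page == "settings")


def plan_runs():
    """Return [((browser, user), [pages]), ...] with inapplicable cases removed."""
    cases = [c for c in product(BROWSERS, USERS, PAGES) if is_applicable(*c)]
    # product() yields cases ordered by browser, then user, so groupby
    # can collect consecutive cases that share the expensive setup.
    plan = []
    for key, group in groupby(cases, key=lambda c: (c[0], c[1])):
        plan.append((key, [page for _, _, page in group]))
    return plan


def run(plan, open_session, visit):
    for (browser, user), pages in plan:
        session = open_session(browser, user)  # expensive: once per group
        for page in pages:
            visit(session, page)  # cheap: once per test case
```

The filtering lives in one place (is_applicable), so the test bodies stay free of if statements.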

Related

How to create Performance testing framework in jmeter?

For functional automation we usually create a reusable framework for automating an application. Is there any way to create a performance testing framework in JMeter, so that we can use the same framework for performance testing of different applications?
Please help if anyone knows, and provide more information about it.
You can consider JMeter as a "framework" which already comes with test elements to build requests via different protocols/transports, applying assertions, generating reports, etc.
It is highly unlikely that you will be able to reuse an existing script for another application: JMeter acts at the protocol level, so different applications need different requests.
JMeter does have a mechanism for reusing pieces of a test plan as modules so you don't have to duplicate them (check out Test Fragments and the Module Controller), but it is more applicable within a single application.
The only "framework-like" approach I can think of is adding your JMeter tests to your continuous integration process, with a build step that executes the performance tests and publishes reports. That way you can reuse the same test setup and reporting routine, and the only thing that changes from application to application is the .jmx test script(s). See JMeter Maven Plugin and/or JMeter Ant Task for more details.
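As a rough sketch of such a reusable build step, here is a small Python helper that builds the same non-GUI JMeter command line for any .jmx file. The flags (-n, -t, -l, -e, -o, -J) are standard JMeter command-line options; the directory layout, the function name and the property names are assumptions for illustration.

```python
from pathlib import Path
from typing import Dict, List, Optional


def jmeter_command(plan: str, results_dir: str,
                   props: Optional[Dict[str, str]] = None) -> List[str]:
    """Build a non-GUI JMeter invocation for one test plan."""
    out = Path(results_dir) / Path(plan).stem  # one output folder per plan
    cmd = [
        "jmeter",
        "-n",                               # non-GUI mode
        "-t", plan,                         # test plan (.jmx)
        "-l", str(out / "results.jtl"),     # raw results log
        "-e", "-o", str(out / "report"),    # generate the HTML report
    ]
    # Per-application settings are passed as JMeter properties.
    for key, value in sorted((props or {}).items()):
        cmd.append(f"-J{key}={value}")
    return cmd
```

A CI step could then run something like subprocess.run(jmeter_command("plans/shop.jmx", "out", {"users": "50"}), check=True), with only the plan path and properties differing per application.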
You must first ask yourself how dynamic the conversation you are attempting to replicate is. If you have a very stable services API where the exposed external interface is static, but the code that handles it on the back end is changing, then you have a good shot at building something which has a long life.
But if you are like the majority of web sites in the universe, then you are dealing with developers who are always changing something: adding a resource, adding or deleting form values (hidden or not), headers, etc. In this case you should consider your scripts perishable, with a limited life, and expect to rebuild them at some point.
Having noted the limited lifetime of a piece of code that tests a piece of code with a limited lifetime, are there some techniques you can use to insulate yourself? Yes. The rule of thumb is that the higher up the stack you go to build your test scripts, the more insulated you are from changes under the covers (assuming the layer you build on is stable). The trade-off is that with more of the intelligence under the covers of your test interface, the resource cost of any individual virtual user goes up, which dictates more hosts for test execution and more skew from client-side code, which can distort the view of what is coming from the server. For example, run a Selenium script instead of a base JMeter script: a browser is invoked, you get the benefit of all the local JavaScript processing to handle the dynamic changes, and your script has a longer life.

What is the difference between UI/GUI testing, functional testing and E2E testing?

I would say that all three are the same, but I wonder if there are small differences between them. In the end, what I think is that you are testing user scenarios in all of them.
UI testing: user interface testing. In other words, you have to make sure that all buttons, fields, labels and other elements on the screen work as assumed in a specification.
GUI testing: graphical user interface. You have to make sure that all elements on the screen work as mentioned in a specification and also color, font, element size and other similar stuff match design.
Functional testing: the process of quality assurance of a product that assumes the testing of the functions/functionalities of component or system in general, according to specification requirements.
E2E testing: used to identify system dependencies and to ensure that the right information is passed through multiple components and systems.
Please make yourself familiar with Hermetic Testing.
You have two ways to access systems in your test:
You have a local service, for example an in-memory database instead of the real database.
You mock the system.
For me, UI tests work like this: all tests use local resources. They are hermetic.
But end-to-end tests involve other systems. Example: your SUT (system under test) creates an email. You want to be sure that this email gets sent to a server and later arrives in an inbox. For me this contradicts "separation of concerns" because it mixes two distinct topics. First: your application creates an email and sends it to a server. This could be handled with a mocked mail server. But end-to-end tests mix it with a second concern: you want the mail server to be alive and to receive and forward mails correctly. This is not software testing, this is monitoring.
My advice: Do hermetic UI-Testing of code and do check/monitor your production system. But don't mix both concepts. I think for small environments end-to-end-tests are not needed.
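To illustrate the mocked mail server idea, here is a minimal hermetic test sketched in Python. FakeMailServer and register_user are hypothetical stand-ins, not part of any real library; the point is only that the test checks what the application hands to the mail server, not whether a real server delivers it.

```python
from typing import List, Tuple


class FakeMailServer:
    """In-process stand-in for the mail server; records instead of delivering."""

    def __init__(self) -> None:
        self.outbox: List[Tuple[str, str]] = []

    def send(self, to: str, subject: str) -> None:
        self.outbox.append((to, subject))


def register_user(email: str, mailer) -> None:
    # Hypothetical system under test: registering sends a welcome mail.
    mailer.send(email, "Welcome")


def test_registration_sends_welcome_mail() -> None:
    mailer = FakeMailServer()
    register_user("new.user@example.com", mailer)
    # First concern only: the application created and handed over the mail.
    assert mailer.outbox == [("new.user@example.com", "Welcome")]
```

Whether the production mail server is alive and forwarding correctly would then be covered by monitoring, outside the test suite.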
I don't think that functional testing is the same as UI/GUI testing at all. Consider a mechanical domain, or any other domain that is not software. For me, functional testing tests the function: e.g. if you press the hard button of your microwave, it should start working. Now if, instead of buttons, your microwave has a touch screen and an OS to manage the screen, and you tap the soft button, this soft button should drive the hard button so that the microwave functions. So for me, functional testing means testing the microwave using the hard button, while UI testing means testing the microwave using the soft button; and since the soft button drives the hard button, by testing the UI you ALSO do functional testing.
Does it make sense to you?

Why is caching usually disabled in test environments?

On our applications we have a lot of functional tests through selenium.
We understand that it is good practice to have the server where the tests are run as similar as possible to the production servers, and we try to follow that as much as possible.
But that is very hard to achieve 100%, so we have a different settings file for our server with some changes that we want in the staging environment (for example, we opt to turn e-mail sending off because of the additional architecture it requires).
In fact, lots of server frameworks recommend having an isolated front controller (environment) for testing, to make these small changes easy.
By default, most frameworks such as ours recommend that their testing environment should have its cache turned off. WHY?
If we want to emulate production as much as possible, what's the possible advantage of having the server's cache turned off when performing functional tests? There can be bugs that are only found with the cache on, and having it on might also have the benefit of accelerating our tests execution!
Don't we just need to make sure that the cache is cleared before starting a new batch of functional tests, the same way we clear the cache when deploying a new version to production?
A colleague of mine suggests that the reason could be that the cache can generate false positives: errors that are caused not by badly implemented features (the main target of those tests) but by the cache system itself. But even if those really happen (I suppose it depends on how advanced the cache usage is), why would they be false positives?
To best answer this question I will clarify some points.
(be aware that this is based on my experience)
Integration tests using the browser are typically "black box tests", which means that they are written without knowledge of the code. That is, without knowing whether the cache is being used or not.
These tests are usually designed around certain tasks that are performed during normal use of the system. But these tasks are chosen for automation depending on certain conditions of use (mainly reusability and criticality/importance, but also the cost of implementation). So most of the time we will not need or want to test caching behaviour.
By convention, any test should be created with a single purpose and have as few dependencies as possible. Why?
When the test fails, we can quickly find the source of the failure.
Smaller tests are easier to extend, fix, remove...
We do not spend too much time first debugging the test code and then debugging the system code.
Integration testing should follow this convention.
Answering the question:
If we want to check a particular task, we must isolate it as much as possible.
For example, if we want to verify that the user logs in correctly, we have to delete the cookies to be sure that they do not influence the result (because they may). If, on the other hand, we want to test the cookies, we have to somehow use an environment where they are not deleted.
So, in short:
If there is a need to test the caching behaviour, then we need to create an "isolated" environment where this is possible.
The usual purpose of integration tests is to test functionality, so the framework default is to have the cache disabled.
This does not mean that we shouldn't create our own environment to test the caching behaviour.
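A toy illustration of the two environments, sketched in Python. PriceService and its dict-based cache are invented for the example and stand in for whatever caching layer the framework provides.

```python
class PriceService:
    def __init__(self, backend, cache_enabled: bool) -> None:
        self._backend = backend
        self._cache_enabled = cache_enabled
        self._cache = {}

    def price(self, item: str) -> int:
        if self._cache_enabled and item in self._cache:
            return self._cache[item]
        value = self._backend[item]
        if self._cache_enabled:
            self._cache[item] = value
        return value


def test_functional_price_update_is_visible():
    # Default test environment: cache off, only the feature is under test.
    backend = {"book": 10}
    service = PriceService(backend, cache_enabled=False)
    assert service.price("book") == 10
    backend["book"] = 12
    assert service.price("book") == 12


def test_cache_serves_stored_value():
    # Separate environment: cache on, caching behaviour is the test target.
    backend = {"book": 10}
    service = PriceService(backend, cache_enabled=True)
    assert service.price("book") == 10
    backend["book"] = 12
    assert service.price("book") == 10  # stale until invalidated
```

With the cache on, the first test would fail for a reason unrelated to the feature it checks, which is the kind of failure the default setting avoids.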

Performance Testing Secured Web Site

How is the community handling performance testing of their secured web areas? We don't really have a public-facing web site, so users have to be logged in to be able to view data / access the system. To further complicate matters, we cannot allow users to be logged in multiple times: if you attempt to log in a second time, your first session is invalidated. We could turn this feature off (as well as second-level caching), but then we are testing a system which is inherently different from production.
What methodologies should we look into to stress test our application?
Our developers are pretty proficient with Java and Python.
Good question.
Normally we'd use something like Selenium to automate a web-browser talking to the web application itself. This is a system-level approach, and has several advantages:
You are measuring the performance of the client browser too
You can see (to some extent) if the site performs better or worse in different browsers
It is compatible with techniques which do not lend themselves to "raw" web driver programs like ApacheBench
Of course it can take a large amount of work to create automated tests which are representative of real users' actions.
Normally you'd have some special test-system with known hardware (ideally similar to production) and a database which includes certain objects which the test suite expects to find. You could also load a production-size (or bigger) simulated data set into this system.
If you used (for example) Selenium to automate functional tests, the functional tests could be reused to build a performance-test suite. That's what we did before.

Performance testing scenarios required

What can be the various performance testing scenarios to be considered for a website with huge traffic? Is there any way to identify the elements of the code which are adversely affecting the site performance?
Please provide something like a checklist of generalised scenarios to be tested to ensure proper performance testing.
It would be good to start with some load testing tools like JMeter or PushToTest and start running it against your web application. JMeter simulates HTTP traffic and loads the server that way. You can do that as well as load test AJAX parts of your application with PushToTest because it can use Selenium Scripts.
If you don't have the resources (computers to run load tests) you can always use a service like BrowserMob to run the scripts against a web accessible server.
It sounds like you need more of a test plan than a suggestion of tools to use. In performance testing, it is best to look at the users of the application:
How many will use the application on a light day? How many will use the app on a heavy day?
What type of users make up your user population?
What transactions will each of these user types perform?
Using this information, you can identify the major transactions and come up with different user levels (e.g. 10, 25, 50, 100) and percentages of user types (30% user A, 50% user B, ...) to test these transactions with. Time each of these transactions for each test you execute and examine how the transaction times change as compared to your user levels.
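As a small worked sketch of turning user levels and percentages into per-type virtual user counts, assuming Python. The 30%/50% split for types A and B comes from the example above; the third type C at 20% is a made-up placeholder so the percentages add up to 100.

```python
def allocate_users(total: int, mix: dict) -> dict:
    """Split `total` virtual users across user types by integer percentages."""
    if sum(mix.values()) != 100:
        raise ValueError("percentages must add up to 100")
    # divmod gives (whole users, remainder in hundredths) per user type.
    shares = {name: divmod(total * pct, 100) for name, pct in mix.items()}
    counts = {name: whole for name, (whole, _) in shares.items()}
    # Hand out the users lost to rounding, largest remainder first.
    leftover = total - sum(counts.values())
    by_remainder = sorted(shares, key=lambda name: shares[name][1], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Hypothetical mix: C is a placeholder, not taken from the text above.
MIX = {"A": 30, "B": 50, "C": 20}

for level in (10, 25, 50, 100):
    print(level, allocate_users(level, MIX))
```

Each level then becomes one test run, and the transaction timings from those runs are what you compare across levels.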
After gathering some metrics, since you should be able to narrow transactions to individual pieces of code, you will be able to know where to focus your code improvements. If you still need to narrow things down further, finer tests within each transaction can be created to provide more granular results.
Concurrency will kill you here, as you need to test your maximum projected concurrent users plus some wiggle room hitting the database, website, and any other web service simultaneously. It really depends on the technologies you're using, but if you have a large interaction of different web technologies, you may want to check out Neoload. I've had nothing but success with this web stress tool, and the support is top notch if you need to emulate specific, complicated behavior (such as mocking AMF traffic, or using responses from web pages to dictate request behavior).
If you have a DB layer then this should be the initial focus of your attention, once the system is stable (i.e. no memory leaks or other resource issues). If the DB is not the bottleneck (or not relevant) then you need to correlate CPU/memory/disk IO and network traffic with the increasing load and increasing response times. This gives you an idea of capacity and of correlation (but not cause) with resource usage.
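A minimal sketch of that correlation step in Python, using a plain Pearson coefficient. The sample measurements are made-up numbers purely to show the call; in practice they would come from your monitoring during each load level.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Made-up sample measurements, one entry per load level.
users = [10, 25, 50, 100]
cpu_percent = [12, 30, 55, 95]
response_ms = [180, 190, 260, 640]

print("users vs cpu:", pearson(users, cpu_percent))
print("cpu vs response time:", pearson(cpu_percent, response_ms))
```

A coefficient near 1 only says the two series move together under load; it does not say which resource causes the slowdown.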
To find the cause of a given resource issue you need to set up a Six Sigma style project where you define the problem and perform root cause analysis in order to pinpoint the piece of code (or resource configuration) that is the bottleneck. Once you have done this a couple of times in your environment, you will notice patterns of workload, resource usage and countermeasures (solutions) that will guide you in your future performance testing 'projects'.
To choose the right performance scenarios, go through this basic checklist:
High-priority scenarios from the business logic perspective, for example login/order transactions.
The scenarios most used by end users. Here you may need information from monitoring tools like NewRelic.
Search / filtering functionality (if applicable).
Scenarios which involve different user roles/permissions.
A performance test is a comparison test, either against the previous release of the same application or against existing players in the market.
Case 1 - Existing application
1) Carry out the test for the same scenarios as covered before, to get a clear picture of how the application responds before and after the upgrade.
2) If you need to dig deeper, go back to the database team to understand which functionalities are getting the most requests. Also ask them for the average total number of requests on a given day, so that you can decide what user load and test duration to use.
Case 2 - New application
1) Look at existing market players and design your test around the critical functions of the rival product (e.g. Gmail might support many functions, but what is used often is launch -> login -> compose mail -> inbox -> outbox).
2) At any time you can go back to your clients and ask what they consider to be business-critical scenarios or scenarios that will be used most often.
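For the existing-application case, the before/after comparison could be sketched like this in Python. The scenario names, the timings and the 10% tolerance are assumptions chosen for illustration.

```python
def compare_releases(before_ms: dict, after_ms: dict,
                     tolerance_pct: float = 10.0) -> dict:
    """Return {scenario: percent slowdown} for scenarios beyond the tolerance."""
    regressions = {}
    for scenario, old in before_ms.items():
        new = after_ms.get(scenario)
        if new is None or old <= 0:
            continue  # scenario not re-run, or no usable baseline
        change = (new - old) / old * 100
        if change > tolerance_pct:
            regressions[scenario] = round(change, 1)
    return regressions


# Hypothetical average response times in milliseconds per scenario.
previous_release = {"login": 200, "search": 400}
new_release = {"login": 260, "search": 410}

print(compare_releases(previous_release, new_release))
```

Running the same scenarios under the same load for both releases is what makes the percentages comparable.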

Resources