Best practice for end to end testing whole systems - tdd

End to end testing means exercising an application from the outer boundaries to verify its behavior. This far I've only done written tests for a single executable artifact. How should I test systems made up of multiple artifacts that is deployed on different hosts?
I see two alternatives.
The tests set up the whole system and exercise it from the very outer edges.
Each artifact is end to end tested in isolation, relying on the test content to enforce the protocol between them.
Is there a clear case for only adhering to one of these, or are one of them preferred, or are they interchangeable? If interchangeable, then what are some advantages and disadvantages between them?

Even though I think it depends on the context, I prefer the first alternative. Here are my random thoughts:
I like my tests to be as closely mapped to use cases as possible (BDD style) (with the disclaimer that I misuse the term use case). These use cases may span several applications and sub-systems.
Example: A back office administrator can view a transaction made by a user from the public interface.
Here, the back office admin interface and the public interface are different applications, but they are included in the same use case.
Mapping these thoughts to your problem where you have sub-systems deployed on different hosts, I would say it depends on how it is used, from the user/actor perspective. Do the use cases span several sub-systems?
Also, perhaps the fact that the system is deployed on several hosts isn't important to the tests. You could replace the inter-process communication with method calls in your tests and have the whole system within the same process during tests, reducing the complexity. Supplement this with tests that only verify the inter-process communication.
Edit:
I realise that I forgot to include why I prefer to test the whole system.
Your asset is features, that is, behaviour, while the code is a liability. Therefore you'd like to test the behaviour, not the code (BDD style).
If you are testing each sub-system separately you are testing the code, not the features. Why? When you divided your system into sub-systems you did so based on some technical reasons. When you learn more you might discover that the chosen seam is sub-optimal and would like to move some responsibility from one sub-system to another. And you would have to modify test and production code at the same time, leaving you without a safety net. That's a typical symptom of testing implementation details.
That said, these kind of tests are too blunt to test everything. So you need to have complementary tests for details as well, where necessary.

Testing each artifact end-to-end separately would be highly desiderable in any case. This will ensure that every artifact is sound.
In addition, you might want to test a composition of artifacts. That would catch problems in the interactions between artifacts. I don't know about your situation, but one thing that is important to have is a test environment that is a copy of production. Testing the system in the test environment is a very good idea. You might also want to test the system in the production environment; this might be feasible or not. For instance, if your system processes credit card payments, you may want to avoid test payments on the production system.
In any case, testing each system separately is imho more important than testing the composition. Once you know that your artifacts are sound in isolation, catching interaction tests will be much easier. If you only have the end-to-end test of the whole system, it's much more difficult to understand where is the error when the tests fail.

Related

Order of Operations for System Testing?

I was taking an exam yesterday, and I noticed they asked in which order the following occur (and I'll put the order I deemed it to be here):
Unit Testing (Always write your unit tests first!)
Integration Testing (After you have some code and it works with other code / systems)
Validation Testing (Keep your data in a consistent state and make sure no bad data is input)
User / Acceptance Testing (It's all about the users otherwise why are we building a system in the first place?)
Is this about right?
Personally I think load-testing or database tuning oughta be in there at the end, but it wasn't on the test.
This question doesn't make a whole lot of sense.
For one thing, different people have different definitions of pretty much every kind of testing you have mentioned. For example, in Extreme Programming (XP) Acceptance Tests (while being derived from User Stories) have nothing to do with User Testing, or User Acceptance Testing (UAT). Using the XP definition, Acceptance Testing refers to automated tests that run on a build agent before code makes it anywhere near a user. User Acceptance Testing (UAT) on the other hand, is typically a manual process that happens after a proposed final version has been created and deployed to a UAT environment.
As pointed out in the comments already, Validation Testing is not a common concept with a widely accepted definition. Integration testing also means different things to different people. To some, it is testing that different processes/applications work together (in a UAT environment, for example). For others, it is simply automated tests that involve more that one class i.e. not Unit Tests.
Also, what do you mean by "order"? Do you mean the order in which the tests are written, or the order in which they are run before releasing code to the wild and/or production environment?
In any case, the question is largely irrelevant in the real world because different processes work for different teams. For example, I myself would always write an Acceptance Test before any Unit Tests. Following a test first approach, you always write a Unit Test before modifying a class, yes? So why wouldn't you write an Acceptance Test before modifying the whole system?
If "Acceptance Testing" means anything close to the XP definition of acceptance testing, then I don't think it makes sense for this to come last.
This sounds like the kind of "exam question" that only makes sense in the context of the course that you took before the exam. Without all that information (particularly the definitions of each kind of testing) it is very difficult to provide a useful answer to this question.
Instead of validation testing, System testing is correct word. And Database testing is a part of integration and system testing. Also Load testing will be performed on the phase of system and user acceptance test.

Why is caching usually disabled in test environments?

On our applications we have a lot of functional tests through selenium.
We understand that it is a good practice to have the server where the tests are ran as similar as possible to the production servers, and we try to follow it as much as possible.
But that is very hard to achieve in 100%, so we have a different settings file for our server for some changes that we want in the staging environment (for example, we opt to turn e-mail sending off because of the additional required architecture).
In fact, lots of server frameworks recommend having an isolated front-controller (environment) for testing to easily achieve this small changes.
By default, most frameworks such as ours recommend that their testing environment should have its cache turned off. WHY?
If we want to emulate production as much as possible, what's the possible advantage of having the server's cache turned off when performing functional tests? There can be bugs that are only found with the cache on, and having it on might also have the benefit of accelerating our tests execution!
Don't we just need to make sure that the cache is cleared before starting a new batch of functional tests, the same way we clear the cache when deploying a new version to production?
A colleague of mine suggests that the reason for this is could be that cache can generate false-positives, errors that are not caused by badly implemented features (that are the main target of those tests), but of the cache system itself... but even if those really happen (I suppose it depends on how advanced is the way the cache is used), why would they be false-positives?
To best answer this question I will clarify some points.
(be aware that this is based on my experience)
Integration tests using the browser are typically "Black Blox Tests" , which means that they are made ​​without knowledge of the code. That is, without knowing whether the cache is being used or not.
These tests are usually designed based on certain tasks that are performed during the normal use of the system. But, these tasks are chosen for automation depending on certain conditions of use (mainly reusability, and criticality/importance but also the cost of implementation). So most of the times, we will not need/wont to test caching behaviour.
By convention, a test (any) must be created with a single purpose and have the less possible dependencies. Why?
When the test fails , we can quickly find the source of the failure.
Smaller tests are easier to extend, fix, remove...
We do not spend too much time, first debugging the test code and then
debugging the system code.
Integration testing should follow this convention.
Answering the question:
If we want to check a particular task, we must isolate it as possible.
For example, if we want to verify that the user correctly logs in, we have to delete the cookies to be sure that they do not influence the result (because they may). If on the other hand, we want to test the cookies we have to somehow use an environment where they are not deleted.
so, in short:
If there is need to test the caching behaviour then we need to create an "isolated" environment where this is possible.
The usual integration tests purpose is to test the functionality, so the framework default value it's to have the cache disabled.
This does not means that we shouldn't create our own environment to test the caching behaviour.

Performance test against Web Service

I am doing performance test against our site. The site exposes many Web services APIs. Our product has several workflows and each complete workflow is composed of calling more than one APIs.
I am wondering which of the following apporoaches is reasonable:
Test against each single API in a standalone way.
Test against each complete workflow which involves several APIs.
I think the latter one make more sense since it mimic the real scenario.
I'd like to hear your comments.
Thanks.
There are (as usual) pro's and cons of each approach. Personally I'd consider doing both, starting with the single API test.
The single API is probably the easiest to build, and it has the benefit of exactly pinpointing out where your performance issues are. It's also useful to so spot regressions in performance during development. If there are unit tests for the application consider using those instead. When there is a performance regression there usually also is a unit test which suddenly became slower.
Once you've done that you will still need to do the more complex tests. Firstly because you need to know if the performance of a certain flow is acceptable, but also because there can be unexpected interactions between different API's. Depending on you application there may be nasty concurrency issues, throughput bottlenecks etc. Make sure you run several flows concurrently, that's what happens in really live and it's only way to find issues related to locking in the database, I/O bottlenecks etc.
But before you start make sure you have a realistic idea of what the performance should be, how many concurrent users there will be and what the hardware requirements are. There is no limit to improving performance, so you have to decide what is good enough or you will never stop optimizing.

design of mid-large sized application when doing TDD? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
I have a good grasp of unit testing, DI, mocks, and all the design principal goodness required to have as close to full code coverage as humanly possible (single responsibility principal, think 'how will i test this' as I code, etc...).
My most recent app, I did not code doing true TDD. I kept unit-testing in mind as I coded, and wrote my tests after writing the code, refactoring, etc.. I did TDD when it was 'easy' to do... however I did not have as good of a grasp as I do now... That was the first project I made full use of DI, mocking frameworks, etc, and the first which had full code coverage - and I learned a lot from it as I went along. I'm itching to get assigned to my next project so I can code it completely doing TDD from scratch.
I know this is a broad question, and I've already ordered TDD by example and XP Unleashed, but I'm hoping for a brief overview of how you all design / write a large application doing TDD.
Do you write the entire application, using nothing but stubbed out code? (e.g., write all the function signatures, interfaces, structures, and write the entire application but without writing any actual implementation)? I could picture it working on small-mid sized, but is this even possible on large applications?
If not, how the heck would you write your first unit test for the highest level function in your system? Lets say for example - on a web service where you have a function called DoSomethingComplicated(param1,...,param6) exposed to the world. Obviously, writing the test first for a simple function like AddNumbers() is trivial - but when the function is at the top of the call stack such as this?
Do you still do design up-front? Obviously you still want to do 'architecture' design - e.g., a flow chart showing IE talking to IIS which talks to a windows service via WCF which talks to the SQL Database... an ERD which shows all your SQL tables and their fields, etc... but what about class design? Interactions between the classes, etc? Do you design this up-front, or just keep writing stub code, refactoring the interactions as you go along, until the whole thing connects and looks like it will work?
Any advice is much appreciated
Do you do design up front?
Of course you do. You've got a big application in front of you. You've got to have some idea of the structure it will have before you start writing tests and code. You don't have to have it all worked out in detail, but you should have some basic idea of the layers, components, and interfaces. For example, if you are working on a web services system, you ought to know what the top level services are, and have a good first approximation of their signatures.
Do you write the entire application using nothing but stubbed out code?
No. You stub things out only if they are really difficult to control in a test. For example, I like to stub out the database, and the UI. I will also stub out third party interfaces. Sometimes I will stub out one of my own components if it vastly increases the test time, or it forces me to create test data that is too complicated. But most of the time I let my tests work on a pretty well integrated system.
I have to say I really dislike the style of testing that relies heavily on mocks and stubs. Don't get me wrong, I think mocks and stubs are very useful for decoupling from things that are hard to test. But I don't like writing things that are hard to test, and so I don't use a lot of mocks and stubs.
How do you write your first unit test for a high level function?
Most high level functions have degenerate behavior. For example, login is a pretty high level function and can be very complicated. But if you try to log in with no user name and no password, the response from the system is going to be pretty simple. Writing that tests will also be very simple. So you start with the degenerate cases. Once you have exhausted them, you move on to the next level of complexity. For example, what if a user tries to log in with a username but no password? Bit by bit you climb the ladder of complexity, never tackling the more complex aspects until the less complex aspects are all passing.
It is remarkable how well this strategy works. You might think that you'd just be climbing around the edges all the time and never getting to the meat; but that's not what happens. Instead you find yourself designing the internal structure of the code based on all the degenerate and exceptional cases. When you finally get around to the primary flow, you find that the structure of the code you are working on has a nice hole of just the right shape to plug the main flow in.
Please don't create your UI first.
UIs are misleading things. They make you focus on the wrong aspects of the system. Instead, imagine that your system must have many different UIs. Some will be web, some will be thick client, some will be pure text. Design your system to work properly irrespective of the UI. Get all the business rules working first, with all tests passing. Then plug the UI in later. I know this flies in the face of a lot of conventional wisdom, but I wouldn't do it any other way.
Please don't design the database first.
Databases are details. Save the details for later. Rather, design your system as though you had no idea what kind of database you were using, Keep any notion of schema, tables, rows, and columns out of the core of the system. Implement your business rules as though all the data were kept in memory all the time. Then add the database later, once you've gotten all the business rules working. Again, I know this flies in the face of some conventional wisdom, but coupling systems to databases too early is a source of a lot of badly warped designs.
Do I write the entire application, using nothing but stubbed out code?
No, not in the slightest sense - that sounds like a very wasteful approach. We must always keep in mind that the underlying reason for doing TDD is rapid feedback. An automated test suite can tell us if we broke anything much faster than a manual test can. If we wait wiring things together until the last moment, we don't get rapid feedback - while we may get rapid feedback from our unit tests, we wouldn't know if the application works as a whole. Unit tests are only one form of test we need to perform to verify the application.
A better approach is to start with the most important feature and work your way in from there, using an outside-in approach. This often means starting with some UI.
The way I do it is by creating the desired UI. Since we normally can't develop UI with TDD, I simply create the View with the technology of choice. No tests there, but I wire up the UI to some API (preferrably using declarative databinding), and that's when the testing begins.
In the beginning, I would then TDD my ViewModels/Presentation Models and corresponding Controllers, possibly hard-coding some responses to see that the UI works. As soon as I have something that doesn't explode when you run it, I check in the code (remember, many small incremental check-ins).
I subsequently work my way vertically down that feature and ensure that this particular piece of UI can go all the way to the data source (or whatever), ignoring all other features.
When the feature is done, I can start on the next feature. The way I picture this process is that I fill out the application by doing one vertical slice at a time until all features are done.
Kick-starting a greenfield app this way always takes extra long time for the first feature since this is where you have to wire up everything, so pick something simple (like the initial View of the app) to keep things as simple as possible. Once the first feature is done, the next ones become much easier because the foundations are now in place.
Do I still design up-front?
Not much, no. I normally have an overall design in mind before I start, and when I work in a team, we sketch this overall architecture on a whiteboard or a slide deck before we start.
This is more or less limited to
The number and names of layers (UI, Presentation Logic, Domain Model, Data Access, etc).
The technologies used (WPF, ASP.NET MVC, SQL Server, .NET 3.5 or whatnot)
How we structure production code and test code, and which test technologies we use
Quality requirements for the code (pair programming, static code analysis, coding standards, etc.)
The rest we figure out as we go, but we use many ad-hoc design sessions at the whiteboard as we go along.
+1 Good question
I truly don't know the answer, but I would start with building blocks of classes that I could test then build into the application, not with the top-level stuff. And yes I would have a rough up-front design of the interfaces, otherwise I think you would find those interfaces changing so often as you refactor that it would be a real hinderance.
TDD By Example won't help I don't think. IIRC it goes through a simple example. I am reading Roy Osherove's The Art of Unit Testing and while it seems to comprehensively cover tools and techniques like mocks and stubs, the example so far seem also pretty simple and I don't see that it tells you how to approach a large project.
Do you write the entire application, using nothing but stubbed out code?
To test our systems we mainly do unit, integration and remote services testing. In unit tests we stub out all long running, time consuming, and external services, i.e. database operations, web services connection or any connection to external services. This is to make sure that our tests are fast, independent and not relying on the response of any external service to provide us quick feedback. We have learnt this the hard way because we do have some tests that do database operations which makes it really slow that goes against the principle "Unit tests must be fast to run"
In integration tests, we test the database operations but still not the web services and external services because that can make the test brittle depending on their availability and we use autotest to run the tests in the background all the while we are coding.
However, to test any kind of remote services, we have tests that connect to the external services, do the operation on them and get the response. What matters to the test is their response and their end state if it is important for the test. The important thing here is, we keep these kind of tests in another directory called remote (that's a convention we created and follow) and these remote tests are only run by our CI (continuous integration) server when we merge any code to the master/trunk branch and push/commit it to the repo so that we know quickly if there has been any changes in those external services that can affect our application.
Do I still design up-front?
Yes but we don't do big design up front basically what uncle Bob (Robert C. Martin) said.
In addition, we get to the whiteboard before immersing ourself into coding and create some Class Collaboration Diagrams just to make it clear and sure that everyone in the team is on the same page and this also helps us to divide the work amongst the team members.

What's the point of testing fake repositories?

I've been trying to push my mentallity when developing at home to be geared more towards TDD and a bit DDD.
One thing I don't understand though is why you would create a fake repository to test against? I haven't really looked into it much but surely the idea of testing is to help decouple your code (giving you more flexability), trim down code needed and bring down the number of bugs.
So can someone fill in my foolish brain as to why some like to test fake repositories? I would have thought testing against a real database is a much better alternative to creating a fake one because then you KNOW that it works against your real world data store.
The fake repository allows you to test just your application code.
The fake repository means an automated test can easily set up a known state in the repository.
The fake repository will be several orders of magnitude faster than a real database.
The fake repository is NOT a substitute for system testing that will include your database.
As I see it there are two really big reasons why you test against faked resources:
It makes unit testing faster when you have a mocked up against slow I/O or database. This may not look like anything if you have a small test suite but when you're up to +500 unit tests it starts to make a difference. In such amount, tests that run against the database will start to take several seconds to do. Programmers are lazy and want things to go fast so if running a test suite takes more than 10 seconds then you won't be happy to do TDD anymore.
It enforces you to think about your code design to make changes easier. Design by contract and dependency injection also becomes so much easier to do if you've made implementation against interfaces or abstract classes. If done right such design makes it easier to comply to changes in your code.
The only drawback is the obvious one:
How can you be sure it really works?
...and that is what integration tests are for.
I upvoted Giraffe's answer, but want to add just a couple of points:
Each developer can use a mock/fake
repository for her/his own unit
testing without interfering with the
tests being done by other developers
on the same project.
Using a local mock/fake repository
reinforces the user of a data
abstraction layer, which is good
design practice.
As an example, I've used something as simple as a HashMap to implement a mock of the data access layer. This makes it extremely easy for each unit test to ensure that exactly the necessary conditions exist for its purpose, and to verify that the right calls were made on the data access layer.

Resources