So I have some challenging code I would like to refactor. The challenge is that it depends on Database queries, EJB and Java serverFaces. Not simultaneously but close to it.
A good example would be a geocoder. Getting meaningful results depending on multiple queries to the DB depending on the data entered and stored. The code might also reference other helper classes and look them up via the JSF framework.
What are the best strategies for testing this sort of code? Should I try to separate out my code as much as possible? Should I use mocking instead? What has worked for other people?
Well, the short answer is "yes".
You're going to need, first of all, to factor the code sufficiently to construct unit tests at all. What you're describing is excessively complicated to apply the usual unit test methods, and what you would get in any case is more like a higher-level acceptance test.
Now, as far as that factoring goes, you have several possible approaches, and you will probably use them all.
Test the data base queries themselves, using an external script.
Construct an appropriate mock for the components directly accessing the DB, in order to see what happens against known results.
Build unit tests using a JUnit like framework for units of functionality.
Examine the state of the art to see if you can usefully test the output HTML against unit tests.
Related
Recently I wanted to learn TDD by developing a real thing, so I decided to go with simple data packer/unpacker. After design on paper everything looked good, but when I attempted to code it I realised I don't know how to test it, so in TDD - how to do anything.
I have two classes: ArchiveReader and ArchiveWriter. Problem is, when I save something to file with ArchiveWriter I can't test it properly without ArchiveReader, without it I am forced to compare output byte by byte and I think that's not good idea - minor, urrelevant changes can occur later. ArchiveReader tests also needs something to read, so I have to use ArchiveWriter to make test packages.
Is TDD failing in this area? Is there any method to test cases like this?
If you already have the code for the two classes then it's not really TDD, because tests did not drive the design.
You can still test your code, depending on how the dependencies of those classes are handled. For example, if the ArchiveWriter class writes to a stream, you can have it output to a memory stream instead of a file stream, the writer should not care what kind of stream it is, and it will allow you to compare the results of the write method.
Same goes for the ArchiveReader class, if it reads from a stream, then it can be a memory stream.
As for your question about ArchiveWriter and ArchiveReader are mutually dependent on verifying each other, I don't think this is necessarily a problem. While it would be ideal to be able to test both on their own, isolated, there is no rule that a unit-test can only test a single class.
As for these classes, in production, they are likely always to be used in tangent with each other, writing an archive is pointless if you cannot read it again later.
I develop this kind of code, which has two classes or methods that must read and write the same binary format, using TDD all the time.
You mentioned checking the output byte by byte. I prefer the analogous tests of the input code, as although I must hand craft the binary inputs, the behaviour of the input code is generally much easier to check. It will delegate to methods or create objects, which you can check it does correctly in the usual way.
I also have symmetry tests. These use the output code to create the binary representation, then have the input code create a new set of objects from that binary representation. The tests check that the original and new objects are equivalent. These tests are easy to write and produce useful duagnostics when they fail.
Now, some people would say, oh, but you are not doing unit testing, because your symmetry tests test both the output code and the input code; you are doing integration tests and therefore doing it wrong. This should not worry you. The idea that there is a neat division between unit and integration tests, and that everything must be tested by unit tests before having integration tests is wrong. In practice, almost no code is tested in perfect isolation; most tested code uses other classes, even if they are such basics as String and HashMap. It is more useful to view tests as being on a continuum between perfect unit tests and perfect integration tests, to prefer all code to have tests near the unit test end, but not to get too worried if some code does not do so.
TDD is almost about unit testing. Unit testing is about testing of fully isolated independent unit of work. Simply, it is about testing of some logic of some your public method with all dependencies and environment to be faked.
In your example it means that if you want to unit test ArchiveReader or ArchiveWriter, at first you should isolate it from real I/O and test just logic (C#/java/... "your code").
If you are testing your application in a complex way, it is mostly about integration testing. And thus your assertions is needed to be against files your ArchiveWriter produce.
Before going to TDD, I recommend you to read at least one good book about just unit testing. Roy Osherove's book is perfect for me.
I've watched and read a handful of tutorials on PHPUnit and Test Driven Development and have recently begun working with Laravel which extends the PHPUnit Framework with it's TestCase class. All of these things make sense to me, as far as, creating tests as you develop. And I find Laravel's extensions particularly intuitive (especially in regards to testing Controller routes)
However, I've recently been tasked with creating unit tests for a sizable app that's near completion. The app is built in Codeigniter, and it was not built with any tests
I find that I'm not entirely sure where to begin, or what steps to take in order to determine the tests I should create.
Should I be looking to test each controller method? Or do I need to break it down more than that? Admittedly, many of these controller methods are doing more than one task.
It is really difficult to write tests for existing project. I will suggest you to first start with writing tests for classes which are not dependent on other classes. Then you can continue to write tests to classes which coupled with classes for which you wrote tests. You will increase your test coverage step by step by repeating this process.
Also don't forget that some times you will need to refactor your code to make it testable. You should improve design of code for example if your controller methods doing more than one task you should divide this method to sub methods and test each of these methods independently.
I also will suggest you to look at this question
You are in a bit of a tight spot, but here is what I would do in your situation. You need to refactor (ie. change) the existing code so that you end up with three types of functions.
The first type are those that deal with the outside world. By this I mean anything that talks to I/O, or your framework or your operating system or even libraries or code from stable modules. Basically everything that has a dependency on code that you can not, or may not change.
The second group of functions are where you transform or create data structures. The only thing they should know about are the data structures that they receive as parameters and the only way they communicate back is by changing those structures or by creating and populating a new structure.
The third group consists of co-ordinating functions which make the calls to the outside world functions, get their returned data structures and pass those structures to the transforming functions.
Your testing strategy is then as follows: the second group can be tested by creating fake data structures, passing them in and checking that the transforms were done correctly. The third group of co-ordinating functions can be tested by dependency injection and mocking to see that they call the outside world and transform functions correctly. Finally the last group of functions should not be tested. You follow the maxim - "make it so simple that their is obviously nothing wrong". See if you can keep it to a single line of code. If you go over four lines of code for these then you are probably doing it wrong.
If you are completely new to TDD I do however strongly suggest that you first get used to doing it on green field projects/modules. I made a couple of false starts on unit testing because I tried to bolt it onto projects afterwards. TDD is really a joy when you finally grok it so it would not be good if you get discouraged early on because of a too steep learning curve.
I have a lengthy but simple form which allows users to update products, nothing fancy, just displays the fields to enter mostly text values & then updates the corresponding database table.
Is it regular practice to write tests for CRUD operations such as this? Its important the form works properly as it will be used everyday - however there is really not much to go wrong. I have the time to do this, but don't want to be wasting time either or make future maintenance of my test suite overly difficult.
If I was to cover this with tests, should I use integration (cucumber) or unit (rspec) tests?
Thanks for any advice!
In theory I believe you are supposed to use both: Rspec to test each layer in isolation, then Cucumber to test the whole stack. But if you think this piece of your app is too straight-forward to warrent slavish adherence to those principles, I would recommend you stick with the integration tests. It's important to know if something isn't working right, and IMO Cucumber alone should be suffient for that purpose.
Of course for more complex areas of your app, you probably want to use some mixture of each.
I've been learning what TDD is, and one question that comes to mind is what exactly is the "test". For example, do you call the webservice and then build the code to make it work? or is it more unit testing oriented?
In general the test may be...
unit test which tests an individual subcomponent of your software without any external dependencies to other classes
integration test which are tests that test the connection between two separate systems, ie. their integration capability
acceptance test for validating the functionality of the system
...and some others I've most likely temporarily forgotten for now.
In TDD, however, you're mostly focusing on the unit tests when creating your software.
It's entirely Unit Test driven.
The basic idea is to write the unit tests first, and then do the absolute minimum amount of work necessary to pass the tests.
Then write more tests to cover more of the requirements, and implement a bit more to make it pass.
It's an iterative process, with cycles of test writing, and then code writing.
Here are a couple of good articles by Unclebob
Three rules of TDD
TDD with Acceptance and Unit tests
I suggest you not to emphasize on Test because TDD is actually is a software development methodology, not a testing methodology.
I would say it is about unit testing and code coverage. It is about shipping bugless code and being able to make changes easily in the future.
See Uncle Bob's words of wisdom.
How I use it, it's unit testing oriented. Suppose I want a method that square ints I write this method :
int square(int x) { return null; }
and then write some tests like :
[Test]
TestSquare()
{
Assert.AreEqual(square(0),0);
Assert.AreEqual(square(1),1);
Assert.AreEqual(square(10),100);
Assert.AreEqual(square(-1),1);
Assert.AreEqual(square(-10),100);
....
}
Ok, maybe square is a bad example :-)
In each case I test expected behaviour and all borderline vals like maxint and zero and null-value (remember you can test on errors too) and see to it the test fails (which isn't hard :-)) then I keep working on the function until it works.
So : first a unit test that fails an covers what you want it to cover, then the method.
Generally, unit tests in "TDD" shouldn't involve any IO at all.
In fact, you'll have a ton more effectiveness if you write objects that do not create side effects (I/O is almost always, if not always, a side effect!), and define your the behavior of your class either in terms of return values of methods, or calls made to interfaces that have been passed into the object.
I just want to give my view on the topic which may help to understand TDD a bit more in a different way.
TDD is a design method that relies in testing first. because you asked about how the test is, ill go like this:
If you want to build an application, you know the purpose of what you want to build and you know generally that when you are done, or along the way you need to test it e.g check the values of variables you create by code inspection, of quickly drop a button that you can click on and will execute a part of code and pop up a dialog to show the result of the operation etc.
on the other hand TDD changes your mindset and i'll point out one of such ways. commonly , you just rely on the development environment like visual studio to detect errors as you code and compile and somewhere in your head, you know the requirement and just coding and testing via button and pop ups or code inspection. I call this style SDDD (Syntax debugging driven development ).
but when you are doing TDD, is a "semantic debugging driven development " because you write down your thoughts/ goals of your application first by using tests (which and a more dynamic and repeatable version of a white board) which tests the logic (or "semantic") of your application and fails whenever you have a semantic error even if you application passes syntax error (upon compilation).
by the way even though i said "you know the purpose of what you want to build ..", in practice you may not know or have all the information required to build the application , since TDD kind of forces you to write tests first, you are compelled to ask more questions about the functioning of the application at a very early stage of development rather than building a lot only to find out that a lot of what you have written is not required (or at lets not at the moment). you can really avoid wasting your precious time with TDD (even though it may not feel like that initially)
I know what the advantages are and I use fake data when I am working with more complex systems.
What if I am developing something simple and I can easily set up my environment in a real database and the data being accessed is so small that the access time is not a factor, and I am only running a few tests.
Is it still important to create fake data or can I forget the extra coding and skip right to the real thing?
When I said real database I do not mean a production database, I mean a test database, but using a real live DBMS and the same schema as the real database.
The reasons to use fake data instead of a real DB are:
Speed. If your tests are slow you aren't going to run them. Mocking the DB can make your tests run much faster than they otherwise might.
Control. Your tests need to be the sole source of your test data. When you use fake data, your tests choose which fakes you will be using. So there is no chance that your tests are spoiled because someone left the DB in an unfamiliar state.
Order Independence. We want our tests to be runnable in any order at all. The input of one test should not depend on the output of another. When your tests control the test data, the tests can be independent of each other.
Environment Independence. Your tests should be runnable in any environment. You should be able to run them while on the train, or in a plane, or at home, or at work. They should not depend on external services. When you use fake data, you don't need an external DB.
Now, if you are building a small little application, and by using a real DB (like MySQL) you can achieve the above goals, then by all means use the DB. I do. But make no mistake, as your application grows you will eventually be faced with the need to mock out the DB. That's OK, do it when you need to. YAGNI. Just make sure you DO do it WHEN you need to. If you let it go, you'll pay.
It sort of depends what you want to test. Often you want to test the actual logic in your code not the data in the database, so setting up a complete database just to run your tests is a waste of time.
Also consider the amount of work that goes into maintaining your tests and testdatabase. Testing your code with a database often means your are testing your application as a whole instead of the different parts in isolation. This often result in a lot of work keeping both the database and tests in sync.
And the last problem is that the test should run in isolation so each test should either run on its own version of the database or leave it in exactly the same state as it was before the test ran. This includes the state after a failed test.
Having said that, if you really want to test on your database you can. There are tools that help setting up and tearing down a database, like dbunit.
I've seen people trying to create unit test like this, but almost always it turns out to be much more work then it is actually worth. Most abandoned it halfway during the project, most abandoning ttd completely during the project, thinking the experience transfer to unit testing in general.
So I would recommend keeping tests simple and isolated and encapsulate your code good enough it becomes possible to test your code in isolation.
As far as the Real DB does not get in your way, and you can go faster that way, I would be pragmatic and go for it.
In unit-test, the "test" is more important than the "unit".
I think it depends on whether your queries are fixed inside the repository (the better option, IMO), or whether the repository exposes composable queries; for example - if you have a repository method:
IQueryable<Customer> GetCustomers() {...}
Then your UI could request:
var foo = GetCustomers().Where(x=>SomeUnmappedFunction(x));
bool SomeUnmappedFunction(Customer customer) {
return customer.RegionId == 12345 && customer.Name.StartsWith("foo");
}
This will pass for an object-based fake repo, but will fail for actual db implementations. Of course, you can nullify this by having the repository handle all queries internally (no external composition); for example:
Customer[] GetCustomers(int? regionId, string nameStartsWith, ...) {...}
Because this can't be composed, you can check the DB and the UI independently. With composable queries, you are forced to use integration tests throughout if you want it to be useful.
It rather depends on whether the DB is automatically set up by the test, also whether the database is isolated from other developers.
At the moment it may not be a problem (e.g. only one developer). However (for manual database setup) setting up the database is an extra impediment for running tests, and this is a very bad thing.
If you're just writing a simple one-off application that you absolutely know will not grow, I think a lot of "best practices" just go right out the window.
You don't need to use DI/IOC or have unit tests or mock out your db access if all you're writing is a simple "Contact Us" form. However, where to draw the line between a "simple" app and a "complex" one is difficult.
In other words, use your best judgment as there is no hard-and-set answer to this.
It is ok to do that for the scenario, as long as you don't see them as "unit" tests. Those would be integration tests. You also want to consider if you will be manually testing through the UI again and again, as you might just automated your smoke tests instead. Given that, you might even consider not doing the integration tests at all, and just work at the functional/ui tests level (as they will already be covering the integration).
As others as pointed out, it is hard to draw the line on complex/non complex, and you would usually now when it is too late :(. If you are already used to doing them, I am sure you won't get much overhead. If that were not the case, you could learn from it :)
Assuming that you want to automate this, the most important thing is that you can programmatically generate your initial condition. It sounds like that's the case, and even better you're testing real world data.
However, there are a few drawbacks:
Your real database might not cover certain conditions in your code. If you have fake data, you cause that behavior to happen.
And as you point out, you have a simple application; when it becomes less simple, you'll want to have tests that you can categorize as unit tests and system tests. The unit tests should target a simple piece of functionality, which will be much easier to do with fake data.
One advantage of fake repositories is that your regression / unit testing is consistent since you can expect the same results for the same queries. This makes it easier to build certain unit tests.
There are several disadvantages if your code (if not read-query only) modifies data:
- If you have an error in your code (which is probably why you're testing), you could end up breaking the production database. Even if you didn't break it.
- if the production database changes over time and especially while your code is executing, you may lose track of the test materials that you added and have a hard time later cleaning it out of the database.
- Production queries from other systems accessing the database may treat your test data as real data and this can corrupt results of important business processes somewhere down the road. For example, even if you marked your data with a certain flag or prefix, can you assure that anyone accessing the database will adhere to this schema?
Also, some databases are regulated by privacy laws, so depending on your contract and who owns the main DB, you may or may not be legally allowed to access real data.
If you need to run on a production database, I would recommend running on a copy which you can easily create during of-peak hours.
It's a really simple application, and you can't see it growing, I see no problem running your tests on a real DB. If, however, you think this application will grow, it's important that you account for that in your tests.
Keep everything as simple as you can, and if you require more flexible testing later on, make it so. Plan ahead though, because you don't want to have a huge application in 3 years that relies on old and hacky (for a large application) tests.
The downsides to running tests against your database is lack of speed and the complexity for setting up your database state before running tests.
If you have control over this there is no problem in running the tests directly against the database; it's actually a good approach because it simulates your final product better than running against fake data. The key is to have a pragmatic approach and see best practice as guidelines and not rules.