Design of a mid-to-large sized application when doing TDD? [closed]

I have a good grasp of unit testing, DI, mocks, and all the design-principle goodness required to get as close to full code coverage as humanly possible (single responsibility principle, thinking 'how will I test this?' as I code, etc.).
On my most recent app, I did not do true TDD. I kept unit testing in mind as I coded, and wrote my tests after writing the code, refactoring, etc. I did TDD when it was 'easy' to do; however, I did not have as good a grasp as I do now. That was the first project where I made full use of DI, mocking frameworks, etc., and the first with full code coverage, and I learned a lot from it as I went along. I'm itching to get assigned to my next project so I can code it completely with TDD from scratch.
I know this is a broad question, and I've already ordered TDD by example and XP Unleashed, but I'm hoping for a brief overview of how you all design / write a large application doing TDD.
Do you write the entire application using nothing but stubbed-out code (e.g., write all the function signatures, interfaces, and structures, and write the entire application without writing any actual implementation)? I could picture it working on a small-to-mid sized application, but is this even possible on large applications?
If not, how the heck would you write your first unit test for the highest-level function in your system? Let's say, for example, a web service where you have a function called DoSomethingComplicated(param1,...,param6) exposed to the world. Obviously, writing the test first for a simple function like AddNumbers() is trivial - but what about when the function is at the top of the call stack like this?
Do you still do design up-front? Obviously you still want to do 'architecture' design - e.g., a flow chart showing IE talking to IIS, which talks to a Windows service via WCF, which talks to the SQL database; an ERD showing all your SQL tables and their fields, etc. But what about class design? Interactions between the classes, etc.? Do you design this up-front, or just keep writing stub code, refactoring the interactions as you go along, until the whole thing connects and looks like it will work?
Any advice is much appreciated.

Do you do design up front?
Of course you do. You've got a big application in front of you. You've got to have some idea of the structure it will have before you start writing tests and code. You don't have to have it all worked out in detail, but you should have some basic idea of the layers, components, and interfaces. For example, if you are working on a web services system, you ought to know what the top level services are, and have a good first approximation of their signatures.
Do you write the entire application using nothing but stubbed out code?
No. You stub things out only if they are really difficult to control in a test. For example, I like to stub out the database, and the UI. I will also stub out third party interfaces. Sometimes I will stub out one of my own components if it vastly increases the test time, or it forces me to create test data that is too complicated. But most of the time I let my tests work on a pretty well integrated system.
I have to say I really dislike the style of testing that relies heavily on mocks and stubs. Don't get me wrong, I think mocks and stubs are very useful for decoupling from things that are hard to test. But I don't like writing things that are hard to test, and so I don't use a lot of mocks and stubs.
How do you write your first unit test for a high level function?
Most high-level functions have degenerate behavior. For example, login is a pretty high-level function and can be very complicated. But if you try to log in with no user name and no password, the response from the system is going to be pretty simple. Writing that test will also be very simple. So you start with the degenerate cases. Once you have exhausted them, you move on to the next level of complexity. For example, what if a user tries to log in with a username but no password? Bit by bit you climb the ladder of complexity, never tackling the more complex aspects until the less complex aspects are all passing.
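To make that concrete, the first couple of tests for a hypothetical login function might look like this (a sketch using Python's unittest; the login function, its module, and its result object are invented names for illustration, not anything from a real framework):

    import unittest

    from myapp.auth import login  # hypothetical module under test


    class LoginDegenerateCasesTest(unittest.TestCase):
        def test_rejects_empty_username_and_password(self):
            result = login(username="", password="")
            self.assertFalse(result.ok)

        def test_rejects_username_without_password(self):
            result = login(username="alice", password="")
            self.assertFalse(result.ok)


    if __name__ == "__main__":
        unittest.main()

Only once these degenerate cases pass would you move up to the wrong-password and then the successful-login behavior.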
It is remarkable how well this strategy works. You might think that you'd just be climbing around the edges all the time and never getting to the meat; but that's not what happens. Instead you find yourself designing the internal structure of the code based on all the degenerate and exceptional cases. When you finally get around to the primary flow, you find that the structure of the code you are working on has a nice hole of just the right shape to plug the main flow in.
Please don't create your UI first.
UIs are misleading things. They make you focus on the wrong aspects of the system. Instead, imagine that your system must have many different UIs. Some will be web, some will be thick client, some will be pure text. Design your system to work properly irrespective of the UI. Get all the business rules working first, with all tests passing. Then plug the UI in later. I know this flies in the face of a lot of conventional wisdom, but I wouldn't do it any other way.
Please don't design the database first.
Databases are details. Save the details for later. Rather, design your system as though you had no idea what kind of database you were using. Keep any notion of schema, tables, rows, and columns out of the core of the system. Implement your business rules as though all the data were kept in memory all the time. Then add the database later, once you've gotten all the business rules working. Again, I know this flies in the face of some conventional wisdom, but coupling systems to databases too early is a source of a lot of badly warped designs.
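One way to keep that discipline is to hide persistence behind a small gateway that the business rules depend on, with the first implementation being nothing more than an in-memory collection. A sketch, with all names invented:

    class ContactGateway:
        """What the business rules need from storage; no tables, rows, or SQL."""
        def add(self, contact):
            raise NotImplementedError

        def all(self):
            raise NotImplementedError


    class InMemoryContactGateway(ContactGateway):
        """Enough to drive out every business rule with tests; a database-backed
        gateway can be written much later, behind the same interface."""
        def __init__(self):
            self._contacts = []

        def add(self, contact):
            self._contacts.append(contact)

        def all(self):
            return list(self._contacts)


    class ContactBook:
        """Core logic: talks to the gateway abstraction, never to a database."""
        def __init__(self, gateway):
            self._gateway = gateway

        def register(self, name, email):
            if not email:
                raise ValueError("email required")
            self._gateway.add({"name": name, "email": email})

When the rules are stable, the real persistence slots in as just another gateway implementation.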

Do I write the entire application, using nothing but stubbed out code?
No, not in the slightest sense - that sounds like a very wasteful approach. We must always keep in mind that the underlying reason for doing TDD is rapid feedback. An automated test suite can tell us if we broke anything much faster than a manual test can. If we wait to wire things together until the last moment, we don't get rapid feedback - while we may get rapid feedback from our unit tests, we wouldn't know if the application works as a whole. Unit tests are only one form of test we need to perform to verify the application.
A better approach is to start with the most important feature and work your way in from there, using an outside-in approach. This often means starting with some UI.
The way I do it is by creating the desired UI. Since we normally can't develop a UI with TDD, I simply create the View with the technology of choice. No tests there, but I wire up the UI to some API (preferably using declarative databinding), and that's when the testing begins.
In the beginning, I would then TDD my ViewModels/Presentation Models and corresponding Controllers, possibly hard-coding some responses to see that the UI works. As soon as I have something that doesn't explode when you run it, I check in the code (remember, many small incremental check-ins).
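Very roughly, that first hard-coded step can look like this (a sketch with invented names, not tied to any particular UI or binding framework):

    class ContactListViewModel:
        """First pass: hard-coded data, just enough for the bound view to light up."""
        def __init__(self, contact_service=None):
            self._contact_service = contact_service  # wired in later in the slice

        @property
        def contacts(self):
            if self._contact_service is None:
                return ["Ada Lovelace", "Alan Kay"]  # placeholder response
            return self._contact_service.list_names()


    def test_view_model_exposes_contacts_for_binding():
        assert ContactListViewModel().contacts  # the view has something to show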
I subsequently work my way vertically down that feature and ensure that this particular piece of UI can go all the way to the data source (or whatever), ignoring all other features.
When the feature is done, I can start on the next feature. The way I picture this process is that I fill out the application by doing one vertical slice at a time until all features are done.
Kick-starting a greenfield app this way always takes extra time for the first feature, since this is where you have to wire everything up, so pick something simple (like the initial View of the app) to keep things as easy as possible. Once the first feature is done, the next ones become much easier because the foundations are now in place.
Do I still design up-front?
Not much, no. I normally have an overall design in mind before I start, and when I work in a team, we sketch this overall architecture on a whiteboard or a slide deck before we start.
This is more or less limited to:
The number and names of layers (UI, Presentation Logic, Domain Model, Data Access, etc.)
The technologies used (WPF, ASP.NET MVC, SQL Server, .NET 3.5, or whatnot)
How we structure production code and test code, and which test technologies we use
Quality requirements for the code (pair programming, static code analysis, coding standards, etc.)
The rest we figure out as we go along, with many ad-hoc design sessions at the whiteboard.

+1 Good question
I truly don't know the answer, but I would start with building blocks of classes that I could test and then build into the application, not with the top-level stuff. And yes, I would have a rough up-front design of the interfaces; otherwise I think you would find those interfaces changing so often as you refactor that it would be a real hindrance.
I don't think TDD By Example will help; IIRC it works through a simple example. I am reading Roy Osherove's The Art of Unit Testing, and while it seems to cover tools and techniques like mocks and stubs comprehensively, the examples so far also seem pretty simple, and I don't see that it tells you how to approach a large project.

Do you write the entire application, using nothing but stubbed out code?
To test our systems we mainly do unit, integration, and remote-service testing. In unit tests we stub out everything long-running, time-consuming, or external, i.e. database operations, web service connections, or any connection to an external service. This is to make sure that our tests are fast, independent, and not reliant on the response of any external service, so that they give us quick feedback. We learnt this the hard way: we have some tests that perform database operations, which makes them really slow and goes against the principle that unit tests must be fast to run.
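For example, with Python's unittest.mock the slow collaborator can be replaced by a stub whose response the test fully controls (a sketch; OrderService and PaymentClient are invented names, not real libraries):

    from unittest import TestCase, mock

    from myapp.orders import OrderService  # hypothetical code under test


    class OrderServiceTest(TestCase):
        def test_order_is_marked_paid_when_payment_succeeds(self):
            # Stub the external payment service: no network, no waiting,
            # and the response is entirely under the test's control.
            payment_client = mock.Mock()
            payment_client.charge.return_value = {"status": "ok"}

            service = OrderService(payment_client=payment_client)
            order = service.place_order(item="book", amount=10)

            self.assertEqual(order.status, "paid")
            payment_client.charge.assert_called_once_with(amount=10)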
In integration tests, we test the database operations but still not the web services and other external services, because that can make the tests brittle depending on their availability, and we use autotest to run the tests in the background while we are coding.
However, to test any kind of remote service, we have tests that connect to the external services, perform an operation on them, and get the response. What matters to the test is the response, and the end state if that is important for the test. The important thing is that we keep these kinds of tests in another directory called remote (a convention we created and follow), and these remote tests are only run by our CI (continuous integration) server when we merge code to the master/trunk branch and push it to the repo, so that we know quickly if there have been any changes in those external services that could affect our application.
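If you happen to be on pytest, the same split can also be expressed with a marker instead of (or in addition to) a directory convention - a sketch, with a placeholder URL and marker name:

    import pytest
    import requests


    # "remote" is a custom marker declared in pytest.ini.
    # Run locally with:   pytest -m "not remote"
    # Run on the CI with: pytest              (or pytest -m remote for just these)
    @pytest.mark.remote
    def test_payment_provider_still_accepts_our_requests():
        # Talks to the real external service, so availability and latency are
        # out of our hands - exactly why it stays out of the local runs.
        response = requests.post("https://payments.example.com/api/charge",
                                 json={"amount": 1})
        assert response.status_code == 200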
Do I still design up-front?
Yes, but we don't do big design up front - basically what Uncle Bob (Robert C. Martin) said.
In addition, we get to the whiteboard before immersing ourselves in coding and create some class collaboration diagrams, just to make sure that everyone on the team is on the same page; this also helps us divide the work amongst the team members.

Related

Best practice for end to end testing whole systems

End-to-end testing means exercising an application from the outer boundaries to verify its behavior. So far I've only written tests for a single executable artifact. How should I test systems made up of multiple artifacts deployed on different hosts?
I see two alternatives.
The tests set up the whole system and exercise it from the very outer edges.
Each artifact is end to end tested in isolation, relying on the test content to enforce the protocol between them.
Is there a clear case for adhering to only one of these, or is one of them preferred, or are they interchangeable? If interchangeable, then what are some advantages and disadvantages between them?
Even though I think it depends on the context, I prefer the first alternative. Here are my random thoughts:
I like my tests to be as closely mapped to use cases as possible (BDD style) (with the disclaimer that I misuse the term use case). These use cases may span several applications and sub-systems.
Example: A back office administrator can view a transaction made by a user from the public interface.
Here, the back office admin interface and the public interface are different applications, but they are included in the same use case.
Mapping these thoughts to your problem where you have sub-systems deployed on different hosts, I would say it depends on how it is used, from the user/actor perspective. Do the use cases span several sub-systems?
Also, perhaps the fact that the system is deployed on several hosts isn't important to the tests. You could replace the inter-process communication with method calls in your tests and have the whole system within the same process during tests, reducing the complexity. Supplement this with tests that only verify the inter-process communication.
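A sketch of that substitution (all names invented): give the consuming sub-system a small client interface, wire in an in-process implementation for the end-to-end tests, and cover the real transport in a few narrower tests.

    class PublicSiteClient:
        """What the back office sub-system needs from the public interface."""
        def get_transaction(self, tx_id):
            raise NotImplementedError


    class HttpPublicSiteClient(PublicSiteClient):
        """Production transport: a real inter-process call over HTTP (omitted here)."""


    class InProcessPublicSiteClient(PublicSiteClient):
        """Test transport: calls the public-interface application directly,
        keeping the whole use case inside one process during the test."""
        def __init__(self, public_app):
            self._app = public_app

        def get_transaction(self, tx_id):
            return self._app.find_transaction(tx_id)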
Edit:
I realise that I forgot to include why I prefer to test the whole system.
Your asset is features, that is, behaviour, while the code is a liability. Therefore you'd like to test the behaviour, not the code (BDD style).
If you are testing each sub-system separately you are testing the code, not the features. Why? When you divided your system into sub-systems you did so based on some technical reasons. When you learn more you might discover that the chosen seam is sub-optimal and would like to move some responsibility from one sub-system to another. And you would have to modify test and production code at the same time, leaving you without a safety net. That's a typical symptom of testing implementation details.
That said, these kinds of tests are too blunt to test everything. So you need to have complementary tests for the details as well, where necessary.
Testing each artifact end-to-end separately would be highly desirable in any case. This will ensure that every artifact is sound.
In addition, you might want to test a composition of artifacts. That would catch problems in the interactions between artifacts. I don't know about your situation, but one thing that is important to have is a test environment that is a copy of production. Testing the system in the test environment is a very good idea. You might also want to test the system in the production environment; this might be feasible or not. For instance, if your system processes credit card payments, you may want to avoid test payments on the production system.
In any case, testing each artifact separately is IMHO more important than testing the composition. Once you know that your artifacts are sound in isolation, catching interaction problems will be much easier. If you only have the end-to-end test of the whole system, it's much more difficult to understand where the error is when the tests fail.

How would you stress test a dynamic site, when you don't know what the URLs will be ahead of time?

This isn't a question of what stress testing tools are out there. I'm afraid it's a lot harder than that. (At least for me)
Consider a RESTful architecture for a forum or blog that generates random IDs for each post.
Simulating creating those topics/articles would be simple, because you'd just be posting form data to an endpoint like: /article, or /topic
But how do you then stress test commenting on those articles/topics? This is different, because the comments need to belong to an article/topic, which means that you need the ids of those items. However, if all you can do is issue posts, and you have no way of pulling those ids, you'd be unable to create them.
I'm creating a site that is similar in this regard, and I have no idea how to stress test the creation of the comments.
I have two ideas, and they're both pretty awful:
Generate a massive system ahead of time with some kind of factory, and then freeze it. From there, I figure I'd have to use some kind of browser automation to create my 'comments' on all of this. The automation would I suppose go through a recording proxy, like what JMeter offers. Then, to run the test, I reload the database, and replay the massive log file.
Use browser automation for the whole thing, taking advantage of the dynamic links delivered in the HTML page. The only option here would be Selenium, and really, we're talking a massive selenium grid that would be extremely expensive. Probably very difficult to maintain also.
Option 2 is completely infeasible as near as I can tell, but option 1 sounds excruciating. I'm really hoping someone can suggest something more clever.
Option 1.
I mean, implementation notes aside, you're basically just asking for a testing environment. So, the answer is to make one. In whatever fashion:
Generate it
Make it once and reload it
Randomise it
Whatever. It's the approach to go with.
How you do your testing is kind of a side issue (unit testing/browser/whatever, up to you).
But you've reached a point where you need to test with real data. So make it happen.
This is a common problem. We handle it by extracting the dynamic parts of the URLs from the server responses. I presume this system uses a web browser client, which implies that those URLs are being sent in the server responses. If they are in the responses, then you CAN get them. However, since you said "if all you can do is issue posts, and you have no way of pulling those ids", perhaps this is not the case? If so, can you clarify?
We've recently been doing a lot of testing of Drupal systems for our customers, which has exactly the problem you've described. We either solve it by extracting the IDs dynamically from the page as the user browses to the page they want to comment on, or we use option 1, or a combination of both. Note that if you have a load testing tool handy, then generation of content is not too difficult - use the tool to do it, i.e. run a "content generation" load test. Besides yielding useful data in its own right, that will give you a test database that you can back up and restore as needed to maintain your test infrastructure. Now you can run the test in a more realistic environment - one that has lots of content already in it (assuming, of course, that this is realistic for your purposes).
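In script form the whole trick is: create the content, capture the dynamic ID the server sends back (from the HTML, a JSON body, or a Location header), and feed it into the dependent request. A rough sketch with the requests library; the URLs, field names, and URL pattern are made up:

    import re
    import requests

    BASE = "http://localhost:8000"  # placeholder for the site under test

    # 1. Create a topic and pull its server-generated ID out of the response,
    #    assuming the response HTML contains a link of the form /topic/<id>.
    resp = requests.post(BASE + "/topic", data={"title": "load test topic"})
    topic_id = re.search(r"/topic/(\w+)", resp.text).group(1)

    # 2. Reuse the extracted ID to exercise the dependent endpoint.
    requests.post(BASE + "/topic/" + topic_id + "/comment",
                  data={"body": "load test comment"})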
If you are interested, I'd be happy to demo how we solve the problem using our software (Web Performance Load Tester).
I have used Visual Studio to solve this kind of problem. Visual Studio allows C# coded web tests that can programmatically parse the HTML returned and take action based on it.
I was load testing a SharePoint website and needed information to be populated ahead of time, so I created a load test specifically for creating "random" pages of content. I populated a test harness database with the URLs in advance, allowing some control over the pages that were loaded.
With a list of "articles" available and a list of potential comments, it is possible to code a pseudo-random number generator (inside a stored procedure because of the asynchronous nature of the test harness) to get a repeatable load test. That meant that the site would be populated in the same way each time the load test was run.
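The language doesn't matter much here; the repeatability comes from fixing the seed. A minimal sketch of the same idea outside a stored procedure (names invented):

    import random

    def plan_comments(article_ids, comment_bodies, count=1000, seed=42):
        """Pseudo-random but repeatable: the same seed produces the same
        (article, comment) pairs on every load-test run."""
        rng = random.Random(seed)
        return [(rng.choice(article_ids), rng.choice(comment_bodies))
                for _ in range(count)]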
It does take some effort to create a decent way of populating the site off the bat, but the return in the relevance of the load test is quite good.

How to design a command line program reusable for a future development of a GUI? [closed]

What are some best practices to keep in mind when developing a script program that could be integrated with a GUI, probably by somebody else, in the future?
Possible scenario:
I develop a fancy Python CLI program that scrapes every unicorn image from the web
I decide to publish it on github
A unicorn fan programmer decides to take the sources and build a GUI on them
He/she gives up because my code is not reusable
How do I prevent step four and let the unicorn fan programmer build his/her GUI without too much hassle?
You do it by applying a good portion of layering (maybe implementing the MVP pattern) and treating your CLI as a UI in its own right.
UPDATE
This text from the wikipedia article about the Model-View-Presenter pattern explains it quite well.
Model-view-presenter (MVP) is a user interface design pattern engineered to facilitate automated unit testing and improve the separation of concerns in presentation logic.
The model is an interface defining the data to be displayed or otherwise acted upon in the user interface.
The view is an interface that displays data (the model) and routes user commands (events) to the presenter to act upon that data.
The presenter acts upon the model and the view. It retrieves data from repositories (the model), persists it, and formats it for display in the view.
The main point is that you need to work on separation of concerns in your application.
Your CLI would be one implementation of a view, whereas the unicorn fan would implement another view for a rich client. The unicorn fan would base his view on the same presenters as your CLI. If those presenters are not sufficient for his rich client, he could easily add more, because each presenter is based on data from the model. The model, in turn, is where all the core logic of your application is based. Designing a good model is an entire subject in itself. You may be interested in reading, for example, about Domain-Driven Design, even though I don't know how well it applies to your current application. But it's interesting reading anyway.
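To make that concrete for the unicorn program, a minimal sketch (all names invented) could look like this, with the CLI being just one thin view over the presenter:

    class UnicornScraper:
        """Model: the core scraping logic; knows nothing about any UI."""
        def find_unicorn_images(self, query):
            # ... the actual web scraping would live here ...
            return ["http://example.com/unicorn1.png"]


    class UnicornPresenter:
        """Presenter: drives the model and hands results to whatever view it gets."""
        def __init__(self, scraper, view):
            self._scraper = scraper
            self._view = view

        def search(self, query):
            self._view.show_images(self._scraper.find_unicorn_images(query))


    class CliView:
        """One view; a GUI would be another class exposing the same method."""
        def show_images(self, images):
            for url in images:
                print(url)


    if __name__ == "__main__":
        UnicornPresenter(UnicornScraper(), CliView()).search("unicorn")

The unicorn fan reuses UnicornScraper and UnicornPresenter unchanged and only supplies a different view class.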
As you can see, the wikipedia article on MVP also talks about testability, which is also crucial if you want to provide a robust framework for others to build on. To reach a high level of testability in your code-base, it is often a good idea to use some kind of Dependency Injection framework.
I hope this gives you a general idea of the techniques you need to employ, although I understand that it may be a little overwhelming. Don't hesitate to ask if you have any further doubts.
/Klaus
This sounds like a question about how to write reusable code.
When considering reusability of code, generally speaking, one should try to:
separate functionality into modules
have a well-defined interface
Separating functionality into modules
One should try to separate code into parts that have a simple responsibility. For example, a program that goes out to the internet to scrape pictures of unicorns may be separated into parts that (a) scrape the web for images, (b) determine whether an image is a unicorn, and (c) store said unicorn images in some specified location.
Have a well-defined interface
Having a well-designed interface, an API (application programming interface), is going to be crucial to providing a way to reuse or extend an application.
Providing entry points into each functionality will allow other programmers to actually write a new user interface for the provided functionality.
The solution to this kind of problem is very simple, but in practice, a lot of junior programmers have trouble with this pattern. Here's the solution:
You design a unicorn-scraping API. This is the hard step; good API design is insanely hard, and there aren't many examples to study. One API that I think is worth studying is the one in Dave Hanson's book C Interfaces and Implementations.
Then you design your command-line interface. If the functionality you are exposing is not too complicated, this is not too hard. But if it's complicated, you may want to think seriously about managing your API using an embedded scripting language like Lua or Tcl and designing an interface for scripting rather than for the command line.
Finally you write your command-line processing code and glue everything together.
Your hypothetical successor builds his or her GUI in one of two ways: using the embedded scripting language, or directly on top of your API.
As noted in other answers, model/view/controller may be a good pattern to use in designing your API.
You'll be taking input, executing an action, and presenting output. It might be a good idea to use a callback mechanism (such as event handlers, passing a method as a parameter, or passing this/self to the called class) to decouple the input and output methods from the execution of the action.
Aside from this, program to an interface, not to an implementation - the essence of MVC/MVP, as klausbyskov mentioned. For example, don't directly call file.write(); make myModel.saveMyData(), which calls file.write(), so someone else can make a somebodysModel.saveMyData() that writes to a database.
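Sketching that last point with hypothetical names: the action code depends only on the saveMyData abstraction, so a database-backed model is a new class rather than a rewrite.

    class FileModel:
        """Your model: persists by writing to a file."""
        def __init__(self, path):
            self._path = path

        def save_my_data(self, data):
            with open(self._path, "a") as f:
                f.write(data + "\n")


    class DatabaseModel:
        """Somebody else's model: same interface, backed by a database connection."""
        def __init__(self, connection):
            self._connection = connection

        def save_my_data(self, data):
            self._connection.execute(
                "INSERT INTO my_data (value) VALUES (?)", (data,))


    def run_action(model, data):
        # The action only knows about save_my_data, not about files or SQL.
        model.save_my_data(data)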

Applying TDD when the application is 100% CRUD

I routinely run into this problem, and I'm not sure how to get past this hurdle. I really want to start learning and applying Test-Driven Development (or BDD, or whatever), but it seems like every application where I want to apply it is pretty much only standard database CRUD stuff, and I'm not sure how to go about it. The objects pretty much don't do anything apart from being persisted to a database; there is no complex logic that needs to be tested. There is a gateway that I'll eventually need to test for a 3rd-party service, but I want to get the core of the app done first.
Whenever I try to write tests, I only end up testing basic stuff that I probably shouldn't be testing in the first place (e.g. getters/setters), but it doesn't look like the objects have anything else. I guess I could test persistence, but this never seems right to me because you aren't supposed to actually hit a database; yet if you mock it out, then you really aren't testing anything because you control the data that's spit back. I've seen a lot of examples where there is a mock repository that simulates a database by looping and creating a list of known values, and the test verifies that the "repository" can pull back a certain value. I'm not seeing the point of a test like this, because of course the "repository" is going to return that value; it's hard-coded in the class! Well, I see it from a pure TDD standpoint (i.e. you need to have a test saying that your repository needs a GetCustomerByName method or whatever before you can write the method itself), but that seems like following dogma for no reason other than it's "the way" - the test doesn't seem to be doing anything useful apart from justifying a method.
Am I thinking of this the wrong way?
For example, take a run-of-the-mill contact management application. We have contacts, and let's say that we can send messages to contacts. We therefore have two entities: Contact and Message, each with common properties (e.g. First Name, Last Name, and Email for Contact; Subject, Body, and Date for Message). If neither of these objects has any real behavior or needs to perform any logic, then how do you apply TDD when designing an app like this? The only purpose of the app is basically to pull a list of contacts and display them on a page, display a form to send a message, and the like. I'm not seeing any sort of useful tests here - I could think of some tests, but they would pretty much be tests for the sake of saying "See, I have tests!" instead of actually testing some kind of logic. (While Ruby on Rails makes good use of it, I don't really consider testing validation to be a "useful" test, because it should be something the framework takes care of for you.)
"The only purpose of the app is basically to pull a list of contacts"
Okay. Test that. What does "pull" mean? That sounds like "logic".
" display them on a page"
Okay. Test that. Right ones displayed? Everything there?
" display a form to send a message,"
Okay. Test that. Right fields? Validations of inputs all work?
" and the like."
Okay. Test that. Do the queries work? Find the right data? Display the right data? Validate the inputs? Produce the right error messages for the invalid inputs?
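Even those "trivial" behaviors turn into perfectly ordinary tests once you name them. A sketch; ContactRepository, Contact, and validate_message_form are invented for illustration:

    import unittest

    from myapp.contacts import Contact, ContactRepository      # hypothetical
    from myapp.messages import validate_message_form           # hypothetical


    class ContactCrudTest(unittest.TestCase):
        def test_pull_returns_contacts_ordered_by_last_name(self):
            repo = ContactRepository()
            repo.add(Contact(first_name="Ada", last_name="Lovelace", email="a@x.com"))
            repo.add(Contact(first_name="Alan", last_name="Kay", email="k@x.com"))
            self.assertEqual([c.last_name for c in repo.pull_all()],
                             ["Kay", "Lovelace"])

        def test_message_form_rejects_empty_subject(self):
            errors = validate_message_form({"subject": "", "body": "hi"})
            self.assertIn("subject", errors)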
I am working on a pure CRUD application right now,
but I see lots of benefits from unit test cases (note - I didn't say TDD).
I write the code first and then the test cases - never too far apart, though.
And I test the CRUD operations - persistence to the database as well.
When I am done with the persistence and move on to the UI layer, I will have a fair amount of confidence that my service/persistence layer is good, and I can then concentrate on the UI alone.
So IMHO there is always a benefit to TDD/unit testing (whatever you call it, depending on how extreme you feel about it), even for a CRUD application.
You just need to find the right strategy for your application.
Just use common sense, and you will be fine.
I feel like we are confusing TDD with unit testing.
Unit tests are specific tests which test units of behavior. These tests are often included in the integration build. S.Lott described some excellent candidates for just those types of tests.
TDD is for design. I find, more often than not, that the tests I write when using TDD will either be discarded or evolve into unit tests. The reason is that when I'm doing TDD I'm testing my design while I'm designing my application, class, method, domain, etc.
In response to your scenario, I agree with what S.Lott implied: what you need is a suite of unit tests to test specific behaviors in your application.
TDDing a simple CRUD application is, in my opinion, kind of like practicing scales on a guitar - you may think it's boring and tedious, only to discover how much your playing improves. In development terms, you are likely to write code that's less coupled and more testable. Additionally, you're more likely to see things from the code consumer's perspective - you'll actually be using it. This can have a lot of interesting side effects, like more intuitive APIs and better separation of concerns. Granted, there are scaffold generators that can do basic CRUD for you, and they do have a place, especially for prototyping; however, they are usually tied to a framework of sorts. Why not focus on the core domain first, deferring the framework/UI/database decisions until you have a better idea of the core functionality needed - TDD can help you do that as well.
In your example: do you want messages to be a queue, a hierarchical tree, etc.?
Do you want them to be loaded in real time? What about sorting/searching? Do you need to support JSON or just HTML? It's much easier to see these kinds of questions with BDD/TDD. If you're doing TDD, you may be able to test your core logic without even using a framework (and waiting a minute for it to load/run).
Skip it. All will be just fine. I'm sure you have a deadline to meet. (/sarcasm)
Next month, we can go back and optimize the queries based on user feedback. And break things that we didn't know we weren't supposed to break.
If you think the project will last two weeks and then never be reopened, automated testing is probably a waste of time. Otherwise, if you have a vested interest in "owning" this code for a few months, and it's active, build some tests. Use your judgement as to where the most risk is. All the more so if you plan on being with the company for a few years, have teammates who take turns whacking on various pieces of the system, and it may be your turn again a year from now: build some tests.
Don't over do it, but do "stick a few pins in it", so that if things start to "move around", you have some alarms to call attention to things.
Most of my testing has been JUnit or batch "diff"-type tests, plus a rudimentary screen-scraper tool I wrote a few years ago (scripting some regex + wget/curl-type stuff). I hear Selenium is supposed to be a good tool for web app UI testing, but I have not tried it. Does anybody have tool suggestions for local GUI apps?
Just an idea...
Take the requirements for the CRUD app and use tools like Watij, Watir, or AutoIt to create test cases. Start creating the UI to pass the test cases. Once you have the UI up and passing maybe just one test, start writing the logic layer for that test, and then the DB layer.
For most users, the UI is the system. Remember to write test cases for each new layer that you are building. So instead of starting from the DB to the app to the UI layer, work in the reverse direction.
At the end of the day, you would probably have accumulated a powerful set of regression tests, giving you some confidence to refactor safely.
this is just an idea...
I see what you are saying, but eventually your models will become sufficiently advanced that they will require (or be greatly augmented by) automated testing. If not, what you are essentially developing is a spreadsheet which somebody has already developed for you.
Since you mentioned Rails, I would say doing a standard create/read/update/delete test is a good idea for each property, especially because your test should note permissions (this is huge I think). This also ensures that your migrations work as you expected them to.
I am working on a CRUD application now. What I am doing at this point is writing unit tests on my repository objects and testing that the CRUD features work as they should. I have found that this inherently unit tests the actual database code as well. We have found quite a few bugs in the database code this way. So I would suggest you push ahead and keep going with unit tests. I know applying TDD to CRUD apps is not as glamorous as the things you might read about in blogs or magazines, but it is serving its purpose, and you will be that much better when you work on a more complex application.
These days you should not need much hand-written code for a CRUD app apart from the UI, as there are 101 frameworks that will generate the database and data access code.
So I would look at reducing the amount of hand-written code and automating the testing of the UI. Then I would use TDD for the odd bits of logic that need to be written by hand.

What's the point of testing fake repositories?

I've been trying to push my mentality when developing at home to be geared more towards TDD and a bit of DDD.
One thing I don't understand, though, is why you would create a fake repository to test against. I haven't really looked into it much, but surely the idea of testing is to help decouple your code (giving you more flexibility), trim down the code needed, and bring down the number of bugs.
So can someone fill in my foolish brain as to why some people like to test against fake repositories? I would have thought testing against a real database is a much better alternative to creating a fake one, because then you KNOW that it works against your real-world data store.
The fake repository allows you to test just your application code.
The fake repository means an automated test can easily set up a known state in the repository.
The fake repository will be several orders of magnitude faster than a real database.
The fake repository is NOT a substitute for system testing that will include your database.
As I see it there are two really big reasons why you test against faked resources:
It makes unit testing faster when you have mocked out slow I/O or a database. This may not look like much if you have a small test suite, but when you're up to 500+ unit tests it starts to make a difference. At that volume, the tests that run against the database will start to take several seconds. Programmers are lazy and want things to go fast, so if running the test suite takes more than 10 seconds you won't be happy doing TDD anymore.
It forces you to think about your code design so that changes become easier. Design by contract and dependency injection also become much easier if you've implemented against interfaces or abstract classes. Done right, such a design makes it easier to accommodate changes in your code.
The only drawback is the obvious one:
How can you be sure it really works?
...and that is what integration tests are for.
I upvoted Giraffe's answer, but want to add just a couple of points:
Each developer can use a mock/fake repository for her/his own unit testing without interfering with the tests being done by other developers on the same project.
Using a local mock/fake repository reinforces the use of a data abstraction layer, which is good design practice.
As an example, I've used something as simple as a HashMap to implement a mock of the data access layer. This makes it extremely easy for each unit test to ensure that exactly the necessary conditions exist for its purpose, and to verify that the right calls were made on the data access layer.
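In Python the same trick is just a dict behind the repository interface, which also makes it trivial to assert what was stored. A sketch; DiscountService is an invented stand-in for the code under test:

    class InMemoryCustomerRepository:
        """Fake repository: a dict instead of a database, so each test can set up
        exactly the state it needs and run in microseconds."""
        def __init__(self, customers=None):
            self._customers = dict(customers or {})
            self.saved = []  # lets the test verify what the code under test stored

        def get_by_name(self, name):
            return self._customers.get(name)

        def save(self, customer):
            self._customers[customer["name"]] = customer
            self.saved.append(customer)


    def test_discount_is_applied_to_a_frequent_customer():
        repo = InMemoryCustomerRepository({"alice": {"name": "alice", "orders": 12}})
        service = DiscountService(repo)          # hypothetical code under test
        assert service.discount_for("alice") > 0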
