Creating mock data for unit testing - tdd

I consider myself still pretty new to the TDD scene. But find that no matter which method I use (mock framework or stubbing my own objects) I find that I have to write a lot of code to create mock data. I like the idea of loading up objects to create an in-memory database. But what I don't like is cluttering up my tests with a ton of code for the sole purpose of creating mock data. This is especially the case when the data needs to account for all the different cases.
I'd love some suggestions for a better way of doing this.
It would seem to me that I should be able to load the data once into a known state from some data store and then I could use a snapshot of that state which is loaded in the test setup/initialize before each test method is executed. This would satisfy proper testing practices while providing convenience and let me focus on writing tests instead of writing code to create test data "by hand".

May be you could try the NBuilder library. It provides a very fluent interface and is easy to use. You can use it for generating single instances of a class with defualt values or generate lists with default or overriden values. You can have a look at this one.

If your are using .Net Try NDBUnit
You populate your store and then it reverts your DB to a known state at test time, for each test. The Autumn of Agile screen cast series shows this in pretty good detail.
Or you can do this manually...build a stored procedure or whatever to truncate your tables and copy in the data in your teardown method.

You can have Builder class(es) that helps you building the instances you need / in this case ones you would use related to the repository.
Have the Builder use appropiate defaults, and on your tests you can overwride what you need. This helps you avoid needing to put have every single case of "data" mixed up for all the different tests (which introduces problems, because usually there are cases that aren't compatible for different tests).
**Update 1:**Take a look at www.markhneedham.com/blog/2009/01/21/c-builder-pattern-still-useful-for-test-data

I know exactly what you mean. I think a good approach to solving this problem is to actually have a separate MockFramework project that houses all your mock data, outside the test project. This way you can generate mock data separately, store it in memory if you want to, or not, and then reference the mock framework from the test project. If you use a third party framework to do this, all the better, but you can still wrap that third party framework in your own mock framework so you can get all that "glue" that creates the mock data the way you need it out of your tests so the tests can really be only what they need to be.

Thanks for all the suggestions, I think the solution requires a little bit of everything. I don't want these tests to end up being regression tests, but w/o some kind of existing data store everything still boils down to creating the data by "manually" building the objects.
What would really be nice would be a framework that allowed me to use my existing DAL to either script the data to code for me or get the data in memory and access it like an in memory database.

Untils.org covers this way better than I ever could.
Their whole guide is actually very good.
But basically, if your units require "a lot of data" they may not be unit tests anymore. I'd recommend attempting testing the smaller pieces individually.

Related

Streamlining the implementation of a Repository Pattern and SOA

I'm working with Laravel 5 but I think this question can be applied beyond the scope of a single framework or language. The last few days I've been all about writting interfaces and implementations for repositories, and then binding services to the IoC and all that stuff. It feels extremely slow.
If I need a new method in my service, say, Store::getReviews() I must create the relationship in my entity model class (data source, in this case Eloquent) then I must declare the method in the repo interface to make it required for any other implementation, then I must write the actual method in the repo implementation, then I have to create another method on the service that calls on the repo to extract all reviews for the store... (intentional run-on sentence) It feels like too much.
Creating a new model now isn't as simple as extending a base model class anymore. There are so many files I have to write and keep track of. Sometimes I'll get confused as of to where exactly I should put something, or find halfway throught setting up a method that I'm in the wrong class. I also lost Eloquent's query building in the service. Everytime I need something that Eloquent has, I have to implement it in the repo and the service.
The idea behind this architecture is awesome but the actual implementation I am finding extremely tedious. Is there a better, faster way to do things? I feel I'm beeing too messy, even though I put common methods and stuff in abstract classes. There's just too much to write.
I've wrestled with all this stuff as I moved to Laravel 5. That's when I decided to change my approach (it was tough decision). During this process I've come to the following conclusions:
I've decided to drop Eloquent (and the Active Record pattern). I don't even use the query builder. I do use the DB fascade still, as it's handy for things like parameterized query binding, transactions, logging, etc. Developers should know SQL, and if they are required to know it, then why force another layer of abstraction on them (a layer that cannot replace SQL fully or efficiently). And remember, the bridge from the OOP world to the Relational Database world is never going to be pretty. Bear with me, keeping reading...
Because of #1, I switched to Lumen where Eloquent is turned off by default. It's fast, lean, and still does everything I needed and loved in Laravel.
Each query fits in one of two categories (I suppose this is a form of CQRS):
3.1. Repositories (commands): These deal with changing state (writes) and situations where you need to hydrate an object and apply some rules before changing state (sometimes you have to do some reads to make a write) (also sometimes you do bulk writes and hydration may not be efficient, so just create repository methods that do this too). So I have a folder called "Domain" (for Domain Driven Design) and inside are more folders each representing how I think of my business domain. With each entity I have a paired repository. An entity here is a class that is like what others may call a "model", it holds properties and has methods that help me keep the properties valid or do work on them that will be eventually persisted in the repository. The repository is a class with a bunch of methods that represent all the types of querying I need to do that relates to that entity (ie. $repo->save()). The methods may accept a few parameters (to allow for a bit of dynamic query action inside, but not too much) and inside you'll find the raw queries and some code to hydrate the entities. You'll find that repositories typically accept and/or return entities.
3.2. Queries (a.k.a. screens?): I have a folder called "Queries" where I have different classes of methods that inside have raw queries to perform display work. The classes kind of just help for grouping together things but aren't the same as Repositories (ie. they don't do hydrating, writes, return entities, etc.). The goal is to use these for reads and most display purposes.
Don't interface so unnecessarily. Interfaces are good for polymorphic situations where you need them. Situations where you know you will be switching between multiple implementations. They are unneeded extra work when you are working 1:1. Plus, it's easy to take a class and turn it into an interface later. You never want to over optimize prematurely.
Because of #4, you don't need lots of service providers. I think it would be overkill to have a service provider for all my repositories.
If the almost mythological time comes when you want to switch out database engines, then all you have to do is go to two places. The two places mentioned in #3 above. You replace the raw queries inside. This is good, since you have a list of all the persistence methods your app needs. You can tailor each raw query inside those methods to work with the new data-store in the unique way that data-store calls for. The method stays the same but the internal querying gets changed. It is important to remember that the work needed to change out a database will obviously grow as your app grows but the complexity in your app has to go somewhere. Each raw query represents complexity. But you've encapsulated these raw queries, so you've done the best to shield the rest of your app!
I'm successfully using this approach inspired by DDD concepts. Once you are utilizing the repository approach then there is little need to use Eloquent IMHO. And I find I'm not writing extra stuff (as you mention in your question), all while still keeping my app flexible for future changes. Here is another approach from a fellow Artisan (although I don't necessarily agree with using Doctrine ORM). Good Luck and Happy Coding!
Laravel's Eloquent is an Active Record, this technology demands a lot of processing. Domain entities are understood as plain objects, for that purpose try to utilizes Doctrime ORM. I built a facilitator for use Lumen and doctrine ORM follow the link.
https://github.com/davists/Lumen-Doctrine-DDD-Generator
*for acurated perfomance analisys there is cachegrind.
http://kcachegrind.sourceforge.net/html/Home.html

Laravel4: Will not using the repository pattern hurt my project in the long run?

I'm reading Taylor Otwell's book in which he suggests using the repository pattern. I get the theory behind it. That it's easy to switch implementations and decouples your code. I get the code too. And it is very nice to be able to switch by App::bind() with some other implementation. But I've been thinking for the last two hours how I should approach things for a new CRM I'm building. Almost no code yet but it might end up big.
I would prefer to simply use eloquent models and collection injected through the controller. That should make everything testable if I'm not mistaken.. . Allowing for mocks and such.
Does the repository pattern maybe offer any benefits towards scalability? In cases when more servers or resources are needed..
And why would anybody want to replace the Eloquent ORM once committed to its use? I mean most projects need a relational database. If I would build a UserRepositoryInterface and an implementation DbUserReponsitory, I cannot see when I would ever need or want to switch that with something else. This would count for most entities in my project (order, company, product, ...). Maybe to use another ORM to replace Eloquent? But I don't see myself doing this for this project.
Or am I missing something? My main concern is to have code that is readable and easy to test. Haven't really been testing that well up to now :)
It does not necessarily mean that ignoring the Repository pattern will give you headaches on the long run, but it certainly makes re-factoring code much easier thus you can easily write Test repositories and bind them with your controllers to easily test your business logic.
So even if it is not likely that you will replace Eloquent or any other part, encapsulating chunks of logic and operations in repositories will still be beneficial for you.
Dependency injection is nothing new in terms of design patterns, will it hurt your code, not necessarily. Will it hurt your ability to easily write / re-factor old code and swap it in easily... absolutely it will.
Design patterns exist as solutions to common problems. I would tell you straight away that ignoring the IoC container in Laravel and not dependency injecting will have a big impact on you later down the line if you're writing complex applications. Take for example that you may want different bindings based on request type, user role / group, or any other combination of factors, by swapping out the classes without impacting your calling code will be beyond invaluable to you.
My advice, do not ignore the IoC container at all.
As a sidenote, Service Providers are also your friend, I'd suggest you get well acquainted with them if you want to write great apps in Laravel 4.

first time TDD help needed

Its (almost) my first time trying to create code by TDD principles. But i'm having troubles how to start.
In example: I want to mutate some information about a person.
To make it easy, a person has only these values:
- FirstName
- LastName
- Email
What i need at the end:
- A person DTO
- A person entity (Nhibernate)
- Functionality to store the dto values in the Database. At the end i need to return a succes or an error (possibly a boolean).
With the given information, how to start at all? It's a global question, but that's because i have no clue how to start. I've red many articles but somehow i get stuck already.
Edit:
- I'm using MVC: MVC will give a DTO (filled from form fields) back.
So the MVC start call could be something like this:
public JsonResult MutatePerson(PersonDto person){
//Call functions by TDD here
return Json(true);
}
You've described the objects involved, but not the operations.
Presumably you need a read() operation, a write() operation. Perhaps a list() operation ?
All of the above should have tests associated with them e.g.
testCanReadViaId();
testCanThrowExceptionOnReadingInvalidId();
testCanWrite();
etc.
For a lot of work you should mock your datasources etc. and position (say) hardcoded data under your persistenc elayer such that you don't rely on a database. However for something like the above I would definitely test the basic database interaction.
As such you may need to (initially) point a test suite against a database with canned data. If you want to be more flexible, then your test setup code could write the entities into the database first, prior to running the tests.
Your tests should test different permutations of data and operations e.g. in the above I've suggested a test to read an object via a valid id (say, 1) and a similar operation against an invalid id (say, -1). You may also want to check different data combinations (e.g. does everything work if the email address isn't populated - this may be valid if the database column is nullable)
Using TDD, you should use interfaces as arguments. Interfaces can be mocked, and with a mock you test the MutatePerson method, and ONLY that. In a unit test, you only want to test how a method reacts to input, not how the object reacts to the method. If you test how the DTO object behaves as well, you are writing an integration test.
So, use the interface of PersonDto (create one if it doesn't exist). And use that as method argument instead of the concrete class.
It might be just me, but I have the feeling that you're starting off with little idea of what your global system should look like in terms of layers, modules and the dependencies and interactions between them.
TDD's emergent design sure works at the level of a small object graph, but you won't get away without doing some amount of overall architectural design first (not big upfront design, but enough to get you started).
With that in mind, I think you'll have a far better idea of what to test.
Once you've figured that out, I think you should :
Learn about object-oriented unit testing techniques, and by unit testing I mean testing things in isolation. Roy Osherove's Art Of Unit Testing is an excellent place to start for a .NET developer.
Learn about architecture-level TDD strategies. With the articles you read you certainly got an idea of how to do TDD in the small, but you need a more global approach : what should you TDD first, in what order, etc. A book like GOOS might help you in that department.

TDD: why, how and real world test driven code

First, Please bear with me with all my questions. I have never used TDD before but more and more I come to realize that I should. I have read a lot of posts and how to guides on TDD but some things are still not clear. Most example used for demonstration are either math calculation or some other simple operations. I also started reading Roy Osherove's book about TDD. Here are some questions I have:
If you have an object in your solution, for instance an Account class, what is the benefit of testing setting a property on it, for example an account name, then you Assert that whatever you set is right. Would this ever fail?
Another example, an account balance, you create an object with balance 300 then you assert that the balance is actually 300. How would that ever fail? What would I be testing here? I can see testing a subtraction operation with different input parameters would be more of a good test.
What should I actually test my objects for? methods or properties? sometime you also have objects as service in an infrastructure layer. In the case of methods, if you have a three tier app and the business layer is calling the data layer for some data. What gets tested in that case? the parameters? the data object not being null? what about in the case of services?
Then on to my question regarding real life project, if you have a green project and you want to start it with TDD. What do you start with first? do you divide your project into features then tdd each one or do you actually pick arbitrarily and you go from there.
For example, I have a new project and it requires a login capability. Do I start with creating User tests or Account tests or Login tests. Which one I start with first? What do I test in that class first?
Let's say I decide to create a User class that has a username and password and some other properties. I'm supposed to create the test first, fix all build error, run the test for it to fail then fix again to get a green light then refactor. So what are the first tests I should create on that class? For example, is it:
Username_Length_Greater_Than_6
Username_Length_Less_Than_12
Password_Complexity
If you assert that length is greater than 6, how is that testing the code? do we test that we throw an error if it's less than 6?
I am sorry if I was repetitive with my questions. I'm just trying to get started with TDD and I have not been able to have a mindset change. Thank you and hopefully someone can help me determine what am I missing here. By the way, does anyone know of any discussion groups or chats regarding TDD that I can join?
Have a look at low-level BDD. This post by Dan North introduces it quite well.
Rather than testing properties, think about the behavior you're looking for. For instance:
Account Behavior:
should allow a user to choose the account name
should allow funds to be added to the account
User Registration Behavior:
should ensure that all usernames are between 6 and 12 characters
should ask the password checker if the password is complex enough <-- you'd use a mock here
These would then become tests for each class, with the "should" becoming the test name. Each test is an example of how the class can be used valuably. Instead of testing methods and properties, you're showing someone else (or your future self) why the class is valuable and how to change it safely.
We also do something in BDD called "outside-in". So start with the GUI (or normally the controller / presenter, since we don't often unit-test the GUI).
You already know how the GUI will use the controller. Now write an example of that. You'll probably have more than one aspect of behavior, so write more examples until the controller works. The controller will have a number of collaborating classes that you haven't written yet, so mock those out - just dependency inject them via an interface. You can write them later.
When you've finished with the controller, replace the next thing you've mocked out in the real system by real code, and test-drive that. Oh, and don't bother mocking out domain objects (like Account) - it'll be a pain in the neck - but do inject any complex behavior into them and mock that out instead.
This way, you're always writing the interface that you wish you had - something that's easy to use - for every class. You're describing the behavior of that class and providing some examples of how to use it. You're making it safe and easy to change, and the appropriate design will emerge (feel free to be guided by patterns, thoughtful common sense and experience).
BTW, with Login, I tend to work out what the user wants to log in for, then code that first. Add Login later - it's usually not very risky and doesn't change much once it's written, so you may not even need to unit-test it. Up to you.
Test until fear is replaced by boredom. Property accessors and constructors are high cost to benefit to write tests against. I usually test them indirectly as part of some other (higher) test.
For a new project, I'd recommend looking at ATDD. Find a user-story that you want to pick first (based on user value). Write an acceptance test that should pass when the user story is done. Now drill down into the types that you'd need to get the AT to pass -- using TDD. The acceptance test will tell you which objects and what behaviors are required. You then implement them one at a time using TDD. When all your tests (incl your acc. test) pass - you pick up the next user story and repeat.
Let's say you pick 'Create user' as your first story. Then you write examples of how that should work. Turn them into automated acceptance tests.
create valid user -> account should be created
create invalid user ( diff combinations that show what is invalid ) -> account shouldn't be created, helpful error shown to the user
AccountsVM.CreateUser(username, password)
AccountsVM.HasUser(username)
AccountsVM.ErrorMessage
The test would show that you need the above. You then go test-drive them them out.
Don't test what is too simple to break.
getters and setters are too simple to be broken, so said, the code is so simple that an error can not happen.
you test the public methods and assert the response is as expected. If the method return void you have to test "collateral consequences" (sometimes is not easy, eg to test a email was sent). When this happens you can use mocks to test not the response but how the method executes (you ask the mockk if the Class Under Test called him the desired way)
I start doing Katas to learn the basics: JUnit and TestNG; then Harmcrest; then read EasyMock or Mockito documentation.
Look for katas at github, or here
http://codekata.pragprog.com
http://codingdojo.org/
The first test should be the easiest one! Maybe one that just force you to create the CUT (class under test)
But again, try katas!
http://codingdojo.org/cgi-bin/wiki.pl?KataFizzBuzz

Confused on TDD wrappers/adapters

I'm new to the TDD scene and trying to isolate my tests is having me go around in circles.
I have an app I am trying to write that uses OpenXML so it has a mass of objects that it depends on to work from an external framework.
I thought it would be a good idea to have wrappers around these objects so I was isolated from them in case of changes etc.
My problem is that to represent something like a Cell, I am passing in a real Cell into my wrapper (so it has something to wrap) in the constructor.
To test this wrapper, I have to pass in a real Cell from the OpenXML framework. Ok that's do-able but I also wanted to pass in a SharedStringTablePart to the constructor so it could store the string value (if it was a sharedstring) for easy retrieval.
SharedStringTablePart has a protected constructor so you can't just create one on the fly in a test to test with.
Sooo, I create a wrapper for that too but how do I test this new wrapper? I can't pass in a SharedStringTablePart via dependency injection as I can't construct one.
I have to talk to the 3rd party interface at some point in my app architecture don't I and test that layer?
Do I just create wrappers and not bother with the TDD part of them and just assume they will work if I pass through the same requests and respond with the same responses the actual wrapped object would expect/do?
Not that it makes any difference at this level but I am using c#
thanks
Nev
That's the problem with integration code, it doesn't fit well unit testing.
Above doesn't mean you are to drop unit testing all around, just not for integration code.
The way you are designing those wrappers is tightly coupling them to the external objects. imho that's just shifting the problem / moving the code around.
Don't receive the external objects in the constructor and do the mapping in there. Pull that out of all those objects, instead handle it in code which only responsibility is to map the representation from the external system to the internal representation.
That's the way you'd keep the dependency from the rest of your code. I'd have some code that's responsible from communicating with the third party library, that code doesn't expose any of the types you want to keep from the rest of the system. Also doesn't hide it inside other objects, it either maps them directly or calls code that maps them.

Resources