Uncle Bob says:
"Defensive programming, in non-public APIs, is a smell, and a symptom, of teams that don't do TDD."
I am wondering how TDD can prevent an (internal) function from being used in an unintended way. I don't think TDD can prevent it; it merely shows that the function is currently used correctly, because the calling function is covered by its passing unit tests.
When developing a new feature using the (non-defensive) function, this feature is also developed with TDD, so unintended use of the function will fail the new feature's tests.
So using TDD to drive new features will force you to correctly use (internal) functions.
Do you think that is what is meant by Uncle Bob's tweet?
So using TDD to drive new features will force you to correctly use (internal) functions.
Exactly. But keep in mind the subtle "gap" here: you should use TDD to write (unit) tests that test the contract of your public methods. You do not care about the implementation of these methods - that is all internal implementation detail.
Therefore: if your "new" code uses an existing method in an unintended way you are "told" because an exception is thrown or you receive an unexpected result.
That is what I mean by "gap": you see, the above describes a black-box testing approach. You have a public method X, and you verify its public contract. Compare that to white-box testing, where you write tests to cover all paths taken within X. When doing that, you might notice: "OK, to test that one condition in my internal method, I would have to feed in this special data".
But as said, I think you should go for black-box testing; white-box tests might break easily when refactoring internal methods.
And there is an additional dimension here: keep in mind that, ideally, implementing a new feature means adding code rather than changing existing code. In other words, adding new features takes place mainly by writing new classes and methods. This means your new code has no way of using existing private internal methods, because you are working within a new class. Put differently: if you regularly run into situations where your internal methods are used in many different ways, then you are probably doing something wrong.
The ideal path is: you implement a new requirement by creating a set of new classes. Later on, you have to add other requirements - by writing more classes.
In that ideal path - there is no need for defensive programming within internal methods. Because you exactly understand each use case for such internal methods!
Thus, the conclusion is: avoid defensive programming in internal methods. Make sure that your public APIs check all pre-conditions, so they fail (as fast as possible) if there is a problem. Try to avoid these internal consistency checks - as they tend to bloat your code - and rest assured: in 5 weeks or 5 months you will not remember if you really needed that check, or if it is just "defensive".
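To make that concrete, here is a minimal sketch (hypothetical AccountService, not taken from the question) of checking preconditions at the public boundary while leaving the internal method free of defensive checks:

using System;

public class AccountService
{
    private decimal balance;

    public void Deposit(decimal amount)
    {
        // Fail as fast as possible, at the public boundary.
        if (amount <= 0)
            throw new ArgumentOutOfRangeException(nameof(amount), "Deposit must be positive.");

        ApplyDeposit(amount);
    }

    // No defensive check here: every caller is our own, test-covered code.
    private void ApplyDeposit(decimal amount)
    {
        balance += amount;
    }
}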
One way to answer this is to look at what else Uncle Bob has had to say on the topic. For example:
In a system with meager code coverage, few tests, and lots of tangled legacy code, defensive programming should be the rule.
In a system born of TDD, with 90+% coverage and highly reliable, well-maintained unit tests, defensive programming should be the exception.
From this, we can infer his main argument -- if the defensive checks are actually providing a benefit, then that is a hint that we are missing some constraints. If we are missing some constraints, and all the tests are passing, then we must also be missing some tests.
Or, to express the same idea in a slightly different way -- the constraints implied by the defensive patterns in your implementation belong closer to the boundary (ie, in the public API).
If there are constraints, for example, to limit what data is allowed to pass through the boundary, then there should be tests to ensure that the boundary actually implements the constraints.
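As a minimal sketch (hypothetical OrderService, NUnit assumed), such a constraint would be pinned down by a test against the public API rather than by a check buried in a private helper:

using System;
using NUnit.Framework;

public class OrderService
{
    public void Place(int quantity)
    {
        // The boundary itself enforces the constraint.
        if (quantity <= 0)
            throw new ArgumentException("Quantity must be positive.", nameof(quantity));
        // ... place the order ...
    }
}

[TestFixture]
public class OrderServiceTests
{
    [Test]
    public void PlacingAnOrderWithNonPositiveQuantityIsRejected()
    {
        var orders = new OrderService();
        Assert.Throws<ArgumentException>(() => orders.Place(quantity: 0));
    }
}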
When you use TDD properly, you cover all the possible cases and assert that your public functions, which call the private ones, respond as expected not only for the happy scenario but for all the different possible scenarios. When you use defensive programming in your private methods, you are actually preparing for those (different possible) scenarios mentioned above.
Personally, I do not think defensive programming is bad, even in private methods; however, based on my description above, I see it as an unnecessary duplication of effort that also undermines the point of TDD, because you end up handling these special cases by complicating the code instead of writing it in a way that is proof against them.
I am trying to follow TDD. So here is my problem:
I have an interface Risk with a method:
boolean check(...)
Risk1 and Risk2 are implementations developed test first, so they are now fully covered.
I decided that the unit that checks all risks (CompositeRisk) could also implement Risk.
CompositeRisk applies OR to the results of Risk1 and Risk2 (if one risk is true, then the whole thing is risky). Still, everything is test first.
Now I am looking at one of the risks and thinking: this one has the word "AND" in it and checks different fields. It seems that I can split it into two objects and create one more composite, CompositeAndRisk, which would apply AND to both split risks. This way I could construct a DSL for a risk decision tree (which seems nice, because the risk rules could change a lot).
So what should I do with the tests of the risk I split? Should I rename them to CompositeAndRiskTest? Should I delete them? Should I write tests for the split classes?
First of all, I suggest that you turn the CompositeRisk class into an interface, and have two separate classes implement it: CompositeOrRisk and CompositeAndRisk. This is just about the design, though.
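A rough sketch of how that design could look in C# (the check(...) parameters are elided in the question, so Check() is simplified to take no arguments here):

using System.Linq;

public interface IRisk
{
    bool Check();
}

// CompositeRisk reduced to an interface, as suggested above.
public interface ICompositeRisk : IRisk
{
}

public class CompositeOrRisk : ICompositeRisk
{
    private readonly IRisk[] risks;
    public CompositeOrRisk(params IRisk[] risks) { this.risks = risks; }

    // Risky if any contained risk reports risky.
    public bool Check() => risks.Any(r => r.Check());
}

public class CompositeAndRisk : ICompositeRisk
{
    private readonly IRisk[] risks;
    public CompositeAndRisk(params IRisk[] risks) { this.risks = risks; }

    // Risky only if all contained risks report risky.
    public bool Check() => risks.All(r => r.Check());
}

Either composite can contain the other, which is exactly the nested decision tree the question is after.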
Regarding your question, I believe there's no single right answer, so let me share how I see it.
As you know, in TDD there are concrete steps you follow (that comprise the TDD cycle), and there's a specific state the tests should be at in between each of them. Here's what I mean:
[State = No tests]
1. Write a test that fails
[State = Test fails]
2. Write as little code as possible in order for the test to pass
[State = Test passes]
3. Refactor
[State = Test still passes]
Given that this is what we aim for in TDD, I would do the changes you're talking about in the refactoring phase, including refactoring the tests accordingly.
This means that if I'm splitting a class, I'll be splitting the relevant test as well. At no point should the tests fail, as I'm only changing the structure of the code, not what it does (this is the meaning of refactoring after all).
If you have a larger change to do though, I would go about creating a new class from scratch (TDD of course), and later on, remove the no longer needed functionality from the old class, as well as the now redundant test cases.
The approach I'd take in this case is "play it innocent" -- when you discover a new requirement, just write a test and the implementation for it, pretending to ignore the relationship with previous requirements at first.
The "And" case here is clearly new functionality. No need to modify the contents of the existing test at that point, just create another test with a name that reflects the new requirement, such as CompositeAndRiskTest and create the corresponding implementation.
Then, during the Refactor step, "realize" that the two previous objects are 2 sides of the same coin and refactor them accordingly. That could just mean renaming CompositeRisk to CompositeOrRisk, or more complex things.
Once the 2 sorts of Risks are identified, tested and implemented, you could go on and create new tests for combinations of them.
I'm using VS 2012, but that's not really important.
What is important is that I'm trying to do some TDD by writing all my tests first and then creating the code.
However, the app will not compile because none of my objects or methods exist.
Now, to my mind, I should be able to create ALL my tests but still run my app so I can debug etc. The tests shouldn't prevent compilation because objects and methods are missing.
I thought the whole point of it was that as you develop your tests you can begin to see duplications etc so that you can refactor before writing a single line of code.
So the question is, is there a way to do this or am I doing this wrong?
EDIT
I am using VS2012 and C#
I see a small problem with
writing all my tests first and then creating the code.
You don't need to write ALL your tests first; you just need one. Make it fail, make it pass, and repeat. That means at any point you should ideally have at most one failing test.
A compile failure counts as a failed test in that sense. So the next step is to make it pass - i.e. add stubs or return default values to make it compile. The test would then be red; then work on getting it to green.
Test Driven Development is about very small iterations. You don't define all your tests up front. You create one test based on one fraction of one requirement. Then you implement the code to pass that test. Once it's passing, you work on another fraction of a requirement.
The idea is that trying to do all the design up front (whether it be creating detailed class diagrams or creating a bunch of tests) means that you will find it too expensive to change a weakness in your design, so you won't improve your code.
Here's an example. Let's say you decide to use inheritance to relate two objects, but when you started implementing the objects, you found that made testing them tough. You discover it would be much easier to test each object individually, and relate them via containment instead. What is happening is the tests are driving your design in a more loosely coupled direction. This is a very good outcome of TDD - you are using tests to improve the design.
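As an illustration of that switch from inheritance to containment, a minimal sketch (hypothetical Report/IFormatter names, not from the original example):

// Instead of "class Report : Formatter", the report holds a formatter it is
// given, so a test can hand it a trivial fake and exercise Report alone.
public interface IFormatter
{
    string Format(string text);
}

public class Report
{
    private readonly IFormatter formatter;

    public Report(IFormatter formatter)
    {
        this.formatter = formatter;
    }

    public string Render(string body)
    {
        return formatter.Format(body);
    }
}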
If you had written all your tests in advance assuming your design decision of inheritance was a good choice, you would either throw away a lot of work, or you would say "it's too tough to make a change like that now, so I'll just live with this sub-optimal design instead."
You can certainly create business-rule-related acceptance tests in advance. Those are called behavior tests (part of Behavior Driven Development, or BDD) and they are good to test features of the software from the user's point of view. But those are NOT unit tests. Unit tests are for testing code from the developer's point of view. Creating the unit tests in advance defeats the purpose of TDD, because it will make testing harder, it will prevent you from improving your code, and will often lead to rebellion and failure of the practice. That's why it's important to do it right.
What is important is that I'm trying to do some TDD by writing all my tests first and then creating the code.
The problem is that "writing all my tests first" is most emphatically not "do[ing] some TDD". Test driven development consists of lots of small repetitions of the "red-green-refactor" cycle:
Add a unit test to the test suite, run it and watch it fail (red)
Add just enough code to the system under test to make all the tests pass (green)
Improve the design of the system under test (typically by removing duplication) while keeping all the tests passing (refactor)
If you code an entire huge test suite up front, you'll spend forever trying to get to the "green" (all tests passing) state.
However, the app will not compile because none of my objects or methods exist.
That's typical of any compiled language; it's not a TDD issue per se. All it means is that, in order to watch the new test fail, you may have to write a minimal stub for whatever feature you're currently working on to make the compiler happy.
For example, I might write this test (using NUnit):
[Test]
public void DefaultGreetingIsHelloWorld()
{
WorldGreeter target = new WorldGreeter();
string expected = "Hello, world!";
string actual = target.Greet();
Assert.AreEqual(expected, actual);
}
And I'd have to then write this much code to get the app to compile and the test to fail:
public class WorldGreeter
{
public string Greet()
{
return String.Empty;
}
}
Once I've gotten the solution to build and I've seen the one failing test, I can add the code to make the first test pass:
public string Greet()
{
return "Hello, world!";
}
Once all tests pass, I can look through the system under test and see what could be done to improve the design. However, it's essential to the TDD discipline to go through both the "red" and "green" steps before playing around with refactoring.
I thought the whole point of it was that as you develop your tests you can begin to see duplications etc so that you can refactor before writing a single line of code.
Martin Fowler defines refactoring as "a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior" (emphasis added). If you haven't written a single line of code, there's nothing to refactor.
So the question is, is there a way to do this or am I doing this wrong?
If you're looking to do TDD, then, yes, I fear you are doing this wrong. You may well be able to deliver great code doing what you're doing, but it isn't TDD; whether or not that's a problem is for you to decide for yourself.
You should be able to create your empty class with stub functions, no?
class Bongo;  // forward declaration so the stubs compile

class Whatever {
public:
    char *foo( const char *name ) { return nullptr; }  // stub
    int can_wibble( Bongo *myBongo ) { return 0; }      // stub
};
Then you can compile.
No. It's about coding just enough to verify the implementation of the required use cases.
You can define your test cases early, but to code them you iteratively write a test and watch it fail, then write some code that makes that test pass.
Then rinse and repeat until all your test cases are covered.
Edit to address comment.
As you build out the code, that's where your design decisions and faults are identified. Extreme programming lends itself to being able to change code without fear, because the test base protects your requirements. Your intentions are good, but the reality is that you'll refactor/redesign the test cases as you discover design issues and flaws while building out the code and test base.
However, IMHO, in the very general case, a test that doesn't compile is effectively a failing meta-test that needs to be corrected before moving on. It's telling you to write some code!
Use mocks. From Wikipedia: mock objects are simulated objects that mimic the behavior of real objects in controlled ways. A programmer typically creates a mock object to test the behavior of some other object, in much the same way that a car designer uses a crash test dummy to simulate the dynamic behavior of a human in vehicle impacts.
Please refer to this.
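As a small sketch of the idea (hand-rolled mock, hypothetical IMailSender/NotificationService names, NUnit assumed):

using NUnit.Framework;

public interface IMailSender
{
    void Send(string to, string message);
}

// The mock stands in for the real mail sender and merely records calls.
public class MockMailSender : IMailSender
{
    public int SendCount;
    public void Send(string to, string message) { SendCount++; }
}

// Minimal system under test so the sketch is complete.
public class NotificationService
{
    private readonly IMailSender mailSender;
    public NotificationService(IMailSender mailSender) { this.mailSender = mailSender; }
    public void Notify(string to, string message) { mailSender.Send(to, message); }
}

[TestFixture]
public class NotificationServiceTests
{
    [Test]
    public void NotifyingAUserSendsExactlyOneMail()
    {
        var mailSender = new MockMailSender();
        var service = new NotificationService(mailSender);

        service.Notify("bob@example.com", "hello");

        Assert.AreEqual(1, mailSender.SendCount);
    }
}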
I found this approach, which uses dynamic for objects that don't exist yet:
https://coderwall.com/p/0uzmdw
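The linked post isn't reproduced here, but the general idea might look something like this (hypothetical MyApp.Greeter type, resolved by name so the test compiles even though the class doesn't exist yet):

using System;
using NUnit.Framework;

[TestFixture]
public class GreeterTests
{
    [Test]
    public void GreeterSaysHelloBeforeTheClassEvenExists()
    {
        // Looking the type up by string keeps the test compiling; it simply
        // fails (red) until MyApp.Greeter is actually written.
        Type greeterType = Type.GetType("MyApp.Greeter, MyApp");
        Assert.IsNotNull(greeterType, "Greeter has not been implemented yet");

        dynamic greeter = Activator.CreateInstance(greeterType);
        Assert.AreEqual("Hello, world!", greeter.Greet());
    }
}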
I feel pretty skilled in TDD, and I'm even considered the "TDD expert" in my company, but nevertheless there are some cases that I feel I don't know how to handle properly, so I would like to hear others' opinions.
My problem is as follows:
Even though in general TDD helps me think of the core responsibility of a class, and extract every other responsibility into dependent classes, there are cases where, after some time, I realize that one of the classes has multiple responsibilities and needs to be refactored and split into 2 classes. This conclusion often comes because the tests of that class start to become complicated or repetitive. I can pretty easily refactor to split this class into the design I want (and I do it in small steps, staying on the green bar). My problem is that I end up with the same complicated and repetitive tests, which now test the 2 classes together, while I would like to have separate tests for each class.
The only (more-or-less safe) manner I could think of for doing that is to do the following for each test (after I have completed the refactoring of the production code):
Duplicate the test case
Change one copy of the test to use a mock instead of the 1st class, and the other copy of the test to use a mock instead of the 2nd class.
Then if I see that an identical test already exists for one of the copies, I delete it.
I think that sometimes it's possible to do the following:
Start by creating the 2 classes from scratch (using TDD of course)
Change the old tests to use the new classes instead of the old one
Delete the old class
Delete the old tests
Both of these techniques seem pretty cumbersome and time-consuming, so I wonder: how do the "real experts" go about this issue?
Without an actual example I can't be sure I know what exactly you mean. But it sounds like you try to test every class (and maybe even every method) in isolation.
When I get to a point where I want to/have to split a class into multiple classes, I tend to still view the resulting collection of classes as a unit and test it as a whole. Only when they stop building a functional whole and start to become independent units, I test them independently of each other.
I can pretty easily refactor to split this class into the design I want (and I do it in small steps, staying on the green bar). My problem is that I end up with the same complicated and repetitive tests, which now test the 2 classes together, while I would like to have separate tests for each class.
I've gotten to this point as well. Here I start refactoring the tests, using the same techniques as for the non-test code - convert variable to field, move field, extract method, move variable, etc etc. Naming is of course very important and provides a lot of design guidance.
eg http://www.kdgregory.com/index.php?page=junit.refactoring
eg http://www.natpryce.com/articles/000686.html
eg http://www-public.it-sudparis.eu/~gibson/Teaching/CSC7302/ReadingMaterial/vanDeursenMdenBK01.pdf
That last article has some example smells and refactorings common to refactoring tests specifically.
I start with asking myself (as you have) what are the responsibilities of a class. Let's say for example that your class is responsible to aggregate weather data and generate a weather report.
At this point I make three (3) lists:
Data aggregation members (attributes, behaviors)
Report generation members
Common members
The first two are easy: the members that belong exclusively to one responsibility become part of the corresponding new class. I will keep the original dual-responsibility class as a facade whose members are a pass-through to the new classes, so that tests and functionality are not broken while refactoring. Depending on circumstances, I may eventually remove the facade and refactor the tests and dependent objects to use the new classes.
As for the members that are common to both responsibilities - I will move them to a helper class (usually scoped as internal) that the new classes (and any others) may use. The functionality has proven to be reused, and may be reused again. Note that the common members might not necessarily all land in one helper class; the helping functionality might be added to one new class, or to multiple classes (depending, of course, on responsibilities), and some functionality may be added to existing helper classes, if one fits the bill.
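A sketch of what that intermediate state could look like (hypothetical weather-related names, following the example above):

using System.Collections.Generic;
using System.Linq;

public class WeatherDataAggregator
{
    public IReadOnlyList<double> Aggregate(IEnumerable<double> readings)
    {
        return readings.ToList();
    }
}

public class WeatherReportGenerator
{
    public string Generate(IReadOnlyList<double> data)
    {
        return string.Format("Average temperature: {0:F1}", data.Average());
    }
}

// The original dual-responsibility class survives as a thin pass-through
// facade, so existing tests and callers keep working during the refactoring.
public class WeatherService
{
    private readonly WeatherDataAggregator aggregator = new WeatherDataAggregator();
    private readonly WeatherReportGenerator generator = new WeatherReportGenerator();

    public string BuildReport(IEnumerable<double> readings)
    {
        return generator.Generate(aggregator.Aggregate(readings));
    }
}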
I wondered about this a while back, and couldn't really find a satisfactory answer. Here are some discussions I found on the topic:
http://tech.groups.yahoo.com/group/testdrivendevelopment/message/27199
and
http://tech.groups.yahoo.com/group/testdrivendevelopment/message/16227
Personally, I've adopted a "hair-trigger" approach to moving responsibilites into dependencies, and while "spinning off" a new dependency before there is a clear need for it smacks of YAGNI, I've found that re-absorbing a dependency that turned out to be too anemic to warrant being a separate class is much easier than the rigmarole involved with splitting out a separate class from a class that already has a significant battery of tests written for it.
Edit:
Oh - and I should probably point out that I'm not at all a "real expert" ;)
Since I'm a TDD newbie, I'm currently developing a tiny C# console application in order to practice (because practice makes perfect, right?). I started by making a simple sketch of how the application could be organized (class-wise) and started developing all the domain classes I could identify, one by one (test first, of course).
In the end, the classes have to be integrated together in order to make the application runnable, i.e. placing necessary code in the Main method which calls the necessary logic. However, I don't see how I can do this last integration step in a "test first" manner.
I suppose I wouldn't be having these issues had I used a "top down" approach. The question is: how would I do that? Should I have started by testing the Main() method?
If anyone could give me some pointers, it will be much appreciated.
"Top-down" is already used in computing to describe an analysis technique. I suggest using the term "outside-in" instead.
Outside-in is a term from BDD, in which we recognise that there are often multiple user interfaces to a system. Users can be other systems as well as people. A BDD approach is similar to a TDD approach; it might help you so I will describe it briefly.
In BDD we start with a scenario - usually a simple example of a user using the system. Conversations around the scenarios help us work out what the system should really do. We write a user interface, and we can automate the scenarios against that user interface if we want.
When we write the user interface, we keep it as thin as possible. The user interface will use another class - a controller, viewmodel, etc. - for which we can define an API.
At this stage, the API will be either an empty class or a (programming) interface. Now we can write examples of how the user interface might use this controller, and show how the controller delivers value.
The examples also show the scope of the controller's responsibility, and how it delegates its responsibilities to other classes like repositories, services, etc. We can express that delegation using mocks. We then write that class to make the examples (unit tests) work. We write just enough examples to make our system-level scenarios pass.
I find it's common to rework mocked examples as the interfaces of the mocks are only guessed at first, then emerge more fully as the class is written. That will help us to define the next layer of interfaces or APIs, for which we describe more examples, and so on until no more mocks are needed and the first scenario passes.
As we describe more scenarios, we create different behaviour in the classes and we can refactor to remove duplication where different scenarios and user interfaces require similar behaviour.
By doing this in an outside-in fashion we get as much information as to what the APIs should be as possible, and rework those APIs as quickly as we can. This fits with the principles of Real Options (never commit early unless you know why). We don't create anything we don't use, and the APIs themselves are designed for usability - rather than for ease of writing. The code tends to be written in a more natural style using domain language more than programming language, making it more readable. Since code is read about 10x more than it's written, this also helps make it maintainable.
For that reason, I'd use an outside-in approach, rather than the intelligent guesswork of bottom-up. My experience is that it results in simpler, more strongly decoupled, more readable and more maintainable code.
If you moved stuff out of main(), could you not test that function?
It makes sense to do that, since you might want to run with different cmd-args and you'd want to test those.
You can test your console application as well. I found it quite hard, look at this example:
[TestMethod]
public void ValidateConsoleOutput()
{
using (StringWriter sw = new StringWriter())
{
Console.SetOut(sw);
ConsoleUser cu = new ConsoleUser();
cu.DoWork();
string expected = string.Format("Ploeh{0}", Environment.NewLine);
Assert.AreEqual<string>(expected, sw.ToString());
}
}
You can look at the full post here.
"bottom up" can save you a lot of work cos you already have low-level classes (tested) and you can use them in higher level tests. in opposite case you'd have to write a lot of (probably complicated) mockups. also "top down" often doesn't work as it's supposed: you think out some nice high-level model, test and implement it, and then moving down you realize that some of your assumptions were wrong (it's quite typical situation even for experienced programmers). of course patterns help here but it's not silver bullet too.
to have complete test coverage, you'll need to test your Main in any case. I'm not sure it's worth the effort in all cases. Just move all logic from Main to separate function (as Marcus Lindblom proposed), test it and let Main to just pass command-line arguments to your function.
console output is a simplest possible testing ability, and if you use TDD it can be used just to output the final result w/o any diagnosis and debug messages. so make your top-level function to return solid result, test it and just output it in Main
I recommend "top down" (I call that high level testing). I wrote about that:
http://www.hardcoded.net/articles/high-level-testing.htm
So yes, you should test your console output directly. Sure, initially it's a pain to set up a test so that it's easy to test console output directly, but if you create appropriate helper code, your next "high level" tests will be much easier to write. By doing that, you give yourself unlimited refactoring potential. With "bottom up", your initial class relationship is very rigid, because changing the relationship means changing the tests (lots of work, and dangerous).
There was an interesting piece in the December issue of MSDN Magazine which describes "how the BDD cycle wraps the traditional Test-Driven Development (TDD) cycle with feature-level tests that drive unit-level implementation". The details may be too technology-specific, but the ideas and process outline sound relevant to your question.
The Main method should be very simple in the end. I don't know what it looks like in C#, but in C++ it should look something like this:
#include "something"
int main( int argc, char *argv[])
{
return TaskClass::Run( argc, argv );
}
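A C# counterpart of the same idea might look like this (hypothetical TaskRunner class; Main stays trivially thin and only forwards the arguments):

public static class Program
{
    public static int Main(string[] args)
    {
        return TaskRunner.Run(args);
    }
}

public static class TaskRunner
{
    // Unit tests call Run(...) directly with whatever arguments they need.
    public static int Run(string[] args)
    {
        // ... the actual work of the application goes here ...
        return 0;
    }
}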
Pass already created objects to the constructors of classes, but in unit tests pass mock objects.
For more info about TDD, take a look at these screencasts. They explain how to do agile development, but they also cover how to do TDD, with examples in C#.
I have inherited an existing code base where the "features" are as follows:
huge monolithic classes with (literally) hundreds of member variables and methods that go on for pages (er, screens)
public and private methods with a large number of arguments.
I am trying to clean up and refactor the code, to leave it a little better than how I found it. So my questions are:
is it worth it (or do you) refactor methods with 10 or so arguments so that they are more readable?
are there best practices on how long methods should be? How long do you usually keep them?
are monolithic classes bad?
is it worth it (or do you) refactor methods with 10 or so arguments so that they are more readable?
Yes, it is worth it. It is typically more important to refactor methods that are not "reasonable" than ones that already are nice, short, and have a small argument list.
Typically, if you have many arguments, it's because a method does too much - most likely, it should be a class of its own, not a method.
That being said, in those cases where many parameters are required, it's best to encapsulate the parameters into a single class (e.g. SpecificAlgorithmOptions) and pass one instance of that class. This way you can provide clean defaults, and it's very obvious which parameters are essential vs. optional (based on what is required to construct the options class).
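For example, a minimal sketch of that parameter object (reusing the SpecificAlgorithmOptions name from above; the individual options are made up):

public class SpecificAlgorithmOptions
{
    // Clean defaults; callers only override what they care about.
    public int MaxIterations { get; set; } = 100;
    public double Tolerance { get; set; } = 1e-6;
    public bool Verbose { get; set; } = false;
}

public class SpecificAlgorithm
{
    public void Run(SpecificAlgorithmOptions options)
    {
        // ... use options.MaxIterations, options.Tolerance, options.Verbose ...
    }
}

// Usage: new SpecificAlgorithm().Run(new SpecificAlgorithmOptions { Verbose = true });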
are there best practices on how long methods should be? How long do you usually keep them?
A method should be as short as possible. It should have one purpose, and be used for one task, whenever possible. If it's possible to split it into separate methods, where each has a real, qualitative "task", then do so when refactoring.
are monolithic classes bad?
Yes.
If the code is working and there is no need to touch it, I wouldn't refactor. I only refactor very problematic cases if I have to touch them anyway (either to extend their functionality or to fix bugs). I favor the pragmatic way: only touch (in 95% of cases) what you have to change.
Some first thoughts on your specific problem (though in detail it is difficult without knowing the code):
Start by grouping instance variables; these groups will then be targets for 'extract class'.
Once you have grouped these variables, you can hopefully group some methods as well, which can also be moved when doing 'extract class'.
Often there are many methods which aren't using any fields; make them static (they most likely are helper methods, which can be extracted to helper classes).
In case unrelated instance fields are mixed into many methods, do loads of 'extract method'.
Use automatic refactoring tools as much as possible, because you most likely have no tests in place and automation is safer.
Regarding your other concrete questions.
is it worth it (or do you) refactor methods with 10 or so arguments so that they are more readable?
Definitely. Ten parameters are too many for us humans to grasp; most likely the method is doing too much.
are there best practices on how long methods should be? How long do you usually keep them?
It depends... on preferences. I stated some things in this thread (though that question was about PHP); still, I would apply those numbers/metrics to any language.
are monolithic classes bad?
It depends on what you mean by monolithic. If you mean many instance variables, endless methods, and a lot of if/else complexity: yes.
Also have a look at a real gem (to me, a must-have for every developer): Working Effectively with Legacy Code.
Assuming the code is functioning I would suggest you think about these questions first:
is the code well documented?
do you understand the code?
how often are new features being added?
how often are bugs reported and fixed?
how difficult is it to modify and fix the code?
what is the expected life of the code?
how many versions of the compiler are you behind (if at all)?
is the OS it runs on expected to change during its lifetime?
If the system will be replaced in five years, is documented well, will undergo few changes, and bugs are easy to fix - leave it alone regardless of the size of the classes and the number of parameters. If you are determined to refactor, make a list of your refactoring proposals in order of maximum benefit with minimum changes, and attack it incrementally.