Dealing with duplication between unit and integration tests

I have an algorithm implemented by a number of classes, all covered by unit tests.
I would like to refactor it, which will change the behavior of two classes.
When I change one class and its tests, all unit tests pass, even though the algorithm as a whole is incorrect until the refactoring is done.
This example illustrates that complete coverage by unit tests is sometimes not enough, and that I need "integration" tests for the whole algorithm in terms of input and output. Ideally, such tests should cover the behavior of my algorithm completely.
My question: it looks like adding such integration tests makes the unit tests superfluous, and I don't want to maintain duplicated test logic.
Should I remove my unit tests or leave them as they are, e.g. for easier bug location?

This is part of the problem with tests that are too fine-grained and tightly coupled to the implementation.
Personally, I would write tests which focus on the behaviour of the algorithm and would consider the algorithm 'a unit'. The fact that it is broken into several classes is an implementation detail, in the same way that breaking down a public method's functionality into several smaller private methods is an implementation detail. I wouldn't write separate tests for the private methods; they would be covered by the tests of the public method's functionality.
If some of those classes are generically useful and will be reused elsewhere, then I would consider writing unit tests for them at that point, as they will then have some defined behaviour of their own.
This would result in some duplication, but that is OK: those classes now have a public contract to uphold (one relied on by every component that uses them), and those tests can define it.
Interestingly, see the definition of Unit in this article
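To make the "algorithm as the unit" idea concrete, here is a minimal Java sketch. Everything in it is hypothetical (WordFrequency, Tokenizer and Tally are invented stand-ins, not the asker's classes): the checks only go through the public entry point, so the helper classes can change during a refactoring without any test edits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example: behaviour-level checks for an algorithm split across helper classes.
class WordFrequencyDemo {
    public static void main(String[] args) {
        // The checks only use the public entry point, so Tokenizer and Tally
        // can be reshaped freely without touching them.
        Map<String, Integer> result = WordFrequency.count("to be or not to be");
        check(result.get("to") == 2, "'to' occurs twice");
        check(result.get("be") == 2, "'be' occurs twice");
        check(result.get("or") == 1, "'or' occurs once");
        check(WordFrequency.count("").isEmpty(), "empty input gives an empty result");
        System.out.println("all behaviour checks passed");
    }

    static void check(boolean condition, String description) {
        if (!condition) throw new AssertionError(description);
    }
}

// Public face of the algorithm: the "unit" the checks are written against.
class WordFrequency {
    static Map<String, Integer> count(String text) {
        Tally tally = new Tally();
        for (String word : Tokenizer.split(text)) {
            tally.add(word);
        }
        return tally.asMap();
    }
}

// Implementation detail: splits text into lower-case words.
class Tokenizer {
    static List<String> split(String text) {
        List<String> words = new ArrayList<>();
        for (String part : text.trim().toLowerCase().split("\\s+")) {
            if (!part.isEmpty()) words.add(part);
        }
        return words;
    }
}

// Implementation detail: accumulates counts per word.
class Tally {
    private final Map<String, Integer> counts = new TreeMap<>();

    void add(String word) { counts.merge(word, 1, Integer::sum); }

    Map<String, Integer> asMap() { return counts; }
}
```

If Tokenizer later became generally useful, it would get its own tests at that point, as described above.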

Related

Unit, Integration or Feature Test?

A simple question: How do you differentiate between a feature, unit and integration test?
There are a lot of differing opinions, but I'm specifically trying to determine how to organise a Laravel test which touches a model's relationship. Here is an example of some PHP code which would require testing:
public function prices()
{
    return $this->hasMany(Prices::class);
}

public function getPriceAttribute()
{
    return $this->prices()->first() * 2;
}
The test descriptions as I understand them (feel free to correct me):
Unit test: tests the smallest part of your code; does not touch the database; does not interact with any other part of the system.
Integration test: tests parts of the system working together, e.g. controllers which call helper functions which need to be tested together.
Feature test: black-box test, e.g. call an API endpoint and see that it has returned the correct JSON response.
Here is my issue given those descriptions:
My Laravel model test needs to test the smallest unit of code - the calculated accessor of a model, which makes it feel like a Unit test
But, it touches the database when it loads the model's relationship
It doesn't feel like an Integration test, because it is only touching other related models, not internal or external services
Other property accessor tests in Laravel would fall under Unit tests when they do not touch the database or the model's relationships
Separating these types of tests into integration tests would mean that a single model's tests against its properties are fragmented between integration and unit tests
So, without mocking relationships between models, where would my test belong?

If I’m interpreting your original question correctly, I think the killer constraint here is:
So, without mocking relationships between models, where would my test belong?
If mocking isn't allowed and you're required to touch a DB then, by your definition and Google's, it has to be an integration/medium-size test :)
The way I think of this is that the "get price attribute" functionality is separate from the DB. Even though it lives in the model, the prices could come from anywhere. Right now it's an RDBMS, but what if your org gets really big and the prices are split out into another service? Basically, I believe that the capability of getPriceAttribute is distinct from the storage of attributes:
public function getPriceAttribute()
{
    return $this->prices()->first() * 2;
}
If you buy into this reasoning, it creates a logical separation that supports unit tests. prices() can be mocked to return a collection of 0, 1 and many (2) results. The test can then be executed as a unit test, with test execution that is orders of magnitude faster (i.e. on the order of 1 ms, versus potentially tens or hundreds of ms when talking to a local DB).
I am not familiar with the PHP test ecosystem, but one way to do this could be with a test-specific subclass (not sure if the following is valid PHP :p ):
class PricedModel extends YourModel {
    private $stub_prices;

    public function __construct($stub_prices_supporting_first) {
        $this->stub_prices = $stub_prices_supporting_first;
    }

    // Return the stub instead of querying the database.
    public function prices() {
        return $this->stub_prices;
    }
}
tests
// Prices stands in for any object whose first() returns the first element, or null when empty.
function test_priced_model_0_prices() {
    $p = new PricedModel(new Prices(array()));
    assert($p->getPriceAttribute() == null);
}
function test_priced_model_1_price() {
    $p = new PricedModel(new Prices(array(1)));
    assert($p->getPriceAttribute() == 2);
}
function test_priced_model_2_prices() {
    $p = new PricedModel(new Prices(array(5, 1)));
    assert($p->getPriceAttribute() == 10);
}
The above should hopefully allow you to fully control the input into the getPriceAttribute method to support direct, IO-free unit testing.
Also, all the unit tests above can tell you is that you're able to process prices correctly; they don't provide any feedback on whether you're able to query prices!

What distinguishes the tests is their respective goal:
Unit testing aims at finding those bugs that can be found in isolated small parts of the software. (Note that this does not say you must isolate; it only means your focus is on the isolated code. Isolation and mocking are often not needed to reach this goal: think of a call to a sin function. You almost never need to mock this; you just let your system under test call the original one.)
Integration testing aims at finding bugs in the interaction of two or more components, for example mutual misconceptions about an interface. These bugs cannot be found in the isolated software: if you test code in isolation, you also write your tests based on your (possibly wrong) understanding of the other components.
Feature tests as you describe them then have the goal of finding further bugs which the other tests could not detect. One example of such a bug would be that an old version of the feature was integrated (which was correct at the time, but lacked some functionality).
The conclusion, although it may be surprising, is that it is not, in the strict sense, forbidden to access the database in unit tests. Consider the following scenario: you start writing unit tests and mock the database accesses. Later, you realize you can be lazier and just use the database without mocking, but otherwise leave all the tests as they are. Your tests have not changed, and they will continue finding the bugs in the isolated code as before. They may run a bit slower now, and the setup may be more complex than with the mocked database. However, the goal of the test suite is the same with and without mocking the database.
This scenario simplifies things a bit, because there may be test cases that can only be done with a mock: for example, testing that your code properly handles the database getting corrupted in a specific way. With the real database, such test cases may be practically impossible to set up.

TDD: Refactoring in new classes

I have a class with tests, built up over many red-green cycles.
I see that the class is too big and performs several different tasks.
Now I want to refactor it. After refactoring I have 3 classes, still 100% coverage, and all tests are still green.
But the problem is that I test the two new classes through the tests of the "old big class". Should I split the tests now? Or should I write extra tests for each class before I refactor the code into classes?

Should I split the tests now?
A quick implication of "refactoring" -- your tests didn't need to change. The public API and the observable behaviors of your implementation were not changed. You've just created some new module boundaries within the implementation.
(If the changes you made forced you to modify the existing tests, then the activity you were doing was not "refactoring").
Since the public API hasn't changed, the existing tests still have value - they describe the required behaviors of your original API. So you shouldn't "split" those tests.
If the new classes that you created during refactoring are going to be lifted into the public API, then you should introduce new tests for the new public API.
Your existing tests of the original API remain untouched until you have announced that API's end of life, deprecated it, and reached the end of life deadline. At that point, you can delete the tests of the old API.
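A small Java sketch of that point, with invented names (BigCalculator, RefactoredCalculator, TaxRule and Totals are assumptions for illustration only): the same checks run against the class before and after the extraction, and they never mention the extracted classes.

```java
import java.util.function.IntBinaryOperator;

// Hypothetical example: identical checks applied before and after a refactoring.
class RefactoringDemo {
    public static void main(String[] args) {
        runChecks(BigCalculator::grossCents);        // before the refactoring
        runChecks(RefactoredCalculator::grossCents); // after the refactoring
        System.out.println("same checks pass before and after the refactoring");
    }

    // The checks are written purely against the public API (net amount and tax rate in, gross out).
    static void runChecks(IntBinaryOperator grossCents) {
        check(grossCents.applyAsInt(1000, 20) == 1200, "20% tax on 10.00");
        check(grossCents.applyAsInt(0, 20) == 0, "nothing to tax");
        check(grossCents.applyAsInt(999, 0) == 999, "zero tax rate");
    }

    static void check(boolean condition, String description) {
        if (!condition) throw new AssertionError(description);
    }
}

// Before: one class does everything.
class BigCalculator {
    static int grossCents(int netCents, int taxPercent) {
        int tax = netCents * taxPercent / 100;
        return netCents + tax;
    }
}

// After: same public method, work delegated to two extracted classes.
class RefactoredCalculator {
    static int grossCents(int netCents, int taxPercent) {
        return Totals.add(netCents, TaxRule.taxCents(netCents, taxPercent));
    }
}

// Extracted implementation details; not part of the public API, so no tests of their own yet.
class TaxRule {
    static int taxCents(int netCents, int taxPercent) { return netCents * taxPercent / 100; }
}

class Totals {
    static int add(int a, int b) { return a + b; }
}
```

If TaxRule were later promoted into the public API, that would be the moment to add tests written directly against it.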

How do you unit test when you need to explore code?

In TDD how should you continue when you know what your final outcome should be, but not the processing steps you need to get there?
For example, your class is being passed an object whose API is completely new to you. You know the class has the information you need, but you don't know how to retrieve it yet. How would you go about testing this?
Do you just focus on the desired result ignoring the steps?
Edit 1
package com.wesley_acheson.codeReview.annotations;

import com.sun.mirror.apt.AnnotationProcessor;
import com.sun.mirror.apt.AnnotationProcessorEnvironment;

public class AnnotationPresenceWarner implements AnnotationProcessor {
    private final AnnotationProcessorEnvironment environment;

    public AnnotationPresenceWarner(AnnotationProcessorEnvironment env) {
        environment = env;
    }

    public void process() {
        // This is what I'm testing
    }
}
I'm trying to test this incomplete class. I want to test that I have the right interactions with AnnotationProcessorEnvironment within the process method. However, I'm unsure from the API docs what the right interaction is.
This will produce a file that contains details on the occurrence of each annotation within a source tree.
The actual file writing will probably be delegated to another class, however. So this class's responsibility is to create a representation of the annotation occurrences and pass that to whatever classes need to move it.
In non-TDD I'd probably invoke a few methods, set a breakpoint and see what they return.
Anyway, I'm not looking for a solution to this specific example. The point is more general: sometimes you don't know how to get from A to B, and you'd still like your code to be test-driven.

I'm basing my answer on this video:
http://misko.hevery.com/2008/11/11/clean-code-talks-dependency-injection/
If you have a model/business logic class that's supposed to get some data from a service, then I'd go about it this way:
1. Have your model class take the data that it needs in the constructor, rather than the service itself. You can then mock the data and unit test your class.
2. Create a wrapper for the service. You can then unit test the wrapper.
3. Perform a fuller test where you actually pass the data from the wrapper to the model class.
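The steps above can be sketched roughly as follows in Java. All names here (AnnotationReport, AnnotationSource) are made up for illustration, and the wrapper is faked with a lambda; a real implementation would translate calls to the unfamiliar API, which this sketch deliberately does not touch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical example: model takes plain data, the service hides behind a wrapper.
class WrapperDemo {
    public static void main(String[] args) {
        // Model check: hand-made data, no service involved.
        AnnotationReport report =
                new AnnotationReport(Arrays.asList("Deprecated", "Override", "Override"));
        check(report.occurrencesOf("Override") == 2, "two Override occurrences");
        check(report.occurrencesOf("Missing") == 0, "unknown annotation counts as zero");

        // Fuller check: the model is fed from a wrapper (a fake one in this sketch).
        AnnotationSource source = () -> Arrays.asList("Override");
        AnnotationReport fromSource = new AnnotationReport(source.annotationNames());
        check(fromSource.occurrencesOf("Override") == 1, "data flows from wrapper to model");
        System.out.println("all checks passed");
    }

    static void check(boolean condition, String description) {
        if (!condition) throw new AssertionError(description);
    }
}

// Wrapper around the service: the only place that would know the third-party API.
interface AnnotationSource {
    List<String> annotationNames();
}

// Model: receives the data it needs in the constructor, not the service.
class AnnotationReport {
    private final List<String> names;

    AnnotationReport(List<String> names) { this.names = names; }

    int occurrencesOf(String name) { return Collections.frequency(names, name); }
}
```

The design choice is that only the real AnnotationSource implementation needs the slower, integration-style test; the model stays testable with plain lists.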

General Answer
TDD can be used to solve a number of issues; first and foremost, it ensures that code changes do not break the expected behavior of existing code. Thus, if you're writing a class with TDD, you write a test first, see that it fails, then write the behavior to make it green without causing other tests to become red.
The side effect of writing the test cases is that you now have documentation. This means that TDD actually provides answers to two distinct problems with code. When learning a new API, regardless of what it is, you can use TDD to explore its behavior (granted, in some frameworks this can be very difficult). So, when you are exploring an API, it's OK to write some tests to document its use. You can consider this a prototyping step as well, except that prototyping assumes you throw the result away when complete. With the TDD approach you keep it, so you can always return to it long after you've learned the API.
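As an illustration of such exploratory ("learning") tests, here is a short Java sketch. It uses String.split from the standard library as the API being explored, purely as a stand-in for whatever unfamiliar API you are facing; the helper names are invented.

```java
import java.util.Arrays;

// Learning tests: each check records one thing discovered about an API's behaviour.
class LearningTestDemo {
    public static void main(String[] args) {
        check(Arrays.equals(splitDefault("a,b,,"), new String[] {"a", "b"}),
                "trailing empty strings are dropped by default");
        check(Arrays.equals(splitKeepingEmpties("a,b,,"), new String[] {"a", "b", "", ""}),
                "a negative limit keeps trailing empty strings");
        check(Arrays.equals(splitDefault(",a"), new String[] {"", "a"}),
                "a leading empty string is kept");
        System.out.println("what we learned about split() is now written down");
    }

    // Thin helpers so each discovery has a name.
    static String[] splitDefault(String s) { return s.split(","); }

    static String[] splitKeepingEmpties(String s) { return s.split(",", -1); }

    static void check(boolean condition, String description) {
        if (!condition) throw new AssertionError(description);
    }
}
```

Kept in the test suite, checks like these double as documentation and will also flag it if a later library upgrade changes the behaviour you relied on.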
Specific Answer to the Example Given
There are a number of approaches which attempt to solve the problem with the AnnotationProcessor. There is an assertion framework which addresses the issue by loading the Java code during the test and asserting on the line at which the error/warning occurs. And here on Stack Overflow

I would create a prototype without tests to learn how the API works. Once I had that understanding, I would continue with the TDD cycle on my project.

I agree with Bassetassen. First do a spike to understand what this external API call does and what you need from it for your method. Once you are comfortable with the API, you know how to proceed with TDD.

Never ever unit test against an unknown API. Follow the same principle as if you didn't own the code. Isolate all the code you are writing from the unknown or unowned.
Write your unit tests as if the environment processor were code that you were going to TDD later.
Now you can follow @Tom's advice, except drop step 1. Step 2's unit tests are now just a matter of mapping the outputs of the wrapper class to calls on the API of the unknown. Step 2 is more along the lines of an integration test.
I firmly believe that changing your flow from TDD to prototyping and back to TDD is a loss in velocity. Stay with TDD until you are done, then prototype.

Should I only be testing public interfaces in BDD? (in general, and specifically in Ruby)

I'm reading through the (still beta) RSpec book by the Prag Progs, as I'm interested in behavioral testing of objects. From what I've gleaned so far (caveat: after only reading for 30 min), the basic idea is that I want to ensure my object behaves as expected 'externally', i.e. in its output and in relation to other objects.
Is it true then that I should just be black-box testing my object to ensure the proper output/interaction with other objects?
This may be completely wrong, but given all of the focus on how my object behaves in the system, it seems this is the ideology one would adopt. If that's so, how do we focus on the implementation of an object? How do I test that my private method is doing what I want it to do for all different types of input?
I suppose this question is maybe valid for all types of testing? I'm still fairly new to TDD and BDD.

If you want to understand BDD better, try thinking about it without using the word "test".
Instead of writing a test, you're going to write an example of how you can use your class (and you can't use it except through public methods). You're going to show why your class is valuable to other classes. You're defining the scope of your class's responsibilities, while showing (through mocks) what responsibilities are delegated elsewhere.
At the same time, you can question whether the responsibilities are appropriate, and tune the methods on your class to be as intuitively usable as possible. You're looking for code which is easy to understand and use, rather than code which is easy to write.
If you can think in terms of examples and providing value through behaviour, you'll create code that's easy to use, with examples and descriptions that other people can follow. You'll make your code safe and easy to change. If you think about testing, you'll pin it down so that nobody can break it. You'll make it hard to change.
If it's complex enough that there are internal methods you really want to test separately, break them out into another class then show why that class is valuable and what it does for the class that uses it.
Hope this helps!

I think there are two issues here.
One is that from the BDD perspective, you are typically testing at a higher level than from the TDD perspective. So your BDD tests will assert a bigger piece of functionality than your TDD tests and should always be "black box" tests.
The second is that if you feel the need to test private methods, even at the unit test level, that could be a code smell indicating that your code is violating the Single Responsibility Principle and should be refactored so that the methods you care about can be tested as public methods of a different class. Michael Feathers gave an interesting talk about this recently called "The Deep Synergy Between Testability and Good Design."

Yes, focus on the exposed functionality of the class. Private methods are just part of a public function you will test. This point is a bit controversial, but in my opinion it should be enough to test the public functionality of a class (everything else also violates the OOP principle).

Too many public methods forced by test-driven development

A very specific question from a novice to TDD:
I separate my tests and my application into different packages. Thus, most of my application methods have to be public for tests to access them. As I progress, it becomes obvious that some methods could become private, but if I make that change, the tests that access them won't work. Am I missing a step, or doing something wrong, or is this just one downfall of TDD?

This is not a downfall of TDD, but rather an approach to testing that believes you need to test every property and every method. In fact you should not care about private methods when testing because they should only exist to facilitate some public portion of the API.
Never change something from private to public for testing purposes!
You should be trying to verify only publicly visible behavior. The rest are implementation details and you specifically want to avoid testing those. TDD is meant to give you a set of tests that will allow you to easily change the implementation details without breaking the tests (changing behavior).
Let’s say I have a type: MyClass and I want to test the DoStuff method. All I care about is that the DoStuff method does something meaningful and returns the expected results. It may call a hundred private methods to get to that point, but I don't care as the consumer of that method.

You don't specify what language you are using, but in most of them you can place the tests so that they have more privileged access to the class. In Java, for example, the test can be in the same package, with the actual class file being in a different directory so it is separate from production code.
However, when you are doing real TDD, the tests are driving the class design, so if you have a method that exists just to test some subset of functionality, you are probably (not always) doing something wrong, and you should look at techniques like dependency injection and mocking to better guide your design.

This is where the old saying, "TDD is about design," frequently comes up. A class with too many public methods probably has too many responsibilities - and the fact that you are test-driving it only exposes that; it doesn't cause the problem.
When you find yourself in this situation, the best solution is frequently to find some subset of the public methods that can be extracted into a new class ("sprout class"), then give your original class an instance variable of the sprouted class. The public methods deserve to be public in the new class, but they are now - with respect to the API of the original class - private. And you now have better adherence to SRP, looser coupling, and higher cohesion - better design.
All because TDD exposed features of your class that would otherwise have slid in under the radar. TDD is about design.
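A rough Java sketch of the sprout-class move described above, using invented names (Registration and PasswordPolicy are assumptions, as is the 12-character rule): the password rule used to be a method that Registration exposed only for testing, and it now lives as a public method of its own small class.

```java
// Hypothetical example of extracting a "sprout class".
class SproutDemo {
    public static void main(String[] args) {
        // The sprouted class is checked through its own public method.
        PasswordPolicy policy = new PasswordPolicy();
        check(policy.isStrongEnough("correct horse"), "13 characters is long enough");
        check(!policy.isStrongEnough("short"), "5 characters is too short");

        // The original class is checked through its (now smaller) public API.
        Registration registration = new Registration(policy);
        check(registration.register("ann", "correct horse battery"), "valid registration");
        check(!registration.register("bob", "short"), "weak password is rejected");
        System.out.println("all checks passed");
    }

    static void check(boolean condition, String description) {
        if (!condition) throw new AssertionError(description);
    }
}

// Sprouted class: the rule deserves to be public here.
class PasswordPolicy {
    public boolean isStrongEnough(String password) {
        return password.length() >= 12; // assumed rule, for illustration only
    }
}

// Original class: holds the sprout in a private field, so the rule is
// private with respect to Registration's own API.
class Registration {
    private final PasswordPolicy policy;

    Registration(PasswordPolicy policy) { this.policy = policy; }

    public boolean register(String user, String password) {
        return !user.isEmpty() && policy.isStrongEnough(password);
    }
}
```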

At least in Java, it's good practice to have two source trees, one for the code and one for the tests. So you can put your code and your tests in the same package, while they're still in different directories:
src/org/my/xy/X.java
test/org/my/xy/TestX.java
Then you can make your methods package private.
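A minimal sketch of what that looks like, with made-up method names. To stay runnable as a single file it omits the package line; in the layout above, both X.java and TestX.java would start with `package org.my.xy;` and live in their separate source trees.

```java
// Stands in for test/org/my/xy/TestX.java (same package, different source tree).
class TestX {
    public static void main(String[] args) {
        X x = new X();
        // Allowed because TestX and X share a package; code in other packages cannot call normalise().
        check(x.normalise("  Bob ").equals("Bob"), "normalise trims whitespace");
        check(x.greet(" Bob").equals("Hi, Bob!"), "greet uses the normalised name");
        System.out.println("all checks passed");
    }

    static void check(boolean condition, String description) {
        if (!condition) throw new AssertionError(description);
    }
}

// Stands in for src/org/my/xy/X.java.
class X {
    public String greet(String name) {
        return "Hi, " + normalise(name) + "!";
    }

    // No access modifier: package-private, visible to tests in the same package only.
    String normalise(String name) {
        return name.trim();
    }
}
```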
