Is it a bad practice to randomly generate test data? - ruby

Since I've started using rspec, I've had a problem with the notion of fixtures. My primary concerns are this:
I use testing to reveal surprising behavior. I'm not always clever enough to enumerate every possible edge case for the examples I'm testing. Using hard-coded fixtures seems limiting because it only exercises my code with the very specific cases I've imagined. (Admittedly, my imagination is also a limiting factor in which cases I test.)
I use testing as a form of documentation for the code. If I have hard-coded fixture values, it's hard to reveal what a particular test is trying to demonstrate. For example:
describe Item do
  describe '#most_expensive' do
    it 'should return the most expensive item' do
      Item.most_expensive.price.should == 100
      # OR
      # Item.most_expensive.price.should == Item.find(:expensive).price
      # OR
      # Item.most_expensive.id.should == Item.find(:expensive).id
    end
  end
end
Using the first method gives the reader no indication what the most expensive item is, only that its price is 100. All three methods ask the reader to take it on faith that the fixture :expensive is the most expensive one listed in fixtures/items.yml. A careless programmer could break tests by creating an Item in before(:all), or by inserting another fixture into fixtures/items.yml. If that is a large file, it could take a long time to figure out what the problem is.
One thing I've started to do is add a #generate_random method to all of my models. This method is only available when I am running my specs. For example:
class Item
  def self.generate_random(params = {})
    Item.create(
      :name  => params[:name]  || String.generate_random,
      :price => params[:price] || rand(100)
    )
  end
end
(The specific details of how I do this are actually a bit cleaner. I have a class that handles the generation and cleanup of all models, but this code is clear enough for my example.) So in the above example, I might test as follows. A warning for the faint of heart: my code relies heavily on use of before(:all):
describe Item do
  describe '#most_expensive' do
    before(:all) do
      @items = []
      3.times { @items << Item.generate_random }
      @items << Item.generate_random({ :price => 50 })
    end

    it 'should return the most expensive item' do
      sorted = @items.sort { |a, b| b.price <=> a.price }
      expensive = Item.most_expensive
      expensive.should be(sorted[0])
      expensive.price.should >= 50
    end
  end
end
This way, my tests better reveal surprising behavior. When I generate data this way, I occasionally stumble upon an edge case where my code does not behave as expected, but which I wouldn't have caught if I were only using fixtures. For example, in the case of #most_expensive, if I forgot to handle the special case where multiple items share the most expensive price, my test would occasionally fail at the first should. Seeing the non-deterministic failures in AutoSpec would clue me in that something was wrong. If I were only using fixtures, it might take much longer to discover such a bug.
My tests also do a slightly better job of demonstrating in code what the expected behavior is. My test makes it clear that sorted is an array of items sorted in descending order by price. Since I expect #most_expensive to be equal to the first element of that array, it's even more obvious what the expected behavior of most_expensive is.
So, is this a bad practice? Is my fear of fixtures an irrational one? Is writing a generate_random method for each Model too much work? Or does this work?

I'm surprised no one in this topic or in the one Jason Baker linked to has mentioned Monte Carlo Testing. That's the only time I've extensively used randomized test inputs. However, it was very important to make the test reproducible, by having a constant seed for the random number generator for each test case.
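For illustration, here is a minimal sketch of that approach in Ruby/test-unit; the constant seed makes each run reproducible, and my_quicksort stands in for whatever code is under test (a hypothetical name):

require 'test/unit'

class MonteCarloSortTest < Test::Unit::TestCase
  def test_sort_against_reference_implementation
    srand(12345) # constant seed: every run sees the same "random" inputs
    100.times do
      input = Array.new(20) { rand(1_000) }
      # compare the code under test against a trusted oracle (Ruby's own sort)
      assert_equal input.sort, my_quicksort(input)
    end
  end
end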

This is an answer to your second point:
(2) I use testing as a form of documentation for the code. If I have hard-coded fixture values, it's hard to reveal what a particular test is trying to demonstrate.
I agree. Ideally spec examples should be understandable by themselves. Using fixtures is problematic, because it splits the pre-conditions of the example from its expected results.
Because of this, many RSpec users have stopped using fixtures altogether. Instead, they construct the needed objects in the spec example itself:
describe Item, "#most_expensive" do
it 'should return the most expensive item' do
items = [
Item.create!(:price => 100),
Item.create!(:price => 50)
]
Item.most_expensive.price.should == 100
end
end
If you end up with lots of boilerplate code for object creation, you should take a look at some of the many test object factory libraries, such as factory_girl, Machinist, or FixtureReplacement.
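For example, with factory_girl's classic API, the spec above might look roughly like this (a sketch; the factory definition and attributes are assumptions):

# factories.rb
Factory.define :item do |f|
  f.name  'widget'
  f.price 0
end

# in the spec
describe Item, "#most_expensive" do
  it 'should return the most expensive item' do
    cheap     = Factory(:item, :price => 50)
    expensive = Factory(:item, :price => 100)
    Item.most_expensive.should == expensive
  end
end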

We thought about this a lot on a recent project of mine. In the end, we settled on two points:
Repeatability of test cases is of paramount importance. If you must write a random test, be prepared to document it extensively, because if/when it fails, you will need to know exactly why.
Using randomness as a crutch for code coverage means you either don't have good coverage or you don't understand the domain enough to know what constitutes representative test cases. Figure out which is true and fix it accordingly.
In sum, randomness can often be more trouble than it's worth. Consider carefully whether you're going to be using it correctly before you pull the trigger. We ultimately decided that random test cases were a bad idea in general and to be used sparingly, if at all.

Lots of good information has already been posted, but see also: Fuzz Testing. Word on the street is that Microsoft uses this approach on a lot of their projects.

My experience with testing is mostly with simple programs written in C/Python/Java, so I'm not sure if this is entirely applicable, but whenever I have a program that can accept any sort of user input, I always include a test with random input data, or at least input data generated by the computer in an unpredictable way, because you can never make assumptions about what users will enter. Or, well, you can, but if you do then some hacker who doesn't make that assumption may well find a bug that you totally overlooked. Machine-generated input is the best (only?) way I know of to keep human bias completely out of the testing procedures. Of course, in order to reproduce a failed test you have to do something like saving the test input to a file or printing it out (if it's text) before running the test.
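As a sketch of that last point in Ruby (generate_random_input and Parser are hypothetical names):

def test_parser_with_machine_generated_input
  input = generate_random_input # hypothetical generator
  # write the input out *before* asserting, so even a crash
  # leaves the offending data on disk for reproduction
  File.open('last_test_input.txt', 'w') { |f| f.write(input) }
  assert_nothing_raised { Parser.parse(input) }
end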

Random testing is a bad practice as long as you don't have a solution for the oracle problem, i.e., determining what the expected outcome of your software is for a given input.
If you solved the oracle problem, you can get one step further than simple random input generation. You can choose input distributions such that specific parts of your software get exercised more than with simple random.
You then switch from random testing to statistical testing.
if (a > 0)
  // Do Foo
else if (b < 0)
  // Do Bar
else
  // Do Foobar
If you select a and b uniformly at random over the int range, you exercise Foo 50% of the time, Bar 25% of the time, and Foobar 25% of the time. It is likely that you will find more bugs in Foo than in Bar or Foobar.
If you select a such that it is negative 66.66% of the time, Bar and Foobar get exercised more than with your first distribution. Indeed, each of the three branches then gets exercised 33.33% of the time.
Of course, if your observed outcome is different than your expected outcome, you have to log everything that can be useful to reproduce the bug.
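A rough sketch of such a biased input distribution for the branch example above, in Ruby (exercise is a hypothetical test driver):

INT_MIN = -2**31
INT_MAX =  2**31 - 1

# negative two-thirds of the time, so each branch runs ~33.33% of the time
def biased_a
  rand < 2.0 / 3 ? rand(INT_MIN..-1) : rand(0..INT_MAX)
end

def uniform_b
  rand(INT_MIN..INT_MAX)
end

1000.times { exercise(biased_a, uniform_b) }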

I would suggest having a look at Machinist:
http://github.com/notahat/machinist/tree/master
Machinist will generate data for you, but it is repeatable, so each test-run has the same random data.
You could do something similar by seeding the random number generator consistently.
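A minimal sketch of that idea in a spec helper, reusing RSpec's own --seed value (assuming rspec-core exposes it as configuration.seed, as recent versions do):

# spec/spec_helper.rb
RSpec.configure do |config|
  config.before(:each) do
    # tie Kernel#rand to RSpec's run seed, so re-running with
    # `rspec --seed 1234` reproduces the same "random" data
    srand RSpec.configuration.seed
  end
end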

Use of random test data is an excellent practice -- hard-coded test data only tests the cases you explicitly thought of, whereas random data flushes out your implicit assumptions that might be wrong.
I highly recommend using Factory Girl and ffaker for this. (Never use Rails fixtures for anything under any circumstances.)

One problem with randomly generated test cases is that the expected answer must itself be computed by code, and you can't be sure that code doesn't have bugs :)

You might also see this topic: Testing with random inputs best practices.

The effectiveness of such testing largely depends on the quality of the random number generator you use and on the correctness of the code that translates the RNG's output into test data.
If the RNG never produces values that cause your code to hit some edge-case condition, you will not have that case covered. And if the code that translates the RNG's output into input for the code under test is defective, it may happen that even with a good generator you still don't hit all the edge cases.
How will you test for that?

The problem with randomness in test cases is that the output is, well, random.
The idea behind tests (especially regression tests) is to check that nothing is broken.
If you find something that is broken, you need to include that test in every run from then on; otherwise you won't have a consistent set of tests. Also, if you run a random test that passes, you need to keep that test too, because it's possible that a later change will break the code so that the test fails.
In other words, if you have a test which uses random data generated on the fly, I think this is a bad idea. If however, you use a set of random data, WHICH YOU THEN STORE AND REUSE, this may be a good idea. This could take the form of a set of seeds for a random number generator.
This storing of the generated data allows you to find the 'correct' response to this data.
So, I would recommend using random data to explore your system, but using defined data in your tests (which may have originally been randomly generated data).
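For instance, seeds that exposed failures during exploration can be pinned as permanent regression cases (a sketch; the seed values are made up, and Item.generate_random is the helper from the question):

REGRESSION_SEEDS = [42, 1337, 90210] # seeds that once exposed bugs

def test_most_expensive_with_known_bad_seeds
  REGRESSION_SEEDS.each do |seed|
    srand(seed) # replay the exact dataset that failed before
    items = Array.new(4) { Item.generate_random }
    assert_equal items.map(&:price).max, Item.most_expensive.price
  end
end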

Related

I'm Trying To Figure Out: What prevents you from "gaming" unit tests if you're doing them right?

So, whenever I attempt to learn Test Driven Development something always bothers me and I figure there must be something fundamental that I am missing about how unit tests and the process of test-driven-development are supposed to be "done."
Okay, so let's say you have a simple Class, called Calculator. Let's say you have one method that you are attempting to "drive" with your tests. The method is called "Add". It is supposed to take two numbers and give you the sum of those two numbers.
Let's say the test looks like this:
[TestMethod]
public void AddingTwoAndThreeEquals5()
{
    var myCalculator = new Calculator();
    var result = myCalculator.Add(2, 3);
    Assert.AreEqual(5, result);
}
Makes sense as a test, right?
Let's say this is my implementation choice:
public int Add(int a, int b)
{
    return 5;
}
This works. The test will pass. Hey, the implementation is even about as simple as it gets--no redundancy whatsoever. How could you refactor that? You're "Green" and good to go! :)
But that is obviously a horrible implementation choice! It will fail almost instantly in Production (just imagine the use-case where "2" and "2" are "Added" together to get "5"--the thought! gasp)! If Test-Driven-Development is just focused on getting the tests to pass as quickly as possible, what is to prevent you from "gaming" the test like this? Surely this is not the quality of code Test Driven Development is supposed to produce, but where is it expected that the code would be corrected to be more "flexible" as it would obviously need to be to function even remotely well in Production?
Do there just need to be enough tests that getting them all to pass at once while doing things like this would not be possible? Is this the sort of thing that should be caught on a code-review with a peer before check-in? Is it just up to the developer to be reasonable and know that this could not possibly pass muster in Production even if it technically makes the test pass--in other words, is the developer just to use common sense to know what is and is not in the spirit of the process even if they technically are getting the tests to go to "Green"?
This is obviously an extreme example, as no one would likely check code like this in. But what about Test Driven Development is supposed to prevent "driving" code like this? It's a conceptual stumbling block I always hit when I undertake to learn Test Driven Development, and I feel like there is just something fundamental that I am not getting about what Test Driven Development is and is not supposed to "do" for us.
Thank you so much for bearing with me through my question!
Cheers! :)
This works. The test will pass. Hey, the implementation is even about as simple as it gets--no redundancy whatsoever.
Ah - but that's not true, you do have duplication here, and you need to see it.
public int Add(int a, int b)
{
    return 5;
}
becomes
public int Add(int a, int b)
{
    return 2 + 3;
}
becomes
public int Add(int a, int b)
{
    assert a == 2;
    assert b == 3;
    return a + b;
}
Now the duplication is removed, and you can move on to the next test.
In practice, Beck uses "refactor" somewhat differently from Fowler's definition. In Beck's examples, if a change keeps the tests passing, it counts as a refactor even though the behavior in examples-you-aren't-testing-yet might change.
So we'd expect Beck to skip that last example, and proceed directly to:
public int Add(int a, int b)
{
    return a + b;
}
I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence -- Kent Beck, 2008
Now, today's picture is a bit more complicated than that. Rough timeline: Kent (re)introduces test-first programming when he begins writing about Extreme Programming (XP). Test Driven Development by Example is published a few years later, and by then the ideas (a) have already spread and (b) have a number of different interpreters, with their own interpretations, goals, agendas, etc.
We've got a lot of different ideas like "Simplest Thing", and "You Aren't Gonna Need It" that are common in the same communities, and people try out applying them to the tests and refactoring of TDD. Sometimes the slogans translate well, other times... not so much?
what about Test Driven Development is supposed to prevent "driving" code like this?
So the first thing to recognize is that "Driven" here is pure marketing garbage. There was a time when people believed that the design followed from the tests - but that didn't really work out, and in retrospect there were some pretty clear tells that it wasn't going to work out. Michael Feathers referred to this period as the Look, Ma, no hands era.
There are some cases where the implementation of the tests gives hints that there is a problem in your design (too much irrelevant set up is a common example). But structure invariant tests simply can't "drive" a structure - all "refactorings" are reversible.
What those tests can do is alert you if a small code change you make to improve the design changes any of the behaviors you're measuring. If so, you can evaluate and remedy the contradiction at once (usually by reverting the small change that introduced the fault) and move on.
(Contrast this with a "test first" approach, where the test fails until you have the entire design typed in. That same test may then still fail, and now you have to go digging for the fault you introduced. The shortened feedback loop is the big idea of TDD.)
Now, off the top of my head, a better test would generate two random integers and assert that Add returns their sum. Suddenly the test becomes far more robust.
Test Driven Development / Test Driven Design is a design practice, not a test practice. Defending against the malfeasance of the Enterprise Developer from Hell is not a TDD goal.
Your cheat is actually something that is quite common while driving the design ("making the test pass, committing any crime necessary"). It works with 2 and 3 and even with a virtually infinite number of other combinations of numbers. The cheat is resolved by either getting rid of duplication as VoiceOfUnreason shows (5 is duplicated in your test and in your production code, replacing it by a+b removes this duplication) or by "triangulation" (writing more tests that expect different results). TDD is not about being "artificially stupid" or even malicious.
The duplication here is in the test and the production code. Notice the following two lines
Assert.AreEqual(5, result);
return 5;
To remove this duplication you need to think about making your code more generic. As @EricSchaefer rightly mentioned, to make the code more generic you can follow the triangulation technique.
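A quick sketch of triangulation, in Ruby/test-unit for consistency with the rest of this page: the second, different expectation makes the return 5 cheat impossible and forces the general a + b implementation.

require 'test/unit'

class CalculatorTest < Test::Unit::TestCase
  def test_adding_two_and_three
    assert_equal 5, Calculator.new.add(2, 3)
  end

  def test_adding_ten_and_seven
    # a second data point: no hard-coded constant satisfies both tests
    assert_equal 17, Calculator.new.add(10, 7)
  end
end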

Jasmine: Why toBeUndefined and not.toBeDefined?

I'm just learning the Jasmine library, and I've noticed that Jasmine has a very limited number of built-in assertions. I've also noticed that, despite having such a limited number, two of its assertions appear to be redundant: toBeDefined/toBeUndefined.
In other words, both of these would seem to check for the same exact thing:
expect(1).toBeDefined();
expect(1).not.toBeUndefined();
Is there some reason for this, like a case where toBeDefined isn't simply the negation of toBeUndefined? Or is this just the one assertion in Jasmine that has two perfectly equal ways of being invoked?
One might assume the same for toBeTruthy and toBeFalsy, or toBeLessThan and toBeGreaterThan (although I guess the missing assert from the last two is toEqual). In the end it comes down to readability and user preference.
To give you a more complete answer, it might be useful to take a look at the code that is invoked for these functions. The code that is executed goes through separate paths (so toBeUndefined is not simply !toBeDefined). The only real answer that makes sense is readability (or giving in to annoying feature requests). https://github.com/jasmine/jasmine/tree/main/src/core/matchers

When to use RSpec let()?

I tend to use before blocks to set instance variables. I then use those variables across my examples. I recently came upon let(). According to RSpec docs, it is used to
... to define a memoized helper method. The value will be cached across multiple calls in the same example but not across examples.
How is this different from using instance variables in before blocks? And also when should you use let() vs before()?
I always prefer let to an instance variable for a couple of reasons:
Instance variables spring into existence when referenced. This means that if you fat finger the spelling of the instance variable, a new one will be created and initialized to nil, which can lead to subtle bugs and false positives. Since let creates a method, you'll get a NameError when you misspell it, which I find preferable. It makes it easier to refactor specs, too.
A before(:each) hook will run before each example, even if the example doesn't use any of the instance variables defined in the hook. This isn't usually a big deal, but if the setup of the instance variable takes a long time, then you're wasting cycles. For the method defined by let, the initialization code only runs if the example calls it.
You can refactor from a local variable in an example directly into a let without changing the referencing syntax in the example. If you refactor to an instance variable, you have to change how you reference the object in the example (e.g. add an @).
This is a bit subjective, but as Mike Lewis pointed out, I think it makes the spec easier to read. I like the organization of defining all my dependent objects with let and keeping my it block nice and short.
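A small sketch of the misspelling point (point 1): an instance-variable typo silently yields nil, while a misspelled let helper raises NameError.

describe 'typo behavior' do
  before { @item = Item.new }
  let(:item) { Item.new }

  it 'silently returns nil for a misspelled instance variable' do
    @itme.should be_nil # typo: RSpec hands you a fresh, nil ivar
  end

  it 'raises NameError for a misspelled let helper' do
    lambda { itme }.should raise_error(NameError) # typo caught immediately
  end
end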
A related link can be found here: http://www.betterspecs.org/#let
The difference between using instance variables and let() is that let() is lazily evaluated. This means that let() is not evaluated until the method that it defines is run for the first time.
The difference between before and let is that let() gives you a nice way of defining a group of variables in a 'cascading' style. By doing this, the spec looks a little better by simplifying the code.
I have completely replaced all uses of instance variables in my rspec tests to use let(). I've written a quickie example for a friend who used it to teach a small Rspec class: http://ruby-lambda.blogspot.com/2011/02/agile-rspec-with-let.html
As some of the other answers here say, let() is lazily evaluated, so it will only load the things that require loading. It DRYs up the spec and makes it more readable. I've in fact ported the RSpec let() code to use in my controllers, in the style of the inherited_resource gem. http://ruby-lambda.blogspot.com/2010/06/stealing-let-from-rspec.html
Along with lazy evaluation, the other advantage is that, combined with ActiveSupport::Concern, and the load-everything-in spec/support/ behavior, you can create your very own spec mini-DSL specific to your application. I've written ones for testing against Rack and RESTful resources.
The strategy I use is Factory-everything (via Machinist+Forgery/Faker). However, it is possible to use it in combination with before(:each) blocks to preload factories for an entire set of example groups, allowing the specs to run faster: http://makandra.com/notes/770-taking-advantage-of-rspec-s-let-in-before-blocks
It is important to keep in mind that let is lazily evaluated; avoid putting methods with side effects in it, or you won't be able to change from let to before(:each) easily.
You can use let! instead of let so that it is evaluated before each scenario.
In general, let() is a nicer syntax, and it saves you typing @name symbols all over the place. But, caveat emptor! I have found let() also introduces subtle bugs (or at least head scratching) because the variable doesn't really exist until you try to use it... Telltale sign: if adding a puts after the let() to see that the variable is correct allows a spec to pass, but without the puts the spec fails, you have found this subtlety.
I have also found that let() doesn't seem to cache in all circumstances! I wrote it up in my blog: http://technicaldebt.com/?p=1242
Maybe it is just me?
Dissenting voice here: after 5 years of rspec I don't like let very much.
1. Lazy evaluation often makes test setup confusing
It becomes difficult to reason about setup when some things that have been declared in setup are not actually affecting state, while others are.
Eventually, out of frustration someone just changes let to let! (same thing without lazy evaluation) in order to get their spec working. If this works out for them, a new habit is born: when a new spec is added to an older suite and it doesn't work, the first thing the writer tries is to add bangs to random let calls.
Pretty soon all the performance benefits are gone.
2. Special syntax is unusual to non-rspec users
I would rather teach Ruby to my team than the tricks of rspec. Instance variables or method calls are useful everywhere in this project and others, let syntax will only be useful in rspec.
3. The "benefits" allow us to easily ignore good design changes
let() is good for expensive dependencies that we don't want to create over and over.
It also pairs well with subject, allowing you to DRY up repeated calls to multi-argument methods.
Expensive dependencies repeated in many tests, and methods with big signatures, are both points where we could make the code better:
maybe I can introduce a new abstraction that isolates a dependency from the rest of my code (which would mean fewer tests need it)
maybe the code under test is doing too much
maybe I need to inject smarter objects instead of a long list of primitives
maybe I have a violation of tell-don't-ask
maybe the expensive code can be made faster (rarer - beware of premature optimisation here)
In all these cases, I can address the symptom of difficult tests with a soothing balm of rspec magic, or I can try to address the cause. I feel like I spent way too much of the last few years on the former, and now I want some better code.
To answer the original question: I would prefer not to, but I do still use let. I mostly use it to fit in with the style of the rest of the team (it seems like most Rails programmers in the world are now deep into their rspec magic so that is very often). Sometimes I use it when I'm adding a test to some code that I don't have control of, or don't have time to refactor to a better abstraction: i.e. when the only option is the painkiller.
let is functional, as it's essentially a Proc. It's also cached.
One gotcha I found right away with let: in a spec block that is evaluating a change.

let(:object) { FactoryGirl.create :object }

expect {
  post :destroy, id: review.id
}.to change(Object, :count).by(-1)
You'll need to be sure to call let outside of your expect block. i.e. you're calling FactoryGirl.create in your let block. I usually do this by verifying the object is persisted.
object.persisted?.should eq true
Otherwise when the let block is called the first time a change in the database will actually happen due to the lazy instantiation.
Update
Just adding a note. Be careful playing code golf or in this case rspec golf with this answer.
In this case, I just have to call some method to which the object responds. So I invoke the persisted? method on the object, as it's truthy. All I'm trying to do is instantiate the object. You could call empty? or nil? too. The point isn't the test, but bringing the object to life by calling it.
So you can't refactor
object.persisted?.should eq true
to be
object.should be_persisted
as the object hasn't been instantiated... it's lazy. :)
Update 2
Leverage the let! syntax for instant object creation, which should avoid this issue altogether. Note, though, that it will defeat a lot of the purpose of the laziness of the non-banged let.
Also in some instances you might actually want to leverage the subject syntax instead of let as it may give you additional options.
subject(:object) {FactoryGirl.create :object}
"before" by default implies before(:each). Ref The Rspec Book, copyright 2010, page 228.
before(scope = :each, options={}, &block)
I use before(:each) to seed some data for each example group without having to call the let method to create the data in the "it" block. Less code in the "it" block in this case.
I use let if I want some data in some examples but not others.
Both before and let are great for DRYing up the "it" blocks.
To avoid any confusion, "let" is not the same as before(:all). "Let" re-evaluates its method and value for each example ("it"), but caches the value across multiple calls in the same example. You can read more about it here: https://www.relishapp.com/rspec/rspec-core/v/2-6/docs/helper-methods/let-and-let
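A tiny sketch of that caching behavior:

describe 'let caching' do
  let(:value) { rand(1000) }

  it 'memoizes within one example' do
    value.should == value # the block ran once; the second call hits the cache
  end

  it 're-evaluates for the next example' do
    value.should be_a(Integer) # a fresh call, independent of the previous example
  end
end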
Note to Joseph -- if you are creating database objects in a before(:all) they won't be captured in a transaction and you're much more likely to leave cruft in your test database. Use before(:each) instead.
The other reason to use let and its lazy evaluation is so you can take a complicated object and test individual pieces by overriding lets in contexts, as in this very contrived example:
context "foo" do
let(:params) do
{ :foo => foo, :bar => "bar" }
end
let(:foo) { "foo" }
it "is set to foo" do
params[:foo].should eq("foo")
end
context "when foo is bar" do
let(:foo) { "bar" }
# NOTE we didn't have to redefine params entirely!
it "is set to bar" do
params[:foo].should eq("bar")
end
end
end
I use let to test my HTTP 404 responses in my API specs using contexts.
To create the resource, I use let!. But to store the resource identifier, I use let. Here is how it looks:
let!(:country)   { create(:country) }
let(:country_id) { country.id }

before { get "api/countries/#{country_id}" }

it('responds with HTTP 200') { should respond_with(200) }

context 'when the country does not exist' do
  let(:country_id) { -1 }

  it('responds with HTTP 404') { should respond_with(404) }
end
That keeps the specs clean and readable.

Should a Unit-test replicate functionality or Test output?

I've run into this dilemma several times. Should my unit-tests duplicate the functionality of the method they are testing to verify its integrity? Or should they strive to test the method with numerous manually created instances of inputs and expected outputs?
I'm mainly asking the question for situations where the method you are testing is reasonably simple and its proper operation can be verified by glancing at the code for a minute.
Simplified example (in ruby):
def concat_strings(str1, str2)
  return str1 + " AND " + str2
end
Simplified functionality-replicating test for the above method:
def test_concat_strings
  10.times do
    str1 = random_string_generator
    str2 = random_string_generator
    assert_equal(str1 + " AND " + str2, concat_strings(str1, str2))
  end
end
I understand that most times the method you are testing won't be simple enough to justify doing it this way. But my question remains; is this a valid methodology in some circumstances (why or why not)?
Testing the functionality by using the same implementation doesn't test anything. If one has a bug in it, the other will as well.
But testing by comparing with an alternative implementation is a valid approach. For example, you might test an iterative (fast) method of calculating Fibonacci numbers by comparing it with a trivial recursive, yet slow, implementation of the same method.
A variation of this is using an implementation that only works for special cases. Of course, in that case you can use it only for such special cases.
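A sketch of the Fibonacci idea (fast_fib is the hypothetical implementation under test):

# trivially correct, but exponentially slow, reference implementation
def slow_fib(n)
  n < 2 ? n : slow_fib(n - 1) + slow_fib(n - 2)
end

def test_fast_fib_matches_reference
  (0..20).each do |n|
    assert_equal slow_fib(n), fast_fib(n)
  end
end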
When choosing input values, using random values most of the time isn't very effective. I'd prefer carefully chosen values anytime. In the example you gave, null values and extremely long values which won't fit into a String when concatenated come to mind.
If you use random values, make sure you have a way to recreate the exact run with the same random values, for example by logging the seed value and having a way to set that seed at start time.
It's a controversial stance, but I believe that unit testing using Derived Values is far superior to using arbitrary hard-coded input and output.
The issue is that as an algorithm becomes even slightly complex, the relationship between input and output becomes obscure if represented by hard-coded values. The unit test ends up being a postulate. It may work technically, but it hurts test maintainability because it leads to Obscure Tests.
Using Derived Values to test against the result establishes a much clearer relationship between test input and expected output.
The argument that this doesn't test anything is simply not true, because any test case will exercise only a part of a path through the SUT, so no single test case will reproduce the entire algorithm being tested, but the combination of tests will do so.
An additional benefit is that you can use fewer unit tests to cover the desired functionality, and even make them more communicative at the same time. The end result is terser and more maintainable unit tests.
In unit testing you should definitely manually come up with test cases (so input, output and what side-effects you are expecting - these will be expectations on your mock objects). You come up with these test cases in a way so that they cover all functionality of your class (e.g. all methods are covered, all branches of all if statements, etc.). Think about it more along the lines of creating documentation of your class by showing all possible usages.
Reimplementing the class is not a good idea, because not only you get obvious code/functionality duplication, but also it is likely that you will introduce the same bugs in this new implementation.
To test the functionality of a method, I'd use input and output pairs wherever possible. Otherwise you might be copy-and-pasting the functionality, as well as the errors in its implementation. What are you testing then? You would be testing that the functionality (including all of its errors) hasn't changed over time, but you wouldn't be testing the correctness of the implementation.
Testing that the functionality hasn't changed over time might (temporarily) be useful during refactoring. But how often do you refactor such small methods?
Also, unit tests can be seen as documentation and as a specification of a method's inputs and expected outputs. Both should be as simple as possible so others can easily read and comprehend them. As soon as you introduce additional code/logic into a test, it becomes harder to read.
Your test actually looks like a fuzz test. Fuzz tests can be very useful, but in unit tests randomness should be avoided for the sake of reproducibility.
A Unit-Test should exercise your code, not something as part of the language you are using.
If the code's logic is to concatenate strings in a special way, you should be testing for that - otherwise you need to rely on your language/framework.
Finally, you should create your unit tests to fail first "with meaning". In other words, random values shouldn't be used (unless you're testing your random number generator isn't returning the same set of random values!)
Yes. It bothers me too... although I'd say that it is more prevalent with non-trivial computations. In order to avoid updating the test when the code changes, some programmers write an IsX=X test, which always succeeds irrespective of the SUT.
About duplicating functionality
You don't have to. Your test can state what the expected output is, not how you derived it.
Although in some non-trivial cases, it may make your test more readable to show how you derived the expected value - test as a spec. You shouldn't refactor away this duplication:
def doubler(x); x * 2; end

def test_doubler()
  input, expected = 10, doubler(10)
  assert_equal expected, doubler(10)
end
Now if I change doubler(x) to be a tripler, the above test won't fail.
def doubler(x); x * 3; end
However this one would:
def test_doubler()
  assert_equal(20, doubler(10))
end
randomness in unit tests - Don't.
Instead of random datasets, choose static, representative data points for testing and use an xUnit RowTest/TestCase to run the test with different data inputs. If n input-sets are identical for the unit, choose 1.
The test in the OP could be used as an exploratory test, or to determine additional representative input-sets. Unit tests need to be repeatable (see q#61400); using random values defeats this objective.
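In plain Ruby/test-unit, a row test for the concat_strings example from the question might look like this sketch:

CONCAT_CASES = [
  # str1,  str2, expected
  ['a',    'b',  'a AND b'],
  ['',     'b',  ' AND b'],     # empty-string edge case
  ['a b',  'c',  'a b AND c'],  # embedded whitespace
]

def test_concat_strings_rows
  CONCAT_CASES.each do |str1, str2, expected|
    assert_equal expected, concat_strings(str1, str2)
  end
end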
Never use random data for input. If your test reports a failure, how are you ever going to be able to duplicate it? And don't use the same function to generate the expected result. If you have a bug in your method you're likely to put the same bug in your test. Compute the expected results by some other method.
Hard-coded values are perfectly fine, and make sure inputs are picked to represent all of the normal and edge cases. At the very least test the expected inputs as well as inputs in the wrong format or wrong size (eg: null values).
It's really quite simple -- a unit test must test whether the function works or not. That means you need to give a range of known inputs that have known outputs and test against that. There is no universal right way to do that. However, using the same algorithm for the method and the verification proves nothing but that you're adept at copy/paste.

Is checking Perl function arguments worth it?

There's a lot of buzz about MooseX::Method::Signatures and even before that, modules such as Params::Validate that are designed to type check every argument to methods or functions. I'm considering using the former for all my future Perl code, both personal and at my place of work. But I'm not sure if it's worth the effort.
I'm thinking of all the Perl code I've seen (and written) before that performs no such checking. I very rarely see a module do this:
my ($a, $b) = @_;
defined $a or croak '$a must be defined!';
!ref $a or croak '$a must be a scalar!';
...
@_ == 2 or croak "Too many arguments!";
Perhaps because it's simply too much work without some kind of helper module, but perhaps because in practice we don't send excess arguments to functions, and we don't send arrayrefs to methods that expect scalars - or if we do, we have use warnings; and we quickly hear about it - a duck typing approach.
So is Perl type checking worth the performance hit, or are its strengths predominantly shown in compiled, strongly typed languages such as C or Java?
I'm interested in answers from anyone who has experience writing Perl that uses these modules and has seen benefits (or not) from their use; if your company/project has any policies relating to type checking; and any problems with type checking and performance.
UPDATE: I read an interesting article on the subject recently, called Strong Testing vs. Strong Typing. Ignoring the slight Python bias, it essentially states that type checking can be suffocating in some instances, and even if your program passes the type checks, it's no guarantee of correctness - proper tests are the only way to be sure.
If it's important for you to check that an argument is exactly what you need, it's worth it. Performance only matters when you already have correct functioning. It doesn't matter how fast you can get a wrong answer or a core dump. :)
Now, that sounds like a stupid thing to say, but consider some cases where it isn't. Do I really care what's in @_ here?
sub looks_like_a_number { $_[0] !~ /\D/ }
sub is_a_dog { eval { $_[0]->DOES( 'Dog' ) } }
In those two examples, if the argument isn't what you expect, you are still going to get the right answer because the invalid arguments won't pass the tests. Some people see that as ugly, and I can see their point, but I also think the alternative is ugly. Who wins?
However, there are going to be times that you need guard conditions because your situation isn't so simple. The next thing you have to pass your data to might expect them to be within certain ranges or of certain types and don't fail elegantly.
When I think about guard conditions, I think through what could happen if the inputs are bad and how much I care about the failure. I have to judge that by the demands of each situation. I know that sucks as an answer, but I tend to like it better than a bondage-and-discipline approach where you have to go through all the mess even when it doesn't matter.
I dread Params::Validate because its code is often longer than my subroutine. The Moose stuff is very attractive, but you have to realize that it's a way for you to declare what you want and you still get what you could build by hand (you just don't have to see it or do it). The biggest thing I hate about Perl is the lack of optional method signatures, and that's one of the most attractive features in Perl 6 as well as Moose.
I basically concur with brian. How much you need to worry about your method's inputs depends heavily on how much you are concerned that a) someone will input bad data, and b) bad data will corrupt the purpose of the method. I would also add that there is a difference between external and internal methods. You need to be more diligent about public methods because you're making a promise to consumers of your class; conversely you can be less diligent about internal methods as you have greater (theoretical) control over the code that accesses it, and have only yourself to blame if things go wrong.
MooseX::Method::Signatures is an elegant solution to adding a simple declarative way to explain the parameters of a method. Method::Signatures::Simple and Params::Validate are nice but lack one of the features I find most appealing about Moose: the Type system. I have used MooseX::Declare and by extension MooseX::Method::Signatures for several projects and I find that the bar to writing the extra checks is so minimal it's almost seductive.
Yes, it's worth it - defensive programming is one of those things that are always worth it.
The counterargument I've seen presented to this is that checking parameters on every single function call is redundant and a waste of CPU time. This argument's supporters favor a model in which all incoming data is rigorously checked when it first enters the system, but internal methods have no parameter checks because they should only be called by code which will pass them data which has already passed the checks at the system's border, so it is assumed to still be valid.
In theory, I really like the sound of that, but I can also see how easily it can fall like a house of cards if someone uses the system (or the system needs to grow to allow use) in a way that was unforeseen when the initial validation border is established. All it takes is one external call to an internal function and all bets are off.
In practice, I'm using Moose at the moment and Moose doesn't really give you the option to bypass validation at the attribute level, plus MooseX::Declare handles and validates method parameters with less fuss than unrolling @_ by hand, so it's pretty much a moot point.
I want to mention two points here. The first is tests; the second is the performance question.
1) Tests
You mentioned that tests can do a lot and that tests are the only way to be sure that your code is correct. In general I would say this is absolutely correct. But tests themselves only solve one problem.
If you write a module, you have two problems, or let's say two different people that use your module: you as a developer, and a user that just uses your module. Tests help with the first, ensuring that your module is correct and does the right thing, but they don't help the user who just uses your module.
For the latter, I have one example. I had written a module using Moose and some other stuff, and my code always ended in a segmentation fault. I began to debug my code and search for the problem, and spent around 4 hours finding the error. In the end the problem was that I had used Moose with the Array trait: I called the "map" function but didn't provide a subroutine reference, just a string or something else.
Sure, this was an absolutely stupid error of mine, but I spent a long time debugging it. A check that the argument is a subref would have cost the developer 10 seconds of time, and would have saved me (and probably others) a lot more.
I also know of other examples. I had written a REST client for an interface, completely OOP with Moose. You always got objects back, and you could change the attributes, but of course it didn't call the REST API for every change you made. Instead, you change your values and at the end call an update() method that transfers the data.
Now I had a user who wrote:

$obj->update({ foo => 'bar' });

Sure, he got an error back that update() did not work. But of course it didn't work: the update() method doesn't accept a hashref. It only does a synchronisation of the current state of the object with the online service. The correct code would be:

$obj->foo('bar');
$obj->update();

The first call "worked" because I never did any checking of the arguments, and I don't throw an error if someone passes more arguments than I expect. The method just starts like:

sub update {
    my ($self) = @_;
    ...
}

Sure, all my tests pass 100%. But handling these errors that are not really errors cost me time too, and it probably cost the user a lot more time.
So in the end: yes, tests are the only correct way to ensure that your code works correctly. But that doesn't mean type checking is meaningless. Type checking is there to help all the non-developers (of your module) use your module correctly, and it saves you and others time finding dumb errors.
2) Performance
The short version: you don't care about performance until you care.
That means: until your module works too slowly, performance is always fast enough and you don't need to worry about it. If your module really is too slow, you need further investigation, and for that investigation you should use a profiler like Devel::NYTProf to see what is slow.
And I would say that in 99% of cases the slowness is not your type checking; it is your algorithm. You do a lot of computation, call functions too often, etc. Often it helps to take a completely different approach: use a better algorithm, do caching, or something else, and the performance hit turns out not to be your type checking. But even if the checking is the performance hit, then just remove it where it matters.
There is no reason to drop type checking where performance doesn't matter. Do you think type checking matters in a case like the REST client above? 99% of the performance cost there is the number of requests that go to the webservice, or the time for such a request. Not using type checking or MooseX::Declare etc. would speed up absolutely nothing.
And even if you do see performance disadvantages, sometimes they are acceptable, because the speed doesn't matter, or because something gives you greater value. DBIx::Class is slower than pure SQL with DBI, but DBIx::Class gives you a lot in return.
Params::Validate works great, but of course checking args slows things down. Tests are mandatory (at least in the code I write).
Yes it's absolutely worth it, because it will help during development, maintenance, debugging, etc.
If a developer accidentally sends the wrong parameters to a method, a useful error message will be generated, instead of the error being propagated down to somewhere else.
I'm using Moose extensively for a fairly large OO project I'm working on. Moose's strict type checking has saved my bacon on a few occasions. Most importantly it has helped avoid situations where "undef" values are incorrectly being passed to the method. In just those instances alone it saved me hours of debugging time.
The performance hit is definitely there, but it can be managed. Two hours of using NYTProf helped me find a few Moose attributes that I was hitting too hard; I refactored my code and got a 4x performance improvement.
Use type checking. Defensive coding is worth it.
Patrick.
Sometimes. I generally do it whenever I'm passing options via hash or hashref. In these cases it's very easy to misremember or misspell an option name, and checking with Params::Check can save a lot of troubleshooting time.
For example:
sub revise {
    my ($file, $options) = @_;

    my $tmpl = {
        test_mode       => { allow => [0, 1],    'default' => 0 },
        verbosity       => { allow => qr/^\d+$/, 'default' => 1 },
        force_update    => { allow => [0, 1],    'default' => 0 },
        required_fields => { 'default' => [] },
        create_backup   => { allow => [0, 1],    'default' => 1 },
    };

    my $args = check($tmpl, $options, 1)
        or croak "Could not parse arguments: " . Params::Check::last_error();
    ...
}
Prior to adding these checks, I'd forget whether the names used underscores or hyphens, pass require_backup instead of create_backup, etc. And this is for code I wrote myself--if other people are going to use it, you should definitely do some sort of idiot-proofing. Params::Check makes it fairly easy to do type checking, allowed value checking, default values, required options, storing option values to other variables and more.
