Test Cases AND assertion statements - defensive-programming

The code in this question made me think:
assert(value > 0);  // precondition
if (value > 0)
{
    // Do it
}
I never write the if-statement. Asserting is enough/all you can do.
"Crash early, crash often"
Code Complete states:
The assert statement makes the application correct.
The if-test makes the application robust.
I don't think you've made an application more robust by correcting invalid input values, or skipping code:
assert(value >= 0);  // precondition
assert(value <= 90); // precondition
if (value < 0)       // just in case
    value = 0;
if (value > 90)      // just in case
    value = 90;
// Do it
These corrections are based on assumptions you made about the outside world.
Only the caller knows what "a valid input value" is for your function, and he must check its validity before he calls your function.
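To illustrate the division of responsibility I mean, here is a minimal C++ sketch (rotate and its 0-90 range are made up for the example):

#include <cassert>

// Callee: documents and asserts its precondition, nothing more.
void rotate(int angle) {
    assert(angle >= 0 && angle <= 90);  // precondition
    // do the rotation
}

// Caller: the only code that knows where the value came from,
// so it validates before calling.
void onUserInput(int raw) {
    if (raw < 0 || raw > 90)
        return;  // reject the bad input here, where the context exists
    rotate(raw);
}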
To paraphrase Code Complete:
"Real-world programs become too messy when we don't rely solely on assertions."
Question: Am I wrong, stubborn, stupid, too non-defensive...

The problem with trusting just asserts is that they may be turned off in a production environment. To quote the Wikipedia article:
Most languages allow assertions to be enabled or disabled globally, and sometimes independently. Assertions are often enabled during development and disabled during final testing and on release to the customer. Not checking assertions avoids the cost of evaluating the assertions while, assuming the assertions are free of side effects, still producing the same result under normal conditions. Under abnormal conditions, disabling assertion checking can mean that a program that would have aborted will continue to run. This is sometimes preferable.
Wikipedia
So if the correctness of your code relies on the asserts being there, you may run into serious problems. Sure, if the code worked during testing it should work in production... Now enter the second person who works on the code and is just going to fix a small problem...

Use assertions for validating input you control: private methods and such.
Use if statements for validating input you don't control: public interfaces designed for consumption by the user, user input testing etc.
Test your application with assertions built in. Then deploy without the assertions.
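As a minimal C++ sketch of that split (Parser and its methods are hypothetical names):

#include <cassert>
#include <stdexcept>
#include <string>

class Parser {
public:
    // Public interface: the input comes from outside, so check with an if.
    void parse(const std::string& text) {
        if (text.empty())
            throw std::invalid_argument("text must not be empty");
        parseNonEmpty(text);
    }

private:
    // Private helper: the caller above already guaranteed the input,
    // so an assert documenting that guarantee is enough.
    void parseNonEmpty(const std::string& text) {
        assert(!text.empty());
        // ... actual parsing ...
    }
};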

In some cases, asserts are disabled when building for release. You may not have control over this (otherwise, you could build with asserts on), so it might be a good idea to do it like this.
The problem with "correcting" the input values is that the caller will not get what they expect, and this can lead to problems or even crashes in wholly different parts of the program, making debugging a nightmare.
I usually throw an exception in the if-statement to take over the role of the assert in case asserts are disabled:
assert(value > 0);
if (value <= 0) throw new ArgumentOutOfRangeException("value");
// do stuff

I would disagree with this statement:
Only the caller knows what "a valid input value" is for your function, and he must check its validity before he calls your function.
The caller might think they know that the input value is correct, but only the method's author knows how it is supposed to work. The programmer's best goal is to make the client fall into the "pit of success". You should decide what behavior is more appropriate in a given case: in some cases incorrect input values can be forgiven; in others you should throw an exception or return an error.
As for asserts, I'll repeat what other commenters said: an assert is a debug-time check for the code's author, not for the code's clients.

Don't forget that most languages allow you to turn off assertions... Personally, if I was prepared to write if tests to protect against all ranges of invalid input, I wouldn't bother with the assertion in the first place.
If, on the other hand you don't write logic to handle all cases (possibly because it's not sensible to try and continue with invalid input) then I would be using the assertion statement and going for the "fail early" approach.

If I remember correctly from CS class:
Preconditions define the conditions under which the output of your function is defined. If you make your function handle error conditions, then your function is defined for those conditions and you don't need the assert statement.
So I agree. Usually you don't need both.
As Rik commented, this can cause problems if you remove asserts in released code. Usually I don't do that, except in performance-critical places.

I should have stated I was aware of the fact that asserts (here) disappear in production code.
If the if-statement actually corrects invalid input data in production code, this means the assert never went off during testing on debug code, which means you wrote code that you never executed.
For me it's an OR situation:
(quote Andrew) "protect against all ranges of invalid input, I wouldn't bother with the assertion in the first place." -> write an if-test.
(quote aku) "incorrect input values can be forgivable" -> write an assert.
I can't stand both...

For internal functions, ones that only you will use, use asserts only. The asserts will help catch bugs during your testing, but won't hamper performance in production.
Check inputs that originate externally with if-conditions. By externally, I mean anywhere outside the code that you/your team control and test.
Optionally, you can have both. This would be for external facing functions where integration testing is going to be done before production.

A problem with assertions is that they can (and usually will) be compiled out of the code, so you need to add both walls in case one gets thrown away by the compiler.

Related

Assertive code in programming and the definition of it

Is there a simple definition?
What is the nature, so to speak, of "assertive code"?
All the definitions I have found so far are very vague.
Is there something I can read that is concise and to the point without using a lot of jargon?
I think that the jargon could be a problem in my case. I am quite dumb, but I want to learn, so any help and pointers are welcome.
When you write "imperative code", you tell the computer what to do.
When you write "declarative code", you tell the computer what to produce.
When you write "assertive code", you tell the computer what you expect to be true.
The phrase "assertive code" isn't nearly as common as the other two, and is used in different ways in practice. In an common OO language it usually just refers to using assert expressions to catch bugs. In functional programming (the example you provide), it usually refers to pattern matching and destructuring constructs that imply a particular shape for their inputs. In a language like Prolog, it can refer to a definition of goals that the program must resolve.
An assert statement is essentially an if statement that will print an error (and, sometimes, stop the program) if the condition is false. If the condition is true, it will do nothing.
Assertions are normally used in software testing. You use them to check that a program behaves in a way that you expect it to. In other words, they will sound an alarm when a program violates an assumption that the programmer wanted to check.
However, there's nothing preventing you from leaving assertions in your production code too. This can sometimes be beneficial, especially in cases where you cannot easily simulate the program with a test - for example because you don't have the real data to test it with.
In such cases you typically want failed assertions just to print a message to a log file. After having your program run for a while, you check the log file and if everything is OK, there should be no messages about failed assertions.
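For example, a "soft" assert along those lines might look like this in C++ (a sketch; the macro name is made up, and stderr stands in for the log file):

#include <cstdio>

// Like assert(), but logs the failure and keeps running instead of aborting.
#define SOFT_ASSERT(cond)                                            \
    do {                                                             \
        if (!(cond))                                                 \
            std::fprintf(stderr, "assertion failed: %s (%s:%d)\n",   \
                         #cond, __FILE__, __LINE__);                 \
    } while (0)

void process(int count) {
    SOFT_ASSERT(count >= 0);  // in production this only leaves a log entry
    // ...
}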

Is function parameter validation using errors a good pattern in Go?

Is parameter validation using error return codes considered good practice? I mean, where should somebody use errors vs. panics (are there any guidelines?).
For instance:
Is checking for non-nil + returning an error if it is nil a good practice?
Or checking for correct integer ranges etc.
I imagine that using errors that often would make Go feel very C-ish and would look pretty bad. Are panics a good alternative in those situations ?
Or should a Gopher use the Python/Ruby/JS-approach "just let it fail" ?
I'm a bit confused because panics are for real "errors" in my understanding. But using errors all the time is just bad.
And even if I did return error codes: what could I do if somebody passes a wrong parameter to my function but ignores the error codes? -> Nothing! So honestly, I would say panics are nice for those situations, but in a language where error codes are used over panics this is not very clear.
"Escaping" panics1 in Go (I mean, those which might be produced by the functions comprising the public API of your package) are to deal with errors programmers do. So, if your function gets a pointer to an object, and that can't be nil (say, to indicate that the value is missing) just go on and dereference the pointer to make the runtime panic itself if it happens to be nil. If a function expects an integer that must be in a certain range, panic if it's not in that range — because in a correct program all values which might be passed to your function are in that range, and if they don't then either the programmer failed to obey the API or they did not sanitize the value acquired from the outside which, again, is not your fault.
On the other hand, problems like a failure to open a file or to perform some other action your function is supposed to perform when called correctly should not cause panics; the function should return an appropriate error instead.
Note that the recommendation of explicitly checking for null parameters in the functions of public APIs in .NET and Java code has a different goal: making these kinds of errors sort-of more readable. But since 99% of .NET and Java code just lets all exceptions propagate to the top level (and then be displayed or maybe logged), it's just replacing one (runtime-generated) exception with another. It might make errors more obvious (the execution fails in the API function, not somewhere deeper down the call stack), but it adds unnecessary cruft to these API functions. So yes, this is opinionated, but my subjective opinion is: letting it just crash is OK in Go; you'll get a descriptive stack trace.
TL;DR
With regard to processing of run-time problems,
panics are for programming errors;
returning errors is for problems with carrying out the intended tasks of functions.
[1] Another legitimate use for panics is quick "cold-path" returns from deep recursive processing/computation; in this case the panic should be caught and processed by your package, and the corresponding public API functions should return errors. See this and this for more info.
The answer to this is subjective. Here are my thoughts:
Regarding panic, I like this quote from Go By Example (ref)
A panic typically means something went unexpectedly wrong. Mostly we use it to fail fast on errors that shouldn’t occur during normal operation, or that we aren’t prepared to handle gracefully.
In the description of your use case, I would argue that you should return an error and handle it. I would further argue that it is good practice to check the error status when one is provided by the function you are using, and that the user should check the documentation to see whether one is provided.
I would use panics to stop execution if I run across a returned error that I check and have no way to recover from.

`global` assertions?

Are there any languages with the possibility of declaring global assertions, that is, assertions that should hold during the whole program execution? It would then be possible to write something like:
global assert (-10 < speed < 10);
and this assertion would be checked every time speed changes state.
Eiffel supports all the different contracts: precondition, postcondition, invariant... You may want to use that.
On the other hand, why do you have a global variable? Why don't you create a class which modifies the speed? Doing so, you can easily check your condition every time the value changes.
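A minimal C++ sketch of that suggestion (the class name and range are hypothetical):

#include <cassert>

// Funnel every modification of the value through one checked place.
class Speed {
public:
    Speed& operator=(int v) {       // overloaded so the wrapper is used
        assert(-10 < v && v < 10);  // the would-be "global" assertion
        value_ = v;
        return *this;
    }
    operator int() const { return value_; }
private:
    int value_ = 0;
};

Speed speed;  // used like a plain variable: speed = 5; int x = speed;

Any assignment that violates the range now fires the assertion at the moment of the change.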
I'm not aware of any languages that truly do such a thing, and I would doubt that there exist any since it is something that is rather hard to implement and at the same time not something that a lot of people need.
It is often better to simply assert that the inputs are valid and that modifications are only done when allowed and in a defined, sane way. This removes the need for "global asserts".
You can get this effect "through the backdoor" in several ways, though none is truly elegant, and two are rather system-dependent:
If your language allows operator overloading (e.g. C++), you can make a class that overloads every operator which modifies the value, as in the sketch above. It is considerable work, but on the other hand trivial, to do the assertions in there.
On pretty much every system, you can change the protection of the memory pages that belong to your process. You could put the variable (and any other variables that you want to assert on) on a separate page and set that page to read-only. This will cause a segmentation fault when the value is written to, which you can catch (and verify that the assertion is true). Windows even makes this explicitly available via "guard pages" (which are really only "read-only pages in disguise").
Most modern processors support hardware breakpoints. Unless your program is to run on some very exotic platform, you can exploit these to have more fine-grained control in a similar way as by tampering with protections. See for example this article on another site, which describes how to do it under Windows on x86. This solution will require you to write a kind of "mini-debugger" and implies that you may possibly run into trouble when running your program under a real debugger.

If an assert fails, is there a bug?

I've always followed the logic: if an assert fails, then there is a bug. The root cause could either be:
The assert itself is invalid (a bug)
There is a programming error (a bug)
(no other options)
That is: are there any other conclusions one could come to? Are there cases where an assert would fail and there is no bug?
If assert fails there is a bug in either the caller or callee. Why else would there be an assertion?
Yes, there is a bug in the code.
Code Complete
Assertions check for conditions that should never occur. [...] If an assertion is fired for an anomalous condition, the corrective action is not merely to handle an error gracefully; the corrective action is to change the program's source code, recompile, and release a new version of the software.
A good way to think of assertions is as executable documentation: you can't rely on them to make the code work, but they can document assumptions more actively than program-language comments can.
That's a good question.
My feeling is: if the assert fails due to your code, then it is a bug. The assertion expresses an expected behaviour/result of your code, so an assertion failure is a failure of your code.
The only exception is an assert that was meant to flag a warning condition, in which case a special class of assert should have been used.
So any assert failure indicates a bug, as you suggest.
If you are using assertions you're following Bertrand Meyer's Design by Contract philosophy. It's a programming error - the contract (assertion) you have specified is not being followed by the client (caller).
If you are trying to be logically inclusive about all the possibilities, remember that electronic circuitry is known to be affected by radiation from space. If the right photon/particle hits in just the right place at just the right time, it can cause an otherwise logically impossible state transition.
The probability is vanishingly small but still non-zero.
I can think of one case that wouldn't really class as a bug:
An assert placed to check for something external that normally should be there. You're hunting something nutty that occurs on one machine and you want to know if a certain factor is responsible.
A real world example (although from before the era of asserts): If a certain directory was hidden on a certain machine the program would barf. I never found any piece of code that should have cared if the directory was hidden. I had only very limited access to the offending machine (it had a bunch of accounting stuff on it) so I couldn't hunt it properly on the machine and I couldn't reproduce it elsewhere. Something that was done with that machine (the culprit was never identified) occasionally turned that directory hidden.
I finally resorted to putting a test in the startup to see if the directory was hidden and stopping with an error if it was.
No. An assertion failure means something happened that the original programmer did not intend or expect to occur.
This can indicate:
A bug in your code (you are simply calling the method incorrectly)
A bug in the assertion (the original programmer has been too zealous and is complaining about you doing something that is quite reasonable and that the method will actually handle perfectly well).
A bug in the called code (a design flaw). That is, the called code provides a contract that does not allow you to do what you need to do. The assertion warns you that you can't do things that way, but the solution is to extend the called method to handle your input.
A known but unimplemented feature. Imagine I implement a method that could process positive and negative integers, but I only need it (for now) to handle positive ones. I know that the "perfect" implementation would handle both, but until I actually need it to handle negatives, it is a waste of effort to implement support (and it would add code bloat and possibly slow down my application). So I have considered the case but decided not to implement it until the need is proven. I therefore add an assert to mark this unimplemented code. When I later trigger the assert by passing a negative value in, I know that the additional functionality is now needed, so I must augment the implementation. Deferring writing the code until it is actually required thus saves me a lot of time (in most cases I never implement the additional feature), but the assert makes sure that I don't get any bugs when I try to use the unimplemented feature.
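A C++ sketch of that last case (the function and its behaviour are hypothetical):

#include <cassert>

// Could in principle handle negative values too, but only the positive
// path is needed (and implemented) so far; the assert marks the gap.
int scaled(int value) {
    assert(value >= 0 && "negative values: considered but not implemented");
    return value * 2;
}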

How "defensive" should my code be?

I was having a discussion with one of my colleagues about how defensive your code should be. I am all for defensive programming, but you have to know where to stop. We are working on a project that will be maintained by others, but this doesn't mean we have to check for ALL the crazy things a developer could do. Of course you could, but that would add a very big overhead to your code.
How do you know where to draw the line?
Anything a user enters directly or indirectly, you should always sanity-check. Beyond that, a few asserts here and there won't hurt, but you can't really do much about crazy programmers editing and breaking your code, anyway!-)
I tend to change the amount of defense I put in my code based on the language. Today I'm primarily working in C++ so my thoughts are drifting in that direction.
When working in C++ there cannot be enough defensive programming. I treat my code as if I'm guarding nuclear secrets and every other programmer is out to get them. Asserts, throws, compile-time template error hacks, argument validation, eliminating pointers, in-depth code reviews and general paranoia are all fair game. C++ is an evil wonderful language that I both love and severely mistrust.
I'm not a fan of the term "defensive programming". To me it suggests code like this:
void MakePayment( Account * a, const Payment * p ) {
    if ( a == 0 || p == 0 ) {
        return;
    }
    // payment logic here
}
This is wrong, wrong, wrong, but I must have seen it hundreds of times. The function should never have been called with null pointers in the first place, and it is utterly wrong to quietly accept them.
The correct approach here is debatable, but a minimal solution is to fail noisily, either by using an assert or by throwing an exception.
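A sketch of the fail-noisily version (Account and Payment are the placeholder types from the example above):

#include <cassert>
#include <stdexcept>

struct Account;
struct Payment;

void MakePayment(Account* a, const Payment* p) {
    assert(a != nullptr && p != nullptr);  // noisy in debug builds...
    if (a == nullptr || p == nullptr)      // ...and noisy in release too
        throw std::invalid_argument("MakePayment: null argument");
    // payment logic here
}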
Edit: I disagree with some other answers and comments here - I do not think that all functions should check their parameters (for many functions this is simply impossible). Instead, I believe that all functions should document the values that are acceptable and state that other values will result in undefined behaviour. This is the approach taken by the most successful and widely used libraries ever written - the C and C++ standard libraries.
And now let the downvotes begin...
I don't know that there's really any way to answer this. It's just something that you learn from experience. You just need to ask yourself how common a potential problem is likely to be and make a judgement call. Also consider that you don't necessarily have to always code defensively. Sometimes it's acceptable just to note any potential problems in your code's documentation.
Ultimately though, I think this is just something that a person has to follow their intuition on. There's no right or wrong way to do it.
If you're working on the public APIs of a component then it's worth doing a good amount of parameter validation. This led me to a habit of doing validation everywhere. That's a mistake. All that validation code never gets tested and potentially makes the system more complicated than it needs to be.
Now I prefer to validate by unit testing. Validation definitely happens for data coming from external sources, but not for calls from non-external developers.
I always Debug.Assert my assumptions.
My personal ideology: the defensiveness of a program should be proportional to the maximum naivety/ignorance of the potential user base.
Being defensive against developers consuming your API code is not that different from being defensive against regular users.
Check the parameters to make sure they are within appropriate bounds and of expected types
Verify that the number of API calls which could be made is within your Terms of Service. Generally called throttling, this usually only applies to web services and password-checking functions.
Beyond that there's not much else to do except make sure your app recovers well in the event of a problem and that you always give ample information to the developer so that they understand what's going on.
Defensive programming is only one way of honouring a contract in a design-by-contract manner of coding.
The other two are
total programming and
nominal programming.
Of course you shouldn't defend yourself against every crazy thing a developer could do, but you should state in which context your code will do what is expected, using preconditions:
// precondition: par is so and so and so
function doSth(par)
{
    debug.assert(par is so and so and so);
    // do stuff with par
    return result;
}
I think you have to bring in the question of whether you're creating tests as well. You should be defensive in your coding, but as pointed out by JaredPar, I also believe it depends on the language you're using. If it's unmanaged code, then you should be extremely defensive. If it's managed, I believe you have a little bit of wiggle room.
If you have tests, and some other developer tries to decimate your code, the tests will fail. But then again, it depends on the test coverage of your code (if there is any).
I try to write code that is more than defensive, but downright hostile. If something goes wrong and I can fix it, I will. If not, throw or pass on the exception and make it someone else's problem. Anything that interacts with a physical device (file system, database connection, network connection) should be considered unreliable and prone to failure. Anticipating these failures and trapping them is critical.
Once you have this mindset, the key is to be consistent in your approach. Do you expect to hand back status codes to communicate problems in the call chain, or do you like exceptions? Mixed models will kill you, or at least drive you to drink. Heavily. If you are using someone else's API, then isolate it behind mechanisms that trap and report in the terms you use; use these wrapping interfaces.
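As a C++ sketch of such a wrapping interface (net_send is a hypothetical status-code API standing in for someone else's library):

#include <stdexcept>
#include <string>

// Hypothetical third-party C-style call: returns 0 on success.
extern "C" int net_send(const char* buf, int len);

// Wrapper that converts the foreign status-code model into our exception
// model, so the rest of the code sees a single, consistent convention.
void Send(const std::string& data) {
    const int rc = net_send(data.data(), static_cast<int>(data.size()));
    if (rc != 0)
        throw std::runtime_error("net_send failed, code " + std::to_string(rc));
}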
If the discussion here is how to code defensively against future (possibly malevolent or incompetent) maintainers, there is a limit to what you can do. Enforcing contracts through test coverage and liberal assertion of your assumptions is probably the best you can do, and it should be done in a way that ideally doesn't clutter the code and make the job harder for the future non-evil maintainers. Asserts are easy to read and understand and make it clear what the assumptions of a given piece of code are, so they're usually a great idea.
Coding defensively against user actions is another issue entirely, and the approach that I use is to think that the user is out to get me. Every input is examined as carefully as I can manage, and I make every effort to have my code fail safe - try not to persist any state that isn't rigorously vetted, correct where you can, exit gracefully if you cannot, etc. If you just think about all the bozo things that could be perpetrated on your code by outside agents, it gets you in the right mindset.
Coding defensively against other code, such as your platform or other modules, is exactly the same as users: they're out to get you. The OS is always going to swap out your thread at an inopportune time, networks are always going to go away at the wrong time, and in general, evil abounds around every corner. You don't need to code against every potential problem out there - the cost in maintenance might not be worth the increase in safety - but it sure doesn't hurt to think about it. And it usually doesn't hurt to explicitly comment in the code if there's a scenario you thought of but regard as unimportant for some reason.
Systems should have well-designed boundaries where defensive checking happens. There should be a decision about where user input is validated (at what boundary) and where other potential defensive issues require checking (for example, third-party integration points, publicly available APIs, rules-engine interaction, or different units coded by different teams of programmers). More defensive checking than that violates DRY in many cases, and just adds maintenance cost for very little benefit.
That being said, there are certain points where you cannot be too paranoid. Potential for buffer overflows, data corruption and similar issues should be very rigorously defended against.
I recently had a scenario in which user input data was propagated through a remote facade interface, then a local facade interface, then some other class, to finally get to the method where it was actually used. I asked myself: when should the value be validated? I added validation code only to the final class, where the value was actually used. Adding other validation snippets to the classes along the propagation path would have been too defensive for me. One exception could be the remote facade, but I skipped that too.
Good question. I've flip-flopped between doing sanity checks and not doing them. It's a 50/50 situation; I'd probably take a middle ground where I would only "bullet-proof" a routine if it:
(a) is called from more than one place in the project,
(b) has logic that is LIKELY to change,
(c) cannot use default values, and
(d) cannot 'fail' gracefully.
Darknight
