Ruby's duck-typing is great, but this is the one way it bites me in the ass. I'll have some long-running text-processing script going, and after several hours, some unexpected set of circumstances causes the script to exit with a NoMethodError because a variable has become nil.
Now, once it happens, it's usually an easy fix, but it would be nicer if I could predict these better, or at least handle these types of errors more gracefully. Sorry for the vagueness of the question, but this type of error just happens too often to me and I wonder if there's a good way to avoid it.
Is there some best practice related to these kinds of "type errors" for Ruby?
Look up Design by Contract. It's useful in many programming paradigms, but it's particularly useful when you don't have a compiler to help you catch this sort of error, such as forbidding particular values for a parameter.
In essence, DbC lets you state an assumption about a parameter in one place, and then skip the mundane checks that would otherwise be needed everywhere else to guarantee that assumption holds.
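As a sketch of the idea in plain Ruby (require_present and word_count are made-up names; gems such as contracts offer a fuller DbC implementation):

def require_present(value, name)
  raise ArgumentError, "#{name} must not be nil" if value.nil?
  value
end

def word_count(line)
  require_present(line, "line")  # check the assumption once, at the boundary
  line.split.size                # the body may now assume line is non-nil
end

word_count("two words")  # => 2
word_count(nil)          # ArgumentError: line must not be nil

The point is that the nil fails loudly at the boundary, hours earlier than the eventual NoMethodError would.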
What about Object#nil? (every object responds to nil?, so you can check a value before using it)?
I am trying to track down a non-exhaustive pattern in a library's code, specifically HDBC's MySQL implementation. I believe it is trying to match over the types in my program and map them to MySQL's types. I can't seem to get a call stack for this error, and since there are a number of parameters to the SQL query, it is difficult to track down exactly which one is causing it.
Is it possible to get a call stack in Haskell so I would know which parameter was causing the error? Also, I would have thought the compiler could catch this, since it can look at my types and the patterns and make sure there is a corresponding match.
You can use the GHCi debugger to identify where the exception is coming from.
I walk through a full example here.
You might also take a look at the Debug.Trace library.
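For reference, a GHCi debugging session for this sort of hunt might look roughly like this (Main.hs and main are placeholders for your own module and entry point):

$ ghci Main.hs
ghci> :set -fbreak-on-exception   -- stop whenever an exception is thrown
ghci> :trace main                 -- run under the tracer
Stopped at <exception thrown>
ghci> :hist                       -- show recent evaluation steps
ghci> :back                       -- walk back to the offending call

And Debug.Trace lets you print values as they flow through pure code, which can narrow down which parameter is hitting the bad pattern:

import Debug.Trace (trace)

describe x = trace ("about to convert: " ++ show x) x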
I really don't see a sane use for these. There is already rescue and raise, so why the need for throw and catch? It seems they are supposed to be used to jump out of deep nesting, but that just smells like a goto to me. Are there any examples of good, clean use for these?
Note: It looks like a few things have changed with catch/throw between 1.8 and 1.9; this answer applies to Ruby 1.9.
A big difference is that, unlike raise, you can throw anything, not just objects derived from Exception. Something silly like this is legal, for example:
throw Customer.new
It's not terribly meaningful, but it's allowed. What you can't do is the same thing with raise:
irb(main):003:0> raise Customer.new
TypeError: exception class/object expected
from (irb):3:in `raise'
from (irb):3
from /usr/local/bin/irb:12:in `<main>'
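For completeness, here is a sketch of throw/catch doing something useful with a non-exception value (Customer is a stand-in for the class above):

Customer = Struct.new(:name)

result = catch(:found) do
  [Customer.new("Ada"), Customer.new("Grace")].each do |c|
    throw :found, c if c.name == "Grace"  # unwinds straight to the catch
  end
  nil                                     # the block's value if nothing is thrown
end
result.name  # => "Grace"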
They can be really useful in simplifying DSLs for end users, because they pass control out of the DSL without the need for complex case/if statements.
I have a Ruby app which allows users to extend it via an internal DSL. Some of the functions in the DSL need to return control to specific parts of my application. Let's take a simple example. Suppose the user is developing a simple extension concerning dates:
if today is a holiday then
  do nothing
end
week_of_year = today.week.number
if week_of_year < 10 then
...
The do nothing bit triggers a throw that passes control out of the DSL execution and back to me.
On some condition, rather than continuing to execute the DSL, we want it to exit and hand control back to my application. You could have the user write lots of nested if statements so the DSL ends naturally, but that just obscures what the logic is trying to say.
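A minimal sketch of how that hook can work (do_nothing and run_extension are hypothetical names, not the actual app):

require "date"

def do_nothing
  throw :halt_dsl              # abandon the user's script entirely
end

def run_extension(script)
  catch(:halt_dsl) do          # control lands here when the DSL throws
    instance_eval(script)
  end
end

run_extension(<<~RUBY)
  do_nothing if Date.today.saturday?
  puts "the rest of the extension runs"
RUBY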
Throw really is a goto, which is famously "considered harmful", but damn it, sometimes it's the best solution.
It's basically a goto, and somewhat akin to call/cc, except that the control flow is wired up implicitly by name instead of explicitly as a parameter. The difference between throw/catch and raise/rescue is that the former is intended for control flow rather than only exceptional situations, and it doesn't waste time putting together a stack trace.
Sinatra uses throw/catch for HTTP error codes: a handler can use throw to cede control to the Sinatra library in a structured way. Other HTTP frameworks use exceptions, or return a different class of response, but this lets Sinatra (for example) try another request handler after catching the throw.
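The pattern looks roughly like this (a simplified sketch of the idea, not Sinatra's actual source):

def dispatch(handler)
  catch(:halt) { handler.call }  # the framework wraps every handler
end

# A handler can bail out from any depth with a structured result:
not_found = dispatch(-> { throw :halt, [404, "Not Found"] })
ok        = dispatch(-> { [200, "OK"] })
# not_found => [404, "Not Found"], ok => [200, "OK"]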
The difference between the two is that you can only raise exceptions but can throw anything (in 1.9). Other than that, they should be interchangeable; that is, it should be possible to rewrite one in terms of the other, as in the example given by @john-feminella.
I'm implementing a few things in Ruby and I was wondering how much error checking is appropriate (or, more precisely, how much error checking should be done by convention)?
For example, I'm implementing a method which swaps two elements in an array. The method is very simple:
def swap(a, b)
  @array[a], @array[b] = @array[b], @array[a]
end
It's really simple, but is it Ruby-ish to check whether the given indexes are valid, or is that unnecessary overhead (bearing in mind I do not intend for the method to work with wrap-around values like -1)?
I can't help you with negative indexes, but you can use
@array.fetch(a)
to raise an exception if a is an invalid index.
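For example, with standard Array behavior:

arr = [10, 20, 30]
arr[5]                  # => nil (silently)
arr.fetch(1)            # => 20
arr.fetch(5)            # IndexError: index 5 outside of array bounds: -3...3
arr.fetch(5, :missing)  # => :missing (fetch also accepts a default or a block)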
I ought to use fetch when I regard an invalid index as a "can't happen" case; too often I think only about the happy-path scenario.
It depends on the behavior that you're looking for. The Array#[] method that you're calling will do that check for you and return nil if you're using a nonexistent index, so I don't see a need to duplicate the error checking if you want that standard behavior. If you want something else, you'll need to implement the behavior you want.
However, this method does work with an index of -1, so if you want to disallow that, you will need to add a check for it.
Basically, I think a good rule is: Check for conditions in which your method will behave incorrectly, and implement the behavior you want. The out-of-bounds index condition will be caught by the array and handled a certain way -- if that handling is correct, you don't need to do anything. An index that does not match some custom expectation of the method will not be caught at all, so you definitely need to check for it.
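Putting that together, here is a sketch of swap with an explicit check, assuming wrap-around indexes should be rejected:

def swap(a, b)
  [a, b].each do |i|
    unless i.is_a?(Integer) && i.between?(0, @array.length - 1)
      raise IndexError, "index #{i} is out of bounds for #{@array.length} elements"
    end
  end
  @array[a], @array[b] = @array[b], @array[a]
end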
The title may not explain what I'm trying to get at; I couldn't think of a better way to describe it.
I was wondering if it is good practice to check the arguments that a function accepts for null or empty values before using them. I have this function, which just wraps some hash creation, like so:
Public Shared Function GenerateHash(ByVal FilePath As IO.FileInfo) As String
    If (FilePath Is Nothing) Then
        Throw New ArgumentNullException("FilePath")
    End If

    Dim _sha As New Security.Cryptography.MD5CryptoServiceProvider
    ' Using ensures the file handle is released even if hashing throws
    Using _stream As New IO.FileStream(FilePath.FullName, IO.FileMode.Open, IO.FileAccess.Read)
        Return Convert.ToBase64String(_sha.ComputeHash(_stream))
    End Using
End Function
As you can see, it just takes an IO.FileInfo as an argument, and at the start of the function I check to make sure that it is not Nothing.
I'm wondering: is this good practice, or should I just let the argument reach the actual hasher and have the exception thrown there because it is null?
Thanks.
In general, I'd suggest it's good practice to validate all of the arguments to public functions/methods before using them, and fail early rather than after executing half of the function. In this case, you're right to throw the exception.
Depending on what your method is doing, failing early could be important. If your method alters instance data on your class, you don't want it to alter half of the data, then encounter the null and throw an exception, as your object's data might then be in an intermediate and possibly invalid state.
If you're using an OO language, then I'd suggest it's essential to validate the arguments to public methods, but less important with private and protected methods. My rationale here is that you don't know what the inputs to a public method will be: any other code could create an instance of your class and call its public methods, passing in unexpected or invalid data. Private methods, however, are called from inside the class, and the class should already have validated any data it passes around internally.
One of my favourite techniques in C++ was to DEBUG_ASSERT on NULL pointers. This was drilled into me by senior programmers (along with const correctness) and is one of the things I was most strict on during code reviews. We never dereferenced a pointer without first asserting it wasn't null.
A debug assert is only active in debug targets (it gets stripped from release builds), so you don't pay the overhead of thousands of if checks in production. Generally it would either throw an exception or trigger a hardware breakpoint. We even had systems that would throw up a debug console with the file/line info and an option to ignore the assert (once, or indefinitely for the session). That was a great debug and QA tool: we'd get screenshots with the assert on the tester's screen and information on whether the program continued when ignored.
I suggest asserting all invariants in your code, including unexpected nulls. If the performance of the checks becomes a concern, find a way to conditionally compile them and keep them active in debug targets, as sketched below. Like source control, this is a technique that has saved my ass more often than it has caused me grief (the most important litmus test of any development technique).
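A minimal sketch of the idea in C++ (DEBUG_ASSERT is a made-up name; real systems hook in logging, breakpoints, and that debug console):

#include <cstdio>
#include <cstdlib>

// Compiled out entirely in release builds (when NDEBUG is defined).
#ifdef NDEBUG
#define DEBUG_ASSERT(cond) ((void)0)
#else
#define DEBUG_ASSERT(cond)                                          \
  do {                                                              \
    if (!(cond)) {                                                  \
      std::fprintf(stderr, "Assert failed: %s (%s:%d)\n",           \
                   #cond, __FILE__, __LINE__);                      \
      std::abort();                                                 \
    }                                                               \
  } while (0)
#endif

int length(const char* s) {
  DEBUG_ASSERT(s != nullptr);  // never dereference without asserting first
  int n = 0;
  while (s[n] != '\0') ++n;
  return n;
}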
Yes, it's good practice to validate all arguments at the beginning of a method and throw appropriate exceptions like ArgumentException, ArgumentNullException, or ArgumentOutOfRangeException.
If the method is private, such that only you the programmer could pass invalid arguments, then you may choose to assert that each argument is valid (Debug.Assert) instead of throwing.
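A sketch of that convention in VB.NET (OrderService and its methods are illustrative names, not from the original question):

Imports System.Collections.Generic
Imports System.Diagnostics

Public Class OrderService
    ' Public surface: callers are untrusted, so validate and throw.
    Public Sub ProcessOrder(ByVal items As List(Of String))
        If items Is Nothing Then
            Throw New ArgumentNullException("items")
        End If
        SortItems(items)
    End Sub

    ' Private helper: a bad argument here is our own bug, so assert.
    Private Sub SortItems(ByVal items As List(Of String))
        Debug.Assert(items IsNot Nothing, "ProcessOrder should have validated items")
        items.Sort()
    End Sub
End Class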
If NULL is an unacceptable input, throw an exception yourself, as you did in your sample, so that the message is helpful.
Another way of handling NULL inputs is simply to respond with NULL in turn. Whether that's appropriate depends on the type of function; in the example above I would keep the exception.
If it's for an externally facing API, then I would say you want to check every parameter, as the input cannot be trusted.
However, if it is only going to be used internally, then the input can be trusted, and you can save yourself a bunch of code that adds no value to the software.
You should check all arguments against the set of assumptions that you make in that function about their values.
As in your example: if a null argument to your function doesn't make any sense, and you're assuming that anyone using your function will know this, then being passed a null argument indicates some sort of error, and some action should be taken (e.g. throwing an exception). And if you use asserts (as James Fassett got in and said before me ;-) ), they cost you nothing in a release version (and almost nothing in a debug version either).
The same thing applies to any other assumption.
And it's going to be easier to trace the error if you generate it than if you leave it to some standard library routine to throw the exception. You will be able to provide much more useful contextual information.
It's outside the bounds of this question, but you do need to expose the assumptions that your function makes - for example, through the comment header to your function.
According to The Pragmatic Programmer by Andrew Hunt and David Thomas, it is the responsibility of the caller to make sure it gives valid input. So you must now choose whether you consider a null input to be valid. Unless it makes specific sense to treat null as valid (e.g. it is probably a good idea to consider null a legal input if you're testing for equality), I would consider it invalid. That way your program, when it hits incorrect input, will fail sooner. If your program is going to encounter an error condition, you want it to happen as soon as possible. In the event your function does inadvertently get passed a null, you should consider it a bug and react accordingly (i.e. instead of throwing an exception, consider using an assertion that kills the program, at least until you release it).
Classic design by contract: If input is right, output will be right. If input is wrong, there is a bug. (if input is right but output is wrong, there is a bug. That's a gimme.)
I'll add a couple of elaborations to the excellent design-by-contract advice offered by Brian earlier...
The principles of design by contract require that you define what is acceptable for the caller to pass in (the valid domain of input values) and then, for any valid input, what the method/provider will do.
For an internal method, you can define NULLs as outside the domain of valid input parameters. In this case, you would immediately assert that the input parameter value is NOT NULL. The key insight in this contract specification is that any call passing in a NULL value IS A CALLER'S BUG and the error thrown by the assert statement is the proper behavior.
Now, while that contract is very well defined and parsimonious, if you're exposing the method to external/public callers, you should ask yourself: is that the contract we really want?
Probably not. In a public interface, you'd probably accept the NULL (as technically within the domain of inputs the method accepts), but then gracefully decline to process it and return an error message. (It's more work to meet the naturally more complex customer-facing requirement.)
In either case, what you're after is a protocol that handles all of the cases from both the perspective of the caller and the provider, not lots of scattershot tests that can make it difficult to assess the completeness or lack of completeness of the contractual condition coverage.
Most of the time, letting it just throw the exception is pretty reasonable as long as you are sure the exception won't be ignored.
If you can add something to it, however, it doesn't hurt to wrap the exception in one that is more accurate and rethrow it. Decoding "NullPointerException" is going to take a bit longer than "IllegalArgumentException("FilePath MUST be supplied")" (or whatever).
Lately I've been working on a platform where you have to run an obfuscator before you test. Every stack trace looks like monkeys typing random crap, so I got in the habit of checking my arguments all the time.
I'd love to see a "nullable" or "nonnull" modifier on variables and arguments so the compiler could check for you.
If you're writing a public API, do your caller the favor of helping them find their bugs quickly, and check for valid inputs.
If you're writing an API where the caller might be untrusted (or the caller of the caller), check for valid inputs, because it's good security.
If your APIs are only reachable by trusted callers, like "internal" in C#, then don't feel like you have to write all that extra code. It won't be useful to anyone.