What are the serious problems caused by undefining object_id and __send__ on classes? - ruby

If I undefine all instance methods on a class then I get the following warnings:
warning: undefining `object_id' may cause serious problems
warning: undefining `__send__' may cause serious problems
What are some examples of "serious problems" this may cause?
(In particular, I am also curious as to whether this has any implications for garbage collection?)

In short, these methods are used for meta purposes (such as error reporting) as well as for ordinary purposes, so they are more important than other methods.
When something goes wrong, Ruby reports an error message and a backtrace. By default, an error message displays the inspection of the offending object. Except for a few special classes such as String and Numeric, an inspection displays the object id. In order to display error messages correctly, having object_id defined is essential. If it were not defined, the error-displaying routine itself would raise an error, which would in turn need to be displayed, causing an infinite loop that could never be reported. In such a situation, you have no way to tell what is going wrong. This is serious.
Perhaps the same can be said for __send__. It is an alias of send kept so that dynamic method invocation keeps working even when a class redefines send, and it is crucial in sending instructions to objects.
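As a rough illustration (a minimal sketch; the deeper failures show up in whatever tooling or error-reporting path silently relies on these methods), any code that depends on them simply gets a NoMethodError once they are undefined:

class Quiet
  # Each undef below triggers one of the warnings quoted in the question.
  undef_method :object_id
  undef_method :__send__
end

q = Quiet.new
begin
  q.__send__(:class)   # anything that relies on __send__ now fails
rescue NoMethodError => e
  puts e.message
end
begin
  q.object_id          # same for anything that wants the object id
rescue NoMethodError => e
  puts e.message
end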

Related

Is it possible to change Ruby's frozen object handling behaviour?

I am submitting solutions to Ruby puzzles on codewars.com and experimenting with how locked into the testing environment I am for one of the challenges.
I can redefine the classes used to test my solution, but they are defined by the system after I submit my code. If I freeze these objects, the system cannot write over them, but a RuntimeError is raised when it tries to.
I'm fairly new to Ruby, so I'm not sure which parts (other than falsiness and truthiness) are impossible to override. Can I use Ruby code to force modification of frozen objects to silently fail instead of terminate the program or is that bound up in untouchable things like the assignment operator or similar?
The real answer here is that if you might want to modify an object later, you shouldn't freeze it. That's inherent in the whole concept of "freezing" an object. But since you asked, note that you can test whether an object is frozen with:
obj.frozen?
So if those pesky RuntimeErrors are getting you down, one solution is to use a guard clause like:
obj.do_something! if !obj.frozen?
If you want to make the guard clauses implicit, you can redefine the "problem" methods using a monkey patch:
class Array
  # There are a couple of other ways to do this;
  # read up on Ruby metaprogramming if you want to know.
  alias :__pop__ :pop

  def pop
    frozen? ? nil : __pop__
  end
end
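For illustration, with that patch loaded, a frozen array fails silently while unfrozen arrays behave as before:

frozen_list = [1, 2, 3].freeze
frozen_list.pop   # => nil (silently refuses instead of raising)
[1, 2, 3].pop     # => 3 (unfrozen arrays are unaffected)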
If you want your code to work seamlessly with any and all Ruby libraries/gems, adding behavior to built-in methods like this is probably a bad idea. In this case, I doubt it will cause any problems, but whenever you choose to start hacking on Ruby's core classes, you have to be ready for the possible consequences.

Cutting off from backtrace the steps coming from particular libraries

Sometimes we use libraries that are well debugged and are usually not the cause of an error. Still, these libraries can return errors due to our misuse of their API. In such cases, the steps internal to these libraries show up within the backtrace of an error; they are just garbage from the point of view of the programmer using the library, and they make it difficult to spot the cause of the error. Even some methods in core Ruby insert internal steps into the backtrace. For example, whenever you see a backtrace involving Enumerable#inject, there is always Enumerable#each being called from it, which shows up in the backtrace and is annoying.
What is a good way to remove from the backtrace the steps internal to certain given libraries? I am currently doing it by parsing the backtrace and filtering it by the file name. Is there a better way to do it?
When you are writing a library by yourself, is there a good way to suppress the internal steps appearing in a backtrace involving a method call that uses the library? An obvious way might be to insert a pair of rescue and raise for every method that is to be used from outside of the library, but that does not seem right.
Well...
There isn't really a better way to filter. If you can get the full filepath for the backtrace, though, you can filter by directory which can rule out all stdlibs and gems. Beyond that, it's more trouble than it's worth.
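As a rough sketch of that directory-based filtering (assuming all of your own code lives under one project root; do_something_risky is just a placeholder):

PROJECT_ROOT = File.expand_path(__dir__)

def project_backtrace(error)
  # Keep only frames whose file path is inside the project tree,
  # dropping frames that come from gems and the standard library.
  error.backtrace.select { |frame| frame.start_with?(PROJECT_ROOT) }
end

begin
  do_something_risky   # hypothetical method that may raise
rescue => e
  puts project_backtrace(e)
end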
There is a much better solution for this. However, it requires that you catch all exceptions raised by Ruby in your library and then re-raise them after doing this (do the same for all of your own exceptions). So wrap all your methods with this:
begin
  ...
rescue Exception
  e = $!
  e.set_backtrace(caller(nesting_level))
  raise e
end
The nesting_level is how many library methods sit between the current method and the user's code. If the method is called directly from user code, use 0; if it is called by one library method that was called from user code, use 1, and so on.
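For example, in a hypothetical library (names made up; following the convention above, nesting_level is 0 here because process is called directly from user code):

module MyLib
  def self.process(data)
    internal_step(data)
  rescue Exception => e
    # Replace the backtrace so the library-internal frames are cut off
    # and the trace starts at (or near) the caller's code.
    e.set_backtrace(caller(0))
    raise e
  end

  def self.internal_step(data)
    raise ArgumentError, "data must respond to upcase" unless data.respond_to?(:upcase)
    data.upcase
  end
end

A call like MyLib.process(nil) then reports the ArgumentError without the internal_step frame appearing in its backtrace.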

How stringent should I be with Code Analysis compliance in Visual Studio?

After playing with Code Analysis for a small project I am working on, I am wondering just how strict I should be in making my code compliant with the analysis.
I know I can suppress warnings for this, but to me, suppressing a warning is to some extent a cop-out (no pun intended... "FxCop").
Example warning:
Do not raise exceptions in unexpected locations: 'CustomObject.Equals(object)' creates an exception of type 'ArgumentException'. Exceptions should not be raised in this type of method. If this exception instance might be raised, change this method's logic so it no longer raises an exception.
Reason for throwing this...
CustomObject.Equals(object) might try to compare a CustomObject to a FooBarObject... which aren't even of the same type, so in this instance, should I throw an exception, or just return false?
In general, should I be really anal (for want of a better word) in making my code absolutely compliant, or will I come across situations where warning suppression will become necessary?
FxCop warnings are just warnings, they don't flag invalid code. That's the job of the compiler. The rules FxCop uses were collected from years of experience writing .NET code. They represent "best practices" and in general are there to remind you of unintended consequences and the more obscure parts of .NET programming, like CAS.
Always refer back to the documentation to see why the rule exists. For CA1065 you'll see:
An Equals method should return true or false instead of throwing an exception. For example, if Equals is passed two mismatched types it should just return false instead of throwing an ArgumentException.
That exactly matches your usage, so you'll have no trouble adopting the advice. Unfortunately, the documentation is a bit short on the exact reason the rule was created, which really doesn't go beyond the "don't throw in unexpected places" guidance. The unintended consequence here is that another programmer who uses your class won't realize that a try/catch would be needed if he doesn't want the code to fail. Feel free to put a Debug.Assert() in your Equals method instead. There are plenty of cases where you'll want to ignore the advice; CA2000, for example, is particularly prone to false warnings. Apply the [SuppressMessage] attribute if necessary so you don't have to look at the warning again.

Avoiding, in general, "undefined method 'some_method' for nil:NilClass" in Ruby

Ruby's duck typing is great, but this is the one way that it bites me in the ass. I'll have some long-running text-processing script or something running, and after several hours some unexpected set of circumstances ends up causing the script to exit with a NoMethodError because a variable has become nil.
Now, once it happens, it's usually an easy fix, but it would be nicer if I could predict these better, or at least handle these types of errors more gracefully. Sorry for the vagueness of the question, but this type of error just happens too often to me and I wonder if there's a good way to avoid it.
Is there some best practice related to these kinds of "type errors" for Ruby?
Look up Design by Contract. It's useful in many programming paradigms, but it's particularly useful when you don't have a compiler to help you catch these sorts of errors or to forbid particular sorts of values for a parameter.
In essence, DbC allows you to make an assumption about a parameter. It allows you (in all but one place) to skip the mundane checks that guarantee this assumption to hold.
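In Ruby terms that might look like an explicit precondition at the public boundary (a minimal sketch; process_record and its body are made up for illustration):

def process_record(record)
  # Precondition of the contract: callers must never pass nil.
  raise ArgumentError, "record must not be nil" if record.nil?

  record.strip.downcase
end

Inside the method, and in the private helpers it calls, you can then rely on the assumption instead of re-checking it everywhere, and a contract violation surfaces immediately at the boundary rather than hours later as a NoMethodError deep in the run.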
What about Object.nil?

Should I make sure arguments aren't null before using them in a function?

The title may not really explain what I'm trying to get at; I couldn't think of a better way to describe what I mean.
I was wondering if it is good practice to check the arguments that a function accepts for null or empty values before using them. I have this function, which just wraps some hash creation, like so:
Public Shared Function GenerateHash(ByVal FilePath As IO.FileInfo) As String
    If (FilePath Is Nothing) Then
        Throw New ArgumentNullException("FilePath")
    End If

    Dim _sha As New Security.Cryptography.MD5CryptoServiceProvider
    Dim _Hash = Convert.ToBase64String(_sha.ComputeHash(New IO.FileStream(FilePath.FullName, IO.FileMode.Open, IO.FileAccess.Read)))
    Return _Hash
End Function
As you can see, it just takes an IO.FileInfo as an argument, and at the start of the function I check to make sure that it is not Nothing.
I'm wondering: is this good practice, or should I just let it get to the actual hasher and let that throw the exception because the argument is null?
Thanks.
In general, I'd suggest it's good practice to validate all of the arguments to public functions/methods before using them, and fail early rather than after executing half of the function. In this case, you're right to throw the exception.
Depending on what your method is doing, failing early could be important. If your method were altering instance data on your class, you wouldn't want it to alter half of the data, then encounter the null and throw an exception, as your object's data might then be in an intermediate and possibly invalid state.
If you're using an OO language, then I'd suggest it's essential to validate the arguments to public methods, but less important with private and protected methods. My rationale here is that you don't know what the inputs to a public method will be - any other code could create an instance of your class, call its public methods, and pass in unexpected/invalid data. Private methods, however, are called from inside the class, and the class should already have validated any data passed around internally.
One of my favourite techniques in C++ was to DEBUG_ASSERT on NULL pointers. This was drilled into me by senior programmers (along with const correctness) and is one of the things I was most strict on during code reviews. We never dereferenced a pointer without first asserting it wasn't null.
A debug assert is only active for debug targets (it gets stripped in release) so you don't have the extra overhead in production to test for thousands of if's. Generally it would either throw an exception or trigger a hardware breakpoint. We even had systems that would throw up a debug console with the file/line info and an option to ignore the assert (once or indefinitely for the session). That was such a great debug and QA tool (we'd get screenshots with the assert on the testers screen and information on whether the program continued if ignored).
I suggest asserting all invariants in your code including unexpected nulls. If performance of the if's becomes a concern find a way to conditionally compile and keep them active in debug targets. Like source control, this is a technique that has saved my ass more often than it has caused me grief (the most important litmus test of any development technique).
Yes, it's good practice to validate all arguments at the beginning of a method and throw appropriate exceptions like ArgumentException, ArgumentNullException, or ArgumentOutOfRangeException.
If the method is private such that only you the programmer could pass invalid arguments, then you may choose to assert each argument is valid (Debug.Assert) instead of throw.
If NULL is an unacceptable input, throw an exception yourself, as you did in your sample, so that the message is helpful.
Another way of handling NULL inputs is simply to respond with NULL in turn. It depends on the type of function -- in the example above I would keep the exception.
If it's for an externally facing API, then I would say you want to check every parameter, as the input cannot be trusted.
However, if it is only going to be used internally, then the input should be trustworthy and you can save yourself a bunch of code that isn't adding value to the software.
You should check all arguments against the set of assumptions that you make in that function about their values.
As in your example, if a null argument to your function doesn't make any sense, and you're assuming that anyone using your function will know this, then being passed a null argument indicates some sort of error and warrants some sort of action (e.g. throwing an exception). And if you use asserts (as James Fassett got in and said before me ;-) ), they cost you nothing in a release version. (They cost you almost nothing in a debug version either.)
The same thing applies to any other assumption.
And it's going to be easier to trace the error if you generate it than if you leave it to some standard library routine to throw the exception. You will be able to provide much more useful contextual information.
It's outside the bounds of this question, but you do need to expose the assumptions that your function makes - for example, through the comment header to your function.
According to The Pragmatic Programmer by Andrew Hunt and David Thomas, it is the responsibility of the caller to make sure it gives valid input. So, you must now choose whether you consider a null input to be valid. Unless it makes specific sense to consider null to be a valid input (e.g. it is probably a good idea to consider null to be a legal input if you're testing for equality), I would consider it invalid. That way your program, when it hits incorrect input, will fail sooner. If your program is going to encounter an error condition, you want it to happen as soon as possible. In the event your function does inadvertently get passed a null, you should consider it to be a bug, and react accordingly (i.e. instead of throwing an exception, you should consider making use of an assertion that kills the program, until you are releasing the program).
Classic design by contract: If input is right, output will be right. If input is wrong, there is a bug. (if input is right but output is wrong, there is a bug. That's a gimme.)
I'll add a couple of elaborations to the excellent design-by-contract advice offered by Brian earlier...
The principles of "design by contract" require that you define what is acceptable for the caller to pass in (the valid domain of input values) and then, for any valid input, what the method/provider will do.
For an internal method, you can define NULLs as outside the domain of valid input parameters. In this case, you would immediately assert that the input parameter value is NOT NULL. The key insight in this contract specification is that any call passing in a NULL value IS A CALLER'S BUG, and the error thrown by the assert statement is the proper behavior.
Now, while very well defined and parsimonious, if you're exposing the method to external/public callers, you should ask yourself: is that the contract I/we really want?
Probably not. In a public interface, you'd probably accept the NULL (as technically in the domain of inputs the method accepts), but then gracefully decline to process it, with a return message. (More work to meet the naturally more complex customer-facing requirement.)
In either case, what you're after is a protocol that handles all of the cases from both the perspective of the caller and the provider, not lots of scattershot tests that can make it difficult to assess the completeness or lack of completeness of the contractual condition coverage.
Most of the time, letting it just throw the exception is pretty reasonable as long as you are sure the exception won't be ignored.
If you can add something to it, however, it doesn't hurt to wrap the exception with one that is more accurate and rethrow it. Decoding "NullPointerException" is going to take a bit longer than "IllegalArgumentException("FilePath MUST be supplied")" (Or whatever).
Lately I've been working on a platform where you have to run an obfuscator before you test. Every stack trace looks like monkeys typing random crap, so I got in the habit of checking my arguments all the time.
I'd love to see a "nullable" or "nonull" modifier on variables and arguments so the compiler can check for you.
If you're writing a public API, do your caller the favor of helping them find their bugs quickly, and check for valid inputs.
If you're writing an API where the caller might be untrusted (or the caller of the caller), check for valid inputs, because it's good security.
If your APIs are only reachable by trusted callers, like "internal" in C#, then don't feel like you have to write all that extra code. It won't be useful to anyone.

Resources