Should class-level method calls in Ruby class definitions be idempotent? - ruby

I recently ran into an issue where calling load twice on a Ruby class caused bugs (here's a real-world example). This is because there were stateful method invocations taking place in the class body, and load was causing these to be executed twice. A simple example is as follows:
base.rb:
class Base
def foo
puts "BASE"
end
end
derived.rb:
require "./base"
class Derived < Base
alias_method :foo_aliased, :foo
def foo
puts "DERIVED!"
end
end
Execution from a REPL:
$ load './derived.rb'
> true
$ Derived.new.foo
> DERIVED!
> nil
$ Derived.new.foo_aliased
> BASE
> nil
$ load './derived.rb'
> true
$ Derived.new.foo
> DERIVED!
> nil
$ Derived.new.foo_aliased
> DERIVED!
> nil
In this example, the second load causes alias_method to clobber the original alias. As a result, we've broken any code that depends on having an intact alias to the original method.
Brute-force class reloading is common enough (e.g., you see it a lot in the each_run clause for RSpec configurations using Spork) that it's not always easy to forbid the use of load outright. As such, it seems like the only way to prevent bugs is to ensure that class definitions are "idempotent". In other words, you have make sure that methods intended to be called from a class definition should produce the same result no matter how many times you call them. The fact that re-loads don't break more code seems to suggest an unspoken convention.
Is there some style guide for Ruby class design that mandates this? If so, what are the common techniques for enforcing these semantics? For some methods, you could just do some memoization to prevent re-execution of the same method with the same arguments, but other things like alias_method seem harder to work around.

You are entirely wrong. Assuming all class methods to be idempotent is neither a sufficient nor a necessary condition for a code to not break when loaded multiple times.
Therefore, it naturally follows that there is no style guide to enforce that.

Related

How can I determine what objects a call to ruby require added to the global namespace?

Suppose I have a file example.rb like so:
# example.rb
class Example
def foo
5
end
end
that I load with require or require_relative. If I didn't know that example.rb defined Example, is there a list (other than ObjectSpace) that I could inspect to find any objects that had been defined? I've tried checking global_variables but that doesn't seem to work.
Thanks!
Although Ruby offers a lot of reflection methods, it doesn't really give you a top-level view that can identify what, if anything, has changed. It's only if you have a specific target you can dig deeper.
For example:
def tree(root, seen = { })
seen[root] = true
root.constants.map do |name|
root.const_get(name)
end.reject do |object|
seen[object] or !object.is_a?(Module)
end.map do |object|
seen[object] = true
puts object
[ object.to_s, tree(object, seen) ]
end.to_h
end
p tree(Object)
Now if anything changes in that tree structure you have new things. Writing a diff method for this is possible using seen as a trigger.
The problem is that evaluating Ruby code may not necessarily create all the classes that it will or could create. Ruby allows extensive modification to any and all classes, and it's common that at run-time it will create more, or replace and remove others. Only libraries that forcibly declare all of their modules and classes up front will work with this technique, and I'd argue that's a small portion of them.
It depends on what you mean by "the global namespace". Ruby doesn't really have a "global" namespace (except for global variables). It has a sort-of "root" namespace, namely the Object class. (Although note that Object may have a superclass and mixes in Kernel, and stuff can be inherited from there.)
"Global" constants are just constants of Object. "Global functions" are just private instance methods of Object.
So, you can get reasonably close by examining global_variables, Object.constants, and Object.instance_methods before and after the call to require/require_relative.
Note, however, that, depending on your definition of "global namespace" (private) singleton methods of main might also count, so you check for those as well.
Of course, any of the methods the script added could, when called at a later time, themselves add additional things to the global scope. For example, the following script adds nothing to the scope, but calling the method will:
class String
module MyNonGlobalModule
def self.my_non_global_method
Object.const_set(:MY_GLOBAL_CONSTANT, 'Haha, gotcha!')
end
end
end
Strictly speaking, however, you asked about adding "objects" to the global namespace, and neither constants nor methods nor variables are objects, soooooo … the answer is always "none"?

Should mixins make assumptions about their including class?

I found examples of a mixin that makes assumptions about what instance variables an including class has. Something like this:
module Fooable
def calculate
#val_one + #val_two
end
end
class Bar
attr_accessor :val_one, :val_two
include Fooable
end
I found arguments for and against whether it's a good practice. The obvious alternative is passing val_one and val_two as parameters, but that doesn't seem as common, and having more heavily parameterized methods could be a downside.
Is there conventional wisdom regarding a mixin's dependence on class state? What are the advantages/disadvantages of reading values from instance variables vs. passing them in as parameters? Alternatively, does the answer change if you start modifying instance variables instead of just reading them?
It is not a problem at all to assume in a module some properties about the class that includes/prepends it. That is usually done. In fact, the Enumerable module assumes that a class that includes/prepends it has a each method, and has many methods that depend on it. Likewise, the Comparable module assumes that the including/prepending class has <=>. I cannot immediately come up with an example of an instance variable, but there is not a crucial difference between methods and instance variables regarding this point; the same should be said about instance variables.
Disadvantage of passing arguments without using instance variable is that your method call will be verbose and less flexible.
Rule of thumb: Mixins should never make any assumptions about the classes/modules they may be included in. However, as it usually goes, any rule has exceptions.
But first, let's talk about the first part. Specifically, accessing (depending on) including class instance variables. If your mixin depends on anything within the including class, then it means that you can not change that "anything" in the parent class with a guarantee that it would not break something. Also, you will have to document that dependency of mixin not only in documentation related to mixin, but also in the documentation of the class/module that includes the mixin. Because, down the road, the requirements may change or someone might see an opportunity in refactoring your class/module code. Obviously, that person will not dig for that class's documentation or know that that specific class/module has a section in your documentation.
Anyhow, by depending on including class internals, not only your mixin made itself dependant, but also ended up making any class/module that includes it a dependant. Which is definitely not a good thing. Because, you can not control who or which class/module has included your mixin, you will never have a confidence to introduce a change. Not having that confidence to change without a fear of breaking anything is project drainer!
The "workaround" may be - "covering it with test". But, consider yourself or someone else maintaining that code in 2 years. Will you remember to cover your new class, that includes the mixin, to make sure it complies with all mixin dependency requirements? I am sure you or the new maintainer will not.
So, from the maintenance or basic OOP principles, your mixin must not depend on any including class/module.
Now, let's talk about that there is alway an exception to the rule bit.
You can make an exception, provided that mixin dependency does not introduce "surprises" to your code. So, it is ok, if the mixin dependencies are well known among your team or they are a convention. Another case could be when the mixin is used internally and you control who uses it (basically, when you are using it within your own project).
The key advantage of OOP in developing maintainable systems was its ability to hide/encapsulate the implementation details. Making your mixin dependant on any class that includes it, is throwing all the years of OOP experience out the window.
I'd say a mixin should not make assumptions about a specific class it is included in, but it's totally fine to make assumptions about a common parent class (respectively its public methods).
Good example: It's fine to call params in a mixin that will be included in controllers.
Or, to be more precise according your example, I'd think something like this would totally be fine:
class Calculation
attr_accesor :operands
end
module SumOperation
def sum
self.operands.sum
end
end
class MyCustomCalculation < Calculation
include SumOperation
end
You should not hesitate, even for a second, to include instance variables in your mixin module when the situation calls for it.
Suppose, for example, you wrote:
class A
def initialize(h)
#h = h
end
def confirm_colour(colour)
#h[:colour] == colour
end
def confirm_size(size)
#h[:size] == size
end
def confirm_all(colour, size)
confirm_colour(colour) && confirm_size(size)
end
end
a = A.new(:colour=>:blue, :size=>:medium, :weight=>10)
a.confirm_all(:blue, :medium)
#=> true
a.confirm_all(:blue, :large)
#=> false
Now suppose someone asks for the weight be checked as well. We could add the method
def confirm_weight(weight)
#h[:weight] == weight
end
and change confirm_all to
def confirm_all(colour, size)
confirm_colour(colour) && confirm_size(size) && confirm_weight(size)
end
but there is a better way: put all the checks in a module.
module Checks
def confirm_colour(g)
#h[:colour] == g[:colour]
end
def confirm_size(g)
#h[:size] == g[:size]
end
def confirm_weight(g)
#h[:weight] == g[:weight]
end
end
Then include the module in A and run through all the checks.
class A
include Checks
def initialize(h)
#h = h
end
def confirm_all(g)
Checks.instance_methods.all? { |m| send(m, g) }
end
end
a = A.new(:colour=>:blue, :size=>:medium, :weight=>10)
a.confirm_all(:colour=>:blue, :size=>:medium, :weight=>10)
#=> true
a.confirm_all(:colour=>:blue, :size=>:large, :weight=>10)
#=> false
This has the advantage that when checks are to be added or removed, only the module is affected; no changes need to be made to the class. This example is admittedly contrived, but it is a short step to real-world situations.
Although it is common to make these assumptions, you may want to consider a different pattern in order to make more composable and testable code: Dependency Injection.
module Fooable
def add(one, two)
one + two
end
end
class Bar
attr_accessor :val_one, :val_two
include Fooable
def calculate
add #val_one, #val_two
end
end
Although it adds an extra layer of indirection, often times it will be worth it because of the potential to use the concern across a larger number of classes, as well as making it easier to test code.

Disambiguate Function calls in Ruby

I am working through Learn Ruby The Hard Way and came across something intriguing in exercise 49.
In parser.rb I have a function named skip(word_list, word_type) at the top level, which is used to skip through unrequited words (such as stop words) in user input. It is not encapsulated in a class or module. As per the exercise I have to write a unit test for the parser.
This is my code for the Unit Tests:
require "./lib/ex48/parser"
require "minitest/autorun"
class TestGame < Minitest::Test
def test_skip()
word_list = [['stop', 'from'], ['stop', 'the'], ['noun', 'west']]
assert_equal(skip(word_list, 'stop'), nil)
assert_equal(skip([['noun', 'bear'], ['verb', 'eat'], ['noun', 'honey']], 'noun'), nil)
end
end
However, when I run rake test TESTOPTS="-v" from the command line, these particular tests are skipped. This seems to be because there is a clash with the skip method in the Minitest module because they run perfectly after I change the name to skip_words.
Can someone please explain what is going on here exactly?
"Top level functions" are actually methods too, in particular they are private instance methods on Object (there's some funkiness around the main object but that's not important here)
However minitest's Test class also has a skip method and since the individual tests are instance methods on a subclass of Test you end up calling that skip instead.
There's not a very simple way of dealing with this - unlike some languages there is no easy way of saying that you want to call a particular superclass' implementation of something
Other than renaming your method, you'll have to pick an alternative way of calling it eg:
Object.new.send(:skip, list, type)
Object.instance_method(:skip).bind(self).call(list, type)
Of course you can wrap this in a helper method for your test or even redefine skip for this particular Test subclass (although that might lead to some head scratching the day someone tries to call minitest's skip.

What are empty-body methods used for in Ruby?

Currently reading a Ruby style guide and I came across an example:
def no_op; end
What is the purpose of empty body methods?
There are a number of reasons you might create an empty method:
Stub a method that you will fill in later.
Stub a method that a descendant class will override.
Ensure a class or object will #respond_to? a method without necessarily doing anything other than returning nil.
Undefine an inherited method's behavior while still allowing it to #respond_to? the message, as opposed to using undef foo on public methods and surprising callers.
There are possibly other reasons, too, but those are the ones that leapt to mind. Your mileage may vary.
There may be several reasons.
One case is when a class is expected to implement a specific interface (virtually speaking, given that in Ruby there are no interfaces), but in that specific class that method would not make sense. In this case, the method is left for consistency.
class Foo
def say
"foo"
end
end
class Bar
def say
"bar"
end
end
class Null
def say
end
end
In other cases, it is left as a temporary placeholder or reminder.
There are also cases where the method is left blank on purpose, as a hook for developers using that library. The method it is called somewhere at runtime, and developers using that library can override the blank method in order to execute some custom callback. This approach was used in the past by some Rails libraries.

Multiple classes of the same name in Ruby

I have an existing codebase of unit tests where the same classes are defined for each test, and a program that iterates over them. Something like this:
test_files.each do |tf|
load "tests/#{tf}/"+tf
test= ::Kernel.const_get("my_class")
Test::Unit::UI::Console::TestRunner.run( test )
While working on these tests, I've realized that ruby allows you to require classes with the same name from different files, and it merges the methods of the two. This leads to problems as soon as the class hierarchy is not the same: superclass mismatch for class ...
Unfortunately, simply unloading the class is not enough, as there are many other classes in the hierarchy that remain loaded.
Is there a way to execute each test in a different global namespace?
While I figure that using modules would be the way to go, I'm not thrilled with idea of changing the hundreds of existing files by hand.
--EDIT--
Following Cary Swoveland's suggestion, I've moved the test running code to a separate .rb file and am running it using backticks. While this seems to work well, loading the ruby interpreter and requireing all the libraries again has a considerable overhead. Also, cross-platform compatibility requires some extra work.
first I wanted to propose the following solution:
#main.rb
modules = []
Dir.foreach('include') do |file|
if file =~ /\w/
new_module = Module.new
load "include/#{file}"
new_module.const_set("StandardClass", StandardClass)
Object.send(:remove_const, :StandardClass)
modules << new_module
end
end
modules.each do |my_module|
my_module::StandardClass.standard_method
end
#include/first.rb
class StandardClass
def self.standard_method
puts "first method"
end
end
#include/second.rb
class StandardClass
def self.standard_method
puts "second method"
end
end
this wraps each class with it's own module and calls the class inside it's module:
$ruby main.rb
second method
first method
But as you answered there are references to other classes so modules can break them.
I checked this by calling the third class from one of the wrapped classes and got:
include/second.rb:5:in `standard_method': uninitialized constant StandardClass::Independent (NameError)
so probably using backticks is better solution but I hope my answer will be helpful for some other similar situations.

Resources