instance_variable_set in constructor - ruby

I've made a constructor like this:
class Foo
  def initialize(p1, p2, opts={})
    # ... initialize p1 and p2
    opts.each do |k, v|
      instance_variable_set("@#{k}", v)
    end
  end
end
I'm wondering whether it's good practice to dynamically set instance variables like this, or whether I'd be better off setting them manually, one by one, as most libraries do, and why.

Diagnosing the problem
What you're doing here is a fairly simple example of metaprogramming, i.e. dynamically generating code based on some input. Metaprogramming often reduces the amount of code you need to write, but makes the code harder to understand.
In this particular case, it also introduces some coupling concerns: the public interface of the class is directly related to the internal state in a way that makes it hard to change one without changing the other.
Refactoring the example
Consider a slightly longer example, where we make use of one of the instance variables:
class Foo
  def initialize(opts={})
    opts.each do |k, v|
      instance_variable_set("@#{k}", v)
    end
  end

  def greet(name)
    greeting = @greeting || "Hello"
    puts "#{greeting}, #{name}"
  end
end
Foo.new(greeting: "Hi").greet("World")
In this case, if someone wanted to rename the @greeting instance variable to something else, they'd possibly have a hard time understanding how to do that. It's clear that @greeting is used by the greet method, but searching the code for @greeting wouldn't help them find where it was first set. Even worse, to change this bit of internal state they'd also have to change any calls to Foo.new, because the approach we've taken ties the internal state to the public interface.
Remove the metaprogramming
Let's look at an alternative, where we just store all of the opts and treat them as state:
class Foo
  def initialize(opts={})
    @opts = opts
  end

  def greet(name)
    greeting = @opts.fetch(:greeting, "Hello")
    puts "#{greeting}, #{name}"
  end
end
Foo.new(greeting: "Hi").greet("World")
By removing the metaprogramming, this clarifies the situation slightly. A new team member who's looking to change this code for the first time is going to have a slightly easier time of things, because they can use editor features (like find-and-replace) to rename the internal ivars, and the relationship between the arguments passed to the initialiser and the internal state is a bit more explicit.
Reduce the coupling
We can go even further, and decouple the internals from the interface:
class Foo
  def initialize(opts={})
    @greeting = opts.fetch(:greeting, "Hello")
  end

  def greet(name)
    puts "#{@greeting}, #{name}"
  end
end
Foo.new(greeting: "Hi").greet("World")
In my opinion, this is the best implementation we've looked at:
There's no metaprogramming, which means we can find explicit references to variables being set and used, e.g. with an editor's search features, grep, git log -S, etc.
We can change the internals of the class without changing the interface, and vice-versa.
By calling opts.fetch in the initialiser, we're making it clear to future readers of our class what the opts argument should look like, without making them read the whole class.
When to use metaprogramming
Metaprogramming can sometimes be useful, but those situations are rare. As a rough guide, I'd be more likely to use metaprogramming in framework or library code which typically needs to be more generic (e.g. the ActiveModel::AttributeAssignment module in Rails), and to avoid it in application code, which is typically more specific to a particular problem or domain.
Even in library code, I'd prefer the clarity of a few lines of repetition.
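To make the library/application contrast concrete, here is a minimal sketch of what a reusable, ActiveModel-style assignment mixin might look like (this is not ActiveModel's actual implementation; the module name and the Widget class are purely illustrative):
module AttributeAssignment
  # Assign each option by calling the matching writer, so only
  # attributes the class has declared can be set from outside.
  def assign_attributes(attrs)
    attrs.each do |name, value|
      public_send("#{name}=", value)
    end
  end
end

class Widget
  include AttributeAssignment
  attr_accessor :colour, :size

  def initialize(opts = {})
    assign_attributes(opts)
  end
end

Widget.new(colour: :blue, size: :large)
The metaprogramming is confined to one generic module, while each class still declares its attributes explicitly.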

Answers to this question are always going to be based on someone's personal opinion, so here's mine.
Clarity vs Brevity
If you cannot know the set of options ahead of time, then you have no real choice but to do as you have. However, if the options are drawn from a known set, then I would favour clarity over brevity and have explicit methods to set the options. These would also be a good place to add any RDoc etc.
Safety
From a safety perspective, having methods to handle the setting of an option would allow you to perform validation as required.
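For example, an explicit setter gives you an obvious place for both documentation and validation (the timeout option here is made up purely for illustration):
class Foo
  def initialize(p1, p2, opts = {})
    # ... initialize p1 and p2 ...
    self.timeout = opts[:timeout] if opts.key?(:timeout)
  end

  # Timeout in seconds; must be a positive number.
  def timeout=(value)
    raise ArgumentError, "timeout must be a positive number" unless value.is_a?(Numeric) && value.positive?
    @timeout = value
  end
end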

When you need to do something like this, it's usually because the set of parameters varies. For that case there is already a handy structure in Ruby (as in most modern languages): the array and the hash. Here, you should just save the entire set of options as a single hash; that would keep things simpler.

Instead of creating instance variables dynamically, you could use attr_accessor to declare the available instance variables and just call the setters dynamically:
class Foo
  attr_accessor :bar, :baz, :qux

  def initialize(opts = {})
    opts.each do |k, v|
      public_send("#{k}=", v)
    end
  end
end
Foo.new(bar: 1, baz: 2) #=> #<Foo:0x007fa8250a31e0 @bar=1, @baz=2>
Foo.new(qux: 3) #=> #<Foo:0x007facbc06ed50 @qux=3>
This approach also shows an error if an unknown option is passed:
Foo.new(quux: 4) #=> undefined method `quux=' for #<Foo:0x007fd71483aa20> (NoMethodError)
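If you would rather fail with a clearer message than NoMethodError, one possible variation (just a sketch) is to check the keys against the declared accessors up front:
class Foo
  VALID_OPTIONS = [:bar, :baz, :qux].freeze
  attr_accessor(*VALID_OPTIONS)

  def initialize(opts = {})
    unknown = opts.keys - VALID_OPTIONS
    raise ArgumentError, "unknown options: #{unknown.join(', ')}" if unknown.any?
    opts.each { |k, v| public_send("#{k}=", v) }
  end
end

Foo.new(quux: 4) #=> ArgumentError: unknown options: quux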

Related

Ruby: understanding data structure

Most FactoryBot factories look like this:
FactoryBot.define do
  factory :product do
    association :shop
    title { 'Green t-shirt' }
    price { 10.10 }
  end
end
It seems that inside the :product block we are building a data structure, but it's not a typical hashmap: the "keys" are not declared as symbols and commas aren't used.
So my question is: what kind of data structure is this, and how does it work?
How does declaring association inside the block not trigger a
NameError: undefined local variable or method `association'
when it would in many other situations? Is there a subject in compsci related to this?
The block is not a data structure, it's code. association and friends are all method calls, probably being intercepted by method_missing. Here's an example using that same technique to build a regular hash:
class BlockHash < Hash
  def method_missing(key, value=nil)
    if value.nil?
      return self[key]
    else
      self[key] = value
    end
  end

  def initialize(&block)
    self.instance_eval(&block)
  end
end
With which you can do this:
h = BlockHash.new do
  foo 'bar'
  baz :zoo
end
h
#=> {:foo=>"bar", :baz=>:zoo}
h.foo
#=> "bar"
h.baz
#=> :zoo
I have not worked with FactoryBot, so I'm going to make some assumptions based on other libraries I've worked with. Mileage may vary.
The basics:
FactoryBot is a class (Obviously)
define is a static method in FactoryBot (I'm going to assume I still haven't lost you ;) ).
define takes a block, which is pretty standard stuff in Ruby.
But here's where things get interesting.
Typically, when a block is executed, it has a closure relative to where it was declared. This can be changed in most languages, but Ruby makes it super easy: instance_eval(&block) will do the trick. That means you can have access to methods in the block that weren't available outside the block.
factory on line 2 is just such a method. You didn't declare it, but the block it's running in isn't being executed with a standard scope. Instead, your block is being immediately passed to FactoryBot, which passes it to an inner class named DSL, which instance_evals the block so its own factory method will be run.
Lines 3-5 don't work that way, since you can have an arbitrary name there.
Ruby has several ways to handle missing methods, but the most straightforward is method_missing. method_missing is an overridable hook that any class can define, telling Ruby what to do when somebody calls a method that doesn't exist.
Here it's checking to see if it can parse the name as an attribute name and use the parameters or block to define an attribute or declare an association. It sounds more complicated than it is. Typically in this situation I would use define_method, define_singleton_method, instance_variable_set etc... to dynamically create and control the underlying classes.
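As a toy illustration of those pieces working together (this is nothing like FactoryBot's real internals, just the same techniques), a definition block can be captured with instance_eval and method_missing and then turned into instance variables on a built object:
class ToyFactory
  def initialize(&definition)
    @attributes = {}
    instance_eval(&definition)  # run the block in the factory's own context
  end

  # Any bare "name { value }" call inside the block lands here.
  def method_missing(name, *args, &block)
    @attributes[name] = block ? block.call : args.first
  end

  def respond_to_missing?(name, include_private = false)
    true
  end

  # Build a plain object carrying the collected attributes as ivars.
  def build
    obj = Object.new
    @attributes.each { |k, v| obj.instance_variable_set("@#{k}", v) }
    obj
  end
end

product = ToyFactory.new do
  title { 'Green t-shirt' }
  price { 10.10 }
end.build

product.instance_variable_get(:@title) #=> "Green t-shirt"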
I hope that helps. You don't need to know this to use the library; the developers made a domain-specific language precisely so people wouldn't have to think about this stuff. But stay curious and keep growing.

How can I determine what objects a call to ruby require added to the global namespace?

Suppose I have a file example.rb like so:
# example.rb
class Example
  def foo
    5
  end
end
that I load with require or require_relative. If I didn't know that example.rb defined Example, is there a list (other than ObjectSpace) that I could inspect to find any objects that had been defined? I've tried checking global_variables but that doesn't seem to work.
Thanks!
Although Ruby offers a lot of reflection methods, it doesn't really give you a top-level view that can identify what, if anything, has changed. It's only if you have a specific target that you can dig deeper.
For example:
def tree(root, seen = { })
  seen[root] = true
  root.constants.map do |name|
    root.const_get(name)
  end.reject do |object|
    seen[object] or !object.is_a?(Module)
  end.map do |object|
    seen[object] = true
    puts object
    [ object.to_s, tree(object, seen) ]
  end.to_h
end
p tree(Object)
Now, if anything changes in that tree structure, you have new things. Writing a diff method for this is possible, using seen as a trigger.
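For example, one crude way to diff two snapshots (a sketch; it assumes the library you require isn't already loaded) is to flatten the tree into a list of names and subtract:
def tree_names(t)
  t.flat_map { |name, children| [name] + tree_names(children) }
end

before = tree_names(tree(Object))
require "json"
after = tree_names(tree(Object))
p after - before  # names that appeared after the require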
The problem is that evaluating Ruby code may not necessarily create all the classes that it will or could create. Ruby allows extensive modification to any and all classes, and it's common that at run-time it will create more, or replace and remove others. Only libraries that forcibly declare all of their modules and classes up front will work with this technique, and I'd argue that's a small portion of them.
It depends on what you mean by "the global namespace". Ruby doesn't really have a "global" namespace (except for global variables). It has a sort-of "root" namespace, namely the Object class. (Although note that Object may have a superclass and mixes in Kernel, and stuff can be inherited from there.)
"Global" constants are just constants of Object. "Global functions" are just private instance methods of Object.
So, you can get reasonably close by examining global_variables, Object.constants, and Object.instance_methods before and after the call to require/require_relative.
Note, however, that depending on your definition of "global namespace", (private) singleton methods of main might also count, so you may want to check for those as well.
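A minimal sketch of that before/after comparison, using the example.rb from the question:
before_constants = Object.constants
before_methods   = Object.instance_methods + Object.private_instance_methods
before_globals   = global_variables

require_relative "example"

p Object.constants - before_constants  #=> [:Example]
p((Object.instance_methods + Object.private_instance_methods) - before_methods)
p global_variables - before_globals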
Of course, any of the methods the script added could, when called at a later time, themselves add additional things to the global scope. For example, the following script adds nothing to the scope, but calling the method will:
class String
  module MyNonGlobalModule
    def self.my_non_global_method
      Object.const_set(:MY_GLOBAL_CONSTANT, 'Haha, gotcha!')
    end
  end
end
Strictly speaking, however, you asked about adding "objects" to the global namespace, and neither constants nor methods nor variables are objects, soooooo … the answer is always "none"?

Should mixins make assumptions about their including class?

I found examples of a mixin that makes assumptions about what instance variables an including class has. Something like this:
module Fooable
  def calculate
    @val_one + @val_two
  end
end

class Bar
  attr_accessor :val_one, :val_two
  include Fooable
end
I found arguments for and against whether it's a good practice. The obvious alternative is passing val_one and val_two as parameters, but that doesn't seem as common, and having more heavily parameterized methods could be a downside.
Is there conventional wisdom regarding a mixin's dependence on class state? What are the advantages/disadvantages of reading values from instance variables vs. passing them in as parameters? Alternatively, does the answer change if you start modifying instance variables instead of just reading them?
It is not a problem at all to assume in a module some properties about the class that includes/prepends it. That is done all the time. In fact, the Enumerable module assumes that a class that includes/prepends it has an each method, and many of its methods depend on that. Likewise, the Comparable module assumes that the including/prepending class has <=>. I cannot immediately come up with a well-known example involving an instance variable, but there is no crucial difference between methods and instance variables on this point; the same can be said about instance variables.
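In the same spirit as Enumerable, a module can simply document what it expects the including class to provide; a small sketch (the names here are made up):
module Summable
  # Assumes the including class provides #each, yielding numeric values.
  def sum_of_values
    total = 0
    each { |value| total += value }
    total
  end
end

class Basket
  include Summable

  def initialize(prices)
    @prices = prices
  end

  def each(&block)
    @prices.each(&block)
  end
end

Basket.new([1, 2, 3]).sum_of_values #=> 6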
The disadvantage of passing arguments instead of using instance variables is that your method calls become more verbose and less flexible.
Rule of thumb: Mixins should never make any assumptions about the classes/modules they may be included in. However, as it usually goes, any rule has exceptions.
But first, let's talk about the first part: specifically, accessing (depending on) the including class's instance variables. If your mixin depends on anything within the including class, you cannot change that "anything" in that class with any guarantee that nothing will break. You will also have to document the mixin's dependency not only in the documentation for the mixin, but also in the documentation of every class/module that includes it, because down the road the requirements may change, or someone may see an opportunity to refactor that class/module. That person will not necessarily dig through the class's documentation, or know that this specific class/module has a section in your mixin's documentation.
Anyhow, by depending on the including class's internals, your mixin has not only made itself a dependant, it has also made a dependant of every class/module that includes it, which is definitely not a good thing. Because you cannot control who or which class/module includes your mixin, you will never have the confidence to introduce a change, and not being able to change code without fear of breaking something is a project drainer!
The "workaround" may be to cover it with tests. But consider yourself, or someone else, maintaining that code in two years. Will you remember to cover every new class that includes the mixin, to make sure it complies with all of the mixin's dependency requirements? I am sure you, or the new maintainer, will not.
So, from a maintenance standpoint and by basic OOP principles, your mixin must not depend on the class/module that includes it.
Now, let's talk about the "there is always an exception to the rule" bit.
You can make an exception, provided that the mixin's dependency does not introduce "surprises" into your code. So it is OK if the mixin's dependencies are well known among your team, or are a convention. Another case could be when the mixin is used internally and you control who uses it (basically, when you are using it within your own project).
The key advantage of OOP in developing maintainable systems is its ability to hide/encapsulate implementation details. Making your mixin dependent on any class that includes it throws all those years of OOP experience out the window.
I'd say a mixin should not make assumptions about a specific class it is included in, but it's totally fine to make assumptions about a common parent class (or rather, its public methods).
Good example: It's fine to call params in a mixin that will be included in controllers.
Or, to be more precise with respect to your example, I'd think something like this would be totally fine:
class Calculation
  attr_accessor :operands
end
module SumOperation
  def sum
    self.operands.sum
  end
end

class MyCustomCalculation < Calculation
  include SumOperation
end
You should not hesitate, even for a second, to include instance variables in your mixin module when the situation calls for it.
Suppose, for example, you wrote:
class A
  def initialize(h)
    @h = h
  end

  def confirm_colour(colour)
    @h[:colour] == colour
  end

  def confirm_size(size)
    @h[:size] == size
  end

  def confirm_all(colour, size)
    confirm_colour(colour) && confirm_size(size)
  end
end
a = A.new(:colour=>:blue, :size=>:medium, :weight=>10)
a.confirm_all(:blue, :medium)
#=> true
a.confirm_all(:blue, :large)
#=> false
Now suppose someone asks for the weight to be checked as well. We could add the method
def confirm_weight(weight)
  @h[:weight] == weight
end
and change confirm_all to
def confirm_all(colour, size, weight)
  confirm_colour(colour) && confirm_size(size) && confirm_weight(weight)
end
but there is a better way: put all the checks in a module.
module Checks
  def confirm_colour(g)
    @h[:colour] == g[:colour]
  end

  def confirm_size(g)
    @h[:size] == g[:size]
  end

  def confirm_weight(g)
    @h[:weight] == g[:weight]
  end
end
Then include the module in A and run through all the checks.
class A
  include Checks

  def initialize(h)
    @h = h
  end

  def confirm_all(g)
    Checks.instance_methods.all? { |m| send(m, g) }
  end
end
a = A.new(:colour=>:blue, :size=>:medium, :weight=>10)
a.confirm_all(:colour=>:blue, :size=>:medium, :weight=>10)
#=> true
a.confirm_all(:colour=>:blue, :size=>:large, :weight=>10)
#=> false
This has the advantage that when checks are to be added or removed, only the module is affected; no changes need to be made to the class. This example is admittedly contrived, but it is a short step to real-world situations.
Although it is common to make these assumptions, you may want to consider a different pattern in order to make more composable and testable code: Dependency Injection.
module Fooable
  def add(one, two)
    one + two
  end
end

class Bar
  attr_accessor :val_one, :val_two
  include Fooable

  def calculate
    add(@val_one, @val_two)
  end
end
Although it adds an extra layer of indirection, it is often worth it because the concern can be reused across a larger number of classes, and the code becomes easier to test.
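For example, because add no longer touches any instance state, the mixin can be exercised against a bare host class in a test (an illustrative Minitest sketch):
require "minitest/autorun"

class FooableTest < Minitest::Test
  # A throwaway host is enough, since Fooable#add has no dependencies.
  class Host
    include Fooable
  end

  def test_add
    assert_equal 3, Host.new.add(1, 2)
  end
end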

In how many ways can methods be added to a ruby object?

When it comes to run-time introspection and dynamic code generation, I don't think Ruby has any rivals, except possibly some Lisp dialects. The other day I was doing a code exercise to explore Ruby's dynamic facilities, and I started to wonder about ways of adding methods to existing objects. Here are three ways I could think of:
obj = Object.new

# add a method directly
def obj.new_method
  # ...
end

# add a method indirectly with the singleton class
class << obj
  def new_method
    # ...
  end
end

# add a method by opening up the class
obj.class.class_eval do
  def new_method
    # ...
  end
end
This is just the tip of the iceberg because I still haven't explored various combinations of instance_eval, module_eval and define_method. Is there an online/offline resource where I can find out more about such dynamic tricks?
Ruby Metaprogramming seems to be a good resource. (And, linked from there, The Book of Ruby.)
If obj has a superclass, you can add methods to obj from the superclass using define_method (API) as you mentioned. If you ever look at the Rails source code, you'll notice that they do this quite a bit.
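For completeness, a few more variants in the same vein, using the reflective APIs directly (the method names are just examples):
obj = Object.new

# define_singleton_method adds a method to just this object
obj.define_singleton_method(:shout) { "hello".upcase }

# define_method on the singleton class has the same effect
obj.singleton_class.send(:define_method, :whisper) { "hello..." }

# define_method inside class_eval adds it to every instance of the class
obj.class.class_eval do
  define_method(:wave) { "o/" }
end

obj.shout   #=> "HELLO"
obj.whisper #=> "hello..."
obj.wave    #=> "o/"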
Also while this isn't exactly what you're asking for, you can easily give the impression of creating an almost infinite number of methods dynamically by using method_missing:
def method_missing(name, *args)
  string_name = name.to_s
  return super unless string_name =~ /^expected_\w+/
  # otherwise do something as if you have a method called expected_name
end
Adding that to your class will allow it to respond to any method call which looks like
@instance.expected_something
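One caveat worth noting: when you lean on method_missing like this, it is good practice to also override respond_to_missing? so that respond_to? agrees with the ghost methods:
def respond_to_missing?(name, include_private = false)
  name.to_s.start_with?("expected_") || super
end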
I like the book Metaprogramming Ruby which is published by the publishers of the pickaxe book.

How do you modify the scope passed to a block?

I've been working with Ruby for a little under a year, and I still don't fully understand "what makes blocks tick". In particular I'm curious how much control one has over the scope of a block. For example, say I have this code:
class Blob
  attr_accessor :viscosity

  def configure(&block)
    block.call self
  end
end

blob = Blob.new
blob.configure do |b|
  b.viscosity = 0.5
end
A bit of a contrived example there, obviously.
Now, one thing I've noticed while migrating from Rails 2 to Rails 3 is that a lot of their configuration methods that take blocks no longer yield an argument to the block.
For example, in routes.rb, it used to be ActionController::Routing::Routes.draw do |map| ... end, and now it's just ActionController::Routing::Routes.draw do ... end. But the methods that are called inside the block still have the appropriate context, without the need to repeat the name of the block's argument over and over.
In my example above, then, I want to be able to do:
blob.configure do
  viscosity 0.5
end
so that I can tell people how easy it is to write a DSL in Ruby. :)
This uses instance_eval to do the magic. See http://apidock.com/ruby/Object/instance_eval/ for some documentation. instance_eval evaluates a block (or a string) in the context of its receiver.
def configure(&block)
  self.instance_eval(&block)
end
You'd still have to use the accessor method viscosity= in your example block or you'd have to define
def viscosity(value)
  @viscosity = value
end
in your class.
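Putting those pieces together, a sketch of the Blob example with an instance_eval-based configure and a viscosity method that doubles as reader and writer:
class Blob
  def configure(&block)
    instance_eval(&block)
  end

  # Acts as a writer when given a value and a reader otherwise,
  # so it reads naturally inside the configure block.
  def viscosity(value = nil)
    if value.nil?
      @viscosity
    else
      @viscosity = value
    end
  end
end

blob = Blob.new
blob.configure do
  viscosity 0.5
end
blob.viscosity #=> 0.5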

Resources