Can someone please provide some insight as to when to use delegation via DelegateClass (e.g. Seller < DelegateClass(Person)) and when to use class inheritance (e.g. Seller < Person) in ruby?
class Seller < DelegateClass(Person)
def sales
...
end
end
class Seller < Person
def sales
...
end
end
When I was looking over the Ruby on Rails source on Github I found quite a few uses of DelegateClass.
There are a couple of differences that can help provide insight as to which approach to use.
1) You can safely delegate to primitives (e.g. String), but cannot always safely inherit from them
If you're building on top of Hash or String or Fixnum, you're safer using DelegateClass (or another delegator). For more on why, Steve Klabnik's cautioning is a good place to start).
2) DelegateClass makes it easy to “convert” a more general object into a more specific one
This makes it easier to accept an instance of a general object and make it behave in a way that's specific to your implementation:
class Message < DelegateClass(String)
def print
upcase
end
end
# […]
def log(message)
message = Message.new(message) unless message.is_a?(Message)
end
3) A gotcha: DelegateClass subclasses expect an instance of the delegated class as an argument to new
This can make it tricky to “subclass” classes that you're handing to library code. For example, this is a fairly common practice that won't work out of the box with DelegateClass:
class MyLogger < DelegateClass(ActiveSupport::Logger); end
Foo::ThirdParty::Library.configure do |c|
c.logger = MyLogger # no good
end
This doesn't work because our library expects to behave like most loggers and instantiate without arguments. This can be addressed by defining initialize and creating an instance of ActiveSupport::Logger, but probably not the right solution in this case.
delegates model different behaviors of Person based on the context. e.g. the same person could be a seller in one context or a buyer in a different context. Inheritance is more rigid: a Bear and Tiger inherit from Animal, but an instance of Animal would never need to sometimes behave like a Bear and sometimes behave like a Tiger. An instance of a descendent of Animal is either one or the other.
Related
Some open source code I'm integrating in my application has some classes that include code to that effect:
class SomeClass < SomeParentClass
def self.new(options = {})
super().tap { |o|
# do something with `o` according to `options`
}
end
def initialize(options = {})
# initialize some data according to `options`
end
end
As far as I understand, both self.new and initialize do the same thing - the latter one "during construction" and the former one "after construction", and it looks to me like a horrible pattern to use - why split up the object initialization into two parts where one is obviously "The Wrong Think(tm)"?
Ideally, I'd like to see what is inside the super().tap { |o| block, because although this looks like bad practice, just maybe there is some interaction required before or after initialize is called.
Without context, it is possible that you are just looking at something that works but is not considered good practice in Ruby.
However, maybe the approach of separate self.new and initialize methods allows the framework designer to implement a subclass-able part of the framework and still ensure setup required for the framework is completed without slightly awkward documentation that requires a specific use of super(). It would be a slightly easier to document and cleaner-looking API if the end user gets functionality they expect with just the subclass class MyClass < FrameworkClass and without some additional note like:
When you implement the subclass initialize, remember to put super at the start, otherwise the magic won't work
. . . personally I'd find that design questionable, but I think there would at least be a clear motivation.
There might be deeper Ruby language reasons to have code run in a custom self.new block - for instance it may allow constructor to switch or alter the specific object (even returning an object of a different class) before returning it. However, I have very rarely seen such things done in practice, there is nearly always some other way of achieving the goals of such code without customising new.
Examples of custom/different Class.new methods raised in the comments:
Struct.new which can optionally take a class name and return objects of that dynamically created class.
In-table inheritance for ActiveRecord, which allows end user to load an object of unknown class from a table and receive the right object.
The latter one could possibly be avoided with a different ORM design for inheritance (although all such schemes have pros/cons).
The first one (Structs) is core to the language, so has to work like that now (although the designers could have chosen a different method name).
It's impossible to tell why that code is there without seeing the rest of the code.
However, there is something in your question I want to address:
As far as I understand, both self.new and initialize do the same thing - the latter one "during construction" and the former one "after construction"
They do not do the same thing.
Object construction in Ruby is performed in two steps: Class#allocate allocates a new empty object from the object space and sets its internal class pointer to self. Then, you initialize the empty object with some default values. Customarily, this initialization is performed by a method called initialize, but that is just a convention; the method can be called anything you like.
There is an additional helper method called Class#new which does nothing but perform the two steps in sequence, for the programmer's convenience:
class Class
def new(*args, &block)
obj = allocate
obj.send(:initialize, *args, &block)
obj
end
def allocate
obj = __MagicVM__.__allocate_an_empty_object_from_the_object_space__
obj.__set_internal_class_pointer__(self)
obj
end
end
class BasicObject
private def initialize(*) end
end
The constructor new has to be a class method since you start from where there is no instance; you can't be calling that method on a particular instance. On the other hand, an initialization routine initialize is better defined as an instance method because you want to do something specifically with a certain instance. Hence, Ruby is designed to internally call the instance method initialize on a new instance right after its creation by the class method new.
I have a class QuestionList that stores a list of 'Question' objects.
class QuestionList
attr_accessor :questions
def initialize
#questions = []
end
end
I then add questions to the list and then "ask" these questions from my main class as so:
list = QuestionList.new
list.questions << Question.new(5, 3)
list.questions << Question.new(1, 5)
list.questions << Question.new(2, 4)
list.questions.each do |question|
puts "#{question.ask}"
end
where Question.ask simply outputs the question as a string.
I'm not sure how acceptable it is to be writing to an instance variable from my main class using the << operator, and list.questions.push(Question.new(5, 3)) is even more unclear from the main class.
Would it be better to have a QuestionsList.add_question(question) method?
The same goes for list.questions.each - is this acceptable to be used in the main class?
I think your use of attr_accessor here is fine, though depending on how much functionality you continue to add, it will likely be clearer to restrict functionality of that class to the class itself.
Regarding your question on using a method within QuestionList, this would come down to readability. Something to note first, however: you used QuestionsList.add_question(question) as an example. This would create a class method. What you really want here is an instance method, which would read as list.add_question(question), since you already have an instance of the list created. This blog post has some good information on the difference between class and instance methods.
I personally find instance methods would most clearly communicate your intent. I would write out QuestionList as follows:
class QuestionList
def initialize
#questions = []
end
def add_question(question)
#questions << question
end
def print_all_questions_in_list
#questions.each do |question|
puts "#{question.ask}"
end
end
end
This SO post has some excellent information on Ruby's attr methods, if you'd like additional info there.
The #questions array is a private internal implementation detail of your class. It should never ever get exposed to clients. There are many reasons for this, two examples are:
You overpromise an interface: now, all your clients depend on it being an array. What if you later want to change it to something else? A textfile? A database? A webservice?
You expose operations that break your object invariants: a client can, for example, add an integer to that array. Or nil. Or anything else that is not a Question.
How you store your questions should be an implementation detail. The QuestionList should have methods to manage the list of questions. It should probably have an each method (and include Enumerable), and an add method (possibly aliased to <<). Maybe also an [], if that makes sense. These methods could simply be delegated to the array, if convenient, but the point is: if you later decide to not use an array, you can do that without anyone even noticing. You can decide to only support those methods you actually want to support, as opposed to all ~100 methods of Array.
Suppose you write a class Sup and I decide to extend it to Sub < Sup. Not only do I need to understand your published interface, but I also need to understand your private fields. Witness this failure:
class Sup
def initialize
#privateField = "from sup"
end
def getX
return #privateField
end
end
class Sub < Sup
def initialize
super()
#privateField = "i really hope Sup does not use this field"
end
end
obj = Sub.new
print obj.getX # prints "i really hope Sup does not use this field"
The question is, what is the right way to tackle this problem? It seems a subclass should be able to use whatever fields it wants without messing up the superclass.
EDIT: The equivalent example in Java returns "from Sup", which is the answer this should produce as well.
Instance variables have nothing to do with inheritance, they are created on first usage, not by some defining mechanism, therefore there is no special access control for them in language and they can not be shadowed.
Not only do I need to understand your
published interface, but I also need
to understand your private fields.
Actually this is an "official" position. Excerpt from "The Ruby Programming Language" book (where Matz is one of the authors):
... this is another reason why it is only safe to extend Ruby
classes when you are familiar with
(and in control of) the implementation
of the superclass.
If you don't know it inside and out you're on your own. Sad but true.
Don't subclass it!
Use composition instead of inheritance.
Edit: Rather than MyObject subclassing ExistingObject, see if my_object having an instance variable referring to existing_object would be more appropriate.
Instance variables belong to instances (ie objects). They're not determined by the classes themselves.
unlike java/C#, in ruby private variables are always visible to the inheriting classes. There is no way to hide the private variables.
Ruby and Java don't treat 'private' property the same way. In Ruby if you mark something as private it only means that it can't be called with receiver, i.e.:
class Sub
private
def foo; end
end
sub.foo => error accessing private method with caller
but you can always access it if you change who is self like:
sub.instance_eval { foo } #instance_eval changes self to receiver, 'sub' in this example
Conclusion: Don't rely that you can hide or protect something from outer space! Or with great power comes great responsibility!
EDIT:
Yes, I know question was for fields but it's the same thing. You can always do:
sub.instance_eval { #my_private_field = 'something else' }
puts sub.instance_eval { #my_private_field }
I am currently working through the Gregory Brown Ruby Best Practices book. Early on, he is talking about refactoring some functionality from helper methods on a related class, to some methods on module, then had the module extend self.
Hadn't seen that before, after a quick google, found out that extend self on a module lets methods defined on the module see each other, which makes sense.
Now, my question is when would you do something like this
module StyleParser
extend self
def process(text)
...
end
def style_tag?(text)
...
end
end
and then refer to it in tests with
#parser = Prawn::Document::Text::StyleParser
as opposed to something like this?
class StyleParser
def self.process(text)
...
end
def self.style_tag?(text)
...
end
end
is it so that you can use it as a mixin? or are there other reasons I'm not seeing?
A class should be used for functionality that will require instantiation or that needs to keep track of state. A module can be used either as a way to mix functionality into multiple classes, or as a way to provide one-off features that don't need to be instantiated or to keep track of state. A class method could also be used for the latter.
With that in mind, I think the distinction lies in whether or not you really need a class. A class method seems more appropriate when you have an existing class that needs some singleton functionality. If what you're making consists only of singleton methods, it makes more sense to implement it as a module and access it through the module directly.
In this particular case I would probably user neither a class nor a module.
A class is a factory for objects (note the plural). If you don't want to create multiple instances of the class, there is no need for it to exist.
A module is a container for methods, shared among multiple objects. If you don't mix in the module into multiple objects, there is no need for it to exist.
In this case, it looks like you just want an object. So use one:
def (StyleParser = Object.new).process(text)
...
end
def StyleParser.style_tag?(text)
...
end
Or alternatively:
class << (StyleParser = Object.new)
def process(text)
...
end
def style_tag?(text)
...
end
end
But as #Azeem already wrote: for a proper decision, you need more context. I am not familiar enough with the internals of Prawn to know why Gregory made that particular decision.
If it's something you want to instantiate, use a class. The rest of your question needs more context to make sense.
In Ruby, since you can include multiple mixins but only extend one class, it seems like mixins would be preferred over inheritance.
My question: if you're writing code which must be extended/included to be useful, why would you ever make it a class? Or put another way, why wouldn't you always make it a module?
I can only think of one reason why you'd want a class, and that is if you need to instantiate the class. In the case of ActiveRecord::Base, however, you never instantiate it directly. So shouldn't it have been a module instead?
I just read about this topic in The Well-Grounded Rubyist (great book, by the way). The author does a better job of explaining than I would so I'll quote him:
No single rule or formula always results in the right design. But it’s useful to keep a
couple of considerations in mind when you’re making class-versus-module decisions:
Modules don’t have instances. It follows that entities or things are generally best
modeled in classes, and characteristics or properties of entities or things are
best encapsulated in modules. Correspondingly, as noted in section 4.1.1, class
names tend to be nouns, whereas module names are often adjectives (Stack
versus Stacklike).
A class can have only one superclass, but it can mix in as many modules as it wants. If
you’re using inheritance, give priority to creating a sensible superclass/subclass
relationship. Don’t use up a class’s one and only superclass relationship to
endow the class with what might turn out to be just one of several sets of characteristics.
Summing up these rules in one example, here is what you should not do:
module Vehicle
...
class SelfPropelling
...
class Truck < SelfPropelling
include Vehicle
...
Rather, you should do this:
module SelfPropelling
...
class Vehicle
include SelfPropelling
...
class Truck < Vehicle
...
The second version models the entities and properties much more neatly. Truck
descends from Vehicle (which makes sense), whereas SelfPropelling is a characteristic of vehicles (at least, all those we care about in this model of the world)—a characteristic that is passed on to trucks by virtue of Truck being a descendant, or specialized
form, of Vehicle.
I think mixins are a great idea, but there's another problem here that nobody has mentioned: namespace collisions. Consider:
module A
HELLO = "hi"
def sayhi
puts HELLO
end
end
module B
HELLO = "you stink"
def sayhi
puts HELLO
end
end
class C
include A
include B
end
c = C.new
c.sayhi
Which one wins? In Ruby, it turns out the be the latter, module B, because you included it after module A. Now, it's easy to avoid this problem: make sure all of module A and module B's constants and methods are in unlikely namespaces. The problem is that the compiler doesn't warn you at all when collisions happen.
I argue that this behavior does not scale to large teams of programmers-- you shouldn't assume that the person implementing class C knows about every name in scope. Ruby will even let you override a constant or method of a different type. I'm not sure that could ever be considered correct behavior.
My take: Modules are for sharing behavior, while classes are for modeling relationships between objects. You technically could just make everything an instance of Object and mix in whatever modules you want to get the desired set of behaviors, but that would be a poor, haphazard and rather unreadable design.
The answer to your question is largely contextual. Distilling pubb's observation, the choice is primarily driven by the domain under consideration.
And yes, ActiveRecord should have been included rather than extended by a subclass. Another ORM - datamapper - precisely achieves that!
I like Andy Gaskell's answer very much - just wanted to add that yes, ActiveRecord should not use inheritance, but rather include a module to add the behavior (mostly persistence) to a model/class. ActiveRecord is simply using the wrong paradigm.
For the same reason, I very much like MongoId over MongoMapper, because it leaves the developer the chance to use inheritance as a way of modelling something meaningful in the problem domain.
It's sad that pretty much nobody in the Rails community is using "Ruby inheritance" the way it's supposed to be used - to define class hierarchies, not just to add behavior.
The best way I understand mixins are as virtual classes. Mixins are "virtual classes" that have been injected in a class's or module's ancestor chain.
When we use "include" and pass it a module, it adds the module to the ancestor chain right before the class that we are inheriting from:
class Parent
end
module M
end
class Child < Parent
include M
end
Child.ancestors
=> [Child, M, Parent, Object ...
Every object in Ruby also has a singleton class. Methods added to this singleton class can be directly called on the object and so they act as "class" methods. When we use "extend" on an object and pass the object a module, we are adding the methods of the module to the singleton class of the object:
module M
def m
puts 'm'
end
end
class Test
end
Test.extend M
Test.m
We can access the singleton class with the singleton_class method:
Test.singleton_class.ancestors
=> [#<Class:Test>, M, #<Class:Object>, ...
Ruby provides some hooks for modules when they are being mixed into classes/modules. included is a hook method provided by Ruby which gets called whenever you include a module in some module or class. Just like included, there is an associated extended hook for extend. It will be called when a module is extended by another module or class.
module M
def self.included(target)
puts "included into #{target}"
end
def self.extended(target)
puts "extended into #{target}"
end
end
class MyClass
include M
end
class MyClass2
extend M
end
This creates an interesting pattern that developers could use:
module M
def self.included(target)
target.send(:include, InstanceMethods)
target.extend ClassMethods
target.class_eval do
a_class_method
end
end
module InstanceMethods
def an_instance_method
end
end
module ClassMethods
def a_class_method
puts "a_class_method called"
end
end
end
class MyClass
include M
# a_class_method called
end
As you can see, this single module is adding instance methods, "class" methods, and acting directly on the target class (calling a_class_method() in this case).
ActiveSupport::Concern encapsulates this pattern. Here's the same module rewritten to use ActiveSupport::Concern:
module M
extend ActiveSupport::Concern
included do
a_class_method
end
def an_instance_method
end
module ClassMethods
def a_class_method
puts "a_class_method called"
end
end
end
Right now, I'm thinking about the template design pattern. It just wouldn't feel right with a module.