In Ruby, why is include private and extend public?

In Ruby, what is the reason that Module#include is private, while Object#extend is public?

Object#extend has to be public, otherwise you wouldn't be able to use it. After all, its purpose is to mix a module into an object, so you'd generally call it like obj.extend(Foo), which isn't possible with private methods.
Module#include is usually only used inside a module body like so:
class Bar
  include Foo
end
I.e. it is usually called without a receiver, so it doesn't have to be public. Of course, it doesn't have to be private either.
My guess is the reason why it is private is that it is more invasive, because it changes the behavior of every instance of Bar, whereas Object#extend only changes a single object. Therefore, Module#include is in some sense "more dangerous" and thus is made private.
I don't know whether that is the actual reason, but it is consistent with other similar methods like Module#define_method.
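To make the difference in scope concrete, here is a minimal sketch. The module and class names (Greeter, Person) are made up for illustration and do not come from the question.

```ruby
# Hypothetical names, for illustration only.
module Greeter
  def greet
    "hello"
  end
end

class Person; end

alice = Person.new
bob = Person.new

# extend mixes the module into this one object only.
alice.extend(Greeter)
alice_before = alice.respond_to?(:greet) # true
bob_before = bob.respond_to?(:greet)     # false

# include changes every instance of the class, including existing ones.
class Person
  include Greeter
end

bob_after = bob.respond_to?(:greet)      # true
```

After the include, bob responds to greet even though it was created before the class was reopened, which is the "more invasive" behaviour described above.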

To be able to run Foo.include(Bar) at any point would most likely be a source of very nasty bugs.

To supplement Jörg W Mittag's answer, Object#extend can also be used inside a class body to make a module's instance methods available at the class level, as class methods of that class:
module Foo
  def bar(baz)
  end
end
class Qux
  extend Foo
  bar 'asdf'
end

Related

Using `extend self` and `class << self` in Ruby Modules

I sometimes write modules that will only contain module methods (as opposed to module instance methods) (are there better names for these?). These modules should not be included in classes because that would have no effect and be misleading to a reader. So I'd like it to be as clear as possible to the reader that these modules contain no instance methods.
If I define all methods with self., then a reader has to inspect all methods to ensure that this module contains no instance methods. If I instead use class << self or extend self then it is automatic; as soon as the reader sees this, they know.
I think extend self is best because with class << self one has to find its corresponding end; that is, it may not apply to all methods in the module.
So is it a good idea, and a best practice, to use extend self in cases like this?
Also, is there any difference at runtime between enclosing all methods in class << self as opposed to using extend self?
I sometimes write modules that will only contain module methods (as opposed to module instance methods) (are there better names for these?).
Singleton, meaning a class with a single instance. Here that "single instance" is the Module instance.
If I define all methods with self., then a reader has to inspect all methods to ensure that this module contains no instance methods
The module's documentation should make this clear. If the user of a module has to study the code to understand your module, that is a documentation failure.
What does extend self do?
So I'd like it to be as clear as possible to the reader that these modules contain no instance methods.
extend self does the opposite. It makes all the instance methods also be class methods. It's equivalent to YourModule.extend(YourModule).
module YourModule
  def some_method
    23
  end
  extend self
end
Is the same as...
module YourModule
  def some_method
    23
  end
end
YourModule.extend(YourModule)
Which is similar to...
module YourModule
  def some_method
    23
  end
  def self.some_method
    23
  end
end
Why would you do this? To allow both...
YourModule.some_method
and also...
class SomeClass
  extend YourModule
end
SomeClass.some_method
There are edge cases where you might want this, but for general use I would argue this is an anti-pattern. The first is using a module as a singleton, the second is using the module as a mixin or trait. These are two rather different design goals for a module. Trying to be both will compromise the design of both.
Pros and Cons.
Since the primary use case of being both a singleton and a mixin is an anti-pattern, I would argue use class << self, with def self.method occasionally, and module_function and extend self never.
class << self
Pros
All class definitions are grouped together.
The block scope makes it clear what affects the class and what affects the instances.
Indentation makes it clear what is in the block.
IDEs can clearly identify what is in the block.
It allows using normal declarations like attr_accessor on the class.
It is documented.
It is common.
Rubocop approved.
Cons
When looking at an individual method, it's not as obvious as def self.method.
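As a rough illustration of the attr_accessor point, here is a sketch of a module used purely as a singleton. AppConfig and its methods are hypothetical names chosen for the example.

```ruby
# Hypothetical module, for illustration only.
module AppConfig
  class << self
    # Inside class << self, attr_accessor defines
    # AppConfig.level and AppConfig.level=.
    attr_accessor :level

    def verbose?
      level == :debug
    end
  end
end

AppConfig.level = :debug
AppConfig.verbose?          # true
AppConfig.instance_methods  # [] so there is nothing to mix in
```

Because everything lives in the singleton class, including or extending AppConfig elsewhere adds no methods.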
def self.method
Pros
It's obvious it's a class method from looking at the method.
It is documented.
It is common.
Rubocop approved.
Cons
You might forget to add the self. prefix.
It allows mixing of class and instance methods making the reader hunt through the code.
It does not help using attr_accessor and friends on the class.
extend self
Pros
It allows your module to act as both a singleton (YourModule.method) and a mixin (extend YourModule)... which is also a con.
Cons
It is obscure; many (most?) won't know to look for it or what it means if they find it.
It is not documented (or if it is, I can't find it).
Individual methods look like instance methods.
It can appear anywhere in the module, and there's no consensus where it should go, making it action at a distance.
It affects the meaning of code before it (the only case of this I can think of in Ruby), further making it action at a distance.
Rubocop prefers module_function to extend self, though doesn't explain why. For my guesses, see below.
It allows your module to act as both a singleton (YourModule.method) and a mixin (extend YourModule). Those are two rather different use cases making this an anti-pattern.
module_function
I've never heard of this either, but it came up when searching for extend self. I would also say to never use this, use class << self, but it's better than extend self.
Pros
It's at least mentioned in the Modules and Classes documentation.
It's documented.
It works like private in that it affects all methods below it (though this is also a con, see below).
If there are to be no instance methods, it must appear at the top of the module.
Rubocop approved.
Cons
It is obscure; many (most?) won't know to look for it or what it means when they find it.
Individual methods look like instance methods.
It affects the meaning of distant code after it making it action at a distance.
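One observable runtime difference between the two obscure options can be sketched as follows. The module and class names are invented for the example.

```ruby
# Hypothetical names, for illustration only.
module ViaExtendSelf
  extend self

  def answer
    42
  end
end

module ViaModuleFunction
  module_function

  def answer
    42
  end
end

class A
  include ViaExtendSelf
end

class B
  include ViaModuleFunction
end

ViaExtendSelf.answer        # 42
ViaModuleFunction.answer    # 42
A.new.answer                # 42, the mixed-in method stays public
B.new.respond_to?(:answer)  # false, the instance-level copy is private
```

With module_function the module-level method is a copy and the instance-level original becomes private, so the module is harder to misuse as a mixin than with extend self.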
I don't see why it should matter how you decide to define the module methods. Consider simply raising an exception if the module is included in another module (which may be a class). You can do that with the callback (a.k.a. "hook") method Module#included. Here's an example.
module M
  # This module is not to be included in a class because
  # it contains no instance methods.
  def self.included(klass)
    raise "\nYou intended to include this module in #{klass}. You must be out of\nyour mind! It does no harm but there is no point in doing so\nbecause this module contains no instance methods. Duh!"
  end
  def self.hi
    puts "Hi, guys"
  end
end
M.hi
Hi, guys
class C
  include M
end
RuntimeError:
You intended to include this module in C. You must be out of
your mind! It does no harm but there is no point in doing so
because this module contains no instance methods. Duh!

Can I temporarily make all Ruby methods public?

I'm reproducing a bug in my Rails console. I'm interested in what some methods return, but some of them turn out to be private, so in my console I have to write:
> my_object.my_method
NoMethodError (private method `my_method' called for #<MyClass:0x0123456789ABCDEF>)
> my_object.send(:my_method)
This gets a bit tedious after a while, especially since it's not obvious which are private without drilling through to the class they're defined in.
Is there any way I could temporarily make all methods public? I don't intend to use it in production, just temporarily in my local console while I'm debugging.
UPDATE: when I say “all methods”, I don’t just mean the ones on my_object. I mean literally all methods on every object.
To make all methods public, this should work:
ObjectSpace.each_object(Module) do |m|
  m.send(:public, *m.private_instance_methods(false))
end
ObjectSpace.each_object traverses all modules (that includes classes and singleton classes) and makes their (own) private_instance_methods public.
To make just a single object's methods public, you could use:
my_object.singleton_class.send(:public, *my_object.private_methods)
By changing the singleton class, only the my_object instance is affected.
Note that by default, private_methods returns inherited methods too, including many from Kernel. You might want to pass false to just include the object's own private methods.
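Here is a small sketch of the single-object variant restricted to the class's own private methods. Vault and secret are hypothetical names, and the sketch takes the method names from the class with private_instance_methods(false) rather than from the object.

```ruby
# Hypothetical class, for illustration only.
class Vault
  private

  def secret
    "1234"
  end
end

vault = Vault.new
other = Vault.new

# Only the names defined directly on Vault, not those inherited from Kernel.
own_private = Vault.private_instance_methods(false)

# Re-declare them as public on vault's singleton class only.
vault.singleton_class.send(:public, *own_private)

vault.secret # "1234"

begin
  other.secret
rescue NoMethodError => e
  puts "other is unaffected: #{e.class}"
end
```

Only vault is changed; other instances of the class keep their private methods.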
Something like
my_object.class.private_instance_methods.each { |m| my_object.class.send :public, m }
might help you shoot a leg, I guess. :) (but the answer above is way better)

What are empty-body methods used for in Ruby?

Currently reading a Ruby style guide and I came across an example:
def no_op; end
What is the purpose of empty body methods?
There are a number of reasons you might create an empty method:
Stub a method that you will fill in later.
Stub a method that a descendant class will override.
Ensure a class or object will #respond_to? a method without necessarily doing anything other than returning nil.
Undefine an inherited method's behavior while still allowing it to #respond_to? the message, as opposed to using undef foo on public methods and surprising callers.
There are possibly other reasons, too, but those are the ones that leapt to mind. Your mileage may vary.
There may be several reasons.
One case is when a class is expected to implement a specific interface (virtually speaking, given that in Ruby there are no interfaces), but in that specific class that method would not make sense. In this case, the method is left for consistency.
class Foo
  def say
    "foo"
  end
end
class Bar
  def say
    "bar"
  end
end
class Null
  def say
  end
end
In other cases, it is left as a temporary placeholder or reminder.
There are also cases where the method is left blank on purpose, as a hook for developers using that library. The method is called somewhere at runtime, and developers using that library can override the blank method in order to execute some custom callback. This approach was used in the past by some Rails libraries.
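The hook idea can be sketched like this. Importer, LoggingImporter, and before_run are hypothetical names, not taken from any particular library.

```ruby
# Hypothetical classes, for illustration only.
class Importer
  def run
    before_run # does nothing unless a subclass overrides it
    "imported"
  end

  # Empty-body hook: safe to call, returns nil.
  def before_run; end
end

class LoggingImporter < Importer
  attr_reader :log

  def before_run
    @log = "starting"
  end
end

plain = Importer.new
plain.run      # "imported"

logging = LoggingImporter.new
logging.run    # "imported"
logging.log    # "starting"
```

The base class never has to check whether the hook exists, because the empty method guarantees it does.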

ruby style and instance variables

Something that I see in a lot of code:
class Foo
  attr_accessor :bar
  # lots of code omitted
  def baz
    'qux' if bar
  end
end
The exact form of the baz method is not too important - it's just that bar here is a reference to the getter method for the instance variable @bar, called from within the instance's class. I would favor retrieving the value from @bar explicitly. Are there any opinions on this?
I've never seen anything in the ruby style guide or similar covering this. I personally find that doing the former makes it harder to read and understand, especially when classes are over several hundred lines long.
Edit:
Perhaps to illustrate what I would consider to be the awkwardness of this design, let's re-evaluate a pretty standard initialize method:
class Foo
  attr_accessor :bar, :qux
  def initialize(bar, qux)
    @bar = bar
    @qux = qux
  end
end
If we use the setter method, we cannot simply write bar = bar by analogy, because Ruby would treat that as an assignment to a local variable. Instead, we have:
class Foo
  attr_accessor :bar, :qux
  def initialize(bar, qux)
    self.bar = bar
    self.qux = qux
  end
end
which has lost some of the elegance of the first. We have a more flexible design in that we are now free to rewrite our setter method and do away with attr_writer. But I've just lost some points in style, and it feels a lot like configuration over convention rather than the converse, something that Russ Olsen has declared a design 'pattern' not just of Rails but of Ruby too.
Accessing the attribute through the getter has the advantage of providing encapsulation. The use of an instance variable to store the value is an implementation detail in some respects. Whether that's appropriate is, of course, situational. I don't recall reading anything explicit on this style issue, however.
Found https://softwareengineering.stackexchange.com/questions/181567/should-the-methods-of-a-class-call-its-own-getters-and-setters, which discusses the issue from a language-independent point of view. Also found https://www.ruby-forum.com/topic/141107, which is ruby-specific, although it doesn't break any new ground, let alone imply a Ruby standard.
Update: Just came across the following statement on page 24 of http://www.amazon.com/Practical-Object-Oriented-Design-Ruby-Addison-Wesley/dp/0321721330/ref=sr_1_1?s=books&ie=UTF8&qid=1376760915&sr=1-1, a well-respected book on Ruby: "Hide the variables, even from the class that defines them, by wrapping them in methods." (emphasis added). It goes on to give examples of methods in the class using the accessor methods for access.
I would favor retrieving the value from @bar explicitly. Are there any opinions on this?
Yes, direct access is not as flexible of a design. Getters and setters can be used to transform values. That is why java programmers spend half their lives banging out do nothing setters and getters for their private variables--they want to present the setters and getters as their api, which allows them to change their code in the future to transform values on the way in or the way out without changing the api.
Then ruby came along with the neat attr_accessor method, which meant that writing do nothing setters and getters wasn't painful anymore.
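A small sketch of why going through the getter pays off later. The User class is hypothetical; imagine name started out as a plain attr_reader and was later replaced by a hand-written getter.

```ruby
# Hypothetical class, for illustration only.
class User
  attr_writer :name

  def initialize(name)
    @name = name
  end

  # Was attr_reader :name originally; now it cleans the value on the way out.
  def name
    @name.strip
  end

  # Uses the getter, so it picks up the new behaviour without being edited.
  def greeting
    "Hello, #{name}"
  end
end

user = User.new("  Ada ")
user.greeting # "Hello, Ada"
```

Had greeting read @name directly, it would still produce the unstripped value after the getter was changed.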
Python goes one step further. In python, instance variables are public and you can directly access them, e.g.
print my_dog.age
A java programmer writing a python program would implement get_age() and set_age() methods:
class Dog:
    def get_age(self):
        return self.age
    def set_age(self, age):
        self.age = age
The java programmer would then fly over all the towns in the land and drop leaflets describing the getter and setter methods as the api for getting and setting the age instance variable, and they would warn people not to access the instance variables directly--or else things might break.
However, python has a feature that allows programmers to eliminate getters and setters until they are actually needed to do something useful--rather than dumbly getting or setting a value. Python allows you to transform direct access to instance variables by client code into method calls. To the client it's transparent. For instance, the client code may be accessing an instance variable in a class by writing:
my_dog.age
Python allows the writer of the class to subsequently implement a method named age(), and my_dog.age can be made to call that method instead of directly accessing the instance variable (note that in python, unlike in ruby, you can't normally call a method without the parentheses). The newly implemented age() method can then do anything it wants to the age instance variable before returning it to the client code, e.g. transform it into human years, or retrieve the age from a database.
A getter generated by attr_reader or attr_accessor is actually faster than an equivalent hand-written getter method, mainly because the generated accessors are implemented in C instead of Ruby.
As someone who's been coding Ruby for a few years, I think using attr_* is much more readable. But that's probably just something I've gotten used to.

How to make instance variables private in Ruby?

Is there any way to make instance variables "private" (C++ or Java definition) in Ruby? In other words, I want the following code to result in an error.
class Base
  def initialize()
    @x = 10
  end
end
class Derived < Base
  def x
    @x = 20
  end
end
d = Derived.new
Like most things in Ruby, instance variables aren't truly "private" and can be accessed by anyone with d.instance_variable_get :@x.
Unlike in Java/C++, though, instance variables in Ruby are always private. They are never part of the public API like methods are, since they can only be accessed with that verbose getter. So if there's any sanity in your API, you don't have to worry about someone abusing your instance variables, since they'll be using the methods instead. (Of course, if someone wants to go wild and access private methods or instance variables, there isn’t a way to stop them.)
The only concern is if someone accidentally overwrites an instance variable when they extend your class. That can be avoided by using unlikely names, perhaps calling it @base_x in your example.
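Both points can be seen in a short sketch based on the classes from the question. The base_value and clobber methods are added here purely for illustration.

```ruby
class Base
  def initialize
    @x = 10
  end

  # Illustrative helper: reads the variable the base class relies on.
  def base_value
    @x
  end
end

class Derived < Base
  # Illustrative helper: the subclass shares the same object, hence the same @x.
  def clobber
    @x = 20
  end
end

d = Derived.new
d.respond_to?(:x)             # false, @x is not part of the public API
d.instance_variable_get(:@x)  # 10, but reflection can still reach it
d.clobber
d.base_value                  # 20, the subclass overwrote the base class's state
```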
Never use instance variables directly. Only ever use accessors. You can define the reader as public and the writer private by:
class Foo
  attr_reader :bar
  private
  attr_writer :bar
end
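However, keep in mind that private and protected do not mean what you might expect from other languages. Public methods can be called against any receiver: named, self, or implicit (x.baz, self.baz, or baz). Protected methods can be called with an implicit receiver or with an explicit one (self.baz, other.baz), but only from code running inside an instance of the class that defines the method or one of its subclasses. Private methods may only be called with an implicit receiver (baz); Ruby 2.7 and later also accept a literal self receiver (self.baz).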
Long story short, you're approaching the problem from a non-Ruby point of view. Always use accessors instead of instance variables. Use public/protected/private to document your intent, and assume consumers of your API are responsible adults.
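A minimal sketch of how the three visibility levels behave in practice. Account and its methods are hypothetical names chosen for the example.

```ruby
# Hypothetical class, for illustration only.
class Account
  def initialize(balance)
    @balance = balance
  end

  # Public method calling a protected method on another Account.
  def richer_than?(other)
    balance > other.balance
  end

  # Public method calling a private method with an implicit receiver.
  def report
    "Balance: #{formatted}"
  end

  protected

  def balance
    @balance
  end

  private

  def formatted
    "#{@balance} USD"
  end
end

a = Account.new(100)
b = Account.new(50)

a.richer_than?(b) # true, protected call with an explicit receiver from inside Account
a.report          # "Balance: 100 USD"

# From outside the class, both of these raise NoMethodError:
#   a.balance
#   a.formatted
```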
It is possible (but inadvisable) to do exactly what you are asking.
There are two different elements of the desired behavior. The first is storing x in a read-only value, and the second is protecting the getter from being altered in subclasses.
Read-only value
It is possible in Ruby to store read-only values at initialization time. To do this, we use the closure behavior of Ruby blocks.
class Foo
  def initialize(x)
    define_singleton_method(:x) { x }
  end
end
The initial value of x is now locked up inside the block we used to define the getter #x and can never be accessed except by calling foo.x, and it can never be altered.
foo = Foo.new(2)
foo.x # => 2
foo.instance_variable_get(:@x) # => nil
Note that it is not stored as the instance variable @x, yet it is still available via the getter we created using define_singleton_method.
Protecting the getter
In Ruby, almost any method of any class can be overwritten at runtime. There is a way to prevent this using the method_added hook.
class Foo
  def self.method_added(name)
    raise(NameError, "cannot change x getter") if name == :x
  end
end
class Bar < Foo
  def x
    20
  end
end
# => NameError: cannot change x getter
This is a very heavy-handed method of protecting the getter.
It requires that we add each protected getter to the method_added hook individually, and even then, you will need to add another level of method_added protection to Foo and its subclasses to prevent a coder from overwriting the method_added method itself.
Better to come to terms with the fact that code replacement at runtime is a fact of life when using Ruby.
Unlike methods having different levels of visibility, Ruby instance variables are always private (from outside of objects). However, inside objects instance variables are always accessible, either from parent, child class, or included modules.
Since there probably is no way to alter how Ruby accesses @x, I don't think you could have any control over it. Writing @x would just directly pick that instance variable, and since Ruby doesn't provide visibility control over variables, live with it I guess.
As @marcgg says, if you don't want derived classes to touch your instance variables, don't use them at all or find a clever way to hide them from derived classes.
It isn't possible to do what you want, because instance variables aren't defined by the class, but by the object.
If you use composition rather than inheritance, then you won't have to worry about overwriting instance variables.
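A rough sketch of the composition idea, with made-up Engine and Car classes: each object keeps its own instance variables, so identical names cannot collide.

```ruby
# Hypothetical classes, for illustration only.
class Engine
  def initialize
    @state = :off
  end

  def start
    @state = :on
  end

  def running?
    @state == :on
  end
end

class Car
  def initialize
    @engine = Engine.new # composed, not inherited
    @state = :parked     # Car's own @state, unrelated to Engine's
  end

  def drive
    @engine.start
    @state = :moving
  end

  def engine_running?
    @engine.running?
  end
end

car = Car.new
car.drive
car.engine_running? # true, Engine's @state was not overwritten by Car's
```

If Car had inherited from Engine instead, both classes would be writing to the same @state on the same object.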
If you want protection against accidental modification, I think attr_accessor can be a good fit.
class Data
  attr_accessor :id
  private :id=
end
That makes the writer private, so id cannot be assigned from outside the object but can still be read. You can however use the public attr_reader and private attr_writer syntax as well. Like so:
class Data
  attr_reader :id
  private
  attr_writer :id
end
I know this is old, but I ran into a case where I didn't so much want to prevent access to @x as exclude it from any methods that use reflection for serialization. Specifically, I use YAML::dump often for debug purposes, and in my case @x was of class Class, which YAML::dump refuses to dump.
In this case I had considered several options:
Addressing this just for YAML by redefining "to_yaml_properties"
def to_yaml_properties
  super - ["@x"]
end
but this would have worked only for YAML, and other dumpers (to_xml?) would still not be happy.
Addressing this for all reflection users by redefining "instance_variables"
def instance_variables
  # instance_variables returns symbols on Ruby 1.9 and later
  super - [:@x]
end
Also, I found this in one of my searches, but have not tested it as the above seem simpler for my needs
So while these may not be exactly what the OP said he needed, if others find this posting while looking for the variable to be excluded from listing, rather than access - then these options may be of value.
