I've been googling around for this and haven't been able to find an answer, which makes me think the answer is no, but I figured I'd ask here in case anyone knows for sure.
Does Ruby have a hook for when methods are defined (i.e., on a module or class)?
If not, is anyone familiar enough with the implementation of the main object to know how exactly it copies methods to Object when they're defined at the top level?
Really curious about this. Thanks for any info :)
It does. Module#method_added https://ruby-doc.org/core-2.2.2/Module.html#method-i-method_added
module Thing
  def self.method_added(method_name)
    puts "Thing added #{method_name}"
  end
  def self.a_class_method; end
  def do_something; end
end
class Person
  def self.method_added(method_name)
    puts "I added #{method_name}"
  end
  attr_accessor :name
end
Thing
Person.new
# Thing added do_something
# I added name
# I added name=
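Note that a_class_method does not show up in the output above: method_added only fires for instance methods. As far as I know the counterpart for singleton methods is singleton_method_added. Here is a minimal sketch of both hooks side by side; the module name Tracked and the log helper are made up for illustration.

```ruby
# Hypothetical module; the log exists only to make the hook calls visible.
module Tracked
  def self.log
    @log ||= []
  end

  # Called when an instance method is defined on the module.
  def self.method_added(method_name)
    log << [:instance, method_name]
  end

  # Called when a singleton method (def self.foo) is defined on the module.
  def self.singleton_method_added(method_name)
    log << [:singleton, method_name]
  end

  def self.a_class_method; end
  def do_something; end
end

p Tracked.log
```

The log also contains an entry for singleton_method_added itself, because that hook is already in place by the time its own definition completes.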
If not, is anyone familiar enough with the implementation of the main object to know how exactly it copies methods to Object when they're defined at the top level?
It doesn't "copy methods". The language specification simply says that methods defined at the top-level become methods of Object. This is exactly the same mechanism as the one that says that methods defined inside class Foo become methods of class Foo. The language spec says it, therefore the implementors implement it that way. main doesn't need to do anything.
If you want to get real technical, then this is about the default definee, which is the implicit scope in which methods get defined when you don't explicitly specify the definee (as in def foo.bar; end). Usually, the default definee is the self of the closest lexically enclosing class or module definition body, and when there is no lexically enclosing class or module definition, it is Object. But some reflective methods, such as instance_eval or class_eval etc. may or may not change it.
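To make the default definee idea concrete, here is a small sketch (class and method names are invented for the example). With a block, class_eval and instance_eval both set self to the class, but def lands in different places.

```ruby
class Foo; end

# class_eval: self is Foo and the default definee is Foo,
# so def creates an ordinary instance method.
Foo.class_eval do
  def from_class_eval
    :instance_method
  end
end

# instance_eval: self is still Foo, but the default definee is
# Foo's singleton class, so def creates a singleton method.
Foo.instance_eval do
  def from_instance_eval
    :singleton_method
  end
end

# No enclosing class or module: the default definee is Object.
def from_top_level
  :top_level
end

puts Foo.new.from_class_eval   # instance method on Foo
puts Foo.from_instance_eval    # singleton method on Foo
# When this is run as a plain script, the top-level method is private on Object.
puts Object.private_method_defined?(:from_top_level)
```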
Related
I sometimes write modules that will only contain module methods (as opposed to module instance methods) (are there better names for these?). These modules should not be included in classes because that would have no effect and be misleading to a reader. So I'd like it to be as clear as possible to the reader that these modules contain no instance methods.
If I define all methods with self., then a reader has to inspect all methods to ensure that this module contains no instance methods. If I instead use class << self or extend self then it is automatic; as soon as the reader sees this, they know.
I think extend self is best because with class << self one has to find its corresponding end; that is, it may not apply to all methods in the module.
So is it a good idea, and a best practice, to use extend self in cases like this?
Also, is there any difference at runtime between enclosing all methods in class << self as opposed to using extend self?
I sometimes write modules that will only contain module methods (as opposed to module instance methods) (are there better names for these?).
Singleton, meaning a class with a single instance. Here that "single instance" is the Module instance.
If I define all methods with self., then a reader has to inspect all methods to ensure that this module contains no instance methods
The module's documentation should make this clear. If the user of a module has to study the code to understand your module, that is a documentation failure.
What does extend self do?
So I'd like it to be as clear as possible to the reader that these modules contain no instance methods.
extend self does the opposite. It makes all the instance methods also be class methods. It's equivalent to YourModule.extend(YourModule).
module YourModule
  def some_method
    23
  end
  extend self
end
Is the same as...
module YourModule
  def some_method
    23
  end
end
YourModule.extend(YourModule)
Which is similar to...
module YourModule
  def some_method
    23
  end
  def self.some_method
    23
  end
end
Why would you do this? To allow both...
YourModule.some_method
and also...
class SomeClass
  extend YourModule
end
SomeClass.some_method
There are edge cases where you might want this, but for general use I would argue this is an anti-pattern. The first is using a module as a singleton, the second is using the module as a mixin or trait. These are two rather different design goals for a module. Trying to be both will compromise the design of both.
Pros and Cons.
Since the primary use case of being both a singleton and a mixin is an anti-pattern, I would argue use class << self, with def self.method occasionally, and module_function and extend self never.
class << self
Pros
All class definitions are grouped together.
The block scope makes it clear what affects the class and what affects the instances.
Indentation makes it clear what is in the block.
IDEs can clearly identify what is in the block.
It allows using normal declarations like attr_accessor on the class.
It is documented.
It is common.
Rubocop approved.
Cons
When looking at an individual method, it's not as obvious as def self.method.
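As a rough illustration of the attr_accessor point, here is a sketch of a singleton-style module built with class << self. The module name Settings and its methods are invented for the example.

```ruby
# Hypothetical settings module used purely as a singleton.
module Settings
  class << self
    # Inside class << self, attr_accessor defines Settings.verbose
    # and Settings.verbose= on the module itself.
    attr_accessor :verbose

    def describe
      verbose ? "verbose" : "quiet"
    end
  end
end

Settings.verbose = true
puts Settings.describe                    # verbose
p Settings.instance_methods(false)        # [] (nothing to mix in)
```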
def self.method
Pros
It's obvious it's a class method from looking at the method.
It is documented.
It is common.
Rubocop approved.
Cons
You might forget to add the self..
It allows mixing of class and instance methods making the reader hunt through the code.
It does not help using attr_accessor and friends on the class.
extend self
Pros
It allows your module to act as both a singleton (YourModule.method) and a mixin (extend YourModule)... which is also a con.
Cons
It is obscure; many (most?) won't know to look for it or what it means if they find it.
It is not documented (or if it is, I can't find it).
Individual methods look like instance methods.
It can appear anywhere in the module, and there's no consensus where it should go, making it action at a distance.
It affects the meaning of code that appears before it (the only case of this I can think of in Ruby), further making it action at a distance.
Rubocop prefers module_function to extend self, though doesn't explain why. For my guesses, see below.
It allows your module to act as both a singleton (YourModule.method) and a mixin (extend YourModule). Those are two rather different use cases making this an anti-pattern.
module_function
I've never heard of this either, but it came up when searching for extend self. I would also say to never use this, use class << self, but it's better than extend self.
Pros
It's at least mentioned in the Modules and Classes documentation.
It's documented.
It works like private in that it affects all methods below it (though this is also a con, see below).
If there are to be no instance methods, it must appear at the top of the module.
Rubocop approved.
Cons
It is obscure; many (most?) won't know to look for it or what it means when they find it.
Individual methods look like instance methods.
It affects the meaning of distant code after it making it action at a distance.
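For what it's worth, here is a small sketch of how module_function and extend self differ once a module does get included somewhere. All module and class names are made up for the comparison.

```ruby
module WithModuleFunction
  module_function          # applies to every method defined below

  def greet
    "hi"
  end
end

module WithExtendSelf
  extend self

  def greet
    "hi"
  end
end

class HostA; include WithModuleFunction; end
class HostB; include WithExtendSelf; end

puts WithModuleFunction.greet             # callable on the module
puts WithExtendSelf.greet                 # callable on the module
p HostA.new.respond_to?(:greet)           # false: the mixed-in copy is private
p HostB.new.respond_to?(:greet)           # true: the mixed-in method is public
```

So with module_function an accidental include at least does not widen the public interface of the including class, which may be part of why linters tend to prefer it.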
I don't see why it should matter how you decide to define the module methods. Consider simply raising an exception if the module is included in another module (which may be a class). You can do that with the callback (a.k.a. "hook") method Module#included. Here's an example.
module M
  # This module is not to be included in a class because
  # it contains no instance methods.
  def self.included(klass)
    raise "\nYou intended to include this module in #{klass}. You must be out of\nyour mind! It does no harm but there is no point in doing so\nbecause this module contains no instance methods. Duh!"
  end
  def self.hi
    puts "Hi, guys"
  end
end
M.hi
Hi, guys
class C
  include M
end
RuntimeError:
You intended to include this module in C. You must be out of
your mind! It does no harm but there is no point in doing so
because this module contains no instance methods. Duh!
Suppose I have the following:
module MyModule
  module SubModule
    Var = 'this is a constant'
    var = 'this is not a constant'
    def hello_world
      return 'hello world!'
    end
  end
end
In the same file, I can only seem to access MyModule::SubModule::Var, but not the lowercase var or the method. If I now create a class and include these modules in different ways, I get additional strange behavior:
class MyClass
  include MyModule
  def initialize()
    puts SubModule::Var
  end
  def self.cool_method
    puts SubModule::Var
  end
end
In this case, I can again only access Var, but not the other two. SubModule::var and SubModule::hello_world do not work. Finally:
class MyClass
  include MyModule::SubModule
  def initialize()
    puts Var
    puts hello_world
  end
  def self.cool_method
    puts Var
    puts hello_world
  end
end
In this case, I can now access both Var and the method hello_world but not var, and, the weirdest thing, is that hello_world appears to have become an instance method! That is, the call to hello_world in initialize works, but the one in self.cool_method doesn't. This is pretty strange, considering that Var seems to have been included as a class variable, since outside the class, I must access them like so:
MyClass::Var
x = MyClass.new
x.hello_world
So, I have a few major questions.
What is going on behind the scenes with regards to Var vs var? It appears that capitalizing a variable name is more than just a convention after all.
When includeing a module, what kinds of things are passed to the including class, and at what scope?
Is there a way to do the opposite? That is, use include to include an instance variable or a class method?
What is going on behind the scenes with regards to Var vs var? It appears that capitalizing a variable name is more than just a convention after all.
Yes, of course, it's not a convention. Variables which start with an uppercase letter are constants, variables which start with a lowercase letter are local variables. The two are completely different.
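A quick sketch of the difference, with invented names: the constant is stored on the module and can be reached from outside, while the local variable only exists while the module body is being evaluated and is not even visible inside a def in the same module.

```ruby
module Outer
  LIMIT = 10   # constant: stored on the module, reachable as Outer::LIMIT
  limit = 5    # local variable: exists only while this module body runs

  def self.limit_visible?
    # def starts a fresh scope, so the local variable above is not in scope here.
    defined?(limit) ? true : false
  end
end

p Outer::LIMIT            # 10
p Outer.constants         # [:LIMIT]
p Outer.limit_visible?    # false
```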
When includeing a module, what kinds of things are passed to the including class, and at what scope?
Nothing gets passed anywhere. includeing a mixin effectively makes that mixin the superclass of the class you are includeing it into: the module is inserted into the ancestor chain directly above the class. That's all. Everything else then works exactly as with classes.
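You can watch this happen with ancestors. A minimal sketch, using made-up names:

```ruby
module Greeter
  NAME = "greeter"

  def hello
    "hello"
  end
end

class Visitor
  include Greeter
end

p Visitor.ancestors.first(3)     # [Visitor, Greeter, Object]
p Visitor.superclass             # Object (superclass skips included modules)
p Visitor::NAME                  # constant found through the ancestor chain
p Visitor.new.hello              # instance method found the same way
p Visitor.respond_to?(:hello)    # false: nothing was added to Visitor itself
```

Method and constant lookup simply walk that chain, which is why Var and hello_world from the question both become reachable through the including class, and why hello_world is only available on instances.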
Is there a way to do the opposite? That is, use include to include an instance variable or a class method?
I don't understand this question. Instance variables have nothing to do with mixins or classes. They belong to instances, that's why they are called "instance" variables.
There are no such things as "class methods" in Ruby. Ruby only knows one kind of methods: instance methods. When Rubyists talk to each other, they will sometimes use the term "class method" to mean "singleton method of an object that happens to be a class", but they do that knowing full well that class methods don't actually exist, it's just a shorthand in conversation. (And, of course, singleton methods don't exist either, they are just a convenient way of saying "instance method of the singleton class".)
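You can check this reflectively. In the sketch below (class name invented), the so-called class method turns out to be a plain instance method owned by the class's singleton class.

```ruby
class Report
  def self.build
    new
  end
end

p Report.singleton_methods(false)                    # [:build]
p Report.singleton_class.instance_methods(false)     # [:build]

# The method object's owner is the singleton class, not Report itself.
owner = Report.singleton_class.instance_method(:build).owner
p owner.equal?(Report.singleton_class)               # true
```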
I am new to Ruby and I saw methods defined like:
def method_one
  puts "method 1"
end
class MyClass
  method_one
  def method_two
    puts "method 2"
  end
end
The way method_one is used reminds me of Python decorators. The output of
c = MyClass.new
c.method_two
is
method 1
method 2
I have been trying to search for more information about this syntax/language feature in the Ruby documentation on the web but I don't know what keywords to search for.
What is this thing called?
TL;DR
This code doesn't do what you think it does. Don't do stuff like this.
Ruby's Top-Level Object
Ruby lets you define methods outside a class. These methods become private instance methods of Object, so they can be called almost anywhere, and you can (generally) treat the top level as a sort of catch-all namespace. You can see various posts like What is the Ruby Top-Level? for more details, but you shouldn't really need to care.
In your original post, method_one is just a method defined in the top-level. It is therefore available to classes and methods nested within the top-level, such as MyClass.
Methods in Classes
Despite what you think, the following doesn't actually declare a :method_one class or instance method on MyClass:
class MyClass
  method_one
  def method_two; end
end
Instead, Ruby calls the top-level ::method_one during the definition of the class, but it never becomes a class method (e.g. MyClass::method_one) or an instance method (e.g. MyClass.new.method_one). There might be a few use cases for doing this (e.g. printing debugging information, test injection, etc.) but it's confusing, error-prone, and generally to be avoided unless you have a really strong use case for it.
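Here is a small sketch that makes the timing visible. The global $calls is only there for the demonstration and is not part of the original code.

```ruby
$calls = []   # demonstration-only record of who called method_one

def method_one
  $calls << self
end

class MyClass
  method_one            # runs immediately, with self being MyClass

  def method_two
    "method 2"
  end
end

p $calls                              # [MyClass]: called once, during definition
p MyClass.instance_methods(false)     # [:method_two]
p MyClass.singleton_methods(false)    # []

MyClass.new.method_two
p $calls.size                         # still 1: method_two never calls method_one
```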
Better Options
In general, when you see something like this outside an academic lesson, the programmer probably meant to do one of the following:
Extend a class.
Add a singleton method to a class.
Include a module in a class.
Set up a closure during class definition.
The last gets into murky areas of metaprogramming, at which point you should probably be looking at updating your class initializer, or passing Proc or lambda objects around instead. Ruby lets you do all sorts of weird and wonderful things, but that doesn't mean you should.
I think you're a little misled; the output of:
c = MyClass.new
c.method_two
is
#<MyClass:0x007feda41acf18>
"method 2"
Calling method_two does not print "method 1". That line is printed once, when the class body is evaluated: when the file is loaded or, if you're in IRB, as soon as you enter the final end.
I would suggest looking into ruby's initialize method.
I understand from this question that in an instance method, self refers to the current instance of the class. Is that true no matter how many layers of methods or loops deep you are within the instance method definition?
Generally, yes, though there are some metaprogramming methods that can do some strange things with self - for instance, Object#instance_eval allows you to pass a block to be evaluated in the context of another instance. In that case, the self within the block is that of the other instance, like so:
class Foo
end
class Bar
  def wacky
    puts self.class # "Bar"
    Foo.new.instance_eval do
      puts self.class # "Foo"
    end
  end
end
Without a careful reading, you might be tempted to think that the self within the block refers to the Bar instance, but this is not so.
So you see, for most purposes, you can assume self is the self that is bound when entering a method. Although you have the ability to pass blocks around that get a different binding, self doesn't get re-bound "by accident" in Ruby. For more interesting reading, you might look at the ruby Binding class' documentation.
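To illustrate both halves of that, here is a sketch (names invented) that records self at several nesting depths inside one instance method, and then once more inside an instance_eval block.

```ruby
class Tracker
  def selves
    found = [self]

    [1, 2].each do
      loop do
        found << self    # ordinary blocks and loops keep the method's self
        break
      end
    end

    other = Object.new
    other.instance_eval { found << self }   # instance_eval rebinds self to other

    found
  end
end

t = Tracker.new
results = t.selves
p results[0..2].all? { |s| s.equal?(t) }   # true
p results.last.equal?(t)                   # false
```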
Yes, self always refers to the instance that the method or block is invoked upon.
Is it possible to code something that can tell me when a Ruby class is defined?
Yes!
class Object
  def self.inherited(base)
    puts "#{base} inherited from object"
  end
end
class Animal
end
class Cat < Animal
end
Running the above code prints the following:
Animal inherited from object
Cat inherited from object
Basically, the self.inherited callback is triggered whenever a class is defined that inherits from the class it is defined on. Put it on Object and that's any class! (Although there may be some special case exceptions I can't think of just now).
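If you only care about classes in your own hierarchy, a less invasive variation is to put the hook on a base class you control rather than on Object. This is just a sketch; Plugin and the registry are hypothetical names.

```ruby
# Hypothetical base class that records its descendants as they are defined.
class Plugin
  @registry = []

  class << self
    attr_reader :registry

    def inherited(subclass)
      super                          # keep any other inherited hooks working
      Plugin.registry << subclass    # always record on the base class
    end
  end
end

class CsvPlugin < Plugin; end
class JsonPlugin < CsvPlugin; end
class Unrelated; end                 # not a Plugin, so the hook never fires

p Plugin.registry                    # [CsvPlugin, JsonPlugin]
```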
I should probably add the disclaimer that, while it is possible to do this (because of just how awesome Ruby is as a language), whether it is advisable to do this, especially in code destined for production use, I'm not so sure. Well, actually, I am sure. It would be a bad idea.