Constant lookup - ruby

Scenario #1:
In the example below, puts Post::User.foo line prints foo. In other words Post::User returns a global User constant.
class User
def self.foo
"foo"
end
end
class Post
puts Post::User.foo
end
# => warning: toplevel constant User referenced by Post::User
# => foo
Scenario #2:
This second example raises an error because no constant is found.
module User
def self.foo
"foo"
end
end
module Post
puts Post::User.foo
end
# => uninitialized constant Post::User (NameError)
The result of scenario #2 is more intuitive. Why is the constant found in scenario #1? If User constant is returned in scenario #1, why isn't that happening in scenario #2? In scenario #2, Post is a module, so in that case, constant should be searched in Object.ancestors, which should also return User constant, but this is not happening.

The behaviour is caused by the lookup happening inside a module, instead of in a class. (the fact that you are looking up a module inside the module and a class inside the class is irrelevant).
module M
def self.foo; "moo" end
end
class C
def self.foo; "coo" end
end
class A
A::M.foo rescue puts $! # warning: toplevel constant M referenced by A::M
A::C.foo rescue puts $! # warning: toplevel constant C referenced by A::C
end
module B
B::M.foo rescue puts $! # error: uninitialized constant B::M
B::C.foo rescue puts $! # error: uninitialized constant B::C
end
If you look at the C code, both of these call into rb_const_get_0 with exclude=true, recurse=true, visibility=true.
In the case of A::M, this looks up:
tmp = A
tmp::M # (doesn't exist)
tmp = tmp.super # (tmp = Object)
tmp::M # (exists, but warns).
In the case of B::M, this looks up:
tmp = B
tmp::M # (doesn't exist)
tmp = tmp.super # (modules have no super so lookup stops here)
Because exclude is true, the usual edge case for modules jumping to (tmp = Object) is skipped. This means that B::M behaves differently to module B; M; end, which is arguably an inconsistency in ruby.
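The difference between the two spellings can be checked directly. The following is a minimal sketch; Toplevel and Container are hypothetical names chosen only for illustration. It exercises only the module case, which raises NameError on every Ruby version. Note that since Ruby 2.5 the class case (A::M above) also raises NameError instead of printing a warning.

```ruby
# Hypothetical names, used only for this illustration.
module Toplevel
  def self.foo; "moo" end
end

module Container; end

# Lexical lookup: evaluated inside the body of Container, the bare name
# falls back to Object and finds the top-level constant.
lexical_result = module Container; Toplevel; end

# Scoped lookup: Container::Toplevel searches only Container and its
# ancestors, which do not include Object.
scoped_result =
  begin
    Container::Toplevel
  rescue NameError => e
    e.class
  end

p lexical_result  # Toplevel
p scoped_result   # NameError
```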

First of all, please consider that all top-level constants are defined in class Object, because Ruby is an object-oriented language and there can not be a variable or constant that does not belong to some class:
class A; end
module B; end
A == Object::A # => true
B == Object::B # => true
Second, class Object is by default an ancestor of any class, but not of a module:
class A; puts ancestors; end # => [A, Object, Kernel, BasicObject]
module B; puts ancestors; end # => [B]
At the same time, Object appears in neither Module.nesting:
class A; puts Module.nesting; end # => [A]
module B; puts Module.nesting; end # => [B]
Then, chapter 7.9 of the Matz book mentioned above says that Ruby searches for any constant in Module.nesting and then in ancestors.
Therefore, in your example, it finds the constant User for class Post, because Ruby defines top-level constant User in class Object (as Object::User). And Object is an ancestor of Post:
Object::User == Post::User # => true
But there is no Object in the ancestors or Module.nesting of your module Post. The constant Post is defined in class Object as Object::Post, but Post does not have Object among its ancestors, because a module has no superclass. Therefore, Ruby does not resolve the constant User in module Post through its ancestors.
At the same time, it will work if you keep User to be a module and turn Post into a class:
module User
def self.foo
"foo"
end
end
class Post
puts Post::User.foo
end
# => foo
That's because class Post can resolve any constant inside its superclass Object, and all top-level constants are defined in Object.
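The facts this answer relies on can be verified with a short sketch. SampleClass and SampleModule are hypothetical names introduced only for this example.

```ruby
# Hypothetical names, used only for this illustration.
class SampleClass; end
module SampleModule; end

# Top-level constants are stored in Object.
p Object.const_defined?(:SampleClass, false)   # true
p Object::SampleModule.equal?(SampleModule)    # true

# Object is an ancestor of a class, but not of a module.
p SampleClass.ancestors.include?(Object)       # true
p SampleModule.ancestors                       # [SampleModule]

# Module.nesting lists only the lexically enclosing scopes, never Object.
class_nesting  = class SampleClass; Module.nesting; end
module_nesting = module SampleModule; Module.nesting; end
p class_nesting                                # [SampleClass]
p module_nesting                               # [SampleModule]
```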

The problem, I think, lies in the difference between the inheritance chains of Class and Module. Class itself inherits from Module, but what matters here is which ancestors are searched:
module A; end
p A.ancestors #=> [A]
class B; end
p B.ancestors #=> [B, Object, Kernel, BasicObject]
That means that once the lookup algorithm steps into module A, it cannot step out.
So in the case of module Post, the lookup algorithm for Post::User is something like:
find const Post (found module Post)
find const User in module Post (not found, go for ancestors of Post)
dead-end - error
and in case of class Post
find const Post (found class Post)
find const User in class Post (not found, go for ancestors of Post)
find const User (found class User with warning)
That's why you can chain classes in one level of namespace and ruby will still find them. If you try
class User
def self.foo
"foo"
end
end
class A1; end
class A2; end
class Foo
p Foo::A1::A2::User.foo
end
#... many of warnings
#> "foo"
is still good.
And
class User
def self.foo
"foo1"
end
end
class A1; end
module A2; end
class Foo
p Foo::A1::A2::User.foo
end
#>.. some warnings
#> uninitialized constant A2::User (NameError)
because the lookup algorithm steps into a module and gets trapped.
TL;DR
class User
def self.foo
"foo"
end
end
class Post1
Post1.tap{|s| p s.ancestors}::User.foo #=> [Post1, Object, Kernel, BasicObject]
end
# ok
module Post2
Post2.tap{|s| p s.ancestors}::User.foo #=> [Post2]
end
# error
Updated
In this situation, the place where Post::User.foo is called does not play much of a role. It could also be outside of the class/module and the behaviour would be the same.
module ModuleA; end
class ClassB; end
class ClassC; end
class E; ModuleA::ClassC; end # error
module F; ClassB::ClassC; end # ok
ClassB::ClassC # ok
ClassB::ModuleA # ok
ModuleA::ClassB # error
And as you pointed out
Sure, ancestors for a module or class differ, but in the case of
module, the additional constant lookup in Object.ancestors happens.
it happens only at moment of "initial" lookup. It means that in case of
module Post; Post::User.foo; end
the constant Post is first looked up in (I may be mistaken here) Post.ancestors and, because there is no Post::Post, the search continues in Object.ancestors (after that it proceeds with the algorithm I described at the top).
Summing up, the context in which you reference a constant matters only for the first (leftmost) constant lookup. After that, only the object on the left is considered.
A::B::C
A # consider context
::B # consider only A
::C # consider only B
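The leftmost-name rule can be illustrated with a small sketch. Outer, Inner, Helper and LEXICAL are hypothetical names. Modules are used on purpose, so the result does not depend on the Ruby version.

```ruby
# Hypothetical names, used only for this illustration.
module Outer
  module Helper; end

  module Inner
    # Leftmost name: resolved using the surrounding context, so the
    # enclosing Outer is searched and Helper is found.
    LEXICAL = Helper
  end
end

p Outer::Inner::LEXICAL   # Outer::Helper

# Non-leftmost name: only Inner (and its ancestors) are searched.
begin
  Outer::Inner::Helper
rescue NameError => e
  p e.class               # NameError
end
```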

The class keyword in Ruby takes a constant name and defines a class for it.
In scenario 1, when puts Post::User.foo is invoked, Ruby checks whether a class Post::User is defined (it searches for it as a constant, because that is what a class name is). Once it finds it, the foo method gets called.
In scenario 2, however, you have defined both as modules, so when puts Post::User.foo is invoked there is no such constant as Post::User. The search fails and you get the error message.
You may refer to the "Class Names Are Constants" section in this link for more details.

Related

How different ways to define a class influence the way include works?

I have a simple module that defines a constant and makes it private:
module Foo
Bar = "Bar"
private_constant :Bar
end
I can include it in a class like this, and it works as expected:
class User
include Foo
def self.test
Bar
end
end
puts User.test
# => Bar
begin
User::Bar
rescue => exception
puts "#{exception} as expected"
# => private constant Foo::Bar referenced as expected
end
(let's call it "typical class" definition)
Then I tried the Class.new approach, but this failed miserably:
X = Class.new do
include Foo
def self.test
Bar # Line 28 pointed in the stack trace
end
end
begin
X::Bar
rescue => exception
puts "#{exception}"
# => private constant Foo::Bar
end
puts X.test
# test.rb:28:in `test': uninitialized constant Bar (NameError)
# from test.rb:28:in `<main>'
Why? I always thought class Something and Something = Class.new were equivalent. What's the actual difference?
Then I had a flash of inspiration and recalled there's an alternative way to define class methods, which actually worked:
X = Class.new do
class << self
include Foo
def test
Bar
end
end
end
begin
X::Bar
rescue => exception
puts "#{exception}"
# => uninitialized constant X::Bar
end
puts X.test
# Bar
Again: why does this one work, and why is the exception now different: private constant Foo::Bar vs uninitialized constant X::Bar?
It seems like those 3 ways of initializing classes differ in a nuanced way.
The first does exactly what I want: Bar is accessible internally, and accessing it from outside raises an exception about referencing a private constant.
The second gives the "ok" exception, but has no access to Bar itself.
The third has access, but now gives a slightly different exception.
What is exactly going on in here?
This is one of the biggest gotchas in Ruby: constant resolution is partially lexical, that is, it depends on how the code around it is structured.
module Foo
Bar = "Bar"
end
Bar is inside a module definition, so it is defined in that module.
class << self
include Foo
end
Foo gets included inside a class definition (here, the singleton class), so Bar is found through that class.
Class.new do
include Foo
end
There is no enclosing class or module (this is a normal method call with a block), so a bare Bar inside the block is looked up from the top level, where Foo has not been included.
As for your third error, I believe that is because the constant got defined in the singleton class (that's what class << self is) versus the class itself. They are two separate class objects.
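The effect of Class.new on bare constant names can be reproduced without private_constant. This is a minimal sketch with hypothetical names (Holder, SECRET, Dynamic); it contrasts a bare constant reference with an explicit self:: receiver.

```ruby
# Hypothetical names, used only for this illustration.
module Holder
  SECRET = "held"
end

Dynamic = Class.new do
  include Holder

  # Bare name: looked up lexically. The block adds no class scope, so the
  # search starts at the top level and never visits Dynamic's ancestors.
  def self.lexical
    SECRET
  rescue NameError
    :not_found
  end

  # Explicit receiver: searched in Dynamic and its ancestors, which
  # include Holder.
  def self.scoped
    self::SECRET
  end
end

p Dynamic.lexical  # :not_found
p Dynamic.scoped   # "held"
```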

In Ruby, in a method defined in class << self, why can't a constant defined on the superclass be accessed without self?

I'm trying to understand Ruby singletons and class inheritance better. I read everywhere that
def self.method_name; end
is equivalent to
class << self
def method_name; end
end
But if that were true, then I would expect print_constant_fails to work, but it doesn't. What is going on here?
class SuperExample
A_CONSTANT = "super example constant"
end
class SubExample < SuperExample
def self.print_constant_works_1
puts A_CONSTANT
end
class << self
def print_constant_works_2
puts self::A_CONSTANT
end
def print_constant_fails
puts A_CONSTANT
end
end
end
pry(main)> SubExample.print_constant_works_1
super example constant
pry(main)> SubExample.print_constant_works_2
super example constant
pry(main)> SubExample.print_constant_fails
NameError: uninitialized constant #<Class:SubExample>::A_CONSTANT
from (pry):13:in `print_constant_fails'
You have encountered a common Ruby gotcha - constant lookup.
The most important concept in constant lookup is Module.nesting (unlike in method lookup, where the primary starting point is self). This method gives you the current module nesting which is directly used by the Ruby interpreter when resolving the constant token. The only way to modify the nesting is to use keywords class and module and it only includes modules and classes for which you used that keyword:
class A
Module.nesting #=> [A]
class B
Module.nesting #=> [A::B, A]
end
end
class A::B
Module.nesting #=> [A::B] sic! no A
end
In meta programming, a module or class can be defined dynamically using Class.new or Module.new - this does not affect nesting and is an extremely common cause of bugs (ah, also worth mentioning - constants are defined on the first module of Module.nesting):
module A
B = Class.new do
VALUE = 1
end
C = Class.new do
VALUE = 2
end
end
A::B::VALUE #=> uninitialized constant A::B::VALUE
A::VALUE #=> 2
The above code will generate two warnings: one for double initialization of constant A::VALUE and a second for reassigning the constant.
If it looks like "I'd never do that" - this also applies to all the constants defined within RSpec.describe (which internally calls Class.new), so if you define a constant within your rspec tests, they are most certainly global (unless you explicitly stated the module it is to be defined in with self::)
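The self:: form mentioned above can be sketched as follows. Registry, Entry, LEAKED and OWNED are hypothetical names; the point is only where each constant ends up.

```ruby
# Hypothetical names, used only for this illustration.
module Registry
  Entry = Class.new do
    LEAKED = 1        # lands on Registry, the first module of Module.nesting
    self::OWNED = 2   # lands on the new class itself
  end
end

p Registry.const_defined?(:LEAKED, false)         # true
p Registry::Entry.const_defined?(:OWNED, false)   # true
p Registry::Entry.const_defined?(:LEAKED, false)  # false
```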
Now let's get back to your code:
class SubExample < SuperExample
puts Module.nesting.inspect #=> [SubExample]
class << self
puts Module.nesting.inspect #=> [#<Class:SubExample>, SubExample]
end
end
When resolving the constant, the interpreter first iterates over all the modules in Module.nesting and searches this constant within that module. So if nesting is [A::B, A] and we're looking for the constant with token C, the interpreter will look for A::B::C first and then A::C.
However, in your example, that will fail in both cases :). Then the interpreter starts searching ancestors of the first (and only first) module in Module.nesting. SubExample.singleton_class.ancestors gives you:
[
#<Class:SubExample>,
#<Class:SuperExample>,
#<Class:Object>,
#<Class:BasicObject>,
Class,
Module,
Object,
Kernel,
BasicObject
]
As you can see - there is no SuperExample module, only its singleton class - which is why constant lookup within class << self fails (print_constant_fails).
The ancestors of SubExample are:
[
SubExample,
SuperExample,
Object,
Kernel,
BasicObject
]
We have SuperExample there, so the interpreter will manage to find SuperExample::A_CONSTANT within this nesting.
We're left with print_constant_works_2. This is an instance method on a singleton class, so self within this method is just SubExample. So, we're looking for SubExample::A_CONSTANT - constant lookup firstly searches on SubExample and, when that fails, on all its ancestors, including SuperExample.
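The whole explanation can be condensed into a small sketch. BaseConfig, ChildConfig and DEFAULT are hypothetical stand-ins for SuperExample, SubExample and A_CONSTANT.

```ruby
# Hypothetical names, used only for this illustration.
class BaseConfig
  DEFAULT = "base"
end

class ChildConfig < BaseConfig
  class << self
    # Bare name: searched via Module.nesting, then via the ancestors of
    # the singleton class, none of which is BaseConfig.
    def lexical
      DEFAULT
    rescue NameError
      :not_found
    end

    # Explicit receiver: self is ChildConfig, whose ancestors include
    # BaseConfig.
    def scoped
      self::DEFAULT
    end
  end
end

p ChildConfig.singleton_class.ancestors.include?(BaseConfig)  # false
p ChildConfig.ancestors.include?(BaseConfig)                  # true
p ChildConfig.lexical                                         # :not_found
p ChildConfig.scoped                                          # "base"
```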
It has to do with scope. When you are inside class << self, the scope is different than when you are inside class Something. Thus, inside class << self there is actually no constant called A_CONSTANT.
In Ruby, every constant has its own path, starting from the root (main) with the :: prefix (it is the default, so we do not need to write it). I find it helps to think of class not as a static keyword but as a dynamic operation that creates a class object and a constant that points to it. Every constant defined inside a class body without an explicit path (P::Q::...) is automatically considered to belong to that class, with the path ::ClassName::A_CONSTANT.
GLOBAL = 1
class SuperExample
A_CONSTANT = "super constant" # <-- ::SuperExample::A_CONSTANT
end
puts ::GLOBAL # 1
puts ::SuperExample # SuperExample
puts ::SuperExample::A_CONSTANT # "super constant"
It looks like constant paths in child classes resolve at the same level as in the parent:
class SubExample < SuperExample
end
puts ::SubExample::A_CONSTANT # "super constant"
As I noticed, all constants (without ::) inside the class block get a path under the class path, so you can read them either with the explicit constant path or from inside the class that the constants belong to:
class SubExample < SuperExample
def self.print_constant_works_1
puts A_CONSTANT # ::SubExample::A_CONSTANT
end
def another
puts A_CONSTANT # ::SubExample::A_CONSTANT
end
def yet_another
puts SubExample::A_CONSTANT # ::SubExample::A_CONSTANT
end
end
Now check class << self
class SubExample < SuperExample
class << self
puts self # #<Class:SubExample>
def print_constant_works_2
puts self::A_CONSTANT # declare explicitly constant path
end
def print_constant_fails
puts A_CONSTANT # not declare explicitly <-- Class:SubExample::A_CONSTANT
end
end
end
As you can see, the class inside class << self is different, so the path of A_CONSTANT inside print_constant_fails points to #<Class:SubExample>, which does not define any constant A_CONSTANT, so the error uninitialized constant #<Class:SubExample>::A_CONSTANT is raised.
Meanwhile, print_constant_works_2 works because we declare the constant path explicitly, and self in this case is actually SubExample (we call SubExample.print_constant_works_2).
Now let's try an explicit path, ::A_CONSTANT, inside print_constant_fails:
def print_constant_fails
puts ::A_CONSTANT
end
The error raised is uninitialized constant A_CONSTANT, because ::A_CONSTANT is treated as a global (top-level) constant.

how rails delegate method works?

After reading the answer by jvans below and looking at the source code a few more times, I get it now :). In case anyone is still wondering how exactly Rails' delegate works: all Rails does is create a new method (with module_eval) in the file/class that you ran the delegate method from.
So for example:
class A
delegate :hello, :to => :b
end
class B
def hello
p "hello"
end
end
At the point when delegate is called, Rails creates a hello method taking (*args, &block) in class A (technically, in the file that class A is written in). In that method, all Rails does is evaluate the :to value (which should name an object or a class that is already reachable from within class A), assign it to a local variable _, and then call the method on that object or class, passing in the params.
So for delegate to work without raising an exception in our previous example, an instance of A must already have an instance variable referencing an instance of class B:
class A
attr_accessor :b
def b
@b ||= B.new
end
delegate :hello, :to => :b
end
class B
def hello
p "hello"
end
end
This is not a question on "how to use the delegate method in Rails", which I already know. I'm wondering how exactly "delegate" delegates methods :D. In the Rails 4 source code, delegate is defined on the core Ruby Module class, which makes it available as a class method throughout a Rails app.
Actually, my first question would be: how is Ruby's Module class included? I mean, every Ruby class has the ancestors Object > Kernel > BasicObject, and any module in Ruby has the same ancestors. So how exactly does Ruby add methods to all classes/modules when someone reopens the Module class?
My second question is: I understand that the delegate method in Rails uses module_eval to do the actual delegation, but I don't really understand how module_eval works.
def delegate(*methods)
options = methods.pop
unless options.is_a?(Hash) && to = options[:to]
raise ArgumentError, 'Delegation needs a target. Supply an options hash with a :to key as the last argument (e.g. delegate :hello, to: :greeter).'
end
prefix, allow_nil = options.values_at(:prefix, :allow_nil)
if prefix == true && to =~ /^[^a-z_]/
raise ArgumentError, 'Can only automatically set the delegation prefix when delegating to a method.'
end
method_prefix = \
if prefix
"#{prefix == true ? to : prefix}_"
else
''
end
file, line = caller.first.split(':', 2)
line = line.to_i
to = to.to_s
to = 'self.class' if to == 'class'
methods.each do |method|
# Attribute writer methods only accept one argument. Makes sure []=
# methods still accept two arguments.
definition = (method =~ /[^\]]=$/) ? 'arg' : '*args, &block'
# The following generated methods call the target exactly once, storing
# the returned value in a dummy variable.
#
# Reason is twofold: On one hand doing less calls is in general better.
# On the other hand it could be that the target has side-effects,
# whereas conceptually, from the user point of view, the delegator should
# be doing one call.
if allow_nil
module_eval(<<-EOS, file, line - 3)
def #{method_prefix}#{method}(#{definition}) # def customer_name(*args, &block)
_ = #{to} # _ = client
if !_.nil? || nil.respond_to?(:#{method}) # if !_.nil? || nil.respond_to?(:name)
_.#{method}(#{definition}) # _.name(*args, &block)
end # end
end # end
EOS
else
exception = %(raise DelegationError, "#{self}##{method_prefix}#{method} delegated to #{to}.#{method}, but #{to} is nil: \#{self.inspect}")
module_eval(<<-EOS, file, line - 2)
def #{method_prefix}#{method}(#{definition}) # def customer_name(*args, &block)
_ = #{to} # _ = client
_.#{method}(#{definition}) # _.name(*args, &block)
rescue NoMethodError => e # rescue NoMethodError => e
if _.nil? && e.name == :#{method} # if _.nil? && e.name == :name
#{exception} # # add helpful message to the exception
else # else
raise # raise
end # end
end # end
EOS
end
end
end
Ruby isn't reopening the module class here. In ruby the class Module and the class Class are almost identical.
Class.instance_methods - Module.instance_methods #=> [:allocate, :new, :superclass]
The main difference is that you can't 'new' a module.
Modules are Ruby's version of multiple inheritance, so when you do:
module A
end
module B
end
class C
include A
include B
end
Behind the scenes, Ruby is actually creating something called an anonymous class, so the above is roughly equivalent to:
class A
end
class B < A
end
class C < B
end
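The ordering that include produces can be inspected with ancestors. This is a small sketch with hypothetical names (Walk, Swim, Duck); it also shows that the equivalence is only approximate, since superclass skips included modules.

```ruby
# Hypothetical names, used only for this illustration.
module Walk
  def move; "walk"; end
end

module Swim
  def move; "swim"; end
end

class Duck
  include Walk
  include Swim
end

# The module included last sits closest to the class in the chain.
p Duck.ancestors.take(3)  # [Duck, Swim, Walk]
p Duck.new.move           # "swim"

# Included modules are not reported as the superclass.
p Duck.superclass         # Object
```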
module_eval here is a little deceptive. Nothing in the code you're looking at deals with modules. class_eval and module_eval are the same thing: they just reopen the class they're called on, so if you want to add methods to a class C you can do:
C.class_eval do
def my_new_method
end
end
or
C.module_eval do
def my_new_method
end
end
both of which are equivalent to manually reopening the class and defining the method
class C
end
class C
def my_new_method
end
end
So when they call module_eval in the source above, they're just reopening the class it is being called on and dynamically defining the methods that you're delegating.
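To make the mechanism concrete, here is a heavily simplified sketch of the same idea. It is not the Rails implementation: MiniDelegation and mini_delegate are hypothetical names, and the sketch ignores :prefix, :allow_nil and the nil-target error handling shown in the source above.

```ruby
# Hypothetical, simplified stand-in for delegate; not the Rails code.
module MiniDelegation
  def mini_delegate(method_name, to:)
    # Build the method as a string and evaluate it in the calling class,
    # just like reopening the class and writing the method by hand.
    module_eval(<<~RUBY, __FILE__, __LINE__ + 1)
      def #{method_name}(*args, &block)
        #{to}.#{method_name}(*args, &block)
      end
    RUBY
  end
end

class Greeter
  def hello(name)
    "hello, #{name}"
  end
end

class Host
  extend MiniDelegation

  def greeter
    @greeter ||= Greeter.new
  end

  mini_delegate :hello, to: :greeter
end

p Host.new.hello("world")            # "hello, world"
p Host.instance_method(:hello).owner # Host
```

The generated hello method is an ordinary instance method of Host, which is why the target named by to: only has to be callable at the time hello runs, not when mini_delegate is called.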
I think this will answer your question better:
Class.ancestors #=> [Module, Object, PP::ObjectMixin, Kernel, BasicObject]
Since every class in Ruby is an instance of Class, the method lookup chain will go through all of these until it finds what it's looking for. By reopening Module you add behavior to every class and module. The ancestor chain here is a little deceptive: since BasicObject.class #=> Class and Module is in Class's lookup hierarchy, even BasicObject picks up behavior from reopening Module. The advantage of reopening Module here over Class is that you can now call this method from within a module as well as within a class! Very cool, learned something here myself.
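A short sketch of that last point: a method added to Module becomes callable on both classes and modules. kind_label is a hypothetical helper defined only for this illustration.

```ruby
class Module
  # Hypothetical helper, defined only for this illustration.
  def kind_label
    "#{name} (#{self.class})"
  end
end

p Class.ancestors.include?(Module)  # true
p String.kind_label                 # "String (Class)"
p Comparable.kind_label             # "Comparable (Module)"
```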

Difference between "class A; class B" and "class A::B"

What's the difference between:
class A
class B
end
end
and
class A
end
class A::B
end
Update: These 2 approaches are not exactly the same.
In the second approach, B doesn't have access to constants defined in A.
Also, as Matheus Moreira correctly stated, in the second approach, A must be defined before A::B can be defined.
What other differences are there?
In Ruby, modules and classes are instances of the Module and Class classes, respectively. They derive their names from the constant they are assigned to. When you write:
class A::B
# ...
end
You are effectively writing:
A::B ||= Class.new do
# ...
end
Which is valid constant assignment syntax, and assumes that the A constant has been properly initialized and that it refers to a Module or a Class.
For example, consider how classes are usually defined:
class A
# ...
end
What is effectively happening is this:
Object::A ||= Class.new do
# ...
end
Now, when you write:
class A
class B
# ...
end
end
What actually happens looks like this:
(Object::A ||= Class.new).class_eval do
(A::B ||= Class.new).class_eval do
# ...
end
end
Here's what is happening, in order:
A new Class instance is assigned to the A constant of Object, unless it was already initialized.
A new Class instance is assigned to the B constant of A, unless it was already initialized.
This ensures the existence of all outer classes before attempting to define any inner classes.
There is also a change in scope, which allows you to directly access A's constants. Compare:
class A
MESSAGE = "I'm here!"
end
# Scope of Object
class A::B
# Scope of B
puts MESSAGE # NameError: uninitialized constant A::B::MESSAGE
end
# Scope of Object
class A
# Scope of A
class B
# Scope of B
puts MESSAGE # I'm here!
end
end
According to this blog post, the Ruby core team calls the "current class" the cref. Unfortunately, the author does not elaborate, but as he notes, it is separate from the context of self.
As explained here, the cref is a linked list that represents the nesting of modules at some point in time.
The current cref is used for constant and class variable lookup and
for def, undef and alias.
As the others have stated, they are different ways of expressing the same thing.
There is, however, a subtle difference. When you write class A::B, you assume that the A class has already been defined. If it has not, you will get a NameError and B will not be defined at all.
Writing properly nested modules:
class A
class B
end
end
Ensures the A class exists before attempting to define B.
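The failure mode described here is easy to reproduce. In this sketch MissingOuter and PresentOuter are hypothetical names; MissingOuter is deliberately never defined.

```ruby
# MissingOuter is intentionally undefined (hypothetical name).
outcome =
  begin
    class MissingOuter::Inner; end
    :defined
  rescue NameError => e
    e.class
  end

p outcome                                 # NameError
p Object.const_defined?(:MissingOuter)    # false

# The nested form creates the outer class first if it does not exist.
class PresentOuter
  class Inner; end
end

p PresentOuter::Inner.name                # "PresentOuter::Inner"
```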
Two different ways to say the same thing. That thing is that class B is an inner or nested class and can only be accessed through the A interface.
> class A
.. def say
.... "In A"
....end
..
.. class B
.... def say
...... "In B"
......end
....end
..end
=> nil
> A.new.say
=> "In A"
> B.new.say
=> #<NameError: uninitialized constant B>
> A::B.new.say
=> "In B"
versus
> class A
.. def say
.... "In A"
....end
..end
=> nil
> class A::B
.. def say
.... "In B"
....end
..end
=> nil
> A.new.say
=> "In A"
> B.new.say
=> #<NameError: uninitialized constant B>
> A::B.new.say
=> "In B"
>
They are the same. They are different ways of writing the same thing. The first one is the naive way of writing it, but often, it gets hard to keep track of the nesting once the class/module gets large. Using the second way, you can avoid nesting in the appearance.

Detecting that a method was not overridden

Say, I have the following 2 classes:
class A
def a_method
end
end
class B < A
end
Is it possible to detect from within (an instance of) class B that method a_method is only defined in the superclass, thus not being overridden in B?
Update: the solution
While I have marked the answer of Chuck as "accepted", later Paolo Perrota made me realize that the solution can apparently be simpler, and it will probably work with earlier versions of Ruby, too.
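Detecting if "a_method" is overridden in B (on Ruby 1.9 and later, pass the symbol :a_method instead, since instance_methods returns symbols there):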
B.instance_methods(false).include?("a_method")
And for class methods we use singleton_methods similarly:
B.singleton_methods(false).include?("a_class_method")
If you're using Ruby 1.8.7 or above, it's easy with Method#owner/UnboundMethod#owner.
class Module
def implements_instance_method(method_name)
instance_method(method_name).owner == self
rescue NameError
false
end
end
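For a quick check without patching Module, the same owner comparison can be done inline. This is a small sketch with hypothetical classes Vehicle and Car, using symbols as returned by Ruby 1.9 and later.

```ruby
# Hypothetical names, used only for this illustration.
class Vehicle
  def start; end
  def stop; end
end

class Car < Vehicle
  def start; end
end

# owner reports the class or module where the method body was defined.
p Car.instance_method(:start).owner             # Car
p Car.instance_method(:stop).owner              # Vehicle

# instance_methods(false) lists only methods defined directly on Car.
p Car.instance_methods(false).include?(:start)  # true
p Car.instance_methods(false).include?(:stop)   # false
```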
class A
def m1; end
def m2; end
end
class B < A
def m1; end
def m3; end
end
obj = B.new
methods_in_class = obj.class.instance_methods(false) # => ["m1", "m3"]
methods_in_superclass = obj.class.superclass.instance_methods(false) # => ["m2", "m1"]
methods_in_superclass - methods_in_class # => ["m2"]
You can always do the following and see if it's defined there:
a = A.new
a.methods.include?(:method)
Given an object b which is an instance of B, you can test to see whether b's immediate superclass has a_method:
b.class.superclass.instance_methods.include? 'a_method'
Notice that the test is against the method name, not a symbol or a method object. (This holds for Ruby 1.8; on Ruby 1.9 and later, instance_methods returns symbols, so test against :a_method.)
"thus not being overridden in B" - Just knowing that the method is only defined in A is difficult, because you can define the method on individual instances of A and B... so I think it's going to be difficult to test that a_method is only defined on A, because you'd have to round up all the subclasses and subinstances in the system and test them...

Resources