I was reading a blog post about Service Objects by Dave Copeland and came across the following line:
A class in Ruby is a global symbol, which means that class methods are global symbols. Coding to globals is why we don’t use PHP anymore.
I'd like to understand this statement a little more and have some questions.
How are class methods and instance methods different in the context of symbols?
For example, take the following irb session:
irb(main):001:0> Symbol.all_symbols.grep /Foo/
=> []
irb(main):002:0> Symbol.all_symbols.grep /some.*method/
=> []
irb(main):003:0> class Foo
irb(main):004:1> def some_instance_method; end
irb(main):005:1> def self.some_class_method; end
irb(main):006:1> end
=> :some_class_method
irb(main):007:0> Symbol.all_symbols.grep /Foo/
=> [:Foo]
irb(main):008:0> Symbol.all_symbols.grep /some.*method/
=> [:some_instance_method, :some_class_method]
How are #some_instance_method and ::some_class_method different in the context of symbols?
What am I doing when I check Symbol.all_symbols is this the same thing as viewing the "global symbols"?
Why are both #some_instance_method and ::some_class_method shown? After reading the above quote I would have expected the result of 008 to be:
irb(main):008:0> Symbol.all_symbols.grep /some.*method/
=> [:some_instance_method]
I think Dave was a little unclear in the way he phrased that, but he explains the impact in the paragraphs following your excerpt:
A great example of where a service-as-a-global-symbol is problematic is Resque. All Resque methods are available via Resque, which means that any Ruby VM has exactly one resque it can use.
[…]
If, on the other hand, Resque was implemented as an object, instead of a global, any code that needed to access a different Resque instance would not have to change—it would just be given a different object.
The difference is in the interface: with Resque, users of the tool “depend on” and interface with specific classes — they're objects, but they're objects relegated to treatment as globals. This is in opposition to interfacing with instance methods on an object, wherein any other object can be subbed in without the dependence on the class of the object.
Thus using class-methods on a global (like an unscoped class definition) is, Dave argues, akin to using global methods, a la PHP.
I see several problems with that article:
First off, the use of the word "symbol" may be confusing. While the word does, in fact, perfectly describe what the author means, some readers may confuse it with the Symbol datatype in Ruby. So, while not wrong, the choice of words is unfortunate in the context of Ruby. "Name" might have been a better choice.
Secondly, he makes an artificial distinction between class and instance methods, but there is no such thing as a class method in Ruby. Ruby only has exactly one kind of methods: instance methods. What we call "singleton methods" are actually just regular instance methods of the singleton class, and what we call "class methods" are actually just regular instance methods of the singleton class of an object that happens to be an instance of the Class class.
Thirdly, he makes an artificial distinction between classes and objects, but classes are objects in Ruby.
It seems that we he is really arguing against, are constants (because they are global names), singletons (which classes usually are), and static state. And while it is certainly true that all of those are bad, he should say so, if that's what he means. (It's also not exactly a new discovery; entire programming languages have been designed based on the avoidance of static state, e.g. Newspeak.)
tl;dr summary: The article argues against global names, singletons, and static state, but is badly presented and phrased.
Question 1
The :some_instance_method and :some_class_method symbols merely exist in Ruby's symbol table. They are not different in the context of symbols. The Symbol.all_symbols result doesn't declare anything about the objects being referenced. If you had:
class Aaa
def kick_it
logger.debug { "You kicked an Aaa object" }
end
end
module Bbb
def self.kick_it
logger.debug { "You kicked Bbb" }
end
end
You would see only 1 :kick_it reported by Symbol.all_symbols, even though one of them is a module-level method and the other is an instance method.
Question 2
The use of the word "symbol" in the article probably made it confusing. A "global symbol" here probably means a name for a member of the set Object.constants, or any other constant accessible in the defined constants subtree.
So Symbol.all_symbols is not the same as "global symbols" in this case. However all names in the in-memory constants tree would be a subset of Symbol.all_symbols, keeping in mind that all scoping information is lost there.
Question 3
I think the Question 1 answer above also explains why both symbols are shown in the Symbol.all_symbols result.
Related
Rubocop dislikes the following; it issues Pass a binding, __FILE__ and __LINE__ to eval.:
sort_lambda = eval "->(a) { a.date }"
Yes, I know that eval is a security problem. The issue of security is out of scope for this question.
The Ruby documentation on binding says:
Objects of class Binding encapsulate the execution context at some particular place in the code and retain this context for future use. The variables, methods, value of self, and possibly an iterator block that can be accessed in this context are all retained. Binding objects can be created using Kernel#binding, and are made available to the callback of Kernel#set_trace_func and instances of TracePoint.
These binding objects can be passed as the second argument of the Kernel#eval method, establishing an environment for the evaluation.
The lambda being created does not need to access any variables in any scopes.
A quick and dirty binding to the scope where the eval is invoked from would look like this:
sort_lambda = eval "->(a) { a.date }", self.binding, __FILE__, __LINE__
Ideally, a null binding (a binding without anything defined in it, nothing from self, etc.) should be passed to this eval instead.
How could this be done?
Not exactly, but you can approximate it.
Before I go further, I know you've already said this, but I want to emphasize it for future readers of this question as well. What I'm describing below is NOT a sandbox. This will NOT protect you from malicious users. If you pass user input to eval, it can still do a lot of damage with the binding I show you below. Consult a cybersecurity expert before trying this in production.
Great, with that out of the way, let's move on. You can't really have an empty binding in Ruby. The Binding class is sort of compile-time magic. Although the class proper only exposes a way to get local variables, it also captures any constant names (including class names) that are in scope at the time, as well as the current receiver object self and all methods on self that can be invoked from the point of execution. The problem with an empty binding is that Ruby is a lot like Smalltalk sometimes. Everything exists in one big world of Platonic ideals called "objects", and no Ruby code can truly run in isolation.
In fact, trying to do so is really just putting up obstacles and awkward goalposts. Think you can block me from accessing BasicObject? If I have literally any object a in Ruby, then a.class.ancestors.last is BasicObject. Using this technique, we can get any global class by simply having an instance of that class or a subclass. Once we have classes, we have modules, and once we have modules we have Kernel, and at that point we have most of the Ruby built-in functionality.
Likewise, self always exists. You can't get rid of it. It's a fundamental part of the Ruby object system, and it exists even in situations where you don't think it does (see this question of mine from awhile back, for instance). Every method or block of code in Ruby has a receiver, so the most you can do is try to limit the receiver to be as small an object as possible. One might think you want self to be BasicObject, but amusingly there's not really a way to do that either, since you can only get a binding if Kernel is in scope, and BasicObject doesn't include Kernel. So at minimum, you're getting all of Kernel. You might be able to skimp by somehow and use some subclass of BasicObject that includes Kernel, thereby avoiding other Object methods, but that's likely to cause confusion down the road too.
All of this is to emphasize that a hypothetical null binding would really only make it slightly more complicated to get all of the global names, not impossible. And that's why it doesn't exist.
That being said, if your goal is to eliminate local variables and to try, you can get that easily by creating a binding inside of a module.
module F
module_function def get_binding
binding
end
end
sort_lambda = eval "->(a) { a.date }", F.get_binding
This binding will never have local variables, and the methods and constants it has access to are limited to those available in Kernel or at the global scope. That's about as close to "null" as you're going to get in the complex nexus of interconnected types and names we call Ruby.
While I originally left this as a comment on #Silvio Mayolo's answer, which is very well written, it seems germane to post it as an answer instead.
While most of what is contained within that answer is correct we can get slightly closer to a "Null Binding" through BasicObject inheritance:
class NullBinding < BasicObject
def get_binding
::Kernel
.instance_method(:binding)
.bind(self)
.call
end
end
This binding context has as limited a context as possible in ruby.
Using this context you will be unable to reference constants solely by name:
eval 'Class', NullBinding.new.get_binding
#=> NameError
That being said you can still reference the TOP_LEVEL scope so
eval '::Class', NullBinding.new.get_binding
#=> Class
The methods directly available in this binding context are limited only to the instance methods available to BasicObject. By way of Example:
eval "puts 'name'", NullBinding.new.get_binding
#=> NoMethodError
Again with the caveat that you can access TOP_LEVEL scope so:
eval "::Kernel.puts 'name'", NullBinding.new.get_binding
# name
#=> nil
Most of the Factorybot factories are like:
FactoryBot.define do
factory :product do
association :shop
title { 'Green t-shirt' }
price { 10.10 }
end
end
It seems that inside the ":product" block we are building a data structure, but it's not the typical hashmap, the "keys" are not declared through symbols and commas aren't used.
So my question is: what kind of data structure is this? and how it works?
How declaring "association" inside the block doesn't trigger a:
NameError: undefined local variable or method `association'
when this would happen on many other situations. Is there a subject in compsci related to this?
The block is not a data structure, it's code. association and friends are all method calls, probably being intercepted by method_missing. Here's an example using that same technique to build a regular hash:
class BlockHash < Hash
def method_missing(key, value=nil)
if value.nil?
return self[key]
else
self[key] = value
end
end
def initialize(&block)
self.instance_eval(&block)
end
end
With which you can do this:
h = BlockHash.new do
foo 'bar'
baz :zoo
end
h
#=> {:foo=>"bar", :baz=>:zoo}
h.foo
#=> "bar"
h.baz
#=> :zoo
I have not worked with FactoryBot so I'm going to make some assumptions based on other libraries I've worked with. Milage may vary.
The basics:
FactoryBot is a class (Obviously)
define is a static method in FactoryBot (I'm going to assume I still haven't lost you ;) ).
Define takes a block which is pretty standard stuff in ruby.
But here's where things get interesting.
Typically when a block is executed it has a closure relative to where it was declared. This can be changed in most languages but ruby makes it super easy. instance_eval(block) will do the trick. That means you can have access to methods in the block that weren't available outside the block.
factory on line 2 is just such a method. You didn't declare it, but the block it's running in isn't being executed with a standard scope. Instead your block is being immediately passed to FactoryBot which passes it to a inner class named DSL which instance_evals the block so its own factory method will be run.
line 3-5 don't work that way since you can have an arbitrary name there.
ruby has several ways to handle missing methods but the most straightforward is method_missing. method_missing is an overridable hook that any class can define that tells ruby what to do when somebody calls a method that doesn't exist.
Here it's checking to see if it can parse the name as an attribute name and use the parameters or block to define an attribute or declare an association. It sounds more complicated than it is. Typically in this situation I would use define_method, define_singleton_method, instance_variable_set etc... to dynamically create and control the underlying classes.
I hope that helps. You don't need to know this to use the library the developers made a domain specific language so people wouldn't have to think about this stuff, but stay curious and keep growing.
Suppose I have a file example.rb like so:
# example.rb
class Example
def foo
5
end
end
that I load with require or require_relative. If I didn't know that example.rb defined Example, is there a list (other than ObjectSpace) that I could inspect to find any objects that had been defined? I've tried checking global_variables but that doesn't seem to work.
Thanks!
Although Ruby offers a lot of reflection methods, it doesn't really give you a top-level view that can identify what, if anything, has changed. It's only if you have a specific target you can dig deeper.
For example:
def tree(root, seen = { })
seen[root] = true
root.constants.map do |name|
root.const_get(name)
end.reject do |object|
seen[object] or !object.is_a?(Module)
end.map do |object|
seen[object] = true
puts object
[ object.to_s, tree(object, seen) ]
end.to_h
end
p tree(Object)
Now if anything changes in that tree structure you have new things. Writing a diff method for this is possible using seen as a trigger.
The problem is that evaluating Ruby code may not necessarily create all the classes that it will or could create. Ruby allows extensive modification to any and all classes, and it's common that at run-time it will create more, or replace and remove others. Only libraries that forcibly declare all of their modules and classes up front will work with this technique, and I'd argue that's a small portion of them.
It depends on what you mean by "the global namespace". Ruby doesn't really have a "global" namespace (except for global variables). It has a sort-of "root" namespace, namely the Object class. (Although note that Object may have a superclass and mixes in Kernel, and stuff can be inherited from there.)
"Global" constants are just constants of Object. "Global functions" are just private instance methods of Object.
So, you can get reasonably close by examining global_variables, Object.constants, and Object.instance_methods before and after the call to require/require_relative.
Note, however, that, depending on your definition of "global namespace" (private) singleton methods of main might also count, so you check for those as well.
Of course, any of the methods the script added could, when called at a later time, themselves add additional things to the global scope. For example, the following script adds nothing to the scope, but calling the method will:
class String
module MyNonGlobalModule
def self.my_non_global_method
Object.const_set(:MY_GLOBAL_CONSTANT, 'Haha, gotcha!')
end
end
end
Strictly speaking, however, you asked about adding "objects" to the global namespace, and neither constants nor methods nor variables are objects, soooooo … the answer is always "none"?
In the Ruby Standard Library we have the Singleton class:
http://ruby-doc.org/stdlib/libdoc/singleton/rdoc/index.html
We can make any class a singleton by including this class inside. I just rarely see this used. When would it make sense to use this Singleton class vs. just using plain old class methods - also known as singleton methods?
Said in another way: Which Singleton coding-convention are the best and why? Here are three ways I could think of:
require 'singleton'
class Foo
include Singleton
# method definitions go here...
end
Foo.instance.do_something!
Versus
class Foo
class << self
# method definitions go here...
end
end
Foo.do_something!
Versus
module Foo
class << self
# method definitions go here...
end
end
Foo.do_something!
WARNING: Opinions ahead!
If you just need a single object, just use a single object:
class << (Foo = Object.new)
# method definitions go here...
end
Foo.do_something!
Modules are for sharing behavior between objects. Classes are factories for objects. Note the plural: if you have only one object, you need neither a facility for sharing behavior nor a factory for producing multiple copies.
Whether or not this is considered idiomatic Ruby depends largely on which "Ruby" you are talking about: are you talking about Ruby as used by Rails programmers, Ruby as used by Smalltalk programmers, Ruby as used by Java programmers or Ruby as used by Ruby programmers? There are significant differences in the styles used by these different communities.
For example, old time Rubyists like David Alan Black tend to start always with just objects and singleton methods. Only if they later on discover duplicated behavior between objects will they extract that behavior into a mixin and extend the objects with it. (Note that at this stage, there still aren't any classes!)
Classes are created, again, only by refactoring, not by upfront design, when and only when there is duplicated structure between objects.
The most common approach I've seen is neither a singleton nor a class that you never instantiate. The most common and idiomatic way to do this is a module.
I am studying the ruby object model and have some questions. I understand the idea that an object only stores instance variables, and methods are stored in the class, which an object has a reference to. I also understand the idea of 'self' -- what it is, how it changes, etc.
However, what I don't understand is the notion that 'classes are objects.' Is there a good, intuitive explanation anywhere?
(BTW: I'm using Ruby Object Model and Metaprogramming and Metaprogramming Ruby as my two resources. If anybody can suggest something else, that would be helpful.)
Thanks.
It means precisely what it sounds like — classes are objects. Specifically, they are instances of the class Class, which is itself a subclass of the class Module, which in turn is a subclass of Object, just like every other class in Ruby. Like any other object in Ruby, a class can respond to messages, have its own instance variables, etc.
As a practical example, let's take private.
class Person
attr_accessor :name, :height
private
attr_accessor :weight
end
This gives instances of Person public methods to access the person's name and height, but the accessors for the person's weight are private. BUTBUTBUT — rather than being a keyword like in most languages, private is an ordinary method of the Module class. If we wanted, we could redefine it to do something different for a particular class hierarchy.
class RichardStallman
def self.private(*args)
puts "NO! INFORMATION WAS MEANT TO BE FREE!"
end
end
Here's my shot at one.
In Ruby, classes are objects. Usually they have class Class. For example, let's consider the class Foo.
class Foo
end
Doubtless you've seen this before, and it's not terribly exciting. But we could also have defined Foo this way:
Foo = Class.new
Just as you'd create a new Foo by calling Foo.new, you can create a new Class by calling Class.new. Then you give that class the name Foo by assigning it, just like any other variable. That's all there is to it.
The notion of "classes are objects" ( as I understand it ) implies that anything you can do with an object, you can do it with a class.
This differs from other programming languages where the class and the class definition are special artifacts different from objects and often unaccessible to the runtime.
For instance in Ruby, you can modify any object at runtime, since classes are also objects you can modify the class it self and add methods at runtime, delete methods, or add and delete attributes at runtime.
For instance:
$ irb
>> x = Object.new
=> #<Object:0x1011ce560>
>> x.to_s
=> "#<Object:0x1011ce560>"
>> undef to_s
=> nil
>> x.to_s
NoMethodError: undefined method `to_s' for #<Object:0x1011ce560>
from (irb):4
>>
That's not possible on other programming languages where a distinction between objects and classes is made.
note: Probably you should understand basic Ruby concepts before going to meta programming as it may be confusing, that what I would do.
Look at this article, you may find it helpful:
The Ruby Object Model - Structure and Semantics
Personally I learned a lot about the Ruby object model by reading about the Smalltalk one (e.g. in the Squeak documentation). And depending on how fluent you are in C, the MRI sources are quite approachable and yield the most definite answers.
When you think of it, it's completely logical for new to be a function, right? A function which creates and returns a new object. (Unlike most of other languages where new is some kind of operator or a language construct.)
Pushing it further, even more logical for this function new is that it should be a method, if we are talking about an OO language. Whose method? A method of an object, just a little bit different sort of an object that we can call "class".
So, looking it that way, classes are just special kinds of objects, objects that, among other peculiarities, have method new and know how to create other objects based on their own image.
what I don't understand is the notion that 'classes are objects.' Is there a good, intuitive explanation anywhere?
An answer to SO thread `Visual representation of Ruby Object Model' links to an excellent video on the subject.