All over the Internet, I see people using "include" to bring new functionality to the main scope.
Most recenty, I saw this in an SO answer:
require 'fileutils' #I know, no underscore is not ruby-like
include FileUtils
# Gives you access (without prepending by 'FileUtils.') to
cd(dir, options)
cd(dir, options) {|dir| .... }
pwd()
Extend works too, but from my understanding, ONLY extend should work. I don't even know why the main object has the include functionality.
Inside the main scope:
self.class == Object #true
Object.new.include #NoMethodError: undefined method `include' for #<Object:0x000000022c66c0>
An I mean logically, since self.is_a?(Module) == false when inside the main scope, main shouldn't even have the include functionality, since include is used to add methods to child instances and main isn't a class or a module so there are no child instances to speak of.
My question is, why does include work in the main scope, what was the design decision that led to making it work there, and lastly, shouldn't we prefer "extend" in that case so as to not make the "extend/include" functionality even more confusing than it already may be thanks to people often using ruby hooks to invoke the other when one is called.
The main scope is "special". Instance methods defined at the main scope become private instance methods of Object, constants become constants of Object, and include includes into Object. There's probably other things I'm missing. (E.g. presumably, prepend prepends to Object, I never thought about it until now.)
Related
Rubocop dislikes the following; it issues Pass a binding, __FILE__ and __LINE__ to eval.:
sort_lambda = eval "->(a) { a.date }"
Yes, I know that eval is a security problem. The issue of security is out of scope for this question.
The Ruby documentation on binding says:
Objects of class Binding encapsulate the execution context at some particular place in the code and retain this context for future use. The variables, methods, value of self, and possibly an iterator block that can be accessed in this context are all retained. Binding objects can be created using Kernel#binding, and are made available to the callback of Kernel#set_trace_func and instances of TracePoint.
These binding objects can be passed as the second argument of the Kernel#eval method, establishing an environment for the evaluation.
The lambda being created does not need to access any variables in any scopes.
A quick and dirty binding to the scope where the eval is invoked from would look like this:
sort_lambda = eval "->(a) { a.date }", self.binding, __FILE__, __LINE__
Ideally, a null binding (a binding without anything defined in it, nothing from self, etc.) should be passed to this eval instead.
How could this be done?
Not exactly, but you can approximate it.
Before I go further, I know you've already said this, but I want to emphasize it for future readers of this question as well. What I'm describing below is NOT a sandbox. This will NOT protect you from malicious users. If you pass user input to eval, it can still do a lot of damage with the binding I show you below. Consult a cybersecurity expert before trying this in production.
Great, with that out of the way, let's move on. You can't really have an empty binding in Ruby. The Binding class is sort of compile-time magic. Although the class proper only exposes a way to get local variables, it also captures any constant names (including class names) that are in scope at the time, as well as the current receiver object self and all methods on self that can be invoked from the point of execution. The problem with an empty binding is that Ruby is a lot like Smalltalk sometimes. Everything exists in one big world of Platonic ideals called "objects", and no Ruby code can truly run in isolation.
In fact, trying to do so is really just putting up obstacles and awkward goalposts. Think you can block me from accessing BasicObject? If I have literally any object a in Ruby, then a.class.ancestors.last is BasicObject. Using this technique, we can get any global class by simply having an instance of that class or a subclass. Once we have classes, we have modules, and once we have modules we have Kernel, and at that point we have most of the Ruby built-in functionality.
Likewise, self always exists. You can't get rid of it. It's a fundamental part of the Ruby object system, and it exists even in situations where you don't think it does (see this question of mine from awhile back, for instance). Every method or block of code in Ruby has a receiver, so the most you can do is try to limit the receiver to be as small an object as possible. One might think you want self to be BasicObject, but amusingly there's not really a way to do that either, since you can only get a binding if Kernel is in scope, and BasicObject doesn't include Kernel. So at minimum, you're getting all of Kernel. You might be able to skimp by somehow and use some subclass of BasicObject that includes Kernel, thereby avoiding other Object methods, but that's likely to cause confusion down the road too.
All of this is to emphasize that a hypothetical null binding would really only make it slightly more complicated to get all of the global names, not impossible. And that's why it doesn't exist.
That being said, if your goal is to eliminate local variables and to try, you can get that easily by creating a binding inside of a module.
module F
module_function def get_binding
binding
end
end
sort_lambda = eval "->(a) { a.date }", F.get_binding
This binding will never have local variables, and the methods and constants it has access to are limited to those available in Kernel or at the global scope. That's about as close to "null" as you're going to get in the complex nexus of interconnected types and names we call Ruby.
While I originally left this as a comment on #Silvio Mayolo's answer, which is very well written, it seems germane to post it as an answer instead.
While most of what is contained within that answer is correct we can get slightly closer to a "Null Binding" through BasicObject inheritance:
class NullBinding < BasicObject
def get_binding
::Kernel
.instance_method(:binding)
.bind(self)
.call
end
end
This binding context has as limited a context as possible in ruby.
Using this context you will be unable to reference constants solely by name:
eval 'Class', NullBinding.new.get_binding
#=> NameError
That being said you can still reference the TOP_LEVEL scope so
eval '::Class', NullBinding.new.get_binding
#=> Class
The methods directly available in this binding context are limited only to the instance methods available to BasicObject. By way of Example:
eval "puts 'name'", NullBinding.new.get_binding
#=> NoMethodError
Again with the caveat that you can access TOP_LEVEL scope so:
eval "::Kernel.puts 'name'", NullBinding.new.get_binding
# name
#=> nil
I'm reproducing a bug in my Rails console. I'm interested in what some methods return, but some of them turn out to be private, so in my console I have to write:
> my_object.my_method
NoMethodError (private method `my_method' called for #<MyClass:0x0123456789ABCDEF>)
> my_object.send(:my_method)
This gets a bit tedious after a while, especially since it's not obvious which are private without drilling through to the class they're defined in.
Is there any way I could temporarily make all methods public? I don't intend to use it in production, just temporarily in my local console while I'm debugging.
UPDATE: when I say “all methods”, I don’t just mean the ones on my_object. I mean literally all methods on every object.
To make all methods public, this should work:
ObjectSpace.each_object(Module) do |m|
m.send(:public, *m.private_instance_methods(false))
end
ObjectSpace.each_object traverses all modules (that includes classes and singleton classes) and makes their (own) private_instance_methods public.
To just make a single object public, you could use:
my_object.singleton_class.send(:public, *my_object.private_methods)
By changing the singleton class, only the my_object instance is affected.
Note that by default, private_methods returns inherited methods too, including many from Kernel. You might want to pass false to just include the object's own private methods.
Something like
my_object.class.private_instance_methods.each { |m| my_object.class.send :public, m }
might help you shoot a leg, I guess. :) (but the answer above is way better)
Suppose I have a file example.rb like so:
# example.rb
class Example
def foo
5
end
end
that I load with require or require_relative. If I didn't know that example.rb defined Example, is there a list (other than ObjectSpace) that I could inspect to find any objects that had been defined? I've tried checking global_variables but that doesn't seem to work.
Thanks!
Although Ruby offers a lot of reflection methods, it doesn't really give you a top-level view that can identify what, if anything, has changed. It's only if you have a specific target you can dig deeper.
For example:
def tree(root, seen = { })
seen[root] = true
root.constants.map do |name|
root.const_get(name)
end.reject do |object|
seen[object] or !object.is_a?(Module)
end.map do |object|
seen[object] = true
puts object
[ object.to_s, tree(object, seen) ]
end.to_h
end
p tree(Object)
Now if anything changes in that tree structure you have new things. Writing a diff method for this is possible using seen as a trigger.
The problem is that evaluating Ruby code may not necessarily create all the classes that it will or could create. Ruby allows extensive modification to any and all classes, and it's common that at run-time it will create more, or replace and remove others. Only libraries that forcibly declare all of their modules and classes up front will work with this technique, and I'd argue that's a small portion of them.
It depends on what you mean by "the global namespace". Ruby doesn't really have a "global" namespace (except for global variables). It has a sort-of "root" namespace, namely the Object class. (Although note that Object may have a superclass and mixes in Kernel, and stuff can be inherited from there.)
"Global" constants are just constants of Object. "Global functions" are just private instance methods of Object.
So, you can get reasonably close by examining global_variables, Object.constants, and Object.instance_methods before and after the call to require/require_relative.
Note, however, that, depending on your definition of "global namespace" (private) singleton methods of main might also count, so you check for those as well.
Of course, any of the methods the script added could, when called at a later time, themselves add additional things to the global scope. For example, the following script adds nothing to the scope, but calling the method will:
class String
module MyNonGlobalModule
def self.my_non_global_method
Object.const_set(:MY_GLOBAL_CONSTANT, 'Haha, gotcha!')
end
end
end
Strictly speaking, however, you asked about adding "objects" to the global namespace, and neither constants nor methods nor variables are objects, soooooo … the answer is always "none"?
In the book OO Design in Ruby, Sandi Metz says that the main use of modules is to implement duck types with them and include them in every class needed. Why is the Ruby Kernel a module included in Object? As far as I know it isn't used anywhere else. What's the point of using a module?
Ideally,
Methods in spirit (that are applicable to any object), that is, methods that make use of the receiver, should be defined on the Object class, while
Procedures (provided globally), that is, methods that ignore the receiver, should be collected in the Kernel module.
Kernel#puts, for example doesn't do anything with its receiver; it doesn't call private methods on it, it doesn't access any instance variables of it, it only acts on its arguments.
Procedures in Ruby are faked by using Ruby's feature that a receiver that is equal to self can be omitted. They are also often made private to prevent them from being called with an explicit receiver and thus being even more confusing. E.g., "Hello".puts would print a newline and nothing else since puts only cares about its arguments, not its receiver. By making it private, it can only be called as puts "Hello".
In reality, due to the long history of Ruby, that separation hasn't always been strictly followed. It is also additionally complicated by the fact that some Kernel methods are documented in Object and vice versa, and even further by the fact that when you define something which looks like a global procedure, and which by the above reasoning should then end up in Kernel, it actually ends up as a private instance method in Object.
As you already pointed out: Modules provide a way to collect and structure behavior, so does the Kernel module. This module is mixed in early into the class Object so every Ruby class will provide these methods. There is only a BasicObject before in hierarchy, it's child Objects purpose is only to get extended by the Kernel methods. BasicObject has only 7 methods that very very basic like new, __send__ or __id__.
class Object < BasicObject
include Kernel # all those many default methods we appreciate :)
end
Sinatra defines a number of methods which appear to live in the current scope, i.e. not within a class declaration. These are defined in the Sinatra gem.
I'd like to be able to write a gem which will create a function which I can call from the global scope, e.g
add_blog(:my_blog)
This would then call the function my_blog in the global scope.
Obviously I could monkeypatch the Object class in the gem with the add_blog function, but this seems like overkill as it would extend every object.
TL;DR extend-ing a module at the top level adds its methods to the top level without adding them to every object.
There are three ways to do this:
Option 1: Write a top-level method in your gem
#my_gem.rb
def add_blog(blog_name)
puts "Adding blog #{blog_name}..."
end
#some_app.rb
require 'my_gem' #Assume your gem is in $LOAD_PATH
add_blog 'Super simple blog!'
This will work, but it's not the neatest way of doing it: it's impossible to require your gem without adding the method to the top level. Some users may want to use your gem without this.
Ideally we'd have some way of making it available in a scoped way OR at the top level, according to the user's preference. To do that, we'll define your method(s) within a module:
#my_gem.rb
module MyGem
#We will add methods in this module to the top-level scope
module TopLevel
def self.add_blog(blog_name)
puts "Adding blog #{blog_name}..."
end
end
#We could also add other, non-top-level methods or classes here
end
Now our code is nicely scoped. The question is how do we make it accessible from the top level, so we don't always need to call MyGem::TopLevel.add_blog ?
Let's look at what the top level in Ruby actually means. Ruby is a highly object oriented language. That means, amongst other things, that all methods are bound to an object. When you call an apparently global method like puts or require, you're actually calling a method on a "default" object called main.
Therefore if we want to add a method to the top-level scope we need to add it to main. There are a couple of ways we could do this.
Option 2: Adding methods to Object
main is an instance of the class Object. If we add the methods from our module to Object (the monkey patch the OP referred to), we'll be able to use them from main and therefore the top level. We can do that by using our module as a mixin:
#my_gem.rb
module MyGem
module TopLevel
def self.add_blog
#...
end
end
end
class Object
include MyGem::TopLevel
end
We can now call add_blog from the top level. However, this also isn't a perfect solution (as the OP points out), because we haven't just added our new methods to main, we've added them to every instance of Object! Unfortunately almost everything in Ruby is a descendant of Object, so we've just made it possible to call add_blog on anything! For example, 1.add_blog("what does it mean to add a blog to a number?!") .
This is clearly undesirable, but conceptually pretty close to what we want. Let's refine it so we can add methods to main only.
Option 3: Adding methods to main only
So if include adds the methods from a module into a class, could we call it directly on main? Remember if a method is called without an explicit receiver ("owning object"), it will be called on main.
#app.rb
require 'my_gem'
include MyGem::TopLevel
add_blog "it works!"
Looks promising, but still not perfect - it turns out that include adds methods to the receiver's class, not just the receiver, so we'd still be able to do strange things like 1.add_blog("still not a good thing!") .
So, to fix that, we need a method that will add methods to the receiving object only, not its class. extend is that method.
This version will add our gem's methods to the top-level scope without messing with other objects:
#app.rb
require 'my_gem'
extend MyGem::TopLevel
add_blog "it works!"
1.add_blog "this will throw an exception!"
Excellent! The last stage is to set up our Gem so that a user can have our top-level methods added to main without having to call extend themselves. We should also provide a way for users to use our methods in a fully scoped way.
#my_gem/core.rb
module MyGem
module TopLevel
def self.add_blog...
end
end
#my_gem.rb
require './my_gem/core.rb'
extend MyGem::TopLevel
That way users can have your methods automatically added to the top level:
require 'my_gem'
add_blog "Super simple!"
Or can choose to access the methods in a scoped way:
require 'my_gem/core'
MyGem::TopLevel.add_blog "More typing, but more structure"
Ruby acheives this through a bit of magic called the eigenclass. Each Ruby object, as well as being an instance of a class, has a special class of its own - its eigenclass. We're actually using extend to add the MyGem::TopLevel methods to main's eigenclass.
This is the solution I would use. It's also the pattern that Sinatra uses. Sinatra applies it in a slightly more complex way, but it's essentially the same:
When you call
require 'sinatra'
sinatra.rb is effectively prepended to your script. That file calls
require sinatra/main
sinatra/main.rb calls
extend Sinatra::Delegator
Sinatra::Delegator is the equivalent of the MyGem::TopLevel module I described above - it causes main to know about Sinatra-specific methods.
Here's where Delgator differs slightly - instead of adding its methods to main directly, Delegator causes main to pass specific methods on to a designated target class - in this case, Sinatra::Application.
You can see this for yourself in the Sinatra code. A search through sinatra/base.rb shows that, when the Delegator module is extended or included, the calling scope (the "main" object in this case) will delegate the following methods to Sinatra::Application:
:get, :patch, :put, :post, :delete, :head, :options, :link, :unlink,
:template, :layout, :before, :after, :error, :not_found, :configure,
:set, :mime_type, :enable, :disable, :use, :development?, :test?,
:production?, :helpers, :settings, :register