Ruby binding instance_eval access to local vars? - ruby

Working on some functionality where I need to be able to define methods and local vars from strings via a binding
I think the code speaks best for itself, so here is what I'm trying to accomplish:
mybind = binding
mybind.eval("a = 5")
mybind.eval("a") #=> 5
a #=> NameError: undefined local variable or method `a'...
# Calling it in global/main scope doesn't find anything, but it's found in the `mybind` scope
mybind.eval("def thing; 3; end")
mybind.eval("thing") #=> 3
thing #=> NameError: undefined local variable or method `thing'...
# Calling it in global/main scope doesn't find anything, but it's found in the `mybind` scope
Essentially, I don't want any methods/vars defined to go to the global scope, but I want them to be callable within a second call to the same binding/eval. These will come through as strings, so callable blocks is out of the question as well.
I've tried instance_eval, class_eval, eval, and instance_exec
instance_eval is ALMOST perfect, except for some reason any local vars defined cannot be accessed within the next instance_eval called.
I'm a bit new to using binding, so if there is something obvious I'm missing, please let me know. Otherwise if there is anybody that has any idea on how to accomplish this, even with a different flow I would really appreciate any help.
Thanks!
I'm on Ruby 2.7.2, but can upgrade if that will provide a solution.

binding, by its design, gets the current binding whenever it's called. If you call it in the global scope, you'll get the global binding. What it sounds like you want is a binding that's not accessible to the user. We can create one of those easily, simply by calling binding inside of an otherwise empty function.
def new_binding
binding
end
Now, every time I call new_binding, I get a fresh scope that inherits lexically from Ruby's standard global scope but that nobody else can access without that object.
irb(main):012:0> new_binding == new_binding
=> false
irb(main):013:0> my_binding = new_binding
=> #<Binding:0x00005562fd103a00>
irb(main):014:0> my_binding.eval("a = 1")
=> 1
irb(main):015:0> my_binding.eval("a")
=> 1
--
Edit after your comment
We can get a little closer to what you're looking for by wrapping new_binding in a trivial eigenclass.
def new_binding
class <<Object.new
return binding
end
end
Now you can define methods, but you have to do so with an explicit self.
new_binding.eval("def self.thing; 3; end")
new_binding.eval("thing")
It's very possible there's some trickery that can be done with method_added to get around the explicit receiver, but for whatever reason it seems like method_added doesn't want to work on eigenclasses. I might end up opening a new SO question about that myself to be honest, as the behavior is quite odd.

Related

How to Make Or Reference a Null Ruby Binding For Eval

Rubocop dislikes the following; it issues Pass a binding, __FILE__ and __LINE__ to eval.:
sort_lambda = eval "->(a) { a.date }"
Yes, I know that eval is a security problem. The issue of security is out of scope for this question.
The Ruby documentation on binding says:
Objects of class Binding encapsulate the execution context at some particular place in the code and retain this context for future use. The variables, methods, value of self, and possibly an iterator block that can be accessed in this context are all retained. Binding objects can be created using Kernel#binding, and are made available to the callback of Kernel#set_trace_func and instances of TracePoint.
These binding objects can be passed as the second argument of the Kernel#eval method, establishing an environment for the evaluation.
The lambda being created does not need to access any variables in any scopes.
A quick and dirty binding to the scope where the eval is invoked from would look like this:
sort_lambda = eval "->(a) { a.date }", self.binding, __FILE__, __LINE__
Ideally, a null binding (a binding without anything defined in it, nothing from self, etc.) should be passed to this eval instead.
How could this be done?
Not exactly, but you can approximate it.
Before I go further, I know you've already said this, but I want to emphasize it for future readers of this question as well. What I'm describing below is NOT a sandbox. This will NOT protect you from malicious users. If you pass user input to eval, it can still do a lot of damage with the binding I show you below. Consult a cybersecurity expert before trying this in production.
Great, with that out of the way, let's move on. You can't really have an empty binding in Ruby. The Binding class is sort of compile-time magic. Although the class proper only exposes a way to get local variables, it also captures any constant names (including class names) that are in scope at the time, as well as the current receiver object self and all methods on self that can be invoked from the point of execution. The problem with an empty binding is that Ruby is a lot like Smalltalk sometimes. Everything exists in one big world of Platonic ideals called "objects", and no Ruby code can truly run in isolation.
In fact, trying to do so is really just putting up obstacles and awkward goalposts. Think you can block me from accessing BasicObject? If I have literally any object a in Ruby, then a.class.ancestors.last is BasicObject. Using this technique, we can get any global class by simply having an instance of that class or a subclass. Once we have classes, we have modules, and once we have modules we have Kernel, and at that point we have most of the Ruby built-in functionality.
Likewise, self always exists. You can't get rid of it. It's a fundamental part of the Ruby object system, and it exists even in situations where you don't think it does (see this question of mine from awhile back, for instance). Every method or block of code in Ruby has a receiver, so the most you can do is try to limit the receiver to be as small an object as possible. One might think you want self to be BasicObject, but amusingly there's not really a way to do that either, since you can only get a binding if Kernel is in scope, and BasicObject doesn't include Kernel. So at minimum, you're getting all of Kernel. You might be able to skimp by somehow and use some subclass of BasicObject that includes Kernel, thereby avoiding other Object methods, but that's likely to cause confusion down the road too.
All of this is to emphasize that a hypothetical null binding would really only make it slightly more complicated to get all of the global names, not impossible. And that's why it doesn't exist.
That being said, if your goal is to eliminate local variables and to try, you can get that easily by creating a binding inside of a module.
module F
module_function def get_binding
binding
end
end
sort_lambda = eval "->(a) { a.date }", F.get_binding
This binding will never have local variables, and the methods and constants it has access to are limited to those available in Kernel or at the global scope. That's about as close to "null" as you're going to get in the complex nexus of interconnected types and names we call Ruby.
While I originally left this as a comment on #Silvio Mayolo's answer, which is very well written, it seems germane to post it as an answer instead.
While most of what is contained within that answer is correct we can get slightly closer to a "Null Binding" through BasicObject inheritance:
class NullBinding < BasicObject
def get_binding
::Kernel
.instance_method(:binding)
.bind(self)
.call
end
end
This binding context has as limited a context as possible in ruby.
Using this context you will be unable to reference constants solely by name:
eval 'Class', NullBinding.new.get_binding
#=> NameError
That being said you can still reference the TOP_LEVEL scope so
eval '::Class', NullBinding.new.get_binding
#=> Class
The methods directly available in this binding context are limited only to the instance methods available to BasicObject. By way of Example:
eval "puts 'name'", NullBinding.new.get_binding
#=> NoMethodError
Again with the caveat that you can access TOP_LEVEL scope so:
eval "::Kernel.puts 'name'", NullBinding.new.get_binding
# name
#=> nil

Check if a method written inside a Proc is defined?

For some reason I store methods in a Proc and I want to check if they are defined.
If I do : defined? 1.+ I get method in return so I know the + method is defined.
Now if I store the methods in a Proc : p = Proc.new { 1.+ },
I want to know how i can check if the method stored in p is defined
I have tested defined? p or defined? p.call but it's not the attended results ...
Thanks for your help !
defined? is used to test if variables are defined or constants exist, not methods per-se.
For that you should use:
1.method(:+)
Where if that method doesn't exist you get a NameError exception. That's not the best way to test, though, instead there's a much simpler method for it:
1.respond_to?(:+)
Where if that method "responds to" that method call it will (usually) return true. Since Ruby is a highly dynamic programming language and methods can be made-up on the spot through method_missing and other tricks, this might not be 100% accurate, but it is for pre-defined methods.
Unfortunately once you've wrapped some valid Ruby code inside a Proc it's not possible to test any more if it will or won't work without executing it. This means if you have a Proc, it's basically a black box, short of delving in through tricks in the VM, but that's Ruby implementation specific.

Understanding the access of a variable assigned in initialize in Ruby

As a beginner, I've not quite got my head around self so I'm having trouble understanding how the self.blogs in initialize, and blogs then self.blogs on the next line after in the add_blog method, are all working together in the below code.
Why does blogs in the add_blog method access the same variable as self.blogs in initalize?
And then why is self.blogs used afterwards to sort the blogs array?
Also, would it matter if I used #blogs in initialize, instead of self.blogs?
class User
attr_accessor :username, :blogs
def initialize(username)
self.username = username
self.blogs = []
end
def add_blog(date, text)
added_blog = Blog.new(date, self, text)
blogs << added_blog
self.blogs = blogs.sort_by { |blog| blog.date }.reverse
added_blog
end
end
To answer your question, we have to reveal the true nature of attr_accessor.
class Foo
attr_accessor :bar
end
is completely equivalent to
class Foo
def bar
#bar
end
def bar=(value)
#bar = value
end
end
You can see that attr_accessor :bar defines two instance methods Foo#bar and Foo#bar= that access an instance variable #bar.
Lets then look at your code.
self.blogs = [] in initialize is actually calling the method User#blogs=, and through it sets the instance variable #blogs with an empty array. It can be written as self.blogs=([]) but it's noisy, isn't it? By the way, you can't omit self. here otherwise it just sets a local variable.
blogs << added_blog calls the method User#blog which returns the value of #blogs. It can also be written as self.blogs().push(added_blog), but again it's not rubyish. You can omit self. because there is no local variable named blogs in User#add_blog, so ruby falls back to call the instance method.
self.blogs = blogs.sort_by { |blog| blog.date }.reverse mixes call to User#blogs= and User#blogs.
For most method calls on self, self.method_name is equivalent to just method_name. That's not the case for methods whose name ends with an =, though.
The first thing to note, then, is that self.blogs = etc doesn't call a method named blogs and then somehow 'assign etc to it'; that line calls the method blogs=, and passes etc to it as an argument.
The reason you can't shorten that to just blogs = etc, like you can with other method calls, is because blogs = etc is indistinguishable from creating a new local variable named blogs.
When, on the previous line, you see a bare blogs, that is also a method call, and could just as easily have been written self.blogs. Writing it with an implicit receiver is just shorter. Of course, blogs is also potentially ambiguous as the use of a local variable, but in this case the parser can tell it's not, since there's no local variable named blogs assigned previously in the method (and if there had been, a bare blogs would have the value of that local variable, and self.blogs would be necessary if you had meant the method call).
As for using #blogs = instead of self.blogs =, in this case it would have the same effect, but there is a subtle difference: if you later redefine the blogs= method to have additional effects (say, writing a message to a log), the call to self.blogs = will pick up those changes, whereas the bare direct access will not. In the extreme case, if you redefine blogs= to store the value in a database rather than an instance variable, #blogs = won't even be similar anymore (though obviously that sort of major change in infrastructure will probably have knock-on effects internal to the class regardless).
#variable will directly access the instance variable for that class. Writing self.variable will send to the object a message variable. By default it will return the instance variable but it could do other things depending on how you set up your object. It could be a call to a method, or a subclass, or anything else.
The difference between calling blogs or self.blogs is totally up to syntax. If you use an opinionated syntax checker like rubocop it will tell you that you have a redundant use of self

Generic way to replace an object in it's own method

With strings one can do this:
a = "hello"
a.upcase!
p a #=> "HELLO"
But how would I write my own method like that?
Something like (although that doesn't work obviously):
class MyClass
def positify!
self = [0, self].max
end
end
I know there are some tricks one can use on String but what if I'm trying to do something like this for Object?
Many classes are immutable (e.g. Numeric, Symbol, ...), so have no method allowing you to change their value.
On the other hand, any Object can have instance variables and these can be modified.
There is an easy way to delegate the behavior to a known object (say 42) and be able to change, later on, to another object, using SimpleDelegator. In the example below, quacks_like_an_int behaves like an Integer:
require 'delegate'
quacks_like_an_int = SimpleDelegator.new(42)
quacks_like_an_int.round(-1) # => 40
quacks_like_an_int.__setobj__(666)
quacks_like_an_int.round(-1) # => 670
You can use it to design a class too, for example:
require 'delegate'
class MutableInteger < SimpleDelegator
def plus_plus!
__setobj__(self + 1)
self
end
def positify!
__setobj__(0) if self < 0
self
end
end
i = MutableInteger.new(-42)
i.plus_plus! # => -41
i.positify! # => 0
Well, the upcase! method doesn't change the object identity, it only changes its internal structure (s.object_id == s.upcase!.object_id).
On the other hand, numbers are immutable objects and therefore, you can't change their value without changing their identity. AFAIK, there's no way for an object to self-change its identity, but, of course, you may implement positify! method that changes properties of its object - and this would be an analogue of what upcase! does for strings.
Assignment, or binding of local variables (using the = operator) is built-in to the core language and there is no way to override or customize it. You could run a preprocessor over your Ruby code which would convert your own, custom syntax to valid Ruby, though. You could also pass a Binding in to a custom method, which could redefine variables dynamically. This wouldn't achieve the effect you are looking for, though.
Understand that self = could never work, because when you say a = "string"; a = "another string" you are not modifying any objects; you are rebinding a local variable to a different object. Inside your custom method, you are in a different scope, and any local variables which you bind will only exist in that scope; it won't have any effect on the scope which you called the method from.
You cannot change self to point to anything other than its current object. You can make changes to instance variables, such as in the case string which is changing the underlying characters to upper case.
As pointed out in this answer:
Ruby and modifying self for a Float instance
There is a trick mentioned here that is a work around, which is to write you class as a wrapper around the other object. Then your wrapper class can replace the wrapped object at will. I'm hesitant on saying this is a good idea though.

Which is better? Creating a instance variable or passing around a local variable in Ruby?

In general what is the best practice and pro/cons to creating an instance variable that can be accessed from multiple methods or creating an instance variable that is simply passed as an argument to those methods. Functionally they are equivalent since the methods are still able to do the work using the variable. While I could see a benefit if you were updating the variable and wanted to return the updated value but in my specific case the variable is never updated only read by each method to decide how to operate.
Example code to be clear:
class Test
#foo = "something"
def self.a
if #foo == "something"
puts "do #{#foo}"
end
end
a()
end
vs
class Test
foo = "something"
def self.a(foo)
if foo == "something"
puts "do #{foo}"
end
end
a(foo)
end
I don't pass instance variable around. They are state values for the instance.
Think of them as part of the DNA of that particular object, so they'll always be part of what makes the object be what it is. If I call a method of that object, it will already know how to access its own DNA and will do it internally, not through some parameter being passed in.
If I want to apply something that is foreign to the object, then I'll have to pass it in via the parameters.
As you mentioned, this is a non-functional issue about the code. With that in mind...
It's hard to give a definitive rule about it since it depends entirely on the context. Is the variable set once and forgotten about it, or constantly updated? How many methods share the same variable? How will the code be used?
In my experience, variables that drive behavior of the object but are seldom (if at all) modified are set in the initialize method, or given to the method that will cascade behavior. Libraries and leaf methods tend to have the variable passed in, as it's likely somebody will want to call it in isolation.
I'd suggest you start by passing everything first, and then refactoring if you notice the same variable being passed around all over the class.
If I need a variable that is scoped at the instance level, I use an instance variable, set in the initialize method.
If I need a variable that is scoped at the method level (that is, a value that is passed from one method to another method) I create the variable at the method level.
So the answer to your question is "When should my variable be in scope" and I can't really answer that without seeing all of your code and knowing what you plan to do with it.
If your object behavior should be statically set in the initialization phase, I would use an instance variable.

Resources