method_missing in "Programming Ruby" over my head - ruby

method_missing
*obj.method_missing( symbol h , args i ) → other_obj
Invoked by Ruby when obj is sent a
message it cannot handle. symbol is
the symbol for the method called, and
args are any arguments that were
passed to it. The example below
creates a class Roman, which responds
to methods with names consisting of
roman numerals, returning the
corresponding integer values. A more
typical use of method_missing is to
implement proxies, delegators, and
forwarders.
class Roman
def roman_to_int(str)
# ...
end
def method_missing(method_id)
str = method_id.id2name
roman_to_int(str)
end
end
r = Roman.new
r.iv ! 4
r.xxiii ! 23
r.mm ! 2000
I just heard about method-missing and went to find out more in Programming Ruby but the above explanation quoted from the book is over my head. Does anyone have an easier explanation? More specifically, is method-missing only used by the interpreter or is there ever a need to call it directly in a program (assuming I'm just writing web apps, as opposed to writing code for NASA)?

It's probably best to not think of ruby as having methods. When you call a ruby "method" you are actually sending a message to that instance (or class) and if you have defined a handler for the message, it is used to process and return a value.
So method_missing is a special definition that gets called whenever ruby cannot find an apropriate handler. You could also think of it like a * method.

Ruby doesn't have any type enforcement, and likewise doesn't do any checking as to what methods an object has when the script is first parsed, because this can be dynamically changed as the application runs.
What method_missing does, is let you intercept and handle calls to methods that don't exist for a given object. This provides the under-the-hood power behind pretty much every DSL (domain-specific language) written in Ruby.
In the case of the example, every one of 'r.iv', 'r.mm', and so on is actually a method call to the Roman object. Of course, it doesn't have an 'iv' or an 'mm' method, so instead control is passed to method_missing, which gets the name of the method that was called, as well as whatever arguments were passed.
method_missing then converts the method name from a symbol to a string, and parses it as a Roman number, returning the output as an integer.

It's basically a catch-all for messages that don't match up to any methods. It's used extensively in active record for dynamic finders. It's what lets you write something like this:
SomeModel.find_by_name_and_number(a_name, a_number)
The Model doesn't contain code for that find_by, so method_missing is called which looks at is says - I recognize that format, and carries it out. If it doesn't, then you get a method not found error.

In the Roman example you provide it illustrates how you can extend the functionality of a class without explicitly defining methods.
r.iv is not a method so method_missing catches it and calls roman_to_int on the method id "iv"
It's also useful when you want to handle unrecognized methods elsewhere, like proxies, delegators, and forwarders, as the documentation states.

You do not call "method_missing" (the interpreter calls it). Rather, you define it (override it) for a class which you want to make to be more flexible. As mentioned in other comments, the interpreter will call your version of method_missing when the class (or instance) does not ("explicitly"?) define the requested method. This gives you a chance to make something up, based on the ersatz method/message name.
Have you ever done any "reflection" programming in Java? Using this method would be as if the class to be accessed via reflection could look at the string (excuse me, "symbol") of the method name if a no-such-method exception was thrown, and then make something up as that method's implementation on the fly.
Dynamic programming is kind of a "one-up" on reflection.

Since you mention web apps I'll guess that you are familiar with Ruby on Rails. A good example of how method_missing can be used is the different find_by_<whatever> methods that's available. None of those methods actually exist! They are synthesized during run time. All of this magic happens because of how ruby intercepts invocations of non-existing methods.
You can read more about that and other uses of method_missing here.

ActiveRecord uses method_missing to define find_by methods. But since, method_missing is basically a last resort method, this can be a serious performance bottleneck. What I discovered is that ActiveRecord does some further awesome metaprogramming by defining the new finder method as a class method !! Thus, any further calls to the same finder method would not hit the method_missing because it is now a class method. For details about the actual code snippet from base.rb, click here.

Related

How to Make Or Reference a Null Ruby Binding For Eval

Rubocop dislikes the following; it issues Pass a binding, __FILE__ and __LINE__ to eval.:
sort_lambda = eval "->(a) { a.date }"
Yes, I know that eval is a security problem. The issue of security is out of scope for this question.
The Ruby documentation on binding says:
Objects of class Binding encapsulate the execution context at some particular place in the code and retain this context for future use. The variables, methods, value of self, and possibly an iterator block that can be accessed in this context are all retained. Binding objects can be created using Kernel#binding, and are made available to the callback of Kernel#set_trace_func and instances of TracePoint.
These binding objects can be passed as the second argument of the Kernel#eval method, establishing an environment for the evaluation.
The lambda being created does not need to access any variables in any scopes.
A quick and dirty binding to the scope where the eval is invoked from would look like this:
sort_lambda = eval "->(a) { a.date }", self.binding, __FILE__, __LINE__
Ideally, a null binding (a binding without anything defined in it, nothing from self, etc.) should be passed to this eval instead.
How could this be done?
Not exactly, but you can approximate it.
Before I go further, I know you've already said this, but I want to emphasize it for future readers of this question as well. What I'm describing below is NOT a sandbox. This will NOT protect you from malicious users. If you pass user input to eval, it can still do a lot of damage with the binding I show you below. Consult a cybersecurity expert before trying this in production.
Great, with that out of the way, let's move on. You can't really have an empty binding in Ruby. The Binding class is sort of compile-time magic. Although the class proper only exposes a way to get local variables, it also captures any constant names (including class names) that are in scope at the time, as well as the current receiver object self and all methods on self that can be invoked from the point of execution. The problem with an empty binding is that Ruby is a lot like Smalltalk sometimes. Everything exists in one big world of Platonic ideals called "objects", and no Ruby code can truly run in isolation.
In fact, trying to do so is really just putting up obstacles and awkward goalposts. Think you can block me from accessing BasicObject? If I have literally any object a in Ruby, then a.class.ancestors.last is BasicObject. Using this technique, we can get any global class by simply having an instance of that class or a subclass. Once we have classes, we have modules, and once we have modules we have Kernel, and at that point we have most of the Ruby built-in functionality.
Likewise, self always exists. You can't get rid of it. It's a fundamental part of the Ruby object system, and it exists even in situations where you don't think it does (see this question of mine from awhile back, for instance). Every method or block of code in Ruby has a receiver, so the most you can do is try to limit the receiver to be as small an object as possible. One might think you want self to be BasicObject, but amusingly there's not really a way to do that either, since you can only get a binding if Kernel is in scope, and BasicObject doesn't include Kernel. So at minimum, you're getting all of Kernel. You might be able to skimp by somehow and use some subclass of BasicObject that includes Kernel, thereby avoiding other Object methods, but that's likely to cause confusion down the road too.
All of this is to emphasize that a hypothetical null binding would really only make it slightly more complicated to get all of the global names, not impossible. And that's why it doesn't exist.
That being said, if your goal is to eliminate local variables and to try, you can get that easily by creating a binding inside of a module.
module F
module_function def get_binding
binding
end
end
sort_lambda = eval "->(a) { a.date }", F.get_binding
This binding will never have local variables, and the methods and constants it has access to are limited to those available in Kernel or at the global scope. That's about as close to "null" as you're going to get in the complex nexus of interconnected types and names we call Ruby.
While I originally left this as a comment on #Silvio Mayolo's answer, which is very well written, it seems germane to post it as an answer instead.
While most of what is contained within that answer is correct we can get slightly closer to a "Null Binding" through BasicObject inheritance:
class NullBinding < BasicObject
def get_binding
::Kernel
.instance_method(:binding)
.bind(self)
.call
end
end
This binding context has as limited a context as possible in ruby.
Using this context you will be unable to reference constants solely by name:
eval 'Class', NullBinding.new.get_binding
#=> NameError
That being said you can still reference the TOP_LEVEL scope so
eval '::Class', NullBinding.new.get_binding
#=> Class
The methods directly available in this binding context are limited only to the instance methods available to BasicObject. By way of Example:
eval "puts 'name'", NullBinding.new.get_binding
#=> NoMethodError
Again with the caveat that you can access TOP_LEVEL scope so:
eval "::Kernel.puts 'name'", NullBinding.new.get_binding
# name
#=> nil

Extract information about method receiver from stack

If we call caller method, we get something like:
prog.rb:3:in `a'
prog.rb:6:in `b'
prog.rb:9:in `c'
This is helpful for humans, but if I wanted to analyze the stack programmatically, not really, as two methods called :a may be entirely unrelated.
Is there any way/method to extract information about the receiver of the methods (like its class or object id) as well? For example:
prog.rb:3:in `Klass#a'
prog.rb:6:in `Modoole#b'
prog.rb:9:in `OtherKlass#c'
Formatting is only an example; this info might be an Array or anything.
I'm trying to emulate this with TracePoint, but forming a separate stack is a bad solution. Is there any Ruby way I missed in the docs?
There's an alternative to Kernel#caller named Kernel#caller_locations, that returns an array of Thread::Backtrace::Location objects. According to the manual, these should in theory be able to give you this information through the #label method.
Returns the label of this frame.
Usually consists of method, class, module, etc names with decoration.
After trying this out however, I need to question the term usually in the docs, because it seems to only return the method name still. Unless usually means it works for you, there seems to be no way of accomplishing this as of now.
Edit:
As per comment, one case that satisfies the condition of usually is when the method call is coming from within a Class or Module body:
class A
def trace
puts caller_locations.first.label
end
end
class B
A.new.trace
end
#=> <class:B>
module C
A.new.trace
end
#=> <module:C>

What are empty-body methods used for in Ruby?

Currently reading a Ruby style guide and I came across an example:
def no_op; end
What is the purpose of empty body methods?
There are a number of reasons you might create an empty method:
Stub a method that you will fill in later.
Stub a method that a descendant class will override.
Ensure a class or object will #respond_to? a method without necessarily doing anything other than returning nil.
Undefine an inherited method's behavior while still allowing it to #respond_to? the message, as opposed to using undef foo on public methods and surprising callers.
There are possibly other reasons, too, but those are the ones that leapt to mind. Your mileage may vary.
There may be several reasons.
One case is when a class is expected to implement a specific interface (virtually speaking, given that in Ruby there are no interfaces), but in that specific class that method would not make sense. In this case, the method is left for consistency.
class Foo
def say
"foo"
end
end
class Bar
def say
"bar"
end
end
class Null
def say
end
end
In other cases, it is left as a temporary placeholder or reminder.
There are also cases where the method is left blank on purpose, as a hook for developers using that library. The method it is called somewhere at runtime, and developers using that library can override the blank method in order to execute some custom callback. This approach was used in the past by some Rails libraries.

Is this ruby metaprogramming abuse?

I am new to Ruby, and have a gem that I am making to interact with a JSONRPC API and basically all calls and responses are similar enough, that every API call can be handled with one function, like:
Module::api_command('APINamespace.NamespaceMethod')
but I would like to also (for convenience sake) be able to do:
Module::APINamespace.NamespaceMethod
Is there any reason not to do this by using Module.const_missing to return a dummy class that has a method_missing which will allow passing the call from Module::APINamespace.NamespaceMethod to Module::api_command('APINamespace.NamespaceMethod')
Is there a more elegant or civilized way to do this?
Yes, I'm sorry, but to my mind that hack is ridiculous. :)
First of all, i'm assuming that your api_command method is actually invoking methods on the APINamespace module, as implied by this line: Module::api_command('APINamespace.NamespaceMethod')
Given the above, why not just set a constant equal to APINamespace in your module?
MyModule::APINamespace = ::APINamespace
MyModule::APINamespace.NamespaceMethod()
UPDATE:
I'm still not entirely understanding your situation, but perhaps this:
module MyModule
def self.const_missing(c)
Object.const_get(c)
end
end
Now you can invoke any top-level constant as if it was defined on your module; say there was a module called StrangeAPI at top-level, if you use the hack above, you can now invoke its methods as follows:
MyModule::StrangeAPI.Blah()
Is this what you want?

Ruby - Array method confusion

we can call the Array method in the top level like this
Array(something)
that makes sense to me, it's a method call without explicit receiver, and self, which is main in this case, is inserted at the front of the method call. But isn't it that this is equivalent to :
Kernel.Array(something)
this doesn't make sense to me. Since in the first case, the object main is of class Object, which got Kernel module mixed in, thus have the Array method. But in the second case, we are calling the Array method on the Kernel module object itself, rather than main object, didn't they are NOT the same thing?
sorry for my bad english.
Kernel.Array is what is known as a module function. Other examples of module functions include Math.sin, and Math.hypot and so on.
A module function is a method that is both a class method on the module and also a private instance method. When you invoke Array() at the top-level you are invoking it as a private instance method of the main object. When you invoke it through Kernel.Array() you are invoking it as a class method on Kernel. They are the same method.
To learn more, read up on the module_function method in rubydocs: http://www.ruby-doc.org/core/classes/Module.html#M001642
class Object mixed-in module Kernel, but Kernel is an instance of Object. So Kernel "module" methods - is it's instance methods.
What's confusing you is the difference between class and instance methods.
Class methods don't have an explicit receiver, and thus no self to access other fields with. They just... are.
Generally instance methods are used to query or manipulate the attributes of a given object, whereas the class methods are "helper" or "factory" methods that provide some functionality associated with or especially useful for a certain kind of class, but not dependent on actual live instances (objects) of that class.
Not sure about Ruby, but Java has (for example) a whole class, Math that contains nothing but instance methods like sin(), max(), exp() and so forth: There is no "Math" object, these are just methods that embody mathematical algorithms. Not the best example, because in Ruby those methods are probably embedded right into the numeric classes as instance methods.
The case you mention is a bit confusing because Array's () method and Kernel's Array() method are in fact different methods that do similar things. Both are class methods.
Array() takes a list of arguments and makes and returns an array containing them.
Kernel.Array() takes a single argument of an "array-able" type, such as a sequence, and takes the values returned by this argument and builds an array from those.
UPDATE
The downvote was perhaps justified; I apologize for taking on a subject outside my area of expertise. I think I'll be deleting this answer soon.
# Chuck: I would sincerely hope that a language/library's official documentation would offer some meaningful clues as to how it works. This is what I consulted in answering this question.
The rdoc for Kernel.Array():
Returns arg as an Array. First tries to call arg.to_ary, then arg.to_a. If both fail, creates a single element array containing arg (unless arg is nil).
for Array.():
Returns a new array populated with the given objects.
I don't know about you, but I think if the docs vary that much then either they're talking about separate methods or the documentation is a train wreck.
# freeknight:
But everything in ruby is an object of some kind, even classes and modules. And Kernel.Array is actually a method call on an specific object - the Kernel object.
Yeah, under the covers it's similar in Java too. But the Array() method isn't doing anything with Kernel, any more than Array() is doing anything with the Array class object, so this is really only a semantic quibble. It's an instance method because you could hang it off class IPSocket if you were crazy enough, and it would still work the same way.
They are the same thing:
a = Kernel.Array('aa')
=> ["aa"]
a.class
=> Array
a = Array('aaa')
=> ["aaa"]
a.class
=> Array
Maybe there is an alias?

Resources