Module.nesting within instance_eval/exec or module_eval/exec - ruby

I came up with this question when I was trying to answer this. The following is an expected behaviour:
module A
p Module.nesting
end
# => [A]
But the following:
A.instance_eval{p Module.nesting}
A.instance_exec{p Module.nesting}
A.module_eval{p Module.nesting}
A.module_exec{p Module.nesting}
all print []. Why do these not behave like the first example?
Additional Question
Mu is too short suggested an interesting point. If that is correct, then Module.nesting would be one of the methods and variables that are dependent on the literal context like Method#source_location, __FILE__. Is this understanding correct? If so, can someone provide the inventory of these methods/variables that are dependent on the literal context? I think it would be useful for reference.

Warning: This is a little long and rambling. A bit of a tour through the Ruby source code seems necessary as the documentation is a bit thin. Feel free to skip to the end if you don't care about how sausage is made.
The 1.9.2 Module.nesting is implemented in eval.c like this:
static VALUE
rb_mod_nesting(void)
{
VALUE ary = rb_ary_new();
const NODE *cref = rb_vm_cref();
while (cref && cref->nd_next) {
VALUE klass = cref->nd_clss;
if (!(cref->flags & NODE_FL_CREF_PUSHED_BY_EVAL) &&
!NIL_P(klass)) {
rb_ary_push(ary, klass);
}
cref = cref->nd_next;
}
return ary;
}
I don't know the Ruby internals that well but I read the while loop like this: extract from the cref linked list all the nodes that are associated with a class-like thing but didn't come from eval. The NODE_FL_CREF_PUSHED_BY_EVAL bit is only set in here:
/* block eval under the class/module context */
static VALUE
yield_under(VALUE under, VALUE self, VALUE values)
A bit more grepping and reading reveals that instance_eval does end up going through yield_under. I'll leave checking instance_exec, module_eval, and module_exec as exercises for the reader. In any case, it looks like instance_eval is explicitly excluded from the Module.nesting list. This is more of a distraction than anything else, though: it just means that the cref entries pushed by the evals mentioned above are filtered out of the result.
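To make my reading of that while loop concrete, here is a toy model in Ruby. It is only a sketch: Cref, pushed_by_eval, next_cref and toy_nesting are names I made up for illustration, not anything the interpreter exposes.

```ruby
# Toy stand-in for one entry of the cref linked list (not a real Ruby API).
Cref = Struct.new(:klass, :pushed_by_eval, :next_cref)

# Mirrors the loop in rb_mod_nesting: walk the list, stop before the last
# (outermost) entry, and skip entries flagged as pushed by a block eval.
def toy_nesting(cref)
  result = []
  while cref && cref.next_cref
    result << cref.klass if !cref.pushed_by_eval && !cref.klass.nil?
    cref = cref.next_cref
  end
  result
end

top     = Cref.new(Object, false, nil)  # outermost entry, never reported
outer   = Cref.new(:A, false, top)      # stands for "module A"
inner   = Cref.new(:B, false, outer)    # stands for "module B" inside A
by_eval = Cref.new(:A, true, inner)     # entry pushed by a block eval

p toy_nesting(inner)    # the two lexical entries, innermost first
p toy_nesting(by_eval)  # same result: the flagged entry is skipped
```

Both calls give the same array, which is the "you won't see the eval" effect described above.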
So now the question is "what are NODE and rb_vm_cref() all about?".
If you look in node.h you'll see a bunch of NODE constants for the various Ruby keywords and language structures:
NODE_BLOCK
NODE_BREAK
NODE_CLASS
NODE_MODULE
NODE_DSYM
...
so I'd guess that NODE is a node in the instruction tree. This lines up nicely with my
Module.nesting seems to be more about talking to the parser
conjecture in the comment. But we'll keep going anyway.
The rb_vm_cref function is just a wrapper for vm_get_cref which is a wrapper for vm_get_cref0. What is vm_get_cref0 all about? It is all about this:
static NODE *
vm_get_cref0(const rb_iseq_t *iseq, const VALUE *lfp, const VALUE *dfp)
{
while (1) {
if (lfp == dfp) {
return iseq->cref_stack;
}
else if (dfp[-1] != Qnil) {
return (NODE *)dfp[-1];
}
dfp = GET_PREV_DFP(dfp);
}
}
All three arguments to the function come straight out of this control frame:
rb_control_frame_t *cfp = rb_vm_get_ruby_level_next_cfp(th, th->cfp);
The iseq appears to be an instruction sequence and the lfp and dfp are frame pointers:
VALUE *lfp; // cfp[6], local frame pointer
VALUE *dfp; // cfp[7], dynamic frame pointer
The definition of cref_stack is relevant:
/* klass/module nest information stack (cref) */
NODE *cref_stack;
So it looks like you're getting some sort of call or nesting stack out of rb_vm_cref.
Now back to the specifics at hand. When you do this:
module A
p Module.nesting
end
You'll have module A in the cref linked list (which is filtered to produce the Module.nesting result array) as you haven't hit the end yet. When you say these:
A.instance_eval { puts Module.nesting }
A.instance_exec { puts Module.nesting }
A.module_eval { puts Module.nesting }
A.module_exec { puts Module.nesting }
You won't have module A in cref anymore because you've already hit the end and popped module A off the stack. However, if you do this:
module A
instance_eval { puts Module.nesting.inspect }
instance_exec { puts Module.nesting.inspect }
module_eval { puts Module.nesting.inspect }
module_exec { puts Module.nesting.inspect }
end
You'll see this output:
[A]
[A]
[A]
[A]
because the module A hasn't been closed (and popped off cref) yet.
To finish off, the Module.nesting documentation says this:
Returns the list of Modules nested at the point of call.
I think this statement combined with the review of the internals indicates that Module.nesting does in fact depend on the specific literal context in which it is called.
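A small experiment along the same lines, as a sketch rather than proof: the constant names INSIDE, PROBE, FROM_PROBE and FROM_BLOCK_EVAL are made up for the demo. A block written inside module A keeps reporting A no matter where it is called, while a block written at the top level reports nothing even when module_eval runs it against A.

```ruby
module A
  module B
    INSIDE = Module.nesting            # evaluated lexically inside A::B
  end

  # The block is written inside `module A`, so it carries that lexical scope.
  PROBE = proc { Module.nesting }
end

module C
  FROM_PROBE = A::PROBE.call           # called inside C, but written inside A
end

# Written at the top level, so no module is lexically open here.
FROM_BLOCK_EVAL = A.module_eval { Module.nesting }

p A::B::INSIDE       # [A::B, A]
p C::FROM_PROBE      # [A]
p FROM_BLOCK_EVAL    # []
```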
If anyone with more experience in the Ruby internals has anything to add I can hand this over to the SO community as a community wiki.
UPDATE: All of this applies to class_eval as well as it does to module_eval and it also applies to 1.9.3 as well as it does to 1.9.2.

Related

Omitting an argument for a method in a block

I wonder, is it possible to do something similar in Ruby to what I can do in Scala or other languages:
someCollection.foreach(x => println(x)) // a full version
someCollection.foreach(println) // a short version
In Ruby I can do:
some_array.each { |x| puts x }
So how can I do this?
some_array.each { puts }
UPDATE:
I'm not talking about puts in particular; I just picked it as an example. There might be some_other_method which takes one parameter.
some_array.map { some_other_method }
some_array.map(some_other_method) # ???
def some_other_method a
# ... doing something with a
end
If you look up the rules for implicit η-expansion in the SLS (§6.26.5), it should be immediately obvious that it relies crucially on static type information and thus cannot possibly work in Ruby.
You can, however, explicitly obtain a Method object via reflection. Method objects respond to to_proc and like any object that responds to to_proc can thus be passed as if they were blocks using the unary prefix & operator:
some_array.each(&method(:puts))
Not quite like that, unfortunately. You can send a method name to be called on each object, e.g.:
some_array.each &:print_myself
Which is equivalent to:
some_array.each {|x| x.print_myself}
But I don't know of a clean (read: built-in) way to do what you're asking for. (Edit: @Jörg's answer does this, though it doesn't really save you any typing. There is no automatic partial function application in Ruby)
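The two forms discussed above differ in who receives each element. A minimal sketch, where shout is a hypothetical stand-in for some_other_method:

```ruby
# Hypothetical one-argument method standing in for some_other_method.
def shout(word)
  word.upcase + '!'
end

words = %w[foo bar]

# &method(:shout) passes each element as the ARGUMENT: shout(x)
via_method = words.map(&method(:shout))

# &:upcase sends the message to each element as the RECEIVER: x.upcase
via_symbol = words.map(&:upcase)

p via_method  # ["FOO!", "BAR!"]
p via_symbol  # ["FOO", "BAR"]
```

So &method(:name) is the one to reach for when the method takes the element as a parameter, and &:name when the element itself responds to the method.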

How would you express an idiom "with this object, if it exists, do this" in Ruby?

Very often in Ruby (and Rails specifically) you have to check if something exists and then perform an action on it, for example:
if @objects.any?
puts "We have these objects:"
@objects.each { |o| puts "hello: #{o}" }
end
This is as short as it gets and all is good, but what if you have @objects.some_association.something.hit_database.process instead of @objects? I would have to repeat it a second time inside the if expression, and what if I don't know the implementation details and the method calls are expensive?
The obvious choice is to create a variable and then test it and then process it, but then you have to come up with a variable name (ugh) and it will also hang around in memory until the end of the scope.
Why not something like this:
@objects.some_association.something.hit_database.process.with :any? do |objects|
puts "We have these objects:"
objects.each { ... }
end
How would you do this?
Note that there's no reason to check that an array has at least one element with any? if you're only going to send each, because sending each to an empty array is a no-op.
To answer your question, perhaps you are looking for https://github.com/raganwald/andand?
Indeed, using a variable pollutes the namespace, but still, I think if (var = value).predicate is a pretty common idiom and is usually perfectly ok:
if (objects = @objects.some_association.hit_database).present?
puts "We have these objects: #{objects}"
end
Option 2: if you like to create your own abstractions in a declarative fashion, that's also possible using a block:
@objects.some_association.hit_database.as(:if => :present?) do |objects|
puts "We have these objects: #{objects}"
end
Writing Object#as(options = {}) is pretty straightforward.
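For what it's worth, one possible shape for such a helper is sketched below. This is an assumption about what Object#as could look like, not an existing method; it treats the :if option as the name of a predicate to call on the receiver, and the demo uses any? instead of Rails' present? so it runs in plain Ruby.

```ruby
class Object
  # Hypothetical helper: yields self to the block only when the predicate
  # named by options[:if] is truthy (or when no :if option is given).
  def as(options = {})
    predicate = options[:if]
    return nil if predicate && !public_send(predicate)
    yield self
  end
end

seen = []
[1, 2].as(:if => :any?) { |objects| seen.concat(objects) }  # block runs
[].as(:if => :any?) { |objects| seen << :never }            # block is skipped
p seen  # [1, 2]
```

Returning nil when the predicate fails is just one choice; returning self would allow chaining instead.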
What about tap?
@objects.some_association.something.hit_database.process.tap do |objects|
if objects.any?
puts "We have these objects:"
objects.each { ... }
end
end
Edit: If you're using Ruby 1.9, the Object#tap method provides the same functionality as the code listed below.
It sounds like you just want to be able to save a reference to an object without polluting the scope, correct? How about we open up the Object class and add a method do, which will just yield itself to the block:
class Object
def do
yield self if block_given?
return self # allow chaining
end
end
We can then call, for example:
[1,2,3].do { |a| puts a.length if a.any? }
# prints 3 and returns [1, 2, 3]
[].do { |a| puts a.length if a.any? }
# prints nothing and returns []

Ruby: tap writes on a read?

So if I understand correctly Object#tap uses yield to produce a temporary object to work with during the execution of a process or method. From what I think I know about yield, it does something like, yield takes (thing) and gives (thing).dup to the block attached to the method it's being used in?
But when I do this:
class Klass
attr_accessor :hash
def initialize
@hash={'key' => 'value'}
end
end
instance=Klass.new
instance.instance_variable_get('@hash')['key'] # => 'value', as it should
instance.instance_variable_get('@hash').tap {|pipe| pipe['key']='newvalue'}
instance.instance_variable_get('@hash')['key'] # => new value... wut?
I was under the impression that yield -> new_obj. I don't know how correct this is though, I tried to look it up on ruby-doc, but Enumerator::yielder is empty, yield(proc) isn't there, and the fiber version... I don't have any fibers, in fact, doesn't Ruby actually explicitly require include 'fiber' to use them?
So what ought have been a read method on the instance variable and a write on the temp is instead a read/write on the instance variable... which is cool, because that's what I was trying to do and accidentally found when I was looking up a way to deal with hashes as instance variables (for some larger-than-I'm-used-to tables for named arrays of variables), but now I'm slightly confused, and I can't find a description of the mechanism that's making this happen.
Object#tap couldn't be simpler:
VALUE
rb_obj_tap(VALUE obj)
{
rb_yield(obj);
return obj;
}
(from the documentation). It just yields and then returns the receiver. A quick check in IRB shows that yield yields the object itself rather than a new object.
def foo
x = {}
yield x
x
end
foo { |y| y['key'] = :new_value }
# => {"key" => :new_value }
So the behavior of tap is consistent with yield, as we would hope.
tap does not duplicate the receiver. The block variable is assigned the very receiver itself. Then, tap returns the receiver. So when you do tap{|pipe| pipe['key']=newvalue}, the receiver of tap is modified. To my understanding,
x.tap{|x| foo(x)}
is equivalent to:
foo(x); x
and
y.tap{|y| y.bar}
is equivalent to:
y.bar; y
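A quick way to check the "no duplication" claim yourself, plus what to do if a copy is what you actually wanted (the variable names are just for the demo):

```ruby
settings = { 'key' => 'value' }

# tap yields the receiver itself, so the block mutates the original hash.
returned = settings.tap { |h| h['key'] = 'changed' }
p returned.equal?(settings)  # true: the very same object comes back
p settings['key']            # "changed"

# To work on a copy instead, duplicate explicitly before tapping.
original = { 'key' => 'value' }
copy = original.dup.tap { |h| h['key'] = 'changed' }
p original['key']            # "value"
p copy['key']                # "changed"
```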

Is there a method in Ruby Object to pass itself to a block or proc?

I think it would be natural to have in Ruby something like:
class Object
def yield_self
yield(self)
end
end
Does there exist a method like this by any chance? (I haven't found one.) Does anybody else think it would be nice to have it?
yield_self has been added to ruby core a month ago as of June 2017. https://bugs.ruby-lang.org/projects/ruby-trunk/repository/revisions/58528
It's in ruby 2.5.0 after revision number 58528, although I'm not exactly sure how to get that code yet. Perhaps someone who knows how can edit this answer.
I don't understand why you want the complexity of:
Object.new.yield_self do |foo|
...
end
When the following is almost exactly equivalent:
foo = Object.new
...
There is indeed the tap method that does almost exactly what you're asking:
x = [].tap do |array|
array << 'foo'
array << 9
end
p x
#=> ["foo", 9]
As Rob Davis points out, there's a subtle but important difference between tap and your method. The return value of tap is the receiver (i.e., the anonymous array in my example), while the return value of your method is the return value of the block.
You can see this in the source for the tap method:
VALUE
rb_obj_tap(VALUE obj)
{
rb_yield(obj);
return obj;
}
We're returning the obj that was passed into the function rather than the return value of rb_yield(obj). If this distinction is crucial, then tap is not what you need. Otherwise, it seems like a good fit.
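Side by side, the difference in return values looks like this. The sketch assumes Ruby 2.5 or later, where Object#yield_self is available as mentioned above:

```ruby
# Assumes Ruby 2.5+ for yield_self.
from_tap        = 5.tap { |n| n + 1 }         # block result is discarded
from_yield_self = 5.yield_self { |n| n + 1 }  # block result is returned

p from_tap         # 5
p from_yield_self  # 6
```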

Explanation of Ruby code for building Trie data structures

So I have this ruby code I grabbed from wikipedia and I modified a bit:
@trie = Hash.new()
def build(str)
node = @trie
str.each_char { |ch|
cur = ch
prev_node = node
node = node[cur]
if node == nil
prev_node[cur] = Hash.new()
node = prev_node[cur]
end
}
end
build('dogs')
puts @trie.inspect
I first ran this on console irb, and each time I output node, it just keeps giving me an empty hash each time {}, but when I actually invoke that function build with parameter 'dogs' string, it actually does work, and outputs {"d"=>{"o"=>{"g"=>{"s"=>{}}}}}, which is totally correct.
This is probably more of a Ruby question than the actual question about how the algorithm works. I don't really have adequate Ruby knowledge to decipher what is going on there I guess.
You're probably getting lost inside that mess of code which takes an approach that seems a better fit for C++ than for Ruby. Here's the same thing in a more concise format that uses a special case Hash for storage:
class Trie < Hash
def initialize
# Ensure that this is not a special Hash by disallowing
# initialization options.
super
end
def build(string)
string.chars.inject(self) do |h, char|
h[char] ||= { }
end
end
end
It works exactly the same but doesn't have nearly the same mess with pointers and such:
trie = Trie.new
trie.build('dogs')
puts trie.inspect
Ruby's Enumerable module is full of amazingly useful methods like inject which is precisely what you want for a situation like this.
I think you are just using irb incorrectly. You should type the whole function in, then run it, and see if you get correct results. If it doesn't work, how about you post your entire IRB session here.
Also here is a simplified version of your code:
def build(str)
node = @trie
str.each_char do |ch|
node = (node[ch] ||= {})
end
# not sure what the return value should be
end
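To see how the node[ch] ||= {} step shares prefixes between words, here is a standalone sketch. insert_word is a hypothetical variant that takes the trie as an argument instead of using an instance variable, and it returns the trie itself, which is only one possible answer to the return-value question.

```ruby
# Hypothetical standalone version of build: the trie is passed in explicitly.
def insert_word(trie, word)
  # Walk one level down per character, creating a child hash only when missing.
  word.each_char.inject(trie) { |node, ch| node[ch] ||= {} }
  trie
end

trie = {}
insert_word(trie, 'dog')
insert_word(trie, 'dot')   # reuses the existing "d" and "o" nodes
p trie                     # one "d" branch, one "o" branch, then "g" and "t"
```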
