Ruby Blocks, Procs and Local Variables - ruby

In Ruby, procs seem to have access to local variables that were present at the time they were declared, even if they are executed in a different scope:
module Scope1
  def self.scope1_method
    puts "In scope1_method"
  end
end
module Scope2
  def self.get_proc
    x = 42
    Proc.new do
      puts x
      puts self
      scope1_method
    end
  end
end
Scope1.instance_eval(&Scope2.get_proc)
Output:
42
Scope1
In scope1_method
How and why does this occur?

The Proc.new call creates a closure for the block that it's given. In creating a closure for the block, the block is bound to the original variables in the scope of the Proc.new call.
Why is this done?
It allows Ruby blocks to function as closures. Closures are extremely useful, and the Wikipedia entry (linked above) does an excellent job of explaining some of their applications.
How is this done?
This is done in the Ruby VM (in C code) by copying the Ruby control frame that exists before entering the Proc.new method. The block is then run in the context of this control frame. This effectively copies all of the bindings that are present in this frame. In Ruby 1.8, you can find the code for this in the proc_alloc function in eval.c. In Ruby 1.9, you can find this in the proc_new function in proc.c.
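A minimal sketch of what "bound to the original variables" means in practice. The helper name make_counter is hypothetical and used only for illustration. It suggests that the procs hold on to the variable itself rather than a snapshot of its value, so two procs created in the same scope see each other's updates:

```ruby
# Hypothetical helper: returns two procs that close over the same local variable.
def make_counter
  count = 0
  increment = Proc.new { count += 1 }  # writes to the captured variable
  read      = Proc.new { count }       # reads the same captured variable
  [increment, read]
end

increment, read = make_counter
increment.call
increment.call
puts read.call  # prints 2: both procs share `count`, even after make_counter returned
```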

This behavior is by design. In Ruby, blocks, procs, and lambdas are lexical closures. Read this blog post for a short explanation of the differences between Ruby's three flavors of closure.

Related

How proc is executed when passed to `instance_exec`

The question is inspired by this one.
Proc::new has an option to be called without a block inside a method:
Proc::new may be called without a block only within a method with an attached block, in which case that block is converted to the Proc object.
When the proc/lambda instance is passed as a code block, the new instance of Proc is being created:
Proc.singleton_class.prepend(Module.new do
  def new(*args, &cb)
    puts "PROC #{[block_given?, cb, *args].inspect}"
    super
  end
end)
Proc.prepend(Module.new do
  def initialize(*args, &cb)
    puts "INIT #{[block_given?, cb, *args].inspect}"
    super
  end
  def call(*args, &cb)
    puts "CALL #{[block_given?, cb, *args].inspect}"
    super
  end
end)
λ = ->(*args) { }
[1].each &λ
#⇒ [1]
As one can see, neither Proc::new, nor Proc#initialize, nor Proc#call was invoked.
The question is: how does Ruby create and execute a block wrapper under the hood?
NB: Don't test the code above in a pry/irb console: they are known to have glitches with pure execution of this, basically because they patch procs.
There has been some discussion of this behavior on the Ruby Issue Tracker, see Feature #10499: Eliminate implicit magic in Proc.new and Kernel#proc.
This is an implementation artifact of YARV: YARV pushes a block on the global VM stack, and Proc::new simply creates a Proc from the topmost block on the stack. So, if you happen to call Proc.new from within a method which was called with a block, it will happily grab whatever block is on top of the stack, without ever checking where it came from. Somehow, somewhere, in the mists of time, this (let's call it) "accidental artifact" (I'd actually rather call it a bug) became a documented feature. A feature that the developers of JRuby (and presumably Rubinius, Opal, MagLev, etc.) would rather get rid of.
Since most other implementations work completely differently, this behavior, which comes "for free" on YARV, makes both blocks and Proc::new potentially more expensive on other implementations and prohibits possible optimizations (which doesn't hurt on YARV, because YARV doesn't optimize).
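One way to observe part of this without patching Proc is sketched below. The helper name capture is hypothetical. On MRI, a lambda passed with & appears to reach the method as the very same object, which would be consistent with no Proc.new or Proc#initialize call showing up in the question's output:

```ruby
# Hypothetical helper: captures the incoming block as a Proc and returns it.
def capture(&blk)
  blk
end

lam = ->(*args) { args }
captured = capture(&lam)

same_object  = captured.equal?(lam)  # identity check, not just equality
still_lambda = captured.lambda?      # lambda-ness survives the round trip

puts same_object
puts still_lambda
```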

Procs and Lambdas for what when we have methods

My doubt is pretty much a matter of misunderstanding...
From what I read, a block is a group of code enclosed by {} or do and end.
From my understanding, what a Proc or Lambda does is:
Get this block
Assign the block to a variable
This means we don't need to repeat the whole block every time.
But what is the difference between a Proc, a Lambda and a standard method? From my understanding, they all work the same way.
There is one crucial difference between Procs (and lambdas) and methods. Procs are objects, methods aren't. And since Ruby is an object-oriented language, where you can only do things with and to objects, that is very important.
So, if you want to pass a piece of code to a method, you need to have an object, and that's what Procs provide.
You can get a proxy object that represents a method via the Object#method method, which will return an instance of the Method class (which duck-types Proc).
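A short sketch of that last point, assuming a hypothetical method named double. Object#method wraps it in a Method object, which can then be passed wherever a block is expected:

```ruby
# Hypothetical method used only for illustration.
def double(n)
  n * 2
end

wrapped = method(:double)           # a Method object, i.e. a real object
doubled = [1, 2, 3].map(&wrapped)   # & converts it via Method#to_proc
as_proc = wrapped.to_proc           # explicit conversion to a Proc

puts wrapped.class
puts doubled.inspect
puts as_proc.call(21)
```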
Everything in Ruby is considered an object. Proc and lambda are fundamentally similar constructs. How best to use them is largely a matter of opinion.
The key advantages are that they can easily be passed around to other methods and blocks, and that the syntax is short and sweet. Consider the following very simple examples:
multiply = Proc.new { |x, y| x * y }
subtract = lambda { |x, y| x - y }
add = ->(x, y) { x + y }
def do_math(opr, *b)
  opr.each do |bloc|
    puts bloc.call(b[0], b[1])
  end
end
do_math([multiply, subtract, add], 10, 5)
# => 50
# => 5
# => 15
puts multiply.call(5, 5)
# => 25
puts subtract.call(5, 5)
# => 0
puts add.call(5, 5)
# => 10
To get a better grasp of what they are, watch this video: An Introduction to Procs, Lambdas and Closures in Ruby
Additionally the documentation has more examples here: http://www.ruby-doc.org/core-2.0.0/Proc.html
I found that this Codecademy section helps with the distinction.
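To make the distinction concrete, here is a small sketch of the two differences that usually matter: argument checking and the meaning of return. All names are hypothetical:

```ruby
loose  = Proc.new { |x, y| [x, y] }
strict = lambda   { |x, y| [x, y] }

# A proc tolerates a missing argument and fills it with nil.
loose_result = loose.call(1)

# A lambda checks its arity like a method does.
strict_error =
  begin
    strict.call(1)
  rescue ArgumentError => e
    e.class
  end

# `return` inside a lambda leaves only the lambda.
def return_from_lambda
  l = lambda { return :from_lambda }
  l.call
  :after_lambda
end

# `return` inside a proc leaves the enclosing method.
def return_from_proc
  pr = Proc.new { return :from_proc }
  pr.call
  :after_proc  # never reached
end

puts loose_result.inspect
puts strict_error
puts return_from_lambda
puts return_from_proc
```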

Functional programming with Ruby

Ruby has support for functional programming features like code blocks and higher-order functions (see Array#map, inject, & select).
How can I write functional code in Ruby?
Would appreciate examples like implementing a callback.
You could use yield:
def method(foo, bar)
  operation = foo + bar
  yield operation
end
Then you call it like this:
foo = 1
bar = 2
method(foo, bar) { |result| puts "the result of the operation using arguments #{foo} and #{bar} is #{result}" }
The code in the block (a block is basically "a chunk of code", to paraphrase Ruby programmers) gets executed at the "yield operation" line: you pass the method a block of code to be executed inside the method. This makes Ruby a pretty versatile language.
In this case yield receives an argument called "operation", which it passes to the block. I wrote it that way because you asked for a way to implement a callback.
But you could also just write:
def method()
  puts "I'm inside the method"
  yield
end
method() { puts "I'm inside a block" }
and it would output:
I'm inside the method
I'm inside a block
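If the callback has to be stored and invoked later rather than yielded to immediately, the block can be captured with an & parameter. The following is only a sketch: the Button class and its method names are hypothetical. The last line chains the higher-order methods mentioned in the question:

```ruby
# Hypothetical class that stores callbacks and runs them later.
class Button
  def initialize
    @handlers = []
  end

  # &handler turns the attached block into a Proc object we can keep.
  def on_click(&handler)
    @handlers << handler
    self
  end

  # Invoke every stored callback with the event.
  def click(event)
    @handlers.map { |handler| handler.call(event) }
  end
end

log = []
button = Button.new
button.on_click { |event| log << "first saw #{event}" }
button.on_click { |event| log << "second saw #{event}" }
button.click(:press)
puts log.inspect

# select, map and inject chained together: sum of the squares of the even numbers.
evens_squared_sum = (1..6).select(&:even?).map { |n| n * n }.inject(0) { |sum, n| sum + n }
puts evens_squared_sum
```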

DSL block without argument in ruby

I'm writing a simple DSL in Ruby. A few weeks ago I stumbled upon a blog post that showed how to transform code like:
some_method argument do |book|
book.some_method_on_book
book.some_other_method_on_book :with => argument
end
into cleaner code:
some_method argument do
some_method_on_book
some_other_method_on_book :with => argument
end
I can't remember how to do this and I'm not sure about downsides but cleaner syntax is tempting. Does anyone have a clue about this transformation?
def some_method(argument, &blk)
  #...
  book.instance_eval(&blk)
  #...
end
UPDATE: However, that lets you omit book but doesn't let you use the argument. To use it transparently you must pass it along somehow. I suggest storing it on book itself:
class Book
  attr_accessor :argument
end
def some_method(argument, &blk)
  #...
  book.argument = argument
  book.instance_eval(&blk)
  #...
end
some_method 'argument' do
  some_method_on_book
  some_other_method_on_book argument
end
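The snippet above elides how book is created. A self-contained sketch follows, assuming a hypothetical Book that simply records the calls it receives and a some_method that builds a fresh Book on each call:

```ruby
# Hypothetical Book that records which DSL methods were called on it.
class Book
  attr_accessor :argument
  attr_reader :calls

  def initialize
    @calls = []
  end

  def some_method_on_book
    @calls << :some_method_on_book
  end

  def some_other_method_on_book(value)
    @calls << [:some_other_method_on_book, value]
  end
end

def some_method(argument, &blk)
  book = Book.new            # assumption: a new Book per call
  book.argument = argument   # make the argument reachable inside the block
  book.instance_eval(&blk)   # run the block with self set to book
  book
end

book = some_method('argument') do
  some_method_on_book
  some_other_method_on_book argument  # resolves to book.argument
end

puts book.calls.inspect
```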
Take a look at this article: http://www.dan-manges.com/blog/ruby-dsls-instance-eval-with-delegation. It gives an overview of the method (specifically in the context of its downsides and possible solutions to them), plus several useful links for further reading.
Basically, it's about using instance_eval to execute the block in the desired context.
Regarding the downside of this technique:
So what's the problem with it? Well, the problem is that blocks are
generally closures. And you expect them to actually be full closures.
And it's not obvious from the point where you write the block that
that block might not be a full closure. That's what happens when you
use instance_eval: you reset the self of that block into something
else - this means that the block is still a closure over all local
variables outside the block, but NOT for method calls. I don't even
know if constant lookup is changed or not.
Using instance_eval changes the rules for the language in a way that
is not obvious when reading a block. You need to think an extra step
to figure out exactly why a method call that you can lexically see
around the block can actually not be called from inside of the block.
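The effect described in the quote can be reproduced with a small sketch. The Recipe and Kitchen classes are hypothetical. Inside the instance_eval block the local variable is still visible, but a method of the surrounding object is not:

```ruby
# Hypothetical DSL target.
class Recipe
  attr_reader :steps

  def initialize
    @steps = []
  end

  def step(text)
    @steps << text
  end
end

# Hypothetical caller whose own methods become unreachable inside the block.
class Kitchen
  def helper_text
    "stir"
  end

  def build
    prefix = "gently"
    recipe = Recipe.new
    outcome =
      begin
        recipe.instance_eval do
          step prefix       # local variable: still captured by the closure
          step helper_text  # Kitchen method: self is the Recipe now, so lookup fails
        end
        :ok
      rescue NameError => e
        e.class
      end
    [recipe.steps, outcome]
  end
end

steps, outcome = Kitchen.new.build
puts steps.inspect
puts outcome
```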
Check out the docile gem. It takes care of all the sharp edges, making this very easy for you.

Ruby performance: define method with define_method or eval

While looking through ActiveSupport source code I've noticed that sometimes eval is used in places where define_method is enough.
Example: ActiveSupport: Module.delegate
I consider define_method a cleaner and safer way of doing things.
What are the benefits of eval over define_method?
Performance, memory usage, something else?
In Ruby 1.8, when you use define_method, the method you're defining can't accept a block.
It’s pretty well known that because of
a deficiency in blocks arguments in
Ruby 1.8 Class#define_method cannot
define methods that take blocks.
def x *args, &block; end # => works!
define_method(:x) {|*args,&block| } # => SyntaxError: compile error
The method being defined requires a block:
"def #{prefix}#{method}(*args, &block)" # def customer_name(*args, &block)
So define_method can't be used.
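The limitation quoted above is specific to Ruby 1.8. From Ruby 1.9 on, block parameters may include &block, so something like the following sketch should work. The Greeter class is hypothetical. Note that the body calls block.call instead of yield, because yield inside a define_method block does not refer to the block passed to the defined method:

```ruby
# Hypothetical class; requires Ruby 1.9 or later.
class Greeter
  define_method(:each_greeting) do |*names, &block|
    # Call the block given to each_greeting explicitly.
    names.map { |name| block.call("Hello, #{name}") }
  end
end

greetings = Greeter.new.each_greeting("Ann", "Bob") { |greeting| greeting.upcase }
puts greetings.inspect
```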
I found this to be a very nice article on the subject: http://blog.grayproductions.net/articles/eval_isnt_quite_pure_evil.
I don't know what the reason is in that particular case, but define_method takes a block, which is a closure (it carries the local variables of the place where it was defined), and that can lead to considerably higher memory consumption compared to plain eval.
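The closure part of that claim can be illustrated with a sketch, though it does not measure memory. The Report class is hypothetical. The block passed to define_method keeps the surrounding local variable reachable, while a method defined with def inside an evaluated string starts a fresh scope and cannot see it:

```ruby
# Hypothetical class comparing the two ways of defining a method.
class Report
  label = "captured local"

  # Block form: the method body is a closure over `label`.
  define_method(:via_block) { label }

  # String form: `def` opens a new scope, so `label` is not visible in the body.
  class_eval <<~RUBY, __FILE__, __LINE__ + 1
    def via_string
      if defined?(label)
        label
      else
        :no_local_in_scope
      end
    end
  RUBY
end

report = Report.new
puts report.via_block
puts report.via_string
```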

Resources