Can someone explain Ruby's use of pipe characters in a block? - ruby

Can someone explain to me Ruby's use of pipe characters in a block? I understand that it contains a variable name that will be assigned the data as it iterates. But what is this called? Can there be more than one variable inside the pipes? Anything else I should know about it? Any good links to more information on it?
For example:
25.times { | i | puts i }

Braces define an anonymous function, called a block. Tokens between the pipes are the parameters of this block. The number of parameters required depends on how the block is used. Each time the block is evaluated, the method requiring the block will pass a value based on the object calling it.
It's the same as defining a method, only it's not stored beyond the method that accepts a block.
For example:
def my_print(i)
puts i
end
will do the same as this when executed:
{|i| puts i}
the only difference is the block is defined on the fly and not stored.
Example 2:
The following statements are equivalent
25.times &method(:my_print)
25.times {|i| puts i}
We use anonymous blocks because the majority of functions passed as a block are usually specific to your situation and not worth defining for reuse.
So what happens when a method accepts a block? That depends on the method. Methods that accept a block will call it by passing values from their calling object in a well defined manner. What's returned depends on the method requiring the block.
For example: in 25.times {|i| puts i}, .times calls the block once for each integer from 0 up to, but not including, its receiver (here 0 through 24), passing the value into the block as the temporary variable i. times returns the receiver, in this case 25.
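To make the point about return values concrete, here is a small sketch using only core methods. The wrapper name return_values_demo is made up for illustration. The same kind of block gives a different result depending on which method receives it:

```ruby
# Hypothetical helper that gathers the return values of three core methods.
def return_values_demo
  [
    3.times { |i| i * 2 },            # times ignores the block's result and returns its receiver
    [1, 2, 3].map { |i| i * 2 },      # map collects the block's results
    [1, 2, 3].select { |i| i.odd? }   # select keeps elements for which the block is truthy
  ]
end

p return_values_demo  # => [3, [2, 4, 6], [1, 3]]
```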
Let's look at a method that accepts a block with two arguments.
{:key1 => "value1", :key2 => "value2"}.each {|key,value|
puts "This key is: #{key}. Its value is #{value}"
}
In this case each calls the block once for each key/value pair, passing the key as the first argument and the value as the second argument.

The pipes specify arguments that are populated with values by the function that calls your block. There can be zero or more of them, and how many you should use depends on the method you call.
For example, each_with_index uses two variables and puts the element in one of them and the index in the other.
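As a rough illustration of "zero or more", the sketch below uses blocks with zero, one and two parameters. The helper name block_parameter_demo is an assumption for the example; times, map and each_with_index are the real core methods:

```ruby
# Hypothetical helper; only times, map and each_with_index are real core methods.
def block_parameter_demo
  ticks = 0
  2.times { ticks += 1 }                             # zero parameters: the index is ignored
  upcased = %w[a b].map { |letter| letter.upcase }   # one parameter: the element
  labelled = []
  %w[a b].each_with_index do |letter, index|         # two parameters: element, then index
    labelled << "#{index}:#{letter}"
  end
  [ticks, upcased, labelled]
end

p block_parameter_demo  # => [2, ["A", "B"], ["0:a", "1:b"]]
```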
here is a good description of how blocks and iterators work

Block arguments follow all the same conventions as method parameters (at least as of 1.9): you can define optional arguments, variable length arg lists, defaults, etc. Here's a pretty decent summary.
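A minimal sketch of what that looks like in practice, assuming Ruby 1.9 or later. Both call_with and describe_args are hypothetical helpers written for this example:

```ruby
# Hypothetical helper that forwards its arguments to whatever block it is given.
def call_with(*values)
  yield(*values)
end

# The block takes a required parameter, a parameter with a default, and a splat.
def describe_args(*values)
  call_with(*values) { |first, second = 10, *rest| [first, second, rest] }
end

p describe_args(1)           # => [1, 10, []]
p describe_args(1, 2, 3, 4)  # => [1, 2, [3, 4]]
```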
Some things to be aware of: because blocks see variables in the scope they were defined in, if you pass in an argument with the same name as an existing variable, it will "shadow" it: your block will see the passed-in value and the original variable will be unchanged.
i = 10
25.times { | i | puts i }
puts i #=> prints '10'
Will print '10' at the end. Because sometimes this is desirable behavior even if you are not passing in a value (ie you want to make sure you don't accidentally clobber a variable from surrounding scope) you can specify block-local variable names after a semicolon after the argument list:
x = 'foo'
25.times { | i ; x | puts i; x = 'bar' }
puts x #=> prints 'foo'
Here, 'x' is local to the block, even though no value is passed in.

Related

why pass block arguments to a function in ruby?

I'm unclear on why there is a need to pass block arguments when calling a function.
why not just pass in as function arguments and what happens to the block arguments, how are they passed and used?
m.call(somevalue) {|_k, v| v['abc'] = 'xyz'}
module m
def call ( arg1, *arg2, &arg3)
end
end
Ruby, like almost all mainstream programming languages, is a strict language, meaning that arguments are fully evaluated before being passed into the method.
Now, imagine you want to implement (a simplified version of) Integer#times. The implementation would look a little bit like this:
class Integer
def my_times(action_to_be_executed)
raise ArgumentError, "`self` must be non-negative but is `#{inspect}`" if negative?
return if zero?
action_to_be_executed
pred.my_times(action_to_be_executed)
end
end
3.my_times(puts "Hello")
# Hello
0.my_times(puts "Hello")
# Hello
-1.my_times(puts "Hello")
# Hello
# ArgumentError (`self` must be non-negative but is `-1`)
As you can see, 3.my_times(puts "Hello") printed Hello exactly once, instead of thrice, as it should do. Also, 0.my_times(puts "Hello") printed Hello exactly once, instead of not at all, as it should do, despite the fact that it returns in the second line of the method, and thus action_to_be_executed is never even evaluated. Even -1.my_times(puts "Hello") printed Hello exactly once, despite the fact that it raises an ArgumentError exception as the very first thing in the method and thus the entire rest of the method body is never evaluated.
Why is that? Because Ruby is strict! Again, strict means that arguments are fully evaluated before being passed. So, what this means is that before my_times even gets called, the puts "Hello" is evaluated (which prints Hello to the standard output stream), and the result of that evaluation (which is just nil because Kernel#puts always returns nil) is passed into the method.
So, what we need to do, is somehow delay the evaluation of the argument. One way we know how to delay evaluation, is by using a method: methods are only evaluated when they are called.
So, we take a page out of Java's playbook, and define a Single Abstract Method Protocol: the argument that is being passed to my_times must be an object which implements a method with a specific name. Let's call it call, because, well, we are going to call it.
This would look a little bit like this:
class Integer
def my_times(action_to_be_executed)
raise ArgumentError, "`self` must be non-negative but is `#{inspect}`" if negative?
return if zero?
action_to_be_executed.call
pred.my_times(action_to_be_executed)
end
end
def (hello = Object.new).call
puts "Hello"
end
3.my_times(hello)
# Hello
# Hello
# Hello
0.my_times(hello)
-1.my_times(hello)
# ArgumentError (`self` must be non-negative but is `-1`)
Nice! It works! The argument that is passed is of course still strictly evaluated before being passed (we can't change the fundamental nature of Ruby from within Ruby itself), but this evaluation only results in the object that is bound by the local variable hello. The code that we want to run is another layer of indirection away and will only be executed at the point where we actually call it.
It also has another advantage: Integer#times actually makes the index of the current iteration available to the action as an argument. This was impossible to implement with our first solution, but here we can do it, because we are using a method and methods can take arguments:
class Integer
def my_times(action_to_be_executed)
raise ArgumentError, "`self` must be non-negative but is `#{inspect}`" if negative?
__my_times_helper(action_to_be_executed)
end
protected
def __my_times_helper(action_to_be_executed, index = 0)
return if zero?
action_to_be_executed.call(index)
pred.__my_times_helper(action_to_be_executed, index + 1)
end
end
def (hello = Object.new).call(i)
puts "Hello from iteration #{i}"
end
3.my_times(hello)
# Hello from iteration 0
# Hello from iteration 1
# Hello from iteration 2
0.my_times(hello)
-1.my_times(hello)
# ArgumentError (`self` must be non-negative but is `-1`)
However, this is not actually very readable. If you didn't want to give a name to this action that we are trying to pass but instead simply literally write it down inside the argument list, it would look something like this:
3.my_times(Object.new.tap do |obj|
def obj.call(i)
puts "Hello from iteration #{i}"
end
end)
# Hello from iteration 0
# Hello from iteration 1
# Hello from iteration 2
or on one line:
3.my_times(Object.new.tap do |obj| def obj.call(i); puts "Hello from iteration #{i}" end end)
# Hello from iteration 0
# Hello from iteration 1
# Hello from iteration 2
# or:
3.my_times(Object.new.tap {|obj| def obj.call(i); puts "Hello from iteration #{i}" end })
# Hello from iteration 0
# Hello from iteration 1
# Hello from iteration 2
Now, I don't know about you, but I find that pretty ugly.
In Ruby 1.9, Ruby added Proc literals aka stabby lambda literals to the language. Lambda literals are a concise literal syntax for writing objects with a call method, specifically Proc objects with Proc#call.
Using lambda literals, and without any changes to our existing code, it looks something like this:
3.my_times(-> i { puts "Hello from iteration #{i}" })
# Hello from iteration 0
# Hello from iteration 1
# Hello from iteration 2
This does not look bad!
When Yukihiro "matz" Matsumoto designed Ruby almost thirty years ago in early 1993, he did a survey of the core libraries and standard libraries of languages like Smalltalk, Scheme, and Common Lisp to figure out how such methods that take a piece of code as an argument are actually used, and he found that the overwhelming majority of such methods take exactly one code argument and all they do with that argument is call it.
So, he decided to add special language support for a single argument that contains code and can only be called. This argument is both syntactically and semantically lightweight, in particular, it looks syntactically exactly like any other control structure, and it is semantically not an object.
This special language feature, as you probably guessed, is the block.
Every method in Ruby has an optional block parameter. I can always pass a block to a method. It's up to the method to do anything with the block. Here, for example, the block is useless because Kernel#puts doesn't do anything with a block:
puts("Hello") { puts "from the block" }
# Hello
Because blocks are not objects, you cannot call methods on them. Also, because there can be only one block argument, there is no need to give it a name: if you refer to a block, it's always clear which block because there can be only one. But, if the block doesn't have methods and doesn't have a name, how can we call it?
That's what the yield keyword is for. It temporarily "yields" control flow to the block, or, in other words, it calls the block.
With blocks, our solution would look like this:
class Integer
def my_times(&action_to_be_executed)
raise ArgumentError, "`self` must be non-negative but is `#{inspect}`" if negative?
return enum_for(__callee__) unless block_given?
__my_times_helper(&action_to_be_executed)
end
protected
def __my_times_helper(index = 0, &action_to_be_executed)
return if zero?
yield index
pred.__my_times_helper(index + 1, &action_to_be_executed)
end
end
3.my_times do |i|
puts "Hello from iteration #{i}"
end
# Hello from iteration 0
# Hello from iteration 1
# Hello from iteration 2
0.my_times do |i|
puts "Hello from iteration #{i}"
end
-1.my_times do |i|
puts "Hello from iteration #{i}"
end
# ArgumentError (`self` must be non-negative but is `-1`)
Okay, you might notice that I simplified a bit when I wrote above that the only thing you can do with a block is call it. There are two other things you can do with it:
You can check whether a block argument was passed using Kernel#block_given?. Since blocks are always optional, and blocks have no names, there must be a way to check whether a block was passed or not.
You can "roll up" a block (which is not an object and doesn't have a name) into a Proc object (which is an object) and bind it to a parameter (which gives it a name) using the & ampersand unary prefix sigil in the parameter list of the method. Now that we have an object, and a way to refer to it, we can store it in a variable, return it from a method, or (as we are doing here) pass it along as an argument to a different method, which otherwise wouldn't be possible.
There is also the opposite operation: with the & ampersand unary prefix operator, you can "unroll" a Proc object into a block in an argument list; this makes it so that the method behaves as if you had passed the code that is stored inside the Proc as a literal block argument to the method.
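The two directions of & can be sketched like this. The method names capture and apply_to_each are hypothetical, chosen only for this illustration:

```ruby
# & in the parameter list: the block arrives as a Proc object (or nil if none was given).
def capture(&block)
  block
end

# This method only ever uses yield, so it expects a real block.
def apply_to_each(items)
  items.map { |item| yield item }
end

doubler = capture { |n| n * 2 }
p doubler.class                       # => Proc
p capture                             # => nil

# & in the argument list: the Proc is passed as if it were a literal block.
p apply_to_each([1, 2, 3], &doubler)  # => [2, 4, 6]
```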
And there you have it! That's what blocks are for: a semantically and syntactically lightweight form of passing code to a method.
There are other possible approaches, of course. The approach that is closest to Ruby is probably Smalltalk. Smalltalk also has a concept called blocks (in fact, that is where Ruby got both the idea and the name from). Similarly to Ruby, Smalltalk blocks have a syntactically light-weight literal form, but they are objects, and you can pass more than one to a method. Thanks to Smalltalk's generally light-weight and simple syntax, especially the keyword method syntax which intersperses parts of the method name with the arguments, even passing multiple blocks to a method call is very concise and readable.
For example, Smalltalk actually does not have an if / then / else conditional expression, in fact, Smalltalk has no control structures at all. Everything is done with methods. So, the way that a conditional works, is that the two boolean classes True and False each have a method named ifTrue:ifFalse: which takes two block arguments, and the two implementations will simply either evaluate the first or the second block. For example, the implementation in True might look a little bit like this (note that Smalltalk has no syntax for classes or methods, instead classes and methods are created in the IDE by creating class objects and method objects via the GUI):
True>>ifTrue: trueBlock ifFalse: falseBlock
"Answer with the value of `trueBlock`."
↑trueBlock value
The corresponding implementation in False would then look like this:
False>>ifTrue: trueBlock ifFalse: falseBlock
"Answer with the value of `falseBlock`."
↑falseBlock value
And you would call it like this:
2 < 3 ifTrue: [ Transcript show: 'yes' ] ifFalse: [ Transcript show: 'no' ].
"yes"
4 < 3 ifTrue: [ Transcript show: 'yes' ] ifFalse: [ Transcript show: 'no' ].
"no"
In ECMAScript, you can simply use function definitions as expressions, and there is also lightweight syntax for functions.
In the various Lisps, code is just data, and data is code, so you can just pass the code as an argument as data, then inside the function, treat that data as code again.
Scala has call-by-name parameters which are only evaluated when you use their name, and they are evaluated every time you use their name. It would look something like this:
implicit class IntegerTimes(val i: Int) extends AnyVal {
@scala.annotation.tailrec
def times(actionToBeExecuted: => Unit): Unit = {
if (i < 0) throw new Error()
if (i == 0) () else { actionToBeExecuted; (i - 1).times(actionToBeExecuted) }
}
}
3.times { println("Hello") }
// Hello
// Hello
// Hello

can somebody explain how does the following code execute?

I am following a linked tutorial from the Odin Project; it's about blocks and procs in Ruby. I can't quite understand how the following code works.
class Array
def eachEven(&wasABlock_nowAProc)
# We start with "true" because arrays start with 0, which is even.
isEven = true
self.each do |object|
if isEven
wasABlock_nowAProc.call object
end
isEven = (not isEven) # Toggle from even to odd, or odd to even.
end
end
end
['apple', 'bad apple', 'cherry', 'durian'].eachEven do |fruit|
puts 'Yum! I just love '+fruit+' pies, don\'t you?'
end
# Remember, we are getting the even-numbered elements
# of the array, all of which happen to be odd numbers,
# just because I like to cause problems like that.
[1, 2, 3, 4, 5].eachEven do |oddBall|
puts oddBall.to_s+' is NOT an even number!'
end
Is ['apple', 'bad apple', 'cherry', 'durian'] a block in this context and are we calling the method isEven on that block?
Is isEven only used to return true or false, and if it is true, will the following code be executed?
do |fruit|
puts 'Yum! I just love '+fruit+' pies, don\'t you?'
end
Also, what is this line doing?
self.each do |object|
if isEven
wasABlock_nowAProc.call object
end
end
If isEven is true then call [1, 2, 3, 4, 5] with the object??? What does calling that block with object mean?
Let's do it in parts:
1) The class Array is built into Ruby, which means we are adding a method, eachEven, to all instances of Array.
2) This method receives as parameter a block to be executed, keep this information in mind.
3) ["apple", "bad apple", "cherry"] is an instance of Array, which means that we can call the method eachEven on this array:
array = ["apple", "bad apple", "cherry"]
array.eachEven do |something|
# The do/end block is the parameter passed to the method `eachEven`
# the block will be bound to `wasABlock_nowAProc` in this case
end
4) Inside the method eachEven we take self (the array itself) and call another method of the Array instance: each (this method iterates over the array, binding the current element to the variable between the pipes: |object|)
5) If the condition returns a positive result, it will execute the block inside if, in the case:
wasABlock_nowAProc.call object
# We execute the block of step 2 passing the current position value as a parameter
In fact, if we execute the following code:
array = [1, 2, 3, 4]
array.eachEven do |position_value|
puts "The #{position_value} is even"
end
We get the following result:
The 1 is even # The block `wasABlock_nowAProc` will bind the 1 to the object and print it
The 3 is even # Same here, 3 will be used as the object in the execution of `wasABlock_nowAProc`
Hope it helps
Let's break apart the code here:
['apple', 'bad apple', 'cherry', 'durian'].eachEven do |fruit|
puts 'Yum! I just love '+fruit+' pies, don\'t you?'
end
What we have here boils down to:
receiver.method do |block_argument_one|
# this is the _body_ of the _block_
end
So:
['apple', 'bad apple', 'cherry', 'durian'] is called the receiver (or subject, or just object or instance)
eachEven is the method being called on the receiver
Everything from do to end is the block. It could also be { to } and work the same (well, mostly)
|fruit| is the block arguments list, with fruit being the only argument the block cares about.
puts … is the body of the block
What happens to the block is:
The code in the block gets interpreted, but not run
A placeholder for that code is passed to the method the block is attached to
the method runs, and can access the block while running
Now let's look at how a method that takes a block works:
class SomeClass
def some_method(regular_argument, &block_capture_argument)
# method body
# explicitly call the block:
block_capture_argument.call("first value passed to block")
# implicitly call the block (same as above)
yield "first value passed to block"
end
end
This shows several ways a block can be used:
When you define a method with the last argument beginning with &, a reference to the block is made available to the method by the name after the & (your wasABlock_nowAProc argument, for example). Then your method can do what it likes with the block, maybe calling it, or maybe even storing it somewhere a completely different method can use it.
Alternatively, you can use the yield keyword to call the block implicitly. In that case, you don't need a & argument to the method (but it still works if you do have that argument). Note that Ruby allows you to attach a block to any method, regardless of whether it uses that block. Methods can check if there was a block with the block_given? method, or check that the value of the & argument is present.
When you call the block, either with yield or with call, arguments you give to the call method are passed as arguments to the block.
The method can do whatever it wants with the block. It can call it once, twice, 0, or 300 times. It can call it with the same arguments each time or with different arguments each time.
In your specific example, the block gets called (with the value of object) for each item in the receiver, but only if the isEven variable is true.
Also in your specific example, you are calling the block from inside another block (which provides object for you), but don't let that confuse you.
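To see the "block inside another block" part in isolation, here is a rough stand-alone variant of the tutorial's method. The name each_even_item and the choice to take the array as an argument (instead of reopening Array) are assumptions made for this sketch:

```ruby
# Hypothetical stand-alone variant of eachEven; it takes the array as an
# argument instead of adding a method to Array.
def each_even_item(items)
  items.each_with_index do |item, index|  # outer block: called by each_with_index
    yield item if index.even?             # calls the block given to each_even_item
  end
end

each_even_item(['apple', 'bad apple', 'cherry', 'durian']) do |fruit|
  puts "Yum! I just love #{fruit} pies, don't you?"
end
# Yum! I just love apple pies, don't you?
# Yum! I just love cherry pies, don't you?
```

The outer do ... end belongs to each_with_index; the yield inside it hands each even-positioned item to whatever block the caller attached.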
To summarize:
blocks can be attached to any method using either do … end or {…}
blocks don't run unless the method they are attached to decides to call them
methods get called on a receiver
methods that use blocks get to decide how and when to use them
methods that use blocks can call blocks (or use yield) and pass any number of arguments to the block.
blocks can be defined to use those arguments (with the |…| syntax), and can name those arguments whatever they want (what matters is the order/position of the arguments).
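The points above can be sketched with one small method. The name repeat and its behaviour are invented for this example and are not part of Ruby's core library:

```ruby
# Hypothetical method: calls its block once per attempt, passing the
# attempt number and the total, and collects what the block returns.
def repeat(count)
  return "no block given" unless block_given?
  results = []
  count.times { |attempt| results << yield(attempt + 1, count) }
  results
end

p repeat(3) { |attempt, total| "#{attempt}/#{total}" }  # => ["1/3", "2/3", "3/3"]
p repeat(3) { |attempt| attempt * 10 }                  # => [10, 20, 30]
p repeat(0) { raise "never called" }                    # => []
p repeat(2)                                             # => "no block given"
```

The method decides how often the block runs (three times, zero times, or not at all), and the block decides how many of the passed values it cares about.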

How does a code block in Ruby know what variable belongs to an aspect of an object?

Consider the following:
(1..10).inject{|memo, n| memo + n}
Question:
How does n know that it is supposed to store all the values from 1..10? I'm confused how Ruby is able to understand that n can automatically be associated with (1..10) right away, and memo is just memo.
I know Ruby code blocks aren't the same as the C or Java code blocks--Ruby code blocks work a bit differently. I'm confused as to how variables that are in between the upright pipes '|' will automatically be assigned to parts of an object. For example:
hash1 = {"a" => 111, "b" => 222}
hash2 = {"b" => 333, "c" => 444}
hash1.merge(hash2) {|key, old, new| old}
How do '|key, old, new|' automatically assign themselves in such a way such that when I type 'old' in the code block, it is automatically aware that 'old' refers to the older hash value? I never assigned 'old' to anything, just declared it. Can someone explain how this works?
The parameters for the block are determined by the method definition. The definition for reduce/inject is overloaded (docs) and defined in C, but if you wanted to define it, you could do it like so (note, this doesn't cover all the overloaded cases for the actual reduce definition):
module Enumerable
def my_reduce(memo=nil, &blk)
# if a starting memo is not given, it defaults to the first element
# in the list and that element is skipped for iteration
elements = memo ? self : self[1..-1]
memo ||= self[0]
elements.each { |element| memo = blk.call(memo, element) }
memo
end
end
This method definition determines what values to use for memo and element and calls the blk variable (a block passed to the method) with them in a specific order.
Note, however, that blocks are not like regular methods, because they don't check the number of arguments. For example: (note, this example shows the usage of yield, which is another way to call the block passed to a method)
def foo
yield 1
end
# The b and c variables here will be nil
foo { |a, b, c| [a,b,c].compact.sum } # => 1
You can also use deconstruction to define variables at the time you run the block, for example if you wanted to reduce over a hash you could do something like this:
# this just copies the hash
{a: 1}.reduce({}) { |memo, (key, val)| memo[key] = val; memo }
How this works: Hash#each yields each entry as a two-element [key, value] array, the same pairs you get from to_a (e.g. {a: 1}.to_a # => [[:a, 1]]). reduce passes each pair as the second argument to the block. In the place where the block is called, the pair is destructured into separate key and value variables.
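Here is a short sketch of the same parenthesised-parameter trick in two places. The helper names invert_hash and label_pairs are hypothetical:

```ruby
# Destructure the [key, val] pair that reduce passes as the second argument.
def invert_hash(hash)
  hash.reduce({}) do |memo, (key, val)|
    memo[val] = key
    memo
  end
end

# each_with_index yields [[key, val], index], so the pair comes first here.
def label_pairs(hash)
  hash.each_with_index.map { |(key, val), index| "#{index}: #{key}=#{val}" }
end

p invert_hash({ a: 1, b: 2 })  # => {1=>:a, 2=>:b}
p label_pairs({ a: 1, b: 2 })  # => ["0: a=1", "1: b=2"]
```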
A code block is just a function with no name. Like any other function, it can be called multiple times with different arguments. If you have a method
def add(a, b)
a + b
end
How does add know that sometimes a is 5 and sometimes a is 7?
Enumerable#inject simply calls the function once for each element, passing the element as an argument.
It looks a bit like this:
module Enumerable
def inject(memo)
each do |el|
memo = yield memo, el
end
memo
end
end
And memo is just memo
what do you mean, "just memo"? memo and n take whatever values inject passes. And it is implemented to pass accumulator/memo as first argument and current collection element as second argument.
How do '|key, old, new|' automatically assign themselves
They don't "assign themselves". merge assigns them. Or rather, passes those values (key, old value, new value) in that order as block parameters.
If you instead write
hash1.merge(hash2) {|foo, bar, baz| bar}
It'll still work exactly as before. Parameter names mean nothing [here]. It's actual values that matter.
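To see where those three values come from, here is a simplified sketch of how a merge-like method could be written in Ruby. The real Hash#merge is implemented in C; my_merge is a hypothetical stand-in that only mimics the behaviour described above:

```ruby
# Hypothetical, simplified stand-in for Hash#merge.
def my_merge(left, right)
  result = left.dup
  right.each do |key, new_value|
    result[key] =
      if result.key?(key) && block_given?
        # The method, not the block, decides the order: key, old value, new value.
        yield key, result[key], new_value
      else
        new_value
      end
  end
  result
end

hash1 = { "a" => 111, "b" => 222 }
hash2 = { "b" => 333, "c" => 444 }
p my_merge(hash1, hash2) { |key, old, new_value| old }  # => {"a"=>111, "b"=>222, "c"=>444}
p my_merge(hash1, hash2) { |foo, bar, baz| bar }        # => {"a"=>111, "b"=>222, "c"=>444}
p my_merge(hash1, hash2)                                # => {"a"=>111, "b"=>333, "c"=>444}
```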
Just to simplify some of the other good answers here:
If you are struggling to understand blocks, an easy way to think of them is as a primitive and temporary method that you are creating and executing in place, and the values between the pipe characters, such as |memo|, are simply the argument signature.
There is no special concept behind the arguments, they are simply there for the method you are invoking to pass a variable to, like calling any other method with an argument. Similar to a method, the arguments are "local" variables within the scope of the block (there are some nuances to this depending on the syntax you use to call the block, but I digress, that is another matter).
The method you pass the block to simply invokes this "temporary method" and passes it the arguments it is designed to pass. Just like calling a method normally, with some slight differences, such as there are no "required" arguments. If you do not define any arguments to receive, it will happily just not pass them instead of raising an ArgumentError. Likewise, if you define too many arguments for the block to receive, they will simply be nil within the block, no errors for not being defined.

In Ruby, is an if/elsif/else statement's subordinate block the same as a 'block' that is passed as a parameter?

I was doing some reading on if/elsif/else in Ruby, and I ran into some differences in terminology when describing how control expressions work.
In the Ruby Programming Wikibooks (emphasis added):
A conditional Branch takes the result of a test expression and executes a block of code depending whether the test expression is true or false.
and
An if expression, for example, not only determines whether a subordinate block of code will execute, but also results in a value itself.
Ruby-doc.org, however, does not mention blocks at all in the definitions:
The simplest if expression has two parts, a “test” expression and a “then” expression. If the “test” expression evaluates to a true then the “then” expression is evaluated.
Typically, when I have read about 'blocks' in Ruby, it has almost always been within the context of procs and lambdas. For example, rubylearning.com defines a block:
A Ruby block is a way of grouping statements, and may appear only in the source adjacent to a method call; the block is written starting on the same line as the method call's last parameter (or the closing parenthesis of the parameter list).
The questions:
When talking about blocks of code in Ruby, are we talking about
the group of code that gets passed in to a method or are we simply
talking about a group of code in general?
Is there a way to easily differentiate between the two (and is there
a technical difference between the two)?
Context for these questions: I am wondering if referring to the code inside of conditionals as blocks will be confusing to new Ruby programmers when they are later introduced to blocks, procs, and lambdas.
TL;DR if...end is an expression, not a block
The proper use of the term block in Ruby is the code passed to a method in between do...end or curly braces {...}. A block can be and often is implicitly converted into a Proc within a method by using the &block syntax in the method signature. This new Proc is an object with its own methods that can be passed to other methods, stored in variables and data structures, called repeatedly, etc...
def block_to_proc(&block)
prc = block
puts prc
prc.class
end
block_to_proc { 'inside the block' }
# "#<Proc:0x007fa626845a98@(irb):21>"
# => Proc
In the code above, a Proc is being implicitly created with the block as its body and assigned to the variable block. Likewise, a Proc (or a lambda, a type of Proc) can be "expanded" into blocks and passed to methods that are expecting them, by using the &block syntax at the end of an arguments list.
def proc_to_block
result = yield # only the return value of the block can be saved, not the block itself
puts result
result.class
end
block = Proc.new { 'inside the Proc' }
proc_to_block(&block)
# "inside the Proc"
# => String
Although there's somewhat of a two-way street between blocks and Procs, they're not the same. Notice that to define a Proc we had to pass a block to Proc.new. Strictly speaking a block is just a chunk of code passed to a method whose execution is deferred until explicitly called. A Proc is defined with a block, its execution is also deferred until called, but it is a bona fide object just like any other. A block cannot survive on its own, a Proc can.
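As a rough illustration of a Proc outliving the method call that received the block, consider this sketch. The HandlerRegistry class and its method names are entirely hypothetical:

```ruby
# Hypothetical registry that keeps blocks alive by capturing them as Procs.
class HandlerRegistry
  def initialize
    @handlers = {}
  end

  # &handler turns the block into a Proc so it can be stored in the hash.
  def register(name, &handler)
    @handlers[name] = handler
    self
  end

  # Later, possibly much later, call the stored Proc.
  def run(name, *args)
    @handlers.fetch(name).call(*args)
  end
end

registry = HandlerRegistry.new
registry.register(:greet) { |who| "Hello, #{who}!" }
p registry.run(:greet, "world")  # => "Hello, world!"
```

Without the &handler conversion there would be nothing to put in the hash: the block itself is not an object.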
On the other hand, block or block of code is sometimes casually used to refer to any discrete chunk of code enclosed by Ruby keywords terminating with end: if...else...end, begin...rescue...end, def...end, class...end, module...end, until...end. But these are not really blocks, per se, and only really resemble them on the surface. Often they also have deferred execution until some condition is met. But they can stand entirely on their own, and always have return values. Ruby-doc.org's use of "expression" is more accurate.
From wikipedia
An expression in a programming language is a combination of one or
more explicit values, constants, variables, operators, and functions
that the programming language interprets (according to its particular
rules of precedence and of association) and computes to produce ("to
return", in a stateful environment) another value.
This is why you can do things like this
return_value = if 'expression'
true
end
return_value # => true
Try doing that with a block
return_value = do
true
end
# SyntaxError: (irb):24: syntax error, unexpected keyword_do_block
# return_value = do
# ^
A block is not an expression on its own. It needs either yield or a conversion to a Proc to survive. What happens when we pass a block to a method that doesn't want one?
puts("indifferent") { "to blocks" }
# "indifferent"
# => nil
The block is totally lost, it disappears with no return value, no execution, as if it never existed. It needs yield to complete the expression and produce a return value.
class Object
def puts(*args)
super
yield if block_given?
end
end
puts("mindful") { "of blocks" }
# "mindful"
# => "of blocks"

Understanding Ruby Closures

I'm trying to better understand Ruby closures and I came across this example code which I don't quite understand:
def make_counter
n = 0
return Proc.new { n = n + 1 }
end
c = make_counter
puts c.call # => this outputs 1
puts c.call # => this outputs 2
Can someone help me understand what actually happens in the above code when I call c = make_counter? In my mind, here's what I think is happening:
Ruby calls the make_counter method and returns a Proc object where the code block associated with the Proc will be { n = 1 }. When the first c.call is executed, the Proc object will execute the block associated with it, and returns n = 1. However, when the second c.call is executed, doesn't the Proc object still execute the block associated with it, which is still { n = 1 }? I don't get why the output will change to 2.
Maybe I'm not understanding this at all, and it would be helpful if you could provide some clarification on what's actually happening within Ruby.
The block is not evaluated when make_counter is called. The block is evaluated and run when you call the Proc via c.call. So each time you run c.call, the expression n = n + 1 will be evaluated and run. The binding for the Proc will cause the n variable to remain in scope since it (the local n variable) was first declared outside the Proc closure. As such, n will keep incrementing on each iteration.
To clarify this further:
The block that defines a Proc (or lambda) is not evaluated at initialization - the code within is frozen exactly as you see it.
Ok, the code is actually 'evaluated', but not for the purpose of changing the frozen code. Rather, it is checked for any variables that are currently in scope that are being used within the context of the Proc's code block. Since n is a local variable (as it was defined the line before), and it is used within the Proc, it is captured within the binding and comes along for the ride.
When the call method is called on the Proc, it will execute the 'frozen' code within the context of the binding that was captured. So the n that was originally assigned 0 is incremented to 1. When called again, the same n will increment again to 2. And so on...
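A small sketch of what "comes along for the ride" means in practice: each call to make_counter creates a fresh n, and each Proc keeps its own. The variable names a and b are just for illustration. Proc#binding and Binding#local_variable_get are used here only to peek at the captured variable.

```ruby
def make_counter
  n = 0
  Proc.new { n = n + 1 }
end

a = make_counter
b = make_counter

a.call
a.call
b.call

# Each Proc carries its own binding, and therefore its own n.
puts a.binding.local_variable_get(:n) # 2
puts b.binding.local_variable_get(:n) # 1
```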
I always feel that to understand what's going on, it's important to revisit the basics. No one has answered the question of what a Proc is in Ruby, which would be crucial to a newbie reading this post and would help in answering this question.
At a high level, procs are objects that wrap a block of code so that it can be stored in a variable and called later, much like a method.
Proc.new takes a code block as its argument; in this case the block is { n = n + 1 }. Blocks in Ruby are closures: they let you group statements together, encapsulate behavior, and keep access to the variables that were in scope where the block was written.
There are two ways to write blocks in Ruby: curly braces and do...end. The example you provide uses the curly-brace syntax.
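For comparison, here is the same block written both ways. This is only an illustration; the variable names are made up.

```ruby
# Curly braces, usually used for one-liners.
squares_braces = [1, 2, 3].map { |x| x * x }

# do...end, usually used for multi-line blocks.
squares_do_end = [1, 2, 3].map do |x|
  x * x
end

puts squares_braces.inspect # [1, 4, 9]
puts squares_do_end.inspect # [1, 4, 9]
```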
So why use Procs if you can use methods to perform the same functionality?
The answer is that Procs give you more flexibility than methods. With Procs you can store an entire set of processes inside a variable and then call the variable anywhere else in your program.
In this case, the Proc was created inside a method, the Proc returned by that method was stored in a variable called c, and c was then called twice with puts, incrementing the value of n each time.
Like Procs, lambdas also let you store a function in a variable and call it from other parts of a program.
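Lambdas are Proc objects too, but they are not identical to plain procs. One difference that is easy to check is argument handling: a lambda is strict about the number of arguments, a plain proc is lenient. A minimal sketch, with made-up names add_proc and add_lambda:

```ruby
add_proc   = Proc.new { |a, b| (a || 0) + (b || 0) }
add_lambda = lambda { |a, b| a + b }

puts add_proc.lambda?   # false
puts add_lambda.lambda? # true

# A plain proc fills missing arguments with nil.
puts add_proc.call(1)   # 1

# A lambda checks its arguments like a method does.
begin
  add_lambda.call(1)
rescue ArgumentError => e
  puts "lambda rejected the call: #{e.message}"
end
```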
This here:
return Proc.new { n = n + 1 }
Actually returns a Proc object which has a block associated with it. And Ruby creates a binding with blocks! So the execution context is stored for later use, which is why we can increment n. Let me go a bit further into explaining Ruby closures, so you can have a broader idea.
First, we need to clarify the technical term 'binding'. In Ruby, a binding object encapsulates the execution context at some particular scope in a program and retains this context for future use in the program. This execution context includes arguments passed to a method and any local variables defined in the method, any associated blocks, the return stack and the value of self. Take this example:
class SomeClass
  def initialize
    @ivar = 'instance variable'
  end
  def m(param)
    lvar = 'local variable'
    binding
  end
end
b = SomeClass.new.m(100) { 'block executed' }
=> #<Binding:0x007fb354b7aca0>
eval "puts param", b
=> 100
eval "puts lvar", b
=> local variable
eval "puts yield", b
=> block executed
eval "puts self", b
=> #<SomeClass:0x007fb354ad82e8>
eval "puts @ivar", b
=> instance variable
The yield statement might seem a little tricky but it's not. Remember that a binding holds the execution context for later use. So when we invoke yield, it is invoked as if we were still in that execution context, and hence it invokes the block.
It's interesting, you can even reassign the value of the local variables in the closure:
eval "lvar = 'changed in eval'", b
eval "puts lvar", b
=> changed in eval
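The same inspection can be done without building code strings, using Binding#local_variable_get and Binding#local_variable_set. A rough sketch, where capture_context is a made-up method standing in for m above:

```ruby
# Hypothetical method that just hands back its own binding.
def capture_context(param)
  lvar = 'local variable'
  binding
end

b = capture_context(100)

puts b.local_variable_get(:param) # 100
puts b.local_variable_get(:lvar)  # local variable

# Reassign the captured local without eval-ing a string.
b.local_variable_set(:lvar, 'changed without eval')
puts b.eval('lvar')               # changed without eval
```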
Now this is all cute, but not so useful on its own. Bindings become really useful when it comes to blocks. Ruby associates a binding object with a block. So when you create a proc or a lambda, the resulting Proc object holds not just the executable block but also bindings for all the variables used by the block.
You already know that blocks can use local variables and method arguments that are defined outside the block. In the following code, for example, the block associated with the collect iterator uses the method argument n:
# multiply each element of the data array by n
def multiply(data, n)
  data.collect {|x| x*n }
end
puts multiply([1,2,3], 2) # Prints 2,4,6
What is more interesting is that if the block were turned into a proc or lambda, it could access n even after the method to which it is an argument had returned. That's because there is a binding associated with the block of the lambda or proc object! The following code demonstrates:
# Return a lambda that retains or "closes over" the argument n
def multiplier(n)
  lambda {|data| data.collect{|x| x*n } }
end
doubler = multiplier(2) # Get a lambda that knows how to double
puts doubler.call([1,2,3]) # Prints 2,4,6
The multiplier method returns a lambda. Because this lambda is used outside of the scope in which it is defined, we call it a closure; it encapsulates or “closes over” (or just retains) the binding for the method argument n.
It is important to understand that a closure does not just retain the value of the variables it refers to—it retains the actual variables and extends their lifetime. Another way to say this is that the variables used in a lambda or proc are not statically bound when the lambda or proc is created. Instead, the bindings are dynamic, and the values of the variables are looked up when the lambda or proc is executed.
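A quick way to see that the variable itself is retained, not a snapshot of its value: change the variable after the lambda has been created and call the lambda again. The names factor and scale are made up for this sketch.

```ruby
factor = 2
scale = lambda { |x| x * factor }

puts scale.call(10) # 20

# Reassign the outer variable; the lambda sees the new value
# because it looks factor up when it runs, not when it was created.
factor = 3
puts scale.call(10) # 30
```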

Resources