Differences between Proc and Lambda - ruby

Ruby has differences between Procs created via Proc.new and lambda (or the ->() operator in 1.9). It appears that non-lambda Procs will splat an array passed in across the block arguments; Procs created via lambda do not.
p = Proc.new { |a,b| a + b}
p[[1,2]] # => 3
l = lambda { |a,b| a + b }
l[[1,2]] # => ArgumentError: wrong number of arguments (1 for 2)
Does anyone have any insight into the motivations behind this behavior?

There are two main differences between lambdas and non-lambda Procs:
1. Just like methods, lambdas return from themselves, whereas non-lambda Procs return from the enclosing method, just like blocks.
2. Just like methods, lambdas have strict argument checking, whereas non-lambda Procs have loose argument checking, just like blocks.
Or, in short: lambdas behave like methods, non-lambda Procs behave like blocks.
What you are seeing there is an instance of #2. Try it with a block and a method in addition to a non-lambda Proc and a lambda, and you'll see. (Without this behavior, Hash#each would be a real PITA to use, since it does yield an array with two-elements, but you pretty much always want to treat it as two arguments.)
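A minimal sketch of that experiment, assuming two hypothetical helpers (`pair_method` and `yield_pair`) that are not part of the original question. It passes the same single two-element array to a block, a non-lambda Proc, a method, and a lambda:

```ruby
# pair_method and yield_pair are hypothetical names used only for this sketch.
def pair_method(a, b)
  [a, b]
end

def yield_pair
  yield [1, 2] # yields ONE argument: a two-element array
end

pair_proc   = Proc.new { |a, b| [a, b] }
pair_lambda = lambda   { |a, b| [a, b] }

# Loose argument binding: the array is destructured across a and b.
block_result = yield_pair { |a, b| [a, b] }
proc_result  = pair_proc.call([1, 2])

# Strict argument binding: one argument for two parameters is an error.
method_error = begin
  pair_method([1, 2])
rescue ArgumentError => e
  e.class
end

lambda_error = begin
  pair_lambda.call([1, 2])
rescue ArgumentError => e
  e.class
end

p block_result, proc_result, method_error, lambda_error
```

The block and the non-lambda Proc both end up with `a == 1` and `b == 2`, while the method and the lambda both raise `ArgumentError`.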

Related

In Ruby, is an if/elsif/else statement's subordinate block the same as a 'block' that is passed as a parameter?

I was doing some reading on if/elsif/else in Ruby, and I ran into some differences in terminology when describing how control expressions work.
In the Ruby Programming Wikibooks (emphasis added):
A conditional Branch takes the result of a test expression and executes a block of code depending whether the test expression is true or false.
and
An if expression, for example, not only determines whether a subordinate block of code will execute, but also results in a value itself.
Ruby-doc.org, however, does not mention blocks at all in the definitions:
The simplest if expression has two parts, a “test” expression and a “then” expression. If the “test” expression evaluates to a true then the “then” expression is evaluated.
Typically, when I have read about 'blocks' in Ruby, it has almost always been within the context of procs and lambdas. For example, rubylearning.com defines a block:
A Ruby block is a way of grouping statements, and may appear only in the source adjacent to a method call; the block is written starting on the same line as the method call's last parameter (or the closing parenthesis of the parameter list).
The questions:
When talking about blocks of code in Ruby, are we talking about
the group of code that gets passed in to a method or are we simply
talking about a group of code in general?
Is there a way to easily differentiate between the two (and is there
a technical difference between the two)?
Context for these questions: I am wondering if referring to the code inside of conditionals as blocks will be confusing to new Ruby programmers when they are later introduced to blocks, procs, and lambdas.
TL;DR if...end is an expression, not a block
The proper use of the term block in Ruby is the code passed to a method in between do...end or curly braces {...}. A block can be and often is implicitly converted into a Proc within a method by using the &block syntax in the method signature. This new Proc is an object with its own methods that can be passed to other methods, stored in variables and data structures, called repeatedly, etc...
def block_to_proc(&block)
  prc = block
  puts prc
  prc.class
end
block_to_proc { 'inside the block' }
# "#<Proc:0x007fa626845a98@(irb):21>"
# => Proc
In the code above, a Proc is being implicitly created with the block as its body and assigned to the variable block. Likewise, a Proc (or a lambda, a type of Proc) can be "expanded" into blocks and passed to methods that are expecting them, by using the &block syntax at the end of an arguments list.
def proc_to_block
  result = yield # only the return value of the block can be saved, not the block itself
  puts result
  result.class
end
block = Proc.new { 'inside the Proc' }
proc_to_block(&block)
# "inside the Proc"
# => String
Although there's somewhat of a two-way street between blocks and Procs, they're not the same. Notice that to define a Proc we had to pass a block to Proc.new. Strictly speaking a block is just a chunk of code passed to a method whose execution is deferred until explicitly called. A Proc is defined with a block, its execution is also deferred until called, but it is a bonafide object just like any other. A block cannot survive on its own, a Proc can.
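To illustrate "a block cannot survive on its own, a Proc can", here is a small sketch. The `Callbacks` class and its method names are hypothetical, invented only for this example. It captures each block as a Proc with `&block`, stores it, and calls it long after the method that received the block has returned:

```ruby
# Callbacks is a hypothetical class used only for illustration.
class Callbacks
  def initialize
    @handlers = []
  end

  # &block converts the passed block into a Proc object that can be stored.
  def on_event(&block)
    @handlers << block
    self
  end

  # Later, each stored Proc is called like any other object.
  def fire(value)
    @handlers.map { |handler| handler.call(value) }
  end
end

callbacks = Callbacks.new
callbacks.on_event { |v| v * 2 }
callbacks.on_event { |v| "got #{v}" }

results = callbacks.fire(21)
p results
```

Without the `&block` parameter the blocks could only be run via `yield` during the call to `on_event`; there would be nothing to append to `@handlers`.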
On the other hand, block or block of code is sometimes casually used to refer to any discrete chunk of code enclosed by Ruby keywords terminating with end: if...else...end, begin...rescue...end, def...end, class...end, module...end, until...end. But these are not really blocks, per se, and only really resemble them on the surface. Often they also have deferred execution until some condition is met. But they can stand entirely on their own, and always have return values. Ruby-doc.org's use of "expression" is more accurate.
From wikipedia
An expression in a programming language is a combination of one or
more explicit values, constants, variables, operators, and functions
that the programming language interprets (according to its particular
rules of precedence and of association) and computes to produce ("to
return", in a stateful environment) another value.
This is why you can do things like this
return_value = if 'expression'
true
end
return_value # => true
Try doing that with a block
return_value = do
true
end
# SyntaxError: (irb):24: syntax error, unexpected keyword_do_block
# return_value = do
# ^
A block is not an expression on its own. It needs either yield or a conversion to a Proc to survive. What happens when we pass a block to a method that doesn't want one?
puts("indifferent") { "to blocks" }
# "indifferent"
# => nil
The block is totally lost, it disappears with no return value, no execution, as if it never existed. It needs yield to complete the expression and produce a return value.
class Object
  def puts(*args)
    super
    yield if block_given?
  end
end
puts("mindful") { "of blocks" }
# "mindful"
# => "of blocks"

How `[]` works with lambdas

I have this lambda (or is closure the correct usage?) and I understand the usage of .call
def multi(m)
  lambda { |n| n * m }
end
two = multi(2)
two.call(10) #=> 20 #call the proc
But I am trying to understand why/how this works?
two.(20) #=> 40
two[20] #=> 40
I don't know whether it should or shouldn't work. Most of the time I have used square brackets with arrays.
The documentation
prc[params,...] → obj
Invokes the block, setting the block’s parameters to the values in params using something close to method calling semantics. Generates a warning if multiple values are passed to a proc that expects just one (previously this silently converted the parameters to an array). Note that prc.() invokes prc.call() with the parameters given. It’s a syntax sugar to hide “call”.
For procs created using lambda or ->() an error is generated if the wrong number of parameters are passed to a Proc with multiple parameters. For procs created using Proc.new or Kernel.proc, extra parameters are silently discarded.
For your first question, proc.() is a hack because Ruby doesn't let you define () on an object. It's just syntactic sugar for proc.call().
For your second question, using square brackets on a Proc calls it.
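A short sketch of the calling styles side by side. The `Doubler` class is a hypothetical example, not something from the question; it is there to suggest that `.()` and `[]` are ordinary method dispatch that any object can take part in by defining `call` and `[]`:

```ruby
two = lambda { |n| n * 2 }

via_call     = two.call(10)  # explicit method call
via_dot      = two.(10)      # sugar for two.call(10)
via_brackets = two[10]       # Proc#[] invokes the proc
via_yield    = two.yield(10) # Proc#yield invokes the proc as well
via_case_eq  = two === 10    # Proc#=== invokes the proc, handy in case/when

# Doubler is a hypothetical class: defining call and [] is enough
# to make the same syntax work on a plain object.
class Doubler
  def call(n)
    n * 2
  end

  def [](n)
    call(n)
  end
end

doubler = Doubler.new
plain_dot      = doubler.(10)
plain_brackets = doubler[10]

p [via_call, via_dot, via_brackets, via_yield, via_case_eq, plain_dot, plain_brackets]
```

So square brackets are not special to arrays: `Array#[]` and `Proc#[]` are simply two different methods that share a name.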

Lambda vs Proc in terms of memory and efficiency

I understand that there are different situations in which Procs and lambdas should be used (lambda checks number of arguments, etc.), but do they take up different amounts of memory? If so, which one is more efficient?
There are several differences between Lambdas and Procs.
Lambdas have what are known as "diminutive returns". That means a return inside a lambda hands control back to the code that called the lambda, while a return inside a Proc returns from the method in which the Proc was defined.
def proc_demo
  Proc.new { return "return value from Proc" }.call
  "return value from method"
end
def lambda_demo
  lambda { return "return value from lambda" }.call
  "return value from method"
end
proc_demo #=> "return value from Proc"
lambda_demo #=> "return value from method"
Lambdas check the number of parameters passed into them, while Procs do not. For example:
lambda { |a, b| [a, b] }.call(:foo)
#=> #<ArgumentError: wrong number of arguments (1 for 2)>
Proc.new { |a, b| [a, b] }.call(:foo)
#=> [:foo, nil]
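One consequence of the return behavior is worth a sketch. The method names below are hypothetical. A non-lambda Proc whose body contains `return` can only be called while the method that created it is still running; once that method has finished there is nothing left to return from, and Ruby raises `LocalJumpError`. A lambda has no such restriction:

```ruby
# Hypothetical factory methods used only for this sketch.
def make_returning_proc
  Proc.new { return :from_proc }
end

def make_returning_lambda
  lambda { return :from_lambda }
end

orphan_proc = make_returning_proc # the defining method has already returned

proc_outcome = begin
  orphan_proc.call
rescue LocalJumpError => e
  e.class
end

lambda_outcome = make_returning_lambda.call

p proc_outcome, lambda_outcome
```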
The Ruby Language Specification does not prescribe any particular implementation strategy for procs and lambdas, therefore any implementation is free to choose any strategy it wants, ergo any implementation may (or may not) take up completely different amounts of memory. Actually, this isn't just true for lambdas and procs, but for every kind of object. The Ruby Language Specification only prescribes the behavior of the objects, it does not prescribe any particular implementation or representation.
However, since there is only one class to represent both lambdas and procs, it is very likely that they take up the exact same amount of memory, regardless of how they are implemented and represented.
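For anyone who wants to check this on their own interpreter, here is a rough sketch using `ObjectSpace.memsize_of` from the `objspace` standard library. The byte counts are implementation-specific hints (this assumes CRuby), so they are printed rather than asserted; the part that holds everywhere is that both objects are instances of the same class:

```ruby
require 'objspace'

a_proc   = Proc.new { |x| x }
a_lambda = lambda   { |x| x }

same_class       = a_proc.class == a_lambda.class # both are Proc
proc_is_lambda   = a_proc.lambda?
lambda_is_lambda = a_lambda.lambda?

# Implementation-specific: memsize_of is only an estimate, and other
# Ruby implementations may report different numbers.
proc_bytes   = ObjectSpace.memsize_of(a_proc)
lambda_bytes = ObjectSpace.memsize_of(a_lambda)

puts "same class: #{same_class}"
puts "Proc: #{proc_bytes} bytes, lambda: #{lambda_bytes} bytes"
```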
The differences between Proc and lambda are mostly behavior related, and are answered better by Abraham and is also found here
The old answer talked about how Block is faster than lambda as explained and shown at Ruby Monk:Ascent

why does ruby need so many different types of closure?

As far as I can tell, there are essentially three different kinds of closure in Ruby; methods, procs and lambdas. I know that there are differences between them, but could we not just get away having one type that accommodates all possible use-cases?
Methods can already be passed around like procs and lambdas by calling self.method(method_name), and the only significant differences that I'm aware of between procs and lambdas are that lambdas check arity and procs do crazy things when you try to use return. So couldn't we just merge them all into one and be done with it?
As far as I can tell, there are essentially three different kinds of closure in Ruby; methods, procs and lambdas.
No, there are two: methods aren't closures, only procs and lambdas are. (Or at least can be, most of them aren't.)
There are two ways of packaging up a piece of executable code for reuse in Ruby: methods and blocks. Strictly speaking, blocks aren't necessary, you can get by with just methods. But blocks are meant to be extremely light-weight, conceptually, semantically and syntactically. That's not true for methods.
Because they are meant to be light-weight and easy to use, blocks behave differently from methods in some respects, e.g. how arguments are bound to parameters. Block parameters are bound more like the left-hand side of an assignment than like method parameters.
Examples:
Passing a single array to multiple parameters:
def foo(a, b) end
foo([1, 2, 3]) # ArgumentError: wrong number of arguments (1 for 2)
a, b = [1, 2, 3]
# a == 1; b == 2
[[1, 2, 3]].each {|a, b| puts "a == #{a}; b == #{b}" }
# a == 1; b == 2
Passing fewer arguments than parameters:
def foo(a, b, c) end
foo(1, 2) # ArgumentError
a, b, c = 1, 2
# a == 1; b == 2; c == nil
[[1, 2]].each {|a, b, c| puts "a == #{a}; b == #{b}; c == #{c}" }
# a == 1; b == 2; c ==
Passing more arguments than parameters:
def foo(a, b) end
foo(1, 2, 3) # ArgumentError: wrong number of arguments (3 for 2)
a, b = 1, 2, 3
# a == 1; b == 2
[[1, 2, 3]].each {|a, b| puts "a == #{a}; b == #{b}" }
# a == 1; b == 2
[By the way: none of the blocks above are closures.]
This allows, for example, the Enumerable protocol which always yields a single element to the block to work with Hashes: you just make the single element an Array of [key, value] and rely on the implicit array destructuring of the block:
{one: 1, two: 2}.each {|key, value| puts "#{key} is assigned to #{value}" }
is much easier to understand than what you would have to otherwise write:
{one: 1, two: 2}.each {|el| puts "#{el.first} is assigned to #{el.last}" }
Another difference between blocks and methods is that methods use the return keyword to return a value whereas blocks use the next keyword.
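A small sketch of that difference, using two hypothetical methods. Inside a block, `next` ends only the current block invocation and supplies its value, whereas `return` leaves the enclosing method entirely:

```ruby
# Hypothetical methods used only for this sketch.
def double_evens(numbers)
  numbers.map do |n|
    next n if n.odd? # ends this block invocation; n becomes the block's value
    n * 2
  end
end

def first_even(numbers)
  numbers.each do |n|
    return n if n.even? # returns from first_even itself, not just the block
  end
  nil
end

doubled = double_evens([1, 2, 3, 4])
found   = first_even([1, 3, 4, 6])

p doubled, found
```

In `double_evens` every element is still visited, because `next` only skips the rest of one block call. In `first_even` the iteration stops at the first even number, because `return` unwinds out of `each` and out of the method.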
If you agree that it makes sense to have both methods and blocks in the language, then it is just a small step to also accept the existence of both procs and lambdas, because they behave like blocks and methods, respectively:
procs return from the enclosing method (just like blocks) and they bind arguments exactly like blocks do
lambdas return from themselves (just like methods) and they bind arguments exactly like methods do.
IOW: the proc/lambda dichotomy just mirrors the block/method dichotomy.
Note that there are actually quite a lot more cases to consider. For example, what does self mean? Does it mean
1. whatever self was at the point the block was written
2. whatever self is at the point the block is run
3. the block itself
And what about return? Does it mean
1. return from the method the block is written in
2. return from the method the block is run in
3. return from the block itself?
This already gives you nine possibilities, even without taking into account the Ruby-specific peculiarities of parameter binding.
Now, for reasons of encapsulation, option #2 in each list above is a really bad idea, so that reduces our choices somewhat.
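As a sketch of what Ruby actually does for self (the classes `Greeter` and `Other` are hypothetical): a block keeps the self of the place where it was written, even when it is run from inside a different object. Changing self requires an explicit request such as `instance_exec`:

```ruby
# Greeter and Other are hypothetical classes used only for this sketch.
class Greeter
  def name
    "greeter"
  end

  def make_block
    Proc.new { name } # self is captured where the block is written
  end
end

class Other
  def name
    "other"
  end

  def run(prc)
    prc.call # self inside the proc is still the Greeter instance
  end

  def run_rebound(prc)
    instance_exec(&prc) # explicitly evaluates the block with self set to this Other
  end
end

prc = Greeter.new.make_block
normal  = Other.new.run(prc)
rebound = Other.new.run_rebound(prc)

p normal, rebound
```

The plain call answers with the `Greeter`'s name, so the code that runs the block cannot accidentally expose its own internals to it; only the deliberate `instance_exec` variant sees the `Other` instance.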
As always, it's a matter of taste of the language designer. There are other such redundancies in Ruby as well: why do you need both instance variables and local variables? If lexical scopes were objects, then local variables would just be instance variables of the lexical scope and you wouldn't need local variables. And why do you need both instance variables and methods? One of them is enough: a getter/setter pair of methods can replace an instance variable (see Newspeak for an example of such a language) and first-class procedures assigned to instance variables can replace methods (see Self, Python, JavaScript). Why do you need both classes and modules? If you allow classes to be mixed-in, then you can get rid of modules and use classes both as classes and mixins. And why do you need mixins at all? If everything is a method call, classes automatically become mixins anyway (again, see Newspeak for an example). And of course, if you allow inheritance directly between objects you don't need classes at all (see Self, Io, Ioke, Seph, JavaScript)
There is a pretty good explanation at http://www.robertsosinski.com/2008/12/21/understanding-ruby-blocks-procs-and-lambdas/ but I am guessing you want a more deeply philosophical explanation...
I believe the answer to "but could we not just get away having one type that accommodates all possible use-cases?", is that you can get away using just one.
The reason they exist is that ruby is trying to make the developer as productive as possible using expressions from both functional and object oriented paradigms, which makes the different types of closure "syntactic sugar".

Understanding Ruby Closures

I'm trying to better understand Ruby closures and I came across this example code which I don't quite understand:
def make_counter
  n = 0
  return Proc.new { n = n + 1 }
end
c = make_counter
puts c.call # => this outputs 1
puts c.call # => this outputs 2
Can someone help me understand what actually happens in the above code when I call c = make_counter? In my mind, here's what I think is happening:
Ruby calls the make_counter method and returns a Proc object where the code block associated with the Proc will be { n = 1 }. When the first c.call is executed, the Proc object will execute the block associated with it, and returns n = 1. However, when the second c.call is executed, doesn't the Proc object still execute the block associated with it, which is still { n = 1 }? I don't get why the output will change to 2.
Maybe I'm not understanding this at all, and it would be helpful if you could provide some clarification on what's actually happening within Ruby.
The block is not evaluated when make_counter is called. The block is evaluated and run when you call the Proc via c.call. So each time you run c.call, the expression n = n + 1 will be evaluated and run. The binding for the Proc will cause the n variable to remain in scope since it (the local n variable) was first declared outside the Proc closure. As such, n will keep incrementing on each iteration.
To clarify this further:
The block that defines a Proc (or lambda) is not evaluated at initialization - the code within is frozen exactly as you see it.
Ok, the code is actually 'evaluated', but not for the purpose of changing the frozen code. Rather, it is checked for any variables that are currently in scope that are being used within the context of the Proc's code block. Since n is a local variable (as it was defined the line before), and it is used within the Proc, it is captured within the binding and comes along for the ride.
When the call method is called on the Proc, it will execute the 'frozen' code within the context of the binding that was captured. So the n that had originally been assigned 0 is incremented to 1. When called again, the same n will increment again to 2. And so on...
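A quick way to convince yourself that each Proc carries its own captured n is to build two counters from the same method. This sketch reuses make_counter from the question; the variable names `c1` and `c2` are just for illustration:

```ruby
def make_counter
  n = 0
  Proc.new { n = n + 1 }
end

c1 = make_counter # each call to make_counter creates a fresh local n
c2 = make_counter

first_counter  = [c1.call, c1.call, c1.call]
second_counter = c2.call # unaffected by the calls on c1

p first_counter, second_counter
```

The first counter reaches 3 while the second one starts again at 1, because the two Procs closed over two different variables that both happen to be named n.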
I always feel that to understand what's going on, it's important to revisit the basics. No one ever answered the question of what a Proc is in Ruby, which would be crucial to a newbie reading this post and would help in answering this question.
At a high-level, procs are methods that can be stored inside variables.
Procs can also take a code block as their parameter; in this case it took n = n + 1. In other programming languages a block is called a closure. Blocks allow you to group statements together and encapsulate behavior.
There are two ways to create blocks in Ruby. The example you provide is using curly braces syntax.
So why use Procs if you can use methods to perform the same functionality?
The answer is that Procs give you more flexibility than methods. With Procs you can store an entire set of processes inside a variable and then call the variable anywhere else in your program.
In this case, the Proc was created inside a method, the Proc returned by that method was stored in a variable called c, and it was then called with puts, incrementing the value of n each time.
Similar to Procs, Lambdas also allow you to store functions inside a variable and call the method from other parts of a program.
This here:
return Proc.new { n = n + 1 }
Actually, this returns a proc object which has a block associated with it. And Ruby creates a binding with blocks! So the execution context is stored for later use, which is why we can increment n. Let me go a bit further into explaining Ruby closures, so you can have a broader idea.
First, we need to clarify the technical term 'binding'. In Ruby, a binding object encapsulates the execution context at some particular scope in a program and retains this context for future use in the program. This execution context includes arguments passed to a method and any local variables defined in the method, any associated blocks, the return stack and the value of self. Take this example:
class SomeClass
  def initialize
    @ivar = 'instance variable'
  end
  def m(param)
    lvar = 'local variable'
    binding
  end
end
b = SomeClass.new.m(100) { 'block executed' }
=> #<Binding:0x007fb354b7aca0>
eval "puts param", b
=> 100
eval "puts lvar", b
=> local variable
eval "puts yield", b
=> block executed
eval "puts self", b
=> #<SomeClass:0x007fb354ad82e8>
eval "puts @ivar", b
=> instance variable
The last statement might seem a little tricky but it's not. Remember binding holds execution context for later use. So when we invoke yield, it is invoking yield as if it was still in that execution context and hence it invokes the block.
It's interesting, you can even reassign the value of the local variables in the closure:
eval "lvar = 'changed in eval'", b
eval "puts lvar", b
=> changed in eval
Now this is all cute, but not so useful. Bindings are really useful as it pertains to blocks. Ruby associates a binding object with a block. So when you create a proc or a lambda, the resulting Proc object holds not just the executable block but also bindings for all the variables used by the block.
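The binding held by a Proc can be inspected directly through `Proc#binding`. The sketch below reuses the counter idea from the question; `Binding#local_variable_get` needs Ruby 2.1 or later, which is an assumption about the reader's interpreter:

```ruby
def make_counter
  n = 0
  Proc.new { n += 1 }
end

counter = make_counter
counter.call
counter.call

# Proc#binding exposes the context the block was created in.
knows_n  = counter.binding.local_variable_defined?(:n)
captured = counter.binding.local_variable_get(:n)

p knows_n, captured
```

After two calls the captured n is 2, even though make_counter returned long ago and no other code has a way to name that variable.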
You already know that blocks can use local variables and method arguments that are defined outside the block. In the following code, for example, the block associated with the collect iterator uses the method argument n:
# multiply each element of the data array by n
def multiply(data, n)
  data.collect {|x| x*n }
end
puts multiply([1,2,3], 2) # Prints 2,4,6
What is more interesting is that if the block were turned into a proc or lambda, it could access n even after the method to which it is an argument had returned. That's because there is a binding associated to the block of the lambda or proc object! The following code demonstrates:
# Return a lambda that retains or "closes over" the argument n
def multiplier(n)
  lambda {|data| data.collect{|x| x*n } }
end
doubler = multiplier(2) # Get a lambda that knows how to double
puts doubler.call([1,2,3]) # Prints 2,4,6
The multiplier method returns a lambda. Because this lambda is used outside of the scope in which it is defined, we call it a closure; it encapsulates or “closes over” (or just retains) the binding for the method argument n.
It is important to understand that a closure does not just retain the value of the variables it refers to—it retains the actual variables and extends their lifetime. Another way to say this is that the variables used in a lambda or proc are not statically bound when the lambda or proc is created. Instead, the bindings are dynamic, and the values of the variables are looked up when the lambda or proc is executed.
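A minimal sketch of "retains the actual variables, not their values". The names `x`, `read_x`, and `bump` are made up for this example. The lambda sees a reassignment that happens after it was created, and a change made inside a lambda is visible outside it:

```ruby
x = 1
read_x = lambda { x } # closes over the variable x, not the value 1

x = 2
seen_after_reassign = read_x.call # the lambda looks x up when it runs

bump = lambda { x += 10 } # writes go to the same shared variable
bump.call
x_after_bump = x

p seen_after_reassign, x_after_bump
```

If closures copied values at creation time, the first result would be 1 and the outer x would still be 2 at the end; instead the results are 2 and 12.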

Resources