How does ruby unpack arguments passed into Proc? - ruby

a_proc = Proc.new {|a,b,*c| p c; c.collect {|i| i*b }}
puts a_proc[2,2,4,3]
Code above is pretty intuitive according to https://ruby-doc.org/core-2.2.0/Proc.html, a_proc[2,2,4,3] is just a syntax sugar for a_proc.call(2,2,4,3) to hide “call”
But the following (works well) confused me a lot
a=[2,2,4,3]
puts a_proc.call(a)
puts a_proc.call(*a)
It seems very different from a normal function call, cause it doesn't check the number arguments passed in.
However, as expected the method calling semantics will raise an error if using parameters likewise
def foo(a,b,*c)
c.collect{|i| i*b}
end
foo([1,2,3,4]) #`block in <main>': wrong number of arguments (given 1, expected 2+) (ArgumentError)
foo(*[1,2,3,4]) #works as expected
I do not think such an inconsistency as a design glitch, so any insights on this will be appreciated.

Blocks use different semantics than methods for binding arguments to parameters.
Block semantics are more similar to assignment semantics than to method semantics in this regard. In fact, in older versions of Ruby, blocks literally used assignment for parameter binding, you could write something like this:
class Foo; def bar=(val) puts 'setter called!' end end
some_proc = Proc.new {|$foo, #foo, foo.bar|}
some_proc.call(1, 2, 3)
# setter called!
$foo #=> 1
#foo #=> 2
Thankfully, this is no longer the case since Ruby 1.9. However, some semantics have been retained:
If a block has multiple parameters but receives only a single argument, the argument will be sent a to_ary message (if it isn't an Array already) and the parameters will be bound to the elements of the Array
If a block receives more arguments than it has parameters, it ignores the extra arguments
If a block receives fewer arguments than it has parameters, the extra parameters are bound to nil
Note: #1 is what makes Hash#each work so beautifully, otherwise, you'd always have to deconstruct the array that it passes to the block.
In short, block parameters are bound much the same way as with multiple assignment. You can imagine assignment without setters, indexers, globals, instance variables, and class variables, only local variables, and that is pretty much how parameter binding for blocks work: copy&paste the parameter list from the block, copy&paste the argument list from the yield, put an = sign in between and you get the idea.
Now, you aren't actually talking about a block, though, you are talking about a Proc. For that, you need to know something important: there are two kinds of Procs, which unfortunately are implemented using the same class. (IMO, they should have been two different classes.) One kind is called a lambda and the other kind is usually called a proc (confusingly, since both are Procs).
Procs behave like blocks, both when it comes to parameter binding and argument passing (i.e. the afore-described assignment semantics) and also when it comes to the behavior of return (it returns from the closest lexically enclosing method).
Lambdas behave like methods, both when it comes to parameter binding and argument passing (i.e. strict argument checking) and also when it comes to the behavior of return (it returns from the lambda itself).
A simple mnemonic: "block" and "proc" rhyme, "method" and "lambda" are both Greek.
A small remark to your question:
a_proc[2,2,4,3] is just a syntax sugar for a_proc.call(2,2,4,3) to hide “call”
This is not syntactic sugar. Rather, Proc simply defines the [] method to behave identically to call.
What is syntactic sugar is this:
a_proc.(2, 2, 4, 3)
Every occurrence of
foo.(bar, baz)
gets interpreted as
foo.call(bar, baz)

I believe what might be confusing you are some of the properties of Procs. If they are given a single array argument, they will automatically splat it. Also, ruby blocks in general have some interesting ways of handling block arguments. The behavior you're expecting is what you will get with a Lambda. I suggest reading Proc.lambda? documentation and be careful when calling a ruby block with an array.
Now, let's start with the splat operator and then move to how ruby handles block arguments:
def foo(a, b, *c)
c.map { |i| i * b } # Prefer to use map alias over collect
end
foo([1, 2, 3, 4]) # `block in <main>': wrong number of arguments (given 1, expected 2+) (ArgumentError)
foo(*[1, 2, 3, 4]) # works as expected
So in your argument error, it makes sense: def foo() takes at least two arguments: a, b, and however many with *c. The * is the splat operator. It will turn an array into individual arguments, or in the reverse case here, a variable amount of arguments into an array. So when you say foo([1,2,3,4]), you are giving foo one argument, a, and it is [1,2,3,4]. You are not setting b or *c. What would work is foo(1, 1, 1, 2, 3, 4]) for example because you are setting a, b, and c. This would be the same thing: foo(1, 1, *[1,2,3,4]).
Now foo(*[1, 2, 3, 4]) works as expected because the splat operator (*) is turning that into foo(1, 2, 3, 4) or equivalently foo(1, 2, *[3, 4])
Okay, so now that we have the splat operator covered, let's look back at the following code (I made some minor changes):
a_proc = Proc.new { |a, b, *c| c.map { |i| i * b }}
a = [1, 2, 3, 4]
puts a_proc.call(a)
puts a_proc.call(*a)
Remember that if blocks/procs are given a single array argument they will automatically splat it. So if you have an array of arrays arrays = [[1, 1], [2, 2], [3, 3]] and you do arrays.each { |a, b| puts "#{a}:#{b}" } you are going to get 1:1, 2:2, and 3:3 as the output. As each element is passed as the argument to the block, it sees that it is an array and splats it, assigning the elements to as many of the given block variables as it can. Instead of just putting that array in a such as a = [1, 1]; b = nil, you get a = 1; b = 1. It's doing the same thing with the proc.
a_proc.call([1, 2, 3, 4]) is turned into Proc.new { |1, 2, [3, 4]| c.map { |i| i * b }} and will output [6, 8]. It splits up the arguments automatically it's own.

Related

Why is `to_ary` called from a double-splatted parameter in a code block?

It seems that a double-splatted block parameter calls to_ary on an object that is passed, which does not happen with lambda parameters and method parameters. This was confirmed as follows.
First, I prepared an object obj on which a method to_ary is defined, which returns something other than an array (i.e., a string).
obj = Object.new
def obj.to_ary; "baz" end
Then, I passed this obj to various constructions that have a double splatted parameter:
instance_exec(obj){|**foo|}
# >> TypeError: can't convert Object to Array (Object#to_ary gives String)
->(**foo){}.call(obj)
# >> ArgumentError: wrong number of arguments (given 1, expected 0)
def bar(**foo); end; bar(obj)
# >> ArgumentError: wrong number of arguments (given 1, expected 0)
As can be observed above, only code block tries to convert obj to an array by calling a (potential) to_ary method.
Why does a double-splatted parameter for a code block behave differently from those for a lambda expression or a method definition?
I don't have full answers to your questions, but I'll share what I've found out.
Short version
Procs allow to be called with number of arguments different than defined in the signature. If the argument list doesn't match the definition, #to_ary is called to make implicit conversion. Lambdas and methods require number of args matching their signature. No conversions are performed and that's why #to_ary is not called.
Long version
What you describe is a difference between handling params by lambdas (and methods) and procs (and blocks). Take a look at this example:
obj = Object.new
def obj.to_ary; "baz" end
lambda{|**foo| print foo}.call(obj)
# >> ArgumentError: wrong number of arguments (given 1, expected 0)
proc{|**foo| print foo}.call(obj)
# >> TypeError: can't convert Object to Array (Object#to_ary gives String)
Proc doesn't require the same number of args as it defines, and #to_ary is called (as you probably know):
For procs created using lambda or ->(), an error is generated if wrong number of parameters are passed to the proc. For procs created using Proc.new or Kernel.proc, extra parameters are silently discarded and missing parameters are set to nil. (Docs)
What is more, Proc adjusts passed arguments to fit the signature:
proc{|head, *tail| print head; print tail}.call([1,2,3])
# >> 1[2, 3]=> nil
Sources: makandra, SO question.
#to_ary is used for this adjustment (and it's reasonable, as #to_ary is for implicit conversions):
obj2 = Class.new{def to_ary; [1,2,3]; end}.new
proc{|head, *tail| print head; print tail}.call(obj2)
# >> 1[2, 3]=> nil
It's described in detail in a ruby tracker.
You can see that [1,2,3] was split to head=1 and tail=[2,3]. It's the same behaviour as in multi assignment:
head, *tail = [1, 2, 3]
# => [1, 2, 3]
tail
# => [2, 3]
As you have noticed, #to_ary is also called when when a proc has double-splatted keyword args:
proc{|head, **tail| print head; print tail}.call(obj2)
# >> 1{}=> nil
proc{|**tail| print tail}.call(obj2)
# >> {}=> nil
In the first case, an array of [1, 2, 3] returned by obj2.to_ary was split to head=1 and empty tail, as **tail wasn't able to match an array of[2, 3].
Lambdas and methods don't have this behaviour. They require strict number of params. There is no implicit conversion, so #to_ary is not called.
I think that this difference is implemented in these two lines of the Ruby soruce:
opt_pc = vm_yield_setup_args(ec, iseq, argc, sp, passed_block_handler,
(is_lambda ? arg_setup_method : arg_setup_block));
and in this function. I guess #to_ary is called somewhere in vm_callee_setup_block_arg_arg0_splat, most probably in RARRAY_AREF. I would love to read a commentary of this code to understand what happens inside.

How does Ruby Array #count handle multiple block arguments

When I execute the following:
[[1,1], [2,2], [3,4]].count {|a,b| a != b} # => 1
the block arguments a, b are assigned to the first and the second values of each inner array respectively. I don't understand how this is accomplished.
The only example given in the documentation for Array#count and Enumerable#count with a block uses a single block argument:
ary.count {|x| x % 2 == 0} # => 3
Just like assignments, there's a (not-so-) secret shortcut. If the right-hand-side is an array and the left-hand-side has multiple variables, the array is splatted, so the following two lines are identical:
a, b, c = [1, 2, 3]
a, b, c = *[1, 2, 3]
While not the same thing, blocks have something in the same vein, when the yielded value is an array, and there are multiple parameters. Thus, these two blocks will act the same when you yield [1, 2, 3]:
do |a, b, c|
...
end
do |(a, b, c)|
...
end
So, in your case, the value gets deconstructed, as if you wrote this:
[[1,1], [2,2], [3,4]].count {|(a,b)| a != b} # => 1
If you had another value that you are passing along with the array, you would have to specify the structure explicitly, as the deconstruction of the array would not be automatic in the way we want:
[[1,1], [2,2], [3,4]].each.with_index.count {|e,i| i + 1 == e[1] }
# automatic deconstruction of [[1,1],0]:
# e=[1,1]; i=0
[[1,1], [2,2], [3,4]].each.with_index.count {|(a,b),i| i + 1 == b }
# automatic deconstruction of [[1,1],0], explicit deconstruction of [1,1]:
# a=1; b=1; i=0
[[1,1], [2,2], [3,4]].each.with_index.count {|a,b,i| i + 1 == b }
# automatic deconstruction of [[1,1],0]
# a=[1,1]; b=0; i=nil
# NOT what we want
I have looked at the documentation for Array.count and Enumerable.count and the only example given with a block uses a single block argument ...
Ruby, like almost all mainstream programming languages, does not allow user code to change the fundamental semantics of the language. In other words, you won't find anything about block formal parameter binding semantics in the documentation of Array#count, because block formal parameter binding semantics are specified by the Ruby Language Specification and Array#count cannot possibly change that.
What I don't understand is how this is accomplished.
This has nothing to do with Array#count. This is just standard block formal parameter binding semantics for block formal parameters.
Formal parameter binding semantics for block formal parameters are different from formal parameter binding semantics for method formal parameters. In particular, they are much more flexible in how they handle mismatches between the number of formal parameters and actual arguments.
If there is exactly one block formal parameter and you yield more than one block actual argument, the block formal parameter gets bound to an Array containing the block actual arguments.
If there are more than one block formal parameters and you yield exactly one block actual argument, and that one actual argument is an Array, then the block formal parameters get bound to the individual elements of the Array. (This is what you are seeing in your example.)
If you yield more block actual arguments than the block has formal parameters, the extra actual arguments get ignored.
If you pass fewer actual arguments than the block has formal parameters, then those extra formal parameters are defined but not bound, and evaluate to nil (just like defined but unitialized local variables).
If you look closely, you can see that the formal parameter binding semantics for block formal parameters are much closer to assignment semantics, i.e. you can imagine an assignment with the block formal parameters on the left-hand side of the assignment operator and the block actual arguments on the right-hand side.
If you have a block defined like this:
{|a, b, c|}
and are yielding to it like this:
yield 1, 2, 3, 4
you can almost imagine the block formal parameter binding to work like this:
a, b, c = 1, 2, 3, 4
And if, as is the case in your question, you have a block defined like this:
{|a, b|}
and are yielding to it like this:
yield [1, 2]
you can almost imagine the block formal parameter binding to work like this:
a, b = [1, 2]
Which of course, as you well know, will have this result:
a #=> 1
b #=> 2
Fun fact: up to Ruby 1.8, block formal parameter binding was using actual assignment! You could, for example, define a constant, an instance variable, a class variable, a global variable, and even an attribute writer(!!!) as a formal parameter, and when you yielded to that block, Ruby would literally perform the assignment:
class Foo
def bar=(value)
puts "`#{__method__}` called with `#{value.inspect}`"
#bar = value
end
attr_reader :bar
end
def set_foo
yield 42
end
foo = Foo.new
set_foo {|foo.bar|}
# `bar=` called with `42`
foo.bar
#=> 42
Pretty crazy, huh?
The most widely-used application of these block formal parameter binding semantics is when using Hash#each (or any of the Enumerable methods with a Hash instance as the receiver). The Hash#each method yields a single two-element Array containing the key and the value as an actual argument to the block, but we almost always treat it as if it were yielding the key and value as separate actual arguments. Usually, we prefer writing
hsh.each do |k, v|
puts "The key is #{k} and the value is #{v}"
end
over
hsh.each do |key_value_pair|
k, v = key_value_pair
puts "The key is #{k} and the value is #{v}"
end
And that is exactly equivalent to what you are seeing in your question. I bet you have never asked yourself why you can pass a block with two block formal parameters to Hash#each even though it only yields a single Array? Well, this case is exactly the same. You are passing a block with two block formal parameters to a method that yields a single Array per iteration.

Can you pass arguments to the method in `&method(:method_ name)` in ruby? [duplicate]

I am aware of the shorthand for map that looks like:
[1, 2, 3, 4].map(&:to_s)
> ["1", "2", "3", "4"]
I was told this is shorthand for:
[1, 2, 3, 4].map{|i| i.to_s}
This makes perfect sense. My question is this: It seems there should be an easier way to write:
[1, 2, 3, 4].map{|x| f.call(x)}
for some procedure f. I know the way I just typed isn't all that long to begin with, but I'd contend that neither is the previous example for which the shorthand exists. This example just seems like the complement to the first example: Rather than calling i's to_s method for every i, I wish to call f for every x.
Does such a shorthand exist?
Unfortunately this shorthand notation (which calls "Symbol#to_proc") does not have a way to pass arguments to the method or block being called, so you couldn't even do the following:
array_of_strings.map(&:include, 'l') #=> this would fail
BUT, you are in luck because you don't actually need this shortcut to do what you are trying to do. The ampersand can convert a Proc or Lambda into a block, and vice-versa:
my_routine = Proc.new { |str| str.upcase }
%w{ one two three }.map &my_routine #=> ['ONE', 'TWO', 'THREE']
Note the lack of the colon before my_routine. This is because we don't want to convert the :my_routine symbol into a proc by finding the method and calling .method on it, rather we want to convert the my_routine Proc into a block and pass it to map.
Knowing this, you can even map a native Ruby method:
%w{ one two three }.map &method(:p)
The method method would take the p method and convert it into a Proc, and the & converts it into a block that gets passed to map. As a result, each item gets printed. This is the equivalent of this:
%w{ one two three }.map { |s| p s }
As of Ruby 2.7, numbered parameters (_1, _2, etc) are supported. This syntax can be cryptic, especially with more than one parameter, but it can be useful for simple situations.
[1, 2, 3, 4].map { f.call(_1) }

why does ruby need so many different types of closure?

As far as I can tell, there are essentially three different kinds of closure in Ruby; methods, procs and lambdas. I know that there are differences between them, but could we not just get away having one type that accommodates all possible use-cases?
Methods can already be passed around like procs and lambdas by calling self.method(method_name), and the only significant differences that I'm aware of between procs and lambdas is that lambdas check arity and procs do crazy things when you try to use return. So couldn't we just merge them all into one and be done with it?
As far as I can tell, there are essentially three different kinds of closure in Ruby; methods, procs and lambdas.
No, there are two: methods aren't closures, only procs and lambdas are. (Or at least can be, most of them aren't.)
There are two ways of packaging up a piece of executable code for reuse in Ruby: methods and blocks. Strictly speaking, blocks aren't necessary, you can get by with just methods. But blocks are meant to be extremely light-weight, conceptually, semantically and syntactically. That's not true for methods.
Because they are meant to be light-weight and easy to use, blocks behave different from methods in some respects, e.g. how arguments are bound to parameters. Block parameters are bound more like the left-hand side of an assignment than like method parameters.
Examples:
Passing a single array to multiple parameters:
def foo(a, b) end
foo([1, 2, 3]) # ArgumentError: wrong number of arguments (1 for 2)
a, b = [1, 2, 3]
# a == 1; b == 2
[[1, 2, 3]].each {|a, b| puts "a == #{a}; b == #{b}" }
# a == 1; b ==2
Passing less arguments than parameters:
def foo(a, b, c) end
foo(1, 2) # ArgumentError
a, b, c = 1, 2
# a == 1; b == 2; c == nil
[[1, 2]].each {|a, b, c| puts "a == #{a}; b == #{b}; c == #{c}" }
# a == 1; b == 2; c ==
Passing more arguments than parameters:
def foo(a, b) end
foo(1, 2, 3) # ArgumentError: wrong number of arguments (3 for 2)
a, b = 1, 2, 3
# a == 1; b == 2
[[1, 2, 3]].each {|a, b| puts "a == #{a}; b == #{b}" }
# a == 1; b == 2
[By the way: none of the blocks above are closures.]
This allows, for example, the Enumerable protocol which always yields a single element to the block to work with Hashes: you just make the single element an Array of [key, value] and rely on the implicit array destructuring of the block:
{one: 1, two: 2}.each {|k, v| puts "#{key} is assigned to #{value}" }
is much easier to understand than what you would have to otherwise write:
{one: 1, two: 2}.each {|el| puts "#{el.first} is assigned to #{el.last}" }
Another difference between blocks and methods is that methods use the return keyword to return a value whereas blocks use the next keyword.
If you agree that it makes sense to have both methods and blocks in the language, then it is just a small step to also accept the existence of both procs and lambdas, because they behave like blocks and methods, respectively:
procs return from the enclosing method (just like blocks) and they bind arguments exactly like blocks do
lambdas return from themselves (just like methods) and they bind arguments exactly like methods do.
IOW: the proc/lambda dichotomy just mirrors the block/method dichotomy.
Note that there are actually quite a lot more cases to consider. For example, what does self mean? Does it mean
whatever self was at the point the block was written
whatever self is at the point the block is run
the block itself
And what about return? Does it mean
return from the method the block is written in
return from the method the block is run in
return from the block itself?
This already gives you nine possibilities, even without taking into account the Ruby-specific peculiarities of parameter binding.
Now, for reasons of encapsulation, #2 above are really bad ideas, so that reduces our choices somewhat.
As always, it's a matter of taste of the language designer. There are other such redundancies in Ruby as well: why do you need both instance variables and local variables? If lexical scopes were objects, then local variables would just be instance variables of the lexical scope and you wouldn't need local variables. And why do you need both instance variables and methods? One of them is enough: a getter/setter pair of methods can replace an instance variable (see Newspeak for an example of such a language) and first-class procedures assigned to instance variables can replace methods (see Self, Python, JavaScript). Why do you need both classes and modules? If you allow classes to be mixed-in, then you can get rid of modules and use classes both as classes and mixins. And why do you need mixins at all? If everything is a method call, classes automatically become mixins anyway (again, see Newspeak for an example). And of course, if you allow inheritance directly between objects you don't need classes at all (see Self, Io, Ioke, Seph, JavaScript)
Some pretty good explanation http://www.robertsosinski.com/2008/12/21/understanding-ruby-blocks-procs-and-lambdas/ but i am guessing you want a bit more deeply philosophical explanation...
I believe the answer to "but could we not just get away having one type that accommodates all possible use-cases?", is that you can get away using just one.
The reason they exist is that ruby is trying to make the developer as productive as possible using expressions from both functional and object oriented paradigms, which makes the different types of closure "syntactic sugar".

Other Ruby Map Shorthand Notation

I am aware of the shorthand for map that looks like:
[1, 2, 3, 4].map(&:to_s)
> ["1", "2", "3", "4"]
I was told this is shorthand for:
[1, 2, 3, 4].map{|i| i.to_s}
This makes perfect sense. My question is this: It seems there should be an easier way to write:
[1, 2, 3, 4].map{|x| f.call(x)}
for some procedure f. I know the way I just typed isn't all that long to begin with, but I'd contend that neither is the previous example for which the shorthand exists. This example just seems like the complement to the first example: Rather than calling i's to_s method for every i, I wish to call f for every x.
Does such a shorthand exist?
Unfortunately this shorthand notation (which calls "Symbol#to_proc") does not have a way to pass arguments to the method or block being called, so you couldn't even do the following:
array_of_strings.map(&:include, 'l') #=> this would fail
BUT, you are in luck because you don't actually need this shortcut to do what you are trying to do. The ampersand can convert a Proc or Lambda into a block, and vice-versa:
my_routine = Proc.new { |str| str.upcase }
%w{ one two three }.map &my_routine #=> ['ONE', 'TWO', 'THREE']
Note the lack of the colon before my_routine. This is because we don't want to convert the :my_routine symbol into a proc by finding the method and calling .method on it, rather we want to convert the my_routine Proc into a block and pass it to map.
Knowing this, you can even map a native Ruby method:
%w{ one two three }.map &method(:p)
The method method would take the p method and convert it into a Proc, and the & converts it into a block that gets passed to map. As a result, each item gets printed. This is the equivalent of this:
%w{ one two three }.map { |s| p s }
As of Ruby 2.7, numbered parameters (_1, _2, etc) are supported. This syntax can be cryptic, especially with more than one parameter, but it can be useful for simple situations.
[1, 2, 3, 4].map { f.call(_1) }

Resources