Passing symbol to block methods - ruby

Some block methods such as inject can optionally take a symbol instead of a block:
%w[a b c].inject(&:+)
# => "abc"
%w[a b c].inject(:+)
# => "abc"
%w[a b c].inject("", :+)
# => "abc"
while other block methods such as map cannot:
%w[a b c].map(&:upcase)
# => ["A", "B", "C"]
%w[a b c].map(:upcase)
# => ArgumentError: wrong number of arguments (1 for 0)
Why can't the latter take a symbol?

For inject, a block (or a substitute) is obligatory. If it weren't passed a block, then there has to be at least one argument, the last argument has to be a symbol, and the block would be constructed out of it. Whatever the arity, there is no ambiguity; the last argument is used to construct a block when the block is lacking.
For map, a block is optional. When there is no block given, then the return value would be an Enumerator instance. Hence, from the information whether a block was passed or not, it cannot be decided whether the last argument should be used to construct a block.
In the particular case of map, it does not take an argument, so there is a sense in saying that an extra argument should be taken as a block, but it makes things complicated to judge whether that last argument is to be taken as a block depending on the arity. And it also loses the future possibility of changing the arity of the method.
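To make the rule concrete, here is a minimal sketch of how a method could build a block from a trailing symbol when no block is given. The name my_inject and its exact behaviour are assumptions for illustration only; the real Enumerable#inject is implemented in C and covers more cases.

```ruby
# Hypothetical helper, NOT the real implementation of Enumerable#inject.
def my_inject(enum, *args, &block)
  unless block
    # No block given: the last argument has to be a symbol naming the operation.
    op = args.pop
    raise ArgumentError, "expected a symbol or a block" unless op.is_a?(Symbol)
    block = ->(memo, item) { memo.public_send(op, item) }
  end

  items = enum.to_a.dup
  # Without an initial value, the first element seeds the accumulator.
  memo = args.empty? ? items.shift : args.first
  items.each { |item| memo = block.call(memo, item) }
  memo
end

p my_inject(%w[a b c], :+)
p my_inject(%w[a b c], "", :+)
p my_inject(%w[a b c]) { |memo, item| memo + item }
```

All three calls should produce "abc". The point is that the "block or symbol" decision is made before any iteration starts, which only works because inject never needs to return an Enumerator.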

I'm not sure, but as far as I know the & operator calls to_proc. inject accepts two arguments, first the accumulator and then a proc (or symbol), whereas map accepts only a proc or block. In inject, the accumulator can default to the first item of the enumerable.

It's just a special handling for this special case in some methods and lack of such handling in the others.

Related

How does Ruby Array #count handle multiple block arguments

When I execute the following:
[[1,1], [2,2], [3,4]].count {|a,b| a != b} # => 1
the block arguments a, b are assigned to the first and the second values of each inner array respectively. I don't understand how this is accomplished.
The only example given in the documentation for Array#count and Enumerable#count with a block uses a single block argument:
ary.count {|x| x % 2 == 0} # => 3
Just like assignments, there's a (not-so-) secret shortcut. If the right-hand-side is an array and the left-hand-side has multiple variables, the array is splatted, so the following two lines are identical:
a, b, c = [1, 2, 3]
a, b, c = *[1, 2, 3]
While not the same thing, blocks have something in the same vein, when the yielded value is an array, and there are multiple parameters. Thus, these two blocks will act the same when you yield [1, 2, 3]:
do |a, b, c|
...
end
do |(a, b, c)|
...
end
So, in your case, the value gets deconstructed, as if you wrote this:
[[1,1], [2,2], [3,4]].count {|(a,b)| a != b} # => 1
If you had another value that you are passing along with the array, you would have to specify the structure explicitly, as the deconstruction of the array would not be automatic in the way we want:
[[1,1], [2,2], [3,4]].each.with_index.count {|e,i| i + 1 == e[1] }
# automatic deconstruction of [[1,1],0]:
# e=[1,1]; i=0
[[1,1], [2,2], [3,4]].each.with_index.count {|(a,b),i| i + 1 == b }
# automatic deconstruction of [[1,1],0], explicit deconstruction of [1,1]:
# a=1; b=1; i=0
[[1,1], [2,2], [3,4]].each.with_index.count {|a,b,i| i + 1 == b }
# automatic deconstruction of [[1,1],0]
# a=[1,1]; b=0; i=nil
# NOT what we want
I have looked at the documentation for Array.count and Enumerable.count and the only example given with a block uses a single block argument ...
Ruby, like almost all mainstream programming languages, does not allow user code to change the fundamental semantics of the language. In other words, you won't find anything about block formal parameter binding semantics in the documentation of Array#count, because block formal parameter binding semantics are specified by the Ruby Language Specification and Array#count cannot possibly change that.
What I don't understand is how this is accomplished.
This has nothing to do with Array#count. This is just standard block formal parameter binding semantics for block formal parameters.
Formal parameter binding semantics for block formal parameters are different from formal parameter binding semantics for method formal parameters. In particular, they are much more flexible in how they handle mismatches between the number of formal parameters and actual arguments.
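If there is exactly one block formal parameter and you yield more than one block actual argument, then in Ruby 1.9 and later the block formal parameter gets bound to the first actual argument and the rest are ignored. (Ruby 1.8 bound it to an Array containing all of the block actual arguments and emitted a warning.)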
If there are more than one block formal parameters and you yield exactly one block actual argument, and that one actual argument is an Array, then the block formal parameters get bound to the individual elements of the Array. (This is what you are seeing in your example.)
If you yield more block actual arguments than the block has formal parameters, the extra actual arguments get ignored.
If you yield fewer actual arguments than the block has formal parameters, then those extra formal parameters are defined but not bound, and evaluate to nil (just like defined but uninitialized local variables).
If you look closely, you can see that the formal parameter binding semantics for block formal parameters are much closer to assignment semantics, i.e. you can imagine an assignment with the block formal parameters on the left-hand side of the assignment operator and the block actual arguments on the right-hand side.
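A small sketch that exercises these rules may help. The helper methods (yield_pair, yield_three, yield_one, binding_demo) are made up for this illustration and are not part of any library.

```ruby
# Hypothetical helpers that only yield fixed values.
def yield_pair
  yield [1, 2]      # one argument, which happens to be an Array
end

def yield_three
  yield 1, 2, 3     # three separate arguments
end

def yield_one
  yield 1
end

def binding_demo
  x, y = [1, 2]     # plain multiple assignment, for comparison
  {
    spread:  yield_pair  { |a, b| [a, b] },  # the Array is spread over a and b
    extra:   yield_three { |a, b| [a, b] },  # the third argument is dropped
    missing: yield_one   { |a, b| [a, b] },  # b is left as nil
    assign:  [x, y]
  }
end

p binding_demo
```

The spread, extra and assign entries should all come out as [1, 2], and missing as [1, nil], which matches the assignment analogy.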
If you have a block defined like this:
{|a, b, c|}
and are yielding to it like this:
yield 1, 2, 3, 4
you can almost imagine the block formal parameter binding to work like this:
a, b, c = 1, 2, 3, 4
And if, as is the case in your question, you have a block defined like this:
{|a, b|}
and are yielding to it like this:
yield [1, 2]
you can almost imagine the block formal parameter binding to work like this:
a, b = [1, 2]
Which of course, as you well know, will have this result:
a #=> 1
b #=> 2
Fun fact: up to Ruby 1.8, block formal parameter binding was using actual assignment! You could, for example, define a constant, an instance variable, a class variable, a global variable, and even an attribute writer(!!!) as a formal parameter, and when you yielded to that block, Ruby would literally perform the assignment:
class Foo
def bar=(value)
puts "`#{__method__}` called with `#{value.inspect}`"
@bar = value
end
attr_reader :bar
end
def set_foo
yield 42
end
foo = Foo.new
set_foo {|foo.bar|}
# `bar=` called with `42`
foo.bar
#=> 42
Pretty crazy, huh?
The most widely-used application of these block formal parameter binding semantics is when using Hash#each (or any of the Enumerable methods with a Hash instance as the receiver). The Hash#each method yields a single two-element Array containing the key and the value as an actual argument to the block, but we almost always treat it as if it were yielding the key and value as separate actual arguments. Usually, we prefer writing
hsh.each do |k, v|
puts "The key is #{k} and the value is #{v}"
end
over
hsh.each do |key_value_pair|
k, v = key_value_pair
puts "The key is #{k} and the value is #{v}"
end
And that is exactly equivalent to what you are seeing in your question. I bet you have never asked yourself why you can pass a block with two block formal parameters to Hash#each even though it only yields a single Array? Well, this case is exactly the same. You are passing a block with two block formal parameters to a method that yields a single Array per iteration.

How does a code block in Ruby know what variable belongs to an aspect of an object?

Consider the following:
(1..10).inject{|memo, n| memo + n}
Question:
How does n know that it is supposed to store all the values from 1..10? I'm confused how Ruby is able to understand that n can automatically be associated with (1..10) right away, and memo is just memo.
I know Ruby code blocks aren't the same as the C or Java code blocks--Ruby code blocks work a bit differently. I'm confused as to how variables that are in between the upright pipes '|' will automatically be assigned to parts of an object. For example:
hash1 = {"a" => 111, "b" => 222}
hash2 = {"b" => 333, "c" => 444}
hash1.merge(hash2) {|key, old, new| old}
How do '|key, old, new|' automatically assign themselves in such a way such that when I type 'old' in the code block, it is automatically aware that 'old' refers to the older hash value? I never assigned 'old' to anything, just declared it. Can someone explain how this works?
The parameters for the block are determined by the method definition. The definition for reduce/inject is overloaded (docs) and defined in C, but if you wanted to define it, you could do it like so (note, this doesn't cover all the overloaded cases for the actual reduce definition):
module Enumerable
def my_reduce(memo=nil, &blk)
# if a starting memo is not given, it defaults to the first element
# in the list and that element is skipped for iteration
# (drop and first are used because Enumerable itself does not define [])
elements = memo ? self : drop(1)
memo ||= first
elements.each { |element| memo = blk.call(memo, element) }
memo
end
end
This method definition determines what values to use for memo and element and calls the blk variable (a block passed to the method) with them in a specific order.
Note, however, that blocks are not like regular methods, because they don't check the number of arguments. For example: (note, this example uses yield, which is another way to call the block that was passed to the method)
def foo
yield 1
end
# The b and c variables here will be nil
foo { |a, b, c| [a,b,c].compact.sum } # => 1
You can also use deconstruction to define variables at the time you run the block, for example if you wanted to reduce over a hash you could do something like this:
# this just copies the hash
{a: 1}.reduce({}) { |memo, (key, val)| memo[key] = val; memo }
How this works is, reduce iterates over the hash with each, which yields every entry as a two-element [key, value] array (the same tuples you would get from {a: 1}.to_a, i.e. [[:a, 1]]). reduce passes each tuple as the second argument to the block. In the place where the block is called, the tuple is deconstructed into separate key and value variables.
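As a rough illustration of the difference the parentheses make, the sketch below runs the same reduce once with a plain second parameter and once with a destructuring one. The method names pairs_seen and copy_hash are invented for this example.

```ruby
# Hypothetical helpers for illustration only.

# With a plain second parameter, the block sees each entry as a [key, value] pair.
def pairs_seen(source)
  source.reduce([]) { |memo, pair| memo << pair }
end

# With (key, val), that pair is taken apart before the block body runs.
def copy_hash(source)
  source.reduce({}) do |memo, (key, val)|
    memo[key] = val
    memo
  end
end

p pairs_seen({ a: 1, b: 2 })
p copy_hash({ a: 1, b: 2 })
```

The first call should return the list of pairs [[:a, 1], [:b, 2]], and the second an equal copy of the original hash.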
A code block is just a function with no name. Like any other function, it can be called multiple times with different arguments. If you have a method
def add(a, b)
a + b
end
How does add know that sometimes a is 5 and sometimes a is 7?
Enumerable#inject simply calls the function once for each element, passing the element as an argument.
It looks a bit like this:
module Enumerable
def inject(memo)
each do |el|
memo = yield memo, el
end
memo
end
end
And memo is just memo
what do you mean, "just memo"? memo and n take whatever values inject passes. And it is implemented to pass accumulator/memo as first argument and current collection element as second argument.
How do '|key, old, new|' automatically assign themselves
They don't "assign themselves". merge assigns them. Or rather, passes those values (key, old value, new value) in that order as block parameters.
If you instead write
hash1.merge(hash2) {|foo, bar, baz| bar}
It'll still work exactly as before. Parameter names mean nothing [here]. It's actual values that matter.
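One way to see that merge is the one deciding the values is to record what it passes to the block. The helper merge_calls below is a made-up name for this sketch.

```ruby
# Hypothetical helper: returns the merged hash plus a log of every block call.
def merge_calls(first, second)
  calls = []
  merged = first.merge(second) do |key, old_value, new_value|
    calls << [key, old_value, new_value]
    old_value   # keep the receiver's value, as in the question
  end
  [merged, calls]
end

hash1 = { "a" => 111, "b" => 222 }
hash2 = { "b" => 333, "c" => 444 }
merged, calls = merge_calls(hash1, hash2)
p merged
p calls
```

The log should contain a single entry, ["b", 222, 333], because merge only calls the block for keys present in both hashes, and it always passes the key, the receiver's value and the argument's value in that order.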
Just to simplify some of the other good answers here:
If you are struggling to understand blocks, an easy way to think of them is as a primitive and temporary method that you are creating and executing in place; the values between the pipe characters, as in |memo|, are simply the argument signature.
There is no special concept behind the arguments: they are simply there for the method you are invoking to pass a value to, like calling any other method with an argument. Similar to a method, the arguments are "local" variables within the scope of the block (there are some nuances to this depending on the syntax you use to call the block, but I digress, that is another matter).
The method you pass the block to simply invokes this "temporary method" and passes the arguments to it that it is designed to do. Just like calling a method normally, with some slight differences, such as there are no "required" arguments. If you do not define any arguments to receive, it will happily just not pass them instead of raising an ArgumentError. Likewise, if you define too many arguments for the block to receive, they will simply be nil within the block, no errors for not being defined.
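Here is a short sketch of that leniency, with a lambda added for contrast, since lambdas check their argument count the way methods do. The names call_with_one and lenient_demo are invented for the example.

```ruby
# Hypothetical helper that always yields exactly one value.
def call_with_one
  yield 1
end

def lenient_demo
  no_params = call_with_one { :ignored_argument }  # the 1 is silently dropped
  too_many  = call_with_one { |a, b| [a, b] }      # b becomes nil

  strict = ->(a, b) { [a, b] }                     # a lambda, for contrast
  lambda_error =
    begin
      strict.call(1)
    rescue ArgumentError => e
      e.class
    end

  [no_params, too_many, lambda_error]
end

p lenient_demo
```

The result should be [:ignored_argument, [1, nil], ArgumentError]: ordinary blocks tolerate the mismatch, while the lambda raises.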

What does a single variable name mean in ruby?

In some code, I found:
class Job
@@types = [:a, :b, :c, :d].reduce({}) do |acc, cmd|
acc[cmd] = cmd.to_s
acc
end
# ...
end
There's nothing passed into reduce. What does that mean?
There's a single acc. What does that mean?
reduce is called with an empty hash. This means that the value of acc in the first call to the block will be {}.
In Ruby, the value of the last expression evaluated in a method or block is its return value, so the block returns acc.
You probably need to read what reduce does exactly to understand this code.
There's nothing passed into reduce. What does that mean?
That's not true. There is a positional argument {} passed into reduce as well as a block.
But even if nothing were passed, what's the big deal? There's nothing being passed into to_s either, yet somehow that doesn't seem to bother you.
There's a single acc. What does that mean?
It means the same thing as the acc on the line before: dereference the variable.
Read the documentation for reduce
The first argument ({}) is the initial value of acc, and acc is what will be returned when reduce finishes.
It is just transforming the array into a hash, the final result is:
{:a=>"a", :b=>"b", :c=>"c", :d=>"d"}
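To show why the lone acc line matters, here is the same transformation written three ways. The constant TYPES and the method names are assumptions made for this sketch; only the reduce version comes from the question.

```ruby
TYPES = [:a, :b, :c, :d]

def with_reduce
  TYPES.reduce({}) do |acc, cmd|
    acc[cmd] = cmd.to_s
    acc   # the block's value becomes acc for the next iteration
  end
end

# each_with_object passes the same hash every time, so no trailing acc is needed.
def with_each_with_object
  TYPES.each_with_object({}) { |cmd, acc| acc[cmd] = cmd.to_s }
end

# The block form of to_h needs Ruby 2.6 or later.
def with_to_h
  TYPES.to_h { |cmd| [cmd, cmd.to_s] }
end

p with_reduce == with_each_with_object
p with_reduce == with_to_h
```

Both comparisons should print true. Without the trailing acc, the reduce version would hand the string returned by the assignment to the next iteration instead of the hash.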

Meaning of & in parameters

I saw this code for a method same as each, except it receives a block to run some test against every item:
def every?(&predicate)
predicate = lambda { |item| item } if predicate.nil?
each do |item|
return false if !predicate.call(item)
end
true
end
Why is there a & in the parameter, and what does it do? What are the uses of it?
Sometimes in parameter lists you'll see something like
def foo(&block)
logic_with block
end
This just means that the method expects a block, as in your example.
&predicate just means passing a block as a parameter, which we're assigning to a local variable predicate
You can get a good idea of this from the fact that if predicate is nil, the first line of the method assigns a new lambda to the predicate variable.
For further reading, here's a good post on blocks, procs and lambdas: http://www.robertsosinski.com/2008/12/21/understanding-ruby-blocks-procs-and-lambdas/
EDITED per sawa's explanation below.
My take was you wanted the simple explanation that if you see & in this context it means a block is expected.
If you want to know specifically what the & operator itself actually does there's a good blog post here: http://ablogaboutcode.com/2012/01/04/the-ampersand-operator-in-ruby/
As sawa mentions it's very similar to calling to_proc on the incoming block. From the post I linked to, in more detail:
if object is a block, it converts the block into a simple proc.
if object is a Proc, it converts the object into a block while preserving the lambda? status of the object.
if object is not a Proc, it first calls #to_proc on the object and then converts it into a block.
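These three cases can be checked with a few lines of Ruby. The method names lambda_status and ampersand_demo are hypothetical, chosen only for this sketch.

```ruby
# Hypothetical helper: captures whatever block it receives and reports lambda?.
def lambda_status(&blk)
  blk.lambda?
end

def ampersand_demo
  {
    from_block:  lambda_status { |x| x },       # a literal block becomes a plain proc
    from_lambda: lambda_status(&->(x) { x }),   # a lambda keeps its lambda? status
    from_symbol: %w[a b].map(&:upcase),         # Symbol#to_proc is called first
    symbol_proc: :upcase.to_proc.call("a")      # the same conversion done by hand
  }
end

p ampersand_demo
```

from_block should be false, from_lambda true, from_symbol ["A", "B"] and symbol_proc "A".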
The two operators * and & swap Ruby objects and non-objects.
The operator * prepended to a list of comma-separated objects (which is not an object) converts it into an array (which is an object).
*("foo", "bar", "baz") # => ["foo", "bar", "baz"]
The operator * prepended to an object converts it into an array by applying to_a, and then into a list of comma-separated objects.
*["foo", "bar", "baz"] # => ("foo", "bar", "baz")
*nil # => *[] # => ()
The operator & prepended to a block (which is not an object) converts it into a proc (which is an object).
&{|e| puts e} # => ->(e){puts e}
The operator & prepended to an object converts it into a proc by applying to_proc, and then into a block.
&->(e){puts e} # => {|e| puts e}
&:foo # => &->(e){e.foo} # => {|e| e.foo}
When you have a & in an argument position, the & is prepended to a block, so the third case above applies. The block becomes a proc.
In the context of a method definition, putting an ampersand in front of the last parameter indicates that a method may take a block and gives us a name to refer to this block within the method body.
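A minimal sketch of what that name gives you inside the method body follows. The methods describe_block and double_all are invented for the example.

```ruby
# Hypothetical methods for illustration only.

# The captured block is an ordinary Proc object, or nil when no block was given.
def describe_block(&block)
  return "no block given" if block.nil?
  "got a #{block.class} taking #{block.arity} argument(s)"
end

# A captured block can be handed on to another method by prefixing it with & again.
def double_all(items, &block)
  items.map(&block).map { |x| x * 2 }
end

puts describe_block
puts describe_block { |a, b| }
p double_all([1, 2]) { |x| x + 1 }
```

The first call should report that no block was given, the second should describe a Proc taking 2 arguments, and the last should return [4, 6].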
I often refer to this post when I get confused.

Can someone explain Ruby's use of pipe characters in a block?

Can someone explain to me Ruby's use of pipe characters in a block? I understand that it contains a variable name that will be assigned the data as it iterates. But what is this called? Can there be more than one variable inside the pipes? Anything else I should know about it? Any good links to more information on it?
For example:
25.times { | i | puts i }
Braces define an anonymous function, called a block. Tokens between the pipes are the parameters of this block. The number of parameters required depends on how the block is used. Each time the block is evaluated, the method requiring the block will pass a value based on the object calling it.
It's the same as defining a method, only it's not stored beyond the method that accepts a block.
For example:
def my_print(i)
puts i
end
will do the same as this when executed:
{|i| puts i}
the only difference is the block is defined on the fly and not stored.
Example 2:
The following statements are equivalent
25.times &method(:my_print)
25.times {|i| puts i}
We use anonymous blocks because the majority of functions passed as a block are usually specific to your situation and not worth defining for reuse.
So what happens when a method accepts a block? That depends on the method. Methods that accept a block will call it by passing values from their calling object in a well defined manner. What's returned depends on the method requiring the block.
For example: in 25.times {|i| puts i}, times calls the block once for each integer from 0 up to, but not including, its receiver (0 through 24 here), passing that integer into the block as the block parameter i. times returns its receiver, in this case 25.
Let's look at method that accepts a block with two arguments.
{:key1 => "value1", :key2 => "value2"}.each {|key,value|
puts "This key is: #{key}. Its value is #{value}"
}
In this case each calls the block once for each key/value pair, passing the key as the first argument and the value as the second argument.
The pipes specify arguments that are populated with values by the function that calls your block. There can be zero or more of them, and how many you should use depends on the method you call.
For example, each_with_index uses two variables and puts the element in one of them and the index in the other.
here is a good description of how blocks and iterators work
Block arguments follow all the same conventions as method parameters (at least as of 1.9): you can define optional arguments, variable length arg lists, defaults, etc. Here's a pretty decent summary.
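A few of those parameter forms in action, as a rough sketch; the helper names yield_many, yield_single and parameter_demo are made up for the example.

```ruby
# Hypothetical helpers that only yield fixed values.
def yield_many
  yield 1, 2, 3, 4
end

def yield_single
  yield 1
end

def parameter_demo
  {
    splat:   yield_many   { |first, *rest| [first, rest] },  # variable-length list
    default: yield_single { |a, b = :fallback| [a, b] },     # default value
    nested:  [[1, [2, 3]]].map { |a, (b, c)| a + b + c }     # nested destructuring
  }
end

p parameter_demo
```

This should give [1, [2, 3, 4]] for splat, [1, :fallback] for default and [6] for nested.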
Some things to be aware of: because blocks see variables in the scope they were defined in, if you pass in an argument with the same name as an existing variable, it will "shadow" it: your block will see the passed-in value and the original variable will be unchanged.
i = 10
25.times { | i | puts i }
puts i #=> prints '10'
Will print '10' at the end. Because sometimes this is desirable behavior even if you are not passing in a value (ie you want to make sure you don't accidentally clobber a variable from surrounding scope) you can specify block-local variable names after a semicolon after the argument list:
x = 'foo'
25.times { | i ; x | puts i; x = 'bar' }
puts x #=> prints 'foo'
Here, 'x' is local to the block, even though no value is passed in.
