Why doesn't Array#each_with_object(0) work? - ruby

Why does using 0 as the argument in each_with_object not return the correct value:
[1,2,3].each_with_object(0) {|i,o| o += i }
# => 0
but using an empty array and reduce(:+) does?
[1,2,3].each_with_object([]) {|i,o| o << i }.reduce(:+)
# => 6

From documentation it says:
each_with_object(obj) → an_enumerator
Iterates the given block for each element with an arbitrary object given, and returns the initially given object.
If no block is given, returns an enumerator.
In the array case, the same initial object is returned, just with modified contents.
If we look at the source of each_with_object, it is:
# File activesupport/lib/active_support/core_ext/enumerable.rb, line 79
def each_with_object(memo)
  return to_enum :each_with_object, memo unless block_given?
  each do |element|
    yield element, memo
  end
  memo
end
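As you can see, each_with_object never reassigns memo; it returns the same object it was given. Inside the block, o += i only rebinds the block's local variable o to a new Integer, so if memo is 0 the method still returns 0. If you pass [] and mutate it inside the block (for example with <<), the same array comes back with the values in it.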

There is no Ruby bug in your examples; both behave exactly as documented.
In the first example, the argument 0 is the return value. In the second example, the argument (that appears to be [] initially) is the return value. Only in the latter, the argument had been modified, and ended up looking different from what it looked like at the beginning, but the identity of the object is retained.
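The difference between rebinding a variable and mutating an object can be checked directly. The following is a minimal sketch; the variable names (sum_attempt, initial, collected, total) are illustrative and not from the original question.

```ruby
# Integers are immutable: `o += i` builds a new Integer and rebinds only the
# block-local variable o. The object held by each_with_object is untouched.
sum_attempt = [1, 2, 3].each_with_object(0) { |i, o| o += i }

# Array#<< mutates its receiver, so the very same array accumulates values.
initial = []
collected = [1, 2, 3].each_with_object(initial) { |i, o| o << i }

# inject uses the block's return value as the next accumulator,
# which is what a running sum needs.
total = [1, 2, 3].inject(0) { |acc, i| acc + i }

puts sum_attempt                # 0
p collected                     # [1, 2, 3]
puts collected.equal?(initial)  # true (same object identity)
puts total                      # 6
```

For sums, inject (or Array#sum) is the natural fit; each_with_object suits mutable accumulators such as arrays and hashes.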

Related

How does a code block in Ruby know what variable belongs to an aspect of an object?

Consider the following:
(1..10).inject{|memo, n| memo + n}
Question:
How does n know that it is supposed to store all the values from 1..10? I'm confused how Ruby is able to understand that n can automatically be associated with (1..10) right away, and memo is just memo.
I know Ruby code blocks aren't the same as the C or Java code blocks--Ruby code blocks work a bit differently. I'm confused as to how variables that are in between the upright pipes '|' will automatically be assigned to parts of an object. For example:
hash1 = {"a" => 111, "b" => 222}
hash2 = {"b" => 333, "c" => 444}
hash1.merge(hash2) {|key, old, new| old}
How do '|key, old, new|' automatically assign themselves in such a way such that when I type 'old' in the code block, it is automatically aware that 'old' refers to the older hash value? I never assigned 'old' to anything, just declared it. Can someone explain how this works?
The parameters for the block are determined by the method definition. The definition for reduce/inject is overloaded (docs) and defined in C, but if you wanted to define it, you could do it like so (note, this doesn't cover all the overloaded cases for the actual reduce definition):
module Enumerable
  def my_reduce(memo = nil, &blk)
    # Enumerable does not guarantee #[], so materialize the elements first
    items = to_a
    # if a starting memo is not given, it defaults to the first element
    # in the list and that element is skipped for iteration
    elements = memo.nil? ? items.drop(1) : items
    memo = items.first if memo.nil?
    elements.each { |element| memo = blk.call(memo, element) }
    memo
  end
end
This method definition determines what values to use for memo and element and calls the blk variable (a block passed to the method) with them in a specific order.
Note, however, that blocks are not like regular methods, because they don't check the number of arguments. For example: (note, this example shows the usage of yield which is another way to pass a block parameter)
def foo
  yield 1
end
# The b and c variables here will be nil
foo { |a, b, c| [a,b,c].compact.sum } # => 1
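To make the lenient argument handling of blocks concrete, here is a small sketch. The method names yield_one, yield_three and strict_method are hypothetical and exist only for this illustration.

```ruby
def yield_one
  yield 1
end

def yield_three
  yield 1, 2, 3
end

# A regular method, for contrast: it enforces its argument count.
def strict_method(a, b, c)
  [a, b, c]
end

# More block parameters than yielded values: the extras are nil.
extra_params = yield_one { |a, b, c| [a, b, c] }

# Fewer block parameters than yielded values: the extras are dropped.
fewer_params = yield_three { |a| a }

# Calling the method with too few arguments raises instead.
method_error =
  begin
    strict_method(1)
  rescue ArgumentError => e
    e.class
  end

p extra_params  # [1, nil, nil]
p fewer_params  # 1
p method_error  # ArgumentError
```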
You can also use destructuring to define variables at the time the block runs. For example, if you wanted to reduce over a hash you could do something like this:
# this just copies the hash
{a: 1}.reduce({}) { |memo, (key, val)| memo[key] = val; memo }
This works because reduce iterates the hash with each, which yields every entry as a two-element [key, value] array (the same pairs you get from {a: 1}.to_a, which is [[:a, 1]]). reduce passes each pair as the second argument to the block. The parentheses in the block's parameter list destructure that pair into separate key and value variables.
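A short sketch of the same idea, with and without the parentheses. The prices hash and the variable names are made up for this example.

```ruby
prices = { apple: 3, pear: 5 }

# Without parentheses, the second block parameter receives the whole pair.
pairs = prices.reduce([]) { |memo, pair| memo << pair }

# With parentheses, the pair is destructured into two variables.
total = prices.reduce(0) { |sum, (_name, price)| sum + price }

# The copy from the text above: a new hash with the same entries.
copy = prices.reduce({}) { |memo, (key, val)| memo[key] = val; memo }

p pairs  # [[:apple, 3], [:pear, 5]]
p total  # 8
p copy == prices && !copy.equal?(prices)  # true
```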
A code block is just a function with no name. Like any other function, it can be called multiple times with different arguments. If you have a method
def add(a, b)
  a + b
end
How does add know that sometimes a is 5 and sometimes a is 7?
Enumerable#inject simply calls the function once for each element, passing the element as an argument.
It looks a bit like this:
module Enumerable
  def inject(memo)
    each do |el|
      memo = yield memo, el
    end
    memo
  end
end
And memo is just memo
What do you mean, "just memo"? memo and n take whatever values inject passes. inject is implemented to pass the accumulator (memo) as the first argument and the current collection element as the second.
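One way to see exactly which values inject passes is to record every call the block receives. This is only a tracing sketch; the calls arrays are introduced here for illustration.

```ruby
# Without an initial value, the first element becomes the starting memo
# and iteration begins with the second element.
calls = []
result = (1..4).inject { |memo, n| calls << [memo, n]; memo + n }

# With an initial value, every element is passed as n.
calls_with_initial = []
result_with_initial = (1..4).inject(0) do |memo, n|
  calls_with_initial << [memo, n]
  memo + n
end

p calls                # [[1, 2], [3, 3], [6, 4]]
p result               # 10
p calls_with_initial   # [[0, 1], [1, 2], [3, 3], [6, 4]]
p result_with_initial  # 10
```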
How do '|key, old, new|' automatically assign themselves
They don't "assign themselves". merge assigns them. Or rather, passes those values (key, old value, new value) in that order as block parameters.
If you instead write
hash1.merge(hash2) {|foo, bar, baz| bar}
It'll still work exactly as before. Parameter names mean nothing [here]. It's the actual values, and the order they are passed in, that matter.
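This can be verified by running both versions and by capturing the raw arguments merge passes to the block. The received array is an addition for this sketch.

```ruby
hash1 = { "a" => 111, "b" => 222 }
hash2 = { "b" => 333, "c" => 444 }

keep_old = hash1.merge(hash2) { |key, old, new| old }
renamed  = hash1.merge(hash2) { |foo, bar, baz| bar }

# Capture what merge actually passes; the block is only called for keys
# present in both hashes.
received = []
hash1.merge(hash2) { |*args| received << args; args[1] }

p keep_old             # {"a"=>111, "b"=>222, "c"=>444}
p keep_old == renamed  # true
p received             # [["b", 222, 333]]
```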
Just to simplify some of the other good answers here:
If you are struggling to understand blocks, an easy way to think of them is as a primitive, temporary method that you create and execute in place; the values between the pipe characters, as in |memo|, are simply its argument signature.
There is no special concept behind the arguments. They are simply there for the method you are invoking to pass values to, like calling any other method with an argument. As with a method, the arguments are "local" variables within the scope of the block (there are some nuances to this depending on the syntax you use to call the block, but that is another matter).
The method you pass the block to simply invokes this "temporary method" and passes it whatever arguments it is designed to pass. This is just like calling a method normally, with some slight differences, such as there being no "required" arguments. If the block does not declare any parameters, the values passed to it are silently ignored instead of raising an ArgumentError. Likewise, if the block declares more parameters than it is passed, the extra ones are simply nil within the block, with no error.

why return change variables while inside a class

I cannot understand this ruby behavior, the code explains better what I mean:
class DoNotUnderstand
  def initialize
    @tiny_array = [3,4]
    test
  end
  def messing(ary)
    return [ary[0]+=700, ary[1]+=999]
  end
  def test
    puts @tiny_array.join(",") # before => 3,4
    messing(@tiny_array)
    puts @tiny_array.join(",") # after => 703,1003
  end
end
question = DoNotUnderstand.new
@tiny_array was [3,4] and became [703,1003].
If I don't use a class, this happens:
@tiny = [1,2]
def messing(ary)
  return [ary[0]+693,ary[1]+999]
end
puts @tiny.join(",") # before => 1,2
messing(@tiny)
puts @tiny.join(",") # after => 1,2
The array simply remains [1,2].
Why?
The class is a red herring, and completely irrelevant to the issue.
In the first case, where the array was modified, you defined messing as:
def messing(ary)
  return [ary[0]+=700, ary[1]+=999]
end
Whereas in the second case, where the array was not modified, you defined messing as:
def messing(ary)
  return [ary[0]+693,ary[1]+999]
end
In one case you used +=, and in the other, you used merely +.
ary[0] += 700 is exactly equivalent to ary[0] = ary[0] + 700. In other words you are changing the value stored in the 0th index of ary.
In the second case you merely add to the values stored in the array and return the result, but in the first case you not only return the result, you also store it back in the array.
For an explanation of why modifying ary modifies @tiny_array, see this answer to the question Is Ruby pass by reference or by value?.
Your second code example (the one from outside the class) is missing the two characters that make the first one work the way it does. In the first example, the += operator is used, modifying the array in place:
return [ary[0]+=700, ary[1]+=999]
In your second example, the + operator is used, leaving the array as is:
return [ary[0]+693,ary[1]+999]
If you change it to use the += operator, it works the same way as the first code snippet.
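The two behaviors can be put side by side without any class involved. This is a sketch; the method names messing_in_place and messing_copy are hypothetical and not from the question.

```ruby
# Element assignment (ary[0] += 700 means ary[0] = ary[0] + 700)
# writes back into the caller's array.
def messing_in_place(ary)
  [ary[0] += 700, ary[1] += 999]
end

# Plain + only computes new values; the array is never written to.
def messing_copy(ary)
  [ary[0] + 700, ary[1] + 999]
end

tiny = [3, 4]

copy_result = messing_copy(tiny)
after_copy = tiny.dup             # snapshot of tiny after the call

in_place_result = messing_in_place(tiny)
after_in_place = tiny.dup         # snapshot of tiny after the call

p copy_result      # [703, 1003]
p after_copy       # [3, 4]
p in_place_result  # [703, 1003]
p after_in_place   # [703, 1003]
```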

Iterate over array of arrays

This has been asked before, but I can't find an answer that works. I have the following code:
[[13,14,16,11],[22,23]].each do |key,value|
  puts key
end
It should in theory print:
0
1
But instead it prints:
13
22
Why does ruby behave this way?
This happens because, when each and other iterators are given a block rather than a lambda, the block behaves more like this:
do |key, value, *rest|
  puts key
end
Consider this code to illustrate:
p = proc do |key,value|
  puts key
end
l = lambda do |key,value|
  puts key
end
Using the above, the following will set (key, value) to (13, 14) and (22, 23) respectively, and the above-mentioned *rest as [16, 11] in the first case (with rest getting discarded):
[[13,14,16,11],[22,23]].each(&p)
In contrast, the following will spit an argument error, because the lambda (which is similar to a block except when it comes to arity considerations) will receive the full array as an argument (without any *rest as above, since the number of arguments is strictly enforced):
[[13,14,16,11],[22,23]].each(&l) # wrong number of arguments (1 for 2)
To get the index in your case, you'll want each_with_index as highlighted in the other answers.
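The arity difference between procs and lambdas can also be seen by calling them directly, which avoids any dependence on how each hands arguments to its block. In this sketch the names pr and la are illustrative.

```ruby
pr = proc   { |key, value| [key, value] }
la = lambda { |key, value| [key, value] }

# A proc given one array spreads it over its parameters and
# silently drops the leftovers (16 and 11 here).
proc_result = pr.call([13, 14, 16, 11])

# A lambda enforces its argument count, like a method.
lambda_error =
  begin
    la.call([13, 14, 16, 11])
  rescue ArgumentError => e
    e.class
  end

# Passing exactly two arguments satisfies the lambda.
lambda_ok = la.call(13, 14)

p proc_result   # [13, 14]
p lambda_error  # ArgumentError
p lambda_ok     # [13, 14]
```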
Related discussions:
Proc.arity vs Lambda.arity
Why does Hash#select and Hash#reject pass a key to a unary block?
You can get what you want with Array's each_index method, which yields the index of each element instead of the element itself. See Ruby's Array documentation for more information.
When you do:
[[13,14,16,11],[22,23]].each do |key,value|
then on the first iteration the block effectively performs this assignment:
key, value = [13,14,16,11]
Such an assignment results in key being 13 and value being 14. Instead you should use each_with_index do |array, index|. This changes the assignment to:
array, index = [[13,14,16,11], 0]
which results in array being [13,14,16,11] and index being 0.
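The assignments described above, and the two index-based alternatives, can be tried out as follows. This is a sketch; nested, indexes and index_only are names chosen for the example.

```ruby
nested = [[13, 14, 16, 11], [22, 23]]

# The same destructuring the block parameters perform.
key, value = nested[0]

# each_with_index yields the element and its position.
indexes = []
nested.each_with_index { |array, index| indexes << index }

# each_index yields only the position.
index_only = []
nested.each_index { |index| index_only << index }

p [key, value]  # [13, 14]
p indexes       # [0, 1]
p index_only    # [0, 1]
```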
You have an array of arrays, known as a two-dimensional array.
In your loop, each inner array is destructured across the block parameters, so key is assigned its first element (13 for the first array) and value its second (14).
That is why puts key prints only the first element of each inner array.
To see a whole inner array, use a single block parameter and print it with p row (or puts row.inspect).
If you want every value, add another loop inside the block to go through each element of the inner array.
[[1,2,3],['a','b','c']].each do |row|
  row.each do |value|
    puts value
  end
end

Yield within Set to eliminate in an Array

I found the following code here for eliminating duplicate records in an array:
require 'set'
class Array
  def uniq_by
    seen = Set.new
    select{ |x| seen.add?( yield( x ) ) }
  end
end
And we can use the code above as follows:
@messages = Messages.all.uniq_by { |h| h.body }
I would like to know how and what happens when the method is called. Can someone explain the internals of the code above? In the uniq_by method, we did not do anything to handle block argument. How is the passed argument handled by uniq_by method?
Let's break it down :
seen = Set.new
Create an empty set
select{ |x| seen.add?( yield( x ) ) }
Array#select keeps the elements for which the block returns a truthy value.
seen.add?(yield(x)) returns the set itself (truthy) if the result of the block could be added to the set, or nil (falsy) if it was already there.
Indeed, yield(x) calls the block passed to the uniq_by method, passing x as an argument.
In our case, since our block is { |h| h.body }, it is the same as calling seen.add?(x.body).
Since a set holds unique values, calling add? with an element that already exists returns nil.
So it calls .body on each element of the array and tries to add the result to the set, keeping the elements for which the addition succeeded.
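The return values of Set#add? are what make the select filter work, and they are easy to inspect in isolation. A minimal sketch with made-up string values:

```ruby
require 'set'

seen = Set.new
first_add  = seen.add?("hello")  # new element: returns the set itself
second_add = seen.add?("hello")  # duplicate: returns nil

# Used as a filter, only first occurrences pass.
tracker = Set.new
kept = %w[hello world hello].select { |word| tracker.add?(word) }

p first_add.equal?(seen)  # true
p second_add              # nil
p kept                    # ["hello", "world"]
```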
The method uniq_by accepts a block argument. This lets you specify by what criterion two elements are considered "the same".
The yield statement evaluates the given block for the element and, in this case, returns the value of the element's body attribute.
So, if you call uniq_by like above, you are stating that the body attribute has to be unique for the element to count as unique.
To answer the more specific question you have: yield calls the passed block {|h| h.body} like a method, substituting the current x for h, and therefore returns x.body.
In Ruby, when you put the yield keyword inside a method (say #bar), you are telling #bar that it will be called with a block. The block passed to the method can be thought of as a Proc object, and yield calls it.
Example:
def bar
  yield
end
p bar { "hello" } # "hello"
p bar # raises LocalJumpError: no block given (yield)
In the uniq_by method, we did not do anything to handle block argument. How is the passed argument handled by uniq_by method?
You did handle it: you used yield. Once yield appears in the method body, the method knows what to do with the block. In the line Messages.all.uniq_by { |h| h.body } you are passing the block { |h| h.body }; inside uniq_by that block behaves like a Proc object, and yield does the equivalent of Proc#call.
Proof:
def bar
  p block_given? # true
  yield
end
bar { "hello" } # prints true and returns "hello"
To make it easier to understand:
class Array
  def uniq_by
    seen = Set.new
    select{ |x| seen.add?( yield( x ) ) }
  end
end
is the same as
class Array
  def uniq_by
    seen = Set.new
    # Below you are telling uniq_by, you will be using a block with it
    # by using `yield`.
    select{ |x| var = yield(x); seen.add?(var) }
  end
end
Read the documentation of yield:
Called from inside a method body, yields control to the code block (if any) supplied as part of the method call. If no code block has been supplied, calling yield raises an exception. yield can take an argument; any values thus yielded are bound to the block's parameters. The value of a call to yield is the value of the executed code block.
Array#select returns a new array containing all elements of the array for which the given block returns a true value.
The block passed to select uses Set#add? to determine whether the element is already there. add? returns nil if the same element is already in the set; otherwise it adds the element and returns the set itself.
That block in turn passes its argument (an element of the array) to another block (the one passed to uniq_by) using yield; the return value of yield is the return value of that block ({|h| h.body }).
The select statement is therefore basically equivalent to the following:
select{ |x| seen.add?(x.body) }
But by using yield, the code avoids hard-coding .body and defers that decision to the block.
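Here is a runnable sketch of the whole mechanism. To avoid reopening Array, it uses a standalone helper method uniq_by(items); that helper, the Message struct and the sample data are assumptions made for this example and are not the asker's Messages model.

```ruby
require 'set'

# Hypothetical stand-in for the records in the question.
Message = Struct.new(:id, :body)

# Same logic as the Array#uniq_by above, written as a plain method.
# `yield` inside the select block still calls the block given to uniq_by.
def uniq_by(items)
  seen = Set.new
  items.select { |x| seen.add?(yield(x)) }
end

messages = [
  Message.new(1, "hi"),
  Message.new(2, "bye"),
  Message.new(3, "hi") # duplicate body, so it is filtered out
]

unique_ids = uniq_by(messages) { |m| m.body }.map(&:id)

# Ruby's built-in Array#uniq also accepts a block with the same meaning.
builtin_ids = messages.uniq { |m| m.body }.map(&:id)

p unique_ids   # [1, 2]
p builtin_ids  # [1, 2]
```

Because the criterion lives in the caller's block, the same helper can deduplicate by any attribute without being changed.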

Ruby how does this inject code work?

I am new to Ruby and I am trying to write a method that groups an array of words into anagram groups. Here is the code:
def combine_anagrams(words)
  dict = words.inject(Hash.new(0)) do |list,ws|
    key = sort_word(ws)
    if !list.has_key?(key)
      list[key] = []
    end
    list[key].push(ws)
    list #What is this
  end
  return dict.values
end
My question is what the statement list is for. If I take it out list becomes an array instead of hash.
Every method/block/etc. in Ruby returns something, and unless there is an early return statement, whatever the last statement in the method/block/etc. is, is what is returned.
In your case, having list be the last line in the block passed to inject ensures that list is returned by the block. When you remove it, the return value of list[key].push(ws) is returned, which obviously isn't what you want.
Note that this behavior also makes the return keyword unnecessary when it is the last statement that would be executed anyway (this includes the return you have at the end of your method), though some prefer to use it even when not needed, to be explicit that they intend to return something.
On an unrelated note: your if !list.has_key?(key) can be rewritten unless list.has_key?(key).
inject works like this:
final = enumerable.inject(initial_value) do |current_value, iteration|
  # calculations, etc. here
  value # next iteration, current_value will be whatever the block returns
end
So, in your case, initial_value is Hash.new(0), or an empty Hash with 0 as the default value for a key that doesn't exist instead of nil. This is passed into the inject block for the first element in enumerable.
Inside the inject block, you check to see if key already exists as a key on the hash. If it does not, set it equal to an empty array. In either case, take the current iteration of words (ws) and push it onto the array.
Finally, the block returns the current version of list; it becomes current_value (the first parameter to the inject block) the next time the loop processes an element from enumerable.
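Put together, the walkthrough above corresponds to something like the following sketch. The question never shows sort_word, so the version here is an assumed implementation; the sample words are also made up, and a plain {} is used as the initial value since the 0 default is never relied on.

```ruby
# Assumed helper: the question calls sort_word but does not define it.
def sort_word(word)
  word.downcase.chars.sort.join
end

def combine_anagrams(words)
  groups = words.inject({}) do |list, ws|
    key = sort_word(ws)
    list[key] ||= []    # create the group on first sight of this key
    list[key].push(ws)  # returns the inner array, not the hash
    list                # so the hash must be the block's last expression
  end
  groups.values
end

result = combine_anagrams(%w[cars racs scar four for])
p result  # [["cars", "racs", "scar"], ["four"], ["for"]]
```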
As a simpler example, check out this sample:
numbers = [1, 2, 3, 4]
sum = numbers.inject(0) do |total, number| # first time, total will be 0
  total + number # set total next time to be whatever total is now plus the current number
end
Take a look at http://ruby-doc.org/core-1.9.3/Enumerable.html#method-i-inject
When the block passed to inject takes two parameters (in your case list and ws), the first one, list, is the so-called accumulator. The value returned by the block at each iteration step is passed back in as list on the next step. So the line containing only list, which you commented as "#What is this", makes the block return the hash so that it becomes the accumulator for the next iteration.
The statement list is the return value of the whole block. The assignment list[key] = [] evaluates to the assigned value (the empty array), not to list, and list[key].push(ws) returns list[key], the array. Since the block's last expression becomes the next accumulator, we need to return list itself each time, so that further processing acts on the updated hash and not on something else.
As background, every Ruby expression evaluates to a value, so whichever expression comes last in a block or method automatically becomes the return value of that block or method.
To understand this further, try some code like this in irb:
a = [1,2,3,4,5]
b = a.inject(0) {|sum, val| puts sum; puts val; sum + val}
The inner block consists of three statements; the last one, sum + val, is the block's return value, which is passed back in as sum on the next iteration.
Also, try some code like this:
h = {:a => []}
b = h[:a].push 6
See what b evaluates to; in your code, you need the accumulator to be the hash itself, not the array stored in h[:a].
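For reference, this is what that experiment shows, followed by an alternative sketch using each_with_object, which returns the memo object itself and so needs no trailing list line. The sample words and the grouped variable are illustrative only.

```ruby
h = { :a => [] }
b = h[:a].push 6

p b                 # [6]
p b.equal?(h[:a])   # true: push returned the inner array, not the hash

# each_with_object passes the same hash to every iteration and returns it,
# so the block's own return value does not matter. Note the parameter order:
# element first, memo second.
grouped = %w[ab ba c].each_with_object({}) do |word, list|
  (list[word.chars.sort.join] ||= []) << word
end

p grouped  # {"ab"=>["ab", "ba"], "c"=>["c"]}
```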
