Ruby how does this inject code work? - ruby

I am new to Ruby and I am trying to write a method that groups an array of words into anagram groups. Here is the code:
def combine_anagrams(words)
  dict = words.inject(Hash.new(0)) do |list,ws|
    key = sort_word(ws)
    if !list.has_key?(key)
      list[key] = []
    end
    list[key].push(ws)
    list #What is this
  end
  return dict.values
end
My question is what the bare list statement on the last line of the block is for. If I take it out, list becomes an array instead of a hash.

Every method, block, etc. in Ruby returns something. Unless there is an early return statement, the value of the last expression evaluated in the method or block is what is returned.
In your case, having list as the last line in the block passed to inject ensures that the block returns list. When you remove it, the block returns the value of list[key].push(ws) instead, which is the array stored under that key rather than the hash, and that isn't what you want.
Note that this behavior also makes the return keyword unnecessary when it appears in what would be the last statement executed anyway (this includes the return you have at the end of your method). Some people prefer to be explicit that they intend to return something and use it even when it isn't needed.
On an unrelated note: your if !list.has_key?(key) can be rewritten as unless list.has_key?(key).
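To see the effect concretely, here is a minimal runnable sketch of the method. sort_word is not shown in the question, so the definition below is an assumption (lowercase the word and sort its letters), and a plain {} is used as the starting hash to keep things simple.

```ruby
# Assumed helper: the question does not show sort_word, so this is a guess.
def sort_word(word)
  word.downcase.chars.sort.join
end

def combine_anagrams(words)
  groups = words.inject({}) do |list, ws|
    key = sort_word(ws)
    list[key] = [] unless list.has_key?(key)
    list[key].push(ws)
    list # value of the block: becomes `list` again for the next word
  end
  groups.values
end

result = combine_anagrams(%w[cars for scar four arcs])
p result
```

If the bare list line is removed, the second iteration receives an Array as list, and has_key? is no longer available on it.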

inject works like this:
final = enumerable.inject(initial_value) do |current_value, iteration|
  # calculations, etc. here
  value # next iteration, current_value will be whatever the block returns
end
So, in your case, initial_value is Hash.new(0), an empty Hash whose default value for a missing key is 0 instead of nil. This is passed into the inject block for the first element in enumerable.
Inside the inject block, you check to see if key already exists as a key on the hash. If it does not, set it equal to an empty array. In either case, take the current iteration of words (ws) and push it onto the array.
Finally, the block returns the current version of list; it becomes current_value (the first parameter to the inject block) the next time the loop processes an element from enumerable.
As a simpler example, check out this sample:
numbers = [1, 2, 3, 4]
sum = numbers.inject(0) do |total, number| # first time, total will be 0
  total + number # set total next time to be whatever total is now plus the current number
end
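To watch the accumulator change from step to step, a small sketch like the following records what the block receives on each call. The steps array is only there for illustration.

```ruby
numbers = [1, 2, 3, 4]
steps = [] # illustration only: records [total, number] for each call of the block

sum = numbers.inject(0) do |total, number|
  steps << [total, number]
  total + number # becomes `total` on the next call
end

p steps # [[0, 1], [1, 2], [3, 3], [6, 4]]
p sum   # 10
```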

Take a look at http://ruby-doc.org/core-1.9.3/Enumerable.html#method-i-inject
The block you pass to inject takes two parameters (in your case list and ws). The first one, list, is the so-called accumulator. The value returned by the block at each iteration step is assigned to list for the next step. So the line containing only the word "list", which you commented with "#What is this", makes the block return the hash so that it becomes the accumulator again on the next iteration.

The statement "list" is the return value of the whole block. The if statement does not help here: the assignment list[key] = [] evaluates to the assigned empty array (and the if evaluates to nil when the key already exists), and list[key].push(ws) returns list[key], i.e. the array, not list. We want to get the updated value of list in the end, so we need to return it from the block each time, so that further processing acts on the updated list and not on something else.
As background, every Ruby expression has a value, so whatever expression is evaluated last in a block or a method automatically becomes the return value of that block or method.
To understand this further, try some code like this in irb:
a = [1,2,3,4,5]
b = a.inject(0) {|sum, val| puts sum; puts val; sum + val}
The inner block consists of three statements; the last one returns the value of sum + val from the block, which gets stored in sum to be used in the next iteration.
Also, try some code like this:
h = {:a => []}
b = h[:a].push 6
See what b evaluates to; in your code, you need 'b' to be the accumulated hash, and not the array that is stored in h[:a].
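If you don't have irb at hand, the same experiment can be written as a short script. The variable names below are made up for this sketch; it compares a block that ends with push against one that ends with the accumulator.

```ruby
h = { a: [] }
b = h[:a].push(6)

b_is_inner_array = b.equal?(h[:a]) # push returns the array it was called on
b_is_hash        = b.equal?(h)     # ...and not the hash that contains it

# Same block body, with and without the accumulator as the last expression
without_acc = %w[x].inject({}) { |acc, w| acc[w] = []; acc[w].push(w) }
with_acc    = %w[x].inject({}) { |acc, w| acc[w] = []; acc[w].push(w); acc }

p b_is_inner_array, b_is_hash
p without_acc, with_acc
```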

Related

What is the difference between each and any to find a prime number in Ruby

There are two completely identical methods, only one uses each and the other any, so what is the difference between each and any?
First program (using each):
def is_prime?(num)
  Math.sqrt(num).floor.downto(2).each {|i| return false if num % i == 0}
  true
end
number = gets.chomp.to_i;
puts is_prime?(number);
Second program (using any?):
def is_prime?(num)
  Math.sqrt(num).floor.downto(2).any? {|i| return false if num % i == 0}
  true
end
number = gets.chomp.to_i;
puts is_prime?(number);
The .each method iterates through all elements of a data structure, most commonly an array or a hash, and calls the given block once for each element. If no block is given, it returns an enumerator, i.e. an instance of the Enumerator class that can then be used to "manually" iterate through the data structure.
The .any? method tests the elements of a data structure against the given condition and returns true if one or more of them pass the test. Otherwise, it returns false. There are more details to it, so please check the official documentation.
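The difference is easiest to see in what each call evaluates to. This is a small sketch on a plain array, with variable names chosen for the example.

```ruby
numbers = [1, 2, 3]

# each runs the block for every element and discards the block's results;
# for an Array it returns the array itself.
each_result = numbers.each { |n| n * 2 }

# any? uses the block's results: true as soon as one is truthy, else false.
any_big  = numbers.any? { |n| n > 2 }
any_huge = numbers.any? { |n| n > 5 }

p each_result, any_big, any_huge
```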
General tips using Ruby
Based on this code, there are a couple of suggestions. First, you should avoid using return inside a block here: it makes the enclosing method return that value immediately, so the block is not executed for the remaining elements and the iterator's own result is never used.
Second, you don't need trailing semicolons at the ends of lines.
Short analysis of your code
Both your functions seem to work, at least for some numbers, but it seems that you might not exactly know why. Let's take a closer look at it.
Let's assume num = 5, for example.
First function: the block variable i is assigned the value 2, and the block does not execute return false because the test num % i == 0 fails. Instead, the function returns true from the next line.
Second function: the block variable i is also assigned the value 2, and again the test num % i == 0 fails, so the result of .any? is false (and is ignored), and the function returns true from the next line.
Now, let's assume num = 4.
First function: the block variable i is again assigned the value 2, and since this time the test num % i == 0 passes, the return false inside the block is executed, making the function return false.
Second function: the block variable i is again assigned the value 2, and since the test num % i == 0 passes here as well, the return false inside the block is executed, making the function return false.
Without the return false in your second function, the function would always return true, because you never check the value returned by .any? and the function's next line would be executed, returning true.
Mechnicov's answer offers simpler alternatives that make your function more understandable.
each is the iterator that goes through the array from beginning to end.
Its main purpose is just to pass through the array (or other collection) and perform the actions specified in the block. For example, that can be rendering an HTML partial, making HTTP requests, or something else.
There are also many other iterators that have specific tasks, such as map, select, any? and others.
You used any? the wrong way: you're not using the full power of that specific iterator.
But you can do it like this:
def prime?(num)
  !Math.sqrt(num).floor.downto(2).any? { |i| num % i == 0 }
end
or like this:
def prime?(num)
  Math.sqrt(num).floor.downto(2).all? { |i| num % i != 0 }
end
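One caveat with both versions: for 0 and 1 the range of divisors is empty, so they come out as prime. A possible variant, offered only as a sketch, adds a guard for numbers below 2 (the guard and the use of none? are additions, not part of the answers above):

```ruby
def prime?(num)
  return false if num < 2 # added guard: 0 and 1 are not prime
  # none? is true when the block is falsy for every candidate divisor
  2.upto(Math.sqrt(num).floor).none? { |i| (num % i).zero? }
end

primes = (1..20).select { |n| prime?(n) }
p primes
```

Here the early return sits outside any block, so it does not interfere with the iterator's result.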

How does a code block in Ruby know what variable belongs to an aspect of an object?

Consider the following:
(1..10).inject{|memo, n| memo + n}
Question:
How does n know that it is supposed to store all the values from 1..10? I'm confused how Ruby is able to understand that n can automatically be associated with (1..10) right away, and memo is just memo.
I know Ruby code blocks aren't the same as the C or Java code blocks--Ruby code blocks work a bit differently. I'm confused as to how variables that are in between the upright pipes '|' will automatically be assigned to parts of an object. For example:
hash1 = {"a" => 111, "b" => 222}
hash2 = {"b" => 333, "c" => 444}
hash1.merge(hash2) {|key, old, new| old}
How do '|key, old, new|' automatically assign themselves in such a way such that when I type 'old' in the code block, it is automatically aware that 'old' refers to the older hash value? I never assigned 'old' to anything, just declared it. Can someone explain how this works?
The parameters for the block are determined by the method definition. The definition for reduce/inject is overloaded (docs) and defined in C, but if you wanted to define it, you could do it like so (note, this doesn't cover all the overloaded cases for the actual reduce definition):
module Enumerable
  def my_reduce(memo = nil, &blk)
    # if a starting memo is not given, it defaults to the first element
    # in the list and that element is skipped for iteration
    # (drop and first are used instead of [] so this also works on ranges)
    elements = memo.nil? ? drop(1) : self
    memo = first if memo.nil?
    elements.each { |element| memo = blk.call(memo, element) }
    memo
  end
end
This method definition determines what values to use for memo and element and calls the blk variable (a block passed to the method) with them in a specific order.
Note, however, that blocks are not like regular methods, because they don't check the number of arguments. For example: (note, this example shows the usage of yield which is another way to pass a block parameter)
def foo
  yield 1
end
# The b and c variables here will be nil
foo { |a, b, c| [a,b,c].compact.sum } # => 1
You can also use destructuring to define variables at the time the block runs. For example, if you wanted to reduce over a hash, you could do something like this:
# this just copies the hash
{a: 1}.reduce({}) { |memo, (key, val)| memo[key] = val; memo }
This works because iterating over a hash yields each entry as a [key, value] pair, the same pairs you get from to_a (e.g. {a: 1}.to_a = [[:a, 1]]). reduce passes each pair as the second argument to the block. In the block's parameter list, the parenthesized (key, val) destructures the pair into separate key and value variables.
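As a slightly larger sketch of the same idea, the hash and names below are invented for illustration. The two reduce calls compute the same total, once with the pair destructured in the parameter list and once by indexing into the pair.

```ruby
stock = { apples: 2, pears: 3 } # example data, made up for this sketch

pairs = stock.to_a # the [key, value] pairs that iteration yields

# (_name, count) splits each pair into two block variables
total = stock.reduce(0) { |sum, (_name, count)| sum + count }

# Without destructuring, the block receives the whole pair as one array
total_from_pairs = stock.reduce(0) { |sum, pair| sum + pair[1] }

p pairs, total, total_from_pairs
```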
A code block is just a function with no name. Like any other function, it can be called multiple times with different arguments. If you have a method
def add(a, b)
  a + b
end
How does add know that sometimes a is 5 and sometimes a is 7?
Enumerable#inject simply calls the function once for each element, passing the element as an argument.
It looks a bit like this:
module Enumerable
  def inject(memo)
    each do |el|
      memo = yield memo, el
    end
    memo
  end
end
"And memo is just memo"
What do you mean, "just memo"? memo and n take whatever values inject passes. And it is implemented to pass the accumulator/memo as the first argument and the current collection element as the second argument.
"How do '|key, old, new|' automatically assign themselves"
They don't "assign themselves". merge assigns them. Or rather, passes those values (key, old value, new value) in that order as block parameters.
If you instead write
hash1.merge(hash2) {|foo, bar, baz| bar}
It'll still work exactly as before. Parameter names mean nothing [here]. It's actual values that matter.
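A quick way to convince yourself is to record what merge actually passes to the block. The seen array below exists only for this sketch.

```ruby
hash1 = { "a" => 111, "b" => 222 }
hash2 = { "b" => 333, "c" => 444 }

seen = [] # illustration only: collects the arguments merge passes in
merged = hash1.merge(hash2) do |foo, bar, baz|
  seen << [foo, bar, baz]
  bar # keep the value from hash1, whatever the parameter is called
end

p seen   # the block runs only for the key present in both hashes
p merged
```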
Just to simplify some of the other good answers here:
If you are struggling to understand blocks, an easy way to think of them is as a primitive, temporary method that you create and execute in place; the names between the pipe characters, as in |memo|, are simply its argument signature.
There is no special concept behind the arguments; they are simply there for the method you are invoking to pass values to, like calling any other method with an argument. As with a method, the arguments are "local" variables within the scope of the block (there are some nuances to this depending on the syntax you use to call the block, but I digress, that is another matter).
The method you pass the block to simply invokes this "temporary method" and passes it the arguments it is designed to pass. This is just like calling a method normally, with some slight differences, such as there being no "required" arguments. If your block does not define any arguments to receive, the values are simply not passed, and no ArgumentError is raised. Likewise, if you define more arguments than the method passes, the extra ones will simply be nil within the block, with no error for being undefined.
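The lenient argument handling can be checked with a tiny sketch. yield_two is a made-up method for this example, and the lambda at the end is included only as a contrast, since lambdas do enforce their argument count.

```ruby
# Hypothetical method that always passes two values to its block
def yield_two
  yield 1, 2
end

fewer = yield_two { |a| a }               # extra value is silently dropped
more  = yield_two { |a, b, c| [a, b, c] } # missing value becomes nil

# Contrast: a lambda checks arity the way a method does
strict = ->(a) { a }
lambda_error =
  begin
    strict.call(1, 2)
    nil
  rescue ArgumentError => e
    e.class
  end

p fewer, more, lambda_error
```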

Why doesn't Array#each_with_object(0) work?

Why does using 0 as the argument in each_with_object not return the correct value:
[1,2,3].each_with_object(0) {|i,o| o += i }
# => 0
but using an empty array and reduce(:+) does?
[1,2,3].each_with_object([]) {|i,o| o << i }.reduce(:+)
# => 6
The documentation says:
each_with_object(obj) → an_enumerator
Iterates the given block for each element with an arbitrary object given, and returns the initially given object.
If no block is given, returns an enumerator.
Since an Array can be modified in place, the same initial object is returned in this case, but with modified contents.
If we look at the code of each_with_object, it is:
# File activesupport/lib/active_support/core_ext/enumerable.rb, line 79
def each_with_object(memo)
  return to_enum :each_with_object, memo unless block_given?
  each do |element|
    yield element, memo
  end
  memo
end
You can see that the method never reassigns memo. o += i only rebinds the block's local variable o to a new Integer, so if memo is 0 it is still 0 when the method returns. If you pass [] and mutate it inside the block with <<, the same array is returned with the new values in it.
There is no Ruby bug in your examples, which means they are all correct.
In the first example, the argument 0 is the return value. In the second example, the argument (that appears to be [] initially) is the return value. Only in the latter, the argument had been modified, and ended up looking different from what it looked like at the beginning, but the identity of the object is retained.
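The following sketch puts the two cases side by side and adds inject for comparison, which is the usual choice when the accumulator is an immutable value such as an Integer. The variable names are chosen for this example.

```ruby
# Rebinding: o += i creates a new Integer and assigns it to the block-local o;
# the 0 that each_with_object holds on to is untouched.
rebound = [1, 2, 3].each_with_object(0) { |i, o| o += i }

# Mutating: << changes the array object itself, so the change is visible.
target  = []
mutated = [1, 2, 3].each_with_object(target) { |i, o| o << i }

# inject feeds the block's return value back in, so a sum works.
# Note the swapped parameter order: accumulator first, element second.
summed = [1, 2, 3].inject(0) { |o, i| o + i }

p rebound, mutated, summed
```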

Modify an Array in Place - Ruby

I'm wondering why the following will not modify the array in place.
I have this:
@card.map!.with_index {|value, key| key.even? ? value*=2 : value}
Which just iterates over an array, and doubles the values for all even keys.
Then I do:
@card.join.split('').map!{|x| x.to_i}
Which joins the array into one huge number, splits it into individual digits and then maps them back to integers in an array. The only real change from step one to step two is that after step one the array would look like a=[1,2,12] and after step two it would look like a=[1,2,1,2]. For the second step, even though I use .map!, when I p @card it appears exactly the same as after the first step. I have to assign the result of the second step to something if I want to move onward with the new array. Why is this? Does the .map! in the second step not modify the array in place? Or does the chaining of methods negate my ability to do that? Cheers.
tl;dr: A method chain only modifies the original object in place if every single method in that chain is a modify-in-place method.
The important difference in this case is the first method you call on your object. Your first example calls map!, which is a method that modifies the array in place. with_index is not important in this example; it just changes the behavior of map!.
Your second example calls join on your array. join does not change the array in place; it returns a totally different object: a string. Then you split the string, which creates a new array, and the following map! modifies that new array in place.
So in your second example you need to assign the result to your variable again:
@card = @card.join.split('').map{ |x| x.to_i }
There might be other ways to calculate the desired result. But since you did not provide input and output examples, it is unclear what you're trying to achieve.
Does the .map! in the second step not modify the array in place?
Yes, it does; however, the array it modifies is not @card. The split() method returns a new array, i.e. one that is not @card, and map! modifies the new array in place.
Check this out:
tap{|x|...} → x
Yields [the receiver] to the block, and then returns [the receiver].
The primary purpose of this method is to “tap into” a method chain,
in order to perform operations on intermediate results within the chain.
@card = ['a', 'b', 'c']
puts @card.object_id
@card.join.split('').tap{|arr| puts arr.object_id}.map{ |x| x.to_i } # arr is whatever split() returns
--output:--
2156361580
2156361380
Every object in a ruby program has a unique object_id.
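If keeping the same array object matters, one option is Array#replace, which swaps the contents of the existing array. This is a sketch using a local variable card in place of the question's array, with the [1, 2, 12] values taken from the question's description.

```ruby
card = [1, 2, 12]
original_id = card.object_id

# join and split each return new objects, so map! works on a temporary array
digits = card.join.split('').map!(&:to_i)
card_after_chain = card.dup # snapshot: card has not changed yet

# replace copies the new contents into the existing array object
card.replace(digits)

p card_after_chain, digits, card
p card.object_id == original_id
```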

What's i in each_with_index block

Okay, so I'm reading a guide for Ruby and I can't make sense of this code. Where did i come from? I see that n is passed in to iterate through the block, but I have no idea where i comes from. If I could get a full explanation and breakdown of how this code works, that would be great!
class Array
  def iterate!
    self.each_with_index do |n, i|
      self[i] = yield(n)
    end
  end
end
array = [1, 2, 3, 4]
array.iterate! do |n|
  n ** 2
end
i is the index of the element (hence the name, each_with_index).
Some methods that are called with code blocks will pass more than one value to the block, so you end up with multiple block arguments (in your case the block arguments are n and i, which will hold the current item in the array (n) and the index of it (i)).
You can find out how many arguments a block will be passed by looking at the documentation for a method (here's the docs for each_with_index). It does look like the extra values come from nowhere at first, and it takes a little while to memorize what a block will be passed when different methods are called.
i is commonly used as what's known as an "iteration variable" (a loop index). Basically, the loop block that you've copied here goes through each "iteration" of the loop; each time, each_with_index passes the next element of the array as n and that element's position as i. n is then handed to your block by yield, and the result is stored back at self[i]. In this case the array has four elements, so there are four iterations of the loop.
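To see both values side by side, here is a small sketch. iterate_in_place is a hypothetical standalone version of iterate! written for this example, so that it does not reopen the Array class.

```ruby
# Record what each_with_index passes to the block
seen = []
%w[a b c].each_with_index { |n, i| seen << [n, i] }

# Hypothetical standalone equivalent of iterate! from the question
def iterate_in_place(array)
  array.each_with_index do |n, i|
    array[i] = yield(n) # n goes to the caller's block; the result lands at index i
  end
end

array = [1, 2, 3, 4]
iterate_in_place(array) { |n| n ** 2 }

p seen
p array
```

Ruby's built-in map! does the same job as iterate!, which is likely why the guide uses it as an exercise.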

Resources