How does iteration work in Ruby? - ruby

I've recently started coding Ruby and I have a misunderstanding about block parameters.
Take the following code for example:
h = { # A hash that maps number names to digits
:one => 1, # The "arrows" show mappings: key=>value
:two => 2 # The colons indicate Symbol literals
}
h[:one] # => 1. Access a value by key
h[:three] = 3 # Add a new key/value pair to the hash
h.each do |key,value| # Iterate through the key/value pairs
print "#{value}:#{key}; " # Note variables substituted into string
end # Prints "1:one; 2:two; 3:three; "
I understand the general hash functionality, however I don't understand how value and key are set to anything. They are specified as parameters in the block, but the hash is never associated in any way with these parameters.

This is the Ruby block (Ruby's name for an anonymous function) syntax. And key, value are nothing but the arguments passed to the anonymous function.
Hash#each takes one parameter: A function which has 2 parameters, key and value.
So if we break it down into parts, this part of your code: h.each, is calling the each function on h. And this part of your code:
do |key, value| # Iterate through the key/value pairs
print "#{value}:#{key}; " # Note variables substituted into string
end # Prints "1:one; 2:two; 3:three; "
is the function passed to each as an argument and key, value are arguments passed to this function. It doesn't matter what you name them, first argument expected is key and second argument expected is value.
Let's draw some analogies. Consider a basic function:
def name_of_function(arg1, arg2)
# Do stuff
end
# You'd call it such:
foo.name_of_function bar, baz # bar becomes arg1, baz becomes arg2
# As a block:
ooga = lambda { |arg1, arg2|
# Do stuff
}
# Note that this is exactly same as:
ooga = lambda do |arg1, arg2|
# Do stuff
end
# You would call it such:
ooga.call(bar, baz) # bar becomes arg1, baz becomes arg2
So your code can also be written as:
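# A proc is used rather than a lambda: on Ruby 3.0+ a two-parameter lambda passed to Hash#each raises ArgumentError
print_key_value = proc { |key, value| print "#{value}:#{key}; " }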
h = {
:one => 1,
:two => 2
}
h.each &print_key_value
There are multiple ways in which the code inside a block can be executed:
yield
yield key, value # This is one possible way in which Hash#each can use a block
yield item
block.call
block.call(key, value) # This is another way in which Hash#each can use a block
block.call(item)
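To make the two calling styles concrete, here is a minimal sketch of a hash-like class. PairList, each_with_yield and each_with_call are hypothetical names made up for this illustration; this is not how Hash#each is really implemented (that lives in C), only a picture of where key and value come from.

```ruby
# Hypothetical class for illustration only; not Ruby's real Hash.
class PairList
  def initialize(pairs)
    @pairs = pairs # an array of [key, value] arrays
  end

  # Style 1: the block is implicit and is invoked with yield.
  def each_with_yield
    @pairs.each { |pair| yield pair[0], pair[1] }
    self
  end

  # Style 2: the block is captured as a Proc and invoked with call.
  def each_with_call(&block)
    @pairs.each { |pair| block.call(pair[0], pair[1]) }
    self
  end
end

list = PairList.new([[:one, 1], [:two, 2]])

yielded = []
list.each_with_yield { |key, value| yielded << "#{value}:#{key}" }

called = []
list.each_with_call { |key, value| called << "#{value}:#{key}" }
```

In both styles it is the method, not the block, that decides which values are handed over and in what order.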

The hash (h) is associated with the loop due to you calling h.each rather than calling each on something else. It's effectively saying, "For each key/value pair in h, let the key iteration variable be the key, let the value iteration variable be the value, then execute the body of the loop."
If that doesn't help, have a look at this page on each... and if you can explain more about which bit you're finding tricky, we may be able to help more. (Well, others may be able to. I don't really know Ruby.)

The hash is indeed associated with these parameters because you call h.each to iterate over the hash:
h.each <- here's the link you are missing
Perhaps it's easier for you if you start with an array instead:
a = [1,2,3]
a.each do |v|
puts v
end
and play around with this first (each, each_with_index, ...)

When you call h.each, that's when you say that this specific hash, h, is the one you want to iterate over.
Hence when you do that the value and key variables are assigned to the values in your hash, one by one.

I think the question is about the variable names. The names have no significance. Only the order matters. Within |...| inside each {...}, the key and the value are given in that order. Since it's natural to assign the variable name key to the key and value to the value, you often find it done like that. In fact, the names can be anything else.
each{|k, v| ...} # k is key, v is value
each{|a, b| ...} # a is key, b is value
or even misleadingly:
each{|value, key| ...} # value is key, key is value
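A small sketch of the same point that can be run as-is. The hash contents and the three result arrays are made up for the illustration.

```ruby
h = { one: 1, two: 2 }

by_name  = []
by_other = []
swapped  = []

h.each { |key, value| by_name  << [key, value] }
h.each { |a, b|       by_other << [a, b] }
# Misleading names: `value` still receives the key and `key` the value.
h.each { |value, key| swapped  << [value, key] }
```

All three arrays end up identical, because only the position of each block parameter decides what it receives.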

Related

How does a code block in Ruby know what variable belongs to an aspect of an object?

Consider the following:
(1..10).inject{|memo, n| memo + n}
Question:
How does n know that it is supposed to store all the values from 1..10? I'm confused how Ruby is able to understand that n can automatically be associated with (1..10) right away, and memo is just memo.
I know Ruby code blocks aren't the same as the C or Java code blocks--Ruby code blocks work a bit differently. I'm confused as to how variables that are in between the upright pipes '|' will automatically be assigned to parts of an object. For example:
hash1 = {"a" => 111, "b" => 222}
hash2 = {"b" => 333, "c" => 444}
hash1.merge(hash2) {|key, old, new| old}
How do '|key, old, new|' automatically assign themselves in such a way such that when I type 'old' in the code block, it is automatically aware that 'old' refers to the older hash value? I never assigned 'old' to anything, just declared it. Can someone explain how this works?
The parameters for the block are determined by the method definition. The definition for reduce/inject is overloaded (docs) and defined in C, but if you wanted to define it, you could do it like so (note, this doesn't cover all the overloaded cases for the actual reduce definition):
module Enumerable
def my_reduce(memo=nil, &blk)
# if a starting memo is not given, it defaults to the first element
# in the list and that element is skipped for iteration
# (drop and first are used because not every Enumerable supports [])
elements = memo.nil? ? drop(1) : self
memo = first if memo.nil?
elements.each { |element| memo = blk.call(memo, element) }
memo
end
end
This method definition determines what values to use for memo and element and calls the blk variable (a block passed to the method) with them in a specific order.
Note, however, that blocks are not like regular methods, because they don't check the number of arguments. For example (note that this example uses yield, which is another way to call the block passed to a method):
def foo
yield 1
end
# The b and c variables here will be nil
foo { |a, b, c| [a,b,c].compact.sum } # => 1
You can also use destructuring to define variables at the time you run the block, for example if you wanted to reduce over a hash you could do something like this:
# this just copies the hash
{a: 1}.reduce({}) { |memo, (key, val)| memo[key] = val; memo }
How this works is: iterating over a hash yields each entry as a [key, value] pair, as if the hash had been converted with to_a (e.g. {a: 1}.to_a # => [[:a, 1]]). reduce passes each pair as the second argument to the block, and the (key, val) pattern in the block's parameter list destructures it into separate key and value variables.
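The destructuring described above can be sketched in a few lines. The variable names (pairs, whole, split, copy, indexed) are chosen for this illustration only.

```ruby
pairs = { a: 1, b: 2 }.to_a # [[:a, 1], [:b, 2]]

# With one block parameter, each call receives the whole [key, value] array.
whole = pairs.map { |pair| pair }

# With two block parameters, that array is split automatically.
split = pairs.map { |key, val| "#{key}=#{val}" }

# Parentheses destructure a nested position explicitly.
copy = { a: 1, b: 2 }.reduce({}) { |memo, (key, val)| memo.merge(key => val) }
indexed = { a: 1, b: 2 }.each_with_index.map { |(key, val), i| [i, key, val] }
```

The parenthesised form is only needed when the pair is not the sole argument, as with reduce (memo comes first) or each_with_index (the index comes second).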
A code block is just a function with no name. Like any other function, it can be called multiple times with different arguments. If you have a method
def add(a, b)
a + b
end
How does add know that sometimes a is 5 and sometimes a is 7?
Enumerable#inject simply calls the function once for each element, passing the element as an argument.
It looks a bit like this:
module Enumerable
def inject(memo)
each do |el|
memo = yield memo, el
end
memo
end
end
And memo is just memo
What do you mean, "just memo"? memo and n take whatever values inject passes. And it is implemented to pass the accumulator (memo) as the first argument and the current collection element as the second argument.
How do '|key, old, new|' automatically assign themselves
They don't "assign themselves". merge assigns them. Or rather, passes those values (key, old value, new value) in that order as block parameters.
If you instead write
hash1.merge(hash2) {|foo, bar, baz| bar}
It'll still work exactly as before. Parameter names mean nothing [here]. It's actual values that matter.
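To see exactly what merge passes, the block can record its arguments. This is only a sketch: the seen array is added for the illustration, and the parameter names are deliberately meaningless.

```ruby
hash1 = { "a" => 111, "b" => 222 }
hash2 = { "b" => 333, "c" => 444 }

seen = []
merged = hash1.merge(hash2) do |foo, bar, baz|
  seen << [foo, bar, baz] # record what merge handed to the block
  bar                     # keep the value from hash1
end
```

The block runs only for the key present in both hashes, and receives the key, the value from hash1 and the value from hash2, in that order.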
Just to simplify some of the other good answers here:
If you are struggling to understand blocks, an easy way to think of them is as a primitive, temporary method that you create and execute in place; the names between the pipe characters, such as |memo|, are simply its argument signature.
There is no special concept behind the arguments: they are simply there for the method you are invoking to pass values to, like calling any other method with an argument. As in a method, the arguments are "local" variables within the scope of the block (there are some nuances to this depending on the syntax you use to call the block, but I digress, that is another matter).
The method you pass the block to simply invokes this "temporary method" and passes it whatever arguments it is designed to pass. This is just like calling a method normally, with some slight differences, such as there being no "required" arguments. If the block does not declare any parameters, the values passed to it are simply discarded instead of raising an ArgumentError. Likewise, if the block declares more parameters than it is passed, the extra ones will simply be nil within the block, with no error raised.
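A short sketch of that leniency, next to a lambda for contrast. pass_two is a hypothetical helper written for this example.

```ruby
# Hypothetical helper: always passes two values to its block.
def pass_two
  yield :first, :second
end

# A block that declares no parameters: both values are silently ignored.
none = pass_two { :ignored }

# A block that declares too many: the extra parameter is nil.
extra = pass_two { |a, b, c| [a, b, c] }

# A lambda is strict about argument count, like a method.
strict = lambda { |a, b, c| [a, b, c] }
error = begin
  strict.call(:first, :second)
rescue ArgumentError => e
  e.class
end
```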

Mixing keyword argument and arguments with default values duplicates the hash?

So I discovered this Ruby behaviour, which kept me going crazy for over an hour. When I pass a hash to a function which has a default value for the hash AND a keyword argument, it seems like the reference doesn't get passed correctly. As soon as I take away the default value OR the keyword argument, the function behaves as expected. Am I missing some obvious Ruby rule here?
def change_hash(h={}, rand: :om)
h['hey'] = true
end
k = {}
change_hash(k)
k
#=> {}
It works fine as soon as I take out the default or the keyword arg.
def change_hash(h, rand: :om)
h['hey'] = true
end
k = {}
change_hash(k)
k
#=> {'hey' => true}
def change_hash(h={})
h['hey'] = true
end
k = {}
change_hash(k)
k
#=> {'hey' => true}
EDIT
Thanks for your answers. Most of you pointed out that ruby parses the hash as a keyword argument in some cases. However, I am talking about the case when a hash has string keys. When I pass the hash, it seems like the value that gets passed is correct. But modifying the hash inside the function doesn't modify the original hash.
def change_hash(hash={}, another_arg: 300)
puts "another_arg: #{another_arg}"
puts "hash: #{hash}"
hash['hey'] = 3
end
my_hash = {"o" => 3}
change_hash(my_hash)
puts my_hash
Prints out
another_arg: 300
hash: {"o"=>3}
{"o"=>3}
TL;DR ruby allows passing hash as a keyword argument as well as “expanded inplace hash.” Since change_hash(rand: :om) must be routed to keyword argument, so should change_hash({rand: :om}) and, hence, change_hash({}).
Since ruby allows default arguments in any position, the parser takes care of default arguments in the first place. That means, that the default arguments are greedy and the most amount of defaults will take a place.
On the other hand, since ruby lacks pattern-matching feature for function clauses, parsing the given argument to decide whether it should be passed as double-splat or not would lead to huge performance penalties. Since the call with an explicit keyword argument (change_hash(rand: :om)) should definitely pass :om to keyword argument, and we are allowed to pass an explicit hash {rand: :om} as a keyword argument, Ruby has nothing to do but to accept any hash as a keyword argument.
Ruby will split the single hash argument between hash and rand:
k = {"a" => 42, rand: 42}
def change_hash(h={}, rand: :om)
h[:foo] = 42
puts h.inspect
end
change_hash(k);
puts k.inspect
#⇒ {"a"=>42, :foo=>42}
#⇒ {"a"=>42, :rand=>42}
That split feature requires the argument being cloned before passing. That is why the original hash is not being modified.
This is a particularly tricky case in Ruby indeed.
In your example you have an optional argument which is a hash and you have an optional keyword argument at the same time. In this situation if you pass only one hash, Ruby (before 3.0, which separated positional and keyword arguments) interprets it as a hash which contains keyword arguments. Here is the code to clarify:
change_hash({rand1: 'om'})
# ArgumentError: unknown keyword: rand1
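To work around this on Ruby 2.x you can pass two separate hashes into the method, with the second one (the one for keyword arguments) being empty. Ruby 3.0+ treats the second hash as an extra positional argument and raises an ArgumentError, but there a lone hash is already passed positionally: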
def change_hash(h={}, rand: 'om')
h['hey'] = true
end
k = {}
change_hash(k, {})
k
#=> {'hey' => true}
From a practical point of view it is better to avoid method signatures like that in production code, because it is very easy to make an error while using the method.
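One way to sidestep the ambiguity, sketched under the assumption that the caller is free to spell out the keyword: when rand: is given explicitly at the call site, the hash can only be the positional argument, so the method mutates the caller's object. The method below also returns rand, purely so the example can show where each argument went.

```ruby
def change_hash(h = {}, rand: :om)
  h['hey'] = true
  rand # returned only to make the routing visible in this example
end

k = {}
returned = change_hash(k, rand: :other)
```

After the call, k contains the 'hey' key and returned is :other, so both arguments reached the parameter they were meant for.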

Iterate over array of arrays

This has been asked before, but I can't find an answer that works. I have the following code:
[[13,14,16,11],[22,23]].each do |key,value|
puts key
end
It should in theory print:
0
1
But instead it prints:
13
22
Why does ruby behave this way?
It's because what actually happens internally, when each and other iterators are used with a block instead of a lambda, is actually closer to this:
do |key, value, *rest|
puts key
end
Consider this code to illustrate:
p = proc do |key,value|
puts key
end
l = lambda do |key,value|
puts key
end
Using the above, the following will set (key, value) to (13, 14) and (22, 23) respectively, and the above-mentioned *rest as [16, 11] in the first case (with rest getting discarded):
[[13,14,16,11],[22,23]].each(&p)
In contrast, the following will spit an argument error, because the lambda (which is similar to a block except when it comes to arity considerations) will receive the full array as an argument (without any *rest as above, since the number of arguments is strictly enforced):
[[13,14,16,11],[22,23]].each(&l) # wrong number of arguments (1 for 2)
To get the index in your case, you'll want each_with_index as highlighted in the other answers.
Related discussions:
Proc.arity vs Lambda.arity
Why does Hash#select and Hash#reject pass a key to a unary block?
You can get what you want with Array's each_index method, which yields the index of each element instead of the element itself. See Ruby's Array documentation for more information.
When you do:
[[13,14,16,11],[22,23]].each do |key,value|
before the first iteration is done it makes an assignment:
key, value = [13,14,16,11]
Such an assignment results in key being 13 and value being 14. Instead you should use each_with_index do |array, index|. This will change the assignment to:
array, index = [[13,14,16,11], 0]
This results in array being [13,14,16,11] and index being 0.
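The three behaviours can be compared side by side. This is a sketch only; the collecting arrays (firsts, indexes, positions) exist just to capture what each block receives.

```ruby
rows = [[13, 14, 16, 11], [22, 23]]

# Two block parameters split each inner array: key is 13, then 22.
firsts = []
rows.each { |key, value| firsts << key }

# each_with_index yields the element together with its position.
indexes = []
rows.each_with_index { |array, index| indexes << index }

# each_index yields only the position.
positions = []
rows.each_index { |index| positions << index }
```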
You have an array of arrays - known as a two-dimensional array.
In your loop, each inner array is spread across the block parameters, so on the first iteration key is 13 and value is 14 (not the whole array).
When you puts key, it therefore prints only the first element, 13.
To work with a whole inner array, give the block a single parameter; puts row.inspect (or p row) then shows the array as a string.
If you want every value, then add another loop to your code, to loop through each element of the inner array.
[[1,2,3],['a','b','c']].each do |row|
row.each do |element|
puts element
end
end

undefined method `assoc' for #<Hash:0x10f591518> (NoMethodError)

I'm trying to return a list of values based on user defined arguments, from hashes defined in the local environment.
def my_method *args
#initialize accumulator
accumulator = Hash.new(0)
#define hashes in local environment
foo=Hash["key1"=>["var1","var2"],"key2"=>["var3","var4","var5"]]
bar=Hash["key3"=>["var6"],"key4"=>["var7","var8","var9"],"key5"=>["var10","var11","var12"]]
baz=Hash["key6"=>["var13","var14","var15","var16"]]
#iterate over args and build accumulator
args.each do |x|
if foo.has_key?(x)
accumulator=foo.assoc(x)
elsif bar.has_key?(x)
accumulator=bar.assoc(x)
elsif baz.has_key?(x)
accumulator=baz.assoc(x)
else
puts "invalid input"
end
end
#convert accumulator to list, and return value
return accumulator = accumulator.to_a {|k,v| [k].product(v).flatten}
end
The user is to call the method with arguments that are keywords, and the function to return a list of values associated with each keyword received.
For instance
> my_method(key5,key6,key1)
=> ["var10","var11","var12","var13","var14","var15","var16","var1","var2"]
The output can be in any order. I received the following error when I tried to run the code:
undefined method `assoc' for #<Hash:0x10f591518> (NoMethodError)
Please would you point me how to troubleshoot this? In Terminal assoc performs exactly how I expect it to:
> foo.assoc("key1")
=> ["var1","var2"]
I'm guessing you're coming to Ruby from some other language, as there is a lot of unnecessary cruft in this method. Furthermore, it won't return what you expect for a variety of reasons.
`accumulator = Hash.new(0)`
This is unnecessary, as (1), you're expecting an array to be returned, and (2), you don't need to pre-initialize variables in ruby.
The Hash[...] syntax is unconventional in this context, and is typically used to convert some other enumerable (usually an array) into a hash, as in Hash[1,2,3,4] #=> { 1 => 2, 3 => 4}. When you're defining a hash, you can just use the curly brackets { ... }.
For every iteration of args, you're assigning accumulator to the result of the hash lookup instead of accumulating values (which, based on your example output, is what you need to do). Instead, you should be looking at various array concatenation methods like push, +=, <<, etc.
As it looks like you don't need the keys in the result, assoc is probably overkill. You would be better served with fetch or simple bracket lookup (hash[key]).
Finally, while you can call any method in Ruby with a block, as you've done with to_a, unless the method specifically yields a value to the block, Ruby will ignore it, so [k].product(v).flatten isn't actually doing anything.
I don't mean to be too critical - Ruby's syntax is extremely flexible but also relatively compact compared to other languages, which means it's easy to take it too far and end up with hard to understand and hard to maintain methods.
There is another side effect of how your method is constructed wherein the accumulator will only collect the values from the first hash that has a particular key, even if more than one hash has that key. Since I don't know if that's intentional or not, I'll preserve this functionality.
Here is a version of your method that returns what you expect:
def my_method(*args)
foo = { "key1"=>["var1","var2"],"key2"=>["var3","var4","var5"] }
bar = { "key3"=>["var6"],"key4"=>["var7","var8","var9"],"key5"=>["var10","var11","var12"] }
baz = { "key6"=>["var13","var14","var15","var16"] }
merged = [foo, bar, baz].reverse.inject({}, :merge)
args.inject([]) do |array, key|
array += Array(merged[key])
end
end
In general, I wouldn't define a method with built-in data, but I'm going to leave it in to be closer to your original method. Hash#merge combines two hashes and overwrites any duplicate keys in the original hash with those in the argument hash. The Array() call turns nil into an empty array when the key is not present, so you don't need to explicitly handle that case.
I would encourage you to look up the inject method - it's quite versatile and is useful in many situations. inject uses its own accumulator variable (optionally defined as an argument) which is yielded to the block as the first block parameter.
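As a small sketch of how inject threads its accumulator through the block (the lookup hash and key list here are made-up sample data):

```ruby
lookup = { "key1" => ["var1", "var2"], "key5" => ["var10"] }
keys   = ["key5", "missing", "key1"]

# The argument to inject ([] here) seeds the accumulator; the block's
# return value becomes the accumulator for the next element.
values = keys.inject([]) { |acc, key| acc + Array(lookup[key]) }

# Without a seed, the first element is used as the initial accumulator.
total = [1, 2, 3].inject { |sum, n| sum + n }
```

Array(nil) is an empty array, so the unknown key contributes nothing to the result.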

How can I know how many parameters a method passes to a block?

If I have two variables like a and h.
a = ["cat", "dog", "mat"]
h = {cat: 'gatto', dog: 'cane', mat: 'stuoia'} # (Italian translations)
And I call the method .each on them, if I don't know the kind of object they are pointing to, how can I know that the block passed to a.each can take one parameter and the block passed to h.each can take two?
In other words, when I pass a block to a method, how can I know how many block parameters the method will set?
Is there some_method which returns the number of parameters a block should take? So that obj.general_method_that_takes_a_block.some_method would return the number of parameters that general_method_that_takes_a_block passes to its block?
A straightforward way is:
a.each{|e| p [*e].length}
# => 1 1 1
h.each{|e| p [*e].length}
# => 2 2 2
An each block always gets a single parameter; it never gets two. In the Hash case, when you do this:
h.each { |k, v| ... }
Ruby is, more or less, doing this behind your back:
h.each { |a| k, v = a; ... }
So you could check if the block's argument is an Array:
e.each do |x|
if x.kind_of? Array
# e might be a Hash
else
# e might be an Array
end
end
The problem is that e might be something like [ [1,2], [3,4] ] which would incorrectly put you into the might be a Hash branch; this sort of e will also fool a [*e].length check.
I don't think there is any clean and simple way to know what you're iterating over from inside the block.
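Two related tools, sketched here as an illustration, not as a full answer to the question: a splat parameter collects whatever the method yields, so its length shows how many values were passed, and Proc#arity reports how many parameters a block declares. Neither tells you in advance what an arbitrary method will yield.

```ruby
a = ["cat", "dog", "mat"]
h = { cat: 'gatto', dog: 'cane', mat: 'stuoia' }

# A splat parameter gathers every yielded value into one array.
array_counts = []
a.each { |*args| array_counts << args.length }

hash_counts = []
h.each { |*args| hash_counts << args.length }

# Proc#arity describes the block itself, not the method it is given to.
two_params = proc { |k, v| }.arity
one_param  = proc { |x| }.arity
```

Both count arrays contain only 1s, which matches the point above: Hash#each yields one [key, value] array per entry, and it is the |k, v| parameter list that splits it.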
