In some code, I found:
class Job
  @@types = [:a, :b, :c, :d].reduce({}) do |acc, cmd|
    acc[cmd] = cmd.to_s
    acc
  end
  # ...
end
There's nothing passed into reduce. What does that mean?
There's a single acc. What does that mean?
reduce is called with an empty hash. This means that the value of acc in the first call to the block will be {}.
In Ruby, the last statement within a function is the return value, so the block returns acc.
You probably need to read what reduce does exactly to understand this code.
There's nothing passed into reduce. What does that mean?
That's not true. There is a positional argument {} passed into reduce as well as a block.
But even if nothing were passed, what's the big deal? There's nothing being passed into to_s either, yet somehow that doesn't seem to bother you.
There's a single acc. What does that mean?
It means the same thing as the acc on the line before: dereference the variable.
Read the documentation for reduce
The first argument ({}) is the initial value of acc, and acc is what will be returned when reduce finishes.
It is just transforming the array into a hash, the final result is:
{:a=>"a", :b=>"b", :c=>"c", :d=>"d"}
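To see how acc evolves from call to call, here is a small sketch that records a snapshot of the accumulator at the start of every block call. The method name trace_reduce and the steps array are made up for illustration; only reduce and the block body come from the code in the question.

```ruby
# Hypothetical helper: runs the same reduce as in the question, but keeps
# a copy of the accumulator as it looks at the start of each block call.
def trace_reduce(commands)
  steps = []
  result = commands.reduce({}) do |acc, cmd|
    steps << acc.dup      # snapshot, because acc is mutated on the next line
    acc[cmd] = cmd.to_s   # add one entry to the hash
    acc                   # the block's value becomes acc for the next call
  end
  [result, steps]
end

result, steps = trace_reduce([:a, :b, :c, :d])
p steps.first   # the initial value that was passed to reduce (an empty hash)
p steps.last    # the accumulator just before the last element is added
p result        # the finished hash
```

The first snapshot is the {} passed to reduce, and each later snapshot is whatever the previous block call returned.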
Related
The following code returns this error:
block in find_word_lengths': undefined method `[]=' for 3:Integer (NoMethodError)
animals = ['cat', 'horse', 'rabbit', 'deer']
def find_word_lengths(word_list)
word_list.reduce(Hash.new()) do |result, animal|
result[animal] = animal.length
end
end
puts find_word_lengths(animals)
The return value of the block is the accumulator value for the next iteration. That is how a fold works.
Assignments in Ruby evaluate to the right-hand side. So, in the first iteration of reduce, the block evaluates to 3 (the length of 'cat'). Which means that in the second iteration of reduce, result is 3, and you are essentially running
3['horse'] = 5
# which is equivalent to
3.[]=('horse', 5)
Which is why you are getting the error message that the Integer 3 does not respond to the message []=.
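Both halves of that explanation can be checked in isolation. The sketch below uses two hypothetical helper methods (assignment_value and broken_word_lengths are names invented here): one returns the value of an index assignment, the other reproduces the failing reduce and hands back the exception instead of crashing.

```ruby
# Hypothetical helper: returns the value of an index-assignment expression.
def assignment_value
  result = {}
  (result['cat'] = 3)   # evaluates to the right-hand side, not to the hash
end

# Hypothetical helper: same block as in the question. The block's value
# (an Integer) becomes the accumulator for the second element.
def broken_word_lengths(word_list)
  word_list.reduce({}) do |result, animal|
    result[animal] = animal.length
  end
rescue NoMethodError => e
  e
end

p assignment_value                                # => 3
p broken_word_lengths(['cat', 'horse']).class     # => NoMethodError
```

With a single-element list the error never shows up, because the block is called only once and its Integer result is simply returned.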
So, you need to make sure that your block always returns the value that you want to use for the accumulator in the next iteration. Something like this:
word_list.reduce(Hash.new()) do |result, animal|
result.tap {|result| result[animal] = animal.length }
end
This would be the obvious solution, although somewhat cheating.
word_list.reduce(Hash.new()) do |result, animal|
result.merge(animal => animal.length)
end
Would be more idiomatic.
However, when you want to fold into a mutable object, it makes more sense to use Enumerable#each_with_object instead of Enumerable#reduce. each_with_object ignores the result of the block, and simply passes the same object every time. Note that somewhat confusingly, the order of the block parameters is swapped in each_with_object compared to reduce.
word_list.each_with_object(Hash.new()) do |animal, result|
result[animal] = animal.length
end
But I guess the most idiomatic solution would be something like this:
word_list.map {|word| [word, word.length] }.to_h
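As a quick sanity check, the variants can be put side by side. This is only a sketch: the method names and the ANIMALS constant are invented for the comparison, while the bodies are the snippets from this answer.

```ruby
ANIMALS = ['cat', 'horse', 'rabbit', 'deer'].freeze

# reduce with a block that returns a new hash on every call
def lengths_with_merge(words)
  words.reduce({}) { |result, word| result.merge(word => word.length) }
end

# each_with_object ignores the block's value and keeps passing the same hash
def lengths_with_each_with_object(words)
  words.each_with_object({}) { |word, result| result[word] = word.length }
end

# build [key, value] pairs first, then convert them to a hash
def lengths_with_to_h(words)
  words.map { |word| [word, word.length] }.to_h
end

expected = { 'cat' => 3, 'horse' => 5, 'rabbit' => 6, 'deer' => 4 }
p lengths_with_merge(ANIMALS) == expected              # => true
p lengths_with_each_with_object(ANIMALS) == expected   # => true
p lengths_with_to_h(ANIMALS) == expected               # => true
```

The merge variant allocates a new hash per element, which is one reason each_with_object or map plus to_h tends to be preferred for larger lists.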
By the way, in Ruby, it is idiomatic to leave out the parentheses for the argument list if you are not passing any arguments, so Hash.new() should be Hash.new instead. Even more important than being idiomatic is being consistent: confusingly, you leave out the parentheses for animal.length, but not for Hash.new.
Even more idiomatically, you would use the Hash literal notation instead of the Hash::new method, i.e. you should use {} instead of Hash.new.
Consider the following:
(1..10).inject{|memo, n| memo + n}
Question:
How does n know that it is supposed to store all the values from 1..10? I'm confused how Ruby is able to understand that n can automatically be associated with (1..10) right away, and memo is just memo.
I know Ruby code blocks aren't the same as the C or Java code blocks--Ruby code blocks work a bit differently. I'm confused as to how variables that are in between the upright pipes '|' will automatically be assigned to parts of an object. For example:
hash1 = {"a" => 111, "b" => 222}
hash2 = {"b" => 333, "c" => 444}
hash1.merge(hash2) {|key, old, new| old}
How do '|key, old, new|' automatically assign themselves in such a way such that when I type 'old' in the code block, it is automatically aware that 'old' refers to the older hash value? I never assigned 'old' to anything, just declared it. Can someone explain how this works?
The parameters for the block are determined by the method definition. The definition of reduce/inject is overloaded (see the docs) and implemented in C, but if you wanted to define it yourself, you could do it like this (note that this doesn't cover all the overloaded cases of the actual reduce definition):
module Enumerable
  def my_reduce(memo = nil, &blk)
    # to_a is used because not every Enumerable (e.g. a Range) supports []
    items = to_a
    # if a starting memo is not given, it defaults to the first element
    # in the list and that element is skipped for iteration
    elements = memo.nil? ? items.drop(1) : items
    memo = items.first if memo.nil?
    elements.each { |element| memo = blk.call(memo, element) }
    memo
  end
end
This method definition determines what values to use for memo and element and calls the blk variable (a block passed to the method) with them in a specific order.
Note, however, that blocks are not like regular methods, because they don't check the number of arguments. The following example uses yield, which is another way to call a block passed to a method:
def foo
yield 1
end
# The b and c variables here will be nil
foo { |a, b, c| [a,b,c].compact.sum } # => 1
You can also use deconstruction to define variables at the time you run the block, for example if you wanted to reduce over a hash you could do something like this:
# this just copies the hash
{a: 1}.reduce({}) { |memo, (key, val)| memo[key] = val; memo }
How this works: reduce iterates over the hash with each, which yields every entry as a [key, value] pair, the same pairs you get from to_a (e.g. {a: 1}.to_a # => [[:a, 1]]). reduce passes each pair as the second argument to the block, and the parentheses in the block's parameter list destructure the pair into separate key and val variables.
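A short sketch makes the difference between the destructured and the plain form visible. The two method names are hypothetical; the first block is the one from the example above.

```ruby
# With (key, val) in the parameter list, each pair is split into two variables.
def copy_hash(hash)
  hash.reduce({}) { |memo, (key, val)| memo[key] = val; memo }
end

# Without the parentheses, the block receives each entry as one
# two-element array, which is collected here unchanged.
def pairs_seen_by_block(hash)
  hash.reduce([]) { |memo, pair| memo << pair }
end

p copy_hash({ a: 1, b: 2 }) == { a: 1, b: 2 }   # => true
p pairs_seen_by_block({ a: 1, b: 2 })           # => [[:a, 1], [:b, 2]]
```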
A code block is just a function with no name. Like any other function, it can be called multiple times with different arguments. If you have a method
def add(a, b)
a + b
end
How does add know that sometimes a is 5 and sometimes a is 7?
Enumerable#inject simply calls the function once for each element, passing the element as an argument.
It looks a bit like this:
module Enumerable
def inject(memo)
each do |el|
memo = yield memo, el
end
memo
end
end
And memo is just memo
What do you mean, "just memo"? memo and n take whatever values inject passes. It is implemented to pass the accumulator/memo as the first argument and the current collection element as the second.
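One way to convince yourself is to record what inject actually passes on each call. The helper name trace_inject and the calls array are invented for this sketch; the block is the one from the question, applied to a shorter range.

```ruby
# Hypothetical helper: records every (memo, n) pair the block receives.
def trace_inject(range)
  calls = []
  total = range.inject do |memo, n|
    calls << [memo, n]   # remember the arguments of this call
    memo + n             # becomes memo for the next call
  end
  [total, calls]
end

total, calls = trace_inject(1..4)
p total   # => 10
p calls   # => [[1, 2], [3, 3], [6, 4]]
```

Because no initial value is given, the first element (1) is used as the first memo and iteration starts at 2.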
How do '|key, old, new|' automatically assign themselves
They don't "assign themselves". merge assigns them. Or rather, passes those values (key, old value, new value) in that order as block parameters.
If you instead write
hash1.merge(hash2) {|foo, bar, baz| bar}
It'll still work exactly as before. Parameter names mean nothing [here]. It's actual values that matter.
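To make "merge assigns them" concrete, here is a rough sketch of how a method could hand three values to its block. merge_with_block is a hypothetical stand-in written for this answer, not how Hash#merge is actually implemented.

```ruby
# Hypothetical stand-in for the conflict handling of Hash#merge.
def merge_with_block(left, right)
  result = left.dup
  right.each do |key, new_value|
    if result.key?(key)
      # The method, not the block, decides which values are passed
      # and in which order: key, old value, new value.
      result[key] = yield(key, result[key], new_value)
    else
      result[key] = new_value
    end
  end
  result
end

hash1 = { "a" => 111, "b" => 222 }
hash2 = { "b" => 333, "c" => 444 }

with_names = merge_with_block(hash1, hash2) { |key, old, new| old }
with_other = merge_with_block(hash1, hash2) { |foo, bar, baz| bar }
p with_names == with_other                                        # => true
p with_names == hash1.merge(hash2) { |key, old, new| old }        # => true
```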
Just to simplify some of the other good answers here:
If you are struggling to understand blocks, an easy way to think of them is as a primitive, temporary method that you create and execute in place; the names between the pipe characters, as in |memo|, are simply its parameter list.
There is no special concept behind the arguments; they are simply there for the method you are invoking to pass values to, like calling any other method with an argument. As with a method, the arguments are "local" variables within the scope of the block (there are some nuances to this depending on the syntax you use to call the block, but I digress, that is another matter).
The method you pass the block to simply invokes this "temporary method" and passes it whatever arguments it was designed to pass. This is just like calling a method normally, with some slight differences, such as there being no "required" arguments. If you do not declare any parameters, the block will happily ignore the values passed to it instead of raising an ArgumentError. Likewise, if you declare more parameters than the method passes, the extra ones will simply be nil within the block, with no error.
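The lenient argument handling can be seen in a few lines. All method names below are hypothetical; the lambda is included only as a contrast, since lambdas check their argument count the way methods do.

```ruby
def call_with_one_argument
  yield 1
end

# The block declares three parameters but receives one: b and c are nil.
def missing_block_parameters
  call_with_one_argument { |a, b, c| [a, b, c] }
end

# The block declares no parameters: the passed value is silently ignored.
def no_block_parameters
  call_with_one_argument { :ignored_argument }
end

# A lambda called with too few arguments raises instead.
def lambda_is_strict
  strict = lambda { |a, b| [a, b] }
  strict.call(1)
rescue ArgumentError => e
  e.class
end

p missing_block_parameters   # => [1, nil, nil]
p no_block_parameters        # => :ignored_argument
p lambda_is_strict           # => ArgumentError
```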
So I discovered this Ruby behaviour, which drove me crazy for over an hour. When I pass a hash to a function which has a default value for the hash AND a keyword argument, it seems like the reference doesn't get passed correctly. As soon as I take away the default value OR the keyword argument, the function behaves as expected. Am I missing some obvious Ruby rule here?
def change_hash(h={}, rand: :om)
h['hey'] = true
end
k = {}
change_hash(k)
k
#=> {}
It works fine as soon as I take out the default or the keyword arg.
def change_hash(h, rand: :om)
h['hey'] = true
end
k = {}
change_hash(k)
k
#=> {'hey' => true}
def change_hash(h={})
h['hey'] = true
end
k = {}
change_hash(k)
k
#=> {'hey' => true}
EDIT
Thanks for your answers. Most of you pointed out that ruby parses the hash as a keyword argument in some cases. However, I am talking about the case when a hash has string keys. When I pass the hash, it seems like the value that gets passed is correct. But modifying the hash inside the function doesn't modify the original hash.
def change_hash(hash={}, another_arg: 300)
puts "another_arg: #{another_arg}"
puts "hash: #{hash}"
hash['hey'] = 3
end
my_hash = {"o" => 3}
change_hash(my_hash)
puts my_hash
Prints out
another_arg: 300
hash: {"o"=>3}
{"o"=>3}
TL;DR: Ruby (before 3.0, which separated positional and keyword arguments) allows passing a hash as keyword arguments as well as an “expanded in-place hash.” Since change_hash(rand: :om) must be routed to the keyword argument, so must change_hash({rand: :om}) and, hence, change_hash({}).
Since Ruby allows default arguments in any position, the parser deals with default arguments first. That means that defaults are greedy: as many parameters as possible fall back to their default values.
On the other hand, since Ruby lacks pattern matching for function clauses, inspecting the given argument to decide whether it should be passed as a double splat would incur a huge performance penalty. Since a call with an explicit keyword argument (change_hash(rand: :om)) must definitely pass :om to the keyword argument, and we are allowed to pass an explicit hash {rand: :om} as keyword arguments, Ruby has no choice but to accept any hash as keyword arguments.
Ruby will split the single hash argument between hash and rand:
k = {"a" => 42, rand: 42}
def change_hash(h={}, rand: :om)
h[:foo] = 42
puts h.inspect
end
change_hash(k)
puts k.inspect
#⇒ {"a"=>42, :foo=>42}
#⇒ {"a"=>42, :rand=>42}
That split feature requires the argument being cloned before passing. That is why the original hash is not being modified.
This is a particularly tricky case in Ruby indeed.
In your example you have an optional argument which is a hash and an optional keyword argument at the same time. In this situation, if you pass only one hash, Ruby interprets it as a hash that contains keyword arguments. Here is some code to clarify:
change_hash({rand1: 'om'})
# ArgumentError: unknown keyword: rand1
To work around this, you can pass two separate hashes into the method, with the second one (the one for keyword arguments) being empty:
def change_hash(h={}, rand: 'om')
h['hey'] = true
end
k = {}
change_hash(k, {})
k
#=> {'hey' => true}
From a practical point of view, it is better to avoid method signatures like that in production code, because it is very easy to make a mistake when calling the method.
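One possible alternative, sketched here under the assumption that callers can always supply the hash, is to make it a required positional parameter, which is the variant the question already reports as working. The method names change_hash_required and demo_required_positional are invented for this example.

```ruby
# Hypothetical alternative signature: the hash is a required positional
# parameter, so a single hash argument is not an optional value that
# could be skipped in favour of keyword arguments.
def change_hash_required(h, rand: :om)
  h['hey'] = true
  rand
end

def demo_required_positional
  k = {}
  change_hash_required(k)   # k itself is passed and mutated
  k
end

p demo_required_positional == { 'hey' => true }   # => true
p change_hash_required({}, rand: :other)          # => :other
```

The keyword argument stays available, but it now has to be spelled out at the call site.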
Given this method definition:
def foo(a = nil, b: nil)
p a: a, b: b
end
When I invoke the method with a single hash argument, the hash is always implicitly converted to keyword arguments, regardless of **:
hash = {b: 1}
foo(hash) #=> {:a=>nil, :b=>1}
foo(**hash) #=> {:a=>nil, :b=>1}
I can pass another (empty) hash as a workaround:
foo(hash, {}) #=> {:a=>{:b=>1}, :b=>nil}
But, this looks pretty cumbersome and awkward.
I would have expected Ruby to handle this more like arrays are handled, i.e.:
foo(hash) #=> {:a=>{:b=>1}, :b=>nil}
foo(**hash) #=> {:a=>nil, :b=>1}
And using literals:
foo({b: 1}) #=> {:a=>{:b=>1}, :b=>nil}
foo(b: 1) #=> {:a=>nil, :b=>1}
foo(**{b: 1}) #=> {:a=>nil, :b=>1}
The current implementation looks like a flaw and the way I was expecting it to work seems obvious.
Is this an overlooked edge case? I don't think so. There's probably a good reason that it wasn't implemented this way.
Can someone enlighten me, please?
As for the lack of ** part:
My guess is that, to keep method invocation simple, Ruby always first interprets the key: value form without braces as a hash with omitted braces, whether it will eventually be treated as such a hash or as keyword arguments.
Then, in order to interpret that as keyword arguments, ** is implicitly applied to it.
Therefore, passing an explicit hash makes no difference to the process above, and there is room for it to be interpreted either as an actual hash or as keyword arguments.
What happens when you do pass ** explicitly like:
method(**{key: value})
is that the hash is decomposed:
method(key: value)
then is interpreted as a hash with omitted braces:
method({key: value})
then is interpreted either as a hash or as a keyword argument.
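Whatever the interpreter does internally, the observable part of this is easy to check for a method that only takes keywords: the bare key: value form and an explicit ** end up at the same parameter. The method names below are hypothetical and the sketch says nothing about the internal steps guessed at above.

```ruby
# Hypothetical method that accepts a single required keyword.
def keywords_only(key:)
  key
end

def same_call_three_ways
  options = { key: :value }
  [
    keywords_only(key: :value),         # bare key: value form
    keywords_only(**{ key: :value }),   # explicit double splat of a literal
    keywords_only(**options)            # explicit double splat of a variable
  ]
end

p same_call_three_ways   # => [:value, :value, :value]
```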
As for keyword arguments having priority over other arguments, see this post on Ruby core: https://bugs.ruby-lang.org/issues/11967.
Some block methods such as inject can optionally take a symbol instead of a block:
%w[a b c].inject(&:+)
# => "abc"
%w[a b c].inject(:+)
# => "abc"
%w[a b c].inject("", :+)
# => "abc"
while other block methods such as map cannot:
%w[a b c].map(&:upcase)
# => ["A", "B", "C"]
%w[a b c].map(:upcase)
# => ArgumentError: wrong number of arguments (1 for 0)
Why can't the latter take a symbol?
For inject, a block (or a substitute) is obligatory. If it weren't passed a block, then there has to be at least one argument, the last argument has to be a symbol, and the block would be constructed out of it. Whatever the arity, there is no ambiguity; the last argument is used to construct a block when the block is lacking.
For map, a block is optional. When there is no block given, then the return value would be an Enumerator instance. Hence, from the information whether a block was passed or not, it cannot be decided whether the last argument should be used to construct a block.
In the particular case of map, it does not take an argument, so there is a sense in saying that an extra argument should be taken as a block, but it makes things complicated to judge whether that last argument is to be taken as a block depending on the arity. And it also loses the future possibility of changing the arity of the method.
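The two behaviours described above can be compared directly. The method names are made up for the sketch; the calls themselves are plain Enumerable methods.

```ruby
# Four ways of giving inject its operation; all concatenate the strings.
def inject_forms
  words = %w[a b c]
  [
    words.inject(:+),                           # symbol names the method to call
    words.inject('', :+),                       # initial value plus symbol
    words.inject(&:+),                          # symbol turned into a block
    words.inject { |memo, word| memo + word }   # ordinary block
  ]
end

# map without a block does not raise: it returns an Enumerator.
def map_without_block
  %w[a b c].map
end

# That Enumerator can be chained, which is why the block is optional.
def map_with_index_via_enumerator
  %w[a b c].map.with_index { |word, i| "#{i}:#{word.upcase}" }
end

p inject_forms                    # => ["abc", "abc", "abc", "abc"]
p map_without_block.class         # => Enumerator
p map_with_index_via_enumerator   # => ["0:A", "1:B", "2:C"]
```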
I'm not sure, but as far as I know the & operator calls to_proc on its operand. inject accepts two arguments, first the accumulator and then the operation, whereas map accepts only a single proc or block. In inject, the first argument (the accumulator) can default to the first item of the enumerable.
It's just a special handling for this special case in some methods and lack of such handling in the others.