Given that map() is defined by Enumerable, how can Hash#map yield two variables to its block? Does Hash override Enumerable#map()?
Here's a little example, for fun:
ruby-1.9.2-p180 :001 > {"herp" => "derp"}.map{|k,v| k+v}
=> ["herpderp"]
It doesn't override map.
Hash.new.method(:map).owner # => Enumerable
It yields two variables, which get collected into an array:
class Nums
include Enumerable
def each
yield 1
yield 1, 2
yield 3, 4, 5
end
end
Nums.new.to_a # => [1, [1, 2], [3, 4, 5]]
Given that map() is defined by Enumerable, how can Hash#map yield two variables to its block?
It doesn't. It yields a single object to its block, which is a two-element array consisting of the key and the value.
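You can check this quickly by taking the block parameter as a single variable (a small sketch; the hash literal is only for illustration):
{"herp" => "derp"}.map { |pair| pair }
# => [["herp", "derp"]]
{"herp" => "derp"}.map { |pair| pair.class }
# => [Array]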
It's just destructuring bind:
def without_destructuring(a, b) end
without_destructuring([1, 2])
# ArgumentError: wrong number of arguments (1 for 2)
def with_destructuring((a, b)) end # Note the extra parentheses
with_destructuring([1, 2])
def with_nested_destructuring((a, (b, c))) p a; p b; p c end
with_nested_destructuring([1, [2, 3]])
# 1
# 2
# 3
# Note the similarity to
a, (b, c) = [1, [2, 3]]
Theoretically, you would have to call map like this:
hsh.map {|(k, v)| ... }
And, in fact, for inject, you actually need to do that:
hsh.inject {|acc, (k, v)| ... }
However, Ruby is more lenient with argument checking for blocks than it is for methods. In particular:
If you yield more than one object, but the block only takes a single argument, all the objects are collected into an array.
If you yield a single object, but the block takes multiple arguments, Ruby performs destructuring bind. (This is the case here.)
If you yield more objects than the block takes arguments, the extra objects get ignored.
If the block takes more arguments than you are yielding, the extra arguments are bound to nil.
Basically, the same semantics as parallel assignment.
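Here is a quick sketch of some of those rules under modern Ruby semantics (the method names one and many are made up for illustration):
def one;  yield [1, 2]  end  # yields a single two-element array
def many; yield 1, 2, 3 end  # yields three objects

one  { |a, b| p [a, b] }              # => [1, 2]           destructuring bind
many { |a, b| p [a, b] }              # => [1, 2]           extra object ignored
many { |a, b, c, d| p [a, b, c, d] }  # => [1, 2, 3, nil]   missing argument bound to nil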
In fact, before Ruby 1.9, block arguments actually did have assignment semantics. This allowed you to do crazy things like this:
class << (a = Object.new); attr_accessor :b end
def wtf; yield 1, 2 end
wtf {|@a, a.b| } # WTF? The block body is empty!
p @a
# 1
p a.b
# 2
This crazy stuff works (in 1.8 and older), because block argument passing is treated the same as assignment. In other words, even though the above block is empty and doesn't do anything, the fact that block arguments are passed as if they had been assigned means that @a is set and the a.b= setter method is called. Crazy, huh? That's why it was removed in 1.9.
If you want to startle your co-workers, stop defining your setters like this:
attr_writer :foo
and instead define them like this:
define_method(:foo=) {|@foo|}
Just make sure someone else ends up maintaining it :-)
Related
I noticed that if you type object &.method, you get the same result as object.method.
For example:
1.class # => Integer
1 &.class # => Integer
'hello'.then { |x| x.equal?(x &.itself) } # => true
[1, 2, 3] &.map(&:next) # => [2, 3, 4]
I am unable to find documentation for the object &.method syntax.
How does this syntax work?
There are 2 separate operators here:
Safe navigation operator &. - This is the safe navigation operator, which was introduced in Ruby 2.3.0. It basically returns nil if the receiver is nil instead of raising NoMethodError (undefined method for nil:NilClass). E.g.:
a = 1
a.next
# => 2
a&.next
# => 2
a = nil
a.next
# => NoMethodError (undefined method `next' for nil:NilClass)
a&.next
# => nil ## No exception, returns nil
You can read more about it in the documentation.
Unary &: This operator is a little more complex. It is almost (but not quite) equivalent to calling #to_proc; for this discussion, though, let us think of it that way. So, if you have a Proc, prefixing it with & will call #to_proc on the Proc and convert it into a block:
multiply_by_2 = Proc.new { |x| x * 2 }
# => #<Proc:0x00007fb4771cf560>
# &multiply_by_2 almost equivalent to { |x| x * 2 } but is not correct syntax
[1, 2].map(&multiply_by_2)
# => [2, 4]
# equivalent to [1, 2].map { |x| x * 2 }
But what happens if we give a symbol like :abc to the & operator instead of a proc? It will try to call #to_proc on the symbol, and Ruby has defined Symbol#to_proc, which roughly translates to something like this:
def to_proc
# this will return some block like { |x| x.send(:abc) }
lambda { |x| x.send(self) }
end
So &:abc roughly translates to the block { |x| x.abc } using the transformation below:
&:abc =====> :abc.to_proc =====> { |x| x.send(:abc) } ====> { |x| x.abc }
So, instead of doing [1, 2, 3].map { |x| x.next }, you could do [1, 2, 3].map(&:next) as &:next is roughly equivalent to the block { |x| x.next }.
See unary & (which is the main source of what I have written here) for more reading.
It's Ruby syntax: & calls to_proc on the object and passes the result as a block to the method.
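As a quick illustration of that mechanism, any object that responds to to_proc can be passed with & (the Doubler class here is made up for the example):
class Doubler
  def to_proc
    proc { |x| x * 2 }   # & calls this and uses the returned proc as the block
  end
end

[1, 2, 3].map(&Doubler.new)  # => [2, 4, 6]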
An explanation from the Pickaxe book, Programming Ruby 1.9 & 2.0:
Blocks Can Be Objects

Blocks are like anonymous methods, but there's more to them than that. You can also convert a block into an object, store it in variables, pass it around, and then invoke its code later. Remember we said that you can think of blocks as being like an implicit parameter that's passed to a method? Well, you can also make that parameter explicit. If the last parameter in a method definition is prefixed with an ampersand (such as &action), Ruby looks for a code block whenever that method is called. That code block is converted to an object of class Proc and assigned to the parameter. You can then treat the parameter as any other variable. Here's an example where we create a Proc object in one instance method and store it in an instance variable. We then invoke the proc from a second instance method.
class ProcExample
def pass_in_block(&action)
@stored_proc = action
end
def use_proc(parameter)
@stored_proc.call(parameter)
end
end
Use it like so
eg = ProcExample.new
eg.pass_in_block { |param| puts "The parameter is #{param}" }
eg.use_proc(99)
produces:
The parameter is 99
I am iterating over an array, and I'm wondering if there's a shorthand to refer to the receiver of #each (or #each_with_index) method from within the iteration.
self returns main.
You should be able to just reference it:
my_thing.each {|one_thing| puts my_thing }
This is pretty similar to the answer I gave here https://stackoverflow.com/a/45421168/2981429 but slightly different.
First off, you can create a scope with self bound to the array, and then execute the each in that scope:
[1].instance_exec do
# in this scope, self is the array
# thus we can use just 'each' because the self is inferred
each do |x|
# note that since 'class' is a special keyword,
# it needs to be explicitly namespaced on self
puts self.class, x
end
end
# => prints Array, 1
You can create a utility function to do this, if you want:
def bound_each(enumerable, &blk)
enumerable.instance_exec { each &blk }
end
bound_each([1]) { |x| puts self.class, x }
# prints Array, 1
You can call your each method within an Object#tap block and reference the original receiver like that.
[1, 2, 3].tap { |i| i.each { |j| p i.dup << j } }
# [1, 2, 3, 1]
# [1, 2, 3, 2]
# [1, 2, 3, 3]
#=> [1, 2, 3]
Here the receiving object is [1, 2, 3] and is passed to the block-variable i which we can use locally or in nested scopes such as each's block.
Avoid modifying the receiving object, or you may end up with undesired results such as an infinite array. Using dup, as above, avoids this possibility.
This is an interesting question. As far as I know it's not possible – the closest I can come up with would be to use inject (or reduce) and explicitly pass the receiver as an argument. A bit pointless, but there might be a use-case for it that I'm not seeing:
a = [1,2,3]
a.inject(a) do |this, element|
this == a #=> true
this.include?(element) #=> true
this
end
Apart from looking a bit redundant, you have to be very sure to return this at the end of each iteration, as the return value will become this in the next iteration. For that reason (and the fact that you could just reference your collection in an each block, as in David's answer) I don't recommend using this.
Edit: as Simple Lime pointed out in the comments, I missed the obvious Enumerator#with_object, which has the same (rather pointless) effect, but without the drawback of having to return this at the end of each iteration. For example:
a = [1,2,3]
a.map.with_object(a) do |element, this|
this == a #=> true, for each iteration
end
I still don't recommend that you use this though.
Ok, reviewing Procs, lambdas, and blocks via this link.
Question on this code:
class Array
def iterate!
self.each_with_index do |n, i|
self[i] = yield(n)
end
end
end
array = [1, 2, 3, 4]
array.iterate! do |n|
n ** 2
end
puts array.inspect
Conceptually, I understand almost everything, except one line which is this:
self[i] = yield(n)
I get that this self in this line self.each_with_index do |n, i| means that it's a class method, right?
But why do we need to assign the parameters in yield(n) to self[i]?
Please explain in super basic way if you can.
(in other words, please be nice - which people generally are for most part here - just a little extra nervous that I'm not getting this which is making me feel stupid)
The method is iterate!, which is an instance method. self in self.each_with_index is the receiver of the method Enumerable#each_with_index. Since self is the current instance of Array ([1,2,3,4] in your example), self. is not needed; i.e., you could (and imo, should) just write each_with_index do |n, i|.... In other words, self is the implied receiver when no explicit receiver is specified.
Regarding the line:
self[i] = yield(n)
for your example array = [1,2,3,4] your enumerator is:
enum = [1,2,3,4].each_with_index
#=> #<Enumerator: [1, 2, 3, 4]:each_with_index>
with elements
enum.to_a
#=> [[1, 0], [2, 1], [3, 2], [4, 3]]
The first element passed into the block by each_with_index is therefore [1, 0], which is assigned to the block variables:
n = 1
i = 0
resulting in
self[0] = yield(1) => 1**2 => 1
and so on.
I'll try to explain in a super basic way.
I get that this self in this line self.each_with_index do |n, i| means
that it's a class method, right?
Nope. The meaning of self depends on the context. If self was in the class, it would refer to the class. But here self is in an instance method, so it refers to the instance (so each_with_index is also an instance method).
But why do we need to assign the parameters in yield(n) to self[i]?
The goal of iterate! is to modify the array in place. Since self refers to the instance, self[i] accesses the elements of the array that iterate! is being called on, thus modifying the array in place.
Also, I'm not sure what you mean by "parameters" here. yield(n) passes n to the block, runs the block, and returns the value.
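To make the in-place effect concrete, here is the example from the question run end to end (same code as above):
array = [1, 2, 3, 4]
array.iterate! { |n| n ** 2 }
array  # => [1, 4, 9, 16] (the very same array object has been modified)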
self[i] = yield(n) reassigns each value in the array to the result of the block that was specified in
array.iterate! do |n|
n ** 2
end
which basically means: take each value of the array, square it, and save that value back into the same element of the array. So [1, 2, 3, 4] becomes [1 ** 2, 2 ** 2, 3 ** 2, 4 ** 2] => [1, 4, 9, 16]
self changes with (and actually is) the current context or surrounding object.
Since
self.each_with_index do |n, i|
...
is monkey patching the Array class and is within an instance method iterate!, self refers to the instance itself: in this case the array [1, 2, 3, 4].
You're probably thinking of this:
class SomeClass
def self.a_class_method
...
which is defined in the context of a class. So self is the class itself (which is also an object), not an instance of that class.
Since self is just the array [1, 2, 3, 4]
self[i] = yield(n)
is replacing each element of the array with results of the sent in block.
Here iterate! is an instance method of the Array class and you have an array object. When you do
array.iterate! do |n|
n ** 2
end
you are passing the block do |n| n ** 2 end to the iterate! method. In the method you can access this block using yield. But as you can see, the block expects one parameter through |n|, so you need to pass one argument, and the block will return its square.
self[i] = yield(n)
self is being used in the Array instance context, so it is modifying the values of the array.
For more information please check this article:
http://geekdirt.com/blog/blocks-lambda-and-procs-in-ruby/
How can I create an enumerator that optionally takes a block? I want some method foo to be called with some arguments and an optional block. If it is called with a block, iteration should happen on the block, and something should be done on it, involving the arguments given. For example, if foo were to take a single array argument, apply map on it with the given block, and return the result of join applied to it, then it would look like:
foo([1, 2, 3]){|e| e * 3}
# => "369"
If it is called without a block, it should return an enumerator, on which instance methods of Enumerator (such as with_index) should be able to apply, and execute the block in the corresponding way:
enum = foo([1, 2, 3])
# => Enumerator
enum.with_index{|e, i| e * i}
# => "026"
I defined foo using a condition to see if a block is given. It is easy to implement the case where the block is given, but the part returning the enumerator is more difficult. I guess I need to implement a subclass MyEnum of Enumerator and make foo return an instance of it:
def foo a, &pr
if pr
a.map(&pr).join
else
MyEnum.new(a)
end
end
class MyEnum < Enumerator
def initialize a
@a = a
...
end
...
end
But calling MyEnum.new raises a warning message: Enumerator.new without a block is deprecated; use Object#to_enum. If I use to_enum, I think it would return a plain Enumerator instance, not the MyEnum with the specific feature built in. On top of that, I am not sure how to implement MyEnum in the first place. How can I implement such enumerator? Or, what is the right way to do this?
You could do something like this.
def foo a, &pr
if pr
a.map(&pr).join
else
o = Object.new
o.instance_variable_set :@a, a
def o.each *y
foo(@a.map { |z| yield z, *y }) { |e| e }
end
o.to_enum
end
end
Then we have
enum = foo([1,2,3])
enum.each { |x| 2 * x } # "246"
or
enum = foo([1,2,3])
enum.with_index { |x, i| x * i } # "026"
Inspiration was drawn from the Enumerator documentation. Note that all of the expectations about enumerators that you asked for hold, because .to_enum takes care of all that. enum is now a legitimate Enumerator!
enum.class # Enumerator
If I create an Enumerator like so:
enum = [1,2,3].each # => #<Enumerator: [1, 2, 3]:each>
enum is an Enumerator. What is the purpose of this object? I can't say this:
enum { |i| puts i }
But I can say this:
enum.each { |i| puts i }
That seems redundant, because the Enumerator was created with .each. It seems like it's storing some data regarding the each method.
I don't understand what's going on here. I'm sure there is some logical reason we have this Enumerator class, but what can it do that an Array can't? I thought maybe it was an ancestor of Array and other Enumerables, but it doesn't seem to be. What exactly is the reason for the existence of the Enumerator class, and in what context would it ever be used?
What happens if you do enum = [1,2,3].each; enum.next?:
enum = [1,2,3].each
=> #<Enumerator: [1, 2, 3]:each>
enum.next
=> 1
enum.next
=> 2
enum.next
=> 3
enum.next
StopIteration: iteration reached an end
This can be useful when you have an Enumerator that does a calculation, such as a prime-number calculator, or a Fibonacci-sequence generator. It provides flexibility in how you write your code.
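For example, here is a minimal sketch of a Fibonacci generator built as an Enumerator; because values are produced on demand, you can take as few or as many as you need:
fib = Enumerator.new do |yielder|
  a, b = 0, 1
  loop do
    yielder << a        # hand one value to the consumer, then pause
    a, b = b, a + b
  end
end

fib.take(8)  # => [0, 1, 1, 2, 3, 5, 8, 13]
fib.next     # => 0
fib.next     # => 1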
I think the main purpose is to get elements on demand instead of getting them all in a single loop. I mean something like this:
e = [1, 2, 3].each
... do stuff ...
first = e.next
... do stuff with first ...
second = e.next
... do more stuff with second ...
Note that those do stuff parts can be in different functions far far away from each other.
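Because the enumerator object carries its own position, those pieces can even live in different methods. A small sketch (first_two is a made-up helper):
def first_two(e)
  [e.next, e.next]   # advances the enumerator it was handed
end

e = [1, 2, 3].each
first_two(e)  # => [1, 2]
e.next        # => 3, the position survived the method call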
Lazily evaluated infinite sequences (e.g. primes, Fibonacci numbers, string keys like 'a'..'z','aa'..'az','ba'..'zz','aaa'.. etc.) are a good use case for enumerators.
As answered so far, Enumerator comes in handy when you want to iterate through a sequence of data of potentially infinite length.
Take a prime number generator prime_generator built on Enumerator, for example. If we want to get the first 5 primes, we can simply write prime_generator.take 5 instead of embedding the "limit" into the generating logic. Thus we can separate generating prime numbers from taking a certain number of the generated primes, making the generator reusable.
I, for one, like method chaining using methods of Enumerable that return an Enumerator, as in the following example (it may not be a "purpose", but I want to point out an aesthetic aspect of it):
prime_generator.take_while{|p| p < n}.each_cons(2).find_all{|pair| pair[1] - pair[0] == 2}
Here the prime_generator is an instance of Enumerator that returns primes one by one. We can take prime numbers below n using take_while method of Enumerable. The methods each_cons and find_all both return Enumerator so they can be chained. This example is meant to generate twin primes below n. This may not be an efficient implementation but is easily written within a line and IMHO suitable for prototyping.
Here is a pretty straightforward implementation of prime_generator based on Enumerator:
def prime?(n)
n == 2 or
(n >= 3 and n.odd? and (3...n).step(2).all?{|k| n%k != 0})
end
prime_generator = Enumerator.new do |yielder|
n = 1
while true
yielder << n if prime? n
n += 1
end
end
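With those definitions in place, the chaining shown above can be tried directly (output values computed by hand, as a usage sketch):
prime_generator.take(5)
# => [2, 3, 5, 7, 11]
prime_generator.take_while { |p| p < 20 }.each_cons(2).find_all { |pair| pair[1] - pair[0] == 2 }
# => [[3, 5], [5, 7], [11, 13], [17, 19]]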
It is possible to combine enumerators:
array.each.with_index { |el, idx| ... }
To understand the major advantage of the enumerator class, you first need to distinguish internal and external iterators. With internal iterators, the iterator itself controls the iteration. With external iterators, the client (often times the programmer) controls the iteration. Clients that use an external iterator must advance the traversal and request the next element explicitly from the iterator. In contrast, the client hands an internal iterator an operation to perform, and the iterator applies that operation to every element in the collection.
In Ruby, the Enumerator class enables you to make use of external iterators. And once you understand external iterators you will begin to discover a lot of advantages. First, let's look how the Enumerator class facilitates external iteration:
class Fruit
def initialize
@kinds = %w(apple orange pear banana)
end
def kinds
yield @kinds.shift
yield @kinds.shift
yield @kinds.shift
yield @kinds.shift
end
end
f = Fruit.new
enum = f.to_enum(:kinds)
enum.next
=> "apple"
f.instance_variable_get :@kinds
=> ["orange", "pear", "banana"]
enum.next
=> "orange"
f.instance_variable_get :@kinds
=> ["pear", "banana"]
enum.next
=> "pear"
f.instance_variable_get :@kinds
=> ["banana"]
enum.next
=> "banana"
f.instance_variable_get :@kinds
=> []
enum.next
StopIteration: iteration reached an end
It's important to note that calling to_enum on an object and passing a symbol that corresponds to a method will instantiate the Enumerator class; in our example, the enum local variable holds an Enumerator instance. We then use external iteration to traverse the enumeration method we created. Our enumeration method is called kinds, and notice we use the yield keyword, which we typically do with blocks. Here, the enumerator will yield one value at a time. It pauses after each yield. When asked for another value, it resumes immediately after the last yielded value and executes up to the next yield. When nothing is left to yield and you call next, it raises a StopIteration exception.
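One related detail, not shown above but worth knowing: Kernel#loop rescues StopIteration for you, so draining an enumerator with next inside loop ends cleanly instead of raising:
enum = %w(apple orange).each
loop { puts enum.next }
# apple
# orange
# (the loop exits quietly when StopIteration is raised)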
So what is the power of external iteration in Ruby? There are several benefits and I will highlight a few of them. First, the Enumerator class allows for chaining. For example, with_index is defined in the Enumerator class and it allows us to specify a start value for iteration when iterating over an Enumerator object:
f.instance_variable_set :@kinds, %w(apple orange pear banana)
enum.rewind
enum.with_index(1) do |name, i|
puts "#{name}: #{i}"
end
apple: 1
orange: 2
pear: 3
banana: 4
Second, it provides a TON of useful convenience methods from the Enumerable module. Remember Enumerator is a class and Enumerable is a module, but the Enumerable module is included in the Enumerator class and so Enumerators are Enumerable:
Enumerator.ancestors
=> [Enumerator, Enumerable, Object, Kernel, BasicObject]
f.instance_variable_set :@kinds, %w(apple orange pear banana)
enum.rewind
enum.detect {|kind| kind =~ /^a/}
=> "apple"
enum
=> #<Enumerator: #<Fruit:0x007fb86c09bdf8 @kinds=["orange", "pear", "banana"]>:kinds>
And there is one other major benefit of Enumerator that might not be immediately clear. Let me explain this through a demonstration. As you probably know, you can make any of your user-defined classes Enumerable by including the Enumerable module and defining an each instance method:
class Fruit
include Enumerable
attr_accessor :kinds
def initialize
@kinds = %w(apple orange pear banana)
end
def each
@kinds.each { |kind| yield kind }
end
end
This is cool. Now we have a ton of Enumerable instance method goodies available to us like chunk, drop_while, flat_map, grep, lazy, partition, reduce, take_while and more.
f.partition {|kind| kind =~ /^a/ }
=> [["apple"], ["orange", "pear", "banana"]]
It's interesting to note that the instance methods of the Enumerable module actually call our each method behind the scenes in order to get the enumerable items. So if we were to implement the reduce method, it might look something like this:
module Enumerable
def reduce(acc)
each do |value|
acc = yield(acc, value)
end
acc
end
end
Notice how it passes a block to the each method and so our each method is expected to yield something back to the block.
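Assuming the Fruit class and the simplified reduce sketch above, it can be exercised like this (the initial accumulator is required by this simplified signature):
f = Fruit.new
f.reduce("") { |acc, kind| acc + kind[0] }
# => "aopb"  (first letter of apple, orange, pear, banana)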
But look what happens if client code calls the each method without specifying a block:
f.each
LocalJumpError: no block given (yield)
So now we can modify our each method to use enum_for, which will return an Enumerator object when a block is not given:
class Fruit
include Enumerable
attr_accessor :kinds
def initialize
@kinds = %w(apple orange pear banana)
end
def each
return enum_for(:each) unless block_given?
@kinds.each { |kind| yield kind }
end
end
f = Fruit.new
f.each
=> #<Enumerator: #<Fruit:0x007ff70aa3b548 @kinds=["apple", "orange", "pear", "banana"]>:each>
And now we have an Enumerator instance we could control with our client code for later use.
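For instance, the returned Enumerator can be stepped through manually or chained further (a small usage sketch with the class above):
enum = f.each
enum.next  # => "apple"
enum.next  # => "orange"

f.each.with_index(1).map { |kind, i| "#{i}. #{kind}" }
# => ["1. apple", "2. orange", "3. pear", "4. banana"]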