Enumerator without collection - ruby

I have run into this construction:
Enumerator.new(&:map)
and couldn't figure out how I can use it.
For sure this is valid code and it returns the enumerator, but it doesn't have any collection to enumerate over.
How to feed a collection to this enumerator?

With the current setup you can't pass any collection to it. You can't change the collection of the enumerator once instantiated.
The current code only works because is block is not instantly executed, therefore you see the error when you try to start iterating (or retrieving items).
enumerator = Enumerator.new(&:map)
enumerator.take(1)
# NoMethodError (undefined method `map' for #<Enumerator::Yielder:0x00000000055b6e90>)
This is because Enumerator::new yields a Enumerator::Yielder which doesn't has the method map.
The above could also be written as:
enumerator = Enumerator.new { |yielder| yielder.map }
If you would like to create an enumerator from a collection the easiest way is to call each without block. Other methods like map also create enumerators without block given.
enumerator = [1, 2, 3].each
#=> #<Enumerator: [1, 2, 3]:each>
If you for some reason still want to create the enumerator by hand it could look like this:
enumerator = Enumerator.new { |yielder| [1, 2, 3].each { |number| yielder << number } }
If the intent was to preselect a method of iterating before you receive the collection, you can do so in the following manner:
# assuming both the collection and block are passed by the user
map = :map.to_proc
result = map.call(collection, &block)
# which is equivalent to
result = collection.map(&block)

Related

Undefined method error when using 'yield' and 'even?'

When I try to call a method with select and num.even? as follows,
def selection(array)
puts "This is inside the method"
return yield(array)
end
collection = [1,2,3,4,5]
selection(collection.select) { |num| num.even? }
I get a no defined Method error:
undefined method `even?' for #<Enumerator: [1, 2, 3, 4, 5]:select>
I'm looking for a return of even numbers in the array. I can get the select even? combo work in other examples of an array.
Array#select returns an Enumerator instance if no block was given to it
then you call selection method passing the result of call to collection.select as an argument and { |num| num.even? } as block
inside your selection function you yield the argument (an Enumerator instance) to the block
in the block you call even? on the block argument, resulting in the error message you receive.
I am unsure what’s wrong with collection.select(&:even?), but if you want to re-implement it yourself, here you go:
def selection(array)
# convention: return enumerator unless block is given
return enum_for(:selection) unless block_given?
enumerator = array.each
result = []
loop do
(value = enumerator.next) rescue return result
result.push(value) if yield value
end
end
selection([1,2,3,4,5]) { |num| num.even? }
#⇒ [2, 4]
You are seeing this error because you are passing an enumerator object in to your "selection" method ... that is, the result of "collection.select" is an Enumerator and enumerators do not implement an "even" method.
I believe that you are trying to implement your own version of "select". The following is one way to achieve your stated intent: "I'm looking for a return of even numbers in the array."
def selection(array)
results = []
for item in array do
results << item if yield item
end
results
end
collection = [1,2,3,4,5]
puts selection(collection) { |num| num.even? }
# => [2,4]
https://mixandgo.com/learn/mastering-ruby-blocks-in-less-than-5-minutes is a nice reference
yield(array) is passing the whole array in one go to the block given to the method, so it is trying to call even? on the array.

Way to refer to the receiver of 'Array#each'

I am iterating over an array, and I'm wondering if there's a shorthand to refer to the receiver of #each (or #each_with_index) method from within the iteration.
self returns main.
You should be able to just reference it:
my_thing.each {|one_thing| puts my_thing }
This is pretty similar to the answer I gave here https://stackoverflow.com/a/45421168/2981429 but slightly different.
First off, you can create a scope with self bound to the array, and then execute the each in that scope:
[1].instance_exec do
# in this scope, self is the array
# thus we can use just 'each' because the self is inferred
each do |x|
# note that since 'class' is a special keyword,
# it needs to be explicitly namespaced on self
puts self.class, x
end
end
# => prints Array, 1
You can create a utility function to do this, if you want:
def bound_each(enumerable, &blk)
enumerable.instance_exec { each &blk }
end
bound_each([1]) { |x| puts self.class, x }
# prints Array, 1
You can call your each method within an Object#tap block and reference the original receiver like that.
[1, 2, 3].tap { |i| i.each { |j| p i.dup << j } }
# [1, 2, 3, 1]
# [1, 2, 3, 2]
# [1, 2, 3, 3]
#=> [1, 2, 3]
Here the receiving object is [1, 2, 3] and is passed to the block-variable i which we can use locally or in nested scopes such as each's block.
Avoid modifying the receiving object else you may end up with undesired results such as an infinite array. Using dup could allay this possibility.
This is an interesting question. As far as I know it's not possible – the closest I can come up with would be to use inject (or reduce) and explicitly pass the receiver as an argument. A bit pointless, but there might be a use-case for it that I'm not seeing:
a = [1,2,3]
a.inject(a) do |this, element|
this == a #=> true
this.include?(element) #=> true
this
end
Apart from looking a bit redundant, you have to be very sure to return this at the end of each iteration, as the return value will become this in the next iteration. For that reason (and the fact that you could just reference your collection in an each block, as in David's answer) I don't recommend using this.
Edit - as Simple Lime pointed out in the comments – I missed the obvious Enumerator#with_object, which has the same (rather pointless) effect, but without the drawback of having to return this at the end of each iteration. For example:
a = [1,2,3]
a.map.with_object(a) do |element, this|
this == a #=> true, for each iteration
end
I still don't recommend that you use this though.

Global array being changed by function

I just started studying Ruby a little while ago and I was having difficulties with global versus local variable scoping.
Working on a practice problem, I found that an array defined globally was being changed by a function called on it. If I explicitly assign the array to something else, nothing changes. But if I run through and delete items one by one, this deletes them from the global array itself.
Why do delete and pop (which I also tested) methods have this behavior? I understood from reading that this should not be happening, that the "array" inside the functions is a reference to the values of arr, rather than the variable arr.
(I'm using Ruby version 2+)
def change_int x
x += 2
end
def change_arr array
array = [4, 5, 6]
end
def pop_arr array
puts array
new_array = []
while array.length > 0
new_array.push array[0]
array.delete_at 0
end
array
end
x = 5
change_int x
puts x == 5 # true
arr = [1, 2, 3]
change_arr arr
puts arr == [1, 2, 3] # true
old_arr = arr
puts pop_arr arr
puts arr == [1, 2, 3] # false
puts "arr = #{arr}" # arr = []
You can see by printing #object_id before calling pop_arr and inside pop_arr that those arrays are the same objects. This means that arguments are passed into the function by reference in Ruby.
Here is code:
def pop_arr(array)
puts array.object_id
# Rest of the fucntion
end
arr = [1, 2, 3]
puts arr.object_id
pop_arr(arr)
All of this means that when you edit array inside the function it will have effect on the object which was passed. #delete, #delete_at, #pop are operations that change the Array on which they are made.
See also: Ruby - Parameters by reference or by value? and Is Ruby pass by reference or by value?.
The curious thing is that change_arr doesn't affect the global array, but pop_arr does, in your code.
Here's what's happening: ruby passes references to objects as parameters. So like Bartosz said, you can see that at the top of those methods, the object id matches the one you passed in; they're referencing the same object.
So, in pop_arr, when you call delete_at, you're operating on the same object that you passed in, and the changes persist after the method returns.
In change_arr, the difference is that you're assigning the internal var to a new object. When you pass in the parameter array, the internal variable references the same object you passed in. When you instantiate a new Array object and assign the internal array variable to it, the internal variable is now referencing a different object.
def change_arr array
puts "change id: #{array.object_id}"
array = [4, 5, 6]
puts "change id2: #{array.object_id}"
array
end
That's why the changes don't persist after the method ends. If you wanted the changes to persist, you'd have to say
array = change_arr(array)
Hope that helps.

Procs From "Understanding Ruby Blocks, Procs, and Lambdas" Article

Ok, reviewing Procs, lambdas, and blocks via this link.
Question on this code:
class Array
def iterate!
self.each_with_index do |n, i|
self[i] = yield(n)
end
end
end
array = [1, 2, 3, 4]
array.iterate! do |n|
n ** 2
end
puts array.inspect
Conceptually, I understand almost everything, except one line which is this:
self[i] = yield(n)
I get that this self in this line self.each_with_index do |n, i| means that it's a class method, right?
But why do we need to assign the parameters in yield(n) to self[i]?
Please explain in super basic way if you can.
(in other words, please be nice - which people generally are for most part here - just a little extra nervous that I'm not getting this which is making me feel stupid)
The method is iterate!, which is an instance method. self in self.each_with_index is the receiver of the method Enumerable#each_with_instance. Since self is the current instance of Array ([1,2,3,4] in your example), self. is not needed; i.e., you could (and imo, should) just write each_with_index do |n, i|.... In other words, self is the implied receiver when no explicit receiver is specified.
Regarding the line:
self[i] = yield(n)
for your example array = [1,2,3,4] your enumerator is:
enum = [1,2,3,4].each_with_index
#=> #<Enumerator: [1, 2, 3, 4]:each_with_index>
with elements
enum.to_a
#=> [[1, 0], [2, 1], [3, 2], [4, 3]]
The first element passed into block by Array#each is therefore [1,0], which is assigned to the block variables:
n = 1
i = 0
resulting in
self[0] = yield(1) => 1**2 => 1
and so on.
I'll try to explain in a super basic way.
I get that this self in this line self.each_with_index do |n, i| means
that it's a class method, right?
Nope. The meaning of self depends on the context. If self was in the class, it would refer to the class. But here self is in an instance method, so it refers to the instance (so each_with_index is also an instance method).
But why do we need to assign the parameters in yield(n) to self[i]?
The goal of iterate! is to modify the array in place. Since self refers to the instance, self[i] accesses the elements of the array that iterate! is being called on, thus modifying the array in place.
Also, I'm not sure what you mean by "parameters" here. yield(n) passes n to the block, runs the block, and returns the value.
self[i] = yield(n) reassigns the values in the array, to the block that was specified in
array.iterate! do |n|
n ** 2
end
which basically means, take the value of the array, and square it, save that value in the element of the array. So [1, 2, 3 , 4] becomes [1 ** 2, 2 ** 2, 3 ** 2, 4 ** 2] => [2, 4, 9, 16]
Self changes with(and actually is) the current context or surrounding object.
Since
self.each_with_index do |n, i|
...
is monkey patching the Array class and is within an instance method iterate!, self refers to the instance itself: in this case the array [1, 2, 3, 4].
You're probably thinking of this:
class some_class
def self.a_class_method
...
which is defined in the context of a class. So self is the class itself(which is also an object), not an instance of that class.
Since self is just the array [1, 2, 3, 4]
self[i] = yield(n)
is replacing each element of the array with results of the sent in block.
Here iterate! is an instance function of Array class and you have an array object.When you do
array.iterate! do |n|
n ** 2
end
You are passing a block 'do |n| n**2 end' to iterate! function.In the function you can access this block using yield.But as you can see block is expecting one parameter through |n| so you need to pass one parameter and the block code will return the square of it.
self[i] = yield(n)
self is being used in Array instance context.So it is modifying the values of array.
For more information please check this article:
http://geekdirt.com/blog/blocks-lambda-and-procs-in-ruby/

What is the purpose of the Enumerator class in Ruby

If I create an Enumertor like so:
enum = [1,2,3].each => #<Enumerator: [1, 2, 3]:each>
enum is an Enumerator. What is the purpose of this object? I can't say this:
enum { |i| puts i }
But I can say this:
enum.each { |i| puts i }
That seems redundant, because the Enumerator was created with .each. It seems like it's storing some data regarding the each method.
I don't understand what's going on here. I'm sure there is some logical reason we have this Enumerator class, but what can it do that an Array can't? I thought maybe it was an ancestor of Array and other Enumerables, but it doesn't seem to be. What exactly is the reason for the existence of the Enumerator class, and in what context would it ever be used?
What happens if you do enum = [1,2,3].each; enum.next?:
enum = [1,2,3].each
=> #<Enumerator: [1, 2, 3]:each>
enum.next
=> 1
enum.next
=> 2
enum.next
=> 3
enum.next
StopIteration: iteration reached an end
This can be useful when you have an Enumerator that does a calculation, such as a prime-number calculator, or a Fibonacci-sequence generator. It provides flexibility in how you write your code.
I think, the main purpose is to get elements by demand instead of getting them all in a single loop. I mean something like this:
e = [1, 2, 3].each
... do stuff ...
first = e.next
... do stuff with first ...
second = e.next
... do more stuff with second ...
Note that those do stuff parts can be in different functions far far away from each other.
Lazily evaluated infinite sequences (e.g. primes, Fibonacci numbers, string keys like 'a'..'z','aa'..'az','ba'..'zz','aaa'.. etc.) are a good use case for enumerators.
As answered so far, Enumerator comes in handy when you want to iterate through a sequence of data of potentially infinite length.
Take a prime number generator prime_generator that extends Enumerator for example. If we want to get the first 5 primes, we can simply write prime_generator.take 5 instead of embedding the "limit" into the generating logic. Thus we can separate generating prime numbers and taking a certain amount out of generated prime numbers making the generator reusable.
I for one like method chaining using methods of Enumerable returning Enumerator like the following example (it may not be a "purpose" but I want to just point out an aesthetic aspect of it):
prime_generator.take_while{|p| p < n}.each_cons(2).find_all{|pair| pair[1] - pair[0] == 2}
Here the prime_generator is an instance of Enumerator that returns primes one by one. We can take prime numbers below n using take_while method of Enumerable. The methods each_cons and find_all both return Enumerator so they can be chained. This example is meant to generate twin primes below n. This may not be an efficient implementation but is easily written within a line and IMHO suitable for prototyping.
Here is a pretty straightforward implementation of prime_generator based on Enumerator:
def prime?(n)
n == 2 or
(n >= 3 and n.odd? and (3...n).step(2).all?{|k| n%k != 0})
end
prime_generator = Enumerator.new do |yielder|
n = 1
while true
yielder << n if prime? n
n += 1
end
end
It is possible to combine enumerators:
array.each.with_index { |el, idx| ... }
To understand the major advantage of the enumerator class, you first need to distinguish internal and external iterators. With internal iterators, the iterator itself controls the iteration. With external iterators, the client (often times the programmer) controls the iteration. Clients that use an external iterator must advance the traversal and request the next element explicitly from the iterator. In contrast, the client hands an internal iterator an operation to perform, and the iterator applies that operation to every element in the collection.
In Ruby, the Enumerator class enables you to make use of external iterators. And once you understand external iterators you will begin to discover a lot of advantages. First, let's look how the Enumerator class facilitates external iteration:
class Fruit
def initialize
#kinds = %w(apple orange pear banana)
end
def kinds
yield #kinds.shift
yield #kinds.shift
yield #kinds.shift
yield #kinds.shift
end
end
f = Fruit.new
enum = f.to_enum(:kinds)
enum.next
=> "apple"
f.instance_variable_get :#kinds
=> ["orange", "pear", "banana"]
enum.next
=> "orange"
f.instance_variable_get :#kinds
=> ["pear", "banana"]
enum.next
=> "pear"
f.instance_variable_get :#kinds
=> ["banana"]
enum.next
=> "banana"
f.instance_variable_get :#kinds
=> []
enum.next
StopIteration: iteration reached an end
It's important to note that calling to_enum on an object and passing a symbol that corresponds to a method will instantiate Enumerator class and in our example, the enum local variable holds an Enumerator instance. And then we use external iteration to traverse through the enumeration method we created. Our enumeration method called "kinds" and notice we use the yield method, which we typically do with blocks. Here, the enumerator will yield one value at a time. It pauses after each yield. When asked for another value, it will resume immediately after the last yielded value, and execute up to the next yielded value. When nothing left to yield, and you call next, it will invoke StopIteration exception.
So what is the power of external iteration in Ruby? There are several benefits and I will highlight a few of them. First, the Enumerator class allows for chaining. For example, with_index is defined in the Enumerator class and it allows us to specify a start value for iteration when iterating over an Enumerator object:
f.instance_variable_set :#kinds, %w(apple orange pear banana)
enum.rewind
enum.with_index(1) do |name, i|
puts "#{name}: #{i}"
end
apple: 1
orange: 2
pear: 3
banana: 4
Second, it provides a TON of useful convenience methods from the Enumerable module. Remember Enumerator is a class and Enumerable is a module, but the Enumerable module is included in the Enumerator class and so Enumerators are Enumerable:
Enumerator.ancestors
=> [Enumerator, Enumerable, Object, Kernel, BasicObject]
f.instance_variable_set :#kinds, %w(apple orange pear banana)
enum.rewind
enum.detect {|kind| kind =~ /^a/}
=> "apple"
enum
=> #<Enumerator: #<Fruit:0x007fb86c09bdf8 #kinds=["orange", "pear", "banana"]>:kinds>
And there is one other major benefit of Enumerator that might not be immediately clear. Let me explain this through a demonstration. As you probably know, you can make any of your user-defined classes Enumerable by including the Enumerable module and defining an each instance method:
class Fruit
include Enumerable
attr_accessor :kinds
def initialize
#kinds = %w(apple orange pear banana)
end
def each
#kinds.each { |kind| yield kind }
end
end
This is cool. Now we have a ton of Enumerable instance method goodies available to us like chunk, drop_while, flat_map, grep, lazy, partition, reduce, take_while and more.
f.partition {|kind| kind =~ /^a/ }
=> [["apple"], ["orange", "pear", "banana"]]
It's interesting to note that each of the instance methods of Enumerable module actually call our each method behind the scenes in order to get the enumerable items. So if we were to implement the reduce method, it might look something like this:
module Enumerable
def reduce(acc)
each do |value|
acc = yield(acc, value)
end
acc
end
end
Notice how it passes a block to the each method and so our each method is expected to yield something back to the block.
But look what happens if client code calls the each method without specifying a block:
f.each
LocalJumpError: no block given (yield)
So now we can modify our each method to use enum_for, which will return an Enumerator object when a block is not given:
class Fruit
include Enumerable
attr_accessor :kinds
def initialize
#kinds = %w(apple orange pear banana)
end
def each
return enum_for(:each) unless block_given?
#kinds.each { |kind| yield kind }
end
end
f = Fruit.new
f.each
=> #<Enumerator: #<Fruit:0x007ff70aa3b548 #kinds=["apple", "orange", "pear", "banana"]>:each>
And now we have an Enumerator instance we could control with our client code for later use.

Resources