Enumerator chain starting with find - ruby

There's one case of an enumerator chain I can't get my head around:
[1, 2, 3, 4, 5].find.map { |x| x * x }
#=> [1, 4, 9, 16, 25]
This returns the array of squares of the initial values, but I would expect it to return just [1].
I tried to deconstruct everything, and this is what I came up with: .map is called on a find enumerator for the original array. It calls each on itself to get values for iteration. each on an enumerator delegates iteration to the method for which the enumerator was created, i.e. find. find gets the first element of the array and yields it, and it keeps being yielded up the chain until it reaches the block in the example. The value gets squared and the block returns it, the underlying each block in map's definition returns [1], and that drops back down to find. Since [1] is truthy, I expect find to return at this point, effectively ending iteration, but somehow it keeps feeding values from the array all the way up to the map block.
This is not a real-world example, I'm just trying to understand how to read these chains correctly, and this case got me confused.
UPD
Since it was suggested several times that find called without a block returns a 'default' enumerator, here's an example:
[1, 2, 3, 4, 5].find
#=> #<Enumerator: [1, 2, 3, 4, 5]:find>
[1, 2, 3, 4, 5].find.each { |x| x < 4 }
#=> 1

Ok, I finally figured it out.
The reason find does not terminate the iteration after the block processes the first value is that the collect_i iterator inside collect (aka map) in the Enumerable module explicitly returns nil after every iteration, no matter what the block passed to map or collect returned. Here it is, taken from enum.c:
static VALUE
collect_i(RB_BLOCK_CALL_FUNC_ARGLIST(i, ary))
{
    rb_ary_push(ary, enum_yield(argc, argv));
    return Qnil;
}
So the internal call to find on the initial array always gets nil as the result of yielding a value, and thus doesn't stop iterating until the last element is processed. This is easy to prove by downloading the Ruby source as an archive and modifying this function like this:
static VALUE
collect_i(RB_BLOCK_CALL_FUNC_ARGLIST(i, ary))
{
    rb_ary_push(ary, enum_yield(argc, argv));
    return ary;
}
After saving and building Ruby from the modified source, we get this:
irb(main):001:0> [1,2,3,4,5].find.map { |x| x*x }
=> [1]
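The same effect can be reproduced without rebuilding Ruby. The sketch below is only a pure-Ruby stand-in for map, not MRI's actual implementation; map_returning_nil and map_returning_array are hypothetical helpers whose inner blocks mimic the original and the patched collect_i respectively.

```ruby
# Hypothetical pure-Ruby stand-ins for Enumerable#map; not MRI's real code.

# Mirrors the original collect_i: the inner block always evaluates to nil,
# so find never sees a truthy value and keeps iterating.
def map_returning_nil(enum)
  result = []
  enum.each do |x|
    result << yield(x)
    nil
  end
  result
end

# Mirrors the patched collect_i: the inner block evaluates to the (truthy)
# result array, so find stops after the first element.
def map_returning_array(enum)
  result = []
  enum.each do |x|
    result << yield(x)
    result
  end
  result
end

all_squares = map_returning_nil([1, 2, 3, 4, 5].find) { |x| x * x }
first_only  = map_returning_array([1, 2, 3, 4, 5].find) { |x| x * x }

p all_squares # => [1, 4, 9, 16, 25]
p first_only  # => [1]
```

The only difference between the two helpers is the last expression of the inner block, which is exactly the value find inspects to decide whether to stop.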
Another interesting thing is that Rubinius implements collect exactly like this, so I think there's a chance that MRI and Rubinius produce different results for this statement. I don't have a working Rubinius installation right now, but I will check when I have an opportunity and update this post with the result.
Not sure if this will ever be of any use to anyone, except maybe for satisfying one's curiosity, but still :)

Related

Can someone explain respond_to? :each?

Can someone help me understand the following code?
array = [1,2,3,4];
if array.respond_to? :each
puts "1234"
else
puts "5678"
end
I can understand the result of the code, but what is the syntax of :each?
Is :each a global method? Why can we write it like this? Or how I can find out about it?
:each is a Symbol, which is kind of like a String but more limited, and more efficient in comparisons for equality. It is not a method; it does happen to be a method name.
respond_to? is a method defined on Object, which (almost) all Ruby objects ultimately inherit from.
When you say [1, 2, 3, 4].each, it sends the message :each to the Array object [1, 2, 3, 4]. Instances of Array know what to do when they receive such a message, and thus [1, 2, 3, 4].respond_to?(:each) returns true. Basically, if array.respond_to?(:each) is false, then array.each will normally raise a NoMethodError. [Note that, as p11y notes in comments, if array is really an Array, then this will always return true. But programmers can lie, and array does not have to be an Array; for example: array = "not an Array, fooled you!"]
[1, 2, 3, 4].respond_to? :each is equivalent to [1, 2, 3, 4].respond_to?(:each).
On a side note, semicolons are only ever required in Ruby if you want to put several statements on one line. Unlike in C, for example, where the semicolon is a statement terminator, in Ruby it is a statement separator. It is thus bad style to write array = [1, 2, 3, 4];.
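To make the respond_to? discussion concrete, here is a small sketch. The variable names are illustrative only; the point is that the check is made against a particular object, and the argument is just a Symbol naming the method.

```ruby
array = [1, 2, 3, 4]

# Instances of Array understand the :each message.
instance_responds = array.respond_to?(:each)

# The Array class object itself has no each method; only its instances do.
class_responds = Array.respond_to?(:each)

# A String bound to a misleading variable name does not respond to :each
# (String#each was removed in Ruby 1.9).
impostor = "not an Array, fooled you!"
impostor_responds = impostor.respond_to?(:each)

p instance_responds # => true
p class_responds    # => false
p impostor_responds # => false
```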

Why does this enumerator.to_a returns []?

I just encountered this piece of code:
Enumerator.new((1..100), :take, 5).to_a
# => []
Does anyone know why it returns an empty array and not an array of 5 integers?
From the docs, Enumerator#new:
new(obj, method = :each, *args)
In the second, deprecated, form, a generated Enumerator iterates over
the given object using the given method with the given arguments
passed.
Use of this form is discouraged. Use Kernel#enum_for or Kernel#to_enum
instead.
This second usage (which, according to the documentation, you should not use) needs an each-like method (that is, one that yields values). take returns values but does not yield them, so you get an enumerator that produces no elements.
Note that in Ruby 2 it will be simple to perform a lazy take:
2.0.0dev> xs = (1..100).lazy.take(5)
#=> #<Enumerator::Lazy: #<Enumerator::Lazy: 1..100>:take(5)>
2.0.0dev> xs.to_a
#=> [1, 2, 3, 4, 5]
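The difference between a method that yields and one that merely returns can be checked with Kernel#to_enum, the replacement the documentation recommends. This is a minimal sketch; the variable names are made up for illustration.

```ruby
range = (1..100)

# take returns an array but never yields to a block,
# so an enumerator built on it has nothing to collect.
via_take = range.to_enum(:take, 5).to_a

# each_slice does yield, so it works as the basis of an enumerator.
via_slices = (1..6).to_enum(:each_slice, 2).to_a

# Calling take directly, or lazily, gives the five integers.
direct = range.take(5)
lazily = range.lazy.take(5).to_a

p via_take   # => []
p via_slices # => [[1, 2], [3, 4], [5, 6]]
p direct     # => [1, 2, 3, 4, 5]
p lazily     # => [1, 2, 3, 4, 5]
```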

Index a ruby array to omit an element or range of elements?

I have a ruby array of, say, 10 elements, and I'd like to return all but the 5th element.
a = *(1..10)
I'd like to be able to do something like
a[0..3 + 5..9]
Or even better something like (borrowing from the R syntax),
a[-4]
to do this, but that doesn't work. (Nor does anything more clever I've tried like getting an array of the element indices). What's the best way to do this in ruby?
You can use values_at: http://www.ruby-doc.org/core-1.9.3/Array.html#method-i-values_at
so you may use it as
a.values_at(0..3, 5..9)
No black magic required:
a[0..3] + a[5..9]
Look in the documentation under Array#delete_at
For example, myarray.delete_at(5) results in the array lacking what was element 5 (counting from 0, of course).
Example:
class Array
  def without(n)
    self.delete_at(n)
    return self
  end
end
arr = [1, 2, 3, 4]
p arr.without(1) #=> [1, 3, 4]
However, as a commenter points out, this alters the original array, which may not be what the O.P. wants! If that's an issue, write it like this:
class Array
  def without(n)
    arr2 = Array.new(self)
    arr2.delete_at(n)
    return arr2
  end
end
That returns the desired array (an array lacking the nth element of the original) while leaving the original array untouched.
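If reopening Array feels too invasive, the same non-mutating behavior can be sketched as a plain method. without_index is a hypothetical helper name, not a core method; it is shown next to the values_at approach from above for comparison.

```ruby
# Hypothetical helper: returns a copy of arr lacking the element at index n.
def without_index(arr, n)
  arr.dup.tap { |copy| copy.delete_at(n) }
end

a = (1..10).to_a

all_but_fifth = without_index(a, 4)       # the 5th element sits at index 4
via_values_at = a.values_at(0..3, 5..9)

p all_but_fifth # => [1, 2, 3, 4, 6, 7, 8, 9, 10]
p via_values_at # => [1, 2, 3, 4, 6, 7, 8, 9, 10]
p a.size        # => 10, the original is untouched
```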

Can someone explain a real-world, plain-language use for inject in Ruby?

I'm working on learning Ruby, and came across inject. I am on the cusp of understanding it, but I'm the type of person who needs real-world examples to learn something. The most common examples I come across are people using inject to add up a (1..10) range, which I couldn't care less about. It's an arbitrary example.
What would I use it for in a real program? I'm learning so I can move on to Rails, but I don't have to have a web-centric example. I just need something that has a purpose I can wrap my head around.
Thanks all.
inject can sometimes be better understood by its "other" name, reduce. It's a function that operates on an Enumerable (iterating through it once) and returns a single value.
There are many interesting ways that it can be used, especially when chained with other Enumerable methods, such as map. Often times, it can be a more concise and expressive way of doing something, even if there is another way to do it.
An example like this may seem useless at first:
range.inject {|sum, x| sum += x}
The variable range, however, doesn't have to be a simple explicit range. It could be (for example) a list of values returned from your database. If you ran a database query that returned a list of prices in a shopping cart, you could use .inject to sum them all and get a total.
In the simple case, you can do this in the SQL query itself. In a more difficult case, such as where some items have tax added to them and some don't, something like inject can be more useful:
cart_total = prices.inject(0) { |sum, x| sum + price_with_tax(x) } # start from 0 so the first price gets its tax too
This sort of thing is also particularly useful when the objects in the Enumerable are complex classes that require more detailed processing than a simple numerical value would need, or when the Enumerable contains objects of different types that need to be converted into a common type before processing. Since inject takes a block, you can make the functionality here as complex as you need it to be.
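As a rough illustration of the shopping-cart case, here is a self-contained sketch. The Item struct, the 10% rate, and the body of price_with_tax are all assumptions made up for the example; prices are kept in integer cents to avoid floating-point noise.

```ruby
# Hypothetical cart item: price in cents plus a flag saying whether tax applies.
Item = Struct.new(:price_cents, :taxable)

TAX_RATE_PERCENT = 10 # assumed rate, for illustration only

# Assumed implementation of the price_with_tax helper mentioned above.
def price_with_tax(item)
  return item.price_cents unless item.taxable
  item.price_cents * (100 + TAX_RATE_PERCENT) / 100
end

cart = [
  Item.new(1000, true),  # 1100 with tax
  Item.new(500,  false), # 500, no tax
  Item.new(250,  true)   # 275 with tax
]

# The explicit initial value 0 makes sure every item, including the first,
# passes through price_with_tax.
cart_total = cart.inject(0) { |sum, item| sum + price_with_tax(item) }

p cart_total # => 1875
```

Without the initial value, inject would use the first Item itself as the starting memo, which is rarely what is wanted when the block transforms each element.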
Here are a couple of inject() examples in action:
[1, 2, 3, 4].inject(0) {|memo, num| memo += num; memo} # sums all elements in array
The example iterates over every element of the [1, 2, 3, 4] array and adds the elements to the memo variable (memo is commonly used as the block variable name). This example explicitly returns memo after every iteration, but the return can also be implicit.
[1, 2, 3, 4].inject(0) {|memo, num| memo += num} # also works
inject() is conceptually similar to the following explicit code:
result = 0
[1, 2, 3, 4].each {|num| result += num}
result # result is now 10
inject() is also useful to create arrays and hashes. Here is how to use inject() to convert [['dogs', 4], ['cats', 3], ['dogs', 7]] to {'dogs' => 11, 'cats' => 3}.
[['dogs', 4], ['cats', 3], ['dogs', 7]].inject({'dogs' => 0, 'cats' => 0}) do |memo, (animal, num)|
  memo[animal] += num
  memo
end
Here is a more generalized and elegant solution:
[['dogs', 4], ['cats', 3], ['dogs', 7]].inject(Hash.new(0)) do |memo, (animal, num)|
  memo[animal] += num
  memo
end
Again, inject() is conceptually similar to the following code:
result = Hash.new(0)
[['dogs', 4], ['cats', 3], ['dogs', 7]].each do |animal, num|
  result[animal] += num
end
result # now equals {'dogs' => 11, 'cats' => 3}
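When the accumulator is a mutable object such as a Hash, Enumerable#each_with_object is a closely related alternative that removes the need to return memo from the block. It is not mentioned above, so treat this as a side-by-side sketch rather than a replacement for the inject version.

```ruby
pairs = [['dogs', 4], ['cats', 3], ['dogs', 7]]

# inject: the block's return value becomes the next memo, so memo is returned explicitly.
with_inject = pairs.inject(Hash.new(0)) do |memo, (animal, num)|
  memo[animal] += num
  memo
end

# each_with_object: the same object is passed to every iteration,
# and note that the block arguments come in the opposite order.
with_object = pairs.each_with_object(Hash.new(0)) do |(animal, num), memo|
  memo[animal] += num
end

p with_inject # => {"dogs"=>11, "cats"=>3}
p with_object # => {"dogs"=>11, "cats"=>3}
```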
Instead of a range, imagine you've got a list of sales prices for some item on eBay and you want to know the average price. You can do that by injecting + and then dividing by the length.
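A minimal sketch of that average, with invented sale prices held in cents so the sum stays exact:

```ruby
# Made-up sale prices in cents, for illustration only.
sale_prices_cents = [1999, 2450, 2200, 2151]

# "Injecting +" sums the list; dividing by a Float length avoids integer division.
total_cents   = sale_prices_cents.inject(:+)
average_cents = total_cents / sale_prices_cents.length.to_f

p total_cents   # => 8800
p average_cents # => 2200.0
```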
ActiveRecord scopes are a typical case. If we call scoped on a model, we get an object on which we can chain additional scopes. This lets us use inject to build up a search scope from, say, a params hash:
search_params = params.slice("first_name", "last_name", "city", "zip").
  reject { |k, v| v.blank? }
search_scope = search_params.inject(User.scoped) do |memo, (k, v)|
  case k
  when "first_name"
    memo.first_name(v)
  when "last_name"
    memo.last_name(v)
  when "city"
    memo.city(v)
  when "zip"
    memo.zip(v)
  else
    memo
  end
end
(Note: if NO params are supplied, this brings back the whole table, which might not be what you wanted.)
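The same narrowing pattern can be tried without Rails. In this sketch an array of hashes stands in for the model and select stands in for a scope; the data and parameter values are invented, and this is not how ActiveRecord itself evaluates scopes.

```ruby
# Stand-in for the users table (invented data).
users = [
  { "first_name" => "Ann", "city" => "Oslo",   "zip" => "0150" },
  { "first_name" => "Bob", "city" => "Oslo",   "zip" => "0151" },
  { "first_name" => "Ann", "city" => "Bergen", "zip" => "5003" }
]

# Stand-in for a params hash; blank values are dropped before searching.
params = { "first_name" => "Ann", "city" => "Oslo", "zip" => "" }
search_params = params.reject { |_, v| v.nil? || v.empty? }

# Each remaining parameter narrows the previous result, just as each scope
# narrows the previous relation.
matches = search_params.inject(users) do |memo, (key, value)|
  memo.select { |user| user[key] == value }
end

p matches # => [{"first_name"=>"Ann", "city"=>"Oslo", "zip"=>"0150"}]
```

With an empty search_params hash, inject returns users unchanged, which is the "whole table" situation the note above warns about.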
My favorite explanation for inject, or its synonym reduce, is:
reduce takes in an array and reduces it to a single value. It does this by iterating through a list, keeping and transforming a running total along the way.
I found it in a wonderful article at
http://railspikes.com/2008/8/11/understanding-map-and-reduce

How can I get a lazy array in Ruby?

How can I get a lazy array in Ruby?
In Haskell, I can talk about [1..], which is an infinite list, lazily generated as needed. I can also do things like iterate (+2) 0, which applies whatever function I give it to generate a lazy list. In this case, it would give me all even numbers.
I'm sure I can do such things in Ruby, but can't seem to work out how.
With Ruby 1.9 you can use the Enumerator class. This is an example from the docs:
fib = Enumerator.new { |y|
a = b = 1
loop {
y << a
a, b = b, a + b
}
}
p fib.take(10) #=> [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
Also, this is a nice trick:
Infinity = 1.0/0
range = 5..Infinity
p range.take(10) #=> [5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
This one only works for consecutive values though.
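For non-consecutive values, something close to Haskell's iterate can be sketched on top of Enumerator.new. The iterate method below is a hypothetical helper, not part of Ruby's core library.

```ruby
# Hypothetical helper modelled on Haskell's `iterate`: yields seed, step(seed),
# step(step(seed)), ... without ever terminating on its own.
def iterate(seed, &step)
  Enumerator.new do |yielder|
    value = seed
    loop do
      yielder << value
      value = step.call(value)
    end
  end
end

evens = iterate(0) { |n| n + 2 }
first_five_evens = evens.take(5)

p first_five_evens # => [0, 2, 4, 6, 8]
```

Because take stops the enumerator after five elements, the infinite loop inside the block is never a problem.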
Recently, lazy enumeration was added to Ruby trunk. It shipped in Ruby 2.0 as Enumerable#lazy, which returns an Enumerator::Lazy.
In particular:
a = data.lazy.map(&:split).map(&:reverse)
will not be evaluated immediately.
The result is an instance of Enumerator::Lazy, which can be lazily chained further. To get an actual result, use #to_a or #first(n) (#take is now lazy too, so follow it with #to_a or #force), etc.
If you want more on this topic and my C patch - see my blog post Ruby 2.0 Enumerable::Lazy
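A short sketch of lazy chaining on an infinite range, using the API as it ships in Ruby 2.0 and later. The variable names are illustrative.

```ruby
# Nothing is computed here: select and map on a lazy enumerator only describe the pipeline.
squares_of_evens = (1..Float::INFINITY).lazy.select(&:even?).map { |n| n * n }

# take is lazy as well and returns another Enumerator::Lazy...
still_lazy = squares_of_evens.take(3)

# ...so to_a (or force) is what finally pulls the values through.
first_three = still_lazy.to_a

p still_lazy.class # => Enumerator::Lazy
p first_three      # => [4, 16, 36]
```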
Lazy range (natural numbers):
Inf = 1.0/0.0
(1..Inf).take(3) #=> [1, 2, 3]
Lazy range (even numbers):
(0..Inf).step(2).take(5) #=> [0, 2, 4, 6, 8]
Note, you can also extend Enumerable with some methods to make working with lazy ranges (and so on) more convenient:
module Enumerable
def lazy_select
Enumerator.new do |yielder|
each do |obj|
yielder.yield(obj) if yield(obj)
end
end
end
end
# first 4 even numbers
(1..Inf).lazy_select { |v| v.even? }.take(4)
output:
[2, 4, 6, 8]
More info here:
http://banisterfiend.wordpress.com/2009/10/02/wtf-infinite-ranges-in-ruby/
There are also implementations of lazy_map and lazy_select for the Enumerator class that can be found here:
http://www.michaelharrison.ws/weblog/?p=163
Ruby 2.0.0 introduced a new method, lazy, in the Enumerable module.
You can check how lazy works and how to use it here:
http://www.ruby-doc.org/core-2.0/Enumerator/Lazy.html
https://github.com/yhara/enumerable-lazy
http://shugomaeda.blogspot.in/2012/03/enumerablelazy-and-its-benefits.html
As I already said in my comments, implementing such a thing as lazy arrays wouldn't be sensible.
Using Enumerable instead can work nicely in some situations, but differs from lazy lists in some points: methods like map and filter won't be evaluated lazily (so they won't work on infinite enumerables) and elements that have been calculated once aren't stored, so if you access an element twice, it's calculated twice.
If you want the exact behavior of Haskell's lazy lists in Ruby, there's a lazylist gem which implements lazy lists.
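The point about elements not being stored can be observed directly. In this sketch a counter (an illustrative variable, not anything built in) records how many times the generator block produces a value.

```ruby
computations = 0

# An endless enumerator of natural numbers that counts every value it produces.
naturals = Enumerator.new do |yielder|
  n = 0
  loop do
    computations += 1
    yielder << (n += 1)
  end
end

first_pass  = naturals.take(3)
second_pass = naturals.take(3)

p first_pass   # => [1, 2, 3]
p second_pass  # => [1, 2, 3]
p computations # => 6, the second pass recomputed everything from scratch
```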
This will loop to infinity:
0.step{|i| puts i}
This will loop to infinity twice as fast:
0.step(nil, 2){|i| puts i}
This will go to infinity, only if you want it to (results in an Enumerator).
table_of_3 = 0.step(nil, 3)
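The unbounded sequence only produces values on demand, so it is safe as long as whatever consumes it stops by itself. A small sketch:

```ruby
table_of_3 = 0.step(nil, 3)

# take stops after four elements.
first_four = table_of_3.take(4)

# find stops at the first element for which the block is truthy.
first_above_ten = table_of_3.find { |n| n > 10 }

p first_four      # => [0, 3, 6, 9]
p first_above_ten # => 12
```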
I'm surprised no one has answered this question appropriately yet.
So, recently I found the method Enumerator.produce (available since Ruby 2.7), which in conjunction with .lazy does exactly what you described, but in a Ruby-ish fashion.
Examples
Enumerator.produce(0) do
  _1 + 2
end.lazy
  .map(&:to_r)
  .take(1_000)
  .inject(&:+)
# => (999000/1)
# Endless method definition (Ruby 3.0+); the = must sit on the same line as def.
def fact(n) = Enumerator.produce(1) { _1 + 1 }.lazy.take(n).inject(&:*)
fact 6 # => 720
The right answer has already identified the lazy method, but the example provided was not too useful. I will give a better example of when it is appropriate to use lazy with arrays. As stated, lazy is defined as an instance method of the Enumerable module, and it works on either objects that include Enumerable (e.g. arrays: [].lazy) or enumerators returned by iterators in the Enumerable module (e.g. each_slice: [].each_slice(2).lazy). Note that some of Enumerable's instance methods return simple values like true or false, some return collections like arrays, and some return enumerators if a block is not given.
But for our example, the IO class also has an iterator, each_line, which returns an enumerator when called without a block and thus can be used with lazy. The beautiful thing about returning an enumerator is that it does not actually load the collection it is working on (e.g. a large array) into memory. Rather, it holds a reference to the collection and stores the algorithm (e.g. each_slice(2)) that it will apply to that collection when you process it with something like to_a.
So if you are working with an enumerator, you can attach lazy to it for a potentially huge performance boost. Instead of iterating through the entire collection to match this condition:
file.each_line.select { |line| line.size == 5 }.first(5)
You can invoke lazy:
file.each_line.lazy.select { |line| line.size == 5 }.first(5)
If we're scanning a large text file for the first 5 matches, then once we have found them there is no need to continue reading. Hence the power of lazy with any type of enumerable object.
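The saving can be measured with an in-memory stand-in for the file. This sketch uses StringIO in place of a real File and counts lines with an extra map step; the counters and the sample text are made up for the demonstration, and lines are chomped so that size refers to the visible characters.

```ruby
require "stringio"

# 100 lines alternating between a 5-character and a 2-character word.
text = (["hello\n", "hi\n"] * 50).join

# Eager version: every line is read before first(5) gets a chance to stop.
eager_reads = 0
eager_matches = StringIO.new(text).each_line
  .map { |line| eager_reads += 1; line.chomp }
  .select { |line| line.size == 5 }
  .first(5)

# Lazy version: lines are pulled one at a time and reading stops at the fifth match.
lazy_reads = 0
lazy_matches = StringIO.new(text).each_line.lazy
  .map { |line| lazy_reads += 1; line.chomp }
  .select { |line| line.size == 5 }
  .first(5)

p eager_matches == lazy_matches # => true
p eager_reads                   # => 100
p lazy_reads                    # => 9
```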
Ruby Arrays dynamically expand as needed. You can apply blocks to them to return things like even numbers.
array = []
array.size # => 0
array[0] # => nil
array[9999] # => nil
array << 1
array.size # => 1
array << 2 << 3 << 4
array.size # => 4
array = (0..9).to_a
array.select do |e|
e % 2 == 0
end
# => [0,2,4,6,8]
Does this help?
