How many times does Ruby evaluate a loop's enumerated collection? - ruby

In Ruby, if I were to loop over a collection, how many times would Ruby evaluate the enumerated collection?
Specifically, I'd like to sort a collection, and loop over the sorted collection. Since I have no need for keeping a copy of the sorted collection around, I figured I'd just write the loop as:
for item in #items.sort{ |a,b| b.created_at <=> a.created_at } do
#do some stuff
end
However, after crafting that lovely bit of code I began to wonder how many times I might be actually calling sort.
Would the above line indeed only sort the collection once? Or will Ruby end up sorting it N times for each item in the collection?

You're calling sort once.
Except for scoping differences,
for x in xs do
some_stuff
end
is the same as
xs.each do |x|
some_stuff
end
And of course when you do foo.bar(baz), foo is evaluated exactly once, no matter what bar does.

That's equivalent to sorting the whole collection once, and then iterating it once.
Equivalent to:
#items.sort{ |a,b| b.created_at <=> a.created_at }.each do |item|
# do some stuff
end

Even cleaner and much faster:
#items.sort_by {|a| a.created_at}.reverse
You should pretty much always use sort_by instead of sort if you can (and you almost always can), because it evaluates the sort-key function only once per item. And it lets you write half as much comparison code!

Related

Loop method until it returns falsey

I was trying to make my bubble sort shorter and I came up with this
class Array
def bubble_sort!(&block)
block = Proc.new { |a, b| a <=> b } unless block_given?
sorted = each_index.each_cons(2).none? do |i, next_i|
if block.call(self[i], self[next_i]) == 1
self[i], self[next_i] = self[next_i], self[i]
end
end until sorted
self
end
def bubble_sort(&prc)
self.dup.bubble_sort!(&prc)
end
end
I don't particularly like the thing with sorted = --sort code-- until sorted.
I just want to run the each_index.each_cons(s).none? code until it returns true. It's a weird situation that I use until, but the condition is a code I want to run. Any way, my try seems awkward, and ruby usually has a nice concise way of putting things. Is there a better way to do this?
This is just my opinion
have you ever read the ruby source code of each and map to understand what they do?
No, because they have a clear task expressed from the method name and if you test them, they will take an object, some parameters and then return a value to you.
For example if I want to test the String method split()
s = "a new string"
s.split("new")
=> ["a ", " string"]
Do you know if .split() takes a block?
It is one of the core ruby methods, but to call it I don't pass a block 90% of the times, I can understand what it does from the name .split() and from the return value
Focus on the objects you are using, the task the methods should accomplish and their return values.
I read your code and I can not refactor it, I hardly can understand what the code does.
I decided to write down some points, with possibility to follow up:
1) do not use the proc for now, first get the Object Oriented code clean.
2) split bubble_sort! into several methods, each one with a clear task
def ordered_inverted! (bubble_sort!), def invert_values, maybe perform a invert_values until sorted, check if existing methods already perform this sorting functionality
3) write specs for those methods, tdd will push you to keep methods simple and easy to test
4) If those methods do not belong to the Array class, include them in the appropriate class, sometimes overly complicated methods are just performing simple String operations.
5) Reading books about refactoring may actually help more then trying to force the usage of proc and functional programming when not necessary.
After looking into it further I'm fairly sure the best solution is
loop do
break if condition
end
Either that or the way I have it in the question, but I think the loop do version is clearer.
Edit:
Ha, a couple weeks later after I settled for the loop do solution, I stumbled into a better one. You can just use a while or until loop with an empty block like this:
while condition; end
until condition; end
So the bubble sort example in the question can be written like this
class Array
def bubble_sort!(&block)
block = Proc.new { |a, b| a <=> b } unless block_given?
until (each_index.each_cons(2).none? do |i, next_i|
if block.call(self[i], self[next_i]) == 1
self[i], self[next_i] = self[next_i], self[i]
end
end); end
self
end
def bubble_sort(&prc)
self.dup.bubble_sort!(&prc)
end
end

Chaining partition, keep_if etc

[1,2,3].partition.inject(0) do |acc, x|
x>2 # this line is intended to be used by `partition`
acc+=x # this line is intended to be used by `inject`
end
I know that I can write above stanza using different methods but this is not important here.
What I want to ask why somebody want to use partition (or other methods like keep_if, delete_if) at the beginning of the "chain"?
In my example, after I chained inject I couldn't use partition. I can write above stanza using each:
[1,2,3].each.inject(0) do |acc, x|
x>2 # this line is intended to be used by `partition`
acc+=x # this line is intended to be used by `inject`
end
and it will be the same, right?
I know that x>2 will be discarded (and not used) by partition. Only acc+=x will do the job (sum all elements in this case).
I only wrote that to show my "intention": I want to use partition in the chain like this [].partition.inject(0).
I know that above code won't work as I intended and I know that I can chain after block( }.map as mentioned by Neil Slater).
I wanted to know why, and when partition (and other methods like keep_if, delete_if etc) becomes each (just return elements of the array as partition do in the above cases).
In my example, partition.inject, partition became each because partition cannot take condition (x>2).
However partition.with_index (as mentioned by Boris Stitnicky) works (I can partition array and use index for whatever I want):
shuffled_array
.partition
.with_index { |element, index|
element > index
}
ps. This is not question about how to get sum of elements that are bigger than 2.
This is an interesting situation. Looking at your code examples, you are obviously new to Ruby and perhaps also to programming. Yet you managed to ask a very difficult question that basically concerns the Enumerator class, one of the least publicly understood classes, especially since Enumerator::Lazy was introduced. To me, your question is difficult enough that I am not able to provide a comprehensive answer. Yet the remarks about your code would not fit into a comment under the OP. That's why I'm adding this non-answer.
First of all, let us notice a few awful things in your code:
Useless lines. In both blocks, x>2 line is useless, because its return value is discarded.
[1,2,3].partition.inject(0) do |x, acc|
x>2 # <---- return value of this line is never used
acc+=x
end
[1,2,3].each.inject(0) do |x, acc|
x>2 # <---- return value of this line is never used
acc+=x
end
I will ignore this useless line when discussing your code examples further.
Useless #each method. It is useless to write
[1,2,3].each.inject(0) do |x, acc|
acc+=x
end
This is enough:
[1,2,3].inject(0) do |x, acc|
acc+=x
end
Useless use of #partition method. Instead of:
[1,2,3].partition.inject(0) do |x, acc|
acc+=x
end
You can just write this:
[1,2,3].inject(0) do |x, acc|
acc+=x
end
Or, as I would write it, this:
[ 1, 2, 3 ].inject :+
But then, you ask a deep question about using #partition method in the enumerator mode. Having discussed the trivial newbie problems of your code, we are left with the question how exactly the enumerator-returning versions of the #partition, #keep_if etc. should be used, or rather, what are the interesting way of using them, because everyone knows that we can use them for chaining:
array = [ *1..6 ]
shuffled_arrray = array.shuffle # randomly shuffles the array elements
shuffled_array
.partition # partition enumerator comes into play
.with_index { |element, index| # method Enumerator#with_index comes into play
element > index # and partitions elements into those greater
} # than their index, and those smaller
And also like this:
e = partition_enumerator_of_array = array.partition
# And then, we can partition the array in many ways:
e.each &:even? # partitions into odd / even numbers
e.each { rand() > 0.5 } # partitions the array randomly
# etc.
An easily understood advantage is that instead of writing longer:
array.partition &:even?
You can write shorter:
e.each &:even?
But I am basically sure that enumerators provide more power to the programmer than just chaining collection methods and shortening code a little bit. Because different enumerators do very different things. Some, such as #map! or #reject!, can even modify the collection on which they operate. In this case, it is imaginable that one could combine different enumerators with the same block to do different things. This ability to vary not just the blocks, but also the enumerators to which they are passed, gives combinatorial power, which can very likely be used to make some otherwise lengthy code very concise. But I am unable to provide a very useful concrete example of this.
In sum, Enumerator class is here mainly for chaining, and to use chaining, programmers do not really need to undestand Enumerator in detail. But I suspect that the correct habits regarding the use of Enumerator might be as difficult to learn as, for instance, correct habits of parametrized subclassing. I suspect I have not grasped the most powerful ways to use enumerators yet.
I think that the result [3, 3] is what you are looking for here - partitioning the array into smaller and larger numbers then summing each group. You seem to be confused about how you give the block "rules" to the two different methods, and have merged what should be two blocks into one.
If you need the net effects of many methods that each take a block, then you can chain after any block, by adding the .method after the close of the block like this: }.each or end.each
Also note that if you create partitions, you are probably wanting to sum over each partition separately. To do that you will need an extra link in the chain (in this case a map):
[1,2,3].partition {|x| x > 2}.map do |part|
part.inject(0) do |acc, x|
x + acc
end
end
# => [3, 3]
(You also got the accumulator and current value wrong way around in the inject, and there is no need to assign to the accumulator, Ruby does that for you).
The .inject is no longer in a method chain, instead it is inside a block. There is no problem with blocks inside other blocks, in fact you will see this very often in Ruby code.
I have chained .partition and .map in the above example. You could also write the above like this:
[1,2,3].partition do
|x| x > 2
end.map do |part|
part.inject(0) do |acc, x|
x + acc
end
end
. . . although when chaining with short blocks, I personally find it easier to use the { } syntax instead of do end, especially at the start of a chain.
If it all starts to look complex, there is not usually a high cost to assigning the results of the first part of a chain to a local variable, in which case there is no chain at all.
parts = [1,2,3].partition {|x| x > 2}
parts.map do |part|
part.inject(0) do |acc, x|
x + acc
end
end

Is there a comparable Array operation to Java's Iterator.hasNext() in ruby?

I can only find methods that look for specific elements of an array.
During my objects.each |a| loop, I want to know when I'm at the final element so I can have a loop like:
objects.each |a|
if objects.hasNext()
puts a.name + ","
else
puts a.name
Iterator's hasNext() determines if the Array's iterator has another element after the one currently being evaluated.
I want to emphasize that I'm looking to print out these values, not turn them in to an Array. .join is not what I'm looking for here.
No, there isn't. However, note that hasNext is not an Array operation in Java, either. It's an Iterator operation, and Ruby's equivalent to Java's Iterator is Enumerator, not Array.
However, Ruby's Enumerator works a little bit different than Java's: instead of asking whether there is a next element, and then advancing the iterator, you simply try to look at the next element and it throws a StopIteration exception when there are no more elements.
So, the equivalent to Java's
iter.hasNext);
would be roughly
begin
enum.peek
true
rescue StopIteration
false
end
However, you almost never iterate manually in Ruby. Instead, you use higher-level iteration methods such as join, flat_map, group_by, uniq, sort, sort_by, inject, map, each_with_object, each etc.
For example:
%w(pretty ugly stupid).join(', ') # => 'pretty, ugly, stupid'
Is there a comparable Array operation to Java's hasNext() in ruby?
First of all Java's Array doesn't have any hasNext method per se because it wouldn't make any sense. It's the iterator that has it. In Ruby there's no such a thing as a list and the powerful iterator methods (each and the each_* family) would make it pretty useless:
my_array.each do |current|
// operations
// implicit:
// if (current.has_next) current = current.next
// else break
end
So, no there's no such a thing.
I'm using .each |a| to run the loop. I want to print a comma each time through unless it's the last. I want this list (pretty, ugly, stupid) not (pretty, ugly, stupid,). Any thoughts
You should take a look at the .join method.
It is common that, when leaning a new language, people tend to looking for something that familiar with:)
Your specific questions could easily be solved by using each_with_index.
objects.each_with_index do |object ,index|
if index == (object.length -1) then
puts a.name + ","
else
puts a.name
end
end
In the Ruby library iterators are implemented as internal iterators in contrast to Java which implements external iterators. The key difference between the two is that the former are designed to not let the client control the iteration, while the latter leave to the client this responsibility.
The purpose of a method like hasNext is to control iteration directly, thus in Ruby no such thing exists. Methods like peek and next defined by Enumerator are, I guess, not intended to control iteration directly but to implement custom internal iterators.
That said, your problem is easily solved with this code:
puts objects.map(&:name).join(', ')
However sometimes could be useful to concoct your own internal iterator using an Enumerator object:
module Enumerable
def puts_each_with_separator(separator)
enum = each
loop do
print yield(enum.next)
enum.peek rescue break
print separator
end
puts
end
end
objects.puts_each_with_separator(', ', &:name)
I want to emphasize that I'm looking to print out these values, not
turn them in to an Array. .join is not what I'm looking for here.
Actually, I think .join is exactly what you're looking for. The result of .join is a string, not an array, so
puts objects.join(",")
does what you say you want.

Sorting a hash table of objects (by the attributes of the objects) in Ruby

Say I have a class called Person, and it contains things such as last name, first name, address, etc.
I also have a hash table of Person objects that needs to be sorted by last and first name.
I understand that a sort_by will not change the hash permanently, which is fine, I only need to print in that order. Currently, I am trying to sort/print in place using:
#hash.sort_by {|a,b| a <=> b}.each { |person| puts person.last}
I have overloaded the <=> operator to sort by last/first, but nothing appears to actually sort. The puts there simply outputs in the hash's original order. I have spent a good 4 days trying to figure this out (it is a school assignment, and my first Ruby program). Any ideas? I am sure this is easy, but I am having the hardest time bringing my brain out of the C++ way of thinking.
You appear to be confusing sort and sort_by
sort yields two objects from the collection to the block and expects you to return a <=> like value: -1,0 or 1 depending on whether the arguments are equal, ascending or descending, for example
%w(one two three four five).sort {|a,b| a.length <=> b.length}
Sorts the strings by length. This is the form to use if you want to use your <=> operator
sort_by yields one object from the collection at a time and expects you to return what you want to sort by - you shouldn't be doing any comparison here. Ruby then uses <=> on these objecfs to sort your collection. The previous example can e rewritten as
%w(one two three four five).sort_by {|s| s.length}
This is also known as a schwartzian transform
In your case the collection is a hash so things are slightly more complicated: the values that are passed into the block are arrays that contain key/value pairs, so you'll need to extract the person object from that pair. You could also just work on #hash.keys or #hash.values (depending on whether the person objects are keys or values)
If you've overidden the <=> operator to sort Person objects appropriately then you can simply do:
#hash.sort_by{ |key, person| person }
because sort_by will yield both the hash key, and the object (in your case a person) to each iteration of the block. So the code above will sort your hash based on Person objects - for which you've already specified an <=> operator.
When you #.sort_by with a Hash, the parameters passed to the block are 'key','value', not 'element a' and 'element b'. Try:
#hash.sort_by{|key,value| value.last}.each{|key,value| puts value.last}
Also, see Frederick Cheung's excellent explanation of #sort vs #sort_by.

What is the "right" way to iterate through an array in Ruby?

PHP, for all its warts, is pretty good on this count. There's no difference between an array and a hash (maybe I'm naive, but this seems obviously right to me), and to iterate through either you just do
foreach (array/hash as $key => $value)
In Ruby there are a bunch of ways to do this sort of thing:
array.length.times do |i|
end
array.each
array.each_index
for i in array
Hashes make more sense, since I just always use
hash.each do |key, value|
Why can't I do this for arrays? If I want to remember just one method, I guess I can use each_index (since it makes both the index and value available), but it's annoying to have to do array[index] instead of just value.
Oh right, I forgot about array.each_with_index. However, this one sucks because it goes |value, key| and hash.each goes |key, value|! Is this not insane?
This will iterate through all the elements:
array = [1, 2, 3, 4, 5, 6]
array.each { |x| puts x }
# Output:
1
2
3
4
5
6
This will iterate through all the elements giving you the value and the index:
array = ["A", "B", "C"]
array.each_with_index {|val, index| puts "#{val} => #{index}" }
# Output:
A => 0
B => 1
C => 2
I'm not quite sure from your question which one you are looking for.
I think there is no one right way. There are a lot of different ways to iterate, and each has its own niche.
each is sufficient for many usages, since I don't often care about the indexes.
each_ with _index acts like Hash#each - you get the value and the index.
each_index - just the indexes. I don't use this one often. Equivalent to "length.times".
map is another way to iterate, useful when you want to transform one array into another.
select is the iterator to use when you want to choose a subset.
inject is useful for generating sums or products, or collecting a single result.
It may seem like a lot to remember, but don't worry, you can get by without knowing all of them. But as you start to learn and use the different methods, your code will become cleaner and clearer, and you'll be on your way to Ruby mastery.
I'm not saying that Array -> |value,index| and Hash -> |key,value| is not insane (see Horace Loeb's comment), but I am saying that there is a sane way to expect this arrangement.
When I am dealing with arrays, I am focused on the elements in the array (not the index because the index is transitory). The method is each with index, i.e. each+index, or |each,index|, or |value,index|. This is also consistent with the index being viewed as an optional argument, e.g. |value| is equivalent to |value,index=nil| which is consistent with |value,index|.
When I am dealing with hashes, I am often more focused on the keys than the values, and I am usually dealing with keys and values in that order, either key => value or hash[key] = value.
If you want duck-typing, then either explicitly use a defined method as Brent Longborough showed, or an implicit method as maxhawkins showed.
Ruby is all about accommodating the language to suit the programmer, not about the programmer accommodating to suit the language. This is why there are so many ways. There are so many ways to think about something. In Ruby, you choose the closest and the rest of the code usually falls out extremely neatly and concisely.
As for the original question, "What is the “right” way to iterate through an array in Ruby?", well, I think the core way (i.e. without powerful syntactic sugar or object oriented power) is to do:
for index in 0 ... array.size
puts "array[#{index}] = #{array[index].inspect}"
end
But Ruby is all about powerful syntactic sugar and object oriented power, but anyway here is the equivalent for hashes, and the keys can be ordered or not:
for key in hash.keys.sort
puts "hash[#{key.inspect}] = #{hash[key].inspect}"
end
So, my answer is, "The “right” way to iterate through an array in Ruby depends on you (i.e. the programmer or the programming team) and the project.". The better Ruby programmer makes the better choice (of which syntactic power and/or which object oriented approach). The better Ruby programmer continues to look for more ways.
Now, I want to ask another question, "What is the “right” way to iterate through a Range in Ruby backwards?"! (This question is how I came to this page.)
It is nice to do (for the forwards):
(1..10).each{|i| puts "i=#{i}" }
but I don't like to do (for the backwards):
(1..10).to_a.reverse.each{|i| puts "i=#{i}" }
Well, I don't actually mind doing that too much, but when I am teaching going backwards, I want to show my students a nice symmetry (i.e. with minimal difference, e.g. only adding a reverse, or a step -1, but without modifying anything else).
You can do (for symmetry):
(a=*1..10).each{|i| puts "i=#{i}" }
and
(a=*1..10).reverse.each{|i| puts "i=#{i}" }
which I don't like much, but you can't do
(*1..10).each{|i| puts "i=#{i}" }
(*1..10).reverse.each{|i| puts "i=#{i}" }
#
(1..10).step(1){|i| puts "i=#{i}" }
(1..10).step(-1){|i| puts "i=#{i}" }
#
(1..10).each{|i| puts "i=#{i}" }
(10..1).each{|i| puts "i=#{i}" } # I don't want this though. It's dangerous
You could ultimately do
class Range
def each_reverse(&block)
self.to_a.reverse.each(&block)
end
end
but I want to teach pure Ruby rather than object oriented approaches (just yet). I would like to iterate backwards:
without creating an array (consider 0..1000000000)
working for any Range (e.g. Strings, not just Integers)
without using any extra object oriented power (i.e. no class modification)
I believe this is impossible without defining a pred method, which means modifying the Range class to use it. If you can do this please let me know, otherwise confirmation of impossibility would be appreciated though it would be disappointing. Perhaps Ruby 1.9 addresses this.
(Thanks for your time in reading this.)
Use each_with_index when you need both.
ary.each_with_index { |val, idx| # ...
The other answers are just fine, but I wanted to point out one other peripheral thing: Arrays are ordered, whereas Hashes are not in 1.8. (In Ruby 1.9, Hashes are ordered by insertion order of keys.) So it wouldn't make sense prior to 1.9 to iterate over a Hash in the same way/sequence as Arrays, which have always had a definite ordering. I don't know what the default order is for PHP associative arrays (apparently my google fu isn't strong enough to figure that out, either), but I don't know how you can consider regular PHP arrays and PHP associative arrays to be "the same" in this context, since the order for associative arrays seems undefined.
As such, the Ruby way seems more clear and intuitive to me. :)
Here are the four options listed in your question, arranged by freedom of control. You might want to use a different one depending on what you need.
Simply go through values:
array.each
Simply go through indices:
array.each_index
Go through indices + index variable:
for i in array
Control loop count + index variable:
array.length.times do | i |
Trying to do the same thing consistently with arrays and hashes might just be a code smell, but, at the risk of my being branded as a codorous half-monkey-patcher, if you're looking for consistent behaviour, would this do the trick?:
class Hash
def each_pairwise
self.each { | x, y |
yield [x, y]
}
end
end
class Array
def each_pairwise
self.each_with_index { | x, y |
yield [y, x]
}
end
end
["a","b","c"].each_pairwise { |x,y|
puts "#{x} => #{y}"
}
{"a" => "Aardvark","b" => "Bogle","c" => "Catastrophe"}.each_pairwise { |x,y|
puts "#{x} => #{y}"
}
I'd been trying to build a menu (in Camping and Markaby) using a hash.
Each item has 2 elements: a menu label and a URL, so a hash seemed right, but the '/' URL for 'Home' always appeared last (as you'd expect for a hash), so menu items appeared in the wrong order.
Using an array with each_slice does the job:
['Home', '/', 'Page two', 'two', 'Test', 'test'].each_slice(2) do|label,link|
li {a label, :href => link}
end
Adding extra values for each menu item (e.g. like a CSS ID name) just means increasing the slice value. So, like a hash but with groups consisting of any number of items. Perfect.
So this is just to say thanks for inadvertently hinting at a solution!
Obvious, but worth stating: I suggest checking if the length of the array is divisible by the slice value.
If you use the enumerable mixin (as Rails does) you can do something similar to the php snippet listed. Just use the each_slice method and flatten the hash.
require 'enumerator'
['a',1,'b',2].to_a.flatten.each_slice(2) {|x,y| puts "#{x} => #{y}" }
# is equivalent to...
{'a'=>1,'b'=>2}.to_a.flatten.each_slice(2) {|x,y| puts "#{x} => #{y}" }
Less monkey-patching required.
However, this does cause problems when you have a recursive array or a hash with array values. In ruby 1.9 this problem is solved with a parameter to the flatten method that specifies how deep to recurse.
# Ruby 1.8
[1,2,[1,2,3]].flatten
=> [1,2,1,2,3]
# Ruby 1.9
[1,2,[1,2,3]].flatten(0)
=> [1,2,[1,2,3]]
As for the question of whether this is a code smell, I'm not sure. Usually when I have to bend over backwards to iterate over something I step back and realize I'm attacking the problem wrong.
In Ruby 2.1, each_with_index method is removed.
Instead you can use each_index
Example:
a = [ "a", "b", "c" ]
a.each_index {|x| print x, " -- " }
produces:
0 -- 1 -- 2 --
The right way is the one you feel most comfortable with and which does what you want it to do. In programming there is rarely one 'correct' way to do things, more often there are multiple ways to choose.
If you are comfortable with certain way of doings things, do just it, unless it doesn't work - then it is time to find better way.
Using the same method for iterating through both arrays and hashes makes sense, for example to process nested hash-and-array structures often resulting from parsers, from reading JSON files etc..
One clever way that has not yet been mentioned is how it's done in the Ruby Facets library of standard library extensions. From here:
class Array
# Iterate over index and value. The intention of this
# method is to provide polymorphism with Hash.
#
def each_pair #:yield:
each_with_index {|e, i| yield(i,e) }
end
end
There is already Hash#each_pair, an alias of Hash#each. So after this patch, we also have Array#each_pair and can use it interchangeably to iterate through both Hashes and Arrays. This fixes the OP's observed insanity that Array#each_with_index has the block arguments reversed compared to Hash#each. Example usage:
my_array = ['Hello', 'World', '!']
my_array.each_pair { |key, value| pp "#{key}, #{value}" }
# result:
"0, Hello"
"1, World"
"2, !"
my_hash = { '0' => 'Hello', '1' => 'World', '2' => '!' }
my_hash.each_pair { |key, value| pp "#{key}, #{value}" }
# result:
"0, Hello"
"1, World"
"2, !"

Resources