What are .each iterator fetch order guarantees? - ruby

I am really baffled by something, as it led to hours of head scratching. I have the following segment of code:
objectA.arrayA.each do |p|
  # do stuff with p
end
I thought this was fine: based on this question, I assumed that because I am using an array, the order would be stable. Unfortunately that was not the case, since the order in which the each iterator returned the elements was not always the same. After hours of looking at other blocks for the issue, swapping the above code for this for loop solved the problem:
for i in 0...objectA.arrayA.length
  # do stuff with objectA.arrayA[i]
end
Does anyone have any idea when the ordering of each is guaranteed?

The docs for Enumerable state
The Enumerable mixin provides collection classes with several
traversal and searching methods, and with the ability to sort. The
class must provide a method each, which yields successive members of
the collection. If Enumerable#max, #min, or #sort is used, the objects
in the collection must also implement a meaningful <=> operator, as
these methods rely on an ordering between members of the collection.
So Array#each must also yield successive members to meet this contract.
If an implementation doesn't enforce this, it would be a bug in the implementation.
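As a quick check of that contract, the sketch below compares the order seen by each with the order seen by an index-based for loop. The helper names collect_with_each and collect_with_index are made up for this illustration.

```ruby
# Hypothetical helpers, named only for this illustration.
def collect_with_each(array)
  seen = []
  array.each { |item| seen << item }   # order as yielded by each
  seen
end

def collect_with_index(array)
  seen = []
  for i in 0...array.length            # order by explicit index
    seen << array[i]
  end
  seen
end

letters = %w[d a c b]
puts collect_with_each(letters) == collect_with_index(letters)   # prints true
```

If the two orders ever differ in real code, it may be worth checking whether the array is being modified or rebuilt between iterations; that is only a guess, since the surrounding code is not shown in the question.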

Related

Why must we call to_a on an enumerator object?

The chaining of each_slice and to_a confuses me. I know that each_slice is a member of Enumerable and therefore can be called on enumerable objects like arrays, and chars does return an array of characters.
I also know that each_slice will slice the array in groups of n elements, which is 2 in the below example. And if a block is not given to each_slice, then it returns an Enumerator object.
'186A08'.chars.each_slice(2).to_a
But why must we call to_a on the enumerator object if each_slice has already grouped the array by n elements? Why doesn't ruby just evaluate what the enumerator object is (which is a collection of n elements)?
The purpose of enumerators is lazy evaluation. When you call each_slice, you get back an enumerator object. This object does not calculate the entire grouped array up front. Instead, it calculates each “slice” as it is needed. This helps save on memory, and also allows you quite a bit of flexibility in your code.
This stack overflow post has a lot of information in it that you’ll find useful:
What is the purpose of the Enumerator class in Ruby
To give you a cut and dry answer to your question “Why must I call to_a when...”, the answer is, it hasn’t. It hasn’t yet looped through the array at all. So far it’s just defined an object that says that when it goes through the array, you’re going to want elements two at a time. You then have the freedom to either force it to do the calculation on all elements in the enumerable (by calling to_a), or you could alternatively use next or each to go through and then stop partway through (maybe calculate only half of them as opposed to calculating all of them and throwing the second half away).
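A minimal sketch of that difference, using the string from the question. The slice_enumerator helper is a hypothetical name introduced here for illustration.

```ruby
# Hypothetical helper: returns the enumerator without consuming it.
def slice_enumerator(text, size)
  text.chars.each_slice(size)   # no block given, so an Enumerator comes back
end

enum = slice_enumerator('186A08', 2)
p enum.class   # Enumerator
p enum.next    # ["1", "8"]  (only the first slice has been produced)
p enum.to_a    # [["1", "8"], ["6", "A"], ["0", "8"]]  (forces every slice)
```

Calling to_a walks the whole collection from the start, regardless of how far next has advanced.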
It’s similar to how the Range class does not build up the list of elements in the range. (1..100000) doesn’t make an array of 100000 numbers, but instead defines an object with a min and max and certain operations can be performed on that. For example (1..100000).cover?(5) doesn’t build a massive array to see if that number is in there, but instead just sees if 5 is greater than or equal to 1 and less than or equal to 100000.
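The Range comparison can be illustrated the same way. This is only a sketch; the variable name big is arbitrary.

```ruby
big = (1..100_000)        # stores only the two endpoints

p big.cover?(5)           # true  (compares 5 against the endpoints)
p big.cover?(100_001)     # false
p big.first(3)            # [1, 2, 3]  (only three elements are produced)
```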
The purpose of this all is performance and flexibility.
It may be worth considering whether your implementation actually needs to make an array up front, or whether you can actually keep your RAM consumption down a bit by iterating over the enumerator. (If your real world scenario is as simple as you described, an enumerator won’t help much, but if the array actually is large, an enumerator could help you a lot).

Bubble Sort method

I am just learning Ruby, and KevinC's response (in this link) makes sense to me with one exception: I don't understand why the code is wrapped in arr.each do |i| ... end around the while loop. That part seems redundant to me, as the while loop is already hitting each of the positions. Can someone explain?
The inner loop finds a bubble and carries it up; if it finds another, lighter bubble, it switches them around and carries the lighter one. So you need several passes through the array to find all the bubbles and carry them to the correct place, since you can't float several bubbles at the same time.
EDIT:
The each is really misused in KevinC's code, since it is not used for its normal purpose: yielding elements of the collection. Instead of arr.each, it would be better to use arr.size.times - as it would be more informative to the reader. Redefining the i within the block is adding insult to injury. While none of this will cause the code to be wrong as such, it is misleading.
The other problem with the code is the fact that it does not provide the early termination condition (swapped in most other answers on that question). In theory, bubble sort could find the array sorted in the first pass; the other size - 1 steps are unnecessary. KevinC's code would still dry-hump the already sorted array, never realising it is done.
As for rewrite into block-less code, it is certainly possible, but you need to understand that blocks syntax is very idiomatic in Ruby, and non-block loops are almost unheard of in Ruby world. While Ruby has for, it is pretty much never used in Ruby. But...
arr.each do |i|
  ...
end
is equivalent to
for i in arr
  ...
end
which is, again, at least for the array case, equivalent to
index = 0
while index < arr.size
  i = arr[index]
  ...
  index += 1
end
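For reference, here is a sketch of a bubble sort that uses times for the outer passes and a swapped flag for early termination, as discussed in this answer. It is not KevinC's code; the method name and the decision to sort a copy rather than the original array are assumptions made for this example.

```ruby
# Illustrative bubble sort; sorts a copy and leaves the input untouched.
def bubble_sort(arr)
  sorted = arr.dup
  (sorted.size - 1).times do            # at most size - 1 passes are needed
    swapped = false
    (0...sorted.size - 1).each do |idx|
      if sorted[idx] > sorted[idx + 1]
        # carry the heavier element one step towards the end
        sorted[idx], sorted[idx + 1] = sorted[idx + 1], sorted[idx]
        swapped = true
      end
    end
    break unless swapped                # a pass with no swaps means we are done
  end
  sorted
end

p bubble_sort([5, 2, 4, 1, 3])   # [1, 2, 3, 4, 5]
```

On an already sorted array the first pass performs no swaps, so the loop stops after a single pass.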

Is It Proper Ruby Style to Extend Built-in Classes?

I realize this question may be too philosophical for StackOverflow, but I'm wondering if subclassing built-in classes to extend their functionality is considered "good" Ruby style.
E.g.
class Grades < Array
  def sum
    sum = 0
    self.each do |num|
      sum += num
    end
    return sum
  end

  def avg
    self.sum/self.length
  end
end
Now Grades objects look like arrays when built, but have the additional sum and avg methods that I want access to. Would it be "better" style not to subclass Array, but to add this functionality to a generic object?
Yes.
In general, everyone "monkey patches" everything in Ruby freely, from classes you wrote, to classes someone more important than you wrote, to library classes.
However, the general computing style guideline is that if your class does everything, it does nothing. Your example accesses each() and length() efficiently, but your new Grades class now exposes every Array method, including ones you might not want called, and including ones that some cretin might someday go and monkey-patch! So if your Grades class were very public (used by your entire program), you might want to consider delegation.
Another guideline—that applies more in some languages than others—is you should never inherit unless you then override a method, to achieve polymorphism. Yet another rule which the entire Ruby community, including me, enjoys breaking freely.
For this case, I would say subclassing isn't really appropriate. A subclass should be a more specific version of its superclass—for example, Fixnum is a specific sort of Integer (it's a small integer stored in a particular way), which is a specific sort of Numeric (only some numbers are integers), which is a specific sort of Object (only objects that represent numbers are numerics). Your Grades class, on the other hand, is exactly equivalent to an Array except that it can calculate a couple more things about itself.
If Grades constrained something about the data it stored—for example, it only allowed you to insert numerics between 0.0 and 1.0 (or integers between 0 and 100, if you'd prefer)—it might make sense to subclass Array. On the other hand, it might also make sense to have Grades subclass Object directly, and keep the actual grades in an Array attribute.
Adding sum and avg, on the other hand, simply adds functionality that would be equally useful for other kinds of arrays, too. For such generic functionality, I would simply add those methods to Array so you don't have to worry about whether you've got a plain Array or a Grades in a particular place.
There are some gray areas here, of course: if you were proposing adding a letter method to convert the grades to an A through F grade letter, I wouldn't be so reluctant to subclass Array as Grades. This is definitely a judgement call. But for this level of genericness, I really don't think subclassing is appropriate.
In Ruby, morals are freer. Permissible is anything the programmer deems such. Actually, monkey patching existing classes is pretty much standard practice. Apart from what the other two answers said, let me bring your attention to the refinements feature (refine) introduced in Ruby 2.0, which allows you to monkey-patch within stricter boundaries, with less fear of undesirable interactions with other code.
But in your particular case, I think that your decision to create a separate class Grades might be a correct one. It's just a gut feeling, I'd have to be familiar with your codebase to say that for sure. It is less important, whether you make your Grades class a subclass of Array, or whether you just give it an attribute #grade_array, in which you will store actual grades and to which you will delegate the methods you want from Array class.
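The delegation option mentioned in the answers could look roughly like the sketch below. It relies on Ruby's standard Forwardable module. The @grade_array name follows the attribute suggested above; delegating only each and length, copying the input array, and returning a Float from avg are assumptions made for this illustration (the original avg uses integer division).

```ruby
require 'forwardable'

# Sketch: Grades wraps an Array instead of inheriting from it.
class Grades
  extend Forwardable

  # Expose only the Array methods we actually want callers to use.
  def_delegators :@grade_array, :each, :length

  def initialize(grades = [])
    @grade_array = grades.dup
  end

  def sum
    @grade_array.inject(0) { |total, grade| total + grade }
  end

  def avg
    sum.to_f / length   # assumption: a Float average is wanted
  end
end

grades = Grades.new([80, 90, 100])
p grades.sum                    # 270
p grades.avg                    # 90.0
p grades.respond_to?(:push)     # false (the rest of Array stays hidden)
```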

Is the .each iterator in ruby guaranteed to give the same order on the same elements every time?

I'm doing something like this with a list 'a':
a.each_with_index do |outer, i|
  a.each_with_index do |inner, j|
    if j > i
      # do some operation with outer and inner
    end
  end
end
If the iterator is not going to use the same order, this won't work. I don't care what the order actually is; I just need the two .each_with_index iterators to use the same order.
I would assume that it would be a property of an array that it has a fixed order and I'm just being paranoid that the iterator wouldn't use that order...
This depends on the specific Enumerable object you are operating on.
Arrays, for example, will always return elements in the same order. But other enumerable objects are not guaranteed to behave this way. A good example of this is the base Hash in Ruby 1.8.7. That is why many frameworks (most notably ActiveSupport) implement an OrderedHash.
One interesting side note: Even Hash will return objects in the same order if the hash has not changed between each calls. While many objects behave this way, relying on this subtlety is probably not a great idea.
So, no. The generic each will not always return objects in the same order.
P.S. Ruby 1.9's hashes are now actually ordered http://www.igvita.com/2009/02/04/ruby-19-internals-ordered-hash
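On Ruby 1.9 or later, the insertion-order behaviour can be seen with a small sketch like this one. The method name insertion_ordered_keys and the keys used are arbitrary choices for the example.

```ruby
# Assumes Ruby 1.9+, where Hash iterates in insertion order.
def insertion_ordered_keys
  h = {}
  h[:zebra] = 1
  h[:apple] = 2
  h[:mango] = 3
  h.keys
end

p insertion_ordered_keys   # [:zebra, :apple, :mango]
```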
I've not looked at your actual code but here is your answer taken from the Ruby API docs:
Arrays are ordered, integer-indexed collections of any object.
So yes, you are being paranoid but surely that's a good thing when you're developing?
Array by definition is an ordered list of elements. So you should have no problems with that.
It depends on the specific Enumerable. Certainly an Array will always iterate in the obvious order.
It would be quite lunatic fringe for someone to implement an each method that would traverse the same collection in different ways, but the only actual restriction for such a "feature" would be in the documentation for the class that mixes in Enumerable. Well, in that and the sanity of the implementors.
I can almost imagine some sort of cryptographic API that deliberately traversed a collection in an unpredictable way.
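For the Array case in the question, the nested loop can be checked directly. The sketch below collects the pairs the question's loop would visit; later_pairs is a hypothetical name used only here.

```ruby
# Collects every (outer, inner) pair where inner comes later in the list.
def later_pairs(list)
  pairs = []
  list.each_with_index do |outer, i|
    list.each_with_index do |inner, j|
      pairs << [outer, inner] if j > i
    end
  end
  pairs
end

p later_pairs(%w[a b c])   # [["a", "b"], ["a", "c"], ["b", "c"]]
```

Because both loops walk the same array by index, each pair of distinct positions is visited exactly once.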

Ruby Loops Question

C++:
for(i=0,j=0;i<0;i++,j++)
What's the equivalent of this in Ruby?
Besides the normal for and while loops seen in C++, can someone name the other special loops Ruby has, such as .times and .each?
Thanks in advance.
If I understand your question (at least the first part of it), you are wondering how you can iterate two separate variables at the same time, such as i and j.
You can do that in Ruby using the for loop, with multiple variables. For instance, if you wanted i to count up from 1 to 10, and j to count from 10 to 20, you could do:
for i, j in (1..10).zip(10..20)
  puts "#{i}, #{j}"
end
zip will produce, from two arrays, a single array of which each element is an array, with the first element taken from the corresponding position in the first array, and the second element taken from the corresponding position in the second array:
> [1, 2, 3].zip([4, 5, 6])
=> [[1, 4], [2, 5], [3, 6]]
And using i, j in your for loop will take i from the first element of each inner array, and j from the second element.
If you'd rather use each than for, you can just use a block with two parameters:
(1..10).zip(10..20).each { |i, j| puts "#{i}, #{j}" }
As to the second part of your question, Ruby doesn't really have a fixed number of different iterators, since most iteration is done by passing a block to a method, and thus any class can define its own methods that allow iterating over its own contents. The most common is each, and any class that defines an each method can mix in the Enumerable module, which gives you a variety of different methods for iterating over elements, selecting elements, filtering, and so on. There are also times, upto, and downto defined on the Integer class; each_key, each_value, and each_pair on Hash; each_byte, each_char, and each_line on String; and so on. Just about any class that defines some sort of collection or sequence has methods for iterating over said collection or sequence.
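As a rough illustration of a class defining each and mixing in Enumerable, consider the sketch below. The Countdown class is invented for this example and is not part of Ruby.

```ruby
# Hypothetical collection class: counts down from a start value to 1.
class Countdown
  include Enumerable          # adds map, select, to_a, ... on top of each

  def initialize(start)
    @start = start
  end

  def each
    return enum_for(:each) unless block_given?
    @start.downto(1) { |n| yield n }
  end
end

countdown = Countdown.new(5)
p countdown.to_a                   # [5, 4, 3, 2, 1]
p countdown.select(&:even?)        # [4, 2]
p countdown.map { |n| n * 10 }     # [50, 40, 30, 20, 10]
```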
Ruby is different to C++. In C++ you use a for loop to loop through anything, but in Ruby you'll find you're usually looping through an enumerable object, so it's more common to do something like:
monkeys.each do |monkey|
  monkey.say 'ow!'
end
Don't try to look for too much equivalence between the two languages - they're built for different things. Obviously there are a lot of equivalent things, but you can't learn Ruby by producing a chart that shows C++ code on one side and the Ruby equivalent on the other. Try to learn the idiomatic way of doing things and you'll find it much easier.
If you want ways of looping through enumerable objects, check out the methods in the Enumerable module: all? any? collect detect each_cons each_slice each_with_index entries enum_cons enum_slice enum_with_index find find_all grep include? inject map max member? min partition reject select sort sort_by to_a to_set zip. With most of these methods you'd use a for loop to do the equivalent thing in C++.
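A few of those methods in action, as a small sketch on an arbitrary array:

```ruby
numbers = [1, 2, 3, 4, 5, 6]

p numbers.each_slice(2).to_a                      # [[1, 2], [3, 4], [5, 6]]
p numbers.each_cons(2).to_a                       # [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]
p numbers.partition(&:even?)                      # [[2, 4, 6], [1, 3, 5]]
p numbers.inject(0) { |sum, n| sum + n }          # 21
p numbers.each_with_index.map { |n, i| n * i }    # [0, 2, 6, 12, 20, 30]
```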
You can do:
(0..j).each do |i|
  puts i
end
I am not terribly familiar with C++, but AFAICS, the equivalent Ruby code to the loop you posted is simply:
i, j = 0, 0
Which shows once again the expressive power Ruby has. Anybody can figure out what this does, even if he has never seen Ruby before, while the equivalent C++ takes quite a while to figure out.

Resources