Difference between map and collect in Ruby? - ruby

I have Googled this and got patchy / contradictory opinions - is there actually any difference between doing a map and doing a collect on an array in Ruby/Rails?
The docs don't seem to suggest any, but are there perhaps differences in method or performance?

There's no difference, in fact map is implemented in C as rb_ary_collect and enum_collect (eg. there is a difference between map on an array and on any other enum, but no difference between map and collect).
Why do both map and collect exist in Ruby? The map function has many naming conventions in different languages. Wikipedia provides an overview:
The map function originated in functional programming languages but is today supported (or may be defined) in many procedural, object oriented, and multi-paradigm languages as well: In C++'s Standard Template Library, it is called transform, in C# (3.0)'s LINQ library, it is provided as an extension method called Select. Map is also a frequently used operation in high level languages such as Perl, Python and Ruby; the operation is called map in all three of these languages. A collect alias for map is also provided in Ruby (from Smalltalk) [emphasis mine]. Common Lisp provides a family of map-like functions; the one corresponding to the behavior described here is called mapcar (-car indicating access using the CAR operation).
Ruby provides an alias for programmers from the Smalltalk world to feel more at home.
Why is there a different implementation for arrays and enums? An enum is a generalized iteration structure, which means that there is no way in which Ruby can predict what the next element can be (you can define infinite enums, see Prime for an example). Therefore it must call a function to get each successive element (typically this will be the each method).
Arrays are the most common collection so it is reasonable to optimize their performance. Since Ruby knows a lot about how arrays work it doesn't have to call each but can only use simple pointer manipulation which is significantly faster.
Similar optimizations exist for a number of Array methods like zip or count.

I've been told they are the same.
Actually they are documented in the same place under ruby-doc.org:
http://www.ruby-doc.org/core/classes/Array.html#M000249
ary.collect {|item| block } → new_ary
ary.map {|item| block } → new_ary
ary.collect → an_enumerator
ary.map → an_enumerator
Invokes block once for each element of self.
Creates a new array containing the values returned by the block.
See also Enumerable#collect.
If no block is given, an enumerator is returned instead.
a = [ "a", "b", "c", "d" ]
a.collect {|x| x + "!" } #=> ["a!", "b!", "c!", "d!"]
a #=> ["a", "b", "c", "d"]

The collect and collect! methods are aliases to map and map!, so they can be used interchangeably. Here is an easy way to confirm that:
Array.instance_method(:map) == Array.instance_method(:collect)
=> true

I did a benchmark test to try and answer this question, then found this post so here are my findings (which differ slightly from the other answers)
Here is the benchmark code:
require 'benchmark'
h = { abc: 'hello', 'another_key' => 123, 4567 => 'third' }
a = 1..10
many = 500_000
Benchmark.bm do |b|
GC.start
b.report("hash keys collect") do
many.times do
h.keys.collect(&:to_s)
end
end
GC.start
b.report("hash keys map") do
many.times do
h.keys.map(&:to_s)
end
end
GC.start
b.report("array collect") do
many.times do
a.collect(&:to_s)
end
end
GC.start
b.report("array map") do
many.times do
a.map(&:to_s)
end
end
end
And the results I got were:
user system total real
hash keys collect 0.540000 0.000000 0.540000 ( 0.570994)
hash keys map 0.500000 0.010000 0.510000 ( 0.517126)
array collect 1.670000 0.020000 1.690000 ( 1.731233)
array map 1.680000 0.020000 1.700000 ( 1.744398)
Perhaps an alias isn't free?

Ruby aliases the method Array#map to Array#collect; they can be used interchangeably. (Ruby Monk)
In other words, same source code :
static VALUE
rb_ary_collect(VALUE ary)
{
long i;
VALUE collect;
RETURN_SIZED_ENUMERATOR(ary, 0, 0, ary_enum_length);
collect = rb_ary_new2(RARRAY_LEN(ary));
for (i = 0; i < RARRAY_LEN(ary); i++) {
rb_ary_push(collect, rb_yield(RARRAY_AREF(ary, i)));
}
return collect;
}
http://ruby-doc.org/core-2.2.0/Array.html#method-i-map

#collect is actually an alias for #map. That means the two methods can be used interchangeably, and effect the same behavior.

Related

Array difference by explicitly specified method or block

If I have Arrays a and b, the expression a-b returns an Array with all those elements in a which are not in b. "Not in" means unequality (!=) here.
In my case, both arrays only contain elements of the same type (or, from the ducktyping perspective, only elements which understand a "equality" method f).
Is there an easy way to specify this f as a criterium of equality, in a similar way I can provide my own comparator when doing sort? Currently, I implemented this explicitly :
# Get the difference a-b, based on 'f':
a.select { |ael| b.all? {|bel| ael.f != bel.f} }
This works, but I wonder if there is an easier way.
UPDATE: From the comments to this question, I get the impression, that a concrete example would be appreciated. So, here we go:
class Dummy; end
# Create an Array of Dummy objects.
a = Array.new(99) { Dummy.new }
# Pick some of them at random
b = Array.new(10) { a.sample }
# Now I want to get those elements from a, which are not in b.
diff = a.select { |ael| b.all? {|bel| ael.object_id != bel.object_id} }
Of course in this case, I could also have said ! ael eql? bel, but in my general solution, this is not the case.
The "normal" object equality for e.g. Hashes and set operations on Arrays (such as the - operation) uses the output of the Object#hash method of the contained objects along with the semantics of the a.eql?(b) comparison.
This can be used to to improve performance. Ruby assumes here that two objects are eql? if the return value of their respective hash methods is the same (and consequently, assumes that two objects returning different hash values to not be eql?).
For a normal a - b operation, this can thus be used to first calculate the hash value of each object once and then only compare those values. This is quite fast.
Now, if you have a custom equality, your best bet would be to overwrite the object's hash methods so that they return suitable values for those semantics.
A common approach is to build an array containing all data taking part of the object's identity and getting its hash, e.g.
class MyObject
#...
attr_accessor :foo, :bar
def hash
[self.class, foo, bar].hash
end
end
In your object's hash method, you would than include all data that is currently considered by your f comparison method. Instead of actually using f then, you are using the default semantics of all Ruby objects and again can achieve quick set operations with your objects.
If however this is not feasible (e.g. because you need different equality semantics based on use-case), you could emulate what ruby does on your own.
With your f method, you could then perform your set operation as follows:
def f_difference(a, b)
a_map = a.each_with_object({}) do |a_el, hash|
hash[a_el.f] = a_el
end
b.each do |b_el|
a_map.delete b_el.f
end
a_map.values
end
With this approach, you only need to calculate the f value of each of your objects once. We first build a hash map with all f values and elements from a and remove the matching elements from b according to their f values. The remaining values are the result.
This approach saves you from having to loop over b for each object in a which can be slow of you have a lot of objects. If however you only have a few objects on each of your arrays, your original approach should already be fine.
Let's have a look at a benchmark whee I use the standard hash method in place of your custom f to have a comparable result.
require 'benchmark/ips'
def question_diff(a, b)
a.select { |ael| b.all? {|bel| ael.hash != bel.hash} }
end
def answer_diff(a, b)
a_map = a.each_with_object({}) do |a_el, hash|
hash[a_el.hash] = a_el
end
b.each do |b_el|
a_map.delete b_el.hash
end
a_map.values
end
A = Array.new(100) { rand(10_000) }
B = Array.new(10) { A.sample }
Benchmark.ips do |x|
x.report("question") { question_diff(A, B) }
x.report("answer") { answer_diff(A, B) }
x.compare!
end
With Ruby 2.7.1, I get the following result on my machine, showing that the original approach from the question is about 5.9 times slower than the optimized version from my answer:
Warming up --------------------------------------
question 1.304k i/100ms
answer 7.504k i/100ms
Calculating -------------------------------------
question 12.779k (± 2.0%) i/s - 63.896k in 5.002006s
answer 74.898k (± 3.3%) i/s - 375.200k in 5.015239s
Comparison:
answer: 74898.0 i/s
question: 12779.3 i/s - 5.86x (± 0.00) slower

Is order of a Ruby hash literal guaranteed?

Ruby, since v1.9, supports a deterministic order when looping through a hash; entries added first will be returned first.
Does this apply to literals, i.e. will { a: 1, b: 2 } always yield a before b?
I did a quick experiment with Ruby 2.1 (MRI) and it was in fact consistent, but to what extent is this guaranteed by the language to work on all Ruby implementations?
There are couple of locations where this could be specified, i.e. a couple of things that are considered "The Ruby Language Specification":
the ISO Ruby Language Specification
the RubySpec project
the YARV testsuite
The Ruby Programming Language book by matz and David Flanagan
The ISO spec doesn't say anything about Hash ordering: it was written in such a way that all existing Ruby implementations are automatically compliant with it, without having to change, i.e. it was written to be descriptive of current Ruby implementations, not prescriptive. At the time the spec was written, those implementations included MRI, YARV, Rubinius, JRuby, IronRuby, MagLev, MacRuby, XRuby, Ruby.NET, Cardinal, tinyrb, RubyGoLightly, SmallRuby, BlueRuby, and others. Of particular interest are MRI (which only implements 1.8) and YARV (which only implements 1.9 (at the time)), which means that the spec can only specify behavior which is common to 1.8 and 1.9, which Hash ordering is not.
The RubySpec project was abandoned by its developers out of frustration that the ruby-core developers and YARV developers never recognized it. It does, however, (implicitly) specify that Hash literals are ordered left-to-right:
new_hash(1 => 2, 4 => 8, 2 => 4).keys.should == [1, 4, 2]
That's the spec for Hash#keys, however, the other specs test that Hash#values has the same order as Hash#keys, Hash#each_value and Hash#each_key has the same order as those, and Hash#each_pair and Hash#each have the same order as well.
I couldn't find anything in the YARV testsuite that specifies that ordering is preserved. In fact, I couldn't find anything at all about ordering in that testsuite, quite the opposite: the tests go to great length to avoid depending on ordering!
The Flanagan/matz book kinda-sorta implicitly specifies Hash literal ordering in section 9.5.3.6 Hash iterators. First, it uses much the same formulation as the docs:
In Ruby 1.9, however, hash elements are iterated in their insertion order, […]
But then it goes on:
[…], and that is the order shown in the following examples:
And in those examples, it actually uses a literal:
h = { :a=>1, :b=>2, :c=>3 }
# The each() iterator iterates [key,value] pairs
h.each {|pair| print pair } # Prints "[:a, 1][:b, 2][:c, 3]"
# It also works with two block arguments
h.each do |key, value|
print "#{key}:#{value} " # Prints "a:1 b:2 c:3"
end
# Iterate over keys or values or both
h.each_key {|k| print k } # Prints "abc"
h.each_value {|v| print v } # Prints "123"
h.each_pair {|k,v| print k,v } # Prints "a1b2c3". Like each
In his comment, #mu is too short mentioned that
h = { a: 1, b: 2 } is the same as h = { }; h[:a] = 1; h[:b] = 2
and in another comment that
nothing else would make any sense
Unfortunately, that is not true:
module HashASETWithLogging
def []=(key, value)
puts "[]= was called with [#{key.inspect}] = #{value.inspect}"
super
end
end
class Hash
prepend HashASETWithLogging
end
h = { a: 1, b: 2 }
# prints nothing
h = { }; h[:a] = 1; h[:b] = 2
# []= was called with [:a] = 1
# []= was called with [:b] = 2
So, depending on how you interpret that line from the book and depending on how "specification-ish" you judge that book, yes, ordering of literals is guaranteed.
From the documentation:
Hashes enumerate their values in the order that the corresponding keys
were inserted.

What is prefered way to loop in Ruby?

Why is each loop preferred over for loop in Ruby? Is there a difference in time complexity or are they just syntactically different?
Yes, these are two different ways of iterating over, But hope this calculation helps.
require 'benchmark'
a = Array( 1..100000000 )
sum = 0
Benchmark.realtime {
a.each { |x| sum += x }
}
This takes 5.866932 sec
a = Array( 1..100000000 )
sum = 0
Benchmark.realtime {
for x in a
sum += x
end
}
This takes 6.146521 sec.
Though its not a right way to do the benchmarking, there are some other constraints too. But on a single machine, each seems to be a bit faster than for.
The variable referencing an item in iteration is temporary and does not have significance outside of the iteration. It is better if it is hidden from outside of the iteration. With external iterators, such variable is located outside of the iteration block. In the following, e is useful only within do ... end, but is separated from the block, and written outside of it; it does not look easy to a programmer:
for e in [:foo, :bar] do
...
end
With internal iterators, the block variable is defined right inside the block, where it is used. It is easier to read:
[:foo, :bar].each do |e|
...
end
This visibility issue is not just for a programmer. With respect to visibility in the sense of scope, the variable for an external iterator is accessible outside of the iteration:
for e in [:foo] do; end
e # => :foo
whereas in internal iterator, a block variable is invisible from outside:
[:foo].each do |e|; end
e # => undefined local variable or method `e'
The latter is better from the point of view of encapsulation.
When you want to nest the loops, the order of variables would be somewhat backwards with external iterators:
for a in [[:foo, :bar]] do
for e in a do
...
end
end
but with internal iterators, the order is more straightforward:
[[:foo, :bar]].each do |a|
a.each do |e|
...
end
end
With external iterators, you can only use hard-coded Ruby syntax, and you also have to remember the matching between the keyword and the method that is internally called (for calls each), but for internal iterators, you can define your own, which gives flexibility.
each is the Ruby Way. Implements the Iterator Pattern that has decoupling benefits.
Check also this: "for" vs "each" in Ruby
An interesting question. There are several ways of looping in Ruby. I have noted that there is a design principle in Ruby, that when there are multiple ways of doing the same, there are usually subtle differences between them, and each case has its own unique use, its own problem that it solves. So in the end you end up needing to be able to write (and not just to read) all of them.
As for the question about for loop, this is similar to my earlier question whethe for loop is a trap.
Basically there are 2 main explicit ways of looping, one is by iterators (or, more generally, blocks), such as
[1, 2, 3].each { |e| puts e * 10 }
[1, 2, 3].map { |e| e * 10 )
# etc., see Array and Enumerable documentation for more iterator methods.
Connected to this way of iterating is the class Enumerator, which you should strive to understand.
The other way is Pascal-ish looping by while, until and for loops.
for y in [1, 2, 3]
puts y
end
x = 0
while x < 3
puts x; x += 1
end
# same for until loop
Like if and unless, while and until have their tail form, such as
a = 'alligator'
a.chop! until a.chars.last == 'g'
#=> 'allig'
The third very important way of looping is implicit looping, or looping by recursion. Ruby is extremely malleable, all classes are modifiable, hooks can be set up for various events, and this can be exploited to produce most unusual ways of looping. The possibilities are so endless that I don't even know where to start talking about them. Perhaps a good place is the blog by Yusuke Endoh, a well known artist working with Ruby code as his artistic material of choice.
To demonstrate what I mean, consider this loop
class Object
def method_missing sym
s = sym.to_s
if s.chars.last == 'g' then s else eval s.chop end
end
end
alligator
#=> "allig"
Aside of readability issues, the for loop iterates in the Ruby land whereas each does it from native code, so in principle each should be more efficient when iterating all elements in an array.
Loop with each:
arr.each {|x| puts x}
Loop with for:
for i in 0..arr.length
puts arr[i]
end
In the each case we are just passing a code block to a method implemented in the machine's native code (fast code), whereas in the for case, all code must be interpreted and run taking into account all the complexity of the Ruby language.
However for is more flexible and lets you iterate in more complex ways than each does, for example, iterating with a given step.
EDIT
I didn't come across that you can step over a range by using the step() method before calling each(), so the flexibility I claimed for the for loop is actually unjustified.

Monad equivalent in Ruby

What would an equivalent construct of a monad be in Ruby?
The precise technical definition: A monad, in Ruby, would be any class with bind and self.unit methods defined such that for all instances m:
m.class.unit[a].bind[f] == f[a]
m.bind[m.class.unit] == m
m.bind[f].bind[g] == m.bind[lambda {|x| f[x].bind[g]}]
Some practical examples
A very simple example of a monad is the lazy Identity monad, which emulates lazy semantics in Ruby (a strict language):
class Id
def initialize(lam)
#v = lam
end
def force
#v[]
end
def self.unit
lambda {|x| Id.new(lambda { x })}
end
def bind
x = self
lambda {|f| Id.new(lambda { f[x.force] })}
end
end
Using this, you can chain procs together in a lazy manner. For example, in the following, x is a container "containing" 40, but the computation is not performed until the second line, evidenced by the fact that the puts statement doesn't output anything until force is called:
x = Id.new(lambda {20}).bind[lambda {|x| puts x; Id.unit[x * 2]}]
x.force
A somewhat similar, less abstract example would be a monad for getting values out of a database. Let's presume that we have a class Query with a run(c) method that takes a database connection c, and a constructor of Query objects that takes, say, an SQL string. So DatabaseValue represents a value that's coming from the database. DatabaseValue is a monad:
class DatabaseValue
def initialize(lam)
#cont = lam
end
def self.fromQuery(q)
DatabaseValue.new(lambda {|c| q.run(c) })
end
def run(c)
#cont[c]
end
def self.unit
lambda {|x| DatabaseValue.new(lambda {|c| x })}
end
def bind
x = self
lambda {|f| DatabaseValue.new(lambda {|c| f[x.run(c)].run(c) })}
end
end
This would let you chain database calls through a single connection, like so:
q = unit["John"].bind[lambda {|n|
fromQuery(Query.new("select dep_id from emp where name = #{n}")).
bind[lambda {|id|
fromQuery(Query.new("select name from dep where id = #{id}"))}].
bind[lambda { |name| unit[doSomethingWithDeptName(name)] }]
begin
c = openDbConnection
someResult = q.run(c)
rescue
puts "Error #{$!}"
ensure
c.close
end
OK, so why on earth would you do that? Because there are extremely useful functions that can be written once for all monads. So code that you would normally write over and over can be reused for any monad once you simply implement unit and bind. For example, we can define a Monad mixin that endows all such classes with some useful methods:
module Monad
I = lambda {|x| x }
# Structure-preserving transform that applies the given function
# across the monad environment.
def map
lambda {|f| bind[lambda {|x| self.class.unit[f[x]] }]}
end
# Joins a monad environment containing another into one environment.
def flatten
bind[I]
end
# Applies a function internally in the monad.
def ap
lambda {|x| liftM2[I,x] }
end
# Binds a binary function across two environments.
def liftM2
lambda {|f, m|
bind[lambda {|x1|
m.bind[lambda {|x2|
self.class.unit[f[x1,x2]]
}]
}]
}
end
end
And this in turn lets us do even more useful things, like define this function:
# An internal array iterator [m a] => m [a]
def sequence(m)
snoc = lambda {|xs, x| xs + [x]}
lambda {|ms| ms.inject(m.unit[[]], &(lambda {|x, xs| x.liftM2[snoc, xs] }))}
end
The sequence method takes a class that mixes in Monad, and returns a function that takes an array of monadic values and turns it into a monadic value containing an array. They could be Id values (turning an array of Identities into an Identity containing an array), or DatabaseValue objects (turning an array of queries into a query that returns an array), or functions (turning an array of functions into a function that returns an array), or arrays (turning an array of arrays inside-out), or parsers, continuations, state machines, or anything else that could possibly mix in the Monad module (which, as it turns out, is true for almost all data structures).
To add my two cents, I'd say that hzap has misunderstood the concept of monads.
It's not only a « type interface » or a « structure providing some specific functions », it's muck more than that.
It's an abstract structure providing operations (bind (>>=) and unit (return)) which follow, as Ken and Apocalisp said, strict rules.
If you're interested by monads and want to know more about them than the few things said in these answers, I strongly advise you to read : Monads for functional programming (pdf), by Wadler.
See ya!
PS: I see I don't directly answer your question, but Apocalisp already did, and I think (at least hope) that my precisions were worth it
Monads are not language constructs. They're just types that implement a particular interface, and since Ruby is dynamically typed, any class that implement something like collect in arrays, a join method (like flatten but only flattens one level), and a constructor that can wrap anything, is a monad.
Following on the above answers:
You may be interested in checking out Rumonade, a ruby gem which implements a Monad mix-in for Ruby.
Romande is implemented as mix-in, so it expect its host class to implement the methods self.unit and #bind (and optionally, self.empty), and will do the rest to make things work for you.
You can use it to map over Option, as you are used to in Scala, and you can even get some nice multiple-failure return values from validations, a la Scalaz's Validation class.

What is the "right" way to iterate through an array in Ruby?

PHP, for all its warts, is pretty good on this count. There's no difference between an array and a hash (maybe I'm naive, but this seems obviously right to me), and to iterate through either you just do
foreach (array/hash as $key => $value)
In Ruby there are a bunch of ways to do this sort of thing:
array.length.times do |i|
end
array.each
array.each_index
for i in array
Hashes make more sense, since I just always use
hash.each do |key, value|
Why can't I do this for arrays? If I want to remember just one method, I guess I can use each_index (since it makes both the index and value available), but it's annoying to have to do array[index] instead of just value.
Oh right, I forgot about array.each_with_index. However, this one sucks because it goes |value, key| and hash.each goes |key, value|! Is this not insane?
This will iterate through all the elements:
array = [1, 2, 3, 4, 5, 6]
array.each { |x| puts x }
# Output:
1
2
3
4
5
6
This will iterate through all the elements giving you the value and the index:
array = ["A", "B", "C"]
array.each_with_index {|val, index| puts "#{val} => #{index}" }
# Output:
A => 0
B => 1
C => 2
I'm not quite sure from your question which one you are looking for.
I think there is no one right way. There are a lot of different ways to iterate, and each has its own niche.
each is sufficient for many usages, since I don't often care about the indexes.
each_ with _index acts like Hash#each - you get the value and the index.
each_index - just the indexes. I don't use this one often. Equivalent to "length.times".
map is another way to iterate, useful when you want to transform one array into another.
select is the iterator to use when you want to choose a subset.
inject is useful for generating sums or products, or collecting a single result.
It may seem like a lot to remember, but don't worry, you can get by without knowing all of them. But as you start to learn and use the different methods, your code will become cleaner and clearer, and you'll be on your way to Ruby mastery.
I'm not saying that Array -> |value,index| and Hash -> |key,value| is not insane (see Horace Loeb's comment), but I am saying that there is a sane way to expect this arrangement.
When I am dealing with arrays, I am focused on the elements in the array (not the index because the index is transitory). The method is each with index, i.e. each+index, or |each,index|, or |value,index|. This is also consistent with the index being viewed as an optional argument, e.g. |value| is equivalent to |value,index=nil| which is consistent with |value,index|.
When I am dealing with hashes, I am often more focused on the keys than the values, and I am usually dealing with keys and values in that order, either key => value or hash[key] = value.
If you want duck-typing, then either explicitly use a defined method as Brent Longborough showed, or an implicit method as maxhawkins showed.
Ruby is all about accommodating the language to suit the programmer, not about the programmer accommodating to suit the language. This is why there are so many ways. There are so many ways to think about something. In Ruby, you choose the closest and the rest of the code usually falls out extremely neatly and concisely.
As for the original question, "What is the “right” way to iterate through an array in Ruby?", well, I think the core way (i.e. without powerful syntactic sugar or object oriented power) is to do:
for index in 0 ... array.size
puts "array[#{index}] = #{array[index].inspect}"
end
But Ruby is all about powerful syntactic sugar and object oriented power, but anyway here is the equivalent for hashes, and the keys can be ordered or not:
for key in hash.keys.sort
puts "hash[#{key.inspect}] = #{hash[key].inspect}"
end
So, my answer is, "The “right” way to iterate through an array in Ruby depends on you (i.e. the programmer or the programming team) and the project.". The better Ruby programmer makes the better choice (of which syntactic power and/or which object oriented approach). The better Ruby programmer continues to look for more ways.
Now, I want to ask another question, "What is the “right” way to iterate through a Range in Ruby backwards?"! (This question is how I came to this page.)
It is nice to do (for the forwards):
(1..10).each{|i| puts "i=#{i}" }
but I don't like to do (for the backwards):
(1..10).to_a.reverse.each{|i| puts "i=#{i}" }
Well, I don't actually mind doing that too much, but when I am teaching going backwards, I want to show my students a nice symmetry (i.e. with minimal difference, e.g. only adding a reverse, or a step -1, but without modifying anything else).
You can do (for symmetry):
(a=*1..10).each{|i| puts "i=#{i}" }
and
(a=*1..10).reverse.each{|i| puts "i=#{i}" }
which I don't like much, but you can't do
(*1..10).each{|i| puts "i=#{i}" }
(*1..10).reverse.each{|i| puts "i=#{i}" }
#
(1..10).step(1){|i| puts "i=#{i}" }
(1..10).step(-1){|i| puts "i=#{i}" }
#
(1..10).each{|i| puts "i=#{i}" }
(10..1).each{|i| puts "i=#{i}" } # I don't want this though. It's dangerous
You could ultimately do
class Range
def each_reverse(&block)
self.to_a.reverse.each(&block)
end
end
but I want to teach pure Ruby rather than object oriented approaches (just yet). I would like to iterate backwards:
without creating an array (consider 0..1000000000)
working for any Range (e.g. Strings, not just Integers)
without using any extra object oriented power (i.e. no class modification)
I believe this is impossible without defining a pred method, which means modifying the Range class to use it. If you can do this please let me know, otherwise confirmation of impossibility would be appreciated though it would be disappointing. Perhaps Ruby 1.9 addresses this.
(Thanks for your time in reading this.)
Use each_with_index when you need both.
ary.each_with_index { |val, idx| # ...
The other answers are just fine, but I wanted to point out one other peripheral thing: Arrays are ordered, whereas Hashes are not in 1.8. (In Ruby 1.9, Hashes are ordered by insertion order of keys.) So it wouldn't make sense prior to 1.9 to iterate over a Hash in the same way/sequence as Arrays, which have always had a definite ordering. I don't know what the default order is for PHP associative arrays (apparently my google fu isn't strong enough to figure that out, either), but I don't know how you can consider regular PHP arrays and PHP associative arrays to be "the same" in this context, since the order for associative arrays seems undefined.
As such, the Ruby way seems more clear and intuitive to me. :)
Here are the four options listed in your question, arranged by freedom of control. You might want to use a different one depending on what you need.
Simply go through values:
array.each
Simply go through indices:
array.each_index
Go through indices + index variable:
for i in array
Control loop count + index variable:
array.length.times do | i |
Trying to do the same thing consistently with arrays and hashes might just be a code smell, but, at the risk of my being branded as a codorous half-monkey-patcher, if you're looking for consistent behaviour, would this do the trick?:
class Hash
def each_pairwise
self.each { | x, y |
yield [x, y]
}
end
end
class Array
def each_pairwise
self.each_with_index { | x, y |
yield [y, x]
}
end
end
["a","b","c"].each_pairwise { |x,y|
puts "#{x} => #{y}"
}
{"a" => "Aardvark","b" => "Bogle","c" => "Catastrophe"}.each_pairwise { |x,y|
puts "#{x} => #{y}"
}
I'd been trying to build a menu (in Camping and Markaby) using a hash.
Each item has 2 elements: a menu label and a URL, so a hash seemed right, but the '/' URL for 'Home' always appeared last (as you'd expect for a hash), so menu items appeared in the wrong order.
Using an array with each_slice does the job:
['Home', '/', 'Page two', 'two', 'Test', 'test'].each_slice(2) do|label,link|
li {a label, :href => link}
end
Adding extra values for each menu item (e.g. like a CSS ID name) just means increasing the slice value. So, like a hash but with groups consisting of any number of items. Perfect.
So this is just to say thanks for inadvertently hinting at a solution!
Obvious, but worth stating: I suggest checking if the length of the array is divisible by the slice value.
If you use the enumerable mixin (as Rails does) you can do something similar to the php snippet listed. Just use the each_slice method and flatten the hash.
require 'enumerator'
['a',1,'b',2].to_a.flatten.each_slice(2) {|x,y| puts "#{x} => #{y}" }
# is equivalent to...
{'a'=>1,'b'=>2}.to_a.flatten.each_slice(2) {|x,y| puts "#{x} => #{y}" }
Less monkey-patching required.
However, this does cause problems when you have a recursive array or a hash with array values. In ruby 1.9 this problem is solved with a parameter to the flatten method that specifies how deep to recurse.
# Ruby 1.8
[1,2,[1,2,3]].flatten
=> [1,2,1,2,3]
# Ruby 1.9
[1,2,[1,2,3]].flatten(0)
=> [1,2,[1,2,3]]
As for the question of whether this is a code smell, I'm not sure. Usually when I have to bend over backwards to iterate over something I step back and realize I'm attacking the problem wrong.
In Ruby 2.1, each_with_index method is removed.
Instead you can use each_index
Example:
a = [ "a", "b", "c" ]
a.each_index {|x| print x, " -- " }
produces:
0 -- 1 -- 2 --
The right way is the one you feel most comfortable with and which does what you want it to do. In programming there is rarely one 'correct' way to do things, more often there are multiple ways to choose.
If you are comfortable with certain way of doings things, do just it, unless it doesn't work - then it is time to find better way.
Using the same method for iterating through both arrays and hashes makes sense, for example to process nested hash-and-array structures often resulting from parsers, from reading JSON files etc..
One clever way that has not yet been mentioned is how it's done in the Ruby Facets library of standard library extensions. From here:
class Array
# Iterate over index and value. The intention of this
# method is to provide polymorphism with Hash.
#
def each_pair #:yield:
each_with_index {|e, i| yield(i,e) }
end
end
There is already Hash#each_pair, an alias of Hash#each. So after this patch, we also have Array#each_pair and can use it interchangeably to iterate through both Hashes and Arrays. This fixes the OP's observed insanity that Array#each_with_index has the block arguments reversed compared to Hash#each. Example usage:
my_array = ['Hello', 'World', '!']
my_array.each_pair { |key, value| pp "#{key}, #{value}" }
# result:
"0, Hello"
"1, World"
"2, !"
my_hash = { '0' => 'Hello', '1' => 'World', '2' => '!' }
my_hash.each_pair { |key, value| pp "#{key}, #{value}" }
# result:
"0, Hello"
"1, World"
"2, !"

Resources