Ruby for loop a trap? - ruby

In a discussion of Ruby loops, Niklas B. recently talked about the for loop 'not introducing a new scope', as compared to the each loop. I'd like to see some examples of how one runs into this in practice.
O.K., I'll expand the question: where else in Ruby do we see what appear to be do/end block delimiters, but there is actually no new scope inside? Anything else apart from for ... do ... end?
O.K., one more expansion of the question: is there a way to write a for loop with curly braces { block }?

Let's illustrate the point by an example:
results = []
(1..3).each do |i|
results << lambda { i }
end
p results.map(&:call) # => [1,2,3]
Cool, this is what was expected. Now check the following:
results = []
for i in 1..3
results << lambda { i }
end
p results.map(&:call) # => [3,3,3]
Huh, what's going on? Believe me, these kinds of bugs are nasty to track down. Python or JS developers will know what I mean :)
That alone is a reason for me to avoid these loops like the plague, although there are more good arguments in favor of this position. As Ben pointed out correctly, using the proper method from Enumerable almost always leads to better code than using plain old, imperative for loops or the fancier Enumerable#each. For instance, the above example could also be concisely written as
lambdas = 1.upto(3).map { |i| lambda { i } }
p lambdas.map(&:call)
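One way to see why the for version prints [3,3,3] is to reassign i after the loop. The following is a minimal sketch (the names for_lambdas and each_lambdas are only illustrative): every lambda built inside for closes over the same variable, whereas each hands every iteration its own block parameter.

```ruby
# All lambdas created in a for loop close over the SAME variable i.
for_lambdas = []
for i in 1..3
  for_lambdas << lambda { i }
end
p for_lambdas.map(&:call) # => [3, 3, 3]

# i is still visible after the loop; reassigning it changes what every lambda returns.
i = 42
p for_lambdas.map(&:call) # => [42, 42, 42]

# With each, every iteration gets its own fresh block parameter j.
each_lambdas = []
(1..3).each { |j| each_lambdas << lambda { j } }
p each_lambdas.map(&:call) # => [1, 2, 3]
```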
I expand the question: Where else in Ruby do we see what appear to be do/end block delimiters, but there is actually no new scope inside? Anything else apart from for ... do ... end?
Every single one of the looping constructs can be used that way:
while true do
#...
end
until false do
# ...
end
On the other hand, we can write every one of these without the do (which is obviously preferable):
for i in 1..3
end
while true
end
until false
end
One more expansion of the question: is there a way to write a for loop with curly braces { block }?
No, there is not. Also note that the term "block" has a special meaning in Ruby.

First, I'll explain why you wouldn't want to use for, and then explain why you might.
The main reason you wouldn't want to use for is that it's un-idiomatic. If you use each, you can easily replace that each with a map or a find or an each_with_index without a major change of your code. But there's no for_map or for_find or for_with_index.
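As a small illustration of that point (the sample data and variable names here are made up), the same block-based shape carries over from each to the other Enumerable methods with almost no change:

```ruby
words = %w[north south east]

# Start with each...
words.each { |w| puts w }

# ...then swap in another Enumerable method without restructuring the code.
upcased = words.map { |w| w.upcase }              # transform every element
first_s = words.find { |w| w.start_with?("s") }   # first element that matches
words.each_with_index { |w, i| puts "#{i}: #{w}" } # element plus its index

p upcased # => ["NORTH", "SOUTH", "EAST"]
p first_s # => "south"
```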
Another reason is that if you create a variable within a block passed to each, and it hasn't been created beforehand, it'll only stay in existence for as long as that block runs. Getting rid of variables once you have no use for them is a good thing.
Now I'll mention why you might want to use for. each invokes a block (a closure) on every iteration, and if the loop runs enough times, that overhead can cause performance problems. In https://stackoverflow.com/a/10325493/38765 , I posted that using a block rather than a while loop made it slower.
RUN_COUNT = 10_000_000
FIRST_STRING = "Woooooha"
SECOND_STRING = "Woooooha"
def times_double_equal_sign
RUN_COUNT.times do |i|
FIRST_STRING == SECOND_STRING
end
end
def loop_double_equal_sign
i = 0
while i < RUN_COUNT
FIRST_STRING == SECOND_STRING
i += 1
end
end
times_double_equal_sign consistently took 2.4 seconds, while loop_double_equal_sign was consistently 0.2 to 0.3 seconds faster.
In https://stackoverflow.com/a/6475413/38765 , I found that executing an empty loop took 1.9 seconds, whereas executing an empty block took 5.7 seconds.
Know why you wouldn't want to use for, know why you would want to use for, and only use the latter when you need to. Unless you feel nostalgic for other languages. :)

Well, even blocks are not perfect in Ruby prior to 1.9. They don't always introduce new scope:
i = 0
results = []
(1..3).each do |i|
results << lambda { i }
end
i = 5
p results.map(&:call) # => [5,5,5]
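For comparison, here is the same snippet with the output I would expect on Ruby 1.9 or later, where a block parameter is always local to the block and merely shadows an outer variable of the same name:

```ruby
i = 0
results = []
(1..3).each do |i| # this i is a new block parameter that shadows the outer i
  results << lambda { i }
end
i = 5 # only the outer i changes
p results.map(&:call) # => [1, 2, 3]
p i                   # => 5
```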

Related

Loop method until it returns falsey

I was trying to make my bubble sort shorter and I came up with this
class Array
def bubble_sort!(&block)
block = Proc.new { |a, b| a <=> b } unless block_given?
sorted = each_index.each_cons(2).none? do |i, next_i|
if block.call(self[i], self[next_i]) == 1
self[i], self[next_i] = self[next_i], self[i]
end
end until sorted
self
end
def bubble_sort(&prc)
self.dup.bubble_sort!(&prc)
end
end
I don't particularly like the thing with sorted = --sort code-- until sorted.
I just want to run the each_index.each_cons(2).none? code until it returns true. It's a weird situation in that I use until, but the condition is the code I want to run. Anyway, my attempt seems awkward, and Ruby usually has a nice concise way of putting things. Is there a better way to do this?
This is just my opinion
Have you ever read the Ruby source code of each and map to understand what they do?
No, because they have a clear task expressed from the method name and if you test them, they will take an object, some parameters and then return a value to you.
For example if I want to test the String method split()
s = "a new string"
s.split("new")
=> ["a ", " string"]
Do you know if .split() takes a block?
It is one of the core Ruby methods, but 90% of the time I call it without passing a block. I can understand what it does from the name .split() and from the return value.
Focus on the objects you are using, the task the methods should accomplish and their return values.
I read your code and I cannot refactor it; I can hardly understand what the code does.
I decided to write down some points, with the possibility to follow up:
1) Do not use the proc for now; first get the object-oriented code clean.
2) Split bubble_sort! into several methods, each one with a clear task:
def ordered_inverted! (bubble_sort!), def invert_values, maybe perform invert_values until sorted. Also check whether existing methods already perform this sorting functionality.
3) Write specs for those methods; TDD will push you to keep methods simple and easy to test.
4) If those methods do not belong to the Array class, include them in the appropriate class; sometimes overly complicated methods are just performing simple String operations.
5) Reading books about refactoring may actually help more than trying to force the usage of procs and functional programming when not necessary.
After looking into it further I'm fairly sure the best solution is
loop do
break if condition
end
Either that or the way I have it in the question, but I think the loop do version is clearer.
Edit:
Ha, a couple of weeks after I settled for the loop do solution, I stumbled onto a better one. You can just use a while or until loop with an empty body like this:
while condition; end
until condition; end
So the bubble sort example in the question can be written like this
class Array
def bubble_sort!(&block)
block = Proc.new { |a, b| a <=> b } unless block_given?
until (each_index.each_cons(2).none? do |i, next_i|
if block.call(self[i], self[next_i]) == 1
self[i], self[next_i] = self[next_i], self[i]
end
end); end
self
end
def bubble_sort(&prc)
self.dup.bubble_sort!(&prc)
end
end
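For comparison, the loop do ... break form mentioned earlier could be sketched as a standalone method. This is only an illustration under my own assumptions: bubble_sort here is a hypothetical top-level helper rather than the Array monkey patch from the question, and it tracks swaps with an explicit flag instead of none?.

```ruby
# Hypothetical standalone helper, not the asker's Array#bubble_sort!.
def bubble_sort(values, &block)
  block ||= proc { |a, b| a <=> b } # default to natural ordering
  sorted = values.dup
  loop do
    # One pass: swap adjacent out-of-order pairs and remember whether anything moved.
    swapped = false
    (0...sorted.length - 1).each do |i|
      if block.call(sorted[i], sorted[i + 1]) > 0
        sorted[i], sorted[i + 1] = sorted[i + 1], sorted[i]
        swapped = true
      end
    end
    break unless swapped # a pass without swaps means the array is sorted
  end
  sorted
end

p bubble_sort([3, 1, 2])                     # => [1, 2, 3]
p bubble_sort([3, 1, 2]) { |a, b| b <=> a }  # => [3, 2, 1]
```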

Push an array into another array with Ruby, and return square brackets

I've spent a few hours searching for a way to push an array into another array or into a hash. Apologies in advance if the formatting of this question is a bit messy. This is the first time I've asked a question on StackOverflow, so I'm trying to get the hang of styling my questions properly.
I have to write some code to make the following unit test pass:
class TestNAME < Test::Unit::TestCase
def test_directions()
assert_equal(Lexicon.scan("north"), [['direction', 'north']])
result = Lexicon.scan("north south east")
assert_equal(result, [['direction', 'north'],
['direction', 'south'],
['direction', 'east']])
end
end
The most simple thing I've come up with is below. The first part passes, but then the second part is not returning the expected result when I run rake test.
Instead of returning:
[["direction", "north"], ["direction", "south"], ["direction", "east"]]
it's returning:
["north", "south", "east"]
Although, if I print the result of y as a string to the console, I get 3 separate arrays that are not contained within another array (as below). Why hasn't it printed the outermost square brackets of the array, y?
["direction", "north"]
["direction", "south"]
["direction", "east"]
Below is the code I've written in an attempt to pass the test unit above:
class Lexicon
def initialize(stuff)
@words = stuff.split
end
def self.scan(word)
if word.include?(' ')
broken_words = word.split
broken_words.each do |word|
x = ['direction']
x.push(word)
y = []
y.push(x)
end
else
return [['direction', word]]
end
end
end
Any feedback about this will be much appreciated. Thank you all so much in advance.
What you're seeing is the result of each, which returns the thing being iterated over, or in this case, broken_words. What you want is collect which returns the transformed values. Notice in your original, y is never used, it's just thrown out after being composed.
Here's a fixed up version:
class Lexicon
def initialize(stuff)
@words = stuff.split
end
def self.scan(word)
broken_words = word.split(/\s+/)
broken_words.collect do |word|
[ 'direction', word ]
end
end
end
It's worth noting a few things were changed here:
Splitting on an arbitrary number of spaces rather than one.
Simplifying to a single case instead of two.
Eliminating the redundant return statement.
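The difference between each and collect described above can be seen in isolation. A minimal sketch (the variable names are only illustrative):

```ruby
words = "north south east".split

# each returns its receiver, so whatever the block builds is discarded.
from_each = words.each { |w| ['direction', w] }
p from_each    # => ["north", "south", "east"]

# collect (also available as map) returns a new array of the block's results.
from_collect = words.collect { |w| ['direction', w] }
p from_collect # => [["direction", "north"], ["direction", "south"], ["direction", "east"]]
```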
One thing you might consider is using a data structure like { direction: word } instead. That makes referencing values a lot easier since you'd do entry[:direction] avoiding the ambiguous entry[1].
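A sketch of that hash-based idea follows. scan_to_hashes is a hypothetical name, and note that returning hashes would no longer satisfy the test in the question, which expects nested arrays.

```ruby
# Hypothetical variant: one hash per word instead of a two-element array.
def scan_to_hashes(str)
  str.split.map { |word| { direction: word } }
end

entries = scan_to_hashes("north south")
p entries.length             # => 2
p entries.first[:direction]  # => "north"
p entries.last[:direction]   # => "south"
```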
If you're not instantiating Lexicon objects, you can use a Module which may make it more clear that you're not instantiating objects.
Also, there is no need to use an extra variable (i.e. broken_words), and I prefer the { } block syntax over the do..end syntax for functional blocks vs. iterative blocks.
module Lexicon
def self.scan str
str.split.map {|word| [ 'direction', word ] }
end
end
UPDATE: based on Cary's comment (I assume he meant split when he said scan), I've removed the superfluous argument to split.

What is the preferred way to loop in Ruby?

Why is each loop preferred over for loop in Ruby? Is there a difference in time complexity or are they just syntactically different?
Yes, these are two different ways of iterating, but I hope this measurement helps.
require 'benchmark'
a = Array( 1..100000000 )
sum = 0
Benchmark.realtime {
a.each { |x| sum += x }
}
This takes 5.866932 sec
a = Array( 1..100000000 )
sum = 0
Benchmark.realtime {
for x in a
sum += x
end
}
This takes 6.146521 sec.
Though this is not a rigorous way to benchmark, and other factors affect the result, on a single machine each seems to be a bit faster than for.
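A somewhat more careful measurement could use Benchmark.bmbm from the standard library, which runs a rehearsal pass before the timed pass. This is only a sketch: the array is much smaller than the one above so that it finishes quickly, and the numbers will differ between machines and Ruby versions, so no timings are claimed here.

```ruby
require 'benchmark'

numbers = Array(1..1_000_000) # deliberately smaller than the array used above

# bmbm prints a rehearsal and a timed run, and returns one Benchmark::Tms per report.
timings = Benchmark.bmbm do |bm|
  bm.report("each") do
    sum = 0
    numbers.each { |x| sum += x }
  end

  bm.report("for") do
    sum = 0
    for x in numbers
      sum += x
    end
  end
end

timings.each { |t| puts format("%-5s %.4f s", t.label, t.real) }
```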
The variable referencing an item in an iteration is temporary and has no significance outside of the iteration. It is better if it is hidden from the outside. With external iterators, such a variable is located outside of the iteration block. In the following, e is useful only within do ... end, but it is separated from the block and written outside of it, which makes the code harder to read:
for e in [:foo, :bar] do
...
end
With internal iterators, the block variable is defined right inside the block, where it is used. It is easier to read:
[:foo, :bar].each do |e|
...
end
This visibility issue is not just for a programmer. With respect to visibility in the sense of scope, the variable for an external iterator is accessible outside of the iteration:
for e in [:foo] do; end
e # => :foo
whereas with an internal iterator, a block variable is invisible from outside:
[:foo].each do |e|; end
e # => undefined local variable or method `e'
The latter is better from the point of view of encapsulation.
When you want to nest the loops, the order of variables would be somewhat backwards with external iterators:
for a in [[:foo, :bar]] do
for e in a do
...
end
end
but with internal iterators, the order is more straightforward:
[[:foo, :bar]].each do |a|
a.each do |e|
...
end
end
With external iterators, you can only use hard-coded Ruby syntax, and you also have to remember the matching between the keyword and the method that is internally called (for calls each), but for internal iterators, you can define your own, which gives flexibility.
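To illustrate the flexibility of defining your own internal iterator, here is a minimal sketch. each_adjacent_pair is a hypothetical method written for this example (Ruby's built-in each_cons(2) already covers this case); the point is only that any method that yields can be used as an iterator.

```ruby
# Hypothetical iterator: yields each element together with the one after it.
def each_adjacent_pair(items)
  (0...items.length - 1).each do |i|
    yield items[i], items[i + 1]
  end
end

pairs = []
each_adjacent_pair([:foo, :bar, :baz]) do |left, right|
  pairs << [left, right]
end
p pairs # => [[:foo, :bar], [:bar, :baz]]
```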
each is the Ruby Way. It implements the Iterator Pattern, which has decoupling benefits.
Check also this: "for" vs "each" in Ruby
An interesting question. There are several ways of looping in Ruby. I have noted that there is a design principle in Ruby, that when there are multiple ways of doing the same, there are usually subtle differences between them, and each case has its own unique use, its own problem that it solves. So in the end you end up needing to be able to write (and not just to read) all of them.
As for the question about the for loop, this is similar to my earlier question on whether the for loop is a trap.
Basically there are 2 main explicit ways of looping, one is by iterators (or, more generally, blocks), such as
[1, 2, 3].each { |e| puts e * 10 }
[1, 2, 3].map { |e| e * 10 }
# etc., see Array and Enumerable documentation for more iterator methods.
Connected to this way of iterating is the class Enumerator, which you should strive to understand.
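As a brief sketch of what Enumerator offers (variable names are only illustrative): calling an iterator method without a block returns an Enumerator, which can be stepped through manually or chained with other methods.

```ruby
enum = [1, 2, 3].each # no block given, so an Enumerator is returned
p enum.class          # => Enumerator
p enum.next           # => 1
p enum.next           # => 2

# Enumerators chain, for example map combined with an index:
scaled = [1, 2, 3].map.with_index { |e, i| e * 10 + i }
p scaled              # => [10, 21, 32]
```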
The other way is Pascal-ish looping by while, until and for loops.
for y in [1, 2, 3]
puts y
end
x = 0
while x < 3
puts x; x += 1
end
# same for until loop
Like if and unless, while and until have their tail form, such as
a = 'alligator'
a.chop! until a.chars.last == 'g'
a #=> 'allig'
The third very important way of looping is implicit looping, or looping by recursion. Ruby is extremely malleable, all classes are modifiable, hooks can be set up for various events, and this can be exploited to produce most unusual ways of looping. The possibilities are so endless that I don't even know where to start talking about them. Perhaps a good place is the blog by Yusuke Endoh, a well known artist working with Ruby code as his artistic material of choice.
To demonstrate what I mean, consider this loop
class Object
def method_missing sym
s = sym.to_s
if s.chars.last == 'g' then s else eval s.chop end
end
end
alligator
#=> "allig"
Aside from readability issues, the for loop iterates in Ruby land whereas each does it from native code, so in principle each should be more efficient when iterating over all elements in an array.
Loop with each:
arr.each {|x| puts x}
Loop with for:
for i in 0...arr.length
puts arr[i]
end
In the each case we are just passing a code block to a method implemented in the machine's native code (fast code), whereas in the for case, all code must be interpreted and run taking into account all the complexity of the Ruby language.
However for is more flexible and lets you iterate in more complex ways than each does, for example, iterating with a given step.
EDIT
I had not realized that you can step over a range by calling the step() method before each(), so the flexibility I claimed for the for loop is actually unjustified.
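A minimal sketch of stepping without a for loop, using made-up sample data:

```ruby
arr = %w[a b c d e f g]

# Visit every second index with Range#step.
picked = []
(0...arr.length).step(2).each { |i| picked << arr[i] }
p picked  # => ["a", "c", "e", "g"]

# Numeric#step is an alternative when no range is at hand.
counted = []
1.step(10, 3) { |n| counted << n }
p counted # => [1, 4, 7, 10]
```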

Can I use a ternary operator instead of a while loop

I'm trying to reduce the while loop below to a single line
def this_method(week)
i = 0
while i < array.length
yield(week[i])
i += 1
end
end
week.each do |week|
puts week
end
Like others, I'm confused about the example (array is not defined, and this_method is never called). But you certainly don't need the while loop. I'd just use the Integer#times method, since you're making no use of the array values:
array.length.times {|i| yield week[i]}
#each_index (which ram suggested) works just as well.
But if array is actually meant to be week, then it gets even simpler:
week.each {|x| yield x}
I'm not sure why you'd want to create a method that just recycles #each though.
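Put together as something runnable, and assuming that array in the question was meant to be week, the method might look like this (each_day is a hypothetical name chosen for the example):

```ruby
# Hypothetical corrected version: yields every element of week to the caller's block.
def each_day(week)
  week.each { |day| yield day }
end

seen = []
each_day(%w[Mon Tue Wed]) { |day| seen << day }
p seen # => ["Mon", "Tue", "Wed"]
```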
For a single line you can use Array#each_index:
array.each_index {|i| yield week[i] }
No, you can't. The ternary operator is a conditional expression, the while is a loop expression.
However, in Ruby you normally use enumerators, not while. Your code can be rewritten as
def this_method(week)
array.each_with_index { |item, i| yield(week[i]) }
end
What is not clear to me is where the array variable comes from. Even in your example, there is no definition of such a variable.
if, in any form, checks its condition only once.
while, on the other hand, can check its condition many times.
Well, if you don't like other answers with enumerators you can use while in a different form:
def this_method(week)
i = -1
yield(week[i]) while (i+=1) < array.length
end

Using ruby-debug in for i in 0...5

I am learning Ruby from 'Programming Ruby 1.9'. I am learning to use ruby-debug so I can understand what is going on underneath. I use RubyMine since it integrates ruby-debug19 or something like that (it says I don't have the gem and installs it). Here is the question: I was able to step through the code and explore the variables and the stack. However, when it reaches a for i in 0...5, the debugger says
stack frame not available
I know that Ruby programmers don't use for loops much, but I'd still like to know if there is a way to debug through for loops.
Code:
raw_text = %{
The problem breaks down into two parts. First, given some text as a
string, return a list of words. That sounds like an array. Then, build a
count for each distinct word. That sounds like a use for a hash---we can
index it with the word and use the corresponding entry to keep a count.}
word_list = words_from_string(raw_text)
counts = count_frequency(word_list)
sorted = counts.sort_by {|word, count| count}
top_five = sorted.last(5)
for i in 0...5 # (this is ugly code--read on
word = top_five[i][0] # for a better version)
count = top_five[i][1]
puts "#{word}: #{count}"
end
If you take a look at the Ruby Language Specification (clause 11.5.2.3.4 on p. 91), you will see that
for i in 0...5
word = top_five[i][0]
count = top_five[i][1]
puts "#{word}: #{count}"
end
is syntactic sugar for
(0...5).each do |i|
word = top_five[i][0]
count = top_five[i][1]
puts "#{word}: #{count}"
end
except that no new variable scope is created for the block. So, the code with for will be translated into the code with each and executed as if it were written that way, except that the variables used in the for loop leak into the surrounding scope.
To put it another way: for actually executes each but without allocating a new stack frame for the block. So, the error message is exactly right: there is a call to a block, but somehow there is no stack frame allocated for that block call. That obviously confuses the debugger.
Now, one might argue that this is a bug and that for loops should get special treatment inside the debugger. I guess that so far nobody has ever bothered to fix that bug, since nobody ever uses for loops, precisely because they leak their variables into the surrounding scope and are exactly equivalent to an idiomatic each which doesn't.
What do I mean by "leaking variables"? See here:
(1..2).each do |i|
t = true
end
i
# NameError: undefined local variable or method `i' for main:Object
t
# NameError: undefined local variable or method `t' for main:Object
for i in 1..2
t = true
end
i
# => 2
t
# => true
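The claim above that for is translated into a call to each can also be checked directly. In this sketch, Countdown is a hypothetical class written only to observe the call: it is not an Array or Range, yet for works on it because it responds to each.

```ruby
# Hypothetical class used only to observe that for calls each.
class Countdown
  def initialize(from)
    @from = from
  end

  def each
    puts "each was called"
    @from.downto(1) { |n| yield n }
  end
end

seen = []
for n in Countdown.new(3) # prints "each was called"
  seen << n
end
p seen # => [3, 2, 1]
p n    # => 1 (the loop variable leaked into the surrounding scope)
```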
