Ruby while syntax - ruby

Does anybody why I can write this:
ruby-1.8.7-p302 > a = %w( a b c)
=> ["a", "b", "c"]
ruby-1.8.7-p302 > while (i = a.shift) do; puts i ; end
a
b
c
=> nil
Which looks like passing a block to while.
And not:
while(i = a.shift) { puts i; }
Is it because the "do" of the while syntax is just syntaxic sugar and as nothing to do with the "do" of a block?

Is it because the do of the while syntax is just syntaxic sugar and as nothing to do with the do of a block?
More or less, yes. It's not syntactic sugar, it's simply a built-in language construct, like def or class, as #meagar already wrote.
It has nothing to do with the do of a block, except that keywords are expensive and so reusing keywords makes sense. (By "expensive" I mean that they limit the programmer in his expressiveness.)
In a while loop, there are two ways to separate the block from the condition:
the do keyword and
an expression separator.
There are, in turn, two different expression separators in Ruby:
the semicolon ; and
a newline
So, all three of the following are valid:
while i = a.shift do puts i end # do
while i = a.shift; puts i end # semicolon
while i = a.shift
puts i end # newline
[Obviously, that last one wouldn't be written that way, you would put the end on a new line, dedented to match the while. I just wanted to demonstrate what is the minimum needed to separate the parts of the while loop.]
By the way: it is highly un-idiomatic to put the condition in parentheses. There's also a lot of superfluous semicolons in your code. And the variable name i is usually reserved for an index, not an element. (I normally use el for generic elements, but I much prefer more semantic names.)
It is also highly un-idiomatic to iterate a collection manually. Your code would be much better written as
a.each(&method(:puts)).clear
Not only is it much easier to understand what this does (print all elements of the array and delete all items from it), it is also much easier to write (there is no way to get the termination condition wrong, or screw up any assignments). It also happens to be more efficient: your version is Θ(n2), this one is Θ(n).
And actually, that's not really how you would write it, either, because Kernel#puts already implements that behavior, anyway. So, what you would really write is this
puts a
a.clear
or maybe this
a.tap(&method(:puts)).clear
[Note: this very last one is not 100% equivalent. It prints a newline for an empty array, all the other ones print nothing.]
Simple. Clear. Concise. Expressive. Fast.
Compare that to:
while (i = a.shift) do; puts i ; end
I actually had to run that multiple times to be 100% clear what it does.

while doesn't take a block, it's a language construct. The do is optional:
while (i = a.shift)
puts i
end

Related

Detecting a missing sub!, sort!, map!, etc

After returning to Ruby from a long stint coding in another language, I regularly assume that foo.sort, foo.map {...}, foo.sub /bar/, 'zip' will change foo. Of course I meant foo.sort!, etc. But that usually takes 3 or 4 debugging potshots before I notice. Meanwhile, the sort is calculated, but then isn't assigned to anything. Can I make ruby warn about that missing lvalue, like a C compiler warns of a function's ignored return value?
You mean like Perl's somewhat infamous "using map in void context"? I don't know of Ruby having such a thing. Sounds like you need more unit testing to catch mistakes like this before they can worm into your code deeply enough to be considered bugs.
Keep in mind Ruby's a lot more flexible than languages like Perl. For example, the following code might be useful:
def rewrite(list)
list.map do |row|
row += '!'
end
end
Now technically that's a map in a void context, but because it's used as a return value it might be captured elsewhere. It's the responsibility of the caller to make use of it. Flagging the method itself for some sort of warning is a level removed from what most linting type tools can do.
Here's a very basic parser :
#forgetful_methods = %w(sort map sub)
Dir['*.rb'].each do |script|
File.readlines(script).each.with_index(1) do |line, i|
#forgetful_methods.each do |method|
if line =~ /\.#{method}(?!!)/ && $` !~ /(=|\b(puts|print|return)\b|^#)/
puts format('%-25s (%3d) : %s', script, i, line.strip)
end
end
end
end
# =>
# brace_globbing.rb ( 13) : subpatterns.map{|subpattern| explode_extglob(match.pre_match+subpattern+match.post_match)}.flatten
# delegate.rb ( 11) : #targets.map { |t| t.send(m, *args) }
It checks every ruby script in the current directory for sort, map or sub without ! that aren't preceded by =, puts, print or return.
It's just a start, but maybe it could help you find some of the low hanging fruits.
There are many false positives, though.
A more complex version could use abstract syntax trees, for example with Ripper.

Why is `loop` not a keyword? What is the use case for `loop` without a block?

I was wondering why loop is a Kernel method rather than a keyword like while and until. There are cases where I want to do unconditional loop, but since loop, being a method, is slower than while true, I chose to do the latter when performance is important. But writing true here looks ugly, and is not Rubish; loop looks better. Here is a dilemma.
My guess is that it is because there is a usage of loop that does not take a block and returns an enumerator. But to me, it looks that an unconditional loop can easily be created on the spot, and does not make sense to create such an instance of Enumerator and later use it. I cannot think of a use case.
Is my guess regarding my wonder correct? If not, why is loop a method rather than a keyword?
What is the use case for an enumerator created by loop without a block?
Only Ruby's developers can answer your first question, but your guess seems reasonable. As to your second question, sure there are use cases. The whole point of Enumerables is that you can pass them around, which, as you know, you can't do with a while or for structure.
As a trivial example, here's a Fibonacci sequence method that takes an Enumerable as an argument:
def fib(enum)
a, b = nil, nil
enum.each do
a, b = b || 0, a ? a + b : 1
puts a
end
puts "DONE"
end
Now suppose you want to print out the first seven Fibonacci numbers. You can use any Enumerable that yields seven times, like the one returned by 7.times:
fib 7.times
# => 0
# 1
# 1
# 2
# 3
# 5
# 8
# DONE
But what if you want to print out Fibonacci numbers forever? Well, just pass it the Enumerable returned by loop:
fib loop
# => 0
# 1
# ...
# (Never stops)
Like I said, this is a silly example that clearly is a terrible way to generate Fibonacci numbers, but hopefully it helps you understand that there are times—albeit perhaps rarely—when it's useful to have an Enumerable that never ends, and why loop is a nice convenience for those cases.

Multiple in-place substitutions in one line

What is the most idiomatic way to do multiple in-place substitutions on a string when none of the patterns being replaced are guaranteed to have a match?
For instance, say I have an array of strings and I want to replace "sad" with "happy" and "goodbye" with "hello" in each one:
a = ["I am sad", "goodbye for now"]
# This will work:
a.map! do |s|
s = s.gsub(/sad/,"happy").gsub(/goodbye/,"hello")
end
# So will this:
a.each do |s|
s.gsub!(/sad/,"happy")
s.gsub!(/goodbye/,"hello")
end
# This will fail when s does not match /sad/:
a.each do |s|
s.gsub!(/sad/,"happy").gsub!(/goodbye/,"hello")
end
The first option seems a little bit silly, since logically I'm trying to do an in-place substitution rather than a re-assignment. The second option is okay, but my aesthetic sense tells me that it seems wrong to be required to turn the substitution into two sequential statements, particularly in cases when only one or the other substitution is expected to be successful (which, ironically, is exactly the case that causes the third version, which looks "right" to me, to fail). Also, it's probably wrong to use each destructively as I'm doing here, but if I used map! instead, I'd need to add s as the final line in the block to ensure I don't accidentally make nil entries when substitution fails, which seems almost sillier than the first option.
I'm guessing the reason the (g)sub! methods return nil when no substitution is done is because that makes them convenient for use in logical constructs, which is admittedly a very good reason (especially since the non-destructive versions obviously must return "true" values no matter what).
So...I know this is really very little more than a minor aesthetic quibble, but is there a better way than the two (working) ways I've shown? If not, is there any reason to prefer one over the other (beyond my intuitive aesthetic preference for the second version)?
First you need to create a replacement_hash as below :
replacement_hash = { "sad" => "happy", "goodbye" => "hello"}
a = ["I am sad", "goodbye for now"]
Regexp.union(replacement_hash.keys) # => /sad|goodbye/
a.map { |s| s.gsub(Regexp.union(replacement_hash.keys), replacement_hash) }
# => ["I am happy", "hello for now"]
If in-place replacement needed, do as below :-
a.each { |s| s.gsub!(Regexp.union(replacement_hash.keys), replacement_hash) }

Chaining partition, keep_if etc

[1,2,3].partition.inject(0) do |acc, x|
x>2 # this line is intended to be used by `partition`
acc+=x # this line is intended to be used by `inject`
end
I know that I can write above stanza using different methods but this is not important here.
What I want to ask why somebody want to use partition (or other methods like keep_if, delete_if) at the beginning of the "chain"?
In my example, after I chained inject I couldn't use partition. I can write above stanza using each:
[1,2,3].each.inject(0) do |acc, x|
x>2 # this line is intended to be used by `partition`
acc+=x # this line is intended to be used by `inject`
end
and it will be the same, right?
I know that x>2 will be discarded (and not used) by partition. Only acc+=x will do the job (sum all elements in this case).
I only wrote that to show my "intention": I want to use partition in the chain like this [].partition.inject(0).
I know that above code won't work as I intended and I know that I can chain after block( }.map as mentioned by Neil Slater).
I wanted to know why, and when partition (and other methods like keep_if, delete_if etc) becomes each (just return elements of the array as partition do in the above cases).
In my example, partition.inject, partition became each because partition cannot take condition (x>2).
However partition.with_index (as mentioned by Boris Stitnicky) works (I can partition array and use index for whatever I want):
shuffled_array
.partition
.with_index { |element, index|
element > index
}
ps. This is not question about how to get sum of elements that are bigger than 2.
This is an interesting situation. Looking at your code examples, you are obviously new to Ruby and perhaps also to programming. Yet you managed to ask a very difficult question that basically concerns the Enumerator class, one of the least publicly understood classes, especially since Enumerator::Lazy was introduced. To me, your question is difficult enough that I am not able to provide a comprehensive answer. Yet the remarks about your code would not fit into a comment under the OP. That's why I'm adding this non-answer.
First of all, let us notice a few awful things in your code:
Useless lines. In both blocks, x>2 line is useless, because its return value is discarded.
[1,2,3].partition.inject(0) do |x, acc|
x>2 # <---- return value of this line is never used
acc+=x
end
[1,2,3].each.inject(0) do |x, acc|
x>2 # <---- return value of this line is never used
acc+=x
end
I will ignore this useless line when discussing your code examples further.
Useless #each method. It is useless to write
[1,2,3].each.inject(0) do |x, acc|
acc+=x
end
This is enough:
[1,2,3].inject(0) do |x, acc|
acc+=x
end
Useless use of #partition method. Instead of:
[1,2,3].partition.inject(0) do |x, acc|
acc+=x
end
You can just write this:
[1,2,3].inject(0) do |x, acc|
acc+=x
end
Or, as I would write it, this:
[ 1, 2, 3 ].inject :+
But then, you ask a deep question about using #partition method in the enumerator mode. Having discussed the trivial newbie problems of your code, we are left with the question how exactly the enumerator-returning versions of the #partition, #keep_if etc. should be used, or rather, what are the interesting way of using them, because everyone knows that we can use them for chaining:
array = [ *1..6 ]
shuffled_arrray = array.shuffle # randomly shuffles the array elements
shuffled_array
.partition # partition enumerator comes into play
.with_index { |element, index| # method Enumerator#with_index comes into play
element > index # and partitions elements into those greater
} # than their index, and those smaller
And also like this:
e = partition_enumerator_of_array = array.partition
# And then, we can partition the array in many ways:
e.each &:even? # partitions into odd / even numbers
e.each { rand() > 0.5 } # partitions the array randomly
# etc.
An easily understood advantage is that instead of writing longer:
array.partition &:even?
You can write shorter:
e.each &:even?
But I am basically sure that enumerators provide more power to the programmer than just chaining collection methods and shortening code a little bit. Because different enumerators do very different things. Some, such as #map! or #reject!, can even modify the collection on which they operate. In this case, it is imaginable that one could combine different enumerators with the same block to do different things. This ability to vary not just the blocks, but also the enumerators to which they are passed, gives combinatorial power, which can very likely be used to make some otherwise lengthy code very concise. But I am unable to provide a very useful concrete example of this.
In sum, Enumerator class is here mainly for chaining, and to use chaining, programmers do not really need to undestand Enumerator in detail. But I suspect that the correct habits regarding the use of Enumerator might be as difficult to learn as, for instance, correct habits of parametrized subclassing. I suspect I have not grasped the most powerful ways to use enumerators yet.
I think that the result [3, 3] is what you are looking for here - partitioning the array into smaller and larger numbers then summing each group. You seem to be confused about how you give the block "rules" to the two different methods, and have merged what should be two blocks into one.
If you need the net effects of many methods that each take a block, then you can chain after any block, by adding the .method after the close of the block like this: }.each or end.each
Also note that if you create partitions, you are probably wanting to sum over each partition separately. To do that you will need an extra link in the chain (in this case a map):
[1,2,3].partition {|x| x > 2}.map do |part|
part.inject(0) do |acc, x|
x + acc
end
end
# => [3, 3]
(You also got the accumulator and current value wrong way around in the inject, and there is no need to assign to the accumulator, Ruby does that for you).
The .inject is no longer in a method chain, instead it is inside a block. There is no problem with blocks inside other blocks, in fact you will see this very often in Ruby code.
I have chained .partition and .map in the above example. You could also write the above like this:
[1,2,3].partition do
|x| x > 2
end.map do |part|
part.inject(0) do |acc, x|
x + acc
end
end
. . . although when chaining with short blocks, I personally find it easier to use the { } syntax instead of do end, especially at the start of a chain.
If it all starts to look complex, there is not usually a high cost to assigning the results of the first part of a chain to a local variable, in which case there is no chain at all.
parts = [1,2,3].partition {|x| x > 2}
parts.map do |part|
part.inject(0) do |acc, x|
x + acc
end
end

Ruby for loop a trap?

In a discussion of Ruby loops, Niklas B. recently talked about for loop 'not introducing a new scope', as compared to each loop. I'd like to see some examples of how does one feel this.
O.K., I expand the question: Where else in Ruby do we see what apears do/end block delimiters, but there is actually no scope inside? Anything else apart from for ... do ... end?
O.K., One more expansion of the question, is there a way to write for loop with curly braces { block } ?
Let's illustrate the point by an example:
results = []
(1..3).each do |i|
results << lambda { i }
end
p results.map(&:call) # => [1,2,3]
Cool, this is what was expected. Now check the following:
results = []
for i in 1..3
results << lambda { i }
end
p results.map(&:call) # => [3,3,3]
Huh, what's going on? Believe me, these kinds of bugs are nasty to track down. Python or JS developers will know what I mean :)
That alone is a reason for me to avoid these loops like the plague, although there are more good arguments in favor of this position. As Ben pointed out correctly, using the proper method from Enumerable almost always leads to better code than using plain old, imperative for loops or the fancier Enumerable#each. For instance, the above example could also be concisely written as
lambdas = 1.upto(3).map { |i| lambda { i } }
p lambdas.map(&:call)
I expand the question: Where else in Ruby do we see what apears do/end block delimiters, but there is actually no scope inside? Anything else apart from for ... do ... end?
Every single one of the looping constructs can be used that way:
while true do
#...
end
until false do
# ...
end
On the other hand, we can write every one of these without the do (which is obviously preferrable):
for i in 1..3
end
while true
end
until false
end
One more expansion of the question, is there a way to write for loop with curly braces { block }
No, there is not. Also note that the term "block" has a special meaning in Ruby.
First, I'll explain why you wouldn't want to use for, and then explain why you might.
The main reason you wouldn't want to use for is that it's un-idiomatic. If you use each, you can easily replace that each with a map or a find or an each_with_index without a major change of your code. But there's no for_map or for_find or for_with_index.
Another reason is that if you create a variable within a block within each, and it hasn't been created before-hand, it'll only stay in existance for as long as that loop exists. Getting rid of variables once you have no use for them is a good thing.
Now I'll mention why you might want to use for. each creates a closure for each loop, and if you repeat that loop too many times, that loop can cause performance problems. In https://stackoverflow.com/a/10325493/38765 , I posted that using a while loop rather than a block made it slower.
RUN_COUNT = 10_000_000
FIRST_STRING = "Woooooha"
SECOND_STRING = "Woooooha"
def times_double_equal_sign
RUN_COUNT.times do |i|
FIRST_STRING == SECOND_STRING
end
end
def loop_double_equal_sign
i = 0
while i < RUN_COUNT
FIRST_STRING == SECOND_STRING
i += 1
end
end
times_double_equal_sign consistently took 2.4 seconds, while loop_double_equal_sign was consistently 0.2 to 0.3 seconds faster.
In https://stackoverflow.com/a/6475413/38765 , I found that executing an empty loop took 1.9 seconds, whereas executing an empty block took 5.7 seconds.
Know why you wouldn't want to use for, know why you would want to use for, and only use the latter when you need to. Unless you feel nostalgic for other languages. :)
Well, even blocks are not perfect in Ruby prior to 1.9. They don't always introduce new scope:
i = 0
results = []
(1..3).each do |i|
results << lambda { i }
end
i = 5
p results.map(&:call) # => [5,5,5]

Resources