How to use blocks properly? - ruby

I'm very new to Ruby, so please bear with me.
Why is it a syntax error to have "test" {|s| print s}? How about "test" do |s| print s end?
Thanks

You can't say this:
"test" { |s| print s }
because "test" is a string literal, not a method. The same would apply to your do/end version. You could say:
["test"].each { |s| print s }
though, because Array has an each method (the same method the Enumerable module builds on).

The {} are usually used for one liners.
do/end for multiple lines.
But there is no rule, do what you prefer.
Notice:
If ever you need to pass several instructions in a one liner, separate them with ;
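To illustrate the two delimiter styles and the semicolon tip, here is a minimal sketch. The method names double_with_braces and triple_with_do_end are made up for this example and are not part of Ruby.

```ruby
# Hypothetical helper: a brace block holding two statements on one line,
# separated by a semicolon.
def double_with_braces(numbers)
  out = []
  numbers.each { |n| d = n * 2; out << d }
  out
end

# Hypothetical helper: the same shape of code written as a multi-line
# do/end block, one statement per line.
def triple_with_do_end(numbers)
  out = []
  numbers.each do |n|
    t = n * 3
    out << t
  end
  out
end

p double_with_braces([1, 2])  # prints [2, 4]
p triple_with_do_end([1, 2])  # prints [3, 6]
```

Both forms attach the block to the each call; which one to use here is a matter of readability.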

A block is just a chunk of code enclosed in braces or keywords do/end. As mentioned already, you typically use braces for one liners, and do/end for multiple lines of code. Blocks can appear only immediately after the calling of some method. You can think of a block as an anonymous method (one that doesn't have a method name).
In your code, you were placing a block immediately after a string literal, not a method invocation. Blocks can be used for looping, as such:
2.times { puts "hello" } # => 2
# >> hello
# >> hello
In the above code, times is a method that belongs to all integers (that is to say, it is an instance method of the Integer class). The times method executes the code in the block twice, and returns the object (2 in this case) you called it on. You can pass a block to any method, although methods that are not expecting them will simply ignore the block.
Blocks can take parameters. The parameters are placed between pipes (the '|' character). It turns out, the first example could have accepted a parameter as seen here:
2.times { |i| puts i.to_s + " hello" } # => 2
# >> 0 hello
# >> 1 hello
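To see the other side of this, here is a rough sketch of a method that receives a block and passes it a parameter. The method name twice is hypothetical; it only imitates what 2.times does and is not how Integer#times is actually implemented.

```ruby
# `twice` is a made-up method for illustration only.
def twice
  # A method can check whether the caller supplied a block.
  return :no_block unless block_given?

  yield 0  # run the block, passing 0 as its parameter
  yield 1  # run the block again, passing 1
  :done
end

twice { |i| puts i.to_s + " hello" }
# >> 0 hello
# >> 1 hello
```

Calling twice without a block returns :no_block instead of raising, because of the block_given? check.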
I've only just scratched the surface of the power of blocks. You can read more about blocks for free in the online version of Programming Ruby: The Pragmatic Programmer's Guide (aka the Pickaxe Book). It is a couple of editions old now, but for an introduction to Ruby, you should find it sufficient. Once you understand blocks, you can start using the power features of Enumerable, which is included in Arrays and Hashes.


why pass block arguments to a function in ruby?

I'm unclear on why there is a need to pass block arguments when calling a function.
why not just pass in as function arguments and what happens to the block arguments, how are they passed and used?
m.call(somevalue) {|_k, v| v['abc'] = 'xyz'}
module m
def call ( arg1, *arg2, &arg3)
end
end
Ruby, like almost all mainstream programming languages, is a strict language, meaning that arguments are fully evaluated before being passed into the method.
Now, imagine you want to implement (a simplified version of) Integer#times. The implementation would look a little bit like this:
class Integer
def my_times(action_to_be_executed)
raise ArgumentError, "`self` must be non-negative but is `#{inspect}`" if negative?
return if zero?
action_to_be_executed
pred.my_times(action_to_be_executed)
end
end
3.my_times(puts "Hello")
# Hello
0.my_times(puts "Hello")
# Hello
-1.my_times(puts "Hello")
# Hello
# ArgumentError (`self` must be non-negative but is `-1`)
As you can see, 3.my_times(puts "Hello") printed Hello exactly once, instead of thrice, as it should do. Also, 0.my_times(puts "Hello") printed Hello exactly once, instead of not at all, as it should do, despite the fact that it returns in the second line of the method, and thus action_to_be_executed is never even evaluated. Even -1.my_times(puts "Hello") printed Hello exactly once, despite the fact that it raises an ArgumentError exception as the very first thing in the method and thus the entire rest of the method body is never evaluated.
Why is that? Because Ruby is strict! Again, strict means that arguments are fully evaluated before being passed. So, what this means is that before my_times even gets called, the puts "Hello" is evaluated (which prints Hello to the standard output stream), and the result of that evaluation (which is just nil because Kernel#puts always returns nil) is passed into the method.
So, what we need to do, is somehow delay the evaluation of the argument. One way we know how to delay evaluation, is by using a method: methods are only evaluated when they are called.
So, we take a page out of Java's playbook, and define a Single Abstract Method Protocol: the argument that is being passed to my_times must be an object which implements a method with a specific name. Let's call it call, because, well, we are going to call it.
This would look a little bit like this:
class Integer
def my_times(action_to_be_executed)
raise ArgumentError, "`self` must be non-negative but is `#{inspect}`" if negative?
return if zero?
action_to_be_executed.call
pred.my_times(action_to_be_executed)
end
end
def (hello = Object.new).call
puts "Hello"
end
3.my_times(hello)
# Hello
# Hello
# Hello
0.my_times(hello)
-1.my_times(hello)
# ArgumentError (`self` must be non-negative but is `-1`)
Nice! It works! The argument that is passed is of course still strictly evaluated before being passed (we can't change the fundamental nature of Ruby from within Ruby itself), but this evaluation only results in the object that is bound by the local variable hello. The code that we want to run is another layer of indirection away and will only be executed at the point where we actually call it.
It also has another advantage: Integer#times actually makes the index of the current iteration available to the action as an argument. This was impossible to implement with our first solution, but here we can do it, because we are using a method and methods can take arguments:
class Integer
def my_times(action_to_be_executed)
raise ArgumentError, "`self` must be non-negative but is `#{inspect}`" if negative?
__my_times_helper(action_to_be_executed)
end
protected
def __my_times_helper(action_to_be_executed, index = 0)
return if zero?
action_to_be_executed.call(index)
pred.__my_times_helper(action_to_be_executed, index + 1)
end
end
def (hello = Object.new).call(i)
puts "Hello from iteration #{i}"
end
3.my_times(hello)
# Hello from iteration 0
# Hello from iteration 1
# Hello from iteration 2
0.my_times(hello)
-1.my_times(hello)
# ArgumentError (`self` must be non-negative but is `-1`)
However, this is not actually very readable. If you didn't want to give a name to this action that we are trying to pass but instead simply literally write it down inside the argument list, it would look something like this:
3.my_times(Object.new.tap do |obj|
def obj.call(i)
puts "Hello from iteration #{i}"
end
end)
# Hello from iteration 0
# Hello from iteration 1
# Hello from iteration 2
or on one line:
3.my_times(Object.new.tap do |obj| def obj.call(i); puts "Hello from iteration #{i}" end end)
# Hello from iteration 0
# Hello from iteration 1
# Hello from iteration 2
# or:
3.my_times(Object.new.tap {|obj| def obj.call(i); puts "Hello from iteration #{i}" end })
# Hello from iteration 0
# Hello from iteration 1
# Hello from iteration 2
Now, I don't know about you, but I find that pretty ugly.
In Ruby 1.9, Ruby added Proc literals aka stabby lambda literals to the language. Lambda literals are a concise literal syntax for writing objects with a call method, specifically Proc objects with Proc#call.
Using lambda literals, and without any changes to our existing code, it looks something like this:
3.my_times(-> i { puts "Hello from iteration #{i}" })
# Hello from iteration 0
# Hello from iteration 1
# Hello from iteration 2
This does not look bad!
When Yukihiro "matz" Matsumoto designed Ruby almost thirty years ago in early 1993, he did a survey of the core libraries and standard libraries of languages like Smalltalk, Scheme, and Common Lisp to figure out how such methods that take a piece of code as an argument are actually used, and he found that the overwhelming majority of such methods take exactly one code argument and all they do with that argument is call it.
So, he decided to add special language support for a single argument that contains code and can only be called. This argument is both syntactically and semantically lightweight, in particular, it looks syntactically exactly like any other control structure, and it is semantically not an object.
This special language feature is, you probably guessed it, blocks.
Every method in Ruby has an optional block parameter. I can always pass a block to a method. It's up to the method to do anything with the block. Here, for example, the block is useless because Kernel#puts doesn't do anything with a block:
puts("Hello") { puts "from the block" }
# Hello
Because blocks are not objects, you cannot call methods on them. Also, because there can be only one block argument, there is no need to give it a name: if you refer to a block, it's always clear which block because there can be only one. But, if the block doesn't have methods and doesn't have a name, how can we call it?
That's what the yield keyword is for. It temporarily "yields" control flow to the block, or, in other words, it calls the block.
With blocks, our solution would look like this:
class Integer
def my_times(&action_to_be_executed)
raise ArgumentError, "`self` must be non-negative but is `#{inspect}`" if negative?
return enum_for(__callee__) unless block_given?
__my_times_helper(&action_to_be_executed)
end
protected
# the block parameter (&...) must be the last parameter in the list
def __my_times_helper(index = 0, &action_to_be_executed)
return if zero?
yield index
pred.__my_times_helper(index + 1, &action_to_be_executed)
end
end
3.my_times do |i|
puts "Hello from iteration #{i}"
end
# Hello from iteration 0
# Hello from iteration 1
# Hello from iteration 2
0.my_times do |i|
puts "Hello from iteration #{i}"
end
-1.my_times do |i|
puts "Hello from iteration #{i}"
end
# ArgumentError (`self` must be non-negative but is `-1`)
Okay, you might notice that I simplified a bit when I wrote above that the only thing you can do with a block is call it. There are two other things you can do with it:
You can check whether a block argument was passed using Kernel#block_given?. Since blocks are always optional, and blocks have no names, there must be a way to check whether a block was passed or not.
You can "roll up" a block (which is not an object and doesn't have a name) into a Proc object (which is an object) and bind it to a parameter (which gives it a name) using the & ampersand unary prefix sigil in the parameter list of the method. Now that we have an object, and a way to refer to it, we can store it in a variable, return it from a method, or (as we are doing here) pass it along as an argument to a different method, which otherwise wouldn't be possible.
There is also the opposite operation: with the & ampersand unary prefix operator, you can "unroll" a Proc object into a block in an argument list; this makes it so that the method behaves as if you had passed the code that is stored inside the Proc as a literal block argument to the method.
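The two uses of & described above can be sketched in isolation like this. The method names capture and run_three_times are hypothetical and exist only for this illustration.

```ruby
# & in a PARAMETER list: the anonymous block is "rolled up" into a Proc
# object and bound to the name `block`, so it can be returned or stored.
def capture(&block)
  block
end

# No & here: this method never names its block, it only yields to it.
def run_three_times
  results = []
  3.times { |i| results << yield(i) }
  results
end

square = capture { |n| n * n }
p square.class     # prints Proc
p square.call(4)   # prints 16

# & in an ARGUMENT list: the Proc is "unrolled" back into a block, as if
# { |n| n * n } had been written literally after the method call.
p run_three_times(&square)  # prints [0, 1, 4]
```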
And there you have it! That's what blocks are for: a semantically and syntactically lightweight form of passing code to a method.
There are other possible approaches, of course. The approach that is closest to Ruby is probably Smalltalk. Smalltalk also has a concept called blocks (in fact, that is where Ruby got both the idea and the name from). Similarly to Ruby, Smalltalk blocks have a syntactically light-weight literal form, but they are objects, and you can pass more than one to a method. Thanks to Smalltalk's generally light-weight and simple syntax, especially the keyword method syntax which intersperses parts of the method name with the arguments, even passing multiple blocks to a method call is very concise and readable.
For example, Smalltalk actually does not have an if / then / else conditional expression; in fact, Smalltalk has no control structures at all. Everything is done with methods. So, the way that a conditional works is that the two boolean classes True and False each have a method named ifTrue:ifFalse: which takes two block arguments, and the two implementations simply evaluate either the first or the second block. For example, the implementation in True might look a little bit like this (note that Smalltalk has no syntax for classes or methods; instead, classes and methods are created in the IDE by creating class objects and method objects via the GUI):
True>>ifTrue: trueBlock ifFalse: falseBlock
"Answer with the value of `trueBlock`."
↑trueBlock value
The corresponding implementation in False would then look like this:
False>>ifTrue: trueBlock ifFalse: falseBlock
"Answer with the value of `falseBlock`."
↑falseBlock value
And you would call it like this:
2 < 3 ifTrue: [ Transcript show: 'yes' ] ifFalse: [ Transcript show: 'no' ].
"yes"
4 < 3 ifTrue: [ Transcript show: 'yes' ] ifFalse: [ Transcript show: 'no' ].
"no"
In ECMAScript, you can simply use function definitions as expressions, and there is also lightweight syntax for functions.
In the various Lisps, code is just data, and data is code, so you can just pass the code as an argument as data, then inside the function, treat that data as code again.
Scala has call-by-name parameters which are only evaluated when you use their name, and they are evaluated every time you use their name. It would look something like this:
implicit class IntegerTimes(val i: Int) extends AnyVal {
@scala.annotation.tailrec
def times(actionToBeExecuted: => Unit): Unit = {
if (i < 0) throw new Error()
if (i == 0) () else { actionToBeExecuted; (i - 1).times(actionToBeExecuted) }
}
}
3.times { println("Hello") }
// Hello
// Hello
// Hello

Why can't I use curly braces with a `for` loop in Ruby?

The Ruby Documentation says that "do/end is equivalent to curly braces", so why is it that when I attempt to do the following I do not receive any output:
a = [1, 2, 3]
for i in a {
puts i
}
When I perform the above I receive no output (but I don't receive an error message either). However, when I do the following everything is as it should be:
a = [1, 2, 3]
for i in a do
puts i
end
#=> 1
#=> 2
#=> 3
I know this can be done more idiomatically with the each statement, but that's not what I'm asking. What am I not understanding here?
The Ruby Documentation says that "do/end is equivalent to curly braces"
No, it doesn't. It says (bold emphasis mine):
In this context, do/end is equivalent to curly braces […]
What "in this context" means is defined directly before the half-sentence you quoted:
do
Paired with end, can delimit a code block
So, "in this context" here refers to the context of a block.
so why is it that when I attempt to do the following I do not receive any output
Because this is a completely different context, again quoting from the documentation you linked to:
do can also (optionally) appear at the end of a for/in statement. (See for for an example.)
The "also" in that sentence makes it very clear that this is a different usage of the keyword do that has nothing to do with the usage discussed in this section. And if you look at the documentation of for, you can see that there is no mention of curly braces being allowed.
When I perform the above I receive no output (but I don't receive an error message either).
That is not true. Your code is syntactically invalid because it is missing the end keyword to end the for/in expression, therefore you get a "syntax error, unexpected end-of-input" on line 4:
ruby -c -e 'a = [1, 2, 3]
for i in a {
puts i
}'
# -e:4: syntax error, unexpected end-of-input
And if you add the missing end, you get a in `<main>': undefined method `a' for main:Object (NoMethodError) on line 2:
ruby -e 'a = [1, 2, 3]
for i in a {
puts i
}
end'
# -e:2:in `<main>': undefined method `a' for main:Object (NoMethodError)
Again, this is expected because curly braces delimit a code block, so
a {
puts i
}
is interpreted as a code block being passed to a and since variables cannot receive arguments, only methods can, a must be a method. Therefore, Ruby rightfully complains about not finding a method named a.
There are three ways of delimiting the iterator expression from the loop body expression in a for/in loop (and the same applies to while and until loops, actually):
An expression separator. An expression separator can either be
a semicolon ;
a newline
The keyword do
So, the following would all be valid fixes for your code:
# non-idiomatic
for i in a; puts i end
# non-idiomatic
for i in a
puts i end
# the same but with idiomatic indentation and whitespace
for i in a
puts i
end
# idiomatic
for i in a do puts i end
# redundant, non-idiomatic
for i in a do
puts i
end
Note, that when I say "idiomatic" above, that is to be interpreted relative, since actually for/in loops as a whole are completely non-idiomatic, and you would rather do this:
a.each do |i|
puts i
end
or maybe
a.each(&method(:puts))
It is in general preferred to not mix I/O and data transformation, so another idiomatic solution would be to transform the data to the desired output first, then output it, like this:
puts a.join("\n")
Except that Kernel#puts already treats Array arguments specially and prints each element on its own line (as documented at IO#puts), so the idiomatic solution for your code would be just:
puts a
Take a look at the documentation here: For loop
It states:
Like while and until, the do is optional. The for loop is similar to
using each, but does not create a new variable scope.
And also
The for loop is rarely used in modern ruby programs.
So, be less Pythonic :) and use Array#each instead:
a.each { |i| puts i }
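The scoping difference mentioned in the quoted documentation can be observed with a small experiment. This is only a sketch; the method names for_loop_leaks and each_block_is_scoped are made up for the demonstration.

```ruby
# Hypothetical demo: a for loop does not open a new variable scope, so the
# loop variable and anything assigned in the body survive the loop.
def for_loop_leaks
  for i in [1, 2, 3]
    seen = i
  end
  [defined?(i), defined?(seen), i]
end

# Hypothetical demo: the block passed to each has its own scope, so its
# parameter and its local variables are gone once each returns.
def each_block_is_scoped
  [1, 2, 3].each do |j|
    seen_in_block = j
  end
  [defined?(j), defined?(seen_in_block)]
end

p for_loop_leaks        # prints ["local-variable", "local-variable", 3]
p each_block_is_scoped  # prints [nil, nil]
```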

What does |lang| part mean in this each method? Where does this come from?

I am reading through Chris Pine's Learn To Program chapter 7 Arrays and Iterators.
He introduces the each method with the following example:
languages = ['English', 'German', 'Ruby']
languages.each do |lang|
puts 'I love ' + lang + '!'
puts 'Don\'t you?'
end
puts 'And let\'s hear it for C++!'
puts '...'
It's not hard to understand how it works overall, but I can't figure out where the |lang| part is coming from so out of blue. Shouldn't it be assigned/named or something before it can be used like this? So the computer can know what the "lang" refers to? Does || do something wrapping around lang? Or does ruby just know what lang means?
I am afraid the question is too basic, but I am hoping someone might help me just a bit...
lang is a variable that holds one element from the languages array. Any variable between the pipes || receives a single element from the array. So, every time the loop executes, the next element of the array is passed to the block (the array itself is not modified) and held in a variable named lang, and the data held by lang is displayed using the puts method.
The each method yields every element one by one and it gets assigned to the variable lang.
Internally, the each method is implemented something like this:
# simplified sketch of each as an instance method of Array
def each
index = 0
while index < length
yield self[index] # hands the current element to the block
index += 1
end
self
end
|lang| is a block variable. If you strip down your code, you can see that the .each method is iterating over the languages array and assigning array elements to the block variable:
languages = ['English', 'German', 'Ruby']
languages.each do |lang|
puts lang
end
#=> English
#=> German
#=> Ruby
Multi-line blocks use a do/end syntax (as in your example), and single-line blocks use a braces syntax. For example:
languages = ['English', 'German', 'Ruby']
languages.each { |lang| puts lang}
It sounds like, in the above example, you created an array storing multiple language variables.
You then iterated over all three elements in the array and represented each one with a variable called lang.
lang, which is inside the pipes, is simply a variable.
Hope this helped you

Can someone explain Ruby's use of pipe characters in a block?

Can someone explain to me Ruby's use of pipe characters in a block? I understand that it contains a variable name that will be assigned the data as it iterates. But what is this called? Can there be more than one variable inside the pipes? Anything else I should know about it? Any good links to more information on it?
For example:
25.times { | i | puts i }
Braces define an anonymous function, called a block. Tokens between the pipes are the parameters of this block. The number of parameters required depends on how the block is used. Each time the block is evaluated, the method requiring the block will pass a value based on the object calling it.
It's the same as defining a method, only it's not stored beyond the method that accepts a block.
For example:
def my_print(i)
puts i
end
will do the same as this when executed:
{|i| puts i}
the only difference is the block is defined on the fly and not stored.
Example 2:
The following statements are equivalent
25.times &method(:my_print)
25.times {|i| puts i}
We use anonymous blocks because the majority of functions passed as a block are usually specific to your situation and not worth defining for reuse.
So what happens when a method accepts a block? That depends on the method. Methods that accept a block will call it by passing values from their calling object in a well defined manner. What's returned depends on the method requiring the block.
For example: in 25.times {|i| puts i}, times calls the block once for each value from 0 up to, but not including, the value of its caller (so 0 through 24), passing the value into the block as the temporary variable i. times returns the value of the calling object, in this case 25.
Let's look at method that accepts a block with two arguments.
{:key1 => "value1", :key2 => "value2"}.each {|key,value|
puts "This key is: #{key}. Its value is #{value}"
}
In this case each calls the block once for each key/value pair, passing the key as the first argument and the value as the second argument.
The pipes specify arguments that are populated with values by the function that calls your block. There can be zero or more of them, and how many you should use depends on the method you call.
For example, each_with_index uses two variables and puts the element in one of them and the index in the other.
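As a small sketch of the two-variable case, each_with_index can be used like this. The helper name label_items is made up for the example.

```ruby
# Hypothetical helper: each_with_index passes two values to the block,
# the element first and its zero-based index second.
def label_items(items)
  lines = []
  items.each_with_index { |item, index| lines << "#{index}: #{item}" }
  lines
end

puts label_items(%w[English German Ruby])
# 0: English
# 1: German
# 2: Ruby
```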
here is a good description of how blocks and iterators work
Block arguments follow all the same conventions as method parameters (at least as of 1.9): you can define optional arguments, variable length arg lists, defaults, etc. Here's a pretty decent summary.
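A brief sketch of those parameter conventions in action follows. The method yield_three is hypothetical; it simply yields three values so the different parameter lists have something to receive.

```ruby
# Hypothetical method that always passes three values to its block.
def yield_three
  yield 1, 2, 3
end

# Variable-length (splat) parameter: the first value, then the rest.
p(yield_three { |first, *rest| [first, rest] })          # prints [1, [2, 3]]

# Default value: used because only three values are passed.
p(yield_three { |a, b, c, d = :default| [a, b, c, d] })  # prints [1, 2, 3, :default]

# Unlike a method, an ordinary block does not raise when the counts differ:
# extra values are dropped and missing ones become nil.
p(yield_three { |a| a })                                 # prints 1
p(yield_three { |a, b, c, d| d })                        # prints nil
```

The last two calls show one place where blocks are more lenient than methods, which would raise ArgumentError on the same mismatch.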
Some things to be aware of: because blocks see variables in the scope they were defined in, if you pass in an argument with the same name as an existing variable, it will "shadow" it: your block will see the passed-in value and the original variable will be unchanged.
i = 10
25.times { | i | puts i }
puts i #=> prints '10'
Will print '10' at the end. Because sometimes this is desirable behavior even if you are not passing in a value (ie you want to make sure you don't accidentally clobber a variable from surrounding scope) you can specify block-local variable names after a semicolon after the argument list:
x = 'foo'
25.times { | i ; x | puts i; x = 'bar' }
puts x #=> prints 'foo'
Here, 'x' is local to the block, even though no value is passed in.

Ruby while syntax

Does anybody know why I can write this:
ruby-1.8.7-p302 > a = %w( a b c)
=> ["a", "b", "c"]
ruby-1.8.7-p302 > while (i = a.shift) do; puts i ; end
a
b
c
=> nil
Which looks like passing a block to while.
And not:
while(i = a.shift) { puts i; }
Is it because the "do" of the while syntax is just syntactic sugar and has nothing to do with the "do" of a block?
Is it because the do of the while syntax is just syntactic sugar and has nothing to do with the do of a block?
More or less, yes. It's not syntactic sugar, it's simply a built-in language construct, like def or class, as @meagar already wrote.
It has nothing to do with the do of a block, except that keywords are expensive and so reusing keywords makes sense. (By "expensive" I mean that they limit the programmer in his expressiveness.)
In a while loop, there are two ways to separate the block from the condition:
the do keyword and
an expression separator.
There are, in turn, two different expression separators in Ruby:
the semicolon ; and
a newline
So, all three of the following are valid:
while i = a.shift do puts i end # do
while i = a.shift; puts i end # semicolon
while i = a.shift
puts i end # newline
[Obviously, that last one wouldn't be written that way, you would put the end on a new line, dedented to match the while. I just wanted to demonstrate what is the minimum needed to separate the parts of the while loop.]
By the way: it is highly un-idiomatic to put the condition in parentheses. There are also a lot of superfluous semicolons in your code. And the variable name i is usually reserved for an index, not an element. (I normally use el for generic elements, but I much prefer more semantic names.)
It is also highly un-idiomatic to iterate a collection manually. Your code would be much better written as
a.each(&method(:puts)).clear
Not only is it much easier to understand what this does (print all elements of the array and delete all items from it), it is also much easier to write (there is no way to get the termination condition wrong, or screw up any assignments). It also happens to be more efficient: your version is Θ(n²), this one is Θ(n).
And actually, that's not really how you would write it, either, because Kernel#puts already implements that behavior, anyway. So, what you would really write is this
puts a
a.clear
or maybe this
a.tap(&method(:puts)).clear
[Note: this very last one is not 100% equivalent. It prints a newline for an empty array, all the other ones print nothing.]
Simple. Clear. Concise. Expressive. Fast.
Compare that to:
while (i = a.shift) do; puts i ; end
I actually had to run that multiple times to be 100% clear what it does.
while doesn't take a block, it's a language construct. The do is optional:
while (i = a.shift)
puts i
end

Resources