As the title says, I'm getting confused about the difference between blocks and "normal" pieces of code that follow a statement.
I usually enclose the body of a while loop in Ruby with do...end, e.g.
while true do
# ...
end
But is it a block? If yes, why? If no, why is it declared as we declare blocks with methods?
while is not a method but a keyword. This allows it to have its very own syntax (e.g. to make the do optional) and evaluation rules.
The body of a while loop is not a block. Blocks are sent to methods and while is not a method. You can't pass a block argument to while (as in while(true, &block)) or refer to the body of a while loop or pass it around. It also doesn't create a new variable scope like blocks do.
why is it declared as we declare blocks with methods?
I assume it's out of convenience. Why would you use a different syntax if there's already do and end? Re-using the same syntax for similar constructs (the body of a while loop, the body of a block, the body of a method etc.) makes it easier to read and write code.
Related
I have the following ruby method: a single do iteration without any break, next, or return. Here, cats is an array of cat objects; is_cat_red evaluates to true if cat has a color property of red.
def get_non_red_cats cats
cats.each do |cat|
!is_cat_red?(cat)
end
end
What does the method return (what does the loop evaluate to)?
This is some unusual code and it depends entirely on what the cats method does. You can pass a block to any Ruby method and that method can get executed zero more more times at any point between immediately and the end of the program's execution.
The return value is whatever cats returns, which is not clear from this snippet.
Imagine this in JavaScript terms as that language is a lot less ambiguous:
function get_non_red_cats(cats) {
return cats(function(cat) {
return !is_cat_red?(cat);
}
}
Where this shows that cats is just a function that, potentially, takes a function. It might ignore your function, too.
Now if this is cats.each that changes things as that's probably the Enumerable each method which has well-defined behaviour.
In that case the return value is whatever cats is.
There is no loop in your code. Ruby has two kinds of loops: while and for/in. (Actually, the latter is just syntactic sugar for each.)
In Ruby, an expression evaluates to the value of the last sub-expression evaluated inside the expression. A message send evaluates to the return value of the method that was executed as a result of the message send. The return value of a method is either explicitly the value of the return expression that ended the method execution or implicitly the value of the last expression evaluated inside the method body. (Note that the last expression evaluated inside the body is also what a module or class definition expression evaluates to. A method definition expression however evaluates to a Symbol denoting the name of the method.)
So, what does get_non_red_cats return? Well, there is no return in it, so it returns the value of the last expression evaluated inside the method body. The last expression evaluated inside the method body is a message send of the message each to the object referenced by the parameter binding cats. Ergo, the return value of get_non_red_cats is the return value of the method that gets executed as a result of sending the each message to cats.
And that is all we positively know.
We can make some assumptions, though. In general, each should return self. That's what all implementations of each in the entire core library and standard library do, and it is part of the standard "Iterable" Protocol in Ruby. It would be highly unusual and highly confusing if that were not the case. So, we can assume that whatever implementation of each ends up being executed, it will return self, i.e. the receiver of the message send, i.e. the object referenced by the parameter binding cats.
In other words: the method get_non_red_cats simply returns whatever was passed in as an argument. It is a pretty boring method. In fact, it is the identity method, which is pretty much the most boring method possible.
However, it could have a side-effect. You didn't ask about side-effects, only the return value, but let's look at it anyway.
Since each is supposed to simply return its receiver, it is in some sense also an identity method and thus extremely boring. However, each is generally supposed to evaluate the block it is passed, passing each element of the collection in turn as an argument. But, it ignores the value that the block evaluates to; the block is evaluated purely for its side-effect. Note that each with a block that has no side-effect makes no sense whatsoever. If the block has no side-effect, then the only thing interesting about the block is its value, but each ignores the block's value, and simply returns self.
foo.each do
# something that has no side-effect
end
is fully equivalent to
foo
Another Ruby convention is that message sends that end in a question mark ? should be used for asking questions (duh!) I.e. a message send that ends in a question mark should return something that is suitable to used as a conditional. It also generally shouldn't have a side-effect. (This is called the Command-Query Separation Principle and is a fundamental design principle of Object-Oriented Software Construction.)
And lastly, the ! unary prefix operator, when applied to something that is intended to be used in a conditional (i.e. a boolean value or something equivalent) is generally not supposed to have side-effect. Ergo, since the message send in the block ends with a question mark, it is not supposed to have a side-effect, and the ! operator is also not supposed to have a side-effect, we can assume that the entire block has no side-effect.
This, in turn, means that each shouldn't have a side-effect, and thus get_non_red_cats doesn't have a side-effect. As a result, the only other thing get_non_red_cats can do, is return a value, and it very likely simply returns the value that was passed in.
Ergo, the entire method is equivalent to
def get_non_red_cats(cats)
cats
end
All of this is assuming that the author followed standard Ruby conventions. If she didn't, then this method could do and return anything whatsoever, it could format your harddrive, launch a nuclear attack, return 42, or do absolutely nothing at all.
I have the following code:
# Assuming each element in foo is an array.
foo.each do |bar|
zaz = bar.first
# Other code using zaz, but not modifying it.
end
Will zaz local variable be modified on each iteration inside this loop, making it mutable? I am not sure about the behavior of Ruby here.
It depends on the code before the loop, really.
If that is all the code, then zaz is a block-local variable, and a new zaz variable will be created every time the loop body is evaluated.
If, however, there is a zaz local variable in the surrounding scope, then zaz is a free variable in the block, and since block scopes nest in their surrounding scope, the existing zaz variable outside the block will be re-assigned over and over again, every time the block is evaluated.
You can ensure that zaz is always treated as a block-local variable and never looked up in the surrounding scope, by explicitly declaring it as a block-local variable in the block's parameter list:
foo.each do |bar; zaz|
zaz = bar.first
end
Note, however, that your code only makes sense IFF your code is impure and mutable:
You assign to zaz but never actually use it inside the block. So, the only way that this makes sense at all is if zaz is a local variable in the outer scope and you are assigning it. Although in that case, your entire loop is just equivalent to zaz = foo.last.first.
each evaluates the block only for its side-effects. Without side-effects, each makes no sense at all, so the fact that you are using each implies that you have side-effects.
Note that the term "immutable" without additional qualification usually refers to values. When talking about "immutable variables", we usually say "immutable variable" explicitly, to make clear that we are only talking about whether or not a variable can be re-bound, not about mutating object state. Or, one could just say "constant", which is the technical term for "immutable variable" … although that term already has a specific meaning in Ruby.
each loops often mutate the object. Each has to do something.
Because each doesn't return anything useful - it returns the array itself, It won't mutate the object if it sends every element somewhere, like to be printed on screen.
foo.each do |bar|
# do something with element like
# print it, change it, save it
end
Functional alterantive is map
foo.map { |bar| bar.something }
It returns new array which is original array processed in immutable way. Obviously you have to be careful to use immutable methods. This would not be immutable:
foo.map { |bar| bar.something! }
Here something! does something destructive to the element of array.
However I have never seen map used like that. Use each for something destructive.
I have a need to decipher some Ruby code. Being a Python dev, I am having hard time making sense to some of the syntax.
I need to deal with some (mostly clean and readable) Sinatra code. I started with a Sinatra tutorial, and it looks something like this:
get '/' do
"Hello, World!"
end
Now, I know that in Ruby you don't need parentheses to call a function. So if I were to try to understand the above, I would say:
get is a function that takes as its first argument the route.
'/' is the first argument
do ... end block is an anonymous function
Please correct me if I am wrong above, and explain in detail anything I might be missing.
Also they say that Sinatra is a DSL -- does this mean that it is parsing some special syntax that is not official Ruby?
do ... end (or { ... }) is a block, a very important concept in Ruby. It was noticed that very often functions that take other functions as parameter (map, filter, grep, timeout...) very often accept a single function. So the Ruby designer decided to make a special syntax for it.
It is often said that in Ruby, everything is an object. This is not quite true: code is not an object. Code can be wrapped into an object. But Ruby blocks are pure code - not an object, not a first-order value at all. Blocks are a piece of code associated with a function call.
Your code snippet is equivalent to this:
self.get('/') do
return "Hello, World!"
end
The get method takes one parameter and a block; not two parameters. In a hypothetical example where get did take two parameters, we would have to write something like this:
get('/', lambda { "Hello, World" })
or
get('/', Proc.new { "Hello, World" })
but notice that the way we wrap code into objects involves calling methods lambda and Proc.new - and giving them a block (and zero parameters)!
There are many tutorials on "Ruby blocks", so I will not link any particular one.
Because of the block syntax, Ruby is very good at making dialects (still fully syntactic Ruby) that express certain concepts very neatly. Sinatra uses the get... "syntax" (but actually just a method call) to describe a web server; Rake uses task... "syntax" to describe build processes; RSpec, a testing framework, has its own DSL (that is still Ruby) that describes desired behaviours.
After some reading, I understood the code blocks.
Ruby code blocks are simple. They are 'closures'. There's two ways to write a block
do |x|
do_something(x)
end
{|x| do_something(x) }
The |x| is the argument that gets passed to the code within the block.
The crucial bit to grasping code blocks is to understand how they are used with methods.
In Ruby, methods are a bit different.
In addition to arguments, any method can accept a code block.
Code blocks are NOT arguments, but they are a separate entity that can be passed to a method along with arguments
A method can choose not to call the code block, in which case, any code block that was passed is ignored
If a method calls a code block, then it is necessary to pass it when calling the method, or otherwise Ruby will complain.
yield within a method executes the code block
For more on code blocks read this: http://mixandgo.com/blog/mastering-ruby-blocks-in-less-than-5-minutes
I had a test that did this:
expect(#parser.parse('adsadasdas')).to raise_error(Errno::ENOENT)
and it didn't work. I changed to:
expect { #parser.parse('adsadasdas') }.to raise_error(Errno::ENOENT)
And it worked.
When do we use curly braces and when do we use parentheses with expect?
In response to OP's comment, I've edited and completely rewritten my answer. I realize that my original answer was oversimplified, so much so that it could be considered incorrect.
Your question was actually addressed somewhat by this other StackOverflow question.
One poster, Peter Alfvin, makes a good point when he says:
As for rules, you pass a block or a Proc if you're trying to test
behavior (e.g. raising errors, changing some value). Otherwise, you
pass a "conventional" argument, in which case the value of that
argument is what is tested.
The reason you're encountering the phenomenon you're seeing has to do with the raising of errors. When you pass #parser.parse('adsadasdas') as an argument (use parentheses) to expect, you are essentially telling ruby:
Evaluate #parser.parse('adsadasdas') first.
Take the result and pass this to expect.
expect should see if this result matches my expectation (that is, that Errno:ENOENT will be raised).
But, what happens is: when ruby evaluates #parser.parse('adsadasdas'), an error is raised right then and there. Ruby doesn't even get a chance to pass the result on to expect. (For all we care, you could have passed #parser.parse('adsadasdas') as an argument to any function... like multiply() or capitalize()) The error is raised, and expect never even gets a chance to do its work.
But when you pass #parser.parse('adsadasdas') as a proc (a code block) to expect using curly braces, what you are telling ruby is this:
expect, get ready to do some work.
expect, I would like you to keep track of what happens as we evaluate #parser.parse('adsadasdas').
Ok, expect, did the code block that was just evaluated raise a Errno:ENOENT error? I was expecting that it would.
When you pass a code block to expect, you are telling expect that you want it to examine the resulting behavior, the changes, made by your code block's execution, and then to let you know if it meets up to the expectations that you provide it.
When you pass an argument to expect, you are telling ruby to evaluate that argument to come to some value before expect even gets involved, and then you are passing that value to expect to see if it meets up to some expectation.
TL;DR: use expect(exp) to specify something about the value of exp and use expect { exp } to specify a side effect that occurs when exp is executed.
Let's unpack this a bit. Most of RSpec's matchers are value matchers. They match (or not) against any ruby object. In contrast, a handful of RSpec's matchers can only be matched against a block, because they have to observe the block while it's running in order to operate properly. These matchers concern side effects that take place (or not) while the block executes. The matcher would have no way to tell if the named side effect had occurred unless it is passed a block to execute. Let's consider the built-in block matchers (as of RSpec 3.1) one-by-one:
raise_error
Consider that one can return an exception from a method, and that is different than raising the exception. Raising an exception is a side effect, and can only be observed by the matcher by it executing the block with an appropriate rescue clause. Thus, this matcher must receive a block to work properly.
throw_symbol
Throwing symbols is similar to raising errors -- it causes a stack jump and is a side effect that can only be observed by running a block inside an appropriate catch block.
change
Mutation to state is a side effect. The matcher can only tell if there was a change to some state by checking the state before hand, running the block, then checking the state after.
output
I/O is a side effect. For the output matcher to work, it has to replace the appropriate stream ($stdout or $stderr) with a new StringIO, execute the block, restore the stream to its original value, and then check the contents of theStringIO`.
yield_control/yield_with_args/yield_with_no_args/yield_with_successive_args
These matchers are a bit different. Yielding isn't really a side effect (it's really just syntactic sugar for calling another function provided by the caller), but yielding can't be observed by looking at the return value of the expression. For the yield matchers to work, they provide a probe object that you pass on to the method-under-test as a block using the &probe syntax:
expect { |probe| [1, 2, 3].each(&probe) }.to yield_with_successive_args(1, 2, 3)
What do all these matchers have in common? None of them can work on simple ruby values. Instead, they all have to wrap a block in an appropriate context (i.e. rescuing, catching or checking before/after values).
Note that in RSpec 3, we added some logic to provide users clear errors when they use the wrong expect form with a given matcher. However, in the specific case of expect(do_something).to raise_error, there's nothing we can do to provide you a clear explanation there -- if do_something raises an error (as you expect it to...), then the error is raised before ruby evaluates the to argument (the raise_error matcher) so RSpec has no way to check with the matcher to see if supports value or block expectations.
in short:
use curly-brace (a block): when you want to test the behavior
use parenthesis when you want to test the returned value
worth reading: As for rules, you pass a block or a Proc if you're trying to test behavior (e.g. raising errors, changing some value). Otherwise, you pass a "conventional" argument, in which case the value of that argument is what is tested. - from this answer
In the test written with parentheses, the code is executed normally, including all normal error handling. The curly-brace syntax defines a block object upon which you can place the expectation. It encapsulates the code you expect to be broken and allows rspec to catch the error and provide its own handling (in this case, a successful test).
You can think of it this way as well: with the parentheses, the code is executed before being passed to the expect method, but with the block, expect will run the code itself.
What does block in Ruby mean? It looks similar with Smalltalk, but you can't send messages to it.
For example, in smalltalk:
[:x | x + 3] value: 3
returns 6. But in ruby:
{|x| x + 3}.call 3
will cause SyntaxError.
Well, you can pass messages to lambda in ruby, though:
irb(main):025:0> ->(x){x+3}.call 3
=> 6
So in Ruby, block is not a block, but lambda is a block? Is this true? I mean, are there any differences between ruby lambda and smalltalk block? If this is true, then what is a ruby block?
Update:
From the comment and answer below, together with some googling, I guess I
have more understanding of Ruby block. In Ruby, usually a piece of code evaluates an value, and every value is an object. But, block doesn't evaluate an value. So it's not an object. Instead it can act as part of an object. For example, in {|x| x + 3} can act as a part of the object proc {|x| x + 3 }.
But it did confuse me. In smalltalk, almost every expression can be divided into objects (binding to variables are exceptions). It seems in Ruby, there are more exceptions.
First and the most important thing that Ruby block isn't: an object. It is a syntactic construct, and also obviously has an equivalent implementation - but it is not an object, and thus can't receive messages. Which makes your example of
{|x| x + 3}.call 3
ungrammatical. Lambdas, procs - those are objects that wrap a block, and have a call method which executes the block.
Thus, a block is simply a piece of code which can be passed to a method, outside the argument list - no more, no less. If you pass it to Proc.new constructor, for example, it will wrap it and give you an object you can handle:
Proc.new {|x| x + 3}.call 3
A precision:
I would even say that in smalltalk even binding is made up with object.
Think of the MethodContext.
What you are actually doing is to store the object in the MethodContext.
So
a := Object new
Can be rewrite in:
thisContext at: 1 put: Object new.
But obviously you wont write it this way since you need to know were are the temps variable.
A block in Smalltalk is an anonymous object. Syntactically, it is delimited by a [ ... ] pair.
When evaluated, it will return the last expression evaluated within itself, and there are lots of methods in its protocol.
Here are the Class comments for Blocks from a Smalltalk (in this instance, Dolphin Smalltalk 6.03 Community Edition)
"Blocks encapsulate a sequence of statements to be performed at a later time. Blocks may capture (or "close over") runtime state, such as the values of temporary variables, from the enclosing lexical scope at the point where they are created. When evaluated a block executes as if in the lexical scope in which it was defined, except that blocks may have arguments that are bound at the time of evaluation. Blocks may be passed as arguments with messages to other objects and evaluated by those objects when appropriate, and thus form a very powerful and generic "pluggability" mechanism that is a core feature which provides much of the power of Smalltalk".
By contrast, a block in Ruby is simply a parameter string. It's syntactically delimited by a { ... } pair, but it has no methods of its own.