Ruby loop local variables and immutability - ruby

I have the following code:
# Assuming each element in foo is an array.
foo.each do |bar|
zaz = bar.first
# Other code using zaz, but not modifying it.
end
Will the zaz local variable be modified on each iteration inside this loop, making it mutable? I am not sure how Ruby behaves here.

It depends on the code before the loop, really.
If that is all the code, then zaz is a block-local variable, and a new zaz variable will be created every time the loop body is evaluated.
If, however, there is a zaz local variable in the surrounding scope, then zaz is a free variable in the block, and since block scopes nest in their surrounding scope, the existing zaz variable outside the block will be re-assigned over and over again, every time the block is evaluated.
You can ensure that zaz is always treated as a block-local variable and never looked up in the surrounding scope, by explicitly declaring it as a block-local variable in the block's parameter list:
foo.each do |bar; zaz|
zaz = bar.first
end
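The three cases described above can be sketched as follows. This is a minimal illustration, not the asker's real code: the constant FOO and the method names are hypothetical, chosen only so that each case runs in its own scope.

```ruby
# Hypothetical sample data: each element of FOO is an array.
FOO = [[1, 2], [3, 4], [5, 6]]

# Case 1: no zaz in the surrounding scope, so zaz lives only inside the block.
def without_outer_variable
  FOO.each { |bar| zaz = bar.first }
  defined?(zaz) # nil, because zaz never existed in the method's scope
end

# Case 2: zaz exists outside, so the block re-assigns that outer variable.
def with_outer_variable
  zaz = nil
  FOO.each { |bar| zaz = bar.first }
  zaz # holds the value from the last iteration
end

# Case 3: zaz is declared block-local, so the outer zaz is never touched.
def with_block_local
  zaz = :untouched
  FOO.each { |bar; zaz| zaz = bar.first }
  zaz
end
```

Under these assumptions, without_outer_variable returns nil, with_outer_variable returns 5 (the same as FOO.last.first), and with_block_local returns :untouched.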
Note, however, that your code only makes sense IFF your code is impure and mutable:
You assign to zaz but never actually use it inside the block. So, the only way that this makes sense at all is if zaz is a local variable in the outer scope and you are assigning it. Although in that case, your entire loop is just equivalent to zaz = foo.last.first.
each evaluates the block only for its side-effects. Without side-effects, each makes no sense at all, so the fact that you are using each implies that you have side-effects.
Note that the term "immutable" without additional qualification usually refers to values. When talking about "immutable variables", we usually say "immutable variable" explicitly, to make clear that we are only talking about whether or not a variable can be re-bound, not about mutating object state. Or, one could just say "constant", which is the technical term for "immutable variable" … although that term already has a specific meaning in Ruby.

each loops often mutate the object, because each has to do something.
Since each doesn't return anything useful (it returns the array itself), the block is only there for its effects. It won't mutate the object if it merely sends every element somewhere, for example to be printed on screen.
foo.each do |bar|
# do something with element like
# print it, change it, save it
end
The functional alternative is map:
foo.map { |bar| bar.something }
It returns a new array, which is the original array processed in an immutable way. Obviously you have to be careful to use non-mutating methods. This would not be immutable:
foo.map { |bar| bar.something! }
Here something! does something destructive to each element of the array.
However, I have never seen map used like that. Use each for something destructive.
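As a concrete sketch of the difference, here String#upcase and String#upcase! stand in for the placeholder methods something and something!. The method names and sample words are made up for illustration.

```ruby
# map with a non-mutating method: the original strings stay as they were.
def map_leaves_originals
  words = ["apple".dup, "banana".dup] # dup so the strings are safely mutable
  upcased = words.map { |w| w.upcase }
  [words, upcased]
end

# each with a destructive method: the elements are changed in place,
# and each simply hands back the receiver.
def each_with_bang_mutates
  words = ["apple".dup, "banana".dup]
  returned = words.each { |w| w.upcase! }
  [words, returned.equal?(words)]
end
```

The first method returns the untouched originals next to a new upcased array. The second returns the mutated array, together with true, because each returned the very same array object.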

Related

To which level returns a return inside a Proc Object in Ruby?

As I understood return inside a Proc terminates the current method. So in the following example I would expect to see:
a1 > b1 > proc > a2. But actually it never reaches a2, why?
def a
puts "a1"
l = Proc.new {puts "proc"; return}
b l
puts "a2"
end
def b x
puts "b1"
x.call
puts "b2"
end
a
As a general rule, return always returns from the closest lexically enclosing method definition expression.
In this case, the closest lexically enclosing method definition expression is def a, therefore, return returns from a.
It does not actually matter that the return is inside a block in this case. The general rule is, well, general, so it applies regardless of where the return appears.
If we look more specifically at blocks, though, we can see that it still makes sense: in blocks, local variables are captured lexically, self is captured lexically, so it makes sense that return also behaves lexically. It is a general property of blocks that if you want to understand what is going on in a block, you only need to look lexically outwards.
And if we get even more specific, first going from the general rule to blocks, and now from blocks to Procs, the behavior still makes sense: a Proc is essentially a reified block, so it makes sense for a Proc to behave like a block.
There are some exceptions to the general rule, though, and an important one is lambdas. Talking about lambdas in Ruby is always a little bit weird because lambdas are Procs but they behave differently from Procs. IMO, lambdas should have a separate class alongside Procs. Since lambdas are Procs, it makes it weird to talk about the differences between lambdas and Procs which are not lambdas (which don't have a standardized name and thus are confusingly also called Procs).
The behavior of a lambda differs from the behavior of a non-lambda Proc in two ways, one of which is relevant to your question:
Parameter binding in non-lambda Procs has the same semantics as parameter binding in blocks, whereas parameter binding in lambdas has the same semantics as parameter binding in message sends / method invocations.
In non-lambda Procs, return returns from the closest lexically enclosing method definition expression, just like in blocks, whereas in lambdas, return returns from the lambda itself, just like return in methods.
So, in both of these aspects, non-lambda Procs behave like blocks and lambdas behave like methods. I memorize it like this: "Proc" rhymes with "block" and both "lambda" and "method" are Greek.
As you probably know, there are some methods which also alter the behavior of blocks that are passed to them. E.g. instance_eval and instance_exec change the value of self, and define_method actually does change the behavior of return.
But since you didn't ask about blocks in general, and also didn't ask about lambdas specifically, and there are no reflective methods in your question, the general rule still applies to non-lambda Procs like the one shown in your question: return returns from the closest lexically enclosing method definition expression.
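To make the contrast between a non-lambda Proc and a lambda visible, here is a small sketch modelled on the question's a and b methods. The method names and the trace array are hypothetical additions; the trace records which steps ran instead of printing them.

```ruby
# Plays the role of b in the question: calls whatever it is given.
def invoke(callable, trace)
  trace << :b1
  callable.call
  trace << :b2
end

# return inside a non-lambda Proc returns from with_proc itself,
# unwinding through invoke on the way out.
def with_proc
  trace = [:a1]
  pr = Proc.new { trace << :proc; return trace }
  invoke(pr, trace)
  trace << :a2 # never reached
  trace
end

# return inside a lambda only returns from the lambda.
def with_lambda
  trace = [:a1]
  l = lambda { trace << :lambda; return }
  invoke(l, trace)
  trace << :a2
  trace
end
```

with_proc yields [:a1, :b1, :proc], so neither b2 nor a2 is reached, matching the behavior observed in the question. with_lambda yields [:a1, :b1, :lambda, :b2, :a2].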

What does a ruby 'do' iteration loop evaluate to?

I have the following Ruby method: a single do iteration without any break, next, or return. Here, cats is an array of cat objects; is_cat_red? evaluates to true if cat has a color property of red.
def get_non_red_cats cats
cats.each do |cat|
!is_cat_red?(cat)
end
end
What does the method return (what does the loop evaluate to)?
This is some unusual code and it depends entirely on what the cats method does. You can pass a block to any Ruby method, and that block can get executed zero or more times at any point between immediately and the end of the program's execution.
The return value is whatever cats returns, which is not clear from this snippet.
Imagine this in JavaScript terms as that language is a lot less ambiguous:
function get_non_red_cats(cats) {
return cats(function(cat) {
return !is_cat_red(cat);
});
}
Where this shows that cats is just a function that, potentially, takes a function. It might ignore your function, too.
Now, if this is cats.each, that changes things, as that's probably the each method of an enumerable collection such as Array, which has well-defined behaviour.
In that case the return value is whatever cats is.
There is no loop in your code. Ruby has two kinds of loops: while and for/in. (Actually, the latter is just syntactic sugar for each.)
In Ruby, an expression evaluates to the value of the last sub-expression evaluated inside the expression. A message send evaluates to the return value of the method that was executed as a result of the message send. The return value of a method is either explicitly the value of the return expression that ended the method execution or implicitly the value of the last expression evaluated inside the method body. (Note that the last expression evaluated inside the body is also what a module or class definition expression evaluates to. A method definition expression however evaluates to a Symbol denoting the name of the method.)
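These evaluation rules can be checked directly. The following sketch uses made-up names (greet, Demo, and the two constants) purely for illustration.

```ruby
# A method definition expression evaluates to a Symbol naming the method.
METHOD_DEF_VALUE = def greet
  "hello" # implicit return value: the last expression evaluated
end

# A class definition expression evaluates to the last expression in its body.
CLASS_DEF_VALUE = class Demo
  42
end
```

Here METHOD_DEF_VALUE is :greet, CLASS_DEF_VALUE is 42, and calling greet returns "hello" without any explicit return.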
So, what does get_non_red_cats return? Well, there is no return in it, so it returns the value of the last expression evaluated inside the method body. The last expression evaluated inside the method body is a message send of the message each to the object referenced by the parameter binding cats. Ergo, the return value of get_non_red_cats is the return value of the method that gets executed as a result of sending the each message to cats.
And that is all we positively know.
We can make some assumptions, though. In general, each should return self. That's what all implementations of each in the entire core library and standard library do, and it is part of the standard "Iterable" Protocol in Ruby. It would be highly unusual and highly confusing if that were not the case. So, we can assume that whatever implementation of each ends up being executed, it will return self, i.e. the receiver of the message send, i.e. the object referenced by the parameter binding cats.
In other words: the method get_non_red_cats simply returns whatever was passed in as an argument. It is a pretty boring method. In fact, it is the identity method, which is pretty much the most boring method possible.
However, it could have a side-effect. You didn't ask about side-effects, only the return value, but let's look at it anyway.
Since each is supposed to simply return its receiver, it is in some sense also an identity method and thus extremely boring. However, each is generally supposed to evaluate the block it is passed, passing each element of the collection in turn as an argument. But, it ignores the value that the block evaluates to; the block is evaluated purely for its side-effect. Note that each with a block that has no side-effect makes no sense whatsoever. If the block has no side-effect, then the only thing interesting about the block is its value, but each ignores the block's value, and simply returns self.
foo.each do
# something that has no side-effect
end
is fully equivalent to
foo
Another Ruby convention is that message sends that end in a question mark ? should be used for asking questions (duh!). I.e. a message send that ends in a question mark should return something that is suitable to be used as a conditional. It also generally shouldn't have a side-effect. (This is called the Command-Query Separation Principle and is a fundamental design principle of Object-Oriented Software Construction.)
And lastly, the ! unary prefix operator, when applied to something that is intended to be used in a conditional (i.e. a boolean value or something equivalent), is generally not supposed to have a side-effect. Ergo, since the message send in the block ends with a question mark, it is not supposed to have a side-effect, and the ! operator is also not supposed to have a side-effect, we can assume that the entire block has no side-effect.
This, in turn, means that each shouldn't have a side-effect, and thus get_non_red_cats doesn't have a side-effect. As a result, the only other thing get_non_red_cats can do, is return a value, and it very likely simply returns the value that was passed in.
Ergo, the entire method is equivalent to
def get_non_red_cats(cats)
cats
end
All of this is assuming that the author followed standard Ruby conventions. If she didn't, then this method could do and return anything whatsoever, it could format your harddrive, launch a nuclear attack, return 42, or do absolutely nothing at all.
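A quick way to confirm this reasoning is to run the method with a convention-following stand-in. In the sketch below, the Cat struct and the body of is_cat_red? are assumptions, since the question does not show them. The second method uses Array#reject, which is presumably what the method's name was aiming for, though that is a guess about the author's intent.

```ruby
# Hypothetical stand-ins for the parts the question leaves out.
Cat = Struct.new(:color)

def is_cat_red?(cat)
  cat.color == :red
end

# The method from the question: each discards the block's value.
def get_non_red_cats(cats)
  cats.each do |cat|
    !is_cat_red?(cat)
  end
end

# A filtering variant: reject builds a new array from the block's values.
def select_non_red_cats(cats)
  cats.reject { |cat| is_cat_red?(cat) }
end
```

Given cats = [Cat.new(:red), Cat.new(:black)], get_non_red_cats(cats) returns the very same array object with both cats still in it, whereas select_non_red_cats(cats) returns a new array containing only the black cat.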

When, if ever, to use the Ruby keyword "for"

I personally like to iterate using the for keyword in Ruby since it reads very cleanly to my eye. I generally assume that for may be an alias to Enumerable#each, but I do not know whether this is correct. In the most basic example:
for i in (1..10)
puts i
end
behaves the same as
(1..10).each do |i|
puts i
end
just without creating a new variable scope. Moreover, ruby-doc says
The for loop is rarely used in modern ruby programs.
which makes me feel there is a specific, technical reason against the usage. Does it matter that there's no new variable scope? In what way?
behaves the same as
This is incorrect. for is built on top of each, but it is semantically distinct:
array = %w(a b c d)
array.each { |character| }
defined? character # nil
for character in array; end
defined? character # "local-variable"
The for keyword doesn't introduce a new scope. Any variables introduced inside the loop body remain visible outside of it, as if the body were written inline.
You should take this fact into account when you decide which form to use.
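One practical consequence of the missing scope shows up with closures. The following sketch uses hypothetical method names; it collects lambdas created inside each kind of loop and calls them afterwards.

```ruby
# for reuses one variable i for the whole loop, so every lambda sees it.
def closures_from_for
  procs = []
  for i in 1..3
    procs << -> { i }
  end
  procs.map(&:call)
end

# each gives the block a fresh i on every call, so each lambda keeps its own.
def closures_from_each
  procs = []
  (1..3).each { |i| procs << -> { i } }
  procs.map(&:call)
end

# Variables assigned in a for body also stay visible after the loop.
def leaked_from_for
  for i in 1..3
    last_square = i * i
  end
  [i, last_square]
end
```

closures_from_for returns [3, 3, 3], closures_from_each returns [1, 2, 3], and leaked_from_for returns [3, 9].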

Block in Ruby compared to Smalltalk

What does block in Ruby mean? It looks similar to a Smalltalk block, but you can't send messages to it.
For example, in smalltalk:
[:x | x + 3] value: 3
returns 6. But in ruby:
{|x| x + 3}.call 3
will cause SyntaxError.
Well, you can pass messages to lambda in ruby, though:
irb(main):025:0> ->(x){x+3}.call 3
=> 6
So in Ruby, block is not a block, but lambda is a block? Is this true? I mean, are there any differences between ruby lambda and smalltalk block? If this is true, then what is a ruby block?
Update:
From the comment and answer below, together with some googling, I guess I have a better understanding of Ruby blocks. In Ruby, a piece of code usually evaluates to a value, and every value is an object. But a block doesn't evaluate to a value, so it's not an object. Instead, it can act as part of an object. For example, {|x| x + 3} can act as a part of the object proc {|x| x + 3 }.
But it did confuse me. In smalltalk, almost every expression can be divided into objects (binding to variables are exceptions). It seems in Ruby, there are more exceptions.
First, and most importantly, here is what a Ruby block isn't: an object. It is a syntactic construct, and it obviously also has an equivalent implementation, but it is not an object and thus can't receive messages. Which makes your example of
{|x| x + 3}.call 3
ungrammatical. Lambdas, procs - those are objects that wrap a block, and have a call method which executes the block.
Thus, a block is simply a piece of code which can be passed to a method, outside the argument list - no more, no less. If you pass it to Proc.new constructor, for example, it will wrap it and give you an object you can handle:
Proc.new {|x| x + 3}.call 3
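Besides Proc.new, a method can also receive the block and either yield to it or capture it as a Proc through an & parameter. The method names in this sketch are hypothetical.

```ruby
# yield runs the block that was passed alongside the argument list.
def apply_block(n)
  yield n
end

# An & parameter wraps the passed block in a Proc object and hands it over.
def capture(&block)
  block
end

# The captured block is now a regular object that responds to call.
def captured_result
  pr = capture { |x| x + 3 }
  [pr.class, pr.lambda?, pr.call(3)]
end
```

apply_block(3) { |x| x + 3 } evaluates to 6, and captured_result returns [Proc, false, 6], which shows the block being wrapped in a non-lambda Proc.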
A clarification:
I would even say that in Smalltalk even a binding is made up of objects.
Think of the MethodContext.
What you are actually doing is storing the object in the MethodContext.
So
a := Object new
can be rewritten as:
thisContext at: 1 put: Object new.
But obviously you won't write it this way, since you would need to know where the temporary variables are.
A block in Smalltalk is an anonymous object. Syntactically, it is delimited by a [ ... ] pair.
When evaluated, it will return the last expression evaluated within itself, and there are lots of methods in its protocol.
Here are the Class comments for Blocks from a Smalltalk (in this instance, Dolphin Smalltalk 6.03 Community Edition)
"Blocks encapsulate a sequence of statements to be performed at a later time. Blocks may capture (or "close over") runtime state, such as the values of temporary variables, from the enclosing lexical scope at the point where they are created. When evaluated a block executes as if in the lexical scope in which it was defined, except that blocks may have arguments that are bound at the time of evaluation. Blocks may be passed as arguments with messages to other objects and evaluated by those objects when appropriate, and thus form a very powerful and generic "pluggability" mechanism that is a core feature which provides much of the power of Smalltalk".
By contrast, a block in Ruby is simply a piece of syntax passed along with a method call. It's syntactically delimited by a { ... } pair (or do ... end), but it has no methods of its own.

Does Ruby have a special storage for returning a value?

The following Ruby code
def a(b,c) b+c end
is the same as follows with Python
def a(b,c): return b+c
It looks like Ruby has some special storage (a stack or something) that stores the final evaluation result and returns the value when a function is called.
If so, what's the name of the stack, and how can I get that stack?
If not, how does the Ruby code work without returning something?
It's not that magic: Ruby just returns the value of the last operation evaluated in the method.
It's syntactic sugar that is implemented just at the parsing level: a statement that calculates something implicitly returns its value without any keyword.
To clarify it a little bit, you can imagine the abstract syntax trees of the two snippets: they won't be different.
I don't think it's a stack. The final evaluation of the function is simply the return value, plain and simple. Just your everyday Ruby syntactic sugar.
I don't see any reason why a stack should be required to return a result. A simple pointer to a memory location would be sufficient. I'd guess that would usually be returned in a register, such as EAX.
You get the return value of a function by assigning the function's value to a variable (or doing something else with it). That's the way it was intended to be used, and the only way that works.
Not returning anything is really easy: The called function doesn't put anything into the return location (whatever it may be) and the caller ignores it.
Actually, return is special here, not the standard behavior. Consider:
def foo(ary)
ary.each do |e|
return true if e == 2
end
end
This code actually has more than one stack frame (at least the one for #foo, the one for Array#each and the one for the anonymous function passed to #each). What return does: it jumps to the stack frame of the outermost lexical scope it is called in (the end of foo) and returns the given value. If you play a lot with anonymous functions, you will find that return is not allowed in all contexts, while just returning the last computed value is.
So I would recommend never to use return if you don't need it for precisely that reason: breaking and returning from a running iteration.
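The points from this thread can be put side by side in one sketch. The methods a and foo are taken from the question and the answer above; make_orphan and call_orphan are hypothetical helpers added to show a context in which return is not allowed.

```ruby
# Implicit return: the method evaluates to its last expression.
def a(b, c)
  b + c
end

# Explicit return from inside a block leaves foo immediately.
# Without a match, foo evaluates to whatever each returns, i.e. ary.
def foo(ary)
  ary.each do |e|
    return true if e == 2
  end
end

# A Proc whose defining method has already finished cannot return from it.
def make_orphan
  Proc.new { return :never }
end

def call_orphan
  make_orphan.call
rescue LocalJumpError => e
  e.class
end
```

a(1, 2) gives 3, foo([1, 2, 3]) gives true, foo([1, 3]) gives [1, 3], and call_orphan gives LocalJumpError because make_orphan has already returned by the time the Proc runs.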
