What does block in Ruby mean? It looks similar with Smalltalk, but you can't send messages to it.
For example, in smalltalk:
[:x | x + 3] value: 3
returns 6. But in ruby:
{|x| x + 3}.call 3
will cause SyntaxError.
Well, you can pass messages to lambda in ruby, though:
irb(main):025:0> ->(x){x+3}.call 3
=> 6
So in Ruby, block is not a block, but lambda is a block? Is this true? I mean, are there any differences between ruby lambda and smalltalk block? If this is true, then what is a ruby block?
Update:
From the comment and answer below, together with some googling, I guess I
have more understanding of Ruby block. In Ruby, usually a piece of code evaluates an value, and every value is an object. But, block doesn't evaluate an value. So it's not an object. Instead it can act as part of an object. For example, in {|x| x + 3} can act as a part of the object proc {|x| x + 3 }.
But it did confuse me. In smalltalk, almost every expression can be divided into objects (binding to variables are exceptions). It seems in Ruby, there are more exceptions.
First and the most important thing that Ruby block isn't: an object. It is a syntactic construct, and also obviously has an equivalent implementation - but it is not an object, and thus can't receive messages. Which makes your example of
{|x| x + 3}.call 3
ungrammatical. Lambdas, procs - those are objects that wrap a block, and have a call method which executes the block.
Thus, a block is simply a piece of code which can be passed to a method, outside the argument list - no more, no less. If you pass it to Proc.new constructor, for example, it will wrap it and give you an object you can handle:
Proc.new {|x| x + 3}.call 3
A precision:
I would even say that in smalltalk even binding is made up with object.
Think of the MethodContext.
What you are actually doing is to store the object in the MethodContext.
So
a := Object new
Can be rewrite in:
thisContext at: 1 put: Object new.
But obviously you wont write it this way since you need to know were are the temps variable.
A block in Smalltalk is an anonymous object. Syntactically, it is delimited by a [ ... ] pair.
When evaluated, it will return the last expression evaluated within itself, and there are lots of methods in its protocol.
Here are the Class comments for Blocks from a Smalltalk (in this instance, Dolphin Smalltalk 6.03 Community Edition)
"Blocks encapsulate a sequence of statements to be performed at a later time. Blocks may capture (or "close over") runtime state, such as the values of temporary variables, from the enclosing lexical scope at the point where they are created. When evaluated a block executes as if in the lexical scope in which it was defined, except that blocks may have arguments that are bound at the time of evaluation. Blocks may be passed as arguments with messages to other objects and evaluated by those objects when appropriate, and thus form a very powerful and generic "pluggability" mechanism that is a core feature which provides much of the power of Smalltalk".
By contrast, a block in Ruby is simply a parameter string. It's syntactically delimited by a { ... } pair, but it has no methods of its own.
Related
As I understood return inside a Proc terminates the current method. So in the following example I would expect to see:
a1 > b1 > proc > a2. But actually it never reaches a2, why?
def a
puts "a1"
l = Proc.new {puts "proc"; return}
b l
puts "a2"
end
def b x
puts "b1"
x.call
puts "b2"
end
a
As a general rule, return always returns from the closest lexically enclosing method definition expression.
In this case, the closest lexically enclosing method definition expression is def a, therefore, return returns from a.
It does not actually matter that the return is inside a block in this case. The general rule is, well, general, so it applies regardless of where the return appears.
If we look more specifically at blocks, though, we can see that it still makes sense: in blocks, local variables are captured lexically, self is captured lexically, so it makes sense that return also behaves lexically. It is a general property of blocks that if you want to understand what is going on in a block, you only need to look lexically outwards.
And if we get even more specific, first going from the general rule to blocks, and now from blocks to Procs, the behavior still makes sense: a Proc is essentially a reified block, so it makes sense for a Proc to behave like a block.
There are some exceptions, though, to the general rule, and one important one are lambdas. Talking about lambdas in Ruby is always a little bit weird because lambdas are Procs but they behave differently from Procs. IMO, lambdas should have a separate class alongside Procs. Since lambdas are Procs, it makes it weird to talk about the differences between lambdas and Procs which are not lambdas (which don't have a standardized name and thus are confusingly also called Procs).
The behavior of a lambda differs from the behavior of a non-lambda Proc in two ways, one of which is relevant to your question:
Parameter binding in non-lambda Procs has the same semantics as parameter binding in blocks, whereas parameter binding in lambdas has the same semantics as parameter binding in message sends / method invocations.
In non-lambda Procs, return returns from the closest lexically enclosing method definition expression, just like in blocks, whereas in lambdas, return returns from the lambda itself, just like return in methods.
So, in both of these aspects, non-lambda Procs behave like blocks and lambdas behave like methods. I memorize it like this: "Proc" rhymes with "block" and both "lambda" and "method" are Greek.
As you probably know, there are some methods which also alter the behavior of blocks that are passed to them. E.g. instance_eval and instance_exec change the value of self, and define_method actually does change the behavior of return.
But since you didn't ask about blocks in general, and also didn't ask about lambdas specifically, and there are no reflective methods in your question, the general rules still applies to non-lambda Procs like the one shown in your question: return returns from the closest lexically enclosing method definition expression.
The pre/post increment/decrement operator (++ and --) are pretty standard programing language syntax (for procedural and object-oriented languages, at least).
Why doesn't Ruby support them? I understand you could accomplish the same thing with += and -=, but it just seems oddly arbitrary to exclude something like that, especially since it's so concise and conventional.
Example:
i = 0 #=> 0
i += 1 #=> 1
i #=> 1
i++ #=> expect 2, but as far as I can tell,
#=> irb ignores the second + and waits for a second number to add to i
I understand Fixnum is immutable, but if += can just instanciate a new Fixnum and set it, why not do the same for ++?
Is consistency in assignments containing the = character the only reason for this, or am I missing something?
Here is how Matz(Yukihiro Matsumoto) explains it in an old thread:
Hi,
In message "[ruby-talk:02706] X++?"
on 00/05/10, Aleksi Niemelä <aleksi.niemela#cinnober.com> writes:
|I got an idea from http://www.pragprog.com:8080/rubyfaq/rubyfaq-5.html#ss5.3
|and thought to try. I didn't manage to make "auto(in|de)crement" working so
|could somebody help here? Does this contain some errors or is the idea
|wrong?
(1) ++ and -- are NOT reserved operator in Ruby.
(2) C's increment/decrement operators are in fact hidden assignment.
They affect variables, not objects. You cannot accomplish
assignment via method. Ruby uses +=/-= operator instead.
(3) self cannot be a target of assignment. In addition, altering
the value of integer 1 might cause severe confusion throughout
the program.
matz.
One reason is that up to now every assignment operator (i.e. an operator which changes a variable) has a = in it. If you add ++ and --, that's no longer the case.
Another reason is that the behavior of ++ and -- often confuse people. Case in point: The return value of i++ in your example would actually be 1, not 2 (the new value of i would be 2, however).
It's not conventional in OO languages. In fact, there is no ++ in Smalltalk, the language that coined the term "object-oriented programming" (and the language Ruby is most strongly influenced by). What you mean is that it's conventional in C and languages closely imitating C. Ruby does have a somewhat C-like syntax, but it isn't slavish in adhering to C traditions.
As for why it isn't in Ruby: Matz didn't want it. That's really the ultimate reason.
The reason no such thing exists in Smalltalk is because it's part of the language's overriding philosophy that assigning a variable is fundamentally a different kind of thing than sending a message to an object — it's on a different level. This thinking probably influenced Matz in designing Ruby.
It wouldn't be impossible to include it in Ruby — you could easily write a preprocessor that transforms all ++ into +=1. but evidently Matz didn't like the idea of an operator that did a "hidden assignment." It also seems a little strange to have an operator with a hidden integer operand inside of it. No other operator in the language works that way.
I think there's another reason: ++ in Ruby wouldn't be remotely useful as in C and its direct successors.
The reason being, the for keyword: while it's essential in C, it's mostly superfluous in Ruby. Most of the iteration in Ruby is done through Enumerable methods, such as each and map when iterating through some data structure, and Fixnum#times method, when you need to loop an exact number of times.
Actually, as far as I have seen, most of the time +=1 is used by people freshly migrated to Ruby from C-style languages.
In short, it's really questionable if methods ++ and -- would be used at all.
You can define a .+ self-increment operator:
class Variable
def initialize value = nil
#value = value
end
attr_accessor :value
def method_missing *args, &blk
#value.send(*args, &blk)
end
def to_s
#value.to_s
end
# pre-increment ".+" when x not present
def +(x = nil)
x ? #value + x : #value += 1
end
def -(x = nil)
x ? #value - x : #value -= 1
end
end
i = Variable.new 5
puts i #=> 5
# normal use of +
puts i + 4 #=> 9
puts i #=> 5
# incrementing
puts i.+ #=> 6
puts i #=> 6
More information on "class Variable" is available in "Class Variable to increment Fixnum objects".
I think Matz' reasoning for not liking them is that it actually replaces the variable with a new one.
ex:
a = SomeClass.new
def a.go
'hello'
end
# at this point, you can call a.go
# but if you did an a++
# that really means a = a + 1
# so you can no longer call a.go
# as you have lost your original
Now if somebody could convince him that it should just call #succ! or what not, that would make more sense, and avoid the problem. You can suggest it on ruby core.
And in the words of David Black from his book "The Well-Grounded Rubyist":
Some objects in Ruby are stored in variables as immediate values. These include
integers, symbols (which look like :this), and the special objects true, false, and
nil. When you assign one of these values to a variable (x = 1), the variable holds
the value itself, rather than a reference to it.
In practical terms, this doesn’t matter (and it will often be left as implied, rather than
spelled out repeatedly, in discussions of references and related topics in this book).
Ruby handles the dereferencing of object references automatically; you don’t have to
do any extra work to send a message to an object that contains, say, a reference to
a string, as opposed to an object that contains an immediate integer value.
But the immediate-value representation rule has a couple of interesting ramifications,
especially when it comes to integers. For one thing, any object that’s represented
as an immediate value is always exactly the same object, no matter how many
variables it’s assigned to. There’s only one object 100, only one object false, and
so on.
The immediate, unique nature of integer-bound variables is behind Ruby’s lack of
pre- and post-increment operators—which is to say, you can’t do this in Ruby:
x = 1
x++ # No such operator
The reason is that due to the immediate presence of 1 in x, x++ would be like 1++,
which means you’d be changing the number 1 to the number 2—and that makes
no sense.
Some objects in Ruby are stored in variables as immediate values. These include integers, symbols (which look like :this), and the special objects true, false, and nil. When you assign one of these values to a variable (x = 1), the variable holds the value itself, rather than a reference to it.
Any object that’s represented as an immediate value is always exactly the same object, no matter how many variables it’s assigned to. There’s only one object 100, only one object false, and so on.
The immediate, unique nature of integer-bound variables is behind Ruby’s lack of pre-and post-increment operators—which is to say, you can’t do this in Ruby:
x=1
x++ # No such operator
The reason is that due to the immediate presence of 1 in x, x++ would be like 1++, which means you’d be changing the number 1 to the number 2—and that makes no sense.
Couldn't this be achieved by adding a new method to the fixnum or Integer class?
$ ruby -e 'numb=1;puts numb.next'
returns 2
"Destructive" methods seem to be appended with ! to warn possible users, so adding a new method called next! would pretty much do what was requested ie.
$ ruby -e 'numb=1; numb.next!; puts numb'
returns 2 (since numb has been incremented)
Of course, the next! method would have to check that the object was an integer variable and not a real number, but this should be available.
The following code generates an output of 9. I understand send is simply calling the method :[], but am confused how the parameters work.
x = [1,2,3]
x.send :[]=,0,4 #why is x now [4,2,3]
x[0] + x.[](1) + x.send(:[],2) # 4 + 2 + 3
How do line 2 and line 3 work?
Object#send provides another way to invoke a method.
x.send :[]=,0,4
is saying, invoke []= method on x, and pass the argument 0, 4, it's equivalent to:
x[0] = 4
The name send is because in Ruby, methods are invoked by sending a message to an object. The message contains the method's name, along with parameters the method may need. This idea comes from SmallTalk.
x.send(:[]=, 0, 4)
is the same* as
x.[]=(0, 4)
which has the syntactic sugar of
x[0] = 4
which should be familiar from other languages.
Parentheses are, of course, optional.
The .send or .public_send variety has the advantage that the method to invoke doesn't have to be hardcoded—it can come from a variable.
All this is congruent with ruby's paradigm of object orientedness: objects communicate with each other by sending messages, and the messages trigger code execution.
*almost, #send will invoke private methods too
Line 2 is
x.send :[]=,0,4
That is basically a fancy way of writing this:
x[0] = 4
(Calling send allows you to call private methods though, and that is one difference between the two syntaxes. Also, an object could conceivably override the send method, which would break the first syntax.)
So line 2 has the effect of writing 4 into the first spot in the array.
Now on line 3, we see that we are adding up three things. Here is a list of the things we are adding:
x[0] - the first element
x.[](1) - another syntax for accessing elements, which accesses the second element. This syntax is a traditional method call, where the name of the method happens to be [].
x.send(:[], 2) - This shows another feature of Ruby, which is the send method. It accesses the third element.
So the result will be 9, because the third line adds the first, second, and third elements of the array.
These examples appear to illustrate an interesting point about the design of the Ruby language. Specifically, array accesses are implemented as method calls. The preferred syntax for writing to an array is x[0] = 4 and the preferred syntax for reading is x[0], because that syntax is familiar to programmers from many different languages. However, reading and writing from arrays is actually implemented using method calls under the hood, and that's why it is possible to use some other syntaxes that look more like a traditional method call.
A traditional method call looks like this: object.foo(arg1, arg2, arg3, ...).
The send thing shown above is a useful feature of Ruby that allows you to specify which method you are calling using a symbol, instead of forcing you to type the exact name. It also lets you call private methods.
I have a couple questions about Ruby's methods, procedures, and blocks that strike me as rather odd. They're not so much about syntax or function as the logic behind the decisions made.
Question 1:
Why is it that blocks can be passed to methods (e.g. each) but they cannot be assigned to a variable?
I know you can pass them around in procedures, i.e. p = Proc.new {...} (accessed with &p), but it doesn't make much sense to make the programmer go through these means.
Question 2:
Why is there a differentiation between methods and procedures?
For instance, I can accomplish the task of defining a function and calling that function in the following two ways:
def square(x)
x**2
end
square(3)
=> 9
or
square = lambda {|x| x**2}
square.call(3)
=> 9
Why the differentiation? In Python for example both defining a function in the standard way and by square = lambda x: x**2 accomplish the same task of creating the function and assigning it to square.
Question 1: Blocks are not objects, they are syntactic structures; this is why they cannot be assigned to a variable. This is a privilege reserved for objects.
Question 2: Methods are not objects, so they cannot receive messages. Inversely, procs and lambdas are objects, so they cannot be invoked like methods, but must receive a message that tells them to return a value on the basis of the parameters passed with the message.
Procs and Lambdas are objects, so they can receive the call message and be assigned to names. To summarize, it is being an object that makes procs and lambdas behave in ways you find odd. Methods and blocks are not objects and don't share that behavior.
To some extent at least, methods are objects:
class ABC
def some_method
end
end
ABC.instance_method(:some_method) #=> #<UnboundMethod: ABC#some_method>
Further to that, there is a built-in class: Method, as documented here.
See also this: http://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Method_Calls
Haphazardly <bseg>, it does rather seem to bear out the everything-is-an-object thing. In this particular case, it just appears to take a little more digging to see.
(I really must make an effort to understand this better: I'm starting to think it's fundamental to getting a deeper understanding.)
Methods are methods — that is, they're actions that an object can take in response to messages. They are not functions.
Blocks are closures — they're functions that close over the enclosing scope. They don't conceptually "belong to" a given object.
In some languages, methods are merely functions that are members of an object, but Ruby does not view them this way. Separating a method from its owning object is more akin to surgery than simple assignment. Ruby takes its object-orientation model from Smalltalk, the granddaddy of modern OO.
I recently discovered Ruby's blocks and yielding features, and I was wondering: where does this fit in terms of computer science theory? Is it a functional programming technique, or something more specific?
Ruby's yield is not an iterator like in C# and Python. yield itself is actually a really simple concept once you understand how blocks work in Ruby.
Yes, blocks are a functional programming feature, even though Ruby is not properly a functional language. In fact, Ruby uses the method lambda to create block objects, which is borrowed from Lisp's syntax for creating anonymous functions — which is what blocks are. From a computer science standpoint, Ruby's blocks (and Lisp's lambda functions) are closures. In Ruby, methods usually take only one block. (You can pass more, but it's awkward.)
The yield keyword in Ruby is just a way of calling a block that's been given to a method. These two examples are equivalent:
def with_log
output = yield # We're calling our block here with yield
puts "Returned value is #{output}"
end
def with_log(&stuff_to_do) # the & tells Ruby to convert into
# an object without calling lambda
output = stuff_to_do.call # We're explicitly calling the block here
puts "Returned value is #{output}"
end
In the first case, we're just assuming there's a block and say to call it. In the other, Ruby wraps the block in an object and passes it as an argument. The first is more efficient and readable, but they're effectively the same. You'd call either one like this:
with_log do
a = 5
other_num = gets.to_i
#my_var = a + other_num
end
And it would print the value that wound up getting assigned to #my_var. (OK, so that's a completely stupid function, but I think you get the idea.)
Blocks are used for a lot of things in Ruby. Almost every place you'd use a loop in a language like Java, it's replaced in Ruby with methods that take blocks. For example,
[1,2,3].each {|value| print value} # prints "123"
[1,2,3].map {|value| 2**value} # returns [2, 4, 8]
[1,2,3].reject {|value| value % 2 == 0} # returns [1, 3]
As Andrew noted, it's also commonly used for opening files and many other places. Basically anytime you have a standard function that could use some custom logic (like sorting an array or processing a file), you'll use a block. There are other uses too, but this answer is already so long I'm afraid it will cause heart attacks in readers with weaker constitutions. Hopefully this clears up the confusion on this topic.
There's more to yield and blocks than mere looping.
The series Enumerating enumerable has a series of things you can do with enumerations, such as asking if a statement is true for any member of a group, or if it's true for all the members, or searching for any or all members meeting a certain condition.
Blocks are also useful for variable scope. Rather than merely being convenient, it can help with good design. For example, the code
File.open("filename", "w") do |f|
f.puts "text"
end
ensures that the file stream is closed when you're finished with it, even if an exception occurs, and that the variable is out of scope once you're finished with it.
A casual google didn't come up with a good blog post about blocks and yields in ruby. I don't know why.
Response to comment:
I suspect it gets closed because of the block ending, not because the variable goes out of scope.
My understanding is that nothing special happens when the last variable pointing to an object goes out of scope, apart from that object being eligible for garbage collection. I don't know how to confirm this, though.
I can show that the file object gets closed before it gets garbage collected, which usually doesn't happen immediately. In the following example, you can see that a file object is closed in the second puts statement, but it hasn't been garbage collected.
g = nil
File.open("/dev/null") do |f|
puts f.inspect # #<File:/dev/null>
puts f.object_id # Some number like 70233884832420
g = f
end
puts g.inspect # #<File:/dev/null (closed)>
puts g.object_id # The exact same number as the one printed out above,
# indicating that g points to the exact same object that f pointed to
I think the yield statement originated from the CLU language. I always wonder if the character from Tron was named after CLU too....
I think 'coroutine' is the keyword you're looking for.
E.g. http://en.wikipedia.org/wiki/Yield
Yield in computing and information science:
in computer science, a point of return (and re-entry) of a coroutine