ruby blocks not first-class - ruby

From a language design perspective, why aren't ruby blocks first-class?
Similarly, I think blocks should actually be lambdas, thereby getting rid of the need for cumbersome syntax such as proc {...}.call or &proc or lambda or Proc.new. This would get rid of the need for yield too.

From a language design perspective, why aren't ruby blocks first-class?
Mostly for performance reasons, as far as I'm aware. Consider:
def test_yield
yield
end
def test_block &block
block.call
end
la = lambda {}
def test_lambda l
l.call
end
Then, benchmark with an empty block for the first two, vs the third with a new la per call or with the same la, and note how much faster the yield goes in each case. The reason is, the explicit &block variable creates a Proc object, as does lambda, while merely yielding doesn't.
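One rough way to run that comparison is sketched below. The time_it helper, the labels, and the iteration count N are made up for illustration and are not part of the answer. Absolute timings depend heavily on the Ruby version and the machine, so the gap you see may be larger or smaller than described.

```ruby
def test_yield
  yield
end

def test_block(&block)
  block.call
end

def test_lambda(l)
  l.call
end

# Hypothetical helper: runs the given block n times and returns elapsed seconds.
def time_it(label, n)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  n.times { yield }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  puts format('%-18s %.4fs', label, elapsed)
  elapsed
end

N = 100_000 # arbitrary iteration count, chosen only for illustration
shared_la = lambda {}

time_it('yield', N)           { test_yield {} }
time_it('&block', N)          { test_block {} }
time_it('new lambda each', N) { test_lambda(lambda {}) }
time_it('shared lambda', N)   { test_lambda(shared_la) }
```

Run it a few times before drawing conclusions; a single pass is noisy.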
A side-effect (which I've actually found uses for, to recursively pipe passed blocks through the use of a proc object), is you cannot yield in a proc or lambda outside some kind of enclosing scope:
foo = proc { yield if block_given? }
foo.call { puts 'not shown' }
def bar
baz = proc { yield if block_given? }
baz.call
end
bar { puts 'should show' }
This is because, as I've come to understand it (I lost a lot of hair over this until it clicked), block_given? is sent to main when foo calls it, and to bar rather than baz when it gets evaluated in bar.

lambda and proc (and block) have different semantics. Procs/blocks have non-local returns and are less picky about arity; lambdas are more method-like in their behaviour. In my opinion this distinction is useful and procs/blocks/lambdas should NOT be unified as you suggest.
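A small sketch of the two differences mentioned here, arity checking and return behaviour. The variable and method names are invented for the example.

```ruby
loose  = proc   { |a, b| [a, b] }
strict = lambda { |a, b| [a, b] }

# Procs tolerate a wrong argument count: missing values become nil,
# extra values are dropped.
p loose.call(1)        # => [1, nil]
p loose.call(1, 2, 3)  # => [1, 2]

# Lambdas check arity the way methods do.
begin
  strict.call(1)
rescue ArgumentError => e
  puts "lambda rejected the call: #{e.message}"
end

# return inside a proc returns from the enclosing method (non-local return).
def return_from_proc
  proc { return :returned_by_proc }.call
  :method_finished # never reached
end

# return inside a lambda only returns from the lambda itself.
def return_from_lambda
  lambda { return :returned_by_lambda }.call
  :method_finished
end

p return_from_proc   # => :returned_by_proc
p return_from_lambda # => :method_finished
```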

Ruby methods are not functions or first-class citizens because they cannot be passed to other methods as arguments, returned by other methods, or assigned to variables. Ruby procs are first-class, similar to JavaScript’s first-class functions.
The following code demonstrates how Ruby methods cannot be stored in variables or returned from methods and therefore do not meet the ‘first-class’ criteria:
class Dog
def speak
'ruff'
end
end
fido = Dog.new
# Ruby methods cannot be stored in variables
# Methods are executed and variables only store values
x = fido.speak
# x stores the method's return value, not the method itself
x # => 'ruff'
# Methods cannot return other methods
# Methods can only return values from other methods
def hi
Dog.new.speak
end
# hi returns the method's return value, not the method itself
hi # => 'ruff'
A programming language is said to have first-class functions if it treats functions as first-class citizens. Specifically, this means the language supports passing functions as arguments to other functions, returning them as the values from other functions, and assigning them to variables or storing them in data structures.
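For contrast with the Dog example, here is a short sketch of procs meeting each of the criteria in that definition. All names (shout, apply, make_adder, OPS) are hypothetical and exist only for this illustration.

```ruby
# Assigned to a variable
shout = proc { |s| s.upcase + '!' }

# Passed to a method as an ordinary argument
def apply(callable, value)
  callable.call(value)
end

# Returned from a method
def make_adder(n)
  proc { |x| x + n }
end

# Stored in a data structure
OPS = { double: proc { |x| x * 2 }, negate: proc { |x| -x } }

p apply(shout, 'ruff')   # => "RUFF!"
p make_adder(3).call(4)  # => 7
p OPS[:double].call(21)  # => 42
```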

Related

Within a Ruby method, should I create a proc or a method?

Just want to enquire what the right practice is.
My preference is to use procs, simply because I think that defining methods inside of methods is a bit untidy and should be done only when necessary. To get around it, I simply use procs.
What is the right / better way to do it and why? (apart from the proc's ability to access the main method's variables defined before itself)
def meth( params_prime )
calculations = do_something_with_whatever
def sub_meth( params_sub )
do_something_with_params_sub
end
sub_meth(calculations) # is this better?
proc1 = proc{ |params_sub| do_something_with_params_sub }
proc1.call(calculations) # or is this?
end
It is not clear what your specific use-case is, but I would definitely go for procs or lambdas. There is less overhead when defining a proc or lambda dynamically, they are passable, so if needed you could return them and they could be used outside the function.
Using "def" exposes the method as an instance method outside of the current method scope (so in the containing class, which could be Object in your case). This may or may not be what you want. If you want to use an anonymous function only available in the local scope, use a lambda.
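A minimal sketch of that difference, using a made-up Calculator class: the nested def ends up as a public instance method on the class once the outer method runs, while the lambda stays a local variable.

```ruby
class Calculator
  def with_nested_def(x)
    # This def runs every time with_nested_def is called and
    # defines Calculator#double, visible to every caller.
    def double(n)
      n * 2
    end
    double(x)
  end

  def with_lambda(x)
    # triple is just a local variable; nothing is added to the class.
    triple = ->(n) { n * 3 }
    triple.call(x)
  end
end

calc = Calculator.new
p Calculator.instance_methods(false).include?(:double) # => false
p calc.with_nested_def(2)                              # => 4
p Calculator.instance_methods(false).include?(:double) # => true
p calc.with_lambda(2)                                  # => 6
p Calculator.instance_methods(false).include?(:triple) # => false
```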
Also, Proc vs. lambda: I generally prefer lambdas since they behave a little more predictably, meaning as you would expect (they check the arguments passed, and return just returns from the lambda, whereas a proc's return returns from the scope that called it). But from your example it is hard to deduce what would apply. I think the key difference is that lambdas are meant to be passed around, and thus behave a little more sanely. If this is not your use case, use Proc :) (see a write-up of the difference).
If you want to encapsulate sub_func so that other methods cannot call it, you can use a class to group func and sub_func together and make sub_func private. Otherwise, if you want to pass this function on as a parameter, you can declare it as a lambda.
def func params_prime
sub_func = ->(params_sub){do_something_with_params}
sub_func.call(params_prime)
end
Defining methods inside methods is a feature of Ruby that may have its use. But something is telling me that you are asking a very advanced question while you are still a beginner level Rubyist. Do you know what default definee is? If not, check this article by Yugui.
Procs are very important in Ruby, but newbies tend to use them instead of defining methods in appropriate objects, which is the exact smell I'm getting from your question. The normal way of doing things in OO languages of Ruby family is to define methods on objects:
class Foo
def bar *params
# do something with params
end
end
Since you do not understand the meaning of defining methods inside methods, refrain from doing it for the next 6 months. Once you understand objects, you can start experimenting with this very advanced feature again.
APPENDIX:
Since you demonstrated interest, let me show you that using def in def at the top level is a frowned-upon thing to do. Normally, when you define a method on some class without further adornment, it becomes a public instance method of that class:
class X
def foo; "foo" end
end
X.instance_methods.include? :foo
#=> true
When you use def in a def, the definee for the inner def is going to be X:
class X
def bar
def baz
"baz"
end
"bar"
end
end
When you execute the above code, instance method #bar becomes defined on X:
X.instance_methods.include? :bar
#=> true
But #baz not yet:
X.instance_methods.include? :baz
#=> false
Only after you call #bar at least once does the method become defined on X:
X.new.bar
#=> "bar"
X.instance_methods.include? :baz
#=> true
And now I would like to ask you to appreciate what a terrible thing just happened: an instance just modified its mother class. That's a violation. A violation of such a basic principle of OO design that I'm not even sure it has a name. This technique is great for obfuscated coding competitions, but in production, it's taboo. Ruby gives you the freedom to break that taboo, gives you the rope to hang yourself with, but you don't do it under any kind of normal circumstances.
So what can be worse than a def inside a def in a class definition? The answer is, a def inside a def at the top level. Let me show you why. Normally, when you define methods with def at the top level, the default definee is Object, but the top level definitions become private instance methods of Object. This is to prevent the unintended consequence of top level defs, because almost all Ruby objects inherit from Object. For example, if you define:
class Object
def foo; "foo" end
end
Now all your objects will respond to foo:
foo #=> "foo"
1.foo #=> "foo"
[].foo #=> "foo"
When we define methods at the top level, we usually just intend to use the method at the top level, and don't want every single object to inherit it. For that reason, top level defs become private:
hello #=> NameError: undefined local variable or method `hello' for main:Object
1.hello #=> NoMethodError: undefined method 'hello' for 1:Fixnum
Now we use def at the top level:
def hello; "hello" end
We can see that method #hello has not become an instance method of Object:
Object.instance_methods.include? :hello
#=> false
Mysteriously, it became its private method:
Object.private_instance_methods.include? :hello
#=> true
This way, we avoid the unintended consequence of defining #hello method for every single object. But the inheritance is there. The error message has changed:
1.hello #=> NoMethodError: private method 'hello' called for 1:Fixnum
And we can forcibly call the method via #send:
1.send :hello
#=> "hello"
Mysteriously, at the top level, we are allowed to call this private method without #send:
hello
#=> "hello"
And now, what happens when you do def in def at the top level:
def bar
def baz; "baz" end
"bar"
end
You define a private instance method Object#bar in an expected way. But when you call it, alas, the top level magic no longer works and a public method Object#baz gets defined:
bar #=> "bar"
This way, not just the top level, but every single Ruby object got polluted with your #baz method:
1.baz #=> "baz"
Class.baz #=> "baz"
This is why I told you to refrain from using this idiom until you progress from the level of unconscious incompetence to the level of conscious incompetence. I recommend you to read more about top level methods in Ruby.

Why do Ruby blocks not have required parameters?

While starting with Ruby 2.0, I created a small script that worked with the new keyword parameters. While coding this, the behavior of blocks and lambdas surprised me. Below exercises what I had found:
def print_parameters(proc = nil, &block)
p "Block: #{block.parameters}" if proc.nil?
p "Lambda: #{proc.parameters}" unless proc.nil?
end
print_parameters(-> (first, second = 'test') {})
print_parameters(&-> (first, second = 'test') {})
print_parameters {|first, second = 'test'|}
The results are as follows:
"Lambda: [[:req, :first], [:opt, :second]]"
"Block: [[:req, :first], [:opt, :second]]"
"Block: [[:opt, :first], [:opt, :second]]"
Why is it that creating a block does not have required parameters but using a lambda or a block created from a lambda does?
The semantics of blocks in Ruby are designed to make them as useful as possible for iterators, like Integer#times or Enumerable#each. Since blocks do not have required parameters, you can do things like:
10.times { puts "Hello!" }
...or:
10.times { |i| puts i }
This is also the reason behind the next / return distinction in Ruby.
Ruby "lambdas" are different; they are not "optimized" for use as "loop bodies" (though you can use them that way if you want). They are stricter about the number of arguments passed, which potentially can help to catch bugs.
Lambdas behave more like methods in Ruby: when you define a method with a required parameter, you have to supply that parameter when calling it. Blocks behave more like procs: procs can declare parameters, but they don't require them.
The lambda syntax actually creates a proc with rigid arity. If you were to output the classes of both variables, you would see that both lambdas and blocks are instances of Proc. Procs created using the lambda syntax will respond true to the #lambda? method. Also check out this SO discussion to understand some other behavioral distinctions between lambdas and procs: When to use lambda, when to use Proc.new?
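These claims can be checked directly in irb. The sketch below uses two illustrative variable names, from_block and from_lambda, which are not from the question.

```ruby
from_block  = proc { |first, second = 'test'| }
from_lambda = ->(first, second = 'test') {}

# Both are instances of the same class...
p from_block.class     # => Proc
p from_lambda.class    # => Proc

# ...and #lambda? tells them apart.
p from_block.lambda?   # => false
p from_lambda.lambda?  # => true

# The parameter lists match what the question observed.
p from_block.parameters   # => [[:opt, :first], [:opt, :second]]
p from_lambda.parameters  # => [[:req, :first], [:opt, :second]]

# Calling with no arguments: the proc accepts it, the lambda raises.
from_block.call
begin
  from_lambda.call
rescue ArgumentError => e
  puts "lambda: #{e.message}"
end
```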

What is the purpose of blocks?

I've just started on ruby and can't wrap my head around blocks
How is it different from an anonymous function?
On what instance would I want to use it?
And when would I choose it over an anonymous function?
Ruby doesn't have anonymous functions like JavaScript (for example) has. Blocks have 3 basic uses:
Creating Procs
Creating lambdas
With functions
An example of where blocks are similar to anonymous functions is here (Ruby and JavaScript).
Ruby:
[1,2,3,4,5].each do |e| #do starts the block
puts e
end #end ends it
JS (jQuery):
$.each([1,2,3,4,5], function(e) { //Anonymous *function* starts here
console.log(e);
}); //Ends here
The power of Ruby blocks (and anonymous functions) is the fact that they can be passed to any method (including those you define). So if I want my own each method, here's how it could be done:
class Array
def my_each
i = 0
while(i<self.length)
yield self[i]
i+=1
end
end
end
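As a usage sketch, the my_each definition above can be exercised like this. The method is repeated, with indentation, so the snippet runs on its own; the squares array is only there to show that the block receives each element in order.

```ruby
class Array
  def my_each
    i = 0
    while i < length
      yield self[i] # hand the current element to the caller's block
      i += 1
    end
  end
end

squares = []
[1, 2, 3].my_each { |e| squares << e * e }
p squares # => [1, 4, 9]
```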
For example, when you declare a method like this:
def foo(&block)
end
block is a Proc object representing the block passed. So Proc.new, could look like this:
def Proc.new(&block)
block
end
Blocks, by necessity, are bound to a method. They can only be turned into an object by a method like the one I described above. I'm not sure of the exact implementation of lambda (it does extra argument checking), but it is the same idea.
So the fundamental idea of a block is this: A block of code, bound to a method, that can either be contained in a Proc object by an & argument, or called by the yield keyword.
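The two options in that summary can be put side by side. This is a sketch with invented method names (with_yield, with_capture), not code from the answer.

```ruby
# Option 1: the block stays implicit and is invoked with yield.
def with_yield
  yield 21
end

# Option 2: the & parameter wraps the block in a Proc object,
# which can then be returned, stored, or called later.
def with_capture(&block)
  block
end

p with_yield { |n| n * 2 }          # => 42

captured = with_capture { |n| n * 2 }
p captured.class                    # => Proc
p captured.call(21)                 # => 42
```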

How does the "#map(&proc)" idiom work when introspecting module classes?

Presenting the Idiom
I found an interesting but unexplained alternative to an accepted answer. The code clearly works in the REPL. For example:
module Foo
class Bar
def baz
end
end
end
Foo.constants.map(&Foo.method(:const_get)).grep(Class)
=> [Foo::Bar]
However, I don't fully understand the idiom in use here. In particular, I don't understand the use of &Foo, which seems to be some sort of closure, or how this specific invocation of #grep operates on the result.
Parsing the Idiom
So far, I've been able to parse bits and pieces of this, but I'm not really seeing how it all fits together. Here's what I think I understand about the sample code.
Foo.constants returns an array of module constants as symbols.
method(:const_get) uses Object#method to perform a method lookup and return a closure.
Foo.method(:const_get).call :Bar is a closure that returns a qualified path to a constant within the class.
&Foo seems to be some sort of special lambda. The docs say:
The & argument preserves the tricks if a Proc object is given by & argument.
I'm not sure I fully understand what that means in this specific context, either. Why a Proc? What "tricks," and why are they necessary here?
grep(Class) is operating on the value of the #map method, but its features are not obvious.
Why is this #map construct returning a greppable Array instead of an Enumerator?
Foo.constants.map(&Foo.method(:const_get)).class
=> Array
How does grepping for a class named Class actually work, and why is that particular construction necessary here?
[Foo::Bar].grep Class
=> [Foo::Bar]
The Question, Restated
I'd really like to understand this idiom in its entirety. Can anyone fill in the gaps here, and explain how the pieces all fit together?
&Foo.method(:const_get) is the method const_get of the Foo object. Here's another example:
m = 1.method(:+)
#=> #<Method: Fixnum#+>
m.call(1)
#=> 2
(1..3).map(&m)
#=> [2, 3, 4]
So in the end this is just a pointfree way of saying Foo.constants.map { |c| Foo.const_get(c) }. grep uses === to select elements, so it would only get constants that refer to classes, not other values. This can be verified by adding another constant to Foo, e.g. Baz = 1, which will not get grepped.
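Both the equivalence and the Baz = 1 check can be tried in one go. The sketch below redefines Foo with the extra constant; the names by_idiom and by_block are illustrative only.

```ruby
module Foo
  class Bar
    def baz
    end
  end

  Baz = 1 # a constant that does not refer to a class
end

# The point-free idiom from the question
by_idiom = Foo.constants.map(&Foo.method(:const_get)).grep(Class)

# The same thing spelled out with explicit blocks
by_block = Foo.constants.map { |c| Foo.const_get(c) }.select { |v| Class === v }

p by_idiom              # => [Foo::Bar]
p by_idiom == by_block  # => true
```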
If you have further questions please add them as comments and I'll try to clarify them.
Your parse of the idiom is pretty spot on, but I'll go through it and try to clear up any questions you mentioned.
1. Foo.constants
As you mentioned, this returns an array of module constant names as symbols.
2. Array#map
You obviously know what this does, but I want to include it for completeness. Map takes a block and calls that block with each element as an argument. It returns an Array of the results of these block calls.
3. Object#method
Also as you mentioned, this does a method lookup. This is important because a method without parentheses in Ruby is a method call of that method without any arguments.
4. &
This operator is for converting things to blocks. We need this because blocks are not first-class objects in Ruby. Because of this second-class status, we have no way to create blocks which stand alone, but we can convert Procs into blocks (but only when we are passing them to a function)! The & operator is our way of doing this conversion. Whenever we want to pass a Proc object as if it were a block, we can prepend it with the & operator and pass it as the last argument to our function. But & can actually convert more than just Proc objects, it can convert anything that has a to_proc method!
In our case, we have a Method object, which does have a to_proc method. The difference between a Proc object and a Method object lies in their context. A Method object is bound to a class instance and has access to the variables which belong to that class. A Proc is bound to the context in which it is created; that is, it has access to the scope in which it was created. Method#to_proc bundles up the context of the method so that the resulting Proc has access to the same variables. You can find more about the & operator here.
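To make both points concrete, here is a sketch with two hypothetical classes: Doubler shows that & accepts any object responding to to_proc, and Counter shows that a Method object converted this way still sees the instance it was taken from.

```ruby
# Any object with a to_proc method can be passed with &.
class Doubler
  def to_proc
    proc { |n| n * 2 }
  end
end

p [1, 2, 3].map(&Doubler.new) # => [2, 4, 6]

# A Method object keeps its receiver, so @step is still available
# when the method is used as a block.
class Counter
  def initialize(step)
    @step = step
  end

  def advance(n)
    n + @step
  end
end

by_five = Counter.new(5).method(:advance)
p [1, 2, 3].map(&by_five)     # => [6, 7, 8]
```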
5. grep(Class)
The way Enumerable#grep works is that it runs argument === x for all x in the enumerable. The ordering of the arguments to === is very important in this case, since it's calling Class.=== rather than Foo::Bar.===. We can see the difference between these two by running:
irb(main):043:0> Class === Foo::Bar
=> true
irb(main):044:0> Foo::Bar === Class
=> false
Module#=== (Class inherits its === method from Module) returns true when the argument is an instance of the receiver or of one of its descendants. With Class as the receiver, this filters out constants that do not refer to classes.
You can find the documentation for Module#=== here.
The first thing to know is that:
& calls to_proc on the object succeeding it and uses the proc produced as the methods' block.
Now you have to drill down to how exactly the to_proc method is implemented in a specific class.
1. Symbol
class Symbol
def to_proc
Proc.new do |obj, *args|
obj.send self, *args
end
end
end
Or something like this. From the above code you clearly see that the proc produced calls the method (with name == the symbol) on the object and passes the arguments to the method. For a quick example:
[1,2,3].reduce(&:+)
#=> 6
which does exactly that. It executes like this:
Calls :+.to_proc and gets a proc object back => #<Proc:0x007fea74028238>
It takes the proc and passes it as the block to the reduce method, thus instead of calling [1,2,3].reduce { |el1, el2| el1 + el2 } it calls
[1,2,3].reduce { |el1, el2| el1.send(:+, el2) }.
2. Method
class Method
def to_proc
Proc.new do |*args|
self.call(*args)
end
end
end
As you can see, this has a different implementation from Symbol#to_proc. To illustrate, consider the reduce example again, but now let us see how it uses a method instead:
def add(x, y); x + y end
my_proc = method(:add)
[1,2,3].reduce(&my_proc)
#=> 6
The above example is calling [1,2,3].reduce { |el1, el2| my_proc.call(el1, el2) }.
As for why map returns an Array instead of an Enumerator: it is because you are passing it a block. Without a block, it returns an Enumerator. Try this instead:
[1,2,3].map.class
#=> Enumerator
Last but not least the grep on an Array is selecting the elements that are === to its argument. Hope this clarifies your concerns.
Your sequence is equivalent to:
c_names = Foo.constants #=> [:Bar]
cs = c_names.map { |c_name| Foo.__send__(:const_get, c_name) } #=> [Foo::Bar]
cs.select{ |c| Class === c } #=> [Foo::Bar]
You can consider Object#method as (roughly):
class Object
def method(m)
lambda{ |*args| self.__send__(m, *args) }
end
end
grep is described here http://ruby-doc.org/core-1.9.3/Enumerable.html#method-i-grep
=== for Class (which is subclass of Module) is described here http://ruby-doc.org/core-1.9.3/Module.html#method-i-3D-3D-3D
UPDATE: And you need to grep because there can be other constants:
module Foo
PI = 3.14
...
end
and you probably don't need them.

Mass assignment on construction from within ruby [duplicate]

This question already has answers here:
Closed 13 years ago.
Possible Duplicate:
Idiomatic object creation in ruby
Sometimes it's useful to assign several of a constructor's arguments to instance variables on construction. Other than the obvious method:
def initialize(arg1, arg2, arg3)
@arg1, @arg2, @arg3 = arg1, arg2, arg3
end
Is there a more concise idiom for achieving the same result? Something like that found in scala for instance:
class FancyGreeter(greeting: String) {
def greet() = println(greeting)
}
Where in this case the object FancyGreeter has a default constructor that provides assignment for its passed arguments.
In Ruby 1.8, block arguments and method arguments have different semantics: method arguments have binding semantics, block arguments have assignment semantics.
What that means is that when you call a method, the method arguments get bound to the values that you pass in. When you call a block, the values get assigned to the arguments.
So, you can create some pretty crazy looking blocks that way, that seemingly don't do anything:
lambda {|@a|}.call(42)
The block body is empty, but because of the argument assignment semantics, the instance variable @a will be assigned the value 42. It works even crazier:
lambda {|foo.bar|}.call(42)
Yes, attr_writer methods work too. Or what about
foo = {}
lambda {|foo[:bar]|}.call(42)
p foo # => {:bar => 42}
Yup, those too.
And since you can define methods using blocks, you can do this:
class FancyGreeter
define_method(:initialize) {|@greeting|}
def greet; puts @greeting end
end
or even
class FancyGreeter
attr_accessor :greeting
define_method(:initialize) {|self.greeting|}
def greet; puts greeting end
end
However, I wouldn't recommend this for two reasons:
Not many Rubyists know this, be kind to the people who have to maintain the code after you.
In Ruby 1.9 and onwards, block argument semantics are gone; blocks also use method argument semantics, therefore this no longer works.
I suppose you could do....
def initialize *e
@a, @b, @c = e
end
I don't know about "better" but there are varying levels of 'clever':
def initialize args={}
args.each do |key, value|
instance_variable_set "@#{key}", value
end
end
But "clever" is usually dangerous when you program :-)
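For reference, here is how the hash-based initializer might look in use, written with the @ prefix that instance variable names require. The Greeter class and its greet method are invented for this sketch. Note that any key in the hash becomes an instance variable, which is part of why this kind of cleverness can be risky.

```ruby
class Greeter
  def initialize(args = {})
    # Each key/value pair becomes an instance variable, e.g. @greeting.
    args.each do |key, value|
      instance_variable_set("@#{key}", value)
    end
  end

  def greet
    "#{@greeting}, #{@name}!"
  end
end

p Greeter.new(greeting: 'Hello', name: 'Matz').greet # => "Hello, Matz!"
```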
Edit: Given the edited question, I'll add this:
class PickMe
def initialize say="what?"
@say = say
end
end
Just because I don't know if you're aware of default options. Otherwise, think of the value of self-documenting code. A cleanly-written 'initialize' method is priceless.
It was either Andy Hunt or Dave Thomas who proposed that Ruby should be able to handle this syntax for initializing member variables from constructor arguments:
def initialize(@a, @b, @c)
...
end
Matz did not accept their proposal; I don't remember why.
