Ruby does not have first-class functions; although it has procs and lambdas, these notoriously require significant overhead. (Python has first-class functions, apparently without the overhead.)
It occurred to me that first-class functions can be simulated with a little more work by using anonymous classes, as follows:
f = Class.new { def self.f; puts 'hi'; end }
def g(fun); fun; end
g(f.f)
# prints "hi"
Does anyone know a better way?
In fact, Ruby doesn't have functions at all, only methods. So if you want to pass a method to another method, you can do this:
def g(f)
f.call
end
g('123'.method(:to_i))
This is less concise than Python, but it's the price that Ruby has to pay for the ability to omit parentheses in method calls. I think omitting parentheses is one of the things that makes Ruby shine, because it makes implementing DSLs in pure Ruby a lot easier.
Ruby has procs and lambdas (both instances of the Proc class), and Methods, all of which approximate first-class functions. Lambdas are the closest to a true first-class function: they check the number of arguments when called and create a new call context such that return just returns from the lambda. In contrast, procs are just reified blocks of code; they don't check their number of arguments, and a return causes the enclosing method to return, not just the proc.
Method objects allow you to store an uncalled method in a variable, complete with implied invocant. There's no syntax for creating an anonymous Method, but you said first-class functions, not anonymous ones. Other than the invocant, they are basically lambdas whose body is that of the referenced method.
I'm not sure what an anonymous class gets you that is better than the above solutions, but it is certainly further away from a true first-class function. It's more like the way we had to approximate them in Java before closures were added to the language.
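To make the differences described above concrete, here is a minimal sketch. The names strict, loose and upcase are made up for illustration. It shows a lambda enforcing its argument count, a plain proc tolerating a mismatch, and a Method object behaving like a lambda that remembers its receiver.

```ruby
# A lambda checks its argument count, much like a method does.
strict = lambda { |a, b| [a, b] }
# A plain proc pads missing arguments with nil and drops extras.
loose = proc { |a, b| [a, b] }

strict.call(1, 2)      # => [1, 2]
loose.call(1)          # => [1, nil]
loose.call(1, 2, 3)    # => [1, 2]

begin
  strict.call(1)       # one argument too few
rescue ArgumentError => e
  arity_error = e.class # => ArgumentError
end

# A Method object carries its receiver ("hello") along with it.
upcase = "hello".method(:upcase)
upcase.call            # => "HELLO"
upcase.to_proc.lambda? # => true
```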
As I understand it, return inside a Proc terminates the current method. So in the following example I would expect to see:
a1 > b1 > proc > a2. But it never actually reaches a2. Why?
def a
puts "a1"
l = Proc.new {puts "proc"; return}
b l
puts "a2"
end
def b x
puts "b1"
x.call
puts "b2"
end
a
As a general rule, return always returns from the closest lexically enclosing method definition expression.
In this case, the closest lexically enclosing method definition expression is def a, therefore, return returns from a.
It does not actually matter that the return is inside a block in this case. The general rule is, well, general, so it applies regardless of where the return appears.
If we look more specifically at blocks, though, we can see that it still makes sense: in blocks, local variables are captured lexically, self is captured lexically, so it makes sense that return also behaves lexically. It is a general property of blocks that if you want to understand what is going on in a block, you only need to look lexically outwards.
And if we get even more specific, first going from the general rule to blocks, and now from blocks to Procs, the behavior still makes sense: a Proc is essentially a reified block, so it makes sense for a Proc to behave like a block.
There are some exceptions to the general rule, though, and an important one is lambdas. Talking about lambdas in Ruby is always a little bit weird because lambdas are Procs but they behave differently from Procs. IMO, lambdas should have a separate class alongside Procs. Since lambdas are Procs, it is awkward to talk about the differences between lambdas and Procs that are not lambdas (which don't have a standardized name and thus are confusingly also called Procs).
The behavior of a lambda differs from the behavior of a non-lambda Proc in two ways, one of which is relevant to your question:
Parameter binding in non-lambda Procs has the same semantics as parameter binding in blocks, whereas parameter binding in lambdas has the same semantics as parameter binding in message sends / method invocations.
In non-lambda Procs, return returns from the closest lexically enclosing method definition expression, just like in blocks, whereas in lambdas, return returns from the lambda itself, just like return in methods.
So, in both of these aspects, non-lambda Procs behave like blocks and lambdas behave like methods. I memorize it like this: "Proc" rhymes with "block" and both "lambda" and "method" are Greek.
As you probably know, there are some methods which also alter the behavior of blocks that are passed to them. E.g. instance_eval and instance_exec change the value of self, and define_method actually does change the behavior of return.
But since you didn't ask about blocks in general, and also didn't ask about lambdas specifically, and there are no reflective methods in your question, the general rule still applies to non-lambda Procs like the one shown in your question: return returns from the closest lexically enclosing method definition expression.
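As a rough illustration of the rules above, the following sketch records what runs when the same kind of helper is handed a non-lambda Proc and then a lambda. The method names run, with_proc and with_lambda and the class ReturnDemo are hypothetical, chosen only for this example.

```ruby
# Hypothetical helper: calls whatever it is given, then notes that it resumed.
def run(callable, log)
  callable.call
  log << "run resumed"
end

def with_proc(log)
  run(Proc.new { log << "proc"; return }, log)  # return leaves with_proc itself
  log << "after proc"                            # never reached
end

def with_lambda(log)
  run(lambda { log << "lambda"; return }, log)  # return leaves only the lambda
  log << "after lambda"
end

proc_log = []
with_proc(proc_log)
proc_log    # => ["proc"]

lambda_log = []
with_lambda(lambda_log)
lambda_log  # => ["lambda", "run resumed", "after lambda"]

# define_method changes what return means inside the block:
# here it returns from the generated method.
class ReturnDemo
  define_method(:early) { return :returned_from_method }
end
ReturnDemo.new.early  # => :returned_from_method
```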
Instead of supporting method overloading Ruby overwrites existing methods. Can anyone explain why the language was designed this way?
"Overloading" is a term that simply doesn't even make sense in Ruby. It is basically a synonym for "static argument-based dispatch", but Ruby doesn't have static dispatch at all. So, the reason why Ruby doesn't support static dispatch based on the arguments, is because it doesn't support static dispatch, period. It doesn't support static dispatch of any kind, whether argument-based or otherwise.
Now, if you are not actually specifically asking about overloading, but maybe about dynamic argument-based dispatch, then the answer is: because Matz didn't implement it. Because nobody else bothered to propose it. Because nobody else bothered to implement it.
In general, dynamic argument-based dispatch in a language with optional arguments and variable-length argument lists is very hard to get right, and even harder to keep understandable. Even in languages with static argument-based dispatch and without optional arguments (like Java, for example), it is sometimes almost impossible for a mere mortal to tell which overload is going to be picked.
In C#, you can actually encode any 3-SAT problem into overload resolution, which means that overload resolution in C# is NP-hard.
Now try that with dynamic dispatch, where you have the additional time dimension to keep in your head.
There are languages which dynamically dispatch based on all arguments of a procedure, as opposed to object-oriented languages, which only dispatch on the "hidden" zeroth self argument. Common Lisp, for example, dispatches on the dynamic types and even the dynamic values of all arguments. Clojure dispatches on an arbitrary function of all arguments (which BTW is extremely cool and extremely powerful).
But I don't know of any OO language with dynamic argument-based dispatch. Martin Odersky said that he might consider adding argument-based dispatch to Scala, but only if he can remove overloading at the same time and be backwards-compatible both with existing Scala code that uses overloading and compatible with Java (he especially mentioned Swing and AWT which play some extremely complex tricks exercising pretty much every nasty dark corner case of Java's rather complex overloading rules). I've had some ideas myself about adding argument-based dispatch to Ruby, but I never could figure out how to do it in a backwards-compatible manner.
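To give a feel for what dispatching on all arguments could look like, here is a toy sketch in plain Ruby. It is not how Common Lisp or Clojure implement it and not a proposal for Ruby itself; the class MultiMethod and its on and call methods are invented for this example. Each rule is a list of patterns, and every argument is tested with ===, so a rule can match on a class or on a concrete value.

```ruby
# Hypothetical illustration only: picks an implementation by testing
# every argument against a pattern with ===.
class MultiMethod
  def initialize
    @rules = []
  end

  # Register a body for a list of patterns (classes, ranges, or plain values).
  def on(*patterns, &body)
    @rules << [patterns, body]
    self
  end

  # Run the first registered rule whose patterns match all arguments.
  def call(*args)
    _, body = @rules.find do |patterns, _|
      patterns.size == args.size &&
        patterns.zip(args).all? { |pattern, arg| pattern === arg }
    end
    raise ArgumentError, "no rule matches #{args.inspect}" unless body
    body.call(*args)
  end
end

describe_value = MultiMethod.new
describe_value.on(0)               { "zero" }
describe_value.on(Integer)         { |n| "integer #{n}" }
describe_value.on(String, Integer) { |s, n| s * n }

describe_value.call(0)        # => "zero"
describe_value.call(7)        # => "integer 7"
describe_value.call("ab", 2)  # => "abab"
```

Even in this tiny version, the result depends on the order in which rules were registered, which hints at why keeping such dispatch understandable is hard.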
Method overloading can be achieved by declaring two methods with the same name and different signatures. The signatures can differ in either of two ways:
Arguments with different data types, eg: method(int a, int b) vs method(String a, String b)
Variable number of arguments, eg: method(a) vs method(a, b)
We cannot achieve method overloading the first way because there are no data type declarations in Ruby (it is a dynamically typed language). So the only way to define the above method is def method(a, b).
With the second option, it might look like we can achieve method overloading, but we can't. Let's say I have two methods with different numbers of arguments:
def method(a); end;
def method(a, b = true); end; # second argument has a default value
method(10)
# Now the method call can match the first one as well as the second one,
# so here is the problem.
So Ruby needs to keep a single method per name in the method lookup chain.
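A small sketch of what actually happens when the same name is defined twice (the class Greeter is hypothetical): the second definition replaces the first, and only one method remains.

```ruby
class Greeter
  def greet(name)
    "one argument: #{name}"
  end

  # Same name: this definition replaces the one above instead of overloading it.
  def greet(name, punctuation = "!")
    "two parameters: #{name}#{punctuation}"
  end
end

greeter = Greeter.new
greeter.greet("Ann")                   # => "two parameters: Ann!"
Greeter.instance_method(:greet).arity  # => -2 (one required, one optional)
Greeter.instance_methods(false)        # => [:greet]
```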
I presume you are looking for the ability to do this:
def my_method(arg1)
..
end
def my_method(arg1, arg2)
..
end
Ruby supports this in a different way:
def my_method(*args)
if args.length == 1
#method 1
else
#method 2
end
end
A common pattern is also to pass in options as a hash:
def my_method(options)
if options[:arg1] and options[:arg2]
#method 2
elsif options[:arg1]
#method 1
end
end
my_method arg1: 'hello', arg2: 'world'
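On Ruby 2.1 or later, the same idea can be sketched with keyword arguments instead of a raw hash. This is a hypothetical variant of my_method, not part of the original answer; the returned strings stand in for the two method bodies.

```ruby
# Hypothetical rewrite using keyword arguments:
# arg1 is required, arg2 is optional.
def my_method(arg1:, arg2: nil)
  if arg2
    "method 2: #{arg1} #{arg2}"
  else
    "method 1: #{arg1}"
  end
end

my_method(arg1: 'hello', arg2: 'world')  # => "method 2: hello world"
my_method(arg1: 'hello')                 # => "method 1: hello"
```

Leaving out arg1 raises an ArgumentError, which the hash version would silently ignore.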
Method overloading makes sense in a language with static typing, where you can distinguish between different types of arguments
f(1)
f('foo')
f(true)
as well as between different number of arguments
f(1)
f(1, 'foo')
f(1, 'foo', true)
The first distinction does not exist in Ruby. Ruby uses dynamic typing or "duck typing". The second distinction can be handled by default arguments or by working with a variable-length argument list:
def f(n, s = 'foo', flux_compensator = true)
...
end
def f(*args)
case args.size
when 1
...
when 2
...
when 3
...
end
end
This doesn't answer the question of why ruby doesn't have method overloading, but third-party libraries can provide it.
The contracts.ruby library allows overloading. Example adapted from the tutorial:
class Factorial
include Contracts
Contract 1 => 1
def fact(x)
x
end
Contract Num => Num
def fact(x)
x * fact(x - 1)
end
end
# try it out
Factorial.new.fact(5) # => 120
Note that this is actually more powerful than Java's overloading, because you can specify values to match (e.g. 1), not merely types.
You will see decreased performance using this though; you will have to run benchmarks to decide how much you can tolerate.
I often use the following structure:
def method(param)
case param
when String
method_for_String(param)
when Type1
method_for_Type1(param)
...
else
#default implementation
end
end
This allows the user of the object to call the clean and clear method name: method
But if they want to optimise execution, they can call the correct method directly.
It also makes your tests clearer and better.
There are already great answers on the why side of the question. However, if anyone is looking for other solutions, check out the functional-ruby gem, which is inspired by Elixir's pattern matching features.
class Foo
include Functional::PatternMatching
## Constructor overloading
defn(:initialize) { @name = 'baz' }
defn(:initialize, _) {|name| @name = name.to_s }
## Method Overloading
defn(:greet, :male) {
puts "Hello, sir!"
}
defn(:greet, :female) {
puts "Hello, ma'am!"
}
end
foo = Foo.new # or Foo.new('Bar')
foo.greet(:male) # prints "Hello, sir!"
foo.greet(:female) # prints "Hello, ma'am!"
I came across this nice interview with Yukihiro Matsumoto (a.k.a. "Matz"), the creator of Ruby. Incidentally, he explains his reasoning and intention there. It is a good complement to @nkm's excellent illustration of the problem. I have highlighted the parts that answer your question on why Ruby was designed that way:
Orthogonal versus Harmonious
Bill Venners: Dave Thomas also claimed that if I ask you to add a
feature that is orthogonal, you won't do it. What you want is
something that's harmonious. What does that mean?
Yukihiro Matsumoto: I believe consistency and orthogonality are tools
of design, not the primary goal in design.
Bill Venners: What does orthogonality mean in this context?
Yukihiro Matsumoto: An example of orthogonality is allowing any
combination of small features or syntax. For example, C++ supports
both default parameter values for functions and overloading of
function names based on parameters. Both are good features to have in
a language, but because they are orthogonal, you can apply both at the
same time. The compiler knows how to apply both at the same time. If
it's ambiguous, the compiler will flag an error. But if I look at the
code, I need to apply the rule with my brain too. I need to guess how
the compiler works. If I'm right, and I'm smart enough, it's no
problem. But if I'm not smart enough, and I'm really not, it causes
confusion. The result will be unexpected for an ordinary person. This
is an example of how orthogonality is bad.
Source: "The Philosophy of Ruby", A Conversation with Yukihiro Matsumoto, Part I
by Bill Venners, September 29, 2003 at: https://www.artima.com/intv/ruby.html
Statically typed languages support method overloading, which is resolved by binding the call at compile time. Ruby, on the other hand, is a dynamically typed language and does not support static binding at all. In languages with optional arguments and variable-length argument lists, it is also difficult to determine which method would be invoked under dynamic argument-based dispatch. Additionally, Ruby is implemented in C, which itself does not support function overloading.
In Ruby 1.8.7 and prior, Enumerable::each_with_index did not accept any arguments. In Ruby 1.9, it accepts an arbitrary number of arguments. The documentation and code show that it simply passes those arguments along to ::each. With the built-in and standard library Enumerables, I believe passing an argument will yield an error, since the Enumerable's ::each method isn't expecting parameters.
So I would guess this is only useful in creating your own Enumerable in which you do create an ::each method that accepts arguments. What is an example where this would be useful?
Are there any other non-obvious consequences of this change?
I went through the code of some gems and found almost no uses of that feature. One that does use it is spreadsheet:
def each skip=dimensions[0], &block
skip.upto(dimensions[1] - 1) do |idx|
block.call row(idx)
end
end
I don't really see that as an important change: #each is the base method for classes that mix in the Enumerable module, and the methods it adds (map, select, ...) do not accept arguments.
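For what it's worth, here is a minimal sketch of a custom Enumerable whose each takes an argument, in the spirit of the spreadsheet example. The class Rows and its skip parameter are hypothetical. It shows each_with_index forwarding its argument to each.

```ruby
# Hypothetical container whose #each takes an optional number of rows to skip.
class Rows
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  def each(skip = 0)
    return enum_for(:each, skip) unless block_given?
    @rows.drop(skip).each { |row| yield row }
    self
  end
end

rows = Rows.new(%w[header first second])

pairs = []
# The argument 1 is passed through to Rows#each as skip.
rows.each_with_index(1) { |row, index| pairs << [row, index] }
pairs     # => [["first", 0], ["second", 1]]

rows.map(&:upcase)  # => ["HEADER", "FIRST", "SECOND"] (each called without arguments)
```

Note that the index counts the elements actually yielded, so it starts at 0 even though the first row was skipped.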
I have a couple questions about Ruby's methods, procedures, and blocks that strike me as rather odd. They're not so much about syntax or function as the logic behind the decisions made.
Question 1:
Why is it that blocks can be passed to methods (e.g. each) but they cannot be assigned to a variable?
I know you can pass them around in procedures, i.e. p = Proc.new {...} (accessed with &p), but it doesn't make much sense to make the programmer go through these means.
Question 2:
Why is there a differentiation between methods and procedures?
For instance, I can accomplish the task of defining a function and calling that function in the following two ways:
def square(x)
x**2
end
square(3)
=> 9
or
square = lambda {|x| x**2}
square.call(3)
=> 9
Why the differentiation? In Python for example both defining a function in the standard way and by square = lambda x: x**2 accomplish the same task of creating the function and assigning it to square.
Question 1: Blocks are not objects, they are syntactic structures; this is why they cannot be assigned to a variable. This is a privilege reserved for objects.
Question 2: Methods are not objects, so they cannot receive messages. Conversely, procs and lambdas are objects, so they cannot be invoked like methods, but must receive a message that tells them to return a value on the basis of the parameters passed with the message.
Procs and Lambdas are objects, so they can receive the call message and be assigned to names. To summarize, it is being an object that makes procs and lambdas behave in ways you find odd. Methods and blocks are not objects and don't share that behavior.
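A short sketch of the distinction (the method name capture is made up): a block only becomes something you can store once it has been converted into a Proc object, for example with an & parameter.

```ruby
# &block turns the block passed by the caller into a Proc object.
def capture(&block)
  block
end

doubler = capture { |x| x * 2 }
doubler.class            # => Proc
doubler.call(4)          # => 8

# The & prefix converts the Proc back into a block for another method.
[1, 2, 3].map(&doubler)  # => [2, 4, 6]
```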
To some extent at least, methods are objects:
class ABC
def some_method
end
end
ABC.instance_method(:some_method) #=> #<UnboundMethod: ABC#some_method>
Further to that, there is a built-in class: Method, as documented here.
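Here is a small sketch of both classes in use. The body of some_method is filled in purely for illustration. An UnboundMethod has no receiver until it is bound to an instance, while a Method already has one and can be called or passed as a block.

```ruby
class ABC
  def some_method
    "called on #{self.class}"
  end
end

unbound = ABC.instance_method(:some_method)  # UnboundMethod: no receiver yet
bound   = unbound.bind(ABC.new)              # Method: receiver attached
bound.call                                   # => "called on ABC"

plus_ten = 10.method(:+)                     # Method object for 10.+
plus_ten.call(5)                             # => 15
[1, 2, 3].map(&plus_ten)                     # => [11, 12, 13]
```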
See also this: http://en.wikibooks.org/wiki/Ruby_Programming/Syntax/Method_Calls
Haphazardly <bseg>, it does rather seem to bear out the everything-is-an-object thing. In this particular case, it just appears to take a little more digging to see.
(I really must make an effort to understand this better: I'm starting to think it's fundamental to getting a deeper understanding.)
Methods are methods — that is, they're actions that an object can take in response to messages. They are not functions.
Blocks are closures — they're functions that close over the enclosing scope. They don't conceptually "belong to" a given object.
In some languages, methods are merely functions that are members of an object, but Ruby does not view them this way. Separating a method from its owning object is more akin to surgery than simple assignment. Ruby takes its object-orientation model from Smalltalk, the granddaddy of modern OO.
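The contrast can be sketched like this (make_counter is a hypothetical name): a lambda created from a block keeps access to the local variables around it, while a Method object stays tied to the object it came from and cannot simply be attached to an unrelated one.

```ruby
# The lambda closes over the local variable `count` of make_counter.
def make_counter
  count = 0
  -> { count += 1 }
end

counter = make_counter
counter.call  # => 1
counter.call  # => 2

# A Method object remembers its receiver.
upcase = "abc".method(:upcase)
upcase.receiver  # => "abc"
upcase.call      # => "ABC"

# Detaching it and attaching it to an unrelated object is refused.
begin
  upcase.unbind.bind(123)
rescue TypeError => e
  rebind_error = e.class  # => TypeError
end
```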
I just wondered whether there is any good reason for or even an advantage in having to invoke Procs using proc.call(args) in Ruby, which makes higher-order function syntax much more verbose and less intuitive.
Why not just proc(args)? Why draw a distinction between functions, lambdas and blocks? Basically, it's all the same thing so why this confusing syntax? Or is there any point for it I don't realize?
You need some way to distinguish between calling the Proc and passing it around.
In Python and ECMAScript, it's simple: with parentheses it's a call, without it's not. In Ruby, leaving off the parentheses is also a call, therefore, there must be some other way to distinguish.
In Ruby 1.8, Proc#call and its alias Proc#[] serve that distinction. As of Ruby 1.9, obj.(arg) is syntactic sugar for obj.call(arg) and Proc#() is also an alias for Proc#call.
So, you can call a Proc like this:
foo.call(1, 2, 3)
foo[1, 2, 3]
foo.(1, 2, 3)
You can even make the .() syntax work for your own classes by defining a call method.
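A minimal sketch, with a made-up class Doubler: any object that responds to call can be invoked with the .() shorthand, while the [] form shown above is something Proc defines in addition.

```ruby
# Hypothetical class: responding to #call is enough for the .() syntax.
class Doubler
  def call(x)
    x * 2
  end
end

double = Doubler.new
double.call(21)  # => 42
double.(21)      # => 42

# A Proc additionally supports the [] form.
add = proc { |a, b| a + b }
add[1, 2]        # => 3
```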
BTW: the same problem is also why you have to use the method method to get a hold of a method object.
In Ruby you can have a local variable and a method that are both named foo. Assuming the method is private, the only way to call it would be foo(args) (self.foo(args) would not work for private methods, which can't have an explicit receiver). If Ruby allowed overloading the () operator, so that the foo in foo(bar) could be a variable, there would be no way to call the private method foo when there is also a local variable named foo.
Note that with features like define_method and method_missing, it is not always possible to avoid situations where you have methods and local variables of the same name.
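The name clash can be sketched as follows (foo and demo are placeholder names). With a method and a local variable both called foo, parentheses select the method, the bare name is the variable, and an explicit call is needed to invoke the lambda stored in it.

```ruby
def foo(x = 0)
  "method foo(#{x})"
end

def demo
  foo = ->(x) { "lambda foo(#{x})" }  # local variable with the same name
  [
    foo(1),       # parentheses: calls the method
    foo.call(1),  # explicit call: invokes the lambda held in the local
    foo.class     # bare name: just the local variable, nothing is called
  ]
end

demo  # => ["method foo(1)", "lambda foo(1)", Proc]
```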
You want to be able to pass it around without calling it, right? Requiring that it be explicitly called allows that. Otherwise, every time you tried to use the proc as a parameter, you would end up calling it.