Procedural and Data abstraction in ruby - ruby

I'm new to Ruby and I'm learning the abstraction principle. As I understand it, procedural abstraction means hiding the implementation details from the user, or simply concentrating on the essentials and ignoring the details.
My concern is how to implement it.
1) Is it just a simple function call like this?
# Sorts an array in place using bubble sort.
# @param array [Array] the array to sort
def my_sort(array)
  return array if array.size <= 1
  swapped = true
  # Keep making passes until a full pass performs no swap.
  while swapped
    swapped = false
    0.upto(array.size - 2) do |i|
      if array[i] > array[i + 1]
        array[i], array[i + 1] = array[i + 1], array[i]
        swapped = true
      end
    end
  end
  array
end
and calling like this
sorted_array = my_sort([12,34,123,43,90,1])
2) How does data abstraction differ from encapsulation?
As I understand it, data abstraction is just hiding some member data from other classes.

Data abstraction is fundamental to most object-oriented languages: classes are designed to encapsulate data and provide methods that control how that data is modified (if at all), as well as helper methods that derive meaning from that data.
Ruby's Array class is an example of Data Abstraction. It provides a mechanism to manage an array of Objects, and provides operations that can be performed on that array, without you having to care how internally it is organized.
arr = [1,3,4,5,2,10]
p arr.class # Prints Array
p arr.sort # Prints [1,2,3,4,5,10]
Procedural abstraction is about hiding the implementation details of a procedure from the user. In the above example, you don't need to know which sorting algorithm the sort method uses internally; you just use it, assuming that the nice folks on the Ruby core team picked a good one for you.
At the same time, Ruby may not always know how to compare two items in the Array. For example, the code below does not run because Ruby does not know how to compare strings and numbers.
[1,3,4,5,"a","c","b", 2,10].sort
#=> `sort': comparison of Fixnum with String failed (ArgumentError)
Ruby allows us to hook into the implementation and help with the comparison, even though the underlying sorting algorithm remains the same (as it is abstracted from the user):
[1,3,4,5,"a","c","b", 2,10].sort { |i, j|
  if i.is_a?(String) && j.is_a?(String)
    i <=> j
  elsif i.is_a?(Integer) && j.is_a?(Integer) # Fixnum was merged into Integer in Ruby 2.4
    i <=> j
  else
    0
  end
}
#=> [1, 3, 4, 5, 2, 10, "a", "b", "c"]
When writing code for your own problems, you can use procedural abstraction by breaking a procedure's problem down into sub-problems and solving each sub-problem in a separate procedure. This allows certain aspects to be extended later (in the case above, the comparison could be extended, and Ruby blocks made that much easier). The Template Method pattern is a good technique for achieving this.
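As a rough illustration of the Template Method idea mentioned above, here is a minimal sketch. The class names `Report` and `SalesReport` and their methods are hypothetical and not taken from any library; the point is only that the overall procedure (`render`) is fixed while one step (`body`) is left to subclasses.

```ruby
# Hypothetical base class: the overall procedure is fixed, one step is overridable.
class Report
  # The "template method": callers only need to know about #render.
  def render
    [header, body, footer].join("\n")
  end

  private

  def header
    "== Report =="
  end

  # Sub-problem that each subclass solves in its own way.
  def body
    raise NotImplementedError, "#{self.class} must implement #body"
  end

  def footer
    "== End =="
  end
end

# Hypothetical subclass that fills in only the missing step.
class SalesReport < Report
  private

  def body
    "Sales: 42 units"
  end
end

puts SalesReport.new.render
```

The caller never sees how the header, body, and footer are produced, which is the procedural abstraction the answer describes.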

You are returning an array from the method. Data structures are implementation details. If you change the data structure used in the method, you will break the client code. So your example does not hide the implementation details. It does not encapsulate the design decisions in a way that insulates clients from the internal implementation details.

Definition of 'Abstraction' : the quality of dealing with ideas rather than events.
Referring to this answer, difference between abstraction and encapsulation?, and from my own understanding, the method my_sort in your code does demonstrate encapsulation, as it encapsulates the behavior related to sorting any one-dimensional array. However, it lacks abstraction, because my_sort knows the type of data it is going to process.
It would have demonstrated abstraction if it did not know or care about the type of data that comes in via its parameters. In other words, it should sort any objects that come in, whether the list holds Fixnum, String, or other sortable data types.
Encapsulation:
We normally use access modifiers (public, private, ...) to separate the data and behavior that are exposed to clients from those that are used internally. The public interface (exposed to clients) should change as little as possible. The private parts, however, are behaviors that can change, and such changes should never affect the behavior that clients rely on.
We also move sensitive data and behavior to private/protected scope to prevent accidental modification or misuse. This keeps clients from relying on the portions of the code that might change frequently.
So one should always segregate the core logic into private scope.
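To make the public/private split concrete, here is a small sketch. The `Account` class and its method names are hypothetical; it only illustrates that clients can use the public interface while the private helper stays out of reach of ordinary method calls.

```ruby
# Hypothetical class used only to illustrate access modifiers.
class Account
  def initialize(balance)
    @balance = balance
  end

  # Public interface: what clients are allowed to rely on.
  def withdraw(amount)
    raise ArgumentError, "insufficient funds" unless sufficient_funds?(amount)
    @balance -= amount
  end

  def balance
    @balance
  end

  private

  # Internal rule: free to change without affecting clients.
  def sufficient_funds?(amount)
    amount <= @balance
  end
end

account = Account.new(100)
account.withdraw(30)
puts account.balance # 70

begin
  account.sufficient_funds?(10) # private, so an explicit receiver is rejected
rescue NoMethodError => e
  puts "rejected: #{e.class}"
end
```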
Abstraction:
Example:
In case of church there is an abstraction between the confessor and the father / priest. The confessor should not have any idea about the name or any detail of the priest and vice-versa. Anyone can confess and yet hide his/her identity no matter how big mistakes/crimes he/she had committed.

Related

What happens when a method is used on an object created from a built in class?

I understand that classes are like a mold from which you can create objects, and that a class defines a number of methods and variables (class, instance, local...) inside of it.
Let's say we have a class like this:
class Person
  def initialize(name, age)
    @name = name
    @age = age
  end

  def greeting
    "#{@name} says hi to you!"
  end
end
me = Person.new "John", 34
puts me.greeting
As I understand it, when we call Person.new we are creating an object of class Person and initializing some internal attributes for that object, which will be stored in the instance variables @name and @age. The variable me will then be a reference to this newly created object.
When we call me.greeting, the greeting method is called on the object referenced by me, and that method will use the instance variable @name that is directly tied/attached to that object.
Hence, when calling a method on an object you are actually "talking" to that object, inspecting and using its attributes that are stored in its instance variables. All good for now.
Let's say now that we have the string "hello". We created it using a string literal, just like: string = "hello".
My question is, when creating an object from a built in class (String, Array, Integer...), are we actually storing some information on some instance variables for that object during its creation?
My doubt arises because I can't understand what happens when we call something like string.upcase, how does the #upcase method "work" on string? I guess that in order to return the string in uppercase, the string object previously declared has some instance variables attached to, and the instances methods work on those variables?
Hence, when calling a method on an object you are actually "talking" to that object, inspecting and using its attributes that are stored in its instance variables. All good for now.
No, that is very much not what you are doing in an Object-Oriented Program. (Or really any well-designed program.)
What you are describing is a break of encapsulation, abstraction, and information hiding. You should never inspect and/or use another object's instance variables or any of its other private implementation details.
In Object-Orientation, all computation is performed by sending messages between objects. The only thing you can do is sending messages to objects and the only thing you can observe about an object is the responses to those messages.
Only the object itself can inspect and use its attributes and instance variables. No other object can, not even objects of the same type.
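A short sketch of what this looks like in Ruby, reusing a cut-down version of the Person class from the question. Note the assumption: this shows only ordinary message sends. Ruby also ships reflection methods such as instance_variable_get that can bypass the boundary, so this illustrates the principle and is not a hard guarantee.

```ruby
class Person
  def initialize(name)
    @name = name
  end

  # The only way outsiders can learn anything derived from @name.
  def greeting
    "#{@name} says hi to you!"
  end
end

me = Person.new("John")
puts me.greeting           # the object answers the message itself
puts me.respond_to?(:name) # false: no message exposes @name directly
```

From the outside, all that is known is that greeting returns a string; whether it comes from an instance variable or somewhere else is not observable through the public interface.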
If you send an object a message and you get a response, the only thing you know is what is in that response. You don't know how the object created that response: did the object compute the answer on the fly? Was the answer already stored in an instance variable and the object just responded with that? Did the object delegate the problem to a different object? Did it print out the request, fax it to a temp agency in the Philippines, and have a worker compute the answer by hand with pen and paper? You don't know. You can't know. You mustn't know. That is at the very heart of Object-Orientation.
This is, BTW, exactly how messaging works in real-life. If you send someone a message asking "what is π²" and they answer with "9.8696044011", then you have no idea whether they computed this by hand, used a calculator, used their smart phone, looked it up, asked a friend, or hired someone to answer the question for them.
You can imagine objects as being little computers themselves: they have internal storage, RAM, HDD, SSD, etc. (instance variables), they have code running on them, the OS, the basic system libraries, etc. (methods), but one computer cannot read another computer's RAM (access its instance variables) or run its code (execute its methods). It can only send it a request over the network and look at the response.
So, in some sense, your question is meaningless: from the point of view of Object-Oriented Abstraction, it should be impossible to answer your question, because it should be impossible to know how an object is implemented internally.
It could use instance variables, or it could not. It could be implemented in Ruby, or it could be implemented in another programming language. It could be implemented as a standard Ruby object, or it could be implemented as some secret internal private part of the Ruby implementation.
In fact, it could even not exist at all! (For example, in many Ruby implementations small integers do not actually exist as objects at all. The Ruby implementation will just make it look like they do.)
My question is, when creating an object from a built in class (String, Array, Integer...), are we actually storing some information on some instance variables for that object during its creation?
[…] [W]hat happens when we call something like string.upcase, how does the #upcase method "work" on string? I guess that in order to return the string in uppercase, the string object previously declared has some instance variables attached to, and the instances methods work on those variables?
There is nothing in the Ruby Language Specification that says how the String#upcase method is implemented. The Ruby Language Specification only says what the result is, but it doesn't say anything about how the result is computed.
Note that this is not specific to Ruby. This is how pretty much every programming language works. The Specification says what the results should be, but the details of how to compute those results is left to the implementor. By leaving the decision about the internal implementation details up to the implementor, this frees the implementor to choose the most efficient, most performant implementation that makes sense for their particular implementation.
For example, in the Java platform, there are existing methods available for converting a string to upper case. Therefore, in an implementation like TruffleRuby, JRuby, or XRuby, which sits on top of the Java platform, it makes sense to just call the existing Java methods for converting strings to upper case. Why waste time implementing an algorithm for converting strings to upper case when somebody else has already done that for you? Likewise, in an implementation like IronRuby or Ruby.NET, which sit on top of the .NET platform, you can just use .NET's builtin methods for converting strings to upper case. In an implementation like Opal, you can just use ECMAScript's methods for converting strings to upper case. And so on.
Unfortunately, unlike many other programming languages, the Ruby Language Specification does not exist as a single document in a single place. Ruby does not have a single formal specification that defines what certain language constructs mean.
There are several resources, the sum of which can be considered kind of a specification for the Ruby programming language.
Some of these resources are:
The ISO/IEC 30170:2012 Information technology — Programming languages — Ruby specification – Note that the ISO Ruby Specification was written around 2009–2010 with the specific goal that all existing Ruby implementations at the time would easily be compliant. Since YARV only implements Ruby 1.9+ and MRI only implements Ruby 1.8 and lower, this means that the ISO Ruby Specification only contains features that are common to both Ruby 1.8 and Ruby 1.9. Also, the ISO Ruby Specification was specifically intended to be minimal and only contain the features that are absolutely required for writing Ruby programs. Because of that, it does for example only specify Strings very broadly (since they have changed significantly between Ruby 1.8 and Ruby 1.9). It obviously also does not specify features which were added after the ISO Ruby Specification was written, such as Ractors or Pattern Matching.
The Ruby Spec Suite aka ruby/spec – Note that the ruby/spec is unfortunately far from complete. However, I quite like it because it is written in Ruby instead of "ISO-standardese", which is much easier to read for a Rubyist, and it doubles as an executable conformance test suite.
The Ruby Programming Language by David Flanagan and Yukihiro 'matz' Matsumoto – This book was written by David Flanagan together with Ruby's creator matz to serve as a Language Reference for Ruby.
Programming Ruby by Dave Thomas, Andy Hunt, and Chad Fowler – This book was the first English book about Ruby and served as the standard introduction and description of Ruby for a long time. This book also first documented the Ruby core library and standard library, and the authors donated that documentation back to the community.
The Ruby Issue Tracking System, specifically, the Feature sub-tracker – However, please note that unfortunately, the community is really, really bad at distinguishing between Tickets about the Ruby Programming Language and Tickets about the YARV Ruby Implementation: they both get intermingled in the tracker.
The Meeting Logs of the Ruby Developer Meetings.
New features are often discussed on the mailing lists, in particular the ruby-core (English) and ruby-dev (Japanese) mailing lists.
The Ruby documentation – Again, be aware that this documentation is generated from the source code of YARV and does not distinguish between features of Ruby and features of YARV.
In the past, there were a couple of attempts of formalizing changes to the Ruby Specification, such as the Ruby Change Request (RCR) and Ruby Enhancement Proposal (REP) processes, both of which were unsuccessful.
If all else fails, you need to check the source code of the popular Ruby implementations to see what they actually do.
For example, this is what the ISO/IEC 30170:2012 Information technology — Programming languages — Ruby specification has to say about String#upcase:
15.2.10.5.42 String#upcase
upcase
Visibility: public
Behavior: The method returns a new direct instance of the class String which contains all the characters of the receiver, with all the lower-case characters replaced with the corresponding upper-case characters.
As you can see, there is no mention of instance variables or really any details at all about how the method is implemented. It only specifies the result.
If a Ruby implementor wants to use instance variables, they are allowed to use instance variables; if a Ruby implementor doesn't want to use instance variables, they are allowed to do that, too.
If you check the Ruby Spec Suite for String#upcase, you will find specifications like these (this is just an example, there are quite a few more):
describe "String#upcase" do
it "returns a copy of self with all lowercase letters upcased" do
"Hello".upcase.should == "HELLO"
"hello".upcase.should == "HELLO"
end
describe "full Unicode case mapping" do
it "works for all of Unicode with no option" do
"äöü".upcase.should == "ÄÖÜ"
end
it "updates string metadata" do
upcased = "aßet".upcase
upcased.should == "ASSET"
upcased.size.should == 5
upcased.bytesize.should == 5
upcased.ascii_only?.should be_true
end
end
end
Again, as you can see, the Spec only describes results but not mechanisms. And this is very much intentional.
The same is true for the Ruby-Doc documentation of String#upcase:
upcase(*options) → string
Returns a string containing the upcased characters in self:
s = 'Hello World!' # => "Hello World!"
s.upcase # => "HELLO WORLD!"
The casing may be affected by the given options; see Case Mapping.
There is no mention of any particular mechanism here, nor in the linked documentation about Unicode Case Mapping.
All of this only tells us how String#upcase is specified and documented, though. But how is it actually implemented? Well, lucky for us, the majority of Ruby implementations are Free and Open Source Software, or at the very least make their source code available for study.
In Rubinius, you can find the implementation of String#upcase in core/string.rb lines 819–822 and it looks like this:
def upcase
str = dup
str.upcase! || str
end
It just delegates the work to String#upcase!, so let's look at that next. It is implemented right next to String#upcase in core/string.rb lines 824–843 and looks something like this (simplified and abridged):
def upcase!
  return if @num_bytes == 0
  ctype = Rubinius::CType
  i = 0
  while i < @num_bytes
    c = @data[i]
    if ctype.islower(c)
      @data[i] = ctype.toupper!(c)
    end
    i += 1
  end
end
So, as you can see, this is indeed just standard Ruby code using instance variables like @num_bytes which holds the length of the String in platform bytes and @data which is an Array of platform bytes holding the actual content of the String. It uses two helper methods from the Rubinius::CType library (a library for manipulating individual characters as byte-sized integers). The "actual" conversion to upper case is done by Rubinius::CType::toupper!, which is implemented in core/ctype.rb and is extremely simple (to the point of being simplistic):
def self.toupper!(num)
num - 32
end
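Why subtracting 32 works can be checked in plain Ruby: in ASCII, each lowercase letter sits exactly 32 code points above its uppercase counterpart. The helper name `toupper_byte` below is a hypothetical stand-in for the method quoted above, not Rubinius code.

```ruby
# Hypothetical stand-in for the quoted helper: subtract 32 from the byte value.
def toupper_byte(num)
  num - 32
end

puts "a".ord - "A".ord         # 32
puts toupper_byte("a".ord).chr # A
puts toupper_byte("z".ord).chr # Z

# The shortcut is only meaningful for ASCII lowercase letters, which is why
# the caller checks islower first. For a digit it yields a control character:
puts toupper_byte("1".ord)     # 17
```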
Another very simple example is the implementation of String#upcase in Opal, which you can find in opal/corelib/string.rb and looks like this:
def upcase
`self.toUpperCase()`
end
Opal is an implementation of Ruby for the ECMAScript platform. Opal cleverly overloads the Kernel#` method, which is normally used to spawn a sub shell (which doesn't exist in ECMAScript) and execute commands in the platform's native command language (which on the ECMAScript platform arguably is ECMAScript). In Opal, Kernel#` is instead used to inject arbitrary ECMAScript code into Ruby.
So, all that `self.toUpperCase()` does, is call the String.prototype.toUpperCase method on self, which does work because of how the String class is defined in Opal:
class ::String < `String`
In other words, Opal implements Ruby's String class by simply inheriting from ECMAScript's String "class" (really the String Constructor function) and is therefore able to very easily and elegantly reuse all the work that has been done implementing Strings in ECMAScript.
Another very simple example is TruffleRuby. Its implementation of String#upcase can be found in src/main/ruby/truffleruby/core/string.rb and looks like this:
def upcase(*options)
s = Primitive.dup_as_string_instance(self)
s.upcase!(*options)
s
end
Similar to Rubinius, String#upcase just delegates to String#upcase!, which is not surprising since TruffleRuby's core library was originally forked from Rubinius's. This is what String#upcase! looks like:
def upcase!(*options)
mapped_options = Truffle::StringOperations.validate_case_mapping_options(options, false)
Primitive.string_upcase! self, mapped_options
end
The Truffle::StringOperations::validate_case_mapping_options helper method is not terribly interesting; it is just used to implement the rather complex rules for what the Case Mapping Options that you can pass to the various String methods are allowed to look like. The actual "meat" of TruffleRuby's implementation of String#upcase! is just this: Primitive.string_upcase! self, mapped_options.
The syntax Primitive.some_name was agreed upon between the developers of multiple Ruby implementations as "magic" syntax within the core of the implementation itself to be able to call out from Ruby code into "primitives" or "intrinsics" that are provided by the runtime system, but are not necessarily implemented in Ruby.
In other words, all that Primitive.string_upcase! self, mapped_options tells us is "there is a magic function called string_upcase! defined somewhere deep in the bowels of TruffleRuby itself, which knows how to convert a string to upper case, but we are not supposed to know how it works".
If you are really curious, you can find the implementation of Primitive.string_upcase! in src/main/java/org/truffleruby/core/string/StringNodes.java. The code looks dauntingly long and complex, but all you really need to know is that the Truffle Language Implementation Framework is based on constructing Nodes for an AST-walking interpreter. Once you ignore all the machinery related to constructing the AST nodes, the code itself is actually fairly simple.
Once again, the implementors are relying on the fact that the Truffle Language Implementation Framework already comes with a powerful implementation of strings, which the TruffleRuby developers can simply reuse for their own strings.
By the way, this idea of "primitives" or "intrinsics" is an idea that is used in many programming language implementations. It is especially popular in the Smalltalk world. It allows you to write the definition of your methods in the language itself, which in turn allows features like reflection and tools like documentation generators and IDEs (e.g. for automatic code completion) to work without them having to understand a second language, but still have an efficient implementation in a separate language with privileged access to the internals of the implementation.
For example, because large parts of YARV are implemented in C instead of Ruby, but YARV is the implementation that the documentation on Ruby-Doc and Ruby-Lang is generated from, that means that the RDoc Ruby Documentation Generator actually needs to understand both Ruby and C. And you will notice that sometimes documentation for methods implemented in C is missing, incomplete, or corrupted. Similarly, trying to get information about methods implemented in C using Method#parameters sometimes returns non-sensical or useless results. This would not happen if YARV used something like Intrinsics instead of directly writing the methods in C.
JRuby implements String#upcase in several overloads of org.jruby.RubyString.upcase and String#upcase! in several overloads of org.jruby.RubyString.upcase_bang.
However, in the end, they all delegate to one specific overload of org.jruby.RubyString.upcase_bang defined in core/src/main/java/org/jruby/RubyString.java like this:
private IRubyObject upcase_bang(ThreadContext context, int flags) {
modifyAndKeepCodeRange();
Encoding enc = checkDummyEncoding();
if (((flags & Config.CASE_ASCII_ONLY) != 0 && (enc.isUTF8() || enc.maxLength() == 1)) ||
(flags & Config.CASE_FOLD_TURKISH_AZERI) == 0 && getCodeRange() == CR_7BIT) {
int s = value.getBegin();
int end = s + value.getRealSize();
byte[]bytes = value.getUnsafeBytes();
while (s < end) {
int c = bytes[s] & 0xff;
if (Encoding.isAscii(c) && 'a' <= c && c <= 'z') {
bytes[s] = (byte)('A' + (c - 'a'));
flags |= Config.CASE_MODIFIED;
}
s++;
}
} else {
flags = caseMap(context.runtime, flags, enc);
if ((flags & Config.CASE_MODIFIED) != 0) clearCodeRange();
}
return ((flags & Config.CASE_MODIFIED) != 0) ? this : context.nil;
}
As you can see, this is a very low-level way of implementing it.
In MRuby, the implementation looks again very different. MRuby is designed to be light-weight, small, and easy to embed into a larger application. It is also designed to be used in small embedded systems such as robots, sensors, and IoT devices. Because of that, it is designed to be very modular: a lot of the parts of MRuby are optional and are distributed as "MGems". Even parts of the core language are optional and can be left out, such as support for the catch and throw keywords, big numbers, the Dir class, meta programming, eval, the Math module, IO and File, and so on.
If we want to find out where String#upcase is implemented, we have to follow a trail of breadcrumbs. We start with the mrb_str_upcase function in src/string.c which looks like this:
static mrb_value
mrb_str_upcase(mrb_state *mrb, mrb_value self)
{
mrb_value str;
str = mrb_str_dup(mrb, self);
mrb_str_upcase_bang(mrb, str);
return str;
}
This is a pattern we have already seen a couple of times: String#upcase just duplicates the String and then delegates to String#upcase!, which is implemented just above in mrb_str_upcase_bang:
static mrb_value
mrb_str_upcase_bang(mrb_state *mrb, mrb_value str)
{
struct RString *s = mrb_str_ptr(str);
char *p, *pend;
mrb_bool modify = FALSE;
mrb_str_modify_keep_ascii(mrb, s);
p = RSTRING_PTR(str);
pend = RSTRING_END(str);
while (p < pend) {
if (ISLOWER(*p)) {
*p = TOUPPER(*p);
modify = TRUE;
}
p++;
}
if (modify) return str;
return mrb_nil_value();
}
As you can see, there is a lot of machinery in there to extract the underlying data structure from the Ruby String object, iterate over that data structure while making sure not to run past the end, etc., but the real work of converting to upper case is performed by the TOUPPER macro defined in include/mruby.h:
#define TOUPPER(c) (ISLOWER(c) ? ((c) & 0x5f) : (c))
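The bit mask in that macro can be reproduced in Ruby to see what it does. This is only a sketch: `lower?` and `to_upper` are hypothetical helper names, and the sketch assumes that ISLOWER is a plain ASCII range check. Masking with 0x5f clears bit 5 (value 32), which is the only bit that differs between an ASCII lowercase letter and its uppercase form.

```ruby
# Hypothetical Ruby rendering of the C macro, assuming ISLOWER checks 'a'..'z'.
def lower?(byte)
  byte.between?("a".ord, "z".ord)
end

def to_upper(byte)
  lower?(byte) ? (byte & 0x5f) : byte
end

puts to_upper("a".ord).chr                           # A
puts to_upper("Z".ord).chr                           # Z
puts "hello!".bytes.map { |b| to_upper(b).chr }.join # HELLO!
```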
There you have it! That's how String#upcase works "under the hood" in five different Ruby implementations: Rubinius, Opal, TruffleRuby, JRuby, and MRuby. And it will again be different in IronRuby, YARV, RubyMotion, Ruby.NET, XRuby, MagLev, MacRuby, tinyrb, MRI, IoRuby, or any of the other Ruby implementations of present, future, and past.
This shows you that there are many different ways of approaching how to implement something like String#upcase in a Ruby implementation. There are almost as many different approaches as there are implementations!
My question is, when creating an object from a built in class (String, Array, Integer...), are we actually storing some information on some instance variables for that object during its creation?
Yes, we are, basically:
string = "hello" is shorthand for string = String.new("hello")
take a look at the following:
https://ruby-doc.org/core-3.1.2/String.html#method-c-new (ruby 3)
https://ruby-doc.org/core-2.3.0/String.html#method-c-new (ruby 2)
What's the difference between String.new and a string literal in Ruby?
You can also check the following (to extend the functionalities of the class):
Extend Ruby String class with method to change the contents
So the short answer is:
Dealing with built in classes (String, Array, Integer, ...etc) is almost the same thing as we do in any other class we create

Cleaner way of mapping a hash in ruby

Let's assume I need to do a trivial task on every element of a Hash, e.g. increment its value by 1, or change value into an array containing that value. I've been doing it like this
hash.map{ |k, v| [k, v+1] }.to_h
v+1 is just an example, it can be anything.
Is there any cleaner way to do this? I don't really like mapping a hash to an array of 2-sized arrays, then remembering to convert it to hash again.
Example of what might be nicer:
hash.hash_map{ |v| v+1 }
This way some thing like string conversion (to_s) might be simplified to
hash.hash_map(&:to_s)
Duplication clarification:
I'm not looking for Hash[...] or .to_h, I'm asking if anyone knows a more compact and cleaner solution.
That's just the way Ruby's collection framework works. There is one map method in Enumerable which doesn't know anything about hashes or arrays or lists or sets or trees or streams or whatever else you may come up with. All it knows is that there is a method named each which will yield one single element per iteration. That's it.
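A minimal sketch of that design: the hypothetical `Countdown` class below defines only each and includes Enumerable, and map and select come for free. Whatever the collection is, map returns an Array.

```ruby
# Hypothetical collection: it only knows how to yield its elements one by one.
class Countdown
  include Enumerable

  def initialize(from)
    @from = from
  end

  def each
    return enum_for(:each) unless block_given?
    @from.downto(1) { |n| yield n }
  end
end

c = Countdown.new(3)
p c.map { |n| n * 2 }   # [6, 4, 2]
p c.select(&:odd?)      # [3, 1]
p c.map { |n| n }.class # Array
```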
Note that this is the same way the collections frameworks of Java and .NET work, too. All collections operations always return the same type: in .NET, that's IEnumerable, in Ruby, that's Array.
Another design approach is that collections operations are type-preserving, i.e. mapping a set will produce a set, etc. That's the way it is done in Smalltalk, for example. However, in Smalltalk it is achieved by copying and pasting almost identical methods into each and every different collection. I.e. if you want to implement your own collection, in Ruby, you only have to implement each, and you get everything else for free, whereas in Smalltalk, you have to implement every single collection method separately. (In Ruby, that would be over 40 methods.)
Scala is the first language that managed to provide a collections framework with type-preserving operations without code duplication, but it took until Scala 2.8 (released in 2010) to figure that out. (The key is the idea of collection builders.) Ruby's collections library was designed in 1993, 17 years before we had figured out how to do type-preserving collections operations without code duplication. Plus, Scala depends heavily on its sophisticated static type system and type-level metaprogramming to find the correct collection builder at compile time. This is not necessary for the scheme to work, but having to look up the builder for every operation at runtime may incur a hefty runtime cost.
What you could do is add new methods that are not part of the standard Enumerable protocol, for example similar to Scala's mapValues and mapKeys.
AFAIK, this does not exist in Hash out of the box, but here is a simple monkeypatch to achieve what you want:
▶ class Hash
▷ def hash_map &cb
▷ keys.zip(values.map(&cb)).to_h
▷ end
▷ end
There are more readable ways to achieve the requested functionality, but this one calls the built-in map on the values only once, which should make it about the fastest implementation I can think of.
▶ h = {a: 1, b: 2}
#⇒ { :a => 1, :b => 2 }
▶ h.hash_map do |v| v + 5 end
#⇒ { :a => 6, :b => 7 }
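For readers on newer Rubies: since these answers were written, Hash#transform_values was added in Ruby 2.4 and Hash#transform_keys in Ruby 2.5. They cover the use case from the question without a monkeypatch. A brief sketch, assuming Ruby 2.5 or later:

```ruby
hash = { a: 1, b: 2 }

# Map only the values; keys are kept and a new Hash is returned.
incremented = hash.transform_values { |v| v + 1 }
puts incremented == { a: 2, b: 3 }       # true

# Symbol-to-proc works as well, as asked for in the question.
strings = hash.transform_values(&:to_s)
puts strings == { a: "1", b: "2" }       # true

# The counterpart for keys.
renamed = hash.transform_keys(&:to_s)
puts renamed == { "a" => 1, "b" => 2 }   # true
```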

Why doesn't ruby support method overloading?

Instead of supporting method overloading Ruby overwrites existing methods. Can anyone explain why the language was designed this way?
"Overloading" is a term that simply doesn't even make sense in Ruby. It is basically a synonym for "static argument-based dispatch", but Ruby doesn't have static dispatch at all. So, the reason why Ruby doesn't support static dispatch based on the arguments, is because it doesn't support static dispatch, period. It doesn't support static dispatch of any kind, whether argument-based or otherwise.
Now, if you are not actually specifically asking about overloading, but maybe about dynamic argument-based dispatch, then the answer is: because Matz didn't implement it. Because nobody else bothered to propose it. Because nobody else bothered to implement it.
In general, dynamic argument-based dispatch in a language with optional arguments and variable-length argument lists, is very hard to get right, and even harder to keep it understandable. Even in languages with static argument-based dispatch and without optional arguments (like Java, for example), it is sometimes almost impossible to tell for a mere mortal, which overload is going to be picked.
In C#, you can actually encode any 3-SAT problem into overload resolution, which means that overload resolution in C# is NP-hard.
Now try that with dynamic dispatch, where you have the additional time dimension to keep in your head.
There are languages which dynamically dispatch based on all arguments of a procedure, as opposed to object-oriented languages, which only dispatch on the "hidden" zeroth self argument. Common Lisp, for example, dispatches on the dynamic types and even the dynamic values of all arguments. Clojure dispatches on an arbitrary function of all arguments (which BTW is extremely cool and extremely powerful).
But I don't know of any OO language with dynamic argument-based dispatch. Martin Odersky said that he might consider adding argument-based dispatch to Scala, but only if he can remove overloading at the same time and be backwards-compatible both with existing Scala code that uses overloading and compatible with Java (he especially mentioned Swing and AWT which play some extremely complex tricks exercising pretty much every nasty dark corner case of Java's rather complex overloading rules). I've had some ideas myself about adding argument-based dispatch to Ruby, but I never could figure out how to do it in a backwards-compatible manner.
Method overloading can be achieved by declaring two methods with the same name and different signatures. These different signatures can be either,
Arguments with different data types, eg: method(int a, int b) vs method(String a, String b)
Variable number of arguments, eg: method(a) vs method(a, b)
We cannot achieve method overloading the first way because there are no data type declarations in Ruby (it is a dynamically typed language). So the only way to define the above method is def method(a, b)
With the second option, it might look like we can achieve method overloading, but we can't. Let's say I have two methods with different numbers of arguments,
def method(a); end;
def method(a, b = true); end; # second argument has a default value
method(10)
# Now the method call can match the first one as well as the second one,
# so here is the problem.
So ruby needs to maintain one method in the method look up chain with a unique name.
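A quick way to see this for yourself: defining the same name twice does not create two overloads, the second definition simply replaces the first. The method name greet below is made up for the demonstration.

```ruby
# Hypothetical method name, used only to show redefinition.
def greet(name)
  "one-arg version: #{name}"
end

# Same name again: this replaces the definition above instead of adding an overload.
def greet(name, excited = true)
  "two-arg version: #{name}#{excited ? '!' : ''}"
end

greet("Ann")          # => "two-arg version: Ann!"
method(:greet).arity  # => -2 (one required argument plus optional ones)
```

Running Ruby with warnings enabled (ruby -w) also reports the redefinition.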
I presume you are looking for the ability to do this:
def my_method(arg1)
..
end
def my_method(arg1, arg2)
..
end
Ruby supports this in a different way:
def my_method(*args)
if args.length == 1
#method 1
else
#method 2
end
end
A common pattern is also to pass in options as a hash:
def my_method(options)
if options[:arg1] and options[:arg2]
#method 2
elsif options[:arg1]
#method 1
end
end
my_method arg1: 'hello', arg2: 'world'
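In Ruby 2.1 and later, keyword arguments are probably a tidier way to write the options-hash version, since required and optional options are declared in the signature. A minimal sketch, reusing the my_method name from above with made-up return strings:

```ruby
# arg1 is a required keyword; arg2 is optional and defaults to nil.
def my_method(arg1:, arg2: nil)
  if arg2
    "method 2: #{arg1} #{arg2}"
  else
    "method 1: #{arg1}"
  end
end

my_method(arg1: 'hello')                 # => "method 1: hello"
my_method(arg1: 'hello', arg2: 'world')  # => "method 2: hello world"
```

Leaving out arg1 raises an ArgumentError instead of silently falling through.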
Method overloading makes sense in a language with static typing, where you can distinguish between different types of arguments
f(1)
f('foo')
f(true)
as well as between different number of arguments
f(1)
f(1, 'foo')
f(1, 'foo', true)
The first distinction does not exist in ruby. Ruby uses dynamic typing or "duck typing". The second distinction can be handled by default arguments or by working with arguments:
def f(n, s = 'foo', flux_compensator = true)
...
end
def f(*args)
case args.size
when 1
...
when 2
...
when 3
...
end
end
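Here is the *args variant filled in as something you can actually run. The area method and what it computes are just an assumption picked for illustration.

```ruby
# Hypothetical example: one method body branching on the number of arguments.
def area(*args)
  case args.size
  when 1 then args[0] * args[0]  # square: side
  when 2 then args[0] * args[1]  # rectangle: width, height
  else
    raise ArgumentError, "wrong number of arguments (given #{args.size}, expected 1..2)"
  end
end

area(3)     # => 9
area(3, 4)  # => 12
```

The explicit else branch matters: without it, a call with the wrong number of arguments would quietly return nil.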
This doesn't answer the question of why ruby doesn't have method overloading, but third-party libraries can provide it.
The contracts.ruby library allows overloading. Example adapted from the tutorial:
class Factorial
include Contracts
Contract 1 => 1
def fact(x)
x
end
Contract Num => Num
def fact(x)
x * fact(x - 1)
end
end
# try it out
Factorial.new.fact(5) # => 120
Note that this is actually more powerful than Java's overloading, because you can specify values to match (e.g. 1), not merely types.
You will see decreased performance using this though; you will have to run benchmarks to decide how much you can tolerate.
I often do the following structure :
def method(param)
case param
when String
method_for_String(param)
when Type1
method_for_Type1(param)
...
else
#default implementation
end
end
This allows the user of the object to call the clean and clear method name: method
But if they want to optimise execution, they can call the correct method directly.
It also makes your tests clearer and better.
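A runnable version of that structure might look like the following. The names describe, describe_string and describe_integer are invented for the example.

```ruby
# Public entry point that dispatches on the argument's class.
# `case` compares with ===, so `when String` matches any String instance.
def describe(param)
  case param
  when String  then describe_string(param)
  when Integer then describe_integer(param)
  else "something else: #{param.inspect}"
  end
end

# Type-specific implementations, callable directly as well.
def describe_string(str)
  "a string of length #{str.length}"
end

def describe_integer(int)
  "the integer #{int}"
end

describe("abc")  # => "a string of length 3"
describe(42)     # => "the integer 42"
describe(:sym)   # => "something else: :sym"
```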
There are already great answers on the "why" side of the question. However, if anyone is looking for other solutions, check out the functional-ruby gem, which is inspired by Elixir's pattern matching features.
class Foo
include Functional::PatternMatching
## Constructor overloading
defn(:initialize) { @name = 'baz' }
defn(:initialize, _) {|name| @name = name.to_s }
## Method Overloading
defn(:greet, :male) {
puts "Hello, sir!"
}
defn(:greet, :female) {
puts "Hello, ma'am!"
}
end
foo = Foo.new # or Foo.new('Bar')
foo.greet(:male) # prints "Hello, sir!"
foo.greet(:female) # prints "Hello, ma'am!"
I came across this nice interview with Yukihiro Matsumoto (aka "Matz"), the creator of Ruby. Incidentally, he explains his reasoning and intention there. It is a good complement to @nkm's excellent exemplification of the problem. I have highlighted the parts that answer your question on why Ruby was designed that way:
Orthogonal versus Harmonious
Bill Venners: Dave Thomas also claimed that if I ask you to add a
feature that is orthogonal, you won't do it. What you want is
something that's harmonious. What does that mean?
Yukihiro Matsumoto: I believe consistency and orthogonality are tools
of design, not the primary goal in design.
Bill Venners: What does orthogonality mean in this context?
Yukihiro Matsumoto: An example of orthogonality is allowing any
combination of small features or syntax. For example, C++ supports
both default parameter values for functions and overloading of
function names based on parameters. Both are good features to have in
a language, but because they are orthogonal, you can apply both at the
same time. The compiler knows how to apply both at the same time. If
it's ambiguous, the compiler will flag an error. But if I look at the
code, I need to apply the rule with my brain too. I need to guess how
the compiler works. If I'm right, and I'm smart enough, it's no
problem. But if I'm not smart enough, and I'm really not, it causes
confusion. The result will be unexpected for an ordinary person. This
is an example of how orthogonality is bad.
Source: "The Philosophy of Ruby", A Conversation with Yukihiro Matsumoto, Part I
by Bill Venners, September 29, 2003 at: https://www.artima.com/intv/ruby.html
Statically typed languages support method overloading, which involves their binding at compile time. Ruby, on the other hand, is a dynamically typed language and cannot support static binding at all. In languages with optional arguments and variable-length argument lists, it is also difficult to determine which method will be invoked during dynamic argument-based dispatch. Additionally, Ruby is implemented in C, which itself does not support method overloading.

Why were ruby loops designed that way?

As is stated in the title, I was curious to know why Ruby decided to go away from classical for loops and instead use the array.each do ...
I personally find it a little less readable, but that's just my personal opinion. No need to argue about that. On the other hand, I suppose they designed it that way on purpose, there should be a good reason behind.
So, what are the advantages of putting loops that way? What is the "raison d'etre" of this design decision?
This design decision is a perfect example of how Ruby combines the object oriented and functional programming paradigms. It is a very powerful feature that can produce simple readable code.
It helps to understand what is going on. When you run:
array.each do |el|
#some code
end
you are calling the each method of the array object, which, if you believe the variable name, is an instance of the Array class. You are passing in a block of code to this method (a block is equivalent to a function). The method can then evaluate this block and pass in arguments either by using block.call(args) or yield args. each simply iterates through the array and for each element it calls the block you passed in with that element as the argument.
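To make the two ways of invoking a block concrete, here is a small sketch of an each-like method written both ways. The method names are made up; the real Array#each is implemented in C, so this only mimics its behaviour.

```ruby
# Uses the implicit block via yield.
def each_with_yield(items)
  i = 0
  while i < items.length
    yield items[i]
    i += 1
  end
  items
end

# Captures the block as a Proc (&block) and invokes it with call.
def each_with_call(items, &block)
  i = 0
  while i < items.length
    block.call(items[i])
    i += 1
  end
  items
end

seen = []
each_with_yield([1, 2, 3]) { |el| seen << el * 10 }
each_with_call([4, 5])     { |el| seen << el * 10 }
seen  # => [10, 20, 30, 40, 50]
```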
If each were the only method to use blocks, this wouldn't be that useful, but many other methods use them too, and you can even create your own. Arrays, for example, have a few iterator methods, including map, which does the same as each but returns a new array containing the return values of the block, and select, which returns a new array containing only the elements of the old array for which the block returns a true value. These sorts of things would be tedious to do using traditional looping methods.
Here's an example of how you can create your own method with a block. Let's create an every method that acts a bit like map but only for every n items in the array.
class Array #extending the built in Array class
def every n, &block #&block causes the block that is passed in to be stored in the 'block' variable. If no block is passed in, block is set to nil
i = 0
arr = []
while i < self.length
arr << ( block.nil? ? self[i] : block.call(self[i]) )#use the plain value if no block is given
i += n
end
arr
end
end
This code would allow us to run the following:
[1,2,3,4,5,6,7,8].every(2) # => [1,3,5,7] (called without a block)
[1,2,3,4,5,6,7,8,9,10].every(3) {|el| el + 1 } # => [2,5,8,11] (called with a block)
Blocks allow for expressive syntax (often called internal DSLs), for example, the Sinatra web microframework.
Sinatra uses methods with blocks to succinctly define http interaction.
eg.
get '/account/:account' do |account|
# code to serve a page for this account
end
This sort of simplicity would be hard to achieve without Ruby's blocks.
I hope this has allowed you to see how powerful this language feature is.
I think it was mostly because Matz was interested in exploring what a fully object oriented scripting language would look like when he built it; this feature is based heavily on the CLU programming language's iterators.
It has turned out to provide some interesting benefits; a class that provides an each method can 'mix in' the Enumerable module to provide a huge variety of pre-made iteration routines to clients, which reduces the amount of tedious boiler-plate array/list/hash/etc iteration code that must be written. (Ever see java 4 and earlier iterators?)
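As a small illustration of the Enumerable point: a class only has to define each, and the mixin supplies the rest. Countdown is a made-up class for this sketch.

```ruby
# Hypothetical class: defining each and including Enumerable is enough
# to get map, select, include?, to_a and many other methods.
class Countdown
  include Enumerable

  def initialize(from)
    @from = from
  end

  def each
    @from.downto(1) { |n| yield n }
  end
end

c = Countdown.new(5)
c.to_a               # => [5, 4, 3, 2, 1]
c.select(&:even?)    # => [4, 2]
c.map { |n| n * n }  # => [25, 16, 9, 4, 1]
c.include?(3)        # => true
```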
I think you are kind of biased when you ask that question. Another might ask "why were C for loops designed that way?". Think about it - why would I need to introduce counter variable if I only want to iterate through array's elements? Say, compare these two (both in pseudocode):
for (i = 0; i < len(array); i++) {
elem = array[i];
println(elem);
}
and
for (elem in array) {
println(elem);
}
Why would the first feel more natural than the second, except for historical (almost sociological) reasons?
And Ruby, highly object-oriented as is, takes this even further, making it an array method:
array.each do |elem|
puts elem
end
By making that decision, Matz made the language lighter by dropping a superfluous syntax construct (the foreach loop) and delegating its job to ordinary methods and blocks (closures). I appreciate Ruby most for this very reason: it is rational and economical with language features while retaining expressiveness.
I know, I know, we have for in Ruby, but most people consider it unnecessary.
The do ... end blocks (or { ... }) form a so-called block (almost a closure, IIRC). Think of a block as an anonymous method, that you can pass as argument to another method. Blocks are used a lot in Ruby, and thus this form of iteration is natural for it: the do ... end block is passed as an argument to the method each. Now you can write various variations to each, for example to iterate in reverse or whatnot.
There's also the syntactic sugar form:
for element in array
# Do stuff
end
Blocks are also used for example to filter an array:
array = (1..10).to_a
even = array.select do |element|
element % 2 == 0
end
# "even" now contains [2, 4, 6, 8, 10]
I think it's because it emphasizes the "everything is an object" philosophy behind Ruby: the each method is called on the object.
Then switching to another iterator is much smoother than changing the logic of, for example, a for loop.
Ruby was designed to be expressive, to read as if it was being spoken... Then I think it just evolved from there.
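The remark about switching iterators can be shown with a few standard Array/Enumerable methods; only the method name changes, while the block stays essentially the same. The letters array is just sample data.

```ruby
letters = %w[a b c]

# Same data, different traversal: swap the iterator, keep the block style.
letters.each_with_index.map { |l, i| "#{i}:#{l}" }  # => ["0:a", "1:b", "2:c"]
letters.reverse_each.to_a                            # => ["c", "b", "a"]
letters.each_slice(2).to_a                           # => [["a", "b"], ["c"]]
```

With a counter-based for loop, each of these would need different index arithmetic.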
This comes from Smalltalk, which implements control structures as methods, thus reducing the number of keywords and simplifying the parser. This allows control structures to serve as a proof of concept for the language definition.
In ST, even if conditions are methods, in the fashion:
boolean.ifTrue ->{executeIfBody()}, :else=>-> {executeElseBody()}
In the end, if you ignore your cultural bias, what is easier for the machine to parse will also be easier for you to parse.

Ruby equivalent of C#'s 'yield' keyword, or, creating sequences without preallocating memory

In C#, you could do something like this:
public IEnumerable<int> GetItems()
{
for (int i=0; i<10000000; i++) {
yield return i;
}
}
This returns an enumerable sequence of 10 million integers without ever allocating a collection in memory of that length.
Is there a way of doing an equivalent thing in Ruby? The specific example I am trying to deal with is the flattening of a rectangular array into a sequence of values to be enumerated. The return value does not have to be an Array or Set, but rather some kind of sequence that can only be iterated/enumerated in order, not by index. Consequently, the entire sequence need not be allocated in memory concurrently. In .NET, this is IEnumerable and IEnumerable<T>.
Any clarification on the terminology used here in the Ruby world would be helpful, as I am more familiar with .NET terminology.
EDIT
Perhaps my original question wasn't really clear enough -- I think the fact that yield has very different meanings in C# and Ruby is the cause of confusion here.
I don't want a solution that requires my method to use a block. I want a solution that has an actual return value. A return value allows convenient processing of the sequence (filtering, projection, concatenation, zipping, etc).
Here's a simple example of how I might use get_items:
things = obj.get_items.select { |i| !i.thing.nil? }.map { |i| i.thing }
In C#, any method returning IEnumerable that uses a yield return causes the compiler to generate a finite state machine behind the scenes that caters for this behaviour. I suspect something similar could be achieved using Ruby's continuations, but I haven't seen an example and am not quite clear myself on how this would be done.
It does indeed seem possible that I might use Enumerable to achieve this. A simple solution would be to use an Array (which includes module Enumerable), but I do not want to create an intermediate collection with N items in memory when it's possible to just provide them lazily and avoid any memory spike at all.
If this still doesn't make sense, then consider the above code example. get_items returns an enumeration, upon which select is called. What is passed to select is an instance that knows how to provide the next item in the sequence whenever it is needed. Importantly, the whole collection of items hasn't been calculated yet. Only when select needs an item will it ask for it, and the latent code in get_items will kick into action and provide it. This laziness carries along the chain, such that select only draws the next item from the sequence when map asks for it. As such, a long chain of operations can be performed on one data item at a time. In fact, code structured in this way can even process an infinite sequence of values without any kinds of memory errors.
So, this kind of laziness is easily coded in C#, and I don't know how to do it in Ruby.
I hope that's clearer (I'll try to avoid writing questions at 3AM in future.)
It's supported by Enumerator since Ruby 1.9 (and back-ported to 1.8.7). See Generator: Ruby.
Cliche example:
fib = Enumerator.new do |y|
y.yield i = 0
y.yield j = 1
while true
k = i + j
y.yield k
i = j
j = k
end
end
100.times { puts fib.next() }
Your specific example is equivalent to 10000000.times, but let's assume for a moment that the times method didn't exist and you wanted to implement it yourself, it'd look like this:
class Integer
def my_times
return enum_for(:my_times) unless block_given?
i=0
while i<self
yield i
i += 1
end
end
end
10000.my_times # Returns an Enumerator which will let
# you iterate over the numbers from 0 to 10000 (exclusive)
Edit: To clarify my answer a bit:
In the above example my_times can be (and is) used without a block and it will return an Enumerable object, which will let you iterate over the numbers from 0 to n. So it is exactly equivalent to your example in C#.
This works using the enum_for method. The enum_for method takes as its argument the name of a method, which will yield some items. It then returns an instance of class Enumerator (which includes the module Enumerable), which when iterated over will execute the given method and give you the items which were yielded by the method. Note that if you only iterate over the first x items of the enumerable, the method will only execute until x items have been yielded (i.e. only as much as necessary of the method will be executed) and if you iterate over the enumerable twice, the method will be executed twice.
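If you want to watch this happen, a sketch like the following logs each step of the method. The numbers method and the log array are made up purely so the laziness becomes visible.

```ruby
# Hypothetical method that records every item it produces.
def numbers(log)
  return enum_for(:numbers, log) unless block_given?
  1.upto(5) do |n|
    log << "producing #{n}"
    yield n
  end
end

log = []
e = numbers(log)
log         # => [] (nothing has run yet)
e.first(2)  # => [1, 2]
log         # => ["producing 1", "producing 2"]
e.first(1)  # => [1] (the method runs again from the start)
log.size    # => 3
```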
Since 1.8.7 it has become common to define methods which yield items so that, when called without a block, they return an Enumerator which lets the user iterate over those items lazily. This is done by adding the line return enum_for(:name_of_this_method) unless block_given? to the beginning of the method, like I did in my example.
Without having much ruby experience, what C# does in yield return is usually known as lazy evaluation or lazy execution: providing answers only as they are needed. It's not about allocating memory, it's about deferring computation until actually needed, expressed in a way similar to simple linear execution (rather than the underlying iterator-with-state-saving).
A quick google turned up a ruby library in beta. See if it's what you want.
C# ripped the 'yield' keyword right out of Ruby- see Implementing Iterators here for more.
As for your actual problem, you have presumably an array of arrays and you want to create a one-way iteration over the complete length of the list? Perhaps worth looking at array.flatten as a starting point - if the performance is alright then you probably don't need to go too much further.
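If the intermediate array built by flatten is a concern, a sketch along these lines walks a rectangular array cell by cell without building a flattened copy. The helper name each_cell is made up, and the lazy call assumes Ruby 2.0 or later, where Enumerator::Lazy is available.

```ruby
# Hypothetical helper: yields the cells of a rectangular array row by row.
# Without a block it returns an Enumerator instead of a flattened copy.
def each_cell(grid)
  return enum_for(:each_cell, grid) unless block_given?
  grid.each do |row|
    row.each { |cell| yield cell }
  end
end

grid = [[1, 2, 3], [4, 5, 6]]
each_cell(grid).select(&:even?)                   # => [2, 4, 6]
each_cell(grid).lazy.map { |n| n * 10 }.first(4)  # => [10, 20, 30, 40]
```

With lazy in the chain, map only processes as many cells as first(4) asks for.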

Resources