I found one webpage that describes what Ruby's execution stack looks like. It says that Ruby has seven stacks:
Is this article true?
This article focuses on the way Ruby works in versions from 1.7 to 1.8. With the introduction of YARV, things have changed a lot. To better understand how Ruby works internally, I'd recommend Ruby Under a Microscope. It has chapters on how Ruby's execution stack works.
No, this does not describe how Ruby works. This describes how MRI works. MRI is only one of many implementations of Ruby. The Ruby Programming Language does not specify any particular implementation strategy for memory management. It is perfectly valid to implement Ruby without any stack at all.
There are many implementations of Ruby. The most widely-used one currently is YARV, but there's also MRuby, JRuby, MagLev, Ruby+OMR, TruffleRuby, Rubinius (those last three are the most interesting IMO). MRI isn't even maintained any more. In the past, there were also IronRuby, IronRuby (yes, actually, there were two different implementations with that name), Ruby.NET, tinyrb, XRuby, SmallRuby, BlueRuby, Cardinal, and many others.
AFAIK, none of those works in the way that is described here, only MRI does.
Can a String and its duplicate share the same underlying memory? Is there copy-on-write in Ruby?
I have a large, frozen String and I want to change its encoding. But I don't want to copy the whole String just to do that. For context, this is to pass values to a Google Protocol Buffer which has the bytes type and only accepts Encoding::ASCII_8BIT.
big_string.freeze
MyProtobuf::SomeMessage.new(
  # I would prefer not to have to copy the whole string just to
  # change the encoding.
  value: big_string.dup.force_encoding(Encoding::ASCII_8BIT)
)
It seems to work just fine for me: (using MRI/YARV 1.9, 2.x, 3.x)
require 'objspace'
big_string = Random.bytes(1_000_000).force_encoding(Encoding::UTF_8)
big_string.encoding #=> #<Encoding:UTF-8>
big_string.bytesize #=> 1000000
ObjectSpace.memsize_of(big_string) #=> 1000041
dup_string = big_string.dup.force_encoding(Encoding::ASCII_8BIT)
dup_string.encoding #=> #<Encoding:ASCII-8BIT>
dup_string.bytesize #=> 1000000
ObjectSpace.memsize_of(dup_string) #=> 40
Those 40 bytes are the size to hold an object (RVALUE) in Ruby.
Note that instead of dup / force_encoding(Encoding::ASCII_8BIT) there's also b which returns a copy in binary encoding right away.
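As a small sketch of that shortcut: String#b leaves the receiver untouched, so it can be called on a frozen string directly. The sample string below is made up for illustration, and whether the result shares its buffer with the original is an implementation detail that this snippet does not show.

```ruby
# A frozen UTF-8 string (illustrative data, 1,000 two-byte characters).
big_string = ("\u00e9" * 1_000).freeze

# String#b returns a new string in binary encoding; the receiver is not
# modified, so no explicit dup is needed even though it is frozen.
binary = big_string.b

same_bytes     = binary.bytesize == big_string.bytesize
is_binary      = binary.encoding == Encoding::ASCII_8BIT
still_utf8     = big_string.encoding == Encoding::UTF_8
same_as_forced = binary == big_string.dup.force_encoding(Encoding::ASCII_8BIT)

puts same_bytes, is_binary, still_utf8, same_as_forced
```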
For more in-depth information, here's a blog post from 2012 (Ruby 1.9) about copy-on-write / shared strings in Ruby:
Seeing double: how Ruby shares string values
From the author's book Ruby Under a Microscope: (p. 265)
Internally, both JRuby and MRI use an optimization called copy-on-write for strings and other data. This trick allows two identical string values to share the same data buffer, which saves both memory and time because Ruby avoids making separate copies of the same string data unnecessarily.
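To see the "write" half of copy-on-write, one can modify the duplicate and measure again. This is only a sketch: ObjectSpace.memsize_of is a hint provided by MRI/YARV, the exact numbers vary between versions, and other implementations may report something else entirely, so the sizes are printed rather than assumed.

```ruby
require 'objspace'

original = ("x" * 1_000_000).freeze
copy     = original.dup            # may share the buffer with `original`

size_before_write = ObjectSpace.memsize_of(copy)

copy << "!"                        # first write: `copy` now needs its own buffer

size_after_write = ObjectSpace.memsize_of(copy)

puts "copy before write: #{size_before_write} bytes"
puts "copy after write:  #{size_after_write} bytes"

# Whatever the implementation does internally, the semantics stay the same:
# the original never sees the change made to the copy.
puts original.bytesize
puts copy.bytesize
```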
Can a String and its duplicate share the same underlying memory? Is there copy-on-write in Ruby?
There is nothing in the Ruby Language Specification that prevents that. There is also nothing in the Ruby Language Specification that enforces that.
In general, the Ruby Language Specification tries to stay silent on all things related to memory management, space complexity, step complexity, or time complexity. This is not exclusive to the Ruby Language Specification, most Language Specifications try to leave the implementors as much leeway as possible. In other words, Language Specifications tend to specify Syntax and Semantics and leave the Pragmatics up to the implementor. (C++ is somewhat of an exception in that it specifies space and time complexity for the algorithms in the standard library.) Even C, which is typically thought of as a language which gives you full control over everything, doesn't actually specify things like memory layouts precisely – for example, due to the definition of the term width in the standard, a uint16_t is actually allowed to occupy more than 16 bits!
Every implementor is free to implement strings however they want, as long as they comply with the semantics defined in the Ruby Language Specification.
If I remember correctly, both Rubinius and TruffleRuby did, at one point, experiment with a String implementation based on Ropes. Chris Seaton, TruffleRuby's lead developer, wrote a paper about that implementation. However, I don't know if they are still using it. (I know TruffleRuby switched to Truffle Strings recently, and I am not sure what their underlying representation is … or whether they are even guaranteeing a specific underlying representation.)
There is a problem with the answer "you have to look at the specification", though: unfortunately, unlike many other programming languages, the Ruby Language Specification does not exist as a single document in a single place. Ruby does not have a single formal specification that defines what certain language constructs mean.
There are several resources, the sum of which can be considered kind of a specification for the Ruby programming language.
Some of these resources are:
The ISO/IEC 30170:2012 Information technology — Programming languages — Ruby specification – Note that the ISO Ruby Specification was written around 2009–2010 with the specific goal that all existing Ruby implementations at the time would easily be compliant. Since YARV and MacRuby only implement Ruby 1.9+ and MRI only implements Ruby 1.8 and lower and JRuby, XRuby, Ruby.NET, and IronRuby (at the time) only implemented a subset of Ruby 1.8, this means that the ISO Ruby Specification only contains features that are common to both Ruby 1.8 and Ruby 1.9. Also, the ISO Ruby Specification was specifically intended to be minimal and only contain the features that are absolutely required for writing Ruby programs. Because of that, it does for example only specify Strings very broadly (since they have changed significantly between Ruby 1.8 and Ruby 1.9). It obviously also does not specify features which were added after the ISO Ruby Specification was written, such as Ractors or Pattern Matching.
The Ruby Spec Suite aka ruby/spec – Note that the ruby/spec is unfortunately far from complete. However, I quite like it because it is written in Ruby instead of "ISO-standardese", which is much easier to read for a Rubyist, and it doubles as an executable conformance test suite.
The Ruby Programming Language by David Flanagan and Yukihiro 'matz' Matsumoto – This book was written by David Flanagan together with Ruby's creator matz to serve as a Language Reference for Ruby.
Programming Ruby by Dave Thomas, Andy Hunt, and Chad Fowler – This book was the first English book about Ruby and served as the standard introduction and description of Ruby for a long time. This book also first documented the Ruby core library and standard library, and the authors donated that documentation back to the community.
The Ruby Issue Tracking System, specifically, the Feature sub-tracker – However, please note that unfortunately, the community is really, really bad at distinguishing between Tickets about the Ruby Programming Language and Tickets about the YARV Ruby Implementation: they both get intermingled in the tracker.
The Meeting Logs of the Ruby Developer Meetings. (Same problem: Ruby and YARV get intermingled.)
New features are often discussed on the mailing lists, in particular the ruby-core (English) and ruby-dev (Japanese) mailing lists. (Same problem again.)
The Ruby documentation – Again, be aware that this documentation is generated from the source code of YARV and does not distinguish between features of Ruby and features of YARV.
In the past, there were a couple of attempts of formalizing changes to the Ruby Specification, such as the Ruby Change Request (RCR) and Ruby Enhancement Proposal (REP) processes, both of which were unsuccessful.
If all else fails, you need to check the source code of the popular Ruby implementations to see what they actually do. Please note the plural: you have to look at multiple, ideally all, implementations to figure out what the consensus is. Only looking at one implementation cannot possibly tell you whether what you are looking at is an implementation quirk of this particular implementation or is a universally agreed-upon behavior of the Ruby Language.
So, I have a few questions that I have to ask, I did browse the internet, but there weren't too many reliable answers. Mostly blog posts that would cancel each-other out because they both praised different things and had benchmarks to "prove their viewpoint" (I have never seen so many contradicting benchmarks in my life).
Anyway, my questions are:
Is Rubinius really faster? I was pretty impressed by this apparently honest pro-Rubinius presentation. Another thing that confuses me a little is that a lot of Rubinius is written in Ruby itself, yet somehow it is faster than C-Ruby? It must be a pretty damn good implementation of the language, then!
Does EventMachine work with Rubinius? As far as I know, EventMachine partially relies on Fibers (correct me if I'm wrong) which weren't implemented until 1.9. I know Rubinius will eventually support 1.9, too; I don't mind waiting a little.
Do C extensions work in Rubinius? I have written a C extension which "serializes" binary messages received from a TCP stream into Ruby Objects and vice-versa (I suppose the details are not important, but if it helps answer this question I will update the post). This can be a lot of messages! I managed to write the same code in Ruby (although, it made little sense after a month), but it proved to be a real bottle-neck in the application. So, I had to use C as a "solution" to my problem.
EDIT: I just remembered, I use C for another task: a hit-test method for Arrays. Basically, it just checks if a "point" is inside a polygon; it was impossibly slow in CRuby.
If the previous answer was a "No," is there then an alternative for C extensions in Rubinius? I gather the VM is written in C++, so that then.
A few "bonus" questions:
Will C-Ruby (2.0+, YARV) ever get rid of GIL? Or at least modify it so CRuby supports true parallelism?
What is exactly mruby? I see matz is working on it, and as far as the description goes it seems pretty awesome. How different is it from CRuby (performance-wise)?
I apologize for this text-storm I unleashed upon you! ♥
Is Rubinius really faster?
In most benchmarks, yes.
But benchmarks are... dumb. Apps are what we really care about. So the best thing to do is benchmark your app & see how well it performs. The 2 areas where Rubinius will really shine over MRI are parallelism & memory usage. Rubinius has no GIL, so you can utilize all available threads. It also has a much more sophisticated GC, so in general it could perform better with respect to GC.
I did those benchmarks back in Oct '11 for my talk on MagLev at RubyConf
Does EventMachine work with Rubinius?
Yes, and if there are parts that don't work, then the issue should be reported. With that said, currently the EM tests don't pass on any Ruby implementation.
Do C extensions work in Rubinius?
Yes. I maintain the compatibility issue for C-exts, so if there is one you have that is tested on Travis, Rubinius would like to see it pass against rbx. Rubinius has historically had good support for the C-api and C-exts, though it would be nice if someday Rubinius could run Ruby so fast one would not need C-exts or the C-api.
Will C-Ruby (2.0+, YARV) ever get rid of GIL? Or at least modify it so CRuby supports true parallelism?
No, most likely not. Jesse Storimer has a succinct writeup of Matz's opinion (or lack thereof) on threads from RubyConf 2012. Koichi Sasada tried to remove the GIL once and MRI perf just tanked. Evan Phoenix also tried once, before he created Rubinius, but didn't have good results.
What is exactly mruby?
An embeddable Ruby interpreter, akin to Lua. Matt Aimonetti has a few articles that might shed some light for you.
I am not too much into Ruby but I might be able to answer the first question.
Is Rubinius really faster?
I've seen different Benchmarks telling different things. However, the fact that Rubinius is partially written in Ruby does not have to mean that it is slower. I thought the same about PyPy which is Python in Python. After some research and the right classes in college I knew why.
As far as I know, both are written in a subset of their language, which should be much simpler. An interpreter (e.g. one written in C) can be optimized much more easily for such a subset than for the whole language.
Writing the Ruby/Python interpreter in its own language allows much more flexibility and quicker prototyping of new interpretation algorithms. The whole point of the existence of Ruby and Python are among others that algorithms can be implemented much quicker than in e.g. C or even assembler. A faster algorithm outweighs the little overhead of an interpreter a lot of the time.
Btw. writing an interpreter for a language in the same language is also a common academic practice to show how mighty the language is. In one class we've written Lisp in Lisp in Lisp.
I'm doing some research into how different Ruby interpreters do method execution (e.g. when you call a method in ruby, what steps does the interpreter take to find and execute it, and which structures are involved in this). I am trying to compare the performance of the different approaches being used.
The interpreters I'm looking into are: MRI, YARV, JRuby, Rubinius, Ruby EE
I am looking for any general pointers about which files in the interpreter source I should check out, and any other general information about this topic that you guys can provide.
Thanks!
This article is a really good description of method dispatching in JRuby. It is nicely complemented by the JRuby Wiki page describing its internals.
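Whatever structures an interpreter uses internally, the lookup order it has to honor can be observed from Ruby itself. The sketch below uses made-up class and module names; it shows only the language-level result of method lookup, not how any particular implementation caches or dispatches calls.

```ruby
module Greeting
  def hello
    "hello from Greeting"
  end
end

class Base
  def hello
    "hello from Base"
  end

  def only_base
    :base
  end
end

class Child < Base
  include Greeting
end

obj = Child.new

# The ancestor chain is searched front to back for the method name.
chain = Child.ancestors.first(3)            # [Child, Greeting, Base]

# Method#owner reports where the lookup ended up.
hello_owner     = obj.method(:hello).owner      # Greeting
only_base_owner = obj.method(:only_base).owner  # Base

p chain, hello_owner, only_base_owner
```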
According to this rubyquiz, external iterators used to be slow, but are now faster. Is this an improvement only available in YARV (the C-based implementation of ruby 1.9), or is this also available in the C-based implementation of ruby 1.8.7?
Also, does enum_for rely on external iterators?
Ruby 1.9 uses fibers to implement Enumerator#next, which might be better than Ruby 1.8, but still makes it an expensive call to make.
enum_for returns an Enumerator but does not rely on external iterators. A fiber/continuation will be created only if needed, i.e. if you call next but not if you call each or any other method inherited from Enumerable.
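A minimal sketch of the difference, using a made-up Countdown class: the Enumerator returned by enum_for can be driven internally (each, map, ...) or externally (next), and only the latter needs the fiber/continuation machinery.

```ruby
# Countdown is a hypothetical class used only for illustration.
class Countdown
  def initialize(from)
    @from = from
  end

  # Internal iterator: yields each value to the caller's block.
  def each
    return enum_for(:each) unless block_given?
    @from.downto(1) { |n| yield n }
  end
end

enum = Countdown.new(3).each       # an Enumerator; nothing has run yet

# Internal iteration via Enumerable: ordinary block calls.
doubled = enum.map { |n| n * 2 }   # [6, 4, 2]

# External iteration: each call to next resumes the suspended #each.
first  = enum.next                 # 3
second = enum.next                 # 2

p doubled, first, second
```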
Rubinius and JRuby are optimizing next for the builtin types because it is very difficult to implement, in particular on the JVM. Fun bedtime reading: this thread on ruby-core
Rubinius also has some major performance enhancements, but it is a Ruby 1.8 implementation, not 1.9.
I know a lot of Java people have started looking at Scala since it runs on the JVM, and a lot of people in the Microsoft world are looking at F#, but what does Ruby have as a natural functional successor?
In a pure FP sense Ruby doesn't lack anything, instead it has too much some may say. A functional language forces the programmer to not use global variables and other idioms so much (although it is possible to use globals in functional languages)
There's two very different definitions of what "functional programming" means. You can kind-of do the one in Ruby, but you cannot do the other.
Those two definitions are:
programming with first-class functions and
programming with mathematical functions
You can kind-of program with first-class functions in Ruby. It has support for first-class functions. In fact, it has too much support for them: there is Proc.new, proc, lambda, Method, UnboundMethod, blocks, #to_proc and ->() (and probably some others that I forget).
All of these have slightly different syntax, slightly different behavior and slightly different restrictions. For example: the only one of these which is syntactically lightweight enough that you can actually use it densely is blocks. But blocks have some rather severe restrictions: you can only pass one block to a method, blocks aren't objects (which, in an object-oriented language in which "everything is an object", is a very severe restriction) and, at least in Ruby 1.8, there are also some restrictions w.r.t. parameters.
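Two of those differences, sketched with throwaway names: lambdas check their arguments strictly while procs do not, and return means something different inside each.

```ruby
pair_lambda = lambda { |a, b| [a, b] }
pair_proc   = proc   { |a, b| [a, b] }

# A lambda raises if the argument count does not match...
lambda_error =
  begin
    pair_lambda.call(1)
    nil
  rescue ArgumentError => e
    e.class
  end

# ...whereas a proc fills missing parameters with nil.
proc_result = pair_proc.call(1)    # [1, nil]

# `return` inside a lambda returns from the lambda only.
def lambda_return
  l = lambda { return :from_lambda }
  l.call
  :after_lambda
end

# `return` inside a proc returns from the enclosing method.
def proc_return
  pr = proc { return :from_proc }
  pr.call
  :after_proc                      # never reached
end

p lambda_error, proc_result, lambda_return, proc_return
```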
Referring to a method is another thing that is fairly awkward. In Python or ECMAScript for example, I can just say baz = foo.bar to refer to the bar method of the foo object. In Ruby, foo.bar is a method call, if I want to refer to the bar method of foo, I have to say baz = foo.method(:bar). And if I now want to call that method, I cannot just say baz(), I have to say baz.call or baz[] or (in Ruby 1.9) baz.().
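Spelled out with a toy Foo class (the names are placeholders), the three call syntaxes look like this; a Method object can also be handed to a block-taking method with &.

```ruby
class Foo
  def bar(x)
    x * 2
  end
end

foo = Foo.new
baz = foo.method(:bar)     # a Method object bound to foo; nothing is called yet

via_call     = baz.call(21)
via_brackets = baz[21]
via_dot      = baz.(21)    # Ruby 1.9+

# Method responds to to_proc, so it can stand in for a block.
mapped = [1, 2, 3].map(&baz)

p via_call, via_brackets, via_dot, mapped
```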
So, first-class functions in Ruby aren't really first-class. They are much better than second-class, and they are good enough™, but they aren't fully first-class.
But generally, Rubyists do not leave Ruby just for first-class functions. Ruby's support is good enough that any advantages you might gain from better support in another language usually is eaten up by the training effort for the new language or by something else that you are accustomed to that you must now give up. Like, say RubyGems or tight Unix integration or Ruby on Rails or syntax or …
However, the second definition of FP is where Ruby falls flat on its face. If you want to do programming with mathematical functions in Ruby, you are in for a world of pain. You cannot use the absolute majority of Ruby libraries, because most of them are stateful, effectful, encourage mutation or are otherwise impure. You cannot use the standard library for the same reasons. You cannot use the core library. You cannot use any of the core datatypes, because they are all mutable. You could just say "I don't care that they are mutable, I will simply not mutate them and always copy them", but the problem is: someone else still can mutate them. Also, because they are mutable, Ruby cannot optimize the copying and the garbage collector isn't tuned for that kind of workload.
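A small illustration of the "someone else can still mutate them" problem, with a hypothetical careless_total method standing in for third-party code. Object#freeze is one partial defense, but it is shallow, as the last lines show.

```ruby
# Hypothetical library method that mutates its argument as a side effect.
def careless_total(items)
  items << 0
  items.inject(0) { |sum, n| sum + n }
end

prices = [1, 2, 3]
careless_total(prices)
p prices                           # the caller's array was changed

# Freezing turns the silent mutation into an exception
# (FrozenError on Ruby 2.5+, which is a subclass of RuntimeError).
frozen_prices = [1, 2, 3].freeze
frozen_error =
  begin
    careless_total(frozen_prices)
    nil
  rescue RuntimeError => e
    e
  end

# freeze is shallow: the elements of a frozen array can still be mutated.
names = ["ann".dup, "bob".dup].freeze
names.first << "e"
p names
```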
It just doesn't work.
There are also a couple of features that really have nothing to do with functional programming, but that most functional languages tend to have and that Ruby is missing. Pattern matching, for example. Laziness also was not that easy to achieve before Enumerators were more aggressively used in Ruby 1.9. And there's still some stuff that works with strict Enumerables or Arrays but not with lazy Enumerators, although there's actually no reason for them to require strictness.
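For reference, this is roughly what Enumerator-based laziness looks like. Enumerator.new with a block is available from Ruby 1.9; the Enumerable#lazy call in the second half assumes Ruby 2.0 or later, which is newer than the versions discussed here.

```ruby
# An infinite sequence; the block only runs as far as a consumer asks.
naturals = Enumerator.new do |yielder|
  n = 0
  loop do
    yielder << n
    n += 1
  end
end

first_five = naturals.first(5)     # [0, 1, 2, 3, 4]

# Assumes Ruby 2.0+: lazy chains select/map without building
# intermediate arrays, so they work on the infinite sequence too.
even_squares = naturals.lazy
                       .select { |n| n.even? }
                       .map    { |n| n * n }
                       .first(3)   # [0, 4, 16]

p first_five, even_squares
```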
And for this definition of FP, it definitely makes sense to leave Ruby behind.
The two main languages that Rubyists have been flocking to are Erlang and Clojure. These are both relatively good matches for Ruby, because they are both dynamically typed, have a REPL culture similar to Ruby's, and (this is more a Rails thing than a Ruby thing) are also very good on the web. They still have pretty small and welcoming communities, the original language creators are still active in the community, and there is a strong focus on doing new, exciting and edgy things, all of which are traits that the Ruby community also has.
The interest in Erlang started when someone showed the original 1993 introduction video "Erlang: The Movie" at RubyConf 2006. A couple of high-profile Rails projects started using Erlang, for example PowerSet and GitHub. Erlang is also easy to master for Rubyists, because it doesn't take purity quite as far as Haskell or Clean. The inside of an actor is pretty pure, but the act of sending messages itself is of course a side effect. Another thing that makes Erlang easy to grasp is that Actors and Objects are actually the same thing, when you follow Alan Kay's definition of object-oriented programming.
Clojure has been a recent addition to the Rubyist's toolbelt. Its popularity is I guess mostly driven by the fact that the Ruby community has finally warmed up to the idea that JVM ≠ Java and embraced JRuby and then they started to look around what other interesting stuff there was on the JVM. And again, Clojure is much more pragmatic than both other functional languages like Haskell and other Lisps like Scheme and much simpler and more modern than CommonLisp, so it is a natural fit for Rubyists.
Another cool thing about Clojure is that because both Clojure and Ruby run on the JVM, you can combine them.
The author of "Programming Clojure" (Stuart Halloway) is a (former?) Rubyist, for example, as is Phil Hagelberg, the author of the Leiningen build tool for Clojure.
However, Rubyists are also looking at both Scala (as one of the more pragmatic statically typed FP languages) and Haskell (as one of the more elegant ones). Then there are projects like Scuby and Hubris, which are bridges that let you integrate Ruby with Scala and Haskell, respectively. Twitter's decision to move part of their low-level messaging infrastructure first from MySQL to Ruby, then from Ruby to Scala is also pretty widely known.
F# doesn't seem to play any role at all, possibly due to an irrational fear towards all things Microsoft the Ruby community has. (Which, BTW, seems mostly unfounded, given that the F# team has always made versions available for Mono.)
Java people are using a language on the JVM and want a more functional one compatible with their runtime, so they go to Scala.
C# people are using a language on the CLR and want a more functional one compatible with their runtime, so they go to F#.
Ruby people are using a language that's already pretty functional, and they're using it on a number of underlying runtimes (JRuby, IronRuby, MRI, MacRuby, Rubinius, etc...). I don't think it has a natural functional successor, or even needs one.
Any version of Lisp should be fine.
Ruby itself is a kind of functional programming language, so I don't see any special dialects for FP using Ruby.
In hype level, Haskell.
Assuming Ruby people don't just go to the JVM themselves, I think most would adopt Erlang, being another dynamically typed language.
Ruby isn't as functional as say Lisp, but it is just functional enough that you can do some functional programming in a good fun way. (unlike trying to do functional programming in something like C#)
Also, it actually forces you into functional paradigms in some of its syntax, such as the heavy use of blocks and yield. (which I fell in love with after learning Ruby).
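A tiny sketch of what that looks like in practice; the method names are invented for the example.

```ruby
# A method that hands values to the caller's block via yield.
def twice
  yield 1
  yield 2
end

collected = []
twice { |n| collected << n * 10 }

# Capturing the block with & turns it into a Proc object that can be
# stored or called repeatedly, like any other value.
def apply_n(times, value, &fn)
  times.times { value = fn.call(value) }
  value
end

result = apply_n(3, 1) { |x| x * 2 }

p collected, result
```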