In which versions of ruby are external iterator speeds improved? - ruby

According to this rubyquiz, external iterators used to be slow, but are now faster. Is this an improvement only available in YARV (the C-based implementation of ruby 1.9), or is this also available in the C-based implementation of ruby 1.8.7?
Also, does enum_for rely on external iterators?

Ruby 1.9 uses fibers to implement Enumerator#next, which might be better than Ruby 1.8, but still makes it an expensive call to make.
enum_for returns an Enumerator but does not rely on external iterators. A fiber/continuation will be created only if needed, i.e. if you call next but not if you call each or any other method inherited from Enumerable.
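The difference between the two styles can be sketched as follows. This is a minimal illustration, not a benchmark; the array contents and variable names are made up for the example.

```ruby
# One Enumerator, used in both styles.
enum = [10, 20, 30].enum_for(:each)

# Internal iteration: map comes from Enumerable and simply drives the
# underlying each, so no call to next is involved.
doubled = enum.map { |x| x * 2 }

# External iteration: the caller pulls values one at a time with next.
# This is the call the answer above describes as expensive.
first  = enum.next
second = enum.next
third  = enum.next

# Once the values are exhausted, next raises StopIteration.
finished = begin
  enum.next
  false
rescue StopIteration
  true
end
```

If all you need is to walk the whole collection, sticking to each and the Enumerable methods avoids next entirely.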
Rubinius and JRuby are optimizing next for the builtin types because the general mechanism is very difficult to implement, in particular on the JVM. Fun bedtime reading: this thread on ruby-core.

Rubinius also has some major performance enhancements, but it is a Ruby 1.8 implementation, not 1.9.

Related

What does Ruby's execution stack look like?

I found a webpage that describes what Ruby's execution stack looks like. It says that Ruby has seven stacks:
Is this article true?
This article focuses on the way Ruby works in versions from 1.7 to 1.8. With the introduction of YARV, things have changed a lot. To better understand how Ruby works internally, I'd recommend Ruby Under a Microscope. It has chapters on how Ruby's execution stack works.
No, this does not describe how Ruby works. This describes how MRI works. MRI is only one of many implementations of Ruby. The Ruby Programming Language does not specify any particular implementation strategy for memory management. It is perfectly valid to implement Ruby without any stack at all.
There are many implementations of Ruby. The most widely-used one currently is YARV, but there's also MRuby, JRuby, MagLev, Ruby+OMR, TruffleRuby, Rubinius (those last three are the most interesting IMO). MRI isn't even maintained any more. In the past, there were also IronRuby, IronRuby (yes, actually, there were two different implementations with that name), Ruby.NET, tinyrb, XRuby, SmallRuby, BlueRuby, Cardinal, and many others.
AFAIK, none of those works in the way that is described here, only MRI does.

What's meant by a thread-safe Ruby interpreter?

In a 2000 interview (that is, pre-YARV), Matz said
Matz: I'd like to make it faster and more stable. I'm planning a full
rewrite of the interpreter for Ruby 2.0, code-named "Rite". It will be
smaller, easier to embed, thread-safe, and faster. It will use a
bytecode engine. It will probably take me years to implement, since
I'm pretty busy just maintaining the current version.
What was meant by "thread-safe" in this context? An interpreter that allowed you to use green threads? An interpreter that allowed you to use native threads? An interpreter that didn't have a global interpreter lock (GVL in YARV Ruby terminology)?
At the moment ruby's threading is less than ideal. Ruby can use threading and the threading works fine, but because of its current threading mechanism, the long-and-short of it is that one interpreter can only use one CPU core at a time; there are also other potential issues.
If you want all the gory details, this article covers it pretty well.

Fast Thread-Safe Ruby Hash with strong read bias

I need some help in understanding Hash in Ruby 1.8.7.
I have a multi-threaded Ruby application, and about 95% of time multiple threads of the application are trying to access a global Hash.
I am not sure if the default Ruby Hash is thread safe. What would be the best way to have a fast Hash but also one that is thread safe given my situation?
The default Ruby Hash is not thread-safe. On MRI and YARV it is "somewhat accidentally thread-safe", because MRI and YARV have a broken threading implementation that is incapable of running two threads simultaneously anyway. On JRuby, IronRuby and Rubinius however, this is not the case.
I would suggest a wrapper which protects the Hash with a read-write lock. I couldn't find a pre-built Ruby read-write lock implementation (of course JRuby users can use java.util.concurrent.ReentrantReadWriteLock), so I built one. You can see it at:
https://github.com/alexdowad/showcase/blob/master/ruby-threads/read_write_lock.rb
Two other people and I have tested it on MRI 1.9.2, MRI 1.9.3, and JRuby. It seems to be working correctly (though I still want to do more thorough testing). It has a built-in test script; if you have a multi-core machine, please download it, try running it, and let me know the results! As far as performance goes, it trounces Mutex in situations with a read bias. Even in situations with 80-90% writes, it still seems a bit faster than using a Mutex.
I am also planning to do a Ruby port of Java's ConcurrentHashMap.
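To show the general shape of such a wrapper, here is a minimal sketch of a Hash guarded by a read-write lock built from Mutex and ConditionVariable. It is not the implementation linked above; the class name ReadWriteLockedHash and its helper methods are hypothetical, and the sketch makes no attempt at fairness (writers can starve under a constant stream of readers) or reentrancy.

```ruby
require 'thread' # needed on older Rubies; harmless on current ones

# Hypothetical wrapper: many readers may hold the lock at once,
# but a writer needs exclusive access.
class ReadWriteLockedHash
  def initialize
    @hash    = {}
    @mutex   = Mutex.new
    @cond    = ConditionVariable.new
    @readers = 0      # number of readers currently inside
    @writing = false  # true while a writer is inside
  end

  def [](key)
    with_read_lock { @hash[key] }
  end

  def []=(key, value)
    with_write_lock { @hash[key] = value }
  end

  private

  def with_read_lock
    @mutex.synchronize do
      @cond.wait(@mutex) while @writing
      @readers += 1
    end
    begin
      yield
    ensure
      @mutex.synchronize do
        @readers -= 1
        @cond.broadcast if @readers.zero?
      end
    end
  end

  def with_write_lock
    @mutex.synchronize do
      @cond.wait(@mutex) while @writing || @readers > 0
      @writing = true
    end
    begin
      yield
    ensure
      @mutex.synchronize do
        @writing = false
        @cond.broadcast
      end
    end
  end
end

# Usage: several threads write distinct keys, then the values are read back.
shared  = ReadWriteLockedHash.new
writers = (0...4).map do |i|
  Thread.new { 100.times { |j| shared[[i, j]] = j } }
end
writers.each(&:join)
last_value = shared[[3, 99]]
```

The internal Mutex is held only long enough to update the counters, so readers do not block each other while they are actually reading the Hash.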

Ruby interpreters, method execution

I'm doing some research into how different Ruby interpreters do method execution (e.g. when you call a method in ruby, what steps does the interpreter take to find and execute it, and which structures are involved in this). I am trying to compare the performance of the different approaches being used.
The interpreters I'm looking into are: MRI, YARV, JRuby, Rubinius, Ruby EE
I am looking for any general pointers about which files in the interpreter source I should check out, and any other general information about this topic that you guys can provide.
Thanks!
This article is a really good description of method dispatching in JRuby. It is nicely complemented by the JRuby Wiki page describing its internals.

Ruby monkey patching pitfalls

I'm looking for examples of why it's not a good idea to extend base classes in ruby. I need to show some people why it's a weapon to be wielded carefully.
Any horror stories you can share?
There was a pretty famous example of monkey-patching going horribly wrong about 2.5 years ago in Rubinius.
The interesting thing about this case is that both the offending code and the victim were highly visible and highly unusual. Usually, the offender is some piece of code written by some PHP script kiddy who got drunk on his 1337 metaprogramming h4X0r skillz. And the failure mode is a simple ArgumentError exception, because the original method and the monkeypatch have different arity.
However, in this case, the offender was a library in the stdlib (mathn) and the failure mode was the Rubinius VM completely blowing up.
So, what happened? Well, mathn monkeypatches the Fixnum class and changes how Fixnum arithmetic works. In particular, it changes both the results and the types of several core methods. E.g.:
r = 4/3 # => 1
r.class # => Fixnum
require 'mathn'
r = 4/3 # => (4/3)
r.class # => Rational
The problem is of course that in Rubinius, the entire Ruby compiler, the entire Ruby kernel, large parts of the Ruby core library, some parts of the Rubinius VM and other parts of the Rubinius infrastructure, are all written in Ruby. And of course, all of those use Fixnum arithmetic all over the place.
The Hash class is written in Ruby and it uses Fixnum arithmetic to compute the size of the hash buckets, compute the hash function and so on. Array is written in Ruby and needs to compute element sizes and array lengths. The FFI library is written in Ruby and needs to compute memory addresses(!) and structure sizes. Many parts of Rubinius assume that they can do some Fixnum arithmetic and then pass the result to some C function as a pointer or int.
And since Ruby doesn't support any kind of selector namespacing or class boxing or similar (although something like that is planned for Ruby 2.0), as soon as some random user code requires the mathn library, all of those pieces just spectacularly explode, because all of a sudden, the result of a Fixnum operation is no longer a Fixnum (which is basically identical to a machine int and can be passed around as such), but a Rational (which is a full-fledged Ruby object).
Basically, what would happen is that some code would require 'mathn' (or you would type that into IRb), and immediately the VM would just die.
The solution, in this case, was the safe math plugin for the compiler: when the compiler detects that it is compiling the kernel or other core parts of Rubinius, it automatically rewrites calls to Fixnum methods into calls to private immutable copies of those methods. [Note: I think in current versions of Rubinius, the problem is solved in a different way.]
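For comparison, the kind of scoping the answer says Ruby lacked at the time can be sketched with refinements, which later Ruby versions provide. This is only an illustration of the idea, not how Rubinius solved the problem; it assumes Ruby 2.4 or later (where Fixnum is unified into Integer), and the module names RationalDivision and ScopedMath are hypothetical.

```ruby
# Hypothetical refinement: integer division yields a Rational,
# similar in spirit to what mathn did globally.
module RationalDivision
  refine Integer do
    def /(other)
      # Fall back to the original Integer#/ for non-integer operands.
      other.is_a?(Integer) ? Rational(self, other) : super
    end
  end
end

module ScopedMath
  using RationalDivision # the changed / is visible only inside this module body

  def self.divide(a, b)
    a / b
  end
end

scoped = ScopedMath.divide(4, 3) # uses the refined /
plain  = 4 / 3                   # code outside the scope keeps ordinary integer division
```

Because the refinement is activated lexically, core code that relies on integer division returning an integer is unaffected.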
The Trifecta of FAIL; or, how to patch Rails 2.0 for Ruby 1.8.7 has an example of Rails (which is a large, well-scrutinized project) causing problems because they monkeypatched String to add the method chars.
One obvious pitfall would be name collisions - if two or more packages choose the same name for a method that behaves differently.
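A small sketch of such a collision, which also shows the arity-mismatch failure mentioned earlier. The two "libraries" and the method name truncate_to are invented for the example.

```ruby
# Hypothetical library A adds String#truncate_to(limit).
class String
  def truncate_to(limit)
    self[0, limit]
  end
end

lib_a_result = "monkeypatch".truncate_to(6)

# Hypothetical library B, loaded later, reuses the name with a
# different arity and meaning. Its definition silently replaces A's.
class String
  def truncate_to(limit, suffix)
    self[0, limit] + suffix
  end
end

# Code written against library A now breaks with an ArgumentError.
collision_error = begin
  "monkeypatch".truncate_to(6)
  nil
rescue ArgumentError => e
  e
end
```

Neither library did anything unusual on its own; the breakage only appears when both are loaded into the same process.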
