Ruby VM concurrency and parallelism - ruby

I have a general question about the Ruby VM (the Ruby interpreter). How does it work with multiple processors? Regarding parallelism and concurrency in Ruby, let's say that I have 4 processors. Will the VM automatically assign tasks to the processors through the kernel? As for scaling, let's say that my Ruby process is taking a lot of CPU resources; what will happen if I add a new processor? Is the OS responsible for assigning the tasks to the processors, or will each VM work on one processor? What would be the best way to scale my Ruby application? I have tried as much as possible to separate my processes and use AMQP queuing. Any other ideas?
It would be great if you can send me links for more explanation.
Thanks in advance.

Ruby Threading
The Ruby language itself supports concurrent execution through a threading model; however, the implementation dictates whether additional hardware resources get used. The "gold standard" interpreter (MRI Ruby) uses a "green threading" model in 1.8; threading is done within the interpreter and only uses a single system thread for execution. However, others (such as JRuby) leverage the Java VM to create actual system level threads for execution. MRI Ruby 1.9 switches to native system threads, but (afaik) a global interpreter lock still lets only one of them run Ruby code at a time, so the main gain is that other threads can proceed while one thread stalls on an I/O event.
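As a minimal sketch of the point above: the Thread API looks the same on every implementation, and only the runtime decides whether the threads can use more than one core. The helper name sum_in_threads and the sample ranges are made up for illustration.

```ruby
# Sum a range in each of several threads and collect the results.
# The code is identical on MRI and JRuby; whether the threads run on
# several cores at once is up to the implementation, not this script.
def sum_in_threads(ranges)
  threads = ranges.map do |range|
    Thread.new { range.sum }   # each block runs in its own Ruby thread
  end
  threads.map(&:value)         # Thread#value joins and returns the block's result
end

ranges = [(1..1_000), (1_001..2_000), (2_001..3_000)]
partials = sum_in_threads(ranges)
total = partials.sum

puts "engine: #{RUBY_ENGINE} #{RUBY_VERSION}"
puts "total:  #{total}"
```

Running the same file under different interpreters should print the same total; only the engine line (and possibly the wall-clock time) changes.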
Advanced Threading
Typically the OS manages assignment of threads to logical cores since most application software doesn't actually care. In some high performance compute cases, the software will specifically request certain threads to execute on specific logical cores for architecture specific performance. It's highly unlikely anything written in Ruby would fall into this category.
Refactoring
Per-application performance limits can usually be addressed by refactoring the code first. Moving the hot spot to a language or environment better suited to the specific problem is likely a better first step than immediately jumping to threading in the existing implementation.
Example
I once worked on a Ruby on Rails app with a massive hash mapping step that ran when data was uploaded. The initial implementation was written completely in Ruby and took ~80s to complete. After rewriting the code in ANSI C with more specific memory allocation, the execution time fell to under a second (without even using threads). The next bottleneck was inserting the massive amount of data back into MySQL, which eventually also moved out of the Ruby code and into threaded C code. I specifically went this route since the MRI Ruby interpreter easily binds to C code. In the final result, Ruby prepares the environment for the C code and calls it as a Ruby instance method on a class with parameters; a single thread of C code does the hash mapping, and an OpenMP worker queue generates and executes the inserts into MySQL.

Related

MRI Ruby memory access characteristics during multithreading

In some high-level programming environments (Java, .NET), when accessing the same memory from multiple threads, you have to explicitly mark it as volatile or synchronized; otherwise you could get stale results from some cache, or out-of-order values due to out-of-order execution by the CPU or other optimizations.
MRI Ruby has used native OS threads for some time. Each of those threads sometimes executes Ruby code (I assume, but I am not sure), even if never truly in parallel because of the VM lock.
I guess MRI solves this stale/out-of-order value issue somehow, because there is no volatile construct in the Ruby language and I have never heard of stale value issues.
What guarantees does the Ruby language, or MRI specifically, give regarding memory access from multiple threads? I would be extremely grateful if someone could point me to any documentation regarding this. Thanks!
It sounds like your specific question is if Ruby implicitly provides a memory barrier when switching threads, such that all caching/reordering concerns that occur at a processor level are resolved automatically.
I believe MRI does provide this, as otherwise the GVL would be pointless; why restrict one thread to run at a time if even then they can end up reading/writing stale data? It is difficult to find the precise place where this is provided, but I believe the entry point is via RB_VM_LOCK_ENTER which is called throughout the codebase and which ultimately calls vm_lock_enter. This has code which strongly implies that memory barriers are in place:
    // lock
    rb_native_mutex_lock(&vm->ractor.sync.lock);
    VM_ASSERT(vm->ractor.sync.lock_owner == NULL);
    vm->ractor.sync.lock_owner = cr;
    if (!no_barrier) {
        // barrier
        while (vm->ractor.sync.barrier_waiting) {
            unsigned int barrier_cnt = vm->ractor.sync.barrier_cnt;
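Whatever MRI does internally, code that should behave the same on other implementations is probably better off not relying on it. A minimal sketch, assuming you can route shared data through the thread-safe Queue class from core Ruby; the nil stop marker is a convention chosen for this example, not something Ruby requires.

```ruby
# Thread::Queue (also reachable as Queue) is a thread-safe FIFO from core Ruby.
queue = Queue.new
received = []

consumer = Thread.new do
  # Queue#pop blocks until an item is available; nil is our stop marker.
  while (item = queue.pop)
    received << item
  end
end

producer = Thread.new do
  5.times { |i| queue << i * i }
  queue << nil   # tell the consumer to stop
end

# Joining both threads before reading `received` means the main thread
# only looks at the array after the consumer has finished writing to it.
[producer, consumer].each(&:join)
p received
```

The hand-off through the queue and the final join are the synchronization points here, so the example does not depend on how a particular VM orders memory accesses.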

Do Ruby threads run on multiple cores?

I've read that Ruby code (CRuby/YARV) only "runs" on a single processor core, but something is not clear yet:
I understand that the GIL prevents threads from running concurrently and that in recent Ruby versions threads are scheduled by the operating system.
Couldn't a thread possibly be "placed" on core 1 and the other on core 2, even if they're not actually running at the same time?
Just trying to understand if the OS scheduler actually puts all Ruby threads on a single core. Thanks!
Edit: Another answer mentions that C++ uses pthreads and those are scheduled across cores, and that Ruby uses the same. I guess that's what I was looking for, but since most answers seem to equate not running threads in parallel with never running on multiple cores, I just wanted to confirm.
First off, we have to clearly distinguish between "Ruby Threads" and "Ruby Threads as implemented by YARV". Ruby Threads make no guarantees how they are scheduled. They might be scheduled concurrently, they might not. They might be scheduled on multiple CPUs, they might not. They might be implemented as native platform threads, they might be implemented as green threads, they might be implemented as something else.
YARV implements Ruby Threads as native platform threads (e.g. pthreads on POSIX and Windows threads on Windows). However, unlike other Ruby implementations which use native platform threads (e.g. JRuby, IronRuby, Rubinius), YARV has a Giant VM Lock (GVL) which prevents two threads from entering the YARV bytecode interpreter at the same time. This makes it effectively impossible to run Ruby code in multiple threads at the same time.
Note, however, that the GVL only protects the YARV interpreter and runtime. This means that, for example, multiple threads can execute C code at the same time, and at the same time as another thread executes Ruby code. It just means that no two threads can execute Ruby code at the same time on YARV.
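A rough way to see this from plain Ruby is to let several threads block at once. In the sketch below, sleep stands in for any blocking call that releases the GVL (an assumption made to keep the example self-contained), and the helper name monotonic_now is made up.

```ruby
# Read a clock that is not affected by wall-clock adjustments.
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

started = monotonic_now
threads = 4.times.map do
  Thread.new { sleep 0.2 }   # a blocked thread does not hold the GVL
end
threads.each(&:join)
elapsed = monotonic_now - started

# If the waits overlap, this is close to 0.2s rather than 4 * 0.2s.
puts format("4 threads x 0.2s sleep took %.2fs", elapsed)
```

The four waits overlap because none of the threads is executing Ruby code while it is blocked.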
Note also that in recent versions of YARV, the "Giant" VM Lock is becoming ever smaller. Sections of code are moved out from under the lock, and the lock itself is broken down into smaller, more fine-grained locks. That is a very long process, but it means that in the future more and more Ruby code will be able to run in parallel on YARV.
But all of this has nothing to do with how the platform schedules the threads. Many platforms have some sort of heuristics for thread affinity to CPU cores, e.g. they may try to schedule the same thread on the same core, under the assumption that its working set is still in that core's cache, or they may try to identify threads that operate on shared data and schedule those threads on the same CPU, and so on. Therefore, it is hard or even impossible to predict how and where a thread will be scheduled.
Many platforms also provide a way to influence this CPU affinity, e.g. on Linux and Windows, you can set a thread to only be scheduled on one specific or a set of specific cores. However, YARV does not do that by default. (In fact, on some platforms influencing CPU affinity requires elevated privileges, so it would mean that YARV would have to run with elevated privileges, which is not a good idea.)
So, in short: yes, depending on the platform, the hardware, and the environment, YARV threads may and probably will be scheduled on different cores. But, they won't be able to take advantage of that fact, i.e. they won't be able to run faster than on a single core (at least when running Ruby code).
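If the goal is to actually use several cores for Ruby code on YARV, separate processes are the usual workaround. The following is a sketch under assumptions: Process.fork is available (it is not on Windows or JRuby), the work splits into independent slices, and the helper name parallel_sum and the cap of 4 workers are arbitrary.

```ruby
require "etc"

# Sum each slice in a separate child process and send the result back over a pipe.
def parallel_sum(slices)
  children = slices.map do |slice|
    reader, writer = IO.pipe
    pid = Process.fork do
      reader.close
      writer.write(slice.sum.to_s)
      writer.close
      exit!(0)              # leave the child without running inherited at_exit handlers
    end
    writer.close            # the parent only reads from this pipe
    [pid, reader]
  end

  children.sum do |pid, reader|
    value = reader.read.to_i   # read returns once the child has closed its end
    reader.close
    Process.wait(pid)
    value
  end
end

workers = [Etc.nprocessors, 4].min
numbers = (1..10_000).to_a
slices = numbers.each_slice((numbers.size / workers.to_f).ceil).to_a
total = parallel_sum(slices)

puts "#{slices.size} processes, total = #{total}"
```

Each child has its own interpreter and its own GVL, so the OS is free to run them on different cores at the same time.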

Parallelism (Threads and Processes)

I have a question. I know the difference between a thread and a process in theory, but I still don't understand when we should use the former and when the latter. For example, say we have a difficult task which needs to be parallelized. But in which way? Which is faster and more effective, and in what cases? Should we split our task into a few processes or into a few threads? Could you give a few examples? I know that my question may seem silly, but I'm new to the topic of parallel computing. I hope that you understand my question. Thank you in advance.
In general, there is only one main difference between processes and threads: All threads of a given process share the same virtual address space. Whereas each process has its own virtual address space.
When dealing with problems that require concurrent access to the same set of data, it is easier to use threads, because they can all directly access the same memory.
Threads share memory. Processes do not.
This means that processes are somewhat more expensive to start up. It also means that threads can conveniently communicate through shared memory, while processes need some explicit mechanism (pipes, sockets, explicitly shared memory segments) to exchange data.
However, from a coding perspective, it also means that threads are significantly more difficult to program correctly. It's very easy for threads to stomp on each other's memory in unintended ways. Processes are somewhat safer.
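The difference in memory sharing can be observed directly. This is a small sketch in Ruby, assuming a platform where Process.fork is available (so not Windows or JRuby); the variable names are illustrative.

```ruby
counter = [0]   # a mutable object both workers will try to change

# A thread runs in the same address space, so its write is visible here.
Thread.new { counter[0] += 1 }.join
after_thread = counter[0]

# A forked child gets its own copy of the memory; its write stays in the child.
pid = Process.fork do
  counter[0] += 100
  exit!(0)
end
Process.wait(pid)
after_fork = counter[0]

puts "after thread: #{after_thread}"
puts "after fork:   #{after_fork}"
```

The thread's increment shows up in the parent, while the child's increment is lost when the child exits, which is why processes need an explicit channel to report results.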
Welcome to the world of concurrency!
There is no theoretical difference between threads and processes that is practical to generalize from. There are many, many different ways to implement threads, including ways that nearly mirror those of processes (e.g. Linux threads). Then there's lightweight threading, which involves the process managing the threading by itself; but there's more variation there, because you can then have either a co-operative or a semi-preemptive threading model.
For example, consider Haskell's threading model and Python's.
Haskell offers lightweight threads that introduce little runtime overhead; there are well-defined points at which threads may yield control, but this is largely hidden from the user, giving the appearance of pre-emptive multitasking. Shared state is held in specially typed variables that are treated specially by the language. Because of this, multi-threaded, even concurrent programs can be written in a largely single-threaded way, then forked from the main process. So there, threads are an abstraction mechanism, and may even be beneficial in a single-(OS)-threaded process to model the program; however, it scales well to N threads, where N may be chosen dynamically. (And N Haskell threads are mapped dynamically to OS threads.)
Python allows threading, but with a huge bottleneck: the Global Interpreter lock. Therefore, to gain serious performance benefits, one must use processes in practice. There is no feasible, performant threading model to speak of.

Advantages of non-concurrent Ruby Threads in Ruby 1.9?

I have been reading about Ruby 1.9 Thread and I see that all Ruby threads go through the Global Interpreter Lock (GIL for friends) and that concurrency is actually non-existent.
I have done a test (without any signals or waiting), and not only does the performance with threads fail to improve, the operations actually take more time than running them serially.
My question is basically: what's the point of these threads if they are not concurrent? Is there any hope that they will be concurrent in the future?
A lot of other Ruby interpreters (JRuby, Rubinius) don't actually have GILs. MRI, on the other hand, kept its GIL in 2.0 and in later releases, so don't count on it going away there.
Also, in a lot of cases (such as when waiting for IO) the interpreter does switch to another thread. So while it's not technically multithreading (in the case of MRI/REE as of 1.9), it does get some of the benefits.
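The kind of test described in the question can be sketched as follows. The workload busy_work and the job sizes are made up; timings vary by machine and interpreter, so no particular numbers are implied.

```ruby
require "benchmark"

# CPU-bound work with no I/O: nothing here blocks, so under a GIL the
# threads simply take turns running.
def busy_work(n)
  (1..n).reduce(0) { |acc, i| acc + i * i }
end

jobs = Array.new(4, 200_000)

serial_results = nil
serial_time = Benchmark.realtime do
  serial_results = jobs.map { |n| busy_work(n) }
end

threaded_results = nil
threaded_time = Benchmark.realtime do
  threaded_results = jobs.map { |n| Thread.new { busy_work(n) } }.map(&:value)
end

puts format("serial:   %.3fs", serial_time)
puts format("threaded: %.3fs", threaded_time)
puts "same results: #{serial_results == threaded_results}"
```

On MRI the two timings tend to be close, with the threaded run often a little slower because of switching overhead; on an implementation without a GIL and with several cores, the threaded run has room to be faster.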
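Parallelism is nonexistent, but Ruby threads do not prevent concurrent execution of Ruby code. Even on a single-core machine, concurrent code execution is possible: the threads take turns, so several of them make progress during the same period of time even though only one runs at any instant. I think you just conflated the terms 'concurrent' and 'parallel'.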
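To make the distinction concrete, here is a small sketch in which two threads interleave their work without ever needing to run at the same instant. The two queues are used only as a hand-off signal, and the names ping and pong are arbitrary.

```ruby
log = []
ping = Queue.new
pong = Queue.new

a = Thread.new do
  3.times do |i|
    log << "A#{i}"
    ping << :go      # hand control to B
    pong.pop         # wait until B has taken its turn
  end
end

b = Thread.new do
  3.times do |i|
    ping.pop         # wait for A
    log << "B#{i}"
    pong << :go      # hand control back to A
  end
end

[a, b].each(&:join)
p log
```

Both threads are in progress over the same stretch of time (concurrency), yet the hand-offs guarantee that only one of them does work at any moment (no parallelism).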
See Working with Ruby Threads by Jesse Storimer for more details.

What practical effect will different Ruby threading models (Ruby vs JRuby) have on your code as a developer?

I'm trying to understand the practical impact of different threading models between MRI Ruby 1.8 and JRuby.
What does this difference mean to me as a developer?
And also, are there any practical examples of code in MRI Ruby 1.8 that will have worse performance characteristics on JRuby due to different threading models?
State
Ruby 1.8 has green threads. These are fast to create/delete (as objects), but they do not truly execute in parallel and are scheduled by the virtual machine rather than by the operating system.
Ruby 1.9 has real threads. These are slower to create/delete (as objects) because of OS calls, and because of the GIL (global interpreter lock), which only allows one thread to execute at a time, they are not truly parallel either.
JRuby also has real threads scheduled by the OS, and these can truly run in parallel.
Conclusion
From a threading point of view, a threaded program running on a 2-core CPU will run faster on JRuby than on the other implementations.
Notice!
Many existing Ruby libraries are not thread-safe, so the advantage of JRuby is in many cases useless.
Also note that many Ruby programming techniques (for example class vars) will need additional programming effort to ensure thread safety (mutex locks, monitors etc.) if one is to use threads.
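As a minimal sketch of the extra effort this means for class vars: the class HitCounter below is hypothetical, and a Mutex is only one of the options mentioned above.

```ruby
# Hypothetical class that keeps shared state in class variables.
class HitCounter
  @@count = 0
  @@lock = Mutex.new

  def self.increment
    # "@@count += 1" is a read-modify-write, so guard it with the mutex.
    @@lock.synchronize { @@count += 1 }
  end

  def self.count
    @@lock.synchronize { @@count }
  end
end

threads = 8.times.map do
  Thread.new { 1_000.times { HitCounter.increment } }
end
threads.each(&:join)

puts HitCounter.count
```

Without the lock, updates can be lost on implementations whose threads run in parallel, such as JRuby; the same code with the lock should count every increment on any implementation.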
JRuby's threads are native system threads, so they give you all the benefits of threaded programming (including the use of multiple processor cores, if applicable). MRI Ruby 1.9 (YARV), however, has a Global Interpreter Lock (GIL), which prevents multiple threads from running simultaneously. So the only real performance difference is the fact that your MRI/YARV Ruby applications won't be able to utilize all of your processor cores, but your JRuby applications will happily do so.
However, if that isn't an issue, MRI 1.8's threads are (theoretically, I haven't tested this) a little faster because they are green threads, which use fewer system resources. YARV (Ruby 1.9) uses native system threads.
I am a regular JRuby user and the biggest difference is that JRuby threads are truly parallel. They are actually system level threads, so they can be executed simultaneously on multiple cores. I do not know of any place where MRI Ruby 1.8 code runs slower on JRuby. You might consider checking out this question: Does ruby have real multithreading?

Resources