Multithreading within a hash of hashes in Ruby

I have a code snippet like this:
myhash.each_value { |subhash|
  subhash['key'].each { |subsubhash|
    # statement that modifies the subsubhash and takes about 0.07 s to execute
  }
}
This loop runs 100+ times and, needless to say, slows down my application tremendously (about 7 seconds to run this loop).
Any pointers on how to make this faster? I have no control over the really expensive statement. Is there a way I can multithread within the loop so the statements can be executed in parallel?

threads = []
myhash.each_value { |subhash|
  threads << Thread.start do
    subhash['key'].each { |subsubhash|
      threads << Thread.start do
        # statement that modifies the subsubhash and takes about 0.07 s to execute
      end
    }
  end
}
threads.each { |t| t.join }
Note that MRI 1.8.x doesn't use real threads, but rather green ones which do not correspond to real OS threads. However, if you use JRuby you might see a performance boost as it supports real threads.

You could run each subhash processing loop in a separate thread, but whether or not this results in a performance boost depends on (1) the Ruby interpreter you are using and (2) whether the innermost block is IO-bound or compute-bound.
The reason for #1 is that some Ruby interpreters (such as CRuby/MRI 1.8) use green threads which typically do not benefit from any actual parallel processing, even on multicore machines. However, YARV and JRuby both use native OS threads (JRuby even for 1.8 since the JVM uses native threads), so if you can target those interpreters specifically then you might see an improvement.
The reason for #2 is that if the innermost block is IO-bound then even a green thread based interpreter might improve performance since most OSes do a good job of scheduling threads around blocking IO calls. If the block is strictly compute-bound then only a native-thread based interpreter will likely show a performance boost using multiple threads.
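As a rough sketch of the flatter approach suggested above (one thread per subhash rather than one per innermost element), the loop could look like the following. Note that process! is just a placeholder for the real expensive statement, not something from the original code:
threads = myhash.each_value.map do |subhash|
  Thread.new do
    subhash['key'].each do |subsubhash|
      process!(subsubhash)   # the ~0.07 s statement that modifies subsubhash
    end
  end
end
threads.each(&:join)
On MRI this only helps if the expensive statement spends its time in IO or in C code that releases the lock; on JRuby the threads can actually use all cores.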

Related

Do Ruby threads run on multiple cores?

I've read that Ruby code (CRuby/YARV) only "runs" on a single processor core, but something is not clear yet:
I understand that the GIL prevents threads from running concurrently and that in recent Ruby versions threads are scheduled by the operating system.
Couldn't a thread possibly be "placed" on core 1 and the other on core 2, even if they're not actually running at the same time?
Just trying to understand if the OS scheduler actually puts all Ruby threads on a single core. Thanks!
Edit: Another answer mentions that C++ uses pthreads and those are scheduled across cores, and that Ruby uses the same. I guess that's what I was looking for, but since most answers seem to equate not running threads in parallel with never running on multiple cores, I just wanted to confirm.
First off, we have to clearly distinguish between "Ruby Threads" and "Ruby Threads as implemented by YARV". Ruby Threads make no guarantees how they are scheduled. They might be scheduled concurrently, they might not. They might be scheduled on multiple CPUs, they might not. They might be implemented as native platform threads, they might be implemented as green threads, they might be implemented as something else.
YARV implements Ruby Threads as native platform threads (e.g. pthreads on POSIX and Windows threads on Windows). However, unlike other Ruby implementations which use native platform threads (e.g. JRuby, IronRuby, Rubinius), YARV has a Giant VM Lock (GVL) which prevents two threads from entering the YARV bytecode interpreter at the same time. This makes it effectively impossible to run Ruby code in multiple threads at the same time.
Note, however, that the GVL only protects the YARV interpreter and runtime. This means that, for example, multiple threads can execute C code at the same time, and at the same time as another thread executes Ruby code. It just means that no two threads can execute Ruby code at the same time on YARV.
Note also that in recent versions of YARV, the "Giant" VM Lock is becoming ever smaller. Sections of code are moved out from under the lock, and the lock itself is broken down into smaller, more fine-grained locks. That is a very long process, but it means that in the future more and more Ruby code will be able to run in parallel on YARV.
But all of this has nothing to do with how the platform schedules the threads. Many platforms have some sort of heuristics for thread affinity to CPU cores: e.g. they may try to schedule the same thread to the same core, under the assumption that its working set is still in that core's cache, or they may try to identify threads that operate on shared data and schedule those threads to the same CPU, and so on. Therefore, it is hard, if not impossible, to predict how and where a thread will be scheduled.
Many platforms also provide a way to influence this CPU affinity: e.g. on Linux and Windows, you can pin a thread to one specific core or a set of specific cores. However, YARV does not do that by default. (In fact, on some platforms influencing CPU affinity requires elevated privileges, so it would mean that YARV would have to run with elevated privileges, which is not a good idea.)
So, in short: yes, depending on the platform, the hardware, and the environment, YARV threads may and probably will be scheduled on different cores. But, they won't be able to take advantage of that fact, i.e. they won't be able to run faster than on a single core (at least when running Ruby code).
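A quick, hedged way to see that consequence is to time the same CPU-bound work serially and across threads. The compute_something method below is just an illustrative stand-in for pure-Ruby work, and the exact numbers will vary by machine and interpreter:
require 'benchmark'

# Pure-Ruby, CPU-bound placeholder work.
def compute_something
  100_000.times { |i| Math.sqrt(i) }
end

serial = Benchmark.realtime { 4.times { compute_something } }

threaded = Benchmark.realtime do
  4.times.map { Thread.new { compute_something } }.each(&:join)
end

# On YARV the two timings come out roughly equal because of the GVL;
# on JRuby the threaded run can approach a 4x speedup on 4+ cores.
puts "serial:   #{'%.2f' % serial}s"
puts "threaded: #{'%.2f' % threaded}s"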

Ruby multithreading Performance issues

I am building a Ruby application. I have a set of images that I want to greyscale. My code used to be like this:
def Tools.grayscale_all_frames(frames_dir, output_dir)
  number_of_frames = get_frames_count(frames_dir)
  img_processor = ImageProcessor.new(frames_dir)
  create_dir(output_dir)
  for i in 1..number_of_frames
    img_processor.load_image(frames_dir + "/frame_%04d.png" % i)
    img_processor.greyscale_image
    img_processor.save_image_in_dir(output_dir, "frame_%04d" % i)
  end
end
After threading the code:
def Tools.greyscale_all_frames_threaded(frames_dir, output_dir)
  number_of_frames = get_frames_count(frames_dir)
  img_processor = ImageProcessor.new(frames_dir)
  create_dir(output_dir)
  greyscale_frames_threads = []
  for frame_index in 1..3
    greyscale_frames_threads << Thread.new(frame_index) { |frame_number|
      puts "Loading Image #{frame_number}"
      img_processor.load_image(frames_dir + "/frame_%04d.png" % frame_number)
      img_processor.greyscale_image
      img_processor.save_image_in_dir(output_dir, "frame_%04d" % frame_number)
      puts "Greyscaled Image #{frame_number}"
    }
  end
  puts "Starting Threads"
  greyscale_frames_threads.each { |thread| thread.join }
end
What I expected is a thread being spawned for each image. I have 1000 images. The resolution is 1920*1080. So here is how I see things: I have an array of threads and I call .join on it. So join will take all the threads and start them, one after the other? Does that mean that it will wait until thread 1 is done and then start thread 2? What is the point of multithreading then?
What I want is this:
Run all the threads at the same time and not one after the other. So mathematically, it will finish all 1000 frames in the same time it takes to finish 1 frame, right?
Also, can somebody explain to me what .join does?
From my understanding .join will stop the main thread until your thread(s) is or are done?
If you don't use .join, then the thread will run the background and the main thread will just continue.
So what is the point of using .join? I want my main thread to continue running and have the other threads in the background doing stuff?
Thanks for any help/clarification!!
This is only true if you have 1,000 CPU cores and massive amounts (read: hundreds and hundreds of gigabytes) of RAM.
The point of join is not to start the thread, but to wait until the thread has finished. So calling join on an array of threads is a common pattern for waiting for them all to finish.
Explaining all of this and clarifying your misconception requires digging a little deeper. At the C/assembler level, most modern OSes (Windows, Mac, Linux, and some others) use a preemptive scheduler. If you have only one core, two programs running in parallel is a complete illusion. In reality, the kernel is switching between the two every few milliseconds, giving all of us slow-processing humans the illusion of parallel processing.
In newer, more modern CPUs, there is often more than one core. The most powerful CPUs today can go up to (I think) 16 real cores + 16 hyperthreaded cores. This means that you could actually run 32 tasks completely in parallel. But even this does not ensure that if you start 32 threads they will all finish at the same time.
Because of competition for resources that are shared between cores (some caches, all of the RAM, the hard drive, the network card, etc.), and the essentially random nature of preemptive scheduling, the amount of time your thread takes can be estimated within a certain range, but not exactly.
Unfortunately, all of this breaks down when you get to Ruby. Because of some hairy internal details about the threading model and compatibility, only one thread can execute Ruby code at a time. So, if your image processing is done in C, happy joy joy. If it's written in Ruby, well, all the threads in the world aren't going to help you now.
To be able to actually run Ruby code in parallel, you have to use fork. fork is only available on Linux and Mac, not Windows, but you can think of it as a fork in a road: one process goes in, two processes come out. Multiple processes can run on all your different cores at once.
So, take @Stefan's advice: use a queue and a number of worker threads equal to the number of CPU cores. And don't expect so much of your computer. Now you know why ;).
So join will take all the threads and start them, one after the other?
No, the threads are started when invoking Thread#new. It creates a new thread and executes the given block within that thread.
join will stop the main thread until your thread(s) is or are done?
Yes, it will suspend execution until the receiver (each of your threads) exits.
So what is the point of using join?
Sometimes you want to start some tasks in parallel but you have to wait for each task to finish before you can continue.
I want my main thread to continue running and have the other threads in the background doing stuff
Then don't call join.
After all, it's not a good idea to start 1,000 threads in parallel. Your machine is only capable of running as many tasks in parallel as it has CPUs. So instead of starting 1,000 threads, place your jobs / tasks in a queue / pool and process them using some worker threads (number of workers = number of CPUs).
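A minimal sketch of that queue-plus-workers pattern, reusing the asker's ImageProcessor, get_frames_count and create_dir helpers from the question, and assuming a Ruby new enough to have Etc.nprocessors (the pooled method name is made up). Each worker gets its own processor instance so the workers don't share mutable state:
require 'etc'
require 'thread'   # for Queue on older Rubies

def Tools.greyscale_all_frames_pooled(frames_dir, output_dir)
  number_of_frames = get_frames_count(frames_dir)
  create_dir(output_dir)

  queue = Queue.new
  (1..number_of_frames).each { |i| queue << i }

  worker_count = Etc.nprocessors          # number of workers = number of CPUs
  worker_count.times { queue << nil }     # one stop signal per worker

  workers = worker_count.times.map do
    Thread.new do
      img_processor = ImageProcessor.new(frames_dir)
      while (i = queue.pop)               # nil sentinel ends this worker
        img_processor.load_image(frames_dir + "/frame_%04d.png" % i)
        img_processor.greyscale_image
        img_processor.save_image_in_dir(output_dir, "frame_%04d" % i)
      end
    end
  end
  workers.each(&:join)
end
On MRI this only speeds things up if the image library does its heavy lifting in C or IO and releases the lock; under JRuby the workers genuinely run on separate cores.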

Advantages of non-concurrent Ruby Threads in Ruby 1.9?

I have been reading about Ruby 1.9 Thread and I see that all Ruby threads go through the Global Interpreter Lock (GIL for friends) and that concurrency is actually non-existent.
I have done a test (without any signals or waiting) and the performance using threads not only fails to improve but the operations actually take more time than running them serially.
My question is basically: what's the point of these Threads if they are not concurrent? Is there any hope that they will be concurrent in the future?
A lot of other Ruby interpreters (JRuby, Rubinius) don't actually have GILs. (There has been talk of shrinking or removing MRI's GIL, but as of MRI 2.x it is still there.)
Also, in a lot of cases (such as when waiting for IO) the interpreter does switch to another thread. So while it's not technically multithreading (in the case of MRI/REE as of 1.9), it does get some of the benefits.
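To make that last point concrete, here is a hedged sketch with blocking HTTP calls standing in for any IO-bound work (the URL is just a placeholder):
require 'net/http'
require 'benchmark'

# example.com is a placeholder; any slow, blocking IO shows the same effect.
uris = Array.new(5) { URI('https://example.com/') }

serial = Benchmark.realtime do
  uris.each { |uri| Net::HTTP.get(uri) }
end

threaded = Benchmark.realtime do
  uris.map { |uri| Thread.new { Net::HTTP.get(uri) } }.each(&:join)
end

# Blocking IO releases the GIL, so even on MRI the threaded run takes roughly
# as long as the single slowest request instead of the sum of all of them.
puts "serial:   #{'%.2f' % serial}s"
puts "threaded: #{'%.2f' % threaded}s"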
Parallelism is nonexistent, but Ruby threads do not prevent concurrent execution of Ruby code. Even on a single-core machine, concurrent code execution is possible. I think you just conflated the terms 'concurrent' and 'parallel'.
See Working with Ruby Threads by Jesse Storimer for more details.

Making ruby program run on all processors

I've been looking at optimizing a Ruby program that's quite calculation-intensive on a lot of data. I don't know C and have chosen Ruby (not that I know it well either), and I'm quite happy with the results, apart from the time it takes to execute. It is a lot of data, and without spending any money, I'd like to know what I can do to make sure I'm maximizing my own system's resources.
When I run a basic Ruby program, does it use a single processor? If I have not specifically assigned tasks to a processor, Ruby won't read my program and magically load each processor to complete the program as fast as possible, will it? I'm assuming no...
I've been reading a bit on speeding up Ruby, and in another thread read that Ruby does not support true multithreading (though it said JRuby does). But if I were to "break up" my program into two chunks that can be run in separate instances and run these in parallel... would these two chunks run on two separate processors automatically? If I had four processors and opened up four shells and ran four separate parts (1/4) of the program, would it complete in 1/4 the time?
Update
After reading the comments I decided to give JRuby a shot. Porting the app over wasn't that difficult. I haven't used "peach" yet, but just by running it in JRuby, the app runs in 1/4 the time!!! Insane. I didn't expect that much of a change. Going to give .peach a shot now and see how that improves things. Still can't believe that boost.
Update #2
Just gave peach a try. Ended up shaving another 15% off the time. So switching to JRuby and using Peach was definitely worth it.
Thanks everyone!
Use JRuby and the peach gem, and it couldn't be easier. Just replace an .each with .peach and voila, you're executing in parallel. And there are additional options to control exactly how many threads are spawned, etc. I have used this and it works great.
You get close to n times speedup, where n is the number of CPUs/cores available. I find that the optimal number of threads is slightly more than the number of CPUs/cores.
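For reference, the change really is close to a one-liner; treat the exact peach API (the method names and the optional thread-count argument) as an assumption to check against the gem's documentation:
require 'peach'   # gem install peach; run under JRuby for real core-level parallelism

items = (1..100).to_a

# Placeholder for the real per-item work.
def heavy_work(item)
  Math.sqrt(item)
end

items.each  { |item| heavy_work(item) }   # serial
items.peach { |item| heavy_work(item) }   # parallel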
Like others have said, the MRI implementation of Ruby (the one most people use) does not run Ruby threads in parallel (1.8 uses green threads; 1.9 uses native threads but has a global lock). Hence you cannot split work between CPU cores by launching more threads under MRI.
However if your process is IO-bound (restricted by disk or network activity for example), then you may still benefit from multiple MRI-threads.
JRuby on the other hand does support native threads, meaning you can use threads to split work between CPU cores.
But all is not lost. With MRI (and all the other ruby implementations), you can still use processes to split work.
This can be done using Process.fork for example like this:
Process.fork {
  10.times {
    # Do some work in process 1
    sleep 1
    puts "Hello 1"
  }
}

Process.fork {
  10.times {
    # Do some work in process 2
    sleep 1
    puts "Hello 2"
  }
}

# Wait for both child processes to finish
Process.waitall
Using fork will split the processing between CPU cores, so if you can live without threads then separate processes are one way to do it.
As nice as ruby is, it's not known for its speed of execution. That being said, if, as noted in your comment, you can break up the input into equal-sized chunks you should be able to start up n instances of the program, where n is the number of cores you have, and the OS will take care of using all the cores for you.
In the best case it would run in 1/n the time, but this kind of thing can be tricky to get exactly right: some parts of the system, like memory, are shared between the processes, and contention between them can cause things not to scale linearly. If the split is easy to do I'd give it a try. You can also just try running the same program twice and see how long it takes; if running two takes about the same time as running one, you're likely all set. Just split your data and go for it.
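If you would rather script the split than open four shells by hand, a hedged sketch of the chunk-per-process idea looks like this. Here data and process_item are placeholders for the real input and per-item work, Etc.nprocessors assumes a reasonably recent Ruby, and fork is not available on Windows:
require 'etc'

data = (1..1_000).to_a                    # placeholder input
n    = Etc.nprocessors

data.each_slice((data.size / n.to_f).ceil) do |chunk|
  Process.fork do
    chunk.each { |item| process_item(item) }   # placeholder per-item work
  end
end

Process.waitall   # wait for every child process to finish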
Trying JRuby and some threads would probably help, but that adds a fair amount of complexity. (It would probably be a good excuse to learn about threading.)
Threading is usually considered one of Ruby's weak points, but it depends more on which implementation of Ruby you use.
A really good writeup on the different threading models is "Does ruby have real multithreading?".
From my experience, and from what I've gathered from people who know this stuff better than I do, it seems that if you are going to choose a Ruby implementation, JRuby is the way to go. Though, if you are learning Ruby, you might want to choose another language such as Erlang, or maybe Clojure, which are popular choices if you want to use the JVM.

Can Ruby Fibers be Concurrent?

I'm trying to get some speed up in my program and I've been told that Ruby Fibers are faster than threads and can take advantage of multiple cores. I've looked around, but I just can't find how to actually run different fibers concurrently. With threads you can do this:
threads = []
threads << Thread.new { do_something }
threads << Thread.new { do_something_else }
threads.each { |thread| thread.join }
I can't see how to do something like this with fibers. All I can find is yield and resume which seems like just a bunch of starting and stopping between the fibers. Is there a way to do true concurrency with fibers?
No, you cannot do concurrency with Fibers. Fibers simply aren't a concurrency construct, they are a control-flow construct, like Exceptions. That's the whole point of Fibers: they never run in parallel, they are cooperative and they are deterministic. Fibers are coroutines. (In fact, I never understood why they aren't simply called Coroutines.)
The only concurrency construct in Ruby is Thread.
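A minimal sketch of what that cooperative, deterministic control flow looks like; nothing here runs in parallel, and control only moves at resume/yield:
producer = Fiber.new do
  3.times do |i|
    Fiber.yield i          # hand control (and a value) back to the caller
  end
end

3.times { puts producer.resume }   # prints 0, 1, 2, one value per resume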
There seems to be a terminology issue between concurrency and parallelism.
I just can't find how to actually run different fibers concurrently.
I think you actually talk about parallelism, not about concurrency:
Concurrency is when two tasks can start, run, and complete in overlapping time periods. It doesn't necessarily mean they'll ever both be running at the same instant, e.g. multitasking on a single-core machine. Parallelism is when tasks literally run at the same time, e.g. on a multicore processor.
Quoting: Concurrency vs Parallelism - What is the difference?.
Also well illustrated here:
http://concur.rspace.googlecode.com/hg/talk/concur.html#title-slide
So to answer the question:
Fibers are primitives for implementing light weight cooperative concurrency in Ruby.
http://www.ruby-doc.org/core-2.1.1/Fiber.html
Which doesn't mean they can run in parallel.
If you want true parallelism you'll want to use threads with JRuby (which doesn't actually have native fibers; it only has threads, and implements each fiber with a thread of its own).
Another option is to fork new processes, which can run in true parallel on MRI.
