Why is a concurrent loop slower than a normal loop in this scenario? - ruby

I am learning about threads in Ruby from The Ruby Programming Language book and found this method, which is described as a concurrent version of the each iterator:
module Enumerable
  def concurrently
    map { |item| Thread.new { yield item } }.each { |t| t.join }
  end
end
The following code
start=Time.now
arr.concurrently{ |n| puts n} # Ran using threads
puts "Time Taken #{Time.now-start}"
outputs: Time Taken 6.6278332
While
start=Time.now
arr.each{ |n| puts n} # Normal each loop
puts "Time Taken #{Time.now-start}"
outputs: Time Taken 0.132975928
Why is it faster without threads? Is the implementation wrong, or is it that the second version only runs the puts statements while the first also spends time allocating, initializing and terminating the threads?

Threads in MRI (the "gold standard" Ruby) do not really run in parallel. There's a Global VM Lock (GVL) that allows only one thread to execute Ruby code at a time. It does, however, let other threads run while the current thread is blocked on I/O, but that's not your case.
So, your code runs serially, and you have threading overhead (creating/destroying threads, etc). That's why it's slower.
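A rough way to see both effects is to time the thread-per-item approach against a plain loop, once with trivial work and once with work that blocks. This is only a sketch: the helper `time_it`, the item counts and the `sleep 0.05` stand-in for blocking I/O are assumptions for illustration, not part of the question.

```ruby
require 'benchmark'

# Hypothetical helper: wall-clock seconds taken by the block.
def time_it(&block)
  Benchmark.realtime(&block)
end

items = (1..2_000).to_a

# Trivial work: the cost of creating and joining 2,000 threads dominates.
plain_cheap    = time_it { items.each { |n| n * 2 } }
threaded_cheap = time_it { items.map { |n| Thread.new { n * 2 } }.each(&:join) }

# Blocking work: sleep releases the GVL, so the waits overlap.
slow_items = (1..10).to_a
plain_blocking    = time_it { slow_items.each { sleep 0.05 } }
threaded_blocking = time_it { slow_items.map { Thread.new { sleep 0.05 } }.each(&:join) }

puts format('cheap work:    each %.4fs, threads %.4fs', plain_cheap, threaded_cheap)
puts format('blocking work: each %.4fs, threads %.4fs', plain_blocking, threaded_blocking)
```

On MRI the threaded version should lose on the cheap work and win on the blocking work; the exact numbers depend on the machine.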

Related

Ruby thread safe thread creation

I came across this bit of code today:
@thread ||= Thread.new do
  # this thread should only spin up once
end
It is being called by multiple threads. I was worried that if multiple threads are calling this code that you could have multiple threads created since @thread access is not synchronized. However, I was told that this could not happen because of the Global Interpreter Lock. I did a little bit of reading about threads in Ruby and it seems like individual threads that are running Ruby code can get preempted by other threads. If this is the case, couldn't you have an interleaving like this:
Thread A                Thread B
========                ========
Read from @thread       .
Thread.new              .
[Thread A preempted]    .
.                       Read from @thread
.                       Thread.new
.                       Write to @thread
Write to @thread
Additionally, since access to @thread is not synchronized are writes to @thread guaranteed to be visible to all other threads? The memory models of other languages I've used in the past do not guarantee visibility of writes to memory unless you synchronize access to that memory using atomics, mutexes, etc.
I'm still learning Ruby and realize I have a long way to go to understanding concurrency in Ruby. Any help on this would be super appreciated!
You need a mutex. Essentially the only thing the GIL protects you from is accessing uninitialized memory. If something in Ruby could be well-defined without being atomic, you should not assume it is atomic.
Here is a simple example showing that the interleaving you describe is possible. I get the "double set" message every time I run it:
$global = nil
$thread = nil
threads = Array.new(1000) do
  Thread.new do
    sleep 1
    $thread ||= Thread.new do
      if $global
        warn "double set!"
      else
        $global = true
      end
    end
  end
end
threads.each(&:join)
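For comparison, here is a minimal sketch of the mutex-guarded version. The class `BackgroundWorker`, its method `ensure_started` and the `spawn_count` counter are hypothetical names made up for this illustration; the only real API used is `Mutex#synchronize`.

```ruby
# Hypothetical class for illustration; the names are not from the question.
class BackgroundWorker
  attr_reader :spawn_count

  def initialize
    @lock = Mutex.new
    @thread = nil
    @spawn_count = 0
  end

  # Safe to call from many threads: the check and the assignment
  # both happen while holding the lock, so they cannot interleave.
  def ensure_started
    @lock.synchronize do
      @thread ||= begin
        @spawn_count += 1
        Thread.new { sleep 0.1 } # stand-in for the real background work
      end
    end
  end
end

worker = BackgroundWorker.new
callers = Array.new(100) { Thread.new { worker.ensure_started } }
callers.each(&:join)
puts worker.spawn_count # prints 1
```

Because every caller takes the same lock before reading `@thread`, a second caller can only see the value the first one has already written.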

Ruby: CPU-Load degradation of concurrent/multithreaded task?

Preamble: I am working on a project to restore a TrueCrypt container. It was cut into more than 3M small files, most likely in random order, and the goal is to find either the beginning or the end of the container, which contain the encryption keys.
To do so I've written a small Ruby script that starts many TrueCrypt processes concurrently, attempting to mount the main headers or restore the backup headers. Interaction with TrueCrypt occurs through spawned PTYs:
PTY.spawn(@cmd) do |stdout, stdin, pid|
  @spawn = {stdout: stdout, stdin: stdin, pid: pid}
  if test_type == :forward
    process_truecrypt_forward
  else
    process_truecrypt_backward
  end
  stdin.puts
  pty_expect('Incorrect password')
  Process.kill('INT', pid)
  stdin.close
  stdout.close
  Process.wait(pid)
end
This all works fine and successfully finds the required pieces of a test container. To speed things up (and I need to process over 3M pieces) I first used Ruby MRI multithreading and, after reading about the problems with it, switched to concurrent-ruby.
My implementation is pretty straightforward:
log 'Starting DB test'
concurrent_db = Concurrent::Array.new(@db)
futures = []
progress_bar = initialize_progress_bar('Running DB test', concurrent_db.size)
MAXIMUM_FUTURES.times do
  log "Started new future, total #{futures.size} futures"
  futures << Concurrent::Future.execute do
    my_piece = nil
    run = 1
    until concurrent_db.empty?
      my_piece = concurrent_db.slice!(0, SLICE_PER_FUTURE)
      break unless my_piece
      log "Run #{run}, sliced #{my_piece.size} pieces, #{concurrent_db.size} left"
      my_piece.each { |a| run_single_test(a) }
      progress_bar.progress += my_piece.size
      run += 1
    end
    log 'Future finished'
  end
end
Then I rented a large AWS instance with 74 CPU cores and thought: "now I'm going to process it fast". But the problem is that no matter how many futures/threads (and I mean 20 or 1000) I launch simultaneously, I never get above ~50 checks/second.
When I launch 1000 threads, the CPU load stays at 100% for only 20-30 minutes, then drops until it reaches about 15% and stays there. Graph of typical CPU load within such a run. Disk load is not an issue: I am hitting 3 MiB/s at most, using Amazon EBS storage.
What am I missing? Why can't I utilize 100% of the CPU and achieve better performance?
It's hard to say why exactly you aren't seeing the benefits of multithreading. But here's my guess.
Let's say you have a really intensive Ruby method that takes 10 seconds to run called do_work. And, even worse, you need to run this method 100 times. Rather than wait 1000 seconds, you might try to multithread it. That could divide the work among your CPU cores, halving or maybe even quartering the runtime:
Array.new(100) { Thread.new { do_work } }.each(&:join)
But no, this is probably still going to take 1000 seconds to finish. Why?
The Global VM Lock
Consider this example:
thread1 = Thread.new { class Foo; end; Foo.new }
thread2 = Thread.new { class Foo; end; Foo.new }
Creating a class in Ruby does a lot of stuff under the hood, for example it has to create an actual class object and assign that object's pointer to a global constant (in some order). What happens if thread1 registers that global constant, gets half way through creating the actual class object and then thread2 starts running, says "Oh, Foo already exists. Let's go ahead and run Foo.new". What happens since the class hasn't been fully defined? Or what if both thread1 and thread2 create a new class object and then both try to register their class as Foo? Which one wins? What about the class object that was created and now doesn't get registered?
The official Ruby solution for this is simple: don't actually run this code in parallel. Instead, there is one single, massive mutex called "the global VM lock" that protects anything that modifies the Ruby VM's state (such as making a class). So while the two threads above may be interleaved in various ways, it's impossible for the VM to end up in an invalid state because each VM operation is essentially atomic.
Example
This takes about 6 seconds to run on my laptop:
def do_work
  Array.new(100000000) { |i| i * i }
end
This takes about 18 seconds, obviously
3.times { do_work }
But, this also takes about 18, because the GVL prevents the threads from actually running in parallel
Array.new(3) { Thread.new { do_work } }.each(&:join)
This also takes 6 seconds to run
def do_work2
  sleep 6
end
But now this also takes about 6 seconds to run:
Array.new(3) { Thread.new { do_work2 } }.each(&:join)
Why? If you dig through the Ruby source code, you find that sleep ultimately calls the C function native_sleep and in there we see
GVL_UNLOCK_BEGIN(th);
{
//...
}
GVL_UNLOCK_END(th);
The Ruby devs know that sleep doesn't affect the VM state, so they explicitly unlocked the GVL to allow it to run in parallel. It can be tricky to figure out exactly what locks/unlocks the GVL and when you're going to see the performance benefit of it.
How to fix your code
My guess is that something in your code is hitting the GVL so while some parts of your threads are running in parallel (generally any subprocess/PTY stuff does), there's still contention between them in the Ruby VM causing some parts to serialize.
Your best bet with getting truly parallel Ruby code is to simplify it to something like this:
Array.new(x) { Thread.new { do_work } }
where you're sure that do_work is something simple that definitely unlocks the GVL, such as spawning a subprocess. You could try moving your Truecrypt code into a little shell script so that Ruby doesn't have to interact with it anymore once it gets going.
I recommend starting with a little benchmark that just starts a few subprocesses, and make sure that they are actually running in parallel by comparing the time to running them serially.
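Such a benchmark could look roughly like the sketch below. The child process here is just a second Ruby interpreter that sleeps, used as an assumed stand-in for the real external command; `run_child` is a hypothetical helper name.

```ruby
require 'benchmark'
require 'rbconfig'

# Stand-in for the real external command: a child Ruby process that sleeps.
# RbConfig.ruby is the path of the running interpreter.
def run_child
  system(RbConfig.ruby, '-e', 'sleep 0.5')
end

# Three children, one after another.
serial = Benchmark.realtime { 3.times { run_child } }

# Three children, each started and awaited from its own thread.
parallel = Benchmark.realtime do
  Array.new(3) { Thread.new { run_child } }.each(&:join)
end

puts format('serial: %.2fs, threaded: %.2fs', serial, parallel)
```

If the threaded time is close to that of a single child rather than three, the subprocesses are overlapping and the thread that waits for each one is not holding up the others.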

Parallelism in Ruby

I've got a loop in my Ruby build script that iterates over each project and calls msbuild and does various other bits like minify CSS/JS.
Each loop iteration is independent of the others so I'd like to parallelise it.
How do I do this?
I've tried:
myarray.each { |item|
  Thread.start {
    # do stuff
  }
}
puts "foo"
but Ruby just seems to exit straight away (prints "foo"). That is, it runs over the loop, starts a load of threads, but because there's nothing after the each, Ruby exits killing the other threads :(
I know I can do thread.join, but if I do this inside the loop then it's no longer parallel.
What am I missing?
I'm aware of http://peach.rubyforge.org/ but using that I get all kinds of weird behaviour that look like variable scoping issues that I don't know how to solve.
Edit
It would be useful if I could wait for all child-threads to execute before putting "foo", or at least the main ruby thread exiting. Is this possible?
Store all your threads in an array and loop through the array calling join:
threads = myarray.map do |item|
  Thread.start do
    # do stuff
  end
end
threads.each { |thread| thread.join }
puts "foo"
Use em-synchrony here :). Fibers are cute.
require "em-synchrony"
require "em-synchrony/fiber_iterator"
# if you really need a Fiber for each item;
# in real life you could set concurrency to, for example, 10 and it could even improve performance,
# depending on the amount of IO in your job
concurrency = myarray.size
EM.synchrony do
  EM::Synchrony::FiberIterator.new(myarray, concurrency).each do |url|
    # do some job here
  end
  EM.stop
end
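Take into account that MRI threads do not give you true parallelism: in Ruby 1.8 they are green threads, and in 1.9 and later they are native threads that the global lock lets run only one at a time. If parallelism is what you want, I would recommend taking a look at JRuby and Rubinius: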
http://www.engineyard.com/blog/2011/concurrency-in-jruby/

ruby thread block?

I read somewhere that ruby threads/fibre block the IO even with 1.9. Is this true and what does it truly mean? If I do some net/http stuff on multiple threads, is only 1 thread running at a given time for that request?
thanks
Assuming you are using CRuby, only one thread will be running at a time. However, the requests will be made in parallel, because each thread will be blocked on its IO while its IO is not finished. So if you do something like this:
require 'open-uri'
threads = 10.times.map do
  Thread.new do
    # URI.open is needed on Ruby 3.0+; plain open no longer accepts URLs there
    URI.open('http://example.com').read.length
  end
end
threads.each(&:join)
puts threads.map(&:value)
it will be much faster than doing it sequentially.
Also, you can check whether a thread has finished without blocking on its completion.
For example:
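require 'open-uri'
thread = Thread.new do
  sleep 10
  # URI.open is needed on Ruby 3.0+; plain open no longer accepts URLs there
  URI.open('http://example.com').read.length
end
# join(5) returns nil if the thread is still running after 5 seconds
puts 'still running' until thread.join(5)
puts thread.value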
With CRuby, the threads cannot run at the same time, but they are still useful. Some of the other implementations, like JRuby, have real threads and can run multiple threads in parallel.
Some good references:
http://yehudakatz.com/2010/08/14/threads-in-ruby-enough-already/
http://www.engineyard.com/blog/2011/ruby-concurrency-and-you/
All threads run simultaneously but IO will be blocked until they all finish.
In other words, threading doesn't give you the ability to "background" a process. The interpreter will wait for all of the threads to complete before sending further messages.
This is good if you think about it because you don't have to wonder about whether they are complete if your next process uses data that the thread is modifying/working with.
If you want to run processes in the background, check out delayed_job.

What happens when you don't join your Threads?

I'm writing a ruby program that will be using threads to do some work. The work that is being done takes a non-deterministic amount of time to complete and can range anywhere from 5 to 45+ seconds. Below is a rough example of what the threading code looks like:
loop do # Program loop
  items = get_items
  threads = []
  for item in items
    threads << Thread.new(item) do |i|
      # do work on i
    end
    threads.each { |t| t.join } # What happens if this isn't there?
  end
end
My preference would be to skip joining the threads and not block the entire application. However I don't know what the long term implications of this are, especially because the code is run again almost immediately. Is this something that is safe to do? Or is there a better way to spawn a thread, have it do work, and clean up when it's finished, all within an infinite loop?
I think it really depends on the content of your thread work. If, for example, your main thread needed to print "X work done", you would need to join to guarantee that you were showing the correct answer. If you have no such requirement, then you wouldn't necessarily need to join up.
After writing the question out, I realized that this is the exact thing that a web server does when serving pages. I googled and found the following article of a Ruby web server. The loop code looks pretty much like mine:
loop do
  session = server.accept
  request = session.gets
  # log stuff
  Thread.start(session, request) do |session, request|
    HttpServer.new(session, request, basePath).serve()
  end
end
Thread.start is effectively the same as Thread.new, so it appears that letting the threads finish and die off is OK to do.
If you split a workload across several threads and need to combine their results at the end, you definitely need a join; otherwise you can do without it.
If you removed the join, you could end up with new items getting started faster than the older ones get finished. If you're working on too many items at once, it may cause performance issues.
You should use a Queue instead (snippet from http://ruby-doc.org/stdlib/libdoc/thread/rdoc/classes/Queue.html):
require 'thread'
queue = Queue.new
producer = Thread.new do
  5.times do |i|
    sleep rand(i) # simulate expense
    queue << i
    puts "#{i} produced"
  end
end
consumer = Thread.new do
  5.times do |i|
    value = queue.pop
    sleep rand(i/2) # simulate expense
    puts "consumed #{value}"
  end
end
consumer.join
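To keep the number of items in flight bounded, the same Queue can feed a fixed set of worker threads. The following is a minimal sketch under assumptions: the worker count of 4, the 20 integer items and the `sleep 0.01` stand-in for real work are all made up for illustration, and `Queue#close` requires Ruby 2.3 or later.

```ruby
require 'thread'

worker_count = 4 # assumed limit; tune for the workload

jobs = Queue.new
results = Queue.new
20.times { |i| jobs << i }
jobs.close # once closed and drained, pop returns nil instead of blocking

workers = Array.new(worker_count) do
  Thread.new do
    # Each worker keeps pulling items until the queue is exhausted.
    while (item = jobs.pop)
      sleep 0.01 # stand-in for the real work on item
      results << item * 2
    end
  end
end

workers.each(&:join)
puts "processed #{results.size} items" # prints "processed 20 items"
```

In a long-running program loop the workers would normally be started once and the queue left open, so new items can be pushed as they arrive without creating a thread per item.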

Resources