Ruby: CPU-load degradation of a concurrent/multithreaded task?

Preamble: I am working on a project to restore a TrueCrypt container. It was cut into more than 3M small files, most likely in random order, and the goal is to find either the beginning or the end of the container, which holds the encryption keys.
To do so I've written a small Ruby script that starts many truecrypt processes concurrently, attempting to mount the main or restore the backup headers. Interaction with truecrypt occurs through spawned PTYs:
PTY.spawn(@cmd) do |stdout, stdin, pid|
  @spawn = { stdout: stdout, stdin: stdin, pid: pid }
  if test_type == :forward
    process_truecrypt_forward
  else
    process_truecrypt_backward
  end
  stdin.puts
  pty_expect('Incorrect password')
  Process.kill('INT', pid)
  stdin.close
  stdout.close
  Process.wait(pid)
end
This all works fine and successfully finds the required pieces of a test container. To speed things up (and I need to process over 3M pieces) I first used Ruby MRI multithreading, and after reading about its problems I switched to concurrent-ruby.
My implementation is pretty straightforward:
log 'Starting DB test'
concurrent_db = Concurrent::Array.new(@db)
futures = []
progress_bar = initialize_progress_bar('Running DB test', concurrent_db.size)

MAXIMUM_FUTURES.times do
  log "Started new future, total #{futures.size} futures"
  futures << Concurrent::Future.execute do
    my_piece = nil
    run = 1
    until concurrent_db.empty?
      my_piece = concurrent_db.slice!(0, SLICE_PER_FUTURE)
      break unless my_piece
      log "Run #{run}, sliced #{my_piece.size} pieces, #{concurrent_db.size} left"
      my_piece.each { |a| run_single_test(a) }
      progress_bar.progress += my_piece.size
      run += 1
    end
    log 'Future finished'
  end
end
Then I rented a large AWS instance with 74 CPU cores and thought "now I'm going to process it fast". But the problem is that no matter how many futures/threads (and I mean 20 or 1000) I launch simultaneously, I never get above ~50 checks/second.
When I launch 1000 threads the CPU load stays at 100% only for 20-30 minutes, then drops until it levels off around 15% and stays there. (Graph of typical CPU load within such a run.) Disk load is not an issue; I am hitting 3 MiB/s at most, using Amazon EBS storage.
What am I missing? Why can't I utilize 100% CPU and achieve better performance?

It's hard to say why exactly you aren't seeing the benefits of multithreading. But here's my guess.
Let's say you have a really intensive Ruby method called do_work that takes 10 seconds to run. And, even worse, you need to run it 100 times. Rather than wait 1000 seconds, you might try to multithread it. That could divide the work among your CPU cores, halving or maybe even quartering the runtime:
Array.new(100) { Thread.new { do_work } }.each(&:join)
But no, this is probably still going to take 1000 seconds to finish. Why?
The Global VM Lock
Consider this example:
thread1 = Thread.new { class Foo; end; Foo.new }
thread2 = Thread.new { class Foo; end; Foo.new }
Creating a class in Ruby does a lot of work under the hood: for example, it has to create the actual class object and assign that object's pointer to a global constant (in some order). What happens if thread1 registers that global constant, gets halfway through creating the actual class object, and then thread2 starts running, says "Oh, Foo already exists. Let's go ahead and run Foo.new", while the class hasn't been fully defined? Or what if both thread1 and thread2 create a new class object and then both try to register their class as Foo? Which one wins? What about the class object that was created and now doesn't get registered?
The official Ruby solution for this is simple: don't actually run this code in parallel. Instead, there is one single, massive mutex called "the global VM lock" that protects anything that modifies the Ruby VM's state (such as making a class). So while the two threads above may be interleaved in various ways, it's impossible for the VM to end up in an invalid state because each VM operation is essentially atomic.
Example
This takes about 6 seconds to run on my laptop:
def do_work
  Array.new(100000000) { |i| i * i }
end
This takes about 18 seconds, obviously:
3.times { do_work }
But this also takes about 18 seconds, because the GVL prevents the threads from actually running in parallel:
Array.new(3) { Thread.new { do_work } }.each(&:join)
This also takes 6 seconds to run:
def do_work2
  sleep 6
end
But now this also takes about 6 seconds to run:
Array.new(3) { Thread.new { do_work2 } }.each(&:join)
Why? If you dig through the Ruby source code, you find that sleep ultimately calls the C function native_sleep and in there we see
GVL_UNLOCK_BEGIN(th);
{
//...
}
GVL_UNLOCK_END(th);
The Ruby devs know that sleep doesn't affect the VM state, so they explicitly unlocked the GVL to allow it to run in parallel. It can be tricky to figure out exactly what locks and unlocks the GVL, and when you're going to see a performance benefit from threading.
How to fix your code
My guess is that something in your code is hitting the GVL, so while some parts of your threads are running in parallel (generally any subprocess/PTY stuff does), there's still contention between them in the Ruby VM, causing some parts to serialize.
Your best bet with getting truly parallel Ruby code is to simplify it to something like this:
Array.new(x) { Thread.new { do_work } }
where you're sure that do_work is something simple that definitely unlocks the GVL, such as spawning a subprocess. You could try moving your Truecrypt code into a little shell script so that Ruby doesn't have to interact with it anymore once it gets going.
I recommend starting with a little benchmark that just starts a few subprocesses, and make sure that they are actually running in parallel by comparing the time to running them serially.
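For instance, a minimal sketch along these lines (using sleep as a stand-in for the real truecrypt invocation) should show the threaded run finishing in roughly 1 second instead of 8 if the subprocesses really do run in parallel:

require 'benchmark'

N = 8

serial = Benchmark.realtime do
  N.times { system('sleep 1') } # stand-in for the real truecrypt command
end

threaded = Benchmark.realtime do
  # system releases the GVL while waiting on the child process,
  # so these should overlap almost completely
  N.times.map { Thread.new { system('sleep 1') } }.each(&:join)
end

puts "serial: #{serial.round(2)}s, threaded: #{threaded.round(2)}s"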

Related

Celluloid async inside ruby blocks does not work

Trying to implement Celluloid async on my working example seems to exhibit weird behavior.
Here is what my code looks like:
class Indefinite
  include Celluloid

  def run!
    loop do
      [1].each do |i|
        async.on_background
      end
    end
  end

  def on_background
    puts "Running in background"
  end
end

Indefinite.new.run!
But when I run the above code, I never see "Running in background" printed.
However, if I put a sleep in, the code seems to work.
class Indefinite
  include Celluloid

  def run!
    loop do
      [1].each do |i|
        async.on_background
      end
      sleep 0.5
    end
  end

  def on_background
    puts "Running in background"
  end
end

Indefinite.new.run!
Any idea why there is such a difference between the two scenarios?
Thanks.
Your main loop is dominating the actor/application's threads.
All your program is doing is spawning background processes, but never running them. You need that sleep in the loop purely to allow the background threads to get attention.
It is not usually a good idea to have an unconditional loop spawn infinite background processes like you have here. There ought to be either a delay, or a conditional statement put in there... otherwise you just have an infinite loop spawning things that never get invoked.
Think about it like this: if you put puts "looping" just inside your loop, while you do not see Running in the background ... you will see looping over and over and over.
Approach #1: Use every or after blocks.
The best way to fix this is not to use sleep inside a loop, but to use an after or every block, like this:
every(0.1) {
  on_background
}
Or best of all, if you want to make sure the process runs completely before running again, use after instead:
def run_method
  @running ||= false
  unless @running
    @running = true
    on_background
    @running = false
  end
  after(0.1) { run_method }
end
Using a loop is not a good idea with async unless there is some kind of flow control in place, or a blocking call such as @server.accept... otherwise it will just peg 100% of a CPU core for no good reason.
By the way, you can also use now_and_every as well as now_and_after too... this would run the block right away, then run it again after the amount of time you want.
Using every is shown in this gist:
https://gist.github.com/digitalextremist/686f42e58a58b743142b
The ideal situation, in my opinion:
This is a rough but immediately usable example:
https://gist.github.com/digitalextremist/12fc824c6a4dbd94a9df
require 'celluloid/current'

class Indefinite
  include Celluloid

  INTERVAL = 0.5
  ONE_AT_A_TIME = true

  def self.run!
    puts "000a Instantiating."
    indefinite = new
    indefinite.run
    puts "000b Running forever:"
    sleep
  end

  def initialize
    puts "001a Initializing."
    @mutex = Mutex.new if ONE_AT_A_TIME
    @running = false
    puts "001b Interval: #{INTERVAL}"
  end

  def run
    puts "002a Running."
    unless ONE_AT_A_TIME && @running
      if ONE_AT_A_TIME
        @mutex.synchronize {
          puts "002b Inside lock."
          @running = true
          on_background
          @running = false
        }
      else
        puts "002b Without lock."
        on_background
      end
    end
    puts "002c Setting new timer."
    after(INTERVAL) { run }
  end

  def on_background
    if ONE_AT_A_TIME
      puts "003 Running background processor in foreground."
    else
      puts "003 Running in background"
    end
  end
end

Indefinite.run!
puts "004 End of application."
This will be its output, if ONE_AT_A_TIME is true:
000a Instantiating.
001a Initializing.
001b Interval: 0.5
002a Running.
002b Inside lock.
003 Running background processor in foreground.
002c Setting new timer.
000b Running forever:
002a Running.
002b Inside lock.
003 Running background processor in foreground.
002c Setting new timer.
002a Running.
002b Inside lock.
003 Running background processor in foreground.
002c Setting new timer.
002a Running.
002b Inside lock.
003 Running background processor in foreground.
002c Setting new timer.
002a Running.
002b Inside lock.
003 Running background processor in foreground.
002c Setting new timer.
002a Running.
002b Inside lock.
003 Running background processor in foreground.
002c Setting new timer.
002a Running.
002b Inside lock.
003 Running background processor in foreground.
002c Setting new timer.
And this will be its output if ONE_AT_A_TIME is false:
000a Instantiating.
001a Initializing.
001b Interval: 0.5
002a Running.
002b Without lock.
003 Running in background
002c Setting new timer.
000b Running forever:
002a Running.
002b Without lock.
003 Running in background
002c Setting new timer.
002a Running.
002b Without lock.
003 Running in background
002c Setting new timer.
002a Running.
002b Without lock.
003 Running in background
002c Setting new timer.
002a Running.
002b Without lock.
003 Running in background
002c Setting new timer.
002a Running.
002b Without lock.
003 Running in background
002c Setting new timer.
002a Running.
002b Without lock.
003 Running in background
002c Setting new timer.
You need to be more "evented" than "threaded" to properly issue tasks and preserve scope and state, rather than issuing commands between threads/actors, which is what the every and after blocks provide. Besides that, it's good practice either way, even if you didn't have a Global Interpreter Lock to deal with, because in your example it doesn't seem like you are dealing with a blocking process. If you had a blocking process, then by all means have an infinite loop. But since you're just going to end up spawning an infinite number of background tasks before even one is processed, you need to either use a sleep like your question started with, or use a different strategy altogether and use every and after, which is how Celluloid itself encourages you to operate when it comes to handling data on sockets of any kind.
Approach #2: Use a recursive method call.
This just came up in the Google Group. The below example code will actually allow execution of other tasks, even though it's an infinite loop.
https://groups.google.com/forum/#!topic/celluloid-ruby/xmkdrMQBGbY
This approach is less desirable because it will likely have more overhead, spawning a series of fibers.
def work
  # ...
  async.work
end
Question #2: Thread vs. Fiber behaviors.
The second question is why the following would work: loop { Thread.new { puts "Hello" } }
That spawns an infinite number of threads, which are managed by the Ruby VM directly. Even though the VM you are using has a Global Interpreter Lock, that only prevents Ruby code from executing in parallel; the threads themselves are native operating-system threads (as of Ruby 1.9), not green threads managed inside the process. The OS scheduler runs each Thread without hesitation, and in the case of the example, each Thread runs very quickly and then dies.
An async task, by contrast, runs on a Fiber. So what happens in the default case is this:
Process starts.
Actor instantiated.
Method call invokes loop.
Loop invokes async method.
async method adds task to mailbox.
Mailbox is not invoked, and loop continues.
Another async task is added to the mailbox.
This continues infinitely.
The above happens because the loop method itself is running on a Fiber which is never suspended (unless a sleep is called!), and therefore the additional tasks added to the mailbox never get a chance to invoke a new Fiber. A Fiber behaves differently than a Thread. This is a good piece of reference material discussing the differences:
https://blog.engineyard.com/2010/concurrency-real-and-imagined-in-mri-threads
Question #3: Celluloid vs. Celluloid::ZMQ behavior.
The third question is why include Celluloid behaves differently than Celluloid::ZMQ ...
That's because Celluloid::ZMQ uses a reactor-based evented mailbox, versus Celluloid which uses a condition variable based mailbox.
Read more about pipelining and execution modes:
https://github.com/celluloid/celluloid/wiki/Pipelining-and-execution-modes
That is the difference between the two examples. If you have additional questions about how these mailboxes behave, feel free to post on the Google Group ... the main dynamic you are facing is the unique nature of the GIL interacting with the Fiber vs. Thread vs. Reactor behavior.
You can read more about the reactor-pattern here:
http://en.wikipedia.org/wiki/Reactor_pattern
Explanation of the "Reactor pattern"
What is the difference between event driven model and reactor pattern?
And see the specific reactor used by Celluloid::ZMQ here:
https://github.com/celluloid/celluloid-zmq/blob/master/lib/celluloid/zmq/reactor.rb
So what's happening in the evented mailbox scenario is that when sleep is hit, that is a blocking call, which causes the reactor to move on to the next task in the mailbox.
But also, and this is unique to your situation, the specific reactor being used by Celluloid::ZMQ comes from an external C library, specifically the 0MQ library. That reactor is external to your application, which behaves differently than Celluloid::IO or Celluloid itself, and that is also why the behavior differs from what you expected.
Multi-core Support Alternative
If maintaining state and scope is not important to you, and you use JRuby or Rubinius, which are not constrained by a Global Interpreter Lock the way MRI is, you can instantiate more than one actor and issue async calls between actors concurrently.
But my humble opinion is that you would be much better served using a very high-frequency timer, such as 0.001 or 0.1 in my example, which will seem instantaneous for all intents and purposes, but will also give the actor thread plenty of time to switch fibers and run other tasks in the mailbox.
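A minimal sketch of that idea (process_next_item is a hypothetical stand-in for your own work method):

require 'celluloid/current'

class Worker
  include Celluloid

  def run
    # Fires roughly every millisecond; each tick returns control to the
    # actor's mailbox so other tasks get a chance to run in between.
    every(0.001) { process_next_item }
  end

  def process_next_item
    # hypothetical: pull one unit of work off your own queue and handle it
  end
end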
Let's make an experiment by modifying your example a bit (we modify it because this way we get the same "weird" behaviour while making things cleaner):
class Indefinite
  include Celluloid

  def run!
    (1..100).each do |i|
      async.on_background i
    end
    puts "100 requests sent from #{Actor.current.object_id}"
  end

  def on_background(num)
    (1..100000000).each {}
    puts "message #{num} on #{Actor.current.object_id}"
  end
end

Indefinite.new.run!
sleep

# =>
# 100 requests sent from 2084
# message 1 on 2084
# message 2 on 2084
# message 3 on 2084
# ...
You can run it on any Ruby interpreter, using Celluloid or Celluloid::ZMQ; the result will always be the same. Also note that the output of Actor.current.object_id is the same in both methods, giving us the clue that we are dealing with a single actor in our experiment.
So there is not much difference between the Ruby and Celluloid implementations as far as this experiment is concerned.
Let's first address why this code behaves this way.
It's not hard to understand why it happens. Celluloid receives incoming requests and saves them in the task queue of the appropriate actor. Note that our original call to run! sits at the top of the queue.
Celluloid then processes those tasks one at a time. If there happens to be a blocking call or a sleep call, then according to the documentation, the next task will be invoked without waiting for the current task to complete.
Note that in our experiment there are no blocking calls. This means the run! method will be executed from beginning to end, and only after it's done will each of the on_background calls be invoked, in perfect order.
And that's how it's supposed to work.
If you add a sleep call to your code, it notifies Celluloid that it should start processing the next task in the queue. Hence the behavior in your second example.
Let's now move on to how to design the system so that it does not depend on sleep calls, which is weird to say the least.
Actually, there is a good example on the Celluloid-ZMQ project page. Note this loop:
def run
  loop { async.handle_message @socket.read }
end
The first thing it does is @socket.read. Note that it's a blocking operation. So Celluloid will proceed to the next message in the queue (if there is one). As soon as @socket.read responds, a new task will be generated. But this task won't be executed before @socket.read is called again, thus blocking execution and notifying Celluloid to proceed with the next item in the queue.
You probably see the difference from your example. You are not blocking anything, thus not giving Celluloid a chance to work through the queue.
How can we get the behavior shown in the Celluloid::ZMQ example?
The first (and, in my opinion, better) solution is to have an actual blocking call, like @socket.read.
If there are no blocking calls in your code and you still need to process things in the background, then you should consider other mechanisms provided by Celluloid.
There are several options with Celluloid.
One can use conditions, futures, notifications, or just call wait/signal at a low level, as in this example:
class Indefinite
  include Celluloid

  def run!
    loop do
      async.on_background
      result = wait(:background) #=> 33
    end
  end

  def on_background
    puts "background"
    # notifies waiters that they can continue
    signal(:background, 33)
  end
end

Indefinite.new.run!
sleep

# ...
# background
# background
# background
# ...
Using sleep(0) with Celluloid::ZMQ
I also noticed the working.rb file you mentioned in your comment. It contains the following loop:
loop { [1].each { |i| async.handle_message 'hello' } ; sleep(0) }
It looks like it's doing the job properly. But actually, running it under JRuby revealed that it's leaking memory. To make it even more apparent, try adding a sleep call to the handle_message body:
def handle_message(message)
  sleep 0.5
  puts "got message: #{message}"
end
The high memory usage is probably related to the fact that the queue fills very fast and cannot be processed in the given time. It will be more problematic if handle_message is more work-intensive than it is now.
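Outside of Celluloid, the standard way to stop a producer from outrunning its consumer is a bounded queue; Ruby's SizedQueue blocks the producer once the queue is full, which is exactly the backpressure missing from the loop above. A rough sketch:

require 'thread'

queue = SizedQueue.new(100) # producer blocks once 100 items are pending

consumer = Thread.new do
  while (message = queue.pop)
    sleep 0.5 # simulate slow handling, as above
    puts "got message: #{message}"
  end
end

producer = Thread.new do
  1000.times { queue << 'hello' } # blocks instead of growing without bound
  queue << nil                    # signal end of input
end

[producer, consumer].each(&:join)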
Solutions with sleep
I'm skeptical about solutions involving sleep. They potentially require a lot of memory and can even generate memory leaks. And it's not clear what you should pass as a parameter to the sleep method, or why.
How threads work with Celluloid
Celluloid does not create a new thread for each asynchronous task. It has a pool of threads in which it runs every task, synchronous and asynchronous ones alike. The key point is that the library treats the run! method as a synchronous task and performs it in the same context as an asynchronous task.
By default, Celluloid runs everything in a single thread, using a queue system to schedule asynchronous tasks for later. It creates new threads only when needed.
Besides that, Celluloid overrides the sleep method. This means that every time you call sleep in a class that includes the Celluloid module, the library will check whether there are non-sleeping threads in its pool.
In your case, the first time you call sleep 0.5, it will create a new Thread to perform the asynchronous tasks in the queue while the first thread is sleeping.
So in your first example, only one Celluloid thread is running, performing the loop. In your second example, two Celluloid threads are running, the first one performing the loop and sleeping at each iteration, the other one performing the background task.
You could for instance change your first example to perform a finite number of iterations:
def run!
  (0..100).each do
    [1].each do |i|
      async.on_background
    end
  end
  puts "Done!"
end
When using this run! method, you'll see that Done! is printed before all of the Running in background lines, meaning that Celluloid finishes executing the run! method before starting the asynchronous tasks on the same thread.

Mutex for ActiveRecord Model

My User model has a nasty method that should not be called simultaneously for two instances of the same record. I need to execute two http requests in a row and at the same time make sure that any other thread does not execute the same method for the same record at the same time.
class User
  ...

  def nasty_long_running_method
    # Something nasty will happen if this method is called simultaneously
    # for two instances of the same record and the later one finishes http_request_1
    # before the first one finishes http_request_2.
    http_request_1 # takes 1-3 seconds
    http_request_2 # takes 1-3 seconds
    update_model
  end
end
For example this would break everything:
user = User.first
Thread.new { user.nasty_long_running_method }
Thread.new { user.nasty_long_running_method }
But this would be ok and it should be allowed:
user1 = User.find(1)
user2 = User.find(2)
Thread.new { user1.nasty_long_running_method }
Thread.new { user2.nasty_long_running_method }
What would be the best way to make sure the method is not called simultaneously for two instances of the same record?
I found the gem Remote lock when searching for a solution to my problem. It is a mutex solution that uses Redis as the backend.
It:
is accessible for all processes
does not lock the database
is in memory -> fast and no IO
The method looks like this now:
def nasty
  $lock = RemoteLock.new(RemoteLock::Adapters::Redis.new(REDIS))
  $lock.synchronize("capi_lock_#{user_id}") do
    http_request_1
    http_request_2
    update_user
  end
end
I would start by adding a mutex or semaphore. Read about Mutex here: http://www.ruby-doc.org/core-2.1.2/Mutex.html
class User
  ...

  def nasty
    @semaphore ||= Mutex.new
    @semaphore.synchronize {
      # only one thread at a time can enter this block...
    }
  end
end
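Note that @semaphore above lives on a single User instance, so two instances of the same record get two different mutexes. Within one process you could instead keep a class-level registry of mutexes keyed by record id; a rough sketch (single-process only, not valid across multiple app servers):

class User
  LOCKS = Hash.new { |hash, key| hash[key] = Mutex.new }
  LOCKS_GUARD = Mutex.new # protects the registry itself

  def nasty
    mutex = LOCKS_GUARD.synchronize { LOCKS[id] }
    mutex.synchronize do
      # only one thread per record id can enter this block at a time
    end
  end
end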
If your class is an ActiveRecord object you might want to use Rails' locking and database transactions. See: http://api.rubyonrails.org/classes/ActiveRecord/Locking/Pessimistic.html
def nasty
  User.transaction do
    lock!
    ...
    save!
  end
end
Update: You updated your question with more details, and it seems my solutions no longer really fit. The first solution does not work if you have multiple processes running. The second locks only the database row; it does not prevent multiple threads from entering the code block at the same time.
Therefore I would think about building a database-based semaphore.
class Semaphore < ActiveRecord::Base
  belongs_to :item, :polymorphic => true

  def self.get_lock(item)
    # may raise an exception from unique key constraints in the db
    create(:item => item) rescue false
  end

  def release
    destroy
  end
end
The database should have a unique index covering the columns of the polymorphic association to item. That should protect multiple threads from getting a lock for the same item at the same time. Your method would look like this:
def nasty
  semaphore = nil
  semaphore = Semaphore.get_lock(user) until semaphore
  ...
  semaphore.release
end
There are a couple of problems to solve around this: How long do you want to wait to get the semaphore? What happens if the external http requests take ages? Do you need to store additional pieces of information (hostname, pid) to identify which thread locked an item? You will need some kind of cleanup task that removes locks that still exist after a certain period of time or after a server restart.
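The cleanup could be as simple as a scheduled job that deletes stale rows; the 10-minute threshold here is an arbitrary example:

# e.g. in a rake task run from cron or a scheduler
Semaphore.where('created_at < ?', 10.minutes.ago).delete_all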
Furthermore, I think it is a terrible idea to have something like this run inside a web server. At the very least, move all of that into background jobs. That might solve your problem if your app is small and needs just one background job to get everything done.
You state that this is an ActiveRecord model, in which case the usual approach would be to use a database lock on that record. No need for additional locking mechanisms as far as I can see.
Take a look at the short (one page) Rails Guides section on pessimistic locking - http://guides.rubyonrails.org/active_record_querying.html#pessimistic-locking
Basically you can get a lock on a single record or a whole table (if you were updating a lot of things)
In your case something like this should do the trick...
class User < ActiveRecord::Base
  ...

  def nasty_long_running_method
    with_lock do
      # Something nasty will happen if this method is called simultaneously
      # for two instances of the same record and the later one finishes http_request_1
      # before the first one finishes http_request_2.
      http_request_1 # takes 1-3 seconds
      http_request_2 # takes 1-3 seconds
      update_model
    end
  end
end
I recently created a gem called szymanskis_mutex. It is a module that you can include in the User class, and it provides the method mutual_exclusion(concern) to give you the functionality you want.
It doesn't rely on a database and doesn't depend on how many processes want to enter the critical section at any given moment.
Note that if the class is initialized on different servers it will not work.
It may suit your needs if your app is small enough. Your code would look like this:
class User
  include SzymanskisMutex
  ...

  def nasty_long_running_method
    mutual_exclusion(:nasty_long) do
      http_request_1 # takes 1-3 seconds
      http_request_2 # takes 1-3 seconds
    end
    update_model
  end
end
I suggest rethinking your architecture, as this is not going to be scalable: imagine having multiple Ruby processes, failing processes, timeouts, and so on. Also, in-process locking and spawning threads is quite dangerous for application servers.
If you want to sleep well in production, try an async background-processing framework for long-running tasks, with a serial queue to ensure the tasks run in order. Simple RabbitMQ will do, or check this Q&A: Best practice for Rails App to run a long task in the background? You could also eventually try the DB with optimistic locking.
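As a sketch of that idea with Sidekiq (NastyJob is a hypothetical worker; running its queue with concurrency 1 in the Sidekiq config is what makes the jobs strictly serial):

class NastyJob
  include Sidekiq::Worker
  sidekiq_options queue: :serial_user_tasks # configure this queue with concurrency 1

  def perform(user_id)
    User.find(user_id).nasty_long_running_method
  end
end

# in the web process, enqueue instead of spawning threads:
NastyJob.perform_async(user.id)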

Parallelism in Ruby

I've got a loop in my Ruby build script that iterates over each project and calls msbuild, and also does various other bits like minifying CSS/JS.
Each loop iteration is independent of the others so I'd like to parallelise it.
How do I do this?
I've tried:
myarray.each { |item|
  Thread.start {
    # do stuff
  }
}
puts "foo"
but Ruby just seems to exit straight away (it prints "foo"). That is, it runs over the loop and starts a load of threads, but because there's nothing after the each, Ruby exits, killing the other threads :(
I know I can do thread.join, but if I do this inside the loop then it's no longer parallel.
What am I missing?
I'm aware of http://peach.rubyforge.org/ but using that I get all kinds of weird behaviour that look like variable scoping issues that I don't know how to solve.
Edit
It would be useful if I could wait for all child threads to finish before printing "foo", or at least before the main Ruby thread exits. Is this possible?
Store all your threads in an array and loop through the array calling join:
threads = myarray.map do |item|
  Thread.start do
    # do stuff
  end
end
threads.each { |thread| thread.join }
puts "foo"
Use em-synchrony here :). Fibers are cute.
require "em-synchrony"
require "em-synchrony/fiber_iterator"
# if you realy need to get a Fiber per each item
# in real life you could set concurrency to, for example, 10 and it could even improve performance
# it depends on amount of IO in your job
concurrency = myarray.size
EM.synchrony do
EM::Synchrony::FiberIterator.new(myarray, concurrency).each do |url|
# do some job here
end
EM.stop
end
Take into account that MRI 1.8 threads are green threads, so you don't natively get true parallelism. If that is what you want, I would recommend you take a look at JRuby and Rubinius:
http://www.engineyard.com/blog/2011/concurrency-in-jruby/

Improving ruby code performance with threading not working

I am trying to optimize this piece of Ruby code with threads; it involves a fair bit of IO and network activity. Unfortunately that is not going down too well.
# each host is somewhere in the local network
hosts.each { |host|
  # reach out to every host with something for it to do
  # wait for host to complete the work and get back
}
My original plan was to wrap the inside of the loop in a new Thread for each iteration. Something like:
# each host is somewhere in the local network
hosts.each { |host|
  Thread.new {
    # reach out to every host with something for it to do
    # wait for host to complete the work and get back
  }
}
# join all threads here before main ends
I was hoping that since this is I/O bound, I should be able to gain something even without Ruby 1.9, but nope, nothing. Any ideas how this might be improved?
I'm not sure how many hosts you have, but if it's a lot you may want to try a producer/consumer model with a fixed number of threads:
require 'thread'

THREADS_COUNT = (ARGV[0] || 4).to_i

$q = Queue.new
hosts.each { |h| $q << h }

threads = (1..THREADS_COUNT).map {
  Thread.new {
    begin
      loop {
        host = $q.shift(true)
        # do something with host
      }
    rescue ThreadError
      # queue is empty, pass
    end
  }
}
threads.each(&:join)
If this is all too complicated, you may try using xargs -P. ;)
@Fanatic23, before you draw any conclusions, instrument your code with puts and see whether the network requests actually overlap. In each call to puts, print a status string indicating the line that is executing, along with Time.now and Time.now.nsec.
You say "since this [is] I/O bound even without ruby 1.9 I should be able to gain something".
Vain hope. :-)
When a thread in Ruby 1.8 blocks on IO, the entire process blocks on IO. This is because it uses green threads.
Upgrade to Ruby 1.9, and you'll have access to your platform's native threads implementation. For more, see:
http://en.wikipedia.org/wiki/Green_threads
Enjoy!

How to use ruby fibers to avoid blocking IO

I need to upload a bunch of files in a directory to S3. Since more than 90% of the time required to upload is spent waiting for the http request to finish, I want to execute several of them at once somehow.
Can Fibers help me with this at all? They are described as a way to solve this sort of problem, but I can't think of any way I can do any work while an http call blocks.
Any way I can solve this problem without threads?
I'm not up on fibers in 1.9, but regular Threads from 1.8.6 can solve this problem. Try using a Queue http://ruby-doc.org/stdlib/libdoc/thread/rdoc/classes/Queue.html
Looking at the example in the documentation, your consumer is the part that does the upload. It 'consumes' a URL and a file, and uploads the data. The producer is the part of your program that keeps working and finds new files to upload.
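A rough sketch of that producer/consumer shape (upload_file is a hypothetical helper, and the glob path is a placeholder; a fixed pool of consumer threads drains the queue):

require 'thread'

queue = Queue.new

# consumers: a fixed pool of uploader threads
uploaders = 4.times.map do
  Thread.new do
    while (file = queue.pop)
      upload_file(file) # hypothetical upload helper
    end
  end
end

# producer: the main program keeps finding files to upload
Dir.glob('uploads/*').each { |file| queue << file }
4.times { queue << nil } # one stop signal per consumer

uploaders.each(&:join)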
If you want to upload multiple files at once, simply launch a new Thread for each file:
t = Thread.new do
  upload_file(param1, param2)
end
@all_threads << t
Then, later on in your 'producer' code (which, remember, doesn't have to be in its own Thread, it could be the main program):
@all_threads.each do |t|
  t.join if t.alive?
end
The Queue can either be a @member_variable or a $global.
To answer your actual questions:
Can Fibers help me with this at all?
No they can't. Jörg W Mittag explains why best.
No, you cannot do concurrency with Fibers. Fibers simply aren't a concurrency construct, they are a control-flow construct, like Exceptions. That's the whole point of Fibers: they never run in parallel, they are cooperative and they are deterministic. Fibers are coroutines. (In fact, I never understood why they aren't simply called Coroutines.)
The only concurrency construct in Ruby is Thread.
When he says that the only concurrency construct in Ruby is Thread, remember that there are many different implementations of Ruby and that they vary in their threading implementations. Jörg once again provides a great answer to these differences, and correctly concludes that only something like JRuby (which uses JVM threads mapped to native threads) or forking your process is how you can achieve true parallelism.
Any way I can solve this problem without threads?
Other than forking your process, I would also suggest that you look at EventMachine and something like em-http-request. It's an event-driven, non-blocking, reactor-pattern-based HTTP client that is asynchronous and does not incur the overhead of threads.
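A minimal em-http-request sketch of firing several requests concurrently might look like this (the URLs are placeholders):

require 'em-http-request'

urls = %w[http://example.com/a http://example.com/b]

EM.run do
  pending = urls.size
  urls.each do |url|
    request = EventMachine::HttpRequest.new(url).get
    request.callback do
      puts "#{url}: #{request.response_header.status}"
      EM.stop if (pending -= 1).zero? # stop once every request has finished
    end
    request.errback do
      puts "#{url}: request failed"
      EM.stop if (pending -= 1).zero?
    end
  end
end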
Aaron Patterson (@tenderlove) uses an example almost exactly like yours to describe exactly why you can and should use threads to achieve concurrency in your situation.
Most I/O libraries are now smart enough to release the GVL (Global VM Lock, or most people know it as the GIL or Global Interpreter Lock) when doing IO. There is a simple function call in C to do this. You don't need to worry about the C code, but for you this means that most IO libraries worth their salt are going to release the GVL and allow other threads to execute while the thread that is doing the IO waits for the data to return.
If what I just said was confusing, you don't need to worry about it too much. The main thing that you need to know is that if you are using a decent library to do your HTTP requests (or any other I/O operation for that matter... database, interprocess communication, whatever), the Ruby interpreter (MRI) is smart enough to be able to release the lock on the interpreter and allow other threads to execute while one thread awaits IO to return. If the next thread has its own IO to grab, the Ruby interpreter will do the same thing (assuming that the IO library is built to utilize this feature of Ruby, which I believe most are these days).
So, to sum up what I am saying, use threads! You should see the performance benefit. If not, check to see whether your http library is using the rb_thread_blocking_region() function in C and, if not, find out why not. Maybe there is a good reason, maybe you need to consider using a better library.
The link to the Aaron Patterson video is here: http://www.youtube.com/watch?v=kufXhNkm5WU
It is worth a watch, even if just for the laughs, as Aaron Patterson is one of the funniest people on the internet.
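As a small illustration of that point, something like this (placeholder URLs) should take roughly the time of the slowest request rather than the sum of all of them, because Net::HTTP releases the GVL while it waits on the socket:

require 'net/http'
require 'benchmark'

urls = %w[http://example.com http://example.org http://example.net]

elapsed = Benchmark.realtime do
  urls.map { |url|
    Thread.new { Net::HTTP.get(URI(url)) } # the GVL is released during the socket wait
  }.each(&:join)
end

puts "fetched #{urls.size} URLs in #{elapsed.round(2)}s"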
You could use separate processes for this instead of threads:
#!/usr/bin/env ruby

$stderr.sync = true

# Number of children to use for uploading
MAX_CHILDREN = 5

# Hash of PIDs for children that are working, along with which file
# they're working on.
@child_pids = {}

# Keep track of uploads that failed
@failed_files = []

# Get the list of files to upload as arguments to the program
@files = ARGV

### Wait for a child to finish, adding the file to the list of those
### that failed if the child indicates there was a problem.
def wait_for_child
  $stderr.puts "  waiting for a child to finish..."
  pid, status = Process.waitpid2( 0 )
  file = @child_pids.delete( pid )
  @failed_files << file unless status.success?
end

### Here's where you'd put the particulars of what gets uploaded and
### how. I'm just sleeping for the file size in bytes * milliseconds
### to simulate the upload, then returning either +true+ or +false+
### based on a random factor.
def upload( file )
  bytes = File.size( file )
  sleep( bytes * 0.00001 )
  return rand( 100 ) > 5
end

### Start a child uploading the specified +file+.
def start_child( file )
  if pid = Process.fork
    $stderr.puts "%s: upload started by child %d" % [ file, pid ]
    @child_pids[ pid ] = file
  else
    if upload( file )
      $stderr.puts "%s: done." % [ file ]
      exit 0 # success
    else
      $stderr.puts "%s: failed." % [ file ]
      exit 255
    end
  end
end

until @files.empty?
  # If there are already the maximum number of children running, wait
  # for one to finish
  wait_for_child() if @child_pids.length >= MAX_CHILDREN

  # Start a new child working on the next file
  start_child( @files.shift )
end

# Now we're just waiting on the final few uploads to finish
wait_for_child() until @child_pids.empty?

if @failed_files.empty?
  exit 0
else
  $stderr.puts "Some files failed to upload:",
    @failed_files.collect { |file| "  #{file}" }
  exit 255
end
