Easiest way to make sure a lot of threads finish in Ruby? - ruby

I'm creating a lot of threads:
(1..255).each do |n|
Thread.new do
sleep(10) # does a lot of work
end
end
# at this point I need to make sure all the threads completed
I would've hoped I could add each thread to a ThreadGroup and call a function like wait_until_all_threads_complete on that ThreadGroup. But I don't see anything obvious in the Ruby docs.
Do I have to add each thread to an array and then iterate over each one calling thread.join? There must be an easier way for such an extremely common use case.
threads = (1..255).map do |n|
Thread.new do
sleep(10) # does a lot of work
end
end
threads.each do |thread|
thread.join
end

Use ThreadGroup#list
If you assign threads to a ThreadGroup with ThreadGroup#add, you can map Thread#join or other Thread methods onto each member of the group as returned by the ThreadGroup#list method. For example:
thread_group = ThreadGroup.new
255.times do
thread_group.add Thread.new { sleep 10 }
end
thread_group.list.each(&:join)
This will only join threads belonging to thread_group, rather than to ThreadGroup::Default.
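For the plain array approach from the question, a minimal sketch may help. It assumes each thread's block returns something worth collecting; the squaring work and the short sleep are stand-ins, not part of the original code. Thread#value joins the thread and then returns the block's result, so waiting and collecting happen in one step.

```ruby
# Start a handful of threads; the last expression of each block
# becomes that thread's value.
threads = (1..5).map do |n|
  Thread.new do
    sleep(0.01) # stand-in for real work (assumption)
    n * n
  end
end

# Thread#value waits for the thread to finish (like join) and
# returns the result of its block.
squares = threads.map(&:value)
```

If the results are not needed, threads.each(&:join) is enough to wait for all of them.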

Related

How to call Object Methods while object is in a thread? Ruby (not rails)

If I put an object inside a thread, how do I call methods on that object? Below is an example of what I want to do, along with my current error.
undefined method `_method_name' for ["var", #Thread:0x00007f9b181edec0#b.rb:183
threads = {}
freq.each do |var|
threads[var] = Thread.new {object.new.method}
end
while true
threads.each do |thr|
thr.method_inside_the_object
end
end
When you write this:
Thread.new { ... }
The { ... } is a block that will be executed by the new thread. If you want the thread to do something interesting,
then you have to provide code (the ...) to do the interesting thing.
Typically, the original thread goes on to do something else concurrently with the new thread:
Thread.new { ... }
do_something_else()
Doing things concurrently (maybe even, in parallel) is the whole point of multi-threading after all.
Threads communicate by accessing shared objects... but it doesn't make any sense for one thread to look at a shared object until it knows that the other thread has finished updating it. The simplest way to know that is to join() the other thread.
t = Thread.new { ... }
do_something_else()
t.join() # this simply waits until the thread has finished.
Now about those shared objects. It's especially easy in Ruby.
shared_object = Hash.new()
t = Thread.new {
shared_object["a"] = ...
shared_object["b"] = ...
...
}
do_something_else()
t.join()
# after the join() call returns, it's safe to look in shared_object
# to see what the other thread left for us.
do_something_with(shared_object["a"])
...
There's a whole other issue that arises if you need the main thread to access shared_object concurrently with the new thread (i.e., before it calls t.join()). Google for "race condition", or "locking", or "mutual exclusion", or "mutex" for more information about why that's tricky, and how to do it safely.
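As a rough illustration of the locking idea mentioned above, here is a sketch that guards a shared hash with a Mutex so the main thread can read it while workers are still writing. The names shared_counts, lock, and :hits are made up for the example.

```ruby
shared_counts = Hash.new(0) # hypothetical shared object
lock = Mutex.new            # guards every access to shared_counts

workers = 4.times.map do
  Thread.new do
    1000.times do
      # Only one thread at a time may run the synchronized block.
      lock.synchronize { shared_counts[:hits] += 1 }
    end
  end
end

# The main thread can read safely before join, as long as it
# takes the same lock the writers use.
snapshot = lock.synchronize { shared_counts[:hits] }

workers.each(&:join)
total = lock.synchronize { shared_counts[:hits] }
```

The snapshot value depends on scheduling, so it can be anything from 0 to the final total; only the value read after the joins is deterministic.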

How to make a ruby thread execute a function of my choosing?

Is it possible to create a "worker thread" so to speak that is on standby until it receives a function to execute asynchronously?
Is there a way to send a function like
def some_function
puts "hi"
# write something
db.exec()
end
to an existing thread that's just sitting there waiting?
The idea is I'd like to pawn off some database writes to a thread which runs asynchronously.
I thought about creating a Queue instance, then have a thread do something like this:
$command = Queue.new
Thread.new do
while trigger = $command.pop
some_method
end
end
$command.push("go!")
However this does not seem like a particularly good way to go about it. What is a better alternative?
The thread gem looks like it would suit your needs:
require 'thread/channel'
def some_method
puts "hi"
end
channel = Thread.channel
Thread.new do
while data = channel.receive
some_method
end
end
channel.send("go!")
channel.send("ruby!") # Any truthy message will do
channel.send(nil) # Non-truthy message to terminate other thread
sleep(1) # Give other thread time to do I/O
The channel uses ConditionVariable, which you could use yourself if you prefer.
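If you would rather stay with the standard library, the Queue idea from the question can also carry the work itself. The sketch below pushes callable objects (lambdas) onto a Queue and uses nil as a stop signal; the results queue is an assumption added only so the outcome can be inspected.

```ruby
jobs = Queue.new    # thread-safe queue from core Ruby
results = Queue.new # hypothetical: collects return values for inspection

worker = Thread.new do
  # Queue#pop blocks until a job arrives; a nil job ends the loop.
  while (job = jobs.pop)
    results << job.call
  end
end

jobs << -> { "hi" }  # any object that responds to call will do
jobs << -> { 1 + 2 }
jobs << nil          # stop signal
worker.join          # wait for the worker to drain the queue

collected = []
collected << results.pop until results.empty?
```

Joining the worker after pushing the stop signal avoids the sleep(1) at the end of the channel example, because the main thread waits exactly as long as the worker needs.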

RSpec: Testing with Threads

In RSpec, I have a function that creates a new thread and, inside that thread, performs some action (in my case, it calls TCPSocket#readline). Here's the function as it is right now:
def read
Thread.new do
while line = @socket.readline
#TODO: stuff
end
end
end
Due to thread scheduling, my test will fail if written as such:
it "reads from socket" do
subject.socket.should_receive(:readline)
subject.read
end
Currently the only way I know to hack around this is to use sleep 0.1. Is there a way to properly delay the test until that thread is running?
If your goal is to assert the system state is changed by the execution of your second thread, you should join on the second thread in your main test thread:
it "reads from socket" do
subject.socket.should_receive(:readline)
socket_thread = subject.read
socket_thread.join
end
This is a bit of a hack, but here's a before block that stubs Thread.new so that the thread's block is run inline on the calling thread, while the returned stand-in object still responds to join:
before do
allow(Thread).to receive(:new).and_yield.and_return(Class.new { def join; end }.new)
end

Can EventMachine recognize all threads are completed?

I'm an EM newbie and am writing two programs to compare synchronous and asynchronous IO. I'm using Ruby 1.8.7.
The example for sync IO is:
def pause_then_print(str)
sleep 2
puts str
end
5.times { |i| pause_then_print(i) }
puts "Done"
This works as expected, taking 10+ seconds until termination.
On the other hand, the example for async IO is:
require 'rubygems'
require 'eventmachine'
def pause_then_print(str)
Thread.new do
EM.run do
sleep 2
puts str
end
end
end
EventMachine.run do
EM.add_timer(2.5) do
puts "Done"
EM.stop_event_loop
end
EM.defer(proc do
5.times { |i| pause_then_print(i) }
end)
end
5 numbers are shown in 2.x seconds.
Right now I explicitly tell the EM event loop to stop after 2.5 seconds. But what I want is for the program to terminate right after printing the 5 numbers. To do that, I think EventMachine should recognize that all 5 threads are done, and then stop the event loop.
How can I do that? Also, please correct the async IO example if it can be more natural and expressive.
Thanks in advance.
A few things about your async code. EM.defer schedules the code to execute on a thread. You're then creating more threads. There isn't much point in doing that when you could just use EM.defer in your creation loop. This has the added benefit that EM will service the work from its internal thread pool, which should be a bit faster as there is no thread creation overhead. (Just note, the EM thread pool has, I believe, 20 threads in it, so you want to stay below that number.) Something like the following should work (although I haven't tested it):
require 'rubygems'
require 'eventmachine'
def pause_then_print(str)
sleep 2
puts str
end
EventMachine.run do
EM.add_timer(2.5) do
puts "Done"
EM.stop_event_loop
end
5.times do |i|
EM.defer { pause_then_print(i) }
end
end
In terms of detecting when the work is done, you can have EM.defer execute a callback when its operation is complete. So, you could have a little bit of code in there that adds the callback when i == 4, or something similar. See the EM docs for EM.defer for how to add the callback.

conditional variable in main thread

In my main thread, I am trying to wait for two resources from two separate threads. The way I implemented is as below:
require 'thread'
def new_thread
Thread.current[:ready_to_go] = false
puts "in thread: new thread"
sleep(5)
puts "in thread: sleep finished"
Thread.current[:ready_to_go] = true
sleep(2)
puts "in thread: back to thread again!"
end
thread1 = Thread.new do
new_thread
end
thread2 = Thread.new do
new_thread
end
# the main thread wait for ready_to_go to start
while (!(thread1[:ready_to_go] && thread2[:ready_to_go]))
sleep(0.5)
end
puts "back to main!"
sleep(8)
puts "main sleep over!"
thread1.join
thread2.join
Is there a better way to implement this? I tried to use condition variables: the two threads signal the condition variables and the main thread waits for them. But the wait method requires a mutex in my main thread, which I am trying to avoid.
I'm not familiar with Ruby, but the first Google result for "Ruby wait for thread" says:
you can wait for a particular thread to finish by calling that thread's Thread#join method. The calling thread will block until the given thread is finished. By calling join on each of the requestor threads, you can make sure that all three requests have completed before you terminate the main program.
It's generally best to use synchronization methods to wait for something to complete, rather than looping until a particular state is reached.
One easy way would be to use a Queue (let's call it myqueue). It's thread-safe and located in the Thread module.
Instead of
Thread.current[:ready_to_go] = true
... do
myqueue.push :ready_to_go
And then your main thread would be:
junk = myqueue.pop # wait for thread one to push
junk = myqueue.pop # wait for thread two to push
# go on with your work
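Put together, the Queue approach could look roughly like the sketch below. The short sleeps stand in for the work in the question, and the worker ids pushed onto the queue are an assumption added so the main thread can tell who signalled.

```ruby
ready = Queue.new # the "myqueue" from the answer

workers = 2.times.map do |id|
  Thread.new do
    sleep(0.05)    # stand-in for preparing the resource
    ready.push(id) # tell the main thread this resource is ready
    sleep(0.05)    # the thread keeps working after signalling
  end
end

# Queue#pop blocks, so this line returns only after both
# threads have pushed. No polling loop and no explicit mutex.
ready_ids = 2.times.map { ready.pop }

# ... the main thread goes on with its own work here ...

workers.each(&:join) # finally wait for the threads to finish
```

Compared with polling thread-local flags, the main thread sleeps inside pop and wakes as soon as a signal arrives.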
