I have a method a that is invoked repeatedly at random times. Each call triggers method b, which runs in its own thread and completes after some random time. I want to ensure that a subsequent execution of a waits until the b triggered by the current execution of a has completed. In other words, a and b are to be executed alternately. I tried to do this using a mutex and a condition variable as follows:
def a
Thread.new do
$mutex.synchronize do
puts "a"
b
$cv.wait($mutex)
end
end
end
def b
Thread.new do
sleep(rand)
$mutex.synchronize do
puts "b"
$cv.signal
end
end
end
$mutex, $cv = Mutex.new, ConditionVariable.new
loop{a; sleep(rand)}
In this code, $mutex.synchronize do ... end in method a ensures that $cv.signal (also within another $mutex.synchronize do ... end) in method b is not invoked until $cv.wait($mutex) has put $cv into listening mode for signals. This much is given in the documentation.
Another function I intended to assign to $mutex.synchronize do ... end in method a is to prevent consecutive executions of method a. My reasoning is that $cv.wait($mutex) in method a should keep $mutex from being released until $cv.signal in method b is invoked, by which time b should be finished.
I expected a and b to be executed alternately, thereby printing "a" and "b" alternately. But in reality they are not; either "a" or "b" can be printed several times in a row.
After that, I thought that my reasoning above might be wrong, in the sense that $mutex is released once $cv.wait($mutex) has been called, even while $cv (or $mutex) is in waiting mode. So I added a dummy statement to a, changing it to:
def a
Thread.new do
$mutex.synchronize do
puts "a"
b
$cv.wait($mutex)
nil # Dummy process intended to keep `$mutex` locked until `$cv` is released
end
end
end
but that had no effect.
How can this be fixed? Or, what am I getting wrong?
I don't have a solution for you, but isn't the reason a is being called more often than you expect that wait releases the lock on the mutex? Otherwise signal could never be called. This seems to happen "as expected" the first time, but after that you end up with several a threads queued up, itching to enter the synchronize block, and they sneak in before a b thread wakes up and locks the mutex again.
If you poor-man's instrument your code at every turn, you can see it happen:
def a
puts("a before thread #{Thread.current}")
Thread.new do
puts(" a synch0 #{Thread.current}")
$mutex.synchronize do
puts(" a before b #{Thread.current}")
b
puts(" a after b, before wait #{Thread.current}")
$cv.wait($mutex)
puts(" a after wait #{Thread.current}")
end
puts(" a synch1 #{Thread.current}")
end
puts("a after thread #{Thread.current}")
end
def b
puts("b before thread #{Thread.current}")
Thread.new do
puts(" b before sleep #{Thread.current}")
sleep(rand)
puts(" b after sleep, synch0 #{Thread.current}")
$mutex.synchronize do
puts(" b before signal #{Thread.current}")
$cv.signal
puts(" b after signal #{Thread.current}")
end
puts(" b synch1 #{Thread.current}")
end
puts("b after thread #{Thread.current}")
end
$mutex, $cv = Mutex.new, ConditionVariable.new
loop{a; sleep(rand)}
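Since wait releases the mutex, the mutex alone cannot keep a second a out. One common way to get alternation is to guard the wait with an explicit flag that is re-checked after every wakeup. The following is only a minimal sketch under that assumption: the Alternator class, the @b_pending flag and the @log array are hypothetical names that are not part of the question, and the log replaces puts so the order can be inspected.

```ruby
require 'thread'

# Hypothetical helper class (not from the question): runs a and b strictly alternately.
class Alternator
  attr_reader :log

  def initialize
    @mutex = Mutex.new
    @cv = ConditionVariable.new
    @b_pending = false   # true from the moment a runs until its b has finished
    @log = []            # records the order of execution instead of printing
  end

  # Waits until the previous b has finished, then records "a" and starts b.
  # Returns the a thread; its value is the b thread it started.
  def a
    Thread.new do
      @mutex.synchronize do
        # wait releases the mutex, so re-check the flag after every wakeup
        @cv.wait(@mutex) while @b_pending
        @b_pending = true
        @log << "a"
      end
      b
    end
  end

  def b
    Thread.new do
      sleep(rand * 0.01)
      @mutex.synchronize do
        @log << "b"
        @b_pending = false
        @cv.signal         # wake one waiting a, if any
      end
    end
  end
end

alternator = Alternator.new
a_threads = Array.new(5) { alternator.a }
a_threads.each { |t| t.value.join }   # t.value is the b thread started by that a
puts alternator.log.join(" ")         # prints: a b a b a b a b a b
```

The flag, not the mutex, is what carries the "b is still running" state across the period in which the mutex is free.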
I know it sounds strange, but it will be easier to use a queue to block those threads:
def a
Thread.new do
$queue.pop
puts "a"
b
end
end
def b
Thread.new do
sleep(rand)
puts "b"
$queue << true
end
end
$queue = Queue.new
$queue << true
loop{a; sleep(rand)}
Say I have a class Talker. I'm using a queue to make the Talker talk, but I occasionally want to mute the talker, and when I unmute the talker, I want the talker to pick up where he left off. How do I stop the threads from taking messages from the queue and make them wait until I unmute the talker to resume?
class Talker
  def initialize
    @queue = Queue.new
    @threads = Array.new(1) do
      Thread.new do
        until @queue.empty?
          # what logic should go here to check if mute
          # and stop taking messages?
          next_msg = @queue.shift
          puts next_msg
        end
      end
    end
  end

  def speak(msg)
    @queue.push(msg)
  end

  # stop threads from taking messages from queue
  def mute
    # what goes here?
  end

  # allow threads to continuing taking messages from queue
  def unmute
    # what goes here?
  end
end
Though Ruby is definitely not the best choice for handling async operations, one may still make use of Thread::Mutex:
@handler = Class.new do
  @locks = {}

  def mute(id, mutex)
    @locks[id] ||= mutex.lock
  end

  def unmute(id)
    @locks[id].unlock if @locks[id].is_a?(Thread::Mutex)
    @locks.delete(id)
  end
end

Thread.new do
  MX = Thread::Mutex.new
  until @queue.empty?
    MX.synchronize do
      next_msg = @queue.shift
      puts next_msg
    end
  end
end

# stop threads from taking messages from queue
def mute
  @handler.mute(self, MX)
end

# allow threads to continuing taking messages from queue
def unmute
  @handler.unmute(self)
end
The code is untested, but I believe it should work.
Rather than having a mutex per thread, you could have a simple flag protected by a mutex:
class Talker
  def initialize
    @muted = false
    @muted_mutex = Thread::Mutex.new
    @queue = Queue.new
    @threads = Array.new(1) do
      Thread.new do
        until @queue.empty?
          next if @muted # skip this iteration
          puts @queue.shift
        end
      end
    end
  end

  def mute
    @muted_mutex.synchronize { @muted = true }
  end

  def unmute
    @muted_mutex.synchronize { @muted = false }
  end
end
The difference between this and having a mutex per thread is that this will only block if multiple threads (elsewhere) try to mute/unmute simultaneously. However, there might also be a slight delay between muting and the threads actually stopping, since there's a race between setting @muted = true and the thread reading it.
It's probably not considered good practice, but if I were you I would even ditch the mutex. For a Boolean flag it makes no difference if there are multiple writes occurring simultaneously.
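If the worker should sleep while muted rather than spin through the loop, the flag can be paired with a condition variable. The sketch below is one possible arrangement and not the code from either answer: PausableTalker, its sink parameter and the @unmuted variable are hypothetical names, and the sink defaults to $stdout but can be any object that responds to <<. A message that was already taken from the queue when mute is called is held and delivered after unmute, not dropped.

```ruby
require 'thread'

# Hypothetical class: a talker whose worker thread sleeps while muted.
class PausableTalker
  def initialize(sink = $stdout)
    @sink = sink
    @queue = Queue.new
    @muted = false
    @mutex = Mutex.new
    @unmuted = ConditionVariable.new
    @thread = Thread.new do
      loop do
        msg = @queue.pop                       # blocks while the queue is empty
        @mutex.synchronize do
          @unmuted.wait(@mutex) while @muted   # sleep here until unmute is called
          @sink << "#{msg}\n"
        end
      end
    end
  end

  def speak(msg)
    @queue << msg
  end

  def mute
    @mutex.synchronize { @muted = true }
  end

  def unmute
    @mutex.synchronize do
      @muted = false
      @unmuted.broadcast
    end
  end
end

out = []                          # an Array stands in for $stdout here
talker = PausableTalker.new(out)
talker.mute
talker.speak("hello")
sleep 0.1
held = out.dup                    # snapshot taken while muted
talker.unmute
sleep 0.1
```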
I am writing a Ruby application (Ruby v2.1.3p242 in Linux x86_64) that will repeatedly process online data and store the results in a database. To speed things up, I have multiple threads running concurrently and I have been working on a way to cleanly stop all the threads both on command and when an exception is raised from a thread.
The issue is that some threads will continue to run multiple iterations of #do_stuff after Server.stop is called. They do eventually stop, but I will see a couple of threads running 10-50 times after the rest have stopped.
Each thread's mutex is locked before each iteration and unlocked afterwards. The code @mutex.synchronize { kill } is run on each thread when Server.stop is called. This should kill the thread immediately after its next iteration, but this does not seem to be the case.
EDIT:
The code works as-is, so feel free to test it if you like. In my tests, it takes between 30 seconds and several minutes for all of the threads to stop after calling Server.stop. Note that each iteration takes between 1-3 seconds. I used the following to test the code (using ruby -I. while in the same directory):
require 'benchmark'
require 'server'
s = Server.new
s.start
puts Benchmark.measure { s.stop }
Here is the code:
server.rb:
require 'server/fetcher_thread'
class Server
  THREADS = 8

  attr_reader :threads

  def initialize
    @threads = []
  end

  def start
    create_threads
  end

  def stop
    @threads.map {|t| Thread.new { t.stop } }.each(&:join)
    @threads = []
  end

  private

  def create_threads
    THREADS.times do |i|
      @threads << FetcherThread.new(number: i + 1)
    end
  end
end
server/fetcher_thread.rb:
class Server
  class FetcherThread < Thread
    attr_reader :mutex

    def initialize(opts = {})
      @mutex = Mutex.new
      @number = opts[:number] || 0
      super do
        loop do
          @mutex.synchronize { do_stuff }
        end
      end
    end

    def stop
      @mutex.synchronize { kill }
    end

    private

    def do_stuff
      debug "Sleeping for #{time_to_sleep = rand * 2 + 1} seconds"
      sleep time_to_sleep
    end

    def debug(message)
      $stderr.print "Thread ##{@number}: #{message}\n"
    end
  end
end
There's no guarantee that the thread calling stop will acquire the mutex before the next iteration of the loop. It's totally up to the Ruby and operating system schedulers, and some OSes (including Linux) don't implement a FIFO scheduling algorithm, but take other factors into account to try to optimize performance.
You can make this more predictable by avoiding kill and using a variable to exit the loop cleanly. Then you only need to wrap the mutex around the code that accesses the variable:
class Server
  class FetcherThread < Thread
    attr_reader :mutex

    def initialize(opts = {})
      @mutex = Mutex.new
      @number = opts[:number] || 0
      super do
        until stopped?
          do_stuff
        end
      end
    end

    def stop
      mutex.synchronize { @stop = true }
    end

    def stopped?
      mutex.synchronize { @stop }
    end
    #...
  end
end
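To try the stop-flag idea in isolation, here is a small self-contained sketch. It is not the Server code above: StoppableWorker and its iterations counter are hypothetical, the worker wraps a Thread instead of subclassing it, and a short sleep stands in for do_stuff. The point it illustrates is that stop only has to set the flag and join; the thread leaves the loop by itself at the next check.

```ruby
# Hypothetical worker that is composed around a Thread rather than subclassing it.
class StoppableWorker
  attr_reader :iterations

  def initialize
    @mutex = Mutex.new
    @stop = false
    @iterations = 0
    @thread = Thread.new do
      until stopped?
        sleep 0.01          # stand-in for do_stuff
        @iterations += 1
      end
    end
  end

  # Sets the flag, then waits for the current iteration to finish.
  def stop
    @mutex.synchronize { @stop = true }
    @thread.join
  end

  def stopped?
    @mutex.synchronize { @stop }
  end

  def alive?
    @thread.alive?
  end
end

worker = StoppableWorker.new
sleep 0.05
worker.stop
puts "stopped after #{worker.iterations} iterations"
```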
I perform some important calculations in an endless loop and don't want a calculation to be interrupted by the SIGINT signal (e.g. Ctrl-C). So I placed the loop in a thread and protected the important calculation with a mutex:
mutex = Mutex.new
trap('INT') do
Thread.new do
puts 'Terminating..'
exit(0)
end.join
end
Thread.new do
loop do
mutex.synchronize do
puts 'Some important computation is started.'
sleep(5)
puts 'Some important computation is done.'
end
sleep(30)
end
end.join
I added another thread inside the trap block, expecting this thread to be executed only once the mutex is unlocked.
But in fact, this second thread starts immediately after receiving the SIGINT signal:
Some important computation is started.
^CTerminating..
What am I missing or doing wrong?
You must synchronize the trap thread with the computation:
trap('INT') do
Thread.new do
mutex.synchronize do
puts 'Terminating..'
exit(0)
end
end.join
end
But perhaps it is easier if you set a boolean var in your trap function, and you use it to break the loop.
mustexit = false
trap('INT') do
  mustexit = true
end
Thread.new do
  loop do
    puts 'Some important computation is started.'
    sleep(5)
    puts 'Some important computation is done.'
    # leave the loop only between computations, never in the middle of one
    break if mustexit
    sleep(30)
  end
end.join
There aren't many resources on condition variables in Ruby, and most of them are wrong. The ruby-doc example, the tutorial here and the post here all suffer from a possible deadlock.
We could solve the problem by starting threads in given order and maybe putting some sleep in between to enforce synchronization. But that's just postponing the real problem.
I rewrote the code into a classical producer-consumer problem:
require 'thread'
queue = []
mutex = Mutex.new
resource = ConditionVariable.new
threads = []
threads << Thread.new do
5.times do |i|
mutex.synchronize do
resource.wait(mutex)
value = queue.pop
print "consumed #{value}\n"
end
end
end
threads << Thread.new do
5.times do |i|
mutex.synchronize do
queue << i
print "#{i} produced\n"
resource.signal
end
sleep(1) #simulate expense
end
end
threads.each(&:join)
Sometimes you will get this (but not always):
0 produced
1 produced
consumed 0
2 produced
consumed 1
3 produced
consumed 2
4 produced
consumed 3
producer-consumer.rb:30:in `join': deadlock detected (fatal)
from producer-consumer.rb:30:in `each'
from producer-consumer.rb:30:in `<main>'
What is the correct solution?
The problem is that, as you commented earlier, this approach only works if you can guarantee that the consumer thread gets to grab the mutex first at the start of our program. When this is not the case, a deadlock will occur as the first resource.signal of your producer thread will be sent at a time that the consumer thread is not yet waiting for the resource. As a result this first resource.signal will essentially not do anything, so you end up with a scenario where you call resource.signal 4 times (as the first one gets lost), whereas resource.wait is called 5 times. This means the consumer will be stuck waiting forever, and a deadlock occurs.
Luckily we can solve this by only allowing the consumer thread to start waiting if no more immediate work is available.
require 'thread'
queue = []
mutex = Mutex.new
resource = ConditionVariable.new
threads = []
threads << Thread.new do
5.times do |i|
mutex.synchronize do
if queue.empty?
resource.wait(mutex)
end
value = queue.pop
print "consumed #{value}\n"
end
end
end
threads << Thread.new do
5.times do |i|
mutex.synchronize do
queue << i
print "#{i} produced\n"
resource.signal
end
sleep(1) #simulate expense
end
end
threads.each(&:join)
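A slightly more defensive variant of the same idea re-checks the predicate in a loop instead of a single if, so that a wakeup arriving while the queue is still empty simply waits again. This is a sketch under assumptions: the consumed array is a hypothetical addition that collects the results, shift is used instead of pop so that items leave in FIFO order, and the sleep is dropped to keep the example fast.

```ruby
require 'thread'

queue = []
consumed = []        # collected here so the order can be inspected afterwards
mutex = Mutex.new
resource = ConditionVariable.new

consumer = Thread.new do
  5.times do
    mutex.synchronize do
      # loop, not if: the predicate is re-checked after every wakeup
      resource.wait(mutex) while queue.empty?
      consumed << queue.shift   # shift takes the oldest item (FIFO)
    end
  end
end

producer = Thread.new do
  5.times do |i|
    mutex.synchronize do
      queue << i
      resource.signal
    end
  end
end

[consumer, producer].each(&:join)
print "consumed #{consumed.inspect}\n"   # prints: consumed [0, 1, 2, 3, 4]
```

Because the consumer only waits while the queue is empty, it does not matter which thread acquires the mutex first.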
This is a more robust solution with multiple consumers and producers that uses MonitorMixin. MonitorMixin has a special ConditionVariable with wait_while() and wait_until() methods:
require 'monitor'
queue = []
queue.extend(MonitorMixin)
cond = queue.new_cond
consumers, producers = [], []
for i in 0..5
consumers << Thread.start(i) do |i|
print "consumer start #{i}\n"
while (producers.any?(&:alive?) || !queue.empty?)
queue.synchronize do
cond.wait_while { queue.empty? }
print "consumer #{i}: #{queue.shift}\n"
end
sleep(0.2) #simulate expense
end
end
end
for i in 0..3
producers << Thread.start(i) do |i|
id = (65+i).chr
for j in 0..10 do
queue.synchronize do
item = "#{j} #{id}"
queue << item
print "producer #{id}: produced #{item}\n"
j += 1
cond.broadcast
end
sleep(0.1) #simulate expense
end
end
end
sleep 0.1 while producers.any?(&:alive?)
sleep 0.1 while consumers.any?(&:alive?)
print "queue size #{queue.size}\n"
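The termination condition is the delicate part of a multi-consumer setup: a consumer that starts waiting after the last producer has finished would never be woken. One way to handle this, sketched here with a hypothetical done flag and consumed array that are not part of the answer above, is to include the flag in the wait_until predicate and broadcast once all producers have been joined.

```ruby
require 'monitor'

queue = []
queue.extend(MonitorMixin)
cond = queue.new_cond
done = false      # hypothetical flag, set once every producer has finished
consumed = []

consumers = Array.new(3) do
  Thread.new do
    loop do
      item = queue.synchronize do
        # sleep until there is an item, or until no more items can arrive
        cond.wait_until { !queue.empty? || done }
        taken = queue.shift          # nil only when the queue is empty and done is set
        consumed << taken if taken
        taken
      end
      break if item.nil?
    end
  end
end

producers = Array.new(2) do |p|
  Thread.new do
    5.times do |j|
      queue.synchronize do
        queue << "#{p}-#{j}"
        cond.signal
      end
    end
  end
end

producers.each(&:join)
queue.synchronize do
  done = true
  cond.broadcast   # wake every consumer so each can see the flag and exit
end
consumers.each(&:join)
print "consumed #{consumed.size} items, queue size #{queue.size}\n"
```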
Based on a forum thread I came up with a working solution. It enforces alternation between threads, which is not ideal. What if we want multiple consumer and producer threads?
queue = []
mutex = Mutex.new
threads = []
next_run = :producer
cond_consumer = ConditionVariable.new
cond_producer = ConditionVariable.new
threads << Thread.new do
5.times do |i|
mutex.synchronize do
until next_run == :consumer
cond_consumer.wait(mutex)
end
value = queue.pop
print "consumed #{value}\n"
next_run = :producer
cond_producer.signal
end
end
end
threads << Thread.new do
5.times do |i|
mutex.synchronize do
until next_run == :producer
cond_producer.wait(mutex)
end
queue << i
print "#{i} produced\n"
next_run = :consumer
cond_consumer.signal
end
end
end
threads.each(&:join)
You can simplify your problem:
require 'thread'
queue = Queue.new
consumer = Thread.new { queue.pop }
consumer.join
Because your main thread is waiting for the consumer thread to exit, but the consumer thread is sleeping (due to queue.pop) this results in:
producer-consumer.rb:4:in `join': deadlock detected (fatal)
from producer-consumer.rb:4:in `<main>'
So you have to wait for the threads to finish without calling join:
require 'thread'
queue = Queue.new
threads = []
threads << Thread.new do
5.times do |i|
value = queue.pop
puts "consumed #{value}"
end
end
threads << Thread.new do
5.times do |i|
queue << i
puts "#{i} produced"
sleep(1) # simulate expense
end
end
# wait for the threads to finish
sleep(1) while threads.any?(&:alive?)
I couldn't find a decent ThreadPool implementation for Ruby, so I wrote mine (based partly on code from here: http://web.archive.org/web/20081204101031/http://snippets.dzone.com:80/posts/show/3276, but changed to wait/signal and a different implementation of ThreadPool shutdown). However, after running for some time (with 100 threads and handling about 1300 tasks), it dies with a deadlock on line 25, where it waits for a new job. Any ideas why this might happen?
require 'thread'
begin
require 'fastthread'
rescue LoadError
$stderr.puts "Using the ruby-core thread implementation"
end
class ThreadPool
  class Worker
    def initialize(callback)
      @mutex = Mutex.new
      @cv = ConditionVariable.new
      @callback = callback
      @mutex.synchronize {@running = true}
      @thread = Thread.new do
        while @mutex.synchronize {@running}
          block = get_block
          if block
            block.call
            reset_block
            # Signal the ThreadPool that this worker is ready for another job
            @callback.signal
          else
            # Wait for a new job
            @mutex.synchronize {@cv.wait(@mutex)} # <=== Is this line 25?
          end
        end
      end
    end

    def name
      @thread.inspect
    end

    def get_block
      @mutex.synchronize {@block}
    end

    def set_block(block)
      @mutex.synchronize do
        raise RuntimeError, "Thread already busy." if @block
        @block = block
        # Signal the thread in this class, that there's a job to be done
        @cv.signal
      end
    end

    def reset_block
      @mutex.synchronize {@block = nil}
    end

    def busy?
      @mutex.synchronize {!@block.nil?}
    end

    def stop
      @mutex.synchronize {@running = false}
      # Signal the thread not to wait for a new job
      @cv.signal
      @thread.join
    end
  end

  attr_accessor :max_size

  def initialize(max_size = 10)
    @max_size = max_size
    @workers = []
    @mutex = Mutex.new
    @cv = ConditionVariable.new
  end

  def size
    @mutex.synchronize {@workers.size}
  end

  def busy?
    @mutex.synchronize {@workers.any? {|w| w.busy?}}
  end

  def shutdown
    @mutex.synchronize {@workers.each {|w| w.stop}}
  end
  alias :join :shutdown

  def process(block=nil,&blk)
    block = blk if block_given?
    while true
      @mutex.synchronize do
        worker = get_worker
        if worker
          return worker.set_block(block)
        else
          # Wait for a free worker
          @cv.wait(@mutex)
        end
      end
    end
  end

  # Used by workers to report ready status
  def signal
    @cv.signal
  end

  private

  def get_worker
    free_worker || create_worker
  end

  def free_worker
    @workers.each {|w| return w unless w.busy?}; nil
  end

  def create_worker
    return nil if @workers.size >= @max_size
    worker = Worker.new(self)
    @workers << worker
    worker
  end
end
Ok, so the main problem with the implementation is: how to make sure no signal is lost and avoid deadlocks?
In my experience, this is REALLY hard to achieve with condition variables and mutexes, but easy with semaphores. It so happens that Ruby implements a class called Queue (and SizedQueue) that should solve the problem. Here is my suggested implementation:
require 'thread'
begin
require 'fastthread'
rescue LoadError
$stderr.puts "Using the ruby-core thread implementation"
end
class ThreadPool
  class Worker
    def initialize(thread_queue)
      @mutex = Mutex.new
      @cv = ConditionVariable.new
      @queue = thread_queue
      @running = true
      @thread = Thread.new do
        @mutex.synchronize do
          while @running
            @cv.wait(@mutex)
            block = get_block
            if block
              @mutex.unlock
              block.call
              @mutex.lock
              reset_block
            end
            @queue << self
          end
        end
      end
    end

    def name
      @thread.inspect
    end

    def get_block
      @block
    end

    def set_block(block)
      @mutex.synchronize do
        raise RuntimeError, "Thread already busy." if @block
        @block = block
        # Signal the thread in this class, that there's a job to be done
        @cv.signal
      end
    end

    def reset_block
      @block = nil
    end

    def busy?
      @mutex.synchronize { !@block.nil? }
    end

    def stop
      @mutex.synchronize do
        @running = false
        @cv.signal
      end
      @thread.join
    end
  end

  attr_accessor :max_size

  def initialize(max_size = 10)
    @max_size = max_size
    @queue = Queue.new
    @workers = []
  end

  def size
    @workers.size
  end

  def busy?
    @queue.size < @workers.size
  end

  def shutdown
    @workers.each { |w| w.stop }
    @workers = []
  end
  alias :join :shutdown

  def process(block=nil,&blk)
    block = blk if block_given?
    worker = get_worker
    worker.set_block(block)
  end

  private

  def get_worker
    if !@queue.empty? or @workers.size == @max_size
      return @queue.pop
    else
      worker = Worker.new(@queue)
      @workers << worker
      worker
    end
  end
end
And here is a simple test code:
tp = ThreadPool.new 500
(1..1000).each { |i| tp.process { (2..10).inject(1) { |memo,val| sleep(0.1); memo*val }; print "Computation #{i} done. Nb of tasks: #{tp.size}\n" } }
tp.shutdown
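For comparison, a pool can also put the jobs themselves on a Queue and let every worker pop from it, which removes the per-worker condition variables entirely. The sketch below is an alternative arrangement and not the implementation above: JobQueuePool and the :stop sentinel are hypothetical names, and error handling inside jobs is left out.

```ruby
require 'thread'

# Hypothetical minimal pool: workers block on a shared job queue.
class JobQueuePool
  def initialize(size = 4)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # :stop is a sentinel chosen for this sketch; pop blocks while the queue is empty
        while (job = @jobs.pop) != :stop
          job.call
        end
      end
    end
  end

  def process(&block)
    @jobs << block
  end

  # Lets queued jobs finish, then stops one worker per sentinel.
  def shutdown
    @workers.size.times { @jobs << :stop }
    @workers.each(&:join)
  end
end

results = Queue.new
pool = JobQueuePool.new(3)
10.times { |i| pool.process { results << i * i } }
pool.shutdown

values = []
values << results.pop until results.empty?
print "#{values.size} jobs done\n"   # prints: 10 jobs done
```

Because Queue#pop blocks until an item is available, a job pushed before a worker starts waiting is still picked up, so there is no signal to lose.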
You can try the work_queue gem, designed to coordinate work between a producer and a pool of worker threads.
I'm slightly biased here, but I would suggest modelling this in some process language and model checking it. Freely available tools are, for example, the mCRL2 toolset (using an ACP-based language), the Mobility Workbench (pi-calculus) and Spin (PROMELA).
Otherwise I would suggest removing every bit of code that is not essential to the problem and finding a minimal case where the deadlock occurs. I doubt that the 100 threads and 1300 tasks are essential to get a deadlock. With a smaller case you can probably just add some debug prints which provide enough information to solve the problem.
Ok, the problem seems to be in your ThreadPool#signal method. What may happen is:
1 - All your workers are busy and you try to process a new job
2 - line 90 gets a nil worker
3 - a worker gets freed and signals it, but the signal is lost as the ThreadPool is not waiting for it
4 - you fall on line 95, waiting even though there is a free worker.
The error here is that you can signal a free worker even when nobody is listening. This ThreadPool#signal method should be:
def signal
  @mutex.synchronize { @cv.signal }
end
And the problem is the same in the Worker object. What might happen is:
1 - The Worker just completed a job
2 - It checks (line 17) if there is a job waiting: there isn't
3 - The thread pool sends a new job and signals it ... but the signal is lost
4 - The worker waits for a signal, even though it is marked as busy
You should put your initialize method as:
def initialize(callback)
  @mutex = Mutex.new
  @cv = ConditionVariable.new
  @callback = callback
  @mutex.synchronize {@running = true}
  @thread = Thread.new do
    @mutex.synchronize do
      while @running
        block = get_block
        if block
          @mutex.unlock
          block.call
          @mutex.lock
          reset_block
          # Signal the ThreadPool that this worker is ready for another job
          @callback.signal
        else
          # Wait for a new job
          @cv.wait(@mutex)
        end
      end
    end
  end
end
Next, the Worker#get_block and Worker#reset_block methods should not be synchronized anymore. That way, you cannot have a block assigned to a worker between the test for a block and the wait for a signal.
Top commenter's code has helped out so much over the years. Here it is updated for ruby 2.x and improved with thread identification. How is that an improvement? When each thread has an ID, you can compose ThreadPool with an array which stores arbitrary information. Some ideas:
No array: typical ThreadPool usage. Even with the GIL it makes threading dead easy to code and very useful for high-latency applications like high-volume web crawling,
ThreadPool and Array sized to number of CPUs: easy to fork processes to use all CPUs,
ThreadPool and Array sized to number of resources: e.g., each array element represents one processor across a pool of instances, so if you have 10 instances each with 4 CPUs, the TP can manage work across 40 subprocesses.
With these last two, rather than thinking about threads doing work think about the ThreadPool managing subprocesses that are doing the work. The management task is lightweight and when combined with subprocesses, who cares about the GIL.
With this class, you can code up a cluster based MapReduce in about a hundred lines of code! This code is beautifully short although it can be a bit of a mind-bend to fully grok. Hope it helps.
# Usage:
#
# Thread.abort_on_exception = true # help localize errors while debugging
# pool = ThreadPool.new(thread_pool_size)
# 50.times {|i|
# pool.process { ... }
# or
# pool.process {|id| ... } # worker identifies itself as id
# }
# pool.shutdown()
class ThreadPool
  require 'thread'

  class ThreadPoolWorker
    attr_accessor :id

    def initialize(thread_queue, id)
      @id = id # worker id is exposed thru tp.process {|id| ... }
      @mutex = Mutex.new
      @cv = ConditionVariable.new
      @idle_queue = thread_queue
      @running = true
      @block = nil
      @thread = Thread.new {
        @mutex.synchronize {
          while @running
            @cv.wait(@mutex) # block until there is work to do
            if @block
              @mutex.unlock
              begin
                @block.call(@id)
              ensure
                @mutex.lock
              end
              @block = nil
            end
            @idle_queue << self
          end
        }
      }
    end

    def set_block(block)
      @mutex.synchronize {
        raise RuntimeError, "Thread is busy." if @block
        @block = block
        @cv.signal # notify thread in this class, there is work to be done
      }
    end

    def busy?
      @mutex.synchronize { !@block.nil? }
    end

    def stop
      @mutex.synchronize {
        @running = false
        @cv.signal
      }
      @thread.join
    end

    def name
      @thread.inspect
    end
  end

  attr_accessor :max_size, :queue

  def initialize(max_size = 10)
    @process_mutex = Mutex.new
    @max_size = max_size
    @queue = Queue.new # of idle workers
    @workers = [] # array to hold workers
    # construct workers
    @max_size.times {|i| @workers << ThreadPoolWorker.new(@queue, i) }
    # queue up workers (workers in queue are idle and available to
    # work). queue blocks if no workers are available.
    @max_size.times {|i| @queue << @workers[i] }
    sleep 1 # important to give threads a chance to initialize
  end

  def size
    @workers.size
  end

  def idle
    @queue.size
  end

  # are any threads idle
  def busy?
    # @queue.size < @workers.size
    @queue.size == 0 && @workers.size == @max_size
  end

  # block until all threads finish
  def shutdown
    @workers.each {|w| w.stop }
    @workers = []
  end
  alias :join :shutdown

  def process(block = nil, &blk)
    @process_mutex.synchronize {
      block = blk if block_given?
      worker = @queue.pop # assign to next worker; block until one is ready
      worker.set_block(block) # give code block to worker and tell it to start
    }
  end
end