I have a thread in Ruby. It runs a loop. When that loop reaches a sleep(n) it halts and never wakes up. If I run the loop without sleep(n) it runs as an infinite loop.
What's going on in the code to stop the thread from running as expected?
How do I fix it?
class NewObject
  def initialize
    @a_local_var = 'somaText'
  end
  def my_funk(a_word)
    t = Thread.new(a_word) do |args|
      until false do
        puts a_word
        puts @a_local_var
        sleep 5 #This invokes the Fail
      end
    end
  end
end
if __FILE__ == $0
  s = NewObject.new()
  s.my_funk('theWord')
  d = gets
end
My platform is Windows XP SP3
The version of ruby I have installed is 1.8.6
You're missing a join.
class NewObject
  def initialize
    @a_local_var = 'somaText'
  end
  def my_funk(a_word)
    t = Thread.new(a_word) do |args|
      until false do
        puts a_word
        puts @a_local_var
        sleep 5
      end
    end
    t.join # allow this thread to finish before finishing main thread
  end
end
if __FILE__ == $0
  s = NewObject.new()
  s.my_funk('theWord')
  d = gets # now we never get here
end
I am writing a Ruby application (Ruby v2.1.3p242 on Linux x86_64) that will repeatedly process online data and store the results in a database. To speed things up, I have multiple threads running concurrently, and I have been working on a way to cleanly stop all the threads both on command and when an exception is raised from a thread.
The issue is that some threads will continue to run multiple iterations of #do_stuff after Server.stop is called. They do eventually stop, but I will see a couple of threads running 10-50 more times after the rest have stopped.
Each thread's mutex is locked before each iteration and unlocked afterwards. The code @mutex.synchronize { kill } is run on each thread when Server.stop is called. This should kill the thread immediately after its next iteration, but this does not seem to be the case.
EDIT:
The code works as-is, so feel free to test it if you like. In my tests, it takes between 30 seconds and several minutes for all of the threads to stop after calling Server.stop. Note that each iteration takes between 1 and 3 seconds. I used the following to test the code (using ruby -I. while in the same directory):
require 'benchmark'
require 'server'
s = Server.new
s.start
puts Benchmark.measure { s.stop }
Here is the code:
server.rb:
require 'server/fetcher_thread'
class Server
  THREADS = 8
  attr_reader :threads
  def initialize
    @threads = []
  end
  def start
    create_threads
  end
  def stop
    @threads.map {|t| Thread.new { t.stop } }.each(&:join)
    @threads = []
  end
  private
  def create_threads
    THREADS.times do |i|
      @threads << FetcherThread.new(number: i + 1)
    end
  end
end
server/fetcher_thread.rb:
class Server
  class FetcherThread < Thread
    attr_reader :mutex
    def initialize(opts = {})
      @mutex = Mutex.new
      @number = opts[:number] || 0
      super do
        loop do
          @mutex.synchronize { do_stuff }
        end
      end
    end
    def stop
      @mutex.synchronize { kill }
    end
    private
    def do_stuff
      debug "Sleeping for #{time_to_sleep = rand * 2 + 1} seconds"
      sleep time_to_sleep
    end
    def debug(message)
      $stderr.print "Thread ##{@number}: #{message}\n"
    end
  end
end
There's no guarantee that the thread calling stop will acquire the mutex before the next iteration of the loop. It's totally up to the Ruby and operating system schedulers, and some OSes (including Linux) don't implement a FIFO scheduling algorithm, but take other factors into account to try to optimize performance.
You can make this more predictable by avoiding kill and using a variable to exit the loop cleanly. Then you only need to wrap the mutex around the code that accesses the variable:
class Server
  class FetcherThread < Thread
    attr_reader :mutex
    def initialize(opts = {})
      @mutex = Mutex.new
      @number = opts[:number] || 0
      super do
        until stopped?
          do_stuff
        end
      end
    end
    def stop
      mutex.synchronize { @stop = true }
    end
    def stopped?
      mutex.synchronize { @stop }
    end
    #...
  end
end
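For readers who want to try the stop-flag idea in isolation, here is a minimal sketch. The class name StoppableWorker, the iterations counter and the interval parameter are made up for illustration and are not part of the asker's code; the counter only stands in for do_stuff.

```ruby
# Hypothetical worker illustrating the stop-flag pattern (not the asker's class).
class StoppableWorker
  attr_reader :iterations

  def initialize(interval = 0.01)
    @mutex = Mutex.new
    @stop = false
    @iterations = 0
    # The flag and mutex are set up before the thread starts using them.
    @thread = Thread.new do
      until stopped?
        @iterations += 1   # stand-in for do_stuff
        sleep interval
      end
    end
  end

  # Request a stop, then wait for the current iteration to finish.
  def stop
    @mutex.synchronize { @stop = true }
    @thread.join
  end

  def stopped?
    @mutex.synchronize { @stop }
  end

  def alive?
    @thread.alive?
  end
end

worker = StoppableWorker.new
sleep 0.05
worker.stop
puts "stopped after #{worker.iterations} iterations"
```

Because the mutex is held only while reading or writing the flag, stop never has to compete with a long-running iteration for the lock; it waits at most one iteration for the loop to notice.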
I am new to Ruby and trying to work with threads.
Let's say I have a method which I want to run every x seconds, as follows:
def say_hello
puts 'hello world'
end
I am trying to run it as follows
Thread.new do
while true do
say_hello
sleep(5)
end
end
But when I run the script, nothing is displayed on the console. What am I missing? Thanks!
The main thread is exiting before your thread can run. Use the join method to make the current thread wait for the say_hello thread to finish executing (though it never will).
t = Thread.new do
while true do
say_hello
sleep(5)
end
end
t.join
You are creating the Thread object, but you are not waiting for it to finish its execution. Try:
Thread.new do
while true do
say_hello
sleep(5)
end
end.join
Try
t1 = Thread.new do
while true do
say_hello
sleep(5)
end
end
t1.join
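Since the loop never ends, a plain join blocks the main thread forever. As a sketch of one alternative, Thread#join also accepts a time limit in seconds and returns nil if the thread is still running when the limit expires. The ticks counter and the short intervals below are only there to keep the example quick; they are assumptions, not part of the question.

```ruby
def say_hello
  puts 'hello world'
end

ticks = 0
t = Thread.new do
  loop do
    say_hello
    ticks += 1
    sleep 0.02
  end
end

# Wait up to 0.1 seconds; join returns nil because the loop is still running.
finished = t.join(0.1)
t.kill unless finished   # stop the periodic thread once we are done with it
t.join                   # returns promptly now that the thread has been killed
puts "ticks: #{ticks}"
```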
I was looking in detail at the Thread class. Basically, I was looking for an elegant mechanism to allow thread-local variables to be inherited as threads are created. For example the functionality I am looking to create would ensure that
Thread.new do
self[:foo]="bar"
t1=Thread.new { puts self[:foo] }
end
=> "bar"
i.e. a Thread would inherit its calling thread's thread-local variables.
So I hit upon the idea of redefining Thread.new, so that I could add an extra step to copy the thread-local variables into the new thread from the current thread. Something like this:
class Thread
def self.another_new(*args)
o=allocate
o.send(:initialize, *args)
Thread.current.keys.each{ |k| o[k]=Thread.current[k] }
o
end
end
But when I try this I get the following error:
:in `allocate': allocator undefined for Thread (TypeError)
I thought that as Thread is a subclass of Object, it should have a working #allocate method. Is this not the case?
Does anyone have any deep insight on this, and on how to achieve the functionality I am looking for?
Thanks in advance
Steve
Thread.new do
Thread.current[:foo]="bar"
t1=Thread.new(Thread.current) do |parent|
puts parent[:foo] ? parent[:foo] : 'nothing'
end.join
end.join
#=> bar
UPDATED:
Try this in irb:
thread_ext.rb
class Thread
def self.another_new(*args)
parent = Thread.current
a = Thread.new(parent) do |parent|
parent.keys.each{ |k| Thread.current[k] = parent[k] }
yield
end
a
end
end
use_case.rb
A = Thread.new do
Thread.current[:local_a]="A"
B1 =Thread.another_new do
C1 = Thread.another_new{p Thread.current[:local_a] }.join
end
B2 =Thread.another_new do
C2 = Thread.another_new{p Thread.current[:local_a] }.join
end
[B1, B2].each{|b| b.join }
end.join
output
"A"
"A"
Here is a revised answer based on @CodeGroover's suggestion, with a simple unit test harness:
ext/thread.rb
class Thread
def self.inherit(*args, &block)
parent = Thread.current
t = Thread.new(parent, *args) do |parent|
parent.keys.each{ |k| Thread.current[k] = parent[k] }
yield *args
end
t
end
end
test/thread.rb
require 'test/unit'
require 'ext/thread'
class ThreadTest < Test::Unit::TestCase
def test_inherit
Thread.current[:foo]=1
m=Mutex.new
#check basic inheritance
t1= Thread.inherit do
assert_equal(1, Thread.current[:foo])
end
#check inheritance with parameters - in this case a mutex
t2= Thread.inherit(m) do |m|
assert_not_nil(m)
m.synchronize{ Thread.current[:bar]=2 }
assert_equal(1, Thread.current[:foo])
assert_equal(2, Thread.current[:bar])
sleep 0.1
end
#ensure t2 runs its mutex-synchronized block first
sleep 0.05
#check that the inheritance works downwards only - not back up in reverse
m.synchronize do
assert_nil(Thread.current[:bar])
end
[t1,t2].each{|x| x.join }
end
end
I was looking for the same thing recently and came up with the following answer. Note that I am aware this is a hack and not recommended, but to answer the specific question of how you could alter the behavior of Thread.new, I did the following:
class Thread
class << self
alias :original_new :new
def new(*args, **options, &block)
original_thread = Thread.current
instance = original_new(*args, **options, &block)
original_thread.keys.each do |key|
instance[key] = original_thread[key]
end
instance
end
end
end
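If patching Thread itself feels too invasive, the same effect can be sketched with a plain helper method. The name thread_with_inherited_locals is hypothetical. The helper snapshots the caller's values before the child starts, so the child does not read the parent's locals while the parent may still be changing them.

```ruby
# Hypothetical helper: start a thread that begins with copies of the
# caller's thread-local values (the ones accessed via Thread#[]).
def thread_with_inherited_locals(*args)
  parent = Thread.current
  # Take the snapshot in the calling thread, before the child exists.
  inherited = parent.keys.each_with_object({}) { |k, h| h[k] = parent[k] }
  Thread.new(*args) do |*thread_args|
    inherited.each { |k, v| Thread.current[k] = v }
    yield(*thread_args)
  end
end

Thread.current[:foo] = 'bar'
child = thread_with_inherited_locals(21) { |n| [Thread.current[:foo], n * 2] }
p child.value   # => ["bar", 42]
```

Note that the values are copied by reference, so a mutable object stored in a thread-local is still shared between parent and child.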
In the situation below the @crawl object DOES RECEIVE the crawl call, but the method mock fails, i.e. the method is not mocked.
Does Thread somehow create its own copy of the @crawl object, escaping the mock?
@crawl.should_receive(:crawl).with(an_instance_of(String)).twice.and_return(nil)
threads = @crawl.create_threads
thread creation code:
def crawl(uri)
dosomecrawling
end
def create_threads
(1..5).each do
Thread.new do
crawl(someurifeedingmethod)
end
end
end
It does not appear from the code posted that you are joining the threads. If so, there is a race condition: sometimes the test will finish before some or all of the threads have done their job. The fix is along these lines:
#!/usr/bin/ruby1.9
class Crawler
  def crawl(uri)
    dosomecrawling
  end
  def create_threads
    @threads = (1..5).collect do
      Thread.new do
        crawl(someurifeedingmethod)
      end
    end
  end
  def join
    @threads.each do |thread|
      thread.join
    end
  end
end
describe "the above code" do
it "should crawl five times" do
crawler = Crawler.new
uri = "uri"
crawler.should_receive(:someurifeedingmethod).with(no_args).exactly(5).times.and_return(uri)
crawler.should_receive(:crawl).with(uri).exactly(5).times
crawler.create_threads
crawler.join
end
end
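The race can also be seen without RSpec. In this sketch, CountingCrawler is a hypothetical stand-in that counts calls instead of crawling, and the URI is a placeholder. The count read before joining can be anything from 0 to 5, while the count read after joining is always 5.

```ruby
# Hypothetical stand-in for the crawler; counts calls instead of crawling.
class CountingCrawler
  attr_reader :calls

  def initialize
    @calls = 0
    @lock = Mutex.new
  end

  def crawl(uri)
    sleep 0.01                          # simulate slow work
    @lock.synchronize { @calls += 1 }   # protect the shared counter
  end

  def create_threads
    (1..5).map { Thread.new { crawl('http://example.invalid') } }
  end
end

crawler = CountingCrawler.new
threads = crawler.create_threads
before_join = crawler.calls   # racy: the threads may not have finished yet
threads.each(&:join)
after_join = crawler.calls    # all five threads have completed
puts "before join: #{before_join}, after join: #{after_join}"
```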
This code works perfectly.
You can set the expectation five times.
class Hello
def crawl(uri)
puts uri
end
def create_threads
(1..5).each do
Thread.new do
crawl('http://hello')
end
end
end
end
describe 'somting' do
it 'should mock' do
crawl = Hello.new
5.times do
crawl.should_receive(:crawl).with(an_instance_of(String)).and_return(nil)
end
threads = crawl.create_threads
end
end
I couldn't find a decent ThreadPool implementation for Ruby, so I wrote my own (based partly on code from here: http://web.archive.org/web/20081204101031/http://snippets.dzone.com:80/posts/show/3276, but changed to use wait/signal and a different implementation of ThreadPool shutdown). However, after running for some time (with 100 threads handling about 1300 tasks), it dies with a deadlock on line 25, where it waits for a new job. Any ideas why this might happen?
require 'thread'
begin
  require 'fastthread'
rescue LoadError
  $stderr.puts "Using the ruby-core thread implementation"
end
class ThreadPool
  class Worker
    def initialize(callback)
      @mutex = Mutex.new
      @cv = ConditionVariable.new
      @callback = callback
      @mutex.synchronize {@running = true}
      @thread = Thread.new do
        while @mutex.synchronize {@running}
          block = get_block
          if block
            block.call
            reset_block
            # Signal the ThreadPool that this worker is ready for another job
            @callback.signal
          else
            # Wait for a new job
            @mutex.synchronize {@cv.wait(@mutex)} # <=== Is this line 25?
          end
        end
      end
    end
    def name
      @thread.inspect
    end
    def get_block
      @mutex.synchronize {@block}
    end
    def set_block(block)
      @mutex.synchronize do
        raise RuntimeError, "Thread already busy." if @block
        @block = block
        # Signal the thread in this class, that there's a job to be done
        @cv.signal
      end
    end
    def reset_block
      @mutex.synchronize {@block = nil}
    end
    def busy?
      @mutex.synchronize {!@block.nil?}
    end
    def stop
      @mutex.synchronize {@running = false}
      # Signal the thread not to wait for a new job
      @cv.signal
      @thread.join
    end
  end
  attr_accessor :max_size
  def initialize(max_size = 10)
    @max_size = max_size
    @workers = []
    @mutex = Mutex.new
    @cv = ConditionVariable.new
  end
  def size
    @mutex.synchronize {@workers.size}
  end
  def busy?
    @mutex.synchronize {@workers.any? {|w| w.busy?}}
  end
  def shutdown
    @mutex.synchronize {@workers.each {|w| w.stop}}
  end
  alias :join :shutdown
  def process(block=nil,&blk)
    block = blk if block_given?
    while true
      @mutex.synchronize do
        worker = get_worker
        if worker
          return worker.set_block(block)
        else
          # Wait for a free worker
          @cv.wait(@mutex)
        end
      end
    end
  end
  # Used by workers to report ready status
  def signal
    @cv.signal
  end
  private
  def get_worker
    free_worker || create_worker
  end
  def free_worker
    @workers.each {|w| return w unless w.busy?}; nil
  end
  def create_worker
    return nil if @workers.size >= @max_size
    worker = Worker.new(self)
    @workers << worker
    worker
  end
end
OK, so the main problem with the implementation is: how do you make sure no signal is lost and avoid deadlocks?
In my experience, this is REALLY hard to achieve with condition variables and mutexes, but easy with semaphores. It so happens that Ruby implements an object called Queue (or SizedQueue) that should solve the problem. Here is my suggested implementation:
require 'thread'
begin
  require 'fastthread'
rescue LoadError
  $stderr.puts "Using the ruby-core thread implementation"
end
class ThreadPool
  class Worker
    def initialize(thread_queue)
      @mutex = Mutex.new
      @cv = ConditionVariable.new
      @queue = thread_queue
      @running = true
      @thread = Thread.new do
        @mutex.synchronize do
          while @running
            @cv.wait(@mutex)
            block = get_block
            if block
              @mutex.unlock
              block.call
              @mutex.lock
              reset_block
            end
            @queue << self
          end
        end
      end
    end
    def name
      @thread.inspect
    end
    def get_block
      @block
    end
    def set_block(block)
      @mutex.synchronize do
        raise RuntimeError, "Thread already busy." if @block
        @block = block
        # Signal the thread in this class, that there's a job to be done
        @cv.signal
      end
    end
    def reset_block
      @block = nil
    end
    def busy?
      @mutex.synchronize { !@block.nil? }
    end
    def stop
      @mutex.synchronize do
        @running = false
        @cv.signal
      end
      @thread.join
    end
  end
  attr_accessor :max_size
  def initialize(max_size = 10)
    @max_size = max_size
    @queue = Queue.new
    @workers = []
  end
  def size
    @workers.size
  end
  def busy?
    @queue.size < @workers.size
  end
  def shutdown
    @workers.each { |w| w.stop }
    @workers = []
  end
  alias :join :shutdown
  def process(block=nil,&blk)
    block = blk if block_given?
    worker = get_worker
    worker.set_block(block)
  end
  private
  def get_worker
    if !@queue.empty? or @workers.size == @max_size
      return @queue.pop
    else
      worker = Worker.new(@queue)
      @workers << worker
      worker
    end
  end
end
And here is some simple test code:
tp = ThreadPool.new 500
(1..1000).each { |i| tp.process { (2..10).inject(1) { |memo,val| sleep(0.1); memo*val }; print "Computation #{i} done. Nb of tasks: #{tp.size}\n" } }
tp.shutdown
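To show how far the Queue idea can be taken, here is a much smaller sketch that drops the per-worker condition variables entirely and lets every worker block on one shared job queue. The class name QueuePool and the nil shutdown marker are assumptions made for this example; it is not the implementation above, and it does not handle jobs that raise.

```ruby
require 'thread'

# Hypothetical minimal pool: jobs go into one shared Queue and every worker
# blocks on Queue#pop, so there is no separate signal that could be missed.
class QueuePool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # A nil job is the shutdown marker for one worker.
        while (job = @jobs.pop)
          job.call
        end
      end
    end
  end

  def process(&block)
    @jobs << block
  end

  # Queued jobs run first (the queue is FIFO), then each worker takes one marker.
  def shutdown
    @workers.size.times { @jobs << nil }
    @workers.each(&:join)
  end
end

results = Queue.new
pool = QueuePool.new(4)
10.times { |i| pool.process { results << i * i } }
pool.shutdown
squares = Array.new(results.size) { results.pop }.sort
p squares   # => [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```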
You can try the work_queue gem, designed to coordinate work between a producer and a pool of worker threads.
I'm slightly biased here, but I would suggest modelling this in some process language and model checking it. Freely available tools are, for example, the mCRL2 toolset (using an ACP-based language), the Mobility Workbench (pi-calculus) and Spin (PROMELA).
Otherwise I would suggest removing every bit of code that is not essential to the problem and finding a minimal case where the deadlock occurs. I doubt that the 100 threads and 1300 tasks are essential to get a deadlock. With a smaller case you can probably just add some debug prints which provide enough information to solve the problem.
OK, the problem seems to be in your ThreadPool#signal method. What may happen is:
1 - All your workers are busy and you try to process a new job
2 - line 90 gets a nil worker
3 - a worker gets freed and signals it, but the signal is lost as the ThreadPool is not waiting for it
4 - you fall on line 95, waiting even though there is a free worker.
The error here is that you can signal a free worker even when nobody is listening. The ThreadPool#signal method should be:
def signal
  @mutex.synchronize { @cv.signal }
end
And the problem is the same in the Worker object. What might happen is:
1 - The Worker just completed a job
2 - It checks (line 17) if there is a job waiting: there isn't
3 - The thread pool sends a new job and signals it ... but the signal is lost
4 - The worker waits for a signal, even though it is marked as busy
You should rewrite your initialize method as:
def initialize(callback)
  @mutex = Mutex.new
  @cv = ConditionVariable.new
  @callback = callback
  @mutex.synchronize {@running = true}
  @thread = Thread.new do
    @mutex.synchronize do
      while @running
        block = get_block
        if block
          @mutex.unlock
          block.call
          @mutex.lock
          reset_block
          # Signal the ThreadPool that this worker is ready for another job
          @callback.signal
        else
          # Wait for a new job
          @cv.wait(@mutex)
        end
      end
    end
  end
end
Next, the Worker#get_block and Worker#reset_block methods should not be synchronized anymore. That way, you cannot have a block assigned to a worker between the test for a block and the wait for a signal.
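Both lost-signal scenarios come from checking a condition and waiting for its signal in two separate critical sections. A minimal sketch of the usual remedy, using a hypothetical one-slot Mailbox class that is not part of the pool above: keep the state itself as the source of truth, and test it in a loop while holding the same mutex that the waiter hands to ConditionVariable#wait.

```ruby
require 'thread'

# Hypothetical one-slot mailbox: @job is the source of truth, and the
# condition variable is only a hint that @job may have changed.
class Mailbox
  def initialize
    @mutex = Mutex.new
    @cv = ConditionVariable.new
    @job = nil
  end

  def put(job)
    @mutex.synchronize do
      @job = job
      @cv.signal
    end
  end

  def take
    @mutex.synchronize do
      # Check and wait under one lock, so no put can slip in between.
      @cv.wait(@mutex) while @job.nil?
      job, @job = @job, nil
      job
    end
  end
end

box = Mailbox.new
box.put(:early)                      # signalled before anyone is waiting
first = box.take                     # still returns: @job is checked before waiting
consumer = Thread.new { box.take }
box.put(:late)
second = consumer.value
p [first, second]   # => [:early, :late]
```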
Top commenter's code has helped out so much over the years. Here it is updated for ruby 2.x and improved with thread identification. How is that an improvement? When each thread has an ID, you can compose ThreadPool with an array which stores arbitrary information. Some ideas:
No array: typical ThreadPool usage. Even with the GIL it makes threading dead easy to code and very useful for high-latency applications like high-volume web crawling,
ThreadPool and Array sized to number of CPUs: easy to fork processes to use all CPUs,
ThreadPool and Array sized to number of resources: e.g., each array element represents one processor across a pool of instances, so if you have 10 instances each with 4 CPUs, the TP can manage work across 40 subprocesses.
With these last two, rather than thinking about threads doing work, think about the ThreadPool managing subprocesses that are doing the work. The management task is lightweight, and when combined with subprocesses, who cares about the GIL?
With this class, you can code up a cluster based MapReduce in about a hundred lines of code! This code is beautifully short although it can be a bit of a mind-bend to fully grok. Hope it helps.
# Usage:
#
# Thread.abort_on_exception = true # help localize errors while debugging
# pool = ThreadPool.new(thread_pool_size)
# 50.times {|i|
# pool.process { ... }
# or
# pool.process {|id| ... } # worker identifies itself as id
# }
# pool.shutdown()
class ThreadPool
  require 'thread'
  class ThreadPoolWorker
    attr_accessor :id
    def initialize(thread_queue, id)
      @id = id # worker id is exposed thru tp.process {|id| ... }
      @mutex = Mutex.new
      @cv = ConditionVariable.new
      @idle_queue = thread_queue
      @running = true
      @block = nil
      @thread = Thread.new {
        @mutex.synchronize {
          while @running
            @cv.wait(@mutex) # block until there is work to do
            if @block
              @mutex.unlock
              begin
                @block.call(@id)
              ensure
                @mutex.lock
              end
              @block = nil
            end
            @idle_queue << self
          end
        }
      }
    end
    def set_block(block)
      @mutex.synchronize {
        raise RuntimeError, "Thread is busy." if @block
        @block = block
        @cv.signal # notify thread in this class, there is work to be done
      }
    end
    def busy?
      @mutex.synchronize { ! @block.nil? }
    end
    def stop
      @mutex.synchronize {
        @running = false
        @cv.signal
      }
      @thread.join
    end
    def name
      @thread.inspect
    end
  end
  attr_accessor :max_size, :queue
  def initialize(max_size = 10)
    @process_mutex = Mutex.new
    @max_size = max_size
    @queue = Queue.new # of idle workers
    @workers = [] # array to hold workers
    # construct workers
    @max_size.times {|i| @workers << ThreadPoolWorker.new(@queue, i) }
    # queue up workers (workers in queue are idle and available to
    # work). queue blocks if no workers are available.
    @max_size.times {|i| @queue << @workers[i] }
    sleep 1 # important to give threads a chance to initialize
  end
  def size
    @workers.size
  end
  def idle
    @queue.size
  end
  # are all threads busy (no idle workers in the queue)?
  def busy?
    # @queue.size < @workers.size
    @queue.size == 0 && @workers.size == @max_size
  end
  # block until all threads finish
  def shutdown
    @workers.each {|w| w.stop }
    @workers = []
  end
  alias :join :shutdown
  def process(block = nil, &blk)
    @process_mutex.synchronize {
      block = blk if block_given?
      worker = @queue.pop # assign to next worker; block until one is ready
      worker.set_block(block) # give code block to worker and tell it to start
    }
  end
end