SizedQueue.clear does not free push - ruby

The code below is a simplified version of a much larger and more complex program. When I invoke the stop function and clear the queue, I expect the get_new thread, which is blocked on the push, to be released so that the whole thread can end. Instead, the program deadlocks on the thread.join statement.
If I do a pop instead of clear, I get the desired behavior. Can you help me understand why?
require 'thread'
require 'monitor'
require 'net/http'

class Controller
  attr_accessor :thread_count, :event_queue, :is_running, :producer_thread, :events

  def initialize
    @thread_count = 5
    @event_queue = SizedQueue.new(@thread_count)
    @events = [27242233, 27242232, 27242231]
  end

  def start
    @is_running = true
    @producer_thread = Thread.new { get_new() }
  end

  def get_new
    while @is_running do
      @events.each do |e|
        p e.to_s
        @event_queue << e
      end
      sleep 1
    end
    p "thread endend"
  end

  def stop
    p "Stoping!"
    @event_queue.clear
    p "Queue size: " + @event_queue.length.to_s
    sleep 2
    @is_running = false
    sleep 2
    producer_thread.join
    puts "DONE!"
  end
end

service = Controller.new
service.start
sleep 5
service.stop

This was a bug in Ruby. It is fixed in Ruby 1.9.3p545.
Early versions of Ruby 2.1 and 2.0 were affected too. For those you want 2.1.2 or 2.0.0p481, respectively.
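
A minimal sketch for checking whether a given interpreter behaves as expected, assuming a Ruby version with the fix, where SizedQueue#clear wakes producers blocked in push. The variable names are illustrative and not taken from the original code.

```ruby
require 'thread'

queue = SizedQueue.new(1)
queue << :first                      # the queue is now full

# This producer blocks in push because the queue has no free slot.
producer = Thread.new { queue << :second }
sleep 0.01 until producer.status == "sleep"

queue.clear                          # on a fixed Ruby this wakes the blocked producer

# join(timeout) returns nil if the thread is still blocked after 2 seconds.
released = !producer.join(2).nil?
puts released ? "producer released" : "producer still blocked (affected Ruby)"
```

On an affected version the join would time out, which matches the deadlock described in the question.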

Related

RuntimeError (Circular dependency detected while autoloading constant Apps multithreading

I'm receiving this error:
RuntimeError (Circular dependency detected while autoloading constant Apps
when I'm multithreading. Here is my code below. Why is this happening?
The reason I am trying to multithread is that I am writing an HTML scraping app.
The call to Nokogiri::HTML(open()) is a synchronous, blocking call that takes 1 second to return, and I have 100,000+ pages to visit, so I am trying to run several threads to work around this. Is there a better way of doing this?
class ToolsController < ApplicationController
  def getWebsites
    t1 = Thread.new { func1() }
    t2 = Thread.new { func1() }
    t3 = Thread.new { func1() }
    t4 = Thread.new { func1() }
    t5 = Thread.new { func1() }
    t6 = Thread.new { func1() }
    t1.join
    t2.join
    t3.join
    t4.join
    t5.join
    t6.join
  end

  def func1
    puts Thread.current
    apps = Apps.order("RANDOM()").where("apps.website IS NULL").take(1)
    while apps.size == 1 do
      app = apps[0]
      puts app.name
      puts app.iTunes
      doc = Nokogiri::HTML(open(app.iTunes))
      array = doc.css('div.app-links a').map { |link|
        url = link['href']
        url = Domainatrix.parse(url)
        url.domain + "." + url.public_suffix
      }
      array.uniq!
      if (array.size > 0)
        app.website = array.join(', ')
        puts app.website
      else
        app.website = "NONE"
      end
      app.save
      apps = Apps.order("RANDOM()").where("apps.website IS NULL").take(1)
    end
  end
end
"require" isn't thread-safe
Change your methods so that everything that needs to be required is loaded before the threads start.
For example:
def get_websites
  # values = Apps.all # try uncommenting this line if a second-try is required
  ar = Apps.where("apps.website IS NULL")
  t1 = Thread.new { func1(ar) }
  t2 = Thread.new { func1(ar) }
  t1.join
  t2.join
end

def func1(ar)
  apps = ar.order("RANDOM()").limit(1)
  while (apps.size == 1)
    puts Thread.current
  end
end
But as somebody pointed out, the way you're multithreading within the controller isn't advised.
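
On the "is there a better way" part of the question, one common pattern is a fixed pool of worker threads that pull jobs from a Queue, run outside the controller (for example from a script or background job). The following is only a sketch in plain Ruby: process_in_parallel is a hypothetical helper, and the block stands in for the real fetch-and-parse work.

```ruby
require 'thread'

# Hypothetical helper: runs the given block for every item using a
# fixed number of worker threads and returns { item => result }.
def process_in_parallel(items, worker_count: 4, &work)
  jobs = Queue.new
  items.each { |item| jobs << item }
  results = Queue.new

  workers = Array.new(worker_count) do
    Thread.new do
      loop do
        begin
          item = jobs.pop(true)      # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break                      # no work left, let this worker finish
        end
        results << [item, work.call(item)]
      end
    end
  end

  workers.each(&:join)
  Array.new(results.size) { results.pop }.to_h
end

# Placeholder work: doubling a number instead of scraping a page.
doubled = process_in_parallel([1, 2, 3]) { |n| n * 2 }
puts doubled.inspect
```

Because every job is queued before the workers start, each worker can simply stop when the queue is empty, and no two workers ever pick the same item.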

How can I terminate a SupervisionGroup?

I am implementing a simple program in Celluloid that ideally will run a few actors in parallel, each of which will compute something, and then send its result back to a main actor, whose job is simply to aggregate results.
Following this FAQ, I introduced a SupervisionGroup, like this:
module Shuffling
  class AggregatorActor
    include Celluloid

    def initialize(shufflers)
      @shufflerset = shufflers
      @results = {}
    end

    def add_result(result)
      @results.merge! result
      @shufflerset = @shufflerset - result.keys
      if @shufflerset.empty?
        self.output
        self.terminate
      end
    end

    def output
      puts @results
    end
  end

  class EvalActor
    include Celluloid

    def initialize(shufflerClass)
      @shuffler = shufflerClass.new
      self.async.runEvaluation
    end

    def runEvaluation
      # computation here, which yields result
      Celluloid::Actor[:aggregator].async.add_result(result)
      self.terminate
    end
  end

  class ShufflerSupervisionGroup < Celluloid::SupervisionGroup
    shufflers = [RubyShuffler, PileShuffle, VariablePileShuffle, VariablePileShuffleHuman].to_set
    supervise AggregatorActor, as: :aggregator, args: [shufflers.map { |sh| sh.new.name }]
    shufflers.each do |shuffler|
      supervise EvalActor, as: shuffler.name.to_sym, args: [shuffler]
    end
  end

  ShufflerSupervisionGroup.run
end
I terminate the EvalActors after they're done, and I also terminate the AggregatorActor when all of the workers are done.
However, the supervision thread stays alive and keeps the main thread alive. The program never terminates.
If I send .run! to the group, then the main thread terminates right after it, and nothing works.
What can I do to terminate the group (or, in group terminology, finalize, I suppose) after the AggregatorActor terminates?
What I did in the end was change the AggregatorActor to have a wait_for_results method:
class AggregatorActor
  include Celluloid

  def initialize(shufflers)
    @shufflerset = shufflers
    @results = {}
  end

  def wait_for_results
    sleep 5 while not @shufflerset.empty?
    self.output
    self.terminate
  end

  def add_result(result)
    @results.merge! result
    @shufflerset = @shufflerset - result.keys
    puts "Results for #{result.keys.inspect} recorded, remaining: #{@shufflerset.inspect}"
  end

  def output
    puts @results
  end
end
And then I got rid of the SupervisionGroup (since I didn't need supervision, i.e. rerunning actors that failed), and I used it like this:
shufflers = [RubyShuffler, PileShuffle, VariablePileShuffle, VariablePileShuffleHuman, RiffleShuffle].to_set
Celluloid::Actor[:aggregator] = AggregatorActor.new(shufflers.map { |sh| sh.new.name })
shufflers.each do |shuffler|
  Celluloid::Actor[shuffler.name.to_sym] = EvalActor.new shuffler
end
Celluloid::Actor[:aggregator].wait_for_results
That doesn't feel very clean, and it would be nice if there were a cleaner way, but at least this works.
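
For comparison, the same "run workers, then aggregate" shape can be sketched without Celluloid, using only standard-library threads. This is a swapped-in technique, not the Celluloid API: Shuffler is a hypothetical stand-in for the shuffler classes, and evaluate is a placeholder for the real computation.

```ruby
# Hypothetical stand-in for the shuffler classes in the question.
Shuffler = Struct.new(:name) do
  def evaluate
    name.length                      # placeholder for the real computation
  end
end

shufflers = [Shuffler.new("ruby"), Shuffler.new("pile")]

# One thread per shuffler; each block returns a one-entry hash.
workers = shufflers.map do |shuffler|
  Thread.new { { shuffler.name => shuffler.evaluate } }
end

# Thread#value joins the thread and returns the block's result,
# so no polling loop is needed before aggregating.
results = workers.map(&:value).reduce({}, :merge)
puts results.inspect
```

The program ends as soon as the last worker has delivered its value, because nothing is left running once every thread has been joined.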

How to rspec threaded code?

I have just started using rspec, and I am having difficulty testing threaded code.
Here is a simplification of some code I found; I adapted it because I need a Queue with timeout capabilities:
require "thread"

class TimeoutQueue
  def initialize
    @lock = Mutex.new
    @items = []
    @new_item = ConditionVariable.new
  end

  def push(obj)
    @lock.synchronize do
      @items.push(obj)
      @new_item.signal
    end
  end

  def pop(timeout = :never)
    # Turn the relative timeout into an absolute deadline.
    # Time.now + timeout also works for plain numbers (Integer + Time raises TypeError).
    timeout = Time.now + timeout unless timeout == :never
    @lock.synchronize do
      loop do
        time_left = timeout == :never ? nil : timeout - Time.now
        if @items.empty? and time_left.to_f >= 0
          @new_item.wait(@lock, time_left)
        end
        return @items.shift unless @items.empty?
        next if timeout == :never or timeout > Time.now
        return nil
      end
    end
  end

  alias_method :<<, :push
end
But I can't find a way to test it using rspec. Is there any good documentation on testing threaded code? Is there a gem that can help me?
I'm a bit stuck. Thanks in advance.
When unit-testing, we don't want any non-deterministic behavior to affect our tests, so when testing threaded code we should not run anything in parallel.
Instead, we should isolate our code and simulate the cases we want to test by stubbing @lock, @new_item, and perhaps even Time.now (to keep this readable, I've taken the liberty of imagining you also have attr_reader :lock, :new_item, :items):
# Note: 10.seconds and 5.seconds come from ActiveSupport.
it 'should signal after push' do
  allow(subject.lock).to receive(:synchronize).and_yield
  expect(subject.new_item).to receive(:signal)
  subject.push('object')
  expect(subject.items).to include('object')
end

it 'should time out if it takes too long to enter the synchronize loop' do
  @now = Time.now
  allow(Time).to receive(:now).and_return(@now, @now + 10.seconds)
  allow(subject.items).to receive(:empty?).and_return true
  allow(subject.lock).to receive(:synchronize).and_yield
  expect(subject.new_item).to_not receive(:wait)
  expect(subject.pop(5.seconds)).to be_nil
end
etc...
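
The same isolation idea can be sketched in plain Ruby without RSpec, by passing hand-written fakes in place of the Mutex and ConditionVariable. This assumes a variant of TimeoutQueue whose constructor accepts its collaborators; the lock: and new_item: keywords, FakeLock and FakeCondition are all hypothetical names, and only push is reproduced here.

```ruby
# Trimmed variant of TimeoutQueue with injectable collaborators (an assumption).
class TimeoutQueue
  attr_reader :items

  def initialize(lock: Mutex.new, new_item: ConditionVariable.new)
    @lock = lock
    @new_item = new_item
    @items = []
  end

  def push(obj)
    @lock.synchronize do
      @items.push(obj)
      @new_item.signal
    end
  end
end

# Fake lock: runs the block immediately, no real locking.
class FakeLock
  def synchronize
    yield
  end
end

# Fake condition variable: only counts how often signal was called.
class FakeCondition
  attr_reader :signals

  def initialize
    @signals = 0
  end

  def signal
    @signals += 1
  end
end

condition = FakeCondition.new
queue = TimeoutQueue.new(lock: FakeLock.new, new_item: condition)
queue.push('object')

puts "signals: #{condition.signals}, items: #{queue.items.inspect}"
```

No thread is started anywhere in this check, so the outcome does not depend on scheduling.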

Stop Ruby - jRuby - thread after a certain time

I'm trying to create a simple multithreaded program with jRuby. It needs to start and stop threads based on a specified amount of time e.g. run for five seconds then stop. I'm pretty new to this sort of stuff, so it's probably pretty basic but I can't get it to work.
The relevant code looks like this:
require 'java'
require 'timeout'
require './lib/t1.rb'
require './lib/t2.rb'

class Threads
  [...]
  def manage_threads
    thread2 = T2.new
    # Wait for 5 seconds before the thread starts running..
    thread2.run(wait_time = 5)
    Timeout::timeout(10) do
      thread1 = T1.new {}
    end
  end
end

class T1 < Thread
  def initialize
    while super.status != "sleep"
      puts "Thread 1"
      sleep(1)
    end
  end
end

class T2
  include java.lang.Runnable

  def run wait_time
    thread = Thread.new do
      sleep(wait_time)
      loop do
        puts "Thread 2"
        sleep(1)
      end
    end
  end

  def stop_thread(after_run_time)
    sleep(after_run_time)
  end
end
I have already tried a couple of things, for example:
# Used timeout
Timeout::timeout(10) do
  thread1 = T1.new {}
end
# This kinda works, except that it terminates the program and therefore isn't the behavior
# I want.
Does anyone have a suggestion on how to 1. start a thread and run it for a while, 2. start a new thread and run both threads in parallel, 3. stop thread 1 but keep running thread 2? Any tips/suggestions would be appreciated.
I think I solved it.
This did the trick:
def run wait_time
  thread = Thread.new do
    sleep(wait_time)
    second_counter = 0
    loop do
      puts "Thread 2"
      second_counter += 1
      if second_counter == 15
        sleep
      end
      sleep(1)
    end
  end
end
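
An alternative to parking the thread with a bare sleep is to give each thread its own deadline and let it return when the deadline passes. The following is a sketch under that assumption, not the code from the answer: run_for is a hypothetical helper, and the durations are shortened so the example finishes quickly.

```ruby
# Hypothetical helper: starts a thread that works until `seconds` have
# elapsed, then returns how many iterations it completed.
def run_for(seconds, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + seconds
  Thread.new do
    ticks = 0
    while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
      ticks += 1                     # the real work would go here
      sleep interval
    end
    ticks
  end
end

thread1 = run_for(0.2)               # stops first
thread2 = run_for(1.0)               # keeps running in parallel

thread1_ticks = thread1.value        # waits for thread 1 only
thread2_still_running = thread2.alive?
thread2_ticks = thread2.value        # now wait for thread 2 as well

puts "thread 1 ticks: #{thread1_ticks}, thread 2 ticks: #{thread2_ticks}"
```

Each thread ends on its own, so the second one is unaffected when the first stops, and neither has to be killed from outside.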

Thread lockup in ruby with Soap4r

This is related to a question I asked here:
Thread Locking in Ruby (use of soap4r and QT)
However it is particular to one part of that question and is supported by a simpler example. The test code is:
require 'rubygems'
require 'thread'
require 'soap/rpc/standaloneserver'

class SOAPServer < SOAP::RPC::StandaloneServer
  def initialize(* args)
    super
    # Exposed methods
    add_method(self, 'test', 'x', 'y')
  end

  def test(x, y)
    return x + y
  end
end

myServer = SOAPServer.new('monitorservice', 'urn:ruby:MonitorService', 'localhost', 4004)

Thread.new do
  puts 'Starting web services'
  myServer.start
  puts 'Ending web services'
end

sleep(4)

#Thread.new do
  testnum = 0
  while testnum < 4000 do
    testnum += 1
    puts myServer.test(0,testnum)
    sleep(2)
  end
#end

puts myServer.test(0,4001)
puts myServer.test(0,4002)
puts myServer.test(0,4003)
puts myServer.test(0,4004)
gets
When I run this with the thread commented out, everything runs along fine. However, once the thread is put in, the process hangs. I poked into Webrick and found that the stop occurs here (the puts are, of course, mine):
while @status == :Running
  begin
    puts "1.1"
    if svrs = IO.select(@listeners, nil, nil, 2.0)
      svrs[0].each{|svr|
        puts "-+-"
        @tokens.pop          # blocks while no token is there.
        if sock = accept_client(svr)
          th = start_thread(sock, &block)
          th[:WEBrickThread] = true
          thgroup.add(th)
        else
          @tokens.push(nil)
        end
      }
    end
    puts ".+."
When run with the thread NOT commented out I get something like this:
Starting web services
1.1
.+.
1.1
4001
4002
4003
4004
1
.+.
1.1
If the problem is caused by the gets call, and the purpose of that call in your code is to prevent the Ruby interpreter from exiting, you can replace it with a Thread#join call for each thread that you create. join blocks until that thread has finished executing, and therefore prevents the Ruby interpreter from exiting.
E.g.:
t1 = Thread.new do
  puts 'Starting web services'
  myServer.start
  puts 'Ending web services'
end
t2 = ...
...
t1.join
t2.join
Alternatively, you can join only one of the threads if a single thread controls the execution of the application; the other threads will be killed on exit.
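
A small, self-contained sketch of that behavior, with a short-lived thread standing in for the SOAP server (the events queue and the messages are illustrative only):

```ruby
require 'thread'

events = Queue.new

# Stand-in for the server thread: does a little work, then finishes.
server = Thread.new do
  3.times do |i|
    events << "handled request #{i}"
    sleep 0.05
  end
  events << "server stopped"
end

server.join                          # the main thread blocks here until the server thread ends
events << "main exiting"

log = Array.new(events.size) { events.pop }
puts log
```

The "main exiting" entry is always recorded last, because join does not return before the server thread has finished.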
The trailing gets blocks Ruby's IO. I'm not sure why. If it is replaced with pretty much anything else, the program works. I used a sleeping loop:
loop do
sleep 1
end
ADDED:
I should note that I also get strange behavior with sleep based on the sleep increment. In the end I abandoned Ruby since the threading behavior was too wonky.

Resources