Celluloid resize pool - ruby

I have the following program structure.
client = Client.new
params = client.get_params
pool = client.pool(size: params.size)
futures = params.map do |p|
pool.future(:perform_work, p)
end
futures.map(&:value)
Client is a Celluloid-enabled class (it uses include Celluloid). This works great until I try to run the program in a loop. I need to dynamically resize the pool of workers based on the number of parameters I receive from an external data feed.
client = Client.new
pool = client.pool(size: 1)
loop do
params = client.get_params
....
**? pool.resize(size: params.size) ?**
....
futures = params.map do |p|
pool.future(:perform_work, p)
end
futures.map(&:value)
sleep 1
end
I tried moving pool creation into the loop, followed by pool.terminate, but that spawns threads excessively and leads to an actor crash.

Setting pool.size explicitly seems to have done the trick:
client = Client.new
pool = client.pool(size: 1)
loop do
params = client.get_params
pool.size = params.size
futures = params.map do |p|
pool.future(:perform_work, p)
end
futures.map(&:value)
sleep 1
end

Related

Two versions of the same code not giving the same result

I am trying to implement a simple timeout class that handles timeouts of different requests.
Here is the first version:
class MyTimer
def handleTimeout mHash, k
while mHash[k] > 0 do
mHash[k] -=1
sleep 1
puts "#{k} : #{mHash[k]}"
end
end
end
MAX = 3
timeout = Hash.new
timeout[1] = 41
timeout[2] = 5
timeout[3] = 14
t1 = MyTimer.new
t2 = MyTimer.new
t3 = MyTimer.new
first = Thread.new do
t1.handleTimeout(timeout,1)
end
second = Thread.new do
t2.handleTimeout(timeout,2)
end
third = Thread.new do
t3.handleTimeout(timeout,3)
end
first.join
second.join
third.join
This seems to work fine. All the timeouts work independently of each other.
Screenshot attached
The second version of the code however produces different results:
class MyTimer
def handleTimeout mHash, k
while mHash[k] > 0 do
mHash[k] -=1
sleep 1
puts "#{k} : #{mHash[k]}"
end
end
end
MAX = 3
timeout = Hash.new
timers = Array.new(MAX+1)
threads = Array.new(MAX+1)
for i in 0..MAX do
timeout[i] = rand(40)
# To see timeout value
puts "#{i} : #{timeout[i]}"
end
sleep 1
for i in 0..MAX do
timers[i] = MyTimer.new
threads[i] = Thread.new do
timers[i].handleTimeout( timeout, i)
end
end
for i in 0..MAX do
threads[i].join
end
Screenshot attached
Why is this happening?
How can I implement this functionality using arrays?
Is there a better way to implement the same functionality?
In the loop where you create threads with Thread.new, the variable i is shared between the main thread (where the threads are created) and the threads themselves. By the time a thread actually runs its block, the loop may already have changed i, so the value of i seen by handleTimeout is not consistent and you get different results.
You can validate this by adding a debug statement in your method:
#...
def handleTimeout mHash, k
puts "Handle timeout called for #{mHash} and #{k}"
#...
end
#...
To fix the issue, pass the values as arguments to Thread.new and access them through block parameters; each thread then gets its own copy.
for i in 0..MAX do
timers[i] = MyTimer.new
threads[i] = Thread.new(timeout, i) do |a, b|
# use b, not the shared i, to pick this thread's timer
timers[b].handleTimeout(a, b)
end
end
More on this issue is described in When do you need to pass arguments to Thread.new? and this article.
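The difference between closing over the loop variable and passing it as an argument can be shown with a small sketch. The `gate` queue is a hypothetical helper that is not part of the original code; it only holds every thread back until both loops have finished, so the effect is visible on every run.

```ruby
gate = Queue.new # hypothetical helper: threads block on it until it is closed

# Closure version: every block reads the single `i` owned by the `for` loop.
shared = []
for i in 0..3
  shared << Thread.new { gate.pop; i }
end

# Argument version: each thread receives its own copy as block parameter `n`.
copied = []
for i in 0..3
  copied << Thread.new(i) { |n| gate.pop; n }
end

gate.close # both loops are done; pop now returns nil and releases the threads

shared_values = shared.map(&:value) # what each closure-based thread saw
copied_values = copied.map(&:value) # what each argument-based thread saw

puts "shared: #{shared_values.inspect}"
puts "copied: #{copied_values.inspect}"
```

In this sketch the closure-based threads all report the final value of `i`, while the argument-based threads each report the value that was current when they were created.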

Ruby - Reading from multiple sockets (irc bot)

I'm trying to make an IRC bot that connects to multiple servers, and I'm having trouble reading from all the sockets at once.
My current code:
#!/usr/bin/ruby
require 'socket'
servers = ["irc.chat4all.org"]
def connect(server, port, count)
puts "connecting to #{server}..."
@socket[count] = TCPSocket.open(server, port)
say("NICK link_hub", count)
say("USER link_hub 0 * link_hub", count)
read_data(count)
end
def say(msg, count)
@socket[count.to_i].puts msg.to_s
end
def say_channel(msg, count)
@socket[count.to_i].puts("PRIVMSG #test :"+msg.to_s)
end
def read_data(count)
until @socket[count].eof? do
msg = @socket[count].gets
puts msg
if msg.match(/^PING :(.*)$/)
say("PONG #{$~[1]}", count)
say("JOIN #test", count)
next
end
if msg.match(/`test/)
say_channel("connecting to efnet...", count)
Thread.new {
connect("irc.efnet.nl", 6667, count)
}
end
end
end
conn = []
count = 0
@socket = []
servers.each do |server|
connect(server, 6667, count)
count += 1
end
The problem is that when I send the command '`test', it connects to efnet, but it won't read the other socket anymore, even though I'm running the new connection in a thread. I just want to read from both sockets at the same time. (The variable 'count' is the socket number.)
Can anyone help me out? Much appreciated!
You need parallelism for that.
pids = []
servers.each do |server|
pids << fork do
connect(server, 6667, count)
end
# increment in the parent; a change made inside fork is only visible to the child
count += 1
end
pids.each{|pid| Process.wait pid}
You might want to read about processes, threads and other operating system topics.
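As an alternative to forking, a single process can watch several connections with IO.select. The following is only a sketch of that different technique, not the approach from the answer above. To keep it runnable without a network, two pipes stand in for the IRC sockets, and the labels "one" and "two" are made up for illustration.

```ruby
# Two pipes stand in for two server sockets (assumption for a runnable demo).
r1, w1 = IO.pipe
r2, w2 = IO.pipe
w1.puts "PING :server-one"
w2.puts "PING :server-two"
w1.close
w2.close

readers  = { r1 => "one", r2 => "two" }     # IO object => hypothetical label
received = Hash.new { |h, k| h[k] = [] }    # label => lines read so far

until readers.empty?
  # Block until at least one connection has data (or has reached EOF).
  ready, = IO.select(readers.keys)
  ready.each do |io|
    line = io.gets
    if line.nil?
      # EOF: this connection is finished, stop watching it.
      readers.delete(io)
      io.close
    else
      received[readers[io]] << line.chomp
    end
  end
end

puts received.inspect
```

With real sockets, the same loop would hold TCPSocket objects in `readers`, and the PING/PONG handling from the question would go where the line is recorded.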

Ruby Pause thread

In Ruby, is it possible to pause a thread from a different, concurrently running thread?
Below is the code that I've written so far. I want the user to be able to type 'pause thread' and the sample500 thread to pause.
#!/usr/bin/env ruby
# Creates a new thread executes the block every intervalSec for durationSec.
def DoEvery(thread, intervalSec, durationSec)
thread = Thread.new do
start = Time.now
timeTakenToComplete = 0
loopCounter = 0
while(timeTakenToComplete < durationSec && loopCounter += 1)
yield
finish = Time.now
timeTakenToComplete = finish - start
sleep(intervalSec*loopCounter - timeTakenToComplete)
end
end
end
# User input loop.
exit = nil
while(!exit)
userInput = gets
case userInput
when "start thread\n"
sample500 = Thread
beginTime = Time.now
DoEvery(sample500, 0.5, 30) {File.open('abc', 'a') {|file| file.write("a\n")}}
when "pause thread\n"
sample500.stop
when "resume thread"
sample500.run
when "exit\n"
exit = TRUE
end
end
Passing a Thread object as an argument to DoEvery makes no sense, because you immediately overwrite it with Thread.new. Check out this modified version:
def DoEvery(intervalSec, durationSec)
thread = Thread.new do
start = Time.now
Thread.current["stop"] = false
timeTakenToComplete = 0
loopCounter = 0
while(timeTakenToComplete < durationSec && loopCounter += 1)
if Thread.current["stop"]
Thread.current["stop"] = false
puts "paused"
Thread.stop
end
yield
finish = Time.now
timeTakenToComplete = finish - start
sleep(intervalSec*loopCounter - timeTakenToComplete)
end
end
thread
end
# User input loop.
exit = nil
while(!exit)
userInput = gets
case userInput
when "start thread\n"
sample500 = DoEvery(0.5, 30) {File.open('abc', 'a') {|file| file.write("a\n")} }
when "pause thread\n"
sample500["stop"] = true
when "resume thread\n"
sample500.run
when "exit\n"
exit = true
end
end
Here DoEvery returns the new thread object. Also note that Thread.stop is called inside the running thread itself; you can't directly stop one thread from another because it is not safe.
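The same cooperative idea, where the worker checks a flag and parks itself, can also be written with a Mutex and a ConditionVariable instead of Thread.stop and Thread#run. This is a minimal sketch under that assumption; the class name PausableWorker and its methods are hypothetical and not part of the code above.

```ruby
# Hypothetical helper: runs a block repeatedly and can be paused, resumed, stopped.
class PausableWorker
  def initialize(&work)
    @mutex   = Mutex.new
    @wakeup  = ConditionVariable.new
    @paused  = false
    @stopped = false
    @thread  = Thread.new do
      loop do
        @mutex.synchronize do
          # The worker parks itself here while paused; nobody stops it from outside.
          @wakeup.wait(@mutex) while @paused && !@stopped
        end
        break if @stopped
        work.call
      end
    end
  end

  def pause
    @mutex.synchronize { @paused = true }
  end

  def resume
    @mutex.synchronize do
      @paused = false
      @wakeup.signal
    end
  end

  def stop
    @mutex.synchronize do
      @stopped = true
      @wakeup.signal
    end
    @thread.join
  end
end

ticks = 0
worker = PausableWorker.new { ticks += 1; sleep 0.005 }
sleep 0.1
worker.pause
sleep 0.2                  # let an iteration that was already running finish
ticks_when_paused = ticks
sleep 0.2
ticks_still_paused = ticks # should not have moved while paused
worker.resume
sleep 0.1
worker.stop
ticks_after_resume = ticks
```

A pause request only takes effect between iterations, just as in the flag-based version: an iteration that is already running is allowed to finish.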
You may be better able to accomplish what you are attempting using Ruby's Fiber object, and likely achieve better efficiency on the running system.
Fibers are primitives for implementing light weight cooperative
concurrency in Ruby. Basically they are a means of creating code
blocks that can be paused and resumed, much like threads. The main
difference is that they are never preempted and that the scheduling
must be done by the programmer and not the VM.
Keep in mind that the current implementation of MRI Ruby does not run Ruby code in multiple threads at the same time, so the best you can achieve is interleaved, green-thread-style execution. The following is a nice example:
require "fiber"
f1 = Fiber.new { |f2| f2.resume Fiber.current; while true; puts "A"; f2.transfer; end }
f2 = Fiber.new { |f1| f1.transfer; while true; puts "B"; f1.transfer; end }
f1.resume f2 # =>
# A
# B
# A
# B
# .
# .
# .

Odd bug with DataMapper, Mutexes, and Threads?

I have a database full of URLs that I need to test HTTP response time for on a regular basis. I want to have many worker threads combing the database at all times for a URL that hasn't been tested recently, and if it finds one, test it.
Of course, this could cause multiple threads to snag the same URL from the database. I don't want this. So, I'm trying to use Mutexes to prevent this from happening. I realize there are other options at the database level (optimistic locking, pessimistic locking), but I'd at least prefer to figure out why this isn't working.
Take a look at this test code I wrote:
threads = []
mutex = Mutex.new
50.times do |i|
threads << Thread.new do
while true do
url = nil
mutex.synchronize do
url = URL.first(:locked_for_testing => false, :times_tested.lt => 150)
if url
url.locked_for_testing = true
url.save
end
end
if url
# simulate testing the url
sleep 1
url.times_tested += 1
url.save
mutex.synchronize do
url.locked_for_testing = false
url.save
end
end
end
sleep 1
end
end
threads.each { |t| t.join }
Of course there is no real URL testing here. But what should happen is at the end of the day, each URL should end up with "times_tested" equal to 150, right?
(I'm basically just trying to make sure the mutexes and worker-thread mentality are working)
But each time I run it, a few odd URLs here and there end up with times_tested equal to a much lower number, say, 37, and locked_for_testing frozen on "true"
Now as far as I can tell from my code, if any URL gets locked, it will have to unlock. So I don't understand how some URLs are ending up "frozen" like that.
There are no exceptions and I've tried adding begin/ensure but it didn't do anything.
Any ideas?
I'd use a Queue and a master thread to pull what you want. If you have a single master, you control what's getting accessed. This isn't perfect, but it's not going to blow up because of concurrency. Remember that if you aren't locking the database, a mutex doesn't really help you if something else accesses the db.
code completely untested
require 'thread'
queue = Queue.new
keep_running = true
# trap cntrl_c or something to reset keep_running
master = Thread.new do
while keep_running
# check if we need some work to do
if queue.size == 0
urls = URL.all(:times_tested.lt => 150)
urls.each do |u|
queue << u.id
end
end
# sleep on every pass so the master does not busy-wait while the queue drains
sleep(0.1)
end
end
workers = []
50.times do
workers << Thread.new do
while keep_running
# get an id
id = queue.shift
url = URL.get(id)
#do something with the url
url.save
sleep(0.1)
end
end
end
workers.each do |w|
w.join
end

Ruby threading pass control to main

I am programming an application in Ruby which creates a new thread for every new job. So this is like a queue manager, where I check how many threads can be started from a database. Now when a thread finishes, I want to call the method to start a new job (i.e. a new thread). I do not want to create nested threads, so is there any way to join/terminate/exit the calling thread and pass control over to the main thread? Just to make the situation clear, there can be other threads running at this time.
I tried simply joining the calling thread if it's not the main thread, and I get the following error:
"thread 0x7f8cf8dcf438 tried to join itself"
Any suggestions will be highly appreciated.
Thanks in advance.
I'd propose two solutions.
The first one is effectively to join on a thread, but join has to be called from the main thread (assuming you started all of your worker threads from the main thread):
def thread_proc(s)
sleep rand(5)
puts "#{Thread.current.inspect}: #{s}"
end
strings = ["word", "test", "again", "value", "fox", "car"]
threads = []
2.times {
threads << Thread.new(strings.shift) { |s| thread_proc(s) }
}
while !threads.empty?
threads.each { |t|
t.join
threads << Thread.new(strings.shift) { |s| thread_proc(s) } unless strings.empty?
threads.delete(t)
}
end
But that method is kind of inefficient, because creating threads over and over again incurs memory and CPU overhead.
A better approach is to synchronize a fixed pool of reused threads by using a Queue:
require 'thread'
strings = ["word", "test", "again", "value", "fox", "car"]
q = Queue.new
strings.each { |s| q << s }
threads = []
2.times { threads << Thread.new {
# non-blocking pop raises ThreadError on an empty queue, so no worker blocks forever
while (s = (q.pop(true) rescue nil))
sleep(rand(5))
puts "#{Thread.current.inspect}: #{s}"
end
}}
threads.each { |t| t.join }
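Another way to let a fixed pool drain a queue and then exit is to close the queue once all work has been added. The sketch below assumes Ruby 2.3 or later, where Queue#close is available. The `results` queue and the upcase step are placeholders for real work and are not part of the original answer.

```ruby
jobs = Queue.new
%w[word test again value fox car].each { |s| jobs << s }
jobs.close # no more work will be added; pop returns nil once the queue is drained

results = Queue.new # placeholder sink so the outcome can be inspected afterwards

workers = 2.times.map do
  Thread.new do
    # pop blocks while work may still arrive and returns nil after close + drain
    while (s = jobs.pop)
      results << s.upcase
    end
  end
end
workers.each(&:join)

processed = []
processed << results.pop until results.empty?
puts processed.sort.inspect
```

Because a closed, empty queue makes pop return nil, the workers need neither an empty? check nor a sentinel value to know when to stop.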
t1 = Thread.new { Thread.current[:status] = "1"; sleep 10; Thread.pass; sleep 100 }
t2 = Thread.new { Thread.current[:status] = "2"; sleep 1000 }
t3 = Thread.new { Thread.current[:status] = "3"; sleep 1000 }
sleep 0.1 # give the threads a moment to start and set their status
# compact drops the main thread, which has no :status set
puts Thread.list.map { |x| x[:status] }.compact.join(",")
#=> 1,2,3
Thread.list.each do |x|
if x[:status] == "2" # the status was stored as a string
x.kill # kill the thread
x.join # wait until it has actually terminated
break
end
end
puts Thread.list.map { |x| x[:status] }.compact.join(",")
#=> 1,3
"Thread::pass" passes control to the scheduler, which can then schedule any other thread. The thread voluntarily gives up control; we cannot specify which thread receives it.
"Thread#kill" terminates the thread it is called on.
"Thread::list" returns the list of threads.
Threads are managed by the scheduler; if you want explicit control, check out fibers. They have some gotchas, though: fibers are not supported in JRuby.
Also check out thread-local variables; they will help you communicate the status or return value of a thread without joining it.
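A short sketch of reading a thread-local status from the main thread follows. The two queues, `started` and `release`, are hypothetical helpers used only to make the timing of the reads predictable.

```ruby
started = Queue.new # hypothetical: signals that the worker has set its status
release = Queue.new # hypothetical: lets the main thread tell the worker to finish

worker = Thread.new do
  Thread.current[:status] = "working"
  started << true
  release.pop # pretend to work until the main thread releases us
  Thread.current[:status] = "done"
end

started.pop
status_while_running = worker[:status] # read without joining the thread

release << true
worker.join
status_after_finish = worker[:status]  # still readable after the thread has ended

puts status_while_running
puts status_after_finish
```

Thread#[] reads the value that the thread stored via Thread.current[]=, so the main thread can poll a worker's progress while it is still running.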
http://github.com/defunkt/resque is a good option for a queue, check it out. Also try JRuby if you are going to make heavy use of threads. Its advantage is that it wraps Java threads in Ruby goodness.

Resources