Ruby thread callback weird behaviour - ruby

My current goal is a class that holds some threads, performs tasks in them, and finally calls a callback method; nothing special so far.
My experimental class runs connection checks against specific ports of a given IP to give me status information.
My attempt:
check = ConnectionChecker.new do |threads|
  # i am done callback
end
check.check_connectivity(ip0, port0, timeout0, identifier0)
check.check_connectivity(ip1, port1, timeout1, identifier1)
check.check_connectivity(ip2, port2, timeout2, identifier2)
sleep while not check.is_done
Maybe not the best approach, but in general it fits in my case.
So what's happening:
In my Class I store a callback, perform actions and do internal stuff:
Thread.new -> success/failure -> mark as done, when all done -> call callback:
require 'timeout'
class ConnectionChecker
  attr_reader :is_done
  def initialize(&callback)
    @callback = callback
    @thread_count = 0
    @threads = []
    @is_done = false
  end
  def check_connectivity(host, port, timeout, ident)
    @thread_count += 1
    @threads << Thread.new do
      status = false
      pid = Process.spawn("nc -z #{host} #{port} >/dev/null")
      begin
        Timeout.timeout(timeout) do
          Process.wait(pid)
          status = true
        end
      rescue Timeout::Error
        # Timeout.timeout raises Timeout::Error when the block runs too long
        Process.kill('TERM', pid)
      end
      mark_as_done
      # return value for the callback
      [status, ident]
    end
  end
  # one less to go..
  def mark_as_done
    @thread_count -= 1
    if @thread_count.zero?
      @is_done = true
      @callback.call(@threads)
    end
  end
end
This code - yes, I know there is no start method so I have to trust that I call it all quite instantly - works fine.
But when I swap these 2 lines:
@is_done = true
@callback.call(@threads)
to
@callback.call(@threads)
@is_done = true
then the very last line,
sleep while not check.is_done
becomes an endless loop. Debugging shows that the callback is called properly, yet whenever I check the value of is_done it is still false. Since I don't capture it in a closure, I wonder why this is happening.
The callback itself can also be empty, is_done remains false (so there is no mis-caught exception).
In this case I noticed that the last thread was at status running. Since I did not ask for the thread's value, I just don't get the hang here.
Any documentation/information regarding this problem? Also, a name for it would be fine.

Try using Mutex to ensure thread safety :)
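As a rough illustration of that suggestion, here is a minimal sketch of a completion counter guarded by a Mutex, with a ConditionVariable replacing the `sleep while not ...` polling. The class name `DoneTracker` and its methods are hypothetical and not part of the question's code, and the short `sleep` stands in for the real connection check. It is not a diagnosis of the hang described above; it only shows one way to make the counter and the done flag safe to touch from several threads.

```ruby
# Hypothetical helper: counts outstanding tasks under a Mutex and lets a
# caller block until all of them have reported in.
class DoneTracker
  def initialize(&callback)
    @callback = callback
    @pending = 0
    @done = false
    @lock = Mutex.new
    @all_done = ConditionVariable.new
  end

  # Register one more task before its thread is started.
  def add
    @lock.synchronize { @pending += 1 }
  end

  # Called by each worker thread when it has finished.
  def mark_as_done
    last = @lock.synchronize do
      @pending -= 1
      @pending.zero?
    end
    return unless last

    begin
      # Run the callback outside the lock so it cannot block other threads.
      @callback.call if @callback
    ensure
      # Set the flag even if the callback raised, so waiters are released.
      @lock.synchronize do
        @done = true
        @all_done.broadcast
      end
    end
  end

  # Block the caller until the last task has finished and the callback ran.
  def wait
    @lock.synchronize do
      @all_done.wait(@lock) until @done
    end
  end

  def done?
    @lock.synchronize { @done }
  end
end

callback_runs = Queue.new
tracker = DoneTracker.new { callback_runs << :ran }
3.times { tracker.add }
threads = 3.times.map do |i|
  Thread.new do
    sleep(0.01 * i) # stand-in for the real connection check
    tracker.mark_as_done
  end
end
tracker.wait
threads.each(&:join)
```

Note that the callback here runs before the done flag is set, and the waiting thread is still released, because it waits on the condition variable instead of sleeping indefinitely.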

Related

How to stop a udp_server_loop from the outside

I've written a little UDP server in Ruby:
def listen
  puts "Started UDP server on #{@port}..."
  Socket.udp_server_loop(@port) do |message, message_source|
    puts "Got \"#{message}\" from #{message_source}"
    handle_incoming_message(message)
  end
end
I start it in a separate thread:
thread = Thread.new { listen }
Is there a way to gracefully stop the udp_server_loop from outside the thread without just killing it (thread.kill)? I also don't want to stop it from the inside by receiving a UDP message. Is udp_server_loop maybe not the right tool for me?
I don’t think you can do this with udp_server_loop (although you might be able to use some of the methods it uses). You are going to have to call IO::select in a loop of your own with some way of signalling it to exit, and some way of waking the thread so you don’t have to send a packet to stop it.
A simple way would be to use the timeout option to select with a variable to set to indicate you want the thread to end, something like:
@halt_loop = false
def listen
  puts "Started UDP server on #{@port}..."
  sockets = Socket.udp_server_sockets(@port)
  loop do
    readable, _, _ = IO.select(sockets, nil, nil, 1) # timeout 1 sec
    break if @halt_loop
    next unless readable # select returns nil on timeout
    Socket.udp_server_recv(readable) do |message, message_source|
      puts "Got \"#{message}\" from #{message_source}"
      handle_incoming_message(message)
    end
  end
end
You then set @halt_loop to true when you want to stop the thread.
The downside to this is that it is effectively polling. If you decrease the timeout then you potentially do more work on an empty loop, and if you increase it you have to wait longer when stopping the thread.
Another, slightly more complex solution would be to use a pipe and have the select listen on it along with the sockets. You could then signal directly to finish the select and exit the thread.
@read, @write = IO.pipe
@halt_loop = false
def listen
  puts "Started UDP server on #{@port}..."
  sockets = Socket.udp_server_sockets(@port)
  sockets << @read
  loop do
    readable, _, _ = IO.select(sockets)
    break if @halt_loop
    readable.delete @read
    Socket.udp_server_recv(readable) do |message, message_source|
      puts "Got \"#{message}\" from #{message_source}"
      handle_incoming_message(message)
    end
  end
end
def end_loop
  @halt_loop = true
  @write.puts "STOP!"
end
To exit the thread you just call end_loop, which sets the @halt_loop flag and then writes to the pipe, making the other end readable and causing the other thread to return from select.
You could have this code check the readable IOs and exit if one of them is the read end of the pipe instead of using the variable, but at least on Linux there is a potential bug where a call to select might return a file descriptor as readable when it actually isn’t. I don’t know if Ruby deals with this, so better safe than sorry.
Also be sure to remove the pipe from the readable array before passing it to udp_server_recv. It’s not a socket so will cause an exception if you don’t.
A downside to this technique is that pipes are “[n]ot available on all platforms”.
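The fragments above rely on instance variables from an enclosing object that is not shown. Here is a minimal self-contained sketch of the pipe technique under some assumptions: the class name `UdpListener` and the methods `stop` and `close` are hypothetical, the server binds to 127.0.0.1 on a port chosen by the OS, and incoming messages are simply pushed onto a queue in place of `handle_incoming_message`.

```ruby
require "socket"

# Hypothetical wrapper class; the names UdpListener, received, stop and close
# are illustrative and do not come from the answer above.
class UdpListener
  attr_reader :port, :received

  def initialize(host = "127.0.0.1")
    # Port 0 asks the OS for any free port.
    @sockets = Socket.udp_server_sockets(host, 0)
    @port = @sockets.first.local_address.ip_port
    @wake_read, @wake_write = IO.pipe
    @halt_loop = false
    @received = Queue.new
  end

  # Runs until stop is called from another thread.
  def listen
    loop do
      readable, = IO.select(@sockets + [@wake_read])
      break if @halt_loop
      # The pipe is not a socket, so keep it away from udp_server_recv.
      readable.delete(@wake_read)
      Socket.udp_server_recv(readable) do |message, _source|
        @received << message # stand-in for handle_incoming_message
      end
    end
  end

  # Set the flag first, then write to the pipe to wake up select.
  def stop
    @halt_loop = true
    @wake_write.puts "STOP!"
  end

  # Release the sockets and the pipe once the listening thread has exited.
  def close
    (@sockets + [@wake_read, @wake_write]).each(&:close)
  end
end

listener = UdpListener.new
thread = Thread.new { listener.listen }
listener.stop   # wakes the select call without sending a UDP packet
thread.join(2)  # give the thread up to two seconds to exit
listener.close
```

Because the byte written by `stop` stays in the pipe, the loop still exits if `stop` happens to run before the thread reaches `IO.select`.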
Although I doubt I understand what would be wrong with Thread::kill and/or Thread#exit, you might use the thread local variable for that.
def listen
  Socket.udp_server_loop(@port) do |message, message_source|
    break :interrupted if Thread.current[:break]
    handle_incoming_message(message)
  end
end
and do
thread[:break] = true
from the outside.

Thread in Parallel gem Ruby

I am using the sidekiq gem for queueing, and I want to process the work inside a job in parallel.
Here is the code for my job:
def perform(disbursement_id)
  # some logic...
  Parallel.each(disbursement.employee_disbursements, in_threads: 2) do |employee|
    amount = amount_format(employee.amount)
    res = unload_company_account(cmp_acc_id, amount.to_s)
    load_employee_account(employee) unless res.empty?
  end
end
When I use Parallel.each() without threads it works fine, but when I use Parallel.each(.., in_threads: 3) the job stays in the queue's busy state.
I am not sure why in_threads leaves the job busy, and I have not been able to resolve it.
Try the following to make it work:
Parallel.each(disbursement.employee_disbursements, in_threads: 2) do |employee|
  ActiveRecord::Base.connection_pool.with_connection do
    amount = amount_format(employee.amount)
    res = unload_company_account(cmp_acc_id, amount.to_s)
    load_employee_account(employee) unless res.empty?
  end
end
Also, the issue goes away when you use map instead of each, or pass the preserve_results option as either true or false. That is a bit of a mystery, because each is defined as:
def each(array, options={}, &block)
  map(array, options.merge(:preserve_results => false), &block)
end

Periodically checking if a sidekiq job has been cancelled

Jobs in sidekiq are supposed to check whether they have been cancelled, but if I have a long-running job, I'd like it to check itself periodically. This example does not work: I've not wrapped the fake work in any sort of future within which I can raise an exception, and I'm not sure that is even possible. How might I do this?
class ThingWorker
  def perform(phase, id)
    thing = Thing.find(id)
    # schedule the initial check
    schedule_cancellation_check(thing.updated_at, id)
    # maybe wrap this in something I can raise an exception within?
    sleep 10 # fake work
    @done = true
    return true
  end
  def schedule_cancellation_check(initial_time, thing_id)
    Concurrent.schedule(5) {
      # just check right away...
      return if @done
      # if our thing has been updated since we started this job, kill this job!
      if Thing.find(thing_id).updated_at != initial_time
        cancel!
      # otherwise, schedule the next check
      else
        schedule_cancellation_check(initial_time, thing_id)
      end
    }
  end
  # as per sidekiq wiki
  def cancelled?
    @cancelled
    Sidekiq.redis {|c| c.exists("cancelled-#{jid}") }
  end
  def cancel!
    @cancelled = true
    # not sure what this does besides marking the job as cancelled tho, read source
    Sidekiq.redis {|c| c.setex("cancelled-#{jid}", 86400, 1) }
  end
end
You're thinking about this way too hard. Your worker should be a loop and check for cancellation every iteration.
def perform(thing_id, updated_at)
  thing = Thing.find(thing_id)
  while !cancel?(thing, updated_at)
    # do something
  end
end
def cancel?(thing, last_updated_at)
  thing.reload.updated_at > last_updated_at
end
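To make the "check every iteration" idea concrete without Sidekiq or a database, here is a small plain-Ruby sketch. Everything in it is an assumption for illustration: the class `ChunkedJob`, its in-memory `@cancelled` flag (a real worker would consult Redis or the record's updated_at, as in the code above), and the block passed to `perform`, which exists only to simulate a cancellation arriving mid-run.

```ruby
# Hypothetical stand-in for a long-running job; no Sidekiq or database involved.
class ChunkedJob
  attr_reader :completed

  def initialize(chunks)
    @chunks = chunks
    @completed = []
    @cancelled = false
  end

  # May be called from outside while perform is running.
  def cancel!
    @cancelled = true
  end

  def cancelled?
    @cancelled
  end

  # Processes the chunks in order and stops at the first iteration that
  # finds the job cancelled.
  def perform
    @chunks.each do |chunk|
      break if cancelled?         # checked once per iteration
      @completed << chunk         # stand-in for one slice of real work
      yield chunk if block_given? # hook used below to simulate a cancel
    end
  end
end

job = ChunkedJob.new([1, 2, 3, 4, 5])
job.perform { |chunk| job.cancel! if chunk == 3 }
job.completed # => [1, 2, 3]
```

The smaller each slice of work is, the sooner a cancellation takes effect, at the cost of checking the flag more often.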

Ruby Pause thread

In ruby, is it possible to cause a thread to pause from a different concurrently running thread.
Below is the code that I've written so far. I want the user to be able to type 'pause thread' and the sample500 thread to pause.
#!/usr/bin/env ruby
# Creates a new thread executes the block every intervalSec for durationSec.
def DoEvery(thread, intervalSec, durationSec)
  thread = Thread.new do
    start = Time.now
    timeTakenToComplete = 0
    loopCounter = 0
    while(timeTakenToComplete < durationSec && loopCounter += 1)
      yield
      finish = Time.now
      timeTakenToComplete = finish - start
      sleep(intervalSec*loopCounter - timeTakenToComplete)
    end
  end
end
# User input loop.
exit = nil
while(!exit)
  userInput = gets
  case userInput
  when "start thread\n"
    sample500 = Thread
    beginTime = Time.now
    DoEvery(sample500, 0.5, 30) {File.open('abc', 'a') {|file| file.write("a\n")}}
  when "pause thread\n"
    sample500.stop
  when "resume thread"
    sample500.run
  when "exit\n"
    exit = TRUE
  end
end
Passing a Thread object as an argument to DoEvery makes no sense, because you immediately overwrite it with Thread.new. Check out this modified version:
def DoEvery(intervalSec, durationSec)
  thread = Thread.new do
    start = Time.now
    Thread.current["stop"] = false
    timeTakenToComplete = 0
    loopCounter = 0
    while(timeTakenToComplete < durationSec && loopCounter += 1)
      # pause here if another thread has asked us to
      if Thread.current["stop"]
        Thread.current["stop"] = false
        puts "paused"
        Thread.stop
      end
      yield
      finish = Time.now
      timeTakenToComplete = finish - start
      sleep(intervalSec*loopCounter - timeTakenToComplete)
    end
  end
  thread
end
# User input loop.
exit = nil
while(!exit)
  userInput = gets
  case userInput
  when "start thread\n"
    sample500 = DoEvery(0.5, 30) {File.open('abc', 'a') {|file| file.write("a\n")} }
  when "pause thread\n"
    sample500["stop"] = true
  when "resume thread\n"
    sample500.run
  when "exit\n"
    exit = true
  end
end
Here DoEvery returns the new thread object. Also note that Thread.stop is called from inside the running thread; you can't directly stop one thread from another because it is not safe.
You may be better able to accomplish what you are attempting using Ruby's Fiber object, and likely achieve better efficiency on the running system.
Fibers are primitives for implementing light weight cooperative
concurrency in Ruby. Basically they are a means of creating code
blocks that can be paused and resumed, much like threads. The main
difference is that they are never preempted and that the scheduling
must be done by the programmer and not the VM.
Keep in mind that the current implementation of MRI Ruby does not run Ruby threads in parallel: its global lock lets only one thread execute Ruby code at a time, so the best you can get from threads is interleaved execution. With that in mind, the following is a nice example:
require "fiber"
f1 = Fiber.new { |f2| f2.resume Fiber.current; while true; puts "A"; f2.transfer; end }
f2 = Fiber.new { |f1| f1.transfer; while true; puts "B"; f1.transfer; end }
f1.resume f2 # =>
# A
# B
# A
# B
# .
# .
# .

Odd bug with DataMapper, Mutexes, and Threads?

I have a database full of URLs that I need to test HTTP response time for on a regular basis. I want to have many worker threads combing the database at all times for a URL that hasn't been tested recently, and if it finds one, test it.
Of course, this could cause multiple threads to snag the same URL from the database. I don't want this. So, I'm trying to use Mutexes to prevent this from happening. I realize there are other options at the database level (optimistic locking, pessimistic locking), but I'd at least prefer to figure out why this isn't working.
Take a look at this test code I wrote:
threads = []
mutex = Mutex.new
50.times do |i|
  threads << Thread.new do
    while true do
      url = nil
      mutex.synchronize do
        url = URL.first(:locked_for_testing => false, :times_tested.lt => 150)
        if url
          url.locked_for_testing = true
          url.save
        end
      end
      if url
        # simulate testing the url
        sleep 1
        url.times_tested += 1
        url.save
        mutex.synchronize do
          url.locked_for_testing = false
          url.save
        end
      end
    end
    sleep 1
  end
end
threads.each { |t| t.join }
Of course there is no real URL testing here. But what should happen is at the end of the day, each URL should end up with "times_tested" equal to 150, right?
(I'm basically just trying to make sure the mutexes and worker-thread mentality are working)
But each time I run it, a few odd URLs here and there end up with times_tested equal to a much lower number, say, 37, and locked_for_testing frozen on "true"
Now as far as I can tell from my code, if any URL gets locked, it will have to unlock. So I don't understand how some URLs are ending up "frozen" like that.
There are no exceptions and I've tried adding begin/ensure but it didn't do anything.
Any ideas?
I'd use a Queue and a master thread that pulls the work you want. With a single master, you control what is being accessed. This isn't perfect, but it's not going to blow up because of concurrency. Remember that if you aren't locking the database, a mutex doesn't really help you if something else accesses the db.
The code is completely untested:
require 'thread'
queue = Queue.new
keep_running = true
# trap cntrl_c or something to reset keep_running
master = Thread.new do
  while keep_running
    # check if we need some work to do
    if queue.size == 0
      urls = URL.all(:times_tested.lt => 150)
      urls.each do |u|
        queue << u.id
      end
      # keep from spinning the queue
      sleep(0.1)
    end
  end
end
workers = []
50.times do
  workers << Thread.new do
    while keep_running
      # get an id
      id = queue.shift
      url = URL.get(id)
      #do something with the url
      url.save
      sleep(0.1)
    end
  end
end
workers.each do |w|
  w.join
end
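One thing to watch in the untested code above is that queue.shift blocks while the queue is empty, so a worker waiting there will not notice keep_running becoming false. Here is a minimal runnable sketch of the same queue-and-workers idea that shuts down cleanly using Queue#close (available since Ruby 2.3). It makes several assumptions: plain integers stand in for URL ids, no database or DataMapper is involved, and the `processed` queue exists only so the demo can show that each id was handed to exactly one worker.

```ruby
# Plain integers stand in for URL ids; no database is involved here.
queue = Queue.new
processed = Queue.new # thread-safe collector used only for this demo

workers = 5.times.map do
  Thread.new do
    # pop returns nil once the queue is closed and drained, ending the loop.
    while (id = queue.pop)
      processed << id # stand-in for "test the URL and save the result"
    end
  end
end

(1..20).each { |id| queue << id } # the "master" hands out the work
queue.close                        # no more work: lets idle workers exit
workers.each(&:join)

handled = []
handled << processed.pop until processed.empty?
```

Because each id is popped by only one worker, no mutex around the hand-out is needed; the queue does the locking internally.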

Resources