Ruby Thread Still Blocking

I'm running a single thread to print data to the screen.
The point of the thread was to stop blocking on this function so I could send data to the socket while listening for data on its way back.
def msg_loop()
  t1 = Thread.new {
    loop do
      msg = @socket.recv(30)
      self.msg_dis(msg)
    end
  }
  t1.join
end
However if I run
myclass.msg_loop
myclass.send_msg("message to send")
The function send_msg is never run; it behaves no differently than if msg_loop had no threading at all.

t1.join causes the program to wait until thread t1 has finished running; since your loop never ends, the join blocks forever. You want to do this instead:
def msg_loop()
  t1 = Thread.new {
    loop do
      msg = @socket.recv(30)
      self.msg_dis(msg)
    end
  }
  t1
end
t1 = myclass.msg_loop
myclass.send_msg("message to send")
t1.join

Ruby (MRI) doesn't provide true parallel threading (JRuby does).
With an infinite loop such as mine, joining the thread doesn't accomplish anything, because the loop never ends.
This causes the thread to never end, and thus blocking occurs.

Related

How to stop a udp_server_loop from the outside

I've written a little UDP server in Ruby:
def listen
  puts "Started UDP server on #{@port}..."
  Socket.udp_server_loop(@port) do |message, message_source|
    puts "Got \"#{message}\" from #{message_source}"
    handle_incoming_message(message)
  end
end
I start it in a separate thread:
thread = Thread.new { listen }
Is there a way to gracefully stop the udp_server_loop from outside the thread without just killing it (thread.kill)? I also don't want to stop it from the inside by receiving a UDP message. Is udp_server_loop maybe not the right tool for me?
I don’t think you can do this with udp_server_loop (although you might be able to use some of the methods it uses). You are going to have to call IO::select in a loop of your own with some way of signalling it to exit, and some way of waking the thread so you don’t have to send a packet to stop it.
A simple way would be to use the timeout option to select with a variable to set to indicate you want the thread to end, something like:
@halt_loop = false

def listen
  puts "Started UDP server on #{@port}..."
  sockets = Socket.udp_server_sockets(@port)
  loop do
    readable, _, _ = IO.select(sockets, nil, nil, 1) # timeout 1 sec
    break if @halt_loop
    next unless readable # select returns nil on timeout
    Socket.udp_server_recv(readable) do |message, message_source|
      puts "Got \"#{message}\" from #{message_source}"
      handle_incoming_message(message)
    end
  end
end
You then set @halt_loop to true when you want to stop the thread.
The downside to this is that it is effectively polling. If you decrease the timeout then you potentially do more work on an empty loop, and if you increase it you have to wait longer when stopping the thread.
Another, slightly more complex solution would be to use a pipe and have the select listen on it along with the sockets. You could then signal directly to finish the select and exit the thread.
@read, @write = IO.pipe
@halt_loop = false

def listen
  puts "Started UDP server on #{@port}..."
  sockets = Socket.udp_server_sockets(@port)
  sockets << @read
  loop do
    readable, _, _ = IO.select(sockets)
    break if @halt_loop
    readable.delete @read
    Socket.udp_server_recv(readable) do |message, message_source|
      puts "Got \"#{message}\" from #{message_source}"
      handle_incoming_message(message)
    end
  end
end

def end_loop
  @halt_loop = true
  @write.puts "STOP!"
end
To exit the thread you just call end_loop, which sets the @halt_loop flag and then writes to the pipe, making the other end readable and causing the other thread to return from select.
You could have this code check the readable IOs and exit if one of them is the read end of the pipe instead of using the variable, but at least on Linux there is a potential bug where a call to select might return a file descriptor as readable when it actually isn't. I don't know if Ruby deals with this, so better safe than sorry.
Also be sure to remove the pipe from the readable array before passing it to udp_server_recv. It’s not a socket so will cause an exception if you don’t.
A downside to this technique is that pipes are “[n]ot available on all platforms".
Although I don't quite understand what would be wrong with Thread::kill and/or Thread#exit, you might use a thread-local variable for that.
def listen
  Socket.udp_server_loop(@port) do |message, message_source|
    break :interrupted if Thread.current[:break]
    handle_incoming_message(message)
  end
end
and do
thread[:break] = true
from the outside.

How to asynchronously collect results from new threads created in real time in ruby

I would like to continuously check a table in the DB for commands to run.
Some commands might take 4 minutes to complete, some 10 seconds.
Hence I would like to run them in threads. So every record creates a new thread, and after the thread is created, the record gets removed.
Because the DB lookup + thread creation will run in an endless loop, how do I get the 'response' from a thread (each thread will issue a shell command and get a response code which I would like to read)?
I thought about creating two Threads with endless loop each:
- first for DB lookups + creating new threads
- second for ...somehow reading the threads results and acting upon each response
Or maybe I should use fork, or os spawn a new process?
You can have each thread push its results onto a Queue, then your main thread can read from the Queue. Reading from a Queue is a blocking operation by default, so if there are no results, your code will block and wait on the read.
http://ruby-doc.org/stdlib-2.0.0/libdoc/thread/rdoc/Queue.html
Here is an example:
require 'thread'

jobs = Queue.new
results = Queue.new

thread_pool = []
pool_size = 5

(1..pool_size).each do |i|
  thread_pool << Thread.new do
    loop do
      job = jobs.shift # blocks waiting for a task
      break if job == "!NO-MORE-JOBS!"
      # Otherwise, do job...
      puts "#{i}...."
      sleep rand(1..5) # Simulate the time it takes to do a job
      results << "thread#{i} finished #{job}" # Push some result from the job onto the Queue
      # Go back and get another task from the Queue
    end
  end
end

# All threads are now blocking waiting for a job...
puts 'db_stuff'
db_stuff = [
  'job1',
  'job2',
  'job3',
  'job4',
  'job5',
  'job6',
  'job7',
]

db_stuff.each do |job|
  jobs << job
end

# Threads are now attacking the Queue like hungry dogs.
pool_size.times do
  jobs << "!NO-MORE-JOBS!"
end

result_count = 0
loop do
  result = results.shift
  puts "result: #{result}"
  result_count += 1
  break if result_count == db_stuff.size
end

Is there a better way to make multiple HTTP requests asynchronously in Ruby?

I'm trying to make multiple HTTP requests in Ruby. I know it can be done in NodeJS quite easily. I'm trying to do it in Ruby using threads, but I don't know if that's the best way. I haven't had a successful run for high numbers of requests (e.g. over 50).
require 'json'
require 'net/http'
urls = [
  {"link" => "url1"},
  {"link" => "url2"},
  {"link" => "url3"}
]

urls.each do |thing|
  Thread.new do
    result = Net::HTTP.get(URI.parse(thing["link"]))
    json_stuff = JSON::parse(result)
    info = json_stuff["person"]["bio"]["info"]
    thing["name"] = info
  end
end
# Wait until threads are done.
while !urls.all? { |url| url.has_key? "name" }; end
puts urls
Any thoughts?
Instead of the while clause you used, you can call Thread#join to make the main thread wait for other threads.
threads = []
urls.each do |thing|
  threads << Thread.new do
    result = Net::HTTP.get(URI.parse(thing["link"]))
    json_stuff = JSON::parse(result)
    info = json_stuff["person"]["bio"]["info"]
    thing["name"] = info
  end
end
# Wait until threads are done.
threads.each { |aThread| aThread.join }
Your way might work, but it's going to end up in a busy loop, eating up CPU cycles when it really doesn't need to. A better way is to only check whether you're done when a request completes. One way to accomplish this would be to use a Mutex and a ConditionVariable.
Using a mutex and condition variable, we can have the main thread waiting, and when one of the worker threads receives its response, it can wake up the main thread. The main thread can then see if any URLs remain to be downloaded; if so, it'll just go to sleep again, waiting; otherwise, it's done.
To wait for a signal:
mutex.synchronize { cv.wait mutex }
To wake up the waiting thread:
mutex.synchronize { cv.signal }
You might want to check for done-ness and set thing['name'] inside the mutex.synchronize block to avoid accessing data in multiple threads simultaneously.
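A minimal sketch of this pattern, with a trivial computation standing in for the HTTP request (the names here are illustrative, not taken from the original code):

```ruby
# Each worker signals a condition variable when it finishes; the main thread
# sleeps on the condition variable instead of spinning, and re-checks
# done-ness every time it is woken up.
mutex   = Mutex.new
cv      = ConditionVariable.new
results = {}

jobs = %w[a b c]

jobs.each do |job|
  Thread.new do
    value = job.upcase # stands in for the HTTP request + JSON parsing
    mutex.synchronize do
      results[job] = value # touch shared state only while holding the lock
      cv.signal            # wake the main thread so it can re-check
    end
  end
end

mutex.synchronize do
  cv.wait(mutex) until results.size == jobs.size
end

puts results.inspect
```

Unlike the busy loop, the main thread consumes no CPU while waiting: cv.wait(mutex) atomically releases the lock and sleeps until a worker signals.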

Ruby 1.8.7: Forks & Pipes - Troubleshooting

I'm aware that there are great gems like Parallel, but I came up with the class below as an exercise.
It's working fine, but when doing a lot of iterations it sometimes happens that Ruby gets "stuck". When pressing CTRL+C I can see from the backtrace that it's always in lines 38 or 45 (the two Marshal lines).
Can you see anything that is wrong here? It seems to be that the Pipes are "hanging", so I thought I might be using them in a wrong way.
My goal was to iterate through an array (which I pass as 'objects') with a limited number of forks (max_forks) and to return some values. Additionally I wanted to guarantee that all children get killed when the parent gets killed (even in case of kill -9), which is why I introduced the "life_line" pipe (I've read here on Stack Overflow that this might do the trick).
class Parallel
  def self.do_fork(max_forks, objects)
    waiter_threads = []
    fork_counter = []
    life_line = {}
    comm_line = {}

    objects.each do |object|
      key = rand(24 ** 24).to_s(36)
      sleep(0.01) while fork_counter.size >= max_forks
      if fork_counter.size < max_forks
        fork_counter << true
        life_line[key] = {}
        life_line[key][:r], life_line[key][:w] = IO.pipe
        comm_line[key] = {}
        comm_line[key][:r], comm_line[key][:w] = IO.pipe

        pid = fork {
          life_line[key][:w].close
          comm_line[key][:r].close
          Thread.new {
            begin
              life_line[key][:r].read
            rescue SignalException, SystemExit => e
              raise e
            rescue Exception => e
              Kernel.exit
            end
          }
          Marshal.dump(yield(object), comm_line[key][:w]) # return yield
        }

        waiter_threads << Thread.new {
          Process.wait(pid)
          comm_line[key][:w].close
          reply = Marshal.load(comm_line[key][:r])
          # process reply here
          comm_line[key][:r].close
          life_line[key][:r].close
          life_line[key][:w].close
          life_line[key] = nil
          fork_counter.pop
        }
      end
    end

    waiter_threads.each { |k| k.join } # wait for all threads to finish
  end
end
The bug was this:
A pipe can only buffer a certain amount of data (e.g. 64 KB).
Once you write more than that while nothing reads from the other end, the write blocks forever.
An easy solution is to read the pipe in a thread before you start writing to it.
comm_line = IO.pipe

# Buffered pipe reading (in case the data is bigger than 64 KB)
reply = ""
read_buffer = Thread.new {
  while !comm_line[0].eof?
    reply = Marshal.load(comm_line[0])
  end
}

child_pid = fork {
  comm_line[0].close # child only writes
  Marshal.dump("HUGE DATA LARGER THAN 64 KB", comm_line[1])
}

Process.wait(child_pid)
comm_line[1].close # close the write end so the reader thread sees EOF
read_buffer.join
comm_line[0].close
puts reply # outputs the "HUGE DATA"
I don't think the problem is with Marshal. The more obvious one seems to be that your fork may finish execution before the waiter thread gets to it (leading the latter to wait forever).
Try changing Process.wait(pid) to Process.wait(pid, Process::WNOHANG). The Process::WNOHANG flag instructs Ruby to not hang if there are no children (matching the given PID, if any) available. Note that this may not be available on all platforms but at the very least should work on Linux.
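The effect of the flag can be seen in isolation (a sketch independent of the fork/pipe code above; it requires a platform with fork):

```ruby
# Process::WNOHANG makes wait non-blocking: it returns nil while the child is
# still running, and the child's pid once the child has exited.
pid = fork { sleep 1 }

early = Process.wait(pid, Process::WNOHANG) # child still alive => nil
sleep 2                                     # give the child time to exit
late  = Process.wait(pid, Process::WNOHANG) # child has exited => pid

p early # nil
p late  # same value as pid
```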
There's a number of other potential problems with your code but if you just came up with it "as an exercise", they probably don't matter. For example, Marshal.load does not like to encounter EOFs, so I'd probably guard against those by saying something like Marshal.load(comm_line[key][:r]) unless comm_line[key][:r].eof? or loop until comm_line[key][:r].eof? if you expect there to be several objects to be read.
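The eof? guard can be sketched on a plain pipe (standing in for the comm_line pipes above):

```ruby
# Marshal.load raises EOFError on an exhausted stream, so loop until eof?
# when several marshalled objects may be waiting on the pipe.
r, w = IO.pipe

Marshal.dump([1, 2, 3], w)
Marshal.dump({ "status" => "ok" }, w)
w.close # close the write end so the reader sees EOF

objects = []
objects << Marshal.load(r) until r.eof?
r.close

p objects # [[1, 2, 3], {"status"=>"ok"}]
```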

ruby eventmachine http-request deferrable

This is my first time with EM, so I really need some help here.
so here's the code:
EM.run do
  queue = EM::Queue.new
  EM.start_server('0.0.0.0', '9000', RequestHandler, queue)
  puts 'Server started on localhost:9000' # Any interface, actually

  process_queue = proc do |url|
    request = EM::HttpRequest.new(url, connect_timeout: 1).get # No time to wait, sorry
    request.callback do |http| # deferrable
      puts http.response_header.status
    end
    queue.pop(&process_queue)
  end

  EM.next_tick { queue.pop(&process_queue) }
end
I've read a couple of articles about EM, now my understanding of above code is the following:
EM::HttpRequest is deferrable, which means it won't block the reactor.
But when I try running 50 concurrent connections with ab, it only serves ~20 concurrently ( according to ab report ).
But if I place the process_queue execution inside EM.defer (which means it will run in a separate thread?), it performs just fine.
Why is it so? process_queue just inits a deferrable object and assigns a callback, how does running it inside EM.defer makes a difference?
One thing you may want to do is put the queue.pop(&process_queue) in the process_queue callback inside an EM.next_tick. Currently you're going to process all of the queued connections before you allow anything new to connect. If you put the queue.pop into a next_tick call you'll let the reactor do some work before you process the next item.
