I've written little UDP server in Ruby:
def listen
puts "Started UDP server on #{#port}..."
Socket.udp_server_loop(#port) do |message, message_source|
puts "Got \"#{message}\" from #{message_source}"
handle_incoming_message(message)
end
end
I start it in a separate thread:
thread = Thread.new { listen }
Is there a way to gracefully stop the udp_server_loop from outside the thread without just killing it (thread.kill)? I also dont't want to stop it from the inside by receiving any UDP message. Is udp_server_loop maybe not the right tool for me?
I don’t think you can do this with udp_server_loop (although you might be able to use some of the methods it uses). You are going to have to call IO::select in a loop of your own with some way of signalling it to exit, and some way of waking the thread so you don’t have to send a packet to stop it.
A simple way would be to use the timeout option to select with a variable to set to indicate you want the thread to end, something like:
#halt_loop = false
def listen
puts "Started UDP server on #{#port}..."
sockets = Socket.udp_server_sockets(#port)
loop do
readable, _, _ = IO.select(sockets, nil, nil, 1) # timeout 1 sec
break if #halt_loop
next unless readable # select returns nil on timeout
Socket.udp_server_recv(readable) do |message, message_source|
puts "Got \"#{message}\" from #{message_source}"
handle_incoming_message(message)
end
end
end
You then set #halt_loop to true when you want to stop the thread.
The downside to this is that it is effectively polling. If you decrease the timeout then you potentially do more work on an empty loop, and if you increase it you have to wait longer when stopping the thread.
Another, slightly more complex solution would be to use a pipe and have the select listen on it along with the sockets. You could then signal directly to finish the select and exit the thread.
#read, #write = IO.pipe
#halt_loop = false
def listen
puts "Started UDP server on #{#port}..."
sockets = Socket.udp_server_sockets(#port)
sockets << #read
loop do
readable, _, _ = IO.select(sockets)
break if #halt_loop
readable.delete #read
Socket.udp_server_recv(readable) do |message, message_source|
puts "Got \"#{message}\" from #{message_source}"
handle_incoming_message(message)
end
end
end
def end_loop
#halt_loop = true
#write.puts "STOP!"
end
To exit the thread you just call end_loop which sets the #halt_loop flag then writes to the pipe, making the other end readable and causing the other thread to return from select.
You could have this code check the readable IOs and exit if one of them is the read end of the pipe instead of using the variable, but at least on Linux there is a potential bug where a call to select might return a file descriptor as readable when it actuallt isn’t. I don’t know if Ruby deals with this, so better safe than sorry.
Also be sure to remove the pipe from the readable array before passing it to udp_server_recv. It’s not a socket so will cause an exception if you don’t.
A downside to this technique is that pipes are “[n]ot available on all platforms".
Although I doubt I understand what would be wrong with Thread::kill and/or Thread#exit, you might use the thread local variable for that.
def listen
Socket.udp_server_loop(#port) do |message, message_source|
break :interrupted if Thread.current[:break]
handle_incoming_message(message)
end
end
and do
thread[:break] = true
from the outside.
Related
I'm new both to IMAP and multi-thread programming, and I'd like to write a script to fetch incoming mails and process them in parallel.
Thread, Queue, Mutex and Monitor are new concepts to me, I may miss-use them in the following question and example.
The script's goal is to:
receive IDLE notifications from the IMAP server when a new mail arrives
fetch the corresponding mail
parse its body
generate a PDF from parsed data
send the report through SMTP
Another aproach could be to fetch and identify mails based on their UID, thought learning IMAP & multi-thread programing is another goal.
Here's what I've got so far (IMAP part):
def self.idle(imap)
#imap = Net::IMAP.new 'mail.company.org', 143, false # Not using TLS for test purposes
#imap.login 'username', 'password'
#idler_listener = Thread.new do
loop do
begin
imap.examine 'INBOX'
imap.idle do |res|
if res.kind_of?(Net::IMAP::UntaggedResponse) and res.name == 'EXISTS'
imap.idle_done
Thread.new { process_email(imap) }
end
end
rescue => e
puts e.inspect
end
end
end.join
end
def self.process_email(imap)
begin
uid = imap.uid_search(['SUBJECT', 'MyFavoriteSubject']).last
mail = imap.uid_fetch(uid, 'BODY[TEXT]')[0].attr['BODY[TEXT]']
puts mail
rescue => e
puts e.inspect
end
end
This example successfully prints out the body of an incoming mail. However, if several mail arrives at the same time, only one will be treated.
Q:
Is this behavior due to the fact that the loop is executed in a single thread ?
Does it illustrate the need for a queue, a thread pool, or a producer / consumer pattern ?
Chances that this script will attempt to access the same resource at the same time are low, however, would Mutex.new.synchronize protect me against race conditions & deadlocks ?
(Eventhough I've not fully understand it, I think it's worth sharing this great resource from Masatoshi Seki: https://www.druby.org/sidruby/)
I'm aware that there are great gems like Parallel, but I came up with the class below as an exercise.
It's working fine, but when doing a lot of iterations it happens sometimes that Ruby will get "stuck". When pressing CTRL+C I can see from the backtrace that it's always in lines 38 or 45 (the both Marshal lines).
Can you see anything that is wrong here? It seems to be that the Pipes are "hanging", so I thought I might be using them in a wrong way.
My goal was to iterate through an array (which I pass as 'objects') with a limited number of forks (max_forks) and to return some values. Additionally I wanted to guarantee that all childs get killed when the parent gets killed (even in case of kill -9), this is why I introduced the "life_line" Pipe (I've read here on Stackoverflow that this might do the trick).
class Parallel
def self.do_fork(max_forks, objects)
waiter_threads = []
fork_counter = []
life_line = {}
comm_line = {}
objects.each do |object|
key = rand(24 ** 24).to_s(36)
sleep(0.01) while fork_counter.size >= max_forks
if fork_counter.size < max_forks
fork_counter << true
life_line[key] = {}
life_line[key][:r], life_line[key][:w] = IO.pipe
comm_line[key] = {}
comm_line[key][:r], comm_line[key][:w] = IO.pipe
pid = fork {
life_line[key][:w].close
comm_line[key][:r].close
Thread.new {
begin
life_line[key][:r].read
rescue SignalException, SystemExit => e
raise e
rescue Exception => e
Kernel.exit
end
}
Marshal.dump(yield(object), comm_line[key][:w]) # return yield
}
waiter_threads << Thread.new {
Process.wait(pid)
comm_line[key][:w].close
reply = Marshal.load(comm_line[key][:r])
# process reply here
comm_line[key][:r].close
life_line[key][:r].close
life_line[key][:w].close
life_line[key] = nil
fork_counter.pop
}
end
end
waiter_threads.each { |k| k.join } # wait for all threads to finish
end
end
The bug was this:
A pipe can handle only a certain amount of data (e.g. 64 KB).
Once you write more than that, the Pipe will get "stuck" forever.
An easy solution is to read the pipe in a thread before you start writing to it.
comm_line = IO.pipe
# Buffered Pipe Reading (in case bigger than 64 KB)
reply = ""
read_buffer = Thread.new {
while !comm_line[0].eof?
reply = Marshal.load(comm_line[0])
end
}
child_pid = fork {
comm_line[0].close
comm_line[0].write "HUGE DATA LARGER THAN 64 KB"
}
Process.wait(child_pid)
comm_line[1].close
read_buffer.join
comm_line[0].close
puts reply # outputs the "HUGE DATA"
I don't think the problem is with Marshal. The more obvious one seems to be that your fork may finish execution before the waiter thread gets to it (leading to the latter to wait forever).
Try changing Process.wait(pid) to Process.wait(pid, Process::WNOHANG). The Process::WNOHANG flag instructs Ruby to not hang if there are no children (matching the given PID, if any) available. Note that this may not be available on all platforms but at the very least should work on Linux.
There's a number of other potential problems with your code but if you just came up with it "as an exercise", they probably don't matter. For example, Marshal.load does not like to encounter EOFs, so I'd probably guard against those by saying something like Marshal.load(comm_line[key][:r]) unless comm_line[key][:r].eof? or loop until comm_line[key][:r].eof? if you expect there to be several objects to be read.
That's my first time with EM so I really need some help here
so here's the code:
EM.run do
queue = EM::Queue.new
EM.start_server('0.0.0.0', '9000', RequestHandler, queue)
puts 'Server started on localhost:9000' # Any interface, actually
process_queue = proc do |url|
request = EM::HttpRequest.new(url, connect_timeout: 1).get # No time to wait, sorry
request.callback do |http| # deferrable
puts http.response_header.status
end
queue.pop(&process_queue)
end
EM.next_tick { queue.pop(&process_queue) }
end
I've read a couple of articles about EM, now my understanding of above code is the following:
EM::HttpRequest is deferrable, which means it won't block a reactor.
But when I try running 50 concurrent connections with ab, it only serves ~20 concurrently ( according to ab report ).
But if I place the process_queue execution inside EM.defer( which means it will run in a separate thread? ) it performs just fine.
Why is it so? process_queue just inits a deferrable object and assigns a callback, how does running it inside EM.defer makes a difference?
One thing you may want to do is put the queue.pop(&process_queue) in the process_queue callback inside an EM.next_tick. Currently you're going to process all of the queued connections before you allow anything new to connect. If you put the queue.pop into a next_tick call you'll let the reactor do some work before you process the next item.
I have a circumstance where my server may close TCPServer and restart, saving all the users to a file, and immediately reloading them; their connections do not sever.
The problem is I can't seem to reinitialize their streams.
When we restart (and attempt to maintain connections), I reinitialize TCPServer, and load my array of connected users – Since these each have an existing socket address, stored as <TCPSocket:0x00000000000000>, can I reinitialize these addresses with TCPServer?
Normally, each user connects and is accepted:
$nCS = TCPServer.new(HOST, PORT)
begin
while socket = $nCS.accept
Thread.new( socket ) do |sock|
begin
d = User.new(sock)
while sock.gets
szIn = $_.chomp
DBG( "Received '" + szIn + "' from Client " + sock.to_s )
d.parseInput( szIn )
end
rescue => e
$stdout.puts "ERROR: Caught error in Client Thread: #{e} \r\n #{e.backtrace.to_s.gsub(",", ",\r\n")}"
sock.write("Sorry, an error has occurred, and you have been disconnected."+EOL+"Please try again later."+EOL)
d.closeConnection
end
end
end
rescue => e
$stdout.puts "ERROR: Caught error in Server Thread: #{e} \r\n #{e.backtrace.to_s.gsub(",", ",\r\n")}"
exit
end
To give it a command to hot reboot, we use exec('./main --copyover') to flag that a copy over is occurring.
If $connected holds an array of all users, and each user has a socket, how do I reinitialize the socket that was open before the restart (assuming the other end is still connected)?
I suspect that using exec("./main", "--copyover", *$nCS, *$connected) is getting me closer, since this simply replaces the process, and should maintain the files (not close them).
You can't. The socket is only valid for the lifetime of the process: it is closed by the OS when the process exits. That in turn invalidates the connection, so the other end is not still connected.
How to Hot-Reboot a TCPServer in Ruby
Hot-Rebooting (aka Copyover) is a process by which an administrator can reload the application (along with any new changes made since last boot) without losing the client connections. This is useful in managing customer expectations as the application does not need to suffer severe downtime and disruption if in use.
What I propose below may not be the best practice, but it's functioning and perhaps will guide others to a similar solution.
The Command
I use a particular style of coding that makes use of command tables to find functions and their accessibility. All command functions are prefixed with cmd. I'll clean up the miscellany to improve readability:
def cmdCopyover
#$nCS is the TCPServer object
#$connected holds an array of all users sockets
#--copyover flags that this is a hot reboot.
connected_args = $connected.map do |sock|
sock.close_on_exec = false if sock.respond_to?(:close_on_exec=)
sock.fileno.to_s
end.join(",")
exec('./main.rb', '--copyover', $nCS.fileno.to_s, connected_args)
end
What we're passing are strings; $nCS.fileno.to_s provides us the file descriptor of the main TCPServer object, while connected_args is a comma-delineated list of file descriptors for each user connected. When we restart, ARGV will be an array holding each argument:
ARGV[0] == "--copyover"
ARGV[1] == "5" (Or whatever the file descriptor for TCPServer was)
ARGV[2] == "6,7,8,9" (Example, assuming 4 connected users)
What To Expect When You're Expecting (a Copyover)
Under normal circumstances, we may have a basic server (in main.rb that looks something like this:
puts "Starting Server"
$connected = Array.new
$nCS = TCPServer.new("127.0.0.1",9999)
begin
while socket = $nCS.accept
# NB: Move this loop to its own function, threadLoop()
Thread.new( socket ) do |sock|
begin
while sock.gets
szIn = $_.chomp
#do something with input.
end
rescue => e
puts "ERROR: Caught error in Client Thread: #{e}"
puts #{e.backtrace.to_s.gsub(",", ",\r\n")}"
sock.write("Sorry, an error has occurred, and you have been disconnected."+EOL+"Please try again later."+EOL)
sock.close
end
end
end
rescue => e
puts "Error: Caught Error in Server Thread: #{e}"
puts "#{e.backtrace.to_s.gsub(",", ",\r\n")}"
exit
end
We want to move that main loop to its own function to make it accessible -- our reconnecting users will need to be reinserted in the loop.
So let's get main.rb ready for accepting a hot reboot:
def threadLoop( socket )
Thread.new( socket ) do |sock|
begin
while sock.gets
szIn = $_.chomp
#do something with input.
end
rescue => e
puts "ERROR: Caught error in Client Thread: #{e}"
puts #{e.backtrace.to_s.gsub(",", ",\r\n")}"
sock.write("Sorry, an error has occurred, and you have been disconnected."+EOL+"Please try again later."+EOL)
sock.close
end
end
end
puts "Starting Server"
$connected = Array.new
if ARGV[0] == '--copyover'
$nCS = TCPServer.for_fd( ARGV[1].to_i )
$nCS.close_on_exec = false if $nCS.respond_to?(:close_on_exec=)
connected_args = ARGV[2]
connected_args.split(/,/).map do |sockfd|
$connected << sockfd
$connected.each {|c| threadLoop( c ) }
else
$nCS = TCPServer.new("127.0.0.1",9999)
$nCS.close_on_exec = false if $nCS.respond_to?(:close_on_exec=)
end
begin
while socket = $nCS.accept
threadLoop( socket )
end
rescue => e
puts "Error: Caught Error in Server Thread: #{e}"
puts "#{e.backtrace.to_s.gsub(",", ",\r\n")}"
exit
end
Caveat
My actual usage was a lot more ridiculously complicated, so I did my best to strip out all the garbage; however, I was realizing when I got the end here that you could probably do without $connected (it's a part of a larger system for me). There may be some errors, so please comment if you find them and I'll correct.
Hope this helps anyone who finds it.
Creating a class which holds some threads, performing tasks and finally calling a callback-method is my current goal, nothing special on this road.
My experimental class does some connection-checks on specific ports of a given IP, to give me a status information.
So my attempt:
check = ConnectionChecker.new do | threads |
# i am done callback
end
check.check_connectivity(ip0, port0, timeout0, identifier0)
check.check_connectivity(ip1, port1, timeout1, identifier1)
check.check_connectivity(ip2, port2, timeout2, identifier2)
sleep while not check.is_done
Maybe not the best approach, but in general it fits in my case.
So what's happening:
In my Class I store a callback, perform actions and do internal stuff:
Thread.new -> success/failure -> mark as done, when all done -> call callback:
class ConnectionChecker
attr_reader :is_done
def initialize(&callback)
#callback = callback
#thread_count = 0
#threads = []
#is_done = false
end
def check_connectivity(host, port, timeout, ident)
#thread_count += 1
#threads << Thread.new do
status = false
pid = Process.spawn("nc -z #{host} #{port} >/dev/null")
begin
Timeout.timeout(timeout) do
Process.wait(pid)
status = true
end
rescue Process::TimeoutError => e
Process.kill('TERM', pid)
end
mark_as_done
#returnvalue for the callback.
[status, ident]
end
end
# one less to go..
def mark_as_done
#thread_count -= 1
if #thread_count.zero?
#is_done = true
#callback.call(#threads)
end
end
end
This code - yes, I know there is no start method so I have to trust that I call it all quite instantly - works fine.
But when I swap these 2 lines:
#is_done = true
#callback.call(#threads)
to
#callback.call(#threads)
#is_done = true
then the very last line,
sleep while not check.is_done
becomes an endless loop. Debugging shows me that the callback is called properly, when I check for the value of is_done, it really always is false. Since I don't put it into a closure, I wonder why this is happening.
The callback itself can also be empty, is_done remains false (so there is no mis-caught exception).
In this case I noticed that the last thread was at status running. Since I did not ask for the thread's value, I just don't get the hang here.
Any documentation/information regarding this problem? Also, a name for it would be fine.
Try using Mutex to ensure thread safety :)