Ruby Sinatra with consumer thread and job queue - ruby

I’m trying to create a very simple restful server. When it receives a request, I want to create a new job on a queue that can be handled by another thread while the current thread returns a response to the client.
I looked at Sinatra, but haven't got too far.
require 'sinatra'
require 'thread'
queue = Queue.new
set :port, 9090
get '/' do
  queue << 'item'
  length = queue.size
  puts 'QUEUE LENGTH %d', length
  'Message Received'
end
consumer = Thread.new do
  5.times do |i|
    value = queue.pop(true) rescue nil
    puts "consumed #{value}"
  end
end
consumer.join
In the above example, I know the consumer thread would only run a few times (as opposed to the life of the application), but even this isn't working for me.
Is there a better approach?

Your main problem is your call to Queue#pop. You’re passing true, which makes the call non-blocking: instead of suspending the thread when the queue is empty, it raises an exception (ThreadError), which you rescue with nil. Your consumer thread therefore loops five times and finishes before anything else can happen.
You need to change that line to
value = queue.pop
so that the thread waits for new data being pushed onto the queue.
You’ll also need to remove the consumer.join line from the end, since that will cause deadlock once you’ve changed the call to pop.
(Also, it’s not part of your main problem, but it looks like you want printf rather than puts when you print the queue length).
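A minimal sketch of the difference, using only the standard library so it runs without Sinatra. The names handle_request, start_consumer, nonblocking_pop and run_demo are hypothetical; handle_request stands in for the get '/' route. The nil sentinel and the final join exist only so the demo terminates. In the real server you would leave the consumer blocked in pop and not join it.

```ruby
# Queue#pop(true) is non-blocking: on an empty queue it raises ThreadError.
def nonblocking_pop(queue)
  queue.pop(true)
rescue ThreadError
  nil
end

# Hypothetical stand-in for the Sinatra route: enqueue a job, reply at once.
def handle_request(queue, payload)
  queue << payload
  'Message Received'
end

# The consumer blocks inside Queue#pop until a job arrives.
# A nil job is used here as a "stop" signal so the demo can finish.
def start_consumer(queue, results)
  Thread.new do
    while (job = queue.pop)
      results << "consumed #{job}"
    end
  end
end

def run_demo
  queue = Queue.new
  results = []
  consumer = start_consumer(queue, results)
  3.times { |i| handle_request(queue, "item #{i}") }
  queue << nil   # demo only: tell the consumer to stop
  consumer.join  # safe here because the consumer is guaranteed to exit
  results
end
```

Calling nonblocking_pop on an empty queue returns nil immediately, which is what made the original consumer finish before any request arrived.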

Related

Ruby Bunny - Consuming from Multiple Queues

I’ve just started using Ruby and am writing a piece to consume some messages from a RabbitMQ queue. I’m using Bunny to do so.
So I’ve created my queues and bound them to an exchange.
However I’m now unsure how I handle subscribing to them both and allowing the ruby app to continue running (want the messages to keep coming through i.e. not blocked or at least not for a long time) until I actually exit it with ctrl+c.
I’ve tried using :block => true however as I have 2 different queues I’m subscribing to, using this means it remains consuming from only one.
So this is how I’m consuming messages:
def consumer
  begin
    puts ' [*] Waiting for messages. To exit press CTRL+C'
    @oneQueue.subscribe(:manual_ack => true) do |delivery_info, properties, payload|
      puts('Got One Queue')
      puts "Received #{payload}, message properties are #{properties.inspect}"
    end
    @twoQueue.subscribe(:manual_ack => true) do |delivery_info, properties, payload|
      puts('Got Two Queue')
      puts "Received #{payload}, message properties are #{properties.inspect}"
    end
  rescue Interrupt => _
    #TODO - close connections here
    exit(0)
  end
end
Any help would be appreciated.
Thanks!
You can't use block: true when you have two subscriptions as only the first one will block; it'll never get to the second subscription.
One thing you can do is set up both subscriptions without blocking (which will automatically spawn two threads to process messages), and then block your main thread with a wait loop (add just before your rescue):
loop { sleep 5 }

How to pass a block to a yielding thread in Ruby

I am trying to wrap my head around Threads and Yielding in Ruby, and I have a question about how to pass a block to a yielding thread.
Specifically, I have a thread that is sleeping, and waiting to be told to do something, and I would like that Thread to execute a different block if told to (ie, it is sleeping, and if a user presses a button, do something besides sleep).
Say I have code like this:
window = Thread.new do
  @thread1 = Thread.new do
    # Do some cool stuff
    # Decide it is time to sleep
    until @told_to_wakeup
      if block_given?
        yield
      end
      sleep(1)
    end
  end
  # At some point after @thread1 starts sleeping,
  # a user might do something, so I want to execute
  # some code in @@thread1 (unfortunately spawning a new thread
  # won't work correctly in my case)
end
Is it possible to do that?
I tried using @@thread1.send(), but send was looking for a method name.
Thanks for taking the time to look at this!
Here's a simple worker thread:
queue = Queue.new
worker = Thread.new do
  # Fetch an item from the work queue, or wait until one is available
  while (work = queue.pop)
    # ... Do something with work
  end
end
queue.push(thing: 'to do')
The pop method will block until something is pushed into the queue.
When you're done you can push in a deliberately empty job:
queue.push(nil)
That will make the worker thread exit.
You can always expand on that functionality to do more things, or to handle more conditions.
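One way to expand on it, sketched under the assumption that each job is a callable (a lambda or proc): the worker simply calls whatever it pops, which amounts to passing a block to an already running thread. The names start_worker and run_jobs are hypothetical.

```ruby
# The worker pops callables off the queue and runs them in its own thread.
# A nil job ends the loop; the collected return values become Thread#value.
def start_worker(queue)
  Thread.new do
    results = []
    while (job = queue.pop)
      results << job.call
    end
    results
  end
end

def run_jobs
  queue = Queue.new
  worker = start_worker(queue)

  # Any thread (for example a UI callback) can hand the worker a block to run.
  queue.push(-> { 1 + 1 })
  queue.push(proc { 'button pressed'.upcase })

  queue.push(nil) # deliberately empty job: tells the worker to exit
  worker.value    # waits for the worker and returns its results
end
```

The blocks run on the worker thread, in the order they were pushed, so no extra thread has to be spawned for each user action.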

Background thread in Rails can't see instance variables

I need to gather up some data from a rails application, aggregate it, and send it off to a remote server periodically. I instantiate my aggregation class in a global variable (I know, I know) in application.rb.
Inside my aggregation class, I fire up a thread that sleeps for 10 seconds, then looks at the queue, processes the data, and sends it. The queue is a hash stored in an instance variable of the class.
From the rails controller, I call a method in the aggregator class to queue the data in the hash. Of course this is on a different thread than the background task that reads the queue. The problem is that the background task never sees any data in the hash. In my log, I print out the object_id of the hash both when I write to it (from the controller's thread), and when I read from it (from the background thread). The hash#object_id matches from both threads, but the background thread never sees the data.
What's killing me is that this works fine outside of rails. I've set up tests with many threads that really pound on it, and it works fine (there is some thread protection that I am not showing for clarity). Anyone know how the object_ids can match, but the contents are not consistent?
class Aggregator
  def initialize
    @q = {}
    @timer = nil
  end
  def start
    @timer = Thread.new do
      loop do
        sleep(10)
        flush_q
      end
    end
  end
  def flush_q
    logger.debug "flush: q.object_id = #{@q.object_id}" # matches what I get below
    logger.debug "flush: q.length = #{@q.length}" # always zero!
    @q.each_pair do |k,v|
      # pack it up and send it
    end
    @q.clear
  end
  def add(item)
    logger.debug "add: q.object_id = #{@q.object_id}" # matches what I get above
    @q[item.name] ||= item
    logger.debug "add: q.length = #{@q.length}" # increases with each add
    # not actually that simple, but not relevant
  end
end
I'm going to go out on a limb and assume that your code is deployed using a forking app server (eg unicorn or passenger).
This means that your app is loaded once and then new instances are forked from that master instance. Forking is cheap, so new instances of the app can be started up and shut down really quickly.
I believe that your aggregator instance is getting created and started in this master process. When this forks, the process's entire memory space is copied (so there is an instance of the aggregator in the new process, with the same object id and so on).
However, when forking, only the current thread is copied, so the aggregator flushing is only happening in the master process, while all the appending is happening in the child processes. You could confirm this by adding Process.pid to what you log: you should see that your logging is coming from two different processes.
One way of fixing this would be to start/restart your thread after the child process has forked. How you do this depends on how the app is being served. With unicorn you can do this in your unicorn config via the after_fork method. With passenger you do
PhusionPassenger.on_event(:starting_worker_process) do |forked|
  if forked
    ...
  end
end
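The fork behaviour itself can be checked without Rails or an app server. The sketch below assumes a platform where Process.fork is available (so not Windows); child_thread_report is a hypothetical helper, and the sleeping thread stands in for the aggregator's timer thread.

```ruby
# Start a background thread, fork, and have the child report whether that
# thread is still alive there and how many threads the child has.
def child_thread_report
  background = Thread.new { sleep } # stand-in for the aggregator's timer thread
  reader, writer = IO.pipe

  pid = Process.fork do
    reader.close
    # Only the thread that called fork keeps running in the child.
    writer.puts "#{background.alive?} #{Thread.list.size}"
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end

  writer.close
  Process.wait(pid)
  report = reader.read.strip
  reader.close
  background.kill
  report
end
```

In the child, the background thread object is still there (same object id as in the parent), but it is no longer running, which is why the thread has to be started again after the fork.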

Send multiple messages in websocket using threads

I'm making a Ruby server using the em-websocket gem. When a client sends some message (e.g. "thread") the server creates two different threads and sends two answers to the client in parallel (I'm actually studying multithreading and websockets). Here's my code:
EM.run {
  EM::WebSocket.run(:host => "0.0.0.0", :port => 8080) do |ws|
    ws.onmessage { |msg|
      puts "Received message: #{msg}"
      if msg == "thread"
        threads = []
        threads << a = Thread.new {
          sleep(1)
          puts "1"
          ws.send("Message sent from thread 1")
        }
        threads << b = Thread.new {
          sleep(2)
          puts "2"
          ws.send("Message sent from thread 2")
        }
        threads.each { |aThread| aThread.join }
      end
    }
  end
}
How it executes:
I send the "thread" message to the server.
After one second, the string "1" is printed in my console. After another second, I see "2".
Only after that are both messages sent to the client, simultaneously.
The problem is that I want each message to be sent at the same moment its debug output ("1" or "2") is printed.
My Ruby version is 1.9.3p194.
I don't have experience with EM, so take this with a pinch of salt.
However, at first glance, it looks like "aThread.join" is actually blocking the "onmessage" method from completing and thus also preventing the "ws.send" from being processed.
Have you tried removing the "threads.each" block?
Edit:
After having tested this on Arch Linux with both Ruby 1.9.3 and 2.0.0 (using "test.html" from the examples of em-websocket), I am sure that even if removing the "threads.each" block doesn't fix the problem for you, you will still have to remove it, as Thread#join will suspend the current thread until the "joined" threads are finished.
If you follow the function call of "ws.onmessage" through the source code, you will end up at the Connection#send_data method of the Eventmachine module and find the following within the comments:
Call this method to send data to the remote end of the network connection. It takes a single String argument, which may contain binary data. Data is buffered to be sent at the end of this event loop tick (cycle).
As "onmessage" is blocked by the "join" until both "send" methods have run, the event loop tick cannot finish until both sets of data are buffered and thus, all the data cannot be sent until this time.
If it is still not working for you after removing the "threads.each" block, make sure that you have restarted your eventmachine and try setting the second sleep to 5 seconds instead. I don't know how long a typical event loop takes in eventmachine (and I can't imagine it to be as long as a second), however, the documentation basically says that if several "send" calls are made within the same tick, they will all be sent at the same time. So increasing the time difference will make sure that this is not happening.
I think the problem is that you are calling sleep method, passing 1 to the first thread and 2 to the second thread.
Try removing sleep call on both threads or passing the same value on each call.

What happens when you don't join your Threads?

I'm writing a ruby program that will be using threads to do some work. The work that is being done takes a non-deterministic amount of time to complete and can range anywhere from 5 to 45+ seconds. Below is a rough example of what the threading code looks like:
loop do # Program loop
  items = get_items
  threads = []
  for item in items
    threads << Thread.new(item) do |i|
      # do work on i
    end
    threads.each { |t| t.join } # What happens if this isn't there?
  end
end
My preference would be to skip joining the threads and not block the entire application. However I don't know what the long term implications of this are, especially because the code is run again almost immediately. Is this something that is safe to do? Or is there a better way to spawn a thread, have it do work, and clean up when it's finished, all within an infinite loop?
I think it really depends on the content of your thread work. If, for example, your main thread needed to print "X work done", you would need to join to guarantee that you were showing the correct answer. If you have no such requirement, then you wouldn't necessarily need to join up.
After writing the question out, I realized that this is exactly what a web server does when serving pages. I googled and found an article about a Ruby web server. The loop code looks pretty much like mine:
loop do
  session = server.accept
  request = session.gets
  # log stuff
  Thread.start(session, request) do |session, request|
    HttpServer.new(session, request, basePath).serve()
  end
end
Thread.start is effectively the same as Thread.new, so it appears that letting the threads finish and die off is OK to do.
If you split a workload across several threads and need to combine the results from those threads at the end, you definitely need a join. Otherwise, you can do without one.
If you removed the join, you could end up with new items getting started faster than the older ones get finished. If you're working on too many items at once, it may cause performance issues.
You should use a Queue instead (snippet from http://ruby-doc.org/stdlib/libdoc/thread/rdoc/classes/Queue.html):
require 'thread'
queue = Queue.new
producer = Thread.new do
  5.times do |i|
    sleep rand(i) # simulate expense
    queue << i
    puts "#{i} produced"
  end
end
consumer = Thread.new do
  5.times do |i|
    value = queue.pop
    sleep rand(i/2) # simulate expense
    puts "consumed #{value}"
  end
end
consumer.join
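To keep the number of items being worked on bounded, the same Queue can feed a fixed number of worker threads. This is only a sketch: process_with_pool and its worker_count parameter are hypothetical names, items are assumed to be non-nil, and Queue#close needs Ruby 2.3 or later.

```ruby
# Run the given block on every item using a fixed number of worker threads.
# Results are collected in completion order, not input order.
def process_with_pool(items, worker_count: 4)
  jobs = Queue.new
  done = Queue.new

  items.each { |item| jobs << item }
  jobs.close # once the queue is empty, pop returns nil instead of blocking

  workers = Array.new(worker_count) do
    Thread.new do
      while (item = jobs.pop)
        done << yield(item)
      end
    end
  end

  workers.each(&:join) # wait for this batch only
  Array.new(done.size) { done.pop }
end
```

For example, process_with_pool(get_items, worker_count: 8) { |i| ... } would never have more than eight items in flight. For a program loop that must not block at all, the workers could instead be started once, outside the loop, with the loop only pushing new items onto the queue.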
