Send multiple messages in websocket using threads - ruby

I'm making a Ruby server using the em-websocket gem. When a client sends a particular message (e.g. "thread"), the server creates two threads and sends two answers to the client in parallel (I'm studying multithreading and websockets). Here's my code:
require 'em-websocket'

EM.run {
  EM::WebSocket.run(:host => "0.0.0.0", :port => 8080) do |ws|
    ws.onmessage { |msg|
      puts "Received message: #{msg}"
      if msg == "thread"
        threads = []
        threads << a = Thread.new {
          sleep(1)
          puts "1"
          ws.send("Message sent from thread 1")
        }
        threads << b = Thread.new {
          sleep(2)
          puts "2"
          ws.send("Message sent from thread 2")
        }
        threads.each { |aThread| aThread.join }
      end
    }
  end
}
How it executes:
I send the "thread" message to the server.
After one second, the string "1" is printed in my console. After another second I see "2".
Only after that are both messages sent to the client, simultaneously.
The problem is that I want each message to be sent at the same moment its debug output ("1" or "2") is printed.
My Ruby version is 1.9.3p194.

I don't have experience with EM, so take this with a pinch of salt.
However, at first glance, it looks like "aThread.join" is actually blocking the "onmessage" method from completing and thus also preventing the "ws.send" from being processed.
Have you tried removing the "threads.each" block?
Edit:
After testing this on Arch Linux with both Ruby 1.9.3 and 2.0.0 (using "test.html" from the em-websocket examples), I am sure that even if removing the "threads.each" block doesn't fix the problem for you, you will still have to remove it, as Thread#join suspends the calling thread until the joined threads have finished.
If you follow the function call of "ws.onmessage" through the source code, you will end up at the Connection#send_data method of the EventMachine module and find the following within the comments:
Call this method to send data to the remote end of the network connection. It takes a single String argument, which may contain binary data. Data is buffered to be sent at the end of this event loop tick (cycle).
As "onmessage" is blocked by the "join" until both "send" calls have run, the event loop tick cannot finish until both pieces of data are buffered, and so none of the data can be sent before then.
If it is still not working for you after removing the "threads.each" block, make sure that you have restarted your EventMachine server and try setting the second sleep to 5 seconds instead. I don't know how long a typical event loop tick takes in EventMachine (and I can't imagine it being as long as a second); however, the documentation basically says that if several "send" calls are made within the same tick, they will all be sent at the same time. Increasing the time difference makes sure that this is not what is happening.
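To make the buffering argument concrete, here is a minimal stdlib-only sketch. It is a toy model, not EventMachine's real implementation: the ToyLoop class, its send_data and tick methods, and the shortened sleep times are all assumptions made up for illustration. It only imitates the documented behaviour that sent data goes out when the current tick ends.

```ruby
require "thread"

# Hypothetical stand-in for a reactor (not EventMachine's real code):
# data passed to send_data is buffered and only "goes out" when the
# current tick ends, i.e. when the handler passed to #tick returns.
class ToyLoop
  attr_reader :flushes

  def initialize
    @buffer  = Queue.new
    @flushes = [] # one array of messages per flush
  end

  def send_data(msg)
    @buffer << msg
  end

  def tick
    yield if block_given?
    batch = []
    batch << @buffer.pop until @buffer.empty?
    @flushes << batch unless batch.empty?
  end
end

# Mirrors the handler in the question: two threads that sleep, then send.
def handle(reactor, join)
  workers = [0.05, 0.3].each_with_index.map do |delay, i|
    Thread.new do
      sleep delay
      reactor.send_data("thread #{i + 1}")
    end
  end
  workers.each(&:join) if join # blocks the tick until both threads finish
  workers
end

blocking = ToyLoop.new
blocking.tick { handle(blocking, true) }

free = ToyLoop.new
workers = nil
free.tick { workers = handle(free, false) } # handler returns at once
workers.each do |w|
  w.join    # only this demo script waits here, not the handler
  free.tick # a later tick picks up whatever has been buffered
end

p blocking.flushes # => [["thread 1", "thread 2"]]
p free.flushes     # => [["thread 1"], ["thread 2"]]
```

With the join inside the handler, both messages end up in one flush; without it, each message leaves on the first tick after its thread has run.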

I think the problem is that you are calling the sleep method with different values: 1 in the first thread and 2 in the second.
Try removing the sleep call from both threads, or passing the same value in each call.

Related

Ruby Bunny - Consuming from Multiple Queues

I’ve just started using Ruby and am writing a piece to consume some messages from a RabbitMQ queue. I’m using Bunny to do so.
So I’ve created my queues and bound them to an exchange.
However, I’m now unsure how to subscribe to both of them while keeping the Ruby app running (I want the messages to keep coming through, i.e. not be blocked, or at least not for long) until I actually exit it with Ctrl+C.
I’ve tried using :block => true; however, as I’m subscribing to two different queues, this means it keeps consuming from only one.
So this is how I’m consuming messages:
def consumer
  begin
    puts ' [*] Waiting for messages. To exit press CTRL+C'
    @oneQueue.subscribe(:manual_ack => true) do |delivery_info, properties, payload|
      puts('Got One Queue')
      puts "Received #{payload}, message properties are #{properties.inspect}"
    end
    @twoQueue.subscribe(:manual_ack => true) do |delivery_info, properties, payload|
      puts('Got Two Queue')
      puts "Received #{payload}, message properties are #{properties.inspect}"
    end
  rescue Interrupt => _
    #TODO - close connections here
    exit(0)
  end
end
Any help would be appreciated.
Thanks!
You can't use block: true when you have two subscriptions, as only the first one will block; execution never reaches the second subscription.
One thing you can do is set up both subscriptions without blocking (which will automatically spawn two threads to process messages), and then block your main thread with a wait loop (add just before your rescue):
loop { sleep 5 }

Understanding Celluloid Concurrency

Below is my Celluloid code.
client1.rb: one of the two clients (I named it client 1).
client2.rb: the second of the two clients (named client 2).
Note:
The only difference between the two clients above is the text that is passed to the server, i.e. 'client-1' and 'client-2' respectively.
On testing these two clients (by running them side by side) against the following two servers (one at a time), I found very strange results.
server1.rb (a basic example taken from the README.md of celluloid-zmq)
Using this as the example server for the two clients above resulted in parallel execution of tasks.
OUTPUT
ruby server1.rb
Received at 04:59:39 PM and message is client-1
Going to sleep now
Received at 04:59:52 PM and message is client-2
Note:
The client2.rb message was processed while the client1.rb request was sleeping (a sign of parallelism).
server2.rb
Using this as the example server for the two clients above did not result in parallel execution of tasks.
OUTPUT
ruby server2.rb
Received at 04:55:52 PM and message is client-1
Going to sleep now
Received at 04:56:52 PM and message is client-2
Note:
client-2 had to wait 60 seconds because client-1 was sleeping (a 60-second sleep).
I ran the above test multiple times, and every run resulted in the same behaviour.
Can anyone explain the results of the above tests?
Question: Why is Celluloid made to wait for 60 seconds before it can process the other request, as noticed in the server2.rb case?
Ruby version
ruby -v
ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-darwin13.0]
Using your gists, I verified this issue can be reproduced in MRI 2.2.1 as well as jRuby 1.7.21 and Rubinius 2.5.8 ... The difference between server1.rb and server2.rb is the use of the DisplayMessage and message class method in the latter.
Use of sleep in DisplayMessage is out of Celluloid scope.
When sleep is used in server1.rb it is actually using Celluloid.sleep, but when used in server2.rb it is using Kernel.sleep ... which locks up the mailbox for Server until 60 seconds have passed. This prevents further method calls on that actor from being processed until the mailbox is processing messages (method calls on the actor) again.
There are three ways to resolve this:
1. Use a defer {} or future {} block.
2. Explicitly invoke Celluloid.sleep rather than sleep (if it is not explicitly invoked as Celluloid.sleep, sleep will end up calling Kernel.sleep, since DisplayMessage does not include Celluloid like Server does).
3. Bring the contents of DisplayMessage.message into handle_message as in server1.rb, or at least into Server, which is in Celluloid scope and will use the correct sleep.
The defer {} approach:
def handle_message(message)
  defer {
    DisplayMessage.message(message)
  }
end
The Celluloid.sleep approach:
class DisplayMessage
  def self.message(message)
    #de ...
    Celluloid.sleep 60
  end
end
Not truly a scope issue; it's about asynchrony.
To reiterate, the deeper issue is not the scope of sleep ... that's why defer and future are my best recommendation. But to post something here that came out in my comments:
Using defer or future pushes a task that would otherwise tie up an actor onto another thread. If you use future, you can get the return value once the task is done; if you use defer, you can fire and forget.
But better yet, create another actor for tasks that tend to get tied up, and even pool that other actor... if defer or future don't work for you.
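As a rough illustration of why a blocking sleep stalls an actor while offloaded work does not, here is a stdlib-only toy. It is not Celluloid and does not use its API: ToyActor, tell, finish and the shortened 0.2 second sleep are all made up for this sketch. The "offload" branch is only a loose analogue of what defer does, namely running the slow call on a different thread.

```ruby
require "thread"

# Hypothetical toy actor (not Celluloid): one thread drains a mailbox,
# handling one message at a time.
class ToyActor
  def initialize(offload)
    @offload = offload
    @mailbox = Queue.new
    @events  = Queue.new # thread-safe log of what happened, in order
    @helpers = []
    @runner  = Thread.new { run }
  end

  def tell(message)
    @mailbox << message
  end

  # Stop once the mailbox is drained and return the event log.
  def finish
    @mailbox << :stop
    @runner.join
    @helpers.each(&:join)
    Array.new(@events.size) { @events.pop }
  end

  private

  def run
    while (message = @mailbox.pop) != :stop
      if @offload
        # Loose analogue of defer: the slow work runs on another thread.
        @helpers << Thread.new(message) { |m| handle(m) }
      else
        handle(message) # a blocking sleep here stalls the whole mailbox
      end
    end
  end

  def handle(message)
    @events << "start #{message}"
    sleep 0.2 if message == "client-1" # plain Kernel#sleep
    @events << "done #{message}"
  end
end

def trial(offload)
  actor = ToyActor.new(offload)
  actor.tell("client-1")
  actor.tell("client-2")
  actor.finish
end

inline    = trial(false)
offloaded = trial(true)
p inline    # client-2 only starts after client-1 is done
p offloaded # client-2 is done before client-1
```

In the inline run, client-2 waits for the whole sleep, which matches the server2.rb output in the question; in the offloaded run, the mailbox keeps moving while client-1 sleeps.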
I'd be more than happy to answer follow-up questions brought up by this question; we have a very active mailing list, and IRC channel. Your generous bounties are commendable, but plenty of us would help purely to help you.
Managed to reproduce and fix the issue.
Deleting my previous answer.
Apparently, the problem lies in sleep.
Confirmed by adding logs "actor/kernel sleeping" to the local copy of Celluloids.rb's sleep().
In server1.rb,
the call to sleep is within server - a class that includes Celluloid.
Thus Celluloid's implementation of sleep overrides the native sleep.
class Server
  include Celluloid::ZMQ
  ...
  def run
    loop { async.handle_message @socket.read }
  end
  def handle_message(message)
    ...
    sleep 60
  end
end
Note the log actor sleeping from server1.rb. Log added to Celluloids.rb's sleep()
This suspends only the current "actor" in Celluloid
i.e. only the current "Celluloid thread" handling the client1 sleeps.
In server2.rb,
the call to sleep is within a different class DisplayMessage that does NOT include Celluloid.
Thus it is the native sleep itself.
class DisplayMessage
  def self.message(message)
    ...
    sleep 60
  end
end
Note the ABSENCE of any actor sleeping log from server2.rb.
This suspends the current ruby task i.e. the ruby server sleeps (not just a single Celluloid actor).
The Fix?
In server2.rb, the appropriate sleep must be explicitly specified.
class DisplayMessage
  def self.message(message)
    puts "Received at #{Time.now.strftime('%I:%M:%S %p')} and message is #{message}"
    ## Intentionally added sleep to test whether Celluloid blocks the main process for 60 seconds or not.
    if message == 'client-1'
      puts 'Going to sleep now'.red
      # "sleep 60" will invoke the native sleep.
      # Use Celluloid.sleep to support concurrent execution
      Celluloid.sleep 60
    end
  end
end

Which of Ruby's concurrency devices would be best suited for this scenario?

The whole threads/fibers/processes thing is confusing me a little. I have a practical problem that can be solved with some concurrency, so I thought this was a good opportunity to ask professionals and people more knowledgeable than me about it.
I have a long array, let's say 3,000 items. I want to send an HTTP request for each item in the array.
Actually iterating over the array, generating requests, and sending them is very rapid. What takes time is waiting for each item to be received, processed, and acknowledged by the party I'm sending to. I'm essentially sending 100 bytes, waiting 2 seconds, sending 100 bytes, waiting 2 seconds.
What I would like to do instead is send these requests asynchronously. I want to send a request, specify what to do when I get the response, and in the meantime, send the next request.
From what I can see, there are four concurrency options I could use here.
1. Threads.
2. Fibers.
3. Processes; unsuitable as far as I know, because multiple processes accessing the same array isn't feasible/safe.
4. Asynchronous functionality like JavaScript's XMLHttpRequest.
The simplest would seem to be the last one. But what is the best, simplest way to do that using Ruby?
Failing #4, which of the remaining three is the most sensible choice here?
Would any of these options also allow me to say "Have no more than 10 pending requests at any time"?
This is your classic producer/consumer problem and is nicely suited to threads in Ruby. Just create a Queue:
urls = [...] # array with bunches of urls
require "thread"
queue = SizedQueue.new(10) # this will only allow 10 items on the queue at once
producer = Thread.new do
  urls.each do |url|
    response = do_http_request(url)
    queue << response # blocks while the queue already holds 10 items
  end
  queue << "done" # sentinel: tells the consumer nothing more is coming
end
consumer = Thread.new do
  loop do
    http_response = queue.pop # blocks until an item is available
    break if http_response == "done"
    process(http_response)
  end
end
# wait for the consumer to finish
consumer.join
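On the "no more than 10 pending requests at any time" part of the question, one possible approach is a fixed pool of worker threads pulling from a Queue. The sketch below is stdlib-only and makes some assumptions: map_with_limit is a hypothetical helper name, and fake_request is a stand-in for the real HTTP call that just sleeps briefly and records how many calls were in flight.

```ruby
require "thread"

# Calls the block once per item using at most `limit` threads, so no more
# than `limit` calls are pending at any moment. Results keep input order.
def map_with_limit(items, limit, &block)
  jobs = Queue.new
  items.each_with_index { |item, index| jobs << [item, index] }
  limit.times { jobs << :done } # one stop marker per worker

  results = Array.new(items.size)
  workers = Array.new(limit) do
    Thread.new do
      while (job = jobs.pop) != :done
        item, index = job
        results[index] = block.call(item)
      end
    end
  end
  workers.each(&:join)
  results
end

# Stand-in for a slow HTTP call (hypothetical); it also records how many
# calls were in flight at once.
lock = Mutex.new
in_flight = 0
peak = 0
fake_request = lambda do |n|
  lock.synchronize do
    in_flight += 1
    peak = [peak, in_flight].max
  end
  sleep 0.01 # waiting on the remote side
  lock.synchronize { in_flight -= 1 }
  n * 2
end

results = map_with_limit((1..50).to_a, 10) { |n| fake_request.call(n) }
puts "peak in flight: #{peak}"
```

The limit comes from the number of worker threads rather than from the queue size, so swapping fake_request for a real request method should keep the same bound.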
Use EventMachine as the event loop and em-synchrony as a Fiber wrapper that turns its callbacks into synchronous-looking code.
Copied from the em-synchrony README:
require "em-synchrony"
require "em-synchrony/em-http"
require "em-synchrony/fiber_iterator"
EM.synchrony do
  concurrency = 2
  urls = ['http://url.1.com', 'http://url2.com']
  results = []
  EM::Synchrony::FiberIterator.new(urls, concurrency).each do |url|
    resp = EventMachine::HttpRequest.new(url).get
    results.push resp.response
  end
  p results # all completed requests
  EventMachine.stop
end
This is an IO-bound case, which suits both of the following:
Threading model: no problem with MRI Ruby in this case, because threads work well for IO-bound work; the effect of the GIL is almost zero.
Asynchronous model, which proves (in practice and in theory) to be far superior to threads when it comes to IO-specific problems.
For this specific case, and to keep things far simpler, I would go with the Typhoeus HTTP client, which has parallel request support that works like the evented (asynchronous) concurrency model.
Example:
require 'typhoeus'
hydra = Typhoeus::Hydra.new
%w(url1 url2 url3).each do |url|
  request = Typhoeus::Request.new(url, followlocation: true)
  request.on_complete do |response|
    # do something with response
  end
  hydra.queue(request)
end
hydra.run # this is a blocking call that returns once all requests are complete

Ruby Sinatra with consumer thread and job queue

I’m trying to create a very simple restful server. When it receives a request, I want to create a new job on a queue that can be handled by another thread while the current thread returns a response to the client.
I looked at Sinatra, but haven't got too far.
require 'sinatra'
require 'thread'
queue = Queue.new
set :port, 9090
get '/' do
  queue << 'item'
  length = queue.size
  puts 'QUEUE LENGTH %d', length
  'Message Received'
end
consumer = Thread.new do
  5.times do |i|
    value = queue.pop(true) rescue nil
    puts "consumed #{value}"
  end
end
consumer.join
In the above example, I know the consumer thread would only run a few times (as opposed to the life of the application), but even this isn't working for me.
Is there a better approach?
Your main problem is your call to Queue#pop. You’re passing true, which makes it raise an exception when the queue is empty instead of suspending the thread, and you then rescue that exception to nil. Your consumer thread therefore loops five times before anything else can happen.
You need to change that line to
value = queue.pop
so that the thread waits for new data being pushed onto the queue.
You’ll also need to remove the consumer.join line from the end, since that will cause deadlock once you’ve changed the call to pop.
(Also, it’s not part of your main problem, but it looks like you want printf rather than puts when you print the queue length).
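The difference between the two forms of pop can be checked without Sinatra. This is a small stdlib-only sketch; the :shutdown sentinel and the three pushed items are assumptions added so that the script terminates, whereas in the real app the route handlers would do the pushing.

```ruby
require "thread"

queue = Queue.new

# Non-blocking pop on an empty queue raises ThreadError instead of waiting.
begin
  queue.pop(true)
  empty_pop = :returned
rescue ThreadError
  empty_pop = :raised
end

consumed = []
consumer = Thread.new do
  # Blocking pop parks the thread until an item arrives.
  while (item = queue.pop) != :shutdown
    consumed << item
  end
end

# Stands in for requests arriving later and pushing jobs.
3.times { |i| queue << "item #{i}" }
queue << :shutdown # hypothetical sentinel so this demo can end
consumer.join      # safe here only because the sentinel stops the loop

p empty_pop # => :raised
p consumed  # => ["item 0", "item 1", "item 2"]
```

In the Sinatra app there is no sentinel, which is why joining the consumer at the end of the file would block forever once pop waits for data.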

Sending outside of EventMachine loop

I'm using the em-ws-client gem, although I think my question is more general than that. I'm trying to send data from outside the EventMachine receive block, but it takes a very long time (~20s) for the data to be sent:
require "em-ws-client"
m = Mutex.new
c = ConditionVariable.new
Thread.new do
  EM.run do
    @ws = EM::WebSocketClient.new("ws://echo.websocket.org")
    @ws.onopen do
      puts "connected"
      m.synchronize { c.broadcast }
    end
    @ws.onmessage do |msg, binary|
      puts msg
    end
  end
end
m.synchronize { c.wait(m) }
@ws.send_message "test"
sleep 100
When I put the @ws.send_message "test" directly into the onopen method it works just fine. I don't understand why my version doesn't work. I found this issue in EventMachine, but I'm not sure whether it's related.
Why does it take so long, and how can I fix that?
EventMachine is strictly single threaded and sharing of sockets between threads is not recommended. What you might be seeing here is an issue with the main EventMachine thread being unaware that you've submitted a send_message call and leaving it buffered for an extended period of time.
I'd be very, very careful when using threads with EventMachine. I've seen it malfunction and crash if you hit thread timing or synchronization problems.

Resources