Single thread still handles concurrent requests? - ruby

A Ruby process is single-threaded. When we start a single process using the Thin server, why can it still handle concurrent requests?
require 'sinatra'
require 'thin'
set :server, %w[thin]

get '/test' do
  sleep 2   # <---- blocking call
  "success"
end
What is inside Thin that can handle concurrent requests? If it is due to the EventMachine framework, the code above is actually synchronous code, which is not what EM is meant for.

Quoting the chapter: "Non blocking IOs/Reactor pattern" in
http://merbist.com/2011/02/22/concurrency-in-ruby-explained/:
"this is the approach used by Twisted, EventMachine and Node.js. Ruby developers can use EventMachine or
an EventMachine based webserver like Thin as well as EM clients/drivers to make non blocking async calls."
The heart of the matter is EventMachine.defer, which its documentation describes as:
*
used for integrating blocking operations into EventMachine's control flow.
The action of defer is to take the block specified in the first parameter (the "operation")
and schedule it for asynchronous execution on an internal thread pool maintained by EventMachine.
When the operation completes, it will pass the result computed by the block (if any)
back to the EventMachine reactor.
Then, EventMachine calls the block specified in the second parameter to defer (the "callback"),
as part of its normal event handling loop.
The result computed by the operation block is passed as a parameter to the callback.
You may omit the callback parameter if you don't need to execute any code after the operation completes.
*
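As a minimal standalone sketch of that mechanism, outside Thin (the sleep stands in for any blocking operation):

require 'eventmachine'

EM.run do
  operation = proc do
    sleep 2        # blocking work runs on EM's internal thread pool
    "success"
  end
  callback = proc do |result|
    puts result    # runs back on the reactor thread with the operation's result
    EM.stop
  end
  EM.defer(operation, callback)
  puts "reactor is still free while the operation sleeps"
end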
Essentially, in response to an HTTP request, the server invokes the process method of the Connection class. Have a look at the code in $GEM_HOME/gems/thin-1.6.2/lib/thin/connection.rb:
# Connection between the server and client.
# This class is instantiated by EventMachine on each new connection
# that is opened.
class Connection < EventMachine::Connection
  # Called when all data was received and the request
  # is ready to be processed.
  def process
    if threaded?
      @request.threaded = true
      EventMachine.defer(method(:pre_process), method(:post_process))
    else
      @request.threaded = false
      post_process(pre_process)
    end
  end
Here is where a threaded connection invokes EventMachine.defer.
The reactor
To see where the EventMachine reactor is activated, follow the program's initialization.
Notice that Sinatra applications and middleware ($GEM_HOME/gems/sinatra-1.4.5/lib/sinatra/base.rb)
can run as a self-hosted server using Thin, Puma, Mongrel, or WEBrick:
def run!(options = {}, &block)
  return if running?
  set options
  handler = detect_rack_handler
  ....
The method detect_rack_handler returns the first matching Rack::Handler:
return Rack::Handler.get(server_name.to_s)
In our test we require thin, so it returns the Thin Rack handler and sets up a threaded server:
# Starts the server by running the Rack Handler.
def start_server(handler, server_settings, handler_name)
  handler.run(self, server_settings) do |server|
    ....
    server.threaded = settings.threaded if server.respond_to? :threaded=
$GEM_HOME/gems/thin-1.6.2/lib/thin/server.rb
# Start the server and listen for connections.
def start
  raise ArgumentError, 'app required' unless @app
  log_info "Thin web server (v#{VERSION::STRING} codename #{VERSION::CODENAME})"
  ...
  log_info "Listening on #{@backend}, CTRL+C to stop"
  @backend.start { setup_signals if @setup_signals }
end
$GEM_HOME/gems/thin-1.6.2/lib/thin/backends/base.rb
# Start the backend and connect it.
def start
  @stopping = false
  starter = proc do
    connect
    yield if block_given?
    @running = true
  end
  # Allow for early run up of eventmachine.
  if EventMachine.reactor_running?
    starter.call
  else
    @started_reactor = true
    EventMachine.run(&starter)
  end
end
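Putting it together for the original question: Thin's threaded mode is what lets the synchronous sleep 2 handler serve overlapping requests, because each request body runs via EventMachine.defer on the thread pool. A hedged sketch of enabling it explicitly (Sinatra's :threaded setting is real; whether it is on by default depends on the Sinatra version):

require 'sinatra'
require 'thin'

set :server, :thin
set :threaded, true   # forwarded to Thin via server.threaded= in start_server

get '/test' do
  sleep 2             # now runs on EM's thread pool instead of the reactor
  "success"
end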

Related

How to connect to multiple WebSockets with Ruby?

Using faye-websocket and EventMachine the code looks very similar to faye-websocket's client example:
require 'faye/websocket'
require 'eventmachine'

def setup_socket(url)
  EM.run {
    ws = Faye::WebSocket::Client.new(url)
    ws.on :open do ... end
    ws.on :message do ... end
    ws.on :close do ... end
  }
end
I'd like to have multiple connections open in parallel. I can't simply call setup_socket multiple times, as execution never returns from the EM.run block. I've tried to run setup_socket multiple times in separate threads:
urls.each do |url|
  Thread.new { setup_socket(url) }
end
But it doesn't seem to do anything, as the puts statements never reach the output.
I'm not restricted to faye-websocket, but it seems to be the library most people use. If possible I'd like to avoid multithreading. I'd also not like to lose the possibility to make changes (e.g. add a new websocket) over time. Moving the iteration of URLs inside the EM.run block is therefore not desired; starting multiple EMs would be more beneficial. I found an example for starting multiple servers via EM in a very clean way, and I'm looking for something similar.
How can I connect to multiple WebSockets at the same time?
Here's one way to do it.
First, you have to accept that the EM thread needs to be running; without it you won't be able to process any current connections. So you just can't get around that.
Then, in order to add new URLs to the EM thread, you need some way to communicate from the main thread to the EM thread so you can tell it to launch a new connection. This can be done with EventMachine::Channel.
So what we can build now is something like this:
@channel = EventMachine::Channel.new

Thread.new {
  EventMachine.run {
    @channel.subscribe { |url|
      ws = Faye::...new(url)
      ...
    }
  }
}
Then in the main thread, any time you want to add a new URL to the event loop, you just use this:
def setup_socket(url)
  @channel.push(url)
end
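Assembled into a self-contained sketch (the URL is a placeholder and the event handlers are illustrative):

require 'faye/websocket'
require 'eventmachine'

@channel = EventMachine::Channel.new

Thread.new do
  EM.run do
    @channel.subscribe do |url|
      ws = Faye::WebSocket::Client.new(url)
      ws.on(:open)    { |_e| puts "connected to #{url}" }
      ws.on(:message) { |e|  puts "#{url}: #{e.data}" }
      ws.on(:close)   { |_e| puts "closed #{url}" }
    end
  end
end

def setup_socket(url)
  @channel.push(url)
end

# New sockets can be added at any time, even long after the reactor started:
setup_socket("wss://example.com/stream")
sleep   # keep the main thread alive so the EM thread keeps running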
Here's another way to do it... use Iodine's native websocket support (or the Plezi framework) instead of em-websocket...
...I'm biased (I'm the author), but I think they make it a lot easier. Also, Plezi offers automatic scaling with Redis, so it's easy to grow.
Here's an example using Plezi, where each Controller acts like a channel, with its own URL and websocket callback (although I think Plezi's Auto-Dispatch is easier than the lower-level on_message callback). This code can be placed in a config.ru file:
require 'plezi'

# One controller / channel for all members of the "Red" group
class RedGroup
  def index # HTTP index for the /red URL
    "return the RedGroup client using `:render`".freeze
  end

  # handle websocket messages
  def on_message data
    # in this example, we'll send the data to all the members of the other group.
    BlueGroup.broadcast :handle_message, data
  end

  # This is the method activated by the "broadcast" message
  def handle_message data
    write data # write the data to the client.
  end
end

# the blue group controller / channel
class BlueGroup
  def index # HTTP index for the /blue URL
    "return the BlueGroup client using `:render`".freeze
  end

  # handle websocket messages
  def on_message data
    # in this example, we'll send the data to all the members of the other group.
    RedGroup.broadcast :handle_message, data
  end

  # This is the method activated by the "broadcast" message
  def handle_message data
    write data
  end
end

# the routes
Plezi.route '/red', RedGroup
Plezi.route '/blue', BlueGroup

# Set the Rack application
run Plezi.app
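A usage note (an assumption on my part, not from the answer): since this is a config.ru, it should be startable with the iodine gem's CLI, e.g. `iodine -p 3000`, which picks up config.ru in the current directory.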
P.S.
I wrote this answer also because em-websocket might fail or hog resources in some cases. I'm not sure about the details, but it was noted both on the websocket-shootout benchmark and the AnyCable Websocket Benchmarks.

Handle Sinatra and Faye in same EventMachine Performantly

I am writing a web application that uses both Sinatra—for general single-client synchronous gets—and Faye—for multiple-client asynchronous server-based broadcasts.
My (limited) understanding of EventMachine was that it would allow me to put both of these in a single process and get parallel requests handled for me. However, my testing shows that if either Sinatra or Faye takes a long time on a particular action (which may happen regularly with my real application) it blocks the other.
How can I rewrite the simple test application below so that if either sleep command is uncommented the Faye-pushes and the AJAX poll responses are not delayed?
%w[eventmachine thin sinatra faye json].each{ |lib| require lib }

def run!
  EM.run do
    Faye::WebSocket.load_adapter('thin')
    webapp = MyWebApp.new
    server = Faye::RackAdapter.new(mount:'/', timeout:25)
    dispatch = Rack::Builder.app do
      map('/'){ run webapp }
      map('/faye'){ run server }
    end
    Rack::Server.start({
      app: dispatch,
      Host: '0.0.0.0',
      Port: 8090,
      server: 'thin',
      signals: false,
    })
  end
end
class MyWebApp < Sinatra::Application
  # http://stackoverflow.com/q/10881594/405017
  configure{ set threaded:false }

  def initialize
    super
    @faye = Faye::Client.new("http://localhost:8090/faye")
    EM.add_periodic_timer(0.5) do
      # uncommenting the following line should not
      # prevent Sinatra from responding to "pull"
      # sleep 5
      @faye.publish( '/push', { faye:Time.now.to_f } )
    end
  end

  get('/pull') do
    # uncommenting the following line should not
    # prevent Faye from sending "push" updates rapidly
    # sleep 5
    content_type :json
    { sinatra:Time.now.to_f }.to_json
  end

  get '/' do
    "<!DOCTYPE html>
    <html lang='en'><head>
      <meta charset='utf-8'>
      <title>PerfTest</title>
      <script src='https://code.jquery.com/jquery-2.2.0.min.js'></script>
      <script src='/faye/client.js'></script>
      <script>
        var faye = new Faye.Client('/faye', { retry:2, timeout:10 } );
        faye.subscribe('/push',console.log.bind(console));
        setInterval(function(){
          $.get('/pull',console.log.bind(console))
        }, 500 );
      </script>
    </head><body>
      Check the logs, yo.
    </body></html>"
  end
end
run!
How does sleep differ from, say, 999999.times{ Math.sqrt(rand) } or exec("sleep 5")? Those also block any single thread, right? That's what I'm trying to simulate: a blocking command that takes a long time.
Both cases would block your reactor/event queue. With the reactor pattern, you want to avoid any CPU-intensive work and focus purely on I/O (i.e. network programming).
The reason the single-threaded reactor pattern works so well with I/O is that I/O is not CPU intensive; it merely suspends your program while the system kernel handles the I/O request.
The reactor pattern takes advantage of this by immediately switching your single thread to potentially work on something different (perhaps the response to some other request has completed) until the I/O operation is completed by the OS.
Once the OS has the result of your I/O request, EventMachine finds the callback you initially registered with that request and passes it the response data.
So instead of something like
# block here for perhaps 50 ms
r = RestClient.get("http://www.google.ca")
puts r.body
EventMachine is more like
# absolutely no blocking
http = EventMachine::HttpRequest.new('http://google.ca/').get
# add to the event queue for when the kernel eventually delivers the result
http.callback {
  puts http.response
}
In the first example, you would need the multi-threaded model for your web server, since a single thread making a network request can block for potentially seconds.
In the second example, you don't have blocking operations, so one thread works great (and is generally faster than a multi-threaded app!).
If you ever do have a CPU-intensive operation, EventMachine allows you to cheat a little bit and start a new thread so that the reactor doesn't block; see the sketch below. Read more about EM.defer here.
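A hedged sketch of that escape hatch applied to the asker's /pull route (not the answer author's code): EM.defer moves the blocking work onto the thread pool, and Sinatra's streaming API lets the response be finished later from the callback. Assumes Thin; the 5-second sleep stands in for real blocking work.

get '/pull' do
  content_type :json
  stream(:keep_open) do |out|
    heavy_work = proc do
      sleep 5                       # blocking work, now on EM's thread pool
      { sinatra: Time.now.to_f }.to_json
    end
    finish = proc do |json|         # runs back on the reactor thread
      out << json
      out.close
    end
    EM.defer(heavy_work, finish)
  end
end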
One final note: this is the reason Node.js is so popular. For Ruby we need EventMachine plus compatible libraries for the reactor pattern (you can't just use the blocking RestClient, for example), but Node.js and all of its libraries are written from the start for the reactor design pattern (they are callback based).

Does Rack handle requests serially or concurrently?

Suppose I have a single-process Rack application. If multiple requests arrive at the same time, can the invocations of call(env) occur concurrently? Or is it guaranteed that call(env) will happen serially, so that there is no race condition on @counter? Does it make any difference whether I use Unicorn or Thin?
require 'json'

class Greeter
  def call(env)
    req = Rack::Request.new(env)
    @counter ||= 0
    @counter = @counter + 1
    puts @counter
    [200, {"Content-Type" => "application/json"}, [{x:"Hello World!"}.to_json]]
  end
end

run Greeter.new
It depends on your Rack handler (the application server). Unicorn and Thin are both capable of concurrent requests, using multi-process and/or evented models, depending on which one you choose and how you configure it. So it's not really a question of whether Rack supports it, since it is the handler (Unicorn, Thin, or others) that is responsible for concurrency. This post has more details and an overview of several popular Rack app servers:
Is Sinatra multi threaded?
If you're wondering whether an instance variable in the Greeter class could be shared between threads: that should not happen even with one of the concurrent app servers, since each will have its own Greeter instance and hence separate instance variables. But you would need to watch out for globals or constants, since those are shared across all threads, so you'd have to take that into account and use a lock/mutex, as sketched below.
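A minimal sketch of that advice, guarding shared state with a Mutex (the counter is illustrative):

require 'json'

class Greeter
  COUNTER_LOCK = Mutex.new   # constants are shared, so guard the shared counter
  @@counter = 0              # class variables are shared across threads too

  def call(env)
    count = COUNTER_LOCK.synchronize { @@counter += 1 }
    [200, { "Content-Type" => "application/json" },
     [{ x: "Hello World!", count: count }.to_json]]
  end
end

run Greeter.new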

Test Ruby TCPSocket server

I have a small application serving connections (like a chat). It catches the connection, grabs a login from it, then listens for data and broadcasts it to each connection except the sender.
The problem is I'm not a very advanced tester and do not know how this can be tested.
# Handle each connection
def serve(io)
  io.puts("LOGIN\n")
  # Listen for identifier
  user = io.gets.chomp
  ...
  # Add connection to the list
  @mutex.synchronize { @chatters[user] = io }
  # Get and broadcast input until connection returns nil
  loop do
    incoming = io.gets
    broadcast(incoming, io)
  end
end

# Send message out to everyone but the sender
def broadcast(message = "", sender)
  # Mutex for safety - GServer uses threads
  @mutex.synchronize do
    @chatters.each do |_user, socket|
      # Do not send to sender
      socket.print(message) if socket != sender
    end
  end
end
If you just want to do unit testing, you could use RSpec mocks (or some other mocking framework) to stub your methods and ensure that the logic works the way you expect. If you actually want to drive an integration test, that's a lot more work, and will require that you create a separate reader and writer for the socket so that you can actually test each piece of the conversation independently for expected behavior.
Someone else has apparently blogged about a similar issue to yours. Perhaps that example will help you.
If your question is more about the test cases you should write, instead of about how to test sockets, then you may want to rewrite your question so that answers will be more on-target.
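For the unit-test route, here is a hedged RSpec sketch of the broadcast logic. It assumes the methods above live in a class named ChatServer (the class name and the ivar setup are illustrative) and uses StringIO objects as stand-in sockets:

require 'rspec'
require 'stringio'

RSpec.describe "ChatServer#broadcast" do
  it "sends the message to everyone except the sender" do
    server   = ChatServer.new   # hypothetical class wrapping serve/broadcast
    sender   = StringIO.new
    receiver = StringIO.new
    server.instance_variable_set(:@mutex, Mutex.new)
    server.instance_variable_set(:@chatters, "alice" => sender, "bob" => receiver)

    server.broadcast("hello\n", sender)

    expect(receiver.string).to eq("hello\n")   # receiver got the message
    expect(sender.string).to be_empty          # sender did not
  end
end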

EventMachine, Redis & EM HTTP Request

I'm trying to read URLs from a Redis store and simply fetch the HTTP status of each URL, all within EventMachine. I don't know what's wrong with my code, but it's not asynchronous as expected.
All requests are fired from the first one to the last one, and curiously I only get the first response (the HTTP header I want to check) after the last request. Does anyone have a hint what's going wrong there?
require 'eventmachine'
require 'em-hiredis'
require 'em-http'

EM.run do
  @redis = EM::Hiredis.connect
  @redis.errback do |code|
    puts "Error code: #{code}"
  end
  @redis.keys("domain:*") do |domains|
    domains.each do |domain|
      if domain
        http = EM::HttpRequest.new("http://www.#{domain}", :connect_timeout => 1).get
        http.callback do
          puts http.response_header.http_status
        end
      else
        EM.stop
      end
    end
  end
end
I'm running this script for a few thousand domains so I would expect to get the first responses before sending the last request.
While EventMachine is async, the reactor itself is single-threaded. So while your loop is running and firing off those thousands of requests, none of them are executed until the loop exits. And if you then call EM.stop, you'll stop the reactor before they execute.
You can use something like EM::Iterator to break the processing of domains into chunks that let the reactor run. Then, if you really want to call EM.stop, you'll need to keep a counter of dispatched requests and received responses and only stop the reactor once they match, as sketched below.
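A hedged sketch of that approach (EM::Iterator and its concurrency argument are real; the counter logic and the concurrency level of 50 are illustrative):

require 'eventmachine'
require 'em-hiredis'
require 'em-http'

EM.run do
  redis = EM::Hiredis.connect
  redis.keys("domain:*") do |domains|
    pending = domains.size
    done = proc do
      pending -= 1
      EM.stop if pending.zero?   # stop only after the last response arrives
    end
    # Process at most 50 domains concurrently; iter.next releases a slot.
    EM::Iterator.new(domains, 50).each do |domain, iter|
      http = EM::HttpRequest.new("http://www.#{domain}", connect_timeout: 1).get
      http.callback { puts http.response_header.http_status; done.call; iter.next }
      http.errback  { puts "failed: #{domain}";              done.call; iter.next }
    end
  end
end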
