Does Rack handle requests serially or concurrently?

Suppose I have a single-process Rack application. If multiple requests arrive at the same time, can the invocations of call(env) occur concurrently? Or is it guaranteed that call(env) will happen serially, so there is no race condition on @counter? Does it make any difference whether I use Unicorn or Thin?
require 'json'

class Greeter
  def call(env)
    req = Rack::Request.new(env)
    @counter ||= 0
    @counter = @counter + 1
    puts @counter
    [200, {"Content-Type" => "application/json"}, [{x: "Hello World!"}.to_json]]
  end
end

run Greeter.new

It depends on your Rack handler (the application server). Unicorn and Thin are both capable of serving concurrent requests, using multi-process and/or evented models, depending on which one you choose and how you configure it. So it's not really a question of whether Rack supports it, since it is the handler (Unicorn, Thin, or others) that is responsible for concurrency. This post has some more details and an overview of several popular Rack app servers:
Is Sinatra multi threaded?
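With Unicorn, for example, the concurrency is process-based: a minimal, hypothetical unicorn.rb along these lines (not from the original answer) forks several workers, each with its own copy of the app:

# unicorn.rb -- hypothetical config; the file is plain Ruby evaluated by Unicorn.
worker_processes 4   # four independent forked workers, each with its own Greeter
listen 8080          # accept connections on this port
timeout 30           # kill and respawn a worker stuck longer than 30s

Started with something like unicorn -c unicorn.rb config.ru, each worker serves one request at a time, so within a single worker call(env) runs serially.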
If you're wondering whether an instance variable in the Greeter class could potentially be shared between threads, that should not happen even when using one of the concurrent app servers, since each will have its own Greeter instance and hence separate instance variables. But you would need to watch out for globals or constants, since those would be shared across all threads, so you'd need to take that into account and use a lock/mutex, etc.
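If you do need shared mutable state, here is a minimal sketch (not from the original answer) of guarding it with a Mutex, using a class variable that every thread touching Greeter would share:

require 'json'

class Greeter
  COUNTER_LOCK = Mutex.new
  @@counter = 0   # class variable: shared across all threads using Greeter

  def call(env)
    # synchronize so two concurrent requests can't interleave the increment
    count = COUNTER_LOCK.synchronize { @@counter += 1 }
    [200, {"Content-Type" => "application/json"}, [{count: count}.to_json]]
  end
end

run Greeter.new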

Related

Is there any way to make several big method calls in parallel within one method? When I try to use threads, my dev server hangs

def call
  a_call
  b_call
  c_call
end

def a_call
  does_something
end

def b_call
  does_something
end

def c_call
  does_something
end
I want to call all these methods at once so that the response time is lower.
I tried it using Thread, but the dev server hangs. The concurrent-ruby documentation is not descriptive enough for me to use it.
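A minimal sketch of the usual threaded approach, assuming a_call, b_call, and c_call are independent and thread-safe:

def call
  threads = [
    Thread.new { a_call },
    Thread.new { b_call },
    Thread.new { c_call }
  ]
  threads.each(&:join)   # wait for all three before returning
end

Without the join, call returns while the work is still in flight, which can easily look like lost work or a hang in a dev server.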

How to connect to multiple WebSockets with Ruby?

Using faye-websocket and EventMachine, the code looks very similar to faye-websocket's client example:

require 'faye/websocket'
require 'eventmachine'

def setup_socket(url)
  EM.run {
    ws = Faye::WebSocket::Client.new(url)
    ws.on :open do ... end
    ws.on :message do ... end
    ws.on :close do ... end
  }
end
I'd like to have multiple connections open in parallel. I can't simply call setup_socket multiple times, as execution never exits the EM.run block. I've tried running setup_socket multiple times in separate threads, as in:
urls.each do |url|
  Thread.new { setup_socket(url) }
end
But it doesn't seem to do anything, as the puts statements never reach the output.
I'm not restricted to faye-websocket, but it seems to be the library most people use. If possible I'd like to avoid multithreading. I'd also like to keep the possibility of making changes (e.g. adding a new websocket) over time. Therefore moving the iteration of URLs inside the EM.run block is not desired; starting multiple EMs would be more beneficial. I found an example for starting multiple servers via EM in a very clean way. I'm looking for something similar.
How can I connect to multiple WebSockets at the same time?
Here's one way to do it.
First, you have to accept that the EM thread needs to be running. Without this thread you won't be able to process any current connections. So you just can't get around that.
Then, in order to add new URLs to the EM thread, you need some way to communicate from the main thread to the EM thread so you can tell it to launch a new connection. This can be done with EventMachine::Channel.
So what we can build now is something like this:
@channel = EventMachine::Channel.new

Thread.new {
  EventMachine.run {
    @channel.subscribe { |url|
      ws = Faye::...new(url)
      ...
    }
  }
}
Then in the main thread, any time you want to add a new URL to the event loop, you just use this:
def setup_socket(url)
  @channel.push(url)
end
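One caveat: a push issued before the reactor thread has actually started may be racy, so it is safest to wait for the reactor first. A usage sketch, assuming the @channel and reactor thread set up above:

Thread.pass until EventMachine.reactor_running?   # don't push before the reactor is up

urls.each { |url| setup_socket(url) }   # each push opens one more connection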
Here's another way to do it... Use Iodine's native websocket support (or the Plezi framework) instead of em-websocket...
...I'm biased (I'm the author), but I think they make it a lot easier. Also, Plezi offers automatic scaling with Redis, so it's easy to grow.
Here's an example using Plezi, where each Controller acts like a channel, with its own URL and Websocket callback (although I think Plezi's Auto-Dispatch is easier than the lower-level on_message callback). This code can be placed in a config.ru file:
require 'plezi'

# One controller / channel for all members of the "Red" group
class RedGroup
  def index # HTTP index for the /red URL
    "return the RedGroup client using `:render`".freeze
  end

  # handle websocket messages
  def on_message data
    # in this example, we'll send the data to all the members of the other group.
    BlueGroup.broadcast :handle_message, data
  end

  # This is the method activated by the "broadcast" message
  def handle_message data
    write data # write the data to the client.
  end
end

# the blue group controller / channel
class BlueGroup
  def index # HTTP index for the /blue URL
    "return the BlueGroup client using `:render`".freeze
  end

  # handle websocket messages
  def on_message data
    # in this example, we'll send the data to all the members of the other group.
    RedGroup.broadcast :handle_message, data
  end

  # This is the method activated by the "broadcast" message
  def handle_message data
    write data
  end
end

# the routes
Plezi.route '/red', RedGroup
Plezi.route '/blue', BlueGroup

# Set the Rack application
run Plezi.app
P.S.
I wrote this answer also because em-websocket might fail or hog resources in some cases. I'm not sure about the details, but it was noted both on the websocket-shootout benchmark and the AnyCable Websocket Benchmarks.

Single thread still handles concurrent requests?

A Ruby process is single-threaded. When we start a single process using the Thin server, why are we still able to handle concurrent requests?
require 'sinatra'
require 'thin'

set :server, %w[thin]

get '/test' do
  sleep 2   # <---- blocking call
  "success"
end
What is inside Thin that can handle concurrent requests? If it is due to the EventMachine framework, the code above is actually synchronous code, which is not written for EM.
Quoting the chapter "Non blocking IOs/Reactor pattern" in http://merbist.com/2011/02/22/concurrency-in-ruby-explained/:
"This is the approach used by Twisted, EventMachine and Node.js. Ruby developers can use EventMachine or an EventMachine-based webserver like Thin, as well as EM clients/drivers, to make non-blocking async calls."
The heart of the matter is EventMachine.defer, which is (quoting its documentation):
"used for integrating blocking operations into EventMachine's control flow. The action of defer is to take the block specified in the first parameter (the "operation") and schedule it for asynchronous execution on an internal thread pool maintained by EventMachine. When the operation completes, it will pass the result computed by the block (if any) back to the EventMachine reactor. Then, EventMachine calls the block specified in the second parameter to defer (the "callback"), as part of its normal event handling loop. The result computed by the operation block is passed as a parameter to the callback. You may omit the callback parameter if you don't need to execute any code after the operation completes."
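A minimal sketch of that pattern (hypothetical values, not from the original answer): the operation block runs on EM's internal thread pool, and its result is handed to the callback back on the reactor thread.

require 'eventmachine'

EM.run do
  operation = proc { sleep 2; "slow result" }        # blocking work, off the reactor
  callback  = proc { |result| puts result; EM.stop } # runs back on the reactor thread
  EM.defer(operation, callback)
end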
Essentially, in response to an HTTP request, the server executes the code you wrote by invoking the process method in the Connection class.
Have a look at the code in $GEM_HOME/gems/thin-1.6.2/lib/thin/connection.rb:
# Connection between the server and client.
# This class is instantiated by EventMachine on each new connection
# that is opened.
class Connection < EventMachine::Connection
  # Called when all data was received and the request
  # is ready to be processed.
  def process
    if threaded?
      @request.threaded = true
      EventMachine.defer(method(:pre_process), method(:post_process))
    else
      @request.threaded = false
      post_process(pre_process)
    end
  end
...here is where a threaded connection invokes EventMachine.defer.
The reactor
To see where the EventMachine reactor is activated, follow the initialization of the program.
Notice that any Sinatra application or middleware ($GEM_HOME/gems/sinatra-1.4.5/base.rb) can run as a self-hosted server using Thin, Puma, Mongrel, or WEBrick:
def run!(options = {}, &block)
  return if running?
  set options
  handler = detect_rack_handler
  ....
The method detect_rack_handler returns the first Rack::Handler:
return Rack::Handler.get(server_name.to_s)
In our test we require thin, so it returns a Thin Rack handler and sets up a threaded server:
# Starts the server by running the Rack Handler.
def start_server(handler, server_settings, handler_name)
  handler.run(self, server_settings) do |server|
    ....
    server.threaded = settings.threaded if server.respond_to? :threaded=
$GEM_HOME/gems/thin-1.6.2/lib/thin/server.rb
# Start the server and listen for connections.
def start
  raise ArgumentError, 'app required' unless @app
  log_info "Thin web server (v#{VERSION::STRING} codename #{VERSION::CODENAME})"
  ...
  log_info "Listening on #{@backend}, CTRL+C to stop"
  @backend.start { setup_signals if @setup_signals }
end
$GEM_HOME/gems/thin-1.6.2/lib/thin/backends/base.rb
# Start the backend and connect it.
def start
  @stopping = false
  starter = proc do
    connect
    yield if block_given?
    @running = true
  end

  # Allow for early run up of eventmachine.
  if EventMachine.reactor_running?
    starter.call
  else
    @started_reactor = true
    EventMachine.run(&starter)
  end
end
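Putting the pieces together, a small sketch (assuming the Thin 1.x API quoted above) of driving Thin directly with the threaded flag, which is what steers process into the EventMachine.defer branch:

require 'thin'

app = ->(env) { sleep 2; [200, {"Content-Type" => "text/plain"}, ["success"]] }

server = Thin::Server.new('0.0.0.0', 3000, app)
server.threaded = true   # pre_process now runs via EventMachine.defer
server.start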

Ruby and Celluloid

Due to some limitations I want to switch my current project from EventMachine/EM-Synchrony to Celluloid, but I'm having some trouble getting started with it. The project I am coding is a web harvester which should crawl tons of pages as fast as possible.
For a basic understanding of Celluloid, I've generated 10,000 dummy pages on a local web server and want to crawl them with this simple Celluloid snippet:
#!/usr/bin/env jruby --1.9
require 'celluloid'
require 'open-uri'

IDS = 1..9999
BASE_URL = "http://192.168.0.20/files"

class Crawler
  include Celluloid

  def read(id)
    url = "#{BASE_URL}/#{id}"
    puts "URL: " + url
    open(url) { |x| x.read }
  end
end

pool = Crawler.pool(size: 100)
IDS.to_a.map do |id|
  pool.future(:read, id)
end
As far as I understand Celluloid, futures are the way to get the response of a fired request (comparable to callbacks in EventMachine), right? The other thing is, every actor runs in its own thread, so I need some way of batching the requests, because 10,000 threads would result in errors on my OS X dev machine.
So creating a pool is the way to go, right? BUT: the code above iterates over the 9999 URLs, yet only about 1300 HTTP requests reach the web server. So something goes wrong with limiting the requests and iterating over all the URLs.
Likely your program is exiting as soon as all of your futures are created. With Celluloid a future will start execution but you can't be assured of it finishing until you call #value on the future object. This holds true for futures in pools as well. Probably what you need to do is change it to something like this:
crawlers = IDS.to_a.map do |id|
  begin
    pool.future(:read, id)
  rescue Celluloid::DeadActorError, Celluloid::MailboxError
  end
end
crawlers.compact.each { |crawler| crawler.value rescue nil }

Why does a Sinatra request take an EM thread?

My Sinatra app receives requests for long-running tasks and EM.defers them, launching them in EM's internal pool of 20 threads. When more than 20 EM.defer operations are running, further ones are stored in EM's threadqueue by EM.defer.
However, it seems Sinatra won't service any requests until there is an EM thread available to handle them. My question is: isn't Sinatra supposed to use the reactor of the main thread to service all requests? Why am I seeing an add to the threadqueue when I make a new request?
Steps to reproduce:
Access /track/
Launch 30 /sleep/ requests to fill the threadqueue
Access /ping/ and notice the add to the threadqueue as well as the delay
Code to reproduce it:
require 'sinatra'

# monkeypatch EM so we can access its thread pool internals
module EventMachine
  def self.queuedDefers
    @threadqueue == nil ? 0 : @threadqueue.size
  end

  def self.availThreads
    @threadqueue == nil ? 0 : @threadqueue.num_waiting
  end

  def self.busyThreads
    @threadqueue == nil ? 0 : @threadpool_size - @threadqueue.num_waiting
  end
end

get '/track/?' do
  EM.add_periodic_timer(1) do
    p "Busy: " + EventMachine.busyThreads.to_s + "/" + EventMachine.threadpool_size.to_s +
      ", Available: " + EventMachine.availThreads.to_s + "/" + EventMachine.threadpool_size.to_s +
      ", Queued: " + EventMachine.queuedDefers.to_s
  end
end

get '/sleep/?' do
  EM.defer(Proc.new { sleep 20 }, Proc.new { body "DONE" })
end

get '/ping/?' do
  body "pong"
end
I tried the same thing on Rack/Thin (no Sinatra) and it works as it's supposed to, so I guess Sinatra is causing it.
Ruby version: 1.9.3.p125
EventMachine: 1.0.0.beta.4.1
Sinatra: 1.3.2
OS: Windows
OK, so it seems Sinatra starts Thin in threaded mode by default, causing the above behavior.
You can add
set :threaded, false
in your Sinatra configure section; this will prevent the reactor from deferring requests onto a separate thread, and from blocking when under load.
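A minimal sketch of where that setting goes (hypothetical app, reusing the /ping/ route from above):

require 'sinatra'

configure do
  set :server, :thin
  set :threaded, false   # serve requests on the reactor instead of via EM.defer
end

get '/ping/?' do
  body "pong"
end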
Source1
Source2
Unless I'm misunderstanding something about your question, this is pretty much how EventMachine works. If you check out the docs for EM.defer, they state:
Don't write a deferred operation that will block forever. If so, the
current implementation will not detect the problem, and the thread
will never be returned to the pool. EventMachine limits the number of
threads in its pool, so if you do this enough times, your subsequent
deferred operations won't get a chance to run.
Basically, there's a finite number of threads, and if you use them up, any pending operations will block until a thread is available.
It might be possible to bump threadpool_size if you just need more threads, although ultimately that's not a long-term solution.
Is Sinatra multi threaded? is a really good question here on SO about Sinatra and threads. In short, Sinatra is awesome but if you need decent threading you might need to look elsewhere.
