Is Sinatra multi threaded? - ruby

Is Sinatra multi-threaded? I read else where that "sinatra is multi-threaded by default", what does that imply?
Consider this example
get "/multithread" do
t1 = Thread.new{
puts "sleeping for 10 sec"
sleep 10
# Actually make a call to Third party API using HTTP NET or whatever.
}
t1.join
"multi thread"
end
get "/dummy" do
"dummy"
end
If I access "/multithread" and "/dummy" subsequently in another tab or browser then nothing can be served(in this case for 10 seconds) till "/multithread" request is completed. In case activity freezes application becomes unresponsive.
How can we work around this without spawning another instance of the application?

tl;dr Sinatra works well with Threads, but you will probably have to use a different web server.
Sinatra itself does not impose any concurrency model, it does not even handle concurrency. This is done by the Rack handler (web server), like Thin, WEBrick or Passenger. Sinatra itself is thread-safe, meaning that if your Rack handler uses multiple threads to server requests, it works just fine. However, since Ruby 1.8 only supports green threads and Ruby 1.9 has a global VM lock, threads are not that widely used for concurrency, since on both versions, Threads will not run truly in parallel. The will, however, on JRuby or the upcoming Rubinius 2.0 (both alternative Ruby implementations).
Most existing Rack handlers that use threads will use a thread pool in order to reuse threads instead of actually creating a thread for each incoming request, since thread creation is not for free, esp. on 1.9 where threads map 1:1 to native threads. Green threads have far less overhead, which is why fibers, which are basically cooperatively scheduled green threads, as used by the above mentioned sinatra-synchrony, became so popular recently. You should be aware that any network communication will have to go through EventMachine, so you cannot use the mysql gem, for instance, to talk to your database.
Fibers scale well for network intense processing, but fail miserably for heavy computations. You are less likely to run into race conditions, a common pitfall with concurrency, if you use fibers, as they only do a context switch at clearly defined points (with synchony, whenever you wait for IO). There is a third common concurrency model: Processes. You can use preforking server or fire up multiple processes yourself. While this seems a bad idea at first glance, it has some advantages: On the normal Ruby implementation, this is the only way to use all your CPUs simultaniously. And you avoid shared state, so no race conditions by definition. Also, multiprocess apps scale easily over multiple machines. Keep in mind that you can combine multiple process with other concurrency models (evented, cooperative, preemptive).
The choice is mainly made by the server and middleware you use:
Multi-Process, non-preforking: Mongrel, Thin, WEBrick, Zbatery
Multi-Process, preforking: Unicorn, Rainbows, Passenger
Evented (suited for sinatra-synchrony): Thin, Rainbows, Zbatery
Threaded: Net::HTTP::Server, Threaded Mongrel, Puma, Rainbows, Zbatery, Thin[1], Phusion Passenger Enterprise >= 4
[1] since Sinatra 1.3.0, Thin will be started in threaded mode, if it is started by Sinatra (i.e. with ruby app.rb, but not with the thin command, nor with rackup).

While googling around, found this gem:
sinatra-synchrony
which might help you, because it touches you question.
There is also a benchmark, they did nearly the same thing like you want (external calls).
Conclusion: EventMachine is the answer here!

Thought I might elaborate for people who come across this. Sinatra includes this little chunk of code:
server.threaded = settings.threaded if server.respond_to? :threaded=
Sinatra will detect what gem you have installed for a webserver (aka, thin, puma, whatever.) and if it responds to "threaded" will set it to be threaded if requested. Neat.

After making some changes to code I was able to run padrino/sinatra application on mizuno
. Initially I tried to run Padrino application on jRuby but it was simply too unstable and I did not investigate as to why. I was facing JVM crashes when running on jRuby. I also went through this article, which makes me think why even choose Ruby if deployment can be anything but easy.
Is there any discussion on deployment of applications in ruby? Or can I spawn a new thread :)

I've been getting in to JRuby myself lately and I am extremely surprised how simple it is to switch from MRI to JRuby. It pretty much involves swapping out a few gems (in most cases).
You should take a look at the combination JRuby and Trinidad (App Server). Torquebox also seems to be an interesting all-in-one solution, it comes with a lot more than just an app server.
If you want to have an app server that supports threading, and you're familiar with Mongrel, Thin, Unicorn, etc, then Trinidad is probably the easiest to migrate to since it's practically identical from the users perspective. Loving it so far!

Related

Should I use rails5 ActiveJob default async adapter for small background job in production?

Rails app which handle and activation of a license using an external service, the external service sometime delays the handling of rails request to over 30s, which will then return an error to front end (I'm running heroku, so max is 30s).
I tried using ActiveJobs and the default rails async adapter (Rails 5), and I can see that is working in Heroku out of the box. I keep reading that I should be using another web process and for example redis, but if the background job should just be performed straight after the request is done and if is just hitting another API outside which may be slower, is it so bad to use the default async?
I can see that this is handle in an in-process thread but I don't see a reason for such small job to be having another web process.
I use the async adapter in production for sending emails. This is a very small job. An email could take up to 3 seconds to send.
The doc said it's a poor fit for production because it will drop pending jobs on restart. If I remember correctly, Heroku restarts dynos once a day.
If your job is pending during the restart, the job will be lost. For my case, a pending email during the restart is pretty slim. So far so good.
But if you have jobs taking 30 seconds, I'll use Resque or DelayedJob.
If for small background job in production, which does not require 100% persistence in case of failure/server restart, whose duration is relatively short and thus separate process would be an overkill, I'd recommend using Sucker Punch.
Sucker Punch gem is designed to handle exactly such case. It prepares execution thread pool for each Job you create, using the concurrent-ruby gem, which is (probably) the most robust concurrency library in Ruby. It also hooks on_exit to finish all the pending tasks, so I guess you can expect this gem to be more reliable than the AsyncJob.
One thing to note is that although Sucker Punch is supported on Active Job, the adapter is not well written. Or, at least, when you use Sucker Punch adapter, it's behavior would be just like that of async adapter. So, I'd recommend using bare Sucker Punch if you wanted something just a little more useful/robust than AsyncJob.

For Airbrake 5, do I need to use Sidekiq?

Environment
I'm installing Airbrake on Heroku for a Ruby web app (not Rails).
So Airbrake#notify for Airbrake version 5 for Ruby sends a notification asynchronously.
My worry is that if I don't use Sidekiq worker + Redis, then it might still be possible that calling Airbrake#notify might still slow down the app's response time depending on how it's used (whether in a Rails-like controller or some other part of the app).
Besides overcoming the potential issue mentioned above, the other advantage of using Sidekiq worker + Redis to call Airbrake#notify I can think of is that Redis has a couple of persistence strategies so if the app crashes I can backtrack and look over the backed up error notifications from the Sidekiq queue.
Whereas if I don't use Sidekiq + Redis and the app crashes, then there will be no backed up data....
Questions
Does that mean I don't need to use Sidekiq + Redis (or some other equivalent database)?
Am I understanding the issue correctly? I don't have a very complete understanding of "pooled connections" and asynchronous processing, so this makes understanding what to do here a bit challenging.
This is the class that sends async notices https://github.com/airbrake/airbrake-ruby/blob/master/lib/airbrake-ruby/async_sender.rb
It's using standard ruby threads to send messages, so no background service should be necessary

Ruby 1.9 thread pools

As I understand, Ruby 1.9 uses OS threads but only one thread will still actually be running concurrently (though one thread may be doing blocking IO while another thread is doing processing). The threading examples I've seen just use Thread.new to launch a new thread. Coming from a Java background, I typically use thread pools as to not launch to many new threads since they are "heavyweight."
Is there a thread pool construct built into ruby? I didn't see one in the default language libraries. Or are there is a standard gem that is typically used? Since OS level threading is a newer feature of ruby, I don't know how mature the libraries are for it.
You are correct in that the default C Ruby interpreter only executes one thread at a time (other C based dynamic languages such as Python have similar restrictions). Because of this restriction, threading is not really that common in Ruby and as a result there is no default threadpool library. If there are tasks to be done in parallel, people typically uses processes since processes can scale over multiple servers.
If you do need to use threads, I would recommend you use https://github.com/meh/ruby-threadpool on the JRuby platform, which is a Ruby interpreter running on the JVM. That should be right up your alley, and because it is running on the virtual machine it will have true threading.
The accepted answer is correct, But, there are many tasks in which threads are fine. after all there are some reasons why it is there. even though it can only run a thread at a time. it is still can be considered parallel in many real life situations.
for example when we have 100 long running process in which each takes approximate 10 minutes to complete. by using threads in ruby, even with all those restrictions, if we define a threadpool of 10 tasks at time, it will run much faster than 100*10 minutes when running without threads. examples include, live capturing of file changes, sending large number of web requests (such as status check)
You can understand how pooling works by reading https://blog.codeship.com/understanding-fundamental-ruby-abstraction-concurrency/ . in production code use https://github.com/meh/ruby-thread#pool

What are some good ways to make an async web app on ruby these days?

I'm looking to build a webapp with a WebSocket component, and a run of the mill rack based frontend. My initial plan was to use Camping for the frontend, running the server on thin, with a rack config.ru looking like this:
require 'rack'
require './parts/web-frontend'
require './parts/websocket'
AppStationary = Rack::File.new("./stationary")
run Rack::Cascade.new(AppWebSockets, AppWebPages, AppStationary)
AppWebSockets is being provided by websocket-rack and works great. In the absence of an Upgrade: WebSocket request it simply 404's and the request runs down the cascade to the camping app, AppWebPages.
It's becoming clear that this camping webapp inevitably requires access to IO, to talk with the CouchDB database using regular http requests. There are plenty of ways to do http requests, including some async libraries compatible with eventmachine. If I subscribe to a callback, rack returns and the page has already responded by the time I'm ready to create a response. I'd like to be able to use em-synchrony to get some concurrency via Ruby 1.9's Fibers - which I've only just gotten my head around - but cannot find any documentation on how to make use of em-synchrony with Thin.
I've encountered a webserver called Goliath which claims to be similar to thin, with em-synchrony support baked in, but it lacks a command line utility to launch and test the server and seems to require I write a different sort of file to a rackup, which is quite distasteful. It also is unclear if it would even support websocket-rack, which only specifies support for Thin currently.
What are some good ways to avoid blocking IO while still making use of familiar rack based tools like camping, and having access to WebSockets?
In regards to Goliath, Goliath is based on Thin (I started with the thin code and when from there). A lot of the code has changed (e.g. using http_parser.rb instead of mongrel parser) but the original basis was Thin.
Launching the server is just a matter of executing your .rb file. The system is the same as Sinatra uses (I borrowed the code from Sinatra to make it work). You can also write you own server if you want, there are examples in the repo if you need the extra control. For us, we wanted the launching to be as simple as possible and require as few files created as possible. So, launching .rb file and using God to bringup/restart servers worked well.
Tests you write with RSpec/Test::Unit and run the test file as you normally would. The tests for Goliath will fire up the reactor and send real requests to the API from your unit tests (note, this doesn't fork, it uses EM to run the reactor in the same process as the tests). All this stuff is wrapped in a test_helper that goliath provides.
There is no rackup file with Goliath. You run the .rb file directly. The Goliath application has the middleware use commands baked straight into the .rb file. For us at PostRank, this was the easiest and clearest way to define the server. You had all of your use statements (with any extra bits they use) visible as you work on the file instead of having multiple files. For us, this was a win, your mileage may vary.
I have no idea if websocket-rack would work but, there is a branch in the repo for baking websocket support straight into Goliath. I haven't looked at it in a while (there were some upstream bugs that got fixed that were required) but it shouldn't be too hard to get it up and running and, with the upstream fixed, merged into master.
To your question about em-synchrony and thin, you should just be able to wrap an EM.synchrony {} block around your code. The synchrony method just calls down to EM.run and wraps your block in a new fiber. If the reactor is already running EM will just execute the passed block immediately. As long as Thin has already started the reactor this should work fine.
Update: The websockets branch as been merged into Goliath mainline, so there is WebSocket support baked straight into Goliath if you're running from HEAD.
Here's an example of how to add async support to Camping: https://gist.github.com/1192720 (see 65 for the code you'll have to use in your app). Maybe we should wrap it up in a gem or something…
Have you looked at Cramp - http://cramp.in ? Cramp is fully async and has an in-built websockets support.

What's the best option for a framework-agnostic Ruby background worker library?

I'm building a simple recipe search engine with Ruby and Sinatra for an iPhone app, using RabbitMQ for my message queue. I'm looking around and finding a lot of different implementation choices for background processes, but most of them either implement custom message queue algorithms or operate as Rails plugins.
What's out there in terms of high-quality framework-agnostic worker libraries that will play nicely with RabbitMQ?
And are there any best-practices I should keep in mind while writing the worker code, beyond the obvious:
# BAD, don't do this!
begin
# work
rescue Exception
end
I am using Beanstalk and have written my own daemons using the daemons gem. Daemon kit is a new project but queue loops are not yet implemented. You can also have a look at Nanite if it fits your needs, it's framework-agnostic.
I ended up writing my own library in a fit of uncontrollable yak-shaving. Daemon kit was the right general idea, but seriously way too heavyweight for my needs. I don't want what looks like a full rails app for each of my daemons. I'm going to end up with at least 3 daemons, and that would be a colossal mess of directories. The daemons gem has a horrible API, and while I was tempted to abstract it away, I realized it was probably easier to just manage the fork myself, so that's what I did.
API looks like this:
require "rubygems"
require "chaingang"
class Worker
def setup
# Set up connections here
end
def teardown
# Tear down connections here
end
def call
# Do some work
sleep 1
end
end
ChainGang.prepare(Worker.new)
And then you just use the included rake task to start/stop/restart or check status. I took a page from the Rack playbook: anything that implements the call method is fair game as an argument to ChainGang.prepare and ChainGang.work methods, so a Proc is a valid worker object.
Took me longer to build than it would've to use something else, but I have a vague suspicion that it'll pay off in the long-run.
Check out nanite (written in Ruby), it's a young project written atop rabbitmq.
github.com/ezmobius/nanite/tree/master

Resources