Do multiple requests in Phusion Passenger run in their own threads? - ruby

I have a Ruby on Rails application deployed with Phusion Passenger + Apache web server. Does each request run in its own thread spawned by Phusion Passenger?

Passenger (along with most other application servers) runs no more than one request per thread. Typically there is also only one thread per process. From the Phusion Passenger docs:
Phusion Passenger supports two concurrency models:
process: single-threaded, multi-processed I/O concurrency. Each application process only has a single thread and can only handle 1 request at a time. This is the concurrency model that Ruby applications traditionally used. It has excellent compatibility (can work with applications that are not designed to be thread-safe) but is unsuitable for workloads in which the application has to wait for a lot of external I/O (e.g. HTTP API calls), and uses more memory because each process has a large memory overhead.
thread: multi-threaded, multi-processed I/O concurrency. Each application process has multiple threads (customizable via PassengerThreadCount). This model provides much better I/O concurrency and uses less memory because threads share memory with each other within the same process. However, using this model may cause compatibility problems if the application is not designed to be thread-safe.
(Emphasis my own)

Passenger open source edition only uses one thread per application, as listed in your Apache virtual hosts files (not sure about Nginx). So you could conceivably have multiple instances of your app running on the same Apache server, but you would have to install your app into multiple directories, point vhost entries at them, and put some kind of load balancer in front of them.
Passenger enterprise enables much more control over concurrency.

Related

Worker, Threads & Pool size using Puma

If I have a server with 1 core, how many puma workers, threads and what database pool size is appropriate?
What's the general rule of thumb here?
Not an easy answer.
The two main sources of information are:
Puma GitHub repository (the authors' point of view)
Heroku's web page (the main big user's point of view)
Unfortunately they are inconsistent, mostly because Heroku has different deployment metrics and terminology.
So I ended up following the Puma repository guidelines, which say:
One worker per core
Threads to be determined in light of available RAM and the application's behaviour
Threads = connection pool size
So the number of threads is mostly a matter of trial and error.
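As a rough illustration of that guideline, here is a minimal sketch. The helper name puma_sizing and the default of 5 threads per worker are assumptions made for this example; they are not part of Puma, and the thread count in particular is only a starting guess to be tuned by measurement.

```ruby
require "etc"

# Hypothetical helper (not part of Puma): turns the rule of thumb above
# into concrete starting values.
def puma_sizing(cores: Etc.nprocessors, threads_per_worker: 5)
  {
    workers: cores,               # one worker process per core
    threads: threads_per_worker,  # assumed starting point, tune against RAM and load
    db_pool: threads_per_worker   # each thread in a worker may hold one DB connection
  }
end

sizing = puma_sizing(cores: 1)
puts "workers=#{sizing[:workers]} threads=#{sizing[:threads]} pool=#{sizing[:db_pool]}"
```

In a real deployment these numbers would typically end up in config/puma.rb (the workers and threads settings) and in the pool entry of the database configuration. The pool is sized per process, which is why it follows the threads-per-worker figure rather than the total thread count.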

When making network requests, when should I use Threads vs Processes?

I'm working on a Ruby script that will be making hundreds of network requests (via open-uri) to various APIs and I'd like to do this in parallel since each request is slow, and blocking.
I have been looking at using Thread or Process to achieve this but I'm not sure which method to use.
With regard to network requests, when should I use a Thread over a Process, or does it not matter?
Before going into detail, there is already a library solving your problem. Typhoeus is optimized to run a large number of HTTP requests in parallel and is based on the libcurl library.
Like a modern code version of the mythical beast with 100 serpent
heads, Typhoeus runs HTTP requests in parallel while cleanly
encapsulating handling logic.
Threads run in the same process as your application. Since Ruby 1.9, native threads are used as the underlying implementation. Resources can be shared easily across threads, as they can all access the shared state of the application. The problem, however, is that you cannot utilize the multiple cores of your CPU with most Ruby implementations.
The standard Ruby implementation (MRI) uses a Global Interpreter Lock (GIL), a locking mechanism that ensures the shared state is not corrupted by parallel modifications from different threads. MRI does release the lock while a thread is blocked on I/O, so threads can still overlap slow network requests. Other Ruby implementations such as JRuby, Rubinius or MacRuby offer an approach without a GIL.
Processes run separately from each other. Processes do not share resources, which means every process has its own state. This can be a problem if you want to share data across your requests. Each process also allocates its own memory. You could still share data by using a message bus like RabbitMQ.
I cannot recommend using only threads or only processes. If you want to implement this yourself, you should use both: for every n requests, fork a new process, which in turn spawns a number of threads to issue the HTTP requests. Why?
If you fork another process for every HTTP request, you end up with too many processes. Although your operating system might be able to handle this, the overhead is still tremendous. Some HTTP requests might finish very quickly, so why bother with an extra process? Just run them in another thread.
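To make the thread half of that advice concrete, here is a minimal sketch of a fixed pool of threads working through a list of URLs. The method name fetch_all, the pool size of 8 and the injectable fetcher lambda are assumptions chosen for this illustration; the default fetcher uses open-uri as in the question, and there is no error handling or retry logic.

```ruby
require "open-uri"

# Fetches every URL using a fixed number of worker threads.
# `fetcher` is injectable so the sketch can be tried without network access.
def fetch_all(urls, thread_count: 8, fetcher: ->(url) { URI.open(url).read })
  queue = Queue.new
  urls.each { |url| queue << url }

  results = {}
  lock = Mutex.new

  workers = Array.new(thread_count) do
    Thread.new do
      loop do
        url = begin
          queue.pop(true) # non-blocking pop raises ThreadError once the queue is empty
        rescue ThreadError
          break
        end
        body = fetcher.call(url)
        lock.synchronize { results[url] = body } # guard the shared hash
      end
    end
  end

  workers.each(&:join)
  results
end

# Usage with a stand-in fetcher that only simulates a slow request:
pages = fetch_all(%w[a b c d], thread_count: 2, fetcher: ->(url) { sleep 0.05; url.upcase })
puts pages.size
```

A hybrid setup as described above would run something like this inside each forked process, with every process given its own slice of the URL list.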

Ruby Grape Reactor gets blocked

I am creating an API using Ruby Grape and I face the following problem.
When there is a new GET request, a large amount of data is requested, which takes a long time. In the meantime the Reactor is blocked and no new requests can be handled until the request is finished.
The code is quite straightforward:
class API < Grape::API
  resource :users do
    get do
      get_users()
    end
  end
end
get_users connects to another system by TCP and gets a large amount of data converted to JSON. This is done using a 3rd party gem.
What would be the best option to handle this type of situation?
I can think of two options:
Set up passenger/unicorn etc. with enough workers to handle concurrent requests.
If this is not enough: rework the API logic so that long operations are split into two calls: first, submit a request; second, check for completion and retrieve the result.
Also, if it is suitable, you could cache the result of get_users()
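The two-call idea can be sketched independently of Grape. The JobStore class below is hypothetical and written only for this illustration: it keeps jobs in memory inside a single process, so it assumes a single-process server, and a multi-process deployment would need a shared store instead.

```ruby
require "securerandom"

# Hypothetical in-memory job store: submit starts the slow work in a
# background thread and returns an id; fetch reports status and result.
class JobStore
  def initialize
    @jobs = {}
    @lock = Mutex.new
  end

  # First call: register the job and return immediately with its id.
  def submit(&work)
    id = SecureRandom.uuid
    @lock.synchronize { @jobs[id] = { status: :pending, result: nil } }
    Thread.new do
      begin
        result = work.call
        @lock.synchronize { @jobs[id] = { status: :done, result: result } }
      rescue StandardError => e
        @lock.synchronize { @jobs[id] = { status: :failed, result: e.message } }
      end
    end
    id
  end

  # Second call: returns a copy of the job record, or nil for an unknown id.
  def fetch(id)
    @lock.synchronize { @jobs[id]&.dup }
  end
end

store = JobStore.new
job_id = store.submit { sleep 0.1; "users as JSON" } # stands in for get_users()
puts store.fetch(job_id)[:status]
```

In the API, one endpoint would call submit and return the id, and a second endpoint would call fetch with that id until the status is no longer pending.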
Your application performs a long-running blocking I/O operation. To handle these kinds of workloads well, your system needs to support high I/O concurrency.
Traditional single-threaded multi-process systems such as Phusion Passenger open source and Unicorn are not suitable for these kinds of workloads. The amount of concurrency they can handle is limited by the number of processes. This problem is documented on Unicorn's philosophy page, section "Just Worse in Some Cases", or on the recent Phusion article about tuning Phusion Passenger's concurrency.
While Thin is in theory capable of handling high I/O concurrency due to its evented I/O model, applications and frameworks must be explicitly written to take advantage of this. Few frameworks do this. Neither Rails nor Sinatra supports evented I/O. Cramp supports it and there was another new evented framework whose name I've forgotten. But it seems Grape does not support evented I/O.
The solution would be to switch to a multithreading-capable application server, which is also capable of supporting high I/O concurrency. One such application server is Phusion Passenger 4 Enterprise, which supports a hybrid multithreaded/multiprocess model. Multithreading is for concurrency, while multiprocess is for stability and the ability to leverage multiple CPU cores. The Phusion blog describes optimal concurrency settings for different workloads.

Why do Rails applications run the garbage collector at all?

I was pretty sure that all Rack application servers (I have some experience with Unicorn and Passenger) create a single process for every worker at startup, and that its state is then "frozen".
Whenever the app server receives a request to handle, it forks from the master process, and all further changes to the forked process are separated from the original process. The forks benefit from copy-on-write optimizations and are safe to be "damaged" by processing the request. All changes to the environment affect only a single process that will be discarded anyway.
If my picture of the RoR application stack were true, there should be almost no need for garbage collection, unless serving a single request took a lot of time and memory (which usually is not the case).
On the other hand, a question about GC measurements done with NewRelic and its answers led me to the conclusion that I must be completely wrong.
Can someone clarify this process?
Rack application servers do not fork on each request, only during initialization:
First, the environment is loaded in one process
Then, the server forks several workers
Then all requests are distributed among these processes
Because each worker lives on and serves many requests, the garbage collector is needed to keep each process's memory clean and stable.
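The three steps above can be sketched in plain Ruby. This is only an illustration under stated assumptions: it requires a Unix-like platform where fork is available, the method name run_preforking is made up for this example, and "handling a request" is reduced to writing one line to a pipe.

```ruby
# Loads the "app" once, forks the workers once, then lets every worker
# handle several requests. Nothing is forked per request.
def run_preforking(worker_count:, requests_per_worker:)
  master_pid = Process.pid # stands in for loading the environment in one process
  reader, writer = IO.pipe

  pids = Array.new(worker_count) do
    fork do
      reader.close
      # The same long-lived worker serves many requests, so garbage
      # accumulates in it over time and has to be collected.
      requests_per_worker.times do |i|
        writer.puts "#{Process.pid} served request #{i} (app booted in #{master_pid})"
      end
      writer.close
    end
  end

  writer.close
  lines = reader.readlines(chomp: true)
  reader.close
  pids.each { |pid| Process.wait(pid) }
  lines
end

log = run_preforking(worker_count: 2, requests_per_worker: 3)
puts log.map { |line| line.split.first }.uniq.size # number of distinct worker processes
```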

Is Sinatra multi threaded?

Is Sinatra multi-threaded? I read elsewhere that "sinatra is multi-threaded by default"; what does that imply?
Consider this example
get "/multithread" do
  t1 = Thread.new {
    puts "sleeping for 10 sec"
    sleep 10
    # Actually make a call to a third-party API using Net::HTTP or whatever.
  }
  t1.join
  "multi thread"
end

get "/dummy" do
  "dummy"
end
If I access "/multithread" and then "/dummy" in another tab or browser, nothing can be served (in this case for 10 seconds) until the "/multithread" request has completed. If that activity freezes, the application becomes unresponsive.
How can we work around this without spawning another instance of the application?
tl;dr Sinatra works well with Threads, but you will probably have to use a different web server.
Sinatra itself does not impose any concurrency model; it does not even handle concurrency. This is done by the Rack handler (web server), like Thin, WEBrick or Passenger. Sinatra itself is thread-safe, meaning that if your Rack handler uses multiple threads to serve requests, it works just fine. However, since Ruby 1.8 only supports green threads and Ruby 1.9 has a global VM lock, threads are not that widely used for concurrency, since on both versions threads will not run truly in parallel. They will, however, on JRuby or the upcoming Rubinius 2.0 (both alternative Ruby implementations).
Most existing Rack handlers that use threads will use a thread pool in order to reuse threads instead of actually creating a thread for each incoming request, since thread creation is not free, especially on 1.9 where threads map 1:1 to native threads. Green threads have far less overhead, which is why fibers, which are basically cooperatively scheduled green threads, as used by sinatra-synchrony, became so popular recently. Be aware that with sinatra-synchrony any network communication will have to go through EventMachine, so you cannot use the mysql gem, for instance, to talk to your database.
Fibers scale well for network-intensive processing, but fail miserably for heavy computations. You are less likely to run into race conditions, a common pitfall with concurrency, if you use fibers, as they only do a context switch at clearly defined points (with synchrony, whenever you wait for IO).
There is a third common concurrency model: processes. You can use a preforking server or fire up multiple processes yourself. While this seems a bad idea at first glance, it has some advantages: on the normal Ruby implementation, this is the only way to use all your CPUs simultaneously. And you avoid shared state, so no race conditions by definition. Also, multiprocess apps scale easily over multiple machines. Keep in mind that you can combine multiple processes with other concurrency models (evented, cooperative, preemptive).
The choice is mainly made by the server and middleware you use:
Multi-Process, non-preforking: Mongrel, Thin, WEBrick, Zbatery
Multi-Process, preforking: Unicorn, Rainbows, Passenger
Evented (suited for sinatra-synchrony): Thin, Rainbows, Zbatery
Threaded: Net::HTTP::Server, Threaded Mongrel, Puma, Rainbows, Zbatery, Thin[1], Phusion Passenger Enterprise >= 4
[1] since Sinatra 1.3.0, Thin will be started in threaded mode, if it is started by Sinatra (i.e. with ruby app.rb, but not with the thin command, nor with rackup).
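To see why a threaded server makes the difference for the example in the question, here is a small simulation in plain Ruby (no Sinatra involved). The method name serve, the 0.3 second sleep standing in for the 10 second one, and the queue-based pool are all assumptions made for this sketch; real Rack handlers are considerably more involved.

```ruby
# Serves the given paths with `pool_size` worker threads and returns, for
# each path, how many seconds after the start its response was ready.
def serve(paths, pool_size:)
  handlers = {
    "/multithread" => -> { sleep 0.3; "multi thread" }, # slow, blocking route
    "/dummy"       => -> { "dummy" }                    # fast route
  }

  queue = Queue.new
  paths.each { |path| queue << path }
  queue.close # pop returns nil once the closed queue is empty

  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  finished = {}
  lock = Mutex.new

  Array.new(pool_size) do
    Thread.new do
      while (path = queue.pop)
        handlers.fetch(path).call
        elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
        lock.synchronize { finished[path] = elapsed }
      end
    end
  end.each(&:join)

  finished
end

single   = serve(%w[/multithread /dummy], pool_size: 1)
threaded = serve(%w[/multithread /dummy], pool_size: 2)
puts format("/dummy ready after %.2fs with 1 thread, %.2fs with 2 threads",
            single["/dummy"], threaded["/dummy"])
```

With one thread the fast route has to wait behind the slow one; with two threads it is answered almost immediately, even under a global VM lock, because the sleeping thread gives up the lock while it waits.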
While googling around, I found this gem:
sinatra-synchrony
which might help you, because it touches on your question.
There is also a benchmark in which they did nearly the same thing you want (external calls).
Conclusion: EventMachine is the answer here!
Thought I might elaborate for people who come across this. Sinatra includes this little chunk of code:
server.threaded = settings.threaded if server.respond_to? :threaded=
Sinatra will detect which gem you have installed as a web server (e.g. Thin, Puma, whatever), and if the server responds to "threaded=" Sinatra will set it to be threaded if requested. Neat.
After making some changes to the code I was able to run a Padrino/Sinatra application on mizuno. Initially I tried to run the Padrino application on JRuby but it was simply too unstable and I did not investigate why. I was facing JVM crashes when running on JRuby. I also went through this article, which makes me wonder why one would choose Ruby at all if deployment is anything but easy.
Is there any discussion on deployment of applications in ruby? Or can I spawn a new thread :)
I've been getting into JRuby myself lately and I am extremely surprised how simple it is to switch from MRI to JRuby. It pretty much involves swapping out a few gems (in most cases).
You should take a look at the combination of JRuby and Trinidad (app server). Torquebox also seems to be an interesting all-in-one solution; it comes with a lot more than just an app server.
If you want to have an app server that supports threading, and you're familiar with Mongrel, Thin, Unicorn, etc., then Trinidad is probably the easiest to migrate to since it's practically identical from the user's perspective. Loving it so far!
