Ruby Threading Performance: Slow Thread.pass - ruby

We're running a threaded ruby server (Puma), and have seen serious performance issues with our Sinatra app. Specifically, something as simple as Thread.pass can take over 2s. How is it possible that a server with 16 threads can take over 2s to return control to a thread? Is the Ruby scheduler that bad, or is there something we can do to fix this?
Details:
Ruby implementation: MRI 2.1
Sinatra App
Running on Heroku 1x dynos
Puma server, running 16 threads, 1 process
Some routes are doing fairly heavy work, but routes doing almost no work are impacted
Over 100MB in free memory
Thanks in advance!

The time that Thread.pass takes is a non-specified value, it may take 10s or it might not pass at all (i.e. continue execution immediately).
Thread.pass is more of a hint or a suggestion.

Long story short: it's the heroku virtual machine.
Sometimes your whole virtual machine pauses, so the program (in whatever language) just stops responding for a few seconds. Running on dedicated boxes 100% resolved this issue. Heroku 1x/2x dynos don't really seem reliable for applications where multi-seconds pauses are unacceptable. I get that sharing resources is needed, but completely pausing the world for multiple seconds is too much. Seems like their scheduling could use some work.

Related

Unbounded memory growth after Rack::Timeout in a puma Rails app

We're using Ruby 2.3.3 with a Rails 5.0 app, running on puma 3.8.2, and rack-timeout 0.4.2.
We get semi frequent Rack::Timeout exceptions that have no ill effect. Occasionally though, something bad will happen that causes a slew of timeouts, and after that point, the puma worker process seems to just keep generating more and more live objects. GC appears to still be running - at least, NewRelic is still reporting time spent in GC - but it doesn't seem to be having much effect:
The worker process keeps serving requests, but gets slower & slower as the heap size goes up, before eventually dying with out-of-memory errors.
I know this is quite a vague question, but I wondered if anyone had run into anything similar. Any tips for diagnosing it further?

Google app engine: Performance differences between background thread and a task queue

In my GAE app, there are some tasks which were being performed in taskqueues, asynchronously. These tasks involve fetching data from an API and processing it. As the app matured, the data grew in size so the taskqueue limit of 10 minutes started to break in some cases. To fix this I moved the app to GAE modules architecture and moved the asynchronous execution to a backgrouindThread. This works but the execution has become a lot slower, typically taking 8x time. What is the reason behind this? What is the solution/workaround for this, if any?
EDIT:
I suppose it's because of poor multithreaded performance of python. Is it true? Can anyone confirm this?

Is there any reson to not reduce Ping Maximum Response Time in IIS 7

IIS includes a worker process health check "ping" function that pings worker processes every 90 seconds by default and recycles them if they don't respond. I have an application that is chronically putting app pools into a bad state and I'm curious if there is any reason not to lower this time to force IIS to recycle a failed worker process quicker. Searching the web all I can find is people that are increasing the time to allow for debugging. It seems like 90 seconds is far to high for a web application, but perhaps I'm missing something.
Well the obvious answer is that in some situations requests would take longer than 90 seconds for the worker process to return. If you can't imagine a situation where this would be appropriate, then feel free to lower it.
I wouldn't recommend going too much lower than 30 seconds. I can see situations where you get in recycle loops. However you can do testing and see what makes sense in your situation. I would recommend Siege for load testing to see how your application behaves.

Eventmachine memory management

I'm running an eventmachine process on heroku, and it seems to be hitting their memory limit of 512MB after an hour or so. I start seeing messages like this:
Error R14 (Memory quota exceeded)
Process running mem=531M(103.8%)
I'm running a lot of events through the reactor, so I'm thinking maybe the reactor is getting backed up (I'm imagining it as a big queue)? But there could be some other reason, I'm still fairly new to eventmachine.
Are there any good ways to profile eventmachine and and get some stats on it. As a simple example, I was hoping to see how many events were scheduled in the queue to see if it was getting backed up and keeping those all in memory. But if anyone has other suggestions I'd really appreciate it.
Thanks!
I use eventmachine extensively and never ran into any memory leak inside the reactor so your bet is that the ruby code is but without knowing more about your application it is hard to give you a real answer.
The only queue I can think of right now is the thread pool, each time you use the defer method the block is either given to a free thread or queued waiting for a free thread, I suppose if all your threads are blocking waiting for something the queue could grow and use all the memory available.
The leak turned out to be in Mongoid's identity_map (nothing to do with EventMachine). Setting Mongoid.identity_map_enabled = false at the beginning of the event machine process resolved it.

Is Sinatra multi threaded?

Is Sinatra multi-threaded? I read else where that "sinatra is multi-threaded by default", what does that imply?
Consider this example
get "/multithread" do
t1 = Thread.new{
puts "sleeping for 10 sec"
sleep 10
# Actually make a call to Third party API using HTTP NET or whatever.
}
t1.join
"multi thread"
end
get "/dummy" do
"dummy"
end
If I access "/multithread" and "/dummy" subsequently in another tab or browser then nothing can be served(in this case for 10 seconds) till "/multithread" request is completed. In case activity freezes application becomes unresponsive.
How can we work around this without spawning another instance of the application?
tl;dr Sinatra works well with Threads, but you will probably have to use a different web server.
Sinatra itself does not impose any concurrency model, it does not even handle concurrency. This is done by the Rack handler (web server), like Thin, WEBrick or Passenger. Sinatra itself is thread-safe, meaning that if your Rack handler uses multiple threads to server requests, it works just fine. However, since Ruby 1.8 only supports green threads and Ruby 1.9 has a global VM lock, threads are not that widely used for concurrency, since on both versions, Threads will not run truly in parallel. The will, however, on JRuby or the upcoming Rubinius 2.0 (both alternative Ruby implementations).
Most existing Rack handlers that use threads will use a thread pool in order to reuse threads instead of actually creating a thread for each incoming request, since thread creation is not for free, esp. on 1.9 where threads map 1:1 to native threads. Green threads have far less overhead, which is why fibers, which are basically cooperatively scheduled green threads, as used by the above mentioned sinatra-synchrony, became so popular recently. You should be aware that any network communication will have to go through EventMachine, so you cannot use the mysql gem, for instance, to talk to your database.
Fibers scale well for network intense processing, but fail miserably for heavy computations. You are less likely to run into race conditions, a common pitfall with concurrency, if you use fibers, as they only do a context switch at clearly defined points (with synchony, whenever you wait for IO). There is a third common concurrency model: Processes. You can use preforking server or fire up multiple processes yourself. While this seems a bad idea at first glance, it has some advantages: On the normal Ruby implementation, this is the only way to use all your CPUs simultaniously. And you avoid shared state, so no race conditions by definition. Also, multiprocess apps scale easily over multiple machines. Keep in mind that you can combine multiple process with other concurrency models (evented, cooperative, preemptive).
The choice is mainly made by the server and middleware you use:
Multi-Process, non-preforking: Mongrel, Thin, WEBrick, Zbatery
Multi-Process, preforking: Unicorn, Rainbows, Passenger
Evented (suited for sinatra-synchrony): Thin, Rainbows, Zbatery
Threaded: Net::HTTP::Server, Threaded Mongrel, Puma, Rainbows, Zbatery, Thin[1], Phusion Passenger Enterprise >= 4
[1] since Sinatra 1.3.0, Thin will be started in threaded mode, if it is started by Sinatra (i.e. with ruby app.rb, but not with the thin command, nor with rackup).
While googling around, found this gem:
sinatra-synchrony
which might help you, because it touches you question.
There is also a benchmark, they did nearly the same thing like you want (external calls).
Conclusion: EventMachine is the answer here!
Thought I might elaborate for people who come across this. Sinatra includes this little chunk of code:
server.threaded = settings.threaded if server.respond_to? :threaded=
Sinatra will detect what gem you have installed for a webserver (aka, thin, puma, whatever.) and if it responds to "threaded" will set it to be threaded if requested. Neat.
After making some changes to code I was able to run padrino/sinatra application on mizuno
. Initially I tried to run Padrino application on jRuby but it was simply too unstable and I did not investigate as to why. I was facing JVM crashes when running on jRuby. I also went through this article, which makes me think why even choose Ruby if deployment can be anything but easy.
Is there any discussion on deployment of applications in ruby? Or can I spawn a new thread :)
I've been getting in to JRuby myself lately and I am extremely surprised how simple it is to switch from MRI to JRuby. It pretty much involves swapping out a few gems (in most cases).
You should take a look at the combination JRuby and Trinidad (App Server). Torquebox also seems to be an interesting all-in-one solution, it comes with a lot more than just an app server.
If you want to have an app server that supports threading, and you're familiar with Mongrel, Thin, Unicorn, etc, then Trinidad is probably the easiest to migrate to since it's practically identical from the users perspective. Loving it so far!

Resources