How to recover a crashed EventMachine loop - ruby

I'm using Unicorn on Heroku and I created an EventMachine loop:
(from https://gist.github.com/jonkgrimes/5103321)
after_fork do |server, worker|
  defined?(ActiveRecord::Base) and
    ActiveRecord::Base.establish_connection
  if defined?(EventMachine)
    unless EventMachine.reactor_running? && EventMachine.reactor_thread.alive?
      if EventMachine.reactor_running?
        EventMachine.stop_event_loop
        EventMachine.release_machine
        EventMachine.instance_variable_set("@reactor_running", false)
      end
      Thread.new { EventMachine.run }
    end
  end
  Signal.trap("INT") { EventMachine.stop }
  Signal.trap("TERM") { EventMachine.stop }
end
The EventMachine works great, but at some point my events start failing because "no eventmachine loop is running." I imagine two possible problems:
the loop is still running but somehow my unicorn forks are no longer bound properly (seems unlikely)
the loop crashed (seems likely)
How can I detect and restart a crashed eventmachine? And/or how should I go about debugging this problem?
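One way to approach the "detect and restart" part is to supervise the thread that runs the loop. This is only a minimal sketch using the standard library: `supervise` is a hypothetical helper, and the block stands in for `EventMachine.run`. With a real reactor, the cleanup calls from the hook above (`release_machine` and resetting `@reactor_running`) may also be needed before each restart; that part is an assumption and is not verified here.

```ruby
require "stringio"

# Hypothetical helper: runs the block in a thread and restarts it when it raises.
# In the Unicorn hook the block would be something like `{ EventMachine.run }`.
def supervise(max_restarts: 5, pause: 0.01, log: $stderr, &body)
  Thread.new do
    restarts = 0
    begin
      body.call
    rescue StandardError => e
      # Logging the error here is also what makes the crash debuggable.
      log.puts "loop crashed: #{e.class}: #{e.message}"
      restarts += 1
      if restarts <= max_restarts
        sleep pause
        retry                      # run the block again from the top
      end
      log.puts "giving up after #{max_restarts} restarts"
    end
  end
end

# Stand-in for the reactor: fails twice, then returns normally.
attempts = 0
supervisor = supervise(log: StringIO.new) do
  attempts += 1
  raise "simulated reactor crash" if attempts < 3
end
supervisor.join
puts "attempts: #{attempts}"
```

Logging the exception class and message before restarting is the useful part for debugging: it shows why the loop died rather than only that it died.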

Related

Wait for eventmachine queue to be empty?

I'm using a Ruby library that assumes it's running inside eventmachine (Faye) and starts the eventmachine reactor in a separate thread if it isn't inside an EM.run context. When the Rails application is started inside a thin server, no problem. But for background jobs the Rails environment is loaded and then resque is started, which spawns a new process for every job (the self.perform method below).
So, I know that I have the reactor running, but I need to know when it is safe to return from self.perform because that will exit the current process and cut off any pending actions in the EM reactor. Alternatively I could run the job inside an EM.run block but would then need to know when it is safe to exit.
class AsanaPopulateJob
  @queue = :profiles
  def self.logger; Rails.logger; end
  def self.perform(ident_id)
    begin
      logger.debug "==> Start AsanaPopulateJob[#{ident_id}]"
      ident = Identity.find(ident_id)
      conn = AsanaConnector.new(ident)
      Faye.ensure_reactor_running!
      conn.populate_profile!
      logger.debug "=== Sleep 5..."
      sleep(5)
    ensure
      logger.debug "<== Done AsanaPopulateJob[#{ident_id}]"
    end
  end
end
Right now, I'm running my function call that invokes em-hiredis and then sleeping for 5 seconds to let things settle down. Surely there's something better.
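A common alternative to a fixed sleep is to block on a completion signal with a timeout. The sketch below assumes the asynchronous work can invoke a callback when it finishes; whether `populate_profile!` or em-hiredis exposes such a hook is not stated in the question, so `populate_async` is a hypothetical stand-in built on a plain thread.

```ruby
require "timeout"

# Hypothetical stand-in for an asynchronous call such as conn.populate_profile!:
# it works on another thread and invokes the callback when it is finished.
def populate_async(&on_done)
  Thread.new do
    sleep 0.05                 # pretend to talk to a remote service
    on_done.call(:ok)
  end
end

done = Queue.new               # thread-safe hand-off between the two threads
populate_async { |status| done << status }

# Block until the callback fires, but never longer than 5 seconds.
result = Timeout.timeout(5) { done.pop }
puts "finished with #{result}"
```

The job then returns as soon as the work is done, and the timeout keeps a lost callback from hanging the worker forever.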

Starting a Sinatra app in a new thread. The thread immediately dies

I have the following Sinatra app defined:
require "sinatra/base"
class App < Sinatra::Base
  configure do
    set port: 5000
  end
  get "/" do
    "Hello!"
  end
end
From inside a Rails app, I am trying to start the Sinatra app in the background:
Thread.new do
  App.run!
end
But it seems that the thread immediately dies. There is nothing keeping it alive.
How can I make it so that the Sinatra app will startup in the new thread and run indefinitely (or for at least the lifetime of the app)?
Thread.new do
  App.run!
end
I'm willing to bet that App.run! is raising an exception. Thread.new with a block has a nasty habit of swallowing exceptions:
https://bugs.ruby-lang.org/issues/6647
Do the following:
Thread.new do
  begin
    App.run!
  rescue StandardError => e
    $stderr << e.message
    $stderr << e.backtrace.join("\n")
  end
end
and see whether you see anything logged to stderr.
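Another way to surface the error, sketched here with only core Ruby, is to join the thread: `Thread#join` re-raises the exception that killed it. The `raise` below is a stand-in for whatever `App.run!` might be raising, which is an assumption. Note that recent Rubies (2.5 and later) also print uncaught thread exceptions to stderr by default via `report_on_exception`; it is switched off here only to keep the demo output clean.

```ruby
worker = Thread.new do
  Thread.current.report_on_exception = false  # silence the automatic report for this demo
  raise ArgumentError, "server failed to boot" # stand-in for a failing App.run!
end

caught = nil
begin
  worker.join                  # join re-raises the exception that killed the thread
rescue ArgumentError => e
  caught = e.message
end
puts "thread died with: #{caught}"
```

Joining only makes sense at a point where the caller can afford to block, for example during shutdown or in a small supervising thread.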

Kill all threads on terminate

I'm trying to create an app in ruby which can be started from the command line and does two things: it runs a continuous job (a loop with sleep which runs some action [remote feed parsing]) in one thread and sinatra in a second thread. My code (simplified) looks like this:
require 'sinatra'
class MyApp < Sinatra::Base
  get '/' do
    "Hello!"
  end
end
threads = []
threads << Thread.new do
  loop do
    # do something heavy
    sleep 10
  end
end
threads << Thread.new do
  MyApp.run!
end
threads.each { |t| t.join }
The above code actually does its job very well: the sinatra app is started and available on port 4567, and the "do something heavy" task is fired every 10 seconds. However, I'm not able to kill that script.
I'm running it with ruby app.rb, but killing it with ctrl + c is not working. It kills just the sinatra thread; the second one is still running and, to stop the script, I need to close the terminal window.
I was trying to kill all the threads on SIGINT, but that's also not working as expected:
trap "SIGINT" do
  puts "Exiting"
  threads.each { |t| Thread.kill t }
  exit 130
end
Can you help me with this? Thanks in advance.
To trap ctrl-c, change "SIGINT" to "INT".
trap("INT") {
  puts "trapping"
  threads.each { |t|
    puts "killing"
    Thread.kill t
  }
}
To configure Sinatra to skip catching traps:
class MyApp < Sinatra::Base
  configure do
    set :traps, false
  end
  ...
Reference: Ruby Signal module
To list the available Ruby signals: Signal.list.keys
Reference: Sinatra Intro
(When I run your code and trap INT, I do get a Sinatra socket warning "Already in use". I presume that's fine for your purposes, or you can solve that by doing a Sinatra graceful shutdown. See Sinatra - terminate server from request)
Late to the party, but trap has one big disadvantage: it gets overridden by the webserver. For example, Puma sets several traps of its own, which basically means yours never gets called.
The best workaround is to use at_exit, which can be defined multiple times; Ruby makes sure all the blocks are called. I haven't tested whether this would work for your case, though.
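An alternative to killing threads from inside the handler is to have the handler flip a flag that the worker loop checks. This is a rough sketch with core Ruby only: the loop body is a stand-in for the heavy job, `Process.kill` simulates pressing ctrl-c, and it assumes no web server replaces the INT trap afterwards (the caveat raised above).

```ruby
stop = false
trap("INT") { stop = true }       # keep the handler tiny: just flip a flag

ticks = 0
worker = Thread.new do
  until stop
    ticks += 1                    # stand-in for "do something heavy"
    sleep 0.01
  end
end

Process.kill("INT", Process.pid)  # simulate ctrl-c for this demo
worker.join(2)                    # give the loop up to 2 seconds to notice the flag
puts "worker alive? #{worker.alive?}"
```

Because the worker leaves its loop on its own, it can finish the current iteration and clean up, which Thread.kill does not guarantee.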

Rufus-Scheduler, DaemonKit and traps

I daemonized a Ruby scheduler script (using Rufus) with Rufus-Scheduler DaemonKit and I'm trying to trap the TERM or INT signals to have the application try to save state before quitting.
DaemonKit has its own trap_state (private) method and it catches the signal before the daemon script so even though I have this block, it doesn't do much.
DaemonKit::Application.running! do |config|
  surprise = Surprise.new(interval, frequency, false)
  surprise.start
  config.trap( 'SIGINT' ) do # tried INT and TERM as well
    puts 'Exiting'
    surprise.stop
    File.delete($lock)
  end
end
As a side effect (maybe a mistake in my implementation ?) after sigterm the .rufus lockfile is still there
The behavior on ctrl-c right now is this
[daemon-kit]: DaemonKit (0.3.1) booted, now running surprise
log writing failed. can't be called from trap context
[daemon-kit]: Running signal traps for INT
log writing failed. can't be called from trap context
[daemon-kit]: Running shutdown hooks
log writing failed. can't be called from trap context
[daemon-kit]: Shutting down surprise
The start method is a pretty simple schedule
def start
  @scheduler = Rufus::Scheduler.new(:lockfile => $lock)
  @scheduler.every '1d', :first_at => @first, :overlap => false do |job|
    ... # some work
  end
  @scheduler.join
end
def stop
  # save state
  @scheduler.shutdown
end
Looking at your own answer, and the following code you pasted:
def start
  @scheduler = Rufus::Scheduler.new(:lockfile => $lock)
  # ...
  @scheduler.join # <- NOT NEEDED
end
DaemonKit's DaemonKit::Application.running! block actually never finishes running, so you could safely skip calling #join on any thread.
We should work on making this use case clearer, as I would love to see it used more widely for this kind of work.
So it's very simple, I need to configure the trap proc (or block in my case) BEFORE I run the scheduler in the start method. Not feeling very clever right about now, but the following code works as expected. For reference, the set_trap is private in DK but the public trap method overrides the defaults that come with the DK startup.
DaemonKit::Application.running! do |config|
  surprise = Surprise.new(interval, frequency, false)
  config.trap("TERM") { surprise.stop }
  config.trap( "INT" ) { surprise.stop }
  surprise.start
end
Interestingly, I saw this line on startup that I hadn't noticed before:
[daemon-kit]: Trapping SIGINT signals not supported on this platform
INT and TERM both work, though.
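The "log writing failed. can't be called from trap context" lines in the output above come from logging inside a signal handler. One general way around that, independent of DaemonKit, is to have the handler write to a pipe and do the logging and state saving back in normal context. This is only a sketch with the standard library: the StringIO log target and the `Process.kill` call exist just to make it self-contained, and how this would be wired into DaemonKit's own trap handling is not covered.

```ruby
require "logger"
require "stringio"

log_output = StringIO.new
logger = Logger.new(log_output)

reader, writer = IO.pipe
# The handler only writes the signal name to a pipe: no logging, no locks.
trap("TERM") { writer.write_nonblock("TERM\n") }

Process.kill("TERM", Process.pid)   # simulate the daemon receiving TERM

# Back in normal context: wait for the handler's message, then log and clean up.
if IO.select([reader], nil, nil, 2)
  signal_name = reader.gets.chomp
  logger.info("received #{signal_name}, saving state")
end
puts log_output.string
```

In a real daemon, the IO.select call would sit in the main loop or a dedicated thread, and the state saving (for example `surprise.stop`) would go where the logger call is.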

Ruby, windows, active_record, and Control-C

What is active_record doing to the signal processes under windows (I don't see this with the same versions on the mac) that causes it to behave so strangely? For instance:
require 'rubygems'
trap("INT"){puts "interrupted"}
puts __LINE__
sleep 5
require 'active_record'
trap("INT"){puts "interrupted again"}
puts __LINE__
sleep 5
When I run the above code (ruby 1.8.6, gem 1.3.1, activerecord 2.2.2,) I can hit ^C as many times as I like during the first sleep, but the first interrupt after the require of activerecord causes the script to terminate. In the above case, the trap still executes, it only fails to allow the program to continue. Usually.
Removing the second call to trap does not have any effect upon the behaviors.
The real annoyance is that in some conditions, the trap fails to execute at all. Considering that the whole point of doing this is to get my code to clean up after itself (remove its footprint in the database so the next guy sees a sane state,) this is a real problem. For instance:
require 'rubygems'
require 'active_record'
trap("INT"){puts "interrupted"}
puts __LINE__
gets
Pressing ^C after seeing the puts will not execute the trap at all.
I only see this problem after requiring active_record. Is there a workaround? I'd be curious to know if this is a bug or if there is an explanation of some sort. As I said, I have no issue with this on the mac - repeated ^Cs result in multiple executions of the trap proc.
thanks...
Considering that the whole point of doing this is to get my code to clean up after itself (remove its footprint in the database ...
Have you considered just using a database transaction? It seems like it would be a much easier way to solve the problem.
I saw a different pattern when trying to duplicate this problem:
puts "start"
trap("INT") { puts "interrupted" }
sleep 5
puts "end"
On Ubuntu (Ruby 1.8.6) this produces
start
interrupted
interrupted
(etc)
interrupted
end
So "interrupted" prints each time Ctrl-C is pressed, until the 5 seconds are up. Under Windows (also Ruby 1.8.6), this produces:
start
interrupted
end
i.e. it prints "interrupted" once and then exits.
So it appears that while handling SIGINT Ruby exits the sleep routine and continues on to the next statement. My guess (hand-waving) is that this is somehow due to Ruby using green threads instead of native threads on Windows. Any experts please chime in here.
You could emulate the Unix-y behavior by restarting sleep in the handler:
puts "start"
trap("INT") do
  puts "interrupted"
  sleep 5
end
sleep 5
puts "end"
Unfortunately this resets the timer each time SIGINT is trapped, so it needs some hacking:
$interval = 5
def go_to_sleep(secs)
  $started = Time.now
  sleep secs
end
trap("INT") do
  puts "interrupted"
  time_to_sleep = [0, $interval - (Time.now - $started)].max
  if time_to_sleep > 0
    sleep time_to_sleep
  end
end
puts "one"
go_to_sleep($interval)
puts "two"
go_to_sleep($interval)
puts "three"
go_to_sleep($interval)
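The same idea can be written without globals by sleeping against a deadline and resuming whenever sleep returns early. This is a sketch rather than a tested fix for the Windows behavior described above: `sleep_fully` is a hypothetical helper name, and `Process.clock_gettime` needs Ruby 2.1 or later, so it does not apply to the 1.8.6 setup in the question.

```ruby
# Sleep until `secs` have elapsed, resuming if sleep returns early
# (for example because a signal handler ran).
def sleep_fully(secs)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + secs
  loop do
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    break if remaining <= 0
    sleep remaining
  end
end

trap("INT") { puts "interrupted" }

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
sleep_fully(0.2)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
puts "elapsed: #{elapsed.round(2)}s"
```

The monotonic clock is used instead of Time.now so that a system clock adjustment during the wait cannot shorten or lengthen it.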

Resources