Wait for eventmachine queue to be empty? - ruby

I'm using a Ruby library that assumes its running inside eventmachine (Faye) and starts the eventmachine reactor in a separate thread if it isn't inside an EM.run context. When the Rails application is started inside a thin server, no problem. But for background jobs the Rails environment is loaded and then resque is started, which spawns a new process for every job (the self.perform method below).
So, I know that I have the reactor running, but I need to know when it is safe to return from self.perform because that will exit the current process and cut off any pending actions in the EM reactor. Alternatively I could run the job inside an EM.run block but would then need to know when it is safe to exit.
class AsanaPopulateJob
#queue = :profiles
def self.logger; Rails.logger; end
def self.perform(ident_id)
begin
logger.debug "==> Start AsanaPopulateJob[#{ident_id}]"
ident = Identity.find(ident_id)
conn = AsanaConnector.new(ident)
Faye.ensure_reactor_running!
conn.populate_profile!
logger.debug "=== Sleep 5..."
sleep(5)
ensure
logger.debug "<== Done AsanaPopulateJob[#{ident_id}]"
end
end
end
Right now, I'm running my function call that invokes em-hiredis and then sleeping for 5 seconds to let things settle down. Surely there's something better.

Related

How to stop a process from within the tests, when testing a never-ending process?

I am developing a long-running program in Ruby. I am writing some integration tests for this. These tests need to kill or stop the program after starting it; otherwise the tests hang.
For example, with a file bin/runner
#!/usr/bin/env ruby
while true do
puts "Hello World"
sleep 10
end
The (integration) test would be:
class RunReflectorTest < TestCase
test "it prints a welcome message over and over" do
out, err = capture_subprocess_io do
system "bin/runner"
end
assert_empty err
assert_includes out, "Hello World"
end
end
Only, obviously, this will not work; the test starts and never stops, because the system call never ends.
How should I tackle this? Is the problem in system itself, and would Kernel#spawn provide a solution? If so, how? Somehow the following keeps the out empty:
class RunReflectorTest < TestCase
test "it prints a welcome message over and over" do
out, err = capture_subprocess_io do
pid = spawn "bin/runner"
sleep 2
Process.kill pid
end
assert_empty err
assert_includes out, "Hello World"
end
end
. This direction also seems like it will cause a lot of timing-issues (and slow tests). Ideally, a reader would follow the stream of STDOUT and let the test pass as soon as the string is encountered and then immediately kill the subprocess. I cannot find how to do this with Process.
Test Behavior, Not Language Features
First, what you're doing is a TDD anti-pattern. Tests should focus on behaviors of methods or objects, not on language features like loops. If you must test a loop, construct a test that checks for a useful behavior like "entering an invalid response results in a re-prompt." There's almost no utility in checking that a loop loops forever.
However, you might decide to test a long-running process by checking to see:
If it's still running after t time.
If it's performed at least i iterations.
If a loop exits properly given certain input or upon reaching a boundary condition.
Use Timeouts or Signals to End Testing
Second, if you decide to do it anyway, you can just escape the block with Timeout::timeout. For example:
require 'timeout'
# Terminates block
Timeout::timeout(3) { `sleep 300` }
This is quick and easy. However, note that using timeout doesn't actually signal the process. If you run this a few times, you'll notice that sleep is still running multiple times as a system process.
It's better is to signal the process when you want to exit with Process::kill, ensuring that you clean up after yourself. For example:
pid = spawn 'sleep 300'
Process::kill 'TERM', pid
sleep 3
Process::wait pid
Aside from resource issues, this is a better approach when you're spawning something stateful and don't want to pollute the independence of your tests. You should almost always kill long-running (or infinite) processes in your test teardown whenever you can.
Ideally, a reader would follow the stream of STDOUT and let the test pass as soon as the string is encountered and then immediately kill the subprocess. I cannot find how to do this with Process.
You can redirect stdout of spawned process to any file descriptor by specifying out option
pid = spawn(command, :out=>"/dev/null") # write mode
Documentation
Example of redirection
With the answer from CodeGnome on how to use Timeout::timeout and the answer from andyconhin on how to redirect Process::spawn IO, I came up with two Minitest helpers that can be used as follows:
it "runs a deamon" do
wait_for(timeout: 2) do
wait_for_spawned_io(regexp: /Hello World/, command: ["bin/runner"])
end
end
The helpers are:
def wait_for(timeout: 1, &block)
Timeout::timeout(timeout) do
yield block
end
rescue Timeout::Error
flunk "Test did not pass within #{timeout} seconds"
end
def wait_for_spawned_io(regexp: //, command: [])
buffer = ""
begin
read_pipe, write_pipe = IO.pipe
pid = Process.spawn(command.shelljoin, out: write_pipe, err: write_pipe)
loop do
buffer << read_pipe.readpartial(1000)
break if regexp =~ buffer
end
ensure
read_pipe.close
write_pipe.close
Process.kill("INT", pid)
end
buffer
end
These can be used in a test which allows me to start a subprocess, capture the STDOUT and as soon as it matches the test Regular Expression, it passes, else it will wait 'till timeout and flunk (fail the test).
The loop will capture output and pass the test once it sees matching output. It uses a IO.pipe because that is most transparant for subprocesses (and their children) to write to.
I doubt this will work on Windows. And it needs some cleaning up of the wait_for_spawned_io which is doing slightly too much IMO. Antoher problem is that the Process.kill('INT') might not reach the children which are orphaned but still running after this test has ran. I need to find a way to ensure the entire subtree of processes is killed.

Kill all threads on terminate

I'm trying to create an app in ruby which can be started from command line and it does two things: runs a continous job (loop with sleep which runs some action [remote feed parsing]) with one thread and sinatra in a second thread. My code (simplified) looks like that:
require 'sinatra'
class MyApp < Sinatra::Base
get '/' do
"Hello!"
end
end
threads = []
threads << Thread.new do
loop do
# do something heavy
sleep 10
end
end
threads << Thread.new do
MyApp.run!
end
threads.each { |t| t.join }
The above code actually does it's job very well - the sinatra app is started an available under 4567 port and the do something heavy task is beeing fired each 10 seconds. However, i'm not able to kill that script.
I'm running it with ruby app.rb but killing it with ctrl + c is not working. It kills just the sinatra thread but the second one is still running and, to stop the script, i need to close the terminal window.
I was trying to kill all the threads on SIGNINT but it's also not working as expected
trap "SIGINT" do
puts "Exiting"
threads.each { |t| Thread.kill t }
exit 130
end
Can you help me with this? Thanks in advance.
To trap ctrl-c, change "SIGINT" to "INT".
trap("INT") {
puts "trapping"
threads.each{|t|
puts "killing"
Thread.kill t
}
}
To configure Sinatra to skip catching traps:
class MyApp < Sinatra::Base
configure do
set :traps, false
end
...
Reference: Ruby Signal module
To list the available Ruby signals: Signal.list.keys
Reference: Sinatra Intro
(When I run your code and trap INT, I do get a Sinatra socket warning "Already in use". I presume that's fine for your purposes, or you can solve that by doing a Sinatra graceful shutdown. See Sinatra - terminate server from request)
Late to the party, but Trap has one big disadvantage - it gets overriden by the webserver. For example, Puma sets several traps which basically makes your one never to be called.
The best workaround is to use at_exit which can be defined multiple times and Ruby makes sure all blocks are called. I haven't tested this if it would work for your case tho.

Rufus-Scheduler, DaemonKit and traps

I daemonized a Ruby scheduler script (using Rufus) with Rufus-Scheduler DaemonKit and I'm trying to trap the TERM or INT signals to have the application try to save state before quitting.
DaemonKit has its own trap_state (private) method and it catches the signal before the daemon script so even though I have this block, it doesn't do much.
DaemonKit::Application.running! do |config|
surprise = Surprise.new(interval, frequency, false)
surprise.start
config.trap( 'SIGINT' ) do #tried INT and TERM as well
puts 'Exiting'
surprise.stop
File.delete($lock)
end
end
As a side effect (maybe a mistake in my implementation ?) after sigterm the .rufus lockfile is still there
The behavior on ctrl-c right now is this
[daemon-kit]: DaemonKit (0.3.1) booted, now running surprise
log writing failed. can't be called from trap context
[daemon-kit]: Running signal traps for INT
log writing failed. can't be called from trap context
[daemon-kit]: Running shutdown hooks
log writing failed. can't be called from trap context
[daemon-kit]: Shutting down surprise
The start method is a pretty simple schedule
def start
#scheduler = Rufus::Scheduler.new(:lockfile => $lock)
#scheduler.every '1d', :first_at => #first, :overlap => false do |job|
... # some work
end
#scheduler.join
end
def stop
# save state
#scheduler.shutdown
end
Looking at your own answer, and the following code you pasted:
def start
#scheduler = Rufus::Scheduler.new(:lockfile => $lock)
# ...
#scheduler.join # <- NOT NEEDED
end
DaemonKit's DaemonKit::Application.running! block actually never finishes running, so you could safely skip calling #join on any thread.
We should work on making this use-case more clear, as I would love see it used more widely for this kinda work.
So it's very simple, I need to configure the trap proc (or block in my case) BEFORE I run the scheduler in the start method. Not feeling very clever right about now, but the following code works as expected. For reference, the set_trap is private in DK but the public trap method overrides the defaults that come with the DK startup.
DaemonKit::Application.running! do |config|
surprise = Surprise.new(interval, frequency, false)
config.trap("TERM") { surprise.stop }
config.trap( "INT" ) { surprise.stop }
surprise.start
end
Interestingly I saw this line on startup that I hadn't noticed before
[daemon-kit]: Trapping SIGINT signals not supported on this platform
INT and TERM both work though

How to recover a crashed EventMachine loop

I'm using Unicorn on Heroku and I created an EventMachine loop:
(from https://gist.github.com/jonkgrimes/5103321)
after_fork do |server,worker|
defined?(ActiveRecord::Base) and
ActiveRecord::Base.establish_connection
if defined?(EventMachine)
unless EventMachine.reactor_running? && EventMachine.reactor_thread.alive?
if EventMachine.reactor_running?
EventMachine.stop_event_loop
EventMachine.release_machine
EventMachine.instance_variable_set("#reactor_running",false)
end
Thread.new { EventMachine.run }
end
end
Signal.trap("INT") { EventMachine.stop }
Signal.trap("TERM") { EventMachine.stop }
end
The EventMachine works great, but at some point my events start failing because "no eventmachine loop is running." I imagine two possible problems:
the loop is still running but somehow my unicorn forks are no longer bound properly (seems unlikely)
the loop crashed (seems likely)
How can I detect and restart a crashed eventmachine? And/or how should I go about debugging this problem?

Ruby, windows, active_record, and Control-C

What is active_record doing to the signal processes under windows (I don't see this with the same versions on the mac) that causes it to behave so strangely? For instance:
require 'rubygems'
trap("INT"){puts "interrupted"}
puts __LINE__
sleep 5
require 'active_record'
trap("INT"){puts "interrupted again"}
puts __LINE__
sleep 5
When I run the above code (ruby 1.8.6, gem 1.3.1, activerecord 2.2.2,) I can hit ^C as many times as I like during the first sleep, but the first interrupt after the require of activerecord causes the script to terminate. In the above case, the trap still executes, it only fails to allow the program to continue. Usually.
Removing the second call to trap does not have any effect upon the behaviors.
The real annoyance is that in some conditions, the trap fails to execute at all. Considering that the whole point of doing this is to get my code to clean up after itself (remove its footprint in the database so the next guy sees a sane state,) this is a real problem. For instance:
require 'rubygems'
require 'active_record'
trap("INT"){puts "interrupted"}
puts __LINE__
gets
Pressing ^C after seeing the puts will not execute the trap at all.
I only see this problem after requiring active_record. Is there a workaround? I'd be curious to know if this is a bug or if there is an explanation of some sort. As I said, I have no issue with this on the mac - repeated ^Cs result in multiple executions of the trap proc.
thanks...
Considering that the whole point of doing this is to get my code to clean up after itself (remove its footprint in the database ...
Have you considered just using a database transaction? It seems like it would be a much easier way to solve the problem.
I saw a different pattern when trying to duplicate this problem:
puts "start"
trap("INT") { puts "interrupted" }
sleep 5
puts "end"
On Ubuntu (Ruby 1.8.6) this produces
start
interrupted
interrupted
(etc)
interrupted
end
So "interrupted" prints each time Crtl-C is pressed, until the 5 seconds are up. Under Windows (also Ruby 1.8.6), this produces:
start
interrupted
end
i.e. it prints "interrupted" once and then exits.
So it appears that while handling SIGINT Ruby exits the sleep routine and continues on to the next statement. My guess (hand-waving) is that this is somehow due to Ruby using green threads instead of native threads on Windows. Any experts please chime in here.
You could emulate the Unix-y behavior by restarting sleep in the handler:
puts "start"
trap("INT") do
puts "interrupted"
sleep 5
end
sleep 5
puts "end"
Unfortunately this resets the timer each time SIGINT is trapped, so it needs some hacking:
$interval = 5
def go_to_sleep(secs)
$started = Time.now
sleep secs
end
trap("INT") do
puts "interrupted"
time_to_sleep = [0,$interval - (Time.now - $started)].max
if time_to_sleep > 0
sleep time_to_sleep
end
end
puts "one"
go_to_sleep($interval)
puts "two"
go_to_sleep($interval)
puts "three"
go_to_sleep($interval)

Resources