Is it possible to send a notification when a Unicorn master finishes a restart? - ruby

I'm running a series of Rails/Sinatra apps behind nginx + unicorn, with zero-downtime deploys. I love this setup, but it takes a while for Unicorn to finish restarting, so I'd like to send some sort of notification when it finishes.
The only callbacks I can find in Unicorn docs are related to worker forking, but I don't think those will work for this.
Here's what I'm looking for from the bounty: the old unicorn master starts the new master, which then starts its workers, and then the old master stops its workers and lets the new master take over. I want to execute some ruby code when that handover completes.
Ideally I don't want to implement any complicated process monitoring in order to do this. If that's the only way, so be it. But I'm looking for easier options before going that route.

I've built this before, but it's not entirely simple.
The first step is to add an API endpoint that returns the git SHA of the currently deployed revision. For example, while AAAA is deployed the endpoint returns AAAA; once you deploy BBBB and the new workers take over, it returns BBBB. Let's assume you add the endpoint "/checks/version" for this.
Here's a sample Rails controller that implements the endpoint. It assumes the Capistrano REVISION file is present and reads the current release SHA into memory at app load time:
class ChecksController < ApplicationController
  # Read once at load time; fall back to 'UNKNOWN' if REVISION is missing.
  VERSION = File.read(File.join(Rails.root, 'REVISION')) rescue 'UNKNOWN'

  def version
    render(:text => VERSION)
  end
end
You can then poll the local unicorn for the SHA via your API and wait for it to change to the new release.
Here's an example using Capistrano, that compares the running app version SHA to the newly deployed app version SHA:
namespace :deploy do
  desc "Compare running app version to deployed app version"
  task :check_release_version, :roles => :app, :except => { :no_release => true } do
    timeout_at = Time.now + 60
    while Time.now < timeout_at
      expected_version = capture("cat /data/server/current/REVISION")
      # "exit 0" keeps capture from failing while the server is still down
      running_version = capture("curl -f http://localhost:8080/checks/version; exit 0")
      if expected_version.strip == running_version.strip
        puts "deploy:check_release_version: OK"
        break
      else
        puts "=[WARNING]==========================================================="
        puts "= Stale Code Version"
        puts "=[Expected]=========================================================="
        puts expected_version
        puts "=[Running]==========================================================="
        puts running_version
        puts "====================================================================="
        Kernel.sleep(10)
      end
    end
  end
end
You will want to tune the timeouts/retries on the polling to match your average app startup time. This example assumes a Capistrano directory structure, with the app in /data/server/current and a local Unicorn on port 8080.
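If you'd rather run the same check outside Capistrano (say, from a deploy hook that then sends the notification), the polling loop can be sketched in plain Ruby. This is only a minimal sketch: `wait_for_version` and `HTTP_FETCH` are names I made up for illustration, and the URL is the same assumed local endpoint as above.

```ruby
require 'net/http'
require 'uri'

# Poll until the running app reports the expected revision, or give up.
# `fetch` is any callable returning the running version string; it is
# injectable so the loop can be exercised without a live server.
def wait_for_version(expected, fetch:, timeout: 60, interval: 10)
  deadline = Time.now + timeout
  loop do
    running = begin
      fetch.call.to_s.strip
    rescue StandardError
      nil # server not reachable yet; treat as stale
    end
    return true if running == expected.strip
    return false if Time.now + interval > deadline
    sleep interval
  end
end

# Hypothetical fetcher for the assumed /checks/version endpoint.
HTTP_FETCH = lambda do
  Net::HTTP.get(URI('http://localhost:8080/checks/version'))
end
```

Usage would be along the lines of `wait_for_version(File.read('REVISION'), fetch: HTTP_FETCH)`, sending your notification when it returns true.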

If you have full access to the box, you could have the Unicorn start script launch a second script that loops, checking /proc/<unicorn-pid>/exe, which is a symlink to the executable of the running process.
See: Detect launching of programs on Linux platform
Update
Based on the changes to the question, I see two options. Neither is great, but they're options nonetheless:
1. You could have a cron job run a Ruby script every minute that checks the mtime of the PID directory and then ensures the PID files exist (a changed mtime tells you a file in the directory has changed, and the PID files tell you the process is running). If both conditions are true, it executes your additional code. Again, this is ugly and it's a cron job that runs every minute, but it's minimal setup.
2. I know you want to avoid complicated monitoring, but this is how I'd try it: I would use monit to monitor those processes, and when they restart, kick off a Ruby script which sleeps (to ensure start-up), then checks the status of the processes (perhaps using monit itself again). If this all returns properly, execute additional Ruby code.
Option #1 isn't clean, and now that I've written out the monit option, I like that one even better.
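For what it's worth, the check behind the cron option could look roughly like the sketch below. The method names, and the idea of passing in the time of the previous check, are my own assumptions for illustration; signal 0 is used only to test whether the process exists.

```ruby
# True if the PID file names a process that currently exists.
def process_alive?(pid_file)
  pid = File.read(pid_file).to_i
  return false unless pid > 0

  Process.kill(0, pid) # signal 0 checks existence without signalling
  true
rescue Errno::ENOENT, Errno::ESRCH
  false # PID file missing, or no such process
rescue Errno::EPERM
  true  # process exists but belongs to another user
end

# True if the PID directory changed since `last_check` and every
# expected PID file points at a live process.
def restart_detected?(pid_dir, pid_names, last_check)
  return false unless File.mtime(pid_dir) > last_check

  pid_names.all? { |name| process_alive?(File.join(pid_dir, name)) }
rescue Errno::ENOENT
  false # PID directory does not exist
end
```

The cron script would have to remember when it last ran (for example in a small state file) and pass that in as `last_check`, then run the notification code when `restart_detected?` returns true.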

Related

Setting up resque-pool over a padrino Rakefile throwing errors

I have set up a Padrino bus application using the super-cool Resque for handling background processes and ResqueBus for pub/sub of events.
The ResqueBus setup creates a Resque queue and a worker to work on it. Everything up to here works fine. However, ResqueBus creates only a single worker for a single queue, and the processing in my bus app can go haywire because many events will be published and subscribed to. A single worker per application queue seems inefficient, so I thought of integrating the resque-pool gem to manage the worker processes.
I have followed all the steps that the resque-pool gem specifies and edited my Rakefile:
# Add your own tasks in files placed in lib/tasks ending in .rake,
# for example lib/tasks/capistrano.rake, and they will automatically be available to Rake.
require File.expand_path('../config/application', __FILE__)
Ojus::Application.load_tasks
require 'resque/pool/tasks'

# this task will get called before resque:pool:setup
# and preload the rails environment in the pool manager
task "resque:setup" => :environment do
  # generic worker setup, e.g. Hoptoad for failed jobs
end

task "resque:pool:setup" do
  # close any sockets or files in pool manager
  ActiveRecord::Base.connection.disconnect!
  # and re-open them in the resque worker parent
  Resque::Pool.after_prefork do |job|
    ActiveRecord::Base.establish_connection
  end
end
Now I tried to run this resque-pool command.
resque-pool --daemon --environment production
This throws an error like this.
/home/ubuntu/.rvm/gems/ruby-2.0.0-p451@notification-engine/gems/activerecord-4.1.7/lib/active_record/connection_adapters/connection_specification.rb:257:in `resolve_symbol_connection': 'default_env' database is not configured. Available: [:development, :production, :test] (ActiveRecord::AdapterNotSpecified)
I tried to debug this and found out that it throws an error at line
ActiveRecord::Base.connection.disconnect!
For now I have removed this line and everything seems to be working fine. But this may cause a problem: if we restart the Padrino application, the old ActiveRecord connection will be left hanging around.
**I just wanted to know if there is any workaround for this problem, so that I can run the resque-pool command while still closing all the ActiveRecord connections.**
It would have been helpful if you had posted your Padrino database.rb file.
Never mind; you can try
defined?(ActiveRecord::Base) && ActiveRecord::Base.connection.disconnect!
instead of ActiveRecord::Base.connection.disconnect!
and
ActiveRecord::Base.establish_connection(ActiveRecord::Base.configurations[Padrino.env])
instead of ActiveRecord::Base.establish_connection()
To establish a connection with ActiveRecord you have to pass the configuration for the environment you want to connect to; otherwise it looks for 'default_env', which is ActiveRecord's default.
Check out the source code.
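To make the failure mode concrete, here is a simplified model of that lookup in plain Ruby. It is not ActiveRecord's actual code: `pick_config` is a hypothetical helper, and the fallback to 'default_env' is inferred from the error message in the question.

```ruby
# Simplified model of choosing a database configuration by environment.
# With no environment given, the lookup falls back to 'default_env',
# which a typical database.rb does not define.
def pick_config(configurations, env = nil)
  key = (env || 'default_env').to_sym
  configurations.fetch(key) do
    raise KeyError, "'#{key}' database is not configured. " \
                    "Available: #{configurations.keys.inspect}"
  end
end

configs = {
  development: { adapter: 'sqlite3' },
  production:  { adapter: 'postgresql' }
}

pick_config(configs, :production) # finds the production settings
# pick_config(configs)            # raises: 'default_env' is not configured
```

Passing `ActiveRecord::Base.configurations[Padrino.env]` plays the role of the explicit `env` argument here.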

Run command after gem install from gem root folder

I'm deploying a Sinatra app as a gem. I have a command that starts the app as a service.
We are using chef to manage our deployments.
How can I run the command to start the app service but only after it's fully installed (including run-time dependencies)?
I've tried Googling for a way to run a post-install script, but I haven't found anything useful or concrete that doesn't involve some complicated 'extconf.rb' workaround.
I would prefer not to use an execute resource if I can help it.
EDIT: I tried what was suggested, but it breaks things in a way that causes Berkshelf not to work in our pipeline.
Here's the code I'm using:
execute "run-service:post_install" do
  cwd (f = File.expand_path(__FILE__).split('/')).shift(f.length - 3).join('\\')
  timeout 5
  command "bundle && rake service:post_install"
  # action :nothing
  # subscribes :run, "gem_package[gem_name]" , :delayed
end
It doesn't matter whether or not I uncomment the last two lines; it breaks things either way, but if I take out the whole block it stops breaking. Obviously I'm doing something wrong, but I'm not sure what.
EDIT:
It's the command itself that breaks it: even when I change the command to ls and the action to :run, it breaks.
EDIT: After changing the command path around a bit, I managed to get it to spit out a usable error: it was trying to run the command from the Chef cookbooks path, so I've (hopefully) forced it to use the correct path.
Why do you not want to use an execute resource? That is exactly what it is for, running commands from Chef. Chef obeys the order of the resources, so if you have a gem_package followed by an execute they will run in that order.
So, in the end I decided to try the service resource, because it lets you set start and stop commands.
The code that I used is:
service service_name do
  init_command ("#{%x(gem env gemdir).strip.gsub('/','\\')}\\gems\\gem_name-#{installing_version}")
  start_command "rake service:start"
  stop_command "rake service:stop"
  reload_command "rake service:reload"
  restart_command "rake service:restart"
  supports start: true, restart: true, reload: true
  action [:enable, :start]
end
I'm still having problems, but of a different sort.

AppFog background worker 'failed to start'

I'm trying to follow the AppFog guide on creating a background worker in ruby, and I'm running into some (probably noob) issues. The example uses Rufus-scheduler, which (according to the Ruby docs on AppFog) means I need to use Bundler to include/manage within my app. Nonetheless, I've run bundle install, pushed everything to AppFog in the appropriate ('standalone') fashion, and still can't seem to get it running.
my App & Gemfile:
...and via the AF CLI:
$ af push
[...creating/uploading/etc. etc... - removed to save space]
Staging Application 'chservice-dev': OK
Starting Application 'chservice-dev': .
Error: Application [chservice-dev] failed to start, logs information below.
====> /logs/staging.log <====
# Logfile created on 2013-06-27 20:22:23 +0000 by logger.rb/25413
Need to fetch tzinfo-1.0.1.gem from RubyGems
Adding tzinfo-1.0.1.gem to app...
Adding rufus-scheduler-2.0.19.gem to app...
Adding bundler-1.1.3.gem to app...
====> /logs/stdout.log <====
2013-06-27 20:22:28.841 - script executed.
Delete the application? [Yn]:
How can I fix (or troubleshoot) this? I'm probably missing a large step/concept... very new to ruby =)
Thanks in advance.
I think the app might be exiting immediately. The scheduler needs to be joined to the main thread in order to keep that app running.
require 'rubygems'
require 'rufus/scheduler'

scheduler = Rufus::Scheduler.start_new

scheduler.every '10s' do
  puts 'Log this'
end

### join the scheduler to the main thread ###
scheduler.join
I created a sample rufus scheduler app that works on appfog: https://github.com/tsantef/appfog-rufus-example

Testing server ruby-application with cucumber

My Ruby application runs a WEBrick server. I want to test it with Cucumber and make sure it gives me the right response.
Is it normal to run a server in the test environment for testing? Where in my code should I start the server process, and where should I destroy it?
Currently I start the server in a Background step and destroy it in an After hook. It's slow, because the server is started before every scenario and destroyed after it.
My idea is to start the server in env.rb and destroy it in an at_exit block, also declared in env.rb. What do you think?
Do you know any patterns for this problem?
I use Spork for this. It starts up one or more servers, and has the ability to reload these when needed. This way, each time you run your tests you're not incurring the overhead of firing up Rails.
https://github.com/sporkrb/spork
Check out this RailsCast for the details: http://railscasts.com/episodes/285-spork
Since Cucumber does not support Spork any more (why?), I use the following code in env.rb.
To spawn the process I use this library: https://github.com/jarib/childprocess
require 'childprocess'
ChildProcess.posix_spawn = true

wkDir = File.dirname(__FILE__)
server_dir = File.join(wkDir, '../../site/dev/bin')

# Because I use rvm, I have to run the server through a shell
@server = ChildProcess.build("sh", "-c", "ruby pageServer.rb -p 4563")
@server.cwd = server_dir
@server.io.inherit!
@server.leader = true
@server.start

at_exit do
  puts "----------------at exit--------------"
  puts "Killing process " + @server.pid.to_s
  @server.stop
  if @server.alive?
    puts "Server is still alive - kill it manually"
  end
end

bluepill not detecting that processes have, in fact, started successfully, and so creates new ones

I have one (EC2) Ubuntu server where bluepill is working just fine to start and monitor resque processes (and it has done so on other nodes in the past).
I'm setting up a new node, and for some reason on this node bluepill does not recognize that the processes have started and are running, and so keeps creating new ones. I'm a little baffled by what's causing this. The 2 nodes are almost identical; they're both EC2 servers provisioned by the same chef scripts. It is true that the one not working is 'production' and the other 'staging', but there's almost no difference due to that.
Any thoughts or suggestions before I fork the github project and start inserting more monitoring, to try and figure out what's going on? There's been discussion on this list in the past about troubles w/ bluepill and resque, but as I said this is working fine on my staging server, and has worked fine on earlier production servers (although I will note that this new production server is ruby 1.9.3 (vs 1.9.2) and rails 3.2 (vs. 3.1)).
Here's my .pill file (or more specifically, my chef cookbook's template file):
ENV["RAILS_ENV"] = "<%= node.chef_environment %>"
ENV["QUEUE"] = "*"

Bluepill.application("zmx_app") do |app|
  app.working_dir = "/srv/zmx/current"
  app.uid = "root"
  app.gid = "root"
  2.times do |i|
    app.process("resque-#{i}") do |process|
      process.group = "resque"
      process.start_command = "rake resque:work"
      process.pid_file = "/srv/zmx/current/tmp/pids/resque_workers-#{i}.pid"
      process.stop_command = "kill -QUIT {{PID}}"
      process.daemonize = true
    end
  end
end
This turned out to be a bug in bluepill; I have forked the project, fixed it, and submitted a pull request.
And I'm not sure why I didn't realize that there was, in fact, a difference between my two environments: staging and old production were on bluepill 0.0.55, while my new production environment is on 0.0.58.

Resources