AppFog background worker 'failed to start' - ruby

I'm trying to follow the AppFog guide on creating a background worker in Ruby, and I'm running into some (probably noob) issues. The example uses rufus-scheduler, which (according to the Ruby docs on AppFog) means I need to use Bundler to include and manage it within my app. Accordingly, I've run bundle install and pushed everything to AppFog in the appropriate ('standalone') fashion, but I still can't get it running.
my App & Gemfile:
...and via the AF CLI:
$ af push
[...creating/uploading/etc. etc... - removed to save space]
Staging Application 'chservice-dev': OK
Starting Application 'chservice-dev': .
Error: Application [chservice-dev] failed to start, logs information below.
====> /logs/staging.log <====
# Logfile created on 2013-06-27 20:22:23 +0000 by logger.rb/25413
Need to fetch tzinfo-1.0.1.gem from RubyGems
Adding tzinfo-1.0.1.gem to app...
Adding rufus-scheduler-2.0.19.gem to app...
Adding bundler-1.1.3.gem to app...
====> /logs/stdout.log <====
2013-06-27 20:22:28.841 - script executed.
Delete the application? [Yn]:
How can I fix (or troubleshoot) this? I'm probably missing a large step/concept... very new to ruby =)
Thanks in advance.

I think the app might be exiting immediately. The scheduler needs to be joined to the main thread to keep the app running.
require 'rubygems'
require 'rufus/scheduler'

scheduler = Rufus::Scheduler.start_new

scheduler.every '10s' do
  puts 'Log this'
end

### join the scheduler to the main thread ###
scheduler.join
I created a sample rufus scheduler app that works on appfog: https://github.com/tsantef/appfog-rufus-example
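For reference, a minimal Gemfile for a standalone worker like this might look as follows (a sketch only; the version constraint is an assumption, so check the sample app above for a known-good setup):
source 'https://rubygems.org'

gem 'rufus-scheduler', '~> 2.0'
Run bundle install before af push so the resulting Gemfile.lock is uploaded with the app; the gem resolution shown in the staging log above comes from that lockfile.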

Related

Rails 4 Delayed_job error - Job failed to load: undefined class/module CustomJob

I've spent several days (close to 100 hours) on this but can't find the fix.
Here's my setup (using Rails 4.2.8):
class CustomJob < ActiveJob::Base
  def perform(*args)
    filename = args.first
    data = File.read(filename)
    # process that data
  end
end
When I run Delayed::Job.enqueue CustomJob.new('filename'), I get the error mentioned in the subject. The job is created and added to the db, but the error message is "Job failed..."
I have this line:
require 'custom_job'
in several places including script/delayed_job.rb, config/initializers/delayed_jobs.rb, config/initializers/custom_job.rb and the file in which I'm calling the job.
I also added this:
config.autoload_paths += Dir[Rails.root.join('app', 'jobs')]
config.active_job.queue_adapter = :delayed_job
to my config/application.rb file
And this:
config.threadsafe! unless defined?($rails_rake_task) && $rails_rake_task
I've also restarted my server after every change, and verified that delayed_job was running using:
dir$ RAILS_ENV=development script/delayed_job status
delayed_job: running [pid 64503]
Sources:
delayed_job fails jobs when running as a daemon. Runs fine when using rake jobs:work
DelayedJob: "Job failed to load: uninitialized constant Syck::Syck"
Rails Custom Delayed Job - uninitialized constant
model classes not loading in Delayed Job when using threadsafe
Can you also try adding this line to your config/application.rb file:
config.eager_load_paths += Dir[Rails.root.join('app', 'jobs')]
I always feel like the answer is obvious...AFTER I figure it out.
The problem was that I was using a shared database and there were existing workers accessing this DB. Though I was restarting and refreshing my local instance of the server, the other instances were trying to run my jobs and the OTHER workers were causing the error, not my local instance.
Solution: check whether other instances of delayed_job are using the same table as the code you're testing/building/running. If they are, use another DB if possible.
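As a quick diagnostic along those lines, you can check from a Rails console which workers currently hold locks on jobs in the shared table (a sketch; it assumes the standard delayed_job ActiveRecord backend with its locked_by/locked_at columns):
# Lists the worker names (host/pid) that currently have jobs locked in this table.
Delayed::Job.where.not(locked_by: nil).pluck(:locked_by, :locked_at)
If worker names from other hosts or deployments show up here, those are the instances picking up your jobs.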

Setting up resque-pool over a padrino Rakefile throwing errors

I have set up a Padrino bus application using the super-cool Resque for handling background processes and ResqueBus for pub/sub of events.
The ResqueBus setup creates a Resque queue and a worker for it, and everything up to this point works fine. But ResqueBus only creates a single worker per queue, and my bus app will publish and subscribe to many events, so a single worker per application queue seems inefficient. That's why I thought of integrating the resque-pool gem to manage the worker processes.
I have followed all the steps the resque-pool gem specifies and edited my Rakefile:
# Add your own tasks in files placed in lib/tasks ending in .rake,
# for example lib/tasks/capistrano.rake, and they will automatically be available to Rake.
require File.expand_path('../config/application', __FILE__)
Ojus::Application.load_tasks

require 'resque/pool/tasks'

# this task will get called before resque:pool:setup
# and preload the rails environment in the pool manager
task "resque:setup" => :environment do
  # generic worker setup, e.g. Hoptoad for failed jobs
end

task "resque:pool:setup" do
  # close any sockets or files in pool manager
  ActiveRecord::Base.connection.disconnect!
  # and re-open them in the resque worker parent
  Resque::Pool.after_prefork do |job|
    ActiveRecord::Base.establish_connection
  end
end
Now I tried to run this resque-pool command.
resque-pool --daemon --environment production
It throws this error:
/home/ubuntu/.rvm/gems/ruby-2.0.0-p451#notification-engine/gems/activerecord-4.1.7/lib/active_record/connection_adapters/connection_specification.rb:257:in `resolve_symbol_connection': 'default_env' database is not configured. Available: [:development, :production, :test] (ActiveRecord::AdapterNotSpecified)
I tried to debug this and found out that it throws an error at line
ActiveRecord::Base.connection.disconnect!
For now I have removed this line and everything seems to work fine. But this may cause a problem: if we restart the Padrino application, the old ActiveRecord connections will be left hanging around.
I just wanted to know whether there is a workaround for this problem, so I can run the resque-pool command while still closing all the ActiveRecord connections.
It would have been helpful if you had shared your Padrino database.rb file.
Never mind, you can try
defined?(ActiveRecord::Base) && ActiveRecord::Base.connection.disconnect!
instead of ActiveRecord::Base.connection.disconnect!
and
ActiveRecord::Base.establish_connection(ActiveRecord::Base.configurations[Padrino.env])
instead of ActiveRecord::Base.establish_connection()
To establish a connection with ActiveRecord you have to pass the environment you want to connect to as a parameter; otherwise it will look for 'default_env', which is ActiveRecord's default.
Check out the source code.
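Putting both suggestions together, the resque:pool:setup task from the Rakefile above would look roughly like this (a sketch; it assumes ActiveRecord is loaded through Padrino and that Padrino.env matches one of the configured environments):
task "resque:pool:setup" do
  # Only disconnect if ActiveRecord has actually been loaded in the pool manager.
  defined?(ActiveRecord::Base) && ActiveRecord::Base.connection.disconnect!

  Resque::Pool.after_prefork do |job|
    # Re-open the connection in each worker, passing the configuration for the
    # current Padrino environment explicitly so ActiveRecord does not fall back
    # to 'default_env'.
    ActiveRecord::Base.establish_connection(
      ActiveRecord::Base.configurations[Padrino.env]
    )
  end
end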

NewRelic transaction traces in a Ruby Gem

I am developing a Ruby gem that I would like to add NewRelic monitoring to. The gem is used in a script that is run as a daemon and monitored by bluepill. I followed "Monitoring Ruby background processes and daemons" to get started.
I confirmed the gem is establishing a connection with New Relic, as the application shows up in my portal there; however, there are no transaction traces or any metrics breakdown of the code being invoked.
Here's the "entry point" of my gem, where I tried to manually start the agent around the invoking method:
require 'fms/parser/version'
require 'fms/parser/core'
require 'fms/parser/env'
require 'mongoid'

ENV['NRCONFIG'] ||= File.dirname(__FILE__) + '/../newrelic.yml'
require 'newrelic_rpm'

module Fms
  module Parser
    def self.prepare_parse(filename)
      ::NewRelic::Agent.manual_start
      Mongoid.load!("#{File.dirname(__FILE__)}/../mongoid.yml", :development)
      Core.prepare_parse(filename)
      ::NewRelic::Agent.shutdown
    end
  end
end
I also tried adding this into the module:
class << self
  include ::NewRelic::Agent::Instrumentation::ControllerInstrumentation
  add_transaction_tracer :prepare_parse, :category => :task
end
I'm not entirely sure what else I can do. I confirmed the agent is able to communicate with the server and transaction traces are enabled. Nothing shows up in the background application tab either.
This is the most useful information I've gotten from the agent log so far:
[12/23/13 21:21:03 +0000 apivm (7819)] INFO : Environment: development
[12/23/13 21:21:03 +0000 apivm (7819)] INFO : No known dispatcher detected.
[12/23/13 21:21:03 +0000 apivm (7819)] INFO : Application: MY-APP
[12/23/13 21:21:03 +0000 apivm (7819)] INFO : Installing Net instrumentation
[12/23/13 21:21:03 +0000 apivm (7819)] INFO : Finished instrumentation
[12/23/13 21:21:04 +0000 apivm (7819)] INFO : Reporting to: https://rpm.newrelic.com/[MASKED_ACCOUNT_NUMBER]
[12/23/13 22:12:06 +0000 apivm (7819)] INFO : Starting the New Relic agent in "development" environment.
[12/23/13 22:12:06 +0000 apivm (7819)] INFO : To prevent agent startup add a NEWRELIC_ENABLE=false environment variable or modify the "development" section of your newrelic.yml.
[12/23/13 22:12:06 +0000 apivm (7819)] INFO : Reading configuration from /var/lib/gems/1.9.1/gems/fms-parser-0.0.6/lib/fms/../newrelic.yml
[12/23/13 22:12:06 +0000 apivm (7819)] INFO : Starting Agent shutdown
The only thing that's really concerning here is "No known dispatcher detected".
Is what I'm trying to do possible?
I work at New Relic and wanted to add some up-to-date details about the latest version of the newrelic_rpm gem. TrinitronX is on the right track, but unfortunately that code sample and blog post are based on a very old version of the gem, and the internals have changed significantly since then. The good news is that newer versions of the agent should make this simpler.
To start off, I should say I'm assuming that your process stays alive for a long time as a daemon, and makes repeated calls to prepare_parse.
Generally speaking, the explicit manual_start and shutdown calls you have inserted into your prepare_parse method should not be necessary - except for a few special cases (certain rake tasks and interactive sessions). The New Relic agent will automatically start as soon as it is required. You can see details about when the Ruby agent will automatically start and how to control this behavior here:
https://docs.newrelic.com/docs/ruby/forcing-the-ruby-agent-to-start
For monitoring background tasks like this, there are conceptually two levels of instrumentation that you might want: transaction tracers and method tracers. You already have a transaction tracer, but you may also want to add method tracers around the major chunks of work that happen within your prepare_parse method. Doing so will give you better visibility into what's happening within each prepare_parse invocation. You can find details about adding method tracers here:
https://docs.newrelic.com/docs/ruby/ruby-custom-metric-collection#method_tracers
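For example, tracing the major steps inside Core.prepare_parse might look like this (a sketch; parse_file and persist_results are hypothetical placeholders for whatever your Core module actually does, and the metric names are illustrative):
require 'newrelic_rpm'

module Fms
  module Parser
    module Core
      class << self
        include ::NewRelic::Agent::MethodTracer

        # Hypothetical internal steps -- replace these with the real work
        # your Core module performs.
        def parse_file(filename)
          # ...
        end

        def persist_results(results)
          # ...
        end

        # Tracers must be added after the methods are defined, since they
        # wrap the existing methods.
        add_method_tracer :parse_file, 'Custom/FmsParser/parse_file'
        add_method_tracer :persist_results, 'Custom/FmsParser/persist_results'
      end
    end
  end
end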
With the way that you are calling add_transaction_tracer, your calls to prepare_parse should show up as transactions on the 'Background tasks' tab in the New Relic UI.
The one caveat here may be the fact that you're running this as a daemon. The Ruby agent uses a background thread to asynchronously communicate with New Relic servers. Since threads are not copied across calls to fork(), this means you will sometimes have to manually re-start the agent after a fork() (note that Ruby's Process.daemon uses fork underneath, so it's included as well). Whether or not this is necessary depends on the relative timing of the require of newrelic_rpm and the call to fork / daemon (if newrelic_rpm isn't required until after the call to fork / daemon, you should be good, otherwise see below).
There are two solutions to the fork issue:
Manually call NewRelic::Agent.after_fork from the forked child, right after the fork (see the sketch after this list).
If you're using newrelic_rpm 3.7.1 or later, there's an experimental option to automatically re-start the background thread that you can enable in your newrelic.yml file by setting restart_thread_in_children: true. This is off by default at the moment, but may become the default behavior in future versions of the agent.
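A minimal sketch of option 1, assuming the daemonization happens in your own script after newrelic_rpm has been required (the Process.daemon call and surrounding setup are illustrative, not taken from your code):
require 'newrelic_rpm'

# ... load the rest of the gem / app ...

Process.daemon  # forks; the agent's background thread does not survive this

# Restart the agent's communication thread in the forked child.
::NewRelic::Agent.after_fork(:force_reconnect => true)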
If you're still having trouble, the newrelic_agent.log file is your best bet to debugging things. You'll want to increase the verbosity by setting log_level: debug in your newrelic.yml file in order to get more detailed output.
For debugging this problem, try the following code:
require 'fms/parser/version'
require 'fms/parser/core'
require 'fms/parser/env'
require 'mongoid'

ENV['NRCONFIG'] ||= File.dirname(__FILE__) + '/../newrelic.yml'
# Make sure NewRelic has correct log file path
ENV['NEW_RELIC_LOG'] ||= File.dirname(__FILE__) + '/../log/newrelic_agent.log'
require 'newrelic_rpm'

::NewRelic::Agent.manual_start

# For debug purposes: output some dots until we're connected to NewRelic
until NewRelic::Agent.connected? do
  print '.'
  sleep 1
end

module Fms
  module Parser
    def self.prepare_parse(filename)
      Mongoid.load!("#{File.dirname(__FILE__)}/../mongoid.yml", :development)
      Core.prepare_parse(filename)
      # Force the agent to prepare data before we shutdown
      ::NewRelic::Agent.load_data
      # NOTE: Ideally you'd want to shut down the agent just before the process exits...
      # not every time you call Fms::Parser#prepare_parse
      ::NewRelic::Agent.shutdown(:force_send => true)
    end

    class << self
      include ::NewRelic::Agent::Instrumentation::ControllerInstrumentation
      # The tracer is added after prepare_parse is defined, since it wraps the existing method.
      add_transaction_tracer :prepare_parse, :category => :task
    end
  end
end
I have a feeling this probably has something to do with running your gem's code within the daemonized process that bluepill is starting. Ideally, we'd want to start the NewRelic agent within that process as soon after the daemon process is forked as possible. Putting it after your library's requires should do this when the file is required.
We also would most likely want to stop the NewRelic agent just before the background task process exits, not every time the Fms::Parser#prepare_parse method is called. However, for our purposes this should get you enough debugging info to continue, so you can ensure that the task is contacting New Relic the first time it's run. We can also try using :force_send => true to ensure we send the data.
References:
Blog Post: Instrumenting your monitoring checks with New Relic

Is it possible to send a notification when a Unicorn master finishes a restart?

I'm running a series of Rails/Sinatra apps behind nginx + unicorn, with zero-downtime deploys. I love this setup, but it takes a while for Unicorn to finish restarting, so I'd like to send some sort of notification when it finishes.
The only callbacks I can find in Unicorn docs are related to worker forking, but I don't think those will work for this.
Here's what I'm looking for from the bounty: the old unicorn master starts the new master, which then starts its workers, and then the old master stops its workers and lets the new master take over. I want to execute some ruby code when that handover completes.
Ideally I don't want to implement any complicated process monitoring in order to do this. If that's the only way, so be it. But I'm looking for easier options before going that route.
I've built this before, but it's not entirely simple.
The first step is to add an API endpoint that returns the git SHA of the currently deployed revision. For example, if you had deployed AAAA and then deploy BBBB, the endpoint should start returning BBBB. Let's assume you add the endpoint "/checks/version" that returns the SHA.
Here's a sample Rails controller implementing this API. It assumes the Capistrano REVISION file is present, and it reads the current release SHA into memory at app load time:
class ChecksController < ApplicationController
  VERSION = File.read(File.join(Rails.root, 'REVISION')) rescue 'UNKNOWN'

  def version
    render(:text => VERSION)
  end
end
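A matching route for that action might look like this (a sketch; "YourApp" stands in for your application's module name in config/routes.rb):
YourApp::Application.routes.draw do
  get '/checks/version' => 'checks#version'
end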
You can then poll the local unicorn for the SHA via your API and wait for it to change to the new release.
Here's an example Capistrano task that compares the running app's version SHA to the newly deployed app's version SHA:
namespace :deploy do
  desc "Compare running app version to deployed app version"
  task :check_release_version, :roles => :app, :except => { :no_release => true } do
    timeout_at = Time.now + 60
    while (Time.now < timeout_at) do
      expected_version = capture("cat /data/server/current/REVISION")
      running_version = capture("curl -f http://localhost:8080/checks/version; exit 0")

      if expected_version.strip == running_version.strip
        puts "deploy:check_release_version: OK"
        break
      else
        puts "=[WARNING]==========================================================="
        puts "= Stale Code Version"
        puts "=[Expected]=========================================================="
        puts expected_version
        puts "=[Running]==========================================================="
        puts running_version
        puts "====================================================================="
        Kernel.sleep(10)
      end
    end
  end
end
You will want to tune the timeouts/retries on the polling to match your average app startup time. This example assumes a capistrano structure, with app in /data/server/current and a local unicorn on port 8080.
If you have full access to the box, you could have the Unicorn start script launch another script that loops, checking for /proc/<unicorn-pid>/exe, which links to the running process.
See: Detect launching of programs on Linux platform
Update
Based on the changes to the question, I see two options - neither of which are great, but they're options nonetheless...
You could have a cron job run a Ruby script every minute that checks the PID directory's mtime and then ensures the PID files exist (this tells you that a file in the directory has changed and that the process is running), and executes additional code if both conditions are true; there's a sketch of such a script after this answer. Again, this is ugly and it's a cron job that runs every minute, but it's minimal setup.
I know you want to avoid complicated monitoring, but this is how I'd try it... I would use monit to monitor those processes, and when they restart, kick off a Ruby script which sleeps (to ensure start-up), then checks the status of the processes (perhaps using monit itself again). If this all returns properly, execute additional Ruby code.
Option #1 isn't clean, but as I write the monit option, I like it even better.
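A rough sketch of the option #1 check script (the PID directory and stamp file paths are assumptions; point them at wherever your Unicorn master writes its pid files):
#!/usr/bin/env ruby
# Run from cron every minute. Paths below are placeholders.
PID_DIR    = '/data/server/shared/pids'
STAMP_FILE = '/tmp/unicorn_restart_check.stamp'

pid_files = Dir.glob(File.join(PID_DIR, '*.pid'))
last_run  = File.exist?(STAMP_FILE) ? File.mtime(STAMP_FILE) : Time.at(0)

# A pid file written since the last check suggests the master was (re)started.
restarted = pid_files.any? { |f| File.mtime(f) > last_run }
# Signal 0 only checks that the process exists.
running   = pid_files.any? { |f| (Process.kill(0, File.read(f).to_i); true) rescue false }

if restarted && running
  # ...execute your notification / post-restart Ruby code here...
end

File.open(STAMP_FILE, 'w') { |f| f.write(Time.now.to_s) }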

Job handler serialization incorrect when running delayed_job in production with Thin or Unicorn

I recently brought delayed_job into my Rails 3.1.3 app. In development everything is fine. I even staged my DJ release on the same VPS as my production app using the same production application server (Thin), and everything was fine. Once I released to production, however, all hell broke loose: none of the jobs were entered into the jobs table correctly, and I started seeing the following in the logs for all processed jobs:
2012-02-18T14:41:51-0600: [Worker(delayed_job host:hope pid:12965)] NilClass# completed after 0.0151
2012-02-18T14:41:51-0600: [Worker(delayed_job host:hope pid:12965)] 1 jobs processed at 15.9666 j/s, 0 failed ...
NilClass and no method name? Certainly not correct. So I looked at the serialized handler on the job in the DB and saw:
"--- !ruby/object:Delayed::PerformableMethod\nattributes:\n id: 13\n event_id: 26\n name: memememe\n api_key: !!null \n"
No indication of a class or method name. And when I load the YAML into an object and call #object on the resulting PerformableMethod I get nil. For kicks I then fired up the console on the broken production app and delayed the same job. This time the handler looked like:
"--- !ruby/object:Delayed::PerformableMethod\nobject: !ruby/ActiveRecord:Domain\n attributes:\n id: 13\n event_id: 26\n name: memememe\n api_key: !!null \nmethod_name: :create_a\nargs: []\n"
And sure enough, that job runs fine. Puzzled, I then recalled reading something about DJ not playing nice with Thin. So, I tried Unicorn and was sad to see the same result. Hours of research later and I think this has something to do with how the app server is loading the YAML libraries Psych and Syck and DJ's interaction with them. I cannot, however, pin down exactly what is wrong.
Note that I'm running delayed_job 3.0.1 official, but have tried upgrading to the master branch and have even tried downgrading to 2.1.4.
Here are some notable differences between my stage and production setups:
In stage I run 1 Thin server on a TCP port -- no web proxy in front
In production I run 2+ Thin servers and proxy to them with Nginx. They talk over a UNIX socket
When I tried Unicorn it was 1 app server proxied to by Nginx over a UNIX socket
Could the web proxying/Nginx have something to do with it? Please, any insight is greatly appreciated. I've spent a lot of time integrating delayed_job and would hate to have to shelve the work or, worse, toss it. Thanks for reading.
I fixed this by not using #delay. Instead I replaced all of my "model.delay.method" code with custom jobs. Doing so works like a charm, and is ultimately more flexible. This fix works fine with Thin. I haven't tested with Unicorn.
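For anyone curious what that looks like, a custom job is just a plain Ruby object with a perform method, enqueued explicitly instead of calling model.delay.method (a sketch; the class name is illustrative, and the Domain/create_a names are taken from the handler shown in the question):
# Only the record id is serialized into the handler, so there is no model YAML to go stale.
ProcessDomainJob = Struct.new(:domain_id) do
  def perform
    Domain.find(domain_id).create_a
  end
end

# Enqueue it instead of calling domain.delay.create_a
# (here `domain` is whatever record you were previously calling #delay on).
Delayed::Job.enqueue ProcessDomainJob.new(domain.id)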
I'm running into a similar problem with Rails 3.0.10 and DJ 2.1.4. It's most certainly a different YAML library being loaded when running from the console vs. from the app server (Thin, Unicorn, Nginx). I'll share any solution I come up with.
OK, so removing these lines from config/boot.rb fixed this issue for me:
require 'yaml'
YAML::ENGINE.yamler = 'syck'
This had been placed there to fix a YAML parsing error by forcing YAML to use 'syck'. Removing it required me to fix the underlying issues with the .yml files. More on this here.
Now my delayed_job record handlers match between those created via the server (Unicorn in my case) and the console. Both my server and delayed_job workers are kicked off within Bundler:
Unicorn
cd #{rails_root} && bundle exec unicorn_rails -c #{rails_root}/config/unicorn.rb -E #{rails_env} -D
DJ
export LANG=en_US.utf8; export GEM_HOME=/data/reception/current/vendor/bundle/ruby/1.9.1; cd #{rails_root}; /usr/bin/ruby1.9.1 /data/reception/current/script/delayed_job start staging
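If you need to confirm which YAML engine each environment is actually using before and after a change like this, a quick check (using Ruby 1.9's YAML::ENGINE switch) is:
require 'yaml'
puts YAML::ENGINE.yamler  # => "psych" or "syck"
Run it both from the console and from inside the app server / worker boot (for example in an initializer that logs the value) and compare the output.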
