I've used delayed_job in the past. I have an old project that runs on a server where I can't upgrade from Ruby 1.8.6 to 1.8.7, and therefore can't use delayed_job, so I'm trying Bj (BackgroundJob): http://codeforpeople.rubyforge.org/svn/bj/trunk/README
I have it working so that my job runs, but something doesn't seem right. For example, if I run the job like this:
jobs = Bj.submit "echo hi", :is_restartable => false, :limit => 1, :forever => false
Then I see the job in the bj_job table and I see that it completed along with 'hi' in stdout. I also see only one job in the table and it doesn't keep re-running it.
For some reason if I do this:
jobs = Bj.submit "./script/runner ./jobs/calculate_mean_values.rb #{self.id}", :is_restartable => false, :limit => 1, :forever => false
The job still completes as expected; however, it keeps inserting new rows into the bj_job table, and the method gets run over and over until I stop my dev server. Is that how it is supposed to work?
I'm using Ruby 1.8.6 and Rails 2.1.2 and I don't have the option of upgrading. I'm using the plugin flavor of Bj.
Because I just need to run the process once after the model is saved, I have it working by using script/runner directly like this:
system " RAILS_ENV=#{RAILS_ENV} ruby #{RAILS_ROOT}/script/runner 'CompositeGrid.calculate_values(#{self.id})' & "
But I would like to know if I'm doing something wrong with Bj.
OK, this was stupid user error. As it turns out, I had a callback that was restarting the process and creating an endless loop. After fixing the callback it is working exactly as expected.
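Concretely, the fix amounts to guarding the callback so that the save performed from inside the job doesn't submit the job again. A minimal sketch (the after_save hook and the values_calculated? guard are illustrative, not the exact code from my model):

class CompositeGrid < ActiveRecord::Base
  after_save :queue_calculation

  def queue_calculation
    # Without a guard, the save inside the job fires this callback again,
    # which submits another Bj job and creates the endless loop described above.
    return if values_calculated?  # hypothetical "work already done" check
    Bj.submit "./script/runner ./jobs/calculate_mean_values.rb #{id}",
              :is_restartable => false, :limit => 1, :forever => false
  end
end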
We are using Sidekiq to process a number of backend jobs. One in particular is used very heavily. All I can really say about it is that it sends emails. It doesn't do the email creation (that's a separate job), it just sends them. We spin up a new worker for each email that needs to be sent.
We are trying to upgrade to Ruby 3 and having problems, though. Ruby 2.6.8 has no issues; on 3.x (as well as 2.7.3, IIRC), if there is a large number of queued workers, it will get through maybe 20K of them, then it will start hemorrhaging FIFO pipes, on the order of 300-1000 every 5 seconds or so. Eventually it hits the ulimit on the system (currently set at 64K) and all sockets/connections fail due to insufficient resources.
In trying to debug this issue I did a run with 90% of what the email worker does entirely commented out, so it does basically nothing except make a couple database queries and do some string templating. I thought I was getting somewhere with that approach, as one run (of 50K+ emails) succeeded without the pipe explosion. However, the next run (identical parameters) did wind up with the runaway pipes.
Profiling with rbspy and ruby-prof did not help much, as they primarily focus on the Sidekiq infrastructure, not the workers themselves.
Looking through our code, I did see that nothing we wrote is ever using IO.* (e.g. IO.popen, IO.select, etc), so I don't see what could be causing the FIFO pipes.
I did see https://github.com/mperham/sidekiq/wiki/Batches#huge-batches, which is not necessarily what we're doing. If you look at the code snippet below, we're basically creating one large batch. I'm not sure whether pushing jobs in bulk as per the link will help with the problem we're having, but I'm about to give it a try once I rework things a bit.
No matter what I do I can't seem to figure out the following:
What is making these pipes? Why are they being created?
What is the condition by which the pipes start getting made exponentially? There are two FIFO pipes that open when we start Sidekiq, but until enough work has been done, we don't see more than 2-6 pipes open generally.
Any advice is appreciated, even along the lines of where to look next, as I'm a bit stumped.
Initializer:
require_relative 'logger'
require_relative 'configuration'
require 'sidekiq-pro'
require "sidekiq-ent"

module Proprietary
  unless const_defined?(:ENVIRONMENT)
    ENVIRONMENT = ENV['RACK_ENV'] || ENV['RAILS_ENV'] || 'development'
  end

  # Sidekiq.client_middleware.add Sidekiq::Middleware::Client::Batch
  REDIS_URL = if ENV["REDIS_URL"].present?
                ENV["REDIS_URL"]
              else
                "redis://#{ENV["REDIS_SERVER"]}:#{ENV["REDIS_PORT"]}"
              end

  METRICS = Statsd.new "10.0.9.215", 8125

  Sidekiq::Enterprise.unique! unless Proprietary::ENVIRONMENT == "test"

  Sidekiq.configure_server do |config|
    # require 'sidekiq/pro/reliable_fetch'
    config.average_scheduled_poll_interval = 2
    config.redis = {
      namespace: Proprietary.config.SIDEKIQ_NAMESPACE,
      url: Proprietary::REDIS_URL
    }

    config.server_middleware do |chain|
      require 'sidekiq/middleware/server/statsd'
      chain.add Sidekiq::Middleware::Server::Statsd, :client => METRICS
    end

    config.error_handlers << Proc.new do |ex, ctx_hash|
      Proprietary.report_exception(ex, "Sidekiq", ctx_hash)
    end

    config.super_fetch!
    config.reliable_scheduler!
  end

  Sidekiq.configure_client do |config|
    config.redis = {
      namespace: Proprietary.config.SIDEKIQ_NAMESPACE,
      url: Proprietary::REDIS_URL,
      size: 15,
      network_timeout: 5
    }
  end
end
Code snippet (sanitized):

def add_targets_to_batch
  # target_count = targets.count
  queue_counter = 0

  batch.jobs do
    targets.shuffle.each do |target|
      send(target)
      queue_counter += 1
    end
  end
end

def send(target)
  TargetEmailWorker.perform_async(target[:id],
                                  guid,
                                  is_draft ? target[:email_address] : nil)
  begin
    Target.where(id: target[:id]).update(send_at: Time.now.utc)
  rescue Exception => ex
    Proprietary.report_exception(ex, self.class.name, { target_id: target[:id], guid: guid })
  end
end
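For what it's worth, the bulk-push rework I'm about to try would look roughly like this (untested sketch; the slice size of 1,000 is arbitrary, and the argument layout just mirrors TargetEmailWorker.perform_async(id, guid, email_or_nil)):

def add_targets_to_batch
  batch.jobs do
    args = targets.shuffle.map do |target|
      [target[:id], guid, is_draft ? target[:email_address] : nil]
    end
    # Push jobs to Redis in slices instead of one round-trip per job.
    args.each_slice(1_000) do |slice|
      Sidekiq::Client.push_bulk('class' => TargetEmailWorker, 'args' => slice)
    end
  end
end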
First I tried auditing our external connections for connection pooling, etc. That did not help the issue. Eventually I got to the point where I disabled all external connections and let the job run doing virtually nothing outside of a database query and some logging. This allowed one run to complete without issue, but on the second one, the FIFO pipes still grew exponentially after a certain (variable) amount of work was done.
I'm running into a strange issue that is causing my Heroku workers to crash. We're using Ruby on Rails and delayed_job for background jobs. I'm passing a job to delayed_job using the Vero gem.
This is the call I make to "identify" the user to Vero:
after_save { self.identify! }
Then it puts a job in the queue that looks like this:
--- !ruby/object:Vero::Api::Workers::Users::TrackAPI
domain: https://api.getvero.com
options:
  :email: ******#gmail.com
  :data:
    :email: ******#gmail.com
    :name: ? ?
    :first_name: ?
    :last_name: ?
    :school_id: -1
The issue seems to be those question marks. I'm not sure why they are showing up there instead of a string of text. This is the error that comes up:
Psych::SyntaxError: (<unknown>): mapping keys are not allowed in this context at line 7 column 14
Unfortunately, instead of the job just failing, it actually crashes the worker, preventing other jobs from being processed.
Has anyone run into this issue in the past? How can I format the YAML in a way that it won't crash the worker?
Thanks!
Check out this user. It seems that he entered some data not accepted by the encoding of the fields in the db.
Are you using UTF-8? If he entered UTF-16, you can transliterate it in Ruby to UTF-8.
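A minimal sketch of what that could look like before the value is handed to the Vero job (variable names are assumptions; pick whichever branch matches where the bad bytes come from):

raw = user.name.to_s

# If the raw bytes are really UTF-16 (little-endian shown; adjust to the source),
# convert them to UTF-8 explicitly:
clean = raw.force_encoding('UTF-16LE').encode('UTF-8')

# If the string is supposed to be UTF-8 but contains invalid bytes, round-trip it
# through UTF-16, replacing anything that can't be converted:
clean = raw.encode('UTF-16', :invalid => :replace, :undef => :replace, :replace => '').encode('UTF-8')

Then pass clean (rather than the raw value) into the data that gets serialized into the job.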
I have a little script streaming data from Twitter and feeding it to another application. For that I am using the official twitter gem, release 5.0.0.rc.1, with the streaming feature flagged as experimental, which is fine with me. It's not a critical application anyway.
In order to have it withstand unforeseen crashes, networking problems and so on, I have this script monitored by god (0.13.3), and it does indeed work most of the time. For some reason, though, the script occasionally hangs completely and sits idle while tweets should be arriving every second or so (the debug setup tracks widely used terms).
ps lists it as Ss, consuming 0% CPU and a mere 25MB of RAM.
I believe there might be some gotchas in the twitter gem (or a dependency) and I do not have the leisure to dive into the code now and try to fix it.
God.watch do |w|
  w.name = 'twitter-streamer'
  w.env = {
    'TWITTER_CONSUMER_KEY'        => 'key',
    'TWITTER_CONSUMER_SECRET'     => 'secret',
    'TWITTER_ACCESS_TOKEN'        => 'token',
    'TWITTER_ACCESS_TOKEN_SECRET' => 'very_secret'
  }
  w.start = "twitter_streamer --hashtags cheese"
  w.keepalive
  w.log = File.join APP_HOME, 'log', 'twitter-streamer.log'
end
This is the definition of my watch. As you can see, it's pretty much by the book. What I would like is a condition that would allow me to forcefully restart the process every so often. That would be an acceptable workaround for my needs.
Perhaps something like
# lifecycle
w.lifecycle do |on|
  on.condition(:every) do |c|
    c.within = 15.minute
    c.transition = :restart
  end
end
This is based off the :flapping condition block sample on the project's homepage. Is there a way to achieve this or would I need to implement my own condition?
I have this in my initializer:
Delayed::Job.const_set( "MAX_ATTEMPTS", 1 )
However, my jobs are still re-running after failure, seemingly completely ignoring this setting.
What might be going on?
more info
Here's what I'm observing: jobs with a populated "last error" field and an "attempts" number of more than 1 (10+).
I've discovered I was reading the old/wrong wiki. The correct way to set this is
Delayed::Worker.max_attempts = 1
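For reference, the setting goes in an initializer; something along these lines (max_attempts is the relevant line here, the other two are common companion settings shown only as an illustration):

# config/initializers/delayed_job_config.rb
Delayed::Worker.max_attempts = 1
Delayed::Worker.max_run_time = 5.minutes
Delayed::Worker.destroy_failed_jobs = false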
Check your database table "delayed_jobs" for records (jobs) that still exist after a job "fails". The job will be re-run if the record is still there. -- If it shows that "attempts" is non-zero, then you know that your constant setting isn't working right.
Another guess is that the job's "failure", for some reason, is not being caught by DelayedJob. -- In that case, "attempts" would still be at 0.
Debug by examining the delayed_job/lib/delayed/job.rb file, especially the self.workoff method, when one of your jobs "fails".
Added: @John, I don't use MAX_ATTEMPTS. To debug, look in the gem to see where it is used. It sounds like the problem is that the job is being handled in the normal way rather than limiting attempts to 1. Use the debugger or a logging statement to ensure that your MAX_ATTEMPTS setting is getting through.
Remember that the DelayedJob jobs runner is not a full Rails program, so it could be that your initializer setting is not being run. Look into the script you're using to run the jobs runner.
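For example (assuming the standard delayed_job rake task), starting the worker with

RAILS_ENV=production rake jobs:work

does load the full Rails environment, including config/initializers, whereas a hand-rolled daemon or cron-driven script might not, in which case an initializer-based setting would never take effect.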
I'm writing a Rails 3 application which requires performing small tasks on a custom schedule for each user. The scheduled tasks will be defined dynamically. Right now my plan is to use resque-scheduler with Redis.
Once I set the schedule for a specific task (e.g. run task A every 48 hours), I would like that task to run indefinitely. So I would like to store those schedules in a database or something, so that if the app crashes, it loads and queues those tasks again when it restarts.
Is this something Resque supports by default by storing it in Redis, or do I need to write my own custom thing? I was also looking at ruby-taskr (http://code.google.com/p/ruby-taskr/). I am not sure whether taskr supports storing schedules in a database and registering them on start.
Also, it would be helpful if there are applications/demos that I can look at.
Thanks
I have a similar setup for batch jobs. The user adds them on a web dashboard and they get run however often is specified.
I use ActiveRecord to store the scheduling definitions, Resque for execution, and a single cron entry for enqueueing via a rake task.
so then in the rake task:
to_run = Report.daily
to_run += Report.weekly if Time.now.monday?
to_run += Report.monthly if Time.now.day == 1
to_run.each{|r| r.enqueue!}
where daily, weekly, monthly are named scopes on the model:
class Report < ActiveRecord::Base
  scope :daily,   where(:when_to_run => 'daily')
  scope :weekly,  where(:when_to_run => 'weekly')
  scope :monthly, where(:when_to_run => 'monthly')
end
This is a little hacky, but it works well and I stay within the stack nicely. Hope that is useful
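In case it helps, enqueue! isn't shown above; a minimal sketch of how it could look, assuming a Resque job class named ReportJob and a run! method on Report (both names are placeholders):

class Report < ActiveRecord::Base
  # ... scopes from above ...

  def enqueue!
    Resque.enqueue(ReportJob, id)
  end
end

class ReportJob
  @queue = :reports

  def self.perform(report_id)
    Report.find(report_id).run!  # run! stands in for whatever actually executes the report
  end
end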