How resque checks when to run a job? - ruby

I have found the Resque:
https://github.com/elucid/resque-delayed
And I can see that I can schedule delayed Job. My question is, how does it check for delayed jobs? If I have 5000 delayed jobs in one month time, I hope it doesn't check every 10 seconds all delayed jobs.
So how is it being done?

It does not have to check all the delayed jobs. It maintains a sorted set in Redis, the jobs being sorted by their scheduled time. See the code at:
https://github.com/elucid/resque-delayed/blob/master/lib/resque-delayed/resque-delayed.rb
Each time the daemon awakes, only the first item of the set needs to be checked (using a ZRANGEBYSCORE command). The daemon fetches the relevant jobs one by one, until the polling query returns no result, then it sleeps again.
Performance could be further improved by fetching the jobs n by n. It could be implemented using a server-side Lua script as a polling query:
local res = redis.call('ZRANGEBYSCORE',KEYS[1], "-inf", ARGV[1], 'LIMIT', 0, 10 )
if #res > 0 then
redis.call( 'ZREMRANGEBYRANK', KEYS[1], 0, #res-1 )
return res
else
return false
end
In one roundtrip, this script gets 10 jobs (if available), and delete them from the zset. Much better than the 11 ZRANGEBYSCORE and 10 ZREM, currently required by Resque-delayed.

Related

queue.size in Sidekiq is not updated in real time

I'm using sidekiq with ActiveJob. I want to balance the queues. So I use this way.
while queue.size < 10
SomeJob.perform_later(some_args) # This should add one job to the queue right away, but it doesn't, it takes some time for the job to enter the queue.
end
This is failing in a bad way. This will schedule 50, 60 or more jobs. The cause is that the queue is not populated by jobs directly, but instead, it takes some time for the jobs to enter the queue. So the method queue.size will return 0 for a few seconds then gets the real queue size.
UPDATE:
I found the issue. It turns out that the class I use to schedule the jobs is a configured one, the configuration at some point was SomeJob.set(wait: wait_time), and wait_time was 0. active job will put the job into scheduled set for some time (less than a second or so) before it enters the queue. This is why the queue.size didn't reflect what I expected to be in the queue.
This is happening because queue is already initialized, and you're not reinitializing the new queue object every time a job is enqueued. It won't "update in real time" as you say (similarly to how you'd have to call #reload on an ActiveRecord object)
More efficient than reinitializing, same effect:
size = queue.size
max_queue_size = 10
number_of_jobs_to_perform = max_queue_size - size
number_of_jobs_to_perform = 0 if number_of_jobs_to_perform < 0
number_of_jobs_to_perform.times do
SomeJob.perform_later(args)
end
Edit: if you really must, use a proc, such as Proc.new { queue.size }.times do ...

how can I run ruby script for n-occurrences every X-minutes

I am looking at https://github.com/jmettraux/rufus-scheduler which nicely allows me to schedule and chain events.
But I want the events to stop after, say, the 10th occurrence or after the 24 days.
How do I do this?
My case would be:
run a script which creates the recurring jobs based on intervals and then stops after a given date or occurrence.
This is what I have done.
def run_schedule(url, count, method, interval)
puts "running scheduler"
scheduler = Rufus::Scheduler.new
scheduler.every interval do
binding.pry
attack_loop(url, count, method)
end
end
I am testing my site and want the attack_loop to be scheduled in memory to run against the interval.
But it appears it never hits the binding.pry line.
Normally these schedulers are running via cron jobs. Then the problem with your requirement is cron job doesn't know whether you hit the 10th occurrence or the 24 days as it doesnt keep a track. One possible solution would be to create a separate table to update the cron job details.
I'm thinking a table like,
scheduler_details
- id
- occurrence_count
- created_date
- updated_date
_ scheduler_type
So, now when you run a script, you can create or update the details. (You can search the script by scheduler_type, that way
you can check the number of occurrences
with created date, you can calculate the 24 days
HTH
Good day, you can specify the number of times a job should run:
https://github.com/jmettraux/rufus-scheduler/#times--nb-of-times-before-auto-unscheduling
If you need a more elaborate stop condition, you could do:
scheduler.every interval do |job|
if Time.now > some_max_date
job.unschedule
else
attack_loop(url, count, method)
end
end
But it appears it never hits the binding.pry line.
What has it to do with your issue title? Are you mixing issues?

Understanding delayed_job status

I've implemented long-running tasks in my Rails app using delayed_job along with delayed_job_web. My delayed_job configuration instructs jobs to be attempted once, and for failures to be retained:
config/initializers/delayed_job.rb:
Delayed::Worker.max_attempts = 1
Delayed::Worker.destroy_failed_jobs = false
I tried 2 test jobs that automatically raised errors, in order to see how failures behave. What I get is the following:
My expectation was that Failed jobs would have a count of 2, but that Enqueued / Working / Pending would all be 0. I can't find any documentation on what determines whether a job is Enqueued / Working / Pending, or even what the difference between Working and Pending is (the web interface describes both lists as "contains jobs currently being processed".)
Can anyone provide some clarity?
If you check https://github.com/ejschmitt/delayed_job_web/blob/master/lib/delayed_job_web/application/app.rb , you see the following (starting line 114):
when :working
'locked_at is not null'
when :failed
'last_error is not null'
when :pending
'attempts = 0'
end
Enqueued would be the total number of delayed jobs, i.e. Delayed::Job.count
Working jobs are those that have been locked by the delayed_job process and are currently being worked.
Failed are those that have a last_error
Pending are those jobs that have never been attempted.

Ruby and Rails Async

I need to perform long-running operation in ruby/rails asynchronously.
Googling around one of the options I find is Sidekiq.
class WeeklyReportWorker
include Sidekiq::Worker
def perform(user, product, year = Time.now.year, week = Date.today.cweek)
report = WeeklyReport.build(user, product, year, week)
report.save
end
end
# call WeeklyReportWorker.perform_async('user', 'product')
Everything works great! But there is a problem.
If I keep calling this async method every few seconds, but the actual time heavy operation performs is one minute things won't work.
Let me put it in example.
5.times { WeeklyReportWorker.perform_async('user', 'product') }
Now my heavy operation will be performed 5 times. Optimally it should have performed only once or twice depending on whether execution of first operaton started before 5th async call was made.
Do you have tips how to solve it?
Here's a naive approach. I'm a resque user, maybe sidekiq has something better to offer.
def perform(user, product, year = Time.now.year, week = Date.today.cweek)
# first, make a name for lock key. For example, include all arguments
# there, so that another perform with the same arguments won't do any work
# while the first one is still running
lock_key_name = make_lock_key_name(user, product, year, week)
Sidekiq.redis do |redis| # sidekiq uses redis, let us leverage that
begin
res = redis.incr lock_key_name
return if res != 1 # protection from race condition. Since incr is atomic,
# the very first one will set value to 1. All subsequent
# incrs will return greater values.
# if incr returned not 1, then another copy of this
# operation is already running, so we quit.
# finally, perform your business logic here
report = WeeklyReport.build(user, product, year, week)
report.save
ensure
redis.del lock_key_name # drop lock key, so that operation may run again.
end
end
end
I am not sure I understood your scenario well, but how about looking at this gem:
https://github.com/collectiveidea/delayed_job
So instead of doing:
5.times { WeeklyReportWorker.perform_async('user', 'product') }
You can do:
5.times { WeeklyReportWorker.delay.perform('user', 'product') }
Out of the box, this will make the worker process the second job after the first job, but only if you use the default settings (because by default the worker process is only one).
The gem offers possibilities to:
Put jobs on a queue;
Have different queues for different jobs if that is required;
Have more than one workers to process a queue (for example, you can start 4 workers on a 4-CPU machine for higher efficiency);
Schedule jobs to run at exact times, or after set amount of time after queueing the job. (Or, by default, schedule for immediate background execution).
I hope it can help you as you did to me.

Quartz.NET: Need CronTrigger on an iStatefulJob instance to *delay instead of skip* if running job while schedule matures

Greetings, your friendly neighborhood Quartz.NET n00b is back!
I have a Windows Service running iStatefulJob instances on a Quartz.NET CronTrigger based schedule scheme... The CRON String used to schedule the job: "0 0/1 * * * ? *"
Everything works great. However, if I have a job that is set to run, say, at the X:00 mark of every minute, and that job happens to run for MORE than a minute, I notice that the subsequent job runs IMMEDIATELY after the job is finished executing, rather than waiting until its next scheduled run, effectively "queuing" up instead of merely skipping the job till it's next scheduled run.
I put in the trigger a CronTrigger MisfireInstruction of DONOTHING, but the exact same thing happens when a job overruns its next scheduled execution schedule.
How do I get an iStatefulJob instance to merely SKIP a scheduled execution trigger if it is currently running, rather than have it delay it until the first execution completes?
I explicitly set the trigger.MisfireInstruction = MisfireInstruction.CronTrigger.DoNothing;
...But instead of "doing nothing", for a job scheduled to run every minute that takes 90 seconds to complete, I experience the following execution log:
Job runs at 9:00:00am, finishes at 9:01:30am <- job runs for 1:30
Job runs at 9:01:30am, finishes at 9:03:00am <- subsequent job that should have run at 9:01:00
Job runs at 9:04:00am, finishes at 9:05:30am <- shouldn't this one have run at 9:03:00?
Job runs at 9:05:30am, finishes at 9:07:00am <- subsequent job that should have run at 9:05:00
Job runs at 9:08:00am, finishes at 9:09:30am <- shouldn't this have run at 9:07:00?
... it seems like it runs correctly the first time, on the minute... delays for 30 seconds as the 90 second job execution time expires, and then, instead of waiting till the NEXT full minute, EXECUTES IMMEDIATELY at the 30 second mark... Doubly odd, is that it then finishes the SECOND job on the minute mark, but waits till the NEXT minute mark to execute instead of running it back-2-back...
Pretty much seems like it works correctly EVERY OTHER RUN, when it is not running on the :30 marks...
What's the best way to get a job not to delay/queue, but to just SKIP until it is idle and the next schedule matures?
EDIT: I tried going back to iJobs instead of iStatefulJobs using the same DONOTHING trigger misfire instruction, but the job executes EVERY MINUTE despite the prior execution being still active. I can't seem to get it to skip a scheduled run if it is currently running with either iJob or iStatefulJob...
EDIT#2: I think that my triggers are NEVER misfiring, which is why DoNothing as a misfire instruction is useless... Given that's the case, I guess I need another mechanism to detect if a job instance of a schedule is running to ensure the job SKIPS its next execution until its following scheduled time rather than delaying it until first instance completion...
EDIT3: I tried adding an element to the iStatefulJob jobdatamap called "IsRunning"... I set it to TRUE when the execute sequence starts, and then return it to false after job completion. Before executing, it checks the element, which is apparently persisted between jobs, and prematurely quits the execution (logging "JOB SKIPPED!") if it detects it to be true... This unfortunately doesn't work, for probably obvious reasons: If the jobs are running following the bulleted schedule above, then the job is never SIMULTANEOUSLY running along with itself, as it is delaying the run till the job ends, so this check is useless. According to documentation, returning to iJob from iStatefulJob would not help here as the jobdatamap is only persisted between jobs in the Stateful job type...
I still haven't solved how to SKIP a scheduled job instead of delaying it till it's current iteration completes... If anyone has ideas, you're a lifesaver! :)
It should be caused by misfireThreshold of RAMJobStore (http://quartznet.sourceforge.net/apidoc/topic2722.html).
The time span by which a trigger must
have missed its next-fire-time, in
order for it to be considered
"misfired" and thus have its misfire
instruction applied.
It is 60 seconds by default. So job isn't considered as "misfired" until it is late for more than misfiredThreshold value.
To resolve the problem just decrease this threshold (below code sets to 1 ms):
...
properties["quartz.jobStore.misfireThreshold"] = "1";
...
schedulerFactory = new StdSchedulerFactory(properties);
It should resolve the issue.

Resources