I have some simple code that uses Resque in Ruby.
Because of bugs in the code, some jobs fail.
The problem is that they do not appear in the failed jobs list; they just disappear (once a job is taken from the queue it is popped, so it is removed).
How can I make Resque put the job in the failed list?
I see the problem. Once the job is taken from the queue, it is removed from Redis. If it fails, it goes back into Redis under failed jobs. But if you stop the worker (Ctrl+C) while the job is running, the worker doesn't have time to record it in the failed jobs.
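The gap described above can be illustrated with a small in-memory model. This is only a sketch of the general "reserve, then record the outcome" pattern, not Resque's actual implementation: `TinyQueue` and all of its methods are hypothetical names, and plain arrays stand in for the Redis lists.

```ruby
# In-memory stand-in for the Redis lists; NOT Resque's real API.
class TinyQueue
  attr_reader :pending, :in_progress, :failed

  def initialize(jobs = [])
    @pending = jobs.dup
    @in_progress = []
    @failed = []
  end

  # Move the next job into in_progress instead of simply dropping it.
  def reserve
    job = @pending.shift
    @in_progress << job if job
    job
  end

  # Run one job. An ordinary error lands the job in the failed list.
  # A hard stop (e.g. Interrupt from Ctrl+C) is not rescued here, so the
  # job stays in in_progress, which models the "disappearing" job.
  def work_one
    job = reserve
    return false unless job

    begin
      yield job
    rescue StandardError => e
      @failed << { job: job, error: e.message }
    end
    @in_progress.delete(job)
    true
  end

  # On the next start, anything still in in_progress was interrupted.
  def recover_interrupted
    @in_progress.each { |job| @failed << { job: job, error: "interrupted" } }
    @in_progress.clear
  end
end
```

Because the job is parked in a separate list before it runs, a later process can still find it and mark it as failed even when the original worker was stopped mid-job.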
Related
I am using a load balancer, and 2 servers use the database to listen for jobs. The problem is that when I dispatch a job and it gets picked up by one server but is not completed before the queue's retry time runs out, it starts running on the second server and fails with a message saying that the 'Job has attempted to run too many times or run for too long'. Meanwhile the first server keeps working on the job and finishes it successfully. I have tried everything, from using Queue::before to writing functions that handle the logic for a job that is in progress, but I had no luck. Maybe someone can help. Thank you in advance.
Is it possible to somehow view a list of completed Sidekiq jobs, for example to find all PurchaseWorkers with params (1)? Yesterday a delayed method in my app that was supposed to run didn't, and the associated entity (let's say 'purchase') got stuck in limbo with state "processing". I am trying to understand the reason: either the job wasn't enqueued at all, or it was enqueued but for some reason exited unexpectedly. There were no errors in the Sidekiq log.
Thanks.
This is old but I wanted to see the same thing since I'm not sure if jobs I scheduled ran or not!
Turns out, Sidekiq doesn't have anything built in to see jobs that completed and still doesn't seem to.
If it errored and never completed, it should be in the 'dead' queue. But checking that something actually ran seems to be beyond Sidekiq by default.
The FAQ suggests installing 3rd party plugins to track and log information: https://github.com/mperham/sidekiq/wiki/FAQ#how-can-i-tell-when-a-job-has-finished One of them allows for having a callback to do follow up (maybe add a record for completed jobs elsewhere?)
You can also set up Sidekiq to log somewhere other than STDOUT (the default) so you can output log information about your jobs. In this case, that means logging when a job completes, or catching errors if for some reason a failing job never lands in the retry or dead queues. See https://github.com/mperham/sidekiq/wiki/Logging
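One way to get such a completion record is a small server middleware. The following is a minimal sketch, assuming Sidekiq's usual middleware convention of a `call(worker, job, queue)` method that yields to run the job; the class name `CompletionLogger` and the log line format are made up for illustration.

```ruby
# Sketch of a Sidekiq-style server middleware that writes one line per
# finished job. CompletionLogger is a hypothetical name, not part of Sidekiq.
class CompletionLogger
  # `out` is any object that responds to #puts (defaults to STDOUT).
  def initialize(out = $stdout)
    @out = out
  end

  # Called around each job; `job` is the job payload hash.
  def call(_worker, job, queue)
    yield
    @out.puts("completed #{job['class']} jid=#{job['jid']} " \
              "args=#{job['args'].inspect} queue=#{queue}")
  rescue StandardError => e
    @out.puts("failed #{job['class']} jid=#{job['jid']}: #{e.class}: #{e.message}")
    raise # re-raise so the normal retry handling still sees the error
  end
end

# Registration, assuming Sidekiq is loaded (e.g. in an initializer):
# Sidekiq.configure_server do |config|
#   config.server_middleware { |chain| chain.add CompletionLogger }
# end
```

With this in place, searching the log output for `completed PurchaseWorker` would show whether a given job actually ran to the end.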
To see jobs that are still in a queue, you can use the Rails console and look at the queue by its name: https://www.rubydoc.info/gems/sidekiq/Sidekiq/Queue
One option is the default stats provided by Sidekiq - https://github.com/mperham/sidekiq/wiki/Monitoring#using-the-built-in-dashboard
The best option is to use the Web UI provided here - https://github.com/mperham/sidekiq/wiki/Monitoring#web-ui
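For the "find all PurchaseWorkers with params (1)" case, a console helper along these lines might work. It is a sketch that assumes each queue entry responds to `klass`, `args` and `jid`, as the entries yielded by `Sidekiq::Queue` do; `find_jobs` is a hypothetical helper name, and it only sees jobs that have not run yet.

```ruby
# Return the entries of `queue` whose class name and arguments match.
# `queue` can be any Enumerable of objects responding to #klass and #args.
def find_jobs(queue, klass, args)
  queue.select { |job| job.klass == klass.to_s && job.args == args }
end

# In a Rails console, assuming Sidekiq is loaded and the queue is "default":
# find_jobs(Sidekiq::Queue.new("default"), "PurchaseWorker", [1]).map(&:jid)
```

The same helper should also accept the retry and dead sets, since they are enumerable in the same way.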
On Heroku, I use delayed_job to run asynchronous tasks. All is well until I do a git push heroku master and then the Heroku environment kills any worker threads that are in-process.
The issue here is that those jobs never get re-queued since the delayed_job table in my db shows them as still locked and running, even though the workers that used to be servicing them are long dead.
How do I prevent this situation from occurring? I'd like Heroku to wait for all delayed jobs in progress to complete or error out before closing down, or at least terminate them and allow a new worker to be assigned to them once the server comes back up post-reboot from changes being applied by my update.
Looks like you can configure DJ to handle SIGTERM and mark the in-progress jobs as failed (so they'll be restarted again):
Use this setting to throw an exception on TERM signals by adding this in your initializer:
Delayed::Worker.raise_signal_exceptions = :term
More info in this answer:
https://stackoverflow.com/a/16811844/1715829
Some image resize jobs failed to exit when our Heroku background worker was restarted.
The job is stuck on the Busy page of the UI. It looks like it is occupying one of the busy threads and was started over an hour ago.
But upon inspecting the job args and checking the DB, it looks like the images were actually processed, so maybe it's just Redis, or the Web UI contains wrong data.
The TID and JID are osuuiyruo and 8e25ebc62ae7d7023a9b5650.
Is there any way to remove these "stuck" jobs? I tried quieting the workers and stopping them, then scaling the Heroku workers to 0 and bringing them back up, but the jobs stay in that stuck busy list.
Please state your Sidekiq version in the future.
If you are on 3.x, this will be fixed in 3.1.4. https://github.com/mperham/sidekiq/issues/1764
I am running delayed_job for a few background services, all of which, until recently, ran in isolation, e.g. send an email, write a report, etc.
I now have a need for one delayed_job, as its last step, to lodge another delayed_job.
delay.deploy() - when delayed_job runs this, it triggers a deploy action, the last step of which is to ...
delay.update_status() - when delayed_job runs this job, it will check the status of the deploy we started. If the deploy is still progressing, we call delay.update_status() again, if the deploy has stopped we write the final deploy status to a db record.
Step 1 works fine - after 5 seconds, delayed_job fires up the deploy, which starts the deployment, and then calls delay.update_status().
But here, instead of update_status() starting up in 5 seconds, delayed_job goes into a busy loop, firing off a bunch of update_status calls and looping really hard without pause.
I can see the logs filling up with all these calls, the server slows down, until the end-condition for update_status is reached (deploy has eventually succeeded or failed), and things get quiet again.
Am I using Delayed_Job::delay() incorrectly, or am I missing a basic tenet of this use case?
OK it turns out this is "expected behaviour" - if you are already in the code running for a delayed_job, and you call .delay() again, without specifying a delay, it will run immediately. You need to add the parameter run_at:
delay(queue: :deploy, run_at: 10.seconds.from_now).check_status
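Put together, the self-rescheduling status check could look roughly like the sketch below. It assumes delayed_job is loaded (which is what provides `delay`); `DeployMonitor`, `POLL_INTERVAL`, `finished?` and `record_final_status` are hypothetical names used only for illustration.

```ruby
# Hypothetical poller; `delay` comes from delayed_job when the gem is loaded.
class DeployMonitor
  POLL_INTERVAL = 10 # seconds between checks (assumed value)

  # `deploy` is any object responding to #finished? and #record_final_status.
  def initialize(deploy)
    @deploy = deploy
  end

  def update_status
    if @deploy.finished?
      # Deploy has stopped: write the final status and do not re-enqueue.
      @deploy.record_final_status
    else
      # run_at pushes the next check into the future; without it the new
      # job is due immediately and the worker loops without pause.
      delay(queue: :deploy, run_at: Time.now + POLL_INTERVAL).update_status
    end
  end
end
```

Each run enqueues at most one follow-up job, so the polling rate is bounded by `POLL_INTERVAL` rather than by how fast the worker can pick up jobs.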
See the discussion in google groups