On Heroku, I use delayed_job to run asynchronous tasks. All is well until I do a git push heroku master, at which point Heroku kills any worker threads that are mid-job.
The issue here is that those jobs never get re-queued since the delayed_job table in my db shows them as still locked and running, even though the workers that used to be servicing them are long dead.
How do I prevent this situation from occurring? I'd like Heroku to wait for all delayed jobs in progress to complete or error out before shutting down, or at least to terminate them cleanly so that a new worker can pick them up once the dynos come back up after the deploy.
Looks like you can configure DJ to handle SIGTERM and mark the in-progress jobs as failed (so they'll be retried). Make the worker raise an exception on TERM signals by adding this setting in your initializer:
Delayed::Worker.raise_signal_exceptions = :term
More info in this answer:
https://stackoverflow.com/a/16811844/1715829
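For example, in a Rails app the setting would typically live in an initializer. The file name below is just a convention, not something from the original answer:

# config/initializers/delayed_job.rb
# Raise a SignalException when the worker receives SIGTERM, so the job that
# is mid-run fails, is unlocked, and can be picked up again after the dyno
# restarts instead of staying locked by a dead worker.
Delayed::Worker.raise_signal_exceptions = :term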
I set up a log drain for my Heroku app that drains the logs via HTTPS.
I added it to my app with this toolbelt command:
heroku drains:add http://example.com --app MY_APP
Trouble is, I have a cron job that runs on a separate dyno (my worker process), and that isn't draining.
I really only want to drain the logs from the worker process.
Is there a way to add a drain to ONLY the worker process? Barring that, is there a way to make it so the worker process is included in the drain?
Thanks!
At this point in time it doesn't seem possible to filter a drain on anything.
The drain receives ALL logging from the specified app and all dynos in that app.
The only fix right now seems to be using a filter on the receiving end.
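As a rough sketch of receiving-end filtering: Heroku HTTPS drains POST batches of syslog-framed log lines, and each line carries the dyno name, so the endpoint can keep only the worker lines. Everything below (the tiny Rack app and the regex) is illustrative, not part of Heroku's tooling:

# config.ru -- a minimal Rack endpoint to receive the drain (sketch only)
run lambda { |env|
  env['rack.input'].read.each_line do |line|
    # Heroku log lines embed the dyno name, e.g. "... app[worker.1]: ..."
    puts line if line =~ /\[worker\.\d+\]/
  end
  [200, { 'Content-Type' => 'text/plain' }, ['ok']]
}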
I have a Heroku worker set up to do a long-running job that iterates over long periods. However, whenever I update and deploy other files in the repo, this worker restarts, which is annoying. Is there any way to avoid this?
No. This behaviour is part of Heroku's Automatic Dyno Restarting.
You can't work around this. Instead, you need to build all parts of your app to be able to function properly despite the fact that all dynos will restart at least once every 24 hours or so, whether or not you deploy updates in your repo.
Most significantly, you need to build support for Graceful Shutdown into all your processes (e.g. web process and worker processes).
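As a sketch of what that can look like for a long-running, iterating worker: checkpoint progress externally and watch for SIGTERM, so a restart only loses the item in flight. The Redis key, process_item, and TOTAL_ITEMS below are hypothetical, not from the original post:

require 'redis'

redis = Redis.new(url: ENV['REDIS_URL'])
shutting_down = false

# Heroku sends SIGTERM before restarting a dyno; keep the trap handler tiny
# and let the main loop notice the flag and exit cleanly.
Signal.trap('TERM') { shutting_down = true }

start = (redis.get('job:cursor') || 0).to_i
(start...TOTAL_ITEMS).each do |i|
  break if shutting_down
  process_item(i)                 # hypothetical per-item work
  redis.set('job:cursor', i + 1)  # checkpoint so the next boot resumes here
end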
I have two web dynos in my Heroku app, and at times get an automatic dyno restart (as per Heroku policy). Is the work that was in progress during the restart automatically resumed in the new restarted dyno? If not, is there a way I can control this restart?
Is the work that was in progress during the restart automatically resumed in the new restarted dyno?
No.
If not, is there a way I can control this restart?
No.
What you can do is trap the SIGTERM signal that is sent to your process 10 seconds before it is SIGKILLed. This gives you time to finish the current computation, stop taking web requests, do cleanup, etc. More details on the process are in the Heroku Dev Center.
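A minimal sketch of that pattern (do_unit_of_work and the cleanup step are placeholders, not real API):

$shutdown = false

# Keep the trap handler minimal; do the real shutdown work in the main loop.
Signal.trap('TERM') { $shutdown = true }

until $shutdown
  do_unit_of_work        # placeholder; keep each unit well under the kill window
end

flush_and_exit_cleanly   # placeholder cleanup that must fit in the grace period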
Some image resize jobs failed to exit when our heroku background worker was restarted.
The job is stuck in the busy page of the UI. It looks like it is occupying one of the busy threads, and it was started over an hour ago.
But upon inspecting the job args and checking the DB, it looks like the images were actually processed, so maybe it's just Redis, or the web UI is showing stale data.
For reference, the TID is osuuiyruo and the JID is 8e25ebc62ae7d7023a9b5650.
Is there any way to remove these "stuck" jobs? I tried quieting the workers and stopping them, then scaling Heroku workers to 0 and bringing them back up, but the jobs stay in that stuck busy queue.
Please state your Sidekiq version in the future.
If you are on 3.x, this will be fixed in 3.1.4. https://github.com/mperham/sidekiq/issues/1764
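In the meantime, to see what the Busy page is reading out of Redis, Sidekiq's API can enumerate the in-progress work; a rough sketch:

require 'sidekiq/api'

# Each entry mirrors a row on the Busy page: the process, the thread (TID),
# and the work hash (whose 'payload' includes the JID).
Sidekiq::Workers.new.each do |process_id, thread_id, work|
  puts "#{process_id} #{thread_id} running since #{Time.at(work['run_at'])}"
end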
I created some workers with coffee-resque and was trying to view them using the Ruby version of resque-web, but I only saw workers intermittently flash in and out.
I noticed that coffee-resque untracks workers while they are paused. Is that the intended behavior? It made resque-web list only the flashing intermittent workers, and they always had a status of waiting when they did appear, even though that was exactly when they were processing.
Am I doing it wrong or is there a suggested way of monitoring the worker queues?
Also, is there a way to clean up the inactive orphaned worker keys in Redis if the worker process failed and didn't do a graceful untrack on exit?
I recently provided a pull request that fixed this issue. It has been accepted into coffee-resque and a new version was released.
https://github.com/technoweenie/coffee-resque/issues/17
This fix was released as 0.1.6.
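On the orphaned-keys part of the question: since coffee-resque shares its Redis schema with the Ruby resque gem, one option is to lean on the Ruby gem's own pruning, run from the host where the dead workers lived. This is a sketch under that assumption, not something from the coffee-resque fix:

require 'resque'

# prune_dead_workers unregisters workers registered for this host whose pids
# are no longer running, removing their stale worker keys from Redis. It only
# checks pids on the current machine, so run it where the workers ran.
Resque::Worker.new(:default).prune_dead_workers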