Is there any way in Ruby to determine if a background worker is running?
For instance, I have a server that works a Delayed Job queue, and I would like to ensure four workers are on it and spin up a new worker process if one has stalled or quit.
From the command line, crontab -l lists the cron jobs scheduled for the current user.
From the Rails console, Delayed::Job.all returns every job in the queue; jobs that a worker has picked up have locked_at and locked_by set.
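For instance, with the ActiveRecord backend on Rails 4+ (both assumptions about your setup), you can separate the jobs a worker has already picked up from those still waiting:

# Rails console
running = Delayed::Job.where.not(locked_at: nil)              # picked up by a worker
pending = Delayed::Job.where(locked_at: nil, failed_at: nil)  # still waiting in the queue
puts "#{running.count} running, #{pending.count} pending"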
Delayed Job also has a list of lifecycle methods which you can access:
http://www.rubydoc.info/github/collectiveidea/delayed_job/Delayed/Lifecycle
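For example, those lifecycle events can be hooked through a Delayed Job plugin; this is a minimal sketch that just logs around each job invocation (the Rails.logger calls and file location are assumptions about your app):

# config/initializers/delayed_job_plugins.rb
class JobLoggingPlugin < Delayed::Plugin
  callbacks do |lifecycle|
    lifecycle.around(:invoke_job) do |job, *args, &block|
      Rails.logger.info "Starting job ##{job.id}"
      block.call(job, *args)
      Rails.logger.info "Finished job ##{job.id}"
    end
  end
end

Delayed::Worker.plugins << JobLoggingPlugin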
The usual way to do that is to use an external watchdog process; you can use Monit or God.
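If you go the God route, its config files are plain Ruby, so keeping four Delayed Job workers alive can look roughly like this sketch (the app path, RAILS_ENV and worker count are assumptions about your setup):

# delayed_job.god -- load with: god -c delayed_job.god
4.times do |i|
  God.watch do |w|
    w.name     = "delayed_job.#{i}"
    w.dir      = "/var/www/app/current"
    w.env      = { "RAILS_ENV" => "production" }
    w.start    = "bin/delayed_job start -i #{i} --pid-dir=tmp/pids"
    w.stop     = "bin/delayed_job stop -i #{i} --pid-dir=tmp/pids"
    w.pid_file = "/var/www/app/current/tmp/pids/delayed_job.#{i}.pid"
    w.interval = 30.seconds
    w.behavior(:clean_pid_file)

    # restart the worker whenever its process is no longer running
    w.start_if do |start|
      start.condition(:process_running) do |c|
        c.interval = 30.seconds
        c.running  = false
      end
    end
  end
end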
I have configured a queue worker as a daemon on Forge, then used the recommended deployment script command (php artisan queue:restart).
How do I manually stop and restart the queue worker? If I stop it, supervisor will just restart it. Do I need to kill the active worker in Forge first?
This may be required on an ad-hoc basis. For example, if I want to clear a log file that the queue has open.
I've been pretty vocal in deployment discussions, and I always tell people to stop their worker processes with the supervisorctl command.
supervisorctl stop <name of task>
Using the queue:restart command doesn't actually restart anything. It sets an entry in the cache, which the worker processes check, and then they shut down. As you noticed, supervisor will then restart the process.
This means that queue:restart has one huge problem, ignoring the naming and the fact that it doesn't restart: it will cause all worker processes on all servers that use the same cache to restart. I think this is wrong; a deployment should only affect the server currently being deployed to.
If you're using a per-server cache, like the file cache driver, then this has another problem; what happens if your deployment entirely removes the website folder? The cache would change, the queues would start again, and the worker process may have a mix of old and new code. Fun things to debug...
Supervisor will signal the process when it is shutting down, and wait for it to shut down cleanly, and if it doesn't, forcefully kill it. These timeouts can be configured in the supervisor configuration file. This means that using supervisorctl to stop the queue process will not terminate any jobs "half-way through", they will all complete (assuming they run for a short enough time, or you increase the timeouts).
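For reference, both the number of workers and that shutdown timeout live in the program section of the supervisor config; this is only a sketch, and the names, paths and values below are assumptions rather than Forge's exact defaults:

; /etc/supervisor/conf.d/laravel-worker.conf
[program:laravel-worker]
command=php /home/forge/example.com/artisan queue:work --sleep=3 --tries=3
process_name=%(program_name)s_%(process_num)02d
numprocs=4
autostart=true
autorestart=true
user=forge
; how long supervisor waits for a clean shutdown before force-killing
stopwaitsecs=3600

With a config like that, supervisorctl stop laravel-worker:* stops every worker in the group on that server only, and lets in-flight jobs finish up to the stopwaitsecs limit.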
Is it possible to create a script that is always running on my VPS server? And what do I need to do to keep it running the whole time? (I don't have a VPS server yet, but if this is possible I want to buy one!)
Yes you can; there are several ways to get the result you expect.
Supervisord
Supervisord is a process control system that keeps any process running. It automatically starts or restarts your process whenever necessary.
When to use it: Use it when you need a process that runs continuously, e.g.:
A queue worker that reads a database continuously, waiting for a job to run.
A Node application that acts like a daemon.
Cron
Cron allows you to run processes regularly, at set time intervals. You can, for example, run a process every minute, every 30 minutes, or at whatever interval you need.
When to use it: Use it when your process is not long-running: it does a task and ends, and you do not need it restarted automatically as with Supervisord (see the sample crontab entries after this list), e.g.:
A task that collects logs every day and sends them gzipped by email.
A backup routine.
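As a concrete sketch (the script paths are placeholders, not real files), crontab entries for those two examples might look like:

# crontab -e
# Run a backup routine every 30 minutes
*/30 * * * * /home/deploy/bin/backup.sh >> /var/log/backup.log 2>&1
# Collect and mail yesterday's logs, gzipped, every morning at 06:00
0 6 * * * /home/deploy/bin/mail_logs.sh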
Whatever you choose, there are many tutorials on the internet on how to configure both, so I won't go into those details.
I have several processes which currently run as rake tasks. Can I somehow use Sidekiq to execute a process in a continuous loop? Is that a best-practice with Sidekiq?
These processes, though they run in the background in a continuous loop in their respective rake tasks now, occasionally fail. Then I have to restart the rake task.
I am trying a couple of options, with help from the SO community. One is to figure out how to monitor the rake tasks with monit. But that means each process will have to have its own environment, adding to server load. Since I'm running in a virtualized environment, I want to eliminate that wherever possible.
The other option is just to leverage the Sidekiq option I already have. I use Sidekiq now for background processing, but it's always just one-offs. Is there some way I can have a continuous process in Sidekiq? And also be notified of failures and have the processes restart automatically?
The answer, per Sidekiq author Mike Perham, is to use a cron job for scheduled tasks like this. You can create a rake task which submits the job to Sidekiq to run in the background, then create a cron job to schedule it.
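A minimal sketch of that setup, using hypothetical names (SyncWorker, jobs:enqueue_sync) and assuming a Rails environment:

# app/workers/sync_worker.rb
class SyncWorker
  include Sidekiq::Worker
  sidekiq_options retry: 3   # Sidekiq retries failures and shows them in its Web UI

  def perform
    # do one pass of the work here; cron re-enqueues it on a schedule
  end
end

# lib/tasks/jobs.rake
namespace :jobs do
  desc "Enqueue the recurring sync job"
  task enqueue_sync: :environment do
    SyncWorker.perform_async
  end
end

# crontab entry to schedule it, e.g. every 10 minutes:
# */10 * * * * cd /var/www/app/current && bundle exec rake jobs:enqueue_sync RAILS_ENV=production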
I don't know why you went for Sidekiq; is this project-specific? Previously I faced the same problem, but I migrated to delayed_job and it satisfied my needs. If the ActiveRecord objects are transactional, use delayed_job; otherwise go for Resque, which is also a nice one.
I want to switch my single delayed_job process to multiple workers. I currently have an upstart job that runs rake and uses respawn with no 'expect fork', since rake does not fork. Now, to switch to multiple workers, I need 'expect' in my upstart configuration file. Any suggestions?
Out of the box, it appears that upstart's expect stanza does not support the behavior outlined in https://github.com/collectiveidea/delayed_job#running-jobs, as there are multiple workers that each fork twice to daemonize.
As outlined in this question about upstart: Can upstart expect/respawn be used on processes that fork more than twice?, you can use a bit of scripting to shepherd the processes yourself in the different hooks.
Another option would be to use upstart job instances (http://upstart.ubuntu.com/cookbook/#instance) to start multiple jobs that do not fork.
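A sketch of that instance approach (the file name, user and paths here are assumptions): each instance runs rake jobs:work in the foreground, so upstart can track it without any expect stanza.

# /etc/init/delayed-job.conf
description "delayed_job worker"
instance $N

respawn
setuid deploy
chdir /var/www/app/current

exec bundle exec rake jobs:work RAILS_ENV=production

You would then start one instance per worker, for example: start delayed-job N=0 through start delayed-job N=3.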
I'm not very clear on what you were asking. But if you want multiple delayed jobs to run in the background, then when you start Delayed Job with the daemon script (for example bin/delayed_job -n 4 start) rather than rake jobs:work, you can specify the number of worker processes you want to spawn. Hope it helps you.
My server process is basically an API that responds to REST requests.
Some of these requests are for starting long running tasks.
Is it a bad idea to do something like this?
get "/crawl_the_web" do
Thread.new do
Crawler.new # this will take many many days to complete
end
end
get "/status" do
"going well" # this can be run while there are active Crawler threads
end
The server won't be handling more than 1000 requests a day.
Not the best idea....
Use a background job runner to run jobs.
POST /crawl_the_web should simply add a job to the job queue. The background job runner will periodically check for new jobs on the queue and execute them in order.
You can use, for example, delayed_job for this, setting up a single separate process to poll for and run the jobs. If you are on Heroku, you can run delayed_job in a separate background worker dyno.
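For example, with delayed_job on the ActiveRecord backend (and assuming your Crawler exposes a run method, which is an assumption about your code), the routes can stay tiny while a separate rake jobs:work process does the crawling:

post "/crawl_the_web" do
  Crawler.new.delay.run   # persists a job record instead of spawning a thread
  status 202              # accepted; the work happens in the background
end

get "/status" do
  "#{Delayed::Job.where(locked_at: nil, failed_at: nil).count} jobs waiting"
end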
If you do this, how are you planning to stop/restart your Sinatra app? When you finally deploy your app, it will probably be served by Unicorn, Passenger/mod_rails, etc. Unicorn will manage the lifecycle of its child processes, and it will have no knowledge of these long-running threads that you might have launched, and that's a problem.
As someone suggested above, use delayed_job, resque or any other queue-based system to run background jobs. You get persistence of the jobs, you get horizontal scalability (just launch more workers on more nodes), etc.
Starting threads during request processing is a bad idea.
Besides the fact that you cannot control your worker threads (start or stop them in a controlled way), you'll quickly get into trouble if you start a thread inside request processing. Think about what happens: the request ends and the process gets ready to serve the next request, while your worker thread still runs and accesses process-global resources like the database connection, open files, class variables, global variables, and so on. Sooner or later, your worker thread (or any library used from it) will affect the main thread somehow, break other requests, and be almost impossible to debug.
You're really better off using separate worker processes. delayed_job for example is a really small dependency and easy to use.