Can I use Sidekiq for continuous processes? - ruby

I have several processes which currently run as rake tasks. Can I somehow use Sidekiq to execute a process in a continuous loop? Is that a best-practice with Sidekiq?
These processes, though they run in the background in a continuous loop in their respective rake tasks now, occasionally fail. Then I have to restart the rake task.
I am trying a couple of options, with help from the SO community. One is to figure out how to monitor the rake tasks with monit. But that means each process will have to have its own environment, adding to server load. Since I'm running in a virtualized environment, I want to eliminate that wherever possible.
The other option is just to leverage the Sidekiq option I already have. I use Sidekiq now for background processing, but it's always just one-offs. Is there some way I can have a continuous process in Sidekiq? And also be notified of failures and have the processes restart automatically?

The answer, per Sidekiq author Mike Perham, is to use a cron job for recurring tasks like this. You can create a rake task which submits the job to Sidekiq to run in the background, then create a cron job to schedule it.
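A minimal sketch of that setup (the worker name, task name, and schedule below are placeholders, not from the question):

# app/workers/report_worker.rb -- placeholder name
class ReportWorker
  include Sidekiq::Worker

  def perform
    # the work that used to run in the continuous rake loop goes here
  end
end

# lib/tasks/reports.rake -- placeholder name
namespace :reports do
  desc 'Enqueue the report job into Sidekiq'
  task enqueue: :environment do
    ReportWorker.perform_async
  end
end

A crontab entry along the lines of */10 * * * * cd /path/to/app && RAILS_ENV=production bundle exec rake reports:enqueue would then enqueue the job every ten minutes; if a run fails, Sidekiq's built-in retries re-run it, and the next cron tick starts a fresh run regardless.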

I don't know why you went for Sidekiq; is it project-specific? I previously faced the same problem and migrated to delayed_job, which satisfied my needs. If your ActiveRecord objects are transactional, use delayed_job; otherwise Resque is also a nice option.

Related

Ensure Background Worker is Alive and Working

Is there any way in Ruby to determine if a background worker is running?
For instance, I have a server that works a queue in delayed_job, and I would like to ensure 4 workers are on it and spin up a new worker process if one has stalled or quit.
From the command line, crontab -l lists the cron jobs that are currently scheduled.
From the Rails console, Delayed::Job.all returns all jobs currently in the queue.
Delayed Job also has a list of lifecycle methods which you can access:
http://www.rubydoc.info/github/collectiveidea/delayed_job/Delayed/Lifecycle
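As a rough illustration, assuming the standard delayed_jobs table columns (locked_by / locked_at), you can check queue health from the console; jobs that stay locked for a long time usually indicate a stalled worker:

# console sketch, assuming the default delayed_jobs schema
total = Delayed::Job.count
stuck = Delayed::Job.where('locked_at < ?', 1.hour.ago).count
busy  = Delayed::Job.where('locked_by IS NOT NULL').pluck(:locked_by).uniq
puts "#{total} jobs queued, #{stuck} possibly stuck, busy workers: #{busy.inspect}"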
The usual way to do that is to use an external watchdog process; you can use Monit or God.
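God configs are written in a Ruby DSL, so a hedged sketch of keeping four delayed_job daemons alive might look like this (the worker count, paths, and the daemon's -i identifier flag are assumptions to adapt to your setup):

# config/workers.god -- placeholder path
4.times do |i|
  God.watch do |w|
    w.name     = "delayed_job.#{i}"
    w.start    = "bundle exec script/delayed_job -i #{i} start"
    w.stop     = "bundle exec script/delayed_job -i #{i} stop"
    w.pid_file = "tmp/pids/delayed_job.#{i}.pid"
    w.interval = 30.seconds
    w.behavior(:clean_pid_file)

    # restart the worker whenever its process is no longer running
    w.start_if do |start|
      start.condition(:process_running) do |c|
        c.interval = 30.seconds
        c.running  = false
      end
    end
  end
end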

Does anyone run more than one resque worker in a Heroku Dyno?

Given that unicorn usually manages more than one Rails server process, and given that a Resque job runner probably consumes less resources than a Web request, it should be possible to run more than one resque worker on a single Heroku dyno.
Is anyone doing this successfully so far? My thought is that an easy way to do so would be to have the Procfile run foreman, which then runs 2 (or more) instances of the actual worker (i.e. rake resque:work).
Or is rake resque:workers up to that task? Resque itself does not recommend using that method, as this starts workers in parallel threads instead of in parallel processes.
Obviously, this makes sense only on i/o bound jobs.
One can use foreman to start multiple processes. Add foreman to your Gemfile, and then create two files:
Procfile:
worker: bundle exec foreman start -f Procfile.workers
Procfile.workers:
worker_1: QUEUE=* bundle exec rake resque:work
worker_2: QUEUE=* bundle exec rake resque:work
The same technique can be used to run a web server alongside some workers.
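On Heroku, scaling the single worker entry (e.g. heroku ps:scale worker=1) then gives you one dyno in which foreman fans out the two rake resque:work processes.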
NOTE: while many report success with this approach, I would not suggest using it outside of experiments, mostly because of the risk of running into RAM limits on small Heroku instances; and once you are paying for Heroku anyway, it is probably easier to just spin up a dedicated worker dyno.
Based on this article, it sounds like it's possible, but the biggest gotcha is that if one of the child processes dies, Heroku won't be able to restart it.

delayed_job with multiple workers and upstart

I want to switch my single delayed_job process to multiple workers. I currently have an upstart job that runs rake and uses respawn with no 'expect fork' stanza, since rake does not fork. To switch to multiple workers I need to 'expect' in my upstart configuration file. Any suggestions?
Out of the box, it appears that upstart expect does not support the behavior outlined in https://github.com/collectiveidea/delayed_job#running-jobs, as there are multiple workers that each fork twice to daemonize.
As outlined in this question about upstart: Can upstart expect/respawn be used on processes that fork more than twice?, you can use a bit of scripting to shepherd the processes yourself in the different hooks.
Another option would be to use upstart job instances (http://upstart.ubuntu.com/cookbook/#instance) to start multiple jobs that do not fork.
I'm not very clear on what you are asking, but if you want multiple delayed jobs running in the background, you can specify the number of workers to spawn when you start delayed_job (rather than running a single rake jobs:work). Hope that helps.
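For reference, the delayed_job daemon script described in the README linked above can spawn several worker processes directly, e.g. RAILS_ENV=production script/delayed_job -n 4 start; it is those daemonized workers that fork and trip up upstart's expect.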

Scheduling recurring task in ruby on rails 3.2

I have a Ruby on Rails website in which I want to perform some task at a fixed interval, such as sending a report by email every Sunday, for example.
I have looked at the whenever gem, but since it is a wrapper for the *nix utility cron, it may not work on Windows.
Which gem or method can I use to schedule such a task so that it does not depend on the underlying platform?
Both Clockwork and rufus-scheduler (optionally combined with delayed job) are good gems for scheduling tasks.
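For instance, a minimal rufus-scheduler sketch that runs inside the Rails process and does not depend on cron (the initializer path and ReportMailer are placeholders):

# config/initializers/scheduler.rb -- placeholder path
require 'rufus-scheduler'

scheduler = Rufus::Scheduler.new

# every Sunday at 08:00
scheduler.cron '0 8 * * 0' do
  ReportMailer.weekly_report.deliver
end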
If you are on torquebox, it already provides a job scheduler based on quartz.
Use a Webmin setup. Set up the cron jobs for the application script you want to execute and manage them through the Webmin server, i.e. your_ip_address:10000. It is a good way to handle job scheduling; I have used it in most of my projects.

Is it a bad idea to create worker threads in a server process?

My server process is basically an API that responds to REST requests.
Some of these requests are for starting long running tasks.
Is it a bad idea to do something like this?
get "/crawl_the_web" do
Thread.new do
Crawler.new # this will take many many days to complete
end
end
get "/status" do
"going well" # this can be run while there are active Crawler threads
end
The server won't be handling more than 1000 requests a day.
Not the best idea....
Use a background job runner to run jobs.
POST /crawl_the_web should simply add a job to the job queue. The background job runner will periodically check for new jobs on the queue and execute them in order.
You can use, for example, delayed_job for this, setting up a single separate process to poll for and run the jobs. If you are on Heroku, you can run delayed_job in a separate background worker/dyno.
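A hedged sketch of that approach in Sinatra, assuming delayed_job is already configured with its ActiveRecord backend (the question's Crawler does its work in the constructor, so that same call is simply enqueued via delayed_job's .delay):

require 'sinatra'

post "/crawl_the_web" do
  Crawler.delay.new # serialized into the jobs table; runs later in a worker (e.g. rake jobs:work)
  status 202        # accepted; the crawl happens in the background
end

get "/status" do
  "going well"      # unaffected by running crawls
end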
If you do this, how are you planning to stop/restart your Sinatra app? When you finally deploy, your application is probably going to be served by Unicorn, Passenger/mod_rails, etc. Unicorn will manage the lifecycle of its child processes, but it will have no knowledge of these long-running threads you might have launched, and that's a problem.
As someone suggested above, use delayed_job, resque or any other queue-based system to run background jobs. You get persistence of the jobs, you get horizontal scalability (just launch more workers on more nodes), etc.
Starting threads during request processing is a bad idea.
Besides the fact that you cannot control your worker threads (start/stop them in a controlled way), you'll quickly get into trouble if you start a thread inside request processing. Think about what happens: the request ends and the process gets ready to serve the next request, while your worker thread still runs and accesses process-global resources like the database connection, open files, class variables and globals. Sooner or later, your worker thread (or any library used from it) will affect the main thread somehow, break other requests, and be almost impossible to debug.
You're really better off using separate worker processes. delayed_job for example is a really small dependency and easy to use.
