How to make Ruby run some task every 10 minutes? - ruby

I would like to run a cron job every 10 minutes, but my system only allows hourly jobs, so I'm looking for another way to do this. I've seen Timer and sleep, but I'm not sure how to use them for this. A pointer to a good resource would be even better.

Take a look at http://rufus.rubyforge.org/rufus-scheduler/
rufus-scheduler is a Ruby gem for scheduling pieces of code (jobs). It understands running a job AT a certain time, IN a certain time, EVERY x time or simply via a CRON statement.
rufus-scheduler is no replacement for cron/at since it runs inside of Ruby.

To do this reliably, invest in a VPS and create the 10-minute cron job as desired. Trying to emulate cron all on your own is very likely to fail in unforeseen ways.
Creating a sleeping process is not the way to go about this. If your server doesn't give you the freedom to set up cron as you like, you probably can't create your own background process for this sort of thing either. You might be able to check, on each request, how many of the jobs need to be done (if it has been 25 minutes since the last request, you might have to do two), and then go back and do them retroactively.
But, seriously. You need your own server to do this dependably.
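For what it's worth, the "catch up on each request" idea above boils down to a bit of arithmetic. Here is a rough Ruby sketch. The helper names (runs_due, next_last_run) are made up for illustration, and it assumes you persist the last-run timestamp somewhere yourself (a file, a database row).

```ruby
# Hypothetical helpers for catching up on missed 10-minute jobs.
# Assumption: last_run is a Time you store and reload between requests.
INTERVAL_SECONDS = 10 * 60

# Number of whole intervals that have elapsed since last_run.
def runs_due(last_run, now, interval = INTERVAL_SECONDS)
  elapsed = now - last_run   # Time - Time gives seconds as a Float
  return 0 if elapsed < 0    # clock moved backwards: do nothing
  (elapsed / interval).floor
end

# Advance the stored timestamp by exactly the runs performed, so the
# leftover minutes carry over to the next request.
def next_last_run(last_run, runs, interval = INTERVAL_SECONDS)
  last_run + runs * interval
end
```

With 25 minutes between requests, runs_due gives 2, and the remaining 5 minutes count toward the next run. This still only fires when a request arrives, so it is no substitute for a real cron.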

Related

How does Laravel's task scheduling work without persisting the last completed date?

Laravel is (correctly) running scheduled tasks via the App\Console\Kernel#schedule method. It does this without the need for a persistence layer. Previously run scheduled tasks aren't saved to the database or stored in any way.
How is this "magic" achieved? I want to have a deeper understanding.
I have looked through the source, and it seems to be achieved by rounding down the current date and comparing that against the schedule frequency. Combined with the requirement that the scheduler runs every minute, it can say with a certain level of confidence whether a task should run. That is my interpretation, but I still can't fully grasp how it guarantees running on schedule, or how it handles failure or things being off by a few seconds.
EDIT: clarification in response to a comment.
By "a few seconds" I mean: how does the "round down" method work when it is run every minute, but not at the same second? Example: first run 00:01:00, 00:01:02, 00:02:04
To clarify further, and to help in understanding how it works: are there any boundary guarantees on how it functions? If it is run multiple times per minute, will it execute per-minute tasks multiple times in that minute?
Cron cannot guarantee second-level precision. That is why cron job intervals are generally never shorter than a minute. So, in reality, it doesn't handle "things being off by a few seconds."
What happens in Laravel is this: once the scheduling command is set up, the server asks "Is any job due?" every minute. If none is, it does nothing.
For example, take the "daily" cron job. The scheduler doesn't need to know when it last ran the task or anything like that. When it encounters the daily job, it simply checks whether it is midnight. If it is midnight, it runs the job.
Also, take the "every thirty minutes" cron job. Maybe you registered the job at 10:25. The first run will still be at 10:30, not at 10:55. It doesn't care what time you registered the job or when it last ran. It only checks whether the current minute is 00 or 30. So it will run at 10:30, again at 11:00, and so on.
Similarly, a ten-minute cron job by default only checks whether the current minute is divisible by ten. So, regardless of when you registered the command, it will run only at XX:00, XX:10, XX:20 and so on.
That is why, by default, it doesn't need to store previously run scheduled tasks. However, you can log them to a file if you want, for monitoring purposes.
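To make the "no stored state" point concrete, here is a small Ruby sketch of that kind of check. This is not Laravel's code, just the idea as described above. The method names are made up.

```ruby
# Stateless "is this task due right now?" checks. Only the current wall-clock
# time is consulted, never a stored last-run date.

# Due when the current minute is a multiple of `minutes` (e.g. 10 or 30).
def due_every?(minutes, now)
  (now.min % minutes).zero?
end

# Due when it is midnight, to the minute.
def due_daily?(now)
  now.hour.zero? && now.min.zero?
end
```

Seconds are ignored, so a run at 10:30:04 and one at 10:30:59 both count as "10:30". The flip side, in this sketch at least, is that calling the check twice within the same minute reports the task as due both times.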

Bash script - Maintain multiple instances running

How can I ensure, that multiple instances of certain program are always running?
Let's say that I want to make sure that 4 instances of a certain program are always running.
If one instance is killed, new one should start.
If 5 instances are running, one should be killed.
This is not really a shell question, because the approach is the same, whichever shell you are using.
I think the cleanest solution is to have a "watchdog", which checks the running processes (using ps) and, if necessary, starts a new one or kills an unnecessary one.
One way - which I have used in a similar situation - is to write a cron job which regularly (say, every 5 minutes) starts the watchdog and lets it do its work.
If such an interval is too long for your case (i.e. if you need to check more often than every minute), you could have the watchdog run continuously, in a loop. You will still need a cron job that checks on the watchdog from time to time, just in case the watchdog dies. In this case you might consider running it as a daemon.
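A rough sketch of what one pass of such a watchdog could look like, written in Ruby here rather than shell. Assumptions: it uses pgrep -x instead of parsing ps output, the names reconcile, running_pids and enforce are made up, and which surplus instance gets stopped is an arbitrary choice.

```ruby
DESIRED = 4

# Pure decision step: given the PIDs currently running, return
# [number_to_start, pids_to_stop].
def reconcile(pids, desired = DESIRED)
  if pids.size < desired
    [desired - pids.size, []]
  else
    # Arbitrary policy: keep the lowest-numbered PIDs, stop the rest.
    [0, pids.sort.drop(desired)]
  end
end

# PIDs of processes whose name matches exactly (assumes pgrep is installed).
def running_pids(name)
  IO.popen(["pgrep", "-x", name], &:read).split.map(&:to_i)
end

# One watchdog pass: start missing instances, stop surplus ones.
def enforce(name, command, desired = DESIRED)
  to_start, to_stop = reconcile(running_pids(name), desired)
  to_start.times { Process.detach(Process.spawn(command)) }
  to_stop.each { |pid| Process.kill("TERM", pid) }
end
```

Cron would then call enforce once per run, or a daemon would call it in a loop with a sleep in between.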

Scheduling a task run

I have a script that must run at a certain hour for the amount of time I specify.
I'm looking at the clockwork gem (https://github.com/tomykaira/clockwork), which seems to be the closest thing to what I need. Unfortunately it doesn't seem to let me set a duration (start at 3PM, stop at 5PM), so I would have to split the feature in two: starting the script would be clockwork's job, and stopping it would be handled in the script itself with a custom solution.
Very suboptimal and messy.
How do people do this in Ruby? TIA
There is a great gem called whenever for this kind of job. With it you can set an exact time for your task, like:
every 1.day, :at => '4:30 am' do
  runner "MyModel.task_to_run_at_four_thirty_in_the_morning"
end
But you'll need two stages, one for starting and one for stopping your job, which in my opinion is more natural than a job that kills itself at some point.
Somewhat janky, but there is another solution. I'm not sure what you are using to host your app, but on Heroku you can set up a scheduler to run every 10 minutes, on the hour, or daily. Then inside the method that the scheduler calls, you can determine the current time. Say you only want to run it between 3pm and 5pm, you would just wrap your code inside an if statement that verifies the current time is between 3pm and 5pm (watch out for time conversions with UTC).
Hope this helps.
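The time check from the scheduler approach could look roughly like this. It is a sketch: within_window? is a made-up name, and it assumes the window hours are given in UTC, so convert your local 3pm/5pm first.

```ruby
# True when `now` falls inside [start_hour, end_hour), with hours in UTC.
# Assumption: the window does not wrap past midnight.
def within_window?(now, start_hour, end_hour)
  hour = now.utc.hour
  hour >= start_hour && hour < end_hour
end
```

Inside the method your scheduler calls, you would then guard the work with something like `do_the_work if within_window?(Time.now, 15, 17)`, where do_the_work stands in for your own code.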

Repeated tasks - spawn new processes or run continuously?

We have about 10 different Python scripts that download data from the web, read data from a database and write data back to that database. They do so repeatedly every 10 seconds (or 10 seconds after the last task has completed).
The question is, what is the best approach at running these tasks? I can think of a few ways:
a while True that runs the task then sleeps for the interval. It could be guarded by a watchdog like supervisord, making sure it is always up.
having the script execute the task just once, and invoking the script externally once every 10 seconds by another process.
having the script execute the task lets say for 1 hour (every 10 seconds for an hour), and having a watchdog make sure that task runs again once the hour is over.
I would like to avoid long running processes that actually do something because I don't want to deal with memory problems etc over long periods of time.
Additional Information
The scripts are different because they each retrieve data from a different source, and query, calculate and insert different data into the database.
The tasks are performed every 10 seconds since the data being retrieved is real-time, and we need not only to keep updating it very frequently, but also to keep all the historical data in the database.
There are a lot of resources being used by the scripts - MySQL connections, HTTP connections, Redis connections, etc. We have encountered issues with using the long-running approach before, specifically with MySQL connections (things like MySQL server has gone away, even though all connections had been closed). Hence the inclination toward having the scripts run in shorter periods of time.
What are some common approaches at this?
Unless your scripts somehow leak memory (quite unlikely), the approaches should all behave the same. So, for sheer simplicity (your time programming and debugging is much more expensive than a few milliseconds of the machine's time, even every 10 seconds!) I'd go for the single script that checks every 10 seconds.
OTOH, checking every 10 seconds sounds like busywork. Can't you set things up so that whatever you are monitoring tells you when there are changes? Or batch the records up so you can retrieve, say, a day's worth at a time?
If you are running on Linux, cron has a granularity of one minute. We have processes we run constantly. Rather than watching them, the script acquires a semaphore (lock) that is released when the program finishes, whether normally or not. This way, if it runs long and cron calls it again, the new copy exits when it can't get the lock. You can call it as often as you need without it stepping on a copy that may still be running.
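The lock idea can be sketched with an advisory file lock. The example is in Ruby to match the rest of this page, although the question is about Python. with_exclusive_lock and the lock path are made up for illustration, and it assumes a platform where flock is available.

```ruby
# Runs the block only if no other copy currently holds the lock.
# Returns true if the block ran, false if another copy was already running.
def with_exclusive_lock(path)
  File.open(path, File::RDWR | File::CREAT, 0o644) do |f|
    # LOCK_NB makes flock return false immediately instead of waiting.
    return false unless f.flock(File::LOCK_EX | File::LOCK_NB)
    yield
    true
  end
  # The lock is released when the file is closed, including when the
  # process exits abnormally, so a crashed run cannot block later runs.
end
```

A script started by cron would wrap its whole body in the block and simply exit when the call returns false.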

How to run a per second cron job every two minutes

I have to set up a cron job on my hosting provider.
This cron job needs to run every second. It's not intensive, just doing a check.
The hosting provider however only allows cron jobs to be run every two minutes. (can't change hosting btw)
So, I'm clueless on how to go about this?
My thoughts so far:
If it can only run every two minutes, I need to make it run every second for two minutes. 1) How do I make my script run for two minutes executing a function every second?
But it's important that there are no interruptions. 2) I have to ensure that it runs smoothly and that it remains constantly active.
Maybe I can also try making it run forever, and run the cron job every two minutes checking whether it is running? 3) Is this possible?
My friend mentioned using multithreading to ensure it's running every second. 4) any comments on this?
Thanks for any advice. I'm using ZF.
Approach #3 is the standard solution. For instance you can have the cron job touch a file every time it runs. Then on startup you can check whether that file has been touched recently, and if it has then exit immediately. Else start running. (Other approaches include using file locking, or else writing the pid to a file and on startup check whether that pid exists and is the expected program.)
As for the one-second interval, I would suggest calling usleep at the end of your check, supplying the number of microseconds from now until you next want to run. If you do a regular fixed sleep then you'll actually run less than once a second, because sleeps sometimes last longer than expected and your check itself takes time. As long as your check takes under a second to run, this should work fine.
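The "sleep until the next slot" idea can be sketched like this. It is shown in Ruby here; in PHP the same arithmetic would feed usleep. sleep_needed and run_for are made-up names, and a monotonic clock is assumed so that wall-clock adjustments don't disturb the schedule.

```ruby
# Seconds to sleep so that tick number `tick` starts at
# started_at + tick * interval. Never negative: if we are late, don't sleep.
def sleep_needed(started_at, tick, now, interval = 1.0)
  target = started_at + tick * interval
  [target - now, 0.0].max
end

# Calls the block once per interval for roughly `duration` seconds, measuring
# each sleep from the start time so that errors do not accumulate.
def run_for(duration, interval = 1.0)
  started_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  ticks = (duration / interval).floor
  ticks.times do |i|
    yield i
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    sleep sleep_needed(started_at, i + 1, now, interval)
  end
end
```

A cron job firing every two minutes would then call `run_for(120) { ... }` with the check in the block, ideally combined with one of the "is another copy already running?" guards mentioned above.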
I don't think cron allows second level resolution. http://unixhelp.ed.ac.uk/CGI/man-cgi?crontab+5
field          allowed values
-----          --------------
minute         0-59
hour           0-23
day of month   1-31
month          1-12 (or names, see below)
day of week    0-7 (0 or 7 is Sun, or use names)
So, even if your hosting provider allowed it, cron could not run a job that repeats every second. However, you can use a command such as watch to execute your script repeatedly.
