I am following this to schedule my Django cron job on Heroku.
Procfile:
web: gunicorn tango.wsgi --log-file -
clock: python createStatistics.py
createStatistics.py:
from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()

@sched.scheduled_job('interval', minutes=1)
def timed_job():
    print('This job is run every minute.')

@sched.scheduled_job('cron', day=14, hour=15, minute=37)
def scheduled_job():
    print('This job is run on day 14 at minute 37, 3pm.')

sched.start()
The timed_job runs OK; however, the scheduled_job has no effect. Do I need to set up any time zone information for APScheduler (I have TIME_ZONE set in settings.py)? If so, how? Or did I miss anything?
Specific to Heroku, for reasons I have not been able to figure out yet, it seems that you need to specify the optional id field on cron jobs to make them work. So the cron job definition would now look like this:
@sched.scheduled_job('cron', id="job_1", day=14, hour=15, minute=37)
def scheduled_job():
    print('This job is run on day 14 at minute 37, 3pm.')
You must also specify the timezone on each job; otherwise Heroku will run it in UTC.
@sched.scheduled_job('cron', day=14, hour=15, minute=37, timezone=YOUR_TIME_ZONE)
def scheduled_job():
    print('This job is run on day 14 at minute 37, 3pm.')
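Putting the two fixes together, a minimal clock script could look like the sketch below. The 'Europe/London' zone is only a stand-in for whatever you have in settings.TIME_ZONE, and 'job_1' is an arbitrary id.

from apscheduler.schedulers.blocking import BlockingScheduler

sched = BlockingScheduler()

# 'Europe/London' is a placeholder; use the same zone name as TIME_ZONE in settings.py.
@sched.scheduled_job('cron', id='job_1', day=14, hour=15, minute=37, timezone='Europe/London')
def scheduled_job():
    print('This job is run on day 14 at minute 37, 3pm.')

sched.start()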
When viewing submitted jobs managed by Slurm, I would like the time limit column (specified by %l) to show only hours, instead of the usual days-hours:minutes:seconds format. This is the command I am currently using:
squeue --format="%.6i %.5P %.25j %.8u %.8T %.10M %.5l %.15b %.5C %.6D %R" --sort=+i --me
and this is the example output:
276350 qgpu jobname username RUNNING 1:14:14 1-00:00:00 gres:gpu:v100:1 18 1 s31n02
So, in this case, I would like the elapsed time to remain as is (1:14:14), but the time limit to change from 1-00:00:00 to 24. Is there a way to do it?
This is the way Slurm displays durations. The elapsed time will eventually be displayed the same way (days-hours:minutes:seconds) once it passes 23:59:59.
You can use a wrapper script to convert it into a different format (see the sketch after the example output below). Or, if you know the time limit is no more than a day, just set the time limit to 23:59:00 by using --time=1439.
salloc -N1 --time=1439 bash
Using your squeue command:
166 mypartition interactive jyvet RUNNING 7:36 23:59:00 N/A 1 1 mynode
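If you do go the wrapper-script route mentioned above, one rough sketch is to pipe squeue through a small filter that rewrites the time-limit column into whole hours. The script below is a hypothetical example: it assumes the format string from the question, so the limit is the 7th whitespace-separated column, and it only handles the D-HH:MM:SS / HH:MM:SS forms, leaving anything else (UNLIMITED, N/A, the header) untouched.

#!/usr/bin/env python3
# squeue_hours.py - hypothetical filter: convert the TIME_LIMIT column to whole hours.
import sys

def limit_to_hours(limit):
    days, _, rest = limit.rpartition('-')          # split off an optional "D-" prefix
    parts = rest.split(':')
    if not parts or not all(p.isdigit() for p in parts):
        return limit                               # UNLIMITED, N/A, header text, ...
    hours = int(days) * 24 if days else 0
    if len(parts) == 3:                            # HH:MM:SS
        hours += int(parts[0])
    return str(hours)

for line in sys.stdin:
    cols = line.split()
    if len(cols) >= 7:
        cols[6] = limit_to_hours(cols[6])          # column 7 corresponds to %l in the format above
        print(' '.join(cols))                      # note: this drops the fixed-width padding
    else:
        print(line, end='')

Usage would be something like:
squeue --format="..." --sort=+i --me | python3 squeue_hours.py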
I'm trying to print a list of all the servers that are affected by cron job recipes in Chef, and I would like the cron time value printed next to each cron job for a specific server.
I have already used a script like this:
for i in $(ls cookbooks/cookbook_name/recipes);
do knife search node "recipes:*${i%.*}" -i;
done
I want to extract the crontab time from an expression like this:
# frequent zfs snapshot
cron_d 'zfs-auto-snapshot-frequently' do
  minute '*/15'
  path PATH
  command '/opt/zfs-auto-snapshot.sh frequent 6'
end

# Hourly
cron_d 'zfs-auto-snapshot-hourly' do
  minute '59'
  path PATH
  command '/opt/zfs-auto-snapshot.sh hourly 4'
end

# Daily
cron_d 'zfs-auto-snapshot-daily' do
  minute '59'
  hour '23'
  path PATH
  command '/opt/zfs-auto-snapshot.sh daily 3'
end

# Weekly
cron_d 'zfs-auto-snapshot-weekly' do
  minute '0'
  hour '0'
  weekday '1'
  path PATH
  command '/opt/zfs-auto-snapshot.sh weekly 1'
end

# Monthly
cron_d 'zfs-auto-snapshot-monthly' do
  minute '0'
  hour '0'
  day '1'
  path PATH
  command '/opt/zfs-auto-snapshot.sh monthly 1'
end
I need a list output sort of like this:
admindb.xyz.com ***3*
server.xyz.com ***3*
office.xyz.com *2***
collector01.xyz.com *3*4*
This is not really possible short of making a ton of assumptions about your code and writing some terrible regexes. Chef is code; static analysis of any code is hard, and with Ruby it's somewhere between infuriating and "lolno". Your best bet would be to not look at Chef at all and instead query each machine directly via SSH.
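If you do go the SSH route, a rough sketch of the idea is below. Everything in it is an assumption for illustration: it expects a hosts.txt with one hostname per line (for example, collected from the knife search loop above), key-based SSH access, and the jobs living in root's crontab; adjust the user and the match pattern to your environment.

#!/usr/bin/env python3
# crontab_report.py - hypothetical helper: print "host  minute hour dom month dow"
# for every zfs-auto-snapshot entry found in root's crontab on each node.
import subprocess

with open('hosts.txt') as f:
    hosts = [line.strip() for line in f if line.strip()]

for host in hosts:
    result = subprocess.run(
        ['ssh', host, 'crontab -l -u root'],
        capture_output=True, text=True, timeout=30,
    )
    for line in result.stdout.splitlines():
        if 'zfs-auto-snapshot' in line and not line.lstrip().startswith('#'):
            fields = line.split()
            if len(fields) >= 6:                 # 5 schedule fields plus the command
                print(host, ' '.join(fields[:5]))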
I'm working on a production app that has multiple Rails servers behind an nginx load balancer. We are monitoring sidekiq processes with monit, and it works just fine - when a sidekiq process dies, monit starts it right back up.
However, we recently encountered a situation where one of these processes was running and visible to monit, but for some reason not visible to sidekiq. That resulted in many failed jobs, and it took us some time to notice that we were missing one process in the Sidekiq Web UI, since monit was telling us everything was fine and all processes were running. A simple restart fixed the problem.
And that brings me to my question: how do you monitor your sidekiq processes? I know I can use something like Rollbar to notify me when jobs fail, but I'd like to know if there is a way to monitor the process count and preferably send mail when one dies. Any suggestions?
Something that would ping sidekiq/stats and verify the response.
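For the "ping sidekiq/stats" idea: if the app mounts Sidekiq::Web (commonly under /sidekiq), the UI exposes a JSON stats endpoint that an external check can poll. The sketch below is only an illustration - the URL, any authentication you would need, the expected process count, and even the exact JSON shape (which can vary between Sidekiq versions) are all assumptions to verify against your setup.

#!/usr/bin/env python3
# sidekiq_stats_check.py - hypothetical external check, run from cron.
import json
import sys
import urllib.request

STATS_URL = 'https://example.com/sidekiq/stats'   # placeholder URL
EXPECTED_PROCESSES = 2                            # how many Sidekiq processes you expect

try:
    with urllib.request.urlopen(STATS_URL, timeout=10) as resp:
        data = json.load(resp)
    processes = data.get('sidekiq', {}).get('processes', 0)
except Exception as exc:
    print(f'CRITICAL: could not fetch Sidekiq stats: {exc}')
    sys.exit(2)

if processes < EXPECTED_PROCESSES:
    print(f'CRITICAL: only {processes} Sidekiq process(es) reporting in')
    sys.exit(2)                                   # non-zero exit so cron/monit can alert or send mail

print(f'OK: {processes} Sidekiq process(es) running')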
My super simple solution to a similar problem looks like this:
# sidekiq_check.rb
namespace :sidekiq_check do
  task rerun: :environment do
    if Sidekiq::ProcessSet.new.size == 0
      exec 'bundle exec sidekiq -d -L log/sidekiq.log -C config/sidekiq.yml -e production'
    end
  end
end
and then using cron/whenever
# schedule.rb
every 5.minutes do
  rake 'sidekiq_check:rerun'
end
We ran into this problem where our sidekiq processes had stopped working off jobs overnight and we had no idea. It took us about 30 minutes to integrate http://deadmanssnitch.com by following these instructions.
It's not the prettiest or cheapest option but it gets the job done (integrates nicely with Pagerduty) and has saved our butt twice in the last few months.
One of our complaints with the service is that the shortest grace interval we can set is 15 minutes, which is too long for us. So we're evaluating similar services like Healthchecks, etc.
My approach is the following:
create a background job that does something
call the job regularly
check that the thing is being done!
So, using a cron script (or something like whenever), every 5 mins I run:
CheckinJob.perform_later
It's now up to sidekiq (or delayed_job, or whatever Active Job backend you're using) to actually run the job.
The job just has to do something which you can check.
I used to have the job update a record in my Status table (essentially a list of key/value records). Then I'd have a /status page which returns a 500 status code if the record hasn't been updated in the last 6 minutes.
(obviously your timing may vary)
Then I use a monitoring service to monitor the status page! (something like StatusCake)
Nowadays I have a simpler approach; I just get the background job to check in with a cron monitoring service like:
IsItWorking
Dead Mans Snitch
Health Checks
The monitoring service expects your task to check in every X mins. If your task doesn't check in, then the monitoring service will let you know.
Integration is dead simple for all the services. For Is It Working it would be:
IsItWorkingInfo::Checkin.ping(key:"CHECKIN_IDENTIFIER")
Full disclosure: I wrote IsItWorking!
I use the god gem to monitor my sidekiq processes. God makes sure that your process is always running and can also notify you about the process status on various channels.
ROOT = File.dirname(File.dirname(__FILE__))

God.pid_file_directory = File.join(ROOT, "tmp/pids")

God.watch do |w|
  w.env = {'RAILS_ENV' => ENV['RAILS_ENV'] || 'development'}
  w.name = 'sidekiq'
  w.start = "bundle exec sidekiq -d -L log/sidekiq.log -C config/sidekiq.yml -e #{ENV['RAILS_ENV']}"
  w.log = "#{ROOT}/log/sidekiq_god.log"
  w.behavior(:clean_pid_file)
  w.dir = ROOT
  w.keepalive

  w.restart_if do |restart|
    restart.condition(:memory_usage) do |c|
      c.interval = 120.seconds
      c.above = 100.megabytes
      c.times = [3, 5] # 3 out of 5 intervals
    end

    restart.condition(:cpu_usage) do |c|
      c.interval = 120.seconds
      c.above = 80.percent
      c.times = 5
    end
  end

  w.lifecycle do |on|
    on.condition(:flapping) do |c|
      c.to_state = [:start, :restart]
      c.times = 5
      c.within = 5.minute
      c.transition = :unmonitored
      c.retry_in = 10.minutes
      c.retry_times = 5
      c.retry_within = 1.hours
    end
  end
end
I have the below piece of code:
Converter.delay.convert("some params")
Now I want this job to run for a maximum of 1 minute. If that is exceeded, delayed_job should raise an exception.
I tried setting up
Delayed::Worker.max_run_time = 1.minute
but it seems it sets a timeout on the worker, not on the job.
Converter class is defined in RAILS_ROOT/lib/my_converter.rb
Timeout in the job itself
require 'timeout'

class Converter
  def self.convert(params)
    Timeout.timeout(60) do
      # your processing
    end
  end
end
Delayed::Worker.max_run_time = 1.minute
This is the maximum time allowed for each task given to a worker. If the execution of any task takes longer than specified, an exception is raised:
execution expired (Delayed::Worker.max_run_time is only 1 minutes)
The worker continues to run and processes the next tasks.
An example of output from Capistrano:
INFO [94db8027] Running /usr/bin/env uptime on leehambley#example.com:22
DEBUG [94db8027] Command: /usr/bin/env uptime
DEBUG [94db8027] 17:11:17 up 50 days, 22:31, 1 user, load average: 0.02, 0.02, 0.05
INFO [94db8027] Finished in 0.435 seconds command successful.
As you can see, each line starts with "{type} {hash}". I assume the hash is some unique identifier for either the server or the running thread, as I've noticed that if I run Capistrano over several servers, each one has its own distinct hash.
My question is, how do I get this value? I want to manually output some message during execution, and I want to be able to match my output, with the server that triggered it.
Something like: puts "DEBUG ["+????+"] Something happened!"
What do I put in the ???? there? Or is there another, built-in way to output messages like this?
For reference, I am using Capistrano Version: 3.2.1 (Rake Version: 10.3.2)
This hash is a command UUID. It is tied not to the server but to the specific command that is currently being run.
If all you want is to distinguish between servers, you may try the following:
task :some_task do
  on roles(:app) do |host|
    debug "[#{host.hostname}:#{host.port}] something happened"
  end
end