When using go-cron, if multiple services are started at the same time, will multiple cron tasks be executed at the same time? - go

When using go-cron, if multiple services are started at the same time, will multiple cron tasks be executed at the same time? What can be done to ensure that the cron task is only executed once?
My cron expression is [0 2 */1 * *].

You can configure SingletonMode for your task to prevent a new job from starting if the prior job has not yet completed:
s := gocron.NewScheduler(time.UTC)
_, _ = s.Every(1).Second().SingletonMode().Do(task)
https://pkg.go.dev/github.com/go-co-op/gocron#Scheduler.SingletonMode
Or enable SingletonModeAll to prevent new jobs from starting if the prior instance of the particular job has not yet completed:
s := gocron.NewScheduler(time.UTC)
s.SingletonModeAll()
_, _ = s.Every(1).Second().Do(task)
https://pkg.go.dev/github.com/go-co-op/gocron#Scheduler.SingletonModeAll
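For reference, a minimal runnable sketch, assuming the go-co-op/gocron v1 API and a placeholder task function, that combines the cron expression from the question with SingletonMode:
package main

import (
    "fmt"
    "time"

    "github.com/go-co-op/gocron"
)

// task stands in for the real job body.
func task() {
    fmt.Println("nightly job ran at", time.Now())
}

func main() {
    s := gocron.NewScheduler(time.UTC)

    // Cron accepts the 5-field expression from the question; SingletonMode
    // skips a new run while the previous run of this job is still executing.
    _, err := s.Cron("0 2 */1 * *").SingletonMode().Do(task)
    if err != nil {
        panic(err)
    }

    s.StartBlocking()
}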

Related

Ansible: How to execute ALTER statements on database with delay?

I have generated a file with a bunch of ALTER statements, based on a certain condition on the cluster, using an Ansible task.
Here's the sample file content:
alter table test1 with throttling = 0.0;
alter table test2 with throttling = 0.0;
alter table test3 with throttling = 0.0;
I want to log in and execute these ALTER statements with a delay of 2 minutes. I was able to achieve this with a shell script using the sleep command, by copying the shell script from the control node and executing it on the remote node.
But the problem we noticed was that we were unable to check whether the script executed properly or failed (e.g., authentication to the DB failed).
Can we perform the same task using an Ansible module and execute the statements one by one with some delay?
Regarding
Can we perform the same task using ansible module and execute them one by one with some delay?
the short answer is: yes, of course.
A slightly longer minimal example:
- name: Run queries against db test_db with pause
  community.mysql.mysql_query:
    login_db: test_db
    query: "ALTER TABLE test{{ item }} WITH throttling = 0.0;"
  loop: [1, 2, 3]
  loop_control:
    pause: 120 # seconds
Further Documentation
Community.Mysql - Modules
mysql_query module – Run MySQL queries
Pausing with loop
Extended loop variables
Further Q&A
How to run MySQL query in Ansible?
Ensuring a delay in an Ansible loop

How to see celery tasks in redis queue when there is no worker?

I have a container creating celery tasks, and a container running a worker.
I have removed the worker container, so I expected that tasks would accumulate in the redis list of tasks.
But I can't see any tasks in redis.
This is with Django. I need to isolate the worker and queue, hence the settings below.
A typical queue name is 'test-dear', that is, SHORT_HOSTNAME='test-dear'
CELERY_DATABASE_NUMBER = 0
CELERY_BROKER_URL = f"redis://{REDIS_HOST}:6379/{CELERY_DATABASE_NUMBER}"
CELERY_RESULT_BACKEND = f"redis://{REDIS_HOST}:6379/{CELERY_DATABASE_NUMBER}"
CELERY_BROKER_TRANSPORT_OPTIONS = {'global_keyprefix': SHORT_HOSTNAME }
CELERY_TASK_DEFAULT_QUEUE = SHORT_HOSTNAME
CELERY_TASK_ACKS_LATE = True
After starting everything, and stopping the worker, I add tasks.
For example, on the producer container after python manage.py shell
>>> from cached_dear import tasks
>>> t1 = tasks.purge_deleted_masterdata_fast.delay()
<AsyncResult: 9c9a564a-d270-444c-bc71-ff710a42049e>
t1.get() does not return.
then in redis:
127.0.0.1:6379> llen test-dear
(integer) 0
I was not expecting 0 entries.
What am I doing wrong or not understanding?
I did this from the redis container
redis-cli monitor | grep test-dear
and sent a task.
The list is test-deartest-dear and
llen test-deartest-dear
works to show the number of tasks which have not yet been sent to a worker.
The queue name is f"{global_keyprefix}{queue_name}".

How to restart a job in Zeebe

I have a method that runs as part of a Zeebe workflow job. When it fails, I want to restart the whole job. I found that this can be done with NewFailJobCommand, but it seems that the job fails on the first try. How can I restart the job if it fails?
err := w.workflowStore.InitScanEventsTTL(ctx, scanID, job.Msg.Tenant)
if err != nil {
    return w.client.NewFailJobCommand().JobKey(job.Key).Retries(job.Retries - 1).ErrorMessage(reason).Send(ctx)
}
You need to specify the retry count in the task properties in the process model.
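For illustration, a hedged sketch of a job handler using the official Zeebe Go client (github.com/camunda/zeebe/clients/go/v8); the job type "scan-events", the gateway address, and the doWork function are hypothetical placeholders:
package main

import (
    "context"

    "github.com/camunda/zeebe/clients/go/v8/pkg/entities"
    "github.com/camunda/zeebe/clients/go/v8/pkg/worker"
    "github.com/camunda/zeebe/clients/go/v8/pkg/zbc"
)

// handleJob is registered for the service task's job type.
func handleJob(client worker.JobClient, job entities.Job) {
    ctx := context.Background()

    if err := doWork(job); err != nil {
        // Report the failure and decrement the retry counter; the broker
        // re-dispatches the job while the remaining retries are > 0. The
        // initial count comes from the retries property of the task
        // definition in the process model.
        _, _ = client.NewFailJobCommand().
            JobKey(job.GetKey()).
            Retries(job.GetRetries() - 1).
            ErrorMessage(err.Error()).
            Send(ctx)
        return
    }

    // On success, complete the job so the process can continue.
    _, _ = client.NewCompleteJobCommand().JobKey(job.GetKey()).Send(ctx)
}

// doWork is a placeholder for the real work, e.g. InitScanEventsTTL from the question.
func doWork(job entities.Job) error {
    return nil
}

func main() {
    zbClient, err := zbc.NewClient(&zbc.ClientConfig{
        GatewayAddress: "127.0.0.1:26500", // hypothetical gateway address
        UsePlaintext:   true,
    })
    if err != nil {
        panic(err)
    }

    w := zbClient.NewJobWorker().JobType("scan-events").Handler(handleJob).Open()
    defer w.Close()
    w.AwaitClose()
}
With the retries attribute set on the task definition in the model, the handler only has to report the failure; the broker re-dispatches the job until the retries are exhausted and then raises an incident.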

Delayed job: Set max run time of job, not the worker

I have the below piece of code:
Converter.delay.convert("some params")
Now I want this job to run for a maximum of 1 minute. If that is exceeded, delayed job should raise an exception.
I tried setting up
Delayed::Worker.max_run_time = 1.minute
but it seems it sets a timeout on the worker, not on the job.
Converter class is defined in RAILS_ROOT/lib/my_converter.rb
Timeout in the job itself
require 'timeout'

class Converter
  def self.convert(params)
    Timeout.timeout(60) do
      # your processing
    end
  end
end
Delayed::Worker.max_run_time = 1.minute
This is the maximum run time for each task given to the worker. If the execution of any task takes longer than specified, an exception is raised:
execution expired (Delayed::Worker.max_run_time is only 1 minutes)
The worker continues to run and processes the next tasks.

Parallel processing with dependencies on a SGE cluster

I'm doing some experiments on a computing cluster. My algorithm has two steps. The first one writes its outputs to some files which will be used by the second step. The dependencies are 1 to n, meaning one step-2 program needs the output of n step-1 programs. I'm not sure how to do this without either wasting cluster resources or keeping the head node busy. My current solution is:
submit script (this runs on the head node):
for different params, p:
    run step 1 with p
sleep some time based on an estimate of how long step 1 takes
for different params, q:
    run step 2 with q

step 2 algorithm (this runs on the computing nodes):
while files are not ready:
    sleep a few minutes
do the step 2
Is there any better way to do this?
SGE provides both job dependencies and array jobs for that. You can submit your phase 1 computations as an array job and then submit the phase 2 computation as a dependent job using qsub -hold_jid <phase 1 job ID|name> .... This will make the phase 2 job wait until all the phase 1 computations have finished; it will then be released and dispatched. The phase 1 computations will run in parallel as long as there are enough slots in the cluster.
In a submission script it might be useful to specify holds by job name and to name each array job in a unique way. E.g.
mkdir experiment_1; cd experiment_1
qsub -N phase1_001 -t 1-100 ./phase1
qsub -hold_jid phase1_001 -N phase2_001 ./phase2 q1
cd ..
mkdir experiment_2; cd experiment_2
qsub -N phase1_002 -t 1-42 ./phase1 parameter_file
qsub -hold_jid phase1_002 -N phase2_002 ./phase2 q2
cd ..
This will schedule 100 executions of the phase1 script as the array job phase1_001 and another 42 executions as the array job phase1_002. If there are 142 slots on the cluster, all 142 executions will run in parallel. Then one execution of the phase2 script will be dispatched after all tasks in the phase1_001 job have finished and one execution will be dispatched after all tasks in the phase1_002 job have finished. Again those can run in parallel.
Each task in the array job will receive a unique $SGE_TASK_ID value, ranging from 1 to 100 for the tasks in job phase1_001 and from 1 to 42 for the tasks in job phase1_002. From it you can compute the p parameter.
