Spring Scheduler code within an App with multiple instances with multiple JVMs - spring

I have a Spring scheduler task configured with either fixedDelay or cron, and multiple instances of this app running on multiple JVMs.
By default, every instance executes the scheduled task.
Is there a way to control this behavior so that only one instance executes the scheduled task and the others don't?
Please let me know if you know of any approaches.
Thank you

We had a similar problem. We fixed it like this:
Removed all @Scheduled beans from our Spring Boot services.
Created an AWS Lambda function that runs on the desired schedule.
The Lambda function hits our top-level domain with a scheduling request.
The load balancer forwards this request to one of the service instances.
This way we are sure that the scheduled task is executed only once across the cluster of our services.

I faced a similar problem: the same scheduled batch job was running on two servers when it was intended to run on only one node at a time. I later found a way to skip the job if it is already running on another server.
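Job someJob = ...
// Ask the job repository which executions of this job are still running.
Set<JobExecution> jobs = jobExplorer.findRunningJobExecutions("someJobName");
// Launch only when no node is currently running the job.
if (jobs == null || jobs.isEmpty()) {
    jobLauncher.run(someJob, jobParametersBuilder.toJobParameters());
}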
So before launching the job, check whether it is already executing on another node.
Please note that this approach works only with a DB-based job repository, because all nodes must see the same execution records.

We had the same problem: our three instances were running the same job and doing the tasks three times every day. We solved it by making use of Spring Batch. Spring Batch treats a job name plus its identifying parameters as a unique job instance, so if you launch the job with a parameter such as the date, it refuses to start a duplicate with the same value. In our case we used a date like '2020-1-1' (since the job runs only once a day). All three instances try to start the job with id '2020-1-1', but Spring rejects the two duplicates, stating that job '2020-1-1' is already running.
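To make the idea concrete, here is a minimal sketch of "first claim on the date wins". It is not the Spring Batch API: DailyJobRegistry and DailyJobDemo are hypothetical names, and the in-memory set only stands in for the shared database that Spring Batch would use across instances.

```java
import java.time.LocalDate;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a shared job repository. In a real
// cluster this state would live in the database that all instances share.
final class DailyJobRegistry {
    private final Set<String> startedJobIds = ConcurrentHashMap.newKeySet();

    // Returns true only for the first caller that claims the given job id.
    boolean tryStart(String jobId) {
        return startedJobIds.add(jobId);
    }
}

final class DailyJobDemo {
    // Simulates `instances` nodes that all try to start the job for one date
    // and returns how many of them were allowed to run it.
    static int countStarts(DailyJobRegistry registry, LocalDate runDate, int instances) {
        String jobId = runDate.toString(); // ISO form, e.g. 2020-01-01
        int started = 0;
        for (int i = 0; i < instances; i++) {
            if (registry.tryStart(jobId)) {
                started++;
            }
        }
        return started;
    }

    public static void main(String[] args) {
        DailyJobRegistry registry = new DailyJobRegistry();
        // Three instances compete for the same date; prints 1.
        System.out.println(countStarts(registry, LocalDate.of(2020, 1, 1), 3));
    }
}
```

The next day's date is a new id, so the job is free to run once again.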

If I understand your question correctly, you want to run this scheduled job on a single instance. In that case I think you should look at ShedLock.
ShedLock makes sure that your scheduled tasks are executed at most once at the same time. If a task is being executed on one node, it acquires a lock which prevents execution of the same task from another node (or thread). Please note, that if one task is already being executed on one node, execution on other nodes does not wait, it is simply skipped.
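The "skip, don't wait" behavior described above can be sketched roughly like this. This is not ShedLock's code or API: SkipLockTable and SkipLockDemo are hypothetical names, the map stands in for a shared lock store, and the 10-minute maximum hold time is an arbitrary assumption.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory lock table. A real setup would keep this in a store
// that every node can reach (a database table, Redis, and so on).
final class SkipLockTable {
    private final Map<String, Instant> lockedUntil = new ConcurrentHashMap<>();

    // Tries to take the named lock without blocking. Returns false when
    // another holder still has an unexpired lock.
    boolean tryLock(String name, Instant now, Duration atMost) {
        boolean[] acquired = {false};
        lockedUntil.compute(name, (key, until) -> {
            if (until == null || !until.isAfter(now)) {
                acquired[0] = true;
                return now.plus(atMost); // expiry guards against a crashed holder
            }
            return until; // still held: leave the entry unchanged
        });
        return acquired[0];
    }

    void unlock(String name) {
        lockedUntil.remove(name);
    }
}

final class SkipLockDemo {
    // Runs the task only if the lock is free; otherwise skips it.
    // Returns true if the task ran on this call.
    static boolean runIfFree(SkipLockTable table, String name, Instant now, Runnable task) {
        if (!table.tryLock(name, now, Duration.ofMinutes(10))) {
            return false; // another node holds the lock: skip, do not wait
        }
        try {
            task.run();
            return true;
        } finally {
            table.unlock(name);
        }
    }
}
```

The expiry time matters: without it, a node that dies while holding the lock would block the task on every other node forever.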

Related

How can I start running server in one yml job and tests in another when run server job is still running

So I have 2 yml pipelines currently... one starts running the server and after server is up and running I start the other pipeline that runs tests in one job and once that's completed starts a job that shuts down the server from first pipeline.
I'm kinda new to yml and wondering if there is a way to run all this in a single pipeline...
The problem I came across is that if I put the server in the first job, I do not know how to make the second job kick off once the server is running. The first job never reaches a succeeded or failed state, because it is still in progress: the server has to keep running for the tests to run.
I tried adding a variable that I set to true after server is running but it still never jumps to the next job?
I looked into templates too, but those are not very clear to me, so any suggestion, documentation, or tutorial on how to achieve this in one pipeline would be very helpful...
I already googled a bunch and will keep googling but figured someone here might have an answer already.
Each agent can run only one job at a time. To run multiple jobs in parallel you must configure multiple agents. You also need sufficient parallel jobs.
You can specify the conditions under which each job runs. By default, a job runs if it does not depend on any other job, or if all of the jobs that it depends on have completed and succeeded. You can customize this behavior by forcing a job to run even if a previous job fails or by specifying a custom condition.
Since you already set a variable to true once the server is running, try a custom condition on the second job so that it runs only when that variable has the expected value.
For more details, see the official docs:
Specify jobs in your pipeline
Specify conditions

Interrupting a job in quartz with multiple instances

I have 5 instances of an application using Quartz in cluster mode (with PostgreSQL), all of them with the Quartz scheduler running.
org.quartz.jobStore.isClustered:true
org.quartz.scheduler.instanceName: myInstanceName
org.quartz.scheduler.instanceId: AUTO
I have a job that starts, does some operations, and then either updates itself with a new scheduled time if necessary or deletes itself. (One job can contain only one trigger.)
The application has a UI that allows the user to cancel the job.
When the interrupt command is sent from the UI:
If the job is not currently running, I can pause or cancel it.
If the job is currently running, how can I stop it on the correct instance and get its current state? Basically, I want to capture the job's data at the moment the user interrupts it and save it.
Does scheduler.interrupt(jobKey) correctly interrupt my job, which implements InterruptableJob?
Does scheduler.interrupt() know which instance is currently running the job, so that it finds the correct instance and gets the right state of the job?
Can you correct me, or tell me which way I should go?
The interrupt method implementations and getCurrentlyExecutingJobs() in Quartz are not cluster aware,
which means the method has to be run on the instance that is executing the job. In other words, only jobs with the specified job key that are running in the current instance will be interrupted.
An interrupt request can be broadcast to all running Quartz instances to cancel every running instance of the job.
from: https://www.quartz-scheduler.org/api/2.1.7/org/quartz/Scheduler.html#interrupt(org.quartz.JobKey)
This method is not cluster aware. That is, it will only interrupt
instances of the identified InterruptableJob currently executing in
this Scheduler instance, not across the entire cluster.
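A rough sketch of the broadcast idea follows. It does not use Quartz: NodeStub and InterruptBroadcaster are hypothetical names, and the loop stands in for whatever messaging you use to reach every instance. On a real node, the handler for the broadcast message would call scheduler.interrupt(jobKey) on its local scheduler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for one application instance and its local scheduler.
final class NodeStub {
    private final String nodeId;
    private final Set<String> runningJobKeys = ConcurrentHashMap.newKeySet();

    NodeStub(String nodeId) {
        this.nodeId = nodeId;
    }

    String id() {
        return nodeId;
    }

    void markRunning(String jobKey) {
        runningJobKeys.add(jobKey);
    }

    // Mirrors the non-cluster-aware behavior: only a job executing on this
    // node is affected. Returns true if this node was running the job.
    boolean interruptLocal(String jobKey) {
        return runningJobKeys.remove(jobKey);
    }
}

final class InterruptBroadcaster {
    // Sends the interrupt request to every node and returns the ids of the
    // nodes that were actually running the job.
    static List<String> broadcast(List<NodeStub> nodes, String jobKey) {
        List<String> interruptedOn = new ArrayList<>();
        for (NodeStub node : nodes) {
            if (node.interruptLocal(jobKey)) {
                interruptedOn.add(node.id());
            }
        }
        return interruptedOn;
    }
}
```

Because the node that handles the interrupt is the one running the job, that is also the place to capture and save the job's state at the moment of interruption.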

Schedule application Spring on Docker cluster

I need to schedule a batch job in my Spring application, which runs in a Docker cluster with several nodes.
I found the solution of setting replicas=1 in docker-compose, but in my opinion this isn't the best solution because it limits what Docker can do.
Some help or advice? Thank you.
If I understand you correctly you want to run several replicas of a spring application (it does not matter if this is managed by docker, k8s, it is standalone, etc). Then you want a background job to be started only on one single instance. Right? In this case I may advise you to have a look at ShedLock.
ShedLock does one and only one thing. It makes sure your scheduled
tasks are executed at most once at the same time. If a task is being
executed on one node, it acquires a lock which prevents execution of
the same task from another node (or thread). Please note, that if one
task is already being executed on one node, execution on other nodes
does not wait, it is simply skipped.
It integrates smoothly in Spring. For example a scheduled batch job may look like this:
@Scheduled(cron = ...)
@SchedulerLock(name = "scheduledTaskName")
public void scheduledTask() {
    // do something
}
Various options may be used under the hood to implement a distributed lock, e.g. MySQL, Redis, Zookeeper and others.

Schedule a trigger for a job that is executed on every node in a cluster

I'm wondering if there is a simple workaround/hack in Quartz for triggering a job that is executed on every node in a cluster.
My situation:
My application is caching some things and is running in a cluster with no distributed-cache. Now I have situations where I want to refresh the caches on all nodes triggered by a job.
As you have found out, Quartz always picks up a random instance to execute a scheduled job and this cannot be easily changed unless you want to hack its internals.
Probably the easiest way to achieve what you describe would be to implement some sort of a coordinator (or master) job that will be aware of all Quartz instances in the cluster and will "manually" trigger execution of the cache-sync job on every single node. The master job can easily do it via the RMI, or JMX APIs exposed by Quartz.
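A minimal sketch of such a coordinator, under stated assumptions: CacheNode and CacheRefreshCoordinator are hypothetical names, and each CacheNode would in practice wrap a remote call (RMI, JMX, or similar) to one instance. How the coordinator learns the list of nodes is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical handle to one node in the cluster. A real implementation
// would wrap a remote call to that node.
interface CacheNode {
    void refreshCache();
}

final class CacheRefreshCoordinator {
    private final List<CacheNode> nodes;

    CacheRefreshCoordinator(List<CacheNode> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    // Asks every node to refresh its cache and returns how many succeeded.
    // A failing node is skipped so that it does not block the others.
    int refreshAll() {
        int refreshed = 0;
        for (CacheNode node : nodes) {
            try {
                node.refreshCache();
                refreshed++;
            } catch (RuntimeException e) {
                // Unreachable or failing node: continue with the rest.
            }
        }
        return refreshed;
    }
}
```

The master job itself can stay a normal clustered Quartz job: whichever instance picks it up calls refreshAll() once and thereby reaches every node.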
You may want to check this somewhat similar question.

How does Spring-XD handle job execution

I can't get the information out of the documentation. Can anyone tell me how Spring-XD executes jobs? Does it assign a job to a certain container and is this job only executed on the container it is deployed to, or is each job execution assigned to another container? Can I somehow control that a certain job may be executed in parallel (with different arguments) and others may not ?
Thanks!
Peter
I am sure you would have seen some of the documentation here:
https://github.com/spring-projects/spring-xd/wiki/Batch-Jobs
To answer your questions:
Can anyone tell me how Spring-XD executes jobs? Does it assign a job to a certain container and is this job only executed on the container it is deployed to, or is each job execution assigned to another container?
After you create a new job definition using this:
xd>job create dailyfeedjob --definition "myfeedjobmodule" --deploy
the batch job module myfeedjobmodule gets deployed into the XD container. Once deployed, there is a job launching queue setup in the message broker: redis, rabbit or local. The name of the queue is job:dailyfeedjob in the message broker. Since this queue is bound to the job module deployed in the XD container, a request message sent to this queue is picked by the job module deployed inside that specific container.
Now, you can send the job launching request message (with job parameters) into the job:dailyfeedjob queue by simply setting up a stream that sends a message into this queue. For example, a trigger (fixed-delay, cron, date triggers) could do that. There is also a job launch command in the shell, which launches the job only once.
This section would explain it more: https://github.com/spring-projects/spring-xd/wiki/Batch-Jobs#launching-a-job
Hence, the job is launched (every time it receives a job launching request) only inside the container where the job module is deployed, and you can expect the original Spring Batch flow when the job is executed. (Refer to the shell doc for all the job-related commands.)
Can I somehow control that a certain job may be executed in parallel (with different arguments) and others may not ?
If it is for the different job parameters for the same job definition, then it would go to the same container where the job module is deployed.
But, you can still create a new job definition with the same batch job module.
xd>job create myotherdailyfeedjob --definition "myfeedjobmodule" --deploy
The only difference is that it will be under its own namespace, and the job launching queue name would be job:myotherdailyfeedjob. It all depends on how you want to organize running your batch jobs.
Also, for parallel processing batch jobs you can use:
http://docs.spring.io/spring-batch/reference/html/scalability.html
and, XD provides single step partitioning support for running batch jobs:
Include this in your job module:
<import resource="classpath:/META-INF/spring-xd/batch/singlestep-partition-support.xml"/>
with partitioner and tasklet beans defined.
You can try out some of the XD batch samples from here:
https://github.com/spring-projects/spring-xd-samples
