Run parallel jobs in Spring Batch - parallel-processing

I want to run jobs in parallel. I have tried many solutions from the links below:
using spring batch to execute jobs in parallel
How to run spring batch jobs in parallel
What I want is:
I have 3 jobs doing different processing, as follows
(I actually have n jobs, but for now consider 3):
Job1->Do Student File Processing
Job2->Do Employee File Processing
Job3->any other Processing
These jobs have nothing in common; they are independent of each other, and all of them should run in parallel. Below is the sample code:
@Bean
public Job importUserJob() {
    return jobBuilderFactory.get("importUserJob")
            .incrementer(new RunIdIncrementer())
            .listener(listener())
            .flow(stepOne()).end().build();
}

@Bean
public Job importUserOtherJob() {
    return jobBuilderFactory.get("importUserOtherJob")
            .incrementer(new RunIdIncrementer())
            .listener(listener())
            .flow(stepSecond()).end().build();
}
Similarly, I have two steps:
stepOne -> reader, processor, and writer persisting to the database
stepSecond -> reader, processor, and writer persisting to the database
But these jobs do not run in parallel: the second job waits for the first one to complete. I want them to run in parallel. I have also added a SimpleAsyncTaskExecutor to the JobLauncher, as described in Multiple spring batch jobs, and even added @Qualifier("asyncJobLauncher") when autowiring the JobLauncher, as in https://github.com/spring-projects/spring-boot/issues/1655.
Please tell me a way to achieve parallel execution of these jobs.
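For reference, the usual Spring Batch fix is to give the JobLauncher an asynchronous executor (SimpleJobLauncher#setTaskExecutor with a SimpleAsyncTaskExecutor) and then launch every job through that launcher bean. The underlying idea — one thread per independent job, so neither waits for the other — can be sketched in plain Java (class and job names here are illustrative, not the asker's code):

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelJobsSketch {

    // Launch each named "job" on its own thread and wait for all of them.
    static Set<String> runAllInParallel(List<String> jobNames) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(jobNames.size());
        Set<String> completed = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(jobNames.size());
        for (String name : jobNames) {
            executor.submit(() -> {
                completed.add(name); // stand-in for the real job's steps
                done.countDown();
            });
        }
        done.await(5, TimeUnit.SECONDS);
        executor.shutdown();
        return completed;
    }

    public static void main(String[] args) throws InterruptedException {
        Set<String> completed = runAllInParallel(
                List.of("importUserJob", "importUserOtherJob"));
        System.out.println(completed.size());
    }
}
```

With an asynchronous launcher, each launch call returns immediately and the jobs proceed concurrently, just as the two submitted tasks do here.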

Related

how to prevent quartz from running another job if the previous one is still running?

I'm using Quarkus. My Quartz jobs are scheduled to run every 10 seconds:
return TriggerBuilder.newTrigger()
        .withIdentity("my-job")
        .startNow()
        .withSchedule(
                SimpleScheduleBuilder.simpleSchedule()
                        .withIntervalInSeconds(10)
                        .repeatForever()
        ).build();
This works fine, but jobs keep triggering every 10 seconds regardless of whether the last one has finished. I need the next job to start only if no instance is currently running. How do I accomplish this?
Add @DisallowConcurrentExecution on your Job class. For example:
@DisallowConcurrentExecution
public class MyScheduledJob implements Job {
    @Override
    public void execute(JobExecutionContext context) throws JobExecutionException {
        // execution logic
    }
}

Spring Batch Running duplicate steps(more than once) in a tasklet

Let's say I have following flow,
Start->Step1->Step2->Step3->Step2->End
I have created tasklets for each step and configured a Job as above.
When the job is triggered, execution is fine up to Step3, and then it goes into an infinite loop.
So, is there a way to run a step more than once in a job flow?
I am using Spring Batch 4.2.1.RELEASE.
How are you writing your job? I used to have this kind of problem when I used many flows based on batch decisions.
Have you tried something like this?
@Bean
fun jobSincAdUsuario(): Job {
    estatisticas.reset()
    return job.get("batch-job")
        .incrementer(RunIdIncrementer())
        .start(step1())
        .next(step2())
        .next(step3())
        .next(step2())
        .build()
}
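Conceptually, a simple job built with start/next is just an ordered list of step executions, and the same step reference can appear at two positions. A plain-Java sketch of that idea (illustrative names, not Spring Batch internals):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RepeatedStepSketch {
    public static void main(String[] args) {
        AtomicInteger step2Runs = new AtomicInteger();
        Runnable step1 = () -> System.out.println("step1");
        Runnable step2 = () -> System.out.println("step2 run #" + step2Runs.incrementAndGet());
        Runnable step3 = () -> System.out.println("step3");

        // The same step object listed twice: Start -> 1 -> 2 -> 3 -> 2 -> End.
        List<Runnable> flow = List.of(step1, step2, step3, step2);
        flow.forEach(Runnable::run);

        System.out.println("step2 executed " + step2Runs.get() + " times");
    }
}
```

The sequence terminates because it is a finite list, not a conditional transition — which is why moving away from decision-based flows can remove the infinite loop.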

Batch or Chain for jobs inside jobs

I have job A, which downloads XML and then calls job B, which creates data in the database. Job B is called in a loop and there can be more than 10,000 items. First I tried the chain method, but the problem is that if someone calls the queue in the wrong sequence, it will not work. Then I tried batches from the new Laravel 8, but collecting all jobs (more than 10,000) into one batch can cause an out-of-memory exception. Another problem is calling job C at the end: this job updates some credentials, which is why jobs A and B must have completed successfully. Is there a good approach for this situation?
Laravel's job batching feature allows you to easily execute a batch of jobs and then perform some action when the batch of jobs has completed executing.
If you have an out-of-memory problem with job batching, you are doing something wrong. Since the queued jobs are executed one by one (if you have configured the queue that way), there should be no problem even with more than 100k records. So make sure you create one job per item and dispatch the batch; you won't have problems with this.
Then, you could do something like this.
$chain = [
    new ProcessPodcast(Podcast::find(1)),
    new ProcessPodcast(Podcast::find(2)),
    new ProcessPodcast(Podcast::find(3)),
    new ProcessPodcast(Podcast::find(4)),
    new ProcessPodcast(Podcast::find(5)),
    ...
    // And so on for all your items.
    // This should be generated by a foreach over all of them.
];
Bus::batch($chain)->then(function (Batch $batch) {
    // All jobs completed successfully...
    // Update some credentials...
})->catch(function (Batch $batch, Throwable $e) {
    // First batch job failure detected...
})->finally(function (Batch $batch) {
    // The batch has finished executing...
})->dispatch();

Keep track of laravel queued jobs

I'm trying to get department details from an API which supports pagination, so I spawn one job per page, like the following:
/departments?id=1&page=1 -> job1
/departments?id=1&page=2 -> job2
How can I keep track of these jobs for a particular department, as I have to write the responses to a txt file?
The jobs are instantiated via a controller class like:
class ParseAllDeptsJob implements ShouldQueue
{
    public function handle()
    {
        foreach (Departments::all() as $dept) {
            ParseDeptJob::dispatch($dept);
        }
    }
}
You can chain jobs using withChain(). A chained job will not run if a job higher up the chain fails.
From the documentation:
Job chaining allows you to specify a list of queued jobs that should be run in sequence. If one job in the sequence fails, the rest of the jobs will not be run. To execute a queued job chain, you may use the withChain method on any of your dispatchable jobs:
In your case, this is how you'd do it:
ParseAllDeptsJob::withChain([
    new SendEmailNotification
])->dispatch();
SendEmailNotification won't be dispatched if an error occurs while processing ParseAllDeptsJob.

Limit the lifetime of a batch job

Is there a way to limit the lifetime of a running spring-batch job to e.g. 23 hours?
We start a batch job daily via a cron job, and the job takes about 9 hours. Under some circumstances the DB connection was so slow that the job took over 60 hours to complete. The problem is that the next job instance gets started by the cron job the next day - and then another one the day after - and another one...
If this job is not finished within e.g. 23 hours, I want to terminate it and return an error. Is there a way to do that out-of-the-box with spring-batch?
Using a StepListener you can stop a job by calling StepExecution#setTerminateOnly.
stepBuilderFactory
    ...
    .writer(writer)
    .listener(timeoutListener)
    ...
    .build()
And the TimeoutListener could look like this:
@Component
public class TimeoutListener implements StepListener {

    private StepExecution stepExecution;

    @BeforeStep
    public void beforeStep(StepExecution stepExecution) {
        this.stepExecution = stepExecution;
    }

    @BeforeRead
    public void beforeRead() {
        if (jobShouldStop()) {
            stepExecution.setTerminateOnly();
        }
    }

    private boolean jobShouldStop() {
        // e.g. compare the step's start time against a maximum runtime
        // ...
    }
}
This will gracefully stop the job without forcefully terminating any running steps.
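A minimal sketch of what jobShouldStop() might check, assuming the cutoff is measured from the step's start time; the 23-hour budget, class name, and constructor are illustrative, not part of the Spring Batch API:

```java
import java.time.Duration;
import java.time.Instant;

public class TimeoutCheck {

    private final Instant startTime;
    private final Duration maxRuntime;

    TimeoutCheck(Instant startTime, Duration maxRuntime) {
        this.startTime = startTime;
        this.maxRuntime = maxRuntime;
    }

    // True once the job has been running longer than the allowed budget.
    boolean jobShouldStop() {
        return Duration.between(startTime, Instant.now()).compareTo(maxRuntime) > 0;
    }

    public static void main(String[] args) {
        Duration limit = Duration.ofHours(23);
        TimeoutCheck fresh = new TimeoutCheck(Instant.now(), limit);
        TimeoutCheck stale = new TimeoutCheck(Instant.now().minus(Duration.ofHours(24)), limit);
        System.out.println(fresh.jobShouldStop());
        System.out.println(stale.jobShouldStop());
    }
}
```

In the listener, the start instant would come from the captured StepExecution rather than being passed in by hand.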
Spring Batch specifically avoids the issue of job orchestration, which this falls into. That being said, you could add a listener to your job that checks for other running instances and calls stop on them before this one begins. Not knowing what each job does, I'm not sure how effective that would be, but it should be a start.
If you write your own job class to launch the process, you can have your class implement the StatefulJob interface, which prevents concurrent launches of the same job. Apart from that, you can write your own monitoring and stop the job programmatically after some period, but that requires custom coding; I don't know if there is anything built-in for such a use case.