Spring Batch - Running duplicate steps (more than once) in a tasklet - spring

Let's say I have following flow,
Start->Step1->Step2->Step3->Step2->End
I have created tasklets for each step and configured a Job as above.
When the job is triggered, execution is fine until Step3, but then it goes into an infinite loop.
So is there a way to run a step more than once in a job flow?
I am using Spring Batch 4.2.1.RELEASE.

How are you writing your job? I used to have this kind of problem when I used many flows based on batch decisions.
Have you tried something like this?
@Bean
fun jobSincAdUsuario(): Job {
    estatisticas.reset()
    return job.get("batch-job")
        .incrementer(RunIdIncrementer())
        .start(step1())
        .next(step2())
        .next(step3())
        .next(step2())
        .build()
}
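For comparison, the Java-config equivalent of the snippet above would be along these lines (a minimal sketch only, assuming a jobBuilderFactory field and the step1(), step2(), step3() bean methods from the question):
@Bean
public Job batchJob() {
    return jobBuilderFactory.get("batch-job")
            .incrementer(new RunIdIncrementer())
            .start(step1())
            .next(step2())
            .next(step3())
            .next(step2()) // the step bean is simply referenced a second time
            .build();
}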

Related

how to prevent quartz from running another job if the previous one is still running?

I'm using Quarkus. My Quartz jobs are scheduled to run every 10 seconds:
return TriggerBuilder.newTrigger()
.withIdentity("my-job")
.startNow()
.withSchedule(
SimpleScheduleBuilder.simpleSchedule()
.withIntervalInSeconds(10)
.repeatForever()
).build();
This works fine, but jobs keep triggering every 10 seconds regardless of whether the last one has finished. I need the next job to start only if no job is currently running. How do I accomplish this?
Add @DisallowConcurrentExecution to your Job class.
For example:
@DisallowConcurrentExecution
public class MyScheduledJob implements Job {

    @Override
    public void execute(JobExecutionContext context) throws JobExecutionException {
        // execution logic
    }
}
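For completeness, the annotated class is the one referenced when the JobDetail is built and scheduled with the trigger above (a sketch; the scheduler variable is assumed, and "my-job" is reused from the question):
JobDetail jobDetail = JobBuilder.newJob(MyScheduledJob.class)
        .withIdentity("my-job")
        .build();

// @DisallowConcurrentExecution is enforced per JobDetail, so a fire that
// overlaps a running execution is held back until that execution finishes.
scheduler.scheduleJob(jobDetail, trigger);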

Batch or Chain for jobs inside jobs

I have job A, which downloads XML and then calls another job B, which creates data in the database. Job B is called in a loop and there can be more than 10,000 items. I first tried the chain method, but the problem is that if the queue is called in the wrong sequence it will not work. Then I tried batching from the new Laravel 8. Collecting all jobs (more than 10,000) into one batch can cause an out-of-memory exception. Another problem is calling job C at the end, which updates some credentials; that is why jobs A and B must have run successfully first. Is there a good approach for this situation?
Laravel's job batching feature allows you to easily execute a batch of jobs and then perform some action when the batch of jobs has completed executing.
If you have an out-of-memory problem with job batching, you are doing something wrong. Since the queued jobs are executed one by one (if you have it configured that way), there should be no problem even with more than 100k records. So make sure you queue one job per item and execute the action; you won't have problems with this.
Then, you could do something like this.
$chain = [
    new ProcessPodcast(Podcast::find(1)),
    new ProcessPodcast(Podcast::find(2)),
    new ProcessPodcast(Podcast::find(3)),
    new ProcessPodcast(Podcast::find(4)),
    new ProcessPodcast(Podcast::find(5)),
    // ... and so on for all your items;
    // in practice, build this array in a foreach over all the items.
];
Bus::batch($chain)->then(function (Batch $batch) {
    // All jobs completed successfully...
    // Update some credentials...
})->catch(function (Batch $batch, Throwable $e) {
    // First batch job failure detected...
})->finally(function (Batch $batch) {
    // The batch has finished executing...
})->dispatch();

Spring Batch - How to output Thread and Grid number to console or log

In my Spring Batch configuration I have this:
@Bean
public TaskExecutor taskExecutor() {
    SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor("myJob");
    taskExecutor.setConcurrencyLimit(15);
    taskExecutor.setThreadNamePrefix("SrcToDest");
    return taskExecutor;
}
I also have a "master-step" where I set the grid size, as below:
@Bean
@Qualifier("masterStep")
public Step masterStep() {
    return stepBuilderFactory.get("masterStep").partitioner("step1", partitioner()).step(step1())
            .taskExecutor(threadpooltaskExecutor()).taskExecutor(taskExecutor())
            .gridSize(10).build();
}
In my case, I see only "Thread-x" at the end when "myjob" finishes with "COMPLETED" status.
Questions
For monitoring purposes, how can I print the thread number to the console/log throughout the execution, i.e. from the start to the finish of "myjob"?
Is there some way to get output to the console/log showing the grid activity as well?
I could not find any example of this anywhere, including the Spring Guides.
I am still looking for a way to display the grid numbers on the console.
This depends on your partitioner. You can add a log statement in your partitioner and show the grid size, so at partitioning time it is in your hands.
At partition-handling time, Spring Batch logs each execution of the worker step at debug level.
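To illustrate the first point, a partitioner can log the grid size and each partition it creates (a minimal sketch; the LoggingPartitioner name and the partition keys are illustrative, not from the question):
import java.util.HashMap;
import java.util.Map;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;

public class LoggingPartitioner implements Partitioner {

    private static final Logger log = LoggerFactory.getLogger(LoggingPartitioner.class);

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        log.info("Partitioning with gridSize={}", gridSize);
        Map<String, ExecutionContext> partitions = new HashMap<>();
        for (int i = 0; i < gridSize; i++) {
            ExecutionContext context = new ExecutionContext();
            context.putInt("partitionNumber", i);
            partitions.put("partition" + i, context);
            log.info("Created partition{}", i);
        }
        return partitions;
    }
}
If your logging pattern includes the thread name, the worker threads will show up with the "SrcToDest" prefix configured above, provided taskExecutor() is the executor actually passed to the master step.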

Run parallel jobs in Spring Batch

I want to run jobs in parallel. I have tried many solutions from the links below:
using spring batch to execute jobs in parallel
How to run spring batch jobs in parallel
What I want is this:
I have 3 jobs doing different processing, as follows
(I have n jobs, but for now 3):
Job1 -> Do Student File Processing
Job2 -> Do Employee File Processing
Job3 -> Any other Processing
These jobs have nothing in common and are independent of each other. All of them should run in parallel. Below is the sample code:
@Bean
public Job importUserJob() {
    return jobBuilderFactory.get("importUserJob")
            .incrementer(new RunIdIncrementer())
            .listener(listener())
            .flow(stepOne()).end().build();
}
@Bean
public Job importUserOtherJob() {
    return jobBuilderFactory.get("importUserOtherJob")
            .incrementer(new RunIdIncrementer())
            .listener(listener())
            .flow(stepSecond()).end().build();
}
Similarly, I have two steps:
stepOne -> reading, processing, and writing to the database
stepSecond -> reading, processing, and writing to the database
But these jobs do not run in parallel; the second job waits for the first one to complete. I want them to run in parallel. I have also added a SimpleAsyncTaskExecutor to the JobLauncher, as in this link - Multiple spring batch jobs - and have also added @Qualifier("asyncJobLauncher") when autowiring the JobLauncher, per this link - https://github.com/spring-projects/spring-boot/issues/1655.
Please tell me a way to achieve parallel processing of these jobs.
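For reference, the asynchronous JobLauncher setup referred to above usually looks something like this (a minimal sketch, assuming Spring Batch 4.x; the asyncJobLauncher bean name and the thread-name prefix are illustrative):
@Bean("asyncJobLauncher")
public JobLauncher asyncJobLauncher(JobRepository jobRepository) throws Exception {
    SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
    jobLauncher.setJobRepository(jobRepository);
    // Launch each job on its own thread instead of the caller's thread
    jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor("job-"));
    jobLauncher.afterPropertiesSet();
    return jobLauncher;
}
With such a launcher, jobLauncher.run(job, params) returns as soon as the job has started, so the jobs above can be launched back to back and run concurrently.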

Spring Batch - A job instance already exists

OK,
I know this has been asked before, but I still can't find a definite answer to my question, which is this: I am using Spring Batch to export data to a SOLR search server. It needs to run every minute so that I can export all the updates. The first execution passes OK, but the second one complains with:
2014-10-02 20:37:00,022 [defaultTaskScheduler-1] ERROR: catching
org.springframework.batch.core.repository.JobInstanceAlreadyCompleteException: A job instance already exists and is complete for parameters={catalogVersionPK=3378876823725152,
type=UPDATE}. If you want to run this job again, change the parameters.
at org.springframework.batch.core.repository.support.SimpleJobRepository.createJobExecution(SimpleJobRepository.java:126)
at
Of course I can add a date-time parameter to the job like this:
.addLong("time", System.currentTimeMillis())
and then the job can be run more than once. However, I also want to query the last execution of the job, so I have code like this:
DateTime endTime = new DateTime(0);
JobExecution je = jobRepository.getLastJobExecution("searchExportJob", new JobParametersBuilder().addLong("catalogVersionPK", catalogVersionPK).addString("type", type).toJobParameters());
if (je != null && je.getEndTime() != null) {
endTime = new DateTime(je.getEndTime());
}
and this returns nothing, because I didn't provide the time parameter. So it seems I can either run the job once and get the last execution time, or run it multiple times and not get the last execution time. I am really stuck :(
Assumption
Spring Batch uses some tables to store each executed job together with its parameters.
If you run the job twice with the same parameters, the second run fails, because the job instance is identified by the job name and the parameters.
Solution #1
You could keep the JobExecution returned when you run a new job.
JobExecution execution = jobLauncher.run(job, new JobParameters());
// ...
// Use a JobExecutionDao to retrieve the JobExecution by ID
JobExecution ex = jobExecutionDao.getJobExecution(execution.getId());
Solution #2
You could implement a custom JobExecutionDao and perform a custom query to find your JobExecution in the BATCH_JOB_EXECUTION table.
See the Spring reference here.
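A minimal sketch of such a custom query (assuming the default BATCH_ table prefix and a JdbcTemplate; the LastExecutionQuery class is illustrative, not part of Spring Batch):
import java.util.Date;
import java.util.List;

import org.springframework.jdbc.core.JdbcTemplate;

public class LastExecutionQuery {

    private final JdbcTemplate jdbcTemplate;

    public LastExecutionQuery(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    /** Returns the end time of the last finished execution of the given job, or null. */
    public Date findLastEndTime(String jobName) {
        List<Date> endTimes = jdbcTemplate.query(
                "SELECT e.END_TIME "
                + "FROM BATCH_JOB_EXECUTION e "
                + "JOIN BATCH_JOB_INSTANCE i ON e.JOB_INSTANCE_ID = i.JOB_INSTANCE_ID "
                + "WHERE i.JOB_NAME = ? AND e.END_TIME IS NOT NULL "
                + "ORDER BY e.JOB_EXECUTION_ID DESC",
                (rs, rowNum) -> (Date) rs.getTimestamp("END_TIME"),
                jobName);
        return endTimes.isEmpty() ? null : endTimes.get(0);
    }
}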
I hope my answer is helpful to you.
Use the JobExplorer, as suggested by Luca Basso Ricci.
Because you do not know the job parameters, you need to look it up by instance:
Look for the last instance of the job named searchExportJob.
Look for the last execution of that instance.
This way you use the Spring Batch API only.
// We can use count 1 because job instances are ordered by instance id descending,
// so we know the first one returned is the last instance
List<JobInstance> instances = jobExplorer.getJobInstances("searchExportJob", 0, 1);
JobInstance lastInstance = instances.get(0);
List<JobExecution> jobExecutions = jobExplorer.getJobExecutions(lastInstance);
// JobExecution is ordered by execution id descending, so the first
// result is the last execution
JobExecution je = jobExecutions.get(0);
if (je != null && je.getEndTime() != null) {
endTime = new DateTime(je.getEndTime());
}
Note this code only works for Spring Batch 2.2.x and above; in 2.1.x the API was somewhat different.
There is another interface you can use: JobExplorer
From its javadoc:
Entry point for browsing executions of running or historical jobs and
steps. Since the data may be re-hydrated from persistent storage, it
may not contain volatile fields that would have been present when the
execution was active
If you are debugging your batch job and terminate it before it completes, you will get this error when you try to start it again.
To start it again, either change the name of your job so that it creates another execution id,
or update the tables below:
BATCH_JOB_EXECUTION
BATCH_STEP_EXECUTION
You need to update the STATUS and END_TIME columns with non-null values.
Create a new job run id every time.
If your code creates the Job object using a JobBuilderFactory, then the snippet below is useful for this problem:
return jobBuilderFactory
        .get("someJobName")
        .incrementer(new RunIdIncrementer()) // solution lies here - creates a new run id every time
        .flow(                               // and here
                stepBuilderFactory
                        .get("someTaskletStepName")
                        .tasklet(tasklet)           // you can replace this with a chunk-oriented step
                        .allowStartIfComplete(true) // this makes the step run even if it completed in the last run
                        .build())
        .end()
        .build();
