Spring Batch process unable to run flows in parallel

I have a Spring Batch application in which I am trying to run two steps in parallel by defining them as flows with FlowBuilder. In each step created through StepBuilderFactory I have defined a custom partitioner, and I run the step on multiple threads.
@Bean
public Job job() throws Exception {
    return jobBuilderFactory.get("Spring_batch").split(new SimpleAsyncTaskExecutor()).add(flow1(), flow2()).build().build();
}
Flow 1:
@Bean
public Flow flow1() throws Exception {
    return new FlowBuilder<Flow>("Flow")
            .start(stepBuilderFactory.get("MasterStep1").partitioner(SlaveStep().getName(), partitioner())
                    .step(SlaveStep()).gridSize(4).taskExecutor(new SimpleAsyncTaskExecutor()).build())
            .build();
}
Step for flow 1:
@Bean
public Step SlaveStep() throws Exception {
    return stepBuilderFactory.get("flow1slaveStep").<Person, Person>chunk(100)
            .reader(new jdbcPagingItemReader(input1, input2)).writer(new ItemWriter()).build();
}
Flow 2:
@Bean
public Flow flow2() throws Exception {
    return new FlowBuilder<Flow>("Flow-2")
            .start(stepBuilderFactory.get("MasterStep2").partitioner(FlowSlaveStep().getName(), secondPartitioner())
                    .step(FlowSlaveStep()).gridSize(4).taskExecutor(new SimpleAsyncTaskExecutor()).build())
            .build();
}
Step for flow 2:
@Bean
public Step FlowSlaveStep() throws Exception {
    return stepBuilderFactory.get("flow2slaveStep").<Person, Person>chunk(100)
            .reader(new jdbcPagingItemReader(input1, input2)).writer(new ItemWriter()).build();
}
Both steps use the same ItemReader and ItemWriter, but with different inputs that they get from their partitions.
When I run it, the program works fine until the custom partitioners in both flows have finished. After that, both flows wait and never reach the item readers. I have a StoredProcedureItemReader defined with inputs. I am not sure how to get both flows to run to completion in parallel. When I run the flows one after the other, they work perfectly fine:
@Bean
public Job job() throws Exception {
    return jobBuilderFactory.get("Spring_batch").start(flow1()).next(flow2()).end().build();
}
Please suggest what I can do to run parallel flows whose steps use the same StoredProcedureItemReader.

Related

Spring-batch step not re-executed (cache)

I'm working on a project that includes Spring Batch. Before copying the code snippets, I'll briefly summarize how the job works with a cron:
the cron calls a REST API on my project (@PostMapping("/jobs/external/{jobName}"))
in the POST method, I get the job and execute it.
in each execution, I'm supposed to run a step.
the step contains a reader (an external REST call to the Elastic API to get documents) and a processor.
Now my problem: in catalina.out, I can see the REST call from the cron every 10 minutes, as configured in my cron. But the step doesn't seem to make the call to Elastic every 10 minutes. The batch process always has the same set of data, which is fetched once, when the batch is first called after a Tomcat restart.
Job REST API:
@PostMapping("/jobs/external/{jobName}")
@Timed
public ResponseEntity start(@PathVariable String jobName) throws BatchException {
    log.info("LAUNCHING JOB FROM EXTERNAL : {}, timestamp : {}", jobName, Instant.now().toString());
    try {
        Job job = jobRegistry.getJob(jobName);
        JobParametersBuilder builder = new JobParametersBuilder();
        builder.addDate("date", new Date());
        return Optional.of(jobLauncher.run(job, builder.toJobParameters()))
            .map(BatchExecutionVM::new)
            .map(exec -> ResponseEntity
                .ok()
                .headers(HeaderUtil.createAlert("jobManagement.started", jobName))
                .body(exec))
            .orElseGet(() -> ResponseEntity.badRequest().build());
    } catch (NoSuchJobException aEx) {
        log.warn(JOB_NOT_FOUND, aEx);
        throw new BatchException();
    } catch (JobInstanceAlreadyCompleteException | JobExecutionAlreadyRunningException | JobRestartException aEx) {
        log.warn("Job execution error.", aEx);
        throw new BatchException();
    } catch (JobParametersInvalidException aEx) {
        log.warn("Job parameters are invalid.", aEx);
        throw new BatchException();
    }
}
Job configuration:
@Bean
public Job usualJob() {
    return jobBuilderFactory
        .get("usualJob")
        .incrementer(new SimpleJobIncrementer())
        .flow(readUsualStep())
        .end()
        .build();
}
@Bean
public Step readUsualStep() {
    // TODO: simplify, we don't need a chunk
    return stepBuilderFactory.get("readUsualStep")
        .allowStartIfComplete(true)
        .<AlertDocument, Void>chunk(25)
        .readerIsTransactionalQueue()
        .reader(rowItemReader())
        .processor(rowItemProcessor())
        .build();
}
@Bean
public ItemReader<AlertDocument> rowItemReader() {
    // Runs once, when the singleton bean is created, so the alerts are fetched only once
    return new UsualItemReader(usualService.getLastAlerts());
}
@Bean
public UsualMapRowProcessor rowItemProcessor() {
    return new UsualMapRowProcessor();
}
I don't know why usualService.getLastAlerts() is called just once and not every 10 minutes.
Thanks to M. Deinum, this is basically the solution:
@Bean
@StepScope
public ItemReader<AlertDocument> rowItemReader() {
    return new UsualItemReader(usualService.getLastAlerts());
}
Annotating the reader bean with @StepScope makes Spring create a new instance for every step execution, so usualService.getLastAlerts() is called again on each run.
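The difference between the two reader definitions can be illustrated without Spring. The following is only a plain-Java analogy, not Spring Batch code: `ScopeSketch`, `singletonReader`, `stepScopedReader`, and the counting `getLastAlerts` stand-in are all hypothetical names invented for this sketch. It contrasts fetching the data once at creation time (the singleton bean) with fetching it each time a step execution asks for its reader (roughly what step scope achieves).

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class ScopeSketch {
    // Hypothetical stand-in for usualService: counts how often it is queried.
    static final AtomicInteger calls = new AtomicInteger();

    static List<String> getLastAlerts() {
        return List.of("alert-batch-" + calls.incrementAndGet());
    }

    // Singleton-style: the data is fetched once, when the "bean" is created,
    // and every later run sees that same list.
    static Supplier<List<String>> singletonReader() {
        List<String> fetchedAtCreation = getLastAlerts();
        return () -> fetchedAtCreation;
    }

    // Step-scope-style: the data is fetched again each time a run asks for it.
    static Supplier<List<String>> stepScopedReader() {
        return ScopeSketch::getLastAlerts;
    }

    public static void main(String[] args) {
        Supplier<List<String>> singleton = singletonReader();
        Supplier<List<String>> scoped = stepScopedReader();
        for (int run = 1; run <= 3; run++) {
            System.out.println("run " + run
                    + ": singleton=" + singleton.get()
                    + " scoped=" + scoped.get());
        }
    }
}
```

In the sketch the singleton supplier keeps returning the list captured at creation, which mirrors the stale data described in the question, while the scoped supplier queries the service on every run.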

Spring batch Job Failing

All the steps pass, but the job still completes with a FAILED status.
@Bean
public Job job() {
    return this.jobBuilderFactory.get("person-job")
        .start(initializeBatch())
        .next(readBodystep())
        .on("STOPPED")
        .stopAndRestart(initializeBatch())
        .end()
        .validator(batchJobParamValidator)
        .incrementer(jobParametersIncrementer)
        .listener(jobListener)
        .build();
}
@Bean
public Flow preProcessingFlow() {
    return new FlowBuilder<Flow>("preProcessingFlow")
        .start(extractFooterAndBodyStep())
        .next(readFooterStep())
        .build();
}
@Bean
public Step initializeBatch() {
    return this.stepBuilderFactory.get("initializeBatch")
        .flow(preProcessingFlow())
        .build();
}
public Step readBodystep() {
    return this.stepBuilderFactory.get("readChunkStep")
        .<PersonDTO, PersonBO>chunk(10)
        .reader(personFileBodyReader)
        .processor(itemProcessor())
        .writer(dummyWriter)
        .listener(new ReadFileStepListener())
        .listener(personFileBodyReader)
        .build();
}
Is anything wrong with the above configuration?
When I remove the stopAndRestart configuration, the job passes.
For your use case, you do not need stopAndRestart. Instead, set allowStartIfComplete on the step. With that, if the job fails, the step will be re-executed on restart even if it completed successfully in the previous run.

Uses of JobExecutionDecider in Spring Batch split flow using SimpleAsyncTaskExecutor

I want to configure a Spring Batch job with 4 steps. Step-2 and Step-3 are independent of each other, so I want to execute them in parallel. Either or both of these 2 steps can be skipped, depending on an execution parameter. The flow is shown below:
Batch flow details
The Java configuration is as follows:
@Bean
public Job sampleBatchJob()
        throws Exception {
    final Flow step1Flow = new FlowBuilder<SimpleFlow>("step1Flow")
        .from(step1Tasklet()).end();
    final Flow step2Flow = new FlowBuilder<SimpleFlow>("step2Flow")
        .from(new Step2FlowDecider()).on("EXECUTE").to(step2MasterStep())
        .from(new Step2FlowDecider()).on("SKIP").end(ExitStatus.COMPLETED.getExitCode())
        .build();
    final Flow step3Flow = new FlowBuilder<SimpleFlow>("step3Flow")
        .from(new Step3FlowDecider()).on("EXECUTE").to(step3MasterStep())
        .from(new Step3FlowDecider()).on("SKIP").end(ExitStatus.COMPLETED.getExitCode())
        .build();
    final Flow splitFlow = new FlowBuilder<Flow>("splitFlow")
        .split(new SimpleAsyncTaskExecutor())
        .add(step2Flow, step3Flow)
        .build();
    return jobBuilderFactory().get("sampleBatchJob")
        .start(step1Flow)
        .next(splitFlow)
        .next(step4MasterStep())
        .end()
        .build();
}
Sample code for Step2FlowDecider:
public class Step2FlowDecider
        implements JobExecutionDecider {
    @Override
    public FlowExecutionStatus decide(JobExecution jobExecution, StepExecution stepExecution) {
        if (StringUtils.equals("Y", batchParameter.executeStep2())) {
            return new FlowExecutionStatus("EXECUTE");
        }
        return new FlowExecutionStatus("SKIP");
    }
}
With this configuration, the batch fails when I try to execute it, without any detailed error log.

Separating steps into their own classes in Spring Batch

I have tried to find a solution but I could not... :(
I want to separate the steps in a job like below.
step1.class -> step2.class -> step3.class -> done
The reason I split them up like this is that I have to run queries in each step.
@Bean
public Job bundleJob() {
    return jobBuilderFactory.get(JOB_NAME)
        .start(step1) // bean
        .next(step2) // bean
        .next(step3()) // and here is the code, e.g. reader, processor, writer
        .build();
}
My goal is to use the data returned by step1 and step2 in the later steps.
But the JPA item reader seems to behave asynchronously, so the steps are not processed in the order above.
When debugging, the flow looks like this:
readerStep1 -> writerStep1 -> readerStep2 -> readerWriter2 -> readerStep3 -> writerStep3
and
-> processorStep1 -> processorStep2 -> processorStep3
That is the big problem for me.
How can I make each step in the job, including its queries, wait for the previous one?
Aha! I got it.
The point is when the beans in the configuration are created.
I had annotated every kind of step component with @Bean, so all of them were created eagerly by Spring.
The solution is late binding with @JobScope or @StepScope:
@Bean
@StepScope // late binding: the bean is created when the step runs
public ListItemReader<Dto> itemReader() {
    // business logic
    return new ListItemReader<>(dto);
}
To have separate steps in your job, you can use a Flow with a TaskletStep. Here is a snippet for reference:
@Bean
public Job processJob() throws Exception {
    Flow fetchData = (Flow) new FlowBuilder<>("fetchData")
        .start(fetchDataStep()).build();
    // The step bean below is named transformDataStep(), so reference it by that name
    Flow transformData = (Flow) new FlowBuilder<>("transformData")
        .start(transformDataStep()).build();
    Job job = jobBuilderFactory.get("processTenantLifeCycleJob").incrementer(new RunIdIncrementer())
        .start(fetchData).next(transformData).next(processData()).end()
        .listener(jobCompletionListener()).build();
    ReferenceJobFactory referenceJobFactory = new ReferenceJobFactory(job);
    registry.register(referenceJobFactory);
    return job;
}
@Bean
public TaskletStep fetchDataStep() {
    return stepBuilderFactory.get("fetchData")
        .tasklet(fetchDataValue()).listener(fetchDataStepListener()).build();
}
@Bean
@StepScope
public FetchDataValue fetchDataValue() {
    return new FetchDataValue();
}
@Bean
public TaskletStep transformDataStep() {
    return stepBuilderFactory.get("transformData")
        .tasklet(transformValue()).listener(sendReportDataCompletionListener()).build();
}
@Bean
@StepScope
public TransformValue transformValue() {
    return new TransformValue();
}
@Bean
public Step processData() {
    return stepBuilderFactory.get("processData").<String, Data>chunk(chunkSize)
        .reader(processDataReader()).processor(dataProcessor()).writer(processDataWriter())
        .listener(processDataListener())
        .taskExecutor(backupTaskExecutor()).build();
}
In this example I have used 2 flows, one to fetch and one to transform the data, each of which runs a tasklet defined in its own class.
To pass the values produced by steps 1 and 2 onward, you can store them in the job execution context and retrieve them in the processData step, which has a reader, a processor, and a writer.
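The hand-off through a shared job-level context can be sketched in plain Java. This is an illustration of the pattern only, not Spring Batch code: `JobContext` is a hypothetical stand-in for the job execution context, and the key names `fetchedIds` and `transformed` are invented for the example. Each "step" runs to completion before the next one starts, publishes its result under a key, and the following step reads it back.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

public class JobContextSketch {
    // Hypothetical stand-in for a job-level context shared by all steps.
    static final class JobContext {
        private final Map<String, Object> values = new ConcurrentHashMap<>();

        void put(String key, Object value) {
            values.put(key, value);
        }

        Object get(String key) {
            return values.get(key);
        }
    }

    // "Step 1": fetch data and publish it under a key.
    static void fetchData(JobContext ctx) {
        ctx.put("fetchedIds", List.of(1, 2, 3));
    }

    // "Step 2": read what step 1 published, transform it, publish the result.
    @SuppressWarnings("unchecked")
    static void transformData(JobContext ctx) {
        List<Integer> ids = (List<Integer>) ctx.get("fetchedIds");
        List<String> items = ids.stream()
                .map(id -> "item-" + id)
                .collect(Collectors.toList());
        ctx.put("transformed", items);
    }

    // "Step 3": consume the transformed values; the steps run strictly in order.
    @SuppressWarnings("unchecked")
    static List<String> runJob() {
        JobContext ctx = new JobContext();
        fetchData(ctx);
        transformData(ctx);
        return (List<String>) ctx.get("transformed");
    }

    public static void main(String[] args) {
        System.out.println(runJob());
    }
}
```

In a real job the values would go into the framework's own execution context rather than a hand-written map, and anything stored there should be small and serializable.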

Spring batch stop a job

How can I stop a job in Spring Batch? I tried to do it with the listener below:
public class jobListener implements JobExecutionListener {
    @Override
    public void beforeJob(JobExecution jobExecution) {
        jobExecution.setExitStatus(ExitStatus.STOPPED);
    }
    @Override
    public void afterJob(JobExecution jobExecution) {
        // TODO Auto-generated method stub
    }
}
I also tried COMPLETED and FAILED, but this approach doesn't work and the job continues to execute. Any solution?
You can use JobOperator along with JobExplorer to stop a job from outside the job (see https://docs.spring.io/spring-batch/reference/html/configureJob.html#JobOperator). The method is stop(long executionId). You would have to use JobExplorer to find the correct executionId to stop.
Also, from within a job flow configuration, you can configure a job to stop after a step's execution based on its exit status (see https://docs.spring.io/spring-batch/trunk/reference/html/configureStep.html#stopElement).
I assume you want to stop a job by a given name.
Here is the code.
String jobName = jobExecution.getJobInstance().getJobName(); // in most cases
DataSource dataSource = ... // @Autowired in your code
JobOperator jobOperator = ... // @Autowired in your code
JobExplorerFactoryBean factory = new JobExplorerFactoryBean();
factory.setDataSource(dataSource);
factory.setJdbcOperations(new JdbcTemplate(dataSource));
JobExplorer jobExplorer = factory.getObject();
Set<JobExecution> jobExecutions = jobExplorer.findRunningJobExecutions(jobName);
// stop(...) declares checked exceptions, so use a plain loop instead of a lambda
for (JobExecution running : jobExecutions) {
    jobOperator.stop(running.getId());
}
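As far as I understand, a stop request of this kind is a signal rather than an immediate kill: the running job notices it at a later checkpoint and then winds down. The sketch below is a plain-Java analogy of that cooperative pattern, not Spring Batch code. `CooperativeStopSketch`, `runChunks`, and the `requestStopAfter` parameter (which simulates an outside stop call arriving mid-run) are hypothetical names made up for the illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CooperativeStopSketch {
    /**
     * Processes up to totalChunks chunks and returns how many were processed.
     * The stop flag is checked only between chunks, so a chunk that has
     * already started always finishes before the loop exits.
     */
    static int runChunks(int totalChunks, AtomicBoolean stopRequested, int requestStopAfter) {
        int processed = 0;
        for (int chunk = 0; chunk < totalChunks; chunk++) {
            if (stopRequested.get()) {
                break; // honour the stop signal at the chunk boundary
            }
            processed++; // stand-in for reading, processing, and writing one chunk
            if (processed == requestStopAfter) {
                stopRequested.set(true); // simulates a stop request from outside the job
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        int done = runChunks(10, new AtomicBoolean(false), 3);
        System.out.println("chunks processed before stopping: " + done);
    }
}
```

This also suggests why setting an exit status in beforeJob has no visible effect in the question: nothing in the running job treats that value as a stop signal.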
