I have created a Spring Batch app and I'm struggling to implement a simple flow with a condition. Here's what I want to implement:
I tried to achieve this with the following code:
@Bean
public Job job(JobCompletionNotificationListener listener) {
return jobs.get(Constants.JOB_SIARD_FILES_PROCESSOR + new Date().getTime())
.incrementer(new RunIdIncrementer())
.listener(listener)
.start(step1())
.next(decider()).on("yes").to(step2345Flow())
.end()
.build();
}
@Bean
public Flow step2345Flow() {
return new FlowBuilder<SimpleFlow>("yes_flow")
.start(step2())
.next(step3())
.next(step4())
.next(step5())
.build();
}
When the condition is "yes" the flow is working just fine, but when the condition is "no" the flow always ends with an execution status "FAILED". I want it to be "COMPLETED" just like the first flow but without executing the steps 2, 3, 4 and 5.
Hope anyone can help me with this.
Spring Batch does not allow alternative branches in the flow to be implicit. In other words, you need an on(...) for each case.
Assuming decider() yields a proxied bean, it should work fine with
@Bean
public Job job(JobCompletionNotificationListener listener) {
return jobs.get(Constants.JOB_SIARD_FILES_PROCESSOR + new Date().getTime())
.incrementer(new RunIdIncrementer())
.listener(listener)
.start(step1())
.next(decider()).on("yes").to(step2345Flow())
.from(decider()).on("no").end()
.end()
.build();
}
To cover every remaining case, you can also use on("*") instead of on("no").
Please also have a second look at the official documentation: https://docs.spring.io/spring-batch/docs/4.3.x/reference/html/index-single.html#controllingStepFlow
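To illustrate why the missing branch matters, here is a minimal plain-Java sketch of how a decider status could be resolved against the declared transitions. It is a simplified model, not Spring Batch's own code: the class TransitionModel and its methods are hypothetical, it takes the first matching pattern in declaration order (so the specific pattern is listed before the wildcard), and it simply returns "FAILED" when nothing matches, which mirrors the behaviour reported in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Plain-Java model of decider transitions; NOT Spring Batch's own code.
class TransitionModel {

    // Hypothetical helper: '*' matches any run of characters, '?' exactly one.
    static boolean matches(String pattern, String status) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return status.matches(regex.toString());
    }

    // Returns the target of the first matching transition, or "FAILED"
    // when no declared pattern matches the status.
    static String resolve(Map<String, String> transitions, String status) {
        for (Map.Entry<String, String> t : transitions.entrySet()) {
            if (matches(t.getKey(), status)) {
                return t.getValue();
            }
        }
        return "FAILED";
    }

    public static void main(String[] args) {
        Map<String, String> onlyYes = new LinkedHashMap<>();
        onlyYes.put("yes", "step2345Flow");

        Map<String, String> yesAndRest = new LinkedHashMap<>(onlyYes);
        yesAndRest.put("*", "COMPLETED");

        System.out.println(resolve(onlyYes, "no"));     // FAILED
        System.out.println(resolve(yesAndRest, "no"));  // COMPLETED
        System.out.println(resolve(yesAndRest, "yes")); // step2345Flow
    }
}
```

With only the "yes" branch declared, the "no" status has nowhere to go. Adding a catch-all branch gives it an explicit end.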
I have implemented a batch job where there is a sequence of multiple steps. I have also implemented a CustomStepListener.
For my business logic I have to check a database entry in beforeStep of StepExecutionListener. If the database entry exists, the step has already been executed, so I want to skip that step and go to the next one.
But I have no idea how to skip the current step from beforeStep.
Can anyone give me a suggestion?
A JobExecutionDecider is better suited to decide if a step should be executed or not. In your case, you can perform the database check in the decider and make it return a status to be used for the flow definition. Here is an example:
public class MyDecider implements JobExecutionDecider {
public FlowExecutionStatus decide(JobExecution jobExecution, StepExecution stepExecution) {
String status;
if (dbEntryExists()) {
status = "ENTRY_EXIST";
}
else {
status = "ENTRY_NOT_EXIST";
}
return new FlowExecutionStatus(status);
}
}
@Bean
public Job job() {
return this.jobBuilderFactory.get("job")
.start(step1())
.next(decider()).on("ENTRY_EXIST").to(step2())
.from(decider()).on("ENTRY_NOT_EXIST").to(step3())
.end()
.build();
}
Please refer to the Programmatic Flow Decisions section in the documentation for more details.
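The dbEntryExists() call above is left undefined. One possible way to keep the decision logic testable without a database is to inject the check, as in this plain-Java sketch. The EntryDecision class and its BooleanSupplier hook are assumptions made for illustration and use no Spring Batch types. A real decider would wrap the returned string in a FlowExecutionStatus as shown in the example above.

```java
import java.util.function.BooleanSupplier;

// Plain-Java stand-in for the decider logic; no Spring Batch types are used.
class EntryDecision {

    // Hypothetical hook for the database check (for example a repository call).
    private final BooleanSupplier entryExists;

    EntryDecision(BooleanSupplier entryExists) {
        this.entryExists = entryExists;
    }

    // Maps the outcome of the check to the status names used in the flow.
    String decide() {
        return entryExists.getAsBoolean() ? "ENTRY_EXIST" : "ENTRY_NOT_EXIST";
    }

    public static void main(String[] args) {
        System.out.println(new EntryDecision(() -> true).decide());  // ENTRY_EXIST
        System.out.println(new EntryDecision(() -> false).decide()); // ENTRY_NOT_EXIST
    }
}
```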
I deliberately wrote some contrived code to check how the batch flow works.
@Bean
public Step conditionalJobStep1() {
return stepBuilderFactory.get("step1")
.tasklet((contribution, chunkContext) -> {
log.info(">>>>> This is stepNextConditionalJob Step1");
return RepeatStatus.FINISHED;
}).build();
}
@Bean
public Step conditionalJobStep2() {
return stepBuilderFactory.get("conditionalJobStep2")
.tasklet((contribution, chunkContext) -> {
log.info(">>>>> This is stepNextConditionalJob Step2");
return RepeatStatus.FINISHED;
}).build();
}
@Bean
public Step conditionalJobStep3() {
return stepBuilderFactory.get("conditionalJobStep3")
.tasklet((contribution, chunkContext) -> {
log.info(">>>>> This is stepNextConditionalJob Step3");
return RepeatStatus.FINISHED;
}).build();
}
Those are the steps and their tasklets. Here is the job:
@Bean
public Job stepNextConditionalJob() {
return jobBuilderFactory.get("stepNextConditionalJob")
.start(conditionalJobStep1())
.on("FAILED")
.to(conditionalJobStep3())
.on("*")
.end()
.from(conditionalJobStep1())
.on("*")
.to(conditionalJobStep3())
.next(conditionalJobStep2())
.on("*")
.to(conditionalJobStep1())
.on("*")
.end()
.end()
.build();
}
The code above results in 1->3->2->1->3->2->1->... indefinitely.
I tried to work out what produces this flow.
My reasoning: Step1 does not fail, so start(1) -> to(1) & from(1) -> (3) -> (2) -> to(1) & from(1) -> (3) -> (2) -> ...
But when I swapped just two steps (Step2 and Step3 after from(step1)), as in the job below, the result is simply 1->2->3->job end.
@Bean
public Job stepNextConditionalJob() {
return jobBuilderFactory.get("stepNextConditionalJob")
.start(conditionalJobStep1())
.on("FAILED")
.to(conditionalJobStep3())
.on("*")
.end()
.from(conditionalJobStep1())
.on("*")
.to(conditionalJobStep2())
.next(conditionalJobStep3())
.on("*")
.to(conditionalJobStep1())
.on("*")
.end()
.end()
.build();
}
Many of my own experiments involve the "to" right after the "start".
I don't know why it works like this.
What makes the difference?
Your flow does not define an outcome from step2 on *. You have:
.to(conditionalJobStep3())
.on("*")
.end()
and
.to(conditionalJobStep1())
.on("*")
.end()
but no such construct for step2. The only outcome you have defined for step 2 is:
.next(conditionalJobStep2())
.on("*")
.to(conditionalJobStep1())
This means: regardless of step2's exit code, go to conditionalJobStep1. That's why you see 1->3->2->1->3->2->1->....
I don't know why it works like this. What makes this difference ?
It works when you flip step2 with step3 because you have defined an outcome on "*" for step3: .to(conditionalJobStep3()).on("*").end().
I recommend using the following syntax to define all outcomes for each step:
jobBuilderFactory.get("stepNextConditionalJob")
.from(stepA).on(ExitCode1).to(StepB)
.from(stepA).on(ExitCode2).to(StepC)
.from(stepA).on("*").end()
...
.from(stepX).on(ExitCode1).to(StepY)
.from(stepX).on(ExitCode2).to(StepZ)
.from(stepX).on("*").end()
.build()
This is more explicit and less prone to forgetting outgoing transitions.
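To make the difference between a cyclic and a terminating definition concrete, here is a simplified plain-Java model of a step graph. It is only a sketch under stated assumptions: each step has exactly one unconditional outcome, the FlowWalk class and the END marker are hypothetical, and the maxSteps guard exists only so the demo cannot loop forever. It does not reproduce how Spring Batch resolves transitions internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of a step graph; names and structure are illustrative only.
class FlowWalk {

    static final String END = "END";

    // Follows one unconditional transition per step, stopping at END or
    // after maxSteps executions so a cyclic graph cannot loop forever.
    static List<String> walk(Map<String, String> next, String start, int maxSteps) {
        List<String> executed = new ArrayList<>();
        String current = start;
        while (!END.equals(current) && executed.size() < maxSteps) {
            executed.add(current);
            current = next.get(current);
            if (current == null) {
                throw new IllegalStateException("No transition defined");
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        Map<String, String> looping = new LinkedHashMap<>();
        looping.put("step1", "step3");
        looping.put("step3", "step2");
        looping.put("step2", "step1"); // step2's only outcome leads back to step1

        Map<String, String> terminating = new LinkedHashMap<>();
        terminating.put("step1", "step2");
        terminating.put("step2", "step3");
        terminating.put("step3", END); // step3 has an explicit end

        // [step1, step3, step2, step1, step3, step2, step1]
        System.out.println(walk(looping, "step1", 7));
        // [step1, step2, step3]
        System.out.println(walk(terminating, "step1", 7));
    }
}
```

Writing the graph as one entry per step makes a missing or circular outcome easy to spot, which is the same benefit the from(...).on(...) style gives in the job definition.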
I have been playing with a Spring Batch job that reads a sample CSV file and dumps the records into a table.
My question is about restarts. I have introduced a data issue in the third line of the file (a value too long to insert).
In the first run:
The first two lines get inserted and the third line fails (as expected).
When I restart:
The fourth line is picked up and the rest of the file is processed.
All the documentation seems to suggest that Spring Batch picks up where it left off. Does that mean the third (problem) record is considered
'attempted' and hence won't be tried again? I was expecting all restarts to fail until I fixed the file.
@Bean
public FlatFileItemReader<Person> reader() {
return new FlatFileItemReaderBuilder<Person>()
.name("personItemReader")
.resource(new ClassPathResource("sample-data.csv"))
.delimited()
.names(new String[]{"firstName", "lastName"})
.fieldSetMapper(new BeanWrapperFieldSetMapper<Person>() {{
setTargetType(Person.class);
}})
.build();
}
@Bean
public JdbcBatchItemWriter<Person> writer(DataSource dataSource) {
return new JdbcBatchItemWriterBuilder<Person>()
.itemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>())
.sql("INSERT INTO people (first_name, last_name) VALUES (:firstName, :lastName)")
.dataSource(dataSource)
.build();
}
@Bean
public Step step1(JdbcBatchItemWriter<Person> writer) {
return stepBuilderFactory.get("step1")
.<Person, Person> chunk(1)
.reader(reader())
.processor(processor())
.writer(writer)
.taskExecutor(taskExecutor())
.throttleLimit(1)
.build();
}
@Bean
public Job importUserJob(JobCompletionNotificationListener listener, Step step1) {
return jobBuilderFactory.get("importUserJob")
.incrementer(new RunIdIncrementer())
.listener(listener)
.start(step1)
.build();
}
Please let me know whether you have gone through the questions below. If anything is unclear, I can share a sample project on GitHub.
Spring Batch restart uncompleted jobs from the same execution and step
Spring Batch correctly restart uncompleted jobs in clustered environment
In production we always use a fault-tolerant step, so that the job rejects the bad records and continues. Operations staff correct the data later and re-run the job. The advantage is that a huge volume of data can be processed continuously without waiting for data corrections.
Please compare your code with the project below:
https://github.com/ngecom/stackoverflow-springbatchRestart
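The "reject and continue" idea can be sketched in plain Java as follows. This is only a model of the behaviour, assuming a hypothetical length check as the validation rule and a skip limit after which the run fails. In Spring Batch the equivalent is configured on the step builder with faultTolerant(), skip(...) and skipLimit(...) rather than written by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of skip-and-continue processing; not Spring Batch code.
class SkipSketch {

    // Writes every line that passes the (hypothetical) length check and
    // collects the rest in 'rejected', failing once skipLimit is exceeded.
    static List<String> importAll(List<String> lines, int maxLength,
                                  int skipLimit, List<String> rejected) {
        List<String> written = new ArrayList<>();
        for (String line : lines) {
            if (line.length() > maxLength) {
                if (rejected.size() >= skipLimit) {
                    throw new IllegalStateException("Skip limit exceeded");
                }
                rejected.add(line); // keep the bad record for later correction
                continue;
            }
            written.add(line);
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> rejected = new ArrayList<>();
        List<String> written = importAll(
                List.of("Jill,Doe", "Joe,Doe", "Justin,AVeryLongLastNameThatDoesNotFit", "Jane,Doe"),
                20, 1, rejected);
        System.out.println(written);  // [Jill,Doe, Joe,Doe, Jane,Doe]
        System.out.println(rejected); // [Justin,AVeryLongLastNameThatDoesNotFit]
    }
}
```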
You have set a RunIdIncrementer on your job, so you will have a new job instance on each run. You need to remove that incrementer and pass the file as a job parameter to have the same job instance on each run. With this approach, all restarts will fail until you fix the file.
As a side note, you can't have restartability if you use a multi-threaded step, because the state would not be consistent when using multiple threads. So you need to use a single-threaded step (remove the task executor). This is explained in the documentation here: Multi-threaded step.
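The effect of the incrementer on job identity can be modelled with a small plain-Java sketch. It assumes that a job instance is identified by the job name plus its identifying parameters, and that the incrementer bumps a run.id parameter on every launch. The JobIdentity class, the instanceKey format and the input.file parameter name are hypothetical and exist only for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of job instance identity; not Spring Batch code.
class JobIdentity {

    // Hypothetical stand-in for "job name + identifying parameters".
    static String instanceKey(String jobName, Map<String, Object> params) {
        return jobName + new TreeMap<>(params);
    }

    // Mimics what a run-id incrementer does: bump run.id on every launch.
    static Map<String, Object> nextRun(Map<String, Object> params) {
        Map<String, Object> next = new HashMap<>(params);
        long previous = (Long) next.getOrDefault("run.id", 0L);
        next.put("run.id", previous + 1);
        return next;
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put("input.file", "sample-data.csv"); // assumed parameter name

        // Same parameters on both launches: same instance, so the second
        // launch is a restart of the first.
        System.out.println(instanceKey("importUserJob", params)
                .equals(instanceKey("importUserJob", params))); // true

        // With an incrementer, each launch gets a different run.id and
        // therefore a different instance: nothing is restarted.
        Map<String, Object> firstRun = nextRun(params);
        Map<String, Object> secondRun = nextRun(firstRun);
        System.out.println(instanceKey("importUserJob", firstRun)
                .equals(instanceKey("importUserJob", secondRun))); // false
    }
}
```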
All the steps pass, but the job still completes with a FAILED status.
@Bean
public Job job() {
return this.jobBuilderFactory.get("person-job")
.start(initializeBatch())
.next(readBodystep())
.on("STOPPED")
.stopAndRestart(initializeBatch())
.end()
.validator(batchJobParamValidator)
.incrementer(jobParametersIncrementer)
.listener(jobListener)
.build();
}
@Bean
public Flow preProcessingFlow() {
return new FlowBuilder<Flow>("preProcessingFlow")
.start(extractFooterAndBodyStep())
.next(readFooterStep())
.build();
}
@Bean
public Step initializeBatch() {
return this.stepBuilderFactory.get("initializeBatch")
.flow(preProcessingFlow())
.build();
}
@Bean
public Step readBodystep() {
return this.stepBuilderFactory.get("readChunkStep")
.<PersonDTO, PersonBO>chunk(10)
.reader(personFileBodyReader)
.processor(itemProcessor())
.writer(dummyWriter)
.listener(new ReadFileStepListener())
.listener(personFileBodyReader)
.build();
}
Is anything wrong with the above configuration?
When I remove the stopAndRestart configuration, the job passes.
For your use case, it is not stopAndRestart that you need, it is rather setting allowStartIfComplete on the step. With that, if the job fails, the step will be re-executed even if it was successfully completed in the previous run.
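As a rough illustration of the rule described above, the restart decision for a single step can be modelled like this in plain Java. The RestartRule class is hypothetical and deliberately ignores other factors such as start limits or abandoned executions. It only captures the idea that a completed step is skipped on restart unless allowStartIfComplete is set.

```java
// Plain-Java model of the per-step restart decision; not Spring Batch code.
class RestartRule {

    // previousStatus is null when the step has never run in this job instance.
    static boolean shouldRun(String previousStatus, boolean allowStartIfComplete) {
        if (previousStatus == null) {
            return true; // first execution
        }
        if ("COMPLETED".equals(previousStatus)) {
            return allowStartIfComplete; // re-run only when explicitly allowed
        }
        return true; // failed or stopped steps are retried on restart
    }

    public static void main(String[] args) {
        System.out.println(shouldRun("COMPLETED", false)); // false
        System.out.println(shouldRun("COMPLETED", true));  // true
        System.out.println(shouldRun("FAILED", false));    // true
    }
}
```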
I am reading the Spring Batch documentation and am stuck on the following part:
The following example is provided:
@Bean
public Job job() {
Flow flow1 = new FlowBuilder<SimpleFlow>("flow1")
.start(step1())
.next(step2())
.build();
Flow flow2 = new FlowBuilder<SimpleFlow>("flow2")
.start(step3())
.build();
return this.jobBuilderFactory.get("job")
.start(flow1)
.split(new SimpleAsyncTaskExecutor())
.add(flow2)
.next(step4())
.end()
.build();
}
But it does not explain what is happening.
As far as I understand, flow1 and flow2 are executed in parallel, but what about step4?
step4() is executed sequentially, after both flow1 and flow2 have returned.
Look at the FlowBuilder.SplitBuilder.add() javadoc:
public FlowBuilder<Q> add(Flow... flows)
Add flows to the split, in addition to the current state already
present in the parent builder.
Parameters:
flows - more flows to add to the split
Returns: the parent builder
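add(...) returns the parent builder, not the current SplitBuilder object.
So next(step4()) is not part of the split, and step4() runs sequentially after it.
To run all three in parallel, wrap step4() in its own flow, because add(...) accepts only Flow instances:
Flow flow3 = new FlowBuilder<SimpleFlow>("flow3")
.start(step4())
.build();
return this.jobBuilderFactory.get("job")
.start(flow1)
.split(new SimpleAsyncTaskExecutor())
.add(flow2, flow3)
.end()
.build();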
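The execution order of the original job from the documentation (flow1 and flow2 in parallel, then step4) can be modelled with standard Java concurrency utilities. This is only an analogy under stated assumptions: the SplitSketch class is hypothetical, the "steps" are just log entries, and CompletableFuture stands in for the split's task executor.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Plain-Java analogy for "split, then continue"; not Spring Batch code.
class SplitSketch {

    static List<String> run() {
        List<String> log = new CopyOnWriteArrayList<>();
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            // flow1: step1 then step2, in order, on one thread
            CompletableFuture<Void> flow1 = CompletableFuture.runAsync(() -> {
                log.add("step1");
                log.add("step2");
            }, executor);
            // flow2: step3, possibly concurrently with flow1
            CompletableFuture<Void> flow2 =
                    CompletableFuture.runAsync(() -> log.add("step3"), executor);

            // The split is finished only when both flows have returned...
            CompletableFuture.allOf(flow1, flow2).join();
            // ...and only then does step4 run, sequentially.
            log.add("step4");
        } finally {
            executor.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        // step4 is always last; the position of step3 relative to
        // step1 and step2 may vary between runs.
        System.out.println(run());
    }
}
```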