I want to configure a Spring Batch job with 4 steps. Step 2 and Step 3 are independent of each other, so I want to execute them in parallel. Either or both of these steps can be skipped depending on an execution parameter. The intended flow is shown below:
[Figure: batch flow details]
The Java configuration is as follows:
@Bean
public Job sampleBatchJob() throws Exception {
    final Flow step1Flow = new FlowBuilder<SimpleFlow>("step1Flow")
            .from(step1Tasklet()).end();
    final Flow step2Flow = new FlowBuilder<SimpleFlow>("step2Flow")
            .from(new Step2FlowDecider()).on("EXECUTE").to(step2MasterStep())
            .from(new Step2FlowDecider()).on("SKIP").end(ExitStatus.COMPLETED.getExitCode())
            .build();
    final Flow step3Flow = new FlowBuilder<SimpleFlow>("step3Flow")
            .from(new Step3FlowDecider()).on("EXECUTE").to(step3MasterStep())
            .from(new Step3FlowDecider()).on("SKIP").end(ExitStatus.COMPLETED.getExitCode())
            .build();
    final Flow splitFlow = new FlowBuilder<Flow>("splitFlow")
            .split(new SimpleAsyncTaskExecutor())
            .add(step2Flow, step3Flow)
            .build();
    return jobBuilderFactory().get("sampleBatchJob")
            .start(step1Flow)
            .next(splitFlow)
            .next(step4MasterStep())
            .end()
            .build();
}
Sample code for Step2FlowDecider:
public class Step2FlowDecider implements JobExecutionDecider {
    @Override
    public FlowExecutionStatus decide(JobExecution jobExecution, StepExecution stepExecution) {
        if (StringUtils.equals("Y", batchParameter.executeStep2())) {
            return new FlowExecutionStatus("EXECUTE");
        }
        return new FlowExecutionStatus("SKIP");
    }
}
With this configuration, the batch fails when I try to execute it, without any detailed error log.
I'm working on a project that uses Spring Batch. Before the code snippets, here is a short summary of how the job works with a cron:
The cron calls a REST API in my project (@PostMapping("/jobs/external/{jobName}")).
In the POST method, I look up the job and execute it.
Each execution is supposed to run a step.
The step contains a reader (an external REST call to the Elastic API to get documents) and a processor.
Now my problem: in catalina.out, I can see the REST call from the cron every 10 minutes, as configured. But the step doesn't seem to call Elastic every 10 minutes: the batch always processes the same set of data, which is fetched only once, when the batch is called during the Tomcat restart.
Job REST API:
@PostMapping("/jobs/external/{jobName}")
@Timed
public ResponseEntity start(@PathVariable String jobName) throws BatchException {
    log.info("LAUNCHING JOB FROM EXTERNAL : {}, timestamp : {}", jobName, Instant.now().toString());
    try {
        Job job = jobRegistry.getJob(jobName);
        JobParametersBuilder builder = new JobParametersBuilder();
        builder.addDate("date", new Date());
        return Optional.of(jobLauncher.run(job, builder.toJobParameters()))
                .map(BatchExecutionVM::new)
                .map(exec -> ResponseEntity
                        .ok()
                        .headers(HeaderUtil.createAlert("jobManagement.started", jobName))
                        .body(exec))
                .orElseGet(() -> ResponseEntity.badRequest().build());
    } catch (NoSuchJobException aEx) {
        log.warn(JOB_NOT_FOUND, aEx);
        throw new BatchException();
    } catch (JobInstanceAlreadyCompleteException | JobExecutionAlreadyRunningException | JobRestartException aEx) {
        log.warn("Job execution error.", aEx);
        throw new BatchException();
    } catch (JobParametersInvalidException aEx) {
        log.warn("Job parameters are invalid.", aEx);
        throw new BatchException();
    }
}
Job configuration:
@Bean
public Job usualJob() {
    return jobBuilderFactory
            .get("usualJob")
            .incrementer(new SimpleJobIncrementer())
            .flow(readUsualStep())
            .end()
            .build();
}

@Bean
public Step readUsualStep() {
    // TODO: simplify, we don't need chunks
    return stepBuilderFactory.get("readUsualStep")
            .allowStartIfComplete(true)
            .<AlertDocument, Void>chunk(25)
            .readerIsTransactionalQueue()
            .reader(rowItemReader())
            .processor(rowItemProcessor())
            .build();
}

@Bean
public ItemReader<AlertDocument> rowItemReader() {
    // getLastAlerts() runs here, when this singleton bean is created
    return new UsualItemReader(usualService.getLastAlerts());
}

@Bean
public UsualMapRowProcessor rowItemProcessor() {
    return new UsualMapRowProcessor();
}
I don't know why usualService.getLastAlerts() is called just once and not every 10 minutes.
Thanks to M. Deinum, this is basically the solution:
@Bean
@StepScope
public ItemReader<AlertDocument> rowItemReader() {
    return new UsualItemReader(usualService.getLastAlerts());
}
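Annotating the reader bean with @StepScope makes Spring create a new instance for every step execution, so usualService.getLastAlerts() is called again on each run instead of once at startup.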
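The difference can be pictured without Spring. The following is a minimal plain-Java sketch, not Spring code: ScopeSketch, fetchCount, and this getLastAlerts() are hypothetical stand-ins for the real service. It contrasts a value fetched once when a singleton is created with a deferred factory that is invoked on every execution, which is roughly the behaviour a step-scoped bean gives you.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Plain-Java model (no Spring) of singleton vs. step-scoped reader creation. */
final class ScopeSketch {

    /** Counts how often the hypothetical service is called. */
    static final AtomicInteger fetchCount = new AtomicInteger();

    /** Hypothetical stand-in for usualService.getLastAlerts(). */
    static List<String> getLastAlerts() {
        int n = fetchCount.incrementAndGet();
        return List.of("alert-batch-" + n);
    }

    public static void main(String[] args) {
        // Singleton-like: the data is fetched once, when the "bean" is created.
        List<String> singletonData = getLastAlerts();

        // Step-scope-like: creation is deferred and repeated for every execution.
        Supplier<List<String>> stepScoped = ScopeSketch::getLastAlerts;

        for (int run = 1; run <= 3; run++) {
            System.out.println("run " + run
                    + " singleton=" + singletonData
                    + " stepScoped=" + stepScoped.get());
        }
    }
}
```

In the loop, the singleton value never changes, while the supplier returns a fresh list on every run.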
I'm working on a process that uses Spring Integration and Spring Batch:
1) Using Spring Integration, I poll a remote SFTP directory to get different CSV files as Messages.
2) Each Message, which carries a CSV file as its payload, is sent downstream to a Transformer that converts it into a JobLaunchRequest.
3) Spring Batch reads the CSV file and writes its contents to the DB.
Question:
For each CSV file I need to configure an ItemReader, an ItemWriter, a Step, and a Job.
With that in mind, if I have to deal with 10 different CSV files, do I have to configure all 4 beans listed above for each CSV?
The CSVs differ in header names and header count, and each CSV has a different JPA entity.
Eventually I will have 40 @Bean configurations, which I think is far from ideal.
Can anyone tell me whether this is how Spring Batch is meant to work, or whether there is a way to define one common, dynamic bean for the different CSVs?
Here is the code:
IntegrationFlow:
@Bean
public IntegrationFlow integrationFlow(JobLaunchingGateway jobLaunchingGateway) {
    return IntegrationFlows.from(Sftp.inboundAdapter(sftpSessionFactory)
                    .remoteDirectory("/uploads")
                    .localDirectory(new File("C:\\Users\\DELL\\Desktop\\local"))
                    .patternFilter("*.csv")
                    .autoCreateLocalDirectory(true)
            , c -> c.poller(Pollers.fixedRate(1000).taskExecutor(taskExecutor()).maxMessagesPerPoll(1)))
            .transform(fileMessageToJobRequest())
            .handle(jobLaunchingGateway)
            .log(LoggingHandler.Level.WARN, "headers.id + ': ' + payload")
            .route(JobExecution.class, j -> j.getStatus().isUnsuccessful() ? "jobFailedChannel" : "jobSuccessfulChannel")
            .get();
}
Transformer:
@Transformer
public JobLaunchRequest toRequest(Message<File> message) {
    JobParametersBuilder jobParametersBuilder =
            new JobParametersBuilder();
    jobParametersBuilder.addString(fileParameterName,
            message.getPayload().getAbsolutePath());
    jobParametersBuilder.addLong("key.id", System.currentTimeMillis());
    return new JobLaunchRequest(job, jobParametersBuilder.toJobParameters());
}
Batch Job:
@Bean
public Job vendorMasterBatchJob(Step vendorMasterStep) {
    return jobBuilderFactory.get("vendorMasterBatchJob")
            .incrementer(new RunIdIncrementer())
            .start(vendorMasterStep)
            .listener(deleteInputFileJobListener)
            .build();
}
Batch Step:
@Bean
public Step vendorMasterStep(FlatFileItemReader<ERPVendorMaster> vendorMasterReader,
                             JpaItemWriter<ERPVendorMaster> vendorMasterWriter) {
    return stepBuilderFactory.get("vendorMasterStep")
            .<ERPVendorMaster, ERPVendorMaster>chunk(chunkSize)
            .reader(vendorMasterReader)
            .writer(vendorMasterWriter)
            .faultTolerant()
            .skipLimit(Integer.MAX_VALUE)
            .skip(RuntimeException.class)
            .listener(skipListener)
            .build();
}
ItemWriter:
@Bean
public JpaItemWriter<ERPVendorMaster> vendorMasterWriter() {
    return new JpaItemWriterBuilder<ERPVendorMaster>()
            .entityManagerFactory(entityManagerFactory)
            .build();
}
ItemReader:
@Bean
@StepScope
public FlatFileItemReader<ERPVendorMaster> vendorMasterReader(@Value("#{jobParameters['input.file.name']}") String fileName) {
    return new FlatFileItemReaderBuilder<ERPVendorMaster>()
            .name("vendorMasterItemReader")
            .resource(new FileSystemResource(fileName))
            .linesToSkip(1)
            .delimited()
            .names(commaSeparatedVendorMasterHeaderValues.split(","))
            .fieldSetMapper(new BeanWrapperFieldSetMapper<ERPVendorMaster>() {{
                setConversionService(stringToDateConversionService());
                setTargetType(ERPVendorMaster.class);
            }})
            .build();
}
I'm very new to Spring Boot, so any help will be appreciated.
Thanks
I have tried to find a solution but could not... ㅠㅠ
I want to separate the steps in a job as shown below.
step1.class -> step2.class -> step3.class -> done
The reason I split it this way is that each step has to run its own queries.
@Bean
public Job bundleJob() {
    return jobBuilderFactory.get(JOB_NAME)
            .start(step1)   // bean
            .next(step2)    // bean
            .next(step3())  // and here is the code, e.g. reader, processor, writer
            .build();
}
My goal is to use the data returned by step1 and step2 in the later steps.
But jpaItemReader seems to behave asynchronously, so the steps are not processed in the order above.
The flow I see when debugging looks like this:
readerStep1 -> writerStep1 -> readerStep2 -> writerStep2 -> readerStep3 -> writerStep3
and
-> processorStep1 -> processorStep2 -> processorStep3
That is a big problem for me...
How can I make each step in a job wait for the previous one, including its queries?
Aha! I got it.
The point is when the beans in the configuration are created.
I had annotated all of the step components with @Bean, so Spring created them eagerly at startup.
The solution is late binding with @JobScope or @StepScope:
@Bean
@StepScope // late binding: the bean is created when the step runs
public ListItemReader<Dto> itemReader() {
    // business logic
    return new ListItemReader<>(dto);
}
To have separate steps in your job, you can use a Flow with a TaskletStep. Here is a snippet for reference:
@Bean
public Job processJob() throws Exception {
    Flow fetchData = new FlowBuilder<Flow>("fetchData")
            .start(fetchDataStep()).build();
    // the original snippet called transformData(), but the step bean is transformDataStep()
    Flow transformData = new FlowBuilder<Flow>("transformData")
            .start(transformDataStep()).build();
    Job job = jobBuilderFactory.get("processTenantLifeCycleJob").incrementer(new RunIdIncrementer())
            .start(fetchData).next(transformData).next(processData()).end()
            .listener(jobCompletionListener()).build();
    ReferenceJobFactory referenceJobFactory = new ReferenceJobFactory(job);
    registry.register(referenceJobFactory);
    return job;
}

@Bean
public TaskletStep fetchDataStep() {
    return stepBuilderFactory.get("fetchData")
            .tasklet(fetchDataValue()).listener(fetchDataStepListener()).build();
}

@Bean
@StepScope
public FetchDataValue fetchDataValue() {
    return new FetchDataValue();
}

@Bean
public TaskletStep transformDataStep() {
    return stepBuilderFactory.get("transformData")
            .tasklet(transformValue()).listener(sendReportDataCompletionListener()).build();
}

@Bean
@StepScope
public TransformValue transformValue() {
    return new TransformValue();
}

@Bean
public Step processData() {
    return stepBuilderFactory.get("processData").<String, Data>chunk(chunkSize)
            .reader(processDataReader()).processor(dataProcessor()).writer(processDataWriter())
            .listener(processDataListener())
            .taskExecutor(backupTaskExecutor()).build();
}
In this example I used two Flows, fetchData and transformData, each of which runs the logic of a tasklet class.
To pass on the values produced by steps 1 and 2, you can store them in the job execution context and retrieve them in the processData step, which has a reader, a processor, and a writer.
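The idea of handing results from one step to the next through a shared context can be sketched in plain Java. This is a simplified model, not Spring code: JobContextSketch, SketchStep, runJob, and the keys are hypothetical names chosen for illustration. In Spring Batch the counterpart of the map is the job-level ExecutionContext, reachable through JobExecution.getExecutionContext().

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java model (no Spring) of passing step results through a shared job-level context. */
final class JobContextSketch {

    /** Hypothetical stand-in for a step: it can read and write the shared context. */
    interface SketchStep {
        void execute(Map<String, Object> jobContext);
    }

    /** Runs the steps strictly one after another against a single shared context. */
    static Map<String, Object> runJob(SketchStep... steps) {
        Map<String, Object> jobContext = new LinkedHashMap<>();
        for (SketchStep step : steps) {
            step.execute(jobContext);
        }
        return jobContext;
    }

    public static void main(String[] args) {
        Map<String, Object> result = runJob(
                // step 1 stores its result
                ctx -> ctx.put("fetched", "raw-value"),
                // step 2 reads the result of step 1 and stores its own
                ctx -> ctx.put("transformed", ctx.get("fetched") + ":transformed"),
                // step 3 reads the result of step 2
                ctx -> ctx.put("processed", "done:" + ctx.get("transformed")));
        System.out.println(result);
    }
}
```

Because each step only starts after the previous one has returned, a later step can rely on the keys written by the earlier ones.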
How can I stop a job in Spring Batch? I tried the following code:
public class jobListener implements JobExecutionListener {
    @Override
    public void beforeJob(JobExecution jobExecution) {
        jobExecution.setExitStatus(ExitStatus.STOPPED);
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        // TODO Auto-generated method stub
    }
}
I also tried COMPLETED and FAILED, but this approach doesn't work and the job continues to execute. Is there a solution?
You can use JobOperator along with JobExplorer to stop a job from outside the job (see https://docs.spring.io/spring-batch/reference/html/configureJob.html#JobOperator). The method is stop(long executionId); you would have to use JobExplorer to find the correct executionId to stop.
Also, from within a job flow configuration, you can configure a job to stop after a step's execution based on its exit status (see https://docs.spring.io/spring-batch/trunk/reference/html/configureStep.html#stopElement).
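Note that JobOperator.stop only sends a stop signal; the job does not necessarily halt the moment the call returns. One way to picture this is a flag that the running work checks between chunks. The sketch below is plain Java and only a model: StopSignalSketch and its methods are hypothetical and do not reflect Spring Batch internals.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Plain-Java model (no Spring) of a cooperative stop signal checked between chunks. */
final class StopSignalSketch {

    /** Hypothetical stand-in for the "stop requested" state of a running execution. */
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);

    /** Only sends the signal; the running loop decides when to honour it. */
    void stop() {
        stopRequested.set(true);
    }

    /** Processes up to totalChunks chunks and returns how many were completed. */
    int run(int totalChunks, int stopAfterChunk) {
        int completed = 0;
        for (int chunk = 1; chunk <= totalChunks; chunk++) {
            if (stopRequested.get()) {
                break; // the signal is honoured at a chunk boundary, not mid-chunk
            }
            completed++;
            if (chunk == stopAfterChunk) {
                stop(); // simulates an outside caller requesting a stop during this chunk
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        System.out.println("completed chunks: " + new StopSignalSketch().run(10, 3));
    }
}
```

In this model, the chunk that is in progress when the signal arrives still finishes, and the loop exits before the next chunk starts.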
I assume you want to stop a job by a given name.
Here is the code.
String jobName = jobExecution.getJobInstance().getJobName(); // in most cases
DataSource dataSource = ... // @Autowired in your code
JobOperator jobOperator = ... // @Autowired in your code
JobExplorerFactoryBean factory = new JobExplorerFactoryBean();
factory.setDataSource(dataSource);
factory.setJdbcOperations(new JdbcTemplate(dataSource));
JobExplorer jobExplorer = factory.getObject();
Set<JobExecution> jobExecutions = jobExplorer.findRunningJobExecutions(jobName);
// A plain loop is used because stop() declares checked exceptions, which the
// enclosing method must handle or declare; the loop variable also avoids
// clashing with the jobExecution variable above.
for (JobExecution running : jobExecutions) {
    jobOperator.stop(running.getId());
}
I'm having trouble getting a conditional Spring Batch flow to work using Java config. The samples I've seen in the Spring Batch samples, in Spring Batch's test code, or on Stack Overflow tend to show a conditional where a single step needs to be executed on condition, or it's the final step, or both. That's not the case I need to solve.
In procedural pseudocode, I want it to behave like this:
initStep()
if decision1()
    subflow1()
middleStep()
if decision2()
    subflow2()
lastStep()
So subflow1 and subflow2 are conditional, but init, middle, and last always execute. Here's my stripped-down test case. In the current configuration, it just quits after executing subflow1.
public class FlowJobTest {
    private JobBuilderFactory jobBuilderFactory;
    private JobRepository jobRepository;
    private JobExecution execution;

    @BeforeMethod
    public void setUp() throws Exception {
        jobRepository = new MapJobRepositoryFactoryBean().getObject();
        jobBuilderFactory = new JobBuilderFactory(jobRepository);
        execution = jobRepository.createJobExecution("flow", new JobParameters());
    }

    @Test
    public void figureOutFlowJobs() throws Exception {
        JobExecutionDecider subflow1Decider = decider(true);
        JobExecutionDecider subflow2Decider = decider(false);
        Flow subflow1 = new FlowBuilder<Flow>("subflow-1").start(echo("subflow-1-Step-1")).next(echo("subflow-1-Step-2")).end();
        Flow subflow2 = new FlowBuilder<Flow>("subflow-2").start(echo("subflow-2-Step-1")).next(echo("subflow-2-Step-2")).end();
        Job job = jobBuilderFactory.get("testJob")
                .start(echo("init"))
                .next(subflow1Decider)
                .on("YES").to(subflow1)
                .from(subflow1Decider)
                .on("*").to(echo("middle"))
                .next(subflow2Decider)
                .on("YES").to(subflow2)
                .from(subflow2Decider)
                .on("*").to(echo("last"))
                .next(echo("last"))
                .build().preventRestart().build();
        job.execute(execution);
        assertEquals(execution.getStatus(), BatchStatus.COMPLETED);
        assertEquals(execution.getStepExecutions().size(), 5);
    }

    private Step echo(String stepName) {
        return new AbstractStep() {
            {
                setName(stepName);
                setJobRepository(jobRepository);
            }

            @Override
            protected void doExecute(StepExecution stepExecution) throws Exception {
                System.out.println("step: " + stepName);
                stepExecution.upgradeStatus(BatchStatus.COMPLETED);
                stepExecution.setExitStatus(ExitStatus.COMPLETED);
                jobRepository.update(stepExecution);
            }
        };
    }

    private JobExecutionDecider decider(boolean decision) {
        return (jobExecution, stepExecution) -> new FlowExecutionStatus(decision ? "YES" : "NO");
    }
}
Your original job definition should work as well, with only a small tweak. The test was failing because the job finished (with status COMPLETED) after the first sub flow. If you instruct it to continue to the middle step instead, it should work as intended. A similar adjustment applies to the second flow.
Job job = jobBuilderFactory.get("testJob")
        .start(echo("init"))
        .next(subflow1Decider)
        .on("YES").to(subflow1).next(echo("middle"))
        .from(subflow1Decider)
        .on("*").to(echo("middle"))
        .next(subflow2Decider)
        .on("YES").to(subflow2).next(echo("last"))
        .from(subflow2Decider)
        .on("*").to(echo("last"))
        .build().preventRestart().build();
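Why the extra .next(...) calls matter can be pictured with a small transition table. The following plain-Java sketch is a deliberately simplified model and not how Spring Batch is implemented: FlowTransitionSketch, on, run, and build are hypothetical names. It assumes that a state with no outgoing transition ends the flow and that an exact outcome match is preferred over the "*" wildcard.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model (no Spring) of a flow as a table of state transitions. */
final class FlowTransitionSketch {

    /** Key is "state|outcome"; the outcome "*" matches anything. */
    private final Map<String, String> transitions = new HashMap<>();

    FlowTransitionSketch on(String from, String outcome, String to) {
        transitions.put(from + "|" + outcome, to);
        return this;
    }

    /** Walks the table from start; states without an entry in deciderOutcomes yield COMPLETED. */
    List<String> run(String start, Map<String, String> deciderOutcomes) {
        List<String> visited = new ArrayList<>();
        String state = start;
        while (state != null) {
            visited.add(state);
            String outcome = deciderOutcomes.getOrDefault(state, "COMPLETED");
            String next = transitions.get(state + "|" + outcome);
            if (next == null) {
                next = transitions.get(state + "|*");
            }
            state = next; // null means no transition is registered, so the flow ends here
        }
        return visited;
    }

    /** Builds the table from the question; withFix adds the two missing transitions. */
    static FlowTransitionSketch build(boolean withFix) {
        FlowTransitionSketch flow = new FlowTransitionSketch()
                .on("init", "*", "decider1")
                .on("decider1", "YES", "subflow1")
                .on("decider1", "*", "middle")
                .on("middle", "*", "decider2")
                .on("decider2", "YES", "subflow2")
                .on("decider2", "*", "last");
        if (withFix) {
            flow.on("subflow1", "*", "middle").on("subflow2", "*", "last");
        }
        return flow;
    }

    public static void main(String[] args) {
        Map<String, String> outcomes = Map.of("decider1", "YES", "decider2", "NO");
        System.out.println("original: " + build(false).run("init", outcomes));
        System.out.println("fixed:    " + build(true).run("init", outcomes));
    }
}
```

With decider1 returning YES and decider2 returning NO, the original table stops after subflow1, while the fixed table continues through middle, decider2, and last.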
The approach I used to make this work was to break my conditional flows into flow steps.
public void figureOutFlowJobsWithFlowStep(boolean decider1, boolean decider2, int expectedSteps) throws Exception {
    JobExecutionDecider subflow1Decider = decider(decider1);
    JobExecutionDecider subflow2Decider = decider(decider2);
    Flow subFlow1 = new FlowBuilder<Flow>("sub-1")
            .start(subflow1Decider)
            .on("YES")
            .to(echo("sub-1-1")).next(echo("sub-1-2"))
            .from(subflow1Decider)
            .on("*").end()
            .end();
    Flow subFlow2 = new FlowBuilder<Flow>("sub-2")
            .start(subflow2Decider)
            .on("YES").to(echo("sub-2-1")).next(echo("sub-2-2"))
            .from(subflow2Decider)
            .on("*").end()
            .end();
    Step subFlowStep1 = new StepBuilder("sub1step").flow(subFlow1).repository(jobRepository).build();
    Step subFlowStep2 = new StepBuilder("sub2step").flow(subFlow2).repository(jobRepository).build();
    Job job = jobBuilderFactory.get("testJob")
            .start(echo("init"))
            .next(subFlowStep1)
            .next(echo("middle"))
            .next(subFlowStep2)
            .next(echo("last"))
            .preventRestart().build();
    job.execute(execution);
    assertEquals(execution.getStatus(), BatchStatus.COMPLETED);
    assertEquals(execution.getStepExecutions().size(), expectedSteps);
}