Spring Batch step not re-executed (cache)

I'm working on a project that includes Spring Batch. Before pasting the code snippets, here is a quick summary of how the job works with a cron:
the cron calls a REST API on my project (@PostMapping("/jobs/external/{jobName}"))
in the POST method, I retrieve the job and launch it.
each execution is supposed to run a step.
the step contains a reader (an external REST call to the Elasticsearch API to fetch documents) and a processor.
Now my problem: in catalina.out I can see the REST call from the cron every 10 minutes, as configured. BUT the step doesn't seem to make that call to Elasticsearch every 10 minutes; the batch process always works on the same set of data, which is fetched only once, when the job first runs after the Tomcat restart.
Job REST API:
@PostMapping("/jobs/external/{jobName}")
@Timed
public ResponseEntity start(@PathVariable String jobName) throws BatchException {
    log.info("LAUNCHING JOB FROM EXTERNAL : {}, timestamp : {}", jobName, Instant.now().toString());
    try {
        Job job = jobRegistry.getJob(jobName);
        JobParametersBuilder builder = new JobParametersBuilder();
        builder.addDate("date", new Date());
        return Optional.of(jobLauncher.run(job, builder.toJobParameters()))
            .map(BatchExecutionVM::new)
            .map(exec -> ResponseEntity
                .ok()
                .headers(HeaderUtil.createAlert("jobManagement.started", jobName))
                .body(exec))
            .orElseGet(() -> ResponseEntity.badRequest().build());
    } catch (NoSuchJobException aEx) {
        log.warn(JOB_NOT_FOUND, aEx);
        throw new BatchException();
    } catch (JobInstanceAlreadyCompleteException | JobExecutionAlreadyRunningException | JobRestartException aEx) {
        log.warn("Job execution error.", aEx);
        throw new BatchException();
    } catch (JobParametersInvalidException aEx) {
        log.warn("Job parameters are invalid.", aEx);
        throw new BatchException();
    }
}
Job configuration:
@Bean
public Job usualJob() {
    return jobBuilderFactory
        .get("usualJob")
        .incrementer(new SimpleJobIncrementer())
        .flow(readUsualStep())
        .end()
        .build();
}
@Bean
public Step readUsualStep() {
    // TODO: simplify, we don't need a chunk here
    return stepBuilderFactory.get("readUsualStep")
        .allowStartIfComplete(true)
        .<AlertDocument, Void>chunk(25)
        .readerIsTransactionalQueue()
        .reader(rowItemReader())
        .processor(rowItemProcessor())
        .build();
}
@Bean
public ItemReader<AlertDocument> rowItemReader() {
    return new UsualItemReader(usualService.getLastAlerts());
}

@Bean
public UsualMapRowProcessor rowItemProcessor() {
    return new UsualMapRowProcessor();
}
I don't know why usualService.getLastAlerts() is called just once and not every 10 minutes.

Thanks to M. Deinum, this is basically the solution:
@Bean
@StepScope
public ItemReader<AlertDocument> rowItemReader() {
    return new UsualItemReader(usualService.getLastAlerts());
}
Annotating the reader bean with the @StepScope annotation makes it re-instantiated for every step execution, so usualService.getLastAlerts() is called again on each run.
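If you would rather not rely on step scope, another option is to move the Elasticsearch call into the reader itself so it only happens when the step actually starts reading. A rough sketch (not from the original answer; the LazyAlertItemReader class name is made up here, while UsualService#getLastAlerts() and AlertDocument come from the question):

import java.util.Iterator;
import org.springframework.batch.item.ItemReader;

public class LazyAlertItemReader implements ItemReader<AlertDocument> {

    private final UsualService usualService;
    private Iterator<AlertDocument> iterator;

    public LazyAlertItemReader(UsualService usualService) {
        this.usualService = usualService;
    }

    @Override
    public AlertDocument read() {
        if (iterator == null) {
            // Deferred call: runs when the step starts reading, not when the bean is created.
            iterator = usualService.getLastAlerts().iterator();
        }
        if (iterator.hasNext()) {
            return iterator.next();
        }
        // End of data: reset so the next job execution fetches a fresh set of alerts.
        iterator = null;
        return null;
    }
}

Either way, the key point is the same: the fetch has to happen once per execution, not once at application startup when the singleton bean is built.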

Related

Can we get data processed in Spring Batch after batch job is completed?

I am using Spring Batch to read data from a DB, process it, and do some processing in the writer.
If the chunk size is less than the number of records read by the reader, then Spring Batch runs in multiple chunks. I want to do the processing in the writer only once, at the end of all chunk processing, or if this is not possible then I will remove the writer and process the data obtained in the processor after the batch job is completed. Is this possible?
Below is the code that triggers my Spring Batch job:
private void triggerSpringBatchJob() {
    loggerConfig.logDebug(log, " : Triggering product catalog scheduler ");
    JobParametersBuilder builder = new JobParametersBuilder();
    try {
        // Adding date in buildJobParameters because if not added we will get
        // "A job instance already exists": JobInstanceAlreadyCompleteException
        builder.addDate("date", new Date());
        jobLauncher.run(processProductCatalog, builder.toJobParameters());
    } catch (JobExecutionAlreadyRunningException | JobRestartException | JobInstanceAlreadyCompleteException
            | JobParametersInvalidException e) {
        e.printStackTrace();
    }
}
Below is my Spring Batch configuration:
@Configuration
@EnableBatchProcessing
public class BatchJobProcessConfiguration {

    @Bean
    @StepScope
    RepositoryItemReader<Tuple> reader(SkuRepository skuRepository,
            ProductCatalogConfiguration productCatalogConfiguration) {
        RepositoryItemReader<Tuple> reader = new RepositoryItemReader<>();
        reader.setRepository(skuRepository);
        // query parameters
        List<Object> queryMethodArguments = new ArrayList<>();
        if (productCatalogConfiguration.getSkuId().isEmpty()) {
            reader.setMethodName("findByWebEligibleAndDiscontinued");
            queryMethodArguments.add(productCatalogConfiguration.getWebEligible()); // for web eligible
            queryMethodArguments.add(productCatalogConfiguration.getDiscontinued()); // for discontinued
            queryMethodArguments.add(productCatalogConfiguration.getCbdProductId()); // for cbd products
        } else {
            reader.setMethodName("findBySkuIds");
            queryMethodArguments.add(productCatalogConfiguration.getSkuId()); // for sku ids
        }
        reader.setArguments(queryMethodArguments);
        reader.setPageSize(1000);
        Map<String, Direction> sorts = new HashMap<>();
        sorts.put("sku_id", Direction.ASC);
        reader.setSort(sorts);
        return reader;
    }

    @Bean
    @StepScope
    ItemWriter<ProductCatalogWriterData> writer() {
        return new ProductCatalogWriter();
    }

    @Bean
    ProductCatalogProcessor processor() {
        return new ProductCatalogProcessor();
    }
    @Bean
    SkipPolicy readerSkipper() {
        return new ReaderSkipper();
    }

    @Bean
    Step productCatalogDataStep(ItemReader<Tuple> itemReader, ProductCatalogWriter writer,
            HttpServletRequest request, StepBuilderFactory stepBuilderFactory, BatchConfiguration batchConfiguration) {
        return stepBuilderFactory.get("processProductCatalog")
            .<Tuple, ProductCatalogWriterData>chunk(batchConfiguration.getBatchChunkSize())
            .reader(itemReader)
            .faultTolerant()
            .skipPolicy(readerSkipper())
            .processor(processor())
            .writer(writer)
            .build();
    }

    @Bean
    Job productCatalogData(Step productCatalogDataStep, HttpServletRequest request,
            JobBuilderFactory jobBuilderFactory) {
        return jobBuilderFactory.get("processProductCatalog")
            .incrementer(new RunIdIncrementer())
            .flow(productCatalogDataStep)
            .end()
            .build();
    }
}
I want to do the processing in the writer only once at the end of all batch process completion, or if this is not possible then I will remove the writer and process the data obtained in the processor after the batch job is completed. Is this possible?
"at the end of all batch process completion" is key here. If the requirement is to do some processing after all chunks have been "pre-processed", I would keep it simple and use two steps for that:
Step 1: (pre)processes the data as needed and writes it to a temporary storage
Step 2: Here you do whatever you want with the processed data prepared in the temporary storage
A final step would clean up the temporary storage if it is persistent (a file, a staging table, etc). Otherwise, i.e. if it is in memory, this is optional.
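A minimal sketch of that layout (the bean names, the aggregateStep tasklet and the idea of a staging area are assumptions for illustration, not code from the question):

@Bean
public Job productCatalogJob(Step preProcessStep, Step aggregateStep) {
    return jobBuilderFactory.get("productCatalogJob")
        .incrementer(new RunIdIncrementer())
        .start(preProcessStep)   // chunk-oriented: read -> process -> write to temporary storage
        .next(aggregateStep)     // runs once, after every chunk of the previous step has completed
        .build();
}

@Bean
public Step aggregateStep() {
    return stepBuilderFactory.get("aggregateStep")
        .tasklet((contribution, chunkContext) -> {
            // Executed exactly once: read the temporary storage (file, staging table, ...)
            // and do the final, whole-dataset processing here.
            return RepeatStatus.FINISHED;
        })
        .build();
}

The second step is a plain tasklet, so it is guaranteed to run only after the chunk-oriented step has finished all of its chunks.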

Configure ItemWriter, ItemReader, Step and Job dynamically in Spring Batch

I'm working on a process which uses Spring Integration and Spring Batch:
1) Using Spring Integration I poll a remote SFTP dir to get different CSV files as a Message
2) The Message, which carries a CSV file as its payload, is sent downstream to a Transformer which transforms the Message into a JobLaunchRequest
3) Spring Batch reads the CSV file and dumps it into the DB
Question:
For each CSV file I need to configure an ItemReader, ItemWriter, Step and Job.
With that in mind, if I have to deal with 10 different CSV files, do I have to configure all 4 beans listed above for each CSV?
The CSVs differ in header names and header count, and each CSV has a different JPA entity.
Eventually I will have 40 @Bean configurations, which I think is bad.
Can anyone tell me whether this is how Spring Batch is meant to work, or whether there is another way to make one common dynamic bean configuration for the different CSVs?
Here is the code:
IntegrationFlow:
@Bean
public IntegrationFlow integrationFlow(JobLaunchingGateway jobLaunchingGateway) {
    return IntegrationFlows.from(Sftp.inboundAdapter(sftpSessionFactory)
                .remoteDirectory("/uploads")
                .localDirectory(new File("C:\\Users\\DELL\\Desktop\\local"))
                .patternFilter("*.csv")
                .autoCreateLocalDirectory(true),
            c -> c.poller(Pollers.fixedRate(1000).taskExecutor(taskExecutor()).maxMessagesPerPoll(1)))
        .transform(fileMessageToJobRequest())
        .handle(jobLaunchingGateway)
        .log(LoggingHandler.Level.WARN, "headers.id + ': ' + payload")
        .route(JobExecution.class, j -> j.getStatus().isUnsuccessful() ? "jobFailedChannel" : "jobSuccessfulChannel")
        .get();
}
Transformer:
@Transformer
public JobLaunchRequest toRequest(Message<File> message) {
    JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
    jobParametersBuilder.addString(fileParameterName, message.getPayload().getAbsolutePath());
    jobParametersBuilder.addLong("key.id", System.currentTimeMillis());
    return new JobLaunchRequest(job, jobParametersBuilder.toJobParameters());
}
Batch Job:
@Bean
public Job vendorMasterBatchJob(Step vendorMasterStep) {
    return jobBuilderFactory.get("vendorMasterBatchJob")
        .incrementer(new RunIdIncrementer())
        .start(vendorMasterStep)
        .listener(deleteInputFileJobListener)
        .build();
}
Batch Step:
@Bean
public Step vendorMasterStep(FlatFileItemReader<ERPVendorMaster> vendorMasterReader,
        JpaItemWriter<ERPVendorMaster> vendorMasterWriter) {
    return stepBuilderFactory.get("vendorMasterStep")
        .<ERPVendorMaster, ERPVendorMaster>chunk(chunkSize)
        .reader(vendorMasterReader)
        .writer(vendorMasterWriter)
        .faultTolerant()
        .skipLimit(Integer.MAX_VALUE)
        .skip(RuntimeException.class)
        .listener(skipListener)
        .build();
}
ItemWriter:
@Bean
public JpaItemWriter<ERPVendorMaster> vendorMasterWriter() {
    return new JpaItemWriterBuilder<ERPVendorMaster>()
        .entityManagerFactory(entityManagerFactory)
        .build();
}
ItemReader:
@Bean
@StepScope
public FlatFileItemReader<ERPVendorMaster> vendorMasterReader(@Value("#{jobParameters['input.file.name']}") String fileName) {
    return new FlatFileItemReaderBuilder<ERPVendorMaster>()
        .name("vendorMasterItemReader")
        .resource(new FileSystemResource(fileName))
        .linesToSkip(1)
        .delimited()
        .names(commaSeparatedVendorMasterHeaderValues.split(","))
        .fieldSetMapper(new BeanWrapperFieldSetMapper<ERPVendorMaster>() {{
            setConversionService(stringToDateConversionService());
            setTargetType(ERPVendorMaster.class);
        }})
        .build();
}
I'm very new to Spring Boot, so any help will be appreciated.
Thanks

Spring Batch job failing

All the steps pass, but the job still completes with a FAILED status.
@Bean
public Job job() {
    return this.jobBuilderFactory.get("person-job")
        .start(initializeBatch())
        .next(readBodystep())
        .on("STOPPED")
        .stopAndRestart(initializeBatch())
        .end()
        .validator(batchJobParamValidator)
        .incrementer(jobParametersIncrementer)
        .listener(jobListener)
        .build();
}
@Bean
public Flow preProcessingFlow() {
    return new FlowBuilder<Flow>("preProcessingFlow")
        .start(extractFooterAndBodyStep())
        .next(readFooterStep())
        .build();
}
@Bean
public Step initializeBatch() {
    return this.stepBuilderFactory.get("initializeBatch")
        .flow(preProcessingFlow())
        .build();
}

@Bean
public Step readBodystep() {
    return this.stepBuilderFactory.get("readChunkStep")
        .<PersonDTO, PersonBO>chunk(10)
        .reader(personFileBodyReader)
        .processor(itemProcessor())
        .writer(dummyWriter)
        .listener(new ReadFileStepListener())
        .listener(personFileBodyReader)
        .build();
}
Is anything wrong with the above configuration?
When I remove the stopAndRestart configuration, the job passes.
For your use case, it is not stopAndRestart that you need; it is rather setting allowStartIfComplete on the step. With that, if the job fails, the step will be re-executed on restart even if it was successfully completed in the previous run.
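In this configuration that would look roughly like the following (a sketch based on the readBodystep bean from the question, only adding the allowStartIfComplete call):

@Bean
public Step readBodystep() {
    return this.stepBuilderFactory.get("readChunkStep")
        // Re-execute this step on a restart even if it completed successfully before.
        .allowStartIfComplete(true)
        .<PersonDTO, PersonBO>chunk(10)
        .reader(personFileBodyReader)
        .processor(itemProcessor())
        .writer(dummyWriter)
        .listener(new ReadFileStepListener())
        .listener(personFileBodyReader)
        .build();
}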

Uses of JobExecutionDecider in Spring Batch split flow using SimpleAsyncTaskExecutor

I want to configure a Spring Batch job with 4 steps. Step 2 and Step 3 are independent of each other, so I want to execute them in parallel. Either of these two steps, or both, can be skipped depending on an execution parameter. The flow is as follows:
(diagram: batch flow details)
The Java configuration is shown below:
@Bean
public Job sampleBatchJob() throws Exception {
    final Flow step1Flow = new FlowBuilder<SimpleFlow>("step1Flow")
        .from(step1Tasklet()).end();
    final Flow step2Flow = new FlowBuilder<SimpleFlow>("step2Flow")
        .from(new Step2FlowDecider()).on("EXECUTE").to(step2MasterStep())
        .from(new Step2FlowDecider()).on("SKIP").end(ExitStatus.COMPLETED.getExitCode())
        .build();
    final Flow step3Flow = new FlowBuilder<SimpleFlow>("step3Flow")
        .from(new Step3FlowDecider()).on("EXECUTE").to(step3MasterStep())
        .from(new Step3FlowDecider()).on("SKIP").end(ExitStatus.COMPLETED.getExitCode())
        .build();
    final Flow splitFlow = new FlowBuilder<Flow>("splitFlow")
        .split(new SimpleAsyncTaskExecutor())
        .add(step2Flow, step3Flow)
        .build();
    return jobBuilderFactory().get("sampleBatchJob")
        .start(step1Flow)
        .next(splitFlow)
        .next(step4MasterStep())
        .end()
        .build();
}
Sample code for Step2FlowDecider:
public class Step2FlowDecider implements JobExecutionDecider {

    @Override
    public FlowExecutionStatus decide(JobExecution jobExecution, StepExecution stepExecution) {
        if (StringUtils.equals("Y", batchParameter.executeStep2())) {
            return new FlowExecutionStatus("EXECUTE");
        }
        return new FlowExecutionStatus("SKIP");
    }
}
With this configuration, when I try to execute the batch it fails, without any detailed error log.

Separating step classes in Spring Batch

I have tried to find the solution but I cannot... ㅠㅠ
I want to separate steps in a job like below.
step1.class -> step2.class -> step3.class -> done
The reason I split it up like this is that I have to use different queries in each step.
@Bean
public Job bundleJob() {
    return jobBuilderFactory.get(JOB_NAME)
        .start(step1)   // bean
        .next(step2)    // bean
        .next(step3())  // and here is the code, e.g. reader, processor, writer
        .build();
}
My purpose is that I have to use the data returned from step1 and step2.
But jpaItemReader seems to behave asynchronously, so it doesn't process in the order above.
The debug flow looks like this:
readerStep1 -> writerStep1 -> readerStep2 -> writerStep2 -> readerStep3 -> writerStep3
and
-> processorStep1 -> processorStep2 -> processorStep3
That is the big problem for me...
How can I make each step in the job wait for the previous one, including its querying?
Aha! I got it.
The point is how the beans are created in the configuration.
I had annotated all kinds of steps with @Bean, so they are all created eagerly by Spring.
The solution is late binding with @JobScope or @StepScope:
@Bean
@StepScope // late-created bean
public ListItemReader<Dto> itemReader() {
    // business logic
    return new ListItemReader<>(dto);
}
To have separate steps in your job you can use a Flow with a TaskletStep. Sharing a snippet for your reference:
@Bean
public Job processJob() throws Exception {
    Flow fetchData = (Flow) new FlowBuilder<>("fetchData")
        .start(fetchDataStep()).build();
    Flow transformData = (Flow) new FlowBuilder<>("transformData")
        .start(transformDataStep()).build();
    Job job = jobBuilderFactory.get("processTenantLifeCycleJob").incrementer(new RunIdIncrementer())
        .start(fetchData).next(transformData).next(processData()).end()
        .listener(jobCompletionListener()).build();
    ReferenceJobFactory referenceJobFactory = new ReferenceJobFactory(job);
    registry.register(referenceJobFactory);
    return job;
}
@Bean
public TaskletStep fetchDataStep() {
    return stepBuilderFactory.get("fetchData")
        .tasklet(fetchDataValue()).listener(fetchDataStepListener()).build();
}

@Bean
@StepScope
public FetchDataValue fetchDataValue() {
    return new FetchDataValue();
}

@Bean
public TaskletStep transformDataStep() {
    return stepBuilderFactory.get("transformData")
        .tasklet(transformValue()).listener(sendReportDataCompletionListener()).build();
}

@Bean
@StepScope
public TransformValue transformValue() {
    return new TransformValue();
}

@Bean
public Step processData() {
    return stepBuilderFactory.get("processData").<String, Data>chunk(chunkSize)
        .reader(processDataReader()).processor(dataProcessor()).writer(processDataWriter())
        .listener(processDataListener())
        .taskExecutor(backupTaskExecutor()).build();
}
In this example I have used 2 Flows, to fetch and to transform data, each of which executes its logic from a dedicated class.
In order to pass on the values produced in steps 1 and 2, you can store them in the job context and retrieve them in the processData step, which has a reader, processor and writer.
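One common way to do that (a sketch under assumptions, not code from the answer above; the "fetchedData" key, the fetchTheData() helper and the List<String> payload are made up for illustration) is to put the value into the step's ExecutionContext, promote it to the job's ExecutionContext with an ExecutionContextPromotionListener registered on that step, and read it back later through late binding in a step-scoped bean:

// 1) The tasklet of the first step stores its result in the step's ExecutionContext.
public class FetchDataValue implements Tasklet {

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
        List<String> result = fetchTheData(); // hypothetical fetch logic
        chunkContext.getStepContext().getStepExecution()
            .getExecutionContext().put("fetchedData", result);
        return RepeatStatus.FINISHED;
    }
}

// 2) A promotion listener on the fetch step copies the key into the job's ExecutionContext, e.g.
//    stepBuilderFactory.get("fetchData").tasklet(fetchDataValue()).listener(promotionListener()).build()
@Bean
public ExecutionContextPromotionListener promotionListener() {
    ExecutionContextPromotionListener listener = new ExecutionContextPromotionListener();
    listener.setKeys(new String[] {"fetchedData"});
    return listener;
}

// 3) A later, step-scoped bean receives the promoted value via late binding.
@Bean
@StepScope
public ListItemReader<String> processDataReader(
        @Value("#{jobExecutionContext['fetchedData']}") List<String> fetchedData) {
    return new ListItemReader<>(fetchedData);
}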
