Spring Batch: stop the ItemReader from running when the app starts

I'm using Spring Batch to run jobs which are triggered from a controller method. Everything works fine except when the application first boots the ItemReader runs and reads through everything.
Is this expected behaviour? It's not really a big deal; it just slows down the boot time by a good 500 seconds.
Note: The job itself isn't running as I've disabled that via
batch:
  job:
    enabled: false
Edit:
Configuration
@Slf4j
@Configuration
@EnableBatchProcessing
public class JobConfiguration {

    @Value("${app.directoryPath}")
    public String directoryPath;

    private final JobBuilderFactory jobBuilderFactory;
    private final StepBuilderFactory stepBuilderFactory;
    private final LocationRepository locationRepository;
    private final VideoRepository videoRepository;

    public JobConfiguration(JobBuilderFactory jobBuilderFactory, StepBuilderFactory stepBuilderFactory, LocationRepository locationRepository, VideoRepository videoRepository) {
        this.jobBuilderFactory = jobBuilderFactory;
        this.stepBuilderFactory = stepBuilderFactory;
        this.locationRepository = locationRepository;
        this.videoRepository = videoRepository;
    }

    @Bean(name = "importVideo")
    public Job importVideo(Step processVideos) {
        return jobBuilderFactory
                .get("importVideo")
                .start(processVideos)
                .build();
    }

    @Bean(name = "processVideos")
    public Step processVideos(VideoItemReader videoItemReader, VideoProcessor videoProcessor, VideoWriter videoWriter) {
        return stepBuilderFactory.get("processVideos").<File, Video>chunk(25)
                .reader(videoItemReader)
                .processor(videoProcessor)
                .writer(videoWriter)
                .build();
    }

    @Bean
    public VideoWriter videoWriter() {
        return new VideoWriter(videoRepository);
    }

    @Bean
    public VideoProcessor videoProcessor() {
        return new VideoProcessor(locationRepository);
    }

    @Bean
    public VideoItemReader videoItemReader() {
        return new VideoItemReader("file:" + directoryPath, locationRepository);
    }
}
And I'm calling the job with a GET request via
@GetMapping("/jobs/{job}")
public ResponseEntity<String> importVideos(@PathVariable String job) {
    if (job.equalsIgnoreCase("createThumbnails")) {
        executeJob(createThumbnails);
    } else if (job.equalsIgnoreCase("importVideo")) {
        executeJob(importVideo);
    }
    return new ResponseEntity<>("running", HttpStatus.OK);
}

private void executeJob(Job job) {
    Set<JobExecution> runningJobExecutions = jobExplorer.findRunningJobExecutions(job.getName());
    if (runningJobExecutions.isEmpty()) {
        try {
            jobLauncher.run(job, new JobParameters());
        } catch (Exception ex) {
            log.error(ex.getMessage());
        }
    } else {
        log.info("executeJob already running, so... NOPE!");
    }
}
But how I'm calling it has nothing to do with it; if I delete that code, the same behaviour still exists...

This would mostly be due to Spring not recognising the spring.batch.job.enabled property. Try loading your property file explicitly in your Application class; hopefully this will solve the issue.
@PropertySource("classpath:batch.properties")

Figured out my issue: my custom reader implementation was also implementing InitializingBean, and its afterPropertiesSet() callback was calling what was effectively 'getVideos()'.
Removing the InitializingBean interface stopped the reader running on each boot.
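For anyone hitting the same thing: since afterPropertiesSet() runs when the application context starts, an alternative sketch (based on the DirectoryItemReader shown in the related question further down; only the lazy loading is new) is to drop InitializingBean and populate the file list on the first read() call, so the directory scan only happens when the step actually runs:

public class DirectoryItemReader implements ItemReader<File> {

    private final String directoryPath;
    private List<File> foundFiles; // populated lazily, not at startup

    public DirectoryItemReader(final String directoryPath) {
        this.directoryPath = directoryPath;
    }

    @Override
    public synchronized File read() throws Exception {
        if (foundFiles == null) {
            // First call: scan the directory now, while the step is executing
            foundFiles = new ArrayList<>();
            ResourcePatternResolver patternResolver = new PathMatchingResourcePatternResolver();
            for (Resource resource : patternResolver.getResources(directoryPath)) {
                foundFiles.add(resource.getFile());
            }
        }
        return foundFiles.isEmpty() ? null : foundFiles.remove(0);
    }
}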

Related

Spring Batch - How to Automatically Restart a Scheduled Job with Multi-Threaded Steps if Failed?

I'm new to Spring Batch. I have a scheduled job which needs to run every 2 hours. This job has several multi-threaded steps which should run independently of each other. The job is currently launched using a JobLauncher, as shown below.
@Component
@EnableScheduling
public class JobScheduler {

    private static final Logger logger = LoggerFactory.getLogger(JobScheduler.class);

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job job;

    @Scheduled(cron = "0 0 */2 * * ?")
    @Retryable(maxAttempts = 3, backoff = @Backoff(delay = 60000),
            include = {SQLException.class, RuntimeException.class})
    public void automatedTask() {
        JobParameters jobParameters = new JobParametersBuilder().addLong("time", System.currentTimeMillis()).toJobParameters();
        try {
            JobExecution jobExecution = jobLauncher.run(job, jobParameters);
        } catch (JobInstanceAlreadyCompleteException | JobRestartException | JobParametersInvalidException |
                 JobExecutionAlreadyRunningException ex) {
            logger.error("Error occurred when executing job scheduler", ex);
        }
    }
}
Mentioned below is my BatchConfig class.
@Configuration
@EnableBatchProcessing
@EnableRetry
public class BatchConfig {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private DataSource dataSource;

    @Bean
    @StepScope
    public JdbcPagingItemReader<Model> reader1() {
        StringBuffer selectClause = new StringBuffer();
        selectClause.append("SELECT ");
        selectClause.append("* ");
        StringBuffer fromClause = new StringBuffer();
        fromClause.append("FROM ");
        fromClause.append("TABLENAME");

        OraclePagingQueryProvider oraclePagingQueryProvider = new OraclePagingQueryProvider();
        oraclePagingQueryProvider.setSelectClause(selectClause.toString());
        oraclePagingQueryProvider.setFromClause(fromClause.toString());

        Map<String, Order> orderByKeys = new HashMap<>();
        orderByKeys.put("id", Order.ASCENDING);
        oraclePagingQueryProvider.setSortKeys(orderByKeys);

        JdbcPagingItemReader<Model> jdbcPagingItemReader = new JdbcPagingItemReader<>();
        jdbcPagingItemReader.setSaveState(false);
        jdbcPagingItemReader.setDataSource(dataSource);
        jdbcPagingItemReader.setQueryProvider(oraclePagingQueryProvider);
        jdbcPagingItemReader.setRowMapper(BeanPropertyRowMapper.newInstance(Model.class));
        return jdbcPagingItemReader;
    }

    @Bean
    @StepScope
    public JdbcPagingItemReader<Model> reader2() {
    }

    @Bean
    @StepScope
    public JdbcPagingItemReader<Model> reader3() {
    }

    @Bean
    @StepScope
    public ItemWriter<Model> writer1() {
        return new CustomItemWriter1();
    }

    @Bean
    @StepScope
    public ItemWriter<Model> writer2() {
        return new CustomItemWriter2();
    }

    @Bean
    @StepScope
    public ItemWriter<Model> writer3() {
        return new CustomItemWriter3();
    }

    @Bean
    public Step step1() {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setCorePoolSize(4);
        taskExecutor.setMaxPoolSize(4);
        taskExecutor.afterPropertiesSet();
        return stepBuilderFactory.get("step1")
                .<Model, Model>chunk(1000)
                .reader(reader1())
                .writer(writer1())
                .faultTolerant()
                .skipPolicy(new AlwaysSkipItemSkipPolicy())
                .skip(Exception.class)
                .listener(new CustomSkipListener())
                .taskExecutor(taskExecutor)
                .build();
    }

    @Bean
    public Step step2() {
    }

    @Bean
    public Step step3() {
    }

    @Bean
    public Job myJob() {
        return jobBuilderFactory.get("myJob").incrementer(new RunIdIncrementer())
                // .listener(new CustomJobExecutionListener())
                .start(step1()).on("*").to(step2())
                .from(step1()).on(ExitStatus.FAILED.getExitCode()).to(step2())
                .from(step2()).on("*").to(step3())
                .from(step2()).on(ExitStatus.FAILED.getExitCode()).to(step3())
                .end().build();
    }
}
I've added conditional flow to the job so that every next step should work regardless of a failure in the previous step. Everything works fine in the initial steps but if an exception is thrown in the last step, the Exit Status of the whole job becomes FAILED. To solve this AND to solve any other failures in the job, I tried to implement restart functionality. Please note that I'm not saving the state in the readers due to multi-threading and I'm not sure whether this could affect the restarting.
I have referred to the accepted solution in the question below,
https://stackoverflow.com/questions/38846457/how-can-you-restart-a-failed-spring-batch-job-and-let-it-pick-up-where-it-left-o
but I don't quite understand how or where to call the jobOperator.restart method.
I've tried it like below, expecting the job to restart after launching if it failed, but it didn't work at all. Also, this implementation would break the @Retryable annotation because the try-catch block catches Exception.
@Component
@EnableScheduling
public class JobScheduler {

    private static final Logger logger = LoggerFactory.getLogger(JobScheduler.class);

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job job;

    @Autowired
    private JobRepository jobRepository;

    @Autowired
    private JobRegistry jobRegistry;

    @Autowired
    private DataSource dataSource;

    @Scheduled(cron = "0 0 */2 * * ?")
    @Retryable(maxAttempts = 3, backoff = @Backoff(delay = 60000),
            include = {SQLException.class, RuntimeException.class})
    public void automatedTask() {
        JobParameters jobParameters = new JobParametersBuilder().addLong("time", System.currentTimeMillis()).toJobParameters();
        try {
            JobExecution jobExecution = jobLauncher.run(job, jobParameters);
            JobExplorer jobExplorer = this.getJobExplorer(dataSource);
            JobOperator jobOperator = this.getJobOperator(jobLauncher, jobRepository, jobRegistry, jobExplorer);
            List<JobInstance> jobInstances = jobExplorer.getJobInstances("myJob", 0, 1);
            if (!jobInstances.isEmpty()) {
                JobInstance jobInstance = jobInstances.get(0);
                List<JobExecution> jobExecutions = jobExplorer.getJobExecutions(jobInstance);
                if (!jobExecutions.isEmpty()) {
                    for (JobExecution execution : jobExecutions) {
                        if (execution.getStatus().equals(BatchStatus.FAILED)) {
                            jobOperator.restart(execution.getId());
                        }
                    }
                }
            }
        } catch (Exception ex) {
            logger.error("Error occurred when executing job scheduler", ex);
        }
    }

    @Bean
    public JobOperator getJobOperator(final JobLauncher jobLauncher, final JobRepository jobRepository,
                                      final JobRegistry jobRegistry, final JobExplorer jobExplorer) {
        final SimpleJobOperator jobOperator = new SimpleJobOperator();
        jobOperator.setJobLauncher(jobLauncher);
        jobOperator.setJobRepository(jobRepository);
        jobOperator.setJobRegistry(jobRegistry);
        jobOperator.setJobExplorer(jobExplorer);
        return jobOperator;
    }

    @Bean
    public JobExplorer getJobExplorer(final DataSource dataSource) throws Exception {
        final JobExplorerFactoryBean bean = new JobExplorerFactoryBean();
        bean.setDataSource(dataSource);
        bean.setTablePrefix("BATCH_");
        bean.setJdbcOperations(new JdbcTemplate(dataSource));
        bean.afterPropertiesSet();
        return bean.getObject();
    }
}
I then tried adding a custom JobExecutionListener like the one below, expecting it to restart the job after it runs, if it failed. But it just fails, as all the @Autowired beans are null.
public class CustomJobExecutionListener {

    private static final Logger logger = LoggerFactory.getLogger(CustomJobExecutionListener.class);

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private JobRepository jobRepository;

    @Autowired
    private JobRegistry jobRegistry;

    @Autowired
    private DataSource dataSource;

    @BeforeJob
    public void beforeJob(JobExecution jobExecution) {
    }

    @AfterJob
    public void afterJob(JobExecution jobExecution) {
        try {
            JobExplorer jobExplorer = this.getJobExplorer(dataSource);
            JobOperator jobOperator = this.getJobOperator(jobLauncher, jobRepository, jobRegistry, jobExplorer);
            if (jobExecution.getStatus().equals(BatchStatus.FAILED)) {
                jobOperator.restart(jobExecution.getId());
            }
        } catch (Exception ex) {
            logger.error("Unknown error occurred when executing after job execution listener", ex);
        }
    }

    @Bean
    public JobRegistryBeanPostProcessor jobRegistryBeanPostProcessor(JobRegistry jobRegistry) {
        final JobRegistryBeanPostProcessor jobRegistryBeanPostProcessor = new JobRegistryBeanPostProcessor();
        jobRegistryBeanPostProcessor.setJobRegistry(jobRegistry);
        return jobRegistryBeanPostProcessor;
    }

    @Bean
    public JobOperator getJobOperator(final JobLauncher jobLauncher, final JobRepository jobRepository,
                                      final JobRegistry jobRegistry, final JobExplorer jobExplorer) {
        final SimpleJobOperator jobOperator = new SimpleJobOperator();
        jobOperator.setJobLauncher(jobLauncher);
        jobOperator.setJobRepository(jobRepository);
        jobOperator.setJobRegistry(jobRegistry);
        jobOperator.setJobExplorer(jobExplorer);
        return jobOperator;
    }

    @Bean
    public JobExplorer getJobExplorer(final DataSource dataSource) throws Exception {
        final JobExplorerFactoryBean bean = new JobExplorerFactoryBean();
        bean.setDataSource(dataSource);
        bean.setTablePrefix("BATCH_");
        bean.setJdbcOperations(new JdbcTemplate(dataSource));
        bean.afterPropertiesSet();
        return bean.getObject();
    }
}
What am I doing wrong? How should the restart functionality be implemented for this job?
Appreciate your kind help!
Please note that I'm not saving the state in the readers due to multi-threading and I'm not sure whether this could affect the restarting.
It certainly affects restartability. Multi-threading in steps is incompatible with restartability. From the javadoc of the JdbcPagingItemReader that you are using, you can read the following:
The implementation is thread-safe in between calls to open(ExecutionContext),
but remember to use saveState=false if used in a multi-threaded client
(no restart available).
Without restart data, Spring Batch cannot restart the step from where it left off. This is a trade-off that you have accepted by using a multi-threaded step.
but I don't quite understand how or where to call the jobOperator.restart method at.
Now with regard to restarting the failed job, a few notes:
Trying to restart a job in a JobExecutionListener is incorrect. This listener is called in the scope of the current job execution, while a restart will have its own, distinct job execution.
JobOperator#restart should not be called inside the scheduled method, otherwise it will be called for every scheduled run. You can find an example here: https://stackoverflow.com/a/55137314/5019386
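Following the linked answer, a hedged sketch of what that separation could look like, with the restart logic in its own scheduled component (the bean name and cron expression are assumptions; note that with saveState=false the failed step is re-processed from the beginning rather than resumed):

@Component
public class FailedJobRestarter {

    @Autowired
    private JobOperator jobOperator;

    @Autowired
    private JobExplorer jobExplorer;

    // Runs on its own schedule, separate from the method that launches new job instances
    @Scheduled(cron = "0 30 */2 * * ?") // assumed: offset from the launch schedule
    public void restartFailedExecutions() throws Exception {
        for (JobInstance instance : jobExplorer.getJobInstances("myJob", 0, 10)) {
            for (JobExecution execution : jobExplorer.getJobExecutions(instance)) {
                if (execution.getStatus() == BatchStatus.FAILED) {
                    jobOperator.restart(execution.getId());
                }
            }
        }
    }
}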

Saving file information in Spring batch MultiResourceItemReader

I have a directory containing text files. I want to process the files and write the data into the DB. I did that by using MultiResourceItemReader.
I have a scenario where, whenever a file comes in, the first step is to save the file info, like filename and record count, in a log table (a custom table).
Since I used MultiResourceItemReader, it's loading all the files at once and the code I wrote executes once at server startup. I tried the getCurrentResource() method but it's returning null.
Please refer below code.
NetFileProcessController.java
@Slf4j
@RestController
@RequestMapping("/netProcess")
public class NetFileProcessController {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    @Qualifier("netFileParseJob")
    private Job job;

    @GetMapping(path = "/process")
    public @ResponseBody StatusResponse process() throws ServiceException {
        try {
            Map<String, JobParameter> parameters = new HashMap<>();
            parameters.put("date", new JobParameter(new Date()));
            jobLauncher.run(job, new JobParameters(parameters));
            return new StatusResponse(true);
        } catch (Exception e) {
            log.error("Exception", e);
            Throwable rootException = ExceptionUtils.getRootCause(e);
            String errMessage = rootException.getMessage();
            log.info("Root cause is instance of JobInstanceAlreadyCompleteException --> " + (rootException instanceof JobInstanceAlreadyCompleteException));
            if (rootException instanceof JobInstanceAlreadyCompleteException) {
                log.info(errMessage);
                return new StatusResponse(false, "This job has been completed already!");
            } else {
                throw new ServiceException(errMessage);
            }
        }
    }
}
BatchConfig.java
@Configuration
@EnableBatchProcessing
public class BatchConfig {

    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    public void setJobBuilderFactory(JobBuilderFactory jobBuilderFactory) {
        this.jobBuilderFactory = jobBuilderFactory;
    }

    @Autowired
    StepBuilderFactory stepBuilderFactory;

    @Value("file:${input.files.location}${input.file.pattern}")
    private Resource[] netFileInputs;

    @Value("${net.file.column.names}")
    private String netFilecolumnNames;

    @Value("${net.file.column.lengths}")
    private String netFileColumnLengths;

    @Autowired
    NetFileInfoTasklet netFileInfoTasklet;

    @Autowired
    NetFlatFileProcessor netFlatFileProcessor;

    @Autowired
    NetFlatFileWriter netFlatFileWriter;

    @Bean
    public Job netFileParseJob() {
        return jobBuilderFactory.get("netFileParseJob")
                .incrementer(new RunIdIncrementer())
                .start(netFileStep())
                .build();
    }

    public Step netFileStep() {
        return stepBuilderFactory.get("netFileStep")
                .<NetDetailsDTO, NetDetailsDTO>chunk(1)
                .reader(new NetFlatFileReader(netFileInputs, netFilecolumnNames, netFileColumnLengths))
                .processor(netFlatFileProcessor)
                .writer(netFlatFileWriter)
                .build();
    }
}
NetFlatFileReader.java
@Slf4j
public class NetFlatFileReader extends MultiResourceItemReader<NetDetailsDTO> {

    public NetFlatFileReader(Resource[] netFileInputs, String netFilecolumnNames, String netFileColumnLengths) {
        setResources(netFileInputs);
        setDelegate(reader(netFilecolumnNames, netFileColumnLengths));
    }

    private FlatFileItemReader<NetDetailsDTO> reader(String netFilecolumnNames, String netFileColumnLengths) {
        FlatFileItemReader<NetDetailsDTO> flatFileItemReader = new FlatFileItemReader<>();
        FixedLengthTokenizer tokenizer = CommonUtil.fixedLengthTokenizer(netFilecolumnNames, netFileColumnLengths);
        FieldSetMapper<NetDetailsDTO> mapper = createMapper();
        DefaultLineMapper<NetDetailsDTO> lineMapper = new DefaultLineMapper<>();
        lineMapper.setLineTokenizer(tokenizer);
        lineMapper.setFieldSetMapper(mapper);
        flatFileItemReader.setLineMapper(lineMapper);
        return flatFileItemReader;
    }

    /*
     * Mapping column data to DTO
     */
    private FieldSetMapper<NetDetailsDTO> createMapper() {
        BeanWrapperFieldSetMapper<NetDetailsDTO> mapper = new BeanWrapperFieldSetMapper<>();
        try {
            mapper.setTargetType(NetDetailsDTO.class);
        } catch (Exception e) {
            log.error("Exception in mapping column data to dto ", e);
        }
        return mapper;
    }
}
I am stuck on this scenario; any help is appreciated.
I don't think MultiResourceItemReader is appropriate in your case. I would run a job per file for all the reasons of making one thing do one thing and do it well:
Your preparatory step will work by design
It would be easier to run multiple jobs in parallel and improve your file ingestion throughput
In case of failure, you would only restart the job for the failed file
EDIT: adding an example
Resource[] netFileInputs = ... // same code that looks for files as currently in your reader
for (Resource netFileInput : netFileInputs) {
    Map<String, JobParameter> parameters = new HashMap<>();
    parameters.put("netFileInput", new JobParameter(netFileInput.getFilename()));
    jobLauncher.run(job, new JobParameters(parameters));
}

Spring Boot Rest API + Spring Batch

I'm studying the Spring Batch process, but the documentation isn't clarifying the flow for me.
I have one API that receives one flat file with fixed positions. The file has specific header, body, and footer layouts.
I'm thinking of creating a File class that has one Header, a list of Details, and a Footer class.
All I know so far is that I have to use one token to identify the positions for each header, detail, and footer, but everything I've found about Spring Batch doesn't show how to do that and start the process from the API request.
You have to build the job with a JobBuilderFactory:
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Bean
    public SomeReader<Some> reader() {
        // some reader configuration
        return reader;
    }

    @Bean
    public SomeProcessor processor() {
        return new SomeProcessor();
    }

    @Bean
    public SomeWriter<Person> writer() {
        // some config
        return writer;
    }

    @Bean
    public Job someJob() {
        return jobBuilderFactory.get("someJob")
                .flow(step1())
                .end()
                .build();
    }

    @Bean
    public Step step1() {
        return stepBuilderFactory.get("step1")
                .<Some, Some>chunk(10)
                .reader(reader())
                .processor(processor())
                .writer(writer())
                .build();
    }
}
Start the job in a REST controller:
@RestController
@AllArgsConstructor
@Slf4j
public class BatchStartController {

    JobLauncher jobLauncher;
    Job job;

    @GetMapping("/job")
    public void startJob() throws Exception {
        // some parameters
        Map<String, JobParameter> parameters = new HashMap<>();
        JobExecution jobExecution = jobLauncher.run(job, new JobParameters(parameters));
    }
}
And one important detail - add this to application.properties:
spring.batch.job.enabled=false
to prevent the job from starting by itself.
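For the header/detail/footer part of the question, one common approach is a PatternMatchingCompositeLineMapper that selects a tokenizer and mapper per line type. The sketch below assumes each line starts with a one-character record type ('H', 'D', 'F'); the column ranges, the input file location and the Header/Detail/Footer classes are made up for illustration:

@Bean
public FlatFileItemReader<Object> fileReader() {
    // One fixed-length tokenizer per record layout, keyed by a pattern on the line prefix
    FixedLengthTokenizer headerTokenizer = new FixedLengthTokenizer();
    headerTokenizer.setNames("recordType", "fileDate");
    headerTokenizer.setColumns(new Range(1, 1), new Range(2, 9));

    FixedLengthTokenizer detailTokenizer = new FixedLengthTokenizer();
    detailTokenizer.setNames("recordType", "id", "amount");
    detailTokenizer.setColumns(new Range(1, 1), new Range(2, 11), new Range(12, 20));

    FixedLengthTokenizer footerTokenizer = new FixedLengthTokenizer();
    footerTokenizer.setNames("recordType", "recordCount");
    footerTokenizer.setColumns(new Range(1, 1), new Range(2, 8));

    Map<String, LineTokenizer> tokenizers = new HashMap<>();
    tokenizers.put("H*", headerTokenizer);
    tokenizers.put("D*", detailTokenizer);
    tokenizers.put("F*", footerTokenizer);

    // Each pattern maps its FieldSet to the corresponding (assumed) domain class
    Map<String, FieldSetMapper<Object>> mappers = new HashMap<>();
    mappers.put("H*", fieldSet -> new Header(fieldSet.readString("fileDate")));
    mappers.put("D*", fieldSet -> new Detail(fieldSet.readString("id"), fieldSet.readString("amount")));
    mappers.put("F*", fieldSet -> new Footer(fieldSet.readInt("recordCount")));

    PatternMatchingCompositeLineMapper<Object> lineMapper = new PatternMatchingCompositeLineMapper<>();
    lineMapper.setTokenizers(tokenizers);
    lineMapper.setFieldSetMappers(mappers);

    FlatFileItemReader<Object> reader = new FlatFileItemReader<>();
    reader.setResource(new FileSystemResource("input.txt")); // assumed file location
    reader.setLineMapper(lineMapper);
    return reader;
}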
Solved by myself, as suggested here: Spring Boot: Cannot access REST Controller on localhost (404)
@SpringBootApplication
@EnableBatchProcessing
@EnableScheduling
@ComponentScan(basePackageClasses = JobStatusApi.class)
public class UpdateInfoBatchApplication {

    public static void main(String[] args) {
        SpringApplication.run(UpdateInfoBatchApplication.class, args);
    }
}

Spring batch - get information about files in a directory

So I'm toying around with Spring Batch for the first time and trying to understand how to do things other than process a CSV file.
Attempting to read every music file in a directory, for example, I have the following code, but I'm not sure how to handle the delegate part.
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Bean
    public MusicItemProcessor processor() {
        return new MusicItemProcessor();
    }

    @Bean
    public Job readFiles() {
        return jobBuilderFactory.get("readFiles").incrementer(new RunIdIncrementer())
                .flow(step1()).end().build();
    }

    @Bean
    public Step step1() {
        return stepBuilderFactory.get("step1").<String, String>chunk(10)
                .reader(reader())
                .processor(processor()).build();
    }

    @Bean
    public ItemReader<String> reader() {
        Resource[] resources = null;
        ResourcePatternResolver patternResolver = new PathMatchingResourcePatternResolver();
        try {
            resources = patternResolver.getResources("file:/music/*.flac");
        } catch (IOException e) {
            e.printStackTrace();
        }
        MultiResourceItemReader<String> reader = new MultiResourceItemReader<>();
        reader.setResources(resources);
        reader.setDelegate(new FlatFileItemReader<>()); // ??
        return reader;
    }
}
At the moment I can see that resources has a list of music files, but looking at the stacktrace I get back, it looks to me like new FlatFileItemReader<>() is trying to read the actual content of the files (I'll want to do that at some point, just not right now).
At the moment I just want the information about the file (absolute path, size, filename etc), not what's inside.
Have I gone completely wrong with this? Or do I just need to configure something a little different?
Any examples of code that do more than process CSV lines would also be awesome.
After scouring the internet I've managed to pull together something that I think works... Some feedback would be welcome.
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Bean
    public VideoItemProcessor processor() {
        return new VideoItemProcessor();
    }

    @Bean
    public Job readFiles() {
        return jobBuilderFactory.get("readFiles")
                .start(step())
                .build();
    }

    @Bean
    public Step step() {
        try {
            return stepBuilderFactory.get("step").<File, Video>chunk(500)
                    .reader(directoryItemReader())
                    .processor(processor())
                    .build();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }

    @Bean
    public DirectoryItemReader directoryItemReader() throws IOException {
        return new DirectoryItemReader("file:/media/media/Music/**/*.flac");
    }
}
The part that had me stuck was creating a custom reader for files. If anyone else comes across this, this is how I've done it. I'm sure there are better ways, but this works for me.
public class DirectoryItemReader implements ItemReader<File>, InitializingBean {

    private final String directoryPath;
    private final List<File> foundFiles = Collections.synchronizedList(new ArrayList<>());

    public DirectoryItemReader(final String directoryPath) {
        this.directoryPath = directoryPath;
    }

    @Override
    public File read() {
        if (!foundFiles.isEmpty()) {
            return foundFiles.remove(0);
        }
        synchronized (foundFiles) {
            final Iterator files = foundFiles.iterator();
            if (files.hasNext()) {
                return foundFiles.remove(0);
            }
        }
        return null;
    }

    @Override
    public void afterPropertiesSet() throws Exception {
        for (final Resource file : getFiles()) {
            this.foundFiles.add(file.getFile());
        }
    }

    private Resource[] getFiles() throws IOException {
        ResourcePatternResolver patternResolver = new PathMatchingResourcePatternResolver();
        return patternResolver.getResources(directoryPath);
    }
}
The only thing you'd need to do is implement your own processor. I've used videos in this example, so I have a video processor:
@Slf4j
public class VideoItemProcessor implements ItemProcessor<File, Video> {

    @Override
    public Video process(final File item) throws Exception {
        Video video = Video.builder()
                .filename(item.getAbsoluteFile().getName())
                .absolutePath(item.getAbsolutePath())
                .fileSize(item.getTotalSpace())
                .build();
        log.info("Created {}", video);
        return video;
    }
}

Spring batch execute dynamically generated steps in a tasklet

I have a Spring Batch job that does the following...
Step 1. Creates a list of objects that need to be processed
Step 2. Creates a list of steps depending on how many items are in the list of objects created in step 1.
Step 3. Tries to execute the steps from the list of steps created in step 2.
The execution of the x steps is done below in executeDynamicStepsTasklet(). While the code runs without any errors, it does not seem to be doing anything. Does what I have in that method look correct?
Thanks
/*
 *
 */
@Configuration
public class ExportMasterListCsvJobConfig {

    public static final String JOB_NAME = "exportMasterListCsv";

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Value("${exportMasterListCsv.generateMasterListRows.chunkSize}")
    public int chunkSize;

    @Value("${exportMasterListCsv.generateMasterListRows.masterListSql}")
    public String masterListSql;

    @Autowired
    public DataSource onlineStagingDb;

    @Value("${out.dir}")
    public String outDir;

    @Value("${exportMasterListCsv.generatePromoStartDateEndDateGroupings.promoStartDateEndDateSql}")
    private String promoStartDateEndDateSql;

    private List<DivisionIdPromoCompStartDtEndDtGrouping> divisionIdPromoCompStartDtEndDtGrouping;

    private List<Step> dynamicSteps = Collections.synchronizedList(new ArrayList<Step>());

    @Bean
    public Job exportMasterListCsvJob(
            @Qualifier("createJobDatesStep") Step createJobDatesStep,
            @Qualifier("createDynamicStepsStep") Step createDynamicStepsStep,
            @Qualifier("executeDynamicStepsStep") Step executeDynamicStepsStep) {
        return jobBuilderFactory.get(JOB_NAME)
                .flow(createJobDatesStep)
                .next(createDynamicStepsStep)
                .next(executeDynamicStepsStep)
                .end().build();
    }

    @Bean
    public Step executeDynamicStepsStep(
            @Qualifier("executeDynamicStepsTasklet") Tasklet executeDynamicStepsTasklet) {
        return stepBuilderFactory
                .get("executeDynamicStepsStep")
                .tasklet(executeDynamicStepsTasklet)
                .build();
    }

    @Bean
    public Tasklet executeDynamicStepsTasklet() {
        return new Tasklet() {
            @Override
            public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
                FlowStep flowStep = new FlowStep(createParallelFlow());
                SimpleJobBuilder jobBuilder = jobBuilderFactory.get("myNewJob").start(flowStep);
                return RepeatStatus.FINISHED;
            }
        };
    }

    public Flow createParallelFlow() {
        SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
        taskExecutor.setConcurrencyLimit(1);
        List<Flow> flows = dynamicSteps.stream()
                .map(step -> new FlowBuilder<Flow>("flow_" + step.getName()).start(step).build())
                .collect(Collectors.toList());
        return new FlowBuilder<SimpleFlow>("parallelStepsFlow")
                .split(taskExecutor)
                .add(flows.toArray(new Flow[flows.size()]))
                .build();
    }

    @Bean
    public Step createDynamicStepsStep(
            @Qualifier("createDynamicStepsTasklet") Tasklet createDynamicStepsTasklet) {
        return stepBuilderFactory
                .get("createDynamicStepsStep")
                .tasklet(createDynamicStepsTasklet)
                .build();
    }

    @Bean
    @JobScope
    public Tasklet createDynamicStepsTasklet() {
        return new Tasklet() {
            @Override
            public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
                for (DivisionIdPromoCompStartDtEndDtGrouping grp : divisionIdPromoCompStartDtEndDtGrouping) {
                    System.err.println("grp: " + grp);
                    String stepName = "stp_" + grp;
                    String fileName = grp + FlatFileConstants.EXTENSION_CSV;
                    Step dynamicStep =
                            stepBuilderFactory.get(stepName)
                                    .<MasterList, MasterList>chunk(10)
                                    .reader(queryStagingDbReader(
                                            grp.getDivisionId(),
                                            grp.getRpmPromoCompDetailStartDate(),
                                            grp.getRpmPromoCompDetailEndDate()))
                                    .writer(masterListFileWriter(fileName))
                                    .build();
                    dynamicSteps.add(dynamicStep);
                }
                System.err.println("createDynamicStepsTasklet dynamicSteps: " + dynamicSteps);
                return RepeatStatus.FINISHED;
            }
        };
    }

    public FlatFileItemWriter<MasterList> masterListFileWriter(String fileName) {
        FlatFileItemWriter<MasterList> writer = new FlatFileItemWriter<>();
        writer.setResource(new FileSystemResource(new File(outDir, fileName)));
        writer.setHeaderCallback(masterListFlatFileHeaderCallback());
        writer.setLineAggregator(masterListFormatterLineAggregator());
        return writer;
    }
So now I have a list of dynamic steps that need to be executed, and I believe that they are in StepScope. Can someone advise me on how to execute them?
This will not work. Your Tasklet just creates a job with a FlowStep as its first step. Using the JobBuilderFactory just creates the job; it does not launch it. The method name "start" may be misleading, since it only defines the first step. It does not launch the job.
You cannot change the structure of a job (its steps and sub-steps) once it is started. Therefore, it is not possible to configure a FlowStep in step 2 based on things that are calculated in step 1. (Of course you could do some hacking deeper inside the Spring Batch structure and directly modify the beans and so on... but you don't want to do that.)
I suggest that you use a kind of "SetupBean" with an appropriate postConstruct method which is injected into the class that configures your job. This "SetupBean" is responsible for calculating the list of objects to be processed.
@Component
public class SetUpBean {

    private List<Object> myObjects;

    @PostConstruct
    public void afterPropertiesSet() {
        myObjects = ...;
    }

    public List<Object> getMyObjects() {
        return myObjects;
    }
}

@Configuration
public class JobConfiguration {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private SetUpBean setup;

    ...
}
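With the list available at configuration time, the job's parallel flow can then be built up front instead of inside a Tasklet. A rough sketch (the job, flow and step names are made up, and the placeholder tasklet stands in for the real chunk-oriented step per grouping), reusing the createParallelFlow() idea from the question:

@Bean
public Job exportJob() {
    // Build one flow per object computed by the SetUpBean, then run them in a split
    List<Flow> flows = setup.getMyObjects().stream()
            .map(obj -> new FlowBuilder<Flow>("flow_" + obj)
                    .start(dynamicStep(obj))
                    .build())
            .collect(Collectors.toList());

    Flow parallelFlow = new FlowBuilder<SimpleFlow>("parallelStepsFlow")
            .split(new SimpleAsyncTaskExecutor())
            .add(flows.toArray(new Flow[0]))
            .build();

    return jobBuilderFactory.get("exportJob")
            .start(parallelFlow)
            .end()
            .build();
}

private Step dynamicStep(Object obj) {
    // Placeholder work; the real job would use the chunk-oriented reader/writer per grouping
    return stepBuilderFactory.get("step_" + obj)
            .tasklet((contribution, chunkContext) -> RepeatStatus.FINISHED)
            .build();
}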
