Spring Batch - @BeforeStep not getting invoked in Partitioner - spring

We are trying to implement a batch job using Spring Batch partitioning. In this job, "step 2" is a partitioned step that needs some data from step 1 for its processing. In step 1, I stored this data in the StepExecutionContext, which is then promoted to the job ExecutionContext.
I tried to use the @BeforeStep annotation in the partitioner class to get hold of the StepExecution, from which I can extract the previously stored data and put it into the ExecutionContexts created by the partitioner. But the method annotated with @BeforeStep is never invoked in the partitioner.
Is there any other way to achieve this?
Partitioner Implementation
public class NtfnPartitioner implements Partitioner {

    private int index = 0;
    String prev_job_time = null;
    String curr_job_time = null;
    private StepExecution stepExecution;
    ExecutionContext executionContext;

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        System.out.println("Entered Partitioner");
        List<Integer> referencIds = new ArrayList<Integer>();
        for (int i = 0; i < gridSize; i++) {
            referencIds.add(index++);
        }
        Map<String, ExecutionContext> results = new LinkedHashMap<String, ExecutionContext>();
        for (int referencId : referencIds) {
            ExecutionContext context = new ExecutionContext();
            context.put("referenceId", referencId);
            context.put(NtfnConstants.PREVIOUS_JOB_TIME, prev_job_time);
            context.put(NtfnConstants.JOB_START_TIME, curr_job_time);
            results.put("partition." + referencId, context);
        }
        return results;
    }

    @BeforeStep
    public void beforeStep(StepExecution stepExecution) {
        System.out.println("Entered beforeStep in partitioner");
        JobExecution jobExecution = stepExecution.getJobExecution();
        ExecutionContext jobContext = jobExecution.getExecutionContext();
        System.out.println("ExecutionContext " + jobContext);
        prev_job_time = (String) jobContext.get(NtfnConstants.PREVIOUS_JOB_TIME);
        curr_job_time = (String) jobContext.get(NtfnConstants.JOB_START_TIME);
    }
}

The bean should be step scoped.
Java, annotate the class:
@StepScope
XML, in the bean definition:
scope="step"
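If you are using Java config, a minimal sketch of a step-scoped bean definition could look like the following (the two-argument constructor and the context key names are illustrative, assuming step 1 promoted those keys to the job execution context); with late binding the partitioner does not even need a @BeforeStep callback:
@Bean
@StepScope
public NtfnPartitioner ntfnPartitioner(
        @Value("#{jobExecutionContext['PREVIOUS_JOB_TIME']}") String prevJobTime,
        @Value("#{jobExecutionContext['JOB_START_TIME']}") String currJobTime) {
    // key names and the two-arg constructor are illustrative; use the same keys
    // that step 1 actually promoted to the job execution context
    return new NtfnPartitioner(prevJobTime, currJobTime);
}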
Also look at this answer regarding proxied bean (not sure if this applies to you since no other code than the partitioner was provided). In this case you can still add your partitioner as a listener explicitly during step building:
@Autowired
private NtfnPartitioner partitioner;
...
final Step masterStep = stepBuilderFactory.get("master")
.listener(partitioner)
.partitioner("slave", partitioner)
.step(slave)
...
Or if your partitioner is not bean (e.g. you are creating it based on something dynamic), you can still add it as a listener:
final NtfnPartitioner partitioner = new NtfnPartitioner();
final Step masterStep = stepBuilderFactory.get("master")
.listener(partitioner)
.partitioner("slave", partitioner)
.step(slave)
...

To get hold of the data in the job execution context, you can implement StepExecutionListener in your NtfnPartitioner class and override its beforeStep and afterStep methods. Please make sure the bean is step scoped.
public class NtfnPartitioner implements Partitioner, StepExecutionListener {

    String prev_job_time = null;
    String curr_job_time = null;
    ....

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        ....
        /* Use prev_job_time and curr_job_time here; they were
           fetched from the job execution context in beforeStep */
        ....
    }

    @Override
    public void beforeStep(StepExecution stepExecution) {
        System.out.println("Entered beforeStep in partitioner");
        ExecutionContext jobContext = stepExecution.getJobExecution().getExecutionContext();
        System.out.println("ExecutionContext " + jobContext);
        prev_job_time = (String) jobContext.get(NtfnConstants.PREVIOUS_JOB_TIME);
        curr_job_time = (String) jobContext.get(NtfnConstants.JOB_START_TIME);
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        if (stepExecution.getStatus() == BatchStatus.COMPLETED) {
            return ExitStatus.COMPLETED;
        }
        return ExitStatus.FAILED;
    }
}
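Also note that beforeStep (or the late-bound expressions above) will only find the two values in the job execution context if step 1 actually promoted them there. A minimal sketch of that promotion, assuming step 1 is a tasklet step and step1Tasklet() is a placeholder for whatever writes the two keys into the step execution context:
@Bean
public ExecutionContextPromotionListener promotionListener() {
    ExecutionContextPromotionListener listener = new ExecutionContextPromotionListener();
    // promote exactly the keys that step 1 wrote to its step execution context
    listener.setKeys(new String[] {NtfnConstants.PREVIOUS_JOB_TIME, NtfnConstants.JOB_START_TIME});
    return listener;
}

@Bean
public Step step1() {
    return stepBuilderFactory.get("step1")
            .tasklet(step1Tasklet()) // hypothetical tasklet that stores the two timestamps
            .listener(promotionListener())
            .build();
}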

Related

Spring Batch exception handling sent as ResponseEntity

I'm new to Spring Boot and I'm training on a small project with Spring Batch to get experience. Here is my context: I have 2 CSV files, one holding employees, the other containing all managers of the company. I have to read the files, then add each record to a database. To keep it simple, I just need to call an endpoint from my controller, upload my CSV file (multipart file), and then the job starts. I was actually able to do that; my problem is the following.
I have to manage multiple kinds of validation (I'm using JSR 380 validation for my entities and I also have to check business rules). One such business rule is the following: an employee is supervised by a manager of his department (an employee can't be supervised by a manager who is not in the same department, otherwise an exception should be thrown). So for mistaken records with some invalid or "illogical" input, I have to skip them (not save them to the database) but store them in a Map or List that should be sent back as a ResponseEntity to the client. That way the client knows which rows need to be fixed. I suppose I have to take a look at Listeners, but I really can't figure out how to store the exceptions in a map or list and then send it as a ResponseEntity. Below is an example of what I want to achieve.
My csv files screenshots
EmployeeBatchConfig.java
@Configuration
@EnableBatchProcessing
@AllArgsConstructor
public class EmployeeBatchConfig {

    private JobBuilderFactory jobBuilderFactory;
    private StepBuilderFactory stepBuilderFactory;
    private EmployeeRepository employeeRepository;
    private EmployeeItemWriter employeeItemWriter;

    @Bean
    @StepScope
    public FlatFileItemReader<EmployeeDto> itemReader(
            @Value("#{jobParameters[fullPathFileName]}") final String pathFile) {
        FlatFileItemReader<EmployeeDto> flatFileItemReader = new FlatFileItemReader<>();
        flatFileItemReader.setResource(new FileSystemResource(new File(pathFile)));
        flatFileItemReader.setName("CSV-Reader");
        flatFileItemReader.setLinesToSkip(1);
        flatFileItemReader.setLineMapper(lineMapper());
        return flatFileItemReader;
    }

    private LineMapper<EmployeeDto> lineMapper() {
        DefaultLineMapper<EmployeeDto> lineMapper = new DefaultLineMapper<>();
        DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer();
        lineTokenizer.setDelimiter(",");
        lineTokenizer.setStrict(false);
        lineTokenizer.setNames("Username", "lastName", "firstName", "departement", "supervisor");
        BeanWrapperFieldSetMapper<EmployeeDto> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
        fieldSetMapper.setTargetType(EmployeeDto.class);
        lineMapper.setLineTokenizer(lineTokenizer);
        lineMapper.setFieldSetMapper(fieldSetMapper);
        return lineMapper;
    }

    @Bean
    public EmployeeProcessor processor() {
        return new EmployeeProcessor(); /* processor bean used to skip invalid rows */
    }

    @Bean
    public RepositoryItemWriter<Employee> writer() {
        RepositoryItemWriter<Employee> writer = new RepositoryItemWriter<>();
        writer.setRepository(employeeRepository);
        writer.setMethodName("save");
        return writer;
    }

    @Bean
    public Step step1(FlatFileItemReader<EmployeeDto> itemReader) {
        return stepBuilderFactory.get("slaveStep").<EmployeeDto, Employee>chunk(5)
                .reader(itemReader)
                .processor(processor())
                .writer(employeeItemWriter)
                .faultTolerant()
                .listener(skipListener())
                .skip(SkipException.class)
                .skipLimit(10)
                .skipPolicy(skipPolicy())
                .build();
    }

    @Bean
    @Qualifier("executeJobEmployee")
    public Job runJob(FlatFileItemReader<EmployeeDto> itemReader) {
        return jobBuilderFactory
                .get("importEmployee")
                .flow(step1(itemReader))
                .end()
                .build();
    }

    @Bean
    public SkipPolicy skipPolicy() {
        return new ExceptionSkipPolicy();
    }

    @Bean
    public SkipListener<EmployeeDto, Employee> skipListener() {
        return new StepSkipListener();
    }

    /*@Bean
    public ExecutionContext executionContext() {
        return new ExecutionContext();
    }*/
}
EmployeeProcessor.java
public class EmployeeProcessor implements ItemProcessor<EmployeeDto, Employee> {

    @Autowired
    private SupervisorService managerService;

    @Override
    public Employee process(@Valid EmployeeDto item) throws Exception {
        // retrieve the manager of the employee and compare departments
        ManagerDto manager = managerService.findSupervisorById(item.getSupervisor());
        if (!(manager.getDepartement().equals(item.getDepartement()))) {
            throw new SkipException("Manager Invalid", item);
        }
        return ObjectMapperUtils.map(item, Employee.class);
    }
}
MySkipPolicy.java
public class MySkipPolicy implements SkipPolicy {

    @Override
    public boolean shouldSkip(Throwable throwable, int skipCount) throws SkipLimitExceededException {
        return true;
    }
}
StepSkipListener.java
public class StepSkipListener implements SkipListener<EmployeeDto, Employee> {

    @Override // item reader
    public void onSkipInRead(Throwable throwable) {
        System.out.println("In OnSkipReader");
    }

    @Override // item writer
    public void onSkipInWrite(Employee item, Throwable throwable) {
        System.out.println("Nooooooooo ");
    }

    //@SneakyThrows
    @Override // item processor
    public void onSkipInProcess(@Valid EmployeeDto employee, Throwable throwable) {
        System.out.println("Process... ");
        /* I guess this is where I should work, but how do I deal with the
           exception that occurs? How do I know which exception I would get? */
    }
}
SkipException.java
public class SkipException extends Exception {
private Map<String, EmployeeDto> errors = new HashMap<>();
public SkipException(String errorMessage, EmployeeDto employee) {
super();
this.errors.put(errorMessage, employee);
}
public Map<String, EmployeeDto> getErrors() {
return this.errors;
}
}
JobController.java
@RestController
@RequestMapping("/upload")
public class JobController {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    @Qualifier("executeJobEmployee")
    private Job job;

    private final String EMPLOYEE_FOLDER = "C:/Users/Project/Employee/";

    @PostMapping("/employee")
    public ResponseEntity<Object> importEmployee(@RequestParam("file") MultipartFile multipartFile)
            throws JobInterruptedException, SkipException, IllegalStateException, IOException, FlatFileParseException {
        try {
            String fileName = multipartFile.getOriginalFilename();
            File fileToImport = new File(EMPLOYEE_FOLDER + fileName);
            multipartFile.transferTo(fileToImport);
            JobParameters jobParameters = new JobParametersBuilder()
                    .addString("fullPathFileName", EMPLOYEE_FOLDER + fileName)
                    .addLong("startAt", System.currentTimeMillis())
                    .toJobParameters();
            JobExecution jobExecution = this.jobLauncher.run(job, jobParameters);
            ExecutionContext executionContext = jobExecution.getExecutionContext();
            System.out.println("My skipped items : " + executionContext.toString());
        } catch (ConstraintViolationException | FlatFileParseException | JobRestartException
                | JobInstanceAlreadyCompleteException | JobParametersInvalidException
                | JobExecutionAlreadyRunningException e) {
            e.printStackTrace();
            return new ResponseEntity<>(e.getMessage(), HttpStatus.BAD_REQUEST);
        }
        return new ResponseEntity<>("Employee inserted successfully", HttpStatus.OK);
    }
}
That requirement forces your implementation to wait for the job to finish before returning the web response, which is not the typical way of launching batch jobs from web requests. Typically, since batch jobs can run for several minutes/hours, they are launched in the background and a job ID is returned back to the client for later status check.
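Not what you want here, but for reference, that background launching is usually done by giving the JobLauncher an asynchronous TaskExecutor, roughly like this (a sketch only; with it, jobLauncher.run(...) returns immediately and you would hand jobExecution.getId() back to the client for later status checks):
@Bean
public JobLauncher asyncJobLauncher(JobRepository jobRepository) throws Exception {
    SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
    jobLauncher.setJobRepository(jobRepository);
    // launches jobs on a separate thread instead of blocking the web request
    jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
    jobLauncher.afterPropertiesSet();
    return jobLauncher;
}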
In Spring Batch, the SkipListener is the extension point that allows you to add custom code when a skippable exception happens when reading, processing or writing an item. I would add the business validation in an item processor and throw an exception with the skipped item and the reason for that skip (both encapsulated in the exception class that should be declared as skippable).
Skipped items are usually stored somewhere for later analysis (like a table or a file or the job execution context). In your case, you need to send them back in the web response, so you can read them from the store of your choice before returning them attached in the web response. In pseudo code in your controller, this should be something like the following:
- run the job and wait for its termination (the skip listener would write skipped items in the storage of your choice)
- get skipped items from storage
- return web response
For example, if you choose to store skipped items in the job execution context, you can do something like this in your controller:
JobExecution jobExecution = jobLauncher.run(job, jobParameters);
ExecutionContext executionContext = jobExecution.getExecutionContext();
// get skipped items from the execution context
// return the web response
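As an illustration, here is one possible way to collect the skipped items from the skip listener, writing them directly into the job execution context that your controller already reads (a sketch only: the listener is assumed to be step scoped so the current StepExecution can be injected, and getUsername() stands in for whatever identifies a row in your EmployeeDto):
@Component
@StepScope
public class CollectingSkipListener implements SkipListener<EmployeeDto, Employee> {

    private final StepExecution stepExecution;

    public CollectingSkipListener(@Value("#{stepExecution}") StepExecution stepExecution) {
        this.stepExecution = stepExecution;
    }

    @Override
    public void onSkipInRead(Throwable t) {
        // malformed lines: record t.getMessage() here as well if you want them reported
    }

    @Override
    public void onSkipInProcess(EmployeeDto item, Throwable t) {
        ExecutionContext jobContext = stepExecution.getJobExecution().getExecutionContext();
        @SuppressWarnings("unchecked")
        List<String> skipped = (List<String>) jobContext.get("skippedItems");
        if (skipped == null) {
            skipped = new ArrayList<>();
            jobContext.put("skippedItems", skipped);
        }
        // keep the reason together with something identifying the rejected row
        skipped.add(item.getUsername() + " : " + t.getMessage());
    }

    @Override
    public void onSkipInWrite(Employee item, Throwable t) {
        // write failures, if any, can be collected the same way
    }
}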

Saving file information in Spring batch MultiResourceItemReader

I have a directory containing text files. I want to process the files and write the data into a DB. I did that by using MultiResourceItemReader.
I have a scenario where, whenever a file comes in, the first step is to save the file info, like the filename and the record count, in a log table (a custom table).
Since I used MultiResourceItemReader, it loads all files at once and the code I wrote executes only once at server startup. I tried the getCurrentResource() method but it returns null.
Please refer to the code below.
NetFileProcessController.java
@Slf4j
@RestController
@RequestMapping("/netProcess")
public class NetFileProcessController {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    @Qualifier("netFileParseJob")
    private Job job;

    @GetMapping(path = "/process")
    public @ResponseBody StatusResponse process() throws ServiceException {
        try {
            Map<String, JobParameter> parameters = new HashMap<>();
            parameters.put("date", new JobParameter(new Date()));
            jobLauncher.run(job, new JobParameters(parameters));
            return new StatusResponse(true);
        } catch (Exception e) {
            log.error("Exception", e);
            Throwable rootException = ExceptionUtils.getRootCause(e);
            String errMessage = rootException.getMessage();
            log.info("Root cause is instance of JobInstanceAlreadyCompleteException --> " + (rootException instanceof JobInstanceAlreadyCompleteException));
            if (rootException instanceof JobInstanceAlreadyCompleteException) {
                log.info(errMessage);
                return new StatusResponse(false, "This job has been completed already!");
            } else {
                throw new ServiceException(errMessage);
            }
        }
    }
}
BatchConfig.java
@Configuration
@EnableBatchProcessing
public class BatchConfig {

    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    public void setJobBuilderFactory(JobBuilderFactory jobBuilderFactory) {
        this.jobBuilderFactory = jobBuilderFactory;
    }

    @Autowired
    StepBuilderFactory stepBuilderFactory;

    @Value("file:${input.files.location}${input.file.pattern}")
    private Resource[] netFileInputs;

    @Value("${net.file.column.names}")
    private String netFilecolumnNames;

    @Value("${net.file.column.lengths}")
    private String netFileColumnLengths;

    @Autowired
    NetFileInfoTasklet netFileInfoTasklet;

    @Autowired
    NetFlatFileProcessor netFlatFileProcessor;

    @Autowired
    NetFlatFileWriter netFlatFileWriter;

    @Bean
    public Job netFileParseJob() {
        return jobBuilderFactory.get("netFileParseJob")
                .incrementer(new RunIdIncrementer())
                .start(netFileStep())
                .build();
    }

    public Step netFileStep() {
        return stepBuilderFactory.get("netFileStep")
                .<NetDetailsDTO, NetDetailsDTO>chunk(1)
                .reader(new NetFlatFileReader(netFileInputs, netFilecolumnNames, netFileColumnLengths))
                .processor(netFlatFileProcessor)
                .writer(netFlatFileWriter)
                .build();
    }
}
NetFlatFileReader.java
@Slf4j
public class NetFlatFileReader extends MultiResourceItemReader<NetDetailsDTO> {

    public NetFlatFileReader(Resource[] netFileInputs, String netFilecolumnNames, String netFileColumnLengths) {
        setResources(netFileInputs);
        setDelegate(reader(netFilecolumnNames, netFileColumnLengths));
    }

    private FlatFileItemReader<NetDetailsDTO> reader(String netFilecolumnNames, String netFileColumnLengths) {
        FlatFileItemReader<NetDetailsDTO> flatFileItemReader = new FlatFileItemReader<>();
        FixedLengthTokenizer tokenizer = CommonUtil.fixedLengthTokenizer(netFilecolumnNames, netFileColumnLengths);
        FieldSetMapper<NetDetailsDTO> mapper = createMapper();
        DefaultLineMapper<NetDetailsDTO> lineMapper = new DefaultLineMapper<>();
        lineMapper.setLineTokenizer(tokenizer);
        lineMapper.setFieldSetMapper(mapper);
        flatFileItemReader.setLineMapper(lineMapper);
        return flatFileItemReader;
    }

    /*
     * Mapping column data to DTO
     */
    private FieldSetMapper<NetDetailsDTO> createMapper() {
        BeanWrapperFieldSetMapper<NetDetailsDTO> mapper = new BeanWrapperFieldSetMapper<>();
        try {
            mapper.setTargetType(NetDetailsDTO.class);
        } catch (Exception e) {
            log.error("Exception in mapping column data to dto ", e);
        }
        return mapper;
    }
}
I am stuck on this scenario; any help is appreciated.
I don't think MultiResourceItemReader is appropriate in your case. I would run a job per file for all the reasons of making one thing do one thing and do it well:
Your preparatory step will work by design
It would be easier to run multiple jobs in parallel and improve your file ingestion throughput
In case of failure, you would only restart the job for the failed file
EDIT: add an example
Resource[] netFileInputs = ... // same code that looks for file as currently in your reader
for (Resource netFileInput : netFileInputs) {
Map<String, JobParameter> parameters = new HashMap<>();
parameters.put("netFileInput", new JobParameter(netFileInput.getFilename()));
jobLauncher.run(job, new JobParameters(parameters));
}
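The reader for that per-file job would then be step scoped and bound to the file name passed as a job parameter, along the lines of the sketch below (it reuses the input.files.location property from your configuration, and the tokenizer/line-mapper setup from your current NetFlatFileReader would still need to be applied):
@Bean
@StepScope
public FlatFileItemReader<NetDetailsDTO> netFileReader(
        @Value("#{jobParameters['netFileInput']}") String netFileInput,
        @Value("${input.files.location}") String inputFilesLocation) {
    FlatFileItemReader<NetDetailsDTO> reader = new FlatFileItemReader<>();
    // one concrete file per job execution, so the "save file info" step runs once per file
    reader.setResource(new FileSystemResource(inputFilesLocation + netFileInput));
    // configure the same FixedLengthTokenizer / DefaultLineMapper as in NetFlatFileReader here
    return reader;
}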

Spring Batch - RepositoryItemReader read param from ExecutionContext

I have a Spring Batch job where I am passing some values between two steps. I set the value in the job context in Step1 and now I'm trying to read it from a RepositoryItemReader in Step2. There is a @BeforeStep method where I am able to read the value set in the context. But I am setting up my repository, along with the method name and args, in a @PostConstruct annotated method, which is executed before the @BeforeStep annotated method.
What is the best way to read the param in the RepositoryItemReader from the JobExecution context?
@Component
@JobScope
public class MyItemReader extends RepositoryItemReader<Scan> {

    @Autowired
    private MyRepository repository;

    private Integer lastIdPulled = null;

    public MyItemReader() {
        super();
    }

    @BeforeStep
    public void initializeValues(StepExecution stepExecution) {
        Integer value = stepExecution.getJobExecution().getExecutionContext().getInt("lastIdPulled");
        System.out.println(">>>>>>>> last_pulled_id = " + value);
    }

    @PostConstruct
    protected void init() {
        final Map<String, Sort.Direction> sorts = new HashMap<>();
        sorts.put("id", Direction.ASC);
        this.setRepository(this.repository);
        this.setSort(sorts);
        // specify the repository method Spring Batch should call to fetch data;
        // its arguments are supplied via setArguments below
        this.setMethodName("findByGreaterThanId");
        List<Object> methodArgs = new ArrayList<Object>();
        if (lastIdPulled == null || lastIdPulled <= 0) {
            lastIdPulled = 0;
        }
        methodArgs.add(lastIdPulled);
        this.setArguments(methodArgs);
    }
}
Your reader needs to be @StepScope instead of @JobScope. Even though you're accessing the job context, the value is not available in the context until the previous step finishes. If you make your reader @StepScope, it won't be initialized until the step it is part of starts up, by which time the value is available.
Another option is to construct the reader as a @Bean definition in a @Configuration class, but the idea is the same. This uses SpEL for late binding.
@Configuration
public class JobConfig {

    // Your item reader will get autowired into this method
    // so you don't have to call it yourself
    @Bean
    public Step myStep(MyItemReader myItemReader) {
        // build your step
    }

    @Bean
    @StepScope
    public MyItemReader myItemReader(@Value("#{jobExecutionContext['lastIdPulled']}") Integer lastIdPulled) {
        MyItemReader reader = new MyItemReader();
        // Perform @PostConstruct tasks
        return reader;
    }
}
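Either way, the expression (or the @BeforeStep lookup) only resolves a value if an earlier step actually put lastIdPulled into the job execution context, for example from a listener or writer in that step (lastId here is just a placeholder for whatever id was computed):
// somewhere in the previous step, with access to its StepExecution:
stepExecution.getJobExecution().getExecutionContext().putInt("lastIdPulled", lastId);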
I was able to figure out how to solve your problem, and mine, without having to create a @Bean definition in a @Configuration class and without using @PostConstruct.
Instead of using @PostConstruct, just set things up in the constructor of the class, and set the arguments in your @BeforeStep as shown below:
@Component
@StepScope
public class MyItemReader extends RepositoryItemReader<Scan> {

    @Autowired
    public MyItemReader(MyRepository myRepository) {
        super();
        this.setRepository(myRepository);
        this.setMethodName("findByGreaterThanId");
        final Map<String, Sort.Direction> sorts = new HashMap<>();
        sorts.put("id", Direction.ASC);
        this.setSort(sorts);
    }

    @BeforeStep
    public void initializeValues(StepExecution stepExecution) {
        Integer lastIdPulled = stepExecution.getJobExecution().getExecutionContext().getInt("lastIdPulled");
        System.out.println(">>>>>>>> last_pulled_id = " + lastIdPulled);
        List<Object> methodArgs = new ArrayList<Object>();
        if (lastIdPulled == null || lastIdPulled <= 0) {
            lastIdPulled = 0;
        }
        methodArgs.add(lastIdPulled);
        this.setArguments(methodArgs);
    }
}

Spring batch accessing job parameter returning null value even though stepscope is used

I have read this forum, which has lots of examples on how to use job parameters, but I'm still stuck because I get a null value when retrieving the job parameter.
Below is my code so far.
Controller :
@RestController
@CrossOrigin
@RequestMapping("/bulkimport/api/")
public class BatchController {

    @Autowired
    public JobLauncher jobLauncher;

    @Autowired
    public Job bulkImportJob;

    @RequestMapping(value = "/bulkProcessjob", method = RequestMethod.POST)
    public void batchImportProcess(@RequestHeader(value = "X-Auth-Token") String jwtToken,
            @RequestParam("excelfile") final MultipartFile excelfile) throws IOException {
        String path = new ClassPathResource("importFileFolder/").getURL().getPath();
        File fileToImport = new File(path + excelfile.getOriginalFilename());
        try {
            Map<String, JobParameter> jobParametersMap = new HashMap<String, JobParameter>();
            jobParametersMap.put("processName", new JobParameter("processName"));
            jobParametersMap.put("operation", new JobParameter("operation"));
            jobParametersMap.put("filePath", new JobParameter(fileToImport.getAbsolutePath()));
            System.out.println(fileToImport.getAbsolutePath() + " filePath:::::::::::");
            jobLauncher.run(bulkImportJob, new JobParameters(jobParametersMap));
        } catch (Exception e) {
        }
    }
}
Config class:
public String filePath;

@Bean(name = "bulkImportJob")
public Job bulkImportJob() {
    return jobBuilderFactory.get("bulkImportJob").incrementer(new RunIdIncrementer()).flow(bulkImportProcessStep()).end().build();
}

@Bean
public Step bulkImportProcessStep() {
    return stepBuilderFactory.get("bulkImportProcessStep").<String, String>chunk(1).reader(validateFileHeader(filePath)).writer(new TempWriter()).build();
}

@Bean
@StepScope
BulkUploadExcelHeader validateFileHeader(@Value("#{jobParameters[filePath]}") String filePath) {
    System.out.println("filePath value::::::::::::::::::::::::::::::::" + filePath);
    return new BulkUploadExcelHeader(filePath);
}
In the print statement above, the filePath value is printed as null.
Can anyone please suggest what is wrong in the above code?
Most probably you need to enclose filePath in single quotes:
validateFileHeader(@Value("#{jobParameters['filePath']}") String filePath)
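With that change the bean definition would look like the snippet below. Note that the null filePath field passed in from bulkImportProcessStep() is irrelevant: because the reader is step scoped, the real value is late-bound from the job parameters when the step actually runs.
@Bean
@StepScope
BulkUploadExcelHeader validateFileHeader(@Value("#{jobParameters['filePath']}") String filePath) {
    System.out.println("filePath value: " + filePath);
    return new BulkUploadExcelHeader(filePath);
}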

Why is the itemReader always sending the exact same value to CustomItemProcessor

Why does the itemReader method always send the exact same file name to be processed by CustomItemProcessor?
As far as I understand, since I set up the reader as @Scope and set more than 1 in the chunk, I was expecting "return s" to move forward to the next value of the String array.
Let me clarify my question with a debug example of the reader method:
1 - the variable stringArray is filled with 3 file names (f1.txt, f2.txt and f3.txt)
2 - "return s" is evoked with s = f1.txt
3 - "return s" is evoked again before the customItemProcessor method is evoked (perfect until here, since chunk = 2)
4 - looking at s, it contains f1.txt again (different from what I expected; I expected f2.txt)
5 and 6 - the processor runs with the same name f1.txt (it would work correctly if the second turn of "return s" contained f2.txt)
7 - the writer method works as expected (processedFiles contains twice the two names processed in customItemProcessor: f1.txt and f1.txt again, since the same name was processed twice)
CustomItemReader
public class CustomItemReader implements ItemReader<String> {

    @Override
    public String read() throws Exception, UnexpectedInputException,
            ParseException, NonTransientResourceException {
        String[] stringArray;
        try (Stream<Path> stream = Files.list(Paths.get(env.getProperty("my.path")))) {
            stringArray = stream.map(String::valueOf)
                    .filter(path -> path.endsWith("out"))
                    .toArray(size -> new String[size]);
        }
        //*** the problem is here
        //every turn the s variable receives the first file name from stringArray
        if (stringArray.length > 0) {
            for (String s : stringArray) {
                return s;
            }
        } else {
            log.info("read method - no file found");
            return null;
        }
        return null;
    }
}
CustomItemProcessor
public class CustomItemProcessor implements ItemProcessor<String, String> {

    @Override
    public String process(String singleFileToProcess) throws Exception {
        log.info("process method: " + singleFileToProcess);
        return singleFileToProcess;
    }
}
CustomItemWriter
public class CustomItemWriter implements ItemWriter<String> {

    private static final Logger log = LoggerFactory.getLogger(CustomItemWriter.class);

    @Override
    public void write(List<? extends String> processedFiles) throws Exception {
        processedFiles.stream().forEach(
                processedFile -> log.info("**** write method " + processedFile.toString()));
        FileSystem fs = FileSystems.getDefault();
        for (String s : processedFiles) {
            Files.deleteIfExists(fs.getPath(s));
        }
    }
}
Configuration
@Configuration
@ComponentScan(...
@EnableBatchProcessing
@EnableScheduling
@PropertySource(...
public class BatchConfig {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private JobRepository jobRepository;

    @Bean
    public TaskExecutor getTaskExecutor() {
        return new TaskExecutor() {
            @Override
            public void execute(Runnable task) {
            }
        };
    }

    // the number in chunk reflects how many times customReader is triggered before customProcessor is triggered
    @Bean
    public Step step1(ItemReader<String> reader,
            ItemProcessor<String, String> processor, ItemWriter<String> writer) {
        return stepBuilderFactory.get("step1").<String, String>chunk(2)
                .reader(reader).processor(processor).writer(writer)
                .allowStartIfComplete(true).build();
    }

    @Bean
    @Scope
    public ItemReader<String> reader() {
        return new CustomItemReader();
    }

    @Bean
    public ItemProcessor<String, String> processor() {
        return new CustomItemProcessor();
    }

    @Bean
    public ItemWriter<String> writer() {
        return new CustomItemWriter();
    }

    @Bean
    public Job job(Step step1) throws Exception {
        return jobBuilderFactory.get("job1").incrementer(new RunIdIncrementer()).start(step1).build();
    }
}
Scheduler
@Component
public class QueueScheduler {

    private static final Logger log = LoggerFactory.getLogger(QueueScheduler.class);

    private Job job;
    private JobLauncher jobLauncher;

    @Autowired
    public QueueScheduler(JobLauncher jobLauncher, @Qualifier("job") Job job) {
        this.job = job;
        this.jobLauncher = jobLauncher;
    }

    @Scheduled(fixedRate = 60000)
    public void runJob() {
        try {
            jobLauncher.run(job, new JobParameters());
        } catch (Exception ex) {
            log.info(ex.getMessage());
        }
    }
}
Your issue is that you are relying on an internal loop to iterate over the items instead of letting Spring Batch do it for you by calling ItemReader#read multiple times.
What I'd recommend is changing your reader to something like the following:
public class JimsItemReader implements ItemStreamReader<String> {

    private String[] items;
    private int curIndex = -1;

    @Override
    public void open(ExecutionContext ec) {
        curIndex = ec.getInt("curIndex", -1);
        // env is assumed to be injected, as in the original reader
        try (Stream<Path> stream = Files.list(Paths.get(env.getProperty("my.path")))) {
            items = stream.map(String::valueOf)
                    .filter(path -> path.endsWith("out"))
                    .toArray(size -> new String[size]);
        } catch (IOException e) {
            throw new ItemStreamException("Unable to list input files", e);
        }
    }

    @Override
    public void update(ExecutionContext ec) {
        ec.putInt("curIndex", curIndex);
    }

    @Override
    public void close() {
    }

    @Override
    public String read() {
        if (curIndex + 1 < items.length) {
            curIndex++;
            return items[curIndex];
        } else {
            return null;
        }
    }
}
The above example should loop through the items of your array as they are read. It also should be restartable in that we're storing the index in the ExecutionContext so if the job is restarted after a failure, you'll restart where you left off.
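One thing to keep in mind: open() and update() are only invoked if the reader is registered as an ItemStream. When the reader is set directly on the step, as in your existing step1 bean, Spring Batch should register it automatically, e.g.:
@Bean
public ItemStreamReader<String> reader() {
    // set as the step's reader via .reader(reader); the step registers it as an
    // ItemStream, so open()/update() run and curIndex is checkpointed each chunk
    return new JimsItemReader();
}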
