I have a directory having text files. I want to process files and write data into db. I did that by using MultiResourceItemReader.
I have a scenario like whenever file is coming, the first step is to save file info, like filename, record count in file in a log table(custom table).
Since i used MultiResourceItemReader, It's loading all files once and the code which i wrote is executing once in server startup. I tried with getCurrentResource() method but its returning null.
Please refer below code.
NetFileProcessController.java
#Slf4j
#RestController
#RequestMapping("/netProcess")
public class NetFileProcessController {
#Autowired
private JobLauncher jobLauncher;
#Autowired
#Qualifier("netFileParseJob")
private Job job;
#GetMapping(path = "/process")
public #ResponseBody StatusResponse process() throws ServiceException {
try {
Map<String, JobParameter> parameters = new HashMap<>();
parameters.put("date", new JobParameter(new Date()));
jobLauncher.run(job, new JobParameters(parameters));
return new StatusResponse(true);
} catch (Exception e) {
log.error("Exception", e);
Throwable rootException = ExceptionUtils.getRootCause(e);
String errMessage = rootException.getMessage();
log.info("Root cause is instance of JobInstanceAlreadyCompleteException --> "+(rootException instanceof JobInstanceAlreadyCompleteException));
if(rootException instanceof JobInstanceAlreadyCompleteException){
log.info(errMessage);
return new StatusResponse(false, "This job has been completed already!");
} else{
throw new ServiceException(errMessage);
}
}
}
}
BatchConfig.java
#Configuration
#EnableBatchProcessing
public class BatchConfig {
private JobBuilderFactory jobBuilderFactory;
#Autowired
public void setJobBuilderFactory(JobBuilderFactory jobBuilderFactory) {
this.jobBuilderFactory = jobBuilderFactory;
}
#Autowired
StepBuilderFactory stepBuilderFactory;
#Value("file:${input.files.location}${input.file.pattern}")
private Resource[] netFileInputs;
#Value("${net.file.column.names}")
private String netFilecolumnNames;
#Value("${net.file.column.lengths}")
private String netFileColumnLengths;
#Autowired
NetFileInfoTasklet netFileInfoTasklet;
#Autowired
NetFlatFileProcessor netFlatFileProcessor;
#Autowired
NetFlatFileWriter netFlatFileWriter;
#Bean
public Job netFileParseJob() {
return jobBuilderFactory.get("netFileParseJob")
.incrementer(new RunIdIncrementer())
.start(netFileStep())
.build();
}
public Step netFileStep() {
return stepBuilderFactory.get("netFileStep")
.<NetDetailsDTO, NetDetailsDTO>chunk(1)
.reader(new NetFlatFileReader(netFileInputs, netFilecolumnNames, netFileColumnLengths))
.processor(netFlatFileProcessor)
.writer(netFlatFileWriter)
.build();
}
}
NetFlatFileReader.java
#Slf4j
public class NetFlatFileReader extends MultiResourceItemReader<NetDetailsDTO> {
public netFlatFileReader(Resource[] netFileInputs, String netFilecolumnNames, String netFileColumnLengths) {
setResources(netFileInputs);
setDelegate(reader(netFilecolumnNames, netFileColumnLengths));
}
private FlatFileItemReader<NetDetailsDTO> reader(String netFilecolumnNames, String netFileColumnLengths) {
FlatFileItemReader<NetDetailsDTO> flatFileItemReader = new FlatFileItemReader<>();
FixedLengthTokenizer tokenizer = CommonUtil.fixedLengthTokenizer(netFilecolumnNames, netFileColumnLengths);
FieldSetMapper<NetDetailsDTO> mapper = createMapper();
DefaultLineMapper<NetDetailsDTO> lineMapper = new DefaultLineMapper<>();
lineMapper.setLineTokenizer(tokenizer);
lineMapper.setFieldSetMapper(mapper);
flatFileItemReader.setLineMapper(lineMapper);
return flatFileItemReader;
}
/*
* Mapping column data to DTO
*/
private FieldSetMapper<NetDetailsDTO> createMapper() {
BeanWrapperFieldSetMapper<NetDetailsDTO> mapper = new BeanWrapperFieldSetMapper<>();
try {
mapper.setTargetType(NetDetailsDTO.class);
} catch(Exception e) {
log.error("Exception in mapping column data to dto ", e);
}
return mapper;
}
}
I am stuck on this scenario, Any help appreciated
I don't think MultiResourceItemReader is appropriate in your case. I would run a job per file for all the reasons of making one thing do one thing and do it well:
Your preparatory step will work by design
It would be easier to run multiple jobs in parallel and improve your file ingestion throughput
In case of failure, you would only restart the job for the failed file
EDIT: add an example
Resource[] netFileInputs = ... // same code that looks for file as currently in your reader
for (Resource netFileInput : netFileInputs) {
Map<String, JobParameter> parameters = new HashMap<>();
parameters.put("netFileInput", new JobParameter(netFileInput.getFilename()));
jobLauncher.run(job, new JobParameters(parameters));
}
Related
I have a spring batch job using partions and reader is JdbcCursorItemReader, so in this reader I need an authorisation to read correctely crypted data, so when I declare my reader a call the method just bellow .
the problem is that somme partions read null value for the field which need to be decrypted , the only reason is that the authorisation is not set ( I check in database and data are not null), so why it's work for some partions and not for all?
private void authorize() {
//Authorize
JdbcTemplate jdbcTemplate = new JdbcTemplate(dataSource);
jdbcTemplate.update(setClientInfo, authorization);
}
and this is how i declare my reader
#Bean
#StepScope
public JdbcCursorItemReader<MyEntity> reader(#Value("#{stepExecutionContext['modulo']}") Integer modulo)
throws IOException {
ClassPathResource resource = new ClassPathResource(SQL_FILE);
BufferedReader reader = new BufferedReader(new InputStreamReader(resource.getInputStream()));
String query = FileCopyUtils.copyToString(reader);
query = query.replace(MODULO_LABEL, String.valueOf(modulo));
query = query.replace(GRID_SIZE_LABEL, String.valueOf(gridSize));
authorize();
JdbcCursorItemReader<MyEntity> cursorItemReader = new JdbcCursorItemReader<>();
cursorItemReader.setSql(query);
final int partitionSize = maxNumberCards / gridSize;
cursorItemReader.setMaxItemCount(partitionSize);
cursorItemReader.setDataSource(dataSource);
cursorItemReader.setRowMapper(myRowMapper);
return cursorItemReader;
}
and my job configuration
#Configuration
#EnableBatchProcessing
#RefreshScope
public class MyFunctionJobConfiguration {
#Autowired
public JobBuilderFactory jobBuilderFactory;
#Autowired
public StepBuilderFactory stepBuilderFactory;
#Autowired
JdbcCursorItemReader<MyEntity> reader;
#Value("${max-number-card-to-process}")
private Integer MAX_NUMBER_CARD;
#Value("${chunck-size:10}")
private int chunckSize;
#Value("${grid-size:1}")
private int gridSize;
private final static String JOB_DISABLED = "job is disabled, check the configuration file !";
#Value("${job.enabled}")
private boolean batchIsEnabled;
private static final Logger LOGGER = LoggerFactory.getLogger("FUNCTIONAL_LOGGER");
#Bean
#StepScope
#RefreshScope
public MyEntityWriter writer() {
return new MyEntityWriter();
}
#Bean
#StepScope
#RefreshScope
public MyFunctionProcessor processor() throws IOException {
return new MyFunctionProcessor();
}
#Bean
public MyPrationner partitioner() {
return new MyPrationner();
}
#Bean
public Step masterStep() throws SQLException, IOException, ClassNotFoundException {
return stepBuilderFactory.get("masterStep")
.partitioner("MyFunctionStep", partitioner())
.step(MyFunctionStep())
.gridSize(gridSize)
.taskExecutor(MyFunctionTaskExecutor())
.build();
}
#Bean
public TaskExecutor myFunctionTaskExecutor() {
ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
taskExecutor.setThreadNamePrefix("MyFunctionTaskExecutor_");
int corePoolSize = gridSize + 2;
int maxPoolSize = corePoolSize * 2;
taskExecutor.setMaxPoolSize(maxPoolSize);
taskExecutor.setAllowCoreThreadTimeOut(true);
taskExecutor.setCorePoolSize(corePoolSize);
taskExecutor.setQueueCapacity(Integer.MAX_VALUE);
return taskExecutor;
}
#Bean
public Step myFunctionStep() throws IOException, ClassNotFoundException, SQLException {
return stepBuilderFactory.get("MyFunctionStep")
.<MyEntity, MyEntity>chunk(chunckSize)
.reader(reader)
.faultTolerant()
.skipLimit(MAX_NUMBER_CARD)
.skip(InvalidCardNumberException.class)
.skip(TokenManagementException.class)
.processor(processor())
.listener(new MyEntityProcessListener())
.writer(writer())
.listener(new MyEntityWriteListener())
.build();
}
#Bean
public Job myFunctionJob(#Qualifier("MyFunctionStep") Step myFunctionStep)
throws SQLException, IOException, ClassNotFoundException {
if (!batchIsEnabled) {
LOGGER.error(JOB_DISABLED);
System.exit(0);
}
return jobBuilderFactory.get("MyFunctionJob")
.listener(new MyFunctionJobListener())
.incrementer(new RunIdIncrementer())
.flow(masterStep())
.end()
.build();
}
}
I try to run a spring batch job with partions to read data from oracle database( in the sql there is a decryption function ) this need to set an authorisation for every session of connexion
the problem when th batch run some partion not decrypt data and return null an the only reason fo that , is that the authorisation is not set
The JdbcCursorItemReader does not use a JdbcTemplate. It directly creates connections to the database from the data source object passed to it. So you should not be expecting to call authorize which operates on a separate JdbcTemplate instance to impact the behaviour of the JdbcCursorItemReader. You said it works for some partitions, and that's really surprising.
If you want to take control on how the connection to the database is configured and override the default settings (for example by adding some authorization attributes), you need to extend JdbcCursorItemReader and override the protected void openCursor(Connection con) method, something like:
class MyCustomJdbcCursorItemReader extends JdbcCursorItemReader {
#Override
protected void openCursor(Connection con) {
super.openCursor(con);
// con.setClientInfo(); // set client info as needed here
}
}
i m new in Spring boot, i'm training on a small project with Spring batch to get experience, Here my context: I have 2 csv file, one hold employees, the other contains all managers of the compagny. I have to read files, then add each record in database. To make it simple , i just need to call an endpoint from my controller , upload my csv file (multipartfile), then the job will start. I actually was able to do that, my problem is the following.
I have to manage multiple kind of validation (i'm using jsr 380 validation for my entites and i have also to check business exception). A kind of buisness exception can be the following rule, An employee is supervised by a manager of his departement (the employee can't be supervised by a manager, if he's not on same departement, otherwise should throw exception). So for mistaken records, with some invalid or "Illogic" input, i have to skip them (don't save on database) but store them in an Map or List that should be sended as Responses Enity to the client. Hence the client would know which row need to be fixed. I suppose i have to take a look about** Listeners** , But i really can t store exceptions in a map or list then send it as ResponseEntity. Bellow Example of what i want to achieve.
My csv files screenshots
EmployeeBatchConfig.java
#Configuration
#EnableBatchProcessing
#AllArgsConstructor
public class EmployeeBatchConfig {
private JobBuilderFactory jobBuilderFactory;
private StepBuilderFactory stepBuilderFactory;
private EmployeeRepository employeeRepository;
private EmployeeItemWriter employeeItemWriter;
#Bean
#StepScope
public FlatFileItemReader<EmployeeDto> itemReader(#Value("#
{jobParameters[fullPathFileName]}") final String pathFile) {
FlatFileItemReader<EmployeeDto> flatFileItemReader = new
FlatFileItemReader<>();
flatFileItemReader.setResource(new FileSystemResource(new
File(pathFile)));
flatFileItemReader.setName("CSV-Reader");
flatFileItemReader.setLinesToSkip(1);
flatFileItemReader.setLineMapper(lineMapper());
return flatFileItemReader;
}
private LineMapper<EtudiantDto> lineMapper() {
DefaultLineMapper<EtudiantDto> lineMapper = new DefaultLineMapper<>
();
DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer();
lineTokenizer.setDelimiter(",");
lineTokenizer.setStrict(false);
lineTokenizer.setNames("Username", "lastName", "firstName",
"departement", "supervisor");
BeanWrapperFieldSetMapper<EmployeeDto> fieldSetMapper = new
BeanWrapperFieldSetMapper<>();
fieldSetMapper.setTargetType(EmployeeDto.class);
lineMapper.setLineTokenizer(lineTokenizer);
lineMapper.setFieldSetMapper(fieldSetMapper);
return lineMapper;
}
#Bean
public EmployeeProcessor processor() {
return new EmployeeProcessor(); /*Create a bean processor to skip
invalid rows*/
}
#Bean
public RepositoryItemWriter<Employee> writer() {
RepositoryItemWriter<Employee> writer = new RepositoryItemWriter<>();
writer.setRepository(employeeRepository);
writer.setMethodName("save");
return writer;
}
#Bean
public Step step1(FlatFileItemReader<EmployeeDto> itemReader) {
return stepBuilderFactory.get("slaveStep").<EmployeeDto,
Employee>chunk(5)
.reader(itemReader)
.processor(processor())
.writer(employeeItemWriter)
.faultTolerant()
.listener(skipListener())
.skip(SkipException.class)
.skipLimit(10)
.skipPolicy(skipPolicy())
.build();
}
#Bean
#Qualifier("executeJobEmployee")
public Job runJob(FlatFileItemReader<Employee> itemReader) {
return jobBuilderFactory
.get("importEmployee")
.flow(step1(itemReader))
.end()
.build();
}
#Bean
public SkipPolicy skipPolicy(){
return new ExceptionSkipPolicy();
}
#Bean
public SkipListener<EmployeeDto, Employee> skipListener(){
return new StepSkipListener();
}
/*#Bean
public ExecutionContext executionContext(){
return new ExecutionContext();
}*/
}
EmployeeProcessor.java
public class EmployeeProcessor implements ItemProcessor<EmployeeDto,
Employee>{
#Autowired
private SupervisorService managerService;
#Override
public Employee process(#Valid EmployeeDto item) throws Exception,
SkipException {
ManagerDto manager =
SupervisorService.findSupervisorById(item.getSupervisor());
//retrieve the manager of the employee and compare departement
if(!(manager.getDepartement().equals(item.getDepartement()))) {
throw new SkipException("Manager Invalid", item);
//return null;
}
return ObjectMapperUtils.map(item, Employee.class);
}
}
MySkipPolicy.java
public class MySkipPolicy implements SkipPolicy {
#Override
public boolean shouldSkip(Throwable throwable, int i) throws
SkipLimitExceededException {
return true;
}
}
StepSkipListenerPolicy.java
public class StepSkipListener implements SkipListener<EmployeeDto,
Number> {
#Override // item reader
public void onSkipInRead(Throwable throwable) {
System.out.println("In OnSkipReader");
}
#Override // item writter
public void onSkipInWrite(Number item, Throwable throwable) {
System.out.println("Nooooooooo ");
}
//#SneakyThrows
#Override // item processor
public void onSkipInProcess(#Valid EmployeeDto employee, Throwable
throwable){
System.out.println("Process... ");
/* I guess this is where I should work, but how do I deal with the
exception occur? How do I know which exception I would get ? */
}
}
SkipException.java
public class SkipException extends Exception {
private Map<String, EmployeeDto> errors = new HashMap<>();
public SkipException(String errorMessage, EmployeeDto employee) {
super();
this.errors.put(errorMessage, employee);
}
public Map<String, EmployeeDto> getErrors() {
return this.errors;
}
}
JobController.java
#RestController
#RequestMapping("/upload")
public class JobController {
#Autowired
private JobLauncher jobLauncher;
#Autowired
#Qualifier("executeJobEmployee")
private Job job;
private final String EMPLOYEE_FOLDER = "C:/Users/Project/Employee/";
#PostMapping("/employee")
public ResponseEntity<Object> importEmployee(#RequestParam("file")
MultipartFile multipartFile) throws JobInterruptedException,
SkipException, IllegalStateException, IOException,
FlatFileParseException{
try {
String fileName = multipartFile.getOriginalFilename();
File fileToImport= new File(EMPLOYEE_FOLDER + fileName);
multipartFile.transferTo(fileToImport);
JobParameters jobParameters = new JobParametersBuilder()
.addString("fullPathFileName", EMPLOYEE_FOLDER + fileName)
.addLong("startAt", System.currentTimeMillis())
.toJobParameters();
JobExecution jobExecution = this.jobLauncher.run(job,
jobParameters);
ExecutionContext executionContext =
jobExecution.getExecutionContext();
System.out.println("My Skiped items : " +
executionContext.toString());
} catch (ConstraintViolationException | FlatFileParseException |
JobRestartException | JobInstanceAlreadyCompleteException |
JobParametersInvalidException |
JobExecutionAlreadyRunningException e) {
e.printStackTrace();
return new ResponseEntity<>(e.getMessage(), HttpStatus.BAD_REQUEST);
}
return new ResponseEntity<>("Employee inserted succesfully",
HttpStatus.OK);
}
}
That requirement forces your implementation to wait for the job to finish before returning the web response, which is not the typical way of launching batch jobs from web requests. Typically, since batch jobs can run for several minutes/hours, they are launched in the background and a job ID is returned back to the client for later status check.
In Spring Batch, the SkipListener is the extension point that allows you to add custom code when a skippable exception happens when reading, processing or writing an item. I would add the business validation in an item processor and throw an exception with the skipped item and the reason for that skip (both encapsulated in the exception class that should be declared as skippable).
Skipped items are usually stored somewhere for later analysis (like a table or a file or the job execution context). In your case, you need to send them back in the web response, so you can read them from the store of your choice before returning them attached in the web response. In pseudo code in your controller, this should be something like the following:
- run the job and wait for its termination (the skip listener would write skipped items in the storage of your choice)
- get skipped items from storage
- return web response
For example, if you choose to store skipped items in the job execution context, you can do something like this in your controller:
JobExecution jobExecution = jobLauncher.run(job, jobParameters);
ExecutionContext executionContext = jobExecution.getExecutionContext();
// get skipped items from the execution context
// return the web response
I have a scenario for Springboot batch application where I need to read the CSV-11 and write to another CSV-2, the problem arises when I need to get the CSV-1 from a REST endpoint, for e.g. when I initiate the job it should fetch a CSV-1 from an endpoint and then continue the batch processing to write to CSV-2.
But it seems like to process a CSV-1 we need to feed the 'Resource' in advance while application startup [I am fairly new to batch so not sure if it's 100% correct].
Can anyone please guide me the right approach to resolve this?
EDIT : Adding code, I was able to resolve it but need advice if it's the right way to do things (please ignore the hard-coded data).
#Configuration
#EnableBatchProcessing
public class BatchConfig {
#Autowired
private JobBuilderFactory jobbuilder;
#Autowired
private StepBuilderFactory stepbuilder;
#Autowired
private CSVProcessor processor;
#Autowired
private CustomSkipPolicy customSkipPolicy;
#Autowired
private CSVResponse response;
#Bean(name="csv")
public Job job_csv() throws Exception {
Step step = stepbuilder
.get("csv-step")
.<Person, Person>chunk(5)
.reader(new CSVReader(response.getResource()))
.processor(processor)
.writer(new CSVWriter().write(response.getFilename()))
.faultTolerant()
.skipPolicy(customSkipPolicy)
.listener(new StepListener())
.build();
return jobbuilder
.get("csv-job")
.incrementer(new RunIdIncrementer())
.listener(new JobListener())
.flow(step)
.end()
.build();
}
}
#Slf4j
public class CSVReader extends FlatFileItemReader<Person> {
public CSVReader(Resource resource) throws Exception {
super();
setResource(resource);
setStrict(false);
setLinesToSkip(1);
doOpen();
DelimitedLineTokenizer dlt = new DelimitedLineTokenizer();
dlt.setNames(new String[] {"id","first_name","last_name","email","gender","ip_address","dob"});
dlt.setDelimiter(",");
dlt.setStrict(false);
BeanWrapperFieldSetMapper<Person> fsp = new BeanWrapperFieldSetMapper<>();
fsp.setTargetType(Person.class);
DefaultLineMapper<Person> dlp = new DefaultLineMapper<>();
dlp.setLineTokenizer(dlt);
dlp.setFieldSetMapper(fsp);
setLineMapper(dlp);
}
}
public class CSVWriter {
private static final String DATA_PROCESSED = "C:/data/processed";
public FlatFileItemWriter<Person> write(String filename) throws Exception {
FlatFileItemWriter<Person> writer = new FlatFileItemWriter<>();
writer.setResource(resource(filename));
writer.setHeaderCallback(new Header());
BeanWrapperFieldExtractor<Person> fe = new BeanWrapperFieldExtractor<>();
fe.setNames(new String[] { "id", "first_name", "last_name", "age" });
DelimitedLineAggregator<Person> dla = new DelimitedLineAggregator<>();
dla.setDelimiter(",");
dla.setFieldExtractor(fe);
writer.setLineAggregator(dla);
return writer;
}
private Resource resource(String filename) throws IOException {
String processed_file = filename.replace(".csv", "").concat("_PROCESSED").concat(".csv");
if (Files.notExists(Paths.get(DATA_PROCESSED), new LinkOption[] { LinkOption.NOFOLLOW_LINKS })) {
Files.createDirectory(Paths.get(DATA_PROCESSED));
}
if (Files.notExists(Paths.get(DATA_PROCESSED + "/" + processed_file),
new LinkOption[] { LinkOption.NOFOLLOW_LINKS })) {
Files.createFile(Paths.get(DATA_PROCESSED + "/" + processed_file));
}
return new FileSystemResource(DATA_PROCESSED + "/" + processed_file);
}
}
#Component
#Slf4j
public class CSVResponse {
private static final String DATA_RECEIVE = "C:/data/receive";
private String filename;
public Resource getResource() throws IOException {
ResponseEntity<Resource> resp = new RestTemplate().getForEntity("http://localhost:8081/csv", Resource.class);
log.info("File Received : "+resp.getBody().getFilename());
setFilename(resp.getBody().getFilename());
Path path = Paths.get(DATA_RECEIVE);
if(!Files.exists(path, new LinkOption[]{ LinkOption.NOFOLLOW_LINKS})) {
Files.createDirectory(path);
}
Files.copy(resp.getBody().getInputStream(), Paths.get(DATA_RECEIVE + "/" +resp.getBody().getFilename()), StandardCopyOption.REPLACE_EXISTING);
return resp.getBody();
}
public String getFilename() {
return filename;
}
public void setFilename(String filename) {
this.filename = filename;
}
}
You can implement item reader with StepExecutionListener interface, and you have to implement the fetching CSV-1 file via RestTemplate in beforeStep() method of the item reader.
You can refer how to use RestTemplate from https://www.baeldung.com/rest-template
I am trying to use spring batch to read file from a .dat file and persist the data into database. My requirement says to either insert all of the data or insert none of the data into table i.e, atomicity. However, using spring batch i'm not able to achieve the same it is reading data in chunks and is inserting data as long as the records are fine. if at some point the record is inappropriate and some db exception is thrown then i want complete rollback which is not happening. Let's say we get error at 2051th record then my code saves 2050 records but i want complete rollback and if all data is good then all N records should be persisted. Thanks in advance for any help or relevant approach that may solve my issue...
NOTE: I have already used Spring Transactional annotation on caller method but it's not working and i'm reading data in a chunk size of 10 items.
MyConfiguration.java
#Configuration
public class MyConfiguration
{
#Autowired
JobBuilderFactory jobBuilderFactory;
#Autowired
StepBuilderFactory stepBuilderFactory;
#Autowired
#Qualifier("MyCompletionListener")
JobCompletionNotificationListener jobCompletionNotificationListener;
#StepScope
#Bean(name="MyReader")
public FlatFileItemReader<InputMapperDTO> reader(#Value("#{jobParameters['fileName']}") String fileName) throws IOException
{
FlatFileItemReader<InputMapperDTO> newBean = new FlatFileItemReader<>();
newBean.setName("MyReader");
newBean.setResource(new InputStreamResource(FileUtils.openInputStream(new File(fileName))));
newBean.setLineMapper(lineMapper());
newBean.setLinesToSkip(1);
return newBean;
}
#Bean(name="MyLineMapper")
public DefaultLineMapper<InputMapperDTO> lineMapper()
{
DefaultLineMapper<InputMapperDTO> lineMapper = new DefaultLineMapper<>();
lineMapper.setLineTokenizer(lineTokenizer());
Reader reader = new Reader();
lineMapper.setFieldSetMapper(reader);
return lineMapper;
}
#Bean(name="MyTokenizer")
public DelimitedLineTokenizer lineTokenizer()
{
DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
tokenizer.setDelimiter("|");
tokenizer.setNames("InvestmentAccountUniqueIdentifier", "BaseCurrencyUniqueIdentifier",
"OperatingCurrencyUniqueIdentifier", "PricingHierarchyUniqueIdentifier", "InvestmentAccountNumber",
"DummyAccountIndicator", "InvestmentAdvisorCompanyNumberLegacy","HighNetWorthAccountTypeCode");
tokenizer.setIncludedFields(0, 5, 7, 13, 29, 40, 49,75);
return tokenizer;
}
#Bean(name="MyBatchProcessor")
public ItemProcessor<InputMapperDTO, FinalDTO> processor()
{
return new Processor();
}
#Bean(name="MyWriter")
public ItemWriter<FinalDTO> writer()
{
return new Writer();
}
#Bean(name="MyStep")
public Step step1() throws IOException
{
return stepBuilderFactory.get("MyStep")
.<InputMapperDTO, FinalDTO>chunk(10)
.reader(this.reader(null))
.processor(this.processor())
.writer(this.writer())
.build();
}
#Bean(name=MyJob")
public Job importUserJob(#Autowired #Qualifier("MyStep") Step step1)
{
return jobBuilderFactory
.get("MyJob"+new Date())
.incrementer(new RunIdIncrementer())
.listener(jobCompletionNotificationListener)
.flow(step1)
.end()
.build();
}
}
Writer.java
public class Writer implements ItemWriter<FinalDTO>
{
#Autowired
SomeRepository someRepository;
#Override
public void write(List<? extends FinalDTO> listOfObjects) throws Exception
{
someRepository.saveAll(listOfObjects);
}
}
JobCompletionNotificationListener.java
public class JobCompletionNotificationListener extends JobExecutionListenerSupport
{
#Override
public void afterJob(JobExecution jobExecution)
{
if(jobExecution.getStatus() == BatchStatus.COMPLETED)
{
System.err.println("****************************************");
System.err.println("***** Batch Job Completed ******");
System.err.println("****************************************");
}
else
{
System.err.println("****************************************");
System.err.println("***** Batch Job Failed ******");
System.err.println("****************************************");
}
}
}
MyCallerMethod
#Transactional
public String processFile(String datFile) throws JobExecutionAlreadyRunningException, JobRestartException,
JobInstanceAlreadyCompleteException, JobParametersInvalidException
{
long st = System.currentTimeMillis();
JobParametersBuilder builder = new JobParametersBuilder();
builder.addString("fileName",datFile);
builder.addDate("date", new Date());
jobLauncher.run(job, builder.toJobParameters());
System.err.println("****************************************");
System.err.println("***** Total time consumed = "+(System.currentTimeMillis()-st)+" ******");
System.err.println("****************************************");
return response;
}
The operation I have tried is not provided in batch. For my requirement, I have implemented custom delete which flushes the database upon failure in any step.
How can I initilize the MultiResourceItemReader for each job run . currently with this setup its still using the same instance for each job run
I put the #StepScope still its using the same old list of files which it has already been processed. I am not sure what else I have to add in this code
I also tried with the #JobScope also it did not work out. there is something fundamental I am missing
#Configuration
#EnableBatchProcessing
public class BatchConfiguration {
#Autowired
public JobBuilderFactory jobBuilderFactory;
#Autowired
public StepBuilderFactory stepBuilderFactory;
#Value("file:ftp-inbound/*.csv")
#Autowired
private Resource[] inputResources;
#Autowired
private StepBuilderFactory steps;
#Autowired
private JobBuilderFactory jobs;
#Autowired
private ResourceLoader resourceLoader;
#Bean
public FlatFileItemReader<AccommodationRoomAvailability> itemReader() throws UnexpectedInputException, ParseException, IOException {
FlatFileItemReader<AccommodationRoomAvailability> reader = new FlatFileItemReader<AccommodationRoomAvailability>();
DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
String[] tokens = {"Product ID", "Allotment", "Kamertype", "Zoeknaam", "Hotel", "Datum", "Beschikbaar", "Nachten"};
tokenizer.setNames(tokens);
tokenizer.setDelimiter(";");
tokenizer.setStrict(true);
reader.setLinesToSkip(1);
DefaultLineMapper<AccommodationRoomAvailability> lineMapper = new DefaultLineMapper<AccommodationRoomAvailability>();
lineMapper.setLineTokenizer(tokenizer);
lineMapper.setFieldSetMapper(new RecordFieldSetMapper());
reader.setLineMapper(lineMapper);
return reader;
}
#Bean
#Qualifier("multiResourceReader")
#StepScope
public MultiResourceItemReader<AccommodationRoomAvailability> multiResourceItemReader() throws Exception {
MultiResourceItemReader<AccommodationRoomAvailability> resourceItemReader = new MultiResourceItemReader<AccommodationRoomAvailability>();
resourceItemReader.setResources(inputResources);
resourceItemReader.setDelegate(itemReader());
resourceItemReader.setStrict(false);
resourceItemReader.setSaveState(false);
// resourceItemReader.read();
return resourceItemReader;
}
#Bean
public ItemProcessor<AccommodationRoomAvailability, String> itemProcessor() {
return new AvailabilityProcessor();
}
#Bean
public ItemWriter itemWriter() {
return new ItemWriter() {
#Override
public void write(List list) throws Exception {
}
};
}
#Bean
protected Step step1(#Qualifier("multiResourceReader") MultiResourceItemReader<AccommodationRoomAvailability> reader, ItemProcessor<AccommodationRoomAvailability, String> processor,
ItemWriter writer) {
return steps.get("step1")/*.listener(new StepListener())*/.<AccommodationRoomAvailability, String>chunk(30000).reader(reader)
.processor(processor)
.writer(writer)
.build();
}
#Bean
public Step step2() throws IOException {
FileDeletingTasklet task = new FileDeletingTasklet();
task.setResources(inputResources);
return stepBuilderFactory.get("step2")
.tasklet(task)
.build();
}
#Bean(name = "job")
public Job job(#Qualifier("step1") Step step1, Step step2) throws IOException {
return jobs.get("job")
.start(step1).on("*").to(step2).end()
// .flow(step1).on("").to(step2()).end()
.build();
}
}
Once your application context is created, the injected resources #Value("file:ftp-inbound/*.csv") will be the same during the whole lifetime of your app. That's why the reader will always read the same values.
You need to pass these resources as a parameter to your job and late-bind them in your reader with Step scope. In your example it would be something like:
#Bean
#Qualifier("multiResourceReader")
#StepScope
public MultiResourceItemReader<AccommodationRoomAvailability> multiResourceItemReader(#Value("#{jobParameters['inputResources']}") Resource[] inputResources) throws Exception {
MultiResourceItemReader<AccommodationRoomAvailability> resourceItemReader = new MultiResourceItemReader<AccommodationRoomAvailability>();
resourceItemReader.setResources(inputResources);
resourceItemReader.setDelegate(itemReader());
resourceItemReader.setStrict(false);
resourceItemReader.setSaveState(false);
return resourceItemReader;
}
Then pass input resources as a parameter to your job:
JobParameters jobParameters = new JobParametersBuilder()
.addString("inputResources", "file:ftp-inbound/*.csv")
.toJobParameters();
currently with this setup its still using the same instance for each job run
That's because your resources are always the same when they are injected in a field of your configuration class. If you use the job parameters approach I mentioned in the previous example, you will have a different instance if you run the job with different set of files.