Conditional writer for spring batch - spring-boot

I use Spring Boot and Spring Batch.
public ItemWriter<T> writerOne() {
    ItemWriter<T> writer = items -> {
        // your logic here
    };
    return writer;
}

public ItemWriter<T> writerTwo() {
    ItemWriter<T> writer = items -> {
        // your logic here
    };
    return writer;
}

public CompositeItemWriter<T> compositeItemWriter() {
    CompositeItemWriter<T> writer = new CompositeItemWriter<>();
    writer.setDelegates(Arrays.asList(writerOne(), writerTwo()));
    return writer;
}
I read a CSV file, process each record, and then need to call two writers...
Depending on a field value, writerTwo must be called.
Is there any way to achieve this?
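One possible approach, sketched under assumptions (the item type MyItem, its getType() accessor and the value "SPECIAL" are hypothetical placeholders, and writerOne()/writerTwo() are assumed to be typed to that item class), is to let a ClassifierCompositeItemWriter pick the delegate per item: items whose field matches go to both writers via a composite, all others go to writerOne only. Delegates that are ItemStreams (for example file writers) would still need to be registered as streams on the step.

@Bean
public ClassifierCompositeItemWriter<MyItem> conditionalWriter() {
    // Delegate used when the field matches: both writers are called for the item.
    CompositeItemWriter<MyItem> both = new CompositeItemWriter<>();
    both.setDelegates(Arrays.asList(writerOne(), writerTwo()));

    ClassifierCompositeItemWriter<MyItem> writer = new ClassifierCompositeItemWriter<>();
    // Route each item on the field value: "SPECIAL" items go to both writers, the rest to writerOne only.
    writer.setClassifier(item -> "SPECIAL".equals(item.getType()) ? both : writerOne());
    return writer;
}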

Related

Can we get data processed in Spring Batch after batch job is completed?

I am using Spring Batch to read data from a database, process it, and do some processing in the writer.
If the chunk size is smaller than the number of records read by the reader, Spring Batch processes the data in multiple chunks. I want to do the processing in the writer only once, at the end of all chunk processing, or if this is not possible then I will remove the writer and process the data obtained in the processor after the batch job is completed. Is this possible?
Below is the code that triggers my Spring Batch job:
private void triggerSpringBatchJob() {
    loggerConfig.logDebug(log, " : Triggering product catalog scheduler ");
    JobParametersBuilder builder = new JobParametersBuilder();
    try {
        // Adding date in buildJobParameters because if not added we will get
        // "A job instance already exists": JobInstanceAlreadyCompleteException
        builder.addDate("date", new Date());
        jobLauncher.run(processProductCatalog, builder.toJobParameters());
    } catch (JobExecutionAlreadyRunningException | JobRestartException | JobInstanceAlreadyCompleteException
            | JobParametersInvalidException e) {
        e.printStackTrace();
    }
}
Below is my Spring Batch configuration:
@Configuration
@EnableBatchProcessing
public class BatchJobProcessConfiguration {

    @Bean
    @StepScope
    RepositoryItemReader<Tuple> reader(SkuRepository skuRepository,
            ProductCatalogConfiguration productCatalogConfiguration) {
        RepositoryItemReader<Tuple> reader = new RepositoryItemReader<>();
        reader.setRepository(skuRepository);
        // query parameters
        List<Object> queryMethodArguments = new ArrayList<>();
        if (productCatalogConfiguration.getSkuId().isEmpty()) {
            reader.setMethodName("findByWebEligibleAndDiscontinued");
            queryMethodArguments.add(productCatalogConfiguration.getWebEligible()); // for web eligible
            queryMethodArguments.add(productCatalogConfiguration.getDiscontinued()); // for discontinued
            queryMethodArguments.add(productCatalogConfiguration.getCbdProductId()); // for cbd products
        } else {
            reader.setMethodName("findBySkuIds");
            queryMethodArguments.add(productCatalogConfiguration.getSkuId()); // for sku ids
        }
        reader.setArguments(queryMethodArguments);
        reader.setPageSize(1000);
        Map<String, Direction> sorts = new HashMap<>();
        sorts.put("sku_id", Direction.ASC);
        reader.setSort(sorts);
        return reader;
    }

    @Bean
    @StepScope
    ItemWriter<ProductCatalogWriterData> writer() {
        return new ProductCatalogWriter();
    }

    @Bean
    ProductCatalogProcessor processor() {
        return new ProductCatalogProcessor();
    }

    @Bean
    SkipPolicy readerSkipper() {
        return new ReaderSkipper();
    }

    @Bean
    Step productCatalogDataStep(ItemReader<Tuple> itemReader, ProductCatalogWriter writer,
            HttpServletRequest request, StepBuilderFactory stepBuilderFactory, BatchConfiguration batchConfiguration) {
        return stepBuilderFactory.get("processProductCatalog")
                .<Tuple, ProductCatalogWriterData>chunk(batchConfiguration.getBatchChunkSize())
                .reader(itemReader).faultTolerant().skipPolicy(readerSkipper()).processor(processor()).writer(writer).build();
    }

    @Bean
    Job productCatalogData(Step productCatalogDataStep, HttpServletRequest request,
            JobBuilderFactory jobBuilderFactory) {
        return jobBuilderFactory.get("processProductCatalog").incrementer(new RunIdIncrementer())
                .flow(productCatalogDataStep).end().build();
    }
}
I want to do the processing in the writer only once, at the end of all chunk processing, or if this is not possible then I will remove the writer and process the data obtained in the processor after the batch job is completed. Is this possible?
"at the end of all chunk processing" is key here. If the requirement is to do some processing after all chunks have been "pre-processed", I would keep it simple and use two steps for that:
Step 1: (pre)processes the data as needed and writes it to a temporary storage
Step 2: Here you do whatever you want with the processed data prepared in the temporary storage
A final step would clean up the temporary storage if it is persistent (a file, a staging table, etc.). Otherwise, i.e. if it is in memory, this step is optional.
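A minimal sketch of that two-step layout, assuming hypothetical step and bean names (the temporary storage mechanism itself is left open):

@Bean
public Job twoPhaseJob(JobBuilderFactory jobs, Step preProcessStep, Step finalProcessingStep) {
    // Step 1 pre-processes the data chunk by chunk and writes it to temporary storage;
    // step 2 then runs after all chunks are done.
    return jobs.get("twoPhaseJob")
            .start(preProcessStep)
            .next(finalProcessingStep)
            .build();
}

@Bean
public Step finalProcessingStep(StepBuilderFactory steps) {
    // A tasklet step: returning FINISHED right away means this logic runs a single time.
    return steps.get("finalProcessingStep")
            .tasklet((contribution, chunkContext) -> {
                // read the temporary storage (file, staging table, ...) and do the final processing here
                return RepeatStatus.FINISHED;
            })
            .build();
}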

Getting an Error like this - "jobParameters cannot be found on object of type BeanExpressionContext"

We're creating a Spring Batch app that reads data from one database and writes to another. In this process, we need to set the SQL parameters dynamically, because the query depends on them.
For this, we created a JdbcCursorItemReader with @StepScope, as shown in other articles and tutorials, but it was not successful. The chunk reader in our job actually uses a peekable reader, which internally delegates to the JdbcCursorItemReader to perform the actual read operation.
When the job is triggered, we get the error "jobParameters cannot be found on object of type BeanExpressionContext".
Please let me know what I am doing wrong in the bean configuration below.
@Bean
@StepScope
@Scope(proxyMode = ScopedProxyMode.TARGET_CLASS)
public JdbcCursorItemReader<DTO> jdbcDataReader(@Value() String param) throws Exception {
    JdbcCursorItemReader<DTO> databaseReader = new JdbcCursorItemReader<DTO>();
    return databaseReader;
}

// This class extends PeekableReader, and sets JdbcReader (jdbcDataReader) as delegate
@Bean
public DataPeekReader getPeekReader() {
    DataPeekReader peekReader = new DataPeekReader();
    return peekReader;
}

// This is the reader that uses Peekable Item Reader (getPeekReader) and also specifies chunk completion policy.
@Bean
public DataReader getDataReader() {
    DataReader dataReader = new DataReader();
    return dataReader;
}

// This is the step builder.
@Bean
public Step readDataStep() throws Exception {
    return stepBuilderFactory.get("readDataStep")
            .<DTO, DTO>chunk(getDataReader())
            .reader(getDataReader())
            .writer(getWriter())
            .build();
}

@Bean
public Job readReconDataJob() throws Exception {
    return jobBuilderFactory.get("readDataJob")
            .incrementer(new RunIdIncrementer())
            .flow(readDataStep())
            .end()
            .build();
}
Please let me know what I am doing wrong in the bean configuration below.
Your jdbcDataReader(@Value() String param) is incorrect. You need to specify a SpEL expression in the @Value annotation to tell Spring which job parameter to inject. Here is an example of how to pass a job parameter to a JdbcCursorItemReader:
@Bean
@StepScope
public JdbcCursorItemReader<DTO> jdbcCursorItemReader(@Value("#{jobParameters['table']}") String table) {
    return new JdbcCursorItemReaderBuilder<DTO>()
            .sql("select * from " + table)
            // set other properties
            .build();
}
You can find more details in the late binding section of the reference documentation.
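For completeness, a sketch of how such a job could be launched so that the table parameter is available for late binding (jobLauncher and the job bean are assumed to be injected; the table name is a placeholder):

JobParameters params = new JobParametersBuilder()
        .addString("table", "MY_TABLE") // injected into the @StepScope reader via #{jobParameters['table']}
        .addDate("date", new Date())    // makes each run a distinct job instance
        .toJobParameters();
jobLauncher.run(job, params);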

Spring Batch Annotated No XML Pass Parameters to Item Reader

I created a simple Boot/Spring Batch 3.0.8.RELEASE job. I created a simple class that implements JobParametersIncrementer to go to the database, look up how many days the query should look for, and put that value into the JobParameters object.
I need that value in my JdbcCursorItemReader, as it selects data based on one of the looked-up JobParameters, but I cannot figure out how to do this with Java annotations. There are plenty of XML examples, but not many for Java.
Below is my BatchConfiguration class that runs the job.
@Autowired
SendJobParms jobParms; // this guy queries DB and puts data into JobParameters

@Bean
public Job job(@Qualifier("step1") Step step1, @Qualifier("step2") Step step2) {
    return jobs.get("DW_Send").incrementer(jobParms).start(step1).next(step2).build();
}

@Bean
protected Step step2(ItemReader<McsendRequest> reader,
        ItemWriter<McsendRequest> writer) {
    return steps.get("step2")
            .<McsendRequest, McsendRequest> chunk(5000)
            .reader(reader)
            .writer(writer)
            .build();
}

@Bean
public JdbcCursorItemReader reader() {
    JdbcCursorItemReader<McsendRequest> itemReader = new JdbcCursorItemReader<McsendRequest>();
    itemReader.setDataSource(dataSource);
    // want to get access to JobParameter here so I can pull values out for my sql query.
    itemReader.setSql("select xxxx where rownum <= JobParameter.getCount()");
    itemReader.setRowMapper(new McsendRequestMapper());
    return itemReader;
}
Change the reader definition as follows (example for a parameter of type Long named paramCount):
@Bean
@StepScope
public JdbcCursorItemReader<McsendRequest> reader(@Value("#{jobParameters[paramCount]}") Long paramCount) {
    JdbcCursorItemReader<McsendRequest> itemReader = new JdbcCursorItemReader<McsendRequest>();
    itemReader.setDataSource(dataSource);
    itemReader.setSql("select xxxx where rownum <= ?");
    ListPreparedStatementSetter listPreparedStatementSetter = new ListPreparedStatementSetter();
    listPreparedStatementSetter.setParameters(Arrays.asList(paramCount));
    itemReader.setPreparedStatementSetter(listPreparedStatementSetter);
    itemReader.setRowMapper(new McsendRequestMapper());
    return itemReader;
}

How to change my job configuration to add file name dynamically

I have a Spring Batch job which reads from a database and then outputs to multiple CSV files. Inside my database I have a special column named divisionId. A CSV file should exist for every distinct value of divisionId. I split out the data using a ClassifierCompositeItemWriter.
At the moment I have an ItemWriter bean defined for every distinct value of divisionId. The beans are the same; it's only the file name that is different.
How can I change the configuration below to create a file with the divisionId automatically prepended to the file name, without having to register a new ItemWriter for each divisionId?
I've been playing around with the @JobScope and @StepScope annotations but can't get it right.
Thanks in advance.
@Bean
public Step readStgDbAndExportMasterListStep() {
    return commonJobConfig.stepBuilderFactory
            .get("readStgDbAndExportMasterListStep")
            .<MasterList, MasterList>chunk(commonJobConfig.chunkSize)
            .reader(commonJobConfig.queryStagingDbReader())
            .processor(masterListOutputProcessor())
            .writer(masterListFileWriter())
            .stream((ItemStream) divisionMasterListFileWriter45())
            .stream((ItemStream) divisionMasterListFileWriter90())
            .build();
}

@Bean
public ItemWriter<MasterList> masterListFileWriter() {
    BackToBackPatternClassifier classifier = new BackToBackPatternClassifier();
    classifier.setRouterDelegate(new DivisionClassifier());
    classifier.setMatcherMap(new HashMap<String, ItemWriter<? extends MasterList>>() {{
        put("45", divisionMasterListFileWriter45());
        put("90", divisionMasterListFileWriter90());
    }});
    ClassifierCompositeItemWriter<MasterList> writer = new ClassifierCompositeItemWriter<MasterList>();
    writer.setClassifier(classifier);
    return writer;
}

@Bean
public ItemWriter<MasterList> divisionMasterListFileWriter45() {
    FlatFileItemWriter<MasterList> writer = new FlatFileItemWriter<>();
    writer.setResource(new FileSystemResource(new File(commonJobConfig.outDir, "45_masterList.csv")));
    writer.setHeaderCallback(masterListFlatFileHeaderCallback());
    writer.setLineAggregator(masterListFormatterLineAggregator());
    return writer;
}

@Bean
public ItemWriter<MasterList> divisionMasterListFileWriter90() {
    FlatFileItemWriter<MasterList> writer = new FlatFileItemWriter<>();
    writer.setResource(new FileSystemResource(new File(commonJobConfig.outDir, "90_masterList.csv")));
    writer.setHeaderCallback(masterListFlatFileHeaderCallback());
    writer.setLineAggregator(masterListFormatterLineAggregator());
    return writer;
}
I came up with a pretty complex way of doing this, following the tutorial at https://github.com/langmi/spring-batch-examples/wiki/Rename-Files.
The premise is to place the file name in the step execution context.
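A minimal sketch of that idea, with assumptions: a step-scoped FlatFileItemWriter reads a divisionId value that something earlier (for example a partitioner or a StepExecutionListener) has put into the step execution context under that key, and builds the file name from it. The key name and the surrounding beans from the question are reused here as placeholders.

@Bean
@StepScope
public FlatFileItemWriter<MasterList> divisionMasterListFileWriter(
        @Value("#{stepExecutionContext['divisionId']}") String divisionId) {
    // 'divisionId' is assumed to have been placed in the step execution context before the step runs.
    FlatFileItemWriter<MasterList> writer = new FlatFileItemWriter<>();
    writer.setResource(new FileSystemResource(
            new File(commonJobConfig.outDir, divisionId + "_masterList.csv")));
    writer.setHeaderCallback(masterListFlatFileHeaderCallback());
    writer.setLineAggregator(masterListFormatterLineAggregator());
    return writer;
}

How divisionId gets into the context (a partitioner with one partition per division, a listener, etc.) is left open here, just as in the linked tutorial.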

How to make the writer instantiated for every job instance in Spring Batch

I am writing a Spring Batch job. I am implementing a custom writer, KafkaClientWriter extends AbstractItemStreamItemWriter<ProducerMessage>.
It has fields which need to be unique for each job instance, but I can see this class is instantiated only once; the other jobs get the same instance of the writer class, whereas my custom readers and processors are instantiated for each job.
Below is my job configuration. How can I achieve the same behavior for the writer as well?
@Bean
@Scope("job")
public ZipMultiResourceItemReader reader(@Value("#{jobParameters[fileName]}") String fileName, @Value("#{jobParameters[s3SourceFolderPrefix]}") String s3SourceFolderPrefix, @Value("#{jobParameters[timeStamp]}") long timeStamp, com.fastretailing.catalogPlatformSCMProducer.service.ConfigurationService confService) {
    FlatFileItemReader flatFileReader = new FlatFileItemReader();
    ZipMultiResourceItemReader zipReader = new ZipMultiResourceItemReader();
    Resource[] resArray = new Resource[1];
    resArray[0] = new FileSystemResource(new File(fileName));
    zipReader.setArchives(resArray);
    DefaultLineMapper<ProducerMessage> lineMapper = new DefaultLineMapper<ProducerMessage>();
    lineMapper.setLineTokenizer(new DelimitedLineTokenizer());
    CSVFieldMapper csvFieldMapper = new CSVFieldMapper(fileName, s3SourceFolderPrefix, timeStamp, confService);
    lineMapper.setFieldSetMapper(csvFieldMapper);
    flatFileReader.setLineMapper(lineMapper);
    zipReader.setDelegate(flatFileReader);
    return zipReader;
}

@Bean
@Scope("job")
public ItemProcessor<ProducerMessage, ProducerMessage> processor(@Value("#{jobParameters[timeStamp]}") long timeStamp) {
    ProducerProcessor processor = new ProducerProcessor();
    processor.setS3FileTimeStamp(timeStamp);
    return processor;
}

@Bean
@ConfigurationProperties
public ItemWriter<ProducerMessage> writer() {
    return new KafkaClientWriter();
}

@Bean
public Step step1(StepBuilderFactory stepBuilderFactory,
        ItemReader reader, ItemWriter writer,
        ItemProcessor processor, @Value("${reader.chunkSize}") int chunkSize) {
    LOGGER.info("Step configuration loaded with chunk size {}", chunkSize);
    return stepBuilderFactory.get("step1")
            .chunk(chunkSize).reader(reader)
            .processor(processor).writer(writer)
            .build();
}

@Bean
public StepScope stepScope() {
    final StepScope stepScope = new StepScope();
    stepScope.setAutoProxy(true);
    return stepScope;
}

@Bean
public JobScope jobScope() {
    final JobScope jobScope = new JobScope();
    return jobScope;
}

@Bean
public Configuration configuration() {
    return new Configuration();
}
I tried making the writer job-scoped, but in that case open is not getting called, and that is where I do some initialization.
When using Java-based configuration and a scoped proxy, the return type of the method is detected and a proxy is created for that type. So when you return ItemWriter you will get a JDK proxy implementing only ItemWriter, whereas your open method is on the ItemStream interface. Because that interface isn't included on the proxy, there is no way to call the method.
Either change the return type to KafkaClientWriter or ItemStreamWriter<ProducerMessage> (assuming KafkaClientWriter implements that interface). Then add @Scope("job") and your open method should be called again, on a properly scoped writer.
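A minimal sketch of that fix, assuming KafkaClientWriter implements ItemStreamWriter<ProducerMessage>:

@Bean
@Scope("job")
public ItemStreamWriter<ProducerMessage> writer() {
    // Declaring ItemStreamWriter as the return type keeps ItemStream on the scoped proxy,
    // so open(), update() and close() are delegated to the job-scoped instance.
    return new KafkaClientWriter();
}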
