Getting an Error like this - "jobParameters cannot be found on object of type BeanExpressionContext" - spring

We're creating a spring batch app that reads data from a database and writes in another database. In this process, we need to dynamically set the parameter to the SQL as we have parameters that demands data accordingly.
For this, We created a JdbcCursorItemReader Reader with #StepScope as I've found in other articles and tutorials. But was not successful. The chunk reader in our Job actually uses Peekable reader which internally uses the JdbcCursorItemReader object to perform the actual read operation.
When the job is triggered, we get the error - "jobParameters cannot be found on object of type BeanExpressionContext"
Please let me know what is that I am doing wrongly in the bean configuration below.
#Bean
#StepScope
#Scope(proxyMode = ScopedProxyMode.TARGET_CLASS)
public JdbcCursorItemReader<DTO> jdbcDataReader(#Value() String param) throws Exception {
JdbcCursorItemReader<DTO> databaseReader = new JdbcCursorItemReader<DTO>();
return databaseReader;
}
// This class extends PeekableReader, and sets JdbcReader (jdbcDataReader) as delegate
#Bean
public DataPeekReader getPeekReader() {
DataPeekReader peekReader = new DataPeekReader();
return peekReader;
}
// This is the reader that uses Peekable Item Reader (getPeekReader) and also specifies chunk completion policy.
#Bean
public DataReader getDataReader() {
DataReader dataReader = new DataReader();
return dataReader;
}
// This is the step builder.
#Bean
public Step readDataStep() throws Exception {
return stepBuilderFactory.get("readDataStep")
.<DTO, DTO>chunk(getDataReader())
.reader(getDataReader())
.writer(getWriter())
.build();
}
#Bean
public Job readReconDataJob() throws Exception {
return jobBuilderFactory.get("readDataJob")
.incrementer(new RunIdIncrementer())
.flow(readDataStep())
.end()
.build();
}

Please let me know what is that I am doing wrongly in the bean configuration below.
Your jdbcDataReader(#Value() String param) is incorrect. You need to specify a Spel expression in the #Value to specify which parameter to inject. Here is an example of how to pass a job parameter to a JdbcCursorItemReader:
#Bean
#StepScope
public JdbcCursorItemReader<DTO> jdbcCursorItemReader(#Value("#{jobParameters['table']}") String table) {
return new JdbcCursorItemReaderBuilder<DTO>()
.sql("select * from " + table)
// set other properties
.build();
}
You can find more details in the late binding section of the reference documentation.

Related

Spring batch - org.springframework.batch.item.ReaderNotOpenException - jdbccursoritemreader

I get the reader not open exception when tried to read JdbcCursorItemReader in the Item Reader implementation. I checked the stack overflow, but couldnt get a discussion on jdbc item reader.
Here is the code of Batch config and Item reader implementation. Added only required code.
public class BatchConfig extends DefaultBatchConfigurer {
#Bean
public ItemReader<Allocation> allocationReader() {
return new AllocationReader(dataSource);
}
#Bean
public Step step() {
return stepBuilderFactory.get("step").<Allocation, Allocation>chunk (1).reader(allocationReader()).processor(allocationProcessor()).writer(allocationWriter()).build();
}
}
public class AllocationReader implements ItemReader<Allocation> {
private DataSource ds;
private string block;
public AllocationReader(DataSource ds) {
this.ds = ds;
}
#BeforeStep
public void readStep(StepExecution StepExecution) {
this.stepExecution = StepExecution;
block = StepExecution.getJobExecution().get ExecutionContext().get("blocks");
}
#Override
public Allocation read() {
JdbcCursorItemReader<Allocation> reader = new JdbcCursorItemReader<Allocation>();
reader.setSql("select * from blocks where block_ref = + block);
reader.setDataSource(this.ds);
reader.setRowMapper(new AllocationMapper());
return reader.read();
}}
I could not write Item reader as bean in batch config as i need to call before step in item reader to access stepexecution.
If the Item reader read() function return type is changed to jdbccursorItemReader type, it throws type exception in Step reader() .
Let me know what am i missing or any other code snippet is required .
You are creating a JdbcCursorItemReader instance in the read method of AllocationReader. This is not correct. The code of this method should be the implementation of the actual read operation, and not for creating an item reader.
I could not write Item reader as bean in batch config as i need to call before step in item reader to access step execution.
For this use case, you can define the reader as a step-scoped bean and inject attributes from the job execution context as needed. This is explained in the reference documentation here: Late Binding of Job and Step Attributes. In your case, the reader could be defined like this:
#Bean
#StepScoped
public JdbcCursorItemReader<Allocation> itemReader(#Value("#{jobExecutionContext['block']}") String block) {
// use "block" as needed to define the reader
JdbcCursorItemReader<Allocation> reader = new JdbcCursorItemReader<Allocation>();
reader.setSql("select * from blocks where block_ref = " + block);
reader.setDataSource(this.ds);
reader.setRowMapper(new AllocationMapper());
return reader;
}
When you define an item reader bean that is an ItemStream, you need to make the bean definition method return at least ItemStreamReader (or the actual implementation type), so that Spring Batch correctly scopes the bean and calls open/update/close appropriately during the step. Otherwise, the open method will not be called and therefore you will get that ReaderNotOpenException.

Why is exception in Spring Batch AsycItemProcessor caught by SkipListener's onSkipInWrite method?

I'm writing a Spring Boot application that starts up, gathers and converts millions of database entries into a new streamlined JSON format, and then sends them all to a GCP PubSub topic. I'm attempting to use Spring Batch for this, but I'm running into trouble implementing fault tolerance for my process. The database is rife with data quality issues, and sometimes my conversions to JSON will fail. When failures occur, I don't want the job to immediately quit, I want it to continue processing as many records as it can and, before completion, to report which exact records failed so that I, and or my team, can examine these problematic database entries.
To achieve this, I've attempted to use Spring Batch's SkipListener interface. But I'm also using an AsyncItemProcessor and an AsyncItemWriter in my process, and even though the exceptions are occurring during the processing, the SkipListener's onSkipInWrite() method is catching them - rather than the onSkipInProcess() method. And unfortunately, the onSkipInWrite() method doesn't have access to the original database entity, so I can't store its ID in my list of problematic DB entries.
Have I misconfigured something? Is there any other way to gain access to the objects from the reader that failed the processing step of an AsynItemProcessor?
Here's what I've tried...
I have a singleton Spring Component where I store how many DB entries I've successfully processed along with up to 20 problematic database entries.
#Component
#Getter //lombok
public class ProcessStatus {
private int processed;
private int failureCount;
private final List<UnexpectedFailure> unexpectedFailures = new ArrayList<>();
public void incrementProgress { processed++; }
public void logUnexpectedFailure(UnexpectedFailure failure) {
failureCount++;
unexpectedFailure.add(failure);
}
#Getter
#AllArgsConstructor
public static class UnexpectedFailure {
private Throwable error;
private DBProjection dbData;
}
}
I have a Spring batch Skip Listener that's supposed to catch failures and update my status component accordingly:
#AllArgsConstructor
public class ConversionSkipListener implements SkipListener<DBProjection, Future<JsonMessage>> {
private ProcessStatus processStatus;
#Override
public void onSkipInRead(Throwable error) {}
#Override
public void onSkipInProcess(DBProjection dbData, Throwable error) {
processStatus.logUnexpectedFailure(new ProcessStatus.UnexpectedFailure(error, dbData));
}
#Override
public void onSkipInWrite(Future<JsonMessage> messageFuture, Throwable error) {
//This is getting called instead!! Even though the exception happened during processing :(
//But I have no access to the original DBProjection data here, and messageFuture.get() gives me null.
}
}
And then I've configured my job like this:
#Configuration
public class ConversionBatchJobConfig {
#Autowired
private JobBuilderFactory jobBuilderFactory;
#Autowired
private StepBuilderFactory stepBuilderFactory;
#Autowired
private TaskExecutor processThreadPool;
#Bean
public SimpleCompletionPolicy processChunkSize(#Value("${commit.chunk.size:100}") Integer chunkSize) {
return new SimpleCompletionPolicy(chunkSize);
}
#Bean
#StepScope
public ItemStreamReader<DbProjection> dbReader(
MyDomainRepository myDomainRepository,
#Value("#{jobParameters[pageSize]}") Integer pageSize,
#Value("#{jobParameters[limit]}") Integer limit) {
RepositoryItemReader<DbProjection> myDomainRepositoryReader = new RepositoryItemReader<>();
myDomainRepositoryReader.setRepository(myDomainRepository);
myDomainRepositoryReader.setMethodName("findActiveDbDomains"); //A native query
myDomainRepositoryReader.setArguments(new ArrayList<Object>() {{
add("ACTIVE");
}});
myDomainRepositoryReader.setSort(new HashMap<String, Sort.Direction>() {{
put("update_date", Sort.Direction.ASC);
}});
myDomainRepositoryReader.setPageSize(pageSize);
myDomainRepositoryReader.setMaxItemCount(limit);
// myDomainRepositoryReader.setSaveState(false); <== haven't figured out what this does yet
return myDomainRepositoryReader;
}
#Bean
#StepScope
public ItemProcessor<DbProjection, JsonMessage> dataConverter(DataRetrievalSerivice dataRetrievalService) {
//Sometimes throws exceptions when DB data is exceptionally weird, bad, or missing
return new DbProjectionToJsonMessageConverter(dataRetrievalService);
}
#Bean
#StepScope
public AsyncItemProcessor<DbProjection, JsonMessage> asyncDataConverter(
ItemProcessor<DbProjection, JsonMessage> dataConverter) throws Exception {
AsyncItemProcessor<DbProjection, JsonMessage> asyncDataConverter = new AsyncItemProcessor<>();
asyncDataConverter.setDelegate(dataConverter);
asyncDataConverter.setTaskExecutor(processThreadPool);
asyncDataConverter.afterPropertiesSet();
return asyncDataConverter;
}
#Bean
#StepScope
public ItemWriter<JsonMessage> jsonPublisher(GcpPubsubPublisherService publisherService) {
return new JsonMessageWriter(publisherService);
}
#Bean
#StepScope
public AsyncItemWriter<JsonMessage> asyncJsonPublisher(ItemWriter<JsonMessage> jsonPublisher) throws Exception {
AsyncItemWriter<JsonMessage> asyncJsonPublisher = new AsyncItemWriter<>();
asyncJsonPublisher.setDelegate(jsonPublisher);
asyncJsonPublisher.afterPropertiesSet();
return asyncJsonPublisher;
}
#Bean
public Step conversionProcess(SimpleCompletionPolicy processChunkSize,
ItemStreamReader<DbProjection> dbReader,
AsyncItemProcessor<DbProjection, JsonMessage> asyncDataConverter,
AsyncItemWriter<JsonMessage> asyncJsonPublisher,
ProcessStatus processStatus,
#Value("${conversion.failure.limit:20}") int maximumFailures) {
return stepBuilderFactory.get("conversionProcess")
.<DbProjection, Future<JsonMessage>>chunk(processChunkSize)
.reader(dbReader)
.processor(asyncDataConverter)
.writer(asyncJsonPublisher)
.faultTolerant()
.skipPolicy(new MyCustomConversionSkipPolicy(maximumFailures))
// ^ for now this returns true for everything until 20 failures
.listener(new ConversionSkipListener(processStatus))
.build();
}
#Bean
public Job conversionJob(Step conversionProcess) {
return jobBuilderFactory.get("conversionJob")
.start(conversionProcess)
.build();
}
}
This is because the future wrapped by the AsyncItemProcessor is only unwrapped in the AsyncItemWriter, so any exception that might occur at that time is seen as a write exception instead of a processing exception. That's why onSkipInWrite is called instead of onSkipInProcess.
This is actually a known limitation of this pattern which is documented in the Javadoc of the AsyncItemProcessor, here is an excerpt:
Because the Future is typically unwrapped in the ItemWriter,
there are lifecycle and stats limitations (since the framework doesn't know
what the result of the processor is).
While not an exhaustive list, things like StepExecution.filterCount will not
reflect the number of filtered items and
itemProcessListener.onProcessError(Object, Exception) will not be called.
The Javadoc states that the list is not exhaustive, and the side-effect regarding the SkipListener that you are experiencing is one these limitations.

To separate steps class in spring batch

I have tried to find the solution but I cannot... ㅠㅠ
I want to separate steps in a job like below.
step1.class -> step2.class -> step3.class -> done
The reason why I'm so divided is that I have to use queries each step.
#Bean
public Job bundleJob() {
return jobBuilderFactory.get(JOB_NAME)
.start(step1) // bean
.next(step2) // bean
.next(step3()) // and here is the code ex) reader, processor, writer
.build();
}
my purpose is that I have to use the return data in step1, step2.
but jpaItemReader is like async ... so it doesn't process like above order.
debug flow like this.
readerStep1 -> writerStep1 -> readerStep2 -> readerWriter2 -> readerStep3 -> writerStep3
and
-> processorStep1 -> processorStep2 -> processorStep3
that is the big problem to me...
How can I wait each step in a job? Including querying.
aha! I got it.
the point is the creating beans in a configuration.
I wrote annotation bean all kinds of steps so that those are created by spring.
the solution is late binding like #JobScope or #StepScope
#Bean
#StepScope. // late creating bean.
public ListItemReader<Dto> itemReader() {
// business logic
return new ListItemReader<>(dto);
}
To have a separate steps in your job you can use a Flow with a TaskletStep. Sharing a snippet for your reference,
#Bean
public Job processJob() throws Exception {
Flow fetchData = (Flow) new FlowBuilder<>("fetchData")
.start(fetchDataStep()).build();
Flow transformData = (Flow) new FlowBuilder<>("transformData")
.start(transformData()).build();
Job job = jobBuilderFactory.get("processTenantLifeCycleJob").incrementer(new RunIdIncrementer())
.start(fetchData).next(transformData).next(processData()).end()
.listener(jobCompletionListener()).build();
ReferenceJobFactory referenceJobFactory = new ReferenceJobFactory(job);
registry.register(referenceJobFactory);
return job;
}
#Bean
public TaskletStep fetchDataStep() {
return stepBuilderFactory.get("fetchData")
.tasklet(fetchDataValue()).listener(fetchDataStepListener()).build();
}
#Bean
#StepScope
public FetchDataValue fetchDataValue() {
return new FetchDataValue();
}
#Bean
public TaskletStep transformDataStep() {
return stepBuilderFactory.get("transformData")
.tasklet(transformValue()).listener(sendReportDataCompletionListener()).build();
}
#Bean
#StepScope
public TransformValue transformValue() {
return new TransformValue();
}
#Bean
public Step processData() {
return stepBuilderFactory.get("processData").<String, Data>chunk(chunkSize)
.reader(processDataReader()).processor(dataProcessor()).writer(processDataWriter())
.listener(processDataListener())
.taskExecutor(backupTaskExecutor()).build();
}
In this example I have used 2 Flows to Fetch and Transform data which will execute data from a class.
In order to return the value of those from the step 1 and 2, you can store the value in the job context and retrieve that in the ProcessData Step which has a reader, processor and writer.

Spring Batch Annotated No XML Pass Parameters to Item Readere

I created a simple Boot/Spring Batch 3.0.8.RELEASE job. I created a simple class that implements JobParametersIncrementer to go to the database, look up how many days the query should look for and puts those into the JobParameters object.
I need that value in my JdbcCursorItemReader, as it is selecting data based upon one of the looked up JobParameters, but I cannot figure this out via Java annotations. XML examples plenty, not so much for Java.
Below is my BatchConfiguration class that runs job.
`
#Autowired
SendJobParms jobParms; // this guy queries DB and puts data into JobParameters
#Bean
public Job job(#Qualifier("step1") Step step1, #Qualifier("step2") Step step2) {
return jobs.get("DW_Send").incrementer(jobParms).start(step1).next(step2).build();
}
#Bean
protected Step step2(ItemReader<McsendRequest> reader,
ItemWriter<McsendRequest> writer) {
return steps.get("step2")
.<McsendRequest, McsendRequest> chunk(5000)
.reader(reader)
.writer(writer)
.build();
}
#Bean
public JdbcCursorItemReader reader() {
JdbcCursorItemReader<McsendRequest> itemReader = new JdbcCursorItemReader<McsendRequest>();
itemReader.setDataSource(dataSource);
// want to get access to JobParameter here so I can pull values out for my sql query.
itemReader.setSql("select xxxx where rownum <= JobParameter.getCount()");
itemReader.setRowMapper(new McsendRequestMapper());
return itemReader;
}
`
Change reader definition as follows (example for parameter of type Long and name paramCount):
#Bean
#StepScope
public JdbcCursorItemReader reader(#Value("#{jobParameters[paramCount]}") Long paramCount) {
JdbcCursorItemReader<McsendRequest> itemReader = new JdbcCursorItemReader<McsendRequest>();
itemReader.setDataSource(dataSource);
itemReader.setSql("select xxxx where rownum <= ?");
ListPreparedStatementSetter listPreparedStatementSetter = new ListPreparedStatementSetter();
listPreparedStatementSetter.setParameters(Arrays.asList(paramCount));
itemReader.setPreparedStatementSetter(listPreparedStatementSetter);
itemReader.setRowMapper(new McsendRequestMapper());
return itemReader;
}

How to make writer initiated for every job instance in spring batch

I am writing a spring batch job. I am implementing custom writer using KafkaClientWriter extends AbstractItemStreamItemWriter<ProducerMessage>
I have fields which need to be unique for each instance. But I could see this class initiated only once. Rest jobs have same instance of writer class.
Where as my custom readers and processors are getting initiated for each job.
Below is my job configurations. How can I achieve the same behavior for writer as well?
#Bean
#Scope("job")
public ZipMultiResourceItemReader reader(#Value("#{jobParameters[fileName]}") String fileName, #Value("#{jobParameters[s3SourceFolderPrefix]}") String s3SourceFolderPrefix, #Value("#{jobParameters[timeStamp]}") long timeStamp, com.fastretailing.catalogPlatformSCMProducer.service.ConfigurationService confService) {
FlatFileItemReader faltFileReader = new FlatFileItemReader();
ZipMultiResourceItemReader zipReader = new ZipMultiResourceItemReader();
Resource[] resArray = new Resource[1];
resArray[0] = new FileSystemResource(new File(fileName));
zipReader.setArchives(resArray);
DefaultLineMapper<ProducerMessage> lineMapper = new DefaultLineMapper<ProducerMessage>();
lineMapper.setLineTokenizer(new DelimitedLineTokenizer());
CSVFieldMapper csvFieldMapper = new CSVFieldMapper(fileName, s3SourceFolderPrefix, timeStamp, confService);
lineMapper.setFieldSetMapper(csvFieldMapper);
faltFileReader.setLineMapper(lineMapper);
zipReader.setDelegate(faltFileReader);
return zipReader;
}
#Bean
#Scope("job")
public ItemProcessor<ProducerMessage, ProducerMessage> processor(#Value("#{jobParameters[timeStamp]}") long timeStamp) {
ProducerProcessor processor = new ProducerProcessor();
processor.setS3FileTimeStamp(timeStamp);
return processor;
}
#Bean
#ConfigurationProperties
public ItemWriter<ProducerMessage> writer() {
return new KafkaClientWriter();
}
#Bean
public Step step1(StepBuilderFactory stepBuilderFactory,
ItemReader reader, ItemWriter writer,
ItemProcessor processor, #Value("${reader.chunkSize}")
int chunkSize) {
LOGGER.info("Step configuration loaded with chunk size {}", chunkSize);
return stepBuilderFactory.get("step1")
.chunk(chunkSize).reader(reader)
.processor(processor).writer(writer)
.build();
}
#Bean
public StepScope stepScope() {
final StepScope stepScope = new StepScope();
stepScope.setAutoProxy(true);
return stepScope;
}
#Bean
public JobScope jobScope() {
final JobScope jobScope = new JobScope();
return jobScope;
}
#Bean
public Configuration configuration() {
return new Configuration();
}
I tried making the writer with job scope. But in that case open is not getting called. This is where I am doing some initializations.
When using java based configuration and a scoped proxy what happens is that the return type of the method is detected and for that a proxy is created. So when you return ItemWriter you will get a JDK proxy only implementing ItemWriter, whereas your open method is on the ItemStream interface. Because that interface isn't included on the proxy there is no way to call the method.
Either change the return type to KafkaClientWriter or ItemStreamWriter< ProducerMessage> (assuming the KafkaCLientWriter implements that method). Next add #Scope("job") and you should have your open method called again with a properly scoped writer.

Resources