Spring Batch tasklet configuration

My batch job takes 15 different files as input. The first step in the job is a validation step, which validates the header and the total number of records; the next step is the chunk-based tasklet.
For the validation, my tasklet currently has different if-else branches to handle the different validations per file, keyed on the file name and extension that are passed in as job parameters. What I need to know is whether the tasklet reference for my validation step can be decided at runtime, i.e. I would have bean references for the validators of all 15 files, and based on the file extension in the job parameters the tasklet ref should be resolved at runtime.
<batch:job id="downloadFile">
    <batch:step id="validator">
        <batch:tasklet ref="InteropValidator"/>
        <batch:next on="FAILED" to="AckStep"/>
        <batch:next on="COMPLETED" to="loadFiles"/>
    </batch:step>
    <batch:step id="loadFiles" next="AckStep">
        <partition step="slave" partitioner="partitionerFactory">
            <handler grid-size="1" task-executor="taskExecutor"/>
        </partition>
    </batch:step>
    <batch:step id="AckStep">
        <batch:tasklet ref="Acksteptasklet"/>
        <batch:fail on="FAILED"/>
    </batch:step>
</batch:job>
In InteropValidator.java, I have implemented the Tasklet interface and written a snippet like the one below:
if("ICTX".equals(FilenameUtils.getExtension(filename)))
{
if(fileValidated && detailValid && agencyValid )
{
cc.getStepContext().getStepExecution().getJobExecution().getExecutionContext().put("STATUSCODE","01");
sc.setExitStatus(ExitStatus.COMPLETED);
}
}
if("SML".equals(FilenameUtils.getExtension(filename)))
{
//Validations for SML File
}
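A minimal sketch of one way to get this effect (the class name, the validators map, and the "fileName" job parameter below are all assumed names, not from the original post): keep a single tasklet ref in the XML, but make it a delegating tasklet that looks up the right validator at runtime based on the file extension in the job parameters.
import java.util.Map;
import org.apache.commons.io.FilenameUtils;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

public class DelegatingValidatorTasklet implements Tasklet {

    // Map of file extension -> validator tasklet, e.g. "ICTX" -> ictxValidator,
    // wired from XML via <property name="validators"><map>...</map></property>
    private Map<String, Tasklet> validators;

    public void setValidators(Map<String, Tasklet> validators) {
        this.validators = validators;
    }

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        // "fileName" is an assumed job parameter name
        String fileName = (String) chunkContext.getStepContext().getJobParameters().get("fileName");
        Tasklet delegate = validators.get(FilenameUtils.getExtension(fileName));
        if (delegate == null) {
            throw new IllegalStateException("No validator registered for file: " + fileName);
        }
        // run the matching validator, letting it set the exit status itself
        return delegate.execute(contribution, chunkContext);
    }
}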

Related

Get jobExecutionContext in xml config spring batch from before step

I am defining my MultiResourceItemReader this way:
<bean id="multiDataItemReader" class="org.springframework.batch.item.file.MultiResourceItemReader" scope="step">
    <property name="resources" value="#{jobExecutionContext['filesResource']}"/>
    <property name="delegate" ref="dataItemReader"/>
</bean>
As you can see, I want to read the "filesResource" value from the jobExecutionContext.
Note: I changed some names for code privacy. This executes; if somebody wants more info, please tell me.
I am saving this value in my first step and using the reader in the second step. Should I have access to it?
I am saving it in the final lines of my step1 tasklet:
ExecutionContext jobContext = context.getStepContext().getStepExecution().getJobExecution().getExecutionContext();
jobContext.put("filesResource", resourceString);
<batch:job id="myJob">
    <batch:step id="step1" next="step2">
        <batch:tasklet ref="moveFilesFromTasklet"/>
    </batch:step>
    <batch:step id="step2">
        <tasklet>
            <chunk commit-interval="500"
                   reader="multiDataItemReader"
                   processor="dataItemProcessor"
                   writer="dataItemWriter"/>
        </tasklet>
    </batch:step>
</batch:job>
I am not really sure what I am missing to get the value. The error I am getting is:
20190714 19:49:08.120 WARN org.springframework.batch.item.file.MultiResourceItemReader [[ # ]] - No resources to read. Set strict=true if this should be an error condition.
I see nothing wrong with your config. The value of resourceString should be an array of org.springframework.core.io.Resource, as this is the parameter type of the resources attribute of MultiResourceItemReader.
You can also pass an array or a list of Strings with the absolute path to each resource, and it should work. Here is a quick example:
class MyTasklet implements Tasklet {

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
        List<String> resources = Arrays.asList(
                "/full/path/to/resource1",
                "/full/path/to/resource2");
        chunkContext.getStepContext().getStepExecution().getJobExecution().getExecutionContext()
                .put("filesResource", resources);
        return RepeatStatus.FINISHED;
    }
}

How to skip reader, writer in spring batch

I have a requirement where I need to upload some files to a server, and I am using Spring Batch to accomplish this. The "initializeFile" step interacts with the server to check whether the files already exist there; if not, the "uploadIndexFileStep" step should upload them. If the files are already present on the server, "uploadIndexFileStep" SHOULDN'T be called.
How do I implement this case, where if "initializeFile" has no files to upload, Spring should not call the next step "uploadIndexFileStep"?
Is there a way, do I need to follow some design, or is it a Spring config change? Any pointers would be helpful.
Following is the batch configuration.
<batch:job id="uploadIndexFileJob">
    <batch:step id="initFileStep" next="uploadIndexFileStep">
        <batch:tasklet ref="initializeFile"/>
    </batch:step>
    <batch:step id="uploadIndexFileStep">
        <batch:tasklet>
            <batch:chunk reader="indexFileReader" processor="indexFileProcessor" writer="indexFileWriter" commit-interval="${app.chunk.commit.interval}"/>
        </batch:tasklet>
    </batch:step>
    <batch:listeners>
        <batch:listener ref="uploadIndexJobListener"/>
    </batch:listeners>
</batch:job>
Spring Batch provides a nice way to handle conditional flow. You can implement this by using the on() exit status.
You can have something like the below:
@Bean
public Job job() {
    return jobBuilderFactory().get("job")
            .flow(initializeFile()).on("FILELOADED").to(anyStep())
            .from(initializeFile()).on("FILENOTLOADED").to(uploadIndexFileStep()).next(step3()).next(step4())
            .end().build();
}
5.3.2 Conditional Flow
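Note that the custom FILELOADED/FILENOTLOADED statuses above are not produced by Spring Batch itself; the initializeFile step would have to expose them, typically via a StepExecutionListener. A minimal sketch, assuming (hypothetically) that the tasklet records the server-check result under a FILES_ALREADY_LOADED key in the step's execution context:
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.StepExecutionListener;

public class FileCheckListener implements StepExecutionListener {

    @Override
    public void beforeStep(StepExecution stepExecution) {
        // nothing to do before the step
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        // FILES_ALREADY_LOADED is an assumed flag set by the initializeFile tasklet
        boolean alreadyLoaded = stepExecution.getExecutionContext().containsKey("FILES_ALREADY_LOADED");
        return alreadyLoaded ? new ExitStatus("FILELOADED") : new ExitStatus("FILENOTLOADED");
    }
}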
I resolved this using a JobExecutionDecider. I am maintaining the queue size in the ExecutionContext and then reading that execution context in the decider to manage the flow.
public class UploadIndexFlowDecider implements JobExecutionDecider {

    @Override
    public FlowExecutionStatus decide(JobExecution jobExecution, StepExecution stepExecution) {
        int queueSize = jobExecution.getExecutionContext().getInt("INDEX_UPLOAD_QUEUE_SIZE");
        if (queueSize > 0)
            return FlowExecutionStatus.COMPLETED;
        else
            return FlowExecutionStatus.STOPPED;
    }
}
@Component
public class InitializeFileStep implements Tasklet {

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        chunkContext.getStepContext().getStepExecution().getJobExecution().getExecutionContext().putInt("INDEX_UPLOAD_QUEUE_SIZE", 1);
        return RepeatStatus.FINISHED;
    }
}
<batch:job id="uploadIndexFileJob">
    <batch:step id="initFileStep" next="uploadDecision">
        <batch:tasklet ref="initializeFile"/>
    </batch:step>
    <batch:decision id="uploadDecision" decider="uploadIndexDecision">
        <batch:next on="COMPLETED" to="uploadIndexFileStep"/>
        <batch:end on="STOPPED"/>
    </batch:decision>
    <batch:step id="uploadIndexFileStep">
        <batch:tasklet>
            <batch:chunk reader="indexFileReader" processor="indexFileProcessor" writer="indexFileWriter" commit-interval="${app.chunk.commit.interval}"/>
        </batch:tasklet>
    </batch:step>
    <batch:listeners>
        <batch:listener ref="uploadIndexJobListener"/>
    </batch:listeners>
</batch:job>

I want my Processor to get access to JobExecutionId

Spring 4.3 with Spring Batch 3.0.8.
I want to have a reference to the job execution id in the processor, so I can put it inside the output object and write it out along with the data to the db. Here is my setup below.
I have added the blueReportJobExecutionListener, which gives me the JobExecution id that I need... but how do I send that over to my blueReportItemProcessor? That's the object that needs the value.
<bean id="blueReportJobExecutionListener" class="com.cloud.cost.listener.BlueReportJobExecutionListener" scope="prototype"/>
<bean id="blueReportJobListener" class="com.cloud.cost.listener.BlueReportJobListener" scope="prototype"/>
<bean id="blueReportStepListener" class="com.cloud.cost.listener.BlueReportStepListener" scope="prototype"/>

<batch:job id="blueReportJob">
    <batch:step id="blueReportStep">
        <batch:tasklet>
            <batch:chunk reader="blueReportCSVFileItemReader" processor="blueReportItemProcessor" writer="mysqlItemWriter"
                         commit-interval="2">
            </batch:chunk>
        </batch:tasklet>
        <batch:listeners>
            <batch:listener ref="blueReportStepListener"/>
        </batch:listeners>
    </batch:step>
    <batch:listeners>
        <batch:listener ref="blueReportJobListener"/>
        <batch:listener ref="blueReportJobExecutionListener"/>
    </batch:listeners>
</batch:job>
You can get the value from the job execution simply by using the @Value annotation:
@Value("#{jobExecutionContext['JOB_ID']}")
where JOB_ID is the key you used in the listener to add the job id.
Make sure your processor's scope is defined as step, otherwise this value will not be autowired.
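A minimal sketch of such a step-scoped processor (the item type and the "jobExecutionId" output key are assumptions; with XML config you would declare the bean with scope="step" instead of the annotations):
import java.util.Map;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

@Component("blueReportItemProcessor")
@StepScope  // required so the late-binding @Value expression can be resolved
public class BlueReportItemProcessor implements ItemProcessor<Map<String, Object>, Map<String, Object>> {

    // injected once the step (and thus the job execution) exists
    @Value("#{jobExecutionContext['JOB_ID']}")
    private Long jobExecutionId;

    @Override
    public Map<String, Object> process(Map<String, Object> item) {
        // carry the job execution id along so the writer can persist it
        item.put("jobExecutionId", jobExecutionId);
        return item;
    }
}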

How to make execution contexts in a Spring Batch partitioner run in sequence

I have a requirement where first I have to select a number of master records from a table, and then for each master record fetch its child rows, then process and write them chunk-wise.
To do this I used a Partitioner in Spring Batch and created master and slave steps. The code works fine as long as I don't need to run the slave step in the same sequence in which the execution contexts were added.
But my requirement is to run the slave step for each execution context in the same sequence in which it was added in the partitioner, because until I process the parent record I cannot process its child records.
With the partitioner, the slave step does not run in that sequence. Please help me understand how to maintain the same sequence for the slave step runs.
Is there any other way to achieve this using Spring Batch? Any help is welcome.
<job id="EPICSDBJob" xmlns="http://www.springframework.org/schema/batch">
    <!-- Create Order Master Start -->
    <step id="populateNewOrdersMasterStep" allow-start-if-complete="false"
          next="populateLineItemMasterStep">
        <partition step="populateNewOrders" partitioner="pdcReadPartitioner">
            <handler grid-size="1" task-executor="taskExecutor"/>
        </partition>
        <batch:listeners>
            <batch:listener ref="partitionerStepListner"/>
        </batch:listeners>
    </step>
    <!-- Create Order Master End -->
    <listeners>
        <listener ref="epicsPimsJobListner"/>
    </listeners>
</job>
<step id="populateNewOrders" xmlns="http://www.springframework.org/schema/batch">
    <tasklet allow-start-if-complete="true">
        <chunk reader="epicsDBReader" processor="epicsPimsProcessor"
               writer="pimsWriter" commit-interval="10">
        </chunk>
    </tasklet>
    <batch:listeners>
        <batch:listener ref="stepJobListner"/>
    </batch:listeners>
</step>

<bean id="epicsDBReader" class="com.cat.epics.sf.batch.reader.EPICSDBReader" scope="step">
    <property name="sfObjName" value="#{stepExecutionContext[sfParentObjNm]}"/>
    <property name="readChunkCount" value="10"/>
    <property name="readerDao" ref="readerDao"/>
    <property name="configDao" ref="configDao"/>
    <property name="dBReaderService" ref="dBReaderService"/>
</bean>
Partitioner method:
@Override
public Map<String, ExecutionContext> partition(int arg0) {
    Map<String, ExecutionContext> result = new LinkedHashMap<String, ExecutionContext>();
    List<String> sfMappingObjectNames = configDao.getSFMappingObjNames();
    int i = 1;
    for (String sfMappingObjectName : sfMappingObjectNames) {
        ExecutionContext value = new ExecutionContext();
        value.putString("sfParentObjNm", sfMappingObjectName);
        result.put("partition:" + i, value);
        i++;
    }
    return result;
}
There isn't a way to guarantee order within Spring Batch's partitioning model. The fact that the partitions are executed in parallel means that, by definition, there will be no ordering to the records processed. I think this is a case where restructuring the job a bit may help.
If your requirement is to execute the parent then execute the children, using a driving query pattern along with the partitioning would work. You'd partition along the parent records (which it looks like you're doing), then in the worker step, you'd use the parent record to drive queries and processing for the children records. That would guarantee that the child records are processed after the master one.
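A rough sketch of how that worker-side reader could look (ChildDao and its method are hypothetical; only the sfParentObjNm key comes from the partitioner above): the parent key arrives via the stepExecutionContext, and the reader lazily loads that parent's children, so each child is always handled after its own parent.
import java.util.Iterator;
import java.util.List;
import org.springframework.batch.item.ItemReader;
import org.springframework.beans.factory.annotation.Value;

public class ChildRecordReader implements ItemReader<String> {

    // hypothetical DAO that fetches the child rows of one parent
    interface ChildDao {
        List<String> findChildrenOf(String parentName);
    }

    // bound from the partition's context; requires the bean to be step-scoped
    @Value("#{stepExecutionContext['sfParentObjNm']}")
    private String parentName;

    private final ChildDao childDao;
    private Iterator<String> children;

    public ChildRecordReader(ChildDao childDao) {
        this.childDao = childDao;
    }

    @Override
    public String read() {
        if (children == null) {
            // the parent record drives the query for its children
            children = childDao.findChildrenOf(parentName).iterator();
        }
        return children.hasNext() ? children.next() : null; // null ends the step
    }
}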

What is the best approach for Spring Batch: annotations or XML files?

Firstly, thanks for your attention. In my Spring Batch project I have defined many jobs, for example:
<batch:job id="helloWorldJob1" job-repository="jobRepository">
    <batch:step id="step1">
        <batch:tasklet>
            <batch:chunk reader="itemReader1" writer="itemWriter1"
                         processor="itemProcessor1">
            </batch:chunk>
        </batch:tasklet>
    </batch:step>
</batch:job>
<batch:job id="helloWorldJob2" job-repository="jobRepository">
    <batch:step id="step1">
        <batch:tasklet>
            <batch:chunk reader="itemReader2" writer="itemWriter2"
                         processor="itemProcessor2">
            </batch:chunk>
        </batch:tasklet>
    </batch:step>
</batch:job>
<batch:job id="helloWorldJob3" job-repository="jobRepository">
    <batch:step id="step1">
        <batch:tasklet>
            <batch:chunk reader="itemReader3" writer="itemWriter3"
                         processor="itemProcessor3">
            </batch:chunk>
        </batch:tasklet>
    </batch:step>
</batch:job>
...
How do I use pure annotations? Is that the right approach?
Basically, a good start is the official Spring Batch documentation. The only thing to note here is that the example has one job, which runs when you do
mvn spring-boot:run. BatchConfiguration is an example of what pure Java configuration can look like.
On our project we created the main configuration like this:
@Configuration
@EnableBatchProcessing(modular = true)
public class JobContextConfig {

    @Autowired
    private JobRepository jobRepository;

    @Bean
    public ApplicationContextFactory helloWorldJob1Job() {
        return new GenericApplicationContextFactory(HelloWorldJob1JobConfig.class);
    }

    @Bean
    public ApplicationContextFactory helloWorldJob2Job() {
        return new GenericApplicationContextFactory(HelloWorldJob2JobConfig.class);
    }

    @Bean
    public ApplicationContextFactory helloWorldJob3Job() {
        return new GenericApplicationContextFactory(HelloWorldJob3JobConfig.class);
    }
}
And we have a separate configuration, in a separate context, for each of the jobs. HelloWorldJob1JobConfig would hold everything that job needs, and the class would look like BatchConfiguration from the Spring example. This creates everything you need except triggering, so we created launchers for each job (we launch some jobs over HTTP, some with messaging, and some manually, so we needed to start the jobs defined this way with a JobLauncher).
Another good resource for integrating Spring Batch with Spring Boot using pure Java configuration is the Spring Batch Boot web starter.
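For illustration, a per-job configuration class like the HelloWorldJob1JobConfig mentioned above could look roughly like this (the item types and the chunk size are assumptions; the reader/processor/writer beans are assumed to be defined alongside):
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class HelloWorldJob1JobConfig {

    @Bean
    public Job helloWorldJob1(JobBuilderFactory jobs, Step step1) {
        return jobs.get("helloWorldJob1").start(step1).build();
    }

    @Bean
    public Step step1(StepBuilderFactory steps,
                      ItemReader<String> itemReader1,
                      ItemProcessor<String, String> itemProcessor1,
                      ItemWriter<String> itemWriter1) {
        // chunk-oriented step mirroring the XML <batch:chunk> definition
        return steps.get("step1")
                .<String, String>chunk(10)
                .reader(itemReader1)
                .processor(itemProcessor1)
                .writer(itemWriter1)
                .build();
    }
}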
