Can we write a Spring Batch job without ItemReader and ItemWriter?

In my project, I have written a Quartz scheduler with Spring Batch 2.2.
As per my requirement, I want to run a scheduled job that fetches the application config properties and refreshes the configuration cache on all the GlassFish clusters.
So I don't need the ItemReader and ItemWriter, which are meant for file read/write operations.
Can I remove the ItemReader and ItemWriter from my job configuration?
The configuration of my job is shown below:
<batch:job id="reportJob">
<batch:step id="step1">
<batch:tasklet>
<!--I want to remove ItemReader and ItemWriter as its not used -->
<batch:chunk reader="ItemReader" writer="ItemWriter"
commit-interval="10">
</batch:chunk>
</batch:tasklet>
</batch:step>
<batch:listeners>
<batch:listener ref="simpleListener"/>
</batch:listeners>
</batch:job>
<bean id="jobDetail" class="org.springframework.scheduling.quartz.JobDetailBean">
<!-- Cache Refresh code is written here : JobLauncherDetails.java file -->
<property name="jobClass" value="com.mkyong.quartz.JobLauncherDetails" />
<property name="group" value="quartz-batch" />
<property name="jobDataAsMap">
<map>
<entry key="jobName" value="reportJob" />
<entry key="jobLocator" value-ref="jobRegistry" />
<entry key="jobLauncher" value-ref="jobLauncher" />
<entry key="param1" value="mkyong1" />
<entry key="param2" value="mkyong2" />
</map>
</property>
</bean>
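For context, the jobDataAsMap entries above imply a JobLauncherDetails that bridges Quartz to Spring Batch. Below is a minimal sketch of that bridge, assuming the usual QuartzJobBean pattern; the actual file also holds the cache-refresh logic, represented here by a hypothetical refreshConfigurationCache() call:
import org.quartz.JobDataMap;
import org.quartz.JobExecutionContext;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.configuration.JobLocator;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.scheduling.quartz.QuartzJobBean;

public class JobLauncherDetails extends QuartzJobBean {

    @Override
    protected void executeInternal(JobExecutionContext context) {
        JobDataMap map = context.getMergedJobDataMap();
        JobLocator jobLocator = (JobLocator) map.get("jobLocator");
        JobLauncher jobLauncher = (JobLauncher) map.get("jobLauncher");
        String jobName = (String) map.get("jobName");
        try {
            refreshConfigurationCache(); // hypothetical: stands in for the poster's cache-refresh code
            // Look up the Spring Batch job by name and launch it; real implementations usually
            // build JobParameters (e.g. from the jobDataAsMap entries) so each run is a new instance
            jobLauncher.run(jobLocator.getJob(jobName), new JobParameters());
        } catch (Exception e) {
            throw new RuntimeException("Failed to launch job " + jobName, e);
        }
    }

    private void refreshConfigurationCache() {
        // ...
    }
}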
I am writing my business logic to refresh the cache in the job class JobLauncherDetails.java.
So is it possible to remove the ItemReader and ItemWriter? Is there any alternative way?

Use a Tasklet
<job id="reportJob">
<step id="step1">
<tasklet ref="MyTaskletBean" />
</step>
<!-- Other config... -->
</job>
class MyTasklet implements Tasklet {
    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        // Business logic (e.g. the cache refresh) goes here
        return RepeatStatus.FINISHED;
    }
}
You can read more about tasklets in chapter 5.2 of the official documentation.
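Note that returning RepeatStatus.FINISHED ends the step after a single invocation, while RepeatStatus.CONTINUABLE makes Spring Batch call execute() again. For completeness, the tasklet also has to be registered as a bean so that ref="MyTaskletBean" resolves; assuming MyTasklet lives in a package such as com.mkyong.quartz (hypothetical), the registration would look like:
<bean id="MyTaskletBean" class="com.mkyong.quartz.MyTasklet" />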

Related

Spring Batch: is this a tasklet or chunk?

I'm a little bit confused!
Spring Batch provides two different ways for implementing a job: using tasklets and chunks.
So, when I have this:
<tasklet>
<chunk
reader = 'itemReader'
processor = 'itemProcessor'
writer = 'itemWriter'
/>
</tasklet>
What kind of implementation is this? Tasklet? Chunk?
That's a chunk-oriented step, because inside the <tasklet> element there is a <chunk> element defining a reader, a writer, and/or a processor.
Below is an example of a job executing first a chunk and second a tasklet step:
<job id="readMultiFileJob" xmlns="http://www.springframework.org/schema/batch">
<step id="step1" next="deleteDir">
<tasklet>
<chunk reader="multiResourceReader" writer="flatFileItemWriter"
commit-interval="1" />
</tasklet>
</step>
<step id="deleteDir">
<tasklet ref="fileDeletingTasklet" />
</step>
</job>
<bean id="fileDeletingTasklet" class="com.mkyong.tasklet.FileDeletingTasklet" >
<property name="directory" value="file:csv/inputs/" />
</bean>
<bean id="multiResourceReader"
class=" org.springframework.batch.item.file.MultiResourceItemReader">
<property name="resources" value="file:csv/inputs/domain-*.csv" />
<property name="delegate" ref="flatFileItemReader" />
</bean>
Thus the distinction is made at the level of individual steps, not for the job as a whole.
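For reference, here is a minimal sketch of what a tasklet such as fileDeletingTasklet could look like. It assumes the tasklet simply deletes every file in the configured directory; it follows the well-known mkyong example but is not guaranteed to match it line for line:
import java.io.File;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.UnexpectedJobExecutionException;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.core.io.Resource;

public class FileDeletingTasklet implements Tasklet {

    private Resource directory;

    public void setDirectory(Resource directory) {
        this.directory = directory;
    }

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        // Delete every file in the configured directory, failing the step if a delete does not succeed
        File dir = directory.getFile();
        for (File file : dir.listFiles()) {
            if (!file.delete()) {
                throw new UnexpectedJobExecutionException("Could not delete file " + file.getPath());
            }
        }
        return RepeatStatus.FINISHED;
    }
}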

Reading one file and writing to two different files doesn't work when commit-interval is more than 1

I am reading one file and, based on some business logic, writing to two different files. I am using ClassifierCompositeItemWriter to write to the two files.
It only works when the commit-interval is 1; otherwise some records get written to the wrong output file.
Below is the code snippet:
<batch:job id="interestJob">
<batch:step id="verifyFile" parent="VerifyFile">
<batch:fail on="FAILED"/>
<batch:next on="*" to="processInterest"/>
</batch:step>
<batch:step id="processInterest">
<batch:tasklet>
<batch:chunk reader="itemReader" processor="itemProcessor" writer="itemWriter" commit-interval="50" skip-limit="1000000">
<batch:streams>
<batch:stream ref="masterCarditemWriter"/>
<batch:stream ref="visaitemWriter"/>
</batch:streams>
</batch:chunk>
</batch:tasklet>
</batch:step>
</batch:job>
<bean id="itemWriter" class="org.springframework.batch.item.support.ClassifierCompositeItemWriter">
<property name="classifier" ref="classifier" />
</bean>
<bean id="classifier" class="org.springframework.batch.classify.BackToBackPatternClassifier">
<property name="routerDelegate">
<bean class="com.scotiabank.sco.report.batch.dda.interest.MyClassifier" />
</property>
<property name="matcherMap">
<map>
<entry key="visa" value-ref="visaitemWriter" />
<entry key="master" value-ref="masterCarditemWriter" />
</map>
</property>
</bean>
public class MyClassifier {
    @Classifier
    public String classify(Interest dda) {
        return dda.getCardType();
    }
}
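To make the routing explicit, here is a standalone sketch of how BackToBackPatternClassifier resolves a writer for an item. It is a fragment, not the poster's code: visaItemWriter, masterCardItemWriter, and interestItem are assumed local variables mirroring the beans above, and Interest is assumed to expose getCardType() as in the question:
// Router: the @Classifier-annotated method turns an item into a routing key
BackToBackPatternClassifier<Interest, ItemWriter<Interest>> classifier =
        new BackToBackPatternClassifier<Interest, ItemWriter<Interest>>();
classifier.setRouterDelegate(new MyClassifier());

// Matcher map: the routing key is pattern-matched against these entries
Map<String, ItemWriter<Interest>> writers = new HashMap<String, ItemWriter<Interest>>();
writers.put("visa", visaItemWriter);
writers.put("master", masterCardItemWriter);
classifier.setMatcherMap(writers);

// An item whose getCardType() returns "visa" is routed to visaItemWriter
ItemWriter<Interest> target = classifier.classify(interestItem);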

Fewer threads than expected running in parallel - Spring Batch Remote Partitioning

I am working on a Spring Batch project where I have a file of 2 million records. I do some processing on each record and then save it to the database. The processing is time-costly, so I am using Spring Batch remote partitioning.
First I manually split the file into 15 files, and then, using MultiResourcePartitioner, I assign each file to a single thread. But what I noticed is that at the start only 4 threads run in parallel, and after some time the number of threads running in parallel decreases.
This is the configuration:
<batch:job id="GhanshyamESCatalogUpdater">
<batch:step id="GhanshyamCatalogUpdater2" >
<batch:partition step="slave" partitioner="rangePartitioner">
<batch:handler grid-size="15" task-executor="taskExecutor" />
</batch:partition>
</batch:step>
<batch:listeners>
<batch:listener ref="jobFailureListener"/>
</batch:listeners>
</batch:job>
<bean id="rangePartitioner" class="org.springframework.batch.core.partition.support.MultiResourcePartitioner" scope="step">
<property name="resources" value="file:#{jobParameters['job.partitionDir']}/x*">
</property>
</bean>
<step id="slave" xmlns="http://www.springframework.org/schema/batch">
<tasklet>
<chunk reader="gsbmyntraXmlReader" writer="gsbmyntraESWriter" commit-interval="1000" />
</tasklet>
</step>
This is the Task Executor:
<bean id="taskExecutor"
class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
<property name="corePoolSize" value="100" />
<property name="allowCoreThreadTimeOut" value="true" />
<property name="WaitForTasksToCompleteOnShutdown" value="true" />
</bean>
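For reference, here is the same executor expressed in Java, with the pool-sizing semantics spelled out. A ThreadPoolTaskExecutor only creates threads beyond corePoolSize once its queue is full, so when diagnosing parallelism it helps to set maxPoolSize and queueCapacity explicitly (the values below are illustrative, sized to the 15 partitions, and are not the poster's settings):
ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
taskExecutor.setCorePoolSize(15);   // one thread per partition
taskExecutor.setMaxPoolSize(15);    // extra threads are only created once the queue is full
taskExecutor.setQueueCapacity(0);   // capacity 0 means a SynchronousQueue: tasks go straight to threads
taskExecutor.setAllowCoreThreadTimeOut(true);
taskExecutor.setWaitForTasksToCompleteOnShutdown(true);
taskExecutor.initialize();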

Spring Batch not invoking ItemReader after Partitioner execution

I would like to configure Spring Batch with a partitioner. The partitioner splits a list of data into multiple chunks, and the ItemReader should then process that data. But in my case, after the partitioner is invoked successfully, the ItemReader's read() method is never called. The code and configuration are below. Could you please let me know what's wrong?
<batch:job id="cycleJob">
<batch:step id="step1">
<batch:partition partitioner="cyclePartitioner">
<batch:step>
<batch:tasklet task-executor="taskExecutor" throttle-limit="1">
<batch:chunk processor="itemProcessor" reader="itemReader" writer="itemWriter" commit-interval="10">
</batch:chunk>
</batch:tasklet>
</batch:step>
<batch:handler task-executor="taskExecutor" grid-size="${maxThreads}" />
</batch:partition>
</batch:step>
</batch:job>
<bean id="itemProcessor" class="com.navisys.besystem.batch.CycleItemProcessor">
<property name="transactionTemplate" ref="txTemplate"/>
<property name="processorType" value="Batch.executeCycle"/>
</bean>
<bean id="itemWriter" class="com.navisys.besystem.batch.CycleItemWriter" />
<bean id="taskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor">
<constructor-arg type="java.lang.String" value="cycle-" />
<property name="concurrencyLimit" value="${maxThreads}" />
</bean>
<bean id="itemReader" scope="step" class="com.navisys.besystem.batch.CycleItemReader">
<property name="dao" ref="cycledao" />
<property name="cycleDate" value="${cycleDate}" />
<property name="batchIds" value="${batchIds}" />
<property name="switches" value="${switches}" />
<property name="workItemsPerMessage" value="${workItemsPerMessage}" />
<property name="policyMask" value="${policyMask}"></property>
<property name="mainFile" value="${mainFile}" />
<property name="datafileLocation" value="${datafileLocation}"></property>
<property name="data" value="#{stepExecutionContext['data']}" />
</bean>
<bean id="cyclePartitioner" class="com.navisys.besystem.batch.CyclePartitioner">
<property name="dao" ref="cycledao" />
<property name="cycleDate" value="${cycleDate}" />
<property name="batchIds" value="${batchIds}" />
<property name="currentSwitch" value="R"></property>
</bean>
public class CyclePartitioner implements Partitioner {
    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        final Map<String, ExecutionContext> contextMap = new HashMap<>();
        List<BatchContractIdData> list = initialize();
        int partitionCount = 0;
        // Cast to double before dividing; otherwise the integer division truncates before Math.ceil is applied
        int itemsPerList = (null == list || list.isEmpty()) ? 1 : (int) Math.ceil((double) list.size() / gridSize);
        for (List<BatchContractIdData> data : Lists.partition(list, itemsPerList)) {
            ExecutionContext context = new ExecutionContext();
            context.put("data", new ArrayList<BatchContractIdData>(data));
            contextMap.put(getPartitionName(++partitionCount), context);
        }
        return contextMap;
    }
}
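As a debugging reference, here is a minimal sketch of a step-scoped reader over the injected 'data' entry. The ItemReader contract is that read() hands back one item per call and signals completion by returning null; the class below is a simplified stand-in for the poster's CycleItemReader, not its actual code:
import java.util.Iterator;
import java.util.List;
import org.springframework.batch.item.ItemReader;

public class PartitionDataReader implements ItemReader<BatchContractIdData> {

    private Iterator<BatchContractIdData> iterator;

    // Populated via <property name="data" value="#{stepExecutionContext['data']}" />
    public void setData(List<BatchContractIdData> data) {
        this.iterator = data.iterator();
    }

    @Override
    public BatchContractIdData read() {
        // Returning null tells Spring Batch that this partition's input is exhausted
        return (iterator != null && iterator.hasNext()) ? iterator.next() : null;
    }
}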

Multiple input file Spring Batch

I'm trying to develop a batch job that processes a directory of files with Spring Batch.
I looked at the MultiResourcePartitioner and tried something like:
<job parent="loggerParent" id="importContractESTD" xmlns="http://www.springframework.org/schema/batch">
<step id="multiImportContractESTD">
<batch:partition step="partitionImportContractESTD" partitioner="partitioner">
<batch:handler grid-size="5" task-executor="taskExecutor" />
</batch:partition>
</step>
</job>
<bean id="partitioner" class="org.springframework.batch.core.partition.support.MultiResourcePartitioner">
<property name="keyName" value="inputfile" />
<property name="resources" value="file:${import.contract.filePattern}" />
</bean>
<step id="partitionImportContractESTD" xmlns="http://www.springframework.org/schema/batch">
<batch:job ref="importOneContractESTD" job-parameters-extractor="defaultJobParametersExtractor" />
</step>
<bean id="defaultJobParametersExtractor" class="org.springframework.batch.core.step.job.DefaultJobParametersExtractor"
scope="step" />
<!-- Job importContractESTD definition -->
<job parent="loggerParent" id="importOneContractESTD" xmlns="http://www.springframework.org/schema/batch">
<step parent="baseStep" id="initStep" next="calculateMD5">
<tasklet ref="initTasklet" />
</step>
<step id="calculateMD5" next="importContract">
<tasklet ref="md5Tasklet">
<batch:listeners>
<batch:listener ref="md5Tasklet" />
</batch:listeners>
</tasklet>
</step>
<step id="importContract">
<tasklet>
<chunk reader="contractReader" processor="contractProcessor" writer="contractWriter" commit-interval="${commit.interval}" />
<batch:listeners>
<batch:listener ref="contractProcessor" />
</batch:listeners>
</tasklet>
</step>
</job>
<!-- Chunk definition : Contract ItemReader -->
<bean id="contractReader" class="com.sopra.banking.cirbe.acquisition.batch.AcquisitionFileReader" scope="step">
<property name="resource" value="#{stepExecutionContext[inputfile]}" />
<property name="lineMapper">
<bean id="contractLineMappe" class="org.springframework.batch.item.file.mapping.PatternMatchingCompositeLineMapper">
<property name="tokenizers">
<map>
<entry key="1*" value-ref="headerTokenizer" />
<entry key="2*" value-ref="contractTokenizer" />
</map>
</property>
<property name="fieldSetMappers">
<map>
<entry key="1*" value-ref="headerMapper" />
<entry key="2*" value-ref="contractMapper" />
</map>
</property>
</bean>
</property>
</bean>
<!-- MD5 Tasklet -->
<bean id="md5Tasklet" class="com.sopra.banking.cirbe.acquisition.batch.AcquisitionMD5Tasklet">
<property name="file" value="#{stepExecutionContext[inputfile]}" />
</bean>
But what I get is:
Caused by: org.springframework.expression.spel.SpelEvaluationException: EL1008E:(pos 0): Field or property 'stepExecutionContext' cannot be found on object of type 'org.springframework.beans.factory.config.BeanExpressionContext'
What I'm looking for is a way to launch my job importOneContractESTD for each file contained in file:${import.contract.filePattern}. Each file is shared between the step calculateMD5 (which puts the processed file's MD5 into my job context) and the step importContract (which reads that MD5 back from the job context to add it as data to each line processed by the contractProcessor).
If I call importOneContractESTD with a single file given as a parameter (e.g. replacing #{stepExecutionContext[inputfile]} with ${my.file}), it works... But I want Spring Batch to manage the directory rather than my calling shell script...
Thanks for your ideas!
Add scope="step" to the bean definition when you need to access the stepExecutionContext, like here:
<bean id="md5Tasklet" class="com.sopra.banking.cirbe.acquisition.batch.AcquisitionMD5Tasklet" scope="step">
<property name="file" value="#{stepExecutionContext[inputfile]}" />
</bean>
More info in the Spring Batch reference documentation on step scope and late binding of job and step attributes.
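As a side note on sharing the MD5 between calculateMD5 and importContract: a common approach (a sketch under assumptions, not the poster's code) is to store the value in the step's ExecutionContext and promote it to the job context with ExecutionContextPromotionListener:
// Inside the MD5 tasklet's execute(): stash the computed value (computedMd5 is hypothetical)
chunkContext.getStepContext().getStepExecution()
        .getExecutionContext().putString("md5", computedMd5);

// A listener registered on the calculateMD5 step then copies selected keys up to the job context
ExecutionContextPromotionListener promoter = new ExecutionContextPromotionListener();
promoter.setKeys(new String[] { "md5" });
Downstream step-scoped beans can then read the value with #{jobExecutionContext['md5']}.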
