Controlling Spring-Batch Step Flow in a java-config manner

According to the Spring-Batch documentation (http://docs.spring.io/spring-batch/2.2.x/reference/html/configureStep.html#controllingStepFlow), controlling step flow in an XML config file is very simple.
For example, I could write the following job configuration:
<job id="myJob">
    <step id="step1">
        <fail on="CUSTOM_EXIT_STATUS"/>
        <next on="*" to="step2"/>
    </step>
    <step id="step2">
        <end on="1ST_EXIT_STATUS"/>
        <next on="2ND_EXIT_STATUS" to="step10"/>
        <next on="*" to="step20"/>
    </step>
    <step id="step10" next="step11" />
    <step id="step11" />
    <step id="step20" next="step21" />
    <step id="step21" next="step22" />
    <step id="step22" />
</job>
Is there a simple way to define such a job configuration in a Java-config manner (using JobBuilderFactory and so on)?

As the documentation also mentions, we can only branch the flow based on the exit status of a step. To report a custom exit status (possibly different from the one automatically mapped from the batch status), we must provide an afterStep method through a StepExecutionListener.
Suppose we have an initial step step1 (an instance of a Tasklet class Step1), and we want the following behavior:
If step1 fails (e.g. by throwing a runtime exception), the entire job should be considered FAILED.
If step1 completes with an exit status of COMPLETED-WITH-A, we branch to some step step2a which handles this specific case.
Otherwise, we stay on the main trunk of the job and continue with step step2.
Now, provide an afterStep method inside the Step1 class (which also implements StepExecutionListener):
private static class Step1 implements Tasklet, StepExecutionListener
{
    @Override
    public ExitStatus afterStep(StepExecution stepExecution)
    {
        logger.info("*after-step1* step-execution={}", stepExecution.toString());
        // Report a different exit-status in a random manner (just a demo!).
        // Some of these exit statuses (COMPLETED-WITH-A) are Step1-specific
        // and are used to base a conditional flow on them.
        ExitStatus exitStatus = stepExecution.getExitStatus();
        if (!"FAILED".equals(exitStatus.getExitCode())) {
            double r = Math.random();
            if (r < 0.50)
                exitStatus = null; // i.e. COMPLETED
            else
                exitStatus = new ExitStatus(
                    "COMPLETED-WITH-A",
                    "Completed with some special condition A");
        }
        logger.info("*after-step1* reporting exit-status of {}", exitStatus);
        return exitStatus;
    }

    // .... other methods of Step1
}
Finally, build the job flow inside the createJob method of our JobFactory implementation:
@Override
public Job createJob()
{
    // Assume some factories returning instances of our Tasklets
    Step step1 = step1();
    Step step2a = step2a();
    Step step2 = step2();

    JobBuilder jobBuilder = jobBuilderFactory.get(JOB_NAME)
        .incrementer(new RunIdIncrementer())
        .listener(listener); // a job-level listener

    // Build job flow
    return jobBuilder
        .start(step1)
            .on("FAILED").fail()
        .from(step1)
            .on("COMPLETED-WITH-A").to(step2a)
        .from(step1)
            .next(step2)
        .end()
        .build();
}

Maybe. If your intention is to write something similar to a flow decider "programmatically" (using Spring Batch's framework interfaces, I mean), there is a built-in implementation, and it is enough for most use cases.
As an alternative to XML config you can use Java-config annotations if you are familiar with them; personally I prefer the XML definition, but that is only a personal opinion.

Related

How to write custom flat file item reader using xml configuration

I am new to Spring Batch. I am using a flat-file item reader configured in an XML file, followed by a processor which processes each object created. I need to pre-process the contents of the file before passing it to the flat-file item reader, and the processed results/file should not be written to disk. May I know how to do this through XML file configuration?
Should it be done through a tasklet, or by extending the flat-file item reader? The processor should work as before with no change; I only need to introduce a layer before passing the file to the flat-file item reader.
You can use an ItemReadListener for this. ItemReadListener has three callback methods:
beforeRead, afterRead and onReadError.
You can put your logic in beforeRead.
Sample Code for CustomItemReaderListener
public class CustomItemReaderListener implements ItemReadListener<Domain> {

    @Override
    public void beforeRead() {
        System.out.println("ItemReadListener - beforeRead");
        // I need to pre process contents of file before passing it to file item reader
        // add this logic here
    }

    @Override
    public void afterRead(Domain item) {
        System.out.println("ItemReadListener - afterRead");
    }

    @Override
    public void onReadError(Exception ex) {
        System.out.println("ItemReadListener - onReadError");
    }
}
Map the listener to the step in XML:
<step id="step1">
    <tasklet>
        <chunk reader="myReader" writer="flatFileItemWriter"
               commit-interval="1" />
        <listeners>
            <listener ref="customItemReaderListener" />
        </listeners>
    </tasklet>
</step>
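For completeness, the same step could be sketched in Java config; this is only a hedged sketch, and the bean names (myReader, flatFileItemWriter) are carried over from the XML above, not defined here:

```java
// Hypothetical Java-config equivalent of the XML step above.
// Assumes the reader, writer and listener beans exist elsewhere.
@Bean
public Step step1(StepBuilderFactory stepBuilderFactory,
                  ItemReader<Domain> myReader,
                  ItemWriter<Domain> flatFileItemWriter,
                  CustomItemReaderListener customItemReaderListener) {
    return stepBuilderFactory.get("step1")
            .<Domain, Domain>chunk(1)              // commit-interval="1"
            .reader(myReader)
            .writer(flatFileItemWriter)
            .listener(customItemReaderListener)    // registers the ItemReadListener
            .build();
}
```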

Retry Whole Batch Job for n times

Is it possible to retry a particular job for say n times?
public void run() {
    String[] springConfig = { "spring/batch/jobs/job-read-files.xml" };
    ApplicationContext context = new ClassPathXmlApplicationContext(springConfig);
    JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
    Job job = (Job) context.getBean("partitionJob");
    JobParameters jobParameters = new JobParameters();
    for (int i = 0; i < 2; i++) {
        try {
            JobExecution execution = jobLauncher.run(job, jobParameters);
            System.out.println("Exit Status : " + execution.getAllFailureExceptions());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    System.out.println("Done");
}
I tried this, but since Spring Batch stores some status about job completion, it doesn't work the second and third time.
Update: it worked when I tried this:
public void run() {
    for (int i = 0; i <= 2; i++) {
        String[] springConfig = { "spring/batch/jobs/job-read-files.xml" };
        ApplicationContext context = new ClassPathXmlApplicationContext(springConfig);
        JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
        Job job = (Job) context.getBean("partitionJob");
        JobParameters jobParameters = new JobParameters();
        try {
            JobExecution execution = jobLauncher.run(job, jobParameters);
            System.out.println("Exit Status : " + execution.getAllFailureExceptions());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    System.out.println("Done");
}
Is there a better solution than this?
Here is my job config
<!-- partitioner job -->
<job id="partitionJob" restartable="true" xmlns="http://www.springframework.org/schema/batch">
    <!-- master step, 2 threads (grid-size) -->
    <step id="masterStep" next="finalstep">
        <partition step="slave" partitioner="rangePartitioner">
            <handler grid-size="2" task-executor="taskExecutor" />
        </partition>
    </step>
    <step id="finalstep">
        <tasklet>
            <chunk reader="dummyReader" writer="spWriter" commit-interval="1" />
        </tasklet>
    </step>
</job>
<batch:step id="slave">
    <tasklet>
        <chunk reader="pagingItemReader" writer="dummyWriter"
               commit-interval="2" retry-limit="3">
            <batch:retryable-exception-classes>
                <batch:include class="java.lang.Exception" />
            </batch:retryable-exception-classes>
        </chunk>
    </tasklet>
</batch:step>
Spring has a nice retry mechanism: you can define a RetryTemplate and call some piece of code N times, and you can also define a RetryCallback and a RecoveryCallback, which is nice.
Spring Batch actually uses it internally for the retry mechanism on a step. You can check out the Spring Retry documentation; regarding retry at the step level, there is a nice blog post which explains the skip and retry mechanisms in Spring Batch.
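As a rough sketch of that idea (not from the original answer; jobLauncher, job and jobParameters are assumed to be available, as in the question's code), the whole job launch could be wrapped in a RetryTemplate:

```java
// Sketch: retrying a whole job launch up to 3 times with Spring Retry.
// Assumes jobLauncher, job and jobParameters exist as in the question.
RetryTemplate retryTemplate = new RetryTemplate();
retryTemplate.setRetryPolicy(new SimpleRetryPolicy(3)); // at most 3 attempts

JobExecution execution = retryTemplate.execute(
    (RetryCallback<JobExecution, Exception>) context -> {
        JobExecution exec = jobLauncher.run(job, jobParameters);
        if (exec.getStatus() == BatchStatus.FAILED) {
            // Signal failure so the template schedules another attempt
            throw new IllegalStateException(
                "Job failed, attempt " + (context.getRetryCount() + 1));
        }
        return exec;
    });
```

Note that relaunching a FAILED execution with the same identifying JobParameters goes through Spring Batch's restart mechanism, which is why the job must be marked restartable, as in the config above.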

Spring batch :Restart a job and then start next job automatically

I need to create a recovery pattern.
In my pattern I can launch a job only within a given time window.
If the job fails, it should be restarted only in the next time window, and when it finishes I would like to start the scheduled job that was planned in advance for that window.
The only difference between the jobs is the time-window parameters.
I thought about a JobExecutionDecider in conjunction with a JobExplorer, or overriding a JobLauncher, but both seem too intrusive.
I failed to find an example that matches my needs; any ideas will be most welcome.
Just to recap what was actually done, based on the advice provided by incomplete-co.de.
I created a recovery flow similar to the one below. The recovery flow wraps my actual batch job and is responsible only for serving the correct job parameters to the internal job: the initial parameters on the first execution, new parameters on a normal execution, or the old parameters in case the last execution failed.
<batch:job id="recoveryWrapper"
           incrementer="wrapperRunIdIncrementer"
           restartable="true">
    <batch:decision id="recoveryFlowDecision" decider="recoveryFlowDecider">
        <batch:next on="FIRST_RUN" to="defineParametersOnFirstRun" />
        <batch:next on="RECOVER" to="recover.batchJob" />
        <batch:next on="CURRENT" to="current.batchJob" />
    </batch:decision>
    <batch:step id="defineParametersOnFirstRun" next="current.batchJob">
        <batch:tasklet ref="defineParametersOnFirstRunTasklet"/>
    </batch:step>
    <batch:step id="recover.batchJob" next="current.batchJob">
        <batch:job ref="batchJob" job-launcher="jobLauncher"
                   job-parameters-extractor="jobParametersExtractor" />
    </batch:step>
    <batch:step id="current.batchJob">
        <batch:job ref="batchJob" job-launcher="jobLauncher"
                   job-parameters-extractor="jobParametersExtractor" />
    </batch:step>
</batch:job>
The heart of the solution is the RecoveryFlowDecider and the JobParametersExtractor, used together with Spring Batch's restart mechanism.
The RecoveryFlowDecider queries the JobExplorer and JobRepository to find out whether the last run failed. It places the last execution on the execution context of the wrapper, to be used later by the JobParametersExtractor.
Note the use of the RunIdIncrementer to allow re-execution of the wrapper job.
@Component
public class RecoveryFlowDecider implements JobExecutionDecider {

    private static final String FIRST_RUN = "FIRST_RUN";
    private static final String CURRENT = "CURRENT";
    private static final String RECOVER = "RECOVER";

    @Autowired
    private JobExplorer jobExplorer;

    @Autowired
    private JobRepository jobRepository;

    @Override
    public FlowExecutionStatus decide(JobExecution jobExecution,
                                      StepExecution stepExecution) {
        // The wrapper is named as the wrapped job + WRAPPER
        String wrapperJobName = jobExecution.getJobInstance().getJobName();
        String jobName = wrapperJobName.substring(0, wrapperJobName.indexOf(EtlConstants.WRAPPER));

        List<JobInstance> instances = jobExplorer.getJobInstances(jobName, 0, 1);
        JobInstance internalJobInstance = instances.size() > 0 ? instances.get(0) : null;
        if (null == internalJobInstance) {
            return new FlowExecutionStatus(FIRST_RUN);
        }

        JobExecution lastExecution = jobRepository.getLastJobExecution(
            internalJobInstance.getJobName(), internalJobInstance.getJobParameters());

        // Place the last execution on the (wrapper's) execution context to use later
        jobExecution.getExecutionContext().put(EtlConstants.LAST_EXECUTION, lastExecution);

        ExitStatus exitStatus = lastExecution.getExitStatus();
        if (ExitStatus.FAILED.equals(exitStatus) || ExitStatus.UNKNOWN.equals(exitStatus)) {
            return new FlowExecutionStatus(RECOVER);
        } else if (ExitStatus.COMPLETED.equals(exitStatus)) {
            return new FlowExecutionStatus(CURRENT);
        }

        // We should never get here unless we have a defect
        throw new RuntimeException("Unexpected batch status: " + exitStatus + " in decider!");
    }
}
Then the JobParametersExtractor tests again for the outcome of the last execution. In case of a failed job it serves the original parameters used to execute the failed job, triggering Spring Batch's restart mechanism. Otherwise it creates a new set of parameters and the job executes on its normal course.
@Component
public class JobExecutionWindowParametersExtractor implements JobParametersExtractor {

    @Override
    public JobParameters getJobParameters(Job job, StepExecution stepExecution) {
        // Read the last execution from the wrapping job
        // in order to build the next execution window
        JobExecution lastExecution = (JobExecution) stepExecution.getJobExecution()
            .getExecutionContext().get(EtlConstants.LAST_EXECUTION);
        if (null != lastExecution) {
            if (ExitStatus.FAILED.equals(lastExecution.getExitStatus())) {
                // Last execution failed: re-serve its original parameters
                JobInstance instance = lastExecution.getJobInstance();
                JobParameters parameters = instance.getJobParameters();
                return parameters;
            }
        }
        // We have no failed execution (or no execution at all);
        // we need to create a new execution window
        return buildJobParamaters(lastExecution, stepExecution);
    }

    ...
}
Have you considered a JobStep? That is, a step determines whether there are any additional jobs to be run and sets this value into the StepExecutionContext. A JobExecutionDecider then checks for this value; if it exists, it directs the flow to a JobStep which launches the job.
Here is the documentation on it: http://docs.spring.io/spring-batch/reference/htmlsingle/#external-flows
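In Java config, such a JobStep could be sketched roughly as follows; this is an illustrative sketch only, and the bean names (innerJob, jobLauncher) are assumptions:

```java
// Sketch: a step that launches another job (a JobStep) in Java config.
// Assumes innerJob and jobLauncher beans exist elsewhere.
@Bean
public Step launchInnerJobStep(StepBuilderFactory stepBuilderFactory,
                               Job innerJob,
                               JobLauncher jobLauncher) {
    return stepBuilderFactory.get("launchInnerJobStep")
            .job(innerJob)                 // builds a JobStep under the hood
            .launcher(jobLauncher)
            .parametersExtractor(new DefaultJobParametersExtractor())
            .build();
}
```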
Is it possible to do it in the opposite manner?
In every time window, submit the job intended for that time window.
However, the very first step of the job should check whether the job of the previous time window completed successfully. If it failed, submit the previous job first and wait for its completion before going on with the current job's own logic.

Spring Batch SkipListener not called when exception occurs in reader

This is my step configuration. My skip listener's onSkipInWrite() method is called properly, but onSkipInRead() is not getting called. I verified this by deliberately throwing a null pointer exception from my reader.
<step id="callService" next="writeUsersAndResources">
    <tasklet allow-start-if-complete="true">
        <chunk reader="Reader" writer="Writer"
               commit-interval="10" skip-limit="10">
            <skippable-exception-classes>
                <include class="java.lang.Exception" />
            </skippable-exception-classes>
        </chunk>
        <listeners>
            <listener ref="skipListener" />
        </listeners>
    </tasklet>
</step>
I read some forums and tried placing the listeners tag at both levels, inside the chunk and outside the tasklet, but nothing worked.
Here is my skip listener:
package com.legal.batch.core;
import org.apache.commons.lang.StringEscapeUtils;
import org.springframework.batch.core.SkipListener;
import org.springframework.jdbc.core.JdbcTemplate;
public class SkipListener implements SkipListener<Object, Object> {
#Override
public void onSkipInProcess(Object arg0, Throwable arg1) {
// TODO Auto-generated method stub
}
#Override
public void onSkipInRead(Throwable arg0) {
}
#Override
public void onSkipInWrite(Object arg0, Throwable arg1) {
}
}
Experts please suggest
Skip listeners respect the transaction boundary, which means they will always be called just before the transaction is committed.
Since the commit interval in your example is set to 10, onSkipInRead will be called right at the moment these 10 items are committed (at once).
Hence, if you do step-by-step debugging, you will not see onSkipInRead called right after the ItemReader throws an exception.
The SkipListener in your example has an empty onSkipInRead method. Try adding some logging inside onSkipInRead and rerun your job to see those messages.
EDIT:
Here is a working example [names are changed to 'abc']:
<step id="abcStep" xmlns="http://www.springframework.org/schema/batch">
    <tasklet>
        <chunk writer="abcWriter"
               reader="abcReader"
               commit-interval="${abc.commit.interval}"
               skip-limit="1000" >
            <skippable-exception-classes>
                <include class="com.abc....persistence.mapping.exception.AbcMappingException"/>
                <include class="org.springframework.batch.item.validator.ValidationException"/>
                ...
                <include class="...Exception"/>
            </skippable-exception-classes>
            <listeners>
                <listener ref="abcSkipListener"/>
            </listeners>
        </chunk>
        <listeners>
            <listener ref="abcStepListener"/>
            <listener ref="afterStepStatsListener"/>
        </listeners>
        <no-rollback-exception-classes>
            <include class="com.abc....persistence.mapping.exception.AbcMappingException"/>
            <include class="org.springframework.batch.item.validator.ValidationException"/>
            ...
            <include class="...Exception"/>
        </no-rollback-exception-classes>
        <transaction-attributes isolation="READ_COMMITTED"
                                propagation="REQUIRED"/>
    </tasklet>
</step>
where an abcSkipListener bean is:
public class AbcSkipListener {

    private static final Logger logger = LoggerFactory.getLogger("abc-skip-listener");

    @OnReadError
    public void houstonWeHaveAProblemOnRead(Exception problem) {
        // ...
    }

    @OnSkipInWrite
    public void houstonWeHaveAProblemOnWrite(AbcHolder abcHolder, Throwable problem) {
        // ...
    }

    ....
}
I come back to this subject after having had the same problem in later versions, where XML configuration is no longer used.
With the configuration below, I was not able to reach the skip-listener implementations.
@Bean
public Step step1() {
    return stepBuilderFactory
            .get("step1").<String, List<Integer>>chunk(1)
            .reader(reader)
            .processor(processor)
            .faultTolerant()
            .skipPolicy(skipPolicy)
            .writer(writer)
            .listener(stepListener)
            .listener(skipListener)
            .build();
}
The issue here is that the placement of the skip listener is not correct: the skip listener should be registered on the FaultTolerantStepBuilder, i.e. after the faultTolerant() call.
@Bean
public Step step1() {
    return stepBuilderFactory
            .get("step1").<String, List<Integer>>chunk(1)
            .reader(reader)
            .processor(processor)
            .faultTolerant()
            .listener(skipListener)
            .skipPolicy(skipPolicy)
            .writer(writer)
            .listener(stepListener)
            .build();
}
In the first snippet the listener is registered on the SimpleStepBuilder (before faultTolerant() is called), so it is treated as a plain step listener rather than a skip listener.

Access Spring Batch Job definition

I've got a Job description:
<job id="importJob" job-repository="jobRepository">
    <step id="importStep1" next="importStep2" parent="abstractImportStep">
        <tasklet ref="importJobBean" />
    </step>
    <step id="importStep2" next="importStep3" parent="abstractImportStep">
        <tasklet ref="importJobBean" />
    </step>
    <step id="importStep3" next="importStep4" parent="abstractImportStep">
        <tasklet ref="importJobBean" />
    </step>
    <step id="importStep4" next="importStepFinish" parent="abstractImportStep">
        <tasklet ref="importJobBean" />
    </step>
    <step id="importStepFinish">
        <tasklet ref="importJobBean" />
    </step>
</job>
I want to know how many steps were defined in "importJob" (5 in this case). The Job and JobInstance APIs seem to have nothing relevant. Is this possible at all?
You have options
JobExplorer
The cleanest way to read metadata about your job is through the JobExplorer:
public interface JobExplorer {
    List<JobInstance> getJobInstances(String jobName, int start, int count);
    JobExecution getJobExecution(Long executionId);
    StepExecution getStepExecution(Long jobExecutionId, Long stepExecutionId);
    JobInstance getJobInstance(Long instanceId);
    List<JobExecution> getJobExecutions(JobInstance jobInstance);
    Set<JobExecution> findRunningJobExecutions(String jobName);
}
JobExecution
But you can also get it by simply looking at JobExecution:
// Returns the step executions that were registered
public Collection<StepExecution> getStepExecutions()
JobLauncher returns a JobExecution when you launch the job:
public interface JobLauncher {
    public JobExecution run(Job job, JobParameters jobParameters)
        throws JobExecutionAlreadyRunningException, JobRestartException;
}
Or you can get it via JobExecutionListener
public interface JobExecutionListener {
    void beforeJob(JobExecution jobExecution);
    void afterJob(JobExecution jobExecution);
}
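As a sketch of that last option (not from the original answer), a job-level listener could count the step executions in afterJob; note that this counts only the steps that actually ran:

```java
// Hypothetical job-level listener counting the steps that actually executed.
public class StepCountListener implements JobExecutionListener {

    @Override
    public void beforeJob(JobExecution jobExecution) {
        // nothing to do before the job starts
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        // Only steps that were actually executed are registered here
        int executed = jobExecution.getStepExecutions().size();
        System.out.println("Steps executed: " + executed);
    }
}
```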
There are other ways to obtain it, but the above two should suffice.
EDIT, to answer the comment:
In case you'd like to get the metadata regardless of whether or not the step was executed, there is a convenience method, getStepNames, which is declared by AbstractJob and implemented (e.g.) in SimpleJob as:
/**
 * Convenience method for clients to inspect the steps for this job.
 *
 * @return the step names for this job
 */
public Collection<String> getStepNames() {
    List<String> names = new ArrayList<String>();
    for (Step step : steps) {
        names.add(step.getName());
    }
    return names;
}
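Putting it together, a short usage sketch (the bean lookup and cast are assumptions for illustration; AbstractJob implements the StepLocator interface, which declares getStepNames):

```java
// Sketch: counting the declared steps without running the job.
Job job = context.getBean("importJob", Job.class);
if (job instanceof StepLocator) {
    Collection<String> stepNames = ((StepLocator) job).getStepNames();
    // Should be 5 for the importJob definition above
    System.out.println("Steps defined: " + stepNames.size());
}
```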