Spring Batch - Rerun from first step

Does anyone know if there is a way to start over in Spring batch?
I want it to start at Step1, then run Step2 and Step3, then go back to Step1, Step2, Step3, and so forth until a condition is met. I tried googling, but failed to find any concrete examples.
Code so far:
@Bean
Job job(JobBuilderFactory factory) {
    return factory.get(JOB_NAME)
            .start(stagingStep)
            .next(analyzeStep)
            .next(reportingStep)
            .preventRestart()
            .build();
}

I think this can be done in several ways.
1. Intercept the job with a JobExecutionListener:
<job id="footballJob">
<step id="playerload" parent="s1" next="gameLoad"/>
<step id="gameLoad" parent="s2" next="playerSummarization"/>
<step id="playerSummarization" parent="s3"/>
<listeners>
<listener ref="sampleListener"/>
</listeners>
..
and implement your listener:
public interface JobExecutionListener {
void beforeJob(JobExecution jobExecution);
void afterJob(JobExecution jobExecution); // implement and call the job again
}
2. Implement your own trigger/scheduler:
<task:scheduled ref="runScheduler" method="run" trigger="mytrigger" />
<bean id="runScheduler" class="com.spring.scheduler.MyScheduler" >
<property name="jobLauncher" ref="jobLauncher" />
<property name="job" ref="helloWorldJob" />
</bean>
..
<task:scheduled-tasks>
<!-- <task:scheduled ref="runScheduler" method="run" fixed-delay="5000" /> -->
<task:scheduled ref="runScheduler" method="run" cron="*/5 * * * * *" />
</task:scheduled-tasks>
You can use your own trigger and pass a reference to it through the trigger attribute, as in the first line above:
<bean id="mytrigger" class="com.spring.scheduler.MyTrigger" />
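import java.util.Date;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;

public class MyScheduler {
    @Autowired
    private JobLauncher jobLauncher;
    @Autowired
    private Job job;

    public void run() {
        try {
            // A parameter that changes on every call makes each launch a new job instance.
            String dateParam = new Date().toString();
            JobParameters param = new JobParametersBuilder()
                    .addString("date", dateParam)
                    .toJobParameters();
            JobExecution execution = jobLauncher.run(job, param);
            System.out.println("Exit Status in scheduler: " + execution.getStatus());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}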
and then, if needed, you can create a trigger:
public class MyTrigger implements Trigger {
    @Override
    public Date nextExecutionTime(TriggerContext triggerContext) {...return date;}
}
3. If only one tasklet needs to be rerun, it's easy: return RepeatStatus.CONTINUABLE and the tasklet is executed again and again until it returns RepeatStatus.FINISHED.
public RepeatStatus execute(StepContribution contribution,
ChunkContext chunkContext)throws Exception
{
return RepeatStatus.CONTINUABLE;//RepeatStatus.FINISHED;
}
If you only want to rerun specific steps, that can also be done: build a job from just those steps (for example step 1 or 2) before running it again.
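The "Step1, Step2, Step3, then repeat until a condition is met" control flow from the question can be modelled in plain Java. This is only a sketch of the loop logic and does not use the Spring Batch API: StepLoop, runUntil and the use of Runnable to stand in for steps are hypothetical names chosen for illustration. In a real setup, each pass would correspond to one job launch, as in options 1 and 2 above.

```java
import java.util.List;
import java.util.function.BooleanSupplier;

public class StepLoop {

    /**
     * Runs all steps in order, then checks the stop condition. Repeats until the
     * condition holds or maxPasses is reached. Returns the number of passes made.
     */
    public static int runUntil(List<Runnable> steps, BooleanSupplier done, int maxPasses) {
        int passes = 0;
        while (passes < maxPasses) {
            for (Runnable step : steps) {
                step.run();
            }
            passes++;
            if (done.getAsBoolean()) {
                break;
            }
        }
        return passes;
    }

    public static void main(String[] args) {
        int[] counter = {0};
        List<Runnable> steps = List.of(
                () -> counter[0]++,   // stands in for stagingStep
                () -> counter[0]++,   // stands in for analyzeStep
                () -> counter[0]++);  // stands in for reportingStep
        int passes = runUntil(steps, () -> counter[0] >= 9, 100);
        System.out.println("passes=" + passes + ", step runs=" + counter[0]);
        // prints "passes=3, step runs=9"
    }
}
```

The maxPasses guard is a design choice of this sketch: it keeps the loop from running forever if the condition is never met.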

Related

Retry Whole Batch Job for n times

Is it possible to retry a particular job for say n times?
public void run() {
String[] springConfig = { "spring/batch/jobs/job-read-files.xml" };
ApplicationContext context = new ClassPathXmlApplicationContext(springConfig);
JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
Job job = (Job) context.getBean("partitionJob");
JobParameters jobParameters = new JobParameters();
for (int i = 0; i < 2; i++) {
try {
JobExecution execution = jobLauncher.run(job,jobParameters);
System.out.println("Exit Status : " + execution.getAllFailureExceptions());
} catch (Exception e) {
e.printStackTrace();
}
}
System.out.println("Done");
}
I tried this, but since Spring Batch stores the completion status of the job, it doesn't work the second and third time.
Update: It worked when I tried this:
public void run() {
for (int i = 0; i <= 2; i++) {
String[] springConfig = { "spring/batch/jobs/job-read-files.xml" };
ApplicationContext context = new ClassPathXmlApplicationContext(springConfig);
JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
Job job = (Job) context.getBean("partitionJob");
JobParameters jobParameters = new JobParameters();
try {
JobExecution execution = jobLauncher.run(job,jobParameters);
System.out.println("Exit Status : " + execution.getAllFailureExceptions());
} catch (Exception e) {
e.printStackTrace();
}
}
System.out.println("Done");
}
Is there a better solution than this?
Here is my job config
<!-- partitioner job -->
<job id="partitionJob" restartable="true" xmlns="http://www.springframework.org/schema/batch">
<!-- master step, 2 threads (grid-size) -->
<step id="masterStep" next="finalstep">
<partition step="slave" partitioner="rangePartitioner">
<handler grid-size="2" task-executor="taskExecutor" />
</partition>
</step>
<step id="finalstep">
<tasklet>
<chunk reader="dummyReader" writer="spWriter" commit-interval="1" />
</tasklet>
</step>
</job>
<batch:step id="slave">
<tasklet>
<chunk reader="pagingItemReader" writer="dummyWriter"
commit-interval="2" retry-limit="3">
<batch:retryable-exception-classes>
<batch:include class="java.lang.Exception" />
</batch:retryable-exception-classes>
</chunk>
</tasklet>
</batch:step>
Spring has a nice retry mechanism: you define a RetryTemplate and use it to call a piece of code up to N times, supplying a RetryCallback for the work and a RecoveryCallback for what to do once the attempts are exhausted.
Spring Batch uses it internally for the retry mechanism on steps. You can check out the Spring Retry documentation; for retry at the step level, this is a nice blog post that explains the skip and retry mechanisms in Spring Batch.
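The retry-N-times idea can be sketched without any framework. The helper below is a plain-Java illustration, not the Spring Retry API: SimpleRetry and retry are hypothetical names, and a real RetryTemplate offers more, such as back-off policies and recovery callbacks.

```java
import java.util.concurrent.Callable;

public class SimpleRetry {

    /** Calls the action up to maxAttempts times and returns the first successful result. */
    public static <T> T retry(int maxAttempts, Callable<T> action) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // every attempt failed: rethrow the last failure
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = retry(3, () -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new IllegalStateException("attempt " + calls[0] + " failed");
            }
            return "ok";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
        // prints "ok after 3 attempts"
    }
}
```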

Stop the task in Spring Integration framework

I would like to stop a KeepAliveReceiver task after a given event. I tested the following solutions and none of them works: 1) sending keepAliveReceiver.stop() to the control bus, 2) implementing Lifecycle and calling stop(), 3) stopping the scheduler. Any ideas how I can stop the task from within the running task?
@MessageEndpoint
public class KeepAliveReceiver implements Runnable, Lifecycle {
    private int limit;
    @Autowired
    private ControlBusGateway controlGateway; // sending messages to control Channel
    @Autowired
    private ThreadPoolTaskScheduler myScheduler;

    @Override
    public void run() {
        ...
        if ( event ) {
            LOGGER.debug( "FAILOVER! Starting messageReceiveRouter. ");
            controlGateway.send( new GenericMessage<String>( "@keepAliveReceiver.stop()" ) );
            // not allowed
            myScheduler.shutdown();
            // not working, the scheduler keeps starting the keepAliveReceiver
            this.stop();
            //not working
        }
    }

    @Override
    public void stop() {
        LOGGER.debug( "STOPPED!");
    }
}
and xml definition of the scheduler:
<task:scheduler id="myScheduler" pool-size="10" />
<task:scheduled-tasks>
<task:scheduled ref="keepAliveReceiver" method="run" fixed-rate="500" />
</task:scheduled-tasks>
Send to the controlGateway a Message with empty command ;-)
'Kill' your <control-bus> and change it to
<outbound-channel-adapter channel="stopSchedulerChannel" expression="@myScheduler.shutdown()"/>
And add
<channel id="stopSchedulerChannel">
<dispatcher task-executor="executor"/>
</channel>
And configure an appropriate executor bean.
Your problem comes from trying to stop the task from within itself. On the other hand, <control-bus> allows operations only on SmartLifecycle implementors.

Spring Batch resume after server's failure

I am using spring batch to parse files and I have the following scenario:
I am running a job that has to parse a given file. For some unexpected reason (say, a power cut) the server fails and I have to restart the machine. After restarting the server, I want to resume the job from the point where it stopped before the power cut. This means that if the system had read 1,300 rows out of 10,000, it now has to start reading from row 1,301.
How can I achieve this scenario using Spring Batch?
About the configuration: I use spring-integration, which polls a directory for new files. When a file arrives, spring-integration creates the Spring Batch job. Also, Spring Batch uses FlatFileItemReader to parse the file.
Here is a complete solution for restarting a job after a JVM crash.
1. Make the job restartable by setting restartable="true":
job id="jobName" xmlns="http://www.springframework.org/schema/batch"
restartable="true"
2. Code to restart the job:
import java.util.Date;
import java.util.List;
import org.apache.commons.collections.CollectionUtils;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobInstance;
import org.springframework.batch.core.explore.JobExplorer;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.JobOperator;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.beans.factory.annotation.Autowired;
public class RestartJob {
    @Autowired
    private JobExplorer jobExplorer;
    @Autowired
    JobRepository jobRepository;
    @Autowired
    private JobLauncher jobLauncher;
    @Autowired
    JobOperator jobOperator;

    public void restart() {
        try {
            // this will get the latest job instance from the database
            List<JobInstance> jobInstances = jobExplorer.getJobInstances("jobName", 0, 1);
            if (CollectionUtils.isNotEmpty(jobInstances)) {
                JobInstance jobInstance = jobInstances.get(0);
                List<JobExecution> jobExecutions = jobExplorer.getJobExecutions(jobInstance);
                if (CollectionUtils.isNotEmpty(jobExecutions)) {
                    for (JobExecution execution : jobExecutions) {
                        // If the job status is STARTED, update the status to FAILED
                        // and restart the job using JobOperator
                        if (execution.getStatus().equals(BatchStatus.STARTED)) {
                            execution.setEndTime(new Date());
                            execution.setStatus(BatchStatus.FAILED);
                            execution.setExitStatus(ExitStatus.FAILED);
                            jobRepository.update(execution);
                            jobOperator.restart(execution.getId());
                        }
                    }
                }
            }
        } catch (Exception e1) {
            e1.printStackTrace();
        }
    }
}
3. Bean configuration:
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean" p:dataSource-ref="dataSource" p:transactionManager-ref="transactionManager" p:lobHandler-ref="oracleLobHandler"/>
<bean id="oracleLobHandler" class="org.springframework.jdbc.support.lob.DefaultLobHandler"/>
<bean id="jobExplorer" class="org.springframework.batch.core.explore.support.JobExplorerFactoryBean" p:dataSource-ref="dataSource" />
<bean id="jobRegistry" class="org.springframework.batch.core.configuration.support.MapJobRegistry" />
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository" />
<property name="taskExecutor" ref="jobLauncherTaskExecutor" />
</bean>
<task:executor id="jobLauncherTaskExecutor" pool-size="6" rejection-policy="ABORT" />
<bean id="jobOperator" class="org.springframework.batch.core.launch.support.SimpleJobOperator" p:jobLauncher-ref="jobLauncher" p:jobExplorer-ref="jobExplorer" p:jobRepository-ref="jobRepository" p:jobRegistry-ref="jobRegistry"/>
An updated workaround for Spring Batch 4. It takes the JVM start-up time into account when detecting broken jobs. Please note that this will not work in a clustered environment where multiple servers start jobs.
@Bean
public ApplicationListener<ContextRefreshedEvent> resumeJobsListener(JobOperator jobOperator, JobRepository jobRepository,
JobExplorer jobExplorer) {
// restart jobs that failed due to
return event -> {
Date jvmStartTime = new Date(ManagementFactory.getRuntimeMXBean().getStartTime());
// for each job
for (String jobName : jobExplorer.getJobNames()) {
// get latest job instance
for (JobInstance instance : jobExplorer.getJobInstances(jobName, 0, 1)) {
// for each of the executions
for (JobExecution execution : jobExplorer.getJobExecutions(instance)) {
if (execution.getStatus().equals(BatchStatus.STARTED) && execution.getCreateTime().before(jvmStartTime)) {
// this job is broken and must be restarted
execution.setEndTime(new Date());
execution.setStatus(BatchStatus.FAILED);
execution.setExitStatus(ExitStatus.FAILED);
for (StepExecution se : execution.getStepExecutions()) {
if (se.getStatus().equals(BatchStatus.STARTED)) {
se.setEndTime(new Date());
se.setStatus(BatchStatus.FAILED);
se.setExitStatus(ExitStatus.FAILED);
jobRepository.update(se);
}
}
jobRepository.update(execution);
try {
jobOperator.restart(execution.getId());
}
catch (JobExecutionException e) {
LOG.warn("Couldn't resume job execution {}", execution, e);
}
}
}
}
}
};
}
What I would do in your situation is to create a step to log the last processed row in a file. Then create a second job that would read this file and start the processing from a specific row number.
So if the job stops due to whatever reason you will be able to run the new Job that will resume the processing.
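The checkpoint idea described above can be sketched in plain Java. This is an illustration under stated assumptions, not Spring Batch code: CheckpointedReader, readCheckpoint, process and the one-number checkpoint file are all hypothetical, and the maxRows argument exists only to simulate a run that is cut short.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class CheckpointedReader {

    /** Reads the number of rows already processed; 0 if no checkpoint exists yet. */
    public static int readCheckpoint(Path checkpoint) throws IOException {
        if (!Files.exists(checkpoint)) {
            return 0;
        }
        return Integer.parseInt(Files.readString(checkpoint).trim());
    }

    /**
     * Processes up to maxRows rows, starting after the last checkpointed row, and
     * records progress after each row. Returns the rows handled in this call.
     */
    public static List<String> process(Path input, Path checkpoint, int maxRows) throws IOException {
        List<String> lines = Files.readAllLines(input, StandardCharsets.UTF_8);
        int start = readCheckpoint(checkpoint);
        List<String> handled = new ArrayList<>();
        for (int i = start; i < lines.size() && handled.size() < maxRows; i++) {
            handled.add(lines.get(i));                            // "process" the row
            Files.writeString(checkpoint, String.valueOf(i + 1)); // persist progress
        }
        return handled;
    }

    public static void main(String[] args) throws IOException {
        Path input = Files.createTempFile("rows", ".txt");
        Path checkpoint = Files.createTempFile("rows", ".checkpoint");
        Files.delete(checkpoint); // start without a checkpoint
        Files.write(input, List.of("r1", "r2", "r3", "r4", "r5"));

        // The first run is cut short after two rows, simulating a crash.
        System.out.println(process(input, checkpoint, 2));                 // [r1, r2]
        // The second run resumes from row 3.
        System.out.println(process(input, checkpoint, Integer.MAX_VALUE)); // [r3, r4, r5]
    }
}
```

Because the checkpoint is written after the row is handled, a crash between the two operations would cause that one row to be processed again on resume, so the processing should tolerate a repeated row.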
You can also write it like this:
@RequestMapping(value = "/updateStatusAndRestart/{jobId}/{stepId}", method = GET)
public ResponseEntity<String> updateBatchStatus(@PathVariable("jobId") Long jobExecutionId, @PathVariable("stepId") Long stepExecutionId) throws Exception {
    // mark the step execution as failed
    StepExecution stepExecution = jobExplorer.getStepExecution(jobExecutionId, stepExecutionId);
    stepExecution.setEndTime(new Date(System.currentTimeMillis()));
    stepExecution.setStatus(BatchStatus.FAILED);
    stepExecution.setExitStatus(ExitStatus.FAILED);
    jobRepository.update(stepExecution);
    // mark the owning job execution as failed
    JobExecution jobExecution = stepExecution.getJobExecution();
    jobExecution.setEndTime(new Date(System.currentTimeMillis()));
    jobExecution.setStatus(BatchStatus.FAILED);
    jobExecution.setExitStatus(ExitStatus.FAILED);
    jobRepository.update(jobExecution);
    // restart the failed execution
    jobOperator.restart(jobExecution.getId());
    return new ResponseEntity<String>("<h1> Batch Status Updated !! </h1>", HttpStatus.OK);
}
Here I have used a REST API endpoint to pass the jobExecutionId and stepExecutionId, set the status of both the job execution and the step execution to FAILED, and then restarted the job using the JobOperator.

Spring Batch SkipListener not called when exception occurs in reader

This is my step configuration. My skip listener's onSkipInWrite() method is called properly, but onSkipInRead() is not getting called. I found this by deliberately throwing a null pointer exception from my reader.
<step id="callService" next="writeUsersAndResources">
<tasklet allow-start-if-complete="true">
<chunk reader="Reader" writer="Writer"
commit-interval="10" skip-limit="10">
<skippable-exception-classes>
<include class="java.lang.Exception" />
</skippable-exception-classes>
</chunk>
<listeners>
<listener ref="skipListener" />
</listeners>
</tasklet>
</step>
I read some forums and interchanged the listeners-tag at both levels: Inside the chunk, and outside the tasklet. Nothing is working...
Adding my skip Listener here
package com.legal.batch.core;
import org.apache.commons.lang.StringEscapeUtils;
import org.springframework.batch.core.SkipListener;
import org.springframework.jdbc.core.JdbcTemplate;
public class SkipListener implements SkipListener<Object, Object> {
    @Override
    public void onSkipInProcess(Object arg0, Throwable arg1) {
        // TODO Auto-generated method stub
    }

    @Override
    public void onSkipInRead(Throwable arg0) {
    }

    @Override
    public void onSkipInWrite(Object arg0, Throwable arg1) {
    }
}
Experts please suggest
Skip listeners respect the transaction boundary, which means they are always called just before the transaction is committed.
Since the commit interval in your example is set to "10", onSkipInRead will be called at the moment these 10 items are committed (all at once).
Hence, if you debug step by step, you will not see onSkipInRead called right after the ItemReader throws an exception.
The SkipListener in your example has an empty onSkipInRead method. Try adding some logging inside onSkipInRead and rerun your job to see those messages.
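The timing described above can be illustrated with a small plain-Java model. It is a deliberately simplified sketch, not how Spring Batch is implemented: DeferredSkipModel and run are hypothetical names, a null item stands in for a read failure, and the "commit" is just a log entry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DeferredSkipModel {

    /**
     * Reads items in chunks of commitInterval. A null item stands in for a read
     * failure: it is skipped and remembered, and the skip callback is logged only
     * when the chunk is "committed". Returns the event log for inspection.
     */
    public static List<String> run(List<String> source, int commitInterval) {
        List<String> log = new ArrayList<>();
        List<String> pendingSkips = new ArrayList<>();
        int inChunk = 0;
        for (int i = 0; i < source.size(); i++) {
            String item = source.get(i);
            if (item == null) {
                log.add("read error at " + i);
                pendingSkips.add("onSkipInRead for " + i);
            } else {
                inChunk++;
            }
            boolean chunkFull = inChunk == commitInterval;
            boolean endOfInput = i == source.size() - 1;
            if (chunkFull || endOfInput) {
                log.addAll(pendingSkips); // callbacks run just before the commit
                pendingSkips.clear();
                log.add("commit");
                inChunk = 0;
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", null, "b", "c"), 3));
        // prints "[read error at 1, onSkipInRead for 1, commit]"
    }
}
```

The read error happens at the second item, but the callback entry only appears once the chunk of three good items is complete, which is why a debugger stepping through the reader does not see the listener fire immediately.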
EDIT:
Here is a working example [names are changed to 'abc']:
<step id="abcStep" xmlns="http://www.springframework.org/schema/batch">
<tasklet>
<chunk writer="abcWriter"
reader="abcReader"
commit-interval="${abc.commit.interval}"
skip-limit="1000" >
<skippable-exception-classes>
<include class="com.abc....persistence.mapping.exception.AbcMappingException"/>
<include class="org.springframework.batch.item.validator.ValidationException"/>
...
<include class="...Exception"/>
</skippable-exception-classes>
<listeners>
<listener ref="abcSkipListener"/>
</listeners>
</chunk>
<listeners>
<listener ref="abcStepListener"/>
<listener ref="afterStepStatsListener"/>
</listeners>
<no-rollback-exception-classes>
<include class="com.abc....persistence.mapping.exception.AbcMappingException"/>
<include class="org.springframework.batch.item.validator.ValidationException"/>
...
<include class="...Exception"/>
</no-rollback-exception-classes>
<transaction-attributes isolation="READ_COMMITTED"
propagation="REQUIRED"/>
</tasklet>
</step>
where an abcSkipListener bean is:
public class AbcSkipListener {
    private static final Logger logger = LoggerFactory.getLogger( "abc-skip-listener" );

    @OnReadError
    public void houstonWeHaveAProblemOnRead( Exception problem ) {
        // ...
    }

    @OnSkipInWrite
    public void houstonWeHaveAProblemOnWrite( AbcHolder abcHolder, Throwable problem ) {
        // ...
    }
    ....
}
I am coming back to this subject after having had the same problem in later versions, where the XML configuration is not used.
With the configuration below, I was not able to reach the skip listener implementations.
@Bean
public Step step1( ) {
    return stepBuilderFactory
            .get("step1").<String, List<Integer>>chunk(1)
            .reader(reader)
            .processor(processor)
            .faultTolerant()
            .skipPolicy(skipPolicy)
            .writer(writer)
            .listener(stepListener)
            .listener(skipListener)
            .build();
}
The issue here is that the skip listener is registered in the wrong place.
The skip listener should be registered on the FaultTolerantStepBuilder, that is, directly after faultTolerant().
@Bean
public Step step1( ) {
    return stepBuilderFactory
            .get("step1").<String, List<Integer>>chunk(1)
            .reader(reader)
            .processor(processor)
            .faultTolerant()
            .listener(skipListener)
            .skipPolicy(skipPolicy)
            .writer(writer)
            .listener(stepListener)
            .build();
}
In the first snippet, the listener is treated as a listener of a SimpleStepBuilder.

Access Spring Batch Job definition

I've got a Job description:
<job id="importJob" job-repository="jobRepository">
<step id="importStep1" next="importStep2" parent="abstractImportStep">
<tasklet ref="importJobBean" />
</step>
<step id="importStep2" next="importStep3" parent="abstractImportStep">
<tasklet ref="importJobBean" />
</step>
<step id="importStep3" next="importStep4" parent="abstractImportStep">
<tasklet ref="importJobBean" />
</step>
<step id="importStep4" next="importStepFinish" parent="abstractImportStep">
<tasklet ref="importJobBean" />
</step>
<step id="importStepFinish">
<tasklet ref="importJobBean" />
</step>
</job>
I want to know how many steps were defined in "importJob" (5 in this case). Looks like Job and JobInstance api has nothing relevant. Is this possible at all?
You have options
JobExplorer
The cleanest way to read meta data about your Job is through JobExplorer:
public interface JobExplorer {
List<JobInstance> getJobInstances(String jobName, int start, int count);
JobExecution getJobExecution(Long executionId);
StepExecution getStepExecution(Long jobExecutionId, Long stepExecutionId);
JobInstance getJobInstance(Long instanceId);
List<JobExecution> getJobExecutions(JobInstance jobInstance);
Set<JobExecution> findRunningJobExecutions(String jobName);
}
JobExecution
But you can also get it by simply looking at JobExecution:
// Returns the step executions that were registered
public Collection<StepExecution> getStepExecutions()
JobLauncher returns you a JobExecution when you launch the job:
public interface JobLauncher {
public JobExecution run(Job job, JobParameters jobParameters)
throws JobExecutionAlreadyRunningException, JobRestartException;
}
Or you can get it via JobExecutionListener
public interface JobExecutionListener {
void beforeJob(JobExecution jobExecution);
void afterJob(JobExecution jobExecution);
}
There are other ways to obtain it, but the above two should suffice.
EDIT to answer the comment:
In case you'd like to get the metadata regardless of whether or not the step was executed, there is a convenience method getStepNames, which is defined by AbstractJob and is implemented (e.g.) in SimpleJob as:
/**
* Convenience method for clients to inspect the steps for this job.
*
* @return the step names for this job
*/
public Collection<String> getStepNames() {
List<String> names = new ArrayList<String>();
for (Step step : steps) {
names.add(step.getName());
}
return names;
}

Resources