Retry whole batch job n times - Spring

Is it possible to retry a particular job, say, n times?
public void run() {
    String[] springConfig = { "spring/batch/jobs/job-read-files.xml" };
    ApplicationContext context = new ClassPathXmlApplicationContext(springConfig);
    JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
    Job job = (Job) context.getBean("partitionJob");
    JobParameters jobParameters = new JobParameters();
    for (int i = 0; i < 2; i++) {
        try {
            JobExecution execution = jobLauncher.run(job, jobParameters);
            System.out.println("Exit Status : " + execution.getAllFailureExceptions());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    System.out.println("Done");
}
I tried this, but since Spring Batch stores a completion status for the job, it doesn't work the second and third time.
Update: it worked when I tried this:
public void run() {
    for (int i = 0; i <= 2; i++) {
        String[] springConfig = { "spring/batch/jobs/job-read-files.xml" };
        ApplicationContext context = new ClassPathXmlApplicationContext(springConfig);
        JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
        Job job = (Job) context.getBean("partitionJob");
        JobParameters jobParameters = new JobParameters();
        try {
            JobExecution execution = jobLauncher.run(job, jobParameters);
            System.out.println("Exit Status : " + execution.getAllFailureExceptions());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    System.out.println("Done");
}
Is there a better solution than this?
Here is my job config
<!-- partitioner job -->
<job id="partitionJob" restartable="true" xmlns="http://www.springframework.org/schema/batch">
    <!-- master step, 2 threads (grid-size) -->
    <step id="masterStep" next="finalstep">
        <partition step="slave" partitioner="rangePartitioner">
            <handler grid-size="2" task-executor="taskExecutor" />
        </partition>
    </step>
    <step id="finalstep">
        <tasklet>
            <chunk reader="dummyReader" writer="spWriter" commit-interval="1" />
        </tasklet>
    </step>
</job>

<batch:step id="slave">
    <tasklet>
        <chunk reader="pagingItemReader" writer="dummyWriter" commit-interval="2" retry-limit="3">
            <batch:retryable-exception-classes>
                <batch:include class="java.lang.Exception" />
            </batch:retryable-exception-classes>
        </chunk>
    </tasklet>
</batch:step>

Spring has a nice retry mechanism: you define a RetryTemplate, call some piece of code N times, and can supply a RetryCallback and a RecoveryCallback, which is nice.
Spring Batch actually uses it internally for the retry mechanism on a step. You can check out the Spring Retry documentation, and regarding retry at the step level, this nice blog post explains the skip and retry mechanisms in Spring Batch.
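For illustration, here is a minimal sketch of wrapping the launch in a RetryTemplate (jobLauncher and job are the beans from the question; the attempt.time parameter name is arbitrary, and fresh parameters are needed because a completed JobInstance with identical parameters cannot be run again):
RetryTemplate retryTemplate = new RetryTemplate();
retryTemplate.setRetryPolicy(new SimpleRetryPolicy(3)); // at most 3 attempts

JobExecution execution = retryTemplate.execute(
        (RetryCallback<JobExecution, Exception>) retryContext -> {
            // fresh parameters per attempt, so each run is a new JobInstance
            JobParameters params = new JobParametersBuilder()
                    .addLong("attempt.time", System.currentTimeMillis())
                    .toJobParameters();
            JobExecution exec = jobLauncher.run(job, params);
            if (exec.getStatus() != BatchStatus.COMPLETED) {
                throw new IllegalStateException("Attempt failed: " + exec.getStatus());
            }
            return exec;
        },
        retryContext -> {
            // RecoveryCallback: runs once all attempts are exhausted
            System.err.println("Job failed after " + retryContext.getRetryCount() + " attempts");
            return null;
        });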

Related

Spring-batch exiting with Exit Status : COMPLETED before actual job is finished?

In my Spring Batch application I have written a CustomItemWriter which internally writes items to DynamoDB using DynamoDBAsyncClient; this client returns a Future object. I have an input file with millions of records. Since the CustomItemWriter returns the Future object immediately, my batch job exits within 5 seconds with status COMPLETED, but it actually takes 3-4 minutes to write all items to the DB. I want the batch job to finish only after all items are written to the database. How can I do that?
The job is defined as below:
<bean id="report" class="com.solution.model.Report" scope="prototype" />
<batch:job id="job" restartable="true">
<batch:step id="step1">
<batch:tasklet>
<batch:chunk reader="cvsFileItemReader" processor="filterReportProcessor" writer="customItemWriter"
commit-interval="20">
</batch:chunk>
</batch:tasklet>
</batch:step>
</batch:job>
<bean id="customItemWriter" class="com.solution.writer.CustomeWriter"></bean>
CustomeWriter is defined as below:
public class CustomeWriter implements ItemWriter<Report> {
    public void write(List<? extends Report> items) throws Exception {
        List<Future<PutItemResult>> list = new LinkedList<>();
        AmazonDynamoDBAsyncClient client = new AmazonDynamoDBAsyncClient();
        for (Report report : items) {
            PutItemRequest req = new PutItemRequest();
            req.setTableName("MyTable");
            req.setReturnValues(ReturnValue.ALL_OLD);
            req.addItemEntry("customerId", new AttributeValue(report.getCustomeId()));
            Future<PutItemResult> res = client.putItemAsync(req);
            list.add(res);
        }
    }
}
The main class contains:
JobExecution execution = jobLauncher.run(job, new JobParameters());
System.out.println("Exit Status : " + execution.getStatus());
Since the ItemWriter returns a Future object, it doesn't wait for the operation to complete. And since all items have been submitted for writing, from the main class the batch status shows COMPLETED and the job terminates.
I want this job to terminate only after the actual writes have been performed in DynamoDB.
Can we have some other step to wait on this, or is some listener available?
Here is one approach. Since ItemWriter::write doesn't return anything, you can make use of the listener feature.
@Component
@JobScope
public class YourWriteListener implements ItemWriteListener<WhatEverYourTypeIs> {

    @Value("#{jobExecution.executionContext}")
    private ExecutionContext executionContext;

    @Override
    public void afterWrite(final List<? extends WhatEverYourTypeIs> paramList) {
        Future<?> future = (Future<?>) this.executionContext.get("FutureKey");
        // wait till the write is done using the future object
    }

    @Override
    public void beforeWrite(final List<? extends WhatEverYourTypeIs> paramList) {
    }

    @Override
    public void onWriteError(final Exception paramException, final List<? extends WhatEverYourTypeIs> paramList) {
    }
}
In your writer class, everything remains the same except adding the future object to the ExecutionContext.
public class YourItemWriter implements ItemWriter<WhatEverYourTypeIs> {

    @Value("#{jobExecution.executionContext}")
    private ExecutionContext executionContext;

    @Override
    public void write(final List<? extends WhatEverYourTypeIs> youritems) throws Exception {
        // write to DynamoDB and get the Future object back, e.g.
        // Future<PutItemResult> future = client.putItemAsync(req);
        executionContext.put("FutureKey", future);
    }
}
And you can register the listener in your configuration. Here is the Java config; you would do the same in your XML:
@Bean
public Step initStep() {
    return this.stepBuilders.get("someStepName").<YourTypeX, YourTypeY>chunk(10)
            .reader(yourReader).processor(yourProcessor)
            .writer(yourWriter).listener(yourWriteListener)
            .build();
}
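Alternatively, the writer itself can simply block until the chunk's writes have finished, which keeps the step open until the data is really in DynamoDB. A rough sketch against the question's AWS SDK v1 client (getCustomeId is the question's own getter on Report):
public class BlockingDynamoWriter implements ItemWriter<Report> {

    private final AmazonDynamoDBAsyncClient client = new AmazonDynamoDBAsyncClient();

    @Override
    public void write(List<? extends Report> items) throws Exception {
        List<Future<PutItemResult>> pending = new LinkedList<>();
        for (Report report : items) {
            PutItemRequest req = new PutItemRequest()
                    .withTableName("MyTable")
                    .addItemEntry("customerId", new AttributeValue(report.getCustomeId()));
            pending.add(client.putItemAsync(req));
        }
        // Block until every put has finished; a failed put throws here and fails the chunk
        for (Future<PutItemResult> f : pending) {
            f.get();
        }
    }
}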

Spring batch - Rerun from first step

Does anyone know if there is a way to start over in Spring batch?
I want it to first start at Step1, then Step2, Step3, then back to Step1, Step2, Step3, and so forth until a condition is met. I tried googling, but failed to find any concrete examples.
Code so far:
@Bean
Job job(JobBuilderFactory factory) {
    return factory.get(JOB_NAME)
            .start(stagingStep)
            .next(analyzeStep)
            .next(reportingStep)
            .preventRestart()
            .build();
}
I think this can be done in multiple ways:
1. Intercept the job as mentioned here:
<job id="footballJob">
<step id="playerload" parent="s1" next="gameLoad"/>
<step id="gameLoad" parent="s2" next="playerSummarization"/>
<step id="playerSummarization" parent="s3"/>
<listeners>
<listener ref="sampleListener"/>
</listeners>
..
and implement your listener:
public interface JobExecutionListener {
    void beforeJob(JobExecution jobExecution);
    void afterJob(JobExecution jobExecution); // implement and call the job again
}
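For illustration, a rough sketch of such a listener that keeps relaunching the job until some condition holds (conditionMet() is a placeholder you would implement; fresh parameters make each run a new JobInstance):
public class LoopingJobListener implements JobExecutionListener {

    @Autowired
    private JobLauncher jobLauncher;
    @Autowired
    private Job job;

    @Override
    public void beforeJob(JobExecution jobExecution) {
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        if (jobExecution.getStatus() == BatchStatus.COMPLETED && !conditionMet()) {
            try {
                // relaunch with fresh parameters so a new JobInstance is created
                jobLauncher.run(job, new JobParametersBuilder()
                        .addLong("run.time", System.currentTimeMillis())
                        .toJobParameters());
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    private boolean conditionMet() {
        return false; // placeholder for the real termination check
    }
}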
2. Implement your own trigger/scheduler:
<task:scheduled ref="runScheduler" method="run" trigger="mytrigger" />

<bean id="runScheduler" class="com.spring.scheduler.MyScheduler" >
    <property name="jobLauncher" ref="jobLauncher" />
    <property name="job" ref="helloWorldJob" />
</bean>
..
<task:scheduled-tasks>
    <!--task:scheduled ref="runScheduler" method="run" fixed-delay="5000" /> -->
    <task:scheduled ref="runScheduler" method="run" cron="*/5 * * * * *" />
</task:scheduled-tasks>
You can use your own trigger and pass a reference to it as above:
<bean id="mytrigger" class="com.spring.scheduler.MyTrigger" />
public class MyScheduler {

    @Autowired
    private JobLauncher jobLauncher;
    @Autowired
    private Job job;

    public void run() {
        try {
            JobParameters param = new JobParametersBuilder().toJobParameters();
            JobExecution execution = jobLauncher.run(job, param);
            System.out.println("Exit Status in scheduler: " + execution.getStatus());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
and then, if needed, you can create a trigger:
public class MyTrigger implements Trigger {
    @Override
    public Date nextExecutionTime(TriggerContext triggerContext) { ...return date; }
}
3. If only one tasklet needs to be rerun again and again, it's easy: just return RepeatStatus.CONTINUABLE and the tasklet runs again and again...
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
    return RepeatStatus.CONTINUABLE; // RepeatStatus.FINISHED;
}
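To actually stop on a condition, return FINISHED once it holds; a small sketch (doOneUnitOfWork() and shouldContinue() are placeholders):
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
    doOneUnitOfWork(); // placeholder for the real work
    // repeat this tasklet until the condition is met
    return shouldContinue() ? RepeatStatus.CONTINUABLE : RepeatStatus.FINISHED;
}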
And if you want to rerun only some specific steps, that can also be done (manipulate step 1 or 2 and use specific steps to build a job before running it again).

Controlling Spring-Batch Step Flow in a java-config manner

According to Spring-Batch documentation (http://docs.spring.io/spring-batch/2.2.x/reference/html/configureStep.html#controllingStepFlow), controlling step flow in an xml config file is very simple:
e.g. I could write the following job configuration:
<job id="myJob">
<step id="step1">
<fail on="CUSTOM_EXIT_STATUS"/>
<next on="*" to="step2"/>
</step>
<step id="step2">
<end on="1ST_EXIT_STATUS"/>
<next on="2ND_EXIT_STATUS" to="step10"/>
<next on="*" to="step20"/>
</step>
<step id="step10" next="step11" />
<step id="step11" />
<step id="step20" next="step21" />
<step id="step21" next="step22" />
<step id="step22" />
</job>
Is there a simple way defining such a job configuration in a java-config manner? (using JobBuilderFactory and so on...)
As the documentation also mentions, we can only branch the flow based on the exit-status of a step. To be able to report a custom exit-status (possibly different from the one automatically mapped from the batch-status), we must provide an afterStep method via a StepExecutionListener.
Suppose we have an initial step step1 (an instance of a Tasklet class Step1), and we want to do the following:
if step1 fails (e.g. by throwing a runtime exception), then the entire job should be considered as FAILED.
if step1 completes with an exit-status of COMPLETED-WITH-A, then we want to branch to some step step2a which supposedly handles this specific case.
otherwise, we stay on the main truck of the job and continue with step step2.
Now, provide an afterStep method inside Step1 class (also implementing StepExecutionListener):
private static class Step1 implements Tasklet, StepExecutionListener {

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        logger.info("*after-step1* step-execution={}", stepExecution.toString());
        // Report a different exit-status in a random manner (just a demo!).
        // Some of these exit statuses (COMPLETED-WITH-A) are Step1-specific
        // and are used to base a conditional flow on them.
        ExitStatus exitStatus = stepExecution.getExitStatus();
        if (!"FAILED".equals(exitStatus.getExitCode())) {
            double r = Math.random();
            if (r < 0.50)
                exitStatus = null; // i.e. COMPLETED
            else
                exitStatus = new ExitStatus(
                        "COMPLETED-WITH-A",
                        "Completed with some special condition A");
        }
        logger.info("*after-step1* reporting exit-status of {}", exitStatus);
        return exitStatus;
    }

    // .... other methods of Step1
}
Finally, build the job flow inside createJob method of our JobFactory implementation:
@Override
public Job createJob() {
    // Assume some factories returning instances of our Tasklets
    Step step1 = step1();
    Step step2a = step2a();
    Step step2 = step2();

    JobBuilder jobBuilder = jobBuilderFactory.get(JOB_NAME)
            .incrementer(new RunIdIncrementer())
            .listener(listener); // a job-level listener

    // Build job flow
    return jobBuilder
            .start(step1)
            .on("FAILED").fail()
            .from(step1)
            .on("COMPLETED-WITH-A").to(step2a)
            .from(step1)
            .next(step2)
            .end()
            .build();
}
Maybe. If your intention is to write something like a flow decider "programmatically" (using Spring Batch's framework interfaces, I mean), there is the built-in JobExecutionDecider interface, which is enough for most use cases.
As opposed to XML config, you can use Java-config annotations if you are familiar with them; personally I prefer the XML definition, but that's only a personal opinion.
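For completeness, a minimal sketch of such a decider (the branching condition is a placeholder):
public class MyDecider implements JobExecutionDecider {

    @Override
    public FlowExecutionStatus decide(JobExecution jobExecution, StepExecution stepExecution) {
        // placeholder condition; return a custom status to branch the flow on
        boolean specialCaseA = stepExecution.getReadCount() > 0;
        return specialCaseA
                ? new FlowExecutionStatus("COMPLETED-WITH-A")
                : FlowExecutionStatus.COMPLETED;
    }
}
It plugs into the java-config flow in the same way as an exit status, e.g. .start(step1).next(myDecider).on("COMPLETED-WITH-A").to(step2a).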

Spring Batch resume after server's failure

I am using Spring Batch to parse files and I have the following scenario:
I am running a job. This job has to parse a given file. For an unexpected reason (let's say a power cut) the server fails and I have to restart the machine. Now, after restarting the server, I want to resume the job from the point where it stopped before the power cut. This means that if the system had read 1,300 rows out of 10,000, it now has to start reading from row 1,301.
How can I achieve this scenario using Spring Batch?
About the configuration: I use Spring Integration, which polls a directory for new files. When a file arrives, Spring Integration creates the Spring Batch job. Also, Spring Batch uses a FlatFileItemReader to parse the file.
Here is the complete solution to restart a job after a JVM crash.
1. Make the job restartable by setting restartable="true":
<job id="jobName" xmlns="http://www.springframework.org/schema/batch" restartable="true">
2. Code to restart the job:
import java.util.Date;
import java.util.List;
import org.apache.commons.collections.CollectionUtils;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobInstance;
import org.springframework.batch.core.explore.JobExplorer;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.JobOperator;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.beans.factory.annotation.Autowired;
public class RestartJob {

    @Autowired
    private JobExplorer jobExplorer;

    @Autowired
    JobRepository jobRepository;

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    JobOperator jobOperator;

    public void restart() {
        try {
            // this will get the latest job instance from the database
            List<JobInstance> jobInstances = jobExplorer.getJobInstances("jobName", 0, 1);
            if (CollectionUtils.isNotEmpty(jobInstances)) {
                JobInstance jobInstance = jobInstances.get(0);
                List<JobExecution> jobExecutions = jobExplorer.getJobExecutions(jobInstance);
                if (CollectionUtils.isNotEmpty(jobExecutions)) {
                    for (JobExecution execution : jobExecutions) {
                        // If the job status is STARTED then update the status to FAILED
                        // and restart the job using JobOperator
                        if (execution.getStatus().equals(BatchStatus.STARTED)) {
                            execution.setEndTime(new Date());
                            execution.setStatus(BatchStatus.FAILED);
                            execution.setExitStatus(ExitStatus.FAILED);
                            jobRepository.update(execution);
                            jobOperator.restart(execution.getId());
                        }
                    }
                }
            }
        } catch (Exception e1) {
            e1.printStackTrace();
        }
    }
}
3. Infrastructure configuration:
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean"
    p:dataSource-ref="dataSource" p:transactionManager-ref="transactionManager" p:lobHandler-ref="oracleLobHandler"/>

<bean id="oracleLobHandler" class="org.springframework.jdbc.support.lob.DefaultLobHandler"/>

<bean id="jobExplorer" class="org.springframework.batch.core.explore.support.JobExplorerFactoryBean" p:dataSource-ref="dataSource" />

<bean id="jobRegistry" class="org.springframework.batch.core.configuration.support.MapJobRegistry" />

<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository" />
    <property name="taskExecutor" ref="jobLauncherTaskExecutor" />
</bean>

<task:executor id="jobLauncherTaskExecutor" pool-size="6" rejection-policy="ABORT" />

<bean id="jobOperator" class="org.springframework.batch.core.launch.support.SimpleJobOperator"
    p:jobLauncher-ref="jobLauncher" p:jobExplorer-ref="jobExplorer"
    p:jobRepository-ref="jobRepository" p:jobRegistry-ref="jobRegistry"/>
An updated work-around for Spring Batch 4. It takes the JVM start-up time into account when detecting broken jobs. Please note that this will not work in a clustered environment where multiple servers start jobs.
@Bean
public ApplicationListener<ContextRefreshedEvent> resumeJobsListener(JobOperator jobOperator, JobRepository jobRepository,
        JobExplorer jobExplorer) {
    // restart jobs that failed due to a JVM crash
    return event -> {
        Date jvmStartTime = new Date(ManagementFactory.getRuntimeMXBean().getStartTime());
        // for each job
        for (String jobName : jobExplorer.getJobNames()) {
            // get the latest job instance
            for (JobInstance instance : jobExplorer.getJobInstances(jobName, 0, 1)) {
                // for each of the executions
                for (JobExecution execution : jobExplorer.getJobExecutions(instance)) {
                    if (execution.getStatus().equals(BatchStatus.STARTED) && execution.getCreateTime().before(jvmStartTime)) {
                        // this job is broken and must be restarted
                        execution.setEndTime(new Date());
                        execution.setStatus(BatchStatus.FAILED);
                        execution.setExitStatus(ExitStatus.FAILED);
                        for (StepExecution se : execution.getStepExecutions()) {
                            if (se.getStatus().equals(BatchStatus.STARTED)) {
                                se.setEndTime(new Date());
                                se.setStatus(BatchStatus.FAILED);
                                se.setExitStatus(ExitStatus.FAILED);
                                jobRepository.update(se);
                            }
                        }
                        jobRepository.update(execution);
                        try {
                            jobOperator.restart(execution.getId());
                        }
                        catch (JobExecutionException e) {
                            LOG.warn("Couldn't resume job execution {}", execution, e);
                        }
                    }
                }
            }
        }
    };
}
What I would do in your situation is create a step that logs the last processed row in a file, then create a second job that reads this file and starts processing from that specific row number.
So if the job stops for whatever reason, you will be able to run the new job and resume processing.
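A minimal sketch of that idea as a ChunkListener, which checkpoints after every committed chunk so a crash loses at most one chunk (the file name and format are assumptions); the follow-up job could feed the stored number into FlatFileItemReader.setLinesToSkip(...):
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.springframework.batch.core.ChunkListener;
import org.springframework.batch.core.scope.context.ChunkContext;

public class RowCheckpointListener implements ChunkListener {

    // hypothetical checkpoint file holding the last committed read count
    private static final Path CHECKPOINT = Paths.get("last-processed-row.txt");

    @Override
    public void beforeChunk(ChunkContext context) {
    }

    @Override
    public void afterChunk(ChunkContext context) {
        int readCount = context.getStepContext().getStepExecution().getReadCount();
        try {
            // overwrite the checkpoint with the latest committed read count
            Files.write(CHECKPOINT, String.valueOf(readCount).getBytes());
        } catch (IOException e) {
            throw new IllegalStateException("Could not persist checkpoint", e);
        }
    }

    @Override
    public void afterChunkError(ChunkContext context) {
    }
}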
You can also write it like below:
@RequestMapping(value = "/updateStatusAndRestart/{jobId}/{stepId}", method = GET)
public ResponseEntity<String> updateBatchStatus(@PathVariable("jobId") Long jobExecutionId,
        @PathVariable("stepId") Long stepExecutionId) throws Exception {

    StepExecution stepExecution = jobExplorer.getStepExecution(jobExecutionId, stepExecutionId);
    stepExecution.setEndTime(new Date(System.currentTimeMillis()));
    stepExecution.setStatus(BatchStatus.FAILED);
    stepExecution.setExitStatus(ExitStatus.FAILED);
    jobRepository.update(stepExecution);

    JobExecution jobExecution = stepExecution.getJobExecution();
    jobExecution.setEndTime(new Date(System.currentTimeMillis()));
    jobExecution.setStatus(BatchStatus.FAILED);
    jobExecution.setExitStatus(ExitStatus.FAILED);
    jobRepository.update(jobExecution);

    jobOperator.restart(jobExecution.getId());
    return new ResponseEntity<String>("<h1> Batch Status Updated !! </h1>", HttpStatus.OK);
}
Here I have used a REST API endpoint to pass the jobExecutionId and stepExecutionId, setting the status of both the job execution and step execution to FAILED, and then restarting via the JobOperator.

Access Spring Batch Job definition

I've got a Job description:
<job id="importJob" job-repository="jobRepository">
<step id="importStep1" next="importStep2" parent="abstractImportStep">
<tasklet ref="importJobBean" />
</step>
<step id="importStep2" next="importStep3" parent="abstractImportStep">
<tasklet ref="importJobBean" />
</step>
<step id="importStep3" next="importStep4" parent="abstractImportStep">
<tasklet ref="importJobBean" />
</step>
<step id="importStep4" next="importStepFinish" parent="abstractImportStep">
<tasklet ref="importJobBean" />
</step>
<step id="importStepFinish">
<tasklet ref="importJobBean" />
</step>
</job>
I want to know how many steps were defined in "importJob" (5 in this case). It looks like the Job and JobInstance APIs have nothing relevant. Is this possible at all?
You have options
JobExplorer
The cleanest way to read meta data about your Job is through JobExplorer:
public interface JobExplorer {
    List<JobInstance> getJobInstances(String jobName, int start, int count);
    JobExecution getJobExecution(Long executionId);
    StepExecution getStepExecution(Long jobExecutionId, Long stepExecutionId);
    JobInstance getJobInstance(Long instanceId);
    List<JobExecution> getJobExecutions(JobInstance jobInstance);
    Set<JobExecution> findRunningJobExecutions(String jobName);
}
JobExecution
But you can also get it by simply looking at JobExecution:
// Returns the step executions that were registered
public Collection<StepExecution> getStepExecutions()
JobLauncher returns you a JobExecution when you launch the job:
public interface JobLauncher {
    public JobExecution run(Job job, JobParameters jobParameters)
            throws JobExecutionAlreadyRunningException, JobRestartException;
}
Or you can get it via JobExecutionListener
public interface JobExecutionListener {
    void beforeJob(JobExecution jobExecution);
    void afterJob(JobExecution jobExecution);
}
There are other ways to obtain it, but the above two should suffice.
EDIT to answer the comment:
In case you'd like to get the metadata regardless of whether or not the step was executed, there is a convenience method getStepNames, which is defined by AbstractJob and implemented (e.g.) in SimpleJob as:
/**
 * Convenience method for clients to inspect the steps for this job.
 *
 * @return the step names for this job
 */
public Collection<String> getStepNames() {
    List<String> names = new ArrayList<String>();
    for (Step step : steps) {
        names.add(step.getName());
    }
    return names;
}
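So, a quick sketch for counting the steps, assuming the importJob bean is built as a SimpleJob (the usual case for XML-defined jobs):
Job job = (Job) context.getBean("importJob");
if (job instanceof AbstractJob) {
    Collection<String> stepNames = ((AbstractJob) job).getStepNames();
    System.out.println(stepNames.size() + " steps: " + stepNames); // 5 for importJob above
}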
