Spring Batch: resume a job after a server failure

I am using Spring Batch to parse files and I have the following scenario:
I am running a job. This job has to parse a given file. For an unexpected reason (say, a power cut) the server fails and I have to restart the machine. After restarting the server I want to resume the job from the point where it stopped before the power cut. This means that if the system had read 1,300 rows out of 10,000, it should now start reading from row 1,301.
How can I achieve this scenario using Spring Batch?
About the configuration: I use Spring Integration, which polls a directory for new files. When a file arrives, Spring Integration launches the Spring Batch job. The job uses a FlatFileItemReader to parse the file.

Here is a complete solution to restart a job after a JVM crash.
1. Make the job restartable by setting restartable="true":
<job id="jobName" xmlns="http://www.springframework.org/schema/batch" restartable="true">
2. Code to restart the job:
import java.util.Date;
import java.util.List;

import org.apache.commons.collections.CollectionUtils;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobInstance;
import org.springframework.batch.core.explore.JobExplorer;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.JobOperator;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.beans.factory.annotation.Autowired;

public class RestartJob {

    @Autowired
    private JobExplorer jobExplorer;

    @Autowired
    private JobRepository jobRepository;

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private JobOperator jobOperator;

    public void restart() {
        try {
            // get the latest job instance from the database
            List<JobInstance> jobInstances = jobExplorer.getJobInstances("jobName", 0, 1);
            if (CollectionUtils.isNotEmpty(jobInstances)) {
                JobInstance jobInstance = jobInstances.get(0);
                List<JobExecution> jobExecutions = jobExplorer.getJobExecutions(jobInstance);
                if (CollectionUtils.isNotEmpty(jobExecutions)) {
                    for (JobExecution execution : jobExecutions) {
                        // If the execution is still marked STARTED (the JVM died mid-run),
                        // mark it FAILED and restart it through the JobOperator
                        if (execution.getStatus().equals(BatchStatus.STARTED)) {
                            execution.setEndTime(new Date());
                            execution.setStatus(BatchStatus.FAILED);
                            execution.setExitStatus(ExitStatus.FAILED);
                            jobRepository.update(execution);
                            jobOperator.restart(execution.getId());
                        }
                    }
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
3. Configure the batch infrastructure beans:
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean"
      p:dataSource-ref="dataSource" p:transactionManager-ref="transactionManager" p:lobHandler-ref="oracleLobHandler"/>

<bean id="oracleLobHandler" class="org.springframework.jdbc.support.lob.DefaultLobHandler"/>

<bean id="jobExplorer" class="org.springframework.batch.core.explore.support.JobExplorerFactoryBean"
      p:dataSource-ref="dataSource"/>

<bean id="jobRegistry" class="org.springframework.batch.core.configuration.support.MapJobRegistry"/>

<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository"/>
    <property name="taskExecutor" ref="jobLauncherTaskExecutor"/>
</bean>

<task:executor id="jobLauncherTaskExecutor" pool-size="6" rejection-policy="ABORT"/>

<bean id="jobOperator" class="org.springframework.batch.core.launch.support.SimpleJobOperator"
      p:jobLauncher-ref="jobLauncher" p:jobExplorer-ref="jobExplorer"
      p:jobRepository-ref="jobRepository" p:jobRegistry-ref="jobRegistry"/>

An updated work-around for Spring Batch 4. It takes the JVM start-up time into account when detecting broken jobs. Please note that this will not work in a clustered environment where multiple servers start jobs.
@Bean
public ApplicationListener<ContextRefreshedEvent> resumeJobsListener(JobOperator jobOperator, JobRepository jobRepository,
        JobExplorer jobExplorer) {
    // restart jobs that failed because the JVM died while they were running
    return event -> {
        Date jvmStartTime = new Date(ManagementFactory.getRuntimeMXBean().getStartTime());

        // for each job
        for (String jobName : jobExplorer.getJobNames()) {
            // get the latest job instance
            for (JobInstance instance : jobExplorer.getJobInstances(jobName, 0, 1)) {
                // for each of its executions
                for (JobExecution execution : jobExplorer.getJobExecutions(instance)) {
                    if (execution.getStatus().equals(BatchStatus.STARTED) && execution.getCreateTime().before(jvmStartTime)) {
                        // this execution is broken and must be restarted
                        execution.setEndTime(new Date());
                        execution.setStatus(BatchStatus.FAILED);
                        execution.setExitStatus(ExitStatus.FAILED);
                        for (StepExecution se : execution.getStepExecutions()) {
                            if (se.getStatus().equals(BatchStatus.STARTED)) {
                                se.setEndTime(new Date());
                                se.setStatus(BatchStatus.FAILED);
                                se.setExitStatus(ExitStatus.FAILED);
                                jobRepository.update(se);
                            }
                        }
                        jobRepository.update(execution);
                        try {
                            jobOperator.restart(execution.getId());
                        } catch (JobExecutionException e) {
                            LOG.warn("Couldn't resume job execution {}", execution, e);
                        }
                    }
                }
            }
        }
    };
}

What I would do in your situation is create a step that logs the last processed row to a file. Then create a second job that reads this file and starts processing from that specific row number.
So if the job stops for whatever reason, you will be able to run the new job and resume the processing.
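As a sketch of what the second job's reader could look like, assume the recovered row number is handed in as a startRow job parameter and that reportLineMapper() stands in for whatever LineMapper the original job already uses (both names are illustrative). setCurrentItemCount treats that many items as already read, so reading effectively begins at the next row:

@Bean
@StepScope
public FlatFileItemReader<Report> resumingReader(
        @Value("#{jobParameters['inputFile']}") String inputFile,
        @Value("#{jobParameters['startRow']}") Long startRow) {
    FlatFileItemReader<Report> reader = new FlatFileItemReader<>();
    reader.setResource(new FileSystemResource(inputFile));
    reader.setLineMapper(reportLineMapper());
    // consider the first startRow items already processed; reading resumes at startRow + 1
    reader.setCurrentItemCount(startRow.intValue());
    return reader;
}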

You can also write it like below:
@RequestMapping(value = "/updateStatusAndRestart/{jobId}/{stepId}", method = GET)
public ResponseEntity<String> updateBatchStatus(@PathVariable("jobId") Long jobExecutionId,
        @PathVariable("stepId") Long stepExecutionId) throws Exception {

    StepExecution stepExecution = jobExplorer.getStepExecution(jobExecutionId, stepExecutionId);
    stepExecution.setEndTime(new Date(System.currentTimeMillis()));
    stepExecution.setStatus(BatchStatus.FAILED);
    stepExecution.setExitStatus(ExitStatus.FAILED);
    jobRepository.update(stepExecution);

    JobExecution jobExecution = stepExecution.getJobExecution();
    jobExecution.setEndTime(new Date(System.currentTimeMillis()));
    jobExecution.setStatus(BatchStatus.FAILED);
    jobExecution.setExitStatus(ExitStatus.FAILED);
    jobRepository.update(jobExecution);

    jobOperator.restart(jobExecution.getId());
    return new ResponseEntity<String>("<h1> Batch Status Updated !! </h1>", HttpStatus.OK);
}
Here I have used a REST API endpoint that takes the jobExecutionId and stepExecutionId, sets the status of both the job execution and the step execution to FAILED, and then restarts the job using the JobOperator.
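The endpoint above assumes a JobExplorer, JobRepository and JobOperator are already injected into the controller; a sketch of that wiring (field names are illustrative):

@Autowired
private JobExplorer jobExplorer;

@Autowired
private JobRepository jobRepository;

@Autowired
private JobOperator jobOperator;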

Related

Need a way to prevent unwanted job param from propagating to next execution of spring boot batch job

I am running a batch app using Spring Boot 2.1.2 and Spring Batch 4.1.1. The app uses a MySQL database for the Spring Batch metadata data source.
First, I run the job with this command:
java -jar target/batchdemo-0.0.1-SNAPSHOT.jar -Dspring.batch.job.names=echo com.paypal.batch.batchdemo.BatchdemoApplication myparam1=value1 myparam2=value2
Notice I am passing two params:
myparam1=value1
myparam2=value2
Since the job uses RunIdIncrementer, the actual params used by the app are logged as:
Job: [SimpleJob: [name=echo]] completed with the following parameters: [{myparam2=value2, run.id=1, myparam1=value1}]
Next I run the job again, this time dropping myparam2:
java -jar target/batchdemo-0.0.1-SNAPSHOT.jar -Dspring.batch.job.names=echo com.paypal.batch.batchdemo.BatchdemoApplication myparam1=value1
This time the job again runs with param2 still included:
Job: [SimpleJob: [name=echo]] completed with the following parameters: [{myparam2=value2, run.id=2, myparam1=value1}]
This causes business logic to be invoked as if I had again passed myparam2 to the app.
Is there a way to drop the job parameter and have it not be passed to the next instance?
App code:
package com.paypal.batch.batchdemo;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

@SpringBootApplication
@EnableBatchProcessing
public class BatchdemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(BatchdemoApplication.class, args);
    }

    @Autowired
    JobBuilderFactory jobBuilder;

    @Autowired
    StepBuilderFactory stepBuilder;

    @Autowired
    ParamEchoTasklet paramEchoTasklet;

    @Bean
    public RunIdIncrementer incrementer() {
        return new RunIdIncrementer();
    }

    @Bean
    public Job job() {
        return jobBuilder.get("echo").incrementer(incrementer()).start(echoParamsStep()).build();
    }

    @Bean
    public Step echoParamsStep() {
        return stepBuilder.get("echoParams").tasklet(paramEchoTasklet).build();
    }
}

package com.paypal.batch.batchdemo;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.stereotype.Component;

@Component
public class ParamEchoTasklet implements Tasklet {

    private Logger LOGGER = LoggerFactory.getLogger(ParamEchoTasklet.class);

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        LOGGER.info("ParamEchoTasklet BEGIN");
        chunkContext.getStepContext().getJobParameters().entrySet().stream().forEachOrdered((entry) -> {
            String key = entry.getKey();
            Object value = entry.getValue();
            LOGGER.info("Param {} = {}", key, value);
        });
        LOGGER.info("ParamEchoTasklet END");
        return RepeatStatus.FINISHED;
    }
}
I debugged the Spring Batch and Spring Boot code, and here is what is happening. JobParametersBuilder line 273 adds the params from the most recent prior job instance to the nextParameters map, along with any params added by the JobParametersIncrementer:
List<JobExecution> previousExecutions = this.jobExplorer.getJobExecutions(lastInstances.get(0));
if (previousExecutions.isEmpty()) {
    // Normally this will not happen - an instance exists with no executions
    nextParameters = incrementer.getNext(new JobParameters());
}
else {
    JobExecution previousExecution = previousExecutions.get(0);
    nextParameters = incrementer.getNext(previousExecution.getJobParameters());
}
Then, since I am using Spring Boot, JobLauncherCommandLineRunner line 213 merges the prior params with the new params passed for the new execution, which results in the old param being passed to the new execution:
return merge(nextParameters, jobParameters);
It appears to be impossible to ever run the job again without the param, unless I am missing something. Could it be a bug in Spring Batch?
The normal behavior of RunIdIncrementer appears to be to increment the run id of the JobExecution and pass along the remaining prior JobParameters. I would not call this a bug.
Keep in mind that the idea behind the RunIdIncrementer is simply to change one identifying parameter to allow a job to be run again, even if a prior run with the same (other) parameters completed successfully and restart has not been configured.
You could always create a customized incrementer by implementing JobParametersIncrementer.
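For example, here is a minimal sketch of an incrementer that bumps run.id but deliberately drops every other parameter from the previous run (the class name is illustrative):

import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.JobParametersIncrementer;

public class RunIdOnlyIncrementer implements JobParametersIncrementer {

    @Override
    public JobParameters getNext(JobParameters parameters) {
        long nextRunId = (parameters == null) ? 1L : parameters.getLong("run.id", 0L) + 1;
        // build from an empty JobParametersBuilder so business parameters
        // from the previous execution are not carried forward
        return new JobParametersBuilder()
                .addLong("run.id", nextRunId)
                .toJobParameters();
    }
}

Since the merge quoted in the question starts from whatever the incrementer returns, only run.id plus the parameters you actually pass on the command line should reach the new execution.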
Another alternative is to use the JobParametersBuilder to build a JobParameters object and then use the JobLauncher to run your job with those parameters. I often use the current system time in milliseconds to create uniqueness if I'm running jobs that will otherwise have the same JobParameters. You will obviously have to figure out the logic for pulling your specific parameters from the command line (or wherever else) and iterating over them to populate the JobParameters object.
Example:
public JobExecution executeJob(Job job) {
    JobExecution jobExecution = null;
    try {
        JobParameters jobParameters =
            new JobParametersBuilder()
                .addLong("time.millis", System.currentTimeMillis(), true)
                .addString("param1", "value1", true)
                .toJobParameters();
        jobExecution = jobLauncher.run(job, jobParameters);
    } catch (JobInstanceAlreadyCompleteException | JobRestartException | JobParametersInvalidException
            | JobExecutionAlreadyRunningException e) {
        e.printStackTrace();
    }
    return jobExecution;
}

Why Spring scheduler stopped unexpectedly

We have the scheduler below defined in a Spring context XML file.
<!-- Item Scheduling -->
<task:scheduler id="itemScheduler" pool-size="1"/>
<task:scheduled-tasks scheduler="itemScheduler">
    <task:scheduled ref="itemQueuePoller" method="poll" fixed-delay="10000"/>
</task:scheduled-tasks>
and below is the Java code that is executed:
@Service("itemQueuePoller")
public class ItemQueuePoller {

    private final Logger LOG = getLogger(ItemQueuePoller.class);

    private ItemQueueDao itemQueueDao;
    private ItemHandler itemHandler;

    @Autowired
    public ItemQueuePoller(@Qualifier("itemQueueDao") ItemQueueDao itemQueueDao,
            @Qualifier("itemHandler") ItemHandler itemHandler) {
        this.itemQueueDao = itemQueueDao;
        this.itemHandler = itemHandler;
    }

    // scheduled via the batch application context
    public void poll() {
        try {
            List<ItemQueueEntry> entries = itemQueueDao.findNextBatch();
            if (entries == null || entries.isEmpty()) return;
            itemHandler.processJob(entries);
        } catch (Exception e) {
            LOG.error("Exception occurred while processing Queue Items due to: ", e);
            throw new RuntimeException(e);
        }
    }
}
This was working fine every day, but one day it ran only twice and then stopped on its own, without any exception in the logs.
When we restarted the application, it started working fine again.
My question is: why did it stop automatically?

Spring-batch exiting with Exit Status : COMPLETED before actual job is finished?

In my Spring Batch application I have written a CustomItemWriter which internally writes items to DynamoDB using DynamoDBAsyncClient; this client returns a Future object. I have an input file with millions of records. Since the CustomItemWriter returns the Future object immediately, my batch job exits within 5 seconds with status COMPLETED, but in reality it takes 3-4 minutes to write all items to the DB. I want the batch job to finish only after all items are written to the database. How can I do that?
The job is defined as below:
<bean id="report" class="com.solution.model.Report" scope="prototype" />
<batch:job id="job" restartable="true">
    <batch:step id="step1">
        <batch:tasklet>
            <batch:chunk reader="cvsFileItemReader" processor="filterReportProcessor" writer="customItemWriter"
                         commit-interval="20">
            </batch:chunk>
        </batch:tasklet>
    </batch:step>
</batch:job>
<bean id="customItemWriter" class="com.solution.writer.CustomeWriter"></bean>
CustomeWriter is defined as below:
public class CustomeWriter implements ItemWriter<Report> {

    public void write(List<? extends Report> items) throws Exception {
        List<Future<PutItemResult>> list = new LinkedList<>();
        AmazonDynamoDBAsyncClient client = new AmazonDynamoDBAsyncClient();
        for (Report report : items) {
            PutItemRequest req = new PutItemRequest();
            req.setTableName("MyTable");
            req.setReturnValues(ReturnValue.ALL_OLD);
            req.addItemEntry("customerId", new AttributeValue(report.getCustomerId()));
            Future<PutItemResult> res = client.putItemAsync(req);
            list.add(res);
        }
    }
}
The main class contains:
JobExecution execution = jobLauncher.run(job, new JobParameters());
System.out.println("Exit Status : " + execution.getStatus());
Since the ItemWriter returns Future objects, it does not wait for the operation to complete. And because, from the main method's point of view, all items have been submitted for writing, the batch status shows COMPLETED and the job terminates.
I want this job to terminate only after the actual writes have been performed in DynamoDB.
Can we have some other step to wait on this, or is some listener available?
Here is one approach. Since ItemWriter::write doesn't return anything, you can make use of the listener feature.
@Component
@JobScope
public class YourWriteListener implements ItemWriteListener<WhatEverYourTypeIs> {

    @Value("#{jobExecution.executionContext}")
    private ExecutionContext executionContext;

    @Override
    public void afterWrite(final List<? extends WhatEverYourTypeIs> paramList) {
        Future future = (Future) this.executionContext.get("FutureKey");
        // wait till the write is done using the future object
    }

    @Override
    public void beforeWrite(final List<? extends WhatEverYourTypeIs> paramList) {
    }

    @Override
    public void onWriteError(final Exception paramException, final List<? extends WhatEverYourTypeIs> paramList) {
    }
}
In your writer class, everything remains the same except adding the future object to the ExecutionContext.
public class YourItemWriter implements ItemWriter<WhatEverYourTypeIs> {

    @Value("#{jobExecution.executionContext}")
    private ExecutionContext executionContext;

    @Override
    public void write(final List<? extends WhatEverYourTypeIs> youritems) throws Exception {
        // write to DynamoDB and get the Future object
        executionContext.put("FutureKey", future);
    }
}
And you can register the listener in your configuration. Here is the Java code; you need to do the same in your XML.
@Bean
public Step initStep() {
    return this.stepBuilders.get("someStepName").<YourTypeX, YourTypeY>chunk(10)
            .reader(yourReader).processor(yourProcessor)
            .writer(yourWriter).listener(yourWriteListener)
            .build();
}
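An alternative (a sketch, not taken from the answer above) is simply to block on the futures inside write() itself, so the chunk does not complete until DynamoDB has acknowledged every put:

public class CustomeWriter implements ItemWriter<Report> {

    private final AmazonDynamoDBAsyncClient client = new AmazonDynamoDBAsyncClient();

    @Override
    public void write(List<? extends Report> items) throws Exception {
        List<Future<PutItemResult>> futures = new LinkedList<>();
        for (Report report : items) {
            PutItemRequest req = new PutItemRequest();
            req.setTableName("MyTable");
            req.addItemEntry("customerId", new AttributeValue(report.getCustomerId()));
            futures.add(client.putItemAsync(req));
        }
        // wait for every asynchronous put before the chunk commits;
        // get() rethrows as an ExecutionException if any write failed,
        // which fails the step instead of silently completing
        for (Future<PutItemResult> future : futures) {
            future.get();
        }
    }
}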

Stop the task in Spring Integration framework

I would like to stop a KeepAliveReceiver task after a given event. I tested the following solutions and none of them works: 1) sending keepAliveReceiver.stop() to the control channel, 2) implementing Lifecycle and calling stop(), 3) stopping the scheduler. Any ideas how I can stop the task from within the running task?
@MessageEndpoint
public class KeepAliveReceiver implements Runnable, Lifecycle {

    private int limit;

    @Autowired
    private ControlBusGateway controlGateway; // sends messages to the control channel

    @Autowired
    private ThreadPoolTaskScheduler myScheduler;

    @Override
    public void run() {
        ...
        if (event) {
            LOGGER.debug("FAILOVER! Starting messageReceiveRouter.");
            controlGateway.send(new GenericMessage<String>("@keepAliveReceiver.stop()"));
            // not allowed
            myScheduler.shutdown();
            // not working, the scheduler keeps starting the keepAliveReceiver
            this.stop();
            // not working
        }
    }

    @Override
    public void stop() {
        LOGGER.debug("STOPPED!");
    }
}
and the XML definition of the scheduler:
<task:scheduler id="myScheduler" pool-size="10" />
<task:scheduled-tasks>
    <task:scheduled ref="keepAliveReceiver" method="run" fixed-rate="500" />
</task:scheduled-tasks>
Send to the controlGateway a Message with an empty command ;-)
'Kill' your <control-bus> and change it to
<outbound-channel-adapter channel="stopSchedulerChannel" expression="@myScheduler.shutdown()"/>
And add
<channel id="stopSchedulerChannel">
    <dispatcher task-executor="executor"/>
</channel>
And configure an appropriate executor bean.
Your problem is that you want to stop the task from within itself. On the other hand, <control-bus> only allows operations on SmartLifecycle implementors.

Spring batch :Restart a job and then start next job automatically

I need to create a recovery pattern.
In my pattern I can launch a job only in a given time window.
In case the job fails, it will only be restarted in the next time window, and when it finishes I would like to start the scheduled job that was originally planned for that window.
The only difference between the jobs is the time window parameters.
I thought about a JobExecutionDecider in conjunction with a JobExplorer, or overriding a JobLauncher, but both seem too intrusive.
I failed to find an example that matches my needs; any ideas will be most welcome.
Just to recap what was actually done, based on the advice provided by incomplete-co.de.
I created a recovery flow similar to the one below. The recovery flow wraps my actual batch job and is responsible only for serving the correct job parameters to the internal job: the initial parameters on the first execution, new parameters on a normal execution, or the old parameters in case the last execution failed.
<batch:job id="recoveryWrapper"
incrementer="wrapperRunIdIncrementer"
restartable="true">
<batch:decision id="recoveryFlowDecision" decider="recoveryFlowDecider">
<batch:next on="FIRST_RUN" to="defineParametersOnFirstRun" />
<batch:next on="RECOVER" to="recover.batchJob " />
<batch:next on="CURRENT" to="current.batchJob " />
</batch:decision>
<batch:step id="defineParametersOnFirstRun" next="current.batchJob">
<batch:tasklet ref="defineParametersOnFirstRunTasklet"/>
</batch:step>
<batch:step id="recover.batchJob " next="current.batchJob">
<batch:job ref="batchJob" job-launcher="jobLauncher"
job-parameters-extractor="jobParametersExtractor" />
</batch:step>
<batch:step id="current.batchJob" >
<batch:job ref="batchJob" job-launcher="jobLauncher"
job-parameters-extractor="jobParametersExtractor" />
</batch:step>
</batch:job>
The heart of the solution is the RecoveryFlowDecider and the JobParametersExtractor, used together with the Spring Batch restart mechanism.
The RecoveryFlowDecider queries the JobExplorer and JobRepository to find out whether the last run failed. It places the last execution on the execution context of the wrapper, to be used later by the JobParametersExtractor.
Note the use of a run-id incrementer to allow re-execution of the wrapper job.
@Component
public class RecoveryFlowDecider implements JobExecutionDecider {

    private static final String FIRST_RUN = "FIRST_RUN";
    private static final String CURRENT = "CURRENT";
    private static final String RECOVER = "RECOVER";

    @Autowired
    private JobExplorer jobExplorer;

    @Autowired
    private JobRepository jobRepository;

    @Override
    public FlowExecutionStatus decide(JobExecution jobExecution, StepExecution stepExecution) {
        // the wrapper is named as the wrapped job + WRAPPER
        String wrapperJobName = jobExecution.getJobInstance().getJobName();
        String jobName;
        jobName = wrapperJobName.substring(0, wrapperJobName.indexOf(EtlConstants.WRAPPER));
        List<JobInstance> instances = jobExplorer.getJobInstances(jobName, 0, 1);
        JobInstance internalJobInstance = instances.size() > 0 ? instances.get(0) : null;

        if (null == internalJobInstance) {
            return new FlowExecutionStatus(FIRST_RUN);
        }

        JobExecution lastExecution = jobRepository.getLastJobExecution(internalJobInstance.getJobName(),
                internalJobInstance.getJobParameters());

        // place the last execution on the (wrapper) execution context to use later
        jobExecution.getExecutionContext().put(EtlConstants.LAST_EXECUTION, lastExecution);

        ExitStatus exitStatus = lastExecution.getExitStatus();
        if (ExitStatus.FAILED.equals(exitStatus) || ExitStatus.UNKNOWN.equals(exitStatus)) {
            return new FlowExecutionStatus(RECOVER);
        } else if (ExitStatus.COMPLETED.equals(exitStatus)) {
            return new FlowExecutionStatus(CURRENT);
        }

        // we should never get here unless we have a defect
        throw new RuntimeException("Unexpected batch status: " + exitStatus + " in decider!");
    }
}
Then the JobParametersExtractor tests again for the outcome of the last execution. In case of a failed job it serves the original parameters that were used to execute the failed job, triggering the Spring Batch restart mechanism. Otherwise it creates a new set of parameters and the job executes on its normal course.
@Component
public class JobExecutionWindowParametersExtractor implements JobParametersExtractor {

    @Override
    public JobParameters getJobParameters(Job job, StepExecution stepExecution) {
        // read the last execution from the wrapping job
        // in order to build the next execution window
        JobExecution lastExecution = (JobExecution) stepExecution.getJobExecution()
                .getExecutionContext().get(EtlConstants.LAST_EXECUTION);

        if (null != lastExecution) {
            if (ExitStatus.FAILED.equals(lastExecution.getExitStatus())) {
                // re-use the failed run's parameters so Spring Batch restarts that instance
                JobInstance instance = lastExecution.getJobInstance();
                JobParameters parameters = instance.getJobParameters();
                return parameters;
            }
        }

        // no failed execution, or no execution at all: create a new execution window
        return buildJobParamaters(lastExecution, stepExecution);
    }

    ...
}
Have you considered a JobStep? That is, a step determines whether there are any additional jobs to be run; this value is set into the step's ExecutionContext. A JobExecutionDecider then checks for this value; if it exists, it directs the flow to a JobStep which launches the job.
Here is the doc on it: http://docs.spring.io/spring-batch/reference/htmlsingle/#external-flows
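A minimal Java-config sketch of such a JobStep, assuming a stepBuilderFactory field and the child job and launcher beans are available (all names are illustrative):

import org.springframework.batch.core.step.job.DefaultJobParametersExtractor;

@Bean
public Step launchScheduledJobStep(Job scheduledJob, JobLauncher jobLauncher) {
    // a JobStep runs another Job as one step of the wrapper job
    return stepBuilderFactory.get("launchScheduledJobStep")
            .job(scheduledJob)
            .launcher(jobLauncher)
            .parametersExtractor(new DefaultJobParametersExtractor())
            .build();
}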
Is it possible to do it in the opposite manner?
In every time window, submit the job intended for that time window.
However, the very first step of the job should check whether the job from the previous time window completed successfully or not. If it failed, then submit the previous job and wait for its completion before going into its own logic.
