run spring batch job from the controller - spring

I am trying to run my batch job from a controller. It will be either fired up by a cron job or by accessing a specific link.
I am using Spring Boot, no XML just annotations.
In my current setting I have a service that contains the following beans:
#EnableBatchProcessing
#PersistenceContext
public class batchService {
#Bean
public ItemReader<Somemodel> reader() {
...
}
#Bean
public ItemProcessor<Somemodel, Somemodel> processor() {
return new SomemodelProcessor();
}
#Bean
public ItemWriter writer() {
return new CustomItemWriter();
}
#Bean
public Job importUserJob(JobBuilderFactory jobs, Step step1) {
return jobs.get("importUserJob")
.incrementer(new RunIdIncrementer())
.flow(step1)
.end()
.build();
}
#Bean
public Step step1(StepBuilderFactory stepBuilderFactory,
ItemReader<somemodel> reader,
ItemWriter<somemodel> writer,
ItemProcessor<somemodel, somemodel> processor) {
return stepBuilderFactory.get("step1")
.<somemodel, somemodel> chunk(100)
.reader(reader)
.processor(processor)
.writer(writer)
.build();
}
}
As soon as I put the #Configuration annotation on top of my batchService class, job will start as soon as I run the application. It finished successfully, everything is fine. Now I am trying to remove #Configuration annotation and run it whenever I want. Is there a way to fire it from the controller?
Thanks!

You need to create a application.yml file in the src/main/resources and add following configuration:
spring.batch.job.enabled: false
With this change, the batch job will not automatically execute with the start of Spring Boot. And batch job will be triggered when specific link.
Check out my sample code here:
https://github.com/pauldeng/aws-elastic-beanstalk-worker-spring-boot-spring-batch-template

You can launch a batch job programmatically using JobLauncher which can be injected into your controller. See the Spring Batch documentation for more details, including this example controller:
#Controller
public class JobLauncherController {
#Autowired
JobLauncher jobLauncher;
#Autowired
Job job;
#RequestMapping("/jobLauncher.html")
public void handle() throws Exception{
jobLauncher.run(job, new JobParameters());
}
}

Since you're using Spring Boot, you should leave the #Configuration annotation in there and instead configure your application.properties to not launch the jobs on startup. You can read more about the autoconfiguration options for running jobs at startup (or not) in the Spring Boot documentation here: http://docs.spring.io/spring-boot/docs/current-SNAPSHOT/reference/htmlsingle/#howto-execute-spring-batch-jobs-on-startup

Related

Tasklet or ItemReader that makes calls to Google Cloud Datastore

I'm attempting to create a Spring Batch tasklet that calls a DatastoreRepository. Tasklet execute step
#Override public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext)
throws Exception {
String syncId = this.stepExecutionContext.getJobParameters()
.getString(JobParameterKeys.SYNC_ID);
SyncJob syncJob = syncJobRepo.findById(Long.parseLong(syncId)).get();
When I attempt to call the syncJobRepo I receive a the org.springframework.transaction.NoTransactionException: No transaction aspect-managed TransactionStatus in scope exception.
I have a custom configured datasource (mySql instance) backing Spring Batch for storing the job execution metadata.
I've attempted to define the DatastoreTransactionManager
#Bean
DatastoreTransactionManager datastoreTransactionManager() {
DatastoreTransactionManager manager
= new DatastoreTransactionManager(DatastoreOptions.getDefaultInstance().getService());
return manager;
}
My configuration is annotated with #EnableBatchProcessing
the batch job config:
#Bean public Job customersJob(StepBuilderFactory stepBuilderFactory,
JobBuilderFactory jobBuilderFactory,
Tasklet batchCustomerReader,
SyncJobNotificationListener listener,
DatastoreTransactionManager datastoreTransactionManager) {
Step step = stepBuilderFactory.get(CUSTOMERS_BATCH_JOB_LOAD_FROM_ERP_STEP)
.tasklet(batchCustomerReader)
.transactionManager(datastoreTransactionManager)
.listener(listener)
.build();
return jobBuilderFactory.get(CUSTOMERS_BATCH_JOB)
.incrementer(new RunIdIncrementer())
.start(step)
.build();
}
I have a custom configured datasource (mySql instance) backing Spring Batch for storing the job execution metadata.
If you use different datasources for Spring Batch meta-data and your business data, then you need to configure an XA transaction manager to synchronize the transaction between both datasources. This way, both data and meta-data are kept in sync in case of failure/restart scenario.
A similar Q/A can be found here: Seperate datasource for jobrepository and writer of Spring Batch

Issues with Spring Batch

Hi I have been working in Spring batch recently and need some help.
1) I want to run my Job using multiple threads, hence I have used TaskExecutor as below,
#Bean
public TaskExecutor taskExecutor() {
SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
taskExecutor.setConcurrencyLimit(4);
return taskExecutor;
}
#Bean
public Step myStep() {
return stepBuilderFactory.get("myStep")
.<MyEntity,AnotherEntity> chunk(1)
.reader(reader())
.processor(processor())
.writer(writer())
.taskExecutor(taskExecutor())
.throttleLimit(4)
.build();
}
but, while executing in can see below line in console.
o.s.b.c.l.support.SimpleJobLauncher : No TaskExecutor has been set, defaulting to synchronous executor.
What does this mean? However, while debugging I can see four SimpleAsyncExecutor threads running. Can someone shed some light on this?
2) I don't want to run my Batch application with the metadata tables that spring batch creates. I have tried adding spring.batch.initialize-schema=never. But it didn't work. I also saw some way to do this by using ResourcelessTransactionManager, MapJobRepositoryFactoryBean. But I have to make some database transactions for my job. So will it be alright if I use this?
Also I was able to do this by extending DefaultBatchConfigurer and overriding:
#Override
public void setDataSource(DataSource dataSource) {
// override to do not set datasource even if a datasource exist.
// initialize will use a Map based JobRepository (instead of database)
}
Please guide me further. Thanks.
Update:
My full configuration class here.
#EnableBatchProcessing
#EnableScheduling
#Configuration
public class MyBatchConfiguration{
#Autowired
public JobBuilderFactory jobBuilderFactory;
#Autowired
public StepBuilderFactory stepBuilderFactory;
#Autowired
public DataSource dataSource;
/* #Override
public void setDataSource(DataSource dataSource) {
// override to do not set datasource even if a datasource exist.
// initialize will use a Map based JobRepository (instead of database)
}*/
#Bean
public Step myStep() {
return stepBuilderFactory.get("myStep")
.<MyEntity,AnotherEntity> chunk(1)
.reader(reader())
.processor(processor())
.writer(writer())
.taskExecutor(executor())
.throttleLimit(4)
.build();
}
#Bean
public Job myJob() {
return jobBuilderFactory.get("myJob")
.incrementer(new RunIdIncrementer())
.listener(listener())
.flow(myStep())
.end()
.build();
}
#Bean
public MyJobListener myJobListener()
{
return new MyJobListener();
}
#Bean
public ItemReader<MyEntity> reader()
{
return new MyReader();
}
#Bean
public ItemWriter<? super AnotherEntity> writer()
{
return new MyWriter();
}
#Bean
public ItemProcessor<MyEntity,AnotherEntity> processor()
{
return new MyProcessor();
}
#Bean
public TaskExecutor taskExecutor() {
SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
taskExecutor.setConcurrencyLimit(4);
return taskExecutor;
}}
In the future, please break this up into two independent questions. That being said, let me shed some light on both questions.
SimpleJobLauncher : No TaskExecutor has been set, defaulting to synchronous executor.
Your configuration is configuring myStep to use your TaskExecutor. What that does is it causes Spring Batch to execute each chunk in it's own thread (based on the parameters of the TaskExecutor). The log message you are seeing has nothing to do with that behavior. It has to do with launching your job. By default, the SimpleJobLauncher will launch the job on the same thread it is running on, thereby blocking that thread. You can inject a TaskExecutor into the SimpleJobLauncher which will cause the job to be executed on a different thread from the JobLauncher itself. These are two separate uses of multiple threads by the framework.
I don't want to run my Batch application with the metadata tables that spring batch creates
The short answer here is to just use an in memory database like HSQLDB or H2 for your metadata tables. This provides a production grade data store (so that concurrency is handled correctly) without actually persisting the data. If you use the ResourcelessTransactionManager, you are effectively turning transactions off (a bad idea if you're using a database in any capacity) because that TransactionManager doesn't actually do anything (it's a no-op implementation).

Haw spring batch application are started

Excuse me but this may be a noob question for some but it just crossed my mind and I think it is worth to fix my ideas and get a relevant explanation from some experts.
I just started spring batch Tutorial and i have confusion on haw these application are started. let's take this example on official site
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
#SpringBootApplication
public class Application {
public static void main(String[] args) throws Exception {
SpringApplication.run(Application.class, args);
}
}
and the here is the configuration class
#Configuration
#EnableBatchProcessing
public class BatchConfiguration {
#Autowired
public JobBuilderFactory jobBuilderFactory;
#Autowired
public StepBuilderFactory stepBuilderFactory;
#Autowired
public DataSource dataSource;
// tag::jobstep[]
#Bean
public Job importUserJob(JobCompletionNotificationListener listener) {
return jobBuilderFactory.get("importUserJob")
.incrementer(new RunIdIncrementer())
.listener(listener)
.flow(step1())
.end()
.build();
}
#Bean
public Step step1() {
return stepBuilderFactory.get("step1")
.<Person, Person> chunk(10)
.reader(reader())
.processor(processor())
.writer(writer())
.build();
}
// end::jobstep[]
}
Here it was mentioned that the main() method uses Spring Boot’s SpringApplication.run() method to launch an application.
it' s not clear for me haw the job importUserJob was executed whereas
there is no explicit code found that show haw we start this job, it's only a configuration part (declaration).
On the other hand i found another example haw to start a spring application like this:
public static void main(String[] args) {
GenericApplicationContext context = new AnnotationConfigApplicationContext(MyBatchConfiguration.class);
JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
Job job = (Job) context.getBean("myJobName");//this is bean name of your job
JobExecution execution = jobLauncher.run(job, jobParameters);
}
Here i can understand that the job is executed with the jobLauncher.
This is the Inversion of Control (IOC, or Dependency Injection) in Spring in action. Essentially this is what happens:
The #SpringBootApplication on Application tells Spring that it should search the package of Application and all sub-packages for classes annotated with relevant annotations
The #Configuration annotation on BatchConfiguration is such a relevant annotation and tells Spring that this class contains information about how the application should be configured. This means that Spring will create an instance of this class and read all attributes and methods on it in search of more information.
The #Autowired annotations on the various fields in BatchConfiguration tells Spring that when instantiating this class these fields should be set to the applicable beans (injected).
The #Bean annotation on importUserJob tells Spring that this method produces a bean, this means that Spring will call this method and store the resulting bean under the Job type (and any interfaces/superclasses it inherits) and under the importUserJob name.
Further annotations on the built job will trigger other actions from Spring. It likely contains annotations that tell Spring to run various functions, connect it to some events. Or another bean requests all instances of Job and do stuff with it in the methods it tells Spring to execute.
The #EnableBatchProcessing annotation is a meta-annotation that tells Spring to search other packages and import other configuration classes to make the beans requested for injection (annotated with #Autowired) available.
IOC can be a pretty advanced concept to grasp in the beginning and Spring definitely isn't the easiest IOC-framework there - though it is my personal favourite - but if you want to know more about Spring Framework and IOC I suggest you start by reading the documentation and then continue on by searching for tutorials.

NoSuchJobException when running a job programmatically in Spring Batch

I have a Job running on startup. I want to run this job programmatically at a particular point of my application, not when I start my app.
When running on startup I have no problem, but I got a "NoSuchJobException" (No job configuration with the name [importCityFileJob] was registered) when I try to run it programmatically.
After looking on the web, I think it's a problem related to JobRegistry, but I don't know how to solve it.
Note : my whole batch configuration is set programmatically, I don't use any XML file to configure my batch and my job. That's a big part of my problem while I lack the examples...
Here is my code to run the Job :
public String runBatch() {
try {
JobLauncher launcher = new SimpleJobLauncher();
JobLocator locator = new MapJobRegistry();
Job job = locator.getJob("importCityFileJob");
JobParameters jobParameters = new JobParameters(); // ... ?
launcher.run(job, jobParameters);
} catch (Exception e) {
e.printStackTrace();
System.out.println("Something went wrong");
}
return "Job is running";
}
My Job declaration :
#Bean
public Job importCityFileJob(JobBuilderFactory jobs, Step step) {
return jobs.get("importFileJob").incrementer(new RunIdIncrementer()).flow(step).end().build();
}
(I tried to replace importCityFileJob by importFileJob in my runBatch method, but it didn't work)
My BatchConfiguration file contains the job declaration above, a step declaration, the itemReader/itemWriter/itemProcessor, and that's all.
I use the #EnableBatchProcessing annotation.
I'm new to Spring Batch & I'm stuck on this problem. Any help would be welcome.
Thanks
Edit : I've solved my problem. I wrote my solution in the answers
Here is what I had to do to fix my problem:
Add the following Bean to the BatchConfiguration :
#Bean
public JobRegistryBeanPostProcessor jobRegistryBeanPostProcessor(JobRegistry jobRegistry) {
JobRegistryBeanPostProcessor jobRegistryBeanPostProcessor = new JobRegistryBeanPostProcessor();
jobRegistryBeanPostProcessor.setJobRegistry(jobRegistry);
return jobRegistryBeanPostProcessor;
}
Replace the JobLocator by an #Autowired JobRegistry, and use the #Autowired JobLauncher instead of creating one. My run method now have the following code :
#Autowired
private JobRegistry jobRegistry;
#Autowired
private JobLauncher launcher;
public String runBatch() {
try {
Job job = jobRegistry.getJob("importCityFileJob");
JobParameters jobParameters = new JobParameters();
launcher.run(job, jobParameters);
} catch (Exception e) {
e.printStackTrace();
System.out.println("Something went wrong");
}
return "OK";
}
I hope it will help someone.
A JobRegistry won't populate itself. In your example, you're creating a new instance, then trying to get the job from it without having registered it in the first place. Typically, the JobRegistry is configured as a bean along with an AutomaticJobRegistrar that will load all jobs into the registrar on startup. That doesn't mean they will be executed, just registered so they can be located later.
If you're using Java configuration, this should happen automatically using the #EnableBatchProcessing annotation. With that annotation, you'd just inject the provided JobRegistry and the jobs should already be there.
You can read more about the #EnableBatchProcessing in the documentation here: http://docs.spring.io/spring-batch/apidocs/org/springframework/batch/core/configuration/annotation/EnableBatchProcessing.html
You can also read about the AutomaticJobRegistrar in the documentation here: http://docs.spring.io/spring-batch/apidocs/org/springframework/batch/core/configuration/support/AutomaticJobRegistrar.html
I could not find the correct answer on this page. In my case the spring batch jobs were configured in a different configuration class not annotated with #EnableBatchProcessing. In that case you need to add the Job to the JobRegistry:
import org.springframework.batch.core.Job;
import org.springframework.batch.core.configuration.DuplicateJobException;
import org.springframework.batch.core.configuration.JobRegistry;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.support.ReferenceJobFactory;
import org.springframework.batch.core.job.flow.Flow;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
#Configuration
public class MyBatchJobConfigurations {
#Bean
public Job myCountBatchJob(final JobBuilderFactory jobFactory, final JobRegistry jobRegistry, final Flow myJobFlow)
throws DuplicateJobException {
final Job countJob = jobFactory.get("myCountBatchJob")
.start(myJobFlow)
.end().build();
ReferenceJobFactory referenceJobFactory = new ReferenceJobFactory(countJob);
jobRegistry.register(referenceJobFactory);
return countJob;
}
}
Adding the following bean in the applicationContext.xml resolved the problem for me
<bean class="org.springframework.batch.core.configuration.support.JobRegistryBeanPostProcessor">
<property name="jobRegistry" ref="jobRegistry" />
</bean>
I also have this entry in the applicationContext.xml
<bean id="jobRegistry"
class="org.springframework.batch.core.configuration.support.MapJobRegistry" />
another solution:
rename the method name "importCityFileJob" to "job":
#Bean
public Job job(JobBuilderFactory jobs, Step step) {
return jobs.get("importFileJob").incrementer(new RunIdIncrementer()).flow(step).end().build();
}
#EnableBatchProcessing
#Configuration
public class SpringBatchCommon {
#Bean
public JobRegistryBeanPostProcessor jobRegistryBeanPostProcessor(JobRegistry jobRegistry) {
JobRegistryBeanPostProcessor postProcessor = new JobRegistryBeanPostProcessor();
postProcessor.setJobRegistry(jobRegistry);
return postProcessor;
}
}
Set the JobRegistry in the JobRegistryBeanPostProcessor , after that you can autowire the JobLauncher and the JobLocator
Job job = jobLocator.getJob("importFileJob");
JobParametersBuilder jobBuilder = new JobParametersBuilder();
//set any parameters if required
jobLauncher.run(job, jobBuilder.toJobParameters()

how to select which spring batch job to run based on application argument - spring boot java config

I have two independent spring batch jobs in the same project because I want to use the same infrastructure-related beans. Everything is configured in Java. I would like to know if there's a proper way to start the jobs independent based for example on the first java app argument in the main method for example. If I run SpringApplication.run only the second job gets executed by magic.
The main method looks like:
#ComponentScan
#EnableAutoConfiguration
public class Application {
public static void main(String[] args) {
SpringApplication app = new SpringApplication(Application.class);
app.setWebEnvironment(false);
ApplicationContext ctx= app.run(args);
}
}
and the two jobs are configured as presented in the Spring Batch Getting Started tutorial on Spring.io. Here is the configuration file of the first job, the second being configured in the same way.
#Configuration
#EnableBatchProcessing
#Import({StandaloneInfrastructureConfiguration.class, ServicesConfiguration.class})
public class AddPodcastJobConfiguration {
#Autowired
private JobBuilderFactory jobs;
#Autowired
private StepBuilderFactory stepBuilderFactory;
//reader, writer, processor...
}
To enable modularization I created an AppConfig class, where I define factories for the two jobs:
#Configuration
#EnableBatchProcessing(modular=true)
public class AppConfig {
#Bean
public ApplicationContextFactory addNewPodcastJobs(){
return new GenericApplicationContextFactory(AddPodcastJobConfiguration.class);
}
#Bean
public ApplicationContextFactory newEpisodesNotificationJobs(){
return new GenericApplicationContextFactory(NotifySubscribersJobConfiguration.class);
}
}
P.S. I am new to Spring configuration in Java configuration Spring Boot and Spring Batch...
Just set the "spring.batch.job.names=myJob" property. You could set it as SystemProperty when you launch your application (-Dspring.batch.job.names=myjob). If you have defined this property, spring-batch-starter will only launch the jobs, that are defined by this property.
To run the jobs you like from the main method you can load the the required job configuration bean and the JobLauncher from the application context and then run it:
#ComponentScan
#EnableAutoConfiguration
public class ApplicationWithJobLauncher {
public static void main(String[] args) throws BeansException, JobExecutionAlreadyRunningException, JobRestartException, JobInstanceAlreadyCompleteException, JobParametersInvalidException, InterruptedException {
Log log = LogFactory.getLog(ApplicationWithJobLauncher.class);
SpringApplication app = new SpringApplication(ApplicationWithJobLauncher.class);
app.setWebEnvironment(false);
ConfigurableApplicationContext ctx= app.run(args);
JobLauncher jobLauncher = ctx.getBean(JobLauncher.class);
JobParameters jobParameters = new JobParametersBuilder()
.addDate("date", new Date())
.toJobParameters();
if("1".equals(args[0])){
//addNewPodcastJob
Job addNewPodcastJob = ctx.getBean("addNewPodcastJob", Job.class);
JobExecution jobExecution = jobLauncher.run(addNewPodcastJob, jobParameters);
} else {
jobLauncher.run(ctx.getBean("newEpisodesNotificationJob", Job.class), jobParameters);
}
System.exit(0);
}
}
What was causing my lots of confusion was that the second job were executed, even though the first job seemed to be "picked up" by the runner... Well the problem was that in both job's configuration file I used standard method names writer(), reader(), processor() and step() and it used the ones from the second job that seemed to "overwrite" the ones from the first job without any warnings...
I used though an application config class with #EnableBatchProcessing(modular=true), that I thought would be used magically by Spring Boot :
#Configuration
#EnableBatchProcessing(modular=true)
public class AppConfig {
#Bean
public ApplicationContextFactory addNewPodcastJobs(){
return new GenericApplicationContextFactory(AddPodcastJobConfiguration.class);
}
#Bean
public ApplicationContextFactory newEpisodesNotificationJobs(){
return new GenericApplicationContextFactory(NotifySubscribersJobConfiguration.class);
}
}
I will write a blog post about it when it is ready, but until then the code is available at https://github.com/podcastpedia/podcastpedia-batch (work/learning in progress)..
There is the CommandLineJobRunner and maybe can be helpful.
From its javadoc
Basic launcher for starting jobs from the command line
Spring Batch auto configuration is enabled by adding #EnableBatchProcessing (from Spring Batch) somewhere in your context. By default it executes all Jobs in the application context on startup (see JobLauncherCommandLineRunner for details). You can narrow down to a specific job or jobs by specifying spring.batch.job.names (comma separated job name patterns).
-- Spring Boot Doc
Or disable the auto execution and run the jobs programmatically from the context using a JobLauncher based on the args passed to the main method

Resources