Method annotated with @Bean is called directly. Use dependency injection instead - spring

I'm following a Spring Batch tutorial, and when I write the following code IntelliJ complains that the tasklet(null) call in the job method is a direct call:
Method annotated with @Bean is called directly. Use dependency injection instead.
I can get the error to go away if I remove the @Bean annotation from the job - but I want to know what's going on. How can I inject the bean there? Simply writing tasklet(Tasklet tasklet(null)) gives the same error.
@Bean
@StepScope
public Tasklet tasklet(@Value("#{jobParameters['name']}") String name) {
return ((contribution, chunkContext) -> {
System.out.println(String.format("This is %s", name));
return RepeatStatus.FINISHED;
});
}
@Bean
public Job job() {
return jobBuilderFactory.get("job")
.start(stepBuilderFactory.get("step1")
.tasklet(tasklet(null)) // tasklet(null) = problem
.build())
.build();
}

@Bean
@StepScope
public Tasklet tasklet(@Value("#{jobParameters['name']}") String name) {
return ((contribution, chunkContext) -> {
System.out.println(String.format("This is %s", name));
return RepeatStatus.FINISHED;
});
}
@Bean
public Job job(Tasklet tasklet) {
return jobBuilderFactory.get("job")
.start(stepBuilderFactory.get("step1")
.tasklet(tasklet)
.build())
.build();
}
Spring bean creation and AOP-based scoping are picky about how bean methods are invoked, so you need to be careful here.
In this case, declaring the Tasklet as a dependency of job() instead of calling tasklet(null) directly lets Spring inject the step-scoped proxy, so the name job parameter is resolved at runtime rather than being null.
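For context, the snippets above assume jobBuilderFactory and stepBuilderFactory fields that are not shown. A minimal sketch of what the complete configuration class might look like (the class name and the field wiring are assumptions, Spring Batch 4.x style):
import org.springframework.batch.core.Job;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class BatchConfig {

    // Builder factories provided by @EnableBatchProcessing
    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Bean
    @StepScope
    public Tasklet tasklet(@Value("#{jobParameters['name']}") String name) {
        return (contribution, chunkContext) -> {
            System.out.println(String.format("This is %s", name));
            return RepeatStatus.FINISHED;
        };
    }

    @Bean
    public Job job(Tasklet tasklet) {
        // The Tasklet parameter is resolved from the context, so Spring hands job()
        // the step-scoped proxy and 'name' is bound from jobParameters at runtime
        // instead of staying null.
        return jobBuilderFactory.get("job")
                .start(stepBuilderFactory.get("step1")
                        .tasklet(tasklet)
                        .build())
                .build();
    }
}
Constructor injection of the two factories works just as well as the field injection shown here.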

Related

How to add tasklet to run after each partition step completion in Spring Batch

I am new to Spring Batch and am implementing a Spring Batch job that has to pull a huge data set from the DB and write it to a file. Below is the sample job config, which is working as expected for me.
@Bean
public Job customDBReaderFileWriterJob() throws Exception {
return jobBuilderFactory.get(MY_JOB)
.incrementer(new RunIdIncrementer())
.flow(partitionGenerationStep())
.next(cleanupStep())
.end()
.build();
}
@Bean
public Step partitionGenerationStep() throws Exception {
return stepBuilderFactory
.get("partitionGenerationStep")
.partitioner("Partitioner", partitioner())
.step(multiOperationStep())
.gridSize(50)
.taskExecutor(taskExecutor())
.build();
}
@Bean
public Step multiOperationStep() throws Exception {
return stepBuilderFactory
.get("MultiOperationStep")
.<Input, Output>chunk(100)
.reader(reader())
.processor(processor())
.writer(writer())
.build();
}
@Bean
@StepScope
public DBPartitioner partitioner() {
DBPartitioner dbPartitioner = new DBPartitioner();
dbPartitioner.setColumn(ID);
dbPartitioner.setDataSource(dataSource);
dbPartitioner.setTable(TABLE);
return dbPartitioner;
}
@Bean
@StepScope
public Reader reader() {
return new Reader();
}
@Bean
@StepScope
public Processor processor() {
return new Processor();
}
@Bean
@StepScope
public Writer writer() {
return new Writer();
}
@Bean
public Step cleanupStep() {
return stepBuilderFactory.get("cleanupStep")
.tasklet(cleanupTasklet())
.build();
}
@Bean
@StepScope
public CleanupTasklet cleanupTasklet() {
return new CleanupTasklet();
}
@Bean
public TaskExecutor taskExecutor() {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(10);
executor.setMaxPoolSize(10);
executor.setQueueCapacity(10);
executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
executor.setThreadNamePrefix("MultiThreaded-");
return executor;
}
As the data set is huge, I have configured the task executor's thread pool size as 10 and the grid size as 50. With this setup, 10 threads write to 10 files at a time, and the reader reads in chunks, so the reader-processor-writer flow iterates multiple times (for a group of 10, before moving to the next partitions).
Now I would like to add a tasklet that compresses the files once all iterations (read, process, write) for one thread are completed, i.e. after completion of each partition.
I do have a cleanup tasklet that runs last, but putting the compression logic there would mean first collecting all the files generated by every partition and only then compressing them. Please suggest.
You can change your worker step multiOperationStep to be a FlowStep of a chunk-oriented step followed by a simple tasklet step where you do the compression. In other words, the worker step is actually two steps combined in one FlowStep.
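As a rough sketch of that idea (slotting into the same configuration class as the question, and assuming a hypothetical compressionTasklet() bean that zips the files written by the current partition):
@Bean
public Step multiOperationStep() throws Exception {
    // Chunk-oriented part of the worker, unchanged from the original step
    Step chunkStep = stepBuilderFactory
            .get("multiOperationChunkStep")
            .<Input, Output>chunk(100)
            .reader(reader())
            .processor(processor())
            .writer(writer())
            .build();

    // Tasklet step that compresses the files produced by this partition
    Step compressionStep = stepBuilderFactory
            .get("compressionStep")
            .tasklet(compressionTasklet()) // hypothetical @StepScope tasklet bean
            .build();

    // Combine both into one flow and expose it as a single worker step,
    // so the partitioner still launches one Step per partition
    Flow workerFlow = new FlowBuilder<SimpleFlow>("workerFlow")
            .start(chunkStep)
            .next(compressionStep)
            .build();

    return stepBuilderFactory
            .get("MultiOperationStep")
            .flow(workerFlow)
            .build();
}
This way the compression runs after each partition finishes its read/process/write iterations, while the final cleanupStep can stay as it is.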

How to avoid configuration via config?

Is there an example of how to generate a Step in code, rather than as a configured bean?
@Bean
public Step simpleStep(JdbcBatchItemWriter<SomeInputDto> writer) {
return stepBuilderFactory.get("simpleStep")
.<SomeInputDto, SomeOutputDto> chunk(100)
.reader(reader())
.processor(handler())
.writer(writer)
.taskExecutor(taskExecutor())
.build();
}
Exactly - I want to generate a reader and a step for each table of my database.
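One possible direction, purely as a sketch: build the reader and the step in a plain helper method (not a @Bean) and chain one step per table into the job, so only the Job itself is a configured bean. The table list, the dataSource field, the ColumnMapRowMapper-based reader, and the placeholder writer below are assumptions for illustration, not code from the thread.
// Plain helper, called once per table - no @Bean annotation
private Step stepForTable(String table) {
    JdbcCursorItemReader<Map<String, Object>> reader =
            new JdbcCursorItemReaderBuilder<Map<String, Object>>()
                    .name(table + "Reader")
                    .dataSource(dataSource)
                    .sql("SELECT * FROM " + table) // assumes a trusted, hard-coded table list
                    .rowMapper(new ColumnMapRowMapper())
                    .build();

    return stepBuilderFactory.get(table + "Step")
            .<Map<String, Object>, Map<String, Object>>chunk(100)
            .reader(reader)
            .writer(items -> {
                // placeholder writer; replace with a real ItemWriter per table
                System.out.println(table + ": wrote " + items.size() + " rows");
            })
            .build();
}

@Bean
public Job perTableJob() {
    List<String> tables = Arrays.asList("table_a", "table_b"); // placeholder table names
    SimpleJobBuilder builder = jobBuilderFactory.get("perTableJob")
            .start(stepForTable(tables.get(0)));
    for (String table : tables.subList(1, tables.size())) {
        builder = builder.next(stepForTable(table));
    }
    return builder.build();
}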

Spring batch SystemCommandTasklet throwing null pointer exception

I am new to Spring Batch and am trying to run a Linux sort command after the batch process, using SystemCommandTasklet as a second step. However, it throws a NullPointerException when sorting bigger files (around 250 MB, which takes some time). It looks like SystemCommandTasklet fails to initialize its StepExecution in beforeStep() and throws an error. Can someone check my configuration and let me know what I am missing that is causing this?
BatchConfig.java
@Bean
public Job job() throws Exception {
return jobs.get("job")
.incrementer(new RunIdIncrementer())
.flow(step1()).on("FAILED").fail().on("COMPLETED").to(step2())
.end()
.build();
}
@Bean
public Step step1() {
return steps.get("step1")
.<FileEntry,FileEntry>chunk(100)
.reader(reader()).faultTolerant().skipLimit(MAX_SKIP_LIMIT).skip(FlatFileParseException.class)
.processor(new Processor())
.writer(compositeWriter()).stream(outputwriter()).stream(rejectwriter())
.listener(new CustomStepExecutionListener())
.build();
}
@Bean
public Step step2() throws Exception {
return steps.get("step2")
.tasklet(sortingTasklet())
.build();
}
@Bean
@StepScope
public Tasklet sortingTasklet() throws Exception {
SystemCommandTasklet tasklet = new SystemCommandTasklet();
logger.debug("Sorting File : " + getOutputFileName());
tasklet.setCommand(new String("sort " + getOutputFileName() + " -d -s -t \001 -k1,1 -o " + getOutputFileName() + ".sorted "));
tasklet.setTimeout(600000l);
return tasklet;
}
Here is the link to the Spring Batch source code for SystemCommandTasklet; it's throwing the NullPointerException at line 131.
https://github.com/spring-projects/spring-batch/blob/master/spring-batch-core/src/main/java/org/springframework/batch/core/step/tasklet/SystemCommandTasklet.java
You aren't registering the SystemCommandTasklet as a StepExecutionListener, and since you aren't returning the implementing class from the @Bean method, Spring Batch doesn't know that the tasklet implements that interface. I'd recommend two things to be safe:
Change the tasklet's configuration method signature to be:
@Bean
@StepScope
public SystemCommandTasklet sortingTasklet() throws Exception {
Register the tasklet as a listener on your step as well, similar to how you're doing it with the CustomStepExecutionListener.
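Putting both suggestions together, the two bean methods might end up looking roughly like this (a sketch based on the question's code, not tested against the original setup):
@Bean
@StepScope
public SystemCommandTasklet sortingTasklet() throws Exception {
    SystemCommandTasklet tasklet = new SystemCommandTasklet();
    logger.debug("Sorting File : " + getOutputFileName());
    tasklet.setCommand("sort " + getOutputFileName() + " -d -s -t \001 -k1,1 -o " + getOutputFileName() + ".sorted ");
    tasklet.setTimeout(600000L);
    return tasklet;
}

@Bean
public Step step2() throws Exception {
    return steps.get("step2")
            .tasklet(sortingTasklet())
            // register the tasklet as a StepExecutionListener so its beforeStep()
            // receives the StepExecution before execute() runs
            .listener(sortingTasklet())
            .build();
}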

Spring Boot & Spring Batch auto wiring

I'm trying to get my head around Spring Boot's autowiring for Spring Batch. In the example in the Spring docs, there are a number of parameters being passed in the constructor to some beans.
The example works, but if I try to create another job using identical configuration (naming it importUserJob2), I get a non-unique bean exception - the error reports that there are 2 Step beans.
@Bean
public Job importUserJob(JobBuilderFactory jobs, Step s1, JobExecutionListener listener) {
return jobs.get("importUserJob")
.incrementer(new RunIdIncrementer())
.listener(listener)
.flow(s1)
.end()
.build();
}
What does the constructor mean here when used with parameters (I can't see where these parameters are supplied), and where are these beans being created? How do I create 2 jobs?
EDIT: here are the 2 jobs, 2 steps and the exception I get.
@Bean
public Job helloJob(JobBuilderFactory jobs, Step s1, JobExecutionListener listener) {
return jobs.get("helloJob")
.incrementer(new RunIdIncrementer())
.listener(listener)
.flow(s1)
.end()
.build();
}
@Bean
public Step step1(StepBuilderFactory stepBuilderFactory) {
return stepBuilderFactory.get("step1")
.tasklet(helloTasklet())
.build();
}
@Bean
public Job otherJob(JobBuilderFactory jobs, Step s1, JobExecutionListener listener) {
return jobs.get("otherJob")
.incrementer(new RunIdIncrementer())
.listener(listener)
.flow(s1)
.end()
.build();
}
@Bean
public Step step2(StepBuilderFactory stepBuilderFactory, ItemReader<MyReadItem> reader,
ItemWriter< MyReadItem > writer) {
return stepBuilderFactory.get("step2")
.< MyReadItem, MyReadItem > chunk(10)
.reader(reader)
.writer(writer)
.build();
}
And the exception:
nested exception is org.springframework.beans.factory.NoUniqueBeanDefinitionException: No qualifying bean of type [org.springframework.batch.core.Step] is defined: expected single matching bean but found 2: step1,step2
What creates Step s1? Am I right in thinking that Spring just looks for any bean of the correct type (Step), but finds 2? Do I need to qualify the steps when they're created to ensure the correct one is injected?
In both of your job definition methods you are using "Step s1" as a parameter. Since "s1" is not the name of a bean, Spring tries to autowire by type. However, it finds two Step beans ("step1" and "step2") in the context, which is the reason for your exception.
You could either change the parameter names to match the bean names, or call the method directly in your definitions:
@Bean
public Job otherJob(JobBuilderFactory jobs, JobExecutionListener listener) {
return jobs.get("otherJob")
.incrementer(new RunIdIncrementer())
.listener(listener)
.flow(step2())
.end()
.build();
}
Spring takes care that this does not simply call the step2 method; instead, it provides the real Spring bean.
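Alternatively, if you prefer to keep the Step as a method parameter, you can disambiguate by bean name, for example with @Qualifier (a sketch based on the step names above):
@Bean
public Job otherJob(JobBuilderFactory jobs, @Qualifier("step2") Step step2, JobExecutionListener listener) {
    return jobs.get("otherJob")
            .incrementer(new RunIdIncrementer())
            .listener(listener)
            .flow(step2)
            .end()
            .build();
}
Naming the parameter exactly like the bean (step1 or step2) has the same effect, as long as parameter names are kept in the compiled classes.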

spring-boot-starter-jta-atomikos and spring-boot-starter-batch

Is it possible to use both these starters in a single application?
I want to load records from a CSV file into a database table. The Spring Batch tables are stored in a different database, so I assume I need to use JTA to handle the transaction.
Whenever I add @EnableBatchProcessing to my @Configuration class it configures a PlatformTransactionManager, which stops this being auto-configured by Atomikos.
Are there any spring boot + batch + jta samples out there that show how to do this?
I just went through this and I found something that seems to work. As you note, @EnableBatchProcessing causes a DataSourceTransactionManager to be created, which messes up everything. I'm using modular=true in @EnableBatchProcessing, so the ModularBatchConfiguration class is activated.
What I did was to stop using @EnableBatchProcessing and instead copy the entire ModularBatchConfiguration class into my project. Then I commented out the transactionManager() method, since the Atomikos configuration creates the JtaTransactionManager. I also had to override the jobRepository() method, because that was hardcoded to use the DataSourceTransactionManager created inside DefaultBatchConfiguration.
I also had to explicitly import the JtaAutoConfiguration class. This wires everything up correctly (according to the Actuator's "beans" endpoint - thank god for that). But when you run it the transaction manager throws an exception because something somewhere sets an explicit transaction isolation level. So I also wrote a BeanPostProcessor to find the transaction manager and call txnMgr.setAllowCustomIsolationLevels(true);
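For illustration, such a post-processor can be as small as the following sketch (the class name is my own; it just flips the flag on any JtaTransactionManager it finds):
@Component
public class AllowCustomIsolationLevelsPostProcessor implements BeanPostProcessor {

    @Override
    public Object postProcessAfterInitialization(Object bean, String beanName) {
        if (bean instanceof JtaTransactionManager) {
            // JtaTransactionManager rejects custom isolation levels unless this flag is set
            ((JtaTransactionManager) bean).setAllowCustomIsolationLevels(true);
        }
        return bean;
    }
}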
Now everything works, but while the job is running, I cannot fetch the current data from batch_step_execution table using JdbcTemplate, even though I can see the data in SQLYog. This must have something to do with transaction isolation, but I haven't been able to understand it yet.
Here is what I have for my configuration class, copied from Spring and modified as noted above. PS, I have my DataSource that points to the database with the batch tables annotated as @Primary. Also, I changed my DataSource beans to be instances of org.apache.tomcat.jdbc.pool.XADataSource; I'm not sure if that's necessary.
@Configuration
@Import(ScopeConfiguration.class)
public class ModularJtaBatchConfiguration implements ImportAware
{
@Autowired(required = false)
private Collection<DataSource> dataSources;
private BatchConfigurer configurer;
@Autowired
private ApplicationContext context;
@Autowired(required = false)
private Collection<BatchConfigurer> configurers;
private AutomaticJobRegistrar registrar = new AutomaticJobRegistrar();
@Bean
public JobRepository jobRepository(DataSource batchDataSource, JtaTransactionManager jtaTransactionManager) throws Exception
{
JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
factory.setDataSource(batchDataSource);
factory.setTransactionManager(jtaTransactionManager);
factory.afterPropertiesSet();
return factory.getObject();
}
@Bean
public JobLauncher jobLauncher() throws Exception {
return getConfigurer(configurers).getJobLauncher();
}
// @Bean
// public PlatformTransactionManager transactionManager() throws Exception {
// return getConfigurer(configurers).getTransactionManager();
// }
@Bean
public JobExplorer jobExplorer() throws Exception {
return getConfigurer(configurers).getJobExplorer();
}
@Bean
public AutomaticJobRegistrar jobRegistrar() throws Exception {
registrar.setJobLoader(new DefaultJobLoader(jobRegistry()));
for (ApplicationContextFactory factory : context.getBeansOfType(ApplicationContextFactory.class).values()) {
registrar.addApplicationContextFactory(factory);
}
return registrar;
}
@Bean
public JobBuilderFactory jobBuilders(JobRepository jobRepository) throws Exception {
return new JobBuilderFactory(jobRepository);
}
@Bean
// hopefully this will autowire the Atomikos JTA txn manager
public StepBuilderFactory stepBuilders(JobRepository jobRepository, JtaTransactionManager ptm) throws Exception {
return new StepBuilderFactory(jobRepository, ptm);
}
@Bean
public JobRegistry jobRegistry() throws Exception {
return new MapJobRegistry();
}
@Override
public void setImportMetadata(AnnotationMetadata importMetadata) {
AnnotationAttributes enabled = AnnotationAttributes.fromMap(importMetadata.getAnnotationAttributes(
EnableBatchProcessing.class.getName(), false));
Assert.notNull(enabled,
"#EnableBatchProcessing is not present on importing class " + importMetadata.getClassName());
}
protected BatchConfigurer getConfigurer(Collection<BatchConfigurer> configurers) throws Exception {
if (this.configurer != null) {
return this.configurer;
}
if (configurers == null || configurers.isEmpty()) {
if (dataSources == null || dataSources.isEmpty()) {
throw new UnsupportedOperationException("You are screwed");
} else if(dataSources != null && dataSources.size() == 1) {
DataSource dataSource = dataSources.iterator().next();
DefaultBatchConfigurer configurer = new DefaultBatchConfigurer(dataSource);
configurer.initialize();
this.configurer = configurer;
return configurer;
} else {
throw new IllegalStateException("To use the default BatchConfigurer the context must contain no more than" +
"one DataSource, found " + dataSources.size());
}
}
if (configurers.size() > 1) {
throw new IllegalStateException(
"To use a custom BatchConfigurer the context must contain precisely one, found "
+ configurers.size());
}
this.configurer = configurers.iterator().next();
return this.configurer;
}
}
@Configuration
class ScopeConfiguration {
private StepScope stepScope = new StepScope();
private JobScope jobScope = new JobScope();
@Bean
public StepScope stepScope() {
stepScope.setAutoProxy(false);
return stepScope;
}
@Bean
public JobScope jobScope() {
jobScope.setAutoProxy(false);
return jobScope;
}
}
I found a solution where I was able to keep @EnableBatchProcessing, but I had to implement BatchConfigurer and the Atomikos beans myself; see my full answer in this SO answer.
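For reference, the BatchConfigurer route usually boils down to something like this sketch (bean names and the injected JtaTransactionManager are assumptions; extending DefaultBatchConfigurer is one common way to implement the interface):
@Configuration
@EnableBatchProcessing
public class JtaBatchConfig {

    @Bean
    public BatchConfigurer batchConfigurer(DataSource batchDataSource,
                                           JtaTransactionManager jtaTransactionManager) {
        return new DefaultBatchConfigurer(batchDataSource) {
            @Override
            public PlatformTransactionManager getTransactionManager() {
                // hand Spring Batch the JTA transaction manager instead of the
                // DataSourceTransactionManager it would otherwise create
                return jtaTransactionManager;
            }
        };
    }
}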
