Define an in-memory JobRepository - spring-boot

I'm testing Spring Batch using Spring Boot. My need is to define jobs working on an Oracle database, but I don't want to save job and step states inside this DB.
I've read in the documentation that I can use an in-memory repository with the MapJobRepositoryFactoryBean.
Then, I've implemented this bean:
@Bean
public JobRepository jobRepository() {
    MapJobRepositoryFactoryBean factoryBean = new MapJobRepositoryFactoryBean(new ResourcelessTransactionManager());
    try {
        JobRepository jobRepository = factoryBean.getObject();
        return jobRepository;
    } catch (Exception e) {
        e.printStackTrace();
        return null;
    }
}
But when my job starts, the first thing Spring Batch does is create its tables in the Oracle DB, and it keeps using the Oracle datasource. It's as if my JobRepository definition isn't taken into account.
What did I miss?
EDIT: I'm using Spring Boot 1.5.3 and Spring Batch 3.0.7

With Spring Boot 2.x, the solution is simpler.
You have to extend the DefaultBatchConfigurer class like this:
@Component
public class NoPersistenceBatchConfigurer extends DefaultBatchConfigurer {

    @Override
    public void setDataSource(DataSource dataSource) {
    }
}
Without a datasource, the framework automatically switches to the map-based MapJobRepository.

A few things here:
If you have a DataSource configured in your ApplicationContext, by default Spring Batch will try to use it.
In order to not use a DataSource when one is available within the ApplicationContext, you'll need to create your own BatchConfigurer. You can do that by extending the DefaultBatchConfigurer.
Don't use the MapJobRepository except for testing purposes. It has a number of issues (thread safety, etc.) and is not recommended for production use. Use an in-memory database like HSQLDB instead (you'll still need to create your own BatchConfigurer to do so; a sketch follows below).
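For example, such a BatchConfigurer could look roughly like this (an untested sketch with a made-up class name; it assumes HSQLDB and spring-jdbc's embedded database support are on the classpath, and relies on the schema script that ships inside spring-batch-core). The @Autowired on the override keeps Spring calling the setter, but the embedded database is always used for the metadata:

import javax.sql.DataSource;
import org.springframework.batch.core.configuration.annotation.DefaultBatchConfigurer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseBuilder;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseType;
import org.springframework.stereotype.Component;

@Component
public class HsqlMetadataBatchConfigurer extends DefaultBatchConfigurer {

    // Embedded HSQLDB instance holding only the Spring Batch metadata tables.
    private final DataSource batchMetadataDataSource = new EmbeddedDatabaseBuilder()
            .setType(EmbeddedDatabaseType.HSQL)
            .addScript("classpath:org/springframework/batch/core/schema-hsqldb.sql")
            .build();

    @Override
    @Autowired(required = false)
    public void setDataSource(DataSource dataSource) {
        // Ignore the injected business DataSource (e.g. Oracle); metadata goes to HSQLDB.
        super.setDataSource(batchMetadataDataSource);
    }
}

Depending on your Boot version, you may also need spring.batch.initializer.enabled=false (Boot 1.x) or spring.batch.initialize-schema=never (Boot 2.x) so Boot doesn't try to create the metadata tables in your business database.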

Thanks to pvpkiran's comment, I've found my problem: it's necessary to define a JobLauncher bean.
Below is an example:
@Bean
public JobRepository jobRepository() {
    MapJobRepositoryFactoryBean factoryBean = new MapJobRepositoryFactoryBean(new ResourcelessTransactionManager());
    try {
        JobRepository jobRepository = factoryBean.getObject();
        return jobRepository;
    } catch (Exception e) {
        e.printStackTrace();
        return null;
    }
}

@Bean
public JobLauncher jobLauncher(JobRepository jobRepository) {
    SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
    jobLauncher.setJobRepository(jobRepository);
    return jobLauncher;
}

If using Spring Boot and @EnableBatchProcessing, you would extend the DefaultBatchConfigurer and override the createJobRepository method. Create a ResourcelessTransactionManager and a JobRepository using MapJobRepositoryFactoryBean; the rest of the beans will be auto-created by Spring Boot.
@Configuration
public class InMemoryBatchContextConfigurer extends DefaultBatchConfigurer {

    @Bean
    public ResourcelessTransactionManager resourcelessTransactionManager() {
        return new ResourcelessTransactionManager();
    }

    @Override
    protected JobRepository createJobRepository() throws Exception {
        MapJobRepositoryFactoryBean factoryBean = new MapJobRepositoryFactoryBean();
        factoryBean.setTransactionManager(resourcelessTransactionManager());
        return factoryBean.getObject();
    }
}

Extend the DefaultBatchConfigurer class and override the createJobRepository method, just like below.
@Configuration
public class InMemoryBatchConfigurer extends DefaultBatchConfigurer {

    @Override
    protected JobRepository createJobRepository() throws Exception {
        return new MapJobRepositoryFactoryBean().getObject();
    }
}

Related

Spring Disable @Transactional from Configuration Java file

I have a code base which is used by two different applications. Some of my Spring service classes have the @Transactional annotation. On server start, I would like to disable @Transactional based on some configuration.
Below is my configuration class.
@Configuration
@EnableTransactionManagement
@PropertySource("classpath:application.properties")
public class WebAppConfig {

    private static final String PROPERTY_NAME_DATABASE_DRIVER = "db.driver";

    @Resource
    private Environment env;

    @Bean
    public DataSource dataSource() {
        DriverManagerDataSource dataSource = new DriverManagerDataSource();
        dataSource.setDriverClassName(env.getRequiredProperty(PROPERTY_NAME_DATABASE_DRIVER));
        dataSource.setUrl(url);
        dataSource.setUsername(userId);
        dataSource.setPassword(password);
        return dataSource;
    }

    @Bean
    public PlatformTransactionManager txManager() {
        DefaultTransactionDefinition def = new DefaultTransactionDefinition();
        def.setIsolationLevel(TransactionDefinition.ISOLATION_DEFAULT);
        if (appName.equals("ABC")) {
            def.setPropagationBehavior(TransactionDefinition.PROPAGATION_NEVER);
        } else {
            def.setPropagationBehavior(TransactionDefinition.PROPAGATION_REQUIRED);
        }
        CustomDataSourceTransactionManager txM = new CustomDataSourceTransactionManager(def);
        txM.setDataSource(dataSource());
        return txM;
    }

    @Bean
    public JdbcTemplate jdbcTemplate() {
        JdbcTemplate jdbcTemplate = new JdbcTemplate();
        jdbcTemplate.setDataSource(dataSource());
        return jdbcTemplate;
    }
}
I am trying to override methods in DataSourceTransactionManager to achieve this, but it still tries to commit/roll back the transaction at the end of the transaction. Since there is no database connection available, it throws an exception.
If I keep @Transactional(propagation=Propagation.NEVER), everything works perfectly, but I cannot modify it because another app is using the same code base and the annotation is necessary in that case.
I would like to know if there is a way to fully disable transactions from configuration, without modifying the @Transactional annotation.
I'm not sure if it would work, but you can try to implement a custom TransactionInterceptor and override the method that wraps the invocation in a transaction, removing the transactional logic. Something like this:
public class NoOpTransactionInterceptor extends TransactionInterceptor {

    @Override
    protected Object invokeWithinTransaction(
            Method method,
            Class<?> targetClass,
            InvocationCallback invocation
    ) throws Throwable {
        // Simply invoke the original unwrapped code
        return invocation.proceedWithInvocation();
    }
}
Then you declare a conditional bean in one of the @Configuration classes:
// assuming this property is stored in the Spring application properties file
@ConditionalOnProperty(name = "turnOffTransactions", havingValue = "true")
@Bean
@Role(BeanDefinition.ROLE_INFRASTRUCTURE)
public TransactionInterceptor transactionInterceptor(
        /* the default bean would be injected here */
        TransactionAttributeSource transactionAttributeSource
) {
    TransactionInterceptor interceptor = new NoOpTransactionInterceptor();
    interceptor.setTransactionAttributeSource(transactionAttributeSource);
    return interceptor;
}
You'll probably need some additional configuration; I can't verify that right now.

Issues with Spring Batch

Hi, I have been working with Spring Batch recently and need some help.
1) I want to run my job using multiple threads, so I have used a TaskExecutor as below:
@Bean
public TaskExecutor taskExecutor() {
    SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
    taskExecutor.setConcurrencyLimit(4);
    return taskExecutor;
}

@Bean
public Step myStep() {
    return stepBuilderFactory.get("myStep")
            .<MyEntity, AnotherEntity>chunk(1)
            .reader(reader())
            .processor(processor())
            .writer(writer())
            .taskExecutor(taskExecutor())
            .throttleLimit(4)
            .build();
}
But while executing, I can see the below line in the console:
o.s.b.c.l.support.SimpleJobLauncher : No TaskExecutor has been set, defaulting to synchronous executor.
What does this mean? However, while debugging I can see four SimpleAsyncTaskExecutor threads running. Can someone shed some light on this?
2) I don't want to run my batch application with the metadata tables that Spring Batch creates. I have tried adding spring.batch.initialize-schema=never, but it didn't work. I also saw a way to do this using ResourcelessTransactionManager and MapJobRepositoryFactoryBean, but I have to make some database transactions in my job. So will it be all right if I use this?
Also I was able to do this by extending DefaultBatchConfigurer and overriding:
@Override
public void setDataSource(DataSource dataSource) {
    // override so as not to set the datasource even if one exists;
    // initialize will then use a Map-based JobRepository (instead of the database)
}
Please guide me further. Thanks.
Update:
Here is my full configuration class:
@EnableBatchProcessing
@EnableScheduling
@Configuration
public class MyBatchConfiguration {

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Autowired
    public DataSource dataSource;

    /* @Override
    public void setDataSource(DataSource dataSource) {
        // override so as not to set the datasource even if one exists;
        // initialize will then use a Map-based JobRepository (instead of the database)
    } */

    @Bean
    public Step myStep() {
        return stepBuilderFactory.get("myStep")
                .<MyEntity, AnotherEntity>chunk(1)
                .reader(reader())
                .processor(processor())
                .writer(writer())
                .taskExecutor(taskExecutor())
                .throttleLimit(4)
                .build();
    }

    @Bean
    public Job myJob() {
        return jobBuilderFactory.get("myJob")
                .incrementer(new RunIdIncrementer())
                .listener(myJobListener())
                .flow(myStep())
                .end()
                .build();
    }

    @Bean
    public MyJobListener myJobListener() {
        return new MyJobListener();
    }

    @Bean
    public ItemReader<MyEntity> reader() {
        return new MyReader();
    }

    @Bean
    public ItemWriter<? super AnotherEntity> writer() {
        return new MyWriter();
    }

    @Bean
    public ItemProcessor<MyEntity, AnotherEntity> processor() {
        return new MyProcessor();
    }

    @Bean
    public TaskExecutor taskExecutor() {
        SimpleAsyncTaskExecutor taskExecutor = new SimpleAsyncTaskExecutor();
        taskExecutor.setConcurrencyLimit(4);
        return taskExecutor;
    }
}
In the future, please break this up into two independent questions. That being said, let me shed some light on both questions.
SimpleJobLauncher : No TaskExecutor has been set, defaulting to synchronous executor.
Your configuration is configuring myStep to use your TaskExecutor. That causes Spring Batch to execute each chunk in its own thread (based on the parameters of the TaskExecutor). The log message you are seeing has nothing to do with that behavior. It has to do with launching your job. By default, the SimpleJobLauncher will launch the job on the same thread it is running on, thereby blocking that thread. You can inject a TaskExecutor into the SimpleJobLauncher, which will cause the job to be executed on a different thread from the JobLauncher itself. These are two separate uses of multiple threads by the framework.
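For illustration, an untested sketch of that second use (assuming you extend DefaultBatchConfigurer and your Spring Batch version exposes the protected createJobLauncher() hook):

@Override
protected JobLauncher createJobLauncher() throws Exception {
    SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
    jobLauncher.setJobRepository(getJobRepository());
    // launch jobs on their own threads so the calling thread is not blocked
    jobLauncher.setTaskExecutor(new SimpleAsyncTaskExecutor());
    jobLauncher.afterPropertiesSet();
    return jobLauncher;
}

With a TaskExecutor set on the launcher, the synchronous-executor message goes away and the launch call returns immediately while the job runs in the background.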
I don't want to run my Batch application with the metadata tables that spring batch creates
The short answer here is to just use an in-memory database like HSQLDB or H2 for your metadata tables. This provides a production-grade data store (so that concurrency is handled correctly) without actually persisting the data. If you use the ResourcelessTransactionManager, you are effectively turning transactions off (a bad idea if you're using a database in any capacity) because that TransactionManager doesn't actually do anything (it's a no-op implementation).

MyBatis+Spring MapperScan with Multiple Data Sources

I am pulling data from two different databases using MyBatis 3.3.1 and Spring 4.3. The two configuration classes to scan for mappers look as follows:
@Configuration
@MapperScan(value = "com.mapper1.map",
        SqlSessionFactoryRef = "sqlSessionFactory1")
public class AppConfig {

    @Bean
    public DataSource getDataSource1() {
        BasicDataSource dataSource = new BasicDataSource();
        dataSource.setDriverClassName("com.mysql.jdbc.Driver");
        dataSource.setUrl("jdbc:mysql://localhost:3306/database1");
        dataSource.setUsername("user");
        dataSource.setPassword("pw");
        return dataSource;
    }

    @Bean
    public DataSourceTransactionManager transactionManager1() {
        return new DataSourceTransactionManager(getDataSource1());
    }

    @Bean
    public SqlSessionFactory sqlSessionFactory1() throws Exception {
        SqlSessionFactoryBean sessionFactory = new SqlSessionFactoryBean();
        sessionFactory.setDataSource(getDataSource1());
        return sessionFactory.getObject();
    }
}
@Configuration
@MapperScan(value = "com.mapper2.map",
        SqlSessionFactoryRef = "sqlSessionFactory2")
public class AppConfig {

    @Bean
    public DataSource getDataSource2() {
        BasicDataSource dataSource = new BasicDataSource();
        dataSource.setDriverClassName("com.mysql.jdbc.Driver");
        dataSource.setUrl("jdbc:mysql://localhost:3307/database2");
        dataSource.setUsername("user");
        dataSource.setPassword("pw");
        return dataSource;
    }

    @Bean
    public DataSourceTransactionManager transactionManager2() {
        return new DataSourceTransactionManager(getDataSource2());
    }

    @Bean
    public SqlSessionFactory sqlSessionFactory2() throws Exception {
        SqlSessionFactoryBean sessionFactory = new SqlSessionFactoryBean();
        sessionFactory.setDataSource(getDataSource2());
        return sessionFactory.getObject();
    }
}
The code deploys fine, but only the mappers from data source 1 work. When I try to use a mapper from data source 2, I get a "No table found" exception from my database. The problem is that although I am setting the specific SqlSessionFactory that I want to use in the @MapperScan, it ends up using the other SqlSessionFactory for all the mappers. If I comment out the SqlSessionFactory in configuration 1, then configuration 2 will work.
Note that if I don't use MapperScan, but instead use a MapperScannerConfigurer bean, I am able to correctly retrieve data.
Has anyone else had problems using #MapperScan with multiple data sources?
The only issue I see in your code is that SqlSessionFactoryRef should start with a lowercase letter: sqlSessionFactoryRef. Apart from that everything is fine; this approach works for me.
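In other words, only the attribute name changes; a sketch (the second class is renamed here for clarity, and the bean methods stay exactly as in the question):

@Configuration
@MapperScan(value = "com.mapper1.map", sqlSessionFactoryRef = "sqlSessionFactory1")
public class AppConfig1 {
    // getDataSource1(), transactionManager1(), sqlSessionFactory1() exactly as in the question
}

@Configuration
@MapperScan(value = "com.mapper2.map", sqlSessionFactoryRef = "sqlSessionFactory2")
public class AppConfig2 {
    // getDataSource2(), transactionManager2(), sqlSessionFactory2() exactly as in the question
}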
You can also look at ace-mybatis. It allows working with multiple datasources while configuring only one bean.

Spring Batch Jpa Repository Save not committing the data

I am using Spring Batch and JPA to process a batch job and perform updates. I am using the default repository implementations.
I am using repository.save to save the modified object in the processor.
Also, I don't have any @Transactional annotation specified in the processor or writer.
I don't see any exceptions throughout. The selects happen fine.
Is there any setting like "setAutoCommit(true)" that I should be using for JPA to save the data in the DB?
Here is my step, reader, and writer config. Also, my config class is annotated with @EnableBatchProcessing:
@EnableBatchProcessing
public class UpgradeBatchConfiguration extends DefaultBatchConfigurer {

    @Autowired
    private PlatformTransactionManager transactionManager;

    @Override
    protected JobRepository createJobRepository() throws Exception {
        JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
        factory.setDataSource(getdataSource());
        factory.setTransactionManager(transactionManager);
        factory.setTablePrefix("CFTES_OWNER.BATCH_");
        factory.afterPropertiesSet();
        return factory.getObject();
    }

    @Bean(name = "updateFilenetJobStep")
    public Step jobStep(StepBuilderFactory stepBuilderFactory,
            @Qualifier("updateFileNetReader") RepositoryItemReader reader,
            @Qualifier("updateFileNetWriter") ItemWriter writer,
            @Qualifier("updateFileNetProcessor") ItemProcessor processor) {
        return stepBuilderFactory.get("jobStep").allowStartIfComplete(true).chunk(1).reader(reader).processor(processor)
                .writer(writer).transactionManager(transactionManager).build();
    }

    @Bean(name = "updateFileNetWriter")
    public ItemWriter getItemWriter() {
        return new BatchItemWriter();
    }

    @Bean(name = "updateFileNetReader")
    public RepositoryItemReader<Page<TermsAndConditionsErrorEntity>> getItemReader(
            TermsAndConditionsErrorRepository repository) {
        RepositoryItemReader<Page<TermsAndConditionsErrorEntity>> reader = new RepositoryItemReader<Page<TermsAndConditionsErrorEntity>>();
        reader.setRepository(repository);
        reader.setMethodName("findAll");
        HashMap<String, Direction> map = new HashMap<String, Direction>();
        map.put("transactionId", Direction.ASC);
        reader.setSort(map);
        return reader;
    }
}
And in the writer, this is what I am using to save:
repository.save(entity);
I was able to solve this by injecting a JpaTransactionManager throughout (the job repository, job, step, etc.), instead of the autowired PlatformTransactionManager.
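A rough sketch of that idea (the bean name is illustrative, not the poster's exact code):

@Bean
public PlatformTransactionManager jpaTransactionManager(EntityManagerFactory entityManagerFactory) {
    // A JpaTransactionManager makes repository.save(...) participate in the chunk transaction,
    // so the pending changes are flushed and committed at the end of each chunk.
    return new JpaTransactionManager(entityManagerFactory);
}

The same bean is then passed to JobRepositoryFactoryBean.setTransactionManager(...) and to the step builder's transactionManager(...) call shown above.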

spring-boot-starter-jta-atomikos and spring-boot-starter-batch

Is it possible to use both these starters in a single application?
I want to load records from a CSV file into a database table. The Spring Batch tables are stored in a different database, so I assume I need to use JTA to handle the transaction.
Whenever I add @EnableBatchProcessing to my @Configuration class it configures a PlatformTransactionManager, which stops this being auto-configured by Atomikos.
Are there any Spring Boot + Batch + JTA samples out there that show how to do this?
Many Thanks,
James
I just went through this and I found something that seems to work. As you note, @EnableBatchProcessing causes a DataSourceTransactionManager to be created, which messes up everything. I'm using modular=true in @EnableBatchProcessing, so the ModularBatchConfiguration class is activated.
What I did was to stop using @EnableBatchProcessing and instead copy the entire ModularBatchConfiguration class into my project. Then I commented out the transactionManager() method, since the Atomikos configuration creates the JtaTransactionManager. I also had to override the jobRepository() method, because that was hardcoded to use the DataSourceTransactionManager created inside DefaultBatchConfiguration.
I also had to explicitly import the JtaAutoConfiguration class. This wires everything up correctly (according to the Actuator's "beans" endpoint - thank god for that). But when you run it, the transaction manager throws an exception because something somewhere sets an explicit transaction isolation level. So I also wrote a BeanPostProcessor to find the transaction manager and call txnMgr.setAllowCustomIsolationLevels(true);
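That post-processor could look roughly like this (a sketch, not the exact code from the project; it just flips the flag on any JtaTransactionManager bean it sees and would be registered as a regular, preferably static, @Bean):

import org.springframework.beans.factory.config.BeanPostProcessor;
import org.springframework.transaction.jta.JtaTransactionManager;

public class AllowCustomIsolationLevelsPostProcessor implements BeanPostProcessor {

    @Override
    public Object postProcessBeforeInitialization(Object bean, String beanName) {
        return bean;
    }

    @Override
    public Object postProcessAfterInitialization(Object bean, String beanName) {
        if (bean instanceof JtaTransactionManager) {
            // Spring Batch asks for an explicit isolation level on the JobRepository transaction,
            // which JtaTransactionManager rejects unless this flag is switched on.
            ((JtaTransactionManager) bean).setAllowCustomIsolationLevels(true);
        }
        return bean;
    }
}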
Now everything works, but while the job is running, I cannot fetch the current data from batch_step_execution table using JdbcTemplate, even though I can see the data in SQLYog. This must have something to do with transaction isolation, but I haven't been able to understand it yet.
Here is what I have for my configuration class, copied from Spring and modified as noted above. PS, I have my DataSource that points to the database with the batch tables annotated as @Primary. Also, I changed my DataSource beans to be instances of org.apache.tomcat.jdbc.pool.XADataSource; I'm not sure if that's necessary.
@Configuration
@Import(ScopeConfiguration.class)
public class ModularJtaBatchConfiguration implements ImportAware
{
    @Autowired(required = false)
    private Collection<DataSource> dataSources;

    private BatchConfigurer configurer;

    @Autowired
    private ApplicationContext context;

    @Autowired(required = false)
    private Collection<BatchConfigurer> configurers;

    private AutomaticJobRegistrar registrar = new AutomaticJobRegistrar();

    @Bean
    public JobRepository jobRepository(DataSource batchDataSource, JtaTransactionManager jtaTransactionManager) throws Exception
    {
        JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
        factory.setDataSource(batchDataSource);
        factory.setTransactionManager(jtaTransactionManager);
        factory.afterPropertiesSet();
        return factory.getObject();
    }

    @Bean
    public JobLauncher jobLauncher() throws Exception {
        return getConfigurer(configurers).getJobLauncher();
    }

    // @Bean
    // public PlatformTransactionManager transactionManager() throws Exception {
    //     return getConfigurer(configurers).getTransactionManager();
    // }

    @Bean
    public JobExplorer jobExplorer() throws Exception {
        return getConfigurer(configurers).getJobExplorer();
    }

    @Bean
    public AutomaticJobRegistrar jobRegistrar() throws Exception {
        registrar.setJobLoader(new DefaultJobLoader(jobRegistry()));
        for (ApplicationContextFactory factory : context.getBeansOfType(ApplicationContextFactory.class).values()) {
            registrar.addApplicationContextFactory(factory);
        }
        return registrar;
    }

    @Bean
    public JobBuilderFactory jobBuilders(JobRepository jobRepository) throws Exception {
        return new JobBuilderFactory(jobRepository);
    }

    @Bean
    // hopefully this will autowire the Atomikos JTA txn manager
    public StepBuilderFactory stepBuilders(JobRepository jobRepository, JtaTransactionManager ptm) throws Exception {
        return new StepBuilderFactory(jobRepository, ptm);
    }

    @Bean
    public JobRegistry jobRegistry() throws Exception {
        return new MapJobRegistry();
    }

    @Override
    public void setImportMetadata(AnnotationMetadata importMetadata) {
        AnnotationAttributes enabled = AnnotationAttributes.fromMap(importMetadata.getAnnotationAttributes(
                EnableBatchProcessing.class.getName(), false));
        Assert.notNull(enabled,
                "@EnableBatchProcessing is not present on importing class " + importMetadata.getClassName());
    }

    protected BatchConfigurer getConfigurer(Collection<BatchConfigurer> configurers) throws Exception {
        if (this.configurer != null) {
            return this.configurer;
        }
        if (configurers == null || configurers.isEmpty()) {
            if (dataSources == null || dataSources.isEmpty()) {
                throw new UnsupportedOperationException("You are screwed");
            } else if (dataSources != null && dataSources.size() == 1) {
                DataSource dataSource = dataSources.iterator().next();
                DefaultBatchConfigurer configurer = new DefaultBatchConfigurer(dataSource);
                configurer.initialize();
                this.configurer = configurer;
                return configurer;
            } else {
                throw new IllegalStateException("To use the default BatchConfigurer the context must contain no more than " +
                        "one DataSource, found " + dataSources.size());
            }
        }
        if (configurers.size() > 1) {
            throw new IllegalStateException(
                    "To use a custom BatchConfigurer the context must contain precisely one, found "
                            + configurers.size());
        }
        this.configurer = configurers.iterator().next();
        return this.configurer;
    }
}

@Configuration
class ScopeConfiguration {

    private StepScope stepScope = new StepScope();

    private JobScope jobScope = new JobScope();

    @Bean
    public StepScope stepScope() {
        stepScope.setAutoProxy(false);
        return stepScope;
    }

    @Bean
    public JobScope jobScope() {
        jobScope.setAutoProxy(false);
        return jobScope;
    }
}
I found a solution where I was able to keep @EnableBatchProcessing but had to implement BatchConfigurer and the Atomikos beans myself; see my full answer in this SO answer.
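For completeness, a BatchConfigurer along those lines might look roughly like this (an untested sketch, not the linked answer's exact code: it assumes the batch-metadata DataSource is the @Primary one, as above, and that Atomikos' JtaTransactionManager is already available as a bean):

import javax.sql.DataSource;
import org.springframework.batch.core.configuration.annotation.BatchConfigurer;
import org.springframework.batch.core.explore.JobExplorer;
import org.springframework.batch.core.explore.support.JobExplorerFactoryBean;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.support.SimpleJobLauncher;
import org.springframework.batch.core.repository.JobRepository;
import org.springframework.batch.core.repository.support.JobRepositoryFactoryBean;
import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.jta.JtaTransactionManager;

@Configuration
public class JtaBatchConfigurer implements BatchConfigurer {

    private final DataSource batchDataSource;
    private final JtaTransactionManager jtaTransactionManager;
    private JobRepository jobRepository;

    public JtaBatchConfigurer(DataSource batchDataSource, JtaTransactionManager jtaTransactionManager) {
        this.batchDataSource = batchDataSource;
        this.jtaTransactionManager = jtaTransactionManager;
    }

    @Override
    public JobRepository getJobRepository() throws Exception {
        if (jobRepository == null) {
            // store the batch metadata in the batch DataSource, under the JTA transaction manager
            JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
            factory.setDataSource(batchDataSource);
            factory.setTransactionManager(jtaTransactionManager);
            factory.afterPropertiesSet();
            jobRepository = factory.getObject();
        }
        return jobRepository;
    }

    @Override
    public PlatformTransactionManager getTransactionManager() {
        return jtaTransactionManager;
    }

    @Override
    public JobLauncher getJobLauncher() throws Exception {
        SimpleJobLauncher launcher = new SimpleJobLauncher();
        launcher.setJobRepository(getJobRepository());
        launcher.afterPropertiesSet();
        return launcher;
    }

    @Override
    public JobExplorer getJobExplorer() throws Exception {
        JobExplorerFactoryBean factory = new JobExplorerFactoryBean();
        factory.setDataSource(batchDataSource);
        factory.afterPropertiesSet();
        return factory.getObject();
    }
}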
