Need to configure my JPA layer to use a TransactionManager (Spring Cloud Task + Batch register a PlatformTransactionManager unexpectedly) - spring-boot

I am using Spring Cloud Task + Batch in a project.
I plan to use different datasources for business data and for Spring's audit data on the task. So I configured something like:
@Bean
public TaskConfigurer taskConfigurer() {
    return new DefaultTaskConfigurer(this.singletonNotExposedSpringDatasource());
}

@Bean
public BatchConfigurer batchConfigurer() {
    return new DefaultBatchConfigurer(this.singletonNotExposedSpringDatasource());
}
whereas the main datasource is auto-configured through JpaBaseConfiguration.
The problem comes when SimpleBatchConfiguration + DefaultBatchConfigurer expose a PlatformTransactionManager bean, since JpaBaseConfiguration has a @ConditionalOnMissingBean on PlatformTransactionManager. Therefore Batch's PlatformTransactionManager, bound to spring.datasource, takes precedence.
So far, this seems to be caused by this bug.
So I tried to emulate what JpaBaseConfiguration does, defining my own PlatformTransactionManager over my business datasource/entityManager:
@Primary
@Bean
public PlatformTransactionManager appTransactionManager(final LocalContainerEntityManagerFactoryBean appEntityManager) {
    JpaTransactionManager transactionManager = new JpaTransactionManager();
    transactionManager.setEntityManagerFactory(appEntityManager.getObject());
    this.appTransactionManager = transactionManager;
    return transactionManager;
}
Note I had to give it a name other than transactionManager, otherwise Spring finds two beans and complains (regardless of @Primary!)
But now comes the funny part. When running the tests, everything runs smoothly: tests finish, the DDL is properly created for both the business and Batch/Task databases, and database reads work flawlessly. But business data is not persisted in my testing database, so the final assertThats fail when counting. If I @Autowire the PlatformTransactionManager or EntityManager in my test, everything indicates they are the proper ones. But if I debug within entityRepository.save and execute org.springframework.transaction.interceptor.TransactionAspectSupport.currentTransactionStatus(), it turns out the DataSourceTransactionManager from Batch's configuration is taking over, so my custom exposed PlatformTransactionManager is not being used.
So I guess the problem is not whether my PlatformTransactionManager is the primary one, but that something is configuring my JPA layer's TransactionInterceptor to use Batch's bean, which is not primary but is named transactionManager.
I also tried making my @Configuration class implement TransactionManagementConfigurer and overriding PlatformTransactionManager annotationDrivenTransactionManager(), but still no luck. Roughly like the sketch below.
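A reconstructed sketch of that attempt (the configuration class name is illustrative; the field is the one set by the appTransactionManager @Bean method above):

import org.springframework.context.annotation.Configuration;
import org.springframework.transaction.PlatformTransactionManager;
import org.springframework.transaction.annotation.EnableTransactionManagement;
import org.springframework.transaction.annotation.TransactionManagementConfigurer;

@Configuration
@EnableTransactionManagement
public class AppConfig implements TransactionManagementConfigurer {

    // set by the appTransactionManager @Bean method shown above
    private PlatformTransactionManager appTransactionManager;

    @Override
    public PlatformTransactionManager annotationDrivenTransactionManager() {
        return this.appTransactionManager;
    }
}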
Thus, I guess what I am asking is whether there is a way to configure the primary TransactionManager for the JPA layer.

The problem comes when SimpleBatchConfiguration + DefaultBatchConfigurer expose a PlatformTransactionManager bean,
As you mentioned, this is indeed what was reported in BATCH-2788. The solution we are exploring is to expose the transaction manager bean only if Spring Batch creates it.
In the meantime you can set the property spring.main.allow-bean-definition-overriding=true to allow bean definition overriding and set the transaction manager you want Spring Batch to use with BatchConfigurer#getTransactionManager. In your case, it would be something like:
@Bean
public BatchConfigurer batchConfigurer() {
    return new DefaultBatchConfigurer(this.singletonNotExposedSpringDatasource()) {
        @Override
        public PlatformTransactionManager getTransactionManager() {
            return new MyTransactionManager();
        }
    };
}
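For reference, bean definition overriding is disabled by default since Spring Boot 2.1, so the flag goes in application.properties (or the test equivalent): spring.main.allow-bean-definition-overriding=true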
Hope this helps.

Related

Spring Batch/Data JPA application not persisting/saving data to Postgres database when calling JPA repository (save, saveAll) methods

I am near wits' end. I have read/googled endlessly and tried the solutions from all the Google/Stack Overflow posts that have this similar issue (there are quite a few). Some seemed promising, but nothing has worked for me yet; though I have made some progress and I believe I am on the right track (at this point I believe it's something with the transaction manager and a possible conflict between Spring Batch and Spring Data JPA).
References:
Spring boot repository does not save to the DB if called from scheduled job
JpaItemWriter: no transaction is in progress
Similar to the aforementioned posts, I have a Spring Boot application that is using Spring Batch and Spring Data JPA. It reads comma-delimited data from a .csv file, then does some processing/transformation, and attempts to persist/save to the database using the JPA repository methods, specifically .saveAll() here (I also tried the .save() method and it did the same thing), since I'm saving a List<MyUserDefinedDataType> of a user-defined data type (batch insert).
Now, my code was working fine on Spring Boot starter 1.5.9.RELEASE, but I recently attempted to upgrade to 2.X.X and found, after countless hours of debugging, that only version 2.2.0.RELEASE would persist/save data to the database. So an upgrade to >= 2.2.1.RELEASE breaks persistence. Everything is read fine from the .csv; it's just that the first time the code flow hits a JPA repository method like .save() or .saveAll(), the application keeps running but nothing gets persisted. I also noticed the Hikari pool logs "active=1 idle=4", but when I looked at the same log on version 1.5.9.RELEASE, it says active=0 idle=5 immediately after persisting the data, so the application is definitely hanging. I went into the debugger and saw that after jumping into the repository calls, it goes into an almost infinite cycle through the Spring AOP libraries and such (all third party), and I don't believe it ever comes back to the real application/business logic that I wrote.
3c22fb53ed64 2021-05-20 23:53:43.909 DEBUG [HikariPool-1 housekeeper] com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Pool stats (total=5, active=1, idle=4, waiting=0)
Anyway, I tried the most common solutions that worked for other people, which were:
Defining a JpaTransactionManager @Bean and injecting it into the Step function, while keeping the JobRepository using the PlatformTransactionManager. This did not work. I then tried using the JpaTransactionManager in the JobRepository @Bean as well; this also did not work.
Defining a @RestController endpoint in my application to manually trigger this job, instead of doing it from my main Application.java class (I talk about this more below). Per one of the posts linked above, the data then persisted correctly to the database even on Spring >= 2.2.1, which makes me suspect even more that something with the Spring Batch persistence/entity/transaction managers is messed up.
The code is basically this:
BatchConfiguration.java
@Configuration
@EnableBatchProcessing
@Import({DatabaseConfiguration.class})
public class BatchConfiguration {

    // Datasource is a Postgres DB defined in a separate IntelliJ project that I add to my pom.xml
    DataSource dataSource;

    // jobBuilderFactory, stepBuilderFactory, maxThreads and chunkSize are
    // declared elsewhere in this class (omitted here, like a few other beans)

    @Autowired
    public BatchConfiguration(@Qualifier("dataSource") DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Bean
    @Primary
    public JpaTransactionManager jpaTransactionManager() {
        final JpaTransactionManager tm = new JpaTransactionManager();
        tm.setDataSource(dataSource);
        return tm;
    }

    @Bean
    public JobRepository jobRepository(PlatformTransactionManager transactionManager) throws Exception {
        JobRepositoryFactoryBean jobRepositoryFactoryBean = new JobRepositoryFactoryBean();
        jobRepositoryFactoryBean.setDataSource(dataSource);
        jobRepositoryFactoryBean.setTransactionManager(transactionManager);
        jobRepositoryFactoryBean.setDatabaseType("POSTGRES");
        return jobRepositoryFactoryBean.getObject();
    }

    @Bean
    public JobLauncher jobLauncher(JobRepository jobRepository) {
        SimpleJobLauncher simpleJobLauncher = new SimpleJobLauncher();
        simpleJobLauncher.setJobRepository(jobRepository);
        return simpleJobLauncher;
    }

    @Bean(name = "jobToLoadTheData")
    public Job jobToLoadTheData() {
        return jobBuilderFactory.get("jobToLoadTheData")
                .start(stepToLoadData())
                .listener(new CustomJobListener())
                .build();
    }

    @Bean
    @StepScope
    public TaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor threadPoolTaskExecutor = new ThreadPoolTaskExecutor();
        threadPoolTaskExecutor.setCorePoolSize(maxThreads);
        threadPoolTaskExecutor.setThreadGroupName("taskExecutor-batch");
        return threadPoolTaskExecutor;
    }

    @Bean(name = "stepToLoadData")
    public Step stepToLoadData() {
        TaskletStep step = stepBuilderFactory.get("stepToLoadData")
                .transactionManager(jpaTransactionManager())
                .<List<FieldSet>, List<myCustomPayloadRecord>>chunk(chunkSize)
                .reader(myCustomFileItemReader(OVERRIDDEN_BY_EXPRESSION))
                .processor(myCustomPayloadRecordItemProcessor())
                .writer(myCustomerWriter())
                .faultTolerant()
                .skipPolicy(new AlwaysSkipItemSkipPolicy())
                .skip(DataValidationException.class)
                .listener(new CustomReaderListener())
                .listener(new CustomProcessListener())
                .listener(new CustomWriteListener())
                .listener(new CustomSkipListener())
                .taskExecutor(taskExecutor())
                .throttleLimit(maxThreads)
                .build();
        step.registerStepExecutionListener(stepExecutionListener());
        step.registerChunkListener(new CustomChunkListener());
        return step;
    }
My main method:
Application.java
@Autowired
@Qualifier("jobToLoadTheData")
private Job loadTheData;

@Autowired
private JobLauncher jobLauncher;

@PostConstruct
public void launchJob() throws JobParametersInvalidException, JobExecutionAlreadyRunningException, JobRestartException, JobInstanceAlreadyCompleteException {
    JobParameters parameters = (new JobParametersBuilder()).addDate("random", new Date()).toJobParameters();
    jobLauncher.run(loadTheData, parameters);
}

public static void main(String[] args) {
    SpringApplication.run(Application.class, args);
}
Now, normally I'm reading this .csv from an Amazon S3 bucket, but since I'm testing locally, I am just placing the .csv in the project directory and reading it directly by triggering the job in the Application.java main class (as you can see above). Also, I do have some other beans defined in this BatchConfiguration class, but I don't want to over-complicate this post more than it already is, and from the googling I've done the problem is possibly in the methods I posted (hopefully).
Also, I would like to point out, similar to one of the other posts on Google/Stack Overflow with a user having a similar problem: I created a @RestController endpoint that simply calls the .run() method of the JobLauncher and passes in the jobToLoadTheData bean, and it triggers the batch insert. Guess what? Data persists to the database just fine, even on Spring >= 2.2.1.
What is going on here? Is this a clue? Is something funky going wrong with some type of entity or transaction manager? I'll take any advice/tips! I can provide any more information that you may need, so please just ask.
You are defining a bean of type JobRepository and expecting it to be picked up by Spring Batch. This is not correct. You need to provide a BatchConfigurer and override getJobRepository. This is explained in the reference documentation:
You can customize any of these beans by creating a custom implementation of the
BatchConfigurer interface. Typically, extending the DefaultBatchConfigurer
(which is provided if a BatchConfigurer is not found) and overriding the required
getter is sufficient.
This is also documented in the Javadoc of @EnableBatchProcessing. So in your case, you need to define a bean of type BatchConfigurer and override getJobRepository and getTransactionManager, something like:
@Bean
public BatchConfigurer batchConfigurer(EntityManagerFactory entityManagerFactory, DataSource dataSource) {
    return new DefaultBatchConfigurer(dataSource) {
        @Override
        public PlatformTransactionManager getTransactionManager() {
            return new JpaTransactionManager(entityManagerFactory);
        }

        @Override
        public JobRepository getJobRepository() {
            try {
                JobRepositoryFactoryBean jobRepositoryFactoryBean = new JobRepositoryFactoryBean();
                jobRepositoryFactoryBean.setDataSource(dataSource);
                jobRepositoryFactoryBean.setTransactionManager(getTransactionManager());
                // set other properties, then initialize the factory bean
                jobRepositoryFactoryBean.afterPropertiesSet();
                return jobRepositoryFactoryBean.getObject();
            } catch (Exception e) {
                throw new IllegalStateException("Unable to create JobRepository", e);
            }
        }
    };
}
In a Spring Boot context, you could also override the createTransactionManager and createJobRepository methods of org.springframework.boot.autoconfigure.batch.JpaBatchConfigurer if needed.

Why add @Bean to a method I'll call directly

I have seen lots of examples of Spring configuration through the @Configuration and @Bean annotations. But I realized that it's a common practice to add the @Bean annotation to methods that are called directly to populate other beans. For example:
@Bean
public Properties hibernateProperties() {
    Properties hibernateProp = new Properties();
    hibernateProp.put("hibernate.dialect", "org.hibernate.dialect.H2Dialect");
    hibernateProp.put("hibernate.hbm2ddl.auto", "create-drop");
    hibernateProp.put("hibernate.format_sql", true);
    hibernateProp.put("hibernate.use_sql_comments", true);
    hibernateProp.put("hibernate.show_sql", true);
    return hibernateProp;
}

@Bean
public SessionFactory sessionFactory() {
    return new LocalSessionFactoryBuilder(dataSource())
            .scanPackages("com.ps.ents")
            .addProperties(hibernateProperties())
            .buildSessionFactory();
}
So, I'm wondering if it's better to just declare hibernateProperties() as private, without the @Bean annotation.
I would like to know whether this is a bad/unneeded common practice or there is a reason behind it.
Thanks in advance!
According to the Spring documentation, inter-bean dependency injection is a good approach for defining bean dependencies in a simple form. Of course, if you define your hibernateProperties() as private it will work, but it cannot be injected into other components in your application through the Spring container.
Decide based on how many classes depend on your bean and on whether you need to reuse it, either to call its methods or to inject it into other classes.
Decorating a method in a @Configuration class with @Bean means that the return value of that method will become a Spring bean.
By default those beans are singletons (only one instance for the lifespan of the application).
In your example Spring knows that hibernateProperties() is a singleton bean and will create it only once. So here:
@Bean
public SessionFactory sessionFactory() {
    return new LocalSessionFactoryBuilder(dataSource())
            .scanPackages("com.ps.ents")
            .addProperties(hibernateProperties())
            .buildSessionFactory();
}
the hibernateProperties() method will not be executed again; the bean will be taken from the application context. If you don't annotate hibernateProperties() with @Bean, it is a plain Java method and will be executed every time it's called. It depends on what you want.
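The mechanism behind this is that Spring enhances @Configuration classes with CGLIB and intercepts calls between @Bean methods. A minimal sketch to verify the singleton behavior (AppConfig is a hypothetical configuration class containing the two methods above):

import java.util.Properties;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;

public class ConfigProxyDemo {
    public static void main(String[] args) {
        try (AnnotationConfigApplicationContext ctx =
                 new AnnotationConfigApplicationContext(AppConfig.class)) {
            Properties fromContext = ctx.getBean("hibernateProperties", Properties.class);
            AppConfig config = ctx.getBean(AppConfig.class);
            // prints true: the direct call is intercepted by the CGLIB-enhanced
            // configuration class and returns the cached singleton
            System.out.println(fromContext == config.hibernateProperties());
        }
    }
}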
Just to mention, the other way to do dependency injection in @Configuration classes is to add a parameter, like this:
@Bean
public SessionFactory sessionFactory(Properties props) {
    return new LocalSessionFactoryBuilder(dataSource())
            .scanPackages("com.ps.ents")
            .addProperties(props)
            .buildSessionFactory();
}
When Spring tries to create the sessionFactory() bean it will first look in the application context for a bean of type Properties.

Javers with multiple EntityManagerFactory

We are just getting started with JaVers in a Spring Boot application.
This application has two EntityManagerFactory beans:
@Primary
@Bean(name = "entityManagerFactory")
LocalContainerEntityManagerFactoryBean entityManagerFactory(DataSource dataSource, Environment env) {
And
#Bean(name = "secondaryEntityManagerFactory")
For auditing, we are just concerned with the @Primary entity manager factory. When we start up the application, it fails on initialization because of multiple entity manager factory beans.
Is there a way to tell JaVers to be concerned only with the @Primary factory?
Thanks!
Dave
The JaVers Spring Boot starter uses the EntityManagerFactory only in the startup phase, to determine the SQL dialect. At runtime, SQL connections are obtained from the ConnectionProvider bean:
@Bean
@ConditionalOnMissingBean
public ConnectionProvider jpaConnectionProvider() {
    return new JpaHibernateConnectionProvider();
}
You can override this bean with an implementation that provides connections to the secondary database.
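For illustration, a minimal, hedged sketch of such an override, assuming JaVers' org.javers.repository.sql.ConnectionProvider interface (a single getConnection() method) and a hypothetical auditDataSource qualifier; DataSourceUtils keeps the returned connection participating in Spring-managed transactions:

import javax.sql.DataSource;
import org.javers.repository.sql.ConnectionProvider;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.jdbc.datasource.DataSourceUtils;

@Bean
public ConnectionProvider jpaConnectionProvider(@Qualifier("auditDataSource") DataSource dataSource) {
    // Hand JaVers a connection bound to the current Spring transaction,
    // so audit writes commit/roll back together with business writes.
    return () -> DataSourceUtils.getConnection(dataSource);
}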

Spring + multiple H2 instances in memory

I want two different H2 instances to be created in-memory. To verify this happens, I initialized both instances with the same schema but different data, so that when I query through a DAO, a different set of data is picked from each DataSource. However, this is not happening. What am I doing wrong? How do I name the H2 instances correctly?
#Bean(name = "DS1")
#Primary
public EmbeddedDatabase dataSource1() {
return new EmbeddedDatabaseBuilder().
setType(EmbeddedDatabaseType.H2).
setName("DB1").
addScript("schema.sql").
addScript("data-1.sql").
build();
}
#Bean(name = "DS2")
public EmbeddedDatabase dataSource2() {
return new EmbeddedDatabaseBuilder().
setType(EmbeddedDatabaseType.H2).
setName("DB2").
addScript("schema.sql").
addScript("data-2.sql").
build();
}
You have created two DataSources and have marked one as @Primary -- this is the one that will be used when your EntityManagerFactories and repositories are auto-configured. That's why both DAOs are accessing the same database.
In order to get around this, you need to declare two separate EntityManagerFactories, as described in the Spring Boot documentation:
http://docs.spring.io/spring-boot/docs/current-SNAPSHOT/reference/htmlsingle/#howto-use-two-entity-managers
After that, you'll need to declare two separate repositories and tell each repository which EntityManagerFactory to use. To do this, you'll have to specify the correct EntityManagerFactory in your @EnableJpaRepositories annotation. This article describes very nicely how to do that:
http://scattercode.co.uk/2013/11/18/spring-data-multiple-databases/
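For a rough idea of the shape, here is a hedged sketch of the two-EntityManagerFactory pattern those links describe, wired to the DS1/DS2 beans above; the package and bean names are illustrative assumptions:

import javax.persistence.EntityManagerFactory;
import javax.sql.DataSource;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.jpa.repository.config.EnableJpaRepositories;
import org.springframework.orm.jpa.JpaTransactionManager;
import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean;
import org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter;
import org.springframework.transaction.PlatformTransactionManager;

@Configuration
@EnableJpaRepositories(
        basePackages = "com.example.repo.db1",           // repositories for DB1
        entityManagerFactoryRef = "emf1",
        transactionManagerRef = "tm1")
public class Db1JpaConfig {

    @Bean
    public LocalContainerEntityManagerFactoryBean emf1(@Qualifier("DS1") DataSource ds) {
        LocalContainerEntityManagerFactoryBean emf = new LocalContainerEntityManagerFactoryBean();
        emf.setDataSource(ds);
        emf.setPackagesToScan("com.example.domain.db1"); // entities for DB1
        emf.setJpaVendorAdapter(new HibernateJpaVendorAdapter());
        return emf;
    }

    @Bean
    public PlatformTransactionManager tm1(@Qualifier("emf1") EntityManagerFactory emf) {
        return new JpaTransactionManager(emf);
    }
}
// A second @Configuration class repeats this for DS2 with emf2/tm2 and its own packages.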
It would be nice if Spring Boot supported autoconfiguration with two DataSources, but I don't think it's going to happen soon:
https://github.com/spring-projects/spring-boot/issues/808
UPDATE
The author of the above article has published an updated approach:
https://scattercode.co.uk/2016/01/05/multiple-databases-with-spring-boot-and-spring-data-jpa/
The issue was not with multiple instances of H2, but with DataSource injection.
I solved it by passing the @Qualifier in the method argument.
@Autowired
@Bean(name = "jdbcTemplate")
public JdbcTemplate getJdbcTemplate(@Qualifier("myDataSource") DataSource dataSource) {
    return new JdbcTemplate(dataSource);
}

Is there a way to define a default transaction manager in Spring

I have an existing application that uses the Hibernate SessionFactory for one database. We are adding another database for doing analytics. The transactions will never cross so I don't need JTA, but I do want to use JPA EntityManager for the new database.
I've set up the EntityManager and the new transaction manager, which I've qualified, but Spring complains that I need to qualify my existing @Transactional annotations. I'm trying to find a way to tell Spring to use the txManager bean as the default. Is there any way of doing this? Otherwise I'll have to add the qualifier to all the existing @Transactional annotations, which I would like to avoid if possible.
#Bean(name = "jpaTx")
public PlatformTransactionManager transactionManagerJPA() throws NamingException {
JpaTransactionManager txManager = new JpaTransactionManager(entityManagerFactory());
return txManager;
}
#Bean
public PlatformTransactionManager txManager() throws Exception {
HibernateTransactionManager txManager = new HibernateTransactionManager(sessionFactory());
txManager.setNestedTransactionAllowed(true);
return txManager;
}
Error I'm getting
No qualifying bean of type [org.springframework.transaction.PlatformTransactionManager] is defined: expected single matching bean but found 2:
Thanks
I was able to solve this using the @Primary annotation:
#Bean(name = "jpaTx")
public PlatformTransactionManager transactionManagerJPA() throws NamingException {
JpaTransactionManager txManager = new JpaTransactionManager(entityManagerFactory());
return txManager;
}
#Bean
#Primary
public PlatformTransactionManager txManager() throws Exception {
HibernateTransactionManager txManager = new HibernateTransactionManager(sessionFactory());
txManager.setNestedTransactionAllowed(true);
return txManager;
}
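With txManager marked @Primary, unqualified @Transactional methods now resolve to it, while the JPA manager can still be selected by name (a hypothetical service for illustration):

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class ReportService {

    @Transactional // resolves to the @Primary HibernateTransactionManager
    public void updateLegacyData() {
        // ... Hibernate SessionFactory work
    }

    @Transactional("jpaTx") // explicitly selects the JPA transaction manager
    public void updateAnalyticsData() {
        // ... JPA EntityManager work
    }
}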
Since the same type of bean is produced by two methods, you must qualify the @Transactional annotation with the bean name. An easy workaround to suit your need would be to use two different Spring application contexts: one operating with the old data source and one with the new. Each context would then have only one method producing a PlatformTransactionManager instance.
