Spring Batch 4.0 and Spring Boot 1.5.9 - spring-boot

I am getting up to speed with Spring Batch and Spring Boot and want to use only annotations. I got the demo application https://spring.io/guides/gs/batch-processing/ working, which uses HSQLDB, and all worked. After that I opted to switch to Oracle. I want to use the repository database tables as our older jobs have done. When I switch to Oracle 11g, the inserts into the Spring Batch tables complain about SERIALIZABLE. After spending a good part of yesterday chasing this down, I understand that I probably need to set an isolation level on the JobRepository. The problem is that no site goes into this, and the docs showing XML/Java do not explain it clearly either. I checked out the examples from the Spring Batch GitHub repository and cannot find anything there either. Is there an example somewhere of how to create a job repository and change its isolation level? I am burning hours at this time with no forward motion.
I tried the code below to set the isolation level and get: BatchConfigurationException: java.lang.IllegalArgumentException: Invalid transaction attribute token: [READ_COMMITTED]
public class BatchConfiguration extends DefaultBatchConfigurer {
    @Autowired
    public JobBuilderFactory jobBuilderFactory;
    @Autowired
    private PlatformTransactionManager transactionManager;
    @Autowired
    public StepBuilderFactory stepBuilderFactory;
    @Autowired
    private DataSource dataSource;
    @Override
    protected JobRepository createJobRepository() throws Exception {
        JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
        factory.setDataSource(dataSource);
        factory.setTransactionManager(transactionManager);
        factory.setIsolationLevelForCreate("READ_COMMITTED");
        // factory.setTablePrefix("BATCH_");
        factory.setMaxVarCharLength(1000);
        return factory.getObject();
    }
    @Bean
    public JdbcCursorItemReader<BusnPrtnr> itemReader() {
        return new JdbcCursorItemReaderBuilder<BusnPrtnr>()
            .dataSource(dataSource)
            ...

There is an error in the Java code example here: https://docs.spring.io/spring-batch/4.0.x/reference/html/job.html#txConfigForJobRepository
It should be factory.setIsolationLevelForCreate("ISOLATION_REPEATABLE_READ"); instead of factory.setIsolationLevelForCreate("REPEATABLE_READ");.
The XML code example is correct (the XML parser adds the ISOLATION_ prefix behind the scenes).
There is a pull request to fix this issue here: https://github.com/spring-projects/spring-batch/pull/577/files#diff-de2cb44d4395c5ad35d1fc05bbace8f1R620
The fix will be part of the 4.0.1 release.
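To make the prefix rule concrete, here is a minimal plain-Java sketch, not Spring code. It assumes a hypothetical helper class, IsolationTokens, whose normalize method adds the ISOLATION_ prefix when it is missing, mimicking what the answer says the XML parser does. The lookup table maps the prefixed names to the JDBC isolation constants in java.sql.Connection; the class, its methods, and the error text it reuses from the question are all for illustration only. Following the answer, the call from the question would presumably become factory.setIsolationLevelForCreate("ISOLATION_READ_COMMITTED").

```java
import java.sql.Connection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of Spring Batch): illustrates why the
// "ISOLATION_" prefix matters when an isolation level is passed as a string.
public class IsolationTokens {

    private static final String PREFIX = "ISOLATION_";

    // Prefixed token names mapped to the standard JDBC isolation constants.
    private static final Map<String, Integer> LEVELS = new HashMap<>();
    static {
        LEVELS.put("ISOLATION_READ_UNCOMMITTED", Connection.TRANSACTION_READ_UNCOMMITTED);
        LEVELS.put("ISOLATION_READ_COMMITTED", Connection.TRANSACTION_READ_COMMITTED);
        LEVELS.put("ISOLATION_REPEATABLE_READ", Connection.TRANSACTION_REPEATABLE_READ);
        LEVELS.put("ISOLATION_SERIALIZABLE", Connection.TRANSACTION_SERIALIZABLE);
    }

    // Adds the prefix if the caller left it off.
    public static String normalize(String token) {
        String trimmed = token.trim();
        return trimmed.startsWith(PREFIX) ? trimmed : PREFIX + trimmed;
    }

    // Strict lookup: a token without the prefix is rejected.
    public static int toJdbcLevel(String token) {
        Integer level = LEVELS.get(token);
        if (level == null) {
            throw new IllegalArgumentException(
                    "Invalid transaction attribute token: [" + token + "]");
        }
        return level;
    }

    public static void main(String[] args) {
        String raw = "READ_COMMITTED";
        String fixed = normalize(raw);
        System.out.println(fixed + " maps to JDBC level " + toJdbcLevel(fixed));
        try {
            toJdbcLevel(raw); // the unprefixed name fails the strict lookup
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The strict lookup is kept separate from normalize so that the failure mode from the question (an unprefixed token) stays visible next to the fix.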

Related

Spring Batch/Data JPA application not persisting/saving data to Postgres database when calling JPA repository (save, saveAll) methods

I am near my wits' end. I have read and googled endlessly and tried the solutions from all the Google/Stack Overflow posts that describe this similar issue (there are quite a few). Some seemed promising, but nothing has worked for me yet, though I have made some progress and believe I am on the right track (at this point I believe it has something to do with the transaction manager and a possible conflict between Spring Batch and Spring Data JPA).
References:
Spring boot repository does not save to the DB if called from scheduled job
JpaItemWriter: no transaction is in progress
Similar to the aforementioned posts, I have a Spring Boot application that is using Spring Batch and Spring Data JPA. It reads comma delimited data from a .csv file, then does some processing/transformation, and attempts to persist/save to database using the JPA Repository methods, specifically here .saveAll() (I also tried .save() method and this did the same thing), since I'm saving a List<MyUserDefinedDataType> of a user-defined data type (batch insert).
Now, my code was working fine on Spring Boot starter 1.5.9.RELEASE, but I recently attempted to upgrade to 2.X.X and found, after countless hours of debugging, that only version 2.2.0.RELEASE would persist/save data to the database. So an upgrade to >= 2.2.1.RELEASE breaks persistence. Everything is read fine from the .csv; it's just that the first time the code flow hits a JPA repository method like .save() or .saveAll(), the application keeps running but nothing gets persisted. I also noticed the Hikari pool logs "active=1 idle=4", whereas the same log on version 1.5.9.RELEASE says active=0 idle=5 immediately after persisting the data, so the application is definitely hanging. I went into the debugger and saw that, after jumping into the repository calls, execution goes into an almost infinite cycle through the Spring AOP libraries and such (all third party), and I don't believe it ever comes back to the real application/business logic that I wrote.
3c22fb53ed64 2021-05-20 23:53:43.909 DEBUG [HikariPool-1 housekeeper] com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Pool stats (total=5, active=1, idle=4, waiting=0)
Anyway, I tried the most common solutions that worked for other people which were:
Defining a JpaTransactionManager @Bean and injecting it into the Step function, while keeping the JobRepository on the PlatformTransactionManager. This did not work. Then I also tried using the JpaTransactionManager in the JobRepository @Bean; this did not work either.
Defining a @RestController endpoint in my application to manually trigger this Job, instead of launching it from my main Application.java class. (I talk about this more below.) As in one of the posts I referenced above, the data persisted correctly to the database even on Spring Boot >= 2.2.1, which makes me further suspect that something with the Spring Batch persistence/entity/transaction managers is messed up.
The code is basically this:
BatchConfiguration.java
@Configuration
@EnableBatchProcessing
@Import({DatabaseConfiguration.class})
public class BatchConfiguration {
    // Datasource is a Postgres DB defined in separate IntelliJ project that I add to my pom.xml
    DataSource dataSource;
    @Autowired
    public BatchConfiguration(@Qualifier("dataSource") DataSource dataSource) {
        this.dataSource = dataSource;
    }
    @Bean
    @Primary
    public JpaTransactionManager jpaTransactionManager() {
        final JpaTransactionManager tm = new JpaTransactionManager();
        tm.setDataSource(dataSource);
        return tm;
    }
    @Bean
    public JobRepository jobRepository(PlatformTransactionManager transactionManager) throws Exception {
        JobRepositoryFactoryBean jobRepositoryFactoryBean = new JobRepositoryFactoryBean();
        jobRepositoryFactoryBean.setDataSource(dataSource);
        jobRepositoryFactoryBean.setTransactionManager(transactionManager);
        jobRepositoryFactoryBean.setDatabaseType("POSTGRES");
        return jobRepositoryFactoryBean.getObject();
    }
    @Bean
    public JobLauncher jobLauncher(JobRepository jobRepository) {
        SimpleJobLauncher simpleJobLauncher = new SimpleJobLauncher();
        simpleJobLauncher.setJobRepository(jobRepository);
        return simpleJobLauncher;
    }
    @Bean(name = "jobToLoadTheData")
    public Job jobToLoadTheData() {
        return jobBuilderFactory.get("jobToLoadTheData")
            .start(stepToLoadData())
            .listener(new CustomJobListener())
            .build();
    }
    @Bean
    @StepScope
    public TaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor threadPoolTaskExecutor = new ThreadPoolTaskExecutor();
        threadPoolTaskExecutor.setCorePoolSize(maxThreads);
        threadPoolTaskExecutor.setThreadGroupName("taskExecutor-batch");
        return threadPoolTaskExecutor;
    }
    @Bean(name = "stepToLoadData")
    public Step stepToLoadData() {
        TaskletStep step = stepBuilderFactory.get("stepToLoadData")
            .transactionManager(jpaTransactionManager())
            .<List<FieldSet>, List<myCustomPayloadRecord>>chunk(chunkSize)
            .reader(myCustomFileItemReader(OVERRIDDEN_BY_EXPRESSION))
            .processor(myCustomPayloadRecordItemProcessor())
            .writer(myCustomerWriter())
            .faultTolerant()
            .skipPolicy(new AlwaysSkipItemSkipPolicy())
            .skip(DataValidationException.class)
            .listener(new CustomReaderListener())
            .listener(new CustomProcessListener())
            .listener(new CustomWriteListener())
            .listener(new CustomSkipListener())
            .taskExecutor(taskExecutor())
            .throttleLimit(maxThreads)
            .build();
        step.registerStepExecutionListener(stepExecutionListener());
        step.registerChunkListener(new CustomChunkListener());
        return step;
    }
My main method:
Application.java
@Autowired
@Qualifier("jobToLoadTheData")
private Job loadTheData;
@Autowired
private JobLauncher jobLauncher;
@PostConstruct
public void launchJob () throws JobParametersInvalidException, JobExecutionAlreadyRunningException, JobRestartException, JobInstanceAlreadyCompleteException
{
    JobParameters parameters = (new JobParametersBuilder()).addDate("random", new Date()).toJobParameters();
    jobLauncher.run(loadTheData, parameters);
}
public static void main(String[] args) {
    SpringApplication.run(Application.class, args);
}
Now, normally I'm reading this .csv from an Amazon S3 bucket, but since I'm testing locally, I am just placing the .csv in the project directory and reading it directly by triggering the job in the Application.java main class (as you can see above). Also, I do have some other beans defined in this BatchConfiguration class, but I don't want to over-complicate this post more than it already is, and from the googling I've done, the problem is possibly with the methods I posted (hopefully).
Also, I would like to point out that, similar to one of the other posts on Google/Stack Overflow from a user with a similar problem, I created a @RestController endpoint that simply calls the .run() method of the JobLauncher, passing in the jobToLoadTheData bean, and it triggers the batch insert. Guess what? Data persists to the database just fine, even on Spring Boot >= 2.2.1.
What is going on here? Is this a clue? Is something going wrong with some type of entity or transaction manager? I'll take any advice or tips! I can provide any more information that you may need, so please just ask.
You are defining a bean of type JobRepository and expecting it to be picked up by Spring Batch. This is not correct. You need to provide a BatchConfigurer and override getJobRepository. This is explained in the reference documentation:
You can customize any of these beans by creating a custom implementation of the
BatchConfigurer interface. Typically, extending the DefaultBatchConfigurer
(which is provided if a BatchConfigurer is not found) and overriding the required
getter is sufficient.
This is also documented in the Javadoc of @EnableBatchProcessing. So in your case, you need to define a bean of type BatchConfigurer and override getJobRepository and getTransactionManager, something like:
@Bean
public BatchConfigurer batchConfigurer(EntityManagerFactory entityManagerFactory, DataSource dataSource) {
    return new DefaultBatchConfigurer(dataSource) {
        @Override
        public PlatformTransactionManager getTransactionManager() {
            return new JpaTransactionManager(entityManagerFactory);
        }
        @Override
        public JobRepository getJobRepository() {
            JobRepositoryFactoryBean jobRepositoryFactoryBean = new JobRepositoryFactoryBean();
            jobRepositoryFactoryBean.setDataSource(dataSource);
            jobRepositoryFactoryBean.setTransactionManager(getTransactionManager());
            // set other properties
            try {
                return jobRepositoryFactoryBean.getObject();
            } catch (Exception e) {
                // getObject() declares a checked Exception; rethrow it unchecked
                throw new IllegalStateException("Could not create the JobRepository", e);
            }
        }
    };
}
In a Spring Boot context, you could also override the createTransactionManager and createJobRepository methods of org.springframework.boot.autoconfigure.batch.JpaBatchConfigurer if needed.

Spring Boot Transaction support using @Transactional annotation not working with mongoDB, anyone have solution for this?

Spring Boot version - 2.4.4,
mongodb version - 4.4.4
In my project, I want to insert entries into 2 different MongoDB documents, and if one insert fails, the other should be rolled back. MongoDB has supported transactions since version 4.0, but only if you have at least one replica set.
In my case I don't have a replica set and cannot create one because of my project structure. Without a replica set I can't use MongoDB's transaction support, so I am using Spring transactions.
According to the Spring docs, to use transactions in Spring Boot you only need to use the @Transactional annotation and everything will work (i.e. rollback or commit).
I tried many things from many sources, but the transaction is not rolled back if one insert fails.
Demo code is here,
This is demo code, not actual project.
This is my service class.
@Service
public class UserService {
    @Autowired
    UserRepository userRepository;
    @Autowired
    UserDetailRepository userDetailRepository;
    @Transactional(rollbackFor = Exception.class)
    public ResponseEntity<JsonNode> createUser(SaveUserDetailRequest saveUserDetailRequest) {
        try {
            User _user = userRepository.save(new User(saveUserDetailRequest.getId(), saveUserDetailRequest.getFirstName(), saveUserDetailRequest.getLastName()));
            UserDetail _user_detail = userDetailRepository.save(new UserDetail(saveUserDetailRequest.getPhone(), saveUserDetailRequest.getAddress()));
        } catch (Exception m) {
            System.out.print("Mongo Exception");
        }
        return new ResponseEntity<>(HttpStatus.OK);
    }
}
I also tried the code below, but it still does not work:
@EnableTransactionManagement
@Configuration
@EnableMongoRepositories({ "com.test.transaction.repository" })
@ComponentScan({"com.test.transaction.service"})
public class Config extends AbstractMongoClientConfiguration{
    private com.mongodb.MongoClient mongoClient;
    @Bean
    MongoTransactionManager transactionManager(MongoDbFactory dbFactory) {
        return new MongoTransactionManager(dbFactory);
    }
    @Bean
    public com.mongodb.MongoClient mongodbClient() {
        mongoClient = new com.mongodb.MongoClient("mongodb://localhost:27017");
        return mongoClient;
    }
    @Override
    protected String getDatabaseName() {
        return "test";
    }
}
The transaction support in Spring is only there to make things easier; it doesn't replace the transaction support of the underlying datastore being used.
In this case, it will simply delegate the starting/committing of a transaction to MongoDB. When using a relational database, it will likewise eventually delegate to that database, and so on.
As this is the case, the prerequisites for MongoDB transactions still need to be honoured, and you will still need a replica set.
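The delegation described above can be sketched in plain Java, without Spring or the MongoDB driver. Everything here is hypothetical: DataStore, ReplicaSetStore, StandaloneStore, and DelegatingTransactionManager are invented names, and the exception is only a stand-in for whatever error a real standalone server would report. The point is that the manager has no transactional machinery of its own, so it can offer no more than the store beneath it.

```java
import java.util.ArrayList;
import java.util.List;

public class DelegationSketch {

    // Hypothetical stand-in for an underlying datastore.
    public interface DataStore {
        void begin();
        void write(String doc);
        void commit();
        void rollback();
        List<String> committed();
    }

    // A store with transaction support: writes are staged until commit.
    public static class ReplicaSetStore implements DataStore {
        private final List<String> committed = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();

        public void begin() { pending.clear(); }
        public void write(String doc) { pending.add(doc); }
        public void commit() { committed.addAll(pending); pending.clear(); }
        public void rollback() { pending.clear(); }
        public List<String> committed() { return committed; }
    }

    // A store without transaction support: it refuses to begin one.
    public static class StandaloneStore extends ReplicaSetStore {
        @Override
        public void begin() {
            throw new UnsupportedOperationException("transactions need a replica set");
        }
    }

    // The "manager" only forwards begin/commit/rollback to the store.
    public static class DelegatingTransactionManager {
        private final DataStore store;

        public DelegatingTransactionManager(DataStore store) {
            this.store = store;
        }

        public void inTransaction(Runnable work) {
            store.begin(); // fails here if the store cannot do transactions
            try {
                work.run();
                store.commit();
            } catch (RuntimeException e) {
                store.rollback(); // discard staged writes, then propagate
                throw e;
            }
        }
    }

    public static void main(String[] args) {
        ReplicaSetStore replicaSet = new ReplicaSetStore();
        DelegatingTransactionManager manager = new DelegatingTransactionManager(replicaSet);
        try {
            manager.inTransaction(() -> {
                replicaSet.write("user");
                throw new IllegalStateException("second insert failed");
            });
        } catch (IllegalStateException e) {
            System.out.println("rolled back, committed = " + replicaSet.committed());
        }

        try {
            new DelegatingTransactionManager(new StandaloneStore()).inTransaction(() -> { });
        } catch (UnsupportedOperationException e) {
            System.out.println("standalone store: " + e.getMessage());
        }
    }
}
```

Note that in this sketch the rollback only happens because the exception escapes the work passed to inTransaction; work that swallows its own exception would be committed.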

Need to configure my JPA layer to use a TransactionManager (Spring Cloud Task + Batch register a PlatformTransactionManager unexpectedly)

I am using Spring Cloud Task + Batch in a project.
I plan to use different datasources for business data and Spring audit data on the task. So I configured something like:
@Bean
public TaskConfigurer taskConfigurer() {
    return new DefaultTaskConfigurer(this.singletonNotExposedSpringDatasource());
}
@Bean
public BatchConfigurer batchConfigurer() {
    return new DefaultBatchConfigurer(this.singletonNotExposedSpringDatasource());
}
whereas main datasource is autoconfigured through JpaBaseConfiguration.
The problem comes when SimpleBatchConfiguration+DefaultBatchConfigurer expose a PlatformTransactionManager bean, since JpaBaseConfiguration has a @ConditionalOnMissingBean on PlatformTransactionManager. Therefore Batch's PlatformTransactionManager, bound to the spring.datasource, takes precedence.
So far, this seems to be caused by this bug
So I tried to emulate what JpaBaseConfiguration does, defining my own PlatformTransactionManager over my biz datasource/entityManager.
@Primary
@Bean
public PlatformTransactionManager appTransactionManager(final LocalContainerEntityManagerFactoryBean appEntityManager) {
    JpaTransactionManager transactionManager = new JpaTransactionManager();
    transactionManager.setEntityManagerFactory(appEntityManager.getObject());
    this.appTransactionManager = transactionManager;
    return transactionManager;
}
Note that I have to define it with a name other than transactionManager, otherwise Spring finds 2 beans and complains (regardless of @Primary!).
But now comes the funny part. When running the tests, everything runs smoothly: tests finish, DDLs are properly created for both the business and the Batch/Task databases, and database reads work flawlessly. However, business data is not persisted in my testing database, so the final assertThats fail when counting. If I inject (@Autowired) PlatformTransactionManager or EntityManager in my test, everything indicates they are the proper ones. But if I debug within entityRepository.save and execute org.springframework.transaction.interceptor.TransactionAspectSupport.currentTransactionStatus(), it seems the DataSourceTransactionManager from Batch's configuration is taking over, so my custom exposed PlatformTransactionManager is not being used.
So I guess the problem is not whether my PlatformTransactionManager is the primary one, but that something is configuring my JPA layer's TransactionInterceptor to use Batch's bean, which is not primary but is named transactionManager.
I also tried making my @Configuration implement TransactionManagementConfigurer and override PlatformTransactionManager annotationDrivenTransactionManager(), but still no luck.
Thus, I guess what I am asking is whether there is a way to configure the primary TransactionManager for the JPA Layer.
The problem comes when SimpleBatchConfiguration+DefaultBatchConfigurer expose a PlatformTransactionManager bean,
As you mentioned, this is indeed what was reported in BATCH-2788. The solution we are exploring is to expose the transaction manager bean only if Spring Batch creates it.
In the meantime you can set the property spring.main.allow-bean-definition-overriding=true to allow bean definition overriding and set the transaction manager you want Spring Batch to use with BatchConfigurer#getTransactionManager. In your case, it would be something like:
@Bean
public BatchConfigurer batchConfigurer() {
    return new DefaultBatchConfigurer(this.singletonNotExposedSpringDatasource()) {
        @Override
        public PlatformTransactionManager getTransactionManager() {
            return new MyTransactionManager();
        }
    };
}
Hope this helps.

How to Replace RemoteServer() when upgrading to Spring Data 4.2+?

In upgrading to Neo 3.2.3 (from Neo 2.5), I've had to upgrade my Spring Data dependency. The main reason for me upgrading is to take advantage of Neo's new Bolt protocol. I bumped the versions (using maven pom.xml), and I'm having issues with one change in particular -- how to set up the scaffolding for Sessions and the RemoteServer configuration.
org.springframework.data.neo4j.server.RemoteServer has been removed from the SD4N api, breaking my code and I'm not sure how to get things to compile again. I've tried a number of sources online, with little success. Here's what I've read:
Neo4j 3.0 and spring data
https://docs.spring.io/spring-data/neo4j/docs/current/reference/html/#_spring_configuration
https://graphaware.com/neo4j/2016/09/30/upgrading-to-sdn-42.html
None of these resources quite explain how to refactor the Spring Configuration (and its clients) to use whatever thing replaces the RemoteServer Object.
How do I connect to my Neo database with Spring Data Neo4j, given a URL, username, and password? Bonus points for explaining how these relate to Sessions and SessionFactory instances.
The configuration should look like this:
@Configuration
@EnableNeo4jRepositories(basePackageClasses = UserRepository.class)
@ComponentScan(basePackageClasses = UserService.class)
static class Config {
    @Bean
    public SessionFactory getSessionFactory() {
        return new SessionFactory(configuration(), User.class.getPackage().getName());
    }
    @Bean
    public Neo4jTransactionManager transactionManager() throws Exception {
        return new Neo4jTransactionManager(getSessionFactory());
    }
    @Bean
    public org.neo4j.ogm.config.Configuration configuration() {
        return new org.neo4j.ogm.config.Configuration.Builder()
            .uri("bolt://localhost")
            .credentials("username", "password")
            .build();
    }
}
SessionFactory and Session are described here
Please comment about what's unclear in the docs.

Using Quartz with Spring Boot - injection order changes based upon return type of method

I am trying to get Quartz working with Spring Boot, and am not managing to get the injection working correctly. I am basing myself on the example shown here
Here is my boot class:
@ComponentScan
@EnableAutoConfiguration
public class MyApp {
    @Autowired
    private DataSource dataSource;
    @Bean
    public JobFactory jobFactory() {
        return new SpringBeanJobFactory();
    }
    @Bean
    public SchedulerFactoryBean quartz() {
        final SchedulerFactoryBean bean = new SchedulerFactoryBean();
        bean.setJobFactory(jobFactory());
        bean.setDataSource(dataSource);
        bean.setConfigLocation(new ClassPathResource("quartz.properties"));
        ...
        return bean;
    }
    public static void main(String[] args) {
        SpringApplication.run(MyApp.class, args);
    }
}
When the quartz() method is invoked by Spring, dataSource is null. However, if I change the return type of the quartz() method to Object, dataSource is correctly injected with the datasource created by reading application.properties, the bean is built, everything works and I get a subsequent error saying that Quartz has been unable to retrieve any jobs from the database, which is normal as I haven't put the schema in place yet.
I have tried adding a @DependsOn("dataSource") annotation on the quartz() method but that doesn't make any difference.
This class is the only class annotated with @Configuration.
Here are my dependencies (I'm using Maven but present them like this for space reasons):
org.springframework.boot:spring-boot-starter-actuator:1.0.0.RC4
org.springframework.boot:spring-boot-starter-jdbc:1.0.0.RC4
org.springframework.boot:spring-boot-starter-web:1.0.0.RC4
org.quartz-scheduler:quartz:2.2.1
org.springframework:spring-support:2.0.8
And the parent:
org.springframework.boot:spring-boot-starter-parent:1.0.0.RC4
Finally the content of quartz.properties:
org.quartz.threadPool.threadCount = 3
org.quartz.jobStore.class=org.springframework.scheduling.quartz.LocalDataSourceJobStore
org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
What am I doing wrong?
(I have seen this question, but that question initialises the datasource in the @Configuration class)
Your app starts up (with a schema error, which is expected) if I use "org.springframework:spring-context-support:4.0.2.RELEASE" ("org.springframework:spring-support:2.0.8" if it ever existed must be nearly 10 years old now and certainly isn't compatible with Boot or Quartz 2).
