The Hibernate session (EntityManager) scope in Spring Batch? - spring

As I'm new to Spring and Spring Batch, I have a general question about Spring Batch and JPA with Hibernate as the provider.
When is the Hibernate session (wrapped by the EntityManager) flushed? Between the reader, processor and writer, or at each commit interval? Can we control it?

When is the Hibernate session (wrapped by the EntityManager) flushed? Between the reader, processor and writer, or at each commit interval?
The writer flushes the session once after writing each chunk of items, that is, once per commit interval. For more details, take a look at:
HibernateItemWriter: https://github.com/spring-projects/spring-batch/blob/master/spring-batch-infrastructure/src/main/java/org/springframework/batch/item/database/HibernateItemWriter.java#L95
JpaItemWriter: https://github.com/spring-projects/spring-batch/blob/master/spring-batch-infrastructure/src/main/java/org/springframework/batch/item/database/JpaItemWriter.java#L84
Can we control it?
If you use the HibernateItemWriter, you can use the clearSession flag to choose whether the session is cleared after each chunk.

To the best of my knowledge, the session is flushed when the Spring transaction is committed, which happens after each chunk.
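The chunk behaviour described in the answers above can be sketched in plain Java, without Spring Batch on the classpath. This is only a simulation under stated assumptions: FakeSession, run and writeChunk are hypothetical names, not framework API, and the "process" step is a placeholder. It shows that items are read and processed one at a time, but saved and flushed together once per commit interval.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation of chunk-oriented processing. Nothing here is Spring Batch API.
class ChunkFlushSketch {

    // Hypothetical stand-in for a Hibernate Session / EntityManager.
    static class FakeSession {
        final List<String> pending = new ArrayList<>();   // saved but not yet flushed
        final List<String> database = new ArrayList<>();  // already flushed
        int flushCount = 0;

        void save(String item) {
            pending.add(item);
        }

        void flush() {
            database.addAll(pending);
            pending.clear();
            flushCount++;
        }
    }

    // Reads and processes items one by one, but writes and flushes once per chunk.
    static FakeSession run(List<String> input, int commitInterval) {
        FakeSession session = new FakeSession();
        List<String> chunk = new ArrayList<>();
        for (String item : input) {
            chunk.add(item.toUpperCase());      // placeholder "process" step, no flush here
            if (chunk.size() == commitInterval) {
                writeChunk(session, chunk);
            }
        }
        if (!chunk.isEmpty()) {
            writeChunk(session, chunk);         // last, possibly smaller, chunk
        }
        return session;
    }

    // Saves every item of the chunk, then flushes exactly once.
    static void writeChunk(FakeSession session, List<String> chunk) {
        for (String item : chunk) {
            session.save(item);
        }
        session.flush();
        chunk.clear();
    }

    public static void main(String[] args) {
        FakeSession session = run(List.of("a", "b", "c", "d", "e", "f", "g"), 3);
        System.out.println("flushes = " + session.flushCount);
        System.out.println("rows    = " + session.database);
    }
}
```

With seven items and a commit interval of 3, the sketch forms chunks of 3, 3 and 1 items, so flush() runs three times.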

Related

How to turn off JPA for SpringBatch under SpringBoot

We have a Spring Boot application that uses Spring Integration and Spring Batch. We drop a file in the poller and it is processed. This process inserts records into a database, then reads them back out, does some processing, and writes a file. Let's say there are 10 records. The first time, we get 10 records read and 10 written. Without stopping the server, we delete all the records through a SQL client on the database and run the same file again, and we get 10 records read with 20 written. I believe there is some JPA caching going on with the datasource. We've tried turning off several auto-configuration options for JPA and caching, but we haven't found the right configuration option to turn off caching.
Adding a bit more detail to the question.
Basically we have a cron scheduler that has a FileHandler. In its handleFile method we have the following.
    public File handleFile(File file) throws Throwable {
        JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
        Job job = (Job) appContext.getBean("processInitialFileJob");
        JobExecution jb = jobLauncher.run(job, jobParametersBuilder.toJobParameters());
        ....
    }
What can we do to the code above to ensure that it gets a new JPA session, or does not use the JPA session at all? This job needs to read from the database each time and not from a cached representation of the database.
Are you using Hibernate? The Hibernate first-level cache may be causing the problem. Hibernate maintains a first-level cache that is local to the Session, so entities loaded or saved through a session are kept in that session. Changes made to the table outside Hibernate are not visible through entities already held in the session until those entities are refreshed or the session is cleared or closed.
To rule this out, create a new Session (or EntityManager in the case of JPA) inside your poller logic and close it after every read/process/write cycle.
Also make sure hibernate.current_session_context_class is not set to thread. Because the poller can reuse a thread, the same Hibernate Session may be injected again.
This ended up not being an issue with Hibernate or JPA, but with a StringBuilder holding on to data from previous runs. I believe the bean that holds it needs to be set up as @JobScope so that it is not reused across different executions of the job.
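The root cause described above can be reproduced in plain Java. The sketch below is an assumption-laden illustration, not the poster's code: ReusedWriter and FreshWriter are hypothetical classes. The first keeps one StringBuilder for its whole lifetime, as a singleton bean would; the second creates a new buffer per run, which is roughly the effect a job-scoped bean would have.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of a buffer that leaks state between job runs.
class SharedBufferSketch {

    // Hypothetical writer created once and reused, like a singleton bean.
    static class ReusedWriter {
        private final StringBuilder buffer = new StringBuilder();  // survives across runs

        int write(List<String> records) {
            for (String record : records) {
                buffer.append(record).append('\n');
            }
            return countLines(buffer);  // lines that would end up in the output file
        }
    }

    // Hypothetical writer that builds a fresh buffer for every run.
    static class FreshWriter {
        int write(List<String> records) {
            StringBuilder buffer = new StringBuilder();
            for (String record : records) {
                buffer.append(record).append('\n');
            }
            return countLines(buffer);
        }
    }

    static int countLines(CharSequence text) {
        int lines = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                lines++;
            }
        }
        return lines;
    }

    static List<String> tenRecords() {
        List<String> records = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            records.add("record-" + i);
        }
        return records;
    }

    public static void main(String[] args) {
        ReusedWriter reused = new ReusedWriter();
        System.out.println("reused, run 1: " + reused.write(tenRecords()));
        System.out.println("reused, run 2: " + reused.write(tenRecords()));

        FreshWriter fresh = new FreshWriter();
        System.out.println("fresh,  run 1: " + fresh.write(tenRecords()));
        System.out.println("fresh,  run 2: " + fresh.write(tenRecords()));
    }
}
```

The reused writer reports 10 lines on the first run and 20 on the second, matching the "10 read, 20 written" symptom from the question, while the fresh writer reports 10 both times.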

What is the difference between getSession().save() vs. getHibernateTemplate().save()?

I am using DAO classes that subclass HibernateDaoSupport.
I have seen examples that call
getSession().save(instance)
as well as
getHibernateTemplate().save(instance)
What is the difference between these two?
getSession() opens a new session,
whereas the HibernateTemplate makes a best effort to find an existing session/transaction.
HibernateTemplate is a more effective way to work with the database connection.
HibernateTemplate is a helper class provided by Spring's Hibernate support that makes it convenient to obtain the Session and handle transactions: you do not need to commit the transaction manually, whereas with getSession() you have to manage transactions yourself.

How to handle transaction involving Spring message/JMS and Database

I have a method that gets an invoice, creates XML from it, sends that XML to a JMS queue, and then saves the invoice to the DB with an updated status such as 'invoiced'. Below is pseudocode that involves Spring and Hibernate. My questions: does a failure in the Hibernate save roll back the JMS send? If the JMS send fails, how can I roll back saving the invoice status? Does this come under distributed transaction management? What transactional cases are involved here? Thanks.
    @Transactional(propagation = Propagation.REQUIRED)
    void processInvoices(Invoice invoice) {
        String xml = createXML(invoice);
        messageService.sendInvoice(xml);
        invoice.setStatus("invoiced");
        save(invoice);
    }
As far as I understand your question, you want to synchronize the Hibernate and JMS transactions. To do this you will need to use JTA to manage transactions across both Hibernate and JMS.
Read more: Spring synchronising Hibernate and JMS transactions

XA transactions and message bus

In our new project we would like to achieve transactions that involve JPA (MySQL) and a message bus (RabbitMQ).
We started building our infrastructure with Spring Data using MySQL and RabbitMQ (via the Spring AMQP module). Since RabbitMQ does not support XA transactions, we configured the neo4j ChainedTransactionManager as our main transactionManager. This manager takes the JPA txManager and the rabbitTransactionManager as arguments.
Now I am able to annotate a service with @Transactional and use both JPA and Rabbit inside it. If I throw an exception within the service, none of the actions actually occur.
Here are my questions:
Does this configuration really give me an atomic transaction?
I've heard that the chained transaction manager does not use a two-phase commit but a "best effort" approach. Is best effort less reliable? If so, how?
What the ChainedTransactionManager does is basically start the transactions in the configured order and commit them in reverse order. So suppose you have a JpaTransactionManager and a RabbitTransactionManager configured like so:
    @Bean
    public PlatformTransactionManager transactionManager() {
        return new ChainedTransactionManager(rabbitTransactionManager(), jpaTransactionManager());
    }
Now if the JPA commit succeeds but the commit to RabbitMQ fails, your database changes will still be persisted, as they are already committed.
To answer your first question: it doesn't give you a truly atomic transaction. Everything that was committed before the exception occurred (during the commit phase) will remain committed.
See http://docs.spring.io/spring-data/commons/docs/current/api/org/springframework/data/transaction/ChainedTransactionManager.html
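The ordering and the failure case described in this answer can be simulated in plain Java. The sketch below does not use Spring: Resource and runChain are hypothetical names, and the rollback handling is a simplified assumption about best-effort behaviour rather than the library's actual implementation. It begins transactions in the given order, commits in reverse, and shows that an earlier successful commit is not undone when a later commit fails.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation of "begin in order, commit in reverse" without two-phase commit.
class ChainedCommitSketch {

    // Hypothetical stand-in for one transactional resource (database, broker, ...).
    static class Resource {
        final String name;
        final boolean failOnCommit;
        String state = "idle";

        Resource(String name, boolean failOnCommit) {
            this.name = name;
            this.failOnCommit = failOnCommit;
        }

        void begin() {
            state = "active";
        }

        void commit() {
            if (failOnCommit) {
                state = "rolled back";
                throw new IllegalStateException(name + " commit failed");
            }
            state = "committed";
        }

        void rollback() {
            if (state.equals("active")) {
                state = "rolled back";
            }
        }
    }

    // Begins every resource in the given order, then commits in reverse order.
    // There is no prepare phase, so a commit that already succeeded cannot be undone.
    static List<String> runChain(Resource... resources) {
        List<String> log = new ArrayList<>();
        for (Resource resource : resources) {
            resource.begin();
            log.add("begin " + resource.name);
        }
        for (int i = resources.length - 1; i >= 0; i--) {
            try {
                resources[i].commit();
                log.add("commit " + resources[i].name);
            } catch (IllegalStateException e) {
                log.add("FAILED " + resources[i].name);
                // Only resources that have not committed yet can still be rolled back.
                for (int j = i - 1; j >= 0; j--) {
                    resources[j].rollback();
                    log.add("rollback " + resources[j].name);
                }
                break;
            }
        }
        return log;
    }

    public static void main(String[] args) {
        Resource rabbit = new Resource("rabbit", true);   // its commit will fail
        Resource jpa = new Resource("jpa", false);
        System.out.println(runChain(rabbit, jpa));
        System.out.println("jpa = " + jpa.state + ", rabbit = " + rabbit.state);
    }
}
```

In this run the JPA resource ends up committed while the Rabbit resource ends up rolled back, which is the non-atomic outcome the answer warns about.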

Globally disable EntityManager cache in jboss

Is it possible to disable EntityManager caching in some JBoss config?
I'll explain. I have the final "ear" of our product, which uses EntityManager through Hibernate (something like this, I am a newbie to this), and I need to test some behaviour. The easy way for me is to change (remove, create) the state of entities directly in the database. But after I did this, the application kept finding the old values for some time. I've read about some JBoss cache that is used for the entity manager.
So, for testing, I want to disable the EntityManager cache, but it cannot be disabled at the application level, only at the JBoss level.
In brief: I need the application to always reload the actual entity state, because it can be edited in the database by some other application. And it is impossible to disable caching at the application level (hibernate.xml and others).
PS: JBoss 4.2.3, EJB3, Hibernate 3
The cache you are referring to is probably the PersistenceContext. It cannot be disabled. You can only tweak its scope. In a Java EE environment, the scope of the persistence context is the transaction by default. So if you need some changes to take effect immediately, you can extract these changes (including fetching the entities in question) into a separate method and annotate it to require a new transaction:
@TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
Once the method returns, all changes are committed.
You could also use bean-managed transactions, so you can control the commit yourself. For this, annotate your bean with @TransactionManagement(TransactionManagementType.BEAN) and use UserTransaction:
    @Resource
    private UserTransaction tx;
    ...
    tx.begin();
    // do stuff
    tx.commit();
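Why a narrower persistence context scope helps can be illustrated with a small plain-Java model. This is a sketch under assumptions, not JPA code: FakeContext is a hypothetical stand-in for a persistence context, and a HashMap plays the role of the database. An entity loaded once is served from the context afterwards, so an external change only becomes visible through a new context, which is what a new transaction gives you when the context is transaction-scoped.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of a persistence context acting as a per-context cache.
class PersistenceContextSketch {

    // Hypothetical stand-in for a persistence context bound to one transaction.
    static class FakeContext {
        private final Map<Integer, String> database;
        private final Map<Integer, String> loaded = new HashMap<>();

        FakeContext(Map<Integer, String> database) {
            this.database = database;
        }

        // Hits the "database" only the first time an id is requested in this context.
        String find(int id) {
            return loaded.computeIfAbsent(id, database::get);
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> database = new HashMap<>();
        database.put(1, "old");

        FakeContext first = new FakeContext(database);
        System.out.println("first read:      " + first.find(1));

        database.put(1, "new");   // change made outside the application

        System.out.println("same context:    " + first.find(1));
        System.out.println("fresh context:   " + new FakeContext(database).find(1));
    }
}
```

The existing context keeps returning the value it loaded earlier, while a fresh context reads the updated row.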
