Spring multithreading with Hibernate

Some quick version info: Spring Boot 2.1 with Hibernate 5 and Java 8.
We are trying to do a multithreaded processing step in which we use Spring services to work with Hibernate entities. Basically it looks like the following snippet.
ExecutorService executorService = Executors.newFixedThreadPool(4);
List<Callable<String>> executions = new ArrayList<>();
for (String partition : partitions) {
    Callable<String> partitionExecution = () -> {
        step.execute(partition);
        return partition;
    };
    executions.add(partitionExecution);
}
executorService.invokeAll(executions);
The problem is that the Hibernate session is somehow not available in the created threads. We get the following exception:
org.hibernate.LazyInitializationException:
failed to lazily initialize a collection of role: ..., could not initialize proxy - no Session
If I remove the multithreading part (i.e. remove the executor service), everything works fine.
We already tried the following:
Using a Spring-managed ThreadPoolTaskExecutor
Putting @Transactional on the method/class (which is wired into another class and invoked there, so it should basically work)
Any hints/suggestions appreciated :)

I came up with a working version. Basically I let Spring spawn the threads itself using the @Async annotation. This way the threads that are created get the wanted Hibernate session attached.
I created a Spring service that delegates through an async method.
@Component
public class AsyncDelegate {

    @Async
    @Transactional
    public Future<String> delegate(Step step, String partition) {
        step.execute(partition);
        return new AsyncResult<>(partition);
    }
}
And adapted the initial code like this:
@Autowired
AsyncDelegate asyncDelegate;

List<Future<String>> executions = new ArrayList<>();
for (String partition : partitions) {
    executions.add(asyncDelegate.delegate(step, partition));
}
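One thing worth double-checking with this approach: @Async only kicks in if async processing is enabled, e.g. with @EnableAsync on a configuration class. A minimal sketch (the executor settings are just an example):
@Configuration
@EnableAsync
public class AsyncConfig {

    // Optional: the executor @Async methods should use (picked up by the bean name "taskExecutor");
    // without it, Spring falls back to a default executor.
    @Bean
    public ThreadPoolTaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(4);
        executor.setMaxPoolSize(4);
        return executor;
    }
}
Also note that, unlike invokeAll, collecting the Futures does not block: the caller still needs to call get() on each Future if it must wait until all partitions are processed.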

In short, I'd strongly advise against passing Hibernate-managed objects between multiple threads, as this can cause:
Lazy initialization problems (as you encountered)
Missing updates
Locking exceptions
For more details, this blog post explains why: https://xebia.com/blog/hibernate-and-multi-threading/
For a solution, I'd find a way to split the work so that each thread gets a list of entity IDs to work with and does its independent work inside the thread (fetch the entities from the database, do the actual work, etc.).
This is briefly mentioned in the above blog post.
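A minimal sketch of that idea (StepService, MyEntity, and idsForPartition are made-up names for illustration; the key point is that each worker thread enters its own @Transactional method on a Spring-managed bean and loads its entities there):
// Each callable only carries plain data (the partition and entity ids);
// the entities themselves are loaded and processed inside the worker thread's own transaction.
List<Callable<String>> executions = new ArrayList<>();
for (String partition : partitions) {
    List<Long> entityIds = idsForPartition(partition); // resolve the ids up front, outside the workers
    executions.add(() -> {
        stepService.processIds(entityIds); // @Transactional method on a Spring bean
        return partition;
    });
}
executorService.invokeAll(executions);

@Service
public class StepService {

    @PersistenceContext
    private EntityManager entityManager;

    @Transactional
    public void processIds(List<Long> ids) {
        for (Long id : ids) {
            // fetched in this thread's session, so lazy collections can be initialized safely here
            MyEntity entity = entityManager.find(MyEntity.class, id);
            // ... do the actual work ...
        }
    }
}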

Related

Spring Batch/Data JPA application not persisting data to db

I'm having a really weird issue. I have to say that my code works perfectly fine locally, but some data is not persisted in our pod (Kubernetes environment).
I have different datasources to work with in this batch, and everything runs fine. The job repository is map-based and uses a ResourcelessTransactionManager. I configured it like this:
@Configuration
@EnableBatchProcessing
public class BatchConfigurer extends DefaultBatchConfigurer {

    @Override
    public void setDataSource(DataSource dataSource) {
        // intentionally left empty so the map-based job repository is used
    }
}
I also use a different PlatformTransactionManager than Spring Batch does (issue), so I set spring.main.allow-bean-definition-overriding to true in my properties. The platform transaction manager in my configurer is the correctly bound one; I debugged it.
I have a custom writer for one of my steps. It updates records in multiple tables that live in multiple databases (different datasources, in brief):
public class MyWriter implements ItemWriter<MyDTO> {

    @Autowired
    private MyFirstRepo myfirstRepo;   // table in first datasource

    @Autowired
    private MySecondRepo mySecondRepo; // table in second datasource

    @Override
    public void write(List<? extends MyDTO> myDtoList) throws Exception {
        // some logic
        mySecondRepo.delete(deletableEntity);
        // some logic
        mySecondRepo.saveAll(updatableEntities);
        // some logic
        myfirstRepo.saveAll(updatableEntities);
    }
}
Since I have multiple datasources, I defined multiple transaction managers, and to give a transaction manager to my step I defined a chained transaction manager that includes those managers.
@Bean
public Step myStep(@Qualifier("chainedTransactionManager") ChainedTransactionManager chainedTransactionManager) {
    return getCommonStepBuilder("myStep")
            .transactionManager(chainedTransactionManager)
            .<MyDTO, MyDTO>chunk(200)
            .reader(myPaginingReader())
            .writer(myWriter())
            .taskExecutor(myTaskExecutor())
            .throttleLimit(15)
            .build();
}
Chained transaction manager config (both of these transaction managers are JpaTransactionManagers):
@Configuration
public class TransactionManagerConfig {

    @Primary
    @Bean(name = "chainedTransactionManager")
    public ChainedTransactionManager transactionManager(
            @Qualifier("firstTransactionManager") PlatformTransactionManager firstTransactionManager,
            @Qualifier("secondTransactionManager") PlatformTransactionManager secondTransactionManager) {
        return new ChainedTransactionManager(firstTransactionManager, secondTransactionManager);
    }
}
So the first two JPA operations in the writer work just fine (the ones made through MySecondRepo), but the last operation does not persist the data to the database. It doesn't throw any errors and the job completes successfully, but it doesn't update my records in the table.
I must mention a second time that it does update locally; it just doesn't update in our app that lives in the Kubernetes environment (a dockerized microservice), which makes it so confusing. Any idea why this is happening?
Edit: I created another writer bean for myfirstRepo.saveAll(updatableEntities) (as a JdbcBatchItemWriter executing the same logic) and added both writers to a composite one. Now it's working as expected, but I have a lot of concerns since I don't know what caused this. Any idea?
Edit 2: I came across this thread. I was using a JdbcPagingItemReader; are entities fetched with this component in a managed state? The entities used in mySecondRepo.delete(deletableEntity) and mySecondRepo.saveAll(updatableEntities) are fetched inside the writer using Hibernate, but the myfirstRepo.saveAll(updatableEntities) entities are the ones that came from the reader.
It all makes sense if that is the case, but even so, why was it working fine locally?
You wrote that the mySecondRepo entities are fetched inside the writer using Hibernate, while the myfirstRepo.saveAll(updatableEntities) entities are the ones that came from the reader.
Fetching items in the item writer is the cause of your issue. This is incorrect: an item writer is not expected to read items. That's why the items coming from the reader are saved, but not the ones fetched inside the writer.
What you should know is that all writers in the composite run in the scope of a single transaction, driven by the transaction manager of the step. So if you are writing data to multiple datasources, you need to make sure the transaction manager coordinates the transaction across all datasources. ChainedTransactionManager is deprecated; you can use a JtaTransactionManager for your case.
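A minimal sketch of what that change could look like, assuming a JTA provider (for example Atomikos or Narayana via the corresponding Spring Boot starter) auto-configures a JtaTransactionManager bean and that both datasources are configured as XA datasources; the helper methods are the ones from the question:
@Bean
public Step myStep(JtaTransactionManager jtaTransactionManager) {
    // a single JTA transaction manager replaces the ChainedTransactionManager
    // and coordinates the commit across both datasources
    return getCommonStepBuilder("myStep")
            .transactionManager(jtaTransactionManager)
            .<MyDTO, MyDTO>chunk(200)
            .reader(myPaginingReader())
            .writer(myWriter())
            .taskExecutor(myTaskExecutor())
            .throttleLimit(15)
            .build();
}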

Guava EventBus post after transaction/commit

I am currently playing around with Guava's EventBus in Spring, and while the general functionality is working fine so far, I came across the following problem:
When a user wants to change data on a "Line" entity, this is handled as usual in a backend service. In this service the data is persisted via JPA first, and after that I create a "NotificationEvent" with a reference to the changed entity. Via the EventBus I send the reference of the line to all subscribers.
public void notifyUI(String lineId) {
    EventBus eventBus = getClientEventBus();
    eventBus.post(new LineNotificationEvent(lineId));
}
The EventBus itself is simply created using new EventBus() in the background.
Now in this case my subscribers are on the frontend side, outside of the @Transactional realm. So when I change my data, post the event and let the subscribers get all necessary updates from the database, the actual transaction is not committed yet, which makes the subscribers fetch the old data.
The only quick fix I can think of is handling it asynchronously and waiting for a second or two. But is there another way to post the events using Guava AFTER the transaction has been committed?
I don't think Guava is "aware" of Spring at all, and in particular not of its @Transactional stuff.
So you need a creative solution here. One solution I can think of is to move this code to a place where you're sure that the transaction has finished.
One way to achieve that is using TransactionSynchronizationManager:
TransactionSynchronizationManager.registerSynchronization(new TransactionSynchronization() {
    @Override
    public void afterCommit() {
        // do what you want to do after commit -
        // in this case, call the notifyUI method
    }
});
Note that if the transaction fails (rolls back), the method won't be called; in that case you'll probably need the afterCompletion method instead. See the documentation.
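A minimal sketch of that variant (assuming Spring 5+, where TransactionSynchronization offers default methods; on older versions you would extend TransactionSynchronizationAdapter instead):
TransactionSynchronizationManager.registerSynchronization(new TransactionSynchronization() {
    @Override
    public void afterCompletion(int status) {
        // called for both commits and rollbacks
        if (status == TransactionSynchronization.STATUS_COMMITTED) {
            notifyUI(lineId); // only notify subscribers once the commit actually succeeded
        } else {
            // rolled back (or unknown outcome) - log, retry, or skip the notification
        }
    }
});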
Another possible approach is refactoring your application to something like this:
@Service
public class NonTransactionalService {

    @Autowired
    private ExistingService existing;

    public void entryPoint() {
        String lineId = existing.invokeInTransaction(...);
        // now you know for sure that the transaction has been committed
        notifyUI(lineId);
    }
}

@Service
public class ExistingService {

    @Transactional
    public String invokeInTransaction(...) {
        // do your stuff that you've done before
    }
}
One last thing I would like to mention here is that Spring itself provides an events mechanism that you might use instead of Guava's.
See this tutorial, for example.
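For illustration, a minimal sketch of the Spring-events route (assuming Spring 4.2+; LineChangedEvent and the listener are made-up names):
// simple payload object carrying the id of the changed line
public class LineChangedEvent {
    private final String lineId;

    public LineChangedEvent(String lineId) {
        this.lineId = lineId;
    }

    public String getLineId() {
        return lineId;
    }
}

// publisher side, inside the existing @Transactional service
@Autowired
private ApplicationEventPublisher publisher;

public void changeLine(String lineId) {
    // ... persist the changes via JPA ...
    publisher.publishEvent(new LineChangedEvent(lineId));
}

// listener side - only invoked after the surrounding transaction has committed
@Component
public class LineChangedListener {

    @TransactionalEventListener(phase = TransactionPhase.AFTER_COMMIT)
    public void onLineChanged(LineChangedEvent event) {
        // notify the UI here, e.g. forward to the Guava EventBus
    }
}
AFTER_COMMIT is the default phase for @TransactionalEventListener, so events are simply not delivered if the transaction rolls back.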

Do Spring transactions propagate through new instantiations

I'm working on a bunch of legacy code written by people before me, and I'm confused about a particular kind of setup and wonder if it has ever worked to begin with.
There is a Spring-managed bean that has a transactional method:
@Transactional(propagation = Propagation.REQUIRES_NEW, rollbackFor = Throwable.class)
public boolean updateDraftAndLivePublicationUsingFastDocumentsOfMySite(List<FastDocumentLite> fastDocumentLites, Long mySiteId) throws Exception { ... }
Now inside that method I find new instantiations calling update methods, e.g.:
boolean firstFeed = new MySiteIdUpdate(publishing, siteDao, siteDomainService).update(siteId, fastDocumentLites.get(0).getMySiteId());
From my understanding of IoC this new object isn't managed by Spring; it's just a variable in the bean. Now going further inside the update method, you see another service gets called:
@Transactional(propagation = Propagation.REQUIRED, rollbackFor = Throwable.class)
public void activateSubdomainForSite(Long siteId, boolean activationOfSite)
So if there is a transaction open, it should be propagated into this service. But here is what I don't get: if that MySiteIdUpdate object isn't managed by Spring, does the first transaction carry forward to the activateSubdomainForSite method, or is another transaction being opened here? I looked in the logs and believe it to be the latter, but I'd rather ask the experts for a second opinion before I proclaim this legacy code to be complete rubbish to the project lead. I'm suffering from a StaleStateException somewhere further down the road and I'm hoping this has something to do with it.
I think the code is correct, and the second @Transactional should reuse the existing transaction.
Because:
1) Spring transaction handling is done either by proxies or by AspectJ advice. If it is done by proxies, then it is required that MySiteIdUpdate invokes an instance that was injected (which is what you did). If you use AspectJ, it should work anyway.
2) The association of transactions with the code that uses them is done per thread. This means that as long as you "are" in the thread that started the transaction, you can use it (you do not start a new thread, so it should work).
Another way to put it: it is perfectly legal to have a method in your call hierarchy that does not belong to a Spring bean. This should not make the transaction handling fail.
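To illustrate point 1) with a simplified sketch (constructor and method arguments are trimmed compared to the question; the important part is that MySiteIdUpdate is created with new but still calls an injected, proxied bean on the same thread):
@Service
public class PublicationService {

    @Autowired
    private SiteDomainService siteDomainService; // injected Spring proxy

    @Transactional(propagation = Propagation.REQUIRES_NEW, rollbackFor = Throwable.class)
    public void updateDraftAndLivePublication(Long siteId) {
        // plain "new" object - not a Spring bean, and that is fine
        new MySiteIdUpdate(siteDomainService).update(siteId);
    }
}

// not managed by Spring; it only holds references to beans that were injected elsewhere
public class MySiteIdUpdate {

    private final SiteDomainService siteDomainService;

    public MySiteIdUpdate(SiteDomainService siteDomainService) {
        this.siteDomainService = siteDomainService;
    }

    public void update(Long siteId) {
        // same thread + injected proxy, so REQUIRED joins the transaction opened by updateDraftAndLivePublication
        siteDomainService.activateSubdomainForSite(siteId, true);
    }
}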

Accessing Spring @Transactional service from multiple threads

I would like to know if the following is considered safe.
A usual Spring service class that accesses a bunch of DAOs / Hibernate entities:
@Transactional
public class MyService {
    ...
    public SomeObject readStuffFromDB(String key) {
        ...
        // return some records from the DB via Hibernate entities etc.
    }
}
A class in the application that has the service wired in:
public class ServiceHolder {
    private MyService myService;

    private SomeOtherObject multiThreadedMethod() {
        ...
        // calls myService.readStuffFromDB() and uses the results
        // to return something useful
    }
}
multiThreadedMethod will be called from multiple thread pool threads. I would like to know whether multiThreadedMethod is safe in its calls to myService.
It is NOT making any modifications to the DB - only reading.
What happens if two threads call myService.readStuffFromDB() at exactly the same time? Will a concurrent modification exception be thrown from somewhere?
I've been running it with no issues but I'm not 100% sure it will always work.
Yes, you will call the same object at the same time as long as your service bean is defined as a singleton (which is the default and the proper setup), but you should not rely on shared state (instance fields) in your services. The methods should be written so that they can work independently (you don't need mutual exclusion here). If you call the DB and perform read operations, nothing bad will happen, because every thread receives its own entity manager instance. If you modified the DB at the same time and some DB exception were thrown, you would get a rollback exception, which is perfectly fine.
entityManager.persist() will do more or less entityManager.getEntityManagerAssignedToCurrentThread().persist()
The injected entity manager is a proxy, not the real object. So you are safe :)

Another LazyInitializationException (in combination with Spring+GSON)

I guess I'm another newbie who fails to understand Hibernate sessions, or maybe Spring's TransactionTemplate, I don't know. Here's my story.
I'm using Hibernate 3.5.5-Final, Spring 3.0.4.RELEASE, trying to live only with annotations (for Hibernate as well as Spring MVC).
My first try was to use @Transactional annotations in combination with a properly set up transaction manager. It seemed to work at first, but in the long run (about 36 hours) I started to receive LazyInitializationExceptions over and over again (from places that had been running just fine in the previous hours!).
So I switched to manual transactions using Spring's TransactionTemplate.
Basically I have something like this protected stuff in my BaseService:
@Autowired
protected HibernateTransactionManager transactionManager;

protected void inTransaction(final Runnable runnable) {
    TransactionTemplate transaction = new TransactionTemplate(transactionManager);
    transaction.execute(new TransactionCallback<Boolean>() {
        @Override
        public Boolean doInTransaction(TransactionStatus status) {
            try {
                runnable.run();
                return true;
            } catch (Exception e) {
                status.setRollbackOnly();
                log.error("Exception in transaction.", e);
                throw new RuntimeException("Exception in transaction.", e);
            }
        }
    });
}
And using this method from the service impls worked OK; I did not see a LazyInitializationException for 10 days (running Tomcat with this single app 24*7 for 10 days, no restarts) ... but then whoops! It popped up again :-/
The LazyInitializationException comes from a place under the "inTransaction" method and there is no "inTransaction recursion" involved, so I'm pretty sure I should be in the same transaction and thus in the same Hibernate session. There is no "data from a previous session" involved (as far as my code goes, the service layer opens the transaction, gathers all the data it needs from Hibernate, processes it and returns some result, i.e. the service does not call other top-level services).
I have not profiled my app (I don't even know how to do that properly in long runs such as 10 days), but my only guess is that I'm leaking memory somewhere and the JVM hits the heap limit...
Are there some SoftReferences involved inside Spring or Hibernate? I don't know...
Another funny thing is that the exception always happens when I try to serialize the result into JSON using the Google GSON serializer. I know it does not use getters ... I have my own patched version that uses getters instead of actual fields (so I'm making sure not to bypass Hibernate's proxy mechanisms); do you think that may play some role here?
The last funny thing is that the exception always happens in a single service method (not anywhere else), which is driving me nuts because this method is as simple as it could be (no extreme DB operations, it just loads data and serializes it to JSON using lazy fetching), huh???
Does anybody have any suggestions about what I should try? Do you have similar experiences?
Thanks,
Jakub
