Spring Data Solr @Transactional Commits

I currently have a setup where data is inserted into a database, as well as indexed into Solr. These two steps are wrapped in a spring-managed transaction via the @Transactional annotation. What I've noticed is that spring-data-solr issues an update with the following parameters whenever the transaction is closed: params{commit=true&softCommit=false&waitSearcher=true}
@Transactional
public void save(Object toSave) {
    dbRepository.save(toSave);
    solrRepository.save(toSave);
}
The rate of commits into Solr is fairly high, so ideally I'd like to send data to the Solr index and have Solr auto-commit at regular intervals. I have autoCommit (and autoSoftCommit) set in my solrconfig.xml, but since spring-data-solr is sending those commit parameters, it does a hard commit every time.
I'm aware that I can drop down to the SolrTemplate API and issue commits manually, but I would like to keep the solrRepository.save call within a spring-managed transaction if possible. Is there a way to modify the parameters that are sent to Solr on commit?
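For reference, a rough sketch of what dropping down to SolrTemplate could look like (the collection name and the injected template are assumptions, and method signatures vary between Spring Data Solr versions):
@Autowired
private SolrTemplate solrTemplate;

public void index(Object toSave) {
    // Index the document without issuing an explicit hard commit; visibility is
    // then governed by the autoCommit/autoSoftCommit settings in solrconfig.xml.
    solrTemplate.saveBean("collection1", toSave);
    // Optionally make the change searchable sooner without an fsync:
    // solrTemplate.softCommit("collection1");
}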

After setting an IDE debug breakpoint in org.springframework.data.solr.repository.support.SimpleSolrRepository, here:
private void commitIfTransactionSynchronisationIsInactive() {
    if (!TransactionSynchronizationManager.isSynchronizationActive()) {
        this.solrOperations.commit(solrCollectionName);
    }
}
I discovered that wrapping my code in @Transactional (plus the other configuration needed for the framework to actually begin and end transactions around it) doesn't achieve what one would expect with "Spring Data for Apache Solr". The stack trace shows the proxy and transaction interceptor classes for my code's transactional scope, but it also shows the framework starting a nested transaction of its own, with another proxy and transaction interceptor. When the framework exits the CrudRepository.save() method my code calls, the commit to Solr is performed by that nested transaction, before my outer transaction has exited. So the attempt to batch many saves and issue a single commit at the end, instead of one commit per save, is futile. It seems that, for this part of my code, I'll have to use SolrJ to save (update) my entities to Solr and then follow the exit of my own transaction with an explicit commit.
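A rough SolrJ sketch of that idea (the client URL and collection are placeholders, not from the original code; exception handling is omitted):
import java.util.Collection;
import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class SolrBatchIndexer {

    private final SolrClient solrClient =
            new HttpSolrClient.Builder("http://localhost:8983/solr/collection1").build();

    // Called from inside the @Transactional method: index only, no commit.
    public void index(Collection<?> beans) throws Exception {
        solrClient.addBeans(beans);
    }

    // Called once, after the surrounding database transaction has committed.
    public void commit() throws Exception {
        solrClient.commit();
    }
}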

If you are using Spring Data Solr, I found that the SolrTemplate bean lets you 'batch' updates when adding data to the Solr index. Through the SolrTemplate bean you can use the saveBeans method, which adds a whole collection to the index and does not commit until the end of the transaction. In my case, iterating over my collection and calling solrClient.add() for each element took up to 4 hours to get it saved to the index, because it commits after every single save. By using solrTemplate.saveBeans(Collection<?>), it finishes in just over 1 second, as the commit covers the entire collection. Here is a code snippet:
@Resource
SolrTemplate solrTemplate;

public void doReindexing(List<Image> images) {
    if (images != null) {
        /* CMSSolrImage is a class with @SolrDocument mappings.
         * The List<Image> images is a collection pulled from my database
         * that I want indexed in Solr.
         */
        List<CMSSolrImage> sImages = new ArrayList<CMSSolrImage>();
        for (Image image : images) {
            CMSSolrImage sImage = new CMSSolrImage(image);
            sImages.add(sImage);
        }
        solrTemplate.saveBeans(sImages);
    }
}

The way I've done something similar is to create a custom repository implementation of the save methods.
Interface for the repository:
public interface FooRepository extends SolrCrudRepository<Foo, String>, FooRepositoryCustom {
}
Interface for the custom overrides:
public interface FooRepositoryCustom {
    Foo save(Foo entity);
    Iterable<Foo> save(Iterable<Foo> entities);
}
Implementation of the custom overrides:
public class FooRepositoryImpl implements FooRepositoryCustom {

    private SolrOperations solrOperations;

    public FooRepositoryImpl(SolrOperations fooSolrOperations) {
        this.solrOperations = fooSolrOperations;
    }

    @Override
    public Foo save(Foo entity) {
        Assert.notNull(entity, "Cannot save 'null' entity.");
        registerTransactionSynchronisationIfSynchronisationActive();
        this.solrOperations.saveBean(entity, 1000);
        commitIfTransactionSynchronisationIsInactive();
        return entity;
    }

    @Override
    public Iterable<Foo> save(Iterable<Foo> entities) {
        Assert.notNull(entities, "Cannot insert 'null' as a List.");
        if (!(entities instanceof Collection<?>)) {
            throw new InvalidDataAccessApiUsageException("Entities have to be inside a collection");
        }
        registerTransactionSynchronisationIfSynchronisationActive();
        this.solrOperations.saveBeans((Collection<? extends Foo>) entities, 1000);
        commitIfTransactionSynchronisationIsInactive();
        return entities;
    }

    private void registerTransactionSynchronisationIfSynchronisationActive() {
        if (TransactionSynchronizationManager.isSynchronizationActive()) {
            registerTransactionSynchronisationAdapter();
        }
    }

    private void registerTransactionSynchronisationAdapter() {
        TransactionSynchronizationManager.registerSynchronization(SolrTransactionSynchronizationAdapterBuilder
                .forOperations(this.solrOperations).withDefaultBehaviour());
    }

    private void commitIfTransactionSynchronisationIsInactive() {
        if (!TransactionSynchronizationManager.isSynchronizationActive()) {
            this.solrOperations.commit();
        }
    }
}
You also need to provide a SolrOperations bean for the right Solr core:
@Configuration
public class FooSolrConfig {

    @Bean
    public SolrOperations getFooSolrOperations(SolrClient solrClient) {
        return new SolrTemplate(solrClient, "foo");
    }
}
Footnote: auto commit is (to my mind) conceptually incompatible with a transaction. An autoCommit is a promise from Solr that it will try to start writing the data to disk within a certain time limit. Many things might still stop that from actually happening: a well-timed power or hardware failure, errors between the document and the schema, and so on. But the client won't know that Solr failed to keep its promise, and the transaction will see a success when it actually failed.

Related

Hibernate LazyInitialization exception in console Spring Boot with an open session

I'm not sure if anyone has experienced this particular twist of the LazyInitialization issue.
I have a console Spring Boot application (so: no views - everything basically happens within a single execution method). I have the standard beans and bean repositories, and the typical lazy relationship in one of the beans, and it turns out that unless I specify @Transactional, any access to any lazy collection automatically fails even though the session stays the same, is available, and is open. In fact, I can happily do any session-based operation as long as I don't try to access a lazy collection.
Here's a more detailed example:
@Entity
class Path {
    ...
    @OneToMany(mappedBy = "path", fetch = FetchType.LAZY, cascade = CascadeType.ALL)
    @OrderBy("projectOrder")
    public List<Project> getProjects() {
        return projects;
    }
}
Now, the main method does something as simple as this:
class SpringTest {
    ...
    @Autowired
    private PathRepository pathRepository;

    void foo() {
        Path path = pathRepository.findByNameKey("...");
        System.out.println(path.getProjects()); // Boom <- LazyInitializationException
    }
}
Of course if I slap a @Transactional on top of the method or class it works, but the point is - why should I need that? No one is closing the session, so why is Hibernate complaining that there's no session when there is one?
In fact, if I do:
void foo() {
    System.out.println(entityManager.unwrap(Session.class));
    System.out.println(entityManager.unwrap(Session.class).isOpen());
    Path basic = pathRepository.findByNameKey("...");
    System.out.println(entityManager.unwrap(Session.class));
    System.out.println(entityManager.unwrap(Session.class).isOpen());
    System.out.println(((AbstractPersistentCollection) basic.projects).getSession());
    Path p1 = pathRepository.findByNameKey("....");
}
I can see that the session object stays the same the whole time and stays open the whole time, but the internal session property of the collection is never set to anything other than null. So of course, when Hibernate tries to read that collection in its withTemporarySessionIfNeeded method, it immediately throws an exception:
private <T> T withTemporarySessionIfNeeded(LazyInitializationWork<T> lazyInitializationWork) {
    SharedSessionContractImplementor tempSession = null;
    if (this.session == null) {
        if (this.allowLoadOutsideTransaction) {
            tempSession = this.openTemporarySessionForLoading();
        } else {
            this.throwLazyInitializationException("could not initialize proxy - no Session");
        }
So I guess my question would be - why is this happening? Why doesn't Hibernate store or access the session from which a bean was fetched so that it can load the lazy collection from it?
Digging a bit deeper, it turns out that the repository method executing the query does this:
// method here is java.lang.Object org.hibernate.query.Query.getSingleResult()
public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
    ...
    if (SharedEntityManagerCreator.queryTerminatingMethods.contains(method.getName())) {
        ...
        EntityManagerFactoryUtils.closeEntityManager(this.entityManager); // <--- Why?!?!?
        this.entityManager = null;
    }
    ...
}
and the above closeEntityManager calls unsetSession on all collections:
SharedSessionContractImplementor session = this.getSession();
if (this.collectionEntries != null) {
    IdentityMap.onEachKey(this.collectionEntries, (k) -> {
        k.unsetSession(session);
    });
}
But why?!
(Spring Boot version is 2.7.8)
So, after researching more, it appears that this is standard behavior in Spring Boot - unless you use your own EntityManager, the one managed automatically by Spring is either attached to a @Transactional boundary, or opens and closes for each query.
Some relevant links:
Does Entity manager needs to be closed every query?
Do I have to close() every EntityManager?
In the end, I ended up using a TransactionTemplate to wrap my code in a transaction without having to mark the whole class @Transactional.
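For reference, a minimal sketch of that approach (assuming a TransactionTemplate bean is available, as Spring Boot normally auto-configures one; the repository and query come from the example above):
@Autowired
private TransactionTemplate transactionTemplate;

@Autowired
private PathRepository pathRepository;

void foo() {
    // The lazy collection is initialized inside the programmatic transaction,
    // so the persistence context stays open while it is accessed.
    transactionTemplate.executeWithoutResult(status -> {
        Path path = pathRepository.findByNameKey("...");
        System.out.println(path.getProjects()); // no LazyInitializationException here
    });
}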

Deleting a record then selecting within the same Spring Transaction still returns the deleted record

I have some code within a Spring transaction with the isolation level set to SERIALIZABLE. This code does a few things: first it deletes all records from a table that have a flag set, next it performs a select to ensure invalid records cannot be written, and finally the new records are written.
The problem is that the select continues to return the records that were deleted when the code is run with the transaction annotation. My understanding is that, because we are performing these operations within the same Spring transaction, the previous delete operation should be taken into account when performing the select.
We are using Spring Boot 2.1 and Hibernate 5.2
A summary of the code is shown below:
@HystrixCommand
public void deleteRecord(EntityObj entityObj) {
    fooRepository.deleteById(entityObj.getId());
    // Below line added as part of debugging, but I don't think I should really need it?
    fooRepository.flush();
}

public List<EntityObj> findRecordByProperty(final String property) {
    return fooRepository.findEntityObjByProperty(property);
}

@Transactional(isolation = Isolation.SERIALIZABLE)
public void debugReadWrite() {
    EntityObj entityObj = new EntityObj();
    entityObj.setId(1);
    deleteRecord(entityObj);
    List<EntityObj> results = findRecordByProperty("bar");
    if (!results.isEmpty()) {
        throw new RuntimeException("Should be no results!");
    }
}
The transaction has not committed yet; you need to complete the transaction and then find the record.
Decorating deleteRecord with propagation = Propagation.REQUIRES_NEW should solve the issue:
@Transactional(propagation = Propagation.REQUIRES_NEW)
public void deleteRecord(EntityObj entityObj) {
    fooRepository.deleteById(entityObj.getId());
    // flush not needed: fooRepository.flush();
}
A flush is not needed because when deleteRecord completes, the transaction will be committed.
Under the hood:
// start transaction
public void deleteRecord(EntityObj entityObj) {
    fooRepository.deleteById(entityObj.getId());
}
// commit transaction
It turns out the issue was due to our use of Hystrix. The transaction is started outside of Hystrix, and at a later point the call goes through a Hystrix command. The Hystrix command uses a thread pool, so the transaction is lost while executing on the new thread from the Hystrix thread pool. See this GitHub issue for more info:
https://github.com/spring-cloud/spring-cloud-netflix/issues/1381
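One possible workaround (a sketch only, assuming the hystrix-javanica annotations are in use and that giving up thread-pool isolation is acceptable) is to run the command with semaphore isolation so it executes on the calling thread and keeps the transaction bound to it:
@HystrixCommand(commandProperties = {
        @HystrixProperty(name = "execution.isolation.strategy", value = "SEMAPHORE")
})
public void deleteRecord(EntityObj entityObj) {
    // Runs on the caller's thread, so the surrounding Spring transaction remains bound.
    fooRepository.deleteById(entityObj.getId());
}
Note that semaphore isolation trades away Hystrix's thread-pool bulkheading, so whether this is acceptable depends on the use case.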

JPA - Spanning a transaction over multiple JpaRepository method calls

I'm using Spring Boot 2.x with Spring Data JPA, accessing the database via a CrudRepository.
Basically, I would like to call the CrudRepository's methods to update or persist the data. In one use case, I would like to delete older entries from the database (for the brevity of this example assume: delete all entries from the table) before I insert a new element.
In case persisting the new element fails for any reason, the delete operation shall be rolled back.
However, the main problem seems to be that a new transaction is opened for every method called on the CrudRepository, even though a transaction was already opened by the calling service method. I couldn't get the repository methods to use the existing transaction.
Getting transaction for [org.example.jpatrans.ChairUpdaterService.updateChairs]
Getting transaction for [org.springframework.data.jpa.repository.support.SimpleJpaRepository.deleteWithinGivenTransaction]
Completing transaction for [org.springframework.data.jpa.repository.support.SimpleJpaRepository.deleteWithinGivenTransaction]
I've tried using different propagation settings (REQUIRED, SUPPORTS, MANDATORY) on different methods (service/repository) to no avail.
Changing the method's @Transactional annotation to @Transactional(propagation = Propagation.NESTED) sounded as if it would do just that, but it didn't help:
JpaDialect does not support savepoints - check your JPA provider's capabilities
Can I achieve the expected behaviour without using an EntityManager directly?
I would also like to avoid having to use native queries.
Is there anything I have overlooked?
For demonstration purposes, I've created a very condensed example.
The complete example can be found at https://gitlab.com/cyc1ingsir/stackoverlow_jpa_transactions
Here are the main (even more simplified) details:
First I've got a very simple entity defined:
@Entity
@Table(name = "chair")
@Data
@AllArgsConstructor
@NoArgsConstructor
public class Chair {
    // Not auto generating the id is on purpose
    // for later testing with non unique keys
    @Id
    private int id;
    @Column(name = "legs", nullable = false)
    private Integer legs;
}
The connection to the database is made via the CrudRepository:
@Repository
public interface ChairRepository extends CrudRepository<Chair, Integer> {
}
This is being called from another bean (main methods here are updateChairs and doUpdate):
@Slf4j
@Service
@AllArgsConstructor
@Transactional
public class ChairUpdater {

    ChairRepository repository;

    /*
     * Initialize the data store with some
     * sample data
     */
    public void initializeChairs() {
        repository.deleteAll();
        Chair chair4 = new Chair(1, 4);
        Chair chair3 = new Chair(2, 3);
        repository.save(chair4);
        repository.save(chair3);
    }

    public void addChair(int id, Integer legCount) {
        repository.save(new Chair(id, legCount));
    }

    /*
     * Expected behaviour:
     * when saving a given chair fails ->
     * deleting all other is rolled back
     */
    @Transactional
    public void updateChairs(int id, Integer legCount) {
        Chair chair = new Chair(id, legCount);
        repository.deleteAll();
        repository.save(chair);
    }
}
The goal I want to achieve is demonstrated by these two test cases:
@Slf4j
@RunWith(SpringRunner.class)
@DataJpaTest
@Import(ChairUpdater.class)
public class ChairUpdaterTest {

    private static final int COUNT_AFTER_ROLLBACK = 3;

    @Autowired
    private ChairUpdater updater;
    @Autowired
    private ChairRepository repository;

    @Before
    public void setup() {
        updater.initializeChairs();
    }

    @Test
    public void positiveTest() throws UpdatingException {
        updater.updateChairs(3, 10);
    }

    @Test
    public void testRollingBack() {
        // Trying to update with an invalid element
        // to force rollback
        try {
            updater.updateChairs(3, null);
        } catch (Exception e) {
            LOGGER.info("Rolled back?", e);
        }
        // Adding a valid element after the rollback
        // should succeed
        updater.addChair(4, 10);
        assertEquals(COUNT_AFTER_ROLLBACK, repository.findAll().spliterator().getExactSizeIfKnown());
    }
}
Update:
It seems to work if the repository is extended not from CrudRepository or JpaRepository but from a plain Repository, defining all needed methods explicitly. To me, that seems to be a workaround rather than a proper solution.
The question it boils down to seems to be: is it possible to prevent SimpleJpaRepository from opening new transactions for every (predefined) method used from the repository interface? Or, if that is not possible, how can I "force" the transaction manager to reuse the transaction opened in the service, so that a complete rollback is possible?
Hi, I found this documentation that looks like it will help you:
https://www.logicbig.com/tutorials/spring-framework/spring-data/transactions.html
Here is an example taken from the site above:
@Configuration
@ComponentScan
@EnableTransactionManagement
public class AppConfig {
    ....
}
Then we can use transactions like this:
@Service
public class MyExampleBean {

    @Transactional
    public void saveChanges() {
        repo.save(..);
        repo.deleteById(..);
        .....
    }
}
Yes, this is possible. First alter the @Transactional annotation so that it includes rollbackFor = Exception.class.
/*
 * Expected behaviour:
 * when saving a given chair fails ->
 * deleting all other is rolled back
 */
@Transactional(rollbackFor = Exception.class)
public void updateChairs(int id, Integer legCount) {
    Chair chair = new Chair(id, legCount);
    repository.deleteAll();
    repository.save(chair);
}
This will cause the transaction to roll back for any exception and not just RuntimeException or Error.
Next you must add enableDefaultTransactions = false to @EnableJpaRepositories and put the annotation on one of your configuration classes if you haven't already done so.
@Configuration
@EnableJpaRepositories(enableDefaultTransactions = false)
public class MyConfig {
}
This will cause all inherited JPA methods to stop creating a transaction by default whenever they're called. If you want custom JPA methods that you've defined yourself to also use the transaction of the calling service method, then you must make sure that you don't annotate any of these custom methods with @Transactional, because that would prompt them to start their own transactions as well.
Once you've done this, all of the repository methods should be executed using the service method's transaction only. You can test this by creating and using a custom update method annotated with @Modifying. For more on testing, please see my answer in this SO thread: Spring opens a new transaction for each JpaRepository method that is called within an @Transactional annotated method.
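For illustration, a minimal sketch of such a custom @Modifying method (the query and method name are hypothetical, not part of the original example); with enableDefaultTransactions = false it runs inside the calling service's transaction:
@Repository
public interface ChairRepository extends CrudRepository<Chair, Integer> {

    // No transaction is started here; the @Transactional service method must supply one.
    @Modifying
    @Query("update Chair c set c.legs = :legs where c.id = :id")
    void updateLegCount(@Param("id") int id, @Param("legs") Integer legs);
}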

How to rollback transaction invoked with jpa entity listeners

I'm using JPA, Spring Data and entity listeners to audit my entities, specifically on PostUpdate, PostPersist and PostRemove.
This is pseudo-code of my entity listener class:
public class EntityListener extends AuditingEntityListener {

    @PostUpdate
    public void postPersist(Object auditedEntity) {
        writer.saveEntity(auditedEntity, "UPDATE");
    }
}
This is pseudo-code of the Writer class:
public class Writer {

    @Async
    public void saveEntity(Object auditedEntity, String action) {
        try {
            // some code to prepare the history entity
            historyDAO.save(entity);
        } catch (Exception e) {
        }
    }
}
When an exception is thrown in the Writer class, the auditedEntity is still updated or inserted, but the historyEntity where I store the audit action is not.
The problem is that I need to invoke the saveEntity method on another thread for performance reasons (@Async), but in that case a new transaction is opened instead of joining the one that was already open.
How can I solve the rollback issue for both transactions, so that when an exception is thrown neither the historyEntity nor the auditedEntity is persisted?
I understand that you want to rollback both the child and the parent transaction when an exception is thrown from within Writer.saveEntity.
The problem is that the thread with the original transaction would still need to wait for all these complicated operations to finish before it could mark the transaction as committed. You can't easily span a transaction across multiple threads, either.
The only thing you could probably do to speed things up is you could run the logic of generating the history entities in parallel, and then save them all just before the transaction commits.
One way of doing that which I can think of is using a Hibernate interceptor:
public class AuditInterceptor extends EmptyInterceptor {

    private List<Callable<BaseEntity>> historyEntries;
    private ExecutorService executor;
    ...

    @Override
    public void beforeTransactionCompletion(Transaction tx) {
        List<Future<BaseEntity>> futures = executor.invokeAll(historyEntries);
        if (executor.awaitTermination(/* some timeout here */)) {
            futures.stream().map(Future::get).forEach(entity -> session.save(entity));
        } else {
            /* rollback */
        }
    }
}
Your listener code then becomes:
@PostUpdate
public void postPersist(Object auditedEntity) {
    interceptor.getHistoryEntries().add(new Callable<BaseEntity>() {
        /* history entry generation logic goes here */
    });
}
(Note that the above code is greatly simplified. You could use any other asynchronous execution API; the basic idea is that you need to block in AuditInterceptor.beforeTransactionCompletion, waiting for all the history entries to be generated.)
However, I would strongly advise against using the above technique, as it is rather complicated and error prone.
If you look here: https://docs.jboss.org/hibernate/orm/5.1/userguide/html_single/chapters/events/Events.html, you'll find that Hibernate interceptors have more interesting methods that could help you gather auditing info, and that perhaps your implementation could make use of them, possibly avoiding the need for complicated logic altogether (Hibernate already does track changes to fields of individual entities, so you get that information for free).
Why reinvent the wheel, though? If you dig even deeper, you'll find the Hibernate Envers module (http://hibernate.org/orm/envers/, works for both JPA and pure Hibernate) which gives you business auditing out of the box. Envers already digs into the above mechanism, so hopefully the performance issue would go away.
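For reference, a minimal sketch of what enabling Envers looks like (assuming the hibernate-envers dependency is on the classpath; the entity is purely illustrative):
import javax.persistence.Entity;
import javax.persistence.Id;
import org.hibernate.envers.Audited;

// With @Audited, Envers records every insert, update and delete of this
// entity in audit tables, together with a revision number.
@Entity
@Audited
public class OrderRecord {
    @Id
    private Long id;
    private String status;
}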
Final note: have you measured how long history entry generation takes? I would guess that executing for loops and if statements might be cheaper than database access operations. If I were you, I wouldn't do any of the above unless I was absolutely sure that's where the performance bottleneck was.

JPA: Nested transactional method is not rolled back

UPD 1: Upon further research I think the following information may be useful:
I obtain the datasource through a JNDI lookup on WildFly 9.0.2, then 'wrap' it in an instance of HikariDataSource (e.g. return new HikariDataSource(jndiDSLookup(dsName))).
the transaction manager that ends up being used is JTATransactionManager.
I do not configure the transaction manager in any way.
ORIGINAL QUESTION:
I am experiencing an issue with JPA/Hibernate and (maybe) Spring-Boot where DB changes introduced in a transactional method of one class called from a transactional method of another class are committed even though the changes in the caller method are rolled back (as they should be).
Here are my transactional services
StuffService:
@Service
@Transactional(rollbackFor = IOException.class)
public class StuffService {

    @Inject private BarService barService;
    @Inject private StuffRepository stuffRepository;

    public Stuff updateStuff(Stuff stuff) {
        try {
            if (null != barService.doBar(stuff)) {
                stuff.setSomething(SOMETHING);
                stuff.setSomethingElse(SOMETHING_ELSE);
                return stuffRepository.save(stuff);
            }
        } catch (FirstCustomException e) {
            logger.error("Blah", e);
            throw new SecondCustomException(e.getMessage());
        }
        throw new SecondCustomException("Blah 2");
    }

    // other methods
}
and BarService:
@Service
@Transactional
public class BarService {

    @Inject private EntityARepository entityARepository;
    @Inject private EntityBRepository entityBRepository;

    /*
     * updates existing entity A and persists new entity B.
     */
    public EntityA doBar(Stuff stuff) throws FirstCustomException {
        EntityA a = entityARepository.findOne(/* some criteria */);
        a.setSomething(SOMETHING);
        EntityB b = new EntityB();
        b.setSomething(SOMETHING);
        b.setSomethingElse(SOMETHING_ELSE);
        entityBRepository.save(b);
        return entityARepository.save(a);
    }

    // other methods
}
EntityARepository and EntityBRepository are very similar Spring-Boot repositories defined like this:
public interface EntityARepository extends JpaRepository<EntityA, Long> {
    EntityA findOne(/* some criteria */);
}
FirstCustomException extends Throwable
SecondCustomException extends RuntimeException
Stuff entity is versioned, and every once in a while it is concurrently updated by StuffService.updateStuff(). In that case changes to one of the stuff instances are rolled back, as expected, but everything that happens in the barService.doBar() ends up being committed.
This puzzles me quite a lot, since transaction propagation on both methods should be REQUIRED (the default) and both methods belong to different classes, hence @Transactional should apply to both.
I did see Transaction is not completely rolled back after server throws OptimisticLockException.
But it did not really answer my question.
Can anyone please give me an idea of what's going on?
Thank you.
This isn't a 'nested' transaction - these services are operating in completely independent transactions. If you want the rollback of one to affect the other, you need to have them take part in the same transaction rather than each starting its own.
Or, if your issue is that there is a problem with the version of 'stuff' passed into the doBar method and you want it verified, you will need to do something with the stuff instance that would cause an optimistic lock check and so result in an exception if it is stale; see EntityManager.lock.
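A minimal sketch of forcing that optimistic version check (the method name is hypothetical, and it assumes stuff is a versioned entity managed by the current persistence context):
import javax.persistence.EntityManager;
import javax.persistence.LockModeType;
import javax.persistence.PersistenceContext;

@Service
@Transactional
public class BarService {

    @PersistenceContext
    private EntityManager entityManager;

    public void verifyStuffVersion(Stuff stuff) {
        // OPTIMISTIC re-checks the @Version column at commit time; if the row was
        // changed concurrently, an OptimisticLockException is thrown and the
        // surrounding transaction is rolled back.
        entityManager.lock(stuff, LockModeType.OPTIMISTIC);
    }
}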
