Spring Data findById vs Join Query performance

I have a @OneToMany relation between an entity, say Class, and Student. Each class can have at least 100 students. This is how the relationship with Student is defined in the Class entity:
@OneToMany(mappedBy = "classDataEntity", cascade = CascadeType.ALL, fetch = FetchType.LAZY)
private List<StudentDataEntity> studentDataEntities;
Just to check the performance of fetching a class by ID (PK), we use two methods:
Optional<T> findById(ID id); // with fetchType Eager with Students
A new repository method with @Query joining the two tables on classId
We call both methods from the same service class method, e.g.
@Transactional
public ClassDataEntity fetchClassEntity(Long classId) {
    ClassDataEntity classDataEntityJoined = repo.fetchClassWithStudents(classId);
    ClassDataEntity classDataEntity = repo.findById(classId).orElseThrow();
    return classDataEntity;
}
My understanding is that with a lot of students the join should perform better, since it makes fewer calls to the DB and hence fewer network round trips. But in the above case we see findById performing much better.
Is it because the data for that id is already in the session? Also, when are Hibernate sessions created and destroyed when invoked via CRUD repositories?

Yeah, it's because the data is already in the persistence context. If you remove @Transactional you should see that two queries are executed, because then the persistence context would not be shared (unless you have open-session-in-view enabled in Spring).
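To make the effect concrete, here is a minimal, framework-free sketch of the "look in the persistence context first" rule. It is not Hibernate code: `FakeDatabase`, `FakeSession` and the query counter are hypothetical stand-ins, and the only assumption carried over from JPA is that a lookup by id is served from the context when the entity is already managed, while a query always goes to the database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical simulation of a first-level cache; not Hibernate's implementation.
class PersistenceContextSketch {

    /** Stand-in for the database; counts how often it is hit. */
    static class FakeDatabase {
        int queryCount = 0;

        String loadClassRow(long id) {
            queryCount++;
            return "ClassDataEntity#" + id;
        }
    }

    /** One instance corresponds to one persistence context (one transaction). */
    static class FakeSession {
        private final FakeDatabase db;
        private final Map<Long, String> managed = new HashMap<>();

        FakeSession(FakeDatabase db) {
            this.db = db;
        }

        // Like the join query: always hits the database, then registers the result.
        String fetchClassWithStudents(long id) {
            String loaded = db.loadClassRow(id);
            managed.putIfAbsent(id, loaded);
            return managed.get(id);
        }

        // Like findById: only hits the database if the entity is not managed yet.
        Optional<String> findById(long id) {
            return Optional.of(managed.computeIfAbsent(id, key -> db.loadClassRow(key)));
        }
    }

    public static void main(String[] args) {
        // Shared context, as with @Transactional on the service method.
        FakeDatabase db = new FakeDatabase();
        FakeSession shared = new FakeSession(db);
        shared.fetchClassWithStudents(1L);
        shared.findById(1L);
        System.out.println("shared context: " + db.queryCount + " query");

        // Separate contexts, as without a surrounding transaction.
        FakeDatabase db2 = new FakeDatabase();
        new FakeSession(db2).fetchClassWithStudents(1L);
        new FakeSession(db2).findById(1L);
        System.out.println("separate contexts: " + db2.queryCount + " queries");
    }
}
```

In the shared case the second call never reaches the database, which is why timing findById after the join query says little about findById itself. Measuring each method in its own transaction gives a fairer comparison.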

Related

Transaction getting rolled back on persisting the entity from Many to one side

I have this association in the DB -
I want the data to be persisted in the tables like this -
The corresponding JPA entities have been modeled this way (omitted getters/setters for simplicity) -
STUDENT Entity -
@Entity
@Table(name = "student")
public class Student {
    @Id
    @SequenceGenerator(name = "student_pk_generator", sequenceName =
        "student_pk_sequence", allocationSize = 1)
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator =
        "student_pk_generator")
    @Column(name = "student_id", nullable = false)
    private Long studentId;
    @Column(name = "name", nullable = false)
    private String studentName;
    @OneToMany(mappedBy = "student", cascade = CascadeType.ALL)
    private Set<StudentSubscription> studentSubscription;
}
STUDENT_SUBSCRIPTION Entity -
@Entity
@Table(name = "student_subscription")
@Inheritance(strategy = InheritanceType.JOINED)
public abstract class StudentSubscription {
    @Id
    private Long studentId;
    @ManyToOne(optional = false)
    @JoinColumn(name = "student_id", referencedColumnName = "student_id")
    @MapsId
    private Student student;
    @Column(name = "valid_from")
    private Date validFrom;
    @Column(name = "valid_to")
    private Date validTo;
}
LIBRARY_SUBSCRIPTION Entity -
@Entity
@Table(name = "library_subscription",
    uniqueConstraints = {@UniqueConstraint(columnNames = {"library_code"})})
@PrimaryKeyJoinColumn(name = "student_id")
public class LibrarySubscription extends StudentSubscription {
    @Column(name = "library_code", nullable = false)
    private String libraryCode;
    @PrePersist
    private void generateLibraryCode() {
        this.libraryCode = // some logic to generate unique libraryCode
    }
}
COURSE_SUBSCRIPTION Entity -
@Entity
@Table(name = "course_subscription",
    uniqueConstraints = {@UniqueConstraint(columnNames = {"course_code"})})
@PrimaryKeyJoinColumn(name = "student_id")
public class CourseSubscription extends StudentSubscription {
    @Column(name = "course_code", nullable = false)
    private String courseCode;
    @PrePersist
    private void generateCourseCode() {
        this.courseCode = // some logic to generate unique courseCode
    }
}
Now, there is a Student entity already persisted with the id, let's say, 100.
Now I want to persist this student's library subscription. For this I have created a simple test using Spring Data JPA repositories -
@Test
public void testLibrarySubscriptionPersist() {
    Student student = studentRepository.findById(100L).get();
    StudentSubscription librarySubscription = new LibrarySubscription();
    librarySubscription.setValidFrom(/* some date */);
    librarySubscription.setValidTo(/* some date */);
    librarySubscription.setStudent(student);
    studentSubscriptionRepository.save(librarySubscription);
}
On running this test I am getting the exception -
org.springframework.dao.InvalidDataAccessApiUsageException: detached entity passed to persist: com.springboot.data.jpa.entity.Student; nested exception is org.hibernate.PersistentObjectException: detached entity passed to persist: com.springboot.data.jpa.entity.Student
To fix this I added @Transactional to the test. This fixed the above exception for the detached entity, but the StudentSubscription and LibrarySubscription entities are not getting persisted to the DB. In fact, the transaction is getting rolled back.
Getting this message in the logs -
INFO 3515 --- [ main] o.s.t.c.transaction.TransactionContext : Rolled back transaction for test: [DefaultTestContext@35390ee3 testClass = SpringDataJpaApplicationTests, testInstance = com.springboot.data.jpa.SpringDataJpaApplicationTests@48a12036, testMethod = testLibrarySubscriptionPersist@SpringDataJpaApplicationTests, testException = [null], mergedContextConfiguration = [MergedContextConfiguration@5e01a982 testClass = SpringDataJpaApplicationTests, locations = '{}', classes = '{class com.springboot.data.jpa.SpringDataJpaApplication}', contextInitializerClasses = '[]', activeProfiles = '{}', propertySourceLocations = '{}', propertySourceProperties = '{org.springframework.boot.test.context.SpringBootTestContextBootstrapper=true}', contextCustomizers = set[org.springframework.boot.test.context.filter.ExcludeFilterContextCustomizer@18ece7f4, org.springframework.boot.test.json.DuplicateJsonObjectContextCustomizerFactory$DuplicateJsonObjectContextCustomizer@264f218, org.springframework.boot.test.mock.mockito.MockitoContextCustomizer@0, org.springframework.boot.test.web.client.TestRestTemplateContextCustomizer@2462cb01, org.springframework.boot.test.autoconfigure.actuate.metrics.MetricsExportContextCustomizerFactory$DisableMetricExportContextCustomizer@928763c, org.springframework.boot.test.autoconfigure.properties.PropertyMappingContextCustomizer@0, org.springframework.boot.test.autoconfigure.web.servlet.WebDriverContextCustomizerFactory$Customizer@7c3fdb62, org.springframework.boot.test.context.SpringBootTestArgs@1, org.springframework.boot.test.context.SpringBootTestWebEnvironment@1ad282e0], contextLoader = 'org.springframework.boot.test.context.SpringBootContextLoader', parent = [null]], attributes = map['org.springframework.test.context.event.ApplicationEventsTestExecutionListener.recordApplicationEvents' -> false]]
Now I have a couple of questions -
Why am I getting the detached entity exception? When we fetch an entity from the DB, Spring Data JPA must be using the entityManager to fetch it. The fetched entity gets automatically attached to the persistence context, right?
On attaching @Transactional on the test, why is the transaction getting rolled back, with no entity getting persisted? I was expecting the two entities, StudentSubscription and LibrarySubscription, to be persisted using the joined table inheritance approach.
I tried many things but no luck. Seeking help from JPA and Spring Data experts :-)
Thanks in advance.

Let me add a few details that outline a couple of design problems with your code that significantly complicate the picture. In general, when working with Spring Data, you cannot simply look at your tables, create cookie-cutter entities and repositories for those and expect things to simply work. You need to at least spend a bit of time to understand the Domain-Driven Design building blocks entity, aggregate and repository.
Repositories manage aggregates
In your case, Student treats StudentSubscriptions like an entity (full object reference, cascading persistence operations) but at the same time a repository to persist the …Subscriptions exists. This fundamentally breaks the responsibility of keeping consistency of the Student aggregate, as you can simply remove a …Subscription from the store via the repository without the aggregate having a chance to intervene. Assuming the …Subscriptions are aggregates themselves, and you'd like to keep the dependency in that direction, those must only be referred to via identifiers, not via full object representations.
The arrangement also adds cognitive load, as there are now two ways to add a subscription:
Create a …Subscription instance, assign the Student, persist the subscription via the repository.
Load a Student, create a …Subscription, add that to the student, persist the Student via its repository.
While that's already a smell, the bidirectional relationship between the …Subscription and Student imposes the need to manually manage those in code. Also, the relationships establish a dependency cycle between the concepts, which makes the entire arrangement hard to change. You already see that you have accumulated a lot of (mapping) complexity for a rather simple example.
What would better alternatives look like?
Option 1 (less likely): Students and …Subscriptions are "one"
If you'd like to keep the concepts close together and there's no need to query the subscriptions on their own, you could just avoid those being aggregates and remove the repository for them. That would allow you to remove the back-reference from …Subscription to Student and leave you with only one way of adding subscriptions: load the Student, add a …Subscription instance, save the Student, done. This also gives the Student aggregate its core responsibility back: enforcing invariants on its state (the set of …Subscription having to follow some rules, e.g. at least one selected etc.)
Option 2 (more likely): Students and …Subscriptions are separate aggregates (potentially from separate logical modules)
In this case, I'd remove the …Subscriptions from the Student entirely. If you need to find a Student's …Subscriptions, you can add a query to the …SubscriptionRepository (e.g. List<…Subscription> findByStudentId(…)). As a side effect of this you remove the cycle and Student does not (have to) know anything about …Subscriptions anymore, which simplifies the mapping. No wrestling with eager/lazy loading etc. In case any cross-aggregate rules apply, those would be applied in an application service fronting the SubscriptionRepository.
Heuristics summarized
Clear distinction between what's an aggregate and what not (the former get a corresponding repository, the latter don't)
Only refer to aggregates via their identifiers.
Avoid bidirectional relationships. Usually, one side of the relationship can be replaced with a query method on a repository.
Try to model dependencies from higher-level concepts to lower-level ones (Students with Subscriptions probably make sense, a …Subscription without a Student most likely doesn't. Thus, the latter is the better relationship to model and solely work with.)
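As a rough illustration of Option 2, the sketch below keeps the two aggregates separate and lets the subscription refer to its student by identifier only. It is plain Java without JPA: `Student`, `Subscription` and the in-memory `SubscriptionRepository` are simplified, hypothetical stand-ins for the real entities and for a Spring Data repository with a `findByStudentId` query method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, framework-free sketch of "refer to aggregates via identifiers".
class AggregateByIdSketch {

    /** Student aggregate: knows nothing about subscriptions. */
    static final class Student {
        final long id;
        final String name;

        Student(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Subscription aggregate: holds the student's id, not a Student reference. */
    static final class Subscription {
        final long studentId;
        final String code;

        Subscription(long studentId, String code) {
            this.studentId = studentId;
            this.code = code;
        }
    }

    /** In-memory stand-in for a repository with a query method by student id. */
    static final class SubscriptionRepository {
        private final List<Subscription> store = new ArrayList<>();

        Subscription save(Subscription subscription) {
            store.add(subscription);
            return subscription;
        }

        List<Subscription> findByStudentId(long studentId) {
            List<Subscription> result = new ArrayList<>();
            for (Subscription subscription : store) {
                if (subscription.studentId == studentId) {
                    result.add(subscription);
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        SubscriptionRepository subscriptions = new SubscriptionRepository();
        Student student = new Student(100L, "Alice");

        // Only one way to add a subscription: create it with the student's id and save it.
        subscriptions.save(new Subscription(student.id, "LIB-1"));
        subscriptions.save(new Subscription(student.id, "COURSE-1"));
        subscriptions.save(new Subscription(200L, "LIB-2"));

        System.out.println(subscriptions.findByStudentId(student.id).size());
    }
}
```

With this shape there is no back-reference to keep in sync and no cascade from Student, so the "two ways to add a subscription" problem described above does not arise.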

The transaction is rolled back because the test runs inside a test-managed transaction.
With @Transactional on a test method, Spring's test framework rolls the transaction back by default once the test completes, regardless of what the test did; annotating the test with @Commit or @Rollback(false) makes it commit instead. The transaction is still needed here because the EntityManager gets closed as soon as the Student entity has been retrieved, so to keep it open the test has to run within a transactional context.
Using a separate test DB would probably not change this, since the rollback is the test framework's default rather than something tied to the database.
I will set up an H2 test DB, perform the same operation there, and post the outcome.
Thanks for the quick help, guys. :-)

Why am I getting detached entity exception. When we fetch an entity from the DB, Spring Data JPA must be using entityManager to fetch the entity. The fetched entity gets automatically attached to the persistence context right ?
Right, but only for as long as the entityManager stays open. Without @Transactional, as soon as you return from studentRepository.findById(100L).get();, the entityManager gets closed and the object becomes detached.
When you call save, a new entityManager gets created that doesn't contain a reference to the previous object. And so you get the error.
@Transactional makes the entity manager stay open for the duration of the method.
At least, that's what I think is happening.
On attaching @Transactional on the test, why the transaction is getting rolled back,
With bi-directional associations, you need to make sure that the association is updated on both sides. The code should look like:
@Test
@Transactional
public void testLibrarySubscriptionPersist() {
    Student student = studentRepository.findById(100L).get();
    StudentSubscription librarySubscription = new LibrarySubscription();
    librarySubscription.setValidFrom(/* some date */);
    librarySubscription.setValidTo(/* some date */);
    // Update both sides:
    librarySubscription.setStudent(student);
    student.getStudentSubscription().add(librarySubscription);
    // Because of the cascade, saving student should also save librarySubscription.
    // Maybe it's not necessary because student is managed
    // and the db will be updated anyway at the end
    // of the transaction.
    studentRepository.save(student);
}
In this case, you could also use EntityManager#getReference:
@Test
@Transactional
public void testLibrarySubscriptionPersist() {
    EntityManager em = ...
    StudentSubscription librarySubscription = new LibrarySubscription();
    librarySubscription.setValidFrom(/* some date */);
    librarySubscription.setValidTo(/* some date */);
    // Doesn't actually load the student
    Student student = em.getReference(Student.class, 100L);
    librarySubscription.setStudent(student);
    studentSubscriptionRepository.save(librarySubscription);
}
I think any of these solutions should fix the issue. Hard to say without the whole stacktrace.

How can I get an Entity with its referenced entity ids in a ManyToMany relation?

I have a basic Spring application that uses Hibernate and MapStruct.
There are two entities, each implemented with its child entities as a List attribute in a ManyToMany relation.
So there is
EntityA.class
with List<EntityB> (fetchType Lazy)
and vice versa.
Now when my client calls, it wants to get a DTO that looks like the following:
EntityADTO
with List<Long> entityBIds
How can I get my EntityA with only the ids of EntityB most efficiently, without loading the complete EntityB and post-processing it afterwards?
Thanks a lot!

The @ManyToMany association information is persisted in a dedicated (join) table and is loaded lazily on collection access, so there needs to be another query.
Instead of querying for the complete information of all associated entities, you could specifically query only for the needed id property.
Possible queries could look e.g. like this:
// Spring Data repository (requires an extra interface for the result):
interface IdOnly {
    Long getId();
}
interface EntityBRepository extends JpaRepository<EntityB, Long> {
    List<IdOnly> getIdByEntityAId(Long entityAId);
}
// alternative JPQL query (does not need the interface):
@Query("SELECT b.id FROM EntityB b JOIN b.entityAs a WHERE a.id = :entityAId")
List<Long> getIdByEntityAIdJpaQuery(@Param("entityAId") Long entityAId);
This way, only the needed EntityB ids for an associated EntityA are loaded from the DB.
For even further tuning, one could also write a native query directly accessing only the join table, which avoids all joins:
@Query(nativeQuery = true, //
    value = "SELECT entityBId FROM entityA_entityB WHERE entityAId = :entityAId")
List<Long> getIdByEntityAIdNative(@Param("entityAId") Long entityAId);
For executing the query when mapping with MapStruct, you can use the Spring repository bean, e.g. as described here: https://stackoverflow.com/a/51292920
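To show where the id-only query result ends up, here is a small framework-free sketch of assembling the DTO. `EntityADto`, `IdLookup` and `toDto` are hypothetical names; `IdLookup` stands in for a repository method such as the id queries above, and in a real application MapStruct would call the repository bean instead of this hand-written method.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: building the DTO from an id-only lookup, without touching EntityB.
class IdOnlyDtoSketch {

    /** DTO carrying only the ids of the associated EntityB rows. */
    static final class EntityADto {
        final long id;
        final List<Long> entityBIds;

        EntityADto(long id, List<Long> entityBIds) {
            this.id = id;
            this.entityBIds = entityBIds;
        }
    }

    /** Stand-in for a repository query that returns ids only. */
    interface IdLookup {
        List<Long> idsForEntityA(long entityAId);
    }

    // Asks the lookup for the ids; no EntityB instance is ever created here.
    static EntityADto toDto(long entityAId, IdLookup lookup) {
        return new EntityADto(entityAId, lookup.idsForEntityA(entityAId));
    }

    public static void main(String[] args) {
        IdLookup fakeRepository = id -> Arrays.asList(10L, 11L);
        EntityADto dto = toDto(1L, fakeRepository);
        System.out.println(dto.id + " -> " + dto.entityBIds);
    }
}
```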

In addition to @Fladdimir's answer, which is a great approach if you only need the list of values occasionally, JPA allows defining Entity Graphs that can specify what in an object graph you want loaded. This can allow you to define your entity and specific attributes from child/referenced entities in the graph, allowing objects to be returned but the bulk of the data unfetched. This can allow you to process Entity B instances, but without them being fully populated.
There are many tutorials but I've referenced https://www.baeldung.com/jpa-entity-graph more than once. As the tutorial referenced mentions though, Hibernate might have some issues with how it handles attributes that are normally eagerly fetched, so it might not work the way you want (but will with other JPA providers like EclipseLink, which is where I've used this).
Alternatively, if this is a collection of IDs you are going to want/need frequently, you can modify your object model to have them fetched differently.
public class EntityA {
    ..
    @ElementCollection
    @CollectionTable(name = "RELATION_TABLE_NAME", joinColumns = @JoinColumn(name = "A_ID", insertable = false, updatable = false))
    @Column(name = "B_ID", insertable = false, updatable = false)
    List<Long> bIds;
}
This allows fetching the foreign keys automatically in your EntityA. I've made it read-only, assuming you'd keep the existing A->B relationship and use that to set things. Doing so, though, means that these two relationships are entirely separate, and so might result in different queries to fetch this same set of data.
If that is a concern, you can alter things again, remove the existing A->B relationship, and stick it in an intermediary object AB.
public class EntityA {
    ..
    @ElementCollection
    @CollectionTable(name = "RELATION_TABLE_NAME", joinColumns = @JoinColumn(name = "A_ID"))
    List<AB> listOfBs;
}
@Embeddable
public class AB {
    @Column(name = "B_ID", insertable = false, updatable = false)
    Long bId;
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "B_ID")
    B b;
}
This would allow you to fetch As and use B's ID values without having to fetch from the B table. Note that I've marked the basic bId property as read-only, assuming that your existing app would be setting things by assigning a B reference to the relationship, but you could mark the relationship as read-only instead, and set the FK value using the bId. This might be more efficient long term, as you don't have to look up the B instance to set the relationship.
Alternatively again, you can make AB an entity instead of an embeddable, and allow it to exist and be queried upon outside of As and Bs. There are quite a few options though, and ways to map it, and not likely necessary for a simple model and use case.

Pattern for accessing data outside of transaction

I have a Spring Boot App with Spring Data JPA with hibernate and MySQL as the data store.
I have 3 layers in my application:
API Service
Application Service
Domain Service ( with Repository)
The role of Application Service is to convert hibernate-backed POJOs to DTOs given some business logic.
POJO
SchoolClass.java
@Column
Long id;
@Column
String name;
@OneToMany(fetch = FetchType.LAZY, mappedBy = "schoolClass")
List<Book> books;
@OneToMany(fetch = FetchType.LAZY, mappedBy = "schoolClass")
List<Student> students;
@OneToMany(fetch = FetchType.LAZY, mappedBy = "schoolClass")
List<Schedule> schedules;
Domain Service - My transaction boundary is at the Domain Service layer.
SchoolClassService.java
@Autowired
private SchoolClassRepository repository;
@Transactional(readOnly = true)
public SchoolClass getClassById(Long id) {
    return repository.findById(id).orElseThrow();
}
Application Service
SchoolClassAppService.java
@Autowired
private SchoolClassService domainService;
public SchoolClassDto getClassById(Long id) {
    SchoolClass schoolClass = domainService.getClassById(id);
    // convert POJO to DTO;
    return SchoolClassDto;
}
My problem is that at times the child entities on SchoolClass are empty when I try to access them in SchoolClassAppService. Not all of them, but out of the three, two would work fine but the third one would be empty. I tried to mark the children lists to be eagerly fetched, but apparently only two collections can be eagerly fetched before Hibernate starts throwing exceptions and it also does not sound like good practice to always load all the objects. I do not get LazyInitializationException, just the list is empty.
I have tried to just call the getter on all lists in the domain service method before returning it just to load all data for the POJO but that does not seem like a clean practice.
Are there any patterns available which keep the transaction boundaries as close to the persistence layer as possible while still making it viable to process the data even after the transaction has been closed?

Not sure why your collections are sometimes empty, but maybe that's just how the data is?
I created Blaze-Persistence Entity Views for exactly that use case. You essentially define DTOs for JPA entities as interfaces and apply them on a query. It supports mapping nested DTOs, collection etc., essentially everything you'd expect and on top of that, it will improve your query performance as it will generate queries fetching just the data that you actually require for the DTOs.
The entity views for your example could look like this
@EntityView(SchoolClass.class)
interface SchoolClassDto {
    String getName();
    List<BookDto> getBooks();
}
@EntityView(Book.class)
interface BookDto {
    // Whatever data you need from Book
}
Querying could look like this
List<SchoolClassDto> dtos = entityViewManager.applySetting(
EntityViewSetting.create(SchoolClassDto.class),
criteriaBuilderFactory.create(em, SchoolClass.class)
).getResultList();
Just keep in mind that DTOs shouldn't just be copies of your entities but should be designed to fit your specific use case.

JpaRepository: Fetch specific lazy collections

If I have an entity Person with some lazy collections (Cars, Bills, Friends, ...) and want to write a JpaRepository method that gives me all persons including eagerly fetched Cars, is this possible?
I know that one can do this on single objects, but is this somehow possible with collections of persons?

Yes, there is a very convenient @EntityGraph annotation provided by Spring Data JPA. It can be used to fine-tune the entity graph a query uses. Every JPA query uses an implicit entity graph that specifies which elements are fetched eagerly or lazily, depending on the fetch type settings of the relations. If you want a specific relation to be fetched eagerly, you need to specify it in the entity graph.
@Repository
public interface PersonRepository extends CrudRepository<Person, Long> {
    @EntityGraph(attributePaths = { "cars" })
    Person getByName(String name);
}
Spring Data JPA documentation on entity graphs

Use the following JPQL query to get the data from both tables; it fetches the cars together with the persons.
A "fetch" join allows associations or collections of values to be initialized along with their parent objects using a single select. This is particularly useful in the case of a collection. It effectively overrides the outer join and lazy declarations of the mapping file for associations and collections.
See this for more explanation on join fetch
Use "join fetch" to fetch the association eagerly.
public interface CustomRepository extends JpaRepository<Person, Long> {
    @Query("select person from Person as person left join fetch person.cars as cars")
    List<Person> getPersons();
}

Spring, JPA -- integration test of CRUD of entity which has many transitive dependencies of other entities

I have entity e.g. Product which aggregates other entities such as Category. Those entities can also aggregate other entities and so on. Now I need to test my queries to database.
For simple CRUD I would create mock of EntityManager. But what if I have more complex query which I need to test for correct functionality. Then I probably need to persist entity (or more of them) and try to retrieve/update, whatever. I would also need to persist all entities on which my Product depends.
I don't like such approach. What is the best way to test such queries?
Thanks for replies.
Update -- example
Let's assume the following entity structure
This structure is maintained by the JPA implementation. For example, the Product class would look like this
@Entity
public class Product {
    @Id
    @GeneratedValue(strategy = GenerationType.AUTO)
    private Long id;
    private String name;
    @ManyToOne
    private Category category;
    @ManyToOne
    private Entity1 something;
}
So now if I want to test any query used in a DAO, I need to create a Product in the database, but it depends on Category and Entity1, and there is a @ManyToOne annotation, so the values cannot be null. So I need to persist those entities too, but they also have dependencies.
I'm considering pre-creating entities such as Category, Entity1 and Entity2 before the test using an SQL script or dbunit (mentioned by @chalimartines), which would save a large amount of code, but I don't know whether it is a good solution. I would like to know some best practices for such testing.

You can use @TransactionConfiguration(transactionManager = "transactionManager", defaultRollback = true) as follows:
@ContextConfiguration(locations = {"classpath:/path/to/your/applicationContextTest.xml"})
@RunWith(SpringJUnit4ClassRunner.class)
@TransactionConfiguration(transactionManager = "transactionManager", defaultRollback = true)
public class YourClassTest {
    @Test
    public void test() {
        // your crud
    }
}
update
You can't set the dependencies to null in order to avoid persisting them

I don't know another way, but for persisting Product and its dependencies you can use the testing framework DbUnit, which helps you set up database data.
