Using Spring Query Methods JPA to efficiently query DB without multiple SELECT statements - spring

I have an entity that has simple String columns as well as many ElementCollections (List and Map). Looking at my PostgreSQL logs, I noticed that querying for this entity issues a series of consecutive SELECT queries to fetch all the ElementCollections.
For efficiency, I would imagine doing one SELECT query with some inner JOINs might be better to avoid all of the individual SELECT queries. Is there a way to do that without writing a very verbose select query manually with all the INNER JOINs?
I have been looking into FetchTypes, the Spring Data query language, and DTO projections, but I imagine there might be a more straightforward way. The benefit I had been taking for granted is that Spring generates the queries for me: if I write the JOINs explicitly, I will have to update my query every time I add a new field, whereas with generated queries I wouldn't have to do anything.
// Person.java
@Entity
public class Person {
    @Id
    long personId;
    @Column
    String firstName;
    @Column
    String lastName;
    @ElementCollection
    Set<String> someField;
    @ElementCollection
    Map<String, String> otherField;
    @ElementCollection
    Set<String> anotherField;
    @ElementCollection
    Map<String, String> yetAnotherField;
}
What is happening right now is:
SELECT firstName, lastName FROM Person WHERE personId=$1
SELECT someField FROM Person_SomeField someField WHERE someField.personId=$1
SELECT otherField.key, otherField.value FROM Person_OtherField otherField WHERE otherField.personId=$1
And this continues for all of the ElementCollections tables which leads to a lot of queries.

Change your annotation to @ElementCollection(fetch = FetchType.EAGER).
It sounds like those fields are being lazily loaded (Hibernate waits until they are accessed to load them), which results in the N+1 queries you are seeing. LAZY loading is the default behavior for this type of member, which makes sense because loading it is not cheap. However, if you always want these members loaded, setting the fetch to EAGER can make sense: it forces Hibernate to load them along with the entity itself. This is the documentation for the fetch option on @ElementCollection:
(Optional) Whether the collection should be lazily loaded or must be
eagerly fetched. The EAGER strategy is a requirement on the
persistence provider runtime that the collection elements must be
eagerly fetched. The LAZY strategy is a hint to the persistence
provider runtime.
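To weigh the single-JOIN idea from the question against separate SELECTs, it can help to count the rows each approach returns. The sketch below is plain Java with no JPA dependency. `FetchCost`, its methods, and the collection sizes are hypothetical, and it assumes every collection is non-empty and stored in its own table joined on personId, so that one statement joining all of them returns the product of the sizes.

```java
import java.util.List;

final class FetchCost {

    // Rows returned when every collection table is joined in one statement:
    // each row of one collection pairs with each row of the others.
    static long rowsWithSingleJoin(List<Integer> collectionSizes) {
        long rows = 1;
        for (int size : collectionSizes) {
            rows *= size;
        }
        return rows;
    }

    // Rows returned when each collection is loaded by its own SELECT,
    // plus one row for the entity itself.
    static long rowsWithSeparateSelects(List<Integer> collectionSizes) {
        long rows = 1;
        for (int size : collectionSizes) {
            rows += size;
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Integer> sizes = List.of(10, 20, 10, 20); // hypothetical sizes
        System.out.println("single join: " + rowsWithSingleJoin(sizes));
        System.out.println("separate selects: " + rowsWithSeparateSelects(sizes));
    }
}
```

With four collections of 10, 20, 10 and 20 elements, the single join returns 40,000 rows in one round trip, against 61 rows over five round trips, so fewer statements is not automatically cheaper.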

Related

Force JPA to not use union and fetch tables one by one

I have 5 similar tables against which I need to run the same query and fetch data in pages. I used polymorphic queries (an abstract superclass with @Inheritance, so that all rows are fetched automatically).
But this approach has problems, as noted here: Database pressure on Polymorphic queries
The issue is that the queries use UNION ALL, which makes the DB search through millions of rows just to get 500 results. So instead I want to execute this serially.
When I execute the method, JPA would go to the first table and fetch data in pages; once that table is exhausted, it would go to the second table, and so on.
Right now, with the union, I put a ton of pressure on the database. With the new approach the pressure could be lower, as only one table is accessed at a time.
I do not know a way to do this without changing the setup I have right now. For example, right now I have it like this:
public interface OhlcDao extends JpaRepository<AbstractOhlc, OhlcId> {
Slice<OhlcRawBean<? extends OhlcBean>> findByIdSourceIdAndIdTickerIdIn(
String sourceId,
Set<String> tickerId,
PageRequest pageRequest
);
}
The method uses a union to fetch data, which I do not like.
Is there a way to make this work in JPA or Hibernate by changing any internal code (i.e. without changing my setup, so that a similar method does not use unions)?
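The serial behavior described above (drain the first table page by page, then move on to the next) can be sketched in application code. This is a minimal illustration in plain Java, not a Hibernate or Spring Data feature: `SequentialPager`, `PageSource`, and `inMemory` are hypothetical names, and in a real setup each source would presumably wrap a repository method for one concrete table.

```java
import java.util.ArrayList;
import java.util.List;

final class SequentialPager {

    // Hypothetical abstraction over one table:
    // returns at most `limit` rows starting at `offset`.
    interface PageSource<T> {
        List<T> fetch(int offset, int limit);
    }

    // Drains each source page by page, moving to the next source only when
    // the current one returns a short (or empty) page.
    static <T> List<T> fetchAll(List<PageSource<T>> sources, int pageSize) {
        List<T> result = new ArrayList<>();
        for (PageSource<T> source : sources) {
            int offset = 0;
            while (true) {
                List<T> page = source.fetch(offset, pageSize);
                result.addAll(page);
                if (page.size() < pageSize) {
                    break; // this table is exhausted
                }
                offset += pageSize;
            }
        }
        return result;
    }

    // In-memory stand-in for a table, for demonstration only.
    static <T> PageSource<T> inMemory(List<T> rows) {
        return (offset, limit) -> {
            int from = Math.min(offset, rows.size());
            int to = Math.min(offset + limit, rows.size());
            return new ArrayList<>(rows.subList(from, to));
        };
    }

    public static void main(String[] args) {
        List<PageSource<String>> tables = List.of(
                inMemory(List.of("a1", "a2", "a3")),
                inMemory(List.of("b1")),
                inMemory(List.of("c1", "c2")));
        System.out.println(fetchAll(tables, 2));
    }
}
```

A caller that needs one page per request, rather than the whole result, would keep the current table index and offset as a cursor between calls instead of looping to the end.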

Is there a way to fetch a class with associated classes in a single SELECT query in Spring Data JPA?

I have two classes, A and B with a One-To-Many relationship.
@Entity
public class A {
    //some code
    @OneToMany(fetch = FetchType.LAZY, mappedBy = "abc")
    private List<B> b;
}
I have a JPA Repository for A, and I observed that when I fetch an A from the database with a repository method, the Bs are not included. When I want to access the Bs, additional SELECT queries are executed to fetch the associated Bs (if there are 100 associated Bs, 100 additional queries are executed). Switching to FetchType.EAGER only changes when these additional SELECT queries are executed, i.e. they run right after the main SELECT query. Either way, additional SELECT queries are executed to fetch the associated classes.
Is this natural behavior? I found that JPA Entity Graphs are the solution for this issue, i.e. for fetching A with its Bs in a single SELECT query. Are there any other ways to address this issue? The problem with @EntityGraph is that each repository method has to be annotated separately. Also, is there a way to declare @EntityGraph once so that it affects all the methods in the Repository?
I'm using Spring Boot 2.5.1. Thanks in advance!

Exclude byte field of an object

I'm trying to exclude the byte field from my object query, since there are several hundred or thousand reports and it takes a long time to query them from the database.
public class Reports
{
    private int id;
    private String reportName;
    @Lob
    @Basic(fetch = FetchType.LAZY)
    private byte[] file;
    private Date createdDate;
}
I tried setting up Hibernate bytecode enhancement for this (How to setup Hibernate Gradle plugin for bytecode enhancement?), but I'm still getting the file when I query all the reports. Did I miss something here?
In JPA, you can annotate a field with @Transient to indicate that it is not persistent.
Bytecode enhancement should help, but maybe you didn't configure it correctly or the Hibernate version you are using has a bug. I'd need to know details or see a reproducing test case to help you with that.
You could try to use java.sql.Blob instead which is guaranteed to be lazy and doesn't require byte code enhancement.
Apart from that, I would recommend you use DTO projections for actually fetching just the data that you need. I think this is a perfect use case for Blaze-Persistence Entity Views.
I created the library to allow easy mapping between JPA models and custom interface or abstract class defined models, something like Spring Data Projections on steroids. The idea is that you define your target structure (domain model) the way you like and map attributes (getters) via JPQL expressions to the entity model.
A DTO model for your use case could look like the following with Blaze-Persistence Entity-Views:
@EntityView(Reports.class)
public interface ReportsDto {
    @IdMapping
    int getId();
    String getReportName();
    Date getCreatedDate();
    Set<ReportRowDto> getRows();
    @EntityView(ReportRows.class)
    interface ReportRowDto {
        @IdMapping
        Long getId();
        String getName();
    }
}
Querying is a matter of applying the entity view to a query, the simplest being just a query by id.
ReportsDto a = entityViewManager.find(entityManager, ReportsDto.class, id);
The Spring Data integration allows you to use it almost like Spring Data Projections: https://persistence.blazebit.com/documentation/entity-view/manual/en_US/index.html#spring-data-features
The DTO projection is validated against the entity model and it will only fetch what is necessary. Spring Data Projections falls back to "just" wrapping entities when the projection is too complex, whereas Blaze-Persistence Entity-Views will alter the query as if you had written it by hand :)

Method in Entity needs to load all data from aggregate, how to optimize this?

I have a problem with an aggregate that will grow over time.
One day there will be thousands of records and performance is going to be bad.
@Entity
public class Serviceman ... {
    @ManyToMany(mappedBy = "servicemanList")
    private List<ServiceJob> services = new ArrayList<>();
    ...
    public Optional<ServiceJob> firstServiceJobAfterDate(LocalDateTime dateTime) {
        return services.stream().filter(i -> i.getStartDate().isAfter(dateTime))
                .min(Comparator.comparing(ServiceJob::getStartDate));
    }
}
The method loads every ServiceJob just to get one of them.
Maybe I should delegate this method to a service that uses native SQL.
You have to design small aggregates instead of large ones.
This essay explains in detail how to do it: http://dddcommunity.org/library/vernon_2011/. It explains how to decompose your aggregates to smaller ones so you can manage the complexity.
In your case, instead of having one aggregate consisting of two entities (Serviceman and ServiceJob, with Serviceman being the aggregate root), you can decompose it into two smaller aggregates with a single entity each. ServiceJob will reference Serviceman by ID and you can use a ServiceJobRepository to make queries.
In your example you will have ServiceJobRepository.firstServiceJobAfterDate(guid servicemanID, DateTime date).
This way, if you have a lot of entities and you need to scale, you can store the ServiceJob entities on another DB server.
If for some reason Serviceman or ServiceJob need references to each other to do their work, you can use a Service that uses ServicemanRepository and ServiceJobRepository to get both aggregates and pass them to one another so they can do their work.
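The decomposition described above could look roughly like the following. This is a minimal sketch in plain Java, with an in-memory repository standing in for a real one; `ServiceJobRepositoryDemo`, `InMemoryServiceJobRepository`, and `save` are hypothetical names. It also assumes a job keeps a set of serviceman IDs, since the original mapping is many-to-many. A real implementation would push the filter and ordering into the database query rather than stream over a list.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class ServiceJobRepositoryDemo {

    // ServiceJob as its own aggregate: it holds serviceman IDs, not Serviceman objects.
    static final class ServiceJob {
        private final Set<Long> servicemanIds;
        private final LocalDateTime startDate;

        ServiceJob(Set<Long> servicemanIds, LocalDateTime startDate) {
            this.servicemanIds = servicemanIds;
            this.startDate = startDate;
        }

        Set<Long> getServicemanIds() { return servicemanIds; }
        LocalDateTime getStartDate() { return startDate; }
    }

    interface ServiceJobRepository {
        Optional<ServiceJob> firstServiceJobAfterDate(long servicemanId, LocalDateTime date);
    }

    // In-memory stand-in, for demonstration only.
    static final class InMemoryServiceJobRepository implements ServiceJobRepository {
        private final List<ServiceJob> jobs = new ArrayList<>();

        void save(ServiceJob job) { jobs.add(job); }

        @Override
        public Optional<ServiceJob> firstServiceJobAfterDate(long servicemanId, LocalDateTime date) {
            // Earliest job of this serviceman that starts strictly after `date`.
            return jobs.stream()
                    .filter(job -> job.getServicemanIds().contains(servicemanId))
                    .filter(job -> job.getStartDate().isAfter(date))
                    .min(Comparator.comparing(ServiceJob::getStartDate));
        }
    }

    public static void main(String[] args) {
        InMemoryServiceJobRepository repository = new InMemoryServiceJobRepository();
        LocalDateTime base = LocalDateTime.of(2024, 1, 1, 0, 0);
        repository.save(new ServiceJob(Set.of(1L), base.plusDays(5)));
        repository.save(new ServiceJob(Set.of(1L, 2L), base.plusDays(2)));
        repository.firstServiceJobAfterDate(1L, base)
                .ifPresent(job -> System.out.println(job.getStartDate()));
    }
}
```

The Serviceman entity no longer needs to hold the list of jobs at all, so loading a Serviceman does not touch the ServiceJob table.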

preventing OpenJPA N+1 select performance problem on maps

When I have an entity that contains a Map, e.g.
@Entity
public class TestEntity {
    @ElementCollection(fetch = FetchType.EAGER)
    Map<String, String> strings = new HashMap<String, String>();
}
and I select multiple entities (SELECT z FROM TestEntity z), OpenJPA 2.0 performs one query for each TestEntity to fetch the map, even though I used FetchType.EAGER. This also happens when the Map value is an entity and I use @OneToMany instead of @ElementCollection. In principle this can be done more efficiently with one query that selects all the map entries for all returned TestEntities. For Collection-valued fields OpenJPA already does this by default ("openjpa.jdbc.EagerFetchMode" value="parallel"), but it seems to fail on this simple entity. (Same problem with value="join".)
Could I be doing something wrong? Is there an easy way to tell OpenJPA to not perform a query per entity but only one?
Or is there already any work planned on improving this (I filed it under https://issues.apache.org/jira/browse/OPENJPA-1920)?
It is a problem for us because we wish to fetch (and detach) a list of about 1900 products which takes almost 15 seconds with OpenJPA. It takes less than a second with my own native query.
Having to write only one native query wouldn't be much of a problem but the map we use is inside a reusable StringI18N entity which is referenced from several different entities (and can be deep in the object graph), so native queries are a maintenance headache.
Any help getting performance up is greatly appreciated.
EDIT: explicitly using JOIN FETCH does not help either:
"SELECT z FROM TestEntity z JOIN FETCH z.strings"
OpenJPA's TRACE still shows that it executes one SQL statement for each individual TestEntity.
It might be a pain (correction: I know it'll be a pain) but have you tried actually mapping your 2-field TestEntity as a full JPA-persisted @Entity?
I know that Hibernate used to treat @ElementCollection mappings rather differently from @OneToMany mappings, for example. OpenJPA could well be doing something similar.
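The "one query for all map entries" idea from the question comes down to selecting every (owner id, key, value) row for the returned entities at once and regrouping the rows in memory. The sketch below shows only that regrouping step in plain Java; `MapEntryGrouping`, `Row`, and `groupByOwner` are hypothetical names, and the rows are assumed to come from a single hand-written query rather than from anything OpenJPA generates.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MapEntryGrouping {

    // One row of a hypothetical "owner id, map key, map value" result set.
    static final class Row {
        final long ownerId;
        final String key;
        final String value;

        Row(long ownerId, String key, String value) {
            this.ownerId = ownerId;
            this.key = key;
            this.value = value;
        }
    }

    // Rebuilds one map per owning entity from the flat rows of a single query.
    static Map<Long, Map<String, String>> groupByOwner(List<Row> rows) {
        Map<Long, Map<String, String>> byOwner = new LinkedHashMap<>();
        for (Row row : rows) {
            byOwner.computeIfAbsent(row.ownerId, id -> new LinkedHashMap<>())
                    .put(row.key, row.value);
        }
        return byOwner;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
                new Row(1L, "en", "hello"),
                new Row(2L, "en", "bye"),
                new Row(1L, "de", "hallo"));
        System.out.println(groupByOwner(rows));
    }
}
```

Each resulting map can then be attached to its TestEntity by ID, which replaces one query per entity with a single query plus one pass over its rows.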
