Why is JPA query so slow? - spring

I am implementing queries in my web application with JPA repositories. The two main tables I am querying from are FmReportTb and SpecimenTb.
Here are the two entity classes (only important attributes are listed).
//FmReportTb.java
@Entity
@Table(name="FM_REPORT_TB")
public class FmReportTb implements Serializable {
@Column(name="ROW_ID")
private long rowId;
@Column(name="FR_BLOCK_ID")
private String frBlockId;
@Column(name="FR_FULL_NAME")
private String frFullName;
@OneToOne
@JoinColumn(name="SPECIMEN_ID")
private SpecimenTb specimenTb;
FmReportTb has a OneToOne relationship with SpecimenTb.
@Entity
@Table(name="SPECIMEN_TB")
public class SpecimenTb implements Serializable {
private String mrn;
@OneToOne(mappedBy="specimenTb", cascade=CascadeType.ALL)
private FmReportTb fmReportTb;
The query I am working on is to find all records in FmReportTb and show a few attributes from FmReportTb plus mrn from SpecimenTb.
Here is my JPA repository for FmReportTb:
@Repository
public interface FmReportRepository extends JpaRepository<FmReportTb, Long> {
@Query("select f from FmReportTb f where f.deleteTs is not null")
public List<FmReportTb> findAllFmReports();
Since I am only showing some of the attributes from FmReportTb and one attribute from SpecimenTb, I decided to create a Value Object (VO) for FmReportTb. The constructor of the VO class copies the attributes from FmReportTb and reads the mrn attribute from SpecimenTb through the OneToOne relationship. Another reason for using a VO is that FmReportTb has a lot of OneToMany child entities, and this particular query does not need any of them.
public class FmReportVO {
private String frBlockId;
private Date frCollectionDate;
private String frCopiedPhysician;
private String frDiagnosis;
private String frFacilityName;
private String frFullName;
private String frReportId;
private String filepath;
private String mrn;
public FmReportVO(FmReportTb fmReport) {
this.frBlockId = fmReport.getFrBlockId();
this.frCollectionDate = fmReport.getFrCollectionDate();
this.frCopiedPhysician = fmReport.getFrCopiedPhysician();
this.frDiagnosis = fmReport.getFrDiagnosis();
this.frFacilityName = fmReport.getFrFacilityName();
this.frFullName = fmReport.getFrFullName();
this.frReportId = fmReport.getFrReportId();
this.mrn = fmReport.getSpecimenTb().getMrn();
}
I implemented a find-all method in the service bean class that returns a list of FmReportVO objects.
//FmReportServiceBean.java
@Override
public List<FmReportVO> findAllFmReports() {
List<FmReportTb> reports = fmReportRepository.findAllFmReports();
if (reports == null) {
return null;
}
List<FmReportVO> fmReports = new ArrayList<FmReportVO>();
for (FmReportTb report : reports) {
FmReportVO reportVo = new FmReportVO(report);
String filepath = fileLoadRepository.findUriByFileLoadId(report.getFileLoadId().longValue());
reportVo.setFilepath(filepath);
fmReports.add(reportVo);
}
return fmReports;
}
Lastly, my controller looks like this:
@RequestMapping(
value = "/ristore/foundation/",
method = RequestMethod.GET,
produces = "application/json")
public ResponseEntity<List<FmReportVO>> getAllFmReports() {
List<FmReportVO> reports = ristoreService.findAllFmReports();
if (reports == null) {
return new ResponseEntity<List<FmReportVO>>(HttpStatus.NOT_FOUND);
}
return new ResponseEntity<List<FmReportVO>>(reports, HttpStatus.OK);
}
There are about 200 records in the database. Surprisingly, it took almost 2 full seconds to retrieve all the records as JSON. Even though I have not indexed all the tables, this is way too slow. A similar query probably takes a few ms when run directly on the database. Is it because I am using Value Objects, or do JPA queries tend to be this slow?
EDIT 1
This may have to do with the fact that FmReportTb has almost 20 OneToMany entities. Although the fetch mode of these child entities is set to LAZY, the Spring Data JPA repository tends to ignore the fetch mode. So I ended up using a NamedEntityGraph to specify which attributes to fetch eagerly. This next section was added at the top of my FmReportTb entity class.
@Entity
@NamedEntityGraph(
name = "FmReportGraph",
attributeNodes = {
@NamedAttributeNode("fileLoadId"),
@NamedAttributeNode("frBlockId"),
@NamedAttributeNode("frCollectionDate"),
@NamedAttributeNode("frDiagnosis"),
@NamedAttributeNode("frFullName"),
@NamedAttributeNode("frReportId"),
@NamedAttributeNode("specimenTb")})
@Table(name="FM_REPORT_TB")
I then added @EntityGraph("FmReportGraph") before the JPA repository query that finds all records. After doing that, performance improved a little. Fetching 1500 records now takes about 10 seconds. However, that still seems too slow given that each JSON object is fairly small.

Answering for the benefit of others with slow JPA queries...
As @Ken Bekov hints in the comments, foreign keys can help a lot with JPA.
I had a couple of tables with a many-to-one relationship; a query of 100,000 records was taking hours to run. Without any code changes I reduced this to seconds just by adding a foreign key.
In phpMyAdmin you do this by creating a Relationship from the "many" table to the "one" table. For a detailed explanation see this question: Setting up foreign keys in phpMyAdmin?
and the answer by @Devsi Odedra
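Separately from foreign keys, note that the service loop in the question calls fileLoadRepository.findUriByFileLoadId once per report, so listing N reports costs N extra lookups on top of the main query. Below is a minimal sketch of collapsing those into a single bulk lookup. It is plain Java so that it runs without a database; the Report and ReportView classes and the bulkLookup function (standing in for a hypothetical repository method that resolves many file-load ids in one query) are assumptions made for illustration and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class BatchLookupSketch {
    // Hypothetical stand-in for FmReportTb (only the fields needed here).
    static final class Report {
        final String frReportId;
        final long fileLoadId;
        Report(String frReportId, long fileLoadId) {
            this.frReportId = frReportId;
            this.fileLoadId = fileLoadId;
        }
    }

    // Hypothetical stand-in for FmReportVO.
    static final class ReportView {
        final String frReportId;
        final String filepath;
        ReportView(String frReportId, String filepath) {
            this.frReportId = frReportId;
            this.filepath = filepath;
        }
    }

    // bulkLookup is assumed to resolve all ids to URIs in one round trip.
    static List<ReportView> toViews(List<Report> reports,
                                    Function<Set<Long>, Map<Long, String>> bulkLookup) {
        // Collect every id first so the lookup happens once, not once per report.
        Set<Long> ids = new HashSet<>();
        for (Report r : reports) {
            ids.add(r.fileLoadId);
        }
        Map<Long, String> uriById = bulkLookup.apply(ids);

        // Join reports and URIs in memory.
        List<ReportView> views = new ArrayList<>();
        for (Report r : reports) {
            views.add(new ReportView(r.frReportId, uriById.get(r.fileLoadId)));
        }
        return views;
    }

    public static void main(String[] args) {
        int[] calls = {0}; // counts how often the fake lookup is invoked
        List<ReportView> views = toViews(
                Arrays.asList(new Report("R1", 10L), new Report("R2", 11L)),
                ids -> {
                    calls[0]++;
                    Map<Long, String> m = new HashMap<>();
                    for (Long id : ids) {
                        m.put(id, "/files/" + id);
                    }
                    return m;
                });
        System.out.println(views.size() + " views built with " + calls[0] + " lookup call(s)");
    }
}
```

Whether this matters in practice depends on how expensive each single lookup is; logging the SQL that is actually issued is the way to confirm it before restructuring anything.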

Related

Dynamic JPA query

I have two entities Questions and UserAnswers. I need to make an api in spring boot which returns all the columns from both the entities based on some conditions.
Conditions are:
I will be given a comparator, e.g. >, <, =, >=, <=
A column name eg: last_answered_at, last_seen_at
A value of the above column eg: 28-09-2020 06:00:18
I will need to return an inner join of the two entities and filter based on the above conditions.
Sample sql query based on above conditions will be like:
SELECT q,ua from questions q INNER JOIN
user_answers ua on q.id = ua.question_id
WHERE ua.last_answered_at > '28-09-2020 06:00:18'
The problem I am facing is that the column name and the comparator for the query needs to be dynamic.
Is there an efficient way to do this with Spring Boot and JPA? I do not want to write a JPA query method for every possible combination of column and operator, because the number of combinations can be very large and the code would need extensive if/else chains.
I have developed a library called spring-dynamic-jpa to make it easier to implement dynamic queries with JPA.
You can use it to write the query templates. The query template will be built into different query strings before execution depending on your parameters when you invoke the method.
This sounds like a clear custom implementation of a repository method. Firstly, I will make some assumptions about the implementation of your entities. Afterwards, I will present an idea on how to solve your challenge.
I assume that the entities look basically like this (getters, setters, equals, hashCode... ignored).
@Entity
@Table(name = "questions")
public class Question {
@Id
@GeneratedValue
private Long id;
private LocalDateTime lastAnsweredAt;
private LocalDateTime lastSeenAt;
// other attributes you mentioned...
@OneToMany(mappedBy = "question", cascade = CascadeType.ALL, orphanRemoval = true)
private List<UserAnswer> userAnswers = new ArrayList<>();
// Add and remove methods added to keep bidirectional relationship synchronised
public void addUserAnswer(UserAnswer userAnswer) {
userAnswers.add(userAnswer);
userAnswer.setQuestion(this);
}
public void removeUserAnswer(UserAnswer userAnswer) {
userAnswers.remove(userAnswer);
userAnswer.setQuestion(null);
}
}
@Entity
@Table(name = "user_answers")
public class UserAnswer {
@Id
@GeneratedValue
private Long id;
@ManyToOne(fetch = FetchType.LAZY)
// Foreign key column as used in the question's SQL (ua.question_id)
@JoinColumn(name = "question_id")
private Question question;
}
I will write the code assuming Hibernate as the JPA provider. With other providers it should work the same or very similarly.
Hibernate often needs the name of attributes as a String. To circumvent the issue of undetected mistakes (especially when refactoring), I suggest the module hibernate-jpamodelgen (see the class names suffixed with an underscore). You can also use it to pass the names of the attributes as arguments to your repository method.
Repository methods try to communicate with the database. In JPA, there are different ways of implementing database requests: JPQL as a query language and the Criteria API (easier to refactor, less error prone). As I am a fan of the Criteria API, I will use the Criteria API together with the modelgen to tell the ORM Hibernate to talk to the database to retrieve the relevant objects.
public class QuestionRepositoryCustomImpl implements QuestionRepository {
@PersistenceContext
private EntityManager entityManager;
@Override
public List<Question> dynamicFind(String comparator, String attribute, String value) {
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<Question> cq = cb.createQuery(Question.class);
// Root gets constructed for first, main class in the request (see return of method)
Root<Question> root = cq.from(Question.class);
// Join happens based on respective attribute within root
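root.join(Question_.USER_ANSWERS);
// The compared attributes are LocalDateTime, so the String value is parsed first
// (the pattern is assumed from the example value in the question)
LocalDateTime dateTime = LocalDateTime.parse(value, DateTimeFormatter.ofPattern("dd-MM-yyyy HH:mm:ss"));
Path<LocalDateTime> path = root.get(attribute);
// The following ifs are not the nicest solution.
// The ifs check what comparator String contains and adds respective where clause to query
// This .where() is like WHERE in SQL
// cb.gt/ge/lt/le only accept numeric expressions, so the Comparable variants are used here
if("=".equals(comparator)) {
cq.where(cb.equal(path, dateTime));
}
if(">".equals(comparator)) {
cq.where(cb.greaterThan(path, dateTime));
}
if(">=".equals(comparator)) {
cq.where(cb.greaterThanOrEqualTo(path, dateTime));
}
if("<".equals(comparator)) {
cq.where(cb.lessThan(path, dateTime));
}
if("<=".equals(comparator)) {
cq.where(cb.lessThanOrEqualTo(path, dateTime));
}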
// Finally, query gets created and result collected and returned as List
// Hint for READ_ONLY is added as lists are often just for read and performance is better.
return entityManager.createQuery(cq).setHint(QueryHints.READ_ONLY, true).getResultList();
}
}
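The chain of ifs above can be replaced by a lookup table keyed on the comparator string. The sketch below shows the idea in plain Java so it runs without a persistence context: each operator maps to a test on the result of compareTo, and an unknown operator is rejected instead of silently leaving the query unfiltered. The class and method names are made up for illustration.

```java
import java.time.LocalDateTime;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;

public class ComparatorDispatchSketch {
    // One entry per supported operator, expressed as a test on compareTo's result.
    private static final Map<String, IntPredicate> OPS = new HashMap<>();
    static {
        OPS.put("=", c -> c == 0);
        OPS.put(">", c -> c > 0);
        OPS.put(">=", c -> c >= 0);
        OPS.put("<", c -> c < 0);
        OPS.put("<=", c -> c <= 0);
    }

    // Returns true when "actual <comparator> bound" holds.
    static <T extends Comparable<? super T>> boolean matches(String comparator, T actual, T bound) {
        IntPredicate op = OPS.get(comparator);
        if (op == null) {
            throw new IllegalArgumentException("Unsupported comparator: " + comparator);
        }
        return op.test(actual.compareTo(bound));
    }

    public static void main(String[] args) {
        LocalDateTime bound = LocalDateTime.of(2020, 9, 28, 6, 0, 18);
        LocalDateTime later = bound.plusHours(1);
        System.out.println(matches(">", later, bound));  // true
        System.out.println(matches("<=", later, bound)); // false
    }
}
```

In the Criteria API version the map values would instead be functions that take the path and the parsed value and return a Predicate (for example one wrapping cb.greaterThan), so the repository method shrinks to a single lookup followed by cq.where(...).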

Efficient way to fetch list size

I have an entity like the one below. When I need the number of comments for a company, I call the totalComments() method. Does Hibernate then go to the database and fetch all the comment data, or does it just run a count(*) query? If Hibernate fetches every comment, what is an efficient way to get the comment count?
@Entity
@Table(name = "companies")
public class Company extends ItemEntity {
@OneToMany(fetch = FetchType.LAZY)
@JoinTable(name="companies_comments",
joinColumns=@JoinColumn(name="company_id"),
inverseJoinColumns=@JoinColumn(name="comment_id"))
private Set<Comment> comments = new HashSet<>();
public void addComment(Comment comment) {
this.comments.add(comment);
}
public int totalComments() {
return this.comments.size();
}
}
You should drop your own counter method and create a specific (business) query to retrieve the size of the list, such as
public long getCommentsCount(Company c) {
String query = "SELECT COUNT(cm) FROM Company AS c JOIN c.comments AS cm WHERE c = :company";
// Pass the query string declared above (the original referenced an undefined variable q)
return entityManager.createQuery(query, Long.class).setParameter("company", c).getSingleResult();
}
Some persistence providers may optimize performance when this kind of query is declared as a @NamedQuery on the entity, or when using the CriteriaQuery API.
Depending on your database, you may need to change the return class to Number.class and convert to long.
If you want to tune performance even further, use the createNativeQuery method and write your own plain SQL, but keep in mind that any change to the database schema then requires reviewing these queries.
I found the answer. Unless we configure it otherwise, Hibernate loads every comment just to get the collection size. We can solve this performance issue in two ways.
We can use @LazyCollection(LazyCollectionOption.EXTRA) like below. With LazyCollectionOption.EXTRA, .size() and .contains() won't initialize the whole collection.
@OneToMany(fetch = FetchType.LAZY)
@LazyCollection(LazyCollectionOption.EXTRA)
@JoinTable(name="companies_comments",
joinColumns=@JoinColumn(name="company_id"),
inverseJoinColumns=@JoinColumn(name="comment_id"))
private Set<Comment> comments = new HashSet<>();
Or we can use the @Formula annotation.
@Formula("(SELECT COUNT(*) FROM companies_comments cc WHERE cc.company_id = id)")
private int numberOfComments;
Edit after 8 months: For both simplicity and performance, we should create a JPA query method like the one below.
@Repository
public interface CommentRepository extends JpaRepository<Comment, Long> {
int countAllByCompany(Company company);
}
We should never use getComments().size() for this purpose, because that loads all comments into memory and may cause performance issues.
The same is true when adding comments to the collection. We shouldn't use getComments().add(newComment). With a OneToMany relation, all we have to do is set the company field of the comment, as in newComment.setCompany(company), and then persist the comment. For this reason it is recommended to define OneToMany relationships as bidirectional.

How to handle DataIntegrityViolationException while saving a list in Spring Data JPA?

I am using Spring Data JPA in a Spring Boot application with MySQL. I am saving a list of entities that have a unique constraint on one field. One entity in the list throws a DataIntegrityViolationException because of the unique constraint. I noticed that none of the entities get persisted in that case, even those that do not violate the unique constraint.
What is the ideal approach here so that the entities which do not violate the unique constraint get persisted?
Of course I can iterate over the list and save them one by one. In fact, that is what SimpleJpaRepository does underneath.
@Transactional
public <S extends T> List<S> save(Iterable<S> entities) {
List<S> result = new ArrayList<S>();
if (entities == null) {
return result;
}
for (S entity : entities) {
result.add(save(entity));
}
return result;
}
My code - Entity :
@Entity
@Table(uniqueConstraints = @UniqueConstraint(columnNames = { "name" }, name = "uq_name"))
public class SampleContent {
@Id
@GeneratedValue
private Long id;
private String name;
//getter setters
}
Repository :
public interface SampleContentRepository extends JpaRepository<SampleContent, Serializable>{
}
JUnit test :
@Test
public void testCreate(){
List<SampleContent> sampleContentList = new ArrayList<>();
SampleContent sampleContent1 = new SampleContent();
sampleContent1.setName("dd");
SampleContent sampleContent2 = new SampleContent();
sampleContent2.setName("Roy");
SampleContent sampleContent3 = new SampleContent();
sampleContent3.setName("xx");
sampleContentList.add(sampleContent1);
sampleContentList.add(sampleContent2);
sampleContentList.add(sampleContent3);
try{
this.sampleContentRepository.save(sampleContentList);
}catch(DataIntegrityViolationException e){
System.err.println("constraint violation!");
}
}
There is an entity with the name "Roy" already present in the table. So the entire transaction fails and @Transactional rolls it back.
I think you can use the following steps:
Load the existing entities from the DB into a Set
Override the equals and hashCode methods based on name
Call Set::addAll with your entities (or just add them one by one)
Save that Set to the DB
Maybe it's suboptimal because it forces you to run a select * query. But I think it's much more efficient than saving the entities one by one to the DB.
According to this article you can use name as your business key, which has lots of benefits.
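The steps above can be sketched in plain Java as follows. To keep the example runnable without a database, it works on names only rather than on SampleContent entities, and the set of existing names is passed in instead of being loaded through the repository. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SkipDuplicatesSketch {
    // Returns the candidate names that are not present yet, in their original
    // order, and also drops duplicates inside the candidate list itself.
    static List<String> namesToInsert(Set<String> existingNames, List<String> candidates) {
        Set<String> seen = new HashSet<>(existingNames);
        List<String> fresh = new ArrayList<>();
        for (String name : candidates) {
            // Set.add returns false when the name is already known.
            if (seen.add(name)) {
                fresh.add(name);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        Set<String> existing = new HashSet<>(Arrays.asList("Roy"));
        List<String> candidates = Arrays.asList("dd", "Roy", "xx");
        System.out.println(namesToInsert(existing, candidates)); // prints [dd, xx]
    }
}
```

With real entities, only the filtered list would be passed to the repository's save method. Note that this narrows the window for a constraint violation but does not close it: another transaction can still insert the same name between the load and the save, so the exception handler is still needed.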

NamedEntityGraph Returns All Columns and Objects

I am trying to utilize a NamedEntityGraph to limit the return data for specific queries. Mainly I do not want to return full object details when listing the object. A very simple class example is below.
@Entity
@Table(name="playerreport",schema="dbo")
@NamedEntityGraphs({
@NamedEntityGraph(name = "report.simple",
attributeNodes =
{@NamedAttributeNode(value="intId")
}
)
})
public class PlayerReportEntity {
@Id
@Column(name="intid",columnDefinition="uniqueidentifier")
private String intId;
@Column(name="plyid",columnDefinition="uniqueidentifier")
@Basic(fetch=FetchType.LAZY)
private String plyId;
@ManyToOne(fetch=FetchType.LAZY)
@JoinColumn(name = "plyid", insertable=false,updatable=false)
private PlayerEntity player;
No matter what I do, plyId and player are always returned. Is there any way to return only the requested column (intId)?
As for the association, Hibernate does not do the join for the player object, but it still returns player as null. So that part is working to an extent.
I am using the JpaRepository below to generate CRUD statements for me
public interface PlayerReportRepository extends JpaRepository<PlayerReportEntity, String> {
@EntityGraph(value="report.simple")
List<PlayerIntelEntity> findByPlyId(@Param(value = "playerId") String playerId);
@Override
@EntityGraph(value="report.simple")
public PlayerIntelEntity findOne(String id);
}
Quoting from JIRA: "Hence it seems that the @NamedEntityGraph only affects fields that are Collections, but fields that are not a Collection are always loaded."
Please use the Example 47 on this page and use repositories accordingly.
In essence, Hibernate currently loads all the basic fields in the class; for collections the entity graph will work if you follow the example mentioned above.
Thanks.

Select a few columns (DTO) with a JPA Specification

I am using spring-data-jpa version 1.5.1.RELEASE .
My domain is :
public class MyDomain{
....
....
private String prop1;
private String prop2;
......
......
}
My JPA Specification is:
public final class MyDomainSpecs {
public static Specification<MyDomain> search(final String prop1,final String prop2) {
return new Specification<MyDomain>() {
public Predicate toPredicate(Root<MyDomain> root, CriteriaQuery<?> query, CriteriaBuilder cb) {
// Some tests if prop1 exist .....
Predicate predicate1 = cb.equal(root.get("prop1"), prop1);
Predicate predicate2 = cb.equal(root.get("prop2"), prop2);
return cb.and(predicate1, predicate2);
}
};
}
}
My Repository :
public interface MyDomainRepository extends JpaRepository<MyDomain, Long>, JpaSpecificationExecutor<MyDomain> {
List<MyDomain> findAll(Specification<MyDomain> spec);
}
All of this works.
But for database performance tuning, I need to avoid selecting and returning all fields of MyDomain.
I need to select only, for example, three properties (prop1, prop2, prop3), ideally into a DTO object.
I don't want to convert my List<MyDomain> to a List<MyDto>, because the point is to tune the DB request.
So far I haven't found any way to do that with spring-data-jpa and Specification.
Any ideas?
Thanks
This is not possible for now. There is a ticket for this, but no idea whether it will ever be implemented: https://jira.spring.io/browse/DATAJPA-51
Create a special version of MyDomain (e.g. MyDomainSummary or LightMyDomain) that only includes the fields you want to map.
Basic example
Borrowed from the excellent JPA WikiBook.
Assume a JPA entity (i.e. domain class) like so:
@Entity
@Table(name="EMPLOYEE")
public class BasicEmployee {
@Column(name="ID")
private long id;
@Column(name="F_NAME")
private String firstName;
#Column(name="L_NAME")
private String lastName;
// Any un-mapped field will be automatically mapped as basic and column name defaulted.
private BigDecimal salary;
}
The SQL query generated will be similar to
SELECT ID, F_NAME, L_NAME, SALARY FROM EMPLOYEE
if no conditions (where clause) are defined. So, to generalize the basic case, the number of queried columns equals the number of mapped fields in your entity. Therefore, the fewer fields your entity has, the fewer columns are included in the SQL query.
You can have an Employee entity with e.g. 20 fields and a BasicEmployee as above with only 4 fields, both mapped to the same table. Then you create separate repositories or separate repository methods for each.
Performance considerations
However, I seriously doubt you'll see noticeable performance improvements unless the fields you want to omit represent relationships to other entities. Before you start tweaking, log the SQL that is currently issued against the database, remove the columns you want to omit from that SQL, run it again, and analyze what you gained.

Resources