@Cacheable duplicates inner list of DTO returned from JOOQ query - spring

I'm new to JOOQ and Spring caching and I'm using version 3.10.6. I'm trying to cache a query result so that I don't need to go to the database every time. Fetching the query works without problems, but when the query is executed again it is served from the cache, which holds duplicate records in the inner lists. Every time the query is called and hits the cache, the number of duplicates grows. I could use a Set instead of a List, but I want to know why this duplication occurs.
Here is my JooqRepo method:
@Cacheable(CachingConfig.OPERATORS)
public List<MyDto> getAllOperatorsWithAliases() {
return create.select(Tables.MY_TABLE.ID)
.select(Tables.MY_TABLE.NAME)
.select(Tables.MY_INNER_TABLE.ID)
.select(Tables.MY_INNER_TABLE.ALIAS)
.select(Tables.MY_INNER_TABLE.PARENT_ID)
.select(Tables.MY_INNER_TABLE.IS_MAIN)
.from(Tables.MY_TABLE)
.join(Tables.MY_INNER_TABLE)
.on(Tables.MY_TABLE.ID.eq(Tables.MY_INNER_TABLE.PARENT_ID))
.fetch(this::createMyDtoFromRecord);
}
private MyDto createMyDtoFromRecord(Record record) {
MyInnerDto myInnerDto = new MyInnerDto();
myInnerDto.setId(record.field(Tables.MY_INNER_TABLE.ID).getValue(record));
myInnerDto.setAlias(record.field(Tables.MY_INNER_TABLE.ALIAS).getValue(record));
myInnerDto.setParentId(record.field(Tables.MY_INNER_TABLE.PARENT_ID).getValue(record));
myInnerDto.setIsMain(record.field(Tables.MY_INNER_TABLE.IS_MAIN).getValue(record) == 1);
MyDto myDto = new MyDto();
myDto.setId(record.field(Tables.MY_TABLE.ID).getValue(record));
myDto.setName(record.field(Tables.MY_TABLE.NAME).getValue(record));
myDto.setInnerDtos(Collections.singletonList(myInnerDto));
return myDto;
}
And here are the DTOs:
@Data
public class MyDto {
private Long id;
private String name;
private List<MyInnerDto> innerDtos;
}
@Data
public class MyInnerDto {
private Long id;
private String alias;
private Long parentId;
private Boolean isMain;
}
On the first call, MyDto1 has an innerDtos list of size 1. With each call that is served from the cache this number goes up by 3, and I think the reason is that the query returns 3 parent DTOs.
I've tried adding @EqualsAndHashCode to these DTOs, but when I add it the query returns an empty list.
I'm sorry if this was asked before but I couldn't find it.

I found the problem. It was not related to JOOQ; it was about @Cacheable and in-memory caches.
I was using an in-memory cache and removing the duplicates in the service layer by putting the query results into a Map<Long, MyDto>, collecting the MyInnerDto's under the same id. The problem is that an in-memory cache returns the cached object itself, whereas a cache like Redis returns a copy of it. So when I changed the object I got from the cache, the cached value changed as well, hence the duplication.
To get rid of this problem, here's the revised version of the query:
@Cacheable(CachingConfig.OPERATORS)
public List<MyDto> getAllOperatorsWithAliases() {
Map<MyDto, List<MyInnerDto>> result = create.select(Tables.MY_TABLE.ID)
.select(Tables.MY_TABLE.NAME)
.select(Tables.MY_INNER_TABLE.ID)
.select(Tables.MY_INNER_TABLE.ALIAS)
.select(Tables.MY_INNER_TABLE.PARENT_ID)
.select(Tables.MY_INNER_TABLE.IS_MAIN)
.from(Tables.MY_TABLE)
.join(Tables.MY_INNER_TABLE)
.on(Tables.MY_TABLE.ID.eq(Tables.MY_INNER_TABLE.PARENT_ID))
.fetchGroups(
r -> r.into(Tables.MY_TABLE).into(MyDto.class),
r -> r.into(Tables.MY_INNER_TABLE).into(MyInnerDto.class)
);
result.forEach(MyDto::setInnerDtos);
return new ArrayList<>(result.keySet());
}
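The aliasing described in this answer can be reproduced without Spring. The following is a minimal sketch, assuming a cache that stores references rather than copies; ReferenceCacheDemo, getShared and getCopy are hypothetical names used only for illustration, and a plain HashMap stands in for the real cache.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an in-memory cache that stores object references.
class ReferenceCacheDemo {
    private static final Map<String, List<String>> CACHE = new HashMap<>();

    // Returns the cached list itself: callers share one mutable instance.
    static List<String> getShared(String key) {
        return CACHE.computeIfAbsent(key, k -> new ArrayList<>(Arrays.asList("alias-1")));
    }

    // Returns a defensive copy: changes made by the caller never reach the cache.
    static List<String> getCopy(String key) {
        return new ArrayList<>(getShared(key));
    }

    // Empties the cache so the next lookup recreates the entry.
    static void reset() {
        CACHE.clear();
    }
}
```

Adding an element to the list returned by getShared changes what the next caller sees, which is the growth pattern from the question. Adding to the list returned by getCopy leaves the cached list at its original size.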

Related

Dynamic JPA query

I have two entities, Questions and UserAnswers. I need to build an API in Spring Boot that returns all the columns from both entities based on some conditions.
I will be given:
A comparator, e.g. >, <, =, >=, <=
A column name, e.g. last_answered_at, last_seen_at
A value for that column, e.g. 28-09-2020 06:00:18
I need to return an inner join of the two entities, filtered by the above conditions.
A sample SQL query based on the above conditions would look like this:
SELECT q,ua from questions q INNER JOIN
user_answers ua on q.id = ua.question_id
WHERE ua.last_answered_at > '28-09-2020 06:00:18'
The problem I am facing is that the column name and the comparator for the query need to be dynamic.
Is there an efficient way to do this using Spring Boot and JPA? I do not want to write JPA query methods for all possible combinations of columns and operators, as that can be a very large number and would require extensive use of if/else.
I have developed a library called spring-dynamic-jpa to make it easier to implement dynamic queries with JPA.
You can use it to write the query templates. The query template will be built into different query strings before execution depending on your parameters when you invoke the method.
This sounds like a clear custom implementation of a repository method. Firstly, I will make some assumptions about the implementation of your entities. Afterwards, I will present an idea on how to solve your challenge.
I assume that the entities look basically like this (getters, setters, equals, hashCode... ignored).
@Entity
@Table(name = "questions")
public class Question {
@Id
@GeneratedValue
private Long id;
private LocalDateTime lastAnsweredAt;
private LocalDateTime lastSeenAt;
// other attributes you mentioned...
@OneToMany(mappedBy = "question", cascade = CascadeType.ALL, orphanRemoval = true)
private List<UserAnswer> userAnswers = new ArrayList<>();
// Add and remove methods added to keep bidirectional relationship synchronised
public void addUserAnswer(UserAnswer userAnswer) {
userAnswers.add(userAnswer);
userAnswer.setQuestion(this);
}
public void removeUserAnswer(UserAnswer userAnswer) {
userAnswers.remove(userAnswer);
userAnswer.setQuestion(null);
}
}
@Entity
@Table(name = "user_answers")
public class UserAnswer {
@Id
@GeneratedValue
private Long id;
@ManyToOne(fetch = FetchType.LAZY)
@JoinColumn(name = "question_id")
private Question question;
}
I will write the code based on my knowledge of Hibernate as the JPA implementation. For other JPA implementations, it might work similarly or the same.
Hibernate often needs the name of attributes as a String. To circumvent the issue of undetected mistakes (especially when refactoring), I suggest the module hibernate-jpamodelgen (see the class names suffixed with an underscore). You can also use it to pass the names of the attributes as arguments to your repository method.
Repository methods try to communicate with the database. In JPA, there are different ways of implementing database requests: JPQL as a query language and the Criteria API (easier to refactor, less error prone). As I am a fan of the Criteria API, I will use the Criteria API together with the modelgen to tell the ORM Hibernate to talk to the database to retrieve the relevant objects.
public class QuestionRepositoryCustomImpl implements QuestionRepository {
@PersistenceContext
private EntityManager entityManager;
@Override
public List<Question> dynamicFind(String comparator, String attribute, String value) {
CriteriaBuilder cb = entityManager.getCriteriaBuilder();
CriteriaQuery<Question> cq = cb.createQuery(Question.class);
// Root gets constructed for first, main class in the request (see return of method)
Root<Question> root = cq.from(Question.class);
// Join happens based on respective attribute within root
root.join(Question_.USER_ANSWERS);
// Assumption: the attribute is a LocalDateTime and the value uses the format from the question (dd-MM-yyyy HH:mm:ss)
LocalDateTime parsed = LocalDateTime.parse(value, DateTimeFormatter.ofPattern("dd-MM-yyyy HH:mm:ss"));
Path<LocalDateTime> path = root.get(attribute);
// The following ifs are not the nicest solution.
// They check which comparator the String contains and add the respective where clause to the query
// This .where() is like WHERE in SQL
if ("=".equals(comparator)) {
cq.where(cb.equal(path, parsed));
} else if (">".equals(comparator)) {
cq.where(cb.greaterThan(path, parsed));
} else if (">=".equals(comparator)) {
cq.where(cb.greaterThanOrEqualTo(path, parsed));
} else if ("<".equals(comparator)) {
cq.where(cb.lessThan(path, parsed));
} else if ("<=".equals(comparator)) {
cq.where(cb.lessThanOrEqualTo(path, parsed));
}
// Finally, query gets created and result collected and returned as List
// Hint for READ_ONLY is added as lists are often just for read and performance is better.
return entityManager.createQuery(cq).setHint(QueryHints.READ_ONLY, true).getResultList();
}
}
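The chain of ifs in the answer can also be expressed as a lookup table keyed by the operator symbol. The following is a minimal sketch in plain Java, not the Criteria API version: ComparatorTable and matches are hypothetical names, and the table works on the result of compareTo so that it runs without a JPA provider.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntPredicate;

// Hypothetical dispatch table: operator symbol -> test on the compareTo result.
class ComparatorTable {
    private static final Map<String, IntPredicate> OPS = new HashMap<>();

    static {
        OPS.put("=", c -> c == 0);
        OPS.put(">", c -> c > 0);
        OPS.put(">=", c -> c >= 0);
        OPS.put("<", c -> c < 0);
        OPS.put("<=", c -> c <= 0);
    }

    // Evaluates "left <operator> right"; unknown operators are rejected
    // instead of being silently ignored.
    static <T extends Comparable<? super T>> boolean matches(T left, String operator, T right) {
        IntPredicate op = OPS.get(operator);
        if (op == null) {
            throw new IllegalArgumentException("Unsupported comparator: " + operator);
        }
        return op.test(left.compareTo(right));
    }
}
```

One design point is that an unsupported operator fails loudly here, whereas a chain of independent ifs would return an unfiltered result. In a repository, the table's values could instead be functions that build a Criteria Predicate (for example by wrapping cb.greaterThan or cb.lessThanOrEqualTo); that variant is left out because it needs a JPA provider to run.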

Challenge Persisting Complex Entity using Spring Data JDBC

Considering the complexities involved in JPA, we are planning to use Spring Data JDBC for our entities because of its simplicity. Below is a sample structure; we have up to 6 child entities. We are able to insert the data into these entities successfully with proper foreign key mappings.
Challenge: We have a workflow process outside of this application that periodically updates the "requestStatus" in the "Request" entity, and this is the only field that gets updated after the Request is created. With Spring Data JDBC, an update deletes all referenced entities and recreates (inserts) them again. This is a heavy operation considering the 6 child entities. Is there any workaround or suggestion for handling these scenarios?
@Table("Request")
public class Request {
private String requestId; // generated in the Before Save Listener .
private String requestStatus;
@Column("requestId")
private ChildEntity1 childEntity1;
public void addChildEntity1(ChildEntity1 childEntityobj) {
this.childEntity1 = childEntityobj;
}
}
@Table("Child_Entity1")
public class ChildEntity1 {
private String entity1Id; // Auto increment on DB
private String name;
private String SSN;
private String requestId;
@MappedCollection(column = "entity1Id", keyColumn = "entity2Id")
private ArrayList<ChildEntity2> childEntity2List = new ArrayList<ChildEntity2>();
@MappedCollection(column = "entity1Id", keyColumn = "entity3Id")
private ArrayList<ChildEntity3> childEntity3List = new ArrayList<ChildEntity3>();
public void addChildEntity2(ChildEntity2 childEntity2obj) {
childEntity2List.add(childEntity2obj);
}
public void addChildEntity3(ChildEntity3 childEntity3obj) {
childEntity3List.add(childEntity3obj);
}
}
@Table("Child_Entity2")
public class ChildEntity2 {
private String entity2Id; // Auto increment on DB
private String partyTypeCode;
private String requestId;
}
@Table("Child_Entity3")
public class ChildEntity3 {
private String entity3Id; // Auto increment on DB
private String PhoneCode;
private String requestId;
}
@Test
public void createandsaveRequest() {
Request newRequest = createRequest(); // using builder to build the object
newRequest.addChildEntity1(createChildEntity1());
newRequest.getChildEntity1().addChildEntity2(createChildEntity2());
newRequest.getChildEntity1().addChildEntity3(createChildEntity3());
requestRepository.save(newRequest);
}
The approach you describe in your comment, having a dedicated method that performs exactly that update statement, is the right way to do this.
You should be aware, though, that this ignores optimistic locking.
So there is a risk that the following might happen:
Thread/Session 1: reads an aggregate.
Thread/Session 2: updates a single field as per your question.
Thread/Session 1: writes the aggregate, possibly with other changes, overwriting the change made by Session 2.
To avoid this or similar problems you need to:
Check that the version of the aggregate root is unchanged from when you loaded it, in order to guarantee that the method doesn't write conflicting changes.
Increment the version in order to guarantee that nothing else overwrites the changes made in this method.
This might mean that you need two or more SQL statements, which probably means you have to fall back even further to a fully custom method where you implement this, probably using an injected JdbcTemplate.
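The two rules above (check the version, then increment it) can be sketched in plain Java. This is only an illustration under assumptions: the Request entity in the question has no version field, so the version counter here is hypothetical, as are the names VersionedStatusUpdate, insert, updateStatus, versionOf and statusOf. An in-memory map stands in for the table; the SQL in the comment shows what a single-statement equivalent might look like.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for the Request table with an assumed version column.
// A single-statement SQL equivalent might be:
//   UPDATE Request SET requestStatus = ?, version = version + 1
//   WHERE requestId = ? AND version = ?
class VersionedStatusUpdate {
    private static final class Row {
        String status;
        long version;

        Row(String status, long version) {
            this.status = status;
            this.version = version;
        }
    }

    private final Map<String, Row> table = new HashMap<>();

    // New rows start at version 0.
    synchronized void insert(String requestId, String status) {
        table.put(requestId, new Row(status, 0L));
    }

    // Applies the change only if the row still has the version the caller read.
    // Returns false when the row is missing or was changed in the meantime.
    synchronized boolean updateStatus(String requestId, String newStatus, long expectedVersion) {
        Row row = table.get(requestId);
        if (row == null || row.version != expectedVersion) {
            return false;
        }
        row.status = newStatus;
        row.version++;
        return true;
    }

    synchronized long versionOf(String requestId) {
        return table.get(requestId).version;
    }

    synchronized String statusOf(String requestId) {
        return table.get(requestId).status;
    }
}
```

A caller that gets false back has lost the race and would need to reload the aggregate before retrying.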

Efficient way to fetch list size

I have an entity like the one below. When I need the number of comments of a company, I call the totalComments() method. Does Hibernate go to the database and fetch all the comment data for this, or does it just query with count(*)? If Hibernate fetches every comment, what is an efficient way of getting the comment count?
@Entity
@Table(name = "companies")
public class Company extends ItemEntity {
@OneToMany(fetch = FetchType.LAZY)
@JoinTable(name="companies_comments",
joinColumns=@JoinColumn(name="company_id"),
inverseJoinColumns=@JoinColumn(name="comment_id"))
private Set<Comment> comments = new HashSet<>();
public void addComment(Comment comment) {
this.comments.add(comment);
}
public int totalComments() {
return this.comments.size();
}
}
You should drop your own counting method and create a dedicated (business) query to retrieve the size of the list, such as
public long getCommentsCount(Company c) {
String query = "SELECT COUNT(cm) FROM Company AS c JOIN c.comments AS cm WHERE c = :company";
return entityManager.createQuery(query, Long.class).setParameter("company", c).getSingleResult();
}
Some persistence providers may optimize performance when this kind of query is declared as a @NamedQuery on the entity, or when the CriteriaQuery API is used.
Depending on your database, you may need to change the return class to Number.class and convert to long.
If you want to tune performance even further, use the createNativeQuery method and write your own pure SQL, but keep in mind that changes to the DB schema then require reviewing these queries.
I found the answer. Unless we take extra steps, Hibernate loads every comment when we ask for the size of the entity's collection. We can solve this performance issue in two ways.
We can use @LazyCollection(LazyCollectionOption.EXTRA) like below. With LazyCollectionOption.EXTRA, .size() and .contains() won't initialize the whole collection.
@OneToMany(fetch = FetchType.LAZY)
@LazyCollection(LazyCollectionOption.EXTRA)
@JoinTable(name="companies_comments",
joinColumns=@JoinColumn(name="company_id"),
inverseJoinColumns=@JoinColumn(name="comment_id"))
private Set<Comment> comments = new HashSet<>();
Or we can use the @Formula annotation.
@Formula("(SELECT COUNT(*) FROM companies_comments cc WHERE cc.company_id = id)")
private int numberOfComments;
Edit after 8 months: For simplicity and performance, we should create a JPA query method like the one below.
@Repository
public interface CommentRepository extends JpaRepository<Comment, Long> {
int countAllByCompany(Company company);
}
We should never use getComments().size() for this purpose, because that loads all comments into memory and may cause performance issues.
The same is true when adding comments to the collection. We shouldn't use getComments().add(newComment). When we have a OneToMany relation, all we have to do is set the company field of the comment, as in newComment.setCompany(company), and perform the persist operation. Therefore, it is recommended to define OneToMany relationships as bidirectional.

Replacing entire contents of spring-data Page, while maintaining paging info

I'm using spring-data-jpa to get data out of a table that has about a dozen columns used in queries to find particular rows, plus a payload column of clob type that contains the actual data, which is marshalled into Java objects to be returned.
The entity object would, very roughly, be something like
@Entity
@Table(name = "Person")
public class Person {
@Column(name="PERSON_ID", length=45) @Id private String personId;
@Column(name="NAME", length=45) private String name;
@Column(name="ADDRESS", length=45) private String address;
@Column(name="PAYLOAD") @Lob private String payload;
//Bunch of other stuff
}
(Whether this approach is sensible or not is a topic for a different discussion)
The clob column causes performance to suffer on large queries ...
In an attempt to improve things a bit, I've created a separate entity object ... sans payload ...
@Entity
@Table(name = "Person")
public class NotQuiteAWholePerson {
@Column(name="PERSON_ID", length=45) @Id private String personId;
@Column(name="NAME", length=45) private String name;
@Column(name="ADDRESS", length=45) private String address;
//Bunch of other stuff
}
This gets me a page of NotQuiteAWholePerson ... I then query for the page of full Person objects via the personIds.
The hope is that by not using the payload in the original query, which could be filtering data over a good bit of the backing table, I only concern myself with the payload when I'm retrieving the current page of objects to be viewed ... a much smaller chunk.
So I'm at the point where I want to map the contents of the originally returned Page of NotQuiteAWholePerson to my List of Person while keeping all the paging info intact. The map method, however, only takes a Converter, which iterates over the NotQuiteAWholePerson objects ... which doesn't quite fit what I'm trying to do.
Is there a sensible way to achieve this?
Additional clarification for @itsallas as to why the existing map() will not suffice:
PageImpl::map has
@Override
public <S> Page<S> map(Converter<? super T, ? extends S> converter) {
return new PageImpl<S>(getConvertedContent(converter), pageable, total);
}
Chunk::getConvertedContent has
protected <S> List<S> getConvertedContent(Converter<? super T, ? extends S> converter) {
Assert.notNull(converter, "Converter must not be null!");
List<S> result = new ArrayList<S>(content.size());
for (T element : this) {
result.add(converter.convert(element));
}
return result;
}
So the original List of contents is iterated through ... and a supplied convert method applied, to build a new list of contents to be inserted into the existing Pageable.
However I cannot convert a NotQuiteAWholePerson to a Person individually, as I cannot simply construct the payload... well I could, if I called out to the DB for each Person by Id in the convert... but calling out individually is not ideal from a performance perspective ...
After getting my Page of NotQuiteAWholePerson I am querying for the entire List of Person ... by Id ... in one call ... and now I am looking for a way to substitute the entire content list ... not iteratively, as the existing map() does, but as a simple replacement.
This particular use case would also assist where the payload, which is json, is more appropriately persisted in a NoSql datastore like Mongo ... as opposed to the sql datastore clob ...
Hope that clarifies it a bit better.
You can avoid the problem entirely with Spring Data JPA features.
The most sensible way would be to use Spring Data JPA projections, which are extensively documented.
For example, you would first need to ensure lazy fetching for your attribute, which you can achieve with an annotation on the attribute itself, i.e.:
@Basic(fetch = FetchType.LAZY) @Column(name="PAYLOAD") @Lob private String payload;
or through Fetch/Load Graphs, which are neatly supported at repository-level.
You need to define this one way or another, because, as taken verbatim from the docs :
The query execution engine creates proxy instances of that interface at runtime for each element returned and forwards calls to the exposed methods to the target object.
You can then define a projection like so :
interface NotQuiteAWholePerson {
String getPersonId();
String getName();
String getAddress();
//Bunch of other stuff
}
And add a query method to your repository :
interface PersonRepository extends Repository<Person, String> {
Page<NotQuiteAWholePerson> findAll(Pageable pageable);
// or its dynamic equivalent
<T> Page<T> findAll(Pageable pageable, Class<T> type);
}
Given the same pageable, a page of projections would refer back to the same entities in the same session.
If you cannot use projections for whatever reason (namely if you're using JPA < 2.1 or a version of Spring Data JPA before projections), you could define an explicit JPQL query with the columns and relationships you want, or keep the 2-entity setup. You could then map Persons and NotQuiteAWholePersons to a PersonDTO class, either manually or (preferably) using your object mapping framework of choice.
NB. : There are a variety of ways to use and setup lazy/eager relations. This covers more in detail.
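If the two-entity setup from the question is kept, the content swap itself can be done in plain Java before a new page object is built. The following is a minimal sketch under assumptions: PageContentSwap and swap are hypothetical names, the batch loader stands in for the single by-id query from the question, and the sketch assumes that the batch query may return rows in a different order than the page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: swaps slim page elements for their full counterparts.
class PageContentSwap {
    // S = slim type, F = full type, K = id type.
    static <S, F, K> List<F> swap(List<S> slimContent,
                                  Function<S, K> slimId,
                                  Function<List<K>, List<F>> batchLoader,
                                  Function<F, K> fullId) {
        // Collect the ids in page order.
        List<K> ids = new ArrayList<>(slimContent.size());
        for (S slim : slimContent) {
            ids.add(slimId.apply(slim));
        }
        // One call loads all full objects for the page.
        Map<K, F> byId = new HashMap<>();
        for (F full : batchLoader.apply(ids)) {
            byId.put(fullId.apply(full), full);
        }
        // Rebuild the content in the original page order; ids without a
        // match are skipped.
        List<F> result = new ArrayList<>(ids.size());
        for (K id : ids) {
            F full = byId.get(id);
            if (full != null) {
                result.add(full);
            }
        }
        return result;
    }
}
```

The returned list could then be handed to the PageImpl constructor quoted in the question, together with the original pageable and total. That step needs Spring Data and is left out of the sketch.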

Why is JPA query so slow?

I am implementing queries in my web application with JPA repositories. The two main tables I am querying from are FmReportTb and SpecimenTb.
Here are the two entity classes (only important attributes are listed).
//FmReportTb.java
@Entity
@Table(name="FM_REPORT_TB")
public class FmReportTb implements Serializable {
@Column(name="ROW_ID")
private long rowId;
@Column(name="FR_BLOCK_ID")
private String frBlockId;
@Column(name="FR_FULL_NAME")
private String frFullName;
@OneToOne
@JoinColumn(name="SPECIMEN_ID")
private SpecimenTb specimenTb;
FmReportTb has a OneToOne relationship with SpecimenTb.
@Entity
@Table(name="SPECIMEN_TB")
public class SpecimenTb implements Serializable {
private String mrn;
@OneToOne(mappedBy="specimenTb", cascade=CascadeType.ALL)
private FmReportTb fmReportTb;
The query I am working on is to find all records in FmReportTb and show a few attributes from FmReportTb plus mrn from SpecimenTb.
Here is my JPA repository for FmReportTb:
@Repository
public interface FmReportRepository extends JpaRepository<FmReportTb, Long> {
@Query("select f from FmReportTb f where f.deleteTs is not null")
public List<FmReportTb> findAllFmReports();
Since I am only showing some of the attributes from FmReportTb and one attribute from SpecimenTb, I decided to create a Value Object for FmReportTb. The constructor of the VO class assigns attributes from FmReportTb and grabs the mrn attribute from SpecimenTb based on the OneToOne relationship. Another reason for using a VO is that the table FmReportTb has a lot of OneToMany child entities. For this particular query, I don't need any of them.
public class FmReportVO {
private String frBlockId;
private Date frCollectionDate;
private String frCopiedPhysician;
private String frDiagnosis;
private String frFacilityName;
private String frFullName;
private String frReportId;
private String filepath;
private String mrn;
public FmReportVO(FmReportTb fmReport) {
this.frBlockId = fmReport.getFrBlockId();
this.frCollectionDate = fmReport.getFrCollectionDate();
this.frCopiedPhysician = fmReport.getFrCopiedPhysician();
this.frDiagnosis = fmReport.getFrDiagnosis();
this.frFacilityName = fmReport.getFrFacilityName();
this.frFullName = fmReport.getFrFullName();
this.frReportId = fmReport.getFrReportId();
this.mrn = fmReport.getSpecimenTb().getMrn();
}
I implemented a findAll method in the service bean class to return a list of FmReportTb VOs.
//FmReportServiceBean.java
@Override
public List<FmReportVO> findAllFmReports() {
List<FmReportTb> reports = fmReportRepository.findAllFmReports();
if (reports == null) {
return null;
}
List<FmReportVO> fmReports = new ArrayList<FmReportVO>();
for (FmReportTb report : reports) {
FmReportVO reportVo = new FmReportVO(report);
String filepath = fileLoadRepository.findUriByFileLoadId(report.getFileLoadId().longValue());
reportVo.setFilepath(filepath);
fmReports.add(reportVo);
}
return fmReports;
}
Lastly, my controller looks like this:
@RequestMapping(
value = "/ristore/foundation/",
method = RequestMethod.GET,
produces = "application/json")
public ResponseEntity<List<FmReportVO>> getAllFmReports() {
List<FmReportVO> reports = ristoreService.findAllFmReports();
if (reports == null) {
return new ResponseEntity<List<FmReportVO>>(HttpStatus.NOT_FOUND);
}
return new ResponseEntity<List<FmReportVO>>(reports, HttpStatus.OK);
}
There are about 200 records in the database. Surprisingly, it took almost 2 full seconds to retrieve all the records as JSON. Even though I did not index all the tables, this is way too slow. A similar query takes only a few ms when run on the database directly. Is it because I am using Value Objects, or do JPA queries tend to be this slow?
EDIT 1
This may have to do with the fact that FmReportTb has almost 20 OneToMany entities. Although the fetch mode of these child entities is set to LAZY, the JPA Data repository tends to ignore the fetch mode. So I ended up using NamedEntityGraph to specify the attributes to fetch as EAGER. The next section is added to the head of my FmReportTb entity class.
@Entity
@NamedEntityGraph(
name = "FmReportGraph",
attributeNodes = {
@NamedAttributeNode("fileLoadId"),
@NamedAttributeNode("frBlockId"),
@NamedAttributeNode("frCollectionDate"),
@NamedAttributeNode("frDiagnosis"),
@NamedAttributeNode("frFullName"),
@NamedAttributeNode("frReportId"),
@NamedAttributeNode("specimenTb")})
@Table(name="FM_REPORT_TB")
And then @EntityGraph("FmReportGraph") was added before the JPA repository query to find all records. After doing that, the performance improved a little bit. Now fetching 1500 records only takes about 10 seconds. However, it still seems too slow given that each JSON object is fairly small.
Answering for the benefit of others with slow JPA queries...
As @Ken Bekov hints in the comments, foreign keys can help a lot with JPA.
I had a couple of tables with a many-to-one relationship; a query of 100,000 records was taking hours to perform. Without any code changes I reduced this to seconds just by adding a foreign key.
In phpMyAdmin you do this by creating a Relationship from the "many" table to the "one" table. For a detailed explanation see this question: Setting up foreign keys in phpMyAdmin?
and the answer by @Devsi Odedra

Resources