Pagination in duplicate records - spring-boot

I need to apply pagination in a Spring Boot project.
I apply pagination to two queries, each of which returns data from a different table. Some records appear in both tables, so the duplicates need to be removed.
After removing them, the number of entries I send back is smaller than the page size, which breaks the pagination that was applied initially. How should I approach this?
Here I take two lists from two JPA calls (highRiskCust and amlPositiveCust), to which pagination will be applied, then remove the duplicates and return the final result (tempReport).
`
List<L1ComplianceResponseDTO> highRiskCust = customerKyc.findAllHighRiskL1Returned(startDateTime, endDateTime, agentIds);
List<L1ComplianceResponseDTO> amlPositiveCust = customerKyc.findAllAmlPositiveL1Returned(startDateTime, endDateTime, agentIds);
List<L1ComplianceResponseDTO> tempReport = new ArrayList<>();
tempReport.addAll(amlPositiveCust);
tempReport.addAll(highRiskCust);
tempReport = tempReport.stream()
        .filter(distinctByKey(p -> p.getKycTicketId()))
        .collect(Collectors.toList());
`

For pagination to work, you need to do it in a single query.
Fundamentally, this query should use UNION.
Since JPQL has traditionally not supported UNION, either use a native query or rewrite your query logic using outer joins.
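When both result sets are small enough to load completely, the same principle (deduplicate first, paginate last) can also be applied in memory. The following is only a minimal sketch of that fallback, not the UNION query itself; `Ticket`, `DedupThenPage` and `pageOf` are hypothetical names standing in for `L1ComplianceResponseDTO` and the service code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for L1ComplianceResponseDTO; only the key matters here.
final class Ticket {
    final long kycTicketId;
    final String source;

    Ticket(long kycTicketId, String source) {
        this.kycTicketId = kycTicketId;
        this.source = source;
    }
}

final class DedupThenPage {

    // Merges both lists, keeps the first row seen per kycTicketId,
    // sorts by that id so the order is deterministic, then cuts out one page.
    static List<Ticket> pageOf(List<Ticket> first, List<Ticket> second, int page, int size) {
        Map<Long, Ticket> byId = new LinkedHashMap<>();
        for (Ticket t : first) {
            byId.putIfAbsent(t.kycTicketId, t);
        }
        for (Ticket t : second) {
            byId.putIfAbsent(t.kycTicketId, t);
        }

        List<Ticket> merged = new ArrayList<>(byId.values());
        merged.sort(Comparator.comparingLong((Ticket t) -> t.kycTicketId));

        // Clamp the window so out-of-range pages come back empty.
        int from = Math.min(page * size, merged.size());
        int to = Math.min(from + size, merged.size());
        return new ArrayList<>(merged.subList(from, to));
    }
}
```

Because the page is cut only after deduplication, every page except the last contains exactly `size` entries. This does not scale to large tables, which is where the single database query becomes necessary.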

Related

Spring boot JPA Specification API with Pagination and Sorting

I'm trying to implement a search in Spring Boot. Since the search params are dynamic, I had to go for the Specification API. My requirement is to search for orders given certain params and sort those orders by creation date. Since the data can be large, the API should also support pagination.
Below is the code snippet with the predicates.
public search(OrderSearchCriteria searchCriteria, Integer pageNo, Integer pageSize) {
    Pageable pageRequest = (!ObjectUtils.isEmpty(pageNo) && !ObjectUtils.isEmpty(pageSize))
            ? PageRequest.of(pageNo, pageSize)
            : Pageable.unpaged();
    Page<Order> ordersPage = this.dao.findAll(((Specification<Order>) (root, query, criteriaBuilder) -> {
        // my search params, its more than what is shown here
        Long id = searchCriteria.getId();
        Priority priority = searchCriteria.getPriority();
        List<Predicate> predicates = new ArrayList<>();
        if (!ObjectUtils.isEmpty(id))
            predicates.add(criteriaBuilder.and(criteriaBuilder.equal(root.get("id"), id)));
        if (!ObjectUtils.isEmpty(priority))
            predicates.add(criteriaBuilder.and(criteriaBuilder.equal(root.get("priority"), priority)));
        // This line is causing the issue
        query.orderBy(criteriaBuilder.desc(root.get("creationDate")));
        return criteriaBuilder.and(predicates.toArray(new Predicate[0]));
    }), pageRequest);
The issue is that if I add the sort line query.orderBy(criteriaBuilder.desc(root.get("creationDate"))), pagination stops working: with a page size of 1, page 0 and page 1 both show the same result (the same order), while page 2 shows a different result as expected. If I remove the sort line shown above, the code works as expected. How do I support both pagination and sorting without this issue? I also tried applying the sort in the page request, PageRequest.of(pageNo, size, Sort.by(DESC, "creationDate")), but got the same issue. I'd appreciate any help.
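What do you mean by "it shows same result"? There might be multiple entries that have the same creation date, in which case their relative order is undefined and can differ from one page query to the next. If you omit the sort, the rows are often returned in a consistent order (typically by id), which would explain why paging looks correct without it.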
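To illustrate the point about ties, here is a small in-memory sketch (plain Java, not Spring Data) of a sort on `creationDate` that uses a unique secondary key so that every row has exactly one position. The `OrderRow` and `StablePaging` classes and their fields are hypothetical. In Spring Data the corresponding idea would be a sort such as `Sort.by(Sort.Direction.DESC, "creationDate").and(Sort.by("id"))`, assuming the entity has a unique `id` property.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical order row: several rows may share the same creationDate.
final class OrderRow {
    final long id;
    final LocalDate creationDate;

    OrderRow(long id, LocalDate creationDate) {
        this.id = id;
        this.creationDate = creationDate;
    }
}

final class StablePaging {

    // Newest first; the unique id breaks ties between equal dates.
    static final Comparator<OrderRow> NEWEST_FIRST_THEN_ID =
            Comparator.comparing((OrderRow o) -> o.creationDate).reversed()
                    .thenComparingLong(o -> o.id);

    // Sorts a copy of the rows (in whatever order they arrive) and returns one page.
    static List<OrderRow> page(List<OrderRow> rows, int pageNo, int pageSize) {
        List<OrderRow> sorted = new ArrayList<>(rows);
        sorted.sort(NEWEST_FIRST_THEN_ID);
        int from = Math.min(pageNo * pageSize, sorted.size());
        int to = Math.min(from + pageSize, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }
}
```

With the tie-breaker in place, the same page number yields the same row regardless of the order in which the rows were originally delivered, so pages 0 and 1 can no longer overlap.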

PageRequest and OrderBy method name Issue

In our Spring application we have a table that contains a lot of "Payment" records. We now need a query that pages the results, sorted from the largest total to the smallest. We are facing an error because sometimes the same record is contained in two successive pages.
We create a PageRequest and pass it to the repository. Here is our implementation:
Repository:
public interface StagingPaymentEntityRepository extends JpaRepository<StagingPaymentEntity, Long> {
    Page<StagingPaymentEntity> findAllByStatusAndCreatedDateLessThanEqualAndOperationTypeOrderByEffectivePaymentDesc(String status, Timestamp batchStartTimestamp, String operationType, Pageable pageable);
}
public class BatchThreadReiteroStorni extends ThreadAbstract<StagingPaymentEntity> {
    PageRequest pageRequest = PageRequest.of(index, 170);
    Page<StagingPaymentEntity> records = ((StagingPaymentEntityRepository) repository).findAllByStatusAndCreatedDateLessThanEqualAndOperationTypeOrderByEffectivePaymentDesc("REITERO", batchStartTimestamp, "STORNO", pageRequest);
}
where index is the index of the page we are requesting.
Is there a way to understand why this is happening? Thanks for your support.
This can have multiple reasons.
Non-deterministic ordering: if the ordering you are using isn't deterministic, i.e. there are rows that might come in any order, that order might change between selects, resulting in items getting skipped or returned multiple times. Fix: add the primary key as the last column of the ordering.
Concurrent changes: if you change the entities in a way that affects the ordering, or another process does, you might end up with items getting processed multiple times.
In this scenario I see a couple of approaches:
Do value-based pagination, i.e. don't select pages but select the next N rows after the last row of the previous batch.
Instead of paging, use a Stream. This allows you to use a single select while still processing the results one element at a time. You might have to flush and evict entities, and I'm not 100% sure that works, but it is certainly worth a try.
Finally, you can mark all rows that you want to process in a separate column, then select N marked entities and unmark them once they are processed.
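The value-based (keyset) approach can be sketched in plain Java as follows. It simulates "select the next N rows after the last one seen" over an in-memory list ordered by `effectivePayment` descending, with `id` as tie-breaker; in a real repository the same condition would go into the WHERE clause of the query. `Payment`, `KeysetPaging` and `nextBatch` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical payment row with a unique id and a sortable amount.
final class Payment {
    final long id;
    final long effectivePayment;

    Payment(long id, long effectivePayment) {
        this.id = id;
        this.effectivePayment = effectivePayment;
    }
}

final class KeysetPaging {

    // Largest payment first; id ascending breaks ties, so the order is total.
    static final Comparator<Payment> ORDER =
            Comparator.comparingLong((Payment p) -> p.effectivePayment).reversed()
                    .thenComparingLong(p -> p.id);

    // Returns up to 'limit' rows that come strictly after 'last' in ORDER.
    // Pass null as 'last' to get the first batch.
    static List<Payment> nextBatch(List<Payment> table, Payment last, int limit) {
        List<Payment> sorted = new ArrayList<>(table);
        sorted.sort(ORDER);
        List<Payment> batch = new ArrayList<>();
        for (Payment p : sorted) {
            if (batch.size() == limit) {
                break;
            }
            if (last == null || ORDER.compare(p, last) > 0) {
                batch.add(p);
            }
        }
        return batch;
    }
}
```

Unlike an offset, the "after the last row" condition does not shift when already processed rows disappear from the result set or change status, so no row is skipped or returned twice as long as the sort key of the remaining rows stays stable.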

Mapping many-to-many IN statement into JPA (Spring Boot)

I have created two entities in JPA, Listing and ItemType - these exist in a many-to-many relationship (Hibernate auto-generates a junction table). I'm trying to find the best way to create a query which accepts a dynamic list of item type Strings and returns the IDs of all listings which match the specified item types, but I am a recent initiate in JPA.
At present I'm using JpaRepository to create relatively simple queries. I've been trying to do this using CriteriaQuery but some close-but-not-quite answers I've read elsewhere seem to suggest that because this is in Spring, this may not be the best approach and that I should be handling this using the JpaRepository implementation itself. Does that seem reasonable?
I have a query which doesn't feel a million miles away (based on Baeldung's example and my reading on WikiBooks) but for starters I'm getting a Raw Type warning on the Join, not to mention that I'm unsure if this will run and I'm sure there's a better way of going about this.
public List<ListingDTO> getListingsByItemType(List<String> itemTypes) {
    List<ListingDTO> listings = new ArrayList<>();
    CriteriaQuery<Listing> criteriaQuery = criteriaBuilder.createQuery(Listing.class);
    Root<Listing> listing = criteriaQuery.from(Listing.class);
    //Here Be Warnings. This should be Join<?,?> but what goes in the diamond?
    Join itemtype = listing.join("itemtype", JoinType.LEFT);
    In<String> inClause = criteriaBuilder.in(itemtype.get("name"));
    for (String itemType : itemTypes) {
        inClause.value(itemType);
    }
    criteriaQuery.select(listing).where(inClause);
    TypedQuery<Listing> query = entityManager.createQuery(criteriaQuery);
    List<Listing> results = query.getResultList();
    for (Listing result : results) {
        listings.add(convertListingToDto(result));
    }
    return listings;
}
I'm trying to understand how best to pass in a dynamic list of names (the field in ItemType) and return a list of unique ids (the PK in Listing) where there is a row which matches in the junction table. Please let me know if I can provide any further information or assistance - I've gotten the sense that JPA and its handling of dynamic queries like this is part of its bread and butter!
The criteria API is useful when you need to dynamically create a query based on various... criteria.
All you need here is a static JPQL query:
select distinct listing from Listing listing
join listing.itemTypes itemType
where itemType.name in :itemTypes
Since you're using Spring Data JPA, you just need to define a method in your repository interface and annotate it with @Query:
@Query("<the above query>")
List<Listing> findByItemTypes(@Param("itemTypes") List<String> itemTypes);

Large Resultset with Spring Boot and QueryDSL

I have a Spring Boot application where I use QueryDSL for dynamic queries.
Now the results should be exported as a csv file.
The model is an Order which contains products. The products should be included in the csv file.
However, as there are many thousand orders with millions of products this should not be loaded into memory at once.
However, solutions proposed by Hibernate (ScrollableResults) and streams are not supported by QueryDSL.
How can this be achieved while still using QueryDSL (to avoid duplication of filtering logic)?
One workaround to this problem is to keep iterating using offset and limit.
Something like:
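long limit = 100;
long offset = 0;
List<MyEntity> entities;
do {
    // Fetch the next chunk; order by a unique column so that chunks do not overlap
    entities = new JPAQuery<MyEntity>(em)
            .from(QMyEntity.entity)
            .limit(limit)
            .offset(offset)
            .fetch();
    // process the chunk here (e.g. write it to the CSV file)
    offset += limit;
} while (!entities.isEmpty());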
With that approach you can fetch smaller chunks of data. It is important to check whether limit and offset work well with your query. There are situations where, even with limit and offset, you will end up doing a full scan of the tables involved in the query. If that happens, you will face a performance problem instead of a memory one.
Use JPAQueryFactory
// com.querydsl.jpa.impl.JPAQueryFactory
JPAQueryFactory jpaQueryFactory = new JPAQueryFactory(entityManager);
Expression<MyEntity> select = QMyEntity.myEntity;
EntityPath<MyEntity> entityPath = QMyEntity.myEntity;
// cond is the predicate built from your dynamic filter
Stream<?> stream = jpaQueryFactory
        .select(select)
        .from(entityPath)
        .where(cond)
        .createQuery() // get the underlying JPA query
        .getResultStream();
// do something with the stream
stream.close();
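Because a result stream may hold on to database resources until it is closed, it is usually safer to close it with try-with-resources than with an explicit `close()` call, which is skipped if processing throws. A minimal sketch in plain Java, using an ordinary `Stream` as a stand-in for the query result stream (the `StreamExport` class and `writeCsv` method are hypothetical):

```java
import java.io.PrintWriter;
import java.util.stream.Stream;

final class StreamExport {

    // Writes one line per row and always closes the stream,
    // even if writing a row throws.
    static void writeCsv(Stream<String> rows, PrintWriter out) {
        try (Stream<String> s = rows) {
            s.forEach(row -> {
                out.print(row);
                out.print('\n');
            });
        }
    }
}
```

Only one row is held by this code at a time, which is the point of streaming the export instead of collecting the whole result into a list.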

spring data jpa specification join fetch is not working

I am trying to use a Spring Data JPA Specification to query data, but I have run into a problem.
The Java code is as below:
List<NoticeEntity> studentNoticeEntityList = noticeRepository
        .findAll((root, criteriaQuery, criteriaBuilder) -> {
            criteriaQuery.distinct(true);
            root.fetch(NoticeEntity_.contentEntitySet, JoinType.LEFT);
            Predicate restrictions = criteriaBuilder.conjunction();
            SetJoin<NoticeEntity, UserNoticeEntity> recipientNoticeJoin = root
                    .join(NoticeEntity_.recipientNoticeEntitySet, JoinType.INNER);
            recipientNoticeJoin.on(criteriaBuilder.equal(
                    recipientNoticeJoin.get(UserNoticeEntity_.recipientStatus), NoticeRecipientStatus.Unread));
            Join<UserNoticeEntity, WeChatUserEntity> recipientUserJoin = recipientNoticeJoin
                    .join(UserNoticeEntity_.user);
            restrictions = criteriaBuilder.and(restrictions,
                    criteriaBuilder.equal(recipientUserJoin.get(WeChatUserEntity_.id), id));
            // recipientNoticeJoin.fetch(UserNoticeEntity_.user, JoinType.INNER);
            return restrictions;
        });
When I comment out the line "recipientNoticeJoin.fetch(UserNoticeEntity_.user, JoinType.INNER);", it works fine, but when I uncomment it, I get this error:
org.hibernate.QueryException: query specified join fetching, but the owner of the fetched association was not present in the select list
So, I am wondering if join fetch is supported by using Specification way, or there is something wrong with my code.
I know there is another way using @Query("some hql"), but I would prefer to use the Specification approach.
Thanks a lot.
The error specifies that you're missing an entity from your select list. Try this:
criteriaQuery.multiselect(root, root.get(NoticeEntity_.recipientNoticeEntitySet));
Also, Hibernate may run a count query first to determine the number of results, and this can cause the above error. You can avoid this by checking the result type of the query (criteriaQuery.getResultType()) before adding the fetch, and skipping the fetch when the query is the count query.
Eager fetching in a Spring Specification
