spring-data: cache a query's total count

I'm using Spring Data JPA with Querydsl. I have a method that returns query results in pages that include the total count. Getting the total count is expensive, and I would like to cache it. How is that possible?
My naive approach
@Cacheable("queryCount")
private long getCount(JPAQuery query){
return query.count();
}
does not work (to make it work the way I want, the cache key should be just the criteria, not the whole query). Anyway, I tested it, it did not work, and then I found this: Spring 3.1 @Cacheable - method still executed
As I understand it, I can only cache public interface methods. However, in that method I would need to cache a property of the return value, e.g.
Page<T> findByComplexProperty(...)
I would need to cache
page.getTotalElements();
Annotating the whole method works (it is cached), but not the way I would like. Assume getting the total count takes 30 seconds. Then for every new page request the user has to wait 30 seconds. If they go back a page, the cache is used, but I want the count query to run exactly once and the count to be fetched from the cache afterwards.
How can I do that?

My solution was to autowire the cache manager in the class creating the complex query:
@Autowired
private CacheManager cacheManager;
and then create a simple private method getCount
private long getCount(JPAQuery query) {
Predicate whereClause = query.getMetadata().getWhere();
String key = whereClause.toString();
Cache cache = this.cacheManager.getCache(QUERY_CACHE);
Cache.ValueWrapper value = cache.get(key);
if (value == null) {
Long result = query.count();
cache.put(key, result);
return result;
}
return (Long)value.get();
}
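A minimal sketch of the same idea without Spring, assuming the string form of the where clause is a stable key for the criteria. CountCache, getCount and evict are hypothetical names, and a plain ConcurrentHashMap stands in for the Spring Cache. The count query is passed in as a LongSupplier so that it only runs on a cache miss.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical helper: caches one total count per criteria key.
class CountCache {
    private final Map<String, Long> counts = new ConcurrentHashMap<>();

    // Runs countQuery only when no count is cached for this key.
    long getCount(String criteriaKey, LongSupplier countQuery) {
        return counts.computeIfAbsent(criteriaKey, k -> countQuery.getAsLong());
    }

    // Drops a cached count, e.g. after the underlying data changes.
    void evict(String criteriaKey) {
        counts.remove(criteriaKey);
    }

    public static void main(String[] args) {
        CountCache cache = new CountCache();
        int[] executions = {0};
        LongSupplier expensiveCount = () -> { executions[0]++; return 1234L; };
        // Two page requests with the same criteria share one count query.
        long first = cache.getCount("name = 'foo'", expensiveCount);
        long second = cache.getCount("name = 'foo'", expensiveCount);
        System.out.println(first + " " + second + " executions=" + executions[0]);
    }
}
```

With a real CacheManager, the eviction or expiry settings of the underlying cache would decide how stale the count is allowed to get.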

Related

Spring Pagination and count query

I'm developing a REST API managing Module objects. For the UI, I need pagination and the total number of pages.
I know when Spring uses Page<T> an additional count query is used (to get the total number of pages) which is an overhead cost.
I need this total number of pages for the UI. But only once (no need to execute again the count query for each new page).
So I was thinking of exposing two endpoints :
getting the total number of elements
getting the data (so I'm returning a List<Module> instead of Page<Module> because I don't want to execute this extra count query for each page request)
Something like this :
@RestController
@RequestMapping("/api/modules")
public class ModuleApi {
private final ModuleService service;
@GetMapping("/count")
public Long count() {
return service.countModules();
}
@GetMapping
public List<Module> find(
@RequestParam("name") String name,
@RequestParam(value = "page", required = false, defaultValue = "0") Integer page,
@RequestParam(value = "size", required = false, defaultValue = "10") Integer size
) {
return service.find(PageRequest.of(page, size));
}
}
Is this a good design ?
Counting once means your count will get outdated as more Modules are inserted into your database in which case the count is no longer relevant.
A better design would be to work with Spring's Slice<Module>, drop the count altogether, and implement the solution on the front end. Think of how some sites fetch more results only when you reach the bottom of the page.
However this may cost a lot of effort and time on your architecture so your proposal should be fine.
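To illustrate how a slice can report "there is more" without a count query, here is a rough sketch that reads one row more than the page size. SliceSketch is a hypothetical stand-in, not Spring Data's Slice, and an in-memory list plays the role of the table.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Spring Data's Slice: one page of content plus a hasNext flag.
class SliceSketch {
    final List<String> content;
    final boolean hasNext;

    SliceSketch(List<String> content, boolean hasNext) {
        this.content = content;
        this.hasNext = hasNext;
    }

    // Reads pageSize + 1 rows from the page offset; the extra row only signals that more data exists.
    static SliceSketch fetch(List<String> table, int page, int pageSize) {
        int from = Math.min(page * pageSize, table.size());
        int to = Math.min(from + pageSize + 1, table.size());
        List<String> rows = new ArrayList<>(table.subList(from, to));
        boolean hasNext = rows.size() > pageSize;
        if (hasNext) {
            rows.remove(rows.size() - 1);
        }
        return new SliceSketch(rows, hasNext);
    }

    public static void main(String[] args) {
        List<String> table = List.of("a", "b", "c", "d", "e");
        for (int page = 0; ; page++) {
            SliceSketch slice = fetch(table, page, 2);
            System.out.println(slice.content + " hasNext=" + slice.hasNext);
            if (!slice.hasNext) break;
        }
    }
}
```

The front end only needs hasNext to decide whether to request another page, so no total is ever computed.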

Memory leak with Criteria API Pageable

I implemented pageable functionality into Criteria API query and I noticed increased memory usage during query execution. I also used spring-data-jpa method query to return same result, but there memory is cleaned up after every batch is processed. I tried detaching, flushing, clearing objects from EntityManager, but memory use would keep going up, occasionally it will drop but not as much as with method queries. My question is what could cause this memory use if objects are detached and how to deal with it?
Memory usage with Criteria API pageable:
Memory usage with method query:
Code
Since I'm also updating entities retrieved from the DB, I save the ID of the last processed entity, so that when an entity gets updated the query doesn't skip the next selected page. The code below is not from the real app I'm working on; it is just a recreation of the issue I'm having.
Repository code:
@Override
public Slice<Player> getPlayers(int lastId, Pageable pageable) {
List<Predicate> predicates = new ArrayList<>();
CriteriaBuilder criteriaBuilder = entityManager.getCriteriaBuilder();
CriteriaQuery<Player> criteriaQuery = criteriaBuilder.createQuery(Player.class);
Root<Player> root = criteriaQuery.from(Player.class);
predicates.add(criteriaBuilder.greaterThan(root.get("id"), lastId));
criteriaQuery.where(criteriaBuilder.and(predicates.toArray(Predicate[]::new)));
criteriaQuery.orderBy(criteriaBuilder.asc(root.get("id")));
var query = entityManager.createQuery(criteriaQuery);
if (pageable.isPaged()) {
int pageSize = pageable.getPageSize();
int offset = pageable.getPageNumber() > 0 ? pageable.getPageNumber() * pageSize : 0;
// Fetch additional element and skip it based on the pageSize to know hasNext value.
query.setMaxResults(pageSize + 1);
query.setFirstResult(offset);
var resultList = query.getResultList();
boolean hasNext = pageable.isPaged() && resultList.size() > pageSize;
return new SliceImpl<>(hasNext ? resultList.subList(0, pageSize) : resultList, pageable, hasNext);
} else {
return new SliceImpl<>(query.getResultList(), pageable, false);
}
}
Iterating through pageables:
@Override
public Slice<Player> getAllPlayersPageable() {
int lastId = 0;
boolean hasNext = false;
Pageable pageable = PageRequest.of(0, 200);
do {
var players = playerCriteriaRepository.getPlayers(lastId, pageable);
if(!players.isEmpty()){
lastId = players.getContent().get(players.getContent().size() - 1).getId();
for(var player : players){
System.out.println(player.getFirstName());
entityManager.detach(player);
}
}
hasNext = players.hasNext();
} while (hasNext);
return null;
}
I think you are running into a query plan cache issue here that is related to the use of the JPA Criteria API and how numeric values are handled. Hibernate will render all numeric values as literals into an intermediary HQL query string which is then compiled. As you can imagine, every "scroll" to the next page will be a new query string so you gradually fill up the query plan cache.
One possible solution is to use a library like Blaze-Persistence which has a custom JPA Criteria API implementation and a Spring Data integration that will avoid these issues and at the same time improve the performance of your queries due to a better pagination implementation.
All your code would stay the same, you just have to include the integration and configure it as documented in the setup section.
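A toy model of the effect described above, not Hibernate's actual implementation: a cache keyed by query string grows with every new literal, while a bind parameter keeps the key constant. PlanCacheSketch and its query strings are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a query plan cache keyed by query string; not Hibernate's actual implementation.
class PlanCacheSketch {
    final Map<String, String> plans = new HashMap<>();

    // "Compiles" a query string once and reuses the plan for identical strings.
    String plan(String queryString) {
        return plans.computeIfAbsent(queryString, q -> "plan for: " + q);
    }

    public static void main(String[] args) {
        PlanCacheSketch inlined = new PlanCacheSketch();
        PlanCacheSketch bound = new PlanCacheSketch();
        for (int lastId = 0; lastId < 1000; lastId += 200) {
            // Literal rendered into the string: every page yields a new cache key.
            inlined.plan("select p from Player p where p.id > " + lastId + " order by p.id");
            // Bind parameter: the string, and so the cache key, never changes.
            bound.plan("select p from Player p where p.id > :lastId order by p.id");
        }
        System.out.println("inlined entries: " + inlined.plans.size());
        System.out.println("bound entries: " + bound.plans.size());
    }
}
```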

Using findOne() / findAll() in spring boot for Cassandra DB

During code optimization I found a few places where I was using findOne() within a for loop:
public List<User> validateUsers(List<String> userIds) {
List<User> validUsers = new ArrayList<>();
for ( String userId : userIds) {
User user = userRepository.findOne(userId); //Network hit :: expensive call
//Perform validations
...
//Add valid users to validUsers list
...
}
return validUsers;
}
The above method takes a long time if I pass a huge list of users to validate (around 5 seconds for 300 users).
Then I changed the method to use findAll() and perform the validations on the result collection:
public List<User> validateUsers(List<String> userIds) {
List<User> validUsers = new ArrayList<>();
Iterable<User> itr = userRepository.findAll(userIds); //Only one Network hit
for ( User user : itr) {
//Perform validations
...
//Add valid users to validUsers list
...
}
return validUsers;
}
Now, for 300 users, the results come back in 100 ms.
Question: are there any side effects of using findAll(), considering the underlying structure of Cassandra? Also, I am using CrudRepository. Should I use CassandraRepository?
The following are the things to consider when attempting this:
How big the users table is, if you are using findAll.
The partition keys for the user table.
As Cassandra queries are faster on the primary key fields, findOne might perform better with a large amount of data.
However, can you try
List<T> findAllById(Iterable<ID> ids);
from org.springframework.data.cassandra.repository.CassandraRepository

How can I cache a database query with "IN" operator?

I'm using Spring Boot with Spring Cache. I have a method that, given a list of ids, returns a list of Food that match with those ids:
public List<Food> get(List<Integer> ids) {
return "select * from FOOD where FOOD_ID in ids"; // << pseudo-code
}
I want to cache the results by id. Imagine that I do:
List<Food> foods = get(asList(1, 5, 7));
and then:
List<Food> foods = get(asList(1, 5));
I want the Food with id 1 and the Food with id 5 to be retrieved from the cache. Is that possible?
I know I can do a method like:
@Cacheable(key = "#id")
public Food getById(id){
...
}
and iterate over the ids list, calling it each time, but in that case I don't take advantage of the SQL IN operator, right? Thanks.
The key attribute of @Cacheable takes a SpEL expression to calculate the cache key. SpEL does not support Java lambdas, so a stream pipeline cannot be written inline in the expression, but you should be able to do something like
@Cacheable(key = "#ids.toString()")
This would require the ids to always be in the same order, and it caches the whole result list under one key rather than each Food by its id.
https://docs.spring.io/spring/docs/current/spring-framework-reference/html/cache.html#cache-annotations-cacheable-key
A better option would be to create a class to wrap around your ids that would be able to generate the cache key for you, or some kind of utility class function.
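A minimal sketch of such a utility function, assuming the key should not depend on the order or duplication of the ids. FoodIdsKey is a hypothetical name, not part of Spring.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical utility: builds an order-insensitive cache key from a list of ids.
class FoodIdsKey {
    static String of(List<Integer> ids) {
        return ids.stream()
                .distinct()
                .sorted()
                .map(String::valueOf)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Both calls produce the same key.
        System.out.println(of(List.of(7, 1, 5)));
        System.out.println(of(List.of(1, 5, 7)));
    }
}
```

A static method like this can be called from the SpEL key expression through the T() type operator, for example key = "T(com.example.FoodIdsKey).of(#ids)", where the package name is only a placeholder.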
Another possible solution without @Cacheable would be to inject the cache manager into the class, like:
@Autowired
private CacheManager cacheManager;
You can then retrieve the food cache from the cache manager by name
Cache cache = cacheManager.getCache("cache name");
Then you could adjust your method to take in the list of ids and manually add and get the values from the cache
cache.get(id);
cache.put(id, food);
You will most likely still not be able to use the SQL IN clause, but you are at least handling the iteration inside the method and not everywhere this method is called, and leveraging the cache whenever possible.
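public List<Food> get(List<Integer> ids) {
List<Food> result = new ArrayList<>();
for (Integer id : ids) {
// Attempt to fetch from the cache; Cache.get(key, type) returns null on a miss
Food food = cache.get(id, Food.class);
if (food == null) {
// Fetch from DB (lookup not shown), then cache the loaded value
food = loadFromDb(id); // hypothetical helper
cache.put(id, food);
}
result.add(food);
}
return result;
}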
Relevant Javadocs:
http://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/cache/CacheManager.html
http://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/cache/Cache.html
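As a variation on the manual loop, the cache misses can be collected first and loaded in a single batched lookup, which would let the database side use an IN clause after all. This is only a sketch under assumptions: FoodLookupSketch and loadByIds are hypothetical names, a HashMap stands in for the Spring Cache, and foods are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Sketch: serve cached ids from a map and load all misses with one batched lookup.
class FoodLookupSketch {
    // Stand-in for a Spring Cache; maps id -> food name.
    final Map<Integer, String> cache = new HashMap<>();
    // Hypothetical batched loader, e.g. backed by a query using "FOOD_ID in (:ids)".
    final Function<List<Integer>, Map<Integer, String>> loadByIds;

    FoodLookupSketch(Function<List<Integer>, Map<Integer, String>> loadByIds) {
        this.loadByIds = loadByIds;
    }

    List<String> get(List<Integer> ids) {
        // Collect the ids that are not cached yet.
        List<Integer> misses = new ArrayList<>();
        for (Integer id : ids) {
            if (!cache.containsKey(id) && !misses.contains(id)) {
                misses.add(id);
            }
        }
        // One round trip for all misses, then remember each result by id.
        if (!misses.isEmpty()) {
            cache.putAll(loadByIds.apply(misses));
        }
        // Assemble the result in request order; ids unknown to the loader are skipped.
        List<String> result = new ArrayList<>();
        for (Integer id : ids) {
            String food = cache.get(id);
            if (food != null) {
                result.add(food);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> queries = new ArrayList<>();
        FoodLookupSketch lookup = new FoodLookupSketch(ids -> {
            queries.add(List.copyOf(ids));
            Map<Integer, String> rows = new HashMap<>();
            for (Integer id : ids) rows.put(id, "food-" + id);
            return rows;
        });
        System.out.println(lookup.get(List.of(1, 5, 7)));
        System.out.println(lookup.get(List.of(1, 5, 9)));
        System.out.println("batched queries: " + queries);
    }
}
```

In the second call only id 9 reaches the loader, because ids 1 and 5 were cached individually by the first call.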

How to query data via Spring data JPA with user defined offset and limit (Range)

Is it possible to fetch data in user defined ranges [int starting record -int last record]?
In my case the user will define, in the query string, the range in which they want to fetch data.
I have tried something like this
Pageable pageable = new PageRequest(0, 10);
Page<Project> list = projectRepository.findAll(spec, pageable);
where spec is my defined Specification, but unfortunately this does not help.
Maybe I am doing something wrong here.
I have looked at the other methods provided by Spring Data JPA, but none of them are of much help.
The user can enter something like this: localhost:8080/Section/employee? range{"columnName":name,"from":6,"to":20}
This says to fetch employee data, and it will fetch 15 records (sorted by columnName; the sorting does not matter for now).
If you can suggest something better, that would be great. If you think I have not provided enough information, please let me know and I will provide it.
Update: I do not want to use native queries or createQuery statements (unless I have no other option).
Maybe something like this:
Pageable pageable = new PageRequest(0, 10);
Page<Project> list = projectRepository.findAll(spec, new pageable(int startIndex,int endIndex){
// here my logic.
});
If you have better options, you can suggest me that as well.
Thanks.
Your approach didn't work because new PageRequest(0, 10); doesn't do what you think. As stated in the docs, the input arguments are page and size, not limit and offset.
As far as I know (and somebody correct me if I'm wrong), there is no "out of the box" support for what you need in the default Spring Data repositories. But you can create a custom implementation of Pageable that takes limit/offset parameters. Here is a basic example - Spring data Pageable and LIMIT/OFFSET
We can do this with pagination by setting the database table column name, value, and row counts as below:
@Transactional(readOnly = true)
public List<Employee> queryEmployeeDetails(String columnName, String columnData, int startRecord, int endRecord) {
// columnName is concatenated into the HQL, so validate it against the known property names first
Query query = sessionFactory.getCurrentSession().createQuery("from Employee emp where emp." + columnName + " = :columnData");
query.setParameter("columnData", columnData);
query.setFirstResult(startRecord);
// setMaxResults takes a row count, not an end index
query.setMaxResults(endRecord - startRecord);
List<Employee> list = (List<Employee>) query.list();
return list;
}
If I am understanding your problem correctly, you want your repository to allow the user to
1. Provide criteria for the query (through Specification)
2. Provide the column to sort by
3. Provide the range of results to retrieve.
If my understanding is correct, then:
To achieve 1, you can make use of JpaSpecificationExecutor from Spring Data JPA, which allows you to pass in a Specification for the query.
Both 2 and 3 are achievable in JpaSpecificationExecutor by use of Pageable. Pageable allows you to provide the starting index, number of records, and sorting columns for your query. You will need to implement your own range-based Pageable. PageRequest is a good reference for what you can implement (or you can extend it, I believe).
So I got this working as one of the answers suggested: I implemented my own Pageable and overrode getPageSize(), getOffset(), and getSort(). That's it. (In my case I did not need more.)
public Range(int startIndex, int endIndex, String sortBy) {
this.startIndex = startIndex;
this.endIndex = endIndex;
this.sortBy = sortBy;
}
@Override
public int getPageSize() {
if (endIndex == 0)
return 0;
return endIndex - startIndex;
}
@Override
public int getOffset() {
// TODO Auto-generated method stub
return startIndex;
}
@Override
public Sort getSort() {
// TODO Auto-generated method stub
if (sortBy != null && !sortBy.equalsIgnoreCase(""))
return new Sort(Direction.ASC, sortBy);
else
return new Sort(Direction.ASC, "id");
}
where startIndex and endIndex are the starting and last index of the records.
To use it:
repository.findAll(spec, new Range(0, 20, "id"));
There is no offset parameter you can simply pass. However, there is a very simple solution for this, as long as the offset is always a multiple of the limit:
int pageNumber = offset / limit; // integer division; exact only when offset % limit == 0
PageRequest pReq = PageRequest.of(pageNumber, limit);
The client just has to keep track of the offset instead of the page number. By this I mean your controller would receive the offset instead of the page number.
Hope this helps!
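For reference, a small sketch of the arithmetic involved, with hypothetical names (OffsetPaging, pageNumber, rangeToOffsetLimit). It assumes a 1-based inclusive range such as from=6, to=20 from the question, and it rejects offsets that do not fall on a page boundary, since a page number cannot express those.

```java
// Sketch of the offset/limit arithmetic discussed in this thread; names are hypothetical.
class OffsetPaging {
    // Zero-based page number for an offset; only defined when offset is a multiple of limit.
    static int pageNumber(int offset, int limit) {
        if (limit <= 0 || offset < 0) {
            throw new IllegalArgumentException("limit must be > 0 and offset >= 0");
        }
        if (offset % limit != 0) {
            throw new IllegalArgumentException("offset is not aligned to a page boundary");
        }
        return offset / limit;
    }

    // Converts a 1-based inclusive range [from, to] into {offset, limit}.
    static int[] rangeToOffsetLimit(int from, int to) {
        if (from < 1 || to < from) {
            throw new IllegalArgumentException("expected 1 <= from <= to");
        }
        return new int[] {from - 1, to - from + 1};
    }

    public static void main(String[] args) {
        System.out.println(pageNumber(40, 20));
        int[] window = rangeToOffsetLimit(6, 20);
        System.out.println("offset=" + window[0] + " limit=" + window[1]);
    }
}
```

For unaligned ranges like 6 to 20, a custom Pageable that returns the offset directly, as in the Range class earlier in the thread, is the better fit.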

Resources