How can I cache a database query with "IN" operator? - spring

I'm using Spring Boot with Spring Cache. I have a method that, given a list of ids, returns a list of Food that match with those ids:
public List<Food> get(List<Integer> ids) {
return "select * from FOOD where FOOD_ID in ids"; // << pseudo-code
}
I want to cache the results by id. Imagine that I do:
List<Food> foods = get(asList(1, 5, 7));
and then:
List<Food> foods = get(asList(1, 5));
I want Food with id 1 and Food with id 5 to be retrieved from the cache. Is that possible?
I know I can do a method like:
@Cacheable(key = "#id")
public Food getById(Integer id) {
...
}
and iterate over the list of ids, calling it once per id, but in that case I don't take advantage of the SQL IN operator, right? Thanks.

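The key attribute of @Cacheable takes a SpEL expression that computes the cache key. SpEL does not support Java lambdas, so a stream pipeline cannot be written there, but you should be able to call a static helper, for example:
@Cacheable(key = "T(org.springframework.util.StringUtils).collectionToCommaDelimitedString(#ids)")
This caches the whole result list under a single key, so the ids must always arrive in the same order to get a hit.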
https://docs.spring.io/spring/docs/current/spring-framework-reference/html/cache.html#cache-annotations-cacheable-key
A better option would be to create a class to wrap around your ids that would be able to generate the cache key for you, or some kind of utility class function.
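A utility function of that kind could look like the sketch below. IdListKey is a hypothetical name, not part of Spring; it only assumes that the key should ignore the order of the ids and any duplicates.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds one cache key for a list of ids,
// independent of the order of the ids and of duplicates.
final class IdListKey {
    private IdListKey() {}

    static String of(List<Integer> ids) {
        return ids.stream()
                .distinct()              // drop repeated ids
                .sorted()                // make the key order-independent
                .map(String::valueOf)
                .collect(Collectors.joining(","));
    }
}
```

Assuming the class lives in a package such as com.example (an assumption), it could be referenced from SpEL as @Cacheable(key = "T(com.example.IdListKey).of(#ids)"). Note that this still caches each distinct list as one entry, so get(asList(1, 5)) would not reuse the entry stored for get(asList(1, 5, 7)).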
Another possible solution without @Cacheable would be to inject the cache manager into the class:
@Autowired
private CacheManager cacheManager;
You can then retrieve the food cache from the cache manager by name:
Cache cache = cacheManager.getCache("cache name");
Then you could adjust your method to take the list of ids and manually get and put the values in the cache:
cache.get(id);
cache.put(id, food);
You will most likely still not be able to use the SQL IN clause, but you are at least handling the iteration inside the method and not everywhere this method is called, and leveraging the cache whenever possible.
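public List<Food> get(List<Integer> ids) {
    List<Food> result = new ArrayList<>();
    for (Integer id : ids) {
        // Attempt to fetch from cache; the typed get() returns null on a miss
        Food food = cache.get(id, Food.class);
        if (food == null) {
            // Fetch from DB (foodRepository.findById is a hypothetical single-row lookup)
            food = foodRepository.findById(id);
            cache.put(id, food);
        }
        result.add(food);
    }
    return result;
}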
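If you want to keep the IN clause while still caching per id, one option is to check the cache for every id first and send only the misses to the database in a single query. The following is a minimal, Spring-free sketch of that idea, not a definitive implementation. Everything in it is an assumption made for illustration: the ConcurrentHashMap stands in for the Spring Cache, the findAllByIdIn function stands in for the "select * from FOOD where FOOD_ID in (...)" query, and Food is reduced to two fields.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical entity, reduced to the fields the sketch needs.
final class Food {
    final int id;
    final String name;
    Food(int id, String name) { this.id = id; this.name = name; }
}

final class CachedFoodLookup {
    // Stand-in for a Spring Cache (cache.get(id, Food.class) / cache.put(id, food)).
    private final Map<Integer, Food> cache = new ConcurrentHashMap<>();
    // Stand-in for the single IN query; receives only the ids missing from the cache.
    private final Function<List<Integer>, List<Food>> findAllByIdIn;

    CachedFoodLookup(Function<List<Integer>, List<Food>> findAllByIdIn) {
        this.findAllByIdIn = findAllByIdIn;
    }

    List<Food> get(List<Integer> ids) {
        Map<Integer, Food> found = new LinkedHashMap<>();
        List<Integer> misses = new ArrayList<>();
        // 1. Split the requested ids into cache hits and misses.
        for (Integer id : ids) {
            Food cached = cache.get(id);
            if (cached != null) {
                found.put(id, cached);
            } else if (!misses.contains(id)) {
                misses.add(id);
            }
        }
        // 2. One database round trip for all misses, then cache each row by id.
        if (!misses.isEmpty()) {
            for (Food food : findAllByIdIn.apply(misses)) {
                cache.put(food.id, food);
                found.put(food.id, food);
            }
        }
        // 3. Return results in the order the ids were requested, skipping unknown ids.
        List<Food> result = new ArrayList<>();
        for (Integer id : ids) {
            Food food = found.get(id);
            if (food != null) {
                result.add(food);
            }
        }
        return result;
    }
}
```

With this shape, get(asList(1, 5, 7)) issues one query for all three ids, and a later get(asList(1, 5)) is answered from the cache without touching the database.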
Relevant Javadocs:
http://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/cache/CacheManager.html
http://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/cache/Cache.html

Related

Using findOne() / findAll() in spring boot for Cassandra DB

During code optimization I found a few places where I was calling findOne() inside a for loop:
public List<User> validateUsers(List<String> userIds) {
List<User> validUsers = new ArrayList<>();
for ( String userId : userIds) {
User user = userRepository.findOne(userId); //Network hit :: expensive call
//Perform validations
...
//Add valid users to validUsers list
...
}
return validUsers;
}
The method above takes a long time if I pass a large list of users to validate (around 5 seconds for 300 users).
I then changed it to use findAll() and run the validations on the resulting collection:
public List<User> validateUsers(List<String> userIds) {
List<User> validUsers = new ArrayList<>();
Iterable<User> itr = userRepository.findAll(userIds); //Only one Network hit
for ( User user : itr) {
//Perform validations
...
//Add valid users to validUsers list
...
}
return validUsers;
}
Now, for 300 users, the results come back in about 100 ms.
Question: are there any side effects of using findAll(), considering the underlying structure of Cassandra? Also, I am using CrudRepository. Should I use CassandraRepository instead?
Here are the things to consider when attempting this:
How big the users table is, if you are using findAll.
The partition keys of the users table.
Since Cassandra queries are faster on the primary key fields, findOne might perform better with a large amount of data.
However, you could try
List<T> findAllById(Iterable<ID> ids);
from org.springframework.data.cassandra.repository.CassandraRepository.

GraphQL Java: Using @Batched DataFetcher

I know how to retrieve a bean from a service in a datafetcher:
public class MyDataFetcher implements DataFetcher {
...
@Override
public Object get(DataFetchingEnvironment environment) {
return myService.getData();
}
}
But schemas with nested lists should use a BatchedExecutionStrategy and create batched DataFetchers with get() methods annotated @Batched (see graphql-java doc).
But where do I put my getData() call then?
///// Where to put this code?
List list = myService.getData();
/////
public class MyDataFetcher implements DataFetcher {
@Batched
public Object get(DataFetchingEnvironment environment) {
return list.get(environment.getIndex()); // where to get the index?
}
}
WARNING: The original BatchedExecutionStrategy has been deprecated and will get removed. The current preferred solution is the Data Loader library. Also, the entire execution engine is getting replaced in the future, and the new one will again support batching "natively". You can already use the new engine and the new BatchedExecutionStrategy (both in nextgen packages) but they have limited support for instrumentations. The answer below applies equally to both the legacy and the nextgen execution engine.
Look at it like this. Normal DataFetchers receive a single object as source (DataFetchingEnvironment#getSource) and return a single object as a result. For example, if you had a query like:
{
  user (name: "John") {
    company {
      revenue
    }
  }
}
your company resolver (fetcher) would get a User object as source, and would be expected to somehow return a Company based on that, e.g.:
User owner = (User) environment.getSource();
Company company = companyService.findByOwner(owner);
return company;
Now, in the exact same scenario, if your DataFetcher was batched, and you used BatchedExecutionStrategy, instead of receiving a User and returning a Company, you'd receive a List<User> and would return a List<Company> instead.
E.g.
List<User> owners = (List<User>) environment.getSource();
List<Company> companies = companyService.findByOwners(owners);
return companies;
Notice that this means your underlying logic must have a way to fetch multiple things at once, otherwise it wouldn't be batched. So your myService.getData call would need to change, unless it can already fetch data for multiple source objects in one go.
Also notice that batched resolution makes sense in nested queries only, as the top level resolver can already fetch a list of objects, without the need for batching.
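To make the "fetch multiple things at once" requirement concrete, here is a small sketch without any GraphQL dependency. It assumes, as I understand the batched contract, that the returned list has to line up with the list of sources position by position. BatchedCompanyResolver and its methods are hypothetical names, and users and companies are reduced to plain strings for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class BatchedCompanyResolver {
    private BatchedCompanyResolver() {}

    // Hypothetical stand-in for companyService.findByOwners:
    // one call that looks up the companies of all owners at once.
    static Map<String, String> findByOwners(List<String> owners, Map<String, String> table) {
        Map<String, String> result = new HashMap<>();
        for (String owner : owners) {
            if (table.containsKey(owner)) {
                result.put(owner, table.get(owner));
            }
        }
        return result;
    }

    // Body of a batched get(): a list of sources in, a list of results out.
    // Produces one entry per source, in the same order; null where nothing was found.
    static List<String> resolve(List<String> owners, Map<String, String> table) {
        Map<String, String> byOwner = findByOwners(owners, table);
        List<String> companies = new ArrayList<>(owners.size());
        for (String owner : owners) {
            companies.add(byOwner.get(owner));
        }
        return companies;
    }
}
```

The point of the intermediate map is that a bulk lookup rarely returns rows in the order of the inputs, so the results are re-aligned with the sources before being returned.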

spring-data: cache a query's total count

I'm using Spring Data JPA with Querydsl. I have a method that returns query results in pages, including the total count. Getting the total count is expensive and I would like to cache it. How is that possible?
My naive approach
@Cacheable("queryCount")
private long getCount(JPAQuery query){
return query.count();
}
does not work (to make it work the way I want, the actual cache key should be just the criteria, not the whole query). Anyway, I tested it, it did not work, and then I found this: Spring 3.1 @Cacheable - method still executed
The way I understand it, I can only cache public interface methods. However, in said method I would need to cache a property of the return value, e.g.
Page<T> findByComplexProperty(...)
I would need to cache
page.getTotalElements();
Annotating the whole method works (it is cached), but not the way I would like. Assume getting the total count takes 30 seconds. The user then has to wait 30 seconds for every new page request. If they go back a page, the cache is used, but I want the count to run exactly once and be fetched from the cache after that.
How can I do that?
My solution was to autowire the cache manager in the class creating the complex query:
@Autowired
private CacheManager cacheManager;
and then create a simple private method getCount that uses the query's where clause as the cache key:
private long getCount(JPAQuery query) {
Predicate whereClause = query.getMetadata().getWhere();
String key = whereClause.toString();
Cache cache = this.cacheManager.getCache(QUERY_CACHE);
Cache.ValueWrapper value = cache.get(key);
if (value == null) {
Long result = query.count();
cache.put(key, result);
return result;
}
return (Long)value.get();
}

Alternative to use a method in a query to the database in c# using entity framework

I separated my application into a DAL, a BL and a UI.
I used Entity Framework Code First through repositories to access the SQL database.
public class Person{
...
}
public class PersonRepository{
Create(...){...}
Update(...){...}
Delete(...){...}
GetById(...){...}
Query(...){...}
...
}
Now, in the BL, I'm working on a method to get all the persons who live near an address:
public List<Person> GetPersonsNear(string address){
...
}
private bool AddressesAreClose(string address1, string address2)
{
...
}
The problem is that LINQ doesn't let me use my method in a query passed to the repository's Query method:
...
PersonRepository personRepository = new PersonRepository();
var persons = personRepository.Query(p => AddressesAreClose(address, p.Address));
...
Therefore I had to load all the rows of the table into a list and use a simple foreach loop to run the check, keeping only the relevant ones:
...
PersonRepository personRepository = new PersonRepository();
var persons = personRepository.GetAll();
foreach (var person in persons)
{
if (AddressesAreClose(address, person.Address))
...
}
For now I have populated the database with only a few rows to test it, but I'm not sure it will work well with the much larger number of rows it will contain later, especially with all the checks I'm planning to add.
Isn't there a smarter way to do this? I'm open to anything.
Well first of all, you should use generics in your repository, even if it's constrained to Person. This way you can build pipes/filters off your queries to clean up your LINQ queries and facilitate reuse.
Of course, without seeing the full signature/implementation of your Query method, it's hard to tell. But either way, you need to return IEnumerable<Person> or IQueryable<Person> to make the following work.
So, you could turn AddressesAreClose into a pipe/filter, like this:
public static IQueryable<Person> WhereAddressesAreClose(this IQueryable<Person> source, string address)
{
return source.Where(/* your conditions */);
}
Then you can use it in your LINQ query:
var persons = repository
.Query() // Should be IQueryable<Person>
.WhereAddressesAreClose(address)
.ToList();
Depending on the size of your data and whether or not you're implementing caching, you should limit the results on the server (database), not post-query with a foreach loop.
If the performance isn't great, consider adding indexes, using compiled queries or moving to a stored procedure.

Can ExecuteQuery return a DBML generated class without having it fetch all the information for that class?

I have a couple of DBML generated classes which are linked together by an id, e.g.
ClassA {
AID,
XID,
Name
}
ClassB {
AID,
ExtraInfo,
ExtraInfo2
}
When I use something like db.ClassAs.Where(XID == x) and iterate through the result, it ends up executing a query for each ClassA and each ClassB, which is slow.
Alternatively, I've tried to use ExecuteQuery to fetch all the info I care about and have it return a ClassA. Iterating over that ends up doing the same thing, i.e. a lot of individual fetches instead of just one. If I store the result in a ClassC (not associated with a DB entity) that has the fields of interest from both ClassA and ClassB, the query is much faster, but it's annoying because I have just created what is, in my opinion, an unnecessary ClassC.
How can I still use ClassA, which associates to ClassB, and still use ExecuteQuery to run 1 query vs. A*B number of queries?
If you have associations you shouldn't need to use ExecuteQuery().
Here's an example using some imaginary Book Library context and anonymous types for the result:
var results =
Books
.Where(book => book.BookId == 1)
.Select(book =>
new
{
book.Name,
AuthorName = book.Author.Name, //Is a field in an associated table.
book.Publisher, //Is an associated table.
});
EDIT: without anon types
var results =
Books
.Where(book => book.BookId == 1)
.Select(book =>
new BookResult()
{
BookName = book.Name,
AuthorName = book.Author.Name, //Is a field in an associated table.
Publisher = book.Publisher, //Is an associated table.
});

Resources