How to check the cache hit ratio of every cache value in Redis? - spring

I am using Redis with the Spring framework. There are multiple APIs, each with its own cache value, like below:
@Cacheable(value = "API1")
@Cacheable(value = "API2")
Now, using the command below
redis-cli info stats
I can get the overall keyspace hits, but how do I calculate the hits for a specific cache value (for example, value = "API1")?

Related

Hibernate first level cache to hold entities found by a property that is not the ID

I am working on a Java 8 / Spring Boot 2 application and I have noticed that the security module of my app internally uses the findByEmail method of my UserRepository (a standard Spring Data JPA repository). When I enabled Hibernate SQL logging, I discovered that this query is executed multiple times within the same session (security uses it 3-4 times and then my business code uses it a few more times), and each time the query hits the database.
This surprised me, as I expected the result to be cached in Hibernate's first level cache. After reading up on it a little more, I found out that the first level cache only caches the result of findById queries, not others.
Is there any way I can cache the result of the findByEmail query in the first level cache? (I don't want the cache to be shared between sessions, and I don't want to use the second level cache, as I think it should be invalidated right after the current session ends.)
Yes, you can cache the result of a query on a unique property if you annotate the property with the @NaturalId annotation. If you then use the dedicated API to execute the query, the result is stored in the first level cache. An example:
User user = entityManager
        .unwrap(Session.class)
        .bySimpleNaturalId(User.class)
        .load("john@example.com");
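For completeness, a minimal sketch of the entity side, assuming a User entity with an email field as in the question (field and class names are illustrative):

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import org.hibernate.annotations.NaturalId;

@Entity
public class User {

    @Id
    @GeneratedValue
    private Long id;

    // Marked as a natural id: lookups through Session.bySimpleNaturalId()
    // on this property are served from the first level cache within a session.
    @NaturalId
    private String email;

    // getters and setters omitted
}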

Need persistent in-memory cache with multi-key lookups

We have a requirement where we need to look up entries by multiple keys, so we are looking for multiple indexes.
For example:
Trade data contains the below parameters:
Date
Stock
Price
Quantity
Account
We will be storing each trade in a list with Stock as the key. This gives us the ability to query all the trades of a given stock. However, we will also have queries such as "list all the trades in an account", and we would like to serve them from the same cache instead of building a new one. The requirement is for an in-memory cache (Java), as the latency requirement is very low. We also need a persistent cache, so that the cache is re-populated when the application is restarted.
Please let me know if there is any good solution available, as the only options for a persistent cache seem to be the distributed ones.
One way to make these queries faster is to create a TradeMeta object with only the attributes you would like to query on, i.e.
Date, Stock, Price, Quantity, Account
The TradeMeta objects can be stored in a map with an index on each of the above keys. This ensures Hazelcast maintains the relevant buckets for easy lookup internally. Predicates can be run against this tradeMetaMap to fetch the matching keys. Once you have the keys, use getAsync to fetch the full trade objects from tradeMap.
To persist the cache you would need Hazelcast Enterprise HD, which has HD storage and the HotRestartStore.
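A rough sketch of that layout, assuming the Hazelcast 3.x API (the map names and the Trade/TradeMeta classes are illustrative placeholders, not a definitive implementation):

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;
import com.hazelcast.query.Predicate;
import com.hazelcast.query.Predicates;

import java.io.Serializable;
import java.util.Set;
import java.util.concurrent.Future;

public class TradeLookup {

    // Full trade object (illustrative placeholder)
    public static class Trade implements Serializable { /* date, stock, price, quantity, account, ... */ }

    // Slim object holding only the attributes you query on
    public static class TradeMeta implements Serializable {
        public String stock;
        public String account;
        // date, price, quantity ...
    }

    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // "Index" map: tradeId -> TradeMeta, with indexes on the queryable attributes
        IMap<String, TradeMeta> tradeMetaMap = hz.getMap("tradeMeta");
        tradeMetaMap.addIndex("stock", false);   // unordered (hash) index for equality lookups
        tradeMetaMap.addIndex("account", false);

        // Map holding the full trade objects, keyed the same way
        IMap<String, Trade> tradeMap = hz.getMap("trades");

        // All trade keys for a given account
        Predicate byAccount = Predicates.equal("account", "ACC-1");
        Set<String> tradeIds = tradeMetaMap.keySet(byAccount);

        // Fetch the full trade objects asynchronously
        for (String id : tradeIds) {
            Future<Trade> future = tradeMap.getAsync(id);
            // ... consume future.get()
        }

        hz.shutdown();
    }
}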

Spring JPA taking too much time & memory for inserting data in postgres DB

I am working on a Spring Batch app (with 2 GB of memory) that processes data (using select queries to fetch data during processing) and inserts about 1 million processed records into a Postgres DB. I am using Spring Data JPA for this project. But Spring JPA is consuming too much memory while processing these records, and I eventually get an out-of-memory error. I suspect that too many entities are created and never cleared, so I tried clearing the entityManager after a certain number of DB calls, but it didn't help. How can I reduce the memory consumed by JPA? Any suggestion to reduce memory consumption will be highly appreciated.
Possible reasons:
Number of HTTP threads (Undertow starts around 50 threads by default, but you can increase/decrease the number of threads via a property)
Static variables
Use of a cache (memcache, Ehcache, etc.)
Cascade persist
Batch writing
Setting
server.tomcat.max-threads=5
in your application.properties will limit the number of HTTP request handler threads to 5 (the Tomcat default is 200).
For details, you should refer to R1 R2 R3
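On the batch-writing point specifically, a common pattern is to flush and clear the persistence context every N records so already-written entities can be garbage collected, combined with Hibernate JDBC batching. A minimal sketch, assuming a hypothetical MyRecord entity and a chunk size of 50 (tune it to match hibernate.jdbc.batch_size):

import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class RecordBatchWriter {

    private static final int BATCH_SIZE = 50; // keep in line with hibernate.jdbc.batch_size

    @PersistenceContext
    private EntityManager entityManager;

    @Transactional
    public void saveAll(Iterable<MyRecord> records) { // MyRecord is a placeholder entity
        int count = 0;
        for (MyRecord record : records) {
            entityManager.persist(record);
            if (++count % BATCH_SIZE == 0) {
                entityManager.flush();  // push the current batch of INSERTs to Postgres
                entityManager.clear();  // detach the written entities so they can be GC'd
            }
        }
        entityManager.flush();
        entityManager.clear();
    }
}

Together with spring.jpa.properties.hibernate.jdbc.batch_size=50 (and hibernate.order_inserts=true), this keeps the persistence context small instead of holding a million managed entities at once.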

Paging SELECT query results from Cassandra in Spring Boot application

During my research I came across this JIRA issue for Spring Data Cassandra:
https://jira.spring.io/browse/DATACASS-56
According to the issue above, SDC currently does not support pagination in a Spring app due to the structure of Cassandra. However, I'm thinking: if I can pull the entire row list into a Java List, can I paginate that list? I don't have much experience with Spring, but is there something I am missing when I assume this can be done?
Cassandra does not support pagination in the sense of pointing to a specific page (limit/offset); instead it generates a continuation token (PagingState), which is a set of bytes. Pulling a List of records will load all records into memory and possibly exhaust your memory (depending on the amount of data).
Spring Data Cassandra 1.5.0 RC1 comes with a streaming API in CassandraTemplate:
Iterator<Person> it = template.stream("SELECT * FROM person WHERE … ;", Person.class);
while (it.hasNext()) {
    // …
}
CassandraTemplate.stream(…) returns an Iterator that operates on an underlying ResultSet. The DataStax driver uses a configurable fetch size (5000 rows by default) for bulk fetching. With streaming data access you fetch only as much data as you need to process. The data is not retained by the driver or by Spring Data Cassandra; once a fetched bulk has been consumed from the Iterator, the underlying ResultSet fetches the next bulk itself.
The other alternative is to use the ResultSet directly, which gives you access to the PagingState, and do all the continuation/paging work yourself. You would, however, lose all the higher-level benefits of Spring Data Cassandra.
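If you do take the manual route, here is a sketch of the token handling with the DataStax Java driver 3.x (the contact point, keyspace, and fetch size are illustrative assumptions; the person table is taken from the query above):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PagingState;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class ManualPaging {

    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("mykeyspace")) {

            // First page: the fetch size controls how many rows come back per page
            Statement stmt = new SimpleStatement("SELECT * FROM person").setFetchSize(100);
            ResultSet rs = session.execute(stmt);

            // Continuation token for the next page; serialize it and hand it to the client
            PagingState pagingState = rs.getExecutionInfo().getPagingState();
            String token = (pagingState == null) ? null : pagingState.toString();

            // Consume only the rows of the current page
            int remaining = rs.getAvailableWithoutFetching();
            for (Row row : rs) {
                // ... map the row
                if (--remaining == 0) {
                    break;
                }
            }

            // Next request: resume where the previous page ended
            if (token != null) {
                Statement next = new SimpleStatement("SELECT * FROM person")
                        .setFetchSize(100)
                        .setPagingState(PagingState.fromString(token));
                ResultSet nextPage = session.execute(next);
                // ... same mapping as above
            }
        }
    }
}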

Read Cache Data from File system or diskpath

If overflowToDisk is enabled and a disk path is configured, should data that is not found in memory automatically be looked up under the disk path?
Refer to the configuration below.
When does overflowToDisk get activated in Ehcache?
My case:
1) Cache is warmed up from the DB before application start
2) Data is loaded from the DB with a loader implementation
3) Initially the DB has 2000 entries, so we have 1000 in memory (ABC_007) and the remaining 1000 on disk.
Is this correct?
<cache name="ABC_007"
       maxElementsInMemory="1000"
       maxElementsOnDisk="10000"
       overflowToDisk="true"
       timeToIdleSeconds="..."
       timeToLiveSeconds="...">
</cache>
If I search for data which is not in memory for ABC_007, it will be retrieved from the disk path. Am I right on this one?
Now suppose I implement read-through functionality, that is, if the data is not available in the cache (including the disk path), I look it up in the DB.
Now I find the data there. Does it repopulate the cache?
If ABC_007 still holds 1000 elements in memory, where will the new entry be stored: in memory or on disk?
Please correct my understanding.
For example, refer to the sample code:
Cache cache = manager.getCache("ABC_007");
Element element = null;
String key = null;
for (int i = 0; i < 2000; i++) {
    key = "keyInCache" + i;
    element = new Element(key, "value1");
    cache.put(element);
}
Now when I cross 1000, then as per the configuration, elements 1001 to 2000 will be stored on disk.
Am I right?
Now I want the value for the key keyInCache1700:
element = cache.get(key);
From where will I get the value?
My understanding: since the ABC_007 cache has maxElementsInMemory="1000", it can store up to 1000 key-value pairs in memory, and the value for the key keyInCache1700 will be retrieved from disk.
Am I correct?
The answer depends on your version of Ehcache.
As of Ehcache 2.6, the storage model is no longer an overflow one but a tiered one.
In the tiered storage model, all data will always be present in the lowest tier.
Items will be present in the higher tiers based on their hotness.
Possible tiers for open source Ehcache are:
On-heap, that is, on the JVM heap
On-disk, which is the lowest one
By definition, higher tiers have lower latency but less capacity than lower tiers.
So for a cache configured with overflowToDisk, all the data will always be inside the disk tier. It will store the key in memory and the data on disk.
When looking for an entry inside the cache, the tiers are considered from highest to lowest.
In your example, the data will be retrieved as follows:
Search in memory
If found, return it
Search on disk
If found, add it to the memory tier (hot data) and return it. This can cause another entry to be evicted from memory.
Use your cache loader to retrieve it from the DB
When found, add it to the cache and return it
I'm just going to outline/summarize my rough idea of how Ehcache works:
It's essentially a HashMap.
It keeps the HashMap of keys in memory, at all times.
The actual content of items can either be stored in memory or (when there are enough items) overflow to disk.
Conclusion: My understanding is that EHCache knows what keys it has cached, and where the items are currently stored. This is a basic necessity for a cache to retrieve items quickly.
If an item is unknown to EHCache, I wouldn't expect it to go looking on disk for it.
You should definitely implement "read-through to DB" logic around your use of the cache. Items not found in the cache must obviously be read from the DB. Adding them to the cache at that point would be expected to put them in memory, as they are currently hot (recently used).
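As a minimal sketch of that read-through (cache-aside) logic, using the same Ehcache 2.x API as the code in the question (loadFromDatabase is a placeholder for your own DB/DAO call):

import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class ReadThroughLookup {

    public static Object lookup(CacheManager manager, String key) {
        Cache cache = manager.getCache("ABC_007");

        // Checks the memory tier first, then the disk tier
        Element element = cache.get(key);
        if (element != null) {
            return element.getObjectValue();
        }

        // Not in the cache at all: fall back to the database and repopulate the cache
        Object value = loadFromDatabase(key); // placeholder for your DB call
        cache.put(new Element(key, value));
        return value;
    }

    private static Object loadFromDatabase(String key) {
        // ... query the DB here
        return null;
    }
}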
