Distributed Caching algorithms/tutorials - caching

What is the best way to understand how caching frameworks and caching algorithms work? Is there any book which covers the following topics in detail?
cache hits
cache miss
LFU
LRU
LRU2
Two Queues
ARC
MRU
FIFO
Second Chance
Distributed caching
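
To make the list concrete: LRU, the most widely implemented of these policies, can be sketched in a few lines of Java on top of LinkedHashMap. This is a toy illustration of the idea, not any particular framework's implementation:

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Toy LRU cache: a LinkedHashMap in access order evicts the eldest
    // (least recently used) entry once the capacity is exceeded.
    class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true); // accessOrder = true gives LRU ordering
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }
    }

Here a get() that finds an entry is a cache hit and moves it to the most recently used position, while a get() returning null is a cache miss; the variants listed above (LRU2, Two Queues, ARC, ...) mainly differ in how they track recency and frequency.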

Related

How can I configure an EhCache cache to use an LRU eviction strategy in version 3.8 of ehcache?

I've looked at the EvictionAdvisor, but it only seems to get called for the most recently inserted item. So, in essence, I can say "yes" or "no" to evicting the most recently added item, but it is not useful for identifying other items that should be evicted.
I seem to recall that in EhCache 2.8 (it's been a while), I could provide information in the ehcache.xml configuration file to specify that the cache use an LRU eviction strategy.
These two pieces of documentation mention that Ehcache uses LRU as the default eviction strategy:
A cache eviction algorithm is a way of deciding which element to evict when the cache is full. In Ehcache, the MemoryStore may be limited in size (see How to Size Caches for more information). When the store gets full, elements are evicted. The eviction algorithms in Ehcache determine which elements are evicted. The default is LRU.
https://www.ehcache.org/documentation/2.8/apis/cache-eviction-algorithms.html
Ehcache uses Least Recently Used (LRU) as the default eviction strategy for the memory stores. The eviction strategy determines which cache entry is to be evicted when the cache is full.
https://springframework.guru/using-ehcache-3-in-spring-boot/
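For reference, in the 2.x line covered by the first link, the policy could indeed be selected per cache in ehcache.xml via the memoryStoreEvictionPolicy attribute (the cache name and size below are made up):

    <cache name="someCache"
           maxEntriesLocalHeap="1000"
           memoryStoreEvictionPolicy="LRU"/>

LFU and FIFO were the other supported values. Ehcache 3.x no longer exposes this knob; as the question notes, the 3.x EvictionAdvisor only lets you say yes or no to the candidate entry the store proposes.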

Redis CRDB Eviction Policy

I have read in the Redis documentation that the cache eviction policy for a CRDB should be set to No Eviction.
"Note: Geo-Distributed CRDBs always operate in noeviction mode."
https://docs.redislabs.com/latest/rs/administering/database-operations/eviction-policy/
The reasoning given is that garbage collection (eviction) might cause inconsistencies, since the two data centers synchronize bidirectionally.
I am not getting this point. Can someone explain, with a real-world problem that might occur, what would go wrong if we used the LRU eviction policy?
After doing some research, I learned that eviction is often troublesome to handle with active-active replication. For example, if one of the masters runs out of memory and the cache tries to evict keys to make room for the latest data, the eviction will replicate and delete those keys from the other master even if there are no memory issues there. So unless and until there is a really good way to handle this, eviction is not supported.
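Here is a toy simulation of that failure mode in Java, with two in-memory "masters" that replicate deletions to each other (all names are made up for illustration):

    import java.util.LinkedHashMap;
    import java.util.Map;

    // Two active-active replicas; an eviction on one side replicates as a delete.
    class Replica {
        final Map<String, String> data = new LinkedHashMap<>(); // insertion order ~ age
        Replica peer;

        void put(String key, String value, int maxSize) {
            data.put(key, value);
            if (data.size() > maxSize) {                  // local memory pressure
                String oldest = data.keySet().iterator().next();
                data.remove(oldest);                      // evict locally...
                peer.data.remove(oldest);                 // ...and the delete replicates
            }
        }
    }

    public class EvictionReplicationDemo {
        public static void main(String[] args) {
            Replica dcA = new Replica(), dcB = new Replica();
            dcA.peer = dcB; dcB.peer = dcA;
            dcB.put("k1", "v1", 10);    // dcB stores k1 and has plenty of room
            dcA.data.put("k1", "v1");   // the write replicates to dcA
            dcA.put("k2", "v2", 1);     // dcA is full, so it evicts k1...
            System.out.println(dcB.data.containsKey("k1")); // false: k1 is gone on dcB too
        }
    }

dcA was under memory pressure, dcB was not, yet dcB lost the key anyway. With noeviction, dcA would instead reject the write when full, which is at least consistent across both sites.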

System Design: Global Caching and consistency

Let's take Twitter as an example. There is a huge cache which gets updated frequently. For example: if person Foo tweets and has followers all across the globe, ideally all the caches across all PoPs need to get updated, i.e. they should remain in sync.
How does replication across data centers (PoPs) work for real-time caches?
What tools/technologies are preferred?
What are the potential issues in this system design?
I am not sure there is a right/wrong answer to this, but here's my two pennies' worth.
I would tackle the problem from a slightly different angle: when a user posts something, that something goes in a distributed storage (not necessarily a cache) that is already redundant across multiple geographies. I would also presume that, in the interest of performance, these nodes are eventually consistent.
Now the caching. I would not design a system that takes care of synchronising all the caches each time someone does something. I would rather implement caching at the service level. Imagine a small service residing in a geographically distributed cluster. Each time a user tries to fetch data, the service checks its local cache - if it is a miss, it reads the tweets from the storage and puts a portion of them in a cache (subject to eviction policies). All subsequent accesses, if any, would be cached at a local level.
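The per-PoP service described above is essentially the cache-aside pattern. A minimal sketch in Java (Storage and the timeline shape are placeholders, not a real Twitter API):

    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Cache-aside at the service level: each PoP fills its own cache on a miss,
    // so nothing needs to push updates from the centre to every edge cache.
    class TimelineService {
        interface Storage { List<String> loadTweets(String userId); } // geo-replicated store

        private final Storage storage;
        private final Map<String, List<String>> localCache = new ConcurrentHashMap<>();

        TimelineService(Storage storage) { this.storage = storage; }

        List<String> timeline(String userId) {
            // Check the local cache; read from storage only on a miss.
            return localCache.computeIfAbsent(userId, storage::loadTweets);
        }

        void flush() { localCache.clear(); } // invoked on a broadcast flush (see below)
    }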
In terms of design precautions:
Carefully consider the DC / AZ topology in order to ensure sufficient bandwidth and low latency
Cache at the local level in order to avoid useless network trips
Cache updates don't happen from the centre to the periphery; cache is created when a cache miss happens
Stating the obvious here: implement the right eviction policies in order to keep only the right objects in cache
The only message that should go from the centre to the periphery is a cache flush broadcast (tell all the nodes to get rid of their cache); a sketch of one way to wire this up follows below
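
For the flush broadcast in the last point, one possible transport (an assumption, not a prescription) is a Redis pub/sub channel that every PoP node subscribes to; the channel name and host below are made up:

    import redis.clients.jedis.Jedis;
    import redis.clients.jedis.JedisPubSub;

    // Each node clears its local cache when a flush message arrives.
    class FlushListener extends JedisPubSub {
        private final TimelineService service; // the cache-aside service sketched above

        FlushListener(TimelineService service) { this.service = service; }

        @Override
        public void onMessage(String channel, String message) {
            service.flush(); // drop the local cache; it repopulates on the next miss
        }
    }

    // Publisher, e.g. after a correction that must not be served stale:
    //   new Jedis("redis.example.internal").publish("cache-flush", "all");
    // Subscriber (subscribe() blocks, so run it on its own thread):
    //   new Jedis("redis.example.internal").subscribe(new FlushListener(service), "cache-flush");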
I am certainly missing many other things here, but hopefully this is good food for thought.

Cache eviction in Mondrian

It is not clear from the docs how Mondrian behaves with regard to cache eviction.
The "Out of memory" section of the configuration docs is very vague. Is it correct to say that Mondrian never evicts anything from its cache? And that, if the user performs sufficiently diverse queries, the cache eventually grows without bound?

Fine-tuning and monitoring a Spring cache backed by a ConcurrentMapCache

I have set up a Spring cache manager backed by a ConcurrentMapCache for my application.
I am looking for ways to monitor the cache and, especially, to make sure the data in the cache fits in memory. I considered using jvisualvm for that purpose, but there might be other ways... If so, what are they?
So my question is basically twofold:
What is the best way to monitor a cache backed by a ConcurrentMapCache?
What are the general guidelines for setting the time to live and cache size values of a cache?
It looks like you are searching for cache-related features that can't and won't be available with a "simple" map implementation provided by the JVM.
There are many cache providers out there that provide what you want, that is: monitoring, limiting the size of the cache, and providing a TTL contract for the cache elements. I would encourage you to look around and switch your CacheManager implementation, which will have zero impact on your code since you're using the abstraction.
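As a concrete example of such a swap, here is a sketch of a Caffeine-backed manager (assuming Caffeine and spring-context-support are on the classpath; the cache name and limits are made up):

    import java.util.concurrent.TimeUnit;

    import com.github.benmanes.caffeine.cache.Caffeine;
    import org.springframework.cache.CacheManager;
    import org.springframework.cache.caffeine.CaffeineCacheManager;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class CacheConfig {

        // Drop-in replacement for the ConcurrentMapCache-backed manager:
        // bounded size, a TTL, and hit/miss statistics for monitoring.
        @Bean
        public CacheManager cacheManager() {
            CaffeineCacheManager cacheManager = new CaffeineCacheManager("myCache");
            cacheManager.setCaffeine(Caffeine.newBuilder()
                    .maximumSize(10_000)                     // cap entries in memory
                    .expireAfterWrite(10, TimeUnit.MINUTES)  // time to live
                    .recordStats());                         // expose hit/miss counts
            return cacheManager;
        }
    }

With recordStats() enabled, you can read hit/miss counts from the native Caffeine cache (its stats() method), which addresses the monitoring half of the question without resorting to jvisualvm.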
