Infinispan: How many DefaultCacheManager Instances? - caching

In my web application project I have to build two kinds of caching mechanism.
The first one is strictly related to the session. I have implemented a pattern by which I can clean the Infinispan cache when the user session ends.
The distributed session cache uses a single DefaultCacheManager stored in my application server's JNDI. Every time it needs to read from or write to the cache, it looks the manager up and performs the CRUD operations.
The second one is a normal Infinispan distributed cache with its own expiration policy, which I am about to implement.
My question is:
Is it correct to use the same DefaultCacheManager bound in JNDI, or is it better to create a new one?
In the Infinispan guide I read that it is a really heavy object and that creating just one is recommended.
Thanks.

Yes, I agree with @Jakub. The only reason you might want separate CacheManagers is when you need them to operate on separate clusters, which is not your case.

Related

Handling dictionary values stored in DB - Spring

I am developing an SPA with a backend written in Java (Spring Boot). The relational DB that the backend connects to has a table with some dictionary values. The values can be edited by users of the app, but that happens very rarely (almost never).
Those dictionary values are used on a lot of pages in the UI, so I would like to "cache" them somehow. I want to load the dictionary values on startup to avoid querying the DB on every request between the UI and the backend.
At first I thought about loading them in the UI when the user enters the page for the first time. I ruled that out, because when one of the users changes the values, they should be reloaded.
What I think might work is loading them on backend startup into some collection that is safe to use in a concurrent environment (probably a ConcurrentMap), and then having GET requests read from that collection instead of the DB. When the values are changed, the request updates the DB table and reloads them into the collection.
Then I realized that the collection alone won't be enough once my backend is scaled to more than one instance. In that case only the instance that handled the update is refreshed, and the others serve outdated data. We can work around that by forcing a refresh, e.g. every 15 minutes (instead of on demand when the values are updated).
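The in-memory approach described above could look roughly like the following. This is only a sketch in plain Java: the class name DictionaryCache and the loader (which stands in for the real DB query) are hypothetical, and the refresh period is whatever you pass in.

```java
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical in-memory holder for dictionary values; the loader stands in for the DB query.
final class DictionaryCache {
    private final Supplier<Map<String, String>> loader;
    // Readers always see one complete, immutable snapshot.
    private volatile Map<String, String> snapshot = Map.of();

    DictionaryCache(Supplier<Map<String, String>> loader) {
        this.loader = loader;
        reload(); // load once on startup
    }

    String get(String key) {
        return snapshot.get(key);
    }

    // Call after an update has been written to the DB, or from the scheduler.
    void reload() {
        snapshot = Map.copyOf(loader.get());
    }

    // Periodic refresh so that other instances pick up changes they did not make themselves.
    ScheduledExecutorService refreshEvery(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                reload();
            } catch (RuntimeException e) {
                // Keep the old snapshot; the next run tries again.
            }
        }, period, period, unit);
        return scheduler; // the caller is responsible for shutting it down
    }
}
```

Swapping a whole immutable snapshot instead of mutating a shared map means a reader never sees a half-reloaded dictionary. With several backend instances, the instance that handles an update would call reload() right away, and the others would catch up on their next scheduled refresh.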
But what I think is the best solution is to run a Redis service on the side, load the dictionary values into it, and after every DB update of the values update the Redis instance with the new ones. Every backend instance would use the same Redis instance, which seems quicker than executing a query (select * from _ where _ = _) on the DB.
What do you think? Is my thought process correct? Do you have any ideas that can help solve my issue?
If you are using Spring, you could check out the Spring Cache Abstraction. That way your cache stays up to date whenever a change occurs, provided the updates go through methods that evict or refresh the cached entries (@CacheEvict, @CachePut).
A few implementations are supported out of the box:
Spring provides a few implementations of that abstraction: JDK java.util.concurrent.ConcurrentMap based caches, Ehcache 2.x, Gemfire cache, Caffeine, and JSR-107 compliant caches (such as Ehcache 3.x). See Plugging-in Different Back-end Caches for more information on plugging in other cache stores and providers.
If you decide to use Memcached implementation you can check out this library (uses Xmemcached under the hood) here.
You could also check a small demo app of how to use Spring Cache Abstraction in your project (link).
I think you're on the right path with your approach to caching. I suggest you also check out Memcached for its simplicity. Redis is a good choice, but it depends on your requirements and whether you need that many features. Just my 2 cents.
https://aws.amazon.com/elasticache/redis-vs-memcached/
https://devcenter.heroku.com/articles/spring-boot-memcache#add-caching-to-spring-boot
Thanks,

Clarification on database caching

Correct me if I'm wrong, but from my understanding, "database caches" are usually implemented with an in-memory database that is local to the web server (same machine as the web server). Also, these "database caches" store the actual results of queries. I have also read up on the multiple caching strategies like - Cache Aside, Read Through, Write Through, Write Behind, Write Around.
For some context, the Write Through strategy looks like this:
and the Cache Aside strategy looks like this:
I believe that the "Application" refers to a backend server with a REST API.
My first question is, in the Write Through strategy (application writes to cache, cache then writes to database), how does this work? From my understanding, the most commonly used database caches are Redis or Memcached - which are just key-value stores. Suppose you have a relational database as the main database, how are these key-value stores going to write back to the relational database? Do these strategies only apply if your main database is also a key-value store?
In a Write Through (or Read Through) strategy, the cache sits in between the application and the database. How does that even work? How do you get the cache to talk to the database server? From my understanding, the web server (the application) is always the one facilitating the communication between the cache and the main database - which is basically a Cache Aside strategy. Unless Redis has some kind of functionality that allows it to talk to another database, I don't quite understand how this works.
Isn't it possible to mix and match caching strategies? From how I see it, Cache Aside and Read Through are caching strategies for application reads (user wants to read data), while Write Through and Write Behind are caching strategies for application writes (user wants to write data). Couldn't you have a strategy that uses both Cache Aside and Write Through? Why do most articles always seem to portray them as independent strategies?
What happens if you have a cluster of web servers? Do they each have their own local in-memory database that acts as a cache?
Could you implement a cache using a normal (not in-memory) database? I suppose this would still be somewhat useful since you do not need to make an additional network hop to the database server (since the cache lives on the same machine as the web server)?
Introduction & clarification
I think you have one misunderstanding: the cache is NOT necessarily stored on the same server as the web server. Sometimes not even the database is separated from the web server onto its own server. If you think of APIs, like HTTP REST APIs, you can use caching to avoid spending too many resources on database connections and queries. Generally, you want to use as few database connections and queries as possible. Now imagine the following setting:
You have a web server that serves your application and a REST API, which the web server uses to work with some resources. Those resources come from a database (let's say a relational database) that is also hosted on the same server. One endpoint serves, for example, a list of posts (like blog posts), and every user can fetch all posts (to keep this example simple). This API request is a good candidate for caching, so that users do not hit the database over and over just to query the same resources through the REST API. This is where caching comes in. Redis is one of many tools that can be used for it. Since Redis is a simple in-memory key-value store, you can put all of your posts into the cache after the first DB query. Every later request for the posts list first checks whether the posts are already cached. If they are, the API returns the cached content for that request.
This is one simple example of what caching can be used for.
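The posts example can be sketched in plain Java. This only illustrates the flow (check the cache first, query the DB on a miss): the class name PostsEndpoint, the key "posts:all" and the in-process map standing in for Redis are all assumptions made for the sketch.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Cache-aside: the application checks the cache first and only queries the DB on a miss.
final class PostsEndpoint {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Supplier<List<String>> queryAllPosts;                        // stand-in for the DB query

    PostsEndpoint(Supplier<List<String>> queryAllPosts) {
        this.queryAllPosts = queryAllPosts;
    }

    List<String> listPosts() {
        // On a miss, run the query once and remember the result under a fixed key.
        return cache.computeIfAbsent("posts:all", key -> queryAllPosts.get());
    }

    // Call when a post is created, edited or deleted so the next read goes back to the DB.
    void invalidate() {
        cache.remove("posts:all");
    }
}
```

Note that it is the application code that talks to both the cache and the database here; the cache itself knows nothing about the DB.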
Answers to your questions
My first question is, why would you ever write to a cache?
To reduce the amount of database connections and queries.
how is writing to these key-value stores going to help with updating the relational database?
It does not help you with updating; it helps you spend fewer resources. It also gives you a kind of temporary backup of some data, but only as a minor side effect. There are more attractive solutions for that (Redis is also not persistent by default, although it supports persistence).
Do these cache writing strategies only apply if your main database is also a key-value store?
No, it does not matter which database you use, whether it is a NoSQL or an SQL DB. It strongly depends on what you want to cache and how the database and its tables are set up. Do your resources change frequently? Do resources get updated manually or only through user-initiated actions? Those questions lead you to the right caching implementation.
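To illustrate why the kind of main database does not matter, here is a rough sketch of a write-through wrapper written in application code. Everything in it is hypothetical: BackingStore is a made-up interface that could be implemented on top of SQL or NoSQL, and the in-process map stands in for Redis or Memcached.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction over the main database; it can be SQL or NoSQL.
interface BackingStore<K, V> {
    void save(K key, V value);
    Optional<V> find(K key);
}

// Write-through done in application code: every write goes to the store and to the cache.
final class WriteThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>(); // stand-in for Redis/Memcached
    private final BackingStore<K, V> store;

    WriteThroughCache(BackingStore<K, V> store) {
        this.store = store;
    }

    void put(K key, V value) {
        store.save(key, value); // persist first, so the cache never holds unsaved data
        cache.put(key, value);
    }

    Optional<V> get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return Optional.of(cached);
        }
        Optional<V> loaded = store.find(key); // miss: fall back to the store
        loaded.ifPresent(value -> cache.put(key, value));
        return loaded;
    }
}
```

The wrapper only sees keys and values; how save and find map onto tables or documents is entirely up to the BackingStore implementation.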
Isn't it possible to mix and match caching strategies?
I am not an expert at caching strategies, but let me try:
I guess it is possible, but it depends heavily on what you are doing in your DB and what kind of application you have. Once you know what kind of application you are building, you will know which strategy to use. I would guess it is not recommended to mix those strategies, because they are coupled to your application type; in other words, it will probably not work out well.
What happens if you have a cluster of web servers? Do they each have their own local in-memory database that acts as a cache?
I guess both are possible. Usually you have one database, maybe clustered or synchronized with copies, to which your web servers (e.g. REST APIs) make their requests. Then either each of your API servers has its own cache, so that it does not have to query the database at all (in cloud-based applications your database may also be on a separate server, which means another "hop" in terms of networking), OR (what I can also imagine) you have another middleware between your (clustered) APIs and your (possibly also clustered) DB. I guess that no one would do that, though, because of the network traffic: it would result in higher response times, which you usually want to avoid.
Could you implement a cache using a normal (not in-memory) database?
Yes, you could, but it would be much slower. A machine can access in-memory data faster than it can open another (local) connection to a database and query the cached entries. The database also has to write the entries to files on your machine to persist the data.
Conclusion
All in all, it is about fast response times and avoiding unnecessary network traffic. I hope I could help you a little.

Asp.net core Caching In-Memory and Distributed together

Can In-Memory caching and Distributed Cache be used together in the same application? Does it make sense after all?
A logical scenario that comes to my mind is to manage Session state (on top of In-Memory, taking advantage of sticky sessions) and use Distributed for other caching. However, I don't know if this makes sense.
Yes, you can. One implements IMemoryCache, other implements IDistributedCache.
IMemoryCache will not work properly if you have non-sticky sessions and multiple servers.
Also you may want to use service.AddDistributedMemoryCache(); instead of service.AddMemoryCache();

How to use redis for number of micro-services?

I am very new to Redis and have been investigating it for the past few days. I read the documentation on cache management (LRU cache), commands, etc. I want to know how to implement caching for the data of multiple microservices.
I have few questions:
Can all microservices data(cached) be kept under a single instance of redis server?
Should every microservice have its own cache database in redis?
How to refresh cache data without setting EXPIRE? Since it would consume more memory.
Some more information on best practices on redis with microservices will be helpful.
It's possible to use the same Redis for multiple microservices; just make sure to prefix your Redis cache keys to avoid conflicts between the microservices.
You can use multiple databases in the same Redis instance (i.e. one for each microservice), but it's discouraged because Redis is single threaded.
The best way is to use one Redis instance for each microservice; then you can easily flush one of them without touching the others.
From my personal experience with a redis cache in production (with 2 million keys), there is no problem using EXPIRE. I encourage you to use it.
Please find below the answer to all your questions -
Can all microservices data(cached) be kept under a single instance of redis server? Ans - Yes, you can keep all the data in a single Redis instance; all you need to do is store each piece of data under a different key name, since Redis is basically a key-value database.
Should every microservice have its own cache database in redis? Ans - Not required. Just use different keys for each microservice. Also note that you can use a colon (:) in key names to get a folder-like grouping, which makes it easy to identify the different microservices in Redis Desktop Manager.
Example - Key name X:Y:Z; here Z is placed in folder Y, and Y in X, so you get a folder-like structure. That helps to tell the different microservices apart.
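A tiny helper can keep the X:Y:Z naming consistent across services. This is just a sketch; the class name CacheKeys and the sample key parts are made up for illustration.

```java
// Builds namespaced Redis keys such as "orders:invoice:42".
final class CacheKeys {
    private CacheKeys() {
    }

    // The first segment is the owning microservice, the rest narrow the key down.
    static String of(String service, String... parts) {
        StringBuilder key = new StringBuilder(service);
        for (String part : parts) {
            key.append(':').append(part);
        }
        return key.toString();
    }
}
```

For example, CacheKeys.of("orders", "invoice", "42") gives "orders:invoice:42", so every key of the (hypothetical) orders service starts with "orders:".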
How to refresh cache data without setting EXPIRE? Since it would consume more memory. Ans - You can set the data again on the same key whenever the microservice response changes. The value stored under that key is overwritten in that case.
Can all microservices data(cached) be kept under a single instance of redis server?
In a microservice architecture, "elastic scale SaaS" is preferable. You can think of your cache service as a microservice per se (one that responds on demand), so you have multiple options here. The recommended practice for data storage is sharding https://azure.microsoft.com/en-us/documentation/articles/best-practices-caching/#partitioning-a-redis-cache . See the diagram below from the book Microservices, IoT and Azure
Should every microservice have its own cache database in redis? It's still possible to think in terms of "vertical partitions", but you should consider "horizontal partitions", so again consider sharding. Additionally, it's not a bad idea to have a "local cache", especially to avoid DoS.
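One way to keep a shared cache from becoming a critical dependency is to treat any cache failure as a miss and read from the authoritative store instead. A minimal sketch, with hypothetical names: ResilientReader is made up, and the two functions stand in for your real cache client and data store.

```java
import java.util.Optional;
import java.util.function.Function;

// Reads through a cache but keeps working when the cache is unavailable.
final class ResilientReader<K, V> {
    private final Function<K, Optional<V>> cacheLookup; // may throw if the cache is down
    private final Function<K, V> sourceLookup;          // the authoritative data store

    ResilientReader(Function<K, Optional<V>> cacheLookup, Function<K, V> sourceLookup) {
        this.cacheLookup = cacheLookup;
        this.sourceLookup = sourceLookup;
    }

    V read(K key) {
        try {
            Optional<V> cached = cacheLookup.apply(key);
            if (cached.isPresent()) {
                return cached.get();
            }
        } catch (RuntimeException cacheFailure) {
            // Cache unavailable: treat it as a miss instead of failing the request.
        }
        return sourceLookup.apply(key);
    }
}
```

A real cache client would also need a connection and read timeout, so that a cache that hangs is turned into a failure quickly instead of blocking the request.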
"Be careful not to introduce critical dependencies on the availability of a shared cache service into your solutions. An application should be able to continue functioning if the service that provides the shared cache is unavailable. The application should not hang or fail while waiting for the cache service to resume."
How to refresh cache data without setting EXPIRE? Since it would consume more memory.
You can define your sync policies; I think a cache is suitable for things that change rarely.
"It might also be appropriate to have a background process that periodically updates reference data in the cache to ensure it is up to date, or that refreshes the cache when reference data changes."
For cache best practices check
Caching Best Practices

Infinispan JPA Cache loader?

How do I implement an Infinispan JPA cache loader? Is there any pattern or way to implement it with the Infinispan API?
Most existing CacheLoader implementations in Infinispan assume the data just needs storage and treat it blindly as an array of bytes. The integration API in Infinispan doesn't expose much of a context other than "store(Key,Value)" or "load(Key)". I'm oversimplifying a bit, but that's the core.
There is one exception, which is the LuceneCacheLoader. This was designed to work exclusively in combination with the Lucene Directory for Infinispan, as it:
knows which types to expect
takes advantage of the known needs of the Directory (such as access pattern)
Have a look at the sources to get inspired; note I only implemented loading (it's a CacheLoader).
If you control both the application using Infinispan and the CacheLoader, you could take advantage of these details as well.
Tricky aspects:
While writing multiple keys even in the same transaction, you'll have access to one entry at a time in the scope of the CacheLoader logic -> hard to map relations: have to deal with one entity at a time and "restore connections"
With write behind you might receive entries out of order -> not sure how to deal with referential integrity
With write behind you're not going to have the same Transactional context -> might be acceptable?
Taking these into account, I'm sure you could write one. How easy? That depends on your app.
I'm not sure if a general purpose solution could work. If you find out it can, please contribute it as it would be a great addition to the project.
