Background
I am working on a spring boot application. In this application, we have two different caffeine caches - dealDetailsCache and trackingDetailsCache.
Both of these caffeine caches have single keys, let us say, X and Y respectively. We just keep updating the values for the same keys while refreshing.
Application Flow
Every 5 minutes, a scheduled job runs.
This job fetches some data from an external source and, upon successful retrieval, updates the two Caffeine caches mentioned above.
Currently, I am manually doing a put operation on each of the above caffeine caches to refresh the data for their respective keys:
put(X, <<data>>) for updating dealDetailsCache
put(Y, <<data>>) for updating trackingDetailsCache
Expected load is about 50 queries per second.
What am I looking for
I am looking for a way to refresh the Caffeine caches (just like buildAsync) such that it does not impact the application and there is no downtime.
If put is not the right way to do this, would someone please suggest the right way to update the cache so that there is absolutely no downtime?
I read about CacheEvict, but there is a risk associated with it. It evicts and then refreshes. There could be some time between the two operations and any requests that come in during this period (after eviction and before new data is loaded) would fail.
The ultimate aim is that requests should always find the data, even if it is old for the time being. Would someone please suggest a clean mechanism for manual cache refreshes?
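To illustrate the "replace, never evict" idea from the question, here is a minimal sketch. It uses a plain ConcurrentHashMap as a stand-in so it runs without extra dependencies; the class name SingleKeyCache and its methods are hypothetical and not part of Caffeine. Caffeine's Cache.put likewise replaces an existing value for a key, so readers see either the old or the new value and never a missing entry.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for a single-key cache such as dealDetailsCache.
// A ConcurrentHashMap is used only so that the sketch has no dependencies.
final class SingleKeyCache<V> {
    private final ConcurrentMap<String, V> store = new ConcurrentHashMap<>();
    private final String key;

    SingleKeyCache(String key) {
        this.key = key;
    }

    // Called by the scheduled job. A null argument models a failed fetch:
    // the previous value is kept so that readers still find (old) data.
    void refresh(V fresh) {
        if (fresh != null) {
            store.put(key, fresh); // atomic replace; the key is never absent in between
        }
    }

    // Called by request threads; returns the latest successfully loaded value.
    V get() {
        return store.get(key);
    }

    public static void main(String[] args) {
        SingleKeyCache<String> deals = new SingleKeyCache<>("X");
        deals.refresh("deals-v1");
        deals.refresh(null);       // failed fetch: old data stays visible
        System.out.println(deals.get());
        deals.refresh("deals-v2"); // successful fetch replaces the value in place
        System.out.println(deals.get());
    }
}
```

The design choice here is that the job never removes the entry; it only overwrites it after a successful fetch, which avoids the evict-then-load gap described for CacheEvict.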
Related
I have implemented Caching in my Spring Boot REST Application. My policy includes a time based cache eviction strategy, and an update-based cache eviction strategy. I am worried that since I employ a stateless server, if there is a method called to update certain data, and this was handled by server instance A, then the corresponding caches in server instance B, C and D, are not updated as well.
Is this an issue I would face / is there a way to overcome this issue?
This is the oldest problem in software development: cache invalidation when you have multiple servers.
One way to handle it is to move the cache out of the individual servers into something shared, such as a dedicated instance holding the cache entries that every other app refers to, or something like Redis [centralized cache].
A second way is to broadcast a message so that each server knows to invalidate the entry once the data has been modified or deleted. Here you run the risk of the message not being processed, which leaves a stale entry in some server[s].
Another option is to have some sort of write-ahead log [like Kafka or Redis Streams] that each server processes. Since they all process the same events in the same order, they deterministically end up with the same cache state.
Lmk if you need more help - we can setup some time outside of SO
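As a rough sketch of the log-based option, the code below replays an ordered list of events onto a local cache. The names CacheEvent and ReplicaCache are hypothetical, and the in-memory list stands in for a real log such as a Kafka topic. It assumes every server sees the events in the same offset order; duplicates are skipped by tracking the last applied offset.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// One entry of the shared, ordered log (hypothetical shape).
// A null value means "delete this key".
final class CacheEvent {
    final long offset;
    final String key;
    final String value;

    CacheEvent(long offset, String key, String value) {
        this.offset = offset;
        this.key = key;
        this.value = value;
    }
}

// Local cache of one server instance, built purely by replaying the log.
final class ReplicaCache {
    private final Map<String, String> entries = new HashMap<>();
    private long lastApplied = -1;

    void apply(CacheEvent e) {
        if (e.offset <= lastApplied) {
            return; // already applied (redelivery); skipping keeps replay idempotent
        }
        if (e.value == null) {
            entries.remove(e.key);
        } else {
            entries.put(e.key, e.value);
        }
        lastApplied = e.offset;
    }

    Map<String, String> snapshot() {
        return new HashMap<>(entries);
    }

    public static void main(String[] args) {
        List<CacheEvent> log = Arrays.asList(
                new CacheEvent(0, "user:1", "alice"),
                new CacheEvent(1, "user:2", "bob"),
                new CacheEvent(2, "user:1", null)); // delete
        ReplicaCache serverA = new ReplicaCache();
        ReplicaCache serverB = new ReplicaCache();
        for (CacheEvent e : log) {
            serverA.apply(e);
            serverB.apply(e);
            serverB.apply(e); // simulated duplicate delivery
        }
        System.out.println(serverA.snapshot().equals(serverB.snapshot()));
    }
}
```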
If I add spring-session JDBC to my Vaadin Spring Boot application, the application becomes very slow and does a full page reload after a few seconds. Everything else looks like it is working normally.
I cannot pin down the problem. I have been researching this issue for a few days and found this GitHub issue and the Vaadin microservices configuration, but I did not find a suitable solution in either. Can anyone give me a working example of implementing Spring Session with Vaadin?
Regards.
Session replication schemes like spring-session assume that the session is relatively small and that the content isn't sensitive to concurrent modification from multiple request threads. Neither of those assumptions holds true for a typical Vaadin application.
The first problem is that there's typically between 100KB and 10MB of data in the session that needs to be fetched from the database, deserialized, updated and then again serialized and stored in the database for each request. The second problem is that Vaadin stores a lock instance in the session and uses that to ensure there aren't multiple request threads using the same session concurrently.
To serialize a session to persistent storage, you thus need to ensure your load balancer uses sticky sessions and typically also use a high performance solution such as Hazelcast rather than just deserializing and serializing individually for each request.
For more details, you can have a look at these two posts:
https://vaadin.com/learn/tutorials/hazelcast
https://vaadin.com/blog/session-replication-in-the-world-of-vaadin
I am developing a microservice. This MS will be deployed to docker containers and will be monitored by Kubernetes. I have to implement a caching solution using hazelcast distributed cache. My requirements are:
Preload the cache on startup of this microservice. For around 3000 stores I have to fetch two specific attributes and cache them.
Every 24 hours refresh the cache.
I implemented a Spring @EventListener that, on startup, makes a database call for the 2 attributes and stores them in the cache via @CachePut.
I also have a Spring scheduler with a cron expression to refresh the cache every day at 6 AM.
So far so good.
But what I did not realize is that in a clustered environment, 10-15 instances of my microservice will be running and will try to do the above 2 steps almost simultaneously, thus creating a stampede effect on my database and cache. Does anyone know what to do in this scenario? Is there any good design, or even an average one, that I can follow?
Thanks.
You should look at the Loading and Storing Persistent Data mechanism that Hazelcast provides. It offers two options for writing (write-through and write-behind) and read-through for loading data into the cache.
Look at MapLoader and its methods, which let you warm up/preload your cluster with your own implementation.
Check for more details: https://docs.hazelcast.org/docs/3.11/manual/html-single/index.html#loading-and-storing-persistent-data
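Separately from MapLoader, a simpler guard against the stampede is to let only one instance perform each refresh by claiming it with putIfAbsent on a shared map. This is a different technique from the one recommended above, shown only as a sketch. The class SingleLoaderGuard and its parameters are hypothetical; the ConcurrentHashMap stands in for a cluster-wide map (Hazelcast's IMap also implements ConcurrentMap, so the same call shape should apply, but treat that wiring as an assumption).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Lets exactly one instance run a given refresh, identified by refreshId
// (for example "stores-2024-01-01"). The claims map must be shared by all
// instances in a real deployment; here it is a local stand-in.
final class SingleLoaderGuard {
    private final ConcurrentMap<String, String> claims;
    private final String instanceId;

    SingleLoaderGuard(ConcurrentMap<String, String> claims, String instanceId) {
        this.claims = claims;
        this.instanceId = instanceId;
    }

    // Returns true if this instance won the claim and ran the loader.
    boolean runOnce(String refreshId, Runnable loader) {
        String owner = claims.putIfAbsent(refreshId, instanceId);
        if (owner != null) {
            return false; // another instance already claimed this refresh
        }
        loader.run(); // only the winner hits the database
        return true;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> shared = new ConcurrentHashMap<>();
        int[] dbCalls = {0};
        for (int i = 0; i < 10; i++) {
            SingleLoaderGuard guard = new SingleLoaderGuard(shared, "instance-" + i);
            guard.runOnce("daily-refresh", () -> dbCalls[0]++);
        }
        System.out.println("database calls: " + dbCalls[0]);
    }
}
```

A real setup would also need to expire old claims and handle a winner that crashes mid-load; both are left out of this sketch.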
I am using Magnolia Enterprise Standard version 5.3. We have the "publish" and "publish incl. sub nodes" options for different apps. Can someone please tell me how the cache works when we publish a tree structure? That is, does it publish each node one by one and flush the public cache after each node, or does it publish the whole tree first and then flush the public cache?
Actually, I want to apply a wait time for bulk publishing, but before that I want to understand the role of the cache while we publish the tree structure.
Can we add a wait time for bulk publishing?
I am not talking about multisite caching.
It depends on how you configured the cache (or the flush policy, or actually the observer that triggers the flush policy). IIRC, by default it is configured such that when an event ("something was published") arrives, it waits and collects all other incoming activations that arrive within one second. If nothing arrives within one second of the last event, the event with the aggregated messages is passed on to the flush policy. If, on the other hand, events keep arriving, observation keeps collecting and aggregating them for a maximum of 4 seconds before reacting and flushing the cache. (I hope 1 sec and 4 secs are the correct intervals, but it has been a couple of years since I last dug into that area, so they might have changed slightly since.)
In EE you can also configure other caching policies, for example a dual cache where one is always pre-heated with new content before the other is flushed, or you can write a completely custom policy that suits your needs.
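The wait-and-collect behavior described above can be sketched as a small debouncer. This is not Magnolia code; FlushDebouncer and its methods are hypothetical, and the 1 second and 4 second values are the intervals recalled in the answer, which may not match the actual defaults. Time is passed in explicitly so the logic is easy to follow.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Collects publish events and releases them as one batch when either
// (a) no event has arrived for quietMillis, or
// (b) maxMillis have passed since the first event of the batch.
final class FlushDebouncer {
    private final long quietMillis;
    private final long maxMillis;
    private final List<String> pending = new ArrayList<>();
    private long firstEventAt;
    private long lastEventAt;

    FlushDebouncer(long quietMillis, long maxMillis) {
        this.quietMillis = quietMillis;
        this.maxMillis = maxMillis;
    }

    void onEvent(String path, long now) {
        if (pending.isEmpty()) {
            firstEventAt = now; // start of a new aggregation window
        }
        pending.add(path);
        lastEventAt = now;
    }

    // Called periodically. Returns the batch to hand to the flush policy,
    // or an empty list if the batch is not due yet.
    List<String> poll(long now) {
        if (pending.isEmpty()) {
            return Collections.emptyList();
        }
        boolean quiet = now - lastEventAt >= quietMillis;
        boolean capped = now - firstEventAt >= maxMillis;
        if (!quiet && !capped) {
            return Collections.emptyList();
        }
        List<String> batch = new ArrayList<>(pending);
        pending.clear();
        return batch;
    }

    public static void main(String[] args) {
        FlushDebouncer d = new FlushDebouncer(1000, 4000);
        d.onEvent("/site/a", 0);
        d.onEvent("/site/b", 500);
        System.out.println(d.poll(900).size());  // still collecting
        System.out.println(d.poll(1500).size()); // one quiet second has passed
    }
}
```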
I am new to Ehcache. My REST API cache works with this configuration:
<cache name="com.a.b.c.model.Act"
maxElementsInMemory="10000" overflowToDisk="false" statistics="true" />
If I update the database directly through a query, the cache does not pick up those changes.
If I update through the REST API, the cache gets refreshed.
What do I have to change so that the cache is refreshed when a change happens in the database?
Is it good to go with timeToLiveSeconds, or is there any other configuration I can use?
Updating the cache when the underlying system of record changes, and not through the service methods on which caching is performed, is the classical problem with caching.
Ehcache does not provide an out of the box solution there as it would mean supporting ALL the technologies that can act as a system of record. Not only databases but also web services and really anything a programmer can come up with.
However, if your application can live with having outdated data in the cache for a small period of time, then expiry can be helpful. It effectively marks cache entries as valid for a period of time: the time to live.
If your application cannot support stale data in cache, you will have to make sure the process to update the system of record takes care of invalidating the cache as well.
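To make the time-to-live trade-off concrete, here is a minimal stdlib sketch of expiry. It is not Ehcache's implementation; TtlCache and its loader parameter are hypothetical. An out-of-band change to the system of record becomes visible only once the cached entry has expired, so staleness is bounded by the TTL.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal time-to-live cache: an entry is served until it expires,
// then reloaded from the system of record on the next read.
final class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAt;

        Entry(V value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final Function<K, V> loader; // reads the system of record

    TtlCache(long ttlMillis, Function<K, V> loader) {
        this.ttlMillis = ttlMillis;
        this.loader = loader;
    }

    // "now" is passed in explicitly to keep the sketch deterministic.
    V get(K key, long now) {
        Entry<V> e = entries.get(key);
        if (e == null || now >= e.expiresAt) {
            e = new Entry<>(loader.apply(key), now + ttlMillis);
            entries.put(key, e);
        }
        return e.value;
    }

    public static void main(String[] args) {
        Map<String, String> database = new HashMap<>();
        database.put("act:1", "draft");
        TtlCache<String, String> cache = new TtlCache<>(1000, database::get);
        System.out.println(cache.get("act:1", 0));
        database.put("act:1", "final");               // direct database update
        System.out.println(cache.get("act:1", 500));  // within TTL: stale value
        System.out.println(cache.get("act:1", 1000)); // expired: reloaded
    }
}
```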
If the code that is updating the database can access the Ehcache manager (same VM or distributed persistence), the best solution is to invalidate the cached entry
CacheManager.getInstance().getEhcache("mycache").remove(key);
and let the cache refresh autonomously.
If you already have the updated object, you could also skip one step with put
CacheManager.getInstance().getEhcache("mycache").put(new Element(key, updatedObject));
If your code has CacheEntryFactories, I would let the factory do the work (entry creation is centralized, and you can add more logic there)