Enlisting an Infinispan Cache Store in a Cache Transaction?

I am using Infinispan 6.0.2 via the WildFly 8.2 subsystem. I have configured a transactional cache that uses a String-Based JDBC Cache Store to persist content placed in the Infinispan cache.
After reading the following in the Infinispan documentation, my concern is that the cache and the cache store can fall out of sync when putting/updating/removing multiple entries in the same transaction: the transaction may commit or roll back in the cache while only partially succeeding or failing in the cache store.
4.5. Cache Loaders and transactional caches
When a cache is transactional and a cache loader is present, the cache loader won't be enlisted in the transaction in which the cache takes part. That means that it is possible to have inconsistencies at the cache loader level: the transaction may succeed in applying the in-memory state but (partially) fail to apply the changes to the store. Manual recovery would not work with cache stores.
Could someone please clarify whether the above statement refers only to loading from a cache store, or whether it also applies to writing to a store?
If it also applies to writing, are there any recommended strategies/solutions for ensuring that a cache and its cache store remain in sync?
The driving factor behind this for me is that I am using Infinispan both for write-through and for overflow of business-critical data, and I need confidence that the cache store correctly represents the state of the data.
I have also asked this question on the Infinispan Forums.
Many thanks in advance.

It applies to writes as well: a failure to write to the store does not affect the rest of the transaction.
The reason for this is that the actual persistence API is not transactional (edit: newer versions of Infinispan support transactional persistence, too). With a two-phase commit (in the first phase, prepare, all locks are acquired; in the second, commit, the write is executed), the write to the store happens in the second phase. A store failure therefore cannot roll back the changes already applied on other nodes.
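For illustration, here is a minimal sketch of such a configuration on a newer Infinispan release; the transactional(true) store option does not exist in 6.0.2, and the table and connection settings below are placeholders, not values from the question:
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.persistence.jdbc.configuration.JdbcStringBasedStoreConfigurationBuilder;
import org.infinispan.transaction.TransactionMode;

ConfigurationBuilder builder = new ConfigurationBuilder();
builder.transaction().transactionMode(TransactionMode.TRANSACTIONAL);
builder.persistence()
    .addStore(JdbcStringBasedStoreConfigurationBuilder.class)
        .transactional(true) // enlist the store in the cache transaction (newer Infinispan only)
        .table()
            .tableNamePrefix("ISPN")
            .idColumnName("ID").idColumnType("VARCHAR(255)")
            .dataColumnName("DATA").dataColumnType("BLOB")
            .timestampColumnName("TS").timestampColumnType("BIGINT")
        .connectionPool()
            .connectionUrl("jdbc:h2:mem:cachestore") // placeholder connection settings
            .driverClass("org.h2.Driver")
            .username("sa");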
Although Infinispan is trying to get close to a strongly consistent in-memory database, given its guarantees it is still rather a cache. If you are interested in the design limitations (some of them are also theoretical limitations), I recommend reading the Infinispan wiki.

Related

Does Custom CacheStore for Ignite automatically do write-behind?

I have implemented a custom CacheStore for Ignite to communicate with Elasticsearch. Now, if the Elasticsearch server is down for some time, will the data in the cache be uploaded to Elasticsearch once the ES server is back up?
Does Custom CacheStore for Ignite automatically do write-behind?
No. It is disabled by default: https://apacheignite.readme.io/docs/3rd-party-store#section-configuration
setWriteBehindEnabled(boolean): sets the flag indicating whether write-behind is enabled. Default: false.
Now, if the Elasticsearch server is down for some time, will the data in the cache be uploaded to Elasticsearch once the ES server is up?
No. Ignite will not send that data again. This is also specified in the documentation:
Performance vs. Consistency
Enabling write-behind caching increases performance by performing asynchronous updates, but this can lead to a potential drop in consistency, as some updates could be lost due to node failures or crashes.
If you enable writeBehind in your CacheConfiguration, Ignite will automatically add write-behind functionality to your cache store: it wraps your CacheStore with GridCacheWriteBehindStore, which implements all the delaying and batching functionality.
So yes, Ignite will automatically do write-behind if you enable it in the config.
There is a write-behind mode that doesn't write entries to the cache store as soon as they are written to Ignite, but postpones them for some time and then writes the accumulated entries in a batch. When the write-behind queue size reaches CacheConfiguration#writeBehindFlushSize, further operations are blocked until the queue is flushed. If the underlying database is unavailable for some time, write operations will be retried until they succeed.
Write-behind documentation: https://apacheignite.readme.io/docs/3rd-party-store#section-write-behind-caching
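To make this concrete, here is a minimal sketch of enabling write-behind on an Ignite cache; ElasticsearchCacheStore is a hypothetical stand-in for the custom CacheStore from the question, and the queue sizes and flush frequency are arbitrary values to tune:
import javax.cache.configuration.FactoryBuilder;
import org.apache.ignite.configuration.CacheConfiguration;

CacheConfiguration<String, String> ccfg = new CacheConfiguration<>("esBackedCache");
ccfg.setCacheStoreFactory(FactoryBuilder.factoryOf(ElasticsearchCacheStore.class));
ccfg.setWriteThrough(true);               // the store must be enabled for writes at all
ccfg.setWriteBehindEnabled(true);         // wraps the store with GridCacheWriteBehindStore
ccfg.setWriteBehindFlushSize(10240);      // writes block once the queue reaches this size
ccfg.setWriteBehindFlushFrequency(5000);  // flush the queue at least every 5 seconds
ccfg.setWriteBehindBatchSize(512);        // entries per batch handed to the store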
As opposed to this mode, there is a write-through mode, which makes every operation go to the cache store as soon as it is performed. If the cache store fails to save an entry, the operation itself is rolled back.
Write-through documentation: https://apacheignite.readme.io/docs/3rd-party-store#section-read-through-and-write-through
I think that since you are implementing a custom CacheStore, you need to handle this on your own.
At the same time, you may take a look at Apache Ignite's write-behind mechanism, which accumulates updates in a queue and sends them when required. In theory, if you have enough memory for a large queue, it can help you survive Elasticsearch server downtime. But bear in mind that with write-behind, Ignite won't provide consistency guarantees.

Ehcache to refresh when an update takes place in a database

I am new to Ehcache. My REST API cache works with the following configuration:
<cache name="com.a.b.c.model.Act"
       maxElementsInMemory="10000" overflowToDisk="false" statistics="true" />
If I update the database directly with a query, the cache does not pick up those changes.
If I update through the REST API, the cache gets refreshed.
What change do I have to make so that the cache refreshes when a change happens in the database?
Is it good to go with timeToLiveSeconds, or are there other configurations that would work?
Updating the cache when the underlying system of record changes, rather than through the service methods on which caching is performed, is the classical problem with caching.
Ehcache does not provide an out-of-the-box solution there, as that would mean supporting ALL the technologies that can act as a system of record: not only databases, but also web services and really anything a programmer can come up with.
However, if your application can live with outdated data in the cache for a small period of time, then expiry can be helpful. It effectively marks cache entries as valid for a period of time (the time to live).
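For example, a time-to-live can be added directly to the cache element from the question; 60 seconds is an arbitrary choice, so tune it to how much staleness you can tolerate:
<cache name="com.a.b.c.model.Act"
       maxElementsInMemory="10000" overflowToDisk="false"
       statistics="true" timeToLiveSeconds="60" />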
If your application cannot support stale data in cache, you will have to make sure the process to update the system of record takes care of invalidating the cache as well.
If the code that is updating the database can access the Ehcache manager (same VM or distributed persistence), the best solution is to evict the entry:
CacheManager.getInstance().getEhcache("mycache").remove(key);
and let the cache refresh autonomously.
If you already have the updated object at hand, you can also skip one step with a put:
CacheManager.getInstance().getEhcache("mycache").put(key, updatedObject);
If your code uses CacheEntryFactories, I would let the factory do the work, since entry creation is centralized and you can add more logic there.
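As a minimal sketch of that pattern, Ehcache's SelfPopulatingCache funnels all entry creation through one CacheEntryFactory; loadFromDatabase is a hypothetical stand-in for your data access code:
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Ehcache;
import net.sf.ehcache.constructs.blocking.CacheEntryFactory;
import net.sf.ehcache.constructs.blocking.SelfPopulatingCache;

Ehcache underlying = CacheManager.getInstance().getEhcache("mycache");
SelfPopulatingCache cache = new SelfPopulatingCache(underlying, new CacheEntryFactory() {
    public Object createEntry(Object key) throws Exception {
        // hypothetical DB lookup; an entry evicted with remove(key)
        // is rebuilt here on the next get(key)
        return loadFromDatabase(key);
    }
});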

Difference between in-memory-store and managed-store in Mule cache

What are the main differences between in-memory-store and managed-store in the Mule cache scope, and which gives the best performance?
What is the best way to configure caching in a global scope?
We are currently using in-memory-store caching, and we keep running into memory outages because our server has a modest hardware configuration. We are using Mule 3.7.
Please provide your suggestions for configuring the cache in an optimized way.
We are also facing an issue with cache expiration with in-memory-store: cached data is not being expunged after the expiration time. When we use managed-store, it works as expected.
In-memory:
This stores the data in system memory. The data stored in-memory is non-persistent, which means that in case of an API restart or crash, the cached data will be lost.
Managed-store:
This stores the data in a place defined by a ListableObjectStore. The data stored with managed-store is persistent, which means that in case of an API restart or crash, the cached data will not be lost.
Source (explained in detail with configuration difference):
http://www.tutorialsatoz.com/caching-in-mule-cache-scope/
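A minimal sketch of the two strategies in Mule 3.x XML, assuming the EE cache module; the strategy and store names are illustrative, the maxEntries/entryTTL/expirationInterval values are arbitrary, and the latter two are in milliseconds:
<ee:object-store-caching-strategy name="inMemoryCachingStrategy">
    <in-memory-store name="myInMemoryStore"
                     maxEntries="500" entryTTL="60000" expirationInterval="30000"/>
</ee:object-store-caching-strategy>
<ee:object-store-caching-strategy name="managedCachingStrategy">
    <managed-store storeName="myManagedStore" persistent="true"
                   maxEntries="500" entryTTL="60000" expirationInterval="30000"/>
</ee:object-store-caching-strategy>
The cache scope then references one of them via its cachingStrategy-ref attribute. With persistent="true", the managed store survives a restart, which matches the managed-store behaviour described in this thread.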
A friend of mine explained the difference to me as follows:
in-memory-store: a temporary memory storage area where the data is held. For example, when using a VM component in Mule, the data is stored in the VM in the form of an in-memory queue.
managed-store: we can store the data and use it in later stages, for example in an object store.
Mainly, a cache stores frequently used data. It reduces DB or HTTP calls by saving frequently used data or results in the cache scope.
But both are for temporary storage only, meaning they are valid for that particular session alone.

What happens when a new Ehcache CacheManager is created?

In my application I use ehcache with several caches that are backed by a terracotta server.
I noticed that there is a correlation between the size of the data that is saved in the server and the time it takes to create a cache manager instance in the client (the bigger the size, the longer it takes).
I couldn't find any info about what actually happens when the cache manager is created.
To my understanding, the data would only be pulled when it is actually requested, not when the manager is created, so what is the overhead?
Any thoughts or references to relevant reading would be much appreciated.
First of all, the CacheManager is not related to any data pushing or pulling; it creates the caches, which contain the elements as name-value pairs and hold the data for put/get and other operations. The CacheManager handles the creation, access and removal of caches.
In fact, when you create a CacheManager whose caches participate in a Terracotta cluster, you might see a difference in the time it takes to load up. The CacheManager will establish a connection to the server specified in the config. Any pre-cache loaders, such as classes extending BootstrapCacheLoader, will affect the load time too, as will the cache consistency attribute of the caches that participate in the cluster. By design, the Terracotta server pushes the most-hit data to clients in order to reduce local cache misses, and it does the same for caches identified for pinning.
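For illustration, a Terracotta-clustered cache in ehcache.xml might look like the sketch below (the server URL, cache name and sizes are placeholders); both the consistency setting and any bootstrap configuration influence how long the CacheManager takes to come up:
<terracottaConfig url="localhost:9510"/>
<cache name="clusteredCache"
       maxElementsInMemory="10000"
       eternal="false"
       timeToLiveSeconds="3600">
    <!-- strong consistency costs more coordination at startup and runtime than eventual -->
    <terracotta consistency="strong"/>
</cache>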

Distributed, persistent cache using EHCache

I currently have a distributed cache using EHCache via RMI that works just fine. I was wondering if you can include persistence with the caches to create a distributed, persistent cache.
Alongside this, if the cache was persistent, would it load from the file store, then bootstrap from the cache cluster? Basically, what I want is:
Cache starts
Cache loads persistent objects from the file store
Cache joins the distributed cluster and bootstraps as normal
The use case behind this is having two identical components running on independent machines, distributing the cache to avoid losing data in the event that one of the components fails. The persistence would guard against losing all data on the rare occasion that both components fail.
Would moving to another distribution method (such as Terracotta) support this?
I would take a look at the write-through caching options in EHCache. As described in the link, combining a read-through and write-behind cache will provide persistence to a user-defined data store.
What Terracotta gives you is consistency (so you don't have to worry about resolving conflicts among cluster members). You have the option of defining an interface to your own store (through CacheLoader and CacheWriter) or just letting Terracotta persist your data, but I have received mixed signals from Terracotta and its documentation on whether TC is appropriate for a system of record. If your data is transient and can be blown away at any time (like web sessions), it might be OK.
Add a bootstrapCacheLoaderFactory element along with a cacheEventListenerFactory element to the cache that needs to be bootstrapped from the other nodes when it comes back up, and that replicates to the other nodes when it receives updates. In my configuration, the cache element also sets memoryStoreEvictionPolicy="LFU", diskPersistent="true", timeToLiveSeconds="86400" and maxElementsOnDisk="1000".
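A minimal sketch of such a cache element, assuming the standard RMI replication and bootstrap factories that ship with Ehcache (the cache name, heap size and factory properties are placeholders; the remaining attribute values are the ones quoted above):
<cache name="myReplicatedCache"
       maxElementsInMemory="10000"
       memoryStoreEvictionPolicy="LFU"
       diskPersistent="true"
       timeToLiveSeconds="86400"
       maxElementsOnDisk="1000">
    <!-- replicate puts/updates/removes to the other RMI peers -->
    <cacheEventListenerFactory
        class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"
        properties="replicateAsynchronously=true"/>
    <!-- on startup, pull the current cache contents from a peer -->
    <bootstrapCacheLoaderFactory
        class="net.sf.ehcache.distribution.RMIBootstrapCacheLoaderFactory"
        properties="bootstrapAsynchronously=true"/>
</cache>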
