Terracotta: disabling the L2 cache and object serialization

I'm quite new to Terracotta. I've installed it and made it work properly with EHCache for distributed caching, but what I'm getting now is not what I really want.
In my application I would like to have several client caches (L1) with Ehcache, and "propagate" the invalidation of a removed key from one client to all the other clients. I don't want my cache to also reside on the Terracotta server, so I'd like to simply disable L2 caching, so that my objects don't need to be serializable (the only actions performed on the cache are PUT and REMOVE).
I know this could be done with Ehcache alone, but I have no multicast support in my environment (Amazon EC2) and my clients are created automatically by autoscaling, so I cannot know their IPs in advance.
So basically, I need a Terracotta server only to propagate the invalidation request to all the clients. Is there any way to accomplish this?
Thanks a lot!

When you use EhCache backed by Terracotta and your cache configuration specifies Terracotta for a cache, e.g.:
<cache name="com.xyz.MyPOJO">
    <terracotta/>
</cache>
then your class must be serializable (since Terracotta will attempt to store it on the cache server instance).
However, in your configuration you can specify not to use Terracotta for some caches, e.g.:
<cache name="com.xyz.MyPOJO">
    <terracotta/>
</cache>

<cache name="com.xyz.NotServerStoredPOJO">
</cache>
Then your "NotServerStoredPOJO" from the example above will not be stored on the terracotta cache server...instead it will live only in your local EhCache... but by doing that you will not be able to propagate it to other instances if your EhCache in diff JVMs.
So perhaps you will need to keep something on the Terracotta server (some sort of flags/IDs) that indicates what to invalidate. Then, in your application code, have a common class/utility that checks that flag before getting a value from the local EhCache; if it finds the flag/ID marked for deletion, it removes the entry from the local cache and returns nothing to the requester.
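A rough sketch of that idea, assuming a small Terracotta-clustered cache that only holds flagged keys (the class name, the two cache handles and the flag-clearing policy are all hypothetical; a real version would also clear the flag once the entry has been refreshed):

import net.sf.ehcache.Cache;
import net.sf.ehcache.Element;

// Hypothetical wrapper: "localCache" is local-only (non-serializable values allowed),
// "invalidations" is a small <terracotta/>-backed cache that stores flagged keys only.
public class InvalidationAwareCache {

    private final Cache localCache;
    private final Cache invalidations;

    public InvalidationAwareCache(Cache localCache, Cache invalidations) {
        this.localCache = localCache;
        this.invalidations = invalidations;
    }

    public Object get(String key) {
        // Another node flagged this key: drop it locally and return nothing.
        if (invalidations.get(key) != null) {
            localCache.remove(key);
            return null;
        }
        Element e = localCache.get(key);
        return e == null ? null : e.getObjectValue();
    }

    public void remove(String key) {
        localCache.remove(key);
        // Publish the flag; only the String key ever reaches the Terracotta server.
        invalidations.put(new Element(key, Boolean.TRUE));
    }
}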
On the other hand, your use case kind of defeats the purpose of having a central cache server. If you want to coordinate multiple cache instances without a central location, you can use JGroups (http://www.jgroups.org) instead of Terracotta; it is also free of commercial licensing. But then you'll need to implement your own mechanism on top of JGroups to invalidate certain entries in your local EhCache instances.
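If you go the JGroups route, that mechanism can be as small as broadcasting removed keys on a channel and evicting them in every receiver. A minimal sketch, assuming the JGroups 3.x/4.x API and a CacheManager-managed Ehcache; the class name and cache name are illustrative:

import net.sf.ehcache.CacheManager;
import org.jgroups.JChannel;
import org.jgroups.Message;
import org.jgroups.ReceiverAdapter;

// Broadcast removed keys on a JGroups channel; every node that receives the
// message evicts the key from its local Ehcache.
public class JGroupsInvalidator extends ReceiverAdapter {

    private final JChannel channel;
    private final String cacheName;

    public JGroupsInvalidator(String clusterName, String cacheName) throws Exception {
        this.cacheName = cacheName;
        this.channel = new JChannel();        // default protocol stack
        this.channel.setReceiver(this);
        this.channel.connect(clusterName);
    }

    // Call this right after removing a key locally.
    public void publishRemoval(String key) throws Exception {
        channel.send(new Message(null, key)); // null destination = send to all members
    }

    @Override
    public void receive(Message msg) {
        String key = (String) msg.getObject();
        CacheManager.getInstance().getCache(cacheName).remove(key);
    }
}

The sender receives its own broadcast as well, which is harmless here since the key is already gone locally.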

Related

MyBatis caching strategy in a distributed system

I'd like to know how the MyBatis cache (local and second-level) handles data in a distributed system. I have 5 instances running against an Oracle DB, and I use MyBatis for data access. All 5 instances are the same but run on different servers. MyBatis is configured to use the SESSION-scoped local cache, which means the cache is cleared when any insert/delete/update statement is executed.
When one instance runs such a statement, the local cache on that server is cleared. How do the other 4 instances know their caches need to be flushed/renewed?
If you are using the built-in cache, no, they don't. Never enable the second-level cache if you are using MyBatis in a distributed environment with the default cache implementation, because the instances don't know what happens on each other and won't clear stale entries when a change happens.
You need to set up an external cache provider, such as Ehcache or Redis, to make the MyBatis second-level cache usable.
Please refer to http://mybatis.org/ehcache-cache/ and http://mybatis.org/redis-cache/
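For the Ehcache route, the mybatis-ehcache adapter plugs in as the cache implementation of a mapper namespace. A minimal sketch using the annotation form (the mapper interface, query and column names are made up):

import java.util.Map;

import org.apache.ibatis.annotations.CacheNamespace;
import org.apache.ibatis.annotations.Param;
import org.apache.ibatis.annotations.Select;
import org.mybatis.caches.ehcache.EhcacheCache;

// The second-level cache for this namespace is delegated to Ehcache, which can
// be configured in ehcache.xml to replicate or invalidate across instances.
@CacheNamespace(implementation = EhcacheCache.class)
public interface UserMapper {

    @Select("SELECT id, name FROM users WHERE id = #{id}")
    Map<String, Object> selectUser(@Param("id") long id);
}

The XML-mapper equivalent is a <cache type="org.mybatis.caches.ehcache.EhcacheCache"/> element at the top of the mapper file.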
A possible direction:
I guess all instances are behind a load balancer and running against a single Oracle DB.
The instance nodes would better be in a cluster; otherwise, how could they communicate with each other? The cache may then be shared between the cluster's nodes, for example as stated in the JBoss documentation on working with Hibernate.
The question is more about how to configure the server (or the application, in files such as beans.xml) to use the MyBatis cache.
If the SessionFactory is declared @ApplicationScoped, it could be enough.
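Assuming a CDI setup, and that "SessionFactory" here means MyBatis's SqlSessionFactory, an application-scoped producer could look roughly like this (the config file name is illustrative):

import java.io.IOException;
import java.io.Reader;

import javax.enterprise.context.ApplicationScoped;
import javax.enterprise.inject.Produces;

import org.apache.ibatis.io.Resources;
import org.apache.ibatis.session.SqlSessionFactory;
import org.apache.ibatis.session.SqlSessionFactoryBuilder;

// One SqlSessionFactory per application: the factory owns the MyBatis
// configuration (including the second-level cache wiring), while SqlSessions
// stay short-lived and per-request.
@ApplicationScoped
public class SqlSessionFactoryProducer {

    @Produces
    @ApplicationScoped
    public SqlSessionFactory sqlSessionFactory() {
        try (Reader config = Resources.getResourceAsReader("mybatis-config.xml")) {
            return new SqlSessionFactoryBuilder().build(config);
        } catch (IOException e) {
            throw new IllegalStateException("Cannot load mybatis-config.xml", e);
        }
    }
}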

Is the overhead of serializing and deserializing POJOs a good reason for using Infinispan over Memcached or Redis for caching POJOs?

I need to cache different user and application data on a daily basis.
Context:
no experience with caches
working on a Java web application that sends news articles to users, displayed in a user-feed format
MySQL backend
Java middle tier using Hibernate and Jersey
I've checked out different cache technologies, and it seems like Memcached and Redis are the most used in use cases similar to mine -- many reads and writes, e.g. Facebook, Twitter, etc.
But I would have to serialize objects before caching them in the two systems above. That seemed like an unnecessary step just to cache a POJO, so I checked out POJO caches and stumbled upon JBoss's Infinispan.
Does anyone have any good reasons why I shouldn't use Infinispan over Memcached or Redis over the serialization, and subsequent deserialization, overhead concern?
When Infinispan works in clustered mode, or when it has to offload data to external stores, it has to deal with serialization.
The good news is:
- you'll avoid any serialization cost unless the data has to go somewhere else
- its own serialization mechanism is far more efficient than Java's standard serialization (and it is nicely customizable)
Memcached and Redis are "external" caching solutions, while with Infinispan you can keep the same Java instance cached. Whether this is a good or bad thing depends on your architecture specifics.
Commonly, though, you'll want a hybrid solution: use Infinispan for your in-JVM needs, cap its memory usage, and have it offload what can't fit locally to an external store; it's easy to have it offload the extra entries to Redis, Memcached, another Infinispan cluster, or several other alternatives.
Your benefit is transparent integration with some popular frameworks (e.g. Hibernate) and that it can handle the serialization efficiently for you, if and when it's needed, possibly in the background.
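For the in-JVM case described above, a minimal embedded setup looks roughly like this (a sketch assuming the Infinispan 9.x/10.x embedded API; the cache name and types are illustrative):

import org.infinispan.Cache;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;

public class ArticleCacheDemo {

    public static void main(String[] args) {
        // Local (non-clustered) cache manager: values are kept as live Java
        // references, so nothing is serialized on put/get.
        DefaultCacheManager manager = new DefaultCacheManager();

        Configuration cfg = new ConfigurationBuilder()
                .memory().size(10_000)   // cap the number of in-memory entries
                .build();
        manager.defineConfiguration("articles", cfg);

        Cache<Long, String> articles = manager.getCache("articles");
        articles.put(1L, "cached article body");
        System.out.println(articles.get(1L));

        manager.stop();
    }
}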

Why does everyone recommend avoiding EHCache as a distributed cache in Play 2.x?

I want to cluster EHCache across several nodes in a Play Framework 2.x web application. Why does everyone recommend avoiding EHCache as a distributed cache in a clustered Play 2.x web application?
I use an nginx proxy to distribute requests across the Play nodes, and I want the default EHCache of each node to share its content.
Well, according to this EHCache page, using EHCache in distributed mode is a commercial product. So if you want to use a free distributed cache, you need something different, like Memcached or Redis.
My experience deploying a (Java) Play 2.2.3 application to Amazon EC2 with EHCache was terrible. It required a few workarounds for localhost resolution (going su on each of your nodes, which is hard work when you have a few dozen servers), and regardless, being free only for the standalone version without clearly saying so up front is a big no-no for me. I'm done with EHCache.
Edit: moved to Redis in 2015 (thanks @Traveler)
I am not aware of any Play Framework issues here, but using Ehcache 2.x should be fine: you can set it up with JGroups (faster than RMI) and use invalidation mode (Infinispan slang).
Invalidation is a clustered mode that does not actually share any data at all, but simply aims to remove data that may be stale from remote caches. This cache mode only makes sense if you have another, permanent store for your data.
In Ehcache 2.x you can set up invalidation mode with replicatePuts=false in your JGroups replication configuration.
Ehcache 3.x has no such mode. You have to set up a commercial Terracotta server, which is a distributed cache, so all data is moved between the nodes and the Terracotta server.
We tried it once and failed terribly.
As Ehcache 2.x is no longer actively developed, we simply switched to Infinispan, which has all the features of Ehcache 2.x and a lot more.
So my recommendation: use Ehcache 2.x or Infinispan. Do not use Ehcache 3.x.
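For reference, invalidation mode in Infinispan is just a cache mode on a clustered configuration. A minimal sketch using the embedded API (the cache name and class name are illustrative):

import org.infinispan.Cache;
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;

public class InvalidationModeDemo {

    public static void main(String[] args) {
        // Clustered transport (JGroups underneath), but the cache only sends
        // invalidation messages to peers, never the cached values themselves.
        DefaultCacheManager manager = new DefaultCacheManager(
                GlobalConfigurationBuilder.defaultClusteredBuilder().build());

        manager.defineConfiguration("pages", new ConfigurationBuilder()
                .clustering().cacheMode(CacheMode.INVALIDATION_SYNC)
                .build());

        Cache<String, Object> pages = manager.getCache("pages");
        pages.put("home", "rendered home page"); // value stays local; peers drop any stale copy of "home"
        pages.remove("home");                    // peers are told to invalidate the key too

        manager.stop();
    }
}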

Spring+Hibernate(with 2nd level cache enabled) in Tomcat(clustered), do I need JTA?

As the title says, I have a web application which should be able to run on a cluster with the Hibernate 2nd-level cache enabled and org.springframework.orm.hibernate.HibernateTransactionManager as the transaction manager. The application has only one database. It will be deployed on Tomcat 7, and for some reason the company will not use any application server (I'm not in charge). Now, I checked some cache providers, for example Infinispan, which, as the docs say, is cluster-safe when JTA is used as the transaction manager.
My job is to research a caching solution which is cluster-safe.
Now I want to know: is it possible to achieve a cluster-safe cache with the above stack? Is JTA a must?
I've had success using org.hibernate.cache.EhCacheProvider with org.springframework.orm.hibernate3.HibernateTransactionManager in a clustered environment on both Tomcat and JBoss (albeit an earlier version of Tomcat than the version you're using). It wasn't necessary to use JTA.
EHCache supports clustering right out of the box through various replication mechanisms. I've used the RMI replicated caching mechanism, which uses multicast for automatic peer discovery, and that worked quite nicely in a multi-node cluster with multiple caches per node.
Once configured, replication would take place between the caches within a node and between caches across nodes. It was very reliable, transparent as far as the application was concerned and I don't recall ever having to deal with any issues associated with it. It just worked.
You can specify EhCacheProvider in your Hibernate configuration along with the properties to enable second level caching:
hibernate.cache.use_second_level_cache=true
hibernate.cache.use_query_cache=true
hibernate.cache.provider_class=org.hibernate.cache.EhCacheProvider
The remainder of the configuration is in the ehcache.xml file which defines the caches and the replication configuration. It may be worth checking out the EHCache documentation if you're not familiar with the format of ehcache.xml - but they provide a useful example file here.
An example replicated cache from ehcache.xml may look something like this:
<cache name="example"
maxElementsInMemory="1000"
eternal="false"
overflowToDisk="false"
timeToIdleSeconds="0"
timeToLiveSeconds="600">
<cacheEventListenerFactory
class="net.sf.ehcache.distribution.RMICacheReplicatorFactory"/>
<bootstrapCacheLoaderFactory
class="net.sf.ehcache.distribution.RMIBootstrapCacheLoaderFactory"
properties="bootstrapAsynchronously=true, maximumChunkSizeBytes=5000000"/>
</cache>
And then you'll need to add the replication settings which may look like this:
<cacheManagerPeerProviderFactory
    class="net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory"
    properties="peerDiscovery=automatic, multicastGroupAddress=230.0.0.2,
                multicastGroupPort=4455, timeToLive=1"/>

<cacheManagerPeerListenerFactory
    class="net.sf.ehcache.distribution.RMICacheManagerPeerListenerFactory"
    properties="hostName=localhost, port=40001, socketTimeoutMillis=2000"/>
That's really about it. There are other ways to configure replication in EHCache as described in the documentation but the RMI method described above is relatively simple and has worked well for me. If you do decide to go with EHCache, in addition to the documentation there are various posts on StackOverflow relating to replication that you might want to consult.

Spring cache for two Grails applications on the same machine (different Jetty servers)

Hi, I have one Grails application that uses Spring cache. I want to clone it (say APP_A and APP_B) and deploy the copies separately, as each accesses a different DB and has some different configuration.
Currently I have two copies of the Jetty server (JETTY_A and JETTY_B, on different ports). I put APP_A in JETTY_A and APP_B in JETTY_B.
I'm not familiar with Spring cache.
Is this deployment safe? I mean, will there be any mixing of cache entries between the two? Because both use the same code base, the caches will use the same key names, for example:
@Cacheable("someCache")
SpringCache uses EHCache under the covers. The caches are in-process caches and they do not affect caches running in other processes on the same machine, unless you have explicitly configured distributed caching.
As @KenLiu said in his answer, Spring Cache is strictly in-process when using EHCache as its cache provider. Since you are working with Grails, however, there are better alternatives that require only minimal changes.
The Grails Cache Plugin offers a Spring Cache API-compatible cache abstraction over a number of (pluggable) cache providers, including some, like the Redis provider, that allow you to cache across processes (and entire machines) very easily.
