GemFire cache throwing PartitionedRegionException: Object hashCode inconsistent between peers - caching

When using partitioned caching in GemFire, integrated with Spring Data through the @Cacheable annotation, data is put into the cache properly. But when retrieving from the cache, if the key is on a different partition, it throws a PartitionedRegionException saying the hashCode is inconsistent between cache peers. I have overridden the equals and hashCode methods in the class whose objects are keys for the cache. Any idea where I could be going wrong? The two cache peers are on the same machine, and the locator is started externally.
I'm starting the cache using the following method.
@Bean
@Primary
Cache getGemfireCache() {
    Cache cache = new CacheFactory().create();
    RegionFactory<Object, Object> regionFactory = cache.createRegionFactory(RegionShortcut.PARTITION);
    allCacheNames.forEach(cacheName -> regionFactory.create(cacheName));
    return cache;
}
Any help would be appreciated.
Thanks!

Hmmm.
First, it is hard to tell exactly what problem you are experiencing, but I am nearly certain it has far less to do with Spring Data, or technically, Spring's Cache Abstraction in this case (especially since you mention "caching" using the @Cacheable annotation), than it does with, say, Pivotal GemFire itself, or more likely, your application domain model specifically.
Second, the problem you are experiencing has very little to do with your configuration shown above. Essentially, in your configuration, you are creating a "peer" Cache instance along with Regions for each of the caches identified in the @Cacheable annotations declared on your application service methods, which is not particularly interesting in this case.
TIP: Regarding configuration, it would have been better to do this:
@SpringBootApplication
@EnableCachingDefinedRegions
public class MyCachingSpringBootApplication { ... }
See here, here and here for more information.
NOTE: SBDG creates a ClientCache instance by default, not a "peer" Cache instance. If you truly want your Spring application to contain an embedded peer Cache instance and be part of the server cluster, then you would additionally override SBDG's preference of auto-configuring a ClientCache instance by declaring the @PeerCacheApplication annotation. See here for more details.
Next, you mention that you "overrode" equals and hashCode, which seems to suggest you are using some complex key. In general, it is better to keep with simple key types when using Pivotal GemFire, such as Long, Integer, String, etc, for reasons like what you are experiencing.
A better option if you need to influence your partitioning strategy or data organization across the cluster (e.g. perhaps for collocation) is to implement GemFire's PartitionResolver and register it with the PR.
However, it is not uncommon for your cacheable service methods to look like the following:
@Cacheable("CustomersByAccount")
Account findBy(Customer customer) { ... }
As you may well know, the "key" to the @Cacheable "findBy" service method shown above is Customer, which is clearly a complex object and must have valid equals and hashCode methods when used as a key in the GemFire cache Region backing the application cache "CustomersByAccount".
A few questions:
Is it possible that A) your complex key's class definition (e.g. like Customer) changed, such as by adding/removing a [new] field or by changing a field type (?) and B) your PARTITION Region backing the cache (e.g. "CustomersByAccount") is persistent?
Are your equals and hashCode methods consistent? That is, do they use the same fields to determine the result of equals and hashCode?
For example, this would not be valid:
class Customer {
    private Long id;
    private String firstName;
    private String lastName;
    ...
    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Customer)) {
            return false;
        }
        Customer that = (Customer) obj;
        return this.id.equals(that.id);
    }
    @Override
    public int hashCode() {
        int hashValue = 17;
        hashValue = 37 * hashValue + this.firstName.hashCode();
        hashValue = 37 * hashValue + this.lastName.hashCode();
        return hashValue;
    }
    ...
}
Or any other combination where equals/hashCode could potentially yield a different result depending on state previously stored in GemFire.
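By contrast, a key whose equals and hashCode are derived from the same field stays consistent. The following is only a minimal sketch, assuming the id alone identifies a customer; the CustomerKey class name is hypothetical and not taken from the original post.

```java
import java.util.Objects;

// Hypothetical immutable key type; equals and hashCode both use only 'id'.
final class CustomerKey {

    private final Long id;

    CustomerKey(Long id) {
        this.id = Objects.requireNonNull(id, "id is required");
    }

    Long getId() {
        return this.id;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof CustomerKey)) {
            return false;
        }
        CustomerKey that = (CustomerKey) obj;
        return this.id.equals(that.id);
    }

    @Override
    public int hashCode() {
        // Same field as equals, so equal keys always produce the same hash.
        return this.id.hashCode();
    }
}
```

Long.hashCode() is computed from the numeric value, so two JVMs agree on it. A hash that falls back to the identity-based Object.hashCode() (which enums also use) can differ from one JVM to the next, so it is worth checking that no field feeding hashCode behaves that way.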
You might also try clearing the cache and rehydrating (eagerly or lazily as necessary), particularly if your class definitions have changed and especially if some of those class types are used as keys.
Also, in general, I would recommend immutable keys as much as possible if it is not possible to strictly stick to simple/scalar types (e.g. like Long or String).
Perhaps, if you could share a few more details about your application domain model classes, such as the types used as keys, along with your use of Spring's Cache Abstraction on your service methods, that might help.
Also, any examples or test cases reproducing the problem are greatly appreciated.
Thanks!

Related

Spring Caching not working for findAll method

I have recently started working on caching the result from a method. I am using @Cacheable and @CachePut to implement the desired functionality.
But somehow, the save operation is not updating the cache for findAll method. Below is the code snippet for the same:
@RestController
@RequestMapping(path = "/test/v1")
@CacheConfig(cacheNames = "persons")
public class CacheDemoController {
    @Autowired
    private PersonRepository personRepository;
    @Cacheable
    @RequestMapping(method = RequestMethod.GET, path="/persons/{id}")
    public Person getPerson(@PathVariable(name = "id") long id) {
        return this.personRepository.findById(id);
    }
    @Cacheable
    @RequestMapping(method = RequestMethod.GET, path="/persons")
    public List<Person> findAll() {
        return this.personRepository.findAll();
    }
    @CachePut
    @RequestMapping(method = RequestMethod.POST, path="/save")
    public Person savePerson(@RequestBody Person person) {
        return this.personRepository.save(person);
    }
}
For the very first call to the findAll method, it stores the result in the "persons" cache, and for all subsequent calls it returns the same result even if the save() operation has been performed in between.
I am pretty new to caching so any advice on this would be of great help.
Thanks!
So, a few things come to mind regarding your UC and looking at your code above.
First, I am not a fan of users enabling caching in either the UI or Data tier of the application, though it makes more sense in the Data tier (e.g. DAOs or Repos). Caching, like Transaction Management, Security, etc, is a service-level concern and therefore belongs in the Service tier IMO, where your application consists of: [Web|Mobile|CLI]+ UI -> Service -> DAO (a.k.a. Repo). The advantage of enabling Caching in the Service tier is that it is more reusable across your application/system architecture. Think of servicing Mobile app clients in addition to Web, for instance. Your Controllers for your Web tier may not necessarily be the same as those handling Mobile app clients.
I encourage you to read the chapter in the core Spring Framework's Reference Documentation on Spring's Cache Abstraction. FYI, Spring's Cache Abstraction, like TX management, is deeply rooted in Spring's AOP support. However, for your purposes here, let's break your Spring Web MVC Controller (i.e. CacheDemoController) down a bit as to what is happening.
So, you have a findAll() method that you are caching the results for.
WARNING: Also, I don't generally recommend that you cache the results of a Repository.findAll() call, especially in production! While this might work just fine locally given a limited data set, the CrudRepository.findAll() method returns all results in the backing data store (e.g. the Person Table in an RDBMS) for that particular object/data type (e.g. Person) by default, unless you are employing paging or some LIMIT on the result set returned. When it comes to caching, always think of data with a high degree of reuse and relatively infrequent changes; these are good candidates for caching.
Given your Controller's findAll() method has NO method parameters, Spring is going to determine a "default" key to use to cache the findAll() method's return value (i.e. List<Person>).
TIP: see Spring's docs on "Default Key Generation" for more details.
NOTE: In Spring, as with caching in general, Key/Value stores (like java.util.Map) are the primary implementations for Spring's notion of a Cache. However, not all "caching providers" are equal (e.g. Redis vs. a java.util.concurrent.ConcurrentHashMap, for instance).
After calling the findAll() Controller method, your cache will have...
KEY | VALUE
------------------------
abc123 | List of People
NOTE: the cache will not store each Person in the list individually as a separate cache entry. That is not how method-level caching works in Spring's Cache Abstraction, at least not by default. However, it is possible.
Then, suppose your Controller's cacheable getPerson(id:long) method is called next. Well, this method includes a parameter, the Person's ID. The argument to this parameter will be used as the key in Spring's Cache Abstraction when the Controller getPerson(..) method is called and Spring attempts to find the (possibly existing) value in the cache. For example, say the method is called with controller.getPerson(1). Except a cache entry with key 1 does not exist in the cache, even if that Person (1) is in the list mapped to key abc123. Thus, Spring is not going to find Person 1 in the list and return it, and so this op results in a cache miss. When the method returns, the value (the Person with ID 1) will be cached. But the cache now looks like this...
KEY | VALUE
------------------------
abc123 | List of People
1 | Person(1)
Finally, a user invokes the Controller's savePerson(:Person) method. Again, the savePerson(:Person) Controller method's parameter value is used as the key (i.e. a "Person" object). Let's say the method is called like so: controller.savePerson(person(1)). Well, the @CachePut happens when the method returns, and the existing cache entry for Person 1 is not updated since the "key" is different. Instead, a new cache entry is created, and your cache now looks like this...
KEY | VALUE
---------------------------
abc123 | List of People
1 | Person(1)
Person(1) | Person(1)
None of which is probably what you wanted nor intended to happen.
So, how do you fix this? Well, as I mentioned in the WARNING above, you probably should not be caching an entire collection of values returned from an op. And, even if you do, you need to extend Spring's Caching infrastructure OOTB to handle Collection return types, to break the elements of the Collection up into individual cache entries based on some key. This is considerably more involved.
You can, however, add better coordination between the getPerson(id:long) and savePerson(:Person) Controller methods. Basically, you need to be a bit more specific about your key to the savePerson(:Person) method. Fortunately, Spring allows you to "specify" the key, either by providing a custom KeyGenerator implementation or simply by using SpEL. Again, see the docs for more details.
So your example could be modified like so...
@CachePut(key = "#result.id")
@RequestMapping(method = RequestMethod.POST, path="/save")
public Person savePerson(@RequestBody Person person) {
    return this.personRepository.save(person);
}
Notice the @CachePut annotation with the key attribute containing the SpEL expression. In this case, I indicated that the cache "key" for this Controller savePerson(:Person) method should be the return value's (i.e. the "#result") or Person object's ID, thereby matching the Controller getPerson(id:long) method's key, which will then update the single cache entry for the Person keyed on the Person's ID...
KEY | VALUE
---------------------------
abc123 | List of People
1 | Person(1)
Still, this won't handle the findAll() method, but it works for getPerson(id) and savePerson(:Person). Again, see my answers to the posting(s) on Collection values as return types in Spring's Caching infrastructure and how to handle them properly. But, be careful! Caching an entire Collection of values as individual cache entries could wreak havoc on your application's memory footprint, resulting in OOME. You definitely need to "tune" the underlying caching provider in this case (eviction, expiration, compression, etc) before putting a large number of entries in the cache, particularly at the UI tier where literally thousands of requests may be happening simultaneously; then "concurrency" becomes a factor too! See Spring's docs on sync capabilities.
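The key mismatch walked through above can be reproduced without Spring. The sketch below is only a simulation under stated assumptions: a plain HashMap stands in for the "persons" cache, "abc123" is the same placeholder key used in the tables above, and the CacheKeyWalkthrough class and its method names are hypothetical, not Spring APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of the three cache operations; no Spring involved.
class CacheKeyWalkthrough {

    // Stand-in for the Person entity; deliberately has no equals/hashCode override.
    static final class Person {
        final long id;
        final String name;

        Person(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // savePerson(..) keyed on its argument, i.e. the Person object itself.
    static Map<Object, Object> withDefaultKeys() {
        Map<Object, Object> cache = new HashMap<>();
        Person person = new Person(1L, "Jon Doe");
        List<Person> everyone = new ArrayList<>();
        everyone.add(person);

        cache.put("abc123", everyone); // findAll(): placeholder default key
        cache.put(person.id, person);  // getPerson(1): key is the id argument
        cache.put(person, person);     // savePerson(person): key is the Person
        return cache;
    }

    // savePerson(..) keyed on the id of the returned Person instead.
    static Map<Object, Object> withIdKey() {
        Map<Object, Object> cache = new HashMap<>();
        Person person = new Person(1L, "Jon Doe");
        List<Person> everyone = new ArrayList<>();
        everyone.add(person);

        cache.put("abc123", everyone);
        cache.put(person.id, person);

        Person saved = new Person(1L, "Jon R. Doe");
        cache.put(saved.id, saved);    // overwrites the entry for key 1
        return cache;
    }

    public static void main(String[] args) {
        System.out.println(withDefaultKeys().size()); // prints 3
        System.out.println(withIdKey().size());       // prints 2
    }
}
```

In the second variant the entry for key 1 holds the saved Person, while the list stored under abc123 still holds the old one, which matches the stale findAll() result described in the question.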
Anyway, hope this helps aid your understanding of caching, with Spring in particular, as well as caching in general.
Cheers,
-John

Serializing element into custom Hybris CacheRegion

Info: using Hybris 5.7
I've been trying to write a custom CacheRegion in order to use Redis for this purpose (instead of the in-memory solutions provided, such as Maps or EHCache).
Following the instructions provided here didn't seem to be enough.
The point where I'm stuck is serializing the element object, which doesn't implement Serializable, so I can't serialize it into JSON, a byte array or anything else (tried with Jackson, Kryo, FST and the default Java serializer).
The code is something like the following (skipping other parts):
...
@Override
public Object getWithLoader(CacheKey cacheKey, CacheValueLoader cacheValueLoader) throws CacheValueLoadException {
    return Optional.ofNullable(get(cacheKey))
        .orElseGet(() -> {
            Object element = cacheValueLoader.load(cacheKey);
            // Can't get to serialize *element* to store it
            return element;
        });
}
...
While debugging, I found out that Object element is actually an instance of GenericBMPBean$GenericItemEntityStateCacheUnit, and it seems to contain a lot of things (except the actual data, which I couldn't find, oddly).
Another thing that I don't understand yet is the usage of CacheKey.CacheUnitValueType, which seems to be ignored by the EhCache implementation. Even when it is NON_SERIALIZABLE, the value is stored in EhCache.
So my question is: How should I serialize this kind of data?
Bonus question: what is the intended usage of the CacheUnitValueType flag?
The main goal behind this is to decouple the cache from the application JVM and increase HA and scalability.
Thank you.
The point that I'm stuck is the moment to serialize the element object which doesn't implement Serializable so I can't get to serialize it.
An object which doesn't implement Serializable in hybris should not be serialized. You have to take care of what gets cached.
It's stated in the documentation:
It is not possible on entity and type system regions because of not serializable object nature.

@Cacheable with Spring 3.1

I am using @Cacheable with Spring 3.1. I am a little bit confused by the value and key parameters in @Cacheable.
Here is what I am doing:
@Cacheable(value = "message", key = "#zoneMastNo")
public List<Option> getAreaNameOptionList(String local, Long zoneMastNo) {
    //..code to fetch data from database..
    return list;
}
@Cacheable(value = "message", key = "#areaMastNo")
public List<Option> getLocalityNameOptionList(String local, Long areaMastNo) {
    //..code to fetch data from database..
    return list;
}
What is happening here: the second method depends on the value selected from the first method,
but the issue is that when I pass zoneMastNo = 1 and areaMastNo = 1, the second method returns the first method's result.
Actually, I have lots of services, hence I am looking to use a common value for @Cacheable for specific use cases.
Now my questions are:
How can I solve this issue?
Is it a good idea to use @Cacheable for every service?
After a specified time, will the cache be completely removed from memory without using @CacheEvict?
How can I solve this issue?
I assume zoneMastNo and areaMastNo are completely different keys, by which I mean List<Option> for zoneMastNo = 1 is not the same as List<Option> for areaMastNo = 1. This means you need two caches - one keyed by zone and the other by area. However you are explicitly using only one cache named message. Quoting 29.3.1 @Cacheable annotation:
@Cacheable("books")
public Book findBook(ISBN isbn) {...}
In the snippet above, the method findBook is associated with the cache named books.
So if I understand correctly, you should basically use two different caches:
@Cacheable(value = "byZone", key = "#zoneMastNo")
public List<Option> getAreaNameOptionList(String local, Long zoneMastNo)
//...
@Cacheable(value = "byArea", key = "#areaMastNo")
public List<Option> getLocalityNameOptionList(String local, Long areaMastNo)
Also are you sure these methods won't have a different result depending on local parameter? If not, what is it used for?
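To see why the shared cache name matters, here is a small plain-Java illustration. It is only a sketch under stated assumptions: each HashMap stands in for one named cache, the cached values are placeholder strings rather than List<Option>, and the SharedCacheCollision class is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration; each Map stands in for one named cache.
class SharedCacheCollision {

    // One cache for both lookups: key 1 means "zone 1" and "area 1" at once.
    static String lookupWithSharedCache() {
        Map<Long, String> message = new HashMap<>();
        message.put(1L, "areas of zone 1"); // cached by the zone lookup

        // The area lookup finds key 1 already present and returns the
        // cached value instead of loading the localities.
        return message.computeIfAbsent(1L, key -> "localities of area 1");
    }

    // Two caches: the same numeric key no longer collides.
    static String lookupWithSeparateCaches() {
        Map<Long, String> byZone = new HashMap<>();
        Map<Long, String> byArea = new HashMap<>();
        byZone.put(1L, "areas of zone 1");

        // Key 1 is absent from byArea, so the loader runs.
        return byArea.computeIfAbsent(1L, key -> "localities of area 1");
    }

    public static void main(String[] args) {
        System.out.println(lookupWithSharedCache());    // prints areas of zone 1
        System.out.println(lookupWithSeparateCaches()); // prints localities of area 1
    }
}
```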
Is it a good idea to use @Cacheable for every service?
No, for the following reasons:
some methods are just fast enough
...and caching introduces some overhead of its own
some services call other services; do you need caching on every level of the hierarchy?
caching needs memory, a lot of it
cache invalidation is hard
After a specified time, will the cache be completely removed from memory without using @CacheEvict?
That totally depends on your cache implementation. But every sane implementation has such an option, e.g. EhCache.
Question 3:
It depends on your cache expiration configuration. If you use EhCache, change the settings in ehcache.xml.

EhCache: @CacheEvict on Multiple Objects Using Annotations

I understand that using Spring's (3.1) built in CacheManager using the EhCache implementation, there are certain limitations when in proxy mode (the default) as per this post:
Spring 3.1 @Cacheable - method still executed
Consider the scenario I have:
@CacheEvict(value = "tacos", key = "#tacoId", beforeInvocation = true)
void removeTaco(String tacoId) {
    // Code to remove taco
}
void removeTacos(Set<String> tacoIds) {
    for (String tacoId : tacoIds) {
        removeTaco(tacoId);
    }
}
In this repository method, calling removeTacos(tacoIds) will not actually Evict anything from the Cache because of the limitation described above. My workaround, is that on a service layer above, if I wanted to delete multiple tacos, I'd be looping through each taco Id and passing it into removeTaco(), and never using removeTacos()
However, I'm wondering if there's another way to accomplish this.
1) Is there a SpEL expression that I could pass into the key that would tell EhCache to expire every id in the Set?
e.g. @CacheEvict(value = "tacos", key = "#ids.?[*]") // I know this isn't valid, just can't find the expression.
Or is there a way I can have removeTacos() call removeTaco and actually expire the cached objects?
The @Caching annotation can be used to combine multiple annotations of the same type, such as @CacheEvict or @CachePut. This is the example from the Spring documentation:
@Caching(evict = { @CacheEvict("primary"), @CacheEvict(value="secondary", key="#p0") })
public Book importBooks(String deposit, Date date)
public Book importBooks(String deposit, Date date)
You can do one of two things:
@CacheEvict(value = "tacos", allEntries = true)
removeTacos(Set<String> tacoIds)
which is not so bad if tacos are read a lot more than they are removed,
OR
removeTacos(Set<String> tacoIds) {
    for (String tacoId : tacoIds) {
        getTacoService().removeTaco(tacoId);
    }
}
By calling the service (proxy) you invoke the cache eviction.
AFAIK @CacheEvict supports only removing a single entry (by key) or all entries in a given cache; there's no way to remove multiple entries at once. If you want to put, update or remove multiple objects from the cache (using annotations) and you are able to switch to memcached, take a look at my project Simple Spring Memcached (SSM).
Self invocations don't go through the proxy, so one solution is to switch to a mode other than proxy. Another solution (I'm not recommending it) may be keeping a reference to the service in the service itself (as an autowired field) and using it to invoke removeTaco.
Several months ago I had a similar issue in one of my projects. It didn't use Spring Cache but SSM, which also requires a proxy. To make it work I moved caching (annotations) from the service to the DAO (repositories) layer. It solved the problem with self invocation.
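The self-invocation limitation discussed in this thread can be seen with a plain JDK dynamic proxy. This is only a sketch of the call path, assuming an interface-based proxy; Spring's real proxies are more elaborate, and every name below (SelfInvocationDemo, TacoRepository, withEviction) is hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Plain-Java sketch of proxy-based eviction; not Spring code.
class SelfInvocationDemo {

    interface TacoRepository {
        void removeTaco(String tacoId);
        void removeTacos(Set<String> tacoIds);
    }

    static class SimpleTacoRepository implements TacoRepository {
        @Override
        public void removeTaco(String tacoId) {
            // Code to remove taco would go here.
        }

        @Override
        public void removeTacos(Set<String> tacoIds) {
            for (String tacoId : tacoIds) {
                removeTaco(tacoId); // call on 'this'; never reaches the proxy
            }
        }
    }

    // Wraps the target in a proxy that evicts a cache entry before removeTaco(..).
    static TacoRepository withEviction(TacoRepository target, Map<String, String> cache) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            if (method.getName().equals("removeTaco")) {
                cache.remove((String) methodArgs[0]);
            }
            return method.invoke(target, methodArgs);
        };
        return (TacoRepository) Proxy.newProxyInstance(
            TacoRepository.class.getClassLoader(),
            new Class<?>[] { TacoRepository.class },
            handler);
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        cache.put("t1", "carnitas");
        cache.put("t2", "al pastor");

        TacoRepository repository = withEviction(new SimpleTacoRepository(), cache);

        repository.removeTacos(new HashSet<>(Arrays.asList("t1", "t2")));
        System.out.println(cache.size()); // prints 2: nothing was evicted

        repository.removeTaco("t1");
        System.out.println(cache.size()); // prints 1: the proxy saw this call
    }
}
```

The handler only sees calls that enter through the proxy reference, which is why looping in a caller that holds the proxy (or calling back through the proxy) triggers eviction, while the internal loop does not.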

Which layer should I implement caching of lookup data from database in a DDD application?

I am designing a WCF service using DDD.
I have a domain service layer that calls repository to create domain objects. The repository is implemented using ADO.Net and not an ORM. The data comes from DB using Stored Procs. While creating an object say an Address the SP returns an id for state. The SP will not join address table with the state table. The state is represented by a value object class State that has id, abbr and name properties. The list of state objects can be cached (using system.runtime.caching.memorycache) when the application starts up as it is non-volatile data. In general I have a LookupDataRepository that can retrieve all such lookup data from tables. Now the AddressRepository has to populate the State property of address from the state id.
pseudo code:
class AddressRepository : IAddressRepository
{
    Address GetAddressById(int id)
    {
        // call sp and map from data reader
        Address addr = new Address(id);
        addr.Line = rdr.GetString(1);
        addr.State = // what to do ?, ideally LookupCache.GetState(rdr.GetInt32(2))
    }
}
class State
{
    public int Id;
    public string Abbr;
    public string Name;
    enum StateId {VIC, NSW, WA, SA};
    public static State Victoria = // what to do, ideally LookupCache.GetState(StateId.VIC)
}
// then somewhere in address domain model
if (currentState == State.Victoria)
{
    // specific logic for Victoria
}
My question is: in which layer should I put this cache? Service, Repository, or a separate assembly available across all layers?
Where to put Cache? It depends.
If your scenario is that you inject your IAddressRepository into several application services (I believe you call them Domain services), the outcome will be:
Caching at repository level means that all services will benefit (Pros).
Caching at repository level means that all services must use the cache (Cons).
Caching at service level will only cache for those clients/services that use that specific service and its methods (Pros/Cons?)
If you have your transaction management at the service layer, you'll need to be careful when applying caching at repository level. Sometimes a read operation may hit the cache instead, and the transaction cannot verify that the data you read and are supposed to write to hasn't been modified.
I would go for caching at the Service layer. It feels more natural and gives you more control over where and when you want to cache. Repository level is usually too low-level. The Service layer and its methods are closer to the use cases, and that is where you know when and what to cache.
I really recommend writing a cache wrapper like
public class CacheManager : ICacheManager
{
    public Address Address
    {
        get { }
        set { }
    }
}
That holds a static reference to System.Runtime.Caching.MemoryCache.Default.
It makes your caching type-safe, and casting is only done inside the wrapper. You can also unit test your services with a mocked ICacheManager injected.
A more advanced approach is to do this with Aspect Oriented Programming and decorators/interceptors. You have tons of good info here at StackOverFlow https://stackoverflow.com/search?q=AOP+caching
In my opinion and from my experience, would caching as a repository result in duplicate code, more complexity to the domain, and all events need to be cached? Since an IRepository interface would be implemented ...
So I opted for an infrastructure service, let's cache only what is needed in our app. It is also available to all other services that wish to consume it.

Resources