Serializing element into custom Hybris CacheRegion - caching

Info: using Hybris 5.7
I've been trying to write a custom CacheRegion in order to use Redis for this purpose (instead of the in-memory solutions provided, such as Maps or EHCache).
Following the instructions provided here didn't seem to be enough.
The point where I'm stuck is serializing the element object, which doesn't implement Serializable, so I can't serialize it to JSON or a byte array or anything else (I tried Jackson, Kryo, FST and the default Java serializer).
The code is something as follow (skipping other parts):
...
@Override
public Object getWithLoader(CacheKey cacheKey, CacheValueLoader cacheValueLoader) throws CacheValueLoadException {
    return Optional.ofNullable(get(cacheKey))
            .orElseGet(() -> {
                Object element = cacheValueLoader.load(cacheKey);
                // Can't get to serialize *element* to store it
                return element;
            });
}
...
While debugging I found out that the element Object is actually an instance of GenericBMPBean$GenericItemEntityStateCacheUnit, and it seems to contain a lot of things (except the actual data, which, oddly, I couldn't find).
Another thing I don't understand yet is the usage of CacheKey.CacheUnitValueType, which seems to be ignored by the EhCache implementation: even when it is NON_SERIALIZABLE, the unit is stored in EhCache.
So my question is: how should I manage to serialize this kind of data?
Bonus question: what is the intended usage of the CacheUnitValueType flag?
The main goal behind this is to decouple the cache from the application JVM and increase HA and scalability.
Thank you.

The point where I'm stuck is serializing the element object, which doesn't implement Serializable, so I can't serialize it.
An object which doesn't implement Serializable in hybris should not be serialized. You have to take care of what gets cached.
It's stated in the documentation:
It is not possible on entity and type system regions because of the non-serializable object nature.
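That said, if the goal is still to push what *can* be cached out to Redis, one pattern is to honor that flag in your custom region: serialize only the units not marked NON_SERIALIZABLE and keep the rest in a local in-memory fallback. A minimal sketch, assuming CacheKey exposes the value type via a getCacheValueType() accessor (verify against your 5.7 sources); the Redis and local-map helpers are hypothetical:
@Override
public Object getWithLoader(CacheKey cacheKey, CacheValueLoader cacheValueLoader) throws CacheValueLoadException {
    // Entity and type-system units are not serializable, so never ship them to Redis.
    if (cacheKey.getCacheValueType() == CacheKey.CacheUnitValueType.NON_SERIALIZABLE) {
        return getFromLocalMapWithLoader(cacheKey, cacheValueLoader); // hypothetical in-JVM fallback
    }
    Object cached = getFromRedis(cacheKey); // hypothetical Redis lookup
    if (cached != null) {
        return cached;
    }
    Object element = cacheValueLoader.load(cacheKey);
    putToRedis(cacheKey, element); // hypothetical: safe because the unit is not flagged NON_SERIALIZABLE
    return element;
}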

Related

Gemfire cache throwing PartitionedRegionException: Object hashCode inconsistent between peers

When using partitioned caching in GemFire, integrated with Spring Data via the @Cacheable annotation, the data is put in the cache properly, but when retrieving from the cache, if the key is on a different partition, it throws a PartitionedRegionException saying the hashCode is inconsistent between cache peers. I have overridden the equals and hashCode methods in the class whose objects are keys for the cache. Any idea where I could be going wrong? The two cache peers are on the same machine, and the locator is started externally.
I'm starting cache using the following method.
@Bean
@Primary
Cache getGemfireCache() {
    Cache cache = new CacheFactory().create();
    RegionFactory<Object, Object> regionFactory = cache.createRegionFactory(RegionShortcut.PARTITION);
    allCacheNames.forEach(regionFactory::create);
    return cache;
}
Any help would be appreciated.
Thanks!
Hmmm.
First, it is hard to tell exactly what problem you are experiencing, but I am nearly certain it has less to do with Spring Data, or technically Spring's Cache Abstraction (especially since you mention caching with the @Cacheable annotation), than with Pivotal GemFire itself, or more likely with your application domain model specifically.
Second, the problem you are experiencing has very little to do with your configuration shown above. Essentially, in your configuration you are creating a "peer" Cache instance along with Regions for each of the caches identified in the @Cacheable annotations declared on your application service methods, which is not particularly interesting in this case.
TIP: Regarding configuration, it would have been better to do this:
@SpringBootApplication
@EnableCachingDefinedRegions
public class MyCachingSpringBootApplication { ... }
See here, here and here for more information.
NOTE: SBDG creates a ClientCache instance by default, not a "peer" Cache instance. If you truly want your Spring application to contain an embedded peer Cache instance and be part of the server cluster, then you would additionally override SBDG's preference of auto-configuring a ClientCache instance by declaring the @PeerCacheApplication annotation. See here for more details.
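If the peer arrangement is intentional, a minimal sketch of that override could look like the following (the name and locators values are placeholders):
@SpringBootApplication
@EnableCachingDefinedRegions
@PeerCacheApplication(name = "MyCachingApplication", locators = "localhost[10334]")
public class MyPeerCachingSpringBootApplication { ... }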
Next, you mention that you "overrode" equals and hashCode, which suggests you are using some complex key. In general, it is better to stick with simple key types when using Pivotal GemFire, such as Long, Integer or String, for exactly the kinds of reasons you are experiencing.
A better option, if you need to influence your partitioning strategy or data organization across the cluster (e.g. perhaps for collocation), is to implement GemFire's PartitionResolver and register it with the PARTITION Region, as sketched below.
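A minimal sketch of such a resolver; the CustomerKey type and its getAccountNumber() accessor are hypothetical, and the imports are shown for Apache Geode (older GemFire releases use the com.gemstone.gemfire namespace instead):
import org.apache.geode.cache.EntryOperation;
import org.apache.geode.cache.PartitionAttributesFactory;
import org.apache.geode.cache.PartitionResolver;

public class AccountPartitionResolver implements PartitionResolver<CustomerKey, Account> {

    @Override
    public Object getRoutingObject(EntryOperation<CustomerKey, Account> opDetails) {
        // Route on a simple, stable field instead of hashing the whole complex key.
        return opDetails.getKey().getAccountNumber(); // hypothetical accessor
    }

    @Override
    public String getName() {
        return "AccountPartitionResolver";
    }

    @Override
    public void close() {
        // no resources to release
    }
}

// Registering the resolver when creating the PARTITION Region:
// PartitionAttributesFactory<CustomerKey, Account> paf = new PartitionAttributesFactory<>();
// paf.setPartitionResolver(new AccountPartitionResolver());
// regionFactory.setPartitionAttributes(paf.create());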
However, it is not uncommon for your cacheable service methods to look like the following:
@Cacheable("CustomersByAccount")
Account findBy(Customer customer) { ... }
As you may well know, the "key" to the @Cacheable "findBy" service method shown above is the Customer, which is clearly a complex object and must have valid equals and hashCode methods when used as a key in the GemFire cache Region backing the application cache "CustomersByAccount".
A few questions:
Is it possible that A) your complex key's class definition (e.g. Customer) changed, such as by adding/removing a field or by changing a field's type, and B) the PARTITION Region backing the cache (e.g. "CustomersByAccount") is persistent?
Are your equals and hashCode methods consistent? That is, do they declare and use the same fields to determine the result of equals and hashCode?
For example, this would not be valid:
class Customer {

    private Long id;
    private String firstName;
    private String lastName;

    ...

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Customer)) {
            return false;
        }
        Customer that = (Customer) obj;
        // equals is based on id only...
        return this.id.equals(that.id);
    }

    @Override
    public int hashCode() {
        // ...but hashCode is based on firstName/lastName: inconsistent!
        int hashValue = 17;
        hashValue = 37 * hashValue + this.firstName.hashCode();
        hashValue = 37 * hashValue + this.lastName.hashCode();
        return hashValue;
    }

    ...
}
Or any other combination where equals/hashCode could potentially yield a different result depending on state previously stored in GemFire.
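By contrast, a consistent pair keys both methods off the same identity field, e.g.:
@Override
public boolean equals(Object obj) {
    if (this == obj) {
        return true;
    }
    if (!(obj instanceof Customer)) {
        return false;
    }
    Customer that = (Customer) obj;
    return this.id.equals(that.id);
}

@Override
public int hashCode() {
    // Derived from the same id field used by equals, so the two always agree.
    return 37 * 17 + this.id.hashCode();
}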
You might also try clearing the cache and rehydrating (eagerly or lazily as necessary), particularly if your class definitions have changed and especially if some of those class types are used as keys.
Also, in general, I would recommend immutable keys as much as possible if it is not possible to strictly stick to simple/scalar types (e.g. like Long or String).
Perhaps, if you could share a few more details on your application domain model classes, such as the types used as keys, along with your use of Spring's Cache Abstraction on your service methods, that might help.
Also, any examples or test cases reproducing the problem are greatly appreciated.
Thanks!

Spring Caching not working for findAll method

I have recently started working on caching the result of a method. I am using @Cacheable and @CachePut to implement the desired functionality.
But somehow, the save operation is not updating the cache for the findAll method. Below is the code snippet:
@RestController
@RequestMapping(path = "/test/v1")
@CacheConfig(cacheNames = "persons")
public class CacheDemoController {

    @Autowired
    private PersonRepository personRepository;

    @Cacheable
    @RequestMapping(method = RequestMethod.GET, path = "/persons/{id}")
    public Person getPerson(@PathVariable(name = "id") long id) {
        return this.personRepository.findById(id);
    }

    @Cacheable
    @RequestMapping(method = RequestMethod.GET, path = "/persons")
    public List<Person> findAll() {
        return this.personRepository.findAll();
    }

    @CachePut
    @RequestMapping(method = RequestMethod.POST, path = "/save")
    public Person savePerson(@RequestBody Person person) {
        return this.personRepository.save(person);
    }
}
For the very first call to the findAll method it stores the result in the "persons" cache, and for all subsequent calls it returns the same result, even if a save() operation has been performed in between.
I am pretty new to caching so any advice on this would be of great help.
Thanks!
So, a few things come to mind regarding your use case and looking at your code above.
First, I am not a fan of users enabling caching in either the UI or Data tier of the application, though it makes more sense in the Data tier (e.g. DAOs or Repos). Caching, like Transaction Management, Security, etc., is a service-level concern and therefore belongs in the Service tier IMO, where your application consists of: [Web|Mobile|CLI]+ UI -> Service -> DAO (a.k.a. Repo). The advantage of enabling caching in the Service tier is that it is more reusable across your application/system architecture. Think of servicing Mobile app clients in addition to Web, for instance. The Controllers for your Web tier may not necessarily be the same as those handling Mobile app clients.
I encourage you to read the chapter in the core Spring Framework's Reference Documentation on Spring's Cache Abstraction. FYI, Spring's Cache Abstraction, like TX management, is deeply rooted in Spring's AOP support. However, for your purposes here, let's break your Spring Web MVC Controller (i.e. CacheDemoController) down a bit as to what is happening.
So, you have a findAll() method that you are caching the results for.
WARNING: Also, I don't generally recommend caching the results of a Repository.findAll() call, especially in production! While this might work just fine locally given a limited data set, the CrudRepository.findAll() method by default returns all results for that particular object/data type (e.g. Person) in the backing data store (e.g. the Person table in an RDBMS), unless you are employing paging or some LIMIT on the result set returned. When it comes to caching, think high reuse of relatively infrequently changing data; those are the good candidates for caching.
Given your Controller's findAll() method has NO method parameters, Spring is going to determine a "default" key to use to cache the findAll() method's return value (i.e. the List<Person>).
TIP: see Spring's docs on "Default Key Generation" for more details.
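To make the default concrete, here is a small sketch using Spring's out-of-the-box SimpleKeyGenerator (the class and variable names are mine):
import org.springframework.cache.interceptor.SimpleKeyGenerator;

public class DefaultKeyDemo {

    public static void main(String[] args) {
        // No method arguments -> the constant SimpleKey.EMPTY is the key;
        // this is what findAll() gets (depicted as "abc123" below).
        Object keyForFindAll = SimpleKeyGenerator.generateKey();

        // One argument -> the argument itself is the key, e.g. the Person's ID.
        Object keyForGetPerson = SimpleKeyGenerator.generateKey(1L);

        // Several arguments -> a composite SimpleKey wrapping all of them.
        Object compositeKey = SimpleKeyGenerator.generateKey(1L, "Jon Doe");
    }
}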
NOTE: In Spring, as with caching in general, Key/Value stores (like java.util.Map) are the primary implementations of Spring's notion of a Cache. However, not all "caching providers" are equal (e.g. Redis vs. a java.util.concurrent.ConcurrentHashMap, for instance).
After calling the findAll() Controller method, your cache will have...
KEY | VALUE
------------------------
abc123 | List of People
NOTE: the cache will not store each Person in the list individually as a separate cache entry. That is not how method-level caching works in Spring's Cache Abstraction, at least not by default. However, it is possible.
Then, suppose your Controller's cacheable getPerson(id:long) method is called next. Well, this method includes a parameter, the Person's ID. The argument to this parameter will be used as the key when Spring's Cache Abstraction intercepts the getPerson(..) call and attempts to find an existing value in the cache. For example, say the method is called with controller.getPerson(1). A cache entry with key 1 does not exist in the cache, even though Person 1 is in the list mapped to key abc123; Spring is not going to look inside that list and return Person 1 from it, so this operation results in a cache miss. When the method returns, the value (the Person with ID 1) is cached. The cache now looks like this...
KEY | VALUE
------------------------
abc123 | List of People
1 | Person(1)
Finally, a user invokes the Controller's savePerson(:Person) method. Again, the savePerson(:Person) Controller method's parameter value is used as the key (i.e. a "Person" object). Let's say the method is called as controller.savePerson(person(1)). Well, the CachePut happens when the method returns, but the existing cache entry for Person 1 is not updated, since this "key" is different; instead a new cache entry is created, and your cache again looks like this...
KEY | VALUE
---------------------------
abc123 | List of People
1 | Person(1)
Person(1) | Person(1)
None of which is probably what you wanted nor intended to happen.
So, how do you fix this? Well, as I mentioned in the WARNING above, you probably should not be caching an entire collection of values returned from an operation. And even if you do, you need to extend Spring's caching infrastructure beyond what it offers OOTB to handle Collection return types, breaking the elements of the Collection up into individual cache entries based on some key. That is considerably more involved.
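A rough sketch of the hand-rolled variant of that idea, using Spring's CacheManager API directly rather than annotations (it assumes Person exposes a getId() accessor):
import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.cache.Cache;
import org.springframework.cache.CacheManager;

public class PersonService {

    @Autowired
    private PersonRepository personRepository;

    @Autowired
    private CacheManager cacheManager;

    public List<Person> findAll() {
        List<Person> people = this.personRepository.findAll();
        Cache cache = this.cacheManager.getCache("persons");
        // Break the Collection up into one cache entry per Person, keyed on the
        // Person's ID so the entries line up with getPerson(id)/savePerson(:Person).
        people.forEach(person -> cache.put(person.getId(), person));
        return people;
    }
}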
You can, however, add better coordination between the getPerson(id:long) and savePerson(:Person) Controller methods. Basically, you need to be a bit more specific about the key used by the savePerson(:Person) method. Fortunately, Spring allows you to specify the key, either by providing a custom KeyGenerator implementation or simply by using SpEL. Again, see the docs for more details.
So your example could be modified like so...
@CachePut(key = "#result.id")
@RequestMapping(method = RequestMethod.POST, path = "/save")
public Person savePerson(@RequestBody Person person) {
    return this.personRepository.save(person);
}
Notice the @CachePut annotation with the key attribute containing the SpEL expression. In this case, I indicated that the cache "key" for this Controller savePerson(:Person) method should be the return value's (i.e. "#result") or Person object's ID, thereby matching the Controller getPerson(id:long) method's key, which will then update the single cache entry for the Person keyed on the Person's ID...
KEY | VALUE
---------------------------
abc123 | List of People
1 | Person(1)
Still, this won't handle the findAll() method, but it works for getPerson(id) and savePerson(:Person). Again, see my answers to the posting(s) on Collection return types in Spring's caching infrastructure and how to handle them properly. But be careful! Caching an entire Collection of values as individual cache entries could wreak havoc on your application's memory footprint, resulting in OOMEs. You definitely need to "tune" the underlying caching provider in this case (eviction, expiration, compression, etc.) before putting a large number of entries in the cache, particularly at the UI tier, where literally thousands of requests may be happening simultaneously; then "concurrency" becomes a factor too! See Spring's docs on sync capabilities.
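For completeness, a sketch of that synchronized loading (the sync attribute on @Cacheable has been available since Spring Framework 4.3):
// Only one thread computes/loads the value for a given id;
// concurrent callers block until the entry is cached.
@Cacheable(cacheNames = "persons", key = "#id", sync = true)
public Person getPerson(long id) {
    return this.personRepository.findById(id);
}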
Anyway, hope this helps aid your understanding of caching, with Spring in particular, as well as caching in general.
Cheers,
-John

Setting a global map in Grails

I am building a Grails web app. I created a map in BootStrap and placed it in the servletContext in order to make it available to my application from anywhere. On average this map should hold about 1000 entries with String keys and Date values.
I was wondering if that can impact my application's performance and whether there is a better place to keep this map. I want this map to work as a caching mechanism: I want to put a unique key and a date in it, and be able to retrieve that Date object from anywhere, such as within a controller or service class, by passing the key. I was thinking of using a caching mechanism to do that but haven't found one that works this way. I'd appreciate it if anyone can suggest a Grails plugin that can achieve this.
P.S.: Is it possible to do this with the Cache Plugin: http://grails-plugins.github.io/grails-cache/docs/manual/guide/usage.html#annotations ?
You could use a Service for this task. A Service is a singleton, so it will be alive all the time, and it's much easier to access from other parts of the app. To prepare data on application startup, you can implement InitializingBean.
For example:
class MyCacheService implements InitializingBean {

    Map cache

    void afterPropertiesSet() {
        cache = [
            a: 1,
            b: 2,
            // .....
        ]
    }
}
To make the Map cache thread-safe, you can use a ConcurrentReaderHashMap, which allows mostly-concurrent reads but exclusive writes. That way everyone can read it from the service, but not everyone can write to it or modify it at the same time.
It is also possible to synchronize the mutating methods, such as addToCache, so that no two controllers can write at the same time, while getFromCache doesn't need that.
Sample code for a ConcurrentReaderHashMap-style cache service:
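This is a rough illustration of the pattern (a sketch; java.util.concurrent.ConcurrentHashMap stands in here for the older ConcurrentReaderHashMap, and the method names are mine):
import java.util.Date;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MyCacheService {

    private final Map<String, Date> cache = new ConcurrentHashMap<>();

    // Reads are concurrent and lock-free.
    public Date getFromCache(String key) {
        return cache.get(key);
    }

    // Writes are exclusive, so no two controllers modify the map at once.
    public synchronized void addToCache(String key, Date value) {
        cache.put(key, value);
    }
}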

How to handle a large set of data using Spring Data Repositories?

I have a large table that I'd like to access via a Spring Data Repository.
Currently, I'm trying to extend the PagingAndSortingRepository interface, but it seems I can only define methods that return lists, e.g.:
public interface MyRepository extends PagingAndSortingRepository<MyEntity, Integer> {

    @Query(value = "SELECT * ...")
    List<MyEntity> myQuery(Pageable p);
}
On the other hand, the findAll() method that comes with PagingAndSortingRepository returns an Iterable (and I suppose that the data is not loaded into memory).
Is it possible to define custom queries that also return Iterable and/or don't load all the data into memory at once?
Are there any alternatives for handling large tables?
We have the classical consulting answer here: it depends. As the implementation of the method is store-specific, we depend on the underlying store API. In the case of JPA there's no way to provide streaming access, as ….getResultList() returns a List. Hence we also expose the List to the client, as JPA developers especially might be used to working with lists. So for JPA the only option is using the pagination API.
For a store like Neo4j we support streaming access, as the repositories return Iterable from CRUD methods as well as from the execution of finder methods.
The implementation of findAll() simply loads the entire list of all entities into memory. Its Iterable return type doesn't imply that it implements some sort of database level cursor handling.
On the other hand your custom myQuery(Pageable) method will only load one page worth of entities, because the generated implementation honours its Pageable parameter. You can declare its return type either as Page or List. In the latter case you still receive the same (restricted) number of entities, but not the metadata that a Page would additionally carry.
So you basically did the right thing to avoid loading all entities into memory in your custom query.
Please review the related documentation here.
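For illustration, a usage sketch of the pagination approach against the repository above (PageRequest.of(..) assumes Spring Data 2.x; on older versions the equivalent is new PageRequest(0, 100)):
import java.util.List;
import org.springframework.data.domain.PageRequest;
import org.springframework.data.domain.Pageable;

public void processInPages(MyRepository myRepository) {
    // Walk the table one page (100 rows here) at a time rather than loading it all.
    Pageable pageRequest = PageRequest.of(0, 100);
    List<MyEntity> page = myRepository.myQuery(pageRequest);
    while (!page.isEmpty()) {
        // ... process the current page ...
        pageRequest = pageRequest.next();
        page = myRepository.myQuery(pageRequest);
    }
}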
I think what you are looking for is Spring Data JPA's Stream support. It brings a significant performance boost to data fetching, particularly in databases with millions of records. In your case you have several options to consider:
Pull all data once in memory
Use pagination and read pages each time
Use something like Apache Spark
Streaming data using Spring Data JPA
In order to make the Spring Data JPA Stream work, we need to modify MyRepository to return Stream<MyEntity>, like this:
// HINT_CACHEABLE and READ_ONLY are assumed to be statically imported, e.g. from
// org.hibernate.jpa.QueryHints and org.hibernate.annotations.QueryHints respectively.
public interface MyRepository extends PagingAndSortingRepository<MyEntity, Integer> {

    @QueryHints(value = {
            @QueryHint(name = HINT_CACHEABLE, value = "false"),
            @QueryHint(name = READ_ONLY, value = "true")
    })
    @Query(value = "SELECT * ...")
    Stream<MyEntity> myQuery();
}
In this example, we disable the second-level cache and hint to Hibernate that the entities will be read-only. If your requirements are different, make sure to adjust those settings accordingly.
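One operational caveat worth sketching (standard JPA streaming semantics): the Stream must be consumed inside a transaction and closed when done, e.g. via try-with-resources:
import java.util.stream.Stream;
import org.springframework.transaction.annotation.Transactional;

@Transactional(readOnly = true)
public void processAll() {
    // The underlying cursor stays open only for the life of the transaction.
    try (Stream<MyEntity> entities = myRepository.myQuery()) {
        entities.forEach(entity -> {
            // ... process one row at a time without materializing the whole table ...
        });
    }
}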

Is it sometimes okay to use service locator pattern in a domain class?

This question may be more appropriate for the Programmers stack. If so, I will move it. However, I think I may get more answers here.
So far, all interface dependencies in my domain are resolved using DI from the executing assembly, which for now is a .NET MVC3 project (+ Unity IoC container). However, I've run across a scenario where I think a service locator may be a better choice.
There is an entity in the domain that stores (caches) content from a URL. Specifically, it stores SAML2 EntityDescriptor XML from a metadata URL. I have an interface IConsumeHttp with a single method:
public interface IConsumeHttp
{
    string Get(string url);
}
The current implementation uses the static WebRequest class in System.Net:
public class WebRequestHttpConsumer : IConsumeHttp
{
    public string Get(string url)
    {
        string content = null;
        var request = WebRequest.Create(url);
        var response = request.GetResponse();
        var stream = response.GetResponseStream();
        if (stream != null)
        {
            var reader = new StreamReader(stream);
            content = reader.ReadToEnd();
            reader.Close();
            stream.Close();
        }
        response.Close();
        return content;
    }
}
The entity which caches the XML content exists as a non-root in a much larger entity aggregate. For the rest of the aggregate, I am implementing a somewhat large Facade pattern, which is the public endpoint for the MVC controllers. I could inject the IConsumeHttp dependency in the facade constructor like so:
public AnAggregateFacade(IDataContext dataContext, IConsumeHttp httpClient)
{
...
The issue I see with this is that only one method in the facade has a dependency on this interface, so it seems silly to inject it for the whole facade. Object creation of the WebRequestHttpConsumer class shouldn't add a lot of overhead, but the domain is unaware of this.
I am instead considering moving all of the caching logic for the entity out into a separate static factory class. Still, the code will depend on IConsumeHttp. So I'm thinking of using a static service locator within the static factory method to resolve IConsumeHttp, but only when the cached XML needs to be initialized or refreshed.
My question: Is this a bad idea? It does seem to me that it should be the domain's responsibility to make sure the XML metadata is appropriately cached. The domain does this periodically as part of other related operations (such as getting metadata for SAML Authn requests & responses, updating the SAML EntityID or Metadata URL, etc). Or am I just worrying about it too much?
It does seem to me that it should be the domain's responsibility to make sure the XML metadata is appropriately cached
I'm not sure about that, unless your domain is really about metadata manipulation, http requests and so on. For a "normal" application with a non-technical domain, I'd rather deal with caching concerns in the Infrastructure/Technical Services layer.
The issue I see with this is that only one method in the facade has a dependency on this interface, so it seems silly to inject it for the whole facade
Obviously, Facades usually don't lend themselves very well to constructor injection, since they naturally tend to point to many dependencies. You could consider other types of injection or, as you pointed out, using a locator. But what I'd personally do is ask myself whether a Facade is really appropriate, and consider using finer-grained objects instead of the same large interface in all of my controllers. This would allow for more modularity and ad-hoc injection rather than inflating a massive object upfront.
But that may just be because I'm not a big Facade fan ;)
In your comment to @ian31, you mention "It seems like making the controller ensure the domain has the correct XML is too granular, giving the client too much responsibility". For this reason, I'd prefer that the controller ask its service/repository (which can implement the caching layer) for the correct & current XML. To me, this responsibility is a lot to ask of the domain entity.
However, if you're OK with the responsibilities you've outlined, and you mention the object creation isn't much overhead, I think leaving the IConsumeHttp in the entity is fine.
Sticking with this responsibility, another approach could be to move this interface down into a child entity. If that is possible in your case, at least the dependency is confined to the scenario that requires it.
