Do caches in ehcache.xml inherit from defaultCache? - ehcache

If I have the following configuration:
<defaultCache timeToIdleSeconds="120"
              timeToLiveSeconds="120" />
<cache name="test"
       timeToLiveSeconds="300" />
What will be the value of timeToIdleSeconds for the cache test? Will it be inherited from the default cache, and thus be equal to 120, or will it take the default value as given in the manual, which is 0 (infinity)?

The timeToIdleSeconds will take the default value (0, i.e. infinite) and will not be inherited from "defaultCache". The name "defaultCache" is a bit of a misnomer: it does not provide defaults for every other cache, it is just a way of specifying the configuration for caches that are added dynamically, using cacheManager.addCache(String cacheName).
From http://www.ehcache.org/ehcache.xml, the documentation for that tag reads:
Default Cache configuration.
These settings will be applied to caches created programmatically using
CacheManager.add(String cacheName). This element is optional, and using
CacheManager.add(String cacheName) when its not present will throw CacheException
The defaultCache has an implicit name "default" which is a reserved cache name.

private Ehcache cloneDefaultCache(final String cacheName) {
    if (defaultCache == null) {
        return null;
    }
    Ehcache cache;
    try {
        cache = (Ehcache) defaultCache.clone();
    } catch (CloneNotSupportedException e) {
        throw new CacheException("Failure cloning default cache. Initial cause was " + e.getMessage(), e);
    }
    if (cache != null) {
        cache.setName(cacheName);
    }
    return cache;
}
Method cloneDefaultCache(String)
Found usages (2 usages found)
    Library (2 usages found)
        Unclassified usage (2 usages found)
            Maven: net.sf.ehcache:ehcache-core:2.6.11 (2 usages found)
                net.sf.ehcache (2 usages found)
                    CacheManager (2 usages found)
                        addCache(String) (1 usage found)
                            1173 Ehcache clonedDefaultCache = cloneDefaultCache(cacheName);
                        addCacheIfAbsent(String) (1 usage found)
                            1857 Ehcache clonedDefaultCache = cloneDefaultCache(cacheName);
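So, to get a timeToIdleSeconds of 120 on the test cache, it has to be set explicitly on that cache element. A minimal illustrative ehcache.xml (the attribute values are taken from the question; the layout is just a sketch):

```xml
<ehcache>
    <!-- Applies only to caches added via cacheManager.addCache(String cacheName) -->
    <defaultCache timeToIdleSeconds="120"
                  timeToLiveSeconds="120" />

    <!-- Explicitly declared caches inherit nothing from defaultCache:
         timeToIdleSeconds must be repeated here, or it falls back to 0 (infinite) -->
    <cache name="test"
           timeToLiveSeconds="300"
           timeToIdleSeconds="120" />
</ehcache>
```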

Related

JCache Hazelcast embedded does not scale

Hello, Stackoverflow Community.
I have a Spring Boot application that uses Jcache with Hazelcast implementation as a cache Framework.
Each Hazelcast node has 5 caches with the size of 50000 elements each. There are 4 Hazelcast Instances that form a cluster.
The problem that I face is the following:
I have a very heavy call that reads data from all four caches. On the initial start, when all caches are yet empty, this call takes up to 600 seconds.
When there is one Hazelcast instance running and all 5 caches are filled with data, then this call happens relatively fast, it takes on average only 4 seconds.
When I start 2 Hazelcast instances and they form a cluster, then the response time gets worse, and the same call takes already 25 seconds on average.
And the more Hazelcast instances I add to the cluster, the longer the response time gets. Of course, I was expecting somewhat worse response times once the data is partitioned among Hazelcast nodes in a cluster. But I did not expect that just adding one more Hazelcast instance would make the response 6 - 7 times slower...
Please note that, for simplicity and testing purposes, I just start four Spring Boot instances, each with an embedded Hazelcast node, on one machine. Therefore, such poor performance cannot be explained by network delays. I assume that this API call is so slow even with Hazelcast because much data needs to be serialized/deserialized when sent between Hazelcast cluster nodes. Please correct me if I am wrong.
The cache data is partitioned evenly among all nodes. I was thinking about adding near cache in order to reduce latency, however, according to the Hazelcast Documentation, the near cache is not available for Jcache Members. In my case, because of some project requirements, I am not able to switch to Jcache Clients to make use of Near Cache. Is there maybe some advice on how to reduce latency in such a scenario?
Thank you in advance.
DUMMY CODE SAMPLES TO DEMONSTRATE THE PROBLEM:
Hazelcast Config: stays default, nothing is changed
Caches:
private void createCaches() {
    CacheConfiguration<?, ?> cacheConfig = new CacheConfig<>()
            .setEvictionConfig(
                    new EvictionConfig()
                            .setEvictionPolicy(EvictionPolicy.LRU)
                            .setSize(150000)
                            .setMaxSizePolicy(MaxSizePolicy.ENTRY_COUNT)
            )
            .setBackupCount(5)
            .setInMemoryFormat(InMemoryFormat.OBJECT)
            .setManagementEnabled(true)
            .setStatisticsEnabled(true);
    cacheManager.createCache("books", cacheConfig);
    cacheManager.createCache("bottles", cacheConfig);
    cacheManager.createCache("chairs", cacheConfig);
    cacheManager.createCache("tables", cacheConfig);
    cacheManager.createCache("windows", cacheConfig);
}
Dummy Controller:
@GetMapping("/dummy_call")
public String getExampleObjects() { // simulates a situation where one call needs to fetch data from multiple cached sources
    Instant start = Instant.now();
    int i = 0;
    while (i != 50000) {
        exampleService.getBook(i);
        exampleService.getBottle(i);
        exampleService.getChair(i);
        exampleService.getTable(i);
        exampleService.getWindow(i);
        i++;
    }
    Instant end = Instant.now();
    return String.format("The heavy call took: %d seconds", Duration.between(start, end).getSeconds());
}
Dummy service:
@Service
public class ExampleService {

    @CacheResult(cacheName = "books")
    public Book getBook(int i) {
        try {
            Thread.sleep(1); // just to simulate a slow service here!
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return new Book(Integer.toString(i), Integer.toString(i));
    }

    @CacheResult(cacheName = "bottles")
    public Bottle getBottle(int i) {
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return new Bottle(Integer.toString(i), Integer.toString(i));
    }

    @CacheResult(cacheName = "chairs")
    public Chair getChair(int i) {
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return new Chair(Integer.toString(i), Integer.toString(i));
    }

    @CacheResult(cacheName = "tables")
    public Table getTable(int i) {
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return new Table(Integer.toString(i), Integer.toString(i));
    }

    @CacheResult(cacheName = "windows")
    public Window getWindow(int i) {
        try {
            Thread.sleep(1);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        return new Window(Integer.toString(i), Integer.toString(i));
    }
}
If you do the math:
4 s / 250 000 lookups is 0.016 ms per local lookup. This seems rather high, but let's take it.
When you add a single node, the data gets partitioned and half of the requests will be served from the other node. If you add 2 more nodes (4 in total), then 25 % of the requests will be served locally and 75 % will be served over the network. This explains why the response time grows when you add more nodes.
Even a simple ping on localhost takes twice that or more. On a real network, the read latency we see in benchmarks is 0.3-0.4 ms per read call. This makes:
0.25 * 250 000 * 0.016 ms + 0.75 * 250 000 * 0.3 ms = ~57 s
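As a sanity check on the arithmetic above, a small stand-alone calculation (the 0.016 ms local and 0.3 ms remote figures are this answer's estimates, not measurements):

```java
public class LatencyEstimate {

    /**
     * Estimated total time, in seconds, for serial cache lookups when a
     * fraction of them is served locally and the rest over the network.
     */
    static double estimateSeconds(long lookups, double localFraction,
                                  double localMs, double remoteMs) {
        double totalMs = lookups * (localFraction * localMs
                + (1.0 - localFraction) * remoteMs);
        return totalMs / 1000.0;
    }

    public static void main(String[] args) {
        // 5 caches x 50 000 keys = 250 000 lookups on a 4-node cluster:
        // 25 % local at 0.016 ms, 75 % remote at 0.3 ms
        double seconds = estimateSeconds(250_000, 0.25, 0.016, 0.3);
        System.out.printf("estimated: ~%.0f s%n", seconds); // ~57 s
    }
}
```

With a single node (100 % local) the same formula gives the observed 4 seconds, so the model is at least consistent with both measurements.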
You simply won't be able to make this many calls serially over the network (even a local one). You need to either:
parallelize the calls, or use javax.cache.Cache#getAll to reduce the number of calls
try enabling reading of local backups via com.hazelcast.config.MapConfig#setReadBackupData, so that fewer requests go over the network
The read backup data feature is only available for IMap, so you would need to use Spring caching with hazelcast-spring module and its com.hazelcast.spring.cache.HazelcastCacheManager:
@Bean
HazelcastCacheManager cacheManager(HazelcastInstance hazelcastInstance) {
    return new HazelcastCacheManager(hazelcastInstance);
}
See documentation for more details.

How to get spring cache size in spring boot?

I have two methods as follows:
@Cacheable(cacheNames = "foos")
public List<FooDto> getAllFoos() {
    return this.fooRepository.findAll().stream()
            .map(FooEntityDomainToDtoMapper::mapDomainToDto) // mapping entity to dto
            .collect(Collectors.toList());
}

@Cacheable(cacheNames = "foos", key = "#fooDto.id")
public FooDto getFooById(FooDto fooDto) {
    return this.fooRepository.findById(fooDto.getId())
            .map(FooEntityDomainToDtoMapper::mapDomainToDto) // mapping entity to dto
            .orElse(null);
}
First, getAllFoos() will be called during system startup, and the second will be called after startup when a user requests an object by a particular id. I wanted to know whether the second method will occupy any separate cache space, or whether it will simply add keys to the cache populated by the first method. I want to confirm that even if I comment out the second method, getFooById(), the size of the cache will stay the same. Is there any way to get the size of the cache?
P.S: we are not using any implementation of cache, just using spring-boot-starter-cache
Although there is no direct method through which you can see the cache data and get its size, you can do it using reflection. Here is what I've done:
public Object getAllCache() {
    // Autowire your CacheManager first, then look the cache up through it
    ConcurrentMapCache cache = (ConcurrentMapCache) manager.getCache("yourCachename");
    Object store1 = null;
    try {
        // "store" is the private ConcurrentMap backing ConcurrentMapCache
        Field store = ConcurrentMapCache.class.getDeclaredField("store");
        store.setAccessible(true);
        store1 = store.get(cache);
    } catch (NoSuchFieldException | SecurityException | IllegalArgumentException | IllegalAccessException e) {
        e.printStackTrace();
    }
    return store1 != null ? store1.toString() : null;
}
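The same reflection trick can be exercised without Spring on the classpath. A minimal stand-alone sketch, where FakeCache and its private store field are invented here to mirror ConcurrentMapCache's layout:

```java
import java.lang.reflect.Field;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ReflectCacheSize {

    /** Stand-in for ConcurrentMapCache: the real class also keeps its
     *  entries in a private ConcurrentMap field named "store". */
    static class FakeCache {
        private final ConcurrentMap<Object, Object> store = new ConcurrentHashMap<>();
        void put(Object k, Object v) { store.put(k, v); }
    }

    /** Reads the private "store" field reflectively and returns its size. */
    static int sizeOf(FakeCache cache) {
        try {
            Field store = FakeCache.class.getDeclaredField("store");
            store.setAccessible(true);
            return ((ConcurrentMap<?, ?>) store.get(cache)).size();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FakeCache cache = new FakeCache();
        cache.put("a", 1);
        cache.put("b", 2);
        System.out.println(sizeOf(cache)); // prints 2
    }
}
```

For the real ConcurrentMapCache you may not even need reflection: it exposes a public getNativeCache() method returning the underlying ConcurrentMap, whose size() gives the entry count.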

Performance in microservice-to-microservice data transfer

I have controller like this:
@RestController
@RequestMapping("/stats")
public class StatisticsController {

    @Autowired
    private LeadFeignClient lfc;

    private List<Lead> list;

    @GetMapping("/leads")
    private int getCount(@RequestParam(value = "count", defaultValue = "1") int countType) {
        list = lfc.getLeads(AccessToken.getToken());
        if (countType == 1) {
            return MainEngine.getCount(list);
        } else if (countType == 2) {
            return MainEngine.getCountRejected(list);
        } else if (countType == 3) {
            return MainEngine.getCountPortfolio(list);
        } else if (countType == 4) {
            return MainEngine.getCountInProgress(list);
        } else if (countType == 5) {
            return MainEngine.getCountForgotten(list);
        } else if (countType == 6) {
            return MainEngine.getCountAddedInThisMonth(list);
        } else if (countType == 7) {
            return MainEngine.getCountAddedInThisYear(list);
        } else {
            throw new RuntimeException("Wrong mapping param");
        }
    }

    @GetMapping("/trends")
    private boolean getTrend() {
        return MainEngine.tendencyRising(list);
    }
}
It is basically a microservice that computes statistics based on a list of 'business leads'. The FeignClient GETs a list of leads trimmed to the required data. Everything is working properly.
My only concern is performance: all of these statistics (countTypes) are going to be presented on the landing page of the webapp. If I call them one by one, will every call retrieve the lead list again and again, or will the list be stored somewhere in temporary memory? I can imagine that as the list grows, it could take a while to load.
I've tried to populate the list outside this method with @PostConstruct at service startup, but this solution has two major problems: authentication cannot be handled by the OAuth token, and the retrieved list is insensitive to adding/deleting leads, since it is loaded only at startup.
The list = lfc.getLeads(AccessToken.getToken()); call will be executed on every GET request. Take a look at caching the responses, which is useful when you need to obtain a large volume of data often.
I'd start here: Baeldung's Spring cache tutorial, which gives you an idea of the caching abstraction. Then you can take a look at the EhCache implementation, or implement your own interceptor putting/getting entries to/from external storage such as Redis.
Caching is the only way I see to resolve this: since the Feign client is called with a different request (based on the token), the data is not static and needs to be cached.
You need to implement a caching layer to improve performance. What you can do is have the cache preloaded immediately after the application starts; this way the response will already be in the cache. I would suggest going with a Redis cache, but any cache will do the job.
Also, it would be better to move the logic of getCount() into a service class.
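If pulling in Spring's cache abstraction feels like too much for a first step, the idea can be sketched with a tiny TTL memoizer in plain Java (TtlCache and all names here are invented for illustration, not part of any framework):

```java
import java.util.function.Supplier;

/** Caches a loader's result for ttlMillis, then reloads on the next access. */
public class TtlCache<V> {
    private final Supplier<V> loader;
    private final long ttlMillis;
    private V value;
    private long loadedAt = Long.MIN_VALUE;

    public TtlCache(Supplier<V> loader, long ttlMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
    }

    public synchronized V get() {
        long now = System.currentTimeMillis();
        if (value == null || now - loadedAt > ttlMillis) {
            value = loader.get();   // e.g. lfc.getLeads(AccessToken.getToken())
            loadedAt = now;
        }
        return value;
    }

    public static void main(String[] args) {
        TtlCache<String> leads = new TtlCache<>(() -> "fresh lead list", 30_000);
        System.out.println(leads.get()); // loads once, then cached for 30 s
    }
}
```

With the lead list wrapped like this, the seven countType statistics computed for one landing-page render would share a single Feign call instead of seven, at the cost of results being up to ttlMillis stale.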

fs.hdfs.impl.disable.cache causes SparkSQL slowness

This question is related to Hive/Hadoop intermittent failure: Unable to move source to destination
We found that we could avoid the problem of "Unable to move source ... Filesystem closed" by setting fs.hdfs.impl.disable.cache to true.
However, we also observed that the SparkSQL queries became very slow -- queries that used to finish within a few seconds now take more than 30 to 40 seconds to finish (even when the query is very simple, like reading a tiny table).
Is this normal?
My understanding of fs.hdfs.impl.disable.cache being true is that FileSystem#get() will always createFileSystem() instead of returning a cached FileSystem. This setting prevents a FileSystem object from being shared by multiple clients, and it really makes sense, because it prevents, for example, two callers of FileSystem#get() from closing each other's filesystem.
(For example, see this discussion )
This setting would slow things down, but probably not by so much.
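For reference, the flag is per-scheme (the property name is built from the URI scheme, as the source below shows), so disabling the HDFS cache looks like this in core-site.xml:

```xml
<!-- core-site.xml: every FileSystem.get() for hdfs:// URIs now builds a
     fresh FileSystem instead of returning the shared cached instance -->
<property>
    <name>fs.hdfs.impl.disable.cache</name>
    <value>true</value>
</property>
```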
From: hadoop-source-reading
/**
 * Returns the FileSystem for this URI's scheme and authority. The scheme of
 * the URI determines a configuration property name,
 * <tt>fs.<i>scheme</i>.class</tt> whose value names the FileSystem class.
 * The entire URI is passed to the FileSystem instance's initialize method.
 */
public static FileSystem get(URI uri, Configuration conf) throws IOException {
    String scheme = uri.getScheme();
    String authority = uri.getAuthority();

    if (scheme == null) { // no scheme: use default FS
        return get(conf);
    }

    if (authority == null) { // no authority: use default FS if scheme matches default
        URI defaultUri = getDefaultUri(conf);
        if (scheme.equals(defaultUri.getScheme())
                && defaultUri.getAuthority() != null) {
            return get(defaultUri, conf); // return default
        }
    }

    String disableCacheName = String.format("fs.%s.impl.disable.cache", scheme);
    if (conf.getBoolean(disableCacheName, false)) {
        return createFileSystem(uri, conf);
    }
    return CACHE.get(uri, conf);
}
Would the slowness point to some other networking issue, such as resolving domain names? Any insights into this problem are welcome.

Azure cache write implementation approaches - when to use which

I used to call the Put(Key, Value) method to set data in Azure cache. I later learnt that this method could lead to race conditions during writes and introduced the following code for setting data into cache.
try
{
    if (GetData(key) == null)
    {
        _cache.Add(key, "--dummy--");
    }
    DataCacheLockHandle lockHandle;
    TimeSpan lockTimeout = TimeSpan.FromMinutes(1);
    _cache.GetAndLock(key, lockTimeout, out lockHandle);
    if (ttlInMinutes == 0)
    {
        _cache.PutAndUnlock(key, value, lockHandle);
    }
    else
    {
        TimeSpan ttl = TimeSpan.FromMinutes(ttlInMinutes);
        _cache.PutAndUnlock(key, value, lockHandle, ttl);
    }
}
catch (Exception e)
{
}
This involves two IOs as against one in the previous call. Is this locking really needed in application code? Is cache consistency not taken care of by Azure's caching framework? What is the standard way of managing cache writes in Azure? When to use Put and when PutAndUnlock?
