EhCache to put new element to disk if memory store full - ehcache

I would like to use Ehcache with a combination of memory and disk stores. Ehcache should move new elements to disk when memory is full. For example, if I have 100 elements in the Ehcache memory store and try to put a 101st element while memory is full, the 101st element should go to disk, not the 1st element.
Could you please let me know the cache configuration to achieve this?

Ehcache no longer works that way. The tiering model introduced in Ehcache 2.6 and used since then will always store ALL mappings into the lower tier, disk in your case.
The reason is predictable latency. If Ehcache waited for the memory tier to be full before using the disk, you would see a latency increase, possibly at the worst time for your application. The model where all mappings are written to disk gives you an upper bound for the write latency, while reads may be faster for hot values that are available directly in memory.
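To illustrate the idea, here is a toy model in plain Java. It is not Ehcache's implementation: the class name TieredCacheSketch, the in-memory HashMap standing in for the disk store, and the LRU policy for the heap tier are all assumptions made for the sketch. Every put goes to the lower tier, and the bounded heap tier only holds mappings that were faulted in by a read.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of "every mapping lives in the lower tier"; NOT Ehcache code. */
final class TieredCacheSketch<K, V> {
    // Stand-in for the disk store: it always holds every mapping.
    private final Map<K, V> disk = new HashMap<>();
    // Bounded tier of hot entries, evicting the least recently used one.
    private final Map<K, V> heap;

    TieredCacheSketch(final int maxHeapEntries) {
        this.heap = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxHeapEntries;
            }
        };
    }

    void put(K key, V value) {
        disk.put(key, value); // every write pays the lower-tier cost
        heap.remove(key);     // drop any stale hot copy
    }

    V get(K key) {
        V hot = heap.get(key);
        if (hot != null) {
            return hot;       // fast path: served from the heap tier
        }
        V cold = disk.get(key);
        if (cold != null) {
            heap.put(key, cold); // fault the mapping into the heap tier
        }
        return cold;
    }

    int heapSize() { return heap.size(); }
    int diskSize() { return disk.size(); }

    public static void main(String[] args) {
        TieredCacheSketch<Integer, String> cache = new TieredCacheSketch<>(100);
        for (int i = 1; i <= 101; i++) {
            cache.put(i, "value-" + i);
        }
        for (int i = 1; i <= 101; i++) {
            cache.get(i);
        }
        // All 101 mappings are in the lower tier; at most 100 are hot.
        System.out.println("disk=" + cache.diskSize() + " heap=" + cache.heapSize());
    }
}
```

In this model the 101st element is no different from the first: both are written to the lower tier, and which ones stay in memory depends only on what is read.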

Related

Spring Data JPA Meta JpaMetamodelMappingContext Memory Consumption

My Spring Data JPA/Hibernate Application consumes over 2GB of memory at start without a single user hitting it. I am using Hazelcast as the second level cache but I had the same issue when I used ehCache as well so that is probably not the cause of the issue.
I ran a profile with a heap dump in VisualVM and I see that the bulk of the memory is being consumed by JpaMetamodelMappingContext and, secondarily, by a ton of Map objects. I just need help in deciphering what I am seeing and whether this is actually a problem. I do have a hundred classes in the model, so this may be normal, but I have no point of reference. It just seems a bit excessive.
Once I get a load of 100 concurrent users, my memory consumption increases to 6-7 GB. That is quite normal for the amount of data I push around and cache, but I feel like if I could reduce the initial memory, I'd have a lot more room for growth.
I don't think you have a problem here.
Instead, I think you are misinterpreting the data you are looking at.
Note that the heap space diagram displays two numbers: Heap size and Used heap
Heap size (orange) is the amount of memory available to the JVM for the heap.
This means it is the amount that the JVM requested at some point from the OS.
Used heap is the part of the Heap size that is actually used.
Ignoring the startup phase, it grows linearly and then drops, repeatedly over time.
This is typical behavior of an idling application.
Some part of the application generates a moderate amount of garbage (rising part of the curve) which from time to time gets collected.
The low points of that curve are the amount of memory you are actually really using.
It seems to be about 250MB which doesn't sound very much to me, especially when you say that the total consumption of 6-7GB when actually working sounds reasonable to you.
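To make the two numbers concrete, here is a minimal sketch that reads them from inside the JVM using only the standard Runtime API. The class name HeapSnapshot is made up for illustration, and treating totalMemory() as the "Heap size" line and totalMemory() minus freeMemory() as the "Used heap" line is my reading of the diagram, not something taken from the profiler's documentation.

```java
/** Reads the JVM's committed heap and the part of it currently in use. */
final class HeapSnapshot {
    final long committedBytes; // roughly the "Heap size" line
    final long usedBytes;      // roughly the "Used heap" line

    private HeapSnapshot(long committedBytes, long usedBytes) {
        this.committedBytes = committedBytes;
        this.usedBytes = usedBytes;
    }

    static HeapSnapshot take() {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory();      // memory the JVM has obtained for the heap
        long used = committed - rt.freeMemory(); // live objects plus garbage not yet collected
        return new HeapSnapshot(committed, used);
    }

    public static void main(String[] args) {
        HeapSnapshot s = take();
        System.out.printf("heap size: %d MB, used heap: %d MB%n",
                s.committedBytes >> 20, s.usedBytes >> 20);
    }
}
```

Sampling this periodically should show the same sawtooth: usedBytes climbs as garbage accumulates and falls after a collection, while committedBytes changes far less often.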
Some other observations:
Both CPU load and heap usage grow fast and fluctuate a lot at start time.
This is to be expected because the analysis of repositories and entities happens at that time.
JpaMetamodelMappingContext's retained size is about 23MB.
Again, a good chunk of memory, but not that huge.
This includes the stuff it references, which is almost exclusively metadata from the JPA implementation as you can easily see when you take a look at its source.

Ignite uses more memory than expected

I am using Ignite to build a framework for data calculation. One big problem is that the memory usage is higher than expected. Data that uses 1 GB of memory outside Ignite uses more than 1.5 GB in an Ignite cache.
I have already turned off backups and copyOnRead. I don't use the query feature, so there is no extra index space. I also counted the extra space used for each cache and cache entry. The total memory usage still doesn't add up.
The data value for each cache entry is a big map containing lists of primitive arrays. Each entry is about 120MB.
What can be the problem? The data structure or the configuration?
Ignite does introduce some overhead to your data, and half a GB doesn't sound too bad to me. I would recommend that you refer to this guide for more details: https://apacheignite.readme.io/docs/capacity-planning
Difference between expected and real memory usage arises from 2 main points:
Each entry carries a constant overhead, consisting of the objects that support processing entries in a distributed computing environment.
E.g. you can declare a local int variable, which takes 4 bytes on the stack, but it's hard to make that variable long-lived and accessible from other places in the program. So you have to create a new Integer object, which consumes at least 16 bytes (a 300% overhead). Going further, if you want to make this object mutable and safely accessible by multiple threads, you have to create a new AtomicReference and store your object inside it. Total memory consumption will then be at least 32 bytes, and so on. Every time we extend an object's functionality, we get additional overhead; there is no other way.
Each entry is stored inside a cache in a special serialized format, so the actual memory footprint of an entry depends on the format that is used. By default Ignite uses BinaryMarshaller to convert an object to a byte array, and this array is stored inside a BinaryObject.
The reason is simple: distributed computing systems continuously exchange entries between nodes, and every entry in a cache should be ready to be transferred as a byte array.
Please read the article; it was recently updated. You could estimate the entry overhead for small entries by hand, but for big entries you should inspect the actual entry stored in the cache as a byte array. Look at the withKeepBinary method.
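As a rough way to see how a serialized form differs from a raw payload estimate, here is a small sketch that uses plain JDK serialization as a stand-in. This is an assumption made for illustration only: Ignite's BinaryMarshaller uses its own format, so the numbers will not match what Ignite stores. The class name SerializedSizeSketch and the sample data shape are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Compares a raw payload estimate with a serialized size (JDK format, not Ignite's). */
final class SerializedSizeSketch {
    /** Serializes the value with ObjectOutputStream and returns the byte count. */
    static int serializedSize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    /** Builds a map of lists of primitive arrays, similar in shape to the entry in the question. */
    static Map<String, List<double[]>> sampleEntry(int lists, int arraysPerList, int arrayLength) {
        Map<String, List<double[]>> entry = new HashMap<>();
        for (int i = 0; i < lists; i++) {
            List<double[]> arrays = new ArrayList<>();
            for (int j = 0; j < arraysPerList; j++) {
                arrays.add(new double[arrayLength]);
            }
            entry.put("list-" + i, arrays);
        }
        return entry;
    }

    public static void main(String[] args) {
        int lists = 4, arraysPerList = 8, arrayLength = 1000;
        long rawPayload = (long) lists * arraysPerList * arrayLength * Double.BYTES;
        int serialized = serializedSize(sampleEntry(lists, arraysPerList, arrayLength));
        System.out.println("raw payload bytes: " + rawPayload);
        System.out.println("serialized bytes:  " + serialized);
    }
}
```

For an actual Ignite cache, the entry itself would need to be inspected in binary form, as suggested above.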

EHcache performance in using disk store cache

We are using the ehcache in our application. Look at the following configuration:
<diskStore path="java.io.tmpdir" />
<cache name="service" maxElementsInMemory="50000" eternal="true" overflowToDisk="true"/>
Since we have configured eternal="true", is it going to keep cache entries forever? Is there a chance of running out of disk space?
What would be the performance impact of the disk store? It is definitely slower than the in-memory cache, but by how much?
If more caches are stored on disk, will the multiple file operations cause I/O issues?
Please suggest best practices for a production-grade application. Consider that we have 3 GB of heap memory and 25000 concurrent users accessing the application. There is no database used in our application.
The application is deployed in WAS 8.5.5.
eternal=true means mappings will never expire.
overflowToDisk=true means that all mappings put in the cache will end up written on disk, from the first mapping put in the cache. The current Ehcache tiering model (since version 2.6.0) always makes use of the slower store - disk here - in order to give you predictable latency. When a mapping is accessed, it gets faulted into heap for faster retrieval. When too many mappings are faulted in heap, eviction from heap kicks in to keep the heap cache size according to maxElementsInMemory.
Given that you do not size the disk store by setting maxEntriesLocalDisk (maxElementsOnDisk in older configurations), it defaults to 0, which means no limit. So yes, you may run out of disk space if you never explicitly remove cache entries.
It is quite hard to recommend proper cache size without knowing the details of your application. What I can recommend is that you measure both heap and disk usage and assess when the increased memory usage outweighs the performance gain.

Does ehcache reserve (allocate) heap memory set with maxBytesLocalHeap?

I am using Ehcache v. 2.8.
But I am not sure if I understand the documentation correctly regarding reservation of the memory for the cache.
If the memory is set in ehcache.xml like this:
<ehcache maxBytesLocalHeap="256M">
(...)
</ehcache>
Will it actually be allocated at start, so that this cache uses exactly 256MB of heap, or does this only mean (as the attribute name suggests) that this cache can take at most 256MB of heap?
This means that this cache will do its best to contain 256MB or less of user data.
But note that the actual memory footprint of the cache can be somewhat larger due to internal data structures.
Also in case the cache operates at full capacity, it may temporarily go over size while eviction takes place.

Does larger cache size always lead to improved performance?

The cache inside the processor increases instruction execution speed, so I'm wondering what happens if we increase the size of the cache to many MBs, or even 1 GB. Is it possible? If it is, will increasing the cache size always result in increased performance?
There is a tradeoff between cache size and hit rate on one side and read latency and power consumption on the other. So the answer to your first question is: technically (probably) possible, but unlikely to make sense, since the L3 cache in modern CPUs, at just a few MB, already has a read latency of tens of cycles.
Performance depends more on the memory access pattern than on cache size. More precisely, if the program is mainly sequential, cache size is not a big deal. If there are a lot of random accesses (e.g. when associative containers are actively used), cache size really matters.
The above is true for single computational tasks. In a multiprocess environment with several active processes, a bigger cache is always better, because it reduces interprocess contention.
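The access-pattern point can be illustrated with a toy simulation. This is a sketch under strong assumptions: a fully associative LRU cache that holds individual addresses, with no cache lines and no prefetching, which is far simpler than real hardware. The class and method names are made up. It compares the hit rate of a streaming scan (each address touched once) with random accesses over a fixed working set, for two cache sizes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

/** Toy LRU cache simulation; not a model of any real CPU cache. */
final class CacheHitRateSketch {
    /** Replays the address trace against an LRU cache and returns the hit fraction. */
    static double hitRate(int[] addresses, final int capacity) {
        Map<Integer, Boolean> lru = new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                return size() > capacity;
            }
        };
        int hits = 0;
        for (int address : addresses) {
            if (lru.get(address) != null) {
                hits++;                       // already cached
            } else {
                lru.put(address, Boolean.TRUE); // miss: load it, possibly evicting
            }
        }
        return (double) hits / addresses.length;
    }

    /** Every address is touched exactly once, in order. */
    static int[] streaming(int n) {
        int[] trace = new int[n];
        for (int i = 0; i < n; i++) {
            trace[i] = i;
        }
        return trace;
    }

    /** Uniformly random accesses within a fixed working set. */
    static int[] randomAccess(int n, int workingSet, long seed) {
        Random random = new Random(seed);
        int[] trace = new int[n];
        for (int i = 0; i < n; i++) {
            trace[i] = random.nextInt(workingSet);
        }
        return trace;
    }

    public static void main(String[] args) {
        int[] stream = streaming(200_000);
        int[] scattered = randomAccess(200_000, 4096, 42L);
        for (int capacity : new int[] {64, 1024}) {
            System.out.printf("capacity %d: streaming %.3f, random %.3f%n",
                    capacity, hitRate(stream, capacity), hitRate(scattered, capacity));
        }
    }
}
```

In this model the streaming trace never hits regardless of capacity, so a larger cache buys nothing, while the random trace hits more often as the cache covers more of the working set. Real hardware serves sequential access well for other reasons (cache lines and prefetching), which this sketch does not model.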
This is a simplification, but one of the primary reasons the cache increases 'speed' is that it provides fast memory very close to the processor, which is much faster to access than main memory. So, in theory, increasing the size of the cache should allow more information to be stored in this 'fast' memory, and thereby improve performance. In the real world things are obviously much more complex than this, and there will of course be added complexity and cost associated with such a large cache, and with dealing with issues like cache coherency, caching algorithms, etc.
A cache stores data temporarily and is used to quickly locate data that is accessed frequently. If the cache grew to 1 GB or more, it would no longer behave like a cache; it would effectively become RAM. RAM also stores data temporarily, but without a cache, every request from the processor would have to be served from RAM, which takes longer because RAM is large (4 GB or more). So we use the cache as temporary memory for the things we used recently or frequently. The processor then reads that data directly from the cache; because the cache is small, finding the data takes little time, and RAM does not need to be involved at all. As an analogy, imagine a large classroom (RAM) from which the principal (processor) often calls the class representative, or CR (data). Each time, someone has to go to the classroom, find the CR among 1000 students, and bring him to the principal, which takes time. If we reserve a specific place (cache) for the CR, who is the student the principal calls most often, finding the CR becomes easy.

Resources