I'd like to use as much RAM as possible with Ehcache, but Ehcache still uses some space on the hard drive. This is my config:
name="MainCacheManager"
overflowToDisk="false"
diskPersistent="false"
updateCheck="true"
monitoring="autodetect"
dynamicConfig="true"
maxBytesLocalHeap="2G"
maxBytesLocalDisk="1M"
Is it possible to disable the disk swap entirely?
Also, this value doesn't seem to work:
maxBytesLocalDisk="1M"
The Ehcache swap takes much more space than 1 MB.
Recent versions of Ehcache (2.6+) use a tiering model. This means that the lowest tier, also called the authority, always contains all the entries of the cache.
So you should never configure a disk store smaller than the on-heap store, as it will limit the cache capacity.
If you do not want a disk store, drop the maxBytesLocalDisk="1M" config line.
Also, the disk store of Ehcache should not be compared to a swap file.
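For illustration, a heap-only setup could look like the sketch below (assuming the Ehcache 2.6+ XML schema; the cache name is hypothetical):
<ehcache name="MainCacheManager"
         updateCheck="true"
         monitoring="autodetect"
         dynamicConfig="true"
         maxBytesLocalHeap="2G">
    <!-- No maxBytesLocalDisk, no overflowToDisk and no <persistence>:
         entries live purely on the heap and nothing is written to disk -->
    <cache name="exampleCache"/>
</ehcache>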
I've implemented a web application using GeoServer to provide tile maps. To apply a caching strategy, I've enabled the embedded GeoWebCache and set up the tile page store in a PostgreSQL database. The disk quota is set to 5 MB with the LFU policy, in order to test the truncation behavior when the quota limit is exceeded. The problem shows up when the cached volume exceeds 5 MB: GeoWebCache deletes all tiles without regard to the "frequency_of_use" of each tile. Is this the expected behavior? I think it should remove the least used tiles first.
<!-- geowebcache-diskquota.xml -->
<gwcQuotaConfiguration>
<enabled>true</enabled>
<cacheCleanUpFrequency>10</cacheCleanUpFrequency>
<cacheCleanUpUnits>SECONDS</cacheCleanUpUnits>
<maxConcurrentCleanUps>2</maxConcurrentCleanUps>
<globalExpirationPolicyName>LFU</globalExpirationPolicyName>
<globalQuota>
<value>5</value>
<units>MiB</units>
</globalQuota>
<quotaStore>JDBC</quotaStore>
</gwcQuotaConfiguration>
and the geowebcache-diskquota-jdbc.xml file:
<gwcJdbcConfiguration>
<dialect>PostgreSQL</dialect>
<JNDISource>java:comp/env/jdbc/gwc</JNDISource>
<connectionPool>
<driver>org.postgresql.Driver</driver>
<url>jdbc:postgresql://localhost:5432/gwc</url>
<username>postgres</username>
<password></password>
<minConnections>1</minConnections>
<maxConnections>10</maxConnections>
<connectionTimeout>10000</connectionTimeout>
<maxOpenPreparedStatements>50</maxOpenPreparedStatements>
</connectionPool>
</gwcJdbcConfiguration>
The disk quota mechanism does not track each single tile, but "tile pages": groups of tiles whose statistics are tracked as a unit, in order to keep the accounting database small.
I don't know the implementation in enough detail to tell you how big a tile page is, but for a tile cache that is potentially hundreds of gigabytes, I would not be surprised if the minimum tracking unit were larger than 5 MB. If that is the case, deleting all the tiles available under a 5 MB quota would be very likely.
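As a quick sanity check (the 500 MiB figure below is only an illustrative assumption), raising the quota well above the size of a single tile page should let the LFU policy actually discriminate between pages instead of wiping everything:
<globalQuota>
<value>500</value>
<units>MiB</units>
</globalQuota>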
I have a PERSISTENT cache configured like this:
<region name="stock-hist" refid="PARTITION_PERSISTENT" >
<region-attributes disk-store-name="myOverflowStore" disk-synchronous="false">
<partition-attributes local-max-memory="1024" />
<eviction-attributes>
<!-- Overflow to disk when 100 megabytes of data reside in the region -->
<lru-memory-size maximum="100" action="overflow-to-disk"/>
</eviction-attributes>
</region-attributes>
</region>
The problem is that when I store, say, 8 GB of data, the cache crashes because it runs out of memory. I do not want that to happen. I need the data to overflow to disk when it exceeds 100 MB, but to be brought back into the cache when I access it again. I also want a persistent cache.
Also, in case I write behind to a database, how can I evict data after some time?
How does this work?
This is a use case for which an in-memory data grid is not intended. Based on the problem you are describing, you should either use a relational DB or increase memory so that the in-memory data grid actually fits your data. Overflow features are intended as a safety valve, not for "normal" use.
I do not understand when you say that it crashes due to "too much" memory, since it obviously does not have "enough" memory. I suspect that there is not sufficient disk space defined. If you think otherwise, check your explicit and implicit disk allocations.
As for time-based eviction/expiration, please see "PARTITION_HEAP_LRU" at: http://gemfire.docs.pivotal.io/docs-gemfire/latest/reference/topics/region_shortcuts_reference.html
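If you need expiration after a fixed time rather than LRU eviction, one way (sketched here with an example one-hour timeout; treat the exact attributes as an assumption to verify against your GemFire version) is entry-time-to-live in cache.xml, with statistics enabled on the region:
<region name="stock-hist" refid="PARTITION_PERSISTENT">
    <region-attributes statistics-enabled="true" disk-store-name="myOverflowStore">
        <entry-time-to-live>
            <!-- Expire entries one hour after creation; action may be destroy or invalidate -->
            <expiration-attributes timeout="3600" action="destroy"/>
        </entry-time-to-live>
    </region-attributes>
</region>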
I want to set up two instances of Redis because I have different requirements for the data I want to store in them. While I sometimes do not mind losing data that is used primarily as cached data, in other cases I want to avoid losing any data, for example when I use python RQ, which stores the jobs to execute in Redis.
I listed below the main settings to achieve such a goal.
What do you think?
Did I forget anything important?
1) Redis as a cache
# Snapshotting to not rebuild the whole cache if it has to restart
# Be reasonable to not decrease the performances
save 900 1
save 300 10
save 60 10000
# Define a max memory and remove less recently used keys
maxmemory X # To define according needs
maxmemory-policy allkeys-lru
maxmemory-samples 5
# The rdb file name
dbfilename dump.rdb
# The working directory.
dir ./
# Make sure appendonly is disabled
appendonly no
2) Redis as a persistent datastore
# Disable snapshotting since we will save each request, see appendonly
save ""
# No limit in memory
# How to disable it? By not defining it in the config file?
maxmemory
# Enable appendonly
appendonly yes
appendfilename redis-aof.aof
appendfsync always # Save on each request to not lose any data
no-appendfsync-on-rewrite no
# Rewrite the AOF file; choose a good min size based on the approximate size of the DB?
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 32mb
aof-rewrite-incremental-fsync yes
aof-load-truncated yes
Sources:
http://redis.io/topics/persistence
https://raw.githubusercontent.com/antirez/redis/2.8/redis.conf
http://fr.slideshare.net/eugef/redis-persistence-in-practice-1
http://oldblog.antirez.com/post/redis-persistence-demystified.html
How to perform Persistence Store in Redis?
https://www.packtpub.com/books/content/implementing-persistence-redis-intermediate
I think your persistence options are too aggressive - but it mostly depends on the nature and the volume of your data.
For the cache, using RDB is a good idea, but keep in mind that, depending on the volume of data, dumping the content of memory to disk has a cost. On my system, Redis can write memory data at 400 MB/s, but note that the data may (or may not) be compressed and may (or may not) use dense data structures, so your mileage will vary. With your settings, a cache under heavy writing will generate a dump every minute. You have to check that, with the volume you have, the dump duration stays well below that minute (something like 6-10 seconds would be fine). Actually, I would recommend keeping only save 900 1 and removing the other save lines. Even a dump every 15 minutes could be considered too frequent, especially if you have SSD hardware that will progressively wear out.
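Applied to the cache configuration above, that advice boils down to something like this sketch (the maxmemory value is only a placeholder):
# Cache instance: a single, less aggressive snapshot rule
save 900 1
# Placeholder value; size it to your actual working set
maxmemory 2gb
maxmemory-policy allkeys-lru
appendonly no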
For the persistent store, you also need to define the dir parameter (since it controls the location of the AOF file as well). The appendfsync always option is overkill and too slow for most purposes, unless you have very low throughput; you should set it to everysec. If you cannot afford to lose a single bit of data even in case of a system crash, then using Redis as a storage backend is not a good idea. Finally, you will probably have to adjust auto-aof-rewrite-percentage and auto-aof-rewrite-min-size to the level of write throughput the Redis instance has to sustain.
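A sketch of the persistent-store side following those recommendations (the directory path and rewrite threshold are illustrative assumptions):
# Persistent store instance
# dir also determines where the AOF file is written
dir /var/lib/redis-persistent
appendonly yes
appendfilename redis-aof.aof
# everysec trades at most one second of data for much better throughput than always
appendfsync everysec
auto-aof-rewrite-percentage 100
# Tune to the real AOF growth rate
auto-aof-rewrite-min-size 64mb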
I totally agree with @Didier - this is more of a supplement than a full answer.
First note that Redis offers tunable persistence - you can use RDB and/or AOF. While your choice of using RDB for a persistent cache makes perfect sense, I would recommend considering using both for your persistent store. This will give you both point-in-time recovery based on the snapshots (i.e. backups) and post-crash recovery up to the last recorded operation with the AOF.
For the persistent store, you don't want to set maxmemory to 0 (which is the default if it is commented out in the conf file). When set to 0, Redis will use as much memory as the OS will give it so eventually, as your dataset grows, you will run into a situation where the OS will kill it to free memory (this often happens when you least expect it ;)). You should, instead, use a real value that's based on the amount of RAM that your server has with enough padding for the OS. For example, if your server has 16GB of RAM, as a rule of thumb I'd restrict Redis from using more than 14GB.
But there's a catch. Since you've read everything about Redis' persistency, you probably remember that Redis forks to write the data to disk. Forking can more than double the memory consumption (forked copy + changes) during the child process' execution so you need to make sure that your server has enough free memory to accommodate that if you use data persistence. Also note that you should consider in your maxmemory calculation other potential memory-consuming thingies such as replication and client buffers depending on what/how you and the app use Redis.
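Putting the two answers together, the persistent instance on a 16 GB server could look roughly like this (the 14 GB limit is the rule-of-thumb figure from above; the noeviction policy is an added assumption so that data is never silently evicted):
# Persistent store on a 16 GB server (illustrative values)
maxmemory 14gb
# Refuse writes over the limit instead of evicting persistent data
maxmemory-policy noeviction
# RDB snapshots for point-in-time backups, AOF for post-crash recovery
save 900 1
appendonly yes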
Using APCu extensively with TYPO3 6.2, I always get high fragmentation of the cache over time. I already saw values of 99% with a smaller shm_size.
In case you are a TYPO3 admin: I also switched the caches cache_pagesection, cache_hash, cache_pages (currently moved back to the DB for testing purposes), cache_rootline, extbase_reflection, extbase_object, as well as some other extension caches, to the APC backend. Mainly, switching cache_hash away from the DB sped up menu rendering times dramatically (https://forge.typo3.org/issues/57953).
1) Does APC fragmentation matter at all, or should I simply make sure it never runs out of memory?
2) To TYPO3 admins: do you happen to have an idea which tables cause the most fragmentation, and which bit of the apcu.ini configuration is relevant for use with TYPO3?
I already tried using apc.stat = 0, apc.user_ttl = 0 and apc.ttl = 0 (as in the TYPO3 caching guide http://docs.typo3.org/typo3cms/CoreApiReference/CachingFramework/FrontendsBackends/Index.html#caching-backend-apc) and increasing shm_size (currently at 512M, where normally around 100M would be used). shm_size does a good job at reducing fragmentation, but I'd rather have a smaller but full cache than a large, mostly unused one.
3) To APC(u) admins: could it be that frequently updated cache entries that also change in size cause most of the fragmentation? Or is there some other misconfiguration I'm unaware of?
I know there are a lot of entries in the cache (mainly JSON data from remote servers), some of which update every 5 minutes and normally have a different size each time. If that is indeed a cause, how can I avoid it? By the way, the APCu info shows a lot of entries taking up only 2 kB, but each with fragmented spacing of about 200 bytes.
4) To TYPO3 and APC admins: APC has great integration in TYPO3, but for many small, frequently updated entries, would you advise a different cache backend than APC?
This is no longer relevant for us; I found a different solution, reverting back to the MySQL cache. But if anyone comes here via search, this is how we did it in the end:
Leave the APC cache alone and only use it for the preconfigured extbase_object cache. That cache is less than 1 MB, has only a few inserts at the beginning, and yields a very high hit/miss ratio afterwards. As stated in the install tool in the "Configuration Presets" section, this is what the cache backend has been designed for.
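If you do keep individual caches on APC, the switch is a per-cache backend setting; a rough sketch for TYPO3 6.2 (the class names and file location are written from memory and should be treated as an assumption to verify, not a tested recipe):
// typo3conf/AdditionalConfiguration.php (sketch)
$GLOBALS['TYPO3_CONF_VARS']['SYS']['caching']['cacheConfigurations']['extbase_object']['backend']
    = 'TYPO3\\CMS\\Core\\Cache\\Backend\\ApcBackend';
$GLOBALS['TYPO3_CONF_VARS']['SYS']['caching']['cacheConfigurations']['cache_hash']['backend']
    = 'TYPO3\\CMS\\Core\\Cache\\Backend\\Typo3DatabaseBackend';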
I discovered the bug https://forge.typo3.org/issues/59587 in the process and reviewed our cache usage again. It turned out we had huge cache entries used only for tag-to-ident mappings. My conclusion, even after trying out the fixed cache, is that APCu is great for storing frequently accessed key-value mappings, but falls short when a lot of frequently inserted or tagged entries are involved (such as cache_hash or cache_pages).
Right now, the MySQL cache tables perform better, with extended use of the MySQL server memory cache (but, in contrast to APCu, with disk backup). This was the magic setup for our my.cnf (found here: http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/):
innodb_buffer_pool_size = 512M
innodb_log_file_size = 256M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 2
innodb_thread_concurrency = 8
innodb_flush_method=O_DIRECT
innodb_file_per_table
With this additional MySQL server setup, the default TYPO3 cache tables do their job best.
I tried using Ehcache, but it fails when I try to save data on disk. It gives:
net.sf.ehcache.config.InvalidConfigurationException: Search attributes
not supported by this store type:
net.sf.ehcache.store.DiskBackedMemoryStore
If my memory capacity of 200 MB is exceeded, can I not use Ehcache in that case?
The Ehcache Search API does not (at least currently) work with disk-persisted caches.
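A hedged workaround sketch (cache name, size and the search attribute are hypothetical): keep the searchable cache purely on-heap so it never touches the unsupported disk-backed store, e.g. in ehcache.xml:
<cache name="searchableCache" maxBytesLocalHeap="200M">
    <!-- No overflowToDisk / <persistence>, so the store stays heap-only -->
    <searchable>
        <searchAttribute name="age" expression="value.getAge()"/>
    </searchable>
</cache>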