APC with TYPO3: high fragmentation over time - caching

Using APCu with TYPO3 6.2 extensively, I always get a high fragmentation of the cache over time. I already had values of 99% with a smaller shm_size.
In case you are a TYPO3 admin, I also switched the caches cache_pagesection, cache_hash, cache_pages (currently for testing purposes moved to DB again), cache_rootline, extbase_reflection, extbase_opject as well as some other extension caches to apc backend. Mainly switching the cache_hash away from DB sped up menu rendering times dramatically (https://forge.typo3.org/issues/57953)
1) Does APC fragmentation matter at all or should I simply watch out that it just never runs out of memory?
2) To TYPO3 admins: do you happen to have an idea which tables cause fragmentation most and what bit in the apcu.ini configuration is relevant for usage with TYPO3?
I already tried using apc.stat = 0, apc.user_ttl = 0, apc.ttl = 0 (as in the T3 caching guide http://docs.typo3.org/typo3cms/CoreApiReference/CachingFramework/FrontendsBackends/Index.html#caching-backend-apc) and to increase the shm_size (currently at 512M where normally around 100M would be used). Shm_size does a good job at reducing fragmentation, but I'd rather have a smaller but full cache than a large one unused.
3) To APC(u) admins: could it be that frequently updating cache entries that change in size as well cause most of the fragmentation? Or is there any other misconfiguration that I'm unaware of?
I know there is a lot of entries in cache (mainly JSON data from remote servers) where some of them update every 5 minutes and normally are a different size each time. If that is indeed a cause, how can I avoid it? Btw: APCU Info shows there are a lot of entries taking up only 2kB but each with a fragmented spacing of about 200 Bytes.
4) To TYPO3 and APC admins: apc has a great integration in TYPO3, but for more frequently updating and many small entries, would you advise a different cache backend than apc?

This is no longer relevant for us, I found a different solution reverting back to MySQL cache. Though if anyone comes here via search, this is how we did it in the end:
Leave the APC cache alone and only use it for the preconfigured extbase_object cache. This one is less than 1MB, has only a few inserts at the beginning and yields a very high hit / miss ratio after. As stated in the install tool in the section "Configuration Presets", this is what the cache backend has been designed for.
I discovered this bug https://forge.typo3.org/issues/59587 in the process and reviewed our cache usage again. It resulted in huge cache entries only used for tag-to-ident-mappings. My conclusion is, even after trying out the fixed cache, that APCu is great for storing frequently accessed key-value mappings but yields when a lot of frequently inserted or tagged entries are around (such as cache_hash or cache_pages).
Right now, the MySQL cache tables have a better performance with extended usage of the MySQL server memory cache (but in contrast to APCu with disc backup). This was the magic setup for our my.cnf (found here: http://www.mysqlperformanceblog.com/2007/11/01/innodb-performance-optimization-basics/):
innodb_buffer_pool_size = 512M
innodb_log_file_size = 256M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 2
innodb_thread_concurrency = 8
innodb_flush_method=O_DIRECT
innodb_file_per_table
With this additional MySQL server setup, the default typo3 cache tables do their job best.

Related

How Stack Overflow has implemented caching?

Not sure if Stack Overflow uses caching to enhance the loading speed of its pages, but if it has, it has done a great job. There are many components to be updated. I this picture you see every single second there are many components to be updated. How Stack Overflow does it so perfectly?
This is answered much more comprehensively on Meta.SE in this answer. The highlights, however, are:
Basically everything is cached, especially everything served to anonymous users.
They use Redis servers with 96 GB of Ram. This server keeps an "L1 Cache" of recently set and read values. These values are compressed before sending them to Redis. They also use IIS's Output Caching
Each site has 3 types of caches:
"Local" (user sessions, view counts, etc)
"Site" (hot question ids, user acceptance rates)
"Global" (user inboxes, API quotas)
There are even more details on High Scalability , though this is nearly 5 years old at this point. There is a newer article (from 2014) that mentions that there are two additional levels of caching involved as well:
SQL Server (the entire database sits in memory). As of 2013, the database servers had 384 GB of memory
SSD (hit only when the SQL server cache is warming up...as not something I'd consider "cache", but it's mentioned in the article).

When 777ms are still not good enough (MariaDB tuning)

Over the past couple of months, I've been on a rampage optimising a Joomla website that I'm managing. When I first started, the homepage used to open in around 30-40 seconds, in spite of repeatedly upgrading my dedicated server, as suggested by the hosting firm.
I was able to bring the pagespeed down to around 800ms by religiously following all the recommendations of the likes of GT Matrix and PingdomTools, (such as using JCH-optimize, .htaccess caching and compression settings, and MaxCDN) but now I'm stuck optimising my my.cnf settings, trying various settings suggested on a number of related articles. The fastest I'm getting the homepage to open - with the current settings - is 777ms after refresh, which might not sound too bad, but look at the configuration of my dedicated server:
2 Quads, 128GB, 2x480GB SSD RAID
CloudLinux/Cpanel/WHM
Apache/suEXEC/PHP5/FastCGI
MariaDB 10.0.17 (all tables converted to XtraDB/InnoDB)
The site traffic is moderate, 10,000 and 20,000 visitors per day, with around 200,000 pageviews.
These are the current my.cnf settings. My goal is to bring the pagespeed down to under 600ms, which should be possible with this kind of hardware, provided it is tuned the right way.
[mysqld]
local-infile=0
max_connections=10000
max_user_connections=1000
max_connect_errors=20
key_buffer_size=1G
join_buffer_size=1G
bulk_insert_buffer_size=1G
max_allowed_packet=1G
slow_query_log=1
slow_query_log_file="diskar/mysql-slow.log"
long_query_time=40
connect_timeout=120
wait_timeout=20
interfactive_timeout=25
back_log=500
query_cache_type=1
query_cache_size=512M
query_cache_limit=512K
query_cache_min_res_unit=2K
sort_buffer_size=1G
thread_cache_size=16
open_files_limit=10000
tmp_table_size=8G
thread_handling=pool-of-threads
thread_stack=512M
thread_pool_size=12
thread_pool_idle_timeout=500
thread_cache_size=1000
table_open_cache=52428
table_definition_cache=8192
default-storage-engine=InnoDB
[innodb]
memlock
innodb_buffer_pool_size=96G
innodb_buffer_pool_instances=12
innodb_additional_mem_pool_size=4G
innodb_log_bugger_size=1G
innodb_open_files=300
innodb_data_file_path=ibdata1:400M:autoextend
innodb_use_native_aio=1
innodb_doublewrite=0
innodb_user_atomic_writes=1
innodb_flus_log_at_trx_commit=2
innodb_compression_level=6
innodb_compression_algorithm=2
innodb_flus_method=O_DIRECT
innodb_log_file_size=4G
innodb_log_files_in_group=3
innodb_buffer_pool_instances=16
innodb_adaptive_hash_index_partitions=16
innodb_thread_concurrency
innodb_thread_concurrency=24
innodb_write_io_threads=24
innodb_read_io_threads=32
innodb_adaptive_flushing=1
innodb_flush_neighbors=0
innodb_io_capacity=20000
innodb_io_capacity_max=40000
innodb_lru_scan_depth=20000
innodb_purge_threads=1
innodb_randmon_read_ahead=1
innodb_read_io_threads=64
innodb_write_io_threads=64
innodb_use_fallocate=1
innodb_use_atomic_writes=1
inndb_use_trim=1
innodb_mtflush_threads=16
innodb_use_mfflush=1
innodb_file_per_table=1
innodb_file_format=Barracuda
innodb_fast_shutdown=1
I tried Memcached and APCU, but it didn't work. The site actually runs 2-3 times faster with 'Files' as the caching handler in Joomla's Global Configuration. And yes, I ran my-sqltuner, but that was of no help.
I am newby as far as Linux is concerned and suspect that above settings could be improved. Any comments and/or suggestions?
long_query_time=40
Set that to 1 so you can find out what the slow queries are.
max_connections=10000
That is unreasonably high. If you come anywhere near it, you will have more problems than failure to connect. Let's say only 3000.
query_cache_type=1
query_cache_size=512M
The Query cache is hurting performance by being so large. This is because any write causes all QC entries for the table to be purged. Recommend no more than 50M. If you have heavy writes, it might be better to change the type to DEMAND and pepper your SELECTs with SQL_CACHE (for relatively static tables) or SQL_NO_CACHE (for busy tables).
What OS?
Are the entries in [innodb] making it into the system? I thought these needed to be in [mysqld]. Check by doing SHOW VARIABLES LIKE 'innodb%';.
Ah, buggers; a spelling error:
innodb_log_bugger_size=1G
innodb_flus_log_at_trx_commit=2
inndb_use_trim=1
and more??
After you get some data in the slowlog, run pt-query-digest, and let's discuss the top couple of queries.

2 instances of Redis: as a cache and as a persistent datastore

I want to setup 2 instances of Redis because I have different requirements for the data I want to store in Redis. While I sometimes do not mind losing some data that are used primarly as cached data, I want to avoid to lose some data in some cases like when I use python RQ that stores into Redis the jobs to execute.
I mentionned below the main settings to achieve such a goal.
What do you think?
Did I forget anything important?
1) Redis as a cache
# Snapshotting to not rebuild the whole cache if it has to restart
# Be reasonable to not decrease the performances
save 900 1
save 300 10
save 60 10000
# Define a max memory and remove less recently used keys
maxmemory X # To define according needs
maxmemory-policy allkeys-lru
maxmemory-samples 5
# The rdb file name
dbfilename dump.rdb
# The working directory.
dir ./
# Make sure appendonly is disabled
appendonly no
2) Redis as a persistent datastore
# Disable snapshotting since we will save each request, see appendonly
save ""
# No limit in memory
# How to disable it? By not defining it in the config file?
maxmemory
# Enable appendonly
appendonly yes
appendfilename redis-aof.aof
appendfsync always # Save on each request to not lose any data
no-appendfsync-on-rewrite no
# Rewrite the AOL file, choose a good min size based on the approximate size of the DB?
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 32mb
aof-rewrite-incremental-fsync yes
aof-load-truncated yes
Sources:
http://redis.io/topics/persistence
https://raw.githubusercontent.com/antirez/redis/2.8/redis.conf
http://fr.slideshare.net/eugef/redis-persistence-in-practice-1
http://oldblog.antirez.com/post/redis-persistence-demystified.html
How to perform Persistence Store in Redis?
https://www.packtpub.com/books/content/implementing-persistence-redis-intermediate
I think your persistence options are too aggressive - but it mostly depends on the nature and the volume of your data.
For the cache, using RDB is a good idea, but keep in mind that depending on the volume of data, dumping the content of the memory on disk has a cost. On my system, Redis can write memory data at 400 MB/s, but note that data may (or may not) be compressed, may (or may not) be using dense data structures, so your mileage will vary. With your settings, a cache supporting heavy writing will generate a dump every minute. You have to check that with the volume you have, the dump duration is well below that minute (something like 6-10 seconds would be fine). Actually, I would recommend to keep only save 900 1 and remove the other save lines. And even a dump every 15 min could be considered as too frequent, especially if you have SSD hardware that will progressively wear out.
For the persistent store, you need to define also the dir parameter (since it also controls the location of the AOF file). The appendfsync always option is overkill and too slow for most purposes, except if you have very low throughput. You should set it to everysec. If you cannot afford to lose a single bit of data even in case of system crash, then using Redis as a storage backend is not a good idea. Finally, you will probably have to adjust auto-aof-rewrite-percentage and auto-aof-rewrite-min-size to the level of write throughput the Redis instance has to sustain.
I totally agree with #Didier - this is more of a supplement rather than a full answer.
First note that Redis offers tunable persistency - you can use RDB and/or AOF. While a your choice of using RDB for a persistent cache makes perfect sense, I would recommend considering using both for your persistent store. This will allow you both point-in-time recovery based on the snapshots (i.e. backup) as well as post-crash recovery to the last recorded operation with the AOF.
For the persistent store, you don't want to set maxmemory to 0 (which is the default if it is commented out in the conf file). When set to 0, Redis will use as much memory as the OS will give it so eventually, as your dataset grows, you will run into a situation where the OS will kill it to free memory (this often happens when you least expect it ;)). You should, instead, use a real value that's based on the amount of RAM that your server has with enough padding for the OS. For example, if your server has 16GB of RAM, as a rule of thumb I'd restrict Redis from using more than 14GB.
But there's a catch. Since you've read everything about Redis' persistency, you probably remember that Redis forks to write the data to disk. Forking can more than double the memory consumption (forked copy + changes) during the child process' execution so you need to make sure that your server has enough free memory to accommodate that if you use data persistence. Also note that you should consider in your maxmemory calculation other potential memory-consuming thingies such as replication and client buffers depending on what/how you and the app use Redis.

OLAP Saiku Cache expires

I'm using Saiku and PHPAnalytics to run MDX queries on my cube.
it seems if i run queries it's all good, caching is fine. But if I go for 2 hours and run those queries again - it does not using cache! Why? I need the cache to be saved for a long time! What to do? I tried to add this ti mondrian.properties mondrian.rolap.CachePool.costLimit = 2147483647
But no help. What do to?
The default in-memory cache of Mondrian stores things in a WeakHashMap. This means that it could be cleared at the discretion of the JVM's garbage collector. Most application servers are setup to do a periodical sweep of garbage collection (usually each hour or so). You have to either tweak your JVM's configuration to not do this.
-Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000
You can also implement your own cache implementation of the SegmentCache SPI. If your implementation uses hard references, they will never be collected. This is trickier to do and will require you to do quite a bit of studying to get it right. You can start by taking a look at the default implementation and start from there.
The mondrian cache should cache up until the cache is deliberately flushed. That said it uses an aging system to determine what should be cached should it run out of memory to store the data, the oldest query gets pushed out of the cache and replaced.
I've not tried the PHPAnalytics stuff, but maybe they've put some call into the Saiku server to flush the cache on a regular basis, otherwise this shouldn't happen.

What happens when Varnish Cache is full?

I'm using varnish with
-s malloc,1G"
It's currently 98% full. Once it completely full what will happen?
With it purge?
Maybe purge old images/pages?
Or better yet purge the files with least amount of hits?
It looks like Varnish uses a LRU (least recently used) strategy to remove items from cache when the cache becomes full with things whose TTL (time to live) has not expired (so first remove things whose TTL is expired, if the cache is still full remove things least recently accessed).
See
https://www.varnish-cache.org/trac/wiki/ArchitectureLRU
Note you can watch the n_lru_nuked counter to see the rate at which things are being flushed from the cache due to LRU.

Resources