Cassandra key cache optimization

I want to optimize the key cache in Cassandra. I know that key_cache_size_in_mb sets the capacity, in megabytes, of all key caches on the node. When increasing it, which stats should I look at to determine whether the increase is actually benefiting the system?
Currently, with the default settings, I am getting:
Key Cache : entries 20342, size 17.51 MB, capacity 100 MB, 4806 hits, 29618 requests, 0.162 recent hit rate, 14400 save period in seconds.
I have OpsCenter up and running too.

Look at the number of hits and the recent hit rate.
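To track those two numbers before and after a capacity change, a small script can pull them out of the "Key Cache" line. This is only a minimal sketch: the function name parse_key_cache is hypothetical, and it assumes the line has the same layout as the output quoted in the question.

```python
import re

def parse_key_cache(line):
    """Extract hits, requests and the reported hit rate from a 'Key Cache' line."""
    hits = int(re.search(r"(\d+) hits", line).group(1))
    requests = int(re.search(r"(\d+) requests", line).group(1))
    reported = float(re.search(r"([\d.]+) recent hit rate", line).group(1))
    return {
        "hits": hits,
        "requests": requests,
        "reported_hit_rate": reported,
        # hits / requests, guarded against an idle node with zero requests
        "computed_hit_rate": hits / requests if requests else 0.0,
    }

# The line quoted in the question above.
line = ("Key Cache : entries 20342, size 17.51 MB, capacity 100 MB, "
        "4806 hits, 29618 requests, 0.162 recent hit rate, "
        "14400 save period in seconds.")
stats = parse_key_cache(line)
print(stats)
```

Comparing the output of two runs, one before and one after the change, shows whether the hit rate actually moved.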


HBase: Why are there evicted blocks before the max size of the BlockCache is reached?

I am currently using a stock configuration of Apache HBase, with RegionServer heap at 4G and BlockCache sizing at 40%, so around 1.6G. No L2/BucketCache configured.
Here are the BlockCache metrics after ~2K requests to RegionServer. As you can see, there were blocks evicted already, probably leading to some of the misses.
Why were they evicted when we aren't even close to the limit?
Metric          Value      Description
Size            2.1 M      Current size of block cache in use (bytes)
Free            1.5 G      The total free memory currently available to store more cache entries (bytes)
Count           18         Number of blocks in block cache
Evicted         14         The total number of blocks evicted
Evictions       1,645      The total number of times an eviction has occurred
Mean            10,984     Mean age of Blocks at eviction time (seconds)
StdDev          5,853,922  Standard Deviation for age of Blocks at eviction time
Hits            1,861      Number requests that were cache hits
Hits Caching    1,854      Cache hit block requests but only requests set to cache block if a miss
Misses          58         Block requests that were cache misses but set to cache missed blocks
Misses Caching  58         Block requests that were cache misses but only requests set to use block cache
Hit Ratio       96.98%     Hit Count divided by total requests count
What you are seeing is the effect of the LRU treating blocks with three levels of priority: single-access, multi-access, and in-memory. For the default L1 LruBlockCache class their share of the cache can be set with (default values in brackets):
hbase.lru.blockcache.single.percentage (25%)
hbase.lru.blockcache.multi.percentage (50%)
hbase.lru.blockcache.memory.percentage (25%)
For the 4 GB heap example, with 40% set aside for the cache, you have 1.6 GB of cache, which is further divided into 400 MB, 800 MB, and 400 MB for the three priority levels, based on the above percentages.
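The arithmetic can be sketched as follows. The function name lru_priority_budgets is hypothetical; the fractions are the defaults quoted above, and sizes are kept in GB, so the results are the same rounded figures used in this answer.

```python
def lru_priority_budgets(heap_gb, cache_fraction=0.40,
                         single=0.25, multi=0.50, memory=0.25):
    """Split the block cache into the three LRU priority areas (sizes in GB)."""
    cache_gb = heap_gb * cache_fraction
    return {
        "total": cache_gb,
        "single-access": cache_gb * single,
        "multi-access": cache_gb * multi,
        "in-memory": cache_gb * memory,
    }

budgets = lru_priority_budgets(4.0)
for name, gb in budgets.items():
    print(f"{name}: {gb:.2f} GB")
```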
When a block is loaded from storage, it is usually flagged as single-access, unless the column family it belongs to has been configured with IN_MEMORY = true, which sets its priority to in-memory. If a later read requests the same single-access block, it is promoted to multi-access priority.
The LruBlockCache has an internal eviction thread that runs every 10 seconds and checks whether the blocks in each level together exceed their allowed percentage. Now, if you scan a larger table once, and assuming the cache was completely empty, all of the blocks are tagged single-access. If the table was 1 GB in size, you have loaded 1 GB into a 400 MB cache space, which the eviction thread is then going to reduce in due course. In fact, depending on how long the scan takes, the 10-second interval of the eviction thread may elapse during the scan, and it will start evicting blocks once you exceed the 25% threshold.
The eviction will first evict blocks from the single-access area, then the multi-access area, and finally, if there is still pressure on the heap, from the in-memory area. That is also why you should make sure the working set of your in-memory flagged column families does not exceed the configured cache area.
What can you do? If you have mostly single-access blocks, you could tweak the above percentages to give more to the single-access area of the LRU.

Cassandra key cache hit rate differs between nodetool and OpsCenter

I checked my key cache hit rate via nodetool and OpsCenter. The first shows a hit rate of 0.907:
Key Cache : entries 1152104, size 96.73 MB, capacity 100 MB, 52543777 hits, 57954469 requests, 0.907 recent hit rate, 14400 save period in seconds
but in OpsCenter the graph shows 100%.
Does anyone understand why they differ?
Cassandra arguably has a bug (or at least a typo) here: it labels the value as the recent hit rate, but it is the all-time hit rate.
It is grabbing the value of the "total" hit rate.
So although you may be getting a 100% hit rate for the last 19 minutes according to OpsCenter, it wasn't always 100%. The total number of hits divided by the total number of requests of all time is ~90%.
This is shown from:
52543777 hits, 57954469 requests
52543777 / 57954469 = 0.907
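To see how an all-time rate and a recent rate can disagree, here is a minimal sketch that derives both from two samples of the cumulative counters. The function name hit_rates and the earlier sample are hypothetical; only the current counters come from the nodetool output above.

```python
def hit_rates(prev, curr):
    """prev and curr are (hits, requests) samples of cumulative counters."""
    delta_hits = curr[0] - prev[0]
    delta_requests = curr[1] - prev[1]
    all_time = curr[0] / curr[1] if curr[1] else 0.0
    recent = delta_hits / delta_requests if delta_requests else 0.0
    return all_time, recent

curr = (52543777, 57954469)   # from the nodetool output above
prev = (52443777, 57854469)   # hypothetical earlier sample: 100000 fewer requests, all of them hits
all_time, recent = hit_rates(prev, curr)
print(f"all-time: {all_time:.3f}, recent: {recent:.3f}")
```

With these assumed samples, every request in the recent window was a hit, yet the all-time ratio stays near 0.907.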

Elasticsearch indexing performance issues

We have been facing some performance issues with Elasticsearch over the last couple of days. As you can see on the screenshot, the indexing rate drops significantly after the index reaches a certain size. At normal speed, we index around 3000 logs per second. When the index we write to reaches a size of about 10 GB, the rate drops.
We are using time-based indices, and around 00:00, when a new index is created by Logstash, the rate climbs again to ~3000 logs per second (that's why we think it is somehow related to the size of the index).
Server stats show nothing unusual in the CPU or memory stats (they are the same during drop phases), but one of the servers has a lot of I/O waits. Our Elasticsearch config is quite standard, with some adjustments for indexing performance (taken from the ES guide):
# If your index is on spinning platter drives, decrease this to one
# Reference / index-modules-merge
index.merge.scheduler.max_thread_count: 1
# allows larger segments to flush and decrease merge pressure
index.refresh_interval: 5s
# increase threshold_size from default when you are > ES 1.3.2
index.translog.flush_threshold_size: 1000mb
# JVM settings
bootstrap.mlockall: true (ES_HEAP SIZE is 50% of RAM)
We use two nodes, both with 8 GB of RAM, 2 CPU cores and a 300 GB HDD (dev environment).
I have seen clusters with much bigger indices than ours. Do you have any idea what we could do to fix the issues?
I just ran into the performance issues again. top sometimes shows around 60% wa (wait), but iotop only reports about 1000 K/s read and write at most. I have no idea where these waits are coming from.

How much load can Cassandra handle on m1.xlarge instance?

I set up a 3-node Cassandra (1.2.10) cluster on 3 EC2 m1.xlarge instances.
It is based on the default configuration, with several guidelines applied, like:
not using EBS, RAID 0 XFS on ephemeral disks instead,
commit logs on a separate disk,
6 GB heap, 200 MB new size (also tested with greater new size/heap values),
enhanced limits.conf.
With 500 writes per second, the cluster works for only a couple of hours. After that time it seems unable to respond because of CPU overload (mainly GC + compactions).
Nodes remain Up, but their load is huge and the logs are full of GC info and messages like:
ERROR [Native-Transport-Requests:186] 2013-12-10 18:38:12,412 (line 210) Unexpected exception during request Broken pipe
nodetool shows many dropped mutations on each node:
Message type Dropped
MUTATION 4072827
Is 500 wps too much for a 3-node cluster of m1.xlarge, and should I add nodes? Or is it possible to further tune GC somehow? What load are you able to serve with 3 nodes of m1.xlarge? What are your GC configs?
Cassandra is perfectly able to handle tens of thousands of small writes per second on a single node. I just checked on my laptop and got about 29000 writes/second from cassandra-stress on Cassandra 1.2. So 500 writes per second is not really an impressive number, even for a single node.
However, beware that there is also a limit on how fast data can be flushed to disk, and you definitely don't want your incoming data rate to be close to the physical capabilities of your HDDs. Therefore 500 writes per second can be too much if those writes are big enough.
So first: what is the average size of a write? What is your replication factor? Multiply the number of writes by the replication factor and by the average write size, and you'll know approximately what write throughput the cluster requires. But you should leave some safety margin for other I/O-related tasks like compaction. There are various benchmarks on the Internet suggesting that a single m1.xlarge instance should be able to write anywhere between 20 MB/s and 100 MB/s...
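As a rough sketch of that estimate: the function name is hypothetical, and the replication factor, average write size and safety factor below are assumed values for illustration, since the question does not state them.

```python
def required_write_throughput_mb_s(writes_per_sec, replication_factor,
                                   avg_write_bytes, safety_factor=3.0):
    """Cluster-wide write throughput in MB/s, including a safety margin."""
    raw_bytes_per_sec = writes_per_sec * replication_factor * avg_write_bytes
    return raw_bytes_per_sec * safety_factor / 1e6

# 500 writes/s is from the question; RF=3 and 10 KB writes are assumptions.
needed = required_write_throughput_mb_s(500, 3, 10_000)
print(f"{needed:.1f} MB/s across the cluster")
```

Dividing the result by the number of nodes gives a per-node figure to compare against the benchmark range mentioned above.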
If your cluster has sufficient I/O throughput (e.g. 3x more than needed), yet you observe OOM problems, you should try to:
reduce memtable_total_space_in_mb (this will cause C* to flush smaller memtables more often, freeing heap earlier)
lower write_request_timeout_in_ms to e.g. 2 seconds instead of 10 (if you have big writes, you don't want to keep too many of them in the incoming queues, which reside on the heap)
turn off row_cache (if you ever enabled it)
lower size of the key_cache
consider upgrading to Cassandra 2.0, which moved quite a lot of things off-heap (e.g. bloom filters and index-summaries); this is especially important if you just store lots of data per node
add more HDDs and set multiple data directories, to improve flush performance
set larger new generation size; I usually set it to about 800M for a 6 GB heap, to avoid pressure on the tenured gen.
if you're sure memtable flushing lags behind, make sure sstable compression is enabled; this will reduce the amount of data physically saved to disk, at the cost of additional CPU cycles

MySQL query caching: limited to a maximum cache size of 128 MB?

My application is very database intensive so I've tried really hard to make sure the application and the MySQL database are working as efficiently as possible together.
Currently I'm tuning the MySQL query cache to get it in line with the characteristics of queries being run on the server.
query_cache_size is the maximum amount of data that may be stored in the cache and query_cache_limit is the maximum size of a single resultset in the cache.
My current MySQL query cache is configured as follows:
query_cache_limit=1M
One tuning script gives me the following tuning hints about the running system:
Query cache is enabled
Current query_cache_size = 128 M
Current query_cache_used = 127 M
Current query_cache_limit = 1 M
Current Query cache Memory fill ratio = 99.95 %
Current query_cache_min_res_unit = 4 K
However, 21278 queries have been removed from the query cache due to lack of memory
Perhaps you should raise query_cache_size
MySQL won't cache query results that are larger than query_cache_limit in size
And the other tuning script gives the following tuning hints:
[OK] Query cache efficiency: 31.3% (39K cached / 125K selects)
[!!] Query cache prunes per day: 2300654
Variables to adjust:
query_cache_size (> 128M)
Both tuning scripts suggest that I should raise the query_cache_size. However, I have read that increasing query_cache_size over 128M may reduce performance.
How would you tackle this problem? Would you increase the query_cache_size despite that warning, or try to adjust the querying logic in some way? Most of the data access is handled by Hibernate, but quite a lot of hand-coded SQL is used in the application as well.
The warning is actually relevant even if your cache has no risk of being swapped.
It is well-explained in the following:
Basically, MySQL spends more time grooming the cache the bigger the cache is, and since the cache is very volatile under even moderate write loads (queries get cleared often), making it too large will have an adverse effect on your application's performance. Tweak query_cache_size and query_cache_limit for your application, try to find a breaking point where you have the most hits per insert and a low number of lowmem_prunes, and keep a close eye on your database server's load while doing so.
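A minimal sketch of the ratios to watch while tuning, computed from the counters reported by SHOW GLOBAL STATUS (Qcache_hits, Qcache_inserts, Qcache_lowmem_prunes, Com_select). The function name query_cache_report is hypothetical and the sample values are made up for illustration; in practice you would sample the counters after each change and compare.

```python
def query_cache_report(status):
    """Summarize query cache counters taken from SHOW GLOBAL STATUS."""
    hits = status["Qcache_hits"]
    inserts = status["Qcache_inserts"]
    prunes = status["Qcache_lowmem_prunes"]
    selects = status["Com_select"]  # SELECTs that were not served from the cache
    total_selects = hits + selects
    return {
        "hit_rate": hits / total_selects if total_selects else 0.0,
        "hits_per_insert": hits / inserts if inserts else 0.0,
        "prunes_per_insert": prunes / inserts if inserts else 0.0,
    }

# Hypothetical counter values, for illustration only.
sample = {
    "Qcache_hits": 39_000,
    "Com_select": 86_000,
    "Qcache_inserts": 60_000,
    "Qcache_lowmem_prunes": 21_000,
}
report = query_cache_report(sample)
print(report)
```

A rising hits_per_insert with a falling prunes_per_insert after a change suggests the new size helps; the reverse suggests the cache is mostly churning.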
Usually "too big cache size" warnings are issued under the assumption that you have little physical memory and the cache itself will need to be swapped or will take resources that are required by the OS (like the file cache).
If you have enough memory, it's safe to increase query_cache_size (I've seen installations with a 1 GB query cache).
But are you sure you are using the query cache right? Do you have lots of verbatim repeating queries? Could you please post an example of a typical query?
Go easy on increasing your cache; it is not only a matter of how much memory is available!
Reading for instance the manual you get this quote:
Be cautious about sizing the query cache excessively large, which increases the overhead required to maintain the cache, possibly beyond the benefit of enabling it. Sizes in tens of megabytes are usually beneficial. Sizes in the hundreds of megabytes might not be.
There are various other sources you can check out!
A non-zero prune rate may be an indication that you should increase the size of your query cache. However, keep in mind that the overhead of maintaining the cache is likely to increase with its size, so do this in small increments and monitor the result. If you need to dramatically increase the size of the cache to eliminate prunes, there is a good chance that your workload is not a good match for the query cache.
So don't just put as much as you can in that query cache!
The best thing would be to gradually increase the query cache and measure performance on your site. It's something of a default answer to performance questions, but in cases like this, testing is one of the best things you can do.
Be careful with setting query_cache_size and query_cache_limit too high. MySQL only uses a single thread to read from the query cache.
With query_cache_size set to 4G and query_cache_limit set to 12M, we had a query cache hit rate of 85% but noticed recurring spikes in connections.
After changing the query_cache_size to 256M with 64K query_cache_limit the query cache ratio dropped to 50% but the overall performance increased.
Overhead for the query cache is around 10%, so I would disable query caching. Usually, if you can't get your hit rate over 40 or 50%, the query cache probably isn't right for your database.
I've blogged about this topic: MySQL query_cache_size performance.
The query cache gets invalidated/flushed every time there is an insert. Use InnoDB/cache and avoid the query cache, or set it to a very small value.
