Redis vs. Memcached - performance

I am using memcached right now as an LRU cache to cache big data. I've set the max object size to 128 MB (I know this is inefficient and not recommended) and total memcached memory to 1 GB. But 128 MB is not enough for my purposes, so I am planning to move to Redis. A couple of questions:
memcached is extremely slow - My current memcached setup takes 3-4 seconds to return a single request, which is extremely slow. I sometimes need to make up to 30 memcached requests to serve one user request, and just doing that takes 90 seconds! Am I doing something wrong, or is memcached actually this slow?
Would Redis be faster? - I plan to use Redis lists to cache the data, fetching full lists with LRANGE key 0 -1. I hope Redis will be faster, because I might as well not use any cache if it's going to take 90 seconds!
Thanks!

I'd recommend doing a little profiling to see where the bottleneck is. My uninformed guess is that with such large objects, you may be limited by the connection between your app server and memcached, in which case you'll see similar results with Redis. It could also be that your app is spending a lot of time marshaling and unmarshaling these large objects. If it's easy, it might be worth trying a caching scheme where you just cache the response being sent down to the client (which I'm sure is much smaller than 128 MB).
Another thing to try would be turning on compression. This adds latency for compressing/decompressing, but reduces network transfer time if bandwidth is indeed the issue.
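As a rough illustration of that trade-off, here is a stdlib-only Python sketch (the payload is made up and highly repetitive, so real cache values will compress less well):

```python
import time
import zlib

# Hypothetical payload standing in for a large cached object (~8.6 MB).
payload = (b"user_id=12345&status=active&" * 1024) * 300

t0 = time.perf_counter()
compressed = zlib.compress(payload, level=1)   # level 1 = fast, lighter compression
t1 = time.perf_counter()
restored = zlib.decompress(compressed)
t2 = time.perf_counter()

assert restored == payload
print(f"original:   {len(payload) / 1e6:.1f} MB")
print(f"compressed: {len(compressed) / 1e6:.2f} MB")
print(f"compress:   {(t1 - t0) * 1000:.1f} ms, decompress: {(t2 - t1) * 1000:.1f} ms")
```

If the compressed size is a fraction of the original, the CPU time spent here is usually repaid many times over in reduced transfer time for multi-megabyte values.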

Related

Redis vs memcached vs Scylla Cache - Which one to choose?

I'm designing an application where I want to cache a million items, each around 10 KB. I did some analysis and am on the fence between Redis, memcached, and Scylla as a cache. Can some experts suggest which might best suit my needs?
Highly performant
High availability
High Throughput
Low pricing?
Full disclosure - I work on the Scylla project.
I think it is a question of latency and HA vs cost. As a RAM-based system, Redis will be the lowest latency. If you need < 1 millisecond response, then Redis or memcached are the choice.
Scylla is a disk-based system. Values that are in Scylla's RAM will be low latency, but those that need to be pulled from disk will be slower, so your p99 latency is likely to be higher. How much higher depends on your disk: NVMe can give a p99 of 3-5 ms; SSD, maybe 5-10 ms. If that is an acceptable latency, then Scylla will be much less expensive, as even NVMe is much cheaper than RAM.
As for HA - Redis and memcached are intended as caches. While there are features and frameworks you can use to replicate data around, these are all bolt-ons and increase complexity. Scylla is a distributed system by design, so the replication that enables multiple layers of HA (node-, rack-, and DC-level availability) is built in.
Redis (and to a lesser extent, memcached) are phenomenal caches. But, depending upon your use case, Scylla might be the right choice.
All three options you mentioned are open-source software, so the pricing is the same - zero :-) However, both Scylla and Redis are written and backed by companies (ScyllaDB and Redis Labs, respectively), so if your use case is mission-critical you may choose to pay these companies for enterprise-level support; you can inquire with them about their prices.
The more interesting difference between the three is in the technology.
You described a use case where you have 10 GB of data in the cache. This amount can easily be held in memory, so a completely in-memory database like Memcached or Redis is a natural choice. However, there are still questions you need to ask yourself, and depending on your answers, they may lead you to a distributed database such as Scylla:
Would you be using powerful many-core machines? If so, you should probably rule out Memcached - my experience (and others' - see "Can memcached make full use of multi-core?") suggests that it does not scale well with many cores. On an 8-core machine you will not get anywhere close to 8 times the performance of a one-core machine.
Redis is also not really meant for multi-core use - https://redis.io/topics/benchmarks says that Redis "is not designed to benefit from multiple CPU cores. People are supposed to launch several Redis instances to scale out on several cores if needed.". Scylla, on the other hand, thrives on multi-core machines. You should probably test the performance of all three products on your use case before making a decision.
How much of a disaster would it be to suddenly lose the entire content of your cache? In some use cases, it just means you would need to query a slightly slower backend server, so suddenly losing the cache on reboot is acceptable. In such cases, a memory-only cache like Memcached or Redis is probably exactly what you need. However, in other cases there may be a big penalty for starting from scratch with an empty cache - the backend server might be very slow, or maybe the original content is stored on a far-away server behind a slow and expensive WAN. In such a case you would want a disk-backed cache, so that if the memory cache is lost, you can refresh it from disk rather than from the backend server. Redis has a disk-backing option, and in Scylla disk backing is the main mode of operation.
You mentioned a working set of 10 GB, which easily fits in the memory of a single server. But is it possible this will grow, and in a year you'll find yourself needing to cache 100 GB or 1 TB, which no longer fits in the memory of a single server? With memcached you'll be out of luck. Redis used to have a "virtual memory" solution for this purpose, but it is deprecated, and https://redis.io/topics/virtual-memory now states that Redis is "without considering at least for now the support for databases bigger than RAM". Scylla handles this issue in two ways. First, your cache is stored on disk, which can be much larger than memory (whatever memory you have will be used to further speed up that cache, but the data doesn't need to fit in memory). Second, Scylla is a distributed server: it can distribute a 100 GB working set across 10 different nodes. Redis also has "replication", but it copies the entire data set to all nodes, while Scylla can store different subsets of the data on different nodes.
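The "different subsets on different nodes" idea boils down to client-side hash partitioning, which can be sketched in a few lines (node names are hypothetical, and real systems like Scylla use consistent hashing with token ranges; this is just the core idea):

```python
import hashlib

NODES = ["cache-0", "cache-1", "cache-2", "cache-3"]  # hypothetical node names

def node_for(key: str) -> str:
    # A stable hash, so every client maps the same key to the same node.
    digest = hashlib.md5(key.encode()).digest()
    return NODES[int.from_bytes(digest[:4], "big") % len(NODES)]

# Each key lands on exactly one node, so a 100 GB working set splits into
# roughly 25 GB per node instead of being copied in full to every node.
for key in ["10.0.0.1", "10.0.0.2", "user:42"]:
    print(key, "->", node_for(key))
```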
Keeping everything in memory can actually be a drawback, since RAM is expensive and not persistent. So Scylla will be a better option for K/V or columnar workloads.
Scylla also has a limited Redis API with good results [1], though using the CQL API will yield better results.
[1] https://medium.com/@siddharthc/redis-on-nvme-with-scylladb-5e12afd38dbc

How slow is Redis when full and evicting keys? (LRU algorithm)

I am using Redis in a Java application, where I am reading log files and storing/retrieving some info in Redis for each log entry. Keys are IP addresses from my log file, which means new keys keep arriving, even though the same addresses appear regularly.
At some point, Redis reaches its maxmemory limit (3 GB in my case) and starts evicting keys. I use the "allkeys-lru" setting as I want to keep the youngest keys.
The whole application then slows down a lot, taking 5 times longer than at the beginning.
So I have three questions:
is it normal to have such a dramatic slowdown (5 times longer)? Has anybody experienced such a slowdown? If not, I may have another issue in my code (improbable, as the slowdown appears exactly when Redis reaches its limit)
can I improve my config? I tried to change the maxmemory-samples setting without much success
should I consider an alternative for my particular problem? Is there an in-memory DB that could handle evicting keys with better performance? I may consider a pure Java structure (HashMap...), even if it doesn't look like a good design.
edit 1:
we use 2 DBs in Redis
edit 2:
We use Redis 2.2.12 (Ubuntu 12.04 LTS). Further investigation explained the issue: we are using db0 and db1 in Redis. db1 is used much less than db0, and the keys are totally different. When Redis reaches maxmemory (and the LRU algorithm starts evicting keys), it removes almost all db1 keys, which drastically slows all calls. This is strange behavior, probably unusual and maybe linked to our application. We fixed the issue by moving to another (better) memory mechanism for the keys that were loaded in db1.
thanks!
I'm not convinced Redis is the best option for your use case.
Redis "LRU" is only a best-effort algorithm (i.e. quite far from an exact LRU). Redis tracks memory allocations and knows when it has to free some memory; this is checked before the execution of each command. The mechanism to evict a key in "allkeys-lru" mode consists of choosing maxmemory-samples random keys, comparing their idle times, and evicting the most idle key. Redis repeats these operations until the used memory is below maxmemory.
The higher maxmemory-samples, the more CPU is consumed, but the more accurate the result.
Provided you do not explicitly use the EXPIRE command, there is no other overhead associated with key eviction.
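The sampling mechanism described above can be modeled in a few lines of Python (a toy sketch, not Redis internals; `maxmemory_samples` mirrors the name of the Redis config setting):

```python
import random

class SampledLRUCache:
    """Toy model of Redis-style approximated ("allkeys-lru") eviction."""

    def __init__(self, max_items, maxmemory_samples=5):
        self.max_items = max_items
        self.samples = maxmemory_samples
        self.data = {}        # key -> value
        self.last_used = {}   # key -> logical access time
        self.clock = 0

    def _touch(self, key):
        self.clock += 1
        self.last_used[key] = self.clock

    def get(self, key):
        if key in self.data:
            self._touch(key)
            return self.data[key]
        return None

    def set(self, key, value):
        while key not in self.data and len(self.data) >= self.max_items:
            # Sample a few random keys and evict the most idle one,
            # instead of maintaining an exact global LRU ordering.
            candidates = random.sample(list(self.data), min(self.samples, len(self.data)))
            victim = min(candidates, key=lambda k: self.last_used[k])
            del self.data[victim]
            del self.last_used[victim]
        self.data[key] = value
        self._touch(key)

cache = SampledLRUCache(max_items=100, maxmemory_samples=5)
for i in range(1000):
    cache.set(f"ip:{i}", i)
print(len(cache.data))  # 100 - size is capped; recently used keys tend to survive
```

Raising `maxmemory_samples` makes each eviction scan more keys but brings the choice of victim closer to true LRU, which is exactly the CPU-vs-accuracy trade-off described above.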
Running a quick test with Redis benchmark on my machine results in a throughput of:
145 Kops/s when no eviction occurs
125 Kops/s when 50% eviction occurs (i.e. 1 key out of 2 is evicted).
I cannot reproduce the 5 times factor you experienced.
The obvious recommendation to reduce the overhead of eviction is to decrease maxmemory-samples, but it also means a dramatic decrease in accuracy.
My suggestion would be to give memcached a try. Its LRU mechanism is different. It is still not exact (it applies only on a per-slab basis), but it will likely give better results than Redis on this use case.
Which version of Redis are you using? The 2.8 version (quite recent) improved the expiration algorithm, so if you are using 2.6, you might give 2.8 a try.
http://download.redis.io/redis-stable/00-RELEASENOTES

NoSQL replacement for memcache

We are in a situation in which the values we store in memcache are bigger than 1 MB.
It is not possible to make such values smaller, and even if there was a way, we need to persist them to disk.
One solution would be to recompile the memcache server to allow, say, 2 MB values, but this is neither clean nor a complete solution (again, we need to persist the values).
Good news is that
We can predict quite accurately how many key/value pairs we are going to have
We can also predict the total size we will need.
A key feature for us is the speed of memcache.
So the question is: is there any NoSQL replacement for memcache which will allow us to have values larger than 1 MB AND store them on disk without loss of speed?
In the past I have used Tokyo Tyrant/Cabinet, but it seems to be deprecated now.
Any idea?
I'd use redis.
Redis addresses the issues you've listed, and supports keys and string values of up to 512 MB each.
You can persist data to disk using AOF with a configurable fsync frequency (every write, every second, etc.), although RDB snapshotting provides better performance than AOF in most cases.
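For reference, a minimal redis.conf sketch of that setup (the values are illustrative, not a recommendation):

```
# Append-only file, fsync'd once per second: at most ~1s of data loss,
# much better throughput than fsync on every write.
appendonly yes
appendfsync everysec

# RDB snapshots: after 900s if at least 1 key changed,
# or after 60s if at least 10000 keys changed.
save 900 1
save 60 10000
```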
We use Redis for caching JSON documents. We've learned that, for maximum performance, you should deploy Redis on physical hardware if you can; virtual machines dramatically impact Redis's network performance.
You also have Couchbase, which is compatible with the memcache API and allows you to either store your data only in memcache or in a persisted cluster.
Redis is fine if the total amount of your data does not exceed the size of your physical memory. If there is too much data to fit in memory, you will need to install more Redis instances on different servers.
Or you may try SSDB (https://github.com/ideawu/ssdb), which automatically migrates cold data to disk, giving you more storage capacity.
Any key/value store will do, really. See this list for example: http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores
Also take a look at MongoDB - durability doesn't seem to be an issue for you, and that's basically where Mongo sucks, so you can get a fast document database (a key/value store on steroids, basically) with indexes for free. At least until you grow too large.
I would go with Couchbase: it can support up to 20 MB per document, and a bucket can run with either the memcache or the Couchbase protocol, the latter providing persistence.
Take a look at the other limits for keys/metadata here: http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-server-limits.html
And a presentation on how mongodb/cassandra and couchbase stack up on throughput/operations a second. http://www.slideshare.net/renatko/couchbase-performance-benchmarking
I've used both Redis and Couchbase in production; for a persistent drop-in replacement for memcache, it's hard to argue against a NoSQL DB that is built upon the protocol.

How to implement 50M key-value pair memcache with 4 M qps?

The business scenario requires:
50M key-value pairs, 2 KB each, 100 GB of memory in total.
About 40% of the key-value pairs will change every second.
The Java application needs one get() and one set() for each changed pair, which comes to 50M * 40% * 2 = 4M qps (queries per second).
We tested memcached - which shows very limited qps.
Our benchmark results are very similar to those shown here:
http://xmemcached.googlecode.com/svn/trunk/benchmark/benchmark.html
Around 10,000 qps is the limit of one memcached server.
That means we would need 40 partitioned memcached servers for our business scenario - which seems very uneconomical and unrealistic.
In your experience, is this benchmark accurate in terms of memcached's designed performance?
Any suggestions for tuning the memcached system (client or server)?
Or any other alternative in-memory store that is able to meet the requirement more economically?
Many thanks in advance!
If you look at the graphs in the benchmark you mentioned, you need to realize that in many of those instances the limit was the network, not memcached. For instance, if all of your items have 2 KB values, then your maximum throughput on a GigE network is about 65k ops/sec (1024*1024*128/2048 = 65536). Memcached can do a lot more operations per second than this. I have personally hit 200K ops/sec with (I think) 512-byte values, and I have heard of others getting much higher throughput than I did. This all depends heavily on the network, though.
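That back-of-the-envelope calculation works out as follows (pure arithmetic, assuming a fully saturated 1 Gbit/s link and ignoring protocol overhead):

```python
GIGE_BYTES_PER_SEC = 1024 * 1024 * 128   # 1 Gbit/s is roughly 128 MiB/s
VALUE_SIZE = 2048                        # 2 KB values, as in the question

max_ops_per_sec = GIGE_BYTES_PER_SEC // VALUE_SIZE
print(max_ops_per_sec)  # 65536 - the wire, not memcached, is the ceiling
```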
Also, memcached is barely doing anything at 10k ops/sec. My guess is you aren't taking advantage of concurrency in your benchmarks.

memcached limitations

Has anyone experienced memcached limitations in terms of:
Number of objects in the cache store - is there a point where it loses performance?
Amount of allocated memory - what are the basic numbers to work with?
I can give you some metrics for our environment. We run memcached for Win32 on 12 boxes (as a cache for a very database-heavy ASP.NET web site). These boxes each have their own other responsibilities; we just spread the memcached nodes across all machines with memory to spare. Each node had a maximum of 512 MB allocated by memcached.
Our nodes have on average 500-1000 connections open. A typical node has 60,000 items in cache and handles 1,000 requests per second (!). All of this runs fairly stably and requires little maintenance.
We have run into 2 kinds of limitations:
1. CPU use on the client machines. We use .NET serialization to store and retrieve objects in memcached. It works seamlessly, but CPU use can get very high under our loads. We found that some objects are better first converted to strings (or HTML fragments) and then cached.
2. We have had some problems with memcached boxes running out of TCP/IP connections. Spreading across more boxes helped.
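The first limitation can be illustrated with a stdlib-only sketch (the object shape is made up): serializing a live object graph on every request is far more CPU-intensive than handing back a pre-rendered string.

```python
import pickle
import time

# Hypothetical cacheable data: a listing rendered on every request.
items = [{"id": i, "name": f"product-{i}", "price": i * 1.5} for i in range(10_000)]

t0 = time.perf_counter()
for _ in range(50):
    blob = pickle.dumps(items)   # serialize the object graph...
    pickle.loads(blob)           # ...and rebuild it on the way out
t1 = time.perf_counter()

# Convert once to an HTML fragment and cache the string instead:
html = "".join(f"<li>{d['name']}: {d['price']}</li>" for d in items)
t2 = time.perf_counter()
for _ in range(50):
    fragment = html              # a cached string needs no deserialization
t3 = time.perf_counter()

print(f"serialize round-trips: {(t1 - t0) * 1000:.0f} ms")
print(f"cached string reuse:   {(t3 - t2) * 1000:.0f} ms")
```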
We run memcached 1.2.6 and use the .NET client from http://www.codeplex.com/EnyimMemcached/
I can't vouch for the accuracy of this claim, but at a Linux/developer meetup a few months ago an engineer talked about how his company scaled memcache back to using 2 GB chunks, 3-4 per memcache box. They found that throughput was fine, but with very large memcached daemons they were getting 4% more misses. He said they couldn't figure out why there was a difference, but decided to just go with what works.