ServiceStack Redis - caching expensive queries

We have a number of really expensive queries, which involve multiple joins, which I would like to cache using Redis (using the ultimate ServiceStack.Redis framework).
How many rows/items should I be storing in Redis before memory becomes an issue?
E.g. can I store 10,000+ rows in Redis without worrying about memory issues (our server, which also hosts our web app, has 8 GB of RAM)?
Secondly, what is the best way of storing them (as a List or a Hash)?

How many rows you can store depends on the row size. The best approach would be to start saving and watch the memory usage on the Redis server. 10k doesn't sound like too much data.
As for how to store them, I would use a Hash only if I needed to retrieve specific rows, for example if I did the filtering and sorting in Redis, which is theoretically possible. But most likely the filtering and sorting of the results is done in your app, so you can keep all that data in one key. What we did in our app was serialize all the results to JSON, compress (archive) them in code, and then save them to a single Redis key; this gave the smallest memory consumption.
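A minimal sketch of the "serialize, compress, store under one key" approach, written in Python rather than ServiceStack for brevity. The row shape, the key name and the choice of zlib are assumptions for illustration only; the answer does not say which compression was used.

```python
import json
import zlib


def pack_rows(rows):
    """Serialize a list of row dicts to compact JSON and compress it."""
    payload = json.dumps(rows, separators=(",", ":")).encode("utf-8")
    return zlib.compress(payload, level=6)


def unpack_rows(blob):
    """Reverse pack_rows: decompress, then parse the JSON."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical query result: 10,000 small rows.
rows = [{"id": i, "name": f"item-{i}", "price": i * 1.5} for i in range(10_000)]
blob = pack_rows(rows)
raw_size = len(json.dumps(rows).encode("utf-8"))
print(f"raw JSON: {raw_size} bytes, compressed: {len(blob)} bytes")

# With a Redis client the blob would then be stored under one key, e.g.
#   client.set("report:expensive-query", blob)   # key name is made up
```

Measuring the compressed size like this before going live gives a rough idea of how much memory each cached result set will need.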

Best way to construct a cache key whose uniqueness is defined by 6 properties

Currently I am tasked with fixing the cache for an e-commerce-like system whose prices depend on many factors. The cache backend is redis. For a given product the factors that influence the price are:
sku
channel
sub channel
plan
date
Currently the cache is structured like this in redis:
product1_channel1_subchannel1: {sku_1: {plan1: {2019-03-18: 2000}}}
The API caters to requests for multiple products, skus and all the factors above. So they decided to query all the data at the product_channel_subchannel level and filter it in the app, which is very slow. They also decided that, on a cache miss, they will build the cache for all skus for 90 days of data. This way only one request faces the wrath while the others benefit from it (the only catch is that we are now busting the cache more often, which is also dragging the system down).
The downside of including all these factors in the keys is that there will be too many keys. As a ballpark, there are 400 products, each made up of 20 skus, with 20 channels, 200 subchannels, 3 types of plans and 400 days of pricing. To avoid this many keys we must group the data at some level.
The system currently receives about 10 rps and has to respond within 100 ms.
Question is:
Is the above cache structure fine? Or how do we go about flattening this structure?
How are caches stored in pricing systems in general? I feel like this is a very trivial task; nonetheless I find it very hard to justify my approaches.
Is it okay to sacrifice one request to warm cache for bulk of the data? Or is it better to have a cache warming strategy?
Any sort of caching strategy will be an exercise in trade-offs. And the precise trade-offs you need to make will be dependent upon complex domain logic that you can't predict until you try it out.
What this means is that whatever you implement should be based on data and should be flexible enough to change over time as the business changes. In particular the answers to these questions:
Is it okay to sacrifice one request to warm cache for bulk of the data? Or is it better to have a cache warming strategy?
depend on how the data will be queried by your users and how long a cache miss will take. If queries tend to be clustered around certain skus, or certain dates in a predictable manner, then you should use that information to help guide cache hits and misses.
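To make the "sacrifice one request" option concrete, here is a rough in-process sketch in Python: on a miss, exactly one caller runs the expensive load while concurrent callers for the same key wait for its result. The class name and the loader callback are hypothetical, and this only coordinates threads inside one process; several API servers sharing one redis would need a shared lock instead.

```python
import threading


class SingleFlightCache:
    """On a miss, one caller loads the value; concurrent callers wait for it."""

    def __init__(self, loader):
        self._loader = loader            # hypothetical function: key -> value
        self._values = {}
        self._locks = {}
        self._guard = threading.Lock()   # protects the two dicts above

    def get(self, key):
        with self._guard:
            if key in self._values:
                return self._values[key]
            # Every caller that misses on this key shares the same lock.
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:
            # Re-check: another thread may have filled the entry while we waited.
            with self._guard:
                if key in self._values:
                    return self._values[key]
            value = self._loader(key)    # the one expensive call
            with self._guard:
                self._values[key] = value
                self._locks.pop(key, None)
            return value
```

A background warming job is the alternative: it would call the same loader ahead of time so that no user request pays for the miss.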
There is no way I, or anyone else, can give you a correct answer without doing proper experimentation, but we can give you some guidelines.
Here are some best practices that I would recommend when using redis for caching:
If the bottleneck is sending data from redis to the api, then consider using Lua scripts to do the simple processing before any data leaves redis. But be careful that you don't make the scripts too complex, since a long-running Lua script can block all other parts of redis.
It looks like you are using simple get/set keys to store your data. Consider using something more complex:
a. use sorted sets (zsets) if you want to have better access to data by date (use the date as the score).
b. use hashes to get more fine-grained access to skus
Based on your question, it looks like you will have about 1.6M keys (400 products x 20 channels x 200 subchannels). This is not a huge amount, but you need to make sure that redis has enough memory to store everything in RAM without swapping anything to disk. This is something that we had to learn the hard way. If you are running your redis instance on linux, set the system's swappiness (vm.swappiness) to 0 to keep the kernel from swapping redis memory out; note that this strongly discourages swapping rather than strictly guaranteeing that swap is never used.
But, most importantly, you need to experiment with everything until you find a good solution.
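For reference, the key counts can be worked out from the cardinalities given in the question. The Python sketch below also shows one possible grouped key format and a date-to-score mapping for the sorted-set suggestion; the "price:" prefix and both helper names are assumptions, not part of the existing system.

```python
from datetime import date

# Cardinalities quoted in the question.
PRODUCTS, SKUS, CHANNELS, SUBCHANNELS, PLANS, DAYS = 400, 20, 20, 200, 3, 400

# One key per product/channel/subchannel, as in the current layout.
grouped_keys = PRODUCTS * CHANNELS * SUBCHANNELS
# One key per individual price, with every factor encoded in the key.
flat_keys = grouped_keys * SKUS * PLANS * DAYS


def cache_key(product, channel, subchannel):
    """Build the grouped key (hypothetical naming scheme)."""
    return f"price:{product}:{channel}:{subchannel}"


def date_score(day):
    """Map a date to a sortable numeric score, e.g. 2019-03-18 -> 20190318."""
    return day.year * 10_000 + day.month * 100 + day.day


print(grouped_keys)  # 1600000
print(flat_keys)     # 38400000000
print(cache_key("product1", "channel1", "subchannel1"), date_score(date(2019, 3, 18)))
```

The gap between the two counts is why some grouping is unavoidable; the open question is only at which level to group.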

Cassandra client code with high read throughput with row_cache optimization

Can someone point me to cassandra client code that can achieve a read throughput of at least hundreds of thousands of reads/s if I keep reading the same record (or even a small number of records) over and over? I believe row_cache_size_in_mb is supposed to cache frequently used records in memory, but setting it to say 10MB seems to make no difference.
I tried cassandra-stress of course, but the highest read throughput it achieves with 1KB records (-col size=UNIFORM\(1000..1000\)) is ~15K/s.
With low numbers like above, I can easily write an in-memory hashmap based cache that will give me at least a million reads per second for a small working set size. How do I make cassandra do this automatically for me? Or is it not supposed to achieve performance close to an in-memory map even for a tiny working set size?
Can someone point me to cassandra client code that can achieve a read throughput of at least hundreds of thousands of reads/s if I keep reading the same record (or even a small number of records) over and over?
There are some solutions for this scenario.
One idea is to use the row cache, but be careful: any update or delete to a single column invalidates the whole partition in the cache, so you lose all the benefit. The row cache is best used for small datasets that are read frequently but almost never modified.
Are you sure that your cassandra-stress scenario never updates or writes to the same partition over and over again?
Here are my findings: when I set row_cache, counter_cache, and key_cache all to sizable values, I am able to verify using "top" that cassandra does no disk I/O at all; all three seem necessary to ensure no disk activity. Yet, despite zero disk I/O, the throughput is <20K/s even for reading a single record over and over. This likely confirms (as also alluded to in my comment) that cassandra incurs the cost of serialization and deserialization even if its operations are completely in-memory, i.e., it is not designed to compete with native hashmap performance. So, if you want to get native hashmap speeds for a small-working-set workload but expand to disk if the map grows big, you would need to write your own cache on top of cassandra (or any of the other key-value stores like mongo, redis, etc. for that matter).
For those interested, I also verified that redis is the fastest among cassandra, mongo, and redis for a simple get/put small-working-set workload, but even redis gets at best ~35K/s read throughput (largely independent, by design, of the request size), which hardly comes anywhere close to native hashmap performance that simply returns pointers and can do so comfortably at over 2 million/s.
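A rough sketch of what "your own cache on top of cassandra" could look like: a small in-process, read-through LRU map in front of whatever client call fetches a record. This is Python for illustration; the `fetch` callback stands in for the real database read and is an assumption, and invalidation on writes is left out entirely.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Bounded in-process cache that falls back to a slower store on a miss."""

    def __init__(self, fetch, max_entries=1024):
        self._fetch = fetch              # hypothetical function: key -> record
        self._max = max_entries
        self._data = OrderedDict()       # insertion order doubles as LRU order
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._data:
            self._data.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._data[key]
        self.misses += 1
        value = self._fetch(key)         # the slow path (network + deserialization)
        self._data[key] = value
        if len(self._data) > self._max:
            self._data.popitem(last=False)  # evict the least recently used entry
        return value
```

Hits never leave the process, which is where the hashmap-level speed comes from; the price is that stale entries stay visible until they are evicted or explicitly removed.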

NoSQL replacement for memcache

We are having a situation in which the values we store on memcache are bigger than 1MB.
It is not possible to make such values smaller, and even if there was a way, we need to persist them to disk.
One solution would be to recompile the memcache server to allow, say, 2MB values, but this is neither clean nor a complete solution (again, we need to persist the values).
Good news is that
We can predict quite accurately how many key/value pairs we are going to have
We can also predict the total size we will need.
A key feature for us is the speed of memcache.
So question is: is there any noSQL replacement for memcache which will allow us to have values longer than 1MB AND store them in disk without loss of speed?
In the past I have used tokyotyrant/cabinet but seems to be deprecated now.
Any idea?
I'd use redis.
Redis addresses the issues you've listed, and supports keys and string values of up to 512 MB each.
You can persist data to disk either with RDB snapshots taken at a configured frequency, or with the AOF (append-only file), which can be synced to disk as often as every second or on every write; in most cases RDB persistence gives better performance than AOF.
We use redis for caching json documents. We've learned that, for maximum performance, you should deploy redis on physical hardware if you can; virtual machines dramatically impact redis network performance.
You also have Couchbase which is compatible with the Memcache API and allows you to either only store your data in Memcache or in a persisted cluster.
Redis is fine if the total amount of your data will not exceed the size of your physical memory. If the total amount of your data is too large to fit in memory, you will need to install more Redis instances on different servers.
Or you may try SSDB(https://github.com/ideawu/ssdb), which will automatically migrate cold data into disk, so you will get more storage capacity with SSDB.
Any key/value store will do, really. See this list for example: http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores
Also take a look at MongoDB - durability doesn't seem to be an issue for you, and that's basically where Mongo sucks, so you can get fast document-database (key/value store on steroids, basically) with indexes for free. At least until you grow too large.
I would go with couchbase: it can support up to 20 MB per document, and a bucket can run with either the memcache or the couchbase protocol, the latter providing persistence.
Take a look at the other limits for keys/metadata here: http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-server-limits.html
And a presentation on how mongodb/cassandra and couchbase stack up on throughput/operations a second. http://www.slideshare.net/renatko/couchbase-performance-benchmarking
I've used both redis and couchbase in production; for a persistent drop-in replacement for memcache, it's hard to argue against a nosql db that is built upon the protocol.

Follow up Q on [Segmenting Redis By Database]

This is a follow-up question to Segmenting Redis By Database.
I originally asked about the time complexity of the redis keys operation in different databases within one redis instance. The reason I was asking is because I am attempting to implement a cache where there are x multi-segment keys, each of which may have y actual data instances, resulting in x*y total keys.
However, I would like to support the wild-card search of the primary keys and it seems that in redis the only implemented wild-card query for keys is the keys command, the use of which is discouraged. It seemed to me to be a decent compromise to put the x keys in a separate database where the lower number of keys would make the keys operation perform satisfactorily.
Can anyone suggest a better alternative?
Thanks.
I still think using KEYS is really not scalable with Redis, whatever clever scheme you put in place to work around its linear complexity.
Partitioning is one such scheme, and it is commonly used in traditional RDBMS to reduce the cost of table scans on flat tables. Your idea is actually an adaptation of this concept to Redis.
But there is an important difference compared to traditional RDBMS providing this facility (Oracle, MySQL, ...): Redis is a single-threaded event loop. So a scan cannot be done concurrently with any other activity (like serving other client connections for instance). When Redis scans data, it is blocked for all connections.
You would have to setup a huge number of partitions (i.e. of databases) to get good performance. Something like 1/1000 or 1/10000 of the global number of keys. And this is why it is not scalable: Redis is not designed to handle such a number of databases. You will likely have issues with internal mechanisms iterating on all the databases. Here is a list extracted from the source code:
automatic rehashing
item expiration management
database status logging (every 5 secs)
INFO command
maxmemory management
You would likely have to limit the number of databases, which also limits the scalability. If you set 1000 databases, it will work fine for, say, 1M items, will be slower for 10M items, and will be unusable with 100M items.
If you still want to stick to linear scans to implement this facility, you will be better served by other stores supporting concurrent scans (like MySQL, MongoDB, etc ...). With the other stores, the critical point will be to implement item expiration in an efficient way.
If you really have to use Redis, you can easily segment the data without relying on multiple databases. For instance, you could use the method I have described here. With this strategy, the list of keys is retrieved in an incremental way, and the search is actually done on client-side. The main benefit is you can have a large number of partitions, so that Redis would not block.
Now, AFAIK no storage engine provides the capability to efficiently search data with an arbitrary regular expression (i.e. avoiding a linear scan). However, this feature is provided by some search engines, typically using n-gram indexing.
Here is a good article about it from Russ Cox: http://swtch.com/~rsc/regexp/regexp4.html
This indexing mechanism could probably be adapted to Redis (you would use Redis to store a trigram index of your keys), but it represents a lot of code to write.
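To give an idea of the amount of code involved, here is a minimal in-memory trigram index in Python. It only handles plain substring queries, not full regular expressions, and the class and method names are made up. In Redis, each posting list could plausibly be a set stored under one key per trigram and intersected with SINTER, but that mapping is an assumption and not something tested here.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    def __init__(self):
        self._postings = defaultdict(set)   # trigram -> keys containing it

    def add(self, key):
        for gram in trigrams(key):
            self._postings[gram].add(key)

    def search(self, substring):
        grams = trigrams(substring)
        if not grams:
            raise ValueError("query needs at least 3 characters")
        # Keys containing every trigram of the query are the only candidates.
        candidates = set.intersection(
            *(self._postings.get(gram, set()) for gram in grams)
        )
        # Trigram matches are necessary but not sufficient, so verify each one.
        return sorted(key for key in candidates if substring in key)
```

The final verification pass is still a scan, but only over the candidates that survive the intersection rather than over every key.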
You could also imagine restricting the regular expressions to prefix searches. For instance U:SMITH:(.*) is actually a search with prefix U:SMITH:
In that case, you can use a zset to index your keys, and perform the linear search on client side once the range of keys you are interested in has been retrieved. The score of the items in the zset is calculated from the keys on client-side, so that the score order corresponds to the lexicographic order of the keys.
With such zset, it is possible to retrieve the range of keys you have to scan chunk by chunk by a combination of zscore and zrange commands. The consequences are the number of keys to scan is limited (by the prefix), the search occurs on client-side, and it is friendly with Redis concurrency model. The drawbacks are the complexity (especially to handle item expiration), and the network bandwidth consumption.
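One way to compute such a score on the client side, sketched in Python: pack the first few bytes of the key into an integer. Six bytes (48 bits) is an assumption chosen so the value fits exactly in the double-precision score of a sorted set; keys that share their first six bytes get the same score and fall back to the lexicographic ordering of equal-score members. Both function names are hypothetical.

```python
def key_score(key, width=6):
    """Pack the first `width` bytes of the key into an integer score."""
    raw = key.encode("utf-8")[:width].ljust(width, b"\x00")
    return int.from_bytes(raw, "big")


def prefix_score_range(prefix, width=6):
    """Return (low, high) scores bounding every key that starts with prefix."""
    raw = prefix.encode("utf-8")[:width]
    low = int.from_bytes(raw.ljust(width, b"\x00"), "big")
    high = int.from_bytes(raw.ljust(width, b"\xff"), "big")
    return low, high


keys = ["U:SMITH:2", "U:JONES:1", "A:1", "U:SMITH:1", "Z"]
low, high = prefix_score_range("U:SM")
in_range = sorted(k for k in keys if low <= key_score(k) <= high)
print(in_range)
```

The client would fetch the members whose score lies between `low` and `high` in chunks and then apply the full pattern locally. When the prefix is longer than six bytes the range also contains keys that only share those first six bytes, which is why the client-side filter is still needed.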

Memcached vs. Redis? [closed]

We're using a Ruby web-app with Redis server for caching. Is there a point to test Memcached instead?
What will give us better performance? Any pros or cons between Redis and Memcached?
Points to consider:
Read/write speed.
Memory usage.
Disk I/O dumping.
Scaling.
Summary (TL;DR)
Updated June 3rd, 2017
Redis is more powerful, more popular, and better supported than memcached. Memcached can only do a small fraction of the things Redis can do. Redis is better even where their features overlap.
For anything new, use Redis.
Memcached vs Redis: Direct Comparison
Both tools are powerful, fast, in-memory data stores that are useful as a cache. Both can help speed up your application by caching database results, HTML fragments, or anything else that might be expensive to generate.
Points to Consider
When used for the same thing, here is how they compare using the original question's "Points to Consider":
Read/write speed: Both are extremely fast. Benchmarks vary by workload, versions, and many other factors but generally show redis to be as fast or almost as fast as memcached. I recommend redis, but not because memcached is slow. It's not.
Memory usage: Redis is better.
memcached: You specify the cache size and as you insert items the daemon quickly grows to a little more than this size. There is never really a way to reclaim any of that space, short of restarting memcached. All your keys could be expired, you could flush the database, and it would still use the full chunk of RAM you configured it with.
redis: Setting a max size is up to you. Redis will never use more than it has to and will give you back memory it is no longer using.
I stored 100,000 ~2KB strings (~200MB) of random sentences into both. Memcached RAM usage grew to ~225MB. Redis RAM usage grew to ~228MB. After flushing both, redis dropped to ~29MB and memcached stayed at ~225MB. They are similarly efficient in how they store data, but only one is capable of reclaiming it.
Disk I/O dumping: A clear win for redis since it does this by default and has very configurable persistence. Memcached has no mechanisms for dumping to disk without 3rd party tools.
Scaling: Both give you tons of headroom before you need more than a single instance as a cache. Redis includes tools to help you go beyond that while memcached does not.
memcached
Memcached is a simple volatile cache server. It allows you to store key/value pairs where the value is limited to being a string up to 1MB.
It's good at this, but that's all it does. You can access those values by their key at extremely high speed, often saturating available network or even memory bandwidth.
When you restart memcached your data is gone. This is fine for a cache. You shouldn't store anything important there.
If you need high performance or high availability there are 3rd party tools, products, and services available.
redis
Redis can do the same jobs as memcached can, and can do them better.
Redis can act as a cache as well. It can store key/value pairs too. In redis they can even be up to 512MB.
You can turn off persistence and it will happily lose your data on restart too. If you want your cache to survive restarts it lets you do that as well. In fact, that's the default.
It's super fast too, often limited by network or memory bandwidth.
If one instance of redis/memcached isn't enough performance for your workload, redis is the clear choice. Redis includes cluster support and comes with high availability tools (redis-sentinel) right "in the box". Over the past few years redis has also emerged as the clear leader in 3rd party tooling. Companies like Redis Labs, Amazon, and others offer many useful redis tools and services. The ecosystem around redis is much larger. The number of large scale deployments is now likely greater than for memcached.
The Redis Superset
Redis is more than a cache. It is an in-memory data structure server. Below you will find a quick overview of things Redis can do beyond being a simple key/value cache like memcached. Most of redis' features are things memcached cannot do.
Documentation
Redis is better documented than memcached. While this can be subjective, it seems to be more and more true all the time.
redis.io is a fantastic easily navigated resource. It lets you try redis in the browser and even gives you live interactive examples with each command in the docs.
There are now 2x as many stackoverflow results for redis as memcached. 2x as many Google results. More readily accessible examples in more languages. More active development. More active client development. These measurements might not mean much individually, but in combination they paint a clear picture that support and documentation for redis is greater and much more up-to-date.
Persistence
By default redis persists your data to disk using a mechanism called snapshotting. If you have enough RAM available it's able to write all of your data to disk with almost no performance degradation. It's almost free!
In snapshot mode there is a chance that a sudden crash could result in a small amount of lost data. If you absolutely need to make sure no data is ever lost, don't worry, redis has your back there too with AOF (Append Only File) mode. In this persistence mode data can be synced to disk as it is written. This can reduce maximum write throughput to however fast your disk can write, but should still be quite fast.
There are many configuration options to fine tune persistence if you need, but the defaults are very sensible. These options make it easy to setup redis as a safe, redundant place to store data. It is a real database.
Many Data Types
Memcached is limited to strings, but Redis is a data structure server that can serve up many different data types. It also provides the commands you need to make the most of those data types.
Strings (commands)
Simple text or binary values that can be up to 512MB in size. This is the only data type redis and memcached share, though memcached strings are limited to 1MB.
Redis gives you more tools for leveraging this datatype by offering commands for bitwise operations, bit-level manipulation, floating point increment/decrement support, range queries, and multi-key operations. Memcached doesn't support any of that.
Strings are useful for all sorts of use cases, which is why memcached is fairly useful with this data type alone.
Hashes (commands)
Hashes are sort of like a key value store within a key value store. They map between string fields and string values. Field->value maps using a hash are slightly more space efficient than key->value maps using regular strings.
Hashes are useful as a namespace, or when you want to logically group many keys. With a hash you can grab all the members efficiently, expire all the members together, delete all the members together, etc. Great for any use case where you have several key/value pairs that need to be grouped.
One example use of a hash is for storing user profiles between applications. A redis hash stored with the user ID as the key will allow you to store as many bits of data about a user as needed while keeping them stored under a single key. The advantage of using a hash instead of serializing the profile into a string is that you can have different applications read/write different fields within the user profile without having to worry about one app overriding changes made by others (which can happen if you serialize stale data).
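The overwrite problem is easy to reproduce without a server. In this Python sketch a plain dict stands in for the key/value store, and the user ID, field names and values are made up: with a serialized profile, the second writer silently discards the first writer's change, while per-field writes (the equivalent of HSET on a single field) keep both.

```python
import json

store = {}  # stand-in for the key/value server

# Whole-profile string: both apps read the same snapshot, then write it back.
store["user:42"] = json.dumps({"email": "a@example.com", "theme": "light"})
snapshot_a = json.loads(store["user:42"])
snapshot_b = json.loads(store["user:42"])
snapshot_a["email"] = "new@example.com"
store["user:42"] = json.dumps(snapshot_a)    # app A saves its change
snapshot_b["theme"] = "dark"
store["user:42"] = json.dumps(snapshot_b)    # app B overwrites A's email change
blob_result = json.loads(store["user:42"])

# Per-field hash: each app writes only the field it owns.
store["user:42:hash"] = {"email": "a@example.com", "theme": "light"}
store["user:42:hash"]["email"] = "new@example.com"   # app A
store["user:42:hash"]["theme"] = "dark"              # app B
hash_result = store["user:42:hash"]

print(blob_result)
print(hash_result)
```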
Lists (commands)
Redis lists are ordered collections of strings. They are optimized for inserting, reading, or removing values from the top or bottom (aka: left or right) of the list.
Redis provides many commands for leveraging lists, including commands to push/pop items, push/pop between lists, truncate lists, perform range queries, etc.
Lists make great durable, atomic, queues. These work great for job queues, logs, buffers, and many other use cases.
Sets (commands)
Sets are unordered collections of unique values. They are optimized to let you quickly check if a value is in the set, quickly add/remove values, and to measure overlap with other sets.
These are great for things like access control lists, unique visitor trackers, and many other things. Most programming languages have something similar (usually called a Set). This is like that, only distributed.
Redis provides several commands to manage sets. Obvious ones like adding, removing, and checking the set are present. So are less obvious commands like popping/reading a random item and commands for performing unions and intersections with other sets.
Sorted Sets (commands)
Sorted Sets are also collections of unique values. These ones, as the name implies, are ordered. They are ordered by a score, then lexicographically.
This data type is optimized for quick lookups by score. Getting the highest, lowest, or any range of values in between is extremely fast.
If you add users to a sorted set along with their high score, you have yourself a perfect leader-board. As new high scores come in, just add them to the set again with their high score and it will re-order your leader-board. Also great for keeping track of the last time users visited and who is active in your application.
Storing values with the same score causes them to be ordered lexicographically (think alphabetically). This can be useful for things like auto-complete features.
Many of the sorted set commands are similar to commands for sets, sometimes with an additional score parameter. Also included are commands for managing scores and querying by score.
Geo
Redis has several commands for storing, retrieving, and measuring geographic data. This includes radius queries and measuring distances between points.
Technically geographic data in redis is stored within sorted sets, so this isn't a truly separate data type. It is more of an extension on top of sorted sets.
Bitmap and HyperLogLog
Like geo, these aren't completely separate data types. These are commands that allow you to treat string data as if it's either a bitmap or a hyperloglog.
Bitmaps are what the bit-level operators I referenced under Strings are for. This data type was the basic building block for reddit's recent collaborative art project: r/Place.
HyperLogLog allows you to use a constant extremely small amount of space to count almost unlimited unique values with shocking accuracy. Using only ~12KB you could efficiently count the number of unique visitors to your site, even if that number is in the millions.
Transactions and Atomicity
Commands in redis are atomic, meaning you can be sure that as soon as you write a value to redis that value is visible to all clients connected to redis. There is no wait for that value to propagate. Technically memcached is atomic as well, but with redis adding all this functionality beyond memcached it is worth noting and somewhat impressive that all these additional data types and features are also atomic.
While not quite the same as transactions in relational databases, redis also has transactions that use "optimistic locking" (WATCH/MULTI/EXEC).
Pipelining
Redis provides a feature called 'pipelining'. If you have many redis commands you want to execute you can use pipelining to send them to redis all-at-once instead of one-at-a-time.
Normally when you execute a command to either redis or memcached, each command is a separate request/response cycle. With pipelining, redis can buffer several commands and execute them all at once, responding with all of the responses to all of your commands in a single reply.
This can allow you to achieve even greater throughput on bulk importing or other actions that involve lots of commands.
Pub/Sub
Redis has commands dedicated to pub/sub functionality, allowing redis to act as a high speed message broadcaster. This allows a single client to publish messages to many other clients connected to a channel.
Redis does pub/sub as well as almost any tool. Dedicated message brokers like RabbitMQ may have advantages in certain areas, but because the same server can also give you the persistent durable queues and other data structures your pub/sub workloads likely need, Redis will often prove to be the best and simplest tool for the job.
Lua Scripting
You can kind of think of lua scripts like redis's own SQL or stored procedures. It's both more and less than that, but the analogy mostly works.
Maybe you have complex calculations you want redis to perform. Maybe you can't afford to have your transactions roll back and need guarantees every step of a complex process will happen atomically. These problems and many more can be solved with lua scripting.
The entire script is executed atomically, so if you can fit your logic into a lua script you can often avoid messing with optimistic locking transactions.
Scaling
As mentioned above, redis includes built in support for clustering and is bundled with its own high availability tool called redis-sentinel.
Conclusion
Without hesitation I would recommend redis over memcached for any new projects, or existing projects that don't already use memcached.
The above may sound like I don't like memcached. On the contrary: it is a powerful, simple, stable, mature, and hardened tool. There are even some use cases where it's a little faster than redis. I love memcached. I just don't think it makes much sense for future development.
Redis does everything memcached does, often better. Any performance advantage for memcached is minor and workload specific. There are also workloads for which redis will be faster, and many more workloads that redis can do which memcached simply can't. The tiny performance differences seem minor in the face of the giant gulf in functionality and the fact that both tools are so fast and efficient they may very well be the last piece of your infrastructure you'll ever have to worry about scaling.
There is only one scenario where memcached makes more sense: where memcached is already in use as a cache. If you are already caching with memcached then keep using it, if it meets your needs. It is likely not worth the effort to move to redis and if you are going to use redis just for caching it may not offer enough benefit to be worth your time. If memcached isn't meeting your needs, then you should probably move to redis. This is true whether you need to scale beyond memcached or you need additional functionality.
Use Redis if
You require selectively deleting/expiring items in the cache. (You need this)
You require the ability to query keys matching a particular pattern, e.g. 'blog1:posts:*', 'blog2:categories:xyz:posts:*'. Oh yeah! This is very important. Use this to invalidate certain types of cached items selectively. You can also use this to invalidate fragment cache, page cache, only AR objects of a given type, etc.
Persistence (You will need this too, unless you are okay with your cache having to warm up after every restart. Very essential for objects that seldom change)
Use memcached if
Memcached gives you headached!
umm... clustering? meh. if you gonna go that far, use Varnish and Redis for caching fragments and AR Objects.
From my experience I've had much better stability with Redis than Memcached
Memcached is multithreaded and fast.
Redis has lots of features and is very fast, but completely limited to one core as it is based on an event loop.
We use both. Memcached is used for caching objects, primarily reducing read load on the databases. Redis is used for things like sorted sets which are handy for rolling up time-series data.
This is too long to be posted as a comment to already accepted answer, so I put it as a separate answer
One thing also to consider is whether you expect to have a hard upper memory limit on your cache instance.
Since redis is an nosql database with tons of features and caching is only one option it can be used for, it allocates memory as it needs it — the more objects you put in it, the more memory it uses. The maxmemory option does not strictly enforces upper memory limit usage. As you work with cache, keys are evicted and expired; chances are your keys are not all the same size, so internal memory fragmentation occurs.
By default redis uses jemalloc memory allocator, which tries its best to be both memory-compact and fast, but it is a general purpose memory allocator and it cannot keep up with lots of allocations and object purging occuring at a high rate. Because of this, on some load patterns redis process can apparently leak memory because of internal fragmentation. For example, if you have a server with 7 Gb RAM and you want to use redis as non-persistent LRU cache, you may find that redis process with maxmemory set to 5Gb over time would use more and more memory, eventually hitting total RAM limit until out-of-memory killer interferes.
memcached is a better fit to scenario described above, as it manages its memory in a completely different way. memcached allocates one big chunk of memory — everything it will ever need — and then manages this memory by itself, using its own implemented slab allocator. Moreover, memcached tries hard to keep internal fragmentation low, as it actually uses per-slab LRU algorithm, when LRU evictions are done with object size considered.
That said, memcached still has a strong position in environments where memory usage has to be enforced and/or predictable. We tried to use the latest stable redis (2.8.19) as a drop-in non-persistent LRU-based memcached replacement under a workload of 10-15k op/s, and it leaked memory A LOT; the same workload was crashing Amazon's ElastiCache redis instances in a day or so for the same reasons.
Memcached is good at being a simple key/value store and is good at doing key => STRING. This makes it really good for session storage.
Redis is good at doing key => SOME_OBJECT.
It really depends on what you are going to be putting in there. My understanding is that in terms of performance they are pretty even.
Also, good luck finding any objective benchmarks; if you do find some, kindly send them my way.
If you don't mind a crass writing style, Redis vs Memcached on the Systoilet blog is worth a read from a usability standpoint, but be sure to read the back & forth in the comments before drawing any conclusions on performance; there are some methodological problems (single-threaded busy-loop tests), and Redis has made some improvements since the article was written as well.
And no benchmark link is complete without confusing things a bit, so also check out some conflicting benchmarks at Dormando's LiveJournal and the Antirez Weblog.
Edit -- as Antirez points out, the Systoilet analysis is rather ill-conceived. Even beyond the single-threading shortfall, much of the performance disparity in those benchmarks can be attributed to the client libraries rather than server throughput. The benchmarks at the Antirez Weblog do indeed present a much more apples-to-apples (with the same mouth) comparison.
I got the opportunity to use both memcached and Redis together in a caching proxy that I worked on. Let me share where exactly I used each and the reasons behind it.
Redis >
1) Used for indexing the cache content across the cluster. I have more than a billion keys spread over Redis clusters, and Redis response times are quite low and stable.
2) Basically, it's a key/value store, so wherever in your application you have something similar, you can use Redis without bothering much.
3) Redis persistence, failover and backup (AOF) will make your job easier.
Memcache >
1) Yes, an optimized memory store that can be used as a cache. I used it for storing cache content that is accessed very frequently (with 50 hits/second), with size less than 1 MB.
2) I allocated only 2 GB out of 16 GB for memcached, and that too when my single content size was >1 MB.
3) As the content grows near the limits, I have occasionally observed higher response times in the stats (not the case with Redis).
If you ask for my overall experience, Redis is much greener, as it is easy to configure and much more flexible, with stable, robust features.
Further, there is a benchmarking result available at this link; below are a few highlights from it,
Hope this helps!!
Test. Run some simple benchmarks. For a long while I considered myself an old school rhino since I used mostly memcached and considered Redis the new kid.
At my current company Redis was used as the main cache. When I dug into some performance stats and simply started testing, Redis was, in terms of performance, comparable to or minimally slower than MySQL.
Memcached, though simplistic, blew Redis out of the water. It scaled much better:
for bigger values (required a change in slab size, but worked)
for multiple concurrent requests
Also, memcached's eviction policy is, in my view, much better implemented, resulting in a more stable average response time overall when handling more data than the cache can hold.
Some benchmarking revealed that Redis, in our case, performs very poorly. This I believe has to do with many variables:
type of hardware you run Redis on
types of data you store
amount of gets and sets
how concurrent your app is
do you need data structure storage
Personally, I don't share the view Redis authors have on concurrency and multithreading.
Another bonus is that it can be very clear how memcache is going to behave in a caching scenario, while redis is generally used as a persistent datastore, though it can be configured to behave just like memcached aka evicting Least Recently Used items when it reaches max capacity.
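For readers unfamiliar with the eviction behaviour mentioned here, the following is a minimal in-process Java sketch of a Least Recently Used cache built on LinkedHashMap. It only illustrates the policy: it is not how memcached or Redis implement eviction internally (Redis, for instance, uses an approximated LRU), and the class name LruCache is made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently used.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true
    // drops the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // "a" is now the most recently used key
        cache.put("c", "3"); // capacity exceeded, so "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

The point for application code is the same as in the answer: anything stored under such a policy may silently disappear, so callers must handle a miss.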
Some apps I've worked on use both just to make it clear how we intend the data to behave - stuff in memcache, we write code to handle the cases where it isn't there - stuff in redis, we rely on it being there.
Other than that, Redis is generally regarded as superior for most use cases, being more feature-rich and thus more flexible.
It would not be wrong to say that Redis is a combination of cache and data structure store, while memcached is just a cache.
A very simple test to set and get 100k unique keys and values against redis-2.2.2 and memcached. Both are running on a Linux VM (CentOS) and my client code (pasted below) runs on a Windows desktop.
Redis
Time taken to store 100000 values is = 18954ms
Time taken to load 100000 values is = 18328ms
Memcached
Time taken to store 100000 values is = 797ms
Time taken to retrieve 100000 values is = 38984ms
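import redis.clients.jedis.Jedis;

public class RedisSetGetTiming {
    public static void main(String[] args) {
        // Connect to the Redis server (host and port as in the original test).
        Jedis jed = new Jedis("localhost", 6379);
        int count = 100000;

        // Time 100k sequential SET calls, one network round trip each.
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < count; i++) {
            jed.set("u112-" + i, "v51" + i);
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Time taken to store " + count + " values is = " + (endTime - startTime) + "ms");

        // Time 100k sequential GET calls on the same connection.
        startTime = System.currentTimeMillis();
        for (int i = 0; i < count; i++) {
            jed.get("u112-" + i);
        }
        endTime = System.currentTimeMillis();
        System.out.println("Time taken to retrieve " + count + " values is = " + (endTime - startTime) + "ms");
        jed.close();
    }
}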
One major difference that hasn't been pointed out here is that Memcache has an upper memory limit at all times, while Redis does not by default (but can be configured to have one). If you would always like to store a key/value for a certain amount of time (and never evict it because of low memory), you want to go with Redis. Of course, you also risk running out of memory...
Memcached will be faster if you are interested in performance, if only because Redis involves networking (TCP calls). Also, internally Memcache is faster.
Redis has more features, as mentioned in other answers.
The biggest remaining reason is specialization.
Redis can do a lot of different things, and one side effect of that is that developers may start using a lot of those different feature sets on the same instance. If you're using the LRU feature of Redis for a cache alongside hard data storage that is NOT LRU, it's entirely possible to run out of memory.
If you're going to set up a dedicated Redis instance to be used ONLY as an LRU instance to avoid that particular scenario, then there's not really any compelling reason to use Redis over Memcached.
If you need a reliable "never goes down" LRU cache... Memcached will fit the bill, since it's impossible for it to run out of memory by design and the specialized functionality prevents developers from trying to make it do something that could endanger that. Simple separation of concerns.
We thought of Redis as a way to take load off our project at work. We thought that by using an nginx module called HttpRedis2Module or something similar we would have awesome speed, but when testing with ab we were proven wrong.
Maybe the module was bad, or our layout, but it was a very simple task and it was even faster to fetch the data with PHP and then stuff it into MongoDB. We're using APC as the caching system, together with PHP and MongoDB. It was much, much faster than the nginx Redis module.
My tip is to test it yourself, doing it will show you the results for your environment. We decided that using Redis was unnecessary in our project as it would not make any sense.
Redis is better.
The pros of Redis are:
It has many data types, such as strings, sets, sorted sets, hashes and bitmaps
Disk persistence of records
Stored procedure (Lua scripting) support
Can act as a message broker using PUB/SUB
Memcache, by contrast, is a plain in-memory key-value cache.
It has no support for data types such as lists and sets, as Redis has.
The major con is that Memcache has no disk persistence.
Here is the really great article/differences provided by Amazon
Redis is a clear winner compared with memcached.
Only one plus point for Memcached
It is multithreaded and fast. Redis has lots of great features and is very fast, but limited to one core.
Great points about Redis, which are not supported in Memcached
Snapshots - the user can take a snapshot of the Redis cache and persist it to secondary storage at any point in time.
Built-in support for many data structures like Set, Map, SortedSet, List, BitMaps etc.
Support for Lua scripting in Redis

Resources