What's the point of remote/cloud memcached service? - caching

As far as I understand, memcached is mainly used to cache key value objects in local memory to speed up access.
But on platform like heroku, to use memcached you have to choose add-on like Memcachier, which is cloud based. I don't understand why is that useful? The network latency is orders of magnitude higher than accessing local memory and completely unpredictable.
So what am I missing?

In the applicable use cases, e.g. accessing a remote disk-based RDBMS or performing an expensive computation, the network latency is orders of magnitude lower than the alternative. Furthermore, while it is true that networks are generally unreliable, during normal operation you still get sub-millisecond latency.
That said, usually a local cache beats a remote cache in terms of latency but on the other hand it could prove problematic to scale.
Edit: answering the OP's comment.
You can essentially think of a disk-based DB as a memory cache over the data in disk - but the DB server's RAM is limited (like any other server). An external cache is therefore used to offload some of that stress, reduce the contention on the DB server resources and free it for other tasks.
As for latency, yes - I was referring to AWS' network. While I'm less familiar with Memcachier's offer, we (Redis Labs) make sure that our Memcached Cloud and Redis Cloud instances are co-located in the same data region as Heroku's dynos are to ensure minimal possible latency. In addition, we also have an Availability Zone Mapping utility that makes it possible to have the application and cache instances reside within the same zone for the same purpose.

Related

Redis vs memcached vs Scylla Cache - Which one to choose?

I'm designing an application where I want to cache million data each around 10kb.. I did some analysis and on the fence between using Redis vs memcached vs Scylla as Cache.. Can some experts suggests which might best suits my needs?
Highly performant
High availability
High Throughput
Low pricing?
Full disclosure - I work on the Scylla project.
I think it is a question of latency and HA vs cost. As a RAM-based system, Redis will be the lowest latency. If you need < 1 millisecond response, then Redis or memcached are the choice.
Scylla is a disk-based system. Those values that are in Scylla's RAM will be low latency, but those that need to pull from disk will be slower. So your 99p latency is likely to be slower. How slow? Depends on your disk. NVME can be 99p 3-5 ms. SSD, maybe 5-10 ms. If this is an acceptable latency, then Scylla will be much less expensive, as even NVME is much cheaper than RAM.
As for HA - Redis and memcached are intended as a cache. While there are some features and frameworks that you can use to replicate data around, these are all bolt-ons and increase complexity. Scylla is a distributed system by design. So the replication to allow for multiple layers of HA is built-in (node, rack and DC-availability)
Redis (and to a lesser extend, memcached) are phenomenal caches. But, depending upon your use case, Scylla might be the right choice.
All three options you mentioned are open-source software, so the pricing is the same - zero :-) However, both Scylla and Redis are written and backed by companies (ScyllaDB and RedisLabs, respectively), so if your use case is mission-critical you may choose to pay these companies for enteprise-level support, you can inquire with these companies what are their prices.
The more interesting difference between the three is in the technology.
You described a use case where you have 10 GB of data in the cache. This amount can be easily held in memory, so a completely in-memory database like Memcached or Redis is a natural choice. However, there are still questions you need to ask yourself, which may lead you to a distributed database, such as Scylla depending on your answers:
Would you be using powerful many-core machines? If so, you should probably rule out Memcached - my experience (and others' - see
Can memcached make full use of multi-core?) suggests that it does not scale well with many cores. On an 8-core machine you will not get anywhere close to 8 times the performance of a one-core machine.
Redis is also not really meant for multi-core use - https://redis.io/topics/benchmarks says that Redis "is not designed to benefit from multiple CPU cores. People are supposed to launch several Redis instances to scale out on several cores if needed.". Scylla, on the other hand, thrives on multi-core machines. You should probably test the performance of all three products on your use case before making a decision.
How much of a disaster would be to suddenly lose the entire content of your cache? In some use cases, it just means you would need to query some slightly-slower backend server, so suddenly losing the cache on reboot is acceptable. In such cases, a memory-only cache like Memached or Redis is probably exactly what you need. However, in other cases, there may be a big penalty for starting from scratch with an empty cache - the backend server might be very slow, or maybe the original content is stored on a far-away server with a slow and expensive WAN. In such a case you would want a disk-backed cache, so if the memory cache is lost, you can still refresh it from disk and not from the backend server. Redis has a disk backing option, and in Scylla disk backing is the main way.
You mentioned a working set of 10 GB, which can easily fit memory of a single server. But is it possible this will grow and in a year you'll find yourself needing to cache 100 GB or 1 TB, which no longer fits the memory of a single server? In memcached you'll be out of luck. Redis used to have a "virtual memory" solution for this purpose, but it is deprecated and https://redis.io/topics/virtual-memory now states that Redis is "without considering at least for now the support for databases bigger than RAM". Scylla does handle this issue in two ways. First, your cache would be stored on disk which can be much larger than memory (and whatever amount of memory you have will be used to further speed up that cache, but it doesn't need to fit memory). Second, Scylla is a distributed server. It can distribute a 100 GB working set to 10 different nodes. Redis also has "replication", but it copies the entire data to all nodes - while Scylla can optionally store different subsets of the data on different nodes.
In-memory is actually a bad thing since RAM is expensive and not persistent.
So Scylla will be a better option for K/V or columnar workloads.
Scylla also has a limited Redis api with good results [1], using the CQL
api will result in better results.
[1] https://medium.com/#siddharthc/redis-on-nvme-with-scylladb-5e12afd38dbc

Azure Redis cache latency

I am working on an application having web job and azure function app. Web job generates the redis cache for function app to consume. Cache size is around 10 Mega Bytes. I am using lazy loading and all as per the recommendation. I still find that the overall cache operation is slow. Depending upon the size of the file i am processing, i may end up calling Redis cache upto 100,000 times . Wondering if I need to hold the cache data in a local variabke instead of reading it every time from redis. Has anyone experienced any latency in accessing Redis? Does it makes sense to create a singletone object in c# function app and refresh it based on some timer or other logic?
could you consider this points in your usage this is some good practices of azure redis cashe
Redis works best with smaller values, so consider chopping up bigger data into multiple keys. In this Redis discussion, 100kb is considered "large". Read this article for an example problem that can be caused by large values.
Use Standard or Premium Tier for Production systems. The Basic Tier is a single node system with no data replication and no SLA. Also, use at least a C1 cache. C0 caches are really meant for simple dev/test scenarios since they have a shared CPU core, very little memory, are prone to "noisy neighbor", etc.
Remember that Redis is an In-Memory data store. so that you are aware of scenarios where data loss can occur.
Reuse connections - Creating new connections is expensive and increases latency, so reuse connections as much as possible. If you choose to create new connections, make sure to close the old connections before you release them (even in managed memory languages like .NET or Java).
Locate your cache instance and your application in the same region. Connecting to a cache in a different region can significantly increase latency and reduce reliability. Connecting from outside of Azure is supported, but not recommended especially when using Redis as a cache (as opposed to a key/value store where latency may not be the primary concern).
Redis works best with smaller values, so consider chopping up bigger data into multiple keys.
Configure your maxmemory-reserved setting to improve system responsiveness under memory pressure conditions, especially for write-heavy workloads or if you are storing larger values (100KB or more) in Redis. I would recommend starting with 10% of the size of your cache, then increase if you have write-heavy loads. See some considerations when selecting a value.
Avoid Expensive Commands - Some redis operations, like the "KEYS" command, are VERY expensive and should be avoided.
Configure your client library to use a "connect timeout" of at least 10 to 15 seconds, giving the system time to connect even under higher CPU conditions. If your client or server tend to be under high load, use an even larger value. If you use a large number of connections in a single application, consider adding some type of staggered reconnect logic to prevent a flood of connections hitting the server at the same time.

Redis: using two instances or just one (caching and storage)?

We need to perform rate limiting for requests to our API. We have a lot of web servers, and the rate limit should be shared between all of them. Also, the rate limit demands a certain amount of ephemeral storage (we want to store the users quota for a certain period of time).
We have a great rate limiting implementation that works with Redis by using SETEX. In this use case we need Redis to also be used a storage (for a short while, according to the expiration set on the SETEX calls). Also, the cache needs to be shared across all servers, and there is no way we could use something like an in-memory cache on each web server for dealing with the rate limiting since the rate limiting is per user - so we expect to have a lot of memory consumed for this purpose. So this process is a great use case for a Redis cluster.
Thing is - the same web server that performs the rate limit, also has some other caching needs. It fetches some stuff from a DB, and then caches the results in two layers: first, in an in-memory LRU-cache (on the actual server) and the second layer is Redis again - this time used as cache-only (no storage). In case the item gets evicted from the in-memory LRU-cache, it is passed on to be saved in Redis (so that even when a cache miss occurs in-memory, there would still be a cache-hit because thanks to Redis).
Should we use the same Redis instance for both needs (rate limiter that needs storage on one hand and cache layer that does not on the other)? I guess we could use a single Redis instance that includes storage (not the cache only option) and just use that for both needs? Would it be better, performance wise, for each server of ours to talk to two Redis instances - one that's used as cache-only and one that also features the storage option?
I always recommend dividing your setup into distinct data roles. Combining them sounds neat but in practice can be a real pain. In your case you ave two distinct "data roles": cached data and stored data. That is two major classes of distinction which means use two different instances.
In your particular case isolating them will be easier from an operational standpoint when things go wrong or need upgrading. You'll avoid intermingling services such that an issue in caching causes issues in your "storage" layer - or the inverse.
Redis usage tends to grow into more areas. If you get in the habit of dedicated Redis endpoints now you'll be better able to grow your usage in the future, as opposed to having to refactor and restructure into it when things get a bit rough.

Should the threshold of an "expensive" operation change with infrastructure?

I'm using Amazon ElastiCache for session storage and expensive operation caching on a multi-node web application. One gotchya – I failed to take into account the network latency of the ElastiCache node(s) compared to a local Memcached server.
My benchmarks are show 1-2ms response times for ElastiCache calls within an AWS VPC (as advertised), pretty good, but obviously dramatically slower than anything local. In terms of actual compute cycles 1-2ms is a lifetime. This dramatically changes what I can consider an "expensive" operation worth caching.
My inexperience that led me down this path, but I would imagine others must have similar issues when moving into the "cloud".
Question: Is it better to rethink (and rewrite) what qualifies as an "expensive" operation, or should the infrastructure do a better job of supporting the code (for example, I could use a local memcached server on each node and only pass cache misses through to the ElastiCache node).
Some open questions:
How did you benchmarked ? 1-2ms was from connection opening to close, or just fetching time.
Why you need 15-20 average calls, can they be reduced ? If yes, that might be better.
Is Amazon ElasticCache on same region as your servers ?
Now coming to solutions (will update according to above questions' answer):
Possible Solution
Divide things which you are fine displaying stale values, and things which should be non-staled and common across all servers (example money in wallet, etc.). Now setup two layer caching, one layer i.e. on server only, and save all values which can be staled individually on different servers, and maintain common AmazonElastic cache for stable values. This can be one of working strategy.

Balancing Redis queries and in-process memory?

I am a software developer but wannabe architect new to the server scalability world.
In the context of multiple services working with the same data set, aiming to scale for redundancies and load balancing.
The question is: In a idealistic system, should services try to optimize their internal processing to reduce the amount of queries done to the remote server cache for better performance and less bandwidth at the cost of some local memory and code base or is it better to just go all-in and query the remote cache as the single transaction point every time any transaction need processing done on the data?
When I read about Redis and even general database usage online, the later seems to be the common option. Every nodes of the scaled application have no memory and read and write directly to the remote cache on every transactions.
But as a developer, I ask if this isn't a tremendous waste of resources? Whether you are designing at electronic chips level, at inter-thread, inter-process or inter-machine, I do believe it's the responsibility of each sub-system to do whatever it can to optimize its processing without depending on the external world if it can and hence reduce overall operation time.
I mean, if the same data is read over hundreds or time from the same service without changes (write), isn't it just more logical to keep a local cache and wait for notifications of changes (pub/sub) and only read only these changes to update the cache instead reading the bigger portion of data every time a transaction require it? On the other hand, I understand that this method implies that the same data will be duplicated at multiple place (more ram usage) and require some sort of expiration system not to keep the cache from filling up.
I know Redis is built to be fast. But however fast it is, in my opinion there's still a massive difference between reading directly from local memory versus querying an external service, transfer data over network, allocating memory, deserialize into proper objects and garbage collect it when you are finished with it. Anyone have benchmark numbers between in-process dictionaries query versus a Redis query on the localhost? Is it a negligible time in the bigger scheme of things or is it an important factor?
Now, I believe the real answer to my question until now is "it depends on your usage scenario", so let's elaborate:
Some of our services trigger actions on conditions of data change, others periodically crunch data, others periodically read new data from external network source and finally others are responsible to present data to users and let them trigger some actions and bring in new data. So it's a bit more complex than a single web pages deserving service. We already have a cache system codebase in most services, and we have a message broker system to notify data changes and trigger actions. Currently only one service of each type exist (not scaled). They transfer small volatile data over messages and bigger more persistent (changing less often) data over SQL. We are in process of moving pretty much all data to Redis to ease scalability and performances. Now some colleagues are having a heated discussion about whether we should abandon the cache system altogether and use Redis as the common global cache, or keep our notification/refresh system. We were wondering what the external world think about it. Thanks
(damn that's a lot of text)
I would favor utilizing in-process memory as much as possible. Any remote query introduces latency. You can use a hybrid approach and utilize in-process cache for speed (and it is MUCH faster) but put a significantly shorter TTL on it, and then once expired, reach further back to Redis.

Resources