As distributed caching requires a network call, isn't it beneficial to read directly from the DB in some cases? - caching

I want to understand the benefit of running an in-memory cache instance on a separate server for data lookups in distributed caching. The application server has to make a network call to get the data from the cache. Doesn't that network call add to the read latency? Wouldn't it make more sense to get the data directly from the database instance?

Network calls are an order of magnitude faster than disk look-ups (less than 100 microseconds RTT within a data center). A look-up from memory is also fairly fast (10-20 microseconds per read). On the other hand, databases often have to read from disk, and they maintain extra transaction metadata and locks.
So caches provide higher throughput as well as better latency. The final design depends on the type of database and the data-access patterns.
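Whether the extra hop pays off comes down to hit rate and the relative latencies. A back-of-envelope sketch, using illustrative (not measured) numbers in line with the figures above:

```python
# Expected read latency with a remote cache in front of a DB.
# All numbers are illustrative assumptions, in microseconds.
CACHE_RTT_US = 100    # network round trip to the cache within a data center
CACHE_READ_US = 20    # in-memory lookup on the cache node
DB_READ_US = 5000     # disk-backed DB read incl. lock/transaction overhead

def expected_latency_us(hit_rate: float) -> float:
    """Expected latency when every read tries the cache first."""
    hit = CACHE_RTT_US + CACHE_READ_US
    miss = hit + DB_READ_US        # a cache miss still pays the cache round trip
    return hit_rate * hit + (1 - hit_rate) * miss

# With a 90% hit rate the cache wins by a wide margin over always hitting the DB.
print(f"90% hits: {expected_latency_us(0.90):.0f} us vs DB-only: {DB_READ_US} us")
```

At low hit rates the cache round trip is pure overhead, which is why the answer depends on the access pattern.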

Related

Load Balancing to Maximize Local Server Cache

I have a single-server system that runs all kinds of computations on user data, accessible via a REST API. The computations require that large chunks of the user data be in memory during the computation. To do this efficiently, the system includes an in-memory cache, so that multiple requests on the same data chunks will not need to re-read the chunks from storage.
I'm now trying to scale the system out, since one large server is not enough, and I also want to achieve active/active high availability. I'm looking for the best practice to load balance between the multiple servers, while maximizing the efficiency of the local cache already implemented.
Each REST call includes a parameter that identifies which chunk of data should be accessed. I'm looking for a way to tell a load balancer to route the request to a server that has that chunk in cache, if such a server exists - otherwise just use a regular algorithm like round robin (and update the routing table such that the next requests for the same chunk will be routed to the selected server).
A bit more input to consider:
The number of data chunks is in the thousands, potentially tens of thousands. The number of servers is in the low dozens.
I'd rather not move to a centralized cache on another server, e.g. Redis. I have a lot of spare memory on the existing machines that I'd like to utilize, since the computations are mostly CPU-bound. Also, I'd prefer not to re-implement another custom caching layer.
My servers are on AWS so a way to implement this in ELB is fine with me, but open to other cloud-agnostic solutions. I could in theory implement a system that updates rules on an AWS application load balancer, but it could potentially grow to thousands of rules (one per chunk) and I'm not sure that will be efficient.
Since requests using the same data chunk can come from multiple sources, session-based stickiness is not enough. Some of these operations are write operations, and I'd really rather not deal with cross-server synchronization. All the operations on a single chunk should be routed to the single server that has that chunk in memory.
Any ideas are welcome! Thanks!
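One approach that avoids a per-chunk rule table entirely is rendezvous (highest-random-weight) hashing: every routing node computes the same chunk-to-server mapping independently, and removing a server only remaps that server's chunks. A minimal sketch (server names are hypothetical); a thin routing tier in front of the load balancer, or client-side routing, would apply this function, since ELB has no native support for it:

```python
import hashlib

def owner(chunk_id: str, servers: list) -> str:
    """Route a chunk to the server with the highest hash(chunk, server) score.

    All routers agree on the mapping with no shared routing table, and
    removing a server only remaps the chunks that server owned.
    """
    def score(server: str) -> int:
        h = hashlib.sha256(f"{chunk_id}|{server}".encode()).digest()
        return int.from_bytes(h[:8], "big")
    return max(servers, key=score)

servers = ["app-01", "app-02", "app-03"]   # hypothetical server pool
print(owner("chunk-1234", servers))        # deterministic: always the same server
```

With thousands of chunks and a few dozen servers this spreads chunks roughly evenly, and the per-request cost is one hash per server, with no routing state to keep in sync.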

Azure Redis cache latency

I am working on an application with a web job and an Azure function app. The web job generates the Redis cache for the function app to consume. The cache size is around 10 MB. I am using lazy loading and the other recommended practices. I still find that overall cache operations are slow. Depending on the size of the file I am processing, I may end up calling the Redis cache up to 100,000 times. I'm wondering if I need to hold the cache data in a local variable instead of reading it from Redis every time. Has anyone experienced latency in accessing Redis? Does it make sense to create a singleton object in the C# function app and refresh it based on a timer or other logic?
Could you consider these points in your usage? These are some good practices for Azure Redis Cache:
Redis works best with smaller values, so consider chopping up bigger data into multiple keys. In this Redis discussion, 100kb is considered "large". Read this article for an example problem that can be caused by large values.
Use Standard or Premium Tier for Production systems. The Basic Tier is a single node system with no data replication and no SLA. Also, use at least a C1 cache. C0 caches are really meant for simple dev/test scenarios since they have a shared CPU core, very little memory, are prone to "noisy neighbor", etc.
Remember that Redis is an in-memory data store, so be aware of scenarios where data loss can occur.
Reuse connections - Creating new connections is expensive and increases latency, so reuse connections as much as possible. If you choose to create new connections, make sure to close the old connections before you release them (even in managed memory languages like .NET or Java).
Locate your cache instance and your application in the same region. Connecting to a cache in a different region can significantly increase latency and reduce reliability. Connecting from outside of Azure is supported, but not recommended especially when using Redis as a cache (as opposed to a key/value store where latency may not be the primary concern).
Configure your maxmemory-reserved setting to improve system responsiveness under memory pressure conditions, especially for write-heavy workloads or if you are storing larger values (100KB or more) in Redis. I would recommend starting with 10% of the size of your cache, then increase if you have write-heavy loads. See some considerations when selecting a value.
Avoid expensive commands - Some Redis operations, like the KEYS command, are VERY expensive and should be avoided.
Configure your client library to use a "connect timeout" of at least 10 to 15 seconds, giving the system time to connect even under higher CPU conditions. If your client or server tend to be under high load, use an even larger value. If you use a large number of connections in a single application, consider adding some type of staggered reconnect logic to prevent a flood of connections hitting the server at the same time.
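The "chop bigger data into multiple keys" advice can be sketched like this; a plain dict stands in for the Redis client, and the key naming scheme (key:0 .. key:n-1 plus key:count) is just one illustrative convention:

```python
CHUNK_SIZE = 100 * 1024  # stay under the ~100 KB "large value" threshold

def put_chunked(store: dict, key: str, value: bytes) -> None:
    """Split a large value across key:0..key:n-1 plus a key:count entry."""
    chunks = [value[i:i + CHUNK_SIZE] for i in range(0, len(value), CHUNK_SIZE)]
    for i, chunk in enumerate(chunks):
        store[f"{key}:{i}"] = chunk
    store[f"{key}:count"] = str(len(chunks)).encode()

def get_chunked(store: dict, key: str) -> bytes:
    """Reassemble the value by reading key:count and joining the chunk keys."""
    n = int(store[f"{key}:count"])
    return b"".join(store[f"{key}:{i}"] for i in range(n))

store = {}
put_chunked(store, "blob", b"x" * 300_000)   # ~300 KB -> 3 chunk keys
assert get_chunked(store, "blob") == b"x" * 300_000
```

With a real Redis client the individual GETs for the chunks can be pipelined or issued via MGET to keep the round-trip count down.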

What's the point of remote/cloud memcached service?

As far as I understand, memcached is mainly used to cache key value objects in local memory to speed up access.
But on platforms like Heroku, to use memcached you have to choose an add-on like MemCachier, which is cloud-based. I don't understand why that is useful. The network latency is orders of magnitude higher than accessing local memory, and completely unpredictable.
So what am I missing?
In the applicable use cases, e.g. accessing a remote disk-based RDBMS or performing an expensive computation, the network latency is orders of magnitude lower than the alternative. Furthermore, while it is true that networks are generally unreliable, during normal operation you still get sub-millisecond latency.
That said, usually a local cache beats a remote cache in terms of latency but on the other hand it could prove problematic to scale.
Edit: answering the OP's comment.
You can essentially think of a disk-based DB as a memory cache over the data on disk - but the DB server's RAM is limited (like any other server's). An external cache is therefore used to offload some of that stress, reduce the contention on the DB server's resources, and free it for other tasks.
As for latency, yes - I was referring to AWS' network. While I'm less familiar with MemCachier's offering, we (Redis Labs) make sure that our Memcached Cloud and Redis Cloud instances are co-located in the same region as Heroku's dynos to ensure the minimal possible latency. In addition, we also have an Availability Zone Mapping utility that makes it possible to have the application and cache instances reside within the same zone for the same purpose.

Redis: using two instances or just one (caching and storage)?

We need to perform rate limiting for requests to our API. We have a lot of web servers, and the rate limit should be shared between all of them. Also, the rate limit demands a certain amount of ephemeral storage (we want to store the users quota for a certain period of time).
We have a great rate-limiting implementation that works with Redis by using SETEX. In this use case we need Redis to also be used as storage (for a short while, according to the expiration set on the SETEX calls). Also, the cache needs to be shared across all servers, and there is no way we could use something like an in-memory cache on each web server for the rate limiting, since the rate limiting is per user - so we expect a lot of memory to be consumed for this purpose. So this process is a great use case for a Redis cluster.
Thing is - the same web server that performs the rate limiting also has some other caching needs. It fetches some stuff from a DB, and then caches the results in two layers: first in an in-memory LRU cache (on the actual server), and the second layer is Redis again - this time used as cache-only (no storage). When an item gets evicted from the in-memory LRU cache, it is passed on to be saved in Redis (so that even when a cache miss occurs in-memory, there will still be a cache hit, thanks to Redis).
Should we use the same Redis instance for both needs (rate limiter that needs storage on one hand and cache layer that does not on the other)? I guess we could use a single Redis instance that includes storage (not the cache only option) and just use that for both needs? Would it be better, performance wise, for each server of ours to talk to two Redis instances - one that's used as cache-only and one that also features the storage option?
I always recommend dividing your setup into distinct data roles. Combining them sounds neat, but in practice it can be a real pain. In your case you have two distinct "data roles": cached data and stored data. That is two major classes of distinction, which means: use two different instances.
In your particular case isolating them will be easier from an operational standpoint when things go wrong or need upgrading. You'll avoid intermingling services such that an issue in caching causes issues in your "storage" layer - or the inverse.
Redis usage tends to grow into more areas. If you get in the habit of dedicated Redis endpoints now you'll be better able to grow your usage in the future, as opposed to having to refactor and restructure into it when things get a bit rough.
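The SETEX-style fixed-window limiter described in the question can be sketched as follows; a plain dict with explicit expiry timestamps stands in for Redis here (with real Redis you would typically INCR a per-user counter key and set its expiry to the window length when the window starts):

```python
import time

class FixedWindowLimiter:
    """Fixed-window rate limiter with SETEX-like expiry semantics.

    The dict stands in for Redis; the key "expires" when the stored
    window-expiry timestamp has passed, just as a SETEX'd key would.
    """

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}  # user -> (window_expiry, count)

    def allow(self, user: str, now=None) -> bool:
        now = time.monotonic() if now is None else now
        expiry, count = self.counters.get(user, (0.0, 0))
        if now >= expiry:                       # key "expired": new window
            expiry, count = now + self.window, 0
        if count >= self.limit:
            return False
        self.counters[user] = (expiry, count + 1)
        return True

limiter = FixedWindowLimiter(limit=3, window_seconds=60)
print([limiter.allow("alice", now=0.0) for _ in range(4)])  # [True, True, True, False]
print(limiter.allow("alice", now=61.0))                     # True: new window
```

This is exactly the kind of state that must live in the shared instance; the in-process LRU layer discussed above only makes sense for the read-through caching role, not for this one.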

Balancing Redis queries and in-process memory?

I am a software developer (wannabe architect) new to the server-scalability world.
The context is multiple services working with the same data set, aiming to scale for redundancy and load balancing.
The question is: in an idealized system, should services optimize their internal processing to reduce the number of queries to the remote server cache, for better performance and less bandwidth, at the cost of some local memory and code complexity? Or is it better to go all-in and query the remote cache as the single transaction point every time a transaction needs to process the data?
When I read about Redis, and about general database usage online, the latter seems to be the common option: every node of the scaled application holds no local state, and reads and writes directly to the remote cache on every transaction.
But as a developer, I ask: isn't this a tremendous waste of resources? Whether you are designing at the level of electronic chips, inter-thread, inter-process or inter-machine communication, I believe it's the responsibility of each sub-system to do whatever it can to optimize its processing without depending on the external world, and hence reduce overall operation time.
I mean, if the same data is read hundreds of times by the same service without changes (writes), isn't it more logical to keep a local cache, wait for notifications of changes (pub/sub), and read only those changes to update the cache, instead of reading the bigger portion of data every time a transaction requires it? On the other hand, I understand that this method means the same data is duplicated in multiple places (more RAM usage) and requires some sort of expiration system to keep the cache from filling up.
I know Redis is built to be fast. But however fast it is, in my opinion there's still a massive difference between reading directly from local memory and querying an external service: transferring data over the network, allocating memory, deserializing into proper objects, and garbage-collecting them when you are finished. Does anyone have benchmark numbers for in-process dictionary queries versus a Redis query on localhost? Is the difference negligible in the bigger scheme of things, or is it an important factor?
Now, I believe the real answer to my question until now is "it depends on your usage scenario", so let's elaborate:
Some of our services trigger actions on conditions of data change, others periodically crunch data, others periodically read new data from an external network source, and finally others are responsible for presenting data to users, letting them trigger some actions and bring in new data. So it's a bit more complex than a single page-serving web service. We already have a cache-system codebase in most services, and we have a message-broker system to notify data changes and trigger actions. Currently only one service of each type exists (not scaled). They transfer small volatile data over messages and bigger, more persistent (less often changing) data over SQL. We are in the process of moving pretty much all data to Redis to ease scaling and improve performance. Now some colleagues are having a heated discussion about whether we should abandon the cache system altogether and use Redis as the common global cache, or keep our notification/refresh system. We were wondering what the external world thinks about it. Thanks
(damn that's a lot of text)
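A rough way to get a feel for the in-process side of that comparison: time a plain dict read against just the deserialization step that a remote cache read would entail. This measures no network at all, so a real localhost Redis call (typically tens to hundreds of microseconds per round trip) would add on top of the deserialization cost; the numbers vary by machine:

```python
import pickle
import timeit

# A stand-in data set: 1000 keys, each holding a small list.
data = {f"key{i}": list(range(10)) for i in range(1000)}
blob = pickle.dumps(data["key42"])  # what a remote cache would hand back as bytes

N = 100_000
dict_ns = timeit.timeit(lambda: data["key42"], number=N) / N * 1e9
pickle_ns = timeit.timeit(lambda: pickle.loads(blob), number=N) / N * 1e9

# The deserialize step alone usually costs more than the dict read, and a
# real Redis call adds syscalls plus a network round trip on top of it.
print(f"in-process dict read: ~{dict_ns:.0f} ns/op")
print(f"pickle deserialize:   ~{pickle_ns:.0f} ns/op")
```

So the gap the question suspects is real; whether it matters depends on how many reads per transaction sit on the hot path.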
I would favor utilizing in-process memory as much as possible. Any remote query introduces latency. You can use a hybrid approach: use the in-process cache for speed (it is MUCH faster), but put a significantly shorter TTL on it, and once an entry expires, reach further back to Redis.
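That hybrid approach can be sketched as a two-tier lookup, with a short-TTL in-process dict in front of the shared cache; a plain dict stands in for the Redis client here, and the names are illustrative:

```python
import time

class TwoTierCache:
    """Short-TTL in-process cache in front of a shared remote cache.

    `remote` stands in for a Redis client; only GET-style reads are shown.
    The local TTL bounds how stale an in-process copy can get.
    """

    def __init__(self, remote: dict, local_ttl: float):
        self.remote = remote
        self.local_ttl = local_ttl
        self.local = {}  # key -> (expiry, value)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self.local.get(key)
        if entry and entry[0] > now:      # fresh local hit: no network trip
            return entry[1]
        value = self.remote.get(key)      # fall back to the shared cache
        self.local[key] = (now + self.local_ttl, value)
        return value

remote = {"user:1": "alice"}
cache = TwoTierCache(remote, local_ttl=5.0)
print(cache.get("user:1", now=0.0))   # "alice" (from remote, now cached locally)
remote["user:1"] = "bob"
print(cache.get("user:1", now=1.0))   # still "alice": local copy at most 5 s stale
print(cache.get("user:1", now=6.0))   # "bob": local TTL expired, re-read remote
```

The TTL is the knob: shorter means fresher data and more remote traffic, longer means the reverse; pub/sub invalidation, as the question suggests, can shrink the staleness window further at the cost of extra machinery.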

Resources