What is a multi-tier cache?

I've recently come across the phrase "multi-tier cache" relating to multi-tiered architectures, but without a meaningful explanation of what such a cache would be (or how it would be used).
Relevant online searches for that phrase don't really turn up anything either. My interpretation would be a cache servicing all tiers of some n-tier web app. Perhaps a distributed cache with one cache node on each tier.
Has SO ever come across this term before? Am I right? Way off?

I know this is old, but thought I'd toss in my two cents here since I've written several multi-tier caches, or at least several iterations of one.
Consider this: every application has different layers, and at each layer a different form of information can be cached. Each cache item generally expires for one of two reasons: either a period of time has elapsed, or a dependency has been updated.
For this explanation, let's imagine that we have three layers:
Templates (object definitions)
Objects (complete object cache)
Blocks (partial objects / block cache)
Each layer depends on its parent, and we would define those relationships using some form of dependency assignment. So Blocks depend on Objects, which depend on Templates. If an Object is changed, any dependent Blocks would be expunged and refreshed; if a Template is changed, any dependent Objects would be expunged, in turn expunging their Blocks, and all would be refreshed.
There are several benefits. Long expiry times are a big one, because the dependencies ensure that downstream resources are updated whenever their parents are updated, so you won't get stale cached resources. Block caches alone are a big help because, short of whole-page caching (which requires AJAX or Edge Side Includes to avoid caching dynamic content), blocks are the closest elements to an end user's browser / interface and can save boatloads of pre-processing cycles.
The complication in a multi-tier cache like this, though, is that it generally can't rely on purely DB-based foreign-key expunging, unless each tier is 1:1 in relation to its parent (i.e. a Block relies on only a single Object, which relies on a single Template). You'll have to expunge dependent resources programmatically, either via stored procedures in the DB or in your application layer if you want to work with the expunging rules dynamically.
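To make that concrete, here is a minimal sketch of dependency-driven expunging, assuming a simple in-memory store (the class and key names are made up for illustration; a real version would sit in front of whatever cache backend you use):

    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;

    // Toy dependency-aware cache: expunging an entry cascades to every
    // entry registered as depending on it (Template -> Object -> Block).
    public class DependencyCache {
        private final Map<String, Object> store = new HashMap<>();
        private final Map<String, Set<String>> dependents = new HashMap<>();

        // Cache a value and record which parent entries it depends on.
        public void put(String key, Object value, String... parents) {
            store.put(key, value);
            for (String parent : parents) {
                dependents.computeIfAbsent(parent, k -> new HashSet<>()).add(key);
            }
        }

        public Object get(String key) {
            return store.get(key);
        }

        // Remove an entry and, recursively, everything that depends on it.
        public void expunge(String key) {
            store.remove(key);
            Set<String> children = dependents.remove(key);
            if (children != null) {
                children.forEach(this::expunge);
            }
        }
    }

With Blocks registered against their Objects and Objects against their Templates, expunging a template clears the template, the objects built from it, and the blocks built from those objects; the refresh then happens lazily on the next cache miss.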
Hope that helps someone :)
Edit: I should add that any one of these tiers can be clustered, sharded, or otherwise scaled out, so this model works in both small and large environments.

After playing around with EhCache for a few weeks it is still not perfectly clear what they mean by the term "multi-tier" cache. I will follow up with what I interpret to be the implied meaning; if at any time down the road someone comes along and knows otherwise, please feel free to answer and I'll remove this one.
A multi-tier cache appears to be a replicated and/or distributed cache that lives on one or more tiers of an n-tier architecture, allowing components on multiple tiers to gain access to the same cache(s). In EhCache, this is achieved by using a replicated or distributed cache architecture in conjunction with simply referring to the same cache servers from multiple tiers.
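For illustration, here is a sketch of "multiple tiers referring to the same cache" using the Ehcache 3 programmatic API; it uses a plain heap store so it stays runnable, whereas an actual multi-tier setup would back the same named cache with a replicated or clustered store:

    import org.ehcache.Cache;
    import org.ehcache.CacheManager;
    import org.ehcache.config.builders.CacheConfigurationBuilder;
    import org.ehcache.config.builders.CacheManagerBuilder;
    import org.ehcache.config.builders.ResourcePoolsBuilder;

    public class SharedCacheDemo {
        public static void main(String[] args) {
            // One cache manager with one named cache...
            CacheManager manager = CacheManagerBuilder.newCacheManagerBuilder()
                    .withCache("users", CacheConfigurationBuilder
                            .newCacheConfigurationBuilder(Long.class, String.class,
                                    ResourcePoolsBuilder.heap(1000)))
                    .build(true);
            Cache<Long, String> users = manager.getCache("users", Long.class, String.class);

            // ...shared by components standing in for two different tiers.
            users.put(42L, "alice");            // the "service tier" writes
            System.out.println(users.get(42L)); // the "web tier" reads the same entry

            manager.close();
        }
    }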

System Design: Global Caching and consistency

Let's take the example of Twitter. There is a huge cache which gets updated frequently. For example: if a person Foo tweets and has followers all across the globe, ideally all the caches across all the PoPs need to be updated, i.e. they should remain in sync.
How does replication across datacenters (PoPs) work for real-time caches?
What tools/technologies are preferred?
What are the potential issues in this system design?
I am not sure there is a right/wrong answer to this, but here's my two pennies' worth.
I would tackle the problem from a slightly different angle: when a user posts something, that something goes into distributed storage (not necessarily a cache) that is already redundant across multiple geographies. I would also presume that, in the interest of performance, these nodes are eventually consistent.
Now the caching. I would not design a system that takes care of synchronising all the caches each time someone does something. I would rather implement caching at the service level. Imagine a small service residing in a geographically distributed cluster. Each time a user tries to fetch data, the service checks its local cache; on a miss, it reads the tweets from storage and puts a portion of them in the cache (subject to eviction policies). All subsequent accesses, if any, would then be served at the local level.
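A sketch of that read-through flow, with TweetStorage as a hypothetical stand-in for the geo-replicated store (eviction is omitted for brevity):

    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Cache-aside at the service level: check the node-local cache first,
    // fall back to the distributed storage on a miss, then keep a copy.
    public class TimelineService {
        private final Map<String, List<String>> localCache = new ConcurrentHashMap<>();
        private final TweetStorage storage;

        public TimelineService(TweetStorage storage) {
            this.storage = storage;
        }

        public List<String> timeline(String userId) {
            // On a miss, read from storage and cache locally for later requests.
            return localCache.computeIfAbsent(userId, storage::recentTweets);
        }

        // Invoked when a flush broadcast arrives from the centre.
        public void flush(String userId) {
            localCache.remove(userId);
        }
    }

    interface TweetStorage {
        List<String> recentTweets(String userId);
    }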
In terms of design precautions:
Carefully consider the DC / AZ topology in order to ensure sufficient bandwidth and low latency
Cache at the local level in order to avoid useless network trips
Cache updates don't happen from the centre to the periphery; a cache entry is created when a cache miss happens
I am stating the obvious here, but implement the right eviction policies in order to keep only the right objects in the cache
The only message that should go from the centre to the periphery is a cache flush broadcast (tell all the nodes to get rid of their cache)
I am certainly missing many other things here, but hopefully this is good food for thought.

Which caching mechanism to use in my spring application in below scenarios

We are using a Spring Boot application with a MariaDB database. We get data from different services and store it in our database, and when calling another service we need to fetch data from the DB (based on a mapping) and pass it along in the call.
To avoid database hits, we want to cache all the mapping data and use it to retrieve the data and call the service API.
So our ask is: add data to the cache when it gets created in the database (this could add up to millions of records), and remove it from the cache when the status column takes a particular value ("xyz", for example) or based on an eviction policy.
Should we use an in-memory cache such as Hazelcast/Ehcache, or Redis/Couchbase?
Please suggest.
Thanks
I mostly agree with Rick in terms of "don't build it until you need it"; however, it is important these days to think early about where this caching layer would fit later and how to integrate it (for example, using interfaces). Adding it to an unprepared system is always possible, but much more expensive (in terms of hours) and complicated.
OK, to the actual question; disclaimer: Hazelcast employee.
In general, for caching, Hazelcast, Ehcache, Redis and others are all good candidates. The first question you want to ask yourself, though, is: "Can I hold all the necessary records in the memory of a single machine?" With Ehcache in particular you get replication (all machines hold all the information), which means every single node needs to keep the full data set in memory; depending on the size you want to cache, that may not be optimal. In this case Hazelcast might be the better option, as we partition the data across the cluster and optimize access down to a single network hop, adding minimal overhead on top of the network latency.
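As a sketch of what the partitioned approach looks like from application code (assuming Hazelcast 5.x; the map name is arbitrary):

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.map.IMap;

    public class PartitionedMapDemo {
        public static void main(String[] args) {
            // Starts (or joins) a cluster member; map entries are partitioned
            // across all members, so no single node has to hold everything.
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();
            IMap<Long, String> mappings = hz.getMap("mappings");

            mappings.put(1L, "some-mapping-value");
            System.out.println(mappings.get(1L)); // at most one network hop away

            hz.shutdown();
        }
    }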
Second question would be around serialization. Do you want to store the information in a highly optimized serialized form (which needs code to transform it back into something human-readable), or do you want to store it as JSON?
Third question is about the number of clients and threads that will access the data store. Obviously a local cache like Ehcache is always the fastest option, at the trade-off of lots and lots of memory. Apart from that, the most important factor is the threading model the in-memory store uses: either it is multithreaded and scales nicely, or it is a single-thread design that becomes a bottleneck once you exhaust that thread. You can work around the bottleneck with more processes, but it remains a workaround for utilizing today's systems to the fullest.
In more general terms, any of the systems you mention would do the job. The best tool, however, should be selected through a POC / prototype against your real-world use case. The important bit is "real world": a single thread behaves amazingly under low pressure (obviously way faster), but once exhausted it becomes a major bottleneck (again, obviously delaying responses).
I hope this helps a bit, since, at least to me, any answer along the lines of "yes, we are the best option" would be an immediate no-go coming from the person who said it.
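Whichever store wins the POC, Spring's cache abstraction keeps the choice pluggable, which also addresses the "prepare the integration early" point above. A minimal sketch, assuming a hypothetical MappingService, a cache named "mappings", and @EnableCaching configured elsewhere; the provider behind it can be Ehcache, Hazelcast, Redis, etc.:

    import org.springframework.cache.annotation.CacheEvict;
    import org.springframework.cache.annotation.CachePut;
    import org.springframework.cache.annotation.Cacheable;
    import org.springframework.stereotype.Service;

    @Service
    public class MappingService {

        // Put the entry in the cache as soon as the record is created in the DB.
        @CachePut(cacheNames = "mappings", key = "#mapping.id()")
        public Mapping create(Mapping mapping) {
            // ... insert into MariaDB here ...
            return mapping;
        }

        // Read through the cache; MariaDB is only hit on a miss.
        @Cacheable(cacheNames = "mappings", key = "#id")
        public Mapping findById(long id) {
            // ... select from MariaDB here ...
            return new Mapping(id, "active");
        }

        // Drop the entry when its status flips to the terminal value.
        @CacheEvict(cacheNames = "mappings", key = "#mapping.id()")
        public void markInactive(Mapping mapping) {
            // ... update the status column in MariaDB here ...
        }
    }

    // Hypothetical record standing in for the real mapping entity.
    record Mapping(long id, String status) {}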
Build InnoDB with the memcached Plugin
https://dev.mysql.com/doc/refman/5.7/en/innodb-memcached.html

What is the maximum allowed views in CouchDB

I am building a traffic-tracking application. I ended up using CouchDB to store all the traffic logs; the application can dynamically create views based on users' queries and custom data.
I want to create thousands (possibly up to millions) of views.
Is there a limit? Would too many views impact CouchDB's performance?
There is no hard limit on the number of views. There are a few things I would recommend though:
First, split your views up among many design documents. My first thought is one per user, but you could probably subdivide them further depending on how many views you actually have.
Views are grouped internally by design document, which affects when they are rebuilt, where they are stored, etc. Keeping things partitioned off will help prevent one user's views from impacting the performance of any other user.
In addition, without regular compaction of your database, each document (including design documents) retains its old copies across writes, which is one of the reasons CouchDB uses so much disk space (it trades disk space for fast writes).
Second, be very conservative with the values you emit() in your views. Avoid things like emit(key, doc): if you emit the entire document, it becomes part of the view index (which is stored separately from the primary database index), creating yet more copies of the document. If you need the source document when reading the view, query with include_docs=true instead.
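As an illustration (the field and view names here are invented), a per-user design document whose map function emits only a key would look like this:

    {
      "_id": "_design/user_42",
      "views": {
        "by_type": {
          "map": "function (doc) { if (doc.userId === 42) { emit(doc.type, null); } }"
        }
      }
    }

Querying it with GET /trafficdb/_design/user_42/_view/by_type?include_docs=true then joins each row back to its source document at read time, without duplicating the documents inside the view index.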
Depending on the exact situation, you may want to consider partitioning across multiple databases as well. That may not be possible, depending on how you want to write your queries and such, but it is worth mentioning. If you can partition into multiple databases, backups become a little easier to create and it may scale better in the long run.
The main point is, CouchDB is very flexible, which is one of my favorite things about it, as it puts the power in your hands as a developer.

How to distribute data and computation to maximize locality?

Please bear with me, this is a basic architectural question for my first attempt at a "big data" project, but I believe your answers will be of general interest to anyone who is starting out in this field.
I've googled and read the high-level descriptions of Kafka, Storm, Memcached, MongoDB, etc., but now that I'm ready to dig in to start designing my app, I still need some further insight on how in fact the data should be distributed and shared.
The performance of my app is critical, so one objective is to somehow maximize the locality of the data in the RAM of the machines doing the distributed calculations. I need advice for this part of the design.
If my app had some clear criteria for a priori sharding the data and distributing the calculations (such as geographical regions or company divisions) then the solution would be obvious. But unfortunately my app's data access patterns are dynamic and depend on the results of previous calculations.
My app is an analysis program with distinct stages. In the first stage, all the data is accessed once and a metric is calculated for each data object. In the second stage, a subset of the data objects may be accessed, with the probability of access being proportional to each data object's metric that was calculated in the previous stage. In the final stage, a relatively small subset of data objects will be accessed many times for many calculations.
At all stages, it is required that the calculations be distributed across several servers. The calculations are embarrassingly parallel, and each distributed calculation only needs to access a few data objects. It is also required that the number of servers can be specified before the app runs (for example, run on one server, or run on fifty servers).
It seems to me that I need some mechanism that distributes the appropriate data objects to the appropriate compute servers, as opposed to just blindly fetching the data from some database service (whether centralized or distributed). Also, it seems to me that some sort of smart caching system might be appropriate, since the data access pattern depends on the previous calculations and cannot be predicted a priori. But as far as I can tell, Memcached is not such a system because the sharding is determined a priori.
I've read many times that the operating system cache performs better than any monkeying around that we may try. I think the ideal solution is that each compute server's RAM cache somehow captures the data objects' dynamic access patterns, but it's not clear to me how this would work with a NoSQL or Memcached service.
Thanks for bearing with me this far. I realize this is a basic question, but the answer eludes me so far. I can't resolve the dynamic access patterns of my app with the a priori sharding of the NoSQL/Memcached packages. Any advice would be greatly appreciated.
I recommend taking a look at http://tarantool.org. Shard to maximize locality for the most common data-access pattern, use Lua for local computations, and use net.box to issue a remote RPC when a calculation needs to continue on another node. All data is stored in RAM, and if you write your computation code carefully it can take advantage of the just-in-time compiler.

Performance Optimization For Highly Interactive Websites

I recently completed development of a mid-trafficked(?) website (peak 60k hits/hour); however, the site only needs to be updated once a minute, so achieving the required performance can be summed up in a single word: "caching".
For a site like SO where the data feeding the site changes all the time, I would imagine a different approach is required.
Page cache times presumably need to be short or non-existent, and updates need to be propagated across all the web servers very rapidly to keep all users up to date.
My guess is that you'd need a distributed cache to control the serving of data and pages that is updated on the order of a few seconds, with perhaps a distributed cache above the database to mediate writes?
Can those more experienced than I outline some of the key architectural/design principles they employ to ensure highly interactive websites like SO are performant?
The vast majority of sites have many more reads than writes. It's not uncommon to have thousands or even millions of reads to every write.
Therefore, any scaling solution depends on separating the scaling of the reads from the scaling of the writes. Typically, scaling reads is really cheap and easy, while scaling writes is complicated and costly.
The most straightforward way to scale reads is to cache entire pages at a time and expire them after a certain number of seconds. If you look at the popular website Slashdot, you can see that this is the way they scale their site. Unfortunately, this caching strategy can result in counter-intuitive behaviour for the end user.
I'm assuming from your question that you don't want this primitive sort of caching. Like you mention, you'll need to update the cache in place.
This is not as scary as it sounds. The key thing to realise is that, from the server's point of view, Stack Overflow does not update all the time. It updates fairly rarely: maybe once or twice per second. To a computer, a second is nearly an eternity.
Moreover, updates tend to occur to items in the cache that do not depend on each other. Consider Stack Overflow as an example. I imagine that each question page is cached separately. Most questions probably see an update per minute on average for the first fifteen minutes, and then perhaps once an hour after that.
Thus, in most applications you barely need to scale your writes. They're so few and far between that you can have one server doing the writes; updating the cache in place is actually a perfectly viable solution. Unless you have extremely high traffic, you're going to get very few concurrent updates to the same cached item.
So how do you set this up? My preferred solution is to cache each page individually to disk and then have many web heads delivering these static pages from some mutually accessible space.
When a write needs to be done, it is done from exactly one server, and that server updates the particular cached HTML page. Each server owns its own subset of the cache, so there isn't a single point of failure. The update process is carefully crafted so that a transaction ensures no two requests write to the same file at exactly the same time.
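A sketch of that single-writer update, assuming the cached pages live on a filesystem all the web heads can reach: write the new page to a temporary file, then rename it into place, so readers never see a half-written page (on POSIX filesystems the rename atomically replaces the old copy):

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;

    public class PageCacheWriter {
        // Render next to the target, then atomically swap the new page in,
        // so the web heads never serve a partially written file.
        public static void updatePage(Path target, String html) throws IOException {
            Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
            Files.writeString(tmp, html, StandardCharsets.UTF_8);
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        }
    }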
I've found this design has met all the scaling requirements we have so far required. But it will depend on the nature of the site and the nature of the load as to whether this is the right thing to do for your project.
You might be interested in this article, which describes how Wikimedia's servers are structured. Very enlightening!
The article links to this pdf - be sure not to miss it.
