Exploit caching to the fullest

I am caching the result of a method (keyed by its signature) so that it doesn't run a complex query against my data-store every time. The caching itself is working perfectly.
My question is:
How should I find the optimal timeout for an entry in the cache?
What is the optimal number of entries in the cache?
Are there any other variables I can change to improve the performance of my application?
Treating the various factors that affect caching performance as variables, is there a formula that helps me understand how to optimize my cache?

There are two hard problems in computer science: cache invalidation and naming things.

First off, be sure that you need a cache at all. It depends on what sort of datastore you're using (Redis, apparently). If it were a traditional RDBMS, you'd be better off making sure your indexing strategy was tight first. The trouble with introducing caching is that at some point, sooner rather than later, and many times thereafter, you're going to get an inconsistent cache. Cache invalidation isn't atomic with updates to your datastore, so something is going to fire an invalidate message that fails to reach its destination, and your cache will be out of date. So be dead sure you need that caching before you introduce it.

In terms of cache timeout: the sooner the better. An hour is great, a day less so. If something gets out of sync, it will fix itself rather than causing ongoing issues. Also, if you're setting cache timeouts of a week or more, your cache is going to start operating like a datastore all of its own; if it goes down and you have to rebuild it, you're going to take a large performance hit. So in this case less is more.

Finally, make sure you actually set a cache timeout on everything that goes into your cache. It's all too easy with memcache to have no expiry by default, and in that case your cache really is going to start acting like a datastore. Don't let that happen; I've been there, and waiting a week for your site to recover is not fun.
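One way to make the "always set a timeout" rule hard to forget is to funnel every write through a wrapper that enforces a default expiry. A minimal sketch in Python, assuming pymemcache; the wrapper name, key, and the one-hour default are illustrative, not from the original posts:

    from pymemcache.client.base import Client

    DEFAULT_TTL = 3600  # one hour; "less is more", per the advice above

    class ExpiringCache:
        """Thin wrapper that refuses to store anything without a finite expiry."""

        def __init__(self, server=("localhost", 11211)):
            self._client = Client(server)

        def set(self, key, value, ttl=DEFAULT_TTL):
            # In memcached, expire=0 means "never expire"; forbid it so the
            # cache can never silently turn into a datastore.
            if ttl <= 0:
                raise ValueError("every cache entry needs a finite TTL")
            self._client.set(key, value, expire=ttl)

        def get(self, key):
            return self._client.get(key)

    cache = ExpiringCache()
    cache.set("user:42:profile", b"serialized result")  # gets the default one-hour TTL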

Related

How does an LRU cache fit into the CAP theorem?

I was pondering this question today. An LRU cache in the context of a database in a web app helps ensure Availability with fast data lookups that do not rely on continually accessing the database.
However, how does an LRU cache stay fresh in practice? As I understand it, one cannot guarantee Consistency along with Availability. How does a frequently used item, which therefore never expires from the LRU cache, handle modification? Is this an example where, in a system that needs C over A, an LRU cache is not a good choice?
First of all, a cache too small to hold all the data (where an eviction might happen and the LRU part is relevant) is not a good example for the CAP theorem, because even without looking at consistency, it can't even deliver partition tolerance and availability at the same time. If the data the client asks for is not in the cache, and a network partition prevents the cache from getting the data from the primary database in time, then it simply can't give the client any answer on time.
If we only talk about data actually in the cache, we might somewhat awkwardly apply the CAP-theorem only to that data. Then it depends on how exactly that cache is used.
A lot of caching happens on the same machine that also holds the authoritative data. For example, your database management system (say PostgreSQL, or whatever) probably caches lots of data in RAM and answers queries from there rather than from the persistent data on disk. Even then, cache invalidation is a hairy problem. Even without a network, either you are OK with sometimes using outdated information (sacrificing consistency), or the caching system needs to know about data changes and act on them, which can get very complicated. Still, the CAP theorem simply doesn't apply, because there is no distribution. Or, if you want to look at it very pedantically (not the usual way of putting it), the bus the various parts of one computer use to communicate is not partition tolerant (the third leg of the CAP theorem). Put more simply: if the parts of your computer can't talk to one another, the computer will crash.
So CAP-wise, the interesting case is having the primary database and the cache on separate machines connected by an unreliable network. In that case there are two basic possibilities: (1) the caching server answers requests without asking the primary database whether its data is still valid, or (2) it checks with the primary database on every request. Option (1) sacrifices consistency outright. With option (2), there is a problem the cache's design must deal with: what should the cache tell the client if it doesn't get the primary database's answer in time (because of a partition, i.e. some networking problem)? There are then basically only two possibilities: it can still respond with the cached data, taking the risk that it has become invalid, which sacrifices consistency; or it can tell the client it can't answer right now, which sacrifices availability.
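To make the two policies concrete, here is a minimal sketch of a read-through cache that validates with the primary on every request and must pick a behaviour when that check times out. The class and method names are hypothetical, not from any real cache product, and per the discussion above it only considers keys already in the cache:

    class PartitionError(Exception):
        """Raised when the primary database is unreachable in time."""

    class ReadThroughCache:
        def __init__(self, primary, serve_stale_on_partition):
            # `primary` is assumed to expose is_current(key, version) and
            # get(key), both raising TimeoutError during a partition.
            self.primary = primary
            self.serve_stale_on_partition = serve_stale_on_partition  # True = AP, False = CP
            self.store = {}  # key -> (value, version); only cached keys considered

        def get(self, key):
            value, version = self.store[key]
            try:
                # Option (2): validate with the primary on every request.
                if self.primary.is_current(key, version):
                    return value
                value, version = self.primary.get(key)
                self.store[key] = (value, version)
                return value
            except TimeoutError:
                # A partition: we must now choose what to sacrifice.
                if self.serve_stale_on_partition:
                    return value           # keep Availability, risk inconsistency
                raise PartitionError(key)  # keep Consistency, give up Availability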
So, in summary:

- If everything happens on one machine, the CAP theorem doesn't apply.
- If the data and the cache are connected by an unreliable network:
  - A cache too small to hold all the data is not a good example of the CAP theorem, because you don't even get A&P, even without C.
  - Still, the CAP theorem means you'll have to sacrifice C, or even more of A&P than the part a cache won't deliver in the first place.
- What exactly you end up sacrificing depends on how exactly the cache is used.

MongoDB small collection Query very slow

I have a 33MB collection with around 33k items in it. This has been working perfectly for the past month: queries were responsive and there were no slow queries. The collection has all the required indexes, and normally the response is almost instant (1-2ms).
Today I spotted that there was a major query queue and the requests were just not getting processed. The oplog was filling up and just not clearing. After some searching I found the post below, which suggests compacting and databaseRepair. I ran the repair and it fixed the problem: "Ridiculously slow mongoDB query on small collection in simple but big database"
My question is what could have gone wrong with the collection and how did databaseRepair fix the problem? Is there a way for me to ensure this does not happen again?
There are many things that could be at issue here, but ultimately, if a repair/compact solved things for you, it suggests storage-related issues. Here are a few suggestions to follow up on:
Disk performance: Ensure that your disks are performing properly and that you do not have bad sectors. If part of your disk is damaged it could have spiked access times and you may run into this again. You may want to test your RAM modules as well.
Fragmentation: Without knowing your write profile it's hard to say, but your collections and indexes could have fragmented all over your storage system. Running repair will have rebuilt them and brought them back into a more contiguous form, allowing your disk access times to be much faster, especially if you're using mechanical disks and are going to disk for a lot of data.
If this was the issue, you may want to adjust your paddingFactor to reduce how often it recurs, especially if your updates grow the size of your documents over time (assuming you're using MMAPv1 storage); see the sketch after this list.
Page faults: I'm assuming you may have brought the system down to do the repair, which may have reset your memory/working set. You might want to monitor for hard page faults that indicate that your queries are being bottlenecked by IO rather than being served by your in-memory working set. If this is consistently the case, your application behavior may change unexpectedly as data gets pushed in and out of memory, and you may need to add more RAM.
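For reference, a compaction with an explicit padding factor can be issued like this. A minimal sketch using pymongo, assuming the legacy MMAPv1 storage engine (the paddingFactor option only applied there) and hypothetical connection/collection names; note that compact blocks operations while it runs:

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")  # assumed connection string
    db = client["mydb"]                                # hypothetical database name

    # Rewrite the collection and rebuild its indexes in contiguous storage,
    # leaving ~10% padding per document so in-place growth causes fewer moves.
    result = db.command("compact", "mycollection", paddingFactor=1.1)
    print(result)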

How to determine a good cache time to live for live or semi-live data

I write a lot of web applications that poll data from a server. Often these are updated live, or at least semi-live, but generating the data often takes some time, and the results should be cached to reduce server strain. I'm having trouble finding good guides on how best to set an appropriate time to live, etc. Does anyone have suggestions or rules of thumb?
Use the longest duration you could afford your data to be stale as your TTL. If you can afford ten seconds, use a ten-second TTL. If you can afford one second, use a one-second TTL.
You can also look at the problem from the other side: have a single asynchronous server process continuously run the data generation query as often as possible and update the cache as fast as possible. This approach solves the cache stampede problem elegantly and you get an effective and optimum TTL of "how long does it take to generate the data?"
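A minimal sketch of that refresher approach, assuming an in-process cache and a hypothetical generate_data() standing in for the expensive query:

    import threading
    import time

    cache = {"data": None}  # readers should handle the brief warm-up window

    def generate_data():
        # Stand-in for the expensive query; assumed to exist in your app.
        time.sleep(2)  # pretend generation takes two seconds
        return {"generated_at": time.time()}

    def refresher():
        # Run the generation query back-to-back and swap each completed
        # result into the cache. Readers always see the freshest finished
        # snapshot, and the effective TTL is simply "one generation run".
        while True:
            cache["data"] = generate_data()

    threading.Thread(target=refresher, daemon=True).start()

    # Request handlers just read cache["data"]; they never trigger the
    # expensive query themselves, so there is no stampede to worry about.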

How to deal with expiring item (due to TTL) in memcached on high-load website?

When you have peaks of 600 requests/second, memcached flushing an item because its TTL expired has some pretty negative effects: at almost the same time, 200 threads/processes find the cache empty and fire off a DB request to fill it up again.
What is the best practice to deal with these situations?
p.s. what is the term for this situation? (gives me chance to get better google results on the topic)
If you have memcached objects which will be needed on a large number of requests (which you imply is the case), then I would look into having a separate process or cron job regularly recalculate and refresh these objects. That way they should never reach their TTL. It's a common trade-off: you add a little unnecessary load during low-traffic periods to help reduce the load during peaks (the time you probably care the most about).
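A minimal sketch of such a refresh job, assuming pymemcache, a memcached on localhost, and a hypothetical compute_expensive_object() helper; it would be scheduled from cron every few minutes, comfortably inside the TTL:

    # refresh_cache.py -- run from cron, e.g. every 5 minutes
    from pymemcache.client.base import Client

    def compute_expensive_object():
        # Assumed stand-in for the expensive DB query / rendering step.
        return b"serialized result"

    client = Client(("localhost", 11211))

    # Use a TTL comfortably longer than the cron interval, so live traffic
    # never finds the key missing even if one refresh run is late.
    client.set("expensive-object", compute_expensive_object(), expire=900)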
I found out this is referred to as "stampeding herd" by the memcached folks, and they discuss it here: http://code.google.com/p/memcached/wiki/NewProgrammingTricks#Avoiding_stampeding_herd
My next suggestion was actually going to be using soft cache limits as discussed in the link above.
If your object is expiring simply because you've set an expiry and it has passed, there is nothing you can do but increase the expiry time.
If you are worried about stale data, a few techniques exist you could consider:
Consider making the cache the authoritative source for whatever data you are looking at, and have a dedicated thread whose only job is to keep it fresh. The other threads then never have to refill the cache themselves, so this only makes sense if you can afford to treat the cache as the source of truth.
Rather than setting a TTL on the data, change whatever process updates the data to also update the cache. One technique I use for frequently changing data is to do this probabilistically: 10% of the time the data is written, the cache is updated. You can tune this to whatever is sensible, depending on how expensive the DB query is and how severe the impact of stale data is.
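A minimal sketch of that probabilistic update; write_to_db() and render() are hypothetical stand-ins for the real persistence and serialization steps:

    import random

    REFRESH_PROBABILITY = 0.10  # refresh the cache on roughly 10% of writes

    def write_to_db(db, record):
        db[record["key"]] = record        # stand-in for the real DB write

    def render(record):
        return repr(record).encode()      # stand-in for real serialization

    def save(record, db, cache):
        write_to_db(db, record)           # always persist the write itself
        if random.random() < REFRESH_PROBABILITY:
            # Occasionally push the fresh value into the cache, so readers
            # converge on recent data without every write paying the cost.
            cache.set(record["key"], render(record))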

Performance Optimization For Highly Interactive Websites

I recently completed development of a mid-trafficked website (peak 60k hits/hour); however, the site only needs to be updated once a minute, and achieving the required performance can be summed up by a single word: "caching".
For a site like SO where the data feeding the site changes all the time, I would imagine a different approach is required.
Page cache times presumably need to be short or non-existent, and updates need to be propagated across all the webservers very rapidly to keep all users up to date.
My guess is that you'd need a distributed cache to control the serving of data and pages that is updated on the order of a few seconds, with perhaps a distributed cache above the database to mediate writes?
Can those more experienced than I outline some of the key architectural/design principles they employ to ensure highly interactive websites like SO are performant?
The vast majority of sites have many more reads than writes. It's not uncommon to have thousands or even millions of reads to every write.
Therefore, any scaling solution depends on separating the scaling of the reads from the scaling of the writes. Typically, scaling reads is really cheap and easy, while scaling writes is complicated and costly.
The most straightforward way to scale reads is to cache entire pages at a time and expire them after a certain number of seconds. If you look at a popular website like Slashdot, you can see that this is the way they scale their site. Unfortunately, this caching strategy can result in counter-intuitive behaviour for the end user.
I'm assuming from your question that you don't want this primitive sort of caching. As you mention, you'll need to update the cache in place.
This is not as scary as it sounds. The key thing to realise is that, from the server's point of view, Stack Overflow does not update all the time. It updates fairly rarely, maybe once or twice per second. To a computer, a second is nearly an eternity.
Moreover, updates tend to occur to items in the cache that do not depend on each other. Consider Stack Overflow as an example. I imagine that each question page is cached separately. Most questions probably get an update per minute on average for the first fifteen minutes, and then perhaps once an hour after that.
Thus, in most applications you barely need to scale your writes. They're so few and far between that you can have one server doing the writes, and updating the cache in place is actually a perfectly viable solution. Unless you have extremely high traffic, you're going to get very few concurrent updates to the same cached item.
So how do you set this up? My preferred solution is to cache each page individually to disk and then have many web-heads delivering these static pages from some mutually accessible space.
When a write needs to be done, it is done from exactly one server, which updates that particular cached HTML page. Each server owns its own subset of the cache, so there isn't a single point of failure. The update process is carefully crafted so that a transaction ensures no two requests write to the file at exactly the same time.
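One simple way to get that "no two writers interleave" guarantee is an atomic rename. A minimal sketch, with a hypothetical render_page() standing in for the real page generation:

    import os
    import tempfile

    def render_page(question_id):
        return f"<html>...question {question_id}...</html>"  # stand-in renderer

    def update_cached_page(cache_dir, question_id):
        html = render_page(question_id)
        # Write the new version to a temporary file in the same directory...
        fd, tmp_path = tempfile.mkstemp(dir=cache_dir, suffix=".html.tmp")
        with os.fdopen(fd, "w") as f:
            f.write(html)
        # ...then atomically swap it into place. Readers always see either the
        # old complete page or the new complete page, never a partial write.
        os.replace(tmp_path, os.path.join(cache_dir, f"{question_id}.html"))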
I've found this design has met all the scaling requirements we've had so far. But whether it is the right thing to do for your project will depend on the nature of the site and the nature of the load.
You might be interested in this article, which describes how Wikimedia's servers are structured. Very enlightening!
The article links to this pdf - be sure not to miss it.
