Concurrent Product View Counter - caching

So I have been studying some system design concepts and I stumbled on this question: "Show the number of people viewing a product".
There is some leeway on consistency for the design, but it should support a high-traffic e-commerce site.
My approach is to store the last 5-10 minutes of view data, as timestamps along with the product-id and user session-id, in a distributed cache like Redis. The write happens whenever a user views the product. To get the actual count, a separate read API hits a replica (secondary) cache instance and aggregates the result for that product-id.
I have to keep the computation logic cyclic so that I don't waste memory on timestamps older than 5-10 minutes.
Do you think I should tweak my read/write strategy to optimise further?
Is the cache option good enough, specifically Redis?
What are the more tried and tested approaches out there?
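A minimal sketch of the sliding-window idea described above, for illustration only. It uses a plain in-memory dict as a stand-in for one Redis sorted set per product (session-id as member, timestamp as score); the class name, method names and the 300-second window are assumptions, not part of the question. The comments name the Redis commands each step would roughly correspond to.

```python
import time
from collections import defaultdict


class ViewerCounter:
    """In-memory stand-in for one Redis sorted set per product.

    Maps product_id -> {session_id: last_seen_timestamp}.
    """

    def __init__(self, window_seconds=300):
        self.window_seconds = window_seconds  # assumed 5-minute window
        self._last_seen = defaultdict(dict)

    def record_view(self, product_id, session_id, now=None):
        # Write path (roughly ZADD): re-adding a session only refreshes
        # its timestamp, so each session is counted once per product.
        now = time.time() if now is None else now
        self._last_seen[product_id][session_id] = now

    def count_viewers(self, product_id, now=None):
        # Read path: drop entries older than the window (roughly
        # ZREMRANGEBYSCORE), then count what is left (roughly ZCARD).
        now = time.time() if now is None else now
        cutoff = now - self.window_seconds
        sessions = self._last_seen[product_id]
        expired = [s for s, ts in sessions.items() if ts < cutoff]
        for session_id in expired:
            del sessions[session_id]
        return len(sessions)
```

Trimming on read keeps memory bounded by the number of sessions seen inside the window, which is the "cyclic" behaviour asked about. Whether the trim belongs on the write path, the read path or a background job is a trade-off this sketch does not settle.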

Related

Should I set expiration on caches of "constant" results in Redis?

I have a few queries to a database that return absolutely constant responses, i.e. some entries on this database are never changed after written.
I'm wondering if I'm to implement caching on them with Redis, should I set an expiration time?
Pros and cons of not doing that -
Pros: Users will always benefit from caching (except for the first query)
Cons: The number of these entries to be queried is growing. So Redis will end up using more and more memory.
Edit
To give more context, the queries run quite slow. Each of them may take seconds. It will be beneficial to minimize the number of users that experience this.
Also, each of these results is on the order of several kB in size; the number (not size) of entries may grow by about 1 per minute.
Sorry for answering with questions. Still waiting for enough reputation to comment and clarify.
Answering your direct question:
Are the number of queries you expect unbounded?
No: You could improve the first user's experience by triggering the queries on startup and leaving the results in the cache. To other responses that are expected to change you could attach a TTL, and use one of the following maxmemory-policy settings in the config: volatile-lru, volatile-lfu, volatile-ttl, or volatile-random, so that only keys with TTLs are evicted.
Yes: Prioritize these by attaching a TTL and refreshing it each time the entry is requested, so that frequently used entries stay in the cache as long as possible, and use whichever memory management policy best fits the rest of your use case.
Related concerns:
If these are really static values, why are you querying a database rather than reading from a flat file of constants generated once and read at startup?
Have you attempted to optimize your queries?
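The "Yes" branch above (attach a TTL and refresh it on every request) could be sketched roughly as follows. This is an in-process illustration, not Redis client code: the class name, the loader callback and the TTL value are assumptions. In Redis terms the miss path would be a SET with an expiry and the hit path an EXPIRE on the same key.

```python
import time


class SlidingTTLCache:
    """Cache whose entries expire ttl_seconds after their last access."""

    def __init__(self, ttl_seconds, loader):
        self.ttl_seconds = ttl_seconds
        self.loader = loader      # hypothetical slow query, called on a miss
        self._store = {}          # key -> (value, expires_at)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            value = entry[0]              # hit: entry has not expired yet
        else:
            value = self.loader(key)      # miss or expired: run the slow query
        # Refresh the expiry on every request so popular keys stay cached.
        self._store[key] = (value, now + self.ttl_seconds)
        return value
```

Entries that nobody asks for again simply age out, which bounds memory by the recently used set rather than by the total number of entries ever written. This sketch never deletes expired entries on its own; Redis does that for you.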

Best way to construct a cache key whose uniqueness is defined by 6 properties

Currently I am tasked with fixing the cache for an e-commerce-like system whose prices depend on many factors. The cache backend is Redis. For a given product, the factors that influence the price are:
sku
channel
sub channel
plan
date
Currently the cache is structured like this in redis:
product1_channel1_subchannel1: {sku_1: {plan1: {2019-03-18: 2000}}}
The API caters to requests for multiple products, skus and all the factors above. So they decided to query all the data at the product_channel_subchannel level and filter it in the app, which is very slow. They have also decided that, on a cache miss, they will build the cache for all skus for 90 days of data. This way only one request faces the wrath while the others benefit from it (the only catch is that we are now busting the cache more often, which is also dragging the system down).
The downside of including all these factors in the keys is that there will be too many keys. As a ballpark, there are 400 products, each made up of 20 skus, with 20 channels, 200 subchannels, 3 types of plans and 400 days of pricing. To avoid that many keys, we must group the data at some level.
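A quick back-of-the-envelope check of those numbers. It assumes every combination of factors actually exists (so these are upper bounds) and that any of the 200 subchannels can be paired with any of the 20 channels; the variable names and the "middle ground" grouping are only illustrative.

```python
products, skus, channels, subchannels, plans, days = 400, 20, 20, 200, 3, 400

# Current layout: one key per product/channel/subchannel.
grouped_keys = products * channels * subchannels

# Fully flattened layout: one key per combination of all six properties.
flat_keys = products * skus * channels * subchannels * plans * days

# One possible middle ground (an assumption, not from the existing system):
# one key per product/channel/subchannel/sku/plan, with dates grouped inside.
per_sku_plan_keys = grouped_keys * skus * plans

print(f"{grouped_keys:,}")        # 1,600,000
print(f"{per_sku_plan_keys:,}")   # 96,000,000
print(f"{flat_keys:,}")           # 38,400,000,000
```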
The system currently receives about 10 rps and has to respond within 100 ms.
Question is:
Is the above cache structure fine? Or how do we go about flattening this structure?
How are caches structured in pricing systems in general? I feel like this is a very trivial task; nonetheless, I find it very hard to justify my approaches.
Is it okay to sacrifice one request to warm cache for bulk of the data? Or is it better to have a cache warming strategy?
Any sort of caching strategy will be an exercise in trade-offs. And the precise trade-offs you need to make will be dependent upon complex domain logic that you can't predict until you try it out.
What this means is that whatever you implement should be based on data and should be flexible enough to change over time as the business changes. In particular the answer to these questions:
Is it okay to sacrifice one request to warm cache for bulk of the data? Or is it better to have a cache warming strategy?
depends on how the data will be queried by your users and how long a cache miss will take. If queries tend to be clustered around certain skus, or certain dates, in a predictable manner, then you should use that information to decide what to warm and when.
There is no way I, or anyone else, can give you a correct answer without doing proper experimentation, but we can give you some guidelines.
Here are some best practices that I would recommend when using redis for caching:
If the bottleneck is sending data from Redis to the API, then consider using Lua scripts to do the simple processing before any data leaves Redis. But be careful not to make the scripts too complex, since a long-running Lua script blocks all other commands in Redis.
It looks like you are using simple get/set keys to store your data. Consider using something more structured:
a. use sorted sets (zsets) if you want better access to data by date (use the date as the score).
b. use hashes to get more fine-grained access to skus.
Based on your question, it looks like you will have about 1.6M keys. This is not a huge amount, but you need to make sure that Redis has enough memory to store everything in RAM without swapping anything to disk. This is something that we had to learn the hard way. If you are running your Redis instance on Linux, set the system's swappiness to 0 so that the kernel avoids swapping as far as possible.
But, most importantly, you need to experiment with everything until you find a good solution.
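As one possible reading of the hash suggestion above, here is a rough sketch in which each product/channel/subchannel/sku/plan combination gets its own hash and the dates are the fields. The key format, the function names and the FakeHashStore class are all hypothetical; the store is a dict-of-dicts stand-in so the example runs without a Redis server, with methods named after the HSET and HMGET commands they imitate.

```python
from datetime import date, timedelta


def price_key(product, channel, subchannel, sku, plan):
    # Hypothetical key format; any unambiguous separator would do.
    return f"price:{product}:{channel}:{subchannel}:{sku}:{plan}"


class FakeHashStore:
    """Dict-of-dicts stand-in for Redis hashes."""

    def __init__(self):
        self._hashes = {}

    def hset(self, key, field, value):
        self._hashes.setdefault(key, {})[field] = value

    def hmget(self, key, fields):
        # Like HMGET: one result per requested field, None when missing.
        stored = self._hashes.get(key, {})
        return [stored.get(field) for field in fields]


def prices_for_range(store, key, start, days):
    # Fetch only the requested dates instead of filtering in the app.
    fields = [(start + timedelta(days=i)).isoformat() for i in range(days)]
    return dict(zip(fields, store.hmget(key, fields)))
```

With this layout a request for one sku, plan and date range touches a single key and returns only the dates asked for. Whether that beats the current grouping depends on the access pattern, which is exactly what needs measuring.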

What's the benefit of the client-server model of memcached?

As I understand it, the benefit of using memcached is to shorten the access time to information stored in the database by caching it in memory. But isn't the time overhead of the client-server model, which relies on a network protocol (e.g. TCP), considerable as well? My guess is that it might actually be worse, as network access is generally slower than local hardware access. What am I getting wrong?
Thank you!
It's true that caching won't address network transport time. However, what matters to the user is the overall time from request to delivery. If this total time is perceptible, then your site does not seem responsive. Appropriate use of caching can improve responsiveness, even if your overall transport time is out of your control.
Also, caching can be used to reduce overall server load, which will essentially buy you more cycles. Consider the case of a query whose response is the same for all users - for example, imagine that you display some information about site activity or status every time a page is loaded, and this information does not depend on the identity of the user loading the page. Let's imagine also that this information does not change very rapidly. In this case, you might decide to recalculate the information every minute, or every five minutes, or every N page loads, or something of that nature, and always serve the cached version. In this case, you're getting two benefits. First, you've cut out a lot of repeated computation of values that you've decided don't really need to be recalculated, which takes some load off your servers. Second, you've ensured that users are always getting served from the cache rather than from computation, which might speed things up for them if the computation is expensive.
Both of those could, in the right circumstances, lead to improved performance from the user's perspective. But of course, as with any optimization, you need to run benchmarks and tune against the measured data rather than against your perception of what ought to be faster.
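The "recalculate every minute and always serve the cached version" idea can be sketched in a few lines. This is an in-process illustration with assumed names (PeriodicValue, compute, max_age_seconds), not memcached client code; with memcached the stored value would live on the cache server and be shared by all web processes.

```python
import time


class PeriodicValue:
    """Serve a cached value and recompute it at most once per max_age_seconds."""

    def __init__(self, compute, max_age_seconds=60):
        self.compute = compute                  # hypothetical expensive function
        self.max_age_seconds = max_age_seconds
        self._value = None
        self._computed_at = None

    def get(self, now=None):
        now = time.monotonic() if now is None else now
        stale = (
            self._computed_at is None
            or now - self._computed_at >= self.max_age_seconds
        )
        if stale:
            # Only this request pays for the computation.
            self._value = self.compute()
            self._computed_at = now
        return self._value
```

Every page load inside the same interval reuses one computation, which is where both the reduced server load and the faster response come from.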

Strategy for "user data" in couchbase

I know that a big part of the performance from Couchbase comes from serving in-memory documents and for many of my data types that seems like an entirely reasonable aspiration but considering how user-data scales and is used I'm wondering if it's reasonable to plan for only a small percentage of the user documents to be in memory all of the time. I'm thinking maybe only 10-15% at any given time. Is this a reasonable assumption considering:
In any given time period, only a fraction of the users will be using the system.
In this case, users only access their own data (or predominantly so)
Recently entered data is exponentially more likely to be viewed than historical user documents
UPDATE:
Some additional context:
Let's assume there's a user base of a 1 million customers, that 20% rarely if ever access the site, 40% access it once a week, and 40% access it every day.
At any given moment, only 5-10% of the user population would be logged in
When a user logs in, they are likely to re-query certain documents within a single session (although the client does do some object caching to minimise this)
For any user, the most recent records are very active, the very old records very inactive
In summary, I would say that the majority of user-triggered transactional documents are queried quite infrequently, but there is a core set -- records produced in the last 24-48 hours and relevant to the currently "logged in" group -- that would benefit significantly from being in memory.
Two sub-questions are:
Is there a way to set a timestamp on a per-document basis to indicate that it needs to be kept in memory?
How does Couchbase cope with the growing list of document IDs in memory? It is my understanding that all IDs must always be in memory. Isn't this too memory-intensive for some apps?
First, one of the major benefits of CB is that it is spread across multiple nodes. This also means your queries are spread across multiple nodes, and you get a performance gain as a result (I know several other similar NoSQL stores are spread across nodes too, so maybe this is not relevant for your comparison?).
Next, I believe this question is a little bit too broad as I believe the answer will really depend on your usage. Does a given user only query his data one time, at random? If so, then according to you there will only be an in-memory benefit 10-15% of the time. If instead, once a user is on the site, they might query their data multiple times, there is a definite performance benefit.
Regardless, Couchbase has pretty fast disk-access performance, particularly on SSDs, so it probably doesn't make much difference either way, but again, without specifics there is no way to be sure. If the documents are relatively small and a user is waiting for just one of them to load, then the user is unlikely to notice whether the document was loaded from RAM or from disk.
Here is an interesting article on benchmarks for CB against similar nosql platforms.
Edit:
After reading your additional context, I think your scenario lines up pretty much exactly how Couchbase was designed to operate. From an eviction standpoint, CB keeps the newest and most-frequently accessed items in RAM. As RAM fills up with new and/or old items, oldest and least-frequently accessed are "evicted" to disk. This link from the Couchbase Manual explains more about how this works.
I think you are on the right track with Couchbase. In any case, its flexibility with scaling will easily allow you to tune the database to your application. I really don't think you can go wrong here.
Regarding your two questions:
Not in Couchbase 2.2
You should use relatively small document IDs. While it is true that they are stored in RAM, if you are using a significant percentage of the available cluster RAM just to store keys, your deployment is not "right-sized". This link talks about keys and gives details relevant to key size (e.g. 250-byte limit on size, metadata, etc.).
Basically what you are making a decision point on is sizing the Couchbase cluster for bucket RAM, and allowing a reduced residency ratio (% of document values in RAM), and using Cache Misses to pull from disk.
However, there are caveats in this scenario as well. You will basically also have relatively constant "cache eviction" where "not recently used" values are being removed from RAM cache as you pull cache missed documents from disk into RAM. This is because you will always be floating at the high water mark for the Bucket RAM quota. If you also simultaneously have a high write velocity (new/updated data) they will also need to be persisted. These two processes can compete for Disk I/O if the write velocity exceeds your capacity to evict/retrieve, and your SDK client will receive a Temporary OOM error if you actually cannot evict fast enough to open up RAM for new writes. As you scale horizontally, this becomes less likely as you have more Disk I/O capacity spread across more machines all simultaneously doing this process.
If when you say "queried" you mean querying indexes (i.e. Views), this is a separate data structure on disk that you would be querying and of course getting results back is not subject to eviction/NRU, but if you follow the View Query with a multi-get the above still applies. (Don't emit entire documents into your Index!)
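To make the residency-ratio and eviction discussion above more concrete, here is a toy model. It is explicitly not Couchbase's actual algorithm: it uses plain least-recently-used ordering instead of Couchbase's NRU bookkeeping, persists writes synchronously, and counts documents rather than bytes. All names (ToyBucket, max_resident_values, and so on) are made up for the illustration.

```python
from collections import OrderedDict


class ToyBucket:
    """All keys stay in memory; only some values are resident in RAM."""

    def __init__(self, max_resident_values):
        self.max_resident_values = max_resident_values
        self.keys = set()           # every document ID is always in memory
        self.ram = OrderedDict()    # resident values, least recently used first
        self.disk = {}              # persisted copy of every value
        self.cache_misses = 0

    def set(self, doc_id, value):
        self.keys.add(doc_id)
        self.disk[doc_id] = value   # simplification: persisted immediately
        self._make_resident(doc_id, value)

    def get(self, doc_id):
        if doc_id not in self.keys:
            raise KeyError(doc_id)
        if doc_id in self.ram:
            self.ram.move_to_end(doc_id)   # mark as recently used
            return self.ram[doc_id]
        self.cache_misses += 1             # value must come from disk
        value = self.disk[doc_id]
        self._make_resident(doc_id, value)
        return value

    def _make_resident(self, doc_id, value):
        self.ram[doc_id] = value
        self.ram.move_to_end(doc_id)
        while len(self.ram) > self.max_resident_values:
            # Evict the value only; the key and the disk copy remain.
            self.ram.popitem(last=False)

    def residency_ratio(self):
        return len(self.ram) / len(self.keys) if self.keys else 1.0
```

Running at a low residency ratio means every cache miss both reads from disk and pushes another value out of RAM, which is the constant eviction churn described above.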

Performance Optimization For Highly Interactive Websites

I recently completed development of a mid-trafficked website (peak 60k hits/hour). However, the site only needs to be updated once a minute, and achieving the required performance can be summed up in a single word: "caching".
For a site like SO where the data feeding the site changes all the time, I would imagine a different approach is required.
Page cache times presumably need to be short or non-existent, and updates need to be propagated across all the webservers very rapidly to keep all users up to date.
My guess is that you'd need a distributed cache to control the serving of data and pages that is updated on the order of a few seconds, with perhaps a distributed cache above the database to mediate writes?
Can those more experienced than I am outline some of the key architectural/design principles they employ to ensure that highly interactive websites like SO are performant?
The vast majority of sites have many more reads than writes. It's not uncommon to have thousands or even millions of reads to every write.
Therefore, any scaling solution depends on separating the scaling of the reads from the scaling of the writes. Typically scaling reads is really cheap and easy, scaling the writes is complicated and costly.
The most straightforward way to scale reads is to cache entire pages at a time and expire them after a certain number of seconds. If you look at the popular website Slashdot, you can see that this is the way they scale their site. Unfortunately, this caching strategy can result in counter-intuitive behaviour for the end user.
I'm assuming from your question that you don't want this primitive sort of caching. Like you mention, you'll need to update the cache in place.
This is not as scary as it sounds. The key thing to realise is that, from the server's point of view, Stack Overflow does not update all the time. It updates fairly rarely, maybe once or twice per second. To a computer, a second is nearly an eternity.
Moreover, updates tend to occur to items in the cache that do not depend on each other. Consider Stack Overflow as example. I imagine that each question page is cached separately. Most questions probably have an update per minute on average for the first fifteen minutes and then probably once an hour after that.
Thus, in most applications you barely need to scale your writes. They're so few and far between that you can have one server doing the writes; updating the cache in place is actually a perfectly viable solution. Unless you have extremely high traffic, you're going to get very few concurrent updates to the same cached item.
So how do you set this up? My preferred solution is to cache each page individually to disk and then have many web-heads delivering these static pages from some mutually accessible space.
When a write needs to be done, it is done from exactly one server, which updates that particular cached html page. Each server owns its own subset of the cache, so there isn't a single point of failure. The update process is carefully crafted so that a transaction ensures that no two requests write to the same file at the same time.
I've found this design has met all the scaling requirements we have so far required. But it will depend on the nature of the site and the nature of the load as to whether this is the right thing to do for your project.
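One common way to update a cached page on disk safely is sketched below. Note that this is a swapped-in technique, not necessarily the transaction mechanism described above: it writes the new HTML to a temporary file and then renames it over the old one, so readers see either the old page or the new page but never a half-written file. If two writers race, the last rename wins. The function names and directory layout are assumptions.

```python
import os
import tempfile


def write_cached_page(cache_dir, page_name, html):
    """Replace the cached copy of one page without exposing partial writes."""
    os.makedirs(cache_dir, exist_ok=True)
    final_path = os.path.join(cache_dir, page_name)
    # The temp file lives in the same directory so the rename stays on
    # one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp_file:
            tmp_file.write(html)
        # On POSIX systems this rename is atomic.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path


def read_cached_page(cache_dir, page_name):
    with open(os.path.join(cache_dir, page_name), encoding="utf-8") as page:
        return page.read()
```

The web-heads can then serve the files as plain static content, while only the owning server calls write_cached_page for a given page.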
You might be interested in this article which describes how wikimedia's servers are structured. Very enlightening!
The article links to this pdf - be sure not to miss it.

Resources