Laravel Redis Caching - performance

I'm now working on a big project,we decided to use redis as cache in our system so,when we put some data in the cache and then the original data is changed,how could we know ? and what is the best practice in this case ? to delete the old data and replace the new one ? Is there any mechanism to replace just the changed part ?

Few things to keep in mind for caching for a large application using redis :
1) localise your cache as much as you can. For example if you have 5 information for every user that needs to be cached. Instead of accessing them all together make simple cache for each info.
2) choose the right data structure. Use redis' set, hash, sorted set and bit operations wherever possible.
3) make sure your system will work even if redis is not available (to overcome downtime). That is, check in redis if it's there serve, if not get from dB and populate in cache. So that even If redis is not available you will get values from DB
To answer your question, You can do it in three ways
1) you can maintain cache alongside your DB. During on success of transaction in the DB update the cache. So that you will not loose any information. But implementing this is bit difficult
2) whenever a transaction begins drop cache belongs to that. So that the values in the cache will be removed and will be fetched from DB during the successive read request.
3) maintain a last accessed or created time in both cache and DB. During every read compare them and decide. This is the most reliable solution.

Related

Springboot Ehcache load data

Service using SpringBoot, Maven, MongoDB, Ehcache.
Service requires a fast and frequently cache server, so eventually, I chose Ehcache.
All the cache will be called almost at the same frequency so there are no hot cold data in this case.
The original data in MongoDB will be updated every day by a timer service, so what I do is to load all the updated data to Ehcache every day.
Each item in this data has a connection with each other, like you use one to find the relevant Ids of the other. So if one cache is updated, but the other one hasn't, then you can't find these relevant Ids. I want to avoid this situation.
So my question is, is there any way to achieve a function like this, like using two Ehcache servers or something? i.e. When one is in use, the other one can load the data from MongoDB. When the update is done, switch it to the updated one. So every day when the MongoDB data updated, and I have to update the Ehcache data, it won't influence my current cache. That's just a thought I have. Another thought is something like a SQL transaction. Is there any other way to achieve this.
Please advise.
Good question. I see two ways.
One is to use an application lock. When you are ready to reload the cache, you block access to it and do it. There is no way to clear all caches are the same time. The problem is that everything will be blocked during the update.
The other way is to use an other cache. So you load the new cache with the new data and then swap the new cache and the expired one. The problem with this solution is that at a given moment you will take twice the memory since both caches are in memory.

Will Caching be useful when we need multiple items in one go

We are working on a ecom site, where admin can store some configuration on the combination of Product-Category-manufacturer or on Product-Category.
We have some reports, which can return 10000 Product's transactions (with 100-1000 unique combination of product-category-manufacturer ).
In this report, we also need to use configuration as well.
One option could be to fetch configurations from the same stored procedure for all unique Product-Category-manufacturer.
Another option could be to cache all these combination in some outproc cache (like redis). And once transaction data is fetched from stored procedure, system will pull the data from cache for all 1000 Product-Category-Feature combinations. But in this case, we will have to request cache 1000 times and if some of keys are not found in cache, we will have to hit database.
In fact there can be some combination where data does not exist in database. If we request for these combination, system will not find it in cache, and it will have to hit database every-time. To resolve this, we will have to form a set of all the Product-Category-Feature combination where there is data available in cache.
Could anybody suggest that if cache will be useful in this case?
We use caching mainly in 2 occasions,
To Reduce latency: Cache is closer to the client it takes less time for the resource to reach the client.
To Reduce network traffic: Most of the time we see that some resources are reusable but always fetch from original source which
is costly and make more unnecessary traffic. Adding a cache layer
solves this.
So to answer your question, "Will Caching be useful when we need multiple items in one go?" You have to think on the above 2 points. How much you are reusing (cache hit percentage). And cost difference between cache call and call to original source.
If your issue is getting 1000 items at once, Redis don't have issue providing that. It will be so much faster than the transnational DB. And you can have set of all the Product-Category-Feature combinations, its better as we will no have cache misses. However think about the size of the Redis DB, before you proceed.

Difference between In-Memory cache and In-Memory Database

I was wondering if I could get an explanation between the differences between In-Memory cache(redis, memcached), In-Memory data grids (gemfire) and In-Memory database (VoltDB). I'm having a hard time distinguishing the key characteristics between the 3.
Cache - By definition means it is stored in memory. Any data stored in memory (RAM) for faster access is called cache. Examples: Ehcache, Memcache Typically you put an object in cache with String as Key and access the cache using the Key. It is very straight forward. It depends on the application when to access the cahce vs database and no complex processing happens in the Cache. If the cache spans multiple machines, then it is called distributed cache. For example, Netflix uses EVCAche which is built on top of Memcache to store the users movie recommendations that you see on the home screen.
In Memory Database - It has all the features of a Cache plus come processing/querying capabilities. Redis falls under this category. Redis supports multiple data structures and you can query the data in the Redis ( examples like get last 10 accessed items, get the most used item etc). It can span multiple machine and is usually very high performant and also support persistence to disk if needed. For example, Twitter uses Redis database to store the timeline information.
I don't know about gemfire and VoltDB, but even memcached and redis are very different. Memcached is really simple caching, a place to store variables in a very uncomplex fashion, and then retrieve them so you don't have to go to a file or database lookup every time you need that data. The types of variable are very simple. Redis on the other hand is actually an in memory database, with a very interesting selection of data types. It has a wonderful data type for doing sorted lists, which works great for applications such as leader boards. You add your new record to the data, and it gets sorted automagically.
So I wouldn't get too hung up on the categories. You really need to examine each tool differently to see what it can do for you, and the application you're building. It's kind of like trying to draw comparisons on nosql databases - they are all very different, and do different things well.
I would add that things in the "database" category tend to have more features to protect and replicate your data than a simple "cache". Cache is temporary (usually) where as database data should be persistent. Many cache solutions I've seen do not persist to disk, so if you lost power to your whole cluster, you'd lose everything in cache.
But there are some cache solutions that have persistence and replication features too, so the line is blurry.
An in-memory Cache is a common query store therefore relieves DB of read Workloads. Common examples of in-memory cache are Redis cache. An example could be Web site storing popular searches made by clients thereby relieving the DB of some load.
In-memory Cache provides query functionality on top of caching (storing session data in RAM (temporary storage)).
Memcache falls in the temp store caching category.

Torquebox Infinispan Cache - Too many open files

I looked around and apparently Infinispan has a limit on the amount of keys you can store when persisting data to the FileStore. I get the "too many open files" exception.
I love the idea of torquebox and was anxious to slim down the stack and just use Infinispan instead of Redis. I have an app that needs to cache allot of data. The queries are computationally expensive and need to be re-computed daily (phone and other productivity metrics by agent in a call center).
I don't run a cluster though I understand the cache would persist if I had at least one app running. I would rather like to persist the cache. Has anybody run into this issue and have a work around?
Yes, Infinispan's FileCacheStore used to have an issue with opening too many files. The new SingleFileStore in 5.3.x solves that problem, but it looks like Torquebox still uses Infinispan 5.1.x (https://github.com/torquebox/torquebox/blob/master/pom.xml#L277).
I am also using infinispan cache in a live application.
Basically we are storing database queries and its result in cache for tables which are not up-datable and smaller in data size.
There are two approaches to design it:
Use queries as key and its data as value
It leads to too many entries in cache when so many different queries are placed into it.
Use xyz as key and Map as value (Map contains the queries as key and its data as value)
It leads to single entry in cache whenever data is needed from this cache (I call it query cache) retrieve Map first by using key xyz then find the query in Map itself.
We are using second approach.

How is memcached updated?

I have never used memcached before and I am confused on the following basic question.
Memcached is a cache right? And I assume we cache data from a DB for faster access. So when the DB is updated who is responsible to update the cache? Our code is does memcached "understand" when the DB has been updated?
Memcached is a cache right? And I assume we cache data from a DB for
faster access
Yes it is a cache, but you have to understand that a cache speed up the access when you are often accessing same data. If you access thousand times data/objects which are always different each other a cache doesn't help.
To answer your question:
So when the DB is updated who is responsible to update the cache?
Always you but you don't have to worry about if you are doing the right thing.
Our code is does memcached "understand" when the DB has been updated?
memcached doesn't know about your database. (actually the client doesn't know even about servers..) So when you use an object of your database you should check if is present in cache, if not you put in cache otherwise you are fine.. that is all. When the moment comes memcache will free the memory used by old data, or you can tell memcached to free data after a time you choose(read the API for details).
You are responsible to update the cache (or some plugin).
What happens is that the query is compressed to some key features and these are hashed. This is tested against the cache. If the value is in the cache, the data is returned directly from cache. Otherwise the query is performed, stored in cache and returned to the user.
In pseudo code:
key = query_key(your_sql_query)
if key in cache:
return cache.get(key)
else:
results = execute(your_sql_query)
cache.set(key, results, time_to_live)
return results.
The cache is cleared once in a while, you can give a time to live to a key, then your cached results are refreshed.
This is the most simple model, but can cause some inconsistencies.
One strategy is that if your code is also the only app that updates data, then your code can also refresh memcached as a second step after it has updated the database. Or at least evict the stale data from memcached, so the next time an app wants to read it, it will be forced to re-query the current data from the database and restore that latest data to memcached.
Another strategy is to store data in memcached with an expiration time, so memcached automatically purges that data element after a certain time. You pick the expiration time, based on your knowledge of how frequently the data might be updated, and how tolerant your app is of reading stale data.
So ultimately, you are the one responsible for putting data into memcached. Only you know what data is worth storing in the cache, what format you want to store it in, how frequently you expect to query it, and when to refresh it. You make this judgment on a case-by-case basis, because you know better than any automatic system the likely behavior of your data and your app.

Resources