I have a question about the Redis cache and Laravel. By default Laravel uses the file driver, which caches views to a file and loads them from that cache.
Now here's the thing: I started using ElastiCache with Redis for my Laravel 5.4 project. If I change the driver to redis, it starts to cache (which I can tell from the loading time), but what does it actually cache? Does it automatically cache and retrieve my views? CSS? JS? Anything else?
I am also using Redis as the session driver; what does that give me?
Is it worth caching the database as well? I was planning to cache the whole database every hour and then, whenever a new item is added to the database, add it to the existing cache. Is that possible?
Redis can give you two advantages:
Faster data retrieval. Any memory-based cache (memcached is another example) has this advantage over a file-based or DB-based cache.
Flexible data storage. Redis supports many data types, such as strings, lists, sets, sorted sets and so on.
What should you cache?
Cache frequently requested data. If a client requests or queries something and you have no cache, you have to fetch it from your database, which costs disk I/O time. If the query is heavy, the I/O cost is larger and slows down your server. So the smart way is to query once, save the result into Redis using a suitable data type, and then serve the next thousands of requests from the cache. You do not need to cache the whole database, though; that would be crude and wasteful. When you update something in the DB, just delete it from the cache, and the next time someone queries it, it will be saved into the cache again.
Sessions. Session data is accessed very frequently by an HTTP server, so keeping every user's session in a cache is more lightweight than a file or DB if your app serves a very large number of people.
Static files. I have not actually dealt with this myself, but it can definitely be done. In a modern architecture there is often an HTTP server such as nginx in front of Laravel, and you would let nginx serve the static files directly. If you want to reduce the disk I/O for this, you can add a module like redis2-nginx-module to nginx to do the same thing: save the static file into Redis once and serve it thousands of times.
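The "query once, serve from cache, delete on update" flow described above can be sketched roughly as follows. This is a minimal illustration in Python, not Laravel code: the `ItemRepository` class, the `item:<id>` key format and the plain dicts standing in for Redis and the database are all assumptions made for the example.

```python
import json


class ItemRepository:
    """Cache-aside sketch. Plain dicts stand in for Redis and the database (assumption)."""

    def __init__(self, rows):
        self.db = dict(rows)   # pretend relational table: id -> row
        self.cache = {}        # pretend Redis: key -> JSON string
        self.db_reads = 0      # counts how often the "database" is hit

    def _query_db(self, item_id):
        self.db_reads += 1
        return self.db.get(item_id)

    def get(self, item_id):
        key = f"item:{item_id}"  # hypothetical key naming scheme
        cached = self.cache.get(key)
        if cached is not None:
            return json.loads(cached)          # cache hit: no database access
        item = self._query_db(item_id)          # cache miss: query once...
        if item is not None:
            self.cache[key] = json.dumps(item)  # ...and remember the result
        return item

    def update(self, item_id, fields):
        # Write to the database, then drop the stale cache entry.
        # The next get() repopulates the cache with the fresh row.
        self.db[item_id] = {**self.db[item_id], **fields}
        self.cache.pop(f"item:{item_id}", None)
```

With a real Redis client the two dict operations would become a GET/SET pair and a DEL, but the control flow stays the same.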
Correct me if I'm wrong, but from my understanding, "database caches" are usually implemented with an in-memory database that is local to the web server (same machine as the web server). Also, these "database caches" store the actual results of queries. I have also read up on the multiple caching strategies like - Cache Aside, Read Through, Write Through, Write Behind, Write Around.
For some context, the Write Through strategy looks like this:
[diagram: Write Through strategy]
and the Cache Aside strategy looks like this:
[diagram: Cache Aside strategy]
I believe that the "Application" refers to a backend server with a REST API.
My first question is, in the Write Through strategy (application writes to cache, cache then writes to database), how does this work? From my understanding, the most commonly used database caches are Redis or Memcached - which are just key-value stores. Suppose you have a relational database as the main database, how are these key-value stores going to write back to the relational database? Do these strategies only apply if your main database is also a key-value store?
In a Write Through (or Read Through) strategy, the cache sits in between the application and the database. How does that even work? How do you get the cache to talk to the database server? From my understanding, the web server (the application) is always the one facilitating the communication between the cache and the main database - which is basically a Cache Aside strategy. Unless Redis has some kind of functionality that allows it to talk to another database, I don't quite understand how this works.
Isn't it possible to mix and match caching strategies? From how I see it, Cache Aside and Read Through are caching strategies for application reads (user wants to read data), while Write Through and Write Behind are caching strategies for application writes (user wants to write data). Couldn't you have a strategy that uses both Cache Aside and Write Through? Why do most articles always seem to portray them as independent strategies?
What happens if you have a cluster of web servers? Do they each have their own local in-memory database that acts as a cache?
Could you implement a cache using a normal (not in-memory) database? I suppose this would still be somewhat useful since you do not need to make an additional network hop to the database server (since the cache lives on the same machine as the web server)?
Introduction & clarification
I think there is one misunderstanding here: the cache is NOT necessarily stored on the same server as the web server. Sometimes not even the database is separated onto its own server from the web server. If you think of APIs, like HTTP REST APIs, you can use caching to avoid spending too many resources on database connections & queries. Generally, you want to use as few database connections & queries as possible. Now imagine the following setting:
You have a web server that serves your application and a REST API, which the web server uses to work with some resources. Those resources come from a database (let's say a relational database) which is also stored on the same server. Now there is one endpoint which serves e.g. a list of posts (like blog posts). Every user can fetch all posts (to keep it simple in this example). This is a case where the API request could be cached, so that users do not all hit the database just to query the same resources (via the REST API) over and over again. This is where caching comes in. Redis is one of many tools which can be used for caching. Since Redis is a simple in-memory key-value store, you can just put all of your posts (remember the REST API) into the cache after the first DB query. All future requests for the posts list would first check whether the posts are already cached. If they are, the API returns the cached content for this specific request.
This is one simple example to show what caching can be used for.
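As a rough sketch of that posts-list example: the endpoint checks the cache first and only falls back to the database on a miss. The expiry time (TTL) is an addition not mentioned above, included as one common way to keep a cached list from going stale; `TTLCache`, `list_posts`, the `posts:all` key and the 60-second default are all hypothetical names and values.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a key-value cache with expiry (assumption, not Redis)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at)
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)


def list_posts(cache, load_posts_from_db, ttl_seconds=60):
    """Return the posts list, querying the database only when the cache has no fresh copy."""
    posts = cache.get("posts:all")
    if posts is None:
        posts = load_posts_from_db()
        cache.set("posts:all", posts, ttl_seconds)
    return posts
```

Here `load_posts_from_db` is whatever function runs the real query; every request within the TTL window is answered from memory.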
Answers on your question
My first question is, why would you ever write to a cache?
To reduce the amount of database connections and queries.
how is writing to these key-value stores going to help with updating the relational database?
It does not help you with updating; instead it helps you spend fewer resources. It also helps with "temporarily backing up" some data, but only as a very small side effect. There are more attractive solutions for that (Redis is primarily an in-memory store; it supports persistence, but that does not make it a backup).
Do these cache writing strategies only apply if your main database is also a key-value store?
No, it does not matter which database you use, whether it's a NoSQL or SQL DB. It strongly depends on what you want to cache and how the database and its tables are set up. Do your resources change frequently? Do resources get updated manually or only on user-initiated actions? Those are the questions that lead you to the right caching implementation.
Isn't it possible to mix and match caching strategies?
I am not an expert at caching strategies, but let me try:
I guess it is possible, but it also depends highly on what you are doing in your DB and what kind of application you have. Once you know what kind of application you are building, you will know which strategy to use. I guess it is also not recommended to mix those strategies, because they are coupled to your application type; in other words, it will probably not work out very well.
What happens if you have a cluster of web servers? Do they each have their own local in-memory database that acts as a cache?
I guess that both are possible. Usually you have one database, maybe clustered or synchronized with copies, to which your web servers (e.g. REST APIs) make their requests. Then either each of your API servers has its own cache, so that it does not have to query the database at all (in cloud-based applications your database may also be on another, separate server, which means another "hop" in terms of networking), OR (what I can also imagine) you have another middleware between your APIs (clustered) and your DB (maybe also clustered). But I guess that no one would do that because of the network traffic: it would result in a higher response time, which you usually want to avoid.
Could you implement a cache using a normal (not in-memory) database?
Yes, you could, but it would be way slower. A machine can access in-memory data faster than it can build up another (local) connection to a database and query your cached entries. The database also has to write the entries to files on your machine to persist the data.
Conclusion
All in all, it is all about fast response times and avoiding unnecessary network traffic. I hope that I could help you out a little bit.
I am a new developer and am trying to implement Laravel's (5.1) caching facility to improve the speed of my app. I started out caching a large DB table that my app constantly references - but it got too large so I have backed away from that and am now 'forever' caching smaller chunks of data - for example, for each page only the portions of that large DB table that are relevant.
I have watched 'Caching Essentials' on Laracasts, done some Googling and had a search in this forum (and Laracasts') but I still have a couple of questions:
I am not totally clear on how the cache size limits work when you are using Laravel's file-based system - is there an overall in-app size limit for the cache or is one limited size-wise only per key and by your server size?
What are the signs you should switch from file-based caching to something like Memcached or Redis - and what are the benefits of using one of those services? Is it the fact that your caching is handled on a different server (thereby lightening the load on your own)? Do you switch over to one of these services when your local, file-based cache gets too big for your server?
My app utilizes several tables that have 3,000-4,000 rows - the data in these tables is constantly referenced and will remain static unless I decide to add new options. I am basically looking for the best way to speed up queries to the data in these tables.
Thanks!
I don't think Laravel imposes any limitations on its file I/O at all. The limits will come from how much PHP can read from / write to a file at once, or hold in memory / process at any one time.
It does serialise the data that you cache, and unserialise it when you reload it, so your PHP environment would have to be able to process the entire cache file (which is equivalent to the top level cache key) at once. So, if you are getting cacheduser.firstname, it would have to load the whole cacheduser key from the file, unserialise it, then get the firstname key from that.
I would take the PHP memory limit (classic, I know!) as the first thing to investigate if you want to keep going down this road.
Caching services like Redis or memcached are bespoke, optimised caching solutions. They take some of the logic and responsibility out of your PHP environment.
They can, for example, retrieve sub-keys from items without having to process the whole thing, so they can retrieve part of some cached data in a memory-efficient way. For instance, if cacheduser is stored as a Redis hash, you can request just the firstname field and Redis returns only that attribute.
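The difference between the two access patterns can be illustrated with a small sketch. It is written in Python with JSON rather than PHP's serialize, and `HashStore` is only a stand-in that mimics the shape of Redis hash commands (HSET/HGET); none of these names come from Laravel.

```python
import json


def write_file_cache(path, entry):
    """File-style cache: the whole entry is serialised into one file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(entry, fh)


def file_cache_get(path, field):
    """To read one field, the whole file must be loaded and decoded first."""
    with open(path, "r", encoding="utf-8") as fh:
        entry = json.load(fh)  # entire entry ends up in memory
    return entry[field]


class HashStore:
    """Stand-in for a Redis hash (assumption): each field is stored and fetched on its own."""

    def __init__(self):
        self._hashes = {}

    def hset(self, key, field, value):
        self._hashes.setdefault(key, {})[field] = value

    def hget(self, key, field):
        # Only the requested field is returned; other fields are never touched.
        return self._hashes.get(key, {}).get(field)
```

If cacheduser also holds a large field, `file_cache_get(path, "firstname")` still decodes all of it, while `hget("cacheduser", "firstname")` does not.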
They have other advantages regarding tagging / clearing out subsets of caches (see [the cache tags Laravel docs](https://laravel.com/docs/5.4/cache#cache-tags)).
Another thing to think about is scaling. If your site is large enough and is load-balanced across multiple servers, the filesystem cache may differ between those servers, as each server can only check its own local filesystem for the cache files. A caching service can live on a different server (many hosts have a separate Redis / memcached service available), so it doesn't suffer from this issue.
Also - as I understand it (and this might be the most important thing), the file cache driver in Laravel is mainly for local development and testing. Although it can work fine for simple applications with basic caching needs, it's not intended for large scalable production environments.
Personally, I develop locally and test with file caching, as I'm only dealing with small amounts of data then, and use Redis to cache in production environments.
It doesn't necessarily need to be on a separate server to get the benefits. If you are never going to scale to multiple application servers, then using a caching service on the same server will already be a large improvement to caching large documents.
I'm trying to work out if I can take advantage of a caching layer in my web application or not (and if so, which technology).
Our web app has an internal and an external component, and I would like, if possible, to add an in-memory cache tier between the Web App and the DB Tier for the public external component. We are suffering DB performance issues and I want to alleviate stress on the DB as much as possible (plus make the public-facing side of the component lightning fast).
The external component offers a location search facility based on a post code. For example, you enter the post code for an area and get 50 results back each time. The data is relatively static: the DB might change (about 1 new record added per day), so I was thinking that if a cache tier were possible, I could invalidate the cache nightly and then load it again (as opposed to the cache aside pattern).
Questions:
1. Based on my overview above (a postcode mapping to multiple records, as JSON or serializable objects), can I use a cache tier to store the data in memory (total size of data ~100 MB, heaps of RAM free) and retrieve multiple records per post code using a "key-value data store" caching technology?
2. If number 1 above is feasible, which caching technology should I use? We are using a PHP front end. Zend Server has an in-memory cache but it doesn't look mature. I would prefer Redis over Memcached for caching; thoughts?
3. If pre-loading the cache nightly is not achievable, any thoughts on a better approach to utilise the cache?
4. If in-memory caching is not achievable at all (based on my requirements), should I look at optimising the DB (it's SQL Server), e.g. loading the search table into the SQL cache on SQL Server start-up?
5. Other: is there something I'm missing?
Thanks in advance, all comments welcome!
Cheers,
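For what it's worth, the nightly pre-load idea in this question maps naturally onto a key-value store: one key per post code, with the value holding all records for that post code as JSON. A minimal sketch, assuming rows are dicts with a `postcode` field and using a plain dict in place of Redis or Memcached (both assumptions, as are the function names and the `postcode:<code>` key format):

```python
import json
from collections import defaultdict


def build_postcode_cache(rows):
    """Group rows by post code and serialise each group under a single key.

    Intended to run once per night; the returned dict replaces the old cache wholesale.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["postcode"]].append(row)
    return {f"postcode:{code}": json.dumps(records) for code, records in grouped.items()}


def search(cache, postcode):
    """Look up all records for a post code with a single key read."""
    raw = cache.get(f"postcode:{postcode}")
    return json.loads(raw) if raw is not None else []
```

Building the new cache completely before swapping it in avoids serving a half-filled cache during the nightly reload.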
I understand that memcached is a distributed caching system. However, is it entirely necessary for memcached to replicate? The objective is to persist sessions in a clustered environment.
For example if we have memcached running on say 2 servers, both with data on it, and server #1 goes down, could we potentially lose session data that was stored on it? In other words, what should we expect to see happen should any memcached server (storing data) goes down and how would it affect our sessions in a clustered environment?
At the end of the day, will it be up to us to add some fault tolerance to our application? For example, if the key doesn't exist, possibly because one of the servers it was on went down, re-query and store back to memcached?
From what I'm reading, it appears to lean in this direction but would like confirmation: https://developers.google.com/appengine/articles/scaling/memcache#transient
Thanks in advance!
Memcached has its own fault tolerance built in, so you don't need to add it to your application. I think an example will show why this is the case. Let's say you have 2 memcached servers set up in front of your database (let's say it's MySQL). Initially, when you start your application, there will be nothing in memcached. When your application needs to get data, it will first check memcached, and if the data doesn't exist there, it will read it from the database and insert it into memcached before returning it to the user. For writes, you make sure that you insert the data into both your database and memcached. As your application continues to run, it will populate the memcached servers with a bunch of data and take load off your database.
Now one of your memcached servers crashes and you lose half of your cached data. What will happen is that your application will now be going to the database more frequently right after the crash and your application logic will continue to insert data into memcached except everything will go directly to the server that didn't crash. The only consequence here is that your cache is smaller and your database might need to do a little bit more work if everything doesn't fit into the cache. Your memcached client should also be able to handle the crash since it will be able to figure out where your remaining healthy memcached servers are and it will automatically hash values into them accordingly. So in short you don't need any extra logic for failure situations in memcached since the memcached client should take care of this for you. You just need to understand that memcached servers going down might mean your database has to do a lot of extra work. I also wouldn't recommend re-populating the cache after a failure. Just let the cache warm itself back up since there's no point in loading items that you aren't going to use in the near future.
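The behaviour described above (keys spread across servers by the client, a crash turning some hits into misses, and the cache warming itself back up) can be sketched like this. It is a simplified model: the `ShardedCache` class and the server names are invented for the example, dicts stand in for memcached servers, and plain modulo hashing is used for brevity, whereas real memcached clients often use consistent hashing so that fewer keys move when a server disappears.

```python
import hashlib


class ShardedCache:
    """Client-side sharding sketch. Each dict stands in for one memcached server (assumption)."""

    def __init__(self, server_names):
        self.servers = {name: {} for name in server_names}
        self.live = sorted(server_names)

    def _pick(self, key):
        # Stable hash of the key, mapped onto the servers that are currently alive.
        if not self.live:
            return None
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.live[int(digest, 16) % len(self.live)]

    def get(self, key):
        server = self._pick(key)
        return None if server is None else self.servers[server].get(key)

    def set(self, key, value):
        server = self._pick(key)
        if server is not None:
            self.servers[server][key] = value

    def mark_down(self, name):
        # Simulate a crash: the server leaves the pool and its data is lost.
        self.live.remove(name)
        self.servers[name].clear()


def get_or_load(cache, key, load_from_db):
    """Read path: try the cache, fall back to the database, then repopulate the cache."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        cache.set(key, value)
    return value
```

After `mark_down`, `get_or_load` keeps working unchanged: lost keys are simply reloaded from the database on first access and land on a surviving server.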
m03geek also made a post where he mentioned that you could also use Couchbase and this is true, but I want to add a few things to his response about what the pros and cons are. First off Couchbase has two bucket (database) types and these are the Memcached Bucket and the Couchbase Bucket. The Memcached bucket is plain memcached and everything I wrote above is valid for this bucket. The only reasons you might want to go with Couchbase if you are going to use the memcached bucket are that you get a nice web ui which will provide stats about your memcached cluster along with ease of use of adding and removing servers. You can also get paid support down the road for Couchbase.
The Couchbase bucket is totally different in that it is not a cache but an actual database. You can completely drop your backend database and just use this bucket type. One nice thing about the Couchbase bucket is that it provides replication and therefore prevents the cold cache problem that memcached has. I would suggest reading the Couchbase documentation if this sounds interesting to you, since there are a lot of features you get with the Couchbase bucket.
This paper about how Facebook uses memcached might be interesting too.
https://www.usenix.org/system/files/conference/nsdi13/nsdi13-final170_update.pdf
Couchbase's embedded memcached and "vanilla" memcached have some differences. One of them, as far as I know, is that Couchbase's memcached servers act as one. This means that if you store your key-value pair on one server, you'll be able to retrieve it from another server in the cluster. Vanilla memcached "clusters" are usually built with a sharding technique, which means that on the app side you need to know which server contains the desired key.
My opinion is that replicating memcached data is unnecessary. Modern datacenters provide almost 99% uptime, so if one of your memcached servers goes down someday, only some of your online users will need to log in again.
Also, on many websites you can see a "Remember me" checkbox that sets a cookie, which can be used to restore the session. If your users have that cookie, they will not even notice that one of your servers was down. (That's the answer to your question about "add some fault tolerance to our application".)
But you can always use something like haproxy and replicate all your session data on 2 or more independent servers. In this case, to store 1 user session you'll need N times more RAM, where N is the number of replicas.
Another way is to use Couchbase to store sessions. A Couchbase cluster supports replicas "out of the box" and it also stores data on disk, so if your node (or all nodes) suddenly shuts down or reboots, session data will not be lost.
Short answer: memcached with "remember me" cookie and without replication should be enough.
I am using the OnApp API in my website, and one page fetches all the servers in OnApp. For some users this list is very large; it extends up to a thousand in some cases. The response contains not only the data but other information too. I am also doing pagination, so for each page the API has to be called and the data populated. To increase the speed, I am now writing the response to a file and reading from it, but this also takes time. Is there any way to speed up this operation?
Before file caching, each page was taking around 45 seconds, and now it is reduced to 25. But this is still a high value. I am using the Symfony framework. I am using the following code for caching data to a file.
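// Try the file cache first; fall back to the API on a miss.
$userStatistics = false;
if (is_readable($filePath)) {
    // unserialize() returns false if the file content cannot be decoded.
    $userStatistics = unserialize(file_get_contents($filePath));
}
if ($userStatistics === false) {
    // Cache miss (or corrupt file): call the API and store the result.
    $userStatistics = $statisticsInstance->getList(1);
    file_put_contents($filePath, serialize($userStatistics), LOCK_EX);
}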
Is there any better method for achieving the same output with less loading time ?
First: 45 seconds is a lot to load a page. How many API calls are you making?
Second: Whoa, 25 seconds with all API calls already cached in the filesystem is absolutely huge, too. How many filesystem lookups does your page load perform? Are you sure all your API requests are cached when measuring the 25s page load?
In-Memory Caching:
Depending on the size of your data, I would certainly suggest storing your cached data in memory to speed up cache lookups. For caches around 1GB or less it shouldn't be an issue (depending on what server hardware/hosting provider you are running on). An excellent first choice is Memcache, which also happens to have good PHP support.
Running Memcache: When working locally on your computer you should have no problem installing memcached for yourself. When you upload your website to the server, you'll either need to ensure that Memcache is running on the same server, or ask your server hosting provider for details on how to connect to their memcache server. Most PHP hosting providers offer Memcache as part of the hosting. If they don't, you can use a hosted remote memcache provider like MemCachier, although again the latency to a remote server is going to slow down your cache lookups.
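One way to combine in-memory caching with the pagination from the question is to fetch the full server list once per user, keep it in the cache, and cut every page out of the cached list instead of calling the API again. The sketch below is in Python for illustration; `ServerListCache` and the `fetch_all` callable are hypothetical, and a dict stands in for Memcache. In a PHP application the list would have to live in Memcache itself, since PHP does not keep variables between requests.

```python
class ServerListCache:
    """Fetch each user's full list once, then paginate from memory (illustrative sketch)."""

    def __init__(self, fetch_all):
        self._fetch_all = fetch_all  # hypothetical callable: user_id -> full list from the API
        self._by_user = {}           # stand-in for Memcache: user_id -> cached list

    def page(self, user_id, page_number, per_page=50):
        servers = self._by_user.get(user_id)
        if servers is None:
            # Slow path: one API round trip, paid once per user until forget() is called.
            servers = self._fetch_all(user_id)
            self._by_user[user_id] = servers
        start = (page_number - 1) * per_page
        return servers[start:start + per_page]

    def forget(self, user_id):
        """Drop a user's cached list, e.g. after a server was added or removed."""
        self._by_user.pop(user_id, None)
```

Only the first page view pays the API cost; later pages are list slices, which should take milliseconds rather than seconds.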