How long does the cache in Unity hold objects?

I am loading asset bundles from a server at runtime with LoadFromCacheOrDownload().
I am wondering how long the asset bundles are stored in the cache. Are they still there after a restart?
Should I also save them to the filesystem, or is the cache enough?
Thank you.

By default, cached data sticks around for 150 days before being deleted as unused, so if you don't clear it before then it will most likely still be there, even across restarts. Caching behaviour also depends on the cache size, which is 50 MiB for the web and 4 GiB for other platforms; once the cache fills up, the least recently used bundles are deleted to make room.
With this in mind, it's up to you to decide whether the cache (and its behaviour) suffices, or whether you would be better off storing the data yourself as well.
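For completeness, a minimal coroutine sketch of the cache-or-download flow (the URL and version number here are placeholders; bumping the version is what forces a fresh download past the cache):

using UnityEngine;
using System.Collections;

public class BundleLoader : MonoBehaviour
{
    IEnumerator Start()
    {
        // Version 1 of the bundle; bump the number to bypass the cached copy.
        WWW www = WWW.LoadFromCacheOrDownload("http://example.com/my.unity3d", 1);
        yield return www;
        if (string.IsNullOrEmpty(www.error))
        {
            AssetBundle bundle = www.assetBundle; // served from cache when available
            // ... load assets, then call bundle.Unload(false) when done.
        }
    }
}

If you ever need a hard reset, Caching.CleanCache() wipes everything Unity has cached.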

Related

Redis memory management - clear based on key, database or instance

I am very new to Redis. I've implemented caching in our application and it works nicely. I want to store two main data types: a directory listing and file content. It's not really relevant, but this will cache files served up via WebDAV.
I want the file structure to remain almost forever. The file content needs to be cached for a short time only. I have set up my expiry/TTL to reflect this.
When the server reaches memory capacity, is it possible to prioritise certain cached items over others? I.e. flush a key, flush a whole database, or flush a whole instance of Redis.
I want to keep my directory listing and flush the file content when memory begins to be an issue.
EDIT: This article seems to describe what I need. I think I will need to use volatile-ttl. My file content will have a much shorter TTL set, so in theory that should be cleared first. If anyone has any other helpful advice I would love to hear it, but for now I am going to implement this.
The article above describes what I needed. I have implemented volatile-ttl as my memory-management policy.
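For anyone following the same route, the relevant knobs are the maxmemory pair in redis.conf (values here are illustrative):

maxmemory 256mb
maxmemory-policy volatile-ttl

Then give file-content keys a short TTL (e.g. SET file:/docs/a.txt <content> EX 300) and directory-listing keys a long one (e.g. EX 2592000). When the memory limit is reached, volatile-ttl evicts keys that have an expiry set, shortest remaining TTL first, so the short-lived file content is dropped before the long-lived listings.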

OLAP Saiku Cache expires

I'm using Saiku and PHPAnalytics to run MDX queries on my cube.
It seems that if I run queries, all is good and caching works fine. But if I come back after two hours and run those queries again, the cache is not used. Why? I need the cache to be kept for a long time! I tried adding this to mondrian.properties: mondrian.rolap.CachePool.costLimit = 2147483647
But it didn't help. What can I do?
Mondrian's default in-memory cache stores things in a WeakHashMap, which means it can be cleared at the discretion of the JVM's garbage collector. Most application servers are set up to do a periodic garbage collection sweep (usually every hour or so). One option is to tweak your JVM's configuration so this happens less often, for example by stretching the RMI-triggered GC interval:
-Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000
You can also write your own cache implementation against the SegmentCache SPI. If your implementation uses hard references, they will never be collected. This is trickier to do and will require quite a bit of studying to get right; you can start by taking a look at the default implementation and work from there.
The Mondrian cache should keep entries until the cache is deliberately flushed. That said, it uses an aging system: if it runs out of memory to store the data, the oldest query is pushed out of the cache and replaced.
I've not tried the PHPAnalytics stuff, but maybe it makes a call into the Saiku server that flushes the cache on a regular basis; otherwise this shouldn't happen.

How to keep your distributed cache clean?

In a N-Tier architecture, what would be the best patterns to use so that you can keep your cache clean?
I know it's easy to just set an absolute/sliding timeout, but is there a better mechanism available that allows you to mark your cache as dirty after you update the underlying persistence?
The difficulty I'm trying to wrap my head around is that caches are usually stored as KVPs, but a query is usually a fair bit more complex than that. So how can the gateway service tell the cache store that, for such and such a query, it needs to refetch from persistence?
I also can't afford to hand-code the cache update per query. I'm looking for a more systematic approach.
Is this just a pipe dream, or is there some way to do this elegantly?
Link/Guide/Post appreciated.
I have worked with AppFabric, and I think I tried to do what you are asking about. I was working on an auction site and wanted to pro-actively invalidate items in the cache.
For example, we had listings (things for sale), and they would be present all over the cache (AppFabric); the data that represented one listing was in 10 different places. What I initially wanted was a way to say, "OK, my listing has changed. Let me go find everywhere it exists in the cache and update it." (I think you say "mark as dirty" in your question.)
I found doing this incredibly difficult. There are tags in AppFabric that I tried to use: I would mark a given object (or collection of objects) with a tag, and that would let me query the cache and remove items. In other words, if an object had a LISTING tag, I would find it and invalidate it.
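For reference, this is roughly what the tag approach looked like in code (a sketch from memory; the cache, region, key, and tag names are illustrative, and note that AppFabric only supports tag lookups inside a named region):

using Microsoft.ApplicationServer.Caching;

class ListingInvalidation
{
    static void Example(object pageOfResults)
    {
        DataCache cache = new DataCacheFactory().GetDefaultCache();
        cache.CreateRegion("listings"); // tags only work within a region

        // Tag every cached object that contains listing 42.
        var tag = new DataCacheTag("LISTING-42");
        cache.Put("search-results-p1", pageOfResults, new[] { tag }, "listings");

        // When listing 42 changes, find and evict every tagged copy.
        foreach (var kvp in cache.GetObjectsByTag(tag, "listings"))
            cache.Remove(kvp.Key, "listings");
    }
}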
Eventually I settled on a two-pronged attack.
For 95% of the data, I let it expire. It was a happy day when I decided this, because everything became much easier to develop. I had to make some concessions in the UI etc., but it was well worth it.
For the last 5% of the data, I resolved to only ever store it once. For example, a bid on a listing: whenever a new bid came in, we would pro-actively invalidate that one object, and everything that needed that information would then be updated as well.

What should be stored in cache for web app?

I realize this might be a vague question that begets a vague answer, but I'm in need of some real-world examples, thoughts, and/or best practices for caching data for a web app. All the examples I've read are more technical in nature (how to add or remove cache data from the respective cache store), but I've not been able to find a higher-level strategy for caching.
For example, my web app has an inbox/mail feature for each user. What I've been doing to date is storing typical session data in the cache: when the user logs in, I go to the database, retrieve the user's mail messages, and store them in the cache. I'm beginning to wonder if I should instead maintain a copy of all users' messages in the cache all the time and just retrieve them from the cache when needed, rather than loading from the database at login. I have a bunch of other data that's loaded at login as well (product catalogs and related entities), and login is starting to slow down.
So I guess my question to the community, is what would you do/recommend as an approach in this scenario?
Thanks.
This might be better suited to https://softwareengineering.stackexchange.com/, but generally you want to cache:
Metadata/configuration data that does not change frequently. E.g. country/state lists, external resource addresses, logic/branching settings, product/price/tax definitions, etc.
Data that is costly to retrieve or generate and that does not need to frequently change. E.g. historical data sets for reports.
Data that is unique to the current user's session.
The last item above is where you need to be careful, as you can drastically increase your app's memory usage by adding a few megabytes of data for every active session. It also implies different levels of caching: application-wide, per user session, etc.
Generally you should NOT cache data that is under active change.
In larger systems you also need to think about where the cache(s) will sit. Is it possible to have one central cache server, or is it good enough for each server/process to handle its own caching?
Also: you should have some method to quickly reset/invalidate the cached data. For a smaller or less mission-critical app, this could be as simple as restarting the web server. For the large system that I work on, we use a 12 hour absolute expiration window for most cached data, but we have a way of forcing immediate expiration if we need it.
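To make that concrete, a small sketch using .NET's in-process MemoryCache (the key name and the 12-hour window are illustrative; a central cache server would use its own equivalent calls):

using System;
using System.Runtime.Caching;

static class AppCache
{
    static readonly ObjectCache Cache = MemoryCache.Default;

    public static void StoreCountryList(object countries)
    {
        // Slow-changing data gets a 12-hour absolute expiration window.
        Cache.Set("country-list", countries, DateTimeOffset.Now.AddHours(12));
    }

    public static void ForceExpire()
    {
        // Targeted, immediate invalidation when the data must refresh now.
        Cache.Remove("country-list");
    }
}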
This is a really broad question, and the answer depends heavily on the specific application/system you are building. I don't know enough about your specific scenario to say whether you should cache all the users' messages, but instinctively it seems like a bad idea, since you would effectively be caching your entire data set. This could lead to problems when new messages come in or get deleted: would you then update them in the cache? Would that not simply duplicate the backing store?
Caching is only a performance optimization technique, and as with any optimization, measure first before making substantial changes, to avoid wasting time optimizing the wrong thing. Maybe you don't need much caching, and it would only complicate your app. Maybe the data you are thinking of caching can be retrieved in a faster way, or less of it can be retrieved at once.
Cache anything that causes duplicate database queries.
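In practice that usually means a get-or-load ("cache-aside") helper in front of each query; a sketch, where the loader delegate stands in for your actual database call:

using System;
using System.Runtime.Caching;

static class CacheAside
{
    static readonly ObjectCache Cache = MemoryCache.Default;

    // Return the cached value, or run the query once and cache its result.
    public static T GetOrLoad<T>(string key, Func<T> load, TimeSpan ttl)
    {
        object hit = Cache.Get(key);
        if (hit != null) return (T)hit;
        T value = load(); // the one database round-trip
        Cache.Set(key, value, DateTimeOffset.Now.Add(ttl));
        return value;
    }
}

Called as, say, CacheAside.GetOrLoad("inbox:42", () => LoadInbox(42), TimeSpan.FromMinutes(10)), so repeated hits for the same key never touch the database within the TTL.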
Client-side file caching is important as well. Assuming files are marked with an id in your database, cache them as they are fetched to avoid repeated network requests for the same file. A resource for doing this can be found here (https://developer.mozilla.org/en-US/docs/Web/API/IndexedDB_API); the browser's Cache API covers the same if-cached-else-fetch pattern, as in the sketch below. If you don't need to cache files, web storage, local storage, and cookies are good for smaller pieces of data.
async function getFile(url) {
  const cache = await caches.open('files');
  const hit = await cache.match(url);  // if file is in cache
  if (hit) return hit;                 // refer to cache
  const res = await fetch(url);        // else make network request
  cache.put(url, res.clone());         // and push file to cache
  return res;
}

CMS - Save pictures in database, what is the proper structure?

I am currently building a CMS that needs to save a lot of pictures per article, and I have a lot of questions :-)
I need to show the pictures in a few sizes, with or without a watermark. In addition, I need to keep the original picture for archive and admin purposes. What I am thinking of doing right now is saving the pictures in the database in two versions: 1. the original picture, 2. a web-optimized version.
Saving all the images in a table is really convenient, but is it actually a good idea? Say the database comes to hold hundreds of thousands of pictures at around 3 MB each for the originals; that is already hundreds of gigabytes, and it only grows from there. Is this really a good strategy?
On the other hand, I save a smaller version of each picture, and this version needs to be shown in a few sizes, with and without a watermark. Currently I am thinking of doing this on each request: the request will carry a width parameter, and from that I can decide the size and the watermark (I'll cache the result, of course). Again, is this a good strategy? Will it actually work, or is it prohibitively expensive extra work?
And is it really better to keep all this in the db? Each request for an article will trigger around 50 further requests for its images, and each of those requires opening and closing a connection to the database.
Technologies I am going to use: .NET, SQL Server 2008, NHibernate.
The best approach would be storing the images in the filesystem and their ids in the database, for performance and maintenance reasons. Backing up and restoring is much easier on a filesystem, and pushing the DBMS to do such work is not the best idea: you would have to transfer each image from the db to the application and then push it to the client, and I just don't believe that's its job. Put up a lighttpd daemon or something similar for image hosting and let it do its job.
But if you like the idea, since you are going with SQL Server 2008 you can use FILESTREAM to store your images in your tables. It will create files in a storage location you choose, keeping the binary data in the filesystem while still providing transactional features and data integrity, which is a big bonus. Take a look at that option; as I remember it performs well, and the actual database stays much more compact.
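For a sense of the shape, a minimal table definition (this assumes FILESTREAM is enabled on the instance and the database has a FILESTREAM filegroup; names are illustrative):

CREATE TABLE Pictures (
    Id   UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWID(),
    Name NVARCHAR(255)    NOT NULL,
    Data VARBINARY(MAX)   FILESTREAM NULL
);

SQL Server keeps the blob bytes out in the NTFS filegroup while the row stays small, and the data still participates in transactions and backups.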
About the dynamic resizing: I say avoid it. Storage is cheaper than CPU time, so create the various thumbnail and watermarked versions once, at upload time, store them somewhere, and use them when required; do not perform the same operations again and again. Alternatively, you may do it at the first request for each resized version, which makes it easier to add new versions later or to purge the cache periodically and remove unused files. Either way you will also be able to back up just the originals.
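Along those lines, a resize-once-at-upload sketch using System.Drawing (paths and sizes are placeholders):

using System.Drawing;
using System.Drawing.Drawing2D;

static class ThumbnailMaker
{
    // Generate each required size once, at upload time, and store the result.
    public static void SaveResized(string srcPath, string destPath, int width)
    {
        using (var src = Image.FromFile(srcPath))
        {
            int height = src.Height * width / src.Width; // keep aspect ratio
            using (var dst = new Bitmap(width, height))
            using (var g = Graphics.FromImage(dst))
            {
                g.InterpolationMode = InterpolationMode.HighQualityBicubic;
                g.DrawImage(src, 0, 0, width, height);
                dst.Save(destPath);
            }
        }
    }
}

Each size/watermark combination becomes one stored file, so serving an image is a plain static read.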
Putting the images in the database has a couple of advantages: ACID transactions and backup consistency come to mind. If you absolutely need those, then put the images in the database. As you pointed out, this comes at a price: you will need a huge database infrastructure (machines, licenses, an operations team), and each image retrieval is a heavy DB I/O effort.
A lot of things will be much easier with only storing metadata in the DB and putting the image blobs on a filesystem.
Two approaches to come to a decision:
What is the killer feature you absolutely (as in "if I don't have it, the whole thing will not work at all") need from the image-in-database approach? If there is one, go for it.
Do a back-of-the-napkin business case, calculating the total cost of the image-in-database approach (project effort, infrastructure, machines, licenses, operations) and compare it with the image-in-filesystem approach. That should give you some hints on how to proceed.
