Cache invalidation techniques when storing rich objects

There are times when you store entire object graphs in a cache, or objects with collections, which makes cache invalidation a little tricky.
What techniques are there for knowing when to invalidate a cache?
For simple objects you can invalidate whenever you update or save the object: simply make an extra call and refresh the cache.
When you have a rich object like, for example:
User
    Locations
    Sales
    History
Now this user object will become 'dirty' whenever the user properties, or Location/Sales/History collection data is mutated.
I think one simple method would be to update the 'modified_date' property of the user object and keep the modified_date as part of the cache key. You would make a call to get the user row, then pull the object graph from the cache using a key built from its modified_date:
user_cache_key + user.id + user.modified_date
The only problem with this approach is that you have to make sure you update the 'modified_date' whenever any of the object's dependencies are updated.
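The modified_date idea can be sketched roughly as follows. This is a minimal Python illustration, not tied to any particular cache library: the `User` class, the `touch()` helper, the in-memory `cache` dict and the `build_graph` callback are all hypothetical names introduced here. Routing collection changes through a method that calls `touch()` is one way to make sure the timestamp is not forgotten.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class User:
    id: int
    name: str
    modified_date: datetime
    locations: list = field(default_factory=list)

    def touch(self):
        # Bump the timestamp; any cache key derived from it changes too.
        self.modified_date = datetime.now(timezone.utc)

    def add_location(self, location):
        # Mutations of dependent collections go through the owner,
        # so the timestamp update cannot be skipped by accident.
        self.locations.append(location)
        self.touch()


def user_cache_key(user):
    # Same shape as in the question: prefix + id + modified_date.
    return f"user_graph:{user.id}:{user.modified_date.isoformat()}"


cache = {}  # stand-in for memcached/Redis (assumption)


def load_user_graph(user, build_graph):
    # Old entries are never overwritten; they simply stop being requested
    # once the key changes, and the cache's eviction policy reclaims them.
    key = user_cache_key(user)
    if key not in cache:
        cache[key] = build_graph(user)
    return cache[key]
```

With this shape, the only database read needed before hitting the cache is the user row itself, since it carries the timestamp used in the key.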
Are there any other possible solutions to this problem?

Related

How to decide objects passed to generate cache-key in rabl rails 4

We use rabl for our view templates with Rails 4. Recently we have been experimenting with caching these rabl views. We know that caching is done by adding the following line to the views:
cache some_object
My question is how to decide what this some_object should be.
For example, we have a view that returns videos and includes info about their associated products. Some data in this view is derived from a user object and must not be shared across users, since caching it would return incorrect data for requests from different users. So I understand that I need to pass the user object for the cache key. But what other objects should I pass to generate the cache key for this view and get the best performance?
It depends on what the view contains. If it has a user and a video, I would add both the user and the video to the cache key. It all depends on what is important to you. More objects in the cache key means the cache is invalidated more often, because the key changes whenever the user or the video changes.
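The trade-off in this answer can be illustrated with a small language-neutral sketch (written in Python rather than Ruby, and not rabl's actual key generation). The `Record` tuple and `composite_cache_key` function are hypothetical; the point is only that every object folded into the key adds one more reason for the key to change.

```python
from collections import namedtuple

# Hypothetical stand-in for a model row: a kind, an id and a timestamp.
Record = namedtuple("Record", ["kind", "id", "updated_at"])


def composite_cache_key(prefix, *records):
    # One key segment per object; a change to any object's
    # updated_at produces a different key, i.e. a cache miss.
    parts = [f"{r.kind}/{r.id}-{r.updated_at}" for r in records]
    return "/".join([prefix, *parts])


video = Record("video", 10, 1700000000)
user = Record("user", 3, 1700000500)

# Shared by all users; only changes when the video changes.
video_only_key = composite_cache_key("videos/show", video)

# One entry per (video, user) pair; changes when either one changes.
per_user_key = composite_cache_key("videos/show", video, user)
```

Keying on the video alone gives the best hit rate but is only safe if nothing user-specific ends up in the cached fragment.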

Key-based cache expiration

I'd like to discuss a post on the 37signals blog named How key-based cache expiration works. I'm a Django developer, not RoR, so here is a Django "translation" by Ross Poulton: Key-based cache expiration with Django.
As you can see, the main idea is the following: we have a "Russian doll" structure, where one object contains several levels of other objects.
class A:
    timestamp updated_at;
class B:
    A parent;
    timestamp updated_at;
class C:
    B parent;
    timestamp updated_at;
The view (for example, HTML) of an object of class A is cached together with all related objects. When an object of class C is updated, we need to:
Update timestamp in C.
Update timestamp in B.
Update timestamp in A.
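The three timestamp updates above amount to walking the parent chain. A minimal sketch, assuming a hypothetical `Node` class with a `parent` reference and an integer `updated_at` (a real model would use a database timestamp and save each row):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    parent: Optional["Node"] = None
    updated_at: int = 0

    def touch(self, now):
        # Update this object and every ancestor, so that each
        # enclosing cached view gets a new key as well.
        node = self
        while node is not None:
            node.updated_at = now
            node = node.parent


a = Node("A")
b = Node("B", parent=a)
c = Node("C", parent=b)
c.touch(now=5)  # a, b and c now all carry updated_at == 5
```

Siblings of the touched object keep their old timestamps, so their cached fragments remain valid.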
When we access the view of class A after this, we need to:
Make a SELECT to get the timestamp from A.
Check that there is no cached object with this timestamp, so we need to recache it.
Make a SELECT to get A's data.
Make a SELECT to get all timestamps from the Bs.
Get the Bs that exist in the cache.
Make a SELECT to get the Bs that do not exist in the cache.
Make a SELECT to get all timestamps of the Cs related to the Bs that do not exist in the cache.
Get the Cs from the cache, if they exist.
Make a SELECT to get the Cs that do not exist in the cache.
So, if I understand this strategy correctly, we need to make 6 queries to the DB, 2 for each object: one to get the timestamps, and a second to get the objects that are outdated in the cache.
Instead, if we simply reload all the data, we need to make only 3 queries:
Get object A.
Get related objects B.
Get related objects C.
As far as I know, it's often better to execute 3 queries with more data than 6 queries with less. So is this strategy effective?
Of course, we could store the timestamps in the cache too, but then we would face the problem of invalidating the timestamps. It makes no sense to invalidate data for a strategy that is meant to avoid invalidation.
Please correct me if I am wrong in my understanding of the scope or the working principle of this algorithm.
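To make the comparison concrete, here is a small sketch of nested ("Russian doll") fragment caching that records which fragments are actually rebuilt. It is an illustration under assumptions, not the 37signals or Django implementation: `Node`, `fragment_cache`, `rendered` and `render` are hypothetical names, the cache is a plain dict, and database queries are not modeled at all.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.updated_at = 0
        self.parent = None
        self.children = []

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def touch(self, now):
        # Propagate the new timestamp up to the root.
        node = self
        while node is not None:
            node.updated_at = now
            node = node.parent


fragment_cache = {}  # (name, updated_at) -> rendered fragment
rendered = []        # names of fragments that had to be rebuilt


def render(node):
    key = (node.name, node.updated_at)
    if key in fragment_cache:
        return fragment_cache[key]  # hit: the subtree is not visited
    rendered.append(node.name)
    inner = "".join(render(child) for child in node.children)
    html = f"<{node.name}>{inner}</{node.name}>"
    fragment_cache[key] = html
    return html
```

In this sketch, after one leaf is touched only the fragments on the path from that leaf to the root are rebuilt; untouched sibling subtrees are served from the cache without being visited. Whether that saves more than the extra timestamp lookups cost depends on how expensive loading and rendering each level is.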

Entity Framework and caching - Changes are tracking back to cache

I have some data being pulled in from an Entity model. This contains attributes of items, let's say car parts with max-speed, weight and size. Since there are a lot of parts and the base attributes never change, I've cached all the records.
Depending on the car these parts are used in, these attributes might now be changed, so I setup a new car, copy the values from the cached item "Engine" to the new car object and then add "TurboCharger", which boosts max speed, weight and size of the Engine.
The problem I'm running into is that it seems that the Entity model is still tracking the context back to the cached data. So when weight is increased by the local method, it increases it for all users. I tried adding "MergeOption.NoTracking" to my context as this is supposed to remove all entity tracking, but it still seems to be tracking back. If I turn off the cache, it works fine as it pulls fresh values from the database each time.
If I want to copy a record from my entity model, is there a way I can say "Copy the object but treat it as a standard object with no history of coming from entity" so that once my car has the attributes from an item, it is just a flattened object?
Cheers!
I'm not too sure about MergeOption.NoTracking on the whole context and exactly what that does, but what you can do as an alternative is add .AsNoTracking() to your query against the database. This will definitely return a detached object.
Take a look here for some details on AsNoTracking usage : http://blog.staticvoid.co.nz/2012/04/entity-framework-and-asnotracking.html.
The other thing is to make sure you enumerate your collection before you insert it into the cache, to ensure that you aren't caching the queryable itself, i.e. use .ToArray().
The other option is to manually detach the object from the context (using Detach(T entity)).
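Independently of entity tracking, the symptom described in the question (a change made for one car showing up for all users) is also what happens when every caller receives the same cached instance and mutates it. A language-neutral sketch of handing out copies instead, written in Python rather than C# and using hypothetical names (`Part`, `part_cache`, `get_part_for_car`):

```python
import copy
from dataclasses import dataclass


@dataclass
class Part:
    name: str
    max_speed: int
    weight: int


# Shared, read-only base data (hypothetical values).
part_cache = {"Engine": Part("Engine", max_speed=200, weight=150)}


def get_part_for_car(name):
    # Return a private copy so per-car modifications never
    # reach the instance that other requests read from the cache.
    return copy.deepcopy(part_cache[name])


engine = get_part_for_car("Engine")
engine.weight += 20  # e.g. the effect of adding a turbocharger
```

The cached `Engine` keeps its base weight while the car's copy carries the modified value.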

How can I update an object/entity that is not completely filled out?

I have an entity with several fields, but on one view I want to edit only one of the fields. For example, I have a user entity; the user has id, name, address, username, pwd, and so on. On one of the views I want to be able to change the pwd (and only the pwd), so the view only knows the id and sends the pwd. I want to update my entity without loading the rest of the fields (there are many, many more), changing the one pwd field, and then saving them ALL back to the database. Has anyone tried this, or know where I can look? All help is greatly appreciated.
Thx in advance.
PS
I should have given more detail. I'm using Hibernate, and Roo is creating my entities. I agree that each view should have its own entity; the problem is that I'm only building controllers, and everything else was done before. We were using finders from the service layer, but we wanted to use some other finders that did not seem to be accessible through the service layer, so the decision was made to blow away the service layer and just interact with the entities directly (through the finders). UserService.update(user) is no longer an option. I have recently found User.persist() and User.merge(). Does merge update all the fields on the object or only the ones that are not null? And if I want one to now be null, how would it know the difference?
Which technologies except Spring are you using?
First of all have separate DTOs for every view, stripped only to what's needed. One DTO for id+password, another for address data, etc. Remember that DTOs can inherit from each other, so you can avoid duplication. And never pass business/ORM entities directly to view. It is too risky, leaks in some frameworks might allow users to modify fields which you haven't intended.
After the DTO comes back from the view (most web frameworks work like this) simply load the whole entity and fill only the fields that are present in the DTO.
But it seems like it's the persistence that is troubling you. Assuming you are using Hibernate, you can take advantage of dynamic-update setting:
dynamic-update (optional - defaults to false): specifies that UPDATE SQL should be generated at runtime and can contain only those columns whose values have changed.
In this case you are still loading the whole entity into memory, but Hibernate will generate as small UPDATE as possible, including only modified (dirty) fields.
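The idea behind dynamic-update can be sketched as a comparison between the state that was loaded and the current state. This is a conceptual illustration in Python, not Hibernate's implementation; `build_dynamic_update` and the column names are hypothetical, and both dicts are assumed to contain the same columns.

```python
def build_dynamic_update(table, key_column, snapshot, current):
    """Build an UPDATE containing only the columns whose values changed.

    snapshot: column values as loaded from the database
    current:  column values after the application modified the entity
    """
    changed = {
        column: value
        for column, value in current.items()
        if column != key_column and snapshot[column] != value
    }
    if not changed:
        return None  # nothing is dirty, so no statement is needed
    assignments = ", ".join(f"{column} = ?" for column in changed)
    sql = f"UPDATE {table} SET {assignments} WHERE {key_column} = ?"
    return sql, [*changed.values(), current[key_column]]


loaded = {"id": 7, "name": "ann", "pwd": "old", "address": "x"}
edited = dict(loaded, pwd="new")
statement = build_dynamic_update("users", "id", loaded, edited)
```

Because the comparison is against the loaded snapshot rather than against null, setting a field to null counts as a change like any other.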
Another approach is to have separate entities for each use-case/view. So you'll have an entity with only id and password, entity with only address data, etc. All of them are mapped to the same table, but to different subset of columns. This easily becomes a mess and should be treated as a last resort.
See the hibernate reference here
For persist()
persist() makes a transient instance persistent. However, it does not guarantee that the identifier value will be assigned to the persistent instance immediately, the assignment might happen at flush time. persist() also guarantees that it will not execute an INSERT statement if it is called outside of transaction boundaries. This is useful in long-running conversations with an extended Session/persistence context.
For merge
if there is a persistent instance with the same identifier currently associated with the session, copy the state of the given object onto the persistent instance
if there is no persistent instance currently associated with the session, try to load it from the database, or create a new persistent instance
the persistent instance is returned
the given instance does not become associated with the session, it remains detached
persist() and merge() have nothing to do with whether the columns are modified or not. Use dynamic-update, as @Tomasz Nurkiewicz has suggested, to save only the modified columns. Use dynamic-insert to insert only the non-null columns.
Some JPA providers such as EclipseLink support fetch groups. So you can load a partial instance and update it.
See,
http://wiki.eclipse.org/EclipseLink/Examples/JPA/AttributeGroup

Accessing properties of Core Data objects via bindings from non-Core Data objects

I have a set of data created by another app and stored in XML format on disk. Since this data is managed by this other app, I don't want to bother with loading this data into a Core Data store for two reasons: 1) it would be redundant storage of the same data, and 2) I would have to constantly update my own Core Data store to match updates in the XML file produced by the other app.
However, I have data created in my own app that needs to be associated with the data from the XML from the other app, and I want to save the data created in my own app to disk.
To accomplish this, the XML data from the other app has persistent, unique IDs associated with each object stored in the XML file. I store these unique IDs in my own Core Data store. Upon every launch of my app, I load the XML data created by the other app, and then I can access the corresponding data in my own app via Core Data by issuing a fetch request for managed objects matching the unique ID.
OtherAppObjects represents items loaded from the XML data. They have their own unique properties in addition to the uniqueID. These OtherAppObjects are controlled by an NSArrayController. Then I have MyManagedObjects which are loaded from the Core Data store, and have distinct unique properties in addition to a uniqueID.
I have a table view which needs to display properties from both the OtherAppObjects as well as the MyManagedObjects, so I want to be able to access and set properties of the MyManagedObjects via bindings from the OtherAppObjects. Thus, I figured that I could create a correspondingMyManagedObject property of the OtherAppObjects, and then I'd be able to access the Core Data properties of the MyManagedObject via bindings.
For example, if I wanted to display property "foo" of the OtherAppObjects, and "bar" of the MyManagedObjects in the table view, I could simply bind one table column to the NSArrayController with a model key path of "foo", and bind the second table column to the model key path of "correspondingMyManagedObject.bar".
This works when not dealing with multiple threads, or when passing around a single managed object context. But since that's "strongly discouraged", I wanted to try to do this the right way by passing around a single persistent store coordinator, but creating separate managed object contexts.
However, this breaks down. The problem is that when the table view attempts to access the bar property, it needs to first access the correspondingMyManagedObject property. So, the OtherAppObject dutifully creates a new managed object context and a corresponding fetch request with the appropriate uniqueID and returns the managed object. But in doing so, it releases the managed object context and now the managed object is no longer valid, so the table view can't access the bar property!
I see only two ways around this, and I wanted to verify that there isn't another easier way to do this:
Load the objects from the XML data into my own Core Data store. In essence, create ManagedOtherAppObjects from the OtherAppObjects, with a relationship to the MyManagedObjects, and then accessing via bindings will be peachy. However, this means there's redundant storage of the same data on disk, and I'll have to recreate the ManagedOtherAppObjects every single time I launch the app (because the XML file is updated fairly frequently).
Create custom setters/getters on the OtherAppObject class. So, for example, I'd create -(NSValue *)bar and -(void)setBar:(NSValue *)newValue methods in OtherAppObject. Then, instead of binding the table view column to the key value path "correspondingMyManagedObject.bar" of OtherAppObjects, I'd just bind it to the key path "bar" of OtherAppObjects. These methods would be able to fetch the corresponding MyManagedObject and retrieve or set the value within the managed object context, and then return the correct value.
This second method isn't particularly appealing because I'd have to create two custom methods for every single property of MyManagedObject (and for properties of other managed objects for which MyManagedObject has a relationship).
I suppose I could create the generalized methods -(NSValue *)retrieveCoreDataPropertyUsingKeyPath:(NSString *)keyPath and -(void)setCoreDataProperty:(NSValue *)newValue usingKeyPath:(NSString *)keyPath , but I'd still have to create shell setters/getters for each individual property.
[UPDATE: Hmm, maybe I could just override valueForKeyPath: and setValue:forKeyPath:, and then everything would work OK?]
Is this correct, or am I missing something?
One variation on option #1 that could be worth a try would be to set things up so that you have a single persistent store coordinator that splits the objects between two separate persistent stores. You would keep MyManagedObjects (MMO) the same, being stored separately on disk, but then the OtherAppObjects (OAO) could either be backed by some temporary store on disk (e.g. in ~/Library/Caches or something) or just by an in-memory store.
Upon launch, you would create your PSC and add the store containing the MMOs. You would then add a second store to the PSC (using -[NSPersistentStoreCoordinator addPersistentStoreWithType:configuration:URL:options:error:]), read in the XML file and create all the OAOs, and associate those objects with that store using -[NSManagedObjectContext assignObject:toPersistentStore:].
Core Data doesn't allow directly modeling relationships between objects in different stores, but you could still do the lookup via unique ID like you're doing now to associate a MMO with an OAO. The difference would be that the OAO could simply use its own managed object context to fetch the MMO, so you would be sure that the MMO would stick around at least as long as the OAO.
Then, when you quit the app, you'd either delete the temporary store in ~/Library/Caches, or if using an in-memory store, just let it disappear into the ether, leaving the other store with the MMOs intact.

Resources