Using Core Data as cache - macos

I am using Core Data for its storage features. At some point I make external API calls that require me to update the local object graph. My current (dumb) plan is to clear out all instances of old NSManagedObjects (regardless if they have been updated) and replace them with their new equivalents -- a trump merge policy of sorts.
I feel like there is a better way to do this. I have unique identifiers from the server, so I should be able to match them to my objects in the store. Is there a way to do this without manually fetching objects from the context by their identifiers and resetting each property? Is there a way for me to just create a completely new context, regenerate the object graph, and just give it to Core Data to merge based on their unique identifiers?

Your strategy of matching based on the server's unique IDs is a good approach. Hopefully you can get your server to deliver only the objects that have changed since your last update (a timestamp you keep track of and provide in the server call).
In order to update the Core Data objects, though, you will have to fetch them, instantiate the NSManagedObjects, make the changes, and save them. You can do all of this in a background thread (a child context with performBlock:), but you'll still have to round-trip your objects into memory and back to the store. Doing it in a child context on its own thread will keep your UI snappy, but the processing still has to happen.
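For what it's worth, here is a rough sketch of that fetch-and-update step in Swift, using a private-queue child context (perform { } is performBlock: in Swift terms). The ServerRecord type, the "Item" entity, and its serverID/name attributes are placeholders for whatever your model actually uses:

import CoreData

struct ServerRecord {              // placeholder for whatever your API returns
    let serverID: String
    let name: String
}

func merge(_ records: [ServerRecord], into container: NSPersistentContainer) {
    // Child context on its own private queue, parented to the main context.
    let context = NSManagedObjectContext(concurrencyType: .privateQueueConcurrencyType)
    context.parent = container.viewContext
    context.perform {
        // Fetch only the objects whose server IDs appear in this batch.
        let request = NSFetchRequest<NSManagedObject>(entityName: "Item")
        request.predicate = NSPredicate(format: "serverID IN %@", records.map { $0.serverID })
        let existing = (try? context.fetch(request)) ?? []

        var byID: [String: NSManagedObject] = [:]
        for object in existing {
            byID[object.value(forKey: "serverID") as! String] = object
        }

        for record in records {
            // Update the matching object, or insert a new one if none exists yet.
            let object = byID[record.serverID]
                ?? NSEntityDescription.insertNewObject(forEntityName: "Item", into: context)
            object.setValue(record.serverID, forKey: "serverID")
            object.setValue(record.name, forKey: "name")
        }

        try? context.save()   // pushes the changes up to the parent (main) context
    }
}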
Another idea: In the last day or so I've been reading about AFIncrementalStore, an NSIncrementalStore implementation which uses AFNetworking to provide Core Data properties on demand, caching locally. I haven't built anything with it yet but it looks pretty slick. It sounds like your project might be a good use of this library. Code is on GitHub: https://github.com/AFNetworking/AFIncrementalStore.

Related

Recommended workflow for creating an FHIR Document

We've built a system for generating anesthesia records.
We're now trying to model them as FHIR documents.
I understand that a Document (in FHIR terms) is supposed to end up being kind of a self-contained resource.
But, in our case, we have a process where this document will be gradually assembled.
What's the best way to handle this while we're gathering resources, before we're ready to create a document?
We want to use FHIR to create and save various resources as we go, and then at the very end, assemble a document.
Assume the following:
A patient
A provider
A health history
Some info about the procedure being performed
An extensive set of vitals observations
An extensive set of drug doses administered
Various procedure and recovery notes
A final signature by the provider that will "finalize" the report
I understand we can create and save various resources throughout. But we want to kind of keep them all lumped together so we can easily fetch everything related to what will ultimately become that document.
How would this work in terms of RESTful operations?
POST /Bundle of type "document" with a composition as first element (to create document)
Use resulting ID from bundle? Will I also get an ID for the composition?
Then, how do I add/update/remove individual items from the composition? Do I need to do PUTs of the entire composition to add something?
I have entire series of checkpoints every 5 minutes with full vitals (BP, SpO2, Temp, Respiratory rate, etc). Would I first create those observations with a POST, and then do a PUT to update the composition with a reference to them?
As I'm sure you can tell, I just want to understand how FHIR expects you to do this kind of thing in terms of HTTP operations.
Thanks in advance for any guidance!
You'd start by posting a Composition to have a focal point (table of contents) to update as you gather your data. You would then POST your individual Observations, Procedures, etc. and either PUT or PATCH the Composition to add references to the relevant data. Once you've got all of the relevant information gathered and tied into the Composition, you would then generate the document Bundle. You could create the Bundle earlier in the process and update it each time the Composition changes if you wanted to be able to render the draft document using a FHIR document rendering tool, but otherwise there's no real reason for the Bundle to exist until you're ready to lock down the document.
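To make the HTTP side concrete, here is a rough sketch of that sequence using URLSession. The base URL, resource ids, and abbreviated JSON bodies are purely illustrative placeholders:

import Foundation

let base = "https://fhir.example.org"   // hypothetical FHIR server

func send(_ method: String, _ path: String, _ json: String,
          contentType: String = "application/fhir+json") {
    var request = URLRequest(url: URL(string: "\(base)/\(path)")!)
    request.httpMethod = method
    request.httpBody = json.data(using: .utf8)
    request.setValue(contentType, forHTTPHeaderField: "Content-Type")
    URLSession.shared.dataTask(with: request) { data, _, _ in
        // The created/updated resource, with its server-assigned id, comes back in `data`.
    }.resume()
}

// 1. Create the draft Composition that acts as the table of contents.
send("POST", "Composition", #"{"resourceType": "Composition", "status": "preliminary", ...}"#)

// 2. Create individual resources (vitals Observations, Procedures, notes) as you gather them.
send("POST", "Observation", #"{"resourceType": "Observation", "status": "final", ...}"#)

// 3. Tie the new Observation into the Composition, e.g. with a JSON Patch that adds
//    a section entry reference (Composition/123 and Observation/456 are made-up ids).
send("PATCH", "Composition/123",
     #"[{"op": "add", "path": "/section/0/entry/-", "value": {"reference": "Observation/456"}}]"#,
     contentType: "application/json-patch+json")

// 4. Once everything is gathered and referenced, POST the assembled document Bundle.
send("POST", "Bundle", #"{"resourceType": "Bundle", "type": "document", ...}"#)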

What does data look like when using Event Sourcing?

I'm trying to understand how Event Sourcing changes the data architecture of a service. I've been doing a lot of research, but I can't seem to understand how data is supposed to be properly stored with event sourcing.
Let's say I have a service that keeps track of vehicles transporting packages. The current non-relational structure for the data model is that each document represents a vehicle and has many fields representing origin location, destination location, types of packages, number of packages, status of the vehicle, etc. Normally this gets queried for information to be read by the front end. When the user makes changes, the corresponding updates are applied to this document.
With event sourcing, it seems that a snapshot of every event is stored, but there seem to be a few ways to interpret that:
The first is that the multiple versions of the document I described exist, each a new snapshot every time a change is made. Each event would create a new version of this document and alter it. This is the easiest way for me to wrap my head around it, but I believe this to be incorrect.
Another interpretation I have is that each event stores SPECIFIC information about what's been altered in the document. When the vehicle status changes from On Road to Available, for example, an event specifically for vehicle status changes is triggered. Let's say it's called VehicleStatusUpdatedEvent, and contains the Vehicle ID number, the new status, and the timestamp for this event. So this event is stored and is published to a messaging queue. When picked up from the queue, the appropriate changes are made to the current version of the document. I can understand this, but I think I still have some misconceptions here. My understanding is that event sourcing allows us to have a snapshot of data upon each change, so we can know what it looks like at any point. What I just described would keep a log of changes, but still only have one version of the file, as the events only contain specific pieces of the whole file.
Can someone describe how the data flow and architecture works with event sourcing? Using the vehicle data example I provided might help me frame it better. I feel that I am close to understanding this, but I am missing some fundamental pieces that I can't seem to understand by searching online.
The current non-relational structure for the data model is that each document represents a vehicle
OK, let's start from there.
In the data model you've described, storage of a document destroys the earlier copy.
Now imagine that instead we were storing the document in a git repository. Then saving the document would also save metadata, and that metadata would include a pointer to the previous document.
Of course, we've probably got a lot of duplication in that case. So instead of storing the complete document every time, we'll store a patch document (think JSON Patch), and metadata pointing to the original patch.
Take that same idea again, but instead of storing generic patch documents, we use domain specific messages that describe what is going on in terms of the model.
That's what the data model of an event sourced entity looks like: a list of domain specific descriptions of document transformations.
When you need to reconstitute the current state, you start with a state you know (which could be the "null" state of the document before anything happened to it), and replay onto that document all of the patches (events) that have occurred since.
If you want to do a temporal query, the game is the same: you replay the events up to the point in time you are interested in.
So essentially when referring to an older build, you reconstruct the document using the events, correct?
Yes, that's exactly right.
So is there still a "current status" document or is that considered bad practice?
"It depends". In the general case, there is no current status document; only the write-ordered list of events is "real", and everything else is derived from that.
Conversations about event sourcing often lead to consideration of dedicated message stores for managing persistence of those ordered lists, and it is common that the message stores do not also support document storage. So trying to keep a "current version" around would require commits to two different stores.
At this point, designers typically either decide that "recent version" is good enough, in which case they build eventually consistent representations of documents outside of the transaction boundary... OR they decide current version is important, and look into storage solutions that support storing the current version in the same transaction as the events (ex: using an RDBMS).
What is the procedure used to generate the snapshot you want using the events?
If you want to generate a snapshot, you'll normally end up using a pattern called a projection to iterate over the events and either fold or reduce them to create the document.
Roughly, you have a function somewhere that looks like
document-with-meta-data = projection(event-history-with-metadata)
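As a concrete illustration of that shape for the vehicle example, here is a minimal Swift sketch; the event cases, state fields, and names are all invented for illustration:

import Foundation

// Domain-specific events, stored in write order.
enum VehicleEvent {
    case registered(vehicleID: String, origin: String, destination: String)
    case statusUpdated(newStatus: String)
    case packagesLoaded(count: Int)
}

struct TimestampedEvent {
    let occurredAt: Date
    let event: VehicleEvent
}

// The "current document" is never stored; it is derived on demand.
struct VehicleState {
    var origin = ""
    var destination = ""
    var status = "Unknown"
    var packageCount = 0
}

// The projection: fold the event history onto a known starting (null) state.
func project(_ history: [TimestampedEvent], upTo cutoff: Date = .distantFuture) -> VehicleState {
    return history
        .filter { $0.occurredAt <= cutoff }          // temporal query: replay only up to `cutoff`
        .reduce(into: VehicleState()) { state, entry in
            switch entry.event {
            case let .registered(_, origin, destination):
                state.origin = origin
                state.destination = destination
            case let .statusUpdated(newStatus):
                state.status = newStatus
            case let .packagesLoaded(count):
                state.packageCount += count
            }
        }
}

project(history) derives the current document, and project(history, upTo: someDate) answers the temporal query by replaying only the events that had occurred by that point.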

Sharing singleton flux stores in React

So the general gist I've got with flux is that stores should always be a singleton. In my example I have the following:
A people store which controls the CRUD operations of people, as well as search / filtering.
I now have two components, shown at the same time, that make use of this filtering. My issue with the current implementation is that filtering in one component also filters the other, because they share the same store.
My current idea for solutions are:
Have filtering controlled in controller components
Have 2 separate stores that cover each of their domains and have filtering functionality in a shared util
The first solution sounds ok to me.
Nonetheless, you can also implement a buffer: a hashtable that keeps temporary filtering results separated per component, as if they were sessions (see the sketch after the pros and cons below).
Pros:
You can alter the shared data and if several components are looking to the same data, every change will be reflected in all those components.
Cons:
There will be a lot of change events, and you will need to check whether a store change event is relevant to your component before changing its state, in order to prevent unnecessary render calls.
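The idea is language-agnostic; here is a minimal sketch (in Swift, with placeholder names and a hand-rolled listener mechanism standing in for whatever your Flux implementation provides) of a singleton store that buffers filter results per component:

import Foundation

struct Person {
    let id: Int
    let name: String
}

// Singleton store: one shared list of people, but filter results are
// buffered per consumer key, so two components never clobber each other.
final class PeopleStore {
    static let shared = PeopleStore()

    private(set) var people: [Person] = []
    private var filterBuffers: [String: [Person]] = [:]     // keyed by component/session id
    private var listeners: [(String) -> Void] = []          // called with the key that changed

    func load(_ newPeople: [Person]) {
        people = newPeople
    }

    func filter(for key: String, matching query: String) {
        filterBuffers[key] = people.filter { $0.name.localizedCaseInsensitiveContains(query) }
        emitChange(for: key)
    }

    func filteredPeople(for key: String) -> [Person] {
        return filterBuffers[key] ?? people
    }

    func subscribe(_ listener: @escaping (String) -> Void) {
        listeners.append(listener)
    }

    private func emitChange(for key: String) {
        listeners.forEach { $0(key) }
    }
}

// A component reacts only to changes in its own buffer, which also addresses
// the "unnecessary render calls" concern above.
PeopleStore.shared.subscribe { changedKey in
    guard changedKey == "peopleListA" else { return }
    let visible = PeopleStore.shared.filteredPeople(for: "peopleListA")
    _ = visible   // re-render the first component with `visible`
}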

How to get Core Data to make only one instance of entity of type

So. Before I get singleton-pattern hate on this message, hear me out. I'd love to hear ideas. I'm making a program that I think I need to use Core Data for, because later I want the status of some variables to be easily accessible from OS X and multiple iOS devices.
What I'm making is an OS X program that will control phidgets (phidgets.com) to control and listen for status changes in real world objects. Example: whether a motor is turned on or not. Turn a motor on and off. Turn on status lights, etc.
I originally thought I'd just make global variables that I change, poll, and manipulate in order to have a central status board for the logic of the program to work off of. But, because of the engineering Apple puts into Core Data every year, I am assuming that making this work with Core Data will give me easier options to sync it later with iOS devices that could control or monitor said statuses remotely.
Is there a nifty way you can imagine to:
-start up the program, confirm there is only one entity of type "SystemStatus"; if there isn't one, make one. If there is one, we continue and let the program update its attributes with the status of the real-world objects it's controlling.
Using Core Data was also something I thought of because it will give me a place to persist a stored history of the data gathered. Example: motor bearing temperature over time.
If you ensure that access to this object is done through your API, Core Data becomes an implementation detail behind the getter method of the singleton object. There are no facilities in Core Data to tell it to create only one object, but if you ensure access to the object is done through a wrapper of your own, you can fetch it on demand and if it doesn't exist, you can insert it, save, and pass it to the caller.
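A minimal sketch of such a wrapper in Swift, assuming an NSPersistentContainer and the "SystemStatus" entity from the question (the model name and the attribute in the usage comment are placeholders):

import CoreData

final class StatusBoard {
    static let shared = StatusBoard()

    private let container: NSPersistentContainer

    private init() {
        container = NSPersistentContainer(name: "Model")    // placeholder model name
        container.loadPersistentStores { _, error in
            if let error = error { fatalError("Store failed to load: \(error)") }
        }
    }

    // Fetch the single SystemStatus object, inserting it on first launch.
    func systemStatus() -> NSManagedObject {
        let context = container.viewContext
        let request = NSFetchRequest<NSManagedObject>(entityName: "SystemStatus")
        request.fetchLimit = 1
        if let existing = (try? context.fetch(request))?.first {
            return existing
        }
        let status = NSEntityDescription.insertNewObject(forEntityName: "SystemStatus", into: context)
        try? context.save()
        return status
    }
}

// Usage (the "motorRunning" attribute is hypothetical):
// StatusBoard.shared.systemStatus().setValue(true, forKey: "motorRunning")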
An important thing to consider when using Core Data objects is multithreading. Passing the same object to multiple threads is very error-prone and requires locking mechanisms (or use of Apple's block-based API). This is not very straightforward for what you describe. Consider either a wrapper object which uses Core Data objects internally (wraps access to properties in block-based API) or using a different approach than Core Data.

Cache Management with Numerous Similar Database Queries

I'm trying to introduce caching into an existing server application because the database is starting to become overloaded.
Like many server applications we have the concept of a data layer. This data layer has many different methods that return domain model objects. For example, we have an employee data access object with methods like:
findEmployeesForAccount(long accountId)
findEmployeesWorkingInDepartment(long accountId, long departmentId)
findEmployeesBySearch(long accountId, String search)
Each method queries the database and returns a list of Employee domain objects.
Obviously, we want to try and cache as much as possible to limit the number of queries hitting the database, but how would we go about doing that?
I see a couple possible solutions:
1) We create a cache for each method call. E.g. for findEmployeesForAccount we would add an entry with the key account-employees-accountId. For findEmployeesWorkingInDepartment we could add an entry with the key department-employees-accountId-departmentId, and so on. The problem I see with this is that when we add a new employee to the system, we need to ensure we add it to every cached list where appropriate, which seems hard to maintain and bug-prone.
2) We create a more generic query for findEmployeesForAccount (with more joins and/or queries because more information will be required). For other methods, we use findEmployeesForAccount and remove entries from the list that don't fit the specified criteria.
I'm new to caching so I'm wondering what strategies people use to handle situations like this? Any advice and/or resources on this type of stuff would be greatly appreciated.
I've been struggling with the same question myself for a few weeks now... so consider this a half-answer at best. One bit of advice that has been working out well for me is to use the Decorator Pattern to implement the cache layer. For example, here is an article detailing this in C#:
http://stevesmithblog.com/blog/building-a-cachedrepository-via-strategy-pattern/
This allows you to literally "wrap" your existing data access methods without touching them. It also makes it very easy to swap out the cached version of your DAL for the direct-access version at runtime (which can be useful for unit testing).
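To give a sense of the shape, here is that decorator sketched in Swift rather than C#; the repository protocol mirrors the methods from the question, and the cache-key scheme is just the one proposed in option 1:

import Foundation

struct Employee {
    let id: Int
    let name: String
}

// The data-layer interface the rest of the application already depends on.
protocol EmployeeRepository {
    func findEmployeesForAccount(_ accountId: Int) -> [Employee]
    func findEmployeesWorkingInDepartment(_ accountId: Int, _ departmentId: Int) -> [Employee]
}

// Decorator: wraps the real repository and adds caching without touching it.
final class CachedEmployeeRepository: EmployeeRepository {
    private let inner: EmployeeRepository
    private var cache: [String: [Employee]] = [:]

    init(wrapping inner: EmployeeRepository) {
        self.inner = inner
    }

    func findEmployeesForAccount(_ accountId: Int) -> [Employee] {
        return cached("account-employees-\(accountId)") {
            inner.findEmployeesForAccount(accountId)
        }
    }

    func findEmployeesWorkingInDepartment(_ accountId: Int, _ departmentId: Int) -> [Employee] {
        return cached("department-employees-\(accountId)-\(departmentId)") {
            inner.findEmployeesWorkingInDepartment(accountId, departmentId)
        }
    }

    // Invalidation stays in one place; a write path (e.g. addEmployee) would call this.
    func clearAll() {
        cache.removeAll()
    }

    private func cached(_ key: String, _ load: () -> [Employee]) -> [Employee] {
        if let hit = cache[key] { return hit }
        let result = load()
        cache[key] = result
        return result
    }
}

Because both the cached and direct implementations conform to the same protocol, swapping one for the other at the composition root (or in a unit test) is a one-line change.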
I'm still struggling to manage my cache keys, which seem to spiral out of control when there are numerous parameters involved. Inevitably, something ends up not being properly cleared from the cache and I have to resort to heavy-handed ClearAll() approaches that just wipe out everything. If you find a solution for cache key management, I would be interested, but I hope the decorator pattern layer approach is helpful.
