Core Data - get/create NSManagedObject performance

I'm creating an iPhone/iPad app that reads XML documents and builds table views from objects created from the XML. The XML represents a 'level' in a filesystem. It's basically a browser.
Each time I parse the XML documents I update the filesystem, which is mirrored in a Core Data SQLite database. For each "File" encountered in the XML I attempt to get the NSManagedObject associated with it.
The problem is this function, which I use to either get the existing entity from the database or create a new blank one.
+(File*)getOrCreateFile:(NSString*)remotePath
                context:(NSManagedObjectContext*)context
{
    struct timeval start,end,res;
    gettimeofday(&start,NULL);
    NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
    NSEntityDescription *entity = [NSEntityDescription entityForName:@"File" inManagedObjectContext:context];
    [fetchRequest setEntity:entity];
    [fetchRequest setFetchLimit:1];
    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"remotePath == %@",remotePath];
    [fetchRequest setPredicate:predicate];
    NSError *error;
    NSArray *items = [context executeFetchRequest:fetchRequest error:&error];
    [fetchRequest release];
    File *f;
    if ([items count] == 0)
        f = (File*)[NSEntityDescription insertNewObjectForEntityForName:@"File" inManagedObjectContext:context];
    else
        f = (File*)[items objectAtIndex:0];
    gettimeofday(&end, NULL);
    [JFS timeval_subtract:&res x:&end y:&start];
    count++;
    elapsed += res.tv_usec;
    return f;
}
For example, if I'm parsing a document with 200-ish files, the total time on an iPhone 3G is about 4 seconds. 3 of those seconds are spent in this function getting the objects from Core Data.
remotePath is a unique string of variable length and is indexed in the SQLite database.
What am I doing wrong here? Or what could I do better or differently to improve performance?

Executing fetches is somewhat expensive in Core Data, though the Core Data engineers have done some amazing work to keep this hit minimal. Thus, you may be able to improve things slightly by running a fetch to return multiple items at once. For example, batch the remotePaths and fetch with a predicate such as
[NSPredicate predicateWithFormat:@"remotePath IN %@", paths];
where paths is a collection of possible paths.
From the results, you can do the searches in-memory to determine if a particular path is present.
Fundamentally, however, doing fetches against strings (even if indexed) is an expensive operation. There may not be much you can do. Consider fetching against non-string attributes, perhaps by hashing the path and saving the hash in the entity as well. You'll get back a (potentially) larger result set which you could then search in memory for string equality.
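As a rough sketch of the hashing idea, the path could be reduced to a 64-bit integer with something like FNV-1a. The answer does not name a hash function, so the choice of FNV-1a and the function name path_hash are assumptions made for illustration. It is plain C, so it can be dropped into an Objective-C file unchanged.

```c
#include <stdint.h>

/* Hypothetical helper: 64-bit FNV-1a hash of a NUL-terminated path.
 * Any stable hash would do; FNV-1a is used here only because it is short. */
static uint64_t path_hash(const char *path)
{
    uint64_t hash = 0xcbf29ce484222325ULL;      /* FNV-1a 64-bit offset basis */
    const unsigned char *p = (const unsigned char *)path;

    for (; *p != '\0'; p++) {
        hash ^= (uint64_t)*p;                   /* mix in the next byte */
        hash *= 0x100000001b3ULL;               /* multiply by the FNV prime */
    }
    return hash;
}
```

The result (for example path_hash([remotePath UTF8String])) would be stored in an indexed Integer 64 attribute next to remotePath and used in the fetch predicate in place of the string. Different paths can share a hash, so the returned objects still need to be compared on remotePath in memory, as described above.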
Finally, do not make any changes without some performance data. Profile, profile, profile.

Related

Understanding Core Data Queries(searching data for a unique attribute)

Since I come from a background of using SQLite extensively, perhaps I am just having a hard time grasping how Core Data manages to-many relationships.
For my game I have a simple database schema on paper.
Entity: Level - this will be the table that has all information about each game level
Attributes: levelNumber(String), simply the level number
: levelTime(String), the amount of time you have to finish the level(this time will vary with the different levels)
: levelContent(String), a list of items for that level(and that level only) separated by commas
: levelMapping(String), how the content is laid out (specific to a unique level)
So basically, in Core Data I want to set up the database so I can say in my fetchRequest:
Give me the levelTime, levelContent and levelMapping for Level 1 (or whatever level I want).
How would I set up my relationships so that I can make this type of fetchRequest?
Also, I already have all the data ready and know what it is in advance. Is there any way to populate the entity and its attributes within Xcode?
As you've described it, it's a single Core Data entity called Level that has four string attributes. Since there's just the one entity, there are no relationships. You'd create the one entity and add the four attributes exactly as you've described them above.
Getting just one Level is basic Core Data fetching:
NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Level"];
NSString *levelNumber = @"1";
NSPredicate *predicate = [NSPredicate predicateWithFormat:@"levelNumber = %@", levelNumber];
[request setPredicate:predicate];
NSError *error = nil;
NSArray *results = [[self managedObjectContext] executeFetchRequest:request error:&error];
NSManagedObject *level = nil;
if ([results count] > 0) {
    level = [results objectAtIndex:0];
}
// Use level...
If it was me I'd use one of the numeric types for levelNumber, but maybe you have some reason to use a string there. I'd also probably break levelContent into a separate entity, because (a) comma delimited strings are ugly, no matter how you slice 'em, and (b) you might well want the items to have more attributes, and a separate entity would hold those.

Is there anything bad about reusing an NSFetchRequest for several different fetches with Core Data?

My question:
Is there anything bad about reusing an NSFetchRequest for several different fetches with Core Data?
Example code:
NSFetchRequest *request = [[NSFetchRequest alloc] init];
NSEntityDescription *logEntity = [NSEntityDescription entityForName:@"LogEntry" inManagedObjectContext:context];
[request setEntity:logEntity];
NSSortDescriptor *sortDescriptor = [[NSSortDescriptor alloc] initWithKey:@"dateTimeAction" ascending:NO]; // ascending NO = start with latest date
[request setSortDescriptors:[NSArray arrayWithObject:sortDescriptor]];
NSPredicate *predicate = [NSPredicate predicateWithFormat:@"status == %@",@"op tijd"];
[request setPredicate:predicate];
[request setFetchLimit:50];
NSError *error = nil;
NSInteger onTimeCount = [context countForFetchRequest:request error:&error];
NSPredicate *predicate1 = [NSPredicate predicateWithFormat:@"status == %@",@"uitgesteld"];
[request setPredicate:predicate1];
[request setFetchLimit:50];
NSInteger postponedCount = [context countForFetchRequest:request error:&error];
NSPredicate *predicate2 = [NSPredicate predicateWithFormat:@"status == %@",@"gemist"];
[request setPredicate:predicate2];
[request setFetchLimit:50];
NSInteger missedCount = [context countForFetchRequest:request error:&error];
It's not a problem, but in the example given it's not gaining you much (just some code brevity.) The most expensive part of creating a fetch request is parsing the predicate format string.
If the code you've given is called frequently, and you're looking to speed it up, here are some ideas to try:
Create all the predicates and fetch request just once: maybe in a dispatch_once() block and storing them statically; or in the constructor and stored in object fields
Don't specify sort descriptors, since order doesn't matter if you only care about the count
If the actual predicates will be more complex or flexible than shown, create one general template predicate with substitution variables, and use predicateWithSubstitutionVariables: to generate specified copies.
For even more code brevity, define that template in the object model using the model editor, and use fetchRequestFromTemplateWithName:substitutionVariables: to create fetch requests.
I can gin up some sample code if you like.
I don't think that it's a problem, because an NSFetchRequest is just a description of search criteria. Moreover, you can combine multiple predicates on your fetch request like this:
NSPredicate *predicates = [NSCompoundPredicate andPredicateWithSubpredicates:NSArray_of_predicates];
It's OK if you reuse them with the same store, or perhaps different stores with the same model. I ran into crashes (e.g. in NSKnownKeysDictionary1) when using the same fetch request to query multiple stores that had different models. It seemed to make sense to reuse the request since I was fetching the same entity, just from two different places. The entity name and predicate were the same. You would think this would be OK since the fetch request takes an entity name rather than an entity description; the latter would be a different (but equivalent) object for the same entity in different stores/models. However, it looks like the fetch request caches the entity description and doesn't check that it's still valid for the current context that it's being executed against.
You can, however, reuse the same predicate in multiple fetch requests without problems.

CoreData predicate latest stored date

I need to get the latest date from Core Data.
I found a way:
NSSortDescriptor *sortDescriptor = [[NSSortDescriptor alloc] initWithKey:@"date" ascending:NO];
[fetchRequest setSortDescriptors:[NSArray arrayWithObject:sortDescriptor]];
[fetchRequest setFetchLimit:1];
So I sort them by date and then pick the first.
However, can this not be done more optimally? This approach looks like brute force:
a sort is O(n log n), but a simple search for the max is O(n).
You can actually ask SQL for just that value, not the object with that value:
NSExpression *date = [NSExpression expressionForKeyPath:@"date"];
NSExpression *maxDate = [NSExpression expressionForFunction:@"max:"
                                                  arguments:[NSArray arrayWithObject:date]];
NSExpressionDescription *d = [[[NSExpressionDescription alloc] init] autorelease];
[d setName:@"maxDate"];
[d setExpression:maxDate];
[d setExpressionResultType:NSDateAttributeType];
[request setPropertiesToFetch:[NSArray arrayWithObject:d]];
NSError *error = nil;
NSArray *objects = [managedObjectContext executeFetchRequest:request error:&error];
if (objects == nil) {
    // Handle the error.
} else {
    if (0 < [objects count]) {
        NSLog(@"Maximum date: %@", [[objects objectAtIndex:0] valueForKey:@"maxDate"]);
    }
}
This is described in more detail under Fetching Managed Objects -> Fetching Specific Values in the Core Data documentation.
Note that the argument passed to max: must be the date expression from the first line, not maxDate itself.
I guess that's pretty much the best way. I'm not sure whether there is a more efficient way since it has to compare every date anyway to figure out which the oldest is.
Here are 2 other ways:
1) You could work with BOOLs as an attribute of the managed object. (like oldest = 1)
However, you'd have to find a new "oldest" managed object every time you delete one.
2) You could just save the oldest one until it changes. This might save a lot of work if you have to find the oldest managedObject often.
It depends on your application (how many times you insert/remove managed objects and how many times you need the oldest object).

Is there a more memory efficient way to search through a Core Data database?

I need to see if an object that I have obtained from a CSV file with a unique identifier exists in my Core Data Database, and this is the code I deemed suitable for this task:
NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
NSEntityDescription *entity;
entity =
[NSEntityDescription entityForName:@"ICD9"
            inManagedObjectContext:passedContext];
[fetchRequest setEntity:entity];
NSPredicate *pred = [NSPredicate predicateWithFormat:@"uniqueID like %@", uniqueIdentifier];
[fetchRequest setPredicate:pred];
NSError *err;
NSArray* icd9s = [passedContext executeFetchRequest:fetchRequest error:&err];
[fetchRequest release];
if ([icd9s count] > 0) {
    for (int i = 0; i < [icd9s count]; i++) {
        NSAutoreleasePool *pool = [[NSAutoreleasePool alloc]init];
        NSString *name = [[icd9s objectAtIndex:i] valueForKey:@"uniqueID"];
        if ([name caseInsensitiveCompare:uniqueIdentifier] == NSOrderedSame && name != nil)
        {
            [pool release];
            return [icd9s objectAtIndex:i];
        }
        [pool release];
    }
}
return nil;
After more thorough testing it appears that this code is responsible for a huge amount of leaking in the app I'm writing (it crashes on a 3GS before making it 20 percent through the 1459 items). I feel like this isn't the most efficient way to do this, any suggestions for a more memory efficient way? Thanks in advance!
Don't use the like operator in your request predicate. Use =. That should be much faster.
You can specify the case insensitivity of the search via the predicate, using the [c] modifier.
It's not necessary to create and destroy an NSAutoreleasePool on each iteration of your loop. In fact, it's probably not needed at all.
You don't need to do any of the checking inside the for() loop. You're duplicating the work of your predicate.
So I would change your code to be:
NSFetchRequest *fetchRequest = [[NSFetchRequest alloc] init];
[fetchRequest setEntity:...];
[fetchRequest setPredicate:[NSPredicate predicateWithFormat:@"uniqueID =[c] %@", uniqueIdentifier]];
NSError *err = nil;
NSArray *icd9s = [passedContext executeFetchRequest:fetchRequest error:&err];
[fetchRequest release];
if (err == nil && [icd9s count] > 0) {
    return [icd9s objectAtIndex:0]; //we know the uniqueID matches, because of the predicate
}
return nil;
Use the Leaks template in Instruments to hunt down the leak(s). Your current code may be just fine once you fix them. The leak(s) may even be somewhere other than code.
Other problems:
Using fast enumeration will make the loop over the array (1) faster and (2) much easier to read.
Don't send release to an autorelease pool. If you ever port the code to garbage-collected Cocoa, the pool will not do anything. Instead, send it drain; in retain-release Cocoa and in Cocoa Touch, this works the same as release, and in garbage-collected Cocoa, it pokes the garbage collector, which is the closest equivalent in GC-land to draining the pool.
Don't repeat yourself. You currently have two [pool release]; lines for one pool, which gets every experienced Cocoa and Cocoa Touch programmer really worried. Store the result of your tests upon the name in a Boolean variable, then drain the pool before the condition, then conditionally return the object.
Be careful with variable types. -[NSArray count] returns and -[NSArray objectAtIndex:] takes an NSUInteger, not an int. Try to keep all your types matching at all times. (Switching to fast enumeration will, of course, solve this instance of this problem in a different way.)
Don't hide releases. I almost accused you of leaking the fetch request, then noticed that you'd buried it in the middle of the code. Make your releases prominent so that you're less likely to accidentally add redundant (i.e., crash-inducing) ones.

How do I get Core Data to use my own NSManagedObjectID URI scheme?

I am writing an app that connects to a database to fetch data. Since the fetching is expensive and the data is generally unchanging, I'm using CoreData to cache the results so that I can do fast, local queries.
From the database, for each type, there is a string property that is guaranteed to be unique. In fact, there is a URI scheme for the database which is a unique address for each item.
The URL scheme is very basic along the lines of:
ngaobject://<server_license_id>/<type>/<identifier>
I'd like to be able to use this in CoreData as well. I've made a method to fetch a single item from the CoreData store:
-(NSFetchRequest*)fetchRequestForType:(NSString*)typeName identifier:(NSString*)identifier
{
    NSFetchRequest * fetchRequest = [self fetchRequestForType:typeName];
    [fetchRequest setFetchLimit:1];
    NSString * identifierProperty = [self identifierPropertyNameForObjectType:typeName];
    NSPredicate * predicate = [NSPredicate predicateWithFormat:@"%K == %@", identifierProperty, identifier];
    [fetchRequest setPredicate:predicate];
    return fetchRequest;
}
-(NGAObject*)objectWithType:(NSString*)typeName
                 identifier:(NSString*)identifier
{
    // First try to retrieve it from the cache
    NSAssert1( (identifier != nil), @"Request to create nil-name object of type %@", typeName );
    NSFetchRequest * fetchRequest = [self fetchRequestForType:typeName identifier:identifier];
    if ( !fetchRequest )
        return nil;
    NSError * error = nil;
    NSArray * fetchResults = [[self managedObjectContext] executeFetchRequest:fetchRequest error:&error];
    if ( !fetchResults )
    {
        NSLog(@"%@", error);
        [NSApp presentError:error];
        return nil;
    }
    if ( [fetchResults count] )
        return [fetchResults objectAtIndex:0];
    return nil;
}
When I retrieve an item from the server, I want to first get a reference to it in the cache and if it's there, update it. If it's not, create a new one.
Since I'm getting back thousands of objects from the server, performing a fetch for a single object for which I know a unique ID brings my machine to a crawl.
Instead, what I'm doing is pre-loading all the objects of a type, then creating a dictionary of identifiers->object, then processing the thousands of objects for that type by running it through the dictionary. This works fine, but is awkward.
Could I not write a method that takes the type/identifier combo and get a single object from CoreData without having to execute a lengthy fetch request?
It seems there is a solution if I can get CoreData to use my own URI specification. I could then call -(NSManagedObjectID*)managedObjectIDForURIRepresentation:(NSURL*)url on the persistent store coordinator.
So, the question is, how can I get CoreData to use my URI scheme? How can I make CoreData use my own unique identifiers?
You can't make Core Data use a custom URI scheme. The URI scheme is hardcoded into Core Data so that the URI can be decoded to locate particular data in a particular store in a particular app on a particular piece of hardware. If the URI were customizable, that system would break down.
Fetching objects one at a time is what is killing you. Instead, you need to batch fetch all objects whose customID matches those provided by the server. The easiest way to do that is to use the IN predicate operator:
NSArray *customIDs=//... array of customIDs provided by server
NSPredicate *p;
p=[NSPredicate predicateWithFormat: @"customIdAtrribute IN %@", customIDs];
This will return all existing objects that you can ignore.
Alternatively, you could
Do a fetch on just the customID property by setting the fetch's propertiesToFetch to the customID attribute.
Set the fetch result type to dictionary.
Use the above predicate.
You will get back an array of one-key dictionaries, each with a customID as its value.
Convert the array of dictionaries to an array of values, e.g. cachedIDs.
Convert customIDs above to a mutable array.
Filter the customIDs array using the predicate @"NOT (SELF IN %@)" with cachedIDs as the argument.
The filtered customIDs array will now contain only the customID values NOT cached in Core Data.
You can then create managed objects for only the new IDs.
(This is how you use a filter predicate if you are unfamiliar with it.)
NSMutableArray *f=[NSMutableArray arrayWithObjects:@"1",@"2",@"3",@"4",@"5",@"6",nil];
NSArray *g=[NSArray arrayWithObjects:@"5",@"6",nil];
[f filterUsingPredicate:[NSPredicate predicateWithFormat:@"NOT (SELF IN %@)",g]];
NSLog(@"f=%@",f);
...which outputs:
f=(
1,
2,
3,
4
)
Are all the fields which you are using for unique-ID lookup marked as "Indexed" in the CoreData designer? If that has been done then the CoreData fetches shouldn't be lengthy ...

Resources