iTunes XML Parsing in Cocoa

I am developing an application in Cocoa. I need to parse a large iTunes XML file (about 25 MB). I am using the following code snippet now:
NSDictionary *itunesDatabase = [NSDictionary dictionaryWithContentsOfFile:itunesPath];
But this is a little bit slow.
Is there a faster method to load all of the data into a dictionary?

The reason you're seeing such slow performance is that NSDictionary reads everything into memory all at once. For a large iTunes library, this can take a long time and -- feel free to confirm this with Activity Monitor -- a metric assload of memory. (That is the precise technical term for that amount of memory.)
The alternative in these situations is to use a callback-based XML parser (generally known as a "SAX" parser). These parse XML documents one entity at a time and call your callback methods as they go. In Cocoa, the NSXMLParser class provides this functionality. You set your class as its delegate, call the parse method, and the parser calls your delegate methods as it reads tags, attributes, text, etc. in the XML file.
Now, this is obviously harder than just loading everything into an NSDictionary and walking the resulting tree of objects. You'll need to keep track of state information yourself. And you'll have to "build up" your objects progressively, so organizing your classes can be difficult.
However, you can ignore the XML you aren't interested in, and that saves a lot of memory. And, depending on what data you're getting out of iTunes, you may also be able to end the parsing as soon as you've gotten the data you need. Even if this does end up taking quite a while, at least you'll be able to show your user a progress bar or some other indication that your program is working, which is much better than just hanging for 10-20 seconds while NSDictionary loads a giant XML file.
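For a concrete picture of what that looks like, here is a minimal sketch, assuming ARC; the TrackNameParser class and the idea of collecting only track names are illustrative, while NSXMLParser and its delegate methods are the real API:
#import <Foundation/Foundation.h>

@interface TrackNameParser : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *names;    // the only data we keep
@property (nonatomic, strong) NSMutableString *text;    // text of the element being parsed
@property (nonatomic, copy)   NSString *lastKey;        // the last <key> element seen
- (BOOL)parseLibraryAtPath:(NSString *)path;
@end

@implementation TrackNameParser

- (BOOL)parseLibraryAtPath:(NSString *)path {
    self.names = [NSMutableArray array];
    self.text = [NSMutableString string];
    NSXMLParser *parser = [[NSXMLParser alloc]
        initWithContentsOfURL:[NSURL fileURLWithPath:path]];
    parser.delegate = self;
    return [parser parse];   // delegate callbacks fire synchronously during this call
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    [self.text setString:@""];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    [self.text appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"key"]) {
        self.lastKey = self.text;                        // copy property snapshots the value
    } else if ([elementName isEqualToString:@"string"] &&
               [self.lastKey isEqualToString:@"Name"]) {
        [self.names addObject:[self.text copy]];         // keep only what we need
        self.lastKey = nil;
    }
}

@end
Everything else in the file is simply ignored, which is where the memory savings come from.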

If you're able to use third-party frameworks, run, do not walk, to EyeTunes. (BSD license.) It's an abstraction layer around Apple Events for communicating with iTunes, and as such it doesn't parse the XML database directly (I think; it's been a while since I've used it), but you'll still have get/set access to anything that's in the XML.

Try to use libxml:
http://www.cimgf.com/2008/08/18/cocoa-tutorial-libxml-and-xmlreader/
To minimize the peak memory footprint, create and drain an NSAutoreleasePool inside your processing loop.
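A minimal sketch of that tip, assuming pre-ARC code; trackCount and processTrack() are hypothetical placeholders for however you iterate over the file:
for (NSUInteger i = 0; i < trackCount; i++) {
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    processTrack(i);   // creates autoreleased temporaries while handling one entry
    [pool drain];      // temporaries are released here, keeping peak memory low
}
// Under ARC, the equivalent is wrapping the loop body in an @autoreleasepool { } block.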

Which methods/calls perform the disk I/O operations and how to find them?

Which methods and system calls should I hook into, so I can replace how an OS X app (the target) reads and writes to/from the hard disk?
How can I determine that list of functions or system calls?
Adding more context:
This is a final project and I'm looking for advice. The goal is to alter the behavior of an OS X app, adding data encryption and decryption capabilities to it.
Which tools could I use to achieve my goal, and why?
For instance, assume the target app is TextEdit. Instead of saving "hello world" as plain text in a .txt file on the disk, it will save "ifmmnXxnpme". Opening the file will show the original text.
I think it's better to be realistic, or at least conscious, about what you want to do.
The lowest level in software is a kernel module sitting on top of the storage drivers that "encrypts" the data.
In Windows you can stack drivers, so conceptually you simply intercept the call for a read/write, edit it and pass it down the driver stack.
Under BSD there is surely an equivalent mechanism, but I don't know precisely what it is.
I don't think you want to dig into kernel programming.
At the lowest level, from a user-space application's point of view, there are the system calls.
The system calls used to write and read are numbers 3 and 4 respectively (see here); in BSD-derived OSes like OS X they become 2000003h and 2000004h (see here).
This is IA-32e specific, since you are using Apple computers.
Files can be read/written by memory mapping them, so you would need to hijack the system call sys_mmap too.
This is more complex, as you need to detect page faults or whatever mechanism is used to implement file mapping.
To hijack system calls you need a kernel module again.
The next level of abstraction up is the runtime, which here is probably the Obj-C runtime (as of today, Swift still uses the Obj-C runtime AFAIK).
An Obj-C application uses the Cocoa framework and can read/write files with calls like [NSData dataWithContentsOfFile:myFileName] or [myData writeToFile:myFileName atomically:myAtomicalBehavior].
There are plenty of Cocoa methods that write to or read from files, but internally the framework will use a few methods from the Obj-C runtime.
I'm not an expert on the internals of Cocoa, so you will need a debugger to see what the invocation chain is.
Once you have found the "low level" methods that read or write to files you can use method swizzling.
If the target app loads your code as part of a library, this is really simple; otherwise you need more clever techniques (like injecting code into, or manipulating the memory of, the other process directly). You can Google around for more info.
Again, to be honest, this is still a lot of work, although manageable.
You might consider simply hijacking a limited set of Cocoa methods, for example writeToFile:atomically: on NSData (or the equivalent on NSString), and treat the project as a work-in-progress demo.
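As an illustration of that last suggestion, here is a minimal sketch of swizzling -[NSData writeToFile:atomically:], assuming ARC; the category name, the demo_ prefix, and the XOR "encryption" are placeholders, and class clusters like NSData may route around an override like this in some cases:
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

@interface NSData (HijackDemo)
- (BOOL)demo_writeToFile:(NSString *)path atomically:(BOOL)flag;
@end

@implementation NSData (HijackDemo)

+ (void)load {
    // Swap the implementations of the real method and our replacement.
    Method original    = class_getInstanceMethod(self, @selector(writeToFile:atomically:));
    Method replacement = class_getInstanceMethod(self, @selector(demo_writeToFile:atomically:));
    method_exchangeImplementations(original, replacement);
}

- (BOOL)demo_writeToFile:(NSString *)path atomically:(BOOL)flag {
    // Toy transformation standing in for real encryption.
    NSMutableData *scrambled = [self mutableCopy];
    uint8_t *bytes = scrambled.mutableBytes;
    for (NSUInteger i = 0; i < scrambled.length; i++) {
        bytes[i] ^= 0x2A;
    }
    // Because the implementations were exchanged, this invokes the original writeToFile:atomically:.
    return [scrambled demo_writeToFile:path atomically:flag];
}

@end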
A similar question has been asked and answered here.

How can I hold common runtime data in a Cocoa app?

I am new to Cocoa and Mac OS X development.
Different components of my app use a particular location to store data, and that location is determined when the app starts: for example, a subdirectory of the user's home directory, the system's temporary directory, and similar runtime information used by different classes in my Cocoa app. This information should be determined once at startup and reused later.
Every component should be able to ask a central component for this information rather than each one recalculating it over and over.
Does Cocoa provide some place to hold this data, or do we create a singleton object? Any ideas?
A common pattern for accessing shared model resources is through a singleton model controller class, like you wrote. Here's how I manage creating/accessing singletons:
+ (id)sharedInstance {
    static dispatch_once_t once;
    static SomeModelControllerClass *sharedInstance;
    dispatch_once(&once, ^{
        sharedInstance = [[self alloc] init];
    });
    return sharedInstance;
}
The function dispatch_once guarantees that a given block of code identified by the dispatch_once_t token "once" is only executed once.
The other, more important, question is how to create/store the data that your model controller will manage. There are a couple of options:
Keep it all in memory: If you have a relatively small amount of data that can be held in memory all at once, and regenerated with ease on each app launch, then this is the simplest way. It is probably not a good user experience, though.
NSCoding: Have your model objects implement the NSCoding protocol methods (initWithCoder: and encodeWithCoder:); a sketch follows after this list. Your model controller is then responsible for writing/reading your model stack to/from disk at appropriate times (preferably on a background thread). This technique makes sure your user sees some data immediately upon launch, but it requires that all data be held in memory after it's read from disk. It's a good technique for an app like Twitter, and indeed they used it for many years.
Core Data: Core Data is a great choice for "shoebox"-style apps with lots of data that needs to be stored locally and is too large to keep in memory all at once. It comes with a big learning curve and lots of boilerplate, so I only recommend it if you can't keep all your model objects in memory at once.
Custom storage: There are many third-party frameworks for doing what Core Data does without the headache. Most of them are built on SQLite. Explore GitHub for options that appeal to you; YapDatabase looks like the coolest one to me.
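For the NSCoding option, here is a minimal sketch, assuming ARC; the FavoriteItem class, its properties, and archivePath are illustrative placeholders:
#import <Foundation/Foundation.h>

@interface FavoriteItem : NSObject <NSCoding>
@property (nonatomic, copy) NSString *title;
@property (nonatomic, strong) NSDate *createdAt;
@end

@implementation FavoriteItem

- (void)encodeWithCoder:(NSCoder *)coder {
    [coder encodeObject:self.title forKey:@"title"];
    [coder encodeObject:self.createdAt forKey:@"createdAt"];
}

- (id)initWithCoder:(NSCoder *)coder {
    if ((self = [super init])) {
        _title = [coder decodeObjectForKey:@"title"];
        _createdAt = [coder decodeObjectForKey:@"createdAt"];
    }
    return self;
}

@end

// The model controller can then persist and restore the whole model in one call:
// [NSKeyedArchiver archiveRootObject:self.items toFile:archivePath];
// self.items = [NSKeyedUnarchiver unarchiveObjectWithFile:archivePath];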

Does NSFileWrapper support lazy loading?

I am creating an NSDocument package that contains potentially hundreds of large files, so I don't want to read it all in when opening the document.
I've spent some time searching, but I can't find a definitive answer. Most people seem to think that NSFileWrapper loads all of the data into memory, but some indicate that it doesn't load data until you invoke -regularFileContents on a wrapper. (See Does NSFileWrapper load everything into memory? and Objective-C / Cocoa: Uploading Images, Working Memory, And Storage for examples.)
The documentation isn't entirely clear, but options like NSFileWrapperReadingImmediate and NSFileWrapperReadingWithoutMapping seem to suggest that it doesn't always read everything in.
I gather that NSFileWrapper supports incremental saving, only writing out sub-wrappers that have been replaced. So it'd be nice if it supports incremental loading too.
Is there a definitive answer?
NSFileWrapper loads lazily by default, unless you specify the NSFileWrapperReadingImmediate option. It will avoid reading a file into memory until something actually requests it.
As a debugging aid only, you can see whether a file has been loaded yet, by examining:
[wrapper valueForKey:@"_contents"];
It gets filled in as NSData once the file is read from disk.
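A small sketch of the lazy behavior, assuming a package document at a hypothetical documentURL; the child file name is a placeholder:
NSError *error = nil;
NSFileWrapper *docWrapper = [[NSFileWrapper alloc]
    initWithURL:documentURL
        options:0            // no NSFileWrapperReadingImmediate, so reading stays lazy
          error:&error];

// Only the directory structure should have been walked so far; the children's
// contents have not yet been loaded.
NSFileWrapper *child = [[docWrapper fileWrappers] objectForKey:@"page1.png"];
NSData *bytes = [child regularFileContents];   // the file's data is read here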

How to store preferences for an application?

I am a newbie in Ruby, coming from web development with mainly PHP/SQL. I was thinking about how to store preferences in my application. For instance, I want to store a path as default_path and have it still set when the user restarts the application.
In the web world one would probably store this in a database or XML. A database seems like overkill for a standalone application, but I am unsure whether XML/YAML/some other writable format is the way to go. And if so, where should I store these preferences? Should they be, for instance on a Mac, in ~/Library/MyAppName?
I like using YAML because it's very easily read/written by a lot of languages, making it possible for several apps to share the same configuration info. It's a well documented standard so there should be very little chance of data falling into a hole with it.
Also, because it's easy for a human to understand, and doesn't take any special tools to change, it works nicely for any data that might occasionally change in an app, either for fine-tuning or to enable special behaviors.
A little creative coding on your part that periodically checks the last modified time of the YAML file could make it so your app would modify its behavior on the fly as the prefs file is tweaked. I had a big app I didn't want to shut down for changes and set up that behavior. It ran three weeks straight, and I tweaked its operating parameters via its config file. It would read the file every minute and inherit any changes to its parameters on the fly.
Databases are a good way to store parameters/preferences if it's a centralized server or web-based app. For something distributed that runs on individual machines it makes no sense.
Ruby gives you another method for storing data called Marshaling. This will let you store a class/object to a file and reconstitute it later. If all of your user preferences are stored in a single object (or you can create an object which can hold all of the data that you need), it may be easiest to marshal the data instead of writing import/export routines to a text-based format or trying to pull in an additional library or gem.
As to where on the disk to store the data, that's up to you. Most platforms have a standard location for storing application data based on whether it's available to a single user or all users. It's usually safest to follow the common practice on your target platform of choice.
Update: The simplest example of marshaling would probably be this: Say that you have a class called UserPrefs that you use to store all of your user preferences. You can use the following code to store the preferences data into a file:
my_prefs = UserPrefs.new
# ... Fill in the 'my_prefs' object with the user's preferences, etc ...
# Store the object into a file
File.open("user_prefs.data", "wb") do |file|
  Marshal.dump(my_prefs, file)
end
The next time that you load the application, you can restore those preferences using the following:
# Load prefs from file
my_prefs = nil
File.open("user_prefs.data", "rb") {|f| my_prefs = Marshal.load(f)}
At this point, the my_prefs object should be exactly the same as it was when the marshaling code was originally run. This essentially lets you take a 'snapshot' of an object at one point in time (say, when your program shuts down) and restore it later (say, when your program loads). Internally, all of the data in the structure is encoded into a single string, and that string is what is stored to disk; the Marshal module simply takes care of the encoding and decoding for you.
Here is another example of using marshaling to store and retrieve data.
The default encode/decode routines built into the Marshal module are usually sufficient for most data-storing classes. Particularly complex classes may have problems, and if that is the case then you can define your own encode and decode methods (the first link includes an example of defining custom methods).
Some types of data, however, cannot be marshaled (things like handles to open files, Proc objects, etc.) since they don't normally persist across Ruby sessions. If you need to marshal a class that includes members like these that Marshal doesn't like, you can use custom encode/decode functions to marshal the rest of the class and omit the problematic members.
I have seen some applications use Ruby GConf2.

Cocoa Memory Usage

I'm trying to track down some peculiar memory behavior in my Cocoa desktop app. My app does a lot of image processing using NSImage and uploads those images to a website over HTTP using NSURLConnection.
After uploading several hundred images (some very large), when I run Instruments I get no leaks. I've also run through MallocDebug and get no leaks. When I dig into object allocations using Instruments, I get output like this:
GeneralBlock-9437184, Net Bytes 9437184, # Net 1
GeneralBlock-192512, Net Bytes 2695168, # Net 14
and etc., for smaller sizes. When I look at these in detail, they're marked as being owned by 'Foundation' and created via NSConcreteMutableData initWithCapacity. During HTTP upload I'm creating a post body using NSMutableData, so I'm guessing these are buffers Cocoa is caching for me when I create the NSMutableData objects.
Is there a way to force Cocoa to free these? I'm 90% positive I'm releasing correctly (and Instruments and MallocDebug seem to confirm this), but I think Cocoa is keeping these around for performance reasons since I'm allocating so many NSMutableData buffers.
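For illustration, a hypothetical sketch of the kind of multipart POST body construction the question describes (not the asker's actual code); the boundary string, URL, field names, and imageData are placeholders:
NSString *boundary = @"----UploadBoundary";
NSMutableURLRequest *request = [NSMutableURLRequest requestWithURL:
                                   [NSURL URLWithString:@"http://example.com/upload"]];
[request setHTTPMethod:@"POST"];
[request setValue:[NSString stringWithFormat:@"multipart/form-data; boundary=%@", boundary]
 forHTTPHeaderField:@"Content-Type"];

NSData *imageData = [NSData data];   // placeholder for one of the processed images
NSMutableData *body = [NSMutableData data];
[body appendData:[[NSString stringWithFormat:@"--%@\r\n", boundary]
                     dataUsingEncoding:NSUTF8StringEncoding]];
[body appendData:[@"Content-Disposition: form-data; name=\"file\"; filename=\"photo.jpg\"\r\n\r\n"
                     dataUsingEncoding:NSUTF8StringEncoding]];
[body appendData:imageData];
[body appendData:[[NSString stringWithFormat:@"\r\n--%@--\r\n", boundary]
                     dataUsingEncoding:NSUTF8StringEncoding]];
[request setHTTPBody:body];   // each append grows a mutable data buffer under the hood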
If you're certain you're releasing the objects you own correctly, then there's really nothing you can (or should) do. Those blocks are, as Instruments says, owned by Foundation because NSConcreteMutableData, a Foundation object, created them. It's possible that these are some sort of cache that NSData is keeping around on purpose, but there's no way to know what they are.
If you believe this is a bug, you should report it at http://bugreport.apple.com. The rules of memory ownership apply to classes that don't manage memory well, too.
Also, this might be a silly question, but which option are you using for the Object Alloc tool? All objects created or Created and still living? You might be looking at allocations that don't matter anymore.
