Does NSFileWrapper support lazy loading? - macos

I am creating a NSDocument package that contains potentially hundreds of large files, so I don't want to read it all in when opening the document.
I've spent some time searching, but I can't find a definitive answer. Most people seem to think that NSFileWrapper loads all of the data into memory, but some indicate that it doesn't load data until you invoke -regularFileContents on a wrapper. (See Does NSFileWrapper load everything into memory? and Objective-C / Cocoa: Uploading Images, Working Memory, And Storage for examples.)
The documentation isn't entirely clear, but options like NSFileWrapperReadingImmediate and NSFileWrapperReadingWithoutMapping seem to suggest that it doesn't always read everything in.
I gather that NSFileWrapper supports incremental saving, only writing out sub-wrappers that have been replaced. So it'd be nice if it supports incremental loading too.
Is there a definitive answer?

NSFileWrapper loads lazily by default, unless you specify the NSFileWrapperReadingImmediate option. It will avoid reading a file into memory until something actually requests it.
As a debugging aid only, you can see whether a file has been loaded yet, by examining:
[wrapper valueForKey:@"_contents"];
It gets filled in as NSData once the file is read from disk.
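A minimal sketch of what that looks like in practice, assuming ARC and a package directory on disk. The helper name LoadChild and its parameters are hypothetical; only the NSFileWrapper calls are Foundation API. Passing 0 for the options (rather than NSFileWrapperReadingImmediate) leaves reading deferred, so the only child whose bytes are requested is the one you ask for:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the contents of one child file inside a
// package directory, or nil if the package or the child cannot be read.
static NSData *LoadChild(NSURL *packageURL, NSString *childName, NSError **error) {
    // No NSFileWrapperReadingImmediate here, so file contents are not
    // requested at this point.
    NSFileWrapper *package = [[NSFileWrapper alloc] initWithURL:packageURL
                                                        options:0
                                                          error:error];
    if (package == nil || ![package isDirectory]) {
        return nil;
    }

    // Look up the child wrapper by its file name within the package.
    NSFileWrapper *child = [[package fileWrappers] objectForKey:childName];
    if (child == nil || ![child isRegularFile]) {
        return nil;
    }

    // This is the call that actually asks for the child's bytes.
    return [child regularFileContents];
}
```

In an NSDocument subclass you would more likely keep the root wrapper around and call -regularFileContents on individual children as the user opens them.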

Related

Redis memory management - clear based on key, database or instance

I am very new to Redis. I've implemented caching in our application and it works nicely. I want to store two main data types: a directory listing and file content. It's not really relevant, but this will cache files served up via WebDAV.
I want the file structure to remain almost forever. The file content needs to be cached for a short time only. I have set up my expiry/TTL to reflect this.
When the server reaches memory capacity, is it possible to prioritize certain cached items over others, e.g. by flushing a key, a whole database, or a whole instance of Redis?
I want to keep my directory listing and flush the file content when memory begins to be an issue.
EDIT: Reading this article seems to be what I need. I think I will need to use volatile-ttl. My file content will have a much shorter TTL set, so this should in theory clear that first. If anyone has any other helpful advice I would love to hear it, but for now I am going to implement this.
This article describes what I needed. I have set volatile-ttl as my eviction policy (the maxmemory-policy setting).

Coldfusion/Railo: What's the most efficient way to output file contents - fileRead or include?

While I've always cached database calls and placed commonly used data into memory for faster access, I've been finding of late that simple processing and output of data can add a significant amount of time to page load and thus I've been working on a template caching component that will save parsed HTML to either a file, or in memory, for quicker inclusion on pages.
This is all working very well, reducing some page loads down to 10% of the uncached equivalent - however I find myself wondering what would be the most efficient way to output the content.
Currently I'm using fileRead to pull in the parsed HTML and save to a variable, which is output on the page.
This seems very fast, but I'm noticing the memory used by the Tomcat service gradually increasing - presumably because the fileRead operation is reading the contents into memory, and quite possibly, Tomcat isn't removing that data when it's finished.
(Side question: Anyone know a way that I can interrogate the JVM memory and find details/stack traces of the objects that CF has created??)
Alternatively, I could use cfinclude to simply include the parsed HTML file. From all the information I can find it seems that the speed would be about the same - so would this method be more memory efficient?
I've had issues on the server before with memory usage crashing Tomcat, so keeping it down is quite important.
Is there anyone doing something similar that can give me the benefit of their experience?
cfinclude just includes the template into the one being compiled, whereas fileRead has to read it into memory first and then output it, so technically it is going to consume more memory. I don't expect the speed difference is much, but you can see the difference by just turning on debugging and checking the execution times.
The most efficient way would be to cache it with cachePut() and serve it from cacheGet(). What can be faster than fetching from RAM? Not fetching it at all: send proper Expires headers if it's the whole page, or smartly return 304 Not Modified.
It turns out that CFInclude actually compiles the (already rendered in this case) content into a class, which itself has overhead. The classes aren't unloaded (according to CFTracker) and as such, too many of these can cause permgen errors. FileRead() seems to be orders of magnitude more efficient, as all we're doing is inserting content into the output buffer.

Read a File From Cache, But Without Polluting the Cache (in Windows)

Windows has a FILE_FLAG_NO_BUFFERING flag that allows you to specify whether or not you want your I/O to be cached by the file system.
That's fine, but what if I want to use the cache if possible, but avoid modifying it?
In other words, how do you tell Windows the following?
Read this file from the cache if it's already cached, but my data doesn't exhibit locality, so do not put it into the cache!
The SCSI standard defines a Disable Page Out bit that does precisely this, so I'm wondering how (if at all) it is possible to use that feature from Windows (with cooperation of the file system cache too, of course)?
Edit: TL;DR:
What's the equivalent of FILE_FLAG_WRITE_THROUGH for reads?
About the closest Windows provides to what you're asking is FILE_FLAG_WRITE_THROUGH.
I see two flags that look suspiciously like what you are asking for:
FILE_FLAG_RANDOM_ACCESS
FILE_FLAG_SEQUENTIAL_SCAN
The latter's doc clearly suggests that it won't retain pages in cache, though it will probably read ahead sequentially. The former's doc is completely opaque, but would seem to imply what you want. If the pattern is quite random, hanging onto pages for later reuse would be a waste of memory.
Keep in mind that, for files, the Windows kernel always will use some pages of 'cache' to hold the I/O. It has nowhere else to put it. So it's not meaningful to say 'don't cache it,' as opposed to 'evict the old pages of this file before evicting some other pages.'

Cocoa Memory Usage

I'm trying to track down some peculiar memory behavior in my Cocoa desktop app. My app does a lot of image processing using NSImage and uploads those images to a website over HTTP using NSURLConnection.
After uploading several hundred images (some very large), when I run Instruments I get no leaks. I've also run through MallocDebug and get no leaks. When I dig into object allocations using Instruments I get output like this:
GeneralBlock-9437184, Net Bytes 9437184, # Net 1
GeneralBlock-192512, Net Bytes 2695168, # Net 14
and so on, for smaller sizes. When I look at these in detail, they're marked as being owned by 'Foundation' and created via NSConcreteMutableData initWithCapacity. During HTTP upload I'm creating a post body using NSMutableData, so I'm guessing these are buffers Cocoa is caching for me when I create the NSMutableData objects.
Is there a way to force Cocoa to free these? I'm 90% positive I'm releasing correctly (and Instruments and MallocDebug seem to confirm this), but I think Cocoa is keeping these around for perf reasons since I'm allocating so many NSMutableData buffers.
If you're certain you're releasing the objects you own correctly, then there's really nothing you can (or should) do. Those blocks are, as Instruments says, owned by Foundation because NSConcreteMutableData, a Foundation object, created them. It's possible that these are some sort of cache that NSData is keeping around on purpose, but there's no way to know what they are.
If you believe this is a bug, you should report it at http://bugreport.apple.com. The rules of memory ownership apply to classes that don't manage memory well, too.
Also, this might be a silly question, but which option are you using for the Object Alloc tool? All objects created or Created and still living? You might be looking at allocations that don't matter anymore.

iTunes XML Parsing in cocoa

I am developing an application in Cocoa. I need to parse a large iTunes XML file (about 25 MB). I am currently using the following code snippet:
NSDictionary *itunesDatabase = [NSDictionary dictionaryWithContentsOfFile:itunesPath];
But this is a little slow.
Is there a faster way to load the entire data into a dictionary?
The reason you're having such slow performance is because NSDictionary reads everything into memory all at once. For a large iTunes library, this can take a long time and -- feel free to confirm this with Activity Monitor -- a metric assload of memory. (This is the precise technical term for that amount of memory)
The alternative in these situations is to use a callback-based XML parser (generally known as "SAX" parsers). These parse XML documents an entity at a time and call your callback methods. In Cocoa, the NSXMLParser class provides this functionality. You set your class as its delegate, call the parse method, and the parser starts calling the delegate methods as it reads tags, attributes, text, etc. in the XML file.
Now, this is obviously harder than just loading everything into an NSDictionary and walking the resulting tree of objects. You'll need to keep track of state information yourself. And you'll have to "build up" your objects progressively, so organizing your classes can be difficult.
However, you can ignore the XML you aren't interested in, and that saves a lot of memory. And, depending on what data you're getting out of iTunes, you may also be able to end the parsing as soon as you've gotten the data you need. Even if this does end up taking quite a while, at least you'll be able to show your user a progress bar or some other indication that your program is working, which is much better than just hanging for 10-20 seconds while NSDictionary loads a giant XML file.
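As a rough sketch of the delegate approach, assuming ARC: the class NameCollector and the function CollectNames below are hypothetical names for illustration, and the code assumes the plist-style layout where a value follows its key (a key element containing Name, then a string element). It collects every such string, not only track names, and keeps just enough state to know which element it is in:

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate: collects the <string> value that follows each
// <key>Name</key> and ignores everything else in the document.
@interface NameCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *names;
@property (nonatomic, strong) NSMutableString *text;  // text of the current element
@property (nonatomic, assign) BOOL nextStringIsName;  // YES after <key>Name</key>
@end

@implementation NameCollector

- (instancetype)init {
    if ((self = [super init])) {
        _names = [NSMutableArray array];
        _text = [NSMutableString string];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    // Start accumulating text afresh for every element.
    [self.text setString:@""];
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    // May be called several times per element, so append rather than assign.
    [self.text appendString:string];
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {
    if ([elementName isEqualToString:@"key"]) {
        self.nextStringIsName = [self.text isEqualToString:@"Name"];
    } else if ([elementName isEqualToString:@"string"] && self.nextStringIsName) {
        [self.names addObject:[self.text copy]];
        self.nextStringIsName = NO;
    }
    [self.text setString:@""];
}

@end

// Returns the collected names, or nil if the XML could not be parsed.
static NSArray *CollectNames(NSData *xml) {
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:xml];
    NameCollector *collector = [[NameCollector alloc] init];
    parser.delegate = collector;
    return [parser parse] ? collector.names : nil;
}
```

To stop early once you have what you need, the delegate can call -abortParsing on the parser; -parse then returns NO, so a real implementation would treat that case separately from a genuine parse error.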
If you're able to use third-party frameworks, run, do not walk to EyeTunes. (BSD license.) It's an abstraction layer around Apple Events for communicating with iTunes, and as such it doesn't parse the XML database directly (I think, it's been a while since I've used it), but you'll have get/set access to anything in the XML.
Try to use libxml:
http://www.cimgf.com/2008/08/18/cocoa-tutorial-libxml-and-xmlreader/
To minimize peak memory footprint, create and drain an NSAutoreleasePool inside your loop.
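A small sketch of the per-iteration pool idea, using the @autoreleasepool block, which is the current spelling of creating and draining an NSAutoreleasePool by hand. The function TotalBytes and its file-reading loop are hypothetical stand-ins for whatever work your own loop does per item:

```objc
#import <Foundation/Foundation.h>

// Hypothetical example: sums the sizes of the files at the given paths.
// Each iteration gets its own pool, so temporary objects created while
// handling one file can be released before the next file is read.
static unsigned long long TotalBytes(NSArray *paths) {
    unsigned long long total = 0;
    for (NSString *path in paths) {
        @autoreleasepool {
            // Autoreleased temporary; nil (length 0) if the file is unreadable.
            NSData *data = [NSData dataWithContentsOfFile:path];
            total += [data length];
        }
    }
    return total;
}
```

Without the inner pool, the temporaries from every iteration would stay alive until the enclosing pool drains, which is what drives up the peak footprint.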
