Leaking memory with Cocoa garbage collection

Leaking memory with Cocoa garbage collection - cocoa

I've been beating my head against a wall trying to figure out how I had a memory leak in a garbage collected Cocoa app. (The memory usage in Activity Monitor would just grow and grow, and running the app using the GC Monitor instruments would also show an ever-growing graph.)
I eventually narrowed it down to a single pattern in my code. Data was being loaded into an NSData and then parsed by a C library (the data's bytes and length were passed into it). The C library has callbacks which would fire and return sub-string starting pointers and lengths (to avoid internal copying). However, for my purposes, I needed to turn them into NSStrings and keep them around awhile. I did this by using NSString's initWithBytes:length:encoding: method. I assumed that would copy the bytes and NSString would manage it appropriately, but something is going wrong because this leaks like crazy.
This code will "leak" or somehow trick the garbage collector:
- (void)meh
{
NSData *data = [NSData dataWithContentsOfFile:[[NSBundle mainBundle] pathForResource:#"holmes" ofType:#"txt"]];
const int substrLength = 80;
for (const char *substr = [data bytes]; substr-(const char *)[data bytes] < [data length]; substr += substrLength) {
NSString *cocoaString = [[NSString alloc] initWithBytes:substr length:substrLength encoding:NSUTF8StringEncoding];
[cocoaString length];
}
}
I can put this in timer and just watch memory usage go up and up with Activity Monitor as well as with the GC Monitor instrument. (holmes.txt is 594KB)
This isn't the best code in the world, but it shows the problem. (I'm running 10.6, the project is targeted for 10.5 - if that matters). I read over the garbage collection docs and noticed a number of possible pitfalls, but I don't think I'm doing anything obviously against the rules here. Doesn't hurt to ask, though. Thanks!
Project zip
Here's a pic of the object graph just growing and growing:

This is an unfortunate edge case. Please file a bug (http://bugreport.apple.com/) and attach your excellent minimal example.
The problem is two fold;
The main event loop isn't running and, thus, the collector isn't triggered via MEL activity. This leaves the collector doing its normal background only threshold based collections.
The data stores the data read from the file into a malloc'd buffer that is allocated from the malloc zone. Thus, the GC accounted allocation -- the NSData object itself -- is really tiny, but points to something really large (the malloc allocation). The end result is that the collector's threshold isn't hit and it doesn't collect. Obviously, improving this behavior is desired, but it is a hard problem.
This is a very easy bug to reproduce in a micro-benchmark or in isolation. In practice, there is typically enough going on that this problem won't happen. However, there may be certain cases where it does become problematic.
Change your code to this and the collector will collect the data objects. Note that you shouldn't use collectExhaustively often -- it does eat CPU.
- (void)meh
{
NSData *data = [NSData dataWithContentsOfFile:[[NSBundle mainBundle] pathForResource:#"holmes" ofType:#"txt"]];
const int substrLength = 80;
for (const char *substr = [data bytes]; substr-(const char *)[data bytes] < [data length]; substr += substrLength) {
NSString *cocoaString = [[NSString alloc] initWithBytes:substr length:substrLength encoding:NSUTF8StringEncoding];
[cocoaString length];
}
[data self];
[[NSGarbageCollector defaultCollector] collectExhaustively];
}
The [data self] keeps the data object alive after the last reference to it.

Related

Why can't I use cocoa frameworks in different forked processes?

I was playing with the NSSound class to play a sound in a background process of my own so as to not block user input. I decided to call fork() but that is giving me problems. At the very moment the sound is allocated the forked process crashes. The funny thing is if I construct an example which only calls fork(), then the child process can call NSSound without problems, the crashes come only if I try to use other cocoa APIs before the fork()call. See this example with the crashme?() calls commented:
#import <AppKit/AppKit.h>
#import <math.h>
#define FILENAME \
"/System/Library/Components/CoreAudio.component/" \
"Contents/SharedSupport/SystemSounds/dock/drag to trash.aif"
void crashme1(void)
{
NSString *path = [[NSString alloc] initWithUTF8String:"false file"];
NSSound *sound = [[NSSound alloc]
initWithContentsOfFile:path byReference:YES];
}
void crashme2(void)
{
NSInteger tag = 0;
[[NSWorkspace sharedWorkspace]
performFileOperation:NSWorkspaceRecycleOperation
source:#"." destination:nil
files:[NSArray arrayWithObject:#"false file"] tag:&tag];
}
double playAif(void)
{
NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
NSString *path = [[NSString alloc] initWithUTF8String:FILENAME];
NSSound *sound = [[NSSound alloc]
initWithContentsOfFile:path byReference:YES];
[path release];
if (!sound) {
[pool release];
return -1;
}
const double ret = [sound duration];
[sound play];
[sound autorelease];
[pool release];
return ret;
}
int main(int argc, char *argv[])
{
//crashme1();
//crashme2();
int childpid = fork();
if (0 == childpid) {
printf("Starting playback\n");
double wait = playAif();
sleep(ceil(wait));
wait = playAif();
sleep(ceil(wait));
printf("Finished playback\n");
}
return 0;
}
When I run this from the commandline it works, but if I uncomment one of the calls to either crashme?() functions the playback in the forked process never starts. Is it because any of the Cocoa framework APIs are not async-signal safe? Is there any way to make the calls async signal safe by wrapping them somehow?

I'll quote the Leopard CoreFoundation Framework Release Notes. I don't know if they're still online, since Apple tends to replace rather than extend the Core Foundation release notes.
CoreFoundation and fork()
Due to the behavior of fork(), CoreFoundation cannot be used on the
child-side of fork(). If you fork(), you must follow that with an
exec*() call of some sort, and you should not use CoreFoundation APIs
within the child, before the exec*(). The applies to all higher-level
APIs which use CoreFoundation, and since you cannot know what those
higher-level APIs are doing, and whether they are using CoreFoundation
APIs, you should not use any higher-level APIs either. This includes
use of the daemon() function.
Additionally, per POSIX, only async-cancel-safe functions are safe to
use on the child side of fork(), so even use of lower-level
libSystem/BSD/UNIX APIs should be kept to a minimum, and ideally to
only async-cancel-safe functions.
This has always been true, and there have been notes made of this on
various Cocoa developer mailling lists in the past. But CoreFoundation
is taking some stronger measures now to "enforce" this limitation, so
we thought it would be worthwhile to add a release note to call this
out as well. A message is written to stderr when something uses API
which is definitely known not to be safe in CoreFoundation after
fork(). If file descriptor 2 has been closed, however, you will get no
message or notice, which is too bad. We tried to make processes
terminate in a very recognizable way, and did for a while and that was
very handy, but backwards binary compatibility prevented us from doing
so.
In other words, it has never been safe to do much of anything on the child side of a fork() other than exec a new program.
Besides the general POSIX prohibition, an oft-mentioned explanation is: a) pretty much all Cocoa programs are multithreaded these days, what with GCD and the like. B) when you fork(), only the calling thread survives into the child process; the other threads just vanish. Since those threads could have been manipulating shared resources, the child process can't rely on having sane state. For example, the malloc() implementation may have a lock to protect shared structures and that lock could have been held by one of the now-gone threads at the time of the fork(). So, if the remaining thread tries to allocate memory, it may hang indefinitely. Other shared data structures may simply be in a corrupted state. Etc.

How Does NSZombies Really Work?

I cannot find any detailed apple documentation on how the NSZombie really functions. I understand that its designed to not actually release objects and just maintain a count of references to catch any extra releases, but how would something like this work:
for(int i = 1; i < 10; i++)
{
NSMutableArray *array = [[NSMutableArray alloc] initWithCapacity: i];
[array release];
}
Since the same variable/object is being allocated/initialized and released in the same application, how would NSZombie's technically handle this? I know that this shouldn't flag any zombies because every alloc has a release, but how would Xcode technically handle re-allocating the same memory with different capacities?

With Zombies, objects don't actually need to be freed[1] -- the object is simply turned into a "Zombie" at some point after the object's retain count reaches 0. When you message a "Zombified" instance, a special error handler is performed.
1) Freeing Zombies is optional. Unless you really need the memory for a long running or memory intensive task, it is a more effective test to not to free the zombies (NSDeallocateZombies = NO)

This question was answered in the comments by Brad Larson.
Quote:
That isn't the same object, or the same memory. You're creating a distinct, new NSMutableArray instance on every pass through that loop. Just because a pointer to each is assigned to array does not make them the same object.
A pointer merely points to a particular location in memory where the object exists. A given object in memory can have multiple pointers to it, or even none (when it is being leaked). NSZombie acts on the object itself, not pointers to it.

RetainCount Memory not free'ed

I am stuck with the Cocoa Memory Managagment.
- (IBAction) createPush:(UIButton *)sender {
[create setEnabled:NO];
[release setEnabled:YES];
aLocation = [[Location alloc] init];
// Put some Example Stuff in the Class
aLocation.title = #"Apartment";
aLocation.street = #"Examplestreet 23";
aLocation.zip = #"12345";
aLocation.city = #"Exampletown";
aLocation.http = #"http://google.com";
aLocation.info = #"First Info Text";
aLocation.info2 = #"Second Info Text, not short as the first one";
aLocation.logoPath = #"http://google.de/nopic.jpg";
[aLocation.pictures addObject:#"http://google.de/nopic.jpg"];
[aLocation.pictures addObject:#"http://google.de/nopic.jpg"];
}
- (IBAction) releasePush:(UIButton *)sender {
[release setEnabled:NO];
[create setEnabled:YES];
[aLocation release];
aLocation = nil;
}
This Code works fine if I set or get Variables, but when I call the 'last' release (so the retain count is 0) it dealloc Method of aLocation gets called, but in Instruments Allocations you see that no memory is given back.
Here the Sources of Location:
http://homes.dnsalias.com/sof/Location.m
same Link with a '.h' instead of '.m' for the Header file (sorry its because of the Spaming Rule).
And the whole Project: http://homes.dnsalias.com/sof/Location.zip
Thanks for any help, where is my failure? Dennis

This Code works fine if I set or get
Variables, but when I call the 'last'
release (so the retain count is 0) it
dealloc Method of aLocation gets
called, but in Instruments Allocations
you see that no memory is given back.
What do you mean by "no memory is given back"?
Though oddly named, the memory management of aLocation is correct is the above code (assuming you have released it in dealloc as well).
Why doesn't memory use decrease when a single object is released?
(Paraphrased)
It is likely that your object is relatively tiny and, thus, that single deallocation falls below the ~20K or so needed to show up in Instruments.
If your app is crashing due to memory use issues, looking to a single deallocation is the wrong place to start. The first thing to do is to answer why your app is accreting memory and what is responsible for that growth.
Configure the Allocations instrument to only track live allocations. Then sort by total memory use. That'll show you what type of allocation is consuming the most memory. Start by reducing that.
Heapshot analysis can be very effective in these situations.

Additional Infos here because of maximum number of links and I have'nt the opportunity to post images...
What do you mean by "no memory is given back"?
I will show you the Instruments run, then it should be clear.
Screenshots from Instruments run
If want more Detail klick here for Instruments Run.

Your code is just fine. You are mistaken the output from Instruments. There is no Location object leaking.
For leaks, use the "Leaks" instrument. It won't fire. :-)

Leak problem with initWithContentsOfFile

My app is running good and looks ok. But after I ran it with Instruments, I found tons of leaks on it. There are several things seems not wrong like a code below. Is the code really have problems? Any single word will be helpful to me.
#interface GameData : NSObject
{
NSDictionary* _data;
NSDictionary* _localData;
}
#end
#implementation GameData
- (id) init
{
NSString* dataFilename = [[NSBundle mainBundle]pathForResource:#"GameData" ofType:#"plist"];
_data = [[NSDictionary alloc]initWithContentsOfFile:dataFilename]; // Leaks 48 bytes
NSString* localDataFilename = [[NSBundle mainBundle]pathForResource:#"GameData-Local" ofType:#"plist"];
_localData = [[NSDictionary alloc]initWithContentsOfFile:localDataFilename];
return self;
}
- (void) dealloc
{
[_data release];
[_localData release];
[super dealloc];
}
#end

Some operations may cause the frameworks to store static data structures that are never released. For instance, the implementation of -initWithContentsOfFile: may set up some internal settings the first time it's used which are then in place until the app is terminated, possibly for performance optimization reasons. This is not a real leak, but leak detection software will sometimes flag it as such. There's also the possibility that the framework itself has a bug that causes a memory leak but this is rare, especially for a well-established class like NSDictionary.
You're not leaking in your code, it's correct as far as I can tell. If your -dealloc method is being called (add a log statement to be sure) then you are fulfilling your part of the contract and any leak is not your fault.
It's probably worth using the ObjectAlloc instrument as this gives you a much better idea of what objects are allocated and are hanging around.

Working around NSFileHandle NSTask blocking when passing large amounts of data

I've got a bit of code that handles exporting data from my application. It takes in an NSString full of XML and runs it through a PHP script to generate HTMl, RTF, etc. It works well unless a user has a large list. This is apparently due to it overrunning the 8k or so buffer of NSPipe.
I worked around it (I think) in the readPipe and readHandle, but I'm not sure how to handle it in the writeHandle/writePipe. The application will beachball at [writeHandle writeData:[in... unless I break on it in gdb, wait a few seconds and and then continue.
Any help on how I can workaround this in my code?
- (NSString *)outputFromExporter:(COExporter *)exporter input:(NSString *)input {
NSString *exportedString = nil;
NSString *path = [exporter path];
NSTask *task = [[NSTask alloc] init];
NSPipe *writePipe = [NSPipe pipe];
NSFileHandle *writeHandle = [writePipe fileHandleForWriting];
NSPipe *readPipe = [NSPipe pipe];
NSFileHandle *readHandle = [readPipe fileHandleForReading];
NSMutableData *outputData = [[NSMutableData alloc] init];
NSData *readData = nil;
// Set the launch path and I/O for the task
[task setLaunchPath:path];
[task setStandardInput:writePipe];
[task setStandardOutput:readPipe];
// Launch the exporter, it will convert the raw OPML into HTML, Plaintext, etc
[task launch];
// Write the raw OPML representation to the exporter's input stream
[writeHandle writeData:[input dataUsingEncoding:NSUTF8StringEncoding]];
[writeHandle closeFile];
while ((readData = [readHandle availableData]) && [readData length]) {
[outputData appendData:readData];
}
exportedString = [[NSString alloc] initWithData:outputData encoding:NSUTF8StringEncoding];
return exportedString;
}

There's a new API since 10.7, so you can avoid using NSNotifications.
task.standardOutput = [NSPipe pipe];
[[task.standardOutput fileHandleForReading] setReadabilityHandler:^(NSFileHandle *file) {
NSData *data = [file availableData]; // this will read to EOF, so call only once
NSLog(#"Task output! %#", [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding]);
// if you're collecting the whole output of a task, you may store it on a property
[self.taskOutput appendData:data];
}];
Probably you want to repeat the same above for task.standardError.
IMPORTANT:
When your task terminates, you have to set readabilityHandler block to nil; otherwise, you'll encounter high CPU usage, as the reading will never stop.
[task setTerminationHandler:^(NSTask *task) {
// do your stuff on completion
[task.standardOutput fileHandleForReading].readabilityHandler = nil;
[task.standardError fileHandleForReading].readabilityHandler = nil;
}];
This is all asynchronous (and you should do it async), so your method should have a ^completion block.

NSFileHandle availableData seems to block infinitely:
I made a test program that I call with NSTask from another program. I assign the NSFileHandle to stdin and read the data from pipe. The test program floods the stdout with lots of text using NSLog function. There seems to be no way around it, no matter what API in the NSFileHandle I use, sooner or later the availableData blocks and then the app will hang infinitely and will do nothing. It actually stops to the data read statement, no matter if it is placed in the while or inside it. I tried also reading bytes, one by one, does not help
either:
data = [file readDataOfLength: 1]; // blocks infinitely
data = [file availableData]; // blocks infinitely
This works for a while until it also freezes. Seems that I have noticed that the NSFileHandle API does not really work with shell commands that output lots of data, so I have to work around this by using Posix API instead.
Every single example of how to read the data in parts with this API found from Stack Overflow or other sites in the Internet, either synchronously, or asynchronously, seems to block to last fileAvailableData read.

The simple, painful truth is that writing a lot of data to a subprocess and then reading a lot of data back from it is not something you can do in a single function or method without blocking the UI.
The solution is just as simple, and is certainly a painful-looking prospect: Make the export asynchronous. Write data as you can, and read data as you can. Not only are you then not blocking the UI, you also gain the ability to update a progress indicator for a really long export, and to do multiple exports in parallel (e.g., from separate documents).
It's work, but the UI payoffs are big, and the result is a cleaner design both internally and externally.

To be clear, this isn't just slow but it's actually freezing until you break in with a debugger? It's not a problem with your sub-process?
One would expect NSFileHandle to handle any data you throw at it, but perhaps you can split your data into smaller chunks using -subdataWithRange: to see what effect that has. You could also grab the fileDescriptor and use the POSIX APIs (fdopen, fwrite, etc.) to write to the stream. The POSIX APIs would provide more flexibility, if indeed that's what you need.

I believe what was happening was I was running into a deadlock caused by the [writeHandle writeData:[input dataUsingEncoding:NSUTF8StringEncoding]]; line filling overfilling its buffer and causing the application to hang until it is (never) emptied.
I worked around it by dispatching the write operation off to a separate thread.

Have a look at asynctask.m from this gist.
asynctask.m allows to process more than 8k of input data.

Develop Reference

ruby bash windows laravel spring algorithm oracle macos go visual-studio