Mac OS X 10.6, Cocoa project, with retain/release gc
I've got a function which:
iterates over a specific directory, scans it for subfolders (including nested ones), builds an NSMutableArray of strings (one string per subfolder path found), and returns that array.
e.g. (error handling removed for brevity).
NSMutableArray *ListAllSubFoldersForFolderPath(NSString *folderPath)
{
    NSMutableArray *a = [NSMutableArray arrayWithCapacity:100];
    NSString *itemName = nil;
    NSFileManager *fm = [NSFileManager defaultManager];
    NSDirectoryEnumerator *e = [fm enumeratorAtPath:folderPath];
    // The enumerator recurses, so nested subfolders are visited too.
    while ((itemName = [e nextObject])) {
        NSString *fullPath = [folderPath stringByAppendingPathComponent:itemName];
        BOOL isDirectory = NO;
        if ([fm fileExistsAtPath:fullPath isDirectory:&isDirectory]) {
            if (isDirectory) {
                [a addObject:fullPath];
            }
        }
    }
    return a; // autoreleased; the caller must retain it to keep it
}
The calling function takes the array just once per session, keeps it around for later processing:
static NSMutableArray *gFolderPaths = nil;
...
gFolderPaths = ListAllSubFoldersForFolderPath(myPath);
[gFolderPaths retain];
All appears good at this stage. [gFolderPaths count] returns the correct number of paths found, and [gFolderPaths description] prints out all the correct path names.
The problem:
When I go to use gFolderPaths later (say, the next run through my event loop) my assertion code (and gdb in Xcode) tells me that it is nil.
I am not modifying gFolderPaths in any way after that initial grab, so I am presuming that my memory management is screwed and that gFolderPaths is being released by the runtime.
My assumptions/presumptions
I do not have to retain each string as I add it to the mutable array because that is done automatically, but I do have to retain the array once it is handed over to me from the function, because I won't be using it immediately. Is this correct?
Any help is appreciated.
Objects do not “go nil”.
static NSMutableArray *gFolderPaths = nil;
This declaration declares that gFolderPaths is a variable that holds a pointer to an NSMutableArray object.
You initialize it with a pointer to no object: nil.
This initialization is valid, and makes sense because you don't have an array to put here yet—better to initialize with the nil pointer than to not initialize and risk some random pointer being in the variable. (That can't happen with a static variable, as static variables are initialized to nil anyway, but explicitness is good and the explicit initialization is harmless.)
When I go to use gFolderPaths later (say, the next run through my event loop) my assertion code (and gdb in Xcode) tells me that it is nil.
I am not modifying gFolderPaths in any way after that initial grab, so I am presuming that my memory management is screwed and that gFolderPaths is being released by the runtime.
No. The runtime does not release objects. The runtime is part of the language, and retain and release are part of the Foundation framework. The framework sits just on top of the language.
So, you might guess that you or some other code (e.g., in a framework) released the object whose pointer you had previously stored in gFolderPaths.
No. If that had happened, the gFolderPaths variable would not suddenly contain nil; it would still contain the same pointer to the same object. If this were the last release before the object's death, the gFolderPaths variable would still contain the same pointer to the same, now-dead, object.
Attempting to log the pointer (e.g., with NSLog(@"%p", gFolderPaths)) would print a valid-looking address, such as 0x2381ab6780. Attempting to log the object (e.g., with %@) would almost certainly crash, because the object is dead.
That's not what happened. You said that your assertion and your commands to the debugger revealed that the gFolderPaths variable contains nil.
There are two obvious possibilities:
1. Something re-assigned to the variable. You say that no code of yours reassigns to the variable. Nothing else should know about it, so this possibility is extremely unlikely.
2. You never assigned an object's pointer to the variable in the first place. Either you assigned nil, or you never assigned anything. You say that you're logging the array whose pointer you assigned to the variable, and that the description checks out, so we can dismiss this possibility entirely. (Logging the count would not be so reliable a test, as [nil count] will successfully return 0.)
That leads to a third possibility:
3. You have two gFolderPaths variables.
I'm guessing you have two functions or methods (or one of each) that both contain this line:
static NSMutableArray *gFolderPaths = nil;
That won't work. Both gFolderPaths variables are static, but also local to the function/method you declared them in. Each function/method gets its own gFolderPaths variable, so you have two such variables, separate from each other.
You need to declare gFolderPaths as a static global variable, outside of any function or method. Better yet, if it is only being accessed from instances, make it an instance variable. Either way, it cannot be a local variable if you want to share it between two functions or methods.
The other way this could happen is if you have two such global declarations, but each in a different file. static on a variable declared at file scope means “only visible within this file”, so this causes the same problem: Two separate variables when you mean to have one shared variable. If this is your problem, the immediate fix is to remove the static keyword from both of them, but you should rethink your design if you mean to use a global variable in this way.
Does creating intermediate variables cause the garbage collector to do more work?
That is, is there any difference between:
output = :asdf.to_s.upcase
and
str = :asdf.to_s
output = str.upcase
? (Assume str is never referenced again.)
It would be a trivial amount of extra work when marking objects still referenced, assuming both str and output were still in scope (i.e. the binding where they exist was still active) when the GC mark phase began. Both variables would start a mark on the same string. I don't know, but suspect that when marking objects as still viable, if Ruby comes across an item already marked, it will probably stop recursing and go to its next item at the same level. In this case the String is a single object without child objects to mark further, so it's one quick call to rb_gc_mark repeated for each reference to the String - one case where it is marked, and another case where Ruby notes it has already been marked and stops recursing.
If neither variable were in any active binding when GC mark phase began, it is no extra work, the String referenced would not get marked (no work) and the sweep phase would delete it just once (same work no matter how many references were active before).
I have a unique situation with setStringValue: and hoping someone could clear this up:
Using the following theoretical example (not literal) code:
NSString *myVar;
[myOutlet setStringValue:myVar];
It appears that for any string value such as:
myVar = @"hello";
a pointer is passed to myOutlet and the NSTextField points to the same memory location as myVar, essentially making them identical. In essence:
myVar == [myOutlet stringValue];
returns TRUE.
HOWEVER
in this situation:
myVar = @"";
it seems as if it is not passing a pointer, but rather NSTextField is creating its own independent memory location to store its empty string, essentially:
myVar == [myOutlet stringValue];
returns FALSE.
Can anyone confirm whether this is true, and if so, explain why? I believe this to be the source of a very complex problem I'm having in a piece of code I'm working on and I'm trying to wrap my mind around the root of the problem.
Thanks!
Basically, it's pure chance that the first situation works out. These pointers are absolutely not guaranteed to be equal, and if you need to compare strings, use -isEqualToString: always.
What you're running into is probably an optimization of some sort, to avoid storing @"hello" more than once. We have no way of knowing when this will or will not happen, and it may change in the future, or from device to device.
This is very annoying, since my variables now behave as if they contain something by default (as in C). All my OO code is now broken, since my delegates are always something rather than nil. I was wondering why Xcode does this to me, since by default Xcode always sets the variable value to 0 or nil.
So if I do
NSArray* anArray;
and then
NSLog(@"%@", anArray);
it could crash or log whatever happened to be left in that memory. This is very frustrating.
C, Objective C, and C++ all initialize global variables to zero/null/Nil. Local variables are not automatically initialized and must be explicitly initialized.
Additionally, a pointer to a NSArray is not an NSArray. Before using that pointer, you should arrange for an NSArray to actually be at the end of it. For instance, make a new one, something more like
// NSArray* anArray = new NSArray; // if using a C++ backend
NSArray* anArray = [[NSArray alloc] init]; // if using an Objective-C backend
// ...
NSLog(@"%@", anArray);
Objective C behaves much the same as C in this regard, i.e. non-global variables are not initialised by default. Code defensively and always initialise pointer variables explicitly (either to NULL or to a valid address).
Let's assume myProp is @property (retain) NSString *myProp and is synthesized for us.
self.myProp = @"some value"; // string literal?
self.myProp = [NSString stringWithString:@"some value"];
Is there a difference here?
Is the first one a string literal that doesn't get autoreleased, or is it just in memory for its current scope and I don't have to worry about it?
You might want to spend some time with the String Programming Guide. From the linked section (about creating @"string constants"):
"Such an object is created at compile time and exists throughout your program’s execution. The compiler makes such object constants unique on a per-module basis, and they’re never deallocated, though you can retain and release them as you do any other object."
A string literal is a hard-coded NSString that exists indefinitely while your application is running. You can even send it messages that NSString responds to, such as [@"test" length].
To answer your question, the first line is setting the property to the string literal, while the second is going through an extra step of creating a new string based off the string literal.
To add to the posts by Joshua and Preston, in fact [NSString stringWithString:xxx] returns xxx itself when xxx is a literal.
This is an implementation detail, so you shouldn't write any program relying on this fact, but it's fun to know.
You can check this fact thus:
NSString *a = @"foo";
NSString *b = [NSString stringWithString:a];
NSLog(@"a at %p, class %@", a, [a class]);
NSLog(@"b at %p, class %@", b, [b class]);
At least on my 10.6.3 box, both give the same address, with class NSCFString.
Remember: retain & release concern your responsibility on the ownership, and they don't always decrease/increase the retain count. The implementation can do whatever optimization it wants, as long as the said optimization doesn't break the ownership policy.
Or in other words: write retain & release so that the objects are kept/destroyed in the case the implementation always does the naive increase/decrease of the retain count. That's the contract between Cocoa and you. But Cocoa can do and indeed does a lot of optimization.
I am working my way through Ferret (Ruby port of Lucene) code to solve
a bug. Ferret code is mainly a C extension to Ruby. I am running into
some issues with the garbage collector. I managed to fix it, but I
don't completely understand my fix =) I am hoping someone with deeper
knowledge of Ruby and C extension (this is my 3rd day with Ruby) can
elaborate. Thanks.
Here is the situation:
Somewhere in the Ferret C code, I am returning a "Token" to Ruby land.
The code looks like
static VALUE get_token (...)
{
...
RToken *token = ALLOC(RToken);
token->text = rb_str_new2("some text");
return Data_Wrap_Struct(..., &frt_token_mark, &frt_token_free, token);
}
frt_token_mark calls rb_gc_mark(token->text) and frt_token_free
just frees the token with free(token)
In Ruby, this code correlates to the following:
token = @input.next
Basically, @input is set to some object, calling the next method on it
triggers the get_token C call, which returns a token object.
In Ruby land, I then do something like w = token.text.scan('\w+')
When I run this code inside a while 1 loop (to isolate my problem), at
some point (roughly when my ruby process mem footprint goes to 256MB,
probably some GC threshold), Ruby dies with errors like
scan method called on terminated object
Or just core dumps. My guess was that token.text was garbage collected.
I don't know enough about Ruby C extension to know what happens with
Data_Wrap_Struct returned objects. Seems to me the assignment in Ruby
land, token =, should create a reference to it.
My "work-around"/"fix" is to create a Ruby instance variable in the
object referred to by @input, and store the token text in there, to
get an extra reference to it. So the C code looks like
RToken *token = ALLOC(RToken);
token->text = rb_str_new2(tk->text);
/* added code: prevent garbage collection */
rb_ivar_set(input, id_curtoken, token->text);
return Data_Wrap_Struct(cToken, &frt_token_mark, &frt_token_free, token);
So now I've created a "curtoken" in the input instance variable, and
saved a copy of the text there... I've taken care to remove/delete
this reference in the free callback of the class for @input.
With this code, it works in that I no longer get the terminated object
error.
The fix seems to make sense to me -- it keeps an extra ref in curtoken
to the token.text string so an instance of token.text won't be removed
until the next time @input.next is called (at which time a different
token.text replaces the old value in curtoken).
My question is: why did it not work before? Shouldn't
Data_Wrap_Struct return an object that, when assigned in Ruby land,
has a valid reference and is not removed by Ruby?
Thanks.
When the Ruby garbage collector is invoked, it has a mark phase and a sweep phase. The mark phase marks every object that is still reachable, by marking:
all objects referenced by a ruby stack frame (e.g. local variables)
all globally accessible objects (e.g. referred to by a constant or global variable) and their children/referents, and
all objects referred to by a reference on the stack, as well as those objects' children/referents.
as well as a number of other objects that are not important to this discussion. The sweep phase then destroys any objects that are not accessible (i.e. those that were not marked).
Data_Wrap_Struct returns a reference to an object. As long as that reference is available to ruby code (e.g. stored in a local variable) or is on the stack (referred to by a local C variable), the object should not be swept.
It looks, from what you've posted, like token->text is getting garbage collected. But why is it getting collected? It must not be getting marked. Is the Token object itself getting marked? If it is, then token->text should be getting marked. Try setting a breakpoint or printing a message in the token's mark function to see.
If the token is not getting marked, then the next step is to figure out why. If it is getting marked, then the next step is to figure out why the string returned by the text() method is getting swept (maybe it's not the same object that is getting marked).
Also, are you sure that it is the token's text member that is causing the exception? Looking at:
http://github.com/dbalmain/ferret/blob/master/ruby/ext/r_analysis.c
I see that the token and the token stream both have text() methods. The TokenStream struct doesn't hold a reference to its text object (it can't, as it's a C struct with no knowledge of ruby). Thus, the Ruby object wrapping the C struct needs to hold the reference (and this is being done with rb_ivar_set).
The RToken struct shouldn't need to do this, because it marks its text member in its mark function.
One more thing: you may be able to reproduce this bug by calling GC.start explicitly in your loop rather than having to allocate so many objects that the garbage collector kicks in. This won't fix the problem but might make diagnosis simpler.
Perhaps mark the variable as volatile:
http://www.justskins.com/forums/chasing-a-garbage-collection-bug-98766.html
Maybe your compiler is keeping its reference in a register instead of on the stack. I think README.EXT mentions a way to force an object to never be GC'ed, but the question still remains as to why it's being collected early.