Using LINQ with SharePoint and disposing of objects

How do I dispose of the reference to subWeb in the following query?
using (SPSite spSite = Utility.GetElevatedSite(_rootUrl))
{
    var subWebs = from SPWeb web in spSite.AllWebs
                  where web.ServerRelativeUrl.ToLower() == path
                  from SPWeb subWeb in web.Webs
                  select subWeb;
}
Do I even need to worry about disposing the subWeb if I have already wrapped the spSite in a using statement?
Edit:
Is it a good idea to call garbage collection in this scenario?

Unfortunately, you do.
The trouble starts from the SPSite.AllWebs property.
The SPWeb.Webs property isn't safe either.
Read this very thorough reference of the situations where you need to worry about disposing SharePoint objects.
(I suggest adding this to your SharePoint cheat-sheet).
As a result, I feel the current SharePoint object model cannot be safely used with LINQ syntax.
Your code would need a re-write with the various implied loops broken out so that you can explicitly dispose the objects involved.
Edit:
The SPDisposeCheck tool is a command-line console app that will scan your .NET assemblies and warn you of undisposed references based on the above best-practice guidelines. Check it out.
http://code.msdn.microsoft.com/SPDisposeCheck
http://blogs.msdn.com/sharepoint/archive/2008/11/12/announcing-spdisposecheck-tool-for-sharepoint-developers.aspx

The technical answer to your original question is a qualified "No": all SPWeb objects opened from an SPSite are automatically disposed when the SPSite is disposed. However, in practice it is a good idea to dispose an SPWeb as soon as you're done with it to reduce memory pressure, especially when working with code like this that opens several SPWeb objects.
Implementing this dispose-safe behavior for LINQ is actually quite simple in C#. You can find full details in this post, but the short version is that a C# iterator can handle disposal for you. Using my AsSafeEnumerable() extension method, your code is relatively safe written like this:
using (SPSite spSite = Utility.GetElevatedSite(_rootUrl))
{
    var sw = from SPWeb web in spSite.AllWebs.AsSafeEnumerable()
             where web.ServerRelativeUrl.ToLower() == path
             from SPWeb subWeb in web.Webs.AsSafeEnumerable()
             select subWeb;
    foreach (SPWeb aSubWeb in sw)
    {
        // Do something
    }
}
Now the result of your query, which I've assigned to sw, is a lazy iterator of type IEnumerable<SPWeb>. As you enumerate over that result, each SPWeb will be disposed when the enumerator moves to the next item. This means that it is not safe to use that SPWeb reference, or any SP* object created from it (SPList, etc), outside of your foreach loop. Also sw would not be safe to use outside of that using block because the iterator's SPWebCollections will be tied to the now-disposed SPSite.
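To make the iterator behaviour described above concrete, here is a minimal sketch of a disposing iterator. It is not the AsSafeEnumerable() implementation from the linked post: the generic signature is an assumption, and FakeWeb is a hypothetical stand-in for SPWeb so the example runs without SharePoint.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for SPWeb so the sketch runs without SharePoint.
public sealed class FakeWeb : IDisposable
{
    public string Url { get; private set; }
    public bool IsDisposed { get; private set; }
    public FakeWeb(string url) { Url = url; }
    public void Dispose() { IsDisposed = true; }
}

public static class SafeEnumerableSketch
{
    // Yields each item, then disposes it when the consumer asks for the next
    // one or stops enumerating early (the finally block runs in both cases).
    public static IEnumerable<T> AsSafeEnumerable<T>(this IEnumerable source)
        where T : IDisposable
    {
        foreach (T item in source)
        {
            try { yield return item; }
            finally { item.Dispose(); }
        }
    }
}

public static class SafeEnumerableDemo
{
    public static void Main()
    {
        var webs = new ArrayList { new FakeWeb("/a"), new FakeWeb("/b") };
        var seen = new List<FakeWeb>();
        foreach (FakeWeb web in webs.AsSafeEnumerable<FakeWeb>())
        {
            // Still usable here: disposal happens only when the loop moves on.
            Console.WriteLine("{0} disposed inside loop: {1}", web.Url, web.IsDisposed);
            seen.Add(web);
        }
        Console.WriteLine("All disposed after loop: {0}", seen.All(w => w.IsDisposed));
    }
}
```

Because disposal is tied to enumeration, any operator that buffers the sequence (ToList(), OrderBy(), and so on) would hand back items that are already disposed.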
That said, code like this that enumerates over all webs (twice!) is extremely expensive. There is almost certainly a more efficient way that this could be implemented, if only by using spSite.AllWebs[path] instead of your from/where.
Regarding garbage collection, these objects require disposal because they allocate unmanaged memory that the GC doesn't even know about, so calling the garbage collector is no substitute for disposing them.
Finally, a word of caution about your 'GetElevatedSite' utility. If you're using RunWithElevatedPrivileges in your helper method to get your elevated SPSite, there are a number of issues you could run into by returning your SPSite out of that elevated context. If possible, I would suggest using SPSite impersonation instead - my preferred method is described here.

Related

C++/CLI Resource Management Confusion

I am extremely confused about resource management in C++/CLI. I thought I had a handle (no pun intended) on it, but I stumbled across the auto_gcroot<T> class while looking through header files, which led to a Google search, then the better part of a day reading documentation, and now confusion. So I figured I'd turn to the community.
My questions concern the difference between auto_handle/stack semantics, and auto_gcroot/gcroot.
auto_handle: My understanding is that this will clean up a managed object created in a managed function. My confusion is that isn't the garbage collector supposed to do that for us? Wasn't that the whole point of managed code? To be more specific:
//Everything that follows is managed code
void WillThisLeak(void)
{
String ^str = gcnew String(L"example");
//Did I just leak memory? Or will GC clean this up? what if an exception is thrown?
}
void NotGoingToLeak(void)
{
String ^str = gcnew String(L"example");
delete str;
//Guaranteed not to leak, but is this necessary?
}
void AlsoNotGoingToLeak(void)
{
//auto_handle<T> wraps a T^, so the template argument has no hat
auto_handle<String> str(gcnew String(L"example"));
//Also Guaranteed not to leak, but is this necessary?
}
void DidntEvenKnowICouldDoThisUntilToday(void)
{
String str();
//Also Guaranteed not to leak, but is this necessary?
}
Now this would make sense to me if it was a replacement for the C# using keyword, and it was only recommended for use with resource-intensive types like Bitmap, but this isn't mentioned anywhere in the docs, so I'm afraid I've been leaking memory this whole time.
auto_gcroot
Can I pass it as an argument to a native function? What will happen on copy?
void function(void)
{
auto_gcroot<Bitmap ^> bmp = //load bitmap from somewhere
manipulateBmp(bmp);
pictureBox.Image = bmp; //Is my Bitmap now disposed of by auto_gcroot?
}
#pragma unmanaged
void manipulateBmp(auto_gcroot<Bitmap ^> bmp)
{
//Do stuff to bmp
//destructor for bmp is now called right? does this call dispose?
}
Would this have worked if I'd used a gcroot instead?
Furthermore, what is the advantage to having auto_handle and auto_gcroot? It seems like they do similar things.
I must be misunderstanding something for this to make so little sense, so a good explanation would be great. Also any guidance regarding the proper use of these types, places where I can go to learn this stuff, and any more good practices/places I can find them would be greatly appreciated.
thanks a lot,
Max
Remember that calling delete on a managed object is akin to calling Dispose in C#. So you are right that auto_handle lets you do what you would do with the using statement in C#: it ensures that delete gets called at the end of the scope. So, no, you're not leaking managed memory if you don't use auto_handle (the garbage collector takes care of that); you are just failing to call Dispose. There is no need to use auto_handle if the types you're dealing with do not implement IDisposable.
gcroot is used when you want to hold on to a managed type inside a native class. You can't just declare a managed type directly in a native type using the hat ^ symbol; you must use a gcroot. This is a "garbage collected root": while the gcroot (a native object) lives, the garbage collector cannot collect the object it refers to. When the gcroot is destroyed, it lets go of the reference, and the garbage collector is free to collect the object (assuming it has no other references). You can declare a free-standing gcroot in a method as you've done above, but there is no need to: just use the hat ^ syntax whenever you can.
So when would you use auto_gcroot? It would be used when you need to hold on to a managed type inside a native class AND that managed type happens to implement IDisposable. On destruction of the auto_gcroot, it will do 2 things: call delete on the managed type (think of this as a Dispose call; no memory is freed) and free the reference (so the type can be garbage collected).
Hope it helps!
Some references:
http://msdn.microsoft.com/en-us/library/aa730837(v=vs.80).aspx
http://msdn.microsoft.com/en-us/library/481fa11f(v=vs.80).aspx
http://www.codeproject.com/Articles/14520/C-CLI-Library-classes-for-interop-scenarios

What instantiate-able types implementing IQueryable<T> are available in .Net 4.0?

Within the context of C# on .Net 4.0, are there any built-in objects that implement IQueryable<T>?
IQueryable objects are produced by Queryable Providers (ex. LINQ to SQL, LINQ to Entities/Entity Framework, etc). Virtually nothing you can instantiate with new in the basic .NET Framework implements IQueryable.
IQueryable is an interface designed to be used to create Queryable providers, which allow the LINQ library to be leveraged against an external data store by building a parse-able expression tree. By nature, Queryables require a context - information regarding what exactly you're querying. Using new to create any IQueryable type, regardless of whether it's possible, doesn't get you very far.
That being said, any IEnumerable can be converted into an IQueryable by using the AsQueryable() extension method. Behind the scenes this produces a construct that looks similar to, but works very differently from, LINQ methods applied to a plain IEnumerable object. This is probably the most plentiful source of queryables you have access to without setting up an actual IQueryable provider. The conversion is very useful for unit-testing LINQ-based algorithms, as you don't need the actual data store, just a list of in-memory data that can imitate it.
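As a rough illustration of the unit-testing use mentioned above, the sketch below runs a method written against IQueryable<int> on an in-memory list. The EvenSquares method and the sample data are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AsQueryableSketch
{
    // Written against IQueryable<int>, as a query over a real provider would be.
    public static IQueryable<int> EvenSquares(IQueryable<int> source)
    {
        return source.Where(n => n % 2 == 0).Select(n => n * n);
    }

    public static void Main()
    {
        // In-memory data standing in for the real data store.
        var data = new List<int> { 1, 2, 3, 4 };
        IQueryable<int> query = EvenSquares(data.AsQueryable());

        // The query is held as an expression tree until it is enumerated.
        Console.WriteLine(query.Expression);
        Console.WriteLine(string.Join(", ", query)); // prints "4, 16"
    }
}
```

One caveat with this approach: the in-memory provider executes ordinary .NET code, so a query that passes such a test can still fail against a real provider that cannot translate one of its expressions.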
Well, your question is kinda weird... but I believe that if you look at an interface in Reflector, it will give you a list of implementers in the loaded assemblies.
As a disclaimer I have not used Reflector since it went pay-for-play so I might be wrong.
EntityCollection does, as does EnumerableQuery.
Not that I think either of these is going to get you anywhere. To help, we need to know what you are really trying to solve. If you are writing a LINQ provider, you should read this: http://msdn.microsoft.com/en-us/library/bb546158.aspx.
They recommend writing your own implementation.
If you are looking for a way to instantiate an empty list of IQueryable, then you can use this:
IQueryable<MyEntity> query = Enumerable.Empty<MyEntity>().AsQueryable();

Disposing a HtmlControl

On the advice of Code Analysis in VS to call Dispose on an object (which I wasn't doing previously), I ended up with a method containing this:
using (var favicon = new HtmlLink
{
Href = "~/templates/default/images/cc_favicon.ico"
})
{
favicon.Attributes.Add("rel", "shortcut icon");
Header.Controls.Add(favicon);
}
This confused me slightly: is it such a good idea to dispose this object after adding it to the Controls collection?
How does this still work? Is it because the Controls.Add method disposes the object after use as opposed to holding on to it?
I would say that this code shouldn't work but if you say it's working then the only things I can think of are:
Header.Controls.Add adds a copy of the object, so there is no problem disposing the original.
The Dispose method does not clean anything that is used later.
Hope this helps.
If a method on favicon is called that uses any of its unmanaged resources, it will throw an exception.
From msdn:
You can instantiate the resource object and then pass the variable to
the using statement, but this is not a best practice. In this case,
the object remains in scope after control leaves the using block even
though it will probably no longer have access to its unmanaged
resources. In other words, it will no longer be fully initialized. If
you try to use the object outside the using block, you risk causing an
exception to be thrown. For this reason, it is generally better to
instantiate the object in the using statement and limit its scope to
the using block.
using statement msdn
I assume that your code analysis gave you CA2000: Dispose objects before losing scope before you changed the code. The problem is that you shouldn't dispose your object, because you want to use it even after returning from the method (it has been added to a collection).
You can either suppress the message using the SuppressMessage attribute or you can rewrite your code to be really paranoid:
var favicon = new HtmlLink { Href = "~/templates/default/images/cc_favicon.ico" };
try
{
    favicon.Attributes.Add("rel", "shortcut icon");
}
catch
{
    favicon.Dispose();
    throw;
}
Header.Controls.Add(favicon);
The normal flow of this code adds favicon to the collection that is then responsible for disposing it. However, the abnormal flow where favicon.Attributes.Add throws an exception will dispose favicon before propagating the exception.
In most cases, because the garbage collector will do its job eventually, you don't need the paranoid version of the code.

Do you ToList()?

Do you have a default type that you prefer to use in your dealings with the results of LINQ queries?
By default LINQ will return an IEnumerable<> or maybe an IOrderedEnumerable<>. We have found that a List<> is generally more useful to us, so have adopted a habit of ToList()ing our queries most of the time, and certainly using List<> in our function arguments and return values.
The only exception to this has been in LINQ to SQL where calling .ToList() would enumerate the IEnumerable prematurely.
We are also using WCF extensively, the default collection type of which is System.Array. We always change this to System.Collections.Generic.List in the Service Reference Settings dialog in VS2008 for consistency with the rest of our codebase.
What do you do?
ToList always evaluates the sequence immediately - not just in LINQ to SQL. If you want that, that's fine - but it's not always appropriate.
Personally I would try to avoid declaring that you return List<T> directly - usually IList<T> is more appropriate, and allows you to change to a different implementation later on. Of course, there are some operations which are only specified on List<T> itself... this sort of decision is always tricky.
EDIT: (I would have put this in a comment, but it would be too bulky.) Deferred execution allows you to deal with data sources which are too big to fit in memory. For instance, if you're processing log files - transforming them from one format to another, uploading them into a database, working out some stats, or something like that - you may very well be able to handle arbitrary amounts of data by streaming it, but you really don't want to suck everything into memory. This may not be a concern for your particular application, but it's something to bear in mind.
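A small sketch of the difference between a deferred query and a ToList() snapshot may help here. The list contents are arbitrary; the point is only when the Where clause actually runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3 };

        // Deferred: nothing is evaluated until the sequence is enumerated.
        IEnumerable<int> lazy = numbers.Where(n => n > 1);

        // Immediate: ToList() enumerates the source right now and copies the results.
        List<int> snapshot = numbers.Where(n => n > 1).ToList();

        numbers.Add(4);

        Console.WriteLine(lazy.Count());   // 3: the query runs now and sees the new item
        Console.WriteLine(snapshot.Count); // 2: the snapshot was taken before the Add
    }
}
```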
We have the same scenario - WCF communications to a server, the server uses LINQtoSQL.
We use .ToArray() when requesting objects from the server, because it's "illegal" for the client to change the list. (Meaning, there is no purpose to support ".Add", ".Remove", etc).
While still on the server, however, I would recommend that you leave it as its default (which is not IEnumerable, but rather IQueryable). This way, if you want to filter even more based on some criteria, the filtering is STILL on the SQL side until evaluated.
This is a very important point as it means incredible performance gains or losses depending on what you do.
EXAMPLE:
// This is just an example... imagine this is on the server only. It's the
// basic method that gets the list of clients.
private IEnumerable<Client> GetClients()
{
    var result = MyDataContext.Clients;
    return result.AsEnumerable();
}
// This method here is actually called by the user...
public Client[] GetClientsForLoggedInUser()
{
    var clients = GetClients().Where(client => client.Owner == currentUser);
    return clients.ToArray();
}
Do you see what's happening there? The "GetClients" method is going to force a download of ALL 'clients' from the database... THEN the Where clause will happen in the GetClientsForLoggedInUser method to filter it down.
Now, notice the slight change:
private IQueryable<Client> GetClients()
{
    var result = MyDataContext.Clients;
    return result.AsQueryable();
}
Now, the actual evaluation won't happen until ".ToArray" is called... and SQL will do the filtering. MUCH better!
In the Linq-to-Objects case, returning List<T> from a function isn't as nice as returning IList<T>, as THE VENERABLE SKEET points out. But often you can still do better than that. If the thing you are returning ought to be immutable, IList is a bad choice because it invites the caller to add or remove things.
For example, sometimes you have a method or property that returns the result of a Linq query or uses yield return to lazily generate a list, and then you realise that it would be better to do that the first time you're called, cache the result in a List<T> and return the cached version thereafter. That's when returning IList may be a bad idea, because the caller may modify the list for their own purposes, which will then corrupt your cache, making their changes visible to all other callers.
Better to return IEnumerable<T>, so all they have is forward iteration. And if the caller wants rapid random access, i.e. they wish they could use [] to access by index, they can use ElementAt, which Linq defines so that it quietly sniffs for IList and uses that if available, and if not it does the dumb linear lookup.
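The caching scenario described above might look roughly like this. PrimeSource, its Values property and the ComputeCount counter are invented for the sketch, and it is not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class PrimeSource
{
    private List<int> _cache;

    // Counts how often the underlying sequence was actually computed.
    public int ComputeCount { get; private set; }

    // Exposed as IEnumerable<int>: callers get iteration but no Add/Remove.
    public IEnumerable<int> Values
    {
        get
        {
            if (_cache == null)
            {
                ComputeCount++;
                _cache = Compute().ToList(); // evaluate once, reuse afterwards
            }
            return _cache;
        }
    }

    private static IEnumerable<int> Compute()
    {
        yield return 2;
        yield return 3;
        yield return 5;
    }
}

public static class PrimeSourceDemo
{
    public static void Main()
    {
        var source = new PrimeSource();
        // ElementAt detects that the sequence is really an IList<int> and indexes into it.
        Console.WriteLine(source.Values.ElementAt(1)); // 3
        Console.WriteLine(source.Values.Count());      // 3
        Console.WriteLine(source.ComputeCount);        // 1
    }
}
```

A determined caller can still cast the result back to List<int>; if that matters, returning _cache.AsReadOnly() instead closes that gap.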
One thing I've used ToList for is when I've got a complex system of Linq expressions mixed with custom operators that use yield return to filter or transform lists. Stepping through in the debugger can get mighty confusing as it jumps around doing lazy evaluation, so I sometimes temporarily add a ToList() to a few places so that I can more easily follow the execution path. (Although if the things you are executing have side-effects, this can change the meaning of the program.)
It depends if you need to modify the collection. I like to use an Array when I know that no one is going to add/delete items. I use a list when I need to sort/add/delete items. But, usually I just leave it as IEnumerable as long as I can.
If you don't need the added features of List<>, why not just stick with IQueryable<> ?!?!?! Lowest common denominator is the best solution (especially when you see Timothy's answer).

Is it a good idea to cache DataContractSerializer instances?

I'm writing a Windows service application that needs to serialize and deserialize XML documents repeatedly during its execution. As I need to serialize and deserialize generic types that are not known at compile time (I don't know a priori how many types I need to serialize/deserialize), I'd like to know if it is a good idea to keep a cache of the DataContractSerializer objects I instantiate for serializing and deserializing the objects.
I'm asking this question because I know it is a good idea to cache the XmlSerializer class instances since they create a dynamic assembly in memory under the hood and the assemblies dynamically created in memory are not garbage collected.
I read that the DataContractSerializer relies on lightweight code generation, but I'm not familiar with the details of it. That is why I'm asking this question: if I instantiate DataContractSerializer instances as needed, would that lead to a memory leak as it would with XmlSerializer?
I have chosen to use the DataContractSerializer instead of the XmlSerializer so that I can serialize internal properties.
...it is a good idea to cache the XmlSerializer class instances since they create a dynamic assembly in memory under the hood...
With XmlSerializer, it actually depends on whether you use the simple constructor (new XmlSerializer(typeToHandle)), or the more complex constructors that allow you to specify all the attributes etc at runtime. If you only use the simple constructor it re-uses the background assembly, so there is no repeat penalty.
I would expect (but haven't tested) DataContractSerializer to work similarly; but there is certainly no harm in simply caching it, perhaps in a static readonly field.
Note that DataContractSerializer restricts the xml layout you have available to you... as long as you're OK with that ;-p
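Since the types are only known at runtime, one way to do the caching suggested above is a static dictionary keyed by type. This is a sketch only: SerializerCache, its For method and the Person contract are hypothetical names, and it assumes the plain DataContractSerializer(Type) constructor is sufficient for your types.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Runtime.Serialization;

public static class SerializerCache
{
    // One serializer per type, created on first use and reused afterwards.
    private static readonly ConcurrentDictionary<Type, DataContractSerializer> Cache =
        new ConcurrentDictionary<Type, DataContractSerializer>();

    public static DataContractSerializer For(Type type)
    {
        return Cache.GetOrAdd(type, t => new DataContractSerializer(t));
    }
}

// Hypothetical contract type used only to exercise the cache.
[DataContract]
public class Person
{
    [DataMember]
    public string Name { get; set; }
}

public static class SerializerCacheDemo
{
    public static void Main()
    {
        var original = new Person { Name = "Ada" };
        using (var stream = new MemoryStream())
        {
            SerializerCache.For(typeof(Person)).WriteObject(stream, original);
            stream.Position = 0;
            var copy = (Person)SerializerCache.For(typeof(Person)).ReadObject(stream);
            Console.WriteLine(copy.Name);
        }
    }
}
```

Under contention GetOrAdd may invoke the factory more than once, but only one instance ends up stored and returned, so callers always share the same serializer for a given type.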

Resources