Sorting an observable collection with linq - linq

I have an observable collection and I sort it using linq. Everything is great, but the problem I have is how do I sort the actual observable collection? Instead I just end up with some IEnumerable thing and I end up clearing the collection and adding the stuff back in. This can't be good for performance. Does anyone know of a better way to do this?

If you are using Silverlight 3.0, then CollectionViewSource is the cleanest way. See the example below (it can also be done in XAML):
ObservableCollection<DateTime> ecAll = new ObservableCollection<DateTime>();
CollectionViewSource sortedcvs = new CollectionViewSource();
sortedcvs.SortDescriptions.Add(new System.ComponentModel.SortDescription("Date",
System.ComponentModel.ListSortDirection.Ascending));
sortedcvs.Source = ecAll;
ListBoxContainer.DataContext = sortedcvs;
And in the corresponding XAML, set
ItemsSource="{Binding}"
for the ListBox or any ItemsControl derived control

Since the collection doesn't provide any Sort mechanism, this is probably the most practical option. You could implement a sort manually using Move etc., but it will probably be slower than doing it this way:
var arr = list.OrderBy(x => x.SomeProp).ToArray();
list.Clear();
foreach (var item in arr) {
list.Add(item);
}
Additionally, you might consider unbinding any UI elements while sorting (via either approach) so that you only pay to re-bind once.
Interestingly, if this was BindingList<T>, you could use RaiseListChangedEvents to minimise the number of notifications:
var arr = list.OrderBy(x => x).ToArray();
bool oldRaise = list.RaiseListChangedEvents;
list.RaiseListChangedEvents = false;
try {
list.Clear();
foreach (var item in arr) {
list.Add(item);
}
} finally {
list.RaiseListChangedEvents = oldRaise;
if (oldRaise) list.ResetBindings();
}
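The manual, Move-based sort mentioned in this answer could look roughly like the following. This is only a sketch: SortInPlace is a hypothetical helper name, and it assumes the full .NET ObservableCollection<T>, which exposes Move (the Silverlight version may not). Each Move raises its own change notification, so this is not necessarily faster than clearing and re-adding.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

static class ObservableCollectionSortSketch
{
    // Hypothetical helper (not a framework method): reorders the collection
    // in place with Move, so items are never removed and re-added.
    public static void SortInPlace<T, TKey>(
        this ObservableCollection<T> list, Func<T, TKey> keySelector)
    {
        // Snapshot the desired order first; OrderBy is a stable sort.
        List<T> sorted = list.OrderBy(keySelector).ToList();
        EqualityComparer<T> comparer = EqualityComparer<T>.Default;

        for (int target = 0; target < sorted.Count; target++)
        {
            // Search from 'target' onwards so that equal items that were
            // already placed are not picked up again.
            int current = target;
            while (!comparer.Equals(list[current], sorted[target]))
                current++;

            if (current != target)
                list.Move(current, target);
        }
    }
}
```

Usage would be something like list.SortInPlace(x => x.SomeProp); items that are already in position do not raise any notification.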

Note that in Linq, you are given an IEnumerable from your query, and that query has not executed yet. Therefore, the following code only runs the query once, to add it to an ObservableCollection:
var query = from x in Data
where x.Tag == "Something"
select x;
foreach(var item in query)
MyObservableCollection.Add(item);
Take a look at the "OrderBy" extension on IEnumerable:
foreach(var item in query.OrderBy(x => x.Name))
MyObservableCollection.Add(item);
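To make the deferred-execution point concrete, here is a small sketch that counts how often the filter actually runs. The class name, method name and counter are made up for illustration; only Where, OrderBy and ToList are real LINQ calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredExecutionSketch
{
    // Counts how many times the Where predicate has been evaluated.
    public static int Evaluations;

    // Building the query does not run it; nothing is filtered or sorted here.
    public static IEnumerable<int> BuildQuery(IEnumerable<int> data)
    {
        return data.Where(x => { Evaluations++; return x % 2 == 0; })
                   .OrderBy(x => x);
    }
}
```

After calling BuildQuery, Evaluations is still 0. It only increases once the result is enumerated, for example by foreach or ToList(), and enumerating the same query a second time runs the filter again.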

ObservableCollections aren't designed to be sortable. List is sortable, and that's the underlying mechanism used by the answer referencing List.Sort(), but ObservableCollection isn't derived from List, so you're out of luck there. IMO, the "right" solution is not to try to sort the ObservableCollection, but to implement ICollectionView and bind an instance of that to your control. That interface adds methods for sorting and has the additional benefit that it's recognized by Silverlight controls (well, the ones that support it anyway, such as DataGrid), so your sorting could be used directly from the UI layer. This question might be helpful:
Silverlight and icollectionview

I followed the link mentioned in this post: http://mokosh.co.uk/post/2009/08/04/how-to-sort-observablecollection/comment-page-1/#comment-75
but I am having issues getting it to work in Silverlight.
I created a property: public SortableObservableCollection Terms
When I call Terms.Sort(new TermComparer()), the records are still displayed unsorted in the UI.
Could someone suggest what could be going wrong? Thanks.

I found this on CodePlex:
Sorted Collections
Haven't used it yet though.
Rick

Related

Using LINQ to change values in collection

I think I'm not understanding LINQ well. I want to do:
foreach (MyObject myObject in myObjectCollection)
{
myObject.MyProperty = newValue;
}
Just change all values for a property in all elements of my collection.
Wouldn't the LINQ way be this?
myObjectCollection.Select(myObject => myObject.MyProperty = newValue)
It doesn't work. The property value is not changed. Why?
Edit:
Sorry, guys. Certainly, foreach is the right way. But in my case I must repeat the foreach over many collections, and I didn't want to repeat the loop. So, finally, I have found an 'intermediate' solution, the ForEach method, which is similar to Select:
myObjectCollection.ForEach(myObject => myObject.MyProperty = newValue);
Anyway, maybe it's not as clear as the simpler:
foreach (MyObject myObject in myObjectCollection) myObject.MyProperty = newValue;
First off, this is not a good idea. See below for arguments against it.
It doesn't work. The property value is not changed. Why?
It doesn't work because Select() doesn't actually iterate through the collection until you enumerate its results, and it requires an expression that evaluates to a value.
If you make your expression return a value, and add something that will fully evaluate the query, such as ToList(), to the end, then it will "work", i.e.:
myObjectCollection.Select(myObject => { myObject.MyProperty = newValue; return myObject;}).ToList();
That being said, ToList() has some disadvantages - mainly, it's doing a lot of extra work (to create a List<T>) that's not needed, which adds a large cost. Avoiding it would require enumerating the collection:
foreach(var obj in myObjectCollection.Select(myObject => { myObject.MyProperty = newValue; return myObject; }))
{ }
Again, I wouldn't recommend this. At this point, the Select option is far uglier, more typing, etc. It's also a violation of the expectations involved with LINQ - LINQ is about querying, which suggests there shouldn't be side effects when using LINQ, and the entire purpose here is to create side effects.
But then, at this point, you're better off (with less typing) doing it the "clear" way:
foreach (var obj in myObjectCollection)
{
obj.MyProperty = newValue;
}
This is shorter, very clear in its intent, and very clean.
Note that you can add a ForEach<T> extension method which performs an action on each object, but I would still recommend avoiding that. Eric Lippert wrote a great blog post about the subject that is worth a read: "foreach" vs "ForEach".
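For reference, a ForEach<T> extension method of the kind mentioned here might be sketched as follows. The class name is arbitrary and this is not part of LINQ itself; it simply wraps the plain foreach loop, which is why the loop remains the clearer choice.

```csharp
using System;
using System.Collections.Generic;

static class EnumerableForEachSketch
{
    // Hypothetical extension: runs 'action' once per element, eagerly.
    // Unlike Select, nothing is deferred and nothing is returned.
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (action == null) throw new ArgumentNullException(nameof(action));

        foreach (T item in source)
            action(item);
    }
}
```

With this in scope, myObjectCollection.ForEach(o => o.MyProperty = newValue); works on any IEnumerable<T>. On a List<T>, the built-in instance method List<T>.ForEach is chosen instead, because instance methods take precedence over extension methods.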
As sircodesalot mentioned, you can't use linq to do something like that. Remember that linq is a querying language, which means all you can do is query it. Doing changes must be done in other logic.
What you could do, if you don't want to do it the first way and your collection is already a List<T> (not just an IEnumerable), is use the ForEach method to do what you're asking. Note that ForEach is an instance method of List<T>, not a LINQ extension method.
One other point I should mention is that the Select method does a projection of some specific information to return to an IEnumerable. So, if you wanted to grab only a specific property from a collection you would use that. That's all it does.

Is there a better way to formulate this entity framework 4 update?

I have the following code:
List<Item> csItems = (from i in context.Items
where i.CSItem == true
select i).ToList<Item>();
csItems.ForEach(i => i.Active = false);
context.SaveChanges();
Seems like this is horribly inefficient as I have to read the entire table in first, then update the tables. Is there a better way to do this so that I am only doing an update without reading everything?
Linq2EF doesn't have a built-in bulk update, but you can follow Alex James's articles to implement it yourself. Another simple way is to use stored procedures. If you do want to implement it yourself, read the whole four-article series; the articles are easy to read.
There is no need to build the List<T> of the items. I, personally, would just enumerate the results and set the values:
IQueryable<Item> csItems = (from i in context.Items
where i.CSItem == true
select i);
foreach(var item in csItems)
item.Active = false;
context.SaveChanges();
Note that there is no bulk-update functionality built-in with Entity Framework, so doing something more efficient would require some other technique.

Optimizing away OrderBy() when using Any()

So I have a fairly standard LINQ-to-Object setup.
var query = expensiveSrc.Where(x=> x.HasFoo)
.OrderBy(y => y.Bar.Count())
.Select(z => z.FrobberName);
// ...
if (!condition && !query.Any())
return; // seems to enumerate and sort entire enumerable
// ...
foreach (var item in query)
// ...
This enumerates everything twice. Which is bad.
var queryFiltered = expensiveSrc.Where(x=> x.HasFoo);
var query = queryFiltered.OrderBy(y => y.Bar.Count())
.Select(z => z.FrobberName);
if (!condition && !queryFiltered.Any())
return;
// ...
foreach (var item in query)
// ...
Works, but is there a better way?
Would there be any non-insane way to "enlighten" Any() to bypass the non-required operations? I think I remember this sort of optimisation going into EduLinq.
Why not just get rid of the redundant:
if (!query.Any())
return;
It really doesn't seem to be serving any purpose - even without it, the body of the foreach won't execute if the query yields no results. So with the Any() check in, you save nothing in the fast path, and enumerate twice in the slow path.
On the other hand, if you must know if there were any results found after the end of the loop, you might as well just use a flag:
bool itemFound = false;
foreach (var item in query)
{
itemFound = true;
... // Rest of the loop body goes here.
}
if(itemFound)
{
// ...
}
Or you could use the enumerator directly if you're really concerned about the redundant flag-setting in the loop body:
using(var erator = query.GetEnumerator())
{
bool itemFound = erator.MoveNext();
if(itemFound)
{
do
{
// Do something with erator.Current;
} while(erator.MoveNext());
}
// Do something with itemFound
}
There is not much information that can be extracted from an enumerable, so maybe it's better to turn the query into an IQueryable? This Any extension method walks down the query's expression tree, skipping all irrelevant operations, then turns the important branch into a delegate that can be called to obtain an optimized IQueryable. The standard Any method is then applied to it explicitly to avoid recursion. I'm not sure about corner cases, and maybe it makes sense to cache compiled queries, but with simple queries like yours it seems to work.
static class QueryableHelper {
public static bool Any<T>(this IQueryable<T> source) {
var e = source.Expression;
while (e is MethodCallExpression) {
var mce = e as MethodCallExpression;
switch (mce.Method.Name) {
case "Select":
case "OrderBy":
case "ThenBy": break;
default: goto dun;
}
e = mce.Arguments.First();
}
dun:
var d = Expression.Lambda<Func<IQueryable<T>>>(e).Compile();
return Queryable.Any(d());
}
}
Queries themselves must be modified like this:
var query = expensiveSrc.AsQueryable()
.Where(x=> x.HasFoo)
.OrderBy(y => y.Bar.Count())
.Select(z => z.FrobberName);
Would there be any non-insane way to "enlighten" Any() to bypass the non-required operations? I think I remember this sort of optimisation going into EduLinq.
Well I'm not going to ignore any question which mentions Edulinq :)
In this case, Edulinq might well be faster than LINQ to Objects, as its OrderBy implementation is as lazy as it can be - it only sorts as much as it needs to in order to retrieve the elements it returns.
However, fundamentally it still has to read the whole sequence in before it returns anything. After all, the last element in the sequence could be the first one which has to be returned.
If you're in control of the whole stack, you could make Any() detect that it's being called on your "known" IOrderedEnumerable implementation, and go straight to the original source. Note that this does create a change in the observed behaviour though - if iterating over the whole sequence throws an exception (or has any other side effect) then that side-effect would be lost by the optimization. You could argue that's okay, of course - what counts as "valid" optimization in LINQ is a decidedly tricky area.
One other possibility which is pretty horrible but which would solve this particular problem would be to make the iterator returned from the IOrderedEnumerable just take the first value of MoveNext() from the source. That's enough for the normal implementation of Any, and at that point we don't need to know what the first element is. We could defer the actual sorting until the first time the Current property is used.
That's a pretty special-case optimization though - and one which I'd be wary to implement. I think Ani's approach is the better one - just use the fact that iterating over query using foreach will never go into the body of the loop if the query results are empty.
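The "go straight to the original source" idea could be sketched with a custom wrapper that remembers its unsorted source. This is a hypothetical illustration only: SortedView and HasAny are invented names, and this is not how LINQ to Objects or Edulinq is implemented.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper: keeps the unsorted source so that emptiness can be
// checked without sorting anything.
sealed class SortedView<T, TKey> : IEnumerable<T>
{
    private readonly IEnumerable<T> _source;
    private readonly Func<T, TKey> _key;

    public SortedView(IEnumerable<T> source, Func<T, TKey> key)
    {
        _source = source;
        _key = key;
    }

    // Ordering never adds or removes elements, so asking the source is enough.
    // Side effects of a full enumeration are skipped, as noted in the answer.
    public bool HasAny()
    {
        return _source.Any();
    }

    // The sort only happens when the view itself is enumerated.
    public IEnumerator<T> GetEnumerator()
    {
        return _source.OrderBy(_key).GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

Calling the standard Any() extension on such a view would still go through GetEnumerator and therefore sort; only the dedicated HasAny() method takes the shortcut.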
Edit (revised): This answer addresses the issue of the query executing twice, which I believe is the key issue. See below for why:
Making Any() smarter is something that only the Linq implementers can do, IMO... Or it would be some dirty adventure using reflection.
Using a class as shown below, you can cache the output of the original enumerable, and let it be enumerated twice:
public class CachedEnumerable<T>
{
public CachedEnumerable(IEnumerable<T> enumerable)
{
_source = enumerable.GetEnumerator();
}
public IEnumerable<T> Enumerate()
{
int itemIndex = 0;
while (true)
{
if (itemIndex < _cache.Count)
{
yield return _cache[itemIndex];
itemIndex++;
continue;
}
if (!_source.MoveNext())
yield break;
var current = _source.Current;
_cache.Add(current);
yield return current;
itemIndex++;
}
}
private List<T> _cache = new List<T>();
private IEnumerator<T> _source;
}
This way you keep the lazy aspect of LINQ and keep the code readable and generic. It will be slower than using IEnumerator<> directly. There are lots of opportunities to extend and optimize this class, such as a policy for discarding old items, getting rid of the coroutine, etc. But that is beyond the scope of this question, I think.
Oh, and the class is not thread safe as it is now. This wasn't asked, but I can imagine people trying. I think thread safety could easily be added if the source enumerable has no thread affinity.
Why would this be optimal?
Let's consider two possibilities: the enumeration either contains elements or it does not.
If it contains elements, this approach is optimal, as the query is only run once.
If it contains no elements, you would be tempted to eliminate the OrderBy and Select parts of your queries, as they add no value. But if there are zero items after the Where() clause, there are zero items to sort, which will cost zero time (well, almost). The same goes for the Select() clause.
What if this is not fast enough yet? In that case my strategy would be to bypass LINQ. Now, I really love LINQ, but its elegance comes at a price. So for every 100 uses of LINQ, there will typically be one or two computations that must execute really fast, which I write with good old for loops and lists. Part of mastering a technology is recognizing where it is not appropriate. LINQ is no exception to that rule.
Try this:
var items = expensiveSrc.Where(x=> x.HasFoo)
.OrderBy(y => y.Bar.Count())
.Select(z => z.FrobberName).ToList();
// ...
if (!condition && items.Count == 0)
return; // Just check the count
// ...
foreach (var item in items)
// ...
The query is executed just once.
but I've lost the streaming/lazy loading that's half the point of linq
With lazy loading (deferred execution), two LINQ queries with disparate results cannot be optimized (reduced) to a single query execution.
Why are you not using .ToArray()?
var query = expensiveSrc.Where(x=> x.HasFoo)
.OrderBy(y => y.Bar.Count())
.Select(z => z.FrobberName).ToArray();
If there are no elements, sorting and selecting should not add much overhead. If you are sorting, you need a cache to store the data anyway, so the overhead that .ToArray adds should not be large.
If you decompile the OrderedEnumerable class, you find that an int[] array containing the references is formed, so by using .ToArray (or .ToList) you just create a new reference array.
BUT
If expensiveSrc comes from a database, other strategies could be better. If the ordering can be done in the database, this would give you quite a lot of overhead, because the data is stored twice.

How can I use set operations to delete objects in an entitycollection that match a collection of view models?

Here is a very basic example of what I want to do. The code I have come up with seems quite verbose... ie looping through the collection, etc.
I am using a Telerik MVC grid that posts back a collection of deleted, inserted and updated ViewModels. The view models are similar but not exactly the same as the entity.
For example... I have:
Order.Lines. Lines is an entity collection (navigation property) containing OrderDetail records. In the update action of my controller using the I have a List named DeletedLines pulled from the POST data. I have also queried the database and have the Order entity, including the Lines collection.
Now I basically want to tell it to delete all the OrderDetails in the Lines EntityCollection.
The way I have done it is something like:
foreach (var line in DeletedLines) {
db.DeleteObject(Order.Lines.Where(l => l.Key == line.Key).SingleOrDefault());
}
I was hoping there was a way that I could use .Intersect() to get a collection of entities to delete and pass that to DeleteObject. However, DeleteObject seems to accept only a single entity rather than a collection.
Perhaps the above is good enough.. but it seemed like there should be an easier method.
Thanks,
BOb
Are the items in DeletedLines attached to the context? If so, what about this?
foreach (var line in DeletedLines) db.DeleteObject(line);
Response to comment #1
Ok, I see now. You can make your code a bit shorter, but not much:
foreach (var line in DeletedLines) {
db.DeleteObject(Order.Lines.SingleOrDefault(l => l.Key == line.Key));
}
I'm not sure if DeleteObject will throw an exception when you pass it null. If it does, you may be even better off using Single, as long as you're sure the item is in there:
foreach (var line in DeletedLines) {
db.DeleteObject(Order.Lines.Single(l => l.Key == line.Key));
}
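If a more set-like formulation is wanted, one option is to collect the view-model keys into a HashSet and select the matching entities in a single pass, then delete each one. The helper below is a hypothetical sketch (MatchByKey is not an Entity Framework method), and it still ends in one DeleteObject call per entity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class KeyMatchSketch
{
    // Hypothetical helper: returns the entities whose key also appears
    // among the models. The result is materialised with ToList so the
    // caller can delete entities without modifying the collection that
    // is being enumerated.
    public static List<TEntity> MatchByKey<TEntity, TModel, TKey>(
        IEnumerable<TEntity> entities,
        IEnumerable<TModel> models,
        Func<TEntity, TKey> entityKey,
        Func<TModel, TKey> modelKey)
    {
        var keys = new HashSet<TKey>(models.Select(modelKey));
        return entities.Where(e => keys.Contains(entityKey(e))).ToList();
    }
}
```

With the names from the question, usage might look like: foreach (var line in KeyMatchSketch.MatchByKey(Order.Lines, DeletedLines, l => l.Key, v => v.Key)) db.DeleteObject(line); This replaces the repeated per-item search of Order.Lines with a single pass and a hash lookup.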
If you don't want to re-query the database and either already have the mapping table PK values (or can include them in the client call), you could use one of Alex James's tips for deleting without first retrieving:
http://blogs.msdn.com/b/alexj/archive/2009/03/27/tip-9-deleting-an-object-without-retrieving-it.aspx

ADO.NET Data Services, LINQ

I have C# code to populate a dropdown list in Silverlight which works fine except when there are duplicates. I think because IEnumerable<Insurance.Claims> is a collection, it filters out duplicates. How would I code my LINQ query to accept duplicates?
My Sample Data looks like:
Code => CodeName
FGI Field General Initiative
SRI Static Resource Initiative
JFI Joint Field Initiative - This is "overwritten" in results
JFI Joint Friend Initiative
IEnumerable<Insurance.Claims> results;
// ADO.NET Data Service
var claim = (from c in DataEntities.Claims.Expand("Claimants").Expand("Policies")
where c.Claim_Number == claimNumber
select c);
DataServiceQuery<Insurance.Claims> dataServiceQuery =
claim as DataServiceQuery<Insurance.Claims>;
dataServiceQuery.BeginExecute((asyncResult) =>
{
results = dataServiceQuery.EndExecute(asyncResult);
if (results == null)
{
// Error
}
else
{
// Code to populate Silverlight form
}
});
(Not sure if you're still struggling with this but anyway...)
I'm pretty sure it's not the IEnumerable interface but the actual drop down that is causing this behaviour. The code is being used as the key, and so obviously each time the same code is encountered, the item is being overwritten.
I don't think you can override this unless you change the code, or use another identifier as the key field in the dropdown.
You may want to add a try-catch block around dataServiceQuery.EndExecute(asyncResult) to properly handle errors.

Resources