Using LINQ to change values in collection - linq

I think I'm not understanding LINQ well. I want to do:
foreach (MyObject myObject in myObjectCollection)
{
myObject.MyProperty = newValue;
}
Just change all values for a property in all elements of my collection.
With LINQ, wouldn't it be done this way?
myObjectCollection.Select(myObject => myObject.MyProperty = newValue)
It doesn't work. The property value is not changed. Why?
Edit:
Sorry, guys. Certainly, foreach is the right way. But, in my case, I must repeat the foreach over many collections, and I didn't want to repeat the loop. So, finally, I have found an 'intermediate' solution, the 'ForEach' method, which looks similar to the 'Select':
myObjectCollection.ForEach(myObject => myObject.MyProperty = newValue)
Anyway, maybe it's not as clear as the simpler:
foreach (MyObject myObject in myObjectCollection) myObject.MyProperty = newValue;

First off, this is not a good idea. See below for arguments against it.
It doesn't work. The property value is not changed. Why?
It doesn't work because Select() doesn't actually iterate through the collection until you enumerate its results, and it requires an expression that evaluates to a value.
If you make your expression return a value, and add something that will fully evaluate the query, such as ToList(), to the end, then it will "work", i.e.:
myObjectCollection.Select(myObject => { myObject.MyProperty = newValue; return myObject;}).ToList();
That being said, ToList() has some disadvantages. Mainly, it does a lot of extra work (creating a List<T>) that isn't needed, which adds avoidable cost. Avoiding it would require enumerating the query yourself:
foreach(var obj in myObjectCollection.Select(myObject => { myObject.MyProperty = newValue; return myObject; }))
{ }
Again, I wouldn't recommend this. At this point, the Select option is far uglier, more typing, etc. It's also a violation of the expectations involved with LINQ - LINQ is about querying, which suggests there shouldn't be side effects when using LINQ, and the entire purpose here is to create side effects.
But then, at this point, you're better off (with less typing) doing it the "clear" way:
foreach (var obj in myObjectCollection)
{
obj.MyProperty = newValue;
}
This is shorter, very clear in its intent, and very clean.
Note that you can add a ForEach<T> extension method which performs an action on each object, but I would still recommend avoiding that. Eric Lippert wrote a great blog post about the subject that is worth a read: "foreach" vs "ForEach".
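For reference, a ForEach<T> extension method of the kind mentioned above could be sketched as follows. This is only an illustration, not a recommendation: the class name EnumerableForEachSketch and the Item type are made up for the example, and the arguments against side effects in LINQ-style calls still apply.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical element type, used only for this illustration.
public class Item
{
    public int MyProperty { get; set; }
}

public static class EnumerableForEachSketch
{
    // Runs the action once per element. Unlike Select, this executes
    // immediately and returns nothing, so the side effect is explicit.
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (action == null) throw new ArgumentNullException(nameof(action));

        foreach (T item in source)
        {
            action(item);
        }
    }
}
```

With this in scope, something like items.ForEach(i => i.MyProperty = 42) works on any IEnumerable<Item>. On a List<T>, the built-in List<T>.ForEach instance method is picked instead of the extension.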

As sircodesalot mentioned, LINQ is not meant for something like that. Remember that LINQ is a querying language, which means it is designed for querying data. Changes should be made in other logic.
What you could do, if you don't want to use a plain foreach loop and your collection is already a List<T> (not just an IEnumerable<T>), is use the ForEach method to do what you're asking. Note that ForEach is a method of List<T> itself, not part of LINQ.
One other point I should mention is that the Select method projects each element into a new form and returns the results as an IEnumerable. So, if you wanted to grab only a specific property from each item in a collection, you would use that. That's all it does.
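The deferred behaviour of Select described in this thread can be observed with a small experiment. The sketch below assumes a made-up Widget type with an int MyProperty; it records the property value before and after the query is enumerated.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type standing in for MyObject from the question.
public class Widget
{
    public int MyProperty { get; set; }
}

public static class DeferredSelectSketch
{
    // Returns { value before enumeration, value after enumeration }.
    public static int[] Demo()
    {
        var widgets = new List<Widget> { new Widget(), new Widget() };

        // Building the query does not run the lambda yet.
        IEnumerable<int> query = widgets.Select(w => w.MyProperty = 7);
        int before = widgets[0].MyProperty;

        // Enumerating the query is what finally executes the assignment.
        foreach (int ignored in query) { }
        int after = widgets[0].MyProperty;

        return new[] { before, after };
    }
}
```

The first value is still the default 0 and the second is 7, which is why the original one-liner appeared to do nothing: its result was never enumerated.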

Related

Optimizing away OrderBy() when using Any()

So I have a fairly standard LINQ-to-Object setup.
var query = expensiveSrc.Where(x=> x.HasFoo)
.OrderBy(y => y.Bar.Count())
.Select(z => z.FrobberName);
// ...
if (!condition && !query.Any())
return; // seems to enumerate and sort entire enumerable
// ...
foreach (var item in query)
// ...
This enumerates everything twice. Which is bad.
var queryFiltered = expensiveSrc.Where(x=> x.HasFoo);
var query = queryFiltered.OrderBy(y => y.Bar.Count())
.Select(z => z.FrobberName);
if (!condition && !queryFiltered.Any())
return;
// ...
foreach (var item in query)
// ...
Works, but is there a better way?
Would there be any non-insane way to "enlighten" Any() to bypass the non-required operations? I think I remember this sort of optimisation going into EduLinq.
Why not just get rid of the redundant:
if (!query.Any())
return;
It really doesn't seem to be serving any purpose - even without it, the body of the foreach won't execute if the query yields no results. So with the Any() check in, you save nothing in the fast path, and enumerate twice in the slow path.
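A rough way to see the double enumeration is to count how often the source starts being enumerated. The sketch below uses a made-up ExpensiveSource iterator as a stand-in for expensiveSrc; the counter and method names are assumptions for the illustration only.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AnyCheckSketch
{
    // Number of times the source sequence has started being enumerated.
    public static int EnumerationStarts;

    // Hypothetical stand-in for expensiveSrc.
    public static IEnumerable<int> ExpensiveSource()
    {
        EnumerationStarts++; // runs on the first MoveNext of each enumeration
        for (int i = 5; i >= 1; i--)
        {
            yield return i;
        }
    }

    private static IEnumerable<int> BuildQuery()
    {
        return ExpensiveSource()
            .Where(x => x % 2 == 1)
            .OrderBy(x => x)
            .Select(x => x * 10);
    }

    // Any() followed by foreach: the source is walked twice.
    public static int RunWithAnyCheck()
    {
        EnumerationStarts = 0;
        var query = BuildQuery();
        if (!query.Any())
            return EnumerationStarts;
        foreach (var item in query) { }
        return EnumerationStarts;
    }

    // foreach alone: the source is walked once, and an empty
    // result simply means the loop body never runs.
    public static int RunWithoutAnyCheck()
    {
        EnumerationStarts = 0;
        var query = BuildQuery();
        foreach (var item in query) { }
        return EnumerationStarts;
    }
}
```

The counter only tracks how many enumerations start, not how much sorting each one does; that part depends on the LINQ implementation in use.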
On the other hand, if you must know if there were any results found after the end of the loop, you might as well just use a flag:
bool itemFound = false;
foreach (var item in query)
{
itemFound = true;
... // Rest of the loop body goes here.
}
if(itemFound)
{
// ...
}
Or you could use the enumerator directly if you're really concerned about the redundant flag-setting in the loop body:
using(var erator = query.GetEnumerator())
{
bool itemFound = erator.MoveNext();
if(itemFound)
{
do
{
// Do something with erator.Current;
} while(erator.MoveNext());
}
// Do something with itemFound
}
There is not much information that can be extracted from an enumerable, so maybe it's better to turn the query into an IQueryable? The Any extension method below walks down the expression tree, skipping all irrelevant operations, then compiles the remaining branch into a delegate that can be called to obtain an optimized IQueryable. The standard Queryable.Any method is then applied to it explicitly to avoid recursion. I'm not sure about corner cases, and maybe it makes sense to cache compiled queries, but with simple queries like yours it seems to work.
static class QueryableHelper {
public static bool Any<T>(this IQueryable<T> source) {
var e = source.Expression;
while (e is MethodCallExpression) {
var mce = e as MethodCallExpression;
switch (mce.Method.Name) {
case "Select":
case "OrderBy":
case "ThenBy": break;
default: goto dun;
}
e = mce.Arguments.First();
}
dun:
var d = Expression.Lambda<Func<IQueryable<T>>>(e).Compile();
return Queryable.Any(d());
}
}
Queries themselves must be modified like this:
var query = expensiveSrc.AsQueryable()
.Where(x=> x.HasFoo)
.OrderBy(y => y.Bar.Count())
.Select(z => z.FrobberName);
Would there be any non-insane way to "enlighten" Any() to bypass the non-required operations? I think I remember this sort of optimisation going into EduLinq.
Well I'm not going to ignore any question which mentions Edulinq :)
In this case, Edulinq might well be faster than LINQ to Objects, as its OrderBy implementation is as lazy as it can be - it only sorts as much as it needs to in order to retrieve the elements it returns.
However, fundamentally it still has to read the whole sequence in before it returns anything. After all, the last element in the sequence could be the first one which has to be returned.
If you're in control of the whole stack, you could make Any() detect that it's being called on your "known" IOrderedEnumerable implementation, and go straight to the original source. Note that this does create a change in the observed behaviour though - if iterating over the whole sequence throws an exception (or has any other side effect) then that side-effect would be lost by the optimization. You could argue that's okay, of course - what counts as "valid" optimization in LINQ is a decidedly tricky area.
One other possibility which is pretty horrible but which would solve this particular problem would be to make the iterator returned from the IOrderedEnumerable just take the first value of MoveNext() from the source. That's enough for the normal implementation of Any, and at that point we don't need to know what the first element is. We could defer the actual sorting until the first time the Current property is used.
That's a pretty special-case optimization though - and one which I'd be wary to implement. I think Ani's approach is the better one - just use the fact that iterating over query using foreach will never go into the body of the loop if the query results are empty.
Edit (revised): This answer addresses the issue of the query executing twice, which I believe is the key issue. See below for why:
Making Any() smarter is something that only the LINQ implementers can do, IMO... Or it would be some dirty adventure using reflection.
Using a class as shown below, you can cache the output of the original enumerable, and let it be enumerated twice:
public class CachedEnumerable<T>
{
public CachedEnumerable(IEnumerable<T> enumerable)
{
_source = enumerable.GetEnumerator();
}
public IEnumerable<T> Enumerate()
{
int itemIndex = 0;
while (true)
{
if (itemIndex < _cache.Count)
{
yield return _cache[itemIndex];
itemIndex++;
continue;
}
if (!_source.MoveNext())
yield break;
var current = _source.Current;
_cache.Add(current);
yield return current;
itemIndex++;
}
}
private List<T> _cache = new List<T>();
private IEnumerator<T> _source;
}
This way you keep the lazy aspect of LINQ and keep the code readable and generic. It will be slower than directly using IEnumerator<>. There are lots of opportunities to extend and optimize this class, such as a policy for discarding old items, getting rid of the coroutine, etc. But that is beside the point of this question, I think.
Oh, and the class is not thread safe as it is now. This wasn't asked, but I can imagine people trying. I think this could be easily added, if the source enumerable has no thread affinity.
Why would this be optimal?
Let's consider two possibilities: the enumeration either contains elements or it does not.
If it contains elements, this approach is optimal, as the query is only run once.
If it contains no elements, you would be tempted to eliminate the OrderBy and Select parts of your query, as they add no value. But if there are zero items after the Where() clause, there are zero items to sort, which will cost zero time (well, almost). The same goes for the Select() clause.
What if this is not fast enough yet? In that case my strategy would be to bypass LINQ. Now, I really love LINQ, but its elegance comes at a price. So for every 100 uses of LINQ, there will typically be one or two computations that must execute really fast, which I write with good old for loops and lists. Part of mastering a technology is recognizing where it is not appropriate. LINQ is no exception to that rule.
Try this:
var items = expensiveSrc.Where(x=> x.HasFoo)
.OrderBy(y => y.Bar.Count())
.Select(z => z.FrobberName).ToList();
// ...
if (!condition && items.Count == 0)
return; // Just check the count
// ...
foreach (var item in items)
// ...
The query is executed just once.
but I've lost the streaming/lazy loading that's half the point of linq
With lazy loading (deferred execution), two LINQ queries with disparate results cannot be optimized (reduced) to a single query execution.
Why are you not using .ToArray()?
var query = expensiveSrc.Where(x=> x.HasFoo)
.OrderBy(y => y.Bar.Count())
.Select(z => z.FrobberName).ToArray();
If there are no elements, sorting and selecting should not add much overhead. If you are sorting, you need a buffer to store the data anyway, so the overhead that .ToArray adds should not be large.
If you decompile the OrderedEnumerable class, you will find that it builds an int[] array that refers to the buffered elements, so by calling .ToArray (or .ToList) you just create one new array of references.
BUT
If expensiveSrc comes from a database, other strategies could be better. If the ordering can be done in the database, this approach adds quite a lot of overhead because the data is stored twice.

How can I use set operations to delete objects in an entitycollection that match a collection of view models?

Here is a very basic example of what I want to do. The code I have come up with seems quite verbose... ie looping through the collection, etc.
I am using a Telerik MVC grid that posts back a collection of deleted, inserted and updated ViewModels. The view models are similar but not exactly the same as the entity.
For example... I have:
Order.Lines. Lines is an entity collection (navigation property) containing OrderDetail records. In the update action of my controller I have a List named DeletedLines, pulled from the POST data. I have also queried the database and have the Order entity, including the Lines collection.
Now I basically want to tell it to delete all the OrderDetails in the Lines EntityCollection that match the DeletedLines.
The way I have done it is something like:
foreach (var line in DeletedLines) {
db.DeleteObject(Order.Lines.Where(l => l.Key == line.Key).SingleOrDefault());
}
I was hoping there was a way that I could use .Intersect() to get a collection of entities to delete and pass that to DeleteObject. However, DeleteObject seems to only accept a single entity rather than a collection.
Perhaps the above is good enough, but it seemed like there should be an easier method.
Thanks,
BOb
Are the items in DeletedLines attached to the context? If so, what about this?
foreach (var line in DeletedLines) db.DeleteObject(line);
Response to comment #1
Ok, I see now. You can make your code a bit shorter, but not much:
foreach (var line in DeletedLines) {
db.DeleteObject(Order.Lines.SingleOrDefault(l => l.Key == line.Key));
}
I'm not sure if DeleteObject will throw an exception when you pass it null. If it does, you may be even better off using Single, as long as you're sure the item is in there:
foreach (var line in DeletedLines) {
db.DeleteObject(Order.Lines.Single(l => l.Key == line.Key));
}
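If a more set-like formulation is wanted, one option is to collect the deleted keys first and select all matching entities in a single pass. The sketch below works on plain in-memory collections; the OrderDetail and OrderDetailViewModel classes and the int Key property are stand-ins assumed for the example, not the real entity types.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entity and the posted view model.
public class OrderDetail
{
    public int Key { get; set; }
}

public class OrderDetailViewModel
{
    public int Key { get; set; }
}

public static class DeleteMatchingSketch
{
    // Returns the entities whose Key appears among the deleted view models.
    public static List<OrderDetail> FindMatches(
        IEnumerable<OrderDetail> lines,
        IEnumerable<OrderDetailViewModel> deletedLines)
    {
        // A HashSet gives constant-time key lookups while filtering.
        var deletedKeys = new HashSet<int>(deletedLines.Select(d => d.Key));

        // ToList() takes a snapshot, so the caller can delete entities
        // without modifying the collection that is being enumerated.
        return lines.Where(l => deletedKeys.Contains(l.Key)).ToList();
    }
}
```

The caller would still loop over the result and pass each entity to DeleteObject, since that method takes one entity at a time. Keys with no matching entity are simply ignored, which avoids the null question raised above.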
If you don't want to re-query the database and either already have the mapping table PK values (or can include them in the client call), you could use one of Alex James's tips for deleting without first retrieving:
http://blogs.msdn.com/b/alexj/archive/2009/03/27/tip-9-deleting-an-object-without-retrieving-it.aspx

EF4.1 LINQ, selecting all results

I am new to LINQ queries and to EF too. I usually work with MySQL, and I can't figure out how to write really simple queries.
I'd like to select all results from a table, so I tried this:
ZXContainer db = new ZXContainer();
ViewBag.ZXproperties = db.ZXproperties.All();
But I see that I have to write something inside All(---).
Could someone show me how I could do that? And if someone has any good links for reference, I would be very grateful.
All() is a boolean evaluation performed on the elements of a collection (it returns false as soon as it reaches an element for which the condition is false). For example, say you want to make sure that all of the ZXproperties have a certain field set to true:
bool isTrue = db.ZXproperties.All(z => z.SomeFieldName == true);
This sets isTrue to either true or false. LINQ is typically lazy-loading, so if you use db.ZXproperties directly, you have access to all of the objects as is, but it isn't quite what you're looking for. You can either load all of the objects at the variable assignment with a .ToList():
ViewBag.ZXproperties = db.ZXproperties.ToList();
or you can use the below expression:
ViewBag.ZXproperties = from s in db.ZXproperties
select s;
Which is really no different than saying:
ViewBag.ZXproperties = db.ZXproperties;
The advantage of .ToList() is that if you want to make multiple calls on this ViewBag.ZXproperties, it only requires the initial database call made when the variable is assigned. Without it, any form of queryable action on the data, such as .Where(), causes another query to be performed, which is less than ideal if you already have the data to work with.
To select everything, just skip the .All(...), as ZXproperties already is a collection.
ZXContainer db = new ZXContainer();
ViewBag.ZXproperties = db.ZXproperties;
You might want (or sometimes even need) to call .ToList() on this collection before use...
You don't use All. Just type
ViewBag.ZXproperties = db.ZXproperties;
or
ViewBag.ZXproperties = db.ZXproperties.ToList();
The All method is used to determine if all items of collection match some condition.
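To make the semantics of All concrete, here is a small in-memory sketch. It uses plain integers rather than the ZXproperties entities, and the AllSketch class name is made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class AllSketch
{
    // True only if every value satisfies the condition.
    // Enumeration stops at the first value that fails it.
    public static bool AllPositive(IEnumerable<int> values)
    {
        return values.All(v => v > 0);
    }
}
```

AllPositive returns true for { 1, 2, 3 }, false for { 1, -2 }, and true for an empty sequence, because no element violates the condition. In every case the result is a single bool, never the items themselves.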
If you just want all of the items, you can just use it directly:
ViewBag.ZXproperties = db.ZXproperties;
If you want this evaluated immediately, you can convert it to a list:
ViewBag.ZXproperties = db.ZXproperties.ToList();
This will force it to be pulled across the wire immediately.
You can use this:
var result = db.ZXproperties.ToList();
For more information on LINQ, see the 101 LINQ Samples.
All checks a condition on every item; the argument passed to it is called a lambda expression.

Sorting an observable collection with linq

I have an observable collection and I sort it using linq. Everything is great, but the problem I have is how do I sort the actual observable collection? Instead I just end up with some IEnumerable thing and I end up clearing the collection and adding the stuff back in. This can't be good for performance. Does anyone know of a better way to do this?
If you are using Silverlight 3.0, then using CollectionViewSource is the cleanest way. See the example below (it can be done via XAML as well):
ObservableCollection<DateTime> ecAll = new ObservableCollection<DateTime>();
CollectionViewSource sortedcvs = new CollectionViewSource();
sortedcvs.SortDescriptions.Add(new System.ComponentModel.SortDescription("Date",
System.ComponentModel.ListSortDirection.Ascending));
sortedcvs.Source = ecAll;
ListBoxContainer.DataContext = sortedcvs;
And in the corresponding XAML, set
ItemsSource="{Binding}"
for the ListBox or any ItemsControl-derived control.
Since the collection doesn't provide any Sort mechanism, this is probably the most practical option. You could implement a sort manually using Move etc., but it will probably be slower than doing it this way.
var arr = list.OrderBy(x => x.SomeProp).ToArray();
list.Clear();
foreach (var item in arr) {
list.Add(item);
}
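For comparison, the manual Move-based sort mentioned above might look roughly like the following. This is a sketch, not a tuned implementation: the SortInPlace name is made up, and it assumes the element type's default equality is enough to find each item in the collection again.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public static class ObservableSortSketch
{
    // Reorders the collection in place using Move, so items are never
    // removed and re-added (each Move raises one change notification).
    public static void SortInPlace<T, TKey>(
        this ObservableCollection<T> collection, Func<T, TKey> keySelector)
    {
        // Compute the desired order once, on a snapshot.
        List<T> sorted = collection.OrderBy(keySelector).ToList();
        EqualityComparer<T> comparer = EqualityComparer<T>.Default;

        for (int target = 0; target < sorted.Count; target++)
        {
            // Positions before 'target' are already final, so search
            // from 'target' onwards (this also handles duplicates).
            int current = target;
            while (!comparer.Equals(collection[current], sorted[target]))
            {
                current++;
            }

            if (current != target)
            {
                collection.Move(current, target);
            }
        }
    }
}
```

The search makes this quadratic in the worst case, which fits the remark that a Move-based sort is likely slower than clearing and re-adding.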
Additionally, you might consider unbinding any UI elements while sorting (via either approach), so that you only pay to re-bind once.
Interestingly, if this was BindingList<T>, you could use RaiseListChangedEvents to minimise the number of notifications:
var arr = list.OrderBy(x => x).ToArray();
bool oldRaise = list.RaiseListChangedEvents;
list.RaiseListChangedEvents = false;
try {
list.Clear();
foreach (var item in arr) {
list.Add(item);
}
} finally {
list.RaiseListChangedEvents = oldRaise;
if (oldRaise) list.ResetBindings();
}
Note that in LINQ, you are given an IEnumerable from your query, and that query has not executed yet. Therefore, the following code only runs the query once, to add its results to an ObservableCollection:
var query = from x in Data
where x.Tag == "Something"
select x;
foreach(var item in query)
MyObservableCollection.Add(item);
Take a look at the "OrderBy" extension on IEnumerable:
foreach(var item in query.OrderBy(x => x.Name))
MyObservableCollection.Add(item);
ObservableCollections aren't designed to be sortable. List is sortable, and that's the underlying mechanism used by the answer referencing List.Sort(), but ObservableCollection isn't derived from List, so you're out of luck there. IMO, the "right" solution is not to try to sort the ObservableCollection, but to implement ICollectionView and bind an instance of that to your control. That interface adds methods for sorting and has the additional benefit that it's recognized by Silverlight controls (well, the ones that support it anyway, such as DataGrid), so your sorting could be utilized directly from the UI layer. This question might be helpful:
Silverlight and icollectionview
I followed the link mentioned in this post: http://mokosh.co.uk/post/2009/08/04/how-to-sort-observablecollection/comment-page-1/#comment-75
but I'm having issues getting it to work in Silverlight.
I created a property: public SortableObservableCollection Terms
When I call Terms.Sort(new TermComparer()), the records are still displayed unsorted in the UI.
Could someone suggest what could be going wrong? Thanks.
I found this on CodePlex:
Sorted Collections
Haven't used it yet though.
Rick

Using LINQ Expression Instead of NHibernate.Criterion

If I want to select some rows based on certain criteria, I can use an ICriterion object from NHibernate.Criterion, such as this:
public List<T> GetByCriteria()
{
SimpleExpression newJobCriterion =
NHibernate.Criterion.Expression.Eq("LkpStatu", statusObject);
ICriteria criteria = Session.GetISession().CreateCriteria(typeof(T)).SetMaxResults(maxResults);
criteria.Add(newJobCriterion );
return criteria.List<T>();
}
Or I can use LINQ's where clause to filter what I want:
public List<T> GetByCriteria_LINQ()
{
ICriteria criteria = Session.GetISession().CreateCriteria(typeof(T)).SetMaxResults(maxResults);
return criteria.Where(item => item.LkpStatu == statusObject).ToList();
}
I would prefer the second one, of course, because:
It gives me strong typing
I don't need to learn yet-another-syntax in the form of NHibernate
The question is: is there any performance advantage of the first one over the second one? From what I know, the first one will create SQL queries, so it will filter the data before it is passed into memory. Is this kind of performance saving big enough to justify its use?
As usual, it depends. First, note that in your second snippet there is a .List() missing right after return criteria. Also note that you won't get the same results from both examples. The first one applies the where and then returns the top maxResults; the second one, however, first selects the top maxResults and then applies the where.
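The difference in results can be reproduced with plain LINQ to Objects. The sketch below is only an analogy: it uses integers and Take in place of the NHibernate query and SetMaxResults, and the class and method names are made up.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class FilterOrderSketch
{
    // Filter first, then limit: up to 'max' matching items.
    public static List<int> FilterThenLimit(IEnumerable<int> source, int max)
    {
        return source.Where(x => x % 2 == 0).Take(max).ToList();
    }

    // Limit first, then filter: only matches among the first 'max' items.
    public static List<int> LimitThenFilter(IEnumerable<int> source, int max)
    {
        return source.Take(max).Where(x => x % 2 == 0).ToList();
    }
}
```

For the numbers 1 to 10 and a limit of 3, the first method returns 2, 4, 6 while the second returns only 2, so the two approaches are not interchangeable.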
If your expected result set is relatively small and you are likely to use some of the results in lazy loads, then it's actually better to take the second approach, because all entities loaded through a session stay in its first-level cache.
Usually however you don't do it this way and use the first approach.
Perhaps you wanted to use NHibernate.Linq (located in the Contrib project), which does the LINQ translation to Criteria for you.
I combined the two and made this:
var crit = _session.CreateCriteria(typeof (T)).SetMaxResults(100);
return (from x in _session.Linq<T>(crit) where x.field == <something> select x).ToList();

Resources