Linq - 'Saving' OrderBy operation (c#)

Assume I have a generic list L of some type in C#. Then, using LINQ, I call OrderBy() on it, passing in a lambda expression.
If I then re-assign L, the previous ordering will obviously be lost.
Is there any way I can 'save' the lambda expression I used on the list before I reassigned it, and re-apply it?

Use a Func delegate to store your ordering then pass that to the OrderBy method:
Func<int, int> orderFunc = i => i; // func for ordering
var list = Enumerable.Range(1,10).OrderByDescending(i => i); // 10, 9 ... 1
var newList = list.OrderBy(orderFunc); // 1, 2 ... 10
As another example consider a Person class:
public class Person
{
public int Id { get; set; }
public string Name { get; set; }
}
Now you want to preserve a sort order that sorts by the Name property. In this case the Func operates on a Person type (T) and the TResult will be a string since Name is a string and is what you are sorting by.
Func<Person, string> nameOrder = p => p.Name;
var list = new List<Person>
{
new Person { Id = 1, Name = "ABC" },
new Person { Id = 2, Name = "DEF" },
new Person { Id = 3, Name = "GHI" },
};
// descending order by name
foreach (var p in list.OrderByDescending(nameOrder))
Console.WriteLine(p.Id + ":" + p.Name);
// 3:GHI
// 2:DEF
// 1:ABC
// re-assigning the list
list = new List<Person>
{
new Person { Id = 23, Name = "Foo" },
new Person { Id = 14, Name = "Buzz" },
new Person { Id = 50, Name = "Bar" },
};
// reusing the order function (ascending by name in this case)
foreach (var p in list.OrderBy(nameOrder))
Console.WriteLine(p.Id + ":" + p.Name);
// 50:Bar
// 14:Buzz
// 23:Foo
EDIT: be sure to add ToList() after the OrderBy calls if you need a List<T> since the LINQ methods will return an IEnumerable<T>.

Calling ToList() or ToArray() on your IEnumerable<T> will cause it to be immediately evaluated. You can then assign the resulting list or array to "save" your ordered list.
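If the sort direction should be saved together with the key, one option is to store the whole ordering step as a delegate that maps a sequence to an ordered sequence. The following is only a minimal sketch of that idea; the names SavedOrderingDemo and ByLengthDescending are illustrative and do not come from the answers above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SavedOrderingDemo
{
    // Hypothetical helper: the whole ordering step (key and direction) kept in one delegate.
    public static readonly Func<IEnumerable<string>, IOrderedEnumerable<string>> ByLengthDescending =
        items => items.OrderByDescending(s => s.Length);

    public static void Main()
    {
        var list = new List<string> { "pear", "fig", "banana" };
        Console.WriteLine(string.Join(", ", ByLengthDescending(list))); // banana, pear, fig

        // Re-assign the list, then re-apply the saved ordering.
        list = new List<string> { "kiwi", "watermelon", "plum" };
        Console.WriteLine(string.Join(", ", ByLengthDescending(list))); // watermelon, kiwi, plum
    }
}
```

Because the delegate returns an IOrderedEnumerable<T>, a caller can still chain ThenBy() or ToList() onto the result.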

Related

LINQ DistinctBy choosing what object to keep

I have a list of objects and I don't want to allow duplicates of a certain attribute of the objects. My understanding is that I can use DistinctBy() to remove one of the objects. My question is: how do I choose which of the objects with the same attribute value I keep?
Example:
How would I go about removing any objects with a duplicate value of "year" in the list tm and keep the object with the highest value of someValue?
class TestModel{
public int year{ get; set; }
public int someValue { get; set; }
}
List<TestModel> tm = new List<TestModel>();
//populate list
//I was thinking something like this
tm.DistinctBy(x => x.year).Select(x => max(X=>someValue))
You can use GroupBy and Aggregate (LINQ had no built-in MaxBy method before .NET 6):
tm
.GroupBy(x => x.year)
.Select(g => g.Aggregate((acc, next) => acc.someValue > next.someValue ? acc : next))
Use GroupBy followed by the SelectMany/Take(1) pattern with an OrderByDescending:
IEnumerable<TestModel> result =
tm
.GroupBy(x => x.year)
.SelectMany(xs =>
xs
.OrderByDescending(x => x.someValue)
.Take(1));
Here's an example:
List<TestModel> tm = new List<TestModel>()
{
new TestModel() { year = 2020, someValue = 5 },
new TestModel() { year = 2020, someValue = 15 },
new TestModel() { year = 2019, someValue = 6 },
};
That gives me the item with year 2020 and someValue 15, followed by the item with year 2019 and someValue 6.

enumerable group field using Linq?

I've written a LINQ query like this:
var fs = list
.GroupBy(i =>
new {
X = i.X,
Ps = i.Properties.Where(p => p.Key.Equals("m")) // <-- the field in question
}
)
.Select(g => g.Key);
Am I able to group by IEnumerable.Where(...) fields?
The grouping won't work here.
When grouping, the runtime compares group keys in order to produce the groups. However, the group key contains a property (Ps) that is a distinct IEnumerable<T> instance for each item in list, and those instances are compared by reference equality, not by sequence equality. Every element therefore ends up with its own key. In other words, if you have two items:
var a = new { X = 1, Properties = new[] { "m" } };
var b = new { X = 1, Properties = new[] { "m" } };
The GroupBy clause will give you two distinct keys.
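The effect can be reproduced with a small, self-contained program. This is only a sketch: GroupKeyDemo and CountGroups are made-up names, and the filter is simplified to p == "m" because the sample Properties here are plain strings.

```csharp
using System;
using System.Linq;

public static class GroupKeyDemo
{
    // Counts the groups produced when the key contains a freshly created sequence.
    public static int CountGroups()
    {
        var items = new[]
        {
            new { X = 1, Properties = new[] { "m" } },
            new { X = 1, Properties = new[] { "m" } },
        };

        return items
            .GroupBy(i => new { i.X, Ps = i.Properties.Where(p => p == "m") })
            .Count();
    }

    public static void Main()
    {
        // Prints 2: X is equal, but each Ps is a different object,
        // so the two anonymous keys do not compare as equal.
        Console.WriteLine(CountGroups());
    }
}
```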
If your intent is to just project the items into the structure of the GroupBy key then you don't need the grouping; the query below should give the same result:
var fs = list.Select(item => new
{
item.X,
Ps = item.Properties.Where(p => p.Key == "m")
});
However, if you do require the results to be distinct, you'll need to create a separate class for your result and implement a separate IEqualityComparer<T> to be used with Distinct clause:
public class Result
{
public int X { get; set; }
public IEnumerable<string> Ps { get; set; }
}
public class ResultComparer : IEqualityComparer<Result>
{
public bool Equals(Result a, Result b)
{
return a.X == b.X && a.Ps.SequenceEqual(b.Ps);
}
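// Hash on X only: objects that are equal under Equals always share the same X.
public int GetHashCode(Result obj)
{
return obj.X.GetHashCode();
}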
}
Having the above you can use Distinct on the first query to get distinct results:
var fs = list.Select(item => new Result
{
X = item.X,
Ps = item.Properties.Where( p => p.Key == "m")
}).Distinct(new ResultComparer());

Linq join two lists: is it more efficient to use Dictionary?

Final rephrase
Below I join two sequences and I wondered if it would be faster to create a Dictionary of one sequence with the keySelector of the join as key and iterate through the other collection and find the key in the dictionary.
This only works if the keys are unique. A real join has no problem with two records having the same key, but a dictionary requires unique keys.
I measured the difference and found that the dictionary method is about 13% faster, which is negligible in most use cases. See my answer to this question.
Rephrased question
Some suggested that this question is the same question as LINQ - Using where or join - Performance difference?, but this one is not about using where or join, but about using a Dictionary to perform the join.
My question is: if I want to join two sequences based on a key selector, which method would be faster?
Put all items of one sequence in a Dictionary and enumerate the other sequence to see if the item is in the Dictionary. This would mean to iterate through both sequences once and calculate hash codes on the keySelector for every item in both sequences once.
The other method: use System.Linq.Enumerable.Join.
The question is: would Enumerable.Join, for each element in the first list, iterate through the elements in the second list to find a match according to the key selector, comparing N * N elements (quadratic time), or would it use a more advanced method?
Original question with examples
I have two classes, both with a property Reference. I have two sequences of these classes and I want to join them based on equal Reference.
public class ClassA
{
    public string Reference { get; set; }
    ...
}
public class ClassB
{
    public string Reference { get; set; }
    ...
}
var listA = new List<ClassA>()
{
    new ClassA() { Reference = "1", ... },
    new ClassA() { Reference = "2", ... },
    new ClassA() { Reference = "3", ... },
    new ClassA() { Reference = "4", ... },
};
var listB = new List<ClassB>()
{
    new ClassB() { Reference = "1", ... },
    new ClassB() { Reference = "3", ... },
    new ClassB() { Reference = "5", ... },
    new ClassB() { Reference = "7", ... },
};
After the join I want combinations of ClassA objects and ClassB objects that have an equal Reference. This is quite simple to do:
var myJoin = listA.Join(listB, // join listA and listB
a => a.Reference, // from listA take Reference
b => b.Reference, // from listB take Reference
(objectA, objectB) => // if references equal
new {A = objectA, B = objectB}); // return combination
I'm not sure how this works, but I can imagine that for each a in listA, listB is iterated to see if there is a b in listB with the same Reference as a.
Question: if I know that the references are distinct, wouldn't it be more efficient to convert listB into a Dictionary and look up the Reference of each element in listA?
var dictB = listB.ToDictionary(b => b.Reference);
var myJoin = listA
.Where(a => dictB.ContainsKey(a.Reference))
.Select(a => new { A = a, B = dictB[a.Reference] });
This way, every element of listB has to be accessed once to put it in the dictionary, every element of listA has to be accessed once, and the hash code of each Reference has to be calculated once.
Would this method be faster for large collections?
I created a test program for this and measured the time it took.
Suppose I have a class Person; each person has a Name and a Father property of type Person. If the father is not known, the Father property is null.
I have a sequence of Bastards (no father) that each have exactly one son and one daughter. All daughters are put in one sequence; all sons are put in another sequence.
The query: join the sons and the daughters that have the same father.
Results: Joining 1 million families using Enumerable.Join took 1.169 sec. Joining them using Dictionary join used 1.024 sec. Ever so slightly faster.
The code:
class Person : IEquatable<Person>
{
public string Name { get; set; }
public Person Father { get; set; }
// + a lot of equality functions get hash code etc
// for those interested: see the bottom
}
const int nrOfBastards = 1000000; // one million
var bastards = Enumerable.Range (0, nrOfBastards)
.Select(i => new Person()
{ Name = 'B' + i.ToString(), Father = null })
.ToList();
var sons = bastards.Select(father => new Person()
{Name = "Son of " + father.Name, Father = father})
.ToList();
var daughters = bastards.Select(father => new Person()
{Name = "Daughter of " + father.Name, Father = father})
.ToList();
// join on same parent: Traditionally and using Dictionary
var stopwatch = Stopwatch.StartNew();
this.TraditionalJoin(sons, daughters);
var time = stopwatch.Elapsed;
Console.WriteLine("Traditional join of {0} sons and daughters took {1:F3} sec", nrOfBastards, time.TotalSeconds);
stopwatch.Restart();
this.DictionaryJoin(sons, daughters);
time = stopwatch.Elapsed;
Console.WriteLine("Dictionary join of {0} sons and daughters took {1:F3} sec", nrOfBastards, time.TotalSeconds);
}
private void TraditionalJoin(IEnumerable<Person> boys, IEnumerable<Person> girls)
{ // join on same parent
var family = boys
.Join(girls,
boy => boy.Father,
girl => girl.Father,
(boy, girl) => new { Son = boy.Name, Daughter = girl.Name })
.ToList();
}
private void DictionaryJoin(IEnumerable<Person> sons, IEnumerable<Person> daughters)
{
var sonsDictionary = sons.ToDictionary(son => son.Father);
var family = daughters
.Where(daughter => sonsDictionary.ContainsKey(daughter.Father))
.Select(daughter => new { Son = sonsDictionary[daughter.Father], Daughter = daughter })
.ToList();
}
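The small difference is less surprising once you know that Enumerable.Join is not a nested loop: in the .NET implementations I am aware of, it builds a hash-based lookup over the inner sequence and then streams the outer sequence against it. A hand-written hash join can keep the duplicate-key tolerance of Join by using ToLookup instead of ToDictionary. The following is a sketch under that assumption; LookupJoinDemo and JoinByKey are hypothetical names, and KeyValuePair<int, string> merely stands in for the real classes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupJoinDemo
{
    // Hypothetical helper: a hash join that, unlike ToDictionary, accepts duplicate keys.
    public static List<string> JoinByKey(
        IEnumerable<KeyValuePair<int, string>> left,
        IEnumerable<KeyValuePair<int, string>> right)
    {
        // One pass over 'right' to build the lookup, one pass over 'left' to probe it.
        ILookup<int, string> rightByKey = right.ToLookup(r => r.Key, r => r.Value);

        // A missing key yields an empty sequence, so unmatched left items drop out.
        return left
            .SelectMany(l => rightByKey[l.Key], (l, r) => l.Value + "-" + r)
            .ToList();
    }

    public static void Main()
    {
        var left = new[]
        {
            new KeyValuePair<int, string>(1, "a1"),
            new KeyValuePair<int, string>(2, "a2"),
        };
        var right = new[]
        {
            new KeyValuePair<int, string>(1, "b1"),
            new KeyValuePair<int, string>(1, "b1bis"), // duplicate key
            new KeyValuePair<int, string>(3, "b3"),
        };

        foreach (var pair in JoinByKey(left, right))
            Console.WriteLine(pair); // a1-b1, then a1-b1bis
    }
}
```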
For those interested in the equality of Persons, needed for a proper dictionary:
class Person : IEquatable<Person>
{
public string Name { get; set; }
public Person Father { get; set; }
public bool Equals(Person other)
{
if (other == null)
return false;
else if (Object.ReferenceEquals(this, other))
return true;
else if (this.GetType() != other.GetType())
return false;
else
return String.Equals(this.Name, other.Name, StringComparison.OrdinalIgnoreCase);
}
public override bool Equals(object obj)
{
return this.Equals(obj as Person);
}
public override int GetHashCode()
{
// Must agree with Equals, which compares Name only and ignores case.
return StringComparer.OrdinalIgnoreCase.GetHashCode(this.Name);
}
public override string ToString()
{
return this.Name;
}
public static bool operator==(Person x, Person y)
{
if (Object.ReferenceEquals(x, null))
return Object.ReferenceEquals(y, null);
else
return x.Equals(y);
}
public static bool operator!=(Person x, Person y)
{
return !(x==y);
}
}

Find / Count Redundant Records in a List<T>

I am looking for a way to identify duplicate records, except that in this case I want and expect to see them.
The records aren't complete duplicates, but I am unconcerned with the unique fields at this point. I just want to see if someone has made X# payments of the exact same amount, via the exact same card, to the exact same person. (Bogus example just to illustrate.)
The collection is a List<>, and whatever X# is, List<>.Count will equal X#. In other words, all the records in the list must match (again, only on the fields I am concerned with) or I will reject it.
The best I can come up with is to take the first record, get the value of, say, PayAmount, and use LINQ on the other records to see if they have the same PayAmount value, then repeat for all fields to be matched. This seems horribly inefficient, but I am not smart enough to think of a better way.
So any thoughts, ideas, pointers would be greatly appreciated.
JB
Something like this should do it.
var duplicates = list.GroupBy(x => new { x.Amount, x.CardNumber, x.PersonName })
.Where(x => x.Count() > 1);
Working example:
class Program
{
static void Main(string[] args)
{
List<Entry> table = new List<Entry>();
var dup1 = new Entry
{
Name = "David",
CardNumber = 123456789,
PaymentAmount = 70.00M
};
var dup2 = new Entry
{
Name = "Daniel",
CardNumber = 987654321,
PaymentAmount = 45.00M
};
//3 duplicates
table.Add(dup1);
table.Add(dup1);
table.Add(dup1);
//2 duplicates
table.Add(dup2);
table.Add(dup2);
//Find duplicates query
var query = from p in table
group p by new { p.Name, p.CardNumber, p.PaymentAmount } into g
where g.Count() > 1
select new
{
name = g.Key.Name,
cardNumber = g.Key.CardNumber,
amount = g.Key.PaymentAmount,
count = g.Count()
};
foreach (var item in query)
{
Console.WriteLine("{0}, {1}, {2}, {3}", item.name, item.cardNumber, item.amount, item.count);
}
Console.ReadKey();
}
}
public class Entry
{
public string Name { get; set; }
public int CardNumber { get; set; }
public decimal PaymentAmount { get; set; }
}
The meat of which is this:
var query = from p in table
group p by new { p.Name, p.CardNumber, p.PaymentAmount } into g
where g.Count() > 1
select new
{
name = g.Key.Name,
cardNumber = g.Key.CardNumber,
amount = g.Key.PaymentAmount,
count = g.Count()
};
Your unique entries are based on the three criteria Name, CardNumber, and PaymentAmount, so you group by them and then use .Count() to count how many entries share each combination. where g.Count() > 1 filters the groups down to duplicates only.
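The question also asks for an accept-or-reject check: every record in the list must match on the fields of interest. One way to express that, sketched below, is to project each record to an anonymous key and test that exactly one distinct key exists. Payment and AllMatch are hypothetical names that mirror the Entry class above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record type with the same fields as Entry above.
public class Payment
{
    public string Name { get; set; }
    public int CardNumber { get; set; }
    public decimal PaymentAmount { get; set; }
}

public static class AllMatchDemo
{
    // True when every record shares the same Name, CardNumber and PaymentAmount.
    // An empty list yields false, because it has no distinct key at all.
    public static bool AllMatch(IEnumerable<Payment> payments)
    {
        return payments
            .Select(p => new { p.Name, p.CardNumber, p.PaymentAmount })
            .Distinct()
            .Count() == 1;
    }

    public static void Main()
    {
        var same = new List<Payment>
        {
            new Payment { Name = "David", CardNumber = 123456789, PaymentAmount = 70.00M },
            new Payment { Name = "David", CardNumber = 123456789, PaymentAmount = 70.00M },
        };
        var mixed = new List<Payment>(same)
        {
            new Payment { Name = "Daniel", CardNumber = 987654321, PaymentAmount = 45.00M },
        };

        Console.WriteLine(AllMatch(same));  // True
        Console.WriteLine(AllMatch(mixed)); // False
    }
}
```

Anonymous types compare by value, property by property, which is why Distinct() needs no custom comparer here.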

How to use LINQ where clause for filtering the collection having dynamically generated type

I need to filter an ObservableCollection using a LINQ Where clause in my Silverlight application.
The object type is created dynamically using the method described at the following URL.
http://mironabramson.com/blog/post/2008/06/Create-you-own-new-Type-and-use-it-on-run-time-(C).aspx
Is filtering my collection using Where clause for specific property possible?
How can I achieve it?
Thanks
The only way I know is using reflection, like this:
// using a list of dynamically typed objects (anonymous types expose properties, not fields)
var items = new List<object> { new { A = 0, B = 1 }, new { A = 1, C = 0 } };
// select all items with A > 0
var filteredItems = items.Where(obj => (int)obj.GetType().GetProperty("A").GetValue(obj, null) > 0).ToArray();
// if your type has a public field instead of a property, call GetField(), like this:
obj.GetType().GetField("FieldName").GetValue(obj)
Since you know the element type of your collection only at run time, it is probably object at compile time. So the argument to the .Where method has to be a Func<object, bool>.
Here's a piece of code that creates such a delegate, given a property of the actual element type and a lambda expression on the property value (whose type I suppose you know):
/// <summary>
/// Get a predicate for a property on a parent element.
/// </summary>
/// <param name="property">The property of the parent element to get the value for.</param>
/// <param name="propertyPredicate">The predicate on the property value.</param>
static Func<object, bool> GetPredicate<TProperty>(PropertyInfo property, Expression<Func<TProperty, bool>> propertyPredicate)
{
if (property.PropertyType != typeof(TProperty)) throw new ArgumentException("Bad property type.");
var pObj = Expression.Parameter(typeof(object), "obj");
// ((elementType)obj).property;
var xGetPropertyValue = Expression.Property(Expression.Convert(pObj, property.DeclaringType), property);
var pProperty = propertyPredicate.Parameters[0];
// obj => { var pProperty = xGetPropertyValue; return propertyPredicate.Body; };
var lambda = Expression.Lambda<Func<object, bool>>(Expression.Block(new[] { pProperty }, Expression.Assign(pProperty, xGetPropertyValue), propertyPredicate.Body), pObj);
return lambda.Compile();
}
Sample usage:
var items = new List<object> { new { A = 0, B = "Foo" }, new { A = 1, B = "Bar" }, new { A = 2, B = "FooBar" } };
var elementType = items[0].GetType();
Console.WriteLine("Items where A >= 1:");
foreach (var item in items.Where(GetPredicate<int>(elementType.GetProperty("A"), a => a >= 1)))
Console.WriteLine(item);
Console.WriteLine();
Console.WriteLine("Items where B starts with \"Foo\":");
foreach (var item in items.Where(GetPredicate<string>(elementType.GetProperty("B"), b => b.StartsWith("Foo"))))
Console.WriteLine(item);
Output:
Items where A >= 1:
{ A = 1, B = Bar }
{ A = 2, B = FooBar }
Items where B starts with "Foo":
{ A = 0, B = Foo }
{ A = 2, B = FooBar }
