Linq Union - IEqualityComparer and # of executions - linq

Out of interest, how does the GetHashCode of a concrete implementation of IEqualityComparer work?
The reason I ask is that I'm using LINQ to union two collections, and when only the left collection has an item, GetHashCode is called twice. Further to that, it's called four times if both collections have one item.
This is rough typing, but you'll get the point. GetHashCode is called twice, which I'm guessing is twice for the one item in listOne?
e.g.
var listOne = new List<SearchResult>{new SearchResult{Name="Blah"}};
var listTwo = new List<SearchResult>();
listOne.Union(listTwo, new SearchResultComparer());
public class SearchResultComparer : IEqualityComparer<SearchResult>
{
    public bool Equals(SearchResult x, SearchResult y){....}
    public int GetHashCode(SearchResult obj)
    {
        unchecked
        {
            int result = 0;
            // The rest of this line was cut off in the original post; the completion below is assumed.
            result = (result * 397) ^ (obj.Name != null ? obj.Name.GetHashCode() : 0);
            return result;
        }
    }
}
Thanks

I'm curious about your observation: I can only observe a single call to GetHashCode for each of the items in each list. But as for how Union uses your comparer, think of it like this:
static IEnumerable<T> Union<T>(this IEnumerable<T> first, IEnumerable<T> second, IEqualityComparer<T> comparer)
{
    // there's undoubtedly validation against null sequences
    var unionSet = new HashSet<T>(comparer);
    foreach (T item in first)
    {
        if (unionSet.Add(item))
            yield return item;
    }
    foreach (T item in second)
    {
        if (unionSet.Add(item))
            yield return item;
    }
}
The Add method of the HashSet returns true if the item was added and false if an equal item was already present. Internally, it calls the comparer's GetHashCode for the item and checks whether that hash code already exists in the set. If it does, it calls the comparer's Equals against each stored item with a matching hash code. If no stored item is equal (or the hash code did not already exist), the item is added and the method returns true. Otherwise, the item is not added and the method returns false.
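One way to check the call count yourself is a comparer that counts its own GetHashCode calls. The sketch below is only an illustration: CountingComparer, HashCalls and UnionDemo are made-up names, and the SearchResult class is a stand-in for the one in the question. It also shows one possible (not confirmed) source of a doubled count: Union is lazy, so every enumeration of the query rebuilds the set and hashes the items again.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the question's class (assumption: only Name matters).
public class SearchResult
{
    public string Name { get; set; }
}

// Hypothetical comparer that counts how often GetHashCode is invoked.
public class CountingComparer : IEqualityComparer<SearchResult>
{
    public int HashCalls { get; private set; }

    public bool Equals(SearchResult x, SearchResult y)
    {
        return string.Equals(x?.Name, y?.Name);
    }

    public int GetHashCode(SearchResult obj)
    {
        HashCalls++;
        return obj.Name != null ? obj.Name.GetHashCode() : 0;
    }
}

public static class UnionDemo
{
    // Returns the counts seen before enumerating, after one enumeration, and after two.
    public static int[] CountHashCalls()
    {
        var comparer = new CountingComparer();
        var listOne = new List<SearchResult> { new SearchResult { Name = "Blah" } };
        var listTwo = new List<SearchResult>();

        var union = listOne.Union(listTwo, comparer); // nothing is hashed yet
        int before = comparer.HashCalls;

        union.ToList();                               // first enumeration
        int afterFirst = comparer.HashCalls;

        union.ToList();                               // second enumeration repeats the work
        int afterSecond = comparer.HashCalls;

        return new[] { before, afterFirst, afterSecond };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", CountHashCalls()));
    }
}
```

If the count doubles between the second and third number, something in your code (a Count() followed by a foreach, or a debugger watch window, say) may be enumerating the query more than once. Calling ToList() once and reusing the list avoids that.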

Related

Can Linq project a List of int to another list of class T

I have a list of int and I wonder if I can create another list of object T based on the previous list using Linq.
To simplify the issue:
I have a list of int like that: 1,2,3,4
and I expect to have (1,2), (2,4), (3,6), (4,8)
Normally, we can do that easily without Linq
public class T
{
    int first;
    int Second;
    public T(int x, int y)
    {
        first = x;
        Second = y;
    }
}
class Program
{
    static void Main(string[] args)
    {
        List<int> series = new List<int>() { 1,2,3,4 };
        List<T> obj = new List<T>();
        foreach (int item in series)
        {
            obj.Add(new T(item,item*2));
        }
    }
}
That worked perfectly.
But when I tried to use Linq
List<T> obj = series.Select(x=> {new T(x,x*2)}).ToList<T>();
I thought it would work but I got an error saying
Error 2 The type arguments for method 'System.Linq.Enumerable.Select<TSource,TResult>(System.Collections.Generic.IEnumerable<TSource>, System.Func<TSource,int,TResult>)' cannot be inferred from the usage. Try specifying the type arguments explicitly.
What have I done incorrectly? I am still a newbie (only a few months in) learning Linq :)
Get rid of the curly braces...
List<T> obj = series.Select(x=> new T(x,x*2)).ToList<T>();
When you write a lambda like:
x => x * 2
It's assumed that the right side is the return value. When you use curly braces, it's expecting you to actually use the return keyword, like this:
x => { return x * 2; }
When you don't, the block returns nothing, so the compiler has no result type to infer for Select.
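To make the two lambda forms concrete, here is a small sketch. Pair and LambdaDemo are hypothetical names used only for this illustration (Pair plays the role of the question's class T). Both methods produce the same list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's class T.
public class Pair
{
    public int First { get; }
    public int Second { get; }
    public Pair(int first, int second) { First = first; Second = second; }
}

public static class LambdaDemo
{
    public static List<Pair> WithExpressionBody(List<int> series)
    {
        // Expression lambda: the value of the expression is the result.
        return series.Select(x => new Pair(x, x * 2)).ToList();
    }

    public static List<Pair> WithStatementBody(List<int> series)
    {
        // Statement lambda: braces need an explicit return and a semicolon.
        return series.Select(x => { return new Pair(x, x * 2); }).ToList();
    }

    public static void Main()
    {
        var series = new List<int> { 1, 2, 3, 4 };
        foreach (var p in WithStatementBody(series))
            Console.WriteLine("(" + p.First + "," + p.Second + ")");
    }
}
```

The expression form is usually the one to reach for in a Select; the statement form is mainly useful when the body needs more than one statement.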

C# Extension method to filter list

I found the following extension method to filter a list. I am very new to this, so I wanted to check whether someone can help me. This method compares exact values, but I want to use Contains instead of an exact comparison. Any thoughts?
public static IEnumerable<T> FilterByProperty<T>(this IEnumerable<T> source,string property,object value)
{
    var propertyInfo = typeof(T).GetProperty(property);
    return source.Where(p => propertyInfo.GetValue(p, null) == value);
}
If you want to change the equality check (==) into Contains, try this:
public static IEnumerable<T> FilterByProperty<T>(this IEnumerable<T> source,string property,object value)
{
    var propertyInfo = typeof(T).GetProperty(property);
    var text = value.ToString(); // string.Contains needs a string, not an object
    return source.Where(p => propertyInfo
        .GetValue(p, null)
        ?.ToString() // the property value might be null; ?. skips the call instead of throwing
        ?.Contains(text) == true);
}
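For a quick way to try the Contains idea end to end, here is a self-contained sketch. The names FilterByPropertyContains, Person and FilterDemo are invented for this example, and the choice to skip elements whose property is null (rather than throw) is an assumption about what you want.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample type.
public class Person
{
    public string Name { get; set; }
}

public static class FilterExtensions
{
    // Hypothetical name; keeps elements whose property, rendered as text, contains `value`.
    public static IEnumerable<T> FilterByPropertyContains<T>(
        this IEnumerable<T> source, string property, string value)
    {
        var propertyInfo = typeof(T).GetProperty(property);
        if (propertyInfo == null)
            throw new ArgumentException("No public property named " + property);

        return source.Where(p =>
        {
            object raw = propertyInfo.GetValue(p, null);
            // Assumption: a null property value simply does not match.
            return raw != null && raw.ToString().Contains(value);
        });
    }
}

public static class FilterDemo
{
    public static List<string> Run()
    {
        var people = new List<Person>
        {
            new Person { Name = "Alice" },
            new Person { Name = "Bob" },
            new Person { Name = null }
        };
        return people.FilterByPropertyContains("Name", "li")
                     .Select(p => p.Name)
                     .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run())); // prints "Alice"
    }
}
```

Note that string.Contains is case-sensitive, so "LI" would not match "Alice" here.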

Can I get a Func object for an Extension method

I have a small utility extension method that performs some null checks on some LINQ extension methods in IEnumerable<T>. The code looks like this
public static class MyIEnumerableExtensions
{
    // Generic wrapper to allow graceful handling of null values
    public static IEnumerable<T> NullableExtension<T>(this IEnumerable<T> first, IEnumerable<T> second, Func<IEnumerable<T>, IEnumerable<T>, IEnumerable<T>> f)
    {
        if (first == null && second == null) return Enumerable.Empty<T>();
        if (first == null) return second;
        if (second == null) return first;
        return f(first, second);
    }
    // Wrap the Intersect extension method in our Nullable wrapper
    public static IEnumerable<T> NullableIntersect<T>(this IEnumerable<T> first, IEnumerable<T> second)
    {
        // It'd be nice to write this as
        //
        // return first.NullableExtension<T>(second, IEnumerable<T>.Intersect );
        //
        return first.NullableExtension<T>(second, (a,b) => a.Intersect(b));
    }
}
So, is there a way to pass in the IEnumerable<T>.Intersect extension method to NullableExtension directly rather than wrapping it in a lambda?
Edit
Because it is actually concise to pass in the Enumerable extension method, I removed the NullableIntersect (and other) methods and just call the nullable wrapper directly.
Also, as Anthony points out, what a null sequence should mean differs between extension methods, e.g. Union versus Intersect. As such, I renamed the NullableExtension method to IgnoreIfNull, which better reflects its behavior, and added an EmptyIfNull variant.
public static class MyIEnumerableExtensions
{
    // Generic wrappers to allow graceful handling of null values
    public static IEnumerable<T> IgnoreIfNull<T>(this IEnumerable<T> first, IEnumerable<T> second, Func<IEnumerable<T>, IEnumerable<T>, IEnumerable<T>> f)
    {
        if (first == null && second == null) return Enumerable.Empty<T>();
        if (first == null) return second;
        if (second == null) return first;
        return f(first, second);
    }
    public static IEnumerable<T> EmptyIfNull<T>(this IEnumerable<T> first, IEnumerable<T> second, Func<IEnumerable<T>, IEnumerable<T>, IEnumerable<T>> f)
    {
        return f(first ?? Enumerable.Empty<T>(), second ?? Enumerable.Empty<T>());
    }
}
// Usage example. Returns { 1, 4 } because arguments to Union and Intersect are ignored
var items = new List<int> { 1, 4 };
var output1 = items.IgnoreIfNull(null, Enumerable.Union).IgnoreIfNull(null, Enumerable.Intersect);
// Usage example. Returns { } because arguments to Union and Intersect are set to empty
var output2 = items.EmptyIfNull(null, Enumerable.Union).EmptyIfNull(null, Enumerable.Intersect);
Intersect is defined within static class Enumerable. You can pass it into your method like below
return first.NullableExtension<T>(second, Enumerable.Intersect);
Note: You might be concerned about the behavior of your logic in the case of a null sequence. For example, in the case of
List<int> first = null;
var second = new List<int> { 1, 4 };
var output = first.NullableIntersect(second).ToList();
You have defined it such that output contains {1, 4} (the elements of second). I might expect first to instead be treated as an empty sequence, and an intersection with second would result in an empty sequence. Ultimately, that's for you to decide the behavior you desire.
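The difference between the two null policies can be seen side by side in a short sketch. It reuses the IgnoreIfNull and EmptyIfNull shapes from the question's edit; the class names NullHandlingExtensions and NullHandlingDemo are made up for this example. Enumerable.Intersect is passed as a method group, with no lambda wrapper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullHandlingExtensions
{
    // A null side is ignored: the other side is returned untouched.
    public static IEnumerable<T> IgnoreIfNull<T>(this IEnumerable<T> first, IEnumerable<T> second,
        Func<IEnumerable<T>, IEnumerable<T>, IEnumerable<T>> f)
    {
        if (first == null && second == null) return Enumerable.Empty<T>();
        if (first == null) return second;
        if (second == null) return first;
        return f(first, second);
    }

    // A null side is treated as an empty sequence before f is applied.
    public static IEnumerable<T> EmptyIfNull<T>(this IEnumerable<T> first, IEnumerable<T> second,
        Func<IEnumerable<T>, IEnumerable<T>, IEnumerable<T>> f)
    {
        return f(first ?? Enumerable.Empty<T>(), second ?? Enumerable.Empty<T>());
    }
}

public static class NullHandlingDemo
{
    public static List<int> IntersectIgnoringNull()
    {
        List<int> first = null;
        var second = new List<int> { 1, 4 };
        return first.IgnoreIfNull(second, Enumerable.Intersect).ToList();
    }

    public static List<int> IntersectTreatingNullAsEmpty()
    {
        List<int> first = null;
        var second = new List<int> { 1, 4 };
        return first.EmptyIfNull(second, Enumerable.Intersect).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(IntersectIgnoringNull().Count);        // 2 elements: 1 and 4
        Console.WriteLine(IntersectTreatingNullAsEmpty().Count); // 0 elements
    }
}
```

Which of the two results is "right" depends on what a null sequence means in your domain, so it may be worth keeping both wrappers and choosing per call site.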

Building a lambda WHERE expression to pass into a method

I need to pass in a "where" lambda expression that'll be used in a LINQ query inside a method. The problem is, I don't know what the where value will be compared against until I get into the method.
Now to explain further and clarify some of what I said above I'll come up with a bit of a contrived example.
Imagine I have a List<Products> and I need to narrow that list down to a single record using a productId property of the Products object. Normally I would do this:
var product = productList.Where(p=>p.productId == 123).FirstOrDefault();
Now take it a step further - I need to put the above logic into a method that isn't limited to a List<Products> but is instead a List<T> so ideally, I'd be calling it like this (and I know the below won't work, it's simply here to show what I am trying to achieve):
myMethod(productList, p => p.productId == X)
With the caveat being that X isn't known until I'm inside the method.
Finally, for what it's worth, I need to point out that my collection of data is an OData DataServiceQuery.
So, to re-summarize my question: I need to know how to construct a lambda "where" expression that I can pass into a method and how to use it against a collection of objects in a LINQ query.
You can emulate myMethod(productList, p => p.productId == X) with this trick:
static void myMethod<T>(List<T> list, Func<T,bool> predicate, ref int x)
{
    x = 5;
    var v = list.Where(predicate);
    foreach (var i in v)
        Console.Write(i);
    Console.ReadLine();
}
static void Main(string[] args)
{
    List<int> x = new List<int> { 1, 2, 3, 4, 5 };
    int z = 0;
    myMethod(x, p => p == z, ref z);
}
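This works because the lambda captures the variable z itself, not its value at the time of the call, so the predicate sees the 5 assigned inside the method. I'm not sure it solves your whole problem, though.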
For one, if you are going to query an IEnumerable<T>, you will need to ensure that your comparison will work in the first place. In that case you can make your objects implement an interface that guarantees that they will support the comparison.
Once you do that, your method can have a generic constraint that limits the input to those interfaces. At that point, your method can take a Func, which can be passed to the LINQ Where clause:
public interface Identifier
{
    int Id { get; set; }
}
public class Product : Identifier
{
    public int Id { get; set; }
    //Other stuff
}
public T GetMatch<T>(IEnumerable<T> collection, Func<T, int, bool> predicate) where T : Identifier
{
    int comparison = 5;
    return collection.Where(item => predicate(item, comparison)).FirstOrDefault();
}
Which can be invoked like:
var match = GetMatch<Identifier>(collection, (x, y) => x.Id == y);
UPDATE:
I modified the above code to take in a comparison parameter
You could try the PredicateBuilder class from the free LinqKit library (tutorial).
You can then construct a predicate using
Expression<Func<T, bool>> predicate = PredicateBuilder.True<T>();
predicate = PredicateBuilder.And(predicate, p=> p.product_id == X);
where X is the value to compare product_id against.
You can use this predicate in a where clause such as .Where(predicate) and return an IQueryable or return the predicate itself which would be of type Expression<Func<T, bool>>
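If you would rather not take a library dependency, the same kind of Expression<Func<T, bool>> can be built by hand with System.Linq.Expressions. The sketch below is a minimal illustration, not LinqKit's implementation: PropertyEquals, FindById, PredicateDemo and the Product class are hypothetical names, and it runs against an in-memory IQueryable. Whether a particular OData provider can translate the resulting expression is something you would need to verify.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical sample type.
public class Product
{
    public int ProductId { get; set; }
    public string Name { get; set; }
}

public static class PredicateDemo
{
    // Builds the expression tree for: p => p.<propertyName> == value
    public static Expression<Func<T, bool>> PropertyEquals<T>(string propertyName, object value)
    {
        ParameterExpression p = Expression.Parameter(typeof(T), "p");
        MemberExpression member = Expression.Property(p, propertyName);
        ConstantExpression constant = Expression.Constant(value, member.Type);
        return Expression.Lambda<Func<T, bool>>(Expression.Equal(member, constant), p);
    }

    // The comparison value is only decided inside the method, as in the question.
    public static T FindById<T>(IQueryable<T> source, string idProperty)
    {
        int x = 2; // stand-in for a value that is computed inside the method
        return source.Where(PropertyEquals<T>(idProperty, x)).FirstOrDefault();
    }

    public static IQueryable<Product> SampleProducts()
    {
        return new List<Product>
        {
            new Product { ProductId = 1, Name = "Widget" },
            new Product { ProductId = 2, Name = "Gadget" }
        }.AsQueryable();
    }

    public static void Main()
    {
        Product match = FindById(SampleProducts(), "ProductId");
        Console.WriteLine(match.Name); // prints "Gadget"
    }
}
```

Passing the property name as a string trades compile-time checking for flexibility; a property selector expression would be a safer alternative if the caller knows the type.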

LINQ recursion function?

Let's take this n-tier deep structure for example:
public class SomeItem
{
    public Guid ID { get;set; }
    public string Name { get; set; }
    public bool HasChildren { get;set; }
    public IEnumerable<SomeItem> Children { get; set; }
}
If I am looking to get a particular Item by ID (anywhere in the structure) is there some LINQ goodness I can use to easily get it in a single statement or do I have to use some recursive function as below:
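private SomeItem GetSomeItem(IEnumerable<SomeItem> items, Guid ID)
{
    foreach (var item in items)
    {
        if (item.ID == ID)
        {
            return item;
        }
        else if (item.HasChildren)
        {
            // Only return when the subtree contains a match; otherwise keep checking siblings
            var found = GetSomeItem(item.Children, ID);
            if (found != null)
            {
                return found;
            }
        }
    }
    return null;
}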
LINQ doesn't really "do" recursion nicely. Your solution seems appropriate - although I'm not sure HasChildren is really required... why not just use an empty list for an item with no children?
An alternative is to write a DescendantsAndSelf method which will return all of the descendants (including the item itself), something like this;
// Warning: potentially expensive!
public IEnumerable<SomeItem> DescendantsAndSelf()
{
    yield return this;
    foreach (var item in Children.SelectMany(x => x.DescendantsAndSelf()))
    {
        yield return item;
    }
}
However, if the tree is deep that ends up being very inefficient because each item needs to "pass through" all the iterators of its ancestry. Wes Dyer has blogged about this, showing a more efficient implementation.
Anyway, if you have a method like this (however it's implemented) you can just use a normal "where" clause to find an item (or First/FirstOrDefault etc).
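As a rough illustration of that last point, here is a self-contained sketch that combines DescendantsAndSelf with FirstOrDefault. It is a simplified take on SomeItem (Children is a List initialized to empty and HasChildren is dropped, both of which are assumptions), and TreeSearchDemo is a made-up name. The target sits in the second branch, so the search has to get past a sibling subtree that does not contain it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified version of the question's class (assumption: empty list means no children).
public class SomeItem
{
    public Guid ID { get; set; }
    public string Name { get; set; }
    public List<SomeItem> Children { get; set; } = new List<SomeItem>();

    public IEnumerable<SomeItem> DescendantsAndSelf()
    {
        yield return this;
        foreach (var item in Children.SelectMany(x => x.DescendantsAndSelf()))
            yield return item;
    }
}

public static class TreeSearchDemo
{
    // Builds root -> (A -> A1), (B -> B1) and gives B1 the requested ID.
    public static SomeItem BuildTree(Guid targetId)
    {
        var a = new SomeItem { ID = Guid.NewGuid(), Name = "A" };
        a.Children.Add(new SomeItem { ID = Guid.NewGuid(), Name = "A1" });

        var b = new SomeItem { ID = Guid.NewGuid(), Name = "B" };
        b.Children.Add(new SomeItem { ID = targetId, Name = "B1" });

        var root = new SomeItem { ID = Guid.NewGuid(), Name = "root" };
        root.Children.Add(a);
        root.Children.Add(b);
        return root;
    }

    public static void Main()
    {
        Guid targetId = Guid.NewGuid();
        SomeItem root = BuildTree(targetId);

        // A normal LINQ query over the flattened tree.
        SomeItem match = root.DescendantsAndSelf().FirstOrDefault(i => i.ID == targetId);
        Console.WriteLine(match != null ? match.Name : "not found"); // prints "B1"
    }
}
```

Because the iterator is lazy, FirstOrDefault stops walking the tree as soon as the match is found.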
Here's one without recursion. This avoids the cost of passing through several layers of iterators, so I think it's about as efficient as they come.
public static IEnumerable<T> IterateTree<T>(this T root, Func<T, IEnumerable<T>> childrenF)
{
    var q = new List<T>() { root };
    while (q.Any())
    {
        var c = q[0];
        q.RemoveAt(0);
        q.AddRange(childrenF(c) ?? Enumerable.Empty<T>());
        yield return c;
    }
}
Invoke like so:
var subtree = root.IterateTree(x => x.Children).ToList();
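The same non-recursive idea can be sketched with a Queue<T> in place of the List<T>, which avoids shifting the remaining elements on every removal from the front. This is a variant, not the code above: BreadthFirst, Node and BreadthFirstDemo are hypothetical names. The demo shows the breadth-first order in which nodes come out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample node type.
public class Node
{
    public string Name { get; set; }
    public List<Node> Children { get; set; } = new List<Node>();
}

public static class TreeExtensions
{
    // Queue-based variant of the iterative traversal; visits nodes level by level.
    public static IEnumerable<T> BreadthFirst<T>(this T root, Func<T, IEnumerable<T>> childrenF)
    {
        var queue = new Queue<T>();
        queue.Enqueue(root);
        while (queue.Count > 0)
        {
            T current = queue.Dequeue();
            yield return current;
            foreach (T child in childrenF(current) ?? Enumerable.Empty<T>())
                queue.Enqueue(child);
        }
    }
}

public static class BreadthFirstDemo
{
    public static List<string> VisitOrder()
    {
        var a = new Node { Name = "A" };
        a.Children.Add(new Node { Name = "A1" });
        var b = new Node { Name = "B" };
        b.Children.Add(new Node { Name = "B1" });
        var root = new Node { Name = "root" };
        root.Children.Add(a);
        root.Children.Add(b);

        return root.BreadthFirst(n => n.Children).Select(n => n.Name).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", VisitOrder())); // root, A, B, A1, B1
    }
}
```

Swapping the Queue<T> for a Stack<T> would give a depth-first order instead, if that suits the search better.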
hope this helps
public static IEnumerable<Control> GetAllChildControls(this Control parent)
{
    foreach (Control child in parent.Controls)
    {
        yield return child;
        if (child.HasChildren)
        {
            foreach (Control grandChild in child.GetAllChildControls())
                yield return grandChild;
        }
    }
}
It is important to remember you don't need to do everything with LINQ, or default to recursion. There are interesting options when you use data structures. The following is a simple flattening function in case anyone is interested.
public static IEnumerable<SomeItem> Flatten(IEnumerable<SomeItem> items)
{
    if (items == null || items.Count() == 0) return new List<SomeItem>();
    var result = new List<SomeItem>();
    var q = new Queue<SomeItem>(collection: items);
    while (q.Count > 0)
    {
        var item = q.Dequeue();
        result.Add(item);
        if (item?.Children?.Count() > 0)
            foreach (var child in item.Children)
                q.Enqueue(child);
    }
    return result;
}
While there are extension methods that enable recursion in LINQ (and probably look like your function), none are provided out of the box.
Examples of these extension methods can be found here or here.
I'd say your function is fine.

Resources