Distinct/GroupBy in WhenAll result Async - linq

I am writing a method that uses async programming:
var tasks = new List<Task<List<SomeClass>>>();
tasks.Add(this.Method1());
tasks.Add(this.Method2());
var results = await Task.WhenAll(tasks).ConfigureAwait(false);
I want distinct records from this result. How can I achieve that?
Currently I have written:
return results.SelectMany(s => s).GroupBy(x => x.Name).Select(x => x.FirstOrDefault()).ToList();
But I am not sure about SelectMany. Will this give the correct result?

SelectMany(s => s) is a "flatten" operation. It takes a sequence of sequences and flattens them to a single sequence.
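To make the flattening concrete, here is a minimal sketch. The class name FlattenSketch and the string-returning stand-ins for Method1 and Method2 are assumptions for illustration, not the asker's actual code.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class FlattenSketch
{
    // Hypothetical stand-ins for Method1 and Method2 from the question.
    private static Task<List<string>> Method1() => Task.FromResult(new List<string> { "a", "b" });
    private static Task<List<string>> Method2() => Task.FromResult(new List<string> { "b", "c" });

    public static async Task<List<string>> GetAllAsync()
    {
        // WhenAll produces one List<string> per task, in task order: [["a","b"], ["b","c"]].
        List<string>[] results = await Task.WhenAll(Method1(), Method2()).ConfigureAwait(false);

        // SelectMany concatenates the inner lists into a single sequence: ["a","b","b","c"].
        return results.SelectMany(list => list).ToList();
    }
}
```

Note that duplicates ("b" here) survive the flattening; removing them is a separate step.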
The LINQ "distinct" operator is called Distinct. If your SomeClass overrides equality to be based on Name, then that's all you need:
return results.SelectMany(s => s).Distinct().ToList();
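For reference, overriding equality so that it is based on Name could look roughly like the sketch below. The shape of SomeClass (a single Name property) and the choice of ordinal string comparison are assumptions; the question does not show the real class.

```csharp
using System;

public sealed class SomeClass : IEquatable<SomeClass>
{
    public string Name { get; set; }

    // Two instances are equal when their names match (ordinal comparison assumed).
    public bool Equals(SomeClass other) =>
        !(other is null) && string.Equals(Name, other.Name, StringComparison.Ordinal);

    public override bool Equals(object obj) => Equals(obj as SomeClass);

    // Must agree with Equals: equal names produce equal hash codes.
    public override int GetHashCode() =>
        Name == null ? 0 : StringComparer.Ordinal.GetHashCode(Name);
}
```

Distinct() uses the default equality comparer, which picks up these overrides.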
But if SomeClass doesn't define equality that way, you'll need to do another kind of distinct.
One option is to use the Distinct overload that takes an equality comparer. Then you can pass in an equality comparer that determines equality by Name. To do this, first define an equality comparer:
public sealed class NameEqualityComparer: IEqualityComparer<SomeClass>
{
public int GetHashCode(SomeClass obj) => EqualityComparer<string>.Default.GetHashCode(obj.Name);
public bool Equals(SomeClass x, SomeClass y) => EqualityComparer<string>.Default.Equals(x.Name, y.Name);
}
and then you can invoke the correct overload:
return results.SelectMany(s => s).Distinct(new NameEqualityComparer()).ToList();
I have a library that helps define custom comparers (properly handling all edge cases), which I prefer to use for things like this. With the Nito.Comparers library, you don't need to define a custom NameEqualityComparer; instead, you can define comparers in-line like this:
return results.SelectMany(s => s).Distinct(b => b.EquateBy(x => x.Name)).ToList();
or separately, if desired:
var comparer = EqualityComparerBuilder.For<SomeClass>().EquateBy(x => x.Name);
return results.SelectMany(s => s).Distinct(comparer).ToList();
A completely different option is to add a new "Distinct-By" operator that acts the way you want. This is part of MoreLINQ or you can add it yourself:
public static IEnumerable<T> DistinctBy<T, TKey>(this IEnumerable<T> @this, Func<T, TKey> selector)
{
    // Tracks the keys seen so far; Add returns false for a key that is already present.
    var keys = new HashSet<TKey>();
    foreach (var item in @this)
    {
        if (keys.Add(selector(item)))
            yield return item;
    }
}
Then you can use the new operator like this:
return results.SelectMany(s => s).DistinctBy(x => x.Name).ToList();
All of these options are more efficient than grouping.
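On .NET 6 and later, a DistinctBy operator is also built into System.Linq, so no extra library or hand-written extension is needed there. The sketch below assumes that runtime and uses a tuple as a hypothetical stand-in for SomeClass.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class BuiltInDistinctBySketch
{
    // Requires .NET 6 or later, where Enumerable.DistinctBy is available.
    public static List<(string Name, int Value)> DistinctByName(
        IEnumerable<List<(string Name, int Value)>> results) =>
        results.SelectMany(list => list)
               .DistinctBy(item => item.Name)
               .ToList();
}
```

If a custom DistinctBy extension or MoreLINQ is also in scope, the call may become ambiguous, so it is probably best to pick one of them.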

Related

c# linq combine Expression<Func<T, bool>> predicate in where clause with another condition

Is it possible to construct a where clause like this, where predicate is of type Expression<Func<T, bool>> predicate:
var resultQuery = query.Where(q => !q.IsDeleted && predicate).ToList();
I would like to avoid double where clauses like this:
var resultQuery = query.Where(q => !q.IsDeleted).Where(predicate).ToList();
Instead, you could use custom extension methods and filter result using them:
public static class QueryExtensions
{
public static IQueryable<Image> NonDeleted(this IQueryable<Image> queryable)
{
return queryable.Where(x => !x.Deleted);
}
public static IQueryable<Image> LatestOnly(this IQueryable<Image> queryable)
{
return queryable.Where(x => x.CreateDate >= DateTime.UtcNow.AddDays(-7)); // created within the last 7 days
}
}
And then combine them in query:
var result = context
.Images
.NonDeleted()
.LatestOnly()
.ToList();
I like this approach because it's clean and easy to read. You can also have your entities implement interfaces and write extension methods against those interfaces, so that any entity implementing the interface can be filtered the same way. For example:
public interface ICreationDate{
DateTime CreateDate {get;}
}
public class Image: ICreationDate{
public DateTime CreateDate {get; private set;} = DateTime.UtcNow;
}
Then extension can be changed like this:
public static IQueryable<T> LatestOnly<T>(this IQueryable<T> queryable)
where T : ICreationDate
{
return queryable.Where(x => x.CreateDate >= DateTime.UtcNow.AddDays(-7)); // created within the last 7 days
}
This approach gives you more flexibility and reusability.
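As a rough, self-contained sketch of that reuse: the Document class, the CreatedSince method and its cutoff parameter are hypothetical names introduced here, and the property setters are public only to keep the example short. The extra class constraint is a commonly suggested precaution, since without it the compiler may emit a conversion node that some query providers reportedly cannot translate.

```csharp
using System;
using System.Linq;

public interface ICreationDate
{
    DateTime CreateDate { get; }
}

public class Image : ICreationDate
{
    public DateTime CreateDate { get; set; }
}

// Hypothetical second entity sharing the same interface.
public class Document : ICreationDate
{
    public DateTime CreateDate { get; set; }
}

public static class CreationDateExtensions
{
    // Keeps items created on or after the given cutoff; works for any entity implementing ICreationDate.
    public static IQueryable<T> CreatedSince<T>(this IQueryable<T> queryable, DateTime cutoff)
        where T : class, ICreationDate
    {
        return queryable.Where(x => x.CreateDate >= cutoff);
    }
}
```

The same call then works on both sets, for example images.CreatedSince(cutoff) and documents.CreatedSince(cutoff).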
I know this is quite far from your original question, but it may give you some alternative approaches.
That is not possible as written. Where expects a lambda expression, which is compiled to an expression tree and then translated into a SQL query. In your example
var resultQuery = query.Where(q => !q.IsDeleted && predicate).ToList();
you are using an expression object (predicate) where a boolean value is expected. A straightforward way to avoid the double Where clauses is to create a helper function that returns a single lambda expression containing both the IsDeleted check and the predicate logic, for example:
private System.Linq.Expressions.Expression<Func<T, bool>> filterPredicate(int n)
{
return q => !q.IsDeleted && q.Age > n;
}
Here we are assuming that
q.Age > n
is the logic of your predicate function. And then use the filter predicates like this:
var resultQuery = query.Where(filterPredicate(5)).ToList();
You can read more about lambda expressions and expression trees here: https://learn.microsoft.com/en-us/dotnet/api/system.linq.expressions.expression-1?view=net-5.0
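Another possibility, if the predicate already exists as a separate expression, is to merge two expression trees into one. The sketch below is one common way to do that with ExpressionVisitor; PredicateCombiner, ParameterReplacer and the AndAlso extension are hypothetical names, and whether the combined tree translates to SQL depends on the query provider.

```csharp
using System;
using System.Linq.Expressions;

public static class PredicateCombiner
{
    // Rewrites every occurrence of one parameter into another, so both bodies share a parameter.
    private sealed class ParameterReplacer : ExpressionVisitor
    {
        private readonly ParameterExpression _from;
        private readonly ParameterExpression _to;

        public ParameterReplacer(ParameterExpression from, ParameterExpression to)
        {
            _from = from;
            _to = to;
        }

        protected override Expression VisitParameter(ParameterExpression node) =>
            node == _from ? _to : base.VisitParameter(node);
    }

    // Builds a single expression equivalent to: x => left(x) && right(x).
    public static Expression<Func<T, bool>> AndAlso<T>(
        this Expression<Func<T, bool>> left,
        Expression<Func<T, bool>> right)
    {
        var parameter = left.Parameters[0];
        var rightBody = new ParameterReplacer(right.Parameters[0], parameter).Visit(right.Body);
        return Expression.Lambda<Func<T, bool>>(Expression.AndAlso(left.Body, rightBody), parameter);
    }
}
```

With this in place, the IsDeleted check can be declared once as an Expression<Func<T, bool>> variable (say, notDeleted) and the query becomes query.Where(notDeleted.AndAlso(predicate)).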

Linq query where there's a certain desired relationship between items in the result

A linq query Where clause can apply a func to an item in the original set and return a bool to include or not include the item based on the item's characteristics. Great stuff:
var q = myColl.Where(o => o.EffectiveDate == LastThursday);
But what if I want to find a set of items where each item is related to the last item in some way? Like:
var q = myColl.Where(o => o.EffectiveDate == thePreviousItem.ExpirationDate);
How do you make a Where (or other linq function) "jump out" of the current item?
Here's what I tried, trying to be clever. I made every item an array just so I can use the Aggregate function:
public IQueryable<T> CurrentVersions
{
get => AllVersions
.Select(vo => new T[] { vo })
.Aggregate((voa1, voa2) => voa1[0].BusinessExpirationDate.Value == voa2[0].BusinessEffectiveDate.Value ? voa1.Concat(voa2).ToArray() : voa1)
.SelectMany(vo => vo);
}
but that doesn't compile on the SelectMany:
The type arguments for method Enumerable.SelectMany<TSource,
TResult>(IEnumerable<TSource>, Func<TSource, IEnumerable<TResult>>)
cannot be inferred from the usage. Try specifying the type arguments
explicitly.
EDIT (SOLUTION)
As it turns out, I was on the right track, but was just confused about what SelectMany does. I didn't need it. I also needed to change IQueryable to IEnumerable because I'm using EF and you can't query after you let go of the DbContext. So, here is the actual solution.
public IEnumerable<T> CurrentVersions
{
get => AllVersions
.Select(vo => new T[] { vo })
.Aggregate((voa1, voa2) => voa1[0].BusinessExpirationDate.Value == voa2[0].BusinessEffectiveDate.Value ? voa1.Concat(voa2).ToArray() : voa1);
}
LINQ queries are most effective when each item is processed in isolation. They do not work well when you need to relate items within the same collection: with the standard LINQ operators you typically end up processing the collection more than once.
The MoreLINQ library helps provide additional operators to fill in some of those gaps. I'm not sure what operators it provides that could be used in this instance, but I know it has a Pairwise() method that combines the current and previous items in the iteration.
In general, for situations like this, if you need to roll your own, it is far easier to write it as a generator (an iterator method) that produces your sequence. Either as a general-purpose extension method:
public static IEnumerable<TSource> WhereWithPrevious<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, TSource, bool> predicate)
{
    using (var iter = source.GetEnumerator())
    {
        if (!iter.MoveNext())
            yield break;
        var previous = iter.Current;
        while (iter.MoveNext())
        {
            var current = iter.Current;
            if (predicate(current, previous))
                yield return current;
            // Advance so the next item is compared with its immediate predecessor.
            previous = current;
        }
    }
}
or one specifically for the problem you're trying to solve.
public static IEnumerable<MyType> GetVersions(IEnumerable<MyType> source)
{
    using (var iter = source.GetEnumerator())
    {
        if (!iter.MoveNext())
            yield break;
        var previous = iter.Current;
        while (iter.MoveNext())
        {
            var current = iter.Current;
            if (current.EffectiveDate == previous.ExpirationDate)
                yield return current;
            // Advance so the next item is compared with its immediate predecessor.
            previous = current;
        }
    }
}
An alternative approach, which is standard practice in some other languages but terribly inefficient here, is to zip the collection with itself offset by one:
var query = Collection.Skip(1).Zip(Collection, (c, p) => (current:c,previous:p))
.Where(x => x.current.EffectiveDate == x.previous.ExpirationDate)
...;
And with all of that said, using any of these options will most likely make your query incompatible with query providers. It's not something you would want expressed as a single query anyway.
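To show what the zip-based comparison produces on an in-memory list, here is a small worked sketch. VersionRow, its integer day fields and the method name are assumptions made to keep the example short; records require C# 9 or later.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical version record; days are plain integers instead of dates.
public sealed record VersionRow(string Label, int EffectiveDay, int ExpirationDay);

public static class ChainedVersions
{
    // Returns the labels of versions that start exactly when the preceding version expires.
    public static List<string> LabelsStartingWherePreviousEnded(IReadOnlyList<VersionRow> versions) =>
        versions.Skip(1)
                .Zip(versions, (current, previous) => (current, previous))
                .Where(pair => pair.current.EffectiveDay == pair.previous.ExpirationDay)
                .Select(pair => pair.current.Label)
                .ToList();
}
```

For the rows A(1, 5), B(5, 9), C(10, 12), D(12, 15), the result is B and D: C is dropped because it starts on day 10 while B expires on day 9. As with the iterator versions, the first item is never returned because it has no predecessor.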

Can I convert a Func<T, bool> to a Func<U, bool> where T and U are POCO classes where I can map properties of one to the other? If so, how?

I have a scenario where a method takes a predicate of type Func<T, bool>, because T is the type that is exposed to the outside world. When actually using that predicate, however, the method needs to call another method that takes a Func<U, bool>, where properties of T are mapped to properties of U.
A more concrete example would be:
public IEnumerable<ClientEntity> Search(Func<ClientEntity, bool> predicate)
{
IList<ClientEntity> result = new List<ClientEntity>();
// Somehow translate predicate into Func<Client, bool> which I will call realPredicate.
_dataFacade.Clients.Where(realPredicate).ToList().ForEach(c => result.Add(new ClientEntity() { Id = c.Id, Name = c.Name }));
return result.AsEnumerable();
}
Would that be possible?
Please note that ClientEntity is a POCO class that I defined myself while Client is an Entity Framework class created by the model (DB first).
Thanks!
I once attempted this. It resulted in a not-too-bad working expression tree rewriter for expression trees that consist of the simpler operations (equals, greater-than, less-than, etc.).
It can be found here.
You can use it as:
Expression<Func<Poco1, bool>> where1 = p => p.Name == "fred";
Expression<Func<Poco2, bool>> where2 = ExpressionRewriter.CastParam<Poco1, Poco2>(where1);
EF doesn't use lambdas - it uses Expression Trees
Func<T, bool> lambda = ( o => o.Name == "fred" );
Expression<Func<T, bool>> expressionTree = ( o => o.Name == "fred" );
Expression Trees are in-memory object graphs that represent a given expression.
As they are just objects, you can create or modify them.
Here's another link: MSDN: How to: Modify Expression Trees
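A small sketch of what "just objects" means in practice: the same lambda text can be inspected as a tree and then compiled back into a delegate. The class and method names here are hypothetical.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionTreeSketch
{
    public static string Describe()
    {
        // The compiler builds an object graph for this lambda instead of executable code.
        Expression<Func<string, bool>> tree = name => name == "fred";

        // The body of the lambda is a binary node whose NodeType is Equal.
        var body = (BinaryExpression)tree.Body;

        // Compile() turns the tree into an ordinary delegate that can be invoked.
        Func<string, bool> compiled = tree.Compile();

        return $"{body.NodeType}: {compiled("fred")}, {compiled("wilma")}";
    }
}
```

Calling ExpressionTreeSketch.Describe() returns "Equal: True, False".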
What I ended up doing did not require the use of Expression Trees:
public IEnumerable<ClientEntity> Search(Func<ClientEntity, bool> predicate)
{
IList<ClientEntity> result = new List<ClientEntity>();
Func<Client, bool> realPredicate = (c => predicate(ConvertFromClient(c)));
_dataFacade.Clients.Where(realPredicate).ToList().ForEach(c => result.Add(ConvertFromClient(c)));
return result.AsEnumerable();
}
private static ClientEntity ConvertFromClient(Client client)
{
ClientEntity result = new ClientEntity();
if (client != null)
{
// I actually used AutoMapper from http://automapper.org/ here instead of assigning every property.
result.Id = client.Id;
result.Name = client.Name;
}
return result;
}
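One thing worth keeping in mind with a delegate-based predicate such as realPredicate: passing a Func to Where on an IQueryable selects the Enumerable overload, so the filter runs in memory rather than being handed to the query provider. The sketch below illustrates the overload difference with plain integers; the names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class OverloadSketch
{
    public static (List<int> inMemory, List<int> viaProvider) Compare(IQueryable<int> source)
    {
        Func<int, bool> asDelegate = n => n > 1;
        Expression<Func<int, bool>> asExpression = n => n > 1;

        // Binds to Enumerable.Where: the static result type is IEnumerable<int>,
        // so the source is enumerated and filtered in memory.
        IEnumerable<int> filteredInMemory = source.Where(asDelegate);

        // Binds to Queryable.Where: the result is still IQueryable<int>,
        // so a query provider gets the chance to translate the filter.
        IQueryable<int> filteredByProvider = source.Where(asExpression);

        return (filteredInMemory.ToList(), filteredByProvider.ToList());
    }
}
```

Both calls return the same items; the difference is where the filtering happens, which matters when the source is a large database table.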

How to use Func with IQueryable that returns IOrderedQueryable

I'm doing some research on EF and came across a function that accepts
Func<IQueryable<Student>, IOrderedQueryable<Student>>
and I'm wondering how to call a function that accepts that kind of parameter.
Imagine the function is something like this, and that the Student class has an Id property.
public static class Helper {
public static void Test(Func<IQueryable<Student>, IOrderedQueryable<Student>> param)
{
var test = 0;
}
}
then you could use it this way
var student = new List<Student>().AsQueryable(); // nonsense, just for the example
Helper.Test(m => student.OrderBy(x => x.Id));
m => student.OrderBy(x => x.Id) is a
Func<IQueryable<Student>, IOrderedQueryable<Student>>
(it takes an IQueryable<Student> as its parameter and returns an IOrderedQueryable<Student>)
or just
Helper.Test(m => m.OrderBy(x => x.Id));
In fact this doesn't make much sense without a "real" function...
Define a method:
public IOrderedQueryable<Student> OrderingMethod(IQueryable<Student> query)
{
return query.OrderBy(student => student.Name);
}
Now this assignment is legal:
Func<IQueryable<Student>, IOrderedQueryable<Student>> orderingFunc = this.OrderingMethod;
And now that you have it in a variable, it's easy to pass it to the method.
You could also do it all inline:
Func<IQueryable<Student>, IOrderedQueryable<Student>> orderingFunc =
query => query.OrderBy(student => student.Name);
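To show why a method might accept such a parameter, here is a minimal sketch in which the caller supplies the ordering and the method applies it. The Student shape, the StudentQueries class and the GetStudents method are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Student
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class StudentQueries
{
    // The caller decides the ordering; passing null leaves the source order untouched.
    public static List<Student> GetStudents(
        IQueryable<Student> source,
        Func<IQueryable<Student>, IOrderedQueryable<Student>> orderBy = null)
    {
        IQueryable<Student> query = orderBy != null ? orderBy(source) : source;
        return query.ToList();
    }
}
```

A call could then look like StudentQueries.GetStudents(students, q => q.OrderBy(s => s.Name).ThenBy(s => s.Id)), where students is any IQueryable<Student>.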

Call method on LINQ query results succinctly

I'd like to call MyMethod on each object from a LINQ Query, what is the best way to do this?
Currently I am using:
.Select(x => { x.MyMethod (); return x; }).ToArray();
ToArray() is needed for immediate execution.
Is there a simpler way to write this (without a foreach)?
Thanks
You could define your own reusable extension method that runs an Action<T> on each element and then yield returns it.
// Must be declared inside a static class to be usable as an extension method.
public static IEnumerable<T> Do<T>(this IEnumerable<T> vals, Action<T> action) {
    foreach (T elem in vals) {
        action(elem);
        yield return elem;
    }
}
A similar Do method is included in the Interactive Extensions (Ix) library that accompanies Rx, in the System.Interactive package.
Then you can simply do
myCollection.Do(x => x.MyMethod()).ToArray();
List<T> already has the method you need: .ForEach(). It calls an Action on each list member.
List<x> fooList = ....Select(x => x).ToList();
fooList.ForEach(x => DoSomething(x));
I created an Apply extension method :
public static IEnumerable<T> Apply<T>(this IEnumerable<T> source, Action<T> action)
{
foreach(var item in source)
{
action(item);
yield return item;
}
}
You can use it as follows :
var results = query.Apply(x => x.MyMethod()).ToArray();
Actually, that's similar to the List<T>.ForEach method, except that it returns the items of the source so that you can continue to apply sequence operators to it.
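Because such a method is an iterator, the action only runs when the sequence is enumerated, which is why ToArray() (or similar) is still needed. The sketch below repeats the Apply method so that it is self-contained; DeferredSketch and CountCalls are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredSketch
{
    public static IEnumerable<T> Apply<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (var item in source)
        {
            action(item);
            yield return item;
        }
    }

    // Counts how many times the action has run before and after enumeration.
    public static (int before, int after) CountCalls()
    {
        int calls = 0;
        var query = new[] { 1, 2, 3 }.Apply(_ => calls++);

        int before = calls;   // the query has only been defined, so nothing has run yet
        query.ToArray();      // enumeration triggers the action once per item
        int after = calls;

        return (before, after);
    }
}
```

DeferredSketch.CountCalls() returns (0, 3).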
A foreach is probably going to be the easiest way to do this. You could write an extension method that does the foreach, but you really wouldn't gain anything.
In my opinion, you don't need to call the .To___ conversion methods, since you only want the side effects. The Do() method from Reactive Extensions would be a viable option.
Using Do() has two advantages, as far as I can tell:
1) Deferred execution (you can postpone execution if you want).
2) Do() has several overloads that give you more control over the iteration.
For example, the Do(onNext, onError, onCompleted) overload:
var deferQuery = query.Do(x => x.MyMethod(), ex => Console.WriteLine(ex.Message), () => Console.WriteLine("Completed"));
var immediateQuery = query.Do(x => x.MyMethod(), ex => Console.WriteLine(ex.Message), () => Console.WriteLine("Completed")).Run();

Resources