I've got an ILookup generated by some complicated expression. Let's say it's a lookup of people by last name. (In our simplistic world model, last names are unique by family)
ILookup<string, Person> families;
Now I've got two queries I'm interested in how to build.
First, how would I filter by last name?
var germanFamilies = families.Where(family => IsNameGerman(family.Key));
But here, germanFamilies is an IEnumerable<IGrouping<string, Person>>; if I call ToLookup() on it, my best bet is that I'd get an ILookup<string, IGrouping<string, Person>>. If I try to be smart and call SelectMany first, I'd end up with the computer doing a lot of unnecessary work. How would you convert this enumeration into a lookup easily?
Second, I'd like to get a lookup of adults only.
var adults = families.Select(family =>
new Grouping(family.Key, family.Where(person =>
person.IsAdult())));
Here I'm faced with two problems: the Grouping type doesn't exist (except as an internal inner class of Lookup), and even if it did we'd have the problem discussed above.
So, apart from implementing the ILookup and IGrouping interfaces completely, or making the computer do silly amounts of work (regrouping what has already been grouped), is there a way to alter existing ILookups to generate new ones that I missed?
(I'm going to assume you actually wanted to filter by last name, given your query.)
You can't modify any implementation of ILookup<TKey, TElement> that I'm aware of. It's certainly possible to implement ToLookup with an immutable lookup, as you're clearly aware :)
What you could do, however, is to change to use a Dictionary<string, List<Person>>:
var germanFamilies = families.Where(family => IsNameGerman(family.Key))
.ToDictionary(family => family.Key,
family => family.ToList());
That approach also works for your second query:
var adults = families.ToDictionary(family => family.Key,
family => family.Where(person => person.IsAdult)
.ToList());
While that's still doing a bit more work than we might think necessary, it's not too bad.
EDIT: The discussion with Ani in the comments is worth reading. Basically, we're already going to be iterating over every person anyway - so if we assume O(1) dictionary lookup and insertion, we're actually no better in terms of time-complexity using the existing lookup than flattening:
var adults = families.SelectMany(x => x)
.Where(person => person.IsAdult)
.ToLookup(x => x.LastName);
In the first case, we could potentially use the existing grouping, like this:
// We'll have an IDictionary<string, IGrouping<string, Person>>
var germanFamilies = families.Where(family => IsNameGerman(family.Key))
.ToDictionary(family => family.Key);
That is then potentially much more efficient (if we have many people in each family) but means we're using groupings "out of context". I believe that's actually okay, but it leaves a slightly odd taste in my mouth, for some reason. As ToLookup materializes the query, it's hard to see how it could actually go wrong though...
For your first query, what about implementing your own FilteredLookup able to take advantage of coming from another ILookup ?
(thanks to Jon Skeet for the hint)
public static ILookup<TKey, TElement> ToFilteredLookup<TKey, TElement>(this ILookup<TKey, TElement> lookup, Func<IGrouping<TKey, TElement>, bool> filter)
{
return new FilteredLookup<TKey, TElement>(lookup, filter);
}
With FilteredLookup class being:
internal sealed class FilteredLookup<TKey, TElement> : ILookup<TKey, TElement>
{
int count = -1;
Func<IGrouping<TKey, TElement>, bool> filter;
ILookup<TKey, TElement> lookup;
public FilteredLookup(ILookup<TKey, TElement> lookup, Func<IGrouping<TKey, TElement>, bool> filter)
{
this.filter = filter;
this.lookup = lookup;
}
public bool Contains(TKey key)
{
if (this.lookup.Contains(key))
return this.filter(this.GetGrouping(key));
return false;
}
public int Count
{
get
{
if (count >= 0)
return count;
count = this.lookup.Where(filter).Count();
return count;
}
}
public IEnumerable<TElement> this[TKey key]
{
get
{
var grp = this.GetGrouping(key);
if (!filter(grp))
throw new KeyNotFoundException();
return grp;
}
}
public IEnumerator<IGrouping<TKey, TElement>> GetEnumerator()
{
return this.lookup.Where(filter).GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
private IGrouping<TKey, TElement> GetGrouping(TKey key)
{
return new Grouping<TKey, TElement>(key, this.lookup[key]);
}
}
and Grouping:
internal sealed class Grouping<TKey, TElement> : IGrouping<TKey, TElement>
{
private readonly TKey key;
private readonly IEnumerable<TElement> elements;
internal Grouping(TKey key, IEnumerable<TElement> elements)
{
this.key = key;
this.elements = elements;
}
public TKey Key { get { return key; } }
public IEnumerator<TElement> GetEnumerator()
{
return elements.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
So basically your first query will be:
var germanFamilies = families.ToFilteredLookup(family => IsNameGerman(family.Key));
This allows you to avoid re-flattening-filtering-ToLookup, or creating a new dictionary (and so hashing keys again).
For the second query the idea will be similar, you should just create a similar class not filtering for the whole IGrouping but for the elements of the IGrouping.
Just an idea; it may well not be faster than the other methods :)
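A minimal sketch of what that second class might look like, under a few assumptions: the names ElementFilteredLookup, FilteredGrouping and ToElementFilteredLookup are invented here for illustration, and a key whose elements are all filtered out is treated as absent (which mirrors how a lookup built by ToLookup never contains empty groups). The filter is re-evaluated on every access, so this trades repeated predicate calls for not rebuilding the index.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class LookupExtensions
{
    // Hypothetical helper: wraps an existing lookup instead of regrouping it.
    public static ILookup<TKey, TElement> ToElementFilteredLookup<TKey, TElement>(
        this ILookup<TKey, TElement> lookup, Func<TElement, bool> filter)
    {
        return new ElementFilteredLookup<TKey, TElement>(lookup, filter);
    }
}

internal sealed class ElementFilteredLookup<TKey, TElement> : ILookup<TKey, TElement>
{
    private readonly ILookup<TKey, TElement> lookup;
    private readonly Func<TElement, bool> filter;

    public ElementFilteredLookup(ILookup<TKey, TElement> lookup, Func<TElement, bool> filter)
    {
        this.lookup = lookup;
        this.filter = filter;
    }

    // A key only "exists" if at least one of its elements passes the filter.
    public bool Contains(TKey key) => lookup.Contains(key) && lookup[key].Any(filter);

    // Number of keys that still have at least one element.
    public int Count => Groups().Count();

    // Like Lookup<,>, return an empty sequence for a missing key.
    public IEnumerable<TElement> this[TKey key] =>
        lookup.Contains(key) ? lookup[key].Where(filter) : Enumerable.Empty<TElement>();

    public IEnumerator<IGrouping<TKey, TElement>> GetEnumerator() => Groups().GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    private IEnumerable<IGrouping<TKey, TElement>> Groups()
    {
        foreach (var group in lookup)
        {
            // Skip keys whose elements were all filtered out.
            if (group.Any(filter))
                yield return new FilteredGrouping(group.Key, group.Where(filter));
        }
    }

    private sealed class FilteredGrouping : IGrouping<TKey, TElement>
    {
        private readonly IEnumerable<TElement> elements;

        public FilteredGrouping(TKey key, IEnumerable<TElement> elements)
        {
            Key = key;
            this.elements = elements;
        }

        public TKey Key { get; }

        public IEnumerator<TElement> GetEnumerator() => elements.GetEnumerator();

        IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
    }
}
```

With this in place, the second query from the question could be written as families.ToElementFilteredLookup(person => person.IsAdult), assuming IsAdult is a property as in the first answer.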
A Lookup is an index from a key type to a sequence of values. It is immutable, so to "add" to a lookup you flatten it, Concat the new items and call ToLookup again; to "remove" from it you flatten it, filter out the unwanted items and rebuild the lookup the same way. The lookup then works like a dictionary, retrieving the values for a key.
public void TestILookup()
{
// Lookup<TKey,TElement>
List<Product> products = new List<Product>
{
new Product { ProductID = 1, Name = "Kayak", Category = "Watersports", Price = 275m },
new Product { ProductID = 2, Name = "Lifejacket", Category = "Watersports", Price = 48.95m },
new Product { ProductID = 3, Name = "Soccer Ball", Category = "Soccer", Price = 19.50m },
new Product { ProductID = 4, Name = "Corner Flag", Category = "Soccer", Price = 34.95m }
};
// create an indexer
ILookup<int, Product> lookup = products.ToLookup(p => p.ProductID, p => p);
Product newProduct = new Product { ProductID = 5, Name = "Basketball", Category = "Basketball", Price = 120.15m };
lookup = lookup.SelectMany(l => l)
.Concat(new[] { newProduct })
.ToLookup(l => l.ProductID, l=>l);
foreach (IGrouping<int, Product> packageGroup in lookup)
{
// Print the key value of the IGrouping.
output.WriteLine("ProductID Key {0}",packageGroup.Key);
// Iterate over each value in the IGrouping and print its value.
foreach (Product product in packageGroup)
output.WriteLine("Name {0}", product.Name);
}
Assert.Equal(5, lookup.Count);
}
public class Product
{
public int ProductID { get; set; }
public string Name { get; set; }
public string Category { get; set; }
public decimal Price { get; set; }
}
Output:
ProductID Key 1
Name Kayak
ProductID Key 2
Name Lifejacket
ProductID Key 3
Name Soccer Ball
ProductID Key 4
Name Corner Flag
ProductID Key 5
Name Basketball
If I have a class like this
class Person
{
public string First;
public string Last;
public bool IsMarried;
public int Age;
}
Then how can I write a LINQ Expression where I could select properties of a Person. I want to do something like this (user can enter 1..n properties)
SelectData<Person>(x=>x.First, x.Last,x.Age);
What would be the input expression of my SelectData function ?
SelectData(Expression<Func<TEntity, List<string>>> selector); ?
EDIT
In my SelectData function I want to extract property names and then generate SELECT clause of my SQL Query dynamically.
SOLUTION
Ok, so what I have done is to have my SelectData as
public IEnumerable<TEntity> SelectData(Expression<Func<TEntity, object>> expression)
{
NewExpression body = (NewExpression)expression.Body;
List<string> columns = new List<string>();
foreach(var arg in body.Arguments)
{
var exp = (MemberExpression)arg;
columns.Add(exp.Member.Name);
}
//build query
And to use it I call it like this
ccc<Person>().SelectData(x => new { x.First, x.Last, x.Age });
Hopefully it would help someone who is looking :)
Thanks,
IY
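For reference, the column-extraction step of that solution can be sketched as a self-contained helper. The names ColumnExtractor and GetColumns are invented here, the Person class uses properties rather than the fields from the question, and the query-building part is left out because the original does not show it. The sketch also handles a single member such as x => x.Age, which the compiler wraps in a Convert node because the int has to be boxed to object.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public class Person
{
    public string First { get; set; }
    public string Last { get; set; }
    public bool IsMarried { get; set; }
    public int Age { get; set; }
}

public static class ColumnExtractor
{
    // Hypothetical helper: returns the member names referenced by the selector.
    public static List<string> GetColumns<TEntity>(Expression<Func<TEntity, object>> selector)
    {
        var columns = new List<string>();
        Expression body = selector.Body;

        // Value-typed members arrive as Convert(x.Member); unwrap the boxing.
        if (body is UnaryExpression unary && unary.NodeType == ExpressionType.Convert)
            body = unary.Operand;

        if (body is NewExpression created)
        {
            // x => new { x.First, x.Last }: one constructor argument per member.
            foreach (Expression argument in created.Arguments)
            {
                if (argument is MemberExpression member)
                    columns.Add(member.Member.Name);
                else
                    throw new NotSupportedException("Only direct member access is supported: " + argument);
            }
        }
        else if (body is MemberExpression single)
        {
            // x => x.First: a single member.
            columns.Add(single.Member.Name);
        }
        else
        {
            throw new NotSupportedException("Unsupported selector: " + body);
        }

        return columns;
    }
}
```

Calling ColumnExtractor.GetColumns<Person>(x => new { x.First, x.Last, x.Age }) yields the names First, Last and Age in that order.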
I think it would be better to use delegates instead of Reflection. Apart from the fact that delegates will be faster, the compiler will complain if you try to fetch property values that do not exist. With reflection you won't find errors until run time.
Luckily there is already something like that: it is implemented as an extension method on IEnumerable<T>, and it is called Select (irony intended).
I think you want something like this:
I have a sequence of Persons, and I want you to create a Linq
statement that returns per Person a new object that contains the
properties First and Last.
Or:
I have a sequence of Persons and I want you to create a Linq statement
that returns per Person a new object that contains Age, IsMarried,
whether it is an adult and to make it difficult: one Property called
Name which is a combination of First and Last
The function SelectData would be something like this:
public static IEnumerable<TResult> SelectData<TSource, TResult>(this IEnumerable<TSource> source,
Func<TSource, TResult> selector)
{
return source.Select(selector);
}
Usage:
problem 1: return per Person a new object that contains the
properties First and Last.
var result = Persons.SelectData(person => new
{
First = person.First,
Last = person.Last,
});
problem 2: return per Person a new object that contains Age, IsMarried, whether he is an adult and one Property called Name which is a combination
of First and Last
var result = Persons.SelectData(person => new
{
Age = person.Age,
IsMarried = person.IsMarried,
IsAdult = person.Age > 21,
Name = new
{
First = person.First,
Last = person.Last,
},
});
Well let's face it, your SelectData is nothing more than Enumerable.Select
You could of course create a function where you'd let the caller provide a list of properties he wants, but (1) that would limit his possibilities to design the end result and (2) it would be way more typing for him to call the function.
Instead of:
.Select(p => new
{
P1 = p.Property1,
P2 = p.Property2,
})
he would have to type something like
.SelectData(new List<Func<TSource, TResult>>()
{
p => p.Property1, // first element of the property list
p => p.Property2, // second element of the property list
})
You won't be able to name the returned properties, you won't be able to combine several properties into one:
.Select(p => p.First + p.Last)
And what would you gain by it?
Highly discouraged requirement!
You could achieve a similar result using reflection and an extension method.
Model:
namespace Test
{
class Person
{
public string First { get; set; }
public string Last { get; set; }
public bool IsMarried { get; set; }
public int Age { get; set; }
}
}
Service:
using System.Collections.Generic;
using System.Linq;
namespace Test
{
public static class Service
{
public static IQueryable<IQueryable<KeyValuePair<string, object>>> SelectData<T>(this IQueryable<T> queryable, string[] properties)
{
var queryResult = new List<IQueryable<KeyValuePair<string, object>>>();
foreach (T entity in queryable)
{
var entityProperties = new List<KeyValuePair<string, object>>();
foreach (string property in properties)
{
var value = typeof(T).GetProperty(property).GetValue(entity);
var entityProperty = new KeyValuePair<string, object>(property, value);
entityProperties.Add(entityProperty);
}
queryResult.Add(entityProperties.AsQueryable());
}
return queryResult.AsQueryable();
}
}
}
Usage:
using System;
using System.Collections.Generic;
using System.Linq;
namespace Test
{
class Program
{
static void Main(string[] args)
{
var list = new List<Person>()
{
new Person()
{
Age = 18,
First = "test1",
IsMarried = false,
Last = "test2"
},
new Person()
{
Age = 40,
First = "test3",
IsMarried = true,
Last = "test4"
}
};
var queryableList = list.AsQueryable();
string[] properties = { "Age", "Last" };
var result = queryableList.SelectData(properties);
foreach (var element in result)
{
foreach (var property in element)
{
Console.WriteLine($"{property.Key}: {property.Value}");
}
}
Console.ReadKey();
}
}
}
Result:
Age: 18
Last: test2
Age: 40
Last: test4
I'm looking for some help with a LINQ query to filter on a property/enum of a custom object which is in a nested List, and want to maintain the parent object in return list.
For example/clarity/sample code, I have a parent object, which has in it a List based on class and enum below:
public class Stage {
public String Name { get; set;}
public List<Evaluation> MyEvaluations { get; set;}
}
public class Evaluation {
public float Result { get; set; }
public enumResultType ResultType { get; set; }
}
public enum enumResultType {
A,B,C
}
One can simulate sample data along those lines with something like:
List<Stage> ParentList = new List<Stage>();
Stage Stage1 = new Stage() { Name = "Stage1",
MyEvaluations = new List<Evaluation>() {
new Evaluation() { ResultType = enumResultType.A, Result=5 },
new Evaluation() { ResultType = enumResultType.B, Result=10},
new Evaluation() { ResultType = enumResultType.B, Result=11},
new Evaluation() { ResultType = enumResultType.C, Result=5}
}};
Stage Stage2 = new Stage() { Name = "Stage2",
MyEvaluations = new List<Evaluation>() {
new Evaluation() { ResultType = enumResultType.A, Result=10},
new Evaluation() { ResultType = enumResultType.B, Result=20},
new Evaluation() { ResultType = enumResultType.C, Result=20}}};
ParentList.Add(Stage1);
ParentList.Add(Stage2);
What I want to be able to do, via LINQ, is to select from the Parentlist object, all the items with only a filtered list where the ResultType in the Evaluations List matches a proper condition...
I don't want to repeat the parent object multiple times (seen selectmany), but rather a filtered down list of the MyEvaluations where the ResultType matches, and if this list has items (it would) return it with the parent.
I've played with:
ParentList.Select(x => x.MyEvaluations.FindAll(y => y.ResultType==enumResultType.B)).ToList();
however this returns only the inner list... whereas
ParentList.Where(x => x.MyEvaluations.Any(y => y.ResultType==enumResultType.B)).ToList();
returns ANY.. however I am missing how to get the list of MyEvaluations to be filtered down..
In my Example/sample data, I would like to query ParentList for all situations where ResultType = enumResultType.B;
So I would expect to get back a list of the same type, but without "Evaluation" which are equal to ResultType.A or .C
Based on dummy data, I would expect to be getting something which would have:
returnList.Count() - 2 items (Stage1 / Stage2), and within that:
Stage1 --> foreach (item.Result): 10, 11
Stage2 --> foreach (item.Result): 20
Can this be done without going to projections in new anonymous types as I would like to keep the list nice and clean as used later on in DataBinding and I iterate over many ResultTypes?
Feel like I'm missing something fairly simple, but fairly new to LINQ and lambda expressions.
Did you try these approaches already? Or is this not what you're looking for ?
//creating a new list
var answer = (from p in ParentList
select new Stage(){
Name = p.Name,
MyEvaluations = p.MyEvaluations.Where(e => e.ResultType == enumResultType.B).ToList()
}).ToList();
//in place replacement
ParentList.ForEach(p => p.MyEvaluations = p.MyEvaluations.Where(e => e.ResultType == enumResultType.B).ToList());
Say I have a collection of the following simple class:
public class MyEntity
{
public string SubId { get; set; }
public System.DateTime ApplicationTime { get; set; }
public double? ThicknessMicrons { get; set; }
}
I need to search through the entire collection looking for 5 consecutive (not 5 total, but 5 consecutive) entities that have a null ThicknessMicrons value. Consecutiveness will be based on the ApplicationTime property. The collection will be sorted on that property.
How can I do this in a Linq query?
You can write your own extension method pretty easily:
public static IEnumerable<IEnumerable<T>> FindSequences<T>(this IEnumerable<T> sequence, Predicate<T> selector, int size)
{
List<T> curSequence = new List<T>();
foreach (T item in sequence)
{
// Check if this item matches the condition
if (selector(item))
{
// It does, so store it
curSequence.Add(item);
// Check if the list size has met the desired size
if (curSequence.Count == size)
{
// It did, so yield that list, and reset
yield return curSequence;
curSequence = new List<T>();
}
}
else
{
// No match, so reset the list
curSequence = new List<T>();
}
}
}
Now you can just say:
var groupsOfFive = entities.OrderBy(x => x.ApplicationTime)
.FindSequences(x => x.ThicknessMicrons == null, 5);
Note that this will return all sub-sequences of length 5. You can test for the existence of one like so:
bool isFiveSubsequence = groupsOfFive.Any();
Another important note is that if you have 9 consecutive matches, only one sub-sequence will be located.
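If that behaviour is not what you want, one alternative is to report each maximal run of matches that is at least the requested length, so that 9 consecutive matches come back as a single run of 9. This is only a sketch of that variant; the names RunExtensions and FindRuns are made up here and are not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RunExtensions
{
    // Hypothetical variant: yields every maximal run of consecutive matches
    // whose length is at least minSize.
    public static IEnumerable<IReadOnlyList<T>> FindRuns<T>(
        this IEnumerable<T> sequence, Func<T, bool> predicate, int minSize)
    {
        var run = new List<T>();
        foreach (T item in sequence)
        {
            if (predicate(item))
            {
                // Still inside a run of matches.
                run.Add(item);
            }
            else
            {
                // The run just ended; report it if it is long enough.
                if (run.Count >= minSize)
                    yield return run;
                run = new List<T>();
            }
        }

        // A run may extend to the very end of the sequence.
        if (run.Count >= minSize)
            yield return run;
    }
}
```

Usage would mirror the original: entities.OrderBy(x => x.ApplicationTime).FindRuns(x => x.ThicknessMicrons == null, 5).Any() tells you whether at least five consecutive nulls exist.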
I'm building a rather large filter based on a SearchObject that has 50+ fields that can be searched.
Rather than building my where clause for each one of these individually, I thought I'd use some sleight of hand and try building a custom attribute supplying the necessary information, and then use reflection to build out each of my predicate statements (using LinqKit, btw). Trouble is, the code finds the appropriate values in the reflection code and successfully builds a predicate for the property, but the "where" doesn't seem to actually generate and my query always returns 0 records.
The attribute is simple:
[AttributeUsage(AttributeTargets.Property, AllowMultiple=true)]
public class FilterAttribute: Attribute
{
public FilterType FilterType { get; set; } //enum{ Object, Database}
public string FilterPath { get; set; }
//var predicate = PredicateBuilder.False<Metadata>();
}
And this is my method that builds out the query:
public List<ETracker.Objects.Item> Search(Search SearchObject, int Page, int PageSize)
{
var predicate = PredicateBuilder.False<ETracker.Objects.Item>();
Type t = typeof(Search);
IEnumerable<PropertyInfo> pi = t.GetProperties();
string title = string.Empty;
foreach (var property in pi)
{
if (Attribute.IsDefined(property, typeof(FilterAttribute)))
{
var attrs = property.GetCustomAttributes(typeof(FilterAttribute),true);
var value = property.GetValue(SearchObject, null);
if (property.Name == "Title")
title = (string)value;
predicate.Or(a => GetPropertyVal(a, ((FilterAttribute)attrs[0]).FilterPath) == value);
}
}
var res = dataContext.GetAllItems().Take(1000)
.Where(a => SearchObject.Subcategories.Select(b => b.ID).ToArray().Contains(a.SubCategory.ID))
.Where(predicate);
return res.ToList();
}
The SearchObject is quite simple:
public class Search
{
public List<Item> Items { get; set; }
[Filter(FilterType = FilterType.Object, FilterPath = "Title")]
public string Title { get; set; }
...
}
Any suggestions will be greatly appreciated. I may well be going way the wrong direction and will take no offense if someone has a better alternative (or at least one that works)
You're not assigning your predicate anywhere. Change the line to this:
predicate = predicate.Or(a => GetPropertyVal(a, ((FilterAttribute)attrs[0]).FilterPath) == value);
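The reason the assignment matters is that expression trees are immutable: Or cannot change the predicate it is called on, it can only return a new one. The sketch below illustrates this with a tiny stand-in called MiniPredicateBuilder, which is written here for demonstration and is not LinqKit's implementation; it assumes only the standard System.Linq.Expressions API.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class MiniPredicateBuilder
{
    // Hypothetical stand-in for a predicate builder, just to show the behaviour.
    public static Expression<Func<T, bool>> False<T>() => item => false;

    public static Expression<Func<T, bool>> Or<T>(
        this Expression<Func<T, bool>> left, Expression<Func<T, bool>> right)
    {
        // Builds a NEW lambda; neither left nor right is modified.
        var invokedRight = Expression.Invoke(right, left.Parameters.Cast<Expression>());
        return Expression.Lambda<Func<T, bool>>(
            Expression.OrElse(left.Body, invokedRight), left.Parameters);
    }
}

public static class Demo
{
    public static bool Discarded(int input)
    {
        var predicate = MiniPredicateBuilder.False<int>();
        predicate.Or(n => n > 10);           // result is thrown away
        return predicate.Compile()(input);   // still the original "false" predicate
    }

    public static bool Assigned(int input)
    {
        var predicate = MiniPredicateBuilder.False<int>();
        predicate = predicate.Or(n => n > 10); // keep the combined predicate
        return predicate.Compile()(input);
    }
}
```

Demo.Discarded returns false for every input, which matches the "always returns 0 records" symptom, while Demo.Assigned returns true for inputs greater than 10.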
I always seem to have a problem when I need to compare two lists and produce a third list which includes all unique items. I need to do this quite often.
Attempt to reproduce the issue with a noddy example.
Am I missing something?
Thanks for any suggestions
The wanted result
Name= Jo1 Surname= Bloggs1 Category= Account
Name= Jo2 Surname= Bloggs2 Category= Sales
Name= Jo5 Surname= Bloggs5 Category= Development
Name= Jo6 Surname= Bloggs6 Category= Management
Name= Jo8 Surname= Bloggs8 Category= HR
Name= Jo7 Surname= Bloggs7 Category= Cleaning
class Program
{
static void Main(string[] args)
{
List<Customer> listOne = new List<Customer>();
List<Customer> listTwo = new List<Customer>();
listOne.Add(new Customer { Category = "Account", Name = "Jo1", Surname = "Bloggs1" });
listOne.Add(new Customer { Category = "Sales", Name = "Jo2", Surname = "Bloggs2" });
listOne.Add(new Customer { Category = "Development", Name = "Jo5", Surname = "Bloggs5" });
listOne.Add(new Customer { Category = "Management", Name = "Jo6", Surname = "Bloggs6" });
listTwo.Add(new Customer { Category = "HR", Name = "Jo8", Surname = "Bloggs8" });
listTwo.Add(new Customer { Category = "Sales", Name = "Jo2", Surname = "Bloggs2" });
listTwo.Add(new Customer { Category = "Management", Name = "Jo6", Surname = "Bloggs6" });
listTwo.Add(new Customer { Category = "Development", Name = "Jo5", Surname = "Bloggs5" });
listTwo.Add(new Customer { Category = "Cleaning", Name = "Jo7", Surname = "Bloggs7" });
List<Customer> resultList = listOne.Union(listTwo).ToList();//**I get duplicates why????**
resultList.ForEach(customer => Console.WriteLine("Name= {0} Surname= {1} Category= {2}", customer.Name, customer.Surname, customer.Category));
Console.Read();
IEnumerable<Customer> resultList3 = listOne.Except(listTwo);//**Does not work**
foreach (var customer in resultList3)
{
Console.WriteLine("Name= {0} Surname= {1} Category= {2}", customer.Name, customer.Surname, customer.Category);
}
// Does not work
var resultList2 = (listOne
.Where(n => !(listTwo
.Select(o => o.Category))
.Contains(n.Category)))
.OrderBy(n => n.Category);
foreach (var customer in resultList2)
{
Console.WriteLine("Name= {0} Surname= {1} Category= {2}",
customer.Name,
customer.Surname,
customer.Category);
}
Console.Read();
}
}
public class Customer
{
public string Name { get; set; }
public string Surname { get; set; }
public string Category { get; set; }
}
Couldn't you do this by using the Concat and Distinct LINQ methods?
List<Customer> listOne;
List<Customer> listTwo;
List<Customer> uniqueList = listOne.Concat(listTwo).Distinct().ToList();
If necessary, you can use the Distinct() overload that takes an IEqualityComparer<T> to supply a custom equality comparison.
The crux of the problem is the Customer object doesn't have a .Equals() implementation. If you override .Equals (and .GetHashCode) then .Distinct would use it to eliminate duplicates. If you don't own the Customer implementation, however, adding .Equals may not be an option.
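If you do own the class, a sketch of that override could look like the following. It assumes two customers are "the same" when all three fields match, which is what the wanted result in the question suggests; implementing IEquatable<Customer> as well is optional but avoids a cast when the default comparer is used.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer : IEquatable<Customer>
{
    public string Name { get; set; }
    public string Surname { get; set; }
    public string Category { get; set; }

    // Assumption: equality means all three fields match.
    public bool Equals(Customer other)
    {
        if (other is null) return false;
        return Name == other.Name
            && Surname == other.Surname
            && Category == other.Category;
    }

    public override bool Equals(object obj) => Equals(obj as Customer);

    // Must agree with Equals: equal customers produce equal hash codes.
    public override int GetHashCode() => (Name, Surname, Category).GetHashCode();
}
```

With this in place, listOne.Union(listTwo) and listOne.Concat(listTwo).Distinct() both drop the repeated customers without any extra arguments.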
An alternative is to pass a custom IEqualityComparer to .Distinct(). This lets you compare objects in different ways depending on which comparer you pass in.
Another alternative is to GroupBy the fields that are important and take any item from the group (since the GroupBy acts as .Equals in this case). This requires the least code to be written.
e.g.
var result = listOne.Concat(listTwo)
.GroupBy(x=>x.Category+"|"+x.Name+"|"+x.Surname)
.Select(x=>x.First());
which gets your desired result.
As a rule I use a unique delimiter to combine fields so that two items that should be different don't unexpectedly combine to the same key. Consider: {Name=abe, Surname=long} and {Name=abel, Surname=ong} would both get the GroupBy key "abelong" if a delimiter isn't used.
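Another way to sidestep the delimiter question, offered here as a sketch rather than as part of the answer above, is to group by an anonymous type. Anonymous types compare their properties one by one, so no string concatenation is involved and the two names from the example can never collide. The Dedup class and Merge method are names chosen for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string Name { get; set; }
    public string Surname { get; set; }
    public string Category { get; set; }
}

public static class Dedup
{
    // Hypothetical helper: merges two sequences, keeping one customer per
    // distinct (Category, Name, Surname) combination.
    public static List<Customer> Merge(IEnumerable<Customer> first, IEnumerable<Customer> second)
    {
        return first.Concat(second)
            // Anonymous types get value-based Equals/GetHashCode from the compiler.
            .GroupBy(c => new { c.Category, c.Name, c.Surname })
            .Select(group => group.First())
            .ToList();
    }
}
```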
The best option is to implement the IEqualityComparer<T> interface and pass it to the Union or Distinct method, as I wrote at the end of this article:
http://blog.santiagoporras.com/combinar-listas-sin-duplicados-linq/
Implementation of IEqualityComparer
public class SaintComparer : IEqualityComparer<Saint>
{
public bool Equals(Saint item1, Saint item2)
{
return item1.Name == item2.Name;
}
public int GetHashCode(Saint item)
{
int hCode = item.Name.Length;
return hCode.GetHashCode();
}
}
Use of comparer
var unionList = list1.Union(list2, new SaintComparer());
I had a similar problem where I had two very large lists with random strings.
I made a recursive function which returns a new list with unique strings. I compared two lists with 100k random strings(it may or may not exist duplicates) each with 6 characters of abcdefghijklmnopqrstuvwxyz1234567890 and it was done in about 230 ms. I only measured the given function.
I hope this will give value to someone.
List<string> makeCodesUnique(List<string> existing, List<string> newL)
{
// Get all duplicate between two lists
List<string> duplicatesBetween = newL.Intersect(existing).ToList();
// Get all duplicates within list
List<string> duplicatesWithin = newL.GroupBy(x => x)
.Where(group => group.Count() > 1)
.Select(group => group.Key).ToList();
if (duplicatesBetween.Count == 0 && duplicatesWithin.Count == 0)
{
// Return list if there are no duplicates
return newL;
}
else
{
if (duplicatesBetween.Count != 0)
{
foreach (string duplicateCode in duplicatesBetween)
{
newL.Remove(duplicateCode);
}
// Generate new codes to substitute the removed ones
List<string> newCodes = generateSomeMore(duplicatesBetween.Count);
newL.AddRange(newCodes);
makeCodesUnique(existing, newL);
}
else if (duplicatesWithin.Count != 0)
{
foreach (string duplicateCode in duplicatesWithin)
{
newL.Remove(duplicateCode);
}
List<string> newCodes = generateSomeMore(duplicatesWithin.Count);
newL.AddRange(newCodes);
makeCodesUnique(existing, newL);
}
}
return newL;
}