How to use LINQ to find all items in a list which have the most members in another list?

Given:
class Item {
public int[] SomeMembers { get; set; }
}
var items = new []
{
new Item { SomeMembers = new [] { 1, 2 } }, //0
new Item { SomeMembers = new [] { 1, 2 } }, //1
new Item { SomeMembers = new [] { 1 } } //2
}
var secondList = new int[] { 1, 2, 3 };
I need to find all the Items in items with the most of their SomeMembers occurring in secondList.
In the example above I would expect Items 0 and 1 to be returned but not 2.
I know I could do it with things like loops or Contains() but it seems there must be a more elegant or efficient way?

This can be written pretty easily:
var result = items.Where(item => item.SomeMembers.Count(secondList.Contains) * 2
>= item.SomeMembers.Length);
Or possibly (I can never guess whether method group conversions will work):
var result = items.Where(item => item.SomeMembers.Count(x => secondList.Contains(x)) * 2
>= item.SomeMembers.Length);
Or to pull it out:
Func<int, bool> inSecondList = secondList.Contains;
var result = items.Where(item => item.SomeMembers.Count(inSecondList) * 2
>= item.SomeMembers.Length);
If secondList becomes large, you should consider using a HashSet<int> instead.
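As a minimal sketch of the HashSet<int> suggestion: the helper below builds the set once so that each membership test is a hash lookup rather than a linear scan. The class and method names (MajorityFilter, WithMajorityIn) are made up for illustration, and it works on plain int[] arrays instead of the Item class so that it is self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class MajorityFilter
{
    // Hypothetical helper: keeps the arrays for which at least half
    // of the members occur in `allowed`.
    public static List<int[]> WithMajorityIn(
        IEnumerable<int[]> candidates, IEnumerable<int> allowed)
    {
        // Build the set once; Contains is then a hash lookup.
        var lookup = new HashSet<int>(allowed);

        return candidates
            .Where(members =>
                members.Count(x => lookup.Contains(x)) * 2 >= members.Length)
            .ToList();
    }
}
```

Whether this is noticeably faster depends on the size of secondList; for a handful of values the plain array is probably just as good.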
EDIT: To avoid evaluating SomeMembers twice, you could create an extension method:
public static bool MajoritySatisfied<T>(this IEnumerable<T> source,
Func<T, bool> condition)
{
int total = 0, satisfied = 0;
foreach (T item in source)
{
total++;
if (condition(item))
{
satisfied++;
}
}
return satisfied * 2 >= total;
}
Then:
var result = items.Where(item => item.SomeMembers.MajoritySatisfied(secondList.Contains));
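Note that the majority test also accepts an item whose only member is 1, while the question expects items 0 and 1 but not 2. If "most" is instead read as "the highest number of matches among all items", a sketch along the following lines might be closer to what was asked. This is an assumption about the intent, not something the question states outright; BestMatches and WithMostMatches are hypothetical names, and plain int[] arrays stand in for Item.SomeMembers.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class BestMatches
{
    // Hypothetical helper: returns the arrays that share the highest
    // number of members found in `allowed`.
    public static List<int[]> WithMostMatches(
        IEnumerable<int[]> candidates, IEnumerable<int> allowed)
    {
        var lookup = new HashSet<int>(allowed);

        // Score every candidate once and keep the scores in memory.
        var scored = candidates
            .Select(members => new
            {
                Members = members,
                Matches = members.Count(x => lookup.Contains(x))
            })
            .ToList();

        if (scored.Count == 0)
        {
            return new List<int[]>();
        }

        int best = scored.Max(s => s.Matches);
        return scored
            .Where(s => s.Matches == best)
            .Select(s => s.Members)
            .ToList();
    }
}
```

With the arrays from the question ({1, 2}, {1, 2}, {1}) and secondList { 1, 2, 3 }, the first two arrays score 2 and the third scores 1, so only the first two are returned.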

Related

How do you use linq to group records based on an accumulator?

Given an enumeration of records in the format:
Name (string)
Amount (number)
For example:
Laverne 4
Lenny 2
Shirley 3
Squiggy 5
I want to group the records, so that each group's total Amount does not exceed some limit-per-group. For example, 10.
Group 1 (Laverne,Lenny,Shirley) with Total Amount 9
Group 2 (Squiggy) with Total Amount 5
The Amount number is guaranteed to always be less than the grouping limit.
If you allow for captured variables to maintain state, then it becomes easier. If we have:
int limit = 10;
Then:
int groupTotal = 0;
int groupNum = 0;
var grouped = records.Select(r =>
{
int newCount = groupTotal + r.Amount;
if (newCount > limit)
{
groupNum++;
groupTotal = r.Amount;
}
else
groupTotal = newCount;
return new{Records = r, Group = groupNum};
}
).GroupBy(g => g.Group, g => g.Records);
It's O(n) and uses just a Select and a GroupBy, but the captured variables may not be portable across LINQ providers.
For LINQ to Objects, though, it's fine.
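One thing to watch with the captured-variable approach: the query is deferred, so enumerating it a second time runs the lambda again with groupTotal and groupNum still holding their previous values. A minimal sketch that sidesteps this keeps the state local to a method and materializes the result before returning. RunningTotalGrouping and GroupByLimit are hypothetical names, and plain int amounts stand in for the records.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class RunningTotalGrouping
{
    // Hypothetical helper: splits `amounts` into consecutive groups whose
    // totals do not exceed `limit` (assumes each amount is <= limit).
    public static List<List<int>> GroupByLimit(IEnumerable<int> amounts, int limit)
    {
        // State lives in this call only, so repeated calls start fresh.
        int groupTotal = 0;
        int groupNum = 0;

        return amounts
            .Select(amount =>
            {
                if (groupTotal + amount > limit)
                {
                    groupNum++;
                    groupTotal = amount;
                }
                else
                {
                    groupTotal += amount;
                }
                return new { Amount = amount, Group = groupNum };
            })
            .GroupBy(x => x.Group, x => x.Amount)
            .Select(g => g.ToList())
            .ToList(); // materialize once, while the state is still valid
    }
}
```

For the amounts 4, 2, 3, 5 and a limit of 10 this gives two groups, (4, 2, 3) and (5).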
Here I have a solution using only LINQ functions:
// Record definition
class Record
{
public string Name;
public int Amount;
public Record(string name, int amount)
{
Name = name;
Amount = amount;
}
}
// actual code for setup and LINQ
List<Record> records = new List<Record>()
{
new Record("Laverne", 4),
new Record("Lenny", 2),
new Record("Shirley", 3),
new Record("Squiggy", 5)
};
int groupLimit = 10;
// the solution
List<Record[]> test =
records.GroupBy(record => records.TakeWhile(r => r != record)
.Concat(new[] { record })
.Sum(r => r.Amount) / (groupLimit + 1))
.Select(g => g.ToArray()).ToList();
This gives the correct result:
test =
{
{ [ "Laverne", 4 ], [ "Lenny", 2 ], [ "Shirley", 3 ] },
{ [ "Squiggy", 5 ] }
}
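The main downside is that this is O(n²). It essentially groups by the index of the group, computed from the running sum of the records up to and including the current one. Note that groupLimit + 1 is needed as the divisor so that running totals from 0 to groupLimit, inclusive, all map to group 0. Be aware that dividing the running total only approximates the requirement: for amounts 5, 5, 2, 9 and a limit of 10, the running totals 5, 10, 12, 21 give keys 0, 0, 1, 1, so the second group totals 11.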
I'm trying to find a way of making it prettier, but it doesn't look easy.
A dotnet fiddle with a solution using Aggregate:
https://dotnetfiddle.net/gVgONH
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
// Record definition
public class Record
{
public string Name;
public int Amount;
public Record(string name, int amount)
{
Name = name;
Amount = amount;
}
}
public static void Main()
{
// actual code for setup and LINQ
List<Record> records = new List<Record>()
{
new Record("Alice", 1),
new Record("Bob", 5),
new Record("Charly", 4),
new Record("Laverne", 4),
new Record("Lenny", 2),
new Record("Shirley", 3),
new Record("Squiggy", 5)
};
int groupLimit = 10;
int sum = 0;
var result = records.Aggregate(new List<List<Record>>(), (accumulated, next) =>
{
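// Note: ">=" starts a new team as soon as the total would reach the limit;
// use ">" instead if a total exactly equal to the limit is allowed.
if ((sum + next.Amount >= groupLimit) || accumulated.Count() == 0)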
{
Console.WriteLine("New team: " + accumulated.Count());
accumulated.Add(new List<Record>());
sum = 0;
}
sum += next.Amount;
Console.WriteLine("New member {0} ({1}): adds up to {2} ", next.Name, next.Amount, sum);
accumulated.Last().Add(next);
return accumulated;
}
);
Console.WriteLine("Team count: " + result.Count());
}
}
With output:
New team: 0
New member Alice (1): adds up to 1
New member Bob (5): adds up to 6
New team: 1
New member Charly (4): adds up to 4
New member Laverne (4): adds up to 8
New team: 2
New member Lenny (2): adds up to 2
New member Shirley (3): adds up to 5
New team: 3
New member Squiggy (5): adds up to 5
Team count: 4
There is no 'performant' way to do this with the built-in LINQ operators that I am aware of. You could create your own extension method, though:
public static class EnumerableExtensions
{
public static IEnumerable<TResult> GroupWhile<TSource, TAccumulation, TResult>(
this IEnumerable<TSource> source,
Func<TAccumulation> seedFactory,
Func<TAccumulation, TSource, TAccumulation> accumulator,
Func<TAccumulation, bool> predicate,
Func<TAccumulation, IEnumerable<TSource>, TResult> selector)
{
TAccumulation accumulation = seedFactory();
List<TSource> result = new List<TSource>();
using(IEnumerator<TSource> enumerator = source.GetEnumerator())
{
while(enumerator.MoveNext())
{
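// Only close the current group if it has members; otherwise an oversized
// first record would cause an empty group to be yielded.
if(result.Count > 0 && !predicate(accumulator(accumulation, enumerator.Current)))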
{
yield return selector(accumulation, result);
accumulation = seedFactory();
result = new List<TSource>();
}
result.Add(enumerator.Current);
accumulation = accumulator(accumulation, enumerator.Current);
}
if(result.Count > 0)
{
yield return selector(accumulation, result);
}
}
}
}
And then call it like this:
int limit = 10;
var groups =
records
.GroupWhile(
() => 0,
(a, x) => a + x.Amount,
(a) => a <= limit,
(a, g) => new { Total = a, Group = g });
The way it is currently written, if any single record exceeds the limit, that record is returned in a group by itself. You could modify it to exclude such records, or leave it as is and perform the exclusion with Where.
This solution has O(n) runtime.

Algorithm for unique combinations

I've been trying to find a way to get a list of unique combinations from a list of objects nested in a container. Objects within the same group cannot be combined. Objects will be unique across all the groups.
Example:
Group 1: (1,2)
Group 2: (3,4)
Result
1
2
3
4
1,3
1,4
2,3
2,4
If we add another group like so:
Group 1: (1,2)
Group 2: (3,4)
Group 3: (5,6,7)
The result would be
1
2
3
4
5
6
7
1,3
1,4
1,5
1,6
1,7
2,3
2,4
2,5
2,6
2,7
3,5
3,6
3,7
4,5
4,6
4,7
1,3,5
1,3,6
1,3,7
1,4,5
1,4,6
1,4,7
2,3,5
2,3,6
2,3,7
2,4,5
2,4,6
2,4,7
I may have missed a combination above, but the combinations mentioned should be enough indication.
I could have up to 7 groups, with up to 20 objects in each group.
I'm trying to avoid having code that knows that it's doing combinations of doubles, triples, quadruples etc, but I'm hitting a lot of logic bumps along the way.
To be clear, I'm not asking for code so much as an approach; pseudocode or a pointer in the right direction would be great.
UPDATE
Here's what I have after seeing those two answers.
From @Servy's answer:
public static IEnumerable<IEnumerable<T>> GetCombinations<T>(this IEnumerable<IEnumerable<T>> sequences)
{
var defaultArray = new[] { default(T) };
return sequences.Select(sequence =>
sequence.Select(item => item).Concat(defaultArray))
.CartesianProduct()
.Select(sequence =>
// EqualityComparer avoids a NullReferenceException when item is null (reference types)
sequence.Where(item => !EqualityComparer<T>.Default.Equals(item, default(T))));
}
public static IEnumerable<IEnumerable<T>> CartesianProduct<T>(this IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] { item })
);
}
From @AK_'s answer:
public static IEnumerable<IEnumerable<T>> GetCombinations<T>(this IEnumerable<IEnumerable<T>> groups)
{
if (groups.Count() == 0)
{
yield return new T[0];
}
// "else if" so that an empty input does not fall through to the recursive branch
else if (groups.Count() == 1)
{
foreach (var t in groups.First())
{
yield return new T[] { t };
}
}
else
{
var furtherResult = GetCombinations(groups.Where(x => x != groups.Last()));
foreach (var result in furtherResult)
{
yield return result;
}
foreach (var t in groups.Last())
{
yield return new T[] { t };
foreach (var result in furtherResult)
{
yield return result.Concat(new T[] { t });
}
}
}
}
Usage for both
List<List<int>> groups = new List<List<int>>();
groups.Add(new List<int>() { 1, 2 });
groups.Add(new List<int>() { 3, 4, 5 });
groups.Add(new List<int>() { 6, 7 });
groups.Add(new List<int>() { 8, 9 });
groups.Add(new List<int>() { 10, 11 });
var x = groups.GetCombinations().Where(g => g.Count() > 0).ToList().OrderBy(y => y.Count());
What would be considered the best solution? To be honest, I find @AK_'s solution much easier to read (I had to look up how to get a Cartesian product for the other one).
So first off, consider the problem of a Cartesian product of N sequences. That is, every single combination of one value from each of the sequences. Here is an example of an implementation of that problem, with an amazing explanation.
But how do we handle the cases where the output combination has a size smaller than the number of sequences? On its own, the Cartesian product only produces combinations whose size equals the number of sequences. Well, imagine for a second that every single input sequence has a "null" value. That null value gets paired with every single combination of values from the other sequences (including all of their null values). We can then remove these null values at the very end, and voila, we have every combination of every size.
To do this, while still allowing the input sequences to actually use the C# literal null value, or the default value for that type (if it's not nullable), we'll need to wrap the type. We'll create a wrapper that wraps the real value, while also having its own definition of a default/null value. From there we map each of our sequences into a sequence of wrappers, append the actual default value onto the end, compute the Cartesian product, and then map the combinations back to "real" values, filtering out the default values while we're at it.
If you don't want to see the actual code, stop reading here.
public class Wrapper<T>
{
public Wrapper(T value) { Value = value; }
public static Wrapper<T> Default = new Wrapper<T>(default(T));
public T Value { get; private set; }
}
public static IEnumerable<IEnumerable<T>> Foo<T>
(this IEnumerable<IEnumerable<T>> sequences)
{
return sequences.Select(sequence =>
sequence.Select(item => new Wrapper<T>(item))
.Concat(new[] { Wrapper<T>.Default }))
.CartesianProduct()
.Select(sequence =>
sequence.Where(wrapper => wrapper != Wrapper<T>.Default)
.Select(wrapper => wrapper.Value));
}
In C#
this is actually a monad... I think...
IEnumerable<IEnumerable<int>> foo(IEnumerable<IEnumerable<int>> groups)
{
// No groups: there are no combinations to produce.
if (groups.Count() == 0)
{
yield break;
}
if (groups.Count() == 1)
{
// A single group: each member is a combination on its own.
foreach (var num in groups.First())
{
yield return new List<int>() { num };
}
}
else
{
// Combinations that do not use the first group at all.
var furtherResult = foo(groups.Skip(1));
foreach (var result in furtherResult)
{
yield return result;
}
foreach (var num in groups.First())
{
// The member alone, then the member joined to every later combination.
yield return new List<int>() { num };
foreach (var result in furtherResult)
{
yield return result.Concat(new[] { num });
}
}
}
}
A better version:
public static IEnumerable<IEnumerable<T>> foo<T> (IEnumerable<IEnumerable<T>> groups)
{
if (groups.Count() == 0)
{
return new List<List<T>>();
}
else
{
var firstGroup = groups.First();
var furtherResult = foo(groups.Skip(1));
IEnumerable<IEnumerable<T>> myResult = from x in firstGroup
select new [] {x};
myResult = myResult.Concat( from x in firstGroup
from result in furtherResult
select result.Concat(new T[]{x}));
myResult = myResult.Concat(furtherResult);
return myResult;
}
}
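The same idea can also be written without recursion. The sketch below is only an illustration under the assumption that holding every combination in memory is acceptable; Combinations and AcrossGroups are hypothetical names. It starts from the empty combination and, for each group, adds a copy of every existing combination extended by each member of that group.

```csharp
using System.Collections.Generic;

public static class Combinations
{
    // Hypothetical helper: every non-empty combination that takes
    // at most one member from each group.
    public static List<List<T>> AcrossGroups<T>(IEnumerable<IEnumerable<T>> groups)
    {
        // Start with the empty combination; it is removed at the end.
        var results = new List<List<T>> { new List<T>() };

        foreach (var group in groups)
        {
            var extended = new List<List<T>>();
            foreach (var combo in results)
            {
                foreach (var item in group)
                {
                    // Copy the existing combination and append the new member.
                    extended.Add(new List<T>(combo) { item });
                }
            }
            // Keep the old combinations (group skipped) plus the extended ones.
            results.AddRange(extended);
        }

        results.RemoveAt(0); // drop the empty combination
        return results;
    }
}
```

For groups (1,2) and (3,4) this produces 8 combinations, and adding (5,6,7) gives 35, matching the listings in the question. If the upper bound in the question is read as 7 groups of 20 objects, there are 21^7 - 1 combinations (about 1.8 billion), so a lazy approach like the ones above is likely the better fit at that size.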

Linq GroupBy without merging groupings with the same key if they are separated by other key

I'd like to achieve behaviour similar to Python's groupby.
[1, 1, 2, 1].GroupBy() => [[1, 1], [2], [1]]
I think this is what you're looking for:
var data = new int[] { 1, 1, 2, 1 };
var results = Enumerable.Range(0, data.Count ())
.Where (i => i == 0 || data.ElementAt(i - 1) != data.ElementAt(i))
.Select (i => new
{
//Key = data.ElementAt(i),
Group = Enumerable.Repeat(
data.ElementAt(i),
data.Skip(i).TakeWhile (d => d == data.ElementAt(i)).Count ())
}
);
Here's an example of it running and the results: http://ideone.com/NJGQB
Here's a lazy, generic extension method that does what you want.
Code:
public static IEnumerable<IEnumerable<T>> MyGroupBy<T>(this IEnumerable<T> source)
{
using(var enumerator = source.GetEnumerator())
{
var currentgroup = new List<T>();
while (enumerator.MoveNext())
{
if (!currentgroup.Any() || currentgroup[0].Equals(enumerator.Current))
currentgroup.Add(enumerator.Current);
else
{
yield return currentgroup.AsReadOnly();
currentgroup = new List<T>() { enumerator.Current };
}
}
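// Skip the final yield for an empty source, which would otherwise produce one empty group.
if (currentgroup.Count > 0)
yield return currentgroup.AsReadOnly();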
}
}
Test:
void Main()
{
var data = new int[] { 1, 1, 2, 1 };
foreach(var g in data.MyGroupBy())
Console.WriteLine(String.Join(", ", g));
}
Output:
1, 1
2
1

How to use LINQ where clause for filtering the collection having dynamically generated type

I need to filter the ObservableCollection using LINQ Where clause in my Silverlight application.
The object type is dynamically created using method provided in following url.
http://mironabramson.com/blog/post/2008/06/Create-you-own-new-Type-and-use-it-on-run-time-(C).aspx
Is filtering my collection using Where clause for specific property possible?
How can I achieve it?
Thanks
The only way I know is using reflection, like this:
// using a list of objects whose types are only known at run time
var items = new List<object> { new { A = 0, B = 1 }, new { A = 1, C = 0 } };
// select all items with A > 0 (anonymous types expose their members as properties, so use GetProperty)
var filteredItems = items.Where(obj => (int)obj.GetType().GetProperty("A").GetValue(obj, null) > 0).ToArray();
// if the member is a public field instead of a property, you should call GetField(), like this:
obj.GetType().GetField("FieldName").GetValue(obj)
Since you only know the element type of your collection at run time, it is probably object at compile time. So the argument to the .Where method has to be a Func<object, bool>.
Here's a piece of code that will create such a delegate, given a property of the actual element type and a lambda expression on the property (whose type I suppose you know):
/// <summary>
/// Get a predicate for a property on a parent element.
/// </summary>
/// <param name="property">The property of the parent element to get the value for.</param>
/// <param name="propertyPredicate">The predicate on the property value.</param>
static Func<object, bool> GetPredicate<TProperty>(PropertyInfo property, Expression<Func<TProperty, bool>> propertyPredicate)
{
if (property.PropertyType != typeof(TProperty)) throw new ArgumentException("Bad property type.");
var pObj = Expression.Parameter(typeof(object), "obj");
// ((elementType)obj).property;
var xGetPropertyValue = Expression.Property(Expression.Convert(pObj, property.DeclaringType), property);
var pProperty = propertyPredicate.Parameters[0];
// obj => { var pProperty = xGetPropertyValue; return propertyPredicate.Body; };
var lambda = Expression.Lambda<Func<object, bool>>(Expression.Block(new[] { pProperty }, Expression.Assign(pProperty, xGetPropertyValue), propertyPredicate.Body), pObj);
return lambda.Compile();
}
Sample usage:
var items = new List<object> { new { A = 0, B = "Foo" }, new { A = 1, B = "Bar" }, new { A = 2, B = "FooBar" } };
var elementType = items[0].GetType();
Console.WriteLine("Items where A >= 1:");
foreach (var item in items.Where(GetPredicate<int>(elementType.GetProperty("A"), a => a >= 1)))
Console.WriteLine(item);
Console.WriteLine();
Console.WriteLine("Items where B starts with \"Foo\":");
foreach (var item in items.Where(GetPredicate<string>(elementType.GetProperty("B"), b => b.StartsWith("Foo"))))
Console.WriteLine(item);
Output:
Items where A >= 1:
{ A = 1, B = Bar }
{ A = 2, B = FooBar }
Items where B starts with "Foo":
{ A = 0, B = Foo }
{ A = 2, B = FooBar }

Linq Convert to Custom Dictionary?

.NET 4, I have
public class Humi
{
public int huKey { get; set; }
public string huVal { get; set; }
}
And in another class is this code in a method:
IEnumerable<Humi> someHumi = new List<Humi>(); //This is actually ISingleResult that comes from a LinqToSql-fronted sproc but I don't think is relevant for my question
var humia = new Humi { huKey = 1 , huVal = "a"};
var humib = new Humi { huKey = 1 , huVal = "b" };
var humic = new Humi { huKey = 2 , huVal = "c" };
var humid = new Humi { huKey = 2 , huVal = "d" };
I want to create a single IDictionary <int,string[]>
with key 1 containing ["a","b"] and key 2 containing ["c","d"]
Can anyone point out a decent way to do that conversion with LINQ?
Thanks.
var myDict = someHumi
.GroupBy(h => h.huKey)
.ToDictionary(
g => g.Key,
g => g.Select(h => h.huVal).ToArray());
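Group the items into an IEnumerable<IGrouping<int, Humi>> and then project that into a dictionary. Note that .ToDictionary returns a concrete Dictionary<TKey, TValue>, which implements IDictionary<TKey, TValue>, so the result can be assigned to a variable of the interface type.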
You can use ToLookup(), which allows each key to hold multiple values, exactly your scenario (note, though, that each key would hold an IEnumerable<string> of values rather than an array):
var myLookup = someHumi.ToLookup(x => x.huKey, x => x.huVal);
foreach (var item in myLookup)
{
Console.WriteLine("{0} contains: {1}", item.Key, string.Join(",", item));
}
Output:
1 contains: a,b
2 contains: c,d
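If the IDictionary<int, string[]> from the question is still needed, the lookup can be projected into one. The sketch below is a self-contained illustration; LookupConversion and ToArrayDictionary are hypothetical names, and KeyValuePair<int, string> stands in for the Humi class (Key for huKey, Value for huVal).

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LookupConversion
{
    // Hypothetical helper: groups the values by key and stores each
    // group as an array, keeping the original order within a key.
    public static IDictionary<int, string[]> ToArrayDictionary(
        IEnumerable<KeyValuePair<int, string>> pairs)
    {
        return pairs
            .ToLookup(p => p.Key, p => p.Value)
            .ToDictionary(g => g.Key, g => g.ToArray());
    }
}
```

One design difference worth keeping in mind: indexing a lookup with a missing key returns an empty sequence, whereas indexing the dictionary with a missing key throws a KeyNotFoundException.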
