How do you use linq to group records based on an accumulator? - linq

Given an enumeration of records in the format:
Name (string)
Amount (number)
For example:
Laverne 4
Lenny 2
Shirley 3
Squiggy 5
I want to group the records, so that each group's total Amount does not exceed some limit-per-group. For example, 10.
Group 1 (Laverne,Lenny,Shirley) with Total Amount 9
Group 2 (Squiggy) with Total Amount 5
The Amount number is guaranteed to always be less than the grouping limit.

If you allow for captured variables to maintain state, then it becomes easier. If we have:
int limit = 10;
Then:
int groupTotal = 0;
int groupNum = 0;
var grouped = records.Select(r =>
{
int newCount = groupTotal + r.Amount;
if (newCount > limit)
{
groupNum++;
groupTotal = r.Amount;
}
else
groupTotal = newCount;
return new{Records = r, Group = groupNum};
}
).GroupBy(g => g.Group, g => g.Records);
It's O(n), and just a Select and a GroupBy, but the use of captured variables may not be as portable across providers as one may want though.
For linq-to-objects though, it's fine.

Here I have a solution using only LINQ functions:
// Record definition
class Record
{
public string Name;
public int Amount;
public Record(string name, int amount)
{
Name = name;
Amount = amount;
}
}
// actual code for setup and LINQ
List<Record> records = new List<Record>()
{
new Record("Laverne", 4),
new Record("Lenny", 2),
new Record("Shirley", 3),
new Record("Squiggy", 5)
};
int groupLimit = 10;
// the solution
List<Record[]> test =
records.GroupBy(record => records.TakeWhile(r => r != record)
.Concat(new[] { record })
.Sum(r => r.Amount) / (groupLimit + 1))
.Select(g => g.ToArray()).ToList();
This gives the correct result:
test =
{
{ [ "Laverne", 4 ], [ "Lenny", 2 ], [ "shirley", 3 ] },
{ [ "Squiggly", 5 ] }
}
The only downside is that this is O(n2). It essentially groups by the index of the group (as defined by using the sum of the record up to the current one). Note that groupLimit + 1 is needed so that we actually include groups from 0 to groupLimit, inclusive.
I'm trying to find a way of making it prettier, but it doesn't look easy.

A dotnet fiddle with a solution using Aggregate:
https://dotnetfiddle.net/gVgONH
using System;
using System.Collections.Generic;
using System.Linq;
public class Program
{
// Record definition
public class Record
{
public string Name;
public int Amount;
public Record(string name, int amount)
{
Name = name;
Amount = amount;
}
}
public static void Main()
{
// actual code for setup and LINQ
List<Record> records = new List<Record>()
{
new Record("Alice", 1), new Record("Bob", 5), new Record("Charly", 4), new Record("Laverne", 4), new Record("Lenny", 2), new Record("Shirley", 3), new Record("Squiggy", 5)}
;
int groupLimit = 10;
int sum = 0;
var result = records.Aggregate(new List<List<Record>>(), (accumulated, next) =>
{
if ((sum + next.Amount >= groupLimit) || accumulated.Count() == 0)
{
Console.WriteLine("New team: " + accumulated.Count());
accumulated.Add(new List<Record>());
sum = 0;
}
sum += next.Amount;
Console.WriteLine("New member {0} ({1}): adds up to {2} ", next.Name, next.Amount, sum);
accumulated.Last().Add(next);
return accumulated;
}
);
Console.WriteLine("Team count: " + result.Count());
}
}
With output:
New team: 0
New member Alice (1): adds up to 1
New member Bob (5): adds up to 6
New team: 1
New member Charly (4): adds up to 4
New member Laverne (4): adds up to 8
New team: 2
New member Lenny (2): adds up to 2
New member Shirley (3): adds up to 5
New team: 3
New member Squiggy (5): adds up to 5
Team count: 4

There is no 'performant' way to do this with the built in Linq operators that I am aware of. You could create your own extension method, though:
public static class EnumerableExtensions
{
public static IEnumerable<TResult> GroupWhile<TSource, TAccumulation, TResult>(
this IEnumerable<TSource> source,
Func<TAccumulation> seedFactory,
Func<TAccumulation, TSource, TAccumulation> accumulator,
Func<TAccumulation, bool> predicate,
Func<TAccumulation, IEnumerable<TSource>, TResult> selector)
{
TAccumulation accumulation = seedFactory();
List<TSource> result = new List<TSource>();
using(IEnumerator<TSource> enumerator = source.GetEnumerator())
{
while(enumerator.MoveNext())
{
if(!predicate(accumulator(accumulation, enumerator.Current)))
{
yield return selector(accumulation, result);
accumulation = seedFactory();
result = new List<TSource>();
}
result.Add(enumerator.Current);
accumulation = accumulator(accumulation, enumerator.Current);
}
if(result.Count > 0)
{
yield return selector(accumulation, result);
}
}
}
}
And then call it like this:
int limit = 10;
var groups =
records
.GroupWhile(
() => 0,
(a, x) => a + x,
(a) => a <= limit,
(a, g) => new { Total = a, Group = g });
The way it is currently written, if any single record exceeds that limit then that record is returned by itself. You could modify it to exclude records that exceed the limit or leave it as is and perform the exclusion with Where.
This solution has O(n) runtime.

Related

Simple Linq expression Sum - Not returning null value for empty list [duplicate]

This question already has answers here:
How to make a linq Sum return null if the summed values are all null
(5 answers)
Keep null when adding Nullable int?
(1 answer)
Closed 2 years ago.
I am creating a simple test method for sum of list to render null when the list is not rendering any value
[Fact]
public void GetSum()
{
List<TestClass> list = new List<TestClass>();
list.Add(new TestClass
{
Amount = null,
Id = 1
});
list.Add(new TestClass
{
Amount = null,
Id = 0
});
IQueryable<TestClass> classes = list.AsQueryable();
var sum = classes.AsEnumerable().Where(i => i.Id > 1).Select(i => i.Amount).Sum();
var sum1 = classes.AsEnumerable().Sum(i => i.Amount);
Assert.NotNull(sum);
}
#endregion Impact Ratio
}
public class TestClass
{
public int Id { get; set; }
public double? Amount { get; set; }
}
sum and sum1 both render 0, I want them up to be null in case the list is not having any appropriate value. Am I missing anything out in this ?
the Sum() function works as it does so you will have to use another function to achieve what you want. One possibility is the Aggregate() function which allows you to define how you want the summation to work.
For example this seems to achieve what you want:
[Fact]
public void GetSum()
{
var classes = new[]
{
new TestClass {Amount = null, Id = 1},
new TestClass {Amount = 1, Id = 0}
};
var sum1 = classes.Where(i => i.Id > 1)
.Select(i => i.Amount)
.Aggregate((double?)null, (a, b) => a.HasValue && b.HasValue ? a + b : a ?? b);
var sum2 = classes.Select(i => i.Amount)
.Aggregate((double?)null, (a, b) => a.HasValue && b.HasValue ? a + b : a ?? b);
Assert.Null(sum1);
Assert.NotNull(sum2);
}
Note that this method only works when you have an IEnumerable - it does not work with IQueryable, but you convert your IQueryable to IEnumerable in your own example so I guess that is ok...

Sort a List<object> by two properties one in ascending and the other in descending order in dart

I saw examples where I can sort a list in dart using one property in flutter(dart).
But how can I do the functionality which an SQL query does like for example:
order by points desc, time asc
You can sort the list then sort it again..
Here is a sample I made from dartpad.dev
void main() {
Object x = Object(name: 'Helloabc', i: 1);
Object y = Object(name: 'Othello', i: 3);
Object z = Object(name: 'Avatar', i: 2);
List<Object> _objects = [
x, y, z
];
_objects.sort((a, b) => a.name.length.compareTo(b.name.length));
/// second sorting
// _objects.sort((a, b) => a.i.compareTo(b.i));
for (Object a in _objects) {
print(a.name);
}
}
class Object {
final String name;
final int i;
Object({this.name, this.i});
}
I was able to find an answer for this. Thanks to #pskink and the url
https://www.woolha.com/tutorials/dart-sorting-list-with-comparator-and-comparable.
I implemented the Comparable to sort by the two properties.
class Sample implements Comparable<Sample> {
final int points;
final int timeInSeconds;
Sample(
{
this.points,
this.timeInSeconds});
#override
int compareTo(Sample other) {
int pointDifference = points- other.points;
return pointDifference != 0
? pointDifference
: other.timeInSeconds.compareTo(this.timeInSeconds);
}
}
sampleList.sort();

How to use LINQ to find all items in list which have the most members in another list?

Given:
class Item {
public int[] SomeMembers { get; set; }
}
var items = new []
{
new Item { SomeMembers = new [] { 1, 2 } }, //0
new Item { SomeMembers = new [] { 1, 2 } }, //1
new Item { SomeMembers = new [] { 1 } } //2
}
var secondList = new int[] { 1, 2, 3 };
I need to find all the Items in items with the most of it's SomeMembers occurring in secondList.
In the example above I would expect Items 0 and 1 to be returned but not 2.
I know I could do it with things like loops or Contains() but it seems there must be a more elegant or efficient way?
This can be written pretty easily:
var result = items.Where(item => item.SomeMembers.Count(secondList.Contains) * 2
>= item.SomeMembers.Length);
Or possibly (I can never guess whether method group conversions will work):
var result = items.Where(item => item.SomeMembers.Count(x => secondList.Contains(x)) * 2
>= item.SomeMembers.Length);
Or to pull it out:
Func<int, bool> inSecondList = secondList.Contains;
var result = items.Where(item => item.SomeMembers.Count(inSecondList) * 2
>= item.SomeMembers.Length);
If secondList becomes large, you should consider using a HashSet<int> instead.
EDIT: To avoid evaluating SomeMembers twice, you could create an extension method:
public static bool MajoritySatisfied<T>(this IEnumerable<T> source,
Func<T, bool> condition)
{
int total = 0, satisfied = 0;
foreach (T item in source)
{
total++;
if (condition(item))
{
satisfied++;
}
}
return satisfied * 2 >= total;
}
Then:
var result = items.Where(item => item.MajoritySatisfied(secondList.Contains));

Algorithm for unique combinations

I've been trying to find a way to get a list of unique combinations from a list of objects nested in a container. Objects within the same group cannot be combined. Objects will be unique across all the groups
Example:
Group 1: (1,2)
Group 2: (3,4)
Result
1
2
3
4
1,3
1,4
2,3
2,4
If we add another group like so:
Group 1: (1,2)
Group 2: (3,4)
Group 3: (5,6,7)
The result would be
1
2
3
4
5
6
7
1,3
1,4
1,5
1,6
1,7
2,3
2,4
2,5
2,6
2,7
3,5
3,6
3,7
4,5
4,6
4,7
1,3,5
1,3,6
1,3,7
1,4,5
1,4,6
1,4,7
2,3,5
2,3,6
2,3,7
2,4,5
2,4,6
2,4,7
I may have missed a combination above, but the combinations mentioned should be enough indication.
I have a possibility of having up 7 groups, and 20 groups in each object.
I'm trying to avoid having code that knows that it's doing combinations of doubles, triples, quadruples etc, but I'm hitting a lot of logic bumps along the way.
To be clear, I'm not asking for code, and more for an approach, pseudo code or an indication would do great.
UPDATE
Here's what I have after seeing those two answers.
From #Servy's answer:
public static IEnumerable<IEnumerable<T>> GetCombinations<T>(this IEnumerable<IEnumerable<T>> sequences)
{
var defaultArray = new[] { default(T) };
return sequences.Select(sequence =>
sequence.Select(item => item).Concat(defaultArray))
.CartesianProduct()
.Select(sequence =>
sequence.Where(item => !item.Equals(default(T)))
.Select(item => item));
}
public static IEnumerable<IEnumerable<T>> CartesianProduct<T>(this IEnumerable<IEnumerable<T>> sequences)
{
IEnumerable<IEnumerable<T>> emptyProduct = new[] { Enumerable.Empty<T>() };
return sequences.Aggregate(
emptyProduct,
(accumulator, sequence) =>
from accseq in accumulator
from item in sequence
select accseq.Concat(new[] { item })
);
}
From #AK_'s answer
public static IEnumerable<IEnumerable<T>> GetCombinations<T>(this IEnumerable<IEnumerable<T>> groups)
{
if (groups.Count() == 0)
{
yield return new T[0];
}
if (groups.Count() == 1)
{
foreach (var t in groups.First())
{
yield return new T[] { t };
}
}
else
{
var furtherResult = GetCombinations(groups.Where(x => x != groups.Last()));
foreach (var result in furtherResult)
{
yield return result;
}
foreach (var t in groups.Last())
{
yield return new T[] { t };
foreach (var result in furtherResult)
{
yield return result.Concat(new T[] { t });
}
}
}
}
Usage for both
List<List<int>> groups = new List<List<int>>();
groups.Add(new List<int>() { 1, 2 });
groups.Add(new List<int>() { 3, 4, 5 });
groups.Add(new List<int>() { 6, 7 });
groups.Add(new List<int>() { 8, 9 });
groups.Add(new List<int>() { 10, 11 });
var x = groups.GetCombinations().Where(g => g.Count() > 0).ToList().OrderBy(y => y.Count());
What would be considered the best solution? To be honest, I am able to read what's happening with #AK_'s solution much easier (had to look for a solution on how to get Cartesian Product).
So first off consider the problem of a Cartesian Product of N sequences. That is, every single combination of one value from each of the sequences. Here is a example of an implementation of that problem, with an amazing explanation.
But how do we handle the cases where the output combination has a size smaller than the number of sequences? Alone that only handles the case where the given sequences are the same size as the number of sequences. Well, imagine for a second that every single input sequence has a "null" value. That null value gets paired with every single combination of values from the other sequences (including all of their null values). We can then remove these null values at the very end, and voila, we have every combination of every size.
To do this, while still allowing the input sequences to actually use the C# literal null values, or the default value for that type (if it's not nullable) we'll need to wrap the type. We'll create a wrapper that wraps the real value, while also having it's own definition of a def ult/null value. From there we map each of our sequences into a sequence of wrappers, append the actual default value onto the end, compute the Cartesian Product, and then map the combinations back to "real" values, filtering out the default values while we're at it.
If you don't want to see the actual code, stop reading here.
public class Wrapper<T>
{
public Wrapper(T value) { Value = value; }
public static Wrapper<T> Default = new Wrapper<T>(default(T));
public T Value { get; private set; }
}
public static IEnumerable<IEnumerable<T>> Foo<T>
(this IEnumerable<IEnumerable<T>> sequences)
{
return sequences.Select(sequence =>
sequence.Select(item => new Wrapper<T>(item))
.Concat(new[] { Wrapper<T>.Default }))
.CartesianProduct()
.Select(sequence =>
sequence.Where(wrapper => wrapper != Wrapper<T>.Default)
.Select(wrapper => wrapper.Value));
}
In C#
this is actually a monad... I think...
IEnumerable<IEnumerable<int>> foo (IEnumerable<IEnumerable<int>> groups)
{
if (groups.Count == 0)
{
return new List<List<int>>();
}
if (groups.Count == 1)
{
foreach(van num in groups.First())
{
return yield new List<int>(){num};
}
}
else
{
var furtherResult = foo(groups.Where(x=> x != groups.First()));
foreach (var result in furtherResult)
{
yield return result;
}
foreach(van num in groups.First())
{
yield return new List<int>(){num};
foreach (var result in furtherResult)
{
yield return result.Concat(num);
}
}
}
}
a better version:
public static IEnumerable<IEnumerable<T>> foo<T> (IEnumerable<IEnumerable<T>> groups)
{
if (groups.Count() == 0)
{
return new List<List<T>>();
}
else
{
var firstGroup = groups.First();
var furtherResult = foo(groups.Skip(1));
IEnumerable<IEnumerable<T>> myResult = from x in firstGroup
select new [] {x};
myResult = myResult.Concat( from x in firstGroup
from result in furtherResult
select result.Concat(new T[]{x}));
myResult = myResult.Concat(furtherResult);
return myResult;
}
}

How to find a word from arrays of characters?

What is the best way to solve this:
I have a group of arrays with 3-4 characters inside each like so:
{p, {a, {t, {m,
q, b, u, n,
r, c v o
s } } }
}
I also have an array of dictionary words.
What is the best/fastest way to find if the array of characters can combine to form one of the dictionary words? For example, the above arrays could make the words:
"pat","rat","at","to","bum"(lol)but not "nub" or "mat"Should i loop through the dictionary to see if words can be made or get all the combinations from the letters then compare those to the dictionary
I had some Scrabble code laying around, so I was able to throw this together. The dictionary I used is sowpods (267751 words). The code below reads the dictionary as a text file with one uppercase word on each line.
The code is C#:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Diagnostics;
namespace SO_6022848
{
public struct Letter
{
public const string Chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
public static implicit operator Letter(char c)
{
return new Letter() { Index = Chars.IndexOf(c) };
}
public int Index;
public char ToChar()
{
return Chars[Index];
}
public override string ToString()
{
return Chars[Index].ToString();
}
}
public class Trie
{
public class Node
{
public string Word;
public bool IsTerminal { get { return Word != null; } }
public Dictionary<Letter, Node> Edges = new Dictionary<Letter, Node>();
}
public Node Root = new Node();
public Trie(string[] words)
{
for (int w = 0; w < words.Length; w++)
{
var word = words[w];
var node = Root;
for (int len = 1; len <= word.Length; len++)
{
var letter = word[len - 1];
Node next;
if (!node.Edges.TryGetValue(letter, out next))
{
next = new Node();
if (len == word.Length)
{
next.Word = word;
}
node.Edges.Add(letter, next);
}
node = next;
}
}
}
}
class Program
{
static void GenWords(Trie.Node n, HashSet<Letter>[] sets, int currentArrayIndex, List<string> wordsFound)
{
if (currentArrayIndex < sets.Length)
{
foreach (var edge in n.Edges)
{
if (sets[currentArrayIndex].Contains(edge.Key))
{
if (edge.Value.IsTerminal)
{
wordsFound.Add(edge.Value.Word);
}
GenWords(edge.Value, sets, currentArrayIndex + 1, wordsFound);
}
}
}
}
static void Main(string[] args)
{
const int minArraySize = 3;
const int maxArraySize = 4;
const int setCount = 10;
const bool generateRandomInput = true;
var trie = new Trie(File.ReadAllLines("sowpods.txt"));
var watch = new Stopwatch();
var trials = 10000;
var wordCountSum = 0;
var rand = new Random(37);
for (int t = 0; t < trials; t++)
{
HashSet<Letter>[] sets;
if (generateRandomInput)
{
sets = new HashSet<Letter>[setCount];
for (int i = 0; i < setCount; i++)
{
sets[i] = new HashSet<Letter>();
var size = minArraySize + rand.Next(maxArraySize - minArraySize + 1);
while (sets[i].Count < size)
{
sets[i].Add(Letter.Chars[rand.Next(Letter.Chars.Length)]);
}
}
}
else
{
sets = new HashSet<Letter>[] {
new HashSet<Letter>(new Letter[] { 'P', 'Q', 'R', 'S' }),
new HashSet<Letter>(new Letter[] { 'A', 'B', 'C' }),
new HashSet<Letter>(new Letter[] { 'T', 'U', 'V' }),
new HashSet<Letter>(new Letter[] { 'M', 'N', 'O' }) };
}
watch.Start();
var wordsFound = new List<string>();
for (int i = 0; i < sets.Length - 1; i++)
{
GenWords(trie.Root, sets, i, wordsFound);
}
watch.Stop();
wordCountSum += wordsFound.Count;
if (!generateRandomInput && t == 0)
{
foreach (var word in wordsFound)
{
Console.WriteLine(word);
}
}
}
Console.WriteLine("Elapsed per trial = {0}", new TimeSpan(watch.Elapsed.Ticks / trials));
Console.WriteLine("Average word count per trial = {0:0.0}", (float)wordCountSum / trials);
}
}
}
Here is the output when using your test data:
PA
PAT
PAV
QAT
RAT
RATO
RAUN
SAT
SAU
SAV
SCUM
AT
AVO
BUM
BUN
CUM
TO
UM
UN
Elapsed per trial = 00:00:00.0000725
Average word count per trial = 19.0
And the output when using random data (does not print each word):
Elapsed per trial = 00:00:00.0002910
Average word count per trial = 62.2
EDIT: I made it much faster with two changes: Storing the word at each terminal node of the trie, so that it doesn't have to be rebuilt. And storing the input letters as an array of hash sets instead of an array of arrays, so that the Contains() call is fast.
There are probably many way of solving this.
What you are interested in is the number of each character you have available to form a word, and how many of each character is required for each dictionary word. The trick is how to efficiently look up this information in the dictionary.
Perhaps you can use a prefix tree (a trie), some kind of smart hash table, or similar.
Anyway, you will probably have to try out all your possibilities and check them against the dictionary. I.e., if you have three arrays of three values each, there will be 3^3+3^2+3^1=39 combinations to check out. If this process is too slow, then perhaps you could stick a Bloom filter in front of the dictionary, to quickly check if a word is definitely not in the dictionary.
EDIT: Anyway, isn't this essentially the same as Scrabble? Perhaps try Googling for "scrabble algorithm" will give you some good clues.
The reformulated question can be answered just by generating and testing. Since you have 4 letters and 10 arrays, you've only got about 1 million possible combinations (10 million if you allow a blank character). You'll need an efficient way to look them up, use a BDB or some sort of disk based hash.
The trie solution previously posted should work as well, you are just restricted more by what characters you can choose at each step of the search. It should be faster as well.
I just made a very large nested for loop like this:
for(NSString*s1 in [letterList objectAtIndex:0]{
for(NSString*s2 in [letterList objectAtIndex:1]{
8 more times...
}
}
Then I do a binary search on the combination to see if it is in the dictionary and add it to an array if it is

Resources