Using GroupBy, Count and Sum in LINQ Lambda Expressions

I have a collection of boxes with the properties weight, volume and owner.
I want to use LINQ to get a list of the box information summarized by owner, e.g.:
**Owner, Boxes, Total Weight, Total Volume**
Jim, 5, 1430.00, 3.65
George, 2, 37.50, 1.22
Can someone show me how to do this with Lambda expressions?

var ListByOwner = list.GroupBy(l => l.Owner)
.Select(lg =>
new {
Owner = lg.Key,
Boxes = lg.Count(),
TotalWeight = lg.Sum(w => w.Weight),
TotalVolume = lg.Sum(w => w.Volume)
});

var q = from b in listOfBoxes
group b by b.Owner into g
select new
{
Owner = g.Key,
Boxes = g.Count(),
TotalWeight = g.Sum(item => item.Weight),
TotalVolume = g.Sum(item => item.Volume)
};

var boxSummary = from b in boxes
group b by b.Owner into g
let nrBoxes = g.Count()
let totalWeight = g.Sum(w => w.Weight)
let totalVolume = g.Sum(v => v.Volume)
select new { Owner = g.Key, Boxes = nrBoxes,
TotalWeight = totalWeight,
TotalVolume = totalVolume };
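
For anyone who wants to run it, here is a minimal self-contained sketch of the method-syntax version. The individual boxes are made up for illustration (the question only shows the summary), and tuples stand in for the box class.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; the question does not list individual boxes.
var boxes = new[]
{
    (Owner: "Jim", Weight: 10.5m, Volume: 0.25m),
    (Owner: "Jim", Weight: 4.5m, Volume: 0.75m),
    (Owner: "George", Weight: 37.5m, Volume: 1.22m)
};

// One output row per owner: box count plus summed weight and volume.
var summary = boxes
    .GroupBy(b => b.Owner)
    .Select(g => new
    {
        Owner = g.Key,
        Boxes = g.Count(),
        TotalWeight = g.Sum(b => b.Weight),
        TotalVolume = g.Sum(b => b.Volume)
    })
    .ToList();

foreach (var row in summary)
    Console.WriteLine($"{row.Owner}, {row.Boxes}, {row.TotalWeight}, {row.TotalVolume}");
```

With LINQ to Objects, the groups come out in the order their keys first appear in the source, so Jim is listed before George here.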

Related

Default values for empty groups in Linq GroupBy query

I have a data set of values that I want to summarise in groups. For each group, I want to create an array big enough to contain the values of the largest group. When a group contains fewer values than this maximum, I want to insert a default value of zero for the missing key values.
Dataset
Col1 Col2 Value
--------------------
A X 10
A Z 15
B X 9
B Y 12
B Z 6
Desired result
X, [10, 9]
Y, [0, 12]
Z, [15, 6]
Note that value "A" in Col1 in the dataset has no value for "Y" in Col2. Value "A" is the first group in the outer series, therefore it is the first element that is missing.
The following query creates the result dataset, but does not insert the default zero values for the Y group.
result = data.GroupBy(item => item.Col2)
.Select(group => new
{
name = group.Key,
data = group.Select(item => item.Value)
.ToArray()
})
Actual result
X, [10, 9]
Y, [12]
Z, [15, 6]
What do I need to do to insert a zero as the missing group value?
Here is how I understand it.
Let's say we have this:
class Data
{
public string Col1, Col2;
public decimal Value;
}
Data[] source =
{
new Data { Col1="A", Col2 = "X", Value = 10 },
new Data { Col1="A", Col2 = "Z", Value = 15 },
new Data { Col1="B", Col2 = "X", Value = 9 },
new Data { Col1="B", Col2 = "Y", Value = 12 },
new Data { Col1="B", Col2 = "Z", Value = 6 },
};
First we need to determine the "fixed" part, i.e. the distinct Col1 values that every result array must cover:
var columns = source.Select(e => e.Col1).Distinct().OrderBy(c => c).ToList();
Then we can proceed with the normal grouping, but inside each group we left join the columns with the group elements, which yields a zero wherever a column has no matching element:
var result = source.GroupBy(e => e.Col2, (key, elements) => new
{
Key = key,
Elements = (from c in columns
join e in elements on c equals e.Col1 into g
from e in g.DefaultIfEmpty()
select e != null ? e.Value : 0).ToList()
})
.OrderBy(e => e.Key)
.ToList();
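It won't be pretty, but you can do something like this (note that the zeros are appended at the end of each array, not at the position of the missing value):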
var groups = data.GroupBy(d => d.Col2, d => d.Value)
.Select(g => new { g, count = g.Count() })
.ToList();
int maxG = groups.Max(p => p.count);
var paddedGroups = groups.Select(p => new {
name = p.g.Key,
data = p.g.Concat(Enumerable.Repeat(0, maxG - p.count)).ToArray() });
You can do it like this:
int maxCount = 0;
var result = data.GroupBy(x => x.Col2)
.OrderByDescending(x => x.Count())
.Select(x =>
{
if (maxCount == 0)
maxCount = x.Count();
var Value = x.Select(z => z.Value);
return new
{
name = x.Key,
data = maxCount == x.Count() ? Value.ToArray() :
Value.Concat(new int[maxCount - Value.Count()]).ToArray()
};
});
Code explanation:
Since zeros must be appended to any group that has fewer items than the largest one, I first order the groups by item count in descending order, so the first group projected is the largest, and its count is stored in maxCount. While projecting, if the number of items in a group is not equal to maxCount, I create an integer array of size maxCount - x.Count() (a new int array is zero-filled) and append it to the group's values.
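
The same padding idea can be sketched without relying on a side effect inside Select, by computing the largest group size up front. This is only a sketch: it assumes Value is an int, uses tuples in place of the real row type, and (like the code above) pads at the end of the array, so Y comes out as [12, 0] rather than the [0, 12] the question asks for.

```csharp
using System;
using System.Linq;

// The question's data set, with tuples standing in for the row type.
var data = new[]
{
    (Col1: "A", Col2: "X", Value: 10),
    (Col1: "A", Col2: "Z", Value: 15),
    (Col1: "B", Col2: "X", Value: 9),
    (Col1: "B", Col2: "Y", Value: 12),
    (Col1: "B", Col2: "Z", Value: 6)
};

// Group the values by Col2 and find the size of the largest group.
var groups = data.GroupBy(d => d.Col2, d => d.Value).ToList();
int maxCount = groups.Max(g => g.Count());

// Append trailing zeros so every array has maxCount elements.
var padded = groups
    .Select(g => new
    {
        name = g.Key,
        data = g.Concat(Enumerable.Repeat(0, maxCount - g.Count())).ToArray()
    })
    .OrderBy(r => r.name)
    .ToList();

foreach (var r in padded)
    Console.WriteLine($"{r.name}, [{string.Join(", ", r.data)}]");
```

If the zero has to sit at the position of the missing Col1 value, the left-join answer above is the one to use.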

Linq FirstOrDefault List inside a query

I have this linq query
var numberGroups =
from n in VISRUBs.Where(a => a.VISANA.VISITE.DATEVIS <= d && a.VISANA.VISITE.PANUM == p)
group n by n.RUBRIQUE into g
select new {
RemainderCHAPLIB = g.Key.ANALYSE.CHAPITRE.LIBELLE,
RemainderLIB = g.Key.LIBELLE,
RemainderRUNUM = g.Key.RUNUM,
vals = from vlist in g.OrderByDescending(a=>a.VISANA.VISITE.DATEVIS)
select vlist.VALEUR
};
which gives me this result in Linqpad
What I want is to select the first and second item from the last field (vals) which is a List<string>.
I have tried this:
var numberGroups =
from n in VISRUBs.Where(a => a.VISANA.VISITE.DATEVIS <= d && a.VISANA.VISITE.PANUM == p)
group n by n.RUBRIQUE into g
select new {
RemainderCHAPLIB = g.Key.ANALYSE.CHAPITRE.LIBELLE,
RemainderLIB = g.Key.LIBELLE,
RemainderRUNUM = g.Key.RUNUM,
vals = from vlist in g.OrderByDescending(a => a.VISANA.VISITE.DATEVIS)
select vlist.VALEUR
};
var lst = from n in numberGroups
select new
{
RemainderCHAPLIB = n.RemainderCHAPLIB,
RemainderLIB = n.RemainderLIB,
RemainderRUNUM = n.RemainderRUNUM,
VAL = n.vals.FirstOrDefault()
};
but it didn't work, I got an exception:
Dynamic SQL Error
SQL error code = -104
Token unknown - line 54, column 1
OUTER
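Found it! Calling ToList() on the grouped query first materializes the results, so the second projection runs in memory (LINQ to Objects) and the provider no longer has to translate FirstOrDefault and Skip into SQL: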
var lst = from n in numberGroups.ToList()
select new
{
RemainderCHAPLIB = n.RemainderCHAPLIB,
RemainderLIB = n.RemainderLIB,
RemainderRUNUM = n.RemainderRUNUM,
VAL = n.vals.FirstOrDefault(),
ANT = n.vals.Skip(1).FirstOrDefault()
};
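
To see the FirstOrDefault / Skip(1).FirstOrDefault pattern in isolation, here is a small in-memory sketch. The rows, field names and values are invented for illustration and only stand in for the materialized VISRUBs results.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory rows standing in for the materialized query results.
var rows = new[]
{
    (Rubrique: "GLUC", Date: new DateTime(2015, 1, 10), Valeur: "5.1"),
    (Rubrique: "GLUC", Date: new DateTime(2015, 3, 2), Valeur: "5.6"),
    (Rubrique: "UREE", Date: new DateTime(2015, 3, 2), Valeur: "0.30")
};

var lst = rows
    .GroupBy(r => r.Rubrique)
    .Select(g =>
    {
        // Newest value first within each group.
        var vals = g.OrderByDescending(r => r.Date).Select(r => r.Valeur).ToList();
        return new
        {
            Rubrique = g.Key,
            VAL = vals.FirstOrDefault(),         // most recent value
            ANT = vals.Skip(1).FirstOrDefault()  // previous value, or null if there is none
        };
    })
    .ToList();

foreach (var x in lst)
    Console.WriteLine($"{x.Rubrique}: VAL={x.VAL}, ANT={x.ANT ?? "(none)"}");
```

A group with a single value gets null for ANT, because Skip(1) leaves an empty sequence and FirstOrDefault returns the default for string.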

LINQ filtering on a Dictionary<string, IList<string>>

I have code similar to this:
var dict = new Dictionary<string, IList<string>>();
dict.Add("A", new List<string>{"1","2","3"});
dict.Add("B", new List<string>{"2","4"});
dict.Add("C", new List<string>{"3","5","7"});
dict.Add("D", new List<string>{"8","5","7", "2"});
var categories = new List<string>{"A", "B"};
//This gives me categories and their items matching the category list
var result = dict.Where(x => categories.Contains(x.Key));
Key Value
A 1, 2, 3
B 2, 4
What I would like to get is this:
A 2
B 2
So the keys and just the values that are in both lists. Is there a way to do this in LINQ?
Thanks.
Easy peasy:
string key1 = "A";
string key2 = "B";
var intersection = dict[key1].Intersect(dict[key2]);
In general:
var intersection =
categories.Select(c => dict[c].AsEnumerable()) // Intersect returns IEnumerable<string>, so the seed type must match
.Aggregate((s1, s2) => s1.Intersect(s2));
Here, I'm utilizing Enumerable.Intersect.
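
To get the shape the question asks for (each selected category paired with only the shared values), the aggregate intersection can be projected back onto the categories. A sketch reusing the question's data; the variable names common and result are mine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var dict = new Dictionary<string, IList<string>>
{
    ["A"] = new List<string> { "1", "2", "3" },
    ["B"] = new List<string> { "2", "4" },
    ["C"] = new List<string> { "3", "5", "7" },
    ["D"] = new List<string> { "8", "5", "7", "2" }
};
var categories = new List<string> { "A", "B" };

// Values present in every selected category's list.
var common = categories
    .Select(c => dict[c].AsEnumerable())
    .Aggregate((s1, s2) => s1.Intersect(s2))
    .ToList();

// Pair each selected category with the shared values.
var result = categories.ToDictionary(c => c, c => common);

foreach (var pair in result)
    Console.WriteLine($"{pair.Key} {string.Join(", ", pair.Value)}");
```

Note that Aggregate without a seed throws on an empty sequence, so this assumes categories contains at least one key, and that every key exists in dict.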
A somewhat dirty way of doing it...
var results = from c in categories
join d in dict on c equals d.Key
select d.Value;
//Get the limited intersections
IEnumerable<string> intersections = results.First();
foreach(var valueSet in results)
{
intersections = intersections.Intersect(valueSet);
}
var final = from c in categories
join i in intersections on 1 equals 1
select new {Category = c, Intersections = i};
Assuming 2 and 3 are common to both lists, this will produce the following:
A 2
A 3
B 2
B 3

LINQ Grouping: Is there a cleaner way to do this without a for loop

I am trying to create a very simple distribution chart, and I want to display the counts of test score percentages in their corresponding 10's ranges.
I thought about just grouping on Math.Round((d.Percentage/10-0.5),0)*10, which should give me the 10's value, but I wasn't sure of the best way to do this, given that I would probably have missing ranges and all ranges need to appear even if the count is zero. I also thought about doing an outer join on the ranges array, but since I'm fairly new to LINQ, for the sake of time I opted for the code below. I would, however, like to know what a better way might be.
Also note: As I tend to work with larger teams with varying experience levels, I'm not all that crazy about ultra compact code unless it remains very readable to the average developer.
Any suggestions?
public IEnumerable<TestDistribution> GetDistribution()
{
var distribution = new List<TestDistribution>();
var ranges = new int[] { 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110 };
var labels = new string[] { "0%'s", "10%'s", "20%'s", "30%'s", "40%'s", "50%'s", "60%'s", "70%'s", "80%'s", "90%'s", "100%'s", ">110% "};
for (var n = 0; n < ranges.Count(); n++)
{
var count = 0;
var min = ranges[n];
var max = (n == ranges.Count() - 1) ? decimal.MaxValue : ranges[n+1];
count = (from d in Results
where d.Percentage>= min
&& d.Percentage<max
select d)
.Count();
distribution.Add(new TestDistribution() { Label = labels[n], Frequency = count });
}
return distribution;
}
// ranges and labels in a list of pairs of them
var rangesWithLabels = ranges.Zip(labels, (r,l) => new {Range = r, Label = l});
// create a list of intervals (i.e. 0-10, 10-20, ..., 110 - max value)
var rangeMinMax = ranges.Zip(ranges.Skip(1), (min, max) => new {Min = min, Max = max})
.Union(new[] {new {Min = ranges.Last(), Max = Int32.MaxValue}});
//the grouping is made by the lower bound of the interval found for some Percentage
var resultsDistribution = from c in Results
group c by
rangeMinMax.FirstOrDefault(r=> r.Min <= c.Percentage && c.Percentage < r.Max).Min into g
select new {Percentage = g.Key, Frequency = g.Count() };
// left join between the labels and the results with frequencies
var distributionWithLabels =
from l in rangesWithLabels
join r in resultsDistribution on l.Range equals r.Percentage
into rd
from r in rd.DefaultIfEmpty()
select new TestDistribution{
Label = l.Label,
Frequency = r != null ? r.Frequency : 0
};
distribution = distributionWithLabels.ToList();
Another solution, if the ranges and labels can be generated instead of hard-coded:
// 11 buckets (0-10, ..., 100-110) plus an open-ended one from 110, matching the question's ranges
var ranges = Enumerable.Range(0, 11)
.Select(c=> new {
Min = c * 10,
Max = (c +1 )* 10,
Label = (c * 10) + "%'s"})
.Union(new[] { new {
Min = 110,
Max = Int32.MaxValue,
Label = ">110% "
}});
var resultsDistribution = from c in Results
group c by ranges.FirstOrDefault(r=> r.Min <= c.Percentage && c.Percentage < r.Max).Min
into g
select new {Percentage = g.Key, Frequency = g.Count() };
var distributionWithLabels =
from l in ranges
join r in resultsDistribution on l.Min equals r.Percentage
into rd
from r in rd.DefaultIfEmpty()
select new TestDistribution{
Label = l.Label,
Frequency = r != null ? r.Frequency : 0
};
This works:
public IEnumerable<TestDistribution> GetDistribution()
{
var range = 12;
return Enumerable.Range(0, range).Select(
n => new TestDistribution
{
Label = string.Format("{1}{0}%'s", n*10, n==range-1 ? ">" : ""),
Frequency =
Results.Count(
d =>
d.Percentage >= n*10
&& d.Percentage < ((n == range - 1) ? decimal.MaxValue : (n+1)*10))
});
}
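
The "group on the 10's value" idea from the question can also be sketched with integer bucketing followed by a lookup over every bucket, so empty ranges still appear. This is only a sketch: the scores are made up, percentages are assumed to be non-negative and small enough to cast to int, and an anonymous type stands in for TestDistribution.

```csharp
using System;
using System.Linq;

// Hypothetical scores; the question does not list any.
var percentages = new decimal[] { 5m, 12m, 18m, 55m, 100m, 130m };

const int bucketCount = 12; // 0, 10, ..., 100, and 110+

// Count results per bucket index; anything at or above 110 lands in the last bucket.
var counts = percentages
    .GroupBy(p => Math.Min((int)(p / 10), bucketCount - 1))
    .ToDictionary(g => g.Key, g => g.Count());

// Walk every bucket so empty ranges still appear with a zero count.
var distribution = Enumerable.Range(0, bucketCount)
    .Select(n => new
    {
        Label = n == bucketCount - 1 ? ">110%" : $"{n * 10}%'s",
        Frequency = counts.TryGetValue(n, out var c) ? c : 0
    })
    .ToList();

foreach (var d in distribution)
    Console.WriteLine($"{d.Label}: {d.Frequency}");
```

Compared with counting inside the projection, this makes a single pass over the results and keeps the bucketing rule in one place.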

Convert to Lambda expression

I have the following expression
var q = from c in D1
join dp in
(from e in E1
group e by e.ID into g
select new { ID = g.Key, Cnt = g.Count() })
on c.ID
equals dp.ID
into dpp from v in dpp.DefaultIfEmpty()
select new { c.ID, Cnt = v == null ? 0 : v.Cnt }; // v is null when no E1 rows match
How can I convert this to a lambda expression?
Here's one way to go. This kind-of matches the above.
var subquery = E1
.GroupBy(e => e.ID)
.Select(g => new { ID = g.Key, Cnt = g.Count()});
//.ToList();
var q = D1
.GroupJoin(
subquery,
c => c.ID,
dp => dp.ID,
(c, g) => new {ID = c.ID, Cnt=g.Any() ? g.First().Cnt : 0 }
);
After refactoring, I came up with this:
var q = D1
.GroupJoin(
E1,
d => d.ID,
e => e.ID,
(d, g) => new {ID = d.ID, Cnt = g.Count()}
);
For comparison, the query comprehension form is:
var q = from d in D1
join e in E1 on d.ID equals e.ID into g
select new {ID = d.ID, Cnt = g.Count()};
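
A quick in-memory check of the refactored GroupJoin, as a sketch with invented data (D1 and E1 here are just tuple arrays, not the asker's tables):

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for D1 and E1; only the ID fields matter.
var D1 = new[] { (ID: 1, Name: "a"), (ID: 2, Name: "b"), (ID: 3, Name: "c") };
var E1 = new[] { (ID: 1, Tag: "x"), (ID: 1, Tag: "y"), (ID: 3, Tag: "z") };

// GroupJoin yields one result per D1 row, with g holding the matching E1 rows
// (empty when there is no match), so g.Count() is 0 for unmatched IDs.
var q = D1
    .GroupJoin(E1, d => d.ID, e => e.ID, (d, g) => new { d.ID, Cnt = g.Count() })
    .ToList();

foreach (var row in q)
    Console.WriteLine($"{row.ID}: {row.Cnt}");
```

Because the group is simply empty for an unmatched row, no DefaultIfEmpty or null check is needed in this form.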
Why would you want to convert it?
For complex queries like this one, the query syntax you have used here is usually clearer.

Resources