How do I create a nested group-by dictionary using LINQ? - linq

How would I create a nested grouping for the table below, using LINQ? I want to group by Code, then by Mktcode.
Code Mktcode Id
==== ======= ====
1 10 0001
2 20 0010
1 10 0012
1 20 0010
1 20 0014
2 20 0001
2 30 0002
1 30 0002
1 30 0005
I'd like a dictionary, in the end, like
Dictionary<Code, List<Dictionary<Mktcode, List<Id>>>>
So the values of this dictionary would be
{1, ({10,(0001,0012)}, {20,(0010,0014)}, {30, (0002, 0005)})},
{2, ({20,(0001, 0010)}, {30, (0002)} )}

I'd think of it this way:
You're primarily grouping by code, so do that first
For each group, you've still got a list of results - so apply another grouping there.
Something like:
var groupedByCode = source.GroupBy(x => x.Code);
var groupedByCodeAndThenId = groupedByCode.Select(group => new
{
    Key = group.Key,
    NestedGroup = group.ToLookup(result => result.MktCode, result => result.Id)
});
var dictionary = groupedByCodeAndThenId.ToDictionary(
    result => result.Key, result => result.NestedGroup);
That will give you a Dictionary<Code, ILookup<MktCode, Id>>, which I think is what you want. It's completely untested, though.
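A minimal runnable sketch of this approach, assuming the rows are anonymous objects with int Code, MktCode and Id properties (the `source` array reuses the question's table; top-level statements need C# 9 or later). It also shows that a lookup returns an empty sequence for a missing key instead of throwing:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample rows from the question's table (Ids written without leading zeros).
var source = new[]
{
    new { Code = 1, MktCode = 10, Id = 1 },
    new { Code = 2, MktCode = 20, Id = 10 },
    new { Code = 1, MktCode = 10, Id = 12 },
    new { Code = 1, MktCode = 20, Id = 10 },
    new { Code = 1, MktCode = 20, Id = 14 },
    new { Code = 2, MktCode = 20, Id = 1 },
    new { Code = 2, MktCode = 30, Id = 2 },
    new { Code = 1, MktCode = 30, Id = 2 },
    new { Code = 1, MktCode = 30, Id = 5 },
};

// Outer grouping by Code; each value is a lookup from MktCode to Ids.
Dictionary<int, ILookup<int, int>> byCode = source
    .GroupBy(x => x.Code)
    .ToDictionary(
        g => g.Key,
        g => g.ToLookup(x => x.MktCode, x => x.Id));

// Ids for Code 1, MktCode 20, in source order.
List<int> ids = byCode[1][20].ToList();
Console.WriteLine(string.Join(", ", ids));

// Code 2 has no MktCode 10: the lookup yields an empty sequence.
int missingCount = byCode[2][10].Count();
Console.WriteLine(missingCount);
```

The outer dictionary still throws for an unknown Code, so TryGetValue is the safer way in at that level.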

You can build lookup-like structures (a kind of Dictionary<TKey, List<TValue>>) using group ... by ... into:
var lines = new []
{
new {Code = 1, MktCode = 10, Id = 1},
new {Code = 2, MktCode = 20, Id = 10},
new {Code = 1, MktCode = 10, Id = 12},
new {Code = 1, MktCode = 20, Id = 10},
new {Code = 1, MktCode = 20, Id = 14},
new {Code = 2, MktCode = 20, Id = 1},
new {Code = 2, MktCode = 30, Id = 2},
new {Code = 1, MktCode = 30, Id = 2},
new {Code = 1, MktCode = 30, Id = 5},
};
var groups = from line in lines
             group line by line.Code
             into codeGroup
             select new
             {
                 Code = codeGroup.Key,
                 Items = from l in codeGroup
                         group l by l.MktCode into mktCodeGroup
                         select new
                         {
                             MktCode = mktCodeGroup.Key,
                             Ids = from mktLine in mktCodeGroup
                                   select mktLine.Id
                         }
             };
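Note that a query like this yields a lazily evaluated sequence of nested anonymous objects rather than a dictionary. A sketch of one way to materialize it, assuming a shortened version of the `lines` array (the `nested` name is introduced here for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Shortened sample data (an assumption for brevity; same shape as `lines`).
var lines = new[]
{
    new { Code = 1, MktCode = 10, Id = 1 },
    new { Code = 1, MktCode = 10, Id = 12 },
    new { Code = 1, MktCode = 20, Id = 14 },
    new { Code = 2, MktCode = 30, Id = 2 },
};

var groups = from line in lines
             group line by line.Code into codeGroup
             select new
             {
                 Code = codeGroup.Key,
                 Items = from l in codeGroup
                         group l by l.MktCode into mktCodeGroup
                         select new
                         {
                             MktCode = mktCodeGroup.Key,
                             Ids = from mktLine in mktCodeGroup
                                   select mktLine.Id
                         }
             };

// Materialize both levels: Code -> (MktCode -> list of Ids).
Dictionary<int, Dictionary<int, List<int>>> nested = groups.ToDictionary(
    g => g.Code,
    g => g.Items.ToDictionary(i => i.MktCode, i => i.Ids.ToList()));

foreach (var outer in nested)
    foreach (var inner in outer.Value)
        Console.WriteLine($"{outer.Key} / {inner.Key}: {string.Join(", ", inner.Value)}");
```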

Here's how I'd do it:
Dictionary<Code, Dictionary<MktCode, List<Id>>> myStructure =
    myList
        .GroupBy(e => e.Code)
        .ToDictionary(
            g => g.Key,
            g => g
                .GroupBy(e => e.Mktcode)
                .ToDictionary(
                    g2 => g2.Key,
                    g2 => g2.Select(e => e.Id).ToList()
                )
        );
Here's the breakdown of the process:
Group the elements by Code, and create an outer dictionary where the key is that code.
myList
.GroupBy(e => e.Code)
.ToDictionary(
g => g.Key,
For each of the keys in the outer dictionary, regroup the elements by Mktcode and create an inner dictionary.
g => g
.GroupBy(e => e.Mktcode)
.ToDictionary(
g2 => g2.Key,
For each of the keys in the inner dictionary, project the id of those elements and convert that to a list.
g2 => g2.Select(e => e.Id).ToList()
)
)

Related

Lua - Sort table and randomize ties

I have a table with two values, one is a name (string and unique) and the other is a number value (in this case hearts). What I want is this: sort the table by hearts but scramble randomly the items when there is a tie (e.g. hearts is equal). By a standard sorting function, in case of ties the order is always the same and I need it to be different every time the sorting function works.
This is an example:
tbl = {{name = "a", hearts = 5}, {name = "b", hearts = 2}, {name = "c", hearts = 6}, {name = "d", hearts = 2}, {name = "e", hearts = 2}, {name = "f", hearts = 7}}
sort1 = function (a, b) return a.hearts > b.hearts end
sort2 = function (a, b)
if a.hearts ~= b.hearts then return a.hearts > b.hearts
else return a.name > b.name end
end
table.sort(tbl, sort2)
local s = ""
for i = 1, #tbl do
s = s .. tbl[i].name .. "(" .. tbl[i].hearts .. ") "
end
print(s)
Now, with the function sort2 I think I have pinned down the problem: what happens when a.hearts == b.hearts? In my code the ties are simply ordered by name, which is not what I want. I have three ideas:
1. First scramble randomly all the items in the table, then apply sort1.
2. Add a value to every element of the table, called rnd, that is a random number. Then in sort2, when a.hearts == b.hearts, order the items by a.rnd > b.rnd.
3. In sort2, when a.hearts == b.hearts, randomly generate true or false and return it. This doesn't work, and I understand why: the random true/false makes the comparison function inconsistent, which crashes the sort.
I don't like 1 (because I would like to do everything inside the sorting function) or 2 (since it requires adding a value); I would like to do something like 3, but working. The question is: is there a simple way to do this, and what is the optimal way? (Maybe method 1 or 2 is optimal and I just don't see it.)
Bonus question. Moreover, I need to fix an item and sort the others. For example, suppose we want "c" to be first. Is it good to make a separate table with only the items to sort, sort the table and then add the fixed items?
-- example table
local tbl = {
{ name = "a", hearts = 5 },
{ name = "b", hearts = 2 },
{ name = "c", hearts = 6 },
{ name = "d", hearts = 2 },
{ name = "e", hearts = 2 },
{ name = "f", hearts = 7 },
}
-- avoid same results on subsequent requests
math.randomseed( os.time() )
---
-- Randomly sort a table
--
-- @param tbl Table to be sorted
-- @param corrections Table with your corrections
--
function rnd_sort( tbl, corrections )
local rnd = corrections or {}
table.sort( tbl,
function ( a, b)
rnd[a.name] = rnd[a.name] or math.random()
rnd[b.name] = rnd[b.name] or math.random()
return a.hearts + rnd[a.name] > b.hearts + rnd[b.name]
end )
end
---
-- Show the values of our table for debug purposes
--
function show( tbl )
local s = ""
for i = 1, #tbl do
s = s .. tbl[i].name .. "(" .. tbl[i].hearts .. ") "
end
print(s)
end
for i = 1, 10 do
rnd_sort(tbl)
show(tbl)
end
rnd_sort( tbl, {c=1000000} ) -- now "c" will be the first
show(tbl)
Here's a quick function for shuffling (scrambling) numerically indexed tables:
function shuffle(tbl) -- shuffles numeric indices
local len, random = #tbl, math.random ;
for i = len, 2, -1 do
local j = random( 1, i );
tbl[i], tbl[j] = tbl[j], tbl[i];
end
return tbl;
end
If you are free to introduce a new dependency, you can use lazylualinq to do the job for you (or check out how it sorts sequences, if you do not need the rest):
local from = require("linq")
math.randomseed(os.time())
tbl = {{name = "a", hearts = 5}, {name = "b", hearts = 2}, {name = "c", hearts = 6}, {name = "d", hearts = 2}, {name = "e", hearts = 2}, {name = "f", hearts = 7}}
from(tbl)
:orderBy("x => x.hearts")
:thenBy("x => math.random(-1, 1)")
:foreach(function(_, x) print(x.name, x.hearts) end)

LINQ filter on table to meet several criteria / DateDifference

I have a problem filtering a DataTable on different criteria. I know the first where-clause
where row.Field<TimeSpan>("DateDifference") >= TimeSpan.Zero
is why the third criterion isn't met. Is there any way to change my query to meet all requirements?
1. DateDifference should be positive.
2. The smallest DateDifference should be selected.
3. All InventoryChanges must be in the result. So a negative DateDifference is allowed if there is no positive DateDiff. The smallest negative DateDiff should be selected.
ArticleNo  Article    Price  PriceSet    InventoryChange  DateDifference  StockDifference
1          Article A  10     01.01.2012  02.01.2012       1               -2
1          Article A  11     01.06.2012  02.01.2012       -151            -2
2          Article B  14     01.01.2012  05.01.2012       4               1
2          Article B  14     01.01.2012  04.10.2012       277             -3
2          Article B  13     01.06.2012  05.01.2012       -148            1
2          Article B  13     01.06.2012  04.10.2012       125             -3
3          Article C  144    01.04.2012  28.02.2012       -33             -1
3          Article C  124    01.05.2012  28.02.2012       -63             -1
My result:
1 Article A 10 01.01.2012 02.01.2012 1 -2
2 Article B 14 01.01.2012 05.01.2012 4 1
2 Article B 13 01.06.2012 04.10.2012 125 -3
What I want to have is a table where the last row, where there is no positive DateDifference, is added.
The row with the smallest DateDifference should be selected:
1 Article A 10 01.01.2012 02.01.2012 1 -2
2 Article B 14 01.01.2012 05.01.2012 4 1
2 Article B 13 01.06.2012 04.10.2012 125 -3
3 Article C 144 01.04.2012 28.02.2012 -33 -1
My query so far:
var query = from row in InventoryChanges.AsEnumerable()
where row.Field<TimeSpan>("DateDifference") >= TimeSpan.Zero
group row by new
{
ArticleNo = row.Field<Int32>("ArticleNo"),
Article = row.Field<String>("Article"),
InventoryChange = row.Field<DateTime>("InventoryChange"),
StockDifference = row.Field<Int32>("StockDifference")
}
into grp
select new
{
ArticleNo = grp.Key.ArticleNo,
Article = grp.Key.Article,
InventoryChange = grp.Key.InventoryChange,
PriceSet = grp.Where(r => r.Field<TimeSpan>("DateDifference") == grp.Select(min => min.Field<TimeSpan>("DateDifference")).Min())
.Select(r => r.Field<DateTime>("PriceSet")).FirstOrDefault(),
DateDifference = grp.Select(r => r.Field<TimeSpan>("DateDifference")).Min(),
StockDifference = grp.Key.StockDifference,
Price = grp.Where(r => r.Field<TimeSpan>("DateDifference") == grp.Select(min => min.Field<TimeSpan>("DateDifference")).Min())
.Select(r => r.Field<Decimal>("Price")).FirstOrDefault(),
};
Any help is appreciated!
DataTable InventoryChanges = new DataTable("InventoryChanges");
InventoryChanges.Columns.Add("ArticleNo", typeof(Int32));
InventoryChanges.Columns.Add("Article", typeof(String));
InventoryChanges.Columns.Add("Price", typeof(Decimal));
InventoryChanges.Columns.Add("PriceSet", typeof(DateTime));
InventoryChanges.Columns.Add("InventoryChange", typeof(DateTime));
InventoryChanges.Columns.Add("DateDifference", typeof(TimeSpan));
InventoryChanges.Columns.Add("StockDifference", typeof(Int32));
DataRow dr = InventoryChanges.NewRow();
dr.ItemArray = new object[] { 1, "Article A", 10, new DateTime(2012, 1, 1), new DateTime(2012, 1, 2), new TimeSpan(1, 0, 0, 0), -2 };
InventoryChanges.Rows.Add(dr);
dr = InventoryChanges.NewRow();
dr.ItemArray = new object[] { 1, "Article A", 11, new DateTime(2012, 6, 1), new DateTime(2012, 1, 2), new TimeSpan(-151, 0, 0, 0), -2 };
InventoryChanges.Rows.Add(dr);
dr = InventoryChanges.NewRow();
dr.ItemArray = new object[] { 2, "Article B", 14, new DateTime(2012, 1, 1), new DateTime(2012, 1, 5), new TimeSpan(4, 0, 0, 0), 1 };
InventoryChanges.Rows.Add(dr);
dr = InventoryChanges.NewRow();
dr.ItemArray = new object[] { 2, "Article B", 14, new DateTime(2012, 1, 1), new DateTime(2012, 10, 4), new TimeSpan(277, 0, 0, 0), -3 };
InventoryChanges.Rows.Add(dr);
dr = InventoryChanges.NewRow();
dr.ItemArray = new object[] { 2, "Article B", 13, new DateTime(2012, 6, 1), new DateTime(2012, 1, 5), new TimeSpan(-148, 0, 0, 0), 1 };
InventoryChanges.Rows.Add(dr);
dr = InventoryChanges.NewRow();
dr.ItemArray = new object[] { 2, "Article B", 13, new DateTime(2012, 6, 1), new DateTime(2012, 10, 4), new TimeSpan(125, 0, 0, 0), -3 };
InventoryChanges.Rows.Add(dr);
dr = InventoryChanges.NewRow();
dr.ItemArray = new object[] { 3, "Article C", 144, new DateTime(2012, 4, 1), new DateTime(2012, 2, 28), new TimeSpan(-33, 0, 0, 0), -1 };
InventoryChanges.Rows.Add(dr);
dr = InventoryChanges.NewRow();
dr.ItemArray = new object[] { 3, "Article C", 124, new DateTime(2012, 5, 1), new DateTime(2012, 2, 28), new TimeSpan(-63, 0, 0, 0), -1 };
InventoryChanges.Rows.Add(dr);
Maybe there are more elegant approaches but this should work:
var query = InventoryChanges.AsEnumerable()
    .GroupBy(r => new
    {
        ArticleNo = r.Field<Int32>("ArticleNo"),
        Article = r.Field<String>("Article"),
        InventoryChange = r.Field<DateTime>("InventoryChange"),
        StockDifference = r.Field<Int32>("StockDifference")
    })
    .Select(grp =>
    {
        IEnumerable<DataRow> rows = grp;
        bool anyPositiveDateDiff = grp.Any(r => r.Field<TimeSpan>("DateDifference") >= TimeSpan.Zero);
        if (anyPositiveDateDiff)
            rows = grp.Where(r => r.Field<TimeSpan>("DateDifference") >= TimeSpan.Zero);
        // The row closest to zero: smallest positive, or smallest-magnitude negative.
        var firstRow = rows
            .OrderBy(r => r.Field<TimeSpan>("DateDifference").Duration()).First();
        return new
        {
            ArticleNo = grp.Key.ArticleNo,
            Article = grp.Key.Article,
            InventoryChange = grp.Key.InventoryChange,
            PriceSet = firstRow.Field<DateTime>("PriceSet"),
            // Taken from firstRow so it always matches PriceSet and Price.
            DateDifference = firstRow.Field<TimeSpan>("DateDifference"),
            StockDifference = grp.Key.StockDifference,
            Price = firstRow.Field<Decimal>("Price")
        };
    });
I'm checking whether the group contains rows with non-negative timespans (bool anyPositiveDateDiff). If it does, I replace the rows of the group with only those rows.
Note also that I have simplified the sub-queries in the select where you create the anonymous types.
Edit: This is the result of the above query for your sample data:
{ ArticleNo = 2, Article = Article B, InventoryChange = 05.01.2012 00:00:00, PriceSet = 01.01.2012 00:00:00, DateDifference = 4.00:00:00, StockDifference = 1, Price = 14 }
{ ArticleNo = 2, Article = Article B, InventoryChange = 04.10.2012 00:00:00, PriceSet = 01.06.2012 00:00:00, DateDifference = 125.00:00:00, StockDifference = -3, Price = 13 }
{ ArticleNo = 3, Article = Article C, InventoryChange = 28.02.2012 00:00:00, PriceSet = 01.04.2012 00:00:00, DateDifference = -33.00:00:00, StockDifference = -1, Price = 144 }
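The selection rule (prefer the smallest non-negative DateDifference, otherwise the negative one closest to zero) can also be written as a two-key ordering. A minimal sketch on plain in-memory tuples instead of a DataTable; the tuple shape and the `PickRow` helper are assumptions made for illustration, not part of the original query:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: non-negative differences sort first (false < true),
// then rows are ordered by their absolute distance from zero.
static (decimal Price, TimeSpan Diff) PickRow(IEnumerable<(decimal Price, TimeSpan Diff)> rows) =>
    rows.OrderBy(r => r.Diff < TimeSpan.Zero)
        .ThenBy(r => r.Diff.Duration())
        .First();

// Article B, InventoryChange 05.01.2012: one positive and one negative row.
var articleB = new[]
{
    (Price: 14m, Diff: new TimeSpan(4, 0, 0, 0)),
    (Price: 13m, Diff: new TimeSpan(-148, 0, 0, 0)),
};

// Article C: only negative rows, so the one closest to zero is chosen.
var articleC = new[]
{
    (Price: 144m, Diff: new TimeSpan(-33, 0, 0, 0)),
    (Price: 124m, Diff: new TimeSpan(-63, 0, 0, 0)),
};

var pickedB = PickRow(articleB);
var pickedC = PickRow(articleC);
Console.WriteLine($"{pickedB.Price} {pickedB.Diff.Days}");
Console.WriteLine($"{pickedC.Price} {pickedC.Diff.Days}");
```

Because the whole row is picked in one step, Price, PriceSet and DateDifference all come from the same row.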

Complex LINQ Query with Groupings (need some help)

Hopefully someone can help out with the following scenario. I have a table in sql, and I am trying to return all the records which match the value. That logic can have AND based on the group. For example
Row  ID  Match  Value  Equal  Group
1    1   1      a      a      1
2    1   a      b      0      1
3    2   a      a      1      2
4    3   b      c      0      3
5    4   a      a      1      4
In this case, a 1 in the Equal column means "equal" and a 0 means "not equal".
After the LINQ query, this data set would return records 1, 2, and 3, because in row 1, a = a (per the Equal column) AND, in row 2, a != b (per the Equal column). Rows 1 and 2 are combined with AND because they are in the same group, and so on.
Thanks!
Not sure if this is what you are getting at because your question is difficult to understand and it appears that your example data may contain a couple of typos. However, I modeled something similar for you below. Note I changed Row 1 because it looks like you had a typo, and I changed Row 5 so it would actually be an excluded case, otherwise you would have gotten everything.
public class Test
{
public int Row {get; set;}
public int ID {get; set;}
public string Match {get; set;}
public string Value {get; set;}
public int Equal {get; set;}
public int Group {get; set;}
}
void Main()
{
var items = new List<Test>();
items.Add(new Test() {Row = 1, ID = 1, Match = "a", Value = "a", Equal = 1, Group = 1});
items.Add(new Test() {Row = 2, ID = 1, Match = "a", Value = "b", Equal = 0, Group = 1});
items.Add(new Test() {Row = 3, ID = 2, Match = "a", Value = "a", Equal = 1, Group = 2});
items.Add(new Test() {Row = 4, ID = 3, Match = "b", Value = "c", Equal = 0, Group = 3});
items.Add(new Test() {Row = 5, ID = 4, Match = "a", Value = "b", Equal = 1, Group = 4});
var result = items.GroupBy(i => i.Group)
    .Where(g => g.All(t =>
        (t.Equal == 1 && t.Match == t.Value) ||
        (t.Equal == 0 && t.Match != t.Value)))
    .SelectMany(g => g);
foreach (var i in result)
{
Console.WriteLine("Row: {0}, ID: {1}", i.Row, i.ID);
}
}
Result:
Row: 1, ID: 1
Row: 2, ID: 1
Row: 3, ID: 2
Row: 4, ID: 3

Linq intersect with sum

I have two collections that I want to intersect, and perform a sum operation on matching elements.
For example the collections are (in pseudo code):
col1 = { {"A", 5}, {"B", 3}, {"C", 2} }
col2 = { {"B", 1}, {"C", 8}, {"D", 6} }
and the desired result is:
intersection = { {"B", 4}, {"C", 10} }
I know how to use an IEqualityComparer to match the elements on their name, but how to sum the values while doing the intersection?
EDIT:
Neither starting collection contains two items with the same name.
Let's say your input data looks like this:
IEnumerable<Tuple<string, int>> firstSequence = ..., secondSequence = ...;
If the strings are unique in each sequence (i.e., there can be no more than a single {"A", XXX} in either sequence), you can join like this:
var query = from tuple1 in firstSequence
join tuple2 in secondSequence on tuple1.Item1 equals tuple2.Item1
select Tuple.Create(tuple1.Item1, tuple1.Item2 + tuple2.Item2);
You might also want to consider using a group by, which would be more appropriate if this uniqueness doesn't hold:
var query = from tuple in firstSequence.Concat(secondSequence)
group tuple.Item2 by tuple.Item1 into g
select Tuple.Create(g.Key, g.Sum());
If neither is what you want, please clarify your requirements more precisely.
EDIT: After your clarification that these are dictionaries - your existing solution is perfectly fine. Here's another alternative with join:
var joined = from kvp1 in dict1
join kvp2 in dict2 on kvp1.Key equals kvp2.Key
select new { kvp1.Key, Value = kvp1.Value + kvp2.Value };
var result = joined.ToDictionary(t => t.Key, t => t.Value);
or in fluent syntax:
var result = dict1.Join(dict2,
kvp => kvp.Key,
kvp => kvp.Key,
(kvp1, kvp2) => new { kvp1.Key, Value = kvp1.Value + kvp2.Value })
.ToDictionary(a => a.Key, a => a.Value);
This will give the result, but there are some caveats. It concatenates the two collections and then groups them by letter. So if, for example, col1 contained two A elements, it would sum them together and, because that group now has two elements, it would return A even though A is not in col2.
var col1 = new[] { new { L = "A", N = 5 }, new { L = "B", N = 3 }, new { L = "C", N = 2 } };
var col2 = new[] { new { L = "B", N = 1 }, new { L = "C", N = 8 }, new { L = "D", N = 6 } };
var res = col1.Concat(col2)
.GroupBy(p => p.L)
.Where(p => p.Count() > 1)
.Select(p => new { L = p.Key, N = p.Sum(q => q.N) })
.ToArray();
The best I came up with until now is (my collections are actually Dictionary<string, int> instances):
var intersectingKeys = col1.Keys.Intersect(col2.Keys);
var intersection = intersectingKeys
.ToDictionary(key => key, key => col1[key] + col2[key]);
I'm not sure whether it will perform well, but at least it is readable.
If your intersection algorithm produces an anonymous type, i.e. ...Select(new { Key = key, Value = value}), then you can easily sum it:
result.Sum(e => e.Value);
If you want to sum "while" doing the intersection, add each value to an accumulator as you add the item to the result set.
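For the Dictionary<string, int> case, one way to read "sum while intersecting" is a single pass with TryGetValue, which avoids looking each key up twice. A minimal sketch using the question's sample values (the loop structure is an assumption, not code from any of the answers):

```csharp
using System;
using System.Collections.Generic;

var col1 = new Dictionary<string, int> { ["A"] = 5, ["B"] = 3, ["C"] = 2 };
var col2 = new Dictionary<string, int> { ["B"] = 1, ["C"] = 8, ["D"] = 6 };

// Walk col1 once; a key belongs to the intersection only if col2 has it too.
var intersection = new Dictionary<string, int>();
foreach (var pair in col1)
{
    if (col2.TryGetValue(pair.Key, out int other))
    {
        // Sum the two values at the moment the key is added to the result.
        intersection[pair.Key] = pair.Value + other;
    }
}

foreach (var pair in intersection)
    Console.WriteLine($"{pair.Key}: {pair.Value}");
```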

Linq Grouping Question

I'm working on a simple reservation system for meeting rooms. Each meeting room is assigned a room type depending on its size etc. I'm looking to shape the count of booked rooms from the following:
01/01/2011 1 Room A 2
01/01/2011 2 Room B 5
01/01/2011 3 Room C 3
01/01/2011 4 Room D 2
01/01/2011 5 Room E 1
01/01/2011 6 Room F 5
02/01/2011 1 Room A 3
02/01/2011 2 Room B 5
02/01/2011 3 Room C 2
02/01/2011 4 Room D 5
02/01/2011 5 Room E 2
02/01/2011 6 Room F 2
03/01/2011 1 Room A 2
03/01/2011 2 Room B 5
03/01/2011 3 Room C 2
Into grouped data such as:
Date Room A Room B Room C Room D Room E Room F
01/01/2011 2 5 3 2 1 5
02/01/2011 3 5 2 5 2 2
03/01/2011 2 5 2 4 5 8
04/01/2011 4 7 3 5 2 2
I've managed to do this previously using DataSets, but for this project I need to be able to do it in LINQ, using LINQ to Entities.
Can somebody advise the best way to do this?
Thanks
This is what I've done when confronted with this type of problem:
var rooms = bookings
.Select(b => b.Room)
.Distinct()
.OrderBy(r => r)
.ToArray();
var query = (
from b in bookings
group b by b.Date into gbs
let l = gbs.ToLookup(gb => gb.Room, gb => gb.Count)
select new
{
Date = gbs.Key,
RoomCounts = rooms.Select(r => l[r].Sum()).ToArray(),
}).ToArray();
This essentially produces the following arrays:
var rooms = new []
{
"Room A", "Room B", "Room C", "Room D", "Room E", "Room F",
};
var query = new []
{
new
{
Date = new DateTime(2011, 01, 01),
RoomCounts = new [] { 2, 5, 3, 2, 1, 5 }
},
new
{
Date = new DateTime(2011, 01, 02),
RoomCounts = new [] { 3, 5, 2, 5, 2, 2 }
},
new
{
Date = new DateTime(2011, 01, 03),
RoomCounts = new [] { 2, 5, 2, 0, 0, 0 }
},
};
The RoomCounts arrays are all the same length as the rooms array and the value at each index position matches the room in the rooms array.
It's usually fairly workable.
An alternative is to create an array of arrays that represent a grid somewhat like a spreadsheet.
var query2 = (new object[]
{
(new object[] { "Date" })
.Concat(rooms.Cast<object>())
.ToArray()
}).Concat(
from b in bookings
group b by b.Date into gbs
let l = gbs.ToLookup(gb => gb.Room, gb => gb.Count)
select (new object[] { gbs.Key })
.Concat(rooms.Select(r => l[r].Sum()).Cast<object>())
.ToArray())
.ToArray();
This produces the following:
var q2 = new object[]
{
new object[] {
"Date", "Room A", "Room B", "Room C", "Room D", "Room E", "Room F" },
new object[] { new DateTime(2011, 01, 01), 2, 5, 3, 2, 1, 5 },
new object[] { new DateTime(2011, 01, 02), 3, 5, 2, 5, 2, 2 },
new object[] { new DateTime(2011, 01, 03), 2, 5, 2, 0, 0, 0 },
};
An alternative to the alternative, in case the query looks a little hairy, is to do this:
Func<object, IEnumerable, object[]> prepend = (o, os) =>
(new object[] { o }).Concat(os.Cast<object>()).ToArray();
Func<object[], IEnumerable<object[]>, object[][]> prepends = (o, os) =>
(new object[][] { o }).Concat(os).ToArray();
var query2 = prepends(prepend("Date", rooms),
from b in bookings
group b by b.Date into gbs
let l = gbs.ToLookup(gb => gb.Room, gb => gb.Count)
select prepend(gbs.Key, rooms.Select(r => l[r].Sum())));
This form of the query still produces the same grid of objects, but it's a little more readable, IMHO, than the first form.
I hope this helps.
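A small runnable sketch of the first form of the query, assuming in-memory bookings with Date, Room and Count properties (the sample rows are invented for illustration). It runs as LINQ to Objects; whether the same query translates under LINQ to Entities is not verified here, so loading the rows into memory first may be necessary:

```csharp
using System;
using System.Linq;

// Hypothetical in-memory bookings (not the question's full data set).
var bookings = new[]
{
    new { Date = new DateTime(2011, 1, 1), Room = "Room A", Count = 2 },
    new { Date = new DateTime(2011, 1, 1), Room = "Room B", Count = 5 },
    new { Date = new DateTime(2011, 1, 2), Room = "Room A", Count = 3 },
};

// Column order for the pivot: every distinct room, sorted by name.
var rooms = bookings.Select(b => b.Room).Distinct().OrderBy(r => r).ToArray();

var pivot = (
    from b in bookings
    group b by b.Date into gbs
    let l = gbs.ToLookup(gb => gb.Room, gb => gb.Count)
    select new
    {
        Date = gbs.Key,
        // A room with no booking on this date yields an empty lookup entry, so Sum() is 0.
        RoomCounts = rooms.Select(r => l[r].Sum()).ToArray(),
    }).ToArray();

foreach (var row in pivot)
    Console.WriteLine($"{row.Date:yyyy-MM-dd}: {string.Join(", ", row.RoomCounts)}");
```

The zero for "Room B" on the second date comes from the lookup returning an empty sequence for a missing key, which is what keeps every RoomCounts array the same length as rooms.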
