Linq intersect with sum - linq

I have two collections that I want to intersect, and perform a sum operation on matching elements.
For example the collections are (in pseudo code):
col1 = { {"A", 5}, {"B", 3}, {"C", 2} }
col2 = { {"B", 1}, {"C", 8}, {"D", 6} }
and the desired result is:
intersection = { {"B", 4}, {"C", 10} }
I know how to use an IEqualityComparer to match the elements on their name, but how to sum the values while doing the intersection?
EDIT:
Neither starting collection contains two items with the same name.

Let's say your input data looks like this:
IEnumerable<Tuple<string, int>> firstSequence = ..., secondSequence = ...;
If the strings are unique in each sequence (i.e. there can be no more than a single {"A", XXX} in either sequence), you can join like this:
var query = from tuple1 in firstSequence
join tuple2 in secondSequence on tuple1.Item1 equals tuple2.Item1
select Tuple.Create(tuple1.Item1, tuple1.Item2 + tuple2.Item2);
You might also want to consider using a group by, which would be more appropriate if this uniqueness doesn't hold:
var query = from tuple in firstSequence.Concat(secondSequence)
group tuple.Item2 by tuple.Item1 into g
select Tuple.Create(g.Key, g.Sum());
If neither is what you want, please clarify your requirements more precisely.
EDIT: After your clarification that these are dictionaries - your existing solution is perfectly fine. Here's another alternative with join:
var joined = from kvp1 in dict1
join kvp2 in dict2 on kvp1.Key equals kvp2.Key
select new { kvp1.Key, Value = kvp1.Value + kvp2.Value };
var result = joined.ToDictionary(t => t.Key, t => t.Value);
or in fluent syntax:
var result = dict1.Join(dict2,
kvp => kvp.Key,
kvp => kvp.Key,
(kvp1, kvp2) => new { kvp1.Key, Value = kvp1.Value + kvp2.Value })
.ToDictionary(a => a.Key, a => a.Value);
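To see how the join and the group-by differ on the sample data from the question, here is a small sketch in Python rather than C#. The function names join_sum and group_sum are made up for illustration. The join keeps only names present in both sequences, while the group-by over the concatenation keeps every name.

```python
from collections import defaultdict

col1 = [("A", 5), ("B", 3), ("C", 2)]
col2 = [("B", 1), ("C", 8), ("D", 6)]

def join_sum(first, second):
    """Inner join on name; assumes names are unique within each sequence."""
    lookup = dict(second)
    return [(name, value + lookup[name]) for name, value in first if name in lookup]

def group_sum(first, second):
    """Concatenate, group by name, and sum each group (no intersection filter)."""
    totals = defaultdict(int)
    for name, value in first + second:
        totals[name] += value
    return sorted(totals.items())

print(join_sum(col1, col2))   # [('B', 4), ('C', 10)]
print(group_sum(col1, col2))  # [('A', 5), ('B', 4), ('C', 10), ('D', 6)]
```

Only the join produces the intersection asked for in the question; the group-by needs an extra filter to drop names that occur in one sequence only.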

This gives the desired result, with one caveat. It concatenates the two collections and then groups them by letter. So if, for example, col1 contained two A elements, they would be summed together and, because that group now has two elements, A would be returned even though it does not appear in col2.
var col1 = new[] { new { L = "A", N = 5 }, new { L = "B", N = 3 }, new { L = "C", N = 2 } };
var col2 = new[] { new { L = "B", N = 1 }, new { L = "C", N = 8 }, new { L = "D", N = 6 } };
var res = col1.Concat(col2)
.GroupBy(p => p.L)
.Where(p => p.Count() > 1)
.Select(p => new { L = p.Key, N = p.Sum(q => q.N) })
.ToArray();
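One way around the duplicate caveat is to record which collection each name came from, instead of counting elements per group. The following is a sketch in Python, not C#, and intersect_sum is a hypothetical helper name. It sums duplicates within a collection but only returns names seen in both.

```python
from collections import defaultdict

def intersect_sum(first, second):
    """Sum values per name, keeping only names that occur in both inputs."""
    totals = defaultdict(int)
    sources = defaultdict(set)   # which input(s) each name was seen in
    for source_id, items in enumerate((first, second)):
        for name, value in items:
            totals[name] += value
            sources[name].add(source_id)
    return {name: totals[name] for name in totals if len(sources[name]) == 2}

col1 = [("A", 5), ("A", 1), ("B", 3), ("C", 2)]   # note the duplicate "A"
col2 = [("B", 1), ("C", 8), ("D", 6)]
print(intersect_sum(col1, col2))  # {'B': 4, 'C': 10}
```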

The best I came up with until now is (my collections are actually Dictionary<string, int> instances):
var intersectingKeys = col1.Keys.Intersect(col2.Keys);
var intersection = intersectingKeys
.ToDictionary(key => key, key => col1[key] + col2[key]);
I'm not sure whether it will perform well, but at least it is readable.

If your intersection algorithm produces an anonymous type, i.e. ...Select(new { Key = key, Value = value}), then you can easily sum it:
result.Sum(e => e.Value);
If you want to sum "while" doing the intersection, add each value to an accumulator as you add the element to the result set.
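A minimal sketch of that accumulator idea, written in Python for brevity. The name intersect_with_total is hypothetical, and iterating over the smaller dictionary is an assumption about what is cheapest, not something the answer specifies.

```python
def intersect_with_total(dict1, dict2):
    """Return (per-key sums for shared keys, grand total of those sums)."""
    # Iterate over the smaller dictionary and probe the larger one.
    small, large = (dict1, dict2) if len(dict1) <= len(dict2) else (dict2, dict1)
    result = {}
    total = 0
    for key, value in small.items():
        if key in large:
            result[key] = value + large[key]
            total += result[key]   # accumulate while building the result
    return result, total

col1 = {"A": 5, "B": 3, "C": 2}
col2 = {"B": 1, "C": 8, "D": 6}
print(intersect_with_total(col1, col2))  # ({'B': 4, 'C': 10}, 14)
```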

Related

Lua - Sort table and randomize ties

I have a table whose entries hold two values: a name (a unique string) and a number (in this case, hearts). I want to sort the table by hearts but randomly scramble the items whenever there is a tie (i.e. equal hearts). With a standard sorting function, ties always come out in the same order, and I need the order to differ every time the sort runs.
This is an example:
tbl = {{name = "a", hearts = 5}, {name = "b", hearts = 2}, {name = "c", hearts = 6}, {name = "d", hearts = 2}, {name = "e", hearts = 2}, {name = "f", hearts = 7}}
sort1 = function (a, b) return a.hearts > b.hearts end
sort2 = function (a, b)
if a.hearts ~= b.hearts then return a.hearts > b.hearts
else return a.name > b.name end
end
table.sort(tbl, sort2)
local s = ""
for i = 1, #tbl do
s = s .. tbl[i].name .. "(" .. tbl[i].hearts .. ") "
end
print(s)
Now, with the function sort2 I think I have pinned down the problem: what happens when a.hearts == b.hearts? In my code it just orders the ties by name, which is not what I want. I have three ideas:
1. First randomly scramble all the items in the table, then apply sort1.
2. Add a value called rnd, a random number, to every element of the table. Then in sort2, when a.hearts == b.hearts, order the items by a.rnd > b.rnd.
3. In sort2, when a.hearts == b.hearts, randomly generate true or false and return it. This doesn't work, and I understand why: the random true/false makes the order function crash because the comparisons can be inconsistent.
I don't like 1 (because I would like to do everything inside the sorting function) or 2 (since it requires adding a value); I would like to do something like 3, but working. The question is: is there a simple way to do this, and what is the optimal way? (Maybe method 1 or 2 is optimal and I just don't see it.)
Bonus question. Moreover, I need to fix an item and sort the others. For example, suppose we want "c" to be first. Is it good to make a separate table with only the items to sort, sort the table and then add the fixed items?
-- example table
local tbl = {
{ name = "a", hearts = 5 },
{ name = "b", hearts = 2 },
{ name = "c", hearts = 6 },
{ name = "d", hearts = 2 },
{ name = "e", hearts = 2 },
{ name = "f", hearts = 7 },
}
-- avoid same results on subsequent requests
math.randomseed( os.time() )
---
-- Randomly sort a table
--
-- @param tbl Table to be sorted
-- @param corrections Table with your corrections
--
function rnd_sort( tbl, corrections )
local rnd = corrections or {}
table.sort( tbl,
function ( a, b)
rnd[a.name] = rnd[a.name] or math.random()
rnd[b.name] = rnd[b.name] or math.random()
return a.hearts + rnd[a.name] > b.hearts + rnd[b.name]
end )
end
---
-- Show the values of our table for debug purposes
--
function show( tbl )
local s = ""
for i = 1, #tbl do
s = s .. tbl[i].name .. "(" .. tbl[i].hearts .. ") "
end
print(s)
end
for i = 1, 10 do
rnd_sort(tbl)
show(tbl)
end
rnd_sort( tbl, {c=1000000} ) -- now "c" will be the first
show(tbl)
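As I read the Lua code above, it works by giving each name one random offset in [0, 1) and adding it to the integer hearts value, so items with different hearts never swap places while ties fall in random order. The same idea can be sketched in Python (an illustration, not a translation the author provided). Here the offsets are drawn before sorting rather than inside the comparator, which is a deliberate simplification.

```python
import random

tbl = [
    {"name": "a", "hearts": 5}, {"name": "b", "hearts": 2},
    {"name": "c", "hearts": 6}, {"name": "d", "hearts": 2},
    {"name": "e", "hearts": 2}, {"name": "f", "hearts": 7},
]

def rnd_sort(items, corrections=None):
    """Sort by hearts, descending; ties fall in random order.

    `corrections` maps a name to a fixed offset, mirroring the Lua version.
    """
    # Draw one offset in [0, 1) per name *before* sorting, so every
    # comparison sees the same value and the ordering stays consistent.
    offsets = {item["name"]: random.random() for item in items}
    offsets.update(corrections or {})
    return sorted(items,
                  key=lambda item: item["hearts"] + offsets[item["name"]],
                  reverse=True)

print([item["name"] for item in rnd_sort(tbl)])
print(rnd_sort(tbl, {"c": 1000000})[0]["name"])  # c
```

This relies on hearts being whole numbers: an offset below 1 cannot lift an item past one with a higher hearts value.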
Here's a quick function for shuffling (scrambling) numerically indexed tables:
function shuffle(tbl) -- shuffles numeric indices
local len, random = #tbl, math.random ;
for i = len, 2, -1 do
local j = random( 1, i );
tbl[i], tbl[j] = tbl[j], tbl[i];
end
return tbl;
end
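Shuffling first and sorting afterwards (idea 1 in the question) can be sketched in Python as follows. The function name sort_with_random_ties is made up. The sketch depends on Python's sort being stable, so tied items keep their shuffled order. Lua's table.sort makes no such stability guarantee, so there the ties still come out in some order that varies between runs, but not necessarily the shuffled one.

```python
import random

tbl = [
    {"name": "a", "hearts": 5}, {"name": "b", "hearts": 2},
    {"name": "c", "hearts": 6}, {"name": "d", "hearts": 2},
    {"name": "e", "hearts": 2}, {"name": "f", "hearts": 7},
]

def sort_with_random_ties(items):
    """Return a new list sorted by hearts (descending) with ties shuffled."""
    shuffled = list(items)       # copy, so the caller's list is untouched
    random.shuffle(shuffled)     # uniform shuffle of the whole list
    # list.sort is stable, so equal hearts keep their shuffled order.
    shuffled.sort(key=lambda item: item["hearts"], reverse=True)
    return shuffled

print([item["name"] for item in sort_with_random_ties(tbl)])
```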
If you are free to introduce a new dependency, you can use lazylualinq to do the job for you (or check out how it sorts sequences, if you do not need the rest):
local from = require("linq")
math.randomseed(os.time())
tbl = {{name = "a", hearts = 5}, {name = "b", hearts = 2}, {name = "c", hearts = 6}, {name = "d", hearts = 2}, {name = "e", hearts = 2}, {name = "f", hearts = 7}}
from(tbl)
:orderBy("x => x.hearts")
:thenBy("x => math.random(-1, 1)")
:foreach(function(_, x) print(x.name, x.hearts) end)

Fastest way to get values from 2d array

I have a 2D array similar to this:
string[,] arr = {
{ "A", "A", "A", "A", "A", "A", "A", "D", "D", "D", "D", "D", "D", "D", "D" },
{ "1", "1", "1", "1", "1", "1", "1", "0", "0", "0", "0", "0", "0", "0", "0" },
{ "2", "2", "2", "2", "2", "2", "2", "00", "00", "00", "00", "00", "00", "00", "00" }
};
I am trying to get following result from above array:
A 1 2
A 1 2
A 1 2
A 1 2
A 1 2
A 1 2
Get every "A" from row 0 of the array, then get its corresponding values from the other rows.
This is a big 2D array with over 6k values, but the layout is exactly the same as described above. I have tried 2 ways so far:
1st method: using a for loop to go through all the values:
var myList = new List<string>();
var arrLength = arr.GetLength(1); // number of columns
for (var i = 0; i < arrLength; i++)
{
    // keep only the entries in row 0 that equal "A"
    if (arr[0, i].Equals("A"))
        myList.Add(arr[0, i]);
}
2nd method: creating a list and then going through all the values:
var myList = new List<string>();
var list = Enumerable.Range(0, arr.GetLength(1))
    .Select(i => arr[0, i])
    .ToList();
// column indices whose row-0 value contains "A"
var indices = Enumerable.Range(0, arr.GetLength(1))
    .Where(i => arr[0, i].Contains("A"))
    .ToArray();
var sI = indices[0];
var eI = indices[indices.Length - 1];
myList.AddRange(list.GetRange(sI, eI - sI + 1)); // count includes both ends
They both seem to be slow, not efficient enough. Is there any better way of doing this?
I like to approach these kinds of algorithms in a way that my code ends up being self-documenting. Usually, describing the algorithm with your code, and not bloating it with code features, tends to produce pretty good results.
var matchingValues =
from index in Enumerable.Range(0, arr.GetLength(1))
where arr[0, index] == "A"
select Tuple.Create(arr[1, index], arr[2, index]);
Which corresponds to:
// find the tuples produced by
// mapping along one length of an array with an index
// filtering those items whose 0th item on the indexed dimension is "A"
// reducing index into the non-0th elements on the indexed dimension
This should parallelize extremely well, as long as you keep to the simple "map, filter, reduce" paradigm and refrain from introducing side-effects.
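The same map-and-filter shape, sketched in Python on the sample array from the question (seven "A" columns followed by eight "D" columns). The helper name columns_matching is hypothetical, and this is an illustration of the approach rather than a performance claim.

```python
arr = [
    ["A"] * 7 + ["D"] * 8,
    ["1"] * 7 + ["0"] * 8,
    ["2"] * 7 + ["00"] * 8,
]

def columns_matching(grid, match):
    """Return one tuple per column whose first-row value equals `match`."""
    return [
        tuple(row[i] for row in grid[1:])   # values from the remaining rows
        for i in range(len(grid[0]))        # map over the column indices
        if grid[0][i] == match              # filter on the first row
    ]

print(columns_matching(arr, "A")[:2])  # [('1', '2'), ('1', '2')]
```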
Edit:
In order to return an arbitrary collection of the columns associated with an "A", you can:
var targetValues = new int[] { 1, 2, 4, 10 };
var matchingValues =
from index in Enumerable.Range(0, arr.GetLength(1))
where arr[0, index] == "A"
select targetValues.Select(x => arr[x, index]).ToArray();
To make it a complete collection, simply use:
var targetValues = Enumerable.Range(1, arr.GetLength(0) - 1).ToArray();
As "usr" said: back to the basics if you want raw performance. Also taking into account that the "A" values can start at an index > 0:
var startRow = -1; // "row" in the new array.
var endRow = -1;
var match = "D";
for (int i = 0; i < arr.GetLength(1); i++)
{
if (startRow == -1 && arr[0,i] == match) startRow = i;
if (startRow > -1 && arr[0,i] == match) endRow = i + 1;
}
var columns = arr.GetLength(0);
var transp = new String[endRow - startRow,columns]; // transposed array
for (int i = startRow; i < endRow; i++)
{
for (int j = 0; j < columns; j++)
{
transp[i - startRow,j] = arr[j,i];
}
}
Initializing the new array first (and then setting the "cell" values) is the main performance boost.

Sort an array of string by length in ColdFusion?

How would you sort an array of string by length in ColdFusion?
In PHP, one can use usort as demonstrated here: PHP: Sort an array by the length of its values?
Does ArraySort() in CF10 support passing in a comparator function like usort?
The above answer has an error, here is the correct way to use arraysort to sort by string length:
<cfscript>
data = [ "bb", "a", "dddd", "ccc" ];
arraySort( data, function( a, b ) {
return len(a) - len(b);
});
</cfscript>
The comparator for this function should return a number either < 0 (less than), 0 (equal) or > 0 (greater than), not a boolean. Also see the arraySort docs.
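The difference between a three-way comparator and a boolean one can be illustrated outside CFML. The sketch below uses Python's functools.cmp_to_key, and the names by_length and by_length_bool are made up. A boolean result can only be 0 or 1, so it has no way to say "less than".

```python
from functools import cmp_to_key

data = ["bb", "a", "dddd", "ccc"]

def by_length(a, b):
    """Three-way comparator: negative, zero, or positive."""
    return len(a) - len(b)

def by_length_bool(a, b):
    """Boolean comparator: can only report True (1) or False (0)."""
    return len(a) < len(b)

print(sorted(data, key=cmp_to_key(by_length)))  # ['a', 'bb', 'ccc', 'dddd']
print(by_length("a", "bb"), by_length("bb", "a"))                  # -1 1
print(int(by_length_bool("a", "bb")), int(by_length_bool("bb", "a")))  # 1 0
```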
I guess this is not going to be most flexible or even effective solution, but I was interested in the shortest version which uses built-in CFML sorting... Without comments it's just 13 lines of code :)
source = ["bb", "a", "ffff", "ccc", "dd", 22, 0];
lengths = {};
result = [];
// cache lengths of the values with index as key
for (i=1; i LTE ArrayLen(source); i++) {
lengths[i] = Len(source[i]);
}
// sort the values using 'numeric' type
sorted = StructSort(lengths, "numeric", "asc");
// populate results using sorted cache indexes
for (v in sorted) {
ArrayAppend(result, source[v]);
}
Result is ["a",0,"bb",22,"dd","ccc","ffff"]
You can use a quick sort algorithm along with your own custom comparator, similar to how Java's comparators work.
You can find a quickSort UDF here: http://cflib.org/udf/quickSort.
You'll need to define your own comparator to tell the function how it should do the sorting. Below is a working example. Note that you'll need to include the UDF in your page so that the quickSort function is available.
strings = ["bb", "a", "ccc"];
WriteOutput(ArrayToList(quickSort(strings, descStringLenCompare)));
//outputs a,bb,ccc
WriteOutput(ArrayToList(quickSort(strings, ascStringLenCompare)));
//outputs ccc,bb,a
//Ascending comparator
Numeric function ascStringLenCompare(required String s1, required String s2)
{
if (Len(s1) < Len(s2)){
return -1;
}else if (Len(s1) > Len(s2)) {
return 1;
}else{
return 0;
}
}
//Descending comparator
Numeric function descStringLenCompare(required String s1, required String s2)
{
if (Len(s1) < Len(s2)){
return 1;
}else if (Len(s1) > Len(s2)) {
return -1;
} else {
return 0;
}
}
In ColdFusion 10 or Railo 4, you can use the Underscore.cfc library to write this in an elegant and simple way:
_ = new Underscore(); // instantiate the library
// define an array of strings
arrayOfStrings = ['ccc', 'a', 'dddd', 'bb'];
// perform sort
sortedArray = _.sortBy(arrayOfStrings, function (string) {
return len(string);
});
// sortedArray: ['a','bb','ccc','dddd']
The iterator function is called for each value in the array, and that value is passed in as the first argument. The function should return the value that you wish to sort on. In this case, we return len(string). _.sortBy always sorts in ascending order.
(Disclaimer: I wrote Underscore.cfc)
In CF10 you can indeed use a closure with ArraySort().
eg1. sort by length alone.
<cfscript>
data = [ "bb", "a", "dddd", "ccc" ];
arraySort( data, function( a, b ) {
return len(a) < len(b);
});
</cfscript>
data == [ "a", "bb", "ccc", "dddd" ]
eg2. sort by length and alphabetically when same length.
<cfscript>
data = [ "b", "a", "dddd", "ccc" ];
arraySort( data, function( a, b ) {
return len(a) == len(b) ? compare( a, b ) : ( len(a) > len(b) );
});
</cfscript>
data == [ "a", "b", "ccc", "dddd" ]
eg3. same, only reverse the order.
<cfscript>
data = [ "b", "a", "dddd", "ccc" ];
arraySort( data, function( a, b ) {
return len(a) == len(b) ? compare( b, a ) : ( len(a) < len(b) );
});
</cfscript>
data == [ "dddd", "ccc", "b", "a" ]
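The three examples above share one intent: compare by length first and fall back to an alphabetical comparison on ties. As a language-neutral illustration, here is that ordering sketched in Python with a tuple sort key. This is not CFML and says nothing about how ArraySort() treats the return value.

```python
data = ["b", "a", "dddd", "ccc"]

# Sort key is (length, text): length decides first, the text breaks ties.
ascending = sorted(data, key=lambda s: (len(s), s))
descending = sorted(data, key=lambda s: (len(s), s), reverse=True)

print(ascending)   # ['a', 'b', 'ccc', 'dddd']
print(descending)  # ['dddd', 'ccc', 'b', 'a']
```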

Query to Method Expression in Linq!

How can I write this as a method expression?
var query =
from l in list where l.Key == "M"
select new { Value = l, Length = l.Length };
You want to turn it into a sequence of extension method calls?
var query = list.Where(l => l.Key == "M")
.Select(l => new { Value = l, Length = l.Length });

How do I create a nested group-by dictionary using LINQ?

How would I create a nested grouping for the table below, using LINQ? I want to group by Code, then by Mktcode.
Code Mktcode Id
==== ======= ====
1 10 0001
2 20 0010
1 10 0012
1 20 0010
1 20 0014
2 20 0001
2 30 0002
1 30 0002
1 30 0005
I'd like a dictionary, in the end, like
Dictionary<Code, List<Dictionary<Mktcode, List<Id>>>>
So the values of this dictionary would be
{1, ({10,(0001,0012)}, {20,(0010,0014)}, {30, (0002, 0005)})},
{2, ({20,(0001, 0010)}, {30, (0002)} )}
I'd think of it this way:
You're primarily grouping by code, so do that first
For each group, you've still got a list of results - so apply another grouping there.
Something like:
var groupedByCode = source.GroupBy(x => x.Code);
var groupedByCodeAndThenId = groupedByCode.Select(group =>
    new { Key = group.Key, NestedGroup = group.ToLookup
        (result => result.MktCode, result => result.Id) });
var dictionary = groupedByCodeAndThenId.ToDictionary
    (result => result.Key, result => result.NestedGroup);
That will give you a Dictionary<Code, ILookup<MktCode, Id>> - I think that's what you want. It's completely untested though.
You can build lookups (a kind of Dictionary<,List<>>) using group by into:
var lines = new []
{
new {Code = 1, MktCode = 10, Id = 1},
new {Code = 2, MktCode = 20, Id = 10},
new {Code = 1, MktCode = 10, Id = 12},
new {Code = 1, MktCode = 20, Id = 10},
new {Code = 1, MktCode = 20, Id = 14},
new {Code = 2, MktCode = 20, Id = 1},
new {Code = 2, MktCode = 30, Id = 2},
new {Code = 1, MktCode = 30, Id = 2},
new {Code = 1, MktCode = 30, Id = 5},
};
var groups = from line in lines
group line by line.Code
into codeGroup
select new
{
Code = codeGroup.Key,
Items = from l in codeGroup
group l by l.MktCode into mktCodeGroup
select new
{
MktCode = mktCodeGroup.Key,
Ids = from mktLine in mktCodeGroup
select mktLine.Id
}
};
Here's how I'd do it:
Dictionary<Code, Dictionary<MktCode, List<Id>>> myStructure =
myList
.GroupBy(e => e.Code)
.ToDictionary(
g => g.Key,
g => g
.GroupBy(e => e.Mktcode)
.ToDictionary(
g2 => g2.Key,
g2 => g2.Select(e => e.Id).ToList()
)
);
Here's the breakdown of the process:
Group the elements by Code, and create an outer dictionary where the key is that code.
myList
.GroupBy(e => e.Code)
.ToDictionary(
g => g.Key,
For each of the keys in the outer dictionary, regroup the elements by Mktcode and create an inner dictionary.
g => g
.GroupBy(e => e.Mktcode)
.ToDictionary(
g2 => g2.Key,
For each of the keys in the inner dictionary, project the id of those elements and convert that to a list.
g2 => g2.Select(e => e.Id).ToList()
)
)
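To check the nested shape against the table in the question, here is the same two-level grouping sketched in Python with nested defaultdicts. The function name nested_group is made up, and ids are kept as strings to preserve their leading zeros.

```python
from collections import defaultdict

rows = [
    (1, 10, "0001"), (2, 20, "0010"), (1, 10, "0012"),
    (1, 20, "0010"), (1, 20, "0014"), (2, 20, "0001"),
    (2, 30, "0002"), (1, 30, "0002"), (1, 30, "0005"),
]

def nested_group(records):
    """Group (code, mktcode, id) rows into {code: {mktcode: [ids]}}."""
    nested = defaultdict(lambda: defaultdict(list))
    for code, mktcode, id_ in records:
        nested[code][mktcode].append(id_)   # outer key, inner key, then list
    # Convert to plain dicts so missing keys raise instead of being created.
    return {code: dict(inner) for code, inner in nested.items()}

result = nested_group(rows)
print(result[1][10])  # ['0001', '0012']
print(result[2])      # {20: ['0010', '0001'], 30: ['0002']}
```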

Resources