LINQ Grouping: Is there a cleaner way to do this without a for loop - linq

I am trying to create a very simple distribution chart and I want to display the counts of tests score percentages in their corresponding 10's ranges.
I thought about just doing the grouping on the Math.Round((d.Percentage/10-0.5),0)*10 which should give me the 10's value....but I wasn't sure the best way to do this given that I would probably have missing ranges and all ranges need to appear even if the count is zero. I also thought about doing an outer join on the ranges array but since I'm fairly new to Linq so for the sake of time I opted for the code below. I would however like to know what a better way might be.
Also note: As I tend to work with larger teams with varying experience levels, I'm not all that crazy about ultra compact code unless it remains very readable to the average developer.
Any suggestions?
public IEnumerable<TestDistribution> GetDistribution()
{
var distribution = new List<TestDistribution>();
var ranges = new int[] { 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110 };
var labels = new string[] { "0%'s", "10%'s", "20%'s", "30%'s", "40%'s", "50%'s", "60%'s", "70%'s", "80%'s", "90%'s", "100%'s", ">110% "};
for (var n = 0; n < ranges.Count(); n++)
{
var count = 0;
var min = ranges[n];
var max = (n == ranges.Count() - 1) ? decimal.MaxValue : ranges[n+1];
count = (from d in Results
where d.Percentage>= min
&& d.Percentage<max
select d)
.Count();
distribution.Add(new TestDistribution() { Label = labels[n], Frequency = count });
}
return distribution;
}

// ranges and labels in a list of pairs of them
var rangesWithLabels = ranges.Zip(labels, (r,l) => new {Range = r, Label = l});
// create a list of intervals (ie. 0-10, 10-20, .. 110 - max value
var rangeMinMax = ranges.Zip(ranges.Skip(1), (min, max) => new {Min = min, Max = max})
.Union(new[] {new {Min = ranges.Last(), Max = Int32.MaxValue}});
//the grouping is made by the lower bound of the interval found for some Percentage
var resultsDistribution = from c in Results
group c by
rangeMinMax.FirstOrDefault(r=> r.Min <= c.Percentage && c.Percentage < r.Max).Min into g
select new {Percentage = g.Key, Frequency = g.Count() };
// left join betweem the labels and the results with frequencies
var distributionWithLabels =
from l in rangesWithLabels
join r in resultsDistribution on l.Range equals r.Percentage
into rd
from r in rd.DefaultIfEmpty()
select new TestDistribution{
Label = l.Label,
Frequency = r != null ? r.Frequency : 0
};
distribution = distributionWithLabels.ToList();
Another solution if the ranges and labels can be created in another way
var ranges = Enumerable.Range(0, 10)
.Select(c=> new {
Min = c * 10,
Max = (c +1 )* 10,
Label = (c * 10) + "%'s"})
.Union(new[] { new {
Min = 100,
Max = Int32.MaxValue,
Label = ">110% "
}});
var resultsDistribution = from c in Results
group c by ranges.FirstOrDefault(r=> r.Min <= c.Percentage && c.Percentage < r.Max).Min
into g
select new {Percentage = g.Key, Frequency = g.Count() };
var distributionWithLabels =
from l in ranges
join r in resultsDistribution on l.Min equals r.Percentage
into rd
from r in rd.DefaultIfEmpty()
select new TestDistribution{
Label = l.Label,
Frequency = r != null ? r.Frequency : 0
};

This works
public IEnumerable<TestDistribution> GetDistribution()
{
var range = 12;
return Enumerable.Range(0, range).Select(
n => new TestDistribution
{
Label = string.Format("{1}{0}%'s", n*10, n==range-1 ? ">" : ""),
Frequency =
Results.Count(
d =>
d.Percentage >= n*10
&& d.Percentage < ((n == range - 1) ? decimal.MaxValue : (n+1)*10))
});
}

Related

Returning A Sorted List's Index in Lua

I access object properties with an index number
object = {}
object.y = {60,20,40}
object.g = {box1,box2,box3} -- graphic
object.c = {false,false,false} -- collision
-- object.y[2] is 20 and its graphic is box2
-- sorted by y location, index should be, object.sort = {2,3,1}
I know table.sort sorts a list, but how can I sort the y list that returns index for the purpose of drawing each object in-front depending on the y location.
Maybe the quicksort function can be edited, I don't understand it.
http://rosettacode.org/wiki/Sorting_algorithms/Quicksort#Lua
https://github.com/mirven/lua_snippets/blob/master/lua/quicksort.lua
Is this possible?
Do not store the data as you're currently doing. Use something like:
object = {
{
y = 60,
g = box1,
c = false,
},
{
y = 20,
g = box2,
c = false,
},
{
y = 40,
g = box3,
c = false,
},
}
and then use the following callback function in table.sort:
function CustomSort(L, R)
return L.y > R.y
end
as shown below:
table.sort(object, CustomSort)
This should work:
local temp = {}
local values = object.y
-- filling temp with all indexes
for i=1,#values do
temp[i] = i
end
-- sorting the indexes, using object.y as comparison
table.sort(temp,function(a,b)
return values[a] < values[b]
end)
-- sorting is done here, have fun with it
object.sort = temp
temp will be {2,3,1} when using this code combined with yours.
#EinsteinK #hjpotter92 : Thank you
RESULT: This is the final version of the answers I received. My question is solved.
Use sortIndex(object) to get sorted list in object.sort . Update sort after objects move.
box1 = love.graphics.newImage("tile1.png")
box2 = love.graphics.newImage("tile2.png")
box3 = love.graphics.newImage("tile3.png")
hero = love.graphics.newImage("hero.png")
object = {
{ x = 200, y = 50, g = box1 },
{ x = 50, y = 100, g = box2 },
{ x = 150, y = 200, g = box3 },
{ x = 0, y = 0, g = hero }
}
function sortIndex(item)
-- Sort id, using item values
local function sortY(a,b)
return item[a].y < item[b].y
end
--------------------------------
local i
local id = {} -- id list
for i = 1, #item do -- Fill id list
id[i] = i
end
-- print( unpack(id) ) -- Check before
table.sort(id,sortY)-- Sort list
-- print( unpack(id) ) -- Check after
item.sort = id -- List added to object.sort
end
sortIndex(object) -- print( unpack(object.sort) ) -- Check sorted id's
function drawObject()
local i,v, g,x,y
for i = 1, #object do
v = object.sort[i] -- Draw in order
x = object[v].x
y = object[v].y
g = object[v].g
love.graphics.draw(g,x,y)
end
end

Linq FirstOrDefault List inside a query

I have this linq query
var numberGroups =
from n in VISRUBs.Where(a => a.VISANA.VISITE.DATEVIS <= d && a.VISANA.VISITE.PANUM == p)
group n by n.RUBRIQUE into g
select new {
RemainderCHAPLIB = g.Key.ANALYSE.CHAPITRE.LIBELLE,
RemainderLIB = g.Key.LIBELLE,
RemainderRUNUM = g.Key.RUNUM,
vals = from vlist in g.OrderByDescending(a=>a.VISANA.VISITE.DATEVIS)
select vlist.VALEUR
};
which gives me this result in Linqpad
What I want is to select the first and second item from the last field (vals) which is a List<string>.
I have tried this:
var numberGroups =
from n in VISRUBs.Where(a => a.VISANA.VISITE.DATEVIS <= d && a.VISANA.VISITE.PANUM == p)
group n by n.RUBRIQUE into g
select new {
RemainderCHAPLIB = g.Key.ANALYSE.CHAPITRE.LIBELLE,
RemainderLIB = g.Key.LIBELLE,
RemainderRUNUM = g.Key.RUNUM,
vals = from vlist in g.OrderByDescending(a => a.VISANA.VISITE.DATEVIS)
select vlist.VALEUR
};
var lst = from n in numberGroups
select new
{
RemainderCHAPLIB = n.RemainderCHAPLIB,
RemainderLIB = n.RemainderLIB,
RemainderRUNUM = n.RemainderRUNUM,
VAL = n.vals.FirstOrDefault()
};
but it didn't work, I got an exception:
Dynamic SQL ErrorSQL error code = -104Token unknown - line 54, column 1OUTER
found it !
var lst = from n in numberGroups.ToList()
select new
{
RemainderCHAPLIB = n.RemainderCHAPLIB,
RemainderLIB = n.RemainderLIB,
RemainderRUNUM = n.RemainderRUNUM,
VAL = n.vals.FirstOrDefault(),
ANT = n.vals.Skip(1).FirstOrDefault()
};

Algorithm to generate a sequence proportional to specified percentage

Given a Map of objects and designated proportions (let's say they add up to 100 to make it easy):
val ss : Map[String,Double] = Map("A"->42, "B"->32, "C"->26)
How can I generate a sequence such that for a subset of size n there are ~42% "A"s, ~32% "B"s and ~26% "C"s? (Obviously, small n will have larger errors).
(Work language is Scala, but I'm just asking for the algorithm.)
UPDATE: I resisted a random approach since, for instance, there's ~16% chance that the sequence would start with AA and ~11% chance it would start with BB and there would be very low odds that for n precisely == (sum of proportions) the distribution would be perfect. So, following #MvG's answer, I implemented as follows:
/**
Returns the key whose achieved proportions are most below desired proportions
*/
def next[T](proportions : Map[T, Double], achievedToDate : Map[T,Double]) : T = {
val proportionsSum = proportions.values.sum
val desiredPercentages = proportions.mapValues(v => v / proportionsSum)
//Initially no achieved percentages, so avoid / 0
val toDateTotal = if(achievedToDate.values.sum == 0.0){
1
}else{
achievedToDate.values.sum
}
val achievedPercentages = achievedToDate.mapValues(v => v / toDateTotal)
val gaps = achievedPercentages.map{ case (k, v) =>
val gap = desiredPercentages(k) - v
(k -> gap)
}
val maxUnder = gaps.values.toList.sortWith(_ > _).head
//println("Max gap is " + maxUnder)
val gapsForMaxUnder = gaps.mapValues{v => Math.abs(v - maxUnder) < Double.Epsilon }
val keysByHasMaxUnder = gapsForMaxUnder.map(_.swap)
keysByHasMaxUnder(true)
}
/**
Stream of most-fair next element
*/
def proportionalStream[T](proportions : Map[T, Double], toDate : Map[T, Double]) : Stream[T] = {
val nextS = next(proportions, toDate)
val tailToDate = toDate + (nextS -> (toDate(nextS) + 1.0))
Stream.cons(
nextS,
proportionalStream(proportions, tailToDate)
)
}
That when used, e.g., :
val ss : Map[String,Double] = Map("A"->42, "B"->32, "C"->26)
val none : Map[String,Double] = ss.mapValues(_ => 0.0)
val mySequence = (proportionalStream(ss, none) take 100).toList
println("Desired : " + ss)
println("Achieved : " + mySequence.groupBy(identity).mapValues(_.size))
mySequence.map(s => print(s))
println
produces :
Desired : Map(A -> 42.0, B -> 32.0, C -> 26.0)
Achieved : Map(C -> 26, A -> 42, B -> 32)
ABCABCABACBACABACBABACABCABACBACABABCABACABCABACBA
CABABCABACBACABACBABACABCABACBACABABCABACABCABACBA
For a deterministic approach, the most obvious solution would probably be this:
Keep track of the number of occurrences of each item in the sequence so far.
For the next item, choose that item for which the difference between intended and actual count (or proportion, if you prefer that) is maximal, but only if the intended count (resp. proportion) is greater than the actual one.
If there is a tie, break it in an arbitrary but deterministic way, e.g. choosing the alphabetically lowest item.
This approach would ensure an optimal adherence to the prescribed ratio for every prefix of the infinite sequence generated in this way.
Quick & dirty python proof of concept (don't expect any of the variable “names” to make any sense):
import sys
p = [0.42, 0.32, 0.26]
c = [0, 0, 0]
a = ['A', 'B', 'C']
n = 0
while n < 70*5:
n += 1
x = 0
s = n*p[0] - c[0]
for i in [1, 2]:
si = n*p[i] - c[i]
if si > s:
x = i
s = si
sys.stdout.write(a[x])
if n % 70 == 0:
sys.stdout.write('\n')
c[x] += 1
Generates
ABCABCABACABACBABCAABCABACBACABACBABCABACABACBACBAABCABCABACABACBABCAB
ACABACBACABACBABCABACABACBACBAABCABCABACABACBABCAABCABACBACABACBABCABA
CABACBACBAABCABCABACABACBABCABACABACBACBAACBABCABACABACBACBAABCABCABAC
ABACBABCABACABACBACBAACBABCABACABACBACBAABCABCABACABACBABCABACABACBACB
AACBABCABACABACBACBAABCABCABACABACBABCAABCABACBACBAACBABCABACABACBACBA
For every item of the sequence, compute a (pseudo-)random number r equidistributed between 0 (inclusive) and 100 (exclusive).
If 0 ≤ r < 42, take A
If 42 ≤ r < (42+32), take B
If (42+32) ≤ r < (42+32+26)=100, take C
The number of each entry in your subset is going to be the same as in your map, but with a scaling factor applied.
The scaling factor is n/100.
So if n was 50, you would have { Ax21, Bx16, Cx13 }.
Randomize the order to your liking.
The simplest "deterministic" [in terms of #elements of each category] solution [IMO] will be: add elements in predefined order, and then shuffle the resulting list.
First, add map(x)/100 * n elements from each element x chose how you handle integer arithmetics to avoid off by one element], and then shuffle the resulting list.
Shuffling a list is simple with fisher-yates shuffle, which is implemented in most languages: for example java has Collections.shuffle(), and C++ has random_shuffle()
In java, it will be as simple as:
int N = 107;
List<String> res = new ArrayList<String>();
for (Entry<String,Integer> e : map.entrySet()) { //map is predefined Map<String,Integer> for frequencies
for (int i = 0; i < Math.round(e.getValue()/100.0 * N); i++) {
res.add(e.getKey());
}
}
Collections.shuffle(res);
This is nondeterministic, but gives a distribution of values close to MvG's. It suffers from the problem that it could give AAA right at the start. I post it here for completeness' sake given how it proves my dissent with MvG was misplaced (and I don't expect any upvotes).
Now, if someone has an idea for an expand function that is deterministic and won't just duplicate MvG's method (rendering the calc function useless), I'm all ears!
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>ErikE's answer</title>
</head>
<body>
<div id="output"></div>
<script type="text/javascript">
if (!Array.each) {
Array.prototype.each = function(callback) {
var i, l = this.length;
for (i = 0; i < l; i += 1) {
callback(i, this[i]);
}
};
}
if (!Array.prototype.sum) {
Array.prototype.sum = function() {
var sum = 0;
this.each(function(i, val) {
sum += val;
});
return sum;
};
}
function expand(counts) {
var
result = "",
charlist = [],
l,
index;
counts.each(function(i, val) {
char = String.fromCharCode(i + 65);
for ( ; val > 0; val -= 1) {
charlist.push(char);
}
});
l = charlist.length;
for ( ; l > 0; l -= 1) {
index = Math.floor(Math.random() * l);
result += charlist[index];
charlist.splice(index, 1);
}
return result;
}
function calc(n, proportions) {
var percents = [],
counts = [],
errors = [],
fnmap = [],
errorSum,
worstIndex;
fnmap[1] = "min";
fnmap[-1] = "max";
proportions.each(function(i, val) {
percents[i] = val / proportions.sum() * n;
counts[i] = Math.round(percents[i]);
errors[i] = counts[i] - percents[i];
});
errorSum = counts.sum() - n;
while (errorSum != 0) {
adjust = errorSum < 0 ? 1 : -1;
worstIndex = errors.indexOf(Math[fnmap[adjust]].apply(0, errors));
counts[worstIndex] += adjust;
errors[worstIndex] = counts[worstIndex] - percents[worstIndex];
errorSum += adjust;
}
return expand(counts);
}
document.body.onload = function() {
document.getElementById('output').innerHTML = calc(99, [25.1, 24.9, 25.9, 24.1]);
};
</script>
</body>
</html>

Ranking in Linq

There's a generic list of numbers, say
{980, 850,700, 680}---n nos.
I try to compare the above list with a decimal no. say 690., the O/p I need is,to get the ranking of the number which I'm gonna input("692). i,e the desired O/P should be Ranking ="4"
How can I get the O/p for above scenario..??
Following on from Alex's post I think you are looking for
var numbers = new List<int>() { 980, 850, 700, 680 };
var dec = new Decimal(692.0);
var temp = numbers.Count(x => x > dec) + 1;
this will return the position you are looking for
If you want to look for an exact match of a decimal input to a int on the list,you can use FindIndex.
var numbers = new List<int>() { 980, 850, 700, 680 };
var dec = new Decimal(680.0);
var res = numbers.FindIndex(x => x == dec);
It returns the 0-based position of the match.
Your question is not clear, i'm not sure what role 690 is playing.
Assuming that the user can ernter a number and you want to find the rank(index) of the number in the list when it would be inserted. Assuming also that your list should be sorted descending since you want the position of the new int according to it's value:
var input = 692;
var numbers = new List<int>() { 980, 850, 700, 680 };
var firstLower = numbers.OrderByDescending(i => i)
.Select((i, index) => new { Value = i, Index = index })
.FirstOrDefault(x => x.Value < input);
var rank = firstLower == null ? numbers.Count + 1 : firstLower.Index + 1;
Note that the OrderByDescending might be redundant if your list is already sorted, but i assume that your sample data is only sorted accidentally.

LINQ: GroupBy with maximum count in each group

I have a list of duplicate numbers:
Enumerable.Range(1,3).Select(o => Enumerable.Repeat(o, 3)).SelectMany(o => o)
// {1,1,1,2,2,2,3,3,3}
I group them and get quantity of occurance:
Enumerable.Range(1,3).Select(o => Enumerable.Repeat(o, 3)).SelectMany(o => o)
.GroupBy(o => o).Select(o => new { Qty = o.Count(), Num = o.Key })
Qty Num
3 1
3 2
3 3
What I really need is to limit the quantity per group to some number. If the limit is 2 the result for the above grouping would be:
Qty Num
2 1
1 1
2 2
1 2
2 3
1 3
So, if Qty = 10 and limit is 4, the result is 3 rows (4, 4, 2). The Qty of each number is not equal like in example. The specified Qty limit is the same for whole list (doesn't differ based on number).
Thanks
Some of the other answers are making the LINQ query far more complex than it needs to be. Using a foreach loop is certainly faster and more efficient, but the LINQ alternative is still fairly straightforward.
var input = Enumerable.Range(1, 3).SelectMany(x => Enumerable.Repeat(x, 10));
int limit = 4;
var query =
input.GroupBy(x => x)
.SelectMany(g => g.Select((x, i) => new { Val = x, Grp = i / limit }))
.GroupBy(x => x, x => x.Val)
.Select(g => new { Qty = g.Count(), Num = g.Key.Val });
There was a similar question that came up recently asking how to do this in SQL - there's no really elegant solution and unless this is Linq to SQL or Entity Framework (i.e. being translated into a SQL query), I'd really suggest that you not try to solve this problem with Linq and instead write an iterative solution; it's going to be a great deal more efficient and easier to maintain.
That said, if you absolutely must use a set-based ("Linq") method, this is one way you could do it:
var grouped =
from n in nums
group n by n into g
select new { Num = g.Key, Qty = g.Count() };
int maxPerGroup = 2;
var portioned =
from x in grouped
from i in Enumerable.Range(1, grouped.Max(g => g.Qty))
where (x.Qty % maxPerGroup) == (i % maxPerGroup)
let tempQty = (x.Qty / maxPerGroup) == (i / maxPerGroup) ?
(x.Qty % maxPerGroup) : maxPerGroup
select new
{
Num = x.Num,
Qty = (tempQty > 0) ? tempQty : maxPerGroup
};
Compare with the simpler and faster iterative version:
foreach (var g in grouped)
{
int remaining = g.Qty;
while (remaining > 0)
{
int allotted = Math.Min(remaining, maxPerGroup);
yield return new MyGroup(g.Num, allotted);
remaining -= allotted;
}
}
Aaronaught's excellent answer doesn't cover the possibility of getting the best of both worlds... using an extension method to provide an iterative solution.
Untested:
public static IEnumerable<IEnumerable<U>> SplitByMax<T, U>(
this IEnumerable<T> source,
int max,
Func<T, int> maxSelector,
Func<T, int, U> resultSelector
)
{
foreach(T x in source)
{
int number = maxSelector(x);
List<U> result = new List<U>();
do
{
int allotted = Math.Min(number, max);
result.Add(resultSelector(x, allotted));
number -= allotted
} while (number > 0 && max > 0);
yield return result;
}
}
Called by:
var query = grouped.SplitByMax(
10,
o => o.Qty,
(o, i) => new {Num = o.Num, Qty = i}
)
.SelectMany(split => split);

Resources