Linq GroupBy filter

I have the following linq expression pulling all data from my database:
var items = response.Select(a => a.SessionLocationID).ToArray();
mdl = _meetingRepository.Select<SessionLocation>()
.OrderBy(a => a.SessionDT).ThenBy(a => a.SessionEndTime);
Now I want to group by the field ActualRoom and keep only the groups where the count for that ActualRoom is greater than 3.
Is that possible?

You can use GroupBy, but keep in mind that grouping loses the ordering you already applied, so I would group and filter before doing the sorting:
var groups = _meetingRepository.Select<SessionLocation>()
.GroupBy(x => x.ActualRoom)
.Where(g => g.Count() > 3);
To have sorted groups, assuming that preserving the count as a separate property is not necessary, you can just project to an IEnumerable of IEnumerable<SessionLocation>:
var groups = _meetingRepository.Select<SessionLocation>()
.GroupBy(x => x.ActualRoom)
.Where(g => g.Count() > 3)
.Select(g => g.OrderBy(x => x.SessionDT).ThenBy(x => x.SessionEndTime));
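The filter-then-sort approach above can be tried against in-memory data. This is only a minimal sketch: the SessionLocation record below is a hypothetical stand-in with just the three fields used in the query, and the sample data is made up; the real entity and repository are not shown in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity: only the fields the query touches.
public record SessionLocation(string ActualRoom, DateTime SessionDT, TimeSpan SessionEndTime);

public static class RoomGrouping
{
    // Keeps only rooms with more than minCount sessions,
    // each group sorted by start time, then by end time.
    public static List<List<SessionLocation>> BusyRooms(
        IEnumerable<SessionLocation> sessions, int minCount = 3) =>
        sessions
            .GroupBy(s => s.ActualRoom)
            .Where(g => g.Count() > minCount)
            .Select(g => g.OrderBy(s => s.SessionDT)
                          .ThenBy(s => s.SessionEndTime)
                          .ToList())
            .ToList();

    public static void Main()
    {
        var day = new DateTime(2024, 1, 1);
        var sessions = new List<SessionLocation>();

        // Room "A" gets four sessions (added in reverse time order), room "B" only one.
        for (int i = 3; i >= 0; i--)
            sessions.Add(new SessionLocation("A", day.AddHours(i), TimeSpan.FromHours(i + 1)));
        sessions.Add(new SessionLocation("B", day, TimeSpan.FromHours(1)));

        foreach (var room in BusyRooms(sessions))
            Console.WriteLine($"{room[0].ActualRoom}: {room.Count} sessions, first at {room[0].SessionDT:HH:mm}");
    }
}
```

Sorting inside the final Select keeps the ordering per group, which is why it is applied after the Where rather than before the GroupBy.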

Related

Entity Framework LINQ Multiple Column Count(Distinct)

I'm not entirely sure how to translate the following SQL into LINQ Fluent API. Any help would be great
select count(distinct thread.Id) as ThreadCount, count(distinct segment.Id) as SegmentCount
from Segment
inner join SegmentCommunicator as SegmentCommunicator
on Segment.Id = SegmentCommunicator.SegmentId
inner join Thread
on Thread.Id = Segment.ThreadId
where SegmentCommunicator.CommunicatorId in (94, 3540, 6226, 10767, 20945)
Currently, I know how to do this in 2 queries but for the life of me I can't figure out how to condense down into one. Any help would be much appreciated
var aggregate1 = _threadProvider
.AsQueryable()
.SelectMany(t => t.Segments)
.Where(s => s.SegmentCommunicators.Any(sc => communicatorIds.Contains(sc.CommunicatorId)))
.Select(s => s.ThreadId)
.Distinct()
.Count();
var aggregate2 = _threadProvider
.AsQueryable()
.SelectMany(t => t.Segments)
.Where(s => s.SegmentCommunicators.Any(sc => communicatorIds.Contains(sc.CommunicatorId)))
.Select(s => s.Id)
.Distinct()
.Count();
You can use one "base" query and hydrate it once with as small a data set as possible; the two distinct counts are then still computed separately:
var query = _threadProvider
.AsQueryable()
.SelectMany(t => t.Segments)
.Where(s => s.SegmentCommunicators.Any(sc => communicatorIds.Contains(sc.CommunicatorId)))
.Select(s => new {s.ThreadId, s.Id})
.Distinct()
.ToList();
var aggregate1 = query.Select(s => s.ThreadId)
.Distinct()
.Count();
var aggregate2 = query.Select(s => s.Id)
.Distinct()
.Count();
You might be able to use GroupBy to do it in one query:
var query = _threadProvider
.AsQueryable()
.SelectMany(t => t.Segments)
.Where(s => s.SegmentCommunicators.Any(sc => communicatorIds.Contains(sc.CommunicatorId)))
.GroupBy(s => 1)
.Select(g => new
{
ThreadCount = g.Select(s => s.ThreadId).Distinct().Count(),
SegmentCount = g.Select(s => s.Id).Distinct().Count(),
});
but I doubt that the underlying query provider will support it (at best it will turn it into two sub-queries).
Note that neither query will likely perform as fast as the raw SQL, since SQL can optimize the query before returning the results to Linq.
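To see what the two counts compute, here is a minimal in-memory sketch. It assumes the (ThreadId, Id) pairs have already been fetched; the DistinctCounts class and the sample values are hypothetical, and nothing here says how an EF provider would translate the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctCounts
{
    // Counts distinct thread ids and distinct segment ids
    // from pairs that are already materialized in memory.
    public static (int ThreadCount, int SegmentCount) Count(
        IEnumerable<(int ThreadId, int Id)> segments)
    {
        var rows = segments.ToList(); // enumerate the source only once
        return (rows.Select(r => r.ThreadId).Distinct().Count(),
                rows.Select(r => r.Id).Distinct().Count());
    }

    public static void Main()
    {
        // Made-up data: three segments spread over two threads, one row duplicated
        // (as happens when a segment matches several communicators in a join).
        var rows = new List<(int ThreadId, int Id)>
        {
            (10, 1), (10, 2), (20, 3), (20, 3)
        };

        var (threads, segments) = Count(rows);
        Console.WriteLine($"ThreadCount={threads}, SegmentCount={segments}");
    }
}
```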

Linq All() / Any() but not empty

I have a Linq expression that is used in a few places. I went down the expression route as there wasn't a logical way to accomplish some searching logic without enumerating a very large table otherwise.
private Expression<Func<Property, bool>> PropertyIsCompliant()
{
return (p) => p.CalculationSets.OfType<SingleDocumentCalculationSet>()
.GroupBy(cs => cs.SourceDocument)
.Select(g => g.OrderByDescending(d => d.DateTime).FirstOrDefault().CalculationResults)
.SelectMany(cr => cr)
.All(cr => cr.Outcome == CalculationOutcome.Success);
}
My models are as such:
A Property has many CalculationSets
Each CalculationSet is also assigned to a Document
Each CalculationSet has a number of CalculationResults
Each CalculationResult has an Outcome
I'm trying to create an expression that will tell me if all the outcomes from the most recent calculationsets grouped by document ordered by most recent (ie the most recent distinct results) are Successful.
I can see that the SelectMany clause returns all the CalculationResults from the correct CalculationSets.
I just can't figure out how to return true ONLY if the collection isn't empty AND they are all Outcome.Success.
I understand the All operator automatically returns true on an empty collection. I just can't think of a way around it!
So your real condition is that there are not any unsuccessful outcomes. In that case use Any and reverse the condition:
//V-- notice the ! inverse operator here
return (p) => !(p.CalculationSets.OfType<SingleDocumentCalculationSet>()
.GroupBy(cs => cs.SourceDocument)
.Select(g => g.OrderByDescending(d => d.DateTime).FirstOrDefault().CalculationResults)
.SelectMany(cr => cr)
.Any(cr => cr.Outcome != CalculationOutcome.Success));
var countsBySuccess =
...
.GroupBy(cr => cr.Outcome == CalculationOutcome.Success) //group on success
.Select(g => new { IsSuccessful = g.Key, Count = g.Count() });
You can now examine the two result rows to make sure that the unsuccessful count is zero and the successful count is non-zero.
Regarding performance, this will need to materialize the entire result set server-side and aggregate it. But it does so only once.
If you must use the calculation result as part of a bigger query, you must use another trick:
!countsBySuccess.Any(g =>
g.IsSuccessful && g.Count == 0 ||
!g.IsSuccessful && g.Count != 0)
This boolean expression determines whether the condition you are looking for holds with one scan of the data.
It is important to only scan the data once. Do not simply write:
myItems.All(cr => cr.Outcome == CalculationOutcome.Success) && myItems.Any()
Because that does two scans. SQL Server does not optimize this out.
I think you're answering your question - if you know that All returns TRUE for empty then you have two checks to make. Excuse my C# (I'm not sure on the var query assignment, hopefully you get the idea) but you could do something like this:
private Expression<Func<Property, bool>> PropertyIsCompliant()
{
var query = (p) => p.CalculationSets.OfType<SingleDocumentCalculationSet>()
.GroupBy(cs => cs.SourceDocument)
.Select(g => g.OrderByDescending(d => d.DateTime).FirstOrDefault().CalculationResults)
.SelectMany(cr => cr);
return (query.Count > 0) & query.All(cr => cr.Outcome == CalculationOutcome.Success);
}
I didn't realise it was possible to use "&&" in expressions. So I've managed to combine 2 separate expressions that give the answer I need. The "&&" only returns true when both expressions evaluate "true"
return (p) =>
p.CalculationSets.OfType<SingleDocumentCalculationSet>()
.GroupBy(cs => cs.SourceDocument)
.Select(g => g.OrderByDescending(d => d.DateTime).FirstOrDefault().CalculationResults)
.SelectMany(cr => cr).Any()
&&
p.CalculationSets.OfType<SingleDocumentCalculationSet>()
.GroupBy(cs => cs.SourceDocument)
.Select(g => g.OrderByDescending(d => d.DateTime).FirstOrDefault().CalculationResults)
.SelectMany(cr => cr)
.All(cr => cr.Outcome == CalculationOutcome.Success);
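The behaviour the question hinges on is easy to check in LINQ to Objects: All returns true for an empty sequence. The sketch below shows that, plus a hypothetical AnyAndAll helper that requires at least one element and makes a single pass. The helper is a plain in-memory method, so it is an illustration only and would not be translated to SQL inside an Expression.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NonEmptyAll
{
    // True only when the sequence has at least one element and every element matches.
    // Single pass: returns as soon as one element fails the predicate.
    public static bool AnyAndAll<T>(IEnumerable<T> source, Func<T, bool> predicate)
    {
        bool seenAny = false;
        foreach (var item in source)
        {
            if (!predicate(item)) return false;
            seenAny = true;
        }
        return seenAny;
    }

    public static void Main()
    {
        var empty = Enumerable.Empty<int>();

        // All is vacuously true on an empty sequence.
        Console.WriteLine(empty.All(x => x > 0));

        // The helper rejects the empty case but accepts a non-empty all-matching one.
        Console.WriteLine(AnyAndAll(empty, x => x > 0));
        Console.WriteLine(AnyAndAll(new[] { 1, 2, 3 }, x => x > 0));
    }
}
```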

Is there an easy way to filter out unique elements using linq?

I have an xml document
<NumSet>
<num>1</num>
<num>2</num>
<num>2</num>
<num>3</num>
</NumSet>
I want only the elements that occur exactly once, i.e. 1 and 3, not Distinct, which would also return 2.
How can I do that? Do I have to use GroupBy? Is there a concise way to do it?
You are right, you can use GroupBy and keep only the groups that have exactly one item by using Count() == 1:
var output = XDocument.Load(xmlFile)
.Descendants("num")
.Select(e => e.Value)
.GroupBy(x => x)
.Where(g => g.Count() == 1)
.Select(g => g.Key);
It sounds like you want a Distinct GroupBy query... Take a look at the post "Need help on Linq with group by and distinct" here on StackOverflow.
XElement xe = XElement.Parse(@"<NumSet><num>1</num><num>2</num><num>2</num><num>3</num></NumSet>");
var query = xe.Elements("num")
.GroupBy(x => x.Value)
.Where(g => g.Count() == 1)
.Select(g => g.Key);
To do what you need I'd say that yes, you need to use GroupBy, then count the elements in each group and return those that contain just one element. In code, this translates to:
var query = lst.GroupBy(x => x)
.Where(x => x.Count() == 1)
.Select(x => x.Key);
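Put together as a small runnable sketch, using the XML from the question (the UniqueValues class and FromXml method names are made up for this example):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class UniqueValues
{
    // Returns the values of <num> elements that occur exactly once.
    public static List<string> FromXml(string xml) =>
        XElement.Parse(xml)
            .Elements("num")
            .Select(e => e.Value)
            .GroupBy(v => v)
            .Where(g => g.Count() == 1)
            .Select(g => g.Key)
            .ToList();

    public static void Main()
    {
        const string xml =
            "<NumSet><num>1</num><num>2</num><num>2</num><num>3</num></NumSet>";

        Console.WriteLine(string.Join(", ", FromXml(xml))); // prints "1, 3"
    }
}
```

Distinct would return 1, 2 and 3 here; grouping and keeping only groups of size one is what drops the repeated 2.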

Linq GroupBy TakeWhile?

Using EF4 and Linq
I have a data structure which looks like this:
parentid childid version
1 2 1
1 3 1
1 4 1
1 2 2
1 3 2
1 5 2
...
That is, I want to select the parentid, childid but only for the highest version, for every parentid there is in the database.
My first attempt was this:
var links = data
.GroupBy (link => link.parentid)
.Select(ig => ig.OrderByDescending(link => link.version).First())
.Select ( link => new ....... );
However, this obviously only selects one of the child id's for each parent id..
In the sample data above I want to get the child ids 2,3,5 from version 2 for parent 1 that is..
This should meet your requirement..
var links = data.Where(x => x.version == data.Max(y => y.version))
.OrderBy(x => x.childid)
.Select(x => new {parentid = x.parentid, childid = x.childid});
if you need only for parent 1 then
var links = data.Where(x => x.version == data.Max(y => y.version))
.Where(x => x.parentid == 1)
.OrderBy(x => x.childid)
.Select(x => new {parentid = x.parentid, childid = x.childid});
I ended up using a subquery like this:
var links = data
.OrderBy(link => link.parentid)
.ThenBy(link => link.childid)
.Where(o => o.version == data
.Where(i => i.parentid == o.parentid).Max(l => l.version))
.Select(link => new ....);
This way, I get all the parentid's in the db with only the children with the max version for that specific parent.
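The requirement of "all children at the highest version, per parent" can also be written with GroupBy on in-memory data. This is only a sketch: the Link record and LatestLinks class are hypothetical names, the rows are the sample from the question, and the statement lambda used here works in LINQ to Objects but would not be translated by an EF query provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one row: parentid, childid, version.
public record Link(int ParentId, int ChildId, int Version);

public static class LatestLinks
{
    // For every parent, keeps only the rows whose version equals that parent's max version.
    public static List<Link> LatestPerParent(IEnumerable<Link> links) =>
        links
            .GroupBy(l => l.ParentId)
            .SelectMany(g =>
            {
                int max = g.Max(l => l.Version); // computed once per parent
                return g.Where(l => l.Version == max);
            })
            .OrderBy(l => l.ParentId)
            .ThenBy(l => l.ChildId)
            .ToList();

    public static void Main()
    {
        var data = new List<Link>
        {
            new Link(1, 2, 1), new Link(1, 3, 1), new Link(1, 4, 1),
            new Link(1, 2, 2), new Link(1, 3, 2), new Link(1, 5, 2),
        };

        foreach (var link in LatestPerParent(data))
            Console.WriteLine($"parent {link.ParentId} -> child {link.ChildId} (v{link.Version})");
    }
}
```

For the sample rows this keeps children 2, 3 and 5 of parent 1, all at version 2.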

Complex Linq Collection Query

I have this DB diagram and want to make a query to find all UserLists in a given region. RegionId is supplied.
So I can get all the departments by this code (may not be the best way..):
var region = context.Regions.Find(regionId);
IEnumerable<Department> departments = region.Areas
.SelectMany(a => a.Workplaces)
.SelectMany(w => w.Departments);
The Account can have many UserLists, and an Account can be linked to many Departments. Can someone formulate a query to achieve this please?
For completeness, the final code was:
List<UserList> query2 = context.Regions.Where(r => r.RegionId == regionId)
.SelectMany(r => r.Areas)
.SelectMany(a => a.Workplaces)
.SelectMany(w => w.Departments)
.SelectMany(d => d.AccountsAllowedToPost)
.Distinct()
.SelectMany(da => da.Lists).ToList();
You can use the let syntax (or the .Select method) to navigate the ManyToOne relationship.
var query =
from r in context.Regions
where r.RegionId == regionId
from a in r.Areas
from w in a.Workplaces
from d in w.Departments
from da in d.DepartmentAccounts
let acc = da.Account
from u in acc.UserLists
select u;
var query2 = context.Regions.Where(r => r.RegionId == regionId)
.SelectMany(r => r.Areas)
.SelectMany(a => a.Workplaces)
.SelectMany(w => w.Departments)
.SelectMany(d => d.DepartmentAccounts)
.Select(da => da.Account)
.SelectMany(acc => acc.UserLists);

Resources