Entity Framework LINQ Multiple Column Count(Distinct)

I'm not entirely sure how to translate the following SQL into LINQ method (fluent) syntax. Any help would be great.
select count(distinct thread.Id) as ThreadCount, count(distinct segment.Id) as SegmentCount
from Segment
inner join SegmentCommunicator as SegmentCommunicator
on Segment.Id = SegmentCommunicator.SegmentId
inner join Thread
on Thread.Id = Segment.ThreadId
where SegmentCommunicator.CommunicatorId in (94, 3540, 6226, 10767, 20945)
Currently, I know how to do this in two queries, but for the life of me I can't figure out how to condense it into one. Any help would be much appreciated.
var aggregate1 = _threadProvider
.AsQueryable()
.SelectMany(t => t.Segments)
.Where(s => s.SegmentCommunicators.Any(sc => communicatorIds.Contains(sc.CommunicatorId)))
.Select(s => s.ThreadId)
.Distinct()
.Count();
var aggregate2 = _threadProvider
.AsQueryable()
.SelectMany(t => t.Segments)
.Where(s => s.SegmentCommunicators.Any(sc => communicatorIds.Contains(sc.CommunicatorId)))
.Select(s => s.Id)
.Distinct()
.Count();

You can use one "base" query, but the two distinct counts will then be computed separately in memory. The base query hits the database once and pulls back as small a data set as possible:
var query = _threadProvider
.AsQueryable()
.SelectMany(t => t.Segments)
.Where(s => s.SegmentCommunicators.Any(sc => communicatorIds.Contains(sc.CommunicatorId)))
.Select(s => new {s.ThreadId, s.Id})
.Distinct()
.ToList();
var aggregate1 = query.Select(s => s.ThreadId)
.Distinct()
.Count();
var aggregate2 = query.Select(s => s.Id)
.Distinct()
.Count();
You might be able to use GroupBy to do it in one query:
var query = _threadProvider
.AsQueryable()
.SelectMany(t => t.Segments)
.Where(s => s.SegmentCommunicators.Any(sc => communicatorIds.Contains(sc.CommunicatorId)))
.GroupBy(s => 1)
.Select(g => new
{
ThreadCount = g.Select(s => s.ThreadId).Distinct().Count(),
SegmentCount = g.Select(s => s.Id).Distinct().Count(),
});
but I doubt that the underlying query provider will support it (at best it will turn it into two sub-queries).
Note that neither query is likely to perform as fast as the raw SQL, since the database can plan and optimize the hand-written statement as a whole before any results are returned to LINQ.
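To check that the GroupBy-on-a-constant shape gives the same numbers as the two separate counts, it can be tried against in-memory data first. This is only a sketch: the Segment record and its CommunicatorIds array are hypothetical stand-ins for the real entities, and it runs on LINQ to Objects, so it says nothing about the SQL a particular provider would generate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the Segment entity; not the asker's real model.
public record Segment(int Id, int ThreadId, int[] CommunicatorIds);

public static class DistinctCounts
{
    // Returns both distinct counts for segments linked to any of the given communicators.
    public static (int ThreadCount, int SegmentCount) Count(
        IEnumerable<Segment> segments, ICollection<int> communicatorIds)
    {
        return segments
            .Where(s => s.CommunicatorIds.Any(communicatorIds.Contains))
            .GroupBy(s => 1) // constant key: a single group holding every matching row
            .Select(g => (
                ThreadCount: g.Select(s => s.ThreadId).Distinct().Count(),
                SegmentCount: g.Select(s => s.Id).Distinct().Count()))
            .SingleOrDefault(); // default tuple (0, 0) when nothing matches
    }
}
```

Calling DistinctCounts.Count(segments, communicatorIds) yields both counts from one pass over the filtered rows, which is the behaviour the SQL statement in the question describes.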

Related

Which of these LINQ queries performs better?

In the context of Entity Framework, which of these queries performs better?
var result = Items.OrderByDescending(t => t.Created)
.FirstOrDefault(a => a.ID == ItemID)?.ItemName;
or
var result = Items.Where(a => a.ID == ItemID)
.OrderByDescending(t => t.Created)
.FirstOrDefault()?.ItemName;
Is there anywhere I can read about how these LINQ queries are converted to SQL in this case?

Where in clause using linq

I'm trying to convert a query that has two levels of WHERE ... IN clauses to LINQ and am getting some errors. Can anybody help me with this?
Original Query:
select id
from student
where suId
in (select suId
from subjects
where cid
in (select id
from chapters
where chapter='C203'))
LINQ query:
var query = (from s in dc.students
let subs = (from su in dc.subjects
where su.cid == Convert.ToInt32(from c in dc.Chapters
where c.chapter == "C203"
select c.id) //Single chapter id will be returned
select su.suid)
where subs.Contains(s.sid)
select s.id).ToArray();
I am getting the two errors below when compiling the app:
'System.Linq.IQueryable' does not contain a definition for 'Contains' and the best extension method overload 'System.Linq.ParallelEnumerable.Contains(System.Linq.ParallelQuery, TSource)' has some invalid arguments
Instance argument: cannot convert from 'System.Linq.IQueryable' to 'System.Linq.ParallelQuery'
Since LINQ queries use deferred execution, you don't need to cram everything into a single statement; you can do something like this:
var chapterIds = dc.Chapters
.Where(c => c.Chapter == "C203")
.Select(c => c.Id);
var subjectIds = dc.Subjects
.Where(s => chapterIds.Contains(s.Cid))
.Select(s => s.Suid);
var students = dc.Students
.Where(s => subjectIds.Contains(s.Suid))
.Select(s => s.Sid)
.ToArray();
This way you can debug each subquery by looking at what it returns.
However, looking at your original select, you can rewrite the whole thing as a Join and avoid the nested Contains calls altogether:
var students = dc.Chapters.Where(c => c.Chapter == "C203")
.Join(dc.Subjects,
c => c.Id,
s => s.Cid,
(chapter, subject) => subject)
.Join(dc.Students,
subj => subj.Suid,
student => student.Suid,
(subj, st) => st.Sid)
.ToArray();
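One difference between the two shapes is worth checking on sample data: a join emits one row per match, whereas IN / Contains returns each student at most once. The sketch below runs both on in-memory collections. The Chapter, Subject and Student records are assumptions modelled on the column names in the question, and the Distinct() call is an addition that is not in the join shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes mirroring the tables in the question.
public record Chapter(int Id, string Name);
public record Subject(int Suid, int Cid);
public record Student(int Id, int Suid);

public static class StudentLookup
{
    // Nested "IN" style: each step filters by membership in the previous result.
    public static int[] ViaContains(IEnumerable<Chapter> chapters,
        IEnumerable<Subject> subjects, IEnumerable<Student> students, string chapter)
    {
        var chapterIds = chapters.Where(c => c.Name == chapter).Select(c => c.Id).ToHashSet();
        var subjectIds = subjects.Where(s => chapterIds.Contains(s.Cid)).Select(s => s.Suid).ToHashSet();
        return students.Where(s => subjectIds.Contains(s.Suid)).Select(s => s.Id).ToArray();
    }

    // Join style: Distinct() removes the repeats produced when a student
    // is reachable through more than one chapter/subject row.
    public static int[] ViaJoin(IEnumerable<Chapter> chapters,
        IEnumerable<Subject> subjects, IEnumerable<Student> students, string chapter)
    {
        return chapters.Where(c => c.Name == chapter)
            .Join(subjects, c => c.Id, s => s.Cid, (c, s) => s)
            .Join(students, s => s.Suid, st => st.Suid, (s, st) => st.Id)
            .Distinct()
            .ToArray();
    }
}
```

If the data guarantees that each student can match through only one subject row, the Distinct() is redundant; otherwise leaving it out changes the result compared with the original SQL.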

Linq To Entities Optional Distinct

Earlier I posted a question on Stack Overflow about how to remove duplicate records in a list of objects, based on a particular property within each object.
I got the answer I was looking for (see below), a query which returns a distinct list of objects using MainHeadingID as the property to remove duplicates.
public IList<tblcours> GetAllCoursesByOrgID(int id)
{
return _UoW.tblcoursRepo.All.
Where(c => c.tblCourseCategoryLinks.Any(cl => cl.tblUnitCategory.tblUnit.ParentID == id))
.GroupBy(c => c.MainHeadingID)
.Select(g => g.FirstOrDefault())
.ToList();
}
However, now I need more help! Is there any way of amending the query above so that it only removes duplicate values when MainHeadingID is not equal to 180? I tried amending the GroupBy line to
.GroupBy(c => c.MainHeadingID != 180)
However, this didn't work.
Any help would be much appreciated with this.
Thanks.
The following works for LINQ to SQL:
return _UoW.tblcoursRepo.All
.Where(c => c.tblCourseCategoryLinks.Any(cl => cl.tblUnitCategory.tblUnit.ParentID == id))
.GroupBy(c => c.MainHeadingID)
//.SelectMany(g => g.Key == 180 ? g : g.Take(1))
.SelectMany(g => g.Take(g.Key == 180 ? Int32.MaxValue : 1))
.ToList();
Comments: SelectMany in the query above selects all items from the group where MainHeadingID equals 180, but takes only one item from each of the other groups (i.e. a distinct result). LINQ to SQL cannot translate the commented-out part, but thanks to @usr there is a way around it.
LINQ to Entities cannot translate even the simplified query. I think the only option for you in this case is simply concatenating the results of two queries:
Expression<Func<tblcours, bool>> predicate = x =>
x.tblCourseCategoryLinks.Any(cl => cl.tblUnitCategory.tblUnit.ParentID == id);
int headingId = 180;
return _UoW.tblcoursRepo.All
.Where(c => c.MainHeadingID != headingId)
.Where(predicate)
.GroupBy(c => c.MainHeadingID)
.Select(g => g.FirstOrDefault())
.Concat(_UoW.tblcoursRepo.All
.Where(c => c.MainHeadingID == headingId)
.Where(predicate))
.ToList();
lazyberezovsky's answer fails due to an EF bug (which is not surprising given the quality of EF's LINQ support). It can be made to work with a hack:
.SelectMany(g => g.Key == 180 ? g.Take(int.MaxValue) : g.Take(1))
or
.SelectMany(g => g.Take(g.Key == 180 ? int.MaxValue : 1))
Note that performance will not be particularly good due to the way this is translated to SQL.
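The conditional Take idea is easy to verify on in-memory data before fighting with the provider. A minimal sketch, assuming a hypothetical Course record in place of tblcours; it runs on LINQ to Objects only and does not show whether EF or LINQ to SQL can translate it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for tblcours, reduced to the two fields that matter here.
public record Course(int Id, int MainHeadingID);

public static class CourseFilter
{
    // Keeps every course under keepAllHeadingId and one course per other heading.
    public static List<Course> DistinctExcept(IEnumerable<Course> courses, int keepAllHeadingId)
    {
        return courses
            .GroupBy(c => c.MainHeadingID)
            // Take(int.MaxValue) keeps the whole group, Take(1) keeps its first item.
            .SelectMany(g => g.Take(g.Key == keepAllHeadingId ? int.MaxValue : 1))
            .ToList();
    }
}
```

For example, CourseFilter.DistinctExcept(courses, 180) leaves all courses with heading 180 in place and collapses every other heading to its first course.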

Linq GroupBy filter

I have the following linq expression pulling all data from my database:
var items = response.Select(a => a.SessionLocationID).ToArray();
mdl = _meetingRepository.Select<SessionLocation>()
.OrderBy(a => a.SessionDT).ThenBy(a => a.SessionEndTime);
Now I want to group by the field ActualRoom and keep only the groups where the ActualRoom count is greater than 3.
Is that possible?
You can use GroupBy, just keep in mind that you are losing the ordering you already did so I would start off before you do the sorting:
var groups = _meetingRepository.Select<SessionLocation>()
.GroupBy(x => x.ActualRoom)
.Where(g => g.Count() > 3);
To get sorted groups, assuming that preserving the count as a separate property is not necessary, you can just project to an IEnumerable of IEnumerable<SessionLocation>:
var groups = _meetingRepository.Select<SessionLocation>()
.GroupBy(x => x.ActualRoom)
.Where(g => g.Count() > 3)
.Select(g => g.OrderBy(x => x.SessionDT).ThenBy(x => x.SessionEndTime));
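The same group, filter, then sort pipeline can be exercised against a plain list to confirm that the ordering is applied inside each surviving group. This is a sketch under assumptions: the SessionLocation record below is hypothetical, and SessionEndTime is assumed to be a DateTime, which the question does not state.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the SessionLocation entity; field types are assumed.
public record SessionLocation(string ActualRoom, DateTime SessionDT, DateTime SessionEndTime);

public static class RoomGroups
{
    // Returns the sessions of every room used more than minCount times,
    // each room's sessions sorted by start and then end time.
    public static List<List<SessionLocation>> BusyRooms(
        IEnumerable<SessionLocation> sessions, int minCount)
    {
        return sessions
            .GroupBy(x => x.ActualRoom)
            .Where(g => g.Count() > minCount)
            .Select(g => g.OrderBy(x => x.SessionDT).ThenBy(x => x.SessionEndTime).ToList())
            .ToList();
    }
}
```

RoomGroups.BusyRooms(sessions, 3) corresponds to the "count > 3" filter from the question.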

Complex Linq Collection Query

I have this DB diagram and want to make a query to find all UserLists in a given region. RegionId is supplied.
So I can get all the departments by this code (may not be the best way..):
var region = context.Regions.Find(regionId);
IEnumerable<Department> departments = region.Areas
.SelectMany(a => a.Workplaces)
.SelectMany(w => w.Departments);
An Account can have many UserLists, and an Account can be linked to many Departments. Can someone formulate a query to achieve this please?
For completeness, the final code was:
List<UserList> query2 = context.Regions.Where(r => r.RegionId == regionId)
.SelectMany(r => r.Areas)
.SelectMany(a => a.Workplaces)
.SelectMany(w => w.Departments)
.SelectMany(d => d.AccountsAllowedToPost)
.Distinct()
.SelectMany(da => da.Lists).ToList();
You can use the let syntax (or the .Select method) to navigate the ManyToOne relationship.
var query =
from r in context.Regions
where r.RegionId == regionId
from a in r.Areas
from w in a.Workplaces
from d in w.Departments
from da in d.DepartmentAccounts
let acc = da.Account
from u in acc.UserLists
select u;
var query2 = context.Regions.Where(r => r.RegionId == regionId)
.SelectMany(r => r.Areas)
.SelectMany(a => a.Workplaces)
.SelectMany(w => w.Departments)
.SelectMany(d => d.DepartmentAccounts)
.Select(da => da.Account)
.SelectMany(acc => acc.UserLists);
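Because an Account can be linked to several Departments, the same account can show up more than once after flattening, which is presumably why the final code above calls Distinct() before selecting the lists. The sketch below shows the effect on in-memory objects. The Account and Department classes are hypothetical and heavily reduced, and the string list stands in for UserList entities.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, reduced model: only the navigation needed for the example.
public class Account
{
    public int Id;
    public List<string> UserLists = new List<string>();
}

public class Department
{
    public List<Account> Accounts = new List<Account>();
}

public static class RegionLists
{
    // Flattens departments to accounts, removes repeated accounts
    // (reference equality here), then flattens to their lists.
    public static List<string> UserListsFor(IEnumerable<Department> departments)
    {
        return departments
            .SelectMany(d => d.Accounts)
            .Distinct()
            .SelectMany(a => a.UserLists)
            .ToList();
    }
}
```

Without the Distinct() step, an account shared by two departments would contribute its lists twice.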
