Why is Linq that slow (see provided examples) - performance

This Linq is very slow:
IEnumerable<string> iedrDataRecordIDs = dt1.AsEnumerable()
.Where(x => x.Field<string>(InputDataSet.Column_Arguments_Name) == sArgumentName
&& x.Field<string>(InputDataSet.Column_Arguments_Value) == sArgumentValue)
.Select(x => x.Field<string>(InputDataSet.Column_Arguments_RecordID));
IEnumerable<string> iedrDataRecordIDs_Filtered = dt2.AsEnumerable()
.Where(x => iedrDataRecordIDs.Contains(
x.Field<string>(InputDataSet.Column_DataRecordFields_RecordID))
&& x.Field<string>(InputDataSet.Column_DataRecordFields_Field)
== sDataRecordFieldField
&& x.Field<string>(InputDataSet.Column_DataRecordFields_Value)
== sDataRecordFieldValue)
.Select(x => x.Field<string>(InputDataSet.Column_DataRecordFields_RecordID));
IEnumerable<string> ieValue = dt2.AsEnumerable()
.Where(x => x.Field<string>(InputDataSet.Column_DataRecordFields_RecordID)
== iedrDataRecordIDs_Filtered.FirstOrDefault()
&& x.Field<string>(InputDataSet.Column_DataRecordFields_Field) == sFieldName)
.Select(x => x.Field<string>(InputDataSet.Column_DataRecordFields_Value));
if (!ieValue.Any()) //very slow at this point
return iedrDataRecordIDs_Filtered.FirstOrDefault();
This change accelerates it by a factor of 10 or more
string sRecordID = dt2.AsEnumerable()
.Where(x => iedrDataRecordIDs.Contains(
x.Field<string>(InputDataSet.Column_DataRecordFields_RecordID))
&& x.Field<string>(InputDataSet.Column_DataRecordFields_Field)
== sDataRecordFieldField
&& x.Field<string>(InputDataSet.Column_DataRecordFields_Value)
== sDataRecordFieldValue)
.Select(x => x.Field<string>(InputDataSet.Column_DataRecordFields_RecordID))
.FirstOrDefault();
IEnumerable<string> ieValue = dt2.AsEnumerable()
.Where(x => x.Field<string>(InputDataSet.Column_DataRecordFields_RecordID) == sRecordID
&& x.Field<string>(InputDataSet.Column_DataRecordFields_Field) == sFieldName)
.Select(x => x.Field<string>(InputDataSet.Column_DataRecordFields_Value));
if (!ieValue.Any()) //very fast at this point
return sRecordID;
The only change is that I store the result directly in a new variable and build the where clause with that value instead of with a LINQ query (which should be evaluated when needed). But LINQ seems to evaluate it in a bad way here, or am I doing something wrong?
Here are some values from my data:
dt1.Rows.Count 142
dt1.Columns.Count 3
dt2.Rows.Count 159
dt2.Columns.Count 3
iedrDataRecordIDs.Count() 1
iedrDataRecordIDs_Filtered.Count() 1
ieValue.Count() 1

You're asking why
IEnumerable<string> iedrDataRecordIDs_Filtered = data;
foreach (var item in collection)
{
// do something with
iedrDataRecordIDs_Filtered.FirstOrDefault();
}
is slower than
string sRecordID = data.FirstOrDefault();
foreach (var item in collection)
{
// do something with
sRecordID;
}
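Very simply, because you're re-evaluating the iedrDataRecordIDs_Filtered query every time you call FirstOrDefault on it. An IEnumerable<string> built from Where and Select isn't a concrete collection; it's a deferred query, really just a function that produces objects when it is enumerated. Every time you enumerate it, the function runs again and you pay that execution cost. Here it is enumerated once per row of dt2, and each of those enumerations in turn re-runs iedrDataRecordIDs for the Contains check.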
If you materialize the query once:
IEnumerable<string> iedrDataRecordIDs_Filtered = dt2.AsEnumerable()...
var recordIDs = iedrDataRecordIDs_Filtered.ToList();
and then use recordIDs.FirstOrDefault(), you'll see a huge performance increase.
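The re-evaluation cost can be made visible with a small LINQ to Objects sketch. It does not use the DataTables from the question; the three-element array, the counter, and all names below are made up for illustration. It only counts how often the Where predicate runs when FirstOrDefault is called on a deferred query versus on a list materialized with ToList().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Incremented every time the Where predicate runs.
    private static int predicateCalls;

    // Hypothetical stand-in for the filtered query in the question.
    private static IEnumerable<string> BuildQuery()
    {
        var source = new[] { "a", "b", "c" };
        return source.Where(s => { predicateCalls++; return s == "b"; });
    }

    // Calls FirstOrDefault on the deferred query once per loop iteration.
    public static int CountCallsDeferred(int iterations)
    {
        predicateCalls = 0;
        IEnumerable<string> query = BuildQuery();
        for (int i = 0; i < iterations; i++)
        {
            query.FirstOrDefault(); // re-runs the filter from the start
        }
        return predicateCalls;
    }

    // Materializes the query once, then reads from the list.
    public static int CountCallsMaterialized(int iterations)
    {
        predicateCalls = 0;
        List<string> results = BuildQuery().ToList(); // the filter runs here, once
        for (int i = 0; i < iterations; i++)
        {
            results.FirstOrDefault(); // plain list access, no filtering
        }
        return predicateCalls;
    }

    public static void Main()
    {
        Console.WriteLine(CountCallsDeferred(5));     // prints 10 (2 predicate calls x 5)
        Console.WriteLine(CountCallsMaterialized(5)); // prints 3 (one pass over 3 items)
    }
}
```

In the question the gap is much larger, because the re-run query itself contains a Contains call over another deferred query.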

Related

Linq: Where count greater than value

I have a LINQ query which accepts a list of date and port combinations. This query has to return data from a table, CruiseCalendar, where these combinations are found, but only when the count is greater than one. I can't work out the GroupBy and Count syntax. var shipRendezvous is where I'm stuck.
var dateAndPort = (from r in context.CruiseCalendar
where r.ShipId == shipId
&& r.CruiseDayDate >= dateRange.First
&& r.CruiseDayDate <= dateRange.Last
select new DateAndPort
{
Date = r.CruiseDayDate,
PortId = r.PortId
});
var shipRendezvous = (from r in context.CruiseCalendar
where (dateAndPort.Any(d => d.Date == r.CruiseDayDate
&& d.PortId == r.PortId))
orderby r.CruiseDayDate // (Added since first posting)
select r).ToList();
regards, Guy
If I understood you correctly, you are filtering for every row which matches any of the results of dateAndPort and then want to group the result to get a count. Of the resulting groups, you only want those that contain more than one row.
var shipRendezvous = (from r in context.CruiseCalendar
where (dateAndPort.Any(d => d.Date == r.CruiseDayDate
&& d.PortId == r.PortId))
select r)
.GroupBy(x => x.CruiseDayDate) //Groups by every combination
.Where(x => x.Count() > 1) //Where Key count is greater 1
.ToList();
Based on your comment, you want to flatten the list again. To do so, use SelectMany():
var shipRendezvous = (from r in context.CruiseCalendar
where (dateAndPort.Any(d => d.Date == r.CruiseDayDate
&& d.PortId == r.PortId))
select r)
.GroupBy(x => x.CruiseDayDate) //Groups by every combination
.Where(x => x.Count() > 1) //Where Key count is greater 1
.SelectMany(x => x)
.ToList();
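The GroupBy / Where / SelectMany chain can be tried out in memory. This is only a sketch in LINQ to Objects, not the Entity Framework query itself: the CruiseDay class, the sample rows, and the choice to group on both date and port are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a CruiseCalendar row.
public sealed class CruiseDay
{
    public int ShipId { get; set; }
    public DateTime Date { get; set; }
    public int PortId { get; set; }
}

public static class RendezvousDemo
{
    // Returns only the rows whose (date, port) combination occurs more than once.
    public static List<CruiseDay> FindRendezvous(IEnumerable<CruiseDay> rows)
    {
        return rows
            .GroupBy(r => new { r.Date, r.PortId }) // one group per date/port combination
            .Where(g => g.Count() > 1)              // keep combinations shared by several rows
            .SelectMany(g => g)                     // flatten the groups back into rows
            .OrderBy(r => r.Date)
            .ToList();
    }

    public static void Main()
    {
        var rows = new List<CruiseDay>
        {
            new CruiseDay { ShipId = 1, Date = new DateTime(2020, 1, 1), PortId = 10 },
            new CruiseDay { ShipId = 2, Date = new DateTime(2020, 1, 1), PortId = 10 },
            new CruiseDay { ShipId = 1, Date = new DateTime(2020, 1, 2), PortId = 11 },
            new CruiseDay { ShipId = 3, Date = new DateTime(2020, 1, 2), PortId = 12 },
        };

        List<CruiseDay> result = FindRendezvous(rows);
        Console.WriteLine(string.Join(",", result.Select(r => r.ShipId))); // prints 1,2
    }
}
```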

Getting the Error in my code when framing LINQ

My Code:
var lastName = employees
.Where(a => a.Number ==
(dm.MTM.Where(b => b.MTT.IsManager)
.Select(c => c.Number)
.FirstOrDefault()))
.Select(z => z.LastName)
.FirstOrDefault();
Error Message:
Unable to create a constant value of type 'XXX.Models.Mt.MTM'. Only primitive types or enumeration types are supported in this context.
Try:
int? num = dm.MTM.Where(b => b.MTT.IsManager).Select(c => c.Number).FirstOrDefault();
var lastName = employees.Where(a => a.Number == num).Select(z => z.LastName).FirstOrDefault();
But you should add a check
if (num == null)
{
// bad things, don't execute second query
}
between the two instructions.
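The error occurs because Entity Framework tries to translate the whole expression into SQL, and the only values it can embed as constants are primitive or enumeration types. The nested query that calculates num is not something it can translate in that position, so compute it separately first.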
Try:
// Calculate the number outside of the main query
// because the result is fixed
var nb = dm.MTM
.Where(b => b.MTT.IsManager)
.Select(c => c.Number)
.FirstOrDefault();
// Perform the main query with the number parameter already calculated before
string lastName = String.Empty;
if (nb != null) // if null no need to run the query
{
lastName = employees
.Where(a => a.Number == nb)
.Select(z => z.LastName)
.FirstOrDefault();
}
You don't need to get the number from the database for each employee; calculate the value once before running your main query.
This will be faster, less error-prone and better for caching.

linq to sql fetching all the records category wise in the list<> and then looping

I am fetching all the records for one organization from the database with the query below. It returns about 30-40 records.
List<PagesRef> paages = (from pagess in pagerepository.GetAllPages()
join pagesref in pagerepository.GetAllPageRef()
on pagess.int_PageId equals pagesref.int_PageId
where (pagess.int_PostStatusId != 3 && pagess.int_OrganizationId == Authorization.OrganizationID)
&& pagesref.int_PageRefId == pagesref.Pages.PagesRefs.FirstOrDefault(m => m.int_PageId == pagess.int_PageId && m.bit_Active == true && (m.vcr_PageTitle != null && m.vcr_PageTitle != "")).int_PageRefId
select pagesref).ToList();
The next step is to loop through the list above with a LINQ to Objects query, without going to the database again, to generate a 3-level hierarchy of records. Can someone give me some insight or an idea of how I can do it?
edit
var parentrecord = paages.Where(n => n.Pages.int_PageParent == 0).OrderBy(m => m.Pages.int_SortOrder == null).OrderBy(m => m.int_PageId);
foreach (var secondlevel in parentrecord) // if parentrecord found
{
var seclevel = paages.Where(m => m.Pages.int_PageParent == secondlevel.Pages.int_PageId).OrderBy(m => m.Pages.int_SortOrder == null).OrderBy(m => m.Pages.int_SortOrder);
secondlevel.vcr_PageTitle = "parent";
pagesreff.Add(secondlevel); // if parentrecord found then loop and add in there
foreach (var thdlevel in seclevel)
{
var thirdlevel = paages.Where(m => m.Pages.int_PageParent == thdlevel.Pages.int_PageId).OrderBy(m => m.Pages.int_SortOrder == null).OrderBy(m => m.int_PageId).OrderBy(m => m.Pages.int_SortOrder);
thdlevel.vcr_PageTitle = "child";
pagesreff.Add(thdlevel); // if parentrecord child found then loop and add in there
foreach (var thd in thirdlevel)
{
thd.vcr_PageTitle = "subchild";
pagesreff.Add(thd); // if parentrecord child found then loop and add in there
}
}
}
When ToList() is called, LINQ to SQL goes to the database and gets the rows. After that, you have a collection of objects in memory and can do what you want with LINQ to Objects:
var filteredList = paages.Where(someFilter);
There will be no new SQL requests.
Update
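Your problem is that you filter on a navigation property (m.Pages), which is loaded lazily, so you should load the navigation property together with your first query. I'm not sure (LINQ to SQL was many years ago :)), but in LINQ to SQL eager loading is configured through DataLoadOptions on the DataContext before the query runs, so something like this should help (I assume that Pages is the association property on PagesRef and that dataContext is the DataContext behind pagerepository):
// Assumption: dataContext is the DataContext used by pagerepository; set this before any query runs on it
var loadOptions = new DataLoadOptions();
loadOptions.LoadWith<PagesRef>(p => p.Pages);
dataContext.LoadOptions = loadOptions;
List<PagesRef> paages = (from pagess in pagerepository.GetAllPages()
join pagesref in pagerepository.GetAllPageRef()
on pagess.int_PageId equals pagesref.int_PageId
where (pagess.int_PostStatusId != 3 && pagess.int_OrganizationId == Authorization.OrganizationID)
&& pagesref.int_PageRefId == pagesref.Pages.PagesRefs.FirstOrDefault(m => m.int_PageId == pagess.int_PageId && m.bit_Active == true && (m.vcr_PageTitle != null && m.vcr_PageTitle != "")).int_PageRefId
select pagesref).ToList();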
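Once the rows are in memory, the three nested loops from the question can also be written as one recursive walk over a lookup keyed by parent id. The sketch below is LINQ to Objects only; the PageNode class and its property names are hypothetical simplifications of PagesRef and Pages, and a parent id of 0 is assumed to mark a top-level page, as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened view of a page (not the real PagesRef entity).
public sealed class PageNode
{
    public int Id { get; set; }
    public int ParentId { get; set; }   // 0 means top level
    public int? SortOrder { get; set; }
    public string Title { get; set; }
}

public static class HierarchyDemo
{
    private static readonly string[] LevelNames = { "parent", "child", "subchild" };

    // Returns "level:title" entries in hierarchy order, at most three levels deep.
    public static List<string> Flatten(IEnumerable<PageNode> pages)
    {
        ILookup<int, PageNode> byParent = pages.ToLookup(p => p.ParentId);
        var output = new List<string>();
        Visit(byParent, 0, 0, output);
        return output;
    }

    private static void Visit(ILookup<int, PageNode> byParent, int parentId, int depth, List<string> output)
    {
        if (depth >= LevelNames.Length) return; // stop after the third level

        // ThenBy keeps the earlier sort keys; chaining OrderBy calls would replace them.
        var children = byParent[parentId]
            .OrderBy(p => p.SortOrder == null) // pages without a sort order go last
            .ThenBy(p => p.SortOrder)
            .ThenBy(p => p.Id);

        foreach (PageNode page in children)
        {
            output.Add(LevelNames[depth] + ":" + page.Title);
            Visit(byParent, page.Id, depth + 1, output);
        }
    }

    public static void Main()
    {
        var pages = new List<PageNode>
        {
            new PageNode { Id = 1, ParentId = 0, SortOrder = 2, Title = "Home" },
            new PageNode { Id = 2, ParentId = 1, SortOrder = 1, Title = "About" },
            new PageNode { Id = 3, ParentId = 2, SortOrder = null, Title = "Team" },
            new PageNode { Id = 4, ParentId = 0, SortOrder = 1, Title = "Blog" },
        };

        foreach (string line in Flatten(pages))
        {
            Console.WriteLine(line);
        }
        // prints parent:Blog, parent:Home, child:About, subchild:Team (one per line)
    }
}
```

The lookup is built once, so each level is found without rescanning the whole list.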

Entity Framework/ Linq - groupby and having clause

Given the query below
public TrainingListViewModel(List<int> employeeIdList)
{
this.EmployeeOtherLeaveItemList =
CacheObjects.AllEmployeeOtherLeaves
.Where(x => x.OtherLeaveDate >= Utility.GetToday() &&
x.CancelDate.HasValue == false &&
x.OtherLeaveId == Constants.TrainingId)
.OrderBy(x => x.OtherLeaveDate)
.Select(x => new EmployeeOtherLeaveItem
{
EmployeeOtherLeave = x,
SelectedFlag = false
}).ToList();
}
I want to use employeeIdList in the query.
I want to retrieve all of the x.OtherLeaveDate values where the same x.OtherLeaveDate exists for every employee whose x.EmployeeId is in employeeIdList.
For example, if there are EmployeeIds 1, 2, 3 in employeeIdList and in the CacheObjects.AllEmployeeOtherLeaves collection there is a date 1/1/2001 for all 3 employees, then retrieve that date.
If I read you well it should be something like
var grp = CacheObjects.AllEmployeeOtherLeaves
.Where(x => x.OtherLeaveDate >= Utility.GetToday()
&& x.CancelDate.HasValue == false
&& x.OtherLeaveId == Constants.TrainingId
&& employeeIdList.Contains(x.EmployeeId)) // courtesy of @IronMan84
.GroupBy(x => x.OtherLeaveDate);
if (grp.Count() == 1)
{
// exactly one date remains, so project the rows of that single group
var result = grp.First().Select(x => new EmployeeOtherLeaveItem
{
EmployeeOtherLeave = x,
SelectedFlag = false
});
}
First the data is grouped by OtherLeaveDate. If the grouping results in exactly one group, the first (and only) IGrouping instance is taken (which is a list of Leave objects) and its content is projected to EmployeeOtherLeaveItems.
To the where statement add "&& employeeIdList.Contains(x.EmployeeId)"
I need to thank @IronMan84 and @GertArnold for helping me along, and I will have to admonish myself for not being clearer in the question. This is the answer I came up with. No doubt it can be improved, but since no one has responded to say how, I will now tick this answer.
var numberOfEmployees = employeeIdList.Count;
var grp = CacheObjects.AllEmployeeOtherLeaves.Where(
x =>
x.OtherLeaveDate >= Utility.GetToday()
&& x.CancelDate.HasValue == false
&& x.OtherLeaveId == Constants.TrainingId
&& employeeIdList.Contains(x.EmployeeId))
.GroupBy(x => x.OtherLeaveDate)
.Select(x => new { NumberOf = x.Count(), Item = x });
var list =
grp.Where(item => item.NumberOf == numberOfEmployees).Select(item => item.Item.Key).ToList();
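The same "having count equals the number of employees" idea can be checked in memory. This sketch is not the asker's code: the tuple shape and sample data are assumptions, and it counts distinct employee ids per date rather than rows, which is a deliberate variation so that a duplicate entry for one employee on the same date does not inflate the count.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SharedDatesDemo
{
    // Returns the dates on which every employee in employeeIds has an entry.
    public static List<DateTime> DatesSharedByAll(
        IEnumerable<(int EmployeeId, DateTime Date)> leaves,
        IEnumerable<int> employeeIds)
    {
        var wanted = new HashSet<int>(employeeIds);
        return leaves
            .Where(l => wanted.Contains(l.EmployeeId))
            .GroupBy(l => l.Date)
            // "having": the group must contain every requested employee
            .Where(g => g.Select(l => l.EmployeeId).Distinct().Count() == wanted.Count)
            .Select(g => g.Key)
            .OrderBy(d => d)
            .ToList();
    }

    public static void Main()
    {
        var jan1 = new DateTime(2001, 1, 1);
        var jan2 = new DateTime(2001, 1, 2);
        var leaves = new List<(int EmployeeId, DateTime Date)>
        {
            (1, jan1), (2, jan1), (3, jan1), // all three employees
            (1, jan2), (1, jan2), (2, jan2), // employee 3 missing, employee 1 duplicated
        };

        List<DateTime> shared = DatesSharedByAll(leaves, new[] { 1, 2, 3 });
        Console.WriteLine(shared.Count); // prints 1 (only 1 January 2001 qualifies)
    }
}
```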

Linq to Entities performance problem with many columns

I am having an issue with getting linq to entities to perform well. The query I have (not mine, maintaining someone's code :-)), has several includes that I've determined are all necessary for the WPF screen that consumes the results of this query.
Now, the SQL generated executes very fast and only returns one row of data. But it is returning 570 columns, and I think the performance hit is in the overhead of creating all the objects and all of those fields.
I've tried using lazy loading, but that doesn't seem to have any effect on performance.
I've tried removing any of the "include" statements that aren't necessary, but it appears that they are all needed.
Here's the LINQ query:
var myQuery =
from appt in ctx.Appointments
.Include("ScheduleColumnProfile")
.Include("EncounterReason")
.Include("Visit")
.Include("Visit.Patient")
.Include("Visit.Patient.PatientInsurances")
.Include("Visit.Patient.PatientInsurances.InsuranceType")
.Include("Visit.Patient.PatientInsurances.InsuranceCarrier")
.Include("MasterLookup")
.Include("User1")
.Include("User2")
.Include("Site")
.Include("Visit.Patient_CoPay")
.Include("Visit.Patient_CoPay.User")
.Include("Visit.VisitInstructions.InstructionSheet")
where appt.VisitId == visitId
&& appt.MasterLookup.LookupDescription.ToUpper() != Rescheduled
&& appt.Site.PracticeID == practiceId
&& appt.MasterLookup.LookupDescription.ToUpper() != Cancelled
orderby appt.AppointmentId descending
select appt;
The SQL generated is 4000 lines long with 570 columns in the select statement and 3 or 4 UNION ALLs, so I'm not going to paste it here unless someone REALLY wants to see it. Basically, I'm looking for a way to get rid of the unions if possible, and trim down the columns to only what's needed.
Help!
:-)
If anyone is keeping track, this is the solution that ended up working for me. Thanks to everyone who commented and made suggestions; it eventually led me to what I have below.
ctx.ContextOptions.LazyLoadingEnabled = true;
var myQuery =
from appt in ctx.Appointments
where appt.VisitId == visitId
&& appt.MasterLookup.LookupDescription.ToUpper() != Rescheduled
&& appt.Site.PracticeID == practiceId
&& appt.MasterLookup.LookupDescription.ToUpper() != Cancelled
orderby appt.AppointmentId descending
select appt;
var myAppt = myQuery.FirstOrDefault();
ctx.LoadProperty(myAppt, a => a.EncounterReason);
ctx.LoadProperty(myAppt, a => a.ScheduleColumnProfile);
ctx.LoadProperty(myAppt, a => a.Visit);
ctx.LoadProperty(myAppt, a => a.MasterLookup);
ctx.LoadProperty(myAppt, a => a.User1);
ctx.LoadProperty(myAppt, a => a.User2);
ctx.LoadProperty(myAppt, a => a.PatientReferredProvider);
var myVisit = myAppt.Visit;
ctx.LoadProperty(myVisit, v => v.Patient);
ctx.LoadProperty(myVisit, v => v.Patient_CoPay);
ctx.LoadProperty(myVisit, v => v.VisitInstructions);
ctx.LoadProperty(myVisit, v => v.EligibilityChecks);
var pat = myVisit.Patient;
ctx.LoadProperty(pat, p => p.PatientInsurances);
//load child insurances
foreach (PatientInsurance patIns in myAppt.Visit.Patient.PatientInsurances)
{
ctx.LoadProperty(patIns, p => p.InsuranceType);
ctx.LoadProperty(patIns, p => p.InsuranceCarrier);
}
//load child instruction sheets
foreach (VisitInstruction vi in myAppt.Visit.VisitInstructions)
{
ctx.LoadProperty(vi, i => i.InstructionSheet);
}
//load child copays
foreach (Patient_CoPay coPay in myAppt.Visit.Patient_CoPay)
{
ctx.LoadProperty(coPay, c => c.User);
}
//load child eligibility checks
foreach (EligibilityCheck ec in myAppt.Visit.EligibilityChecks)
{
ctx.LoadProperty(ec, e => e.MasterLookup);
ctx.LoadProperty(ec, e => e.EligibilityResponse);
}
I would recommend creating a new Class that contains only the properties that you need to display. When you project to a new type you don't need to have Include statements, but you can still access the navigation properties of the entity.
var myQuery = from appt in ctx.Appointments
where appt.VisitId == visitId
&& appt.MasterLookup.LookupDescription.ToUpper() != Rescheduled
&& appt.Site.PracticeID == practiceId
&& appt.MasterLookup.LookupDescription.ToUpper() != Cancelled
orderby appt.AppointmentId descending
select new DisplayClass
{
Property1 = appt.Prop1,
Property2 = appt.Visit.Prop1,
.
.
.
};

Resources