Entity Framework appears to be needlessly joining the same table twice - linq

Update This may already be fixed: http://entityframework.codeplex.com/workitem/486
...
A fairly straightforward LINQ statement against my entities is resulting in unnecessarily complex SQL. More on that later, here's the setup:
Tables
Publication
PublicationId (pk)
TopicId (fk to a Topic table)
ReceiptCount (denormalized for query performance)
DateInserted
Receipt
ReceiptId (pk)
PublicationId (fk to the table above)
DateInserted
LINQ
var query = from r in context.Receipts.Include("Publication")
where r.DateInserted < lagDate
&& r.ReceiptId > request.AfterReceiptId
&& r.Publication.TopicId == topicEntity.TopicId
&& r.Publication.ReceiptCount > 1
select r;
SQL
exec sp_executesql N'SELECT TOP (25)
[Project1].[ReceiptId] AS [ReceiptId],
[Project1].[PublicationId] AS [PublicationId],
[Project1].[DateInserted] AS [DateInserted],
[Project1].[DateReceived] AS [DateReceived],
[Project1].[PublicationId1] AS [PublicationId1],
[Project1].[PayloadId] AS [PayloadId],
[Project1].[TopicId] AS [TopicId],
[Project1].[BrokerType] AS [BrokerType],
[Project1].[DateInserted1] AS [DateInserted1],
[Project1].[DateProcessed] AS [DateProcessed],
[Project1].[DateUpdated] AS [DateUpdated],
[Project1].[PublicationGuid] AS [PublicationGuid],
[Project1].[ReceiptCount] AS [ReceiptCount]
FROM ( SELECT
[Extent1].[ReceiptId] AS [ReceiptId],
[Extent1].[PublicationId] AS [PublicationId],
[Extent1].[DateInserted] AS [DateInserted],
[Extent1].[DateReceived] AS [DateReceived],
[Extent3].[PublicationId] AS [PublicationId1],
[Extent3].[PayloadId] AS [PayloadId],
[Extent3].[TopicId] AS [TopicId],
[Extent3].[BrokerType] AS [BrokerType],
[Extent3].[DateInserted] AS [DateInserted1],
[Extent3].[DateProcessed] AS [DateProcessed],
[Extent3].[DateUpdated] AS [DateUpdated],
[Extent3].[PublicationGuid] AS [PublicationGuid],
[Extent3].[ReceiptCount] AS [ReceiptCount]
FROM [dbo].[Receipt] AS [Extent1]
INNER JOIN [dbo].[Publication] AS [Extent2] ON [Extent1].[PublicationId] = [Extent2].[PublicationId]
LEFT OUTER JOIN [dbo].[Publication] AS [Extent3] ON [Extent1].[PublicationId] = [Extent3].[PublicationId]
WHERE ([Extent2].[ReceiptCount] > 1) AND ([Extent1].[DateInserted] < #p__linq__0) AND ([Extent1].[ReceiptId] > #p__linq__1) AND ([Extent2].[TopicId] = #p__linq__2)
) AS [Project1]
ORDER BY [Project1].[ReceiptId] ASC',N'#p__linq__0 datetime,#p__linq__1 int,#p__linq__2 int',#p__linq__0='2012-09-05 19:39:21:510',#p__linq__1=4458824,#p__linq__2=90
Problem
Publication gets joined twice:
LEFT OUTER JOIN because of .Include("Publication")
INNER JOIN because of the where.
If I remove [Extent2] from the SQL entirely, and change the WHERE bits to use [Extent3], I get the same results back. Since I'm not using Lazy Loading on my entities, I have to .Include("Publication")... is there any solution for this?
I was using EF4, but grabbed EF5 from NuGet to see if it was perhaps fixed, but it produces the same result (although I have no idea how to tell if my EDMX is really using EF5).

There is however, a work-around. It may not be the most elegant solution, but it does exactly what you want; it generates only one join.
Change:
var query = from r in context.Receipts.Include("Publication")
where r.DateInserted < lagDate
&& r.ReceiptId > request.AfterReceiptId
&& r.Publication.TopicId == topicEntity.TopicId
&& r.Publication.ReceiptCount > 1
select r;
To be:
var query = from r in context.Receipts
join pub in context.Publication on r.PublicationId equals pub.PublicationId
where r.DateInserted < lagDate
&& r.ReceiptId > request.AfterReceiptId
&& pub.TopicId == topicEntity.TopicId
&& pub.ReceiptCount > 1
select new {
Receipt = r,
Publication = pub
};
Note that we have removed the Include AND we are no longer using r.Publication.?? in the where clause. Instead we are using pub.??
Now when you loop through query, you will see that r.Publication is not null:
foreach ( var item in query)
{
//see that item.Publication is not null
if(item.Receipt != null && item.Receipt.Publication != null)
{
//do work based on a valid Publication
}
else
{
//do work based on no linked Publication
}
}

This behavior can be avoided by using temporary variables (eg, let pub = r.Publication).
var query = from r in context.Receipts
let pub = r.Publication // using a temp variable
where r.DateInserted < lagDate
&& r.ReceiptId > request.AfterReceiptId
&& pub.TopicId == topicEntity.TopicId
&& pub.ReceiptCount > 1
select new { r, pub };

I will optimize on previous person's answer by changing the code as below, which removes the need of join so you don't have to know what columns you need to join and also don't have to change LINQ when join criteria gets changed. This should have been unnecessary but MS is not focused on fixing their sql code generation right now.
var query = from pub in context.Publications
from r in pub.Reciepts
where r.DateInserted < lagDate
&& r.ReceiptId > request.AfterReceiptId
&& pub.TopicId == topicEntity.TopicId
&& pub.ReceiptCount > 1
select new {
Receipt = r,
Publication = pub
};

Related

C#: LINQ not returning the same result as SQL

I'm trying to convert the following SQL query to LINQ, but getting different result count with both,
SQL Query:
SELECT T5.CNTR, T5.BenefitCode,T5.ApprovedFlag,
T5.PaymentFrequencyCode, T5.InstalmentAmt, T5.TotalAmt,
T5.CarRego
FROM
dbo.EmployeeBenefit As T5
LEFT JOIN dbo.Payee ON T5.PayeeCntr = dbo.Payee.CNTR
LEFT JOIN dbo.BankDetails ON dbo.Payee.BankCntr = dbo.BankDetails.BankCntr
Left Join dbo.EmployeeCar As T4 on T5.EmployeeCarCntr=T4.Cntr
Inner Join dbo.EmployeeEntity As T1 On T5.EmployeeEntityCntr=T1.EmployeeEntityCntr
Inner Join dbo.EmployerEntity As T2 On T1.EmployerEntityCntr=T2.EmployerEntityCntr
where T5.EmployeeCntr = 117165
AND ((T5.EndDate is Null) OR (T5.EndDate >= GETDATE()))
LINQ:
var result = (from employeeBenefit in context.EmployeeBenefit
from payee in context.Payee.Where(x => x.Cntr == employeeBenefit.PayeeCntr).DefaultIfEmpty()
from bankDetails in context.BankDetails.Where(x => x.BankCntr == employeeBenefit.PayeeCntr).DefaultIfEmpty()
from employeeCar in context.EmployeeCar.Where(x => x.Cntr == payee.BankCntr).DefaultIfEmpty()
from employeeEntity in context.EmployeeEntity
where employeeEntity.EmployeeEntityCntr == employeeBenefit.EmployeeEntityCntr
from employeeEntity1 in context.EmployeeEntity
where employeeEntity.EmployerEntityCntr == employeeEntity1.EmployerEntityCntr
&& employeeBenefit.EmployeeCntr == iEmployeeID
&& (!employeeBenefit.EndDate.HasValue || employeeBenefit.EndDate >= DateTime.Now)
&& employeeBenefit.EmployeeCntr == 117165
&& employeeBenefit.CarRego == registration
select new
{
CNTR = employeeBenefit.Cntr,
BenefitCode = employeeBenefit.BenefitCode,
PaymentFrequencyCode = employeeBenefit.PaymentFrequencyCode,
InstalmentAmount = employeeBenefit.InstalmentAmt,
TotalAmount = employeeBenefit.TotalAmt,
CarRego = employeeBenefit.CarRego,
ApprovedFlag = employeeBenefit.ApprovedFlag
}).ToList();
Please let me know what i'm missing.
For the data in my database the SQL query is returning 10 records. But, the LINQ is returning 2700 records.
Not a full answer (I'm late for work) but:
var result = (from T5 in context.EmployeeBenefit
join PY in dbo.Payee on T5.PayeeCntr equals PY.CNTR into PY1
where T5.EmployeeCntr = 117165
select new {
CNTR = T5.Cntr,
...
}
).ToList();
It was the issue with the condition mismatch that i had done in the LINQ. The below query just worked fine. Thank you for helping me with the issue.
var result = (from employeeBenefit in context.EmployeeBenefit
from payee in context.Payee.Where(x => x.Cntr == employeeBenefit.PayeeCntr).DefaultIfEmpty()
from bankDetails in context.BankDetails.Where(x => x.BankCntr == payee.BankCntr).DefaultIfEmpty()
from employeeCar in context.EmployeeCar.Where(x => x.Cntr == employeeBenefit.EmployeeCarCntr).DefaultIfEmpty()
join employeeEntity in context.EmployeeEntity
on employeeBenefit.EmployeeEntityCntr equals employeeEntity.EmployeeEntityCntr
join employerEntity in context.EmployerEntity
on employeeEntity.EmployerEntityCntr equals employerEntity.EmployerEntityCntr
where employeeBenefit.EmployeeCntr == 117165 && (!employeeBenefit.EndDate.HasValue || employeeBenefit.EndDate >= DateTime.Now)
select new
{
CNTR = employeeBenefit.Cntr,
BenefitCode = employeeBenefit.BenefitCode,
PaymentFrequencyCode = employeeBenefit.PaymentFrequencyCode,
InstalmentAmount = employeeBenefit.InstalmentAmt,
TotalAmount = employeeBenefit.TotalAmt,
CarRego = employeeBenefit.CarRego,
ApprovedFlag = employeeBenefit.ApprovedFlag
}).ToList();

convert sql query to linq with complex left join

I am new to Linq joins.
i have a sql query as below. Please help me convert the same to linq.
select c.mem_id,
c.po_start,
c.po_end,
isnull(pp.new_policy_code,c.policy_no)
from claims_data c
left join tob_policy tp on (c.subgroup_id =tp.subgroup_id and c.category_id = tp.category_id
and c.service_from_date Between tp.Start_Date And isnull (tp.End_Date,
cast(GETDATE() as date)))
Left Join Tob_Policy_Period Pp On (Pp.Policy_Id = tp.Policy_Id And
c.service_from_date Between Pp.Start_Date And Pp.End_Date )
where c.cid = 13
and c.g_id = 19013
and c.mem_id = '123'
and c.code ='555'
Try with this code:
var results = (from c in yourmodelentity.claims_data
join tp in yourmodelentity.tob_policy on c.subgroup_id equals tp.subgroup_id && c.category_id equals tp.category_id && c.service_from_date Between tp.Start_Date && isnull (tp.End_Date, datetime.now)))
join Pp in yourmodelentity.Tob_Policy_Period on Pp.Policy_Id equals tp.Policy_Id && c.service_from_date Between Pp.Start_Date && Pp.End_Date
where c.cid = 13 && c.g_id = 19013 && c.mem_id = '123' && c.code ='555'
select new
{
c.mem_id,
c.po_start,
c.po_end
}).ToList();
Try with below code:
var results = (from c in modelentity.claims_data
join tp in modelentity.tob_policy on new { first = c.subgroup_id, second= c.category_id } equals new { first = tp.subgroup_id, second = tp.category_id } into tpjoin
from tplj in tpjoin.DefaultIfEmpty()
join pp in modelentity.Tob_Policy_Period on new { first = tplj.Policy_Id ?? 0 } equals new { first = Pp.Policy_Id } into ppjoin
from pplj in tpjoin.DefaultIfEmpty()
where ((c.service_from_date) >= tplj.Start_Date && (c.service_from_date) <= (tplj.End_Date ?? DateTime.Now)) &&
((c.service_from_date) >= pplj.Start_Date && (c.service_from_date) <= pplj.End_Date) &&
(c.cid = 13 && c.g_id = 19013 && c.mem_id = "123" && c.code = "555")
select new
{
c.mem_id,
c.po_start,
c.po_end,
policy_no = (pplj != null ? (pplj.new_policy_code ?? c.policy_no) : c.policy_no)
}).ToList();

LINQ - Previous Record

All:
Lets say I have the following table:
RevisionID, Project_ID, Count, Changed_Date
1 2 4 01/01/2016: 01:02:01
2 2 7 01/01/2016: 01:03:01
3 2 8 01/01/2016: 01:04:01
4 2 3 01/01/2016: 01:05:01
5 2 15 01/01/2016: 01:06:01
I am ordering the records based on Updated_Date. A user comes into my site and edits record (RevisionID = 3). For various reasons, using LINQ (with entity framework), I need to get the previous record in the table, which would be RevisionID = 2 so I can perform calculations on "Count". If user went to edit record (RevisionID = 4), I would need to select RevisionID = 3.
I currently have the following:
var x = _db.RevisionHistory
.Where(t => t.Project_ID == input.Project_ID)
.OrderBy(t => t.Changed_Date);
This works in finding the records based on the Project_ID, but how then do I select the record before?
I am trying to do the following, but in one LINQ statement, if possible.
var itemList = from t in _db.RevisionHistory
where t.Project_ID == input.Project_ID
orderby t.Changed_Date
select t;
int h = 0;
foreach (var entry in itemList)
{
if (entry.Revision_ID == input.Revision_ID)
{
break;
}
h = entry.Revision_ID;
}
var previousEntry = _db.RevisionHistory.Find(h);
Here is the correct single query equivalent of your code:
var previousEntry = (
from r1 in db.RevisionHistory
where r1.Project_ID == input.Project_ID && r1.Revision_ID == input.Revision_ID
from r2 in db.RevisionHistory
where r2.Project_ID == r1.Project_ID && r2.Changed_Date < r1.Changed_Date
orderby r2.Changed_Date descending
select r2
).FirstOrDefault();
which generates the following SQL query:
SELECT TOP (1)
[Project1].[Revision_ID] AS [Revision_ID],
[Project1].[Project_ID] AS [Project_ID],
[Project1].[Count] AS [Count],
[Project1].[Changed_Date] AS [Changed_Date]
FROM ( SELECT
[Extent2].[Revision_ID] AS [Revision_ID],
[Extent2].[Project_ID] AS [Project_ID],
[Extent2].[Count] AS [Count],
[Extent2].[Changed_Date] AS [Changed_Date]
FROM [dbo].[RevisionHistories] AS [Extent1]
INNER JOIN [dbo].[RevisionHistories] AS [Extent2] ON [Extent2].[Project_ID] = [Extent1].[Project_ID]
WHERE ([Extent1].[Project_ID] = #p__linq__0) AND ([Extent1].[Revision_ID] = #p__linq__1) AND ([Extent2].[Changed_Date] < [Extent1].[Changed_Date])
) AS [Project1]
ORDER BY [Project1].[Changed_Date] DESC
hope I understood what you want.
Try:
var x = _db.RevisionHistory
.FirstOrDefault(t => t.Project_ID == input.Project_ID && t.Revision_ID == input.Revision_ID -1)
Or, based on what you wrote, but edited:
_db.RevisionHistory
.Where(t => t.Project_ID == input.Project_ID)
.OrderBy(t => t.Changed_Date)
.TakeWhile(t => t.Revision_ID != input.Revision_ID)
.Last()

Query performance to compare date with many join and entity framework

I have this database model :
I use this query :
public List<Film> ListFilmsSortiesDes7DerniersJoursDVD()
{
DateTime dateDans7Jours = DateTime.Now.AddDays(7);
DateTime dateIlYa7Jours = DateTime.Now.AddDays(-7);
return Query(f => f.Releases.Where(r => r.Langue.langue_code == "FR" && r.TypeRelease.typerelease_code == "DVD").FirstOrDefault().release_date > dateIlYa7Jours
&& f.Releases.Where(r => r.Langue.langue_code == "FR" && r.TypeRelease.typerelease_code == "DVD").FirstOrDefault().release_date < dateDans7Jours && !string.IsNullOrEmpty(f.film_image)).ToList();
}
But the SQL generated have bad performance, about 1.3 seconds to return results (with SQL Server Express 2008 and I already have Index on correct fields):
SELECT [Extent1].[film_id] AS [film_id],
[Extent1].[film_image] AS [film_image],
[Extent1].[film_image_thumb] AS [film_image_thumb],
[Extent1].[film_format] AS [film_format],
[Extent1].[film_motsclefs] AS [film_motsclefs],
[Extent1].[film_nom] AS [film_nom],
[Extent1].[film_nomvf] AS [film_nomvf],
[Extent1].[film_synopsis] AS [film_synopsis],
[Extent1].[film_anneeproduction] AS [film_anneeproduction],
[Extent1].[film_budget] AS [film_budget],
[Extent1].[film_dateajout] AS [film_dateajout],
[Extent1].[film_actif] AS [film_actif],
[Extent1].[utilisateur_id] AS [utilisateur_id],
[Extent1].[film_francais] AS [film_francais],
[Extent1].[film_revenue] AS [film_revenue],
[Extent1].[filmgroupe_id] AS [filmgroupe_id]
FROM [dbo].[Film] AS [Extent1]
OUTER APPLY (SELECT TOP (1) [Filter1].[release_date] AS [release_date]
FROM (SELECT [Extent2].[film_id] AS [film_id],
[Extent3].[release_date] AS [release_date],
[Extent3].[typerelease_id] AS [typerelease_id]
FROM [dbo].[FilmRelease] AS [Extent2]
INNER JOIN [dbo].[Release] AS [Extent3]
ON [Extent3].[release_id] = [Extent2].[release_id]
INNER JOIN [dbo].[Langue] AS [Extent4]
ON [Extent3].[langue_id] = [Extent4].[langue_id]
WHERE N'FR' = [Extent4].[langue_code]) AS [Filter1]
INNER JOIN [dbo].[TypeRelease] AS [Extent5]
ON [Filter1].[typerelease_id] = [Extent5].[typerelease_id]
WHERE ([Extent1].[film_id] = [Filter1].[film_id])
AND (N'CINEMA' = [Extent5].[typerelease_code])) AS [Limit1]
CROSS APPLY (SELECT TOP (1) [Filter3].[release_date] AS [release_date]
FROM (SELECT [Extent6].[film_id] AS [film_id],
[Extent7].[release_date] AS [release_date],
[Extent7].[typerelease_id] AS [typerelease_id]
FROM [dbo].[FilmRelease] AS [Extent6]
INNER JOIN [dbo].[Release] AS [Extent7]
ON [Extent7].[release_id] = [Extent6].[release_id]
INNER JOIN [dbo].[Langue] AS [Extent8]
ON [Extent7].[langue_id] = [Extent8].[langue_id]
WHERE N'FR' = [Extent8].[langue_code]) AS [Filter3]
INNER JOIN [dbo].[TypeRelease] AS [Extent9]
ON [Filter3].[typerelease_id] = [Extent9].[typerelease_id]
WHERE ([Extent1].[film_id] = [Filter3].[film_id])
AND (N'CINEMA' = [Extent9].[typerelease_code])) AS [Limit2]
WHERE ([Limit1].[release_date] > '2013-02-04T00:07:48' /* #p__linq__0 */)
AND ([Limit2].[release_date] < '2013-02-18T00:07:48' /* #p__linq__1 */)
AND ([Extent1].[film_image] IS NOT NULL)
Do you please have any ideas to improve performance of this query ?
Ok why search complicated when the answer is simple :
public List<Film> ListFilmsSortiesDes7DerniersJoursCinema()
{
DateTime dateDans7Jours = DateTime.Now.AddDays(7);
DateTime dateIlYa7Jours = DateTime.Now.AddDays(-7);
return Query(f => f.Releases.Where(r => r.Langue.langue_code == "FR" && r.TypeRelease.typerelease_code == "CINEMA" && r.release_date > dateIlYa7Jours && r.release_date < dateDans7Jours).Any()).ToList();
}
I did a join in too

left outer join in LINQ

how can i implement left outer join in following code:
var showmenu = from pag in pagerepository.GetAllPages()
join pgmt in pagerepository.GetAllPageMeta()
on pag.int_PageId equals pgmt.int_PageId
where (pag.int_OrganizationId == layoutrep.GetSidebarDetailById(SidebarDetailsId).int_OrganizationId
&& pag.int_PostStatusId == 2 && pag.bit_ShowInMenu == true) &&
(pgmt.vcr_MetaKey.Contains("chk") && pgmt.vcr_MetaValue.Contains("true"))
select pag;
Try this, the key is the usage of the DefaultIfEmpty() operator. Here is a good article that discusses usage of this operator in more detail.
var showmenu = from pag in pagerepository.GetAllPages()
join pgmt in pagerepository.GetAllPageMeta()
on pag.int_PageId equals pgmt.int_PageId into leftj
from pgmt2 in leftj.DefaultIfEmpty()
where (pag.int_OrganizationId == layoutrep.GetSidebarDetailById(SidebarDetailsId).int_OrganizationId
&& pag.int_PostStatusId == 2 && pag.bit_ShowInMenu == true) &&
(pgmt2.vcr_MetaKey.Contains("chk") && pgmt2.vcr_MetaValue.Contains("true"))
select pag;

Resources