Linq join with an inner collection - linq

I am trying a LINQ to Object query on 2 collections
Customer.Orders
Branches.Pending.Orders (Collection within a collection)
I want to output each branch which is yet to deliver any order of the customer.
var match = from order in customer.Orders
join branch in Branches
on order equals branch.Pending.Orders
select branch;
This does not work, I get :
The type of one of the expressions in the join clause is incorrect. Type inference failed in the call to 'GroupJoin'.
From my search, I think this is because Order or collection of Orders does not implement equals.
If this query worked, it will still be wrong, as it will return a branch if the customer's and pending orders match exactly. I want a result if any of the order matches.
I am learning Linq, and looking for a approach to address such issues, rather than the solution itself.
I would have done this in SQL like this;
SELECT b.branch_name from Customers c, Branches b, Orders o
WHERE c.customer_id = o.customer_id
AND o.branch_id = b.branch_id
AND c.customer_id = 'my customer'
AND o.order_status = 'pending'

Looking at your Linq, you want something like this
var match =
from o in customer.Orders
from b in Branches
where b.Pending.Orders.FirstOrDefault(p => o.order_id == p.order_id) != null
select b;

Related

How can I get distinct results from this LINQ query?

Below is my ERD and sample data. Note, I'm using Entity Framework and Code first to control my database.
For the project named Vacation, return all the DISTINCT users who have a "true" value in UserBooleanAttributes table for either the Parents or Teens rows defined in the UserAttributes table.
Here is my current attempt:
var myQuery =
from P in context.Projects
join UA in context.UserAttributes on P.ProjectID equals UA.ProjectID
join UBA in context.UserBooleanAttributes on UA.UserAttributeID equals UBA.UserAttributeID
join U in context.Users on UBA.UserID equals U.UserID
where P.ProjectID == 1
where UBA.Value == true
where (UA.UserAttributeID == 1 || UA.UserAttributeID == 2)
select new { uba = U };
This returns 6 users, with e#acme.org being listed twice. Is there a LINQ way of returning distinct values? I suppose I could convert this to a list then filter, but I'd rather have the Database do the work.
I'd rather avoid using lambda expressions if possible. Once again, I want the database to do the work, and not have to write code to union/intersect result groups.

How to improve LINQ to EF performance

I have two classes: Property and PropertyValue. A property has several values where each value is a new revision.
When retrieving a set of properties I want to include the latest revision of the value for each property.
in T-SQL this can very efficiently be done like this:
SELECT
p.Id,
pv1.StringValue,
pv1.Revision
FROM dbo.PropertyValues pv1
LEFT JOIN dbo.PropertyValues pv2 ON pv1.Property_Id = pv2.Property_Id AND pv1.Revision < pv2.Revision
JOIN dbo.Properties p ON p.Id = pv1.Property_Id
WHERE pv2.Id IS NULL
ORDER BY p.Id
The "magic" in this query is to join on the lesser than condition and look for rows without a result forced by the LEFT JOIN.
How can I accomplish something similar using LINQ to EF?
The best thing I could come up with was:
from pv in context.PropertyValues
group pv by pv.Property into g
select g.OrderByDescending(p => p.Revision).FirstOrDefault()
It does produce the correct result but is about 10 times slower than the other.
Maybe this can help. Where db is the database context:
(
from pv1 in db.PropertyValues
from pv2 in db.PropertyValues.Where(a=>a.Property_Id==pv1.Property_Id && pv1.Revision<pv2.Revision).DefaultIfEmpty()
join p in db.Properties
on pv1.Property_Id equals p.Id
where pv2.Id==null
orderby p.Id
select new
{
p.Id,
pv1.StringValue,
pv1.Revision
}
);
Next to optimizing a query in Linq To Entities, you also have to be aware of the work it takes for the Entity Framework to translate your query to SQL and then map the results back to your objects.
Comparing a Linq To Entities query directly to a SQL query will always result in lower performance because the Entity Framework does a lot more work for you.
So it's also important to look at optimizing the steps the Entity Framework takes.
Things that could help:
Precompile your query
Pre-generate views
Decide for yourself when to open the database connection
Disable tracking (if appropriate)
Here you can find some documentation with performance strategies.
if you want to use multiple conditions (less than expression) in join you can do this like
from pv1 in db.PropertyValues
join pv2 in db.PropertyValues on new{pv1.Property_ID, Condition = pv1.Revision < pv2.Revision} equals new {pv2.Property_ID , Condition = true} into temp
from t in temp.DefaultIfEmpty()
join p in db.Properties
on pv1.Property_Id equals p.Id
where t.Id==null
orderby p.Id
select new
{
p.Id,
pv1.StringValue,
pv1.Revision
}

LINQ nested joins

Im trying to convert a SQL join to LINQ. I need some help in getting the nested join working in LINQ.
This is my SQL query, Ive cut it short just to show the nested join in SQL:
select distinct
txtTaskStatus as TaskStatusDescription,
txtempfirstname+ ' ' + txtemplastname as RaisedByEmployeeName,
txtTaskPriorityDescription as TaskPriorityDescription,
dtmtaskcreated as itemDateTime,
dbo.tblTask.lngtaskid as TaskID,
dbo.tblTask.dtmtaskcreated as CreatedDateTime,
convert(varchar(512), dbo.tblTask.txttaskdescription) as ProblemStatement,
dbo.tblTask.lngtaskmessageid,
dbo.tblMessage.lngmessageid as MessageID,
case when isnull(dbo.tblMessage.txtmessagesubject,'') <> '' then txtmessagesubject else left(txtmessagedescription,50) end as MessageSubject,
dbo.tblMessage.txtmessagedescription as MessageDescription,
case when dbo.tblMessage.dtmmessagecreated is not null then dbo.tblMessage.dtmmessagecreated else CAST(FLOOR(CAST(dtmtaskcreated AS DECIMAL(12, 5))) AS DATETIME) end as MessageCreatedDateTime
FROM
dbo.tblAction RIGHT OUTER JOIN dbo.tblTask ON dbo.tblAction.lngactiontaskid = dbo.tblTask.lngtaskid
LEFT OUTER JOIN dbo.tblMessage ON dbo.tblTask.lngtaskmessageid = dbo.tblMessage.lngmessageid
LEFT OUTER JOIN dbo.tblTaskCommentRecipient
RIGHT OUTER JOIN dbo.tblTaskComment ON dbo.tblTaskCommentRecipient.lngTaskCommentID = dbo.tblTaskComment.lngTaskCommentID
ON dbo.tblTask.lngtaskid = dbo.tblTaskComment.lngTaskCommentTaskId
A more seasoned SQL programmer wouldn't join that way. They'd use strictly left joins for clarity (as there is a strictly left joining solution available).
I've unraveled these joins to produce a hierarchy:
Task
Action
Message
TaskComment
TaskCommentRecipient
With associations created in the linq to sql designer, you can reach these levels of the hierarchy:
//note: these aren't outer joins
from t in db.Tasks
let actions = t.Actions
let message = t.Messages
let comments = t.TaskComments
from c in comments
let recipients = c.TaskCommentRecipients
DefaultIfEmpty produces a default element when the collection is empty. Since these are database rows, a default element is a null row. That is the behavior of left join.
query =
(
from t in db.Tasks
from a in t.Actions.DefaultIfEmpty()
from m in t.Messages.DefaultIfEmpty()
from c in t.Comments.DefaultIfEmpty()
from r in c.Recipients.DefaultIfEmpty()
select new Result()
{
TaskStatus = ???
...
}
).Distinct();
Aside: calling Distinct after a bunch of joins is a crutch. #1 See if you can do without it. #2 If not, see if you can eliminate any bad data that causes you to have to call it. #3 If not, call Distinct in a smaller scope than the whole query.
Hope this helps.
SELECT [t0].[OrderID], [t0].[CustomerID], [t0].[EmployeeID], [t0].[OrderDate], [t0].[RequiredDate], [t0].[ShippedDate], [t0].[ShipVia], [t0].[Freight], [t0].[ShipName], [t0].[ShipAddress], [t0].[ShipCity], [t0].[ShipRegion], [t0].[ShipPostalCode], [t0].[ShipCountry]
FROM [Orders] AS [t0]
LEFT OUTER JOIN ([Order Details] AS [t1]
INNER JOIN [Products] AS [t2] ON [t1].[ProductID] = [t2].[ProductID]) ON [t0].[OrderID] = [t1].[OrderID]
can be write as
from o in Orders
join od in (
from od in OrderDetails join p in Products on od.ProductID equals p.ProductID select od)
on o.OrderID equals od.OrderID into ood from od in ood.DefaultIfEmpty()
select o

Linq to entities Left Join

I want to achieve the following in Linq to Entities:
Get all Enquires that have no Application or the Application has a status != 4 (Completed)
select e.*
from Enquiry enq
left outer join Application app
on enq.enquiryid = app.enquiryid
where app.Status <> 4 or app.enquiryid is null
Has anyone done this before without using DefaultIfEmpty(), which is not supported by Linq to Entities?
I'm trying to add a filter to an IQueryable query like this:
IQueryable<Enquiry> query = Context.EnquirySet;
query = (from e in query
where e.Applications.DefaultIfEmpty()
.Where(app=>app.Status != 4).Count() >= 1
select e);
Thanks
Mark
In EF 4.0+, LEFT JOIN syntax is a little different and presents a crazy quirk:
var query = from c1 in db.Category
join c2 in db.Category on c1.CategoryID equals c2.ParentCategoryID
into ChildCategory
from cc in ChildCategory.DefaultIfEmpty()
select new CategoryObject
{
CategoryID = c1.CategoryID,
ChildName = cc.CategoryName
}
If you capture the execution of this query in SQL Server Profiler, you will see that it does indeed perform a LEFT OUTER JOIN. HOWEVER, if you have multiple LEFT JOIN ("Group Join") clauses in your Linq-to-Entity query, I have found that the self-join clause MAY actually execute as in INNER JOIN - EVEN IF THE ABOVE SYNTAX IS USED!
The resolution to that? As crazy and, according to MS, wrong as it sounds, I resolved this by changing the order of the join clauses. If the self-referencing LEFT JOIN clause was the 1st Linq Group Join, SQL Profiler reported an INNER JOIN. If the self-referencing LEFT JOIN clause was the LAST Linq Group Join, SQL Profiler reported an LEFT JOIN.
Do this:
IQueryable<Enquiry> query = Context.EnquirySet;
query = (from e in query
where (!e.Applications.Any())
|| e.Applications.Any(app => app.Status != 4)
select e);
I don't find LINQ's handling of the problem of what would be an "outer join" in SQL "goofy" at all. The key to understanding it is to think in terms of an object graph with nullable properties rather than a tabular result set.
Any() maps to EXISTS in SQL, so it's far more efficient than Count() in some cases.
Thanks guys for your help. I went for this option in the end but your solutions have helped broaden my knowledge.
IQueryable<Enquiry> query = Context.EnquirySet;
query = query.Except(from e in query
from a in e.Applications
where a.Status == 4
select e);
Because of Linq's goofy (read non-standard) way of handling outers, you have to use DefaultIfEmpty().
What you'll do is run your Linq-To-Entities query into two IEnumerables, then LEFT Join them using DefaultIfEmpty(). It may look something like:
IQueryable enq = Enquiry.Select();
IQueryable app = Application.Select();
var x = from e in enq
join a in app on e.enquiryid equals a.enquiryid
into ae
where e.Status != 4
from appEnq in ae.DefaultIfEmpty()
select e.*;
Just because you can't do it with Linq-To-Entities doesn't mean you can't do it with raw Linq.
(Note: before anyone downvotes me ... yes, I know there are more elegant ways to do this. I'm just trying to make it understandable. It's the concept that's important, right?)
Another thing to consider, if you directly reference any properties in your where clause from a left-joined group (using the into syntax) without checking for null, Entity Framework will still convert your LEFT JOIN into an INNER JOIN.
To avoid this, filter on the "from x in leftJoinedExtent" part of your query like so:
var y = from parent in thing
join child in subthing on parent.ID equals child.ParentID into childTemp
from childLJ in childTemp.Where(c => c.Visible == true).DefaultIfEmpty()
where parent.ID == 123
select new {
ParentID = parent.ID,
ChildID = childLJ.ID
};
ChildID in the anonymous type will be a nullable type and the query this generates will be a LEFT JOIN.

Linq To Entity Framework selecting whole tables

I have the following Linq statement:
(from order in Orders.AsEnumerable()
join component in Components.AsEnumerable()
on order.ORDER_ID equals component.ORDER_ID
join detail in Detailss.AsEnumerable()
on component.RESULT_ID equals detail.RESULT_ID
where orderRestrict.ORDER_MNEMONIC == "MyOrderText"
select new
{
Mnemonic = detail.TEST_MNEMONIC,
OrderID = component.ORDER_ID,
SeqNumber = component.SEQ_NUM
}).ToList()
I expect this to put out the following query:
select *
from Orders ord (NoLock)
join Component comp (NoLock)
on ord .ORDER_ID = comp.ORDER_ID
join Details detail (NoLock)
on comp.RESULT_TEST_NUM = detail .RESULT_TEST_NUM
where res.ORDER_MNEMONIC = 'MyOrderText'
but instead I get 3 seperate queries that select all rows from the tables. I am guessing that Linq is then filtering the values because I do get the correct values in the end.
The problem is that it takes WAY WAY too long because it is pulling down all the rows from all three tables.
Any ideas how I can fix that?
Remove the .AsEnumerable()s from the query as these are preventing the entire query being evaluated on the server.

Resources