Linq Acrobatics: How to flatten hierarchical data models?

I use SQL like this to flatten hierarchical data. I just create a view and toss it on the EF diagram. However this doesn't fit the "Replace SQL Management Studio with LinqPad" mentality. How would I code these in Linq (and C#)? (Linq to Entities / Entity Framework 4)
Table A holds products and table B holds many kinds of categories. I want to select the category id as a single field in the view:
select A.*, B1.category as color, B2.category as size, B3.category as shape
from A left join B B1 on A.key = B1.key and B1.type = 1 -- Selects one B row
left join B B2 on A.key = B2.key and B2.type = 2
left join B B3 on A.key = B3.key and B3.type = 3
Better yet, is there a Linq pattern cookbook where you can look up the SQL and see the Linq equivalent? I have already seen the 101 Linq examples in C#.

Unfortunately, LINQ has no outer join keyword, nor can you add arbitrary join conditions. The missing outer join can be worked around using DefaultIfEmpty, but the Bn.type = n part of the join condition would need to be moved to a where condition.
The following produces the SQL you provided, except for the type clauses I mentioned. Note that filtering on type in the where clause drops products that lack a category of some type, which the original ON conditions would keep:
from A in products
join B1 in categories on A.key equals B1.key into tmp_color
join B2 in categories on A.key equals B2.key into tmp_size
join B3 in categories on A.key equals B3.key into tmp_shape
from B1 in tmp_color.DefaultIfEmpty()
from B2 in tmp_size.DefaultIfEmpty()
from B3 in tmp_shape.DefaultIfEmpty()
where B1.type == 1 && B2.type == 2 && B3.type == 3
select new { product = A, color = B1.category, size = B2.category, shape = B3.category };
results in
exec sp_executesql N'SELECT [t0].[key], [t1].[category] AS [color], [t2].[category] AS [size], [t3].[category] AS [shape]
FROM [Product] AS [t0]
LEFT OUTER JOIN [Category] AS [t1] ON [t0].[key] = [t1].[key]
LEFT OUTER JOIN [Category] AS [t2] ON [t0].[key] = [t2].[key]
LEFT OUTER JOIN [Category] AS [t3] ON [t0].[key] = [t3].[key]
WHERE ([t1].[type] = @p0) AND ([t2].[type] = @p1) AND ([t3].[type] = @p2)',N'@p0 int,@p1 int,@p2 int',@p0=1,@p1=2,@p2=3
(Update: that's LINQ to SQL, just assuming that EF would be similar.)
Albin's answer is more readable, but probably produces less optimal SQL. For an exact match with your SQL, you need to replace FirstOrDefault with DefaultIfEmpty though (might make no difference, depending on your data). (Sorry, can't comment yet ;-))
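To keep the type condition inside the join rather than in a where clause, one option is to filter the inner sequence before joining. The following is only a sketch over in-memory data (LINQ to Objects); the tuple arrays and the names FlattenDemo, Products and Categories are made up for illustration, and I have not checked what SQL EF 4 would generate for this shape.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FlattenDemo
{
    // Hypothetical in-memory stand-ins for table A (products) and table B (categories).
    public static readonly (int Key, string Name)[] Products =
    {
        (1, "Widget"),
        (2, "Gadget"),
    };

    public static readonly (int Key, int Type, string Category)[] Categories =
    {
        (1, 1, "Red"),    // color of product 1
        (1, 2, "Large"),  // size of product 1
        (2, 1, "Blue"),   // color of product 2, which has no size row
    };

    public static List<(string Name, string Color, string Size)> Flatten()
    {
        var query =
            from a in Products
            // Filtering the inner source plays the role of "and B1.type = 1" in the ON clause.
            join b1 in Categories.Where(b => b.Type == 1) on a.Key equals b1.Key into colors
            // Project to the string first so DefaultIfEmpty yields null for a missing row.
            from color in colors.Select(b => b.Category).DefaultIfEmpty()
            join b2 in Categories.Where(b => b.Type == 2) on a.Key equals b2.Key into sizes
            from size in sizes.Select(b => b.Category).DefaultIfEmpty()
            select (a.Name, Color: color, Size: size);

        return query.ToList();
    }

    public static void Main()
    {
        foreach (var row in Flatten())
        {
            Console.WriteLine($"{row.Name}: color={row.Color ?? "(none)"}, size={row.Size ?? "(none)"}");
        }
    }
}
```

In this sketch the product without a size row is still returned, with a null size, which is the left join behaviour of the original SQL.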

I would go for a subselect approach.
from a in ModelEntities.A
select new
{
f1 = a.f1,
f2 = a.f2,
// ...,
fn = a.fn,
color = ModelEntities.B.Where(b => a.key == b.key && b.type == 1)
.Select(b => b.category).FirstOrDefault(),
size = ModelEntities.B.Where(b => a.key == b.key && b.type == 2)
.Select(b => b.category).FirstOrDefault(),
shape = ModelEntities.B.Where(b => a.key == b.key && b.type == 3)
.Select(b => b.category).FirstOrDefault(),
}
But following your create-a-view habit, you should probably create some fancy entity in the EF designer that does something like this.

Related

Linq left outer join with multiple condition

I am new to Linq. I am trying to query some data in MS SQL.
Here is my statement:
select * from booking
left outer join carpark
on booking.bookingId = carpark.bookingId
where userID = 5 and status = 'CL'
When I run this in MS SQL, I get the expected result. How can I do this in Linq?
Thank you for your help.
You need this:
var query = (from t1 in tb1
join t2 in tb2 on t1.pKey equals t2.tb1pKey into JoinedList
from t2 in JoinedList.DefaultIfEmpty()
where t1.userID == 5 && t1.status == "CL"
select new
{
t1,
t2
})
.ToList();
You can try to do a left join this way:
from t1 in tb1
from t2 in tb2.Where(o => o.tb1pKey == t1.pKey).DefaultIfEmpty()
where t1.userID == 5 && t1.status == "CL"
select t1;
Usually when people say they want a "left outer join," that's just because they've already converted what they really want into SQL in their head. Usually what they really want is all of the items from table A, and the ability to get the related items from table B if there are any.
Assuming you have your navigation properties set up correctly, this could be as easy as:
var tb1sWithTb2s = context.tb1
.Include(t => t.tb2s) // Include all the tb2 items for each of these.
.Where(t => t.userID == 5 && t.status == "CL");

Where Clause on Joined Table with Into Keyword

I wish to join two tables while filtering one of the tables. That works fine like
var matching = from a in ctx.A
join b in ctx.B on a.BId equals b.Id
where idList.Contains(b.Id)
select a;
However, if I also make use of the into keyword to name the joined result
var matching = from a in ctx.A
join b in ctx.B on a.BId equals b.Id into c
where idList.Contains(b.Id)
select a;
I get a compiler error telling me
The name 'b' does not exist in the current context
However, I can reference a at that point, as well as 'c', without problems.
Why is that exactly, and how can I apply a where clause to b?
Why is that exactly
Because after a join into clause, the range variable introduced by that clause isn't in scope - whereas previous variables are. Don't forget that you're joining into c, so each value of b is effectively part of the group of values (c).
and how can I apply a where clause to b?
By doing it earlier:
var matching = from a in ctx.A
join b in ctx.B.Where(x => idList.Contains(x.Id))
on a.BId equals b.Id into c
where c.Any()
select a;
EDIT: This can be put into slightly more query-expression-oriented code as:
var matchingBs = from b in ctx.B
where idList.Contains(b.Id)
select b;
var matching = from a in ctx.A
join b in matchingBs
on a.BId equals b.Id into c
where c.Any()
select a;
(You could use a nested query expression, but I'm not keen on those in general.)
Or using Any on c:
var matching = from a in ctx.A
join b in ctx.B on a.BId equals b.Id into c
where c.Any(b => idList.Contains(b.Id))
select a;
Or even:
var matching = from a in ctx.A
where ctx.B.Any(b => idList.Contains(b.Id) &&
a.BId == b.Id)
select a;
Which can be rewritten as:
var matching = ctx.A.Where(a => ctx.B.Any(b => idList.Contains(b.Id) &&
a.BId == b.Id));
It's important to understand the difference in results between join and join into - the first creates a "pairwise" join; the second creates a group join, where the result for the extra range variable is a group of matches.
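To make the difference concrete, here is a small sketch over in-memory data (LINQ to Objects). The Parents and Children arrays and the class name JoinKindsDemo are made up for illustration and are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinKindsDemo
{
    // Hypothetical data: p1 has two children, p2 has one, p3 has none.
    public static readonly (int Id, string Name)[] Parents =
    {
        (1, "p1"), (2, "p2"), (3, "p3"),
    };

    public static readonly (int ParentId, string Name)[] Children =
    {
        (1, "c1"), (1, "c2"), (2, "c3"),
    };

    // Plain join: one result per matching (parent, child) pair.
    // Parents without children do not appear at all.
    public static List<string> Pairwise() =>
        (from p in Parents
         join c in Children on p.Id equals c.ParentId
         select p.Name + "-" + c.Name).ToList();

    // join ... into: one result per parent, and cs is the (possibly empty)
    // group of matching children. The range variable c is out of scope after "into".
    public static List<string> Grouped() =>
        (from p in Parents
         join c in Children on p.Id equals c.ParentId into cs
         select p.Name + ":" + cs.Count()).ToList();

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Pairwise())); // p1-c1, p1-c2, p2-c3
        Console.WriteLine(string.Join(", ", Grouped()));  // p1:2, p2:1, p3:0
    }
}
```

The group join keeps p3 with an empty group, which is why it is the usual starting point for a left outer join via DefaultIfEmpty.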

Multiple Joins LINQ Query performance

I'm new to LINQ and have very little knowledge.
I have the following complex query. It runs 3 or 4 times slower than the stored procedure which I translated to LINQ.
Any tips for me to make it run faster?
var result = from a in db.A
join al in db.AL.Where(q => q.CurrentLocation == 1) on a.AID equals al.AID into tmp_al
from al in tmp_al.DefaultIfEmpty()
join l in db.L on al.LID equals l.LID into tmp_l
from l in tmp_l.DefaultIfEmpty()
join r in db.R on l.RID equals r.RID into tmp_r
from r in tmp_r.DefaultIfEmpty()
join b in db.B on r.BID equals b.BID into tmp_b
from b in tmp_b.DefaultIfEmpty()
join ap in db.AP.Where(q => q.CurrentProtocol == 1) on a.AID equals ap.AID into tmp_ap
from ap in tmp_ap.DefaultIfEmpty()
join p in db.P on ap.PID equals p.PID into tmp_p
from p in tmp_p.DefaultIfEmpty()
join s in db.S on a.SID equals s.SID into tmp_s
from s in tmp_s.DefaultIfEmpty()
join ans in db.AS on a.ASID equals ans.ASID into tmp_ans
from ans in tmp_ans.DefaultIfEmpty()
join pr in db.P on p.PI equals pr.PID into tmp_pr
from pr in tmp_pr.DefaultIfEmpty()
where a.Active == 1
group a by new { a.Active, pr.LN, pr.FN, b.BN, r.RID, r.R1, p.PN, s.S1, ans.AS1 }
into grp
orderby grp.Key.BN, grp.Key.R1, grp.Key.PN, grp.Key.S1, grp.Key.AS1
select new
{
PIName = grp.Key.LN + " " + grp.Key.FN,
BN = grp.Key.BN,
RID = grp.Key.RID,
R = grp.Key.R1,
PN = grp.Key.PN,
S = grp.Key.S1,
AS = grp.Key.AS1,
NumberOA = grp.Count()
};
Thanks for your answers. @Albin Sunnanbo: I don't know how to check the execution plans. My LINQ runs correctly and produces the required output. It is just slow. I would like to speed it up. @usr: the original SQL is as follows:
Sorry about the silly table names. The original code is confidential, so I'm not posting the complete table names.
CREATE PROCEDURE [dbo].[report_CBRP] --
AS
SELECT LN + ' ' + FN As PIN, BN, R.RID, R, PN,
S, AS, COUNT(*) As NOA
FROM A
LEFT JOIN AL
ON A.AID = AL.AID
AND AL.CL = 1
LEFT JOIN L
ON AL.LID = L.LID
LEFT JOIN R
ON L.RID = R.RID
LEFT JOIN B
ON R.BID = B.BID
LEFT JOIN AP
ON A.AID = AP.AID
AND AP.CPl = 1
LEFT JOIN P
ON AP.PID = P.PID
LEFT JOIN S
ON A.SID = S.SID
LEFT JOIN AS
ON A.ASID = AS.ASID
LEFT JOIN P
ON P.PI = P.PID
GROUP BY A.A, LN , FN , B.BN, R.RID, R.R, P.PN,
S.S, AS.AS
HAVING A.A = 1
ORDER BY B.BN, R.R, P.PN, S, AS
GO
It seems you're giving SQL a hard time here.
In general, try to avoid so many joins; break them into a few smaller queries instead.
More than that, you're performing a group by, which in itself is an expensive operation, let alone with so many columns.
I've noticed that you're pulling in all the columns of each table. Try to select only the relevant columns.
I also noticed that a few of the tables, like al, ap and l, aren't used in the group by. Do you need them at all?
Use AsNoTracking() for read-only data from EF; that speeds things up.
Use SQL views.

Complex Join Using LINQ EF

How can I join the two queries using LINQ to EF? I need the result set returned to me that includes joined data from the 2 queries combined.
Query 1:
select StockNo, Description
from VehicleOption_New
where StockNo in
(
select v.StockNo
from Vehicles v
join StatusDescription s
on v.Status = s.StatusId
where NewOrUsed = 'n' and v.model = 'cts'
)
and color is not null
Query 2:
select v.StockNo, s.StatusDescriptionText
from Vehicles v
join StatusDescription s
on v.Status = s.StatusId
where NewOrUsed = 'n' and v.model = 'cts'
Once you have the equivalent EF queries you can use either Concat() or Union() to combine the results.
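The two operators differ in how they treat duplicates, which the following sketch shows on in-memory data. The stock numbers and the class name CombineDemo are invented for illustration; against a database the two operators typically correspond to UNION ALL and UNION, but I have not checked the SQL EF generates here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CombineDemo
{
    // Hypothetical stock numbers returned by two separate queries.
    public static readonly string[] First = { "S100", "S200" };
    public static readonly string[] Second = { "S200", "S300" };

    // Concat appends the second sequence and keeps duplicates.
    public static List<string> WithConcat() => First.Concat(Second).ToList();

    // Union appends the second sequence and removes duplicates.
    public static List<string> WithUnion() => First.Union(Second).ToList();

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", WithConcat())); // S100, S200, S200, S300
        Console.WriteLine(string.Join(", ", WithUnion()));  // S100, S200, S300
    }
}
```

Both operators require the two sequences to have the same element type, so the two queries would need to project into the same shape first.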

LINQ nested joins

I'm trying to convert a SQL join to LINQ. I need some help in getting the nested join working in LINQ.
This is my SQL query. I've cut it short just to show the nested join in SQL:
select distinct
txtTaskStatus as TaskStatusDescription,
txtempfirstname+ ' ' + txtemplastname as RaisedByEmployeeName,
txtTaskPriorityDescription as TaskPriorityDescription,
dtmtaskcreated as itemDateTime,
dbo.tblTask.lngtaskid as TaskID,
dbo.tblTask.dtmtaskcreated as CreatedDateTime,
convert(varchar(512), dbo.tblTask.txttaskdescription) as ProblemStatement,
dbo.tblTask.lngtaskmessageid,
dbo.tblMessage.lngmessageid as MessageID,
case when isnull(dbo.tblMessage.txtmessagesubject,'') <> '' then txtmessagesubject else left(txtmessagedescription,50) end as MessageSubject,
dbo.tblMessage.txtmessagedescription as MessageDescription,
case when dbo.tblMessage.dtmmessagecreated is not null then dbo.tblMessage.dtmmessagecreated else CAST(FLOOR(CAST(dtmtaskcreated AS DECIMAL(12, 5))) AS DATETIME) end as MessageCreatedDateTime
FROM
dbo.tblAction RIGHT OUTER JOIN dbo.tblTask ON dbo.tblAction.lngactiontaskid = dbo.tblTask.lngtaskid
LEFT OUTER JOIN dbo.tblMessage ON dbo.tblTask.lngtaskmessageid = dbo.tblMessage.lngmessageid
LEFT OUTER JOIN dbo.tblTaskCommentRecipient
RIGHT OUTER JOIN dbo.tblTaskComment ON dbo.tblTaskCommentRecipient.lngTaskCommentID = dbo.tblTaskComment.lngTaskCommentID
ON dbo.tblTask.lngtaskid = dbo.tblTaskComment.lngTaskCommentTaskId
A more seasoned SQL programmer wouldn't join that way. They'd use strictly left joins for clarity (as there is a strictly left joining solution available).
I've unraveled these joins to produce a hierarchy:
Task
Action
Message
TaskComment
TaskCommentRecipient
With associations created in the linq to sql designer, you can reach these levels of the hierarchy:
//note: these aren't outer joins
from t in db.Tasks
let actions = t.Actions
let message = t.Messages
let comments = t.TaskComments
from c in comments
let recipients = c.TaskCommentRecipients
DefaultIfEmpty produces a default element when the collection is empty. Since these are database rows, a default element is a null row. That is the behavior of left join.
query =
(
from t in db.Tasks
from a in t.Actions.DefaultIfEmpty()
from m in t.Messages.DefaultIfEmpty()
from c in t.TaskComments.DefaultIfEmpty()
from r in c.TaskCommentRecipients.DefaultIfEmpty()
select new Result()
{
TaskStatus = ???
...
}
).Distinct();
Aside: calling Distinct after a bunch of joins is a crutch. #1 See if you can do without it. #2 If not, see if you can eliminate any bad data that causes you to have to call it. #3 If not, call Distinct in a smaller scope than the whole query.
Hope this helps.
SELECT [t0].[OrderID], [t0].[CustomerID], [t0].[EmployeeID], [t0].[OrderDate], [t0].[RequiredDate], [t0].[ShippedDate], [t0].[ShipVia], [t0].[Freight], [t0].[ShipName], [t0].[ShipAddress], [t0].[ShipCity], [t0].[ShipRegion], [t0].[ShipPostalCode], [t0].[ShipCountry]
FROM [Orders] AS [t0]
LEFT OUTER JOIN ([Order Details] AS [t1]
INNER JOIN [Products] AS [t2] ON [t1].[ProductID] = [t2].[ProductID]) ON [t0].[OrderID] = [t1].[OrderID]
can be written as
from o in Orders
join od in (
from od in OrderDetails join p in Products on od.ProductID equals p.ProductID select od)
on o.OrderID equals od.OrderID into ood from od in ood.DefaultIfEmpty()
select o

Resources