LINQ to Entities: Left Join + Grouping + Sum

SQL Query (Execution plan cost = 0.0127553)
SELECT
SUM(DATEDIFF(second, DTActivate, DTDeActivate)) AS Seconds,
AA.ID AS AAID,
AA.WorkStation
FROM
DbLogItems I
INNER JOIN DbApplicationArguments AA ON
AA.Id = I.ApplicationArgument_ID
GROUP BY
AA.ID,
AA.WorkStation
C#
var q = from items in db.LogItem
        join aa in db.ApplicationArguments on
            items.ApplicationArgument.ID equals aa.ID
            into aaGroup
        from aaJoin in aaGroup.DefaultIfEmpty()
        group items by new
        {
            aaJoin.ID,
            aaJoin.WorkStation
        } into grouping
        select new
        {
            Seconds = grouping.Sum(x => SqlFunctions.DateDiff("second", x.DTActivate, x.DTDeActivate)),
            grouping.Key.ID,
            grouping.Key.WorkStation
        };
The resulting SQL is much larger (execution plan cost = 0.0199849):
SELECT
1 AS [C1],
[GroupBy1].[A1] AS [C2],
[GroupBy1].[K1] AS [ID],
[GroupBy1].[K2] AS [WorkStation]
FROM
(
SELECT
[Join1].[K1] AS [K1],
[Join1].[K2] AS [K2],
SUM([Join1].[A1]) AS [A1]
FROM
(
SELECT
[Extent2].[ID] AS [K1],
[Extent2].[WorkStation] AS [K2],
DATEDIFF(second, [Extent1].[DTActivate], [Extent1].[DTDeActivate]) AS [A1]
FROM
[dbo].[DbLogItems] AS [Extent1]
LEFT OUTER JOIN [dbo].[DbApplicationArguments] AS [Extent2]
ON [Extent1].[ApplicationArgument_ID] = [Extent2].[ID]
) AS [Join1]
GROUP BY
[K1],
[K2]
) AS [GroupBy1]
Please help me write the correct LINQ code.
Hand-written SQL execution plan cost = 0.0127553.
LINQ-generated SQL execution plan cost = 0.0199849.
Difference = 0.0072296 on 21 + 10 records.

LINQ to SQL queries are what they are. Sometimes you can't get it to emit better SQL. However, you can write plain old T-SQL and call it in a number of ways: for example, a table-valued UDF that your customized DataContext exposes as an IQueryable.
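One thing worth checking before giving up on the LINQ version: the hand-written SQL uses INNER JOIN, while `into aaGroup` followed by `DefaultIfEmpty()` asks for a left outer join, which is why LEFT OUTER JOIN shows up in the generated SQL. A plain `join ... equals` without `into` is an inner join. The following is a minimal sketch of that shape using LINQ to Objects; the in-memory arrays and tuple field names are hypothetical stand-ins for the entity sets, and `SqlFunctions.DateDiff` is replaced by `TimeSpan` arithmetic. Whether Entity Framework then produces a cheaper plan is an assumption to verify against the actual database.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for db.ApplicationArguments and db.LogItem.
var applicationArguments = new[]
{
    (ID: 1, WorkStation: "WS-A"),
    (ID: 2, WorkStation: "WS-B")
};
var t0 = new DateTime(2020, 1, 1, 8, 0, 0);
var logItems = new[]
{
    (ApplicationArgumentId: 1, DTActivate: t0, DTDeActivate: t0.AddSeconds(30)),
    (ApplicationArgumentId: 1, DTActivate: t0, DTDeActivate: t0.AddSeconds(45)),
    (ApplicationArgumentId: 2, DTActivate: t0, DTDeActivate: t0.AddSeconds(10))
};

// A join without "into ... DefaultIfEmpty()" is an inner join,
// matching the INNER JOIN of the hand-written SQL.
var totals =
    (from item in logItems
     join aa in applicationArguments on item.ApplicationArgumentId equals aa.ID
     group item by new { aa.ID, aa.WorkStation } into grouping
     orderby grouping.Key.ID
     select new
     {
         // In-memory replacement for SqlFunctions.DateDiff("second", ...).
         Seconds = grouping.Sum(x => (int)(x.DTDeActivate - x.DTActivate).TotalSeconds),
         grouping.Key.ID,
         grouping.Key.WorkStation
     }).ToList();

foreach (var row in totals)
    Console.WriteLine($"{row.ID} {row.WorkStation} {row.Seconds}");
```

Against a real context the same query shape would keep `SqlFunctions.DateDiff` inside the `Sum`, as in the question.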

Related

speed up a query with multiple inner joins in ms access

As the title says, I need to improve this query that I made in MS Access. The tables come from a linked DB, so I can't index them. I need help understanding where the time is going. Is there anything like EXPLAIN in Access? Do I need to put more columns in some sort of GROUP BY? What can I do to speed this up? (The GROUP BY in the first subselect reads 4M rows, which reduce to 321k after grouping, and the query takes 20 minutes to run, when the laptop doesn't crash.)
SELECT a.SEQ_NO,
b.SKU,
b.maxdate,
(a.BASE_COST/a.EXCHANGE) AS BASE_COST,
(a.NET_COST/a.EXCHANGE) AS NET_COST,
(a.NET_NET_COST/a.EXCHANGE) AS EXCHAGED_NET_NET_COST,
a.NET_NET_COST,
(a.DEAD_NET_NET_COST/a.EXCHANGE) AS DEAD_NET_NET_COST,
(a.LANDED_COST/a.EXCHANGE) AS LANDED_COST,
(a.POSEIMA/a.EXCHANGE) AS POSEIMA,
(a.TOTAL_BONUS/a.EXCHANGE) AS TOTAL_BONUS,
(a.IEC/a.EXCHANGE) AS IEC,
(a.IEC_BONUS/a.EXCHANGE) AS IEC_BONUS,
(a.ECO_INVOICE_FORN/a.EXCHANGE) AS ECO_INVOICE_FORN_SYSTEM,
(a.ECO_INVOICE/a.EXCHANGE) AS ECO_INVOICE_SYSTEM,
(a.ECO_MERCHANDISE/a.EXCHANGE) AS ECO_MERCHANDISE_SYSTEM,
c.SUPPLIER,
c.SUP_NAME,
d.UPC,
d.PRIMARY_UPC_IND,
f.BRAND,
g.DEPT,
g.DESC_UP,
g.CLASS,
g.SUBCLASS,
h.AV_COST,
h.UNIT_RETAIL AS Last_of_unit_retail,
h.STATUS,
i.[UNIT VALUE],
i.[INITIAL DATE],
i.[END DATE]
INTO PRICELIST
FROM (((((((RMS_MC_NB_PRICELIST_COST AS a
INNER JOIN (SELECT MAX(SEQ_NO) AS ID, SKU, MAX(ACTIVE_DATE) AS maxdate
            FROM RMS_MC_NB_PRICELIST_COST
            GROUP BY SKU) AS b ON a.SEQ_NO = b.ID)
INNER JOIN RMS_MC_SUPS AS c ON a.SUPPLIER = c.SUPPLIER)
INNER JOIN RMS_MC_UPC_EAN AS d ON b.SKU = d.SKU)
INNER JOIN RMS_MC_WIN_ATTRIBUTES AS e ON b.SKU = e.SKU)
INNER JOIN RMS_MC_NB_BRAND AS f ON e.NB_BRAND_NO = f.BRAND_NO)
INNER JOIN RMS_MC_DESC_LOOK AS g ON b.SKU = g.SKU)
INNER JOIN RMS_MC_WIN_STORE AS h ON b.SKU = h.SKU)
LEFT JOIN MAPA_APOIOS_SISO AS i ON b.SKU = i.[# ARTICLE];

Increase LINQ query efficiency

I'm optimizing my web application and have run into a bottleneck where the SQL generated from my LINQ expression is very slow.
The following SQL executes in well under a second:
SELECT
ISNULL(COUNT(distinct JOBIDNumber),0),
ISNULL(SUM(JIMQuantityActual * JIMNetMarginFactor),0),
ISNULL(sum((isnull(MATRecoverablePercent,0) / 100) * JIMQuantityActual * JIMNetMarginFactor),0),
ISNULL(sum(CarbonSaving),0)
FROM
dbo.fn_GetJobsForUser(183486) jb
inner join cd_JobMaterial on JIMJobId = jb.JOBIDNumber
WHERE
JOBCollectionDate >= '2014-11-01'
Whereas the SQL output by the following query takes between 4 and 16 seconds over the same data:
DateTime d = new DateTime(2014, 11, 1);
var query =
    from job in sp.sp_GetJobsForUser(183486)
    where job.JOBCollectionDate >= d
    join material in UnitOfWork.Repository<cd_JobMaterial>().Queryable()
        on job.JOBIDNumber equals material.JIMJobId
    group material by 1 into f
    select new
    {
        Jobs = f.Distinct().Count(),
        Weight = f.Sum(x => x.JIMQuantityActual * x.JIMNetMarginFactor),
        Carbon = f.Sum(x => x.CarbonSaving),
        Recovery = f.Sum(x => ((x.MATRecoverablePercent / 100) * x.JIMQuantityActual * x.JIMNetMarginFactor))
    };
Which outputs the following:
-- Region Parameters
DECLARE @contactId Int = 183486
DECLARE @p__linq__0 DateTime2 = '2014-11-01 00:00:00.0000000'
-- EndRegion
SELECT
[Project4].[C1] AS [C1],
[Project4].[C5] AS [C2],
[Project4].[C2] AS [C3],
[Project4].[C3] AS [C4],
[Project4].[C4] AS [C5]
FROM (SELECT
[Project2].[C1] AS [C1],
[Project2].[C2] AS [C2],
[Project2].[C3] AS [C3],
[Project2].[C4] AS [C4],
(SELECT
COUNT(1) AS [A1]
FROM (SELECT DISTINCT
/*Fields omitted for brevity */
FROM [dbo].[fn_GetJobsForUser](@contactId) AS [Extent3]
INNER JOIN (SELECT
/*Fields omitted for brevity */
FROM [dbo].[cd_JobMaterial] AS [cd_JobMaterial]) AS [Extent4]
ON [Extent3].[JOBIDNumber] = [Extent4].[JIMJobId]
WHERE ([Extent3].[JOBCollectionDate] >= @p__linq__0)
AND ([Project2].[C1] = 1)) AS [Distinct1])
AS [C5]
FROM (SELECT
@contactId AS [contactId],
@p__linq__0 AS [p__linq__0],
[GroupBy1].[K1] AS [C1],
[GroupBy1].[A1] AS [C2],
[GroupBy1].[A2] AS [C3],
[GroupBy1].[A3] AS [C4]
FROM (SELECT
[Project1].[K1] AS [K1],
SUM([Project1].[A1]) AS [A1],
SUM([Project1].[A2]) AS [A2],
SUM([Project1].[A3]) AS [A3]
FROM (SELECT
1 AS [K1],
[Project1].[JIMQuantityActual] * [Project1].[JIMNetMarginFactor] AS [A1],
[Project1].[CarbonSaving] AS [A2],
([Project1].[MATRecoverablePercent] / CAST(100 AS DECIMAL(18))) * [Project1].[JIMQuantityActual] * [Project1].[JIMNetMarginFactor] AS [A3]
FROM (SELECT
[Extent2].[MATRecoverablePercent] AS [MATRecoverablePercent],
[Extent2].[JIMQuantityActual] AS [JIMQuantityActual],
[Extent2].[JIMNetMarginFactor] AS [JIMNetMarginFactor],
[Extent2].[CarbonSaving] AS [CarbonSaving]
FROM [dbo].[fn_GetJobsForUser](@contactId) AS [Extent1]
INNER JOIN (SELECT
/*Fields omitted for brevity */
FROM [dbo].[cd_JobMaterial] AS [cd_JobMaterial]) AS [Extent2]
ON [Extent1].[JOBIDNumber] = [Extent2].[JIMJobId]
WHERE [Extent1].[JOBCollectionDate] >= @p__linq__0) AS [Project1]) AS [Project1]
GROUP BY [K1]) AS [GroupBy1]) AS [Project2]) AS [Project4]
How do I rewrite the LINQ expression to produce more efficient SQL, or is it just a case of writing a stored procedure and using that instead?
Unfortunately that's one of the downsides of using something as flexible as Entity Framework that has to support a wide variety of complex translations. It obviously comes with great benefit in other areas though, so you'll have to balance those with the performance aspect.
Even if you could find a way to rewrite the query that would generate better SQL now, that's subject to changes in the underlying provider in future versions. Enjoy the clean, concise code for as long as you can before performance is no longer acceptable for your application. If EF isn't cutting it at that point, then use some of the hooks provided to execute raw SQL, stored procedures, etc. that won't be as pretty.
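One difference between the two queries in the question is worth ruling out first: the SQL counts `COUNT(distinct JOBIDNumber)`, whereas `f.Distinct().Count()` counts distinct material rows, which appears to be what yields the `SELECT DISTINCT` subquery over every field. Projecting the job id before calling `Distinct()` expresses the same count as the SQL. The following is a minimal sketch using LINQ to Objects with hypothetical in-memory rows (the `MaterialId` field and the data are made up for illustration); whether a given Entity Framework version then generates faster SQL is an assumption that needs checking with a profiler.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory rows standing in for the job/material join result.
var joined = new[]
{
    (JobId: 10, MaterialId: 1, JIMQuantityActual: 2.0m, JIMNetMarginFactor: 1.5m),
    (JobId: 10, MaterialId: 2, JIMQuantityActual: 4.0m, JIMNetMarginFactor: 1.0m),
    (JobId: 11, MaterialId: 3, JIMQuantityActual: 1.0m, JIMNetMarginFactor: 2.0m)
};

var summary =
    (from row in joined
     group row by 1 into f
     select new
     {
         // Distinct over whole rows: counts materials (3 here).
         DistinctRows = f.Distinct().Count(),
         // Distinct over the job id only: mirrors COUNT(DISTINCT JOBIDNumber) (2 here).
         Jobs = f.Select(x => x.JobId).Distinct().Count(),
         Weight = f.Sum(x => x.JIMQuantityActual * x.JIMNetMarginFactor)
     }).Single();

Console.WriteLine($"{summary.DistinctRows} {summary.Jobs} {summary.Weight}");
```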

Linq-to-SQL left join on left join/multiple left joins in one Linq-to-SQL statement

I'm trying to rewrite a SQL procedure in LINQ. It all went well and works fine as long as it runs on a small data set, and I couldn't find an answer to this anywhere. I have three joins in the query: two left joins and one inner join, chained to each other like a tree. Below is the SQL procedure:
SELECT ...
FROM sprawa s (NOLOCK)
LEFT JOIN strona st (NOLOCK) on s.ident = st.id_sprawy
INNER JOIN stan_szczegoly ss (NOLOCK) on s.kod_stanu = ss.kod_stanu
LEFT JOIN broni b (NOLOCK) on b.id_strony = st.ident
What I'd like to ask is how to translate this to LINQ. For now I have this:
var queryOne = from s in db.sprawa
               join st in db.strona on s.ident equals st.id_sprawy into tmp1
               from st2 in tmp1.DefaultIfEmpty()
               join ss in db.stan_szczegoly on s.kod_stanu equals ss.kod_stanu
               join b in db.broni on st2.ident equals b.id_strony into tmp2
               from b2 in tmp2.DefaultIfEmpty()
               select new { };
That seems all right, but when checked with SQL Profiler, the query sent to the database looks like this:
SELECT ... FROM [dbo].[sprawa] AS [Extent1]
LEFT OUTER JOIN [dbo].[strona] AS [Extent2]
ON [Extent1].[ident] = [Extent2].[id_sprawy]
INNER JOIN [dbo].[stan_szczegoly] AS [Extent3]
ON [Extent1].[kod_stanu] = [Extent3].[kod_stanu]
INNER JOIN [dbo].[broni] AS [Extent4]
ON ([Extent2].[ident] = [Extent4].[id_strony]) OR
(([Extent2].[ident] IS NULL) AND ([Extent4].[id_strony] IS NULL))
As you can see, the two SQL queries are a bit different. The result is the same, but the latter runs incomparably slower (over 30 minutes instead of less than a second). There's also a union, but it shouldn't be the problem; if asked, I'll paste the code for it.
I'd be grateful for any advice on how to improve the performance of my LINQ statement or how to write it so that it is translated properly.
I guess I found the solution:
var queryOne = from s in db.sprawa
               join st in db.strona on s.ident equals st.id_sprawy into tmp1
               where tmp1.Any()
               from st2 in tmp1.DefaultIfEmpty()
               join ss in db.stan_szczegoly on s.kod_stanu equals ss.kod_stanu
               join b in db.broni on st2.ident equals b.id_strony into tmp2
               where tmp2.Any()
               from b2 in tmp2.DefaultIfEmpty()
               select new { };
In other words, add where table.Any() after each into table clause. It doesn't make the translation any better, but it cut execution time from nearly 30 minutes(!) to about 5 seconds.
This has to be used carefully, though: where table.Any() filters out every row that has no match in the joined table, so the left join effectively behaves like an inner join and some records MAY be lost from the result set.
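The possible record loss can be reproduced in memory. After `into tmp1`, a `where tmp1.Any()` discards every left-side row that has no match, so the `DefaultIfEmpty()` that follows never sees an empty group to pad. The following is a minimal sketch using LINQ to Objects; the arrays and the `name` field are hypothetical stand-ins for the tables in the question.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for db.sprawa and db.strona.
var sprawa = new[] { (ident: 1, name: "A"), (ident: 2, name: "B") };
var strona = new[] { (ident: 100, id_sprawy: 1) };

// Plain left join: every sprawa row survives, unmatched ones get a null partner.
var leftJoin =
    (from s in sprawa
     join st in strona on s.ident equals st.id_sprawy into tmp1
     from st2 in tmp1.Select(x => (int?)x.ident).DefaultIfEmpty()
     select new { s.ident, StronaIdent = st2 }).ToList();

// With "where tmp1.Any()": rows without a match are filtered out first.
var withAny =
    (from s in sprawa
     join st in strona on s.ident equals st.id_sprawy into tmp1
     where tmp1.Any()
     from st2 in tmp1.Select(x => (int?)x.ident).DefaultIfEmpty()
     select new { s.ident, StronaIdent = st2 }).ToList();

Console.WriteLine($"left join rows: {leftJoin.Count}, with Any(): {withAny.Count}");
// prints "left join rows: 2, with Any(): 1"
```

If the unmatched rows matter, comparing row counts between the original procedure and the LINQ version is a quick way to see whether anything was dropped.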

Query that uses Clustered Index Scan instead of seek

I have the following query that returns < 300 results. It is currently taking about 4 seconds to complete, and when I look at the execution plan, it shows that it is spending 41% of resources on a clustered index scan. My limited knowledge of database administration suggests that a clustered index seek would improve performance. How can I get the query to use a clustered index seek instead of a clustered index scan? Below is the pertinent information and the query.
SQL Server 2008 R2
Table PMDME: approx. 140,000 rows (this is the one taking up 41% of resources)
Server hardware: 16-core 2.7 GHz processors, 48 GB RAM
DECLARE @start date, @end date
SET @start = '2013-01-01'
SET @end = CAST(GETDATE() AS DATE)
SELECT
b.total,
c.intakes,
d.ships,
a.CODE_,
RTRIM(a.NAME_) as name,
f.employee as Salesperson,
g.referral_type_id,
h.referral_type,
e.slscode,
a.city,
a.STATE_,
a.zip
FROM PACWARE.ADS.RFDME a
LEFT OUTER JOIN (
SELECT SUM(b.quantity) total, a.ref_id from event.dbo.sample a
JOIN event.dbo.sample_parts b on a.id = b.sample_id
JOIN PACWARE.ADS.PTDME c on b.part_id = c.CODE_
WHERE c.MEDICAREID = 'E0607' AND a.order_date between @start and @end
GROUP BY a.ref_id
)b on a.CODE_ = b.ref_id
LEFT OUTER JOIN (
SELECT COUNT(a.CODE_)as intakes, rfcode
FROM PACWARE.ADS.PMDME a
WHERE a.REGDATETIME BETWEEN @start and @end
GROUP BY a.RFCODE
) c on a.CODE_ = c.rfcode
LEFT OUTER JOIN (
SELECT
COUNT(a.CODE) as ships, b.rfcode
FROM
(
SELECT
A.ACCOUNT AS CODE,
MIN(CAST(A.BILLDATETIME AS DATE)) AS SHIPDATE
FROM PACWARE.ADS.ARODME A
LEFT OUTER JOIN PACWARE.ADS.PTDME B ON A.PTCODE=B.CODE_
LEFT OUTER JOIN event.dbo.newdate() D ON A.ACCOUNT=D.ACCOUNT
LEFT OUTER JOIN event.dbo.newdate_extras() D2 ON A.ACCOUNT=D2.ACCOUNT
WHERE A.BILLDATETIME>=@start
AND A.BILLDATETIME=@start AND D.NEWDATE=@start AND D2.NEWDATE'ID'
Group by
A.ACCOUNT,
B.MEDICAREID,
A.CATEGORY
) a
JOIN PACWARE.ADS.PMDME b on a.CODE = b.CODE_
GROUP BY b.RFCODE
) d on a.CODE_ = d.rfcode
LEFT OUTER JOIN event.dbo.employee_slscode e on a.SLSCODE = e.slscode
JOIN event.dbo.employee f on e.employee_id = f.id
JOIN event.dbo.referral_data g on a.CODE_ = g.CODE_
JOIN event.dbo.referral_type h on g.referral_type_id = h.id
WHERE total > 0
I would first try creating an index on just the REGDATETIME column of the PACWARE.ADS.PMDME table.
GO
CREATE NONCLUSTERED INDEX [IX_PMDME_REGDATETIME] ON PACWARE.ADS.PMDME
(
[REGDATETIME] ASC
)
GO
Test how that performs. I would also try adding another index on the RFCODE column (same table) if that column is selective enough.

LINQ nested joins

I'm trying to convert a SQL join to LINQ. I need some help getting the nested join working in LINQ.
This is my SQL query; I've cut it short just to show the nested join in SQL:
select distinct
txtTaskStatus as TaskStatusDescription,
txtempfirstname+ ' ' + txtemplastname as RaisedByEmployeeName,
txtTaskPriorityDescription as TaskPriorityDescription,
dtmtaskcreated as itemDateTime,
dbo.tblTask.lngtaskid as TaskID,
dbo.tblTask.dtmtaskcreated as CreatedDateTime,
convert(varchar(512), dbo.tblTask.txttaskdescription) as ProblemStatement,
dbo.tblTask.lngtaskmessageid,
dbo.tblMessage.lngmessageid as MessageID,
case when isnull(dbo.tblMessage.txtmessagesubject,'') <> '' then txtmessagesubject else left(txtmessagedescription,50) end as MessageSubject,
dbo.tblMessage.txtmessagedescription as MessageDescription,
case when dbo.tblMessage.dtmmessagecreated is not null then dbo.tblMessage.dtmmessagecreated else CAST(FLOOR(CAST(dtmtaskcreated AS DECIMAL(12, 5))) AS DATETIME) end as MessageCreatedDateTime
FROM
dbo.tblAction RIGHT OUTER JOIN dbo.tblTask ON dbo.tblAction.lngactiontaskid = dbo.tblTask.lngtaskid
LEFT OUTER JOIN dbo.tblMessage ON dbo.tblTask.lngtaskmessageid = dbo.tblMessage.lngmessageid
LEFT OUTER JOIN dbo.tblTaskCommentRecipient
RIGHT OUTER JOIN dbo.tblTaskComment ON dbo.tblTaskCommentRecipient.lngTaskCommentID = dbo.tblTaskComment.lngTaskCommentID
ON dbo.tblTask.lngtaskid = dbo.tblTaskComment.lngTaskCommentTaskId
A more seasoned SQL programmer wouldn't join that way. They'd use strictly left joins for clarity (as there is a strictly left joining solution available).
I've unraveled these joins to produce a hierarchy:
Task
  Action
  Message
  TaskComment
    TaskCommentRecipient
With associations created in the linq to sql designer, you can reach these levels of the hierarchy:
//note: these aren't outer joins
from t in db.Tasks
let actions = t.Actions
let message = t.Messages
let comments = t.TaskComments
from c in comments
let recipients = c.TaskCommentRecipients
DefaultIfEmpty produces a default element when the collection is empty. Since these are database rows, a default element is a null row. That is the behavior of left join.
query =
(
    from t in db.Tasks
    from a in t.Actions.DefaultIfEmpty()
    from m in t.Messages.DefaultIfEmpty()
    from c in t.TaskComments.DefaultIfEmpty()
    from r in c.TaskCommentRecipients.DefaultIfEmpty()
    select new Result()
    {
        TaskStatus = ???
        ...
    }
).Distinct();
Aside: calling Distinct after a bunch of joins is a crutch. #1 See if you can do without it. #2 If not, see if you can eliminate any bad data that causes you to have to call it. #3 If not, call Distinct in a smaller scope than the whole query.
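The `DefaultIfEmpty` behavior described above can be tried out with LINQ to Objects. The following is a minimal sketch with hypothetical data, where anonymous types stand in for the designer-generated entities. One difference is an assumption specific to this in-memory version: a null parent has to be guarded explicitly before walking to the next level, which the database query does not need to write because it is translated to SQL outer joins.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data; anonymous types stand in for the entities.
var commentsOfTask1 = new[]
{
    new { Text = "first", Recipients = new[] { "ann" } },
    new { Text = "second", Recipients = new string[0] }
};
var noComments = commentsOfTask1.Take(0).ToArray();
var tasks = new[]
{
    new { Id = 1, Comments = commentsOfTask1 },
    new { Id = 2, Comments = noComments }
};

var rows =
    (from t in tasks
     // Empty collection -> one default (null) element, like a LEFT JOIN.
     from c in t.Comments.DefaultIfEmpty()
     // In memory, c can be null here, so guard before walking to the next level.
     from r in (c?.Recipients ?? new string[0]).DefaultIfEmpty()
     select new { TaskId = t.Id, Comment = c?.Text, Recipient = r }).ToList();

foreach (var row in rows)
    Console.WriteLine($"{row.TaskId} | {row.Comment} | {row.Recipient}");
```

Task 2 has no comments and still produces one row, with null in the comment and recipient positions, which is the left-join shape.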
Hope this helps.
SELECT [t0].[OrderID], [t0].[CustomerID], [t0].[EmployeeID], [t0].[OrderDate], [t0].[RequiredDate], [t0].[ShippedDate], [t0].[ShipVia], [t0].[Freight], [t0].[ShipName], [t0].[ShipAddress], [t0].[ShipCity], [t0].[ShipRegion], [t0].[ShipPostalCode], [t0].[ShipCountry]
FROM [Orders] AS [t0]
LEFT OUTER JOIN ([Order Details] AS [t1]
INNER JOIN [Products] AS [t2] ON [t1].[ProductID] = [t2].[ProductID]) ON [t0].[OrderID] = [t1].[OrderID]
can be written as
from o in Orders
join od in (
    from od in OrderDetails
    join p in Products on od.ProductID equals p.ProductID
    select od)
    on o.OrderID equals od.OrderID into ood
from od in ood.DefaultIfEmpty()
select o

Resources