.NET LINQ Expression<Func<T, bool>> Performance Issues - performance

I have two functions as follow -
public IQueryable<RequestSummaryDTO> GetProgramOfficerUSA(Guid officerId)
{
List<string> officerCountries = UnityProvider.Instance.Get<IProgramOfficerService>().GetPOCountries(officerId).Select(c => c.CNTR_ID.ToUpper()).ToList();
Expression<Func<TaskRequest, bool>> countriesFilter = (a) => officerCountries.Contains(a.tblTaskDetail.FirstOrDefault().tblOrganization.ORG_Country.ToUpper());
Expression<Func<TaskRequest, bool>> USAAndNotDelegated = LinqUtils.And(this.USAFilter(), this.NotDelegatedFilter(officerId));
Expression<Func<TaskRequest, bool>> countriesOrOwned = LinqUtils.Or(countriesFilter, this.OwnedFilter(officerId));
Expression<Func<TaskRequest, bool>> filter = LinqUtils.And(USAAndNotDelegated, countriesOrOwned);
return this.Get(filter, TaskRequestState.USA);
}
private IQueryable<RequestSummaryDTO> Get(Expression<Func<TaskRequest, bool>> additionalFilter, TaskRequestState? TaskRequestState = null)
{
var Tasks = this.TaskRequestRepository.List(additionalFilter).Where(x => x.tblTaskDetail.FirstOrDefault().PD_TaskRequestID == x.Id &&
(x.tblRequestDetail.AD_Status == (int)RequestStatus.Paid || x.tblRequestDetail.AD_Status == (int)RequestStatus.NotConfirmed));
if (TaskRequestState == TaskRequestState.USA)
{
Tasks = Tasks.Where(w => (w.tblTaskDetail.FirstOrDefault().PD_PStatus == null || w.tblTaskDetail.FirstOrDefault().PD_PStatus == 125));
}
else
{
Tasks = Tasks.Where(w => w.State == null);
}
return Tasks.ToList().Select(TaskSummaryFactory.CreateDto).AsQueryable();
}
I used Expression> and IQueryable in the LINQ. It is supposed to use LINQ To SQL instead of LINQ To Objects. The performance should be good.
But I do not see that. I believe the LINQ pulls every table's data into memory to process using LINQ To Objects. I do not see a join sql query sent to SQL Server tracking by SQL profile but a bunch of individual table selection statement.
It takes over 10 minutes to get something return from
return Tasks.ToList().Select(TaskSummaryFactory.CreateDto).AsQueryable();
I know the problem is in
Expression<Func<TaskRequest, bool>> countriesFilter = (a) => officerCountries.Contains(a.tblTaskDetail.FirstOrDefault().tblOrganization.ORG_Country.ToUpper());
tblTaskDetail is a big table. If I switch to a smaller one, the performance is noticeably improved.
Anyone can help find out why is wrong there.
Thanks,
Update 1 -
The statement from Entity Framework logged in SQL Profile are all like this -
exec sp_executesql N'SELECT [Extent1].[ORG_ID] AS [ORG_ID], [Extent1].[ORG_CreatedBy] AS [ORG_CreatedBy], [Extent1].[ORG_CreatedOn] AS [ORG_CreatedOn] FROM [dbo].[tblOrganization] AS [Extent1] WHERE [Extent1].[ORG_ID] = #EntityKeyValue1',N'#EntityKeyValue1 uniqueidentifier',#EntityKeyValue1='E8C3F120-AA40-445E-A8A0-2937F330D347'
They are all just having individual table select statement, not joined sql statement.
Update 2 -
I was wrong in Update 1. I missed the join SQL statement. The problem is that the generated SQL is too poor. There are 6 nested select statements, 11 LEFT OUTER JOINs, and 10 OUTER APPLYs. The query is too long and can not post here. Executing the generated SQL takes 9 minutes.

I have upgraded the EF STE 5 to EF6. Now the performance is better. The page loading time is cut down to 1.5 minutes from 12 minutes.

Related

I need most complicated query

I use Linq to Sql.
I have three table.
tabl_Region: from this table returning .ToList() via county column == 85
tabl_Season: from this table returning .ToList() via startdate >= today
tabl_Desc: from this table returning int [] ID via fldRegion== 1 results && fldSeason== 2 results
I will try to explain not worked codes
TurkusEntities context = new TurkusEntities();
return context.tabl_AttrDesc.Where(c => c.fldRegionId == context.tabl_Region.Where(r => r.fldCounty == 85).ToList() && c.fldSeasonId == context.tabl_Season.Where(s => s.fldStartDate >= DateTime.Now).ToList()).ToList;
I know i can solve it by using loops, but if it is possible i want to use only query.
If you are using Linq2Sql you can get the
sql that is generated by a linq query. with this command.
dc.GetCommand(query).CommandText
If you are using SQL Server Profiler inside the Sql Server (Tools --> SQL Server Profiler)
I found solution but not impossible in one query.
First I got int [] ID arrays which match conditions from two tables.After I wrote a query which provide control column data from arrays.
TurkusEntities context = new TurkusEntities();
Region region = new Region();
string[] dataarray = region.GetAllRegionsBySomeRule(RegionX);
var _db= context.tabl_AttrDesc.Where(c =>dataarray.Contains(c.fldRegionId.ToString().Trim())).ToList();
And Region
public string[] GetAllRegionsBySomeRule(int fldType,int fldRegionX)
{
TurkusEntities context = new TurkusEntities();
var _db = context.tabl_Region.Where(c => c.fldTown.Contains(fldRegionX.ToString())).ToList();
foreach (var data in _db)
{
Regions.Add(data.fldId.ToString().Trim());
}
string[] IDS = Regions.ToArray();
return IDS;
}

sql to linq with left join

i have create a request in SQL and put them in dataset. apparently it hang when the data very huge. so i use an Entity.
my original sql is like this:
SELECT NO_ORDRE,ORDRE.CODE_DEST as CODE_DEST,REF_EXPED,ORDRE.MODAL_MODE,RS_NOM,ADRESSE,TEL,VILLE,
ORDRE.NBR_COLIS,ORDRE.POID,DATE_CREE,DATE_CLOTUR,STATUT_ORDRE,ORDRE.TRANSPORTEUR,ORDRE.LIB_TOURNE,
ORDRE.DATE_CLOTUR_REEL,ORDRE.OBS,AUTRE_REF,
ORDRE.CODE_CLIENT+'_'+CAST(NOID as VARCHAR(50))+'_'+SUBSTRING(NO_ORDRE_CUMMUL, 0, CHARINDEX('_', NO_ORDRE_CUMMUL + '_')) as NOLV
FROM ORDRE
LEFT OUTER JOIN LETTRE_VOIT_FINAL
ON charindex('_'+cast(ORDRE.NO_ORDRE as varchar(255))+'_', '_'+LETTRE_VOIT_FINAL.NO_ORDRE_CUMMUL+'_') > 0
WHERE DATE_CREE BETWEEN #DATE_CREE_DEB AND #DATE_CREE_FIN
ORDER BY NO_ORDRE DESC
and i try my linq like this:
public IQueryable<ORDRE> Get_OrdreEntity(DateTime datedeb, DateTime datefin)
{
try
{
IQueryable<ORDRE> LesListe;
Soft8Exp_ClientEntities oEntite_T = new Soft8Exp_ClientEntities();
var query = from o in oEntite_T.ORDRE
where o.DATE_CREE >= datedeb && o.DATE_CREE <= datefin
select o;
LesListe = query;
return LesListe;
}
catch (Exception excThrown)
{
throw new Exception("Err_02", excThrown);
}
}
it works well but i don't know how to make a join from this sql:
LEFT OUTER JOIN LETTRE_VOIT_FINAL
ON charindex('_'+cast(ORDRE.NO_ORDRE as varchar(255))+'_', '_'+LETTRE_VOIT_FINAL.NO_ORDRE_CUMMUL+'_') > 0
and how can i translate it to linq from this sql:
ORDRE.CODE_CLIENT+'_'+CAST(NOID as VARCHAR(50))+'_'+SUBSTRING(NO_ORDRE_CUMMUL, 0, CHARINDEX('_', NO_ORDRE_CUMMUL + '_')) as NOLV
I can't see any reason to have exception handling in the Get_OrdreEntity function. It should be coded in way it just work. Debug it. In any way you do nothing in catch.
I you query and filter data in this function and want to get results it is a good idea to return collection instead of query in the result of this function to eliminate performance and side effect isssues. I.e. return IEnumerable, ICollection, IList wherether you want.
It is easy to find a ton of Linq join examples, just use Google. Here is all you need.

Optimizing LINQ Any() call in Entity Framework

After profiling my Entity Framework 4.0 based database layer I have found the major performance sinner to be a simple LINQ Any() I use to check if an entity is already existing in the database. The Any() check performs orders of magnitude slower than saving the entity.
There are relatively few rows in the database and the columns being checked are indexed.
I use the following LINQ to check for the existence of a setting group:
from sg in context.SettingGroups
where sg.Group.Equals(settingGroup) && sg.Category.Equals(settingCategory)
select sg).Any()
This generates the following SQL (additionally my SQL profiler claims the query is executed twice):
exec sp_executesql N'SELECT
CASE WHEN ( EXISTS (SELECT
1 AS [C1]
FROM [dbo].[SettingGroups] AS [Extent1]
WHERE ([Extent1].[Group] = #p__linq__0) AND ([Extent1].[Category] = #p__linq__1)
)) THEN cast(1 as bit) WHEN ( NOT EXISTS (SELECT
1 AS [C1]
FROM [dbo].[SettingGroups] AS [Extent2]
WHERE ([Extent2].[Group] = #p__linq__0) AND ([Extent2].[Category] = #p__linq__1)
)) THEN cast(0 as bit) END AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable1]',N'#p__linq__0 nvarchar(4000),#p__linq__1 nvarchar(4000)',#p__linq__0=N'Cleanup',#p__linq__1=N'Mediator'
Right now I can only think of creating stored procedures to solve this problem, but I would of course prefer to keep the code in LINQ.
Is there a way to make such an "Exist" check run faster with EF?
I should probably mention that I also use self-tracking-entities in an n-tier architecture. In some scenarios the ChangeTracker state for some entities is set to "Added" even though they already exist in the database. This is why I use a check to change the ChangeTracker state accordingly if updating the database caused an insert failure exception.
Try adding index to the database table "SettingGroups", by Group & Category.
BTW, does this produce similar sql?
var ok = context.SettingGroups.Any(sg => sg.Group==settingGroup && sg.Category==settingCategory);
The problem is Entity Framework (at least EF4) is generating stupid SQL. The following code seems to generate decent SQL with minimal pain.
public static class LinqExt
{
public static bool BetterAny<T>( this IQueryable<T> queryable, Expression<Func<T, bool>> predicate)
{
return queryable.Where(predicate).Select(x => (int?)1).FirstOrDefault().HasValue;
}
public static bool BetterAny<T>( this IQueryable<T> queryable)
{
return queryable.Select(x => (int?)1).FirstOrDefault().HasValue;
}
}
Then you can do:
(from sg in context.SettingGroups
where sg.Group.Equals(settingGroup) && sg.Category.Equals(settingCategory)
select sg).BetterAny()
or even:
context.SettingGroups.BetterAny(sg => sg.Group.Equals(settingGroup) && sg.Category.Equals(settingCategory));
I know it sounds a miserable solution, but what happens if you use Count instead of Any?
Have you profiled the time to execute the generated select statement against the time to execute the select what you would expect/like to be produced ? It is possible that it is not as bad as it looks.
The section
SELECT
1 AS [C1]
FROM [dbo].[SettingGroups] AS [Extent1]
WHERE ([Extent1].[Group] = #p__linq__0) AND ([Extent1].[Category] = #p__linq__1)
is probably close to what you would expect to be produced. It is quite possible that the query optimiser will realise the second query is the same as the first and hence it may add very little time to the overall query.

LINQ to SQL Queries odd Materialization

I ran across an interesting Linq to SQL, uh, feature, the other day. Perhaps someone can give me a logical explanation for the reasoning behind the results. Take the code below as my example which utilizes the AdventureWorks database setup in a Linq to SQL DataContext. This is a clip from my unit test. The resulting customer returned from a call to both CustomerQuery_Test_01() and CustomerQuery_Test_02() is the same. However, the query executed on the SQLServer are different is a major way. The method CustomerQuery_Test_01 us causing the entire Customer table to be materialized, which the call to CustomerQuery_Test_02 is only causing the single customer to be materialized. The resulting SQL Queries are at the bottom of this post. Anyone have a good reason for this? To me, it was highly non-intuitive.
protected virtual Customer GetByPrimaryKey(Func<Customer, bool> keySelection)
{
AdventureWorksDataContext context = new AdventureWorksDataContext();
return (from r in context.Customers select r).SingleOrDefault(keySelection);
}
[TestMethod]
public void CustomerQuery_Test_01()
{
Customer customer = GetByPrimaryKey(c => c.CustomerID == 2);
}
[TestMethod]
public void CustomerQuery_Test_02()
{
AdventureWorksDataContext context = new AdventureWorksDataContext();
Customer customer = (from r in context.Customers select r).SingleOrDefault(c => c.CustomerID == 2);
}
Query for CustomerQuery_Test_01 (notice the lack of a where clause)
SELECT [t0].[CustomerID], [t0].[NameStyle], [t0].[Title], [t0].[FirstName], [t0].[MiddleName], [t0].[LastName], [t0].[Suffix], [t0].[CompanyName], [t0].[SalesPerson], [t0].[EmailAddress], [t0].[Phone], [t0].[PasswordHash], [t0].[PasswordSalt], [t0].[rowguid], [t0].[ModifiedDate]
FROM [SalesLT].[Customer] AS [t0]
Query for CustomerQuery_Test_02 (notice the where clause)
SELECT [t0].[CustomerID], [t0].[NameStyle], [t0].[Title], [t0].[FirstName], [t0].[MiddleName], [t0].[LastName], [t0].[Suffix], [t0].[CompanyName], [t0].[SalesPerson], [t0].[EmailAddress], [t0].[Phone], [t0].[PasswordHash], [t0].[PasswordSalt], [t0].[rowguid], [t0].[ModifiedDate]
FROM [SalesLT].[Customer] AS [t0]
WHERE [t0].[CustomerID] = #p0
Func<Customer, bool> keySelection
That's not an Expression<Func<Customer, bool>>... the compiler resolved Enumerable.Single instead of Queryable.Single

Does Enumerable.ToDictionary only retrieve what it needs?

I'm using Enumerable.ToDictionary to create a Dictionary off of a linq call:
return (from term in dataContext.Terms
where term.Name.StartsWith(text)
select term).ToDictionary(t => t.TermID, t => t.Name);
Will that call fetch the entirety of each term, or will it only retrieve the TermID and the Name fields from my data provider? In other words, would I be saving myself database traffic if I instead wrote it like this:
return (from term in dataContext.Terms
where term.Name.StartsWith(text)
select new { term.TermID, term.Name }).ToDictionary(t => t.TermID, t => t.Name);
Enumerable.ToDictionary works on IEnumerable objects. The first part of your statement "(from ... select term") is an IQueryable object. Queryable is going to look at the expression and build the SQL statement. It will then convert that to an IEnumerable to pass to ToDictionary().
In other words, yes, your second version would be more efficient.
The generated SQL will return the entire term, so your second statement will bring down just what you need.
You can set dataContext.Log = Console.Out and look at the different results of the query.
Using my sample LINQPad database, here's an example:
var dc = (TypedDataContext)this;
// 1st approach
var query = Orders.Select(o => o);
dc.GetCommand(query).CommandText.Dump();
query.ToDictionary(o => o.OrderID, o => o.OrderDate).Dump();
// 2nd approach
var query2 = Orders.Select(o => new { o.OrderID, o.OrderDate});
dc.GetCommand(query2).CommandText.Dump();
query2.ToDictionary(o => o.OrderID, o => o.OrderDate).Dump();
The generated SQL is (or just peek at LINQPad's SQL tab):
// 1st approach
SELECT [t0].[OrderID], [t0].[OrderDate], [t0].[ShipCountry]
FROM [Orders] AS [t0]
// 2nd approach
SELECT [t0].[OrderID], [t0].[OrderDate]
FROM [Orders] AS [t0]
No. ToDictionary is an extension method for IEnumerable<T> not IQueryable<T>. It doesn't take an Expression<Func<T, TKey>> but simply a Func<T, TKey> that it'll blindly call for each item. It doesn't care (and doesn't know) about LINQ and the underlying expression trees and stuff like that. It just iterates the sequence and builds up a dictionary. As a consequence, in your first query, all columns are fetched.

Resources