Entity Framework Core + Count with Group By - linq

I have a table that contains ~600k records and 33 columns. In my project I am using EF Core (2.0.1) to retrieve data from the database. I am having issues with the code below:
var theCounter = (from f in _context.tblData.Take(100000)
group f by f.TypeId into data
select new DataDto { ID = data.Key, Count = data.Count() }).ToList();
This code is part of a REST API, and when I test it from SoapUI I get a timeout error. When I test the code with
Take(1000)
it works fine. There are around 300 unique TypeIds. Any ideas how I can make it work?
-- EDIT 1:
Here is what I see when debugging the code:
Microsoft.EntityFrameworkCore.Query:Warning: Query: '(from TblData <generated>_1 in DbSet<TblData> select [<generated>_1]).Take(__p_0)' uses a row limiting operation (Skip/Take) without OrderBy which may lead to unpredictable results.
Microsoft.EntityFrameworkCore.Query:Warning: Query: '(from TblData <generated>_1 in DbSet<TblData> select [<generated>_1]).Take(__p_0)' uses a row limiting operation (Skip/Take) without OrderBy which may lead to unpredictable results.
Microsoft.EntityFrameworkCore.Query:Warning: The LINQ expression 'GroupBy([f].TypeId, [f])' could not be translated and will be evaluated locally.
Microsoft.EntityFrameworkCore.Query:Warning: The LINQ expression 'GroupBy([f].TypeId, [f])' could not be translated and will be evaluated locally.
Microsoft.EntityFrameworkCore.Query:Warning: The LINQ expression 'Count()' could not be translated and will be evaluated locally.
Microsoft.EntityFrameworkCore.Database.Command:Information: Executed DbCommand (131ms) [Parameters=[@__p_0='?'], CommandType='Text', CommandTimeout='30']
SELECT [t2].[Id], [t2].[at], [t2].[add], [t2].[AddDate], [t2].[aftc], [t2].[aftcd], [t2].[aid], [t2].[afl], [t2].[prdid], [t2].[cid], [t2].[TypeId], [t2].[env], [t2].[ext], [t2].[extddcode], [t2].[fn], [t2].[fn], [t2].[fic], [t2].[gid], [t2].[grp], [t2].[hnm], [t2].[IP], [t2].[icid], [t2].[ln], [t2].[lg], [t2].[pcid], [t2].[ret], [t2].[rts], [t2].[rnam], [t2].[sled], [t2].[seq], [t2].[sid], [t2].[styp]
FROM (
SELECT TOP(@__p_0) [t1].[Id], [t1].[at], [t1].[add], [t1].[AddDate], [t1].[aftc], [t1].[aftcd], [t1].[aid], [t1].[afl], [t1].[prdid], [t1].[cid], [t1].[TypeId], [t1].[env], [t1].[ext], [t1].[extddcode], [t1].[fn], [t1].[fn], [t1].[fic], [t1].[gid], [t1].[grp], [t1].[hnm], [t1].[IP], [t1].[icid], [t1].[ln], [t1].[lg], [t1].[pcid], [t1].[ret], [t1].[rts], [t1].[rnam], [t1].[sled], [t1].[seq], [t1].[sid], [t1].[styp]
FROM [TblData] AS [t1]
) AS [t2]
WHERE [t2].[TypeId] IS NOT NULL
ORDER BY [t2].[TypeId]
I think it is not translated properly. Any ideas why?
-- EDIT 2:
I have changed my queries to:
var query = _context.TblData
.Select(a => new {ID = a.Id, TypeId= a.TypeId})
.Distinct();
var q1 = query.GroupBy(p => p.TypeId)
.Select(g => new DataDto {TypeId= g.Key, Count = g.Count()});
return await q1.ToListAsync();
But it was translated to:
SELECT DISTINCT [a0].[Id], [a0].[TypeId] AS [TypeId]
FROM [tblData] AS [a0]
ORDER BY [a0].[TypeId]
When I checked directly in the database this query takes 14 seconds to execute. Any idea why it was not translated to something like:
SELECT DISTINCT [a0].[Id], COUNT([TypeId]) AS [TypeId]
FROM [tblData] AS [a0]
GROUP BY COUNT([a0].[Id])
ORDER BY [a0].[TypeId]

I had to upgrade EF Core version to 2.1 and LINQ is now translated properly into SQL.
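As a minimal sketch of the query shape in question: grouping on a key and projecting only the key plus an aggregate is the form that EF Core 2.1 and later should be able to translate into a single GROUP BY with COUNT(*). To stay runnable without a database, the sketch below runs over an in-memory array (LINQ to Objects); the `rows` data is made up for illustration, and against a real context `rows` would be the DbSet from the question.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for the table; only TypeId matters here.
var rows = new[]
{
    new { Id = 1, TypeId = 10 },
    new { Id = 2, TypeId = 10 },
    new { Id = 3, TypeId = 20 },
    new { Id = 4, TypeId = 10 },
    new { Id = 5, TypeId = 30 },
};

// Group on the key and project only the key and an aggregate,
// without a Take() in front of the grouping.
var counts = rows
    .GroupBy(r => r.TypeId)
    .Select(g => new { TypeId = g.Key, Count = g.Count() })
    .OrderBy(c => c.TypeId)
    .ToList();

foreach (var c in counts)
{
    Console.WriteLine($"TypeId {c.TypeId}: {c.Count} rows");
}
```

With roughly 300 distinct TypeIds, a server-side GROUP BY returns about 300 small rows instead of pulling every column of every record into memory.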

Related

Load only some elements of a nested collection efficiently with LINQ

I have the following LINQ query (using EF Core 6 and MS SQL Server):
var resultSet = dbContext.Systems
.Include(system => system.Project)
.Include(system => system.Template.Type)
.Select(system => new
{
System = system,
TemplateText = system.Template.TemplateTexts.FirstOrDefault(templateText => templateText.Language == locale.LanguageIdentifier),
TypeText = system.Template.Type.TypeTexts.FirstOrDefault(typeText => typeText.Language == locale.LanguageIdentifier)
})
.FirstOrDefault(x => x.System.Id == request.Id);
The requirement is to retrieve the system matching the requested ID and load its project, template and template's type info. The template has multiple TemplateTexts (one for each translated language) but I only want to load the one matching the requested locale, same deal with the TypeTexts elements of the template's type.
The LINQ query above does that in one query and it gets converted to the following SQL query (I edited the SELECT statements to use * instead of the long list of columns generated):
SELECT [t1].*, [t2].*, [t5].*
FROM (
SELECT TOP(1) [p].*, [t].*, [t0].*
FROM [ParkerSystems] AS [p]
LEFT JOIN [Templates] AS [t] ON [p].[TemplateId] = [t].[Id]
LEFT JOIN [Types] AS [t0] ON [t].[TypeId] = [t0].[Id]
LEFT JOIN [Projects] AS [p0] ON [p].[Project_ProjectId] = [p0].[ProjectId]
WHERE [p].[SystemId] = @__request_Id_1
) AS [t1]
LEFT JOIN (
SELECT [t3].*
FROM (
SELECT [t4].*, ROW_NUMBER() OVER(PARTITION BY [t4].[ReferenceId] ORDER BY [t4].[Id]) AS [row]
FROM [TemplateTexts] AS [t4]
WHERE [t4].[Language] = @__locale_LanguageIdentifier_0
) AS [t3]
WHERE [t3].[row] <= 1
) AS [t2] ON [t1].[Id] = [t2].[ReferenceId]
LEFT JOIN (
SELECT [t6].*
FROM (
SELECT [t7].*, ROW_NUMBER() OVER(PARTITION BY [t7].[ReferenceId] ORDER BY [t7].[Id]) AS [row]
FROM [TypeTexts] AS [t7]
WHERE [t7].[Language] = @__locale_LanguageIdentifier_0
) AS [t6]
WHERE [t6].[row] <= 1
) AS [t5] ON [t1].[Id0] = [t5].[ReferenceId]
which is not bad, it's not a super complicated query, but I feel like my requirement can be solved with a much simpler SQL query:
SELECT *
FROM [Systems] AS [p]
JOIN [Templates] AS [t] ON [p].[TemplateId] = [t].[Id]
JOIN [TemplateTexts] AS [tt] ON [p].[TemplateId] = [tt].[ReferenceId]
JOIN [Types] AS [ty] ON [t].[TypeId] = [ty].[Id]
JOIN [TemplateTexts] AS [tyt] ON [ty].[Id] = [tyt].[ReferenceId]
WHERE [p].[SystemId] = @systemId and tt.[Language] = 2 and tyt.[Language] = 2
My question is: is there a different/simpler LINQ expression (either in Method syntax or Query syntax) that produces the same result (get all info in one go) because ideally I'd like to not have to have an anonymous object where the filtered sub-collections are aggregated. For even more brownie points, it'd be great if the generated SQL would be simpler/closer to what I think would be a simple query.
"Is there a different/simpler LINQ expression (...) that produces the same result"
Yes (maybe) and no.
No, because you're querying dbContext.Systems, so EF will return all systems that match your filter, even when they have no TemplateTexts etc. That's why it has to generate outer joins. EF is aware neither of your apparent intention to skip systems without this nested data nor of any guarantee that such systems don't occur in the database (which you seem to assume, judging by the second query).
That accounts for the left joins to subqueries.
These subqueries are generated because of FirstOrDefault. In SQL it always requires some sort of subquery to get "first" records of one-to-many relationships. This ROW_NUMBER() OVER construction is actually quite efficient. Your second query doesn't have any notion of "first" records. It'll probably return different data.
Yes (maybe) because you also Include data. I'm not sure why. Some people seem to think Include is necessary to make subsequent projections (.Select) work, but it isn't. If that's your reason to use Includes then you can remove them and thus remove the first couple of joins.
OTOH you also Include system.Project which is not in the projection, so you seem to have added the Includes deliberately. And in this case they have effect, because the entire entity system is in the projection, otherwise EF would ignore them.
If you need the Includes then again, EF has to generate outer joins for the reason mentioned above.
EF decides to handle the Includes and projections separately, while hand-crafted SQL, aided by prior knowledge of the data, could do that more efficiently. There's no way to affect that behavior, though.
This LINQ query is close to your SQL, but I'm not sure the result will be correct:
var resultSet =
(from system in dbContext.Systems
from templateText in system.Template.TemplateTexts
where templateText.Language == locale.LanguageIdentifier
from typeText in system.Template.Type.TypeTexts
where typeText.Language == locale.LanguageIdentifier
select new
{
System = system,
TemplateText = templateText,
TypeText = typeText
})
.FirstOrDefault(x => x.System.Id == request.Id);
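To illustrate why the join-style query may return different data from the FirstOrDefault projection, here is a small LINQ to Objects sketch. The `systems` and `texts` arrays are hypothetical stand-ins (they are not the entities from the question), and the code runs in memory rather than through EF, so it only demonstrates the row semantics, not the generated SQL.

```csharp
using System;
using System.Linq;

// Hypothetical data: system 1 has two texts in the requested language,
// system 2 only has a text in another language, system 3 has none.
var systems = new[]
{
    new { Id = 1, Name = "A" },
    new { Id = 2, Name = "B" },
    new { Id = 3, Name = "C" },
};
var texts = new[]
{
    new { SystemId = 1, Language = 2, Text = "first" },
    new { SystemId = 1, Language = 2, Text = "second" },
    new { SystemId = 2, Language = 1, Text = "other language" },
};
var language = 2;

// Projection with FirstOrDefault: exactly one row per system, Text may be null.
var projected = systems
    .Select(s => new
    {
        System = s,
        Text = texts.FirstOrDefault(t => t.SystemId == s.Id && t.Language == language),
    })
    .ToList();

// Inner join: systems without a match disappear, multiple matches repeat the system.
var joined =
    (from s in systems
     join t in texts on s.Id equals t.SystemId
     where t.Language == language
     select new { System = s, Text = t })
    .ToList();

foreach (var row in projected)
    Console.WriteLine($"projection: {row.System.Name} -> {row.Text?.Text ?? "(none)"}");
foreach (var row in joined)
    Console.WriteLine($"join: {row.System.Name} -> {row.Text.Text}");
```

The projection keeps every system and picks at most one text for each, which is what the outer joins and ROW_NUMBER() subqueries express in SQL. The join is only equivalent when every system is guaranteed to have exactly one text per language.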

Dynamic Linq core

Hi, I am using a Jqwidgets Grid to display my data. It has a built-in option to use filters, but if you filter your records on the server side you have to build your own query. As I am working with LINQ, I thought I would use the Dynamic LINQ library for ASP.NET Core. The problem is that there are not many examples or explanations of how to do this, and I have been busy with it for days without getting very far. This is my setup: I have a normal LINQ query:
var Mut = from M in _DB.Mutations
join S in _DB.Shifts on M.ShiftId equals S.ShiftId
join U in _DB.RoosterUsers on M.UserId equals U.RoosterUserId
join D in deps on M.UserId equals D.UserId
join DD in _DB.Departements on D.DepartementID equals DD.DepartementId
select new MutationModel
{
MutId=M.MutationId,
Naam=U.FirstName + " " + U.LastName,
UserId=M.UserId,
Departement= DD.DepartementName,
MutationType = S.publicName,
MutationGroup = S.ShiftType.ToString(),
DateTot =M.DateTill,
TijdVan=M.DateStartOn,
TijdTot=M.DateTill,
Status=CreateStatus(M.Tentative, M.ApprovedOn, M.Processed, M.CancelRefId, M.Deleted)
};
This query is running OK and gives me all the data I need for the Grid.
Then, for the filter, I would like to add a dynamic LINQ query using the System.Linq.Dynamic.Core library.
But this is as far as I have managed to get so far:
var outQuery = Mut.Where("Status = @0 and UserId = @1", "Nieuw", "KLM22940").Select("Status");
My questions now:
1. If I make the field name in the where clause a variable, I get an error. How do I do this?
2. In the Select clause, how do I add multiple columns? (Actually, I would just like to output all columns.)
An example would be best. Has anybody used Dynamic LINQ to build a dynamic query for the JQWidgets Grid?
Thank you very much.
How are you trying to use the field name variable in the where clause?
If you want to output all columns, you can use ToList(), like this:
var outQuery = Mut.Where("Status = @0 and UserId = @1", "Nieuw", "KLM22940").ToList();
If you want only specific columns, you can use a Select clause like this:
var outQuery = Mut.Where("Status = @0 and UserId = @1", "Nieuw", "KLM22940").Select("new(Status, UserId)");
This Select clause creates a data class that contains the Status and UserId properties and returns a sequence of instances of that data class.
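On the first question (a field name held in a variable), one way to see what such a filter has to do is to build the predicate by hand with standard-library expression trees. The sketch below is not the Dynamic LINQ API; it is an alternative using only System.Linq.Expressions, and the `WhereEquals` helper, the sample rows, and the values "Verwerkt" and "KLM00001" are all hypothetical.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical rows standing in for the MutationModel query result.
var rows = new[]
{
    new { Status = "Nieuw", UserId = "KLM22940" },
    new { Status = "Nieuw", UserId = "KLM00001" },
    new { Status = "Verwerkt", UserId = "KLM22940" },
}.AsQueryable();

// The field names arrive as strings, e.g. from the grid's filter request.
var statusField = "Status";
var userField = "UserId";

var filtered = WhereEquals(WhereEquals(rows, statusField, "Nieuw"), userField, "KLM22940")
    .ToList();

foreach (var row in filtered)
    Console.WriteLine($"{row.Status} {row.UserId}");

// Builds the predicate x => x.<propertyName> == value at run time.
// Expression.Property throws an ArgumentException if the property does not exist.
static IQueryable<T> WhereEquals<T>(IQueryable<T> source, string propertyName, object value)
{
    var parameter = Expression.Parameter(typeof(T), "x");
    var property = Expression.Property(parameter, propertyName);
    var constant = Expression.Constant(value, property.Type);
    var body = Expression.Equal(property, constant);
    return source.Where(Expression.Lambda<Func<T, bool>>(body, parameter));
}
```

Because the result is still an IQueryable<T> of the original element type, all columns stay available and further filters can be chained before the query is materialized.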

Linq: Select Most Recent Record of Each Group

I want to get the latest record of each group from a SQL Server table using Linq.
Table Example:
I want to get this result:
My Linq query returns one record for each company, but it doesn't return the most recent ones:
var query = from p in db.Payments
where p.Status == false
&& DateTime.Compare(DateTime.Now, p.NextPaymentDate.Value) == 1
group p by p.CompanyID into op
select op.OrderByDescending(nd => nd.NextPaymentDate.Value).FirstOrDefault();
What am I missing here? Why isn't the NextPaymentDate being ordered correctly?
!!UPDATE!!
My query is working as expected. After analysing @Gilang's and @JonSkeet's comments I ran further tests and found that I wasn't getting the intended results because of a column that wasn't being updated.
var query = from p in db.Payments
where p.Status == false
group p by p.CompanyID into op
select new {
CompanyID = op.Key,
NextPaymentDate = op.Max(x => x.NextPaymentDate),
Status = false
};
The reason your query is not being ordered correctly is that it does not aggregate within each group. You grouped correctly by CompanyID, but you then have to retrieve the maximum NextPaymentDate by calling an aggregate function.
Status can be assigned false because the Where clause has already filtered on it.
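The two approaches in this thread can be compared side by side in memory. This is a sketch over hypothetical data, using LINQ to Objects and a non-nullable NextPaymentDate for brevity (the question's column is nullable): the aggregate version returns only the key and the latest date, while ordering within each group returns the whole latest record.

```csharp
using System;
using System.Linq;

// Hypothetical payments; the last row is excluded by the Status filter.
var payments = new[]
{
    new { CompanyID = 1, NextPaymentDate = new DateTime(2015, 1, 10), Status = false },
    new { CompanyID = 1, NextPaymentDate = new DateTime(2015, 3, 10), Status = false },
    new { CompanyID = 2, NextPaymentDate = new DateTime(2015, 2, 5), Status = false },
    new { CompanyID = 2, NextPaymentDate = new DateTime(2015, 4, 5), Status = true },
};

// Aggregate per group: key plus the maximum date.
var latestDates = payments
    .Where(p => !p.Status)
    .GroupBy(p => p.CompanyID)
    .Select(g => new { CompanyID = g.Key, NextPaymentDate = g.Max(x => x.NextPaymentDate) })
    .ToList();

// Order inside each group and take the first element: the full latest record.
var latestRecords = payments
    .Where(p => !p.Status)
    .GroupBy(p => p.CompanyID)
    .Select(g => g.OrderByDescending(x => x.NextPaymentDate).First())
    .ToList();

foreach (var d in latestDates)
    Console.WriteLine($"max: company {d.CompanyID} -> {d.NextPaymentDate:yyyy-MM-dd}");
foreach (var r in latestRecords)
    Console.WriteLine($"record: company {r.CompanyID} -> {r.NextPaymentDate:yyyy-MM-dd}");
```

On this data both variants pick the same date for each company, which is consistent with the update in the question that the original query was working as expected.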

Why does LINQ Date Column comparison not work?

For a LINQ query like:
var entities = from Account p in context.Accounts
where p.LastTimeServerSettingsChanged > p.LastTimeDeviceConnected
select p;
the query that is generated is:
SELECT
[Extent1].[Username] AS [Username],
[Extent1].[LastTimeDeviceConnected] AS [LastTimeDeviceConnected],
[Extent1].[LastTimeServerSettingsChanged] AS [LastTimeServerSettingsChanged]
FROM [dbo].[Account] AS [Extent1]
WHERE [Extent1].[LastTimeServerSettingsChanged] > [Extent1].[LastTimeDeviceConnected]
And this does not work (no results).
And the following also generates the same SQL (hence no results also)
var entities = context.Accounts.Where(k => k.LastTimeServerSettingsChanged > k.LastTimeDeviceConnected).Select(k => k);
My question is why, and how can this query be performed (using LINQ)?
The above code works fine. I was hitting the wrong database and hence was getting the wrong result. GIGO. QED.

LINQ to NHibernate (3.0) : GroupBy and Sum in subquery gives NoTImplemented

I have a Linq query using NHibernate 3.0. But it keeps returning an error.
threw exception: System.NotImplementedException: The method or operation is not implemented..
I tried the same in LINQ 2 SQL and it works perfectly.
What might be wrong here? Here is part of my select, it's a subquery with a Groupby and Sum.
Amount = (System.Double)
((from m0 in _session.Query<Statement>()
where m0.Code== c.Code
group m0 by new
{
m0.Code
}
into g
select new
{
Expr1 = (System.Double)g.Sum(p => p.Amount)
}).First().Expr1)
};
I have the latest CSR1 installed of NHibernate but it just doesn't seem to work with my query.
The LINQ provider in NH3 is currently in a beta state. There are certain constructs that are not yet supported. (The team plans to address this after the NH3 release.) The parts causing problems in your query are the "new {}" anonymous type in the group by clause and the First() in the context of a group by. Both are not currently implemented. The following query executes properly and should give the same results:
var query = from m0 in session.Query<Statement>()
where m0.Code == c.Code
group m0 by m0.Code into g
select new {Expr1 = g.Sum(p => p.Amount)};
var result = query.ToList().First().Expr1;
First note that the "new {}" in the group by clause is not required. The other change was adding "ToList()". This forces the results to be queried from the database and then we use LINQ-to-Objects to get the First() result. The SQL generated for this query is:
select cast(sum(statement0_.Amount) as DOUBLE PRECISION) as col_0_0_
from Statement statement0_
where (statement0_.Code is null)
and ('FOO' /* @p0 */ is null)
or statement0_.Code = 'FOO' /* @p0 */
group by statement0_.Code
