How does this LINQ query execute? - linq

Data = _db.ALLOCATION_D.OrderBy(a => a.ALLO_ID)
.Skip(10)
.Take(10)
.ToList();
Let's say I have 100,000 rows in the ALLOCATION_D table and I want to select the first 10 rows. Now I want to know how the above statement executes. I'm not sure, but I think it executes in the following way...
first it selects all 100,000 rows
then orders them by ALLO_ID
then skips 10 rows
finally takes the 10 rows.
Is that right? I'd like to know more details.

This LINQ produces a SQL query via Entity Framework. The exact SQL depends on your DBMS, but for SQL Server 2008, here is the query it produces:
SELECT TOP (10) [Extent1].[ALLO_ID] AS [ALLO_ID]
FROM (
SELECT [Extent1].[ALLO_ID] AS [ALLO_ID]
, row_number() OVER (ORDER BY [Extent1].[ALLO_ID] ASC) AS [row_number]
FROM [dbo].[ALLOCATION_D] AS [Extent1]
) AS [Extent1]
WHERE [Extent1].[row_number] > 10
ORDER BY [Extent1].[ALLO_ID] ASC
You can run this in your C# code to retrieve the generated query:
var linqQuery = _db.ALLOCATION_D
.OrderBy(a => a.ALLO_ID)
.Skip(10)
.Take(10);
var sqlQuery = ((System.Data.Objects.ObjectQuery)linqQuery).ToTraceString();
Data = linqQuery.ToList();
Second option, with LINQ to SQL:
var linqQuery = _db.ALLOCATION_D
.OrderBy(a => a.ALLO_ID)
.Skip(10)
.Take(10);
var sqlQuery = _db.GetCommand(linqQuery).CommandText;
Data = linqQuery.ToList();
References:
How do I view the SQL generated by the entity framework?
How to: Display Generated SQL
How to view LINQ Generated SQL statements?

Your statement reads as follows:
Select all rows (deferred; the Skip/Take below narrow it down)
Order them by ALLO_ID
Skip the first 10 rows
Take the next 10 rows
If you want it to select the first ten rows, you simply do this:
Data = _db.ALLOCATION_D
.OrderBy(a => a.ALLO_ID)
.Take(10) // no Skip needed for the first ten rows
.ToList();

Up to the ToList call, the calls only build expressions. That means the OrderBy, Skip and Take calls are bundled up as an expression that is then sent to Entity Framework to be executed in the database.
Entity Framework generates a SQL query from that expression, which returns the ten rows from the table; the ToList method reads them and places them in a List<T>, where T is the type of the items in the ALLOCATION_D collection.
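To see the deferred execution for yourself, here is a minimal sketch (using the _db context and ALLOCATION_D set from the question):
// Nothing is sent to the database here; this only builds up an expression tree.
var page = _db.ALLOCATION_D
    .OrderBy(a => a.ALLO_ID)
    .Skip(10)
    .Take(10);
// The SQL is generated and executed only when the query is enumerated:
var data = page.ToList(); // one round trip, returns 10 rows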

Related

Oracle not using index, Entity Framework & Devart DotConnect for Oracle

The table in question has ~30 million records. Using Entity Framework, I write a LINQ query like this:
dbContext.MyTable.FirstOrDefault(t => t.Col3 == "BQJCRHHNABKAKU-KBQPJGBKSA-N");
Devart DotConnect for Oracle generates this:
SELECT
Extent1.COL1,
Extent1.COL2,
Extent1.COL3
FROM MY_TABLE Extent1
WHERE (Extent1.COL3 = :p__linq__0) OR ((Extent1.COL3 IS NULL) AND (:p__linq__0 IS NULL))
FETCH FIRST 1 ROWS ONLY
The query takes about four minutes, obviously a full table scan.
However, handcrafting this SQL:
SELECT
Extent1.COL1,
Extent1.COL2,
Extent1.COL3
FROM MY_TABLE Extent1
WHERE Extent1.COL3 = :p__linq__0
FETCH FIRST 1 ROWS ONLY
returns the expected match in 200ms.
Question: Why is that? I would expect the query optimizer to notice that the right part is false when the parameter is not null, so why doesn't the first query hit the index?
Please set UseCSharpNullComparisonBehavior=false explicitly:
var config = Devart.Data.Oracle.Entity.Configuration.OracleEntityProviderConfig.Instance;
config.QueryOptions.UseCSharpNullComparisonBehavior = false;
If this doesn't help, send us a small test project with the corresponding DDL script so that we can investigate the issue.

optimize linq query with multiple include statements

The LINQ query takes around 20 seconds to execute on some of the data. When the LINQ is converted to SQL, there are 3 nested joins that might be adding to the execution time. Can we optimize the query below?
var query = (from s in this.Items
where demoIds.Contains(s.Id)
select s)
.Include("demo1")
.Include("demo2")
.Include("demo3")
.Include("demo4");
return query;
The expectation is to execute the query in 3-4 seconds; it currently takes around 20 seconds for 100 demoIds.
As far as your code is concerned, it looks like the best way to get what you want.
However, the database you use will have a way to optimize your queries, or rather the underlying data structures. Use whatever tool your database provider has to get an execution plan for the query and see where it spends so much time. You might be missing an index or two.
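To get the exact SQL to feed into such a tool, one option is to log what the provider sends (a sketch assuming EF6 or later; ctx and query stand in for your context and the query above):
// Write every generated SQL statement to the console (or to your logger of choice).
ctx.Database.Log = Console.WriteLine;
var results = query.ToList(); // run it once, then paste the logged SQL into the execution-plan tool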
I advise lazy loading or a join query.
The SQL output is probably a query like this:
(SELECT .. FROM table1 WHERE ID in (...)) AS T1
(INNER, FULL) JOIN (SELECT .. FROM table2) AS T2 ON T1.PK = T2.FOREIGNKEY
(INNER, FULL) JOIN (SELECT .. FROM table3) AS T3 ON T1.PK = T3.FOREIGNKEY
(INNER, FULL) JOIN (SELECT .. FROM table4) AS T4 ON T1.PK = T4.FOREIGNKEY
But if you can use lazy loading, there is no need for the Include() calls, and lazy loading may solve your problem.
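A minimal sketch of the lazy-loading route (the entity and property type names here are illustrative, modeled on the includes above):
// Navigation properties must be declared virtual for lazy-loading proxies to work.
public class Item
{
    public int Id { get; set; }
    public virtual Demo1 demo1 { get; set; } // loaded by its own query on first access
    public virtual Demo2 demo2 { get; set; }
}
// With lazy loading enabled (the EF default), no Include() calls are needed:
var items = this.Items.Where(s => demoIds.Contains(s.Id)).ToList();
var d = items.First().demo1; // this access triggers a separate query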
Otherwise, you can write it as a join query:
var query = from i in this.Items.Where(w=>demoIds.Contains(w.Id))
join d1 in demo1 on i.Id equals d1.FK
join d2 in demo2 on i.Id equals d2.FK
join d3 in demo3 on i.Id equals d3.FK
select new ... { };
These two solutions should solve your problem.
If the problem persists, I strongly recommend a stored procedure.
I had a similar issue with a query that had 15+ Include statements and generated a 2M+ row result in 7 minutes.
The solution that worked for me was:
Disabled lazy loading
Disabled auto detect changes
Split the big query into small chunks
A sample can be found below:
public IQueryable<CustomObject> PerformQuery(int id)
{
ctx.Configuration.LazyLoadingEnabled = false;
ctx.Configuration.AutoDetectChangesEnabled = false;
IQueryable<CustomObject> customObjectQueryable = ctx.CustomObjects.Where(x => x.Id == id);
var selectQuery = customObjectQueryable.Select(x => x.YourObject)
.Include(c => c.YourFirstCollection)
.Include(c => c.YourFirstCollection.OtherCollection)
.Include(c => c.YourSecondCollection);
var otherObjects = customObjectQueryable.SelectMany(x => x.OtherObjects);
selectQuery.FirstOrDefault();
otherObjects.ToList();
return customObjectQueryable;
}
IQueryable is needed in order to do all the filtering on the server side. IEnumerable would perform the filtering in memory, which is a very time-consuming process. Entity Framework will fix up any associations in memory.
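To illustrate the difference, a minimal sketch using the same ctx and CustomObjects from the sample above:
// IQueryable: the Where is translated to SQL and the filtering happens in the database.
IQueryable<CustomObject> serverSide = ctx.CustomObjects.Where(x => x.Id == id);
// IEnumerable: AsEnumerable() switches to LINQ to Objects, so rows are streamed to the
// client and the Where runs in memory.
IEnumerable<CustomObject> clientSide = ctx.CustomObjects.AsEnumerable().Where(x => x.Id == id);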

LINQ query (or lambda expression) to return records that match a list

I have a list of strings (converted from Guids) that contains the IDs of items I want to pull from my table.
Then, in my LINQ query, I am trying to figure out how to do an in clause to pull records that are in that list.
Here is the LINQ
var RegionRequests = (from r in db.course_requests
where PendingIdList.Contains(r.request_state.ToString())
select r).ToList();
It builds, but I get a run error: "System.NotSupportedException: LINQ to Entities does not recognize the method 'System.String ToString()' method, and this method cannot be translated into a store expression".
I would prefer to compare guid to guid, but that gets me nowhere.
Can this be converted to a lambda expression? If that is best, how?
LINQ to Entities tries to convert your expression into a SQL statement, and SQL Server has no translation for the ToString() method.
Fix:
var regionRequests =
from r in db.course_requests.ToList()
where PendingIdList.Contains(r.request_state.ToString())
select r;
With db.course_requests.ToList() you force LINQ to materialize the table's data (with a big table, you're going to have a bad time), and the ToString() is then executed in memory in the object context.
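A cheaper alternative (a sketch, assuming request_state is a Guid column and the strings in PendingIdList are valid Guids) is to convert the list back to Guids up front so the Contains can be translated to SQL:
// Convert once on the client; the query below then translates Contains into an IN clause.
List<Guid> pendingGuids = PendingIdList.Select(Guid.Parse).ToList();
var regionRequests = (from r in db.course_requests
                      where pendingGuids.Contains(r.request_state)
                      select r).ToList();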
You stated: I have a list of strings (converted from Guid) ...
Can you NOT convert them into strings and keep it as a List<System.Guid>? Then you can do this (assuming PendingIdGuidList is a List<System.Guid>):
var regionRequets = (from r in db.course_requests
join p in PendingIdGuidList on r.request_state equals p
select r).ToList();
Edited to add:
I ran a test on this using the following code:
var db = new EntityModels.MapleCreekEntities();
List<System.Guid> PendingIdGuidList =
new List<System.Guid>() {
System.Guid.Parse("77dfd79e-2d61-40b9-ac23-36eb53dc55bc"),
System.Guid.Parse("cd409b96-de92-4fd7-8870-aa42eb5b8751")
};
var regionRequets = (from r in db.Users
join p in PendingIdGuidList on r.Test equals p
select r).ToList();
Users is a table in my database. I added a column called Test with the uniqueidentifier data type, then modified 2 records with the Guids above.
I know it's not exactly a 1:1 of what the OP is doing, but pretty close. Here is the profiled SQL statement:
SELECT
[Extent1].[ID] AS [ID],
[Extent1].[UserLogin] AS [UserLogin],
[Extent1].[Password] AS [Password],
[Extent1].[Test] AS [Test]
FROM [dbo].[Users] AS [Extent1]
INNER JOIN (SELECT
cast('77dfd79e-2d61-40b9-ac23-36eb53dc55bc' as uniqueidentifier) AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable1]
UNION ALL
SELECT
cast('cd409b96-de92-4fd7-8870-aa42eb5b8751' as uniqueidentifier) AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable2]) AS [UnionAll1] ON [Extent1].[Test] = [UnionAll1].[C1]

SQL to LINQ query asp.net

I am currently trying to get some statistics for my website, but I can't seem to create the query to get the username that appears most frequently across all the rows.
The SQL query should look something like this:
SELECT username FROM Views GROUP BY username ORDER BY COUNT(*) DESC LIMIT 1
How do I write that query in my controller?
var username = db.Views.GroupBy(v => v.username).OrderByDescending(g => g.Count()).First().Key;
(from a in Views
 group a by a.username into b
 let c = b.Count()
 orderby c descending
 select b.Key).Take(1);
Here is your query converted to LINQ:
var temp = (from a in Views
group a.username by a.username into b
orderby b.Count() descending
select b.Key).Take(1);
You can't do LIMIT 1 (that's MySQL syntax); LINQ to SQL only generates T-SQL for SQL Server, where Take(1) becomes TOP (1).

Optimizing a LINQ to SQL query

I have a query that looks like this:
public IList<Order> FetchLatestOrders(int pageIndex, int recordCount)
{
DatabaseDataContext db = new DatabaseDataContext();
return (from o in db.Orders
orderby o.CreatedDate descending
select o)
.Skip(pageIndex * recordCount)
.Take(recordCount)
.ToList();
}
I need to print the information of the order and the user who created it:
foreach (var o in FetchLatestOrders(0, 10))
{
Console.WriteLine("{0} {1}", o.Code, o.Customer.Name);
}
This produces one SQL query to bring back the orders and one query per order to bring back its customer. Is it possible to optimize the query so that it brings back the orders and their customers in one SQL query?
Thanks
UPDATE: At sirrocco's suggestion I changed the query as follows and it works. Only one select query is generated:
public IList<Order> FetchLatestOrders(int pageIndex, int recordCount)
{
var options = new DataLoadOptions();
options.LoadWith<Order>(o => o.Customer);
using (var db = new DatabaseDataContext())
{
db.LoadOptions = options;
return (from o in db.Orders
orderby o.CreatedDate descending
select o)
.Skip(pageIndex * recordCount)
.Take(recordCount)
.ToList();
}
}
Thanks sirrocco.
Something else you can do is eager loading. In LINQ to SQL you can use LoadOptions: More on LoadOptions
One VERY weird thing about L2S is that you can set LoadOptions only before the first query is sent to the Database.
You might want to look into using compiled queries; have a look at http://www.3devs.com/?p=3
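A minimal sketch of a compiled query for this paging scenario (assuming the DatabaseDataContext and Order entity from the question; requires System.Data.Linq):
// Compiled once and reused on every call, so LINQ to SQL skips re-translating the expression tree.
static readonly Func<DatabaseDataContext, int, int, IQueryable<Order>> LatestOrders =
    CompiledQuery.Compile((DatabaseDataContext db, int pageIndex, int recordCount) =>
        db.Orders
          .OrderByDescending(o => o.CreatedDate)
          .Skip(pageIndex * recordCount)
          .Take(recordCount));
// Usage: var page = LatestOrders(db, 0, 10).ToList();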
Given a LINQ statement like:
context.Cars
.OrderBy(x => x.Id)
.Skip(50000)
.Take(1000)
.ToList();
This roughly gets translated into:
select * from [Cars] order by [Cars].[Id] asc offset 50000 rows fetch next 1000 rows only
Because OFFSET and FETCH are extensions of ORDER BY, they are not applied until after the select portion runs. This means an expensive select with lots of joins is executed against the whole dataset ([Cars]) before the fetched results are narrowed down.
Optimize the statement
All that is needed is to take the OrderBy, Skip and Take calls and put them inside a Where clause on the key:
context.Cars
.Where(x => context.Cars.OrderBy(y => y.Id).Select(y => y.Id).Skip(50000).Take(1000).Contains(x.Id))
.ToList();
This roughly gets translated into:
exec sp_executesql N'
select * from [Cars]
where exists
(select 1 from
(select [Cars].[Id] from [Cars] order by [Cars].[Id] asc offset @p__linq__0 rows fetch next @p__linq__1 rows only
) as [Limit1]
where [Limit1].[Id] = [Cars].[Id]
)
order by [Cars].[Id] asc',N'@p__linq__0 int,@p__linq__1 int',@p__linq__0=50000,@p__linq__1=1000
So now, the outer select statement only executes against the filtered dataset produced by the WHERE EXISTS clause!
Again, your mileage may vary on how much query time the change saves. The general rule of thumb is: the more complex your select statement and the deeper into the dataset you want to go, the more this optimization will help.
