Optimising Lambda Linq to SQL query with OrderBy - performance

I have the following lambda expression:
IEnumerable<Order> query
= _ordersRepository.GetAllByFilter(
o =>
o.OrderStatus.OrderByDescending(os => os.Status.Date).First()
.Status.StatusType.DisplayName != "Completed"
||
o.OrderStatus.OrderByDescending(os => os.Status.Date).First()
.Status.Date > sinceDate
).OrderBy(o => o.DueDate);
As you can see, I'm having to order the collection twice within the main query (so three times in total) in order to perform my OR query.
1) Is the query optimiser clever enough to deal with this in an efficient way?
2) If not, how can I rewrite this expression to only order by once, but keeping with lambda syntax?
This is linked to this previous question, which explains the query in a bit more detail if the above code isn't clear.

1) Is the query optimiser clever enough to deal with this in an efficient way?
You can capture the SQL for this query (one way is to use SQL Server Profiler), and then ask SQL Server Management Studio for the execution plan. Unless you do this, there is no way to know what the optimizer will do. My guess is that the answer is "no".
2) If not, how can I rewrite this expression to only order by once, but keeping with lambda syntax?
Like this:
IEnumerable<Order> query = _ordersRepository.GetAllByFilter(o =>
        // Sort once, keep only the latest status, then test both conditions on it.
        o.OrderStatus
            .OrderByDescending(os => os.Status.Date)
            .Take(1)
            .Any(os => os.Status.StatusType.DisplayName != "Completed"
                    || os.Status.Date > sinceDate)
    )
    .OrderBy(o => o.DueDate);
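To check that the Take(1).Any(...) shape gives the same answer as the double First() shape, here is a minimal in-memory sketch using LINQ to Objects. The StatusEntry and Order records are hypothetical, flattened stand-ins for the question's entities (they are not the real model), and the sketch says nothing about the SQL a provider would generate. It assumes C# 9 or later for records.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, flattened stand-ins for the question's entities.
public record StatusEntry(DateTime Date, string DisplayName);
public record Order(int Id, List<StatusEntry> History);

public static class LatestStatusDemo
{
    // Original shape: sorts the status history twice.
    public static bool ViaFirst(Order o, DateTime sinceDate) =>
        o.History.OrderByDescending(s => s.Date).First().DisplayName != "Completed"
        || o.History.OrderByDescending(s => s.Date).First().Date > sinceDate;

    // Rewritten shape: sorts once, then tests the single latest entry.
    public static bool ViaTakeAny(Order o, DateTime sinceDate) =>
        o.History.OrderByDescending(s => s.Date)
                 .Take(1)
                 .Any(s => s.DisplayName != "Completed" || s.Date > sinceDate);

    public static void Main()
    {
        var sinceDate = new DateTime(2020, 1, 1);
        var orders = new List<Order>
        {
            new Order(1, new List<StatusEntry>
            {
                new StatusEntry(new DateTime(2019, 5, 1), "Open"),
                new StatusEntry(new DateTime(2019, 6, 1), "Completed"),
            }),
            new Order(2, new List<StatusEntry>
            {
                new StatusEntry(new DateTime(2019, 5, 1), "Open"),
            }),
            new Order(3, new List<StatusEntry>
            {
                new StatusEntry(new DateTime(2020, 2, 1), "Completed"),
            }),
        };

        // Print both results side by side for each order.
        foreach (var o in orders)
            Console.WriteLine($"{o.Id}: {ViaFirst(o, sinceDate)} / {ViaTakeAny(o, sinceDate)}");
    }
}
```

One behavioural difference worth knowing: in LINQ to Objects, First() throws on an order with no status rows, whereas Take(1).Any(...) simply returns false for it.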

Regarding your first point: you can see the generated SQL by assigning a TextWriter (for example Console.Out) to the Log property of your DataContext object.
As for optimising your query, try the following (I've not tested it so I don't know if it will work)
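IEnumerable<Order> query
    = _ordersRepository.GetAllByFilter(
        o =>
            // Max returns a single date rather than a sequence, so first narrow
            // the statuses to the one(s) carrying that latest date, then test them.
            o.OrderStatus
                .Where(os => os.Status.Date == o.OrderStatus.Max(s => s.Status.Date))
                .Any(os =>
                    os.Status.StatusType.DisplayName != "Completed"
                    || os.Status.Date > sinceDate)
    ).OrderBy(o => o.DueDate);
Hopefully that applies the filter to the latest status only once, and uses a max rather than an order by with top 1.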

Related

Why is the "Select" in method syntax in another set of parentheses?

var sample = db.Database.OrderByDescending(x => x.RecordId).Select(y => y.RecordId).FirstOrDefault();
I'm not sure my title is right. I just want to ask why, in this query, the select is in another set of parentheses, as in .Select(y => y.RecordId), unlike the query syntax I'm used to:
var sample = (from s in db.Database where s.RecordId == id select s)
I know these are the same thing, right? So why is the select in another set of parentheses? Can anyone explain it? Thanks a lot.
In your first example, you're using "regular" C# syntax to call a bunch of extension methods:
var sample = db.Database
.OrderByDescending(x => x.RecordId)
.Select(y => y.RecordId)
.FirstOrDefault();
(They happen to be extension methods here, but of course they don't have to be...)
You use lambda expressions to express how you want the ordering and projection to be performed, and the compiler converts those into expression trees (assuming this is EF or similar; it would be delegates for LINQ to Objects).
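The delegate versus expression tree distinction can be seen without any database. The following is a small sketch: the class and field names are made up for illustration, and the lambda x => x + 1 is just a placeholder. The same lambda text is assigned once to a Func (compiled code) and once to an Expression (a data structure that a provider such as EF can inspect and translate).

```csharp
using System;
using System.Linq.Expressions;

public static class LambdaFormsDemo
{
    // The lambda compiled to executable code (what LINQ to Objects receives).
    public static readonly Func<int, int> AsDelegate = x => x + 1;

    // The same lambda compiled to a tree describing the code
    // (what an IQueryable provider receives).
    public static readonly Expression<Func<int, int>> AsTree = x => x + 1;

    public static void Main()
    {
        Console.WriteLine(AsDelegate(41));        // prints 42
        Console.WriteLine(AsTree.Body.NodeType);  // prints Add
        Console.WriteLine(AsTree.Compile()(41));  // prints 42
    }
}
```

Which form the compiler picks depends only on the parameter type of the method being called: Enumerable.Select takes a Func, while Queryable.Select takes an Expression wrapping a Func.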
The second example is a query expression, although it doesn't actually match your first example. A query expression corresponding to your original query would be:
var sample = (from x in db.Database
orderby x.RecordId descending
select x.RecordId)
.FirstOrDefault();
Query expressions are very much syntactic sugar. The compiler effectively converts them into the first form, then compiles that. The range variable declared in the from clause (x in this case) is used as the parameter name for the lambda expression, so select x.RecordId becomes .Select(x => x.RecordId).
Things become a bit more complicated with joins and multiple from clauses, as then the compiler introduces transparent identifiers to allow you to work with all the range variables that are in scope, even though you've really only got a single parameter. For example, if you had:
var query = from person in people
from job in person.Jobs
orderby person.Name
select new { Person = person, Job = job };
that would be translated into the equivalent of
var query = people.SelectMany(person => person.Jobs, (person, job) => new { person, job } )
.OrderBy(t => t.person.Name)
.Select(t => new { Person = t.person, Job = t.job });
Note how the compiler introduces an anonymous type to combine the person and job range variables into a single object, which is used later on.
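As a rough check of that translation, the sketch below runs both forms over in-memory data and compares the results. The Person record and the sample names are assumptions made up for the example, and it projects to a string rather than an anonymous type so the two results are easy to compare.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for illustration only.
public record Person(string Name, List<string> Jobs);

public static class TransparentIdentifierDemo
{
    // Query expression with two from clauses.
    public static List<string> ViaQueryExpression(IEnumerable<Person> people) =>
        (from person in people
         from job in person.Jobs
         orderby person.Name
         select person.Name + ":" + job).ToList();

    // Hand-written equivalent: the anonymous type plays the role of the
    // transparent identifier carrying both range variables forward.
    public static List<string> ViaMethodCalls(IEnumerable<Person> people) =>
        people.SelectMany(person => person.Jobs, (person, job) => new { person, job })
              .OrderBy(t => t.person.Name)
              .Select(t => t.person.Name + ":" + t.job)
              .ToList();

    public static void Main()
    {
        var people = new List<Person>
        {
            new Person("Zoe", new List<string> { "a", "b" }),
            new Person("Adam", new List<string> { "c" }),
        };

        Console.WriteLine(string.Join(", ", ViaQueryExpression(people)));
        Console.WriteLine(string.Join(", ", ViaMethodCalls(people)));
    }
}
```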
Basically, query expression syntax makes LINQ easier to work with - but it's just a translation into other C# code, and is neatly wrapped up in a single section of the C# specification. (Section 7.16.2 of the C# 5 spec.)
See my Edulinq blog post on query expressions for more detail on the precise translation from query expressions to "regular" C#.

Linq to Entities Query explanation

Is there any way I can make this Linq to entities query in another way (better) and understand what I did?
First, can I have the string.Join() in the first part (Select(p => new {...}))?
Second, why do I need the first select to end with .ToList() for the string.Join() to work?
The table relations are as follows:
And here is the code:
Productos.Select(p => new {
Id = p.Id,
Code = p.CodigoProd,
Name = p.Nombre,
Cant = p.Inventario.Sum(i => i.Cantidad),
Pric = p.Inventario.OrderBy(i => i.Precio).Select (i => i.Precio).FirstOrDefault(),
cate = p.ProductosXCategoria.Select(pc => pc.CategoriasdeProducto.Nombre)
}).Where (p => p.Cant != null).ToList()
.Select (r => new {
r.Id, r.Code, r.Cant, r.Name, r.Pric, Categ = string.Join("-",r.cate)
})
The result is this (which is the result I expected):
IEnumerable<> (17 items)
Id  Code  Cant  Name          Pric  Categ
1   AXI   30    Pepsi         10    Granos
3   ASI   38    Carne blanca  12    Granos-Limpieza
The query looks fine to me.
The reason you can't move the string.Join call into the first Select is that LINQ-to-Entities ultimately has to translate the whole query to SQL. string.Join has no direct translation to SQL, so the provider cannot translate a query that contains it. By calling ToList() first, you bring the results of the first Select into memory, where the subsequent Select runs under LINQ-to-Objects. Since LINQ-to-Objects does not need to translate anything to SQL, it can operate directly on the in-memory results of the first query.
Generally, you would want to put everything that would be better left to SQL before the ToList() call (such as filtering, sorting, averaging, grouping, etc.), and leave additional work that can't be translated to SQL (or isn't as efficient to do so) for after the results have been brought into local memory.
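The two-stage shape described above can be sketched as follows. This is only an in-memory stand-in: the Product record is hypothetical, and AsQueryable() over a list is backed by LINQ to Objects, which would happily run string.Join anywhere, so the sketch shows where the boundary sits rather than reproducing the translation error.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for the question's Productos entity.
public record Product(int Id, string Name, List<string> Categories);

public static class TwoStageProjectionDemo
{
    public static List<string> Describe(IQueryable<Product> products) =>
        products
            // Stage 1: filtering and projection that a real provider could translate to SQL.
            .Where(p => p.Categories.Any())
            .Select(p => new { p.Id, p.Name, p.Categories })
            // Boundary: the query executes here and the rows are now in memory.
            .ToList()
            // Stage 2: plain LINQ to Objects, so string.Join is fine.
            .Select(r => $"{r.Id} {r.Name} {string.Join("-", r.Categories)}")
            .ToList();

    public static void Main()
    {
        var products = new List<Product>
        {
            new Product(1, "Pepsi", new List<string> { "Granos" }),
            new Product(3, "Carne blanca", new List<string> { "Granos", "Limpieza" }),
        }.AsQueryable();

        foreach (var line in Describe(products))
            Console.WriteLine(line);
    }
}
```

AsEnumerable() can serve as the boundary instead of ToList() when you do not need a materialised list in between.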

How to convert a LINQ query from query syntax to method syntax

Linq and EF4.
I have this Linq query in query syntax that I would like to convert to method syntax.
Are you able to do it? I tried for more than 2 hours without success :-(
Thanks for your time
CmsContent myContentObj = (from cnt in context.CmsContents
from categoy in cnt.CmsCategories
where categoy.CategoryId == myCurrentCategoryId && cnt.ContentId == myCurrentContentId
select cnt).Single();
My original answer selected the wrong item. It's a bit more complicated than what I had (which Ani has posted). However, here's what I believe is an equivalent query that should perform better:
CmsContent myContentObj =
context.CmsContents
.Where(cnt => cnt.ContentId == myCurrentContentId
&& cnt.CmsCategories
.Any(categoy => categoy.CategoryId == myCurrentCategoryId))
.Single();
Here is a non-direct translation that I believe performs the same task in much less code:
var myContentObj = context.CmsContents.Single(
x => x.ContentId == myCurrentContentId &&
x.CmsCategories.Any(y => y.CategoryId == myCurrentCategoryId)
);
Here's how the C# compiler actually does it, with some help from .NET Reflector to verify:
var myContentObj = context
.CmsContents
.SelectMany(cnt => cnt.CmsCategories,
(cnt, categoy) => new { cnt, categoy })
.Where(a => a.categoy.CategoryId == myCurrentCategoryId
&& a.cnt.ContentId == myCurrentContentId)
.Select(a => a.cnt)
.Single();
Essentially, the 'nested' from clauses result in a SelectMany call with a transparent identifier (an anonymous-type instance holding the 'parent' cnt and the 'child' categoy). The Where filter is applied to the anonymous-type instance, and then we do another Select projection to get back the 'parent'. The Single call was always 'outside' the query expression of course, so it should be obvious how that fits in.
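To compare the literal SelectMany translation with the shorter Any-based form, here is a small in-memory sketch. The Category and Content records are hypothetical stand-ins for the EF entities, and running this under LINQ to Objects says nothing about the SQL either form would produce.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the question's entities.
public record Category(int CategoryId);
public record Content(int ContentId, List<Category> CmsCategories);

public static class SelectManyVsAnyDemo
{
    // Literal translation: flatten to (content, category) pairs, filter, project back.
    public static Content ViaSelectMany(IEnumerable<Content> contents, int contentId, int categoryId) =>
        contents.SelectMany(cnt => cnt.CmsCategories, (cnt, category) => new { cnt, category })
                .Where(a => a.category.CategoryId == categoryId && a.cnt.ContentId == contentId)
                .Select(a => a.cnt)
                .Single();

    // Shorter form: filter the parents directly with a nested Any.
    public static Content ViaAny(IEnumerable<Content> contents, int contentId, int categoryId) =>
        contents.Single(x => x.ContentId == contentId
                          && x.CmsCategories.Any(y => y.CategoryId == categoryId));

    public static void Main()
    {
        var contents = new List<Content>
        {
            new Content(1, new List<Category> { new Category(10), new Category(20) }),
            new Content(2, new List<Category> { new Category(10) }),
        };

        Console.WriteLine(ViaSelectMany(contents, 1, 20).ContentId);
        Console.WriteLine(ViaAny(contents, 1, 20).ContentId);
    }
}
```

The two are not identical in every case: if one content item matched the same category twice, the SelectMany form would yield that item twice and Single would throw, while the Any form would still return it once.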
For more information, I suggest reading Jon Skeet's article How query expressions work.

How do I merge two LINQ statements into one to perform a list2.Except(list1)?

Currently, I have the following LINQ queries. How can I merge the two queries into one? Basically, I want to write a LINQ query that brings back the results I'd get from
IEnumerable<int> deltaList = people2010.Except(allPeople);
except in a single query.
var people2010 = Contacts.Where(x => x.Contractors
.Any(d => d.ContractorsStatusTrackings
.Any(date => date.StatusDate.Year >= 2010)))
.Select(x => x.ContactID);
var allPeople = Contacts.Where(x => x.Contractors
.Any(m => m.ContactID == x.ContactID))
.Select(x=> x.ContactID);
Thanks!
Why can you not just do Except as you are doing? Don't forget that your people2010 and allPeople variables are just queries - they're not the data. Why not just use them as they are?
If that's not acceptable for some reason, please give us more information, such as whether this is LINQ to Objects, LINQ to SQL etc., and what's wrong with just using Except.
It sounds like you're just looking for a more elegant way to write your query. I believe that this is a more elegant way to write your combined queries:
var deltaList =
    from contact in Contacts
    let contractors = contact.Contractors
    // In the 2010+ set...
    where contractors.Any(ctor => ctor.ContractorsStatusTrackings
                                      .Any(date => date.StatusDate.Year >= 2010))
          // ...but not in the allPeople set.
          && !contractors.Any(m => m.ContactID == contact.ContactID)
    select contact.ContactID;
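A quick way to convince yourself that the combined query matches the Except version is to run both over in-memory data. In this sketch the Contractor and Contact records are hypothetical simplifications (the status-tracking rows are reduced to a list of dates), and the sample IDs are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-ins for the question's entities.
public record Contractor(int ContactID, List<DateTime> StatusDates);
public record Contact(int ContactID, List<Contractor> Contractors);

public static class ExceptMergeDemo
{
    // The two original queries combined with Except.
    public static List<int> ViaExcept(IEnumerable<Contact> contacts)
    {
        var people2010 = contacts
            .Where(x => x.Contractors.Any(d => d.StatusDates.Any(date => date.Year >= 2010)))
            .Select(x => x.ContactID);
        var allPeople = contacts
            .Where(x => x.Contractors.Any(m => m.ContactID == x.ContactID))
            .Select(x => x.ContactID);
        return people2010.Except(allPeople).ToList();
    }

    // The single query: in the first set and not in the second.
    public static List<int> ViaSingleQuery(IEnumerable<Contact> contacts) =>
        (from contact in contacts
         let contractors = contact.Contractors
         where contractors.Any(c => c.StatusDates.Any(date => date.Year >= 2010))
               && !contractors.Any(m => m.ContactID == contact.ContactID)
         select contact.ContactID).ToList();

    public static void Main()
    {
        var contacts = new List<Contact>
        {
            new Contact(1, new List<Contractor> { new Contractor(1, new List<DateTime> { new DateTime(2011, 1, 1) }) }),
            new Contact(2, new List<Contractor> { new Contractor(99, new List<DateTime> { new DateTime(2012, 1, 1) }) }),
            new Contact(3, new List<Contractor> { new Contractor(98, new List<DateTime> { new DateTime(2005, 1, 1) }) }),
        };

        Console.WriteLine(string.Join(", ", ViaExcept(contacts)));
        Console.WriteLine(string.Join(", ", ViaSingleQuery(contacts)));
    }
}
```

Except is a set operation, so it also removes duplicate IDs. The single query only matches it exactly when ContactID values are unique, which this sketch assumes.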

How to write dynamic Linq2Sql compiled queries?

I'm having performance issues with Linq2Sql compared to raw ADO.NET, which has led me down the path of compiled queries. This is what I have so far:
public static readonly Func<MyDataContext, WebServices.Search.Parameters, IQueryable<image>>
Compiled_SelectImagesLinq =
CompiledQuery.Compile<MyDataContext, WebServices.Search.Parameters, IQueryable<image>>(
(dc, parameters) => from i in dc.images
join l in dc.links on i.image_id equals l.image_id
join r in dc.resolutions on i.image_id equals r.image_id
where i.image_enabled == true && i.image_rating >= parameters.MinRating
&& i.image_rating <= parameters.MaxRating
select i
);
However I can't figure out how to add the extra optional parameters to the query as I currently do
if (parameters.Country != null)
{
query = query.Where(x => x.image_country_id == parameters.Country);
}
if (parameters.ComponentId != null)
{
query = query.Where(x => x.links.Any(l => l.link_component_id == parameters.ComponentId));
}
etc, etc
I tried writing another function which does
var query = Compiled_SelectImagesLinq(parameters);
and then adding the extra parameters to the query and returning
return query.Distinct().Take(parameters.Results);
But this doesn't seem right, and it returns no results.
Have a look at this article. It may not do what you need (especially since you are compiling your queries), but anytime someone mentions Dynamic and Linq in the same sentence, I refer them to this article:
Dynamic LINQ: Using the LINQ Dynamic Query Library
http://weblogs.asp.net/scottgu/archive/2008/01/07/dynamic-linq-part-1-using-the-linq-dynamic-query-library.aspx
You'd have to benchmark your specific query, but often queries must be used 10-20 times before compiled query performance improvements equal the overhead. Also, how are you adding parameters to the where clause?
Additionally, dynamic compiled queries seem a bit of a mismatch. The Dynamic LINQ query library will do what you need, but I don't think you'll get the compiled query performance improvement you want.

Resources