Differences between LINQ to Objects and LINQ to SQL queries - linq

I have been using LINQ to query my POCO objects for some time, but I have not yet tried LINQ to SQL. I assume that LINQ to SQL queries are somehow converted to equivalent SQL queries and, given this, I am wondering if that affects the way LINQ to SQL queries are or should be written.
Are there any significant differences between LINQ to Objects and LINQ to SQL that affect how I should write a query for either?

The main difference is, as you say, that LINQ to SQL queries are converted into SQL. That means there is code you can write that either cannot be converted or has subtly different semantics, and you only find that out at execution time.
For example:
var query = from person in people
            where person.Age == person.GetHashCode()
            select person;
will compile fine, but fail at execution time because LINQ to SQL doesn't know what to do with GetHashCode().
Basically I find LINQ to SQL a lot harder to predict than LINQ to Objects. That's not to say it's not useful - it's just a slightly different world. MS has done an amazing job at letting you write queries which very often just do what you expect them to, but it can't do everything.
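To make the "converted into SQL" point concrete, here is a minimal sketch of the same lambda captured once as a compiled delegate and once as an expression tree. The Person class and the RightHandMethodName helper are made up for illustration. A query provider receives the expression-tree form, which is why a call such as GetHashCode() compiles fine but can only be rejected when the provider tries to translate it.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity, used only for this illustration.
public class Person
{
    public int Age { get; set; }
}

public static class TranslationDemo
{
    // Inspects a predicate of the form "left == something.Method()" and
    // returns the name of the method called on the right-hand side.
    public static string RightHandMethodName(Expression<Func<Person, bool>> predicate)
    {
        var comparison = (BinaryExpression)predicate.Body;
        var call = (MethodCallExpression)comparison.Right;
        return call.Method.Name;
    }

    public static void Main()
    {
        // As a delegate, the lambda is ordinary compiled code and simply runs.
        Func<Person, bool> compiled = p => p.Age == p.GetHashCode();
        Console.WriteLine(compiled(new Person { Age = 30 }));

        // As an expression tree, the lambda is data that a provider has to translate.
        Console.WriteLine(RightHandMethodName(p => p.Age == p.GetHashCode())); // GetHashCode
    }
}
```

LINQ to Objects uses the delegate form, so any .NET method works there. A SQL-backed provider has to map each node of the tree to SQL and gives up on nodes it does not recognise.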

LINQ to SQL uses the collation of the database column (or server) for Where and OrderBy, whereas LINQ to Objects uses .NET string comparisons. So the former might be case-insensitive while the latter is case-sensitive.
LINQ to Entities coalesces nulls. I presume L2S does the same, but I haven't tested. So in L2E you can do:
let foo = item.Property.SomeNullableType
... and foo will be null if Property is null. But in LINQ to Objects you'd have to do something like:
let foo = item.Property != null ? item.Property.SomeNullableType : null
... or you'd get a NullReferenceException.
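For the LINQ to Objects side, a minimal sketch of the null check using the null-conditional operator instead of the ternary. The Item and Detail classes are hypothetical stand-ins for the types implied above. Note that ?. is not allowed inside expression trees, so this form only applies to in-memory queries.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types standing in for item.Property.SomeNullableType.
public class Detail
{
    public int? SomeNullableType { get; set; }
}

public class Item
{
    public Detail Property { get; set; }
}

public static class NullDemo
{
    public static List<int?> ReadValues(IEnumerable<Item> items)
    {
        // ?. yields null when Property is null instead of throwing.
        return items.Select(item => item.Property?.SomeNullableType).ToList();
    }

    public static void Main()
    {
        var items = new List<Item>
        {
            new Item { Property = new Detail { SomeNullableType = 5 } },
            new Item { Property = null }
        };

        foreach (var value in ReadValues(items))
            Console.WriteLine(value.HasValue ? value.Value.ToString() : "null");
    }
}
```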

MSDN reference here and here should help you out.

One difference I run into is in grouping.
When you group in LINQ to Objects, you get a hierarchically shaped result (keys, with child objects).
When you group in SQL, you get keys and aggregates only.
When you group in LINQ to SQL and ask for the child objects (more than aggregates), LINQ to SQL will re-query each group using the key to get those child objects. If you have thousands of groups, that can be thousands of round trips.
// This is OK: only the key and an aggregate are requested.
var counts = db.Orders
    .GroupBy(o => o.CustomerId)
    .Select(g => new
    {
        CustomerId = g.Key,
        OrderCount = g.Count()
    });

// This could be a lot of round trips: one extra query per group.
var idsPerGroup = db.Orders
    .GroupBy(o => o.CustomerId)
    .Select(g => new
    {
        CustomerId = g.Key,
        OrderIds = g.Select(o => o.OrderId)
    });

// This is OK: ToList() separates the LINQ to SQL work from the LINQ to Objects work.
var grouped = db.Orders
    .Select(o => new { o.CustomerId, o.OrderId })
    .ToList()
    .GroupBy(o => o.CustomerId)
    .Select(g => new
    {
        CustomerId = g.Key,
        OrderIds = g.Select(o => o.OrderId)
    });
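A sketch of the in-memory half of that last pattern, using ToLookup on rows that have already been fetched. The tuple array is a stand-in for the result of a single flat query, and the customer IDs are sample values. It shows the hierarchical shape (key plus child values) that LINQ to Objects produces without any further round trips.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingDemo
{
    // Groups already-materialized (CustomerId, OrderId) rows by customer.
    public static ILookup<string, int> OrderIdsByCustomer(
        IEnumerable<(string CustomerId, int OrderId)> rows)
    {
        return rows.ToLookup(r => r.CustomerId, r => r.OrderId);
    }

    public static void Main()
    {
        // Stand-in for the rows returned by one flat SELECT.
        var rows = new (string CustomerId, int OrderId)[]
        {
            ("ALFKI", 1),
            ("BONAP", 2),
            ("ALFKI", 3)
        };

        foreach (var group in OrderIdsByCustomer(rows))
            Console.WriteLine(group.Key + ": " + string.Join(", ", group));
    }
}
```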

Related

Linq to Entities Query explanation

Is there a better way to write this LINQ to Entities query, and can someone help me understand what I did?
First, can I put string.Join() in the first part (Select(p => new {...}))?
Second, why does the first Select need to end with .ToList() for string.Join() to work?
The table relations are as follows:
And here is the code:
Productos.Select(p => new {
        Id = p.Id,
        Code = p.CodigoProd,
        Name = p.Nombre,
        Cant = p.Inventario.Sum(i => i.Cantidad),
        Pric = p.Inventario.OrderBy(i => i.Precio).Select(i => i.Precio).FirstOrDefault(),
        cate = p.ProductosXCategoria.Select(pc => pc.CategoriasdeProducto.Nombre)
    })
    .Where(p => p.Cant != null)
    .ToList()
    .Select(r => new {
        r.Id, r.Code, r.Cant, r.Name, r.Pric, Categ = string.Join("-", r.cate)
    })
The result is this (which is what I expected):
IEnumerable<> (17 items)
Id  Code  Cant  Name          Pric  Categ
1   AXI   30    Pepsi         10    Granos
3   ASI   38    Carne blanca  12    Granos-Limpieza
The query looks fine to me.
The reason you can't move the string.Join method to the first Select, is that LINQ-to-Entities ultimately has to be able to translate to SQL. string.Join has no direct translation to SQL, so it doesn't know how to translate your LINQ query to it. By calling ToList() first, you bring the results of the first Select into memory, where the subsequent Select works with Linq-to-Objects. Since Linq-to-Objects does not need to translate to SQL, it can operate directly on the results of the first query in memory.
Generally, you would want to put everything that would be better left to SQL before the ToList() call (such as filtering, sorting, averaging, grouping, etc.), and leave additional work that can't be translated to SQL (or isn't as efficient to do so) for after the results have been brought into local memory.
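A rough sketch of that split. ProductRow is a hypothetical flattened shape, and the in-memory AsQueryable() source only stands in for a real database provider, so nothing here is actually sent to SQL. The point is the ordering of the steps: filtering sits before ToList(), and string.Join sits after it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape, not the asker's actual entity model.
public class ProductRow
{
    public int Id { get; set; }
    public int Quantity { get; set; }
    public List<string> Categories { get; set; }
}

public static class JoinDemo
{
    public static List<string> Summaries(IQueryable<ProductRow> products)
    {
        return products
            .Where(p => p.Quantity > 0)   // still IQueryable: a real provider would translate this
            .ToList()                     // results are now in memory
            .Select(p => p.Id + ":" + string.Join("-", p.Categories)) // plain LINQ to Objects
            .ToList();
    }

    public static void Main()
    {
        var rows = new List<ProductRow>
        {
            new ProductRow { Id = 1, Quantity = 30, Categories = new List<string> { "Granos" } },
            new ProductRow { Id = 3, Quantity = 38, Categories = new List<string> { "Granos", "Limpieza" } },
            new ProductRow { Id = 4, Quantity = 0, Categories = new List<string>() }
        };

        foreach (var line in Summaries(rows.AsQueryable()))
            Console.WriteLine(line);
    }
}
```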

preserving the order of returning entities when using .Contains(Id)

I want to hydrate a collection of entities by passing in a List of Ids and also preserve the order.
Another SO answer https://stackoverflow.com/a/15187081/1059911 suggested this approach to hydrating the entities which works great
var entities = db.MyEntities.Where(e => myListOfIds.Contains(e.ID)).ToList();
However, the order of entities in the collection differs from the order of the Ids.
Is there a way to preserve the order?
Maybe this helps:
var entities = db.MyEntities
    .Where(e => myListOfIds.Contains(e.ID))
    .OrderBy(e => myListOfIds.IndexOf(e.ID))
    .ToList();
EDIT
JohnnyHK clarified that this will not work with LINQ to Entities. For this to work you need to order an IEnumerable instead of an IQueryable, because the IQueryProvider does not know how to translate the local list's IndexOf method when it sends the query to the server. After AsEnumerable(), OrderBy works on local data, so you can do this:
var entities = db.MyEntities
    .Where(e => myListOfIds.Contains(e.ID))
    .AsEnumerable()
    .OrderBy(e => myListOfIds.IndexOf(e.ID))
    .ToList();
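A small sketch of what AsEnumerable() changes. It does not copy or execute anything by itself. It only changes the static type, so the operators that follow bind to Enumerable rather than Queryable. The in-memory AsQueryable() source is a stand-in for db.MyEntities, and the helper names are made up.

```csharp
using System;
using System.Linq;

public static class AsEnumerableDemo
{
    // Where on an IQueryable binds to Queryable.Where and stays translatable.
    public static bool WhereStaysQueryable(IQueryable<int> source)
    {
        object result = source.Where(x => x > 1);
        return result is IQueryable<int>;
    }

    // After AsEnumerable(), Where binds to Enumerable.Where and runs in memory.
    public static bool WhereStaysQueryableAfterAsEnumerable(IQueryable<int> source)
    {
        object result = source.AsEnumerable().Where(x => x > 1);
        return result is IQueryable<int>;
    }

    public static void Main()
    {
        var source = new[] { 1, 2, 3 }.AsQueryable();
        Console.WriteLine(WhereStaysQueryable(source));                  // True
        Console.WriteLine(WhereStaysQueryableAfterAsEnumerable(source)); // False
    }
}
```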
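Entity Framework supports only a subset of the LINQ operators, so not everything available in LINQ to Objects can be used in a query.
The following approach should give you your list of MyEntities in the same order as supplied by myListOfIds. Because myListOfIds is a local list, this is an in-memory Enumerable.Join that enumerates all of db.MyEntities, so it is best suited to small tables:
var entities = myListOfIds
    .Join(db.MyEntities, m => m, e => e.ID, (m, e) => e)
    .ToList();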
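Another option, sketched here under the assumption that the matching rows have already been fetched with the Contains filter, is to index them by ID and then walk the ID list. MyEntity and InRequestedOrder are illustrative names. This avoids calling IndexOf once per entity, and it silently skips IDs that have no matching row.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity with the ID property used in the question.
public class MyEntity
{
    public int ID { get; set; }
    public string Name { get; set; }
}

public static class OrderDemo
{
    // Re-orders already-fetched entities to follow the order of the requested IDs.
    public static List<MyEntity> InRequestedOrder(IEnumerable<MyEntity> fetched, IEnumerable<int> ids)
    {
        var byId = fetched.ToDictionary(e => e.ID);
        var ordered = new List<MyEntity>();
        foreach (var id in ids)
        {
            if (byId.TryGetValue(id, out var entity))
                ordered.Add(entity);
        }
        return ordered;
    }

    public static void Main()
    {
        // Stand-in for db.MyEntities.Where(e => ids.Contains(e.ID)).ToList()
        var fetched = new List<MyEntity>
        {
            new MyEntity { ID = 1, Name = "one" },
            new MyEntity { ID = 2, Name = "two" },
            new MyEntity { ID = 3, Name = "three" }
        };

        foreach (var e in InRequestedOrder(fetched, new[] { 3, 1, 2 }))
            Console.WriteLine(e.ID + " " + e.Name);
    }
}
```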

LINQ to Entities multiple columns need 1 to be distinct

I am trying to select multiple columns from an entity object but I want 1 property to be distinct. I am very new to both LINQ and Entity Framework so any help will be useful.
Here is my LINQ query so far:
var listTypes = (from s in context.LIST_OF_VALUES
                 orderby s.SORT_INDEX
                 select new { s.LIST_TYPE, s.DISPLAY_TEXT });
I want s.LIST_TYPE to be distinct. I figure using the groupby keyword is what I want (maybe?) but I have not found a way to use it that works.
Thank you.
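Assuming DISPLAY_TEXT matches LIST_TYPE somehow (so you don't lose any information):
var distinct = context.LIST_OF_VALUES
    .OrderBy(s => s.SORT_INDEX)
    .GroupBy(s => s.LIST_TYPE)
    // FirstOrDefault() rather than First(): classic LINQ to Entities only accepts First() as the final operator of a query.
    .Select(g => new { g.Key, g.FirstOrDefault().DISPLAY_TEXT });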
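For comparison, the same "first row per LIST_TYPE" idea as a LINQ to Objects sketch over rows that are already in memory. ListValue is a hypothetical stand-in for the entity, and the sample values are invented. In memory, GroupBy keeps elements in their incoming order, so ordering by SORT_INDEX first decides which row each group yields.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the LIST_OF_VALUES entity.
public class ListValue
{
    public string LIST_TYPE { get; set; }
    public string DISPLAY_TEXT { get; set; }
    public int SORT_INDEX { get; set; }
}

public static class DistinctDemo
{
    // Keeps one row per LIST_TYPE: the one with the lowest SORT_INDEX.
    public static List<ListValue> FirstPerType(IEnumerable<ListValue> values)
    {
        return values
            .OrderBy(v => v.SORT_INDEX)
            .GroupBy(v => v.LIST_TYPE)
            .Select(g => g.First())
            .ToList();
    }

    public static void Main()
    {
        var values = new List<ListValue>
        {
            new ListValue { LIST_TYPE = "COLOR", DISPLAY_TEXT = "Red", SORT_INDEX = 2 },
            new ListValue { LIST_TYPE = "SIZE", DISPLAY_TEXT = "Large", SORT_INDEX = 3 },
            new ListValue { LIST_TYPE = "COLOR", DISPLAY_TEXT = "Blue", SORT_INDEX = 1 }
        };

        foreach (var v in FirstPerType(values))
            Console.WriteLine(v.LIST_TYPE + " " + v.DISPLAY_TEXT);
    }
}
```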

ef and linq extension method

I have this SQL that I want to write with LINQ extension methods, returning an entity from my EDM:
SELECT p.[Id],p.[Firstname],p.[Lastname],prt.[AddressId],prt.[Street],prt.[City]
FROM [Person] p
CROSS APPLY (
SELECT TOP(1) pa.[AddressId],a.[ValidFrom],a.[Street],a.[City]
FROM [Person_Addresses] pa
LEFT OUTER JOIN [Addresses] AS a
ON a.[Id] = pa.[AddressId]
WHERE p.[Id] = pa.[PersonId]
ORDER BY a.[ValidFrom] DESC ) prt
Also, could this be rewritten with LINQ extension methods using 3 joins?
Assuming you have set the Person_Addresses table up as a pure relation table (i.e., with no data besides the foreign keys) this should do the trick:
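var persons = model.People
    // FirstOrDefault() rather than First(): classic LINQ to Entities only accepts First() as the final operator of a query.
    .Select(p => new { p = p, a = p.Addresses.OrderByDescending(a => a.ValidFrom).FirstOrDefault() })
    .Select(p => new { p.p.Id, p.p.Firstname, p.p.Lastname, AddressId = p.a.Id, p.a.Street, p.a.City });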
The first Select() orders the addresses and picks the latest one, and the second one returns an anonymous type with the properties specified in your query.
If you have more data in your relation table you will have to use joins, but otherwise this approach frees you from them. In my opinion, it is easier to read.
NOTE: You might get an exception if any entry in Persons has no addresses connected to it, although I haven't tried it out.
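The "no addresses" case from the note can be sketched in memory as follows. Person and Address are hypothetical classes loosely modelled on the SQL in the question, and the city names are sample values. This is LINQ to Objects only, so it says nothing definite about how a given provider translates the same shape.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical classes loosely modelled on the tables in the question.
public class Address
{
    public int Id { get; set; }
    public DateTime ValidFrom { get; set; }
    public string City { get; set; }
}

public class Person
{
    public int Id { get; set; }
    public List<Address> Addresses { get; set; } = new List<Address>();
}

public static class LatestAddressDemo
{
    // Returns the city of the most recent address, or null when there is none.
    public static string LatestCity(Person person)
    {
        var latest = person.Addresses
            .OrderByDescending(a => a.ValidFrom)
            .FirstOrDefault();
        return latest == null ? null : latest.City;
    }

    public static void Main()
    {
        var withAddresses = new Person { Id = 1 };
        withAddresses.Addresses.Add(new Address { Id = 10, ValidFrom = new DateTime(2010, 1, 1), City = "Oslo" });
        withAddresses.Addresses.Add(new Address { Id = 11, ValidFrom = new DateTime(2012, 1, 1), City = "Bergen" });
        var withoutAddresses = new Person { Id = 2 };

        Console.WriteLine(LatestCity(withAddresses));            // Bergen
        Console.WriteLine(LatestCity(withoutAddresses) == null); // True
    }
}
```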

Does Enumerable.ToDictionary only retrieve what it needs?

I'm using Enumerable.ToDictionary to create a Dictionary off of a linq call:
return (from term in dataContext.Terms
        where term.Name.StartsWith(text)
        select term).ToDictionary(t => t.TermID, t => t.Name);
Will that call fetch the entirety of each term, or will it only retrieve the TermID and the Name fields from my data provider? In other words, would I be saving myself database traffic if I instead wrote it like this:
return (from term in dataContext.Terms
        where term.Name.StartsWith(text)
        select new { term.TermID, term.Name }).ToDictionary(t => t.TermID, t => t.Name);
Enumerable.ToDictionary works on IEnumerable objects. The first part of your statement, "(from ... select term)", is an IQueryable object. Queryable looks at the expression and builds the SQL statement; the result is then enumerated as an IEnumerable and passed to ToDictionary().
In other words, yes, your second version would be more efficient.
The SQL generated for the first version returns the entire term, whereas your second statement brings down just what you need.
You can set dataContext.Log = Console.Out and look at the different results of the query.
Using my sample LINQPad database, here's an example:
var dc = (TypedDataContext)this;
// 1st approach
var query = Orders.Select(o => o);
dc.GetCommand(query).CommandText.Dump();
query.ToDictionary(o => o.OrderID, o => o.OrderDate).Dump();
// 2nd approach
var query2 = Orders.Select(o => new { o.OrderID, o.OrderDate});
dc.GetCommand(query2).CommandText.Dump();
query2.ToDictionary(o => o.OrderID, o => o.OrderDate).Dump();
The generated SQL is (or just peek at LINQPad's SQL tab):
-- 1st approach
SELECT [t0].[OrderID], [t0].[OrderDate], [t0].[ShipCountry]
FROM [Orders] AS [t0]
-- 2nd approach
SELECT [t0].[OrderID], [t0].[OrderDate]
FROM [Orders] AS [t0]
No. ToDictionary is an extension method for IEnumerable<T> not IQueryable<T>. It doesn't take an Expression<Func<T, TKey>> but simply a Func<T, TKey> that it'll blindly call for each item. It doesn't care (and doesn't know) about LINQ and the underlying expression trees and stuff like that. It just iterates the sequence and builds up a dictionary. As a consequence, in your first query, all columns are fetched.
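The point about where ToDictionary lives can be checked with a few lines of reflection. The Declares helper is made up for this sketch. Enumerable declares ToDictionary, while Queryable has no counterpart, so the key and value selectors never reach the query provider.

```csharp
using System;
using System.Linq;

public static class ToDictionaryDemo
{
    // True when the given type declares a public method with that name.
    public static bool Declares(Type type, string methodName)
    {
        return type.GetMethods().Any(m => m.Name == methodName);
    }

    public static void Main()
    {
        Console.WriteLine(Declares(typeof(Enumerable), "ToDictionary")); // True
        Console.WriteLine(Declares(typeof(Queryable), "ToDictionary"));  // False
        Console.WriteLine(Declares(typeof(Queryable), "Select"));        // True
    }
}
```

Because Select does exist on Queryable, a projection placed before ToDictionary is part of the expression tree and can narrow the generated column list.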
