Clean way to write this query - linq

I'm looking for a clean way to write this LINQ query.
Basically, I have a collection of objects with IDs. Using NHibernate and LINQ, I need to check whether the NHibernate entity has a subclass collection that contains every ID in my object collection.
If there was just one item this would work:
var objectImCheckingAgainst = ... //irrelevant
where Obj.SubObj.Any(a => a.id == objectImCheckingAgainst.Id)
Now I want to pass a list of objectImCheckingAgainst instead, and return true only if the Obj.SubObj collection contains every item in that list, matched by Id.

I like to use GroupJoin for this.
return objectImCheckingAgainst.GroupJoin(Obj.SubObj,
        a => a.Id,
        b => b.id,
        (a, b) => b.Any())
    .All(c => c);
I believe this query should be more or less self-explanatory, but essentially, this joins the two collections using their respective ids as keys, then groups those results. Then for each of those groupings, it determines whether any matches exist. Finally, it ensures that all groupings had matches.
A useful alternative that I sometimes use is .Count() == 1 instead of the .Any(). Obviously, the difference there is whether you want to support multiple elements with the same id matching. From your description, it sounded like that either doesn't matter or is enforced by another means. But that's an easy swap, either way.
An important concept in GroupJoin that I know is relevant, but may or may not be obvious, is that the first enumerable (which is to say, the first argument to the extension method, or objectImCheckingAgainst in this example) will have all its elements included in the result, but the second one may or may not. It's not like Join, where the ordering is irrelevant. If you're used to SQL, these are the elementary beginnings of a LEFT OUTER JOIN.
Another way you could accomplish this, somewhat more simply but not as efficiently, would be to simply nest the queries:
return objectImCheckingAgainst.All(c => Obj.SubObj.Any(x => x.id == c.Id));
I say this because it's pretty similar to the example you provided.
I don't have any experience with NHibernate, but I know many ORMs (I believe EF included) will translate this to SQL, so efficiency may or may not be a concern. In general, though, I like to write LINQ so that it performs as well in memory as it does against a database, so I'd go with the first one I mentioned.
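To make the two variants above concrete, here is a minimal in-memory sketch. The tuple lists are hypothetical stand-ins for Obj.SubObj and the list of objectImCheckingAgainst, not NHibernate entities, and nothing here says how NHibernate would translate either query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins (C# 9 top-level statements), not NHibernate entities.
var subObj = new List<(int id, string name)> { (1, "a"), (2, "b"), (3, "c") }; // plays Obj.SubObj
var allPresent = new List<(int Id, string Label)> { (1, "x"), (3, "y") };
var oneMissing = new List<(int Id, string Label)> { (1, "x"), (4, "z") };

// GroupJoin keeps every element of the outer (first) sequence, paired with its
// possibly empty group of matches, so a missing id shows up as an empty group.
bool ContainsAllViaGroupJoin(IEnumerable<(int Id, string Label)> wanted) =>
    wanted.GroupJoin(subObj, a => a.Id, b => b.id, (a, matches) => matches.Any())
          .All(hasMatch => hasMatch);

// Nested form: rescans subObj once per wanted item.
bool ContainsAllViaNesting(IEnumerable<(int Id, string Label)> wanted) =>
    wanted.All(c => subObj.Any(x => x.id == c.Id));

Console.WriteLine(ContainsAllViaGroupJoin(allPresent)); // True
Console.WriteLine(ContainsAllViaGroupJoin(oneMissing)); // False (id 4 has no match)
Console.WriteLine(ContainsAllViaNesting(allPresent));   // True
Console.WriteLine(ContainsAllViaNesting(oneMissing));   // False
```

Both forms return true for an empty list of wanted items, since All is vacuously true on an empty sequence.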

I'm not well versed in LINQ-to-NHibernate, but when using LINQ against any SQL backend it's always important to keep an eye on the generated SQL. I think this where clause...
where Obj.SubObj.All(a => idList.Contains(a.id))
...will produce the best SQL (with an IN clause).
idList is a list of Ids extracted from the list of objectImCheckingAgainst objects.
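One thing worth checking before relying on that clause: as written, All with Contains seems to test that every sub-object's id is in idList, which is the reverse of what the question asks (every id in idList present among the sub-objects). A small in-memory sketch of the difference, using hypothetical integer ids; the count-based variant is only one possible way to keep a Contains in the predicate, and whether NHibernate translates it well is untested here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical ids (C# 9 top-level statements); subIds plays the ids of Obj.SubObj.
var idList = new List<int> { 1, 2 };
var subIds = new List<int> { 1, 2, 3 };

// Direction used by the clause above: is every sub-object id listed in idList?
bool everySubIdIsListed = subIds.All(id => idList.Contains(id));

// Direction the question asks for: is every listed id present among the sub-objects?
bool everyListedIdIsPresent = idList.All(id => subIds.Contains(id));

// Count-based variant that still filters with Contains: count the distinct
// matching sub-object ids and compare with the number of distinct wanted ids.
bool countBased =
    subIds.Where(id => idList.Contains(id)).Distinct().Count() == idList.Distinct().Count();

Console.WriteLine(everySubIdIsListed);     // False (3 is not in idList)
Console.WriteLine(everyListedIdIsPresent); // True
Console.WriteLine(countBased);             // True
```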

Related

Query syntax for getting a list as a list of different types?

I have a query that fetches books. I'm new to LINQ, so I don't know the syntax:
var books = (from book in db.Books
             join borrow in db.Borrows on book equals borrow.Book
             select new BookDTO
             {
                 Title = book.Title,
                 Borrows = book.Borrows.ToList() // <- use DTOs instead
             }).ToList();
How can I select Book.Borrows as a list of objects (BorrowDTOs)? Is there something like Borrows = new List<BorrowDTO>(book.Borrows)?
You can use .Select() to project the list into a different type. So instead of this:
Borrows = book.Borrows.ToList()
you would have something like this:
Borrows = book.Borrows.Select(b => new BorrowDTO { /* properties here */ }).ToList()
Note that, depending on your data source, there may be more efficient ways to approach selecting your data. If you're pulling directly from LINQ To Entities then you may run into problems trying to materialize a type within the query that isn't known to the DB, or any other operation that can't be translated into SQL. It's also not necessarily wise to toss in a bunch of .ToList() operations without a specific purpose.
But that's all theoretical at this point in the question. Based on the code shown and on LINQ syntax itself, you can select from a list just fine. (I'd even recommend using the extension method syntax more than the query syntax that you currently use. Personal preference of course, but I find it easier and more intuitive to build nested operations like this. Though you can just as well use the from ... select ... syntax after Borrows =, I would imagine.)
Just select book.Borrows instead of creating a new temporary object.
That query is going to return an IEnumerable of the Borrows type, and you'll be able to iterate through it and convert it into a List if you please.

Compound "from" clause in Linq query in Entity Framework

I've been working with Entity Framework for a few weeks now. I have been working with LINQ to Objects and LINQ to SQL for years. A lot of the time, I like to write LINQ statements like this:
from student in db.Students
from score in student.Scores
where score > 90
select student;
With other forms of linq, this returns distinct students who have at least one score greater than 90. However, in EF this query returns one student for every score greater than 90.
Does anyone know if this behavior can be replicated in unit tests? Is it possible this is a bug in EF?
I don't like that SQL-like syntax (I have no better name for it), especially when you start nesting them.
var students = db.Students
    .Where(student => student.Scores.Any(score => score > 90))
    .ToList();
This snippet, using the method syntax, does the same thing. I find it far more readable. It's more explicit in the order of operations used.
And as far as I have experienced, EF hasn't yet shown a bug with its selection using method syntax.
Edit
To actually answer your problem:
However, in EF this query returns one student for every score greater than 90.
I think this is due to a JOIN statement used in the final SQL that will be run. This is why I avoid SQL-like syntax, because it becomes very hard to differentiate between what you want to retrieve (students) and what you want to filter with (scores).
Much like you would do in SQL, you are joining the data from students and scores, and then running a filtering operation on that collection. It then becomes harder to separate that result back into a collection of students. I think this is the main cause of your issue. It's not a bug per se, but I think EF can only handle it one way.
Alternative solutions to the above:
If it returns one student per score over 90, take the distinct students returned. It should be the same result set.
Use more explicit parentheses () and formatting to nest separate SQL-like statements.
Note: I'm not saying it can't be done with SQL-like syntax. I am well aware that most of this answer is opinion based.
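For comparison, here is a small LINQ to Objects sketch of the three shapes discussed above, with made-up students and scores. In this in-memory run, at least, the compound from behaves like a SelectMany and also yields one element per matching score, so the Any and Distinct forms are the ones that return each student once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory data (C# 9 top-level statements); no EF involved.
var students = new List<(string Name, int[] Scores)>
{
    ("Ann", new[] { 95, 98, 70 }),
    ("Bob", new[] { 60, 85 }),
    ("Cy",  new[] { 91 }),
};

// Compound from: one result per (student, score) pair that passes the filter.
var compound = (from student in students
                from score in student.Scores
                where score > 90
                select student.Name).ToList();

// Filter students directly: each qualifying student appears once.
var viaAny = students
    .Where(student => student.Scores.Any(score => score > 90))
    .Select(student => student.Name)
    .ToList();

// Or de-duplicate the compound result afterwards.
var viaDistinct = compound.Distinct().ToList();

Console.WriteLine(string.Join(", ", compound));    // Ann, Ann, Cy
Console.WriteLine(string.Join(", ", viaAny));      // Ann, Cy
Console.WriteLine(string.Join(", ", viaDistinct)); // Ann, Cy
```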

Concatenating a LINQ query and LINQ sort statement

I'm having a problem joining two LINQ queries.
Currently, my (original) code looks like this
s.AnimalTypes.Sort((x, y) => string.Compare(x.Type, y.Type));
What I'm needing to do is change this based on a date, then select all data past that date, so I have
s.AnimalTypes.Select(t=>t.DateChanged > dateIn).ToList()
s.AnimalTypes.Sort((…
This doesn't look right as it's not sorting the data selected, rather sorting everything in s.AnimalTypes.
Is there a way to concatenate the two LINQ lines? I've tried
s.AnimalTypes.Select(t=>t.DateChanged > dateIn).ToList().Sort((…
but this gives me an error on the Sort section.
Is there a simple way to do this? I've looked around and Group and OrderBy keep cropping up, but I'm not sure these are what I need here.
Thanks
From your description, I believe you want something like:
var results = s.AnimalTypes.Where(t => t.DateChanged > dateIn).OrderBy(t => t.Type);
You can call ToList() to convert to a List<T> at the end if required.
There are a couple of fundamental concepts I believe you are missing here:
First, unlike List<T>.Sort, the LINQ extension methods don't change the original collections, but rather return a new IEnumerable<T> with the filtered or sorted results. This means you always need to assign something to the return value (hence my var results = above).
Second, Select performs a mapping operation - transforming the data from one form to another. For example, you could use it to extract out the DateChanged (Select(t => t.DateChanged)), but this would give you an enumeration of dates, not the original animal types. In order to filter or restrict the list with a predicate (criteria), you'd use Where instead.
Finally, you can use OrderBy to reorder the resulting enumerable.
You are using Select when you actually want to use Where.
Select is a projection from a collection of one type into a collection of another type. You won't increase or reduce the number of items in a collection using Select, but you can use it to pick out each object's name or some other property.
Where is what you would use to filter a collection based on a boolean predicate.
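A short in-memory sketch of the difference, using made-up animal types and dates in place of s.AnimalTypes:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data (C# 9 top-level statements) standing in for s.AnimalTypes.
var animalTypes = new List<(string Type, DateTime DateChanged)>
{
    ("Zebra", new DateTime(2013, 5, 1)),
    ("Cat",   new DateTime(2012, 1, 1)),
    ("Ant",   new DateTime(2013, 6, 1)),
};
var dateIn = new DateTime(2013, 1, 1);

// Select maps every item to a bool: same number of items, different type.
var flags = animalTypes.Select(t => t.DateChanged > dateIn).ToList();

// Where filters and OrderBy sorts; both return new sequences and leave
// animalTypes itself untouched, so the result has to be assigned.
var results = animalTypes
    .Where(t => t.DateChanged > dateIn)
    .OrderBy(t => t.Type)
    .ToList();

Console.WriteLine(string.Join(", ", flags));                        // True, False, True
Console.WriteLine(string.Join(", ", results.Select(r => r.Type)));  // Ant, Zebra
Console.WriteLine(animalTypes[0].Type);                             // Zebra (original order kept)
```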

complex EF graph linq join

So, that's my model .. nice and complex.
I'm looking to get Areas by a UserID. If I were doing this in SQL, I would do a bunch of joins all the way up to the Users table. How would you do this in LINQ query syntax or method chaining?
I can do this pretty straightforward from the other way around, but then I have to do a lot of extra work to flatten the resulting graph and it also requires pulling back all the entities in between.
If I can optionally include AreasPermissions & Permissions, that would be gravy .. but at this point I wouldn't mind an additional query to fetch those.
Another option I was floating was using a function import to a sproc and map that to an Area .. but when I start needing to include other entities it makes that option less elegant. I'm also trying to avoid using sprocs just to use sprocs because it's always a slippery slope with folks .. 'use a sproc for one thing' tends to mold into 'don't use EF (table access) for anything'.
var userByID = new Func<User, bool>(x => x.UserId.Equals(userID, StringComparison.InvariantCultureIgnoreCase));
var user = this._context.Users
    .Include("TeamsUsers.Team.TeamsRoles.Role.RolesAreasPermissions.AreaPermission.Area")
    .Include("TeamsUsers.Team.TeamsRoles.Role.RolesAreasPermissions.AreaPermission.Permission")
    .Single(userByID);
I have no way of testing this, but I think it should work:
var result =
    from user in _context.Users
    where user.Id == userId
    from teamUser in user.TeamUsers
    from teamRole in teamUser.Team.TeamRoles
    from roleAreaPermission in teamRole.Role.RoleAreaPermissions
    select roleAreaPermission.AreaPermission.Area;
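Since the question also asks about method chaining: each additional from clause corresponds to a SelectMany call. Below is an in-memory sketch of that shape using anonymous objects with the same navigation names as the query above. The data, the string ids, and the area names are all made up, and the trailing Distinct is an addition of mine (the same area can be reached through several roles). Whether EF translates the chained form the same way is not something I have tested.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical object graph built from anonymous types (C# 9 top-level statements).
var billing = new { AreaPermission = new { Area = "Billing" } };
var reports = new { AreaPermission = new { Area = "Reports" } };
var admin   = new { Role = new { RoleAreaPermissions = new[] { billing, reports } } };
var viewer  = new { Role = new { RoleAreaPermissions = new[] { reports } } };
var teamA   = new { Team = new { TeamRoles = new[] { admin } } };
var teamB   = new { Team = new { TeamRoles = new[] { viewer } } };
var users = new[]
{
    new { Id = "u1", TeamUsers = new[] { teamA, teamB } },
    new { Id = "u2", TeamUsers = new[] { teamB } },
};

// Each SelectMany flattens one level of the graph, like one extra "from" clause.
var areas = users
    .Where(user => user.Id == "u1")
    .SelectMany(user => user.TeamUsers)
    .SelectMany(teamUser => teamUser.Team.TeamRoles)
    .SelectMany(teamRole => teamRole.Role.RoleAreaPermissions)
    .Select(roleAreaPermission => roleAreaPermission.AreaPermission.Area)
    .Distinct() // "Reports" is reachable through both teams
    .ToList();

Console.WriteLine(string.Join(", ", areas)); // Billing, Reports
```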

Linq - Is operations order relevant?

I have some slow LINQ queries and need to optimize them. I have read about compiled queries and about setting the merge option to NoTracking in my read-only operations.
But I think my problem is that I have too many Includes so the number of joins done in the DB is huge.
context.ExampleEntity
    .Include("A")
    .Include("B")
    .Include("D.E.F")
    .Include("G.H")
    .Include("I.J")
    .Include("K.M")
    .Include("K.N")
    .Include("O.P")
    .Include("Q.R")
    .Where(a => condition1 || complexCondition2)
My doubt is: if I put the Where before the Includes, would this filter the ExampleEntity objects before making all the joins? I'm not sure how LINQ queries are translated to SQL.
"Yes".
Each sub-query passes its results to the next. Moving the Where first will filter, then perform the includes against a potentially smaller set.
Whether that makes sense in the context of your specific query is up to you to decide.
