Entity framework linq query Include() multiple children entities - linq

This may be a really elementary question, but what's a nice way to include multiple child entities when writing a query that spans THREE levels (or more)?
i.e. I have 4 tables: Company, Employee, Employee_Car and Employee_Country
Company has a 1:m relationship with Employee.
Employee has a 1:m relationship with both Employee_Car and Employee_Country.
If I want to write a query that returns the data from all 4 tables, I am currently writing:
Company company = context.Companies
.Include("Employee.Employee_Car")
.Include("Employee.Employee_Country")
.FirstOrDefault(c => c.Id == companyID);
There has to be a more elegant way! This is long-winded and generates horrendous SQL.
I am using EF4 with VS 2010

Use extension methods.
Replace NameOfContext with the name of your object context.
public static class Extensions{
public static IQueryable<Company> CompleteCompanies(this NameOfContext context){
return context.Companies
.Include("Employee.Employee_Car")
.Include("Employee.Employee_Country") ;
}
public static Company CompanyById(this NameOfContext context, int companyID){
return context.Companies
.Include("Employee.Employee_Car")
.Include("Employee.Employee_Country")
.FirstOrDefault(c => c.Id == companyID) ;
}
}
Then your code becomes
Company company =
context.CompleteCompanies().FirstOrDefault(c => c.Id == companyID);
//or if you want even more
Company company =
context.CompanyById(companyID);

EF Core
For eager loading relationships more than one navigation away (e.g. grand child or grand parent relations), where the intermediate relation is a collection (i.e. 1 to many with the original 'subject'), EF Core has a new extension method, .ThenInclude(), and the syntax is slightly different to the older EF 4-6 syntax:
using Microsoft.EntityFrameworkCore;
...
var company = context.Companies
.Include(co => co.Employees)
.ThenInclude(emp => emp.Employee_Car)
.Include(co => co.Employees)
.ThenInclude(emp => emp.Employee_Country);
With some notes
As per above (Employees.Employee_Car and Employees.Employee_Country), if you need to include 2 or more child properties of an intermediate child collection, you'll need to repeat the .Include navigation for the collection for each child of the collection.
Personally, I would keep the extra 'indent' in the .ThenInclude to preserve your sanity.
For eager loading of intermediaries which are 1:1 (or N:1) with the original subject, the dot syntax is also supported, e.g.
var company = context.Companies
.Include(co => co.City.Country);
This is functionally equivalent to:
var company = context.Companies
.Include(co => co.City)
.ThenInclude(ci => ci.Country);
However, in EFCore, the old EF4 / 6 syntax of using 'Select' to chain through an intermediary which is 1:N with the subject is not supported, i.e.
var company = context.Companies
.Include(co => co.Employees.Select(emp => emp.Address));
Will typically result in obscure errors like
Serialization and deserialization of 'System.IntPtr' instances are not supported
EF 4.1 to EF 6
There is a strongly typed .Include which allows the required depth of eager loading to be specified by providing Select expressions to the appropriate depth:
using System.Data.Entity; // NB!
var company = context.Companies
.Include(co => co.Employees.Select(emp => emp.Employee_Car))
.Include(co => co.Employees.Select(emp => emp.Employee_Country))
.FirstOrDefault(co => co.Id == companyID);
The SQL generated is by no means intuitive, but seems performant enough. I've put a small example on GitHub here

You might find this article of interest which is available at codeplex.com.
Improving Entity Framework Query Performance Using Graph-Based Querying.
The article presents a new way of expressing queries that span multiple tables in the form of declarative graph shapes.
Moreover, the article contains a thorough performance comparison of this new approach with EF queries. This analysis shows that GBQ quickly outperforms EF queries.

How do you construct a LINQ to Entities query to load child objects directly, instead of calling a Reference property or Load()
There is no other way - except implementing lazy loading.
Or manual loading....
myobj = context.MyObjects.First();
myobj.ChildA.Load();
myobj.ChildB.Load();
...

This might help someone: four levels, with two children on each level.
Library.Include(a => a.Library.Select(b => b.Library.Select(c => c.Library)))
.Include(d => d.Book)
.Include(g => g.Library.Select(h => h.Book))
.Include(j => j.Library.Select(k => k.Library.Select(l => l.Book)))

To do this:
namespace Application.Test
{
using Utils.Extensions;
public class Test
{
public DbSet<User> Users { get; set; }
public DbSet<Room> Rooms { get; set; }
public DbSet<Post> Posts { get; set; }
public DbSet<Comment> Comments { get; set; }
public void Foo()
{
DB.Users.Include(x => x.Posts, x => x.Rooms, x => x.Members);
//OR
DB.Users.Include(x => x.Posts, x => x.Rooms, x => x.Members)
.ThenInclude(x => x.Posts, y => y.Owner, y => y.Comments);
}
}
}
this extension might be helpful:
namespace Utils.Extensions
{
using Microsoft.EntityFrameworkCore;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
public static partial class LinqExtension
{
public static IQueryable<TEntity> Include<TEntity>(
this IQueryable<TEntity> sources,
params Expression<Func<TEntity, object>>[] properties)
where TEntity : class
{
System.Text.RegularExpressions.Regex regex = new(@"^\w+[.]");
IQueryable<TEntity> _sources = sources;
foreach (var property in properties)
_sources = _sources.Include($"{regex.Replace(property.Body.ToString(), "")}");
return _sources;
}
public static IQueryable<TEntity> ThenInclude<TEntity, TProperty>(
this IQueryable<TEntity> sources,
Expression<Func<TEntity, IEnumerable<TProperty>>> predicate,
params Expression<Func<TProperty, object>>[] properties)
where TEntity : class
{
System.Text.RegularExpressions.Regex regex = new(@"^\w+[.]");
IQueryable<TEntity> _sources = sources;
foreach (var property in properties)
_sources = _sources.Include($"{regex.Replace(predicate.Body.ToString(), "")}.{regex.Replace(property.Body.ToString(), "")}");
return _sources;
}
}
}
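The extension above relies on turning a lambda body into a string include path. The following is a minimal sketch of just that step, so it can be tried without a database. The Owner and Post classes and the ToIncludePath helper are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq.Expressions;
using System.Text.RegularExpressions;

// Hypothetical entity types, used only for this illustration.
public class Owner { public string Name { get; set; } }
public class Post { public Owner Owner { get; set; } }

public static class IncludePathDemo
{
    // Same pattern as the extension above: strips the leading "p." parameter prefix.
    private static readonly Regex ParameterPrefix = new Regex(@"^\w+[.]");

    // Hypothetical helper: converts a property lambda into a dotted include path.
    public static string ToIncludePath<T>(Expression<Func<T, object>> property)
    {
        return ParameterPrefix.Replace(property.Body.ToString(), "");
    }

    public static void Main()
    {
        Console.WriteLine(ToIncludePath<Post>(p => p.Owner));      // Owner
        Console.WriteLine(ToIncludePath<Post>(p => p.Owner.Name)); // Owner.Name
    }
}
```

This only works cleanly for reference-typed navigation properties. A value-typed property is typically wrapped in a Convert node when converted to object, so its string form would not be a usable path.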

Related

Performance issue in IEnumerable type when querying large amount of data with LINQ

I'm using LINQ to execute a query on a List type variable with a large amount of data (over a million). For performance purposes I'm using IEnumerable to store the results but when I try to access it there is a slight delay.
Specifically I want to see if the query produced any results, but when I use the .Count() or .Any() functions the performance drops.
I read that for IEnumerable types the execution of the query happens at the time of need, hence the delay. Is there a way to see if the IEnumerable has elements inside it without having that much delay?
This is what I'm trying to run.
IEnumerable<Entity> matchingEntities = entities.Where(e => e.Names.Any(n => myEntity.Names.Any(entityName => entityName.CompareNameObjects(n))));
and here are my classes
public class Entity
{
public string EntityIdentifier { get; set; }
public List<Name> Names { get; set; }
}
public class Name
{
public string FullName { get; set; }
public string NameType { get; set; }
public bool CompareNameObjects(Name name2)
{
return FullName == name2.FullName &&
NameType == name2.NameType;
}
}
entities is a list of all my objects and I want to check if myEntity has any Names identical with another entity in the set.
EDITED:
The data structure is similar to the 2 classes (Entity and Name). The entities are created by selecting all the entities, along with their names, from the database in XML format and then I convert the XML to a List as such:
List<Entity> entities = new List<Entity>();
using (SqlConnection conn = new SqlConnection(ConfigurationManager.ConnectionStrings["myCS"].ConnectionString))
{
conn.Open();
SqlCommand cmd = new SqlCommand("GetAllEntities", conn);
cmd.CommandType = CommandType.StoredProcedure;
string entitiesXml = "";
using (SqlDataReader rdr = cmd.ExecuteReader())
{
while (rdr.Read())
{
entitiesXml += rdr["XmlString"].ToString();
}
}
using (TextReader reader = new StringReader(entitiesXml))
entities = (List<Entity>)xmlSerializer.Deserialize(reader);
conn.Close();
}
GetAllEntities (Stored Procedure):
declare @xmlString nvarchar(max) =(
select e.EntityIdentifier,
(
select n.[Full Name] as 'FullName',
n.[Name Type] as 'NameType'
from tblNames n
where e.EntityID=n.[Entity_ID]
for xml path('Name'), type
)
from tblEntities e
order by e.EntityID
for xml path('Entity')
)
select @xmlString as XmlString
Basically, you should avoid loading all the data from your database and then filtering it in C# code; that is a lot of wasted work.
However, as a quick fix, you can improve performance by first preparing your search conditions as a Dictionary.
// Let's say you have myEntity here
var myEntity = new Entity();
var entities = new List<Entity>();
// Prepare the names you want to find once, up front, so the lookup is not rebuilt on every iteration.
// Distinct() avoids a duplicate-key exception if myEntity has the same name twice.
var names = myEntity.Names.Select(p=> p.FullName + p.NameType ).Distinct().ToDictionary(p=>p, p=>p);
IEnumerable<Entity> matchingEntities = entities.Where(e => e.Names.Any(n => names.ContainsKey(n.FullName + n.NameType)));
This is just an example to give you the idea; it can be improved much further. I hope it helps.
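One possible refinement, sketched under assumptions: a HashSet of value tuples avoids the string concatenation (which could in principle treat different FullName/NameType pairs as equal) and tolerates duplicates. The FindMatches helper and the sample data are hypothetical; the Entity and Name classes mirror the ones in the question. Tuple syntax requires C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Name
{
    public string FullName { get; set; }
    public string NameType { get; set; }
}

public class Entity
{
    public string EntityIdentifier { get; set; }
    public List<Name> Names { get; set; } = new List<Name>();
}

public static class NameMatchDemo
{
    // Hypothetical helper: returns the entities sharing at least one name with myEntity.
    public static List<Entity> FindMatches(Entity myEntity, IEnumerable<Entity> entities)
    {
        // Build the lookup once; tuples compare by value, so no concatenated key is needed.
        var wanted = new HashSet<(string, string)>(
            myEntity.Names.Select(n => (n.FullName, n.NameType)));

        // ToList() runs the query a single time, so checking Count afterwards is cheap.
        return entities
            .Where(e => e.Names.Any(n => wanted.Contains((n.FullName, n.NameType))))
            .ToList();
    }

    public static void Main()
    {
        var myEntity = new Entity
        {
            Names = { new Name { FullName = "Ann Lee", NameType = "Legal" } }
        };
        var entities = new List<Entity>
        {
            new Entity { EntityIdentifier = "E1", Names = { new Name { FullName = "Ann Lee", NameType = "Legal" } } },
            new Entity { EntityIdentifier = "E2", Names = { new Name { FullName = "Ann Lee", NameType = "Alias" } } },
            new Entity { EntityIdentifier = "E3", Names = { new Name { FullName = "Bob Ray", NameType = "Legal" } } }
        };

        List<Entity> matches = FindMatches(myEntity, entities);
        Console.WriteLine(matches.Count); // 1 (only E1 shares both FullName and NameType)
    }
}
```

Materializing with ToList() addresses the deferred-execution delay described in the question: the filter runs once, and later checks such as Count or Any read the stored list instead of re-running the query.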

Include_In_Parent option for ElasticSearch and NEST library

I am using ElasticSearch and the NEST .Net library for implementing the Search functionality needed in our app. In my model, I have a type that contains Nested objects as per below.
[ElasticType(Name = "x")]
public class X
{
[ElasticProperty(IncludeInAll = false, Index = FieldIndexOption.NotAnalyzed)]
public string Id { get; set; }
[ElasticProperty(Type = FieldType.Nested)]
public List<Y> Ys { get; set; }
}
Any queries executed against X are actually executed against the List of Ys. I would like to highlight the hits in the nested objects, based on the workaround in https://github.com/elasticsearch/elasticsearch/issues/5245.
However, in order to use the proposed workaround, the include_in_parent option should be true for the nested object.
How can this option be enabled using the NEST library? Is there any ElasticProperty property (I haven’t found any obvious one) or some other way to do so?
Thank you
Apparently this can be done only by using fluent syntax. For the above case the code would be:
.AddMapping<X>(m => m
.Properties(p => p
.NestedObject<Y>(n => n
.Name("ys")
.IncludeInParent())))

count based on lookup in LINQ

I have a table (or entity) named Cases. There is another table CaseStatus_Lookup and the primary key of this table is a foreign key in the Cases table.
What I want to do is: for every status type, I want the count of cases. E.g. if status = in progress, I want to know how many cases are in that status.
One other thing: I also want to filter the Cases based on UserID.
I tried several ways in LINQ but could not get very far. I was wondering if someone could help.
Try LINQ's .GroupBy.
I am assuming your entity structure here.
Suppose your Case entity is like:
public class Case
{
public int Id{get;set;}
public int CaseStatusId{get;set;}
public int UserId{get;set;}
//navigational fields
public virtual CaseStatus CaseStatus {get;set;}
}
and suppose your CaseStatus entity is like:
public class CaseStatus
{
public int Id{get;set;}
public string Name{get;set;}
//navigational fields..
public virtual ICollection<Case> Cases{get;set;}
}
then you can do this:
using (myDbContext db = new myDbContext())
{
// "case" is a reserved C# keyword, so short parameter names are used instead
var query = db.Cases.GroupBy(c => c.CaseStatus.Name)
.Select(g =>
new {
Name = g.Key,
Cases = g.OrderBy(x => x.Id),
Count = g.Count()
}
).ToList();
//query will give you count of cases grouped by CaseStatus.
}
Similarly, you can filter the result further by UserId.
Start exploring LINQ's .GroupBy.
You need a function that returns the count and takes the status as a parameter, something like the below.
MyCaseStatusEnum caseStatus; //Pass your required status
int caseCount = myCases
.Where(r => r.Status == caseStatus)
.GroupBy(p => p.Status)
.Select(q => q.Count()).FirstOrDefault<int>();
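Putting the two answers together with the UserID filter from the question, a minimal in-memory sketch might look like this. The flattened Case class (with StatusName standing in for CaseStatus.Name), the CountByStatus helper, and the sample data are all assumptions made for illustration; against a real context the same operators would be applied to the DbSet instead of a List.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the Case entity; StatusName replaces the CaseStatus navigation.
public class Case
{
    public int Id { get; set; }
    public int UserId { get; set; }
    public string StatusName { get; set; }
}

public static class CaseCountDemo
{
    // Hypothetical helper: number of cases per status for a single user.
    public static Dictionary<string, int> CountByStatus(IEnumerable<Case> cases, int userId)
    {
        return cases
            .Where(c => c.UserId == userId)      // filter first, so only this user's cases are grouped
            .GroupBy(c => c.StatusName)          // one group per status
            .ToDictionary(g => g.Key, g => g.Count());
    }

    public static void Main()
    {
        var cases = new List<Case>
        {
            new Case { Id = 1, UserId = 7, StatusName = "In Progress" },
            new Case { Id = 2, UserId = 7, StatusName = "In Progress" },
            new Case { Id = 3, UserId = 7, StatusName = "Closed" },
            new Case { Id = 4, UserId = 9, StatusName = "In Progress" }
        };

        Dictionary<string, int> counts = CountByStatus(cases, 7);
        Console.WriteLine(counts["In Progress"]); // 2
        Console.WriteLine(counts["Closed"]);      // 1
    }
}
```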

Linq Parsing Error when trying to create separation of concerns

I am in the middle of a refactoring cycle where I converted some extension methods that used to look like this:
public static IQueryable<Family> FilterOnRoute(this IQueryable<Family> families, WicRoute route)
{
return families.Where(fam => fam.PODs
.Any(pod => pod.Route.RouteID == route.RouteID));
}
to a more fluent implementation like this:
public class SimplifiedFamilyLinqBuilder
{
private IQueryable<Family> _families;
public SimplifiedFamilyLinqBuilder Load(IQueryable<Family> families)
{
_families = families;
return this;
}
public SimplifiedFamilyLinqBuilder OnRoute(WicRoute route)
{
_families = _families.Where(fam => fam.PODs
.Any(pod => pod.Route.RouteID == route.RouteID));
return this;
}
public IQueryable<Family> AsQueryable()
{
return _families;
}
}
which I can call like this: (note this is using Linq-to-Nhibernate)
var families =
new SimplifiedFamilyLinqBuilder()
.Load(session.Query<Family>())
.OnRoute(new WicRoute() {RouteID = 1})
.AsQueryable()
.ToList();
this produces the following SQL which is fine with me at the moment: (of note is that the above Linq is being translated to a SQL Query)
select ... from "Family" family0_
where exists (select pods1_.PODID from "POD" pods1_
inner join Route wicroute2_ on pods1_.RouteID=wicroute2_.RouteID
where family0_.FamilyID=pods1_.FamilyID
and wicroute2_.RouteID=@p0);
@p0 = 1
my next effort in refactoring is to move the query part that deals with the child to another class like this:
public class SimplifiedPODLinqBuilder
{
private IQueryable<POD> _pods;
public SimplifiedPODLinqBuilder Load(IQueryable<POD> pods)
{
_pods = pods;
return this;
}
public SimplifiedPODLinqBuilder OnRoute(WicRoute route)
{
_pods = _pods.Where(pod => pod.Route.RouteID == route.RouteID);
return this;
}
public IQueryable<POD> AsQueryable()
{
return _pods;
}
}
with SimplifiedFamilyLinqBuilder changing to this:
public SimplifiedFamilyLinqBuilder OnRoute(WicRoute route)
{
_families = _families.Where(fam =>
_podLinqBuilder.Load(fam.PODs.AsQueryable())
.OnRoute(route)
.AsQueryable()
.Any()
);
return this;
}
only I now get this error:
Remotion.Linq.Parsing.ParserException : Cannot parse expression 'value(Wic.DataTests.LinqBuilders.SimplifiedPODLinqBuilder)' as it has an unsupported type. Only query sources (that is, expressions that implement IEnumerable) and query operators can be parsed.
I started to implement IQueryable on SimplifiedPODLinqBuilder (as that seemed more logical than implementing IEnumerable) and thought I would be clever by doing this:
public class SimplifiedPODLinqBuilder : IQueryable
{
private IQueryable<POD> _pods;
...
public IEnumerator GetEnumerator()
{
return _pods.GetEnumerator();
}
public Expression Expression
{
get { return _pods.Expression; }
}
public Type ElementType
{
get { return _pods.ElementType; }
}
public IQueryProvider Provider
{
get { return _pods.Provider; }
}
}
only to get this exception (apparently Load is not being called and _pods is null):
System.NullReferenceException : Object reference not set to an instance of an object.
is there a way for me to refactor this code out that will parse properly into an expression that will go to SQL?
The part fam => _podLinqBuilder.Load(fam.PODs.AsQueryable() is never going to work, because the linq provider will try to parse this into SQL and for that it needs mapped members of Family after the =>, or maybe a mapped user-defined function but I don't know if Linq-to-Nhibernate supports that (I never really worked with it, because I still doubt if it is production-ready).
So, what can you do?
To be honest, I like the extension methods much better. You switched to a stateful approach, which doesn't mix well with the stateless paradigm of linq. So you may consider to retrace your steps.
Another option: the expression in .Any(pod => pod.Route.RouteID == route.RouteID) could be parameterized (.Any(podExpression)), with
OnRoute(WicRoute route, Expression<Func<POD,bool>> podExpression)
(pseudocode).
Hope this makes any sense.
You need to separate methods you intend to call from expressions you intend to translate.
This is great, you want each of those methods to run. They return an instance that implements IQueryable<Family> and operate on that instance.
var families = new SimplifiedFamilyLinqBuilder()
.Load(session.Query<Family>())
.OnRoute(new WicRoute() {RouteID = 1})
.AsQueryable()
.ToList();
This is no good. You don't want Queryable.Where to get called, you want it to be an expression tree which can be translated to SQL. But PodLinqBuilder.Load is a node in that expression tree which can't be translated to SQL!
families = _families
.Where(fam => _podLinqBuilder.Load(fam.PODs.AsQueryable())
.OnRoute(route)
.AsQueryable()
.Any());
You can't call .Load inside the Where expression (it won't translate to sql).
You can't call .Load outside the Where expression (you don't have the fam parameter).
In the name of "separation of concerns", you are mixing query construction methods with query definition expressions. LINQ, by its Integrated nature, encourages you to attempt this thing which will not work.
Consider making expression construction methods instead of query construction methods.
public static Expression<Func<POD, bool>> GetOnRouteExpr(WicRoute route)
{
// Copy the ID to a local so the expression captures a plain int rather than the route object.
int routeId = route.RouteID;
Expression<Func<POD, bool>> result = pod => pod.Route.RouteID == routeId;
return result;
}
called by:
Expression<Func<POD, bool>> onRoute = GetOnRouteExpr(route);
families = _families.Where(fam => fam.PODs.Any(onRoute));
With this approach, the question is now - how do I fluidly hang my ornaments from the expression tree?

Using eager loading with specification pattern

I've implemented the specification pattern with Linq as outlined here https://www.packtpub.com/article/nhibernate-3-using-linq-specifications-data-access-layer
I now want to add the ability to eager load and am unsure about the best way to go about it.
The generic repository class in the linked example:
public IEnumerable<T> FindAll(Specification<T> specification)
{
var query = GetQuery(specification);
return Transact(() => query.ToList());
}
public T FindOne(Specification<T> specification)
{
var query = GetQuery(specification);
return Transact(() => query.SingleOrDefault());
}
private IQueryable<T> GetQuery(
Specification<T> specification)
{
return session.Query<T>()
.Where(specification.IsSatisfiedBy());
}
And the specification implementation:
public class MoviesDirectedBy : Specification<Movie>
{
private readonly string _director;
public MoviesDirectedBy(string director)
{
_director = director;
}
public override
Expression<Func<Movie, bool>> IsSatisfiedBy()
{
return m => m.Director == _director;
}
}
This is working well. I now want to add the ability to eager load. I understand NHibernate eager loading can be done by using Fetch on the query.
What I am looking for is whether to encapsulate the eager loading logic within the specification or to pass it into the repository, and also the Linq/expression tree syntax required to achieve this (i.e. an example of how it would be done).
A possible solution would be to extend the Specification class to add:
public virtual IEnumerable<Expression<Func<T, object>>> FetchRelated
{
get
{
return Enumerable.Empty<Expression<Func<T, object>>>();
}
}
And change GetQuery to something like:
return specification.FetchRelated.Aggregate(
session.Query<T>().Where(specification.IsSatisfiedBy()),
(current, related) => current.Fetch(related));
Now all you have to do is override FetchRelated when needed
public override IEnumerable<Expression<Func<Movie, object>>> FetchRelated
{
get
{
return new Expression<Func<Movie, object>>[]
{
m => m.RelatedEntity1,
m => m.RelatedEntity2
};
}
}
An important limitation of this implementation I just wrote is that you can only fetch entities that are directly related to the root entity.
An improvement would be to support arbitrary levels (using ThenFetch), which would require some changes in the way we work with generics (I used object to allow combining different entity types easily)
You wouldn't want to put the Fetch() call into the specification, because it's not needed. Specification is just for limiting the data that can then be shared across many different parts of your code, but those other parts could have drastically different needs in what data they want to present to the user, which is why at those points you would add your Fetch statements.
