How can I manually join cached Entity Framework objects?

I'm having a performance issue with lookups using the navigation properties of an EF model.
My model is something like this (conceptually):
public class Company
{
public int ID { get; set; }
public string CompanyName { get; set; }
public EntityCollection<Employee> Employees { get; set; }
}
public class Employee
{
public int CompanyID { get; set; }
public string EmployeeName { get; set; }
public EntityReference<Company> CompanyReference { get; set; }
}
Now let's say I want to get a list of all Companies that have (known) Employees.
Additionally, assume that I've already cached lists of both the Companies and the Employees through previous calls:
var dbContext = new EmploymentContext();
var allCompanies = dbContext.Companies.ToList();
var allEmployees = dbContext.Employees.ToList();
var activeCompanies =
allCompanies.Where(company => company.Employees.Any()).ToList();
This (in my environment) generates a new SQL statement for each .Any() call, following the Employees navigation property.
I already have all the records I need in my cached lists, but they're not 'connected' to each other on the client side.
I realize I can add .Include() calls to my initial cache-fill statement. I want to avoid doing this because in my actual environment I have a large number of relations and a large number of lists I'm populating up front. I'm caching largely to keep Linq from generating overly-complicated nested SQL statements that tend to bog down my database server.
I also realize I can modify my query so as to do an in-memory join:
var activeCompanies = allCompanies.Where
(
company => allEmployees.Any(employee => employee.CompanyID == company.ID)
).ToList();
I'm trying to avoid doing such a rewrite, because the actual business logic gets rather involved. Using Linq statements has significantly improved the readability of this logic, and I'd prefer not to lose that if at all possible.
So my question is this: can I connect them together manually somehow, in the way that the Entity Framework would connect them?
I'd like to continue to use the .Any() operator, but I want it to examine only the objects I have in memory in my dbContext - without going back to the database repeatedly.
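The in-memory join mentioned above can be kept readable by building a lookup once and querying against it. This is only a minimal sketch under assumptions: the entities are simplified to plain classes (no EntityCollection or EntityReference), and the CachedJoin class and its ActiveCompanies method are hypothetical names, not part of Entity Framework. It does not make the Employees navigation property itself work against the cache.

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the EF entities (assumption: plain classes, no EF types).
public class Company
{
    public int ID { get; set; }
    public string CompanyName { get; set; }
}

public class Employee
{
    public int CompanyID { get; set; }
    public string EmployeeName { get; set; }
}

public static class CachedJoin
{
    // Hypothetical helper: groups the cached employees by CompanyID once,
    // so each company check is an in-memory lookup and never touches the database.
    public static List<Company> ActiveCompanies(
        IEnumerable<Company> allCompanies,
        IEnumerable<Employee> allEmployees)
    {
        ILookup<int, Employee> employeesByCompany =
            allEmployees.ToLookup(e => e.CompanyID);

        // A lookup returns an empty sequence for a missing key, so Any() is safe.
        return allCompanies
            .Where(c => employeesByCompany[c.ID].Any())
            .ToList();
    }
}
```

Calling CachedJoin.ActiveCompanies(allCompanies, allEmployees) keeps the .Any() style of the original query while working only on the cached lists.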

Related

Combining Linq Expressions for Dto Selector

We have a lot of Dto classes in our project and on various occasions SELECT them using Expressions from the entity framework context. This has the benefit, that EF can parse our request, and build a nice SQL statement out of it.
Unfortunately, this has led to very big Expressions, because we have no way of combining them.
So if you have a class DtoA with 3 properties, and one of them is of class DtoB with 5 properties, and again one of those is of class DtoC with 10 properties, you would have to write one big selector.
public static Expression<Func<ClassA, DtoA>> ToDto =
from => new DtoA
{
Id = from.Id,
Name = from.Name,
Size = from.Size,
MyB = new DtoB
{
Id = from.MyB.Id,
...
MyCList = from.MyCList.Select(myC => new DtoC
{
Id = myC.Id,
...
})
}
};
Also, they cannot be reused. When you have DtoD, which also has a property of class DtoB, you would have to paste in the desired code of DtoB and DtoC again.
public static Expression<Func<ClassD, DtoD>> ToDto =
from => new DtoD
{
Id = from.Id,
Length = from.Length,
MyB = new DtoB
{
Id = from.MyB.Id,
...
MyCList = from.MyCList.Select(myC => new DtoC
{
Id = myC.Id,
...
})
}
};
So this will escalate pretty fast. Please note that the mentioned code is just an example, but you get the idea.
I would like to define an expression for each class and then combine them as required, while EF is still able to parse the result and generate the SQL statement, so as not to lose the performance improvement.
How can I achieve this?
Have you thought about using AutoMapper? You can define your DTOs and create a mapping between the original entity and the DTO (and/or vice versa). With its projection support you don't need to write any select statements: AutoMapper builds the projection for you, and only the DTO's properties end up in the SQL query.
For example, if you have a Person table with the following structure:
public class Person
{
public int Id { get; set; }
public string Title { get; set; }
public string FamilyName { get; set; }
public string GivenName { get; set; }
public string Initial { get; set; }
public string PreferredName { get; set; }
public string FormerTitle { get; set; }
public string FormerFamilyName { get; set; }
public string FormerGivenName { get; set; }
}
and your DTO looks like this:
public class PersonDto
{
public int Id { get; set; }
public string Title { get; set; }
public string FamilyName { get; set; }
public string GivenName { get; set; }
}
You can create a mapping between Person and PersonDto like this:
Mapper.CreateMap<Person, PersonDto>();
and when you query the database using Entity Framework (for example), you can use something like this to get PersonDto columns only:
ctx.People.Where(p=> p.FamilyName.Contains("John"))
.Project()
.To<PersonDto>()
.ToList();
which will return a list of PersonDtos whose family name contains "John". If you run a SQL profiler, for example, you will see that only the PersonDto columns were selected.
AutoMapper also supports hierarchies, for example if your Person has an Address linked to it and you want to return an AddressDto for it.
I think it is worth a look; it cleans up a lot of the mess that manual mapping requires.
I thought about it a little, and I didn't come up with any "awesome" solution.
Essentially you have a few general choices here.
The first is to use a placeholder and rewrite the expression tree entirely.
Something like this:
public static Expression<Func<ClassA, DtoA>> DtoExpression{
get{
Expression<Func<ClassA, DtoA>> dtoExpression = classA => new DtoA(){
BDto = Magic.Swap(ClassB.DtoExpression),
};
// todo; here you have access to dtoExpression,
// you need to use expression transformers
// in order to find & replace the Magic.Swap(..) call with the
// actual Expression code(NewExpression),
// Rewriting the expression tree is no easy task,
// but EF will be able to understand it this way.
// the code will be quite tricky, but can be solved
// within ~50-100 lines of code, I expect.
// For that, see ExpressionVisitor.
// As ExpressionVisitor detects the usage of Magic.Swap,
// it has to check the actual expression(ClassB.DtoExpression),
// and rebuild it as MemberInitExpression & NewExpression,
// and the bindings have to be mapped to correct places.
return Magic.Rebuild(dtoExpression);
}
}
The second way is to use only the Expression class (ditching LINQ). This way you can write the queries from scratch, and reusability will be good; however, things get harder and you lose type safety. Microsoft has a good reference about dynamic expressions. If you structure everything that way, you can reuse a lot of the functionality. E.g., you define a NewExpression once and reuse it later where needed.
The third way is to use the lambda (method) syntax: .Where, .Select, etc. This definitely gives you better reusability. It doesn't solve your problem 100%, but it can help you compose queries a bit better. For example: from.MyCList.Select(dtoCSelector)
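As a rough illustration of the selector idea, here is a minimal sketch that defines the DtoC projection once and reuses it wherever an IQueryable of ClassC is available. The Name property and the DtoCMapping class are hypothetical (the original elides the DtoC members). The sketch only shows top-level reuse; whether a given EF version can translate the same selector nested inside a larger expression is not demonstrated here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical minimal shapes; the real classes have more properties.
public class ClassC
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class DtoC
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class DtoCMapping
{
    // Defined once as an expression tree, so a LINQ provider can still inspect it.
    public static readonly Expression<Func<ClassC, DtoC>> Selector =
        c => new DtoC { Id = c.Id, Name = c.Name };

    // Any query over ClassC can reuse the same projection.
    public static IQueryable<DtoC> ToDtos(IQueryable<ClassC> source)
    {
        return source.Select(Selector);
    }
}
```

Because Selector is an Expression rather than a compiled delegate, passing it to Queryable.Select keeps the projection visible to the query provider. For a quick check without a database, an in-memory list can be wrapped with AsQueryable() and passed to ToDtos.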

Linq selecting from multiple tables

I have the following model
public class SummaryModel
{
public int CompanyCount { get; set; }
public int GroupCount { get; set; }
public int ProjectCount { get; set; }
public int ResourcesCount { get; set; }
public int PeopleCount { get; set; }
}
I would like to use linq to query my database and return record counts from multiple tables and populate this model object.
This is how I am doing it:
using (var ctx = new WeWorkModel.weWorkEntities())
{
var summary = new SummaryModel()
{
CompanyCount = ctx.Companies.Count(),
PeopleCount = ctx.People.Count(),
GroupCount = ctx.Groups.Count(),
ProjectCount = ctx.Projects.Count(),
ResourcesCount = ctx.Resources.Count()
};
}
Is this the most efficient way to do this?
Yes, this is the most efficient way. It is equivalent to writing the SQL query by hand, because it does not fetch the objects but only does a count on the server. The generated SQL is something like this (I tracked the query using a profiler):
SELECT
[GroupBy1].[A1] AS [C1]
FROM ( SELECT
COUNT(1) AS [A1]
FROM [dbo].[Company] AS [Extent1]
) AS [GroupBy1]
Do you need to store this model in a database or change its values after instantiating it? If not, why not put this code block inside a parameterless constructor and mark the fields readonly, to avoid the model being used differently than intended? If you find later that you need greater control over initialization of the fields, simply add another constructor to deal with that specific case. As to the main question, I see nothing particularly inefficient in your way of handling it, although with code there are nearly always terser or more efficient ways of handling just about any scenario.

Coalesce fields in a .net MVC 4 model without getting "Only initializers, entity members, and entity navigation properties are supported" from LINQ

The answer to this question gave rise to this other question: How to use LINQ expressions as static members of classes in queries when the class is related multiple times to a second class
I have an existing ASP.net MVC 4 site which I need to modify.
The core entities within this site are Items that are for sale, which are created by several different companies and divided into several categories. My task is to allow each company its own optional alias for the global categories. Getting the two categories set up in the database and model was no problem; making the application use the new optional alias when it exists and default to the global one otherwise is where I'm struggling to find the optimal approach.
Adding a coalesce statement to every LINQ query will clearly work, but there are several dozen locations where this logic would need to exist and it would be preferable to keep this logic in one place for when the inevitable changes come.
The following code is my attempt to store the coalesce in the model, but this causes the "Only initializers, entity members, and entity navigation properties are supported." error to be thrown when the LINQ query is executed. I'm unsure how I could achieve something similar with a different method that is more LINQ friendly.
Model:
public class Item
{
[StringLength(10)]
[Key]
public String ItemId { get; set; }
public String CompanyId { get; set; }
public Int32 CategoryId { get; set; }
[ForeignKey("CategoryId")]
public virtual GlobalCategory GlobalCategory { get; set; }
[ForeignKey("CompanyId, CategoryId")]
public virtual CompanyCategory CompanyCategory { get; set; }
public String PreferredCategoryName
{
get{
return (CompanyCategory.CategoryAlias == null || CompanyCategory.CategoryAlias == "") ? GlobalCategory.CategoryName : CompanyCategory.CategoryAlias;
}
}
}
Controller LINQ examples:
var categories = (from i in db.Items
where i.CompanyId == siteCompanyId
orderby i.PreferredCategoryName
select i.PreferredCategoryName).Distinct();
var itemsInCategory = (from i in db.Items
where i.CompanyId == siteCompanyId
&& i.PreferredCategoryName == categoryName
select i);
For one, you are using a compiled property getter (PreferredCategoryName) in the query; unless EF knows how to translate it, you are in trouble.
Try the following in item definition:
public static Expression<Func<Item,String>> PreferredCategoryName
{
get
{
return i => (i.CompanyCategory.CategoryAlias == null || i.CompanyCategory.CategoryAlias == "") ?
i.GlobalCategory.CategoryName :
i.CompanyCategory.CategoryAlias;
}
}
Which is used as follows:
var categories = db.Items.Where(i => i.CompanyId == siteCompanyId)
.OrderBy(Item.PreferredCategoryName)
.Select(Item.PreferredCategoryName)
.Distinct();
This should work as you have a generically available uncompiled expression tree that EF can then parse.

Include nested entities using LINQ

I'm messing around with LINQ for the first time, and I'm using EF 4.1 code first.
I have entities containing nested Lists of other entities, for example:
class Release
{
int ReleaseID { get; set; }
string Title { get; set; }
ICollection<OriginalTrack> OriginalTracks { get; set; }
}
class OriginalTrack
{
int OriginalTrackID { get; set; }
string Title { get; set; }
ICollection<Release> Releases { get; set; }
ICollection<OriginalArtist> OriginalArtists { get; set; }
}
class OriginalArtist
{
int OriginalArtistID { get; set; }
string Name { get; set; }
ICollection<OriginalTrack> OriginalTracks { get; set; }
}
I'm wondering what is the quickest way, in one LINQ query, to obtain all the information for where ReleaseID == some value.
I've done my homework, but have found solutions that require implicit rebuilding of an object (usually anonymous) with the required data. I want the data out of the database in the exact format that it is held within the database, i.e. pulling a Release object with relevant ReleaseID pulls and populates all the OriginalTrack and OriginalArtist data in the Lists.
I know about Include(), but am not sure how to apply it for multiple entities.
All help greatly appreciated.
Use Include. This is the purpose of Include, and there's no reason to write a bunch of nested select statements.
context.Releases.Include("OriginalTracks.OriginalArtists")
.Where(release => release.ReleaseID == id);
This is simpler to write, simpler to read, and preserves your existing data structure.
To use Include you need to specify the name of the property you want to return - this means the name as it exists in your code, not in the database. For example:
.Include("OriginalTracks") will include the OriginalTracks property on each Release
.Include("OriginalTracks.OriginalArtists") will include the OriginalTracks property on each Release, and the OriginalArtists on each track (note that it's not possible - syntactically or logically - to include the OriginalArtists without including the OriginalTracks)
.Include("OriginalTracks").Include("OtherProperty") will include the OriginalTracks and OtherProperty objects on each Release.
You can chain as many of these as you like, for example:
.Include("Tracks.Artist").Include("AnotherProperty")
.Include("ThirdProperty.SomeItems").Where(r => r.something);
is perfectly valid. The only requirement is that you put the Include on the EntitySet, not on a query - you can't .Where().Include().
Don't worry about using Include here;
just do something like the following:
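var query =
from release in ctx.Releases
select new {
release,
originalTracks = from track in release.OriginalTracks
select new {
track,
releases = track.Releases,
originalArtists = from artist in track.OriginalArtists
select new {
artist,
artist.OriginalTracks
}
}
};
// The anonymous type's property is named "release" (lowercase), after the range variable.
// AsEnumerable() runs the whole projection first, so the related entities are loaded too.
var Releases = query.AsEnumerable().Select(x => x.release);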
This should load all of your data.
I worked with information from this post:
http://blogs.msdn.com/b/alexj/archive/2009/10/13/tip-37-how-to-do-a-conditional-include.aspx
To include the nested entities without using string literals, use Select, like this:
context.Releases.Include(r => r.OriginalTracks.Select(t => t.OriginalArtists))
.Where(release => release.ReleaseID == id);

Linq expression for filtered collection of collections?

I'm hoping this will be a rather simple question for anyone who's good at Linq. I'm struggling to come up with the right Linq expression for the following. I'm able to hack something to get the results, but I'm sure there's a proper and simple Linq way to do it, I'm just not good enough at Linq yet...
I have a database accessed through Entity Framework. It has a number of Tasks. Each Task has a collection of TimeSegments. The TimeSegments have Date and Employee properties.
What I want is to be able to get the tasks for a certain employee and a certain month and the timesegments for each task for that same month and employee.
Again, the tasks themselves have no month or date information, but they get it through the TimeSegments associated with each task.
Very simplified it looks sort of like this:
public class Model //Simplified representation of the Entity Framework model
{
public List<Task> Tasks { get; set; }
}
public class Task
{
public int Id { get; set; }
public List<TimeSegment> TimeSegments { get; set; }
public Customer Customer { get; set; }
}
public class TimeSegment
{
public int Id { get; set; }
public string Date { get; set; }
public Employee Employee { get; set; }
}
public class Employee
{
public int Id { get; set; }
public string Name { get; set; }
}
So how do I do this as simply as possible with Linq? I.e. tasks and associated timesegments for a certain month and employee. I would also like to be able to get it by Customer BTW...
This is the simplest thing I could come up with:
var tasksWithSegments =
from segment in model.TimeSegments
where segment.Date.Month == month
where segment.Employee.Id == employeeId
group segment by segment.Task into result
select new
{
Task = result.Key,
TimeSegments = result.ToArray()
};
Please note that you might have to add some properties to your model, such as Model.TimeSegments and TimeSegment.Task.
The trick with LINQ queries often is to start at the right collection. In this case the ideal starting point is TimeSegments.
ps. I'm not sure whether Date.Month == month will actually work with EF, but I think it will (with EF 4.0 that is).
Update:
"Could you show how to extend this query and get the tasks for a particular Customer as well?"
I'm not sure what you mean, but you can for instance filter the previous queryable like this:
var tasksWithSegmentsForCustomers =
from taskWithSegments in tasksWithSegments
where taskWithSegments.Task.Customer.Id == customerId
select taskWithSegments;
"Can I get the return type to be a list of Tasks with a list of TimeSegments if I have this in a method?"
Again, not sure what you exactly want, but if you want two separate lists that have no relation, you can do this:
List<Task> tasks = (
from taskWithSegments in tasksWithSegments
select taskWithSegments.Task).ToList();
List<TimeSegment> segments = (
from taskWithSegments in tasksWithSegments
from segment in taskWithSegments.TimeSegments
select segment).ToList();
Of course, if this is what you need, then it might be easier to rewrite the original query to something like this:
List<TimeSegment> segments = (
from segment in model.TimeSegments
where segment.Date.Month == month
where segment.Employee.Id == employeeId
select segment).ToList();
List<Task> allTasks =
segments.Select(s => s.Task).Distinct().ToList();
Once you get the hang of writing LINQ queries, you will not want to go back to writing SQL statements or old-fashioned foreach statements.
Think LINQ!!!
"What I want is to be able to get the tasks for a certain employee and a certain month and the timesegments for each task for that same month and employee."
This will select tasks from an instance of Model where the task has at least one time segment that is in the requested month for the requested employee (untested):
Model model = new Model();
var tasks = model.Tasks.Where(t => t.TimeSegments.Any(ts => ts.Employee.Id == requestedId && Convert.ToDateTime(ts.Date).Month == requestedMonth));
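The query above returns the matching tasks but leaves each task's TimeSegments list unfiltered. A minimal in-memory sketch that also narrows the segments might look like the following. It rests on several assumptions: the classes are plain objects, the Date string parses with the invariant culture (for example "2011-03-15"), Task is renamed WorkTask to avoid any clash with System.Threading.Tasks.Task, and TaskWithSegments and TaskQueries are hypothetical names. It is written for LINQ to Objects and is not guaranteed to translate to SQL under EF.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Simplified stand-ins for the model classes in the question.
public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class TimeSegment
{
    public int Id { get; set; }
    public string Date { get; set; }   // assumed to hold dates such as "2011-03-15"
    public Employee Employee { get; set; }
}

public class WorkTask
{
    public int Id { get; set; }
    public List<TimeSegment> TimeSegments { get; set; }
}

// Hypothetical result type: a task plus only its matching segments.
public class TaskWithSegments
{
    public WorkTask Task { get; set; }
    public List<TimeSegment> TimeSegments { get; set; }
}

public static class TaskQueries
{
    public static List<TaskWithSegments> ForEmployeeAndMonth(
        IEnumerable<WorkTask> tasks, int employeeId, int month)
    {
        return tasks
            .Select(t => new TaskWithSegments
            {
                Task = t,
                // Keep only the segments for the requested employee and month.
                TimeSegments = t.TimeSegments
                    .Where(ts => ts.Employee.Id == employeeId
                        && DateTime.Parse(ts.Date, CultureInfo.InvariantCulture).Month == month)
                    .ToList()
            })
            // Drop tasks that have no matching segment at all.
            .Where(x => x.TimeSegments.Count > 0)
            .ToList();
    }
}
```

Each result pairs a task with just the segments that matched, which is the shape the question asks for; filtering by Customer could be added as one more Where on the task before the projection.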
