NHibernate: Where clause containing child collection as subquery building improper T-SQL statements - linq

I am using NHibernate 3.x, along with Fluent NHibernate, and have not had any issues constructing database queries until now.
To simplify my objects for the purposes of this post, I've included a subset of my object and mapping structures below:
IssueItem POCO entity class:
public class IssueItem : DomainEntity, IKeyed<Guid> {
public virtual Guid ID { get; set; }
public virtual string Subject { get; set; }
public virtual string Description { get; set; }
public virtual IList<Location> Locations { get; set; }
}
Location POCO entity class:
public class Location : DomainEntity, IKeyed<Guid> {
public virtual Guid ID { get; set; }
public virtual string City { get; set; }
public virtual string State { get; set; }
public virtual string Zip { get; set; }
public virtual string Organization { get; set; }
public virtual IssueItem Issue { get; set; }
}
IssueItem Fluent NHibernate map:
public class IssueItemMap : DomainEntityMapping<IssueItem> {
public IssueItemMap()
{
Table("IssueItem");
LazyLoad();
Map(x => x.ID).Column("ID");
Map(x => x.Subject).Column("Subject");
Map(x => x.Description).Column("Description");
HasMany(x => x.Locations).KeyColumn("IssueItemID").LazyLoad().ReadOnly().Inverse();
}
}
Location Fluent NHibernate map:
public class LocationMap : DomainEntityMapping<Location> {
public LocationMap()
{
Table("Location");
LazyLoad();
Map(x => x.ID).Column("ID");
Map(x => x.City).Column("City");
Map(x => x.State).Column("State");
Map(x => x.Zip).Column("Zip");
Map(x => x.Organization).Column("Organization");
References(x => x.IssueItem).ForeignKey("IssueItemID").LazyLoad().ReadOnly();
}
}
Now, I'm using a Unit of Work and Service/Repository pattern in my MVC app. Therefore, I have a domain layer in my project that contains my basic POCO entities, as well as validators and services. In my data layer, I've got my NHibernate-related stuff, such as the repositories that my domain layer accesses from my services. This is where my NHibernate maps live as well.
In order to ensure that no NHibernate-specific logic creeps into my domain layer (in case I want to use a different ORM in the future), I perform my LINQ statements in my services within my domain layer against IQueryable objects returned from the repositories in my data layer. Therefore, when I write my queries, I am using System.Linq and System.Linq.Expressions instead of the NHibernate.Linq namespace.
That said, here's my LINQ query I'm having issues with from within one of my service classes in my domain layer:
var issues = _issueRepo.All();
if (!string.IsNullOrWhiteSpace(searchWords)) {
issues = issues.Where(i => i.Subject.Contains(searchWords)
|| i.Description.Contains(searchWords)
|| i.Locations.Where(l => l.Organization.Contains(searchWords)
|| l.City.Contains(searchWords))
.Select(x => x.IssueItemID).Contains(i.ID)
);
}
Now, the IssueItems are queried just fine. However, the one-to-many table (Locations) is not properly queried. This is what I mean...
The generated T-SQL statement is perfect except for the very end of it. Example:
select TOP(100) issueitem0_.ID as ID2_, issueitem0_.Subject as Subject2_, issueitem0_.Description as Description2_
from IssueItem issueitem0_
where issueitem0_.Subject like ('%test%') or issueitem0_.Description like ('%test%')
or exists (select location1_.IssueItemID from Location location1_ where
issueitem0_.ID=location1_.IssueItemID and (location1_.Organization like ('%test%')
or location1_.City like ('%test%')) and location1_.ID=issueitem0_.ID)
See that last bit? It throws in that last "and" statement (and location1_.ID=issueitem0_.ID) that throws a wrench in the whole system. I have tweaked every configuration parameter I could think of with my mapping and have tried many different LINQ statements and I cannot get rid of that last part. I don't know why it adds it.
If I construct the same LINQ statement in LINQPad, it properly generates the T-SQL statement without the last part (and location1_.ID=issueitem0_.ID).
Any ideas?
Thanks!
Joel

Use Any() when you query the locations. It returns true if any location has a property containing what you are looking for. Your version selects inside the where clause and then tries to match the IssueItemID from there. I think you will find this query clearer.
var issues = _issueRepo.All();
if (!string.IsNullOrWhiteSpace(searchWords))
{
issues = issues.Where(i => i.Subject.Contains(searchWords)
|| i.Description.Contains(searchWords)
|| i.Locations.Any(l => l.Organization.Contains(searchWords))
|| i.Locations.Any(l => l.City.Contains(searchWords)));
}
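As a rough sketch of the same idea, the two Any() calls can also be folded into a single predicate. The example below runs against in-memory lists rather than NHibernate; the Issue and Loc types and the IssueSearch helper are hypothetical stand-ins, and the SQL a given NHibernate version generates for it is not shown or guaranteed here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for IssueItem and Location, trimmed to the searched fields.
public class Loc
{
    public string Organization { get; set; }
    public string City { get; set; }
}

public class Issue
{
    public string Subject { get; set; }
    public string Description { get; set; }
    public List<Loc> Locations { get; set; } = new List<Loc>();
}

public static class IssueSearch
{
    // Returns issues whose own text, or any child location, contains the search words.
    public static IQueryable<Issue> Search(IQueryable<Issue> issues, string searchWords)
    {
        if (string.IsNullOrWhiteSpace(searchWords))
            return issues;

        return issues.Where(i => i.Subject.Contains(searchWords)
            || i.Description.Contains(searchWords)
            || i.Locations.Any(l => l.Organization.Contains(searchWords)
                                 || l.City.Contains(searchWords)));
    }
}
```

Because the child collection is only tested for existence, there is no projection of IssueItemID and no Contains(i.ID) for the provider to translate.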

Related

Combining Linq Expressions for Dto Selector

We have a lot of Dto classes in our project and on various occasions SELECT them using Expressions from the entity framework context. This has the benefit, that EF can parse our request, and build a nice SQL statement out of it.
Unfortunately, this has led to very big Expressions, because we have no way of combining them.
So if you have a class DtoA with 3 properties, and one of them is of class DtoB with 5 properties, and again one of those is of class DtoC with 10 properties, you would have to write one big selector.
public static Expression<Func<ClassA, DtoA>> ToDto =
from => new DtoA
{
Id = from.Id,
Name = from.Name,
Size = from.Size,
MyB = new DtoB
{
Id = from.MyB.Id,
...
MyCList = from.MyCList.Select(myC => new DtoC
{
Id = myC.Id,
...
})
}
};
Also, they cannot be reused. When you have DtoD, which also has a property of class DtoB, you would have to paste in the desired code of DtoB and DtoC again.
public static Expression<Func<ClassD, DtoD>> ToDto =
from => new DtoD
{
Id = from.Id,
Length = from.Length,
MyB = new DtoB
{
Id = from.MyB.Id,
...
MyCList = from.MyCList.Select(myC => new DtoC
{
Id = myC.Id,
...
})
}
};
So this will escalate pretty fast. Please note that the mentioned code is just an example, but you get the idea.
I would like to define an expression for each class and then combine them as required, while EF is still able to parse the result and generate the SQL statement, so as not to lose the performance improvement.
How can I achieve this?
Have you thought about using AutoMapper? You can define your DTOs and create a mapping between the original entity and the DTO (and/or vice versa). With its projection support you don't need to write any select statements: AutoMapper generates them for you and projects only the DTO's properties into the SQL query.
for example, if you have a Person table with the following structure:
public class Person
{
public int Id { get; set; }
public string Title { get; set; }
public string FamilyName { get; set; }
public string GivenName { get; set; }
public string Initial { get; set; }
public string PreferredName { get; set; }
public string FormerTitle { get; set; }
public string FormerFamilyName { get; set; }
public string FormerGivenName { get; set; }
}
and your dto was like this :
public class PersonDto
{
public int Id { get; set; }
public string Title { get; set; }
public string FamilyName { get; set; }
public string GivenName { get; set; }
}
You can create a mapping between Person and PersonDto like this:
Mapper.CreateMap<Person, PersonDto>();
and when you query the database using Entity Framework (for example), you can use something like this to get the PersonDto columns only:
ctx.People.Where(p=> p.FamilyName.Contains("John"))
.Project()
.To<PersonDto>()
.ToList();
which will return a list of PersonDtos whose family name contains "John", and if you run a SQL profiler, for example, you will see that only the PersonDto columns were selected.
AutoMapper also supports hierarchies, for example if your Person has an Address linked to it and you want to return an AddressDto for it.
I think it is worth a look; it cleans up a lot of the mess that manual mapping requires.
I thought about it a little, and I didn't come up with any "awesome" solution.
Essentially you have a few general choices here.
The first is to use a placeholder and rewrite the expression tree entirely.
Something like this:
public static Expression<Func<ClassA, DtoA>> DtoExpression{
get{
Expression<Func<ClassA, DtoA>> dtoExpression = classA => new DtoA(){
BDto = Magic.Swap(ClassB.DtoExpression),
};
// todo; here you have access to dtoExpression,
// you need to use expression transformers
// in order to find & replace the Magic.Swap(..) call with the
// actual Expression code(NewExpression),
// Rewriting the expression tree is no easy task,
// but EF will be able to understand it this way.
// the code will be quite tricky, but can be solved
// within ~50-100 lines of code, I expect.
// For that, see ExpressionVisitor.
// As ExpressionVisitor detects the usage of Magic.Swap,
// it has to check the actual expression(ClassB.DtoExpression),
// and rebuild it as MemberInitExpression & NewExpression,
// and the bindings have to be mapped to correct places.
return Magic.Rebuild(dtoExpression);
}
}
The second way is to use only the Expression class (ditching the LINQ lambda syntax). This way you build the queries from scratch and reusability will be good, but things get harder and you lose type safety. Microsoft has a nice reference about dynamic expressions. If you structure everything that way, you can reuse a lot of the functionality. E.g., you define a NewExpression once and reuse it later where needed.
The third way is to stick to the method (lambda) syntax: .Where, .Select, etc. This definitely gives you a better "reusability" rate. It doesn't solve your problem 100%, but it can help you compose queries a bit better. For example: from.MyCList.Select(dtoCSelector)
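To give a feel for what the ExpressionVisitor approach involves, here is a minimal sketch of a simpler, related technique: inlining one expression into another by substituting a parameter. It is not the full Magic.Swap rewrite described above. ExpressionComposer, ParameterReplacer and Compose are hypothetical names, and ClassA, ClassB and DtoB are cut-down stand-ins for the question's classes. The composed tree contains no Invoke node, which is generally what a provider needs, but whether a particular EF version translates it is an assumption to verify.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical, cut-down versions of the question's classes.
public class ClassB { public int Id { get; set; } }
public class ClassA { public int Id { get; set; } public ClassB MyB { get; set; } }
public class DtoB { public int Id { get; set; } }

public static class ExpressionComposer
{
    // Replaces every occurrence of one parameter with another expression.
    private sealed class ParameterReplacer : ExpressionVisitor
    {
        private readonly ParameterExpression _from;
        private readonly Expression _to;

        public ParameterReplacer(ParameterExpression from, Expression to)
        {
            _from = from;
            _to = to;
        }

        protected override Expression VisitParameter(ParameterExpression node)
        {
            return node == _from ? _to : base.VisitParameter(node);
        }
    }

    // Builds x => outer(inner(x)) by inlining inner's body where outer's parameter was used.
    public static Expression<Func<TA, TC>> Compose<TA, TB, TC>(
        this Expression<Func<TB, TC>> outer,
        Expression<Func<TA, TB>> inner)
    {
        var body = new ParameterReplacer(outer.Parameters[0], inner.Body).Visit(outer.Body);
        return Expression.Lambda<Func<TA, TC>>(body, inner.Parameters);
    }
}

public static class ComposeDemo
{
    // Reusable selector for DtoB, defined once.
    public static readonly Expression<Func<ClassB, DtoB>> ToDtoB = b => new DtoB { Id = b.Id };

    // Equivalent to writing a => new DtoB { Id = a.MyB.Id } by hand.
    public static Expression<Func<ClassA, DtoB>> FromClassA()
    {
        Expression<Func<ClassA, ClassB>> getB = a => a.MyB;
        return ToDtoB.Compose(getB);
    }
}
```

Embedding the composed DtoB expression inside a larger `new DtoA { ... }` initializer still needs the placeholder-and-rewrite step from the first option; this sketch only covers the substitution part.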

How can I manually join cached Entity Framework objects?

I'm having a performance issue with lookups using the navigation properties of an EF model.
My model is something like this (conceptually):
public class Company
{
public int ID { get; set; }
public string CompanyName { get; set; }
public EntityCollection<Employee> Employees { get; set; }
}
public class Employee
{
public int CompanyID { get; set; }
public string EmployeeName { get; set; }
public EntityReference<Company> CompanyReference { get; set; }
}
Now let's say I want to get a list of all Companies that have (known) Employees.
Additionally, assume that I've already cached lists of both the Companies and the Employees through previous calls:
var dbContext = new EmploymentContext();
var allCompanies = dbContext.Companies.ToList();
var allEmployees = dbContext.Employees.ToList();
var activeCompanies =
allCompanies.Where(company => company.Employees.Any()).ToList();
This (in my environment) generates a new SQL statement for each .Any() call, following the Employees navigation property.
I already have all the records I need in my cached lists, but they're not 'connected' to each other on the client side.
I realize I can add .Include() calls to my initial cache-fill statement. I want to avoid doing this because in my actual environment I have a large number of relations and a large number of lists I'm populating up front. I'm caching largely to keep Linq from generating overly-complicated nested SQL statements that tend to bog down my database server.
I also realize I can modify my query so as to do an in-memory join:
var activeCompanies = allCompanies.Where
(
company => allEmployees.Any(employee => employee.CompanyID == company.ID)
);
I'm trying to avoid doing such a rewrite, because the actual business logic gets rather involved. Using Linq statements has significantly improved the readability of this logic, and I'd prefer not to lose that if at all possible.
So my question is this: can I connect them together manually somehow, in the way that the Entity Framework would connect them?
I'd like to continue to use the .Any() operator, but I want it to examine only the objects I have in memory in my dbContext - without going back to the database repeatedly.
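For reference, the in-memory join mentioned above can keep its .Any() shape by grouping the cached employees once. This is only a sketch with plain POCO stand-ins (no EntityCollection or navigation properties), and CachedJoin is a hypothetical helper name. It does not reconnect the entities the way the Entity Framework would.

```csharp
using System.Collections.Generic;
using System.Linq;

// Plain stand-ins for the cached entities; navigation properties are left out on purpose.
public class Company
{
    public int ID { get; set; }
    public string CompanyName { get; set; }
}

public class Employee
{
    public int CompanyID { get; set; }
    public string EmployeeName { get; set; }
}

public static class CachedJoin
{
    // Groups the cached employees by CompanyID once, so each company check is an in-memory lookup.
    public static List<Company> WithEmployees(
        IEnumerable<Company> companies,
        IEnumerable<Employee> employees)
    {
        ILookup<int, Employee> byCompany = employees.ToLookup(e => e.CompanyID);

        // A lookup returns an empty sequence for a missing key, so Any() is safe here.
        return companies.Where(c => byCompany[c.ID].Any()).ToList();
    }
}
```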

Using RavenDB, how do I efficiently retrieve a list of items related by a 'foreign key'

I store User objects in RavenDB. Each User has a User.Id property.
I also have a Relationship class that links two User.Ids together to create a Mentor/Mentee relationship, like this:
public class User
{
public string Id { get; set; }
public string UserName { get; set; }
... more properties
}
public class Relationship
{
public string Id { get; set; }
public string MentorId { get; set; }
public string MenteeId { get; set; }
public RelationshipStatus Status { get; set; }
}
Now I want to retrieve a list of Mentees for a given Mentor. I have done this in the following way:
public static List<User> GetMentees(IDocumentSession db, string mentorId)
{
var mentees = new List<User>();
db.Query<Relationship>()
.Where(r => r.MentorId == mentorId)
.Select(r => r.MenteeId)
.ForEach(id => mentees.Add(db.Load<User>(id)));
return mentees;
}
This seems to work just fine but the coding-angel on my shoulder is wrinkling her nose at the smells emanating from the nested use of the IDocumentSession (db) and the need for multiple Load calls to fill the Mentees List.
How can I optimise this method using best practice RavenDB syntax?
Edit
Thanks to @Jonah Himango (see accepted answer below) who solved the problem of multiple calls to the database for me. In addition I have also created a new extension method called 'Memoize' to eliminate the need for the external 'mentees' result List (see code above).
Here is the optimised code. Please feel free to comment and refine further.
The Linq
public static List<User> GetMentees(IDocumentSession db, string mentorId)
{
return db.Query<Relationship>()
.Customize(x => x.Include<Relationship>(o => o.MenteeId))
.Where(r => r.MentorId == mentorId)
.Memoize()
.Select(r => db.Load<User>(r.MenteeId))
.ToList();
}
The extension method
public static List<T> Memoize<T>(this IQueryable<T> target)
{
return target.ToList();
}
Note : This extension method may seem completely superfluous (it is really) but it irritates my geek-gland that I have to call a function called ToList(), not to create a list, but to force the execution of the Linq statement. So my extension method just renames ToList() to the far more accurate Memoize().
You'll want to use the .Include query customization to tell Raven to include the related user object off of each Relationship:
db.Query<Relationship>()
.Customize(x => x.Include<Relationship>(o => o.MenteeId))
.Where(r => r.MentorId == mentorId)
.Select(r => r.MenteeId)
.ForEach(id => mentees.Add(db.Load<User>(id))); // .Load no longer makes a call to the DB; it's already loaded into the session!
Relevant documentation here:
The call to Load() is resolved completely client side (i.e. without
additional requests to the RavenDB server) because the [related]
object has already been retrieved via the .Include call.

Include nested entities using LINQ

I'm messing around with LINQ for the first time, and I'm using EF 4.1 code first.
I have entities containing nested Lists of other entities, for example:
class Release
{
int ReleaseID { get; set; }
string Title { get; set; }
ICollection<OriginalTrack> OriginalTracks { get; set; }
}
class OriginalTrack
{
int OriginalTrackID { get; set; }
string Title { get; set; }
ICollection<Release> Releases { get; set; }
ICollection<OriginalArtist> OriginalArtists { get; set; }
}
class OriginalArtist
{
int OriginalArtistID { get; set; }
string Name { get; set; }
ICollection<OriginalTrack> OriginalTracks { get; set; }
}
I'm wondering what is the quickest way, in one LINQ query, to obtain all the information for where ReleaseID == some value.
I've done my homework, but have found solutions that require implicit rebuilding of an object (usually anonymous) with the required data. I want the data out of the database in the exact format that it is held within the database, i.e. pulling a Release object with relevant ReleaseID pulls and populates all the OriginalTrack and OriginalArtist data in the Lists.
I know about Include(), but am not sure how to apply it for multiple entities.
All help greatly appreciated.
Use Include. This is the purpose of Include, and there's no reason to write a bunch of nested select statements.
context.Releases.Include("OriginalTracks.OriginalArtists")
.Where(release => release.ReleaseID == id);
This is simpler to write, simpler to read, and preserves your existing data structure.
To use Include you need to specify the name of the property you want to return - this means the name as it exists in your code, not in the database. For example:
.Include("OriginalTracks") will include the OriginalTracks property on each Release
.Include("OriginalTracks.OriginalArtists") will include the OriginalTracks property on each Release, and the OriginalArtists on each track (note that it's not possible, syntactically or logically, to include the OriginalArtists without including the OriginalTracks)
.Include("OriginalTracks").Include("OtherProperty") will include the OriginalTracks and OtherProperty objects on each Release.
You can chain as many of these as you like, for example:
.Include("Tracks.Artist").Include("AnotherProperty")
.Include("ThirdProperty.SomeItems").Where(r => r.something);
is perfectly valid. The only requirement is that you put the Include on the EntitySet, not on a query - you can't .Where().Include().
Don't worry about using include here
just do something like the following
var query =
from release in ctx.Releases
select new {
release,
originalTracks = from track in release.OriginalTracks
select new {
track,
releases = track.Releases,
originalArtists = from artist in track.OriginalArtists
select new {
artist,
artist.OriginalTracks
}
}
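};
// AsEnumerable() runs the projection first, so the related entities are loaded and attached.
var Releases = query.AsEnumerable().Select(x => x.release);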
Should load all of your data
I worked with information from this post here.
http://blogs.msdn.com/b/alexj/archive/2009/10/13/tip-37-how-to-do-a-conditional-include.aspx
To include the nested entities without using string literals, use Select, like this:
context.Releases.Include(r => r.OriginalTracks.Select(t => t.OriginalArtists))
.Where(release => release.ReleaseID == id);

NHibernate - LINQ Limitations

I've been using NHibernate with LINQ a fair bit now and I have a few issues. Say I have the following entities:
public class User
{
public virtual int UserID { get; set; }
public virtual bool IsActive { get; set; }
public virtual bool SomeField { get { return 0; } }
public virtual DateTime DateRegistered { get; set; }
public virtual IList<Membership> Membership { get; set; }
public virtual Membership ValidMembership { get { return Membership.FirstOrDefault(m => m.IsValid); } }
}
public class User2
{
public virtual int UserID { get; set; }
public virtual int MembershipID { get; set; }
}
public class Membership
{
public virtual int MembershipID { get; set; }
public virtual bool IsValid { get; set; }
}
Now if i run the following query:
var users = service.Linq<User>()
.Where(u => u.IsActive) // This would work
.Where(u => u.SomeField > 0) // This would fail (i know i can map these using formulas but this is here to illustrate)
.Where(u => u.Membership.Any(m => m.IsValid)) // This would work
.Where(u => u.ValidMembership != null) // This would fail
.Where(u => u.DateRegistered > DateTime.UtcNow.AddDays(-1)) // This would work
.Where(u => u.DateRegistered.AddDays(1) > DateTime.UtcNow) // This would fail
.Select(u => new User2 { UserID = u.UserID }) // This would work
.Select(u => u.UserID) // This would work
.Select(u => new { UserID = u.UserID }) // This would fail
.Select(u => new User2 { UserID = u.UserID, MembershipID = u.Membership.Any(m => m.IsValid) ? u.Membership.Any(m => m.IsValid).First().MembershipID : 0 }); // This would fail
I've added a comment next to each one to indicate whether it would work or fail. Those are the scenarios I can think of at the moment. I've managed to overcome these issues by converting the data to a list before it has to do anything too fancy. This obviously has an impact on performance. I was wondering whether future versions of the LINQ provider for NHibernate will support these? Also, does anyone know how the Entity Framework would handle these scenarios? I'd imagine the Entity Framework would be an improvement, but I don't want to jump ship if the same problems exist.
Appreciate your feedback. Thanks
NHibernate 3 supports more constructs than the contrib provider that you are using (Beta1 was just released, final version is expected before the end of the year)
However, as others pointed out, some constructs are hard (or impossible) to parse, while others require very specific code to translate the expression trees to SQL.
Fortunately, the new provider is also extensible, which means you can add your own db logic for methods of your own, or that are not supported out of the box.
This code shouldn't even compile. User.SomeField is a boolean property but you're trying to return 0 from the getter? SomeField and ValidMembership shouldn't even be virtual because they are properties that wouldn't even be managed by NHibernate.
This is an addition to the answer of Diego Mijelshon
Calculated properties can never be parsed by a Linq provider out of the box, because method bodies can not be converted into an expression tree.
Some, or maybe even all, of the constructs listed below as not implemented are supported by the current Linq provider.
Where(u => u.SomeField > 0) // Calculated property SomeField
Where(u => u.ValidMembership != null) // Calculated property ValidMembership
Where(u => u.DateRegistered.AddDays(1) > DateTime.UtcNow) // The method DateTime.AddDays is not implemented for this side of the date comparison operator greater than >
Select(u => new { UserID = u.UserID })
// Creating anonymous objects is not implemented
Select(u => new User2 { UserID = u.UserID, MembershipID = u.Membership.Any(m => m.IsValid) ? u.Membership.Any(m => m.IsValid).First().MembershipID : 0 });
// Ternary operator not implemented
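The point about calculated properties can be seen by inspecting the expression tree directly. In the sketch below (Member and TreeInspector are made-up names for illustration), the lambda body is just a member access on the property; the getter's code never appears in the tree, so a provider has nothing to translate unless it is told about the property explicitly.

```csharp
using System;
using System.Linq.Expressions;

public class Member
{
    public bool IsValid { get; set; }

    // Calculated property: the getter body is compiled code and never shows up in an expression tree.
    public bool IsUsable { get { return IsValid; } }
}

public static class TreeInspector
{
    // Returns the name of the member accessed in the lambda body, or null if the body is something else.
    public static string AccessedMember(Expression<Func<Member, bool>> predicate)
    {
        var member = predicate.Body as MemberExpression;
        return member == null ? null : member.Member.Name;
    }
}
```

Calling TreeInspector.AccessedMember(m => m.IsUsable) yields the property name only, which is all a Linq provider gets to work with.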
Entity Framework will fail in the same cases as NHibernate (at least for your examples). Remember, Linq uses deferred execution for Where() operations, and everything in Linq2SQL (Entity Framework included) and Linq2NHibernate needs to be translated to SQL when the query finally executes. Arbitrary method calls cannot be converted to SQL, since there is no representation of the method in SQL, and that is why it would fail.
When you call ToList(), you force the preceding Linq statements to be evaluated (as a database call). From then on you are working on an in-memory representation, which lets you use the full range of Linq2Objects expressions (including fancy method calls etc.).
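The difference between a deferred query and one forced with ToList() can be sketched with plain Linq to Objects (DeferredDemo is a made-up name; no database is involved here):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns { count seen by the deferred query, count seen by the ToList() snapshot }.
    public static int[] Run()
    {
        var source = new List<int> { 1, 2, 3 };

        // Nothing is evaluated yet; this only describes the query.
        IEnumerable<int> deferred = source.Where(n => n > 1);

        // ToList() evaluates immediately and copies the results.
        List<int> snapshot = source.Where(n => n > 1).ToList();

        // Changing the source afterwards affects only the deferred query.
        source.Add(4);

        return new[] { deferred.Count(), snapshot.Count };
    }
}
```

The deferred query sees the element added later, while the snapshot does not. With a database-backed provider, the same boundary decides which operators must be translated to SQL and which run in memory.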
As for your projections - I wouldn't use Linq2NHibernate for those - but instead use the Projections built into 'standard' NHibernate.
