Dynamics CRM 2011 - Filtering LINQ query with outer joins - linq

I have a requirement to query for records in CRM that don't have a related entity of a certain type. Normally, I would do this with an Left Outer Join, then filter for all the rows that have NULLs in the right-hand side.
For example:
var query = from c in orgContext.CreateQuery<Contact>()
join aj in orgContext.CreateQuery<Account>()
on c.ContactId equals aj.PrimaryContactId.Id
into wonk
from a in wonk.DefaultIfEmpty()
where a.Name == null
select new Contact
{
FirstName = c.FirstName,
LastName = c.LastName,
};
This should return me any Contats that are not the Primary Contact of an account. However, this query ends up returning all contacts...! When you look at the SQL that gets generated in SQL Profiler it comes out like this:
SELECT cnt.FirstName, cnt.LastName
FROM Contact as cnt
LEFT OUTER JOIN Account AS acct
ON cnt.ContactId = acct.PrimaryContactId AND acct.Name is NULL
So, I get the Left Join OK, but the filter is on the Join clause, and not in a WHERE clause.and not as it should, like this:
SELECT cnt.FirstName, cnt.LastName
FROM Contact as cnt
LEFT OUTER JOIN Account AS acct
ON cnt.ContactId = acct.PrimaryContactId
WHERE acct.Name is NULL
Clearly, the results from this query are very different! Is there a way to get the query on CRM to generate the correct SQL?
Is this a limitation of the underlying FetchXML request?

Unfortunately, this is a limitation of CRM's LINQ and FetchXML implementations. This page from the SDK states outer joins are not supported:
http://technet.microsoft.com/en-us/library/gg328328.aspx
And while I can't find an official document, there are a lot of results out there for people mentioning FetchXML does not support left outer joins, for example:
http://gtcrm.wordpress.com/2011/03/24/fetch-xml-reports-for-crm-2011-online/

Try this:
var query = from c in orgContext.CreateQuery<Contact>()
where orgContext.CreateQuery<Account>().All(aj => c.ContactId != aj.PrimaryContactId.Id)
select new Contact
{
FirstName = c.FirstName,
LastName = c.LastName,
};

If you don't need to update the entity (e.g. to process all the corresponding validation rules and workflow steps), you can write less-ugly and more efficient queries by hitting the SQL Server directly.
Per CRM's pattern, the views take care of most of the common joins for you. For instance, the dbo.ContactBase and dbo.ContactExtensionBase tables are already joined for you in the view dbo.Contact. The AccountName is already there (called AccountIdName for some bizarre reason, but at least it's there).

Related

How to get TOP 1 result from a LINQ left join without APPLY

I have a master table and I intend to use a left join with LINQ.
Unfortunately the left join multiplies the result (I need only a top 1 result from that).
Here comes the problem: my query should have SQL 8 conformance.
So when I use the following query:
var query = from user in context.User
join group in context.Groups on user.ID equals group.GroupID into groupJoin
from subGroup in groupJoin.Take(1).DefaultIfEmpty()
select new
{
Name = user.Name,
GroupName = subGroup!=null ? subGroup.Name : null
};
I get this exception:
The execution of this query requires the APPLY operator, which is not
supported in versions of SQL Server earlier than SQL Server 2005.
How could I replace my query to have SQL8 conformance?
Unfortunately I have no EF4 to test (it does the trick in the latest EF6), but you can try to force usage of a SQL subquery expression(s) like this:
var query = from user in context.User
select new
{
Name = user.Name,
GroupName = (from g in context.Groups where g.GroupId == user.Id select g.Name).FirstOrDefault()
};
Used your linq-query without take(1) and after that use next code:
query = query.Take(1);

Dynamics CRM 2011 Linq Left Outer Join

I am trying to get all records from an entity that do not join to another entity.
This is what I am trying to do in SQL:
SELECT * from table1
LEFT join table2
ON table1.code = table2.code
WHERE table2.code IS NULL
It results in all table1 rows that did not join to table2.
I have it working with Linq when joining on one field, but I have contact records to join on firstname, dob, and number.
I have a "staging" entity that is imported to; a workflow processes the staging records and creates contacts if they are new.
The staging entity is pretty much a copy of the real entity.
var queryable = from staging in linq.mdo_staging_contactSet
join contact in linq.ContactSet
on staging.mdo_code equals contact.mdo_code
into contactGroup
from contact in contactGroup.DefaultIfEmpty()
// all staging records are selected, even if I put a where clause here
select new Contact
{
// import sequence number is set to null if the staging contact joined to the default contact, which has in id of null
ImportSequenceNumber = (contactContactId == null) ? new int?(subImportNo) : null,
/* other fields get populated */
};
return queryable // This is all staging Contacts, the below expressions product only the new Contacts
.AsEnumerable() // Cannot use the below query on IQuerable
.Where(contact => contact.ImportSequenceNumber != null); // ImportSequenceNumber is null for existing Contacts, and not null for new Contacts
Can I do the same thing using method syntax?
Can I do the above and join on multiple fields?
The alternatives I found were worse and involved using newRecords.Except(existingRecords), but with IEnumerables; is there a better way?
You can do the same thing with method calls, but some tend to find it harder to read since there are some LAMBDA expressions in the middle. Here is an example that shows how the two are basically the same.
I've seen others ask this same questions and it boils down to choice by the developer. I personally like the LINQ approach since I also write a bunch of SQL and I can read the code easier.

How to improve LINQ to EF performance

I have two classes: Property and PropertyValue. A property has several values where each value is a new revision.
When retrieving a set of properties I want to include the latest revision of the value for each property.
in T-SQL this can very efficiently be done like this:
SELECT
p.Id,
pv1.StringValue,
pv1.Revision
FROM dbo.PropertyValues pv1
LEFT JOIN dbo.PropertyValues pv2 ON pv1.Property_Id = pv2.Property_Id AND pv1.Revision < pv2.Revision
JOIN dbo.Properties p ON p.Id = pv1.Property_Id
WHERE pv2.Id IS NULL
ORDER BY p.Id
The "magic" in this query is to join on the lesser than condition and look for rows without a result forced by the LEFT JOIN.
How can I accomplish something similar using LINQ to EF?
The best thing I could come up with was:
from pv in context.PropertyValues
group pv by pv.Property into g
select g.OrderByDescending(p => p.Revision).FirstOrDefault()
It does produce the correct result but is about 10 times slower than the other.
Maybe this can help. Where db is the database context:
(
from pv1 in db.PropertyValues
from pv2 in db.PropertyValues.Where(a=>a.Property_Id==pv1.Property_Id && pv1.Revision<pv2.Revision).DefaultIfEmpty()
join p in db.Properties
on pv1.Property_Id equals p.Id
where pv2.Id==null
orderby p.Id
select new
{
p.Id,
pv1.StringValue,
pv1.Revision
}
);
Next to optimizing a query in Linq To Entities, you also have to be aware of the work it takes for the Entity Framework to translate your query to SQL and then map the results back to your objects.
Comparing a Linq To Entities query directly to a SQL query will always result in lower performance because the Entity Framework does a lot more work for you.
So it's also important to look at optimizing the steps the Entity Framework takes.
Things that could help:
Precompile your query
Pre-generate views
Decide for yourself when to open the database connection
Disable tracking (if appropriate)
Here you can find some documentation with performance strategies.
if you want to use multiple conditions (less than expression) in join you can do this like
from pv1 in db.PropertyValues
join pv2 in db.PropertyValues on new{pv1.Property_ID, Condition = pv1.Revision < pv2.Revision} equals new {pv2.Property_ID , Condition = true} into temp
from t in temp.DefaultIfEmpty()
join p in db.Properties
on pv1.Property_Id equals p.Id
where t.Id==null
orderby p.Id
select new
{
p.Id,
pv1.StringValue,
pv1.Revision
}

Linq to entities Left Join

I want to achieve the following in Linq to Entities:
Get all Enquires that have no Application or the Application has a status != 4 (Completed)
select e.*
from Enquiry enq
left outer join Application app
on enq.enquiryid = app.enquiryid
where app.Status <> 4 or app.enquiryid is null
Has anyone done this before without using DefaultIfEmpty(), which is not supported by Linq to Entities?
I'm trying to add a filter to an IQueryable query like this:
IQueryable<Enquiry> query = Context.EnquirySet;
query = (from e in query
where e.Applications.DefaultIfEmpty()
.Where(app=>app.Status != 4).Count() >= 1
select e);
Thanks
Mark
In EF 4.0+, LEFT JOIN syntax is a little different and presents a crazy quirk:
var query = from c1 in db.Category
join c2 in db.Category on c1.CategoryID equals c2.ParentCategoryID
into ChildCategory
from cc in ChildCategory.DefaultIfEmpty()
select new CategoryObject
{
CategoryID = c1.CategoryID,
ChildName = cc.CategoryName
}
If you capture the execution of this query in SQL Server Profiler, you will see that it does indeed perform a LEFT OUTER JOIN. HOWEVER, if you have multiple LEFT JOIN ("Group Join") clauses in your Linq-to-Entity query, I have found that the self-join clause MAY actually execute as in INNER JOIN - EVEN IF THE ABOVE SYNTAX IS USED!
The resolution to that? As crazy and, according to MS, wrong as it sounds, I resolved this by changing the order of the join clauses. If the self-referencing LEFT JOIN clause was the 1st Linq Group Join, SQL Profiler reported an INNER JOIN. If the self-referencing LEFT JOIN clause was the LAST Linq Group Join, SQL Profiler reported an LEFT JOIN.
Do this:
IQueryable<Enquiry> query = Context.EnquirySet;
query = (from e in query
where (!e.Applications.Any())
|| e.Applications.Any(app => app.Status != 4)
select e);
I don't find LINQ's handling of the problem of what would be an "outer join" in SQL "goofy" at all. The key to understanding it is to think in terms of an object graph with nullable properties rather than a tabular result set.
Any() maps to EXISTS in SQL, so it's far more efficient than Count() in some cases.
Thanks guys for your help. I went for this option in the end but your solutions have helped broaden my knowledge.
IQueryable<Enquiry> query = Context.EnquirySet;
query = query.Except(from e in query
from a in e.Applications
where a.Status == 4
select e);
Because of Linq's goofy (read non-standard) way of handling outers, you have to use DefaultIfEmpty().
What you'll do is run your Linq-To-Entities query into two IEnumerables, then LEFT Join them using DefaultIfEmpty(). It may look something like:
IQueryable enq = Enquiry.Select();
IQueryable app = Application.Select();
var x = from e in enq
join a in app on e.enquiryid equals a.enquiryid
into ae
where e.Status != 4
from appEnq in ae.DefaultIfEmpty()
select e.*;
Just because you can't do it with Linq-To-Entities doesn't mean you can't do it with raw Linq.
(Note: before anyone downvotes me ... yes, I know there are more elegant ways to do this. I'm just trying to make it understandable. It's the concept that's important, right?)
Another thing to consider, if you directly reference any properties in your where clause from a left-joined group (using the into syntax) without checking for null, Entity Framework will still convert your LEFT JOIN into an INNER JOIN.
To avoid this, filter on the "from x in leftJoinedExtent" part of your query like so:
var y = from parent in thing
join child in subthing on parent.ID equals child.ParentID into childTemp
from childLJ in childTemp.Where(c => c.Visible == true).DefaultIfEmpty()
where parent.ID == 123
select new {
ParentID = parent.ID,
ChildID = childLJ.ID
};
ChildID in the anonymous type will be a nullable type and the query this generates will be a LEFT JOIN.

Linq To Entity Framework selecting whole tables

I have the following Linq statement:
(from order in Orders.AsEnumerable()
join component in Components.AsEnumerable()
on order.ORDER_ID equals component.ORDER_ID
join detail in Detailss.AsEnumerable()
on component.RESULT_ID equals detail.RESULT_ID
where orderRestrict.ORDER_MNEMONIC == "MyOrderText"
select new
{
Mnemonic = detail.TEST_MNEMONIC,
OrderID = component.ORDER_ID,
SeqNumber = component.SEQ_NUM
}).ToList()
I expect this to put out the following query:
select *
from Orders ord (NoLock)
join Component comp (NoLock)
on ord .ORDER_ID = comp.ORDER_ID
join Details detail (NoLock)
on comp.RESULT_TEST_NUM = detail .RESULT_TEST_NUM
where res.ORDER_MNEMONIC = 'MyOrderText'
but instead I get 3 seperate queries that select all rows from the tables. I am guessing that Linq is then filtering the values because I do get the correct values in the end.
The problem is that it takes WAY WAY too long because it is pulling down all the rows from all three tables.
Any ideas how I can fix that?
Remove the .AsEnumerable()s from the query as these are preventing the entire query being evaluated on the server.

Resources