Reusing part of the query with joins - linq

I need to implement 3 public methods that all rely on the same initial join. Would a solution like the one described below work?
The code below is a simplified example; the real code is much more complicated, which is why I want to reuse the join.
// Shared join. Returned as IQueryable so callers can add filters and ordering.
// Renamed so it does not clash with the public GetCustomerOrders() below.
private IQueryable<Tuple<Customer, Order>> JoinCustomersAndOrders()
{
return from c in customers
join o in orders
on c.customerid equals o.customerid
select Tuple.Create(c, o);
}
public MyCustomerOrder GetCustomerOrder(int customerId)
{
return (from co in JoinCustomersAndOrders()
where co.Item1.customerid == customerId
select new MyCustomerOrder(co.Item1, co.Item2)).FirstOrDefault();
}
public IEnumerable<MyCustomerOrder> GetCustomerOrders()
{
return from co in JoinCustomersAndOrders()
orderby co.Item1.Name
select new MyCustomerOrder(co.Item1, co.Item2);
}
The question is, does the tuple break the query? In other words, will the filter where co.Item1.customerid == customerId end up in the SQL query that gets generated?

It really depends on whether LINQ to SQL understands the point of Tuple.Create. I suspect that it does in .NET 4.0 - but the only way to find out is to try it.
It certainly makes conceptual sense, and composability is part of the goal of LINQ - which is why I'd hope it's supported. Effectively it's just like using an anonymous type, except you get to "export" the type information out of the method, which is the point of Tuple.
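As a rough illustration of the composability point, here is a minimal sketch that uses in-memory lists wrapped with AsQueryable() as stand-ins for the real tables. The Customer and Order classes and the TupleJoinSketch name are assumptions made for this example. It only shows that the filter is added to the same expression tree as the join; it does not prove that LINQ to SQL can translate Tuple.Create, which still has to be tried against the real provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity types, standing in for the real mapped classes.
public class Customer { public int CustomerId { get; set; } public string Name { get; set; } }
public class Order { public int OrderId { get; set; } public int CustomerId { get; set; } }

public static class TupleJoinSketch
{
    // In-memory stand-ins for the database tables (an assumption of this sketch).
    static readonly IQueryable<Customer> Customers = new List<Customer>
    {
        new Customer { CustomerId = 1, Name = "Ann" },
        new Customer { CustomerId = 2, Name = "Bob" },
    }.AsQueryable();

    static readonly IQueryable<Order> Orders = new List<Order>
    {
        new Order { OrderId = 10, CustomerId = 1 },
        new Order { OrderId = 11, CustomerId = 2 },
    }.AsQueryable();

    // The shared join, exposed as IQueryable so callers can keep composing.
    public static IQueryable<Tuple<Customer, Order>> JoinCustomersAndOrders()
    {
        return from c in Customers
               join o in Orders on c.CustomerId equals o.CustomerId
               select Tuple.Create(c, o);
    }

    public static void Main()
    {
        var query = from co in JoinCustomersAndOrders()
                    where co.Item1.CustomerId == 2
                    select co;

        // The expression tree now contains both the join and the filter.
        // A real provider decides at execution time whether it can translate it.
        Console.WriteLine(query.Expression);
        Console.WriteLine(query.First().Item2.OrderId); // 11
    }
}
```

With a real LINQ to SQL DataContext, logging the generated SQL (for example through the DataContext.Log property) is the quickest way to check whether the where clause reached the database.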

Related

How do I get a Distinct list to work with EF 4.x DBSet Context and the IEqualityComparer?

I have been trying for hours to get a Distinct to work for my code.
I am using EF 4.3, MVC3 and Razor, and I am trying to get a list down to product ID and name.
When I run the SQL query against the DB, it's fine.
The SQL query is:
SELECT DISTINCT [ProductId]
,[Product_Name]
FROM [dbo].[PRODUCT]
The only other column in that table is a country code so that's why a standard distinct() isn't working.
I have gone as far as creating an IEqualityComparer.
Here is the code:
public class DistinctProduct : IEqualityComparer<PRODUCT>
{
public bool Equals(PRODUCT x, PRODUCT y)
{
return x.ProductId.Equals(y.ProductId);
}
public int GetHashCode(PRODUCT obj)
{
return obj.ProductId.GetHashCode();
}
}
Here is where I call it:
IEqualityComparer<PRODUCT> customComparer = new DistinctProduct();
IEnumerable<PRODUCT> y = db.PRODUCTs.Distinct(customComparer);
But when it hits that last line I get an error stating:
LINQ to Entities does not recognize the method 'System.Linq.IQueryable`1[MyService.Models.PRODUCT] Distinct[PRODUCT](System.Linq.IQueryable`1[MyService.Models.PRODUCT], System.Collections.Generic.IEqualityComparer`1[MyService.Models.PRODUCT])' method, and this method cannot be translated into a store expression.
Can anyone tell me what I'm doing wrong?
Thanks,
David
Is there any reason you could not just use Distinct like the following?
var distinctProducts = (from p in db.PRODUCTs
select new {
ProductId = p.ProductId,
Product_Name = p.ProductName
}).Distinct();
This would remove the country code from the query before you do the distinct.
Entity Framework tries to translate your query into SQL, and it has no way to translate an IEqualityComparer. The real question is whether you want the Distinct to run in the database (in which case your client receives only the filtered results) or whether you are OK with bringing all the data to the client and selecting distinct rows there. If you want the filtering to happen on the database side (which will usually make your app perform much better) and you want to be able to use different comparison strategies, you can write code that builds the distinct criteria on top of your query. If you are fine with bringing your data to the client (note that it can be a lot of data), you should be able to do the following; .ToList() triggers the database query and materializes the results:
IEnumerable<PRODUCT> y = db.PRODUCTs.ToList().Distinct(customComparer);
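The two approaches discussed above can be sketched side by side. This is a minimal example over an in-memory list wrapped with AsQueryable(); the Product class, the sample rows and the DistinctSketch name are assumptions for illustration. Both queries avoid the IEqualityComparer overload, which is the part a SQL-translating provider cannot handle. Whether a given provider and version translates the GroupBy form is something to verify against the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity: the same product appears once per country code.
public class Product
{
    public int ProductId { get; set; }
    public string ProductName { get; set; }
    public string CountryCode { get; set; }
}

public static class DistinctSketch
{
    public static IQueryable<Product> SampleRows() => new List<Product>
    {
        new Product { ProductId = 1, ProductName = "Widget", CountryCode = "US" },
        new Product { ProductId = 1, ProductName = "Widget", CountryCode = "GB" },
        new Product { ProductId = 2, ProductName = "Gadget", CountryCode = "US" },
    }.AsQueryable();

    // Option 1: project away the country code, then call the parameterless Distinct.
    // Anonymous types compare by value, so duplicate (id, name) pairs collapse.
    public static int CountDistinctByProjection(IQueryable<Product> products)
    {
        return products
            .Select(p => new { p.ProductId, p.ProductName })
            .Distinct()
            .Count();
    }

    // Option 2: keep whole entities, one per key, by grouping on the key.
    public static List<Product> FirstPerProductId(IQueryable<Product> products)
    {
        return products
            .GroupBy(p => p.ProductId)
            .Select(g => g.FirstOrDefault())
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinctByProjection(SampleRows())); // 2
        Console.WriteLine(FirstPerProductId(SampleRows()).Count);   // 2
    }
}
```

Option 1 mirrors the SELECT DISTINCT statement from the question most directly, since it only ever asks for the two columns that matter.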

Joining two tables and returning multiple records as one row using LINQ

I'm trying to write a LINQ expression that will join two tables and return data in a format similar to what is possible using MySql's GROUP_CONCAT. I tried searching around on Google and SO, but all the results I found used MSSQL or were only using one table. The expression I have written now looks like this:
from d in division
join o in office on d.Id equals o.DivisionId
select new
{
id = d.Id,
cell = new string[] { d.DivisionName, o.OfficeName }
}
As expected, this returns a list of every division and what offices correspond to that division. The only problem is that since most divisions will have more than one office, I get a division back for each office in said division. Essentially I'm seeing results like this:
Division1: Office1
Division1: Office2
Division1: Office3
Division2: Office1
When I want to see:
Division1: Office1, Office2, Office3
Division2: Office1
I remember doing something a while ago with MySql that used GROUP_CONCAT, but I can't figure out what the equivalent of that would be using LINQ. I tried writing a method which had an IEnumerable<Office> parameter and built a string using the Aggregate extension method, but the way I have my LINQ expression written now, each Office is passed in rather than an IEnumerable<Office>. Is there a better way to approach this problem than what I'm doing now? I'm rather new to LINQ expressions, so I apologize if this is trivial.
You want a group join, e.g.:
from d in division
join o in office on d.Id equals o.DivisionId into offices
select new
{
id = d.Id,
divisionName = d.DivisionName,
officeNames = offices.Select(o => o.OfficeName)
}
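To get from the grouped office names to the comma-separated text that GROUP_CONCAT would produce, the names can be joined in memory once the query has run. The following is a minimal sketch over in-memory collections; the Division and Office classes and the GroupJoinSketch name are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entities matching the property names used in the question.
public class Division { public int Id { get; set; } public string DivisionName { get; set; } }
public class Office { public int DivisionId { get; set; } public string OfficeName { get; set; } }

public static class GroupJoinSketch
{
    // Returns one line per division, with its offices joined by ", ".
    public static List<string> Describe(IEnumerable<Division> divisions, IEnumerable<Office> offices)
    {
        // Group join: each division is paired with the sequence of its offices.
        var grouped = from d in divisions
                      join o in offices on d.Id equals o.DivisionId into divisionOffices
                      select new
                      {
                          d.DivisionName,
                          OfficeNames = divisionOffices.Select(office => office.OfficeName)
                      };

        // string.Join does the concatenation on the client side.
        return grouped
            .Select(g => g.DivisionName + ": " + string.Join(", ", g.OfficeNames))
            .ToList();
    }

    public static void Main()
    {
        var divisions = new List<Division>
        {
            new Division { Id = 1, DivisionName = "Division1" },
            new Division { Id = 2, DivisionName = "Division2" },
        };
        var offices = new List<Office>
        {
            new Office { DivisionId = 1, OfficeName = "Office1" },
            new Office { DivisionId = 1, OfficeName = "Office2" },
            new Office { DivisionId = 1, OfficeName = "Office3" },
            new Office { DivisionId = 2, OfficeName = "Office1" },
        };

        foreach (var line in Describe(divisions, offices))
            Console.WriteLine(line);
        // Division1: Office1, Office2, Office3
        // Division2: Office1
    }
}
```

Against a database-backed provider, it is usually safest to materialize the grouped result first (for example with ToList()) and call string.Join afterwards, since string.Join may not be translatable to SQL.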

Using lookup values in linq to entity queries

I'm a total LINQ noob, so I guess you'll probably have a good laugh reading this question. I'm learning LINQ to create queries in LightSwitch, and what I don't seem to understand is how to select an entity based on a value in a lookup table. Say I want to select all employees in a table that have a job title that is picked from a related lookup table. I want the user to pick the descriptive value from the lookup table in a list and use it as a parameter in a query, not the non-descriptive IDs.
Can someone point me to an article or tutorial that quickly explains this, or give me a quick answer? I AM reading books and have a Pluralsight account, but since this is probably the most extensive knowledge I will need for now, a simple tutorial would help me more than watching hours of videos and reading thousands of pages of books.
Thanks in advance!
Edit: this is the code. As far as I know this should work but doesn't (red squiggly line under EmployeeTitle; the error says that EmployeeContract does not contain a definition for EmployeeTitle, even though there is a relationship between the two).
partial void ActiveEngineers_PreprocessQuery(ref IQueryable<Employee> query)
{
query = from Employee e in query
where e.EmployeeContract.EmployeeTitle.Description == "Engineer"
select e;
}
Edit 2: This works! But why this one and not the other?
partial void ActiveContracts_PreprocessQuery(ref IQueryable<EmployeeContract> query)
{
query = from EmployeeContract e in query
where e.EmployeeTitle.Description == "Engineer"
select e;
}
The red squiggly line you've described is likely because each Employee can have one-to-many EmployeeContracts. The EmployeeContract navigation property on Employee is therefore actually a collection (an IEnumerable<EmployeeContract>), and a collection does not have an "EmployeeTitle" property.
I think what you're looking for might be:
partial void ActiveEngineers_PreprocessQuery(ref IQueryable<Employee> query)
{
query = from Employee e in query
where e.EmployeeContract.Any(x => x.EmployeeTitle.Description == "Engineer")
select e;
}
What this says is that at least one of the Employee's EmployeeContracts must have an EmployeeTitle.Description equal to "Engineer".
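The difference between a single-valued and a collection-valued navigation property can be seen in a small in-memory sketch. The class shapes below are assumptions that mimic the model in the question (with the collection named EmployeeContracts for clarity), and AnySketch is a hypothetical name; this is not LightSwitch-generated code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model: an employee has many contracts, a contract has one title.
public class EmployeeTitle { public string Description { get; set; } }
public class EmployeeContract { public EmployeeTitle EmployeeTitle { get; set; } }
public class Employee
{
    public string Name { get; set; }
    public List<EmployeeContract> EmployeeContracts { get; set; } = new List<EmployeeContract>();
}

public static class AnySketch
{
    // Any() asks whether at least one contract in the collection matches.
    public static IQueryable<Employee> Engineers(IQueryable<Employee> query)
    {
        return from e in query
               where e.EmployeeContracts.Any(c => c.EmployeeTitle.Description == "Engineer")
               select e;
    }

    public static IQueryable<Employee> SampleEmployees()
    {
        var engineer = new EmployeeTitle { Description = "Engineer" };
        var manager = new EmployeeTitle { Description = "Manager" };
        return new List<Employee>
        {
            new Employee { Name = "Ann", EmployeeContracts = { new EmployeeContract { EmployeeTitle = engineer } } },
            new Employee { Name = "Bob", EmployeeContracts = { new EmployeeContract { EmployeeTitle = manager } } },
        }.AsQueryable();
    }

    public static void Main()
    {
        foreach (var e in Engineers(SampleEmployees()))
            Console.WriteLine(e.Name); // Ann
    }
}
```

Writing e.EmployeeContracts.EmployeeTitle would not compile here either, for the same reason as in the question: the property is a list, and only its elements have an EmployeeTitle.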
Try something like this:
partial void RetrieveCustomer_Execute()
{
Order order = this.DataWorkspace.NorthwindData.Orders_Single
(Orders.SelectedItem.OrderID);
Customer cust = order.Customer;
//Perform some task on the customer entity.
}
(http://msdn.microsoft.com/en-us/library/ff851990.aspx#ReadingData)
Assuming you have navigation properties in place for the foreign key over to the lookup table, it should be something like:
var allMonkies = from employee in context.Employees
where employee.EmployeeTitle.FullTitle == "Code Monkey"
select employee;
If you don't have a navigation property, you can still get the same via 'manual' join:
var allMonkies = from employee in context.Employees
join title in context.EmployeeTitles
on employee.EmployeeTitleID equals title.ID
where title.FullTitle == "Code Monkey"
select employee;

Linq Expression Syntax - How to make it more readable?

I am in the process of writing something that will use Linq to combine results from my database, via Linq2Sql and an in-memory list of objects in order to find out which of my in-memory objects match something on the database.
I've come up with the query in both expression and query syntax.
Expression Syntax
var query = order.Items.Join(productNonCriticalityList,
i => i.ProductID,
p => p.ProductID,
(i, p) => i);
Query Syntax
var query =
from p in productNonCriticalityList
join i in order.Items
on p.ProductID equals i.ProductID
select i;
I realise that we have all the code completion goodness with expression syntax, and I do actually use that more, mainly because it's easier to create reusable chunks of filter code that can be chained together to form more complex filters.
For a join, though, the latter seems far more readable to me, although maybe that is because I am used to writing T-SQL.
So, am I missing a trick or is it just a matter of getting used to it?
I agree with the other responders that the exact question you're asking is simply a matter of preference. Personally, I mix the two forms depending upon which is clearer for the specific query that I'm writing.
If I have one comment though, I would say that the query looks like it might load all of the items from the order. That might be fine for a single order one time, but if you're looping through lots of orders, it might be more efficient to load all of the items for all of the orders in one go (you might want to additionally filter by date or customer, or whatever, though). If you do that, you might get better results by switching the query around:
var productIds = (from p in productNonCriticalityList
orderby p.ProductID
select p.ProductID).Distinct();
var orderItems = from i in dc.OrderItems
where productIds.Contains(i.ProductID)
// && additional filtering here (for example by date or customer)
select i;
It's a bit backwards at first glance, but it could save you from loading in all the order items and also from sending lots of queries. It works because the where productIds.Contains(...) call can be converted to where i.ProductID in (1, 2, 3, 4, 5) in SQL. Of course, you'd have to judge it based on the expected number of order items, and the number of product IDs.
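The shape of that Contains filter can be sketched in isolation. The example below runs against an in-memory list wrapped with AsQueryable(), so it demonstrates the query shape only; the OrderItem class and the ContainsSketch name are assumptions for illustration, and the IN (...) translation depends on the actual provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical order item entity.
public class OrderItem
{
    public int OrderItemId { get; set; }
    public int ProductID { get; set; }
}

public static class ContainsSketch
{
    // Filters the queryable source by a local list of product IDs.
    public static IQueryable<OrderItem> ItemsForProducts(
        IQueryable<OrderItem> orderItems, IEnumerable<int> productIds)
    {
        // Materialize the IDs once so the query closes over a plain local list.
        List<int> ids = productIds.Distinct().ToList();
        return orderItems.Where(i => ids.Contains(i.ProductID));
    }

    public static void Main()
    {
        var items = new List<OrderItem>
        {
            new OrderItem { OrderItemId = 1, ProductID = 100 },
            new OrderItem { OrderItemId = 2, ProductID = 200 },
            new OrderItem { OrderItemId = 3, ProductID = 300 },
        }.AsQueryable();

        var matches = ItemsForProducts(items, new[] { 100, 300, 300 });
        Console.WriteLine(matches.Count()); // 2
    }
}
```

Because each ID becomes a query parameter when this is translated to SQL, very long ID lists can hit provider or database limits, which is one more reason to judge the approach against the expected number of product IDs.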
It really all comes down to preference. Some people just hate the idea of query-like syntax in their code. I, for one, appreciate the query syntax; it is declarative and quite readable. As you said, though, the chainability of the first example is a nice thing to have. For my money, I would keep it in query syntax until I felt I needed to begin chaining calls.
I used to feel the same way. Now I find query syntax easier to read and write, particularly when things get complicated. As much as it irked me to type it the first time, 'let' does wonderful things in ways that would not be readable in Expression Syntax.
I prefer the Query syntax when it's complex and the Expression syntax when it's a simple query.
If a DBA were to read the C# code to see what SQL we are using, they would understand and digest the Query syntax more easily.
Taking a simple example:
Query
var col = from o in orders
orderby o.Cost ascending
select o;
Expression
var col2 = orders.OrderBy(o => o.Cost);
To me, the Expression syntax is an easier choice to understand here.
Another example:
Query
var col9 = from o in orders
orderby o.CustomerID, o.Cost descending
select o;
Expression
var col6 = orders.OrderBy(o => o.CustomerID).
ThenByDescending(o => o.Cost);
Both are easy to read and understand, however if the query was
//returns same results as above
var col5 = from o in orders
orderby o.Cost descending
orderby o.CustomerID
select o;
//NOTE the ordering of the orderby's
That looks a little confusing to me, as the fields are in a different order and it appears a little backwards.
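The claim that the two-orderby form returns the same results can be checked for LINQ to Objects, where OrderBy is documented as a stable sort: the last orderby becomes the primary key and the earlier ordering survives among ties. The following is a minimal sketch with a hypothetical Order class and sample data; other providers may treat repeated orderby clauses differently, so this is not a general guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical order type with the two fields used in the examples above.
public class Order
{
    public int CustomerID { get; set; }
    public decimal Cost { get; set; }
}

public static class OrderingSketch
{
    public static List<Order> SampleOrders() => new List<Order>
    {
        new Order { CustomerID = 2, Cost = 5m },
        new Order { CustomerID = 1, Cost = 10m },
        new Order { CustomerID = 1, Cost = 30m },
        new Order { CustomerID = 2, Cost = 20m },
    };

    // Compares "OrderBy + ThenByDescending" with two separate sorts in reverse order.
    public static bool BothFormsAgree(List<Order> orders)
    {
        var chained = orders.OrderBy(o => o.CustomerID).ThenByDescending(o => o.Cost);
        var twoSorts = orders.OrderByDescending(o => o.Cost).OrderBy(o => o.CustomerID);
        return chained.SequenceEqual(twoSorts);
    }

    public static void Main()
    {
        Console.WriteLine(BothFormsAgree(SampleOrders())); // True
    }
}
```

The ThenBy form states the intent directly and sorts once, which is another reason to prefer it over stacking orderby clauses.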
For Joins
Query
var col = from c in customers
join o in orders on
c.CustomerID equals o.CustomerID
select new
{
c.CustomerID,
c.Name,
o.OrderID,
o.Cost
};
Expression:
var col2 = customers.Join(orders,
c => c.CustomerID,o => o.CustomerID,
(c, o) => new
{
c.CustomerID,
c.Name,
o.OrderID,
o.Cost
}
);
I find that Query is better.
My summary would be use whatever looks easiest and fastest to understand given the query at hand. There is no golden rule of which to use. However, if there are a lot of joins, I'd go with Query syntax.
Well, both statements are equivalent, so you could use either, depending on the surrounding code and what is more readable. In my project I decide which syntax to use based on those two conditions.
Personally I would write the expression syntax in one line, but this is a matter of taste.

Linq to NHibernate generating 3,000+ SQL statements in one request!

I've been developing a webapp using Linq to NHibernate for the past few months, but haven't profiled the SQL it generates until now. Using NH Profiler, it now seems that the following chunk of code hits the DB more than 3,000 times when the Linq expression is executed.
var activeCaseList = from c in UserRepository.GetCasesByProjectManagerID(consultantId)
where c.CompletionDate == null
select new
{
c.PropertyID,
c.Reference,
c.Property.Address,
DaysOld = DateTime.Now.Subtract(c.CreationDate).Days,
JobValue = String.Format("£{0:0,0}", c.JobValue),
c.CurrentStatus
};
Where the Repository method looks like:
public IEnumerable<Case> GetCasesByProjectManagerID(int projectManagerId)
{
return from c in Session.Linq<Case>()
where c.ProjectManagerID == projectManagerId
select c;
}
It appears to run the initial Repository query first, then iterates through all of the results checking to see if the CompletionDate is null, but issuing a query to get c.Property.Address first.
So if the initial query returns 2,000 records, even if only five of them have no CompletionDate, it still fires off an SQL query to bring back the address details for the 2,000 records.
The way I had imagined this would work is that it would evaluate all of the WHERE and SELECT clauses and simply amalgamate them, so the initial query would be like:
SELECT ... WHERE ProjectManager = @p1 AND CompleteDate IS NULL
Which would yield 5 records, and then it could fire the further 5 queries to obtain the addresses. Am I expecting too much here, or am I simply doing something wrong?
Anthony
Change the declaration of GetCasesByProjectManagerID:
public IQueryable<Case> GetCasesByProjectManagerID(int projectManagerId)
You can't compose queries with IEnumerable<T> - they're just sequences. IQueryable<T> is specifically designed for composition like this.
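I can't add a comment yet, so I'm posting this as an answer. Jon Skeet is right: you'll want to use IQueryable, which allows the LINQ provider to lazily construct the SQL from the whole composed query. With IEnumerable, the repository query is sent as written, and any further where or select clauses run in memory over its results.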
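The effect of the return type can be seen without a database. In the minimal sketch below, an in-memory list wrapped with AsQueryable() stands in for Session.Linq<Case>(); the Case class shape and the ComposeSketch name are assumptions for illustration. The declared return type decides whether a later Where binds to Queryable.Where (which extends the expression tree a provider can translate) or to Enumerable.Where (which filters in memory).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity with the two fields used in the filters.
public class Case
{
    public int ProjectManagerID { get; set; }
    public DateTime? CompletionDate { get; set; }
}

public static class ComposeSketch
{
    // In-memory stand-in for the provider's queryable source (an assumption).
    static IQueryable<Case> Cases() => new List<Case>
    {
        new Case { ProjectManagerID = 1, CompletionDate = null },
        new Case { ProjectManagerID = 1, CompletionDate = new DateTime(2009, 1, 1) },
        new Case { ProjectManagerID = 2, CompletionDate = null },
    }.AsQueryable();

    // Same query body, two different declared return types.
    public static IQueryable<Case> ByManagerQueryable(int id) =>
        Cases().Where(c => c.ProjectManagerID == id);

    public static IEnumerable<Case> ByManagerEnumerable(int id) =>
        Cases().Where(c => c.ProjectManagerID == id);

    public static void Main()
    {
        // Binds to Queryable.Where: both filters live in one expression tree.
        var composed = ByManagerQueryable(1).Where(c => c.CompletionDate == null);

        // Binds to Enumerable.Where: the second filter is an in-memory iterator.
        var inMemory = ByManagerEnumerable(1).Where(c => c.CompletionDate == null);

        Console.WriteLine(composed.Expression);          // shows both Where calls
        Console.WriteLine(inMemory is IQueryable<Case>); // False
        Console.WriteLine(composed.Count());             // 1
    }
}
```

Both variants return the same rows here; the difference only matters with a real provider, where the composed form lets the second filter reach the generated SQL instead of being applied after every row has been fetched.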

Resources