How to write SQL-translatable LINQ code that groups by one property and returns a distinct list

I want to change the code below so that it can be translated to SQL, because right now I get an exception.
Basically, I want a list of customers from a certain localisation. There can be more than one customer with the same CustomerNumber, so I want to take the one that was added most recently.
In other words: a distinct list of customers from a localisation, where the "distinct algorithm" resolves conflicts by taking the most recently added customer.
The code below works only on the client side. I could move GroupBy and Select after ToListAsync, but I want to avoid pulling unnecessary data from the database (the Include loads a list that is pretty big for every customer).
var someData = await DbContext.Set<Customer>()
.Where(o => o.Metadata.Localisation == localisation)
.Include(nameof(Customer.SomeLongList))
.GroupBy(x => x.CustomerNumber)
.Select(gr => gr.OrderByDescending(x => x.Metadata.DateAdded).FirstOrDefault())
.ToListAsync();

Short answer:
No way. GroupBy has a limitation: after grouping, only the Key and aggregation results can be selected. You are trying to select SomeLongList and the full Customer entity.
Best answer:
It can be done in SQL with the ROW_NUMBER window function, but without SomeLongList.
Workaround:
It is only a workaround because it is not efficient: the table is queried twice, once to find the latest DateAdded per CustomerNumber and once to join back to the full rows.
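// Apply the localisation filter from the question to both halves of the query.
var customers = DbContext.Set<Customer>()
    .Where(c => c.Metadata.Localisation == localisation);
// Latest DateAdded per CustomerNumber: only the key and an aggregate are selected.
var groupingQuery =
    from c in customers
    group c by new { c.CustomerNumber } into g
    select new
    {
        g.Key.CustomerNumber,
        DateAdded = g.Max(x => x.Metadata.DateAdded)
    };
// Join back to the full entities, including the long list.
var query =
    from c in customers.Include(x => x.SomeLongList)
    join g in groupingQuery on new { c.CustomerNumber, c.Metadata.DateAdded } equals
        new { g.CustomerNumber, g.DateAdded }
    select c;
var result = await query.ToListAsync();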
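To see what the two-step workaround does, here is a minimal in-memory sketch using LINQ to Objects rather than EF. CustomerRow and LatestPerNumber are hypothetical names, and DateAdded is flattened onto the row for brevity. It only illustrates the shape of the query, not how any particular provider translates it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, flattened stand-in for the Customer entity in the question.
public record CustomerRow(int Id, string CustomerNumber, DateTime DateAdded);

public static class LatestPerNumber
{
    public static List<CustomerRow> Pick(IEnumerable<CustomerRow> customers)
    {
        // Step 1: from each group select only the key and an aggregate.
        var latest =
            from c in customers
            group c by c.CustomerNumber into g
            select new { CustomerNumber = g.Key, DateAdded = g.Max(x => x.DateAdded) };

        // Step 2: join back on (CustomerNumber, DateAdded) to recover the full rows.
        var query =
            from c in customers
            join l in latest
                on new { c.CustomerNumber, c.DateAdded }
                equals new { l.CustomerNumber, l.DateAdded }
            select c;

        return query.ToList();
    }
}
```

Note that if two rows share both CustomerNumber and DateAdded, both come back, so a tie-breaker may be needed on real data.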

Related

NHibernate LINQ query with GroupBy

I am struggling with converting SQL to NHibernate HQL.
SQL Query
SELECT Posts.id, Count(Comments.id) FROM Posts LEFT JOIN Comments ON Posts.id=Comments.fk_post GROUP BY Posts.id
LINQ
Session.Query<Post>().Fetch(x => x.Comments).GroupBy(x => x.id, x => x.Comments)
.Select(x => new { PostId = x.Key, CommentCount = x.Single().Count }).ToList();
This is still failing with exception:
Parameter 'inputInfo' has type 'Remotion.Linq.Clauses.StreamedData.StreamedSingleValueInfo' when type 'Remotion.Linq.Clauses.StreamedData.StreamedSequenceInfo' was expected.
What is wrong with my query?
So you have tables of Posts and Comments. There is a one-to-many relation between Posts and Comments: every Post has zero or more Comments, every Comment belongs to exactly one Post, namely the Post that the foreign key Comments.fk_post refers to.
You want to fetch the Id of every Post, together with the number of Comments for this Post.
Whenever you need to select "items with their zero or more sub-items", like Schools with their Students, Customers with their Orders, or in your case Posts with their Comments, consider using one of the overloads of Queryable.GroupJoin.
Whenever you see a SQL left outer join followed by a GROUP BY, it is almost certain that you need a GroupJoin.
If you want something other than just "items with their sub-items", use the overload that has a resultSelector parameter.
I don't know NHibernate; I assume that Session, Query and Fetch are used to get the IQueryables. As this is not part of the question, I leave it up to you to get them:
IQueryable<Post> posts = ...
IQueryable<Comment> comments = ...
// GroupJoin Posts with Comments
var postIdsWithCommentsCount = posts.GroupJoin(comments,
post => post.Id, // from every Post take the primary key
comment => comment.fk_post, // from every Comment take the foreign key to Post
// parameter resultSelector: from every Post, with all its zero or more Comments,
// make one new object
(post, commentsOfThisPost) => new
{
Id = post.Id,
Count = commentsOfThisPost.Count(),
});
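For reference, a small self-contained sketch of the same GroupJoin, run against in-memory collections instead of NHibernate. Post, Comment and CommentCounts are hypothetical stand-ins for the mapped entities. The point it shows is that posts without comments still appear, with a count of zero.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the mapped entities.
public record Post(int Id);
public record Comment(int Id, int PostId);

public static class CommentCounts
{
    // Returns post id -> number of comments, keeping posts that have none.
    public static Dictionary<int, int> PerPost(
        IEnumerable<Post> posts, IEnumerable<Comment> comments) =>
        posts
            .GroupJoin(
                comments,
                post => post.Id,            // key on the Post side
                comment => comment.PostId,  // foreign key on the Comment side
                (post, commentsOfThisPost) => new
                {
                    post.Id,
                    Count = commentsOfThisPost.Count()
                })
            .ToDictionary(x => x.Id, x => x.Count);
}
```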
Try this query:
var query =
from p in Session.Query<Post>()
from c in p.Comments.DefaultIfEmpty()
group c by p.Id into g
select new
{
PostId = g.Key,
        CommentCount = g.Sum(x => (int?)x.Id == null ? 0 : 1)
    };
var result = query.ToList();

NotSupportedException for LINQ Queries

I am trying to get a list of rows from a database table called OracleTimecards whose employee ID matches one of the IDs in my employees list. Here is what I wrote:
LandornetSQLEntities db = new LandornetSQLEntities();
List<OracleEmployee> employees = db.OracleEmployees.Where(e => e.Office.Contains(officeName) && e.IsActive == true).Distinct().ToList();
var oracleTimeCards = db.OracleTimecards.Where(c => employees.Any(e => c.PersonID == e.PersonID)).ToList();
Does anyone have any idea?
I'm going to assume you're using Entity Framework here. You can't embed calls to arbitrary LINQ extension methods inside your predicate, since EF might not know how to translate these to SQL.
Assuming you want to find all the timecards for the employees you found in your first query, you have two options. The simplest is to have a navigation property on your Employee class, named let's say TimeCards, that points to a collection of time card records for the given employee. Here's how that would work:
var oracleTimeCards = employees
.SelectMany(e => e.TimeCards)
.ToList();
If you don't want to do this for whatever reason, you can create an array of employee IDs by evaluating your first query, and use this to filter the second:
var empIDs = employees
.Select(e => e.PersonID)
.ToArray();
var oracleTimeCards = db.OracleTimecards
.Where(tc => empIDs.Contains(tc.PersonID))
.ToList();
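Here is the second option as a self-contained sketch over in-memory collections. Employee, Timecard and TimecardFilter are hypothetical stand-ins for the generated entity classes. The idea is that the predicate only references a plain array of IDs, which is the kind of Contains that EF can usually turn into an IN clause.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the generated entity classes.
public record Employee(int PersonID, string Office, bool IsActive);
public record Timecard(int Id, int PersonID);

public static class TimecardFilter
{
    public static List<Timecard> ForOffice(
        IEnumerable<Employee> employees,
        IEnumerable<Timecard> timecards,
        string officeName)
    {
        // First query: reduce the employees to a plain array of IDs.
        int[] empIDs = employees
            .Where(e => e.Office.Contains(officeName) && e.IsActive)
            .Select(e => e.PersonID)
            .Distinct()
            .ToArray();

        // Second query: the predicate uses only the primitive array.
        return timecards
            .Where(tc => empIDs.Contains(tc.PersonID))
            .ToList();
    }
}
```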

Can I join a table to a list using linq? [duplicate]

I have a table as follows:
PersonalDetails
Columns are:
Name
BankName
BranchName
AccountNo
Address
I have another list that contains 'Name' and 'AccountNo'.
I have to find all the records from the table whose 'Name' and 'AccountNo' are both present in the given list.
Any suggestion will be helpful.
I have tried the following, but it did not help much:
var duplicationhecklist = dataAccessdup.MST_FarmerProfile
.Join(lstFarmerProfiles,
t => new { t.Name,t.AccountNo},
t1 => new { t1.Name, t1.AccountNo},
(t, t1) => new { t, t1 })
.Select(x => new {
x.t1.Name,
x.t1.BankName,
x.t1.BranchName,
x.t1.AccountNo
}).ToList();
where lstFarmerProfiles is a list.
You probably found out that you can't join an Entity Framework LINQ query with a local list of entity objects, because it can't be translated into SQL. I would preselect the database data on the account numbers only and then join in memory.
var accountNumbers = lstFarmerProfiles.Select(x => x.AccountNo).ToArray();
var duplicationChecklist =
from profile in dataAccessdup.MST_FarmerProfile
.Where(p => accountNumbers
.Contains(p.AccountNo))
.AsEnumerable() // Continue in memory
join param in lstFarmerProfiles on
new { profile.Name, profile.AccountNo} equals
new { param.Name, param.AccountNo}
select profile;
This way you never pull the bulk of the data into memory, only the smallest selection you can reasonably get before continuing in memory.
If accountNumbers contains thousands of items, you may consider a more scalable, chunked Contains approach.
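One way to read "chunked Contains" is to split the key list into batches and run one Contains query per batch. The sketch below assumes that reading. ChunkedContains, Run and queryForChunk are hypothetical names, and Enumerable.Chunk requires .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkedContains
{
    // Runs queryForChunk once per batch of distinct keys and concatenates the results.
    public static List<T> Run<T, TKey>(
        IEnumerable<TKey> keys,
        int chunkSize,
        Func<TKey[], IEnumerable<T>> queryForChunk)
    {
        var results = new List<T>();
        foreach (TKey[] chunk in keys.Distinct().Chunk(chunkSize))
        {
            // With EF, this delegate would be a query such as
            // set.Where(p => chunk.Contains(p.AccountNo)).
            results.AddRange(queryForChunk(chunk));
        }
        return results;
    }
}
```

With the names from the answer above, a call might look like ChunkedContains.Run(accountNumbers, 1000, chunk => dataAccessdup.MST_FarmerProfile.Where(p => chunk.Contains(p.AccountNo))). The batch size of 1000 is an arbitrary assumption.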
Since you already have the lists of values you want to find in .NET, try the Contains method, for example:
List<string> names = /* list of names */;
List<string> accounts = /* list of account */;
var result = db.PersonalDetails.Where(x => names.Contains(x.Name) && accounts.Contains(x.AccountNo))
.ToList();
If MST_FarmerProfile is not very large, I think your best option is to bring it into memory using AsEnumerable() and do the join there.
var duplicationhecklist =
(from x in dataAccessdup.MST_FarmerProfile
.Select(z => new {
z.Name,
z.BankName,
z.BranchName,
z.AccountNo
}).AsEnumerable()
join y in lstFarmerProfiles
on new { x.Name, x.AccountNo} equals new { y.Name, y.AccountNo}
select x).ToList();
Since the data usually lives on different machines, or at least in separate processes (the DB is one, and your in-memory list belongs to your app), there are just two ways to do it.
The first is to download as small a part of the data from the DB as possible and join locally (usually using AsEnumerable() or ToList()). You got many good thoughts on this in other answers.
The other is to upload your local data to the server somehow and perform the query on the DB side. Uploading can be done in different ways: using temp tables or using VALUES. Fortunately there is now a small extension for EF which you could try: EntityFrameworkCore.MemoryJoin (the name might be confusing, but it supports both EF6 and EF Core). As stated in the author's article, it modifies the SQL query passed to the server and injects a VALUES construction with the data from your local list, so the query is executed on the DB server.
If accountNo identifies the record then you could use:
var duplicationCheck = from farmerProfile in dataAccessdup.MST_FarmerProfile
join farmerFromList in lstFarmerProfiles
on farmerProfile.AccountNo equals farmerFromList.AccountNo
select new {
farmerProfile.Name,
farmerProfile.BankName,
farmerProfile.BranchName,
farmerProfile.AccountNo
};
If you need to join on name and account then this should work:
var duplicationCheck = from farmerProfile in dataAccessdup.MST_FarmerProfile
join farmerFromList in lstFarmerProfiles
on new
{
accountNo = farmerProfile.AccountNo,
name = farmerProfile.Name
}
equals new
{
accountNo = farmerFromList.AccountNo,
name = farmerFromList.Name
}
select new
{
farmerProfile.Name,
farmerProfile.BankName,
farmerProfile.BranchName,
farmerProfile.AccountNo
};
If you are only going to go through duplicateChecklist once then leaving .ToList() out will be better for performance.

Entity Framework Take N items of child collection

Say I have a Customer entity and a Sales entity in a one-to-many relationship.
How could I get all Customers with their N most recent sales?
var result = Customers.Where(c => c.Sales.Any());
This would return all customers with ALL their sales.
What if I want just 2 sales records from each customer?
P.S.: I can do that with query syntax; I'm looking for a method syntax solution. I just can't figure out how to chain the calls together in method syntax.
var result = from cust in context.Customers
select new
{
Customers = cust,
Sales = cust.Sales.OrderBy(s => s.Date).Take(2)
};
This works, but I'm not sure if this is the best way to do it.
EDIT:
OK, it turns out the query syntax that I included here does not work either.
Only the Sales in the anonymous type is effectively reduced to 2 records.
var filtered = result.AsEnumerable().Select(r => r.Customers);
Doing this will still result in a list of customers with ALL their sales.
You can do a projection:
var dbquery = Customers.Select( c => new {
Customer = c,
Sales = c.Sales.OrderBy(s => s.Date)
.Take(2).Select( s => new { s, s.SalesDetails})
});
var customers = dbquery
.AsEnumerable()
.Select(c => c.Customer);
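The projection idea can be tried out in memory with a sketch like the one below. Sale, Customer, CustomerSales and RecentSales are hypothetical names. The sketch sorts descending by date on the assumption that "most recent" means the latest dates, and it returns the trimmed sales next to the customer instead of relying on the customer's own Sales collection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the entities in the question.
public record Sale(int Id, DateTime Date);
public record Customer(string Name, List<Sale> Sales);

// Result shape: the customer name plus only the N most recent sales.
public record CustomerSales(string Name, List<Sale> Sales);

public static class RecentSales
{
    public static List<CustomerSales> TopN(IEnumerable<Customer> customers, int n) =>
        customers
            .Where(c => c.Sales.Any())
            .Select(c => new CustomerSales(
                c.Name,
                c.Sales.OrderByDescending(s => s.Date).Take(n).ToList()))
            .ToList();
}
```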

Entity Framework 4 - What is the syntax for joining 2 tables then paging them?

I have the following linq-to-entities query with 2 joined tables that I would like to add pagination to:
IQueryable<ProductInventory> data = from inventory in objContext.ProductInventory
join variant in objContext.Variants
on inventory.VariantId equals variant.id
where inventory.ProductId == productId
where inventory.StoreId == storeId
orderby variant.SortOrder
select inventory;
I realize I need to use the .Join() extension method and then call .OrderBy().Skip().Take() to do this. I am just getting tripped up on the syntax of Join() and can't seem to find any examples (either online or in books).
NOTE: The reason I am joining the tables is to do the sorting. If there is a better way to sort based on a value in a related table than join, please include it in your answer.
2 Possible Solutions
I guess this one is just a matter of readability, but both of these will work and are semantically identical.
1
IQueryable<ProductInventory> data = objContext.ProductInventory
.Where(y => y.ProductId == productId)
.Where(y => y.StoreId == storeId)
.Join(objContext.Variants,
pi => pi.VariantId,
v => v.id,
(pi, v) => new { Inventory = pi, Variant = v })
.OrderBy(y => y.Variant.SortOrder)
.Skip(skip)
.Take(take)
.Select(x => x.Inventory);
2
var query = from inventory in objContext.ProductInventory
where inventory.ProductId == productId
where inventory.StoreId == storeId
join variant in objContext.Variants
on inventory.VariantId equals variant.id
orderby variant.SortOrder
select inventory;
var paged = query.Skip(skip).Take(take);
Kudos to Khumesh and Pravin for helping with this. Thanks to the rest for contributing.
Define the join in your mapping, and then use it. You really don't get anything by using the Join method - instead, use the Include method. It's much nicer.
var data = objContext.ProductInventory.Include("Variant")
.Where(i => i.ProductId == productId && i.StoreId == storeId)
.OrderBy(j => j.Variant.SortOrder)
.Skip(x)
.Take(y);
Add the following line to your query:
var pagedQuery = data.Skip(PageIndex * PageSize).Take(PageSize);
The data variable is an IQueryable, so you can call Skip and Take on it. And if you have a relationship between Product and Variant, you do not really need an explicit join; you can refer to the variant like this:
var data =
    from inventory in objContext.ProductInventory
    where inventory.ProductId == productId && inventory.StoreId == storeId
    orderby inventory.Variant.SortOrder
    select new
    {
        property1 = inventory.Variant.VariantId,
        // rest of the properties go here
    };
var pagedQuery = data.Skip(PageIndex * PageSize).Take(PageSize);
My answer is based on the accepted answer, with a variation of the code above that I consider good practice:
var data = db.Categorie.AsQueryable()
    .Join(db.CategoryMap,
        cat => cat.CategoryId,
        catmap => catmap.ChildCategoryId,
        (cat, catmap) => new { Category = cat, CategoryMap = catmap })
    .Select(c => c.Category);
AsQueryable() converts a generic System.Collections.Generic.IEnumerable<T> to a generic System.Linq.IQueryable<T>, so that the query is built as an expression tree at run time. Note that if the source is already an IQueryable<T>, as Entity Framework sets are, the call has no effect.
Thank you, Mr. Khumesh Kumawat.
You would simply use your Skip(itemsInPage * pageNo).Take(itemsInPage) to do paging.
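As a small sketch, the Skip/Take pair can be wrapped in an extension method. Paging and Page are hypothetical names, and the page index is assumed to be zero-based. The source should already be ordered, since Skip and Take over an unordered query give no stable pages.

```csharp
using System;
using System.Linq;

public static class Paging
{
    // pageIndex is zero-based; the source is expected to be ordered already.
    public static IQueryable<T> Page<T>(this IQueryable<T> ordered, int pageIndex, int pageSize)
    {
        if (pageIndex < 0) throw new ArgumentOutOfRangeException(nameof(pageIndex));
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));

        return ordered.Skip(pageIndex * pageSize).Take(pageSize);
    }
}
```

Because the method stays on IQueryable<T>, a provider such as EF can still translate the paging to SQL, and nothing is executed until the result is enumerated.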
