Linq GroupInto Selector - linq

I am coding through the MS 101 Linq examples, and the joins are throwing me for a loop ;)
Example 104 is intended to show how slightly more complex group joins work:
var prodByCategory =
from cat in categories
join prod in products on cat equals prod.Category into ps
from p in ps // <-- ?
select new { Category = cat, p.ProductName };
The part with the "?" confuses me because it looks like p is exactly the same thing as ps, and that clause is not in the earlier examples.
So I tried to write it using method/linq syntax and what I put together looks like this:
var prodByCategory = categories.GroupJoin(products, c => c, p => p.Category, (c, p) => new { Category = c, p.First().ProductName });
The call to First() confuses me because it is not in the earlier query expression.
When I run this I get an error because p is empty before it gets to first. I'm not sure how that is possible because the category list definitely matches up with the categories in the products collection.
foreach (var item in prodByCategory)
{
Console.WriteLine(item.ProductName + ": " + item.Category);
}
Products already have a category field. So what good is the query? I guess it's just a learning thing, but I'm having trouble understanding the value here (although that might be because I don't understand how the two collections get related).
Update:
Played around with Gert's suggestion. See his well-articulated explanation below. After a minor clean-up this worked:
var prodByCategory = categories.GroupJoin(products, cat => cat, prod => prod.Category,
(cat, prods) => new { cat, prods }) // prods is the group for this category, not the whole products collection
.SelectMany(x => x.prods
, (x, p) =>
new
{
Category = x.cat, // x.cat.CategoryName not accessible here
ProductName = p.ProductName
});

You are right that it looks pretty useless. GroupJoin (or join - into) performs a hybrid of joining and grouping: the items on the right side of the join are grouped under the matching item on the left. So, this query in query (comprehension) syntax
from cat in Categories
join prod in Products on cat equals prod.Category into products
select new { Category = cat.CategoryName,
Products = products.Select (p => p.ProductName) }
with its fluent syntax equivalent
Categories.GroupJoin(Products, cat => cat, prod => prod.Category,
(cat, products) => new
{
Category = cat.CategoryName,
Products = products.Select (p => p.ProductName)
})
results in
Beverages    Chai
             Chang
             Guaraná Fantástica
             Sasquatch Ale
             ...
Condiments   Aniseed Syrup
             Chef Anton's Cajun Seasoning
             Chef Anton's Gumbo Mix
             Grandma's Boysenberry Spread
             ...
The benefit of GroupJoin vs. Join is that a Category can have an empty collection of Products, which in SQL is analogous to an outer join.
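To make the empty-collection point concrete, here is a minimal in-memory sketch. The data is made up for illustration (plain strings stand in for category objects, and "Seafood" deliberately has no products); it is not the Northwind data from the sample.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: "Seafood" deliberately has no products.
var categories = new[] { "Beverages", "Condiments", "Seafood" };
var products = new[]
{
    new { ProductName = "Chai", Category = "Beverages" },
    new { ProductName = "Chang", Category = "Beverages" },
    new { ProductName = "Aniseed Syrup", Category = "Condiments" },
};

// GroupJoin yields one result per category, even when its group is empty.
var grouped = categories
    .GroupJoin(products, cat => cat, prod => prod.Category,
        (cat, ps) => new { Category = cat, Products = ps.Select(p => p.ProductName).ToList() })
    .ToList();

// A plain Join yields one result per matching pair, so "Seafood" disappears.
var joined = categories
    .Join(products, cat => cat, prod => prod.Category,
        (cat, prod) => new { Category = cat, prod.ProductName })
    .ToList();

foreach (var g in grouped)
    Console.WriteLine($"{g.Category}: {g.Products.Count} product(s)");
Console.WriteLine($"Join rows: {joined.Count}");
```

With this data the GroupJoin result still contains a "Seafood" entry with zero products, while the Join result has no row for it at all.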
The effect of join - into followed by from is that the grouped structure is flattened out again. So
from cat in Categories
join prod in Products on cat equals prod.Category into ps
from p in ps
select new { Category = cat.CategoryName, p.ProductName }
with its equivalent
Categories.GroupJoin(Products, cat => cat, prod => prod.Category,
(cat, products) => new {cat, products})
.SelectMany(x => x.products
, (x, p) =>
new
{
Category = x.cat.CategoryName,
ProductName = p.ProductName
})
produces
Category        ProductName
--------------- ----------------------------------------
Beverages       Chai
Beverages       Chang
Condiments      Aniseed Syrup
Condiments      Chef Anton's Cajun Seasoning
Condiments      Chef Anton's Gumbo Mix
Condiments      Grandma's Boysenberry Spread
Produce         Uncle Bob's Organic Dried Pears
Condiments      Northwoods Cranberry Sauce
Meat/Poultry    Mishi Kobe Niku
...
Which would be the same as using a simple join to start with.
So the clause from p in ps looks as if it shouldn't change the query in any essential way. But it's a disguised SelectMany! That's quite an essential change for such an innocent little statement. I'm not chained to fluent syntax, but sometimes query (comprehension) syntax conceals what really happens, which is an excellent way to produce bugs.
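A small sketch that checks the claim above on made-up in-memory data (strings stand in for category objects; none of this is the original sample data). It also shows DefaultIfEmpty, the usual way to keep the empty groups when flattening, which is not part of the original example and is included here only as an illustration.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: "Seafood" has no products.
var categories = new[] { "Beverages", "Seafood" };
var products = new[]
{
    new { ProductName = "Chai", Category = "Beverages" },
    new { ProductName = "Chang", Category = "Beverages" },
};

// join ... into followed by from: the groups are flattened, empty groups vanish.
var flattened =
    (from cat in categories
     join prod in products on cat equals prod.Category into ps
     from p in ps
     select new { Category = cat, p.ProductName }).ToList();

// A plain join produces the same rows.
var simple =
    (from cat in categories
     join prod in products on cat equals prod.Category
     select new { Category = cat, prod.ProductName }).ToList();

// DefaultIfEmpty supplies a null placeholder for empty groups (left outer join).
var leftOuter =
    (from cat in categories
     join prod in products on cat equals prod.Category into ps
     from p in ps.DefaultIfEmpty()
     select new { Category = cat, ProductName = p?.ProductName ?? "(none)" }).ToList();

Console.WriteLine(flattened.SequenceEqual(simple)); // anonymous types compare by value
Console.WriteLine(leftOuter.Count);
```

Here the flattened group join and the plain join give identical rows, and only the DefaultIfEmpty variant keeps a row for "Seafood".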

I believe your translation to fluent syntax is not correct. It should be:
var prodByCategory = categories.GroupJoin(products, c => c, p => p.Category, (c, p) => new { c = c, p = p })
.SelectMany (ps => ps.p, (ps, p) => new { Category = ps.c, ProductName = p.ProductName });
I would suggest using LinqPad to examine both the fluent syntax translations and the generated SQL of your LINQ statements. It is a great tool and I find myself using it often.
I agree that their use of that sample is questionable.

Related

Linq Query join on multiple conditions

I'm trying to write a single query that combines data from multiple joins into a single property.
Rig
----
RigId
Component1Id
Component2Id
Component3Id
Work Item
---------
Id
ComponentID
Description
I'm trying to write a query that returns a list of rigs with a single property called History that consists of all the work items associated with the components in a Rig.
I can't seem to use multiple conditions in a join, or do separate joins and concatenate the items into a single property.
So the result is something like
RigId, History (which consists of a list of all the workitems for the rig)
Here is the answer in query syntax:
var ans = from r in Rigs
join w in WorkItems on r.Component1ID equals w.ComponentID into wg1
join w in WorkItems on r.Component2ID equals w.ComponentID into wg2
join w in WorkItems on r.Component3ID equals w.ComponentID into wg3
select new { r.RigID, History = wg1.Concat(wg2).Concat(wg3).ToList() };
and if you prefer, lambda syntax (this was a bit harder...)
var ans2 = Rigs.GroupJoin(WorkItems, r => r.Component1ID, w => w.ComponentID, (r, w1g) => new { r, h1 = w1g.ToList() })
.GroupJoin(WorkItems, rh1 => rh1.r.Component2ID, w => w.ComponentID, (rh1, w2g) => new { rh1.r, h2 = rh1.h1.Concat(w2g.ToList()) })
.GroupJoin(WorkItems, rh2 => rh2.r.Component3ID, w => w.ComponentID, (rh2, w3g) => new { rh2.r.RigID, History = rh2.h2.Concat(w3g.ToList()) });
I don't think using columns for the components is a very good idea: what happens when a Rig has more or fewer than 3 components? You should really have a separate RigComponent table.
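As a rough sketch of what the separate-table suggestion could look like, assume a hypothetical RigComponent link table with RigID and ComponentID columns (these names are not from the question). The history then needs a single join and a group, whatever the number of components per rig. The data below is in-memory and made up.

```csharp
using System;
using System.Linq;

// Hypothetical link table: one row per (rig, component) pair.
var rigComponents = new[]
{
    new { RigID = 1, ComponentID = 10 },
    new { RigID = 1, ComponentID = 11 },
    new { RigID = 2, ComponentID = 12 },
};
var workItems = new[]
{
    new { Id = 100, ComponentID = 10, Description = "Replace seal" },
    new { Id = 101, ComponentID = 11, Description = "Inspect valve" },
    new { Id = 102, ComponentID = 10, Description = "Repaint" },
};

// One inner join, then group the work items by rig.
var history =
    (from rc in rigComponents
     join w in workItems on rc.ComponentID equals w.ComponentID
     group w by rc.RigID into g
     select new { RigID = g.Key, History = g.ToList() }).ToList();

foreach (var h in history)
    Console.WriteLine($"Rig {h.RigID}: {h.History.Count} work item(s)");
```

Because this uses an inner join, a rig whose components have no work items (rig 2 here) does not appear in the result; a group join starting from the rigs would be needed to keep it.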

Entity Framework 4 - What is the syntax for joining 2 tables then paging them?

I have the following linq-to-entities query with 2 joined tables that I would like to add pagination to:
IQueryable<ProductInventory> data = from inventory in objContext.ProductInventory
join variant in objContext.Variants
on inventory.VariantId equals variant.id
where inventory.ProductId == productId
where inventory.StoreId == storeId
orderby variant.SortOrder
select inventory;
I realize I need to use the .Join() extension method and then call .OrderBy().Skip().Take() to do this. I am just getting tripped up on the syntax of Join() and can't seem to find any examples (either online or in books).
NOTE: The reason I am joining the tables is to do the sorting. If there is a better way to sort based on a value in a related table than join, please include it in your answer.
2 Possible Solutions
I guess this one is just a matter of readability, but both of these will work and are semantically identical.
1
IQueryable<ProductInventory> data = objContext.ProductInventory
.Where(y => y.ProductId == productId)
.Where(y => y.StoreId == storeId)
.Join(objContext.Variants,
pi => pi.VariantId,
v => v.id,
(pi, v) => new { Inventory = pi, Variant = v })
.OrderBy(y => y.Variant.SortOrder)
.Skip(skip)
.Take(take)
.Select(x => x.Inventory);
2
var query = from inventory in objContext.ProductInventory
where inventory.ProductId == productId
where inventory.StoreId == storeId
join variant in objContext.Variants
on inventory.VariantId equals variant.id
orderby variant.SortOrder
select inventory;
var paged = query.Skip(skip).Take(take);
Kudos to Khumesh and Pravin for helping with this. Thanks to the rest for contributing.
Define the join in your mapping, and then use it. You really don't get anything by using the Join method - instead, use the Include method. It's much nicer.
var data = objContext.ProductInventory.Include("Variant")
.Where(i => i.ProductId == productId && i.StoreId == storeId)
.OrderBy(j => j.Variant.SortOrder)
.Skip(x)
.Take(y);
Add the following line to your query:
var pagedQuery = data.Skip(PageIndex * PageSize).Take(PageSize);
The data variable is an IQueryable, so you can call Skip and Take on it. And if you have a relationship between Product and Variant, you do not really need an explicit join; you can refer to the variant through the navigation property, something like this:
var data =
    from inventory in objContext.ProductInventory
    where inventory.ProductId == productId && inventory.StoreId == storeId
    orderby inventory.Variant.SortOrder
    select new
    {
        property1 = inventory.Variant.VariantId,
        //rest of the properties go here
    };
var pagedQuery = data.Skip(PageIndex * PageSize).Take(PageSize);
My answer builds on the accepted answer, with one addition to the code above:
var data = (from c in db.Categorie.AsQueryable().Join(db.CategoryMap,
    cat => cat.CategoryId, catmap => catmap.ChildCategoryId,
    (cat, catmap) => new { Category = cat, CategoryMap = catmap })
    select c.Category);
The idea is that AsQueryable() converts a generic System.Collections.Generic.IEnumerable<T> to a generic System.Linq.IQueryable<T>, so that the query is built as an expression tree at run time. Note that when the source already implements IQueryable<T>, AsQueryable() simply returns it unchanged.
Thank you, Mr. Khumesh Kumawat.
You would simply use your Skip(itemsInPage * pageNo).Take(itemsInPage) to do paging.
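Here is a minimal in-memory sketch of the join, sort, then Skip/Take pattern discussed in this thread. The arrays and the Sku property are made up for illustration, and pageIndex is assumed to be zero-based. Against Entity Framework the same operator chain would be translated to SQL rather than run in memory.

```csharp
using System;
using System.Linq;

// Hypothetical data standing in for the Variants and ProductInventory tables.
var variants = new[]
{
    new { id = 1, SortOrder = 3 },
    new { id = 2, SortOrder = 1 },
    new { id = 3, SortOrder = 2 },
    new { id = 4, SortOrder = 4 },
};
var inventory = new[]
{
    new { VariantId = 1, Sku = "A" },
    new { VariantId = 2, Sku = "B" },
    new { VariantId = 3, Sku = "C" },
    new { VariantId = 4, Sku = "D" },
};

int pageSize = 2;
int pageIndex = 1; // zero-based, so this is the second page

var page = inventory
    .Join(variants, i => i.VariantId, v => v.id,
        (i, v) => new { Inventory = i, Variant = v })
    .OrderBy(x => x.Variant.SortOrder)   // sort before paging
    .Skip(pageIndex * pageSize)
    .Take(pageSize)
    .Select(x => x.Inventory.Sku)
    .ToList();

Console.WriteLine(string.Join(", ", page));
```

Sorted by SortOrder the items are B, C, A, D, so the second page of two holds A and D. The ordering has to come before Skip and Take, otherwise the pages are not stable.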

When is OrderBy operator called?

1)
a)
var result1 = from artist in artists
from album in artist.Albums
orderby album.Title,artist.Name
select new { Artist_id = artist.id, Album_id = album.id };
Is the above query translated into:
var result = artists.SelectMany(p => p.albums
.Select(p1 => new { Artist = p, Album = p1 }))
.OrderBy(p2 => p2.Album.Title)
.ThenBy(p3 => p3.Artist.Name)
.Select(p4 => new { Artist_id = p4.Artist.id, Album_id = p4.Album.id });
b)
I'm not sure if this question will make much sense - If my assumptions are correct and thus OrderBy is always one of the last operators to get called ( when using query expression ), then how would we express the following code using query expression (in other words, how do we specify in a query expression that we want OrderBy operator to get called sooner and not as one of the last operators ):
var result = artists
.SelectMany(p1 => p1.albums
.OrderBy(p2=>p2.title)
.Select(p3 => new { ID = p3.id, Title = p3.title }));
2) In the following query expression, do the two orderby clauses get translated into OrderBy(... artist.Name).OrderBy( ... album.Title)?
var result1 = from artist in artists
from album in artist.Albums
orderby artist.Name
orderby album.Title
select new { ...};
thank you
For question 1: orderby gets called wherever you show it. Your query isn't quite equivalent to what you showed, but it's close. It doesn't help that you formatted it so that it looks like the Select is called on the result of SelectMany, when it's actually on the arguments to SelectMany. Your query is translated to something more like:
var result = artists
.SelectMany(artist => artist.albums, (artist, album) => new {artist, album})
.OrderBy(z => z.album.Title)
.ThenBy(z => z.artist.Name)
.Select(z => new { Artist_id = z.artist.id, Album_id = z.album.id });
Question 1b) Your query is roughly equivalent to:
var result = from p1 in artists
from p3 in (from p2 in p1.albums
orderby p2.title
select new { ID = p2.id, Title = p2.title })
select p3;
It's only a rough translation as nothing in query expressions is converted to that overload of SelectMany, as far as I can remember. On the other hand, it could be that this does what you want in a slightly simpler way:
var result = from p1 in artists
from p3 in p1.albums.OrderBy(p2 => p2.title)
select new { ID = p3.id, Title = p3.title };
You'll still get the ordering within the artist. It's a mixture of query expression and "dot notation", but it looks good to me. Odd that you're not using p1 in the final result, mind you...
For question 2, using two orderby clauses you do indeed get two OrderBy calls, which is almost certainly not what you want. You want:
var result1 = from artist in artists
from album in artist.Albums
orderby artist.Name, album.Title
select new { ...};
That gets translated into the appropriate OrderBy(...).ThenBy(...) calls.
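To see the difference between one orderby with two keys and two separate orderby clauses, here is a small LINQ to Objects sketch. The three rows are made-up data, flattened into artist/title pairs for brevity.

```csharp
using System;
using System.Linq;

// Hypothetical data standing in for the artist/album pairs in the question.
var rows = new[]
{
    new { Artist = "B", Title = "x" },
    new { Artist = "A", Title = "y" },
    new { Artist = "A", Title = "x" },
};

// One orderby with two keys: OrderBy(Artist).ThenBy(Title).
var combined = (from r in rows
                orderby r.Artist, r.Title
                select r.Artist + "/" + r.Title).ToList();

// Two orderby clauses: OrderBy(Artist).OrderBy(Title).
// The second sort re-sorts everything, so Title becomes the primary key.
var separate = (from r in rows
                orderby r.Artist
                orderby r.Title
                select r.Artist + "/" + r.Title).ToList();

Console.WriteLine(string.Join(", ", combined)); // A/x, A/y, B/x
Console.WriteLine(string.Join(", ", separate)); // A/x, B/x, A/y
```

In LINQ to Objects the sort is stable, so with two clauses the first ordering only survives as a tie-breaker for equal titles. Other providers may not even guarantee that.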

LINQ count in group from join

I have a LINQ statement I am trying to get right, so maybe I'm going about this all wrong. My objective is to query a table and join in another table to get counts.
Places
ID, Display
ProfilePlaces
ID, PlaceID, Talk, Hear
Basically, Places have ProfilePlaces in a one-to-many relationship. I want to get the number of ProfilePlaces that have Talk and Hear set. Talk and Hear are bit fields.
The following gives me a unique list of Places, so I need to add in the Talk and Hear counts.
var counts = from p in db.Places
join pp in db.ProfilePlaces on p.ID equals pp.PlaceID
group new { Place = p } by p.Display;
I thought something like this, but not having any luck
var counts = from p in db.Places
join pp in db.ProfilePlaces on p.ID equals pp.PlaceID
group new { Place = p,
Talk = pp.Count(t => t.Talk == true),
Hear = pp.Count(t => t.Hear == true)
} by p.Display;
Thanks for any help.
You want to do a GROUP JOIN to get the counts for each Place.
var counts2 = from p in places
join pp in profilePlaces on p.ID equals pp.PlaceID into g
select new
{
Place = p,
CountHear = g.Count(a => a.Hear),
CountTalk = g.Count(a => a.Talk)
};
Here's the documentation on the different joins from MSDN:
http://msdn.microsoft.com/en-us/library/bb311040.aspx
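A runnable in-memory sketch of the group-join count, using made-up rows shaped like the Places and ProfilePlaces tables in the question. Note that a place with no ProfilePlaces still appears, with counts of zero.

```csharp
using System;
using System.Linq;

// Hypothetical rows; "Library" has no ProfilePlaces.
var places = new[]
{
    new { ID = 1, Display = "Cafe" },
    new { ID = 2, Display = "Library" },
};
var profilePlaces = new[]
{
    new { ID = 1, PlaceID = 1, Talk = true, Hear = false },
    new { ID = 2, PlaceID = 1, Talk = true, Hear = true },
};

// Group join: g holds the ProfilePlaces rows for each place.
var counts =
    (from p in places
     join pp in profilePlaces on p.ID equals pp.PlaceID into g
     select new
     {
         p.Display,
         CountTalk = g.Count(a => a.Talk),
         CountHear = g.Count(a => a.Hear)
     }).ToList();

foreach (var c in counts)
    Console.WriteLine($"{c.Display}: talk={c.CountTalk}, hear={c.CountHear}");
```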

conditional include in linq to entities?

I felt like the following should be possible I'm just not sure what approach to take.
What I'd like to do is use the include method to shape my results, ie define how far along the object graph to traverse. but... I'd like that traversal to be conditional.
something like...
dealerships
.include( d => d.parts.where(p => p.price < 100.00))
.include( d => d.parts.suppliers.where(s => s.country == "brazil"));
I understand that this is not valid linq, in fact, that it is horribly wrong, but essentially I'm looking for some way to build an expression tree that will return shaped results, equivalent to...
select *
from dealerships as d
outer join parts as p on d.dealerid = p.dealerid
and p.price < 100.00
outer join suppliers as s on p.partid = s.partid
and s.country = 'brazil'
with an emphasis on the join conditions.
I feel like this would be fairly straight forward with esql but my preference would be to build expression trees on the fly.
as always, grateful for any advice or guidance
This should do the trick:
using (TestEntities db = new TestEntities())
{
var query = from d in db.Dealership
select new
{
Dealer = d,
Parts = d.Part.Where
(
p => p.Price < 100.0
&& p.Supplier.Country == "Brazil"
),
Suppliers = d.Part.Select(p => p.Supplier)
};
var dealers = query.ToArray().Select(o => o.Dealer);
foreach (var dealer in dealers)
{
Console.WriteLine(dealer.Name);
foreach (var part in dealer.Part)
{
Console.WriteLine(" " + part.PartId + ", " + part.Price);
Console.WriteLine
(
" "
+ part.Supplier.Name
+ ", "
+ part.Supplier.Country
);
}
}
}
This code will give you a list of Dealerships each containing a filtered list of parts. Each part references a Supplier. The interesting part is that you have to create the anonymous types in the select in the way shown. Otherwise the Part property of the Dealership objects will be empty.
Also, you have to execute the SQL statement before selecting the dealers from the query. Otherwise the Part property of the dealers will again be empty. That is why I put the ToArray() call in the following line:
var dealers = query.ToArray().Select(o => o.Dealer);
But I agree with Darren that this may not be what the users of your library are expecting.
Are you sure this is what you want? The only reason I ask is, once you add the filter on Parts off of Dealerships, your results are no longer Dealerships. You're dealing in special objects that are, for the most part, very close to Dealerships (with the same properties), but the meaning of the "Parts" property is different. Instead of being a relationship between Dealerships and Parts, it's a filtered relationship.
Or to put it another way, if I pull a dealership out of your results and passed to a method I wrote, and then in my method I call:
var count = dealership.Parts.Count();
I'm expecting to get the parts, not the filtered parts from Brazil where the price is less than $100.
If you don't use the dealership object to pass the filtered data, it becomes very easy. It becomes as simple as:
var query = from d in dealerships
            select new { DealershipName = d.Name,
                         CheapBrazilProducts = d.parts.Where(p => p.price < 100.00 && p.suppliers.Any(s => s.country == "brazil")) };
If I just had to get the filtered sets like you asked, I'd probably use the technique I mentioned above, and then use a tool like Automapper to copy the filtered results from my anonymous class to the real class. It's not incredibly elegant, but it should work.
I hope that helps! It was an interesting problem.
I know this can work with a single Include. I never tested it with two Includes, but it's worth a try:
dealerships
.Include( d => d.parts)
.Include( d => d.parts.suppliers)
.Where(d => d.parts.All(p => p.price < 100.00 && p.suppliers.All(s => s.country == "brazil")))
Am I missing something, or aren't you just looking for the Any keyword?
var query = dealerships.Where(d => d.parts.Any(p => p.price < 100.00) ||
    d.parts.Any(p => p.suppliers.Any(s => s.country == "brazil")));
Yes, that's what I wanted to do. I think the next release of Data Services will make it possible to do just that with LINQ to REST queries, which would be great. In the meantime I switched to loading the inverse and including the related entity. It will be loaded multiple times, but in theory it only has to load once in the first Include, as in this code:
return this.Context.SearchHistories.Include("Handle")
.Where(sh => sh.SearchTerm.Contains(searchTerm) && sh.Timestamp > minDate && sh.Timestamp < maxDate);
Before that, I tried to load, for any Handle, the SearchHistories that matched the logic, but I don't know how to do that using the Include logic you posted. So in the meantime I think a reverse lookup is a not-so-dirty solution.

Resources