Aggregation in LINQ - linq

I am new to LINQ and am stuck on a very silly problem. I have this data:
Name  Subjects       Role
----  -------------  -------
A     Math           Student
A     English        Student
B     Math           Student
B     English        Student
C     Math           Student
C     Math           Admin
I need the result as:
Name  Subjects       Role
----  -------------  -------
A     Math, English  Student
B     Math, English  Student
C     Math           Student
C     Math           Admin
I am confused as to how to go about this problem. This is simple in SQL, where I can use a GROUP BY clause and get the comma-separated values via a function.
Can someone please help me out?
Edited: The three columns are from 3 different sources. I have updated the resultant table. Thanks for your help in advance!

I don't have your code but it should look like this:
var grouped = from element in yourList
              group element by element.Name into g
              select new
              {
                  Name = g.Key,
                  Subjects = g.Select(e => e.Subject),
                  // Assuming they are identical when they have the same name
                  Role = g.First().Role
              };

Try this:
var grouped = classes
    .GroupBy(g => new { g.Name, g.Role })
    .Select(s => new
    {
        // The key is the (Name, Role) pair, so read each part from it
        Name = s.Key.Name,
        // Builds ", Math, English"; the leading ", " is trimmed below
        Subjects = s.Select(x => x.Subject).Aggregate("", (current, se) => current + ", " + se),
        Role = s.Key.Role
    });
var result = grouped.Select(s => new
{
    s.Name,
    Subjects = s.Subjects.Substring(2),
    s.Role
}).ToList();
This will put your subjects in a comma separated string.
Hope this helps.
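As a possible alternative to Aggregate plus Substring, string.Join builds the comma-separated string directly. The following is only a sketch on in-memory data: the tuple array `rows` is a stand-in for whatever type the three sources are combined into, and is not from the original posts.

```csharp
using System;
using System.Linq;

// Sample rows mirroring the question's table (tuples stand in for the real type).
var rows = new[]
{
    (Name: "A", Subject: "Math",    Role: "Student"),
    (Name: "A", Subject: "English", Role: "Student"),
    (Name: "B", Subject: "Math",    Role: "Student"),
    (Name: "B", Subject: "English", Role: "Student"),
    (Name: "C", Subject: "Math",    Role: "Student"),
    (Name: "C", Subject: "Math",    Role: "Admin"),
};

// Group on the (Name, Role) pair so that C's two roles stay on separate rows,
// then join each group's subjects into one comma-separated string.
var result = rows
    .GroupBy(r => new { r.Name, r.Role })
    .Select(g => new
    {
        g.Key.Name,
        Subjects = string.Join(", ", g.Select(r => r.Subject)),
        g.Key.Role
    })
    .ToList();

foreach (var r in result)
    Console.WriteLine($"{r.Name}  {r.Subjects}  {r.Role}");
```

With the sample rows this yields four result rows, matching the table the question asks for.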

Related

Linq: Count number of times a sub list appear in another list

I guess there must be an easy way, but I am not finding it. I would like to check whether a list of items appears (completely or partially) in another list.
For example: let's say I have the people in a department as List 1. Then I have a list of sports, each with a list of participants in that sport.
Now I want to count in how many sports all the people of a department appear.
(I know some tables might not make sense when looking at it from a normalisation angle, but it is easier this way than to try and explain my real tables)
So I have something like this:
var peopleInDepartment = from d in Department_Members
                         group d by d.DepartmentID into g
                         select new
                         {
                             DepartmentID = g.Key,
                             TeamMembers = g.Select(r => r.PersonID).ToList()
                         };
var peopleInTeam = from s in Sports
                   select new
                   {
                       SportID = s.SportID,
                       PeopleInSport = s.Participants.Select(x => x.PersonID),
                       NoOfMatches = peopleInDepartment.Contains(s.Participants.Select(x => x.PersonID)).Count()
                   };
The error here is that peopleInDepartment does not contain a definition for 'Contains'. Think I'm just in need of a new angle to look at this.
As the end result I would like to print:
Department 1 : The Department participates in 3 sports
Department 2 : The Department participates in 0 sports
etc.
Judging from the expected result, you should base the query on the Department table, like the first query. Maybe just include the sports count in the first query, like so:
var peopleInDepartment =
    from d in Department_Members
    group d by d.DepartmentID into g
    select new
    {
        DepartmentID = g.Key,
        TeamMembers = g.Select(r => r.PersonID).ToList(),
        NumberOfSports = Sports.Count(s => s.Participants
                                            .Any(p => g.Select(r => r.PersonID)
                                                       .Contains(p.PersonID)))
    };
NumberOfSports should contain the count of sports in which any participant is listed as a member of the current department (g.Select(r => r.PersonID).Contains(p.PersonID)).
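The question mentions both partial and complete matches, so here is a rough in-memory sketch that counts both: sports with at least one department member (Any) and sports in which every department member takes part (All). The sample data and the names `departmentMembers`, `sports`, `AnyMember` and `AllMembers` are made up for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; the tuple shapes stand in for the real tables.
var departmentMembers = new[]
{
    (DepartmentID: 1, PersonID: 10),
    (DepartmentID: 1, PersonID: 11),
    (DepartmentID: 2, PersonID: 20),
};

var sports = new[]
{
    (SportID: 100, Participants: new[] { 10, 11, 30 }),
    (SportID: 101, Participants: new[] { 10 }),
    (SportID: 102, Participants: new[] { 30 }),
};

var perDepartment = departmentMembers
    .GroupBy(m => m.DepartmentID)
    .Select(g =>
    {
        var members = g.Select(m => m.PersonID).ToList();
        return new
        {
            DepartmentID = g.Key,
            // Sports with at least one participant from this department.
            AnyMember = sports.Count(s => s.Participants.Any(id => members.Contains(id))),
            // Sports in which every member of this department takes part.
            AllMembers = sports.Count(s => members.All(id => s.Participants.Contains(id)))
        };
    })
    .ToList();

foreach (var d in perDepartment)
    Console.WriteLine($"Department {d.DepartmentID} : The Department participates in {d.AnyMember} sports");
```

The statement lambda is fine for LINQ to Objects; whether a database provider can translate an equivalent query is a separate matter and is not covered by this sketch.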

Entity Framework Take N items of child collection

Say I have a Customer entity and a Sales entity in a one-to-many relationship.
How could I get all Customers with their N most recent sales?
var result = Customers.Where(c => c.Sales.Any());
This would return all customers with ALL their sales.
What if I want just 2 sales record from each customer?
P.S. I can do that with query syntax; I'm looking for a method syntax solution. I just can't figure out how to chain the calls together in method syntax.
var result = from cust in context.Customers
             select new
             {
                 Customers = cust,
                 Sales = cust.Sales.OrderBy(s => s.Date).Take(2)
             };
This works, but I'm not sure if this is the best way to do it.
EDIT:
OK, it turns out the query syntax that I included here does not work either.
Only the Sales in the anonymous type is effectively reduced to 2 records.
var filtered = result.AsEnumerable().Select(r => r.Customers);
Doing this will still result in a list of customers with ALL their sales.
You can use a projection:
var dbquery = Customers.Select(c => new
{
    Customer = c,
    // Descending, so that Take(2) keeps the two most recent sales
    Sales = c.Sales.OrderByDescending(s => s.Date)
                   .Take(2)
                   .Select(s => new { s, s.SalesDetails })
});
var customers = dbquery
    .AsEnumerable()
    .Select(c => c.Customer);
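The shape of that projection can be tried out in method syntax on in-memory data. This is only a sketch under assumptions: the tuple array `customers`, the constant `n` and the sample dates are invented, and nothing here exercises Entity Framework itself, so how the loaded entities end up related is not shown.

```csharp
using System;
using System.Linq;

// Hypothetical customers, each with the dates of their sales.
var customers = new[]
{
    (Name: "Ann", Sales: new[] { new DateTime(2020, 1, 5), new DateTime(2020, 3, 1), new DateTime(2020, 2, 10) }),
    (Name: "Bob", Sales: new[] { new DateTime(2021, 6, 1) }),
    (Name: "Cy",  Sales: new DateTime[0]),
};

const int n = 2;

var recent = customers
    .Where(c => c.Sales.Any())                 // only customers that have sales
    .Select(c => new
    {
        Customer = c.Name,
        // Newest first, then keep at most n entries.
        RecentSales = c.Sales.OrderByDescending(d => d).Take(n).ToList()
    })
    .ToList();

foreach (var r in recent)
    Console.WriteLine($"{r.Customer}: {string.Join(", ", r.RecentSales.Select(d => d.ToString("yyyy-MM-dd")))}");
```

Ordering descending before Take is what makes the result "most recent"; an ascending sort would keep the oldest sales instead.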

When is OrderBy operator called?

1)
a)
var result1 = from artist in artists
              from album in artist.Albums
              orderby album.Title, artist.Name
              select new { Artist_id = artist.id, Album_id = album.id };
Is the above query translated into:
var result = artists.SelectMany(p => p.albums
    .Select(p1 => new { Artist = p, Album = p1 }))
    .OrderBy(p2 => p2.Album.Title)
    .ThenBy(p3 => p3.Artist.Name)
    .Select(p4 => new { Artist_id = p4.Artist.id, Album_id = p4.Album.id });
b)
I'm not sure if this question will make much sense - If my assumptions are correct and thus OrderBy is always one of the last operators to get called ( when using query expression ), then how would we express the following code using query expression (in other words, how do we specify in a query expression that we want OrderBy operator to get called sooner and not as one of the last operators ):
var result = artists
    .SelectMany(p1 => p1.albums
        .OrderBy(p2 => p2.title)
        .Select(p3 => new { ID = p3.id, Title = p3.title }));
2) In the following query expression, do the two orderby clauses get translated into OrderBy(... artist.Name).OrderBy( ... album.Title):
var result1 = from artist in artists
              from album in artist.Albums
              orderby artist.Name
              orderby album.Title
              select new { ...};
thank you
For question 1: orderby gets called wherever you show it. Your query isn't quite equivalent to what you showed, but it's close. It doesn't help that you formatted it so that it looks like the Select is called on the result of SelectMany, when it's actually on the arguments to SelectMany. Your query is translated to something more like:
var result = artists
    .SelectMany(artist => artist.albums, (artist, album) => new { artist, album })
    .OrderBy(z => z.album.Title)
    .ThenBy(z => z.artist.Name)
    .Select(z => new { Artist_id = z.artist.id, Album_id = z.album.id });
Question 1b) Your query is roughly equivalent to:
var result = from p1 in artists
             from p3 in (from p2 in p1.albums
                         orderby p2.title
                         select new { ID = p2.id, Title = p2.title })
             select p3;
It's only a rough translation as nothing in query expressions is converted to that overload of SelectMany, as far as I can remember. On the other hand, it could be that this does what you want in a slightly simpler way:
var result = from p1 in artists
             from p3 in p1.albums.OrderBy(p2 => p2.title)
             select new { ID = p3.id, Title = p3.title };
You'll still get the ordering within the artist. It's a mixture of query expression and "dot notation", but it looks good to me. Odd that you're not using p1 in the final result, mind you...
For question 2, using two orderby clauses you do indeed get two OrderBy calls, which is almost certainly not what you want. You want:
var result1 = from artist in artists
              from album in artist.Albums
              orderby artist.Name, album.Title
              select new { ...};
That gets translated into the appropriate OrderBy(...).ThenBy(...) calls.
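To see the difference between one orderby clause with two keys and two separate orderby clauses, here is a small sketch on made-up data (the `albums` array and its values are illustrative only). Because LINQ to Objects sorts are stable, the second OrderBy re-sorts the whole sequence and leaves the first key as a mere tie-breaker.

```csharp
using System;
using System.Linq;

// Hypothetical (artist, album title) pairs.
var albums = new[]
{
    (Artist: "Zed", Title: "Alpha"),
    (Artist: "Amy", Title: "Gamma"),
    (Artist: "Amy", Title: "Beta"),
};

// One clause, two keys: OrderBy(Artist).ThenBy(Title).
var oneClause = (from a in albums
                 orderby a.Artist, a.Title
                 select a.Artist + "/" + a.Title).ToList();

// Two clauses: OrderBy(Artist).OrderBy(Title), so Title becomes the primary key.
var twoClauses = (from a in albums
                  orderby a.Artist
                  orderby a.Title
                  select a.Artist + "/" + a.Title).ToList();

Console.WriteLine(string.Join(", ", oneClause));   // Amy/Beta, Amy/Gamma, Zed/Alpha
Console.WriteLine(string.Join(", ", twoClauses));  // Zed/Alpha, Amy/Beta, Amy/Gamma
```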

What's the LINQ to select the latest item from a number of versioned items?

I've got a class like the following:
public class Invoice
{
    public int InvoiceId { get; set; }
    public int VersionId { get; set; }
}
Each time an Invoice is modified, the VersionId gets incremented, but the InvoiceId remains the same. So given an IEnumerable<Invoice> which has the following results:
InvoiceId  VersionId
1          1
1          2
1          3
2          1
2          2
How can I get just the results:
InvoiceId  VersionId
1          3
2          2
I.e. I want just the Invoices from the results which have the latest VersionId. I can easily do this in T-SQL, but cannot for the life of me work out the correct LINQ syntax. I'm using Entity Framework 4 Code First.
Order by the VersionId, group them by InvoiceId, then take the first result of each group. Try this:
var query = list.OrderByDescending(i => i.VersionId)
                .GroupBy(i => i.InvoiceId)
                .Select(g => g.First());
EDIT: how about this approach using Max?
var query = list.GroupBy(i => i.InvoiceId)
                .Select(g => g.Single(i => i.VersionId == g.Max(o => o.VersionId)));
Try using FirstOrDefault or SingleOrDefault in place of Single as well... it would give the same result although Single shows the intention better.
EDIT: I've tested both these queries with LINQ to Entities. They seem to work, so perhaps the issue is something else?
Option 1:
var latestInvoices = invoices.GroupBy(i => i.InvoiceId)
                             .Select(group => group.OrderByDescending(i => i.VersionId)
                                                   .FirstOrDefault());
EDIT: Changed 'Last' to 'FirstOrDefault', LINQ to Entities has issues with the 'Last' query operator.
Option 2:
var invoices = from invoice in dc.Invoices
               group invoice by invoice.InvoiceId into invoiceGroup
               let maxVersion = invoiceGroup.Max(i => i.VersionId)
               from candidate in invoiceGroup
               where candidate.VersionId == maxVersion
               select candidate;
My version:
var h = from i in Invoices
        group i.VersionId by i.InvoiceId into grouping
        select new { InvoiceId = grouping.Key, VersionId = grouping.Max() };
Update
As was mentioned by Ahmad in the comments, the above query will return a projection. The version below will return an IQueryable<Invoice>. I use composition to build the query because I think it is clearer.
var maxVersions = from i in Invoices
                  group i.VersionId by i.InvoiceId into grouping
                  select new { InvoiceId = grouping.Key,
                               VersionId = grouping.Max() };
var latestInvoices = from i in Invoices
                     join m in maxVersions
                       on new { i.InvoiceId, i.VersionId } equals
                          new { m.InvoiceId, m.VersionId }
                     select i;
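For a quick check against the sample data in the question, the group-then-pick-highest idea can be run on in-memory tuples. This is a LINQ to Objects sketch only; the `invoices` array is a stand-in for the real Invoice collection, and it says nothing about how Entity Framework translates the query.

```csharp
using System;
using System.Linq;

// The five rows from the question, as (InvoiceId, VersionId) tuples.
var invoices = new[]
{
    (InvoiceId: 1, VersionId: 1),
    (InvoiceId: 1, VersionId: 2),
    (InvoiceId: 1, VersionId: 3),
    (InvoiceId: 2, VersionId: 1),
    (InvoiceId: 2, VersionId: 2),
};

// One group per invoice; within each group keep the highest version.
var latest = invoices
    .GroupBy(i => i.InvoiceId)
    .Select(g => g.OrderByDescending(i => i.VersionId).First())
    .ToList();

foreach (var i in latest)
    Console.WriteLine($"{i.InvoiceId} {i.VersionId}");
```

This prints the two rows the question asks for: invoice 1 at version 3 and invoice 2 at version 2.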

How to Update previous row column based on the current row column data using LinQ

var customer = from cust in customerData
               select new Customer
               {
                   CustomerID = cust["id"],
                   Name = cust["Name"],
                   LastVisit = cust["visit"],
                   PurchasedAmount = cust["amount"],
                   Tagged = cust["tagged"],
                   Code = cust["code"]
               };
The rows look like this:
Name    LastVisit  PurchasedAmount  Tagged  Code      CustomerID
------  ---------  ---------------  ------  --------  ----------
Joshua  07-Jan-09                   Yes     chiJan01  A001
Joshua             10000
The 2nd row belongs to the first row; its other columns are just empty. How can I merge the PurchasedAmount into the first row using LINQ?
This is probably a more general solution than you need - it will work even if the other values are scattered across rows. The main condition is that the Name column should identify rows that belong together.
customer = from c in customer
           group c by c.Name into g
           select new Customer
           {
               Name = g.Key,
               // For each column, take the first non-empty value within the group
               LastVisit = g.Select(te => te.LastVisit)
                            .Where(lv => lv.HasValue).FirstOrDefault(),
               PurchasedAmount = g.Select(te => te.PurchasedAmount)
                                  .Where(pa => pa.HasValue).FirstOrDefault(),
               Tagged = g.Select(te => te.Tagged)
                         .Where(ta => ta.HasValue).FirstOrDefault(),
               Code = g.Select(te => te.Code)
                       .Where(co => !string.IsNullOrEmpty(co)).FirstOrDefault(),
               CustomerID = g.Select(te => te.CustomerID)
                             .Where(cid => !string.IsNullOrEmpty(cid)).FirstOrDefault()
           };
This will return a new IEnumerable with the items grouped by Name and the non-null values selected (same effect as moving PurchasedAmount to the first row and deleting the second in your case).
Note that the code is based on the assumption that LastVisit, PurchasedAmount and Tagged are nullable types (DateTime?, int? and bool?), hence the use of HasValue. If, however, they are strings in your case, you have to use !string.IsNullOrEmpty() instead (as for Code and CustomerID).
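Here is a minimal in-memory sketch of the same merge, assuming LastVisit is a DateTime?, PurchasedAmount is an int? and Code is a string. The tuple array `rows`, its values, and the reading of "07-Jan-09" as 7 January 2009 are assumptions made for illustration.

```csharp
using System;
using System.Linq;

// Two sparse rows for the same customer, as in the question's table.
var rows = new[]
{
    (Name: "Joshua", LastVisit: (DateTime?)new DateTime(2009, 1, 7), PurchasedAmount: (int?)null,  Code: "chiJan01"),
    (Name: "Joshua", LastVisit: (DateTime?)null,                     PurchasedAmount: (int?)10000, Code: ""),
};

// Group by Name and, per column, keep the first value that is actually filled in.
var merged = rows
    .GroupBy(r => r.Name)
    .Select(g => new
    {
        Name = g.Key,
        LastVisit = g.Select(r => r.LastVisit).FirstOrDefault(v => v.HasValue),
        PurchasedAmount = g.Select(r => r.PurchasedAmount).FirstOrDefault(v => v.HasValue),
        Code = g.Select(r => r.Code).FirstOrDefault(c => !string.IsNullOrEmpty(c))
    })
    .ToList();

foreach (var m in merged)
    Console.WriteLine($"{m.Name} {m.LastVisit:dd-MMM-yy} {m.PurchasedAmount} {m.Code}");
```

FirstOrDefault with a predicate is equivalent to the Where(...).FirstOrDefault() pairs used in the answer; if no row in the group has a value, the column comes back as null.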
