Group by and joining tables in LINQ to SQL

I have the following 3 classes (mapped to SQL tables).
Places table:
Name(key)
Address
Capacity
Events table:
Name(key)
Date
Place
Orders table:
Id(key)
EventName
Qty
The Places and Events tables are connected through Places.Name = Events.Place, while the Events and Orders tables are connected through Events.Name = Orders.EventName.
The task: given an event, return the number of tickets left for it. Capacity is the number of people a place can hold, and Qty is the number of tickets in a single order. So some sort of grouping on the Orders table is needed, and the summed Qty then has to be subtracted from the capacity.

Something like this (C# code sample below)?
Sorry for the weird variable names, but event is a keyword :)
I didn't use Visual Studio, so I hope the syntax is correct.
string eventName = "Event";
var theEvent = Events.FirstOrDefault(ev => ev.Name == eventName);
// Sum Qty rather than counting rows: one order can cover several tickets.
// The nullable cast keeps the total at 0 when the event has no orders yet.
int ticketsOrdered = Orders.Where(or => or.EventName == eventName)
    .Sum(or => (int?)or.Qty) ?? 0;
var thePlace = Places.FirstOrDefault(pl => pl.Name == theEvent.Place);
int ticketsLeft = thePlace.Capacity - ticketsOrdered;
If the Event has multiple places, the last two lines would look like this:
int placesCapacity = Places.Where(pl => pl.Name == theEvent.Place)
    .Sum(pl => pl.Capacity);
int ticketsLeft = placesCapacity - ticketsOrdered;
On a side note:
LINQ 101 is a great way to get familiar with LINQ: http://msdn.microsoft.com/en-us/vcsharp/aa336746
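The calculation described in the question (capacity minus the summed Qty of all orders for one event) can also be sketched as a single query that joins the three tables. The sketch below runs against in-memory lists, not a database: the Place, Event and Order classes and the TicketQueries helper are hypothetical stand-ins for the mapped tables. Against a real LINQ to SQL context, the Sum over an empty group may need a nullable cast such as (int?)x.Qty with ?? 0, so treat this as an illustration of the shape of the query rather than a drop-in solution.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for the three mapped tables.
public class Place { public string Name { get; set; } public string Address { get; set; } public int Capacity { get; set; } }
public class Event { public string Name { get; set; } public DateTime Date { get; set; } public string Place { get; set; } }
public class Order { public int Id { get; set; } public string EventName { get; set; } public int Qty { get; set; } }

public static class TicketQueries
{
    // Returns capacity minus the total Qty ordered, or null if the event is unknown.
    public static int? TicketsLeft(IEnumerable<Place> places, IEnumerable<Event> events,
                                   IEnumerable<Order> orders, string eventName)
    {
        var query =
            from ev in events
            where ev.Name == eventName
            join pl in places on ev.Place equals pl.Name
            // Group join: collects every order for the event, possibly none.
            join o in orders on ev.Name equals o.EventName into eventOrders
            select (int?)(pl.Capacity - eventOrders.Sum(x => x.Qty));
        return query.FirstOrDefault();
    }

    public static void Main()
    {
        var places = new List<Place> { new Place { Name = "Hall", Address = "1 Main St", Capacity = 100 } };
        var events = new List<Event> { new Event { Name = "Concert", Place = "Hall" } };
        var orders = new List<Order>
        {
            new Order { Id = 1, EventName = "Concert", Qty = 10 },
            new Order { Id = 2, EventName = "Concert", Qty = 20 }
        };
        Console.WriteLine(TicketsLeft(places, events, orders, "Concert")); // prints 70
    }
}
```

The group join (join ... into) is used so that an event without any orders still yields a row, with its full capacity as the result.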

Related

How to write SQL translateable linq code that groups by one property and returns distinct list

I want to change the code below so that it can be translated to SQL, because right now I get an exception.
Basically, I want the list of customers from a certain localisation. There can be more than one customer with the same CustomerNumber, in which case I want to take the one that was added most recently.
In other words: a distinct list of customers from a localisation, where the "distinct algorithm" resolves conflicts by taking the most recently added customer.
The code below works only when it runs client side. I could move GroupBy and Select after ToListAsync, but I want to avoid fetching unnecessary data from the database (the Include pulls in a list that is pretty big for every customer).
var someData = await DbContext.Set<Customer>()
.Where(o => o.Metadata.Localisation == localisation)
.Include(nameof(Customer.SomeLongList))
.GroupBy(x => x.CustomerNumber)
.Select(gr => gr.OrderByDescending(x => x.Metadata.DateAdded).FirstOrDefault())
.ToListAsync();
Short answer:
No way. GroupBy has a limitation: after grouping, only the Key and aggregation results can be selected. You are trying to select SomeLongList and the full Customer entity.
Best answer:
It can be done in SQL with the ROW_NUMBER window function, but without SomeLongList.
Workaround:
It is only a workaround because it is not efficient: the Customer set is queried twice, once to find the latest DateAdded per CustomerNumber and once to join back to the full rows.
// Apply the localisation filter from the question once and reuse it.
var customers = DbContext.Set<Customer>()
    .Where(c => c.Metadata.Localisation == localisation);
var groupingQuery =
    from c in customers
    group c by c.CustomerNumber into g
    select new
    {
        CustomerNumber = g.Key,
        DateAdded = g.Max(x => x.Metadata.DateAdded)
    };
var query =
    from c in customers.Include(x => x.SomeLongList)
    join g in groupingQuery
        on new { c.CustomerNumber, c.Metadata.DateAdded }
        equals new { g.CustomerNumber, g.DateAdded }
    select c;
var result = await query.ToListAsync();
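The workaround is easier to follow on a small in-memory example. The sketch below uses LINQ to Objects instead of Entity Framework, and the CustomerRow class and NewestPerKey helper are hypothetical names introduced only for illustration. It shows the same two steps: compute the latest date per key, then join back to the full rows on the composite key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened customer row; not the entity from the question.
public class CustomerRow
{
    public int Id { get; set; }
    public string CustomerNumber { get; set; }
    public DateTime DateAdded { get; set; }
}

public static class NewestPerKey
{
    public static List<CustomerRow> Latest(IEnumerable<CustomerRow> customers)
    {
        var rows = customers.ToList();

        // Step 1: latest DateAdded per CustomerNumber (key plus aggregate only).
        var newest =
            from c in rows
            group c by c.CustomerNumber into g
            select new { CustomerNumber = g.Key, DateAdded = g.Max(x => x.DateAdded) };

        // Step 2: join back to the full rows on the composite key.
        var query =
            from c in rows
            join n in newest
                on new { c.CustomerNumber, c.DateAdded }
                equals new { n.CustomerNumber, n.DateAdded }
            select c;

        return query.ToList();
    }

    public static void Main()
    {
        var rows = new List<CustomerRow>
        {
            new CustomerRow { Id = 1, CustomerNumber = "A", DateAdded = new DateTime(2020, 1, 1) },
            new CustomerRow { Id = 2, CustomerNumber = "A", DateAdded = new DateTime(2021, 1, 1) },
            new CustomerRow { Id = 3, CustomerNumber = "B", DateAdded = new DateTime(2019, 1, 1) }
        };
        foreach (var c in Latest(rows))
            Console.WriteLine(c.Id); // prints 2, then 3
    }
}
```

One caveat of this pattern: if two rows share both the key and the maximum date, both are returned, so the result is only distinct when the date is unique within each key.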

How to get top 10 customers based on values of orders

I have a list of Customers who each have a list of Orders. Each Order has a list of LineItems.
I would like to write a LINQ query that would get me the top 10 customers based on order value (i.e. money spent) and not the total number of orders.
One customer could have 2 orders but could have spent £10,000, but another customer could have 100 orders, and only spent £500.
Right now, I have this which gets me the top 10 customers by the number of orders.
var customers = (from c in _context.Customers where c.SaleOrders.Count > 0
let activeCount = c.SaleOrders.Count(so => so.Status != SaleOrderStatus.Cancelled)
orderby activeCount descending
select c).Take(10);
UPDATE
Thanks to Jon Skeet's comment about doing a double Sum, I wrote the following query which compiles.
var customers = (from c in _context.Customers where c.SaleOrders.Count > 0
let orderSum = c.SaleOrders.Where(so => so.Status != SaleOrderStatus.Cancelled)
.Sum(so => so.LineItems.Sum(li => li.CalculateTotal()))
orderby orderSum descending
select c).Take(10);
But when I run this, I get the following error:
It seems LINQ doesn't recognise my .CalculateTotal() method, which sits on my LineItem.cs entity.
The problem you are seeing is that CalculateTotal() is not something that LINQ can translate into SQL. The translation happens at run time, hence no compiler error.
The essential point is that LINQ providers don't really work on lambdas (Func<>) but on expressions (Expression<Func<>>): code in a partially compiled state, which the provider then disassembles and translates into SQL.
So, let's assume CalculateTotal is a member function defined like this:
public decimal CalculateTotal()
{
return this.quantity * this.value;
}
We could define that as a local lambda function
Func<LineItem, decimal> CalculateTotal = (li => li.quantity * li.value);
Now, we have a lambda which takes a LineItem and returns a value, which is exactly what Sum() wants, so now we can replace:
.Sum(so => so.LineItems.Sum(li => li.CalculateTotal()))
with
.Sum(so => so.LineItems.Sum(CalculateTotal))
But that will crash, just as it did before, because, as I said, it wants an Expression. So, we give it one:
Expression<Func<LineItem, decimal>> CalculateTotal = (li => li.quantity * li.value);
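The difference between the two declarations can be made visible without a database. In the sketch below, the LineItem class with Quantity and Value properties and the ExpressionDemo helper are hypothetical names chosen for illustration. The Func is an opaque compiled delegate, while the Expression is a data structure whose nodes can be inspected, which is what a query provider relies on when it builds SQL.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity with the two fields used in the answer.
public class LineItem
{
    public int Quantity { get; set; }
    public decimal Value { get; set; }
}

public static class ExpressionDemo
{
    // A compiled delegate: it can only be invoked, not inspected.
    public static readonly Func<LineItem, decimal> TotalFunc =
        li => li.Quantity * li.Value;

    // An expression tree: data that describes the same computation.
    public static readonly Expression<Func<LineItem, decimal>> TotalExpr =
        li => li.Quantity * li.Value;

    public static void Main()
    {
        var item = new LineItem { Quantity = 3, Value = 2.5m };

        // Invoking the delegate runs the code directly.
        Console.WriteLine(TotalFunc(item));

        // The tree exposes its structure: the body is a multiplication node.
        Console.WriteLine(TotalExpr.Body.NodeType); // prints Multiply

        // A tree can still be turned into a delegate when needed.
        Console.WriteLine(TotalExpr.Compile()(item));
    }
}
```

The lambda text is identical in both declarations; only the target type decides whether the compiler emits executable code or a tree.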

Linq: Count number of times a sub list appear in another list

I guess there must be an easy way, but I'm not finding it. I would like to check whether a list of items appears (completely or partially) in another list.
For example: let's say I have the people in a department as list 1. Then I have a list of sports, each with a list of participants in that sport.
Now I want to count in how many sports all the people of a department appear.
(I know some tables might not make sense from a normalisation angle, but it is easier this way than trying to explain my real tables.)
So I have something like this:
var peopleInDepartment = from d in Department_Members
                         group d by d.DepartmentID into g
                         select new
                         {
                             DepartmentID = g.Key,
                             TeamMembers = g.Select(r => r.PersonID).ToList()
                         };
var peopleInTeam = from s in Sports
select new
{
SportID = s.SportID,
PeopleInSport = s.Participants.Select(x => x.PersonID),
NoOfMatches = peopleInDepartment.Contains(s.Participants.Select(x => x.PersonID)).Count()
};
The error here is that peopleInDepartment does not contain a definition for 'Contains'. I think I just need a new angle to look at this.
As the end result I would like to print:
Department 1 : The Department participates in 3 sports
Department 2 : The Department participates in 0 sports
etc.
Judging from the expected result, you should base the query on the Department table, like the first query. Maybe just include the sports count in the first query, like so:
var peopleInDepartment =
    from d in Department_Members
    group d by d.DepartmentID into g
    select new
    {
        DepartmentID = g.Key,
        TeamMembers = g.Select(r => r.PersonID).ToList(),
        NumberOfSports = Sports.Count(s => s.Participants
            .Any(p => g.Select(r => r.PersonID)
                .Contains(p.PersonID)
            )
        )
    };
NumberOfSports should contain the count of sports in which any participant is listed as a member of the current department (g.Select(r => r.PersonID).Contains(p.PersonID)).
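Since the question mentions both complete and partial matches, the counting logic can be sketched in memory with a switch between the two. The DepartmentMember and Sport classes, the ParticipantIDs list and the DepartmentSports helper below are hypothetical simplifications of the tables in the question, and the code uses LINQ to Objects, so it says nothing about how a database provider would translate it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical simplified shapes of the two tables.
public class DepartmentMember
{
    public int DepartmentID { get; set; }
    public int PersonID { get; set; }
}

public class Sport
{
    public int SportID { get; set; }
    public List<int> ParticipantIDs { get; set; } = new List<int>();
}

public static class DepartmentSports
{
    // requireAll = true: every department member must take part in the sport.
    // requireAll = false: one participating member is enough.
    public static Dictionary<int, int> CountSports(
        IEnumerable<DepartmentMember> members, IEnumerable<Sport> sports, bool requireAll)
    {
        var sportList = sports.ToList();
        return members
            .GroupBy(m => m.DepartmentID)
            .ToDictionary(
                g => g.Key,
                g =>
                {
                    var people = new HashSet<int>(g.Select(m => m.PersonID));
                    return sportList.Count(s => requireAll
                        ? people.All(p => s.ParticipantIDs.Contains(p))
                        : s.ParticipantIDs.Any(p => people.Contains(p)));
                });
    }

    public static void Main()
    {
        var members = new List<DepartmentMember>
        {
            new DepartmentMember { DepartmentID = 1, PersonID = 1 },
            new DepartmentMember { DepartmentID = 1, PersonID = 2 },
            new DepartmentMember { DepartmentID = 2, PersonID = 3 }
        };
        var sports = new List<Sport>
        {
            new Sport { SportID = 1, ParticipantIDs = new List<int> { 1, 2, 5 } },
            new Sport { SportID = 2, ParticipantIDs = new List<int> { 1 } },
            new Sport { SportID = 3, ParticipantIDs = new List<int> { 4 } }
        };
        foreach (var kv in CountSports(members, sports, requireAll: false))
            Console.WriteLine("Department " + kv.Key + " : The Department participates in " + kv.Value + " sports");
    }
}
```

With requireAll set to false this matches the Any-based query in the answer; setting it to true gives the stricter reading in which all members of the department must appear in the sport.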

Linq data structuring

I have two issues I'm struggling with in LINQ, and I would appreciate your advice. I have two lists: rawStates, storing rows of entity, downtime, uptime and event type, and rawData, storing the products' in and out times for each entity.
I want to select those elements from rawStates that occurred while a product was still waiting to be processed at that entity:
foreach (var t in rawData)
{
    var s = rawStates
        // I am not sure if this single logic clause in Where is enough
        .Where(o => o.Entity == t.Entity
                 && o.DownDate > t.InTime
                 && o.Update < t.OutTime)
        .ToList();
}
If I group my rawData by ProductID (there are multiple rows with the same ProductID), how can I map this "s" back to those groups, so that for each ProductID I can group by event type and summarise the durations?
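One possible way to approach this, offered only as a sketch, is to avoid the foreach and pair each rawData row with its matching rawStates rows inside a single query, then group the pairs by product and event type. The DataRow, StateRow and DowntimeSummary classes and the DowntimeReport helper below are hypothetical, the property names are guessed from the snippet above, and the Where condition is copied unchanged from the question, so it only counts states that lie completely inside the product's time window.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row types; property names follow the question's snippet.
public class DataRow
{
    public string ProductID { get; set; }
    public string Entity { get; set; }
    public DateTime InTime { get; set; }
    public DateTime OutTime { get; set; }
}

public class StateRow
{
    public string Entity { get; set; }
    public DateTime DownDate { get; set; }
    public DateTime Update { get; set; }
    public string EventType { get; set; }
}

public class DowntimeSummary
{
    public string ProductID { get; set; }
    public string EventType { get; set; }
    public double TotalMinutes { get; set; }
}

public static class DowntimeReport
{
    public static List<DowntimeSummary> Summarise(
        IEnumerable<DataRow> rawData, IEnumerable<StateRow> rawStates)
    {
        var query =
            from t in rawData
            from s in rawStates
            // Same condition as in the question: state fully inside the window.
            where s.Entity == t.Entity && s.DownDate > t.InTime && s.Update < t.OutTime
            group s by new { t.ProductID, s.EventType } into g
            select new DowntimeSummary
            {
                ProductID = g.Key.ProductID,
                EventType = g.Key.EventType,
                TotalMinutes = g.Sum(x => (x.Update - x.DownDate).TotalMinutes)
            };
        return query.ToList();
    }
}
```

Because the product id travels with each matched state into the grouping key, there is no need to map the per-row lists back to product groups afterwards.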

Speed up LINQ query - EF5

I have the following LINQ query, using EF5 with the generic repository and unit of work patterns, against a SQL Server 2008 database:
var countriesArr = GetIdsFromDelimStr(countries);
var competitionsArr = GetIdsFromDelimStr(competitions);
var filterTeamName = string.Empty;
if (teamName != null)
{
filterTeamName = teamName.ToUpper();
}
using (var unitOfWork = new FootballUnitOfWork(ConnFooty))
{
// give us our selection of teams
var teams =
(from team in
unitOfWork.TeamRepository.Find()
where ((string.IsNullOrEmpty(filterTeamName) || team.Name.ToUpper().Contains(filterTeamName)) &&
(countriesArr.Contains(team.Venue.Country.Id) || countriesArr.Count() == 0))
select new
{
tId = team.Id
}).Distinct();
// give us our selection of contests
var conts = (
from cont in
unitOfWork.ContestRepository.Find(
c =>
((c.ContestType == ContestType.League && competitionsArr.Count() == 0) ||
(competitionsArr.Contains(c.Competition.Id) && competitionsArr.Count() == 0)))
select new
{
contId = cont.Id
}
).Distinct();
// get selection of home teams based on contest
var homecomps = (from fixt in unitOfWork.FixtureDetailsRepository.Find()
where
teams.Any(t => t.tId == fixt.HomeTeam.Id) &&
conts.Any(c => c.contId == fixt.Contest.Id)
select new
{
teamId = fixt.HomeTeam.Id,
teamName = fixt.HomeTeam.Name,
countryId = fixt.HomeTeam.Venue.Country.Id != null ? fixt.HomeTeam.Venue.Country.Id : 0,
countryName = fixt.HomeTeam.Venue.Country.Id != null ? fixt.HomeTeam.Venue.Country.Name : string.Empty,
compId = fixt.Contest.Competition.Id,
compDesc = fixt.Contest.Competition.Description
}).Distinct();
// get selection of away teams based on contest
var awaycomps = (from fixt in unitOfWork.FixtureDetailsRepository.Find()
where
teams.Any(t => t.tId == fixt.AwayTeam.Id) &&
conts.Any(c => c.contId == fixt.Contest.Id)
select new
{
teamId = fixt.AwayTeam.Id,
teamName = fixt.AwayTeam.Name,
countryId = fixt.AwayTeam.Venue.Country.Id != null ? fixt.AwayTeam.Venue.Country.Id : 0,
countryName = fixt.AwayTeam.Venue.Country.Id != null ? fixt.AwayTeam.Venue.Country.Name : string.Empty,
compId = fixt.Contest.Competition.Id,
compDesc = fixt.Contest.Competition.Description
}).Distinct();
// ensure that we return the max competition based on id for home teams
var homemax = (from t in homecomps
group t by t.teamId
into grp
let maxcomp = grp.Max(g => g.compId)
from g in grp
where g.compId == maxcomp
select g).Distinct();
// ensure that we return the max competition based on id for away teams
var awaymax = (from t in awaycomps
group t by t.teamId
into grp
let maxcomp = grp.Max(g => g.compId)
from g in grp
where g.compId == maxcomp
select g).Distinct();
var filteredteams = homemax.Union(awaymax).OrderBy(t => t.teamName).AsQueryable();
As you can see, we want to return the format below, which is passed across to a WebAPI, so we cast the results to types we can relate to in the UI.
Essentially, we are trying to get the home and away teams from each fixture. Fixtures have a contest, which relates to a competition. We take the highest competition id from the grouping and return it with the team. The country is related to the team through the venue. When I originally wrote this, I had problems figuring out how to do OR joins in LINQ, which is why I split the query into home teams and away teams, grouped each by competition, and then unioned them together.
For an idea of the current table sizes: fixtures has 7840 rows, teams has 8581 rows, contests has 337 rows and competitions has 96 rows. The table most likely to grow rapidly is the fixtures table, as this is football data.
The output we want to end up with is
Team Id, Team Name, Country Id, Country Name, Competition Id, Competition Name
With no filtering, this query takes around 5 seconds on average. Does anybody have any ideas or pointers on how to make it quicker?
Thanks in advance, Mark
I can't judge whether it will speed up things, but your homemax and awaymax queries could be
var homemax = from t in homecomps
group t by t.teamId into grp
select grp.OrderByDescending(x => x.compId).FirstOrDefault();
var awaymax = from t in awaycomps
group t by t.teamId into grp
select grp.OrderByDescending(x => x.compId).FirstOrDefault();
Further, as you are composing one very large query, it may perform better when you cut it up into a few smaller queries that fetch intermediary results. Sometimes a few more roundtrips to the database perform better than one very large query for which the database engine can't find a good execution plan.
Another thing is all these Distinct()s. Do you always need them? I think you can do without because you are always fetching data from one table without joining a child collection. Removing them may save a bunch.
Yet another optimization could be to remove the ToUpper. The comparison is done by the database engine in SQL and chances are that the database has a case-insensitive collation. If so, the comparison is never case sensitive even if you'd want it to be! Constructs like Name.ToUpper cancel the use of any index on Name (it is not sargable).
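To compare the two ways of picking the highest competition per team, here is a small in-memory sketch using LINQ to Objects. The TeamComp class and the MaxPerTeam helper are hypothetical and carry only the fields needed for the comparison. It says nothing about the SQL that EF5 generates for either form, only about which rows each form returns.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical projection with just the fields needed here.
public class TeamComp
{
    public int TeamId { get; set; }
    public string TeamName { get; set; }
    public int CompId { get; set; }
}

public static class MaxPerTeam
{
    // Shape of the original query: compute the max, then filter the group by it.
    public static List<TeamComp> ViaMaxFilter(IEnumerable<TeamComp> rows)
    {
        var query =
            from t in rows
            group t by t.TeamId into grp
            let maxcomp = grp.Max(x => x.CompId)
            from g in grp
            where g.CompId == maxcomp
            select g;
        return query.ToList();
    }

    // Shape of the suggested query: sort each group and take the first row.
    public static List<TeamComp> ViaOrderBy(IEnumerable<TeamComp> rows)
    {
        var query =
            from t in rows
            group t by t.TeamId into grp
            select grp.OrderByDescending(x => x.CompId).First();
        return query.ToList();
    }

    public static void Main()
    {
        var rows = new List<TeamComp>
        {
            new TeamComp { TeamId = 1, TeamName = "A", CompId = 5 },
            new TeamComp { TeamId = 1, TeamName = "A", CompId = 9 },
            new TeamComp { TeamId = 2, TeamName = "B", CompId = 3 }
        };
        Console.WriteLine(ViaMaxFilter(rows).Count); // prints 2
        Console.WriteLine(ViaOrderBy(rows).Count);   // prints 2
    }
}
```

The two forms differ when several rows of a team share the maximum competition id: the max-and-filter form returns all of them, while the order-and-take-first form returns exactly one row per team.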
