Joins with multiple fields on GroupBy table data in LINQ query/method - linq

I have to work out how to write the following SQL query using LINQ query or method syntax. (Edit: This is to return a list of the latest AgentActivities for all Agents.)
SELECT
a.[AgentActivityId],
a.[AgentId],
a.[ActivityId],
a.[StartedAt],
a.[EndedAt],
a.[Version]
FROM
[dbo].[AgentActivity] a
INNER JOIN
(
SELECT
[AgentId],
MAX([StartedAt])[StartedAt]
FROM
[dbo].[AgentActivity]
WHERE
([StartedAt] > '2010/01/24 23:59:59')
AND ([StartedAt] < '2010/10/25')
GROUP BY
AgentId
)grouped
ON (a.[AgentId] = grouped.[AgentId]
AND a.[StartedAt] = grouped.[StartedAt])

Just to recap, here's how I interpret the question:
What you want is a list with the most recently started activity for an agent, with the added requirement that the activity must be started within a given date interval.
This is one way to do it:
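// the given date interval; the SQL excludes everything up to 2010/01/24 23:59:59,
// so the inclusive lower bound is the start of January 25
DateTime startDate = new DateTime(2010, 1, 25);
DateTime endDate = new DateTime(2010, 10, 25);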
IEnumerable<AgentActivity> agentActivities =
... original list of AgentActivities ...
IEnumerable<AgentActivity> latestAgentActivitiesByAgent = agentActivities
.Where(a => a.StartedAt >= startDate && a.StartedAt < endDate)
.GroupBy(a => a.AgentId)
.Select(g => g
.OrderByDescending(a => a.StartedAt)
.First());
(If the question involves LINQ to SQL, there may be some gotchas. I haven't tried that.)
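For a more literal translation of the SQL, the join on two fields can be written with an anonymous-type key. The following is only a sketch over in-memory data: the tuple array and its values are made up for illustration, and the behaviour under LINQ to SQL has not been checked here.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: (AgentActivityId, AgentId, StartedAt).
var agentActivities = new[]
{
    (AgentActivityId: 1, AgentId: 10, StartedAt: new DateTime(2010, 2, 1)),
    (AgentActivityId: 2, AgentId: 10, StartedAt: new DateTime(2010, 3, 1)),
    (AgentActivityId: 3, AgentId: 20, StartedAt: new DateTime(2010, 5, 1)),
    (AgentActivityId: 4, AgentId: 20, StartedAt: new DateTime(2010, 12, 1)), // outside the window
};

var startDate = new DateTime(2010, 1, 25);
var endDate = new DateTime(2010, 10, 25);

// Inner query: latest StartedAt per agent within the window (the SQL subquery).
var grouped =
    from a in agentActivities
    where a.StartedAt >= startDate && a.StartedAt < endDate
    group a by a.AgentId into g
    select new { AgentId = g.Key, StartedAt = g.Max(x => x.StartedAt) };

// Outer query: join on both fields at once via an anonymous-type key.
var latest =
    (from a in agentActivities
     join g in grouped
         on new { a.AgentId, a.StartedAt } equals new { g.AgentId, g.StartedAt }
     orderby a.AgentId
     select a.AgentActivityId).ToList();

Console.WriteLine(string.Join(", ", latest)); // prints "2, 3"
```

The two anonymous types must have the same property names and types in the same order, otherwise the join keys will not compare as equal (or the query will not compile).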

Related

How to make zero counts show in LINQ query when getting daily counts?

I have a database table with a datetime column and I simply want to count how many records per day going back 3 months. I am currently using this query:
var minDate = DateTime.Now.AddMonths(-3);
var stats = from t in TestStats
where t.Date > minDate
group t by EntityFunctions.TruncateTime(t.Date) into g
orderby g.Key
select new
{
date = g.Key,
count = g.Count()
};
That works fine, but the problem is that if there are no records for a day then that day is not in the results at all. For example:
3/21/2008 = 5
3/22/2008 = 2
3/24/2008 = 7
In that short example I want to make 3/23/2008 = 0. In the real query all zeros should show between 3 months ago and today.
Fabricating missing data is not straightforward in SQL. I would recommend getting the data that is in SQL, then joining it to an in-memory list of all relevant dates:
var stats = (from t in TestStats
where t.Date > minDate
group t by EntityFunctions.TruncateTime(t.Date) into g
orderby g.Key
select new
{
date = g.Key,
count = g.Count()
}).ToList(); // hydrate so we only query the DB once
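// TruncateTime returns DateTime?, so unwrap with .Value (throws if stats is empty)
var firstDate = stats.Min(s => s.date).Value;
var lastDate = stats.Max(s => s.date).Value;
// To cover the whole window, use minDate.Date and DateTime.Today here instead.
// The + 1 makes sure lastDate itself is included.
var allDates = Enumerable.Range(0, (lastDate - firstDate).Days + 1)
.Select(i => firstDate.AddDays(i));
// New variable: the element type differs from stats (DateTime vs. DateTime?)
var filledStats = (from d in allDates
join s in stats
on d equals s.date.Value into dates
from ds in dates.DefaultIfEmpty()
select new {
date = d,
count = ds == null ? 0 : ds.count
}).ToList();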
You could also get a list of dates not in the data and concatenate them.
I agree with @D Stanley's answer but want to throw an additional consideration into the mix. What are you doing with this data? Is it getting processed by the caller? Is it rendered in a UI? Is it getting transferred over a network?
Consider the size of the data. Why do you need to have the gaps filled in? If it is known to be returning over a network for instance, I'd advise against filling in the gaps. All you're doing is increasing the data size. This has to be serialised, transferred, then deserialised.
If you are going to loop over the data to render it in a UI, then why do you need the gaps? Why not loop from the min date to the max date (like D Stanley's join) and substitute a default when no value is found?
If you ARE transferring over a network and you still NEED a single collection, consider applying D Stanley's resolution on the other side of the wire.
Just things to consider...
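As a rough sketch of filling the gaps on the consuming side: keep the sparse result as a lookup and substitute zero while looping over the date range. The dictionary below is an assumption standing in for whatever the caller receives; the dates and counts are the ones from the question's example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sparse counts as they might arrive from the query (values from the question).
var counts = new Dictionary<DateTime, int>
{
    [new DateTime(2008, 3, 21)] = 5,
    [new DateTime(2008, 3, 22)] = 2,
    [new DateTime(2008, 3, 24)] = 7,
};

var first = counts.Keys.Min();
var last = counts.Keys.Max();

// Walk every day in the range and fall back to 0 when no row exists.
var filled = new List<(DateTime Day, int Total)>();
for (var day = first; day <= last; day = day.AddDays(1))
{
    int total = counts.TryGetValue(day, out var found) ? found : 0;
    filled.Add((day, total));
}

foreach (var row in filled)
    Console.WriteLine($"{row.Day:d} = {row.Total}");
```

Only the three real rows have to be transferred; the zero for 3/23/2008 is produced where the data is displayed.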

Limiting a query to the rows created today (Linq-to-Entities, Datetime)

So I've got this query:
var query = from r in context.Cars
let h = context.CarHistories
.Where(u => r.ID == u.CarID)
.Where(u => u.EventID == intEventID)
.OrderByDescending(u => u.CreatedDate)
.FirstOrDefault()
select new RefundListItem()
{
ID = r.ID,
VendorID = r.VendorID,
RecipientName = r.RecipientName,
MostRecentSubmittedName = h.CreatedName,
CreatedDate = h.CreatedDate,
};
Later on, I add this to the query because I only want the rows that were created today:
DateTime today = DateTime.Today;
query.Where(u => Convert.ToDateTime(u.CreatedDate) >= today);
For some reason, this where statement does not affect the query at all. The query still returns items created from previous days instead of limiting them to just the rows created today.
I have also tried this but it does not work either:
DateTime today = DateTime.Today.Date;
query.Where(u => Convert.ToDateTime(u.CreatedDate.Date) >= today.Date);
I'm using Linq-to-Entities (MVC 4, EF 4).
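Where does not modify the query instance; it returns a new query with the additional condition applied. Assign the result back to query to make it work. Assuming CreatedDate is already a DateTime, the Convert.ToDateTime and .Date calls can be dropped as well: today is already midnight, and LINQ to Entities may not be able to translate them.
query = query.Where(u => u.CreatedDate >= today);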
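The difference between discarding and reassigning the result of Where can be seen on a small in-memory sequence. This is only an illustration: the array of dates is made up, and AsQueryable stands in for the real Entity Framework query.

```csharp
using System;
using System.Linq;

var today = DateTime.Today;
// Hypothetical CreatedDate values: one from yesterday, one from today.
var createdDates = new[] { today.AddDays(-1), today.AddHours(9) };

var query = createdDates.AsQueryable();

query.Where(d => d >= today);          // result discarded: query is unchanged
int before = query.Count();            // still counts both rows

query = query.Where(d => d >= today);  // result assigned back: filter applies
int after = query.Count();             // counts only today's row

Console.WriteLine($"{before} -> {after}"); // prints "2 -> 1"
```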

What's the LINQ to select the latest item from a number of versioned items?

I've got a class like the following:
public class Invoice
{
public int InvoiceId {get;set;}
public int VersionId {get;set;}
}
Each time an Invoice is modified, the VersionId gets incremented, but the InvoiceId remains the same. So given an IEnumerable<Invoice> which has the following results:
InvoiceId VersionId
1 1
1 2
1 3
2 1
2 2
How can I get just the results:
InvoiceId VersionId
1 3
2 2
I.e. I want just the Invoices from the results which have the latest VersionId. I can easily do this in T-SQL, but cannot for the life of me work out the correct LINQ syntax. I'm using Entity Framework 4 Code First.
Order by VersionId descending, group by InvoiceId, then take the first result of each group (GroupBy in LINQ to Objects keeps the elements of each group in source order). Try this:
var query = list.OrderByDescending(i => i.VersionId)
.GroupBy(i => i.InvoiceId)
.Select(g => g.First());
EDIT: how about this approach using Max?
var query = list.GroupBy(i => i.InvoiceId)
.Select(g => g.Single(i => i.VersionId == g.Max(o => o.VersionId)));
Try using FirstOrDefault or SingleOrDefault in place of Single as well... it would give the same result although Single shows the intention better.
EDIT: I've tested both these queries with LINQ to Entities. They seem to work, so perhaps the issue is something else?
Option 1:
var latestInvoices = invoices.GroupBy(i => i.InvoiceId)
.Select(group => group.OrderByDescending(i => i.VersionId)
.FirstOrDefault());
EDIT: Changed 'Last' to 'FirstOrDefault', LINQ to Entities has issues with the 'Last' query operator.
Option 2:
var invoices = from invoice in dc.Invoices
group invoice by invoice.InvoiceId into invoiceGroup
let maxVersion = invoiceGroup.Max(i => i.VersionId)
from candidate in invoiceGroup
where candidate.VersionId == maxVersion
select candidate;
My version:
var h = from i in Invoices
group i.VersionId by i.InvoiceId into grouping
select new {InvoiceId = grouping.Key, VersionId = grouping.Max()};
Update
As was mentioned by Ahmad in the comments, the above query will return a projection. The version below will return an IQueryable<Invoice>. I use composition to build the query because I think it is clearer.
var maxVersions = from i in Invoices
group i.VersionId by i.InvoiceId into grouping
select new {InvoiceId = grouping.Key,
VersionId = grouping.Max()};
var latestInvoices = from i in Invoices
join m in maxVersions
on new {i.InvoiceId, i.VersionId} equals
new {m.InvoiceId, m.VersionId}
select i;
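To check that the ordering approach and the max-then-join approach agree, both can be run against the sample rows from the question. This is a sketch with LINQ to Objects; tuples stand in for the Invoice class, and results under Entity Framework are not verified here.

```csharp
using System;
using System.Linq;

// The sample rows from the question as (InvoiceId, VersionId) tuples.
var invoices = new[]
{
    (InvoiceId: 1, VersionId: 1),
    (InvoiceId: 1, VersionId: 2),
    (InvoiceId: 1, VersionId: 3),
    (InvoiceId: 2, VersionId: 1),
    (InvoiceId: 2, VersionId: 2),
};

// Approach A: sort each group and keep its first element.
var byOrdering = invoices
    .GroupBy(i => i.InvoiceId)
    .Select(g => g.OrderByDescending(i => i.VersionId).First())
    .ToList();

// Approach B: compute the max version per group, then join back on both fields.
var maxVersions = invoices
    .GroupBy(i => i.InvoiceId)
    .Select(g => new { InvoiceId = g.Key, VersionId = g.Max(i => i.VersionId) });
var byJoin = invoices
    .Join(maxVersions,
          i => new { i.InvoiceId, i.VersionId },
          m => new { m.InvoiceId, m.VersionId },
          (i, m) => i)
    .ToList();

Console.WriteLine(string.Join(" ", byOrdering));     // prints "(1, 3) (2, 2)"
Console.WriteLine(byOrdering.SequenceEqual(byJoin)); // prints "True"
```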

How to get a group ordered by the count column

It is hard to ask the question in plain English, so I'll show what I'm trying to do.
Here's my SQL code:
select top 100 [Name], COUNT([Name]) as total from ActivityLog
where [Timestamp] between '2010-10-28' and '2010-10-29 17:00'
group by [Name]
order by total desc
I need to write that in LINQ. So far I have the following:
var groups = from ActivityLog log in ctx.ActivityLog
where log.Timestamp > dateFrom
where log.Timestamp <= dateTo
group log by log.Name;
but I don't have the COUNT(*) column to sort from :(
I'm afraid I am far more comfortable with the fluent syntax (as opposed to query syntax), but here is one possible LINQ answer:
ctx.ActivityLog
.Where(x => x.Timestamp > dateFrom && x.Timestamp <= dateTo)
.GroupBy(x => x.Name)
.Select(x => new { Name = x.Key, Total = x.Count() })
.OrderByDescending(x => x.Total)
.Take(100)
EDIT:
Alright, I stepped out of my comfort zone and came up with a query syntax version, just don't expect too much. I warned you about my abilities above:
(from y in (
from x in (
from log in ActivityLog
where log.Timestamp > dateFrom
where log.Timestamp <= dateTo
group log by log.Name)
select new { Name = x.Key, Total = x.Count() })
orderby y.Total descending
select new { Name = y.Name, Total = y.Total }).Take(100)
diceguyd30's answer technically is LINQ and is correct. In fact, the compiler translates query syntax into those Queryable/Enumerable methods. That said, what's missing is the group ... by ... into syntax. The equivalent query should be close to this:
var query = from log in ctx.ActivityLog
where log.Timestamp > dateFrom && log.Timestamp <= dateTo
group log by log.Name into grouping
orderby grouping.Count() descending
select new { Name = grouping.Key, Total = grouping.Count() };
var result = query.Take(100);
Note that in C# the Take(100) method has no equivalent in query syntax so you must use the extension method. VB.NET, on the other hand, does support Take and Skip in query syntax.
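A small in-memory sketch of the same shape (group, count, order by the count, then Take) is shown below. The names array is hypothetical and replaces ctx.ActivityLog, and Take(2) is used instead of Take(100) to keep the output short.

```csharp
using System;
using System.Linq;

// Hypothetical log entries: just the Name column.
var names = new[] { "login", "search", "login", "logout", "search", "login" };

// Query syntax up to the projection; Take has no C# query keyword.
var query = from name in names
            group name by name into grouping
            orderby grouping.Count() descending
            select new { Name = grouping.Key, Total = grouping.Count() };

var top2 = query.Take(2).ToList();

foreach (var row in top2)
    Console.WriteLine($"{row.Name}: {row.Total}");
// prints "login: 3" then "search: 2"
```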

LINQ: How to perform a conditional sum?

I have a LINQ query that creates a new type containing a day of the week and the sum of hours worked.
My current (incorrect) query looks like this:
var resultSet = (from a in events
group a by a.Start.DayOfWeek into g
select new DaySummary
{
day = g.Key.ToString(),
hoursWorked = g.Any(p => p.Title == "Lunch") ? 0 :
Math.Round((g.Sum(
p => (Decimal.Parse((p.End - p.Start).TotalMinutes.ToString()))) / 60), 2)
}).ToList();
Hopefully you can see what I'm trying to do. The Any method is not having the effect I'd like, however. Basically I want to sum up the hours worked, but if the title was "Lunch" I want it to add 0.
The logic of this is just a little beyond me at the moment.
UPDATE
OK, I'm an idiot. Changed the query to this and it now works. Sorry.
var resultSet = (from a in events
group a by a.Start.DayOfWeek into g
select new DaySummary
{
day = g.Key.ToString(),
hoursWorked = Math.Round((g.Where(p => p.Title !="Lunch").
Sum(p => (Decimal.Parse((p.End - p.Start).TotalMinutes.ToString()))) / 60), 2)
}).ToList();
It seems each group is a sequence of 'periods' and you just want to ignore any 'lunch' periods in each calculation. In that case you just need to remove these from the sum using Where.
var hours = events
.GroupBy(e => e.Start.DayOfWeek)
.Select(g => new DaySummary {
day = g.Key.ToString(),
hoursWorked = Math.Round(
g.Where(p => p.Title != "Lunch")
// cast directly instead of Decimal.Parse(...ToString()), which round-trips through a string
.Sum(pd => (decimal)(pd.End - pd.Start).TotalHours), 2)
}).ToList();
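Why Any gives the wrong answer, and two ways to express the conditional sum, can be sketched on one day's worth of data. The events below are invented for illustration, and double is used instead of decimal to keep the example short.

```csharp
using System;
using System.Linq;

// Hypothetical events for one day: (Title, Start, End).
var day = new DateTime(2010, 11, 1);
var events = new[]
{
    (Title: "Work",  Start: day.AddHours(9),  End: day.AddHours(12)),
    (Title: "Lunch", Start: day.AddHours(12), End: day.AddHours(13)),
    (Title: "Work",  Start: day.AddHours(13), End: day.AddHours(17)),
};

// Any() answers a question about the whole group, so one lunch zeroes the day.
double withAny = events.Any(e => e.Title == "Lunch")
    ? 0
    : events.Sum(e => (e.End - e.Start).TotalHours);

// Filter first, then sum: lunch is skipped, everything else counts.
double withWhere = events
    .Where(e => e.Title != "Lunch")
    .Sum(e => (e.End - e.Start).TotalHours);

// Equivalent per-element conditional inside the Sum.
double withTernary = events
    .Sum(e => e.Title == "Lunch" ? 0 : (e.End - e.Start).TotalHours);

Console.WriteLine($"{withAny} {withWhere} {withTernary}"); // prints "0 7 7"
```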

Resources