LINQ to Entities query: group by and count from different tables - linq

I have two tables:
1) Logs
2) Jobs
Their structure is as follows.
Logs: id, Emailid, LogDate
Sample data:
1, a#a.com, Jan 24 1999
2, b#a.com, Jan 25 1999
3, a#a.com, Jan 25 1999
4, c#a.com, Jan 26 1999
5, a#a.com, Jan 27 1999
Jobs: jid, job_name, job_viewed_by
Sample data:
j01, painter, a#a.com
j02, teacher, a#a.com
j01, painter, b#a.com
job_viewed_by is a foreign key in the Jobs table and refers to Emailid in the Logs table.
Now I want a LINQ to Entities query that returns
every Emailid from the Logs table with its most recent login date, along with the number of jobs viewed (count of jobs) by that user.
So, for the sample data above, the expected result is:
a#a.com last logged in on Jan 27 1999 and has viewed 2 jobs so far
b#a.com last logged in on Jan 25 1999 and has viewed 1 job so far
c#a.com last logged in on Jan 26 1999 and has viewed no jobs
I know how to write this in SQL, but I need to express it in LINQ to Entities.
I tried the query below, but it gives me the number of logins rather than the job count.
var q = (from p in context.Logs
         from x in context.ViewedJobs.Where(v => p.EmailId == v.ViewedBy)
         group p by p.EmailId into grp
         select new { EmailId = grp.Key,
                      LastDate = grp.Max(g => g.LogDate),
                      Count = grp.Count() }).OrderByDescending(m => m.LastDate);

A simple approach to try:
var q = from p in context.Logs
        group p by p.Emailid into g
        select new
        {
            EmailId = g.Key,
            LastDate = g.Max(x => x.LogDate),
            Count = context.ViewedJobs.Count(v => v.ViewedBy == g.Key)
        };
Updated version:
var q = from p in context.Logs
        group p by p.Emailid into g
        join j in context.ViewedJobs
            on g.Key equals j.ViewedBy into leftGroup
        select new
        {
            EmailId = g.Key,
            LastDate = g.Max(x => x.LogDate),
            Count = leftGroup.Any() ? leftGroup.Count() : 0
        };
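For anyone who wants to check the grouping logic without a database, here is a minimal in-memory sketch (LINQ to Objects, not LINQ to Entities). The tuple arrays `logs` and `viewedJobs` are stand-ins I made up for the two tables, filled with the sample data from the question; the property names are assumptions, and the SQL that Entity Framework would generate for the equivalent query is not shown here.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for the Logs and Jobs tables
// (sample data from the question, '#' kept as posted).
var logs = new[]
{
    (Id: 1, EmailId: "a#a.com", LogDate: new DateTime(1999, 1, 24)),
    (Id: 2, EmailId: "b#a.com", LogDate: new DateTime(1999, 1, 25)),
    (Id: 3, EmailId: "a#a.com", LogDate: new DateTime(1999, 1, 25)),
    (Id: 4, EmailId: "c#a.com", LogDate: new DateTime(1999, 1, 26)),
    (Id: 5, EmailId: "a#a.com", LogDate: new DateTime(1999, 1, 27)),
};
var viewedJobs = new[]
{
    (JobId: "j01", JobName: "painter", ViewedBy: "a#a.com"),
    (JobId: "j02", JobName: "teacher", ViewedBy: "a#a.com"),
    (JobId: "j01", JobName: "painter", ViewedBy: "b#a.com"),
};

var summary =
    (from p in logs
     group p by p.EmailId into g
     // group join: users without viewed jobs get an empty 'jobs' sequence
     join j in viewedJobs on g.Key equals j.ViewedBy into jobs
     let lastDate = g.Max(x => x.LogDate)
     orderby lastDate descending
     select new { EmailId = g.Key, LastDate = lastDate, Count = jobs.Count() })
    .ToList();

foreach (var row in summary)
    Console.WriteLine($"{row.EmailId} last logged in on {row.LastDate:yyyy-MM-dd}, jobs viewed: {row.Count}");
```

Because the join is a group join (`into jobs`), c#a.com stays in the result with a count of 0 instead of being dropped.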

Related

Is there a faster way to work with a nested LINQ query?

I am trying to query a table with a nested LINQ query. My query works, but it is too slow. I have almost 400k rows, and this query takes 10 seconds for 1000 rows. For 400k rows I think it would take about 2 hours.
I have rows like this:
StudentNumber - DepartmentID
n100 - 1
n100 - 1
n105 - 1
n105 - 2
n107 - 1
I want the students that appear with more than one DepartmentID. My results look like this:
StudentID - List
n105 - 1 2
My query does produce this, but slowly.
var sorgu = (from yok in YOKAktarim
             group yok by yok.StudentID into g
             select new {
                 g.Key,
                 liste = (from birim in YOKAktarim where birim.StudentID == g.Key select new { birim.DepartmentID })
                             .ToList().GroupBy(x => x.DepartmentID).Count() > 1
                         ? (from birim in YOKAktarim where birim.StudentID == g.Key select new { birim.DepartmentID })
                             .GroupBy(x => x.DepartmentID).Select(x => x.Key).ToList()
                         : null,
             }).Take(1000).ToList();
Console.WriteLine(sorgu.Where(s => s.liste != null).OrderBy(s => s.Key));
I wrote this query in LINQPad, as C# statements.
For 400K records you should be able to return the student ids and department ids into an in-memory list.
var list1 = (from r in YOKAktarim
             group r by new { r.StudentID, r.DepartmentID } into g
             select g.Key
            ).ToList();
Once you have this list, you should be able to group by StudentID and select those students who have more than one record.
var list2 = (from r in list1
             group r by r.StudentID into g
             where g.Count() > 1
             select new
             {
                 StudentID = g.Key,
                 Departments = g.Select(a => a.DepartmentID).ToList()
             }
            ).ToList();
This should be faster, as it hits the SQL database only once rather than hundreds of thousands of times.
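You're iterating your source collection (YOKAktarim) three times, and two of those passes are repeated for every group, so the work is roughly proportional to rows × students (close to O(n^2) when most students are distinct). It's going to be slow.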
Instead of going back to source collection to get content of the group you can simply iterate over g.
var sorgu = (from yok in YOKAktarim
             group yok by yok.StudentID into g
             select new {
                 g.Key,
                 liste = (from birim in g select new { birim.DepartmentID })
                             .ToList().GroupBy(x => x.DepartmentID).Count() > 1
                         ? (from birim in g select new { birim.DepartmentID })
                             .GroupBy(x => x.DepartmentID).Select(x => x.Key).ToList()
                         : null,
             }).Take(1000).ToList();
However, that's still not optimal, because you're doing a lot of redundant subgrouping. Your query is pretty much equivalent to:
(from yok in YOKAktarim
 group yok by yok.StudentID into g
 // the lambda parameter must not reuse the name g
 let departments = g.Select(x => x.DepartmentID).Distinct().ToList()
 where departments.Count() > 1
 select new {
     g.Key,
     liste = departments
 }).Take(1000).ToList();
I can't speak for the correctness of that monster, but simply removing all ToList() calls except the outermost one will fix your issue.
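To make the single-pass idea concrete, here is a small in-memory sketch using the sample rows from the question. It runs as LINQ to Objects over a tuple array I named `rows` (an assumption standing in for YOKAktarim), so it says nothing about how a database provider would translate the same query; it only shows that grouping once and taking the distinct departments per group is enough.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory copy of the sample rows from the question.
var rows = new[]
{
    (StudentID: "n100", DepartmentID: 1),
    (StudentID: "n100", DepartmentID: 1),
    (StudentID: "n105", DepartmentID: 1),
    (StudentID: "n105", DepartmentID: 2),
    (StudentID: "n107", DepartmentID: 1),
};

var multiDepartment =
    (from r in rows
     // group the department ids directly, so each group is a sequence of ints
     group r.DepartmentID by r.StudentID into g
     let departments = g.Distinct().OrderBy(d => d).ToList()
     where departments.Count > 1
     select new { StudentID = g.Key, Departments = departments })
    .ToList();

foreach (var s in multiDepartment)
    Console.WriteLine($"{s.StudentID} - {string.Join(" ", s.Departments)}");
// prints "n105 - 1 2"
```

The source is enumerated once; everything after the grouping works on the group itself.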

How can I select items from one table but exclude certain ones listed in another?

I have two tables: one contains entities, the other an entity log.
MyEntity:
id, lat, lon
An entity has a position in the world.
MyEntityLog:
id, otherid, otherlat, otherlon
Each row records that the entity with id has interacted with otherid at otherid's latitude and longitude.
For instance, I have the following entities:
1, 4.456, 2.234
2, 3.344, 6.453
3, 6.234, 9.324
(The coordinates are not very accurate, but they serve the purpose.)
Now, if entity 1 interacts with entity 2, the resulting row in the log table looks like this:
1, 2, 3.344, 6.453
So my question is: when listing entity 1's available interactions, how can I exclude the ones that are already in the log table?
The result of listing entity 1's available interactions should be only entity 3, because entity 1 already has an interaction with entity 2.
First make a list of ids that interact with entity 1:
var id1 = 1;
var excluded = from l in db.EntityLogs
               where l.id == id1
               select l.otherid;
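then find the entities whose id is neither in this list nor equal to id1:
// db.Entities is an assumed name for the MyEntity set; querying db.EntityLogs here
// would return log rows rather than the entities that are still available
var available = from e in db.Entities
                where !excluded.Contains(e.id) && e.id != id1
                select e;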
Note that LINQ defers the execution of excluded and incorporates it into the execution of the second query.
I'm not sure I understand your question and would need more details, but if you want to list the entities that have no entry in the log table, one solution is something like this, assuming myEntities is the collection of MyEntity and myEntityLogs is the collection of MyEntityLog:
var firstList = myEntities.Join(myEntityLogs, a => a.Id, b => b.Id, (a, b) => a).Distinct();
var secondList = myEntities.Join(myEntityLogs, a => a.Id, b => b.OtherId, (a, b) => a).Distinct();
var result = myEntities.Except(firstList.Concat(secondList)).ToList();
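For the narrower reading of the question (only entity 1's own interactions are excluded), a minimal in-memory sketch could look like the following. It is LINQ to Objects over tuple arrays; `entities`, `entityLogs`, `currentId` and the property names are my own placeholders built from the sample data in the question, not names from the asker's model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-ins for MyEntity and MyEntityLog.
var entities = new[]
{
    (Id: 1, Lat: 4.456, Lon: 2.234),
    (Id: 2, Lat: 3.344, Lon: 6.453),
    (Id: 3, Lat: 6.234, Lon: 9.324),
};
var entityLogs = new[]
{
    (Id: 1, OtherId: 2, OtherLat: 3.344, OtherLon: 6.453),
};

int currentId = 1;

// Ids that the current entity has already interacted with.
var alreadyInteracted = new HashSet<int>(
    entityLogs.Where(l => l.Id == currentId).Select(l => l.OtherId));

// Everything else, except the current entity itself, is still available.
var available = entities
    .Where(e => e.Id != currentId && !alreadyInteracted.Contains(e.Id))
    .ToList();

foreach (var e in available)
    Console.WriteLine(e.Id); // prints 3 for the sample data
```

Unlike the Except-based version, this keeps entities that appear in the log only because of someone else's interactions.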

Left join with grouping and ordering

Musicians write songs. Songs are played on the air.
I have database tables Musicians, Songs and AirTimes. The AirTimes table entries hold information on which song was played on which date and for how many minutes.
I have classes Musician, Song, AirTime that correspond to the tables. The classes have navigational properties that point to the other entity. Arrows below represent navigation.
Musician <--> Song <--> AirTime
From the database, I have to retrieve all the Musicians and the dates on which their songs got AirTime. I also want to show the number of Songs played on a particular date and the number of minutes played on that date.
In Microsoft SQL, I would do it as follows:
select
    dbo.Musicians.LastName
  , dbo.AirTimes.PlayDate
  , count(dbo.AirTimes.PlayDate) as 'No. of entries'
  , sum(dbo.AirTimes.Duration) as 'No. of minutes'
from dbo.Musicians
left outer join dbo.Songs
    on dbo.Musicians.MusicianId = dbo.Songs.MusicianId
left outer join dbo.AirTimes
    on dbo.Songs.SongId = dbo.AirTimes.SongId
    and '2014-07-01T00:00:00' <= dbo.AirTimes.PlayDate
    and dbo.AirTimes.PlayDate <= '2014-07-31T00:00:00'
group by
    dbo.Musicians.LastName
  , dbo.AirTimes.PlayDate
order by
    dbo.Musicians.LastName
  , dbo.AirTimes.PlayDate
Can anybody "translate" this into LINQ to Entities?
Update Aug. 9, 2012
I'm unable to confirm grudolf's schemes do what I wanted. I accomplished things with a different technique. Nonetheless, I accept his/her answer.
As you have the navigational properties in both directions you can start either from AirTimes:
var grpTime = (
    from a in AirTimes
    where a.Date >= firstDate && a.Date < lastDate
    group a by new {a.Song.Musician.LastName, a.Song.Title, a.Date} into grp
    select new {
        grp.Key.LastName,
        grp.Key.Title,
        grp.Key.Date,
        Plays = grp.Count(),
        Seconds = grp.Sum(x => x.Duration)
    }
);
or from Musicians:
var grpMus = (
    from m in Musicians
    from s in m.Songs
    from p in s.Plays
    where p.Date >= firstDate && p.Date < lastDate
    group p by new {m.LastName, s.Title, p.Date} into grp
    select new {
        grp.Key.LastName,
        grp.Key.Title,
        grp.Key.Date,
        Plays = grp.Count(),
        Seconds = grp.Sum(x => x.Duration)
    }
);
EDIT:
To display all musicians, including those without airtime, you can use another level of grouping: in the first step you calculate totals per musician and day, and then you join them to the list of musicians. It could probably work directly against the database, but I haven't managed to find an efficient way to do it. Yet. ;) In code, the original AirTimes query is changed to return the Musician instead of the last name, and the result is then joined to the list of all musicians:
// Airtimes for musicians
var grpAir = (
    from a in AirTimes
    where a.Date >= firstDate && a.Date < lastDate
    group a by new {a.Song.Musician, a.Date} into grp
    select new {
        // Musician instead of LastName, for joining. Id would work too
        grp.Key.Musician,
        //grp.Key.Musician.LastName,
        Date = grp.Key.Date,
        Plays = grp.Count(),
        Secs = grp.Sum(x => x.Duration)
    }
);
var res = (
    from m in Musicians
    join g in grpAir on m equals g.Musician into g2
    from g in g2.DefaultIfEmpty()
    orderby m.LastName
    select new {
        m.LastName,
        // cast so both branches share a type (assumes Date is a non-nullable DateTime)
        Date = (g == null ? (DateTime?)null : g.Date),
        Plays = (g == null ? 0 : g.Plays),
        Secs = (g == null ? 0 : g.Secs)
    }
);
You can find a more complete LINQPad sample at https://gist.github.com/3236238
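If you just want to see the two-step shape (aggregate first, then left join with DefaultIfEmpty) run end to end, here is a minimal LINQ to Objects sketch. All of the data is invented for illustration: the musicians "Adams" and "Baker", the songs and the airtime rows are hypothetical, and joins on ids replace the navigation properties, so this is not the LINQPad sample from the gist.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; joins on ids stand in for navigation properties.
var musicians = new[]
{
    (MusicianId: 1, LastName: "Adams"),
    (MusicianId: 2, LastName: "Baker"),
};
var songs = new[]
{
    (SongId: 1, MusicianId: 1, Title: "First"),
    (SongId: 2, MusicianId: 2, Title: "Second"),
};
var airTimes = new[]
{
    (SongId: 1, PlayDate: new DateTime(2014, 7, 5), Duration: 3),
    (SongId: 1, PlayDate: new DateTime(2014, 7, 5), Duration: 4),
    (SongId: 1, PlayDate: new DateTime(2014, 7, 6), Duration: 5),
    (SongId: 2, PlayDate: new DateTime(2014, 6, 30), Duration: 2), // outside the range
};

var firstDate = new DateTime(2014, 7, 1);
var lastDate = new DateTime(2014, 8, 1);

// Step 1: totals per musician and day, restricted to the date range.
var perDay =
    from a in airTimes
    where a.PlayDate >= firstDate && a.PlayDate < lastDate
    join s in songs on a.SongId equals s.SongId
    group a by new { s.MusicianId, a.PlayDate } into grp
    select new { grp.Key.MusicianId, grp.Key.PlayDate, Plays = grp.Count(), Minutes = grp.Sum(x => x.Duration) };

// Step 2: left join, so musicians without airtime still appear.
var report =
    (from m in musicians
     join d in perDay on m.MusicianId equals d.MusicianId into days
     from d in days.DefaultIfEmpty()
     let playDate = d == null ? (DateTime?)null : d.PlayDate
     orderby m.LastName, playDate
     select new { m.LastName, PlayDate = playDate, Plays = d == null ? 0 : d.Plays, Minutes = d == null ? 0 : d.Minutes })
    .ToList();

foreach (var r in report)
    Console.WriteLine($"{r.LastName} {r.PlayDate:yyyy-MM-dd} plays={r.Plays} minutes={r.Minutes}");
```

With this data Baker has no airtime in July, so the row for Baker carries a null date and zero totals rather than disappearing.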

Entity Framework Linq - how to get groups that contain all your data

Here is a sample dataset:
OrderProduct is a table that contains the productIds that were part of a given order.
Note: OrderProduct is a database table and I am using EF.
OrderId, ProductId
1, 1
2, 2
3, 4
3, 5
4, 5
4, 2
5, 2
5, 3
What I want to be able to do is find an order that contains only the productIds that I am searching for. So if my input was productIds 2,3, then I should get back OrderId 5.
I know how I can group data, but I am unsure of how to perform the select on the group.
Here is what I have:
var q = from op in OrderProduct
        group op by op.OrderId into orderGroup
        select orderGroup;
I'm not sure how to proceed from here.
IEnumerable<int> products = new List<int> {2, 3};
IEnumerable<OrderProduct> orderProducts = new List<OrderProduct>
{
    new OrderProduct(1, 1),
    new OrderProduct(2, 2),
    new OrderProduct(3, 4),
    new OrderProduct(3, 5),
    new OrderProduct(4, 5),
    new OrderProduct(4, 2),
    new OrderProduct(5, 2),
    new OrderProduct(5, 3),
};
var orders =
    (from op in orderProducts
     group op by op.OrderId into orderGroup
     // the magic goes here
     where !products.Except(orderGroup.Select(x => x.ProductId)).Any()
     select orderGroup);
// outputs 5
orders.Select(x => x.Key).ToList().ForEach(Console.WriteLine);
Alternatively, as pointed out in another answer, you can replace
where !products.Except(orderGroup.Select(x => x.ProductId)).Any()
with
where products.All(pid => orderGroup.Any(op => op.ProductId == pid))
The second version performed about 15% better when I checked.
Edit
Following the latest change in requirements (you need orders that contain not merely all of the productIds you are searching for, but exactly those and only those productIds), I wrote an updated version:
var orders =
    (from op in orderProducts
     group op by op.OrderId into orderGroup
     // this line was added
     where orderGroup.Count() == products.Count()
     where !products.Except(orderGroup.Select(x => x.ProductId)).Any()
     select orderGroup);
So the only thing you need to add is a precondition ensuring that both collections contain the same number of elements; it works for both of the previous queries. As a bonus, here is a third version of the main where condition:
where orderGroup.Select(x => x.ProductId).Intersect(products).Count() == orderGroup.Count()
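The difference between "contains all of the products" and "contains exactly those products" can also be sketched with set operations. The following is an in-memory illustration only, using the sample data from the question as a tuple array; `HashSet<T>.IsSubsetOf` and `SetEquals` are ordinary .NET collection methods that run on objects in memory, so I would not expect a database provider to translate them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sample data from the question as an in-memory tuple array.
var orderProducts = new[]
{
    (OrderId: 1, ProductId: 1),
    (OrderId: 2, ProductId: 2),
    (OrderId: 3, ProductId: 4),
    (OrderId: 3, ProductId: 5),
    (OrderId: 4, ProductId: 5),
    (OrderId: 4, ProductId: 2),
    (OrderId: 5, ProductId: 2),
    (OrderId: 5, ProductId: 3),
};

// One group of product ids per order.
var byOrder = orderProducts
    .GroupBy(op => op.OrderId, op => op.ProductId)
    .ToList();

var wanted = new HashSet<int> { 2, 3 };
// Orders that contain every wanted product (other products allowed).
var containsAll = byOrder.Where(g => wanted.IsSubsetOf(g)).Select(g => g.Key).ToList();
// Orders whose products are exactly the wanted set.
var exactMatch = byOrder.Where(g => wanted.SetEquals(g)).Select(g => g.Key).ToList();

var wantedSingle = new HashSet<int> { 2 };
var containsTwo = byOrder.Where(g => wantedSingle.IsSubsetOf(g)).Select(g => g.Key).ToList();
var exactlyTwo = byOrder.Where(g => wantedSingle.SetEquals(g)).Select(g => g.Key).ToList();

Console.WriteLine(string.Join(",", containsAll)); // 5
Console.WriteLine(string.Join(",", exactMatch));  // 5
Console.WriteLine(string.Join(",", containsTwo)); // 2,4,5
Console.WriteLine(string.Join(",", exactlyTwo));  // 2
```

For the search {2, 3} both variants return order 5; the search {2} shows where they diverge.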
At first glance, I'd try something like this:
var prodIds = new[] {2, 3};
var q = from o in context.Orders
        where prodIds.All(pid => o.OrderProducts.Any(op => op.ProductId == pid))
        select o;
In plain language: "get the orders that have a product with every ID in the given list."
Update
Since it appears you are using LINQ to SQL rather than LINQ to Entities, here's another approach:
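// AsQueryable() makes q an IQueryable, so the filtered query can be assigned back to it
var q = context.Orders.AsQueryable();
foreach (var pid in prodIds)
{
    // copy the loop variable: before C# 5 every lambda would capture the same pid
    var id = pid;
    q = q.Where(o => o.OrderProducts.Any(op => op.ProductId == id));
}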
Rather than using a single LINQ statement, you essentially build the query piecemeal.
Thanks to StriplingWarrior's answer I managed to figure this out. Not sure if this is the best way to do this, but it works.
List<int> prodIds = new List<int>{2,3};
var q = from o in Orders
        // get all orders whose OrderProducts contain only products in the prodIds list
        where o.OrderProducts.All(op => prodIds.Contains(op.ProductId))
        // now group the OrderProducts by the Orders
        select from op in o.OrderProducts
               group op by op.OrderId into opGroup
               // select only those groups that have the same count as the prodIds list
               where opGroup.Count() == prodIds.Count()
               select opGroup;
// get rid of any groups that may be empty
q = q.Where(fi => fi.Count() > 0);
(I am using LinqPad, which is why the query looks a little funky - no context, etc)

Linq Distinct on list

I have a list like this:
List people
age name
1 bob
1 sam
7 fred
7 tom
8 sally
I need to run a LINQ query on people and get the number of distinct ages as an int (3 here):
int distinctAges = people.SomeLinq();
how?
Select out the age, then use Distinct and Count.
var ages = people.Select( p => p.Age ).Distinct().Count();
Or you could use GroupBy and Count:
var ages = people.GroupBy( p => p.Age ).Count();
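A quick self-contained check of both variants, using the list from the question. The tuple array `people` is just a stand-in I wrote for the asker's list type, which the question does not show.

```csharp
using System;
using System.Linq;

// Stand-in for the asker's list (their element type is not shown in the question).
var people = new[]
{
    (Age: 1, Name: "bob"),
    (Age: 1, Name: "sam"),
    (Age: 7, Name: "fred"),
    (Age: 7, Name: "tom"),
    (Age: 8, Name: "sally"),
};

// Project to the age, drop duplicates, count what is left.
int viaDistinct = people.Select(p => p.Age).Distinct().Count();
// Or count the groups: one group per distinct age.
int viaGroupBy = people.GroupBy(p => p.Age).Count();

Console.WriteLine(viaDistinct); // 3
Console.WriteLine(viaGroupBy);  // 3
```

Both give the same number; Distinct states the intent more directly, while GroupBy is handy if you also need the people in each age group.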
Download LINQPad and try these simple LINQ / lambda queries yourself. It's very easy to compare the SQL and the equivalent LINQ / lambda result sets.
You would start with
select Age, Name
from People
group by Age, Name
Then open another tab
var ages = (from p in Peoples
            group p by p.Age into g
            select g);
ages.Dump();
Then open another tab
var ages = Peoples.GroupBy(p => p.Age);
ages.Dump();
