Max sequence from a view containing multiple records using Linq lambda - linq

I've been at this for a while. I have a data set that has a reoccurring key and a sequence similar to this:
id status sequence
1 open 1
1 processing 2
2 open 1
2 processing 2
2 closed 3
A new row is added for each 'action' that happens, so the various ids can have variable sequences. I need to get the max sequence number for each id, but I still need to return the complete record.
I want to end up with sequence 2 for id 1, and sequence 3 for id 2.
I can't seem to get this to work without selecting the distinct ids, then looping through the results, ordering the values and then adding the first item to another list, but that's so slow.
var ids = this.ObjectContext.TNTP_FILE_MONITORING.Select(i => i.FILE_EVENT_ID).Distinct();
List<TNTP_FILE_MONITORING> vals = new List<TNTP_FILE_MONITORING>();
foreach (var id in ids)
{
    // one extra query per id: this is the slow part
    vals.Add(this.ObjectContext.TNTP_FILE_MONITORING
        .Where(mfe => mfe.FILE_EVENT_ID == id)
        .OrderByDescending(mfe => mfe.FILE_EVENT_SEQ)
        .First());
}
There must be a better way!

Here's what worked for me:
var ts = new[] { new T(1,1), new T(1,2), new T(2,1), new T(2,2), new T(2,3) };
var q =
    from t in ts
    group t by t.ID into g
    let max = g.Max(x => x.Seq)
    select g.FirstOrDefault(t1 => t1.Seq == max);
(Just need to apply that to your datatable, but the query stays about the same)
Note that with your current method, because you are iterating over all records, you also fetch all records from the datastore. A query like this can be translated into a query against the datastore, which is not only faster but also returns only the results you need (assuming you are using Entity Framework or Linq2SQL).
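As a rough in-memory sketch of the same idea, using the sample rows from the question: the tuple shape (Id, Status, Seq) is only a stand-in for the real entity, and whether a given provider translates this exact form to SQL depends on the provider and its version.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for the view: (Id, Status, Seq) tuples.
var rows = new[]
{
    (Id: 1, Status: "open", Seq: 1),
    (Id: 1, Status: "processing", Seq: 2),
    (Id: 2, Status: "open", Seq: 1),
    (Id: 2, Status: "processing", Seq: 2),
    (Id: 2, Status: "closed", Seq: 3),
};

// One group per Id; keep the complete row with the highest Seq in each group.
var latest = rows
    .GroupBy(r => r.Id)
    .Select(g => g.OrderByDescending(r => r.Seq).First())
    .ToList();

foreach (var r in latest)
    Console.WriteLine($"{r.Id} {r.Status} {r.Seq}");
// prints:
// 1 processing 2
// 2 closed 3
```

This yields sequence 2 for id 1 and sequence 3 for id 2, with the full record in each case.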

Related

Using LINQ to SELECT the SUM() of a subquery

I am trying to learn how to use LINQ to perform a query that yields the same result as this:
SELECT (
SELECT SUM(point)
FROM communitymemberpointfeature
WHERE communitymemberpointfeature.communitymemberid = communitymember.id
) AS points, communitymember.*
FROM communitymember
After browsing around the Internet, I constructed the following statement:
var list = (from pointFeature in communityMemberPointFeatureList
join member in communityMemberList on pointFeature.CommunityMemberId equals member.Id
group pointFeature by new { pointFeature.CommunityMemberId }
into grouping
select new
{
grouping,
points = grouping.Sum(row => row.Point)
}).ToList();
But this yielded a result like
[
{
points:7200,
grouping:[
{Id:1,Point:5000,FeatureId:1,CommunityMemberId:1},
{Id:2,Point:2200,FeatureId:1,CommunityMemberId:1},
],
}
...
]
What I really want is a result set like:
[
{points:7200,CommunityMemberId:1,firstname:'john',lastname:'blah' ....},
...
]
Can someone tell me what I did wrong?
Edit after comment added to the end
I can imagine you have problems translating your SQL into LINQ. When trying to write LINQ statements it is usually a lot easier to start from your requirements, instead of starting from a SQL statement.
It seems to me that you have a table with CommunityMembers. Every CommunityMember has a primary key in property Id.
Furthermore, every CommunityMember has zero or more CommunityMemberPointFeatures, namely those CommunityMemberPointFeatures with a foreign key CommunityMemberId that equals the primary key of the CommunityMember that it belongs to.
For example: CommunityMember [14] has all CommunityMemberPointFeatures that have a value CommunityMemberId equal to 14.
Requirement
If I look at your SQL, it seems to me that you want to query all CommunityMembers, each with the sum of property Point of all CommunityMemberPointFeatures of this CommunityMember.
Whenever you want to query "items with their zero or more subitems", like "Schools with their Students", "Customers with their Orders", "CommunityMembers with their PointFeatures", consider using GroupJoin.
A GroupJoin is in fact a Left Outer Join, followed by a GroupBy to make Groups of the Left item with all its Right items.
var result = dbContext.CommunityMembers              // GroupJoin CommunityMembers
    .GroupJoin(dbContext.CommunityMemberPointFeatures, // with CommunityMemberPointFeatures
        communityMember => communityMember.Id,       // from every CommunityMember take the Id
        pointFeature => pointFeature.CommunityMemberId, // from every CommunityMemberPointFeature
                                                     // take the CommunityMemberId
        // Parameter ResultSelector: take every CommunityMember, with all its matching
        // CommunityMemberPointFeatures to make one new object:
        (communityMember, pointFeaturesOfThisCommunityMember) => new
        {
            // Select the communityMember properties that you plan to use:
            Id = communityMember.Id,
            Name = communityMember.Name,
            ...
            // From the point features of this CommunityMember you only want the sum
            // of property Point:
            Points = pointFeaturesOfThisCommunityMember
                .Select(pointFeature => pointFeature.Point)
                .Sum(),
            // However, if you want more fields, you can use:
            PointFeatures = pointFeaturesOfThisCommunityMember.Select(pointFeature => new
            {
                Id = pointFeature.Id,
                Name = pointFeature.Name,
                ...
                // not needed, you know the value:
                // CommunityMemberId = pointFeature.CommunityMemberId,
            })
            .ToList(),
        });
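A small in-memory sketch of the GroupJoin may make the shape of the result clearer. The sample data is hypothetical (the second member, "jane", is invented to show a member without point features), and tuples stand in for the real entity classes.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; the property names mirror the question.
var members = new[]
{
    (Id: 1, FirstName: "john"),
    (Id: 2, FirstName: "jane"),   // invented member with no point features
};
var pointFeatures = new[]
{
    (Id: 1, Point: 5000, CommunityMemberId: 1),
    (Id: 2, Point: 2200, CommunityMemberId: 1),
};

// GroupJoin keeps every member, even one without any point features.
var totals = members
    .GroupJoin(
        pointFeatures,
        member => member.Id,
        feature => feature.CommunityMemberId,
        (member, featuresOfMember) => new
        {
            Id = member.Id,
            FirstName = member.FirstName,
            Points = featuresOfMember.Sum(f => f.Point),
        })
    .ToList();

foreach (var t in totals)
    Console.WriteLine($"{t.Id} {t.FirstName} {t.Points}");
// prints:
// 1 john 7200
// 2 jane 0
```

Each result is one flat object per member, which is the shape the question asked for.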
Edit after comment
If you want, you can omit Selecting the values that you plan to use.
// Parameter ResultSelector:
(communityMember, pointFeaturesOfThisCommunityMember) => new
{
    CommunityMember = communityMember,
    PointFeatures = pointFeaturesOfThisCommunityMember.ToList(),
});
However, I would strongly advise against this. If CommunityMember [14] has a thousand PointFeatures, then every PointFeature will have a foreign key with a value 14. So you are transporting this value 14 1001 times. What a waste of processing power, not to mention all the other fields you plan not to use.
Besides, if you do this you violate information hiding: whenever your tables change internally, the result of this function changes. Is that what you want?

Take one and skip other duplicate item in a child table

I have a list of Items, and every item has some child lists. Now I want to select the distinct items of a child list. I have tried the code below, but it's not working.
var items = await _context.Items.
Include(i => i.Tags.Distinct()).
Include(i => i.Comments).
OrderBy(i => i.Title).ToListAsync();
//Tag items
TagId - tag
------------------
1 --- A
2 --- B
3 --- B
4 --- C
5 --- D
6 --- D
7 --- F
//Expected Result
Item.Tags -> [A,B,C,D,F]
how can I do this in EF Core? Thanks.
You can use the MoreLinq library to get DistinctBy or write your own using this post.
Then use this:
var items = await _context.Items.
Include(i => i.Tags).
Include(i => i.Comments).
OrderBy(i => i.Title).
DistinctBy(d => d.Tags.tag).
ToListAsync();
You want to get distinct records based on one column; so that should do it.
Apparently you have a table of Items, where every Item has zero or more Tags. Furthermore the Items have a property Comments, of which we do not know whether it is one string, or a collection of zero or more strings. Furthermore every Item has a Title.
Now you want all properties of Items, each with its Comments and a list of the unique Tags of the item, ordered by Title.
One of the slower parts of database queries is the transport of the selected data from the database management system to your local process. Hence it is wise to limit the amount of data to the minimum you are really using.
It seems that the Tags of the Items are in a separate table. Every Item has zero or more Tags, every Tag belongs to exactly one item. A simple one-to-many relation with a foreign key Tag.ItemId.
If Item with Id 300 has 1000 Tags, then you know that every one of these 1000 Tags has a foreign key ItemId of which you know that it has a value of 300. What a waste if you would transport all these foreign keys to your local process.
Whenever you query data to inspect it, Select only the properties
you really plan to use. Only use Include if you plan to update the
included item.
So your query will be:
var query = myDbContext.Items
    .Where(item => ...)              // only if you do not want all items
    .OrderBy(item => item.Title)     // if you Sort here and do not need the Title
                                     // you don't have to Select it
    .Select(item => new
    {   // select only the properties you plan to use
        Id = item.Id,
        Title = item.Title,
        // Comments = item.Comments, // use this line instead if Item has only one Comments value
        Comments = item.Comments     // use this version if Item has zero or more Comments
            .Where(comment => ...)   // only if you do not want all comments
            .Select(comment => new
            {   // again, select only the properties you plan to use
                Id = comment.Id,
                Text = comment.Text,
                // no need for the foreign key, you already know the value:
                // ItemId = comment.ItemId,
            })
            .ToList(),
        Tags = item.Tags.Select(tag => new
            {   // Select only the properties you need
                Id = tag.Id,
                Type = tag.Type,
                Text = tag.Text,
                // No need for the foreign key, you already know the value
                // ItemId = tag.ItemId,
            })
            .Distinct()
            .ToList(),
    });
var fetchedData = await query.ToListAsync();
I haven't tried it, but I'd say you put .Distinct() in the wrong place.
var items = await _context.Items
    .Include(i => i.Tags)
    .Include(i => i.Comments)
    .OrderBy(i => i.Title)
    .Select(i => { i.Tags = i.Tags.GroupBy(x => x.Tag).Select(x => x.First()); return i; })
    .ToListAsync();
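The GroupBy-then-First idea can be tried in memory first. This is only a sketch over the tag values from the question, with a tuple standing in for the real Tag entity; it says nothing about whether EF Core can translate the same expression.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical tags of one item, using the values from the question.
var tags = new List<(int TagId, string Tag)>
{
    (1, "A"), (2, "B"), (3, "B"), (4, "C"), (5, "D"), (6, "D"), (7, "F"),
};

// Group by the tag text and keep the first row of each group.
var distinctTags = tags
    .GroupBy(t => t.Tag)
    .Select(g => g.First())
    .ToList();

Console.WriteLine(string.Join(",", distinctTags.Select(t => t.Tag)));
// prints A,B,C,D,F
```

Applied after the items are loaded, the same expression could be run per item on its Tags collection.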

dynamic asc desc sort

I am trying to create table headers that sort during a back-end call in NHibernate. Clicking a header sends a string indicating what to sort by (e.g. "Name", "NameDesc") to the db call.
The db can get quite large, so I also have back-end filters and pagination built in to reduce the size of the retrieved data. The order by therefore needs to happen before, or at the same time as, the filters and the skip and take, so that it is not applied only to the smaller, already paged data. Here is an example of the QueryOver call:
IList<Event> s =
    session.QueryOver<Event>(() => @eventAlias)
        .Fetch(@event => @event.FiscalYear).Eager
        .JoinQueryOver(() => @eventAlias.FiscalYear, () => fyAlias, JoinType.InnerJoin, Restrictions.On(() => fyAlias.Id).IsIn(_years))
        .Where(() => !@eventAlias.IsDeleted)
        .OrderBy(() => fyAlias.RefCode).Asc
        .ThenBy(() => @eventAlias.Name).Asc
        .Skip(numberOfRecordsToSkip)
        .Take(numberOfRecordsInPage)
        .List();
How can I accomplish this?
One way to achieve this (one of many, because you could also use a fully typed filter object or a query builder) could look like this draft:
Part one and two:
// I. a reference to our query
var query = session.QueryOver<Event>(() => @eventAlias);
// II. join, filter... whatever needed
query
    .Fetch(@event => @event.FiscalYear).Eager;
var joinQuery = query
    .JoinQueryOver(...)
    .Where(() => !@eventAlias.IsDeleted)
    ...
Part three:
// III. Order BY
// Assume we have a list of strings (passed from a UI client)
// here represented by these two values
var sortBy = new List<string> {"Name", "CodeDesc"};
// first, have a reference for the OrderBuilder
IQueryOverOrderBuilder<Event, Event> order = null;
// iterate the list
foreach (var sortProperty in sortBy)
{
    // use Desc or Asc?
    var useDesc = sortProperty.EndsWith("Desc");
    // Clean the property name
    var name = useDesc
        ? sortProperty.Remove(sortProperty.Length - 4, 4)
        : sortProperty;
    // Build the ORDER
    order = order == null
        ? query.OrderBy(Projections.Property(name))
        : query.ThenBy(Projections.Property(name));
    // use DESC or ASC
    query = useDesc ? order.Desc : order.Asc;
}
Finally the results:
// IV. back to query... call the DB and get the result
IList<Event> s = query
.List<Event>();
This draft is ready to do sorting on top of the root query. You can also extend it to add order statements to joinQuery (e.g. if the string is "FiscalYear.MonthDesc"). The logic would be similar, but built around the joinQuery (see part one).
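The string handling in part three can be isolated and tested without NHibernate. The helper below is hypothetical (ParseSortKey is not part of any library); it only splits a key such as "NameDesc" into a property name and a direction, following the "Desc" suffix convention from the question.

```csharp
using System;

// Hypothetical helper: "NameDesc" -> ("Name", true), "Name" -> ("Name", false).
static (string Property, bool Descending) ParseSortKey(string sortKey)
{
    const string suffix = "Desc";
    // Require something in front of the suffix, so a property that is
    // literally called "Desc" is not reduced to an empty name.
    bool descending = sortKey.Length > suffix.Length
        && sortKey.EndsWith(suffix, StringComparison.Ordinal);
    string property = descending
        ? sortKey.Substring(0, sortKey.Length - suffix.Length)
        : sortKey;
    return (property, descending);
}

foreach (var key in new[] { "Name", "CodeDesc" })
{
    var (property, descending) = ParseSortKey(key);
    Console.WriteLine($"{property} {(descending ? "DESC" : "ASC")}");
}
// prints:
// Name ASC
// Code DESC
```

Because the keys come from the client, it is probably worth checking the parsed property name against a list of allowed properties before passing it to Projections.Property.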

Multiple Counts within a single query

I want a list of counts for some of my data (the number of open/closed tasks etc.). I want to get all counts in one query, so I am not sure what to do with my LINQ statement below...
_user is an object that returns info about the currently logged-on user
_repo is an object that returns an IQueryable of whichever table I want to select
var counters = (from task in _repo.All<InstructionTask>()
where task.AssignedToCompanyID == _user.CompanyID || task.CompanyID == _user.CompanyID
join instructions in _repo.GetAllMyInstructions(_user) on task.InstructionID equals
instructions.InstructionID
group new {task, instructions}
by new
{
task
}
into g
select new
{
TotalEveryone = g.Count(),
TotalMine = g.Count(),
TotalOpen = g.Count(x => x.task.IsOpen),
TotalClosed = g.Count(c => !c.task.IsOpen)
}).SingleOrDefault();
Do I convert my object to single or default? The exception I am getting is, this sequence contains more than one element
Note: I want overall stats, not for each task, but for all tasks - not sure how to get that?
You need to dump everything into a single group, and use a regular Single. I am not sure if LINQ-to-SQL would be able to translate it correctly, but it's definitely worth a try.
var counters = (from task in _repo.All<InstructionTask>()
                where task.AssignedToCompanyID == _user.CompanyID || task.CompanyID == _user.CompanyID
                join instructions in _repo.GetAllMyInstructions(_user) on task.InstructionID equals instructions.InstructionID
                group task by 1 /* <<=== All tasks go into one group */ into g
                select new {
                    TotalEveryone = g.Count(),
                    TotalMine = g.Count(), // <<=== You probably need a condition here
                    TotalOpen = g.Count(x => x.IsOpen),
                    TotalClosed = g.Count(c => !c.IsOpen)
                }).Single();
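The single-group trick can be checked in memory. This is only a sketch with hypothetical task tuples; it does not show whether a particular provider translates the constant grouping key.

```csharp
using System;
using System.Linq;

// Hypothetical tasks; only IsOpen matters for this sketch.
var tasks = new[]
{
    (Id: 1, IsOpen: true),
    (Id: 2, IsOpen: false),
    (Id: 3, IsOpen: true),
};

// Grouping by a constant puts every task into one group, so the
// projection yields a single row of totals.
var counters = tasks
    .GroupBy(t => 1)
    .Select(g => new
    {
        Total = g.Count(),
        TotalOpen = g.Count(t => t.IsOpen),
        TotalClosed = g.Count(t => !t.IsOpen),
    })
    .SingleOrDefault();

// With an empty source there is no group at all, so counters would be null.
if (counters != null)
    Console.WriteLine($"{counters.Total} {counters.TotalOpen} {counters.TotalClosed}");
// prints 3 2 1
```

In LINQ to Objects an empty source produces no group, so Single() would throw there while SingleOrDefault() returns null.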
From MSDN
Returns the only element of a sequence, or a default value if the
sequence is empty; this method throws an exception if there is more
than one element in the sequence.
You need to use FirstOrDefault. SingleOrDefault is designed for collections that contain exactly one element (or none).

EF - Linq Expression and using a List of Ints to get best performance

So I have a list(table) of about 100k items and I want to retrieve all values that match a given list.
I have something like this.
the Table Sections key is NOT a primary key, so I'm expecting each value in listOfKeys to return a few rows.
List<int> listOfKeys = new List<int>(){1,3,44};
var allSections = Sections.Where(s => listOfKeys.Contains(s.id));
I don't know if it makes a difference but generally listOfKeys will only have between 1 to 3 items.
I'm using the Entity Framework.
So my question is, is this the best / fastest way to include a list in a linq expression?
I'm assuming that it isn't better to use another .NET ICollection data object. Should I be using a Union or something?
Thanks
If listOfKeys is a local list (not from the database) and contains only a small number of items, say fewer than 50, then it's OK. The generated query will basically be WHERE id IN (...) or WHERE id = ... OR id = ..., which the database engine handles fine.
A Join would probably be more efficient:
var allSections =
from s in Sections
join k in listOfKeys on s.id equals k
select s;
Or, if you prefer the extension method syntax:
var allSections = Sections.Join(listOfKeys, s => s.id, k => k, (s, k) => s);
