LINQ FirstOrDefault ordering when OrderBy or OrderByDescending is not supplied?

If we have an EF entity with basic structure of
public class Entity
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime LastUpdated { get; set; }
}
And a collection of them with 4 entities:
var entities = new List<Entity>
{
    new Entity { Id = 0, Name = "Thing One", LastUpdated = new DateTime(2020, 10, 1, 0, 0, 0) },
    new Entity { Id = 1, Name = "Thing One", LastUpdated = new DateTime(2020, 10, 4, 0, 0, 0) },
    new Entity { Id = 2, Name = "Thing One", LastUpdated = new DateTime(2020, 10, 3, 0, 0, 0) },
    new Entity { Id = 3, Name = "Thing One", LastUpdated = new DateTime(2020, 10, 2, 0, 0, 0) }
};
If we use LINQ's FirstOrDefault to select by the Name property:
var selectedEntity = entities.FirstOrDefault(e => e.Name == "Thing One");
What is this going to return?
This is causing issues in a piece of code that I'm reviewing. It will be fixed by only allowing unique entities and using SingleOrDefault, but I'm curious as to what FirstOrDefault returns by default when there is no OrderBy clause.

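For an in-memory collection such as List<T>, FirstOrDefault returns the first matching item in the collection's enumeration order, which here is the entity with Id 0. When the query is translated to SQL (for example by Entity Framework) and several rows match, which row comes back is unpredictable/undefined unless an order is specified. If you want to ensure a specific item, you need to apply an OrderBy: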
var selectedEntity = entities
.Where(e => e.Name == "Thing One")
.OrderByDescending(e => e.LastUpdated)
.FirstOrDefault();
What is returned by the database depends on the vendor. This related question might help: When no 'Order by' is specified, what order does a query choose for your record set?
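To illustrate the in-memory case only, here is a minimal sketch that reuses the Entity shape from the question. The FirstOrDefaultDemo class and its method names are made up for this example. It shows that LINQ to Objects walks a List<T> in index order, and it says nothing about what a database provider would return.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Entity
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTime LastUpdated { get; set; }
}

// Hypothetical helper class, not part of the original question.
public static class FirstOrDefaultDemo
{
    public static List<Entity> Sample() => new List<Entity>
    {
        new Entity { Id = 0, Name = "Thing One", LastUpdated = new DateTime(2020, 10, 1) },
        new Entity { Id = 1, Name = "Thing One", LastUpdated = new DateTime(2020, 10, 4) },
        new Entity { Id = 2, Name = "Thing One", LastUpdated = new DateTime(2020, 10, 3) },
        new Entity { Id = 3, Name = "Thing One", LastUpdated = new DateTime(2020, 10, 2) }
    };

    // No OrderBy: LINQ to Objects enumerates the list in index order.
    public static Entity Unordered(IEnumerable<Entity> entities) =>
        entities.FirstOrDefault(e => e.Name == "Thing One");

    // Explicit ordering: the most recently updated match.
    public static Entity Latest(IEnumerable<Entity> entities) =>
        entities.Where(e => e.Name == "Thing One")
                .OrderByDescending(e => e.LastUpdated)
                .FirstOrDefault();
}
```

With the sample list, Unordered returns the entity with Id 0 and Latest returns the one with Id 1.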

Related

GroupBy using LINQ but base distinct on another column, and still be able to count other records

How should I run a GroupBy based on Id using LINQ when there is an object similar to following:
public class foo
{
public int id { get; set; }
public string name { get; set; }
public string lang { get; set; }
public int displayOrder { get; set; }
public int count { get; set; }
}
The list could be:
id = 1, name="test1", lang = "en", displayOrder = 1, count = 1
id = 1, name="test2", lang = "fr", displayOrder = 2, count = 2
id = 1, name="test3", lang = "de", displayOrder = 3, count = 1
id = 2, name="test4", lang = "en", displayOrder = 2, count = 1
id = 2, name="test5", lang = "fr", displayOrder = 3, count = 1
id = 3, name="test6", lang = "en", displayOrder = 6, count = 1
id = 3, name="test7", lang = "fr", displayOrder = 4, count = 1
id = 4, name="test8", lang = "en", displayOrder = 5, count = 1
id = 5, name="test9", lang = "de", displayOrder = 6, count = 1
I want to run a LINQ query that groups by id, but the record chosen for each distinct id should be filtered by lang, e.g. "fr". If nothing is available in "fr", it should output only the default language record for "en".
It should also total the count values of the records for each id, so for the list above it should return the following results:
id = 1, name="test2", lang = "fr", displayOrder = 2, count = 4
id = 2, name="test5", lang = "fr", displayOrder = 3, count = 2
id = 3, name="test7", lang = "fr", displayOrder = 4, count = 2
id = 4, name="test8", lang = "en", displayOrder = 5, count = 1
id = 5, name="test9", lang = "de", displayOrder = 6, count = 1
Please, is there a way to do something like this using LINQ ?
All of you LINQ experts, I'm ideally looking for query using lambda, this would be a great help. Thanks in advance.
You can sort the target language to the front, and select the first item in a group:
var query = from f in foos
            group f by f.id into g
            let lang = (from f in g
                        orderby
                            f.lang == "fr" ? 0 : 1,
                            f.lang == "en" ? 0 : 1,
                            f.lang
                        select f).First()
            select new
            {
                id = g.Key,
                lang.name,
                lang.lang,
                lang.displayOrder,
                count = g.Sum(f => f.count)
            };
This assumes that pairs (id, lang) are unique.
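This may run slightly faster, as it does not need to go through all items in the group for ordering. Note that when a group has no "fr" record it falls back to the first record in the group, which is not necessarily the "en" one.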
var res = data
    .GroupBy(d => d.id)
    .Select(g => new {
        id = g.Key,
        count = g.Sum(d => d.count),
        Lang = g.FirstOrDefault(d => d.lang == "fr") ?? g.First()
    })
    .Select(g => new {
        g.id,
        g.Lang.lang,
        g.Lang.name,
        g.Lang.displayOrder,
        g.count
    })
    .ToList();
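If the "en" fallback from the question matters, the FirstOrDefault approach can be chained. The following is only a sketch under some assumptions: the Foo class uses PascalCase property names instead of the question's lowercase ones, and LanguageFallback, PickPerId, preferred and fallback are names invented for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Lang { get; set; }
    public int DisplayOrder { get; set; }
    public int Count { get; set; }
}

// Hypothetical helper class for illustration.
public static class LanguageFallback
{
    public static List<Foo> PickPerId(IEnumerable<Foo> data, string preferred, string fallback) =>
        data.GroupBy(d => d.Id)
            .Select(g =>
            {
                // Try the preferred language, then the fallback, then any record.
                var pick = g.FirstOrDefault(d => d.Lang == preferred)
                        ?? g.FirstOrDefault(d => d.Lang == fallback)
                        ?? g.First();
                return new Foo
                {
                    Id = g.Key,
                    Name = pick.Name,
                    Lang = pick.Lang,
                    DisplayOrder = pick.DisplayOrder,
                    Count = g.Sum(d => d.Count) // total over the whole group
                };
            })
            .ToList();
}
```

Calling LanguageFallback.PickPerId(list, "fr", "en") yields one record per id. The statement lambda works for in-memory collections; a database provider may not be able to translate it.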

Comparing two lists with multiple conditions

I have two different lists of the same type. I want to compare the two lists and get the values which do not match.
List of class:
public class pre
{
    public int id { get; set; }
    public DateTime date { get; set; }
    public int sID { get; set; }
}
Two lists :
List<pre> pre1 = new List<pre>();
List<pre> pre2 = new List<pre>();
Query which I wrote to get the unmatched values:
var preResult = pre1.where(p1 => !pre
.any(p2 => p2.id == p1.id && p2.date == p1.date && p2.sID == p1sID));
But the result is wrong here. I am getting all the values in pre1.
Here is a solution:
class Program
{
    static void Main(string[] args)
    {
        var pre1 = new List<pre>()
        {
            new pre { id = 1, date = DateTime.Now.Date, sID = 1 },
            new pre { id = 7, date = DateTime.Now.Date, sID = 2 },
            new pre { id = 9, date = DateTime.Now.Date, sID = 3 },
            new pre { id = 13, date = DateTime.Now.Date, sID = 4 },
            // ... etc ...
        };
        var pre2 = new List<pre>()
        {
            new pre { id = 1, date = DateTime.Now.Date, sID = 1 },
            // ... etc ...
        };
        var preResult = pre1
            .Where(p1 => !pre2.Any(p2 => p2.id == p1.id && p2.date == p1.date && p2.sID == p1.sID))
            .ToList();
        Console.ReadKey();
    }
}
Note: the date property contains only the date; the time part is 00:00:00.
I fixed some typos and tested your code with sensible values, and your code would correctly select unmatched records. As prabhakaran S's answer mentions, perhaps your date values include time components that differ. You will need to check your data and decide how to proceed.
However, a better way to select unmatched records from one list compared against another would be to utilize a left join technique common to working with relational databases, which you can also do in Linq against in-memory collections. It will scale better as the sizes of your inputs grow.
var preResult = from p1 in pre1
                join p2 in pre2
                    on new { p1.id, p1.date, p1.sID }
                    equals new { p2.id, p2.date, p2.sID } into grp
                from item in grp.DefaultIfEmpty()
                where item == null
                select p1;
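Another way to get the same scaling benefit is to put the keys of the second list in a HashSet and filter the first list against it. This is a sketch rather than the answerer's method: the Pre class uses PascalCase names, Unmatched.Find is a made-up helper, and comparing on Date.Date is an assumption that deliberately ignores any time component.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Pre
{
    public int Id { get; set; }
    public DateTime Date { get; set; }
    public int SID { get; set; }
}

// Hypothetical helper class for illustration.
public static class Unmatched
{
    public static List<Pre> Find(IEnumerable<Pre> pre1, IEnumerable<Pre> pre2)
    {
        // Value tuples compare member by member, so they work as composite keys.
        // Date.Date drops the time part (assumption: only the day should matter).
        var keys = new HashSet<(int, DateTime, int)>(
            pre2.Select(p => (p.Id, p.Date.Date, p.SID)));

        // Keep the items of pre1 whose key does not occur in pre2.
        return pre1.Where(p => !keys.Contains((p.Id, p.Date.Date, p.SID))).ToList();
    }
}
```

If the time of day should count as a difference, use p.Date instead of p.Date.Date in both places.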

How to match two Lists with only items that are different in Linq

I have a StudentData class
public class StudentData
{
public int Id { get; set; }
public string FirstName { get; set; }
public string Surname { get; set; }
public int? Bonus { get; set; }
public int? Subject1Mark { get; set; }
public int? Subject2Mark { get; set; }
public int? Subject3Mark{ get; set; }
}
Each student has a unique Id that identifies him
I have a List<StudentData> CurrentData that has data of
1, John, Smith, 10, 50 ,50 ,50
2, Peter, Parker, 10, 60 ,60 ,60
3, Sally, Smart, 10, 70 ,70 ,70
4, Danny, Darko, 20, 80, 80, 80
I then have a List<StudentData> DataToUpdate which only contains the Id and Marks fields. Not the other fields.
1, null, null, null, 50 ,50 ,50
2, null, null, null, 65, 60 ,60
3, null, null, null, 70 ,70 ,70
The Ids in the two lists are not necessarily in the same order.
If you compare the two lists only Peter Parker's marks have changed in one subject.
I want to get the output to return
2, Peter, Parker, 10, 65 ,60 ,60
I want to take List<StudentData> CurrentData and inner join it with List<StudentData> DataToUpdate, but only where the marks are different.
In SQL I want the following:
SELECT
CurrentData.Id,
CurrentData.FirstName ,
CurrentData.Surname,
CurrentData.Bonus,
DataToUpdate.Subject1Mark,
DataToUpdate.Subject2Mark,
DataToUpdate.Subject3Mark
FROM CurrentData
INNER JOIN DataToUpdate
ON CurrentData.Id= DataToUpdate.Id
AND (
CurrentData.Subject1Mark<> DataToUpdate.Subject1Mark
OR
CurrentData.Subject2Mark<> DataToUpdate.Subject2Mark
OR
CurrentData.Subject3Mark<> DataToUpdate.Subject3Mark
)
How do I do the above in LINQ?
In the Linq select how do I take all properties from CurrentData but include the 3 Subject properties from DataToUpdate in it to give me List<ChangedData>?
I could map each and every property but my StudentData has 100 fields and I would prefer to have something like
select new StudentData {
this=CurrentData,
this.Subject1Mark=DataToUpdate.Subject1Mark,
this.Subject2Mark=DataToUpdate.Subject2Mark,
this.Subject3Mark=DataToUpdate.Subject3Mark,
}
but I'm not sure how to write this
There is an answer to another Stack Overflow question which should work, but it doesn't. If I implement that solution (the example is simplified):
var changedData = currentData
.Join(dataToUpdate, cd => cd.Id, ld => ld.Id, (cd, ld) => new { cd, ld })
.Select(x => { x.cd.Subject1Mark= x.ld.Subject1Mark; return x.cd; })
;
However, x.cd.Subject1Mark is not updated with x.ld.Subject1Mark, even though I followed the answer in the linked Stack Overflow question.
The structure of LINQ query looks very similar to SQL:
var res =
    from cur in CurrentData
    // the outer range variable (cur) must be on the left of "equals"
    join upd in DataToUpdate on cur.Id equals upd.Id
    where (cur.Subject1Mark != upd.Subject1Mark
        || cur.Subject2Mark != upd.Subject2Mark
        || cur.Subject3Mark != upd.Subject3Mark)
    select new {
        Current = cur,
        UpdatedSubject1Mark = upd.Subject1Mark,
        UpdatedSubject2Mark = upd.Subject2Mark,
        UpdatedSubject3Mark = upd.Subject3Mark
    };
The main difference is that filtering out by inequality has moved from the on clause in SQL to a where clause of LINQ.
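If the goal is a List<StudentData> without mapping every field, one option is to run the join first and then overwrite only the mark fields on the existing objects. This is a sketch under assumptions: ChangedMarks.Apply is an invented helper, and it mutates the objects in the current list in place. It also hints at one possible reason the Join/Select attempt in the question appeared to do nothing: LINQ queries are deferred, so a Select with side effects has no effect until the query is enumerated.

```csharp
using System.Collections.Generic;
using System.Linq;

public class StudentData
{
    public int Id { get; set; }
    public string FirstName { get; set; }
    public string Surname { get; set; }
    public int? Bonus { get; set; }
    public int? Subject1Mark { get; set; }
    public int? Subject2Mark { get; set; }
    public int? Subject3Mark { get; set; }
}

// Hypothetical helper class for illustration.
public static class ChangedMarks
{
    public static List<StudentData> Apply(
        IEnumerable<StudentData> currentData, IEnumerable<StudentData> dataToUpdate)
    {
        // ToList() runs the query now, so the loop below works on a fixed set of pairs.
        var changed = (from cur in currentData
                       join upd in dataToUpdate on cur.Id equals upd.Id
                       where cur.Subject1Mark != upd.Subject1Mark
                          || cur.Subject2Mark != upd.Subject2Mark
                          || cur.Subject3Mark != upd.Subject3Mark
                       select new { cur, upd }).ToList();

        foreach (var pair in changed)
        {
            // Only the three mark fields are overwritten; every other field stays as it was.
            pair.cur.Subject1Mark = pair.upd.Subject1Mark;
            pair.cur.Subject2Mark = pair.upd.Subject2Mark;
            pair.cur.Subject3Mark = pair.upd.Subject3Mark;
        }

        return changed.Select(pair => pair.cur).ToList();
    }
}
```

With the data from the question, the result contains only Peter Parker's record with the new mark of 65. If the original objects must stay untouched, copy them before assigning.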

Linq - 'Saving' OrderBy operation (c#)

Assume I have generic list L of some type in c#. Then, using linq, call OrderBy() on it, passing in a lambda expression.
If I then re-assign the L, the previous order operation will obviously be lost.
Is there any way I can 'save' the lambda expression I used on the list before I reassigned it, and re-apply it?
Use a Func delegate to store your ordering then pass that to the OrderBy method:
Func<int, int> orderFunc = i => i; // func for ordering
var list = Enumerable.Range(1,10).OrderByDescending(i => i); // 10, 9 ... 1
var newList = list.OrderBy(orderFunc); // 1, 2 ... 10
As another example consider a Person class:
public class Person
{
public int Id { get; set; }
public string Name { get; set; }
}
Now you want to preserve a sort order that sorts by the Name property. In this case the Func operates on a Person type (T) and the TResult will be a string since Name is a string and is what you are sorting by.
Func<Person, string> nameOrder = p => p.Name;
var list = new List<Person>
{
new Person { Id = 1, Name = "ABC" },
new Person { Id = 2, Name = "DEF" },
new Person { Id = 3, Name = "GHI" },
};
// descending order by name
foreach (var p in list.OrderByDescending(nameOrder))
Console.WriteLine(p.Id + ":" + p.Name);
// 3:GHI
// 2:DEF
// 1:ABC
// re-assigning the list
list = new List<Person>
{
new Person { Id = 23, Name = "Foo" },
new Person { Id = 14, Name = "Buzz" },
new Person { Id = 50, Name = "Bar" },
};
// reusing the order function (ascending by name in this case)
foreach (var p in list.OrderBy(nameOrder))
Console.WriteLine(p.Id + ":" + p.Name);
// 50:Bar
// 14:Buzz
// 23:Foo
EDIT: be sure to add ToList() after the OrderBy calls if you need a List<T> since the LINQ methods will return an IEnumerable<T>.
Calling ToList() or ToArray() on your IEnumerable<T> will cause it to be immediately evaluated. You can then assign the resulting list or array to "save" your ordered list.
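A key-selector Func stores what to sort by but not the direction. If the whole ordering step should be saved, one option is a delegate from a sequence to an ordered sequence. This is a small sketch; SavedOrdering, ByNameDescending and Apply are names invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper class for illustration.
public static class SavedOrdering
{
    // Captures the key selector and the direction in one reusable delegate.
    public static readonly Func<IEnumerable<Person>, IOrderedEnumerable<Person>> ByNameDescending =
        people => people.OrderByDescending(p => p.Name);

    // Re-applies the saved ordering to any sequence and materializes the result.
    public static List<Person> Apply(IEnumerable<Person> people) =>
        ByNameDescending(people).ToList();
}
```

Because the delegate returns IOrderedEnumerable<Person>, a caller can still append ThenBy to it before calling ToList().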

How do I transfer this logic into a LINQ statement?

I can't get this bit of logic converted into a Linq statement and it is driving me nuts. I have a list of items that have a category and a createdondate field. I want to group by the category and only return items that have the max date for their category.
So for example, the list contains items with categories 1 and 2. The first day (1/1) I post two items to both categories 1 and 2. The second day (1/2) I post three items to category 1. The list should return the second day postings to category 1 and the first day postings to category 2.
Right now I have it grouping by the category then running through a foreach loop to compare each item in the group with the max date of the group, if the date is less than the max date it removes the item.
There's got to be a way to take the loop out, but I haven't figured it out!
You can do something like this:
from item in list
group item by item.Category into g
select g.OrderByDescending(it => it.CreationDate).First();
However, it's not very efficient, because it needs to sort the items of each group, which is more complex than necessary (you don't actually need to sort, you just need to scan the list once). So I created this extension method to find the item with the max value of a property (or function) :
public static T WithMax<T, TValue>(this IEnumerable<T> source, Func<T, TValue> selector)
{
    var max = default(TValue);
    var withMax = default(T);
    var comparer = Comparer<TValue>.Default;
    bool first = true;
    foreach (var item in source)
    {
        var value = selector(item);
        int compare = comparer.Compare(value, max);
        if (compare > 0 || first)
        {
            max = value;
            withMax = item;
        }
        first = false;
    }
    return withMax;
}
You can use it as follows :
from item in list
group item by item.Category into g
select g.WithMax(it => it.CreationDate);
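On .NET 6 or later, the built-in Enumerable.MaxBy covers the same need without a custom extension. The sketch below assumes that runtime; the Item class and LatestPerCategory.Pick are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Category { get; set; }
    public DateTime CreationDate { get; set; }
    public string Title { get; set; }
}

// Hypothetical helper class for illustration; requires .NET 6+ for MaxBy.
public static class LatestPerCategory
{
    public static List<Item> Pick(IEnumerable<Item> list) =>
        list.GroupBy(item => item.Category)
            // One pass per group, no sorting: keep the item with the largest date.
            .Select(g => g.MaxBy(it => it.CreationDate))
            .ToList();
}
```

Like WithMax, this returns a single item per category, not every item that shares the maximum date.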
UPDATE: As Anthony noted in his comment, this code doesn't exactly answer the question. If you want all items whose date is the maximum of their category, you can do something like this:
from item in list
group item by item.Category into g
let maxDate = g.Max(it => it.CreationDate)
select new
{
Category = g.Key,
Items = g.Where(it => it.CreationDate == maxDate)
};
How about this:
private class Test
{
public string Category { get; set; }
public DateTime PostDate { get; set; }
public string Post { get; set; }
}
private void Form1_Load(object sender, EventArgs e)
{
    List<Test> test = new List<Test>();
    test.Add(new Test() { Category = "A", PostDate = new DateTime(2010, 5, 5, 12, 0, 0), Post = "A1" });
    test.Add(new Test() { Category = "B", PostDate = new DateTime(2010, 5, 5, 13, 0, 0), Post = "B1" });
    test.Add(new Test() { Category = "A", PostDate = new DateTime(2010, 5, 6, 12, 0, 0), Post = "A2" });
    test.Add(new Test() { Category = "A", PostDate = new DateTime(2010, 5, 6, 13, 0, 0), Post = "A3" });
    test.Add(new Test() { Category = "A", PostDate = new DateTime(2010, 5, 6, 14, 0, 0), Post = "A4" });
    var q = test
        .GroupBy(t => t.Category)
        .Select(g => new { grp = g, max = g.Max(t2 => t2.PostDate).Date })
        .SelectMany(x => x.grp.Where(t => t.PostDate >= x.max));
}
Reformatting luc's excellent answer to query comprehension form. I like this better for this kind of query because the scoping rules let me write more concisely.
from item in source
group item by item.Category into g
let max = g.Max(item2 => item2.PostDate).Date
from item3 in g
where item3.PostDate.Date == max
select item3;
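For reference, here is the same query wrapped in a self-contained form that can be run against the sample posts from the earlier answer. The Post class, its Title property and LatestDayPosts.Pick are names made up for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Post
{
    public string Category { get; set; }
    public DateTime PostDate { get; set; }
    public string Title { get; set; }
}

// Hypothetical helper class for illustration.
public static class LatestDayPosts
{
    public static List<Post> Pick(IEnumerable<Post> source) =>
        (from item in source
         group item by item.Category into g
         // Latest calendar day on which anything was posted in this category.
         let maxDay = g.Max(entry => entry.PostDate).Date
         from candidate in g
         where candidate.PostDate.Date == maxDay
         select candidate).ToList();
}
```

For the sample data (A1 and B1 on 5 May, A2 to A4 on 6 May) this keeps A2, A3, A4 and B1.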
