I am using OrderBy, and I have figured out that I have to call OrderBy last, or it will not work. The Distinct operator does not guarantee that it maintains the original order of values, and if I use Include, it cannot sort the child collection.
Is there any reason why I shouldn't always call OrderBy last and stop worrying about whether the order is preserved?
Edit:
In general, is there any reason, such as a performance impact, why I should not use OrderBy last? It doesn't matter whether I use Entity Framework to query a database or am just querying some collection.
dbContext.EntityFramework.Distinct().OrderBy(o=> o.Something); // this will give me ordered result
dbContext.EntityFramework.OrderBy(o=> o.Something).Distinct(); // this will not, because Distinct doesn't preserve order.
Let's say that I want to select only one property.
dbContext.EntityFramework.Select(o=> o.Selected).OrderBy(o=> o.Something);
Will ordering be faster if I order the collection after selecting one property? In that case I should use OrderBy last. I am just asking whether there is any situation where ordering shouldn't be done as the last command.
Is there any reason why I shouldn't do OrderBy always last
There may be reasons to use OrderBy not as the last statement. For example, the sort property may not be in the result:
var result = context.Entities
.OrderBy(e => e.Date)
.Select(e => e.Name);
Or you want a sorted collection as part of the result:
var result = context.Customers
.Select(c => new
{
Customer = c,
Orders = c.Orders.OrderBy(o => o.Date),
Address = c.Address
});
Will order be faster if I order collection after one property selection?
Your examples show that you're working with LINQ to Entities, so the statements will be translated into SQL. You will notice that...
context.Entities
.OrderBy(e => e.Name)
.Select(e => e.Name)
... and ...
context.Entities
.Select(e => e.Name)
.OrderBy(s => s)
... will produce exactly the same SQL. So there is no essential difference between both OrderBy positions.
Doesn't matter if I use Entity Framework to query a database or just querying some collection.
Well, that does matter. For example, if you do...
context.Entities
.OrderBy(e => e.Date)
.Select(e => e.Name)
.Distinct()
... you'll notice that the OrderBy is completely ignored by EF and the order of names is unpredictable.
However, if you do ...
context.Entities
.AsEnumerable() // Continue as LINQ to objects
.OrderBy(e => e.Date)
.Select(e => e.Name)
.Distinct()
... you'll see that the sort order is preserved in the distinct result. LINQ to objects clearly has a different strategy than LINQ to Entities. OrderBy at the end of the statement would have made both results equal.
To sum it up, I'd say that as a rule of thumb, try to order as late as possible in a LINQ query. This will produce the most predictable results.
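A minimal LINQ to Objects sketch of that rule. The `entities` array and its `Name` and `Date` fields are made-up sample data for illustration, not the asker's model. Because the sort runs on the final, de-duplicated sequence, the result does not depend on whether an earlier operator happened to keep the order:

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for context.Entities.
var entities = new[]
{
    new { Name = "Carol", Date = new DateTime(2020, 1, 3) },
    new { Name = "Alice", Date = new DateTime(2020, 1, 2) },
    new { Name = "Carol", Date = new DateTime(2020, 1, 1) },
    new { Name = "Bob",   Date = new DateTime(2020, 1, 4) },
};

// Project and de-duplicate first, then sort the final sequence.
var names = entities
    .Select(e => e.Name)
    .Distinct()
    .OrderBy(n => n)
    .ToList();

Console.WriteLine(string.Join(", ", names)); // Alice, Bob, Carol
```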
I don't know if you misunderstood the meaning of Distinct. According to its definition, it:
Returns distinct elements from a sequence by using the default equality comparer to compare values.
So if you have a list of int and you want to remove repeated values, you use Distinct. Distinct uses the default equality comparer and, in LINQ to Objects, keeps track of the elements it has already seen, so it removes duplicates wherever they occur; the input does not have to be sorted first. The documentation describes the result as an unordered sequence, so sort after Distinct if you need a guaranteed order.
As for the OrderBy method, it does the sorting. So if you want to sort something and remove duplicates, you can use:
List<int> myNumbers = new List<int>{ 102, 2817, 82, 2, 1, 2, 1, 9, 4 };
Sorting and removing duplicated numbers
// returns 1, 2, 4, 9, 82, 102, 2817
var sortedUniques = myNumbers.OrderBy(n => n).Distinct();
Removing duplicated numbers and sorting
// returns 1, 2, 4, 9, 82, 102, 2817
// Distinct removes every duplicate, then OrderBy sorts what is left
var sortedUniques = myNumbers.Distinct().OrderBy(n => n);
Just removing duplicated numbers
// returns 102, 2817, 82, 2, 1, 9, 4 (first occurrences, in their original order in practice)
var uniques = myNumbers.Distinct();
Just sorting
// returns 1, 1, 2, 2, 4, 9, 82, 102, 2817
var sorted = myNumbers.OrderBy(n => n);
I hope it helps you \o/
Related
Maybe someone knows how to achieve this kind of query in linq (or lambda).
I have this set in a list
Filter: My input will be code 100 and 101, I need to get the "values", in this example = 1, 2.
Problem: If you input 100 and 101, you'll get 3 results, because of 100 from group 1 and group 2. I just need the pair that matches in the same group. (And you don't have group as an input param)
How can I solve this if the group fully exists?
thanks!
Starting with a simple representation in code of what you have in a picture:
var list = new[]
{
new{code = 100, value = 1, group = 1},
new{code = 101, value = 2, group = 1},
new{code = 100, value = 3, group = 2},
new{code = 103, value = 4, group = 2},
};
var inp = new[]{100, 103};
Then we can do:
list
.GroupBy(el => el.group) // Group by the "group" field.
.Where(grp => !inp.Except(grp.Select(el => el.code)).Any()) // Exclude groups that don't contain all input values
.Single() // Obtain the only such group (with a check that there is only one)
.Select(el => el.value); // Obtain the "value" fields.
If you could perhaps have inputs that were a subset of the "code" fields of some groups, you could also check that you match all of the group completely by excluding groups which have a different size:
list
.GroupBy(el => el.group)
.Where(grp =>
grp.Count() == inp.Count()
&& !inp.Except(grp.Select(el => el.code)).Any())
.Single()
.Select(el => el.value);
There are other variations that match other possible interpretations of your question (e.g. I'm assuming there can be only one matching group, but that wasn't clear).
For an arbitrary collection of objects, would following two LINQ expressions always give the same result (given that LINQ provider is the same):
var result = list.OrderBy(x => x.FirstName).Where(x => x.Age > 18);
var result = list.Where(x => x.Age > 18).OrderBy(x => x.FirstName);
While Enumerable.OrderBy() is specified to be a stable sort, Queryable.OrderBy() is not; its behavior depends on the query provider.
In other words, no, since the sort is not guaranteed to be stable, the two queries are not guaranteed to give the same result for all providers. At the very least, the results may be ordered in a different order.
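For LINQ to Objects specifically, the stable sort means the two forms do agree. A small sketch with made-up data (the `Id` field is an assumption added only to tell equal names apart); it says nothing about what another provider would return:

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data; several entries share the same FirstName.
var list = new[]
{
    new { FirstName = "Ann", Age = 30, Id = 1 },
    new { FirstName = "Bob", Age = 17, Id = 2 },
    new { FirstName = "Ann", Age = 25, Id = 3 },
    new { FirstName = "Ann", Age = 16, Id = 4 },
};

// Sort first, then filter.
var sortThenFilter = list
    .OrderBy(x => x.FirstName)
    .Where(x => x.Age > 18)
    .Select(x => x.Id)
    .ToList();

// Filter first, then sort.
var filterThenSort = list
    .Where(x => x.Age > 18)
    .OrderBy(x => x.FirstName)
    .Select(x => x.Id)
    .ToList();

// Enumerable.OrderBy is stable, so ties keep their source order in both forms.
Console.WriteLine(sortThenFilter.SequenceEqual(filterThenSort)); // True
```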
I have the following code which returns a list of Objects.
var listOfLogins = _logService.GetLogEventsByItemID(137).ToList();
I would like to get the 2nd last object in this list.
Does anyone know how to do this using Linq to Entities?
Thanks.
var secondlast = _logService.GetLogEventsByItemID(137)
.Reverse()
.Skip(1)
.Take(1)
.FirstOrDefault();
Update
@Dherik makes a good point in his comment that .Reverse is not actually supported in LINQ to Entities and will result in the query being evaluated at the point of calling Reverse, rather than at the point of calling .FirstOrDefault. See here for all (not) supported methods.
The alternative (LINQ to Entities friendly) solution requires that you have a suitable field to order by (which must be the case anyway otherwise "second last" has no relevance):
var secondlast = _logService.GetLogEventsByItemID(137)
.OrderByDescending(e => e.EventDate /* could be any db field */)
.Skip(1)
.Take(1)
.FirstOrDefault();
int[] items = new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
int item = items.Skip(items.Count() - 2).Take(1).Single();
//will return 9
like this?
Here is a sample dataset:
OrderProduct is a table that contains the productIds that were part of a given order.
Note: OrderProduct is a database table and I am using EF.
OrderId, ProductId
1, 1
2, 2
3, 4
3, 5
4, 5
4, 2
5, 2
5, 3
What I want to be able to do is find an order that contains only the productIds that I am searching for. So if my input was productIds 2,3, then I should get back OrderId 5.
I know how I can group data, but I am unsure of how to perform the select on the group.
Here is what I have:
var q = from op in OrderProduct
group op by op.OrderId into orderGroup
select orderGroup;
Not sure how to proceed from here
IEnumerable<int> products = new List<int> {2, 3};
IEnumerable<OrderProduct> orderProducts = new List<OrderProduct>
{
new OrderProduct(1, 1),
new OrderProduct(2, 2),
new OrderProduct(3, 4),
new OrderProduct(3, 5),
new OrderProduct(4, 5),
new OrderProduct(4, 2),
new OrderProduct(5, 2),
new OrderProduct(5, 3),
};
var orders =
(from op in orderProducts
group op by op.OrderId into orderGroup
//magic goes there
where !products.Except(orderGroup.Select(x => x.ProductId)).Any()
select orderGroup);
//outputs 5
orders.Select(x => x.Key).ToList().ForEach(Console.WriteLine);
Or you can use another version, as pointed out in another answer. Just replace
where !products.Except(orderGroup.Select(x => x.ProductId)).Any()
with
where products.All(pid => orderGroup.Any(op => op.ProductId == pid))
The second one has ~15% better performance (I've checked that).
Edit
According to the latest requirement change, you need orders that contain not just all of the productIds you are searching for, but exactly those and only those productIds. I wrote an updated version:
var orders =
(from op in orderProducts
group op by op.OrderId into orderGroup
//this line was added
where orderGroup.Count() == products.Count()
where !products.Except(orderGroup.Select(x => x.ProductId)).Any()
select orderGroup);
So the only thing you need to add is a precondition ensuring that both collections contain the same number of elements. It works for both previous queries. As a bonus, here is a third version of the main where condition:
where orderGroup.Select(x => x.ProductId).Intersect(products).Count() == orderGroup.Count()
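For in-memory collections, a different way to express "exactly these product IDs" is `HashSet<int>.SetEquals`. This is a swapped-in technique, not one of the queries above, and it is a LINQ to Objects sketch only: it is not expected to translate to SQL. The tuple-based sample data is an assumption that mirrors the dataset from the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The product IDs being searched for.
var products = new HashSet<int> { 2, 3 };

// In-memory copy of the sample OrderProduct rows, as (OrderId, ProductId) tuples.
var orderProducts = new (int OrderId, int ProductId)[]
{
    (1, 1), (2, 2), (3, 4), (3, 5), (4, 5), (4, 2), (5, 2), (5, 3),
};

// Keep only orders whose set of product IDs equals the searched set.
var exactOrderIds = orderProducts
    .GroupBy(op => op.OrderId)
    .Where(g => products.SetEquals(g.Select(op => op.ProductId)))
    .Select(g => g.Key)
    .ToList();

Console.WriteLine(string.Join(", ", exactOrderIds)); // 5
```

SetEquals ignores duplicates, so an order listing the same product twice would still match.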
At first glance, I'd try something like this:
var prodIds = new[] {2, 3};
from o in context.Orders
where prodIds.All(pid => o.OrderProducts.Any(op => op.ProductId == pid))
select o
In plain language: "get the orders that have a product with every ID in the given list."
Update
Since it appears you are using LINQ to SQL rather than LINQ to Entities, here's another approach:
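// AsQueryable() types q as IQueryable<T>, so the result of Where can be assigned back to it.
var q = context.Orders.AsQueryable();
foreach(var pid in prodIds)
{
var id = pid; // local copy: before C# 5 the lambda would capture the shared loop variable
q = q.Where(o => o.OrderProducts.Any(op => op.ProductId == id));
}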
Rather than using a single LINQ statement, you essentially build the query piecemeal.
Thanks to StriplingWarrior's answer I managed to figure this out. Not sure if this is the best way to do this, but it works.
List<int> prodIds = new List<int>{2,3};
var q = from o in Orders
//get all orderproducts that contain products in the ProdId list
where o.OrderProducts.All(op => prodIds.Contains(op.ProductId))
//now group the OrderProducts by the Orders
select from op in o.OrderProducts
group op by op.OrderId into opGroup
//select only those groups that have the same count as the prodId list
where opGroup.Count() == prodIds.Count()
select opGroup;
//get rid of any groups that may be empty
q = q.Where(fi => fi.Count()> 0);
(I am using LinqPad, which is why the query looks a little funky - no context, etc)
I'm trying to implement cascading controls using the following LINQ query expression.
The idea is that I have three option lists represented by the tables OptionA, OptionB and OptionC, and a view called OptionIndex with one column each for OptionA_ID, OptionB_ID and OptionC_ID. That view holds all the combinations of tags from the option lists that are in use. Left outer joining OptionIndex to an option list produces a boolean for the Disabled attribute in the option tag.
How do I make the on clause, which is .Where(...) in the following sample code, allow for any combination of the controls being used?
For example, let's say the user initially selects option value 123 in OptionA. The code to return the Values, Labels and Disabled booleans for OptionC would look like the following:
from t1 in OptionCs
from t2 in OptionIndexes.Where(x => t1.OptionC_ID == x.OptionC_ID && new List<int> { 123 }.Contains(x.OptionA_ID)).DefaultIfEmpty()
group new {t1, t2} by new { t1.OptionC_ID, t1.Label } into g
select new { g.Key.OptionC_ID, g.Key.Label, Disabled = g.Count(t => t.t2.OptionC_ID == null) > 0 }
Then let's say the user selects option values 456 and 789 in OptionB. The code to return the Values, Labels and Disabled booleans for OptionC changes to:
from t1 in OptionCs
from t2 in OptionIndexes.Where(x => t1.OptionC_ID == x.OptionC_ID && new List<int> { 123 }.Contains(x.OptionA_ID) && new List<int> { 456, 789 }.Contains(x.OptionB_ID)).DefaultIfEmpty()
group new {t1, t2} by new { t1.OptionC_ID, t1.Label } into g
select new { g.Key.OptionC_ID, g.Key.Label, Disabled = g.Count(t => t.t2.OptionC_ID == null) > 0 }
To make the example code easier to understand, I used new List<int>. In the actual project, however, I would pass in the integers from the option lists as integer arrays from the controls themselves.
The trick is somehow making the query expression dynamic so that it can represent any combination of 0 to N multi-select controls being used or passing something that tells the join to accept any value for any given control such as
{x.OptionB_ID.Any}.Contains(x.OptionB_ID)
What is the best way to handle this?
Thanks!
Distilling your issue down to a simple example, consider this list of integers:
List<int> l = new List<int> { 1, 25, 3, 99, -23, 0, 15, 75 };
Say that you want to conditionally filter this list based on external criteria. Sometimes you want positive numbers, sometimes you want numbers smaller than 50, sometimes you want numbers divisible by 5, or any combination of these. Applying all filters with a static expression would look like this:
l.Where(n => n > 0).Where(n => n < 50).Where(n => n % 5 == 0);
To apply any or all of these dynamically, just build the LINQ query in pieces:
// These switches simulate your external conditions.
bool conditionA = true;
bool conditionB = false;
bool conditionC = true;
IEnumerable<int> myList = l;
if (conditionA) { myList = myList.Where(n => n > 0 ); }
if (conditionB) { myList = myList.Where(n => n < 50 ); }
if (conditionC) { myList = myList.Where(n => n % 5 == 0); }
With the switches set as in my example, the output is 25, 15, 75.
Side note: if you are not aware of it, use LINQPad to experiment with things like this. It is a fantastic tool for essentially executing code interactively, be it LINQ code or not. When I built the above sample, I inserted myList.Dump(); calls after each of the last 4 lines so I could see how each filter was applied. Here is the output: