Comparing two lists with multiple conditions - linq

I have two different lists of the same type. I want to compare them and get the values that are not matched.
List of class:
public class pre
{
public int id {get; set;}
public datetime date {get; set;}
public int sID {get; set;}
}
Two lists :
List<pre> pre1 = new List<pre>();
List<pre> pre2 = new List<pre>();
Query which I wrote to get the unmatched values:
var preResult = pre1.where(p1 => !pre
.any(p2 => p2.id == p1.id && p2.date == p1.date && p2.sID == p1sID));
But the result is wrong here. I am getting all the values in pre1.

Here is a solution:
class Program
{
static void Main(string[] args)
{
var pre1 = new List<pre>()
{
new pre {id = 1, date =DateTime.Now.Date, sID=1 },
new pre {id = 7, date = DateTime.Now.Date, sID = 2 },
new pre {id = 9, date = DateTime.Now.Date, sID = 3 },
new pre {id = 13, date = DateTime.Now.Date, sID = 4 },
// ... etc ...
};
var pre2 = new List<pre>()
{
new pre {id = 1, date =DateTime.Now.Date, sID=1 },
// ... etc ...
};
var preResult = pre1.Where(p1 => !pre2.Any(p2 => p2.id == p1.id && p2.date == p1.date && p2.sID == p1.sID)).ToList();
Console.ReadKey();
}
}
Note: the date property contains only the date; its time part is 00:00:00.

I fixed some typos and tested your code with sensible values, and your code would correctly select unmatched records. As prabhakaran S's answer mentions, perhaps your date values include time components that differ. You will need to check your data and decide how to proceed.
However, a better way to select unmatched records from one list compared against another would be to utilize a left join technique common to working with relational databases, which you can also do in Linq against in-memory collections. It will scale better as the sizes of your inputs grow.
var preResult = from p1 in pre1
join p2 in pre2
on new { p1.id, p1.date, p1.sID }
equals new { p2.id, p2.date, p2.sID } into grp
from item in grp.DefaultIfEmpty()
where item == null
select p1;
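To make the two points above concrete, here is a minimal, self-contained sketch of the left-join technique that also compares only the date part of each value. The names Pre, UnmatchedDemo, Unmatched and Run are made up for this illustration, and normalizing with .Date is an assumption about what "matching dates" should mean for your data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's class, with conventional casing.
public class Pre
{
    public int Id { get; set; }
    public DateTime Date { get; set; }
    public int SID { get; set; }
}

public static class UnmatchedDemo
{
    // Returns the items of 'left' that have no counterpart in 'right',
    // comparing Id, SID and only the date part of Date.
    public static List<Pre> Unmatched(IEnumerable<Pre> left, IEnumerable<Pre> right)
    {
        var query =
            from l in left
            join r in right
                on new { l.Id, Date = l.Date.Date, l.SID }
                equals new { r.Id, Date = r.Date.Date, r.SID } into matches
            from m in matches.DefaultIfEmpty()
            where m == null   // keep only left items without a match
            select l;
        return query.ToList();
    }

    public static List<int> Run()
    {
        var day = new DateTime(2020, 1, 15);
        var pre1 = new List<Pre>
        {
            new Pre { Id = 1, Date = day.AddHours(9), SID = 1 },  // same day, different time
            new Pre { Id = 7, Date = day, SID = 2 },
            new Pre { Id = 9, Date = day, SID = 3 },
        };
        var pre2 = new List<Pre>
        {
            new Pre { Id = 1, Date = day.AddHours(17), SID = 1 },
        };
        return Unmatched(pre1, pre2).Select(p => p.Id).ToList();
    }
}
```

With this sample data, UnmatchedDemo.Run() returns the ids 7 and 9; the record with id 1 counts as matched even though the time components differ.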

Related

Linq left outer join doesn't work while matching sql does [duplicate]

How to perform left outer join in C# LINQ to objects without using join-on-equals-into clauses? Is there any way to do that with where clause?
Correct problem:
For inner join is easy and I have a solution like this
List<JoinPair> innerFinal = (from l in lefts from r in rights where l.Key == r.Key
select new JoinPair { LeftId = l.Id, RightId = r.Id}).ToList();
but for left outer join I need a solution. Mine is something like this but it's not working
List< JoinPair> leftFinal = (from l in lefts from r in rights
select new JoinPair {
LeftId = l.Id,
RightId = (l.Key==r.Key) ? r.Id : 0
}).ToList();
where JoinPair is a class:
public class JoinPair { public long LeftId { get; set; } public long RightId { get; set; } }
As stated in "Perform left outer joins":
var q =
from c in categories
join pt in products on c.Category equals pt.Category into ps_jointable
from p in ps_jointable.DefaultIfEmpty()
select new { Category = c, ProductName = p == null ? "(No products)" : p.ProductName };
If a database driven LINQ provider is used, a significantly more readable left outer join can be written as such:
from c in categories
from p in products.Where(x => c == x.Category).DefaultIfEmpty()
If you omit the DefaultIfEmpty() you will have an inner join.
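As a small illustration of that difference, here is a sketch over in-memory lists (used only so it is easy to run; the form is really intended for database providers). DemoProduct, ApplyJoinDemo and the sample data are invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for the illustration.
public class DemoProduct
{
    public string Category { get; set; }
    public string Name { get; set; }
}

public static class ApplyJoinDemo
{
    // Returns { rows with DefaultIfEmpty, rows without DefaultIfEmpty }.
    public static int[] Run()
    {
        var categories = new List<string> { "Fruit", "Dairy", "Unused" };
        var products = new List<DemoProduct>
        {
            new DemoProduct { Category = "Fruit", Name = "Apple" },
            new DemoProduct { Category = "Fruit", Name = "Pear" },
            new DemoProduct { Category = "Dairy", Name = "Milk" },
        };

        // With DefaultIfEmpty: categories without products still produce one row.
        var leftJoin =
            from c in categories
            from p in products.Where(x => x.Category == c).DefaultIfEmpty()
            select new { Category = c, ProductName = p == null ? "(No products)" : p.Name };

        // Without DefaultIfEmpty: categories without products disappear.
        var innerJoin =
            from c in categories
            from p in products.Where(x => x.Category == c)
            select new { Category = c, ProductName = p.Name };

        return new[] { leftJoin.Count(), innerJoin.Count() };
    }
}
```

For this data the left-join form yields 4 rows (including one for "Unused") and the inner-join form yields 3.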
Take the accepted answer:
from c in categories
join p in products on c equals p.Category into ps
from p in ps.DefaultIfEmpty()
This syntax is very confusing, and it's not clear how it works when you want to left join MULTIPLE tables.
Note
It should be noted that from alias in Repo.whatever.Where(condition).DefaultIfEmpty() is the same as an outer-apply/left-join-lateral, which any (decent) database-optimizer is perfectly capable of translating into a left join, as long as you don't introduce per-row-values (aka an actual outer apply). Don't do this in Linq-2-Objects (because there's no DB-optimizer when you use Linq-to-Objects).
Detailed Example
var query2 = (
from users in Repo.T_User
from mappings in Repo.T_User_Group
.Where(mapping => mapping.USRGRP_USR == users.USR_ID)
.DefaultIfEmpty() // <== makes join left join
from groups in Repo.T_Group
.Where(gruppe => gruppe.GRP_ID == mappings.USRGRP_GRP)
.DefaultIfEmpty() // <== makes join left join
// where users.USR_Name.Contains(keyword)
// || mappings.USRGRP_USR.Equals(666)
// || mappings.USRGRP_USR == 666
// || groups.Name.Contains(keyword)
select new
{
UserId = users.USR_ID
,UserName = users.USR_User
,UserGroupId = groups.ID
,GroupName = groups.Name
}
);
var xy = (query2).ToList();
When used with LINQ 2 SQL it will translate nicely to the following very legible SQL query:
SELECT
users.USR_ID AS UserId
,users.USR_User AS UserName
,groups.ID AS UserGroupId
,groups.Name AS GroupName
FROM T_User AS users
LEFT JOIN T_User_Group AS mappings
ON mappings.USRGRP_USR = users.USR_ID
LEFT JOIN T_Group AS groups
ON groups.GRP_ID = mappings.USRGRP_GRP
Edit:
See also "Convert SQL Server query to Linq query" for a more complex example.
Also, if you're doing it in LINQ to Objects (instead of LINQ to SQL), you should do it the old-fashioned way. LINQ to SQL translates the from/Where/DefaultIfEmpty form into join operations, but over in-memory objects it forces a full scan of the inner collection for every outer element and cannot take advantage of a keyed lookup:
var query2 = (
from users in Repo.T_Benutzer
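// The outer range variable must be on the left of 'equals', the joined one on the right.
join mappings in Repo.T_Benutzer_Benutzergruppen on users.BE_ID equals mappings.BEBG_BE into tmpMapp
from mappings in tmpMapp.DefaultIfEmpty()
// For in-memory collections, mappings can be null here; guard the key accordingly.
join groups in Repo.T_Benutzergruppen on mappings.BEBG_BG equals groups.ID into tmpGroups
from groups in tmpGroups.DefaultIfEmpty()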
select new
{
UserId = users.BE_ID
,UserName = users.BE_User
,UserGroupId = mappings.BEBG_BG
,GroupName = groups.Name
}
);
Using lambda expression
db.Categories
.GroupJoin(db.Products,
Category => Category.CategoryId,
Product => Product.CategoryId,
(x, y) => new { Category = x, Products = y })
.SelectMany(
xy => xy.Products.DefaultIfEmpty(),
(x, y) => new { Category = x.Category, Product = y })
.Select(s => new
{
CategoryName = s.Category.Name,
ProductName = s.Product.Name
});
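Over in-memory collections, the same GroupJoin/SelectMany shape needs a null check on the right-hand element, because DefaultIfEmpty() yields null for unmatched rows. The following is a sketch under that assumption; DemoCategory, DemoItem and LambdaLeftJoinDemo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for the illustration.
public class DemoCategory
{
    public int CategoryId { get; set; }
    public string Name { get; set; }
}

public class DemoItem
{
    public int CategoryId { get; set; }
    public string Name { get; set; }
}

public static class LambdaLeftJoinDemo
{
    public static List<string> Run()
    {
        var categories = new[]
        {
            new DemoCategory { CategoryId = 1, Name = "Books" },
            new DemoCategory { CategoryId = 2, Name = "Games" },
        };
        var items = new[] { new DemoItem { CategoryId = 1, Name = "Novel" } };

        return categories
            // Pair every category with the (possibly empty) group of its items.
            .GroupJoin(items, c => c.CategoryId, i => i.CategoryId,
                (c, matches) => new { c, matches })
            // Flatten; an empty group contributes a single null item.
            .SelectMany(x => x.matches.DefaultIfEmpty(), (x, item) => new { x.c, item })
            // item is null for categories without items, so guard the access.
            .Select(r => r.c.Name + ": " + (r.item?.Name ?? "(none)"))
            .ToList();
    }
}
```

Here Run() returns "Books: Novel" and "Games: (none)".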
Now as an extension method:
public static class LinqExt
{
public static IEnumerable<TResult> LeftOuterJoin<TLeft, TRight, TKey, TResult>(this IEnumerable<TLeft> left, IEnumerable<TRight> right, Func<TLeft, TKey> leftKey, Func<TRight, TKey> rightKey,
Func<TLeft, TRight, TResult> result)
{
return left.GroupJoin(right, leftKey, rightKey, (l, r) => new { l, r })
.SelectMany(
o => o.r.DefaultIfEmpty(),
(l, r) => new { lft= l.l, rght = r })
.Select(o => result.Invoke(o.lft, o.rght));
}
}
Use like you would normally use join:
var contents = list.LeftOuterJoin(list2,
l => l.country,
r => r.name,
(l, r) => new { count = l.Count(), l.country, l.reason, r.people })
Hope this saves you some time.
Take a look at this example.
This query should work:
var leftFinal = from left in lefts
join right in rights on left.Key equals right.Key into leftRights
from leftRight in leftRights.DefaultIfEmpty()
select new { LeftId = left.Id, RightId = leftRight != null ? leftRight.Id : 0 };
An implementation of left outer join by extension methods could look like
public static IEnumerable<Result> LeftJoin<TOuter, TInner, TKey, Result>(
this IEnumerable<TOuter> outer, IEnumerable<TInner> inner
, Func<TOuter, TKey> outerKeySelector, Func<TInner, TKey> innerKeySelector
, Func<TOuter, TInner, Result> resultSelector, IEqualityComparer<TKey> comparer = null)
{
if (outer == null)
throw new ArgumentNullException(nameof(outer));
if (inner == null)
throw new ArgumentNullException(nameof(inner));
if (outerKeySelector == null)
throw new ArgumentNullException(nameof(outerKeySelector));
if (innerKeySelector == null)
throw new ArgumentNullException(nameof(innerKeySelector));
if (resultSelector == null)
throw new ArgumentNullException(nameof(resultSelector));
return LeftJoinImpl(outer, inner, outerKeySelector, innerKeySelector, resultSelector, comparer ?? EqualityComparer<TKey>.Default);
}
static IEnumerable<Result> LeftJoinImpl<TOuter, TInner, TKey, Result>(
IEnumerable<TOuter> outer, IEnumerable<TInner> inner
, Func<TOuter, TKey> outerKeySelector, Func<TInner, TKey> innerKeySelector
, Func<TOuter, TInner, Result> resultSelector, IEqualityComparer<TKey> comparer)
{
var innerLookup = inner.ToLookup(innerKeySelector, comparer);
foreach (var outerElment in outer)
{
var outerKey = outerKeySelector(outerElment);
var innerElements = innerLookup[outerKey];
if (innerElements.Any())
foreach (var innerElement in innerElements)
yield return resultSelector(outerElment, innerElement);
else
yield return resultSelector(outerElment, default(TInner));
}
}
The resultSelector then has to take care of null elements, for example:
static void Main(string[] args)
{
var inner = new[] { Tuple.Create(1, "1"), Tuple.Create(2, "2"), Tuple.Create(3, "3") };
var outer = new[] { Tuple.Create(1, "11"), Tuple.Create(2, "22") };
var res = outer.LeftJoin(inner, item => item.Item1, item => item.Item1, (it1, it2) =>
new { Key = it1.Item1, V1 = it1.Item2, V2 = it2 != null ? it2.Item2 : default(string) });
foreach (var item in res)
Console.WriteLine(string.Format("{0}, {1}, {2}", item.Key, item.V1, item.V2));
}
Take a look at this example:
class Person
{
public int ID { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
public string Phone { get; set; }
}
class Pet
{
public string Name { get; set; }
public Person Owner { get; set; }
}
public static void LeftOuterJoinExample()
{
Person magnus = new Person {ID = 1, FirstName = "Magnus", LastName = "Hedlund"};
Person terry = new Person {ID = 2, FirstName = "Terry", LastName = "Adams"};
Person charlotte = new Person {ID = 3, FirstName = "Charlotte", LastName = "Weiss"};
Person arlene = new Person {ID = 4, FirstName = "Arlene", LastName = "Huff"};
Pet barley = new Pet {Name = "Barley", Owner = terry};
Pet boots = new Pet {Name = "Boots", Owner = terry};
Pet whiskers = new Pet {Name = "Whiskers", Owner = charlotte};
Pet bluemoon = new Pet {Name = "Blue Moon", Owner = terry};
Pet daisy = new Pet {Name = "Daisy", Owner = magnus};
// Create two lists.
List<Person> people = new List<Person> {magnus, terry, charlotte, arlene};
List<Pet> pets = new List<Pet> {barley, boots, whiskers, bluemoon, daisy};
var query = from person in people
join pet in pets on person equals pet.Owner into personpets
from petOrNull in personpets.DefaultIfEmpty()
select new { Person=person, Pet = petOrNull};
foreach (var v in query )
{
Console.WriteLine("{0,-15}{1}", v.Person.FirstName + ":", (v.Pet == null ? "Does not Exist" : v.Pet.Name));
}
}
// This code produces the following output:
//
// Magnus: Daisy
// Terry: Barley
// Terry: Boots
// Terry: Blue Moon
// Charlotte: Whiskers
// Arlene: Does not Exist
Now you are able to include elements from the left even if they have no match on the right; in our case Arlene is returned even though there is no matching pet.
Here is the reference:
How to: Perform Left Outer Joins (C# Programming Guide)
This is the general form (as already provided in other answers)
var c =
from a in alpha
join b in beta on a.field1 equals b.field1 into b_temp
from b_value in b_temp.DefaultIfEmpty()
select new { Alpha = a, Beta = b_value };
However here's an explanation that I hope will clarify what this actually means!
join b in beta on a.field1 equals b.field1 into b_temp
essentially creates a separate result set b_temp that effectively includes null 'rows' for entries on the right hand side (entries in 'b').
Then the next line:
from b_value in b_temp.DefaultIfEmpty()
..iterates over that result set, setting the default null value for the 'row' on the right hand side, and setting the result of the right hand side row join to the value of 'b_value' (i.e. the value that's on the right hand side,if there's a matching record, or 'null' if there isn't).
Now, if the right hand side is the result of a separate LINQ query, it will consist of anonymous types, which can only either be 'something' or 'null'. If it's an enumerable however (e.g. a List<MyObjectB>, where MyObjectB is a class with 2 fields), then it's possible to be specific about what default 'null' values are used for its properties:
var c =
from a in alpha
join b in beta on a.field1 equals b.field1 into b_temp
from b_value in b_temp.DefaultIfEmpty( new MyObjectB { Field1 = String.Empty, Field2 = (DateTime?) null })
select new { Alpha = a, Beta_field1 = b_value.Field1, Beta_field2 = b_value.Field2 };
This ensures that 'b' itself isn't null (but its properties can be null, using the default null values that you've specified), and this allows you to check properties of b_value without getting a null reference exception for b_value. Note that for a nullable DateTime, a type of (DateTime?) i.e. 'nullable DateTime' must be specified as the 'Type' of the null in the specification for the 'DefaultIfEmpty' (this will also apply to types that are not 'natively' nullable e.g double, float).
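A runnable sketch of the DefaultIfEmpty(defaultValue) overload described above, so the behaviour can be checked in isolation. SampleB, its Key property and DefaultValueDemo are hypothetical stand-ins for MyObjectB and the surrounding query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class with a join key and two fields, one of them nullable.
public class SampleB
{
    public int Key { get; set; }
    public string Field1 { get; set; }
    public DateTime? Field2 { get; set; }
}

public static class DefaultValueDemo
{
    public static List<string> Run()
    {
        var alpha = new[] { 1, 2 };
        var beta = new[]
        {
            new SampleB { Key = 1, Field1 = "one", Field2 = new DateTime(2020, 1, 1) },
        };

        // Placeholder used instead of null when nothing on the right matches.
        var placeholder = new SampleB { Field1 = string.Empty, Field2 = null };

        var query =
            from a in alpha
            join b in beta on a equals b.Key into bTemp
            from bValue in bTemp.DefaultIfEmpty(placeholder)
            // bValue is never null here, so its properties can be read directly.
            select a + ":" + bValue.Field1 + ":" + bValue.Field2.HasValue;

        return query.ToList();
    }
}
```

Run() returns "1:one:True" for the matched row and "2::False" for the unmatched one, without any null check on bValue itself.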
You can perform multiple left outer joins by simply chaining the above syntax.
Here's an example if you need to join more than 2 tables:
from d in context.dc_tpatient_bookingd
join bookingm in context.dc_tpatient_bookingm
on d.bookingid equals bookingm.bookingid into bookingmGroup
from m in bookingmGroup.DefaultIfEmpty()
join patient in dc_tpatient
on m.prid equals patient.prid into patientGroup
from p in patientGroup.DefaultIfEmpty()
Ref: https://stackoverflow.com/a/17142392/2343
Here is a fairly easy to understand version using method syntax:
IEnumerable<JoinPair> outerLeft =
lefts.SelectMany(l =>
rights.Where(r => l.Key == r.Key)
.DefaultIfEmpty(new Item())
.Select(r => new JoinPair { LeftId = l.Id, RightId = r.Id }));
Extension method that works like left join with Join syntax
public static class LinQExtensions
{
public static IEnumerable<TResult> LeftJoin<TOuter, TInner, TKey, TResult>(
this IEnumerable<TOuter> outer, IEnumerable<TInner> inner,
Func<TOuter, TKey> outerKeySelector,
Func<TInner, TKey> innerKeySelector,
Func<TOuter, TInner, TResult> resultSelector)
{
return outer.GroupJoin(
inner,
outerKeySelector,
innerKeySelector,
(outerElement, innerElements) => resultSelector(outerElement, innerElements.FirstOrDefault()));
}
}
I just wrote it in .NET Core and it seems to be working as expected.
Small test:
var Ids = new List<int> { 1, 2, 3, 4};
var items = new List<Tuple<int, string>>
{
new Tuple<int, string>(1,"a"),
new Tuple<int, string>(2,"b"),
new Tuple<int, string>(4,"d"),
new Tuple<int, string>(5,"e"),
};
var result = Ids.LeftJoin(
items,
id => id,
item => item.Item1,
(id, item) => item ?? new Tuple<int, string>(id, "not found"));
result.ToList()
Count = 4
[0]: {(1, a)}
[1]: {(2, b)}
[2]: {(3, not found)}
[3]: {(4, d)}
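One caveat worth checking before adopting this shape: because the result selector receives innerElements.FirstOrDefault(), each outer element produces exactly one row, so additional matches on the right are dropped. The sketch below contrasts it with the GroupJoin + SelectMany + DefaultIfEmpty form; FirstMatchOnlyDemo and the sample data are invented for the comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstMatchOnlyDemo
{
    // Returns { rows from the FirstOrDefault variant, rows from the SelectMany variant }.
    public static int[] Run()
    {
        var ids = new List<int> { 1, 2 };
        var items = new List<Tuple<int, string>>
        {
            Tuple.Create(1, "a"),
            Tuple.Create(1, "b"),   // second match for id 1
        };

        // GroupJoin + FirstOrDefault: one row per outer element.
        var firstOnly = ids
            .GroupJoin(items, id => id, it => it.Item1,
                (id, matches) => matches.FirstOrDefault())
            .ToList();

        // GroupJoin + SelectMany + DefaultIfEmpty: one row per match,
        // plus one row with a null item for unmatched outer elements.
        var allMatches = ids
            .GroupJoin(items, id => id, it => it.Item1,
                (id, matches) => new { id, matches })
            .SelectMany(x => x.matches.DefaultIfEmpty(), (x, it) => new { x.id, it })
            .ToList();

        return new[] { firstOnly.Count, allMatches.Count };
    }
}
```

With this data the first variant yields 2 rows and the second yields 3. If keys on the right are known to be unique, both behave the same.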
I would like to add that if you get the MoreLINQ extension, there is now support for both homogeneous and heterogeneous left joins:
http://morelinq.github.io/2.8/ref/api/html/Overload_MoreLinq_MoreEnumerable_LeftJoin.htm
example:
//Pretend a ClientCompany object and an Employee object both have a ClientCompanyID key on them
return DataContext.ClientCompany
.LeftJoin(DataContext.Employees, //Table being joined
company => company.ClientCompanyID, //First key
employee => employee.ClientCompanyID, //Second Key
company => new {company, employee = (Employee)null}, //Result selector when there isn't a match
(company, employee) => new { company, employee }); //Result selector when there is a match
EDIT:
In retrospect this may work, but it converts the IQueryable to an IEnumerable as morelinq does not convert the query to SQL.
You can instead use a GroupJoin as described here: https://stackoverflow.com/a/24273804/4251433
This will ensure that it stays as an IQueryable in case you need to do further logical operations on it later.
There are three tables: persons, schools and persons_schools, which connects persons to the schools they study in. A reference to the person with id=6 is absent from the table persons_schools. However, the person with id=6 is present in the resulting left-joined grid.
List<Person> persons = new List<Person>
{
new Person { id = 1, name = "Alex", phone = "4235234" },
new Person { id = 2, name = "Bob", phone = "0014352" },
new Person { id = 3, name = "Sam", phone = "1345" },
new Person { id = 4, name = "Den", phone = "3453452" },
new Person { id = 5, name = "Alen", phone = "0353012" },
new Person { id = 6, name = "Simon", phone = "0353012" }
};
List<School> schools = new List<School>
{
new School { id = 1, name = "Saint. John's school"},
new School { id = 2, name = "Public School 200"},
new School { id = 3, name = "Public School 203"}
};
List<PersonSchool> persons_schools = new List<PersonSchool>
{
new PersonSchool{id_person = 1, id_school = 1},
new PersonSchool{id_person = 2, id_school = 2},
new PersonSchool{id_person = 3, id_school = 3},
new PersonSchool{id_person = 4, id_school = 1},
new PersonSchool{id_person = 5, id_school = 2}
//a relation to the person with id=6 is absent
};
var query = from person in persons
join person_school in persons_schools on person.id equals person_school.id_person
into persons_schools_joined
from person_school_joined in persons_schools_joined.DefaultIfEmpty()
from school in schools.Where(var_school => person_school_joined == null ? false : var_school.id == person_school_joined.id_school).DefaultIfEmpty()
select new { Person = person.name, School = school == null ? String.Empty : school.name };
foreach (var elem in query)
{
System.Console.WriteLine("{0},{1}", elem.Person, elem.School);
}
An easy way is to use the let keyword. This works for me.
from AItem in Db.A
let BItem = Db.B.Where(x => x.id == AItem.id ).FirstOrDefault()
where SomeCondition
select new YourViewModel
{
X1 = AItem.a,
X2 = AItem.b,
X3 = BItem.c
}
This simulates a left join. If no item in table B matches the A item, BItem is null.
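For in-memory collections, the same let-based idea might look like the sketch below. Note the explicit null check, and that this keeps at most one B row per A row. The names tableA, tableB and LetJoinDemo are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LetJoinDemo
{
    public static List<string> Run()
    {
        // Hypothetical in-memory stand-ins for tables A and B.
        var tableA = new[] { new { id = 1, a = "x" }, new { id = 2, a = "y" } };
        var tableB = new[] { new { id = 1, c = "match" } };

        var query =
            from aItem in tableA
            // First matching B row, or null when there is none.
            let bItem = tableB.FirstOrDefault(x => x.id == aItem.id)
            select aItem.a + "=" + (bItem == null ? "(none)" : bItem.c);

        return query.ToList();
    }
}
```

Run() returns "x=match" and "y=(none)".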
This is a comparison of SQL syntax and LINQ syntax for inner and left outer joins.
Left Outer Join:
http://www.ozkary.com/2011/07/linq-to-entity-inner-and-left-joins.html
"The following example does a group join between product and category. This is essentially the left join. The into expression returns data even if the category table is empty. To access the properties of the category table, we must now select from the enumerable result by adding the from cl in catList.DefaultIfEmpty() statement.
As per my answer to a similar question, here:
Linq to SQL left outer join using Lambda syntax and joining on 2 columns (composite join key)
Get the code here, or clone my github repo, and play!
Query:
var petOwners =
from person in People
join pet in Pets
on new
{
person.Id,
person.Age,
}
equals new
{
pet.Id,
Age = pet.Age * 2, // owner is twice age of pet
}
into pets
from pet in pets.DefaultIfEmpty()
select new PetOwner
{
Person = person,
Pet = pet,
};
Lambda:
var petOwners = People.GroupJoin(
Pets,
person => new { person.Id, person.Age },
pet => new { pet.Id, Age = pet.Age * 2 },
(person, pet) => new
{
Person = person,
Pets = pet,
}).SelectMany(
pet => pet.Pets.DefaultIfEmpty(),
(people, pet) => new
{
people.Person,
Pet = pet,
});
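The composite key works because anonymous types with the same property names, types and order compare by value. That is also why the pet side has to name its computed property Age explicitly. A small sketch to check this in isolation (CompositeKeyDemo and the values are invented):

```csharp
using System;

public static class CompositeKeyDemo
{
    public static bool[] Run()
    {
        var personKey = new { Id = 1, Age = 20 };
        var petKey = new { Id = 1, Age = 10 * 2 };   // computed, but named Age
        var otherKey = new { Id = 1, Age = 21 };

        return new[]
        {
            personKey.Equals(petKey),                          // same values
            personKey.GetHashCode() == petKey.GetHashCode(),   // equal objects hash equally
            personKey.Equals(otherKey),                        // different Age
        };
    }
}
```

Run() returns true, true, false. If the property names or their order differed, the two key expressions would be different anonymous types and the join would not compile.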
This is the LeftJoin implementation I use. Notice that the resultSelector expression accepts 2 parameters: one instance from each side of the join. In most other implementations that I've seen, the result selector only accepts one parameter, which is a "join model" with a left/right or outer/inner property. I like this implementation better because it has the same method signature as the built-in Join method. It also works with IQueryables and EF.
var results = DbContext.Categories
.LeftJoin(
DbContext.Products, c => c.Id, p => p.CategoryId,
(c, p) => new { Category = c, ProductName = p == null ? "(No Products)" : p.ProductName })
.ToList();
public static class QueryableExtensions
{
public static IQueryable<TResult> LeftJoin<TOuter, TInner, TKey, TResult>(
this IQueryable<TOuter> outer,
IEnumerable<TInner> inner, Expression<Func<TOuter, TKey>> outerKeySelector,
Expression<Func<TInner, TKey>> innerKeySelector,
Expression<Func<TOuter, TInner, TResult>> resultSelector)
{
var query = outer
.GroupJoin(inner, outerKeySelector, innerKeySelector, (o, i) => new { o, i })
.SelectMany(o => o.i.DefaultIfEmpty(), (x, i) => new { x.o, i });
return ApplySelector(query, x => x.o, x => x.i, resultSelector);
}
private static IQueryable<TResult> ApplySelector<TSource, TOuter, TInner, TResult>(
IQueryable<TSource> source,
Expression<Func<TSource, TOuter>> outerProperty,
Expression<Func<TSource, TInner>> innerProperty,
Expression<Func<TOuter, TInner, TResult>> resultSelector)
{
var p = Expression.Parameter(typeof(TSource), $"param_{Guid.NewGuid()}".Replace("-", string.Empty));
Expression body = resultSelector?.Body
.ReplaceParameter(resultSelector.Parameters[0], outerProperty.Body.ReplaceParameter(outerProperty.Parameters[0], p))
.ReplaceParameter(resultSelector.Parameters[1], innerProperty.Body.ReplaceParameter(innerProperty.Parameters[0], p));
var selector = Expression.Lambda<Func<TSource, TResult>>(body, p);
return source.Select(selector);
}
}
public static class ExpressionExtensions
{
public static Expression ReplaceParameter(this Expression source, ParameterExpression toReplace, Expression newExpression)
=> new ReplaceParameterExpressionVisitor(toReplace, newExpression).Visit(source);
}
public class ReplaceParameterExpressionVisitor : ExpressionVisitor
{
public ReplaceParameterExpressionVisitor(ParameterExpression toReplace, Expression replacement)
{
this.ToReplace = toReplace;
this.Replacement = replacement;
}
public ParameterExpression ToReplace { get; }
public Expression Replacement { get; }
protected override Expression VisitParameter(ParameterExpression node)
=> (node == ToReplace) ? Replacement : base.VisitParameter(node);
}
Perform left outer joins in linq C#
// Perform left outer joins
class Person
{
public string FirstName { get; set; }
public string LastName { get; set; }
}
class Child
{
public string Name { get; set; }
public Person Owner { get; set; }
}
public class JoinTest
{
public static void LeftOuterJoinExample()
{
Person magnus = new Person { FirstName = "Magnus", LastName = "Hedlund" };
Person terry = new Person { FirstName = "Terry", LastName = "Adams" };
Person charlotte = new Person { FirstName = "Charlotte", LastName = "Weiss" };
Person arlene = new Person { FirstName = "Arlene", LastName = "Huff" };
Child barley = new Child { Name = "Barley", Owner = terry };
Child boots = new Child { Name = "Boots", Owner = terry };
Child whiskers = new Child { Name = "Whiskers", Owner = charlotte };
Child bluemoon = new Child { Name = "Blue Moon", Owner = terry };
Child daisy = new Child { Name = "Daisy", Owner = magnus };
// Create two lists.
List<Person> people = new List<Person> { magnus, terry, charlotte, arlene };
List<Child> childs = new List<Child> { barley, boots, whiskers, bluemoon, daisy };
var query = from person in people
join child in childs
on person equals child.Owner into gj
from subpet in gj.DefaultIfEmpty()
select new
{
person.FirstName,
ChildName = subpet!=null? subpet.Name:"No Child"
};
// PetName = subpet?.Name ?? String.Empty };
foreach (var v in query)
{
Console.WriteLine($"{v.FirstName + ":",-25}{v.ChildName}");
}
}
// This code produces the following output:
//
// Magnus: Daisy
// Terry: Barley
// Terry: Boots
// Terry: Blue Moon
// Charlotte: Whiskers
// Arlene: No Child
}
https://dotnetwithhamid.blogspot.in/
Here's a version of the extension method solution using IQueryable instead of IEnumerable
public class OuterJoinResult<TLeft, TRight>
{
public TLeft LeftValue { get; set; }
public TRight RightValue { get; set; }
}
public static IQueryable<TResult> LeftOuterJoin<TLeft, TRight, TKey, TResult>(this IQueryable<TLeft> left, IQueryable<TRight> right, Expression<Func<TLeft, TKey>> leftKey, Expression<Func<TRight, TKey>> rightKey, Expression<Func<OuterJoinResult<TLeft, TRight>, TResult>> result)
{
return left.GroupJoin(right, leftKey, rightKey, (l, r) => new { l, r })
.SelectMany(o => o.r.DefaultIfEmpty(), (l, r) => new OuterJoinResult<TLeft, TRight> { LeftValue = l.l, RightValue = r })
.Select(result);
}
If you need to join and filter on something, the filter can be applied outside of the join, after the collection has been created.
In this case, putting the filter in the join condition would reduce the rows that are returned.
A ternary condition is used (n == null ? "__" : n.MonDayNote):
If the object is null (so no match), return what is after the ?, which is "__" in this case.
Otherwise, return what is after the :, which is n.MonDayNote.
Thanks to the other contributors; their answers are where I started with my own issue.
var schedLocations = (from f in db.RAMS_REVENUE_LOCATIONS
join n in db.RAMS_LOCATION_PLANNED_MANNING on f.revenueCenterID equals
n.revenueCenterID into lm
from n in lm.DefaultIfEmpty()
join r in db.RAMS_LOCATION_SCHED_NOTE on f.revenueCenterID equals r.revenueCenterID
into locnotes
from r in locnotes.DefaultIfEmpty()
where f.LocID == nLocID && f.In_Use == true && f.revenueCenterID > 1000
orderby f.Areano ascending, f.Locname ascending
select new
{
Facname = f.Locname,
f.Areano,
f.revenueCenterID,
f.Locabbrev,
// MonNote = n == null ? "__" : n.MonDayNote,
MonNote = n == null ? "__" : n.MonDayNote,
TueNote = n == null ? "__" : n.TueDayNote,
WedNote = n == null ? "__" : n.WedDayNote,
ThuNote = n == null ? "__" : n.ThuDayNote,
FriNote = n == null ? "__" : n.FriDayNote,
SatNote = n == null ? "__" : n.SatDayNote,
SunNote = n == null ? "__" : n.SunDayNote,
MonEmpNbr = n == null ? 0 : n.MonEmpNbr,
TueEmpNbr = n == null ? 0 : n.TueEmpNbr,
WedEmpNbr = n == null ? 0 : n.WedEmpNbr,
ThuEmpNbr = n == null ? 0 : n.ThuEmpNbr,
FriEmpNbr = n == null ? 0 : n.FriEmpNbr,
SatEmpNbr = n == null ? 0 : n.SatEmpNbr,
SunEmpNbr = n == null ? 0 : n.SunEmpNbr,
SchedMondayDate = n == null ? dMon : n.MondaySchedDate,
LocNotes = r == null ? "Notes: N/A" : r.LocationNote
}).ToList();
Func<int, string> LambdaManning = (x) => { return x == 0 ? "" : "Manning:" + x.ToString(); };
DataTable dt_ScheduleMaster = PsuedoSchedule.Tables["ScheduleMasterWithNotes"];
var schedLocations2 = schedLocations.Where(x => x.SchedMondayDate == dMon);
class Program
{
List<Employee> listOfEmp = new List<Employee>();
List<Department> listOfDepart = new List<Department>();
public Program()
{
listOfDepart = new List<Department>(){
new Department { Id = 1, DeptName = "DEV" },
new Department { Id = 2, DeptName = "QA" },
new Department { Id = 3, DeptName = "BUILD" },
new Department { Id = 4, DeptName = "SIT" }
};
listOfEmp = new List<Employee>(){
new Employee { Empid = 1, Name = "Manikandan",DepartmentId=1 },
new Employee { Empid = 2, Name = "Manoj" ,DepartmentId=1},
new Employee { Empid = 3, Name = "Yokesh" ,DepartmentId=0},
new Employee { Empid = 3, Name = "Purusotham",DepartmentId=0}
};
}
static void Main(string[] args)
{
Program ob = new Program();
ob.LeftJoin();
Console.ReadLine();
}
private void LeftJoin()
{
listOfEmp.GroupJoin(listOfDepart.DefaultIfEmpty(), x => x.DepartmentId, y => y.Id, (x, y) => new { EmpId = x.Empid, EmpName = x.Name, Dpt = y.FirstOrDefault() != null ? y.FirstOrDefault().DeptName : null }).ToList().ForEach
(z =>
{
Console.WriteLine("Empid:{0} EmpName:{1} Dept:{2}", z.EmpId, z.EmpName, z.Dpt);
});
}
}
class Employee
{
public int Empid { get; set; }
public string Name { get; set; }
public int DepartmentId { get; set; }
}
class Department
{
public int Id { get; set; }
public string DeptName { get; set; }
}
Overview: In this code snippet, I demonstrate how to group by ID where Table1 and Table2 have a one-to-many relationship. I group on Id, Field1, and Field2. The subquery is helpful if a lookup in a third table is required that would otherwise need a left join.
I show a left join grouping and a subquery LINQ query. The results are equivalent.
class MyView
{
public int Id {get;set;}
public String Field1 {get;set;}
public String Field2 {get;set;}
public String SubQueryName {get;set;}
}
IList<MyView> list = await (from ci in _dbContext.Table1
join cii in _dbContext.Table2
on ci.Id equals cii.Id
where ci.Field1 == criterion
group new
{
ci.Id
} by new { ci.Id, cii.Field1, ci.Field2}
into pg
select new MyView
{
Id = pg.Key.Id,
Field1 = pg.Key.Field1,
Field2 = pg.Key.Field2,
SubQueryName=
(from chv in _dbContext.Table3 where chv.Id==pg.Key.Id select chv.Field1).FirstOrDefault()
}).ToListAsync<MyView>();
Compared to using a Left Join and Group new
IList<MyView> list = await (from ci in _dbContext.Table1
join cii in _dbContext.Table2
on ci.Id equals cii.Id
join chv in _dbContext.Table3
on cii.Id equals chv.Id into lf_chv
from chv in lf_chv.DefaultIfEmpty()
where ci.Field1 == criterion
group new
{
ci.Id
} by new { ci.Id, cii.Field1, ci.Field2, chv.FieldValue}
into pg
select new MyView
{
Id = pg.Key.Id,
Field1 = pg.Key.Field1,
Field2 = pg.Key.Field2,
SubQueryName=pg.Key.FieldValue
}).ToListAsync<MyView>();
This is the prettiest solution I use, give it a try! 😉
from c in categories
let product = products.Where(d=> d.Category == c.Category).FirstOrDefault()
select new { Category = c, ProductName = product == null ? "(No products)" : product.ProductName };
(from a in db.Assignments
join b in db.Deliveryboys on a.AssignTo equals b.EmployeeId
//from d in eGroup.DefaultIfEmpty()
join c in db.Deliveryboys on a.DeliverTo equals c.EmployeeId into eGroup2
from e in eGroup2.DefaultIfEmpty()
where (a.Collected == false)
select new
{
OrderId = a.OrderId,
DeliveryBoyID = a.AssignTo,
AssignedBoyName = b.Name,
Assigndate = a.Assigndate,
Collected = a.Collected,
CollectedDate = a.CollectedDate,
CollectionBagNo = a.CollectionBagNo,
DeliverTo = e == null ? "Null" : e.Name,
DeliverDate = a.DeliverDate,
DeliverBagNo = a.DeliverBagNo,
Delivered = a.Delivered
});

Linq join two lists: is it more efficient to use Dictionary?

Final rephrase
Below I join two sequences and I wondered if it would be faster to create a Dictionary of one sequence with the keySelector of the join as key and iterate through the other collection and find the key in the dictionary.
This only works if the key selector is unique. A real join has no problem with two records having the same key, but a dictionary requires unique keys.
I measured the difference and found that the dictionary method is about 13% faster, which is negligible in most use cases. See my answer to this question.
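To illustrate the uniqueness restriction, here is a rough sketch of both approaches over simple (Key, Value) tuples: a dictionary-based join, and a variant using ToLookup that tolerates duplicate keys. DictionaryJoinDemo and its method names are invented for this example, and it needs C# 7 or later for the tuple syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionaryJoinDemo
{
    // Dictionary-based inner join; only valid when keys in 'right' are unique.
    public static List<string> DictionaryJoin(
        IEnumerable<(int Key, string Value)> left,
        IEnumerable<(int Key, string Value)> right)
    {
        var byKey = right.ToDictionary(r => r.Key);   // throws on duplicate keys
        var result = new List<string>();
        foreach (var l in left)
        {
            if (byKey.TryGetValue(l.Key, out var r))
                result.Add(l.Value + r.Value);
        }
        return result;
    }

    // Lookup-based inner join; several right-hand rows may share a key.
    public static List<string> LookupJoin(
        IEnumerable<(int Key, string Value)> left,
        IEnumerable<(int Key, string Value)> right)
    {
        var byKey = right.ToLookup(r => r.Key);
        // The lookup indexer returns an empty sequence for missing keys.
        return left.SelectMany(l => byKey[l.Key], (l, r) => l.Value + r.Value).ToList();
    }

    // Shows that ToDictionary rejects duplicate keys.
    public static bool DuplicateKeysThrow()
    {
        var right = new[] { (Key: 1, Value: "x"), (Key: 1, Value: "y") };
        try
        {
            right.ToDictionary(r => r.Key);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

The dictionary version saves a little work per key but fails as soon as the right-hand side contains a repeated key, whereas the lookup version simply yields one row per match.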
Rephrased question
Some suggested that this question is the same question as LINQ - Using where or join - Performance difference?, but this one is not about using where or join, but about using a Dictionary to perform the join.
My question is: if I want to join two sequences based on a key selector, which method would be faster?
Put all items of one sequence in a Dictionary and enumerate the other sequence to see if the item is in the Dictionary. This would mean to iterate through both sequences once and calculate hash codes on the keySelector for every item in both sequences once.
The other method: use System.Linq.Enumerable.Join.
The question is: Would Enumerable.Join for each element in the first list iterate through the elements in the second list to find a match according to the key selector, having to compare N * N elements (is this called second order?) or would it use a more advanced method?
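One way to probe this empirically is to count how often the key selectors are invoked. The sketch below assumes the LINQ to Objects implementation, which as far as I know builds a hash-based lookup of the inner sequence once and then probes it for each outer element; JoinCostDemo is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinCostDemo
{
    // Returns { number of joined rows, outer key selector calls, inner key selector calls }.
    public static int[] Run()
    {
        int outerCalls = 0, innerCalls = 0;
        var outer = Enumerable.Range(0, 100).ToList();    // 0..99
        var inner = Enumerable.Range(50, 100).ToList();   // 50..149

        var joined = outer.Join(inner,
            o => { outerCalls++; return o; },   // counted per outer element
            i => { innerCalls++; return i; },   // counted per inner element
            (o, i) => o + i).ToList();

        return new[] { joined.Count, outerCalls, innerCalls };
    }
}
```

If each selector runs once per element (100 calls each here, for 50 joined rows) rather than 100 * 100 times, the join is not comparing every pair of elements.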
Original question with examples
I have two classes, both with a property Reference. I have two sequences of these classes and I want to join them based on equal Reference.
public class ClassA
{
public string Reference {get; set;}
...
}
public class ClassB
{
public string Reference {get; set;}
...
}
var listA = new List<ClassA>()
{
new ClassA() {Reference = "1", ...},
new ClassA() {Reference = "2", ...},
new ClassA() {Reference = "3", ...},
new ClassA() {Reference = "4", ...},
};
var listB = new List<ClassB>()
{
new ClassB() {Reference = "1", ...},
new ClassB() {Reference = "3", ...},
new ClassB() {Reference = "5", ...},
new ClassB() {Reference = "7", ...},
};
After the join I want combinations of ClassA objects and ClassB objects that have an equal Reference. This is quite simple to do:
var myJoin = listA.Join(listB, // join listA and listB
a => a.Reference, // from listA take Reference
b => b.Reference, // from listB take Reference
(objectA, objectB) => // if references equal
new {A = objectA, B = objectB}); // return combination
I'm not sure how this works, but I can imagine that for each a in listA, listB is iterated to see if there is a b in listB with the same Reference as a.
Question: if I know that the references are Distinct wouldn't it be more efficient to convert B into a Dictionary and compare the Reference for each element in listA:
var dictB = listB.ToDictionary(b => b.Reference);
var myJoin = listA
.Where(a => dictB.ContainsKey(a.Reference))
.Select(a => new {A = a, B = dictB[a.Reference]});
This way, every element of listB has to be accessed once to put it in the dictionary, every element of listA has to be accessed once, and the hash code of each Reference has to be calculated once.
Would this method be faster for large collections?
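For what it's worth, Enumerable.Join is, as far as I know, not a nested loop: it builds a hash-based lookup of the inner sequence once and then streams the outer sequence through it, so both approaches do roughly N + M work. The sketch below is a hand-written equivalent for illustration, not the framework's actual source; the HashJoin helper and the integer sample data are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashJoinSketch
{
    // Hypothetical helper: an inner join that hashes the inner sequence once.
    public static IEnumerable<TResult> HashJoin<TOuter, TInner, TKey, TResult>(
        IEnumerable<TOuter> outer,
        IEnumerable<TInner> inner,
        Func<TOuter, TKey> outerKey,
        Func<TInner, TKey> innerKey,
        Func<TOuter, TInner, TResult> result)
    {
        // One pass over inner: group its elements by key in a hash-based lookup.
        ILookup<TKey, TInner> lookup = inner.ToLookup(innerKey);

        // One pass over outer: each probe is a hash lookup, not a scan of inner.
        foreach (TOuter o in outer)
        {
            // Indexing a lookup with a missing key yields an empty sequence.
            foreach (TInner i in lookup[outerKey(o)])
            {
                yield return result(o, i);
            }
        }
    }

    public static void Main()
    {
        var listA = new[] { 1, 2, 3, 4 };
        var listB = new[] { 1, 3, 5, 7 };

        var viaJoin = listA.Join(listB, a => a, b => b, (a, b) => a + "=" + b).ToList();
        var viaHash = HashJoin(listA, listB, a => a, b => b, (a, b) => a + "=" + b).ToList();

        Console.WriteLine(string.Join(", ", viaJoin));
        Console.WriteLine(viaJoin.SequenceEqual(viaHash));
    }
}
```

Both calls pair up the references 1 and 3. Unlike a Dictionary, the lookup also tolerates duplicate keys on the inner side, which is why Join does not require distinct references.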
I created a test program for this and measured the time it took.
Suppose I have a class Person; each person has a Name and a Father property of type Person. If the father is not known, the Father property is null.
I have a sequence of Bastards (no father), each of whom has exactly one Son and one Daughter. All daughters are put in one sequence and all sons in another.
The query: join the sons and the daughters that have the same father.
Results: Joining 1 million families using Enumerable.Join took 1.169 sec. Joining them using Dictionary join used 1.024 sec. Ever so slightly faster.
The code:
class Person : IEquatable<Person>
{
public string Name { get; set; }
public Person Father { get; set; }
// + a lot of equality functions get hash code etc
// for those interested: see the bottom
}
const int nrOfBastards = 1000000; // one million
var bastards = Enumerable.Range (0, nrOfBastards)
.Select(i => new Person()
{ Name = 'B' + i.ToString(), Father = null })
.ToList();
var sons = bastards.Select(father => new Person()
{Name = "Son of " + father.Name, Father = father})
.ToList();
var daughters = bastards.Select(father => new Person()
{Name = "Daughter of " + father.Name, Father = father})
.ToList();
// join on same parent: Traditionally and using Dictionary
var stopwatch = Stopwatch.StartNew();
this.TraditionalJoin(sons, daughters);
var time = stopwatch.Elapsed;
Console.WriteLine("Traditional join of {0} sons and daughters took {1:F3} sec", nrOfBastards, time.TotalSeconds);
stopwatch.Restart();
this.DictionaryJoin(sons, daughters);
time = stopwatch.Elapsed;
Console.WriteLine("Dictionary join of {0} sons and daughters took {1:F3} sec", nrOfBastards, time.TotalSeconds);
}
private void TraditionalJoin(IEnumerable<Person> boys, IEnumerable<Person> girls)
{ // join on same parent
var family = boys
.Join(girls,
boy => boy.Father,
girl => girl.Father,
(boy, girl) => new { Son = boy.Name, Daughter = girl.Name })
.ToList();
}
private void DictionaryJoin(IEnumerable<Person> sons, IEnumerable<Person> daughters)
{
var sonsDictionary = sons.ToDictionary(son => son.Father);
var family = daughters
.Where(daughter => sonsDictionary.ContainsKey(daughter.Father))
.Select(daughter => new { Son = sonsDictionary[daughter.Father], Daughter = daughter })
.ToList();
}
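Note that DictionaryJoin looks up each daughter's key twice, once in ContainsKey and once in the indexer. A possible variant with TryGetValue does a single lookup per element. This is only a sketch: it uses a simplified, hypothetical Child class keyed by the father's name instead of the Person class, and I have not timed it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionaryJoinSketch
{
    // Hypothetical simplified type: a child identified by name and father's name.
    public sealed class Child
    {
        public string Name { get; set; }
        public string FatherName { get; set; }
    }

    public static List<KeyValuePair<Child, Child>> Join(
        IEnumerable<Child> sons, IEnumerable<Child> daughters)
    {
        // Assumes each father has at most one son; ToDictionary throws otherwise.
        Dictionary<string, Child> sonsByFather = sons.ToDictionary(s => s.FatherName);

        var family = new List<KeyValuePair<Child, Child>>();
        foreach (Child daughter in daughters)
        {
            // TryGetValue tests for the key and fetches the value in one lookup.
            if (sonsByFather.TryGetValue(daughter.FatherName, out Child son))
            {
                family.Add(new KeyValuePair<Child, Child>(son, daughter));
            }
        }
        return family;
    }

    public static void Main()
    {
        var sons = new[] { new Child { Name = "Son of B0", FatherName = "B0" } };
        var daughters = new[]
        {
            new Child { Name = "Daughter of B0", FatherName = "B0" },
            new Child { Name = "Daughter of B1", FatherName = "B1" },
        };
        foreach (var pair in Join(sons, daughters))
        {
            Console.WriteLine(pair.Key.Name + " / " + pair.Value.Name);
        }
    }
}
```

Only the B0 siblings are paired; the daughter of B1 has no matching son and is skipped.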
For those interested in the equality of Persons, needed for a proper dictionary:
class Person : IEquatable<Person>
{
public string Name { get; set; }
public Person Father { get; set; }
public bool Equals(Person other)
{
if (other == null)
return false;
else if (Object.ReferenceEquals(this, other))
return true;
else if (this.GetType() != other.GetType())
return false;
else
return String.Equals(this.Name, other.Name, StringComparison.OrdinalIgnoreCase);
}
public override bool Equals(object obj)
{
return this.Equals(obj as Person);
}
public override int GetHashCode()
{
// The hash code must agree with Equals, which compares only Name and
// ignores case. Hashing Father, or hashing Name case-sensitively, could
// give two equal persons different hash codes and break dictionary lookups.
return StringComparer.OrdinalIgnoreCase.GetHashCode(this.Name);
}
public override string ToString()
{
return this.Name;
}
public static bool operator==(Person x, Person y)
{
if (Object.ReferenceEquals(x, null))
return Object.ReferenceEquals(y, null);
else
return x.Equals(y);
}
public static bool operator!=(Person x, Person y)
{
return !(x==y);
}
}

Find / Count Redundant Records in a List<T>

I am looking for a way to identify duplicate records...only I want / expect to see them.
So the records aren't duplicated completely but the unique fields I am unconcerned with at this point. I just want to see if they have made X# payments of the exact same amount, via the exact same card, to the exact same person. (Bogus example just to illustrate)
The collection is a List<>, and whatever X# is, List<>.Count will equal X#. In other words, either all the records in the list match (again, just on the fields I am concerned with) or I will reject the list.
The best I can come up with is to take the first record, get the value of, say, PayAmount, and use LINQ on the other records to see if they have the same PayAmount value, then repeat for every field to be matched. This seems horribly inefficient, but I am not smart enough to think of a better way.
So any thoughts, ideas, pointers would be greatly appreciated.
JB
Something like this should do it.
var duplicates = list.GroupBy(x => new { x.Amount, x.CardNumber, x.PersonName })
.Where(x => x.Count() > 1);
Working example:
class Program
{
static void Main(string[] args)
{
List<Entry> table = new List<Entry>();
var dup1 = new Entry
{
Name = "David",
CardNumber = 123456789,
PaymentAmount = 70.00M
};
var dup2 = new Entry
{
Name = "Daniel",
CardNumber = 987654321,
PaymentAmount = 45.00M
};
//3 duplicates
table.Add(dup1);
table.Add(dup1);
table.Add(dup1);
//2 duplicates
table.Add(dup2);
table.Add(dup2);
//Find duplicates query
var query = from p in table
group p by new { p.Name, p.CardNumber, p.PaymentAmount } into g
where g.Count() > 1
select new
{
name = g.Key.Name,
cardNumber = g.Key.CardNumber,
amount = g.Key.PaymentAmount,
count = g.Count()
};
foreach (var item in query)
{
Console.WriteLine("{0}, {1}, {2}, {3}", item.name, item.cardNumber, item.amount, item.count);
}
Console.ReadKey();
}
}
public class Entry
{
public string Name { get; set; }
public int CardNumber { get; set; }
public decimal PaymentAmount { get; set; }
}
The meat of which is this:
var query = from p in table
group p by new { p.Name, p.CardNumber, p.PaymentAmount } into g
where g.Count() > 1
select new
{
name = g.Key.Name,
cardNumber = g.Key.CardNumber,
amount = g.Key.PaymentAmount,
count = g.Count()
};
Your unique entries are defined by the three criteria Name, Card Number, and Payment Amount, so you group by them and then use .Count() to count how many entries share each combination. where g.Count() > 1 filters the groups down to duplicates only.
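Since the question is really whether every record in the list matches on the fields of interest, the same anonymous-type key can also answer that directly. A minimal sketch, assuming the Entry class from the example above (repeated here so the snippet compiles on its own); AllMatch is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Entry
{
    public string Name { get; set; }
    public int CardNumber { get; set; }
    public decimal PaymentAmount { get; set; }
}

public static class RedundancyCheck
{
    // True when the list is non-empty and every entry has the same
    // Name, CardNumber and PaymentAmount. Anonymous types compare by value,
    // so Distinct collapses matching keys into a single element.
    public static bool AllMatch(IEnumerable<Entry> entries)
    {
        return entries
            .Select(e => new { e.Name, e.CardNumber, e.PaymentAmount })
            .Distinct()
            .Count() == 1;
    }

    public static void Main()
    {
        var payments = new List<Entry>
        {
            new Entry { Name = "David", CardNumber = 123456789, PaymentAmount = 70.00M },
            new Entry { Name = "David", CardNumber = 123456789, PaymentAmount = 70.00M },
            new Entry { Name = "David", CardNumber = 123456789, PaymentAmount = 70.00M },
        };
        Console.WriteLine(AllMatch(payments)); // True

        payments.Add(new Entry { Name = "Daniel", CardNumber = 987654321, PaymentAmount = 45.00M });
        Console.WriteLine(AllMatch(payments)); // False
    }
}
```

This projects only the fields that matter, so fields that legitimately differ between records (IDs, timestamps) never enter the comparison.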

EF single entity problem

I need to return a single instance of my viewmodel class from my repository in order to feed this into a strongly-typed view
In my repository, this works fine for a collection of viewmodel instances:
IEnumerable<PAWeb.Domain.Entities.Section> ISectionsRepository.GetSectionsByArea(int AreaId)
{
var _sections = from s in DataContext.Sections where s.AreaId == AreaId orderby s.Ordinal ascending select s;
return _sections.Select(x => new PAWeb.Domain.Entities.Section()
{
SectionId = x.SectionId,
Title = x.Title,
UrlTitle = x.UrlTitle,
NavTitle = x.NavTitle,
AreaId = x.AreaId,
Ordinal = x.Ordinal
}
);
}
But when I attempt to obtain a single entity, like this:
public PAWeb.Domain.Entities.Section GetSection(int SectionId)
{
var _section = from s in DataContext.Sections where s.SectionId == SectionId select s;
return _section.Select(x => new PAWeb.Domain.Entities.Section()
{
SectionId = x.SectionId,
Title = x.Title,
UrlTitle = x.UrlTitle,
NavTitle = x.NavTitle,
AreaId = x.AreaId,
Ordinal = x.Ordinal
}
);
}
I get
Error 1 Cannot implicitly convert type
'System.Linq.IQueryable<PAWeb.Domain.Entities.Section>' to
'PAWeb.Domain.Entities.Section'. An explicit conversion exists
(are you missing a cast?)
This has got to be simple, but I'm new to C#, and I can't figure out the casting. I tried (PAWeb.Domain.Entities.Section) in various places, but with no success. Can anyone help?
Your query is returning an IQueryable, which could have several items. For example, think of the difference between an Array or List of objects and a single object. It doesn't know how to convert the List to a single object, which one should it take? The first? The last?
You need to tell it specifically to only take one item.
e.g.
public PAWeb.Domain.Entities.Section GetSection(int SectionId)
{
var _section = from s in DataContext.Sections where s.SectionId == SectionId select s;
return _section.Select(x => new PAWeb.Domain.Entities.Section()
{
SectionId = x.SectionId,
Title = x.Title,
UrlTitle = x.UrlTitle,
NavTitle = x.NavTitle,
AreaId = x.AreaId,
Ordinal = x.Ordinal
}
).FirstOrDefault();
}
This will return either the first item or null if no item matches your query. In your case, null means that no section has the given SectionId.
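If SectionId is a unique key, which seems likely but is an assumption here, SingleOrDefault is another option: it also returns null when nothing matches, but throws if more than one row matches instead of silently picking the first. The sketch below uses an in-memory list as a stand-in for DataContext.Sections, with a trimmed-down, hypothetical Section class and made-up sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Section
{
    public int SectionId { get; set; }
    public string Title { get; set; }
}

public static class SectionLookup
{
    // Stand-in for DataContext.Sections (hypothetical sample data).
    private static readonly List<Section> Sections = new List<Section>
    {
        new Section { SectionId = 1, Title = "Home" },
        new Section { SectionId = 2, Title = "About" },
    };

    // Returns the first match, or null when nothing matches.
    public static Section GetFirst(int sectionId)
    {
        return Sections.Where(s => s.SectionId == sectionId).FirstOrDefault();
    }

    // Returns the only match, or null when nothing matches;
    // throws InvalidOperationException if more than one element matches.
    public static Section GetSingle(int sectionId)
    {
        return Sections.SingleOrDefault(s => s.SectionId == sectionId);
    }

    public static void Main()
    {
        Console.WriteLine(GetFirst(2).Title);     // About
        Console.WriteLine(GetFirst(99) == null);  // True
        Console.WriteLine(GetSingle(1).Title);    // Home
    }
}
```

Either way, the caller has to handle the null case before passing the result to a strongly-typed view.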

How do I transfer this logic into a LINQ statement?

I can't get this bit of logic converted into a Linq statement and it is driving me nuts. I have a list of items that have a category and a createdondate field. I want to group by the category and only return items that have the max date for their category.
So for example, the list contains items with categories 1 and 2. The first day (1/1) I post two items to both categories 1 and 2. The second day (1/2) I post three items to category 1. The list should return the second day postings to category 1 and the first day postings to category 2.
Right now I have it grouping by the category then running through a foreach loop to compare each item in the group with the max date of the group, if the date is less than the max date it removes the item.
There's got to be a way to take the loop out, but I haven't figured it out!
You can do something like this:
from item in list
group item by item.Category into g
select g.OrderByDescending(it => it.CreationDate).First();
However, it's not very efficient, because it needs to sort the items of each group, which is more complex than necessary (you don't actually need to sort, you just need to scan the list once). So I created this extension method to find the item with the max value of a property (or function) :
public static T WithMax<T, TValue>(this IEnumerable<T> source, Func<T, TValue> selector)
{
var max = default(TValue);
var withMax = default(T);
var comparer = Comparer<TValue>.Default;
bool first = true;
foreach (var item in source)
{
var value = selector(item);
int compare = comparer.Compare(value, max);
if (compare > 0 || first)
{
max = value;
withMax = item;
}
first = false;
}
return withMax;
}
You can use it as follows :
from item in list
group item by item.Category into g
select g.WithMax(it => it.CreationDate);
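If you would rather not maintain an extension method, Enumerable.Aggregate can do the same single pass per group. This is a sketch of that alternative, not a replacement for WithMax; the Item class, its Title property and the sample dates are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    public int Category { get; set; }
    public DateTime CreationDate { get; set; }
    public string Title { get; set; }
}

public static class LatestPerCategory
{
    public static List<Item> Latest(IEnumerable<Item> list)
    {
        return list
            .GroupBy(item => item.Category)
            // Keep whichever of the two items is newer: one pass per group, no sort.
            // Aggregate without a seed throws on an empty sequence, but groups
            // produced by GroupBy always contain at least one element.
            .Select(g => g.Aggregate((best, next) =>
                next.CreationDate > best.CreationDate ? next : best))
            .ToList();
    }

    public static void Main()
    {
        var list = new List<Item>
        {
            new Item { Category = 1, CreationDate = new DateTime(2010, 1, 1), Title = "a" },
            new Item { Category = 1, CreationDate = new DateTime(2010, 1, 2), Title = "b" },
            new Item { Category = 2, CreationDate = new DateTime(2010, 1, 1), Title = "c" },
        };
        foreach (var item in Latest(list))
        {
            Console.WriteLine(item.Category + ": " + item.Title);
        }
    }
}
```

Like WithMax, this returns one item per category; when several items share the maximum date, the first one encountered wins.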
UPDATE: As Anthony noted in his comment, this code doesn't exactly answer the question. If you want all items whose date is the maximum of their category, you can do something like this:
from item in list
group item by item.Category into g
let maxDate = g.Max(it => it.CreationDate)
select new
{
Category = g.Key,
Items = g.Where(it => it.CreationDate == maxDate)
};
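To check this against the scenario in the question, here is a small worked sketch that flattens the result into a single list. It assumes that "max date" means the latest calendar day, so it compares on .Date; the Post class, the CreatedOn property name and the sample times are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Post
{
    public int Category { get; set; }
    public DateTime CreatedOn { get; set; }
    public string Title { get; set; }
}

public static class MaxDatePerCategory
{
    public static List<Post> LatestDayPosts(IEnumerable<Post> posts)
    {
        var query =
            from post in posts
            group post by post.Category into g
            // Compare calendar days only, so posts made at different times
            // on the latest day all qualify (an assumption about the intent).
            let maxDay = g.Max(p => p.CreatedOn.Date)
            from latest in g
            where latest.CreatedOn.Date == maxDay
            select latest;
        return query.ToList();
    }

    public static void Main()
    {
        var day1 = new DateTime(2010, 1, 1);
        var day2 = new DateTime(2010, 1, 2);
        var posts = new List<Post>
        {
            new Post { Category = 1, CreatedOn = day1.AddHours(9), Title = "1a" },
            new Post { Category = 1, CreatedOn = day1.AddHours(10), Title = "1b" },
            new Post { Category = 2, CreatedOn = day1.AddHours(9), Title = "2a" },
            new Post { Category = 2, CreatedOn = day1.AddHours(11), Title = "2b" },
            new Post { Category = 1, CreatedOn = day2.AddHours(8), Title = "1c" },
            new Post { Category = 1, CreatedOn = day2.AddHours(9), Title = "1d" },
            new Post { Category = 1, CreatedOn = day2.AddHours(12), Title = "1e" },
        };
        foreach (var p in LatestDayPosts(posts))
        {
            Console.WriteLine(p.Category + ": " + p.Title);
        }
    }
}
```

With this data the result holds the three second-day posts of category 1 followed by the two first-day posts of category 2, which matches the outcome described in the question.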
How about this:
private class Test
{
public string Category { get; set; }
public DateTime PostDate { get; set; }
public string Post { get; set; }
}
private void Form1_Load(object sender, EventArgs e)
{
List<Test> test = new List<Test>();
test.Add(new Test() { Category = "A", PostDate = new DateTime(2010, 5, 5, 12, 0, 0), Post = "A1" });
test.Add(new Test() { Category = "B", PostDate = new DateTime(2010, 5, 5, 13, 0, 0), Post = "B1" });
test.Add(new Test() { Category = "A", PostDate = new DateTime(2010, 5, 6, 12, 0, 0), Post = "A2" });
test.Add(new Test() { Category = "A", PostDate = new DateTime(2010, 5, 6, 13, 0, 0), Post = "A3" });
test.Add(new Test() { Category = "A", PostDate = new DateTime(2010, 5, 6, 14, 0, 0), Post = "A4" });
var q = test
.GroupBy(t => t.Category)
.Select(g => new { grp = g, max = g.Max(t2 => t2.PostDate).Date })
.SelectMany(x => x.grp.Where(t => t.PostDate >= x.max));
}
Reformatting luc's excellent answer to query comprehension form. I like this better for this kind of query because the scoping rules let me write more concisely.
from item in source
group item by item.Category into g
let max = g.Max(item2 => item2.PostDate).Date
from item3 in g
where item3.PostDate.Date == max
select item3;

Resources