Linq Query Filter from two lists where row differs - linq

Not sure how to formulate this Linq query.
I have two lists, each of which contains HashCheck objects:
class HashCheck
{
public string Id {get; set;}
public string Hash {get; set;}
}
So, given
List<HashCheck> list1;
List<HashCheck> list2;
I need a query that produces a list of the rows where the Ids match but the Hash does not.
So for example
List1 =
{1, 12345,
2, 34323,
3, 34083,
4, 09887}
List2 =
{1, 00001, << matching id, not matching hash
2, 34323,
3, 11112, << matching id, not matching hash
4, 09887
5, 98845}
ResultList =
{1, 00001,
3, 11112}
NOTE: in List2, there is an extra row, it would be a bonus if this were included in the ResultList. But I know how to do that in a separate query if necessary.
Thanks for any help.

Try this code:
var list3 = (from i in list1
from j in list2
where i.Id == j.Id && i.Hash != j.Hash
select new HashCheck() { Id = j.Id, Hash = j.Hash
}).ToList<HashCheck>();
You can also use a join, something like the code below:
var list3 = (from i in list1
join j in list2 on i.Id equals j.Id
where i.Hash != j.Hash
select new HashCheck() { Id = j.Id, Hash = j.Hash
}).ToList<HashCheck>();
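To check the join version against the sample data from the question, here is a minimal sketch. It assumes C# 9 or later (for top-level statements); the `HashDiff.Changed` helper is a hypothetical name used only for illustration, and it selects the existing list2 objects instead of creating copies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list1 = new List<HashCheck>
{
    new HashCheck { Id = "1", Hash = "12345" },
    new HashCheck { Id = "2", Hash = "34323" },
    new HashCheck { Id = "3", Hash = "34083" },
    new HashCheck { Id = "4", Hash = "09887" },
};
var list2 = new List<HashCheck>
{
    new HashCheck { Id = "1", Hash = "00001" },
    new HashCheck { Id = "2", Hash = "34323" },
    new HashCheck { Id = "3", Hash = "11112" },
    new HashCheck { Id = "4", Hash = "09887" },
    new HashCheck { Id = "5", Hash = "98845" },
};

// Prints the list2 rows for Ids 1 and 3; Id 5 has no partner in list1,
// so an inner join leaves it out.
foreach (var hc in HashDiff.Changed(list1, list2))
    Console.WriteLine($"{hc.Id}, {hc.Hash}");

public class HashCheck
{
    public string Id { get; set; }
    public string Hash { get; set; }
}

// Hypothetical helper, not part of the original answers.
public static class HashDiff
{
    // Rows of list2 whose Id also occurs in list1 but with a different Hash.
    public static List<HashCheck> Changed(IEnumerable<HashCheck> list1,
                                          IEnumerable<HashCheck> list2) =>
        (from i in list1
         join j in list2 on i.Id equals j.Id
         where i.Hash != j.Hash
         select j).ToList();
}
```

Because this is an inner join, the extra row in list2 (Id 5) is not part of the result; it would need a separate query or one of the other approaches.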

It looks like you want your result to contain the HashCheck objects from list2, which would simply mean:
var ans = list2.Where(hc2 => !list1.Any(hc1 => hc1.Id == hc2.Id && hc1.Hash == hc2.Hash));
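That is, it returns all list2 elements that have no list1 element matching in both Id and Hash. This also includes list2 rows whose Id does not occur in list1 at all (the bonus case from the question).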
If list1 (and/or list2) is very large and performance is a consideration, you can convert list1 to a Dictionary and do lookups against that:
var list1map = list1.ToDictionary(hc1 => hc1.Id, hc1 => hc1.Hash);
var ans2 = list2.Where(hc2 => !list1map.TryGetValue(hc2.Id, out var hash1) || hash1 != hc2.Hash);
Another alternative would be to implement Equals/GetHashCode for your class and then you can use LINQ Except.
Add the following methods to your class:
public override bool Equals(object other) => (other is HashCheck hco) ? Id == hco.Id && Hash == hco.Hash : false;
public override int GetHashCode() => (Id, Hash).GetHashCode();
Now the computation is simple:
var ans3 = list2.Except(list1);
NOTE: Implementing Equals/GetHashCode in this way can be problematic if your HashCheck objects are not treated as immutable. Some collection classes really won't like it if the hash code of an object already stored in them changes.
Also, it would be best practice to implement operator== and operator!= as well and possibly IEquatable.
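As a rough sketch of that best-practice advice, the class could look something like this. It is an assumption-laden variant, not the asker's class: the properties are made get-only (with a constructor) so the hash code cannot change after construction, and it assumes C# 9 or later for `is not null` and top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var list1 = new List<HashCheck> { new HashCheck("1", "12345"), new HashCheck("2", "34323") };
var list2 = new List<HashCheck>
{
    new HashCheck("1", "00001"), new HashCheck("2", "34323"), new HashCheck("5", "98845"),
};

// Except uses the default equality comparer, which picks up IEquatable<HashCheck>.
// Prints the changed row (Id 1) and the extra row (Id 5).
foreach (var hc in list2.Except(list1))
    Console.WriteLine($"{hc.Id}, {hc.Hash}");

// Hypothetical immutable variant of the question's class.
public sealed class HashCheck : IEquatable<HashCheck>
{
    public string Id { get; }
    public string Hash { get; }

    public HashCheck(string id, string hash) { Id = id; Hash = hash; }

    public bool Equals(HashCheck other) =>
        other is not null && Id == other.Id && Hash == other.Hash;

    public override bool Equals(object obj) => Equals(obj as HashCheck);

    public override int GetHashCode() => (Id, Hash).GetHashCode();

    // "is null" avoids recursing into the overloaded operator.
    public static bool operator ==(HashCheck left, HashCheck right) =>
        left is null ? right is null : left.Equals(right);

    public static bool operator !=(HashCheck left, HashCheck right) => !(left == right);
}
```

Making the type immutable is one way to sidestep the mutable-hash-code problem noted above; whether that fits depends on how the objects are populated elsewhere.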

Related

LINQ for nested for each

Can you please give me a solution for the method as Linq?
I need Linq for the below method:
private List<Model> ConvertMethod(List<Model> List1, List<Model> List2)
{
foreach (var Firstitem in List1)
{
foreach (var Seconditem in List2)
{
if (Firstitem.InnerText.Trim() == Seconditem.InnerText.Trim())
{
Seconditem.A= Firstitem.A;
Seconditem.B= Firstitem.B;
Seconditem.C= Firstitem.C;
Seconditem.D= Firstitem.D;
Seconditem.E= Firstitem.E;
Seconditem.F= Firstitem.F;
}
}
}
return List2;
}
Your task is to assign values, that is, to modify objects. That is not the purpose of LINQ, which is meant for querying data sources. So you could use LINQ to build a query that returns all items that need to be updated. Then you can use a foreach to assign the values (as you did):
var sameItems = from l1 in List1 join l2 in List2
on l1.InnerText.Trim() equals l2.InnerText.Trim()
select new { l1, l2 };
foreach(var itemsToUpdate in sameItems)
{
itemsToUpdate.l2.A = itemsToUpdate.l1.A;
// ...
}
It helps if you think about what this code is supposed to do - update records in the second list with the values of matching records from the first list.
There are various ways you can do that. One option is to replace each foreach with from and filter the rows:
var matches = from Firstitem in List1
from Seconditem in List2
where Firstitem.InnerText.Trim() == Seconditem.InnerText.Trim()
select (Firstitem,Seconditem);
foreach(var (Firstitem,Seconditem) in matches)
{
Seconditem.A= Firstitem.A;
Seconditem.B= Firstitem.B;
Seconditem.C= Firstitem.C;
Seconditem.D= Firstitem.D;
Seconditem.E= Firstitem.E;
Seconditem.F= Firstitem.F;
}
I'm "cheating" a bit here, using tuples and deconstruction to reduce noise.
Another option is to use join. In this case, the two options produce identical results:
var matches = from Firstitem in List1
join Seconditem in List2
on Firstitem.InnerText.Trim() equals Seconditem.InnerText.Trim()
select (Firstitem,Seconditem);
The rest of the code remains the same.
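Put together, the join-plus-foreach approach might look like the sketch below. The `Model` class here is a trimmed-down stand-in with only `InnerText`, `A` and `B` (the real class is not shown in the question), `ModelSync.CopyMatching` is a hypothetical name, and C# 9 or later is assumed for the top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var first = new List<Model> { new Model { InnerText = " x ", A = "a1", B = "b1" } };
var second = new List<Model> { new Model { InnerText = "x" }, new Model { InnerText = "y" } };

ModelSync.CopyMatching(first, second);
Console.WriteLine(second[0].A);                    // a1 (matched after trimming)
Console.WriteLine(second[1].A ?? "(unchanged)");   // (unchanged)

// Stand-in for the question's Model type; only two value properties shown.
public class Model
{
    public string InnerText { get; set; }
    public string A { get; set; }
    public string B { get; set; }
}

public static class ModelSync
{
    public static List<Model> CopyMatching(List<Model> list1, List<Model> list2)
    {
        // LINQ only finds the matching pairs.
        var matches = from firstItem in list1
                      join secondItem in list2
                          on firstItem.InnerText.Trim() equals secondItem.InnerText.Trim()
                      select (firstItem, secondItem);

        // The mutation stays in an ordinary loop.
        foreach (var (firstItem, secondItem) in matches)
        {
            secondItem.A = firstItem.A;
            secondItem.B = firstItem.B;
        }
        return list2;
    }
}
```

For LINQ to Objects, a join builds a hash-based lookup on the keys, so on large lists it should generally do less work than the nested `from ... from ... where` form, which compares every pair.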

Linq: Group on list inside lists

Let's say I have a list of lists.
For each item in this list, I have a list of custom objects.
These objects are as such:
public string Field1
public string Field2
public string Field3
What I'd like to achieve through Linq: filter out of my list of lists all the objects which have the same three fields, which are not the first element of their list and keep only the first one.
So let's say I have two lists, listA and listB, in my list.
listA has three objects object1, object2 and object3.
object1.Field1 = "a" object1.Field2 = "A" object1.Field3 = "1"
object2.Field1 = "a" object2.Field2 = "B" object2.Field3 = "2"
object3.Field1 = "a" object3.Field2 = "C" object3.Field3 = "3"
listB has three objects object4, object5 and object6.
object4.Field1 = "a" object4.Field2 = "A" object4.Field3 = "1"
object5.Field1 = "a" object5.Field2 = "B" object5.Field3 = "2"
object6.Field1 = "a" object6.Field2 = "D" object6.Field3 = "3"
In this example, object1 and object4 are the same, but because they are first in their respective list, they are not filtered out.
However, since object2 and object5 have the same three field values, only one of them will be kept, so that at the end of my process I'll have my two lists like so:
listA has three objects object1, object2 and object3.
object1.Field1 = "a" object1.Field2 = "A" object1.Field3 = "1"
object2.Field1 = "a" object2.Field2 = "B" object2.Field3 = "2"
object3.Field1 = "a" object3.Field2 = "C" object3.Field3 = "3"
listB has now two objects object4 and object6.
object4.Field1 = "a" object4.Field2 = "A" object4.Field3 = "1"
object6.Field1 = "a" object6.Field2 = "D" object6.Field3 = "3"
I've been scratching my head for hours about this to no avail. I cannot do a foreach list of lists look into all the other lists as it would cause a performance problem (I can potentially have 1000000 lists of lists).
Anyone with an idea for this?
Why does it have to be LINQ? A simple iterator block solves the problem quite nicely.
The code below assumes you have overridden Equals and GetHashCode in your object to check for equality on the 3 fields. If that is not possible, use a custom equality comparer instead (passed to the HashSet constructor).
static IEnumerable<List<T>> GetFilteredList<T>(IEnumerable<List<T>> input)
{
var found = new HashSet<T>();
foreach (var list in input)
{
var returnList = new List<T>(list.Capacity);
foreach (var item in list)
{
// the first item is always added
// any other items are only added if they were
// never encountered before
if (returnList.Count == 0 || !found.Contains(item))
{
found.Add(item);
returnList.Add(item);
}
}
yield return returnList;
}
}
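The custom equality comparer mentioned above could look roughly like this. The `Item` class and the `ThreeFieldComparer` name are hypothetical (the question only gives the three field names), and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Collections.Generic;

var found = new HashSet<Item>(new ThreeFieldComparer());
found.Add(new Item { Field1 = "a", Field2 = "B", Field3 = "2" });

// A different instance with the same three values counts as already seen.
Console.WriteLine(found.Contains(new Item { Field1 = "a", Field2 = "B", Field3 = "2" }));   // True
Console.WriteLine(found.Contains(new Item { Field1 = "a", Field2 = "D", Field3 = "3" }));   // False

// Hypothetical stand-in for the question's custom object.
public class Item
{
    public string Field1 { get; set; }
    public string Field2 { get; set; }
    public string Field3 { get; set; }
}

public sealed class ThreeFieldComparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Field1 == y.Field1 && x.Field2 == y.Field2 && x.Field3 == y.Field3;
    }

    // Must agree with Equals: equal field values give equal hash codes.
    public int GetHashCode(Item obj) => (obj.Field1, obj.Field2, obj.Field3).GetHashCode();
}
```

To use it with the method above, the `HashSet<T>` would be constructed with the comparer, for example by adding an `IEqualityComparer<T>` parameter to the method.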
If you can stick with IEnumerable<IEnumerable<T>> as the return value, another approach that only sweeps the input just once could be something like this (without creating intermediary lists):
static IEnumerable<IEnumerable<T>> GetFilteredList<T>(IEnumerable<List<T>> input)
{
var encounteredElements = new HashSet<T>();
foreach (var list in input)
{
yield return Process(list, encounteredElements);
}
}
static IEnumerable<T> Process<T>(IEnumerable<T> input,
HashSet<T> encounteredElements)
{
bool first = true;
foreach (var item in input)
{
// Add returns false when the item was already encountered
bool isNew = encounteredElements.Add(item);
// the first item of each list is always kept
if (first || isNew)
{
yield return item;
}
first = false;
}

looping through 2 lists to get the results

I have two lists:
myObject object1 = new myObject { id = 1, title = "object1" };
myObject object2 = new myObject { id = 2, title = "object2" };
myObject object3 = new myObject { id = 3, title = "object3" };
//List 1
List<myObject> myObjectList = new List<myObject>{object1, object2, object3};
//List 2
List<int> idList = new List<int>{2, 3,5};
Now I need to get output as follows:
If an id is present in both lists, I need to print "A",
if an id is present in list1 only, then I need to print "B",
...and if the id is present in list2 only, I need to print "C".
Can I use linq to achieve this?
I would simply use the inbuilt functions of Except and Intersect
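var ids1 = myObjectList.Select(o => o.id); // project list 1 to its ids first
var inBoth = ids1.Intersect(idList); // "A"
var onlyInList1 = ids1.Except(idList); // "B"
var onlyInList2 = idList.Except(ids1); // "C"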
There are plenty of resources online about how you could go about doing this, as one example (I didn't look into it too much), check out this link - Linq - Except one list with items in another
Hope this does the trick...
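A small sketch of how the three cases could be printed in one pass. The `IdClassifier.Classify` helper is a hypothetical name, the `myObject` class is reconstructed from the question's initializers, and C# 9 or later is assumed for the top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var myObjectList = new List<myObject>
{
    new myObject { id = 1, title = "object1" },
    new myObject { id = 2, title = "object2" },
    new myObject { id = 3, title = "object3" },
};
var idList = new List<int> { 2, 3, 5 };

// Prints: 1: B, 2: A, 3: A, 5: C (one per line)
foreach (var (id, label) in IdClassifier.Classify(myObjectList.Select(o => o.id), idList))
    Console.WriteLine($"{id}: {label}");

// Reconstructed from the question; the real class is not shown there.
public class myObject
{
    public int id { get; set; }
    public string title { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class IdClassifier
{
    public static List<(int Id, string Label)> Classify(IEnumerable<int> list1Ids,
                                                        IEnumerable<int> list2Ids)
    {
        var set1 = new HashSet<int>(list1Ids);
        var set2 = new HashSet<int>(list2Ids);

        // Every id from either list gets exactly one label.
        return set1.Union(set2)
                   .OrderBy(id => id)
                   .Select(id => (Id: id,
                                  Label: set1.Contains(id)
                                      ? (set2.Contains(id) ? "A" : "B")
                                      : "C"))
                   .ToList();
    }
}
```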

Item-by-item list comparison, updating each item with its result (no third list)

The solutions I have found so far in my research on comparing lists of objects have usually generated a new list of objects, say of those items existing in one list, but not in the other. In my case, I want to compare two lists to discover the items whose key exists in one list and not the other (comparing both ways), and for those keys found in both lists, checking whether the value is the same or different.
The object being compared has multiple properties that constitute the key, plus a property that constitutes the value, and finally, an enum property that describes the result of the comparison, e.g., {Equal, NotEqual, NoMatch, NotYetCompared}. So my object might look like:
class MyObject
{
//Key combination
string columnA;
string columnB;
decimal columnC;
//The Value
decimal columnD;
//Enum for comparison, used for styling the item (value hidden from UI)
//Alternatively...this could be a string type, holding the enum.ToString()
MyComparisonEnum result;
}
These objects are collected into two ObservableCollection<MyObject> to be compared. Both lists are presented in the UI in data grids, with the rows styled based on the comparison result enum, so the user can easily see which keys are in the new dataset but not in the old and vice versa, along with the keys that are in both datasets but with a different value.
Would LINQ be suitable as a tool to solve this efficiently, or should I use loops to scan the lists and break out when the key is found, etc. (a solution like this comes naturally to me from my procedural programming background)... or some other method?
Thank you!
You can use Except and Intersect:
var list1 = new List<MyObject>();
var list2 = new List<MyObject>();
// initialization code
var notIn2 = list1.Except(list2);
var notIn1 = list2.Except(list1);
var both = list1.Intersect(list2);
To find objects with different values (ColumnD) you can use this (quite efficient) Linq query:
var diffValue = from o1 in list1
join o2 in list2
on new { o1.columnA, o1.columnB, o1.columnC } equals new { o2.columnA, o2.columnB, o2.columnC }
where o1.columnD != o2.columnD
select new { Object1 = o1, Object2 = o2 };
foreach (var diff in diffValue)
{
MyObject obj1 = diff.Object1;
MyObject obj2 = diff.Object2;
Console.WriteLine("Obj1-Value:{0} Obj2-Value:{1}", obj1.columnD, obj2.columnD);
}
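Except and Intersect only give the expected results when you override Equals and GetHashCode appropriately, so that two objects with the same key columns count as equal: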
class MyObject
{
//Key combination
string columnA;
string columnB;
decimal columnC;
//The Value
decimal columnD;
//Enum for comparison, used for styling the item (value hidden from UI)
//Alternatively...this could be a string type, holding the enum.ToString()
MyComparisonEnum result;
public override bool Equals(object obj)
{
if (obj == null || !(obj is MyObject)) return false;
MyObject other = (MyObject)obj;
// == is null-safe for strings, matching the null handling in GetHashCode
return columnA == other.columnA && columnB == other.columnB && columnC == other.columnC;
}
public override int GetHashCode()
{
int hash = 19;
hash = hash + (columnA ?? "").GetHashCode();
hash = hash + (columnB ?? "").GetHashCode();
hash = hash + columnC.GetHashCode();
return hash;
}
}
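Since the question asks for each item to carry its own comparison result, with no third list, one possible sketch is to index one list by its composite key and then write the enum back onto the existing objects. Everything named here is an assumption: the public `ColumnA`..`ColumnD` and `Result` properties stand in for the private fields shown above, `ComparisonMarker.Mark` is a hypothetical helper, keys are assumed unique within each list, and C# 9 or later is assumed for the top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var oldRows = new List<MyObject>
{
    new MyObject { ColumnA = "x", ColumnB = "y", ColumnC = 1m, ColumnD = 10m },
    new MyObject { ColumnA = "x", ColumnB = "z", ColumnC = 1m, ColumnD = 20m },
    new MyObject { ColumnA = "q", ColumnB = "q", ColumnC = 2m, ColumnD = 5m },
};
var newRows = new List<MyObject>
{
    new MyObject { ColumnA = "x", ColumnB = "y", ColumnC = 1m, ColumnD = 10m },
    new MyObject { ColumnA = "x", ColumnB = "z", ColumnC = 1m, ColumnD = 25m },
    new MyObject { ColumnA = "n", ColumnB = "n", ColumnC = 3m, ColumnD = 7m },
};

ComparisonMarker.Mark(oldRows, newRows);
foreach (var o in oldRows.Concat(newRows))
    Console.WriteLine($"{o.ColumnA}/{o.ColumnB}/{o.ColumnC}: {o.Result}");

public enum MyComparisonEnum { NotYetCompared, Equal, NotEqual, NoMatch }

public class MyObject
{
    public string ColumnA { get; set; }
    public string ColumnB { get; set; }
    public decimal ColumnC { get; set; }
    public decimal ColumnD { get; set; }
    public MyComparisonEnum Result { get; set; }
}

public static class ComparisonMarker
{
    public static void Mark(IEnumerable<MyObject> left, IEnumerable<MyObject> right)
    {
        // Index the right side by its composite key (throws on duplicate keys).
        var rightByKey = right.ToDictionary(o => (o.ColumnA, o.ColumnB, o.ColumnC));

        // Start pessimistic; matched rows are overwritten below.
        foreach (var rightItem in rightByKey.Values)
            rightItem.Result = MyComparisonEnum.NoMatch;

        foreach (var leftItem in left)
        {
            if (rightByKey.TryGetValue((leftItem.ColumnA, leftItem.ColumnB, leftItem.ColumnC), out var match))
            {
                var outcome = leftItem.ColumnD == match.ColumnD
                    ? MyComparisonEnum.Equal
                    : MyComparisonEnum.NotEqual;
                leftItem.Result = outcome;
                match.Result = outcome;
            }
            else
            {
                leftItem.Result = MyComparisonEnum.NoMatch;
            }
        }
    }
}
```

This makes one pass over each list instead of scanning one list for every row of the other. If the objects are bound to a UI, the `Result` setter would presumably need to raise a change notification, which is left out here.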

Join 2 lists by order instead of condition in LINQ

How can I join 2 lists of equal lengths (to produce a 3rd list of equal length) where I do not want to specify a condition but simply rely on the order of items in the 2 lists.
Eg how can I join:
{1,2,3,4} with {5,6,7,8}
to produce:
{{1,5}, {2,6}, {3,7}, {4,8}}
I have tried the following:
from i in new []{1,2,3,4}
from j in new []{5,6,7,8}
select new { i, j }
but this produces a cross join. When I use join, I always need to specify the "on".
You could use Select in the first list, use the item index and access the element on the second list:
var a = new [] {1,2,3,4};
var b = new [] {5,6,7,8};
var qry = a.Select((i, index) => new {i, j = b[index]});
If you are using .NET 4.0 or later, you can use the Zip extension method and Tuples.
var a = new [] {1,2,3,4};
var b = new [] {5,6,7,8};
var result = a.Zip(b, (an, bn) => Tuple.Create(an, bn));
Alternatively, you can keep them as arrays:
var resultArr = a.Zip(b, (an, bn) => new []{an, bn});
There is a halfway solution if you want to use query syntax. Halfway in the sense that you need to use the Select method on both lists in order to get the indexes that you will use in the where clause.
int[] list1 = {1,2,3,4};
int[] list2 = {5,6,7,8};
var result = from item1 in list1.Select((value, index) => new {value, index})
from item2 in list2.Select((value, index) => new {value, index})
where item1.index == item2.index
select new {Value1 = item1.value, Value2 = item2.value};
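The benefit of this solution is that it won't fail if the lists have different lengths, as the solution using the indexer would. It does, however, compare every index of the first list against every index of the second, so it gets slow on large lists.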
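Zip also copes with lists of different lengths: it stops at the end of the shorter sequence. A minimal sketch using value tuples (C# 7 or later) and top-level statements (C# 9 or later); the `Pairing.ByPosition` wrapper is a hypothetical name added only to make the behaviour easy to try out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] a = { 1, 2, 3, 4 };
int[] b = { 5, 6, 7 };   // one element shorter on purpose

// Prints (1, 5), (2, 6), (3, 7); the unmatched 4 is silently dropped.
foreach (var (x, y) in Pairing.ByPosition(a, b))
    Console.WriteLine($"({x}, {y})");

// Hypothetical wrapper around Enumerable.Zip.
public static class Pairing
{
    public static IEnumerable<(T1, T2)> ByPosition<T1, T2>(IEnumerable<T1> first,
                                                           IEnumerable<T2> second) =>
        first.Zip(second, (x, y) => (x, y));
}
```

Because the extra elements are dropped without any error, it may be worth comparing the lengths first when the lists are expected to be equal.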
