Linq: Group on list inside lists

Let's say I have a list of lists.
For each item in this list , I have a list of custom objects.
These objects are as such:
public string Field1
public string Field2
public string Field3
What I'd like to achieve through Linq: across the whole list of lists, remove every object that has the same three field values as an object seen earlier, unless it is the first element of its list, so that only the first occurrence is kept.
So let's say I have two lists, listA and listB, in my list.
listA has three objects object1, object2 and object3.
object1.Field1 = "a" object1.Field2 = "A" object1.Field3 = "1"
object2.Field1 = "a" object2.Field2 = "B" object2.Field3 = "2"
object3.Field1 = "a" object3.Field2 = "C" object3.Field3 = "3"
listB has three objects object4, object5 and object6.
object4.Field1 = "a" object4.Field2 = "A" object4.Field3 = "1"
object5.Field1 = "a" object5.Field2 = "B" object5.Field3 = "2"
object6.Field1 = "a" object6.Field2 = "D" object6.Field3 = "3"
In this example, object1 and object4 are the same, but because they are first in their respective list, they are not filtered out.
However, object2 and object5 have the same three field values, so only one of them will be kept. At the end of my process, I'll have my two lists like so:
listA has three objects object1, object2 and object3.
object1.Field1 = "a" object1.Field2 = "A" object1.Field3 = "1"
object2.Field1 = "a" object2.Field2 = "B" object2.Field3 = "2"
object3.Field1 = "a" object3.Field2 = "C" object3.Field3 = "3"
listB has now two objects object4 and object6.
object4.Field1 = "a" object4.Field2 = "A" object4.Field3 = "1"
object6.Field1 = "a" object6.Field2 = "D" object6.Field3 = "3"
I've been scratching my head for hours about this to no avail. I cannot, for each list in the list of lists, look into all the other lists, as that would cause a performance problem (I can potentially have 1000000 lists of lists).
Anyone with an idea for this?

Why does it have to be LINQ? A simple iterator block solves the problem quite nicely.
The code below assumes you have overridden Equals and GetHashCode in your object to check for equality on the 3 fields. If that is not possible, pass a custom equality comparer to the HashSet constructor instead.
static IEnumerable<List<T>> GetFilteredList<T>(IEnumerable<List<T>> input)
{
var found = new HashSet<T>();
foreach (var list in input)
{
var returnList = new List<T>(list.Capacity);
foreach (var item in list)
{
// the first item is always added
// any other items are only added if they were
// never encountered before
if (returnList.Count == 0 || !found.Contains(item))
{
found.Add(item);
returnList.Add(item);
}
}
yield return returnList;
}
}
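For the case where Equals and GetHashCode cannot be overridden, the following is a minimal sketch of the custom-comparer variant. The `Item` class, the `ItemComparer` name, the `ListFilter` wrapper and the extra `comparer` parameter are assumptions made for illustration; only the three fields and the sample data come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type with the three fields from the question.
public class Item
{
    public string Field1 { get; set; }
    public string Field2 { get; set; }
    public string Field3 { get; set; }
}

// Two items are equal when all three fields match.
public class ItemComparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Field1 == y.Field1 && x.Field2 == y.Field2 && x.Field3 == y.Field3;
    }

    public int GetHashCode(Item obj) =>
        (obj.Field1, obj.Field2, obj.Field3).GetHashCode();
}

public static class ListFilter
{
    public static IEnumerable<List<T>> GetFilteredList<T>(
        IEnumerable<List<T>> input, IEqualityComparer<T> comparer)
    {
        var found = new HashSet<T>(comparer);
        foreach (var list in input)
        {
            var returnList = new List<T>(list.Count);
            foreach (var item in list)
            {
                // Add returns false when an equal item was seen before.
                bool isNew = found.Add(item);
                // The first item of each list is kept unconditionally.
                if (returnList.Count == 0 || isNew)
                    returnList.Add(item);
            }
            yield return returnList;
        }
    }

    // Sample data from the question (listA and listB).
    public static List<List<Item>> Sample() => new List<List<Item>>
    {
        new List<Item> { Make("a", "A", "1"), Make("a", "B", "2"), Make("a", "C", "3") },
        new List<Item> { Make("a", "A", "1"), Make("a", "B", "2"), Make("a", "D", "3") },
    };

    private static Item Make(string f1, string f2, string f3) =>
        new Item { Field1 = f1, Field2 = f2, Field3 = f3 };

    public static void Main()
    {
        foreach (var list in GetFilteredList(Sample(), new ItemComparer()))
            Console.WriteLine(string.Join(", ", list.Select(i => i.Field2)));
    }
}
```

On the sample data this keeps all three items of listA and drops only object5 from listB, which matches the result the question asks for.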
If you can stick with IEnumerable<IEnumerable<T>> as the return value, another approach that sweeps the input only once, without creating intermediary lists, could be something like this:
static IEnumerable<IEnumerable<T>> GetFilteredList<T>(IEnumerable<List<T>> input)
{
var encounteredElements = new HashSet<T>();
foreach (var list in input)
{
yield return Process(list, encounteredElements);
}
}
static IEnumerable<T> Process<T>(IEnumerable<T> input,
HashSet<T> encounteredElements)
{
bool first = true;
foreach (var item in input)
{
// the first item is always returned; any other item is returned
// only if it was never encountered before
if (first || !encounteredElements.Contains(item))
{
yield return item;
}
encounteredElements.Add(item);
first = false;
}
}
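One thing to keep in mind with the lazy variant: the inner sequences share a single HashSet and are only evaluated when enumerated, so the result depends on the order in which they are consumed. The sketch below illustrates this with plain strings; the `LazyFilter` class and the `Materialize` helper are hypothetical names, not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyFilter
{
    public static IEnumerable<IEnumerable<T>> GetFilteredList<T>(IEnumerable<List<T>> input)
    {
        var encountered = new HashSet<T>();
        foreach (var list in input)
            yield return Process(list, encountered);
    }

    private static IEnumerable<T> Process<T>(IEnumerable<T> input, HashSet<T> encountered)
    {
        bool first = true;
        foreach (var item in input)
        {
            // Add returns false when the item was already in the set.
            bool isNew = encountered.Add(item);
            if (first || isNew) yield return item;
            first = false;
        }
    }

    // Hypothetical helper: forces every inner sequence, in input order.
    public static List<List<T>> Materialize<T>(IEnumerable<List<T>> input) =>
        GetFilteredList(input).Select(inner => inner.ToList()).ToList();

    public static void Main()
    {
        var input = new List<List<string>>
        {
            new List<string> { "A", "B", "C" },
            new List<string> { "A", "B", "D" },
        };

        // In order: the second list loses "B".
        foreach (var list in Materialize(input))
            Console.WriteLine(string.Join(", ", list));

        // Out of order: the inner sequences are still lazy here, so
        // enumerating the second one first makes the first list lose "B".
        var lazy = GetFilteredList(input).ToList();
        Console.WriteLine(string.Join(", ", lazy[1]));
        Console.WriteLine(string.Join(", ", lazy[0]));
    }
}
```

If callers might enumerate the inner sequences out of order or more than once, materializing them as in `Materialize` (or using the list-based version) is the safer choice.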

Related

add data to an arrayList, when If condition is satisfied in nested for loop in Kotlin

Check the code below:
fun getCommonsArrayList(listA: ArrayList<User>, listB: ArrayList<User>): ArrayList<User> {
var listCommon = ArrayList<User>()
for (i in listA.indices) {
for (j in listB.indices) {
if (listA[i].id.equals(listB[j].id)) { //if id of the user matches
listCommon.add(listA[i]) //add to a new list
}
}
}
return listCommon // return the new list with common entries
}
The method above iterates over lists A and B and checks whether the ids match. If they do, the User object is stored in a new list, and at the end the method returns the list of common entries.
This works well, and I assume a nested for loop followed by an if condition is the way to compare two lists.
The problem is that if listA has repeated entries, listCommon will also have repeated entries, because ArrayList allows duplicates.
So to make listCommon unique, I introduced a HashMap as shown below:
fun getCommonsArrayList(listA: ArrayList<User>, listB: ArrayList<User>): ArrayList<User> {
var listCommon = ArrayList<User>()
var arrResponseMap = HashMap<String,User>()
for (i in listA.indices) {
for (j in listB.indices) {
if (listA[i].id.equals(listB[j].id)) { //if id of the user matches
arrResponseMap.put(listA[i].id,listA[i]) // add id to map so there would be no duplicacy
}
}
}
arrResponseMap.forEach {
listCommon.add(it.value) //iterate the map and add all values
}
return listCommon // return the new list with common entries
}
This gives a new ArrayList of User objects with common ids, but it is more complex than the code above.
If the size of listA and listB grows to 1000, execution will take a long time.
Can someone tell me whether there is a better way to solve this?
You can simply use distinctBy to get only unique values from list.
Official Doc:
Returns a sequence containing only elements from the given sequence
having distinct keys returned by the given selector function.
The elements in the resulting sequence are in the same order as they
were in the source sequence.
Here is an example:
val model1 = UserModel()
model1.userId = 1
val model2 = UserModel()
model2.userId = 2
val model3 = UserModel()
model3.userId = 1
val model4 = UserModel()
model4.userId = 2
val commonList = listOf(model1, model2, model3, model4)
// get unique list based on userID, use any field to base your distinction
val uniqueList = commonList
.distinctBy { it.userId }
.toList()
assert(uniqueList.count() == 2)
assert(commonList.count() == 4)
Concatenate both lists and use distinctBy like this:
data class DevelopersDetail(val domain: String, val role: String)
val d1 = DevelopersDetail("a", "1")
val d2 = DevelopersDetail("b", "1")
val d3 = DevelopersDetail("c", "1")
val d4 = DevelopersDetail("c", "1")
val d5 = DevelopersDetail("d", "1")
var listA = listOf(d1, d2, d3, d4)
var listb = listOf(d1, d2, d3, d4)
var data = listA + listb
var list= data
.distinctBy { it.domain }
.toList()
println("list $list")
//output-list [DevelopersDetail(domain=a, role=1), DevelopersDetail(domain=b, role=1), DevelopersDetail(domain=c, role=1)]

Linq Query Filter from two lists where row differs

Not sure how to formulate this Linq query.
I have two lists, each of which contains HashCheck objects:
class HashCheck
{
public string Id {get; set;}
public string Hash {get; set;}
}
So, given
List<HashCheck> list1;
List<HashCheck> list2;
I need a query that results in a list containing the rows where the Ids match but the Hash does not.
So for example
List1 =
{1, 12345,
2, 34323,
3, 34083,
4, 09887}
List2 =
{1, 00001, << matching id, not matching hash
2, 34323,
3, 11112, << matching id, not matching hash
4, 09887
5, 98845}
ResultList =
{1, 00001,
3, 11112}
NOTE: in List2, there is an extra row, it would be a bonus if this were included in the ResultList. But I know how to do that in a separate query if necessary.
Thanks for any help.
try this code:
var list3 = (from i in list1
from j in list2
where i.Id == j.Id && i.Hash != j.Hash
select new HashCheck() { Id = j.Id, Hash = j.Hash
}).ToList<HashCheck>();
You can use a join, something like the code below:
var list3 = (from i in list1
join j in list2 on i.Id equals j.Id
where i.Hash != j.Hash
select new HashCheck() { Id = j.Id, Hash = j.Hash
}).ToList<HashCheck>();
It looks like you want your result to contain the HashCheck objects from list2, which would simply mean:
var ans = list2.Where(hc2 => !list1.Any(hc1 => hc1.Id == hc2.Id && hc1.Hash == hc2.Hash));
i.e. return all list2 elements without a list1 element that matches in both Id and Hash.
If list1 (and/or list2) is very large and performance is a consideration, you can convert list1 to a Dictionary and do lookups against that:
var list1map = list1.ToDictionary(hc1 => hc1.Id, hc1 => hc1.Hash);
var ans2 = list2.Where(hc2 => !list1map.TryGetValue(hc2.Id, out var hash1) || hash1 != hc2.Hash);
Another alternative would be to implement Equals/GetHashCode for your class and then you can use LINQ Except.
Add the following methods to your class:
public override bool Equals(object other) => (other is HashCheck hco) ? Id == hco.Id && Hash == hco.Hash : false;
public override int GetHashCode() => (Id, Hash).GetHashCode();
Now the computation is simple:
var ans3 = list2.Except(list1);
NOTE: Implementing Equals/GetHashCode in this way can be problematic if your HashCheck objects are not treated as immutable. Some collection classes really won't like it if the hash code of an object already stored in them changes.
Also, it would be best practice to implement operator== and operator!= as well, and possibly IEquatable<HashCheck>.
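As a worked run on the sample data from the question, here is a minimal sketch of the join and the dictionary lookup side by side. The `HashDiff` class and the `Changed`, `ChangedOrNew` and `Sample` method names are made up for this example, and it assumes Ids are unique within list1 (ToDictionary throws otherwise).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class HashCheck
{
    public string Id { get; set; }
    public string Hash { get; set; }
}

public static class HashDiff
{
    // Rows of list2 whose Id exists in list1 with a different Hash.
    public static List<HashCheck> Changed(List<HashCheck> list1, List<HashCheck> list2) =>
        (from a in list1
         join b in list2 on a.Id equals b.Id
         where a.Hash != b.Hash
         select b).ToList();

    // Same rows, plus rows of list2 whose Id is missing from list1.
    public static List<HashCheck> ChangedOrNew(List<HashCheck> list1, List<HashCheck> list2)
    {
        // Assumes Ids are unique in list1.
        var hashById = list1.ToDictionary(h => h.Id, h => h.Hash);
        return list2
            .Where(h => !hashById.TryGetValue(h.Id, out var hash1) || hash1 != h.Hash)
            .ToList();
    }

    // The two lists from the question.
    public static (List<HashCheck> list1, List<HashCheck> list2) Sample()
    {
        var list1 = new List<HashCheck>
        {
            new HashCheck { Id = "1", Hash = "12345" },
            new HashCheck { Id = "2", Hash = "34323" },
            new HashCheck { Id = "3", Hash = "34083" },
            new HashCheck { Id = "4", Hash = "09887" },
        };
        var list2 = new List<HashCheck>
        {
            new HashCheck { Id = "1", Hash = "00001" },
            new HashCheck { Id = "2", Hash = "34323" },
            new HashCheck { Id = "3", Hash = "11112" },
            new HashCheck { Id = "4", Hash = "09887" },
            new HashCheck { Id = "5", Hash = "98845" },
        };
        return (list1, list2);
    }

    public static void Main()
    {
        var (list1, list2) = Sample();
        Console.WriteLine(string.Join(", ", Changed(list1, list2).Select(h => h.Id)));
        Console.WriteLine(string.Join(", ", ChangedOrNew(list1, list2).Select(h => h.Id)));
    }
}
```

The join returns only the rows with Ids 1 and 3, while the dictionary version also includes the extra row with Id 5, which is the "bonus" case mentioned in the question.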

filter elements in nested dictionary LINQ

I have the following data structure:
Dictionary<string, Dictionary<string, List<int>>> data =
new Dictionary<string, Dictionary<string, List<int>>>();
I want to filter some of the elements in that dictionary based on value in first element of the list of the inner dictionary.
for example:
{legion1
{soldier1, [10,1000]},
{soldier2, [50,1000]}
}
Now let's say I want a foreach loop that works only on the elements where the value of the first element of the list is less than 20.
The expected result in the foreach loop is:
{legion1{soldier1, [10,1000]}}
What I've tried:
I do foreach loop and then I want to use something similar:
data.where(x => x.value.where(o => o[0] < 20 ))
I always get an error saying that this is incorrect.
Please tell me how I can solve the issue and why my approach fails.
You can filter and iterate over the result set like so:
var resultSet =
data.ToDictionary(e => e.Key,
e => e.Value.Where(x => x.Value[0] < 20)
.ToDictionary(k => k.Key, v => v.Value)
);
foreach(var item in resultSet){
var key = item.Key; // string
var values = item.Value; // Dictionary<string, List<int>>
...
...
}
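The problem is that you are applying operator [] incorrectly: inside the inner Where, o is a KeyValuePair<string, List<int>>, so the list is o.Value and its first element is o.Value[0]. The outer Where also needs a lambda that returns a bool, not a sequence. Moreover, since you want to use both Legion and Soldier, you should construct an anonymous object combining the two of them: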
foreach (var t in data.SelectMany(lg => lg.Value.Select(s => new {
Legion = lg
, Soldier = s
})).Where(ls => ls.Soldier.Value[0] < 20)) {
Console.WriteLine("Legion={0} Soldier = {1}", t.Legion.Key, t.Soldier.Key);
}
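Note that the ToDictionary answer above keeps a legion even when none of its soldiers pass the filter (it ends up with an empty inner dictionary). If that is not wanted, a sketch along the following lines drops such legions; the `LegionFilter` class, its `Filter` method and the `limit` parameter are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LegionFilter
{
    public static Dictionary<string, Dictionary<string, List<int>>> Filter(
        Dictionary<string, Dictionary<string, List<int>>> data, int limit)
    {
        return data
            .Select(legion => new
            {
                legion.Key,
                // Keep soldiers whose first list element is below the limit;
                // the Count check guards against empty lists.
                Soldiers = legion.Value
                    .Where(s => s.Value.Count > 0 && s.Value[0] < limit)
                    .ToDictionary(s => s.Key, s => s.Value)
            })
            // Drop legions that have no matching soldier left.
            .Where(x => x.Soldiers.Count > 0)
            .ToDictionary(x => x.Key, x => x.Soldiers);
    }

    public static void Main()
    {
        var data = new Dictionary<string, Dictionary<string, List<int>>>
        {
            ["legion1"] = new Dictionary<string, List<int>>
            {
                ["soldier1"] = new List<int> { 10, 1000 },
                ["soldier2"] = new List<int> { 50, 1000 },
            },
        };

        foreach (var legion in Filter(data, 20))
            foreach (var soldier in legion.Value)
                Console.WriteLine($"{legion.Key} {soldier.Key} [{string.Join(",", soldier.Value)}]");
    }
}
```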

Item-by-item list comparison, updating each item with its result (no third list)

The solutions I have found so far in my research on comparing lists of objects have usually generated a new list of objects, say of those items existing in one list, but not in the other. In my case, I want to compare two lists to discover the items whose key exists in one list and not the other (comparing both ways), and for those keys found in both lists, checking whether the value is the same or different.
The object being compared has multiple properties that constitute the key, plus a property that constitutes the value, and finally, an enum property that describes the result of the comparison, e.g., {Equal, NotEqual, NoMatch, NotYetCompared}. So my object might look like:
class MyObject
{
//Key combination
string columnA;
string columnB;
decimal columnC;
//The Value
decimal columnD;
//Enum for comparison, used for styling the item (value hidden from UI)
//Alternatively...this could be a string type, holding the enum.ToString()
MyComparisonEnum result;
}
These objects are collected into two ObservableCollection<MyObject> instances to be compared. Both lists are presented in the UI in data grids, with the rows styled based on the comparison result enum, so the user can easily see which keys are in the new dataset but not in the old and vice versa, along with the keys present in both datasets with a different value.
Would LINQ be suitable as a tool to solve this efficiently, or should I use loops to scan the lists and break out when the key is found, etc. (a solution like this comes naturally to me from my procedural programming background)... or some other method?
Thank you!
You can use Except and Intersect:
var list1 = new List<MyObject>();
var list2 = new List<MyObject>();
// initialization code
var notIn2 = list1.Except(list2);
var notIn1 = list2.Except(list1);
var both = list1.Intersect(list2);
To find objects with different values (ColumnD) you can use this (quite efficient) Linq query:
var diffValue = from o1 in list1
join o2 in list2
on new { o1.columnA, o1.columnB, o1.columnC } equals new { o2.columnA, o2.columnB, o2.columnC }
where o1.columnD != o2.columnD
select new { Object1 = o1, Object2 = o2 };
foreach (var diff in diffValue)
{
MyObject obj1 = diff.Object1;
MyObject obj2 = diff.Object2;
Console.WriteLine("Obj1-Value:{0} Obj2-Value:{1}", obj1.columnD, obj2.columnD);
}
Except and Intersect compare by key only when you override Equals and GetHashCode appropriately:
class MyObject
{
//Key combination
string columnA;
string columnB;
decimal columnC;
//The Value
decimal columnD;
//Enum for comparison, used for styling the item (value hidden from UI)
//Alternatively...this could be a string type, holding the enum.ToString()
MyComparisonEnum result;
public override bool Equals(object obj)
{
if (obj == null || !(obj is MyObject)) return false;
MyObject other = (MyObject)obj;
// string.Equals is null-safe, matching the null handling in GetHashCode
return string.Equals(columnA, other.columnA) && string.Equals(columnB, other.columnB) && columnC == other.columnC;
}
public override int GetHashCode()
{
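unchecked // let the arithmetic overflow wrap around
{
int hash = 19;
// multiplying before adding makes the hash depend on field order
hash = hash * 31 + (columnA ?? "").GetHashCode();
hash = hash * 31 + (columnB ?? "").GetHashCode();
hash = hash * 31 + columnC.GetHashCode();
return hash;
}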
}
}
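Since the question asks for each item to be updated with its comparison result rather than for a third list, here is one possible sketch of that using a dictionary keyed on the three key columns. It assumes the key is unique within each list, uses public PascalCase properties and a `Key` tuple property that are not in the original class, and the `ListComparison.Compare` name is made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum MyComparisonEnum { NotYetCompared, Equal, NotEqual, NoMatch }

public class MyObject
{
    // Key combination
    public string ColumnA { get; set; }
    public string ColumnB { get; set; }
    public decimal ColumnC { get; set; }
    // The value
    public decimal ColumnD { get; set; }
    public MyComparisonEnum Result { get; set; }

    // Hypothetical helper: value tuples compare element by element.
    public (string, string, decimal) Key => (ColumnA, ColumnB, ColumnC);
}

public static class ListComparison
{
    // Sets Result on every item of both lists; no third list is produced.
    public static void Compare(IEnumerable<MyObject> oldItems, IEnumerable<MyObject> newItems)
    {
        // Assumes keys are unique within oldItems (ToDictionary throws otherwise).
        var oldByKey = oldItems.ToDictionary(o => o.Key);

        // Until a match is found, every old item counts as unmatched.
        foreach (var o in oldByKey.Values)
            o.Result = MyComparisonEnum.NoMatch;

        foreach (var n in newItems)
        {
            if (oldByKey.TryGetValue(n.Key, out var match))
            {
                var result = match.ColumnD == n.ColumnD
                    ? MyComparisonEnum.Equal
                    : MyComparisonEnum.NotEqual;
                match.Result = result;
                n.Result = result;
            }
            else
            {
                n.Result = MyComparisonEnum.NoMatch;
            }
        }
    }

    public static void Main()
    {
        var oldList = new List<MyObject>
        {
            new MyObject { ColumnA = "x", ColumnB = "y", ColumnC = 1m, ColumnD = 10m },
            new MyObject { ColumnA = "x", ColumnB = "y", ColumnC = 2m, ColumnD = 20m },
        };
        var newList = new List<MyObject>
        {
            new MyObject { ColumnA = "x", ColumnB = "y", ColumnC = 1m, ColumnD = 10m },
            new MyObject { ColumnA = "x", ColumnB = "y", ColumnC = 3m, ColumnD = 30m },
        };
        Compare(oldList, newList);
        foreach (var item in oldList.Concat(newList))
            Console.WriteLine($"{item.ColumnC}: {item.Result}");
    }
}
```

Each list is walked once and each lookup is a hash lookup, so the cost grows roughly linearly with the combined size of the two lists rather than with their product.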

Linq query to find non-numeric items in list?

Suppose I have the following list:
var strings = new List<string>();
strings.Add("1");
strings.Add("12.456");
strings.Add("Foobar");
strings.Add("0.56");
strings.Add("zero");
Is there some sort of query I can write in Linq that will return to me only the numeric items, i.e. the 1st, 2nd, and 4th items from the list?
-R.
strings.Where(s => { double ignored; return double.TryParse(s, out ignored); })
This returns, still as strings, all the items that can be parsed as doubles. If you want them as numbers (which makes more sense), you could write an extension method:
public static IEnumerable<double> GetDoubles(this IEnumerable<string> strings)
{
foreach (var s in strings)
{
double result;
if (double.TryParse(s, out result))
yield return result;
}
}
Don't forget that double.TryParse() uses your current culture, so it will give different results on different computers. If you don't want that, use double.TryParse(s, NumberStyles.AllowDecimalPoint, CultureInfo.InvariantCulture, out result).
Try this:
double dummy = 0;
var strings = new List<string>();
strings.Add("1");
strings.Add("12.456");
strings.Add("Foobar");
strings.Add("0.56");
strings.Add("zero");
var numbers = strings.Where(a=>double.TryParse(a, out dummy));
You could use a simple predicate to examine each string, like so:
var strings = new List<string>();
strings.Add("1");
strings.Add("12.456");
strings.Add("Foobar");
strings.Add("0.56");
strings.Add("zero");
var nums = strings.Where( s => s.ToCharArray().All( n => Char.IsNumber( n ) || n == '.' ) );
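The character-based predicate and the TryParse approach do not always agree, which may matter depending on the input. The following sketch compares the two on a slightly extended list; the `NumericFilter` class and its method names are hypothetical, and the choice of NumberStyles.Float with the invariant culture is an assumption about what "numeric" should mean here.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class NumericFilter
{
    // Culture-independent: "." is always the decimal separator.
    public static List<string> ParseableAsDouble(IEnumerable<string> strings) =>
        strings.Where(s => double.TryParse(
                s, NumberStyles.Float, CultureInfo.InvariantCulture, out _))
            .ToList();

    // Character check: accepts any mix of digits and dots.
    public static List<string> DigitsAndDotsOnly(IEnumerable<string> strings) =>
        strings.Where(s => s.All(c => char.IsNumber(c) || c == '.'))
            .ToList();

    public static void Main()
    {
        var strings = new List<string>
        {
            "1", "12.456", "Foobar", "0.56", "zero", "1.2.3", "-3"
        };

        // "1.2.3" is rejected and "-3" is accepted by TryParse.
        Console.WriteLine(string.Join(" | ", ParseableAsDouble(strings)));
        // "1.2.3" is accepted and "-3" is rejected by the character check.
        Console.WriteLine(string.Join(" | ", DigitsAndDotsOnly(strings)));
    }
}
```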
