How to combine dictionaries in Linq? - linq

I'm new to Linq. I have code like this:
public class Data
{
public Dictionary<string,int> WordFrequency;
}
List<Data> dataList;
What I want is one aggregated dictionary that does a combined WordFrequency for the whole list of Data objects. I know how to do this using loops (iterate over the List, then iterate over each Dictionary), my question is, what is the Linq syntax for this? Thank you.
EDIT: here is my (untested) looping approach, so you can see what I mean.
public static Dictionary<string, int> Combine()
{
    // The result must be initialized before keys can be added to it.
    Dictionary<string, int> result = new Dictionary<string, int>();
    foreach (Data data in dataList)
    {
        foreach (string key in data.WordFrequency.Keys)
        {
            if (!result.ContainsKey(key))
                result[key] = 0;
            result[key] += data.WordFrequency[key];
        }
    }
    return result;
}

So you want to flatten all dictionaries into a single one, which has no duplicate keys - of course?
You can use Enumerable.SelectMany to flatten all and Enumerable.GroupBy to group the keys.
Dictionary<string, int> allWordFrequency = dataList
.SelectMany(d => d.WordFrequency)
.GroupBy(d => d.Key)
.ToDictionary(g => g.Key, g => g.Sum(d => d.Value));
I have presumed that you want to sum all frequencies.
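For anyone who wants to try this end to end, here is a minimal self-contained sketch of the SelectMany/GroupBy approach. The Data class mirrors the one in the question; the WordFrequencyDemo class and the Combine method name are only illustrative assumptions, not something from the original code.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Data
{
    public Dictionary<string, int> WordFrequency = new Dictionary<string, int>();
}

// Hypothetical helper class, named here for illustration only.
public static class WordFrequencyDemo
{
    // Merges every per-object dictionary into one dictionary,
    // summing the counts of words that occur in more than one of them.
    public static Dictionary<string, int> Combine(IEnumerable<Data> dataList)
    {
        return dataList
            .SelectMany(d => d.WordFrequency)   // flatten to KeyValuePair<string, int>
            .GroupBy(kvp => kvp.Key)            // one group per distinct word
            .ToDictionary(g => g.Key, g => g.Sum(kvp => kvp.Value));
    }
}
```

Calling WordFrequencyDemo.Combine(dataList) with two Data objects that both contain the same word should give a single entry for that word whose value is the sum of the two counts.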

Here is a query-based solution identical in most regards to Tim's:
Dictionary<string, int> allWordFrequency =
    (from d in dataList
     from kvp in d.WordFrequency
     group kvp.Value by kvp.Key)
//         ^^^^^^^^^ this grouping projection...
    .ToDictionary(g => g.Key, g => g.Sum());
//   ...eliminates need for lambda here ^^
I appreciate how the two from clauses mimic the nested foreach loops in the looping-based approach of the post. Like Tim's solution, the query iterates the KeyValuePairs of each Dictionary rather than its Keys collection, so the query doesn't need to invoke the indexer to get the corresponding integer count value.

Related

LINQ GroupBy on single property

I am just not understanding the LINQ non-query syntax for GroupBy.
I have a collection of objects that I want to group by a single property. In this case Name
{ Id="1", Name="Bob", Age="23" }
{ Id="2", Name="Sally", Age="41" }
{ Id="3", Name="Bob", Age="73" }
{ Id="4", Name="Bob", Age="34" }
I would like to end up with a collection of all the unique names
{ Name="Bob" }
{ Name="Sally" }
Based on some examples I looked at I thought this would be the way to do it
var uniqueNameCollection = Persons.GroupBy(x => x.Name).Select(y => y.Key).ToList();
But I ended up with a collection with one item. So I thought maybe I was overcomplicating things with the projection. I tried this
var uniqueNameCollection = Persons.GroupBy(x => x.Name).ToList();
Same result. I ended up with a single item in the collection. What am I doing wrong here? I am just looking to GroupBy the Name property.
var names = Persons.Select(p => p.Name).Distinct().ToList();
This is enough if you just want the names.
LINQ's GroupBy doesn't work the same way that SQL's GROUP BY does.
GroupBy takes a sequence and a key-selector function as parameters, and returns a sequence of IGroupings, each of which has a Key (the value that was grouped by) and the sequence of elements in that group.
IEnumerable<IGrouping<TKey, TSource>> GroupBy<TSource, TKey>(
    this IEnumerable<TSource> sequence,
    Func<TSource, TKey> keySelector)
{ ... }
So if you start with a list like this:
class Person
{
public string Name;
}
var people = new List<Person> {
    new Person { Name = "Adam" },
    new Person { Name = "Eve" }
};
Grouping by name will look like this
IEnumerable<IGrouping<string, Person>> groups = people.GroupBy(person => person.Name);
You could then select the key from each group like this:
IEnumerable<string> names = groups.Select(group => group.Key);
names will be distinct because if there were multiple people with the same name, they would have been in the same group and there would only be one group with that name.
For what you need, it would probably be more efficient to just select the names and then use Distinct
var names = people.Select(p => p.Name).Distinct();
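To compare the two approaches side by side, here is a small sketch that runs both against the sample data from the question. The Person fields follow the question's data; the UniqueNamesDemo class and its method names are made up for this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string Id;
    public string Name;
    public string Age;
}

// Hypothetical demo class, not part of the original question.
public static class UniqueNamesDemo
{
    // Sample data copied from the question.
    public static List<Person> SamplePeople()
    {
        return new List<Person>
        {
            new Person { Id = "1", Name = "Bob",   Age = "23" },
            new Person { Id = "2", Name = "Sally", Age = "41" },
            new Person { Id = "3", Name = "Bob",   Age = "73" },
            new Person { Id = "4", Name = "Bob",   Age = "34" }
        };
    }

    // Groups by Name, then keeps only each group's key.
    public static List<string> NamesViaGroupBy(IEnumerable<Person> people)
    {
        return people.GroupBy(p => p.Name).Select(g => g.Key).ToList();
    }

    // Projects to Name first, then removes duplicates.
    public static List<string> NamesViaDistinct(IEnumerable<Person> people)
    {
        return people.Select(p => p.Name).Distinct().ToList();
    }
}
```

Both methods should return a list of two names, Bob and Sally, for this data. If GroupBy returns a single item, the input collection is probably not what you think it is.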
var uniqueNameCollection = Persons.GroupBy(x => x.Name).Select(y => y.Key).ToList();
Appears valid to me. .net Fiddle showing proper expected outcome: https://dotnetfiddle.net/2hqOvt
Using your data I ran the following code statement
var uniqueNameCollection = people.GroupBy(x => x.Name).Select(y => y.Key).ToList();
The returned result was a List<string> containing:
Bob
Sally
That is 2 items in the list.
Run the following statement and your count should be 2.
people.GroupBy(x => x.Name).Select(y => y.Key).ToList().Count();
Works for me. Alternatively, install the MoreLinq NuGet package:
using MoreLinq;
var distinctitems = list.DistinctBy(u => u.Name);

How to convert given data into IEnumerable object

I have below code in c# 4, where I am trying to use linq for ordering, grouping.
IList<Component> components = Component.OrganizationalItem.OrganizationalItem.Components(true);
IEnumerable<Component> baggage = components.Where(x => x.IsBasedOnSchema(Constants.Schemas.BaggageAllowance.ToString()))
.OrderBy(x => x.ComponentValue("name").StringValue("Code"))
.GroupBy(x => x.ComponentValue("name").StringValue("Code"));
In above sample when I am trying to use GroupBy it is giving error, please see below:
Cannot implicitly convert type 'System.Collections.Generic.IEnumerable<System.Linq.IGrouping<string,Tridion.ContentManager.ContentManagement.Component>>' to 'System.Collections.Generic.IEnumerable<Tridion.ContentManager.ContentManagement.Component>'. An explicit conversion exists (are you missing a cast?)
The result of GroupBy will be a sequence of IGrouping<string, Component>: a sequence of groups of components, rather than one sequence of components. That's the whole point of grouping. So this should be fine:
IEnumerable<IGrouping<string, Component>> baggage = ... query as before ...;
Or just use implicit typing:
var baggage = ...;
You can then iterate over the groups:
foreach (var group in baggage)
{
Console.WriteLine("Key: {0}", group.Key);
foreach (var component in group)
{
...
}
}

I have 2 Lists of strings. How do I get a bool that tells me if one list contains at least one string from the other list? (Using Lambda)

This should be simple but I could not wrap my head around it. Here is how I am doing it now, but it seems so wasteful.
There is a
List<string> committees
and
List<string> P.committees
I just want to see if one list has any strings that are contained in the other.
List<Person> listFilteredCommitteesPerson = new List<Person>();
foreach (Person p in listFilteredPerson)
{
foreach (string strCommittee in p.Committees)
{
if (committees.Contains(strCommittee))
{
listFilteredCommitteesPerson.Add(p);
}
}
}
listFilteredPerson = listFilteredCommitteesPerson;
For a boolean value:
var match =
committees.Intersect(listFilteredPerson.SelectMany(p => p.Committees)).Any();
If you want a collection of Person that have a match you can use:
var peopleThatMatch =
listFilteredPerson.Where(p => committees.Intersect(p.Committees).Any());
or:
var peopleThatMatch =
listFilteredPerson.Where(p => p.Committees.Any(s => committees.Contains(s)));
You might want to consider another collection type (e.g. HashSet<T>) for performance reasons if you have large collections.
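As a rough sketch of the HashSet<T> suggestion: the committee names are copied into a set once, so each lookup no longer scans a list. The Person class below is a guess at the shape used in the question, and CommitteeFilter with its method names is hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the question's Person class.
public class Person
{
    public string Name;
    public List<string> Committees = new List<string>();
}

// Hypothetical helper class for illustration.
public static class CommitteeFilter
{
    // True if at least one person sits on at least one of the given committees.
    public static bool AnyMatch(IEnumerable<Person> people, IEnumerable<string> committees)
    {
        var wanted = new HashSet<string>(committees);
        return people.Any(p => p.Committees.Any(c => wanted.Contains(c)));
    }

    // The people who sit on at least one of the given committees.
    // Each person appears at most once, however many committees match.
    public static List<Person> PeopleThatMatch(IEnumerable<Person> people, IEnumerable<string> committees)
    {
        var wanted = new HashSet<string>(committees);
        return people.Where(p => p.Committees.Any(c => wanted.Contains(c))).ToList();
    }
}
```

Unlike the nested loops in the question, Where adds each matching person only once, even if several of their committees match.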

Complex foreach loop possible to shorten to linq?

I have a cluttery piece of code that I would like to shorten using Linq. It's about the part in the foreach() loop that performs an additional grouping on the result set and builds a nested Dictionary.
Is this possible using a shorter Linq syntax?
var q = from entity in this.Context.Entities
join text in this.Context.Texts on new { ObjectType = 1, ObjectId = entity.EntityId} equals new { ObjectType = text.ObjectType, ObjectId = text.ObjectId}
into texts
select new {entity, texts};
foreach (var result in q)
{
//Can this grouping be performed in the LINQ query above?
var grouped = from tx in result.texts
group tx by tx.Language
into langGroup
select new
{
langGroup.Key,
langGroup
};
//End grouping
var byLanguage = grouped.ToDictionary(x => x.Key, x => x.langGroup.ToDictionary(y => y.PropertyName, y => y.Text));
result.entity.Apply(x => x.Texts = byLanguage);
}
return q.Select(x => x.entity);
Side info:
What basically happens is that "texts" for every language and for every property for a certain objecttype (in this case hardcoded 1) are selected and grouped by language. A dictionary of dictionaries is created for every language and then for every property.
Entities have a property called Texts (the dictionary of dictionaries). Apply is a custom extension method which looks like this:
public static T Apply<T>(this T subject, Action<T> action)
{
action(subject);
return subject;
}
Isn't this far simpler?
foreach(var entity in Context.Entities)
{
// Create the result dictionary.
entity.Texts = new Dictionary<Language,Dictionary<PropertyName,Text>>();
// loop through each text we want to classify
foreach(var text in Context.Texts.Where(t => t.ObjectType == 1
&& t.ObjectId == entity.EntityId))
{
var language = text.Language;
var property = text.PropertyName;
// Create the sub-level dictionary, if required
if (!entity.Texts.ContainsKey(language))
entity.Texts[language] = new Dictionary<PropertyName,Text>();
entity.Texts[language][property] = text;
}
}
Sometimes good old foreach loops do the job much better.
Language, PropertyName and Text have no type in your code, so I named my types after the names...
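If you still prefer a LINQ-only version of the grouping step, something along these lines might work for in-memory data. This is only a sketch: TextRow is a hypothetical stand-in for the question's text entity, with Language, PropertyName and Text all assumed to be strings, and TextGrouping.ByLanguage is an invented name.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's text entity.
public class TextRow
{
    public int ObjectType;
    public int ObjectId;
    public string Language;
    public string PropertyName;
    public string Text;
}

public static class TextGrouping
{
    // Builds language -> (property name -> text) for one object.
    public static Dictionary<string, Dictionary<string, string>> ByLanguage(
        IEnumerable<TextRow> texts, int objectType, int objectId)
    {
        return texts
            .Where(t => t.ObjectType == objectType && t.ObjectId == objectId)
            .GroupBy(t => t.Language)
            .ToDictionary(
                g => g.Key,
                g => g.ToDictionary(t => t.PropertyName, t => t.Text));
    }
}
```

One difference from the foreach version: ToDictionary throws an ArgumentException if the same PropertyName appears twice within one language, whereas the loop silently overwrites the earlier value.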

C# - Any clever way to get an int array from an object collection?

How can I create an easy helper method to get an int array from a collection of objects?
The idea would be have a method which receive a collection of "User" class:
public class User {
public int UserId {get;set;}
public string UserName {get;set;}
}
And filter this collection to get an int array of unique UserIds.
List<int> repeatedUserIds = (from item in list
select item.UserId).ToList();
List<int> uniqueUserIds = ((from n in repeatedUserIds
select n).Distinct()).ToList();
Is there a way to create a clever method for this purpose?
You could create an extension method (it has to be a static method declared in a static class):
public static int[] GetUniqueIds<T>(this IEnumerable<T> items, Func<T, int> idSelector)
{
    return items.Select(idSelector).Distinct().ToArray();
}
And use it like this:
int[] uniqueUserIds = list.GetUniqueIds(u => u.UserId);
Well, I wouldn't bother with a query expression, personally - but the rest is fine:
List<int> repeatedUserIds = list.Select(item => item.UserId)
.ToList();
List<int> uniqueUserIds = repeatedUserIds.Distinct()
.ToList();
If you don't need repeatedUserIds for anything else, don't bother with the intermediate call to ToList():
List<int> uniqueUserIds = list.Select(item => item.UserId)
.Distinct()
.ToList();
(I generally like putting each operation on a separate line, but of course you don't have to.)
Note that your text asks for an array, but your code has been in terms of List<int>. If you genuinely want an int[] instead of a List<int>, just change the ToList() calls to ToArray().
List<int> uniqueUserIds = (from item in list
select item.UserId).Distinct().ToList();
