List within lists within lists - linq

I have a list called Countries, and each country has a list of Towns, each of which in turn has a list of Streets. And a street has a number of houses. Lists within lists within lists. Very simple.
I need to generate a list of houses that are located in countries whose names start with the letter 'A'. Not a very logical example, but it's easier to explain than the more complex structure I'm dealing with.
This is, of course, not too complex and could be done by creating a List and then ForEaching all countries.Where(Name.StartsWith('A')), then ForEaching all towns and their streets, and finally adding each house on those streets to the list.
I don't like that method so I want something prettier...
Could this be done by using something like Aggregate on the Countries.Where() list? If so, how? (Thus, in a single statement.)
Yes, the selection will be on the top list only, so that should make it easier.

This looks like a job for Enumerable.SelectMany (which allows you to ungroup one level of hierarchy):
List<Country> countryList = GetCountries();
IEnumerable<Country> aCountries = countryList
    .Where(c => c.Name.StartsWith("A"));
List<House> aCountryHouses = aCountries
    .SelectMany(c => c.Towns)
    .SelectMany(t => t.Streets)
    .SelectMany(s => s.Houses)
    .ToList();
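To see what each SelectMany step does, here is a rough analogue in Python rather than C#. The Country, Town and Street dataclasses and the sample data are made up for illustration. Each chain.from_iterable call removes one level of nesting, which is roughly what one SelectMany call does.

```python
from dataclasses import dataclass, field
from itertools import chain


@dataclass
class Street:
    name: str
    houses: list = field(default_factory=list)


@dataclass
class Town:
    name: str
    streets: list = field(default_factory=list)


@dataclass
class Country:
    name: str
    towns: list = field(default_factory=list)


def houses_in_countries_starting_with(countries, prefix):
    """Flatten countries -> towns -> streets -> houses, one level at a time."""
    selected = (c for c in countries if c.name.startswith(prefix))
    towns = chain.from_iterable(c.towns for c in selected)
    streets = chain.from_iterable(t.streets for t in towns)
    houses = chain.from_iterable(s.houses for s in streets)
    return list(houses)


# Hypothetical sample data, only for illustration.
countries = [
    Country("Austria", [Town("Vienna", [Street("Ring", ["R1", "R2"]),
                                        Street("Graben", ["G1"])])]),
    Country("Belgium", [Town("Ghent", [Street("Veldstraat", ["V1"])])]),
    Country("Albania", [Town("Tirana", [Street("Rruga", ["T1"])])]),
]

a_houses = houses_in_countries_starting_with(countries, "A")
print(a_houses)
```

The filter is applied once, on the top-level list only, and everything below it is flattened lazily until the final list() call.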

Related

How can I get a total sum from one table if it is linked with a lot of tables using linq?

I have multiple tables with one-to-many relationships, e.g. Country -> Region -> Center -> Greater -> Section. The section has a column for census. I am trying to write a linq query for my view to get the total census grouped by the Country. I also need to know how many regions, how many centers and how many greaters there are in a country, and the total census. They can be separate queries; that is no problem.
In this case, the SelectMany method in LINQ is your friend. Each country has a collection of regions, right? You can use .SelectMany to combine all of the regions from several countries into a single collection of regions. Then you need to get a collection of centers from all of the regions, and so on and so on.
Consider this code, which flattens each country's hierarchy down to its sections and sums their census:
context.Countries.Select(country => new
{
    CountryId = country.Id,
    TotalCensus = country.Regions
        .SelectMany(region => region.Centers)
        .SelectMany(center => center.Greaters)
        .SelectMany(greater => greater.Sections)
        .Sum(section => section.Census)
});
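The question also asks for the number of regions, centers and greaters per country. As a sketch of that idea in Python (not LINQ), the flattening can be done one level at a time and each level counted along the way. The nested dictionaries and their keys below are hypothetical stand-ins for the real tables.

```python
# Hypothetical nested data: Country -> Region -> Center -> Greater -> Section.
countries = [
    {"id": 1, "regions": [
        {"centers": [
            {"greaters": [
                {"sections": [{"census": 10}, {"census": 5}]},
                {"sections": [{"census": 7}]},
            ]},
        ]},
        {"centers": []},
    ]},
    {"id": 2, "regions": []},
]


def summarize(country):
    """Flatten one level per step, counting each level along the way."""
    regions = country["regions"]
    centers = [c for r in regions for c in r["centers"]]
    greaters = [g for c in centers for g in c["greaters"]]
    sections = [s for g in greaters for s in g["sections"]]
    return {
        "country_id": country["id"],
        "regions": len(regions),
        "centers": len(centers),
        "greaters": len(greaters),
        "total_census": sum(s["census"] for s in sections),
    }


summary = [summarize(c) for c in countries]
for row in summary:
    print(row)
```

Because the flattening starts from each country separately, the per-country key is never lost, so no separate grouping step is needed afterwards.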

Finding elements that appear in groups most

Having trouble figuring out how to go about this algorithm.
Input: any number of lists each holding elements grouped by a common attribute
For example,
matched_by_first_name = {"bob" => [person, person, ...], "nancy" => [person, ...], ...}
matched_by_zip_code = {"12345" => [person, person, ...], "56789" => [person, ...], ...}
Output: List of groups of people that appear most frequently in the same groups, with separate "weightings" per input list. So, I might weight two people grouped by the same first name more than I would weight two people grouped by the same zip code.
In other words:
matches = [[person, person], [person], [person, person, person]]
Basically, if there are two persons and for every single grouping they are in the same group, then they should definitely be in the same final matched group. If there's only one group they're not in, then they should probably still be matched (depending on the weighting of that group type).
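One possible way to approach this, offered only as a sketch and not as an established answer: score every pair of people by the summed weight of the groupings in which they share a group, then merge pairs whose score reaches a threshold. The function name match_people, the weights, the threshold and the use of single-linkage merging (union-find) are all assumptions made for illustration. People are represented as plain strings.

```python
from collections import defaultdict
from itertools import combinations


def match_people(groupings, threshold):
    """groupings: list of (weight, {key: [person, ...]}) pairs.

    Scores every pair of people by the summed weight of the groupings in
    which they share a group, then merges pairs whose score reaches
    `threshold` (single-linkage, via union-find).
    """
    scores = defaultdict(float)
    people = set()
    for weight, groups in groupings:
        for members in groups.values():
            people.update(members)
            for a, b in combinations(sorted(set(members)), 2):
                scores[(a, b)] += weight

    parent = {p: p for p in people}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for (a, b), score in scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    clusters = defaultdict(list)
    for p in sorted(people):
        clusters[find(p)].append(p)
    return sorted(clusters.values())


# Hypothetical input; first names are weighted more than zip codes.
matched_by_first_name = {"bob": ["bob1", "bob2"], "nancy": ["nancy1"]}
matched_by_zip_code = {"12345": ["bob1", "bob2", "nancy1"], "56789": []}

matches = match_people(
    [(2.0, matched_by_first_name), (1.0, matched_by_zip_code)],
    threshold=2.5,
)
print(matches)
```

Single-linkage merging is transitive, so a chain of strong pairwise matches ends up in one group even if its two ends never shared a group. Whether that is acceptable depends on the data.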

how to create set of values, after group function in Pig (Hadoop)

Let's say I have a set of values in file.txt
a,b,c
a,b,d
k,l,m
k,l,n
k,l,o
And my code is:
file = LOAD 'file.txt' using PigStorage(',');
events = foreach file generate session_id, user_id, code, type;
gr = group events by (session_id, user_id);
and I get this set of values:
((a,b),{(a,b,c),(a,b,d)})
((k,l),{(k,l,m),(k,l,n),(k,l,o)})
And I'd like to have:
(a,b,(c,d))
(k,l,(m,n,o))
Have you got any idea how to do it?
Regards
Pawel
Note: your question is inconsistent. You use session_id, user_id, code, type in the FOREACH line, but your LOAD with PigStorage does not define those fields. Also, that FOREACH has 4 values, while your sample data only has 3. I'll assume that type doesn't exist in order to answer your question.
After your gr relation, you are left with the group-by key (in this case (session_id, user_id)) in an automatically generated tuple called group.
So, first step: gr2 = FOREACH gr GENERATE FLATTEN(group);
This will give you the tuples (a,b) and (k,l). You need to use FLATTEN because group is a tuple and you are asking for session_id and user_id to be individual columns. FLATTEN does that for you.
Ok, so now modify the gr2 line to also use a projection to tease out the third value:
gr2 = FOREACH gr GENERATE FLATTEN(group), events.code;
events.code creates a bag out of all the code values. events is the name of the bag of grouped tuples (it's named after the original relation).
This should give you:
(a,b,{(c),(d)})
(k,l,{(m),(n),(o)})
It's very important to note that the values in the list are in a bag not a tuple, like you asked for. Keeping it in a bag is the right idea because the bag is a variable list, while a tuple is not.
Additional advice: Understanding how GROUP BY outputs data is something I see a lot of people struggle with when first using Pig. If you think my answer doesn't make much sense, I'd recommend spending some time to really get to understand GROUP BY. Understanding versus thinking it is magic will pay off in the long run.
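To make the shape of the GROUP BY output less magical, here is a small Python imitation of the two Pig steps above. It is only an analogy: a Python dict of lists stands in for the grouped relation, and a list stands in for the bag (which in Pig has no guaranteed order).

```python
from collections import defaultdict

# Rows as loaded from file.txt: (session_id, user_id, code).
events = [
    ("a", "b", "c"),
    ("a", "b", "d"),
    ("k", "l", "m"),
    ("k", "l", "n"),
    ("k", "l", "o"),
]

# GROUP events BY (session_id, user_id): key tuple -> bag of full rows.
gr = defaultdict(list)
for row in events:
    gr[(row[0], row[1])].append(row)

# FOREACH gr GENERATE FLATTEN(group), events.code:
# the key tuple is spread into columns, the codes stay together in a list.
gr2 = [(*group, [row[2] for row in bag]) for group, bag in gr.items()]

for record in gr2:
    print(record)
```

The intermediate gr mapping corresponds to the ((a,b),{(a,b,c),(a,b,d)}) records in the question: each key still carries the full original rows, and the projection picks one field out of them.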

Linq - group by a name, but still get the id

I'm trying to find duplicates in linq by a particular column (the name column), but I also wish to return the unique id, as I wish to bind to the ID to display additional information about the row.
I've dug around on stackoverflow, but can only find ways of finding duplicates in the fashion of:
By the whole object
By a particular property
Getting the number of duplicates
The closest thing I could find was by specifying "Key" in my group by, but I'm unsure if that is working.
Ideally I'm hoping to output something that has the ID, Number of Duplicates.
Thanks
Assume you have a people collection:
from p in people
group p by p.Name into g
select new {
    Name = g.Key,
    NumberOfDuplicates = g.Count(),
    IDs = g.Select(x => x.ID)
}
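The same grouping idea, sketched in Python for comparison: collect the IDs under each name, then report the count and the IDs together. The sample rows are hypothetical. Unlike the query above, this sketch also filters out names that occur only once, which is an extra step added here because the question is about duplicates.

```python
from collections import defaultdict

# Hypothetical rows: (ID, Name).
people = [(1, "Ann"), (2, "Bob"), (3, "Ann"), (4, "Cid"), (5, "Ann")]

# Group the IDs by name, keeping every ID so rows can be looked up later.
ids_by_name = defaultdict(list)
for person_id, name in people:
    ids_by_name[name].append(person_id)

# Keep only names that occur more than once.
duplicates = [
    {"name": name, "number_of_duplicates": len(ids), "ids": ids}
    for name, ids in ids_by_name.items()
    if len(ids) > 1
]
print(duplicates)
```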

VS 2010 reporting services grouping

I want to load the list of the groups as well as data into two separate datatables (or one, but I don't see that possible). Then I want to apply the grouping like this:
Groups
A
B
Bar
C
Car
Data
Ale
Beer
Bartender
Barry
Coal
Calm
Carbon
The final result after grouping should be like this.
*A
  Ale
*B
  *Bar
    Bartender
    Barry
  Beer
*C
  Calm
  *Car
    Carbon
  Coal
I only have a grouping list, not the levels or anything else. The items falling under a certain group are the ones that start with the same letters as the group's name. The indentation is not a must. Hopefully my example clarifies what I need, but I am not able to name it, so I am unable to find anything similar on Google.
The key things here are:
1. Grouping by a provided list of groups
2. There can be unlimited layers of grouping
Since every record has its children, the query should also return a parent for each record. Then there is a nice trick in the advanced grouping tab: choosing the parent column as the recursive parent yields as many higher-level groups as needed. I learnt about that in http://blogs.microsoft.co.il/blogs/barbaro/archive/2008/12/01/creating-sum-for-a-group-with-recursion-in-ssrs.aspx
I suggest reporting from a query like this:
select gtop.category top_category,
       gsub.category sub_category,
       dtab.category data_category
from groupTable gtop
join groupTable gsub on gsub.category like gtop.category + '%'
left join dataTable dtab on dtab.category like gsub.category + '%'
where len(gtop.category) = 1 and
      not exists
      (select null
       from groupTable gchk
       where gsub.category = gtop.category and
             gchk.category like gsub.category + '%' and
             gchk.category <> gsub.category and
             dtab.category like gchk.category + '%')
- with report groups on top_category and sub_category, and headings for both groups. You will probably want to hide the sub_category heading row when sub_category = top_category.
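The parent-per-record idea mentioned earlier in this thread can be sketched outside SQL as well. The Python below is a rough illustration, not part of either answer: it gives every group the longest other group that prefixes it as its parent, and every data item the most specific group that prefixes it. The helper name longest_prefix is made up. Matching is case-sensitive.

```python
def longest_prefix(name, groups, proper=False):
    """Return the longest group that is a prefix of `name`, or None."""
    candidates = [
        g for g in groups
        if name.startswith(g) and not (proper and g == name)
    ]
    return max(candidates, key=len, default=None)


groups = ["A", "B", "Bar", "C", "Car"]
data = ["Ale", "Beer", "Bartender", "Barry", "Coal", "Calm", "Carbon"]

# Parent of each group: the longest other group that prefixes it.
group_parent = {g: longest_prefix(g, groups, proper=True) for g in groups}
# Parent of each data item: the most specific group that prefixes it.
item_parent = {d: longest_prefix(d, groups) for d in data}

print(group_parent)
print(item_parent)
```

With a parent available for every row, any number of nesting levels can be handled, since each level only needs to know its immediate parent.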
