Applying Group By in LINQ - linq

I want to group the collection by the length of each string. Please tell me what I am doing wrong.
string[] collection = {"five","four","ten","one"};
var groupedValues =
    from w in collection
    group w by w.Length into getByGroup
    select getByGroup;
foreach (var g in groupedValues)
{
    Console.WriteLine(g);
}
The output is:
System.Linq.Lookup....
System.Linq.Lookup....
What went wrong?

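GroupBy returns a sequence of IGrouping<TKey, TElement> objects, each of which holds a Key and the elements in that group. Writing a group to the console only prints its type name, so iterate over each group's contents instead:
foreach (var g in groupedValues)
{
    Console.WriteLine("There are {1} strings of length {0}.", g.Key, g.Count());
    foreach (var v in g)
    {
        Console.WriteLine(" - {0}", v);
    }
}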

What went wrong depends on what you wanted to do!
The sequence you get back after grouping is not a flat sequence of the original objects (so in your case, it's not a sequence of strings). Otherwise how would they have been grouped?
Given that you apparently expected a flat list of strings, maybe you actually wanted to order them by length:
var collection = new[] {"five","four","ten","one"};
var byLength = collection.OrderBy(s => s.Length);
foreach (var s in byLength)
    Console.WriteLine(s);
Or if you wanted to group them, then you have to deal with each group in turn, and each group is a separate list of strings:
foreach (var g in groupedValues)
{
    Console.WriteLine("Strings of length " + g.Key + ":");
    foreach (var s in g)
        Console.WriteLine(" " + s);
}
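To tie the answers above together, here is a minimal, self-contained sketch of grouping the question's strings by length. The class and method names (GroupByLengthSketch, Describe, Print) are made up for illustration, and ordering the groups by key is an extra step the question did not ask for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByLengthSketch
{
    // Groups the words by length and renders each group as "length: word, word".
    public static List<string> Describe(IEnumerable<string> words)
    {
        return words
            .GroupBy(w => w.Length)   // each element is an IGrouping<int, string>
            .OrderBy(g => g.Key)      // shortest strings first (assumption, not in the question)
            .Select(g => g.Key + ": " + string.Join(", ", g))
            .ToList();
    }

    // Prints the groups for the collection used in the question.
    public static void Print()
    {
        foreach (var line in Describe(new[] { "five", "four", "ten", "one" }))
            Console.WriteLine(line);
    }
}
```

Calling GroupByLengthSketch.Print() writes one line per group. Within a group the strings keep their original relative order, because GroupBy preserves source order.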

Related

filter elements in nested dictionary LINQ

I have the following data structure:
Dictionary<string, Dictionary<string, List<int>>> data =
new Dictionary<string, Dictionary<string, List<int>>>();
I want to filter the elements of that dictionary based on the value of the first element of each list in the inner dictionary.
For example:
{legion1
{soldier1, [10,1000]},
{soldier2, [50,1000]}
}
Now let's say I want a foreach loop that works only on the elements where the first element of the list is less than 20.
The expected result in the foreach loop is:
{legion1{soldier1, [10,1000]}}
What I've tried:
I do a foreach loop and then try to use something like this:
data.where(x => x.value.where(o => o[0] < 20 ))
I always get an error saying that this is incorrect.
Please tell me how I can solve the issue and why my approach fails.
You can filter and iterate over the result set like so:
var resultSet =
    data.ToDictionary(e => e.Key,
                      e => e.Value.Where(x => x.Value[0] < 20)
                                  .ToDictionary(k => k.Key, v => v.Value)
    );
foreach(var item in resultSet){
    var key = item.Key; // string
    var values = item.Value; // Dictionary<string, List<int>>
    ...
    ...
}
The problem is that you are applying the [] operator incorrectly: inside the inner lambda, o is a KeyValuePair<string, List<int>>, so the index has to be applied to o.Value rather than to o itself. Moreover, since you want to use both Legion and Soldier, you should project each pair into an anonymous type combining the two of them:
foreach (var t in data.SelectMany(lg => lg.Value.Select(s => new {
        Legion = lg
    ,   Soldier = s
    })).Where(ls => ls.Soldier.Value[0] < 20)) {
    Console.WriteLine("Legion={0} Soldier = {1}", t.Legion.Key, t.Soldier.Key);
}
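As a runnable variant of the filtering shown above, the following sketch rebuilds the nested dictionary using the question's sample data. The names NestedFilterSketch, Filter and Sample are hypothetical. Two choices here are assumptions that neither answer makes: empty lists are skipped instead of throwing, and legions left with no matching soldiers are dropped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NestedFilterSketch
{
    // Keeps only the soldiers whose first list value is below the limit.
    // Assumption: legions that end up with no soldiers are removed.
    public static Dictionary<string, Dictionary<string, List<int>>> Filter(
        Dictionary<string, Dictionary<string, List<int>>> data, int limit)
    {
        return data
            .Select(legion => new
            {
                Name = legion.Key,
                Soldiers = legion.Value
                    // s is a KeyValuePair, so the list is reached through s.Value
                    .Where(s => s.Value.Count > 0 && s.Value[0] < limit)
                    .ToDictionary(s => s.Key, s => s.Value)
            })
            .Where(x => x.Soldiers.Count > 0)
            .ToDictionary(x => x.Name, x => x.Soldiers);
    }

    // Sample data taken from the question.
    public static Dictionary<string, Dictionary<string, List<int>>> Sample()
    {
        return new Dictionary<string, Dictionary<string, List<int>>>
        {
            {
                "legion1", new Dictionary<string, List<int>>
                {
                    { "soldier1", new List<int> { 10, 1000 } },
                    { "soldier2", new List<int> { 50, 1000 } }
                }
            }
        };
    }
}
```

NestedFilterSketch.Filter(NestedFilterSketch.Sample(), 20) keeps legion1 with only soldier1, which matches the expected result stated in the question.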

How to split cell into separate rows and find minimal summary value

I have the following dataset:
Movies : moviename, genre1, genre2, genre3 ..... genre19
(All the genres above have values 0 or 1, 1 indicates that the movie is of that genre)
Now I want to find which movie(s) have the fewest genres.
I tried the below Pig script:
items = load 'path' using PigStorage('|') as (mName:chararray,g1:int,g2:int,g3:int,g4:int,g5:int,g6:int,g7:int,g8:int,g9:int,g10:int,g11:int,g12:int,g13:int,g14:int,g15:int,g16:int,g17:int,g18:int,g19:int);
sumGenre = foreach items generate mName, g1+g2+g3+g4+g5+g6+g7+g8+g9+g10+g11+g12+g13+g14+g15+g16+g17+g18+g19 as sumOfGenres;
groupAll = group sumGenre All;
In the next step, using MIN(sumGenre.sumOfGenres), I can get the minimum genre count, but what I am looking for is the name of the movie (or movies) with the fewest genres, alongside that number of genres.
Can someone please help?
1. Is there an easier way to get the sum g1+g2+...+g19?
2. How do I output the movie(s) that have the fewest genres?
After the groupAll step, compute the minimum:
minGenre = foreach groupAll generate MIN(sumGenre.sumOfGenres) as minG;
Then do a left outer join of minGenre by minG with sumGenre by sumOfGenres to get the list of movies that have the fewest genres.
Hope this helps.
To sum a variable number of row fields, you can use a UDF like this:
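import java.io.IOException;
import java.util.List;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

public class DynRowSum extends EvalFunc<Integer>
{
    // Sums every field of the tuple except the first one (the movie name).
    public Integer exec(Tuple v) throws IOException
    {
        List<Object> olist = v.getAll();
        int sum = 0;
        int cnt = 0;
        for (Object o : olist) {
            cnt++;
            if (cnt != 1) { // skip the first field
                int val = (Integer) o;
                sum = sum + val;
            }
        }
        return sum; // autoboxed to Integer
    }
}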
In Pig, update the script like this:
grunt>sumGenre = foreach items generate mName,DynRowSum(*) as sumOfGenres;
The advantage is that the script stays the same if the number of genre columns increases or decreases.
a = LOAD 'path';
b = FOREACH a generate FLATTEN(STRSPLIT($0, '\\|'));
c = FOREACH b generate $0 as movie, FLATTEN(TOBAG(*)) as genre;
d = FILTER c BY movie!=genre;
e = GROUP d BY $0;
f = FOREACH e GENERATE group, SUM(d);
i = ORDER f BY $1;
j = LIMIT i 1;

c# group by alphabets

I need to show a list of authors grouped by the first letter of the last name.
e.g.
A
Kim, Ami
Dim, Amaiar
jin, Amairaz
B
Bin, Bom
Kin, Bomo
C
Cin, Ci
Con, Co
....
Could someone please tell me the best way to solve the above problem?
If you want to group, use GroupBy. I assumed you also want the output to be ordered (OrderBy). Change the GroupBy expression to match your exact requirement:
List<String> names = new List<String>{"Bill", "Mark", "Steve", "Amnon", "Benny"};
foreach(var g in names.GroupBy(name => name.First()).OrderBy(g => g.Key)){
    Console.WriteLine(g.Key);
    g.OrderBy(name => name).ToList().ForEach(Console.WriteLine);
}
Will output:
A
Amnon
B
Benny
Bill
M
Mark
S
Steve
You can use the GroupBy extension method on the collection to get the desired result.
List<string> firstNames = new List<string>(){ "Ami", "Amaiar","Amiraz","Bom","Bomo","Ci","Co" };
var groups = firstNames.GroupBy(x=>x[0]);
foreach (var element in groups)
{
    Console.WriteLine("{0}", element.Key);
    foreach (var word in element)
        Console.WriteLine(" {0}", word);
}
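Both answers above group plain first names. As a sketch closer to the question, the following assumes each author is stored as a single "Last, First" string and groups on the first letter of that string, ignoring case. The storage format, the case-insensitive handling and the names AuthorIndexSketch and ByLastNameInitial are all assumptions, not something the question states.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AuthorIndexSketch
{
    // Returns (letter, authors) pairs ordered by letter, with authors sorted inside each group.
    // Assumption: every entry looks like "Last, First", so its first letter is the last-name initial.
    public static List<KeyValuePair<char, List<string>>> ByLastNameInitial(IEnumerable<string> authors)
    {
        return authors
            .Where(a => !string.IsNullOrWhiteSpace(a))                // ignore blank entries
            .GroupBy(a => char.ToUpperInvariant(a.TrimStart()[0]))    // "jin" and "Jin" share a group
            .OrderBy(g => g.Key)
            .Select(g => new KeyValuePair<char, List<string>>(
                g.Key,
                g.OrderBy(a => a, StringComparer.OrdinalIgnoreCase).ToList()))
            .ToList();
    }
}
```

Each returned pair can be printed with the letter as a heading followed by its authors, in the same way as the loops in the answers above.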

Row number in LINQ

I have a linq query like this:
var accounts =
    from account in context.Accounts
    from guranteer in account.Gurantors
    where guranteer.GuarantorRegistryId == guranteerRegistryId
    select new AccountsReport
    {
        recordIndex = ?
        CreditRegistryId = account.CreditRegistryId,
        AccountNumber = account.AccountNo,
    }
I want to populate recordIndex with the current row number in the collection returned by the LINQ query. How can I get the row number?
Row number is not supported in linq-to-entities. You must first retrieve records from database without row number and then add row number by linq-to-objects. Something like:
var accounts =
    (from account in context.Accounts
     from guranteer in account.Gurantors
     where guranteer.GuarantorRegistryId == guranteerRegistryId
     select new
     {
         CreditRegistryId = account.CreditRegistryId,
         AccountNumber = account.AccountNo,
     })
    .AsEnumerable() // Moving to linq-to-objects
    .Select((r, i) => new AccountsReport
    {
        recordIndex = i,
        CreditRegistryId = r.CreditRegistryId,
        AccountNumber = r.AccountNumber, // property name of the anonymous type above
    });
LINQ to Objects has this built in for any enumerable:
http://weblogs.asp.net/fmarguerie/archive/2008/11/10/using-the-select-linq-query-operator-with-indexes.aspx
Edit: Although IQueryable supports it too, it has been mentioned that this unfortunately does not work for LINQ to SQL/Entities.
new []{"aap", "noot", "mies"}
.Select( (element, index) => new { element, index });
Will result in:
{ { element = aap, index = 0 },
{ element = noot, index = 1 },
{ element = mies, index = 2 } }
Other LINQ extension methods (like .Where) also have an overload with the extra index parameter.
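As a small illustration of the indexed overloads, the sketch below combines the index-aware Where and Select from LINQ to Objects. The names IndexedOperatorsSketch and EveryOtherNumbered are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexedOperatorsSketch
{
    // Keeps the elements at even positions (0, 2, 4, ...) and numbers the survivors from 1.
    public static List<string> EveryOtherNumbered(IEnumerable<string> items)
    {
        return items
            .Where((item, index) => index % 2 == 0)               // index in the source sequence
            .Select((item, index) => (index + 1) + ". " + item)   // index in the filtered sequence
            .ToList();
    }
}
```

Note that the index seen by Select counts positions in the already filtered sequence, which is usually what a row number is expected to be.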
Try using let like this:
int[] ints = new[] { 1, 2, 3, 4, 5 };
int counter = 0;
var result = from i in ints
             where i % 2 == 0
             let number = ++counter
             select new { I = i, Number = number };
foreach (var r in result)
{
    Console.WriteLine(r.Number + ": " + r.I);
}
I cannot test it with actual LINQ to SQL or Entity Framework right now. Note that the above code will retain the value of the counter between multiple executions of the query.
If this is not supported with your specific provider you can always foreach (thus forcing the execution of the query) and assign the number manually in code.
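The manual fallback mentioned above might look like the following sketch. It numbers plain strings in memory, so AccountRow, its members and the method name Number are hypothetical stand-ins for the real report type and query results.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the report row; only the members used here.
public class AccountRow
{
    public int RecordIndex { get; set; }
    public string AccountNumber { get; set; }
}

public static class ManualNumberingSketch
{
    // Enumerates the source once (which executes a deferred query) and
    // assigns a zero-based row number to each item in enumeration order.
    public static List<AccountRow> Number(IEnumerable<string> accountNumbers)
    {
        var rows = new List<AccountRow>();
        int index = 0;
        foreach (var number in accountNumbers)
        {
            rows.Add(new AccountRow { RecordIndex = index, AccountNumber = number });
            index++;
        }
        return rows;
    }
}
```

Because the counter is a local variable inside the method, it starts from zero on every call, which avoids the retained-counter behaviour noted for the let-based query.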
Because the query in the question filters by a single id, I think the answers given won't help. Of course you can do it all in memory on the client side, but depending on how large the dataset is and whether a network is involved, this could be an issue.
If you need a SQL ROW_NUMBER [..] OVER [..] equivalent, the only way I know is to create a view in your SQL server and query against that.
This is tested and works:
Amend your code as follows:
int counter = 0;
var accounts =
    from account in context.Accounts
    from guranteer in account.Gurantors
    where guranteer.GuarantorRegistryId == guranteerRegistryId
    select new AccountsReport
    {
        recordIndex = counter++,
        CreditRegistryId = account.CreditRegistryId,
        AccountNumber = account.AccountNo,
    };
Hope this helps, though it's late. :)

List.GroupBy<> error using LINQ

I'm trying to group a generic List<> in C#. The code compiles, but the application (Silverlight) throws the following error (ChartOpps is the class of objects in the list I'm trying to group):
Unhandled Error in Silverlight Application Unable to cast object of type 'Grouping[System.DateTime,Invoc_SalesDashboard.ChartOpps]' to type 'Invoc_SalesDashboard.ChartOpps'.
Here's the code:
var newtemplist = list.GroupBy(opp =>
    new DateTime(opp.EstimatedCloseDate.Year, opp.EstimatedCloseDate.Month, 1)).OrderBy(opp => opp.Key);
I've also tried:
var newtemplist =
    from opp in list
    orderby opp.EstimatedCloseDate
    group opp by new { opp.EstimatedCloseYear, opp.EstimatedCloseMonth };
ChartOpps have a revenue value, and the EstimatedCloseDate value. What I'm hoping to end up with is a list of ChartOpps with the revenue aggregated in the year/month groupings.
foreach (ChartOpps c in newtemplist)
{
    ErrorBox.Text += "o";
}
You're not showing us what you're doing with the result newtemplist. The runtime error message indicates that you are taking a group and trying to treat it as an instance of ChartOpps which is clearly impossible. Show us that code and we can help you fix it.
Edit:
Okay, now the problem is clear. To enumerate the results of the grouping, you need to proceed as follows:
foreach(var group in newtemplist) {
    foreach(ChartOpps c in group) {
        // do something with c here
    }
}
newtemplist is a sequence of sequences, where all elements of each inner sequence share the same value of new DateTime(opp.EstimatedCloseDate.Year, opp.EstimatedCloseDate.Month, 1). Therefore, to enumerate this sequence of sequences, you first have to enumerate the groups, and then within each group enumerate the instances of ChartOpps.
Not knowing anything about your class structure, here is a basic attempt:
List<ChartOpps> list = GetList();
var newtemplist =
    from opp in list
    group opp by opp.EstimatedCloseYear into g
    select new { g = g.Key, ChartOpps = g };
If you take the var out of the picture, it all becomes clear.
IEnumerable<IGrouping<DateTime, ChartOpps>> newtemplist = list
    .GroupBy(opp => new DateTime(
        opp.EstimatedCloseDate.Year,
        opp.EstimatedCloseDate.Month,
        1))
    .OrderBy(opp => opp.Key);
foreach (ChartOpps c in newtemplist)
{
    ErrorBox.Text += "o";
}
The error occurs in the assignment of the first element of newtemplist to c. c can only reference ChartOpps instances. The first element of newtemplist is an IGrouping<DateTime, ChartOpps>, not a ChartOpps. The implicit cast in the foreach fails and you get a runtime exception.
Try instead:
foreach(IGrouping<DateTime, ChartOpps> g in newtemplist)
{
    foreach (ChartOpps c in g)
    {
        ErrorBox.Text += "o";
    }
}
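The question also mentions wanting a list of ChartOpps with the revenue aggregated per year/month. One possible sketch of that step follows. The ChartOpps class here is a hypothetical reduction with only two members, and the property name Revenue and its decimal type are assumptions, since the question does not show the class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of the class from the question; only the members used below.
public class ChartOpps
{
    public DateTime EstimatedCloseDate { get; set; }
    public decimal Revenue { get; set; }   // assumed name and type
}

public static class MonthlyRevenueSketch
{
    // Collapses each year/month group into a single ChartOpps whose date is
    // the first day of the month and whose revenue is the group's total.
    public static List<ChartOpps> AggregateByMonth(IEnumerable<ChartOpps> list)
    {
        return list
            .GroupBy(opp => new DateTime(
                opp.EstimatedCloseDate.Year,
                opp.EstimatedCloseDate.Month,
                1))
            .OrderBy(g => g.Key)
            .Select(g => new ChartOpps
            {
                EstimatedCloseDate = g.Key,
                Revenue = g.Sum(opp => opp.Revenue)
            })
            .ToList();
    }
}
```

Because the Select projects every group back into a ChartOpps, the result can be enumerated with foreach (ChartOpps c in ...) without the cast failure described above.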

Resources