How do I group by sequence in LINQ? - linq

Given the sequence:
["1","A","B","C","2","F","K","L","5","6","P","I","E"]
The numbers represent items that I identify as headers, whereas the letters represent items that I identify as data. I want to associate them into groups like this.
1:A,B,C
2:F,K,L
5:
6:P,I,E
I can easily achieve this using a foreach or while loop on the enumerator, but is there a LINQ'ish way to achieve this? This is a recurring pattern in my domain.

Here's a solution with LINQ. It's a little complicated, and there may be room for some tricks. It doesn't look terrible, but a foreach loop would be more readable.
int lastHeaderIndex = default(int);
Dictionary<string, IEnumerable<string>> groupedItems =
items.Select((text, index) =>
{
int number;
if (int.TryParse(text, out number))
{
lastHeaderIndex = index;
}
return new { HeaderIndex = lastHeaderIndex, Value = text };
})
.GroupBy(item => item.HeaderIndex)
.ToDictionary(item => item.FirstOrDefault().Value,
item => item.Skip(1).Select(arg => arg.Value));

You can make use of a fold:
var aggr = new List<Tuple<int, List<string>>>();
var res = sequence.Aggregate(aggr, (d, x) => {
    int i;
    if (Int32.TryParse(x, out i)) {
        // A header starts a new, initially empty group.
        d.Add(Tuple.Create(i, new List<string>()));
    }
    else {
        // Data goes into the most recent group (assumes the sequence starts with a header).
        d[d.Count - 1].Item2.Add(x);
    }
    return d;
}).ToDictionary(x => x.Item1, x => x.Item2);
However, this doesn't look so nice, since support for immutable values is lacking. Also, I couldn't test this right now.

A foreach loop with int.TryParse should help. GroupBy from LINQ won't help much here.
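A minimal sketch of that loop, for illustration only. The class and method names (HeaderGrouping, GroupByHeader) are hypothetical, and the sketch assumes headers are unique and that data items appearing before the first header can be dropped:

```csharp
using System;
using System.Collections.Generic;

public static class HeaderGrouping
{
    // Hypothetical helper: maps each numeric header to the data items that follow it.
    // Assumes headers are unique; a repeated header would replace the earlier group.
    public static Dictionary<string, List<string>> GroupByHeader(IEnumerable<string> items)
    {
        var groups = new Dictionary<string, List<string>>();
        List<string> current = null;

        foreach (var item in items)
        {
            int number;
            if (int.TryParse(item, out number))
            {
                // A header starts a new (possibly empty) group.
                current = new List<string>();
                groups[item] = current;
            }
            else if (current != null)
            {
                // Data belongs to the most recent header; data before any header is skipped.
                current.Add(item);
            }
        }
        return groups;
    }

    public static void Main()
    {
        var sequence = new[] { "1", "A", "B", "C", "2", "F", "K", "L", "5", "6", "P", "I", "E" };
        foreach (var pair in GroupByHeader(sequence))
        {
            Console.WriteLine("{0}:{1}", pair.Key, string.Join(",", pair.Value));
        }
    }
}
```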

Since this is a common pattern in your domain, consider streaming the results instead of gathering them all into a large in-memory object.
public static IEnumerable<IList<string>> SplitOnToken(IEnumerable<string> input, Func<string,bool> isSplitToken)
{
var set = new List<string>();
foreach(var item in input)
{
if (isSplitToken(item) && set.Any())
{
yield return set;
set = new List<string>();
}
set.Add(item);
}
if (set.Any())
{
yield return set;
}
}
Sample usage:
var sequence = new[] { "1", "A", "B", "C", "2", "F", "K", "L", "5", "6", "P", "I", "E" };
var groups = SplitOnToken(sequence, x => Char.IsDigit(x[0]));
foreach (var @group in groups)
{
Console.WriteLine("{0}: {1}", @group[0], String.Join(" ", @group.Skip(1).ToArray()));
}
output:
1: A B C
2: F K L
5:
6: P I E

Here's what I ended up using. Pretty much the same structure as phg's answer.
Basically, it is an aggregate function that maintains a Tuple containing:
1: the state of the parser.
2: the accumulated data.
The aggregating function does an if-else to check whether the currently examined item is a group header or a regular item. Based on this, it updates the data store (last part of the tuple) and/or changes the parser state (first part of the tuple).
In my case, the parser state is the currently active list (into which upcoming items are inserted).
var sequence = new[]{ "1","A","B","C","2","F","K","L","5","6","P","I","E"};
var aggr = Tuple.Create(new List<string>(), new Dictionary<int,List<string>>());
var res = sequence.Aggregate(aggr, (d, x) => {
int i;
if (Int32.TryParse(x, out i))
{
var newList = new List<string>();
d.Item2.Add(i,newList);
return Tuple.Create(newList,d.Item2);
} else
{
d.Item1.Add(x);
return d;
}
},d=>d.Item2);

Related

enumerable group field using Linq?

I've written a Linq sentence like this:
var fs = list
.GroupBy(i =>
new {
X = i.X,
Ps = i.Properties.Where(p => p.Key.Equals("m")) // <-- the field in question
}
)
.Select(g => g.Key);
Am I able to group by IEnumerable.Where(...) fields?
The grouping won't work here.
When grouping, the runtime compares group keys to decide which items belong together. Your key contains a property (Ps) that is a distinct IEnumerable<T> instance for every item in list, and such instances are compared by reference, not by sequence content. Every element therefore ends up with its own key. In other words, if you have two items:
var a = new { X = 1, Properties = new[] { "m" } };
var b = new { X = 1, Properties = new[] { "m" } };
The GroupBy clause will give you two distinct keys.
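A small console sketch can confirm this. It adapts the two items above (where Properties holds plain strings, so the filter is p == "m" rather than p.Key); the class and method names are hypothetical:

```csharp
using System;
using System.Linq;

public static class GroupKeyDemo
{
    // Counts the groups produced when the key contains a freshly built sequence.
    public static int CountGroups()
    {
        var list = new[]
        {
            new { X = 1, Properties = new[] { "m" } },
            new { X = 1, Properties = new[] { "m" } }
        };

        // Each call to Where creates a new iterator object, and anonymous-type
        // equality compares that member by reference, so the keys never match.
        return list
            .GroupBy(i => new { i.X, Ps = i.Properties.Where(p => p == "m") })
            .Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountGroups()); // prints 2
    }
}
```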
If your intent is to just project the items into the structure of the GroupBy key then you don't need the grouping; the query below should give the same result:
var fs = list.Select(item => new
{
item.X,
Ps = item.Properties.Where(p => p.Key == "m")
});
However, if you do require the results to be distinct, you'll need to create a separate class for your result and implement a separate IEqualityComparer<T> to be used with Distinct clause:
public class Result
{
public int X { get; set; }
public IEnumerable<string> Ps { get; set; }
}
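public class ResultComparer : IEqualityComparer<Result>
{
    public bool Equals(Result a, Result b)
    {
        return a.X == b.X && a.Ps.SequenceEqual(b.Ps);
    }
    // Must agree with Equals: results that compare equal need the same hash.
    // Hashing only X is a simple choice that satisfies this.
    public int GetHashCode(Result r)
    {
        return r.X.GetHashCode();
    }
}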
Having the above you can use Distinct on the first query to get distinct results:
var fs = list.Select(item => new Result
{
X = item.X,
Ps = item.Properties.Where( p => p.Key == "m")
}).Distinct(new ResultComparer());

LINQ with List<int> querying the values of a Dictionary<int, object>

I have a problem with a query. I have a List of int and want to use it to get the values from my dictionary. The dictionary keys are int, and some of them match the list items. My question is how to get the objects out of the dictionary whose keys match the list items. I was programming Java for the last few years and am now struggling with LINQ :(
Thanks in advance
Problem solved. Thank you all :)
No idea how to close this topic. I have been reading Stack Overflow for a year, but this was my first post.
You can use Linq to join list items with dictionary KeyValuePair entries on entry key. And then select entry value from each joined pair:
var values = from l in list
join kvp in dictionary on l equals kvp.Key
select kvp.Value;
Lambda syntax:
var values = list.Join(dictionary, l => l, kvp => kvp.Key, (l,kvp) => kvp.Value);
Basically:
var value = dictionary[integerKey];
Or:
if (dictionary.TryGetValue(integerKey, out value)) {
}
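Applied to every item of the list, the TryGetValue form could look like the sketch below. The helper name ValuesFor and the sample data are made up for illustration; keys missing from the dictionary are simply skipped:

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryLookup
{
    // Hypothetical helper: returns the values whose keys appear in 'keys', in list order,
    // silently skipping keys the dictionary does not contain.
    public static List<TValue> ValuesFor<TValue>(IDictionary<int, TValue> dictionary, IEnumerable<int> keys)
    {
        var result = new List<TValue>();
        foreach (var key in keys)
        {
            TValue value;
            if (dictionary.TryGetValue(key, out value))
            {
                result.Add(value);
            }
        }
        return result;
    }

    public static void Main()
    {
        var dictionary = new Dictionary<int, string> { { 1, "one" }, { 2, "two" }, { 3, "three" } };
        var list = new List<int> { 3, 5, 1 };
        Console.WriteLine(string.Join(", ", ValuesFor(dictionary, list))); // three, one
    }
}
```

Because this does one hash lookup per list item, it avoids scanning the whole dictionary for each key.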
You can also create an extension method:
public static class DictionaryExtensions
{
public static IEnumerable<TValue> FilterValuesBy<TKey, TValue>(this IDictionary<TKey, TValue> dictionary, IEnumerable<TKey> filter)
{
if (dictionary == null) throw new ArgumentNullException("dictionary");
if (filter == null) throw new ArgumentNullException("filter");
var coll = filter as ICollection<TKey> ?? new HashSet<TKey>(filter);
return dictionary.Where(kvp => coll.Contains(kvp.Key)).Select(kvp => kvp.Value);
}
}
Usage:
class Program
{
static void Main()
{
var dict = Enumerable.Range(0, 10).ToDictionary(x => x);
var filter = Enumerable.Range(0, 2);
foreach (var i in dict.FilterValuesBy(filter))
{
Console.WriteLine(i);
}
Console.ReadLine();
}
}
Simple Linq method chain:
var dict = Enumerable.Range(0, 10).ToDictionary(x => x);
var filter = Enumerable.Range(0, 2).ToList();
var filtered = dict.Where(x => filter.Contains(x.Key)).Select(x => x.Value).ToList();

How to select decreasing sub-series with Linq

I have a list of prices ordered by date. I need to select all monotonically decreasing values. The following code works:
public static List<DataPoint> SelectDecreasingValues(List<DataPoint> dataPoints)
{
var ret = new List<DataPoint>(dataPoints.Count);
var previousPrice = dataPoints[0].Price;
for (int i = 0; i < dataPoints.Count; i++)
{
if (dataPoints[i].Price <= previousPrice)
{
ret.Add(dataPoints[i]);
previousPrice = dataPoints[i].Price;
}
}
return ret;
}
However, is there a shorter/cleaner way to accomplish it with Linq?
This code is equivalent:
var previousPrice = dataPoints[0].Price;
var ret = dataPoints.Where(x => {
if(x.Price <= previousPrice)
{ previousPrice = x.Price; return true;}
return false;
}).ToList();
However, if you don't need to have a list, go with plain enumerables and drop the ToList at the end. That way you can make use of the deferred execution feature built into LINQ.
The following code is also equivalent:
DataPoint previous = dataPoints.FirstOrDefault();
var ret = dataPoints.Where(x => x.Price <= previous.Price)
.Select(x => previous = x).ToList();
This works because of the deferred execution in LINQ. For each item in dataPoints it will first execute the Where part and then the Select part and only then will it move to the second item in dataPoints.
You need to decide which version you want to use. The second one is not as intention revealing as the first one, because you need to know about the internal workings of LINQ.
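To see the interleaving described above, a small sketch can log when each lambda runs. It uses plain int prices instead of DataPoint, and the names DeferredOrderDemo and Trace are hypothetical; it assumes a non-empty input:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredOrderDemo
{
    // Records the order in which the Where and Select lambdas run.
    // Assumes 'prices' is non-empty (First() throws otherwise).
    public static List<string> Trace(IEnumerable<int> prices)
    {
        var log = new List<string>();
        int previous = prices.First();

        // ToList forces the deferred query to execute; the result itself is not needed here.
        prices
            .Where(p => { log.Add("Where " + p); return p <= previous; })
            .Select(p => { log.Add("Select " + p); return previous = p; })
            .ToList();

        return log;
    }

    public static void Main()
    {
        // Logs: Where 5, Select 5, Where 7, Where 3, Select 3
        foreach (var entry in Trace(new[] { 5, 7, 3 }))
        {
            Console.WriteLine(entry);
        }
    }
}
```

The item 7 never reaches Select because Where rejects it before the next item is pulled, which is why previous is always up to date when the next comparison runs.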
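// Must be declared in a static class so it can be called as an extension method.
public static IEnumerable<T> WhereMonotonicDecreasing<T>(
this IEnumerable<T> source,
Func<T, IComparable> keySelector)
{
IComparable key = null; // assigned when the first item is seen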
bool first = true;
foreach(T t in source)
{
if (first)
{
key = keySelector(t);
yield return t;
first = false;
}
else
{
IComparable newKey = keySelector(t);
// Strictly decreasing; use <= 0 to also keep equal values, as the question's loop does.
if (newKey.CompareTo(key) < 0)
{
key = newKey;
yield return t;
}
}
}
}
Called by:
dataPoints.WhereMonotonicDecreasing(x => x.Price);
var previousPrice = dataPoints[0];
dataPoints.Where(p => p.Price <= previousPrice.Price)
.Select(p => previousPrice = p);
You can then use .ToList() if you really need one.
How about (untested):
return dataPoints.Take(1)
.Concat(dataPoints.Skip(1)
.Zip(dataPoints,
(next, previous) =>
new { Next = next, Previous = previous })
.Where(a => a.Next.Price <= a.Previous.Price)
.Select(a => a.Next))
.ToList();
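Essentially, this overlays a copy of the sequence shifted by one over the sequence itself to produce "next, previous" tuples and then applies the relevant filters on those tuples. The Take(1) is to pick the first item of the sequence, which it appears you always want. Note that each item is compared with its immediate predecessor rather than with the last item kept, so the result can differ from the loop in the question.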
If you don't care for the readability of the variable names, you could shorten it to:
return dataPoints.Take(1)
.Concat(dataPoints.Skip(1)
.Zip(dataPoints, Tuple.Create)
.Where(a => a.Item1.Price <= a.Item2.Price)
.Select(a => a.Item1))
.ToList();
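The Zip variant compares each item with its immediate predecessor, whereas the loop in the question compares it with the last item that was kept. The sketch below puts both side by side on plain int prices to show where they diverge; the class and method names are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DecreasingComparison
{
    // Keeps an item when it is <= the last item that was kept (the question's loop).
    public static List<int> AgainstLastKept(IList<int> prices)
    {
        var result = new List<int>();
        if (prices.Count == 0) return result;
        int previous = prices[0];
        foreach (var price in prices)
        {
            if (price <= previous)
            {
                result.Add(price);
                previous = price;
            }
        }
        return result;
    }

    // Keeps an item when it is <= its immediate predecessor (the Zip variant).
    public static List<int> AgainstPredecessor(IList<int> prices)
    {
        return prices.Take(1)
            .Concat(prices.Skip(1)
                .Zip(prices, (next, prev) => new { Next = next, Previous = prev })
                .Where(a => a.Next <= a.Previous)
                .Select(a => a.Next))
            .ToList();
    }

    public static void Main()
    {
        var prices = new[] { 5, 7, 6, 4 };
        Console.WriteLine(string.Join(",", AgainstLastKept(prices)));    // 5,4
        Console.WriteLine(string.Join(",", AgainstPredecessor(prices))); // 5,6,4
    }
}
```

The value 6 is kept by the pairwise version because it is below 7, even though it is above the last kept value 5.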

Need help with formulating LINQ query

I'm building a word anagram program that uses a database which contains one simple table:
Words
---------------------
varchar(15) alphagram
varchar(15) anagram
(other fields omitted for brevity)
An alphagram is the letters of a word arranged in alphabetical order. For example, the alphagram for OVERFLOW would be EFLOORVW. Every Alphagram in my database has one or more Anagrams. Here's a sample data dump of my table:
Alphagram Anagram
EINORST NORITES
EINORST OESTRIN
EINORST ORIENTS
EINORST STONIER
ADEINRT ANTIRED
ADEINRT DETRAIN
ADEINRT TRAINED
I'm trying to build a LINQ query that would return a list of Alphagrams along with their associated Anagrams. Is this possible?
UPDATE: Here's my solution based on the suggestions below! Thanks all!
using (LexiconEntities ctx = new LexiconEntities())
{
var words = ctx.words;
var query =
from word in words
where word.alphagram == "AEINRST"
group word by word.alphagram into alphagramGroup
select new { Alphagram = alphagramGroup.Key, Anagrams = alphagramGroup };
foreach (var alphagramGroup in query)
{
Console.WriteLine("Alphagram: {0}", alphagramGroup.Alphagram);
foreach (var anagram in alphagramGroup.Anagrams)
{
Console.WriteLine("Anagram: {0}", anagram.word1);
}
}
}
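var list = anagrams.Select(
    a => new {
        // Sort the letters and build a new string; calling ToString() on the
        // ordered sequence would only return its type name.
        Alphagram = new string(a.OrderBy(c => c).ToArray()),
        Anagram = a
    }).ToList();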
A totally new answer...
You seem to need a group-by query; look at How to: Group Data (Entity Framework).
This should accomplish what you want...
I did a test with LINQ and this works...
var words = new List<Word>()
{
new ConsoleApplication1.Word("EINORST", "NORITES"),
new ConsoleApplication1.Word("EINORST", "OESTRIN"),
new ConsoleApplication1.Word("EINORST", "STONIER"),
new ConsoleApplication1.Word("ADEINRT", "ANTIRED"),
new ConsoleApplication1.Word("ADEINRT", "DETRAIN"),
new ConsoleApplication1.Word("ADEINRT", "TRAINED")
};
var q = words.GroupBy(w => w.Alphagram).Select(w => new { Alphagram = w.Key, Anagrams = w.Select(p => p.Anagram).ToList() }).ToList();
foreach (var item in q)
{
Console.WriteLine("Alphagram : {0}, Anagrams = {1}", item.Alphagram, String.Join(",", item.Anagrams));
}
var words = new List<Words>()
{
new Words("EINORST", "NORITES"),
new Words("EINORST", "OESTRIN"),
new Words("EINORST", "STONIER"),
new Words("ADEINRT", "ANTIRED"),
new Words("ADEINRT", "DETRAIN"),
new Words("ADEINRT", "TRAINED")
};
var result = words.GroupBy(w => w.Alphagram, w => w.Anagram)
.Select(w => new {
Alphagram = w.Key,
Anagrams = w.Where(p => w.Key.ToCharArray().SequenceEqualUnOrdered(p.ToCharArray())).ToList()
}
)
.ToList();
public static bool SequenceEqualUnOrdered<T>(this IEnumerable<T> first, IEnumerable<T> second)
{
return new HashSet<T>(first).SetEquals(second);
}
Is this what you are looking for? It is LINQ to Objects. You may want to use LINQ-to-SQL or LINQ-to-Entities to fetch your records into your objects and then run the LINQ-to-Objects query above over the already-fetched collection.

GroupBy String and Count in LINQ

I have a collection of strings:
Location="Theater=1, Name=regal, Area=Area1"
Location="Theater=34, Name=Karm, Area=Area4445"
and so on. I have to extract just the Name bit from the string. For example, here I have to extract the text 'regal' and group the query. Then display the result as
Name=regal Count 33
Name=Karm Count 22
I am struggling with the query:
Collection.Location.GroupBy(????);(what to add here)
Which is the most short and precise way to do it?
Yet another Linq + Regex approach:
string[] Location = {
"Theater=2, Name=regal, Area=Area1",
"Theater=2, Name=regal, Area=Area1",
"Theater=34, Name=Karm, Area=Area4445"
};
var test = Location.Select(
x => Regex.Match(x, "^.*Name=(.*),.*$")
.Groups[1].Value)
.GroupBy(x => x)
.Select(x=> new {Name = x.Key, Count = x.Count()});
Once you've extracted the string, just group by it and count the results:
var query = from location in locations
let name = ExtractNameFromLocation(location)
group 1 by name into grouped
select new { Name=grouped.Key, Count=grouped.Count() };
That's not particularly efficient, however: it has to do all the grouping before it does any counting. Have a look at this VJ article for an extension method for LINQ to Objects,
and this one about Push LINQ, which is a somewhat different way of looking at LINQ.
EDIT: ExtractNameFromLocation would be the code taken from answers to your other question, e.g.
public static string ExtractNameFromLocation(string location)
{
var name = (from part in location.Split(',')
let pair = part.Split('=')
where pair[0].Trim() == "Name"
select pair[1].Trim()).FirstOrDefault();
return name;
}
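One way to avoid building the groups before counting is to tally names directly into a dictionary in a single pass. This is only a sketch, not the extension method the answer links to; the names NameCounter, ExtractName and CountNames are hypothetical, and ExtractName restates the parsing idea of ExtractNameFromLocation above so the example is self-contained:

```csharp
using System;
using System.Collections.Generic;

public static class NameCounter
{
    // Same parsing idea as ExtractNameFromLocation: find the "Name=..." part.
    public static string ExtractName(string location)
    {
        foreach (var part in location.Split(','))
        {
            var pair = part.Split('=');
            if (pair.Length == 2 && pair[0].Trim() == "Name")
            {
                return pair[1].Trim();
            }
        }
        return null;
    }

    // Counts names in a single pass without materializing the groups.
    public static Dictionary<string, int> CountNames(IEnumerable<string> locations)
    {
        var counts = new Dictionary<string, int>();
        foreach (var location in locations)
        {
            var name = ExtractName(location);
            if (name == null) continue; // skip entries without a Name part
            int current;
            counts.TryGetValue(name, out current); // current stays 0 when the key is new
            counts[name] = current + 1;
        }
        return counts;
    }

    public static void Main()
    {
        var locations = new[]
        {
            "Theater=2, Name=regal, Area=Area1",
            "Theater=2, Name=regal, Area=Area1",
            "Theater=34, Name=Karm, Area=Area4445"
        };
        foreach (var pair in CountNames(locations))
        {
            Console.WriteLine("Name={0} Count {1}", pair.Key, pair.Value);
        }
    }
}
```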
Here is another LINQ alternative solution with a working example.
static void Main(string[] args)
{
System.Collections.Generic.List<string> l = new List<string>();
l.Add("Theater=1, Name=regal, Area=Area"); l.Add("Theater=34, Name=Karm, Area=Area4445");
foreach (IGrouping<string, string> g in l.GroupBy(r => extractName(r)))
{
Console.WriteLine( string.Format("Name= {0} Count {1}", g.Key, g.Count()) );
}
}
private static string extractName(string dirty)
{
System.Text.RegularExpressions.Match m =
System.Text.RegularExpressions.Regex.Match(
dirty, @"(?<=Name=)[a-zA-Z0-9_ ]+(?=,)");
return m.Success ? m.Value : "";
}

Resources