I am stuck on this algorithmic question:
Design an algorithm that parses an expression like this:
"((a,b,cy)n,m)" should give :
an - bn - cyn - m
The expression can nest, so for example:
"((a,b)o(m,n)p,b)" parses to:
aomp - aonp - bomp - bonp - b.
I thought of using stacks, but it is too complicated.
thanks.
You can parse it with a Recursive Descent Parser.
Let's say the comma separated strings are components, so for an expression ((a, b, cy)n, m), (a, b, cy)n and m are two components. a, b and cy are also components. So this is a recursive definition.
For a component (a, b, cy)n, let's say (a, b, cy) and n are two component parts of the component. Component parts will later be combined to produce final result (i.e., an - bn - cyn).
Let's say an expression is comma separated components, for example, (a, cy)n, m is an expression. It has two components (a, cy)n and m, and the component (a, cy)n has two component parts (a, cy) and n, and component part (a, cy) is a brace expression containing a nested expression: a, cy, which also has two components a and cy.
With these definitions (you might use other terms), we can write down the grammar for your expression:
expression = component, component, ...
component = component_part component_part ...
component_part = letters | (expression)
Each line is one grammar rule. The first says that an expression is a list of comma-separated components. The second says that a component consists of one or more component parts. The third says that a component part is either a contiguous sequence of letters or a nested expression inside a pair of parentheses.
Then you can use a Recursive Descent Parser to solve your problem with the above grammar.
We will define one method/function for each grammar rule. So basically we will have three methods ParseExpression, ParseComponent, ParseComponentPart.
Algorithm
As stated above, an expression is a list of comma-separated components, so ParseExpression simply calls ParseComponent and then checks whether the next char is a comma, like this (I'm using C#, but it should be easy to convert to other languages):
private List<string> ParseExpression()
{
var result = new List<string>();
while (!Eof())
{
// Parsing a component will produce a list of strings,
// they are added to the final string list
var items = ParseComponent();
result.AddRange(items);
// If next char is ',' simply skip it and parse next component
if (Peek() == ',')
{
// Skip comma
ReadNextChar();
}
else
{
break;
}
}
return result;
}
You can see that, when we are parsing an expression, we recursively call ParseComponent (it will then recursively call ParseComponentPart). It's a top-down approach, that's why it's called Recursive Descent Parsing.
ParseComponent is similar, like this:
private List<string> ParseComponent()
{
List<string> leftItems = null;
while (!Eof())
{
// Parse a component part will produce a list of strings (rightItems)
// We need to combine already parsed string list (leftItems) in this component
// with the newly parsed 'rightItems'
var rightItems = ParseComponentPart();
if (rightItems == null)
{
// No more parts, return current result (leftItems) to the caller
break;
}
if (leftItems == null)
{
leftItems = rightItems;
}
else
{
leftItems = Combine(leftItems, rightItems);
}
}
return leftItems;
}
The Combine method concatenates every item of the left list with every item of the right list (a Cartesian product):
// Combine two lists of strings and return the combined string list
private List<string> Combine(List<string> leftItems, List<string> rightItems)
{
var result = new List<string>();
foreach (var leftItem in leftItems)
{
foreach (var rightItem in rightItems)
{
result.Add(leftItem + rightItem);
}
}
return result;
}
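To see why repeated combining yields the expected result, here is a small Java sketch (not part of the C# answer) that combines the four parts of the component (a,b)o(m,n)p by hand. The class name CombineDemo and its combine helper are illustrative only; combine mirrors the Combine method above.

```java
import java.util.ArrayList;
import java.util.List;

class CombineDemo {
    // Concatenate every left item with every right item, keeping left-major order.
    static List<String> combine(List<String> left, List<String> right) {
        List<String> result = new ArrayList<>();
        for (String l : left) {
            for (String r : right) {
                result.add(l + r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Parts of the component (a,b)o(m,n)p, combined left to right.
        List<String> step1 = combine(List.of("a", "b"), List.of("o"));   // [ao, bo]
        List<String> step2 = combine(step1, List.of("m", "n"));          // [aom, aon, bom, bon]
        List<String> step3 = combine(step2, List.of("p"));               // [aomp, aonp, bomp, bonp]
        System.out.println(step3);
    }
}
```

The second component, b, is parsed separately and appended by ParseExpression, which gives the full list aomp, aonp, bomp, bonp, b.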
Finally, ParseComponentPart:
private List<string> ParseComponentPart()
{
var nextChar = Peek();
if (nextChar == '(')
{
// Skip '('
ReadNextChar();
// Recursively parse the inner expression
var items = ParseExpression();
// Skip ')'
ReadNextChar();
return items;
}
else if (char.IsLetter(nextChar))
{
var letters = ReadLetters();
return new List<string> { letters };
}
else
{
// Fail to parse a part, it means a component is ended
return null;
}
}
Full Source Code (C#)
The other parts are mostly helper methods, full C# source code is listed below:
using System;
using System.Collections.Generic;
using System.Text;
namespace Examples
{
public class BashBraceParser
{
private string _expression;
private int _nextCharIndex;
/// <summary>
/// Parse the specified BASH brace expression and return the result string list.
/// </summary>
public IList<string> Parse(string expression)
{
_expression = expression;
_nextCharIndex = 0;
return ParseExpression();
}
private List<string> ParseExpression()
{
// ** This part is already posted above **
}
private List<string> ParseComponent()
{
// ** This part is already posted above **
}
private List<string> ParseComponentPart()
{
// ** This part is already posted above **
}
// Combine two lists of strings and return the combined string list
private List<string> Combine(List<string> leftItems, List<string> rightItems)
{
// ** This part is already posted above **
}
// Peek next char without moving the cursor
private char Peek()
{
if (Eof())
{
return '\0';
}
return _expression[_nextCharIndex];
}
// Read next char and move the cursor to next char
private char ReadNextChar()
{
return _expression[_nextCharIndex++];
}
private void UnreadChar()
{
_nextCharIndex--;
}
// Check if the whole expression string is scanned.
private bool Eof()
{
return _nextCharIndex == _expression.Length;
}
// Read a continuous sequence of letters.
private string ReadLetters()
{
if (!char.IsLetter(Peek()))
{
return null;
}
var str = new StringBuilder();
while (!Eof())
{
var ch = ReadNextChar();
if (char.IsLetter(ch))
{
str.Append(ch);
}
else
{
UnreadChar();
break;
}
}
return str.ToString();
}
}
}
Use The Code
var parser = new BashBraceParser();
var result = parser.Parse("((a,b)o(m,n)p,b)");
var output = String.Join(" - ", result);
// Result: aomp - aonp - bomp - bonp - b
Console.WriteLine(output);
public class BASHBraceExpansion {
public static ArrayList<StringBuilder> parse_bash(String expression, WrapperInt p) {
ArrayList<StringBuilder> elements = new ArrayList<StringBuilder>();
ArrayList<StringBuilder> result = new ArrayList<StringBuilder>();
elements.add(new StringBuilder(""));
while(p.index < expression.length())
{
if (expression.charAt(p.index) == '(')
{
p.advance();
ArrayList<StringBuilder> temp = parse_bash(expression, p);
ArrayList<StringBuilder> newElements = new ArrayList<StringBuilder>();
for(StringBuilder e : elements)
{
for(StringBuilder t : temp)
{
StringBuilder s = new StringBuilder(e);
newElements.add(s.append(t));
}
}
elements = newElements;
}
else if (expression.charAt(p.index) == ',')
{
result.addAll(elements);
elements.clear();
elements.add(new StringBuilder(""));
p.advance();
}
else if (expression.charAt(p.index) == ')')
{
p.advance();
result.addAll(elements);
return result;
}
else
{
for(StringBuilder sb : elements)
{
sb.append(expression.charAt(p.index));
}
p.advance();
}
}
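// End of input: keep the alternatives collected so far as well as the
// current one, so that top-level input such as "a,b" is not truncated.
result.addAll(elements);
return result;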
}
public static void print(ArrayList<StringBuilder> list)
{
for(StringBuilder s : list)
{
System.out.print(s + " * ");
}
System.out.println();
}
public static void main(String[] args) {
WrapperInt p = new WrapperInt();
ArrayList<StringBuilder> list = parse_bash("((a,b)o(m,n)p,b)", p);
//ArrayList<StringBuilder> list = parse_bash("(a,b)", p);
WrapperInt q = new WrapperInt();
ArrayList<StringBuilder> list1 = parse_bash("((a,b,cy)n,m)", q);
ArrayList<StringBuilder> list2 = parse_bash("((a,b)dr(f,g)(k,m),L(p,q))", new WrapperInt());
System.out.println("*****RESULT : ******");
print(list);
print(list1);
print(list2);
}
}
public class WrapperInt {
public WrapperInt() {
index = 0;
}
public int advance()
{
index ++;
return index;
}
public int index;
}
// aomp - aonp - bomp - bonp - b.
I have a list of objects that are retrieved from a DB. The object looks like this:
class MonthlyFinancePlan {
final int id;
final DateTime date;
final double incomeAfterTax;
final double totalToPayOut;
final double totalRemainingForMonth;
final Map<String, dynamic> items;
MonthlyFinancePlan({ this.id, this.date, this.incomeAfterTax, this.totalToPayOut, this.totalRemainingForMonth, this.items });
MonthlyFinancePlan.fromEntity(MonthlyFinancePlanEntity monthlyFinancePlanEntity):
this.id = monthlyFinancePlanEntity.id,
this.date = DateTime.parse(monthlyFinancePlanEntity.date),
this.incomeAfterTax = monthlyFinancePlanEntity.incomeAfterTax.toDouble(),
this.totalToPayOut = monthlyFinancePlanEntity.totalToPayOut.toDouble(),
this.totalRemainingForMonth = monthlyFinancePlanEntity.moneyRemainingForMonth.toDouble(),
this.items = monthlyFinancePlanEntity.items != null ? json.decode(monthlyFinancePlanEntity.items) : Map();
}
I need to sort these by date.year and then pass them into a first-class list. I'd like to create a List of these first-class lists, so that all the MonthlyFinancePlan objects from the year 2020 are sorted and contained within one first-class list, the same for 2021, and so on.
The first class list looks like this:
class YearlyFinancePlan {
List<MonthlyFinancePlan> _monthlyFinancePlanList;
int _year;
double _totalIncomeForYear;
double _totalOutGoingsForYear;
List<MonthlyFinancePlan> get items {
return this._monthlyFinancePlanList;
}
int get year {
return this._year;
}
double get totalIncomeForYear {
return this._totalIncomeForYear;
}
double get totalOutgoingsForYear {
return this._totalOutGoingsForYear;
}
YearlyFinancePlan(this._monthlyFinancePlanList) {
this._year = this._monthlyFinancePlanList.first.date.year;
this._totalIncomeForYear = this._setTotalIncomeFromList(this._monthlyFinancePlanList);
this._totalOutGoingsForYear = this._setTotalOutGoingsForYear(this._monthlyFinancePlanList);
}
double _setTotalIncomeFromList(List<MonthlyFinancePlan> monthlyFinancePlanList) {
double totalIncome = 0;
monthlyFinancePlanList.forEach((plan) => totalIncome += plan.incomeAfterTax);
return totalIncome;
}
double _setTotalOutGoingsForYear(List<MonthlyFinancePlan> monthlyFinancePlanList) {
double totalOutgoings = 0;
monthlyFinancePlanList.forEach((plan) => totalOutgoings += plan.totalToPayOut);
return totalOutgoings;
}
}
My question is, what sort algorithm would be best suited for what I need? I don't have any code to show as I don't know what sort algorithm to use. I'm not looking for anyone to write my code, but more to guide me through it.
Any help would be greatly appreciated
I've created a mapper that checks whether MonthlyFinancePlan.date.year exists as a key in a standard Dart Map and adds it if it doesn't. It then calls the addMonthlyPlan method to add the monthly plan to the correct YearlyFinancePlan, like so:
class FinancePlanMapper {
static Map<int, YearlyFinancePlan> toMap(List<MonthlyFinancePlan> planList) {
Map<int, YearlyFinancePlan> planMap = Map();
planList.forEach((monthlyPlan) {
planMap.putIfAbsent(monthlyPlan.date.year, () => YearlyFinancePlan(List()));
planMap[monthlyPlan.date.year].addMonthlyPlan(monthlyPlan);
});
return planMap;
}
}
I'm not too sure whether it's the most efficient way of sorting, but I plan to refactor it as much as possible. I've also updated the YearlyFinancePlan object so that it no longer initialises any fields on construction. The old constructor read the first element of the list, which throws an error when the object is created with an empty list:
class YearlyFinancePlan {
List<MonthlyFinancePlan> _monthlyFinancePlanList;
List<MonthlyFinancePlan> get items {
return this._monthlyFinancePlanList;
}
int get year {
return this.items.first.date.year;
}
double get totalIncomeForYear {
return this._setTotalIncomeFromList(this._monthlyFinancePlanList);
}
double get totalOutgoingsForYear {
return this._setTotalOutGoingsForYear(this._monthlyFinancePlanList);
}
YearlyFinancePlan(this._monthlyFinancePlanList);
void addMonthlyPlan(MonthlyFinancePlan plan) {
this._monthlyFinancePlanList.add(plan);
}
double _setTotalIncomeFromList(List<MonthlyFinancePlan> monthlyFinancePlanList) {
double totalIncome = 0;
monthlyFinancePlanList.forEach((plan) => totalIncome += plan.incomeAfterTax);
return totalIncome;
}
double _setTotalOutGoingsForYear(List<MonthlyFinancePlan> monthlyFinancePlanList) {
double totalOutgoings = 0;
monthlyFinancePlanList.forEach((plan) => totalOutgoings += plan.totalToPayOut);
return totalOutgoings;
}
}
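For comparison, here is a minimal sketch of the same grouping idea in Java rather than Dart. It suggests that no special sort algorithm is needed: a sorted map keeps the years in order, and each year's list can be sorted by date with the standard library sort. The class YearGrouping, the simplified Plan type and the byYear helper are hypothetical stand-ins for MonthlyFinancePlan and FinancePlanMapper.toMap.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class YearGrouping {
    // Hypothetical stand-in for MonthlyFinancePlan with only the fields used here.
    static final class Plan {
        final LocalDate date;
        final double incomeAfterTax;

        Plan(LocalDate date, double incomeAfterTax) {
            this.date = date;
            this.incomeAfterTax = incomeAfterTax;
        }
    }

    // Group plans by year. TreeMap keeps the years in ascending order.
    static Map<Integer, List<Plan>> byYear(List<Plan> plans) {
        Map<Integer, List<Plan>> result = new TreeMap<>();
        for (Plan plan : plans) {
            result.computeIfAbsent(plan.date.getYear(), year -> new ArrayList<>()).add(plan);
        }
        // Sort each year's plans chronologically.
        for (List<Plan> yearPlans : result.values()) {
            yearPlans.sort(Comparator.comparing((Plan p) -> p.date));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Plan> plans = List.of(
            new Plan(LocalDate.of(2021, 3, 1), 2100.0),
            new Plan(LocalDate.of(2020, 7, 1), 2000.0),
            new Plan(LocalDate.of(2020, 2, 1), 1900.0));
        byYear(plans).forEach((year, list) ->
            System.out.println(year + ": " + list.size() + " plan(s)"));
    }
}
```

Grouping is a single pass over the list, and sorting only happens within each year, so the cost is dominated by the per-year sorts.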
I am being passed a binary AST representing a math formula. Each internal node is an operator and leaf nodes are the operands. I need to walk the tree and output the formula in infix notation. This is pretty easy to do by walking the tree with a recursive algorithm such as the Print() method shown below. The problem with the Print() method is that the order of operations is lost when converting to infix because no parentheses are generated.
I wrote the PrintWithParens() method, which outputs a correct infix formula but adds extraneous parentheses. You can see that in three of the four cases in my main method it adds parentheses when none are necessary.
I have been racking my brain trying to figure out what the correct algorithm for PrintWithMinimalParens() should be. I'm sure there must be an algorithm that outputs parentheses only when they are necessary to group terms, but I have been unable to implement it correctly. I think I need to look at the precedence of the operators in the tree below the current node, but the algorithm I have there now doesn't work (see the last 2 cases in my main method. No parentheses are needed, but my logic adds them).
public class Test {
static abstract class Node {
Node left;
Node right;
String text;
abstract void Print();
abstract void PrintWithParens();
abstract void PrintWithMinimalParens();
int precedence()
{
return 0;
}
}
enum Operator {
PLUS(1,"+"),
MINUS(1, "-"),
MULTIPLY(2, "*"),
DIVIDE(2, "/"),
POW(3, "^")
;
private final int precedence;
private final String text;
private Operator(int precedence, String text)
{
this.precedence = precedence;
this.text = text;
}
@Override
public String toString() {
return text;
}
public int getPrecedence() {
return precedence;
}
}
static class OperatorNode extends Node {
private final Operator op;
OperatorNode(Operator op)
{
this.op = op;
}
@Override
void Print() {
left.Print();
System.out.print(op);
right.Print();
}
@Override
void PrintWithParens() {
System.out.print("(");
left.PrintWithParens();
System.out.print(op);
right.PrintWithParens();
System.out.print(")");
}
@Override
void PrintWithMinimalParens() {
boolean needParens =
(left.precedence() != 0 && left.precedence() < this.op.precedence)
||
(right.precedence() != 0 && right.precedence() < this.op.precedence);
if(needParens)
System.out.print("(");
left.PrintWithMinimalParens();
System.out.print(op);
right.PrintWithMinimalParens();
if(needParens)
System.out.print(")");
}
@Override
int precedence() {
return op.getPrecedence();
}
}
static class TextNode extends Node {
TextNode(String text)
{
this.text = text;
}
@Override
void Print() {
System.out.print(text);
}
@Override
void PrintWithParens() {
System.out.print(text);
}
@Override
void PrintWithMinimalParens() {
System.out.print(text);
}
}
private static void printExpressions(Node rootNode) {
System.out.print("Print() : ");
rootNode.Print();
System.out.println();
System.out.print("PrintWithParens() : ");
rootNode.PrintWithParens();
System.out.println();
System.out.print("PrintWithMinimalParens() : ");
rootNode.PrintWithMinimalParens();
System.out.println();
System.out.println();
}
public static void main(String[] args)
{
System.out.println("Desired: 1+2+3+4");
Node rootNode = new OperatorNode(Operator.PLUS);
rootNode.left = new TextNode("1");
rootNode.right = new OperatorNode(Operator.PLUS);
rootNode.right.left = new TextNode("2");
rootNode.right.right = new OperatorNode(Operator.PLUS);
rootNode.right.right.left = new TextNode("3");
rootNode.right.right.right = new TextNode("4");
printExpressions(rootNode);
System.out.println("Desired: 1+2*3+4");
rootNode = new OperatorNode(Operator.PLUS);
rootNode.left = new TextNode("1");
rootNode.right = new OperatorNode(Operator.PLUS);
rootNode.right.left = new OperatorNode(Operator.MULTIPLY);
rootNode.right.left.left = new TextNode("2");
rootNode.right.left.right = new TextNode("3");
rootNode.right.right = new TextNode("4");
printExpressions(rootNode);
System.out.println("Desired: 1+2*(3+4)");
rootNode = new OperatorNode(Operator.PLUS);
rootNode.left = new TextNode("1");
rootNode.right = new OperatorNode(Operator.MULTIPLY);
rootNode.right.left = new TextNode("2");
rootNode.right.right = new OperatorNode(Operator.PLUS);
rootNode.right.right.left = new TextNode("3");
rootNode.right.right.right = new TextNode("4");
printExpressions(rootNode);
System.out.println("Desired: 1+2^8*3+4");
rootNode = new OperatorNode(Operator.PLUS);
rootNode.left = new TextNode("1");
rootNode.right = new OperatorNode(Operator.MULTIPLY);
rootNode.right.left = new OperatorNode(Operator.POW);
rootNode.right.left.left = new TextNode("2");
rootNode.right.left.right = new TextNode("8");
rootNode.right.right = new OperatorNode(Operator.PLUS);
rootNode.right.right.left = new TextNode("3");
rootNode.right.right.right = new TextNode("4");
printExpressions(rootNode);
}
}
Output:
Desired: 1+2+3+4
Print() : 1+2+3+4
PrintWithParens() : (1+(2+(3+4)))
PrintWithMinimalParens() : 1+2+3+4
Desired: 1+2*3+4
Print() : 1+2*3+4
PrintWithParens() : (1+((2*3)+4))
PrintWithMinimalParens() : 1+2*3+4
Desired: 1+2*(3+4)
Print() : 1+2*3+4
PrintWithParens() : (1+(2*(3+4)))
PrintWithMinimalParens() : 1+(2*3+4)
Desired: 1+2^8*3+4
Print() : 1+2^8*3+4
PrintWithParens() : (1+((2^8)*(3+4)))
PrintWithMinimalParens() : 1+(2^8*3+4)
Is it possible to implement the PrintWithMinimalParens() that I want? Does the fact that order is implicit in the tree make doing what I want impossible?
In your code you are comparing each operator with its children to see if you need parentheses around it. But you should actually be comparing it with its parent. Here are some rules that can determine if parentheses can be omitted:
1. You never need parentheses around the operator at the root of the AST.
2. If operator A is the child of operator B, and A has a higher precedence than B, the parentheses around A can be omitted.
3. If a left-associative operator A is the left child of a left-associative operator B with the same precedence, the parentheses around A can be omitted. A left-associative operator is one for which x A y A z is parsed as (x A y) A z.
4. If a right-associative operator A is the right child of a right-associative operator B with the same precedence, the parentheses around A can be omitted. A right-associative operator is one for which x A y A z is parsed as x A (y A z).
5. If you can assume that an operator A is associative, i.e. that (x A y) A z = x A (y A z) for all x,y,z, and A is the child of the same operator A, you can choose to omit parentheses around the child A. In this case, reparsing the expression will yield a different AST that gives the same result when evaluated.
Note that for your first example, the desired result is only correct if you can assume that + is associative (which is true when dealing with normal numbers) and implement rule #5. This is because your input tree is built in a right-associative fashion, while operator + is normally left-associative.
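The rules above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not a drop-in replacement for the question's classes: the MinimalParens class, its simplified Node type and the leaf/bin helpers are hypothetical, "^" is assumed to be right-associative, all other operators left-associative, and "+" and "*" are assumed to be fully associative for rule 5.

```java
class MinimalParens {
    // Hypothetical simplified node: op == null marks a leaf.
    static final class Node {
        final String op;    // "+", "-", "*", "/", "^", or null for a leaf
        final String text;  // operand text for a leaf
        final Node left, right;

        Node(String op, String text, Node left, Node right) {
            this.op = op; this.text = text; this.left = left; this.right = right;
        }
    }

    static Node leaf(String text) { return new Node(null, text, null, null); }
    static Node bin(String op, Node left, Node right) { return new Node(op, null, left, right); }

    static int precedence(String op) {
        switch (op) {
            case "+": case "-": return 1;
            case "*": case "/": return 2;
            default: return 3; // "^"
        }
    }

    // Assumption: only "^" is right-associative.
    static boolean rightAssociative(String op) { return op.equals("^"); }

    // Assumption for rule 5: "+" and "*" are fully associative.
    static boolean associative(String op) { return op.equals("+") || op.equals("*"); }

    static String render(Node root) { return render(root, null, false); }

    private static String render(Node node, String parentOp, boolean isRightChild) {
        if (node.op == null) return node.text;
        String body = render(node.left, node.op, false) + node.op + render(node.right, node.op, true);
        return needsParens(node.op, parentOp, isRightChild) ? "(" + body + ")" : body;
    }

    // Decide by comparing an operator with its PARENT, not with its children.
    static boolean needsParens(String op, String parentOp, boolean isRightChild) {
        if (parentOp == null) return false;                        // rule 1: root
        int own = precedence(op), parent = precedence(parentOp);
        if (own != parent) return own < parent;                    // rule 2: higher precedence needs none
        if (op.equals(parentOp) && associative(op)) return false;  // rule 5
        // rules 3 and 4: safe only on the side the operator associates to
        return rightAssociative(op) != isRightChild;
    }

    public static void main(String[] args) {
        Node tree = bin("+", leaf("1"),
            bin("*", bin("^", leaf("2"), leaf("8")), bin("+", leaf("3"), leaf("4"))));
        System.out.println(render(tree)); // prints 1+2^8*(3+4)
    }
}
```

With these assumptions, the right-nested tree from the first test case renders as 1+2+3+4 (via rule 5), while 1-(2-3) keeps its parentheses because "-" is not associative.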
You're enclosing the entire expression in parentheses whenever either the left or the right child has a lower-precedence operator, even if the other child has a higher- or equal-precedence operator.
I think you need to separate your boolean needParens into distinct cases for the left and right children. Something like this (untested):
void PrintWithMinimalParens() {
boolean needLeftChildParens =
(left.precedence() != 0 && left.precedence() < this.op.precedence);
boolean needRightChildParens =
(right.precedence() != 0 && right.precedence() < this.op.precedence);
if(needLeftChildParens)
System.out.print("(");
left.PrintWithMinimalParens();
if(needLeftChildParens)
System.out.print(")");
System.out.print(op);
if(needRightChildParens)
System.out.print("(");
right.PrintWithMinimalParens();
if(needRightChildParens)
System.out.print(")");
}
Also, I don't think your last example is correct. Looking at your tree I think it should be:
1+2^8*(3+4)
I want to summarize rather than compress in a similar manner to run length encoding but in a nested sense.
For instance, I want ABCBCABCBCDEEF to become (2A(2BC))D(2E)F.
I am not concerned about which option is picked between two equally good nestings. E.g. ABBABBABBABA could be (3ABB)ABA or A(3BBA)BA, which have the same compressed length despite having different structures.
However I do want the choice to be MOST greedy. For instance:
ABCDABCDCDCDCD should become (2ABCD)(3CD), which has length 6 in original symbols, rather than ABCDAB(4CD), which has length 8 in original symbols.
In terms of background, I have some repeating patterns that I want to summarize so that the data is more digestible. I don't want to disrupt the logical order of the data, as it is important, but I do want to summarize it by saying: symbol A for 3 occurrences, followed by symbols XYZ for 20 occurrences, etc. This can then be displayed visually in a nested sense.
Welcome ideas.
I'm pretty sure this isn't the best approach, and depending on the length of the patterns, might have a running time and memory usage that won't work, but here's some code.
You can paste the following code into LINQPad and run it, and it should produce the following output:
ABCBCABCBCDEEF = (2A(2BC))D(2E)F
ABBABBABBABA = (3A(2B))ABA
ABCDABCDCDCDCD = (2ABCD)(3CD)
As you can see, the middle example encoded ABB as A(2B) instead of ABB. You would have to decide for yourself whether single-symbol runs like that should be encoded as a repeated symbol at all, or whether a specific threshold (like 3 or more) should be used.
Basically, the code runs like this:
1. For each position in the sequence, try to find the longest match (actually, it doesn't, it takes the first 2+ match it finds, I left the rest as an exercise for you since I have to leave my computer for a few hours now)
2. It then tries to encode that sequence, the one that repeats, recursively, and spits out a X*seq type of object
3. If it can't find a repeating sequence, it spits out the single symbol at that location
4. It then skips what it encoded, and continues from #1
Anyway, here's the code:
void Main()
{
string[] examples = new[]
{
"ABCBCABCBCDEEF",
"ABBABBABBABA",
"ABCDABCDCDCDCD",
};
foreach (string example in examples)
{
StringBuilder sb = new StringBuilder();
foreach (var r in Encode(example))
sb.Append(r.ToString());
Debug.WriteLine(example + " = " + sb.ToString());
}
}
public static IEnumerable<Repeat<T>> Encode<T>(IEnumerable<T> values)
{
return Encode<T>(values, EqualityComparer<T>.Default);
}
public static IEnumerable<Repeat<T>> Encode<T>(IEnumerable<T> values, IEqualityComparer<T> comparer)
{
List<T> sequence = new List<T>(values);
int index = 0;
while (index < sequence.Count)
{
var bestSequence = FindBestSequence<T>(sequence, index, comparer);
if (bestSequence == null || bestSequence.Length < 1)
throw new InvalidOperationException("Unable to find sequence at position " + index);
yield return bestSequence;
index += bestSequence.Length;
}
}
private static Repeat<T> FindBestSequence<T>(IList<T> sequence, int startIndex, IEqualityComparer<T> comparer)
{
int sequenceLength = 1;
while (startIndex + sequenceLength * 2 <= sequence.Count)
{
if (comparer.Equals(sequence[startIndex], sequence[startIndex + sequenceLength]))
{
bool atLeast2Repeats = true;
for (int index = 0; index < sequenceLength; index++)
{
if (!comparer.Equals(sequence[startIndex + index], sequence[startIndex + sequenceLength + index]))
{
atLeast2Repeats = false;
break;
}
}
if (atLeast2Repeats)
{
int count = 2;
while (startIndex + sequenceLength * (count + 1) <= sequence.Count)
{
bool anotherRepeat = true;
for (int index = 0; index < sequenceLength; index++)
{
if (!comparer.Equals(sequence[startIndex + index], sequence[startIndex + sequenceLength * count + index]))
{
anotherRepeat = false;
break;
}
}
if (anotherRepeat)
count++;
else
break;
}
List<T> oneSequence = Enumerable.Range(0, sequenceLength).Select(i => sequence[startIndex + i]).ToList();
var repeatedSequence = Encode<T>(oneSequence, comparer).ToArray();
return new SequenceRepeat<T>(count, repeatedSequence);
}
}
sequenceLength++;
}
// fall back, we could not find anything that repeated at all
return new SingleSymbol<T>(sequence[startIndex]);
}
public abstract class Repeat<T>
{
public int Count { get; private set; }
protected Repeat(int count)
{
Count = count;
}
public abstract int Length
{
get;
}
}
public class SingleSymbol<T> : Repeat<T>
{
public T Value { get; private set; }
public SingleSymbol(T value)
: base(1)
{
Value = value;
}
public override string ToString()
{
return string.Format("{0}", Value);
}
public override int Length
{
get
{
return Count;
}
}
}
public class SequenceRepeat<T> : Repeat<T>
{
public Repeat<T>[] Values { get; private set; }
public SequenceRepeat(int count, Repeat<T>[] values)
: base(count)
{
Values = values;
}
public override string ToString()
{
return string.Format("({0}{1})", Count, string.Join("", Values.Select(v => v.ToString())));
}
public override int Length
{
get
{
int oneLength = 0;
foreach (var value in Values)
oneLength += value.Length;
return Count * oneLength;
}
}
}
public class GroupRepeat<T> : Repeat<T>
{
public Repeat<T> Group { get; private set; }
public GroupRepeat(int count, Repeat<T> group)
: base(count)
{
Group = group;
}
public override string ToString()
{
return string.Format("({0}{1})", Count, Group);
}
public override int Length
{
get
{
return Count * Group.Length;
}
}
}
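The "longest match" step that the answer leaves as an exercise could look roughly like the following Java sketch for plain strings. It is one possible reading of "most greedy", not the answer's own method: at each position it picks the repeating block that covers the most symbols, preferring the shorter block on ties. The class NestedRle and its encode method are hypothetical names, and the choice is greedy per position, so it is not guaranteed to produce the globally shortest encoding.

```java
class NestedRle {
    // Encode s using nested run-length groups of the form (countBLOCK).
    static String encode(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            int bestLen = 0;
            int bestCount = 0;
            // Try every block length that could repeat at least twice from position i.
            for (int len = 1; i + 2 * len <= s.length(); len++) {
                int count = 1;
                while (i + (count + 1) * len <= s.length()
                        && s.regionMatches(i, s, i + count * len, len)) {
                    count++;
                }
                // Keep the candidate covering the most symbols; ties keep the shorter block.
                if (count >= 2 && count * len > bestCount * bestLen) {
                    bestLen = len;
                    bestCount = count;
                }
            }
            if (bestCount >= 2) {
                // Encode the repeated block recursively so nested repeats are found too.
                out.append('(').append(bestCount)
                   .append(encode(s.substring(i, i + bestLen))).append(')');
                i += bestLen * bestCount;
            } else {
                out.append(s.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        for (String example : new String[] {"ABCBCABCBCDEEF", "ABBABBABBABA", "ABCDABCDCDCDCD"}) {
            System.out.println(example + " = " + encode(example));
        }
    }
}
```

On the three examples from the question this sketch produces the same encodings as the output listed in the answer above.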
Looking at the problem theoretically, it seems similar to the problem of finding the smallest context free grammar which generates (only) the string, except in this case the non-terminals can only be used in direct sequence after each other, so e.g.
ABCBCABCBCDEEF
s->ttDuuF
t->Avv
v->BC
u->E
ABABCDABABCD
s->ABtt
t->ABCD
Of course, this depends on how you define "smallest", but if you count terminals on the right side of rules, it should be the same as the "length in original symbols" after doing the nested run-length encoding.
The problem of the smallest grammar is known to be hard, and is a well-studied problem. I don't know how much the "direct sequence" part adds to or subtracts from the complexity.
I'd like to split a sequence in C# to a sequence of sequences using LINQ. I've done some investigation, and the closest SO article I've found that is slightly related is this.
However, this question only asks how to partition the original sequence based upon a constant value. I would like to partition my sequence based on an operation.
Specifically, I have a list of objects which contain a decimal property.
public class ExampleClass
{
public decimal TheValue { get; set; }
}
Let's say I have a sequence of ExampleClass, and the corresponding sequence of values of TheValue is:
{0,1,2,3,1,1,4,6,7,0,1,0,2,3,5,7,6,5,4,3,2,1}
I'd like to partition the original sequence into an IEnumerable<IEnumerable<ExampleClass>> with values of TheValue resembling:
{{0,1,2,3}, {1,1,4,6,7}, {0,1}, {0,2,3,5,7}, {6,5,4,3,2,1}}
I'm just lost on how this would be implemented. SO, can you help?
I have a seriously ugly solution right now, but have a "feeling" that LINQ will increase the elegance of my code.
Okay, I think we can do this...
public static IEnumerable<IEnumerable<TElement>>
PartitionMonotonically<TElement, TKey>
(this IEnumerable<TElement> source,
Func<TElement, TKey> selector)
{
// TODO: Argument validation and custom comparisons
Comparer<TKey> keyComparer = Comparer<TKey>.Default;
using (var iterator = source.GetEnumerator())
{
if (!iterator.MoveNext())
{
yield break;
}
TKey currentKey = selector(iterator.Current);
List<TElement> currentList = new List<TElement> { iterator.Current };
int sign = 0;
while (iterator.MoveNext())
{
TElement element = iterator.Current;
TKey key = selector(element);
int nextSign = Math.Sign(keyComparer.Compare(currentKey, key));
// Haven't decided a direction yet
if (sign == 0)
{
sign = nextSign;
currentList.Add(element);
}
// Same direction or no change
else if (sign == nextSign || nextSign == 0)
{
currentList.Add(element);
}
else // Change in direction: yield current list and start a new one
{
yield return currentList;
currentList = new List<TElement> { element };
sign = 0;
}
currentKey = key;
}
yield return currentList;
}
}
Completely untested, but I think it might work...
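As a rough check of the approach, here is a minimal Java port of the same logic, specialised to integers. The class MonotonicPartition and its partition method are hypothetical names for this sketch, and the generic selector and comparer of the C# version are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

class MonotonicPartition {
    // Split values into consecutive runs that never change direction.
    static List<List<Integer>> partition(List<Integer> values) {
        List<List<Integer>> runs = new ArrayList<>();
        if (values.isEmpty()) {
            return runs;
        }
        List<Integer> current = new ArrayList<>();
        current.add(values.get(0));
        int previous = values.get(0);
        int sign = 0; // 0 means the direction of the current run is not decided yet
        for (int k = 1; k < values.size(); k++) {
            int value = values.get(k);
            int nextSign = Integer.signum(Integer.compare(previous, value));
            if (sign == 0 || nextSign == 0 || nextSign == sign) {
                // Undecided, no change, or same direction: extend the current run.
                if (sign == 0) {
                    sign = nextSign;
                }
                current.add(value);
            } else {
                // Change in direction: close the current run and start a new one.
                runs.add(current);
                current = new ArrayList<>();
                current.add(value);
                sign = 0;
            }
            previous = value;
        }
        runs.add(current);
        return runs;
    }

    public static void main(String[] args) {
        List<Integer> input = List.of(0, 1, 2, 3, 1, 1, 4, 6, 7, 0, 1, 0, 2, 3, 5, 7, 6, 5, 4, 3, 2, 1);
        System.out.println(partition(input));
    }
}
```

For the sequence from the question this yields the five runs {0,1,2,3}, {1,1,4,6,7}, {0,1}, {0,2,3,5,7} and {6,5,4,3,2,1}, which matches the desired partition.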
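Alternatively, with LINQ operators and some abuse of .NET closures capturing variables by reference. Note that this version only groups strictly increasing runs, so a decreasing run ends up split into single-element groups.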
public static IEnumerable<IEnumerable<T>> Monotonic<T>(this IEnumerable<T> enumerable)
{
var comparator = Comparer<T>.Default;
int i = 0;
T last = default(T);
return enumerable.GroupBy((value) => { i = comparator.Compare(value, last) > 0 ? i : i+1; last = value; return i; }).Select((group) => group.Select((_) => _));
}
Taken from some random utility code for partitioning IEnumerable's into a makeshift table for logging. If I recall properly, the odd ending Select is to prevent ambiguity when the input is an enumeration of strings.
Here's a custom LINQ operator which splits a sequence according to just about any criteria. Its parameters are:
xs: the input element sequence.
func: a function which accepts the "current" input element and a state object, and returns as a tuple:
a bool stating whether the input sequence should be split before the "current" element; and
a state object which will be passed to the next invocation of func.
initialState: the state object that gets passed to func on its first invocation.
Here it is, along with a helper class (required because yield return apparently cannot be nested):
public static IEnumerable<IEnumerable<T>> Split<T, TState>(
this IEnumerable<T> xs,
Func<T, TState, Tuple<bool, TState>> func,
TState initialState)
{
using (var splitter = new Splitter<T, TState>(xs, func, initialState))
{
while (splitter.HasNext)
{
yield return splitter.GetNext();
}
}
}
internal sealed class Splitter<T, TState> : IDisposable
{
public Splitter(IEnumerable<T> xs,
Func<T, TState, Tuple<bool, TState>> func,
TState initialState)
{
this.xs = xs.GetEnumerator();
this.func = func;
this.state = initialState;
this.hasNext = this.xs.MoveNext();
}
private readonly IEnumerator<T> xs;
private readonly Func<T, TState, Tuple<bool, TState>> func;
private bool hasNext;
private TState state;
public bool HasNext { get { return hasNext; } }
public IEnumerable<T> GetNext()
{
while (hasNext)
{
Tuple<bool, TState> decision = func(xs.Current, state);
state = decision.Item2;
if (decision.Item1) yield break;
yield return xs.Current;
hasNext = xs.MoveNext();
}
}
public void Dispose() { xs.Dispose(); }
}
Note: Here are some of the design decisions that went into the Split method:
It should make only a single pass over the sequence.
State is made explicit so that it's possible to keep side effects out of func.
Let's say I have an instance of IQueryable. How can I find out which parameters it was ordered by?
Here is what the OrderBy() method looks like (as a reference):
public static IOrderedQueryable<T> OrderBy<T, TKey>(
this IQueryable<T> source, Expression<Func<T, TKey>> keySelector)
{
return (IOrderedQueryable<T>)source.Provider.CreateQuery<T>(
Expression.Call(null,
((MethodInfo)MethodBase.GetCurrentMethod()).MakeGenericMethod(
new Type[] { typeof(T), typeof(TKey) }
),
new Expression[] { source.Expression, Expression.Quote(keySelector) }
)
);
}
A hint from Matt Warren:
All queryables (even IOrderedQueryable's) have expression trees underlying them that encode the activity they represent. You should find using the IQueryable.Expression property a method-call expression node representing a call to the Queryable.OrderBy method with the actual arguments listed. You can decode from the keySelector argument the expression used for ordering. Take a look at the IOrderedQueryable object instance in the debugger to see what I mean.
This isn't pretty, but it seems to do the job:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;
using System.Linq.Expressions;
using System.Windows.Forms;
public class Test
{
public int A;
public string B { get; set; }
public DateTime C { get; set; }
public float D;
}
public class QueryOrderItem
{
public QueryOrderItem(Expression expression, bool ascending)
{
this.Expression = expression;
this.Ascending = ascending;
}
public Expression Expression { get; private set; }
public bool Ascending { get; private set; }
public override string ToString()
{
return (Ascending ? "asc: " : "desc: ") + Expression;
}
}
static class Program
{
public static List<QueryOrderItem> GetQueryOrder(Expression expression)
{
var members = new List<QueryOrderItem>(); // queue for easy FILO
GetQueryOrder(expression, members, 0);
return members;
}
static void GetQueryOrder(Expression expr, IList<QueryOrderItem> members, int insertPoint)
{
if (expr == null) return;
switch (expr.NodeType)
{
case ExpressionType.Call:
var mce = (MethodCallExpression)expr;
if (mce.Arguments.Count > 1)
{ // OrderBy etc is expressed in arg1
switch (mce.Method.Name)
{ // note OrderBy[Descending] shifts the insertPoint, but ThenBy[Descending] doesn't
case "OrderBy": // could possibly check MemberInfo
members.Insert(insertPoint, new QueryOrderItem(mce.Arguments[1], true));
insertPoint = members.Count; // swaps order to enforce stable sort
break;
case "OrderByDescending":
members.Insert(insertPoint, new QueryOrderItem(mce.Arguments[1], false));
insertPoint = members.Count;
break;
case "ThenBy":
members.Insert(insertPoint, new QueryOrderItem(mce.Arguments[1], true));
break;
case "ThenByDescending":
members.Insert(insertPoint, new QueryOrderItem(mce.Arguments[1], false));
break;
}
}
if (mce.Arguments.Count > 0)
{ // chained on arg0
GetQueryOrder(mce.Arguments[0], members, insertPoint);
}
break;
}
}
static void Main()
{
var data = new[] {
new Test { A = 1, B = "abc", C = DateTime.Now, D = 12.3F},
new Test { A = 2, B = "abc", C = DateTime.Today, D = 12.3F},
new Test { A = 1, B = "def", C = DateTime.Today, D = 10.1F}
}.AsQueryable();
var ordered = (from item in data
orderby item.D descending
orderby item.C
orderby item.A descending, item.B
select item).Take(20);
// note: under the "stable sort" rules, this should actually be sorted
// as {-A, B, C, -D}, since the last order by {-A,B} preserves (in the case of
// a match) the preceding sort {C}, which in turn preserves (for matches) {D}
var members = GetQueryOrder(ordered.Expression);
foreach (var item in members)
{
Console.WriteLine(item.ToString());
}
// used to investigate the tree
TypeDescriptor.AddAttributes(typeof(Expression), new[] {
new TypeConverterAttribute(typeof(ExpandableObjectConverter)) });
Application.Run(new Form
{
Controls = {
new PropertyGrid { Dock = DockStyle.Fill, SelectedObject = ordered.Expression }
}
});
}
}