Conditional Dependencies in Compiler Semantic Analysis Passes - compiler-theory

Imagine that we have been given an Excel spreadsheet with three columns, labeled COND, X and Y.
COND = TRUE or FALSE (user input)
X = if(COND == TRUE) then 0 else Y
Y = if(COND == TRUE) then X else 1;
These formulas evaluate perfectly fine in Excel, and Excel does not generate a Circular Dependency error.
I am writing a compiler that tries to convert these Excel formulas to C code. In my compiler, these formulas do generate a circular dependency error. The issue is that (naïvely) the expression of X depends on Y and the expression for Y depends on X and my compiler is unable to logically continue.
Excel is able to accomplish this feat because it is a lazy, interpreted language. Excel will just lazily evaluate the formulas at run-time (with user inputs), and since no circular dependency occurs at run-time Excel has no problem evaluating such logic.
Unfortunately, I need to convert these formulas to a compiled language (not an interpreted one). The actual formulas, in the actual spreadsheets, have more complicated dependencies between multiple cells/variables (involving up to over half a dozen different cells). This means that my compiler has to perform some kind of sophisticated static, semantic analysis of the formulas and be smart enough to detect that there are no circular references if we "look inside" the conditional branches. The compiler would then have to generate the following C code from the above Excel formulas:
bool COND;
int X, Y;
if(COND) { X = 0; Y = X; } else { Y = 1; X = Y; }
Notice that the order of the assignment instructions is different in each branch of the if-statement in C.
My question is, is there any established algorithm or literature on compilers that explains how to implement this type of analysis in a compiler? Do functional programming language compilers have to solve this problem?

Why aren't standard optimization techniques adequate?
Presumably, the Excel formulas form a DAG, with the leaves being primitive values and the nodes being computations/assignments. (If the Excel computation forms a cycle, then you need some kind of iterative solver, assuming you want a fixpoint.)
If you simply propagate the conditional by lifting it (a classic compiler optimization), we start with your original equations, where each computation may be evaluated in any order with respect to the others, such that the result computes DAG-like (the "anyorder" below is an operator intended to model that):
X = if(COND == TRUE) then 0 else Y;
anyorder
Y = if(COND == TRUE) then X else 1;
then lifting the conditional:
if (COND) { X=0; } else { X = 1; }
anyorder
if (COND) { Y=X; } else { Y = 1; }
then
if (COND) { X=0; anyorder Y=X; } else { X = Y; anyorder Y = 1; }
Each of the arms must be dag-like.
The first arm is daglike evaluating the X=0 assignment first.
The second arm is daglike evaluating Y=1 first. So, we get the answer you wanted:
if (COND) { X=0; Y=X; } else { Y = 1; X = Y; }
So conventional transformations, plus the knowledge that each anyorder arm must be DAG-like, seem to give the right effect.
I'm not sure what you do if COND is computed as a function of the cells.
I suspect the way to do this is to generate a dependency graph of computations with conditionals on the dependencies. You probably have to propagate/group those conditionals over the arcs much as I did over the syntax.
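For concreteness, here is a minimal sketch of that idea (my illustration, in Python): specialize each formula for one outcome of COND, then topologically sort each branch's dependency graph. The standard-library graphlib raises CycleError only when a branch contains a genuine cycle:
from graphlib import TopologicalSorter, CycleError  # Python 3.9+

# X = if COND then 0 else Y;  Y = if COND then X else 1
# deps[branch][cell] = cells the cell's formula reads once COND is fixed
deps = {
    True:  {'X': set(),  'Y': {'X'}},
    False: {'X': {'Y'},  'Y': set()},
}

for branch, graph in deps.items():
    try:
        order = list(TopologicalSorter(graph).static_order())
        print(f"COND == {branch}: emit assignments in order {order}")
    except CycleError as err:
        print(f"COND == {branch}: genuine circular dependency: {err}")

# COND == True: emit assignments in order ['X', 'Y']
# COND == False: emit assignments in order ['Y', 'X']
Emitting the assignments in those two orders inside the respective arms of the generated if reproduces the C code from the question.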

Yes, literature exists; sorry, I cannot quote any, I simply don't remember and would just Google it up, as you can..
Basic algorithms for dependency and cycle analysis are really simple. That is: detect symbols in the expression, and build a set of expressions and dependencies in the form:
inps expr outs
cell_A6, cell_B7 -> expr3 -> cell_A7
cell_A1, cell_B4 -> expr1 -> cell_A5
cell_A1, cell_A5 -> expr2 -> cell_A6
and then by comparing and iteratively expanding/replacing sets of inputs/outputs:
step0:
cell_A6, cell_B7 -> expr3 -> cell_A7
cell_A1, cell_B4 -> expr1 -> cell_A5 <--1 note that cell_A5 ~ (A1,B4)
cell_A1, cell_A5 -> expr2 -> cell_A6 <--1 apply that knowledge here
so dependency
cell_A1, cell_A5 -> expr2 -> cell_A6
morphs into
cell_A1, cell_B4 -> expr2 -> cell_A6 <--2 note that cell_A6 ~ (A1,B4) and so on
Finally, you will get either a set of full dependencies, where you can easily detect circular dependencies, like for example:
cell_A1, cell_D6, cell_F7 -> exprN -> cell_D6
or, if none is found, you will be able to determine a safe, incremental order of execution.
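A rough Python sketch of that expansion (same hypothetical cell names): grow each output's input set to a fixpoint, then flag any output that ends up among its own inputs:
direct = {
    'cell_A5': {'cell_A1', 'cell_B4'},   # expr1
    'cell_A6': {'cell_A1', 'cell_A5'},   # expr2
    'cell_A7': {'cell_A6', 'cell_B7'},   # expr3
}

def full_inputs(direct):
    full = {out: set(ins) for out, ins in direct.items()}
    changed = True
    while changed:                        # iterate until no set grows
        changed = False
        for out, ins in full.items():
            for dep in list(ins):
                extra = full.get(dep, set())   # expand computed cells
                if not extra <= ins:
                    ins |= extra
                    changed = True
    return full

full = full_inputs(direct)
print(sorted(full['cell_A7']))                           # full dependency set of A7
print([out for out, ins in full.items() if out in ins])  # [] -> no circular deps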
If the expressions contain branches or side effects other than the 'returned value', you can apply various transformations to reduce/expand the expressions into new ones, or into groups of new expressions, that will be of the form above. For example:
B5 = { if(A5 + A3 > 0) A3-1 else A5+1 }
so
inps ... outs
A3, A5 -> theExpr -> B5
the condition can be 'lifted' and form two conditional rules:
A5 + A3 > 0 : A3 -> reducedexpr "A3-1" -> B5
A5 + A3 <= 0 : A5 -> reducedexpr "A5+1" -> B5
but now, your execution/analysis must also take care of the conditions before applying the rules. Lifting is only one of the possible transformations.
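A hypothetical encoding of those two guarded rules (my names, not part of the answer): each rule carries its guard, the cells its reduced expression reads, and the expression itself. Note that the guard itself reads both A3 and A5, so both stay dependencies of B5 no matter which branch is taken:
# B5 = { if (A5 + A3 > 0) A3 - 1 else A5 + 1 }, as guarded rules
rules_B5 = [
    {'guard': lambda env: env['A5'] + env['A3'] > 0,
     'reads': {'A3'},                        # read by the reduced expression
     'expr':  lambda env: env['A3'] - 1},
    {'guard': lambda env: env['A5'] + env['A3'] <= 0,
     'reads': {'A5'},
     'expr':  lambda env: env['A5'] + 1},
]

def eval_B5(env):
    for rule in rules_B5:
        if rule['guard'](env):               # guards are mutually exclusive
            return rule['expr'](env)

print(eval_B5({'A3': 2,  'A5': 1}))   # first guard fires:  A3 - 1 == 1
print(eval_B5({'A3': -2, 'A5': 1}))   # second guard fires: A5 + 1 == 2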
However, you still need something more than that, at least some 'extension' to it. The hard part of your problem is that your expressions are complex, have branches, and you need to include random user input to resolve the branches, eliminate the dead ones, and break dead dependencies.
Since the key is the elimination of dead dependencies, you have to somehow detect dead branches. Conditions can be of arbitrary complexity, and user input is random, so you cannot really work it out completely statically. After playing with transformations, you would still have to analyze the conditions and generate code accordingly. To do so, you would need to generate code for all possible combinations of the outcomes of the conditions, and all resulting branching and rule combinations, which is simply infeasible except for some trivial cases. With N unknowns, the number of leaves can grow exponentially (2^N), which is a huge bloat after crossing some threshold.
Of course, while analyzing conditions based on Booleans, you can analyze, group, and eliminate conflicting conditions like (a & b & !a)..
..but if your input values and conditions include non-Boolean data, like integers, floats, or strings, just imagine your condition executes some external weird statistical function and checks its result. Ignore the 'weird' part and focus on 'external'. If you meet expressions that use complex functions like AVG or MAX, you cannot chew through something like that statically (*). Even simple arithmetic is hard to analyze: (a+b)*(c+d). You could derive the fact that c+d can be ignored when a+b==0, but covering such cases fully is a really tough task.
IIRC, satisfiability analysis (SAT) for Boolean expressions with basic operators is already an NP-complete problem, not to mention integers or floating point with all their math. Calculating the result of an expression is much easier than telling which values it really depends on!
So, since input values may be either hardcoded (cool) or user-supplied at runtime (doh!), your compiler most probably will not be able to fully analyze it up front. Now link this with the fact marked (*), and it's quite obvious that you can include some static analysis and try to eliminate some branches at 'compilation time', but there still might be some parts that must be delayed until the user provides the actual input.
So, if part of the analysis must be done at runtime, all the branch elimination is just an optional optimisation, and I think you should focus on the runtime part for now.
In a minimal, unoptimized version, your generated program could simply remember all the Excel expressions and wait for input data. Once the program is run and input is given, the program has to substitute the input into the expressions and then try to iteratively reduce them to output values.
Writing such an algorithm in an imperative language is completely possible. Actually, you'd need to write it only once, and later you'd just merge it with different sets of rules derived from the cell formulas, and done. The runtime part of the program would be the same; only the formulas would change.
You could then expand the 'compiler' side to try to help, e.g., by preliminarily, partially analyzing the dependencies and trying to reorder the rules so that they will later be checked in a "better order", by precalculating constants, by inlining some expressions, and so on. But as I said, these are all optimizations, not the core feature.
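A minimal sketch of such a runtime engine, assuming the 'compiler' side emits each formula as a closure over a cell environment: each round evaluates whatever has all of its inputs available, and a genuine circular dependency is reported only if a whole round makes no progress:
# the two formulas from the question, emitted as closures
formulas = {
    'X': lambda env: 0 if env['COND'] else env['Y'],
    'Y': lambda env: env['X'] if env['COND'] else 1,
}

def evaluate(formulas, inputs):
    env = dict(inputs)                       # user-supplied values
    pending = set(formulas) - set(env)
    while pending:
        progress = False
        for cell in list(pending):
            try:
                env[cell] = formulas[cell](env)   # KeyError if an input is unknown
                pending.discard(cell)
                progress = True
            except KeyError:
                pass                         # try again in the next round
        if not progress:
            raise RuntimeError(f"circular dependency at runtime: {pending}")
    return env

print(evaluate(formulas, {'COND': True}))    # {'COND': True, 'X': 0, 'Y': 0}
print(evaluate(formulas, {'COND': False}))   # {'COND': False, 'Y': 1, 'X': 1}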
Sadly, I cannot really tell you anything serious about "functional languages", but since their runtimes are usually 'very dynamic' and sometimes they even execute the code in terms of symbols and transformations, it could reduce the complexity of your 'compiler' and 'engine' parts. The most valuable asset here is the dynamism. So even Ruby would do much better than C, though it is in no way a "compiled" language as you'd say.
For example, you could try to transform excel rules directly into functions:
def cell_A5 = expr1(cell_A1, cell_B4)
def cell_A7 = expr3(cell_A6, cell_B7)
def cell_A6 = expr2(cell_A1, cell_A5)
write it down as part of the program; then at runtime, when the user provides some values, those would just redefine some parts of the program:
cell_B7 = 11.2 // filling up undefined variable
cell_A1 = 23 // filling up undefined variable
cell_A5 = 13 // overwriting the function with a value
That's the power of dynamic platforms, nothing very 'functional' here. Dynamic platforms make it easy to fill in/override bits. But then, once the user has provided some bits and the program has been "corrected on the fly", which function would you call first?
The answer is somewhat sad.. You don't know.
If your dynamic language has some rule-engine built into it, you can try generating rules instead of functions and later rely on that engine to "fill up" everything that is possible to calculate.
But if it doesn't have a rule engine, you are back to square one..

Related

Which is better in OCaml pattern matching, `when` or `if-then-else`?

Let's say we have a type called d:
type d = D of int * int
And we want to do some pattern matching over it, is it better to do it this way:
let dcmp = function
| D (x, y) when x > y -> 1
| D (x, y) when x < y -> -1
| _ -> 0
or
let dcmp = function
| D (x, y) ->
if x > y then 1 else if x < y then -1 else 0
Just in general, is it better to match patterns with many "when" cases, or to match one pattern and then put an "if-then-else" in it?
And where can I get more information about such matters, like good practices in OCaml and syntactic sugars and such?
Both approaches have their pros and cons, so they should be used according to the context.
The when clause is easier to understand than if because it has only one branch, so you can digest one branch at a time. That comes with a price: when we analyze a clause in order to understand its path condition, we have to analyze all the branches before it (and negate them); e.g., compare your variant with the following definition, which is equivalent:
let dcmp = function
| D (x, y) when x > y -> 1
| D (x, y) when x = y -> 0
| _ -> -1
Of course, the same is true for the if/then/else construct; it is just harder to accidentally rearrange branches (e.g., during refactoring) in an if/then/else expression and completely change the logic of the expression.
In addition, the when guards may prevent the compiler from performing decision tree optimizations (1) and may confuse (2) the refutation mechanism.
Given this, the only advantage to using when instead of if in this particular example is that the when syntax looks more appealing, as it lines up perfectly and makes it easier for the human brain to find the conditions and their corresponding values, i.e., it looks more like a truth table. However, if we write
let dcmp (D (x,y)) =
if x = y then 0 else
if x > y then 1 else -1
we can achieve the same level of readability.
To summarize, it is better to use when when it is impossible or nearly impossible to express the same code with if/then/else. To improve readability, it is better to factor your logic into helper functions with readable names. For example, with dcmp the best solution is to use neither if nor when, e.g.,
let dcmp (D (x,y)) = compare x y
1) In this particular case the compiler will generate the same code for when and if/then/else. But in more general cases, guards may prevent the matching compiler from generating efficient code, especially when branches are disjoint. In our case, the compiler just noticed that we're repeating the same branch, coalesced them into a single branch, and turned it back into an if/then/else expression; e.g., here is the cmm output of the function with the when guards,
(if (> x y) 3 (if (< x y) -1 1))
which is exactly the same code as generated by the if/then/else version of the dcmp function.
2) Not to the state where it will not notice a missing branch, of course, but to the state where it will report missing branches less precisely or will ask you to add unnecessary branches.
Quoting the OCaml Towards Clarity and Grace style guide:
Code is more often read than written - make the life of the reader easy
and
Less code is better, cryptic code is worse
The first makes me think that the version with multiple when clauses is the better choice, as it makes it easy to predict or evaluate the result, depending on the condition, when reading the code. The second goes further, against the if-then-else, because even if shorter, it is cryptic when looked at from afar.
Also, in the section Functions, we find out that "Pattern matching is the preferred way to define functions"

Blending Boolean Algebra and Numeric Algebra to assign variables

I have written some code which assigns variables using the results of condition expressions without the explicit use of IF-ELSE statements.
In the simplest form, the problem looks like this:
Version 1
if (x < K)
y = A;
else
y = B;
I've seen a "trick" in the past in which people accomplish the same task in one line without the conditional like this:
Version 2
y = (x < K) * A + !(x < K) * B;
This approach extends relatively easily to handle IF-ELSE IF-ELSE assignments. The trick is to ensure that the conditions are all mutually exclusive.
From a unit testing perspective, I'm required to achieve 100% code path coverage.
My coworkers agree that Version 2 is more elegant, but they contend it is less readable. Furthermore, they argue that I am "side-stepping" the path coverage requirement and that I would be able to achieve 100% path coverage by "hiding" the conditional logic inside the single line of code without actually exercising both conditions ((x < K) and !(x < K)).
I argue that I am able to blend Boolean algebra and numeric algebra to perform variable assignment because the computer treats Boolean 'true' and 'false' as '1' and '0' which can be multiplied by 'float' and 'int' variables. To me, it becomes simply an arithmetic expression with zeros and ones multiplying variables.
Why am I doing this?
I am doing this blend of Boolean and numeric algebra to minimize the number of IF statements, minimize lines of code, and generally clean up the code. Obviously, performance can be improved by saving the result of the condition to a variable and referencing it.
The Question
Is this practice (and ternary operators) frowned upon from a unit testing perspective?
If this question is too subjective, please suggest edits.
I'd suggest avoiding it (this trick is actually useful when the intention is to avoid branching, which may be the context you've seen it in). Given that the language doesn't have a conditional operator, you should be able to define the equivalent of
cond(bool, x, y) { if (bool) return x; else return y; }
yourself and write y = cond(x < K, A, B). It's more readable, harder to make a mistake when writing, is usable with non-number types, and is considered correctly in path coverage. It evaluates both sides, unlike the actual conditional operator (unless the language has macros or lazy evaluation), but so does the described trick.
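As a sketch of that suggestion (Python for illustration; x, K, A, B are made-up sample values), note that calling cond evaluates both argument expressions before the call, exactly like the arithmetic trick:
x, K, A, B = 3, 5, 10.0, 20.0                 # made-up sample values

def cond(b, x, y):
    return x if b else y                      # the helper suggested above

y1 = cond(x < K, A, B)                        # readable; the branch is visible to coverage
y2 = (x < K) * A + (not (x < K)) * B          # the trick: bool coerces to 1/0
assert y1 == y2 == 10.0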

Function to detect conflicting mathematical operators in VB6

I've developed a program which generates insurance quotes using different types of coverages based on state criteria. Now I want to add the ability to specify 'rules'. For example we may have 3 types of coverage (we'll call them UM, BI, and PD). Well some states don't allow PD to be greater than BI and other states don't allow UM to exist without BI. So I've added the ability for the user to create these rules so that when the quote is generated the rule will be followed and thus no state regulations will be violated when the program generates the quote.
The Problem
I don't want the user to be able to select conflicting rules. The user can select any of the VB mathematical operators (>, <, >=, <=, =, <>) and set a coverage on either side. They can do this multiple times (but only one at a time) so they might end up with a list of rules like this:
A > B
B > C
C > A
As you can see, the last rule conflicts with the previously set rules. My solution to this was to validate the list each time the user clicks 'Add rule to list'.
Pretend the 3rd list item is not yet in the list but the user has clicked 'Add rule' to put it in the list. The validation process first checks to see if both incoming variables have already been used on the same line. If not, it just searches for the left-side incoming variable (in this case 'C') in the already created list. If it finds it, it then sets tmp1 equal to the variable across from the match (tmp1 = 'B'). It then does the same for the incoming variable on the right side (in this case 'A'). Then tmp2 is set equal to the variable across from A (tmp2 = 'B'). If tmp1 and tmp2 are equal, then the incoming rule is either conflicting OR irrelevant, regardless of the operators used. I'm pretty sure this is solid logic given 3 variables. However, I found that adding any additional variables could easily bypass my validation. There could be upwards of 10 coverage types in any given state, so it is important to be able to validate more than just 3.
Is there any uniform way to do a sound validation given any number of variables? Any ideas or thoughts are appreciated. I hope my explanation makes sense.
Thanks
My best bet is some sort of hierarchical tree of rules. When the user adds the first rule (say A > B), the application could create a data structure like this (lowerValues is a Map which the key leads to a list of values):
lowerValues['A'] = ['B']
Now when the user adds the next rule (B > C), the application could check if B is already in any lowerValues list (in this case, A's). If so, C is added to lowerValues['A'], and lowerValues['B'] is also created:
lowerValues['A'] = ['B', 'C']
lowerValues['B'] = ['C']
Finally, when the last rule is provided by the user (C > A), the application checks if C is in any lowerValues list. Since it's in B and A, the rule is invalid.
Hope that helps. I don't remember if there's some sort of mapping in VB. I think you should try the Dictionary object.
In order for this idea to work out, all the operations must be internally translated to a single form. So, for example:
A > B
could be translated as
B < A
Good luck
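A rough sketch of that lowerValues idea (Python rather than VB6, and only for rules already normalized to a strict "greater than"): each coverage maps to everything transitively below it, and a new rule is rejected when it would close a cycle:
def add_rule(lower, a, b):                    # rule: a > b
    if a in lower.get(b, set()) or a == b:
        raise ValueError(f"rule {a} > {b} conflicts with existing rules")
    lower.setdefault(a, set()).add(b)
    lower[a] |= lower.get(b, set())           # a is above everything below b
    for k in lower:                           # propagate to everything above a
        if a in lower[k]:
            lower[k] |= lower[a]

rules = {}
add_rule(rules, 'A', 'B')   # A > B
add_rule(rules, 'B', 'C')   # B > C  ->  rules['A'] == {'B', 'C'}
add_rule(rules, 'C', 'A')   # raises: C > A closes the cycle A > B > C > A
The other operators (=, <>, >=, <=) would first have to be normalized to this single form, as described above.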
In general this is a pretty hard problem. What you in fact want to know is whether a set of propositional formulas over (apparently) some set of arithmetic is true. To do this you need what amounts to constraint solvers that "know" arithmetic. You're not likely to find that in VB6, but you might be able to invoke one as a subprocess.
If the rules are propositional formulas only over inequalities, first normalize the inequalities (e.g., A > B is the same as B < A; write them only one way).
Second, try solving the propositions for tautology (see Wang's algorithm, which you can likely implement, awkwardly, in VB6).
If the propositions are not a tautology, you now want to build chains of inequalities (e.g., A > B > C) as a graph and look for cycles. The place this fails is when your propositions have disjunctions, e.g., ("A>B or B>Q"); you'll have to generate an inequality chain for each combination of disjunctions and discard the inconsistent ones. If you discard all of them, the set is inconsistent. Watch out for negated conjunctions: by De Morgan's theorem, "not (A and B)" is equivalent to "not A or not B", e.g., "not (A>B and B>Q)" is the same as "A<=B or B<=Q". You might want to reduce the conditions to disjunctive normal form to avoid getting surprised.
There are apparently decision procedures for such inequalities. They're likely hard to implement.

What is the difference between LR, SLR, and LALR parsers?

What is the actual difference between LR, SLR, and LALR parsers? I know that SLR and LALR are types of LR parsers, but what is the actual difference as far as their parsing tables are concerned?
And how to show whether a grammar is LR, SLR, or LALR? For an LL grammar we just have to show that any cell of the parsing table should not contain multiple production rules. Any similar rules for LALR, SLR, and LR?
For example, how can we show that the grammar
S --> Aa | bAc | dc | bda
A --> d
is LALR(1) but not SLR(1)?
EDIT (ybungalobill): I didn't get a satisfactory answer as to what's the difference between LALR and LR. So LALR's tables are smaller in size, but it can recognize only a subset of LR grammars. Can someone elaborate more on the difference between LALR and LR, please? LALR(1) and LR(1) will be sufficient for an answer. Both of them use 1 token of look-ahead and both are table-driven! How are they different?
SLR, LALR and LR parsers can all be implemented using exactly the same table-driven machinery.
Fundamentally, the parsing algorithm collects the next input token T, and consults the current state S (and associated lookahead, GOTO, and reduction tables) to decide what to do:
SHIFT: If the current table says to SHIFT on the token T, the pair (S,T) is pushed onto the parse stack, the state is changed according to what the GOTO table says for the current token (e.g., GOTO(T)), another input token T' is fetched, and the process repeats.
REDUCE: Every state has 0, 1, or many possible reductions that might occur in the state. If the parser is LR or LALR, the token is checked against lookahead sets for all valid reductions for the state. If the token matches a lookahead set for a reduction for grammar rule G = R1 R2 .. Rn, a stack reduction and shift occurs: the semantic action for G is called, the stack is popped n (from Rn) times, the pair (S,G) is pushed onto the stack, the new state S' is set to GOTO(G), and the cycle repeats with the same token T. If the parser is an SLR parser, there is at most one reduction rule for the state and so the reduction action can be done blindly without searching to see which reduction applies. It is useful for an SLR parser to know if there is a reduction or not; this is easy to tell if each state explicitly records the number of reductions associated with it, and that count is needed for the L(AL)R versions in practice anyway.
ERROR: If neither SHIFT nor REDUCE is possible, a syntax error is declared.
So, if they all use the same machinery, what's the point?
The purported value of SLR is its simplicity of implementation; you don't have to scan through the possible reductions checking lookahead sets because there is at most one, and this is the only viable action if there are no SHIFT exits from the state. Which reduction applies can be attached specifically to the state, so the SLR parsing machinery doesn't have to hunt for it. In practice, L(AL)R parsers handle a usefully larger set of languages, and it is so little extra work to implement that nobody implements SLR except as an academic exercise.
The difference between LALR and LR has to do with the table generator. LR parser generators keep track of all possible reductions from specific states and their precise lookahead sets; you end up with states in which every reduction is associated with its exact lookahead set from its left context. This tends to build rather large sets of states. LALR parser generators are willing to combine states if the GOTO tables and lookahead sets for reductions are compatible and don't conflict; this produces considerably smaller numbers of states, at the price of not being able to distinguish certain symbol sequences that LR can distinguish. So, LR parsers can parse a larger set of languages than LALR parsers, but have very much bigger parser tables. In practice, one can find LALR grammars which are close enough to the target languages that the size of the state machine is worth optimizing; the places where the LR parser would be better are handled by ad hoc checking outside the parser.
So: all three use the same machinery. SLR is "easy" in the sense that you can ignore a tiny bit of the machinery, but it is just not worth the trouble. LR parses a broader set of languages, but the state tables tend to be pretty big. That leaves LALR as the practical choice.
Having said all this, it is worth knowing that GLR parsers can parse any context-free language, using more complicated machinery but exactly the same tables (including the smaller version used by LALR). This means that GLR is strictly more powerful than LR, LALR, and SLR; pretty much, if you can write a standard BNF grammar, GLR will parse according to it. The difference in the machinery is that GLR is willing to try multiple parses when there are conflicts between the GOTO table and/or lookahead sets. (How GLR does this efficiently is sheer genius [not mine], but it won't fit in this SO post.)
That, for me, is an enormously useful fact. I build program analyzers and code transformers, and parsers are necessary but "uninteresting"; the interesting work is what you do with the parsed result, so the focus is on the post-parsing work. Using GLR means I can relatively easily build working grammars, compared to hacking a grammar into LALR-usable form. This matters a lot when trying to deal with non-academic languages such as C++ or Fortran, where you literally need thousands of rules to handle the entire language well, and you don't want to spend your life trying to hack the grammar rules to meet the limitations of LALR (or even LR).
As a sort of famous example, C++ is considered to be extremely hard to parse... by guys doing LALR parsing. C++ is straightforward to parse using GLR machinery, with pretty much the rules provided in the back of the C++ reference manual. (I have precisely such a parser, and it handles not only vanilla C++, but also a variety of vendor dialects. This is only possible in practice because we are using a GLR parser, IMHO.)
[EDIT November 2011: We've extended our parser to handle all of C++11. GLR made that a lot easier to do. EDIT Aug 2014: Now handling all of C++17. Nothing broke or got worse, GLR is still the cat's meow.]
LALR parsers merge similar states within an LR automaton to produce parser state tables that are exactly the same size as those for the equivalent SLR grammar, which are usually an order of magnitude smaller than pure LR parsing tables. However, for LR grammars that are too complex to be LALR, these merged states result in parser conflicts, or produce a parser that does not fully recognize the original LR grammar.
BTW, I mention a few things about this in my MLR(k) parsing table algorithm here.
Addendum
The short answer is that the LALR parsing tables are smaller, but the parser machinery is the same. A given LALR grammar will produce much larger parsing tables if all of the LR states are generated, with a lot of redundant (near-identical) states.
The LALR tables are smaller because the similar (redundant) states are merged together, effectively throwing away context/lookahead info that the separate states encode. The advantage is that you get much smaller parsing tables for the same grammar.
The drawback is that not all LR grammars can be encoded as LALR tables because more complex grammars have more complicated lookaheads, resulting in two or more states instead of a single merged state.
The main difference is that the algorithm to produce LR tables carries more info around between the transitions from state to state while the LALR algorithm does not. So the LALR algorithm cannot tell if a given merged state should really be left as two or more separate states.
Yet another answer (YAA).
The parsing algorithms for SLR(1), LALR(1) and LR(1) are identical, as Ira Baxter said; however, the parser tables may be different because of the parser-generation algorithm.
An SLR parser generator creates an LR(0) state machine and computes the look-aheads from the grammar (FIRST and FOLLOW sets). This is a simplified approach and may report conflicts that do not really exist in the LR(0) state machine.
An LALR parser generator creates an LR(0) state machine and computes the look-aheads from the LR(0) state machine (via the terminal transitions). This is a correct approach, but occasionally reports conflicts that would not exist in an LR(1) state machine.
A Canonical LR parser generator computes an LR(1) state machine and the look-aheads are already part of the LR(1) state machine. These parser tables can be very large.
A Minimal LR parser generator computes an LR(1) state machine, but merges compatible states during the process, and then computes the look-aheads from the minimal LR(1) state machine. These parser tables are the same size or slightly larger than LALR parser tables, giving the best solution.
LRSTAR 10.0 can generate LALR(1), LR(1), CLR(1) or LR(*) parsers in C++, whatever is needed for your grammar. See this diagram which shows the difference among LR parsers.
[Full disclosure: LRSTAR is my product]
The basic difference between the parser tables generated with SLR vs LR is that reduce actions are based on the Follow set in SLR tables. This can be overly restrictive, ultimately causing a shift-reduce conflict.
An LR parser, on the other hand, bases reduce decisions only on the set of terminals which can actually follow the non-terminal being reduced. This set of terminals is often a proper subset of the Follow set of such a non-terminal, and therefore has less chance of conflicting with shift actions.
LR parsers are more powerful for this reason. LR parsing tables can be extremely large, however.
An LALR parser starts with the idea of building an LR parsing table, but combines generated states in a way that results in significantly smaller table size. The downside is that a small chance of conflicts is introduced for some grammars that an LR table would otherwise have avoided.
LALR parsers are slightly less powerful than LR parsers, but still more powerful than SLR parsers. YACC and other such parser generators tend to use LALR for this reason.
P.S. For brevity, SLR, LALR and LR above really mean SLR(1), LALR(1), and LR(1), so one token lookahead is implied.
SLR parsers recognize a proper subset of grammars recognizable by LALR(1) parsers, which in turn recognize a proper subset of grammars recognizable by LR(1) parsers.
Each of these is constructed as a state machine, with each state representing some set of the grammar's production rules (and position in each) as it's parsing the input.
The Dragon Book example of an LALR(1) grammar that is not SLR is this:
S → L = R | R
L → * R | id
R → L
Here is one of the states for this grammar:
S → L•= R
R → L•
The • indicates the position of the parser in each of the possible productions. It doesn't know which of the productions it's actually in until it reaches the end and tries to reduce.
Here, the parser could either shift an = or reduce R → L.
An SLR parser (driven by the LR(0) automaton plus follow sets) would determine whether it could reduce by checking whether the next input symbol is in the follow set of R (i.e., the set of all terminals in the grammar that can follow R). Since = is also in this set, the SLR parser encounters a shift-reduce conflict.
However, an LALR(1) parser would use the set of all terminals that can follow this particular production of R, which is only $ (ie, end of input). Thus, no conflict.
As previous commenters have noted, LALR(1) parsers have the same number of states as SLR parsers. A lookahead propagation algorithm is used to tack lookaheads on to SLR state productions from corresponding LR(1) states. The resulting LALR(1) parser can introduce reduce-reduce conflicts not present in the LR(1) parser, but it cannot introduce shift-reduce conflicts.
In your example, the following LALR(1) state causes a shift-reduce conflict in an SLR implementation:
S → b d•a / $
A → d• / c
The symbol after / is the follow set for each production in the LALR(1) parser. In SLR, follow(A) includes a, which could also be shifted.
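To make those sets concrete, here is a small sketch (my encoding) computing FOLLOW for the question's grammar; it leans on the fact that in this grammar only terminals appear directly after a nonterminal, so FIRST-set handling can be skipped:
# FOLLOW sets for  S -> A a | b A c | d c | b d a ;  A -> d
G = {'S': [['A', 'a'], ['b', 'A', 'c'], ['d', 'c'], ['b', 'd', 'a']],
     'A': [['d']]}

def follow_sets(G, start='S'):
    follow = {nt: set() for nt in G}
    follow[start].add('$')
    changed = True
    while changed:                            # standard fixpoint iteration
        changed = False
        for lhs, rules in G.items():
            for rule in rules:
                for i, sym in enumerate(rule):
                    if sym not in G:
                        continue              # terminals need no FOLLOW set
                    # the symbol after sym is a terminal here, or absent
                    nxt = {rule[i + 1]} if i + 1 < len(rule) else follow[lhs]
                    if not nxt <= follow[sym]:
                        follow[sym] |= nxt
                        changed = True
    return follow

print(follow_sets(G))   # {'S': {'$'}, 'A': {'a', 'c'}}
Since 'a' is in FOLLOW(A), the SLR table reduces A -> d on 'a' even in the state reached after "b d", clashing with the shift for S -> b d a; the LALR(1) lookahead for that state is just {'c'}, so the clash disappears.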
Suppose a parser without a lookahead is happily parsing strings for your grammar.
Using your given example, it comes across the string dc. What does it do? Does it reduce it to S, because dc is a valid string produced by this grammar? OR maybe we were trying to parse bdc, because even that is an acceptable string?
As humans we know the answer is simple, we just need to remember if we had just parsed b or not. But computers are stupid :)
Since an SLR(1) parser has the additional power over LR(0) to perform a lookahead, we know that any amount of lookahead cannot tell us what to do in this case; instead, we need to look back into our past. Thus the canonical LR parser comes to the rescue. It remembers the past context.
The way it remembers this context is that it disciplines itself: whenever it encounters a b, it starts walking on a path towards reading bdc, as one possibility. So when it sees a d, it knows whether it is already walking a path.
Thus a CLR(1) parser can do things an SLR(1) parser cannot!
But now, since we had to define so many paths, the states of the machine get very large!
So we merge similar-looking paths, but as expected this can give rise to problems of confusion. However, we are willing to take that risk in exchange for reducing the size.
This is your LALR(1) parser.
Now, how to do it algorithmically?
When you draw the configuring sets for the above language, you will see a shift-reduce conflict in two states. To remove them you might want to consider SLR(1), which makes decisions by looking at a follow set, but you would observe that it still won't be able to. Thus you would draw the configuring sets again, but this time with the restriction that whenever you calculate the closure, the additional productions being added must have strict follow(s). Refer to any textbook on what these follows should be.
In addition to the answers above, this diagram demonstrates how the different parsers relate.
Adding on top of the above answers, the difference between the individual parsers in the class of bottom-up LR parsers is whether they result in shift/reduce or reduce/reduce conflicts when generating the parsing tables. The fewer conflicts a class produces, the more powerful it is (LR(0) < SLR(1) < LALR(1) < CLR(1)).
For example, consider the following expression grammar:
E → E + T
E → T
T → F
T → T * F
F → ( E )
F → id
It's not LR(0) but SLR(1). Using the following code, we can construct the LR0 automaton and build the parsing table (we need to augment the grammar, compute the DFA with closure, compute the action and goto sets):
from copy import deepcopy
import pandas as pd

# Note: get_start_state() and to_str() are small helpers (build the initial
# item set and stringify an item set, respectively) that are not shown here.

def update_items(I, C):
    # merge the item sets in C into the item sets in I
    if len(I) == 0:
        return C
    for nt in C:
        Int = I.get(nt, [])
        for r in C.get(nt, []):
            if not r in Int:
                Int.append(r)
        I[nt] = Int
    return I

def compute_action_goto(I, I0, sym, NTs):
    I1 = {}
    for NT in I:
        C = {}
        for r in I[NT]:
            r = r.copy()
            ix = r.index('.')
            if ix >= len(r) - 1 or r[ix + 1] != sym:  # dot is not before sym
                continue
            r[ix:ix + 2] = r[ix:ix + 2][::-1]         # move the dot past sym
            C = update_items(C, compute_closure(r, I0, NTs))
            cnt = C.get(NT, [])
            if not r in cnt:
                cnt.append(r)
            C[NT] = cnt
        I1 = update_items(I1, C)
    return I1

def construct_LR0_automaton(G, NTs, Ts):
    I0 = get_start_state(G, NTs, Ts)
    I = deepcopy(I0)
    queue = [0]
    states2items = {0: I}
    items2states = {str(to_str(I)): 0}
    parse_table = {}
    cur = 0
    while len(queue) > 0:
        id = queue.pop(0)
        I = states2items[id]
        # compute the goto set for non-terminals
        for NT in NTs:
            I1 = compute_action_goto(I, I0, NT, NTs)
            if len(I1) > 0:
                state = str(to_str(I1))
                if not state in items2states:
                    cur += 1
                    queue.append(cur)
                    states2items[cur] = I1
                    items2states[state] = cur
                    parse_table[id, NT] = cur
                else:
                    parse_table[id, NT] = items2states[state]
        # compute actions for terminals similarly
        # ... ... ...
    return states2items, items2states, parse_table

states2items, items2states, parse_table = construct_LR0_automaton(G, NTs, Ts)
where the grammar G, non-terminal and terminal symbols are defined as below
G = {}
NTs = ['E', 'T', 'F']
Ts = {'+', '*', '(', ')', 'id'}
G['E'] = [['E', '+', 'T'], ['T']]
G['T'] = [['T', '*', 'F'], ['F']]
G['F'] = [['(', 'E', ')'], ['id']]
Here are a few more useful functions I implemented, along with the above ones, for LR(0) parsing table generation:
def augment(G, S):  # start symbol S
    G[S + '1'] = [[S, '$']]  # add the augmented start rule S' -> S $
    NTs.append(S + '1')
    return G, NTs

def compute_closure(r, G, NTs):
    S = {}
    queue = [r]
    seen = []
    while len(queue) > 0:
        r = queue.pop(0)
        seen.append(r)
        ix = r.index('.') + 1
        if ix < len(r) and r[ix] in NTs:
            # add the productions of the nonterminal right after the dot,
            # as fresh items with the dot in front
            S[r[ix]] = [['.'] + p for p in G[r[ix]]]
            for rr in S[r[ix]]:
                if not rr in seen:
                    queue.append(rr)
    return S
The following figure shows the LR(0) DFA constructed for the grammar using the above code:
The following table shows the LR(0) parsing table generated as a pandas dataframe; notice that there are a couple of shift/reduce conflicts, indicating that the grammar is not LR(0).
The SLR(1) parser avoids the above shift/reduce conflicts by reducing only if the next input token is a member of the Follow set of the nonterminal being reduced. The following parse table is generated by SLR:
The following animation shows how an input expression is parsed by the SLR(1) parser for the above grammar:
The grammar from the question is not LR(0) either:
#S --> Aa | bAc | dc | bda
#A --> d
G = {}
NTs = ['S', 'A']
Ts = {'a', 'b', 'c', 'd'}
G['S'] = [['A', 'a'], ['b', 'A', 'c'], ['d', 'c'], ['b', 'd', 'a']]
G['A'] = [['d']]
as can be seen from the next LR0 DFA and the parsing table:
there is a shift / reduce conflict again:
But the following grammar, which accepts the strings of the form a^n c b^n, n >= 0, is LR(0):
A → a A b
A → c
S → A
# S --> A
# A --> a A b | c
G = {}
NTs = ['S', 'A']
Ts = {'a', 'b', 'c'}
G['S'] = [['A']]
G['A'] = [['a', 'A', 'b'], ['c']]
As can be seen from the following figure, there is no conflict in the parsing table generated.
Here is how the input string a^2cb^2 can be parsed using the above LR(0) parse table, using the following code:
def parse(input, parse_table, rules):
    stack = [0]
    df = pd.DataFrame(columns=['stack', 'input', 'action'])
    i, accepted = 0, False
    while i < len(input):
        state = stack[-1]
        char = input[i]
        action = parse_table.loc[parse_table.states == state, char].values[0]
        if action[0] == 's':    # shift: push the symbol and the next state
            stack.append(char)
            stack.append(int(action[-1]))
            i += 1
        elif action[0] == 'r':  # reduce by rule number action[-1]
            r = rules[int(action[-1])]
            l, r = r['l'], r['r']
            char = ''
            for j in range(2 * len(r)):   # pop |rhs| symbol/state pairs
                s = stack.pop()
                if type(s) != int:
                    char = s + char
            if char == r:
                goto = parse_table.loc[parse_table.states == stack[-1], l].values[0]
                stack.append(l)
                stack.append(int(goto[-1]))
        elif action == 'acc':   # accept
            accepted = True
        row = {'stack': ''.join(map(str, stack)), 'input': input[i:], 'action': action}
        df = pd.concat([df, pd.DataFrame([row])], ignore_index=True)
        if accepted:
            break
    return df

parse('aacbb$', parse_table, rules)
The next animation shows how the input string a^2cb^2 is parsed with the LR(0) parser using the above code:
One simple answer is that all LALR(1) grammars are LR(1) grammars.
Compared to LALR(1), LR(1) has more states in the associated finite-state machine (more than double the states). And that is the main reason LALR(1) grammars require more code to detect syntax errors than LR(1) grammars.
And one more important thing to know regarding these two grammar classes: an LR(1) parser might have fewer reduce/reduce conflicts, while in LALR(1) the state merging creates more possibilities for reduce/reduce conflicts.

Are there Mathematica packages for presenting proofs/derivations?

When I write out a proof or derivation on paper I frequently make sign errors or drop terms as I move from one step to the next. I'd like to use Mathematica to save myself from these silly mistakes. I don't want Mathematica to solve the expression, I just want to use it carry out and display a series of algebraic manipulations. For a (trivial) example
In[111]:= MultBothSides[Equal[a_, b_], c_] := Equal[c a, c b];
In[112]:= expression = 2 a == a b
Out[112]= 2 a == a b
In[113]:= MultBothSides[expression, 1/a]
Out[113]= 2 == b
Can anyone point me to a package that would support this kind of manipulation?
Edit
Thanks for the input, not quite what I'm looking for though. The symbol manipulation isn't really the problem. I'm really looking for something that will make explicit the algebraic or mathematical justification of each step of a derivation. My goal here is really pedagogical.
Mathematica also provides a number of high-level functions for manipulating algebraic expressions. Among these are Expand, Apart, Together, and Cancel, though there are quite a few more.
Also, for your specific example of applying the same transformation to both sides of an equation (that is, an expression with the head Equal), you can use the Thread function, which works just like your MultBothSides function, but with a great deal more generality.
In[1]:= expression = 2 a == a b
Out[1]:= 2 a == a b
In[2]:= Thread[expression /a, Equal]
Out[2]:= 2 == b
In[3]:= Thread[expression - c, Equal]
Out[3]:= 2 a - c == a b - c
In either of the presented solutions, it should be relatively easy to see what the step entailed. If you want something a little more explicit, you can write your own function like so:
In[4]:= ApplyToBothSides[f_, eq_Equal] := Map[f, eq]
In[5]:= ApplyToBothSides[4 * #&, expression]
Out[5]:= 8 a == 4 a b
It's a generalization of your MultBothSides function that takes advantage of the fact that Map works on expressions with any head, not just head List. If you're trying to communicate with an audience that is unfamiliar with Mathematica, using these sorts of names can help you communicate more clearly. In a related vein, if you want to use replacement rules as suggested by Ira Baxter, it may be helpful to write out Replace or ReplaceAll instead of using the /. syntactic sugar.
In[6]:= ReplaceAll[expression, a -> (x + y)]
Out[6]:= 2 (x + y) == b (x + y)
If you think it would be clearer to have the actual equation, instead of the variable name expression, in your input, and you're using the notebook interface, highlight the word expression with your mouse, call up the contextual menu, and select "Evaluate in Place".
The notebook interface is also a very pleasant environment for doing "literate programming", so you can also explain any steps that are not immediately obvious in words. I believe this is a good practice when writing mathematical proofs regardless of the medium.
I don't think you need a package. What you want to do is to manipulate each formula according to an inference rule. In MMa, you can model inference rules on a formula using transformations. So, if you have a formula f, you can apply an inference rule I by executing (my MMa syntax is 15 years rusty)
f /. I
to produce the next formula in your sequence.
MMa will of course try to simplify your formulas if they contain standard algebraic operators and terms, such as constant numbers and arithmetic operators. You can prevent MMa from applying its own "inference" rules by enclosing your formula in a Hold[...] form.
