Parsing a nested list with Antlr3, non-LL(*) decision due to recursive rule invocations - antlr3

I have the following grammar to parse a nested list using ANTLR 3:
parse:
list
;
list:
LBRACK list_element* RBRACK
;
list_element:
tree_ | list
;
tree_:
node | ATOM
;
node:
LBRACK tree_ SEPARATOR tree_ RBRACK
;
ATOM: 'nil';
LBRACK: '(';
RBRACK: ')';
SEPARATOR: '.';
WS : (' ' | '\f' | '\r' | '\n' | '\t')+{$channel = HIDDEN;};
I can't work out what is causing the following error, or how to remove it:
'/ListParseTest/src/ListParse.g:17:13: [fatal]
rule list_element has non-LL(*) decision due to recursive rule invocations reachable from alts 1,2.
Resolve by left-factoring or using syntactic predicates or using backtrack=true option.
|---> list_element:
'
I recognize it has something to do with the recursive relationships between list, list_element and tree_, but I am not able to solve the problem.
Can anybody help?

The problem is that the parser cannot always decide immediately which alternative of list_element to take: a ( could be the start of either a new list or a new node (reached through tree_), and because both rules are recursive, no fixed amount of lookahead settles the choice.
One solution is to enable the backtrack option, which lets the parser try an alternative and rewind when it realizes it has taken the wrong path.
This is achieved by adding
backtrack=true;
to the grammar options.
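To illustrate what backtracking means for this grammar, here is a minimal hand-written sketch in Python. It is not the code ANTLR generates; the function names (tokenize, parse_tree, parse_node, parse_list, accepts) are made up for this example. Each parse function returns the position after a successful match, or None on failure, so the caller can rewind and try the next alternative.

```python
import re

def tokenize(text):
    # Simplification: whitespace and any unknown characters are skipped.
    return re.findall(r"nil|[().]", text)

def parse_tree(toks, i):
    # tree_ : node | ATOM
    if i < len(toks) and toks[i] == "nil":
        return i + 1
    return parse_node(toks, i)

def parse_node(toks, i):
    # node : '(' tree_ '.' tree_ ')'
    if i >= len(toks) or toks[i] != "(":
        return None
    j = parse_tree(toks, i + 1)
    if j is None or j >= len(toks) or toks[j] != ".":
        return None
    k = parse_tree(toks, j + 1)
    if k is None or k >= len(toks) or toks[k] != ")":
        return None
    return k + 1

def parse_list(toks, i):
    # list : '(' list_element* ')'
    if i >= len(toks) or toks[i] != "(":
        return None
    i += 1
    while True:
        # list_element : tree_ | list
        j = parse_tree(toks, i)          # try alternative 1
        if j is None:
            j = parse_list(toks, i)      # rewind to i and try alternative 2
        if j is None:
            break
        i = j
    if i < len(toks) and toks[i] == ")":
        return i + 1
    return None

def accepts(text):
    toks = tokenize(text)
    return parse_list(toks, 0) == len(toks)
```

For example, accepts("(nil (nil . nil) (nil))") first tries to read "(nil)" as a node, fails at the missing separator, then rewinds and reads it as a list.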

Related

Xpath combine predicates with common ancestor?

I want the tr elements that are immediate children of a table, optionally wrapped in a tbody:
//table[complex-predictor]/tbody/tr | //table[complex-predictor]/tr
I want to combine the predicates as:
//table[complex-predictor](/tbody/tr | /tr)
But this does not work. What is the correct way to do this?
Btw, I don't want tr elements nested deeper in the table, such as:
(/tbody/tr/td/table/tbody/tr)
This is one possible way:
//table//*[self::th|self::tr]
The main XPath returns all descendant elements of table, then the predicate (the expression in []) filters the descendants to be returned to only th and tr elements.
"Btw, i don't want tr deep in table
(/tbody/tr/td/table/tbody/tr)"
In XPath 2.0 or above you can do:
//table[complex-predictor]/(tbody/tr|tr)
But in XPath 1.0, I don't see a clean way to get this done without repeating the 'complex-predictor'.
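If the query runs from a host language, one way to avoid writing the predicate twice is to test each table once and then collect its rows in code. The sketch below uses Python's xml.etree.ElementTree and assumes well-formed XML input; complex_predicate and direct_rows are hypothetical names, and the id check is only a stand-in for the real predicate.

```python
import xml.etree.ElementTree as ET

def complex_predicate(table):
    # Hypothetical stand-in for the question's complex predicate.
    return table.get("id") in ("a", "b")

def direct_rows(table):
    # tr children of the table itself, then tr children of its tbody children.
    return table.findall("tr") + table.findall("tbody/tr")

doc = ET.fromstring(
    "<root>"
    "<table id='a'><tbody><tr n='1'><td>"
    "<table id='inner'><tbody><tr n='deep'/></tbody></table>"
    "</td></tr></tbody></table>"
    "<table id='b'><tr n='2'/></table>"
    "</root>"
)

rows = [
    tr.get("n")
    for table in doc.iter("table")
    if complex_predicate(table)
    for tr in direct_rows(table)
]
```

Note that within a single table the rows come back grouped (bare tr first, then tbody/tr), not in document order.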

XPath 1.0 exclusive or node-set expression

What I need doesn't quite seem to match what other articles of a similar title are about.
I need, using XPath 1, to be able to get node a, or node b, exclusively, in that order.
That is, node a if it exists, otherwise, node b.
An XPath expression such as:
expression | expression
will get me both in the case they both exist. That is not what I want.
I could go:
(expression | expression)[last()]
Which does in fact get me what I need (in my case), but seems a bit inefficient, because it will evaluate both sides of the expression before the last result is selected.
I was hoping for an expression that stops evaluating once the left side succeeds.
A more concrete example of XML
<one>
<two>
<three>hello</three>
<four>bye</four>
</two>
<blahfive>again</blahfive>
</one>
and the XPath that works (but is inefficient):
(/one/*[starts-with(local-name(.), 'blah')] | .)[last()]
To be clear, I would like to grab the immediate child node of 'one' which starts with 'blah'. However, if it doesn't exist, I would like only the current node.
If the 'blah' node does exist, I do not want the current node.
Is there a more efficient way to achieve this?
"I need, using XPath 1, to be able to get node a, or node b, exclusively, in that order. That is, node a if it exists, otherwise, node b.
An XPath expression such as:
expression | expression
will get me both in the case they both exist. That is not what I want.
I could go:
(expression | expression)[last()]
Which does in fact get me what I need (in my case),"
This statement is not true.
Here is an example. Let us have this XML document:
<one>
<a/>
<b/>
</one>
Expression1 is:
/*/a
Expression2 is:
/*/b
Your composite expression:
(Expression1 | Expression2)[last()]
when we substitute the two expressions above is:
(/*/a | /*/b)[last()]
And this expression actually selects b -- not a -- because b is the last of the two in document order.
Now, here is an expression that selects just a if it exists, and selects b only if a doesn't exist -- regardless of document order:
/*/a | /*/b[not(/*/a)]
When this expression is evaluated on the XML document above, it selects a. To confirm that document order does not matter, swap the places of a and b in the XML document: in both cases the selected element is a.
To summarize, one expression that selects the wanted node regardless of any document order is:
Expression1 | Expression2[not(Expression1)]
Let us apply this general expression in your case:
Expression1 is:
/one/*[starts-with(local-name(.), 'blah')]
Expression2 is:
self::node()
The wanted expression (after substituting Expression1 and Expression2 in the above general expression) is:
/one/*[starts-with(local-name(.), 'blah')]
|
self::node()[not(/one/*[starts-with(local-name(.), 'blah')])]
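On the efficiency concern: a single XPath 1.0 expression gives no guarantee about which operands the engine evaluates. If the expressions are run from a host language, the fallback can be made explicitly lazy. The following is a rough sketch using Python's xml.etree.ElementTree; select_with_fallback is a hypothetical helper, and the simple a/b paths are those of the small example document above.

```python
import xml.etree.ElementTree as ET

def select_with_fallback(primary, fallback):
    # Evaluate the primary selection first; only call the fallback
    # when the primary selection is empty.
    nodes = primary()
    return nodes if nodes else fallback()

def pick(xml_text):
    root = ET.fromstring(xml_text)
    selected = select_with_fallback(
        lambda: root.findall("a"),   # plays the role of Expression1
        lambda: root.findall("b"),   # plays the role of Expression2
    )
    return [node.tag for node in selected]
```

Here pick("<one><a/><b/></one>") and pick("<one><b/><a/></one>") both yield the a element, and b is only looked up when no a exists.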

Prolog syntax error: expression expected checking for parentheses

I'm writing a program in Prolog where I'm given a set of grammar rules and the user inputs a sentence. I must make sure the sentence follows the given rules.
I'm only stuck on one rule:
expr -> ( expr ) also written as expr -> ( id op expr )
Here is my code for this part:
expr(X) :- list(X), length(X, Length), =(Length, 5),
=(X, [Left, Id, Op, Expr | Right]),
=(Left, ‘(‘),
id(Id), op(Op), expr([Expr]),
=(Right, ‘)’).
I believe the issue is with checking the parentheses, since the other parts of this code are used elsewhere with no errors. When using =(Left, '(') or =(Right, ')') I get a syntax error: expression expected. Why do I get this error, and what would be a better way to check for left and right parentheses?
I think you should use plain ASCII single quotes instead of the typographic quotes in =(Left, ‘(‘) and =(Right, ‘)’), i.e. write =(Left, '(') and =(Right, ')').
That said, your Expr will only match a single token, which is not what I would expect. Consider matching the entire 'right' sequence with
X = [Left, Id, Op | Expr],
and then splitting Expr further to get the right parenthesis. Anyway, as I advised in another answer, your parsing (even after this correction) will fail on [a,=,'(',b,')',+,c].

Does "match ... true -> foo | false -> bar" have special meaning in Ocaml?

I encountered the following construct in various places throughout an OCaml project whose code I'm reading.
match something with
true -> foo
| false -> bar
At first glance, it works like a usual if statement. At second glance, it... works like a usual if statement! At third glance, I decided to ask on SO. Does this construct have a special meaning, or a subtle difference from an if statement that matters in peculiar cases?
Yep, it's an if statement.
Often match cases are more common in OCaml code than if, so it may be used for uniformity.
I don't agree with the previous answer. It DOES the work of an if statement, but it's more flexible than that.
"Pattern matching is a switch statement but 10 times more powerful", as someone stated.
Take a look at this tutorial explaining ways to use pattern matching: Link here
Also, in OCaml, pattern matching is the way to break compound data, such as a list or a tuple, into its simpler parts:
let imply v =
  match v with
  | true, x -> x
  | false, _ -> true;;
let head = function
  | [] -> 42
  | h :: _ -> h;;
let rec sum = function
  | [] -> 0
  | h :: l -> h + sum l;;

yacc, only applying rule once

I'm trying to write a shell using yacc and lex and I'm running into some problems with my I/O redirectors. Currently, I can use the < and > operators fine and in any order, but my problem is I can redirect twice with no error, such as "ls > log > log2"
My rule code is below, can anyone give me some tips on how to fix this? Thanks!
io_mod:
iomodifier_opt io_mod
|
;
iomodifier_opt:
GREAT WORD {
printf(" Yacc: insert output \"%s\"\n", $2);
Command::_currentCommand._outFile = $2;
}
|
LESS WORD {
printf(" Yacc: insert input \"%s\"\n", $2);
Command::_currentCommand._inputFile = $2;
}
| /* can be empty */
;
EDIT: After talking to my TA, I learned that I did not actually need to have only 1 modifier for my command and that I actually can have multiple copies of the same I/O redirection.
There are two approaches:
(1) Modify the grammar so that you can only have one of each kind of modifier:
io_mod_opt: out_mod in_mod | in_mod out_mod | in_mod | out_mod | ;
(2) Modify the clause handler to count the modifiers and report an error if there's more than one:
GREAT WORD {
    if (already_have_output_file()) {
        /* already_have_output_file() and error() are placeholder helpers */
        error("too many output files: \"%s\"\n", $2);
    } else {
        /* record output file */
    }
}
Option (2) seems likely to lead to better error messages and a simpler grammar.
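The counting idea behind option (2) does not depend on yacc. As a language-neutral illustration, here is a small Python sketch that walks an already tokenized command and rejects a second redirection of the same kind. The function name collect_redirections and the error wording are assumptions made for this example, not part of the original shell code.

```python
def collect_redirections(tokens):
    """Split shell-like tokens into argv plus at most one input and one output file."""
    argv, files = [], {}
    it = iter(tokens)
    for tok in it:
        if tok in (">", "<"):
            target = next(it, None)  # the word following the operator
            if target is None:
                raise SyntaxError(f"missing file name after {tok!r}")
            if tok in files:
                kind = "output" if tok == ">" else "input"
                raise SyntaxError(f"too many {kind} files: {target!r}")
            files[tok] = target
        else:
            argv.append(tok)
    return argv, files.get("<"), files.get(">")
```

With this check, "ls > log < in" is accepted in either order, while "ls > log > log2" is reported as an error that names the offending file.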
There's also a third approach - don't fret. Bash (under Cygwin) does not generate an error for:
ls > x > y
It creates x and then y and ends up writing to y.
I realize this might just be an exercise to learn lex and yacc, but otherwise the first question to ask is why you want to use lex and yacc at all. Any usual shell command language has a pretty simple grammar; what are you gaining from using an LALR generator?
Well, other than complexity, difficulty generating good error messages, and code bulk, I mean.
