Parse a step of an XPATH expression in JavaCC again - xpath

A while ago I was struggling with writing a JavaCC template for XPath steps so that it would support both a full step definition and a definition with axis name omitted (in which case the axis name would default to child). I posted a question on SO and got a working answer by Theodore Norvell.
Now I'm trying to extend the template so that the parser would, in addition to the two previous possibilities, also support using an "@" sign as a shortcut for the attribute axis.
The following snippet does not work:
Step Step() :
{
Token t;
Step step;
Axis axis;
NodeTest nodeTest;
Expression predicate;
}
{
{ axis = Axis.child; }
(
<AT>
{ axis = Axis.attribute; }
|
LOOKAHEAD( <IDENTIFIER> <DOUBLE_COLON> )
t = <IDENTIFIER>
{ axis = Axis.valueOf(t.image); }
<DOUBLE_COLON>
)?
t = <IDENTIFIER>
{ nodeTest = new NodeNameTest(t.image); }
{ step = new Step(axis, nodeTest); }
(
<OPEN_PAR>
predicate = Expression()
{ step.addPredicate(predicate); }
<CLOSE_PAR>
)*
{ return step; }
}
Instead it emits the following warning message:
Choice conflict in [...] construct at line 162, column 9.
Expansion nested within construct and expansion following construct
have common prefixes, one of which is: <IDENTIFIER>
Consider using a lookahead of 2 or more for nested expansion.
I have tried setting the LOOKAHEAD parameter in various ways but the only way that worked was to set it globally to 2. I would prefer changing it locally though.
How do I do that? And why doesn't the snippet shown in this question work?

Try this
(
<AT>
{ axis = Axis.attribute; }
|
LOOKAHEAD( <IDENTIFIER> <DOUBLE_COLON> )
t = <IDENTIFIER>
{ axis = Axis.valueOf(t.image); }
<DOUBLE_COLON>
|
{}
)
--Edit--
I'd forgotten to answer the second question: "Why doesn't the given snippet work?"
The lookahead specification that you have applies only to the alternation inside the parentheses. I'm surprised JavaCC doesn't give you a warning, as the LOOKAHEAD is on the last alternative and hence useless. By the time the parser gets to the LOOKAHEAD, it has already decided (on the basis of the next token being an identifier) to process the part inside the (...)? construct. Another solution is thus
( LOOKAHEAD( <AT> | <IDENTIFIER> <DOUBLE_COLON> )
(<AT> {...} | <IDENTIFIER> {...} <DOUBLE_COLON> )
)?
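The decision described above can be imitated with a small hand-written parser. This is only a sketch under stated assumptions: the class StepLookaheadSketch, its Token and Kind types, and the string result are hypothetical stand-ins, not code that JavaCC generates. oneTokenDecision commits to the optional group after seeing a single token, as the original snippet does; localLookahead checks two tokens before taking the axis branch, as the three-way choice does.

```java
import java.util.Arrays;
import java.util.List;

// Hand-written analogue of the choice made at the start of a step.
// All names here are illustrative assumptions, not JavaCC output.
class StepLookaheadSketch {
    enum Kind { AT, IDENTIFIER, DOUBLE_COLON }

    static final class Token {
        final Kind kind;
        final String image;
        Token(Kind kind, String image) { this.kind = kind; this.image = image; }
    }

    static Token tok(Kind kind, String image) { return new Token(kind, image); }

    // True when a token exists at index i and has the given kind.
    static boolean kindAt(List<Token> ts, int i, Kind kind) {
        return i < ts.size() && ts.get(i).kind == kind;
    }

    // Entry into the optional group is decided from ONE token only.
    static String oneTokenDecision(List<Token> ts) {
        int pos = 0;
        String axis = "child";
        if (kindAt(ts, 0, Kind.AT) || kindAt(ts, 0, Kind.IDENTIFIER)) {
            // Already committed to the group at this point.
            if (kindAt(ts, 0, Kind.AT)) {
                axis = "attribute";
                pos = 1;
            } else {
                axis = ts.get(0).image;
                if (!kindAt(ts, 1, Kind.DOUBLE_COLON)) {
                    throw new IllegalStateException("expected '::' after " + axis);
                }
                pos = 2;
            }
        }
        return axis + "::" + nodeName(ts, pos);
    }

    // The axis branch is taken only if TWO tokens match;
    // otherwise the empty alternative applies and nothing is consumed.
    static String localLookahead(List<Token> ts) {
        int pos = 0;
        String axis = "child";
        if (kindAt(ts, 0, Kind.AT)) {
            axis = "attribute";
            pos = 1;
        } else if (kindAt(ts, 0, Kind.IDENTIFIER) && kindAt(ts, 1, Kind.DOUBLE_COLON)) {
            axis = ts.get(0).image;
            pos = 2;
        }
        return axis + "::" + nodeName(ts, pos);
    }

    static String nodeName(List<Token> ts, int pos) {
        if (!kindAt(ts, pos, Kind.IDENTIFIER)) {
            throw new IllegalStateException("expected a node name at token " + pos);
        }
        return ts.get(pos).image;
    }

    public static void main(String[] args) {
        List<Token> plain = Arrays.asList(tok(Kind.IDENTIFIER, "p"));
        System.out.println(localLookahead(plain)); // child::p
        try {
            oneTokenDecision(plain);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a lone identifier as input, the one-token version treats it as an axis name and then fails for lack of "::", while the two-token version falls through to the default child axis.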

Related

Binary operator '/' cannot be applied to two (Int) operands [duplicate]

This question already has answers here:
Passing lists from one function to another in Swift
(2 answers)
Closed 7 years ago.
I am getting a Binary operator '/' cannot be applied to two (Int) operands error when I put the following code in a Swift playground in Xcode.
func sumOf(numbers: Int...) -> Int {
var sum = 0
for number in numbers {
sum += number
}
return sum
}
sumOf()
sumOf(42, 597, 12)
The above was a function calculating the total sum of any numbers.
Below is a function calculating the average of the numbers. The function is calling the sumOf() function from within itself.
func avg(numbers: Int...) -> Float {
var avg:Float = ( sumOf(numbers) ) / ( numbers.count ) //Binary operator '/' cannot be applied to two (Int) operands
return avg
}
avg(1, 2, 3);
Note: I have looked everywhere on Stack Exchange for an answer, but the other questions are all different from mine, because mine involves two Ints, which are the same type and not two different types.
I would like it if someone could help me to solve the problem which I have.
Despite the error message, it seems that you cannot forward the arguments of a variadic (...) parameter. A single call of sumOf(numbers) within the avg() function gives the error "cannot invoke sumOf with an argument of type ((Int))".
The error is telling you what to do. If you refer to https://developer.apple.com/library/mac/documentation/AppleScript/Conceptual/AppleScriptLangGuide/reference/ASLR_operators.html
/ Division.
A binary arithmetic operator that divides the number to its left by the number to its right.
Class of operands: integer, real
Class of result: real
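That page describes AppleScript, though; Swift does not implicitly convert Int values to Float, so both operands have to be converted explicitly. Convert them like so. I don't use Xcode, but I think my syntax is correct.
var avg:Float = Float( sumOf(numbers) ) / Float( numbers.count )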

Semantic Predicates antlr don't recognize chain of integers of width 4

I need to recognize arrays of integers in Fortran's I4 format (stands for an integer of width four) as the following example:
Using a pure context-free grammar:
WS : ' ' ;
MINUS : '-' ;
DIGIT : '0'..'9' ;
int4:
WS WS (WS| MINUS ) DIGIT
| WS (WS| MINUS ) DIGIT DIGIT
| (WS| MINUS | DIGIT ) DIGIT DIGIT DIGIT
;
numbers
: int4*;
The above example is correctly matched:
However if I use semantic predicates to encode semantic constraints of rule int4 :
int4
scope { int n; }
@init { $int4::n = 0; }
: ( {$int4::n < 3}?=> WS {$int4::n++;} )*
( MINUS {$int4::n++;} )?
( {$int4::n < 4}?=> DIGIT{$int4::n++;} )+
{$int4::n == 4}?
;
it works for the int4 rule, but it's not the same for the numbers rule, because it doesn't recognize the array of integers of the first example:
In this case a pure context-free grammar may be the better choice, but what about a format such as I30 (which stands for an integer of width 30)?
The main question is: Is it possible to use Semantic Predicates with this grammar?
Your parse tree seems to end at the numbers rule because your numbers rule throws an exception (but it does not show up in the diagram...). You can see it if you run the code generated, and if you take a closer look at the exception, it says (line info may differ for you):
Exception in thread "main" java.util.EmptyStackException
at java.util.Stack.peek(Stack.java:102)
at FortranParser.numbers(FortranParser.java:305)
at Main.main(Main.java:9)
and the code throwing the exception is:
public final void numbers() throws RecognitionException {
....
else if ( (LA5_0==DIGIT) && ((int4_stack.peek().n < 4))) {
alt5=1;
}
So your problem is that the semantic predicate gets propagated (hoisted) into the numbers rule, and at that level the scope stack is empty, hence int4_stack.peek() throws an exception.
A trick to avoid it is that you use a variable in the global scope, e.g.:
@members {
int level=0;
}
and modify the semantic predicates to check level before the predicates, just like:
int4
scope { int n; }
@init { $int4::n = 0; level++; }
@after { level--; }
: ( {level==0 || $int4::n < 3}?=> WS {$int4::n++;} )*
( MINUS {$int4::n++;} )?
( {level==0 || $int4::n < 4}?=> DIGIT{$int4::n++;} )+
{$int4::n == 4}?
;
This is just a workaround to avoid the error that you get, maybe (knowing the error) there is a better solution and you don't need to mess up your semantic predicates.
But, I think, the answer is yes, it is possible to use semantic predicates with that grammar.
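The failure and the workaround can be reproduced outside ANTLR with a few lines of plain Java. This is a minimal sketch, assuming a hypothetical class HoistedPredicateSketch that stands in for the generated parser: int4Stack plays the role of int4_stack, level is the counter from the members section, and the two predicate methods correspond to the unguarded and guarded forms of the DIGIT predicate. It is not the code that ANTLR generates.

```java
import java.util.EmptyStackException;
import java.util.Stack;

// Stand-in for the generated parser state; all names are illustrative.
class HoistedPredicateSketch {
    static final class Int4Scope { int n; }

    final Stack<Int4Scope> int4Stack = new Stack<>();
    int level = 0; // number of int4 invocations currently active

    // Predicate as evaluated from the numbers rule: it assumes a scope exists.
    boolean unguardedDigitAllowed() {
        return int4Stack.peek().n < 4; // EmptyStackException when no int4 is active
    }

    // Guarded predicate: short-circuits before touching the stack
    // when evaluated outside of int4.
    boolean guardedDigitAllowed() {
        return level == 0 || int4Stack.peek().n < 4;
    }

    // Corresponds to the init action: push a fresh scope and count the invocation.
    void enterInt4() {
        int4Stack.push(new Int4Scope());
        level++;
    }

    // Corresponds to the after action.
    void exitInt4() {
        int4Stack.pop();
        level--;
    }

    public static void main(String[] args) {
        HoistedPredicateSketch p = new HoistedPredicateSketch();
        System.out.println(p.guardedDigitAllowed()); // evaluated at level 0, stack untouched
        try {
            p.unguardedDigitAllowed();
        } catch (EmptyStackException e) {
            System.out.println("unguarded predicate failed on an empty scope stack");
        }
    }
}
```

The guard only changes what happens while no int4 invocation is active; inside int4 both forms evaluate the same width check.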

Are semicolons optional in Rust?

Since semicolons are apparently optional in Rust, why, if I do this:
fn fn1() -> i32 {
let a = 1
let b = 2
3
}
I get the error:
error: expected one of `.`, `;`, `?`, or an operator, found `let`
--> src/main.rs:3:9
|
2 | let a = 1
| - expected one of `.`, `;`, `?`, or an operator here
3 | let b = 2
| ^^^ unexpected token
They're not optional. Semicolons modify the behaviour of an expression statement so it should be a conscious decision whether you use them or not for a line of code.
Almost everything in Rust is an expression. An expression is something that returns a value. If you put a semicolon you are suppressing the result of this expression, which in most cases is what you want.
On the other hand, this means that if you end your function with an expression without a semicolon, the result of this last expression will be returned. The same can be applied for a block in a match statement.
You can use expressions without semicolons anywhere else a value is expected.
For example:
let a = {
let inner = 2;
inner * inner
};
Here the expression inner * inner does not end with a semicolon, so its value is not suppressed. Since it is the last expression in the block, its value will be returned and assigned to a. If you put a semicolon on this same line, the value of inner * inner won't be returned.
In your specific case, not suppressing the value of your let statement doesn't make sense, and the compiler is rightly giving you an error for it. In fact, let is not an expression.
Semicolons are generally not optional, but there are a few situations where they are. Namely after control expressions like for, if/else, match, etc.
fn main() {
let a: u32 = 5;
if 5 == a {
println!("Hello!");
}
if 5 == a {
println!("Hello!");
};
for x in "World".chars() {
println!("{}", x);
}
for x in "World".chars() {
println!("{}", x);
};
}
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=1bf94760dccae285a2bdc9c44e8f658a
(There are situations where you do need to have or not have semicolons for these statements. If you're returning the value from within you can't have a semicolon, and if you're setting a variable to be the value from within you'll need a semicolon.)

C++ Boost qi recursive rule construction

[It seems my explanations and expectations are not clear at all, so I added precision on how I'd like to use the feature at the end of the post]
I'm currently working on grammars using Boost.Spirit Qi. I had a loop construction for a rule because I needed to build it from the elements of a vector. I have rewritten it with simple types, and it looks like:
#include <string>
#include <vector>
// using boost 1.43.0
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/qi_eps.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace bqi = boost::spirit::qi;
typedef const char* Iterator;
// function that you can find [here][1]
template<typename P> void test_phrase_parser(char const* input, P const& p, bool full_match = true);
int main()
{
// my working rule type:
bqi::rule<Iterator, std::string()> myLoopBuiltRule;
std::vector<std::string> v;
std::vector<std::string>::const_iterator iv;
v.push_back("abc");
v.push_back("def");
v.push_back("ghi");
v.push_back("jkl");
myLoopBuiltRule = (! bqi::eps);
for(iv = v.begin() ; iv != v.end() ; iv++)
{
myLoopBuiltRule =
myLoopBuiltRule.copy() [ bqi::_val = bqi::_1 ]
| bqi::string(*iv) [ bqi::_val = bqi::_1 ]
;
}
debug(myLoopBuiltRule);
char s[] = " abc ";
test_phrase_parser(s, myLoopBuiltRule);
}
(Looks like here does not want to be replaced by corresponding hyperlink, so here is the address to find function test_phrase_parser(): http://www.boost.org/doc/libs/1_43_0/libs/spirit/doc/html/spirit/qi/reference/basics.html)
All was for the best in the best of all worlds... until I had to pass an argument to this rule. Here is the new rule type:
// my not-anymore-working rule type:
bqi::rule<Iterator, std::string(int*)> myLoopBuiltRule;
The 'int*' type is for example purposes only; my real pointer is addressing a much more complex class... but still a mere pointer.
I changed my 'for' loop accordingly, i.e.:
for(iv = v.begin() ; iv != v.end() ; iv++)
{
myLoopBuiltRule =
myLoopBuiltRule.copy()(bqi::_r1) [ bqi::_val = bqi::_1 ]
| bqi::string(*iv) [ bqi::_val = bqi::_1 ]
;
}
I had to add a new rule because test_phrase_parser() cannot guess which value is to be given to the int pointer:
bqi::rule<Iterator> myInitialRule;
And change everything that followed the for loop:
myInitialRule = myLoopBuiltRule((int*)NULL);
debug(myLoopBuiltRule);
char s[] = " abc ";
test_phrase_parser(s, myInitialRule);
Then everything crashed:
/home/sylvain.darras/software/repository/software/external/include/boost/boost_1_43_0/boost/spirit/home/qi/nonterminal/rule.hpp:199: error: no matching function for call to ‘assertion_failed(mpl_::failed************ (boost::spirit::qi::rule<Iterator, T1, T2, T3, T4>::operator=(const Expr&)
Then I got crazy and tried:
myLoopBuiltRule =
myLoopBuiltRule.copy(bqi::_r1) [ bqi::_val = bqi::_1 ]
| bqi::string(*iv) [ bqi::_val = bqi::_1 ]
-->
error: no matching function for call to ‘boost::spirit::qi::rule<const char*, std::string(int*), boost::fusion::unused_type, boost::fusion::unused_type, boost::fusion::unused_type>::copy(const boost::phoenix::actor<boost::spirit::attribute<1> >&)’
Then I got mad and wrote:
myLoopBuiltRule =
myLoopBuiltRule(bqi::_r1) [ bqi::_val = bqi::_1 ]
| bqi::string(*iv) [ bqi::_val = bqi::_1 ]
Which compiles since it is perfectly syntactically correct, but which magnificently stack overflows coz it happily, nicely, recursively, calls itself to death...
Then I lost my mind and typed:
myLoopBuiltRule =
jf jhsgf jshdg fjsdgh fjsg jhsdg jhg sjfg jsgh df
Which, as you probably expect, has failed to compile.
As you can imagine, I searched the web before writing the above novel, but I didn't find anything that covers copy() and argument passing at the same time. Has anyone already experienced this problem? Have I missed something?
Be assured that any help will be really really appreciated.
PS: Great thanks to hkaiser who has, without knowing it, answered a lot of my boost::qi problems through google (but this one).
Further information:
The purpose of my parser is to read files written in a given language L. The purpose of my post is to propagate my "context" (i.e.: variable definitions and especially constant values, so I can compute expressions).
The number of variable types I handle is small, but it's bound to grow, so I keep these types in a container class. I can loop on these managed types.
So, let's consider a pseudo-algorithm of what I would like to achieve:
LTypeList myTypes;
LTypeList::const_iterator iTypes;
bqi::rule<Iterator, LType(LContext*)> myLoopBuiltRule;
myLoopBuiltRule = (! bqi::eps);
for(iTypes = myTypes.begin() ; iTypes != myTypes.end() ; iTypes++)
{
myLoopBuiltRule =
myLoopBuiltRule.copy()(bqi::_r1) [ bqi::_val = bqi::_1 ]
| iTypes->getRule()(bqi::_r1) [ bqi::_val = bqi::_1 ]
}
This is done during initialization, and then myLoopBuiltRule is used and reused with different LContext* values, parsing multiple types. Since some L types can have bounds, which are integer expressions, and these integer expressions can contain constants, I (think that I) need my inherited attribute to carry my LContext around so that I can compute the expression values.
Hope I've been clearer in my intentions.
Note I just extended my answer with a few more informational links. In this particular case I have a hunch that you could just get away with the Nabialek trick and replacing the inherited attribute with a corresponding qi::locals<> instead. If I have enough time, I might work out a demonstration later.
Caveats, expositioning the problem
Please be advised that there are issues when copying proto expression trees and spirit parser expressions in particular - it will create dangling references as the internals are not supposed to live past the end of the containing full expressions. See BOOST_SPIRIT_AUTO on Zero to 60 MPH in 2 seconds!
Also see these answers, which also concern themselves with building/composing rules on the fly (at runtime):
Generating Spirit parser expressions from a variadic list of alternative parser expressions
Can Boost Spirit Rules be parameterized, which demonstrates how to return rules from a function using boost::proto::deep_copy (like BOOST_SPIRIT_AUTO does, actually)
Nabialek Trick
In general, I'd very strongly advise against combining rules at runtime. Instead, if you're looking to 'add alternatives' to a rule at runtime, you can always use qi::symbols<> instead. The trick is to store a rule in the symbol-table and use qi::lazy to call the rule. In particular, this is known as the Nabialek Trick.
I have a toy command-line arguments parser here that demonstrates how you could use this idiom to match a runtime-defined set of command line arguments:
https://gist.github.com/sehe/2a556a8231606406fe36
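The idea behind the trick, a lookup table that maps a keyword to the parser that handles what follows it, can be illustrated without Spirit. The following is only a language-neutral analogue written in Java: KeywordDispatchSketch, register and parse are hypothetical names, and nothing here corresponds to an actual Spirit API. The table is filled in a loop at runtime, much like the vector-driven loop in the question, but no composite rule is ever rebuilt or copied.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Analogue of "symbol table + lazy call": keyword -> parser for the remainder.
// Hypothetical sketch; this is not how Spirit is implemented.
class KeywordDispatchSketch {
    private final Map<String, Function<String, Optional<String>>> table = new HashMap<>();

    // Adds one alternative at runtime.
    void register(String keyword, Function<String, Optional<String>> rest) {
        table.put(keyword, rest);
    }

    // Looks up the first word, then hands the remainder to the stored parser.
    Optional<String> parse(String input) {
        String[] parts = input.trim().split("\\s+", 2);
        Function<String, Optional<String>> rest = table.get(parts[0]);
        if (rest == null) {
            return Optional.empty(); // no alternative matches this keyword
        }
        return rest.apply(parts.length == 2 ? parts[1] : "");
    }

    public static void main(String[] args) {
        KeywordDispatchSketch parser = new KeywordDispatchSketch();
        for (String keyword : new String[] {"abc", "def", "ghi", "jkl"}) {
            // Each alternative accepts the bare keyword and yields it as its value.
            parser.register(keyword, tail -> tail.isEmpty() ? Optional.of(keyword) : Optional.empty());
        }
        System.out.println(parser.parse(" abc ")); // Optional[abc]
        System.out.println(parser.parse("xyz"));   // Optional.empty
    }
}
```

Adding a new alternative is then a single table insertion, which is the property that makes the symbol-table approach attractive for sets of alternatives known only at runtime.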
Limitations of qi::lazy, what's next?
Unfortunately, qi::lazy does not support inherited attributes; see e.g.
http://boost.2283326.n4.nabble.com/pass-inhertited-attributes-to-nabialek-trick-td2679066.html
You might be better off writing a custom parser component, as documented here:
http://boost-spirit.com/home/articles/qi-example/creating-your-own-parser-component-for-spirit-qi/
I'll try to find some time to work out a sample that replaces inherited arguments by qi::locals later.

JavaCC - parse a step of an XPATH expression

I'm trying to write a JavaCC script for a (simple) XPath parser and I'm having problems with the part to parse individual steps.
My idea of the grammar is this:
Step ::= ( AxisName "::" )? NodeTest ( "[" Predicate "]" )*
I have transformed it into the following script snippet:
Step Step() :
{
Token t;
Step step;
Axis axis;
NodeTest nodeTest;
Expression predicate;
}
{
{ axis = Axis.child; }
(
t = <IDENTIFIER>
{ axis = Axis.valueOf(t.image); }
<COLON>
<COLON>
)?
t = <IDENTIFIER>
{ nodeTest = new NodeNameTest(t.image); }
{ step = new Step(axis, nodeTest); }
(
<OPEN_PAR>
predicate = Expression()
{ step.addPredicate(predicate); }
<CLOSE_PAR>
)*
{ return step; }
}
This, however, doesn't work. Given the following expression:
p
it throws the following error:
Exception in thread "main" java.lang.IllegalArgumentException: No enum constant cz.dusanrychnovsky.generator.expression.Axis.p
at java.lang.Enum.valueOf(Unknown Source)
at cz.dusanrychnovsky.generator.expression.Axis.valueOf(Axis.java:3)
at cz.dusanrychnovsky.generator.parser.XPathParser.Step(XPathParser.java:123)
at cz.dusanrychnovsky.generator.parser.XPathParser.RelativeLocationPath(XPathParser.java:83)
at cz.dusanrychnovsky.generator.parser.XPathParser.AbsoluteLocationPath(XPathParser.java:66)
at cz.dusanrychnovsky.generator.parser.XPathParser.Start(XPathParser.java:23)
at cz.dusanrychnovsky.generator.parser.XPathParser.parse(XPathParser.java:16)
at cz.dusanrychnovsky.generator.Main.main(Main.java:24)
I believe that what happens is that the parser sees an identifier on the input so it takes the axis branch even though no colons will follow, which the parser cannot know at that time.
What is the best way to fix this? Should I somehow increase the lookahead value for the Step rule, and if that's the case, then how exactly would I do that? Or do I need to rewrite the rule somehow?
Two choices:
( LOOKAHEAD(3)
t = <IDENTIFIER>
{ axis = Axis.valueOf(t.image); }
<COLON>
<COLON>
)?
or
( LOOKAHEAD( <IDENTIFIER> <COLON> <COLON> )
t = <IDENTIFIER>
{ axis = Axis.valueOf(t.image); }
<COLON>
<COLON>
)?
