Performance issue in Xpath grammar file using antlr

I am running into a performance issue while writing an ANTLR grammar for XPath.
The whole grammar was working fine until we added support for XPath expressions like:
((div)[1]//span)[1]
or
((//div)[1]/div)[last()]
After adding support for this in the grammar file, the above expressions started working, but other XPath expressions became very slow to parse.
Like this one:
//label[normalize-space(.)='Phone']/parent::lightning-input/parent::slot/parent::slot/parent::span/parent::div/parent::force-record-layout-item/parent::slot/parent::force-record-layout-row/parent::slot/parent::div/parent::div/parent::div/parent::force-record-layout-section/parent::slot/parent::force-record-layout-block/parent::forcegenerated-detailpanel_contact___012b0000000jhhvia4___full___view___recordlayout2/parent::records-lwc-record-layout/parent::slot/parent::records-record-layout-event-broker/parent::div/parent::div/following-sibling::force-form-footer//button
which now takes 30 seconds to parse (it took 170 ms earlier).
These are the rules that were added or modified in the grammar file and that introduced the performance issue:
union
: expressions+=pathExpression (WS? operator='|' WS? expressions+=pathExpression)*
;
pathExpression
: expressionAtom
| expressionAtom nodeSet
;
expressionAtom
: functionCall
| nodeSet
| literal
| parenthesis
;
Earlier it was:
union
: expressions+=expressionAtom (WS? operator='|' WS? expressions+=expressionAtom)*
;
expressionAtom
: functionCall
| nodeSet
| literal
| parenthesis
;
and used to give no performance issues.
While debugging the XpathParser I found that the DFAState has requiresFullContext set to true in this case.
The documentation says that a true value "Indicates that this state was created during SLL prediction that discovered a conflict between the configurations in the state."
What is causing the parser to fall back to the slow full-context search, and how can I resolve it?

I don't know ANTLR well, but how about replacing
pathExpression
: expressionAtom
| expressionAtom nodeSet
;
with
pathExpression
: expressionAtom nodeSet?
;
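To see why the shared expressionAtom prefix can be expensive, here is a toy model in Python. It is a naive backtracking recognizer, not ANTLR's actual prediction algorithm, and the miniature grammar (path, atom, the tokens a and n, parentheses) is made up for illustration. It only counts how often the atom rule is entered for the two-alternative form versus the factored form.

```python
def make_recognizer(factored):
    """Build a recognizer for a toy grammar (hypothetical, for illustration):
         atom: 'a' | '(' path ')'
         path: atom | atom 'n'     (factored=False)
         path: atom 'n'?           (factored=True)
    The returned function yields (accepted, number_of_atom_calls)."""
    calls = 0

    def atom(tokens, pos):
        nonlocal calls
        calls += 1
        if pos < len(tokens) and tokens[pos] == "a":
            return {pos + 1}
        if pos < len(tokens) and tokens[pos] == "(":
            # Parse the nested path, then require a closing parenthesis.
            return {end + 1 for end in path(tokens, pos + 1)
                    if end < len(tokens) and tokens[end] == ")"}
        return set()

    def then_n(tokens, ends):
        # Positions reached by consuming one 'n' after each end position.
        return {e + 1 for e in ends if e < len(tokens) and tokens[e] == "n"}

    def path(tokens, pos):
        if factored:
            ends = atom(tokens, pos)               # atom parsed once
            return ends | then_n(tokens, ends)
        first = atom(tokens, pos)                  # alternative 1: atom
        second = then_n(tokens, atom(tokens, pos)) # alternative 2: atom 'n'
        return first | second

    def recognize(tokens):
        nonlocal calls
        calls = 0
        accepted = len(tokens) in path(tokens, 0)
        return accepted, calls

    return recognize


tokens = ["("] * 10 + ["a"] + [")"] * 10
print(make_recognizer(factored=False)(tokens))
print(make_recognizer(factored=True)(tokens))
```

In this model, both forms accept the same input, but for ten levels of nesting the two-alternative form enters atom 4094 times and the factored form 11 times. The real cost in ANTLR depends on the rest of the grammar, so treat this only as an intuition for why factoring out the common prefix is worth trying.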

My simple Prolog code is throwing a syntax error: operator expected

As the title suggests, my Prolog code is throwing a syntax error. I'm not sure what I'm doing wrong. I'm using SWI-Prolog as my IDE and I tried playing with it to fix the problem, but to no avail.
Here's my simple Prolog code with the error:
?-
| male(bob)
| male(jeff)
|
| female(jane)
| female(erica)
|
| father(bob,jane)
| mother(erica, jane)
|
| ?-mother(erica,X).
ERROR: Syntax error: Operator expected
ERROR: male(bob)
ERROR: ** here **
ERROR:
male(jeff)
female(jane)
female(erica)
father(bob,jane)
mother(erica, jane)
?-mother(erica,X) .

There are two phases of Prolog development: Writing the program and interacting with it in the Prolog shell. These two phases are separate. You don't write your program in the shell, at least not directly.
Save your facts in a file called family.pl (with a dot . at the end of each fact!), then start the Prolog shell. In the shell, you can load the program using
?- consult(family).
or
?- consult('family.pl').
Note that in the first case you leave off the .pl extension, but in the second case, if you do use the extension, you should use single quotes (') around the file name.
Now you can run your query:
?- mother(erica, X).
X = jane.
There are some other ways to load files, such as putting the file name between square brackets [] instead of using consult, or (for many Prolog systems) simply adding the file name on the command line.

Statements in Prolog end with a dot:
male(jeff).
female(jane).
female(erica).
father(bob,jane).
mother(erica, jane).

XText Validator shows Parse Error in wrong line

I am currently developing a small DSL with the following (shortened) grammar:
grammar mydsl with org.eclipse.xtext.common.Terminals hidden(WS, SL_COMMENT)
generate mydsl "uri::mydsl"
CommandSet:
(commands+=Command)*
;
Command:
(commandName=CommandName LBRACKET (args=ArgumentList)? RBRACKET EOL ) |
;
terminal LBRACKET:
'('
;
terminal RBRACKET:
')'
;
terminal EOL:
';'
;
As you can see, I use a semicolon as an EOL separator and it works just fine for me. The problem occurs with the built-in syntax validator when working with the DSL in Eclipse. When I omit a semicolon, the validator reports the syntax error on the wrong line.
Is there an error in my grammar? Thanks ;)

Here is a small DSL loosely based on your example. Basically, I no longer treat line breaks as "hidden" (i.e. they are no longer ignored by the parser), only whitespace. Note the new terminals MY_WS and MY_NL as well as the modified hidden statement in the grammar header (I also added some comments at relevant places). This approach just gives you a general idea, and you can experiment with it to achieve what you want. Note that if line breaks are no longer hidden, you will need to account for them in your grammar rules.
grammar org.xtext.example.mydsl.MyDsl
with org.eclipse.xtext.common.Terminals
hidden( MY_WS, SL_COMMENT ) // ---> hide whitespaces and comments only, not linebreaks!
generate mydsl "uri::mydsl"
CommandSet:
(commands+=Command)*
;
CommandName:
name=ID
;
ArgumentList:
arguments+=STRING (',' arguments+=STRING)* // assign every argument, not only the first
;
Command:
(commandName=CommandName LBRACKET (args=ArgumentList)? RBRACKET EOL);
terminal LBRACKET:
'('
;
terminal RBRACKET:
')'
;
terminal EOL:
';' MY_NL? // ---> now an optional linebreak at the end!
;
terminal MY_WS: (' '|'\t')+; // ---> whitespace characters (formerly part of WS)
terminal MY_NL: ('\r'|'\n')+; // ---> linebreak characters (no longer hidden)

XTEXT: Controlling when whitespace is allowed

I have a custom scripting language for which I am attempting to use Xtext for syntax checking. It boils down to single-line commands in the format
COMMAND:PARAMETERS
For the most part, Xtext is working great. The only problem I have run into is how to handle wanted (or unwanted) whitespace. A line cannot begin with a space, and there cannot be a space following the colon. However, I need to allow whitespace in the parameters, since a parameter could be a string of text or something similar.
I have used a datatype rule to allow whitespace in the parameter:
UNQUOTED_STRING:
(ID | INT | WS | '.' )+
;
This works, but has the side effect of allowing spaces throughout the line.
Does anyone know a way to limit where white spaces are allowed?
Thanks in advance for any advice!

You can disallow whitespace globally for your grammar by using an empty set of hidden tokens, e.g.
grammar org.xyz.MyDsl with org.eclipse.xtext.common.Terminals hidden()
Then you can enable it at specific rules, e.g.
XParameter hidden(WS):
'x' '=' value=ID
;
Note that this would allow line breaks as well. If you don't want that, you can either pass a custom terminal rule or override the default WS rule.
Here is a more complete example (not perfect):
grammar org.xtext.example.mydsl.MyDsl with org.eclipse.xtext.common.Terminals hidden()
generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
Model:
(commands+=Command '\r'? '\n')+
;
Command:
SampleCommand
;
SampleCommand:
command='get' ':' parameter=Parameter
;
Parameter:
'{' x=XParameter '}'
;
XParameter hidden(WS):
'x' '=' value=ID
;
This will parse commands such as:
get:{x=TEST}
get:{ x = TEST}
But will reject (note the leading space in the first line and the space after the colon in the second):
 get:{x=TEST}
get: {x=TEST}
Hope that gives you an idea. You can also do this the other way around, disabling whitespace only for certain rules, if that works better for your grammar, e.g.
CommandList hidden():
(commands+=Command '\r'? '\n')+
;

Can't make ANTLR4 grammar skip comments

I am trying to write an ANTLR4 grammar to parse ActionScript 3. I've decided to start with something fairly coarse-grained:
grammar actionscriptGrammar;
OBRACE:'{';
CBRACE:'}';
STRING_DELIM:'"';
BLOCK_COMMENT : '/*' .*? '*/' -> skip;
EOL_COMMENT : '//' .*? '/n' -> skip;
WS: [ \n\t\r]+ -> skip;
TEXT: ~[{} \n\t\r"]+;
thing
: TEXT
| string_literal
| OBRACE thing+? CBRACE;
string_literal : STRING_DELIM .+? STRING_DELIM;
start_rule
: thing+?;
Basically, I want a tree of things grouped by their lexical scope. I want comments to be ignored, and string literals be their own things so that any braces they may include do not affect lexical scope. The string_literal rule works fine (such as it is) but the two comment rules don't appear to have any effect. (i.e. comments aren't being ignored).
What am I missing?

This is from a simplified Java grammar I wrote in ANTLR v4.
WS
: [ \t\r\n]+ -> channel(HIDDEN)
;
COMMENT
: '/*' .*? '*/' -> skip
;
LINE_COMMENT
: '//' ~[\r\n]* -> skip
;
Maybe this helps.
Also, try rearranging your grammar: write the parser rules first and the lexer rules last, following a top-down approach. I find that much more helpful when debugging. It will also look nicer when you create an HTML export of your grammar from the ANTLR 4 Eclipse plugin.
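One difference from the rules in the question is worth spelling out: LINE_COMMENT stops at the first line break via ~[\r\n]*, whereas EOL_COMMENT ends with '/n', which is the two literal characters slash and n rather than a newline. A rough Python sketch, using regular expressions as stand-ins for the two lexer rules (the regex translations and the helper name are my own approximation, not something ANTLR generates):

```python
import re

# Approximate regex equivalents of the two lexer rules.
EOL_COMMENT = re.compile(r"//.*?/n", re.DOTALL)  # '//' .*? '/n'   (from the question)
LINE_COMMENT = re.compile(r"//[^\r\n]*")         # '//' ~[\r\n]*   (from this answer)


def comment_at_start(rule, text):
    """Return the text the rule matches at position 0, or None if it does not match."""
    m = rule.match(text)
    return m.group() if m else None


source = "// a note\nvar x = 1;\n"
print(comment_at_start(EOL_COMMENT, source))   # no literal "/n" in the input
print(comment_at_start(LINE_COMMENT, source))  # stops before the line break
```

With this input the first call finds nothing, because the text never contains a slash followed by the letter n, while the second returns just the comment text.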
Good Luck!

The answer is that your TEXT rule is consuming your comments. Rather than using a negated set, use something like:
TEXT: [a-zA-Z0-9_][/a-zA-Z0-9.;()\[\]_-]+ ;
That way, your comments cannot be matched by TEXT.
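The effect can be reproduced with a small longest-match tokenizer. The Python sketch below is a simplified model of how an ANTLR lexer chooses tokens (the longest match wins, ties go to the rule defined first). The regexes approximate three of the rules from the question, and the tokenize helper is hypothetical.

```python
import re

# Rules in definition order, translated approximately from the question's grammar.
RULES = [
    ("BLOCK_COMMENT", re.compile(r"/\*.*?\*/", re.DOTALL)),
    ("WS", re.compile(r"[ \n\t\r]+")),
    ("TEXT", re.compile(r'[^{} \n\t\r"]+')),
]


def tokenize(text):
    """Return (rule_name, matched_text) pairs using longest-match, first-rule-wins."""
    pos, tokens = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches have equal length.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


print(tokenize("x/*c*/"))   # comment glued to preceding text
print(tokenize("x /*c*/"))  # comment separated by whitespace
```

In the first input the comment directly follows other text, so TEXT starts matching at x and runs straight through the comment. In the second, the space ends TEXT early and the comment rule gets its turn.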

Ruby Grammar

I'm looking for Ruby grammar in BNF form. Is there an official version?

The YACC syntax is in the Ruby source. Download it and run the bundled utility to get the readable syntax.
wget ftp://ftp.ruby-lang.org/pub/ruby/2.0/ruby-2.0.0-p195.tar.gz
tar xvzf ruby-2.0.0-p195.tar.gz
cd ruby-2.0.0-p195
ruby sample/exyacc.rb < parse.y
Output sample (918 lines in total for v2.0.0-p195):
program : top_compstmt
;
top_compstmt : top_stmts opt_terms
;
top_stmts : none
| top_stmt
| top_stmts terms top_stmt
| error top_stmt
;
top_stmt : stmt
| keyword_BEGIN
'{' top_compstmt '}'
;
bodystmt : compstmt
opt_rescue
opt_else
opt_ensure
;
compstmt : stmts opt_terms
;

Yes, there is a Ruby BNF syntax by the University of Buffalo.
Edit: I've also found this alternate Ruby BNF syntax.

There is also an official version, the Ruby Draft Specification, where you can find the grammar.
Ruby Draft Specification: http://ruby-std.netlab.jp. The server is down, but you can download it from
http://www.ipa.go.jp/osc/english/ruby
