Why does //ID pass but //DEF fail? - xpath

I'm experimenting with XPath using the grammar provided in the test suite. The path //ID is matched, but //DEF is not found: an IllegalArgumentException is thrown with the message "DEF at index 2 isn't a valid token name". Why is //ID matched, but //DEF not?
String exprGrammar = "grammar Expr;\n" +
"prog: func+ ;\n" +
"func: DEF ID '(' arg (',' arg)* ')' body ;\n" +
"body: '{' stat+ '}' ;\n" +
"arg : ID ;\n" +
"stat: expr ';' # printExpr\n" +
" | ID '=' expr ';' # assign\n" +
" | 'return' expr ';' # ret\n" +
" | ';' # blank\n" +
" ;\n" +
"expr: expr ('*'|'/') expr # MulDiv\n" +
" | expr ('+'|'-') expr # AddSub\n" +
" | primary # prim\n" +
" ;\n" +
"primary" +
" : INT # int\n" +
" | ID # id\n" +
" | '(' expr ')' # parens\n" +
" ;" +
"\n" +
"MUL : '*' ; // assigns token name to '*' used above in grammar\n" +
"DIV : '/' ;\n" +
"ADD : '+' ;\n" +
"SUB : '-' ;\n" +
"RETURN : 'return' ;\n" +
"DEF: 'def';\n" +
"ID : [a-zA-Z]+ ; // match identifiers\n" +
"INT : [0-9]+ ; // match integers\n" +
"NEWLINE:'\\r'? '\\n' -> skip; // return newlines to parser (is end-statement signal)\n" +
"WS : [ \\t]+ -> skip ; // toss out whitespace\n";
String SAMPLE_PROGRAM =
"def f(x,y) { x = 3+4; y; ; }\n" +
"def g(x) { return 1+2*x; }\n";
Grammar g2 = new Grammar(exprGrammar);
LexerInterpreter g2LexerInterpreter = g2.createLexerInterpreter(new ANTLRInputStream(SAMPLE_PROGRAM));
CommonTokenStream tokens = new CommonTokenStream(g2LexerInterpreter);
ParserInterpreter parser = g2.createParserInterpreter(tokens);
parser.setBuildParseTree(true);
ParseTree tree = parser.parse(g2.rules.get("prog").index);
String xpath = "//DEF";
for (ParseTree t : XPath.findAll(tree, xpath, parser) ) {
System.out.println(t.getSourceInterval());
}

When I run your code, the following gets printed:
0..0
18..18
In other words, it works on my side ;)
This XPath tree pattern matching is all rather new, so my guess is that you've stumbled upon a bug that has since been fixed. I'm using ANTLR version 4.2.2.

Related

YACC and LEX, Getting syntax error at the end of line and can't figure out why

HERE IS MY YACC
%{
#include <stdio.h>
#include <ctype.h>
#include "lex.yy.c"
void yyerror (s) /* Called by yyparse on error */
char *s;
{
printf ("%s\n", s);
}
%}
%start program
%union{
int value;
char * string;
}
%token <value> NUM INT VOID WHILE IF THEN ELSE READ WRITE RETURN LE GE EQ NE
%token <string> ID
%token <value> INTEGER
%left '|'
%left '&'
%left '+' '-'
%left '*' '/' '%'
%left UMINUS
%%
program : decllist
{fprintf(stderr, "program");}
;
decllist : dec
{fprintf(stderr, "\n dec");}
| dec decllist
;
dec : vardec
{fprintf(stderr, "vardec");}
| fundec
{fprintf(stderr, "YEAH");}
;
typespec : INT
| VOID
;
vardec : typespec ID ';'
{fprintf(stderr, "yep");}
| typespec ID '[' NUM ']' ';'
{fprintf(stderr, "again");}
;
fundec : typespec ID '(' params ')' compoundstmt
;
params : VOID
| paramlist
;
paramlist : param
| param ',' paramlist
;
param : typespec ID
| typespec ID '['']'
;
compoundstmt : '{' localdeclerations statementlist '}'
;
localdeclerations :/* empty */
|vardec localdeclerations
;
statementlist : /* empty */
| statement statementlist
;
statement : expressionstmt
| compoundstmt
| selectionstmt
| iterationstmt
| assignmentstmt
| returnstmt
| readstmt
| writestmt
;
expressionstmt : expression ';'
| ';'
;
assignmentstmt : var '=' expressionstmt
;
selectionstmt : IF '(' expression ')' statement
| IF '(' expression ')' statement ELSE statement
;
iterationstmt : WHILE '(' expression ')' statement
;
returnstmt : RETURN ';'
| RETURN expression ';'
;
writestmt : WRITE expression ';'
;
readstmt : READ var ';'
;
expression : simpleexpression
;
var : ID
| ID '[' expression ']'
;
simpleexpression : additiveexpression
| additiveexpression relop simpleexpression
;
relop : LE
| '<'
| '>'
| GE
| EQ
| NE
;
additiveexpression : term
| term addop term
;
addop : '+'
| '-'
;
term : factor
| term multop factor
;
multop : '*'
| '/'
;
factor : '(' expression ')'
| NUM
| var
| call
;
call : ID '(' args ')'
;
args : arglist
| /* empty */
;
arglist : expression
| expression ',' arglist
;
%%
main(){
yyparse();
}
AND HERE IS MY LEX
%{
int mydebug=1;
int lineno=0;
#include "y.tab.h"
%}
%%
int {if (mydebug) fprintf(stderr, "int found\n");
return(INT);
}
num {if (mydebug) fprintf(stderr, "num found\n");
return(NUM);
}
void {if (mydebug) fprintf(stderr, "void found \n");
return(VOID);
}
while {if (mydebug) fprintf(stderr, "while found \n");
return(WHILE);
}
if {if (mydebug) fprintf(stderr, "if found \n");
return(IF);
}
then {if (mydebug) fprintf(stderr, "then found \n");
return(THEN);
}
else {if (mydebug) fprintf(stderr, "else found \n");
return(ELSE);
}
read {if (mydebug) fprintf(stderr, "read found \n");
return(READ);
}
write {if (mydebug) fprintf(stderr, "void found \n");
return(WRITE);
}
return {if (mydebug) fprintf(stderr, "void found \n");
return(RETURN);
}
'<=' {if (mydebug) fprintf(stderr, "void found \n");
return(LE);
}
'>=' {if (mydebug) fprintf(stderr, "void found \n");
return(GE);
}
'==' {if (mydebug) fprintf(stderr, "void found \n");
return(EQ);
}
'!=' {if (mydebug) fprintf(stderr, "void found \n");
return(NE);
}
[a-zA-Z][a-zA-Z0-9]* {if (mydebug) fprintf(stderr,"Letter found\n");
yylval.string=strdup(yytext); return(ID);}
[0-9][0-9]* {if (mydebug) fprintf(stderr,"Digit found\n");
yylval.value=atoi((const char *)yytext); return(NUM);}
[ \t] {if (mydebug) fprintf(stderr,"Whitespace found\n");}
[=\-+*/%&|()\[\]<>;] { if (mydebug) fprintf(stderr,"return a token %c\n",*yytext);
return (*yytext);}
\n { if (mydebug) fprintf(stderr,"cariage return %c\n",*yytext);
lineno++;
return (*yytext);}
%%
int yywrap(void)
{ return 1;}
If I type in something like 'int a;', it gets all the way to the newline and prints 'cariage return', but then stops and reports a syntax error at the end. Can anyone see why?
I have gone over this many times and can't find what keeps stopping it. I have been going back to a previous program to try to figure it out, but I'm stumped. Can anyone help?
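Your lexer is returning '\n' (newline) tokens at the end of the line, but your parser never accepts them, so you'll get a syntax error when the parser hits the first newline. Either drop the return from the \n rule so that newlines are skipped like other whitespace, or add the newline token to the grammar wherever it is allowed.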

combining jq conditions using and [duplicate]

How can I combine two jq conditions using 'and'?
test.json
{
"url": "https://<part1>.test/hai/<part1>",
"ParameterValue": "<value>"
}
jq --arg input1 "$arg1" --arg input2 "$arg2" \
'if .url | contains("<part1>")
then . + {"url" : ("https://" + $input1 + ".test/hai/" + $input1) }
else . end' and
'if .ParameterValue == "<value>"
then . + {"ParameterValue" : ($input2) }
else . end' test.json > test123.json
and is a boolean (logical) operator. What you want here is to create a pipeline using '|':
jq --arg input1 "$arg1" --arg input2 "$arg2" '
if .url | contains("<part1>")
then . + {url : ("https://" + $input1 + ".test/hai/" + $input1) }
else . end
| if .ParameterValue == "<value>"
then . + {ParameterValue : $input2 }
else . end' test.json > test123.json
Or maybe better:
def when(filter; action): if (filter?) // null then action else . end;
when(.url | contains("<part1>");
.url = ("https://" + $input1 + ".test/hai/" + $input1))
| when(.ParameterValue == "<value>";
.ParameterValue = $input2)

Processing lines of file in Ruby

I have a file like this:
file alldataset; append next;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
and I am trying to write a Ruby program that pushes anything that comes after a semicolon onto a new line. In addition, if a line has a 'do', the text after the 'do' should go on the following line, indented by two blanks, with any inner 'do' indented by four blanks, and so on.
I am very new to Ruby and my code so far is quite far from what I want. This is what I have:
def indent(text, num)
" "*num+" " + text
end
doc = File.open('newtext.txt')
doc.to_a.each do |line|
if line.downcase =~ /^(file).+(;)/i
puts line+"\n"
end
if line.downcase.include?('do')
puts indent(line, 2)
end
end
This is the desired output
file alldataset;
append next;
if file.first? do
line + "\n";
if !file.last? do
line.indent(2);
end;
end;
Any help would be appreciated.
As you are interested in parsing, here is a quickly made example to give you a taste. I have learned Lex/Yacc, Flex/Bison, ANTLR v3 and ANTLR v4, and I strongly recommend ANTLR4, which is very powerful. References:
the ANTLR site
The ANTLR mega tutorial
the expert book
StackOverflow -> Tags -> antlr
The following grammar can parse only the input example you have provided.
File Question.g4 :
grammar Question;
/* Simple grammar example to parse the following code :
file alldataset; append next; xyz;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
file file2; xyz;
*/
start
@init {System.out.println("Question last update 1048");}
: file* EOF
;
file
: FILE ID ';' statement_p*
;
statement_p
: statement
{System.out.println("Statement found : " + $statement.text);}
;
statement
: 'append' ID ';'
| if_statement
| other_statement
| 'end' ';'
;
if_statement
: 'if' expression 'do' expression ';'
;
other_statement
: ID ';'
;
expression
: receiver=( ID | FILE ) '.' method_call # Send
| expression '+' expression # Addition
| '!' expression # Negation
| atom # An_atom
;
method_call
: method_name=ID arguments?
;
arguments
: '(' ( argument ( ',' argument )* )? ')'
;
argument
: ID | NUMBER
;
atom
: ID
| FILE
| STRING
;
FILE : 'file' ;
ID : LETTER ( LETTER | DIGIT | '_' )* ( '?' | '!' )? ;
NUMBER : DIGIT+ ( ',' DIGIT+ )? ( '.' DIGIT+ )? ;
STRING : '"' .*? '"' ;
NL : ( [\r\n] | '\r\n' ) -> skip ;
WS : [ \t]+ -> channel(HIDDEN) ;
fragment DIGIT : [0-9] ;
fragment LETTER : [a-zA-Z] ;
File input.txt :
file alldataset; append next; xyz;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
file file2; xyz;
Execution :
$ export CLASSPATH=".:/usr/local/lib/antlr-4.6-complete.jar"
$ alias
alias a4='java -jar /usr/local/lib/antlr-4.6-complete.jar'
alias grun='java org.antlr.v4.gui.TestRig'
$ a4 Question.g4
$ javac Q*.java
$ grun Question start -tokens -diagnostics input.txt
[@0,0:0=' ',<WS>,channel=1,1:0]
[@1,1:4='file',<'file'>,1:1]
[@2,5:5=' ',<WS>,channel=1,1:5]
[@3,6:15='alldataset',<ID>,1:6]
[@4,16:16=';',<';'>,1:16]
[@5,17:17=' ',<WS>,channel=1,1:17]
[@6,18:23='append',<'append'>,1:18]
[@7,24:24=' ',<WS>,channel=1,1:24]
[@8,25:28='next',<ID>,1:25]
[@9,29:29=';',<';'>,1:29]
[@10,30:30=' ',<WS>,channel=1,1:30]
[@11,31:33='xyz',<ID>,1:31]
[@12,34:34=';',<';'>,1:34]
[@13,36:36=' ',<WS>,channel=1,2:0]
[@14,37:38='if',<'if'>,2:1]
[@15,39:39=' ',<WS>,channel=1,2:3]
[@16,40:43='file',<'file'>,2:4]
[@17,44:44='.',<'.'>,2:8]
[@18,45:50='first?',<ID>,2:9]
[@19,51:51=' ',<WS>,channel=1,2:15]
[@20,52:53='do',<'do'>,2:16]
[@21,54:54=' ',<WS>,channel=1,2:18]
[@22,55:58='line',<ID>,2:19]
[@23,59:59=' ',<WS>,channel=1,2:23]
[@24,60:60='+',<'+'>,2:24]
[@25,61:61=' ',<WS>,channel=1,2:25]
[@26,62:65='"\n"',<STRING>,2:26]
[@27,66:66=';',<';'>,2:30]
...
[@59,133:132='<EOF>',<EOF>,7:0]
Question last update 1048
Statement found : append next;
Statement found : xyz;
Statement found : if file.first? do line + "\n";
Statement found : if !file.last? do line.indent(2);
Statement found : end;
Statement found : end;
Statement found : xyz;
One advantage of ANTLR4 over previous versions or other parser generators is that the code is no longer scattered among the parser rules, but gathered in a separate listener. This is where you do the actual processing, such as producing a new reformatted file. It would take too long to show a complete example. Today you can write the listener in C++, C#, Python and other languages. As I don't know Java, I have set up some machinery using JRuby; see my forum answer.
In Ruby there are many ways to do things. So my solution is one among others.
File t.rb :
def print_indented(p_file, p_indent, p_text)
p_file.print p_indent
p_file.puts p_text
end
# recursively split the line at semicolon, as long as the rest is not empty
def partition_on_semicolon(p_line, p_answer, p_level)
puts "in partition_on_semicolon for level #{p_level} p_line=#{p_line} / p_answer=#{p_answer}"
first_segment, semi, rest = p_line.partition(';')
p_answer << first_segment + semi
partition_on_semicolon(rest.lstrip, p_answer, p_level + 1) unless rest.empty?
end
lines = IO.readlines('input.txt')
# Compute initial indentation, the indentation of the first line.
# This is to preserve the spaces which are in the input.
m = lines.first.match(/^( *)(.*)/)
initial_indent = ' ' * m[1].length
# initial_indent = '' # uncomment if the initial indentation needs not to be preserved
puts "initial_indent=<#{initial_indent}> length=#{initial_indent.length}"
level = 1
indentation = ' '
File.open('newtext.txt', 'w') do | output_file |
lines.each do | line |
line = line.chomp
line = line.lstrip # remove leading spaces
puts "---<#{line}>"
next_indent = initial_indent + indentation * (level - 1)
case
when line =~ /^file/ && line.count(';') > 1
level = 1 # restore, remove this if files can be indented
next_indent = initial_indent + indentation * (level - 1)
# split in count fragments
fragments = []
partition_on_semicolon(line, fragments, 1)
puts '---fragments :'
puts fragments.join('/')
print_indented(output_file, next_indent, fragments.first)
fragments[1..-1].each do | fragment |
print_indented(output_file, next_indent + indentation, fragment)
end
level += 1
when line.include?(' do ')
fragment1, _fdo, fragment2 = line.partition(' do ')
print_indented(output_file, next_indent, "#{fragment1} do")
print_indented(output_file, next_indent + indentation, fragment2)
level += 1
else
level -= 1 if line =~ /end;/
print_indented(output_file, next_indent, line)
end
end
end
File input.txt :
file alldataset; append next; xyz;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
file file2; xyz;
Execution :
$ ruby -w t.rb
initial_indent=< > length=1
---<file alldataset; append next; xyz;>
in partition_on_semicolon for level 1 p_line=file alldataset; append next; xyz; / p_answer=[]
in partition_on_semicolon for level 2 p_line=append next; xyz; / p_answer=["file alldataset;"]
in partition_on_semicolon for level 3 p_line=xyz; / p_answer=["file alldataset;", "append next;"]
---fragments :
file alldataset;/append next;/xyz;
---<if file.first? do line + "\n";>
---<if !file.last? do line.indent(2);>
---<end;>
---<end;>
---<file file2; xyz;>
in partition_on_semicolon for level 1 p_line=file file2; xyz; / p_answer=[]
in partition_on_semicolon for level 2 p_line=xyz; / p_answer=["file file2;"]
---fragments :
file file2;/xyz;
---<>
Output file newtext.txt :
file alldataset;
append next;
xyz;
if file.first? do
line + "\n";
if !file.last? do
line.indent(2);
end;
end;
file file2;
xyz;
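
The recursive partition_on_semicolon step in t.rb could also be sketched without recursion, using String#scan. This is only an illustration under assumptions: split_statements is a hypothetical helper name, empty statements such as ";;" are dropped, and semicolons inside string literals are not treated specially (the original code does not handle that case either).

```ruby
# Hypothetical helper (not part of t.rb): split a line into fragments,
# each ending with its ';' when one is present.
def split_statements(line)
  line.scan(/[^;]+;?/)   # a run of non-';' characters plus an optional ';'
      .map(&:strip)      # drop the blanks that followed each ';'
      .reject(&:empty?)  # ignore whitespace-only leftovers
end

fragments = split_statements('file alldataset; append next; xyz;')
puts fragments.first
fragments[1..-1].each { |fragment| puts '  ' + fragment }
```

The first fragment stays at the current indentation and the remaining ones are indented one level, which mirrors what the 'when line =~ /^file/' branch does with its fragments array.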

Mismatched input errors in antlr 3.5

I have problems with my grammar code in ANTLR 3.5. My input file is:
` define tcpChannel ChannelName
define
listener
ListnerProperty
end
listener ;
define
execution
request with format RequestFormat,
response with format ResponseFormat,
error with format ErrorFormat ,
call servicename.executionname
end define execution ;
end
define channel ;
`
My lexer code is as follows:
lexer grammar ChannelLexer;
// ***************** lexer rules:
Define
:
'define'
;
Tcpchannel
:
'tcphannel'
;
Listener
:
'Listener'
;
End
:
'end'
;
Execution
:
' execution '
;
Request
:
' request '
;
With
:
' with '
;
Format
:
' format '
;
Response
:
' response '
;
Error
:
' error '
;
Call
:
' call '
;
Channel
:
' channel '
;
Dot
:
'.'
;
SColon
:
';'
;
Comma
:
','
;
Value
:
(
'a'..'z'
|'A'..'Z'
|'_'
)
(
'a'..'z'
|'A'..'Z'
|'_'
|Digit
)*
;
fragment
String
:
(
'"'
(
~(
'"'
| '\\'
)
| '\\'
(
'\\'
| '"'
)
)*
'"'
| '\''
(
~(
'\''
| '\\'
)
| '\\'
(
'\\'
| '\''
)
)*
'\''
)
{
setText(getText().substring(1, getText().length() - 1).replaceAll("\\\\(.)",
"$1"));
}
;
fragment
Digit
:
'0'..'9'
;
Space
:
(
' '
| '\t'
| '\r'
| '\n'
| '\u000C'
)
{
skip();
}
;
My parser code is:
parser grammar ChannelParser;
options
{
// antlr will generate java lexer and parser
language = Java;
// generated parser should create abstract syntax tree
output = AST;
}
// ***************** parser rules:
//our grammar accepts only salutation followed by an end symbol
expression
:
tcpChannelDefinition listenerDefinition executionDefintion endchannel
;
tcpChannelDefinition
:
Define Tcpchannel channelName
;
channelName
:
i= Value
{
$i.setText("CHANNEL_NAME#" + $i.text);
}
;
listenerDefinition
:
Define Listener listenerProperty endListener
;
listenerProperty
:
i=Value
{
$i.setText("PROPERTY_VALUE#" + $i.text);
}
;
endListener
:
End Listener SColon
;
executionDefintion
:
Define Execution execution
;
execution
:
Request With Format requestValue Comma
Response With Format responseValue Comma
Error With Format errorValue Comma
Call servicename Dot executionname
;
requestValue
:
i=Value
{
$i.setText("REQUEST_FORMAT#" + $i.text);
}
;
responseValue
:
i=Value
{
$i.setText("RESPONSE_FORMAT#" + $i.text);
}
;
errorValue
:
i=Value
{
$i.setText("ERROR_FORMAT#" + $i.text);
}
;
servicename
:
i=Value
{
$i.setText("SERVICE_NAME#" + $i.text);
}
;
executionname
:
i=Value
{
$i.setText("OPERATION_NAME#" + $i.text);
}
;
endexecution
:
End Define Execution SColon
;
endchannel
:
End Channel SColon
;
I'm getting errors like "missing Tcpchannel at 'tcpChannel'" and "extraneous input 'ChannelName' expecting Define". How can I correct them? Please help.

Error when trying to get the length of a variable in Ruby

I have a header method that writes a standard header to all text files I throw its way. Here's the code:
def header(file, description)
File.open(file, 'w') do |out|
out.puts ".------------------------------------------------------------------------."
out.puts "| File generated by: " + Etc.getlogin() + " " * (52-Etc.getlogin().length) + "|"
out.puts "| File generated at: " + Time.now().to_s() + " |"
out.puts ".------------------------------------------------------------------------."
out.puts "| DESCRIPTION: |"
out.puts "| " + description.to_s + " " * (70-description.length) + "|"
out.puts "| |"
out.puts "'------------------------------------------------------------------------'"
out.puts "=====FINDINGS====="
end
end
So when I run the following call statement:
header('01httpserver.txt', "This file details all configuration files with where http servers are concerned.")
I get the following error:
cis.rb:63:in `*': negative argument (ArgumentError)
from cis.rb:63:in `block in header'
from cis.rb:57:in `open'
from cis.rb:57:in `header'
from cis.rb:70:in `<main>'
Line 63 is this line:
out.puts "| " + description.to_s + " " * (70-description.length) + "|"
What am I doing wrong?
The problem is your description is longer than 70 characters and so you're multiplying the string by a negative value, which is not allowed.
To fix it change the line with the error to this:
out.puts "| " + description.to_s + " " * [0, (70-description.length)].max + "|"
If you want to do string padding, which it seems you're doing here, why not use the sprintf notation?
out.puts "| %-70s|" % description.to_s
This uses the String#% method.
The %-70s means pad the string to 70 characters, left aligned. Without the - it is right aligned.
Any values that are too long will overflow this spot. To handle that:
out.puts "| %-70s|" % description.to_s[0,70]
That should limit it to the first 70 characters. Within the Rails environment there's a method called truncate that can add on an ellipsis to show truncation occurred.
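
Both answers can be combined into one small helper. The following is a minimal sketch, assuming a fixed inner width of 70 characters as in the question; boxed_line and WIDTH are hypothetical names, not part of the original script. It clips the text first, so the padding count can never go negative, and then pads with String#ljust.

```ruby
# Hypothetical helper (not from the original script): build one header line
# whose text area is always exactly `width` characters wide.
WIDTH = 70

def boxed_line(text, width = WIDTH)
  clipped = text.to_s[0, width]      # keep at most `width` characters
  "| " + clipped.ljust(width) + "|"  # pad on the right with spaces
end

puts boxed_line("short description")
puts boxed_line("This file details all configuration files with where http servers are concerned.")
```

For text that fits, this gives the same line as "| %-70s|" % text; longer text is cut off rather than overflowing the box.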
