Mismatched input errors in antlr 3.5 - antlr3

I have problems with my grammar in ANTLR 3.5. My input file is:
` define tcpChannel ChannelName
define
listener
ListnerProperty
end
listener ;
define
execution
request with format RequestFormat,
response with format ResponseFormat,
error with format ErrorFormat ,
call servicename.executionname
end define execution ;
end
define channel ;
`
My lexer code is as follows:
lexer grammar ChannelLexer;
// ***************** lexer rules:
Define
:
'define'
;
Tcpchannel
:
'tcphannel'
;
Listener
:
'Listener'
;
End
:
'end'
;
Execution
:
' execution '
;
Request
:
' request '
;
With
:
' with '
;
Format
:
' format '
;
Response
:
' response '
;
Error
:
' error '
;
Call
:
' call '
;
Channel
:
' channel '
;
Dot
:
'.'
;
SColon
:
';'
;
Comma
:
','
;
Value
:
(
'a'..'z'
|'A'..'Z'
|'_'
)
(
'a'..'z'
|'A'..'Z'
|'_'
|Digit
)*
;
fragment
String
:
(
'"'
(
~(
'"'
| '\\'
)
| '\\'
(
'\\'
| '"'
)
)*
'"'
| '\''
(
~(
'\''
| '\\'
)
| '\\'
(
'\\'
| '\''
)
)*
'\''
)
{
setText(getText().substring(1, getText().length() - 1).replaceAll("\\\\(.)",
"$1"));
}
;
fragment
Digit
:
'0'..'9'
;
Space
:
(
' '
| '\t'
| '\r'
| '\n'
| '\u000C'
)
{
skip();
}
;
My parser code is:
parser grammar ChannelParser;
options
{
// antlr will generate java lexer and parser
language = Java;
// generated parser should create abstract syntax tree
output = AST;
}
// ***************** parser rules:
//our grammar accepts only salutation followed by an end symbol
expression
:
tcpChannelDefinition listenerDefinition executionDefintion endchannel
;
tcpChannelDefinition
:
Define Tcpchannel channelName
;
channelName
:
i= Value
{
$i.setText("CHANNEL_NAME#" + $i.text);
}
;
listenerDefinition
:
Define Listener listenerProperty endListener
;
listenerProperty
:
i=Value
{
$i.setText("PROPERTY_VALUE#" + $i.text);
}
;
endListener
:
End Listener SColon
;
executionDefintion
:
Define Execution execution
;
execution
:
Request With Format requestValue Comma
Response With Format responseValue Comma
Error With Format errorValue Comma
Call servicename Dot executionname
;
requestValue
:
i=Value
{
$i.setText("REQUEST_FORMAT#" + $i.text);
}
;
responseValue
:
i=Value
{
$i.setText("RESPONSE_FORMAT#" + $i.text);
}
;
errorValue
:
i=Value
{
$i.setText("ERROR_FORMAT#" + $i.text);
}
;
servicename
:
i=Value
{
$i.setText("SERVICE_NAME#" + $i.text);
}
;
executionname
:
i=Value
{
$i.setText("OPERATION_NAME#" + $i.text);
}
;
endexecution
:
End Define Execution SColon
;
endchannel
:
End Channel SColon
;
I'm getting errors like "missing Tcpchannel at 'tcpChannel'" and "extraneous input 'ChannelName' expecting Define". How can I correct them? Any help is appreciated.
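Judging only from the grammar shown, one likely cause is that the lexer literals do not match the input text: 'tcphannel' is missing a 'C', 'Listener' is capitalized while the input has 'listener', and several keywords are written with padding spaces (' execution '). A word that matches no keyword literal falls through to the Value rule, which is consistent with the reported "missing Tcpchannel at 'tcpChannel'". The sketch below is not ANTLR. It is a simplified Python model of keyword-versus-Value classification, and the "corrected" literal table is an assumption about what was intended.

```python
import re

# Hypothetical, simplified model of the lexer: each word is either a keyword
# token (exact literal match) or falls through to the generic Value token.
WORD = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")

def tokenize(text, keywords):
    """Return (token_name, text) pairs for every word in the input."""
    return [(keywords.get(word, "Value"), word) for word in WORD.findall(text)]

# Literals exactly as written in the question's lexer.
AS_WRITTEN = {
    "define": "Define",
    "tcphannel": "Tcpchannel",     # typo: the input says tcpChannel
    "Listener": "Listener",        # capital L: the input says listener
    "end": "End",
    " execution ": "Execution",    # padded with spaces, never equals a bare word
}

# Assumed correction: literals spelled exactly like the sample input.
CORRECTED = {
    "define": "Define",
    "tcpChannel": "Tcpchannel",
    "listener": "Listener",
    "end": "End",
    "execution": "Execution",
}

SAMPLE = "define tcpChannel ChannelName define listener ListnerProperty end listener"

if __name__ == "__main__":
    for name, table in (("as written", AS_WRITTEN), ("corrected", CORRECTED)):
        print(name, [token for token, _ in tokenize(SAMPLE, table)])
```

The parser may need a matching review: the sample input ends with "end define channel ;" while endchannel expects End Channel SColon, and endexecution is never referenced by any other rule.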

Related

How to replace the values of a Param

How can I replace the values of URL parameters one at a time?
For example, given the URL
https://example.com/?p=first&q=second&r=third
first I want to set the p parameter to '123':
https://example.com/?p=123&q=second&r=third
Then, starting from the original URL again, I want to change a different parameter, such as q:
https://example.com/?p=first&q=123&r=third
And again from the original URL, with the r parameter:
https://example.com/?p=first&q=second&r=123
What I tried:
while read line; do
first_part=`echo $line | cut -d'=' -f1` second_part=`echo $line | cut -d'=' -f2`
echo "${first_part}=123${second_part}"
echo "${first_part}${second_part}=123"
done < urls.txt
This problem is a good fit for AWK. The demo script below creates a sample URL list and a mapping file, then applies the mappings to every URL.
With this approach the parameters "float freely": matching does not depend on a parameter's position in the URL string.
Parameter values can also be strings of any length.
#!/bin/bash
#QUESTION: https://stackoverflow.com/questions/75124190/how-to-replace-the-values-of-a-param
cat >URL.list <<"EnDoFiNpUt"
https://example.com/?p=first&q=second&r=third
https://example.com/?r=zinger
https://example.com/?r=bonkers&q=junk&p=wacko
https://example.com/?p=flyer
EnDoFiNpUt
cat >mapfile.txt <<"EnDoFiNpUt"
q=SECOND
r=THIRD
p=FIRST
EnDoFiNpUt
awk -v datFile="mapfile.txt" 'BEGIN{
## Initial loading of the mapping file into array for comparison
split( "", transforms ) ;
indexT=0 ;
while( getline < datFile ){
indexT++ ;
transforms[indexT]=$0 ;
} ;
}
{
### Split off beginning of URL from parameters
qPos=index( $0, "?" ) ;
beg=substr( $0, 1, qPos ) ;
### Load URL elements into array for comparison
rem=substr( $0, qPos+1 ) ;
n=split( rem, parts, "&" ) ;
### Match and Map transforms elements with URL parts
for( k=1 ; k<= indexT ; k++ ){
dPos=index( transforms[k], "=" ) ;
fieldPref=substr( transforms[k], 1, dPos ) ;
for( i=1 ; i<=n ; i++ ){
if( parts[i] ~ fieldPref ){
parts[i]=transforms[k] ;
} ;
} ;
} ;
### Print transformed URL
printf("%s%s", beg, parts[1] ) ;
for( i=2 ; i<=n ; i++ ){
printf("&%s", parts[i] ) ;
} ;
print "" ;
}' URL.list
The output looks like this:
https://example.com/?p=FIRST&q=SECOND&r=THIRD
https://example.com/?r=THIRD
https://example.com/?r=THIRD&q=SECOND&p=FIRST
https://example.com/?p=FIRST
URL query parameters are usually treated as order-independent, so you can simply append p's new value at the tail instead of keeping its original position:
echo 'https://example.com/?p=first&q=second&r=third' |
mawk NF=NF FS='p=[^&]*[&]?' OFS= ORS='&p=123\n'
https://example.com/?q=second&r=third&p=123
The same works for q=.
If you're modifying r= instead (the last parameter), set both FS and OFS to "=" and treat it as a plain value update of $NF.
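For the step-by-step replacement the question describes (one output URL per parameter, each time starting from the original URL), a minimal sketch in Python could look like the following. The function name replace_each_param is made up for illustration. The sketch assumes the parameter values need no special escaping beyond what urlencode applies.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def replace_each_param(url, new_value):
    """Return one URL per query parameter, with only that parameter's value replaced."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    results = []
    for target in range(len(pairs)):
        # Rebuild the query, swapping in new_value at position `target` only.
        changed = [(key, new_value if i == target else value)
                   for i, (key, value) in enumerate(pairs)]
        results.append(urlunsplit(parts._replace(query=urlencode(changed))))
    return results

if __name__ == "__main__":
    for variant in replace_each_param("https://example.com/?p=first&q=second&r=third", "123"):
        print(variant)
```

Parsing the query string rather than cutting on '=' keeps the parameter order and does not depend on how many parameters the URL has.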

YACC and LEX, Getting syntax error at the end of line and can't figure out why

Here is my yacc grammar:
%{
#include <stdio.h>
#include <ctype.h>
#include "lex.yy.c"
void yyerror (s) /* Called by yyparse on error */
char *s;
{
printf ("%s\n", s);
}
%}
%start program
%union{
int value;
char * string;
}
%token <value> NUM INT VOID WHILE IF THEN ELSE READ WRITE RETURN LE GE EQ NE
%token <string> ID
%token <value> INTEGER
%left '|'
%left '&'
%left '+' '-'
%left '*' '/' '%'
%left UMINUS
%%
program : decllist
{fprintf(stderr, "program");}
;
decllist : dec
{fprintf(stderr, "\n dec");}
| dec decllist
;
dec : vardec
{fprintf(stderr, "vardec");}
| fundec
{fprintf(stderr, "YEAH");}
;
typespec : INT
| VOID
;
vardec : typespec ID ';'
{fprintf(stderr, "yep");}
| typespec ID '[' NUM ']' ';'
{fprintf(stderr, "again");}
;
fundec : typespec ID '(' params ')' compoundstmt
;
params : VOID
| paramlist
;
paramlist : param
| param ',' paramlist
;
param : typespec ID
| typespec ID '['']'
;
compoundstmt : '{' localdeclerations statementlist '}'
;
localdeclerations :/* empty */
|vardec localdeclerations
;
statementlist : /* empty */
| statement statementlist
;
statement : expressionstmt
| compoundstmt
| selectionstmt
| iterationstmt
| assignmentstmt
| returnstmt
| readstmt
| writestmt
;
expressionstmt : expression ';'
| ';'
;
assignmentstmt : var '=' expressionstmt
;
selectionstmt : IF '(' expression ')' statement
| IF '(' expression ')' statement ELSE statement
;
iterationstmt : WHILE '(' expression ')' statement
;
returnstmt : RETURN ';'
| RETURN expression ';'
;
writestmt : WRITE expression ';'
;
readstmt : READ var ';'
;
expression : simpleexpression
;
var : ID
| ID '[' expression ']'
;
simpleexpression : additiveexpression
| additiveexpression relop simpleexpression
;
relop : LE
| '<'
| '>'
| GE
| EQ
| NE
;
additiveexpression : term
| term addop term
;
addop : '+'
| '-'
;
term : factor
| term multop factor
;
multop : '*'
| '/'
;
factor : '(' expression ')'
| NUM
| var
| call
;
call : ID '(' args ')'
;
args : arglist
| /* empty */
;
arglist : expression
| expression ',' arglist
;
%%
main(){
yyparse();
}
And here is my lex file:
%{
int mydebug=1;
int lineno=0;
#include "y.tab.h"
%}
%%
int {if (mydebug) fprintf(stderr, "int found\n");
return(INT);
}
num {if (mydebug) fprintf(stderr, "num found\n");
return(NUM);
}
void {if (mydebug) fprintf(stderr, "void found \n");
return(VOID);
}
while {if (mydebug) fprintf(stderr, "while found \n");
return(WHILE);
}
if {if (mydebug) fprintf(stderr, "if found \n");
return(IF);
}
then {if (mydebug) fprintf(stderr, "then found \n");
return(THEN);
}
else {if (mydebug) fprintf(stderr, "else found \n");
return(ELSE);
}
read {if (mydebug) fprintf(stderr, "read found \n");
return(READ);
}
write {if (mydebug) fprintf(stderr, "void found \n");
return(WRITE);
}
return {if (mydebug) fprintf(stderr, "void found \n");
return(RETURN);
}
'<=' {if (mydebug) fprintf(stderr, "void found \n");
return(LE);
}
'>=' {if (mydebug) fprintf(stderr, "void found \n");
return(GE);
}
'==' {if (mydebug) fprintf(stderr, "void found \n");
return(EQ);
}
'!=' {if (mydebug) fprintf(stderr, "void found \n");
return(NE);
}
[a-zA-Z][a-zA-Z0-9]* {if (mydebug) fprintf(stderr,"Letter found\n");
yylval.string=strdup(yytext); return(ID);}
[0-9][0-9]* {if (mydebug) fprintf(stderr,"Digit found\n");
yylval.value=atoi((const char *)yytext); return(NUM);}
[ \t] {if (mydebug) fprintf(stderr,"Whitespace found\n");}
[=\-+*/%&|()\[\]<>;] { if (mydebug) fprintf(stderr,"return a token %c\n",*yytext);
return (*yytext);}
\n { if (mydebug) fprintf(stderr,"cariage return %c\n",*yytext);
lineno++;
return (*yytext);}
%%
int yywrap(void)
{ return 1;}
If I type in something like 'int a;' it gets all the way to the newline and prints 'cariage return', but then it stops and reports a syntax error at the end. Can anyone see why?
I have gone over this many times and can't find what keeps stopping it. I am going back through a previous program to see whether that helps, but I'm stumped. Can anyone help?
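Your lexer returns '\n' (newline) tokens at the end of each line, but no rule in your grammar accepts them, so the parser reports a syntax error as soon as it sees the first newline. Either drop the return (*yytext); from the \n rule so that newlines are skipped like other whitespace, or add newline handling to the grammar.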
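The effect can be reproduced outside lex and yacc. The sketch below is a deliberately tiny Python model, not the question's grammar: its hypothetical parser accepts only a sequence of INT ID ';' declarations, and the lexer has a switch that either returns newline tokens or skips them.

```python
import re

# One alternative per token kind; WS and NL mirror the question's whitespace rules.
TOKEN_RE = re.compile(
    r"(?P<INT>int\b)|(?P<ID>[A-Za-z][A-Za-z0-9]*)|(?P<SEMI>;)|(?P<NL>\n)|(?P<WS>[ \t]+)"
)

def lex(text, return_newlines):
    """Return the list of token kinds; spaces and tabs are always skipped."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        kind = match.lastgroup
        if kind == "WS":
            continue
        if kind == "NL" and not return_newlines:
            continue  # the fix: newlines never reach the parser
        tokens.append(kind)
    return tokens

def parse_vardecs(tokens):
    """Accept one or more `INT ID SEMI` groups; any other token is a syntax error."""
    if not tokens:
        return False
    pos = 0
    while pos < len(tokens):
        if tokens[pos:pos + 3] != ["INT", "ID", "SEMI"]:
            return False
        pos += 3
    return True

if __name__ == "__main__":
    source = "int a;\n"
    print(parse_vardecs(lex(source, return_newlines=True)))   # newline token reaches the parser
    print(parse_vardecs(lex(source, return_newlines=False)))  # newline skipped in the lexer
```

With newline tokens returned, the parser sees a token that no rule mentions and rejects the input. With newlines skipped, the same input is accepted.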

combining jq conditions using and [duplicate]

How can I combine two jq conditions using 'and'?
test.json
{
"url": "https://<part1>.test/hai/<part1>",
"ParameterValue": "<value>"
}
jq --arg input1 "$arg1" --arg input2 "$arg2" \
'if .url | contains("<part1>")
then . + {"url" : ("https://" + $input1 + ".test/hai/" + $input1) }
else . end' and
'if .ParameterValue == "<value>"
then . + {"ParameterValue" : ($input2) }
else . end' test.json > test123.json
'and' is a boolean (logical) operator, so it cannot join two filters. What you want here is a pipeline built with '|', which feeds the output of the first filter into the second:
jq --arg input1 "$arg1" --arg input2 "$arg2" '
if .url | contains("<part1>")
then . + {url : ("https://" + $input1 + ".test/hai/" + $input1) }
else . end
| if .ParameterValue == "<value>"
then . + {ParameterValue : $input2 }
else . end' test.json > test123.json
Or maybe better:
def when(filter; action): if (filter?) // null then action else . end;
when(.url | contains("<part1>");
.url = ("https://" + $input1 + ".test/hai/" + $input1))
| when(.ParameterValue == "<value>";
.ParameterValue = $input2)
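The same idea, chaining conditional updates so that the result of the first step is the input of the second, can be sketched in Python. The helper names when, pipeline and build_transform are made up for illustration and only mirror the shape of the jq definition above. Treating a missing key as "condition not met" is an assumption.

```python
def when(condition, action):
    """Build a step that applies `action` only if `condition` holds for the document."""
    def step(doc):
        try:
            matched = condition(doc)
        except (KeyError, TypeError):
            matched = False  # a missing or mistyped field counts as "no match"
        return action(doc) if matched else doc
    return step

def pipeline(*steps):
    """Compose steps left to right, like jq's `|` operator."""
    def run(doc):
        for step in steps:
            doc = step(doc)
        return doc
    return run

def build_transform(input1, input2):
    return pipeline(
        when(lambda d: "<part1>" in d["url"],
             lambda d: {**d, "url": f"https://{input1}.test/hai/{input1}"}),
        when(lambda d: d["ParameterValue"] == "<value>",
             lambda d: {**d, "ParameterValue": input2}),
    )

if __name__ == "__main__":
    doc = {"url": "https://<part1>.test/hai/<part1>", "ParameterValue": "<value>"}
    print(build_transform("abc", "xyz")(doc))
```

Each step returns a new dictionary instead of mutating its input, which keeps the steps independent of each other.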

Processing lines of file in Ruby

I have a file like this:
file alldataset; append next;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
and I am trying to write a Ruby program that moves anything following a semicolon onto a new line. In addition, if a line contains a 'do', the text after the 'do' should go on the following line, indented by two spaces; an inner 'do' should be indented by four spaces, and so on.
I am very new to Ruby and my code so far is quite far from what I want. This is what I have:
def indent(text, num)
" "*num+" " + text
end
doc = File.open('newtext.txt')
doc.to_a.each do |line|
if line.downcase =~ /^(file).+(;)/i
puts line+"\n"
end
if line.downcase.include?('do')
puts indent(line, 2)
end
end
This is the desired output
file alldataset;
append next;
if file.first? do
line + "\n";
if !file.last? do
line.indent(2);
end;
end;
Any help would be appreciated.
As you are interested in parsing, here is a quickly made example, just to give you a taste. I have learned Lex/Yacc, Flex/Bison, ANTLR v3 and ANTLR v4, and I strongly recommend ANTLR4, which is very powerful. References:
the ANTLR site
The ANTLR mega tutorial
the expert book
StackOverflow -> Tags -> antlr
The following grammar can parse only the input example you have provided.
File Question.g4 :
grammar Question;
/* Simple grammar example to parse the following code :
file alldataset; append next; xyz;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
file file2; xyz;
*/
start
@init {System.out.println("Question last update 1048");}
: file* EOF
;
file
: FILE ID ';' statement_p*
;
statement_p
: statement
{System.out.println("Statement found : " + $statement.text);}
;
statement
: 'append' ID ';'
| if_statement
| other_statement
| 'end' ';'
;
if_statement
: 'if' expression 'do' expression ';'
;
other_statement
: ID ';'
;
expression
: receiver=( ID | FILE ) '.' method_call # Send
| expression '+' expression # Addition
| '!' expression # Negation
| atom # An_atom
;
method_call
: method_name=ID arguments?
;
arguments
: '(' ( argument ( ',' argument )* )? ')'
;
argument
: ID | NUMBER
;
atom
: ID
| FILE
| STRING
;
FILE : 'file' ;
ID : LETTER ( LETTER | DIGIT | '_' )* ( '?' | '!' )? ;
NUMBER : DIGIT+ ( ',' DIGIT+ )? ( '.' DIGIT+ )? ;
STRING : '"' .*? '"' ;
NL : ( [\r\n] | '\r\n' ) -> skip ;
WS : [ \t]+ -> channel(HIDDEN) ;
fragment DIGIT : [0-9] ;
fragment LETTER : [a-zA-Z] ;
File input.txt :
file alldataset; append next; xyz;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
file file2; xyz;
Execution :
$ export CLASSPATH=".:/usr/local/lib/antlr-4.6-complete.jar"
$ alias
alias a4='java -jar /usr/local/lib/antlr-4.6-complete.jar'
alias grun='java org.antlr.v4.gui.TestRig'
$ a4 Question.g4
$ javac Q*.java
$ grun Question start -tokens -diagnostics input.txt
[@0,0:0=' ',<WS>,channel=1,1:0]
[@1,1:4='file',<'file'>,1:1]
[@2,5:5=' ',<WS>,channel=1,1:5]
[@3,6:15='alldataset',<ID>,1:6]
[@4,16:16=';',<';'>,1:16]
[@5,17:17=' ',<WS>,channel=1,1:17]
[@6,18:23='append',<'append'>,1:18]
[@7,24:24=' ',<WS>,channel=1,1:24]
[@8,25:28='next',<ID>,1:25]
[@9,29:29=';',<';'>,1:29]
[@10,30:30=' ',<WS>,channel=1,1:30]
[@11,31:33='xyz',<ID>,1:31]
[@12,34:34=';',<';'>,1:34]
[@13,36:36=' ',<WS>,channel=1,2:0]
[@14,37:38='if',<'if'>,2:1]
[@15,39:39=' ',<WS>,channel=1,2:3]
[@16,40:43='file',<'file'>,2:4]
[@17,44:44='.',<'.'>,2:8]
[@18,45:50='first?',<ID>,2:9]
[@19,51:51=' ',<WS>,channel=1,2:15]
[@20,52:53='do',<'do'>,2:16]
[@21,54:54=' ',<WS>,channel=1,2:18]
[@22,55:58='line',<ID>,2:19]
[@23,59:59=' ',<WS>,channel=1,2:23]
[@24,60:60='+',<'+'>,2:24]
[@25,61:61=' ',<WS>,channel=1,2:25]
[@26,62:65='"\n"',<STRING>,2:26]
[@27,66:66=';',<';'>,2:30]
...
[@59,133:132='<EOF>',<EOF>,7:0]
Question last update 1048
Statement found : append next;
Statement found : xyz;
Statement found : if file.first? do line + "\n";
Statement found : if !file.last? do line.indent(2);
Statement found : end;
Statement found : end;
Statement found : xyz;
One advantage of ANTLR4 over previous versions and other parser generators is that the processing code is no longer scattered among the parser rules but gathered in a separate listener. This is where you do the actual work, such as producing a new, reformatted file. A complete example would be too long to show here. Today you can write the listener in C++, C#, Python and other languages. As I don't know Java, I use a setup based on JRuby; see my forum answer.
In Ruby there are many ways to do things, so my solution is only one among others.
File t.rb :
def print_indented(p_file, p_indent, p_text)
p_file.print p_indent
p_file.puts p_text
end
# recursively split the line at semicolon, as long as the rest is not empty
def partition_on_semicolon(p_line, p_answer, p_level)
puts "in partition_on_semicolon for level #{p_level} p_line=#{p_line} / p_answer=#{p_answer}"
first_segment, semi, rest = p_line.partition(';')
p_answer << first_segment + semi
partition_on_semicolon(rest.lstrip, p_answer, p_level + 1) unless rest.empty?
end
lines = IO.readlines('input.txt')
# Compute initial indentation, the indentation of the first line.
# This is to preserve the spaces which are in the input.
m = lines.first.match(/^( *)(.*)/)
initial_indent = ' ' * m[1].length
# initial_indent = '' # uncomment if the initial indentation needs not to be preserved
puts "initial_indent=<#{initial_indent}> length=#{initial_indent.length}"
level = 1
indentation = ' '
File.open('newtext.txt', 'w') do | output_file |
lines.each do | line |
line = line.chomp
line = line.lstrip # remove leading spaces
puts "---<#{line}>"
next_indent = initial_indent + indentation * (level - 1)
case
when line =~ /^file/ && line.count(';') > 1
level = 1 # restore, remove this if files can be indented
next_indent = initial_indent + indentation * (level - 1)
# split in count fragments
fragments = []
partition_on_semicolon(line, fragments, 1)
puts '---fragments :'
puts fragments.join('/')
print_indented(output_file, next_indent, fragments.first)
fragments[1..-1].each do | fragment |
print_indented(output_file, next_indent + indentation, fragment)
end
level += 1
when line.include?(' do ')
fragment1, _fdo, fragment2 = line.partition(' do ')
print_indented(output_file, next_indent, "#{fragment1} do")
print_indented(output_file, next_indent + indentation, fragment2)
level += 1
else
level -= 1 if line =~ /end;/
print_indented(output_file, next_indent, line)
end
end
end
File input.txt :
file alldataset; append next; xyz;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
file file2; xyz;
Execution :
$ ruby -w t.rb
initial_indent=< > length=1
---<file alldataset; append next; xyz;>
in partition_on_semicolon for level 1 p_line=file alldataset; append next; xyz; / p_answer=[]
in partition_on_semicolon for level 2 p_line=append next; xyz; / p_answer=["file alldataset;"]
in partition_on_semicolon for level 3 p_line=xyz; / p_answer=["file alldataset;", "append next;"]
---fragments :
file alldataset;/append next;/xyz;
---<if file.first? do line + "\n";>
---<if !file.last? do line.indent(2);>
---<end;>
---<end;>
---<file file2; xyz;>
in partition_on_semicolon for level 1 p_line=file file2; xyz; / p_answer=[]
in partition_on_semicolon for level 2 p_line=xyz; / p_answer=["file file2;"]
---fragments :
file file2;/xyz;
---<>
Output file newtext.txt :
file alldataset;
append next;
xyz;
if file.first? do
line + "\n";
if !file.last? do
line.indent(2);
end;
end;
file file2;
xyz;
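For comparison, the core of the transformation (split on semicolons, open a level after 'do', close one at 'end;') fits in a few lines. The sketch below is in Python rather than Ruby and rests on several assumptions that are not taken from the question: two spaces per level, only 'do' opens a level, and no semicolon or ' do ' ever appears inside a string literal.

```python
def reformat(text, indent="  "):
    """Put each statement on its own line and indent the bodies of `do` blocks."""
    out = []
    level = 0
    for raw in text.splitlines():
        # Split the line into statements and restore each terminating semicolon.
        statements = [part.strip() + ";" for part in raw.split(";") if part.strip()]
        for stmt in statements:
            if stmt.startswith("end"):
                level = max(level - 1, 0)  # `end;` closes the innermost block
            head, sep, body = stmt.partition(" do ")
            if sep:
                out.append(indent * level + head + " do")
                level += 1  # everything up to the matching `end;` is one level deeper
                out.append(indent * level + body)
            else:
                out.append(indent * level + stmt)
    return "\n".join(out)

if __name__ == "__main__":
    sample = "\n".join([
        "file alldataset; append next;",
        r'if file.first? do line + "\n";',
        "if !file.last? do line.indent(2);",
        "end;",
        "end;",
    ])
    print(reformat(sample))
```

A real tool would need a tokenizer, or a grammar such as the ANTLR one above, to handle string literals and keywords safely.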

Why does this //ID pass but //DEF fails?

I'm experimenting with XPath using the grammar provided in the test suite. The path //ID is recognized, but //DEF is not found: an IllegalArgumentException is thrown with the message "DEF at index 2 isn't a valid token name". Why is //ID matched, but //DEF not?
String exprGrammar = "grammar Expr;\n" +
"prog: func+ ;\n" +
"func: DEF ID '(' arg (',' arg)* ')' body ;\n" +
"body: '{' stat+ '}' ;\n" +
"arg : ID ;\n" +
"stat: expr ';' # printExpr\n" +
" | ID '=' expr ';' # assign\n" +
" | 'return' expr ';' # ret\n" +
" | ';' # blank\n" +
" ;\n" +
"expr: expr ('*'|'/') expr # MulDiv\n" +
" | expr ('+'|'-') expr # AddSub\n" +
" | primary # prim\n" +
" ;\n" +
"primary" +
" : INT # int\n" +
" | ID # id\n" +
" | '(' expr ')' # parens\n" +
" ;" +
"\n" +
"MUL : '*' ; // assigns token name to '*' used above in grammar\n" +
"DIV : '/' ;\n" +
"ADD : '+' ;\n" +
"SUB : '-' ;\n" +
"RETURN : 'return' ;\n" +
"DEF: 'def';\n" +
"ID : [a-zA-Z]+ ; // match identifiers\n" +
"INT : [0-9]+ ; // match integers\n" +
"NEWLINE:'\\r'? '\\n' -> skip; // return newlines to parser (is end-statement signal)\n" +
"WS : [ \\t]+ -> skip ; // toss out whitespace\n";
String SAMPLE_PROGRAM =
"def f(x,y) { x = 3+4; y; ; }\n" +
"def g(x) { return 1+2*x; }\n";
Grammar g2 = new Grammar(exprGrammar);
LexerInterpreter g2LexerInterpreter = g2.createLexerInterpreter(new ANTLRInputStream(SAMPLE_PROGRAM));
CommonTokenStream tokens = new CommonTokenStream(g2LexerInterpreter);
ParserInterpreter parser = g2.createParserInterpreter(tokens);
parser.setBuildParseTree(true);
ParseTree tree = parser.parse(g2.rules.get("prog").index);
String xpath = "//DEF";
for (ParseTree t : XPath.findAll(tree, xpath, parser) ) {
System.out.println(t.getSourceInterval());
}
When I run your code, the following gets printed:
0..0
18..18
In other words, it works for me ;)
This XPath tree pattern matching is all rather new, so my guess is that you've stumbled upon a bug that has since been fixed. I'm using ANTLR version 4.2.2.
