YACC and LEX, Getting syntax error at the end of line and can't figure out why

Here is my yacc file:
%{
#include <stdio.h>
#include <ctype.h>
#include "lex.yy.c"
void yyerror (s) /* Called by yyparse on error */
char *s;
{
printf ("%s\n", s);
}
%}
%start program
%union{
int value;
char * string;
}
%token <value> NUM INT VOID WHILE IF THEN ELSE READ WRITE RETURN LE GE EQ NE
%token <string> ID
%token <value> INTEGER
%left '|'
%left '&'
%left '+' '-'
%left '*' '/' '%'
%left UMINUS
%%
program : decllist
{fprintf(stderr, "program");}
;
decllist : dec
{fprintf(stderr, "\n dec");}
| dec decllist
;
dec : vardec
{fprintf(stderr, "vardec");}
| fundec
{fprintf(stderr, "YEAH");}
;
typespec : INT
| VOID
;
vardec : typespec ID ';'
{fprintf(stderr, "yep");}
| typespec ID '[' NUM ']' ';'
{fprintf(stderr, "again");}
;
fundec : typespec ID '(' params ')' compoundstmt
;
params : VOID
| paramlist
;
paramlist : param
| param ',' paramlist
;
param : typespec ID
| typespec ID '['']'
;
compoundstmt : '{' localdeclerations statementlist '}'
;
localdeclerations :/* empty */
|vardec localdeclerations
;
statementlist : /* empty */
| statement statementlist
;
statement : expressionstmt
| compoundstmt
| selectionstmt
| iterationstmt
| assignmentstmt
| returnstmt
| readstmt
| writestmt
;
expressionstmt : expression ';'
| ';'
;
assignmentstmt : var '=' expressionstmt
;
selectionstmt : IF '(' expression ')' statement
| IF '(' expression ')' statement ELSE statement
;
iterationstmt : WHILE '(' expression ')' statement
;
returnstmt : RETURN ';'
| RETURN expression ';'
;
writestmt : WRITE expression ';'
;
readstmt : READ var ';'
;
expression : simpleexpression
;
var : ID
| ID '[' expression ']'
;
simpleexpression : additiveexpression
| additiveexpression relop simpleexpression
;
relop : LE
| '<'
| '>'
| GE
| EQ
| NE
;
additiveexpression : term
| term addop term
;
addop : '+'
| '-'
;
term : factor
| term multop factor
;
multop : '*'
| '/'
;
factor : '(' expression ')'
| NUM
| var
| call
;
call : ID '(' args ')'
;
args : arglist
| /* empty */
;
arglist : expression
| expression ',' arglist
;
%%
main(){
yyparse();
}
And here is my lex file:
%{
int mydebug=1;
int lineno=0;
#include "y.tab.h"
%}
%%
int {if (mydebug) fprintf(stderr, "int found\n");
return(INT);
}
num {if (mydebug) fprintf(stderr, "num found\n");
return(NUM);
}
void {if (mydebug) fprintf(stderr, "void found \n");
return(VOID);
}
while {if (mydebug) fprintf(stderr, "while found \n");
return(WHILE);
}
if {if (mydebug) fprintf(stderr, "if found \n");
return(IF);
}
then {if (mydebug) fprintf(stderr, "then found \n");
return(THEN);
}
else {if (mydebug) fprintf(stderr, "else found \n");
return(ELSE);
}
read {if (mydebug) fprintf(stderr, "read found \n");
return(READ);
}
write {if (mydebug) fprintf(stderr, "void found \n");
return(WRITE);
}
return {if (mydebug) fprintf(stderr, "void found \n");
return(RETURN);
}
'<=' {if (mydebug) fprintf(stderr, "void found \n");
return(LE);
}
'>=' {if (mydebug) fprintf(stderr, "void found \n");
return(GE);
}
'==' {if (mydebug) fprintf(stderr, "void found \n");
return(EQ);
}
'!=' {if (mydebug) fprintf(stderr, "void found \n");
return(NE);
}
[a-zA-Z][a-zA-Z0-9]* {if (mydebug) fprintf(stderr,"Letter found\n");
yylval.string=strdup(yytext); return(ID);}
[0-9][0-9]* {if (mydebug) fprintf(stderr,"Digit found\n");
yylval.value=atoi((const char *)yytext); return(NUM);}
[ \t] {if (mydebug) fprintf(stderr,"Whitespace found\n");}
[=\-+*/%&|()\[\]<>;] { if (mydebug) fprintf(stderr,"return a token %c\n",*yytext);
return (*yytext);}
\n { if (mydebug) fprintf(stderr,"cariage return %c\n",*yytext);
lineno++;
return (*yytext);}
%%
int yywrap(void)
{ return 1;}
If I type in something like 'int a;', the lexer gets all the way to the newline and prints its carriage-return debug message, but then the parser stops and reports a syntax error at the end of the line. Can anyone see why?
I have gone over this many times and can't find what keeps stopping it. I am going back through a previous program to see whether that helps me figure it out, but I'm stumped. Can anyone help?

Your lexer returns a '\n' (newline) token at the end of each line, but no rule in your grammar accepts that token, so the parser reports a syntax error as soon as it sees the first newline.
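One possible fix, sketched here on the assumption that newlines carry no meaning in this language, is to let the newline rule count the line without returning a token. This is only a fragment of the rules section of the lex file, not a complete scanner; mydebug and lineno are the variables already declared in the question.

```
\n      { if (mydebug) fprintf(stderr, "newline found\n");
          lineno++;   /* count the line, but do not return a token */
        }
```

The alternative is to keep returning '\n' and to add it to the grammar at every point where a line break is allowed, which is usually more work for a free-form language like this one.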

Related

Bug in a simple parser specification in F#

I wonder where the parser specification below went wrong. The parser aims to parse and evaluate an expression like 2+3*4 to 14. It is to be run with FsLexYacc.
%{
%}
%token <int> CSTINT
%token PLUS MINUS MUL
%token LPAR RPAR
%token EOF
%left MINUS PLUS /* lowest precedence */
%left TIMES DIV /* highest precedence */
%start Main
%type int Main
%%
Main:
Expr EOF { $1 }
;
Expr:
| CSTINT { $1 }
| MINUS CSTINT { - $2 }
| LPAR Expr RPAR { $2 }
| Expr MUL Expr { $1 * $3 }
| Expr PLUS Expr { $1+$3 }
| Expr MINUS Expr { $1-$3 }
;
I got the error
ExprPar.fsy(18,0): error: Unexpected character '%'%
Line 18 is the line just before "Main". Where is the bug?

I believe the type specified by %type should be in angle brackets:
%type <int> Main

Processing lines of file in Ruby

I have some file like this
file alldataset; append next;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
and I am trying to write a Ruby program that pushes whatever comes after a semicolon onto a new line. In addition, if a line has a 'do', the following line should be indented by two blanks relative to it, any inner 'do' block by four blanks, and so on.
I am very new to Ruby and my code so far is quite far from what I want. This is what I have:
def indent(text, num)
" "*num+" " + text
end
doc = File.open('newtext.txt')
doc.to_a.each do |line|
if line.downcase =~ /^(file).+(;)/i
puts line+"\n"
end
if line.downcase.include?('do')
puts indent(line, 2)
end
end
This is the desired output
file alldataset;
append next;
if file.first? do
line + "\n";
if !file.last? do
line.indent(2);
end;
end;
Any help would be appreciated.

As you are interested in parsing, here is a quickly made example, just to give you a taste. I have learned Lex/Yacc, Flex/Bison, ANTLR v3 and ANTLR v4. I strongly recommend ANTLR4 which is so powerful. References :
the ANTLR site
The ANTLR mega tutorial
the expert book
StackOverflow -> Tags -> antlr
The following grammar can parse only the input example you have provided.
File Question.g4 :
grammar Question;
/* Simple grammar example to parse the following code :
file alldataset; append next; xyz;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
file file2; xyz;
*/
start
@init {System.out.println("Question last update 1048");}
: file* EOF
;
file
: FILE ID ';' statement_p*
;
statement_p
: statement
{System.out.println("Statement found : " + $statement.text);}
;
statement
: 'append' ID ';'
| if_statement
| other_statement
| 'end' ';'
;
if_statement
: 'if' expression 'do' expression ';'
;
other_statement
: ID ';'
;
expression
: receiver=( ID | FILE ) '.' method_call # Send
| expression '+' expression # Addition
| '!' expression # Negation
| atom # An_atom
;
method_call
: method_name=ID arguments?
;
arguments
: '(' ( argument ( ',' argument )* )? ')'
;
argument
: ID | NUMBER
;
atom
: ID
| FILE
| STRING
;
FILE : 'file' ;
ID : LETTER ( LETTER | DIGIT | '_' )* ( '?' | '!' )? ;
NUMBER : DIGIT+ ( ',' DIGIT+ )? ( '.' DIGIT+ )? ;
STRING : '"' .*? '"' ;
NL : ( [\r\n] | '\r\n' ) -> skip ;
WS : [ \t]+ -> channel(HIDDEN) ;
fragment DIGIT : [0-9] ;
fragment LETTER : [a-zA-Z] ;
File input.txt :
file alldataset; append next; xyz;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
file file2; xyz;
Execution :
$ export CLASSPATH=".:/usr/local/lib/antlr-4.6-complete.jar"
$ alias
alias a4='java -jar /usr/local/lib/antlr-4.6-complete.jar'
alias grun='java org.antlr.v4.gui.TestRig'
$ a4 Question.g4
$ javac Q*.java
$ grun Question start -tokens -diagnostics input.txt
[#0,0:0=' ',<WS>,channel=1,1:0]
[#1,1:4='file',<'file'>,1:1]
[#2,5:5=' ',<WS>,channel=1,1:5]
[#3,6:15='alldataset',<ID>,1:6]
[#4,16:16=';',<';'>,1:16]
[#5,17:17=' ',<WS>,channel=1,1:17]
[#6,18:23='append',<'append'>,1:18]
[#7,24:24=' ',<WS>,channel=1,1:24]
[#8,25:28='next',<ID>,1:25]
[#9,29:29=';',<';'>,1:29]
[#10,30:30=' ',<WS>,channel=1,1:30]
[#11,31:33='xyz',<ID>,1:31]
[#12,34:34=';',<';'>,1:34]
[#13,36:36=' ',<WS>,channel=1,2:0]
[#14,37:38='if',<'if'>,2:1]
[#15,39:39=' ',<WS>,channel=1,2:3]
[#16,40:43='file',<'file'>,2:4]
[#17,44:44='.',<'.'>,2:8]
[#18,45:50='first?',<ID>,2:9]
[#19,51:51=' ',<WS>,channel=1,2:15]
[#20,52:53='do',<'do'>,2:16]
[#21,54:54=' ',<WS>,channel=1,2:18]
[#22,55:58='line',<ID>,2:19]
[#23,59:59=' ',<WS>,channel=1,2:23]
[#24,60:60='+',<'+'>,2:24]
[#25,61:61=' ',<WS>,channel=1,2:25]
[#26,62:65='"\n"',<STRING>,2:26]
[#27,66:66=';',<';'>,2:30]
...
[#59,133:132='<EOF>',<EOF>,7:0]
Question last update 1048
Statement found : append next;
Statement found : xyz;
Statement found : if file.first? do line + "\n";
Statement found : if !file.last? do line.indent(2);
Statement found : end;
Statement found : end;
Statement found : xyz;
One advantage of ANTLR4 over previous versions and other parser generators is that the code is no longer scattered among the parser rules, but gathered in a separate listener. This is where you do the actual processing, such as producing a new reformatted file. It would be too long to show a complete example. Today you can write the listener in C++, C#, Python and others. As I don't know Java, I use a setup based on JRuby, see my forum answer.

In Ruby there are many ways to do things. So my solution is one among others.
File t.rb :
def print_indented(p_file, p_indent, p_text)
  p_file.print p_indent
  p_file.puts p_text
end
# recursively split the line at semicolon, as long as the rest is not empty
def partition_on_semicolon(p_line, p_answer, p_level)
  puts "in partition_on_semicolon for level #{p_level} p_line=#{p_line} / p_answer=#{p_answer}"
  first_segment, semi, rest = p_line.partition(';')
  p_answer << first_segment + semi
  partition_on_semicolon(rest.lstrip, p_answer, p_level + 1) unless rest.empty?
end
lines = IO.readlines('input.txt')
# Compute initial indentation, the indentation of the first line.
# This is to preserve the spaces which are in the input.
m = lines.first.match(/^( *)(.*)/)
initial_indent = ' ' * m[1].length
# initial_indent = '' # uncomment if the initial indentation needs not to be preserved
puts "initial_indent=<#{initial_indent}> length=#{initial_indent.length}"
level = 1
indentation = ' '
File.open('newtext.txt', 'w') do | output_file |
  lines.each do | line |
    line = line.chomp
    line = line.lstrip # remove leading spaces
    puts "---<#{line}>"
    next_indent = initial_indent + indentation * (level - 1)
    case
    when line =~ /^file/ && line.count(';') > 1
      level = 1 # restore, remove this if files can be indented
      next_indent = initial_indent + indentation * (level - 1)
      # split in count fragments
      fragments = []
      partition_on_semicolon(line, fragments, 1)
      puts '---fragments :'
      puts fragments.join('/')
      print_indented(output_file, next_indent, fragments.first)
      fragments[1..-1].each do | fragment |
        print_indented(output_file, next_indent + indentation, fragment)
      end
      level += 1
    when line.include?(' do ')
      fragment1, _fdo, fragment2 = line.partition(' do ')
      print_indented(output_file, next_indent, "#{fragment1} do")
      print_indented(output_file, next_indent + indentation, fragment2)
      level += 1
    else
      level -= 1 if line =~ /end;/
      print_indented(output_file, next_indent, line)
    end
  end
end
File input.txt :
file alldataset; append next; xyz;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
file file2; xyz;
Execution :
$ ruby -w t.rb
initial_indent=< > length=1
---<file alldataset; append next; xyz;>
in partition_on_semicolon for level 1 p_line=file alldataset; append next; xyz; / p_answer=[]
in partition_on_semicolon for level 2 p_line=append next; xyz; / p_answer=["file alldataset;"]
in partition_on_semicolon for level 3 p_line=xyz; / p_answer=["file alldataset;", "append next;"]
---fragments :
file alldataset;/append next;/xyz;
---<if file.first? do line + "\n";>
---<if !file.last? do line.indent(2);>
---<end;>
---<end;>
---<file file2; xyz;>
in partition_on_semicolon for level 1 p_line=file file2; xyz; / p_answer=[]
in partition_on_semicolon for level 2 p_line=xyz; / p_answer=["file file2;"]
---fragments :
file file2;/xyz;
---<>
Output file newtext.txt :
file alldataset;
append next;
xyz;
if file.first? do
line + "\n";
if !file.last? do
line.indent(2);
end;
end;
file file2;
xyz;

Mismatched input errors in antlr 3.5

I have problems with my grammar code in ANTLR 3.5. My input file is:
define tcpChannel ChannelName
define
listener
ListnerProperty
end
listener ;
define
execution
request with format RequestFormat,
response with format ResponseFormat,
error with format ErrorFormat ,
call servicename.executionname
end define execution ;
end
define channel ;
My lexer code is as follows:
lexer grammar ChannelLexer;
// ***************** lexer rules:
Define
:
'define'
;
Tcpchannel
:
'tcphannel'
;
Listener
:
'Listener'
;
End
:
'end'
;
Execution
:
' execution '
;
Request
:
' request '
;
With
:
' with '
;
Format
:
' format '
;
Response
:
' response '
;
Error
:
' error '
;
Call
:
' call '
;
Channel
:
' channel '
;
Dot
:
'.'
;
SColon
:
';'
;
Comma
:
','
;
Value
:
(
'a'..'z'
|'A'..'Z'
|'_'
)
(
'a'..'z'
|'A'..'Z'
|'_'
|Digit
)*
;
fragment
String
:
(
'"'
(
~(
'"'
| '\\'
)
| '\\'
(
'\\'
| '"'
)
)*
'"'
| '\''
(
~(
'\''
| '\\'
)
| '\\'
(
'\\'
| '\''
)
)*
'\''
)
{
setText(getText().substring(1, getText().length() - 1).replaceAll("\\\\(.)",
"$1"));
}
;
fragment
Digit
:
'0'..'9'
;
Space
:
(
' '
| '\t'
| '\r'
| '\n'
| '\u000C'
)
{
skip();
}
;
My parser code is:
parser grammar ChannelParser;
options
{
// antlr will generate java lexer and parser
language = Java;
// generated parser should create abstract syntax tree
output = AST;
}
// ***************** parser rules:
//our grammar accepts only salutation followed by an end symbol
expression
:
tcpChannelDefinition listenerDefinition executionDefintion endchannel
;
tcpChannelDefinition
:
Define Tcpchannel channelName
;
channelName
:
i= Value
{
$i.setText("CHANNEL_NAME#" + $i.text);
}
;
listenerDefinition
:
Define Listener listenerProperty endListener
;
listenerProperty
:
i=Value
{
$i.setText("PROPERTY_VALUE#" + $i.text);
}
;
endListener
:
End Listener SColon
;
executionDefintion
:
Define Execution execution
;
execution
:
Request With Format requestValue Comma
Response With Format responseValue Comma
Error With Format errorValue Comma
Call servicename Dot executionname
;
requestValue
:
i=Value
{
$i.setText("REQUEST_FORMAT#" + $i.text);
}
;
responseValue
:
i=Value
{
$i.setText("RESPONSE_FORMAT#" + $i.text);
}
;
errorValue
:
i=Value
{
$i.setText("ERROR_FORMAT#" + $i.text);
}
;
servicename
:
i=Value
{
$i.setText("SERVICE_NAME#" + $i.text);
}
;
executionname
:
i=Value
{
$i.setText("OPERATION_NAME#" + $i.text);
}
;
endexecution
:
End Define Execution SColon
;
endchannel
:
End Channel SColon
;
I'm getting errors like "missing Tcpchannel at 'tcpChannel'" and "extraneous input 'ChannelName' expecting Define". How can I correct them? Please help.

Why does this //ID pass but //DEF fails?

I'm experimenting with XPath using the grammar provided in the test suite, and I'm having a problem: the path //ID is matched, but //DEF is not found. An IllegalArgumentException is thrown: "DEF at index 2 isn't a valid token name". Why is //ID matched, but //DEF not?
String exprGrammar = "grammar Expr;\n" +
"prog: func+ ;\n" +
"func: DEF ID '(' arg (',' arg)* ')' body ;\n" +
"body: '{' stat+ '}' ;\n" +
"arg : ID ;\n" +
"stat: expr ';' # printExpr\n" +
" | ID '=' expr ';' # assign\n" +
" | 'return' expr ';' # ret\n" +
" | ';' # blank\n" +
" ;\n" +
"expr: expr ('*'|'/') expr # MulDiv\n" +
" | expr ('+'|'-') expr # AddSub\n" +
" | primary # prim\n" +
" ;\n" +
"primary" +
" : INT # int\n" +
" | ID # id\n" +
" | '(' expr ')' # parens\n" +
" ;" +
"\n" +
"MUL : '*' ; // assigns token name to '*' used above in grammar\n" +
"DIV : '/' ;\n" +
"ADD : '+' ;\n" +
"SUB : '-' ;\n" +
"RETURN : 'return' ;\n" +
"DEF: 'def';\n" +
"ID : [a-zA-Z]+ ; // match identifiers\n" +
"INT : [0-9]+ ; // match integers\n" +
"NEWLINE:'\\r'? '\\n' -> skip; // return newlines to parser (is end-statement signal)\n" +
"WS : [ \\t]+ -> skip ; // toss out whitespace\n";
String SAMPLE_PROGRAM =
"def f(x,y) { x = 3+4; y; ; }\n" +
"def g(x) { return 1+2*x; }\n";
Grammar g2 = new Grammar(exprGrammar);
LexerInterpreter g2LexerInterpreter = g2.createLexerInterpreter(new ANTLRInputStream(SAMPLE_PROGRAM));
CommonTokenStream tokens = new CommonTokenStream(g2LexerInterpreter);
ParserInterpreter parser = g2.createParserInterpreter(tokens);
parser.setBuildParseTree(true);
ParseTree tree = parser.parse(g2.rules.get("prog").index);
String xpath = "//DEF";
for (ParseTree t : XPath.findAll(tree, xpath, parser) ) {
System.out.println(t.getSourceInterval());
}

When I run your code, the following gets printed:
0..0
18..18
In other words:
;)
This XPath tree pattern matching is all rather new, so my guess is that you've stumbled upon a bug that has been fixed. I'm using ANTLR version 4.2.2

error during the compilation_undefined reference

my file myComp.l
%{
#include <stdlib.h>
#include <stdio.h>
#include "y.tab.h"
int yyerror(char *);
%}
%%
[a-z] {
yylval = *yytext - 'a';
return VAR;
}
[0-9]+ {
yylval = atoi(yytext);
return INT;
}
[-+()=/*\n] { return *yytext; }
[ \t] ;
. { yyerror("Input non valido"); }
%%
int yywrap(void){
return 1; }
and this is the file myComp.y
%{ /* Prologo */
#define YYSTYPE int
#include <math.h>
#include <stdio.h>
int yyerror(char *);
int yylex(void) ;
int sym[26];
%}
/* Definizioni */
%token INT VAR
%left '+' '-'
%left '*' '/'
%%
program:
program statement '\n'
|
;
statement:
expr { printf("%d\n", $1); }
| VAR '=' expr { sym[$1] = $3; }
;
expr:
INT
| VAR { $$ = sym[$1]; }
| expr '+' expr { $$ = $1 + $3; }
| expr '-' expr { $$ = $1 - $3; }
| expr '*' expr { $$ = $1 * $3; }
| expr '/' expr { $$ = $1 / $3; }
| '(' expr ')' { $$ = $2; }
;
%%
int yyerror(char *s) {
fprintf(stderr, "%s\n", s);
return 1;
}
int main( void ) {
yyparse();
return 0;
}
I used these commands to compile:
flex myComp.l
bison -y myComp.y
gcc -o myComp y.tab.c
but I get this error:
/tmp/ccaHRWZu.o: In function `yyparse':
y.tab.c:(.text+0x24a): undefined reference to `yylex'
collect2: ld returned 1 exit status
All the programs I installed are updated to the latest version. I can't understand where the problem is. What can I do to resolve this error? Please help me fix it.
Thanks, all.

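The linker cannot find yylex because you never compile the scanner that flex generates, and that file is where yylex is defined. By default flex writes it to lex.yy.c. Since myComp.l includes y.tab.h, also run bison with -d so that the header is generated. The -lfl flag links the flex library, which only provides default versions of yywrap and main; you define both yourself, so it is optional here.
Compile with:
flex myComp.l
bison -y -d myComp.y
gcc -o myComp y.tab.c lex.yy.c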
