Bug in a simple parser specification in F# - debugging

I wonder where the parser specification below went wrong. The parser aims to parse and evaluate an expression like 2+3*4 to 14. It is to be run with FsLexYacc.
%{
%}
%token <int> CSTINT
%token PLUS MINUS MUL
%token LPAR RPAR
%token EOF
%left MINUS PLUS /* lowest precedence */
%left TIMES DIV /* highest precedence */
%start Main
%type int Main
%%
Main:
Expr EOF { $1 }
;
Expr:
| CSTINT { $1 }
| MINUS CSTINT { - $2 }
| LPAR Expr RPAR { $2 }
| Expr MUL Expr { $1 * $3 }
| Expr PLUS Expr { $1+$3 }
| Expr MINUS Expr { $1-$3 }
;
I got the error
ExprPar.fsy(18,0): error: Unexpected character '%'%
Line 18 is the line just before "Main". Where is the bug?

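The type given to %type must be enclosed in angle brackets:
%type <int> Main
Separately, the precedence line names TIMES and DIV, but the token declared for multiplication is MUL, so MUL receives no precedence. That line probably should read %left MUL.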
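To check the values the parser should return once the specification compiles, a small reference evaluator can help. The following is a minimal Python sketch, not FsLexYacc code. It assumes the same operators as the grammar (+, -, *), all left-associative with * binding tighter, and the function names are made up for illustration.

```python
import re

# Binding power per operator: higher binds tighter (mirrors the %left lines).
PRECEDENCE = {"+": 1, "-": 1, "*": 2}


def tokenize(text):
    """Split the input into integer and single-character operator tokens."""
    tokens = re.findall(r"\d+|[-+*()]", text)
    if "".join(tokens) != re.sub(r"\s+", "", text):
        raise ValueError("unexpected character in input")
    return tokens


def evaluate(text):
    """Evaluate an integer expression by precedence climbing."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def primary():
        if peek() is None:
            raise ValueError("unexpected end of input")
        token = advance()
        if token == "(":
            value = expr(1)
            if peek() != ")":
                raise ValueError("expected ')'")
            advance()
            return value
        if token == "-" and peek() is not None and peek().isdigit():
            return -int(advance())  # MINUS CSTINT in the grammar
        if token.isdigit():
            return int(token)
        raise ValueError(f"unexpected token {token!r}")

    def expr(min_prec):
        left = primary()
        while peek() in PRECEDENCE and PRECEDENCE[peek()] >= min_prec:
            op = advance()
            # Requiring a strictly higher precedence on the right-hand side
            # makes operators of equal precedence left-associative.
            right = expr(PRECEDENCE[op] + 1)
            if op == "+":
                left = left + right
            elif op == "-":
                left = left - right
            else:
                left = left * right
        return left

    result = expr(1)
    if peek() is not None:
        raise ValueError(f"unexpected trailing token {peek()!r}")
    return result


print(evaluate("2+3*4"))  # 14, the value the question expects
```

If the generated parser returns something other than this evaluator for the same input, the precedence declarations are the first place to look.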


How to replace the values of a Param

How can I replace the values of the parameters one at a time?
For example, the URL is:
https://example.com/?p=first&q=second&r=third
First I want to set the p parameter to '123':
https://example.com/?p=123&q=second&r=third
Then the same URL again, but with a different parameter, such as q:
https://example.com/?p=first&q=123&r=third
And again with the same URL but the r parameter:
https://example.com/?p=first&q=second&r=123
What I tried:
while read line; do
first_part=`echo $line | cut -d'=' -f1`
second_part=`echo $line | cut -d'=' -f2`
echo "${first_part}=123${second_part}"
echo "${first_part}${second_part}=123"
done < urls.txt
The problem described is a good fit for AWK. The demo script below creates a sample URL list and a mapping file that drives the transformation of all the URLs.
This approach lets parameters "float freely": matching does not depend on a parameter appearing at a specific position in the URL string.
It also allows parameter values to be strings of any length.
#!/bin/bash
#QUESTION: https://stackoverflow.com/questions/75124190/how-to-replace-the-values-of-a-param
cat >URL.list <<"EnDoFiNpUt"
https://example.com/?p=first&q=second&r=third
https://example.com/?r=zinger
https://example.com/?r=bonkers&q=junk&p=wacko
https://example.com/?p=flyer
EnDoFiNpUt
cat >mapfile.txt <<"EnDoFiNpUt"
q=SECOND
r=THIRD
p=FIRST
EnDoFiNpUt
awk -v datFile="mapfile.txt" 'BEGIN{
    ## Initial loading of the mapping file into array for comparison
    split( "", transforms ) ;
    indexT=0 ;
    ## Compare against 0 so that a missing file (getline returns -1) cannot loop forever
    while( ( getline mapLine < datFile ) > 0 ){
        indexT++ ;
        transforms[indexT]=mapLine ;
    } ;
    close( datFile ) ;
}
{
    ### Split off beginning of URL from parameters
    qPos=index( $0, "?" ) ;
    beg=substr( $0, 1, qPos ) ;
    ### Load URL elements into array for comparison
    rem=substr( $0, qPos+1 ) ;
    n=split( rem, parts, "&" ) ;
    ### Match and Map transforms elements with URL parts
    for( k=1 ; k<= indexT ; k++ ){
        dPos=index( transforms[k], "=" ) ;
        fieldPref=substr( transforms[k], 1, dPos ) ;
        for( i=1 ; i<=n ; i++ ){
            ### Prefix comparison, so that "p=" does not also match inside "xp=..."
            if( index( parts[i], fieldPref ) == 1 ){
                parts[i]=transforms[k] ;
            } ;
        } ;
    } ;
    ### Print transformed URL
    printf("%s%s", beg, parts[1] ) ;
    for( i=2 ; i<=n ; i++ ){
        printf("&%s", parts[i] ) ;
    } ;
    print "" ;
}' URL.list
The output looks like this:
https://example.com/?p=FIRST&q=SECOND&r=THIRD
https://example.com/?r=THIRD
https://example.com/?r=THIRD&q=SECOND&p=FIRST
https://example.com/?p=FIRST
URL query parameters are usually treated as order-independent (most servers look them up by name), so you can simply put p='s new value at the tail instead of keeping its original position:
echo 'https://example.com/?p=first&q=second&r=third' |
mawk NF=NF FS='p=[^&]*[&]?' OFS= ORS='&p=123\n'
https://example.com/?q=second&r=third&p=123
The same works for q=.
If you are modifying r= instead, set both FS and OFS to "=" and treat it as a plain value update of $NF (this works because r= is the last parameter).
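The question asks for one output URL per parameter, each with only that parameter changed. A minimal Python sketch of that step-by-step replacement, using the standard urllib.parse module, is shown here as an alternative to the shell approaches above. The function name replace_each_param is made up for illustration, and note that urlencode percent-encodes values that need it.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def replace_each_param(url, new_value):
    """Return one URL per query parameter, with only that parameter replaced."""
    parts = urlsplit(url)
    # parse_qsl keeps the parameters as an ordered list of (name, value) pairs.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    variants = []
    for i in range(len(pairs)):
        changed = [
            (name, new_value if j == i else value)
            for j, (name, value) in enumerate(pairs)
        ]
        variants.append(urlunsplit(parts._replace(query=urlencode(changed))))
    return variants


for variant in replace_each_param(
    "https://example.com/?p=first&q=second&r=third", "123"
):
    print(variant)
# https://example.com/?p=123&q=second&r=third
# https://example.com/?p=first&q=123&r=third
# https://example.com/?p=first&q=second&r=123
```

To process a file such as urls.txt, call the function once per line. The original order of the parameters is preserved in every variant.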

Unordered result because of an ambiguous grammar using Flex and Bison

I'm trying to create a variable declaration section (somewhat similar to HTML) using Flex and Bison. My grammar is accepted (no lexical or syntax errors), but the displayed result is not in the right order.
example.txt:
<SUB VARIABLE>
< a AS INT />;
<string AS STR />;
< x | y AS FLT />;
<bool AS BOL />;
<char AS CHR />;
</SUB VARIABLE>
the result I get (the incorrect one):
a ---> 1
x ---> 2
y ---> 2
string ---> 4
char ---> 3
bool ---> 5
the result I want to display (the correct one):
a ---> 1
string ---> 4
x ---> 2
y ---> 2
bool ---> 5
char ---> 3
Here's my code:
synt.y:
DECLARATION: DECLARATION '<' SUB VARIABLE '>' SUITE
|
;
SUITE: '<' idf SUITE_VAR {inserer($2,getType());}
| '<' '/' SUB VARIABLE '>'
;
SUITE_VAR: '|' idf SUITE_VAR {inserer($2, getType());}
| AS INT '/' '>' ';' SUITE {setType(1);}
| AS FLT '/' '>' ';' SUITE {setType(2);}
| AS CHR '/' '>' ';' SUITE {setType(3);}
| AS STR '/' '>' ';' SUITE {setType(4);}
| AS BOL '/' '>' ';' SUITE {setType(5);}
;
My grammar may be ambiguous, I tried many other grammars but I had the same problem.
Could you please tell me how I should write my grammar to have an ordered result?
Thanks a lot.
This is not a mistake in your grammar. The issue is most likely in the semantics of the inserer function.

YACC and LEX, Getting syntax error at the end of line and can't figure out why

Here is my yacc file:
%{
#include <stdio.h>
#include <ctype.h>
#include "lex.yy.c"
void yyerror (s) /* Called by yyparse on error */
char *s;
{
printf ("%s\n", s);
}
%}
%start program
%union{
int value;
char * string;
}
%token <value> NUM INT VOID WHILE IF THEN ELSE READ WRITE RETURN LE GE EQ NE
%token <string> ID
%token <value> INTEGER
%left '|'
%left '&'
%left '+' '-'
%left '*' '/' '%'
%left UMINUS
%%
program : decllist
{fprintf(stderr, "program");}
;
decllist : dec
{fprintf(stderr, "\n dec");}
| dec decllist
;
dec : vardec
{fprintf(stderr, "vardec");}
| fundec
{fprintf(stderr, "YEAH");}
;
typespec : INT
| VOID
;
vardec : typespec ID ';'
{fprintf(stderr, "yep");}
| typespec ID '[' NUM ']' ';'
{fprintf(stderr, "again");}
;
fundec : typespec ID '(' params ')' compoundstmt
;
params : VOID
| paramlist
;
paramlist : param
| param ',' paramlist
;
param : typespec ID
| typespec ID '['']'
;
compoundstmt : '{' localdeclerations statementlist '}'
;
localdeclerations :/* empty */
|vardec localdeclerations
;
statementlist : /* empty */
| statement statementlist
;
statement : expressionstmt
| compoundstmt
| selectionstmt
| iterationstmt
| assignmentstmt
| returnstmt
| readstmt
| writestmt
;
expressionstmt : expression ';'
| ';'
;
assignmentstmt : var '=' expressionstmt
;
selectionstmt : IF '(' expression ')' statement
| IF '(' expression ')' statement ELSE statement
;
iterationstmt : WHILE '(' expression ')' statement
;
returnstmt : RETURN ';'
| RETURN expression ';'
;
writestmt : WRITE expression ';'
;
readstmt : READ var ';'
;
expression : simpleexpression
;
var : ID
| ID '[' expression ']'
;
simpleexpression : additiveexpression
| additiveexpression relop simpleexpression
;
relop : LE
| '<'
| '>'
| GE
| EQ
| NE
;
additiveexpression : term
| term addop term
;
addop : '+'
| '-'
;
term : factor
| term multop factor
;
multop : '*'
| '/'
;
factor : '(' expression ')'
| NUM
| var
| call
;
call : ID '(' args ')'
;
args : arglist
| /* empty */
;
arglist : expression
| expression ',' arglist
;
%%
main(){
yyparse();
}
And here is my lex file:
%{
int mydebug=1;
int lineno=0;
#include "y.tab.h"
%}
%%
int {if (mydebug) fprintf(stderr, "int found\n");
return(INT);
}
num {if (mydebug) fprintf(stderr, "num found\n");
return(NUM);
}
void {if (mydebug) fprintf(stderr, "void found \n");
return(VOID);
}
while {if (mydebug) fprintf(stderr, "while found \n");
return(WHILE);
}
if {if (mydebug) fprintf(stderr, "if found \n");
return(IF);
}
then {if (mydebug) fprintf(stderr, "then found \n");
return(THEN);
}
else {if (mydebug) fprintf(stderr, "else found \n");
return(ELSE);
}
read {if (mydebug) fprintf(stderr, "read found \n");
return(READ);
}
write {if (mydebug) fprintf(stderr, "void found \n");
return(WRITE);
}
return {if (mydebug) fprintf(stderr, "void found \n");
return(RETURN);
}
'<=' {if (mydebug) fprintf(stderr, "void found \n");
return(LE);
}
'>=' {if (mydebug) fprintf(stderr, "void found \n");
return(GE);
}
'==' {if (mydebug) fprintf(stderr, "void found \n");
return(EQ);
}
'!=' {if (mydebug) fprintf(stderr, "void found \n");
return(NE);
}
[a-zA-Z][a-zA-Z0-9]* {if (mydebug) fprintf(stderr,"Letter found\n");
yylval.string=strdup(yytext); return(ID);}
[0-9][0-9]* {if (mydebug) fprintf(stderr,"Digit found\n");
yylval.value=atoi((const char *)yytext); return(NUM);}
[ \t] {if (mydebug) fprintf(stderr,"Whitespace found\n");}
[=\-+*/%&|()\[\]<>;] { if (mydebug) fprintf(stderr,"return a token %c\n",*yytext);
return (*yytext);}
\n { if (mydebug) fprintf(stderr,"cariage return %c\n",*yytext);
lineno++;
return (*yytext);}
%%
int yywrap(void)
{ return 1;}
If I type in something like 'int a;', it gets all the way to the newline and prints 'carriage returned', but then it stops and reports a syntax error. Can anyone see why?
I have gone over this many times and can't find what keeps stopping it. I am also going back over an earlier program to see whether that helps, but I'm stumped. Can anyone help?
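Your lexer is returning '\n' (newline) tokens at the end of the line, but your parser never accepts them, so you'll get a syntax error when the parser hits the first newline. Either treat the newline as whitespace in the lexer (count the line but do not return a token) or add '\n' to the grammar wherever it is allowed.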
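The effect can be reproduced outside yacc. The following minimal Python sketch assumes a toy grammar that accepts only `typespec ID ';'` declarations (a small subset of the grammar in the question). All names are hypothetical. The same input fails when the newline is returned as a token and parses once the newline is skipped like other whitespace.

```python
import re

TOKEN_RE = re.compile(
    r"(?P<NL>\n)|(?P<WS>[ \t]+)|(?P<ID>[A-Za-z][A-Za-z0-9]*)|(?P<SEMI>;)"
)


def tokenize(text, emit_newlines):
    """Return token kinds; newlines become tokens only if emit_newlines is set."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r}")
        kind = match.lastgroup
        if kind == "ID" and match.group() in ("int", "void"):
            kind = "TYPE"
        if kind == "WS" or (kind == "NL" and not emit_newlines):
            pass  # skipped: the parser never sees this token
        else:
            tokens.append(kind)
        pos = match.end()
    return tokens


def parse_declarations(tokens):
    """Accept one or more TYPE ID SEMI declarations and nothing else."""
    if not tokens:
        return "syntax error: empty input"
    pos = 0
    while pos < len(tokens):
        if tokens[pos:pos + 3] != ["TYPE", "ID", "SEMI"]:
            return f"syntax error at token {pos} ({tokens[pos]})"
        pos += 3
    return "ok"


source = "int a;\n"
print(parse_declarations(tokenize(source, emit_newlines=True)))
# syntax error at token 3 (NL)
print(parse_declarations(tokenize(source, emit_newlines=False)))
# ok
```

In the lex file from the question, the equivalent of emit_newlines=False is to keep the lineno++ in the \n rule but drop its return statement.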

Processing lines of file in Ruby

I have a file like this:
file alldataset; append next;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
and I am trying to write a Ruby program that moves everything following a semicolon onto a new line. In addition, if a line contains a 'do', the text after the 'do' should go on the following line, indented by two blanks, with any inner 'do' indented by four blanks, and so on.
I am very new to Ruby and my code so far is quite far from what I want. This is what I have:
def indent(text, num)
" "*num+" " + text
end
doc = File.open('newtext.txt')
doc.to_a.each do |line|
if line.downcase =~ /^(file).+(;)/i
puts line+"\n"
end
if line.downcase.include?('do')
puts indent(line, 2)
end
end
This is the desired output
file alldataset;
append next;
if file.first? do
line + "\n";
if !file.last? do
line.indent(2);
end;
end;
Any help would be appreciated.
Since you are interested in parsing, here is a quickly made example, just to give you a taste. I have learned Lex/Yacc, Flex/Bison, ANTLR v3 and ANTLR v4, and I strongly recommend ANTLR4, which is very powerful. References:
the ANTLR site
The ANTLR mega tutorial
the expert book
StackOverflow -> Tags -> antlr
The following grammar can parse only the input example you have provided.
File Question.g4 :
grammar Question;
/* Simple grammar example to parse the following code :
file alldataset; append next; xyz;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
file file2; xyz;
*/
start
@init {System.out.println("Question last update 1048");}
: file* EOF
;
file
: FILE ID ';' statement_p*
;
statement_p
: statement
{System.out.println("Statement found : " + $statement.text);}
;
statement
: 'append' ID ';'
| if_statement
| other_statement
| 'end' ';'
;
if_statement
: 'if' expression 'do' expression ';'
;
other_statement
: ID ';'
;
expression
: receiver=( ID | FILE ) '.' method_call # Send
| expression '+' expression # Addition
| '!' expression # Negation
| atom # An_atom
;
method_call
: method_name=ID arguments?
;
arguments
: '(' ( argument ( ',' argument )* )? ')'
;
argument
: ID | NUMBER
;
atom
: ID
| FILE
| STRING
;
FILE : 'file' ;
ID : LETTER ( LETTER | DIGIT | '_' )* ( '?' | '!' )? ;
NUMBER : DIGIT+ ( ',' DIGIT+ )? ( '.' DIGIT+ )? ;
STRING : '"' .*? '"' ;
NL : ( [\r\n] | '\r\n' ) -> skip ;
WS : [ \t]+ -> channel(HIDDEN) ;
fragment DIGIT : [0-9] ;
fragment LETTER : [a-zA-Z] ;
File input.txt :
file alldataset; append next; xyz;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
file file2; xyz;
Execution :
$ export CLASSPATH=".:/usr/local/lib/antlr-4.6-complete.jar"
$ alias
alias a4='java -jar /usr/local/lib/antlr-4.6-complete.jar'
alias grun='java org.antlr.v4.gui.TestRig'
$ a4 Question.g4
$ javac Q*.java
$ grun Question start -tokens -diagnostics input.txt
[@0,0:0=' ',<WS>,channel=1,1:0]
[@1,1:4='file',<'file'>,1:1]
[@2,5:5=' ',<WS>,channel=1,1:5]
[@3,6:15='alldataset',<ID>,1:6]
[@4,16:16=';',<';'>,1:16]
[@5,17:17=' ',<WS>,channel=1,1:17]
[@6,18:23='append',<'append'>,1:18]
[@7,24:24=' ',<WS>,channel=1,1:24]
[@8,25:28='next',<ID>,1:25]
[@9,29:29=';',<';'>,1:29]
[@10,30:30=' ',<WS>,channel=1,1:30]
[@11,31:33='xyz',<ID>,1:31]
[@12,34:34=';',<';'>,1:34]
[@13,36:36=' ',<WS>,channel=1,2:0]
[@14,37:38='if',<'if'>,2:1]
[@15,39:39=' ',<WS>,channel=1,2:3]
[@16,40:43='file',<'file'>,2:4]
[@17,44:44='.',<'.'>,2:8]
[@18,45:50='first?',<ID>,2:9]
[@19,51:51=' ',<WS>,channel=1,2:15]
[@20,52:53='do',<'do'>,2:16]
[@21,54:54=' ',<WS>,channel=1,2:18]
[@22,55:58='line',<ID>,2:19]
[@23,59:59=' ',<WS>,channel=1,2:23]
[@24,60:60='+',<'+'>,2:24]
[@25,61:61=' ',<WS>,channel=1,2:25]
[@26,62:65='"\n"',<STRING>,2:26]
[@27,66:66=';',<';'>,2:30]
...
[@59,133:132='<EOF>',<EOF>,7:0]
Question last update 1048
Statement found : append next;
Statement found : xyz;
Statement found : if file.first? do line + "\n";
Statement found : if !file.last? do line.indent(2);
Statement found : end;
Statement found : end;
Statement found : xyz;
One advantage of ANTLR4 over previous versions and other parser generators is that the processing code is no longer scattered among the parser rules but gathered in a separate listener. This is where you do the actual processing, such as producing a new reformatted file. A complete example would be too long to show here. Listeners can now be written in C++, C#, Python and other languages. As I don't know Java, I use a setup based on JRuby; see my forum answer.
In Ruby there are many ways to do things. So my solution is one among others.
File t.rb :
def print_indented(p_file, p_indent, p_text)
  p_file.print p_indent
  p_file.puts p_text
end
# recursively split the line at semicolon, as long as the rest is not empty
def partition_on_semicolon(p_line, p_answer, p_level)
  puts "in partition_on_semicolon for level #{p_level} p_line=#{p_line} / p_answer=#{p_answer}"
  first_segment, semi, rest = p_line.partition(';')
  p_answer << first_segment + semi
  partition_on_semicolon(rest.lstrip, p_answer, p_level + 1) unless rest.empty?
end
lines = IO.readlines('input.txt')
# Compute initial indentation, the indentation of the first line.
# This is to preserve the spaces which are in the input.
m = lines.first.match(/^( *)(.*)/)
initial_indent = ' ' * m[1].length
# initial_indent = '' # uncomment if the initial indentation needs not to be preserved
puts "initial_indent=<#{initial_indent}> length=#{initial_indent.length}"
level = 1
indentation = ' '
File.open('newtext.txt', 'w') do | output_file |
  lines.each do | line |
    line = line.chomp
    line = line.lstrip # remove leading spaces
    puts "---<#{line}>"
    next_indent = initial_indent + indentation * (level - 1)
    case
    when line =~ /^file/ && line.count(';') > 1
      level = 1 # restore, remove this if files can be indented
      next_indent = initial_indent + indentation * (level - 1)
      # split in count fragments
      fragments = []
      partition_on_semicolon(line, fragments, 1)
      puts '---fragments :'
      puts fragments.join('/')
      print_indented(output_file, next_indent, fragments.first)
      fragments[1..-1].each do | fragment |
        print_indented(output_file, next_indent + indentation, fragment)
      end
      level += 1
    when line.include?(' do ')
      fragment1, _fdo, fragment2 = line.partition(' do ')
      print_indented(output_file, next_indent, "#{fragment1} do")
      print_indented(output_file, next_indent + indentation, fragment2)
      level += 1
    else
      level -= 1 if line =~ /end;/
      print_indented(output_file, next_indent, line)
    end
  end
end
File input.txt :
file alldataset; append next; xyz;
if file.first? do line + "\n";
if !file.last? do line.indent(2);
end;
end;
file file2; xyz;
Execution :
$ ruby -w t.rb
initial_indent=< > length=1
---<file alldataset; append next; xyz;>
in partition_on_semicolon for level 1 p_line=file alldataset; append next; xyz; / p_answer=[]
in partition_on_semicolon for level 2 p_line=append next; xyz; / p_answer=["file alldataset;"]
in partition_on_semicolon for level 3 p_line=xyz; / p_answer=["file alldataset;", "append next;"]
---fragments :
file alldataset;/append next;/xyz;
---<if file.first? do line + "\n";>
---<if !file.last? do line.indent(2);>
---<end;>
---<end;>
---<file file2; xyz;>
in partition_on_semicolon for level 1 p_line=file file2; xyz; / p_answer=[]
in partition_on_semicolon for level 2 p_line=xyz; / p_answer=["file file2;"]
---fragments :
file file2;/xyz;
---<>
Output file newtext.txt :
file alldataset;
append next;
xyz;
if file.first? do
line + "\n";
if !file.last? do
line.indent(2);
end;
end;
file file2;
xyz;

Error during compilation: undefined reference

My file myComp.l:
%{
#include <stdlib.h>
#include <stdio.h>
#include "y.tab.h"
int yyerror(char *);
%}
%%
[a-z] {
yylval = *yytext - 'a';
return VAR;
}
[0-9]+ {
yylval = atoi(yytext);
return INT;
}
[-+()=/*\n] { return *yytext; }
[ \t] ;
. { yyerror("Input non valido"); }
%%
int yywrap(void) {
return 1;
}
and this is the file myComp.y
%{ /* Prologue */
#define YYSTYPE int
#include <math.h>
#include <stdio.h>
int yyerror(char *);
int yylex(void) ;
int sym[26];
%}
/* Definitions */
%token INT VAR
%left '+' '-'
%left '*' '/'
%%
program:
program statement '\n'
|
;
statement:
expr { printf("%d\n", $1); }
| VAR '=' expr { sym[$1] = $3; }
;
expr:
INT
| VAR { $$ = sym[$1]; }
| expr '+' expr { $$ = $1 + $3; }
| expr '-' expr { $$ = $1 - $3; }
| expr '*' expr { $$ = $1 * $3; }
| expr '/' expr { $$ = $1 / $3; }
| '(' expr ')' { $$ = $2; }
;
%%
int yyerror(char *s) {
fprintf(stderr, "%s\n", s);
return 1;
}
int main( void ) {
yyparse();
return 0;
}
I used these commands to compile:
flex myComp.l
bison -y myComp.y
gcc -o myComp y.tab.c
but I get this error:
/tmp/ccaHRWZu.o: In function `yyparse':
y.tab.c:(.text+0x24a): undefined reference to `yylex'
collect2: ld returned 1 exit status
Everything I installed is updated to the latest version. I can't understand where the problem is. What can I do to resolve this error? Please help me fix it. Thanks, all.
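The linker cannot find yylex because the scanner that flex generated is never compiled. yylex is defined in flex's output file, which is named lex.yy.c unless you pass -o, so that file has to be added to the gcc command. The flex library (-lfl) only supplies a default yywrap and main; since your files define both, it is optional here. The scanner also includes y.tab.h, so run bison with -d to generate that header.
Compile with:
flex myComp.l
bison -y -d myComp.y
gcc -o myComp y.tab.c lex.yy.c -lfl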
