Is there an easy way to make Jison calculator parser return symbolic results? - jison

Jison parsers return the calculated result:
calculator.parse("2^3"); // returns 8
calculator.parse("x^2"); // gives a parse err
I would like that it return the symbolic expression:
calculator.parse("x^2");
// should return
// "Math.pow(x,2)"
And
calculator.parse("x^(x+1)");
// should return
// "Math.pow(x,x+1)"
And
calculator.parse("cos(x)");
// should return
// "Math.cos(x)"

If what you need is simple enough, you might get by by modifying the calculator. For instance:
Add an IDENTIFIER token after NUMBER in the list of tokens:
[a-z] return 'IDENTIFIER'
This allows a single lower case letter to serve as identifie.
Modify the e '^' e rule to return a string rather than the computed value:
| e '^' e
{$$ = "Math.pow(" + $1 + "," + $3 + ");"}
Add a new rule to the list of rules for e:
| IDENTIFIER
(No explicit action needed.)
With these changes parsing, x^2 results in "Math.pow(x,2);"
To support the operators, the other rules would have to be modified like the one for e '^' e to return strings rather than the result of the math.
This is extremely primitive and won't optimize things that could be optimized. For instance, 1^2 will be output as "Math.pow(1, 2)" when it could be optimized to 1.

(Based on #Louis answer and comments)
Here is the code for a basic symbolic calculator using jison:
/* description: Parses and executes mathematical expressions. */
/* lexical grammar */
%lex
%%
\s+ /* skip whitespace */
(acos|asin|atan|atan2|cos|log|sin|sqrt|tan) return 'FUNCTION'
[0-9]+("."[0-9]+)?\b return 'NUMBER'
[a-z] return 'IDENTIFIER'
"|" return '|'
"*" return '*'
"/" return '/'
"-" return '-'
"+" return '+'
"^" return '^'
"!" return '!'
"%" return '%'
"(" return '('
")" return ')'
"PI" return 'PI'
"E" return 'E'
<<EOF>> return 'EOF'
. return 'INVALID'
/lex
/* operator associations and precedence */
%left '+' '-'
%left '*' '/'
%left '^'
%right '!'
%right '%'
%left UMINUS
%start expressions
%% /* language grammar */
expressions
: e EOF
{ typeof console !== 'undefined' ? console.log($1) : print($1);
return $1; }
;
e
: e '+' e
{$$ = $1 + " + " + $3;}
| e '-' e
{$$ = $1 + "-" + $3;}
| e '*' e
{$$ = $1 + "*" + $3;}
| e '/' e
{$$ = $1 + "/" + $3;}
| e '^' e
{$$ = "Math.pow(" + $1 + ", " + $3 + ");"}
| e '!'
{{
$$ = (function fact (n) { return n==0 ? 1 : fact(n-1) * n })($1);
}}
| e '%'
{$$ = $1/100;}
| '-' e %prec UMINUS
{$$ = -$2;}
| '(' e ')'
{$$ = $2;}
| FUNCTION '(' e ')'
{$$ = "Math." + $1 + "(" + $3 + ")";}
| '|' e '|'
{$$ = "Math.abs(" + $2 + ")";}
| NUMBER
{$$ = Number(yytext);}
| E
{$$ = Math.E;}
| PI
{$$ = Math.PI;}
| IDENTIFIER
| FUNCTION
;

No.
Not an easy way. Depending on what you call 'easy'. You need to define tokens. And then symbolic manipulation of those tokens. And stuff.
I don't think that calculator would be a good starting point, and its such a trivial thing anyway that throwing it away and writing your JS symbolic parser grammar from scratch wouldn't be too much more of a problem.
Just to save some google time for everyone, this is what we are talking about: http://zaach.github.io/jison/demos/calc/

Related

How could I generate my abstract tree using this makefile? Why I see only an error at 1 line?

def.h
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <string.h>
typedef enum
{
NPROGRAM,
NVARDECLLIST,
NFUNCDECLLIST,
NVARDECL,
NIDLIST,
NOPTPARAMLIST,
NSTATLIST,
NASSIGNSTAT,
NWHILESTAT,
NIFSTAT,
NFORSTAT,
NRELEXPR,
NRETURNSTAT,
NREADSTAT,
NLOGICEXPR,
NWRITESTAT,
NNEGEXPR,
NFUNCCALL,
NEXPRLIST,
NCONDEXPR,
NPARAMDECL,
NMATHEXPR,
NCASTING
} Nonterminal;
typedef enum
{
T_BREAK,
T_TYPE,
T_BOOLEAN,
T_INTCONST,
T_REALCONST,
T_BOOLCONST,
T_WRITEOP,
T_STRCONST,
T_ID,
T_NONTERMINAL
} Typenode;
typedef union
{
int ival;
float rval;
char *sval;
enum {FALSE, TRUE} bval;
} Value;
typedef struct snode
{
Typenode type;
Value value;
struct snode *p1, *p2, *p3;
} Node;
typedef Node *Pnode;
char *newstring(char*);
int yylex();
Pnode nontermnode(Nonterminal, int),
ntn(Nonterminal),
idnode(),
keynode(Typenode, int),
intconstnode(),
realconstnode(),
strconstnode(),
boolconstnode(),
newnode(Typenode);
void yyerror(),
treeprint(Pnode, int);
lexer.lex
%{
#include "parser.h"
#include "def.h"
int line = 1;
Value lexval;
%}
%option noyywrap
spacing ([ \t])+
commento "#"(.)*\n
letter [A-Za-z]
digit [0­9]
intconst {digit}+
strconst \"([^\"])*\"
boolconst false|true
realconst {intconst}\.{digit}+
id {letter}({letter}|{digit})*
sugar [ \( \) : , ; \. \+ \- \* / ]
%%
{spacing} ;
\n {line++;}
integer {return(INTEGER);}
string {return(STRING);}
boolean {return(BOOLEAN);}
real {return(REAL);}
void {return(VOID);}
func {return(FUNC);}
body {return(BODY);}
end {return(END);}
else {return(ELSE);}
while {return(WHILE);}
do {return(DO);}
for {return(FOR);}
to {return(TO);}
return {return(RETURN);}
read {return(READ);}
write {return(WRITE);}
writeln {return(WRITELN);}
and {return(AND);}
or {return(OR);}
not {return(NOT);}
if {return(IF);}
then {return(THEN);}
break {return(BREAK);}
"<=" {return(LEQ);}
">=" {return(GEQ);}
"!=" {return(NEQ);}
"==" {return(EQU);}
"<" {return(LT);}
">" {return(GT);}
{intconst} {lexval.ival = atoi(yytext); return(INTCONST);}
{strconst} {lexval.sval = newstring(yytext); return(STRCONST);}
{boolconst} {lexval.bval = (yytext[0] == 'f' ? FALSE : TRUE); return(BOOLCONST);}
{realconst} {lexval.rval = atof(yytext); return(REALCONST);}
{id} {lexval.sval = newstring(yytext); return(ID);}
{sugar} {return(yytext[0]);}
. {return(ERROR);}
%%
char *newstring(char *s)
{
char *p;
p = malloc(strlen(s)+1);
strcpy(p, s);
return(p);
}
makefile
bup: lexer.o parser.o tree.o
cc -g -o bup lexer.o parser.o tree.o
lexer.o: lexer.c parser.h def.h
cc -g -c lexer.c
parser.o: parser.c def.h
cc -g -c parser.c
tree.o: tree.c def.h
cc -g -c tree.c
lexer.c: lexer.lex parser.y parser.h parser.c def.h
flex -o lexer.c lexer.lex
parser.h: parser.y def.h
bison -vd -o parser.c parser.y
parser.y
%{
#include "def.h"
#define YYSTYPE Pnode
extern char *yytext;
extern Value lexval;
extern int line;
extern FILE *yyin;
Pnode root = NULL;
%}
%token ID FUNC BODY END BREAK IF THEN ELSE TYPE WHILE DO FOR RETURN READ WRITE WRITELN
%token AND OR INTCONST REALCONST BOOLCONST STRCONST INTEGER REAL NOT STRING BOOLEAN VOID PLUS MINUS TIMES SLASH
%token LEQ GEQ NEQ EQU GT LT TO ERROR
%%
program : var-decl-list func-decl-list body '.' {root = $$ = ntn(NPROGRAM);
root->p1 = ntn(NVARDECLLIST);
root->p2 = ntn(NFUNCDECLLIST);
root->p1->p1 = $1;
root->p2->p1 = $2;
root->p3 = $3;}
;
var-decl-list : var-decl var-decl-list {$$ -> p1=$1;
$1->p3=$2;}
| {$$ = NULL;}
;
var-decl : id-list ':' type ';' {$$ = ntn(NVARDECL);
$$ -> p1 = ntn(NIDLIST);
$$->p1->p1=$1; $$ -> p1 -> p3 = $3;}
;
id-list : ID {$$ = idnode();} ',' id-list {$$ = $2;
$2 -> p3 = $4;}
| ID {$$ = idnode();}
;
type : INTEGER {$$ = keynode(T_TYPE, INTEGER);}
| REAL {$$ = keynode(T_TYPE, REAL);}
| STRING {$$ = keynode(T_TYPE, STRING);}
| BOOLEAN {$$ = keynode(T_TYPE, BOOLEAN);}
| VOID {$$ = keynode(T_TYPE, VOID); }
;
func-decl-list : func-decl func-decl-list {$$ -> p1 = $1;
$1 -> p3 = $2;}
| {$$ = NULL;}
;
func-decl : FUNC ID {$$ = idnode();} '(' opt-param-list ')' ':' type var-decl-list body ';' {$$ -> p1 = $3;
$$ -> p2 = ntn(NOPTPARAMLIST);
$$ -> p2 ->p1=$5;
$$ -> p2 -> p3 = $8;
$$ -> p2 -> p3->p3 = ntn(NVARDECLLIST);
$$ -> p2 -> p3->p3->p1 = $9;
$$ -> p2 -> p3->p3->p3 = $10;}
;
opt-param-list : param-list {$$ = $1;}
| {$$ = NULL;}
;
param-list : param-decl ',' param-list {$$ = $1;
$1 -> p3 = $3;}
| param-decl
;
param-decl : ID {$$ = idnode();} ':' type {$$=ntn(NPARAMDECL);
$$ -> p1 = $2;
$$ -> p2 = $4;}
;
body : BODY stat-list END {$$ = ntn(NSTATLIST);
$$->p1=$2;}
;
stat-list : stat ';' stat-list {$$ = $1;
$1 -> p3 = $3;}
| stat ';' {$$=$1;}
;
stat : assign-stat
| if-stat
| while-stat
| for-stat
| return-stat
| read-stat
| write-stat
| func-call
| BREAK {$$ = newnode(T_BREAK);}
;
assign-stat : ID {$$ = idnode();} '=' expr {$$ = ntn(NASSIGNSTAT);
$$ -> p1 = $2;
$$ -> p2 = $4;}
;
if-stat : IF expr THEN stat-list opt-else-stat END {$$ = ntn(NIFSTAT);
$$ -> p1 = $2;
$$ -> p2 = ntn(NSTATLIST);
$$ ->p2 -> p3 = $5;}
;
opt-else-stat : ELSE stat-list {$$ = ntn(NSTATLIST);
$$->p1=$2;}
| {$$ = NULL;}
;
while-stat : WHILE expr DO stat-list END {$$ = ntn(NWHILESTAT);
$$->p1=$2;
$$->p2=ntn(NSTATLIST);
$$->p2->p1=$4;}
;
for-stat : FOR ID {$$=idnode();} '=' expr TO expr DO stat-list END {$$ = ntn(NFORSTAT);
$$->p1=$3;
$$->p2=$5;
$$->p2->p3=$7;
$$->p2->p3->p3=ntn(NSTATLIST);
$$->p2->p3->p3->p1=$9;}
;
return-stat : RETURN opt-expr {$$ = ntn(NRETURNSTAT);
$$->p1=$2;}
;
opt-expr : expr {$$=$1;}
| {$$=NULL;}
;
read-stat : READ '(' id-list ')' {$$ = ntn(NREADSTAT);
$$->p1=ntn(NIDLIST);
$$->p1->p1=$3;}
;
write-stat : write-op '(' expr-list ')' {$$ = ntn(NWRITESTAT);
$$->p1=$1;
$$->p2=ntn(NEXPRLIST);
$$->p2->p1=$3;}
;
write-op : WRITE {$$ = keynode(T_WRITEOP, WRITE);}
| WRITELN {$$ = keynode(T_WRITEOP, WRITELN);}
;
expr-list : expr ',' expr-list {$$=$1;
$1->p3=$3;}
| expr
;
expr : expr logic-op bool-term { $$=$2;
$2->p1=$1;
$2->p2=$3;}
| bool-term
;
logic-op : AND {$$=nontermnode(NLOGICEXPR, AND);}
| OR {$$=nontermnode(NLOGICEXPR, OR);}
;
bool-term : rel-term rel-op rel-term {$$=$2;
$2->p1=$1;
$2->p2=$3;}
| rel-term
;
rel-op : EQU {$$=nontermnode(NRELEXPR, EQU);}
| NEQ {$$=nontermnode(NRELEXPR, NEQ);}
| GT {$$=nontermnode(NRELEXPR, GT);}
| GEQ {$$=nontermnode(NRELEXPR, GEQ);}
| LT {$$=nontermnode(NRELEXPR, LT);}
| LEQ {$$=nontermnode(NRELEXPR, LEQ);}
;
rel-term : rel-term low-prec-op low-term {$$=$2;
$2->p1=$1;
$2->p2=$3;}
| low-term
;
low-prec-op : PLUS {$$=nontermnode(NMATHEXPR, PLUS);}
| MINUS {$$=nontermnode(NMATHEXPR, MINUS);}
;
low-term : low-term high-prec-op factor {$$=$2;
$2->p1=$1;
$2->p2=$3;}
| factor
;
high-prec-op : TIMES {$$=nontermnode(NMATHEXPR, TIMES);}
| SLASH {$$=nontermnode(NMATHEXPR, SLASH);}
;
factor : unary-op factor {$$=$1;
$1->p3=$2;}
| '(' expr ')' {$$=$2;}
| ID {$$=idnode();}
| const {$$=$1;}
| func-call {$$=$1;}
| cond-expr {$$=$1;}
| cast '(' expr ')' {$$=$1;
$1->p3=$3;}
;
unary-op : MINUS {$$=nontermnode(NNEGEXPR, MINUS);}
| NOT {$$=nontermnode(NNEGEXPR, NOT);}
;
const : INTCONST {$$=intconstnode();}
| REALCONST {$$=realconstnode();}
| STRCONST {$$=strconstnode();}
| BOOLCONST {$$=boolconstnode();}
;
func-call : ID {$$=idnode();} '(' opt-expr-list ')' {$$ = ntn(NFUNCCALL);
$$->p1=$2; $$->p2=$4;}
;
opt-expr-list : expr-list {$$=ntn(NEXPRLIST); $$->p1=$1;}
| {$$=NULL;}
;
cond-expr : IF expr THEN expr ELSE expr END {$$=ntn(NCONDEXPR);
$$->p1=$2;
$$->p2=$4;
$$->p3=$6;}
;
cast : INTEGER {$$=nontermnode(NCASTING,INTEGER);}
| REAL {$$=nontermnode(NCASTING, REAL);}
;
%%
Pnode ntn(Nonterminal nonterm)
{
Pnode p = newnode(T_NONTERMINAL);
p->value.rval = nonterm;
return(p);
}
Pnode nontermnode(Nonterminal nonterm, int valore)
{
Pnode p = newnode(T_NONTERMINAL);
p->value.rval = nonterm;
p->value.ival = valore;
return(p);
}
Pnode idnode()
{
Pnode p = newnode(T_ID);
p->value.sval = lexval.sval;
return(p);
}
Pnode keynode(Typenode keyword, int valore)
{
Pnode p = newnode(keyword);
p->value.ival = valore;
return p;
}
Pnode intconstnode()
{
Pnode p = newnode(T_INTCONST);
p->value.ival = lexval.ival;
return(p);
}
Pnode realconstnode()
{
Pnode p = newnode(T_REALCONST);
p->value.rval = lexval.rval;
return(p);
}
Pnode strconstnode()
{
Pnode p = newnode(T_STRCONST);
p->value.sval = lexval.sval;
return(p);
}
Pnode boolconstnode()
{
Pnode p = newnode(T_BOOLCONST);
p->value.bval = lexval.bval;
return(p);
}
Pnode newnode(Typenode tnode)
{
Pnode p = malloc(sizeof(Node));
p->type = tnode;
p->p1 = p->p2 = p->p3 = NULL;
return(p);
}
int main()
{
int result;
printf("----------------------------------------------");
yyin = stdin;
if((result = yyparse()) == 0)
treeprint(root, 0);
return(result);
}
void yyerror()
{
fprintf(stderr, "Line %d: syntax error on symbol \"%s\"\n",
line, yytext);
exit(-1);
}
tree.c
#include "def.h"
char* tabtypes[] =
{
"T_BREAK",
"T_TYPE",
"T_BOOLEAN",
"T_INTCONST",
"T_REALCONST",
"T_BOOLCONST",
"T_WRITEOP",
"T_STRCONST",
"T_ID",
"T_NONTERMINAL"
};
char* tabnonterm[] =
{
"PROGRAM",
"NVARDECLLIST",
"NFUNCDECLLIST",
"NVARDECL",
"NIDLIST",
"NOPTPARAMLIST",
"NSTATLIST",
"NASSIGNSTAT",
"NWHILESTAT",
"NIFSTAT",
"NFORSTAT",
"NRELEXPR",
"NRETURNSTAT",
"NREADSTAT",
"NLOGICEXPR",
"NWRITESTAT",
"NNEGEXPR",
"NFUNCCALL",
"NEXPRLIST",
"NCONDEXPR",
"NPARAMDECL",
"NMATHEXPR",
"NCASTING"
};
void treeprint(Pnode root, int indent)
{
int i;
Pnode p;
for(i=0; i<indent; i++)
printf(" ");
printf("%s", (root->type == T_NONTERMINAL ? tabnonterm[root->value.ival] : tabtypes[root->type]));
if(root->type == T_ID || root->type == T_STRCONST)
printf(" (%s)", root->value.sval);
else if(root->type == T_INTCONST)
printf(" (%d)", root->value.ival);
else if(root->type == T_BOOLCONST)
printf(" (%s)", (root->value.ival == TRUE ? "true" : "false"));
printf("\n");
for(p=root->p1; p != NULL; p = p->p3)
treeprint(p, indent+1);
}
prog ( File with example of grammar)
numero: integer;
func fattoriale(n: integer): integer
fact: integer;
body
if n == 0 then
fact = 1;
else
fact = n * fattoriale(n­1);
end;
return fact;
end;
func stampaFattoriali(tot: integer): void
i, f: integer;
body
for i=0 to tot do
f = fattoriale(i);
writeln("Il fattoriale di ", i, "è ", f);
end;
end;
body
read(numero);
if numero < 0 then
writeln("Il numero ", numero, "non è valido");
else
stampaFattoriali(numero);
end.
When i type make on terminal , it create the files : tree.o parser.o parser.h parser.c lexer.c lexer.o bup.
When i execute the bup file , the terminal show me this error message :
"ine 1: syntax error on symbol "
So it don't generate the abstract tree.
I don't know if this error refers to the prog or lexer.lex or parse.y file.
Yacc/bison assign their own numbers to terminal tokens, and assume that the lexer will use those numbers. But you provide your own numbers in the def.h header, which yacc/bison knows absolutely nothing about. It will not correctly interpret the codes returned by yylex which will make it impossible to parse correctly.
So don't do that.
Let bison generate the token codes, use the header file it generates (parser.h with your settings), and don't step on its feet by trying to define the enum values yourself.
As a hint about debugging, that is really way too much code to have written before you start debugging, and that fact is exactly illustrated by your complaint at the end of your question that you don't know where to look for the error. Instead of writing the whole project and then hoping it works as a whole, write little pieces and debug them as you go. Although you need to parser to generate the token type values, you don't need to run the parser to test your scanner. You can write a simple program which repeatedly calls yylex and prints the returned types and values. (Or you can just enable flex debugging with the -d command line option, which is even simpler.)
Similarly, you should be able to test your AST methods by writing some test functions which use these methods to build, walk and print out an AST. Make sure that they produce the expected results.
Only once you have evidence that the lexer is producing the correct tokens and that your AST construction functions work should you start to debug your parser. Again, you will find it much easier if you use the built-in debugging facilities; see the Debugging your parser section of the Bison manual for instructions.
Good luck.

Generate all valid combinations of N pairs of parentheses

UPDATE (task detailed Explanation):
We have a string consist of numbers 0 and 1, divided by operators |, ^ or &. The task is to create all fully parenthesized expressions. So the final expressions should be divided into "2 parts"
For example
0^1 -> (0)^(1) but not extraneously: 0^1 -> (((0))^(1))
Example for expression 1|0&1:
(1)|((0)&(1))
((1)|(0))&(1)
As you can see both expressions above have left and write part:
left: (1); right: ((0)&(1))
left: ((1)|(0)); right: (1)
I tried the following code, but it does not work correctly (see output):
// expression has type string
// result has type Array (ArrayList in Java)
function setParens(expression, result) {
if (expression.length === 1) return "(" + expression + ")";
for (var i = 0; i < expression.length; i++) {
var c = expression[i];
if (c === "|" || c === "^" || c === "&") {
var left = expression.substring(0, i);
var right = expression.substring(i + 1);
leftParen = setParens(left, result);
rightParen = setParens(right, result);
var newExp = leftParen + c + rightParen;
result.push(newExp);
}
}
return expression;
}
function test() {
var r = [];
setParens('1|0&1', r);
console.log(r);
}
test();
code output: ["(0)&(1)", "(0)|0&1", "(1)|(0)", "1|0&(1)"]
Assuming the input expression is not already partially parenthesized and you want only fully parenthesized results:
FullyParenthesize(expression[1...n])
result = {}
// looking for operators
for p = 1 to n do
// binary operator; parenthesize LHS and RHS
// parenthesize the binary operation
if expression[p] is a binary operator then
lps = FullyParenthesize(expression[1 ... p - 1])
rps = FullyParenthesize(expression[p + 1 ... n])
for each lp in lps do
for each rp in rps do
result = result U {"(" + lp + expression[p] + rp + ")"}
// no binary operations <=> single variable
if result == {} then
result = {"(" + expression + ")")}
return result
Example: 1|2&3
FullyParenthesize("1|2&3")
result = {}
binary operator | at p = 2;
lps = FullyParenthesize("1")
no operators
result = {"(" + "1" + ")"}
return result = {"(1)"}
rps = Parenthesize("2&3")
result = {"2&3", "(2&3)"}
binary operator & at p = 2
lps = Parenthesize("2")
no operators
result = {"(" + "2" + ")"}
return result = {"(2)"}
rps = Parenthesize("3")
no operators
result = {"(" + "3" + ")"}
return result = {"(3)"}
lp = "(2)"
rp = "(3)"
result = result U {"(" + "(2)" + "&" + "(3)" + ")"}
return result = {"((2)&(3))"}
lp = "(1)"
rp = "((2)&(3))"
result = result U {"(" + "(1)" + "|" + "((2)&(3))" + ")"}
binary operator & at p = 4
...
result = result U {"(" + "((1)|(2))" + "&" + "(3)" + ")"}
return result {"((1)|((2)&(3)))", "(((1)|(2))&(3))"}
You will have 2^k unique fully parenthesized expressions (without repeated parentheses) given an input expression with k binary operators.

Recursion: printing implied parenthesis

I am trying to print all the parenthesis but I am stuck with printing only one output instead of all the implied parenthesis.
How to show all possible implied parenthesis? From here I got the idea.
I am not able to get the ways to print all the implied parenthesis.
def add_up(exp):
operators = ["+", "/", "*", "-"]
if len(exp) == 1:
return exp
result = ""
for i in range(0, len(exp)):
if exp[i] in operators:
left = add_up(exp[0:i])
right = add_up(exp[i+1:len(exp)])
result = "(" + left + exp[i] + right + ")"
return result
print(add_up("3*4+5"))
Your code returns a single result, whereas the solution consists of several possible results. And when you recurse, the left and right branches may yield more than one result, too, of course.
You can fix that by turning your function into a generator and then looping over the results:
operators = ["+", "/", "*", "-"]
def add_up(exp):
if len(exp) == 1:
yield exp
for i in range(0, len(exp)):
if exp[i] in operators:
for left in add_up(exp[:i]):
for right in add_up(exp[i + 1:]):
yield "(" + left + exp[i] + right + ")"
for e in add_up("1+2*3+4"):
print e, ':', eval(e)
Alternatively (and more verbosely), you can make the results lists:
def add_up(exp):
if len(exp) == 1:
return [exp]
result = []
for i in range(0, len(exp)):
if exp[i] in operators:
for left in add_up(exp[:i]):
for right in add_up(exp[i + 1:]):
result += ["(" + left + exp[i] + right + ")"]
return result

Change the parsing language

I'm using a modal-SAT solver. This solver is unfortunately using Flex and Bison, both languages that I don't master...
I wanted to change one syntax to another, but I've got some issue to do it, even after tutorials about Flex-Lexer and Bison.
So here is the problem :
I want to be able to parse such modal logic formulas :
In the previous notation, such formulas were written like this :
(NOT (IMP (AND (ALL R0 (IMP C0 C1)) (ALL R0 C0)) (ALL R0 C1)))
And here are the Flex/Bisons file used to parse them :
alc.y
%{
#include "fnode.h"
#define YYMAXDEPTH 1000000
fnode_t *formula_as_tree;
%}
%union {
int l;
int i;
fnode_t *f;
}
/* Tokens and types */
%token LP RP
%token ALL SOME
%token AND IMP OR IFF NOT
%token TOP BOT
%token RULE CONC
%token <l> NUM
%type <f> formula
%type <f> boolean_expression rule_expression atomic_expression
%type <f> other
%type <i> uboolop bboolop nboolop ruleop
%type <l> rule
%% /* Grammar rules */
input: formula {formula_as_tree = $1;}
;
formula: boolean_expression {$$ = $1;}
| rule_expression {$$ = $1;}
| atomic_expression {$$ = $1;}
;
boolean_expression: LP uboolop formula RP
{$$ = Make_formula_nary($2,empty_code,$3);}
| LP bboolop formula formula RP
{$$ = Make_formula_nary($2,empty_code, Make_operand_nary($3,$4));}
| LP nboolop formula other RP
{$$ = Make_formula_nary($2,empty_code,Make_operand_nary($3,$4));}
;
rule_expression: LP ruleop rule formula RP {$$ = Make_formula_nary($2,$3,$4);}
;
atomic_expression: CONC NUM {$$ = Make_formula_nary(atom_code,$2,Make_empty());}
| TOP {$$ = Make_formula_nary(top_code,empty_code,Make_empty());}
| BOT {$$ = Make_formula_nary(bot_code,empty_code,Make_empty());}
;
other: formula other {$$ = Make_operand_nary($1,$2);}
| {$$ = Make_empty();}
;
uboolop: NOT {$$ = not_code;}
;
bboolop: IFF {$$ = iff_code;}
| IMP {$$ = imp_code;}
;
nboolop: AND {$$ = and_code;}
| OR {$$ = or_code;}
;
ruleop: SOME {$$ = dia_code;}
| ALL {$$ = box_code;}
rule: RULE NUM {$$ = $2;}
;
%% /* End of grammar rules */
int yyerror(char *s)
{
printf("%s\n", s);
exit(0);
}
alc.lex
%{
#include <stdio.h>
#include "fnode.h"
#include "y.tab.h"
int number;
%}
%%
[ \n\t] ;
"(" return LP;
")" return RP;
"ALL" return ALL;
"SOME" return SOME;
"AND" return AND;
"IMP" return IMP;
"OR" return OR;
"IFF" return IFF;
"NOT" return NOT;
"TOP" return TOP;
"BOTTOM" return BOT;
"R" return RULE;
"C" return CONC;
0|[1-9][0-9]* {
sscanf(yytext,"%d",&number);
yylval.l=number;
return NUM;
}
. {
/* Error function */
fprintf(stderr,"Illegal character\n");
return -1;
}
%%
Now, let's write our example but in the new syntax that I want to use :
begin
(([r0](~pO | p1) & [r0]p0) | [r0]p1)
end
Major problems for me that are blocking me to parse this new syntax correctly is :
IMP (A B) is now written ~B | A (as in the boolean logic (A => B) <=> (~B v A)).
ALL RO is now written [r0].
SOME RO is now written <r0>.
IFF (A B) is now written (~B | A) & (~A | B). (IFF stands for if and only if)
Here is the small list of what are the new symbol, even if I don't know how to parse them :
"(" return LP;
")" return RP;
"[]" return ALL;
"<>" return SOME;
"&" return AND;
"IMP" return IMP;
"|" return OR;
"IFF" return IFF;
"~" return NOT;
"true" return TOP;
"false" return BOT;
"r" return RULE;
"p" return CONC;
I assume that only theses 2 files will change, Because it should still be able to read the previous syntaxe, by compiling the source code with other .y and .lex
But I'm asking your help to know exactly how to write it down :/
Thanks in advance !
Tommi Junttila's BC Package implements a language for Boolean expressions and circuits using Bison and Flex.
To study the source files won't fully replace going through a proper Bison/Flex tutorial, but it certainly should give you a good start.
For someone who would have the exact same problem (I assume that this problem is quite rare :) )
With the good vocabulary, it's much easier to google the problem and find a solution.
The first notation
(NOT (IMP (AND (ALL R0 (IMP C0 C1)) (ALL R0 C0)) (ALL R0 C1)))
is the ALC format.
The other notation
begin
(([r0](~pO | p1) & [r0]p0) | [r0]p1)
end
is the InToHyLo format.
And there is a tool called the formula translation tool ("ftt") developed and bundled with Spartacus (http://www.ps.uni-saarland.de/spartacus/). It can translate between all the formats of provers.
Using this tool is a little hack who avoid dealing with the Flex/Bison languages.
One just needs to translate one problem to another, problems will be equivalent and it's very fast to translate.

conversion from infix to prefix

I am preparing for an exam where i couldn't understand the convertion of infix notation to polish notation for the below expression:
(a–b)/c*(d + e – f / g)
Can any one tell step by step how the given expression will be converted to prefix?
Algorithm ConvertInfixtoPrefix
Purpose: Convert an infix expression into a prefix expression. Begin
// Create operand and operator stacks as empty stacks.
Create OperandStack
Create OperatorStack
// While input expression still remains, read and process the next token.
while( not an empty input expression ) read next token from the input expression
// Test if token is an operand or operator
if ( token is an operand )
// Push operand onto the operand stack.
OperandStack.Push (token)
endif
// If it is a left parentheses or operator of higher precedence than the last, or the stack is empty,
else if ( token is '(' or OperatorStack.IsEmpty() or OperatorHierarchy(token) > OperatorHierarchy(OperatorStack.Top()) )
// push it to the operator stack
OperatorStack.Push ( token )
endif
else if( token is ')' )
// Continue to pop operator and operand stacks, building
// prefix expressions until left parentheses is found.
// Each prefix expression is push back onto the operand
// stack as either a left or right operand for the next operator.
while( OperatorStack.Top() not equal '(' )
OperatorStack.Pop(operator)
OperandStack.Pop(RightOperand)
OperandStack.Pop(LeftOperand)
operand = operator + LeftOperand + RightOperand
OperandStack.Push(operand)
endwhile
// Pop the left parthenses from the operator stack.
OperatorStack.Pop(operator)
endif
else if( operator hierarchy of token is less than or equal to hierarchy of top of the operator stack )
// Continue to pop operator and operand stack, building prefix
// expressions until the stack is empty or until an operator at
// the top of the operator stack has a lower hierarchy than that
// of the token.
while( !OperatorStack.IsEmpty() and OperatorHierarchy(token) lessThen Or Equal to OperatorHierarchy(OperatorStack.Top()) )
OperatorStack.Pop(operator)
OperandStack.Pop(RightOperand)
OperandStack.Pop(LeftOperand)
operand = operator + LeftOperand + RightOperand
OperandStack.Push(operand)
endwhile
// Push the lower precedence operator onto the stack
OperatorStack.Push(token)
endif
endwhile
// If the stack is not empty, continue to pop operator and operand stacks building
// prefix expressions until the operator stack is empty.
while( !OperatorStack.IsEmpty() ) OperatorStack.Pop(operator)
OperandStack.Pop(RightOperand)
OperandStack.Pop(LeftOperand)
operand = operator + LeftOperand + RightOperand
OperandStack.Push(operand)
endwhile
// Save the prefix expression at the top of the operand stack followed by popping // the operand stack.
print OperandStack.Top()
OperandStack.Pop()
End
(a–b)/c*(d + e – f / g)
Prefix notation (reverse polish has the operator last, it is unclear which one you meant, but the principle will be exactly the same):
(/ f g)
(+ d e)
(- (+ d e) (/ f g))
(- a b)
(/ (- a b) c)
(* (/ (- a b) c) (- (+ d e) (/ f g)))
If there's something about what infix and prefix mean that you don't quite understand, I'd highly suggest you reread that section of your textbook. You aren't doing yourself any favors if you come out of this with the right answer for this one problem, but still don't understand the concept.
Algorithm-wise, its pretty darn simple. You just act like a computer yourself a bit. Start by puting parens around every calculation in the order it would be calculated. Then (again in order from first calculation to last) just move the operator in front of the expression on its left hand side. After that, you can simplify by removing parens.
(a–b)/c*(d + e – f / g)
step 1: (a-b)/c*(d+e- /fg))
step 2: (a-b)/c*(+de - /fg)
step 3: (a-b)/c * -+de/fg
Step 4: -ab/c * -+de/fg
step 5: /-abc * -+de/fg
step 6: */-abc-+de/fg
This is prefix notation.
I saw this method on youtube hence posting here.
given infix expression : (a–b)/c*(d + e – f / g)
reverse it :
)g/f-e+d(*c/)b-a(
read characters from left to right.
maintain one stack for operators
1. if character is operand add operand to the output
2. else if character is operator or )
2.1 while operator on top of the stack has lower or **equal** precedence than this character pop
2.2 add the popped character to the output.
push the character on stack
3. else if character is parenthesis (
3.1 [ same as 2 till you encounter ) . pop ) as well
4. // no element left to read
4.1 pop operators from stack till it is not empty
4.2 add them to the output.
reverse the output and print.
credits : youtube
(a–b)/c*(d + e – f / g)
remember scanning the expression from leftmost to right most
start on parenthesized terms
follow the WHICH COMES FIRST rule...
*, /, % are on the same level and higher than
+ and -.... so
(a-b) = -bc prefix
(a-b) = bc- for postfix
another parenthesized term:
(d + e - f / g) = do move the / first
then plus '+' comes first before minus sigh '-' (remember they are on the same level..)
(d + e - f / g)
move / first
(d + e - (/fg)) = prefix
(d + e - (fg/)) = postfix
followed by + then -
((+de) - (/fg)) = prefix
((de+) - (fg/)) = postfix
(-(+de)(/fg)) = prefix so the new expression is now -+de/fg (1)
((de+)(fg/)-) = postfix so the new expression is now de+fg/- (2)
(a–b)/c* hence
(a-b)/c*(d + e – f / g) = -bc prefix [-ab]/c*[-+de/fg] ---> taken from (1)
/ c * do not move yet
so '/' comes first before '*' because they on the same level, move '/'
to the rightmost : /[-ab]c * [-+de/fg]
then move '*' to the rightmost
/ [-ab]c[-+de/fg] = remove the grouping symbols = */-abc-+de/fg --> Prefix
(a-b)/c*(d + e – f / g) = bc- for postfix [ab-]/c*[de+fg/-]---> taken from (2)
so '/' comes first before '' because they on the same level, move '/'
to the leftmost: [ab-]c[de+fg/-]/
then move '' to the leftmost
[ab-] c [de+fg/-]/ = remove the grouping symbols= a b - c d e + f g / - / * --> Postfix
simple google search came up with this. Doubt anyone can explain this any simpler. But I guess after an edit, I can try to bring forward the concepts so that you can answer your own question.
Hint :
Study for exam, hard, you must. Predict you, grade get high, I do :D
Explanation :
It's all about the way operations are associated with operands. each notation type has its own rules. You just need to break down and remember these rules. If I told you I wrote (2*2)/3 as [* /] (2,2,3) all you need to do is learn how to turn the latter notation in the former notation.
My custom notation says that take the first two operands and multiple them, then the resulting operand should be divided by the third. Get it ? They are trying to teach you three things.
To become comfortable with different notations. The example I gave is what you will find in assembly languages. operands (which you act on) and operations (what you want to do to the operands).
Precedence rules in computing do not necessarily need to follow those found in mathematics.
Evaluation: How a machine perceives programs, and how it might order them at run-time.
This algorithm will help you for better understanding .
Step 1. Push “)” onto STACK, and add “(“ to end of the A.
Step 2. Scan A from right to left and repeat step 3 to 6 for each element of A until the STACK is empty.
Step 3. If an operand is encountered add it to B.
Step 4. If a right parenthesis is encountered push it onto STACK.
Step 5. If an operator is encountered then:
a. Repeatedly pop from STACK and add to B each operator (on the top of STACK) which has same
or higher precedence than the operator.
b. Add operator to STACK.
Step 6. If left parenthesis is encontered then
a. Repeatedly pop from the STACK and add to B (each operator on top of stack until a left parenthesis is encounterd)
b. Remove the left parenthesis.
Step 7. Exit
https://en.wikipedia.org/wiki/Shunting-yard_algorithm
The shunting yard algorithm can also be applied to produce prefix
notation (also known as polish notation). To do this one would simply
start from the end of a string of tokens to be parsed and work
backwards, reverse the output queue (therefore making the output queue
an output stack), and flip the left and right parenthesis behavior
(remembering that the now-left parenthesis behavior should pop until
it finds a now-right parenthesis).
from collections import deque
def convertToPN(expression):
precedence = {}
precedence["*"] = precedence["/"] = 3
precedence["+"] = precedence["-"] = 2
precedence[")"] = 1
stack = []
result = deque([])
for token in expression[::-1]:
if token == ')':
stack.append(token)
elif token == '(':
while stack:
t = stack.pop()
if t == ")": break
result.appendleft(t)
elif token not in precedence:
result.appendleft(token)
else:
# XXX: associativity should be considered here
# https://en.wikipedia.org/wiki/Operator_associativity
while stack and precedence[stack[-1]] > precedence[token]:
result.appendleft(stack.pop())
stack.append(token)
while stack:
result.appendleft(stack.pop())
return list(result)
expression = "(a - b) / c * (d + e - f / g)".replace(" ", "")
convertToPN(expression)
step through:
step 1 : token ) ; stack:[ ) ]
result:[ ]
step 2 : token g ; stack:[ ) ]
result:[ g ]
step 3 : token / ; stack:[ ) / ]
result:[ g ]
step 4 : token f ; stack:[ ) / ]
result:[ f g ]
step 5 : token - ; stack:[ ) - ]
result:[ / f g ]
step 6 : token e ; stack:[ ) - ]
result:[ e / f g ]
step 7 : token + ; stack:[ ) - + ]
result:[ e / f g ]
step 8 : token d ; stack:[ ) - + ]
result:[ d e / f g ]
step 9 : token ( ; stack:[ ]
result:[ - + d e / f g ]
step 10 : token * ; stack:[ * ]
result:[ - + d e / f g ]
step 11 : token c ; stack:[ * ]
result:[ c - + d e / f g ]
step 12 : token / ; stack:[ * / ]
result:[ c - + d e / f g ]
step 13 : token ) ; stack:[ * / ) ]
result:[ c - + d e / f g ]
step 14 : token b ; stack:[ * / ) ]
result:[ b c - + d e / f g ]
step 15 : token - ; stack:[ * / ) - ]
result:[ b c - + d e / f g ]
step 16 : token a ; stack:[ * / ) - ]
result:[ a b c - + d e / f g ]
step 17 : token ( ; stack:[ * / ]
result:[ - a b c - + d e / f g ]
# the final while
step 18 : token ( ; stack:[ ]
result:[ * / - a b c - + d e / f g ]
here is an java implementation for convert infix to prefix and evaluate prefix expression (based on rajdip's algorithm)
import java.util.*;
public class Expression {
private static int isp(char token){
switch (token){
case '*':
case '/':
return 2;
case '+':
case '-':
return 1;
default:
return -1;
}
}
private static double calculate(double oprnd1,double oprnd2,char oprt){
switch (oprt){
case '+':
return oprnd1+oprnd2;
case '*':
return oprnd1*oprnd2;
case '/':
return oprnd1/oprnd2;
case '-':
return oprnd1-oprnd2;
default:
return 0;
}
}
public static String infix2prefix(String infix) {
Stack<String> OperandStack = new Stack<>();
Stack<Character> OperatorStack = new Stack<>();
for(char token:infix.toCharArray()){
if ('a' <= token && token <= 'z')
OperandStack.push(String.valueOf(token));
else if (token == '(' || OperatorStack.isEmpty() || isp(token) > isp(OperatorStack.peek()))
OperatorStack.push(token);
else if(token == ')'){
while (OperatorStack.peek() != '(') {
Character operator = OperatorStack.pop();
String RightOperand = OperandStack.pop();
String LeftOperand = OperandStack.pop();
String operand = operator + LeftOperand + RightOperand;
OperandStack.push(operand);
}
OperatorStack.pop();
}
else if(isp(token) <= isp(OperatorStack.peek())){
while (!OperatorStack.isEmpty() && isp(token)<= isp(OperatorStack.peek())) {
Character operator = OperatorStack.pop();
String RightOperand = OperandStack.pop();
String LeftOperand = OperandStack.pop();
String operand = operator + LeftOperand + RightOperand;
OperandStack.push(operand);
}
OperatorStack.push(token);
}
}
while (!OperatorStack.isEmpty()){
Character operator = OperatorStack.pop();
String RightOperand = OperandStack.pop();
String LeftOperand = OperandStack.pop();
String operand = operator + LeftOperand + RightOperand;
OperandStack.push(operand);
}
return OperandStack.pop();
}
public static double evaluatePrefix(String prefix, Map values){
Stack<Double> stack = new Stack<>();
prefix = new StringBuffer(prefix).reverse().toString();
for (char token:prefix.toCharArray()){
if ('a'<=token && token <= 'z')
stack.push((double) values.get(token));
else {
double oprnd1 = stack.pop();
double oprnd2 = stack.pop();
stack.push(calculate(oprnd1,oprnd2,token));
}
}
return stack.pop();
}
public static void main(String[] args) {
Map dictionary = new HashMap<>();
dictionary.put('a',2.);
dictionary.put('b',3.);
dictionary.put('c',2.);
dictionary.put('d',5.);
dictionary.put('e',16.);
dictionary.put('f',4.);
dictionary.put('g',7.);
String s = "((a+b)/(c-d)+e)*f-g";
System.out.println(evaluatePrefix(infix2prefix(s),dictionary));
}
}
In Prefix expression operators comes first then operands : +ab[ oprator ab ]
Infix : (a–b)/c*(d + e – f / g)
Step 1: (a - b) = (- ab) [ '(' has highest priority ]
step 2: (d + e - f / g) = (d + e - / fg) [ '/' has highest priority ]
= (+ de - / fg )
['+','-' has same priority but left to right associativity]
= (- + de / fg)
Step 3: (-ab )/ c * (- + de / fg) = / - abc * (- + de / fg)
= * / - abc - + de / fg
Prefix : * / - abc - + de / fg
Infix to PostFix using Stack:
Example: Infix--> P-Q*R^S/T+U *V
Postfix --> PQRS^*T/-UV
Rules:
Operand ---> Add it to postfix
"(" ---> Push it on the stack
")" ---> Pop and add to postfix all operators till 1st left parenthesis
Operator ---> Pop and add to postfix those operators that have preceded
greater than or equal to the precedence of scanned operator.
Push the scanned symbol operator on the stack
Maybe you're talking about the Reverse Polish Notation? If yes you can find on wikipedia a very detailed step-to-step example for the conversion; if not I have no idea what you're asking :(
You might also want to read my answer to another question where I provided such an implementation: C++ simple operations (+,-,/,*) evaluation class
This is the algorithm using stack.Just follow these simple steps.
1.Reverse the given infix expression.
2.Replace '(' with ')' and ')' with '(' in the reversed expression.
3.Now apply standard infix to postfix subroutine.
4.Reverse the founded postfix expression, this will give required prefix expression.
In case you find step 3 difficult consult http://scanftree.com/Data_Structure/infix-to-prefix
where a worked out example is also given.

Resources