Conflicts in ocamlyacc

Conflicts in ocamlyacc - debugging

I am trying to write a parser, using ocamlyacc, for a simple language that recognizes integer and float expressions. However, I want to introduce the possibility of having variables, so I defined the token VAR in my lexer.mll file; it matches any alphanumeric string starting with a capital letter.
expr:
| INT { $1 }
| VAR { (* some action *) }
| expr PLUS expr { $1 + $3 }
| expr MINUS expr { $1 - $3 }
/* similar rules follow for real expressions */
Now I have a similar definition for real numbers. However, when I run ocamlyacc on this file, I get 2 reduce/reduce conflicts: if I just enter a bare string (identified as token VAR), the parser cannot know whether it is a real or an integer variable, because the token VAR appears in both the integer and the real expression rules of my grammar.
Var + 12 /*means that Var has to be an integer variable*/
Var /*Is a valid expression according to my grammar but can be of any type*/
How do I eliminate this reduce/reduce conflict without losing the generality of variable declarations, while maintaining the 2 data types available to me?
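One common way around this kind of conflict, sketched here under assumptions and not taken from the question, is to keep a single expr nonterminal that only builds a syntax tree, and to decide between integer and real in a separate pass after parsing. VAR then reduces to one nonterminal only, so the parser never has to choose. The type names and the functions unify and infer below are hypothetical.

```ocaml
(* Hypothetical syntax tree that a single `expr` nonterminal could build. *)
type expr =
  | Int of int
  | Real of float
  | Var of string
  | Add of expr * expr
  | Sub of expr * expr

type ty = TInt | TReal | TUnknown

(* Combine the types of two operands; a bare variable gives no information. *)
let unify a b =
  match a, b with
  | TUnknown, t | t, TUnknown -> t
  | TInt, TInt -> TInt
  | TReal, TReal -> TReal
  | _ -> failwith "type mismatch: int and real in the same expression"

(* Infer the type of an expression from its literals. *)
let rec infer = function
  | Int _ -> TInt
  | Real _ -> TReal
  | Var _ -> TUnknown
  | Add (l, r) | Sub (l, r) -> unify (infer l) (infer r)
```

With this representation, Var + 12 is inferred as an integer expression, while a lone Var stays undetermined until more context is available.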

Related

Why || can't be used in pattern matching?

In OCaml when I do a pattern matching I can't do the following:
let rec example = function
| ... -> ...
| ... || ... -> ... (* here I get a syntax error because I use ||*)
Instead I need to do:
let rec example1 = function
|... -> ...
|... | ... -> ...
I know that || means or in OCaml, but why do we need to use only one 'pipe' : | to specify 'or' in pattern matching?
Why don't the usual || work?

|| doesn't really mean "or" generally, it means "boolean or", or rather it's the boolean or operator. Operators operate on values resulting from the evaluation of expressions, their operands. Operations and operands together also form expressions, which can then be used as operands with other operators to form further expressions, and so on.
Pattern matching, on the other hand, evaluates patterns, which are neither booleans nor expressions. Although patterns do in a sense evaluate to true or false when applied to, or rather matched against, a value, they do not evaluate to anything on their own. In that sense they are more like operators than operands. Furthermore, the result of matching against a pattern is not just a boolean value, but also a set of bindings.
Using || instead of | with patterns would overload its meaning and serve more to confuse than to clarify, I think.
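A small sketch of the distinction, with made-up function and type names: | joins patterns, || joins boolean expressions (for instance inside a when guard), and only a pattern can bind a variable.

```ocaml
(* `|` combines patterns; each alternative is matched against the value. *)
let is_vowel = function
  | 'a' | 'e' | 'i' | 'o' | 'u' -> true
  | _ -> false

(* `||` combines boolean expressions, for example in a `when` guard. *)
let is_small_or_huge n =
  match n with
  | n when n < 10 || n > 1000 -> true
  | _ -> false

(* An or-pattern can bind a variable, which a boolean expression cannot do. *)
type token = Int_lit of int | Char_code of int | Name of string

let number_of = function
  | Int_lit n | Char_code n -> Some n
  | Name _ -> None
```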

What is the empty statement in Golang?

In Python we can use the pass statement as a placeholder.
What is the equivalent in Golang?
A ; or something else?

The Go Programming Language Specification
Empty statements
The empty statement does nothing.
EmptyStmt = .
Notation
The syntax is specified using Extended Backus-Naur Form (EBNF):
Production = production_name "=" [ Expression ] "." .
Expression = Alternative { "|" Alternative } .
Alternative = Term { Term } .
Term = production_name | token [ "…" token ] | Group | Option | Repetition .
Group = "(" Expression ")" .
Option = "[" Expression "]" .
Repetition = "{" Expression "}" .
Productions are expressions constructed from terms and the following
operators, in increasing precedence:
| alternation
() grouping
[] option (0 or 1 times)
{} repetition (0 to n times)
Lower-case production names are used to identify lexical tokens.
Non-terminals are in CamelCase. Lexical tokens are enclosed in double
quotes "" or back quotes ``.
The form a … b represents the set of characters from a through b as
alternatives. The horizontal ellipsis … is also used elsewhere in the
spec to informally denote various enumerations or code snippets that
are not further specified. The character … (as opposed to the three
characters ...) is not a token of the Go language.
The empty statement is empty. In EBNF (Extended Backus–Naur Form) form: EmptyStmt = . or an empty string.
For example,
for {
}

var no bool
if true {
} else {
	no = true
}

semicolon single expression in a for loop

Why is there a semicolon at the end of Proc.num_stack_slots.(i) <- 0 in the following code?
I thought semicolons are separators in OCaml. Can we always put an optional semicolon for the last expression of a block?
for i = 0 to Proc.num_register_classes - 1 do
Proc.num_stack_slots.(i) <- 0;
done;
See https://github.com/def-lkb/ocaml-tyr/blob/master/asmcomp/coloring.ml line 273 for the complete example.

There is no need for a semicolon after this expression, but it is allowed there as a syntactic courtesy. The semicolon after done in the code you referenced is there because a second expression follows it.
Essentially, you can view the semicolon as a binary operator that takes two expressions (normally of type unit), evaluates them from left to right, and returns the value of the second one. For two unit expressions its type would be:
val (;): unit -> unit -> unit
With this in mind, the following example is easier to understand:
for i = 1 to 5 do
printf "Hello, ";
printf "world\n"
done
Here ; works just as glue. It is allowed to put a ; after the second expression too, but only as syntactic sugar, nothing more than a courtesy from the compiler developers.
If you open the parser definition of the OCaml compiler, you will see that an expression inside a seq_expr can be ended by a semicolon:
seq_expr:
| expr %prec below_SEMI { $1 }
| expr SEMI { reloc_exp $1 }
| expr SEMI seq_expr { mkexp(Pexp_sequence($1, $3)) }
That means that you can even write such strange code:
let x = 2 in x; let y = 3 in y; 25
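The following sketch illustrates both points with made-up names: a trailing ; before done changes nothing, and a sequence e1; e2 evaluates e1 for its effect and returns the value of e2.

```ocaml
(* A trailing `;` before `done` is accepted and changes nothing. *)
let fill_with_trailing_semi arr =
  for i = 0 to Array.length arr - 1 do
    arr.(i) <- 0;
  done

(* The same loop without the optional semicolon. *)
let fill_without_semi arr =
  for i = 0 to Array.length arr - 1 do
    arr.(i) <- 0
  done

(* `e1; e2` evaluates e1 for its effect and returns the value of e2. *)
let log = Buffer.create 16

let answer =
  Buffer.add_string log "first;";
  Buffer.add_string log "second;";
  42
```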

awk: Interpreting strings as mathematical expressions

Context: I have an input file that contains parameters with associated values followed by literal mathematical expressions such as:
PARAMETERS DEFINITION
A = 5; B = 2; C=1.5; D=7.5
MATHEMATICAL EXPRESSIONS
A*B
C/D
...
and I would like to get the strings of the second part to be interpreted as mathematical expressions so that I get the results of the expressions in my output file:
...
MATHEMATICAL EXPRESSIONS
10
0.2
...
What I did already: So far, using awk, I store all the parameters names and their corresponding values in two distinct arrays. I then replace each parameter with its value so that I am now in a similar situation as the author of this thread.
However, the answers s/he gets are not in awk except for the last one which is very specific to her/his situation, and hard to understand for me as a beginner with awk and shell scripting.
What I tried afterwards: As I have no clue how to do this in awk, the idea I had was to store the new field value in a variable, then use a shell command within the awk script like this:
#!/bin/awk -f
BEGIN{}
{
myExpression=$1
system("echo $myExpression | bc")
}
END{}
This, unfortunately, does not work, as the variable is somehow not recognized by the echo command.
What I would like:
I would prefer a solution using awk alone with no call to external functions, however, I am not against one using a shell command if it is simpler.
EDIT Taking into account all the comments so far, I will be more precise, my input files look more like this:
PARAMETERS_DEFINITION
[param1] = 5
[param2] = 2
[param3] = 1.5
[param4] = 7.5
MATHEMATICAL_EXPRESSIONS
[param1]*[param2]
some text containing also numbers and formulas that I do not want to be affected.
e.g: 1.45*2.6 = x, de(x)/dx=e(x) ; blah,blah,blah
[param3]/[param4]
The names of the parameters are complex enough so that any match of the string: "[param#]" within the document corresponds to a parameter that I want changed for its value.
Below is how I store the parameters and their values in arrays:
{
if (match($2,/PARAMETERS_DEFINITION/) != 0) {paramSwitch = 1}
if (match($2,/MATHEMATICAL_EXPRESSIONS/) != 0) {paramSwitch = 0}
if (paramSwitch == 1)
{
parameterName[numOfParam] = $1 ;
parameterVal[numOfParam] = $3 ;
numOfParam += 1
}
}

Instead of this:
{
myExpression=$1
system("echo $myExpression | bc")
}
I think you'd want this:
{
myExpression=$1
system("echo " myExpression " | bc")
}
That's because in awk, assignments do not end up as environment variables, and putting strings next to each other concatenates them.

You are asking how to make awk interpret strings as mathematical expressions. This functionality is usually called eval, and no, AFAIK awk has no such function. Your question is therefore a typical XY problem.
The right tool for this is bc, where you (nearly) don't need to modify anything: simply feed bc your input, only making sure that the variable names are lowercase, as in the following input (edited from your example):
#PARAMETERS DEFINITION
a=5; b=2; c=1.5; d=7.5
#MATHEMATICAL EXPRESSIONS
a*b
c/d
Running it like this:
bc -l < inputfile
produces
10
.20000000000000000000
EDIT
For your edit, with the new input data, the following command:
grep '\[' inputfile | sed 's/[][]//g' | bc -l
for the input
PARAMETERS_DEFINITION
[param1] = 5
[param2] = 2
[param3] = 1.5
[param4] = 7.5
MATHEMATICAL_EXPRESSIONS
[param1]*[param2]
some text containing also numbers and formulas that I do not want to be affected.
e.g: 1.45*2.6 = x, de(x)/dx=e(x) ; blah,blah,blah
[param3]/[param4]
produces the following output:
10
.20000000000000000000
That is, grep keeps only the lines that contain [ (any parameter definition or expression) and sed removes every [ and ], which creates the following bc program:
param1 = 5
param2 = 2
param3 = 1.5
param4 = 7.5
param1*param2
param3/param4
The whole "program" is then sent to bc.

Using BIDMAS as a basis, I have created this mathematical function in awk.
I have not included brackets (or indices) yet, as they will require some extra effort, but I may add them later.
This awk script effectively works the way bc does.
No system call is required; it is all in awk.
Generic version for all applications:
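awk '{
  na = split($0, a, "+")            # lowest precedence first: addition
  b = 0
  for (i = 1; i <= na; i++) {       # numeric loops: "for (i in a)" has no guaranteed order
    ns = split(a[i], s, "-")        # then subtraction
    for (j = 1; j <= ns; j++) {
      nm = split(s[j], m, "*")      # then multiplication
      for (k = 1; k <= nm; k++) {
        nd = split(m[k], d, "/")    # division binds tightest
        for (l = 2; l <= nd; l++) d[1] = d[1] / d[l]
        m[k] = d[1]
        if (k > 1) m[1] = m[1] * m[k]
      }
      s[j] = m[1]
      if (j > 1) s[1] = s[1] - s[j]
    }
    b += s[1]
  }
  print b
}' file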
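For your specific example:
awk '
/MATHEMATICAL_EXPRESSIONS/ { z = 1 }
!z && / = / { split($0, y, " = "); x[y[1]] = y[2] }   # e.g. x["[param1]"] = 5
z && /^\[/ {                        # only lines that start with a parameter
  # Replace each parameter literally: as a regex, "[param1]" would be a bracket expression.
  for (n in x)
    while ((p = index($0, n)) > 0)
      $0 = substr($0, 1, p - 1) x[n] substr($0, p + length(n))
  na = split($0, a, "+")
  b = 0
  for (i = 1; i <= na; i++) {
    ns = split(a[i], s, "-")
    for (j = 1; j <= ns; j++) {
      nm = split(s[j], m, "*")
      for (k = 1; k <= nm; k++) {
        nd = split(m[k], d, "/")
        for (l = 2; l <= nd; l++) d[1] = d[1] / d[l]
        m[k] = d[1]
        if (k > 1) m[1] = m[1] * m[k]
      }
      s[j] = m[1]
      if (j > 1) s[1] = s[1] - s[j]
    }
    b += s[1]
  }
  print b
}' file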

There is something like an eval in awk: it converts strings to numbers automatically when the context requires it. Here, adding +0 would do the conversion.
Here is what I got for you (detailed version below), using a file named awkinput that contains your example input:
awk '/[A-Z]=[0-9.]+;/ { for (i=1;i<=NF ;i++) { print "working on "$i; split($i,fields,"="); sub(/;/,"",fields[2]); params[fields[1]]=strtonum(fields[2]) } }; /[A-Z][-+*\/][A-Z]/ { for (p in params) { gsub(p, params[p]); }; system("echo \"" $0 "\" | bc -ql") }' awkinput
Detailed:
/[A-Z]=[0-9.]+;?/ { # if we match something like A=4.2, with or without a ; at the end
for (i=1;i<=NF ;i++) { # loop through the fields (separated by spaces, the default field separator of awk)
print "working on "$i; # report what we are working on
split($i,fields,"="); # split into an array to get the parameter and its value
sub(/;/,"",fields[2]); # remove the trailing ; if there is one
params[fields[1]]=strtonum(fields[2]) # new array of parameters where the values are numeric (strtonum is gawk-specific)
}
}
/[A-Z][-+*\/][A-Z]/ { # when the line matches a math operation with (at least) one parameter on each side
for (p in params) { # loop over the known parameters
gsub(p, params[p]); # replace every occurrence of the parameter with its value
};
system("echo \"" $0 "\" | bc -ql") # print the result (no way to avoid the system call here)
}
Drawback:
An expression of the form AB*C would be resolved to 52*1.5.

$ cat test
PARAMETERS DEFINITION
A=5; B=2; C=1.5; D=7.5
MATHEMATICAL EXPRESSIONS
A*B
C/D
$ awk -vRS='[= ;\n]' '{if ($0 ~ /[0-9]/){a[x] = $0; print x"="a[x]}else{x=$0}}/MATHEMATICAL/{print "MATHEMATICAL EXPRESSIONS"}{if ($0~"*") print a[substr($0,1,1)] * a[substr($0,3,1)]}{if ($0~"/") print a[substr($0,1,1)] / a[substr($0,3,1)]}' test
A=5
B=2
C=1.5
D=7.5
MATHEMATICAL EXPRESSIONS
10
0.2
Formatted nicely:
$ cat test.awk
# Store all variables in an array
{
if ($0 ~ /[0-9]/){
a[x] = $0;
print x " = " a[x] # Print the keys & values
}
else{
x = $0
}
}
# Print header
/MATHEMATICAL/ {print "MATHEMATICAL EXPRESSIONS"}
# Do the maths (case can work too, but it's not as widely available)
{
if ($0~"*")
print a[substr($0,1,1)] * a[substr($0,3,1)]
}
{
if ($0~"/")
print a[substr($0,1,1)] / a[substr($0,3,1)]
}
{
if ($0~"+")
print a[substr($0,1,1)] + a[substr($0,3,1)]
}
{
if ($0~"-")
print a[substr($0,1,1)] - a[substr($0,3,1)]
}
$ cat test
PARAMETERS DEFINITION
A=5; B=2; C=1.5; D=7.5
MATHEMATICAL EXPRESSIONS
A*B
C/D
D+C
C-A
$ awk -f test.awk -vRS='[= ;\n]' test
A = 5
B = 2
C = 1.5
D = 7.5
MATHEMATICAL EXPRESSIONS
10
0.2
9
-3.5

ANTLR parse problem

I need to be able to match a certain string ('[', then any number of equals signs or none, then '['), and then, after some other match rules, a matching close bracket (']', then the same number of equals signs, then ']'). (The other rules are (options{greedy=false;}:.)* if you must know.) I have no clue how to do this in ANTLR. How can I do it?
An example: I need to match [===[whatever arbitrary text ]===] but not [===[whatever arbitrary text ]==].
I need to do it for an arbitrary number of equals signs as well, so therein lies the problem: how do I get it to match the same number of equals signs in the close as in the open? The parser rules supplied so far don't seem to help.

You can't easily write a lexer rule for it; you need parser rules. Two rules should be sufficient: one is responsible for matching the braces, one for matching the equals signs.
Something like this:
braces : '[' ']'
| '[' equals ']'
;
equals : '=' equals '='
| '=' braces '='
;
This should cover the use case you described. I am not absolutely sure, but you may have to use a predicate in the first alternative of 'equals' to avoid ambiguous interpretations.
Edit:
It is hard to integrate your greedy=false rule and at the same time avoid a lexer context switch or something similar (which is hard in ANTLR). But if you are willing to put a little bit of Java in your grammar, you can write a lexer rule.
The following example grammar shows how:
grammar TestLexer;
SPECIAL : '[' { int counter = 0; } ('=' { counter++; } )+ '[' (options{greedy=false;}:.)* ']' ('=' { counter--; } )+ { if(counter != 0) throw new RecognitionException(input); } ']';
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
WS : ( ' '
| '\t'
| '\r'
| '\n'
) {$channel=HIDDEN;}
;
rule : ID
| SPECIAL
;

Your tags mention lexing, but your question itself doesn't. What you're trying to do is non-regular, so I don't think it can be done as part of lexing (though I don't remember if ANTLR's lexer is strictly regular -- it's been a couple of years since I last used ANTLR).
What you describe should be possible in parsing, however. Here's the grammar for what you described:
thingy : LBRACKET middle RBRACKET;
middle : EQUAL middle EQUAL
| LBRACKET RBRACKET;
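Outside of ANTLR, the counting idea behind these answers can be sketched by hand. The function below is a hypothetical helper, not part of any ANTLR API: it counts the equals signs in the opener and then scans for the first closer with exactly the same count. Unlike the SPECIAL lexer rule shown earlier, it also accepts zero equals signs.

```ocaml
(* Hypothetical helper: if [s] starts with '[' '='* '[', return the text up to
   the first ']' '='* ']' closer that has the same number of '=' signs. *)
let long_bracket_body s =
  let len = String.length s in
  (* Count consecutive '=' characters starting at index i. *)
  let rec count_eq i n =
    if i < len && s.[i] = '=' then count_eq (i + 1) (n + 1) else n
  in
  if len = 0 || s.[0] <> '[' then None
  else
    let level = count_eq 1 0 in
    let body_start = level + 2 in
    if body_start > len || s.[level + 1] <> '[' then None
    else
      (* Scan for a ']' followed by exactly [level] '=' and another ']'. *)
      let rec find i =
        if i >= len then None
        else if s.[i] = ']' then
          let n = count_eq (i + 1) 0 in
          let close = i + 1 + n in
          if n = level && close < len && s.[close] = ']'
          then Some (String.sub s body_start (i - body_start))
          else find (i + 1)
        else find (i + 1)
      in
      find body_start
```

For the example in the question, the input [===[whatever arbitrary text ]===] yields the enclosed text, while [===[whatever arbitrary text ]==] yields None because the closer has only two equals signs.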
