I'm writing a program in Prolog where I'm given a set of grammar rules and the user inputs a sentence; I must make sure the sentence follows the given rules.
I'm only stuck on one rule:
expr -> ( expr ) also written as expr -> ( id op expr )
Here is my code for this part:
expr(X) :- list(X), length(X, Length), =(Length, 5),
=(X, [Left, Id, Op, Expr | Right]),
=(Left, ‘(‘),
id(Id), op(Op), expr([Expr]),
=(Right, ‘)’).
I believe the issue is with checking the parentheses, since the other parts of this code are used elsewhere with no errors. When using =(Left, '(') or =(Right, ')') I get a syntax error: expression expected. Why do I get this error, and what would be a better way to check for left and right parentheses?
I think you should use plain ASCII single quotes rather than typographic ones in =(Left, ‘(‘) and =(Right, ‘)’), i.e. write =(Left, '(') and =(Right, ')').
That said, your Expr will only match a single token, which is not what I would expect. Consider matching the entire 'right' sequence with
X = [Left, Id, Op | Expr],
and then splitting Expr further to get the right parenthesis. Anyway, as I advised in another answer, your parsing (even after correction) will fail on [a,=,'(',b,')',+,c].
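As a rough sketch of an alternative, the same rule can be written as a DCG, which avoids the fixed length check and the manual list splitting. The predicates id_tok/1 and op_tok/1 are hypothetical placeholders for the asker's own id/1 and op/1 (which are not shown), and the base case "an expr may be a single id" is an assumption as well.

```prolog
% Placeholder token classes (assumptions, not the asker's definitions).
id_tok(a). id_tok(b). id_tok(c).
op_tok('+'). op_tok('-'). op_tok('=').

% expr -> id                (assumed base case)
% expr -> ( id op expr )
expr --> [Id], { id_tok(Id) }.
expr --> ['('], [Id], { id_tok(Id) }, [Op], { op_tok(Op) }, expr, [')'].
```

A query such as phrase(expr, ['(', a, '+', b, ')']) then succeeds, and the nested expr may itself be parenthesised.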
Related
As I said in the title, I am trying to do an exercise where I need to write a DCG capable of reading propositional logic formulas: propositions are represented by lowercase letters, the operators are not, and, and or, and the tokens are separated by whitespace. So the expression:
a or not b and c
is parsed as
a or ( not b and c )
producing a parse tree that looks like:
or(
a,
and(
not(b),
c
)
)
To be completely honest I have been having a hard time understanding how to effectively use DCGs, but this is what I've got so far:
bexpr([T]) --> [T].
bexpr(not,R1) --> oper(not), bexpr(R1).
bexpr(R1,or,R2) --> bexpr(R1),oper(or), bexpr(R2).
bexpr(R1, and ,R2) --> bexpr(R1),oper(and), bexpr(R2).
oper(X) --> X.
I would appreciate any suggestions, either on the exercise itself, or on how to better understand DCGs.
The key to understanding DCGs is that they are syntactic sugar over writing a recursive descent parser. You need to think about operator precedence (how tightly do your operators bind?). Here, the operator precedence, from tightest to loosest is
not
and
or
so a or not b and c is evaluated as a or ( (not b) and c ).
And we can say this (I've included parenthetical expressions as well, because they're pretty trivial to do):
% the infix OR operator is the lowest priority, so we start with that.
expr --> infix_OR.
% and an infix OR expression is either the next highest priority operator (AND),
% or... it's an actual OR expression.
infix_OR --> infix_AND.
infix_OR --> infix_AND, [or], infix_OR.
% and an infix AND expression is either next highest priority operator (NOT)
% or... it's an actual AND expression.
infix_AND --> unary_NOT.
infix_AND --> unary_NOT, [and], infix_AND.
% and the unary NOT expression is either a primary expression
% or... an actual unary NOT expression
unary_NOT --> primary.
unary_NOT --> [not], primary.
% and a primary expression is either an identifier
% or... it's a parenthetical expression.
%
% NOTE that the body of the parenthetical expression starts parsing at the root level.
primary --> identifier.
primary --> ['('], expr, [')'].
identifier --> [X], {id(X)}. % the stuff in '{...}' is evaluated as normal prolog code.
id(a).
id(b).
id(c).
id(d).
id(e).
id(f).
id(g).
id(h).
id(i).
id(j).
id(k).
id(l).
id(m).
id(n).
id(o).
id(p).
id(q).
id(r).
id(s).
id(t).
id(u).
id(v).
id(w).
id(x).
id(y).
id(z).
But note that all this does is to recognize sentences of the grammar (pro tip: if you write your grammar correctly, it should also be able to generate all possible valid sentences of the grammar). Note that this might take a while to do, depending on your grammar.
So, to actually DO something with the parse, you need to add a little extra. We do this by adding extra arguments to the DCG, viz:
expr( T ) --> infix_OR(T).
infix_OR( T ) --> infix_AND(T).
infix_OR( or(X,Y) ) --> infix_AND(X), [or], infix_OR(Y).
infix_AND( T ) --> unary_NOT(T).
infix_AND( and(X,Y) ) --> unary_NOT(X), [and], infix_AND(Y).
unary_NOT( T ) --> primary(T).
unary_NOT( not(X) ) --> [not], primary(X).
primary( ID ) --> identifier(ID).
primary( T ) --> ['('], expr(T), [')'].
identifier( ID ) --> [X], { id(X), ID = X }.
id(a).
id(b).
id(c).
id(d).
id(e).
id(f).
id(g).
id(h).
id(i).
id(j).
id(k).
id(l).
id(m).
id(n).
id(o).
id(p).
id(q).
id(r).
id(s).
id(t).
id(u).
id(v).
id(w).
id(x).
id(y).
id(z).
And that is where the parse tree is constructed. One might note that one could just as easily evaluate the expression instead of building the parse tree... and then you're on your way to writing an interpreted language.
You can fiddle with it at this fiddle: https://swish.swi-prolog.org/p/gyFsAeAz.pl
where you'll notice that executing the goal phrase(expr(T),[a, or, not, b, and, c]). yields the desired parse T = or(a, and(not(b), c)).
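To illustrate the remark about evaluating rather than just building a tree, here is a minimal sketch of an evaluator that walks a parse tree of the shape produced above. The predicate names (eval/3, bool_or/3, bool_and/3, bool_not/2) and the environment representation, a list of Name-Value pairs with values true or false, are assumptions made for this example only.

```prolog
% eval(+Tree, +Env, -Value)
% Env is a list of Name-Value pairs, Value is true or false (assumed encoding).
eval(or(X, Y), Env, V)  :- eval(X, Env, VX), eval(Y, Env, VY), bool_or(VX, VY, V).
eval(and(X, Y), Env, V) :- eval(X, Env, VX), eval(Y, Env, VY), bool_and(VX, VY, V).
eval(not(X), Env, V)    :- eval(X, Env, VX), bool_not(VX, V).
eval(Id, Env, V)        :- atom(Id), memberchk(Id-V, Env).

% Truth tables, indexed on the first argument.
bool_or(true, _, true).
bool_or(false, V, V).

bool_and(false, _, false).
bool_and(true, V, V).

bool_not(true, false).
bool_not(false, true).
```

For instance, eval(or(a, and(not(b), c)), [a-false, b-false, c-true], V) binds V to true. The same clauses could be folded into the grammar's extra argument so that parsing yields the value directly.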
I would like to know how expressions are parsed when they are mixed with control flow.
Let's assume such syntax:
case
when a == Method() + 1
then Something(1)
when a == Other() - 2
then 1
else 0
end
Here we have two conditional expressions, Method() + 1, Something(1) and 0. Each can be translated to postfix by the shunting-yard algorithm and then easily turned into an AST. But is it possible to extend this algorithm to handle control flow as well? Or are there other approaches to handling such a mix of expressions and control flow?
another example:
a == b ? 1 : 2
Also, how should I classify an expression such as a between b and c? Can I say that between is a three-argument function, or is there a special name for such expressions?
You can certainly parse the ternary operator with an operator-precedence grammar. In
expr ? expr : expr
the binary "operator" here is ? expr :, which conveniently starts and ends with an operator token (albeit different ones). To adapt shunting yard to that, assign the right-precedence of ? and the left-precedence of : to the precedence of the ?: operator. The left-precedence of ? and the right-precedence of : are ±∞, just like parentheses (which, in effect, they are).
The case statement is basically repeated application of the ternary operator, using slightly different spellings for the tokens, so it yields to a similar solution. (Here case when and end are purely parenthetic, while then and the remaining whens correspond to ? and :.)
Having said that, it really is simpler to use an LALR(1) parser generator, and there is almost certainly one available for whatever language you are writing in.
It's clear that both the ternary operator and OP's case statement are operator grammars:
Ternary operator:
ternary-expr: non-ternary-expr
| non-ternary-expr '?' expr ':' ternary-expr
Normally, the ternary operator will have lower precedence than any other operator and associate to the right, which is how the above is written. In C and other languages ternary expressions have the same precedence as assignment expressions, which is straightforward to add. That results in the relationships
X ·> ?
? <· X
? ·=· :
X ·> :
: <· X
Case statement (one of many possible formulations):
case_statement: 'case' case_body 'else' expr 'end'
case_body: 'when' expr 'then' expr
| case_body 'when' expr 'then' expr
Here are the precedence relationships for the above grammar:
case <· when
case ·=· else
when <· X (see below)
when ·=· then
then ·> when
then ·> else
else <· X
else ·=· end
X ·> then
X ·> when
X ·> end
X in the above relations refers to any binary or unary operator, any value terminal, ( and ).
It's straightforward to find left- and right-precedence functions for all of those terminals; the pattern will be similar to that of parentheses in a standard algebraic grammar.
The Shunting-yard algorithm is for expressions with unary and binary operators. You need something more powerful such as LL(1) or LALR(1) to parse control flow statements, and once you have that it will also handle expressions as well. No need for the Shunting-yard algorithm at all.
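As a small illustration of the recursive descent (LL-style) route, here is a sketch written as a Prolog DCG that handles the ternary operator alongside an ordinary binary operator. It follows the ternary-expr grammar quoted earlier; the token list format, the node names if/3 and eq/2, and the restriction to == as the only binary operator are all assumptions made to keep the example short.

```prolog
% Tokens are assumed to be pre-split, e.g. [a, '==', b, '?', 1, ':', 2].
expr(T) --> ternary(T).

% ternary-expr: non-ternary-expr
%             | non-ternary-expr '?' expr ':' ternary-expr
ternary(T) --> equality(C), ternary_rest(C, T).

ternary_rest(C, if(C, Then, Else)) --> ['?'], expr(Then), [':'], ternary(Else).
ternary_rest(T, T) --> [].

% The only binary operator in this sketch is ==.
equality(T) --> primary(L), equality_rest(L, T).

equality_rest(L, eq(L, R)) --> ['=='], primary(R).
equality_rest(T, T) --> [].

% Operands: numbers, names, or a parenthesised expression.
primary(N) --> [N], { number(N) }.
primary(A) --> [A], { atom(A), \+ memberchk(A, ['?', ':', '==', '(', ')']) }.
primary(T) --> ['('], expr(T), [')'].
```

With this, phrase(expr(T), [a, '==', b, '?', 1, ':', 2]) gives T = if(eq(a, b), 1, 2), and chained ternaries nest to the right. A case/when/else/end construct could be added as further rules in the same style.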
I've been trying to use this code for a while now and it says there is a syntax error but I'm not sure what it is.
studies(ahmed,history(77,63)).
studies(john,chemistry(0,21)).
passed(Person,Subj):-
studies(Person, Subj(Work, Exam)),
Final is Work + Exam,
Final >=60.
You can't directly "parameterize" the functor, but you can use the =../2 operator, which unifies a functor and arguments with a list:
passed(Person, Subj):-
studies(Person, SubjWorkExam),
SubjWorkExam =.. [Subj, Work, Exam],
Work + Exam >= 60.
This avoids hard-coding the various subjects in your predicate. Also, the comparison operator >=/2 will evaluate expressions, so the separate is/2 is not required.
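To make the behaviour of =../2 ("univ") concrete, here is a tiny sketch that isolates the decomposition step. The helper name marks/4 is hypothetical and exists only for this illustration.

```prolog
% marks(?Term, ?Subject, ?Work, ?Exam)
% Relates a term such as history(77, 63) to its functor name and two arguments.
marks(Term, Subject, Work, Exam) :-
    Term =.. [Subject, Work, Exam].
```

Calling marks(history(77, 63), S, W, E) binds S = history, W = 77 and E = 63; called the other way round, marks(T, chemistry, 0, 21) builds T = chemistry(0, 21).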
You can't use a variable as the functor of a compound term. You could write instead:
passed(Person,Subj):-
( Subj=history -> studies(Person, history(Work, Exam))
; Subj=chemistry -> studies(Person, chemistry(Work, Exam))
),
Final is Work + Exam,
Final >= 60.
I want to parse a logical expression using DCG in Prolog.
The logical terms are represented as lists, e.g. ['x','&&','y'] for x ∧ y; the result should be the parse tree and(X,Y) (where X and Y are unassigned Prolog variables).
I implemented it and everything works as expected but I have one problem:
I can't figure out how to parse the variable 'x' and 'y' to get real Prolog variables X and Y for the later assignment of truth values.
I tried the following rule variations:
v(X) --> [X].:
This doesn't work of course, it only returns and('x','y').
But can I maybe uniformly replace the logical variables in this term with Prolog variables? I know of the predicate term_to_atom (which is proposed as a solution for a similar problem) but I don't think it can be used here to achieve the desired result.
v(Y) --> [X], {var(Y)}.:
This does return an unbound variable but of course a new one every time even if the logical variable ('x','y',...) was already in the term so
['X','&&','X'] gets evaluated to and(X,Y) which is not the desired result, either.
Is there any elegant or idiomatic solution to this problem?
Many thanks in advance!
EDIT:
The background to this question is that I'm trying to implement the DPLL algorithm in Prolog. I thought it would be clever to directly parse the logical term to a Prolog term to make easy use of the Prolog backtracking facility:
Input: some logical term, e.g T = [x,'&&',y]
Term after parsing: [G_123,'&&',G_456] (now featuring "real" Prolog variables)
Assign a value from { boolean(t), boolean(f) } to the first unbound variable in T.
simplify the term.
... repeat or backtrack until an assignment v is found so that v(T) = t or the search space is depleted.
I'm pretty new to Prolog and honestly couldn't figure out a better approach. I'm very interested in better alternatives! (So I'm kinda half-sure that this is what I want ;-) and thank you very much for your support so far ...)
You want to associate ground terms like x (no need to write 'x') with uninstantiated variables. Certainly that does not constitute a pure relation. So it is not that clear to me that you actually want this.
And where do you get the list [x, &&, x] in the first place? You probably have some kind of tokenizer. If possible, try to associate variable names with variables prior to the actual parsing. If you insist on performing that association during parsing, you will have to thread a pair of variables throughout your entire grammar. That is, instead of a clean grammar like
power(P) --> factor(F), power_r(F, P).
you will now have to write
power(P, D0,D) --> factor(F, D0,D1), power_r(F, P, D1,D).
% ^^^^ ^^^^^ ^^^^
since you are introducing context into an otherwise context-free grammar.
When parsing Prolog text, the same problem occurs. The association between a variable name and a concrete variable is already established during tokenizing. The actual parser does not have to deal with it.
There are essentially two ways to perform this during tokenization:
1mo collect all occurrences Name=Variable in a list and unify them later:
v(N-V, [N-V|D],D) --> [N], {maybesometest(N)}.
unify_nvs(NVs) :-
keysort(NVs, NVs2),
uniq(NVs2).
uniq([]).
uniq([NV|NVs]) :-
head_eq(NVs, NV),
uniq(NVs).
head_eq([], _).
head_eq([N-V|_],N-V).
head_eq([N1-_|_],N2-_) :-
dif(N1,N2).
2do use some explicit dictionary to merge them early on.
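A minimal sketch of what the second option might look like, done as a pass over the token list before parsing. Everything named here (tokens_vars/3, is_name/1, lookup/4, and the choice of a plain Name-Var list as the dictionary) is an assumption for illustration; a real tokenizer would decide differently what counts as a variable name.

```prolog
% tokens_vars(+Tokens, -Terms, -Dict)
% Replaces every name token by a Prolog variable, reusing the same
% variable for repeated names. Dict is a list of Name-Var pairs.
tokens_vars(Tokens, Terms, Dict) :-
    tokens_vars_(Tokens, Terms, [], Dict).

tokens_vars_([], [], D, D).
tokens_vars_([T|Ts], [V|Vs], D0, D) :-
    is_name(T), !,
    lookup(T, V, D0, D1),
    tokens_vars_(Ts, Vs, D1, D).
tokens_vars_([T|Ts], [T|Vs], D0, D) :-      % operators and brackets pass through
    tokens_vars_(Ts, Vs, D0, D).

% Assumed test for "this token is a variable name".
is_name(T) :- atom(T), \+ memberchk(T, ['&&', '||', '(', ')']).

% lookup(+Name, -Var, +Dict0, -Dict): reuse an existing entry or add a new one.
lookup(N, V, D, D) :- memberchk(N-V, D), !.
lookup(N, V, D, [N-V|D]).
```

For example, tokens_vars([x, '&&', y, '||', x], Ts, D) yields a list whose first and last elements are the same variable, with D holding one entry each for x and y. The grammar can then be run on Ts and stays free of the extra dictionary arguments.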
Somewhat related is this question.
Not sure if you really want to do what you asked. You might do it by keeping a list of variable associations so that you would know when to reuse a variable and when to use a fresh one.
This is an example of a greedy descent parser which would parse expressions with && and ||:
parse(Exp, Bindings, NBindings)-->
parseLeaf(LExp, Bindings, MBindings),
parse_cont(Exp, LExp, MBindings, NBindings).
parse_cont(Exp, LExp, Bindings, NBindings)-->
parse_op(Op, LExp, RExp),
{!},
parseLeaf(RExp, Bindings, MBindings),
parse_cont(Exp, Op, MBindings, NBindings).
parse_cont(Exp, Exp, Bindings, Bindings)-->[].
parse_op(and(LExp, RExp), LExp, RExp)--> ['&&'].
parse_op(or(LExp, RExp), LExp, RExp)--> ['||'].
parseLeaf(Y, Bindings, NBindings)-->
[X],
{
(member(bind(X, Var), Bindings)-> Y-NBindings=Var-Bindings ; Y-NBindings=Var-[bind(X, Var)|Bindings])
}.
It parses the expression and returns also the variable bindings.
Sample outputs:
?- phrase(parse(Exp, [], Bindings), ['x', '&&', 'y']).
Exp = and(_G683, _G696),
Bindings = [bind(y, _G696), bind(x, _G683)].
?- phrase(parse(Exp, [], Bindings), ['x', '&&', 'x']).
Exp = and(_G683, _G683),
Bindings = [bind(x, _G683)].
?- phrase(parse(Exp, [], Bindings), ['x', '&&', 'y', '&&', 'x', '||', 'z']).
Exp = or(and(and(_G839, _G852), _G839), _G879),
Bindings = [bind(z, _G879), bind(y, _G852), bind(x, _G839)].
I'm trying to write some predicates to solve the following task (learnprolognow.com)
Suppose we are given a knowledge base with the following facts:
tran(eins,one).
tran(zwei,two).
tran(drei,three).
tran(vier,four).
tran(fuenf,five).
tran(sechs,six).
tran(sieben,seven).
tran(acht,eight).
tran(neun,nine).
Write a predicate listtran(G,E) which translates a list of German number words to the corresponding list of English number words. For example:
listtran([eins,neun,zwei],X).
should give:
X = [one,nine,two].
I've written:
listtran(G,E):- G=[], E=[].
listtran(G,E):- G=[First|T], tran(First, Mean), listtran(T, Eng), E = [Mean|Eng).
But I get the error: "illegal start of term" when compiling. Any suggestions?
The last bracket in your last line should be a square one.
Also, you might want to make use of Prolog's pattern matching:
listtran([], []).
listtran([First|T], [Mean|EngT]):-
tran(First, Mean),
listtran(T, EngT).
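As a further variation, the recursion can be left to maplist/3, which applies tran/2 to corresponding elements of the two lists. This is only a sketch of an equivalent formulation; the facts are repeated from the exercise so that the snippet can be loaded on its own.

```prolog
tran(eins, one).
tran(zwei, two).
tran(drei, three).
tran(vier, four).
tran(fuenf, five).
tran(sechs, six).
tran(sieben, seven).
tran(acht, eight).
tran(neun, nine).

% listtran(?German, ?English): element-wise translation via tran/2.
listtran(G, E) :-
    maplist(tran, G, E).
```

Because tran/2 is a plain relation, the predicate works in both directions: listtran([eins, neun, zwei], X) gives X = [one, nine, two], and listtran(X, [one, seven, six, two]) gives X = [eins, sieben, sechs, zwei].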